[HN Gopher] It Can Happen to You
___________________________________________________________________
It Can Happen to You
Author : mooreds
Score : 1542 points
Date : 2021-03-04 00:37 UTC (22 hours ago)
(HTM) web link (www.mattkeeter.com)
(TXT) w3m dump (www.mattkeeter.com)
| praptak wrote:
| The reward system of my brain wants me to write clever code to
| solve the "core" problem. The remaining 80-99.999% (depending on
| the environment) of the code - authentication & authorization,
| logging, parsing, I/O, ensuring that the program is "well
| behaved" - all that is perceived as "boring", "boilerplate",
| "glue code", which results in churning out code that is not
| thought through.
|
| So yes, that code probably contains some bugs from the "stupid if
| you think about it for a second" category.
| mkeeter wrote:
| Blog author here! Thanks to HN for warning me about sscanf at
| _exactly_ the right time - within a day of me trying to load some
| ASCII STLs and noticing it was slow...
|
| Linked deep in the Twitter replies [1], there's an open glibc
| issue about this, dating back to 2014:
|
| https://sourceware.org/bugzilla/show_bug.cgi?id=17577
|
| C doesn't have any _requirements_ on the complexity of sscanf, so
| it might not be a bug per se, but it's certainly... a pitfall.
|
| [1] https://twitter.com/impraxical/status/1367194430835425283
| froh wrote:
| printf and scanf match nicely with their format specifiers, so
| the serialization and deserialization can be maintained nicely
| in lockstep.
|
| to avoid the quadratic overhead of strlen you can simply use
| fmemopen(3), which makes the temporary sscanf FILE object
| explicit and persistent for the whole parse, and needs just one
| strlen call.
|
| https://news.ycombinator.com/item?id=26343149
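|
| For example, a minimal sketch of that approach (assuming POSIX
| fmemopen(3) and a NUL-terminated input buffer):
|
|     #include <stdio.h>
|     #include <string.h>
|
|     /* Parse up to `max` floats from `buf`, paying for strlen
|        exactly once instead of once per sscanf call. */
|     int parse_floats(const char *buf, float *out, int max) {
|         FILE *f = fmemopen((void *)buf, strlen(buf), "r");
|         if (!f) return -1;
|         int n = 0;
|         while (n < max && fscanf(f, "%f", &out[n]) == 1)
|             n++;
|         fclose(f);
|         return n;
|     }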
| Waterluvian wrote:
| I love the format of this blog post. Perfect length. Perfect
| detail. You don't waste words. A good number of images.
| lambda_obrien wrote:
| And it works without javascript!
| icemelt8 wrote:
| all blogs work without javascript.
| ignoramous wrote:
| I like this blog post, but I happen to quite like long-form
| posts like these, too: https://mango.pdf.zone/finding-former-
| australian-prime-minis... / https://apenwarr.ca/log/20201227
| komodo009 wrote:
| The mango blog post was incredibly entertaining. Thanks for
| sharing.
| chromanoid wrote:
| When looking at loading performance of STLs, you may want to
| look at this: https://papas-best.com/stlviewer_en
| fermienrico wrote:
| Love the post, except for the title. It is too click-baity.
| gnyman wrote:
| Thank you for introducing me to the concept/term _Ascetic
| programming_. Not sure how widely used it is, but I find it
| more fitting for what I try to do than _minimalistic_ or
| _KISS_.
|
| Also, it is great to see someone write:
|
| > I noticed that ASCII STL loading was really quite slow.
| > From startup to showing the window, it took over 1.8 seconds!
|
| I always find pleasure seeing projects which highlight just how
| fast modern computers really are.
| gnyman wrote:
| Re-read my comment and to be clear, the quote and the last
| paragraph are not related. The last sentence was meant to
| refer to the Erizo project as a nice single purpose high
| performing tool, not as a comment to the bug that made it
| slow.
| rkagerer wrote:
| Hey, Matt, neat to see you here and congrats on making the
| front page! Recognize you from the Formlabs forums &
| conferences.
|
| Love that notion of professional empathy underscoring your
| message in the blog post.
| mkeeter wrote:
| Hey RK! Thanks for the kind words - hopefully I'll see you
| again, once we're back to holding in-person events!
| koalahedron wrote:
| I know Matt too, primarily from him rejecting my Libfive PRs
| for being " _too_ 1337 ".
|
| But seriously, Matt, I might have a project for an in-person
| event regarding printing robots. Stay tuned--to what channel
| I don't know.
| BenFrantzDale wrote:
| IMO the lack of a complexity requirement is a bug in the C
| standard. And really it's a bug in the implementation(s?) too.
| If it can be done in O(1), shame on library authors for doing
| it in O(n). If you want programmers to trust library authors,
| don't do this to us. Maybe std::from_chars FTW?
| [deleted]
| ironmagma wrote:
| Shouldn't we just come clean and admit to ourselves that
| there is no such thing as the C standard? There is a
| collection of loosely related languages that look similar and
| that collectively we call C, but really they're all
| completely different and share almost no interoperability or
| common characteristics. And those standards that do exist
| provide almost no ability to reason about your code including
| things like ordering of statements.
| loeg wrote:
| No, that's complete nonsense.
|
| Here's the latest revision of the standard:
| https://www.iso.org/standard/74528.html
|
| C has had a well-defined memory model since C11, I believe
| (re: "ordering of statements").
| ironmagma wrote:
| > ISO-C11 specifies 203 circumstances that cause
| undefined behaviors.
|
| 203 is enough to make almost every line of code
| questionable. The result of this is that looking at a
| simple 3 line C program and being asked whether the
| program terminates is undecidable without knowing which
| compiler was used.
|
| Null dereference for example is undefined behavior, and
| could cause a termination or not, depending on the
| implementation, even if it is known to be standards
| conforming to C11.
|
| [1] https://resources.tasking.com/p/undefined-behaviors-
| iso-c-th...
| ncmncm wrote:
| A good argument for compiling your debug builds with
| "-fsanitize=undefined".
| ironmagma wrote:
| There is always a fix for your own code but that's not
| the problem. The issue is all the millions of lines of
| code in the wild that are intended to be compiled without
| that option.
| loeg wrote:
| Sure, or use Rust! Rust is great! We can criticize C for
| its faults without making baseless claims.
| loeg wrote:
| > 203 is enough to make almost every line of code
| questionable. The result of this is that looking at a
| simple 3 line C program and being asked whether the
| program terminates is undecidable without knowing which
| compiler was used.
|
| This is hyperbole to the point of being nonsensical.
|
| > Null dereference for example is undefined behavior, and
| could cause a termination or not, depending on the
| implementation, even if it is known to be standards
| conforming to C11.
|
| This sentence doesn't make any sense. If your C code has
| UB, it is wrong. The behavior of particular environments
| around certain UB is irrelevant to standards-conforming
| code, because standards-conforming code doesn't have UB.
| gwd wrote:
| > This is hyperbole to the point of being nonsensical.
|
| I think you can only say this if you've never had
| aggressive compiler optimizations introduce security
| issues into perfectly reasonable-looking code.
|
| Quiz, what's wrong with the following code?
|
|     int buflen, untrusted;
|     char buf[MAX];
|
|     /* `untrusted` comes from an untrusted source */
|     if (buflen + untrusted > MAX) {
|         return -EINVAL;
|     }
|
| The answer of course is that integer overflow is
| undefined; so if buflen + untrusted is greater than
| INT_MAX, the compiler is allowed to do _absolutely
| anything it wants_ ; and making sure it's only allowed to
| do something sensible turns out to be extremely
| difficult.
|
| EDIT For instance, in an earlier age, people might have
| done something like this:
|
|     if (buflen + untrusted > MAX ||
|         buflen + untrusted < buflen)
|
| But the second clause relies on overflow. The compiler is
| perfectly justified in saying, "Well, overflow is UB
| anyway, so if it happens, I'm allowed to not do anything;
| so I'll just make this code more efficient by removing
| that check entirely."
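|
| A defined-behaviour version rearranges the comparison so the
| sum is never computed (a sketch, assuming 0 <= buflen <= MAX):
|
|     /* MAX - buflen cannot overflow, and buflen + untrusted
|        is never evaluated, so there is no UB to exploit. */
|     if (untrusted < 0 || untrusted > MAX - buflen) {
|         return -EINVAL;
|     }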
| ironmagma wrote:
| > If your C code has UB, it is wrong.
|
| This goes against the very notion of UB. If some code
| was wrong, the standard would say it is not allowed and
| it would result in a compile error, or at least a runtime
| error. As it is, the language standards choose to leave
| it open almost as if to concede that the standard can't
| cover every base. UB isn't wrong, almost by definition.
| It's just implementation specific, and that's my point.
| We don't have an overarching C language, we have a
| hundred or so C dialects.
| matkoniecz wrote:
| I always thought that code with UB is wrong, and UB allows
| the implementation to deal with it in its own way (it is
| allowed to ignore it, stop program, corrupt memory,
| delete hard drive contents...).
|
| So if your code has UB then it is wrong; the one thing not
| specified in the standard is the exact consequences of that.
|
| (yes, in some hacks one may rely on UB behaving in some
| way in some circumstances - it will still be a hack)
| ironmagma wrote:
| Suppose it is wrong, though; that implies a good chunk of
| C code out there is wrong code. Yet it compiles and
| people are using it, which means that their code does not
| conform to the standard. Just as wrong math isn't math at
| all, wrong C is not C. People are therefore writing code
| whose runtime characteristics are not defined by any
| standard. Thus it is not actually C, it's whatever
| compiler they're using's language.
| matkoniecz wrote:
| A working and usable program typically contains wrong code
| of various kinds.
|
| Nontrivial bug-free programs are an extreme rarity.
|
| > wrong C is not C
|
| buggy C is still C. If on discovering undefined behavior
| people treat it as a bug, then it is just a C program with
| some bugs in it.
|
| If on discovering undefined behavior people treat it as
| acceptable - "on my compiler it does XYZ, therefore I will
| knowingly do ABC" - then it is becoming something else.
| ironmagma wrote:
| It's not really a bug if it works the way it was
| intended by the developer. It just exists in a world
| outside the law, makes its own laws based on what works,
| a renegade program. Most people don't read the C standard
| or care what it says (and it costs money, so it's almost
| as if reading it is discouraged), so it seems very likely
| the default human behavior is just to use this UB.
| einpoklum wrote:
| There's "implementation-defined" behavior, and then there
| is "undefined behavior". I think you're conflating the
| two.
| andrewaylett wrote:
| One problem here is that correct code relies on valid
| inputs in order to avoid UB -- Undefined behaviour is a
| runtime property of a running program, rather than
| (necessarily) a static property of an isolated unit of
| code.
|
| In this way, UB is essentially the converse of Rust's
| `unsafe` -- we must assume that our caller won't pass in
| values that would trigger undefined behaviour, and we
| don't necessarily have the local context to be able to
| tell at runtime whether our behaviour is well-defined or
| not.
|
| There definitely are instances where local checks can
| avoid UB, but it's also perfectly possible to write a
| correct program where a change in one module causes UB to
| manifest via different module -- use after free is a
| classic here. So we can have two modules which in
| isolation couldn't be said to have any bugs, but which
| still exhibit UB when they interact with each other.
|
| And that's before we start getting into the processing of
| untrusted input.
|
| A C compiler -- and especially the optimiser --
| assumes[1] that the conditions for provoking UB won't
| occur, while the Rust compiler (activate RESF[0]) mostly
| has defined behaviour that's either the same as common C
| compilers would give for a local UB case[2] in practice
| or have enough available context to prove that the UB
| case genuinely doesn't happen.
|
| [0] https://enet4.github.io/rust-tropes/rust-evangelism-
| strike-f...
|
| [1] Proof by appeal to authority: I was a compiler
| engineer, back in the day.
|
| [2] Signed integer wrap-around is the classic here: C
| assumes it can't happen, Rust assumes it might but is
| much less likely to encounter code where there's a
| question about it happening.
| TheCoelacanth wrote:
| I still think undefined behavior is the wrong choice
| here. It should have been implementation-defined, like
| what happens if you bit shift a negative integer to the
| right. They could pick two's complement or trap on
| overflow or whatever is most convenient on their
| platform, but not just assume it will never happen.
| progre wrote:
| There is a standard, sure. But there are also a _lot_ of
| compilers out there and I would bet that all but a few
| have either a "this compiles C11 except for [list of
| unimplemented features]" caveat or non-standard
| extensions.
| loeg wrote:
| > But there are also a lot of compilers out there and I
| would bet that all but a few have either a "this compiles
| C11 except for [list of unimplemented features]" caveat
| or non-standard extensions.
|
| Your statement is broadly reasonable. Here's the GP:
|
| > Shouldn't we just come clean and admit to ourselves
| that _there is no such thing as the C standard_? There is
| a collection of _loosely related languages_ that look
| similar and that collectively we call C, but really
| _they're all completely different and share almost no
| interoperability or common characteristics_. And those
| standards that do exist provide _almost no ability to
| reason about your code_ including things like ordering of
| statements.
|
| It's just a string of incorrect statements, or at best,
| extreme hyperbole.
|
| 1. There is a C standard.
|
| 2. There's only one C language. Implementations differ,
| but not "completely," and are mostly interoperable. They
| all have in common the standard language, of course.
|
| 3. The C standard provides a fairly strong mental model
| for reasoning about code. Especially in later revisions,
| but even in C99. "Almost no ability to reason about" is
| false.
|
| If you think C is fragmented or difficult to reason
| about, let me introduce you to Python (Python3, Jython,
| PyPy), Java (Oracle, OpenJDK, Dalvik), C# (Mono, .NET
| Core, .NET on Windows...), C++ (if you think implementing
| C99-C18 is hard, check out the stdlib required by modern
| versions of C++), etc.
| blackoil wrote:
| Don't know about C, but
|
| In Java/C#/Python, you have a standard and a dominant
| implementation which is compliant with that standard, plus a
| few more independent/niche implementations which may not be
| 100% compliant. Do we have a similar implementation in C?
| loeg wrote:
| I think you're maybe overstating the degree of
| homogeneity in Java/C#/Python. E.g., CPython3 cannot run
| Python2 code at all, and there are a lot of platform-
| specific modules and behaviors in the stdlib ("import
| os"). I don't think Python has _any_ real single standard
| to the level of detail that C does, although I may be
| mistaken.
|
| For C, on approximately the same platforms, to
| approximately the same degree: either GCC or Clang would
| be the corresponding standard-compliant implementation.
| ironmagma wrote:
| CPython is looked to as the canonical Python. IronPython
| and PyPy are all modeled after it and aim to behave as
| closely to it as possible, while CPython is considered
| the gold standard. Comparing CPython3 and CPython2 is
| orthogonal to that; one is not trying to implement or
| emulate the other. You have similar situations with Mono
| C# imitating Microsoft C# and IronRuby imitating Ruby. If
| there is a difference, it is considered a deviation from
| Ruby which is the reference implementation.
| progre wrote:
| You are right of course. I was thinking specifically
| about the early PIC Microchip compilers for "C" where the
| weird banked memory and Harvard RISC architecture made
| the "C" you wrote for those essentially non-portable even
| to other 8-bit micros. I think the Microchip marketing
| was very carefully trumpeting C but not claiming to
| follow any version of the standard though.
|
| And, of course, the community around PICs was almost
| uniformly on board with writing in assembly anyway.
| quelsolaar wrote:
| This is not a complexity issue with the function. The
| function is linear in the input, as it should be. The problem
| is that the implementation does more work than it needs to
| (it doesn't need the length of the string). It should be
| linear up to the end of parsing, not the end of the string.
| The quadratic complexity in this case comes from the loops
| calling it.
| echlebek wrote:
| Thanks for the great blog post, and congrats on finding that
| long dormant bug!
| beart wrote:
| There are currently 19 million+ results when searching for code
| using sscanf on GitHub. I wonder how many of those are suffering
| from the same problem.
| jchw wrote:
| It is a good opportunity to mention that you should not use
| strtod/strtol either if you can help it, since they are impacted
| by locale. Exactly what to use instead is a bit of a tough nut to
| crack; you could extract musl's floatscan code, or implement the
| Clinger algorithm yourself. Or, of course, use programming
| languages that have a more reasonable option in the standard
| library...
| kazinator wrote:
| I see you are reiterating this point raised in the previous
| discussion several days ago, but I don't think it is
| particularly well grounded.
|
| ISO C allows strtod and strtol to accept, other than in the "C"
| locale, additional "subject sequence forms".
|
| This does not affect programming language implementations which
| extract specific token patterns from an input stream, which
| either are, or are transformed into the portable forms
| supported by these functions.
|
| What the requirement means is that the functions cannot be
| relied on to reject inputs that are outside of their
| description. Those inputs could accidentally match some locale-
| dependent representations.
|
| You must do your own rejecting.
|
| So for instance, if an integer token is a sequence of ASCII
| digits with an optional + or - sign, ensured by your lexical
| analyzer's regex, you can process that with strtol without
| worry about locale-dependent behavior.
|
| Basically, rely on the functions only for conversion, and feed
| them only the portable inputs.
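|
| A sketch of that "reject first, convert second" pattern for an
| integer token (parse_int is a hypothetical helper):
|
|     #include <ctype.h>
|     #include <stdlib.h>
|
|     /* Accept only [+-]?[0-9]+ so no locale-specific form can
|        sneak through, then let strtol do the conversion. */
|     int parse_int(const char *tok, long *out) {
|         const char *p = tok;
|         if (*p == '+' || *p == '-') p++;
|         if (!isdigit((unsigned char)*p)) return -1;
|         while (isdigit((unsigned char)*p)) p++;
|         if (*p != '\0') return -1;   /* trailing junk */
|         *out = strtol(tok, NULL, 10);
|         return 0;
|     }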
| jchw wrote:
| I can't really understand what you mean. You can validate
| yourself that parsing with strtod will break if your system's
| locale is set to a locale where the decimal separator is a
| comma ',' instead of a period '.' - as an example, most
| European locales. Whether or not strtod will try to magically
| fall back to "C" locale behavior is irrelevant because it is
| ambiguous. For example, what do you do if you are in Germany
| and you try to parse 100.001? Is it 100001?
|
| strtod also doesn't guarantee round-trip accuracy that you
| can achieve when you use Steele & White and Clinger. All in
| all, I really think it is just not a good idea to use the C
| standard library for string operations.
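|
| You can see the ambiguity directly (a sketch; assumes the
| de_DE.UTF-8 locale is installed, and behaviour varies by
| system):
|
|     #include <locale.h>
|     #include <stdio.h>
|     #include <stdlib.h>
|
|     int main(void) {
|         setlocale(LC_NUMERIC, "de_DE.UTF-8");
|         /* With a comma decimal point, the '.' can stop the
|            parse early: often 100.0, not 100.001. */
|         printf("%f\n", strtod("100.001", NULL));
|         return 0;
|     }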
| kazinator wrote:
| Sorry about that; I see the text now in the description of
| strtod where it is required that the datum is interpreted
| like a floating-point constant in C, except that instead of
| the period character, the decimal point character is
| recognized.
|
| Yikes!
|
| There is a way to fix that, other than popping into the "C"
| locale, which is to look up that decimal character in the
| current locale, and substitute it into the string that is
| being fed to strtod. That's a fairly unsavory piece of code
| to have to write.
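|
| That unsavory substitution might look roughly like this (a
| sketch; assumes the locale's decimal point is one character
| and the token fits the buffer):
|
|     #include <locale.h>
|     #include <stdio.h>
|     #include <stdlib.h>
|     #include <string.h>
|
|     /* Make a portable "1.5"-style token acceptable to strtod
|        under any locale by swapping in its decimal point. */
|     double parse_c_double(const char *tok) {
|         char buf[64];
|         const char *dp = localeconv()->decimal_point;
|         snprintf(buf, sizeof buf, "%s", tok);
|         if (dp[0] != '.' && dp[1] == '\0') {
|             char *dot = strchr(buf, '.');
|             if (dot) *dot = dp[0];
|         }
|         return strtod(buf, NULL);
|     }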
|
| What locale issues did you find for strto{l,ll,ul,ull}?
|
| > _I really think it is just not a good idea to use the C
| standard library for string operations._
|
| I really despise locale-dependent behavior also, and try to
| avoid it.
|
| If you control the entire program, you can be sure nothing
| calls setlocale, but if you are writing middleware code,
| you can't assume anything.
|
| Also in POSIX environments, the utilities react to
| localization environment variables; they do call setlocale.
| And so much stuff depends on it, like for instance
| operators in regex! [A-Z] does not refer to 26 letters;
| what [A-Z] denotes in a POSIX regex depends on the
| collation order of the character set.
|
| There are functions in the C library without locale-
| specific behaviors, though, like strchr, strpbrk, strcpy,
| and their wide character counterparts.
| jchw wrote:
| > There are functions in the C library without locale-
| specific behaviors, though, like strchr, strpbrk, strcpy,
| and their wide character counterparts.
|
| Obviously these string functions are totally fine if you
| use them correctly, and their implementations should be
| fairly optimal. However, I do think they put a lot of
| onus on the programmer to be very careful.
|
| For example, using strncpy to avoid buffer overflows is
| an obvious trap, since it doesn't terminate a string if
| it overflows... strlcpy exists to help deal with the null
| termination issue, but it still has the failure mode of
| truncating on overflow, which can obviously lead to
| security issues if one is not careful. strcpy_s exists in
| C11 and Microsoft CRT, though I believe Microsoft's
| "secure" functions work differently from C11's. These
| functions are a bit better because they fail explicitly
| on truncation and clobber the destination.
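|
| One standard-C way to get that fail-loudly behaviour without
| any of the str*cpy variants (a sketch; copy_str is a
| hypothetical name):
|
|     #include <stdio.h>
|
|     /* snprintf always NUL-terminates; report truncation
|        instead of silently producing a shortened string. */
|     int copy_str(char *dst, size_t dstsz, const char *src) {
|         int n = snprintf(dst, dstsz, "%s", src);
|         return (n < 0 || (size_t)n >= dstsz) ? -1 : 0;
|     }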
|
| OpenBSD arguably has some of the best security track
| record of all C projects and I still feel wary about
| their preferred mechanism for string copying and
| concatenation (strlcpy and strlcat.) I feel strlcpy and
| strlcat are both prone to errors if the programmer is not
| careful to avoid security and correctness issues caused
| by truncation, and strlcat has efficiency issues in many
| non-trivial use cases.
|
| While there are obvious cases where dynamically allocated
| strings of arbitrary length, such as those seen in C++,
| Rust, Go, etc. can lead to security issues, especially
| DoS issues, I still feel they are a good foundation to
| build on because they are less prone to correctness
| issues that can lead to more serious problems. Whether
| you are in C or not, you will always need to set limits
| on inputs to avoid DoS issues (even unintentional ones)
| so I feel less concerned about the problems that come
| with strings that grow dynamically.
|
| One of the biggest complaints about prefix-sized
| strings/Pascal style strings is that you can't point into
| the string to get a suffix of the original string.
| However, in modern programming languages this is
| alleviated by making not only dynamic strings a
| primitive, but also string slices. (Even modern C++, with
| its string_view class.) String slices are even more
| powerful, since they can specify any range in a string,
| not just suffixes.
|
| So really locale and strtod are just little microcosms of
| why I am wary of C string handling. Clearly you can
| write code using C string functions that is efficient,
| secure and correct. However, I feel like there are plenty
| of pitfalls for all three that even experienced
| programmers have trouble avoiding sometimes. I don't
| actually know of a case where locale can break strtol,
| but it doesn't matter too much, since anyone can write a
| decent strtol implementation (as long as they test the
| edge cases carefully...) strtod though, is not so easy,
| and I guess that means apps are best off avoiding locales
| other than C. In a library though, there's not much you
| can do about it. At least not without causing thread
| safety issues :) In other languages, aside from dynamic
| strings and string slices, locale independent string
| functions are also typically the default. Rust's
| f64::from_str, Go's strconv.ParseFloat, C++'s
| std::from_chars and so forth. It's not too surprising
| since a lot of the decisions made in these languages were
| specifically made from trying to improve on C pitfalls. I
| do wish C itself would also consider at least adding some
| locale independent string functions for things like
| strtod in a future standard...
| [deleted]
| [deleted]
| tdeck wrote:
| This just makes me think that null-terminated strings are the bad
| gift that keeps on giving. If we were to design an OS, language,
| or standard library in 2021 (or even 1999) we probably wouldn't
| use them, but we're stuck with this relic of a former era.
| kaba0 wrote:
| The thing is, they are even worse for performance than string
| implementations that store the length.. that extra few bits of
| memory is much cheaper than checking the size of a string
| everywhere. For example, copying a string with known length.
|
| Also, C++'s strings even do some clever hacking where they
| store the text of shorter strings inline where the pointer
| would go (the small-string optimization), sparing a pointer
| lookup. And this is possible only because of abstraction.
| oblio wrote:
| They were designed when an extra byte or so per string cost
| you a lot of money. Nowadays, when 99% of the systems anyone
| will program start at 1MB RAM and 90% probably start at
| 512MB, they're a liability for almost no benefit.
| kaba0 wrote:
| You've got an extra byte either way, the \0 at the end.
| Which in many cases will make you copy a string because you
| can't just "point" into a string literal and say take n
| chars from there. Of course I am not that old so I don't
| have enough expertise -- but seeing that every other
| language even at the time decided against it is pretty
| telling.
| tdeck wrote:
| I think your parent was referring to the cost of storing
| a 2-byte string length instead of a 1-byte terminator. In
| the 1970s and 1980s, 2 bytes would likely be the minimum
| storage needed for the length of a general purpose string
| implementation. Although there were some language
| environments (e.g. Pascal) that had counted strings with
| a max length of 255.
| kaba0 wrote:
| Fair enough; but actually it can be more memory efficient
| as well because of the better reusability of substrings
| (in the case of null-terminated strings, only a suffix can
| be reused).
| wruza wrote:
| Ok, let's assume that a 10 MB JSON source was loaded into a
| non-null-terminated opaque struct str_t { size_t len; char
| *ptr; }. You have to parse a number from a position `i' and
| you have `double parse_number(str_t)'. Next obvious step?
| kevingadd wrote:
| You can keep the existing sscanf function and now strlen is
| O(1) so the bug is gone. Any questions?
| wruza wrote:
| I just can't figure out the exact code.
| [deleted]
| mkl wrote:
| I think your code would be pretty much the same, sscanf,
| strlen and all. The main differences would be the
| standard library's implementations of strlen and whatever
| function you use to read the file into a string in the
| first place.
| wruza wrote:
|     str_t json = loadfile();
|     size_t offset = random();
|     sscanf("%d", ?);
|
| With opaque str_t you can't just json[offset]. Should
| sscanf take offset with every string (sscanf(fmt, s,
| off))? Should we copy a slice of json and parse it?
| Should str_t have zerocopy mirroring ability (s2 =
| strmirror(s, off, len))? How many of these three are just
| snake oil that changes nothing?
|
| It's only pretty much the same until you try to write
| _actual code_ with a new idea in mind.
| adrianN wrote:
| C++'s std::string_view is essentially your struct. You
| can check the methods it provides.
| wruza wrote:
| Yes, I'm aware of it. I'm just tired of these layman's
| "oh that's another reason to ditch C strings", when it
| has nothing to do with it. Working with offsets requires
| handling offsets and lengths, be it explicit 'off' and
| 'n' or a string_view. All that is needed in this case in
| C is snscanf (note the 'n'), so that it would know its
| limits apriori, like snprintf does. Sadly that 'n' never
| made it into the standard.
| [deleted]
| mkl wrote:
| Okay, I see what you're saying now. I haven't worked with
| C strings in a while. Python uses offset parameters or
| seek operations in various places, and C++ string streams
| have an inherent position too (C++ probably has a number
| of other ways to do it too...).
| namibj wrote:
| Rust's &str type, or the non-unicode-assuming version
| &[u8], allow creating (sub-)slices, which probably
| matches your strmirror function. Except that the same
| syntax works for all (dynamic/static) length arrays, and
| even allows custom code to e.g. transparently wrap an
| SoA[0] representation.
|
| [0]: https://en.wikipedia.org/wiki/AoS_and_SoA
| TeMPOraL wrote:
| Well, in C++, it would read:
|
|     int target;
|     sscanf(json + offset, "%d", &target);
|
| Where str_t's operator+ would look roughly like:
|
|     str_t str_t::operator+(size_t offset) const {
|         return str_t{size - offset, ptr + offset};
|     }
|
| (Might look exactly like this, if str_t's constructor
| would throw if size was negative.)
| rahimiali wrote:
| You can offset your str_t by creating a new str_t that
| subtracts offs from the length and adds offs to the
| pchar. There is no need to keep track of the offset
| separately.
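|
| In plain C that sub-slicing is just pointer-and-length
| arithmetic (a sketch with hypothetical names):
|
|     #include <stddef.h>
|
|     typedef struct { size_t len; const char *ptr; } str_t;
|
|     /* Slice off the first `off` bytes, clamping at the end. */
|     str_t str_advance(str_t s, size_t off) {
|         if (off > s.len) off = s.len;
|         return (str_t){ s.len - off, s.ptr + off };
|     }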
| drodil wrote:
| Cool stuff mate! Searching for sscanf on GitHub produces over 19
| million results, with over 15 million in C code. I bet there might
| be a few cases where it could be replaced with something else... :)
| (not saying it's not a useful function when used correctly)
| cbsmith wrote:
| Honestly, I don't know why people store floats in anything other
| than binary. IEEE 754 (or bfloat16, if you don't want all that
| precision) is _so_ much more compact & efficient to parse. The
| amount of work the computer does to convert text floating point
| to binary representations, and the potential for subtle
| differences/bugs in the encoding, are just... tremendous.
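|
| A minimal sketch of the binary route (assumes writer and reader
| share endianness and IEEE 754 doubles):
|
|     #include <stdio.h>
|
|     /* Write/read raw doubles; no text conversion at all. */
|     size_t save(const double *v, size_t n, FILE *f) {
|         return fwrite(v, sizeof *v, n, f);
|     }
|     size_t load(double *v, size_t n, FILE *f) {
|         return fread(v, sizeof *v, n, f);
|     }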
| ZephyrBlu wrote:
| Based on Jonathan Blow's tweet, it seems like he wants to purge
| the web from existence.
| Keyframe wrote:
| That guy seems to understand and know all and everything, based
| on one and a half small games he made. Interesting character.
| known wrote:
| I use KERNEL code as reference/guidance whenever I have to use
| strlen/strtok etc.
| stefs wrote:
| many years ago i slaved away in a low end (5 people) ecommerce
| web shop with a homebrew php cms. i hated working there, but was
| too vain to quit. one day our biggest customer by far complained
| about slow loading speeds and i was set upon the task. it was
| spaghetti code of the worst sort, but i managed to find the
| culprit quickly: converting a list of rows (for pages) from the
| database into the hierarchical menu tree structure; of course it
| was O(n^2). i fixed it and the page generation time went down from
| 7 seconds to a few milliseconds again.
|
| they didn't let me push the fix back into the main repo because
| "it only affects this one customer, so we'll only fix it here".
| luckily i was fired shortly after. fin.
| another_kel wrote:
| By the way, does anyone know whether red dead redemption online
| has the same problem? Maybe the issue was fixed for the newer
| game but they decided not to update gta repo?
| [deleted]
| johnfn wrote:
| Loving the progression here. Tomorrow, someone's going to reduce
| the boot times of macOS by 90% by the same principle. A week from
| now, someone will prove P=NP because all the problems we thought
| were NP were just running strlen() on the whole input.
| xconverge wrote:
| Agreed, also hey bud, hope you are doing well
| johnfn wrote:
| Hey, there's a username I recognize! Long time no see! Same
| to you :)
| notretarded wrote:
| You too
| nightmunnas wrote:
| I don't think I've laughed this much since the pandemic
| started, well done.
| stainforth wrote:
| Next Hacktoberfest should offer a free t-shirt to anyone who
| goes out and issues a PR to a repo containing sscanf
| TeMPOraL wrote:
| You're joking, but now I'm thinking about the XML we parse at
| work and the library we're using to do it. We parse a lot of
| it, but I've always had this vague feeling that it takes a bit
| too long (given the codebase is C++).
|
| The XML library we use is rather well-known, so if someone
| found a bug like this there, I'd suspect a general improvement
| of performance across the board in the entire industry.
| Efficient Market Hypothesis tells me it's unlikely the library
| has this problem, but then again, so I thought about AAA
| videogames, and then the GTA Online thing came out.
| rightbyte wrote:
| I wonder if scanf on PlayStation was not using strlen in that
| way. GTA was written for PS, right?
| colejohnson66 wrote:
| It also runs on PC
| acqq wrote:
| It's possible. I've personally reduced the time spent
| reading a huge XML file on the startup of an application I
| was in charge of by at least 10x, by avoiding the library
| dependency and writing custom code. Having a lot of
| experience with such kinds of code and with performance
| issues, it was quite a fast change with no negative effects.
|
| The prehistory of that was simple: up to some point the
| amount of data stored was reasonably small. Then from some
| point on the amount of data grew significantly (a few orders
| of magnitude), and the startup times became very unpleasant.
|
| There's a lot that is going on when loading huge XML files.
| As an example, don't forget all the possible Unicode
| conversions, all the possible allocations of the elements in
| the handling code, just to be discarded etc.
|
| I don't suggest everybody doing it "just because" but if some
| specific use is known to have very specific assumptions and
| it is in the "hot path" and really dominates (profile first!)
| and it is known that only a small subset of all XML
| possibilities will ever be used it can be justified to avoid
| the heavy libraries. For example, in that specific case, I
| knew that the XML is practically always only read and written
| by the application, or by somebody who knew what he was
| doing, and not something that somebody random in some random
| form would regularly provide from the outside, and I knew
| that my change surely won't break anything for years to come,
| as I knew for sure that that part of application was not the
| "hot spot" of expected future changes.
|
| So it was a win-win. Immensely faster application startup,
| which is something that improved everybody's work, while
| preserving the "readability" of that file for the infrequent
| manual editing or control (and easy diff).
| MaxBarraclough wrote:
| I'm reminded of a 2008 article, _Why is D/Tango so fast at
| parsing XML?_ [0]
|
| One of the main factors seems to be that a lot of XML
| parser libraries, even the high-profile ones, did a lot of
| unnecessary copy operations. D's language features made it
| easy and safe to avoid unnecessary copying.
|
| I wonder what became of that Tango code.
|
| [0] https://web.archive.org/web/20140821164709/http://dotno
| t.org... , see also reddit discussion where WalterBright
| makes an appearance, https://old.reddit.com/r/programming/c
| omments/6bt6n/why_is_d...
| marton78 wrote:
| What about parsing that XML upfront, serialising to some
| binary format (e.g. CBOR, maybe with nlohmann's JSON library,
| or Cap'n Proto) and shipping the binary file?
| toyg wrote:
| _> it's unlikely the library has this problem_
|
| Any sufficiently-complex library code likely has plenty of
| problems, often unavoidably so (e.g. trade-offs between best
| performance and edge cases). Whether they have been found or
| not is a function of many, many factors.
|
| _> Efficient Market Hypothesis_
|
| I've lived long enough to be very sceptical about that sort
| of thing. Markets tend to be efficient _in aggregate_,
| maybe, but in the single case they can fail quite brutally.
| Look at how "dramatic" bugs are overlooked even in critical
| pieces of infrastructure like openssl, for years and years;
| maybe it happens _less_ for openssl than most equivalent
| libraries, but it still happens.
|
| Also, once the "market" for standards moves on, network
| effects make it very hard to have any meaningful competition.
| I mean, who writes XML parsers nowadays? Whichever XML lib
| was winning when JSON "happened" is now likely to stay in
| control of that particular segment; and the likelihood that
| top developers will keep reviewing it, falls off a cliff.
| Sprinkle a bit of cargo-cultism on top, and "efficient
| markets" become almost a cruel joke.
| lloeki wrote:
| > I've lived long enough to be very sceptical about that
| sort of thing.
|
| I've also seen this story unfold too many times:
|
| _code code code build run fail_
|
| > dammit, I could have sworn this was correct?!
|
| _think think think GOTO 1_
|
| > no way, my code has to be wrong, this can't be the
| compiler?! it's never the compiler! right?
|
| _reduce code to a generic two liner, build, run, fail_
|
| > oh.
|
| _open support ticket at compiler vendor_
| drdec wrote:
| If you have a lot of nesting in the XML, and it is formatted
| for human reading (i.e. indented), you may want to consider
| not doing that. We had a project where we were creating
| human-readable versions of the XML (mostly for developer
| convenience) and then parsing it. When we stopped adding all
| the extra white space the parsing speed increased a couple of
| orders of magnitude. (The downside was we no longer had built
| in coffee breaks in our development process.)
| TeMPOraL wrote:
| That's interesting. I can't think of a mechanism why this
| would give so much of a performance boost, though -
| rejecting extra whitespace should be just a matter of a
| simple forward scan against a small set of characters,
| shouldn't it?
|
| (Or maybe in your case something was running strlen() a lot
| during parsing, and just the difference in file size
| explains the boost?)
| CGamesPlay wrote:
| And maybe, in a decade or so, the man page for these functions
| will list their algorithmic complexity! That was the most
| interesting takeaway from this article, for me at least. I have
| only seen one or two libraries that actually list this in
| their documentation.
| yread wrote:
| How would it help you knowing that it's O(n)? It needs to
| read all the characters of the float. The problem is that it's
| needlessly reading characters even after the float.
| est31 wrote:
| The cppreference page linked by the blog post has been
| changed since:
| https://en.cppreference.com/w/cpp/io/c/fscanf#Notes
|
| > Note that some implementations of sscanf involve a call to
| strlen, which makes their runtime linear on the length of the
| entire string. This means that if sscanf is called in a loop
| to repeatedly parse values from the front of a string, your
| code might run in quadratic time
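|
| The usual linear-time alternative is to track the end pointer
| yourself instead of looping sscanf (a sketch using strtof):
|
|     #include <stdlib.h>
|
|     /* Parse whitespace-separated floats in O(n) total: strtof
|        reports where it stopped, so nothing is re-scanned. */
|     size_t parse_all(const char *s, float *out, size_t max) {
|         size_t n = 0;
|         char *end;
|         while (n < max) {
|             float v = strtof(s, &end);
|             if (end == s) break;  /* no more numbers */
|             out[n++] = v;
|             s = end;
|         }
|         return n;
|     }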
| TeMPOraL wrote:
| Good. I'm so happy they put it there. It's a little thing,
| but such little things - documenting corner cases - can
| have great benefits.
|
| I have a bad memory for all but most frequently used
| standard library calls, so I regularly end up refreshing my
| memory from cppreference.com, and I tend to instinctively
| scan any notes/remarks sections, as there's often critical
| information there. So now I can be sure I'll be reminded of
| this the next time I need to use scanf family.
| BenFrantzDale wrote:
| All of the C++ algorithms list complexity guarantees, I
| believe. It stunned me to learn from this saga that C doesn't
| seem to do this.
| TheOtherHobbes wrote:
| That's because C is just a wrapper for machine code on a
| cheap PDP-11, and cheap PDP-11s didn't have enough RAM to
| do complexity.
| 0xcde4c3db wrote:
| It's easy to forget that the original C standards were
| largely codifying existing practice during an era when
| using gets() [1] was existing practice. The world wasn't
| quite ready for Ada, I guess. Best-laid plans of mice and
| men etc. etc..
|
| Also, keep an eye out for "amortized" complexity. This does
| have a legitimately rigorous definition, but for latency-
| bound paths it can practically amount to "O(whatever),
| except for the particular invocations that are far, far
| worse under unspecified conditions".
|
| [1] https://www.man7.org/linux/man-
| pages/man3/gets.3.html#BUGS
| alexchamberlain wrote:
| It makes me chuckle when hash maps are stated to have O(1)
| insertion. Which is true with respect to the number of
| items in the map, assuming the map doesn't need resizing
| and there isn't a hash collision... but it's generally
| not true with respect to the key length. (I think most
| implementations are O(l*n) for n insertions, where l is the
| length of the key, assuming the hash function is O(l) - the
| _amortised_ runtime per insertion would be O(l))
| thaumasiotes wrote:
| > assuming the map doesn't need resizing
|
| This isn't a big difficulty; it's still amortized O(1).
|
| > and there isn't a hash collision
|
| This is a real difficulty, unless you allow map resizing.
| Luckily, we do.
|
| > but it's generally not true in respect to the key
| length.
|
| OK, but in most cases the key length is constant, making
| anything that depends on the key length O(1) by
| definition.
| mnw21cam wrote:
| I wrote my own version of a part of a very popular Java
| scientific tool, and my version runs about 50 times
| faster. Their mistake? They had a hashCode()
| implementation on the objects they were using as keys for
| HashMaps that iterated through all of the voluminous
| content of that object. And there was no point - they
| could have used IdentityHashMaps instead with the same
| result. I pointed this out to them, and they still
| haven't fixed it.
| TeMPOraL wrote:
| I'm guessing GP means the complexity guarantee sidesteps
| the complexity of the hashing function. It probably
| doesn't matter all that much in typical case - I'm
| guessing 80-90% of hash map use is with very short
| strings.
| thaumasiotes wrote:
| Short strings, long strings; they're going to use the
| same key length. Calculating the key may take longer for
| the long string, if you're basing the hash on the
| contents of the string[1], but the key won't end up being
| a different size. The md5 of a 3-byte string is 16 bytes
| and the md5 of a 40GB string is also 16 bytes.
|
| [1] Not typical. e.g. Java takes the hash key of an
| object to be its address in memory, which doesn't require
| looking at the contents.
| iainmerrick wrote:
| _Calculating the key may take longer for the long string_
|
| Right, that's exactly what they are warning about.
|
| _Not typical. e.g. Java takes the hash key of an object
| to be its address in memory_
|
| No, that's just the base implementation in Object (and
| arguably it was a bad idea). All useful "value type"
| classes will override it with a real hash of the content,
| including String.
|
| There are some cases in Java where you do want to use IDs
| instead of values as your map keys, but they're rare.
| thaumasiotes wrote:
| > All useful "value type" classes will override it with a
| real hash of the content
|
| Well, this is necessary for a lot of sensible things
| you'd want to do with non-numeric value types as hash
| keys...
|
| > including String
|
| ...except String is something of an intermediate case.
| There are loads of use cases where what you're really
| using is a set of constant strings, not variables that
| contain arbitrary character data. In that case, you
| should intern the strings, resulting in non-"value type"
| keywords where the only thing you care about for equality
| is whether two keywords do or don't have the same machine
| address.
|
| I don't actually know how Java handles this, but I had
| the vague idea that two equal String literals will in
| fact share their machine address. And String is
| specifically set up to accommodate this; Strings are
| immutable, so in theory it could easily be the case that
| any two equal Strings must share their machine address,
| even if you got them from user input.
| mnw21cam wrote:
| Yes, Strings are immutable, so they only calculate their
| hashCode once, then cache it. However, you need to
| explicitly intern them with String.intern() if you want
| to avoid multiple copies of the same String.
| heavenlyblue wrote:
| > Strings are immutable, so in theory it could easily be
| the case that any two equal Strings must share their
| machine address, even if you got them from user input.
|
| Hey, and now you have two problems: String hashing and
| finding all strings which are equal to each other in
| memory
| thaumasiotes wrote:
| Well, no, the whole point of this discussion is that
| solving the second problem means the first problem never
| comes up.
|
| And this isn't exactly some exotic approach; how often do
| you think people write Hashes in Ruby where the keys they
| use are all symbols? It's so common that there's
| dedicated syntax for it.
| TeMPOraL wrote:
| It's as old as Lisp, but there's a reason symbols exist
| separately from strings - they're used differently.
| Strings are frequently transformed, symbols almost never
| are. String are frequently taken from end-user input,
| symbols very rarely. Strings sometimes are very large,
| symbol names are almost universally very short.
|
| The problem is, interning is an expensive operation. It
| means adding to an ever growing database of strings, but
| first checking if the string isn't already there. You
| don't want to do that every time you change case or flip
| a letter in a string, or use it to access a hash table.
| I'm not saying it can't be done, but I honestly have no
| idea how to implement sane, generic, automatic interning
| of strings. I feel more comfortable having a symbol type,
| and control over turning strings into symbols.
| thaumasiotes wrote:
| I definitely agree that uninterned strings are important.
| All I'm really trying to say down here is that there are
| many cases where you have a hash table which uses strings
| as keys (as an implementation detail), when
| (conceptually) it wants to be using symbols.
|
| (And on a less fundamental level, the particular Java
| String class is less string-like and more symbol-like
| than most string types, and this appears to have been
| done intentionally.)
| erik_seaberg wrote:
| Java does intern string literals and constants, but you
| can't rely on reference equality unless you intern every
| string you create at runtime by formatting or decoding,
| and it isn't specified whether that creates strong
| references that will never be GC'd.
| steerablesafe wrote:
| Well-written complexity guarantees specify the operations
| they count. Otherwise sorting in O(n log(n)) also
| "sidesteps" the cost of comparison too.
| SilasX wrote:
| And the analysis of hashmaps is _not_ such a well-written
| guarantee -- as you resize, you need a bigger hash
| function output to reach all possible buckets. A bigger
| hash function output, assuming you have to keep the
| avalanche effect to keep output well-scrambled, requires
| more computations.
|
| Earlier discussion:
| https://news.ycombinator.com/item?id=9807739
| ummonk wrote:
| Key length must necessarily be O(log(N)) to be able to
| identify N different keys.
| thaumasiotes wrote:
| This is O(1) where N is constant.
| layoutIfNeeded wrote:
| Yes. Everything is O(1) if N is constant, including
| log(N), N^2, 2^N, N!, etc. That's a tautology.
| thaumasiotes wrote:
| > Everything is O(1) if N is constant, including log(N),
| N^2, 2^N, N!, etc.
|
| Not even close. 2^k is not O(1) by virtue of N being
| constant. Only 2^N.
|
| This has been covered above. It is more common to
| consider the complexity of hash table operations in terms
| of the number of operations, or the size of the table;
| the size of the key is very often constant. These are
| different variables; the constant size of the key does
| not trivialize the complexity of inserting N items each
| with a constant key size.
| SilasX wrote:
| Here, the relevant key is the _output_ of the hash
| function though -- that 's what you need to increase in
| order to ensure you can reach all buckets. And that (k)
| must increase with the table size. So it is not constant
| and depends on n (table size).
|
| Earlier discussion:
| https://news.ycombinator.com/item?id=9807739
| thaumasiotes wrote:
| I remember a proof in CLRS which first developed a
| function that was bounded above by 5 for all conceivable
| input ("a very quickly-growing function and its very
| slowly-growing inverse"), and then substituted the
| constant 4 or 5 into a complexity calculation in place of
| that function, giving a result which was "only" correct
| for all conceivable input.
|
| The same approach applies to key length requirements for
| hash tables with arbitrarily large backing stores. They
| do not grow as slowly as the CLRS log* function, but they
| grow so slowly that there are easily identifiable sharp
| limits on how large they can be -- an easy example is
| that a hash table cannot use more memory than the
| hardware offers no matter how the software is written. A
| backing store with 1TB of addressable bytes cannot _need_
| the key to be more than 40 bits long.
|
| On a different note, by "table size" in my earlier
| comment I meant to refer to the number of entries in the
| table, not the capacity of the backing store. It seems
| like you might be using the same word for a different
| concept?
| SilasX wrote:
| >The same approach applies to key length requirements for
| hash tables with arbitrarily large backing stores. They
| do not grow as slowly as the CLRS log* function, but they
| grow so slowly that there are easily identifiable sharp
| limits on how large they can be -- an easy example is
| that a hash table cannot use more memory than the
| hardware offers no matter how the software is written. A
| backing store with 1TB of addressable bytes cannot need
| the key to be more than 40 bits long.
|
| So? That's still putting a bound on table size, which
| makes it in-practice constant, but doesn't make the
| algorithm O(1), because you can never get such a result
| by bounding n, for the reasons the GGP gave -- that's
| cheating.
|
| Your complexity bound has to be written on the assumption
| that n (number of elements to store in hashtable)
| increases without bound. Assuming you will never use more
| that Y bytes of data is not valid.
|
| >On a different note, by "table size" in my earlier
| comment I meant to refer to the number of entries in the
| table, not the capacity of the backing store. It seems
| like you might be using the same word for a different
| concept?
|
| No, I was using table size exactly as you, to mean the
| number of elements stored. Is there a reason my comments
| only made sense under a different definition? If not, be
| charitable. (And avoid using obscure terms.)
| thaumasiotes wrote:
| > No, I was using table size exactly as you, to mean the
| number of elements stored. Is there a reason my comments
| only made sense under a different definition? It not, be
| charitable. (And avoid using obscure terms.)
|
| I interpreted your comment to refer to the size of the
| backing store, because that is fundamentally what a hash
| key needs to be able to address.
|
| I didn't mean to say that, if you were using it that way,
| you were doing anything wrong, only that there appeared
| to be a mismatch.
| SilasX wrote:
| >I interpreted your comment to refer to the size of the
| backing store, because that is fundamentally what a hash
| key needs to be able to address.
|
| Under the assumption (upthread) of constant resizing as
| element are added, the distinction is irrelevant. The
| more elements you have in the table, the more elements
| you need to address, and the more possible outputs your
| hash function needs to have.
|
| And the needed size of the backing store scales with the
| number elements you want to store anyway.
|
| >I didn't mean to say that, if you were using it that
| way, you were doing anything wrong, only that there
| appeared to be a mismatch.
|
| Why bring up something like that if it doesn't translate
| into something relevant to the discussion e.g. to show my
| point to be in error?
| aliceryhl wrote:
| If the key length is constant, the map has an upper limit
| on the number of possible elements, so all operations are
| constant time.
| dorgo wrote:
| But, isn't the key length a constant and we are back to
| O(1)? Ok, in theory you could exhaust all possible keys
| of a certain length and proceed with longer keys. It
| would give us what? O(ln(n))?
| OskarS wrote:
| His point is, if you use Moby Dick as the key, it's going
| to take longer to hash that than a three letter string.
| Hashing isn't O(1) if the key has variable size.
| lostcolony wrote:
| ...I fully plan to use "O(whatever)". Not sure for what.
|
| But, yes. (naive) Quicksort's amortized complexity being
| O(nlogn), but its O(n^2) on already sorted data, is all I
| ever needed to learn to take away that lesson. When
| sorting already sorted data is worse than sorting
| randomized data, it's a quick realization that "amortized
| cost" = "read the fine print".
| nyanpasu64 wrote:
| Quicksort as O(n log n) is not amortized complexity, but
| average runtime for random data.
| lostcolony wrote:
| Fair point; I'm confusing my terminology. Analogy and
| realization still holds.
| virgilp wrote:
| Also, already sorted data.. in reverse order. If it's
| already sorted in the right order, quicksort takes linear
| time. This is an important difference - data you use
| might indeed often be appropriately sorted, but in
| practice will seldom be sorted in reverse order.
| SuchAnonMuchWow wrote:
| If it's already sorted in the right order, quicksort runs
| in O(n log n). Quicksort is O(n log n) best case, O(n^2)
| worst case.
| virgilp wrote:
| Actually, yeah, my original reply was dumb, I forgot
| quicksort :)
|
| It can be n*n for properly-sorted arrays too, depending
| on pivot selection (for randomized pivot selection, it's
| n log n).
| lostcolony wrote:
| Yes; random pivot selection is nlogn (unless you are
| very, very, statistically impossibly, unlucky. Or using
| very short arrays where it doesn't matter anyway).
|
| But I'm pretty sure data sorted in either direction
| (i.e., 'reversed' or not, ascending or descending), and
| taking a pivot from either end, is n^2. It doesn't have
| to be reversed; you always end up with everything
| unsorted ending up on one side or the other of the pivot,
| with each recursive step being just one less comparison
| than the step prior, meaning it has N-1 + N-2 + ... + 1
| comparisons regardless of which way the array is sorted,
| or N(N-1)/2 comparisons total (Gauss' formula but
| starting at one less than the total number of items N,
| since that's the number of comparisons each step), which
| is O(N^2). There is no case where it's linear time,
| unless you are first iterating across the array to select
| the first pivot that is out of place (which may be a
| reasonable optimization, but can also be made to apply
| regardless of what direction the array is sorted).
| [deleted]
| jameshart wrote:
| On the contrary: very common UI pattern to have a data
| grid that sorts by a particular column when you click the
| header, then reverses that sort order when you click the
| header again. So for a user to sort by date, descending,
| they click the header, causing an ascending sort, then
| click it again, causing a descending one.
|
| Often such a grid will be quite well abstracted from its
| data source - it might be executing a remote query to
| return data in the new order every time - but I bet there
| are some examples out there that are backed by a local
| dataset and carry out an actual sort operation when you
| hit the header... and fall into a quicksort worst case if
| the user clicks the same header twice in a row.
| mcguire wrote:
| Something that is amortized complexity:
|
|     vector.push(x)
|
| Most of the time, it's O(1). Sometimes it's O(n). If you
| double the size of the backing array when it runs out of
| space, it's amortized O(1).
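|
| A sketch of that doubling strategy in C (vec and vec_push are
| hypothetical names):
|
|     #include <stdlib.h>
|
|     typedef struct { int *data; size_t len, cap; } vec;
|
|     /* Occasional O(n) regrowth, amortized O(1) per push. */
|     int vec_push(vec *v, int x) {
|         if (v->len == v->cap) {
|             size_t ncap = v->cap ? v->cap * 2 : 4;
|             int *nd = realloc(v->data, ncap * sizeof *nd);
|             if (!nd) return -1;
|             v->data = nd;
|             v->cap = ncap;
|         }
|         v->data[v->len++] = x;
|         return 0;
|     }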
| ncmncm wrote:
| Interestingly, it is quite easy to double the speed of
| Quicksort, even at this late date, with just a couple-
| line change.
|
| <http://cantrip.org/sortfast.html>
|
| Anyway I thought it was interesting.
| lloeki wrote:
| I (sadly) have to resort to using "O(scary)" too often for
| my taste.
|
| From: https://stackoverflow.com/a/185576/368409
|     /* This is O(scary), but seems quick enough in practice. */
| inimino wrote:
| It's also easy to forget that C was competing mainly with
| assembly, while C++ competed with managed languages. The
| early C programmer ethos, especially among library
| authors, was much more along the lines of "look at the
| generated object code if you want to know what it's
| doing" while modern practice leans more towards "read the
| documentation for complexity guarantees". I'm not saying
| that worse documentation leads to better programmers, but
| I'm not not saying that either. Practices change,
| standards change.
| DarkWiiPlayer wrote:
| Good documentation and inspecting the compiled bytecode
| are both good ways of finding out about performance
| characteristics of certain features. The problem starts
| when people rely on assumptions ("sscanf should be fast
| because it's widely used") or performance folklore
| ("localizing every function you'll ever use makes your
| Lua code faster"), because those tend to either be
| completely wrong or lack very important context.
| IggleSniggle wrote:
| I live in js land, and the barrier between "folklore" and
| "documentation" is extremely thin. Especially since V8
| may introduce changes at any time that affect performance
| characteristics of js.
|
| I'd respond with "well if performance matters it
| shouldn't be in js" except for all the shite being
| written in js these days, with js being the hammer that
| makes everything else look like a nail.
| tedunangst wrote:
| Are there complexity guarantees for
| std::istream::operator>>?
| rtpg wrote:
| In the standard there are things like "exactly N
| operations", but I'm not seeing anything for `istream`.
| There's like... an explanation of how things should work, and
| I imagine you can derive complexity from it, but I think
| `istream` is a bit special since you're talking about a
| wrapper for (potentially) an arbitrary input source.
|
| [0]: https://eel.is/c++draft/alg.move#15
|
| [1]: https://eel.is/c++draft/input.streams#istream.extractors-7
| mkl wrote:
| No, or at least not in general, because you can overload
| it for your custom types with whatever terrible
| implementation you like.
| loeg wrote:
| This is true in essentially the same way for C's FILE,
| which is the point I think Ted is insinuating.
| formerly_proven wrote:
| I don't know if it is required to, but there doesn't really
| seem to be an upper bound on what glibc's scanf will eat for
| a %f (e.g. a gigabyte of zeroes followed by "1.5" will still
| be parsed as 1.5), so for that implementation there certainly
| isn't a trivial upper bound on the amount of input read and
| processed for %f, as you might otherwise expect.
|
| Yet another reason to not stringify floats. Just use
| hexfloats (but beware of C++ standard bugs involving >>) or
| binary.
|
| Unfortunately "gigabytes of numerical data, but formatted as
| a text file" is commonplace. For some reason HDF5 is far less
| popular than it ought to be.
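|
| A small sketch of the hexfloat route (C99's %a format and
| strtod, which also accepts hex floats; the value is just an
| example):
|
|     #include <stdio.h>
|     #include <stdlib.h>
|
|     int main(void) {
|         double x = 1.5;
|         char buf[64];
|         snprintf(buf, sizeof buf, "%a", x);  /* "0x1.8p+0" */
|         double y = strtod(buf, NULL);        /* exact round-trip */
|         printf("%s -> %g (equal: %d)\n", buf, y, x == y);
|     }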
| dominicl wrote:
| But why do strlen() at all? And why do all platforms
| (Linux, Windows, macOS) seemingly do that?
|
| I think you're right that there is no upper bound, but it
| shouldn't be necessary to do a full strlen() if you scan
| incrementally instead. You could go char by char until the
| pattern '%f' is fulfilled and then return. That would solve
| the issue at its root -- and who knows how many programs
| would suddenly get faster...
|
| So looking at the glibc sources I've found the culprit in
| abstraction. It looks like a FILE*-like string buffer object
| is created around the C string:
| https://sourceware.org/git/?p=glibc.git;a=blob;f=stdio-commo...
|
| And that abstraction does call something similar to strlen()
| when initializing, to know its bounds, here:
| https://sourceware.org/git/?p=glibc.git;a=blob;f=libio/strop...
|
| I'm reading this source for the first time, but I guess to
| not break anything one could introduce a new type of FILE*
| string buffer, say in 'strops_incr.c', that works
| incrementally, reading one char at a time from the underlying
| string and skipping the strlen()...
|
| It would be awesome if GTA Online loaded faster under Wine
| than on Windows :-)
| JdeBP wrote:
| See https://news.ycombinator.com/item?id=26298300 . The
| alternative implementation technique that already exists
| in some C implementations is not to use nonce FILE
| objects at all.
| fastball wrote:
| Listing time complexity up top is my favorite thing about the
| Redis docs.
|
| https://redis.io/commands/
| iab wrote:
| Amazing
| ineedasername wrote:
| That's actually a very simple one. Just run a regex on "P !=
| NP" to remove the "!" and you're good to go.
| ctoth wrote:
| Seriously the most I have laughed in like 6 months. Which
| probably says a lot more about me than this joke. I know that
| jokes aren't really welcome on HN, and I generally really
| like this policy. But just had to mention this was just ...
| what I needed to read right now.
| koalahedron wrote:
| > I know that jokes aren't really welcome on HN
|
| IMO, while I really don't come to HN to find dial-a-joke,
| or joke-of-the-day, I think some humor is essential in
| modern life.
|
| Since we're talking about Matt Keeter, you will find he has
| a great sense of humor if you read his website or interact
| with him. Some of his jokes are ROTFL funny, but subtle.
| specialist wrote:
| You kid. But truer things are said in jest.
|
| > _...Tomorrow, someone's going to reduce the boot times of
| macOS by 90% by the same principle._
|
| My 2019 MacBook often pauses when I connect the charging cable.
| Sometimes it just seizes, requiring a hard bounce.
|
| Clearly there's a contended lock buried deep. Something
| non-obvious.
|
| I'm _certain_ everything these days has dozens of hidden
| quadratics and contended locks.
|
| Which is one of the reasons I'm excited for stuff like
| structured concurrency (Java's Project Loom) and retoolings
| like systemd becoming the norm.
|
| Ages ago I worked on a kitchen-sink app that had a very finicky
| startup. Any breeze would break it. Much consternation from
| mgmt. Apparently if we only clapped louder, Tinkerbell would
| fly. I couldn't take it any more. LARPing as a bulldozer, I
| replumbed the innards, changing from something like initd to be
| more like systemd, with some lazy loading for good measure.
| Voila!
|
| Back to GTA. The failure here is the product owner didn't
| specify a max load time, and then hold the team to it. Devs
| will mostly do the work that's expected of them. If load time
| wasn't measured (and managed), no one is going to bother with
| expunging sscanfs.
| dcolkitt wrote:
| > My 2019 MacBook often pauses when I connect the charging
| cable. Sometimes it just seizes, requiring a hard bounce.
|
| Yesterday my MBP kernel panicked because my keyboard was low
| on battery and the bluetooth connection kept dropping.
| There's something weird with MacOS where peripherals seem to
| really not be well isolated from the core OS runtime.
| mfbx9da4 wrote:
| actually lolled
| bsldld wrote:
| > A week from now, someone will prove P=NP because all the
| problems we thought were NP were just running strlen() on the
| whole input.
|
| You owe me a keyboard!
| eru wrote:
| You joke, but there's actually lots of work going into what
| techniques will definitely NOT be enough to settle P=NP.
|
| (I find it pretty exciting that this kind of negative result
| is possible. Ain't mathematics wonderful?)
| fanf2 wrote:
| Well,
| https://twitter.com/stephentyrone/status/1366573121365565444
|
| >> _So many years ago when I first took over the iOS string
| functions, I found that like 30% of boot time in debug builds
| was in strstr._ <<
|
| >> _Needless to say, we did not fix that issue by writing a
| more efficient strstr. Removed the parser and then removed
| strstr from the environment where it had been in use =)_ <<
| coolgeek wrote:
| I just got an infinite loop down to 8.6 seconds! And I'm not
| done yet!
| kellyrose019 wrote:
| I'm available, just follow my link
| https://rebrand.ly/sexygirls007
| dutchbrit wrote:
| Well, that's the first time I've encountered real spam on HN...
| dmitryminkovsky wrote:
| Same! In all these years... I am examining it now and dare I
| say it is just basic spam. "It's better than Tinder!" A
| redirect, but then just a very small, simple page with no JS
| that looks malicious. Strange community to target. :/
| flukus wrote:
| It comes up quite often if you sort by new or comments. The
| mods here are very on top of it.
| mkl wrote:
| You see it if you have showdead turned on, littering the
| bottoms of threads. There's not very much of it though; I
| can't imagine it's very effective. Brand new accounts that
| post links in their first comments often get automatically
| shadow-banned, which is why the spam isn't very visible.
| mathattack wrote:
| Performance is one of the hardest things in programming.
| Separates the folks who deeply know the craft from those who
| don't.
| scaramanga wrote:
| "it will open a 97 MB binary STL file in about 165 milliseconds
| flat, on a 2013 Macbook Pro. This is blinding fast."
|
| This actually sounds incredibly slow, that's nearly 1/5th of an
| entire second. What can it possibly be doing? :)
|
| In case anyone else was wondering, I followed the link and
| clicked the description and this is actually based on the time to
| the first frame being rendered - not just the time to load the
| file (which should be essentially infinitely fast). So it's
| actually more impressive than it sounds.
| Const-me wrote:
| > What can it possibly be doing?
|
| Possibly, two things.
|
| 1. Indexing the mesh. STL files don't contain meshes; they
| instead have a triangle soup. Indexed meshes are more
| efficient for rendering: they save VRAM bandwidth and vertex
| shader work.
|
| 2. Computing normals. STL files have per-triangle normals
| (which can be complete garbage, because most software ignores
| them); for smooth surfaces you want per-vertex normals.
| Computing them well (like I did there:
| https://github.com/Const-me/Vrmac#3d-gpu-abstraction-layer )
| is slow and complicated.
| tomck wrote:
| Is that really that slow? idk how they even read the file in
| that amount of time, my drive is only about 125MB/s
| mkeeter wrote:
| Touche - after all, disk-to-RAM is hundreds of MB/s, and faster
| if it's cached!
|
| In practice, I'm racing mesh loading against "how long does the
| OS take to give you an OpenGL context", which is rarely below
| 160 ms (longer if it has to switch from integrated to discrete
| GPU).
| scaramanga wrote:
| Great data-point, I was wondering what the costs of all that
| would be.
|
| How do you measure when the frame is done, are you waiting
| for a vsync?
| mkeeter wrote:
| It's self-reported by the logging system after
| glfwShowWindow() returns for the first time, so probably
| not 100% accurate, but reasonably close.
|
| A truly fancy system would be something like Tristan Hume's
| keyboard-to-photon latency system:
| https://thume.ca/2020/05/20/making-a-latency-tester/
| titzer wrote:
| If you do anything with a file where you actually have to
| touch every byte (e.g. parsing), it is pretty impressive to
| get anything to go faster than 100 MB/s. 500 MB/s _is_
| blindingly fast, IMHO.
| scaramanga wrote:
| Yeah, parsing text with a state machine is slow. Parsing,
| say, HTTP at that speed would be impressive without SIMD. But
| this is a binary file full of fixed sized structures, hence
| my confusion.
|
| Anyway, the answer is there, it's actually measuring the
| performance of sending the mesh to openGL, to the GPU, and
| getting a frame rendered.
| fsociety wrote:
| How can loading a file be infinitely fast? There is latency to
| receive the first byte from loading the file.
| gowld wrote:
| Predict your access pattern and prefetch into RAM or DMA.
| scaramanga wrote:
| When there is nothing to do. Like with an array of fixed
| sized structures all you need to know is how many there are
| and then you can increment a pointer past any number of those
| objects, and that pointer increment itself can probably be
| compiled away to nothing.
|
| It depends on exactly what you're measuring, you see. If you
| are loading data in order to do something, then the "do
| something" part will incur 100% of all the costs in that
| case. So when you subtract the "do something" part to just be
| left with the "load it" part, you can end up with a
| meaningless zero-to-do/doing-it-infinitely-fast kind of a
| result.
|
| So then, what would you measure? Just peak RAM throughput?
| You can get that from the data-sheets :)
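|
| To make that concrete for binary STL (80-byte header, uint32
| triangle count, then packed 50-byte records), a rough sketch of
| the "nothing to do" case - the function name is made up, and
| #pragma pack is a common compiler extension rather than
| standard C:
|
|     #include <stdint.h>
|     #include <string.h>
|
|     #pragma pack(push, 1)
|     typedef struct {
|         float    normal[3];
|         float    verts[3][3];
|         uint16_t attr;
|     } stl_tri;              /* 50 bytes, matching the file layout */
|     #pragma pack(pop)
|
|     /* Returns a view of the triangle records inside buf, or NULL.
|        Assumes a little-endian host, like the file format itself. */
|     const stl_tri *stl_view(const uint8_t *buf, size_t len,
|                             uint32_t *count) {
|         if (len < 84) return NULL;
|         memcpy(count, buf + 80, sizeof *count);
|         if ((len - 84) / sizeof(stl_tri) < *count) return NULL;
|         return (const stl_tri *)(buf + 84);
|     }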
| esperent wrote:
| From my experience creating loaders for 3D formats (FBX, glTF,
| LWO) it's not loading the file that takes a long time, it's
| parsing the data in the file and converting it to a suitable
| format for rendering in OpenGL. In practice, most people use
| the terms "parsing" and "loading" interchangeably, or "loading"
| means "reading + parsing file".
|
| There can be a lot of processing involved (looking at you FBX)
| or less (glTF, probably STL) but there's still going to be at
| least decompressing the binary data and copying it into buffers
| then uploading those to the GPU. So, without knowing how STL
| binary is specified, parsing 97mb in 165ms seems reasonable.
| rcpt wrote:
| Ok but his side project didn't have a whole mission devoted to
| mocking tech workers
| astrea wrote:
| It has been a hot minute since I've touched C, so I'm failing to
| grok the issue here. Sscanf is reading the data variable for a
| float-formatted string into a float variable. How is that also
| getting the size? What is different about strtof? It looks from
| the docs that it does something similar, just without using the
| formatting string.
| edflsafoiewq wrote:
| > sscanf() converts the string you pass in to an _IO_FILE* to
| make the string look like a "file". This is so the same
| internal _IO_vfscanf() can be used for both a string and a
| FILE*.
|
| > However, as part of that conversion, done in a
| _IO_str_init_static_internal() function, it calls __rawmemchr
| (ptr, '\0'); essentially a strlen() call, on your input string.
| This conversion is done on every call to sscanf(), and since
| your input buffer is rather large, it'll spend a fair amount of
| time calculating the length of the input string.
|
| https://stackoverflow.com/a/23924112
| froh wrote:
| and the proper remedy is to use fmemopen(3), which makes the
| temporary FILE object explicit and persistent for the whole
| parse, and needs just one strlen call.
|
| https://news.ycombinator.com/item?id=26343149
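|
| A minimal sketch of that remedy (POSIX fmemopen; names and
| error handling are illustrative):
|
|     #include <stdio.h>
|     #include <string.h>
|
|     void parse_all_floats(const char *data) {
|         /* One length scan, one FILE object for the whole parse. */
|         FILE *f = fmemopen((void *)data, strlen(data), "r");
|         if (!f) return;
|         float v;
|         while (fscanf(f, "%f", &v) == 1) {  /* no strlen per call */
|             /* ... use v ... */
|         }
|         fclose(f);
|     }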
| [deleted]
| CamperBob2 wrote:
| Everybody in this thread seems to be missing a particularly
| large elephant in this particular room, which is that sscanf()
| supports scientific notation while strtod() and strtof() do
| not.
|
| Or at least, they didn't originally support it.
|
| Has this been fixed in the 20+ years since I noticed it and
| started using sscanf() everywhere instead?
| moonchild wrote:
| Sscanf _shouldn't_ need to get the size. The fact that it does
| is a flaw in a particular implementation of the C standard
| library.
| metafunctor wrote:
| Some of the blame should be placed on the "worse is better"
| dogma. Often it really just means "my library does whatever was
| easy to implement". It has its merits, but it's a very leaky
| abstraction. It's part of why writing good C code is hard: much
| of the hard stuff is left to the caller, because the library
| implementation would be more difficult otherwise.
| person_of_color wrote:
| Can we add a check in a static analyser for n^2 code?
| tpetry wrote:
| I don't get the heat of this topic. Yes, they wrote some very
| slow code, because it's easy to shoot yourself in the foot with
| scanf. It's nothing new that most software could be heavily
| optimized just by benchmarking the slow parts. There is no
| reason for this shitstorm other than to feel better than other
| developers.
|
| The real problem is that they shipped a game with a loading
| screen that takes minutes and never looked at whether they
| could optimize it. THAT is the real shit show!
| c-c-c-c-c wrote:
| Thing is, they didn't ship it that way. Back when it came
| out the loading screens were "fast". Things just grew out of
| proportion with the exponential increase of new items in the
| online mode.
| zionic wrote:
| No they weren't; GTA V had terrible loading times at launch.
| dom96 wrote:
| The loading times for GTAV have always been horrible.
| tpetry wrote:
| And that's OK. But it seems no one in management approves
| benchmarking it now that loading takes several minutes (!).
|
| I am not blaming the developers; they have a lot to do every
| day. Maybe someone even wanted to try to fix it. But the fact
| that it's still like this clearly shows that management doesn't
| care, and that they are completely OK with a loading screen
| taking longer than brewing a new cup of coffee.
| matheusmoreira wrote:
| Exactly. The problem isn't the bug itself or the developers who
| introduced it. The problem is they simply didn't care enough
| about their billion dollar game to fix the problem _over seven
| years after release_ until someone got mad enough to reverse
| engineer the game, figure out why it was so slow and fix it on
| their behalf.
|
| People will always make mistakes but it's how they deal with
| them that matters. Gotta have enough pride in one's work to be
| bothered when it performs badly. Having a user fix their
| billion dollar project for them just because they couldn't be
| bothered is just embarrassing.
| milesvp wrote:
| Forget pride in your work, this is literally costing rockstar
| money. It may be hard to estimate, but I'm sure it was in the
| hundreds of dollars a day category at some point. Shaving
| milliseconds off of loading time has real effects in the web
| world and so at a previous job we made sure to evangelize
| that fact to management and did what we could to tie it to
| revenue as well as other KPIs in order to get this kind of
| thing prioritized. I'm just surprised there isn't a PM at
| rockstar pushing hard to reduce load times.
| TeMPOraL wrote:
| Yeah, but I think it puts the usual mantra of "write now,
| profile later, and optimize the bottlenecks" to the test. As
| reality repeatedly shows, developers often don't progress past
| the "write now" part, even when performance issues are
| crippling and annoying users to no end.
|
| We can and should do better.
|
| What this topic also is, is a reminder that with
| null-terminated strings and enough abstraction layers, you can
| easily make an O(n^2) or O(n^3) out of something you _thought_
| was O(n). I technically knew this (there was a popular article
| about it many years ago, I think by Joel Spolsky, but I can't
| find it now), but I didn't bother to look out for it in my
| current project. Thanks to this debacle, I'll do a performance
| review looking for "accidental quadratics", and I'm guessing
| many other developers will do it too. So it's a win!
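|
| This trap is often nicknamed "Shlemiel the painter's
| algorithm". A minimal sketch (illustrative names) of how a
| strlen in a loop condition turns one pass into a quadratic one:
|
|     #include <string.h>
|
|     size_t count_commas_quadratic(const char *s) {
|         size_t count = 0;
|         for (size_t i = 0; i < strlen(s); i++)  /* strlen every pass */
|             if (s[i] == ',') count++;
|         return count;                           /* O(n^2) overall */
|     }
|
|     size_t count_commas_linear(const char *s) {
|         size_t count = 0;
|         for (; *s; s++)                         /* one pass, no strlen */
|             if (*s == ',') count++;
|         return count;                           /* O(n) overall */
|     }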
| [deleted]
| quelsolaar wrote:
| I think it's fairly notable that functionality that has been
| around for so long, and has been implemented so many times, is
| as poorly implemented as this.
|
| Usually you can count on the C std lib to be very optimized.
| Many std functions like memcpy are even intrinsics in
| compilers, and that means they are literally faster than it's
| possible to write in C, since someone has gone in and
| hand-optimized the assembly.
| bovermyer wrote:
| The actual moral of the story here is twofold.
|
| 1. Don't assume you would never make _that_ mistake, and
|
| 2. Be understanding and kind to those who do.
|
| No one died as a result of this error. There is zero reason to be
| hostile to the developers.
| le-mark wrote:
| I propose a third moral: don't use scanf when a DFA would
| suffice!
| tshaddox wrote:
| I didn't follow the original story or comments about GTA, but
| based on the description in this article, I wouldn't be
| surprised if this sort of problem happened to a coder of any
| experience level, and I wouldn't give them any grief. But I
| would be surprised that the problem stayed live in production
| for a very long time without ever having been profiled. Surely
| seeing JSON parsing take more than 70% of the time would have
| made it onto someone's radar?
| TeMPOraL wrote:
| You'd be surprised. In the companies I've worked for so far,
| it's usually _my_ radar that bleeped over ridiculously
| inefficient code that had been present for years in the
| codebase. That is, some developers would be aware something was
| off with performance, but they didn't bother to do anything
| about it. Hell, sometimes even management would know users
| complain about performance, but the feedback didn't percolate
| down to developers.
|
| Sure, I get priorities and "good enough". But really, about
| half of the order-of-magnitude wins in performance I've
| achieved were on things you could find and fix in a few hours
| if you bothered to look. The other half tends to be unfortunate
| architectural decisions that may take a few days to fix, so I
| get why those don't get done.
| JdeBP wrote:
| The problem was reported in the comp.lang.c newsgroup on Usenet
| in 2002. This very discussion mentions the GNU C library bug
| report that has been around since 2014. It was reported in
| RapidYAML in 2020. This problem has been noticed in production,
| several times over, and yet still lives to this day.
|
| * https://news.ycombinator.com/item?id=26302915
| Guillaume86 wrote:
| The premise of the post is wrong: from what I read here and on
| reddit, people were rightfully complaining about the lack of
| reaction from Rockstar, not the initial bug, which could happen
| to most people.
| usefulcat wrote:
| I would bet that a lot of games receive a lot less development
| effort after release. Most of the devs probably got moved on to
| something else (sequel, or maybe a completely different title).
| lvturner wrote:
| Or get 'made redundant', burn out, or move into a different
| industry vowing never to do a release crunch ever again...
| kevingadd wrote:
| GTA Online is a live game that pulls in over half a billion
| USD per year in revenue. They release new content on a
| regular basis. It's actively developed, they just don't care
| (my guess would be that defects were filed against this on a
| regular basis and fixing it was never prioritized)
| 20thCB wrote:
| C string processing is the other "billion dollar mistake" we are
| all paying for decades later (in terms of slow programs, time
| wasted, power used etc).
| tux3 wrote:
| The moral of the story, as far as I'm concerned: do NOT parse
| strings in C! Use a library, preferably in a higher-level
| language.
|
| C string handling is a mess of viciously surprising APIs;
| juggling those particular footguns is almost certainly not
| your least bad option.
| wruza wrote:
| Agreed, that would never happen if a developer simply used
| parseInt(text.slice(offset, offset+40)).
| JdeBP wrote:
| Like RapidYAML, written in C++, which had the bug until 2020?
| (-:
|
| * https://news.ycombinator.com/item?id=26302744
|
| "Use a library" is observably not a panacea in practice.
| harikb wrote:
| I would argue the reverse - there is a higher chance of these
| accidentally-quadratic problems with more abstraction layers,
| convenience functions, and advanced language syntax. But I
| agree we shouldn't write parsing in C, though for other
| reasons :)
| perl4ever wrote:
| I say there's an equal chance, and it's equally easy to fix,
| but a high level language provides fewer distractions and
| temptations to work on small optimizations which don't
| matter. Getting in the weeds with the details is how you lose
| sight of the big picture. And allocate your time poorly.
| rurban wrote:
| Wrong conclusion. The problem is the broken C string library in
| POSIX. Don't use it! Wrong design. Zero-termination is too
| fragile, and it is the cause of the evil.
|
| Rather, use buffers with known lengths, and for strings you
| need to know the string (=unicode) rules. Nobody does that.
| Know libunistring. I have my own, because libunistring is too
| slow, but know it.
|
| For my string libraries I rather follow the STL, with
| ranges/views and a boehmgc core, and const vs dynamic strings.
| So I will not step into the accidental strlen and
| buffer-overflow trap.
|
| E.g. for input buffers, know if they are zero-terminated and
| const. With the GTA post I pointed out the libfuzzer design
| flaw: it gives you an ASCII input buffer which is not
| zero-terminated. Even strtol/strtod cannot be used then. You
| need to copy the buffer, terminate it, and then you can use the
| broken string libc. Not talking about sscanf, which I usually
| use only as sscanf_s if available, or _snscanf/_snscanf_s.
| Microsoft does many things wrong, but its libc is far superior
| to glibc, BSD or musl. musl is better than glibc, but also
| lacks in this regard.
| rudedogg wrote:
| Funny, the ergonomics of the Swift string API are so bad that
| I've started learning a lower level language for parsing, etc.
|
| Here's my favorite WTF:
| https://christiantietze.de/posts/2020/01/string-index-offset...
| eternalban wrote:
| You can fault the docs, but what's the problem with the API?
| Why should you be surprised that accessing a collection with a
| 'nil' index is a runtime error? What else could it be?
|
| A simple fix in the doc seems to solve the confusion:
|
| "Returns an index that is the specified distance from the
| given index, unless that distance is beyond a given limiting
| index [in which case it returns nil]".
|
| It does say "returns an index ... unless ..". So yeah, a bit
| too terse. But what is the issue with the API?
|
|     // /!\ Warning! Do not use in production! /!\
|     let s = "Swift"
|     if let i = s.index(s.startIndex, offsetBy: 5,
|                        limitedBy: s.endIndex) {
|         print(s[i])
|     }
|
| "What does it print? Nothing, you hope?"
|
| Seriously? 'print(s[nil])' should print nothing? How about
| 'let c = s[nil]'. Should that just silently pass? _That_ is
| the runtime error, btw. It won 't even get to print(). (Valid
| to question the entirely non-informative error, however.)
| adamzegelin wrote:
| The problem is that `s.index` with an `offset` equal to
| `limitedBy` returns a non-nil index, rather than nil, but
| that index is invalid (out of bounds) and causes the
| program to blow up:
|
|     let s = "Swift"
|     print(s.index(s.startIndex, offsetBy: 4, limitedBy: s.endIndex))
|     print(s.index(s.startIndex, offsetBy: 5, limitedBy: s.endIndex))
|     print(s.index(s.startIndex, offsetBy: 6, limitedBy: s.endIndex))
|
| outputs:
|
|     Optional(Swift.String.Index(_rawBits: 262401))
|     Optional(Swift.String.Index(_rawBits: 327681)) <- unexpected
|     nil
| saagarjha wrote:
| The index is perfectly valid; it's just that not every
| index may be used for subscripting. This is exactly how
| C++ does it too.
| bzbarsky wrote:
| If s.index() returned nil, the "if" would test false and
| the s[i] would not be reached.
|
| The problem is that it returns non-nil, but the _limit_ is
| broken in this case: using s.endIndex as a limit means you
| can get non-nil but bogus indices returned. And yes, this
| is the fault of the docs for using a broken last arg to the
| API, but there's no really clean way to use this API as
| designed, afaict. At least not if you want to limit to "end
| of string" as opposed to "some index into the string that I
| already know to be valid".
| saagarjha wrote:
| The index is not bogus, the API is working as designed.
| The example provided shows the use of the index, which I
| understand can be confusing because the index returned
| may not always be valid for this, but the index is
| decidedly valid. FWIW, since String is a
| BidirectionalCollection, this code works for what you are
| probably trying to do:
|
|     let s = "Swift"
|     if !s.isEmpty, let index = s.index(s.startIndex, offsetBy: 5,
|                        limitedBy: s.index(before: s.endIndex)) {
|         print(s[index])
|     }
|
| I am sure the equivalent code in other languages, barring
| Python, is going to be similarly verbose.
| dfox wrote:
| For me the moral of the story is: do not use (whatever)scanf()
| for anything other than toy programs. In most cases,
| implementing your own tokenizer (which, for both of these cases
| of reading numbers, means str(c)spn() to get the length of the
| candidate token and then strto-something()) is significantly
| easier than reasoning about what scanf() really does (even
| ignoring accidentally quadratic implementation details) and
| whether it can be adapted to your use case.
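|
| A rough sketch of that tokenizer approach (the function name is
| made up and the accepted character set is deliberately loose):
|
|     #include <stdlib.h>
|     #include <string.h>
|
|     /* Parse a float at *p, advancing *p past it; 1 on success. */
|     int next_float(const char **p, float *out) {
|         const char *s = *p + strspn(*p, " \t\r\n"); /* skip spaces */
|         if (strspn(s, "0123456789+-.eE") == 0)      /* no candidate */
|             return 0;
|         char *end;
|         *out = strtof(s, &end);   /* strtof stops at the token end */
|         if (end == s) return 0;
|         *p = end;
|         return 1;
|     }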
| domnomnom wrote:
| Can you ELI(CS undergraduate)? Tokenizing is normally for when
| you're writing a compiler or a DSL, right?
| orangeshark wrote:
| It is a general term for the process of breaking a string
| into "tokens" which have a sort of meaning. Definitely a
| common task in compilers, but not limited to it.
| TeMPOraL wrote:
| Yes. But any data format you read, particularly any
| plaintext data format you read, is essentially interpreting
| or compiling a DSL. On a typical job, people are writing
| compilers much more often than they think!
| ludston wrote:
| I think a better moral is "don't roll your own parser unless
| your core competency/product is the parser". Especially in a
| corporate situation.
| epr wrote:
| Don't roll your own parser? How the hell would you get
| anything done? Unless you don't count regular expressions or
| something, I can't imagine somehow avoiding problems
| requiring parsers, especially on any unix-based system.
| loeg wrote:
| There are a lot of tasks that only need to work with
| existing, commonplace file formats with existing high-
| quality parsers (e.g., JSON, XML, sqlite, ...).
| TeMPOraL wrote:
| SQLite I can grant you, but I'm starting to get a bit
| worried about a certain XML parser popular in C/C++ land.
| Might take a swing at it with a profiler later today.
| imtringued wrote:
| > How the hell would you get anything done?
|
| This is a sign that writing (trivial) parsers is a core
| competency for you. However, that doesn't mean that writing
| all parsers is your core competency. Especially not for an
| industry standard like JSON, which should have plenty of
| libraries that take care of the problem.
| ActorNightly wrote:
| Disagree.
|
| Writing parsers is at the same level of complexity as
| FAANG-level interview questions. Any decent developer should
| be able to write a parser from scratch.
| TheCoelacanth wrote:
| Any decent developer can write a parser. Most can't write (or
| at least don't have enough time to, before they need to get
| back to whatever their main job is) a fast, secure and robust
| one for a complex grammar.
| tedunangst wrote:
| There's also an entire book on using lex and yacc.
| akkartik wrote:
| I've taken it a step further (or three). I'm building a whole
| computer on the principle[1] (one possible way of looking at
| it) of avoiding parsing as far as possible.
|
| https://github.com/akkartik/mu
|
| Mu takes the dwm[2] principle of
| avoid-config-files/modify-sources-to-reconfigure to the limit.
| The idea is that there are, at any given time, only 3 languages
| in the computer:
|
| 1. A self-hosted notation for a subset of 32-bit x86 machine
| code
|
| 2. A memory-safe statement-oriented language where most
| statements map 1:1 to machine-code instructions. Written in
| level 1 above.
|
| 3. A high-level interpreted language.
|
| The vision is to fix 1 and 2 but allow 3 to fork wantonly. If
| you want a Lisp for your HLL, make a fork. If you want a
| Python-like syntax, make a fork. But, and this is the important
| part, in any given computer/repo there is only ever one
| language. Only one non-trivial parser. (Levels 1 and 2 have
| extremely uniform syntax.)
|
| As best I can tell, the #1 way to avoid the need to run a
| fuzzer is to avoid writing parsers. Just say no.
|
| [1]
| https://lobste.rs/s/to8wpr/configuration_files_are_canary_wa...
|
| [2] https://dwm.suckless.org
| kaba0 wrote:
| Hi, levels 1 and 2 look really cool, but maybe I don't
| understand the point of only having these languages "running"
| on a computer? Both 2 and 3 are interpreted, and the
| interpreter is the area you want to minimize?
|
| What about a program written in 3 that can compile to either
| 1 or 2? Why would it hurt anything to have a different
| language somehow made possible to run here?
| akkartik wrote:
| I'm not sure I follow your question, but the idea is that
| the computer the end user receives has a single opinionated
| language for programming it. Kinda like Basic for
| microcomputers. The end user is of course welcome to build
| their own languages. That is encouraged! But multiple
| languages make the computer more difficult for others to
| comprehend. My goal is to convince you that, all things
| being equal, a computer with fewer languages is in your
| best interest. Less code, fewer bugs, fewer security holes.
|
| (Level 2 is not interpreted, it's compiled. Skipping level
| 2 would be totally in line with my principle above. But it
| would be more painful, I think. You'd basically be
| programming in machine code.)
| kaba0 wrote:
| My question is regarding why a single-language codebase would
| be easier to comprehend and have fewer bugs and security
| holes? In terms of a single program it makes sense, but I
| seldom read the source code of, say, a library I depend on -
| if it has a good public API, it could be written in anything
| for all I care.
|
| Not trying to dismiss the idea at all; it's just that I don't
| yet see "the light", so to say.
| akkartik wrote:
| Yeah, these are good questions and to be fair your
| questions are shared by many.
|
| My basic worldview is that we need to have 100-1000x more
| people reading and auditing open source. The original
| value of open source was in the ability of people to read
| the source. If we don't use that ability then we don't
| really get the benefit.
|
| The world today focuses on APIs and ignores
| implementation. I would argue that's the biggest source
| of problems today, with security holes, data breaches,
| user-hostile UX and so on.
|
| If you accept that being able to read sources matters,
| hopefully it makes sense why reducing the number of
| languages matters. Every new language you add is more
| code to read, another language to learn and become
| proficient in, a new source of gotchas and subtleties to
| spend 10 years learning. Another set of tools, another
| set of moving parts that might not build on your computer
| because of some subtle version mismatch.
|
| It's a hard problem. So let's make it easier for
| ourselves by relying on fewer languages, and being more
| thoughtful about the dependencies we introduce into our
| projects.
| kaba0 wrote:
| Thanks for the answer! I totally agree on the
| not-enough-people-read-source-code part -- unfortunately I
| believe it is not only a language "barrier" thing. I mean,
| even in a language I know by heart, I probably could not make
| sense of some complex part of the Linux kernel, because I lack
| both the underlying technical knowledge of some hardware
| interface and the context of the code. And the latter
| especially cannot be overcome with good comments alone. It
| needs good documentation, which should give a basic
| understanding, on top of which we can build the code for the
| fine detail. Of course it is a noble goal to try to decrease
| the surface area of the required knowledge, so props to you!
|
| What's your proposed solution to the problem of low- and
| high-level languages? Is the level 3 language a higher-level
| one? Because I'm not sure there could exist one language to
| rule them all, given the inherent difference between the two
| domains.
| akkartik wrote:
| Yeah, level 3 will be a HLL. It just doesn't matter too
| much which one it is, or that it "rules them all". A
| single reasonably high-level language X is in practice
| superior to a basket of high-level languages, even if
| some of the languages in the basket are individually
| higher-level than X.
|
| You're absolutely right that languages are only part of
| the problem. Beyond the language choice, Mu provides
| guardrails to help you pick up the underlying technical
| knowledge by just trying things and seeing informative
| error messages (often failing tests) in response. That's
| the hope, anyway.
|
| Right now the first HLL is still in progress. I spent
| some time with a postfix-based language before changing
| my mind. Now I'm working on a Lisp-based HLL. So I'm not
| dogmatic about what the HLL should be, and in time there
| will probably be multiple options in separate
| forks/repos.
| beached_whale wrote:
| My biggest issue with C strings is that by requiring a zero
| terminator and a single char const *, they force a lot of
| copies (when truncating) and calls to strlen (the caller
| doesn't know the length).
|
| Had it been a char const */size_t pair, or a pair of char
| const * for first/last, it would be both safer and faster. I
| prefer the latter, as a pair of pointers doesn't require
| updating both when iterating with the first pointer.
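|
| A sketch of the pointer-pair flavor (the typedef and names are
| made up): a non-owning view where the length is always known,
| so there are no strlen calls, and truncating just moves a
| pointer instead of copying:
|
|     #include <stddef.h>
|
|     typedef struct {
|         const char *first;  /* start of the range */
|         const char *last;   /* one past the end   */
|     } strview;
|
|     static size_t sv_len(strview s) {
|         return (size_t)(s.last - s.first);
|     }
|
|     /* Truncate to at most n chars: O(1), no copy. */
|     static strview sv_prefix(strview s, size_t n) {
|         if (sv_len(s) > n) s.last = s.first + n;
|         return s;
|     }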
| perl4ever wrote:
| I used to work for people that processed emails and loaded them
| into databases with perl scripts. One day someone asked me if I
| could help, because the script they were running on a batch of
| emails was inexplicably freezing or running out of memory, I
| forget the exact details.
|
| There were maybe a few thousand or tens of thousands of emails,
| and so, I came to look at the issue with my usual attitude
| which is that if it isn't running instantly on a GHz computer
| of any kind, something must be _horrifically_ wrong. Not to
| mention running out of memory.
|
| It turned out that essentially every email was cc'ed to several
| thousand people; you know, the thread went to a whole company
| or something. The file size itself wasn't huge in the scheme of
| things, but our perl script (written by someone much more
| elevated than me, with a master's degree) read the whole file
| at once into memory, and expanded the headers to perl hashes,
| multiplying the size.
|
| But there was no reason the whole thing had to be in memory.
| Only that people learn to program in CS 101 these days, I
| guess, as if memory is infinite, because gigabytes might as
| well be infinity, but if you multiply a certain moderate
| overhead times tens of thousands by tens of thousands, suddenly
| all your gigabytes are gone.
|
| Another thing I remember, was when I was given a T-SQL report
| that typically ran on a few thousand documents, and was asked
| to run it on all of the millions on all our databases. It was
| hundreds or thousands of times too slow, but it was running a
| SQL statement in a procedural loop _per document_ and it could
| be turned into a single statement.
|
| So my experience has basically taught me that if something is
| within a factor of two of optimal, it's good enough, but an
| _incredible_ amount of code, regardless of high or low level,
| is _way_ worse than that.
|
| But I've never gotten much pay or glory for fixing this sort of
| thing. Sometimes I daydream about being a high paid consultant
| who goes and turns three nested loops into two, or two into
| one.
| geon wrote:
| The other week I optimized some processing in a web app from
| 3 minutes to 11 seconds.
|
| The customer was ecstatic, but I still think it is 2 orders
| of magnitude too slow.
| quelsolaar wrote:
| Wouldn't this be an argument to go in the opposite direction?
| If you are using high-level functionality whose implementation
| details you don't know, you are running the risk of unintended
| consequences.
|
| I am a C programmer who has implemented string-to-number
| parsing for this very reason. I know exactly what it does and
| how fast it is.
|
| If you do use code you didn't write, the chance of a standard
| library being poorly implemented is probably lower than for
| most other libraries, so picking a non-standard lib as a
| guarantee against bad performance seems misguided.
| kaba0 wrote:
| I think it goes both ways, in that you either go full
| low-level and write everything yourself (for questionable
| benefits), or you use a (possibly higher-level) language with
| a sane standard library; but the important thing is the
| quality of said library.
| quelsolaar wrote:
| I find that writing everything yourself, especially simple
| things like a text-to-integer parser, is very valuable
| because it takes very little time and it levels up your
| understanding of the system. I'm starting to believe that
| you rarely understand something until you have implemented
| it. Therefore, implementation is the best way to learn.
| kaba0 wrote:
| I'm totally with you here; it is a really invaluable
| learning tool. But, as with "standing on the shoulders of
| giants" in science, we would not get anywhere if everyone
| started everything from scratch. Like, it's okayish to
| reimplement a vector or something, but even a sort gets
| harder (especially if you want one that is performant both
| on a few-element list and on longer ones). And algorithms
| are just one thing; will you also dig into the totally
| foreign world of e.g. audio processing?
|
| Basically the only silver bullet for a productivity
| increase is shared code. We just have to place higher
| emphasis on software correctness/quality instead of the
| functionality churn in certain parts of the software
| world.
| fireattack wrote:
| Speaking of which, is there any official follow-up from R* yet?
| noobermin wrote:
| I think the only way I've avoided any of this is that my
| academic work primarily deals with "write the performant
| simulation codes in C++ or Fortran, write the analysis in
| Python" (although others use matlab et al.), so everything
| parsing-related has been through Python, not C++, which ofc
| just has the "float()" type constructor. Generally, like most
| academics, I own my whole shop, so fortunately no one uses my
| code other than one or two colleagues occasionally, and when
| bottlenecks arise I know where and what they are.
| iamleppert wrote:
| Wouldn't it be better to read the entire file into memory
| first, and then do the parsing in the next step? It would take
| more memory, of course, but might help if you're trying to
| maximize performance.
| CGamesPlay wrote:
| Wouldn't matter at all, and the function in question is already
| operating on a string buffer. You might be able to get a minor
| boost by reading the file in parallel with parsing it, using
| async IO or threads, but it doesn't seem like the disk is
| actually the bottleneck here.
| xirbeosbwo1234 wrote:
| I don't think anyone sensible would claim they would never have
| made this mistake. It's not even surprising that it made it to
| production. What _is_ surprising is that it led to a massive and
| easily-diagnosable slowdown that no one bothered to fix for
| several years.
| Scarbutt wrote:
| This. The article misses the point entirely. Either the author
| already knew that and used it as an opportunity to show off, or
| has little common sense.
| sjburt wrote:
| I'd bet that some combination of "Oh, of course it's slow, it's
| json" and "I'm too important to work on microtransactions"
| accounts for a lot it.
| bentcorner wrote:
| Or it may have worked fine in a dev environment and it only
| became an issue once the game was in production for a certain
| amount of time.
|
| By that point R* developers had moved on to other projects
| and the artists may have used a different workflow to
| validate things and never bothered booting up the "real" game
| to test things.
| flukus wrote:
| I don't find it that surprising because there's a huge lack of
| awareness in the industry when it comes to profiling tools.
| Quite often I've seen people trying to speed up programs
| without once profiling to see where the problem is. More than
| once I've seen the supposed solution (usually caching or
| distributing in my line of work) actually slow things down. At
| times it can be almost impossible to get permission to "waste"
| time on performance profiling because it's not fixing bugs or
| adding features.
|
| I kind of expected more from game developers, but I doubt the
| guys shaving micro second off tight game loops are the same
| ones writing the asset loading code.
|
| Devs should follow carpentry rules for this: measure twice, cut
| once.
| simias wrote:
| I think the really embarrassing part for Rockstar is that they
| didn't bother to investigate what took 5+ minutes to load in
| their star product, a simple profiling would've made the issue
| obvious. So either they knew and they didn't care, or they didn't
| know and they didn't care.
|
| That being said, both for GTA and for TFA the issue is a very
| similar sscanf call:
|
|     sscanf(data, "%f", &f);
|
| I already posted a similar comment in the GTA story but I really
| want to emphasize it: scanf is almost never the right tool for
| the job, and it's definitely not the right tool in this
| situation. Just use strtof. That's literally what it's for.
| String to float. There. Done.
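|
| For instance, a short sketch (mine, not from the article) of
| parsing consecutive floats with strtof and explicit error
| handling:
|
|     #include <errno.h>
|     #include <stdlib.h>
|
|     /* Parses up to max floats from s into out; returns the count. */
|     size_t read_floats(const char *s, float *out, size_t max) {
|         size_t n = 0;
|         while (n < max) {
|             char *end;
|             errno = 0;
|             float v = strtof(s, &end);
|             if (end == s || errno == ERANGE)  /* no number / range */
|                 break;
|             out[n++] = v;
|             s = end;                          /* resume after token */
|         }
|         return n;
|     }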
|
| Scanf is crappy, and if it were up to me it would've been
| deprecated a while ago. I can sort of see using it for a quick
| one-off "script", for instance to parse user input, but seeing
| it in the middle of a program will always raise a huge red flag
| for me.
|
| Use strtok_r if you need to split a string, then parse every
| entry individually. It's more robust, more flexible (you can
| parse custom types and formats that way) and allows for much
| better error handling and diagnostics. And of course it's also
| usually vastly faster.
|
| Scanf is an antipattern in my opinion. I literally never use it
| and I'm better off for it. The last time I interviewed for a C
| coder position I managed to answer the full C test quiz except
| for the one question regarding scanf. That's how much I don't
| use it.
|
| I think it's even worse for developers who come from higher level
| languages and (reasonably) expect to be able to deserialize data
| easily. You simply can't do that in C, the type system and
| general philosophy of the language won't let you, but scanf may
| convey the illusion that it's sort of possible. Don't believe its
| lies.
| lostcolony wrote:
| I think 'embarrassing' is too strong a word. AAA game
| development is rushed; the pressure is to ship. Something has
| to give. This is a user facing issue, but one that doesn't
| actually affect the gameplay. Assuming they had -time- to
| profile the load process, given that low a priority, seems
| extremely optimistic.
| undefined1 wrote:
| embarrassing is too light a word.
|
| Rockstar has virtually endless resources and the game has
| been out for many years. For years, they didn't reduce the
| extremely long load times? Not only embarrassing, but it shows
| deep incompetence and a lack of respect for the craft and for
| end users.
| simias wrote:
| I wouldn't have said anything if the game had been released one
| month ago, but GTA V is almost 8 years old now and it's been
| ported to several generations of hardware (IIRC they've even
| announced "next gen" ports to release this year). The online
| function is still maintained and makes them a lot of money. I
| also do think that it affects the gameplay because these
| loading times are genuinely terrible. A 30second loading
| screen is a nuisance, a 5+ minute loading screen just makes
| me want not to play the game.
|
| I think that Rockstar deserves some blame here, especially
| since this problem might well be a consequence of their
| notoriously bad development practices.
| wpietri wrote:
| Well, it doesn't affect the gameplay if the player starts the
| game once and never closes it. But for anybody who wants to
| hop on for a quick bit of fun, it's a notable barrier. There
| are definitely games I've stopped playing because it takes
| too much time to launch the thing.
| Hamuko wrote:
| > _AAA game development is rushed; the pressure is to ship._
|
| I'd be more understanding if GTA Online hadn't already
| shipped its first version in October of 2013. Surely there
| would've been _some_ time after shipping the first version to
| profile the game.
| lostcolony wrote:
| Sure. I'd be embarrassed if they didn't have the issue on
| their backlog ("Load times are high"). But the priority
| seems low, and the actual effort and viability of a fix
| seems unknown. Speaking as an engineering manager, that is
| very much going to be a "if you have spare time" ticket.
| Now, I also try to ensure people have spare time to
| investigate stuff like that, but that's me, and I don't
| work in game dev. I can easily see another manager,
| especially one in game dev (where what keeps players coming
| back is new content and features, not reduced load times)
| prioritizing other tickets ahead.
| ajmurmann wrote:
| (disclaimer: I'm not in game development and only read
| about this)
|
| Usually different staff rolls on and off at different times
| of product development and post-release lifecycle. I
| understand that most programmers would have been rolled off
| a while before launch. You early on have people build or
| adjust the engine and tooling, but later on you don't need
| most of them anymore and things come down to creating
| content.
| vegesm wrote:
| That's true for all software development. In seven years
| most of your team is replaced.
| ajmurmann wrote:
| Other areas of software development are perpetual. You don't
| hit some milestone at which 90% of developers are moved to a
| different project or laid off and folks with a different skill
| set are added.
|
| Usually in software development you have different people over
| time because of individual churn, not because you are changing
| the role mix.
| dijit wrote:
| I work in gamedev and I'm on your side in this.
|
| But I should note that once you ship a product in this
| space there is a heavy emphasis on not breaking much.
| Changes are for the next milestone (seasons, service packs,
| new features). There's very rarely any emphasis on "fixing"
| something because it could introduce even more bugs and
| Producers prefer sitting on a stack of known issues than
| addressing them with more unknown ones. Since known issues
| have a known cost.
|
| Until it gets so bad that you have to make health patches. We
| made such patches (and referred to them internally as "Sanity"
| patches).
| acomjean wrote:
| I tend to agree. When you are rushing, things get missed.
| Also, if it was a problem from the beginning, you just might
| not think it's an issue (it's just how long it takes).
|
| One philosophy I heard in my days of programming (not sure
| how I remembered this, but it's still out there):
|
| Make it work, make it right, make it fast. -- Kent Beck
| IanNorris wrote:
| Not excusing this but there are likely a few mitigating factors
| here.
|
| * Tight deadlines result in shipping code that's barely
|   tested and may have resulted in minimal code reviews on it.
|
| * The original article mentioned how the ids were always
|   unique. It may have been intended to load content from
|   multiple sources or to allow patching of content on disk
|   (or repurposed entirely from a different game). Or it could
|   well be an oversight/over-engineering.
|
| * It may even be a general-purpose json parser from another
|   project that had never been tested with data of this size
|   until after launch.
|
| * It probably wasn't always this bad. Likely when the game
|   launched the loading times were much more reasonable, as
|   the amount of in-app purchases was an order of magnitude
|   smaller.
|
| Typically most of the IAPs will be added much later, so much of
| the profiling work would have been done with this code having a
| much smaller json block.
|
| When the game was shipped, the dev team will likely have been
| shrunk significantly as the bulk of the team moved to a new
| project, leaving a smaller team focused more on the content
| itself; and the engine team, who would likely deal with and
| spot stuff like this, will probably have had their attention
| elsewhere.
|
| (I don't work for R*, though I have shipped many high-budget
| titles, including live services.)
| tgtweak wrote:
| I was thinking about it the other day when reading the original
| article, and this was the only plausible (and defensible) cause
| for it not being addressed:
|
| When GTA Online was released 7 years ago in 2013, the list of
| DLC items was probably much shorter, and it grew over time. The
| performance issue gets quadratically worse with list length.
| The list growth was probably bell-curve shaped over the
| lifetime of the game.
|
| This has an interesting dynamic when it comes to perceived
| performance:
|
| In the beginning, on consoles and PCs, it was already a pretty
| long load time, but would have been 90s or so on an average
| gaming PC (I remember this from the early days playing it, on a
| modest gaming PC with an FX-8150 cpu). This is long, but
| tolerable for a game of this size. I'm certain that early
| complaints that it was sluggish to load were profiled and
| looked at, and at the time it wasn't a 4 minute ordeal to load
| the json and probably represented a fraction of the CPU time it
| takes today - not standing out as obviously as in OPs guerilla
| profiling. Devs put a pin in it and say "this is netcode
| related, it is what it is"
|
| Over time, the list gets longer, the loading time takes more
| cycles, BUT, PCs are getting progressively faster year over
| year as well, with many of those improvements happening at the
| instruction-level - optimizing for things like, surprise,
| string scanning. Two console generations have been released
| since, masking the problem on that side. For comparison's
| sake, I just
| checked and I can load GTA online in about 75s on my Ryzen
| 3900x. This cpu is probably 4-6x faster in single core
| performance than the 8150 for most workloads. Again, it's slow
| but tolerable and by this time it's "yeah GTA online is just a
| big game and takes a while to load, it's always been that way".
| Complacency is the enemy of improvement, and things that
| regress slowly over time are hard for us to notice in general.
|
| Don't take this as a "this is fine" comment; it's just the only
| reasonable justification I can think of as to why this might
| have flown under the radar all these years.
| SomeCallMeTim wrote:
| Agreed. My first instinct was the same: *scanf is never the
| right tool for pretty much any job.
|
| I learned this 20+ years ago. As far as I'm concerned it should
| have been considered deprecated along with gets; it was
| considered dangerous in the early 90s and probably before. Not
| sure why people are still using it in the 2000s+.
| randyrand wrote:
| why?
| froh wrote:
| scanf and printf have complementary format specifiers, which
| can make maintaining serialization and parsing of regular data
| a breeze...
|
| the proper remedy is to simply wrap the string to parse with
| fmemopen(3), which makes the temporary FILE object explicit and
| persistent for the whole parse, and needs just one strlen call.
|
| https://news.ycombinator.com/item?id=26343149
| abetlen wrote:
| Cool trick, thanks for sharing. I don't get why there isn't a
| suitable snscanf function that takes the buffer length as an
| argument and returns the number of bytes parsed?
| froh wrote:
| fmemopen takes the buffer length, and there is no need to
| have the buffer \0 terminated, so instead of strlen you can
| also just give the buffer size.
|
| The number of bytes parsed can be fetched with the scanf %n
| format specifier.
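|
| A tiny sketch of that %n trick (the input string is just an
| example):
|
|     #include <stdio.h>
|     #include <string.h>
|
|     int main(void) {
|         const char *buf = "1.5 2.5";
|         FILE *f = fmemopen((void *)buf, strlen(buf), "r");
|         float v;
|         int used = 0;
|         /* %n stores how many chars have been consumed so far;
|            it does not count toward the return value. */
|         if (f && fscanf(f, "%f%n", &v, &used) == 1)
|             printf("parsed %f after %d chars\n", v, used);
|         if (f) fclose(f);
|     }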
| earthboundkid wrote:
| The Go standard library is pretty good, but unfortunately, it
| includes a scanf clone, so every once in a while you see a poor
| new developer posting to help forums trying to get it to work
| properly and you have to break it to them that they're using
| the wrong tool for basically any job.
| protontorpedo wrote:
| It's impossible to know all the pitfalls, and the author notes
| that. Metrics (or - ugh - telemetry, if it's client-side), as
| well as automated testing with expectations around performance,
| can go a long way toward preventing these issues. Of course,
| ideally everyone should think about how their code performs
| with large input, but everyone messes up every now and then.
| NightMKoder wrote:
| Could the sscanf bug also be a security issue? Most C strings are
| null terminated, but I could imagine using sscanf to read outside
| of bounds due to the null-seeking behavior on a non-null
| terminated array.
|
| If it is an issue, I think it probably can't be used for more
| than a DoS or a timing attack. That said after meltdown & co
| anything is possible.
| loeg wrote:
| It's already invalid to pass non-null terminated strings to
| functions that consume null-terminated strings.
| ummonk wrote:
| I agree with the "it can happen to you" bit and the fact that
| this is largely the fault of C's poor documentation (e.g. not
| mentioning algorithmic complexity) and poor design choices when
| it comes to strings (you get similar issues with buffer overruns
| when using standard library functions).
|
| That said, the GTA thing was far more noteworthy because
| apparently the GTA developers hadn't bothered to profile loading
| times and get to the bottom of why exactly game load was taking
| such a ridiculously long time.
| ph2082 wrote:
| It would be worth finding out how many OSS code bases use sscanf
| the same way. All of them could benefit from the same speed gain.
| mehrdadn wrote:
| Given everyone's interest in the topic, can I also share
| something I wrote under "accidentally quadratic"? I think people
| might enjoy reading it:
| https://news.ycombinator.com/item?id=26337913
|
| It turns out that multi-pass algorithms _in general_ can be quite
| susceptible to this issue, and it can be far from obvious when it
| happens.
| bigdict wrote:
| You are missing a "3" at the end of the link.
| codetrotter wrote:
| I am writing an iOS app in Swift, and I have an array of structs
| with some 70,000 elements or thereabouts. For some bizarre
| reason, if I define it directly in the source, the compiler uses
| so much memory that I run out of it. So instead, as a workaround
| for now, I am storing the data as a JSON string
| that I parse at runtime. It's very sad, but it's the only option
| I had because I have a ton of other code to write too for this
| app and cannot afford to spend the time to even make a binary
| format for this data.
|
| But I don't understand why the Swift compiler decides to use so
| much RAM when compiling it in the first place. The string
| representation of the data itself is only ~3 MB. But when I tried
| to declare the data as an array of structs in Swift directly,
| compiling it used gigabytes of memory, which caused the system
| to start swapping; then the disk space ran out, because I only
| have about ~20 GB free on the disk, so the system couldn't swap
| any more and was out of RAM as well.
|
| And my struct is very simple, it's just:
|
|     struct Bazinga: Identifiable, Codable {
|         let id: Int32
|         let name: String
|     }
|
| And before I had to turn to JSON it used to be only Identifiable,
| even. So it's one of the simplest possible structs, and the
| 70,000 items of data are only a few MB when written in the
| source. Yet compiling an array of these structs needs more
| memory than I have RAM, and even exceeds the amount of disk
| space available for swapping. It's super weird to me that this
| is even a problem, and it's insane how many GB of memory it
| takes to compile my code.
| bobbylarrybobby wrote:
| Do you explicitly indicate the type of the array? E.g., `let
| array: [Bazinga] = [ ... 70k elements ...]` instead of `let
| array = [ ... 70k elements ...]`?
| codetrotter wrote:
| I will try giving it an explicit type when I'm on the
| computer again. Would be nice if that turns out to make it
| behave nicely.
| bogwog wrote:
| I don't know why you're running into that issue, but...
|
| > It's very sad, but it's the only option I had because I have
| a ton of other code to write too for this app and cannot afford
| to spend the time to even make a binary format for this data.
|
| You should look into Flatbuffers (https://google.github.io/flat
| buffers/flatbuffers_guide_use_s...). It's a tool that can
| generate an API for reading/writing binary data based on a
| schema file where you design the layout (similar to protocol
| buffers). The data is ready to read, so you don't have to do
| any parsing at all, AND the compiler includes a feature to
| convert JSON files into binary that matches your given schema.
|
| It won't solve your compiler woes, but it will help you avoid
| having to store and parse JSON, and it's a tiny dependency.
| Gibbon1 wrote:
| Your story reminds me of watching a modern CAD program run out
| of memory and barf when trying to import a DXF file with a few
| thousand elements.
| saagarjha wrote:
| Looks like it's solving constraints in the typechecker:
|     8140 swift::ASTVisitor<(anonymous namespace)::StmtChecker, void, swift::Stmt*, void, void, void, void>::visit(swift::Stmt*) (in swift-frontend) + 125 [0x110560f9d]
|     8140 (anonymous namespace)::StmtChecker::typeCheckASTNode(swift::ASTNode&) (in swift-frontend) + 1043 [0x11055dcc3]
|     8140 (anonymous namespace)::DeclChecker::visit(swift::Decl*) (in swift-frontend) + 4497 [0x1104e0721]
|     8140 swift::TypeChecker::typeCheckPatternBinding(swift::PatternBindingDecl*, unsigned int, swift::Type) (in swift-frontend) + 250 [0x11049648a]
|     8140 swift::TypeChecker::typeCheckBinding(swift::Pattern*&, swift::Expr*&, swift::DeclContext*, swift::Type, swift::PatternBindingDecl*, unsigned int) (in swift-frontend) + 140 [0x1104962bc]
|     8140 swift::TypeChecker::typeCheckExpression(swift::constraints::SolutionApplicationTarget&, swift::OptionSet<swift::TypeCheckExprFlags, unsigned int>) (in swift-frontend) + 897 [0x110495e71]
|     8140 swift::constraints::ConstraintSystem::solve(swift::constraints::SolutionApplicationTarget&, swift::FreeTypeVariableBinding) (in swift-frontend) + 974 [0x11032cb1e]
|     8140 swift::constraints::ConstraintSystem::solve(llvm::SmallVectorImpl<swift::constraints::Solution>&, swift::FreeTypeVariableBinding) (in swift-frontend) + 52 [0x11032d8b4]
|     8140 swift::constraints::ConstraintSystem::solveImpl(llvm::SmallVectorImpl<swift::constraints::Solution>&) (in swift-frontend) + 372 [0x11032aa14]
|     8135 swift::constraints::ComponentStep::take(bool) (in swift-frontend) + 2911 [0x1103393af]
|       + 4015 swift::constraints::ConstraintSystem::finalize() (in swift-frontend) + 5258,5080,... [0x110325a7a,0x1103259c8,...]
|       + 1819 swift::constraints::ConstraintSystem::finalize() (in swift-frontend) + 5291 [0x110325a9b]
| ineedasername wrote:
| _Still, folks in the comments section generally agreed: they
| wouldn't write anything that silly._
|
| Well, if you've never accidentally crashed a system running an
| unexpectedly (and unnecessarily) "non-performant" piece of code
| then you're either an absolute genius of a coder, or relatively
| inexperienced.
| ALittleLight wrote:
| I don't think it's a problem that they wrote anything "that
| silly" (okay - maybe that list/hashmap construct was pretty
| stupid to write originally). Instead, I think it is that they
| were satisfied with 6 minute load times for years. They should
| have been profiling this and tracking it themselves prior to
| launch, getting customer feedback, etc. Someone should have
| realized this was a problem and then developers on their team
| should have taken out the profilers and figured out what was up
| and how to make it come down.
| justaman wrote:
| Genuinely confused how this guy ever thought capitalism was to
| blame for GTA's loading issue. As if any other economic
| system wouldn't have the same result.
| rectang wrote:
| Pro tip: the classic recursive approach to implementing the
| Fibonacci sequence exhibits eye-wateringly poor performance.
| elseweather wrote:
| ...which makes it a classic exercise for teaching memoization!
| astine wrote:
| And now your memory usage will grow eye-wateringly large.
| Instead, convert the algorithm to be iterative or at least
| tail-recursive and it will be faster than both the naive
| _and_ memoized versions!
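|
| A minimal iterative sketch in C (fib_iter is a made-up name;
| uint64_t overflows past fib(93)):
|
|     #include <stdint.h>
|
|     /* O(n) time, O(1) space, no recursion. */
|     uint64_t fib_iter(unsigned n)
|     {
|         uint64_t prev = 0, curr = 1;
|         while (n--) {
|             uint64_t next = prev + curr;
|             prev = curr;
|             curr = next;
|         }
|         return prev;  /* fib(0) = 0, fib(1) = 1, ... */
|     }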
| karamazov wrote:
| Or use the closed-form solution.
| jacobolus wrote:
| The "closed-form solution" is slower than standard
| method. It just uses arbitrary-precision fractional
| arithmetic (square root, exponentiation, division)
| instead of arbitrary-precision integer arithmetic
| (exponentiation of a 2x2 integer matrix).
| singhrac wrote:
| The best solution is to use O(log n) matrix
| exponentiation, which in practice is fast enough to feel
| constant.
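|
| For the idea, a rough C sketch (names are made up; uint64_t
| overflows past fib(93), so large n needs bignums):
|
|     #include <stdint.h>
|
|     /* Multiply 2x2 matrices: a = a * b (aliasing-safe). */
|     static void mat_mul(uint64_t a[2][2], uint64_t b[2][2])
|     {
|         uint64_t r[2][2] = {
|             {a[0][0]*b[0][0] + a[0][1]*b[1][0],
|              a[0][0]*b[0][1] + a[0][1]*b[1][1]},
|             {a[1][0]*b[0][0] + a[1][1]*b[1][0],
|              a[1][0]*b[0][1] + a[1][1]*b[1][1]},
|         };
|         a[0][0] = r[0][0]; a[0][1] = r[0][1];
|         a[1][0] = r[1][0]; a[1][1] = r[1][1];
|     }
|
|     /* fib(n) via fast exponentiation of [[1,1],[1,0]]. */
|     uint64_t fib(unsigned n)
|     {
|         uint64_t result[2][2] = {{1, 0}, {0, 1}};  /* identity */
|         uint64_t base[2][2]   = {{1, 1}, {1, 0}};
|         while (n) {
|             if (n & 1)
|                 mat_mul(result, base);  /* square-and-multiply */
|             mat_mul(base, base);
|             n >>= 1;
|         }
|         return result[0][1];
|     }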
| sjburt wrote:
| I've always hated how that's used as an example of recursive
| programming in early programming education. Invariably,
| students try a large n, get a stack overflow and conclude that
| recursion is inefficient, leads to mysterious errors, and must
| be avoided.
| saagarjha wrote:
| Fibonacci is generally more likely to take forever than stack
| overflow.
| jbluepolarbear wrote:
| Looped algorithms generally outperform recursive algorithms.
| didibus wrote:
| Recursion is a form of looping. I think it's more clear if
| you say imperative algorithms.
| arcturus17 wrote:
| In what sense is recursion a form of looping?
|
| I'm thinking of the memory model for each - one mutates
| some variables (at least some sort of index, unless you're
| doing an infinite loop) in a single frame, while the other
| accumulates function calls and stack frames until you
| bottom out, then return values are "folded" back up.
|
| Semantically, from a caller's perspective, they can achieve
| the exact same thing, but aside from that they seem
| radically different to me - is there anything else that
| could lead us to say that recursion is a form of looping?
| didibus wrote:
| > In computer science, a loop is a programming structure
| that repeats a sequence of instructions until a specific
| condition is met.
|
| That's at least the general definition I've always been
| most aware of. I don't want to claim it is the most
| common one, 'cause I don't really have numbers, and who is
| the authority on comp-sci definitions? But I do feel it is
| at least a somewhat common academic definition of
| looping.
|
| That would mean that recursion is a form of looping. The
| distinction between recursion and imperative loops would
| then be a matter of implementation and exposed programmer
| interface. Similarly like others have said, goto's can
| also be used to implement similar loops.
|
| And in that regard, there are variants of recursion as
| well, such as tail-call recursion, which has a memory
| model similar to what you described and differs from an
| imperative loop only in the programming interface it
| exposes.
| arcturus17 wrote:
| There's no denying that from that definition they are the
| same. It's just after you've debugged enough loops and
| recursions you can't help but think they are quite
| different!
| beaconstudios wrote:
| does that hold for languages/compilers with tail call
| optimisation?
| desertrider12 wrote:
| This isn't really related to your question, but I don't
| think tail calls could help for Fibonacci since f(n)
| branches to two calls, f(n-1) and f(n-2). And each of those
| branches into 2. So it can't be done in a finite stack area
| with naive recursion.
|
| The compiler would either have to memoize, or be extremely
| clever and start at the base case (0, 1) and then transform
| the code to use the 2x2 matrix exponentiation. I wouldn't
| have been surprised if GHC Haskell were that clever, but even
| with -O2 "print $ fibb 10000" isn't terminating.
| kaba0 wrote:
| Haskell is not that clever (or at least it was not a year
| ago). The memoized version is much much faster.
| adrianN wrote:
| If you replaced the recursion with loops but implemented
| the same algorithm you'd just have to manage a stack on
| the heap. I don't think that would be faster.
| iamevn wrote:
| Wouldn't this be the recursive version with tail calls?
|
|     (define/contract (fib n)
|       (-> nonnegative-integer? nonnegative-integer?)
|       (let fib-recur ([i 1] [curr 1] [prev 0])
|         (cond [(= i n) curr]
|               [else (fib-recur (+ i 1) (+ curr prev) curr)])))
| hydroxideOH- wrote:
| At least for Racket, which does have TCO, the for-loop is
| really a macro for a tail recursive loop. I'm not sure
| about other languages with more aggressive optimization
| though.
| jbluepolarbear wrote:
| Tail calling is basically looping, which is why I said
| "generally".
| exporectomy wrote:
| And don't cause stack overflows in normal operation. I've run
| into that before and now avoid recursion.
| Tyr42 wrote:
| Goto based algorithms generally outperform looped algorithms.
|
| (They're just more general, you can always implement a loop
| based algorithm using gotos, but not the other way around)
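|
| A toy sketch of that equivalence in C (body() is a hypothetical
| stand-in for the loop body):
|
|     struct item;  /* nothing needed beyond the stub below */
|     extern void body(int i);
|
|     void loop_with_goto(int n)
|     {
|         int i = 0;
|     top:
|         if (i >= n)
|             goto done;   /* same test a for loop would do */
|         body(i);
|         i++;
|         goto top;        /* back edge */
|     done:
|         return;
|     }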
| ThatGeoGuy wrote:
| Not exactly. This is more an artefact of language design.
|
| If you convert everything to continuation passing style,
| then every function call, tail call, recursion, etc. is
| just as expensive (and expressive) as a GOTO. This is,
| incidentally, the main "trick" or lightbulb moment in the
| classic Cheney on the MTA paper by Henry Baker [1].
|
| Now if we're talking specifically in C, then absolutely!
| But while this intuition holds for C, it's not always true
| and doesn't have to be always true.
|
| [1] https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1
| .1.54...
| jbluepolarbear wrote:
| That was a really good short read.
| FpUser wrote:
| Strange. I've written some simple parsers over the course of my
| life. The latest one was to parse a subset of SQL for a
| home-grown in-memory database.
|
| I do not remember ever using anything even remotely resembling
| scanf or related stuff. It was always: read the next char from
| some abstracted stream and run a state machine on it.
| geerlingguy wrote:
| So many times, in higher-level code, it's a foreach loop inside
| a foreach loop, where the nested loop is calling an API or re-
| running the same database call 5000 times.
|
| Move things around or just use a cache... and instant 1000%+
| speedup.
|
| I've seen this too many times to count, often in apps and on
| sites that are in fairly heavy use.
|
| The answer is often to scale up and pay for 10x more server
| capacity to handle the load, rather than spend some time
| optimizing the slow code paths.
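|
| The shape of the fix is usually just hoisting the invariant
| call out of the loop. A sketch (get_user and process are
| hypothetical stand-ins for the expensive call and the work):
|
|     struct user;  /* opaque */
|     extern struct user *get_user(int id);
|     extern void process(struct user *u, int item);
|
|     void handle_items(int id, const int *items, int n)
|     {
|         /* Before: process(get_user(id), items[i]) inside the
|            loop did one expensive lookup per iteration. */
|         struct user *u = get_user(id);   /* fetched once */
|         for (int i = 0; i < n; i++)
|             process(u, items[i]);
|     }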
| mooreds wrote:
| My fav optimization story:
|
| A few years back, I was doing some geographic calculations.
| Basically building a box of lat/lngs and getting all the points
| within that box.
|
| It was slow. Weirdly slow. I made sure the lat and lng of the
| records were in the index, but it was still slow.
|
| More testing revealed that the way I was passing the lat/lng
| into the query was causing those values to be converted to
| strings, which were converted back to numbers, but caused the
| index to not be used. This meant the database had to do a lot
| more work.
|
| Converting the parameters to numbers made sure the index could
| be used, which led to a nice speedup.
|
| Edit: I thought I wrote this up, but I didn't. But it turns out
| there was some interesting sorting shenanigans that I did:
| https://www.mooreds.com/wordpress/archives/547
|
| Sometimes it's worth peering down through those layers of
| abstraction!
| flukus wrote:
| At least that was an accidental conversion and an
| understandable mistake, I've seen the same with dates stored
| as strings from devs unaware that dates are just integers
| with a lot of maths to make them meaningful.
|
| At one place our end-of-month billing was getting slower and
| slower over the course of months, from one hour out to about
| twelve, and it had to be babysat and run in batches in order
| not to bring the whole database to its knees. We couldn't
| change the mess of legacy classic ASP of course, but adding a
| couple of actual date columns calculated from the string
| fields on insert/update brought the whole process down to
| seconds.
| TeMPOraL wrote:
| In a project I worked on a few years ago, we had persistent
| performance problems, present for a long time before I showed
| up. One of the first things I did when I joined the team was do
| some unsolicited statistical profiling and narrow it down to
| database interactions. Since we were using a custom-written
| connector to a graph database, we assumed it was the connector,
| or the JSON-based API it used.
|
| The issue got on the back burner, but it bugged me for a long
| time. I was later writing some performance-intensive number
| crunching code, and ended up writing a hacky tracing
| profiler[0] to aid my work. Having that available, I took
| another swing at our database connector issue...
|
| And it turned out the connector was perfectly fine, its
| overhead was lower than the time the DB typically spent
| filtering data. The problem was, where I expected a simple
| operation in our application to do a few DB queries, it did
| _several hundred of them_! Something that, for some reason,
| wasn't obvious in the statistical profiler, but stood out
| in the full trace like a sore thumb.
|
| I cut the run time of most user-facing operations by half by
| doing a simple tweak to the data connector - it turns out even
| a tiny unnecessary inefficiency becomes important when it's run
| a thousand times in a row. But ultimately, we were doing 100x
| as many queries as we should have, nobody noticed, and fixing
| it would be a huge architecture rework. We put it on the
| roadmap, but then the company blew up for unrelated reasons.
|
| I sometimes think that maybe if we tracked and fixed this
| problem early in the development, our sales would have been
| better, and the company would have been still alive. For sure,
| we'd be able to iterate faster.
|
| --
|
| [0] - https://github.com/TeMPOraL/tracer
| _kst_ wrote:
| There's a bigger potential problem with the *scanf() functions
| than performance. They are inherently unsafe for reading numeric
| input.
|
| For example, if you do something like this:
|
|     int n;
|     sscanf("9999999999999999999999999", "%d", &n);
|
| the behavior is undefined. As the C standard says:
|
| > ... the result of the conversion is placed in the object
| pointed to by the first argument following the format argument
| that has not already received a conversion result. If this object
| does not have an appropriate type, or if the result of the
| conversion cannot be represented in the object, the behavior is
| undefined.
|
| You can control the appropriate type by writing the call
| directly, but you can't guarantee that the result can be
| represented unless you have control over the input.
|
| Remember that in C "undefined behavior" doesn't mean that your
| program will fail, or will crash, or will tell you there was a
| problem. It means that a conforming implementation can do
| literally anything. In the worst case, it will do what you
| expect until it fails at the most inconvenient possible moment.
|
| Now most implementations will probably do something sane, like
| setting a floating-point object to infinity or an integer object
| to some arbitrary value, but the language doesn't guarantee
| anything.
|
| If you want to write safe code, you can extract a substring that
| represents a number and pass it to one of the strto*() functions,
| which do have well defined behavior on overflow. (But I couldn't
| tell you exactly what that behavior is without looking it up.)
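|
| For reference, a sketch with strtol(), whose overflow behavior
| is defined: it sets errno to ERANGE and returns LONG_MIN or
| LONG_MAX (parse_int is a made-up helper name):
|
|     #include <errno.h>
|     #include <stdlib.h>
|
|     int parse_int(const char *s, long *out)
|     {
|         char *end;
|         errno = 0;
|         long v = strtol(s, &end, 10);
|         if (end == s)          /* no digits at all */
|             return -1;
|         if (errno == ERANGE)   /* out of range for long */
|             return -1;
|         *out = v;
|         return 0;
|     }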
| sicariusnoctis wrote:
| > Remember that in C "undefined behavior" [...] means that a
| conforming implementation can do literally anything. In the
| worst case, it will do what you expect until it fails at
| the most inconvenient possible moment.
|
| Actually, the worst case possibility is that your program will
| become Skynet, enslave humanity for 10000 years, collapse all
| stars in the universe into black holes, and significantly
| accelerate processes such as the heat death of the universe.
| ncmncm wrote:
| It's even allowed for the program to change all extant texts
| of the Standard to indicate that its behavior is well-
| defined.
| Xorlev wrote:
| There's probably an npm package for that. :)
| saagarjha wrote:
| Programs don't need undefined behavior to do those things,
| you just need to wait a bit :)
| ajkjk wrote:
| Obligatory: it is flabbergasting that this is a quality of
| language that is still in active use.
| Gibbon1 wrote:
| It's even more flabbergasting that all these problems haven't
| been fixed.
| layer8 wrote:
| That would likely break backward compatibility of existing
| implementation-defined behavior.
| hyperpape wrote:
| Converting sscanf from undefined to implementation
| defined behavior would, by definition, not break
| implementation defined behaviors.
|
| There are real cases where removing undefined behavior
| blocks optimizations, but this doesn't feel like one.
| hctaw wrote:
| Most of these things can't be fixed at the language level
| without serious breakage or preventing users from doing
| necessarily dangerous things.
| imtringued wrote:
| Wow, the problems are so deep that no human should ever use
| sscanf again. It's just as bad as gets() because there is no
| way for the programmer to deal with the error case.
| vlovich123 wrote:
| You're misreading the standard a bit, I think. It's saying
| undefined behavior comes from the format string (which you
| should control; it's a common compiler warning if it's not a
| literal) not matching the types of the variables you pass it.
| This is kind of obvious when you think about it. Variadic C
| functions lose type information, so the format string is the
| source of that.
|
| The "out-of-range" issue just means that the library isn't
| going to mandate every implementation of this function is
| guaranteeing to provide the same overflow behavior (some might
| stop when you saturate, others might stop at the end of digits
| input and overflow, others might detect the overflow and
| saturate).
|
| The Linux man page is clearer here IMO:
|
| > If the number of conversion specifications in format exceeds
| the number of pointer arguments, the results are undefined. If
| the number of pointer arguments exceeds the number of
| conversion specifications, then the excess pointer arguments
| are evaluated, but are otherwise ignored.
|
| That's the only spot where the word "undefined" appears, and it
| doesn't discuss overflow. My general impression is that the
| "undefined" problem largely only applies to language operations
| or to user input causing a library to perform such an undefined
| behavior. Older C functions with older documentation may be
| using "undefined" with a less strict meaning that also covers
| "implementation-defined". The "undefined behavior" brouhaha only
| came up in the past 5-10 years, when compilers actually started
| leveraging it, breaking a lot of assumptions.
| Gravityloss wrote:
| I think these kinds of things get caught in an organization with
| technically competent people who can give feedback to the
| product process. People who can understand from first principles
| how long something should take, and don't tolerate it taking ten
| times as long.
| ganafagol wrote:
| Appreciate the blog content. But a more descriptive title here in
| HN would be nice. This one is right up there with "get a life"
| recently.
| moonchild wrote:
| (Originally on lobsters[0].)
|
| I maintain my original position that sscanf calculating the
| entire length of its input is absolutely ridiculous. Are *scanf
| difficult to use safely, not very robust, and somewhat baroque?
| Yes. Should sscanf("%f") be a correct (not performance-killing)
| way of reading floats? Also yes. (Though aside: the OP seems to
| be reading data from files, so they could have just used fscanf,
| which has correct performance already.)
|
| Unfortunately, many libcs are guilty of this:
|
| - glibc uses memchr (the trail is convoluted, but ends up at
| _IO_str_init_static_internal)
|
| - freebsd libc (and thus also the apple and android libcs, as
| well as those of the other BSDs) use strlen
|
| - uclibc and newlib are the same as freebsd (appear to be copied
| directly from it)
|
| - Since the original bug was in GTA, which only runs on Windows,
| I must presume msvcrt has the same problem
|
| - musl has the correct behaviour, processing input in 128-byte
| chunks
|
| - managarm doesn't strlen but looks broken for unrelated reasons.
| (Assumes nul byte means eof.) Also has codebloat because of
| templates.
|
| - serenityos tries to implement fscanf in terms of sscanf, not
| the other way around! Unfortunately that means it chomps a whole
| line of input at every call, so it doesn't even work correctly.
| Horrifying.
|
| - pdclib has ok performance, but with an interesting
| implementation: it duplicates code between sscanf and fscanf,
| though the heavyweight format parsing is shared.
|
| - dietlibc and sortix have the sensible, simple implementation
|
| 0. https://lobste.rs/s/0obriy/it_can_happen_you#c_giuxfq
| shawnz wrote:
| Agreed... I don't see why people are arguing this problematic
| sscanf behaviour should be seen as an intentional deficiency
| and not a bug.
|
| The argument seems to be, "what were you expecting using C
| stdlib functions for that?" Well, of course they will suck
| forever with that mentality.
| froh wrote:
| most of these provide fmemopen(3), which makes the temporary
| FILE object explicit and persistent for the whole parse, and
| needs just one strlen call.
|
| https://news.ycombinator.com/item?id=26343149
| lostdog wrote:
| Exactly. If the standard library's sorting function executed in
| O(n^5), I would consider that a problem with the standard
| library.
| desertrider12 wrote:
| Reading this article was a surprise for me, I didn't know of
| this issue at all.
|
| But this is pretty ridiculous. If it's possible to write scanf,
| which matches chars from a stream, why can't sscanf just do the
| exact same thing but check for '\0' rather than EOF...
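|
| As a sketch, the inner "get a char" step only needs something
| like this (sstream and s_getc are hypothetical names):
|
|     #include <stdio.h>  /* for EOF */
|
|     struct sstream { const char *p; };
|
|     /* Stop at the terminator when we actually reach it,
|        instead of scanning for it up front; no strlen needed. */
|     static int s_getc(struct sstream *s)
|     {
|         return *s->p ? (unsigned char)*s->p++ : EOF;
|     }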
| JdeBP wrote:
| It can, and the people who only check a few well-known open
| source C library implementations miss that there is quite a
| range of _other_ C library implementations out there that _do
| this very thing_, from P.J. Plauger's through OpenWatcom's
| and Tru64 Unix's to mine. (-:
|
| * https://news.ycombinator.com/item?id=26300532
| albertzeyer wrote:
| I think I would argue that both in this case and in the case of
| GTA, sscanf is actually to blame. Sure, by profiling, this could
| have been detected, and the workarounds are simple. But sscanf
| doesn't need to be so slow. A naive implementation of sscanf
| would not be slow. So I think it is perfectly fine to assume
| that sscanf should take only constant time to parse something
| like a number (obviously not "%s").
| tyingq wrote:
| This HN comment and its follow ups are an interesting read. There
| are some stdlib implementations that don't have this issue.
|
| https://news.ycombinator.com/item?id=26298300
| froh wrote:
| also a trivial workaround is to wrap the buffer with a FILE via
| fmemopen(3), which makes the temporary FILE object explicit and
| persistent for the whole parse, and needs just one strlen call.
|
| https://news.ycombinator.com/item?id=26343149
| jedimastert wrote:
| Quick question: At the top of the parser they define
|
|     const char VERTEX_STR[] = "vertex ";
|
| And a few lines in:
|
|     data += strlen(VERTEX_STR);
|
| Would compilers optimize this out? Seems like an easy win to
| replace that with a "7" (or a constant or something), although I
| don't know how much of a win it would be.
| mkeeter wrote:
| I was trusting the compiler on this one, but after someone
| asked this question on Twitter, I double-checked:
| https://cppx.godbolt.org/z/fhTGcx
|
| Sure enough, it compiles down to "add rax, 7"
| jedimastert wrote:
| I figured it would, but I didn't want to assume. Thanks!
| tom_mellior wrote:
| Years ago I was working on a compiler frontend that was capable
| of reading multiple files into one program representation, as
| opposed to compilers that compile each input file completely
| separately. At some point we were trying to compile some project
| made up of many small files, and our frontend got very slow.
| However, when we concatenated all those files into one big source
| file, everything went smoothly.
|
| I investigated. It turned out that we were keeping some sort of
| global symbol table structure. It was updated after parsing each
| file, something like this (the original was C++, but this is
| pseudocode because I can't be bothered):
|     list<symbol> global_symbol_table;
|
|     void update_symbol_table(list<symbol> new_symbols) {
|         global_symbol_table.add_all(new_symbols);
|         // Keep sorted so we can do efficient lookups
|         // with binary search.
|         sort(global_symbol_table);
|     }
|
| For N files, this meant N calls to sort() on a list that grew
| linearly in N, so having something like O(N^2 log N) complexity
| overall.
|
| This has a few problems:
|
| 1. Even if you want to use a "sorted list" representation, it
| would suffice to sort the new symbols only (the global table is
| always sorted by its invariant) and then merge the two sorted
| lists.
|
| 2. But really, you want a set data structure that does the same
| thing better.
|
| 3. But also, could we maybe speed up the lookups in some other
| way? I looked around for the uses of this global symbol table and
| found... none. We were keeping data around and updating it in the
| least efficient manner imaginable, without ever using that data.
|
| I deleted the above function and the global symbol table, and
| performance was back to expected levels.
| einpoklum wrote:
| Bugs happen to everyone, all the time - correctness bugs and
| performance bugs.
|
| On the other hand - if you write parsing code and completely
| ignore the cost of functions, then, well, you sort of had it
| coming.
| anothernewdude wrote:
| Doesn't matter unless it's a critical path for your application.
| You can do things in an especially stupid and obnoxious way, as
| long as they're not on a common path that will affect your users.
| froh wrote:
| I wonder if The Right Way of scanf-scanning potentially
| large strings is then the fmemopen route?
|
|     /* ... goal: scanf some looong char *s in some loop */
|     FILE *s_as_FILE = fmemopen(s, strlen(s), "r");
|     /* loop loop loop */
|     ... { fscanf(s_as_FILE, "FORMAT", &v); }
|     fclose(s_as_FILE);
|
| fmemopen has been around for a while now; it should work with
| many libc implementations these days.
|
| ?
| fastball wrote:
| One of my favorite things about the Redis docs[1] is that they
| include the time complexity for every command type.
|
| [1] https://redis.io/commands/
| jbluepolarbear wrote:
| This is really cool. I didn't know about this until the GTA story
| came out. I wonder how many of many applications get this wrong.
| souprock wrote:
| It's just plain weird that the library would use a generic length
| function. Might it have something to do with needlessly accepting
| pathological NaN representations?
|
| The expected code would do something closer to this:
|
| len = strspn(s, "0123456789INFTYAEBCDXPinftyaebcdxp-+.");
|
| That would be optimized to use a bitmap.
| decasia wrote:
| It would be nice if it were more common for standard library
| functions to include algorithmic complexity as part of the
| standard documentation.
|
| Absent that, of course we can potentially read the source code
| and find out, but I think for the most part we tend to operate
| based on an informed assumption about what we _imagine_ the
| algorithmic complexity of a given operation would be. Inevitably,
| sometimes the assumption is incorrect.
|
| There's no way to develop software without making assumptions,
| some of which inevitably turn out to be incorrect, so I don't
| think there is any great shame in having that happen, in itself.
| But better docs could help us make better assumptions with less
| effort, at least.
| quelsolaar wrote:
| There are many implementations of the C standard library, and
| they may not have the same complexity. In this case the code has
| the right complexity (linear), but with the wrong n (string
| length instead of parse length).
| zachrip wrote:
| This is a cool idea... I'm sure it's been done before
| automatically in code, but I'd love to see this in my editor.
| It sounds similar to the bundle-size extension that annotates
| how big a package you're importing is in Node.
| beaconstudios wrote:
| personally, I think I wouldn't even bother to check the
| algorithmic complexity of every external function I call. I'd
| just use the logical choice (like sscanf) and only consider
| optimising if things started to slow down and profiling the
| application highlighted it as a bottleneck.
| decasia wrote:
| Yes, I absolutely think profiling and then only optimizing
| the actual problems is always a sound choice.
|
| I don't check the docs for every library function I use. I'm
| just saying, it wouldn't hurt if, when you _do_ read the docs
| for standard library functions, the algorithmic complexity
| was mentioned in passing.
| saagarjha wrote:
| Luckily, this happens to be one of the places where
| learning computer science can help!
| chihuahua wrote:
| In principle, that sounds good. But then it can happen that
| you profiled when N=1000 and it seems fine. Then a few
| years later (like in GTA), N has grown to 63,000 and it's
| no longer fine. It seems unlikely the developer will go
| back and profile it again. Also, I think the original
| Windows Update algorithm for figuring out which updates you
| needed to download started out fine, but 20 years later it
| turns out it's quadratic and now there are tens of
| thousands of updates, so it becomes almost impossible to
| install XP SP2 from a CD and have it update itself.
| telotortium wrote:
| Do you have a link for the Windows Update algorithm being
| quadratic?
| ymosy wrote:
| I believe this is what the parent was referring to:
| https://arstechnica.com/information-
| technology/2013/12/expon...
| chihuahua wrote:
| Thank you, I had a vague memory of this but this article
| sums it up perfectly and states that it's worse than
| quadratic, it's exponential.
| zbendefy wrote:
| also dont forget the quadratic time desktop icon
| arrangement:
| https://news.ycombinator.com/item?id=26152335
| ajkjk wrote:
| But you'll lose so much time doing that! Realizing there's a
| bug and investigating it is a huge amount of work compared to
| never writing it in the first place.
| beaconstudios wrote:
| No, spending time optimising areas of the code that will
| never become bottlenecks is the waste of time.
| account42 wrote:
| This is how you get software where everything is just
| fast enough to be tolerable but still annoyingly slow.
| beaconstudios wrote:
| No, not paying attention to performance at all is how
| that happens. Optimising all your code in advance is just
| being frivolous with your time.
| ajkjk wrote:
| But if you know the performance of an algorithm up front,
| you don't have to spend any time optimizing it in the
| first place. You just know what to do, because you know
| the performance.
|
| For instance: suppose you are building a CRUD app on a
| SQL database. Do you (a) add indexes for important
| queries as you go? or (b) ignore indexes and later
| profile and see what queries are slow. No, of course you
| just make the indexes in the first place. Having to do
| the latter would mean that instead of having a fast app
| out of the gate, you have an app that gets slower over
| time and requires additional dev time to debug and
| improve. Profiling and fixing performance problems is a
| massive waste of everyone's time if the problem could
| have been dodged when the code was being written.
|
| It's different if the optimization is significant
| engineering effort. Then, yes, put them off till it's
| needed. But most aren't, in my experience: most
| optimizations are totally simple, in hindsight, and the
| code should have been written that way in the first
| place.
| beaconstudios wrote:
| Of course you index hot columns up front in that case,
| but I think where we disagree is that you want to
| generalise "optimise up front" into a rule, do or don't;
| I consider whether it's applicable in the circumstance. C
| programs tend to use a lot of system calls, and are also
| usually easily rapidly testable with large data. So
| rather than profile every individual std function I call,
| I'll just profile the very resource intensive paths with
| different scales of data and see if anything pops off. If
| R* had profiled their JSON parser with a 1 GB file, they
| would've found this bug.
|
| I don't disagree unilaterally with "optimise up front"; I
| disagree with unilateralism.
| gifnamething wrote:
| Bugfixing isn't optimisation
| beaconstudios wrote:
| The line between bug fixing and faffing around is context-
| based, and there are efficient and inefficient ways both to
| fix bugs and to faff around. Profiling every stdlib
| function is probably both inefficient and faffing around,
| unless your circumstances dictate it's a worthwhile and
| effective effort (i.e., there's no better alternative for
| reaching the goal).
| TeMPOraL wrote:
| I personally would, if it was listed in documentation. Doing
| stuff and profiling later is the right _general_ approach to
| performance optimization. But what's better is not making
| stupid mistakes in the first place, if they are trivial to
| avoid. To achieve that, you need to know the complexity
| guarantees of functions and data structures - or at least
| their ballpark (like, "this could be O(n) or perhaps
| O(n log n), definitely not worse").
|
| This is where setting the guarantees and documenting them is
| useful - it allows people to trivially avoid making these
| performance mistakes. Prevention is better than cure, in that
| - as GTA Online case demonstrates - in the latter stage of
| product development, people may not bother fixing performance
| anymore.
| layer8 wrote:
| It might still not help in the case of sscanf. The
| documentation would specify that it's O(N) in the size of
| the input string, just what one would expect without deeper
| thought. The problem is not O(N), the problem is that N is
| the complete input string, not just the part being parsed.
| The documentation would have to include a big fat warning
| about that.
| beaconstudios wrote:
| Part of the issue in my mind is that big O complexity
| values don't (can't?) tell you the point of inflection, if
| they are nonlinear. Sscanf could be O(N^10) (does the rule
| about ignoring the coefficient also apply to the exponent?)
| but if it only starts to hurt your application at the point
| of 10tb string reads then it's still unimportant.
|
| I do agree that people often take "don't optimise early" as
| licence to not make optimisations up front that will
| clearly be needed (for example, implementing the buffer as
| a rope in a text editor), but I don't think this is the
| case here unless you test the function with reasonable
| future file sizes (something you should be doing anyway)
| and Sscanf profiles as a bottleneck.
| TeMPOraL wrote:
| I agree. And I'd personally probably trip over it too, if
| I used C functions much (I use C++ standard library
| equivalents, and I'm going to recheck string parsing code
| in our project now, because we may have that same
| problem, just with different API!). strlen() in a loop is
| something that can sneak up on you by virtue of having
| too many layers of abstraction - scanf() family being one
| of them.
|
| But what I'm saying is, documenting big O complexity is
| useful, even if imperfect (and if the function has
| "tricky" complexity, the conditions where it gets bad
| should be documented too!).
|
| > _sscanf could be O(N^10) (does the rule about ignoring
| the coefficient also apply to the exponent?), but if it
| only starts to hurt your application at the point of
| 10 TB string reads then it's still unimportant._
|
| Sure, but then applications grow, functions get reused.
| Iff it's trivial to spot and replace O(N^10) Sscanf with
| a O(n) alternative, I'd want to know the complexity and
| do the replacement immediately - otherwise, you may
| discover two years later that the company is investing
| employee-years into horizontally scaling your application
| as it's slow under workload, where fixing that one thing
| would let it run that same workload on a laptop, on a
| single core.
|
| (This is not a joke, there are "big data" stories like
| this.)
| beaconstudios wrote:
| > But what I'm saying is, documenting big O complexity is
| useful, even if imperfect (and if the function has
| "tricky" complexity, the conditions where it gets bad
| should be documented too!).
|
| Ah yes I totally agree on this point, it _should_ be
| documented for those who are more cautious /mathematical
| in their engineering efforts. I'm just not one of those
| people! I prefer to focus on the structural behaviour of
| a team (if it's my responsibility) to ensure that cases
| like:
|
| > two years later that the company is investing employee-
| years into horizontally scaling your application as it's
| slow under workload
|
| Resolve correctly - if a team is set up to respond
| healthily to production slowdowns (of course the equation
| is different for software with a slow/nonexistent update
| loop) then IMO you don't need to invest as heavily into
| up-front optimisation and can instead invest into
| features, allowing (sample-based for busy paths)
| profiling in production to notify you when optimisation
| is warranted.
|
| At the end of the day, up front optimisation is an
| exchange of time with your future self/colleagues. It's
| worth it in some cases, not in others - but knowing where
| the balance in a tradeoff lies in your circumstance is a
| good chunk of all engineering decisions!
| imtringued wrote:
| The important part of the documentation would be that N
| is the length of the entire string, not just the subset
| of data that needs to be processed. The actual complexity
| isn't relevant in this case.
| epr wrote:
| Keeping track of algorithmic complexity would be nice as a
| language and/or static analysis feature. If you wanted to be
| exact or do it for a language with complex metaprogramming I
| assume it would be a nightmare to implement. Absent those
| complications and especially if you always reduced it to O(1),
| O(n), O(log(n)), etc it might not even be that difficult given
| the potential advantages.
| groby_b wrote:
| The difficulty here is "define n". And I don't mean that
| facetiously. You have a string parsing lib. It is, for
| reasons, quadratic over the number of strings parsed, and
| linear per string.
|
| This is overall n^3, but that's meaningless because there
| actually isn't just one n. So, more m^2 * n. That means you
| can't reduce it to anything, because you want to keep both
| components. (Because, say, you know it will only ever be
| called with a single string).
|
| But then, in the next app, this gets called and reinitialized
| once per file. And the routine handling files, for reasons
| beyond our ken, is (n lg n). We're now at k * log(k) * m^2 *
| n.
|
| And so, over any sufficiently long call chain, "what is n" is
| the overriding question - string length, number of strings,
| number of files? Not "how complex is the algorithm", because
| you want to optimize for what's relevant to your use case.
| epr wrote:
| I guess you're right. Keeping track of it all is required
| for the information to be meaningful enough. Still seems
| doable to me, assuming the functions are pure.
|
| Here's another crazy idea: keeping track of this while
| taking into consideration aggressive compiler
| optimizations.
| phkahler wrote:
| It would be a huge step to simply follow the call tree and
| report the depth of nested loops for each branch. You could
| then check what N is at each level.
|
| The trick is knowing where the nested loops are since they
| can be spread across functions.
|
| I had a function that scaled as N^2, but it was creating a
| list of that size as well. Then it called a function to
| remove duplicates from that list. That function was N^2 in
| the list size, which meant the whole thing was actually N^4.
| And now that I think of it, those loops were not nested... I
| rewrote the first part to not create duplicates and deleted
| the quadratic deduplication. Now it's N^2, but it has to be.
| [deleted]
| adrianN wrote:
| There is no general solution to deciding whether a program is
| O(n^k).[1] So your static analysis will either not know the
| answer for some programs, report a wrong bound, or report a
| ridiculous overestimate.
|
| [1] https://cstheory.stackexchange.com/questions/5004/are-
| runtim...
| Jyaif wrote:
| Can you point me towards some source code where a human
| can't find the algorithmic complexity?
| adrianN wrote:
| It's not hard to implement the construction in the proof.
| Generally you'll encounter problems in the wild in any
| interpreter. Similarly, you can encode many open
| mathematical problems into simple programs where finding
| runtime bounds is equivalent to solving the problem. The
| Collatz Conjecture, for example.
| namibj wrote:
| Once you leave primitive recursive functions[0],
| reasoning quickly becomes very non-trivial.
|
| [0]: https://en.wikipedia.org/wiki/Primitive_recursive_fu
| nction
| smallnamespace wrote:
| Humans can't tell you whether this program will run
| forever on any particular (positive integer) input, or
| whether all inputs terminate.
|
|     def collatz(n):
|         while n != 1:
|             print(n)
|         if n % 2 == 0:
|             n = n // 2
|         else:
|             n = n * 3 + 1
|         print(1)
| kibibyte wrote:
| I think your indentation needs to be adjusted? Like so:
|
|     def collatz(n):
|         while n != 1:
|             print(n)
|             if n % 2 == 0:
|                 n = n // 2
|             else:
|                 n = n * 3 + 1
|         print(1)
|
| Otherwise, n = 1 terminates, and n != 1 gets stuck
| looping at lines 2-3.
| Jyaif wrote:
| Thank you!
| rcxdude wrote:
| So? Static analysis doesn't need to always produce an
| answer, only produce an answer most of the time. The
| question isn't whether you can do it in general for all
| inputs (this is not possible for basically anything you
| would want to know), it's whether you can do it enough of
| the time on the kind of code which people actually write.
| chii wrote:
| > standard library functions to include algorithmic complexity
| as part of the standard documentation.
|
| It would be acceptable to have a graph of input size vs.
| time, and this graph could be autogenerated using a testbed
| that is also used by unit testing! Two birds, one stone!
| jacobajit wrote:
| I don't think we could come up with standard x-axis bounds
| for such a graph, since n = 1000 or 1,000,000 may not be
| zoomed out enough to showcase the behavior approaching
| infinity.
| AdieuToLogic wrote:
| > It would be nice if it were more common for standard library
| functions to include algorithmic complexity as part of the
| standard documentation.
|
| Isn't the point of standard library functions to define the
| contract and not implementation constraints? In other words, if
| algorithmic complexity is a significant concern, perhaps a
| standard library function is not a suitable choice.
| adrianN wrote:
| Algorithmic complexity is not an implementation detail, it's
| part of the constraints on how you can use the function.
___________________________________________________________________
(page generated 2021-03-04 23:01 UTC)