[HN Gopher] Love C, hate C: Web framework memory problems
___________________________________________________________________
Love C, hate C: Web framework memory problems
Author : OneLessThing
Score : 149 points
Date : 2025-10-10 03:39 UTC (1 day ago)
(HTM) web link (alew.is)
(TXT) w3m dump (alew.is)
| jacquesm wrote:
| There are many, many more such issues with that code. The person
| that posted it is new to C and had an AI help them to write the
| code. That's a recipe for disaster, it means the OP does not
| actually understand what they wrote. It looks nice but it is full
| of footguns and even though it is a useful learning exercise it
| also is a great example of why it is better to run battle-tested
| frameworks than to inexpertly roll your own.
|
| As a learning exercise it is useful, but it should never see
| production use. What is interesting is that the apparent
| cleanliness of the code (it reads very well) is obscuring the
| fact that the quality is actually quite low.
|
| If anything I think the conclusion should be that AI+novice does
| not create anything that is usable without expert review and
| that that probably adds up to a net negative other than that the
| novice will (hopefully) learn something. It would be great if
| someone could put in the time to do a full review of the code; I
| have just read through it casually and already picked up a couple
| of problems, and I'm pretty sure that a thorough job would turn up
| many more.
| drnick1 wrote:
| > What is interesting is that the apparent cleanliness of the
| code (it reads very well) is obscuring the fact that the
| quality is actually quite low.
|
| I think this is a general feature and one of the greatest
| advantages of C. It's simple, and it reads well. Modern C++ and
| Rust are just horrible to look at.
| messe wrote:
| I slightly unironically believe that one of the biggest
| hindrances to rust's growth is that it adopted the :: syntax
| from C++ rather than just using a single . for namespacing.
| jacquesm wrote:
| I believe that the fanatics in the rust community were the
| biggest factor. They turned me off what eventually became a
| decent language. There are some language particulars that
| were strange choices, but I get that if you want to start
| over you will try to get it all right this time around. But
| where the Go authors tried to make the step easy and kept
| their ego out of it, it feels as if the rust people aimed
| at creating a new temple rather than to just make a new
| tool. This created a massive chicken-and-egg problem
| that did not help adoption at all. Oh, and toolchain speed.
| For non-trivial projects for the longest time the rust
| toolchain was terribly slow.
|
| I don't remember any other language's proponents actively
| attacking the users of other programming languages.
| imtringued wrote:
| Software vulnerabilities are an implicit form of
| harassment.
| messe wrote:
| I'm hoping that's meant to satirise the rust community,
| because it's horseshit like this that makes a sizeable
| subset of rust evangelists unbearable.
| 01HNNWZ0MV43FF wrote:
| > I don't remember any other language's proponents
| actively attacking the users of other programming
| language.
|
| I just saw someone on Hacker News saying that Rust was a
| bad language because of its users
| jacquesm wrote:
| Yawn. Really, if you have nothing to say don't do it
| here.
| LexiMax wrote:
| Gotcha hypocrisy might be a really cheap thing to point
| out, but they're not wrong.
|
| I have noticed my fair share of Rust Derangement Syndrome
| in C++ spaces that seems completely outsized from the
| series of microaggressions that they eventually point out
| when asked "Why?"
| dgfitz wrote:
| It's interesting, over the past 15 years I've had
| occasion to work with other c/c++ devs on various
| contracts, probably 50-ish distinct companies.
| Not once has rust even come up in casual conversation.
| lelanthran wrote:
| > I believe that the fanatics in the rust community were
| the biggest factor.
|
| I second this; for a few years it was impossible to have
| any sort of discussion in various programming forums when
| the topic was C: the conversation would get quickly
| derailed with accusations of "dinosaur", etc.
|
| Things have gone quiet recently (the last three years,
| though) and there have been far fewer derailments.
| LexiMax wrote:
| As an outsider, I don't really see the Rust community
| doing anything recently that they weren't already doing
| from the start.
|
| What seems to have changed in recent years is the buy-in
| from corporations that seemingly see value in its
| promises of safety. This seems to be paired with a
| general pulling back of corporate support from the C++
| world as well as a general recession of fresh faces, a
| change that at least from the sidelines seems to be
| mostly down to a series of standards committee own-goals.
| LexiMax wrote:
| Being a C++ developer and trafficking mostly in C++
| spaces, there is a phenomenon I've noticed that I've
| taken to calling Rust Derangement Syndrome. It's where C
| and C++ developers basically make Rust the butt of every
| joke, and make fun of it in a way that is completely
| outsized relative to how much they interact with Rust
| developers in the wild.
|
| It's very strange to witness. Annoying advocacy of
| languages is nothing new. C++ was at one point one of
| those languages, then it was Java, then Python, then
| Node.js. I feel like if anything, Rust was a victim of a
| period of increased polarization on social media, which
| blew what might have been previously seen as simple
| microaggressions completely out of proportion.
| hu3 wrote:
| I don't think Rust will ever be as big as C++ because
| there were fewer options back then.
|
| These days Go/Zig/Nim/C#/Java/Python/JS and other
| languages are fast enough for most use cases.
|
| And Rust's learning curve doesn't help either. C++ was
| basically C with OOP on steroids. Rust is very different.
|
| I say that because I wouldn't group Rust opposition with
| any of those languages you cited. It's different for
| mostly different reasons and magnitudes.
| cyphar wrote:
| > But where the Go authors tried to make the step easy
| and kept their ego out of it
|
| That is very different to my memories of the past decade+
| of working on Go.
|
| Almost every single language decision they eventually
| caved on that I can think of (internal packages,
| vendoring, error wrapping, versioning, generics) was
| preceded by months if not years of arguing that it wasn't
| necessary, often followed by an implementation attempt
| that seemed to be ever so slightly off, just out of spite.
|
| Let's not forget that the original Go 1.0 expected
| every project's main branch to maintain backward
| compatibility forever or else downstreams would break,
| and without hacks (which eventually became vendoring) you
| could not build anything without an internet connection.
|
| To be clear, I use Go (and C... and Rust) and I do like
| it on the whole (despite and for its flaws) but I don't
| think the Go _authors_ are that different to the Rust
| _authors_. There are (unfortunately) more fanatics in the
| Rust community but I think there's also a degree to
| which some people see anything Rust-related as being an
| attack on other projects regardless of whether the Rust
| authors intended it to be that way.
| jacquesm wrote:
| Fair enough.
| citbl wrote:
| The safer the C code, the more horrible it starts looking
| though... e.g. my_func(char msg[static 1])
| uecker wrote:
| Compared to other languages, this is still nice.
| jacquesm wrote:
| It is - like everything else - nice because you, me and
| lots of others are used to it. But I remember starting
| out with C and thinking 'holy crap, this is ugly'. After
| 40+ years looking at a particular language it no longer
| looks ugly simply because of familiarity. But to a
| newcomer C would still look quite strange and
| intimidating.
|
| And this goes for almost all programming languages. Each
| and every one of them has warts and issues with syntax
| and expressiveness. That holds true even for the most
| advanced languages in the field - Haskell, Erlang, Lisp -
| and more so for languages that were originally designed
| for 'readability'. Programming is by its very nature more
| akin to solving a puzzle than to describing something.
| The puzzle is how to get the machine to do something,
| to do it correctly, to do it safely and to do it
| efficiently, and all of those while satisfying the
| constraint of how much time you are prepared (or allowed)
| to spend on it. Picking the 'right' language will always
| be a compromise on some of these, there is no programming
| language that is perfect (or even just 'the best' or
| 'suitable') for all tasks, and there is no programming
| language that is better than all the others for every
| subset of tasks unless that subset is very small.
| uecker wrote:
| I agree that the first reaction usually is only about
| what one is used to. I have seen this many times. Still,
| of course, not all syntax is equally good.
|
| For example, the problem with Vec<Vec<T>> for a 2D array
| is not that one is not used to it, but that the syntax is
| just badly designed. Not that C would not have
| problematic syntax, but I still think it is fairly good
| in comparison.
| jacquesm wrote:
| C has one massive advantage over many other languages: it
| is just a slight level above assembler and it is just
| about as minimal as a language can be. It doesn't force
| you into an eco-system, plays nice with lots of other
| tools and languages and gets out of the way. 'modern'
| languages, such as Java, Rust, Python, Javascript (Node)
| and so on all require you to buy in to the whole menu,
| they're not 'just a language' (even if some of them
| started out like that).
| uecker wrote:
| Not forcing you into an eco-system is what makes C
| special, unique and powerful, and this aspect is not well
| understood by most critics. Stephen Kell wrote a great
| essay about it.
| moefh wrote:
| I don't understand why people think this is safer, it's the
| complete opposite.
|
| With that `char msg[static 1]` you're telling the compiler
| that `msg` can't possibly be NULL, which means it will
| optimize away any NULL check you put in the function. But
| it will still happily call it with a pointer that could be
| NULL, with no warnings whatsoever.
|
| The end result is that with an "unsafe" `char *msg`, you
| can at least handle the case of `msg` being NULL. With the
| "safe" `char msg[static 1]` there's nothing you can do --
| if you receive NULL, you're screwed, no way of guarding
| against it.
|
| For a demonstration, see [1]. Both gcc and clang are passed
| `-Wall -Wextra`. Note that the NULL check is removed in the
| "safe" version (check the assembly). See also the gcc
| warning about the "useless" NULL check ("'nonnull' argument
| 'p' compared to NULL"), and worse, the lack of warnings in
| clang. And finally, note that neither gcc nor clang warns
| about the call to the "safe" function with a pointer that
| could be NULL.
|
| [1] https://godbolt.org/z/qz6cYPY73
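|
| A rough sketch of the shape being described (the function
| names here are invented; the godbolt link above has the real
| comparison):
|
|     #include <string.h>
|
|     /* "safe": the compiler may assume msg is non-NULL, so
|        the NULL check below can be optimized away */
|     size_t safe_len(char msg[static 1]) {
|         if (msg == NULL) return 0;
|         return strlen(msg);
|     }
|
|     /* plain pointer: the NULL check survives */
|     size_t plain_len(const char *msg) {
|         if (msg == NULL) return 0;
|         return strlen(msg);
|     }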
| lelanthran wrote:
| > I don't understand why people think this is safer, it's
| the complete opposite.
|
| Yup, and I don't even need to check your godbolt link -
| I've had this happen to me once. It's the implicit
| casting that makes it a problem. You cannot even typedef
| it away as a new type (the casting still happens).
|
| The real solution is to create and use opaque types. In
| this case, wrapping the `char[1]` in a struct would
| almost certainly generate compilation errors if any
| caller passed the wrong thing in the `char[1]` field.
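|
| A minimal sketch of that idea (the names are invented): wrap
| the array in a struct so the implicit pointer conversion can
| no longer happen silently.
|
|     typedef struct { char buf[1]; } msg1;
|
|     void my_func(msg1 *msg);
|
|     /* my_func(some_char_ptr) is now a compile-time error
|        instead of a silent conversion */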
| pjmlp wrote:
| Meanwhile, in Modula-2 from 1978, that would be
| PROCEDURE my_func(msg: ARRAY OF CHAR);
|
| Now you can use LOW() and HIGH() to get the lower and upper
| bounds, and access is naturally bounds checked unless you
| disable that, locally or globally.
| jacquesm wrote:
| This should not be downvoted, it is both factually
| correct _and_ a perfect illustration of these problems
| having already been solved, and ages ago at that.
|
| It is as if just pointing this out already antagonizes
| people.
| pjmlp wrote:
| A certain group of people likes to pretend that before C
| there were no other systems programming languages, other
| than BCPL.
|
| They ignore what has happened since 1958 (JOVIAL being a
| first attempt), and thus all of C's failings are excused
| because it was discovering the world.
| OneLessThing wrote:
| I agree that it reads really well which is why I was also
| surprised the quality is not high when I looked deeper. The
| author claims to have only used AI for the json code, so your
| conclusion may be off; it could just be a novice doing novice
| things.
|
| I suppose I was just surprised to find this code promoted in my
| feed when it's not up to snuff. And I'm not hating, I do in
| fact love the project idea.
| lifthrasiir wrote:
| Yeah, I recently wrote a moderate amount of C code [1] entirely
| with Gemini and while it was much better than what I initially
| expected, I needed to steer it constantly to avoid
| inefficient or less safe code. It needed extensive fuzzing
| to get the
| minimal amount of confidence, which caught at least two serious
| problems---seriously, it's much better than most C programmers,
| but still.
|
| [1] https://github.com/lifthrasiir/wah/blob/main/wah.h
| jacquesm wrote:
| I've been doing this the better part of a lifetime and I
| still need to be careful so don't feel bad about it. Just
| like rust has an 'unsafe' keyword I realize _all_ of my code
| is potentially unsafe. Guarding against UB, use-after-free,
| array overruns and so on is a lot of extra work and you only
| need to slip up once to have a bug, and if you're unlucky
| something exploitable. You get better at this over the years.
| But if I know something needs to be bullet proof the C
| compiler would not be my first tool of choice.
|
| One good defense is to reduce your scope continuously. The
| smaller you make your scope the smaller the chances of
| something escaping your attention. Stay away from globals and
| global data structures. Make it impossible to inspect the
| contents of a box without going through a well defined
| interface. Use assertions liberally. Avoid fault propagation,
| abort immediately when something is out of the expected
| range.
| uecker wrote:
| A strategy that helps me is to not use open-coded pointer
| arithmetic or string manipulation but to encapsulate those
| behind safe bounds-checked interfaces. Then essentially
| only lifetime issues remain, and for those I usually do
| have a simple policy and clearly document any exception. I
| also use signed integers and the sanitizer in trapping
| mode, which turns any such issue I may have missed into a
| run-time trap.
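|
| A small sketch of what such an interface could look like (the
| type and function names are made up for illustration):
|
|     #include <assert.h>
|     #include <stddef.h>
|
|     typedef struct {
|         char     *data;
|         ptrdiff_t len;   /* signed length, as discussed */
|     } span;
|
|     static char span_get(span s, ptrdiff_t i) {
|         assert(0 <= i && i < s.len);  /* trap, don't overrun */
|         return s.data[i];
|     }
|
|     static span span_sub(span s, ptrdiff_t off, ptrdiff_t n) {
|         assert(0 <= off && 0 <= n && n <= s.len - off);
|         return (span){ s.data + off, n };
|     }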
| OneLessThing wrote:
| This is why I love C. You can build these guard rails at
| exactly the right level for you. You can build them all
| the way up to CPython and do garbage collection and
| constant bounds checking. Or keep them at just raw
| pointer math. And everywhere in between. I like your
| approach. The downside being that there are probably
| 100,000+ bespoke implementations of similar guard rails
| whereas python users, for example, all get them for free.
| jacquesm wrote:
| It definitely is a lot of freedom.
|
| But the lack of a good string library is by itself
| responsible for a very large number of production issues,
| as is the lack of foresight regarding de-referencing
| pointers that are no longer valid. Lack of guardrails
| seems to translate into 'do what you want', not necessarily
| 'build guard rails at the right level for you', most
| projects simply don't bother with guardrails at all.
|
| Rust tries to address a lot of these issues, but it does
| so by tossing out a lot of the good stuff as well and
| introducing a whole pile of new issues and concepts that
| I'm not sure are an improvement over what was there
| before. This creates a take-it-or-leave-it situation, and
| a barrier to entry. I would have loved to see that guard
| rails concept extended to the tooling in the form of
| compile time flags resulting in compile time flagging of
| risky practices (there is some of this now, but I still
| think it is too little) and runtime errors
| for clear violations.
|
| The temptation to 'start over' is always there, I think C
| with all of its warts and shortcomings is not the best
| language for a new programmer to start with if they want
| to do low level work. At the same time, I would - still,
| maybe that will change - hesitate to advocate for rust,
| it is a massive learning curve compared to the kind of
| appeal that C has for a novice. I'd probably recommend Go
| or Java over both C and rust if you're into imperative
| code and want to do low level work. For functional
| programming I'd recommend Erlang (if only because of the
| very long term view of the people that build it) or
| Clojure, though the latter seems to be in decline.
| uecker wrote:
| I think the C standard should provide some good
| libraries, e.g. a string library. But in any case the
| problem with 100000+ bespoke implementations in C is not
| fixed by designing new programming languages and also
| adding them to the mix. Entropy is a bitch.
| lelanthran wrote:
| > A strategy that helps me [...]
|
| In another comment recently I opined that C projects,
| initiated in 2025, are likely to be much more secure than
| the same project written in Python/PHP (etc).
|
| This is because the only people _choosing_ C in 2025 are
| those who have been using it already for decades, have
| internalised the handful of footguns via actual
| experience and have a set of strategies for minimising
| those footguns, all shaped with decades of experience
| working around that tiny handful of footguns.[1]
|
| Sadly, _this_ project has rendered my opinion wrong -
| it's a project initiated in 2025, in C, that was obviously
| done by an LLM, and thus is filled with footguns and
| over-engineering.
|
| ============
|
| [1] I also have a set of strategies for dealing with the
| footguns; I would guess if we sat down together and
| compared notes our strategies would have more in common
| than they would differ.
| uecker wrote:
| If you want something fool-proof where a statistical code
| generator will not generate issues, then C is certainly
| not a good choice. But also for other languages this will
| cause issues. I think for vibe-coding a network server
| you might want something sand-boxed with all security
| boundaries outside, in which case it does not really
| matter anymore.
| OneLessThing wrote:
| This is exactly my problem with LLM C code, lack of
| confidence. On the other hand, when my projects get big
| enough to the point where I cannot keep the code base
| generally loaded into my brain's cache they eventually get to
| the point where my confidence comes from extensive testing
| regardless. So maybe it's not such a bad approach.
|
| I do think that LLM C code, if made in concert with great
| testing tooling, has great promise.
| jacquesm wrote:
| That generalizes to anything LLM related.
| lelanthran wrote:
| > It needed an extensive fuzzing to get the minimal amount of
| confidence, which caught at least two serious problems---
| seriously, it's much better than most C programmers, but
| still.
|
| How are you doing your fuzzing? You need either valgrind (or
| compiler sanitiser flags) in the loop for a decent level of
| confidence.
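|
| For reference, the usual shape of that loop with clang's
| libFuzzer plus ASan/UBSan; parse_request() is just a
| placeholder for whatever entry point is under test:
|
|     /* build (roughly):
|        clang -g -O1 -fsanitize=fuzzer,address,undefined \
|              harness.c server_code.c */
|     #include <stddef.h>
|     #include <stdint.h>
|
|     void parse_request(const uint8_t *data, size_t size);
|
|     int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
|         /* feed untrusted bytes into the code under test; the
|            sanitizers turn silent memory errors into crashes */
|         parse_request(data, size);
|         return 0;
|     }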
| lifthrasiir wrote:
| The "minimal" amount of confidence, not a decent level of
| confidence. You are completely right that I need much more
| to establish anything higher than that.
| citbl wrote:
| The irony is also that AI could have been used to audit the
| code and find these issues. All the author had to do was
| ask.
| nurettin wrote:
| > should never see production use.
|
| I have an issue with strongly worded opinions like this. I wrote
| plenty of crappy delphi code while learning the language that
| saw production use and made a living from it.
|
| Sure, it wasn't the best experience for users, it took years to
| iron out all the bugs and there was plenty of frustration
| during the support phase (mostly null pointer exceptions and db
| locks in gui).
|
| But nobody would be better off now if that code never saw
| production use. A lot of business was built around it.
| zdragnar wrote:
| Buggy code that just crashes or produces incorrect results
| is a whole different category. In C a bug can compromise a
| server and your users. See the OpenSSL Heartbleed
| vulnerability as a prime example.
|
| Once upon a time, you could put up a relatively vulnerable
| server, and unless you got a ton of traffic, there weren't
| too many things that would attack it. Nowadays, pretty much
| anything Internet facing will get a constant stream of
| probes. Putting up a server requires a stricter mindset than
| it used to.
| jacquesm wrote:
| There are minimum standards for deployment to the open web. I
| think - and you're of course entirely free to have a
| different opinion - that those are not met with this code.
| nurettin wrote:
| Yes, I have lots of opinions!
|
| I guess the question in the spotlight is: At what point would
| your custom server's buffer overflow when reading a header
| matter and would that bug even exist at that point?
|
| Could a determined hacker get to your server without even
| knowing what weird software you cooked up and how to
| exploit your binary?
|
| We have a lot of success stories born from bad code. I mean
| look at Micro$oft.
|
| Look at all the big players like discord leaking user
| credentials. Why would you still call out the little fish?
|
| Maybe I should create a form for all these ahah.
| frumplestlatz wrote:
| > Could a determined hacker get to your server without
| even knowing what weird software you cooked up and how to
| exploit your binary?
|
| Yes.
| nurettin wrote:
| Yes but how? After the overflow they still have to know
| the address of the next call site and the server would be
| in a UB state.
| jacquesm wrote:
| The code is on github. Figure out a way to get a shell
| through that code and you're hosed if someone recognizes
| it in active use.
| nurettin wrote:
| I mean the hacker won't know what software is running on
| the server, unless the server announces itself which can
| be traced to the repo, but then, why?? Who cares about
| this guy's vps? This whole thread makes no sense to me
| and I seem to be the only one questioning it.
| lelanthran wrote:
| I can't completely blame the language here: anyone "coding" in a
| language new to them using an LLM is going to have real problems.
| OneLessThing wrote:
| It's funny the author says this was 90% written without AI, and
| that AI was mostly used for the json code. I think they're just
| new to C.
|
| Trust me I love C. Probably over 90% of my lifetime code has
| been written in C. But python newbies don't get their web
| frameworks stack smashed. That's kind of nice.
| lelanthran wrote:
| > But python newbies don't get their web frameworks stack
| smashed. That's kind of nice.
|
| Hah! True :-)
|
| The thing is, smashed stacks are _difficult_ to exploit
| deterministically or automatically. Even heartbleed, as
| widespread as it was, was not a guaranteed RCE.
|
| OTOH, an exploit in a language like Python is almost
| certainly going to be easier to exploit deterministically.
| Log4j, for example, was a _guaranteed_ exploit and the skill
| level required was basically _"Create a Java object"_.
|
| This is because of the ease with which even very junior
| programmers can create something that appears to run and work
| and not crash.
| alfiedotwtf wrote:
| > The thing is, smashed stacks are difficult to exploit
| deterministically or automatically. Even heartbleed, as
| widespread as it was, was not a guaranteed RCE.
|
| That's like driving without a seatbelt - it's not safe, but
| it would only matter on that very rare chance you have a
| crash. I would rather just wear a seatbelt!
| uyzstvqs wrote:
| It's a double-edged sword. LLMs are probably the best way to
| learn programming languages right now. But if you vibecode in a
| programming language that you don't understand, it's going to
| be a disaster sooner or later.
|
| This is also the reason why AI will not replace any actual jobs
| with merit.
| AdieuToLogic wrote:
| > LLMs are probably the best way to learn programming
| languages right now.
|
| Books still exist, be they in print or electronic form.
| zweifuss wrote:
| I would claim that:
|
| (interactive labs + quizzes) > Learning from books
|
| Good online documentation > 5yr old tome on bookshelf
|
| chat/search with ai > CTRL+F in a PDF manual
| skydhash wrote:
| Interactive labs can do a great job of teaching skills,
| but they fall short of teaching understanding. And at
| some point, it's faster to read a book to learn, because
| there's a reduced need for practice.
|
| Hypertext is better than printed book format, but if
| you're just starting with something you need a guide that
| provides a coherent overview. Also most online
| documentation is just bad.
|
| Why ctrl+f? You can still have a table of contents and an
| index with pdf. And the pdf format supports links. And I'd
| prefer filtering/querying over generation because the
| latter is always tainted by my prompt. If I type `man
| unknown_function`, I will get an error, not a generated
| manual page.
| estimator7292 wrote:
| Examples are the best documentation, and we now have a
| machine to produce infinite examples tailored specifically
| to any situation.
| nxobject wrote:
| Depending on the quality of the examples, of course.
| messe wrote:
| > Another interesting choice in this project is to make lengths
| signed:
|
| There are good reasons for this choice in C (and C++) due to
| broken integer promotion and casting rules.
|
| See: "Subscripts and sizes should be signed" (Bjarne Stroustrup)
| https://open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1428r0...
|
| As a nice bonus, it means that ubsan traps on overflow (unsigned
| overflows just wrap).
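|
| One classic example of the kind of trap being referred to
| (not taken from the paper; n is assumed to be the element
| count):
|
|     /* unsigned index: i >= 0 is always true, so this
|        "backwards" loop wraps around instead of stopping */
|     for (size_t i = n - 1; i >= 0; --i) { /* ... */ }
|
|     /* signed index: terminates as intended, and if an
|        overflow does slip in, ubsan can trap on it */
|     for (ptrdiff_t i = (ptrdiff_t)n - 1; i >= 0; --i) { /* ... */ }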
| uecker wrote:
| I do not agree that the integer promotion or casting (?) rules
| are broken in C. That some people make mistakes because they do
| not know them is a different problem.
|
| The reason you should make length signed is that you can use
| the sanitizer to find or mitigate overflow as you correctly
| observe, while unsigned wraparound leads to bugs which are
| basically impossible to find. But this has nothing to do with
| integer promotion, and wraparound can also create bugs in -
| say - Rust.
| OneLessThing wrote:
| It's interesting to hear these takes. I've never had problems
| catching unsigned wrap bugs with plain old memory sanitizers,
| though I must admit to not having a lot of experience with
| ubsan in particular. Maybe I should use it more.
| jacquesm wrote:
| I've had some fun reviewing some very old code I wrote
| (1980's) to see what it looked like to me after such a long
| time of gaining experience. It's not unlike what the OP did
| here, it reads cleanly but I can see many issues that
| escaped my attention at the time. I always compared C with
| a very fast car: you can take some corners on two wheels
| but if you make a habit of that you're going to end up in a
| wall somewhere. That opinion has not changed.
| uecker wrote:
| I think the correct comparison is a sharp knife. It is
| extremely useful and while there is a risk it is fully
| acceptable. The idea that we should all use plastic
| knives because there are often accidents with knives is
| wrong, and so is the idea that we should abandon C
| because of memory safety. I have followed computer
| security issues for several decades, and while I think we
| should
| have memory safety IMHO the push and arguments are
| completely overblown - and they are especially not worth
| the complexity and issues of Rust. I never was personally
| impacted by a security exploit caused by a memory safety
| issue or know anybody in my personal vicinity who was. I
| know many cases where people were affected by _other_ kinds of
| security issues. So I think those are what we should
| focus on first. And having timely security updates is a
| hell of a lot more important than memory safety, so I am not
| happy that Rust now makes this harder.
| jacquesm wrote:
| That's an interesting point you are making there. The
| most common exploits are of the human variety. Even so it
| is probably a good idea to minimize the chances of all
| kinds of exploits. One other problem - pet peeve of mine
| - is that instead of giving people _just_ security
| updates manufacturers will happily include a whole bunch
| of new and 'exciting' stuff in their updates that in
| turn will (1) introduce new security issues and (2)
| inevitably try to extract more money from the updaters.
| This is extremely counterproductive.
| simonask wrote:
| I'm sorry, but there is an incredible amount of hard data
| on this, including the number of CVEs directly
| attributable to memory safety bugs. This is publicly
| available information, and we as an industry should take
| it seriously.
|
| I don't mean to be disrespectful, but this cavalier
| attitude towards it reads like vaccine skepticism to me.
| It is not serious.
|
| Programming can be inconsequential, but it can also be
| national security. I know which engineers I would trust
| with the latter, and they aren't the kind who believe
| that discipline is "enough".
| jacquesm wrote:
| So what do you propose to do?
| simonask wrote:
| I propose that we start taking the appropriate amount of
| professional responsibility.
|
| That includes being honest about the actual costs of
| software when you don't YOLO the details. Zero UB is
| table stakes now - it didn't use to be, but we don't live
| in that world anymore.
|
| It's totally fine to use C or whatever language for it,
| but you are absolutely kidding yourself if you think the
| cost is less than at least an order of magnitude higher
| than the equivalent code written in Rust, C#, or any
| other language that helps you avoid these bugs. Rust even
| lets you get there at zero performance cost, so we're
| down to petty squabbles about syntax or culture - not
| serious.
| jacquesm wrote:
| > I propose that we start taking the appropriate amount
| of professional responsibility.
|
| I agree. For me that means: software _engineering_ should
| start taking the same attitude to writing software that
| structural engineers bring to the table when they talk
| about bridges, buildings and other structures that will
| have people 's lives depending on them. I'm not sure how
| we're going to make rings out of bits but we need to
| realize - continuously - that the price of failure is
| often paid in blood, or in the best case with financial
| loss and usually not by us. And in turn we should be
| enabled to impose that same ethic on management, because
| more often than not that's the root cause of the problem.
|
| > That includes being honest about the actual costs of
| software when you don't YOLO the details.
|
| Does that include development cost?
|
| Maintenance costs?
|
| Or just secondary costs?
|
| Why the focus on costs?
|
| > Zero UB is table stakes now - it didn't use to be, but
| we don't live in that world anymore.
|
| This is because 'Rust and C# exist'? Or is it because
| Java, Erlang, Visual Basic, Lisp etc exist?
|
| > It's totally fine to use C or whatever language for it,
| but you are absolutely kidding yourself if you think the
| cost is less than at least an order of magnitude higher
| than the equivalent code written in Rust, C#, or any
| other language that helps you avoid these bugs.
|
| We were talking about responsibility first, and that goes
| well beyond just measuring 'cost'. The mistake in
| bringing cost into it is that cost is a business concept
| that is used to justify picking a particular technology
| over another. And just like security is an expense that
| doesn't show anything on the balance sheet if it works
| besides that it cost money the same goes for picking a
| programming language eco-system.
|
| So I think focusing on cost is a mistake. That just
| allows the bookkeeper to make the call and that call will
| often be the wrong one.
|
| > Rust even lets you get there at zero performance cost,
| so we're down to petty squabbles about syntax or culture
| - not serious.
|
| The debate goes a lot further than that. You have
| millions of people that are writing software every day
| that are not familiar with Rust. To get them to pick a
| managed language over what they are used to is going to
| take a lot of convincing.
|
| It starts off with ethics, and I don't think it should
| start off with picking a favorite language. You educate,
| show by example and you deliver at or below the same cost
| that those other eco-systems do and then you slowly eat
| the world because your projects are delivered on-time,
| with provably lower real world defects and hopefully at a
| lower cost.
|
| And then I really couldn't care what language was picked,
| in the rust world that translates into 'anything but C'
| because that is perceived to be the enemy somehow, which
| is strange because there are many alternatives to rust
| that are perfectly suitable, have much higher mind share
| already.
|
| C is - even today - at 10x the popularity that rust is,
| it will take a massive amount of resources to switch
| those people over, and likely it will take more than one
| generation. In the meantime all of the C code in the
| world will have to be maintained, which means there is
| massive job security for people learning C. For people
| learning rust to the exclusion of learning C that
| situation is far worse. This needs to be solved.
|
| These are not 'petty squabbles' about syntax or culture.
| They are the harsh reality of the software development
| world at large, which has seen massive projects deployed
| at scale developed with those really bad languages full
| of undefined behavior (well, that's at least one thing
| that Assembly Language has going for it, as long as the
| CPU does what it says in the book undefined behavior
| doesn't exist). People are going to point at that and say
| 'good enough'. And they see all those memory overflows,
| CVEs etc as a given, and they realize that in spite of
| all of those the main vector for security issues is
| people, and configuration mistakes not so much the
| software itself.
|
| This is not ideal, obviously, but C, like any bad habit,
| is very hard to dislodge if your main argument is 'you
| should drop this tool because mine is better'. Then you
| need to _show_ that your tool is better, so much better
| that it negates the cost to switch. And that 's a very
| tall order, for any programming language, much more so
| for one that is struggling for adoption in the first
| place.
| simonask wrote:
| Cost is a useful metric because it reflects a number of
| relevant things: Time to develop, effort to maintain -
| yes, but also people turnover, required expertise levels,
| satisfaction, and so on. Whether or not you like it, you
| have to care about cost if you want to make rational
| decisions. I'm not talking about assigning a
| Euro/Dollar/Yuan value to each hour spent on a project,
| but you need a rough idea about the size of the time and
| energy investment you are making when starting a project.
|
| > This is because 'Rust and C# exist'? Or is it because
| Java, Erlang, Visual Basic, Lisp etc exist?
|
| Things have changed for three important reasons: (1)
| C/C++ compilers have evolved, and UB is significantly
| more catastrophic than it was in the 90s and early 00s.
| (2) As societies digitize, the stakes are higher than
| ever - leaking personal data has huge legal and moral
| consequences, and system outages can have business-
| killing financial consequences. (3) There are actual,
| viable alternatives - GC is no longer a requirement for
| memory safety.
|
| > To get them to pick a managed language over what they
| are used to is going to take a lot of convincing.
|
| Perhaps you didn't mean to say so, but Rust is not a
| managed language (that's a .NET term referring to C#, F#,
| etc.).
|
| Me and other Rust users are obviously trying to convince
| even more people to use the language, and that's because
| we are having a great time over here. It's a very
| pleasant language with a pleasant community and a high
| level of technical expertise, and it allows me to get
| significantly closer to living up to my own ideals. I'm
| not making a moral argument here, trying to say that you
| or anyone is a bad person for not using Rust, but I am
| making a moral argument saying that _denying_ the huge
| cost and risk associated with developing software in C
| and C++ is bullshit.
|
| > And then I really couldn't care what language was
| picked, in the rust world that translates into 'anything
| but C' because that is perceived to be the enemy somehow,
| which is strange because there are many alternatives to
| rust that are perfectly suitable, have much higher mind
| share already.
|
| The point here is that, until Rust came along, you had
| the choice between wildly risky (but fast) C and C++
| code, or completely safe (but slow) garbage collected
| languages with heavy runtimes and significant deployment
| challenges.
|
| C is certainly not "the enemy" - I never said that, and I
| wouldn't. But that old world is gone. The excuse of
| picking risky, problem-riddled languages that we _know_
| are associated with extreme costs for reasons of
| performance no longer has any technical merit. There can
| be other reasons, but this isn't it.
|
| > C is - even today - at 10x the popularity that rust is,
| it will take a massive amount of resources to switch
| those people over [...]
|
| It's insane to me that anyone would limit themselves to a
| single language. Every competent programmer I know knows
| at least a handful. Why are we worried about this? I'm a
| decent C programmer, and a very good C++ programmer -
| better at both because I'm also fairly good at Rust.
|
| > And they see all those memory overflows, CVEs etc as a
| given, and they realize that in spite of all of those the
| main vector for security issues is people, and
| configuration mistakes not so much the software itself.
|
| "Pobody's nerfect." I'm sorry, I really dislike this
| attitude. We can't let the fact that security is hard, or
| that perfection is unattainable, be an excuse to deliver
| more crap.
|
| > This is not ideal, obviously, but C, like any bad
| habit, is very hard to dislodge if your main argument is
| 'you should drop this tool because mine is better'
|
| Again, that's not my argument. My argument is that you
| should be honest about what the actual costs are, or
| alternatively what the actual quality is.
| jacquesm wrote:
| > Cost is a useful metric because it reflects a number of
| relevant things: Time to develop, effort to maintain -
| yes, but also people turnover, required expertise levels,
| satisfaction, and so on. Whether or not you like it, you
| have to care about cost if you want to make rational
| decisions. I'm not talking about assigning a
| Euro/Dollar/Yuan value to each hour spent on a project,
| but you need a rough idea about the size of the time and
| energy investment you are making when starting a project.
|
| You are missing the cost to switch and that's a massive
| one and the one that I think most parties are using to
| decide whether or not to stick with what they know or to
| try something that is new to them. If you have a team of
| 50 embedded C++ developers and a deadline 'let's use
| rust' is a gamble very few managers will make.
|
| > Things have changed for three important reasons: (1)
| C/C++ compilers have evolved, and UB is significantly
| more catastrophic than it was in the 90s and early 00s.
|
| That depends on what industry you are looking at. For
| instance, in aviation the cost of undefined behavior,
| crashing software or wrong calculations was always that
| high. The difference is that in that industry (and a
| handful of others) there is enough budget to do it right
| resulting in far fewer in production issues than what we
| have come to accept in the 'always online, auto-update'
| world. That whole attitude is as much or more to blame
| for this than any particular language.
|
| > (2) As societies digitize, the stakes are higher than
| even - leaking personal data has huge legal and moral
| consequences, and system outages can have business-
| killing financial consequences.
|
| Show me the names of the businesses that have died
| because of data leaks or UB. See, the problem is that for
| those businesses it usually is just a speedbump. They
| don't care and no matter what the size of the breach the
| consequences are usually minor.
|
| The employee sticking a USB drive found on the street
| into their laptop causing a cryptolocker incident is a
| much more concrete problem.
|
| > (3) There are actual, viable alternatives - GC is no
| longer a requirement for memory safety.
|
| GC is a convenience, and if you're going to switch
| languages you might as well pick one that is more
| convenient. Java for instance is suitable now for 90% or
| so of the use cases where C or C++ would be your only
| option 15 years ago.
|
| > Perhaps you didn't mean to say so, but Rust is not a
| managed language (that's a .NET term referring to C#, F#,
| etc.).
|
| I know, but Java, Lisp and so on _are_ managed languages,
| and they offer both safety _and_ convenience. Rust only
| offers safety, other than that it is only marginally more
| convenient than C and some would argue less so.
|
| > Me and other Rust users are obviously trying to
| convince even more people to use the language, and that's
| because we are having a great time over here.
|
| Show, don't tell.
|
| > It's a very pleasant language with a pleasant community
| and a high level of technical expertise, and it allows me
| to get significantly closer to living up to my own
| ideals.
|
| Yes, but those are _your_ ideals, which don't
| necessarily overlap with mine. I don't particularly care
| about one programming language or another, I've learned
| enough of them by now to know that _all_ of them have
| their limitations, their warts, their good bits and their
| bad bits. I also know that the size of the eco-system is
| a large factor in whether or not I'll be able to get
| through the day in a productive way.
|
| > I'm not making a moral argument here, trying to say
| that you or anyone is a bad person for not using Rust,
| but I am making a moral argument saying that denying the
| huge cost and risk associated with developing software in
| C and C++ is bullshit.
|
| See, your use of the word 'bullshit' triggers me in a way
| that you probably do not intend, but it is exactly that
| attitude that turns me off the language that you would
| like me to switch to. I don't particularly see that huge
| cost and risk as applied to myself because I'm not
| currently writing code that is going to be part of some
| network service. If I see an embedded shop doing their
| work in Rust then I'm happy because I can ignore at least
| one small aspect of the source of bugs in such software.
| But there are plenty remaining and Rust - no matter what
| you think - is not a silver bullet for all of the things
| that can go wrong with low level software. There are
| other, better alternatives for most of those
| applications, I'd be more inclined to use Java or Erlang
| if those are available, and Go if they are not. The speed
| at which I can develop software is a massive factor in
| that whole 'cost' evaluation for me.
|
| > The point here is that, until Rust came along, you had
| the choice between wildly risky (but fast) C and C++
| code, or completely safe (but slow) garbage collected
| languages with heavy runtimes and significant deployment
| challenges.
|
| That just isn't true. There are more languages besides
| Rust that allow for low level and fast work. Go for
| instance is an excellent contender. And for long running
| processes Java is excellent, it is approaching C levels
| of throughput and excels at networked services.
|
| > C is certainly not "the enemy" - I never said that, and
| I wouldn't. But that old world is gone.
|
| Sorry, but this is not a realistic stance. That old world
| is not gone, and it is likely here to stay for many more
| decades. There is so much inertia here in terms of
| invested capital that you can't just make declarations
| like these and expect to be taken seriously.
|
| > The excuse of picking risky, problem-riddled languages
| that we know are associated with extreme costs for
| reasons of performance no longer has any technical merit.
| There can be other reasons, but this isn't it.
|
| Do you realize that this is just your opinion and not a
| statement of fact?
|
| > It's insane to me that anyone would limit themselves to
| a single language.
|
| 'Insane' is another very loaded word. Is this really the
| kind of language you want to be using while advocating
| for Rust? There are many programmers that learn one
| eco-system well enough to carve out a career for themselves,
| and I'm not going to be the one to judge them for that.
| I'm not one of them, but I can see how it happens and I
| would definitely not label everybody that's not a
| polyglot as not entirely right in the head.
|
| > Every competent programmer I know knows at least a
| handful.
|
| I know some _very_ competent programmers that only know
| one. But they know that one better than I know any of the
| ones that I'm familiar with. For instance, I know a guy
| that decided early on that if nobody wants to work on
| COBOL projects that that is exactly what he's going to
| do: become a world class expert in COBOL to help maintain
| all that old stuff. At a price. He's making very good
| money with that, far more than he'd have ever made by
| going with something more popular. I know plenty of Java
| only programmers and a couple that have decided that
| python is all they need. That's _their right_ and it
| isn't up to me to look down on them or call them incompetent
| because they can do something that I apparently can't:
| focus, and get really good at one thing.
|
| > Why are we worried about this? I'm a decent C
| programmer, and a very good C++ programmer - better at
| both because I'm also fairly good at Rust.
|
| I would not label myself as 'very good' in any language,
| I always hope to get better and in spite of doing this
| for 4+ decades I have never felt that I was 'good
| enough'.
|
| > "P[sic]obody's nerfect." I'm sorry, I really dislike
| this attitude.
|
| Again, why the antagonism? We have many different classes
| of issues, and depending on the context some of them may
| not be a problem at all. I've built stuff in _JavaScript_
| because it was the most suitable for the job. But I stay
| the hell away from node and anything associated with it
| because I don 't consider myself qualified to audit all
| of the code that could be pulled in through a dependency.
| And that's a good chunk of this: just know your
| limitations, and realize that not just 'nobody's perfect'
| but also that _you yourself_ are not perfect and more
| than likely to mess up when you go into territory that is
| unfamiliar to you.
|
| > We can't let the fact that security is hard, or that
| perfection is unattainable, be an excuse to deliver more
| crap.
|
| Ok. So now you are labeling what other people produce as
| 'crap'. This isn't helping.
|
| > Again, that's not my argument. My argument is that you
| should be honest about what the actual costs, or
| alternatively the actual quality.
|
| So I'm not honest. If you are wondering what I meant when
| I wrote earlier that it is the attitude of some of the
| Rust advocates that turns me off then here in this thread
| you have a very nice example of that. All of this
| pontification and emotionally laden language serves
| nobody, least of all Rust.
|
| If you want to win people over try the following:
|
| - refrain from insulting your target audience
|
| - respect the fact that your opinions are just that
|
| - understand that there may be factors outside of your
| view that are part of the decision making process
|
| - understand that you may not have a complete
| understanding of the problem domain or the restrictions
| involved (is a variation on the previous one)
|
| - try to not use emotional language to make your point
|
| - showing beats telling any day of the week
| simonask wrote:
| I don't have time to respond to all of this, but let me
| just say that you seem to be under the impression that it
| is somehow my responsibility to "win you over" and
| convince you to use Rust. I have stated very clearly that
| that's not my point. My point is that we should all stop
| lying about the actual cost of delivering reliable
| software written in C or C++, and in particular that we
| as an industry _need_ to stop downplaying the
| consequences of things like UB.
|
| Are _you personally_ doing any of those things? I don 't
| know, and I don't think I have accused you of that.
|
| I'm not here to woo you by sweet-talking you into using
| a different programming language. All this "show don't
| tell" - what are you talking about? Do you need real-
| world examples of successful Rust projects? There's a
| myriad of impressive ones, but you are fully capable of
| googling that.
|
| I'm not a representative of Rust the language (how could
| I be), and I reserve the right to call out moral
| corruption as I see it. I frankly do not need any "well-
| meaning" advice about how best to advocate for Rust -
| that's not my job.
| jacquesm wrote:
| > I'm not a representative of Rust the language (how
| could I be), and I reserve the right to call out moral
| corruption as I see it. I frankly do not need any "well-
| meaning" advice about how best to advocate for Rust -
| that's not my job.
|
| Whether you realize it or not, you are an advocate and
| you are doing a very, very poor job of it.
| pjmlp wrote:
| > The point here is that, until Rust came along, you had
| the choice between wildly risky (but fast) C and C++
| code, or completely safe (but slow) garbage collected
| languages with heavy runtimes and significant deployment
| challenges.
|
| Not really, I have been mostly coding in managed
| languages for the last couple of decades, and this has
| not really been true for quite some time.
|
| Yes if we go down language benchmark games, they won't
| win every little micro benchmark, however for like 99% of
| commercial use cases, what they deliver is fast enough
| for project requirements in execution time, and hardware
| resources.
|
| Now where they fail is in human perception and urban
| myths about where they are suitable to be adopted.
|
| Languages like Rust overcome this, with their type system
| approach to resource management, the naysayers have run
| out of excuses.
| simonask wrote:
| I think you are pointing out that garbage collected
| languages can be very fast, right? I agree about that,
| but it does fundamentally come with some very big
| caveats.
|
| There's a huge number of use cases that are perfectly
| served by GC languages, even where performance matters,
| but there's also a huge number that benefit from the
| extra boost and significantly lower memory usage of a
| compiled language.
| pjmlp wrote:
| There are plenty of compiled languages with GC, value
| types and low level programming capabilities, including
| playing with pointers C style.
|
| D, C#, Nim, Swift, Go for mainstream examples.
|
| If we dive into less successful attempts from the past,
|
| Cedar, Modula-2+, Modula-3, Oberon, Oberon-2, Active
| Oberon, Component Pascal, Oberon-07, Spec#, System C#
| among plenty others that are probably listed on ACM
| SIGPLAN list of papers.
|
| As for some commercial examples,
|
| https://www.withsecure.com/en/solutions/innovative-
| security-...
|
| https://dlang.org/blog/2018/12/04/interview-liran-zvibel-
| of-...
|
| https://www.wildernesslabs.co/
|
| https://www.astrobe.com/boards.htm
| pjmlp wrote:
| Thankfully the new cybersecurity laws will help here:
| when companies map production costs to languages, the
| needle will keep moving away from those that tank
| security budgets.
| jacquesm wrote:
| I was actually hoping for far more strict enforcement but
| so far they're taking it relatively easy.
| pjmlp wrote:
| Indeed, however better slowly than nothing at all.
| goalieca wrote:
| CVEs are important but there's also a lot of theatre
| there. How many are known exploitable? Most aren't if you
| follow threat intel. Most of the Internet infrastructure
| is running c/c++ and is very safe.
| simonask wrote:
| It's fine to have a sober view of the severity, but we
| can hopefully agree in general that writing any program
| in C or C++ that faces the internet requires _extreme_
| caution.
| goalieca wrote:
| I think anything that faces the internet needs extreme
| caution. I've done enough pentesting myself to see that
| mistakes abound and most of them are logic problems.
| pjmlp wrote:
| Except any good chef or butcher knows that they should be
| wearing protective gloves when using sharp knives.
|
| > Cut-resistant gloves are an essential piece of safety
| equipment in any kitchen.
|
| https://www.restaurantware.com/blogs/smallwares/how-to-
| choos...
|
| Where are C's gloves?
| uecker wrote:
| GCC's sanitizer does not catch unsigned wraparound. But the
| bigger problem is that a lot of code is written where it
| assumes that unsigned wraps around and this is ok. So if
| you were to use a sanitizer you would get a lot of false
| positives.
| For signed overflow, one can always consider this a bug in
| portable C.
|
| Of course, if you consistently treat unsigned wraparound as
| a bug in your code, you can also use a sanitizer to screen
| for it. But in general I find it more practical to use
| signed integers for everything except for modular
| arithmetic where I use unsigned (and where wraparound is
| then expected and not a bug)
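|
| A minimal illustration of the difference, assuming gcc or
| clang with -fsanitize=signed-integer-overflow:
|
|     #include <limits.h>
|
|     int main(void) {
|         unsigned u = UINT_MAX;
|         int      s = INT_MAX;
|
|         u += 1;  /* defined: silently wraps to 0 */
|         s += 1;  /* undefined: the sanitizer reports or traps */
|         return (int)u + s;
|     }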
| messe wrote:
| I meant implicit casting, but I guess that really falls under
| promotion in most cases where it's relevant here (I'm on a
| train from Aarhus to Copenhagen right now to catch a flight,
| and I've slept considerably less than usual, so apologies if
| I'm making some slight mistakes).
|
| The issues really arise when you mix signed/unsigned
| arithmetic and end up promoting everything to signed
| unexpectedly. That's usually "okay", as long as you're not
| doing arithmetic on anything smaller than an int.
|
| As an aside, if you like C enough to have opinions on
| promotion rules then you might enjoy the programming language
| Zig. It's around the same level as C, but with much nicer
| ergonomics, and overflow traps by default in
| Debug/ReleaseSafe optimization modes. If you want explicit
| two's complement overflow it has +%, *% and -% variants of
| the usual arithmetic operations, as well as saturating +|,
| *|, -| variants that clamp to [minInt(T), maxInt(T)].
|
| EDIT to the aside: it's also true if you hate C enough to
| have opinions on promotion rules.
| jacquesm wrote:
| Yes, this is one of the more subtle pitfalls of C. What
| helps is that in most contexts the value of 2 billion is
| large enough that a wraparound would be noticed almost
| immediately. But when it isn't then it can lead to very
| subtle errors that can propagate for a long time before
| anything noticeably goes off the rails.
| uecker wrote:
| I prefer C to Zig. IMHO all the successor languages throw
| out the baby with the bathwater and add unnecessary
| complexity. Zig is much better than Rust, but, still, I
| would never use it for a serious project.
|
| The "promoting unexpectedly" is something I do not think
| happens if you know C well. At least, I can't remember ever
| having a bug because of this. In most cases the promotion
| prevents you from having a bug, because you do not get
| unexpected overflow or wraparound because your type is too
| small.
|
| Mixing signed and unsigned is problematic, but I see issues
| mostly in code from people who think they need to use
| unsigned when they shouldn't because they heard signed
| integers are dangerous. Recently I saw somebody "upgrading"
| a C code basis to C++ and also changing all loop variables
| to size_t. This caused a bug which he blamed on working on
| the "legacy C code" he is working on, although the original
| code was just fine. In general, there are compiler warnings
| that should catch issues with sign for conversions.
| lelanthran wrote:
| > Recently I saw somebody "upgrading" a C code basis to
| C++ and also changing all loop variables to size_t. This
| caused a bug which he blamed on working on the "legacy C
| code" he is working on, although the original code was
| just fine.
|
| I had the same experience about 10 years back when a
| colleague "upgraded" code from using size_t to `int`; on
| that platform (ATMEGA or XMEGA, not too sure now) `int`
| was too small, overflowed and bad stuff happened in the
| field.
|
| The only takeaway is "don't needlessly change the size
| and sign of existing integer variables".
| uecker wrote:
| I don't think this is the only takeaway. My point is that
| you can reliably identify signed integer overflow using
| sanitizers and you can also reliably mitigate related
| attacks by trapping for signed integer overflow (it still
| may be a DoS, but you can stop more serious harm). Neither
| works with unsigned types, except in a tightly
| controlled project where you treat unsigned wraparound as
| a bug, but this fails the moment you introduce other
| idiomatic C code that does not follow this.
| Sukera wrote:
| Could you expand on how these wraparound bugs happen in Rust?
| As far as I know, integer overflow panics (i.e. aborts) your
| code when compiled in debug mode, which I think is often used
| for testing.
| 01HNNWZ0MV43FF wrote:
| > That some people make mistakes because they do not know
| them is a different problem.
|
| We can argue til we're blue in the face that people should
| just not make any mistakes, but history is against us -
| People will always make mistakes.
|
| That's why surgeons are supposed to follow checklists and
| count their sponges in and out
| bringbart wrote:
| >while unsigned wraparound leads to bugs which are basically
| impossible to find.
|
| What?
|
| unsigned sizes are way easier to check, you just need one
| invariant:
|
| if(x < capacity) // good to go
|
| Always works, regardless of how x is calculated, and you never
| have to worry about undefined behavior when computing x. And
| the same invariant is used for forward and backward loops -
| some people bring up i >= 0 as a problem with unsigned, but
| that's because you should use i < n for backward loops as
| well - The One True Invariant.
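|
| A minimal sketch of that single invariant in both directions
| (buf, n and use() are placeholders):
|
|     #include <stddef.h>
|
|     void use(int x);    /* placeholder for whatever the loop does */
|
|     void walk(const int *buf, size_t n) {
|         /* forward */
|         for (size_t i = 0; i < n; i++)
|             use(buf[i]);
|         /* backward, same i < n test: after buf[0] the decrement
|            wraps to SIZE_MAX, which fails i < n and ends the loop;
|            if n == 0, n - 1 wraps too and the loop never runs */
|         for (size_t i = n - 1; i < n; i--)
|             use(buf[i]);
|     }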
| user____name wrote:
| I just put assertions to check the ranges of all sizes and
| indices upon function entry, which doubles as documentation,
| and I mostly don't have to worry about signedness as a result.
| kstenerud wrote:
| Yup, unsigned math is just nasty.
|
| Actually, unchecked math on an integer is going to be bad
| regardless of whether it's signed or unsigned. The difference
| is that with signed integers, your sanity check is simple and
| always the same and requires no thought for edge cases:
| `if(index < 0 || index > max)`. Plus ubsan, as mentioned above.
|
| My policy is: Always use signed, unless you have a specific
| reason to use unsigned (such as memory addresses).
| bringbart wrote:
| unsigned is easier: 'if(index >= max)' and has fewer edge
| cases because you don't need to worry about undefined
| behavior when computing index.
| lelanthran wrote:
| > The difference is that with signed integers, your sanity
| check is simple and always the same and requires no thought
| for edge cases: `if(index < 0 || index > max)`
|
| Wait, what? How is that easier than `if (index > max)`?
| kstenerud wrote:
| Because if max is a calculated value, it could silently
| wrap around and leave index to cause a buffer overflow.
|
| Or if index is counting down, a calculated index could
| silently wrap around and cause the same issue.
|
| And if both are calculated and wrap around, you'll have fun
| debugging spooky action at a distance!
|
| If both are signed, that won't happen. You probably do have
| a bug if max or index is calculated to a negative value,
| but it's likely not an exploitable one.
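|
| A sketch of the calculated-max case, with made-up names: if
| hdr_len exceeds total_len, the unsigned subtraction wraps to a
| huge value and almost any index passes the check.
|
|     #include <stddef.h>
|
|     size_t body_max(size_t total_len, size_t hdr_len) {
|         return total_len - hdr_len;  /* wraps if hdr_len > total_len */
|     }
|
|     int index_ok(size_t index, size_t max) {
|         return index < max;          /* now true for nearly any index */
|     }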
| 1718627440 wrote:
| I have no clue what cases you have in mind, can you give
| some examples? Surely when you have index as unsigned the
| maximum would be represented unsigned as well?
| accelbred wrote:
| If using C23, _BitInt allows for integer types without
| promotion.
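|
| A small sketch (needs a C23 compiler, e.g. recent gcc or
| clang): with unsigned char the operands would be promoted to
| int and the sum would be 300, but unsigned _BitInt(8) is not
| promoted, so the arithmetic stays 8-bit and wraps to 44.
|
|     #include <stdio.h>
|
|     int main(void) {
|         unsigned _BitInt(8) a = 200uwb;
|         unsigned _BitInt(8) b = 100uwb;
|         unsigned _BitInt(8) sum = a + b;   /* 300 mod 256 == 44 */
|         printf("%u\n", (unsigned) sum);
|         return 0;
|     }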
| bluetomcat wrote:
| Good C code will try to avoid allocations as much as possible in
| the first place. You absolutely don't need to copy strings around
| when handling a request. You can read data from the socket into
| a fixed-size buffer, do all the processing in place, and then
| process the next chunk in place too. You get predictable
| performance and the thing will work like precise clockwork.
| Reading the entire thing just to copy the body of the request to
| another location makes no sense. Most of the "nice" javaesque
| XXXParser, XXXBuilder, XXXManager abstractions seen in "easier"
| languages make little sense in C. They obfuscate what really
| needs to happen in memory to solve a problem efficiently.
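|
| A minimal sketch of that shape (handle_chunk() is a made-up
| callback that would do the in-place parsing):
|
|     #include <unistd.h>
|
|     enum { BUF_SIZE = 8192 };
|
|     void handle_chunk(char *data, ssize_t len);  /* hypothetical */
|
|     void serve(int fd) {
|         char buf[BUF_SIZE];  /* one fixed buffer, no heap allocation */
|         ssize_t n;
|         while ((n = read(fd, buf, sizeof buf)) > 0)
|             handle_chunk(buf, n);   /* parse/process in place */
|     }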
| 01HNNWZ0MV43FF wrote:
| Can you do parsing of JSON and XML without allocating?
| bluetomcat wrote:
| Yes, you can do it with minimal allocations - provided that
| the source buffer is read-only, or is mutable but not used
| directly by the caller afterwards. If the buffer is mutable, any
| un-escaping can be done in-place because the un-escaped
| string will always be shorter. All the substrings you want
| are already in the source buffer. You just need a growable
| array of pointer/length pairs to know where tokens start.
| gritzko wrote:
| Yep, no problem. In-place parsing only requires a stack, whose
| depth is the maximum JSON nesting allowed. I have a C dialect
| exactly like that.
| veqq wrote:
| Of course. You can do it in a single pass/just parse the
| token stream. There are various implementations like:
| https://zserge.com/jsmn/
| andrepd wrote:
| It requires manual allocation of an array of tokens. So it
| needs a backing "stack vector" of sorts.
|
| And what about escapes?
| Ygg2 wrote:
| Theoretically yes. Practically there is character escaping.
|
| That kills any non-allocation dreams. The moment you have "Hi
| \uxxxx isn't the UTF nice?" you will probably have to
| allocate. If source is read-only you have to allocate. If
| source is mutable you have to waste CPU to rewrite the
| string.
| lelanthran wrote:
| > Moment you have "Hi \uxxxx isn't the UTF nice?" you will
| probably have to allocate.
|
| Depends on what you are doing with it. If you aren't
| displaying it (and typically you are not in a server
| application), you don't _need_ to unescape it.
| mpyne wrote:
| And this is indeed something that the C++ Glaze library
| supports, to allow for parsing into a string_view
| pointing into the original input buffer.
| deaddodo wrote:
| I'm confused why this would be a problem. UTF-8 and UTF-16
| (the only two common Unicode encodings) are a maximum of 4
| bytes wide (and, most commonly, 2 in English text). The
| ASCII escape you gave is 6 bytes wide. I don't know of many
| ASCII escape representations that take fewer bytes than
| their native Unicode encoding.
|
| Same goes for other characters such as \n, \0, \t, \r, etc.
| All are half the size in their native byte representation.
| topspin wrote:
| > Practically there is character escaping
|
| The voice of experience appears. Upvoted.
|
| It is conceivable to deal with escaping in-place, and thus
| remain zero-alloc. It's hideous to think about, but I'll
| bet someone has done it. Dreams are powerful things.
| _3u10 wrote:
| It's just two pointers: the current place to write and the
| current place to read. Escapes are always more characters
| than they represent, so there's no danger of overwriting the
| read pointer. If you support compression this can become
| somewhat of an issue, but you simply support a max block
| size, which is usually defined by the compression algorithm
| anyway.
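|
| A minimal sketch of the two-pointer idea (only single-character
| escapes like \" and \\, no \uXXXX handling):
|
|     #include <stddef.h>
|
|     size_t unescape_in_place(char *s, size_t len) {
|         size_t r = 0, w = 0;         /* read and write positions */
|         while (r < len) {
|             if (s[r] == '\\' && r + 1 < len)
|                 r++;                 /* skip the backslash */
|             s[w++] = s[r++];         /* copy the character */
|         }
|         return w;                    /* new, never longer, length */
|     }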
| Ygg2 wrote:
| If you have a place to write, then it's not zero
| allocation. You did an allocation.
|
| And usually if you want maximum performance, buffered
| read is the way to go, which means you need a write slab
| allocation.
| lelanthran wrote:
| > If you have a place to write, then it's not zero
| allocation. You did an allocation.
|
| Where did that allocation happen? You can write into the
| buffer you're reading from, because the replacement data
| is shorter than the original data.
| lelanthran wrote:
| > Can you do parsing of JSON and XML without allocating?
|
| If the source JSON/XML is in a writeable buffer, with some
| helper functions you can do it. I've done it for a few small-
| memory systems.
| zzo38computer wrote:
| It depends what you intend to do with the parsed data, and
| where the input comes from and where the output will be going
| to. There are situations where allocations can be reduced or
| avoided, but not in all of them. (In some cases, you do
| not need full parsing, e.g. to split an array, you can check
| if it is a string or not and the nesting level, and then find
| the commas outside of any arrays other than the first one, to
| be split.) (If the input is in memory, then you can also
| consider if you can modify that memory for parsing, which is
| sometimes suitable but sometimes not.)
|
| However, for many applications, it will be better to use a
| binary format (or in some cases, a different text format)
| rather than JSON or XML.
|
| (For the PostScript binary format, there is no escaping, and
| the structure does not need to be parsed and converted ahead
| of time; items in an array are consecutive and fixed size,
| and data it references (strings and other arrays) is given by
| an offset, so you can avoid most of the parsing. However,
| note that key/value lists in the PostScript binary format are
| nonstandard (even though PostScript does have that type, it
| does not have a standard representation in the binary object
| format), and that PostScript has a better string type than
| JavaScript but a worse numeric type than JavaScript.)
| megous wrote:
| Yes, you can first validate the buffer, to know it contains
| valid JSON, and then you can work with pointers to the
| beginnings of individual syntactic parts of the JSON, and
| have functions that determine what type the current element
| is, or move to the next element, etc. Even string work
| (comparisons with
| other escaped or unescaped strings, etc.) can be done on
| escaped strings directly without unescaping them to a buffer
| first.
|
| Ergonomically, it's pretty much the same as parsing the JSON
| into some AST first, and then working on the AST. And it can
| be much faster than dumb parsers that use malloc for
| individual AST elements.
|
| You can even do JSON path queries on top of this, without
| allocations.
|
| Eg. https://xff.cz/git/megatools/tree/lib/sjson.c
| acidx wrote:
| Yes! The JSON library I wrote for the Zephyr RTOS does this.
| Say, for instance, you have the following struct:
|
|     struct SomeStruct {
|         char *some_string;
|         int   some_number;
|     };
|
| You would need to declare a descriptor, linking each field to
| how it's spelled in the JSON (e.g. the some_string member
| could be "some-string" in the JSON), the byte offset from the
| beginning of the struct where the field is (using the
| offsetof() macro), and the type.
|
| The parser is then able to go through the JSON, and
| initialize the struct directly, as if you had reflection in
| the language. It'll validate the types as well. All this
| without having to allocate a node type, perform copies, or
| things like that.
|
| This approach has its limitations, but it's pretty efficient
| -- and safe!
|
| Someone wrote a nice blog post about (and even a video) it a
| while back: https://blog.golioth.io/how-to-parse-json-data-
| in-zephyr/
|
| The opposite is true, too -- you can use the same descriptor
| to serialize a struct back to JSON.
|
| I've been maintaining it outside Zephyr for a while, although
| with different constraints (I'm not using it for an embedded
| system where memory is golden): https://github.com/lpereira/l
| wan/blob/master/src/samples/tec...
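|
| Roughly, the descriptor idea looks like this (a hypothetical
| sketch, not the actual Zephyr API):
|
|     #include <stddef.h>
|
|     enum field_type { FIELD_STRING, FIELD_INT };
|
|     struct field_descr {
|         const char     *json_key;  /* spelling in the JSON document */
|         size_t          offset;    /* offsetof() into the struct */
|         enum field_type type;
|     };
|
|     struct some_struct {
|         char *some_string;
|         int   some_number;
|     };
|
|     static const struct field_descr descrs[] = {
|         { "some-string", offsetof(struct some_struct, some_string),
|           FIELD_STRING },
|         { "some-number", offsetof(struct some_struct, some_number),
|           FIELD_INT },
|     };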
| lock1 wrote:
| Why does "good" C have to be zero alloc? Why should "nice"
| javaesque make little sense in C? Why do you implicitly assume
| performance is "efficient problem solving"?
|
| Not sure why many people seem fixated on the idea that using a
| programming language must follow a particular approach. You can
| do minimal alloc Java, you can simulate OOP-like in C, etc.
|
| Unconventional, but why do we need to restrict certain
| optimizations (space/time perf, "readability", conciseness,
| etc) to only a particular language?
| bluetomcat wrote:
| Because in C, every allocation incurs a responsibility to
| track its lifetime and to know who will eventually free it.
| Copying and moving buffers is also prone to overflows, off-
| by-one errors, etc. The generic memory allocator is a smart
| but unpredictable, complex beast that lives in your address
| space: it can mess with your CPU cache, introduce undesired
| memory fragmentation, etc.
|
| In Java, you don't care because the GC cleans up after you and
| you don't usually care about millisecond-grade performance.
| jstimpfle wrote:
| No. Look up Arenas. In general group allocations to avoid
| making a mess.
| rictic wrote:
| If you send a task off to a work queue in another thread,
| and then do some local processing on it, you can't
| usually use a single Arena, unless the work queue itself
| is short lived.
| jenadine wrote:
| I don't see how arenas solve the problems.
| jstimpfle wrote:
| You group things from the same context together, so you
| can free everything in a single call.
| estimator7292 wrote:
| No. Arenas are not a general case solution. Look it up
| lelanthran wrote:
| > Why does "good" C have to be zero alloc?
|
| GP didn't say "zero-alloc", but "minimal alloc"
|
| > Why should "nice" javaesque make little sense in C?
|
| There's little to no indirection in idiomatic C compared with
| idiomatic Java.
|
| Of course, in both languages you can write unidiomatically,
| but that is a great way to ensure that bugs get in and never
| get out.
| lock1 wrote:
| > Of course, in both languages you can write
| unidiomatically, but that is a great way to ensure that
| bugs get in and never get out.
|
| Why does "unidiomatic" have to imply "buggy" code? You're
| basically saying an unidiomatic approach is doomed to
| introduce bugs and will never reduce them.
|
| It sounds weird. If I write Python code with minimal side
| effects like in Haskell, wouldn't it at least reduce the
| possibility of side-effect bugs even though it wasn't
| "Pythonic"?
|
| AFAIK, nothing in the language standard mentions anything
| about "idiomatic" or "this is the only correct way to use
| X". The definition of "idiomatic X" is not as clear-cut and
| well-defined as you might think.
|
| I agree there's a risk with an unidiomatic approach.
| Irresponsibly applying "cool new things" is a good way to
| destroy "readability" while gaining almost nothing.
|
| Anyway, my point is that there's no single definition of
| "good" that covers everything, and "idiomatic" is just
| whatever convention a particular community is used to.
|
| There's nothing wrong with applying an "unidiomatic"
| mindset like awareness of stack/heap alloc, CPU cache
| lines, SIMD, static/dynamic dispatch, etc in languages like
| Java, Python, or whatever.
|
| There's nothing wrong either with borrowing ideas like
| (Haskell) functor, hierarchical namespaces, visibility
| modifiers, borrow checking, dynamic dispatch, etc in C.
|
| Whether it's "good" or not is left as an exercise for the
| reader.
| lelanthran wrote:
| > Why does "unidiomatic" have to imply "buggy" code?
|
| Because when you stray from idioms you're going off down
| unfamiliar paths. All languages have better support for
| specific idioms. Trying to pound a square peg into a
| round hole _can_ work, but is unlikely to work well.
|
| > You're basically saying an unidiomatic approach is
| doomed to introduce bugs and will never reduce them.
|
| Well, yes. Who's going to reduce them? Where are you
| planning to find people who are used to code written in
| an unusual manner?
|
| By definition alone, code is written for humans to read.
| If you're writing it in a way that's difficult for humans
| to read, then _of course_ the bug level can only go up
| and not down.
|
| > It sounds weird. If I write Python code with minimal
| side effects like in Haskell, wouldn't it at least reduce
| the possibility of side-effect bugs even though it wasn't
| "Pythonic"?
|
| "Pythonic" does not mean the same thing as "Idiomatic
| code in Python".
| codr7 wrote:
| In C, direct memory control is the top feature, which means
| you can assume anyone who uses your code is going to want to
| control memory throughout the process. This means not
| allocating from wherever and returning blobs of memory, which
| in turn means designing different APIs - part of the reason
| why learning C well takes so long.
|
| I started writing sort of a style guide to C a while ago,
| which attempts to transfer ideas like this one more by
| example:
|
| https://github.com/codr7/hacktical-c
| jabits wrote:
| Thanks for sharing this work.
| nxobject wrote:
| Echoing my sibling comment - thanks for sharing this.
| cogman10 wrote:
| > Why should "nice" javaesque make little sense in C?
|
| Very importantly, because Java is tracking the memory.
|
| In java, you could create an item, send it into a queue to be
| processed concurrently, but then also deal with that item
| where you created it. That creates a huge problem in C
| because the question becomes "who frees that item"?
|
| In java, you don't care. The freeing is done automatically
| when nobody references the item.
|
| In C, it's a big headache. The concurrent consumer can't free
| the memory because the producer might not be done with it.
| And the producer can't free the memory because the consumer
| might not have run yet. In idiomatic Java, you just have to
| make sure your queue is safe to use concurrently. The right
| thing to do in C would be to restructure things to ensure the
| item isn't used after it's handed off to the queue, or to
| send a copy of the item into the queue so the question of
| "who frees this" is straightforward. You can do both
| approaches in Java, but why would you? If the item is
| immutable there's no harm in simply sharing the reference
| with 100 things and moving forward.
|
| In C++ and Rust, you'd likely wrap that item in some sort of
| atomic reference counted structure.
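|
| A sketch of that last option in plain C11, with an atomic
| reference count so whichever side finishes last frees the
| item:
|
|     #include <stdatomic.h>
|     #include <stdlib.h>
|
|     struct item {
|         atomic_int refs;
|         /* payload ... */
|     };
|
|     struct item *item_new(void) {
|         struct item *it = calloc(1, sizeof *it);
|         if (it)
|             atomic_init(&it->refs, 1);
|         return it;
|     }
|
|     void item_retain(struct item *it) {
|         atomic_fetch_add(&it->refs, 1);
|     }
|
|     void item_release(struct item *it) {
|         if (atomic_fetch_sub(&it->refs, 1) == 1)  /* last owner */
|             free(it);
|     }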
| estimator7292 wrote:
| Good C has minimal allocations because you, the human, are
| the memory allocator. It's up to your own meat brain to
| correctly track memory allocation and deallocation. Over the
| last half-century, C programmers have converged on some best
| practices to manage this more effectively. We statically
| allocate, kick allocations up the call chain as far as
| possible. Anything to get that bit of tracked state out of
| your head.
|
| But we use different approaches for different languages
| because those languages _are designed for that approach_. You
| _can_ do OOP in C and you _can_ do manual memory management
| in C#. Most people don't because it's unnecessarily
| difficult to use languages in a way they aren't designed for.
| Plus when you re-invent a wheel like "classes" you _will_
| inevitably introduce a bug you wouldn't have if you'd used a
| language with proper support for that construct. You _can_
| use a hammer to pull out a screw, but you'd do a much better
| job if you used a screwdriver instead.
|
| Programming languages are not all created equal and are
| _absolutely not_ interchangeable. A language is much, much
| more than the text and grammar. The entire reason we have
| different languages is because we needed different ways to
| express certain classes of problems and constructs that go
| way beyond textual representation.
|
| For example, in a strictly typed OOP language like C#,
| classes are hideously complex under the hood. Miles and miles
| of code to handle vtables, inheritance, polymorphism,
| virtual, abstract functions and fields. To implement this in
| C would require effort _far_ beyond what any single
| programmer can produce in a reasonable time. Similarly, I'm
| sure one _could_ force JavaScript to use a very strict typing
| and generics system like C#, but again the effort would be
| enormous and guaranteed to have many bugs.
|
| We use different languages in different ways because they're
| different and work differently. You're asking why everyone
| twists their screwdrivers into screws instead of using the
| back end to pound a nail. Different tools, different uses.
| lelanthran wrote:
| > Good C code will try to avoid allocations as much as possible
| in the first place.
|
| I've upvoted you, but I'm not so sure I agree though.
|
| Sure, each allocation imposes a new obligation to track that
| allocation, but on the flip side, passing around already-
| allocated blocks imposes a new burden for each call to ensure
| that the callees have the correct permissions (modify it,
| reallocate it, free it, etc).
|
| If you're doing any sort of concurrency this can be hard to
| track - sometimes it's easier to simply allocate a new block
| and _give_ it to the callee, and then the caller can forget all
| about it (callee then has the obligation to free it).
| 1718627440 wrote:
| To reduce the amount of allocation, instead of:
|
|     struct parsed_data *parsed_data = parse (...);
|     struct process_data *process_data = process (..., parsed_data);
|     struct foo_data *foo_data = do_foo (..., process_data);
|
| you can do:
|
|     parse (...) {
|         ...
|         process (...);
|         ...
|     }
|
|     process (...) {
|         ...
|         do_foo (...);
|         ...
|     }
|
| It sounds like violating separation of concerns at first, but
| it has the benefit that you can easily do processing and
| parsing in parallel, and all the data can become read-only.
| Also I was impressed when I looked at a call graph of this,
| since this essentially becomes the documentation of the whole
| program.
| ambicapter wrote:
| How testable is this, though?
| 1718627440 wrote:
| It might be a problem when you can't afford side-effects
| that you later throw away, but I haven't experienced that
| yet. The functions still have return codes, so you can still
| test whether correct input results in no error path being
| taken and whether incorrect input results in an error path
| being triggered.
| throwawaymaths wrote:
| is there _any_ system where the basics of http (everything up
| to framework handoff of structured data) are done outside of
| a single concurrency unit?
| obviouslynotme wrote:
| The most important pattern to learn in C is to allocate a
| giant arena upfront and reuse it over and over in a loop.
| Ideally, there is only one allocation and deallocation in the
| entire program. As with all things multi-threaded, this
| becomes trickier. Luckily, web servers are embarrassingly
| parallel, so you can just have an arena for each worker
| thread. Unluckily, web servers do a large amount of string
| processing, so you have to be careful in how you build them
| to prevent the memory requirements from exploding. As always,
| tradeoffs can and will be made depending on what you are
| actually doing.
|
| Short-run programs are even easier. You just never deallocate
| and then exit(0).
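|
| A minimal bump-allocator sketch of that per-worker arena idea:
| one malloc up front, arena_alloc() hands out slices, and
| arena_reset() reclaims everything between requests (error
| handling kept minimal):
|
|     #include <stdlib.h>
|
|     struct arena {
|         char  *base;
|         size_t cap;
|         size_t used;
|     };
|
|     static int arena_init(struct arena *a, size_t cap) {
|         a->base = malloc(cap);
|         a->cap  = cap;
|         a->used = 0;
|         return a->base != NULL;
|     }
|
|     static void *arena_alloc(struct arena *a, size_t n) {
|         n = (n + 15) & ~(size_t)15;     /* keep results aligned */
|         if (a->cap - a->used < n)
|             return NULL;                /* arena exhausted */
|         void *p = a->base + a->used;
|         a->used += n;
|         return p;
|     }
|
|     /* "free" everything from the last request in O(1) */
|     static void arena_reset(struct arena *a)   { a->used = 0; }
|     static void arena_destroy(struct arena *a) { free(a->base); }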
| adrianN wrote:
| Arenas are a nice tool, but they don't work for all use
| cases. In the limit you're reimplementing malloc on top of
| your big chunk of memory.
| galangalalgol wrote:
| Most games have to do this for performance reasons at
| some point and there are plenty of variants to choose
| from. Rust has libraries for some of them, but in C rolling
| it yourself is the idiom. One I used in C++ that worked well
| as a retrofit was to overload new to grab the smallest chunk
| that would fit the allocation from banks
| of them. Profiling under load let the sizes of the banks
| be tuned for efficiency. Nothing had to know it wasn't a
| real heap allocation, but it was way faster and with zero
| possibility of memory fragmentation.
| lifthrasiir wrote:
| Most pre-2010 games had to. As a former gamedev from after
| that period I can confidently say that it is a relic of the
| past in most cases now. (Not that I don't care, but I don't
| have to be _that_ strict about allocations.)
| card_zero wrote:
| Because why?
| lifthrasiir wrote:
| Probably because hardware became powerful enough that you
| can make a performant game without thinking much about
| allocations.
| user____name wrote:
| Virtual memory gets rid of a lot of fragmentation issues.
| galangalalgol wrote:
| Yeah. Fragmentation was a niche concern of that embedded
| use case. It had an MMU, it just wasn't used by the RTOS. I
| am surprised that allocations aren't a major hitter
| anymore. I still have to minimize/eliminate them in Linux
| signal processing code to stay realtime.
| juped wrote:
| The normal practical version of this advice that isn't a
| "guy who just read about arenas" post is that you
| generally kick allocations outward; the caller allocates.
| lelanthran wrote:
| They don't work for all use-cases, but they most
| certainly work for this use-case (HTTP server).
| bheadmaster wrote:
| > Ideally, there is only one allocation and deallocation in
| the entire program.
|
| Doesn't this technically happen with most of the modern
| allocators? They do a lot of work to avoid having to
| request new memory from the kernel as much as possible.
| Asmod4n wrote:
| Last time I checked, the glibc allocator doesn't ask the
| OS that often for new heap memory.
|
| Like, every ~thousand malloc calls invoked (s)brk and
| that was it.
| lelanthran wrote:
| I agree, which is why I wrote an arena allocator library I
| use (somewhere on github, probably public and free).
| card_zero wrote:
| > there is only one allocation and deallocation in the
| entire program.
|
| > Short-run programs are even easier. You just never
| deallocate and then exit(0).
|
| What's special about "short-run"? If you deallocate only
| once, presumably just before you exit, then why do it at
| all?
| free_bip wrote:
| Just because there's only one deallocation doesn't mean
| it's run only once. It would likely be run once every
| time the thread it belongs to is deallocated, like when
| it's finished processing a request.
| fulafel wrote:
| This shared memory and pointer shuffling is of course fraught
| with the need for correct logic to avoid memory safety bugs. Good
| C code doesn't get you pwned, I'd argue.
| jenadine wrote:
| > Good C code doesn't get you pwned, I'd argue.
|
| This is not a serious argument because you don't really
| define good C code and how easy or practical it is to do. The
| sentence works for every language. "Good <whatever language>
| code doesn't get you pwned"
|
| But the question is whether "average" or "normal" C code gets
| you pwned, and the answer is yes, as the article shows.
| fulafel wrote:
| The comment I was responding to suggested Good C Code
| employs optimizations that, I opined, are more error-prone
| wrt memory safety - so I was not attempting to define it,
| but challenging the offered characterisation.
| riedel wrote:
| A long time ago I was involved in building compilers. It was
| common that we solved this problem with obstacks, which are
| basically stacked heaps. I wonder whether one could build more
| things like this, where freeing is a bit more best-effort but
| you have some checkpoints. (I guess one would rather need tree-
| like stacks.) You just have to disallow pointers going the
| wrong way. Allocation remains ugly in C and I think explicit
| data structures are definitely a better way of handling it.
| self_awareness wrote:
| That mythical "Good C Code", which is known only to some people
| who I never met.
| pjmlp wrote:
| These abstractions were already common in enterprise C code
| decades before Java came to be, thanks to stuff like Yourdon
| Structured Method.
|
| Using fixed-size buffers doesn't fix out-of-bounds errors, or
| the stack corruption caused by such bugs.
|
| Naturally we all know good C programmers never make them. /s
| jqpabc123 wrote:
| Reads like an indictment of vibe coding.
|
| LLMs are fundamentally probabilistic --- not deterministic.
|
| This basically means that anything produced this way is highly
| suspect. And this framework is an example.
| erichocean wrote:
| Give Fil-C a try, the speed hit is pretty minimal and you get
| full memory safety.
|
| https://fil-c.org/
| Karrot_Kream wrote:
| Wow this is really cool, I'd never seen this before. Thanks!
| adhamsalama wrote:
| Why isn't this used more?
| dang wrote:
| Recent and related:
|
| _Show HN: I built a web framework in C_ -
| https://news.ycombinator.com/item?id=45526890 - Oct 2025 (208
| comments)
| yipikaya wrote:
| As an aside, it's amusing that it took 25 years for C coders to
| embrace the C99 named struct designator feature:
|     HttpParser parser = {
|         .isValid = true,
|         .requestBuffer = strdup(request),
|         .requestLength = strlen(request),
|         .position = 0,
|     };
|
| All the kids are doing it now!
| 1718627440 wrote:
| This is nice for constant data, but strdup can return NULL
| here, which is again never checked.
|
| > it took 25 years for C coders to embrace the C99 named struct
| designator feature
|
| Not sure if this is actually true, but this is kind of the point
| of C: 20-year-old code or compilers are supposed to work just
| fine, so you just wait some time for things to settle. For fast
| and shiny, there is JavaScript.
| davemp wrote:
| I'm still regularly getting on projects and moving C89 variable
| declarations from the start of functions to where they're
| initialized, but I guess it's not the kids doing it.
| mkfs wrote:
| > C89 variable declarations from the start of functions
|
| Technically it's the start of a block.
| davemp wrote:
| Technically but I don't think folks ever really bothered.
| 1718627440 wrote:
| I only declare variables at the beginning of a block, not
| because I would need C89 compatibility, but because I find it
| clearer to establish the working set of variables upfront. This
| doesn't restrict me in any way, because I just start a new
| block when I feel the need. I also try to keep the scope of
| a variable as small as possible.
| rurban wrote:
| It's only Microsoft's fault for not having implemented it in
| MSVC for decades. They stayed at C89 forever.
| flykespice wrote:
| I never understood why Microsoft lagged so far behind on
| adopting newer C standards. Did their compiler infrastructure
| make it difficult to adopt newer standards? Or did they simply
| not care?
| rurban wrote:
| They focused on C++ only. Management, not their devs.
| varjag wrote:
| Some of the most famous C codebases (e.g. the Linux kernel)
| have been using them for some time.
| ge96 wrote:
| Long as you allocate me, it's alright
| acidx wrote:
| One thing to note, too, is that `atoi()` should be avoided as
| much as possible. On error (parse error, overflow, etc), it has
| an unspecified return value (!), although most libcs will return
| 0, which can be just as bad in some scenarios.
|
| Also not mentioned is that atoi() can return a negative number
| -- which is then passed to malloc(), that takes a size_t, which
| is unsigned... which will make it become a very large number if a
| negative number is passed as its argument.
|
| It's better to use strtol(), but even that is a bit tricky to
| use: it doesn't touch errno when there's no error, yet you need
| to check errno to know whether things like overflow happened, so
| you have to set errno to 0 before calling the function. The man
| page explains how to use it properly.
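|
| A sketch of the strtol() dance described in the man page
| (parse_int is a made-up helper name):
|
|     #include <errno.h>
|     #include <limits.h>
|     #include <stdbool.h>
|     #include <stdlib.h>
|
|     static bool parse_int(const char *s, int *out) {
|         char *end;
|         errno = 0;
|         long v = strtol(s, &end, 10);
|         if (end == s || *end != '\0')
|             return false;          /* no digits, or trailing junk */
|         if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
|             return false;          /* out of range for an int */
|         *out = (int)v;
|         return true;
|     }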
|
| I think it would be a very interesting exercise for that web
| framework's author to run its HTTP request parser through a
| fuzz tester; clang comes with one that's quite good and easy to
| use (https://llvm.org/docs/LibFuzzer.html), especially if used
| alongside address sanitizer or the undefined behavior sanitizer.
| Errors like the one I mentioned will most likely be found by a
| fuzzer really quickly. :)
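|
| For reference, a libFuzzer harness is tiny (parse_http_request
| is a made-up entry point; build with clang
| -fsanitize=fuzzer,address):
|
|     #include <stddef.h>
|     #include <stdint.h>
|     #include <stdlib.h>
|     #include <string.h>
|
|     int parse_http_request(const char *buf, size_t len);  /* hypothetical */
|
|     int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
|         char *buf = malloc(size + 1);
|         if (!buf)
|             return 0;
|         memcpy(buf, data, size);
|         buf[size] = '\0';    /* NUL-terminate for C string APIs */
|         parse_http_request(buf, size);
|         free(buf);
|         return 0;
|     }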
| MathMonkeyMan wrote:
| Unspecified, really? cppreference's [C documentation][1] says
| that it returns zero. The [OpenGroup][2] documentation doesn't
| specify a return value when the conversion can't be performed.
| This recent [draft][3] of the ISO standard for C says that if
| the value cannot be represented (does that mean over/underflow,
| bad parse, both, neither?), then it's undefined behavior.
|
| So three references give three different answers.
|
| You could always use sscanf instead, which tells you how many
| values were scanned (e.g. zero or one).
|
| [1]: https://en.cppreference.com/w/c/string/byte/atoi.html
|
| [2]:
| https://pubs.opengroup.org/onlinepubs/9799919799/functions/a...
|
| [3]: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2310.pdf
| acidx wrote:
| The Linux man page (https://man7.org/linux/man-
| pages/man3/atoi.3.html#VERSIONS) says that POSIX.1 leaves it
| unspecified. As you found out, it's really something that
| should be avoided as much as possible, because pretty much
| everywhere disagrees how it should behave, especially if you
| value portability.
|
| sscanf() is not a good replacement either! It's better to use
| strtol() instead. Either do what Lwan does
| (https://github.com/lpereira/lwan/blob/master/src/lib/lwan-
| co...), or look (https://cvsweb.openbsd.org/src/lib/libc/stdl
| ib/strtonum.c?re...) at how OpenBSD implemented strtonum(3).
|
| For instance, if you try to parse a number that's preceded by
| a lot of spaces, sscanf() will take a long time going through
| it. I've been hit by that when fuzzing Lwan.
|
| Even cURL is avoiding sscanf():
| https://daniel.haxx.se/blog/2025/04/07/writing-c-for-curl/
| MathMonkeyMan wrote:
| If your use case can have C++, then [std::from_chars][1] is
| ideal. Here's gcc's [implementation][2]; a lot of it seems
| to be handling different bases.
|
| [1]:
| https://en.cppreference.com/w/cpp/utility/from_chars.html
|
| [2]: https://github.com/gcc-
| mirror/gcc/blob/461fa63908b5bb1a44f12...
| AdieuToLogic wrote:
| While the classic "Parse, don't validate"[0] paper uses Haskell
| instead of C as its illustrative programming language, the
| approach detailed is very much applicable in these scenarios.
|
| 0 - https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-
| va...
| lelanthran wrote:
| > While the classic "Parse, don't validate"[0] paper uses
| Haskell instead of C as its illustrative programming language,
| the approach detailed is very much applicable in these
| scenarios.
|
| Good thing someone (i.e. _me_ ) took the time to demonstrate
| PdV in C: https://www.lelanthran.com/chap13/content.html
| nxobject wrote:
| I appreciate that link - now I see the parallels between
| "consolidate allocation in C to the extent that the rest of
| your code doesn't have to worry", and "consolidate validation
| in C" to the extent that...".
| pizlonator wrote:
| Just compile it with Fil-C
| kazinator wrote:
| I definitely don't love C that does atoi on a Content-Length
| value that came from the network and passes that to malloc.
|
| Even before we get to how a malicious value would interact with
| malloc, there is this:
|
| > The functions atof, atoi, atol, and atoll are not required to
| affect the value of the integer expression errno on an error. If
| the value of the result cannot be represented, the behavior is
| undefined. [ISO C N3220 draft]
|
| That includes not only out-of-range values but also garbage that
| cannot be converted to a number at all. atoi("foo") can behave in any
| manner whatsoever and return anything.
|
| Those functions are okay to use on something that has been
| validated in a way that it cannot cause a problem. If you know
| you have a nonempty sequence of nothing but digits, possibly with
| a minus sign, and the number of digits is small enough that the
| value will fit into an int, you are okay.
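|
| A sketch of that kind of pre-validation (nine digits always fit
| in a 32-bit int):
|
|     #include <ctype.h>
|     #include <stdbool.h>
|     #include <string.h>
|
|     static bool is_safe_int_literal(const char *s) {
|         size_t i = (s[0] == '-') ? 1 : 0;
|         size_t digits = strlen(s) - i;
|         if (digits == 0 || digits > 9)
|             return false;
|         for (; s[i]; i++)
|             if (!isdigit((unsigned char)s[i]))
|                 return false;
|         return true;
|     }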
|
| > A malicious user can pass Content-Length of 4294967295
|
| But why would they, when it's fewer keystrokes to use -1, which
| will become 4294967295 with a 32-bit malloc, while scaling to
| 18446744073709551615 on 64-bit?
___________________________________________________________________
(page generated 2025-10-11 23:02 UTC)