[HN Gopher] Ask HN: A retrofitted C dialect?
       ___________________________________________________________________
        
       Ask HN: A retrofitted C dialect?
        
       Hi, I'm Anqur, a senior software engineer with a varied background
       in which development in C was often an important part of my work,
       e.g.:

       1) Games: a Chinese/Vietnamese game with C/C++ for the server and
          client, and Lua for scripting [1].
       2) Embedded systems: a switch/router with a network stack written
          entirely in C [2].
       3) (Networked) file systems: the Ceph FS client, which is a kernel
          module [3].

       (I left some unnecessary details in the links, but these are real
       projects I used to work on.)

       Recently, Rust and C in the kernel has been a hot topic, and a
       message [4] caught my attention, where it talks about the "Rust"
       experiment in kernel development:

       > I'd like to understand what the goal of this Rust "experiment"
       > is: If we want to fix existing issues with memory safety we need
       > to do that for existing code and find ways to retrofit it.

       So for many years I have kept thinking about a new C dialect for
       retrofitting the problems, but those of _C itself_.

       Sometimes big systems and software (e.g. OSes, browsers, databases)
       can be written entirely in different languages like C++, Rust, D,
       Zig, etc. But typically, as I hinted above, making a good
       filesystem client requires one to write kernel modules (i.e. to
       provide a VFS implementation; I do know FUSE, but I believe it's
       better if one can use VFS directly), so it's not always feasible to
       switch languages.

       And I still love C for its unique "bare-bones" experience:

       1) Just talk to the platform: almost all platforms speak C.
          Nothing like Rust's PAL (platform abstraction layer) is needed.
       2) Just talk to other languages: C is the lingua franca (except
          that Go needs no libc by default). Not to mention that if I want
          WebAssembly to talk to Rust, `extern "C"` is needed in the Rust
          code.
       3) Just a libc, widely available; I write my own data structures
          carefully. Since one is usually writing some critical component
          of a bigger system in C, it's okay that there are not many
          existing libraries to choose from.
       4) I don't need over-generalized generics; my use of generics is
          quite limited.

       So unlike a few `unsafe` in a safe Rust, I want something like a few
       "safe" in an ambient "unsafe" C dialect. But I'm not saying
       "unsafe" is good or bad, I'm saying "don't talk about unsafe vs
       safe": it's C itself, you wouldn't say anything is "safe" or
       "unsafe" in C.

       Actually I'm also an expert on implementing advanced type systems;
       some of my work includes:

       1) A row-polymorphic JavaScript dialect [5].
       2) A tiny theorem prover with Lean 4 syntax in less than 1K LOC [6].
       3) A Rust dialect with reuse analysis [7].

       Language features like generics, compile-time eval,
       traits/typeclasses, and bidirectional typechecking are trivial for
       me; I successfully implemented them in the projects above.

       For the retrofitted C, these features initially come to mind:

       1) Code generation directly to C: no LLVM IR, no machine code.
       2) Modules, like C++20 modules, to eliminate the use of headers.
       3) Compile-time eval and type-level computation, so that something
          like `malloc(int)` is actually a thing.
       4) Tactics-like metaprogramming to generate definitions, acting
          like type-safe macros.
       5) Quantitative types [8] to track the use of resources (pointers,
          FDs). The typechecker tells the user where to insert `free` in
          all possible positions; it doesn't do anything like RAII.
       6) Limited lifetime checking, though some people tell me lifetimes
          are not needed in such a language.

       Any further insights? Shall I kickstart such a project? Please, I
       need your ideas very much.

       [1]: https://vi.wikipedia.org/wiki/V%C3%B5_L%C3%A2m_Truy%E1%BB%81...
       [2]: https://e.huawei.com/en/products/optical-access/ma5800
       [3]: https://docs.ceph.com/en/reef/cephfs/
       [4]: https://lore.kernel.org/rust-for-linux/Z7SwcnUzjZYfuJ4-@infr...
       [5]: https://github.com/rowscript/rowscript
       [6]: https://github.com/anqurvanillapy/TinyLean
       [7]: https://github.com/SchrodingerZhu/reussir-lang
       [8]: https://bentnib.org/quantitative-type-theory.html
        
       Author : anqurvanillapy
       Score  : 42 points
       Date   : 2025-02-22 08:11 UTC (3 days ago)
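
       Feature 5 in the post (a typechecker that tells the user where to
       insert `free`) is easiest to picture against plain C. The sketch
       below is only an illustration under assumptions: `copy_prefix` is a
       hypothetical function, not something from the post. It shows the
       bookkeeping such a checker would have to verify, namely that every
       exit path either releases the allocation exactly once or hands
       ownership to the caller.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: duplicate the first n bytes of src into a
 * fresh NUL-terminated string.
 * Returns 0 on success, -1 on failure.
 * On success the caller owns *out and must free() it exactly once. */
int copy_prefix(const char *src, size_t n, char **out)
{
    assert(src != NULL && out != NULL);
    *out = NULL;

    if (n == SIZE_MAX)
        return -1;              /* n + 1 would overflow; nothing acquired */

    char *tmp = malloc(n + 1);  /* resource acquired here */
    if (tmp == NULL)
        return -1;              /* allocation failed; nothing to free */

    if (strlen(src) < n) {
        free(tmp);              /* early exit: release on this path ... */
        return -1;
    }

    memcpy(tmp, src, n);
    tmp[n] = '\0';
    *out = tmp;                 /* ... or transfer ownership to the caller */
    return 0;
}
```

       In today's C, forgetting the `free` on the early-exit path compiles
       without complaint. A resource-tracking type system would reject
       that version and point at the path where the release is missing.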
        
       | leecommamichael wrote:
       | We seem to have the same desire for a "cleaned up C." Could you
       | say more about how metaprogramming would work? I doubt you want
       | to put lifetimes into the type system to any degree. The reason C
       | compiles so much quicker than C++ is the lack of features. Every
       | feature must be crucial. Modules are crucial to preserving C.
        
         | anqurvanillapy wrote:
         | > We seem to have the same desire for a "cleaned up C."
         | 
         | That's so great! But it's sad that not enough ideas and
         | arguments have come up here. :'(
         | 
         | > How metaprogramming would work?
         | 
         | When it comes to "tactics" in Coq and Lean 4 (i.e. a DSL to
         | control the typechecker, e.g. to declare a new variable), there
         | are almost equivalent features like "elaborator reflection" in
         | Idris 1/2 [1] (e.g. create some AST nodes and let the
         | typechecker check whether they're okay), and most importantly,
         | in Scala 3 [2], you can use the `summonXXX` APIs to generate new
         | definitions for the compiler (e.g. automatically create an
         | instance of the JSON encoding trait if all fields of a record
         | type are given).
         | 
         | So the idea is like: Expose some typechecker APIs to the user,
         | with which one could create well-typed or ready-to-type AST
         | nodes during compile time.
         | 
         | [1]: https://docs.idris-lang.org/en/latest/elaboratorReflection/e...
         | 
         | [2]: https://docs.scala-lang.org/scala3/reference/contextual/deri...
         | 
         | > Lifetime and compilation speed.
         | 
         | Yes exactly, I was considering features from Featherweight Rust
         | [3]; some subset of it might be applicable. But yes, one should
         | be super careful about bringing in new features because of
         | compilation speed.
         | 
         | It's also worth mentioning that a C compiler itself does some
         | partial "compile-time eval", like constant folding, during
         | optimization. I know some techniques [4] to achieve this during
         | typechecking rather than in a separate pass, and things like
         | incremental compilation and related caching could bring
         | benefits here.
         | 
         | [3]: https://dl.acm.org/doi/10.1145/3443420
         | 
         | [4]: https://en.wikipedia.org/wiki/Normalisation_by_evaluation
         | 
         | > Every feature must be crucial.
         | 
         | I want to hear more of your ideas on designing such a language
         | too. And, out of curiosity, what's your related context and
         | background for it?
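
       The remark about C's partial "compile-time eval" can be made
       concrete. The following is a minimal sketch, assuming a C11
       compiler; the names and sizes are illustrative only. Integer
       constant expressions are evaluated during compilation and can be
       checked with `static_assert`, while an ordinary function call is
       never a constant expression, even if the optimizer later folds it.

```c
#include <assert.h>
#include <stddef.h>

/* Integer constant expressions: the compiler evaluates these itself,
 * so they can size arrays and feed static assertions. */
enum {
    PAGE_BYTES    = 4096,                      /* illustrative value */
    HEADER_BYTES  = 16,
    PAYLOAD_BYTES = PAGE_BYTES - HEADER_BYTES  /* computed at compile time */
};

struct page {
    unsigned char header[HEADER_BYTES];
    unsigned char payload[PAYLOAD_BYTES];
};

/* Checked during compilation: a violation is a compile error,
 * not a runtime failure. */
static_assert(sizeof(struct page) == PAGE_BYTES,
              "page layout must fill exactly one page");

/* A call such as pages_needed(10000) is NOT a constant expression in
 * C, so it cannot size a static array or appear in static_assert.
 * The optimizer may still fold it, but the language does not promise
 * that. */
size_t pages_needed(size_t bytes)
{
    return (bytes + PAYLOAD_BYTES - 1) / PAYLOAD_BYTES;
}
```

       The gap between the two halves of this sketch is roughly what a
       typechecker-level evaluator would close: user-defined functions
       usable wherever a constant is required.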
        
       | fithisux wrote:
       | A C3-to-C compiler could be a proposal.
        
         | anqurvanillapy wrote:
         | Ah, that should be good for source-level compatibility. But I'm
         | thinking about extending existing codebases that cross between
         | the kernel and user space, e.g. DPDK, SPDK, FUSE, kernel
         | modules, etc. I'm curious how C3 would be adopted in such
         | projects.
        
           | fithisux wrote:
           | Start small.
        
             | anqurvanillapy wrote:
             | And then? https://github.com/anqurvanillapy/TinyLean
        
               | fithisux wrote:
               | Very very interesting for me. I always wanted to do
               | something similar for Maude in Golang (Python is not a
               | bad choice).
               | 
               | Currently my focus is on data engineering, but I can use
               | it as an inspiration.
               | 
               | I was talking about a C3-to-C translator; that is what I
               | meant by "start small".
        
       | Rochus wrote:
       | There are approaches with at least partly the same goals as you
       | mentioned, e.g. Zig. Personally I have been working on my own C
       | replacement for some time which meets many of your points (see
       | https://github.com/micron-language/specification); but the syntax
       | is derived from my Oberon+ language, not from C (even though I have
       | used C and C++ for decades, I don't think it's a good syntax); it has
       | compile-time execution, inlines and generic modules (no need for
       | macros or a preprocessor); the current version is minimal, but
       | extensions like inheritance, type-bound procedures, Go-like
       | interfaces or the finally clause (for a simple RAII or "deferred"
       | replacement) are already prepared.
        
         | anqurvanillapy wrote:
         | > There are approaches e.g. Zig.
         | 
         | Yes! Zig has done a great job on a lot of C-related stuff, e.g.
         | they already made it possible to cross-compile C/C++
         | projects with the Zig toolchain years ago. But I'm still quite
         | stupidly obsessed with source-level compatibility with C, don't
         | know if it's good, but things like "Zig uses `0xAA` on
         | debugging undefined memory, not C's traditional `0xCC` byte"
         | make me feel Zig is not "bare-bone" enough to the C world.
         | 
         | > Micron and Oberon+ programming language.
         | 
         | They look absolutely cool to me! The syntax looks inspired from
         | Lua (`end` marker) and OCaml (`of` keyword), CMIIW. The
         | features are pretty nice too. I would look into the design of
         | generic modules and inheritance more, since I'm not sure what a
         | good extendability feature would look like for the C users.
         | 
         | Well BTW, I noticed you follow only one account on your GitHub
         | profile, and it's Haoran Xu. Any story here lol? He's just
         | such a genius, making a better LuaJIT, a baseline Python JIT and
         | a better Python interpreter all happen in real life.
        
           | Rochus wrote:
           | > _The syntax looks inspired from Lua (`end` marker) and
           | OCaml (`of` keyword), CMIIW_
           | 
           | Oberon+ and Micron are mostly derived from Wirth's Oberon and
           | Pascal lineage. Lua inherited many syntax features from
           | Modula-2 (yet another Wirth language), and also OCaml
           | (accidentally?) shares some keywords with Pascal. If you are
           | interested in even more Lua similarities, have a look at
           | https://github.com/rochus-keller/Luon, which I published
           | recently, but which compiles to LuaJIT and thus serves
           | different use-cases than C.
           | 
           | > _I would look into the design of generic modules_
           | 
           | I found generic modules to be a good compromise with
           | simplicity in mind; here is an article about some of the
           | motivations and findings:
           | https://oberon-lang.github.io/2021/07/17/considering-generic...
           | 
           | > _Haoran Xu, making a better LuaJIT_
           | 
           | You mean this project:
           | https://github.com/luajit-remake/luajit-remake? This is a very
           | interesting project, and it seems development is continuing
           | after a year-long break.
        
           | woodrowbarlow wrote:
           | > source-level compatibility with C
           | 
           | not sure if this is exactly what you meant, but in Zig you
           | can `@cImport` a C header and then "just" invoke the function.
           | no special FFI syntax or typecasting (except rich enums and
           | strings). it can produce compatible ASTs for C and Zig.
        
       | SleepyMyroslav wrote:
       | I think you don't need any rants but here it goes anyway.
       | 
       | Ditching headers does not solve anything, at least if your
       | language's targets include performance or my beloved example,
       | gamedev =) . You will have to consume headers until operating
       | systems stop using them. It is a people problem, not a
       | language problem.
       | 
       | Big elephants in the room I do not see in your list:
       | 
       | 1) "threading" was bolted onto languages like C and C++ without
       | much groundwork. Rust kinda has an idea there but it's really
       | alien to everything I saw in my entire 20+ year career with C++. I am
       | not going to try to explain it here to not get downvoted into
       | oblivion. Just want you to think that threading has to be natural
       | in any language targeting multicore hardware.
       | 
       | 2) "optimization" is not optional. Languages also will have to
       | deal with strict aliasing and UB catastrophes. Compilers became
       | the real AGI of the industry. There are no smart developers
       | outsmarting optimizing compilers anymore. You are either with the
       | big compilers on optimization or your language's performance is not
       | relevant. Providing even some ways to control optimization is
       | something sorely missed every time everything goes boom with a
       | minor compiler update.
       | 
       | 3) "hardware". If you need performance you need to go back to
       | hardware, not hide from it further behind abstract machines. C and
       | C++ lack real control of anything hardware has done since 1985.
       | Performant code really needs to be able to have memory pages and
       | cache lines and physical layout controls of machine code. Counter
       | arguments that these hardware things are per platform and
       | therefore outside of language are not really helping. Because
       | they need to be per platform and available in the language.
       | 
       | 4) "libc" is a problem. Most uses of it in newly written code
       | should be escalated straight to the bug reporting tool. I used
       | to think that C++ stl was going to age better but not anymore.
       | Assumptions baked into old APIs are just not there anymore.
       | 
       | I guess it does not sound helpful or positive for any new
       | language to deal with those things. I am pretty sure we can kick
       | all those cans down the road if our goal is to keep writing
       | software compatible with a PDP that somehow limps along in a web
       | browser (sorry, bad attempt at joking).
        
         | anqurvanillapy wrote:
         | Exactly the kind of thoughts and insights I need from more of
         | the users. Thank you for pointing out many concerns.
         | 
         | > Headers.
         | 
         | C++20 modules are still unstable and unused in the major
         | compilers, but they are a standard. And C is ironically perfect
         | for FFI; as I said, almost every programming language speaks C:
         | Rust's WebAssembly API is extern "C", there is JNI in Java, every
         | scripting language, and even Go, which itself talks to the OS
         | solely through the syscall ABI, only allows foreign-function
         | calls through Cgo. C has not been just an application/systems
         | language for some sad decades.
         | 
         | > Big elephants.
         | 
         | Since I was in the zoo watching tigers:
         | 
         | Mostly three groups of people are served under a language:
         | Application writers, library writers, compiler writers
         | (language itself).
         | 
         | I narrowed it down and started "small" to see if people writing
         | programs that cross kernel and user space would have more
         | thoughts about C, since it's the only choice there. That's also
         | my job: I made a distributed block device (an AWS EBS
         | replacement) using SPDK, a distributed filesystem (a Ceph FS
         | replacement) using FUSE, and a packet introspection module in a
         | router using DPDK. I know how it feels.
         | 
         | Then for the elephants you mentioned, I see them more fitted
         | into a more general library and application development, so
         | here we go:
         | 
         | > Threading.
         | 
         | Async Rust is painful, Send + Sync + Pin, long signatures of
         | trait bounds, no async runtimes are available in standard
         | libraries, endless battles in 3rd party runtimes.
         | 
         | I would prefer Go on such problems. Not saying goroutines and
         | channels are perfect (stackful is officially the only choice,
         | when goroutine stacks somehow become memory intensive, going
         | stackless is only possible with 3rd party event loops), but
         | builtin deadlock and race detection win much here. So it just
         | crashes on violation, loops on unknown deadlocks; I would
         | probably go in this direction.
         | 
         | > Optimization, hardware.
         | 
         | I don't quite understand why these concerns are "concerns" here.
         | 
         | It's the mindset of having more known safer parts in C, like a
         | disallow list, rather than under a strong set of rules, like in
         | Rust, an allowlist (mark `unsafe` to be nasty). Not making
         | everything reasonable, safe and generally smart, which is
         | surreal.
         | 
         | C is still, ironically again, the best language for beating
         | assembly on optimized performance, if you know these
         | stories:
         | 
         | - They recently sped up the CPython interpreter by 30% in
         | v3.14.
         | 
         | - The technique was already known 3 years ago, when it was
         | applied in LuaJIT-Remake: they remade a Lua interpreter that
         | beats the original handwritten assembly version, without inline
         | caching.
         | 
         | - Sub-techniques of it have existed for more than a decade, even
         | in Haskell's LLVM target, and theoretically they existed before
         | C was born.
         | 
         | It is essentially just an approach to matching what the real
         | abstract machine looks like underneath.
         | 
         | > libc.
         | 
         | Like I said, C is more than a language. One can swap in a new
         | allocator algorithm behind malloc/free; Rust stopped using
         | jemalloc by default and just uses malloc instead. Libc is
         | somewhat of a weird de facto interface.
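
       The comment above does not name the interpreter technique. Assuming
       it refers to tail-call based dispatch, where each opcode handler
       ends by calling the next handler in tail position, a minimal sketch
       in C might look like the following. The bytecode, the `struct vm`
       layout, and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy bytecode, for illustration only. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT, OP_COUNT };

struct vm {
    const uint8_t *pc;   /* next byte to decode */
    int64_t stack[16];
    size_t sp;           /* number of live stack slots */
};

typedef int64_t (*handler_fn)(struct vm *);

static int64_t op_push(struct vm *vm);
static int64_t op_add(struct vm *vm);
static int64_t op_mul(struct vm *vm);
static int64_t op_halt(struct vm *vm);

/* One handler per opcode, indexed by the opcode value. */
static const handler_fn handlers[OP_COUNT] = {
    [OP_PUSH] = op_push, [OP_ADD] = op_add,
    [OP_MUL]  = op_mul,  [OP_HALT] = op_halt,
};

/* Decode one opcode and jump to its handler. */
static int64_t dispatch(struct vm *vm)
{
    uint8_t op = *vm->pc++;
    assert(op < OP_COUNT);
    return handlers[op](vm);
}

static int64_t op_push(struct vm *vm)
{
    assert(vm->sp < 16);
    vm->stack[vm->sp++] = *vm->pc++;  /* one-byte immediate operand */
    return dispatch(vm);              /* call in tail position */
}

static int64_t op_add(struct vm *vm)
{
    assert(vm->sp >= 2);
    vm->sp--;
    vm->stack[vm->sp - 1] += vm->stack[vm->sp];
    return dispatch(vm);
}

static int64_t op_mul(struct vm *vm)
{
    assert(vm->sp >= 2);
    vm->sp--;
    vm->stack[vm->sp - 1] *= vm->stack[vm->sp];
    return dispatch(vm);
}

static int64_t op_halt(struct vm *vm)
{
    assert(vm->sp >= 1);
    return vm->stack[vm->sp - 1];     /* result is the top of the stack */
}

/* Run a program that ends with OP_HALT and return its result. */
int64_t run(const uint8_t *code)
{
    struct vm vm = { .pc = code, .sp = 0 };
    return dispatch(&vm);
}
```

       Standard C does not guarantee that these tail calls are turned into
       jumps, so this sketch may grow the stack by one frame per executed
       instruction. Production interpreters rely on a compiler guarantee
       for that, for example Clang's `musttail` attribute.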
        
           | SleepyMyroslav wrote:
           | I guess I need to illustrate my points a bit because I never
           | needed to poke kernels and my concerns are mostly from large
           | games. I am trying to imagine writing large games in your
           | language so please bear with me for a moment.
           | 
           | >Modules
           | 
           | Nobody plans to provide other interfaces to
           | oses/middlewares/large established libraries. Economy is just
           | not there.
           | 
           | >Threading
           | 
           | I was not talking about I/O at all. All of that you mention
           | will be miles better in any high level language because
           | waiting can be done in any language. Using threads for
           | computation intensive things is a niche for low level
           | languages. I would go further and say that copying stuff around
           | and mutexes also will be fine in high level languages.
           | 
           | >Optimization/Hardware
           | 
           | It is very important to me. I don't know how it is not relevant
           | to your plan of fixing a low level language. Here are a couple
           | of examples to try to shake things up.
           | 
           | The strlen implementation in glibc is not written in C. UB
           | just does not allow implementing the same algorithm, because
           | reading up to the end of the memory page is outside the
           | abstract machine. Also note how sanitizers are implemented to
           | avoid checking the strlen implementation.
           | 
           | Pointer provenance is both present in each major
           | compiler and impossible to define atm. You need to decide if
           | your language goes with the abstract machine or gcc or clang or
           | linux. None of them agree on it. A good attempt to add a
           | logical model of pointer provenance to the C standard did not
           | produce any results. If you want to read up on that, there
           | was an HN thread about it recently.
           | 
           | >libc
           | 
           | I am pretty sure I can't move you on that. Just consider
           | platforms that need to use new APIs for everything and have
           | horrendous 'never to be used' shims to be posix 'compatible'.
           | Like you can compile legacy things but running it does not
           | make sense. Games tend to run there just fine because games
           | used to write relevant low level code per platform anyway.
        
             | anqurvanillapy wrote:
             | > Imagine writing large games in your language.
             | 
             | You don't. Read the features I listed. One ends up with a C
             | alternative frontend (Cfront, if you love bad jokes)
             | including type system like Zig without any standard
             | library. No hash tables, no vectors. You tended to write
             | large games with this.
             | 
             | Like I said the main 3 groups of users, if you're concerned
             | about application writing, ask it. Rest of the comments
             | talked about possible directions of langdev.
             | 
             | > Modules.
             | 
             | You write C++ and don't know what a standard is. Motivating
             | examples, real world problems (full and incremental
             | compilation, better compilation cache instead of
             | precompiled headers), decades spent on discussions. Economy
             | would come for projects with modern C++ features.
             | 
             | > Threading.
             | 
             | If you know Rust and Go, talk about them more. Go creates
             | tasks and uses futexes, with bare-bone syscall ABI. Higher
             | level primitives are easy to use. Tools and runtime are
             | friendly to debugging.
             | 
             | I wrote Go components with channels running faster than
             | atomics with waits, in a distributed filesystem metadata
             | server.
             | 
             | On CPU intensiveness, I would talk about things like
             | automatic vectorization, smarter boxing/unboxing, smarter
             | memory layout (aka levity, e.g. AoS vs SoA). Not threading
             | niche.
             | 
             | > Strlen implementation and plan of low level programming.
             | 
             | Because I keep talking about designing a general purpose
             | language. One can also use LLVM IR to implement such
             | algorithms.
             | 
             | The design space here is to write these if necessary. Go
             | source code is full of assembly.
             | 
             | > Pointer provenance.
             | 
             | Search for Andras Kovacs' implementation of 2ltt at ICFP
             | 2024 (actually he finished it in 2022), and his dtt-rtcg;
             | you would realize how trivially these features could be
             | implemented "for a new language". I design new languages.
             | 
             | > libc.
             | 
             | Like I said, your happy new APIs invoke malloc.
        
               | SleepyMyroslav wrote:
               | Good luck with metaprogramming. It looks cool.
               | 
               | No worries, I got your message about the target audience
               | the first time. It's just that language development for me
               | is where I did some things. Langdev is an open ended
               | problem. I wish I could express games' needs without
               | wasting time on things games don't care about.
        
           | needlesslygrim wrote:
           | > Async Rust is painful
           | 
           | On the other hand, I've found normal threading in Rust quite
           | simple (generally using a thread pool).
        
         | PaulDavisThe1st wrote:
         | > Just want you to think that threading has to be natural in
         | any language targeting multicore hardware.
         | 
         | parallel execution and thus parallel programming will never be
         | natural to any human being. We don't do it, we can't think it
         | except by using various cognitive props (diagrams, objects) to
         | help us. You cannot make it natural no matter how strongly you
         | desire it.
         | 
         | Now, there is a different sort of "natural" which might mean
         | something more like "idiomatic to other language forms and
         | patterns", and that's certainly a goal that can be widely missed
         | or closely approximated.
        
       | pjc50 wrote:
       | > So unlike a few `unsafe` in a safe Rust, I want something like
       | a few "safe" in an ambient "unsafe" C dialect. But I'm not saying
       | "unsafe" is good or bad, I'm saying that "don't talk about unsafe
       | vs safe", it's C itself, you wouldn't say anything is "safe" or
       | "unsafe" in C.
       | 
       | Eh?
       | 
       | The critical criterion is "does your language make it difficult
       | to write accidental RCEs". There's huge resistance to changing
       | language _at all_ , as we can see from the kernel mailing lists,
       | so in order to go through the huge social pain of encouraging
       | people to use a different language it's got to offer real and
       | significant benefits.
       | 
       | Lifetimes are a solution to memory leaks and use-after-free.
       | Other solutions may exist.
       | 
       | Generics: Go tried to resist generics. It was a mistake. You need
       | to be able to do Container<T> somehow. Do you have an opinion on
       | the dotnet version of generics?
       | 
       | (You mention Ceph: every time I read about it I'm impressed, in
       | that it seems an excellent solution to distributed filesystems,
       | and yet I don't see it mentioned all that often. I'm glad it's
       | survived)
        
       | AlotOfReading wrote:
       | The problem with "safe pockets in ambient unsafety" is that C and
       | C++ intentionally disallow this model. It doesn't matter what you
       | do to enforce safety within the safe block, the definition of
       | Undefined Behavior means that code elsewhere in your program can
       | violate any guarantees you attempt to enforce. The only ways
       | around this are with a language that doesn't transpile to C and
       | doesn't have undefined behavior like Rust, or a compiler that
       | will translate C safely like zig attempts to do. Note that zig
       | still falls short here with unchecked illegal behavior and rustc
       | has struggled with assumptions about C's undefined behavior
       | propagating into LLVM's backend.
        
         | jjnoakes wrote:
         | Safe pockets in ambient unsafety does have benefits though. For
         | example, some code has a higher likelihood of containing
         | undefined behavior (code that manipulates pointers and offsets
         | directly, parsing code, code that deals with complex lifetimes
         | and interconnected graphs, etc), so converting just that code
         | to safe code would have a high ROI.
         | 
         | And once you get to the point where a large chunk of code is in
         | safe pockets, any bugs that smell of undefined behavior only
         | require you to look at the code outside of the safe pockets,
         | which hopefully decreases over time.
         | 
         | There are also studies that show that newly written code tends
         | to have more undefined behavior due to its age, so writing new
         | code in safe pockets has a lot of benefit there too.
        
       | mikexstudios wrote:
       | Kind of along these lines but for C++:
       | https://docs.carbon-lang.dev/
        
       | arnsholt wrote:
       | In 2014 John Regehr and colleagues suggested what he called
       | Friendly C[0], in an attempt to salvage C from UB. A bit more
       | than a year later, he concluded that the project wasn't really
       | feasible because people couldn't agree on the details of what
       | Friendly C should be.[1]
       | 
       | In the second post, there's an interesting comment towards the
       | end:
       | 
       | > Luckily there's an easy way forward, which is to skip the step
       | where we try to get consensus. Rather, an influential group such
       | as the Android team could create a friendly C dialect and use it
       | to build the C code (or at least the security-sensitive C code)
       | in their project. My guess is that if they did a good job
       | choosing the dialect, others would start to use it, and at some
       | point it becomes important enough that the broader compiler
       | community can start to help figure out how to better optimize
       | Friendly C without breaking its guarantees, and maybe eventually
       | the thing even gets standardized. There's precedent for
       | organizations providing friendly semantics; Microsoft, for
       | example, provides stronger-than-specified semantics for volatile
       | variables by default on platforms other than ARM.
       | 
       | I would argue that this has happened, but not quite in the way he
       | expected. Google (and others) _has_ chosen a way forward, but
       | rather than somehow fixing C they have chosen Rust. And from what
       | I see happening in the tech space, I think that trend is going to
       | continue: love it or hate it, the future is most likely going to
       | be Rust encroaching on C, with C increasingly being relegated to
       | the "legacy" status like COBOL and Fortran. In the words of
       | Ambassador Kosh: "The avalanche has already started. It is too
       | late for the pebbles to vote."
       | 
       | 0: https://blog.regehr.org/archives/1180
       | 1: https://blog.regehr.org/archives/1287
        
         | Macha wrote:
         | I think the problem with "friendly C", "safe C++" proposals is
         | they come from a place of "I want to continue using what I know
         | in C/C++ but get some of the safety benefits. I'm willing to
         | trade some of the safety benefits for familiarity". The problem
         | is the friendly C/safe C++ that people picture from that is on
         | a spectrum. On one end you have people that really just want to
         | keep writing C++98 or C99 and see this as basically a way to
         | keep the network effects of C/C++ by having other people write
         | C who otherwise wouldn't. The other extreme are people who are
         | willing to significantly rework their codebases to this hypothetical
         | safe C.
         | 
         | The people on one end of this spectrum actually wouldn't accept
         | any of the changes to meaningfully move the needle, while the
         | people on the other end have already moved or are moving to
         | Rust.
         | 
         | Then in the middle you have a large group of people but not one
         | that agrees on which points of compatibility they will give up
         | for which points of safety. If someone just said "Ok, here's
         | the standard variant, deal with it", they might adopt it... but
         | they wouldn't be the ones invested enough to make it and the
         | people who would make it have already moved to other languages.
        
         | awesome_dude wrote:
         | > Luckily there's an easy way forward, which is to skip the
         | step where we try to get consensus.
         | 
         | This is true: it's the benevolent-dictator model versus the
         | rule-by-committee model problem.
         | 
         | Committees are notorious for having problems coming to a
         | consensus, because everyone wants to pull in a different
         | direction, often at odds with everyone else.
         | 
         | Benevolent dictators get things done, but it's not necessarily
         | what people want.
         | 
         | And, we live in hope that they stay benevolent.
        
       | melon_tusk wrote:
       | This is a dream come true. Please do it, for the love of mankind.
        
       | vmchale wrote:
       | Have a look at ATS, it is memory-safe and designed for kernel
       | development. There are kernel and Arduino examples. Fluent C
       | interop.
       | 
       | No tactics metaprogramming but it'll give you a start.
        
       | gwbas1c wrote:
       | There are plenty of attempts at "safe C-like" languages that you
       | can learn from:
       | 
       | C++ has smart pointers. I personally haven't worked with them,
       | but you can probably get very close to "safe C" by mostly working
       | in C++ with smart pointers. Perhaps there is a way to annotate
       | the code (with a .editorconfig) to warn/error when using a
       | straight pointer, except within a #pragma?
       | 
       | > Just talk to the platform, almost all the platforms speak C.
       | Nothing like Rust's PAL (platform-agnostic layer) is needed. 2)
       | Just talk to other languages, C is the lingua franca
       | 
       | C# / .Net tried to do that. Unfortunately, the memory model
       | needed to enable garbage collection makes it far too opinionated
       | to work in cases where straight C shines. (IE, it's not practical
       | to write a kernel in C# / .Net.) The memory model is also so
       | opinionated about how garbage collection should work that C# in
       | WASM can't use the proposed generalized garbage collector for
       | WASM.
       | 
       | Vala is a language that's inspired by C#, but transpiles to C. It
       | uses the GObject system under the hood. (I guess GObjects are
       | used in some Linux GUIs, but I have little experience with it.)
       | GObjects, and thus Vala, are also opinionated about how automatic
       | memory management should work (in this case, they use reference
       | counting), but from what I remember it might be easier to drop
       | into C in a Vala project.
       | 
       | Objective C is a decent object-oriented language, and IMO, nicer
       | than C++. It allows you to call C directly without needing to
       | write bindings; and you can even write straight C functions mixed
       | in with Objective C. But, like C# and Vala, Objective C's memory
       | model is also opinionated about how memory management should
       | work. You might even be able to mix Swift and Objective C, and
       | merely use Objective C as a way to turn C code into objects.
       | 
       | ---
       | 
       | The thing is, if you were to try to retrofit a "safe C" inside of
       | C, you have to be _opinionated about how memory management should
       | work._ The value of C is that it has no opinions about how your
       | memory management should work; this allows C to interoperate with
       | other languages that allow access to pointers.
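
To make "opinionated about memory management" concrete, here is a minimal sketch of one such opinion, manual reference counting, in plain C. The `rc_buf` type and its functions are hypothetical names invented for this illustration; they are not GObject's or Vala's API, and the counter is not thread-safe.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical refcounted string buffer (illustrative, not GObject's API). */
typedef struct {
    size_t refs;   /* number of live owners */
    size_t len;    /* payload length, excluding the terminator */
    char data[];   /* flexible array member holding the payload */
} rc_buf;

/* Allocates a buffer holding a copy of s, owned once by the caller. */
static rc_buf *rc_buf_new(const char *s) {
    size_t len = strlen(s);
    rc_buf *b = malloc(sizeof *b + len + 1);
    if (b == NULL) return NULL;
    b->refs = 1;
    b->len = len;
    memcpy(b->data, s, len + 1);
    return b;
}

/* Registers one more owner and returns the same buffer. */
static rc_buf *rc_buf_ref(rc_buf *b) {
    b->refs++;
    return b;
}

/* Drops one owner. Returns 1 if this call freed the buffer, 0 otherwise. */
static int rc_buf_unref(rc_buf *b) {
    if (--b->refs == 0) {
        free(b);
        return 1;
    }
    return 0;
}
```

Every caller now has to follow the ref/unref discipline, which is exactly the kind of convention that code on the other side of a plain C boundary knows nothing about.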
        
         | neonsunset wrote:
         | It's less that it's opinionated and more that the WASM GC spec is
         | just bad and too rudimentary to be anywhere near enough for the
         | more sophisticated GC implementations found in JVM and .NET.
        
           | gwbas1c wrote:
           | It's been a while since I skimmed the proposal. What I
           | remember is that it was "just enough" to be compatible with
           | JavaScript; but didn't have the hooks that C# needs. (I don't
           | remember any mentions about the JVM.)
           | 
           | I remember that the C# WASM team wanted callbacks for
           | destructors and type metadata.
           | 
           | Personally, having spent > 20 years working in C#,
           | destructors are a smell of a bigger problem; and really only
           | useful for debugging resource leaks. I'd rather turn them off
           | in the WASM apps that I'm working on.
           | 
           | Type metadata is another thing that I think could be handled
           | within the C# runtime: Much like IntPtr is used to
           | encapsulate native pointers, and it can be encapsulated in a
           | struct for type safety when working with native code, there
           | can be a struct type used for interacting with non-C# WASM
           | managed objects that doesn't contain type metadata.
        
             | neonsunset wrote:
             | Here's the issue which gives an overview of the problems:
             | https://github.com/WebAssembly/gc/issues/77
             | 
             | Further discussion can be found here:
             | https://github.com/dotnet/runtime/issues/94420
             | 
             | Turning off destructors will not help even a little because
             | the biggest pain points are support for byref pointers and
             | insufficient degree of control over object memory layout.
        
       | pkkm wrote:
       | I'm a lot less experienced than you, but since you're collecting
       | ideas, I'll give my opinion.
       | 
       | For me personally, the biggest improvements that could be made to
       | C aren't about advanced type system stuff. They're things that
       | are technically simple but backwards compatibility makes them
       | difficult in practice. In order of importance:
       | 
       | 1) Get rid of null-terminated strings; introduce native slice and
       | buffer types. A slice would be basically _struct { T *ptr; size_t
       | count; }_ and a buffer would be _struct { T *ptr; size_t count;
       | size_t capacity; }_, though with dedicated syntax to make them
       | ergonomic - perhaps _T ^slice_ and _T @buffer_. We'd also want
       | buffer -> slice -> pointer decay, _beginof_ / _endof_ / _countof_
       | / _capacityof_ operators, and of course good handling of type
       | qualifiers.
       | 
       | 2) Get rid of _errno_ in favor of consistent out-of-band error
       | handling that would be used in the standard library and
       | recommended for user code too. That would probably involve using
       | the return value for a status code and writing the actual result
       | via a pointer: _int do_stuff(T *result, ...)_.
       | 
       | 3) Get rid of the strict aliasing rule.
       | 
       | 4) Get rid of various tiny sources of UB. For example,
       | standardize _realloc_ to be equivalent to _free_ when called with
       | a length of 0.
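
Points 1 and 2 above can be approximated in today's C, without the proposed syntax. The sketch below assumes a slice specialised to `int` (C has no generic `T`), and the names `int_slice`, `slice_get`, `slice_sub` and the `status` codes are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slice type: a pointer plus an element count. */
typedef struct {
    const int *ptr;
    size_t count;
} int_slice;

/* Hypothetical status codes carried in the return value. */
enum status { STATUS_OK = 0, STATUS_OUT_OF_RANGE = 1 };

/* Bounds-checked read: status in the return value, result via *out. */
static int slice_get(int_slice s, size_t index, int *out) {
    if (index >= s.count) return STATUS_OUT_OF_RANGE;
    *out = s.ptr[index];
    return STATUS_OK;
}

/* Sub-slice [begin, end); rejects ranges outside the parent slice. */
static int slice_sub(int_slice s, size_t begin, size_t end, int_slice *out) {
    if (begin > end || end > s.count) return STATUS_OUT_OF_RANGE;
    out->ptr = s.ptr + begin;
    out->count = end - begin;
    return STATUS_OK;
}
```

What a library like this cannot give you is the decay rules, the `countof`-style operators, or qualifier handling from the list; those need language support.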
       | 
       | Metaprogramming-wise, my biggest wish would be for a way to
       | enrich programs and libraries with custom compile-time checks,
       | written in plain procedural code rather than some convoluted
       | meta-language. These checks would be very useful for libraries
       | that accept custom (non- _printf_ ) format strings, for example.
       | An opt-in linear type system would be nice too.
       | 
       | Tool-wise, I wish there was something that could tell me
       | definitively whether a particular run of my program executed any
       | UB or not. The simpler types of UB, like null pointer
       | dereferences and integer overflows, can be detected now, but I'd
       | also like to know about any violations of aliasing and pointer
       | provenance rules.
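
Absent such a tool, two of the UB classes mentioned here can at least be avoided by construction. This is a minimal sketch with hypothetical helper names (`float_bits`, `checked_add`); it assumes a 32-bit `float` and a C11 compiler for `_Static_assert`. For the simpler cases, GCC and Clang can also instrument a build with `-fsanitize=undefined`.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float is 32 bits wide, as on common platforms. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Reads the bit pattern of a float by copying bytes. Casting the float* to
   uint32_t* and dereferencing it would violate the strict aliasing rule. */
static uint32_t float_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Adds two ints. Returns 0 and stores the sum on success; returns -1 without
   touching *sum when the result would overflow (signed overflow is UB). */
static int checked_add(int a, int b, int *sum) {
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) return -1;
    *sum = a + b;
    return 0;
}
```

Neither helper says anything about pointer provenance, which is the harder part of the wish above.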
        
       | ryao wrote:
       | Here is a sound static analyzer that can identify all memory
       | safety bugs in C/C++ code, among other kinds of bugs:
       | 
       | https://www.absint.com/astree/index.htm
       | 
       | You can use it to produce code that is semi-formally verified to
       | be safe, with no need for extensions. It is used in the aviation
       | and nuclear industries. Given that it is used only by industries
       | where reliability is so important that money is no object, I
       | never bothered to ask them how much it costs. Few people outside
       | of those industries know that it exists. It is a shame that the
       | open source alternatives only support subsets of what it
       | supports. The computing industry is largely focused on unsound
       | approaches that are easier to do, but do not catch all issues.
       | 
       | If you want extensions, here is a version of C that relies on
       | hardware features to detect pointer dereferences to the wrong
       | places through capabilities:
       | 
       | https://github.com/CTSRD-CHERI/cheri-c-programming
       | 
       | It requires special CHERI hardware, although the hardware does
       | exist.
        
         | AlotOfReading wrote:
         | Astree is a pain in the butt. Even if it were free, I'd
         | recommend it to very few people. It's not usable without
         | someone (often a team) being responsible for it full time.
         | 
         | TrustInSoft is the higher quality option, Polyspace is the more
         | popular option, and IKOS is probably the best open source
         | option. I've also had luck with tools from Galois Inc and the
         | increasingly dated rv-match tool.
        
       | fxtentacle wrote:
       | I believe what programmers actually want is clean dialect-free C
       | with sidecar files.
       | 
       | It seems people pretty universally dislike type annotations and
       | overly verbose comments, like Ruby's YARD or Java's Javadoc.
       | Also, if your new language doesn't compile with a standard C
       | compiler, kernel usage is probably DOA. That means you want to
       | keep the source code pure C and store additional data in an
       | additional file. That additional file would then contain stuff
       | like pointer type annotations, object lifecycle and lifetime
       | hints, compile-time eval hints, and stuff to make the macros type
       | safe. Ideally, your tool can then use the C code and the sidecar
       | file together to prove that the C code is bug-free and that
       | pointers are handled correctly. That would make your language as
       | safe as Rust to use.
       | 
       | The hardcore C kernel folks can then just look at the C code and
       | be happy. And you and your users use a special IDE to modify the
       | C code and the sidecar file simultaneously, which unlocks all the
       | additional language features. But as soon as you hit save, the
       | editor converts its internal representation back into plain C
       | code. That means, technically, the sidecar file and your IDE are
       | a fancy way of transpiling from whatever you come up with to pure
       | C.
        
       | muricula wrote:
       | You may be interested in the new Clang -fbounds-safety extension
       | https://clang.llvm.org/docs/BoundsSafety.html
        
       | bachmeier wrote:
       | You mentioned D, but are you familiar with D's BetterC?
       | 
       | https://dlang.org/spec/betterc.html
       | 
       | The goal with BetterC is to write D code that's part of a C
       | program. There's no runtime, no garbage collector, or any of
       | that. Of course you lose numerous D features, but that's kind of
       | the point - get rid of the stuff that doesn't work as part of a C
       | program.
        
       | viraptor wrote:
       | Here's a thing... There have been many of them and they all die
       | because they don't provide enough benefit over the status quo.
       | Cyclone
       | https://en.wikipedia.org/wiki/Cyclone_(programming_language) is
       | probably the most known one. There's Safe C
       | https://www.safe-c.org/ A bit further from just "dialect" there's
       | OOC https://ooc-lang.github.io/ and Vala https://vala.dev/
       | 
       | But the only thing that really took off was effort to change
       | things at the very base level rather than patch issues: Rust,
       | Zig, Go.
        
       | ebiederm wrote:
       | If the goal is something that can be used to improve existing C
       | code, I have a few thoughts.
       | 
       | To get to memory safety with C:
       | 
       | - Add support for array bounds checking. Ideally with the
       | compiler doing the heavy lifting and proving to itself that
       | most runtime bounds checks are unnecessary.
       | 
       | - Implement trivial dependent types so the compiler can know
       | about the array size field that is passed next to a pointer. AKA
       | 
       | void do_something(size_t size, entry_t ptr[size]);
       | 
       | - Enforce the restrict keyword. This is actually the tricky bit.
       | I have some ideas for a language that is not C, but making it
       | backwards compatible is beyond where I have gotten. My hint is
       | separation logic.
       | 
       | - Allow types to change safely, so that free() can change the
       | type of the pointer passed to it, to be a non-dereferenceable
       | pointer (whatever bits it has).
       | 
       | This is an idea from separation logic.
       | 
       | Allowing functions to change types of data safely could also be a
       | safe solution to code that needs type punning today.
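
The declaration in the second bullet is already accepted by C99 compilers that support variably modified parameter types (GCC and Clang do; MSVC does not). The sketch below uses a hypothetical `entry_t` layout and function name, and it is worth stressing that today the size is not enforced: the parameter is adjusted to a plain pointer, so the annotation is documentation rather than a checked dependent type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical entry type, only for illustration. */
typedef struct {
    int key;
    int value;
} entry_t;

/* The array parameter names the size argument that precedes it. The compiler
   adjusts ptr to `const entry_t *`, so no bounds check is generated here. */
static long sum_values(size_t size, const entry_t ptr[size]) {
    long total = 0;
    for (size_t i = 0; i < size; i++) {
        total += ptr[i].value;
    }
    return total;
}
```

A checked variant of the proposal would have to reject, or trap on, a call that passes a `size` larger than the array actually behind `ptr`.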
       | 
       | I think conceptually modules are great, but if your goal is
       | source compatible changes that bring memory safety then something
       | like modules is an unnecessary distraction.
       | 
       | Any changes that ultimately cannot be implemented in the default
       | C compiler I don't think will be preferable to just rewriting the
       | code in a more established language like Rust.
       | 
       | On the other hand I think we are in a local maximum with
       | programming languages and type systems, with everyone busy
       | recombining proven techniques in different ways instead of
       | working on the hard problem of how to have assignment, threading,
       | and memory safety, plus how to do proofs of interesting program
       | properties with things like asserts.
       | 
       | Unfortunately it appears that only through proof can programs be
       | consistent enough that specific security concerns can be said to
       | not be problems.
       | 
       | What I have seen of Ada SPARK lately has been very tantalizing.
       | 
       | I have a personal project in which I think I have solved the memory
       | safety problem, while still allowing manual memory management and
       | assignment. Unfortunately I am at a stage where everything is
       | mostly clear in my head, but I haven't finished fleshing it out
       | and proving the type system, so I really can't share it yet :-(.
       | 
       | While implementing modules, memory safety, type variables, and
       | functions that can change the types of their argument pointers, I
       | think I will end up with something simpler than C in most
       | respects.
       | 
       | I keep going "well, that doesn't make any sense today" as I go
       | through all of the details and ask why something is done the way
       | it is done.
       | 
       | One of those questions is why doesn't C use modules.
        
       | 1718627440 wrote:
       | In my opinion SPLint (http://splint.org/) would be a nice
       | approach. It is a way to specify ownership semantics, inout
       | parameters etc., and it also allows specifying arbitrary pre- and
       | postconditions. It works by annotating whole functions, their
       | parameters, types and variables. These are then checked by
       | calling splint on the codebase; you can also opt out of several
       | checks by flags or using the preprocessor.
       | 
       | - nullability: /*@null@*/
       | - in/out parameter (default in): /*@inout@*/, /*@out@*/
       | - ownership: /*@only@*/, /*@temp@*/, /*@shared@*/, /*@refcounted@*/
       | - also supports partially defined parameters
       | - can be introduced gradually in the codebase
       | 
       | Example from the documentation:
       | 
       |     void * /*@alt char * @*/
       |     strcpy (/*@unique@*/ /*@out@*/ /*@returned@*/ char *s1, char *s2)
       |       /*@modifies *s1@*/
       |       /*@requires maxSet(s1) >= maxRead(s2) @*/
       |       /*@ensures maxRead(s1) == maxRead (s2) @*/;
       | 
       | My main problem was that it was annoying to add to a project, but
       | that is only because you need to specify ownership semantics, not
       | because of the syntax, which is short and readable. Also, the
       | program sometimes crashes and there doesn't seem to be active
       | development.
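
Because the annotations are ordinary comments, annotated code still builds with any C compiler. The sketch below applies the nullability and ownership annotations listed above to two small functions; the function names are hypothetical, and the file has not been run through Splint, so treat the annotation placement as an assumption rather than verified output.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a fresh copy that the caller must release (only), or NULL if the
   allocation fails (null). The argument is merely borrowed (temp). */
static /*@null@*/ /*@only@*/ char *dup_string(/*@temp@*/ const char *s) {
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy == NULL) return NULL;
    memcpy(copy, s, n);
    return copy;
}

/* Takes ownership of p and releases it; accepts NULL, as free() does. */
static void release_string(/*@null@*/ /*@only@*/ char *p) {
    free(p);
}
```

A checker that understands these annotations could then flag a caller that dereferences the result without a NULL check or never passes it on to `release_string`.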
        
       ___________________________________________________________________
       (page generated 2025-02-25 23:01 UTC)