[HN Gopher] The Inevitability of the Borrow Checker
       ___________________________________________________________________
        
       The Inevitability of the Borrow Checker
        
       Author : rbanffy
       Score  : 105 points
       Date   : 2025-02-06 20:54 UTC (1 days ago)
        
 (HTM) web link (yorickpeterse.com)
 (TXT) w3m dump (yorickpeterse.com)
        
       | etyp wrote:
       | I like how this is structured. When I read that inline types get
       | copied-on-borrow I was pretty put off. Then since fields of
       | inline types can't be assigned new values it seems a bit better,
       | as long as you roughly know what's happening. Hopefully the
       | diagnostics are good enough there. I like the detailed
       | alternatives that weren't chosen.
       | 
       | I appreciate being able to choose which side of the tradeoff
       | (always-copy or heap allocated) you want to be on, but either way
       | be assured it's safe. Not sure how I feel about it in practice
       | without trying it, though :)
        
         | YorickPeterse wrote:
         | On the diagnostics side of things, the compiler produces these
         | in two places:
         | 
         | 1. If you try to define a `mut` field (= one you can normally
         | assign a new value), it will produce an error at the definition
         | site
         | 
         | 2. If you try to do something like `some_inline_value.field =
         | value` it also produces an error at the call/assignment site
         | 
         | The actual messages could use some improvements, but that's not
         | a super high priority for the time being as I'm more focusing
         | on the language overall.
        
       | agentultra wrote:
       | A nicely written article! And an interesting project.
       | 
       | Myself, I'd lean towards a sound (linear) type theory. If it's
       | not too much trouble, insert the run-time checks in debug builds
       | but use the type system to erase them for optimized builds. It
       | might seem like the mountain is impossible to climb if you're not
       | used to formalizing such systems but every mountain is surmounted
       | one step at a time.
       | 
       | It's hard to bolt-on correctness after the fact. In my
       | experience, for critical pieces like this, it's better to get the
       | specification right first before digging in too deep and writing
       | code.
       | 
       | Best of luck on the project either way you go. Memory safety is
       | increasingly important!
        
         | YorickPeterse wrote:
         | Although linear typing is certainly interesting, I think linear
         | typing on its own is not enough. That is, you basically end up
         | with something like Austral where you use linear types _plus_
         | some form of compile-time borrow checking, at which point we're
         | back at borrow checking (did I mention it seems inevitable?) :)
        
           | agentultra wrote:
           | You did! And maybe it _is_ inevitable? I don 't know.
           | 
           | It'd be interesting to see different theories evolve, for
           | sure. Maybe something in separation logic will make it's way
           | into mainstream type theory and into compilers at some point.
           | 
           | What's cool is how much we're starting to see folks push the
           | boundaries these days. :)
        
           | Kinrany wrote:
           | Wouldn't the borrow checking be simpler when built on top of
           | linear types?
        
             | YorickPeterse wrote:
             | I don't think the choice of linear vs affine makes much of
             | a difference, but I could be mistaken.
        
         | sunshowers wrote:
         | I think a real challenge with trying to work on specifications
         | first is error handling -- you often find that a truly sound
         | model is quite difficult to explain to users. So some
         | prototyping and iteration becomes necessary in my experience.
         | 
         | Like, rustc only recently gained the ability to explain that
         | the borrow checker rejected a program because a lifetime
         | parameter was invariant. And this isn't even an artificial bit
         | of complexity -- if you have mutability you are required to
         | think about variance. If you have a Cell<&'static str> you
         | cannot just turn that into a Cell<&'a str>, the way you can
         | turn a regular &'static str into a &'a str. (Java programmers
         | might be familiar with similar issues around
         | ArrayList<Object>.)
        
       | pjc50 wrote:
       | This article seems to use "borrow" to mean what I would normally
       | understand to be the reference count of a refcount-gc system?
       | Rather than a Rust borrow, which is part of the type system and
       | not counted at runtime.
       | 
       | In C# you can force a type to be stack allocated with "ref
       | struct". https://learn.microsoft.com/en-
       | us/dotnet/csharp/language-ref...
        
         | YorickPeterse wrote:
         | In Inko a borrow is sort of a hybrid in that it does increment
         | a counter, but there's no shared ownership like there is in
         | traditional reference counting (e.g. a borrow is never tasked
         | with cleaning up data).
         | 
         | Terminology is a bit annoying here, as e.g. "reference" can
         | also be interpreted as "a pointer", though it depends on who
         | you ask. I stuck with "borrow" because that just made the most
         | sense to me :)
        
           | thomasmg wrote:
           | > borrow is never tasked with cleaning up data
           | 
           | And the reason for this is that there is no branch needed at
           | the end of the borrow, to check for refCount=0? (Or, is this
           | at least one of the reasons?) I'm wondering about the
           | performance impact for this... There's also the code size
           | impact.
        
             | YorickPeterse wrote:
             | Not quite. A borrow doesn't do cleanup because it's, well,
             | a borrow, i.e. a temporary reference to some owned value.
             | It's the owned value that's tasked with disposing of any
             | resources/memory when it goes out of scope, just as in
             | Rust.
             | 
             | The cost of borrowing (at least with heap types) is an
             | increment upon creating the borrow, and a decrement upon
             | disposing of the borrow. Over time I'd like to optimize
             | those away, but that's not implemented at this time.
        
         | harrison_clarke wrote:
         | you can stack allocate with just `struct` in C#. if you put a
         | struct in a local variable, it'll be on the stack
         | 
         | `ref struct` allows that struct to contain refs, and disallows
         | putting it on the heap (so, you can't have an array/list of ref
         | structs, and a class or a normal struct can't have a ref struct
         | field)
        
           | neonsunset wrote:
           | This is correct. Ref structs and refs are also subject to
           | similar[0] to Rust's lifetime analysis to prevent them from
           | being ever invalidated and having use-after-free (even if
           | that free is essentially going out of scope or popping a
           | stack frame).
           | 
           | I assume pjc50 specifically refers to the fact that `ref
           | struct` gives a strong guarantee while a regular struct can
           | be placed wherever.
           | 
           | [0]: https://em-tg.github.io/csborrow/
        
             | harrison_clarke wrote:
             | i think of it like having rust's rules, but you only get
             | one (implicit) lifetime variable
             | 
             | (that's probably wrong in some subtle ways)
        
         | pjmlp wrote:
         | Or _stackalloc_ instead of _new_.
        
       | harrison_clarke wrote:
       | something i'd like to see in a borrow checker (which i think
       | causes problems in rust, because of the "leaking is safe" thing.
       | which sounds like it could be a difficult hole to plug in
       | language design):
       | 
       | in rust, &mut means "this pointer points to an initialized value,
       | and will point to an initalized value at the end of scope"
       | 
       | i wish it also had &in, &out, and &tmp, for
       | "initialized->uninitialized", "uninitialized->initialized", and
       | "uninitialized->uninitialized"
        
         | masklinn wrote:
         | Their intra-function behaviour seems difficult to define for
         | non trivial cases. For instance a &tmp would need to be write
         | only until it's written to, then it can be read from as well,
         | but it needs to be consumed before its scope ends,
         | transitioning back to a write only reference. So you'd need a
         | type system which can transition parameters through typestates
         | (is that a subset of effects systems or unrelated?).
        
           | wavemode wrote:
           | What you're describing is exactly correct - you need a more
           | robust "typestate" system (I call them "annotations"). Most
           | languages have typestates - where you can, for example,
           | declare a variable without immediately assigning it a value,
           | but that variable remains write-only until it is assigned to.
           | 
           | But these typestate systems aren't very elaborate usually.
           | I've recently been doing research on the (hypothetical)
           | design of a language which has typestates that can cross
           | function boundaries - you can call a function that annotates
           | that it uninitializes one of its arguments, and then that
           | reference you passed in is now considered uninitialized in
           | your local scope.
        
       | ivanjermakov wrote:
       | Author introduces inline type definition. Shouldn't allocation
       | strategy be decided by the caller, not type definiton? Or there
       | is a way to heap allocate value of inline type by wrapping it
       | into some utility type, similar to Rust's Box?
        
         | jayd16 wrote:
         | Hmm, if it was by the caller then methods on the inline type
         | can't know if members are copied or shared in assignment, I
         | wouldn't think.
         | 
         | Seems trivial to box the inline types like most languages do.
        
         | dcrazy wrote:
         | > Shouldn't allocation strategy be decided by the caller, not
         | type definiton?
         | 
         | This is how C++ is designed. Unfortunately, it precludes types
         | from taking dependencies on their own address, which is
         | critical for e.g. atomics. As far as I know, there is no way to
         | actually force a C++ class to be heap allocated. I've tried.
         | 
         | Newer languages like Swift give the type designer the ability
         | to say "the address of this object in memory is important and
         | must remain stable for the lifetime of this object." This
         | decays to heap allocation.
        
           | mtklein wrote:
           | The idiom I am familiar with here is to make the constructor
           | private and provide a public static factory method that
           | allocates on the heap.
        
           | badmintonbaseba wrote:
           | > As far as I know, there is no way to actually force a C++
           | class to be heap allocated.
           | 
           | Make the destructor (and possibly the constructors) private,
           | have public factory functions that hand out unique_ptr with a
           | custom deleter, or define a public destroying operator delete
           | (C++20) and use unique_ptr with its default deleter.
        
           | spacechild1 wrote:
           | > Unfortunately, it precludes types from taking dependencies
           | on their own address,
           | 
           | It does not! You just have to make it non-copyable and non-
           | moveable. std::mutex and std::atomic do exactly this.
           | 
           | > As far as I know, there is no way to actually force a C++
           | class to be heap allocated. I've tried.
           | 
           | As others have already pointed out: private constructor +
           | factory method. However, this pattern is typically used for
           | different reasons (implementation hiding or runtime
           | polymorphism) and heap allocation is only a side effect.
           | 
           | > This decays to heap allocation.
           | 
           | No! The object might just as well live on the stack or have
           | static storage duration.
        
         | YorickPeterse wrote:
         | > Or there is a way to heap allocate value of inline type by
         | wrapping it into some utility type, similar to Rust's Box?
         | 
         | There isn't. For this to work, functions for inline types have
         | to be compiled such that one of these approaches is used:
         | 
         | 1. They always take inline values by pointer, and we have to
         | guarantee those pointers are never invalidated. This again
         | means you need some sort of borrow checking scheme.
         | 
         | 2. We compile two versions for each function: one that takes
         | the data by value, and one by pointer, resulting in significant
         | code bloat and compiler complexity.
         | 
         | I think the split also better captures the intent: heap types
         | are for cyclic and heavily mutated values, inline types are
         | more for short-lived mostly immutable values.
        
       | aidenn0 wrote:
       | Why not a tracing GC instead?
        
         | YorickPeterse wrote:
         | Inko used to have a tracing GC (see
         | https://yorickpeterse.com/articles/friendship-ended-with-the...
         | for more details), but this was removed due to the various
         | problems the presence of a tracing GC introduces.
        
       | brokencode wrote:
       | Inko is an interesting language, though the handling for panics
       | sounds disturbing. Based on the documentation, panics abort the
       | whole program and are not recoverable.
       | 
       | Things like dividing by zero or accessing an index outside of the
       | bounds of an array will panic.
       | 
       | While these aren't incredibly common, I would never want my whole
       | production web server to go down because some random,
       | infrequently used endpoint had a bug in some edge case.
       | 
       | The language seems heavily inspired by Erlang, which is very
       | resilient to programmer errors in comparison.
       | 
       | In Erlang, the philosophy is to let a lightweight process crash
       | and restart it when there is a problem. This doesn't need to have
       | any effect on other lightweight processes that are running.
        
         | nixpulvis wrote:
         | That's why you have wrappers that return Result enumerations.
         | There is an overhead to check if an operation is valid, pay for
         | it if you need it.
         | 
         | This is exactly how Rust works, for example.
        
           | demurgos wrote:
           | Rust's panics are recoverable, so they're more like
           | exceptions. This means that you may need to care about
           | "unwind" safety and values in a potentially invalid state.
           | It's the main reason for mutex poisoning and having to deal
           | with it when locking.
           | 
           | Overall, I'm not entirely sure if Rust would be better or not
           | if panics were non-recoverable.
           | 
           | PS: For completeness, there are flags to control the panic
           | behavior; but you can't rely on them when writing a lib.
        
       | hinkley wrote:
       | My entirely made up origin story of borrow checked is that escape
       | analysis (for GC) made huge progress in the 00's and borrow
       | checkers are just the logical conclusion of having that shoulder
       | to stand on.
       | 
       | I don't know what inspired borrow checking but I am certain
       | someone else would have thought it up presently if they hadn't.
        
         | panstromek wrote:
         | I think the Rust borrow checker is inspired by Cyclone. Not
         | sure how that coincides with the GC development timeline
        
           | mkehrt wrote:
           | Exactly correct, I think. Cyclone allowed borrowing of
           | "regions" (which were similar to rust lifetimes) in a very
           | similar way. This was either based on or itself inspired
           | other theoretical models of borrowing at around the same
           | time; I'm not sure on the causality but I read all the
           | literature at the time!
        
       | pmarreck wrote:
       | Never heard of the Inko language. Interesting.
        
       ___________________________________________________________________
       (page generated 2025-02-07 23:00 UTC)