[HN Gopher] Four limitations of Rust's borrow checker
       ___________________________________________________________________
        
       Four limitations of Rust's borrow checker
        
       Author : todsacerdoti
       Score  : 169 points
       Date   : 2024-12-22 10:33 UTC (2 days ago)
        
 (HTM) web link (blog.polybdenum.com)
 (TXT) w3m dump (blog.polybdenum.com)
        
       | xdfgh1112 wrote:
       | A pattern in these is code that compiles until you change a small
       | thing. A closure that works until you capture a variable, or code
       | that works in the main thread but not in a separate thread. Or
       | works until you move the code into an if/else block.
       | 
       | My experience with Rust is like this: make a seemingly small
       | change, it balloons into a compile error that requires large
       | refactoring to appease the borrow checker and type system. I
       | suppose if you repeat this enough you learn how to write code
       | that Rust is happy with first-time. I think my brain just doesn't
       | like Rust.
        
         | alfiedotwtf wrote:
         | I don't know anyone who has gotten Rust first time around. It's
         | a new paradigm of thinking, so take your time, experiment, and
         | keep at it. Eventually it will just click and you'll be back to
         | having typos in syntax overtake borrow checker issues
        
         | stouset wrote:
         | Your supposition about Rust is correct.
         | 
         | I'll add that--having paid that upfront cost--I am happily
         | reaping the rewards even when I write code in other languages.
         | It turns out the way that Rust "wants" to be written is overall
         | a pretty good way for you to organize the relationships between
         | parts of a program. And even though the borrow checker isn't
         | there looking out for you in other languages, you can code as
         | if it is!
        
           | farresito wrote:
           | As someone who is interested in getting more serious with
           | Rust, could you explain the essence of how you should always
           | approach organizing code in Rust as to minimize refactors as
           | the code grows?
        
             | oconnor663 wrote:
             | In my experience there are two versions of "fighting the
             | borrow checker". The first is where the language has tools
             | it needs you to use that you might not've seen before, like
             | enums, Option::take, Arc/Mutex, channels, etc. The second
             | is where you need to stop using references/lifetimes and
             | start using indexes: https://jacko.io/object_soup.html
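
A minimal sketch of the "indexes instead of references" pattern, assuming a hypothetical `Graph` that owns every node in one `Vec` and hands out `usize` positions. The names `Graph`, `Node`, `add_node`, `add_edge`, and `neighbour_names` are illustrative only and are not taken from the linked post.

```rust
/// A node refers to its neighbours by index into `Graph::nodes`,
/// not by reference, so no lifetimes appear in the type.
#[derive(Debug)]
struct Node {
    name: String,
    neighbours: Vec<usize>,
}

/// The graph is the single owner of all nodes.
#[derive(Debug, Default)]
struct Graph {
    nodes: Vec<Node>,
}

impl Graph {
    /// Adds a node and returns its index.
    fn add_node(&mut self, name: &str) -> usize {
        self.nodes.push(Node {
            name: name.to_string(),
            neighbours: Vec::new(),
        });
        self.nodes.len() - 1
    }

    /// Records an undirected edge. Cycles are fine because nothing is borrowed.
    fn add_edge(&mut self, a: usize, b: usize) {
        self.nodes[a].neighbours.push(b);
        self.nodes[b].neighbours.push(a);
    }

    /// Returns the names of a node's neighbours.
    fn neighbour_names(&self, idx: usize) -> Vec<&str> {
        self.nodes[idx]
            .neighbours
            .iter()
            .map(|&n| self.nodes[n].name.as_str())
            .collect()
    }
}
```

Because nodes are never removed here, an index stays valid for as long as the `Graph` does. An out-of-range index panics on the bounds check rather than reading arbitrary memory.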
        
               | meindnoch wrote:
               | >and start using indexes
               | 
               | So basically raw pointers with extra hoops to jump
               | through.
        
               | nordsieck wrote:
               | > So basically raw pointers with extra hoops to jump
               | through.
               | 
               | That's one way to look at it.
               | 
               | The other way is: raw pointers, but with mechanical
               | sympathy. Array based data structures crush pointer based
               | data structures in performance.
        
               | jpc0 wrote:
               | > Array based data structures crush pointer based data
               | structures in performance
               | 
               | array[5] and *(array + 5) generate the same code...
               | Heap-based non-contiguous data structures definitely are
               | slower than stack-based contiguous data structures.
               | 
               | How you index into them is unrelated to performance.
               | 
               | Effectively pointers are just indexes into the big array
               | which is system memory... I agree with parent,
               | effectively pointers without any of the checks pointers
               | would give you.
        
               | frutiger wrote:
               | > pointers are just indexes into the big array which is
               | system memory...
               | 
               | I'm sure you are aware but for anyone else reading who
               | might not be, pointers actually index into your very own
               | private array.
               | 
               | On most architectures, the MMU is responsible for mapping
               | pages in your private array to pages in system memory or
               | pages on disk (a page is a subarray of fixed size,
               | usually 4 KiB).
               | 
               | Usually you only get a crash if you access a page that is
               | not currently allocated to your process. Otherwise you
               | get the much more insidious behaviour of silent
               | corruption.
        
               | _dain_ wrote:
               | _> How you index into them is unrelated to performance._
               | 
               | Not true. If you store u32 indices, that can impose less
               | memory/cache pressure than 64-bit pointers.
               | 
               | Also indices are trivially serializable, which cannot be
               | said for pointers.
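
To make the size argument concrete, here is a small sketch comparing two hypothetical node layouts, one linking by `u32` index and one by raw pointer. `IndexNode`, `PointerNode`, and `node_sizes` are made-up names for illustration. This only measures bytes per node and says nothing about whether the difference matters for a given workload.

```rust
use std::mem::size_of;

/// Linked-list node that links by 32-bit index into some backing Vec.
#[allow(dead_code)]
struct IndexNode {
    value: u32,
    next: u32,
}

/// The same node linking by raw pointer instead.
#[allow(dead_code)]
struct PointerNode {
    value: u32,
    next: *const PointerNode,
}

/// Returns (bytes per index node, bytes per pointer node).
fn node_sizes() -> (usize, usize) {
    (size_of::<IndexNode>(), size_of::<PointerNode>())
}
```

On a typical 64-bit target the pointer node is padded to 16 bytes while the index node is 8. On a 32-bit target both are the same size.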
        
               | jpc0 wrote:
               | I'll happily look at a benchmark which shows that the
               | size of the index has any significant performance
               | implications vs the work done with the data stored at
               | said index, never mind the data actually stored there.
               | 
               | I haven't looked closely at the decompiled code but I
               | wouldn't be surprised if iterating through a contiguous
               | data structure has no cache pressure but is rather just
               | incrementing a register without a load at all other than
               | the first one.
               | 
               | And if you aren't iterating sequentially you are likely
               | blowing the cache regardless purely based on jumping
               | around in memory.
               | 
               | This is an optimisation that may be premature.
               | 
               | EDIT:
               | 
               | > Also indices are trivially serializable, which cannot
               | be said for pointers
               | 
               | Pointers are literally 64bit ints... And converting them
               | to an index is extremely quick if you want to store an
               | offset instead when serialising.
               | 
               | I'm not sure if we are missing each other here. If you
               | want an index then use indices. There is no performance
               | difference when iterating through a data structure, there
               | may be some for other operations but that has nothing to
               | do with the fact they are pointers.
               | 
               | Back to the original parent that spurred this
               | discussion... Replacing a reference (which is basically a
               | pointer with some added sugar) with an index into an
               | array is effectively just using raw pointers to get
               | around the borrow checker.
        
               | trealira wrote:
               | > Pointers are literally 64bit ints... And converting
               | them to an index is extremely quick if you want to store
               | an offset instead when serialising.
               | 
               | I'm not them, but they're saying pointer based structures
               | are just less trivial to serialize. For example, to
               | serialize a linked list, you basically need to copy them
               | into an array of nodes, replacing each pointer to a node
               | with a local offset into this array. You can't convert
               | them into indices just with pointer arithmetic because
               | each allocation was made individually. Pointer arithmetic
               | assumes that they already exist in some array, which
               | would make the use of pointers instead of indices
               | inefficient and redundant.
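
The flattening step described above could look roughly like this. It is a sketch assuming a simple `Box`-based singly linked list; `ListNode`, `FlatNode`, and `flatten` are hypothetical names.

```rust
/// A conventional heap-allocated singly linked list.
struct ListNode {
    value: i32,
    next: Option<Box<ListNode>>,
}

/// Flat form: each entry stores its value and the index of its successor.
#[derive(Debug, PartialEq)]
struct FlatNode {
    value: i32,
    next: Option<u32>,
}

/// Walks the list once, copying each node into a Vec and replacing the
/// pointer to the next node with that node's position in the Vec.
fn flatten(head: &ListNode) -> Vec<FlatNode> {
    let mut out = Vec::new();
    let mut cur = Some(head);
    while let Some(node) = cur {
        let idx = out.len() as u32;
        // The successor, if any, will be pushed right after this node.
        let next = node.next.as_ref().map(|_| idx + 1);
        out.push(FlatNode { value: node.value, next });
        cur = node.next.as_deref();
    }
    out
}
```

The resulting `Vec<FlatNode>` contains no addresses, so it can be written out as-is. For a structure with shared or cyclic links, the same idea would need a map from node address to assigned index instead of the simple `idx + 1`.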
        
               | jpc0 wrote:
               | I understand that entirely, a linked list is a
               | non-contiguous heap-based data structure.
               | 
               | What I am saying is if you store a reference to an item
               | in a Vec or an index to an item in a Vec it is an
               | implementation detail and looking up the reference or the
               | index generates effectively the same machine code.
               | 
               | Specifically in the case that I'm guessing they are
               | referring to which is the optimisation used in patterns
               | like ECS. The optimisation there is the fact that it is
               | stored contiguously in memory and therefore it is trivial
               | to use SIMD or a GPU to do operations on the data.
               | 
               | In that case whether you are storing a u32 or size_t
               | doesn't exactly matter and on a 32bit arch is literally
               | equivalent. It's going to be dwarfed by loading the data
               | into cache if you are randomly accessing the items or by
               | the actual operations done to the data or both.
               | 
               | As I said, sure use an index but that wasn't the initial
               | discussion. The discussion was doing it to get around the
               | borrow checker which is effectively just removing the
               | borrow checker from the equation entirely and you may as
               | well have used a different language.
        
               | IX-103 wrote:
               | The main benefit from contiguous storage is it can be a
               | better match to the cache. Modern CPUs read an entire
               | cache line in a burst. So if you're iterating through a
               | contiguous array of items then chances are the data is
               | already in the cache. Also the processor tends to
               | prefetch cache lines when it recognizes a linear access
               | pattern, so it can be fetching the next element in the
               | array while it's working on the one before it.
        
               | mrkeen wrote:
               | > Pointers are literally 64bit ints... And converting
               | them to an index is extremely quick if you want to store
               | an offset instead when serialising.
               | 
               | This implies serialisation/deserialisation passes, so you
               | can't really let bigger-than-ram data live on your disk.
        
               | quotemstr wrote:
               | Yep. The array index pattern is unsafe code without the
               | unsafe keyword. Amazing how much trouble Rust people go
               | through to make code "safe" only to undermine this safety
               | by emulating unsafe code with safe code.
        
               | umanwizard wrote:
               | The difference is that the semantics of your program are
               | still well-defined, even with bugs in index-based arenas.
        
               | quotemstr wrote:
               | The semantics of a POSIX program are well-defined under
               | arbitrary memory corruption too --- just at a low level.
               | Even with a busted heap, execution is deterministic and
               | every interaction with the kernel has defined
               | behavior --- even if that behavior is SIGSEGV.
               | 
               | Likewise, safe but buggy Rust might be well-defined at
               | one level of abstraction but not another.
               | 
               | Imagine an array index scheme for logged-in-user objects.
               | Suppose we grab an index to an unprivileged user and
               | stuff it in some data structure, letting it dangle. The
               | user logs out. The index is still around. Now a
               | privileged user logs in and reuses the same slot. We do
               | an access check against the old index stored in the data
               | structure. Boom! Security problems of _EXACTLY_ the sort
               | we have in C.
               | 
               | It doesn't matter that the behavior is well-defined at
               | the Rust level: the application still has an escalation
               | of privilege vulnerability arising from a use-after-free
               | even if no part of the program has the word u-n-s-a-f-e.
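
The scenario above can be reproduced in entirely safe Rust. The sketch below assumes a hypothetical `Sessions` table that stores users in a `Vec<Option<User>>` and reuses freed slots. All names are invented for illustration.

```rust
#[derive(Debug, Clone, PartialEq)]
struct User {
    name: String,
    is_admin: bool,
}

/// Hypothetical session table: slots are reused after logout.
#[derive(Default)]
struct Sessions {
    slots: Vec<Option<User>>,
}

impl Sessions {
    /// Stores the user in the first free slot (or a new one) and returns its index.
    fn login(&mut self, user: User) -> usize {
        if let Some(i) = self.slots.iter().position(|s| s.is_none()) {
            self.slots[i] = Some(user);
            i
        } else {
            self.slots.push(Some(user));
            self.slots.len() - 1
        }
    }

    /// Frees the slot. Any copy of the index held elsewhere is now stale.
    fn logout(&mut self, idx: usize) {
        self.slots[idx] = None;
    }

    /// Access check by bare index: it cannot tell who the index was issued for.
    fn is_admin(&self, idx: usize) -> bool {
        self.slots[idx].as_ref().map_or(false, |u| u.is_admin)
    }
}

/// Replays the scenario and returns what the check says for the stale index.
fn stale_index_check() -> bool {
    let mut sessions = Sessions::default();
    let guest = sessions.login(User { name: "guest".into(), is_admin: false });
    let stale = guest; // copied into some other data structure
    sessions.logout(guest);
    let _root = sessions.login(User { name: "root".into(), is_admin: true });
    // The stale index now names the slot that holds the admin.
    sessions.is_admin(stale)
}
```

Nothing here is undefined behaviour at the language level, and every access is bounds-checked, yet `stale_index_check()` returns `true` for an index that was issued to an unprivileged user.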
        
               | IX-103 wrote:
               | Undefined behavior in C/C++ has a different meaning than
               | you're using. If a compiler encounters a piece of code
               | that does something whose behavior is undefined in the
               | spec, it can theoretically emit code that does _anything_
               | and still be compliant with the standards. This could
               | include things like setting the device on fire and
               | launching missiles, but more typically is something
               | seemingly innocuous like ignoring that part of the code
               | entirely.
               | 
               | An example I've seen in actual code: You checked for null
               | before dereferencing a variable, but there is one code
               | path that bypasses the null check. The compiler knows
               | that dereferencing a null pointer is undefined so it
               | concludes that the pointer can never be null and removes
               | the null checks from all of the code paths as an
               | "optimization".
               | 
               | That's the C/C++ foot-gun of undefined behavior. It's
               | very different from memory safety and correctness that
               | you're conflating it with.
        
               | quotemstr wrote:
               | From the kernel's POV, there's no undefined behavior in
               | user code. (If the kernel knew a program had violated C's
               | memory rules, it could kill it and we wouldn't have
               | endemic security vulnerabilities.) Likewise, in safe
               | Rust, the access to that array might be well defined with
               | respect to Rust's view of the world (just like even UB in
               | C programs is well defined from the kernel POV), but it
               | can still cause havoc at a higher level of abstraction
               | --- your application. And it's hard to predict what kind
               | of breakage at the application layer might result.
        
               | dmkolobov wrote:
               | It's not the same. The term "safe" has a specific meaning
               | in rust: memory safety. As in:
               | 
               | - no buffer overflows
               | - no use after free
               | - no data races
               | 
               | These problems lead to security vulnerabilities whose
               | scope extends beyond your application. Buffer overflows
               | have historically been the primary mechanism for taking
               | over entire _machines_. If you emulate pointers with Rust
               | indices and don't use "unsafe", those types of attacks
               | are impossible.
               | 
               | What you're referring to here is correctness. Safe Rust
               | still allows you to write programs which can be placed in
               | an invalid state, and that may have security implications
               | for _your application_.
               | 
               | It would be great if the compiler could guarantee that
               | invalid states are unreachable. But those types of
               | guarantees exist on a continuum and no language can do
               | all the work for you.
        
               | quotemstr wrote:
               | "Safe" has a colloquial meaning: free from danger. The
               | whole reason we care about memory safety is that memory
               | errors become security issues. Rust does nothing to
               | prevent memory leaks and deadlocks, but it does prevent
               | memory errors becoming arbitrary code execution.
               | 
               | Rust programs may contain memory errors (e.g. improper
               | use of interior mutability and out of bounds array
               | access), but the runtime guarantees that these errors
               | don't become security issues.
               | 
               | This is good.
               | 
               | When you start using array indices to manage objects, you
               | give up some of the protections built into the Rust type
               | system. Yes, you're still safe from some classes of
               | vulnerability, but other kinds of vulnerabilities, ones
               | you thought you abolished because "Rust provides memory
               | safety!!!", reappear.
               | 
               | Rust is a last resort. Just write managed code. And if
               | you insist on Rust, reach for Arc before using the array
               | index hack.
        
               | whytevuhuni wrote:
               | Even when you use array indices, I don't think you give
               | those protections up. Maybe a few, sure, but the
               | situation is still overall improved.
               | 
               | Many of the rules that references have to live by are also
               | applied to arrays:
               | 
               | - You cannot have two owners simultaneously hold a
               | mutable reference to a region of the array (unless they
               | are not overlapping)
               | 
               | - The array itself keeps the Sync/Send traits, providing
               | thread safety
               | 
               | - The compiler cannot do provenance-based optimizations,
               | and thus cannot introduce undefined behavior; most other
               | kinds of undefined behavior are still prevented
               | 
               | - Null dereferences still do not exist and other classes
               | of errors related to pointers still do not exist
               | 
               | Logic errors and security issues will still exist of
               | course, but Rust never claimed guarantees against them;
               | only guarantees against undefined behavior.
               | 
               | I'm not going to argue against managed code. If you can
               | afford a GC, you should absolutely use it. But, compared
               | to C++, if you have to make that choice, safety-wise Rust
               | is overall an improvement.
        
               | dmkolobov wrote:
               | I tend to agree w.r.t. managed languages.
               | 
               | Still, being free from GC is important in some domains.
               | Beyond being able to attach types to scopes via
               | lifetimes, Rust also provides runtime array bounds checks,
               | reference-counting shared pointers, tagged unions, etc.
               | These are the techniques used by managed languages to
               | achieve memory-safety and correctness!
               | 
               | For me, Rust occupies an in-between space. It gives you
               | more memory-safe tools to describe your problem domain
               | than C. But it is less colloquially "safe" than managed
               | languages because ownership is hard.
               | 
               | Your larger point with indices is true: using them throws
               | away some benefits of lifetimes. The issue is
               | granularity. The allocation assigned to the collection as
               | a whole is governed by rust ownership. The structures you
               | choose to put inside that allocation are not. In your
               | user ID example, the programmer of that system should
               | have used a generational arena such as:
               | 
               | https://github.com/fitzgen/generational-arena
               | 
               | It solves _exactly_ this problem. When you `free` any
               | index, it bumps a counter which is paired with the next
               | allocated index/slot pair. If you want to avoid having
               | to "free" it manually, you'll have to devise a system
               | using `Drop` and a combination of command queues,
               | reference-counted cells, locks, whatever makes sense.
               | Without a GC you _need_ to address the issue of
               | allocating/freeing slots for objects within an
               | allocation in some way.
               | 
               | Much of the Rust ecosystem is libraries written by people
               | who work hard to think through just these types of
               | problems. They ask: "ok, we've solved memory-safety, now
               | how can we help make code dealing with this other thing
               | more ergonomic and correct by default?".
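
For readers who have not seen the generational idea, here is a hand-rolled, std-only sketch of it. This is not the API of the `generational-arena` crate linked above. `Arena`, `Slot`, `Handle`, and their methods are hypothetical, and the linear scan for a free slot is a simplification (a real implementation would more likely keep a free list).

```rust
/// Handle = slot index plus the generation it was issued under.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Handle {
    index: usize,
    generation: u64,
}

struct Slot<T> {
    generation: u64,
    value: Option<T>,
}

/// Minimal generational arena, for illustration only.
struct Arena<T> {
    slots: Vec<Slot<T>>,
}

impl<T> Arena<T> {
    fn new() -> Self {
        Arena { slots: Vec::new() }
    }

    /// Reuses a free slot if one exists, otherwise grows the Vec.
    fn insert(&mut self, value: T) -> Handle {
        if let Some(index) = self.slots.iter().position(|s| s.value.is_none()) {
            let slot = &mut self.slots[index];
            slot.value = Some(value);
            Handle { index, generation: slot.generation }
        } else {
            self.slots.push(Slot { generation: 0, value: Some(value) });
            Handle { index: self.slots.len() - 1, generation: 0 }
        }
    }

    /// Removes the value and bumps the generation, invalidating old handles.
    fn remove(&mut self, handle: Handle) -> Option<T> {
        let slot = self.slots.get_mut(handle.index)?;
        if slot.generation != handle.generation {
            return None; // stale handle: the slot has been freed since
        }
        let value = slot.value.take()?;
        slot.generation += 1;
        Some(value)
    }

    /// Returns the value only if the handle's generation still matches.
    fn get(&self, handle: Handle) -> Option<&T> {
        let slot = self.slots.get(handle.index)?;
        if slot.generation == handle.generation {
            slot.value.as_ref()
        } else {
            None
        }
    }
}
```

With this layout, a handle kept around after `remove` yields `None` from `get` even when its slot has been reused, so the stale-index scenario discussed earlier in the thread becomes a detectable lookup failure instead of a silent mix-up.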
        
               | quotemstr wrote:
               | Absolutely. If I _had_ to use an index model in Rust,
               | I'd use that kind of generational approach. I just worry
               | that people aren't going to be diligent enough to take
               | precautions like this.
        
               | screcth wrote:
               | You can still have use-after-free errors when you use
               | array indices. This can happen if you implement a way to
               | "free" elements stored in the vector. "free" should be
               | interpreted in a wide sense. There's no way for Rust to
               | prevent you from marking an array index as free and later
               | using it.
        
               | oconnor663 wrote:
               | > There's no way for Rust to prevent you from marking an
               | array index as free and later using it.
               | 
               | I 2/3rds disagree with this. There are three different
               | cases:
               | 
               | - Plain Vec<T>. In this case you just can't remove
               | elements. (At least not without screwing up the indexes
               | of other elements, so not in the cases we're talking
               | about here.)
               | 
               | - Vec<Option<T>>. In this case you can make index reuse
               | mistakes. However, this is less efficient and less
               | convenient than...
               | 
               | - SlotMap<T> or similar. This uses generational indexes
               | to solve the reuse problem, and it provides other nice
               | conveniences. The only real downside is that you need to
               | know about it and take a dependency.
        
               | throwawaymaths wrote:
               | and you now have unchecked
               | use-after-decommissioning-the-index and
               | double-decommission-the-index errors, which could be
               | security regressions
        
               | estebank wrote:
               | That's true only if you use Vec<T> instead of a
               | specialized arena, either append only, maybe growable, or
               | generational, where access invalidation is tracked for
               | you on access.
        
               | oconnor663 wrote:
               | Yeah if you go with Vec, you have to accept that you
               | can't delete anything until you're done with the whole
               | collection. A lot of programs (including basically
               | anything that isn't long running) can accept that. The
               | rest need to use SlotMap or similar, which is an easy
               | transition that you can make as needed.
        
               | oconnor663 wrote:
               | Sort of. But you still get guaranteed-unaliased
               | references when you need them. And generational indexes
               | (SlotMap etc) let you ask "has this pointer been freed"
               | instead of just hoping you never get it wrong.
        
               | simgt wrote:
               | > stop using references/lifetimes and start using indexes
               | 
               | Aren't arenas a nicer suggestion?
               | https://docs.rs/bumpalo/latest/bumpalo/
               | https://docs.rs/typed-arena/latest/typed_arena/
               | 
               | Depending on the use case, another pattern that plays
               | very nicely with Rust is the EC part of ECS:
               | https://github.com/Ralith/hecs
        
               | oconnor663 wrote:
               | Yes, Slab and SlotMap are the next stop on this train,
               | and ECS is the last stop. But a simple Vec can get you
               | surprisingly far. Most small programs never really need
               | to delete anything.
        
           | saghm wrote:
           | I'm fortunate enough not to have to often write code in other
           | languages anymore, but my experience is that writing code in
           | ways that satisfy the compiler actually ends up being code
           | I prefer anyhow. I was somewhat surprised at the first
           | example because I haven't run into something like that, but
           | it's also not really the style I would write that function
           | personally (I'm not a big fan of repetitions like having
           | `Some(x)` repeated both as a capture pattern and a return
           | value), so on a whim I tried what would have been the way I'd
           | write that function, and it doesn't trigger the same error:
           | 
           |     fn double_lookup_mut(
           |         map: &mut HashMap<String, String>,
           |         mut k: String,
           |     ) -> Option<&mut String> {
           |         map.get_mut(&k)?;
           |         k.push_str("-default");
           |         map.get_mut(&k)
           |     }
           | 
           | I wouldn't have guessed that this happened to be a way around
           | a compiler error that people might run into with other ways
           | of writing it; it just genuinely feels like a cleaner way for
           | me to implement a function like that.
        
             | khold_stare wrote:
             | Isn't that the opposite of the intended implementation? I
             | don't write Rust, but I think your implementation will
             | always return either `None` or the "fallback" value with
             | the `"-default"` key. In the article, the crucial part is
             | that if the first `map.get_mut()` succeeds, that is what is
             | returned.
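
For comparison, here is a sketch of a version with the behaviour described in this reply: return the first hit if there is one, otherwise fall back to the `"-default"` key. It uses one commonly seen workaround, probing with `contains_key` before taking the mutable borrow, at the cost of an extra hash lookup. It is not claimed to be the article's own code.

```rust
use std::collections::HashMap;

/// Returns the entry for `k` if present, otherwise the entry for `k + "-default"`.
fn double_lookup_mut(map: &mut HashMap<String, String>, mut k: String) -> Option<&mut String> {
    // Probe with a short-lived shared borrow first, so the mutable borrow
    // is only created on the path that returns immediately.
    if map.contains_key(&k) {
        return map.get_mut(&k);
    }
    k.push_str("-default");
    map.get_mut(&k)
}
```

The key difference from the snippet quoted above is that a successful first lookup is returned instead of being discarded by `?`.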
        
               | brabel wrote:
               | A great example of how "if it compiles, it runs
               | correctly" is bullshit.
        
               | unshavedyak wrote:
               | You're reaching pretty hard there. Your assertion is a
               | massive strawman, the implication seeming to be that
               | "every problem in your logic won't exist if it compiles"
               | - no one thinks you can't write bad logic in any
               | language.
               | 
               | Rather it's about a robust type system that gives you
               | tooling to cover many cases at compile time.
               | 
               | Even if we ignore logic, Rust has plenty of runtime
               | tooling and runtime issues can still happen as a result.
               | A complaint i often have about Bevy (despite loving it!)
               | is that it has a lot of runtime based plugins/etc which
               | i'd prefer to be compile time. Axum for example has a
               | really good UX while still being heavily compile time (in
               | my experience at least).
               | 
               | "If it compiles it works" is still true despite my
               | complaints. Because i don't believe the statement even
               | remotely implies you can't write bad logic or bad runtime
               | code in Rust.
        
               | oivey wrote:
               | This particular example explicitly dodges compile time
               | checking for some ad-hoc (but likely safe) runtime
               | behavior. It's not a strawman at all. It's a classic
               | example of how sometimes the compiler can't help you, and
               | even worse, how programmers can defeat their ability to
               | help you.
        
               | unshavedyak wrote:
               | Right but their statement (as i parsed it) was that the
               | "if it compiles it works" phrase is bullshit. Since
               | there's some cases where it obviously won't be true.
               | 
               | At best it's ignorant of what that phrase means, in my
               | view.
        
               | saghm wrote:
               | Honestly, I think the majority of the times I've said
               | that sentence has been after running code that has an
               | obvious mistake (like the code I posted above)!
        
               | saghm wrote:
               | Whoops, you're definitely right. This is why I shouldn't
               | try to be productive in the morning.
        
           | ajb wrote:
           | It sounds like the ideal, then, would be to detect the
           | problematic patterns _earlier_ so people wouldn't need to
           | bang their heads against it.
        
           | tonyarkles wrote:
           | I had a similar experience with Erlang/Elixir. The primary
           | codebase I work with in $DAYJOB is C++ but structured very
           | OTP-like with message passing and threads that can crash (via
           | exceptions, we're not catching defaults for example) and
           | restart themselves.
           | 
           | Because of the way we've set up the message passing and by
           | ensuring that we don't share other memory between threads
           | we've virtually eliminated most classes of concurrency bugs.
        
             | pdimitar wrote:
             | That's the main use-case of Erlang/Elixir anyway: eliminate
             | concurrency / parallelism bugs by copying data in the
             | dangerous places, and make sure not to use locks but
             | message passing instead. These two alone have eliminated
             | most of the bugs I've written in other languages.
             | 
             | So, same experience. And it taught me to be a better Golang
             | and Rust programmer, too.
        
           | ajross wrote:
           | > It turns out the way that Rust "wants" to be written is
           | overall a pretty good way for you to organize the
           | relationships between parts of a program
           | 
           | That's what it promised not to do, though! Zero cost
           | abstractions aren't zero cost when they force you into a
           | particular design. Several of the cases in the linked article
           | involve actual runtime and code size overhead vs. the obvious
           | legacy/unchecked idioms.
        
             | oconnor663 wrote:
             | > vs. the obvious legacy/unchecked idioms
             | 
             | You can go crazy with legacy/unchecked/unsafe stuff if you
             | want to in Rust. It's less convenient and more difficult
             | than C in some ways, but 1) it's also safer and more
             | convenient in other ways, and 2) "this will be safe and
             | convenient" isn't exactly the reason we dive into
             | legacy/unchecked/unsafe stuff.
             | 
             | And of course the greatest strength of the whole Rust
             | language is that folks who want to do crazy unsafe stuff
             | can package it up in a safe interface for the rest of us to
             | use.
        
               | ajross wrote:
               | > crazy unsafe stuff
               | 
               | The first example in the linked article is checking if a
               | value is stored in a container and doing something
               | different if it's not than if it is. Hardly "crazy unsafe
               | stuff".
        
               | oconnor663 wrote:
               | I think there's an important distinction here. A systems
               | programming language needs to have all the max speed /
               | minimum overhead / UB-prone stuff available, but that
               | stuff doesn't need to be the default / most convenient
               | way of doing things. Rust heavily (both syntactically and
               | culturally) encourages safe patterns that sometimes
               | involve runtime overhead, like checked indexing, but this
               | isn't the same as "forcing" you into these patterns.
        
               | ajross wrote:
               | And I can only repeat verbatim: The first example in the
               | linked article is checking if a value is stored in a
               | container and doing something different if it's not than
               | if it is.
               | 
               | Hardly "max speed / minimum overhead / UB-prone stuff"
        
           | kazinator wrote:
           | Why would you cling to some cockamamie memory management
           | model, where it is not required or enforced?
           | 
           | That's like Stockholm Syndrome.
        
         | joshka wrote:
         | Rust definitely forces you to make more deliberate changes in
         | your design. It took me about 6 months to get past hitting that
         | regularly. Once you do get past it, rust is awesome though.
        
           | brabel wrote:
           | I suppose you haven't had to refactor a large code base yet
           | just because a lifetime has to change?
        
             | merb wrote:
             | Actually, newer versions of Rust need these refactors way
             | less often, since more lifetimes can be elided, and when
             | using generic code or impl traits you can basically scrap a
             | ton of it. I still sometimes stumble upon the first example
             | though, but most often it happens because I want to
             | encapsulate everything inside the function instead of doing
             | some work outside of it.
        
             | nostradumbasp wrote:
             | Nope.
             | 
             | I have worked professionally for several years on what
             | would now be considered a legacy rust code base. Probably
             | hundreds of thousands of lines, across multiple mission
             | critical applications. Few applications need to juggle
             | lifetimes in a way that is that limiting, maybe a module
             | would need some buffing, but not a major code base change.
             | 
             | Most first pass and even refined "in production" code bases
             | I work on do not have deeply intertwined life-times that
             | require immense refactoring to cater to changes. Before
             | someone goes "oh your team writes bad code!", I would say
             | that we had no noteworthy problems with lifetimes and our
             | implementations far surpassed performance of other GC
             | languages in the areas that mattered. The company is
             | successful, built by a skeleton crew, and the success is
             | owed to an incredibly stable product that scales out really
             | well.
             | 
             | I question how many applications truly "need" that much
             | reference juggling in their designs. A couple allocations
             | or reference counted pointers go a really long way to
             | reducing cognitive complexity. We use arenas and whatever
             | else when we need them, but no I've never dealt with this
             | in a way that was an actual terrible issue.
        
         | resonious wrote:
         | Maybe I'm just brainwashed, but most of the time for me, these
         | "forced refactors" are actually a good thing in the long run.
         | 
         | The thing is, you can almost always weasel your way around the
         | borrow checker with some unsafe blocks and pointers. I tend to
         | do so pretty regularly when prototyping. And then I'll often
         | keep the weasel code around for longer than I should (as you
         | do), and almost every time it causes a very subtle, hard-to-
         | figure-out bug.
        
           | twic wrote:
           | I think the problem isn't that the forced changes are bad,
           | it's that they're lumpy. If you're doing incremental
           | development, you want to be able to quickly make a long
           | sequence of small changes. If some of those changes randomly
           | require you to turn your program inside-out, then incremental
           | development becomes painful.
           | 
           | Some people say that after a while, they learn how to
           | structure their program from the start so that these changes
           | do not become necessary. But that is also partly giving up
           | incremental development.
        
             | NotCamelCase wrote:
             | My concern is slightly different; it's the ease of
             | debugging. And I don't mean debugging the code that I (or
             | sb else) wrote, but the ability to freely modify the code
             | to kick some ideas around and see what sticks, etc. which I
             | frequently need to do, given my field.
             | 
             | As an example, consider a pointer to a const object as a
             | function param in C++: I can cast it away in a second and
             | modify it as I go on my experiments.
             | 
             | Any thoughts on this? How much of an extra friction would
             | you say is introduced in Rust?
        
               | resonious wrote:
               | I would say it's pretty easy to do similar stuff in Rust
               | to skirt the borrow checker. e.g. you can cast a mut ref
               | to a mut ptr, then back to a mut ref, and then you're
               | allowed to have multiple of them.
               | 
               | The problem is Rust (and its community) does a very good
               | job at discouraging things like that, and there are no
               | guides on how to do so (you might get lambasted for
               | writing one. maybe I should try)
        
             | stouset wrote:
             | I don't really think it gives up incremental development.
             | I've done large and small refactors in multiple Rust code
             | bases, and I've never run into one where a tiny change
             | suddenly ballooned into a huge refactor.
        
         | IshKebab wrote:
         | Yeah I think this becomes more true the closer your type system
         | gets to "formal verification" type systems. It's essentially
         | trying to prove some fact, and a single mistake anywhere means
         | it will say no. The error messages also get worse the further
         | along that scale you go (Prolog is infamous).
         | 
         | Not really unique to Rust though; I imagine you would have the
         | same experience with e.g. Lean. I have a similar experience
         | with a niche language I use that has dependent types. Kind of a
         | puzzle almost.
         | 
         | It is more work, but you get lots of rewards in return
         | (including less work overall in the long term). Ask me how much
         | time I've spent debugging segfaults in C++ and Rust...
        
           | ykonstant wrote:
           | Lean is far more punishing even for simple imperative code.
           | The following is rejected:
           | 
           |     /- Return the array of forward differences between
           |        consecutive elements of the input. Return the empty
           |        array if the input is empty or a singleton.
           |     -/
           |     def diffs (numbers : Array Int) : Array Int := Id.run do
           |       if size_ok : numbers.size > 1 then
           |         let mut diffs := Array.mkEmpty (numbers.size - 1)
           |         for index_range : i in [0:numbers.size - 2] do
           |           diffs := diffs.push (numbers[i+1] - numbers[i])
           |         return diffs
           |       else
           |         return #[]
        
           | binary132 wrote:
           | That's not what OP is discussing. OP is discussing corner
           | cases in Rust's typesystem that would be sound if the
           | typesystem were more sophisticated, but are rejected because
           | Rust's type analysis is insufficiently specific and rejects
           | blanket classes of problems that have possible valid
           | solutions, but would need deeper flow analysis, etc.
        
             | IshKebab wrote:
             | Yes I know. You get the same effect with type systems that
             | are closer to formal verification. Something you know is
             | actually fine but the prover isn't quite smart enough to
             | realise until you shift the puzzle pieces around so they
             | are just so.
        
         | amelius wrote:
         | > I suppose if you repeat this enough you learn how to write
         | code that Rust is happy with first-time.
         | 
         | But this assumes that your specifications do not change.
         | 
         | Which we know couldn't be further from the truth in the real
         | world.
         | 
         | Perhaps it's just me, but a language where you can never change
         | your mind about something is __not__ a fun language.
         | 
         | Also, my manager won't accept it if I tell him that he can't
         | change the specs.
         | 
         | Maybe Rust is not for me ...
        
           | stephenbennyhat wrote:
           | "Malum est consilium, quod mutari non potest" you might say.
        
           | oconnor663 wrote:
           | My recommendation is that you do whatever you feel like with
           | ownership when you first write the code, but then if
           | something forces you to come back and change how ownership
           | works, seriously consider switching to
           | https://jacko.io/object_soup.html.
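A minimal sketch of the index-based layout that comment points to, assuming a hypothetical `Person` type whose friends are stored as indexes into a shared `Vec` rather than as references. The type and function names are illustrative and are not taken from the linked post.

```rust
// Hypothetical example: people refer to each other by index into a shared
// Vec instead of holding references, so no lifetimes need to be threaded
// through the data structure.
#[derive(Debug)]
struct Person {
    name: String,
    friends: Vec<usize>, // indexes into the `people` Vec, not references
}

// Appends a person and returns the index that identifies them.
fn new_person(people: &mut Vec<Person>, name: &str) -> usize {
    people.push(Person { name: name.to_string(), friends: Vec::new() });
    people.len() - 1
}

// Mutates two entries of the same Vec. Each indexing expression borrows the
// Vec only for the duration of its own statement, so the two mutations
// never overlap.
fn add_friendship(people: &mut Vec<Person>, a: usize, b: usize) {
    people[a].friends.push(b);
    people[b].friends.push(a);
}

// Resolves the stored indexes back into names.
fn friend_names(people: &[Person], id: usize) -> Vec<&str> {
    people[id].friends.iter().map(|&f| people[f].name.as_str()).collect()
}
```

The trade-off is that a stale or out-of-range index panics at runtime instead of being rejected at compile time.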
        
             | amelius wrote:
             | Isn't that just reinventing the heap, but with indexes in a
             | vector instead of with addresses in memory?
        
               | oconnor663 wrote:
               | You could look at it that way. But C++ programs often use
               | similar strategies, even though they don't have to.
               | Array/Vec based layouts like this give you the option of
               | doing some very fancy high-performance stuff, and they
               | also happen to play nicely with the borrow checker.
        
               | amelius wrote:
               | It's very basic and not a general solution because the
               | lifetimes of objects are now set equal. And there's no
               | compaction, so from a space perspective it is worse than
               | a heap where the space of deleted objects can be filled
               | up by new objects. It is nice though that you can delete
               | an entire class of objects in one operation. I have used
               | this type of memory management in the context of web
               | requests, where the space could be freed when the request
               | was done.
        
           | stouset wrote:
           | I genuinely don't know where you've gotten the idea that you
           | can "never change your mind" about anything.
           | 
           | I have changed my mind plenty of times about my Rust
           | programs, both in design and implementation. And the language
           | does a damn good job of holding my hand through the process.
           | I have chosen to go through both huge API redesigns and large
           | refactors of internals and had everything "just work". It's
           | really nice.
           | 
           | If Rust were like you think it is, you're right, it wouldn't
           | be enjoyable to use. Thankfully it is nothing like that.
        
         | dinosaurdynasty wrote:
         | And in C++, those changes would likely shoot yourself in the
         | foot without warning. The borrow checker isn't some new weird
         | thing, it's a reification of the rules you need to follow to
         | not end up with obnoxious hard to debug memory/threading
         | issues.
         | 
         | But yeah, as awesome as Rust is in many ways it's not really
         | specialized to be a "default application programming language"
         | as it is a systems language, or a language for thorny things
         | that need to _work_ , as opposed to "work most of the time".
        
           | cogman10 wrote:
           | C++ allows both more incorrect and correct programs. That's
           | what can be a little frustrating about the BC. There are
           | correct programs which the BC will block and that can feel
           | somewhat limiting.
        
             | AlotOfReading wrote:
             | In most cases, those "correct" C++ programs are also buggy
             | in situations the programmer simply hasn't considered.
             | That's why the C++ core guidelines ban them and recommend
             | programs track ownership with smart pointers that obey
             | essentially the same rules as Rust. The main difference is
             | that C++ smart pointers have more overhead and a bunch of
             | implicit rules you have to read the docs to know. Rust
             | tells you in (occasionally obscure) largely helpful
             | compiler errors at the point where you've violated them,
             | rather than undefined behavior or a runtime sanitizer.
        
             | stouset wrote:
             | While this is obviously and uncontroversially true in an
             | absolute sense (the borrowck isn't perfect), I think in the
             | overwhelming majority of real-world cases its concerns are
             | either actual problems with your design or simple and well-
             | known limitations of the checker that have pretty
             | straightforward and idiomatic workarounds.
             | 
             | I haven't seen a lot of program designs in practice that
             | are sound but fundamentally incompatible with the borrow
             | checker. Every time I've thought this I've come to realize
             | there was something subtly (or not so subtly) wrong with
             | the design.
             | 
             | I have seen some contrived cases where this is true but
             | they're inevitably approaches nobody sane would actually
             | want to use anyway.
        
         | rendaw wrote:
         | Yo, everyone's interpreting parent's comment in the worst way
         | possible: assuming they're trying to do unsound refactorings.
         | There are plenty of places where a refactoring is fine, but the
         | rust analyzer simply can't verify the change (async `FnOnce`
         | for instance), gives up, and forces the user to work around it.
         | 
         | I love Rust (comparatively) but yes, this is a thing, and it's
         | bad.
        
           | joshka wrote:
           | Yeah, Rust-analyzer's palette of refactorings is woefully
           | underpowered in comparison to other languages / tooling I've
           | used (e.g. Resharper, IntelliJ). There's a pretty high
           | complexity bar to implementing these too unfortunately. I say
           | this as someone that has contributed to RA and who will
           | contribute more in the future.
        
         | redman25 wrote:
         | If you are trying overly hard to abstract things or work
         | against that language, then yes, things can be difficult to
         | refactor. Here's a few things I've found:
         | 
         | - Generics
         | 
         | - Too much Send + Sync
         | 
         | - Trying too hard to avoid cloning
         | 
         | - Writing code in an object oriented way instead of a data
         | oriented way
         | 
         | Most of these have to do with optimizing too early. It's better
         | to leave the more complex stuff to library authors or wait
         | until your data model has settled.
        
           | nostradumbasp wrote:
           | "Trying too hard to avoid cloning"
           | 
           | This is the issue I see a certain type of new rustaceans
           | struggle with. People get so used to being able to chuck
           | references around without thinking about what might actually
           | be happening at run-time. They don't realize that they can
           | clone, and even clone more than what might "look good", and
           | that it is super reasonable to intentionally make a clone,
           | and still get incredibly acceptable performance.
           | 
           | "Writing code in an object oriented way instead of a data
           | oriented way" The enterprise OOP style code habits also seem
           | to be a struggle for some but usually ends up really
           | liberating people to think about what their application is
           | actually doing instead of focusing on "what is the language
           | to describe what we want it to do".
        
         | QuadDamaged wrote:
         | When this happens to me, it's mostly because my code is written
         | with too coarse separation of concerns, or I am just mixing
         | layers
        
         | ajross wrote:
         | > A pattern in these is code that compiles until you change a
         | small thing.
         | 
         | I think that's a downstream result of the bigger problem with
         | the borrow checker: nothing is actually specified. In most of
         | the issues here, the changed "small thing" is a change in
         | control flow that is (1) obviously correct to a human reader
         | (or author) but (2) undetectable by the checker because of some
         | quirk of its implementation.
         | 
         | Rust set out too lofty a goal: the borrow checker is supposed
         | to be able to prove correct code correct, despite that being a
         | mathematically undecidable problem. So it fails, inevitably.
         | And worse, the community (this article too) regards those
         | little glitches as "just bugs". So we're treated to an endless
         | parade of updates and enhancements and new syntax trying to
         | push the walls of the language out further into the infinite
         | undecidable wilderness.
         | 
         | I've mostly given up on Rust at this point. I was always a
         | skeptic, but... it's gone too far at this point, and the
         | culture of "Just One More Syntax Rule" is too entrenched.
        
         | summerlight wrote:
         | This is because programming is not a work in a continuous
         | solution space. Think in this way; you're almost guaranteed to
         | introduce obvious bugs by randomly changing just a single
         | bit/token. Assembler, compiler, stronger type system, etc etc
         | all try to limit this by bringing a different view that is more
         | coherent to human reasoning. But computation has an inherently
         | emergent property which is hard to predict/prove at compile
         | time (see Rice's theorem), so if you want safety guarantee by
         | construction then this discreteness has to be much more
         | visible.
        
         | perrygeo wrote:
         | > make a seemingly small change, it balloons into a compile
         | error that requires large refactoring to appease the borrow
         | checker and type system
         | 
         | Same experience, but this is actually why I like Rust. In other
         | languages, the same seemingly small change could result in
         | runtime bugs or undefined behavior. After a little thought,
         | it's always obvious that the Rust compiler is 100% correct -
         | it's not a small change after all! And Rust helpfully guides me
         | through its logic and won't let my mistake slide. Thanks!
        
       | Sytten wrote:
       | I am pretty sure the example 2 doesn't work because of the move
       | and should be fixed in the next release when async closures are
       | stabilized (I am soooo looking forward to that one).
        
       | aw1621107 wrote:
       | At least based on the comments on lobste.rs [0] and /r/rust [1],
       | these seem to be actively worked on and/or will be solved Soon
       | (TM):
       | 
       | 1. Checking does not take match and return into account: I think
       | this should be addressed by Polonius?
       | https://rust.godbolt.org/z/8axYEov6E
       | 
       | 2. Being async is suffering: I think this is addressed by async
       | closures, due to be stabilized in Rust 2024/Rust 1.85:
       | https://rust.godbolt.org/z/9MWr6Y1Kz
       | 
       | 3. FnMut does not allow reborrowing of captures: I think this is
       | also addressed by async closures:
       | https://rust.godbolt.org/z/351Kv3hWM
       | 
       | 4. Send checker is not control flow aware: There seems to be
       | (somewhat) active work to address this? No idea if there are
       | major roadblocks, though.
       | https://github.com/rust-lang/rust/pull/128846
       | 
       | [0]:
       | https://lobste.rs/s/4mjnvk/four_limitations_rust_s_borrow_ch...
       | 
       | [1]:
       | https://old.reddit.com/r/rust/comments/1hjo0ds/four_limitati...
        
         | dccsillag wrote:
         | (Side note) That's odd, lobste.rs seems to be down for me, and
         | has been like that for a couple of months now -- I literally
         | cannot reach the site.
         | 
         | Is that actually just me??
         | 
         | EDIT: just tried some things, very weird stuff: curl works
         | fine. Firefox works fine. But my usual browser, Brave, does
         | not, and complains that "This site can't be reached
         | (ERR_INVALID_RESPONSE)". Very very very weird, anyone else
         | going through this?
        
           | yurivish wrote:
           | Why Brave is blocked:
           | https://github.com/lobsters/lobsters-ansible/issues/45
        
             | DaSHacka wrote:
             | Looks as though it's not currently in effect, however?
             | 
             | https://github.com/lobsters/lobsters/issues/761
             | 
             | What a trite cat-and-mouse game, though at least it's
             | entertaining to watch them try.
        
           | Permik wrote:
           | Hello from Finland, I can reach the site all fine. Hope you
           | get your connection issues sorted :)
        
           | rascul wrote:
           | Seems like lobste.rs might be blocking Brave.
           | 
           | https://news.ycombinator.com/item?id=42353473
        
           | porridgeraisin wrote:
           | They are throwing a bit of a hissy fit over brave. Change the
           | user agent or something and view the site.
        
             | ykonstant wrote:
             | Reading the facts of the situation, it seems like a
             | warranted "bit of a hissy fit".
        
               | sammy2255 wrote:
               | Disagree. Regardless of what Brave is doing you shouldn't
               | block via User Agent like this.
        
               | gs17 wrote:
               | Especially not simply making the site not load like that.
               | If you really think a browser is so bad you don't want
               | people using it, at least have it redirect to a message
               | explaining what your grievance is. Unless the browser is
               | DDoSing webpages it loads successfully, making the site
               | look broken is pretty worthless as a response.
               | 
               | EDIT: Although, it looks like they tried to do that
               | sometimes? No idea why they would switch from that
               | approach.
        
               | hitekker wrote:
               | Eh, pushcx is right to disagree with past bad decisions
               | Brave made, but I think he's conflating a few grievances
               | together. Someone tried to reason with him on that front:
               | https://lobste.rs/s/iopw1d/what_s_up_with_lobste_rs_blocking...
               | 
               | I sense hidden ideology, but it's his community to own,
               | not mine.
        
               | coffeeling wrote:
               | It's not just that he disagrees with the things on an
               | object-level, what-happened basis. He actively reads malice
               | into every misstep to paint the organization as abusive.
        
         | DylanSp wrote:
         | Case 1 is definitely addressed by the Polonius-related work.
         | There's a post [1] on the official Rust blog from 2023 about
         | that, and this post [2] from Niko Matsakis' blog in June 2024
         | mentions that they were making progress on it, though the
         | timeline has stretched out.
         | 
         | [1]:
         | https://blog.rust-lang.org/inside-rust/2023/10/06/polonius-u...
         | 
         | [2]:
         | https://smallcultfollowing.com/babysteps/blog/2024/06/02/the...
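For readers who have not seen case 1, here is a sketch of roughly the shape being discussed, together with one common workaround. The function name, the `HashMap<String, String>` type, and the "-default" key suffix are assumptions based on how the example is described in this thread, not a copy of the article's code. The rejected variant is left in a comment so the block compiles.

```rust
use std::collections::HashMap;

// Typically rejected by the current borrow checker: because the `Some` arm
// returns the borrow to the caller, `map` is treated as still mutably
// borrowed in the `None` arm as well.
//
// fn get_or_default<'a>(
//     map: &'a mut HashMap<String, String>,
//     key: &str,
// ) -> Option<&'a mut String> {
//     match map.get_mut(key) {
//         Some(v) => Some(v),
//         None => map.get_mut(&format!("{key}-default")),
//     }
// }

// Workaround sketch: pick the key using a shared borrow first, then take a
// single mutable borrow in each branch. This costs an extra hash lookup.
fn get_or_default<'a>(
    map: &'a mut HashMap<String, String>,
    key: &str,
) -> Option<&'a mut String> {
    if map.contains_key(key) {
        map.get_mut(key)
    } else {
        let fallback = format!("{key}-default");
        map.get_mut(fallback.as_str())
    }
}
```

The extra lookup is the kind of runtime overhead ajross refers to earlier in the thread; the Polonius work mentioned above is intended to make the commented-out form acceptable.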
        
       | joshka wrote:
       | One approach to solving item 1 is to think about the default as
       | not being a separate key to the HashMap, but being a part of the
       | value for that key, which allows you to model this a little more
       | explicitly:
       | 
       |     struct WithDefault<T> {
       |         value: Option<T>,
       |         default: Option<T>,
       |     }
       | 
       |     struct DefaultMap<K, V> {
       |         map: HashMap<K, WithDefault<V>>,
       |     }
       | 
       |     impl<K: Eq + Hash, V> DefaultMap<K, V> {
       |         fn get_mut(&mut self, key: &K) -> Option<&mut V> {
       |             let item = self.map.get_mut(key)?;
       |             item.value.as_mut().or_else(|| item.default.as_mut())
       |         }
       |     }
       | 
       | Obviously this isn't a generic solution to splitting borrows
       | though (which is covered in
       | https://doc.rust-lang.org/nomicon/borrow-splitting.html)
        
         | twic wrote:
         | The article makes the 'default' key with push_str("-default"),
         | and given that, your approach should work. But i think that's a
         | placeholder, and a bit of an odd one - i think it's more likely
         | to see something like (pardon my rusty Rust) k = if let
         | Some((head, _)) = k.split_once("_") { head.to_owned() } else {
         | k } - so for example a lookup for "es_MX" will default to "es".
         | I don't think your approach helps there.
        
           | joshka wrote:
           | Yeah, true. But that (assuming you're saying give me es_MX if
           | it exists otherwise es) has a similar possible solution.
           | Model your Language and variants hierarchically rather than
           | flat. So languages.get("es_MX") becomes
           | 
           |     let language = languages.get_language("es");
           |     let variant = language.get_variant("MX");
           | 
           | There are probably other more general cases where this can't
           | be fixed (but there are some internal changes to the rules
           | mentioned in other parts of this thread, somewhere on
           | here/reddit/lobsters).
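One way the hierarchical idea could look, as a sketch under stated assumptions: the `Language` and `Languages` types, their fields, and the two-argument `get_mut` are hypothetical and only loosely follow the `get_language` / `get_variant` calls above. The point it illustrates is that the fallback lives in a different field from the variant map, so the two borrows are disjoint.

```rust
use std::collections::HashMap;

// Hypothetical hierarchical model: each language owns its default text and
// a map of regional variants.
struct Language {
    default: String,
    variants: HashMap<String, String>,
}

struct Languages {
    by_code: HashMap<String, Language>,
}

impl Languages {
    // Looks up e.g. ("es", "MX"), falling back to the "es" default when the
    // variant is missing. Returns None only if the language is unknown.
    fn get_mut(&mut self, code: &str, variant: &str) -> Option<&mut String> {
        let language = self.by_code.get_mut(code)?;
        match language.variants.get_mut(variant) {
            Some(text) => Some(text),
            // `default` is a different field from `variants`, so this arm
            // is accepted even though `variants` still counts as borrowed.
            None => Some(&mut language.default),
        }
    }
}
```

This avoids the second hash lookup, but only because the fallback can be reached through a separate field; it does not help when the fallback is another entry in the same map.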
        
       | germandiago wrote:
       | The limitation is the borrow checker itself. I think it restricts
       | too much. clang implements lifetimebound, for example, which is
       | not viral all the way down and solves some typical use cases.
       | 
       | I find that relying on values and restricted references and when
       | not able to do it, in smart pointers, is a good trade-off.
       | 
       | Namely, I find the borrow-checker too restrictive given there are
       | alternatives, even if not zero cost in theory. After all, the
       | 80/20 rule helps here also.
        
         | lumost wrote:
         | Using value types for complex objects will wreck performance.
         | Why not just use a GCd language at that point?
        
           | mjburgess wrote:
           | Given the amount of cloning and Arcs in typical Rust code,
           | it just seems to be an exercise in writing illegible Go.
        
             | binary132 wrote:
             | Ironically Go has pretty clean and straightforward
             | guarantees about when heap allocation happens and how to
             | avoid gc.
        
           | CyberDildonics wrote:
           | This is not true, the heavy data will be on the heap and you
           | can move the values around. It actually works out very well.
        
         | pornel wrote:
         | A borrow checker that isn't "viral all the way down" allows
         | use-after-free bugs. Pointers don't stop being dangling just
         | because they're stashed in a deeply nested data structure or
         | passed down in a way that [[lifetimebound]] misses. If a
         | pointer has a lifetime limited to a fixed scope, that limit
         | _has to_ follow it everywhere.
         | 
         | The borrow checker is fine. I usually see novice Rust users
         | create a "viral" mess for themselves by confusing Rust
         | references with general-purpose pointers or reference types in
         | GC languages.
         | 
         | The worst case of that mistake is putting temporary references
         | in structs, like `struct Person<'a>`. This feature is
         | incredibly misunderstood. I've heard people insist it is
         | necessary for performance, even when their code actually
         | returned an address of a local variable (which is a bug in C
         | and C++ too).
         | 
         | People want to avoid copying, so they try to store data "by
         | reference", but Rust's references don't do that! They exist to
         | _forbid_ storing data. Rust has other reference types (smart
         | pointers) like Box and Arc that exist to store by reference,
         | and can be moved to avoid copying.
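A small sketch of the distinction drawn in that comment, using hypothetical `PersonRef` and `Person` types: the first holds a temporary borrow and carries a lifetime parameter, the second stores its data behind an `Arc` and can be moved or cloned freely.

```rust
use std::sync::Arc;

// Holds a temporary borrow: every PersonRef is tied to the scope of
// whatever owns the string, and the `'a` parameter spreads to its users.
struct PersonRef<'a> {
    name: &'a str,
}

// Stores the name "by reference" without a lifetime parameter: the Arc owns
// the heap data, and moving or cloning it copies a pointer, not the text.
#[derive(Clone)]
struct Person {
    name: Arc<str>,
}

fn make_person(name: &str) -> Person {
    Person { name: Arc::from(name) }
}

// Fine: the returned value owns its data, so it can outlive `local`.
fn load_person() -> Person {
    let local = String::from("Ferris");
    make_person(&local)
}

// The borrowed equivalent does not compile, because it would hand out the
// address of a local variable:
//
// fn load_person_ref<'a>() -> PersonRef<'a> {
//     let local = String::from("Ferris");
//     PersonRef { name: &local }
// }
```

`PersonRef` is still useful as a short-lived view, for example `PersonRef { name: &person.name }` inside a function that does not keep it.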
        
       | the__alchemist wrote:
       | The one I run into most frequently: Passing field A mutably, and
       | field B immutably, from the same struct to a function. The naive
       | fix is, unfortunately, a clone. There are usually other ways as
       | well that result in verbosity.
        
         | bionhoward wrote:
         | Could you change the function to accept the whole struct and
         | make it mutate itself internally without external mutable
         | references?
        
           | the__alchemist wrote:
           | Yes. Note that this requires a broader restructure that may
           | make the function unusable in other contexts.
        
             | Someone wrote:
             | Also only _if it is under your control_. If it's in the OS
             | or a third-party library, you can't change the API.
        
         | pitaj wrote:
         | What? I think this just works...
         | 
         | https://play.rust-lang.org/?version=stable&mode=debug&editio...
        
           | estebank wrote:
           | The borrow checker is smart enough to track disjoint field
           | borrows individually and detect that's fine, but if you have
           | two _methods_ that each return a borrow of a single field, there's
           | no way of communicating to the compiler that it's not
           | borrowing the entire struct. This is called "partial
           | borrows", the syntax is not decided, and would likely only
           | work on methods of the type itself and not traits (because
           | trait analysis doesn't need to account for which impl you're
           | looking at, and partial borrows would break that).
           | 
           | The solution today is to either change the logic to keep the
           | disjoint access in one method, provide a method that
           | returns a tuple of sub-borrows, have a method that takes the
           | fields as arguments, or use internal mutability.
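           | 
           | A rough sketch of two of those options, assuming a
           | hypothetical Player struct, a parts_mut method that returns
           | a tuple of sub-borrows, and an add_bonus function that takes
           | the fields as arguments:

```rust
// Hypothetical struct for illustration.
struct Player {
    name: String,
    score: u32,
}

impl Player {
    // One `&mut self` call hands out both sub-borrows as a tuple.
    fn parts_mut(&mut self) -> (&str, &mut u32) {
        (self.name.as_str(), &mut self.score)
    }
}

// Takes the fields as separate arguments rather than the whole struct.
fn add_bonus(name: &str, score: &mut u32) {
    *score += name.len() as u32;
}

fn main() {
    let mut p = Player { name: String::from("ann"), score: 1 };

    // Direct field access: the borrow checker sees two disjoint borrows.
    add_bonus(&p.name, &mut p.score);

    // Through a method: a single call returns both borrows.
    let (name, score) = p.parts_mut();
    add_bonus(name, score);

    println!("{} has {}", p.name, p.score);
}
```

           | With two separate accessor methods instead (say, name(&self)
           | and score_mut(&mut self)), the same call would be rejected,
           | because each method borrows all of p.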
        
             | pitaj wrote:
             | Ah, it wasn't clear from what they wrote that this is what
             | they meant.
        
         | duped wrote:
         | The fix is destructuring
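         | 
         | A small sketch of what destructuring looks like here, assuming
         | a hypothetical Counter struct and record function. Splitting
         | self into its fields gives two independent borrows that can be
         | passed on together:

```rust
// Hypothetical struct for illustration.
struct Counter {
    label: String,
    hits: u32,
}

// Reads one field and mutates the other.
fn record(label: &str, hits: &mut u32) -> String {
    *hits += 1;
    format!("{}: {}", label, *hits)
}

impl Counter {
    fn bump(&mut self) -> String {
        // Destructure `&mut self` into one borrow per field.
        let Counter { label, hits } = self;
        record(label, hits)
    }
}

fn main() {
    let mut c = Counter { label: String::from("requests"), hits: 0 };
    println!("{}", c.bump());
    println!("{}", c.bump());
    println!("{} total for {}", c.hits, c.label);
}
```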
        
         | pornel wrote:
         | The problem is typically caused by &mut self methods
         | exclusively borrowing _all_ of self.
         | 
         | I wish this not-a-proposal turned into a real proposal:
         | 
         | https://smallcultfollowing.com/babysteps//blog/2021/11/05/vi...
        
       | kra34 wrote:
       | "Normally you'd return &str rather than &String, but I'm using
       | String here for the sake of simplicity and clarity."
       | 
       | Yeah, I think I'm going to skip Rust entirely.
        
         | brabel wrote:
         | The reason for that is simple though: &String converts to &str,
         | but not the other way around... so you should always use &str
         | so that your code works with either, and notice that literal
         | strings are &str. I think Rust has lots of warts, but I don't
         | see this as one of them (at least it's something you get
         | irritated at only once, but then never have problems with).
        
           | wat10000 wrote:
           | I'm barely familiar with rust and forgot about this aspect,
           | if I ever knew it.
           | 
           | Seems pretty sensible though. String is dynamic data on the
           | heap that you own and can modify. str is some data somewhere
           | that you can't modify.
           | 
           | C has this distinction as well. Of course, in typical C
           | fashion, the distinction isn't expressed in the type system
           | in any way. Instead, you just have to know that this char* is
           | something you own and can modify and that char* is just a
           | reference to some data.
           | 
           | Higher level languages typically unify these ideas and handle
           | the details for you, but that's not rust's niche.
        
             | Arnavion wrote:
             | >String is dynamic data on the heap that you own and can
             | modify. str is some data somewhere that you can't modify.
             | 
             | This is not the definition. You can modify both. Being able
             | to modify something depends on whether you can do something
             | with a &mut reference to it, and both &mut String and &mut
             | str provide methods for modifying them.
             | 
             | The difference between the two types is just that String
             | owns its allocation while str doesn't. So modifying a
             | String is allowed to change its bytes as well as add and
             | remove bytes, the latter because the String owns its
             | allocation. Modifying a str only allows changing its bytes.
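             | 
             | A short sketch of that difference, assuming two
             | hypothetical helpers. shout needs only &mut str because
             | it changes bytes in place; shout_and_extend needs &mut
             | String because it appends:

```rust
// Needs only `&mut str`: changes bytes in place, the length stays the same.
fn shout(s: &mut str) {
    s.make_ascii_uppercase();
}

// Needs `&mut String`: appending may grow or reallocate the buffer,
// which only the owner of the allocation can do.
fn shout_and_extend(s: &mut String) {
    s.make_ascii_uppercase();
    s.push('!');
}

fn main() {
    let mut text = String::from("hello");
    shout(&mut text); // `&mut String` coerces to `&mut str`
    println!("{text}");
    shout_and_extend(&mut text);
    println!("{text}");
}
```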
        
         | ninkendo wrote:
         | Every now and then I worry about the rust ecosystem growing too
         | fast and there being too many JavaScript expats flooding cargo
         | with useless code and poorly thought out abstractions, etc...
         | 
         | Thank you for reminding me that most people don't have the
         | patience to even learn something that makes them think even the
         | tiniest bit. Most of the JavaScript people won't even get past
         | a hello world program. I think we're mostly safe.
        
           | hu3 wrote:
           | rust community hubris at its finest.
           | 
           | Still, I find Scala and Haskell community more elegant and
           | intellectually superior when it comes to gatekeeping.
        
         | jeroenhd wrote:
         | The difference is very minor when interoperating with methods,
         | but the performance gains of this dual string system are often
         | worth it.
         | 
         | &str is basically a C string allocated on the stack while
         | String is like a Java string, an object on the heap with a
         | reference to a raw string hidden from plain sight. To avoid
         | unnecessary and unintended allocations and other expensive
         | memory operations, operating on &str is usually preferred for
         | performance reasons.
         | 
         | String almost transparently casts down to &str so in practice
         | you rarely care about the difference when calling library code.
         | 
         | If you're coming from a language that doesn't have a
         | distinction between character arrays and string objects, you're
         | probably fine just using &str.
         | 
         | If you're coming from a higher level language like JS or
         | Python, you're probably used to paying the performance price
         | for heap allocation anyway so you might as well use String in
         | Rust as well and only start caring when performance is
         | affected.
        
           | ninkendo wrote:
           | &str doesn't mean stack-allocated. It's just a pointer [0]
           | (and a len) to a section of memory that's (required to be)
           | legal utf-8.
           | 
           | A &str can point at stack memory or heap memory (usually the
           | latter, since it's common for them to point to a String,
           | which allocates on the heap), or static memory.
           | 
           | But yeah, String keeps things simple, and when in doubt just
           | use it... but if you want to understand it more, it's better
           | to think of who "owns" the data.
           | 
           | Take a String when you need to build something that needs to
           | own it, like if you're building a struct out of them, or
           | store them in a hash map or something. Because maybe a caller
           | already "owns" the string and is trying to hand over
           | ownership, and you can avoid the clone if it's just passed by
           | move.
           | 
           | If you're only using the string long enough to read it and do
           | something based on it (but don't want to own it), take a
           | &str, and a caller can be flexible about how it produces that (a
           | &'static str, a String ref, a substring, etc.)
           | 
           | The example that always works for me as a way to remember is
           | to think of HashMap.
           | 
           | HashMap.get takes a reference for the key (analogous to
           | &str), because it's only using your reference long enough to
           | compare to its keys and see which one matches.
           | 
           | HashMap.insert takes a value for the key (analogous to
           | String) because it needs to own the key and store it in the
           | table.
           | 
           | HashMap.insert _could_ take a reference, but then it'd have
           | to clone it, which means you'd miss out on the opportunity to
           | more cheaply move the key (which is a simple memcpy) instead
           | of calling clone() (which often does more calls to clone and
           | can be complicated)... and it would only support cloneable
           | keys.
           | 
           | [0] yeah yeah, a reference, not a pointer, but the point is
           | it "points to" a place in memory, which may be heap, stack,
           | static, anything.
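           | 
           | A sketch of the same rule applied to your own API, assuming
           | a hypothetical Registry wrapper around a HashMap. register
           | takes a String because it stores the key; age_of takes a
           | &str because it only reads it:

```rust
use std::collections::HashMap;

// Hypothetical wrapper for illustration.
struct Registry {
    ages: HashMap<String, u32>,
}

impl Registry {
    // Stores the key, so it takes ownership: callers can move a String in.
    fn register(&mut self, name: String, age: u32) {
        self.ages.insert(name, age);
    }

    // Only compares against stored keys, so a borrowed `&str` is enough.
    fn age_of(&self, name: &str) -> Option<u32> {
        self.ages.get(name).copied()
    }
}

fn main() {
    let mut registry = Registry { ages: HashMap::new() };

    let name = String::from("ada");
    registry.register(name, 36); // moved in, no clone

    let key = String::from("ada");
    println!("{:?}", registry.age_of("ada")); // string literal
    println!("{:?}", registry.age_of(&key)); // reference to a String
    println!("{:?}", registry.age_of(&key[..1])); // substring, no match
}
```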
        
           | umanwizard wrote:
           | str can be allocated on the stack, or heap, or static
           | storage.
        
         | wrs wrote:
         | If you thought that was confusing, you'll definitely want to
         | skip C++ too!
        
         | robot_no_421 wrote:
         | The difference between a heap allocated string (String), a
         | static string literal embedded in the binary (&str), and a
         | stack allocated string ([char], but this is more common in C
         | than Rust) is the simplest introduction to manually managed
         | memory.
         | 
         | The complications have nothing to do with Rust but with how
         | computers manage and allocate memory. You might as well also
         | skip C, C++, Zig, and every other language which gives you
         | fine-tuned access to the stack and heap, because you'll run
         | into the same concept.
        
           | ninkendo wrote:
           | Nit: A &str doesn't mean it has to be static, a &'static str
           | does (which are a subset of &str). A &str can easily point to
           | a dynamic String's heap storage too.
        
           | umanwizard wrote:
           | str doesn't have to be embedded in the binary. It can be
           | that, or it can be on the heap, or it can be on the stack.
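           | 
           | A quick sketch of all three cases, assuming a hypothetical
           | first_word function. It takes &str and does not care where
           | the bytes live:

```rust
use std::str;

// Works on any `&str`, regardless of where the bytes are stored.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

fn main() {
    // Bytes embedded in the binary.
    let literal: &'static str = "static storage";

    // Bytes in a String's heap allocation.
    let owned = String::from("heap storage");

    // Bytes in a local array on the stack.
    let bytes: [u8; 13] = *b"stack storage";
    let on_stack: &str = str::from_utf8(&bytes).expect("valid UTF-8");

    for s in [literal, owned.as_str(), on_stack] {
        println!("{}", first_word(s));
    }
}
```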
        
         | umanwizard wrote:
         | Why? str and String are different things, why shouldn't they be
         | different types?
        
       | divs1210 wrote:
       | Easy to write bugs in unsafe languages like C / C++.
       | 
       | Rust makes memory management explicit, hence eliminating those
       | bugs. But it also shows how hard memory management actually is.
       | 
       | Systems programming languages like this should be used sparingly,
       | only for stuff like device drivers, OSs and VMs.
       | 
       | Any general purpose programming language should be garbage
       | collected.
        
       | Animats wrote:
       | My big complaint about Rust's borrow checking is that back
       | references need to be handled at compile time, somehow.
       | 
       | A common workaround is to put items in a Vec and pass indices
       | around. This doesn't fix the problem. It just escapes lifetime
       | management. Lifetime errors then turn into index errors,
       | referencing the wrong object. I've seen this three times in Rust
       | graphics libraries. Using this approach means writing a reliable
       | storage allocator to allocate array slots. Ad-hoc storage
       | allocators are often not very good.
       | 
       | I'm currently fixing some indexed table code like that in a
       | library crate. It crashes about once an hour, and has been doing
       | that for four years now. I found the bug, and now I have to come
       | up with a conceptually sound fix, which turns out to be a sizable
       | job. This is Not Fun.
       | 
       | Another workaround is Arc<Mutex<Thing>> everywhere. This can
       | result in deadlocks and memory leaks due to circularity. Using
       | strong links forward and weak links back works better, but
       | there's a lot of reference counting going on. For the non-
       | threaded case, Rc<RefCell<Thing>>, with .borrow() and
       | .borrow_mut(), it looks possible to do that analysis at compile
       | time. But that would take extensions to the borrow checker. The
       | general idea is that if the scope of .borrow() results of the
       | same object don't nest, they're safe. This requires looking down
       | the call chain, which is often possible to do statically.
       | Especially if .borrow() result scopes are made as small as
       | possible. The main objection to this is that checking may have to
       | be done after expanding generics, which Rust does not currently
       | do. Also, it's not clear how to extend this to the Arc multi-
       | threaded case.
       | 
       | Then there are unsafe approaches. The "I'm so cool I don't have
       | to write safe code" crowd. Their code tends to be mentioned in
       | bug reports.
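       | 
       | A minimal sketch of the single-threaded workaround described
       | above, assuming a hypothetical Node type: strong links forward,
       | weak links back, with RefCell checking borrows at run time
       | rather than at compile time.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical tree node for illustration.
struct Node {
    name: String,
    parent: RefCell<Weak<Node>>,      // weak back-link: no ownership cycle
    children: RefCell<Vec<Rc<Node>>>, // strong forward links
}

fn new_node(name: &str, parent: Weak<Node>) -> Rc<Node> {
    Rc::new(Node {
        name: name.to_string(),
        parent: RefCell::new(parent),
        children: RefCell::new(Vec::new()),
    })
}

fn add_child(parent: &Rc<Node>, name: &str) -> Rc<Node> {
    let child = new_node(name, Rc::downgrade(parent));
    parent.children.borrow_mut().push(Rc::clone(&child));
    child
}

// Follows the back-link; returns None if the parent has been dropped.
fn parent_name(node: &Node) -> Option<String> {
    node.parent.borrow().upgrade().map(|p| p.name.clone())
}

fn main() {
    let root = new_node("root", Weak::new());
    let leaf = add_child(&root, "leaf");
    println!("{:?}", parent_name(&leaf));

    // Overlapping borrows are detected at run time, not compile time.
    let guard = root.children.borrow();
    println!("conflict: {}", root.children.try_borrow_mut().is_err());
    drop(guard);
    println!("conflict: {}", root.children.try_borrow_mut().is_err());
}
```

       | Calling borrow_mut() instead of try_borrow_mut() while the
       | guard is alive would panic, which is the run-time cost the
       | static analysis above is meant to remove.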
        
         | lalaithion wrote:
         | https://docs.rs/refbox/latest/refbox/
        
           | Animats wrote:
           | Neat. It's still run-time checking. A good idea, though. The
           | one-owner, N users case is common. The trick is checking that
           | the users don't outlive the owner.
        
         | recursivecaveat wrote:
         | Yes the "fake pointer" pattern is a key survival strategy.
         | Another one I use often is the command pattern. You borrow a
         | struct to grab some piece of data, based on it you want to
         | modify some other piece of the struct, but you can't because
         | you have that first immutable borrow still. So you return a
         | command object that expresses the mutation you want, back up
         | the call stack until you're free to acquire a mutable reference
         | and execute the mutation as the command instructs. Very verbose
         | to use frequently, but often good for overall structure for key
         | elements.
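         | 
         | A small sketch of that command pattern, assuming a
         | hypothetical Inventory struct and Command enum. plan only
         | reads and returns the mutation it wants; apply runs it once a
         | mutable reference is available:

```rust
// Hypothetical types for illustration.
struct Inventory {
    items: Vec<u32>,
    total_removed: u32,
}

#[derive(Debug, PartialEq)]
enum Command {
    Remove(usize),
    Nothing,
}

// Read-only pass: inspects the data and describes the desired mutation.
fn plan(inv: &Inventory, limit: u32) -> Command {
    match inv.items.iter().position(|&qty| qty > limit) {
        Some(index) => Command::Remove(index),
        None => Command::Nothing,
    }
}

// Mutating pass: runs after the immutable borrow has ended.
fn apply(inv: &mut Inventory, cmd: Command) {
    match cmd {
        Command::Remove(index) => {
            let qty = inv.items.remove(index);
            inv.total_removed += qty;
        }
        Command::Nothing => {}
    }
}

fn main() {
    let mut inv = Inventory { items: vec![1, 50, 3], total_removed: 0 };
    let cmd = plan(&inv, 10);
    println!("{cmd:?}");
    apply(&mut inv, cmd);
    println!("{:?}, removed {}", inv.items, inv.total_removed);
}
```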
        
           | Animats wrote:
           | Yes. Workarounds in this area exist, but they are all major
           | headaches.
        
         | saurik wrote:
         | > A common workaround is to put items in a Vec and pass indices
         | around. This doesn't fix the problem. It just escapes lifetime
         | management. Lifetime errors then turn into index errors,
         | referencing the wrong object.
         | 
         | That people seriously are doing this is so depressing... if you
         | build what amounts to a VM inside of a safe language so you can
         | do unsafe things, you have at best undermined the point of the
         | safe language and at worst disproved the safe language is
         | sufficient.
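         | 
         | For illustration, a rough sketch of the index approach with a
         | generation counter added, assuming hypothetical Arena and
         | Handle types. This is a swapped-in technique (generational
         | indices), not something from the article. It turns a stale
         | index into a detectable None, but the check still happens at
         | run time:

```rust
// Hypothetical handle: an index plus the generation it was issued for.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Handle {
    index: usize,
    generation: u32,
}

struct Slot<T> {
    generation: u32,
    value: Option<T>,
}

struct Arena<T> {
    slots: Vec<Slot<T>>,
}

impl<T> Arena<T> {
    fn new() -> Self {
        Arena { slots: Vec::new() }
    }

    fn insert(&mut self, value: T) -> Handle {
        // Reuse the first free slot, if any (a linear scan keeps the sketch short).
        let free = self.slots.iter().position(|slot| slot.value.is_none());
        match free {
            Some(index) => {
                let slot = &mut self.slots[index];
                slot.generation += 1; // invalidates handles to the old occupant
                slot.value = Some(value);
                Handle { index, generation: slot.generation }
            }
            None => {
                self.slots.push(Slot { generation: 0, value: Some(value) });
                Handle { index: self.slots.len() - 1, generation: 0 }
            }
        }
    }

    fn remove(&mut self, handle: Handle) -> Option<T> {
        let slot = self.slots.get_mut(handle.index)?;
        if slot.generation != handle.generation {
            return None;
        }
        slot.value.take()
    }

    fn get(&self, handle: Handle) -> Option<&T> {
        let slot = self.slots.get(handle.index)?;
        if slot.generation == handle.generation {
            slot.value.as_ref()
        } else {
            None
        }
    }
}

fn main() {
    let mut arena = Arena::new();
    let old = arena.insert("first");
    arena.remove(old);
    let new = arena.insert("second"); // reuses the same slot

    // A bare index would now silently refer to "second".
    println!("{:?} {:?}", arena.get(old), arena.get(new));
}
```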
        
           | Animats wrote:
           | That's a good way to put it. I'll keep that in mind when
           | trying to convince the Rust devs.
        
       | rolandrodriguez wrote:
       | I didn't get past the first limitation before my brain started
       | itching.
       | 
       | Wouldn't the approach there be to avoid mutating the same string
       | (and thus reborrowing Map) in the first place? I'm likely missing
       | something from the use case but why wouldn't this work?
       | 
       |     // Construct the fallback key separately
       |     let fallback = format!("{k}-default");
       | 
       |     // Use or_else() to avoid a second explicit
       |     // `if map.contains_key(...)`
       |     map.get_mut(k).or_else(|| map.get_mut(&fallback))
        
         | css wrote:
         | Yes, or use the entry API:
         | https://doc.rust-lang.org/beta/std/collections/hash_map/enum...
        
         | aoeusnth1 wrote:
         | This creates the fallback before knowing that you'll need it.
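         | 
         | A sketch of one way to defer that work, assuming a
         | HashMap<String, String> and a hypothetical get_or_fallback
         | helper. The get_mut call sits inside the branch that returns,
         | so no borrow is held when the fallback key is built and looked
         | up, at the cost of hashing k twice on a hit:

```rust
use std::collections::HashMap;

// Hypothetical helper for illustration.
fn get_or_fallback<'a>(
    map: &'a mut HashMap<String, String>,
    k: &str,
) -> Option<&'a mut String> {
    // The mutable borrow is only taken on the path that returns it.
    if map.contains_key(k) {
        return map.get_mut(k);
    }
    // Built only when the primary key is missing.
    let fallback = format!("{k}-default");
    map.get_mut(&fallback)
}

fn main() {
    let mut map = HashMap::new();
    map.insert("a".to_string(), "1".to_string());
    map.insert("b-default".to_string(), "2".to_string());

    if let Some(value) = get_or_fallback(&mut map, "b") {
        value.push('!'); // mutates the fallback entry in place
    }
    println!("{:?}", map.get("b-default"));
}
```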
        
       ___________________________________________________________________
       (page generated 2024-12-24 23:01 UTC)