[HN Gopher] Making C++ safe without borrow checking, reference c...
___________________________________________________________________
Making C++ safe without borrow checking, reference counting, or
tracing GC
Author : jandeboevrie
Score : 173 points
Date : 2023-06-23 16:04 UTC (6 hours ago)
(HTM) web link (verdagon.dev)
(TXT) w3m dump (verdagon.dev)
| diabllicseagull wrote:
| lost me at the unordered map
| oleganza wrote:
| The reason I use Rust is because I can bypass all this messy
| business altogether and have my sensible patterns wrapped in a
| usable syntax and enforced by the compiler out of the box.
|
| Whenever people say "just follow these rules" I read "just add
| this extra mental burden and do not slip up". Computers were
| invented to automate things. Rust automates ownership and
| borrowing rules. Suggestions like "do not forget to initialize
| unique_ptr with something" are not intelligent solutions.
| kubb wrote:
| it's not about making C++ memory safe, but about describing a
| safe subset of C++
| pjmlp wrote:
| Ideally we would have -fsafe and [[unsafe]], but it will take
| years for something like that.
| derefr wrote:
| Presuming syntax for "unsafe" that gracefully degrades in
| non-aware compilers, why couldn't a particular compiler start
| doing it right now, starting with a very trivial safety
         | checker that can be iteratively improved upon once the
| framework is in place?
| eslaught wrote:
| I feel like D has gone this route of incrementally adding
| features (like borrow checking) to the language that, in
| principle, improve safety.
|
| I wonder if anyone here has more experience to know how
| well it has worked?
|
| One massive advantage of Rust is that they started with
| borrow checking from the beginning. I think one thing that
| often gets understated in these discussions is how much it
| matters to have your entire ecosystem using a set of safe
| abstractions. This is a major drag for C++, and I suspect
| that even if the language went a route like D they'd still
| have gaping safety holes in practical, everyday usage.
| pjmlp wrote:
             | It still hasn't. That has unfortunately been a common
             | theme in D's evolution: chasing the next big idea that
             | will finally bring folks into D, while leaving the
             | previous ones half implemented and buggy.
|
| So now there is GC and @nogc, lifetimes but not quite,
| scoped pointers, scoped references,... while Phobos and
| ecosystem aren't in a state to fully work across all
| those variations.
| pjmlp wrote:
| You can have it today on Circle, but its relationship with
| some C++ folks is complicated.
| bluGill wrote:
| It is easy to say add unsafe. However the details are very
| complex. I've read a few of the papers proposing something
| like this, and they spend a lot of time discussing some
| nasty details that are important to get right.
| rdtsc wrote:
       | In Rule 3:
       |
       |     struct Ship { int fuel; };
       |     void print(Ship* ship) { cout << ship.fuel << endl; }
|
| Should that be "ship->fuel" instead?
| MagicMoonlight wrote:
| Deleting and re-adding each item from an array every time you use
| something seems like a massive pain
| winrid wrote:
| > "We'll instead take and return the vector directly"
|
| Won't this clone it?
| rbancroft wrote:
| Not necessarily, although it's a bit complicated to understand
| in C++.
|
| Starting with C++17, there is a feature called guaranteed copy
| elision that works for many/most scenarios that you would want.
| You need to read through the following resources to understand
| it fully:
|
| https://en.cppreference.com/w/cpp/language/copy_elision
| https://en.cppreference.com/w/cpp/language/value_category
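         |
         | To make it concrete, here is a minimal sketch of the pattern
         | the article describes, with hypothetical names: a named
         | parameter returned by value is not elided, but it is
         | implicitly moved, so no element-wise copy of the vector is
         | made (and a returned prvalue is elided outright since C++17).
         |
         |     #include <utility>
         |     #include <vector>
         |
         |     std::vector<int> addShip(std::vector<int> ships, int fuel) {
         |         ships.push_back(fuel);
         |         return ships;  // implicit move, not a deep copy
         |     }
         |
         |     int main() {
         |         std::vector<int> ships;
         |         ships = addShip(std::move(ships), 42);  // move in, move out
         |     }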
| spoiler wrote:
| > Not necessarily, although it's a bit complicated to
| understand in C++.
|
| One could say this statement applies to most lines of C++
| code. Lol
| masklinn wrote:
| Copy elision exists, the author might just assume (or know)
| it'll trigger. The rules are way too arcane for me so I could
| not tell.
| azakai wrote:
| There is also Type-After-Type:
|
| https://dl.acm.org/doi/10.1145/3274694.3274705
|
| (though maybe that's covered by what the author meant by
| "arenas").
| imtringued wrote:
| RIP all the modern languages that haven't made any improvements
| in memory management at all.
|
| There is so much low hanging fruit in programming language design
| and nobody is picking it up and instead everyone produces
| marginal improvements over existing languages.
| antonvs wrote:
| Because implementing a new language and getting it to wide
| adoption is an enormously challenging task, with a much lower
| success rate than e.g. SV startups.
|
| Languages that try to implement one new bright idea don't go
| anywhere, because that's not enough to cause people to switch.
| At best they serve as examples for feature adoption in other
| languages.
|
| Look at Rust for example: it seems to be succeeding and gaining
| adoption, but right now it's still relatively niche (check the
| number of Rust job postings), and it's taken 17 years to get to
| this point, with sponsorship from major organizations like
| Mozilla.
|
| Given this, the idea that there's much low-hanging fruit that's
| being ignored, that could easily be exploited, seems dubious.
| What's an example of what you have in mind?
| kbenson wrote:
| > it's taken 17 years to get to this point
|
           | Yes and no. Rust went through quite a few changes early
           | on, to the point that it's not really that similar a
           | language, and 1.0 was released in May 2015.
|
| That's still quite a while (8 years), but IMO doesn't quite
| mean the same thing as a language that's been around for 17
| years with a similar level of adoption. My impression (from
| the outside) is that Rust usage is still increasing, at least
| in specific areas, and has not leveled off or tapered. It
| doesn't seem to be exploding into lots of teams and places,
| but it does seem to be getting footholds still, like at
| Azure.
| tcmart14 wrote:
             | While that is true about Rust, most new languages are
             | gonna have the same thing. It'll be years before they
             | get to 1.0. Look at Zig, or just about any new language.
             | So I don't think it is valid to discount the pre-1.0
             | days, because all languages are gonna need a while to
             | get to the 1.0 day. It still took 17 years of time
             | investment to get Rust to where it is today.
| imachine1980_ wrote:
         | This is because programming languages have network effects,
         | and they are costly to move to and test in real-world cases.
         | You can use Pony, but good luck finding SDKs, databases,
         | performant compilers, and maintained libraries. The community
         | aspect of programming language ecosystems means that no
         | matter how great a language is, if it isn't popular you will
         | have a hard time being a developer in it. That's why most
         | languages that succeed start in a niche: great for scripting,
         | good for data analysis, great for concurrent programming
         | (Scala), and some of them, like Python, then scale while
         | others, like Scala or Julia, don't.
| pie_flavor wrote:
| > Borrow checking is incompatible with some useful patterns and
| optimizations (described later on), and its infectious
| constraints can have trouble coexisting with non-borrow-checked
| code.
|
| Not that this isn't true, but the rest of the article introduces
| a system with a superset of those limitations, gradually
| decreasing over time but never becoming a subset. In fact the
| pattern described in the article is a common pattern in Rust and
| I make use of it all the time; the library for making use of it
| is `slotmap`.
| [deleted]
| dxhdr wrote:
| > In fact the pattern described in the article is a common
| pattern in Rust and I make use of it all the time; the library
| for making use of it is `slotmap`.
|
| Slotmap uses unsafe everywhere, it's a memory usage pattern not
| supported by the borrow checker. It's basically hand-
| implementing use-after-free and double-free checks, which is
| what the borrow checker is supposed to do. Is that really a
| common pattern in Rust?
| dralley wrote:
| > Slotmap uses unsafe everywhere, it's a memory usage pattern
| not supported by the borrow checker. Is disabling the borrow
| checker really a common pattern in Rust?
|
| Wrapping "unsafe" code in a safe interface is a common
| pattern in Rust, yes. There is absolutely nothing wrong with
| using "unsafe" so long as you are diligent about checking
| invariants, and keep it contained as much as possible.
| Obviously the standard library uses some "unsafe" as well,
| for instance.
|
| "unsafe" just means "safe but the compiler cannot verify it".
|
| Unsafe does not disable the borrow checker, though. All of
| the restrictions of safe Rust still apply. All "unsafe" does
| is unlock the ability to use raw pointers and a few other
| constructs.
|
| https://doc.rust-lang.org/book/ch19-01-unsafe-
| rust.html#unsa...
| dxhdr wrote:
             | It's essentially a "user-space" memory allocator with its
             | own use-after-free and double-free checks, apparently
             | because the language implementation isn't adequate. If
             | anything it just reinforces the article's point that
             | "borrow checking is incompatible with some useful
             | patterns and optimizations."
| junon wrote:
| Eh? This is a wild take. How do you draw the conclusion
| the default implementation is inadequate?
| dymk wrote:
| Because something like slotmap has to use `unsafe` to get
| around the inadequacies of the borrow checker...
| burntsushi wrote:
| A downside for sure, but one that, at least in this
| specific example, has limited downsides. If you can
| button it up into a safe abstraction that you can share
| with others, then I don't really see what the huge
| problem is. The fact that you might need to write
| `unsafe` inside of a well optimized data structure isn't
| a weakness of Rust, it's the entire point: you use it to
| encapsulate an unsafe core within a safe interface. The
| standard library is full of these things.
|
| Now if you're trying to do something that you can't
| button up into a safe abstraction for others to use, then
| that's a different story.
| mr_00ff00 wrote:
| If unsafe means "safe but the compiler cannot verify" then
| I guess just consider .cpp to mean "safe but the compiler
| cannot verify" and we have suddenly made C++ memory safe
| ammar2 wrote:
| Sure but you're missing the
|
| > so long as you are diligent about checking invariants
|
| part. Could you go through and check all the parts of a
| huge C++ codebase to make sure invariants are held as
| opposed to a few hundred lines of unsafe Rust code?
| mr_00ff00 wrote:
| Sure, but I think the point here is the degree.
|
| Presumably if it takes a lot of unsafe rust lines to
| build something, it won't matter if it's 30% safe or
| whatever.
|
| I just see the point of "unsafe is fine" a lot when the
| whole point of rust is that memory safety issues are
| never worth the cost.
| ammar2 wrote:
| Right, I guess the question is what will that proportion
| be when Rust is used for things like operating systems
| and web browsers. 30% would be untenable but a few
| hundred/thousand lines of unsafe code is fairly easy to
| put under a microscope.
|
| For some current day research into this, there is the
| paper "How Do Programmers Use Unsafe Rust?"[1] which I'll
| drop a quote from here:
|
| > The majority of crates (76.4%) contain no unsafe
| features at all. Even in most crates that do contain
| unsafe blocks or functions, only a small fraction of the
| code is unsafe: for 92.3% of all crates, the unsafe
| statement ratio is at most 10%, i.e., up to 10% of the
| codebase consists of unsafe blocks and unsafe functions
|
| That paper is definitely worth reading and goes into why
| programmers use unsafe. e.g 5% of the crates at that time
| were using it to perform FFI.
|
| In writing "RUDRA: Finding Memory Safety Bugs in Rust at
| the Ecosystem Scale" [2], I recreated this data and year-
| by-year the % of crates using unsafe is going down. And
| for what it's worth, crates are probably a bad data-set
| for this. crates tend to be libraries which are exactly
| where we would expect to find unsafe code encapsulated to
| be used safely. There's also plenty of experimental and
| hobby crates. A large dataset of actual binaries would be
| way more interesting to look at.
|
| [1] https://dl.acm.org/doi/10.1145/3428204
|
| [2] https://taesoo.kim/pubs/2021/bae:rudra.pdf
| mr_00ff00 wrote:
| Ahh that is quite interesting, I'll check those links out
| jjnoakes wrote:
| Sure, and if a typical Rust program that I write has no
| unsafe in it directly, and 5% of its dependencies' code
| have unsafe in them, that's also the same as writing a
| program in the "not c++" language directly, and using
| "not c++" dependencies for all but 5% of the dependency
| code.
|
| Seems like a silly analogy to me, though.
| mr_00ff00 wrote:
| Right but it's that 5% the origin comment is talking
| about. The times when rust has to use unsafe for the type
| of program.
| Ygg2 wrote:
| It's not what unsafe means. Unsafe means this might cause
| UB for some invocations (accessing raw pointers, calling
| into another language, etc.). Safe means it will not
| cause UB for any invocations (it may panic or abort).
| mr_00ff00 wrote:
| I would really love a definitive answer on whether the borrow
| checker and rust's rules do really limit optimizations and
| such.
|
| It seems like I see this opinion often and every time there are
| tons of people on both sides who seem sure they are correct.
|
| What are the limitations for optimization? Does unsafe rust
| really force those?
| amelius wrote:
| Difficult to answer.
|
           | However, what you can say is that the borrow checker works
           | like a straitjacket for the programmer, making them less
           | able to focus on other things like performance issues,
           | high-level data leaks (e.g. a map that is filled with
           | values without eventually removing them), or high-level
           | safety issues.
| steveklabnik wrote:
| You can also say that the borrow checker works like a
| helpful editor, double checking your work, so that you can
| focus on the important details of performance issues,
| safety issues, and such, without needing to waste brain
| power on the low-level details.
| amelius wrote:
| This would be true if code using the borrow checker was
| easier to read than to write.
| SubjectToChange wrote:
| I think it's generally accepted that writing code is
| nearly universally easier than reading code, in any
| language. That aside, getting a mechanical check on
| memory safety for the price of some extra language
| verbosity is obviously worth it IMO.
|
| By the same token, it is common to see criticisms of the
| complexity of templates in C++, but templates are the
| cornerstone of "Modern C++" and many libraries could not
| exist without them.
| steveklabnik wrote:
| The point is that the compiler helps you "read" it. This
| takes mental effort off of you.
|
| I agree that not everyone thinks this is true, but this
| is my experience. I do not relate to the compiler as a
| straight jacket. I relate to it as a helpful assistant.
| jjnoakes wrote:
| This is my experience as well. I find it much easier to
| work faster when the compiler is helping me, and I don't
| consider it a "straitjacket" at all.
| bluGill wrote:
           | There can be no answer. Research is ongoing and smart
           | people are actively trying to make the optimizer better, so
           | even if I gave a 100% correct answer now (which would be
           | pages long), a new commit 1 minute later would change the
           | rules. Sometimes someone discovers that what we thought was
           | safe isn't safe in some obscure case, and so we are forced
           | to stop applying some optimization. Sometimes an
           | optimization is a compromise and we decide that using a
           | couple of extra CPU cycles is worth it because of some
           | other gain (a CPU cycle is often impossible to measure in
           | the real world, as things like caches tend to dominate
           | benchmarks, so you can make this compromise many times
           | before the total suddenly adds up to something you can
           | measure).
|
| The short answer for those who don't want details: it is
| unlikely you can measure a difference in real world code
| assuming good clean code with the right algorithm.
| verdagon wrote:
| I'd say it mostly applies to manual optimization, when we're
| restructuring our program.
|
| If the situation calls for a B-tree, the borrow checker loves
| that. If the situation calls for some sort of intrusive or
| self-referential data structure (like in
| https://lwn.net/Articles/907876/), then you might have to
| retreat to a different data structure which could incur more
| bounds checking, hasher costs, or expansion costs.
|
           | It's probably not worth worrying about most of the time,
           | unless you're in a _very_ performance-sensitive situation.
| apendleton wrote:
| Without directly answering your question, it's worth noting
| that there are also additional optimizations made available
| by Rust that are not easily accessible in C/C++ (mostly
| around stronger guarantees the Rust compiler is able to make
| about aliasing).
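           |
           | As a rough illustration (not from the article): with plain
           | pointers the compiler has to assume the two ranges may
           | alias, while the non-standard but widely supported
           | __restrict extension expresses roughly the no-aliasing
           | guarantee that Rust's &mut references carry by default.
           |
           |     // May alias: the optimizer must be conservative about
           |     // keeping loads from 'src' cached across stores to 'dst'.
           |     void scale_add(float* dst, const float* src, int n) {
           |         for (int i = 0; i < n; ++i) dst[i] += 2.0f * src[i];
           |     }
           |
           |     // Promises no overlap, which tends to vectorize better.
           |     void scale_add_restrict(float* __restrict dst,
           |                             const float* __restrict src, int n) {
           |         for (int i = 0; i < n; ++i) dst[i] += 2.0f * src[i];
           |     }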
| steveklabnik wrote:
| The question is far too broad, and contextual. You're never
| going to get an answer to that question.
|
| Sometimes, the rules add more optimization potential. (like
| how restrict technically exists in C but is on every (okay
| _almost every_ ) reference in Rust) Sometimes, the rules let
| you be more confident that a trickier and faster design will
| be maintainable over time, so even if it is possible without
| these rules, you may not be able to do that in practice.
| (Stylo)
|
| Sometimes, they may result in slower things. Maybe while you
           | _could_ use Rust's type system to help you with a design,
| it's too tough for you, or simply not worth the effort, so
| you make a copy instead of using a reference. Maybe the
| compiler isn't fantastic at compiling away an abstraction,
| and you end up with slower code than you otherwise would.
|
| And that's before you get into complexities like "I see
| Rc<RefCell<T>> all the time in Rust code" "that doesn't make
| sense, I never see that pattern in code".
| coliveira wrote:
| The reason why safety in C++ is difficult to achieve is due to
| the memory model used by C and C++. The memory model is a flat
| space provided by the OS that can be addressed by pointers. In
| this sense, C++ is similar to assembly code. A language like
| Java, on the other hand, assumes a different model where you can
| only access objects with well defined behavior. To change this,
| one needs to disallow the use of native pointers in C++ or make
| them less powerful, like Java did.
| [deleted]
| josefx wrote:
| > The memory model is a flat space provided by the OS that can
| be addressed by pointers
|
| From what I understand this is not true. Pointers cease to be
| valid the moment you try to leave a single allocation. You get
| to play around within a single continuous allocation and one
| past the end, everything further out is playing with fire.
|
| Even comparing the "addresses" of two separate allocations is
| undefined if done with "<" . The comparison function std::less
| is basically magic to get well defined behavior out of a
| language that doesn't guarantee it.
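         |
         | A small sketch of that corner (hypothetical code): '<' on
         | pointers into two unrelated objects has no portable meaning,
         | while std::less is required to provide a total order.
         |
         |     #include <functional>
         |     #include <iostream>
         |
         |     int main() {
         |         int a = 0;
         |         int b = 0;
         |         // '&a < &b' compares pointers to unrelated objects;
         |         // std::less<int*> must impose a strict total order,
         |         // so this line is well defined.
         |         std::cout << std::boolalpha
         |                   << std::less<int*>{}(&a, &b) << '\n';
         |     }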
|
| > C++ is similar to assembly code
|
| Only if you use a compiler that does not optimize anything.
| LoganDark wrote:
| > Pointers cease to be valid the moment you try to leave a
| single allocation.
|
| For the other readers who might not know what this is
| referring to, it's pointer provenance. For an introduction to
| the topic, I always recommend Ralf Jung's blog series,
| "Pointers Are Complicated":
|
| https://www.ralfj.de/blog/2018/07/24/pointers-and-bytes.html
| shadowgovt wrote:
| It is, for all practical purposes, a flat space in the sense
| that for bare pointers, operator++ is defined (increments to
| the next whatever, defined based on type of pointer).
|
| There is no operator++ equivalent in Java to apply to object
| references (unless you go unsafe); you can't immediately
| shoot yourself in the foot without the compiler noticing by
| asking for "the next object after this one" when no such
| thing exists.
|
| (handwave a bit: of course, you can ask for an object past
| the last object in any container. That's (a) not the same
| thing and (b) results in an immediate runtime error in Java,
| instead of undefined behavior)
| antonvs wrote:
| > everything further out is playing with fire.
|
           | That's the point. C and C++ don't prevent you from playing
           | with that fire. Memory-safe languages do.
| kimixa wrote:
         | One issue is that the memory model _isn't_ just a flat space
         | that can be addressed by any pointer value - it may look
| similar to one if your compiler and OS let you, but doing
| things like accessing memory allocated as a different type or
| outside (an array of) objects is invalid, and the compiler is
| perfectly allowed by the standard to assume that never happens
| and happily "optimize" everything that may be a result of that
| away.
|
| A lot of bugs have been caused by programmers assuming any
| access to the 'linear address space' is fine, but that has
         | never been reliable as it's not allowed by the standard. The
         | worst thing is when it looks like it works for a while, but
         | you're relying on stuff not allowed by the standard, so it may
| change at any time (like a compiler version or option change,
| or even a change to a different part of the code that happens
| to tickle the compiler's analysis stages a slightly different
         | way). See the "Time traveling NULL-check removal" - as the
         | compiler "knows" that no pointer can ever have the value of
         | NULL during a dereference, any path that does that can be
         | completely removed - even if there's something like a NULL
         | check and a logging output before said dereference. If the
         | compiler decides that the dereference will eventually happen
         | in that path unconditionally, then that path and the logging
         | _before_ the dereference Can Never Happen, so it can all be
         | removed.
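         |
         | A minimal sketch of that "time traveling" case (hypothetical
         | code, not from the article):
         |
         |     #include <cstdio>
         |
         |     int read_value(int* p) {
         |         if (p == nullptr) {
         |             std::printf("p was null\n");  // no early return...
         |         }
         |         return *p;  // ...so this dereference is reached on every
         |                     // path, letting the compiler assume p is
         |                     // non-null and drop the check and the log
         |     }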
|
| Or type punning and pointer aliasing - objects are created with
| a type, and so the compiler Knows if you convert a pointer type
| to another type that isn't compatible with the first type, they
| somehow magically point to different memory, and all the
| assumptions that implies for the following code.
|
         | A lot of these restrictions are pretty similar to what
         | languages like Java have - the difference is that the JVM
         | checks and flags
| violations and/or straight up disallows them when compiling -
| not just allowing the compiler to (silently) optimize based on
| those assumptions, and throwing the result at hardware to see
| what happens.
|
         | There may be a bit of platform/compiler-specific behavior
         | used to implement super low-level stuff like OSes, but that's
         | platform-specific stuff outside the C++ (or C) spec itself.
| tsimionescu wrote:
| It depends what you mean exactly. The C and C++ official memory
| model is very much not a flat space, but exactly what you
| describe for Java - you can only (validly) access objects. For
| example, the operation x < y is only defined if x and y are
| both pointers into the same object or array of objects (or one
| past the end of an array of objects). Otherwise, the operation
| is entirely undefined in both the C and the C++ memory models.
         | The following program has no defined C or C++ semantics, and
         | neither the C nor the C++ standards can tell you anything
         | about what it could do:
         |
         |     int x = 0; int y = 0;
         |     if(&x < &y) { printf("???"); }
|
| Now of course the implementation of C and C++ actually assumes
| without checking that you only access objects and not raw
| memory, and thus will happily read raw memory directly.
| leni536 wrote:
| The result of the pointer comparison is unspecified, this is
| not undefined behavior in C++.
|
| I don't know about C.
| shadowgovt wrote:
| I really feel like it's a hell of a definitions dodge to say
| "This is what the model is" when no compiler implements
| constraints to require the user to treat the model like that
| (i.e. I can always just increment the pointer, or typecast it
| to numeric type, do math on it, and typecast back to a
| pointer, without having to pull any big red levers like using
| "unsafe" methods).
|
| If it's undefined but it compiles to _something_ , is it
| _really_ undefined, or is the definition merely not
| standardized?
| [deleted]
| epcoa wrote:
| Yes it's _really_ undefined. There is a distinction from
| "implementation defined behavior" which you seem to be
| confusing it with. You are practically wrong in your
| assumptions. Since undefined behavior is undefined the
| compiler is free to do anything with compilation, it may
| compile to something but you have no guarantee what that
| something is. And in real life this often actually bites
| you when the optimizer comes into play - modern optimizing
| compilers can and do optimize undefined behavior into noops
| or other weird stuff.
|
| Read this and don't come back on this topic until you
| clearly understand it:
| https://en.cppreference.com/w/cpp/language/ub
| shadowgovt wrote:
| No; this is a common misconception I see from people who
| swallowed the "it's allowed to format your hard drive and
| blow up your monitor" dodge vs. the electrical engineers
| who know where terminology like 'undefined behavior'
| originated in engineering. In practice, it tends to do
| something _subtle and usually right but probably wrong_
| for the simple, practical reason that if it did anything
| as obviously wrong as "format your hard drive and blow
| up your monitor," _someone would have tripped over it
| testing the compiler and changed the compiler._
|
| This is why I actually hate using this programming
| language, because when you hit undefined behavior (which
| the language makes trivial to do; incrementing a pointer
| past the allocated memory is a one-line operation that
| throws no errors) the end-result is usually _subtle,
               | wrong, and hard to find later_ if it isn't actually
| "close enough to right" because the compiler desperately
| tries to make a useful program because that's what
| compilers are for. Hell, if it formatted my hard drive
| and blew up my monitor, it'd be much easier to figure out
| where the problem was! Hand-waving this flaw in the
| design of the programming tool with "oh, it's undefined
| behavior; you should never have relied on that in the
| first place" when so many valid statements in the
| language _compile to_ undefined behavior, as if that is
| _good enough,_ is building a house on sand.
|
| ... and quite frankly, our industry is full of sand
| houses and we could stand to respond to the amount of
| undefined behavior in C++ by ceasing to build on that
| shaky foundation.
| adamnemecek wrote:
| You just need an unsafe keyword.
| ajross wrote:
| That's pretty much what the article says though. "Don't use
| traditional pointers" is a fairly trivial rule to enforce via
| static analysis, and constructs like unique_ptr are
| syntactically identical anyway.
|
| The bit that has me confused is that it's inventing a new term,
| "borrowing affine style", to describe a longstanding paradigm
| that has traditionally been called "RAII". Now, neither term is
| very clear, but surely it's better to use the existing
| confusing jargon instead of inventing new terms.
| bluGill wrote:
           | Borrowing affine style is more than RAII. Borrowing affine
           | style means that there are no pointers and always exactly
           | one owner. In borrowing affine style your functions take a
           | unique_ptr for everything; if the data needs to live beyond
           | the function, then the function returns a unique_ptr to
           | that data back to the caller.
           |
           |     std::unique_ptr<foo> var;
           |     // init and use var
           |     var = SomeFunction(std::move(var));
           |     // use var again.
           |
           | Note that while inside SomeFunction you lose access to var,
           | since SomeFunction returns it again you don't really lose
           | anything. Of course SomeFunction can also return some other
           | unique_ptr<foo> that isn't var, and you can't control that.
           |
           | It is an interesting idea, though I'm not sure if I like it
           | for real world code or not.
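           |
           | A self-contained sketch of that shape, with a hypothetical
           | Foo and SomeFunction, just to show both sides of the move:
           |
           |     #include <iostream>
           |     #include <memory>
           |     #include <utility>
           |
           |     struct Foo { int value = 0; };
           |
           |     // Takes ownership, works on the object, hands it back.
           |     std::unique_ptr<Foo> SomeFunction(std::unique_ptr<Foo> foo) {
           |         foo->value += 1;
           |         return foo;
           |     }
           |
           |     int main() {
           |         auto var = std::make_unique<Foo>();
           |         var = SomeFunction(std::move(var));  // moved out and back
           |         std::cout << var->value << "\n";
           |     }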
| gpderetta wrote:
| The significant difference is a static guarantee of no reuse
| after move, hence the 'affine' qualifier (which is not new).
| gavinray wrote:
| See also:
|
| Thomas Neumann's current proposal for memory safe C++ using
| dependency tracking:
|
| - https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p27...
|
| Google's proposal for memory safety using Rust-like lifetime
| analysis:
|
| - https://discourse.llvm.org/t/rfc-lifetime-annotations-for-c/...
|
| - https://github.com/google/crubit/tree/main/lifetime_analysis
| pjmlp wrote:
         | And Microsoft's work on the Visual C++ lifetime checker and
         | SAL, as well.
|
| It will never be perfect, but every little improvement helps.
| Voultapher wrote:
| > It will never be perfect, but every little improvement
| helps.
|
| Or it might convince people to stay longer on a plane with a
| provably [0] terrible safety record.
|
| [0] https://alexgaynor.net/2020/may/27/science-on-memory-
| unsafet...
| pjmlp wrote:
| To put matters into perspective, Rust reference
| implementations depend on C++ toolchains.
|
| Same applies to all major Ada, Java, .NET, Swift, Ocaml and
| Haskell implementations. And any GPGPU toolchain.
|
| Which kind of shows it isn't going anywhere and those
| planes have to be improved no matter what.
| SubjectToChange wrote:
| As an addendum, the same goes for many C toolchains.
| Anything requiring GCC 4.8 or later is depending on a C++
| compiler. And projects like LLVM's libc, Fuchsia's Zircon
| kernel, the bareflank hypervisor, etc, demonstrate that
| C++ really can be used anywhere C is used.
|
| C++ is the new C in the sense that it's the language
| everything else is built on and I expect it will be even
| _more difficult_ to displace than C. For instance, the
| complexity of C++ makes it next to impossible to
| incrementally rewrite in another language, simply writing
| a production quality C++ implementation is a gargantuan
| investment so a superset language is questionable, and
| the C++ community is committed to evolving and improving
| their language whereas C has largely ossified. Perhaps C
| will outlive everyone reading this thread, but C++ will
| outlive C.
| Voultapher wrote:
| I agree these planes are important and deserve care. At
| the same time pretty much all suggestions on how to
| meaningfully improve the safety of those planes boil down
| to successor languages Cpp2, Carbon etc. or require some
| other complex manual rewrite of components of said plane.
               | There is an argument to be made for having good out-of-
               | the-box interoperability; however, even some of the most
               | complex and important code-bases in existence, namely
               | the browsers Firefox and Chrome, have demonstrated that
               | you can do that part replacement in Rust. I'm not
| saying there is no other way. But these suggested and yet
| unproven improvements to C++ will not automatically make
| those planes safer. They will require replacing parts
| with new code, and if we are writing new code there is a
| serious question we should ask ourselves, building on
| what foundation do we want to improve those "planes".
| pjmlp wrote:
| Rust in Firefox is a very tiny portion of it and now they
| are using some WASM sandbox tricks, because they aren't
| going to rewrite everything in Rust, given the effort.
|
                 | Chrome has only now started to consider allowing
                 | Rust, and it is baby steps, not coming close to V8,
                 | the graphics engine and such.
| gavinray wrote:
| The second someone makes a successor language that
| seamlessly/directly interops with C++ _AND_ has the level
| of build/IDE tooling that C++/Rust have, I'm on board.
|
| The closest thing right now is Sean Baxter's "Circle"
| compiler in "Carbon" mode IMO:
|
| https://github.com/seanbaxter/circle/blob/master/new-
| circle/...
|
| Unfortunately, Circle is closed-source and there's no LSP
| or other tooling to make the authoring experience nice.
| pjmlp wrote:
                 | I also see Circle as the most promising C++ wannabe
                 | of all the contenders, and as for it being
                 | closed-source, once upon a time all major compilers
                 | were, so let's see.
| freeone3000 wrote:
| Rust has been bootstrapped for nearly a decade. The rust
| reference toolchain is built in rust.
| detaro wrote:
| if you pretend that LLVM and friends are not part of the
| toolchain
| pjmlp wrote:
| So no need for LLVM and GCC, Great news!
|
| Where can we download it?
| Voultapher wrote:
                   | Assuming you are serious, there is
                   | https://github.com/bytecodealliance/wasmtime/tree/main/crane...
                   | which is written in Rust and is targeted to become
                   | the default debug backend in rustc. LLVM has
                   | accumulated _a lot_ of optimizations contributed by
                   | various groups and people over more than a decade.
                   | It's hard to catch up to that by virtue of resource
                   | limits.
| mr_00ff00 wrote:
| Is there a reason to replace LLVM? Are there still memory
| bugs that are popping up and causing issues?
| steveklabnik wrote:
| https://github.com/bytecodealliance/wasmtime/blob/main/cr
| ane...
| pjmlp wrote:
| I was being sarcastic, when Cranelift becomes the
| official reference implementation then I shut up.
| [deleted]
| Voultapher wrote:
| > Tracing GC is the simplest model for the user, and helps with
| time management and development velocity, two very important
| aspects of software engineering.
|
| > Borrow checking is very fast, and helps avoid data races.
|
       | One thing many people seem to assume is that not having to care
       | about memory means you can program faster and get to your goal
       | faster, as the author here seems to do. However, as it turns
       | out, if your program is more complex than ~100-1000 lines of
       | code, explaining in an explicit way who owns what and who gets
       | to change state when is a very useful way to avoid bugs.
|
| Saoirse Shipwreckt aka withoutboats mentioned this a while ago in
| https://without.boats/hire-me/
|
| > Rust works because it enables users to write in an imperative
| programming style, which is the mainstream style of programming
| that most users are familiar with, while avoiding to an
| impressive degree the kinds of bugs that imperative programming
| is notorious for. As I said once, pure functional programming is
| an ingenious trick to show you can code without mutation, but
| Rust is an even cleverer trick to show you can just have
| mutation.
|
| and later follows up on this in
| https://without.boats/blog/revisiting-a-smaller-rust/
|
| > I still think this is Rust's "secret sauce" and it does mean
| what I said: the language would have to have ownership and
| borrowing. But what I've realized since is that there's a very
| important distinction between the cases in which users want these
| semantics and the cases where they largely get in the way. This
| distinction is between types which represent resources and types
| which represent data.
| 634636346 wrote:
| [flagged]
| cmrdporcupine wrote:
| Two things, full time Rust dev here:
|
         | a) Rust's borrow checker is good and its type system is good,
         | but IMHO it's not really doing what you say it is as well as
         | you're implying: _"explaining in an explicit way who owns
         | what"_. While ownership _is_ explicit and static (apart from
         | RefCell and friends), the description of that ownership is
         | scattered all over, program state flows are _not_ modelled in
         | the type system at all, and on the whole Rust is far from
         | being the kind of explicit "I can reason about the whole
         | program" declarative system with the kind of clarity you're
         | implying. Or maybe I'm taking your claims too strongly.
|
         | b) Rust's borrow checker is good. But it's not perfect and
         | fails to pass things that in fact should be legal borrows. In
         | particular there are edge cases around where things are
         | grabbed in if/let/else or matches, like this failure (from my
         | own code):
         |
         |     {
         |         let local_version = self.seek_local(tx);
         |         if local_version.is_some() {
         |             return match &local_version.unwrap().value {
         |                 Entry::Value(v) => Some(v), // reference to value
         |                 Entry::Tombstone => None,
         |             };
         |         }
         |     } // note that 'local' has gone out of scope here and so
         |       // self should not be borrowed
         |
         |     ... code later in the func complains 'self' is still borrowed,
         |
         | but the same thing done this way (but less efficiently)
         | passes:
         |
         |     if self.seek_local(tx).is_some() {
         |         let local_version = self.seek_local(tx).unwrap();
         |         return match &local_version.value {
         |             Entry::Value(v) => Some(v),
         |             Entry::Tombstone => None,
         |         };
         |     }
         |
         |     ... same other code that uses 'self' compiles fine
|
         | In neither case is 'local_version' being used outside of the
         | lexical scope, and 'self' cannot be borrowed in either case,
         | but the borrow checker is convinced in version #1 that it is,
         | and that code below that lexical scope cannot proceed because
         | 'self' is borrowed. They're basically logically equivalent in
         | terms of program flow and state management, but the second
         | passes while the first fails. Rust 1.7.0 stable.
|
| (Before you ask, I did have if/let to take apart local_version
| instead of using unwrap, and the compiler griped about that
| even more)
|
| Having the burden of how to fix that fall on the programmer
| sucks. This is all a step in the right direction, but I run
| into this kind of thing here and there and I shouldn't have to.
| ziml77 wrote:
| The limitations of the borrow checker when it comes to
| borrowing self are annoying. I've had cases where I just said
| "screw it" and copied the body of a function inline in the 1
| or 2 places it was being called just to make the borrow
| checker happy.
| liuliu wrote:
| I don't write Rust.
|
| But here is what you said and what the author said don't
| conflict with each other, and it has been on my mind for a
| while.
|
         | People who write similar code, or work on things for decades,
         | usually don't really think through what "sketch out some
         | code" looks like. They spend most of their time refactoring
         | things that have clear use-cases but not well-defined API
         | boundaries within a component, or between components. So
         | ownership, nullability checks, and data race checks all come
         | very naturally as a starting point.
         |
         | But there is another side of the world, where people are
         | constantly sketching something out, for things like creative
         | arts, high-level game logic, data analysis, machine learning,
         | etc. Putting yourself in that position, the syntax noise
         | actively gets in the way of this type of programming.
         | Ownership, even nullability checks, are not helpful if you
         | just want to have partial code running to check whether it
         | draws part of the graph. This is a world where Python excels,
         | and people constantly complain about why this piece of Python
         | code doesn't have type annotations.
         |
         | We may never be at peace between these two worlds, and this
         | manifests itself somewhat in the "two-language problem". But
         | that, to me, is what someone means by "development velocity
         | is faster".
| marcosdumay wrote:
| > Ownerships, even nullability checks are not helpful
|
           | Memory management does get in the way. But you are wrong
           | about algebraic data types; they will help you sketch
           | something.
           |
           | Ideally, if you don't know what you want, you will want
           | extendable [1] algebraic types, more like TypeScript than
           | Rust, but what you call a "nullability check" is a benefit
           | from the beginning.
           |
           | [1] Where you can say "here comes a record with those
           | columns" instead of "here comes this record". You _can_
           | write this in Rust, but it's easier to simply define
           | everything completely.
| convolvatron wrote:
| I really love parts of rust and kinda hate other parts.
|
| but this is what really ruins it for me. I want to play. I
| want to knock something together and work with it and see
| what kind of shape it is.
|
| rust demands that I cross every last t before I can run it at
| all. which is great if you already have a crystal notion of
| what you are building
| cmrdporcupine wrote:
| This is definitely true, but I also don't know what a
| reasonable alternative is at this point for systems dev
| (aka places where a GC is a Bad Idea). I wouldn't unleash C
| or C++ onto a new project like that? I'd just feel icky.
| And Zig's type system IMHO isn't good enough, I'd really
| miss pattern matching for one.
|
| I _do_ think many people are using Rust in the Wrong
| Places(tm). It seems like torture to me to be applying it
| for general application development (though because I
| basically now "think" in it, I can see I myself would be
| tempted to do so).
|
| And for things with complicated ownership graphs or nested
| interrelated data? It's just... no. Dear god, _Iterator_ in
| Rust is an ownership and type traits nightmare, let alone
| anything more complicated
|
| So I think people should just use a hybrid approach and
| keep Rust where it belongs down in the guts and use
| something higher level and garbage collected higher up.
|
| Here's another thing about Rust that's driving me batty: it
| is nominally positioned as a "systems" programming
| language, but key things that would make it more useful
| there are being neglected, while things that I would
| consider webdev/server programming aspects are being highly
| emphasized.
|
| Examples I would give that have driven _me_ nuts recently:
| allocator_api / pluggable per-object allocators ... stuck
| in nightly since _2016_ (!). Full set of SIMD intrinsics
| and broader SIMD support generally ... also stuck.
| const_generics_expr ... still not there.
|
| Meanwhile async this and async that and things more useful
| to the microservice crowd proliferate and prosper
| Yoric wrote:
| I think I agree with most of what you write, but note
| that async has lots of applications beyond microservices.
| In particular, writing anything that uses the network
| (e.g. a web browser), which definitely feels system-y to
| me.
| sroussey wrote:
             | This is the nice thing about TypeScript--you can type
             | what you want. As you iterate you can ramp your type
             | checking up or down. This is outside the realm of memory
             | management, of course.
             |
             | And new to JS/TS land is the separation of pure data
             | structures from resources. Something a sibling commenter
             | brought up.
| snek_case wrote:
| > rust demands that I cross every last t before I can run
| it at all.
|
| It's worse than that IMO. Rust makes it very
| awkward/impractical to have cyclic data structures, which
| are necessary to write a lot of useful programs. The Rust
| fans will quickly jump in and tell you that if you need
| cycles, your program is wrong and you're just not a good
             | enough programmer, but maybe it's just that the Rust borrow
| checker is too limited and primitive, and it really just
| gets in the way sometimes.
|
| Some of the restrictions of the Rust borrow checker and
| type system are arbitrary. They're there because Rust
| currently can't do better. They're not the gospel, they
| aren't necessarily inherent property that must always be
| satisfied for a program to be bug free. The Rust notion of
| safety is not an absolute. It's a compromise, and a really
| annoying, tiresome drain on motivation and productivity
| sometimes.
| nsajko wrote:
| > currently can't do better
|
| The limitations are an inherent consequence of basic
| tenets of Rust's design. Rust wouldn't be Rust anymore if
| you fixed them.
|
| > Some of the restrictions of the Rust borrow checker and
| type system are arbitrary. They're there because Rust
| currently can't do better. They're not the gospel, they
| aren't necessarily inherent property that must always be
| satisfied for a program to be bug free. The Rust notion
| of safety is not an absolute. It's a compromise, and a
| really annoying, tiresome drain on motivation and
| productivity sometimes.
|
| Yeah, but this actually seems consistent with the
| philosophy behind Rust: to take away the tools a
               | programmer needs for creativity, so they can't make
               | potentially costly mistakes, as is applicable to big teams
| in huge corporations. Another commenter in this thread
| put it nicely: the borrow checker is a straitjacket for
| the programmer.
|
| It's not meant to foster creativity, it's meant to be
| safe for big business and novice employees.
| Yoric wrote:
| > It's not meant to foster creativity, it's meant to be
| safe for big business and novice employees.
|
| Interestingly, my experience is the opposite.
|
| I find that the "straightjacket" is extremely precious
| during refactorings - in particular, the type of
| refactorings that I perform constantly when I'm
| prototyping.
|
| Compared to this, I'm currently writing Python code, and
| every time I attempt a refactoring, I waste considerable
| amounts of time before I can test the interesting new
| codepath, because I end up breaking hundreds of other
| codepaths that get in the way and I need to go through
| the testsuite (and pray that it contains a sufficient
                 | number of tests) hundreds of times until the code is kinda
| stable.
|
| Which is not to say that Rust matches every scenario. We
| agree that it doesn't, by design. But I don't think that
| the scenarios you sketch out are the best representation
| of what Rust can/should be used for and can't/shouldn't
| be used for.
| yipyip wrote:
| [dead]
| ordu wrote:
               | Cyclic data structures are implemented easily with
               | unsafe, like non-cyclical ones (Vec, for example). The
               | difficult part is to make a safe API for that. These
               | difficulties are not of a syntactic nature but design
               | difficulties. You need to think through your use cases
               | for such a struct and devise an API that supports them.
               |
               | This is more difficult than the C++ way of "just do
               | it". With C++ you will solve the same problems, but on
               | a case by case basis as they come into view. With Rust
               | you need to solve these problems upfront or do a lot of
               | refactoring later. There are upsides and downsides to
               | both approaches, but it is clear that Rust is not good
               | for sketching some code quickly to see how it will do.
               |
               | It is still possible to do it quickly with Rust in a
               | C++ way by leaking unsafety everywhere and passing raw
               | pointers, but I think it is still easier to do it with
               | C++, which was designed for this style of coding.
| jackmott42 wrote:
| I would never tell you that you are wrong to have cyclic
| data structures. But there are reasonable workarounds
| like using handles into an array to do it, which of
| course re-creates some of the same problems as pointers,
| but not the worst ones, and is often a positive for
| performance on modern hardware due to improved data
| locality.
|
| Or you can use reference counted types and take a small
| performance hit.
|
| Or use unsafe and git gud.
| jcranmer wrote:
| The basic model of Rust is to move use-after-free from a
| dynamic, runtime check to a static, compile-time check.
| But to keep the static checks from being Turing-complete,
| you need to prohibit arbitrary cycles while something
| like a tree (or other boundable recursion) is doable. So
| Rust not being able to check cyclic data structures isn't
| a "Rust currently can't do better" situation, it's a
| "Rust just can't do better" situation.
|
| What Rust's intended solution for that is that you add in
| data structures that do the dynamic checking for you in
| those cases. But the Rust library doesn't provide
| anything here that's useful (RefCell is the closest
| alternative, and that's pretty close to a this-is-never-
| what-you-want datatype), which means your options are
| either to use integers, roll your own with unsafe, or try
| hard to rewrite your code to not use cycles (which is
| usually a euphemism for use integers anyways). The
| problem here, I think, is that there is a missing data
| structure helper that can sit in between integers and
| references, namely something akin to handles (with a
| corresponding allocator that allows concurrent
| creation/deletion of elements).
| cmrdporcupine wrote:
                 | _missing data structure helper_ -- didn't you already
                 | just name-check that though, since that's basically
                 | RefCell... or if you're willing to roll the dice...
                 | UnsafeCell (aka "trust me I know what I'm doing")?
| jcranmer wrote:
| What you essentially want for the user to not write any
                   | unsafe code is this kind of interface:
                   |
                   |     trait Allocator {
                   |         fn allocate<'a, T>(&'a self, init: T) -> Handle<'a, T>;
                   |         fn deallocate<'a, T>(&'a self, handle: Handle<'a, T>);
                   |         fn read<T>(&self, handle: Handle<'_, T>) -> impl Deref<T>;
                   |         fn write<T>(&self, handle: Handle<'_, T>) -> impl DerefMut<T>;
                   |     }
|
| &'a RefCell<T> is pretty close to a definition of
| Handle<'a, T>, except that Rust provides no
| implementations of allocate and deallocate that take a
| const instead of a mut reference for self. Trying to make
| an allocator that lets you safely deallocate something
| requires a completely different implementation of
| Handle<'a, T> than what RefCell can provide, and even if
| you're fine without deallocation, allocation with a const
| ref still requires unsafe to get the lifetime parameter
| right.
| db48x wrote:
| Yea, different languages for different purposes. Rust is
| for finished products, not so much for experimentation.
| When you want to play or experiment you should use Lisp.
| adamc wrote:
| That makes it expensive to move from experimentation to
| "fairly usable", though.
| db48x wrote:
| Your Lisp program will be entirely usable once you have
| experimented and found the right way to do it. Lisp
| compilers are really good, and they support gradual
| typing: you can write your program with no explicit type
| information, and then speed it up by adding type
| information in the hot spots. You can deploy that to
| production and it will serve you well.
|
| At some point your Lisp program will be mature, you will
| have implemented most of the features you know you will
| need, and you will know that any new features you add in
| the future will not alter the architecture. Once you
| understand the problem and have established the best
| architecture for the program, you can consider rewriting
| it in Rust. Lisp's GC does have a run-time cost, and you
| can measure it to figure out how much money you will save
| by eliminating it. If you will save more money than the
| cost of the rewrite, then go for it. Otherwise you can go
| on to work on something more cost-effective.
|
| Note that you might not need to rewrite the whole
| program; it might be more effective to rewrite the most
| performance-critical portion in Rust, and then call it
| from your existing Lisp program. This can give you the
| best of both worlds.
| jjnoakes wrote:
| > rust demands that I cross every last t before I can run
| it at all. which is great if you already have a crystal
| notion of what you are building
|
| Maybe I'm a weirdo, but I don't find this to be the case
| for me.
|
| When I'm knocking things together in Rust I use a ton of
| unwrap() and todo!() and panic!() so I can figure out what
| I'm really doing and what shape it needs to have.
|
| And then when I have a design solidified, I can easily go
| in and finish the todo!() code, remove the panic!() and
| unwrap() and use proper error types, etc.
| IshKebab wrote:
| In my experience _even in those "sketching" areas_ static
| types and strict checking is the better trade-off.
|
| I think the real criteria for "will static types and stricter
| checks help?" is "how long will this thing last for?".
|
           | E.g. for a shell _REPL_ you definitely don't want to have
           | to write out types, but for a shell _script_ you definitely
           | do.
|
| Something like using MATLAB for exploratory research is
| probably another decent example. Or maybe hackathon games.
|
| But for most games, data analysis, machine learning etc. then
| being stricter pays for itself almost immediately.
| Karrot_Kream wrote:
| In your framing there's a sort of implicit _downplaying_ of
| the frequency of exploratory work and an implicit
| _promotion_ of stricter work.
|
| > Something like using MATLAB for exploratory research is
| probably another decent example. Or maybe hackathon games.
| But for _most_ games, data analysis, machine learning etc.
| then being stricter pays for itself almost immediately.
|
| (Emphasis mine)
|
| This is where the viewpoints differ. Some people spend a
| lot more time on the exploratory aspect of coding. Others
| prefer seeing a program or a system to completion. It
| largely depends on what you work on and where your
| preferences lie.
|
| Years ago I wrote a script that grabs a bunch of stuff from
| the HN API, does some aggregation and processing, and makes
| a visualization out of them. I wrote it because the idea
| hit me on a whim while intoxicated, and I wrote the whole
| thing while intoxicated. The script works and I still use
| it frequently. I haven't made any changes to it because it
| just does what it needs to. It has no types. It's written
| decently because I've been coding for a long time but I was
             | intoxicated when I wrote it. The important thing is _it's
             | still providing value_.
|
| There's a surprising amount of automation and glue code
| that doesn't need the correctness of a type system. I've
| written lots of stuff like this over the years that I use
| weekly, sometimes daily, that I've never had to revisit
| because they just work. I suspect it's a matter of personal
| preference how much time a person spends on that kind of
| work vs building out large, correct systems. I suspect
             | there's a long tail of quality-of-life tooling that is
             | simple and exploratory in nature, just as large, strict
             | systems are much bigger than most people expect at first
             | blush because of how many cases they handle.
|
| I think trying to say that one is more common than the
| other without anything approaching the rigor of at least a
| computing survey is really just to use your gut to make
| generalizations. Which is what the strict vs loose typing
| online debates really are. A popularity contest of what
| kind of software people like to write given the forum the
| question is being discussed on.
| verdagon wrote:
| I would love a language (or C++ subset!) where we could get the
| benefits of that secret sauce, while mitigating or avoiding
| some of its downsides.
|
| Like Boats said, the borrow checker works really well with
| data, but not so well with resources. I'd also opine that it
| works well with data transformation but struggles with
| abstraction (both the good and bad kinds), works well with
| tree-shaped data but struggles with programs where the data has
| more intra-relationships (like GUIs and more complex games),
| and works well for imposing/upholding constraints but can
| struggle with prototyping and iterating.
|
| These are a nice tradeoff already, but if we can design some
| paradigms that can harness the benefits without its particular
| struggles, that would be pretty stellar.
|
| One promising meta-direction is to find ways to compose
| borrowing with mutable aliasing. Some promising approaches off
| the top of my head:
|
| * Vale-style "region borrowing" [0] layered on top of a more
| flexible mutably-aliasing model, either involving single-
| threaded RC (like in Nim) or generational references (like in
| Vale).
|
| * Forty2 [1] or Verona [2] isolation, which let us choose
| between arenas and GC for isolated subgraphs. Combining that
| with some annotations could be a real home run. I think Cone
| [3] was going in this direction for a while.
|
| * Val's simplified borrowing (mutable value semantics [4])
| combined with some form of mutable aliasing (like in the
| article!).
|
| * Rust does this with its Rc/RefCell, though it doesn't compose
| with the borrow checker and RAII as well as it could, IMO.
|
| [0] https://verdagon.dev/blog/zero-cost-borrowing-regions-
| part-1... (am author)
|
| [1] http://forty2.is/
|
| [2] https://github.com/microsoft/verona
|
| [3] https://cone.jondgoodwin.com/
|
| [4] https://www.jot.fm/issues/issue_2022_02/article2.pdf
| latenightcoding wrote:
| "Rule 4: When you want a raw pointer as a field, use an index or
| an ID instead."
|
| literally just woke up but: wouldn't it be simpler to use a
| pointer to a pointer, or am I missing something
| [deleted]
| corysama wrote:
| You might like: "Handles are the better pointers (2018)
| (floooh.github.io)"
|
| https://news.ycombinator.com/item?id=36419739
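         |
         | A rough C++ sketch of that handle idea (hypothetical names,
         | no free-list reuse): each slot carries a generation, so a
         | stale handle is caught at lookup time instead of silently
         | dereferencing reused memory.
         |
         |     #include <cstdint>
         |     #include <optional>
         |     #include <vector>
         |
         |     struct Handle { uint32_t index; uint32_t generation; };
         |
         |     template <typename T>
         |     class SlotPool {
         |         struct Slot {
         |             uint32_t generation = 0;
         |             std::optional<T> value;
         |         };
         |         std::vector<Slot> slots_;
         |     public:
         |         Handle add(T value) {
         |             slots_.push_back({0, std::move(value)});
         |             return {uint32_t(slots_.size() - 1), 0};
         |         }
         |         T* get(Handle h) {  // nullptr for stale/bogus handles
         |             if (h.index >= slots_.size()) return nullptr;
         |             Slot& s = slots_[h.index];
         |             if (s.generation != h.generation || !s.value)
         |                 return nullptr;
         |             return &*s.value;
         |         }
         |         void remove(Handle h) {
         |             if (get(h)) {
         |                 slots_[h.index].value.reset();
         |                 ++slots_[h.index].generation;  // invalidate handles
         |             }
         |         }
         |     };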
| floor_ wrote:
| Use memory arenas and never think about any of this again.
| spacechild1 wrote:
| How do arenas prevent out-of-bound access, double free or stale
| pointers?
| estebank wrote:
           | Out-of-bounds access is avoided because you use handles
           | that the arena has given you, and creating an invalid
           | handle is restricted. You avoid double free because of
           | Rust's ownership semantics, which make the arena itself
           | responsible for "deallocation" (which is just blanking the
           | value and letting Drop do its thing). You avoid stale
           | pointers because every access is checked at runtime if
           | you're using a generational arena.
| spacechild1 wrote:
| We are talking about C++ ;-)
| shadowgovt wrote:
| Sadly, untrue. Source: I use memory arenas, and it's still
| pretty trivial to copy (instead of reference) an object onto a
| stack and then try to save a pointer to that object. All you
| need is to leave out one `&` and the compiler won't tell you
| anything went wrong: it'll cheerfully let you retain a pointer
| to a stack-based object that is going to die because explicit
| lifetime analysis isn't a part of the language spec.
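         |
         | For instance, something like this (a hypothetical sketch)
         | typically compiles without complaint:
         |
         |     #include <vector>
         |
         |     struct Obj { int id; };
         |
         |     Obj* g_found = nullptr;
         |
         |     void remember(const std::vector<Obj>& arena) {
         |         for (Obj o : arena) {      // missing '&': 'o' is a copy
         |             if (o.id == 42) {
         |                 g_found = &o;      // points at a stack object
         |             }                      // that dies each iteration
         |         }
         |     }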
| LoganDark wrote:
| Absolutely love to see CHERI mentioned here <3
| nraynaud wrote:
| very nice array of ideas to open the debate for us mere mortals.
| [deleted]
| pizlonator wrote:
| You could also just isoheap according to type, where the type is
| whatever you come up with to make C++ casts sound. It could
| literally be C++ types or something looser (like if you want to
       | say that bitcasting an int ptr to a float ptr is ok).
|
| Then you don't need any language changes to make UAF type safe.
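       |
       | A very rough sketch of the isoheap idea as I read it
       | (hypothetical code): memory freed for a T is only ever handed
       | out again as a T, so a dangling pointer can only alias another
       | object of the same type.
       |
       |     #include <new>
       |     #include <vector>
       |
       |     template <typename T>
       |     class IsoHeap {
       |         std::vector<void*> free_list_;  // never returned to the
       |                                         // general-purpose heap
       |     public:
       |         void* allocate() {
       |             if (!free_list_.empty()) {
       |                 void* p = free_list_.back();
       |                 free_list_.pop_back();
       |                 return p;
       |             }
       |             return ::operator new(sizeof(T));
       |         }
       |         void deallocate(void* p) { free_list_.push_back(p); }
       |     };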
| kbenson wrote:
| Methods such as these for C and C++ are interesting, and needed,
| but only solve a part of the problem.
|
| As others have noted before, they do little good because they're
| opt-in. I think there's a bit of nuance to that which needs to be
| explored though, as I think it's less a problem that the extra
| checks are opt in, and more a problem of how we use and
| categorize libraries.
|
| As long as we encourage dynamic and static library inclusion (and
| why wouldn't we, it's how we build upon the work of others),
| every language has a problem similar to how C and C++ are opt-in
| and you can't easily control the code you include or link. If you
| load openssl from Java or Rust or Go, you might have some benefit
| from a well defined API layer, but ultimately you are still
| beholden to the code openssl provides in their library.
|
| Just as one of the real benefits of Rust or Java or Go is not
| necessarily that the code is completely safe, but that weird
| unsafe behavior usually requires special escape hatches which are
| easier to audit, what we need are ways to categorize the code we
| include, no matter the language it comes from, with appropriate
| labels that denote how strong the safeguard guarantees it was
| compiled with are and of which type, so we can make easier and
| better informed decisions on what to include and how to audit it
| easily when we do.
|
| This applies to including something written in Rust as well. If
| someone is writing something in C++ and wants to include a
| library written in Rust, that it's written in Rust is only part
| of the picture. It's equally important to how often (as a total
| and as a percentage of code) the safety checks that language
| required (or that the developers opted into) where escaped in
| that library.
|
| If the choice is a Rust library with 95% of the code in unsafe
| blocks or a C++ library that opted into multiple different safety
| checker systems and has almost no escapes from those
| requirements, Rust is not providing any real safety benefits in
| that situation, is it? What we need is better information exposed
| at a higher level to developers about what they're opting into
| when they use third party code, because we can all control what
| safety mechanisms we use ourselves, so that's mostly a solved
| problem.
| jjnoakes wrote:
| I feel like a few languages are better than others in a related
| but not quite identical area:
|
| Languages like Java and Go, while they CAN escape to native
| libraries, have cultures that tend to avoid that kind of thing.
| At least, in my projects, I have quite an easy time using zero
| native dependencies with those languages (except for the
| underlying kernel of course), and so I feel like there is a
| much lower chance of escape-hatch issues sneaking in.
|
| They aren't built on a foundation of legacy C and C++ libraries
| - not even the crypto - and I find that to be an advantage.
| verdagon wrote:
| This is a great point, and one that doesn't get enough
| attention. The article talks about using a static analysis
| tool, but usage of that tool is indeed opt-in, like you say.
|
| I suspect a language could mitigate this with the ability to
| sandbox a library's code. That could be pretty slow though, but
| we could compile it to wasm and then use wasm2c to convert it
| back into native code. I wrote a bit about this idea in [0],
| but I'd love to see someone make this work for C++.
|
| [0] https://verdagon.dev/blog/fearless-ffi
| jackmott42 wrote:
| If you were starting a new project you could put lints in place
| to make these things enforced. But at some point you have all
| these lints and customizations in place, and you can't use old
| or 3rd party C++ code any more because of them, so you begin to
| ask, why not just use a new language where this stuff isn't
         | pasted together with glue and baling wire?
| kbenson wrote:
| My point is really not about the code you write yourself, but
| the code you need to include in your project. Rare is the
| professional programmer that always gets to finish their
| project using only code they wrote themselves, and for many
           | projects that's _highly inadvisable_ (don't roll your own
           | crypto unless you have a very good reason).
|
| So, given that at times we will have to use external
| libraries, and given that even very safe languages often have
| escape hatches meaning you can't be _sure_ the code of one
| language has more constraints than another, it would be great
| to have other indicators than the language it was written in
           | that indicate what safety checks it uses.
|
| If next year you're writing a new program in a language that
| hasn't even been invented as of now, and is viewed as safer
| than every language out today, what does that actually get
| you if one of your constraints is that you need to include
| and use openssl or one of a few forks for compatibility
| reasons? Wouldn't you rather be able to look at the available
| options and see that come opt into specific safety
| constraints, and have been good about not them circumventing
| them, and do so _extremely easily_? Network effects and
| existing known projects seem to have an inordinate amount of
| staying power, so we might as well deal with that as a fact.
|
| The world is a messy place, but the more information we have
| the better our chances of making order out of it, even if
| temporarily.
___________________________________________________________________
(page generated 2023-06-23 23:00 UTC)