hngopher.com

       [HN Gopher] Borrow Checking, RC, GC, and Eleven Other Memory Saf...
       ___________________________________________________________________
        
       Borrow Checking, RC, GC, and Eleven Other Memory Safety Approaches
        
       Author : PotatoPancakes
       Score  : 167 points
       Date   : 2024-12-16 21:25 UTC (4 days ago)
        
 (HTM) web link (verdagon.dev)
 (TXT) w3m dump (verdagon.dev)
        
       | andrewstuart wrote:
       | I'd love to see a language that kept everything as familiar as
       | possible and implement memory safety as "the hard bit", instead
       | of the Rust approach of cooking in multiple different new sub
       | languages and concepts.
        
         | brabel wrote:
         | The author of the post is trying pretty much that with his
         | language, Vale.
        
         | phicoh wrote:
         | As a long time C programmer I like Rust because it combines two
         | things from C that are important to me (low runtime overhead,
         | no runtime system required) with a focus on writing correct
         | programs.
         | 
         | Memory safety is just one aspect where the compiler can help
         | making sure a program is correct. The more the compiler helps
         | with static analysis, the less we need to rely on creating
         | tests for edge cases.
        
           | zamalek wrote:
           | > Memory safety is just one aspect
           | 
           | I feel as though not enough attention is given to how std is
           | designed. For example: [u8], str, Path, and OsStr may be
           | confusing at first, but when you understand why they are
           | there any other approach feels icky. std guides you down a
           | path of caring about things that really should matter (at
           | least if you're only unwrapping provably safe values).
           | 
           | Have you considered what happens if not-utf8 data winds up in
           | an environment variable that you are writing to stdout? What
           | if it contains malicious VT commands?
        
             | shiomiru wrote:
             | > Have you considered what happens if not-utf8 data winds
             | up in an environment variable that you are writing to
             | stdout? What if it contains malicious VT commands?
             | 
             | Unless you're talking about terminal bugs in parsing
             | invalid UTF-8 - and parsing invalid UTF-8 is easier than
             | rendering _valid_ UTF-8 - VT commands are UTF-8 compatible.
             | You just need to embed an ASCII escape character.
        
               | adgjlsfhk1 wrote:
               | yeah I think this is an area where rust (and python) just
               | get it wrong. files, the Internet and input devices can
               | all give you invalid Unicode. IMO it's better to have a
               | primary string type that includes invalid Unicode since
               | most algorithms will handle it correctly anyway, and the
               | ones that won't can pretty clearly check and throw errors
               | appropriately (especially since very few algorithms work
               | correctly for all of Unicode in the first place)
        
               | dwattttt wrote:
               | There's a primary type that holds invalid utf8; it's
               | [u8]. If you want a string from that, you try turn it
               | into a string, and deal with the errors then.
               | 
               | If an algorithm works for invalid Unicode, it should
               | probably be an algorithm on bytes, not strings.
        
         | ramon156 wrote:
         | Tbf I would already consider a different language when it has
         | al the nice syntax sugar and design choices of Rust. I like
         | almost every choice they made, and I miss things like `if let
         | Some` or unwrapping in other languages. It's just not the same
        
         | chrismorgan wrote:
         | As an experienced Rust developer, I have absolutely no idea
         | what you mean by this. Could you write a little more about what
         | you have in mind, and even what you mean by sub-languages and
         | concepts in Rust?
        
         | karmakurtisaani wrote:
         | Isn't that just C/C++?
        
         | frou_dh wrote:
         | "Familiar" is subjective so it's not really something to hang
         | your hat on.
        
         | nicoburns wrote:
         | Familiar to whom? I came from a JavaScript background, and
         | Rust's syntax and "functional lite" style felt very familiar.
        
         | pornel wrote:
         | Safety is not an extra feature a'la carte. These concepts are
         | all inter-connected:
         | 
         | Safety requires unions to be safe, so unions have to become
         | tagged enums. To have tagged enums usable, you have to have
         | pattern matching, otherwise you'd get something awkward like
         | C++ std::variant.
         | 
         | Borrow checking works only on borrowed values (as the name
         | suggests), so you will need something else for long-lived/non-
         | lexical storage. To avoid GC or automatic refcounting, you'll
         | want moves with exclusive ownership.
         | 
         | Exclusive ownership lets you find all the places where a value
         | is used for the last time, and you will want to prevent double-
         | free and uninitialized memory, which is a job for RAII and
         | destructors.
         | 
         | Static analysis of manual malloc/realloc and casting to/from
         | void* is difficult, slow, and in many cases provably
         | impossible, so you'll want to have safely implemented standard
         | collections, and for these you'll want generics.
         | 
         | Not all bounds checks can be eliminated, so you'll want to have
         | iterators to implement typical patterns without redundant
         | bounds checks. Iterators need closures to be ergonomic.
         | 
         | ...and so on.
         | 
         | Every time you plug a safety hole, it needs a language feature
         | to control it, and then it needs another language feature to
         | make this control fast and ergonomic.
         | 
         | If you start with "C but safe", and keep pulling that thread,
         | nearly all of Rust will come out.
        
           | AlotOfReading wrote:
           | I've experienced that frustration myself several times and
           | tried to do "rust but simpler". I just recently failed
           | attempt #4 at not reinventing rust but worse. Attempt #2
           | ended with Erlang but worse, which was a pleasant surprise.
        
       | nahuel0x wrote:
       | It's surprising to see an article with such a large encompassing
       | of different techniques, hybrid techniques and design
       | interactions with the type system, but is more surprising that a
       | whole dimension of memory (un)management was left out: memory
       | fragmentation
        
         | Quekid5 wrote:
         | It's probably because fragmentation isn't a safety issue. (In
         | the sense of 'safety' being discussed here.)
        
           | galangalalgol wrote:
           | It doesn't create UB, but it is something safety critical
           | software has to address.
        
             | Quekid5 wrote:
             | ... which is why I had that little bit at the end there.
        
       | willvarfar wrote:
       | Meta comment, but I really like the formatting of the blog post!
       | 
       | It reminds me of the early days of the web, when text was king
       | and content was king. I particularly like the sidenotes in the
       | margins approach.
       | 
       | (Hope the author sees this comment :) Hats off)
        
         | brabel wrote:
         | Yeah the author always uses this in his blog about his
         | language, Vale (which is very unfortunately not being developed
         | anymore, at least for now). The other posts are also worth a
         | read: https://vale.dev/
        
           | ivell wrote:
           | He now works on Mojo, to bring linear types into Mojo.
        
         | dgan wrote:
         | I am sorry, I am maybe dumb but i can't see the 14 techniques
         | been listed anywhere? Where do i even click?
        
           | hawski wrote:
           | You need to read the post.
           | 
           | > Wait a minute, this list goes to 17, yet the intro only
           | mentions 14! I actually did that because a couple might
           | overlap and a couple of them are half-approaches, and that
           | last one is just here for fun. Besides, as I learn more
           | approaches and add them to the list, the title will get more
           | and more out of date anyway.
        
         | _kb wrote:
         | Side notes are a great layout for most deeper reads.
         | 
         | There's some great tooling for that via
         | https://edwardtufte.github.io/tufte-css/ and https://tufte-
         | latex.github.io/tufte-latex/.
        
       | chrismorgan wrote:
       | > _Cursed_
       | 
       | With an acute accent, that should be roughly /,ke:r'seId/ "curse-
       | ay-d". (Think "cafe" or "sashayed".)
       | 
       | The stylised pronunciation being evoked is roughly /'ke:r,sed/,
       | "curse-ed", and would be written with a grave accent: "cursed".
        
         | rzzzt wrote:
         | Italians will get close to the first pronunciation both ways, I
         | think. The Zalgo line noise is an international way of
         | signaling the level of curse in writing.
        
         | bloppe wrote:
         | Accents mean different things in different languages. I'm
         | assuming the author was invoking Spanish, where the acute
         | accent impacts syllable stress but not pronunciation.
        
       | mgaunard wrote:
       | The fact that re-using a slot for a different object of the same
       | type is considered a memory safety technique is ridiculous.
        
         | obl wrote:
         | It is not ridiculous at all. Those things have pretty precise
         | definitions and type segregation absolutely does remove a bunch
         | of soundness issues related to type confusion.
         | 
         | You can think of it as the rather classic "Vec of struct +
         | numeric IDs" that is used a lot e.g. in Rust to represent
         | complex graph-like structures.
         | 
         | This combined with bound checking is absolutely memory safe. It
         | has a bunch of correctness issue that can arise due to index
         | confusion but those are not safety issues. When combined with
         | some kind of generational counters those correctness issue also
         | go away but are only caught at runtime not at compile time (and
         | they incur a runtime cost).
         | 
         | Rust's memory safety is about avoiding liveness issues (that
         | become type confusions since all memory allocators will reuse
         | memory for different types), nothing more, nothing less.
        
           | jandrewrogers wrote:
           | FWIW, arrays of structs + integer handles is the primary way
           | objects are represented in performance-engineered C++.
        
           | gpderetta wrote:
           | > but those are not safety issues.
           | 
           | there are not _memory_ safety issues. But they definitely can
           | lead to security issues with some sort of confused deputy
           | attack.
           | 
           | For example a capability based system that relied on just
           | this form of memory safety would be pointless.
           | 
           | Of course this can be mitigated by adding version counters to
           | objects and object selectors.
        
         | adamrezich wrote:
         | "Safety" is a very overloaded English word with strong
         | connotations, and the popularization of it in the context of
         | "memory safety" has been good in some ways, but has really
         | poisoned the discourse in many others.
        
         | AlotOfReading wrote:
         | It absolutely is. Some embedded mallocs do this under the hood
         | against a predefined linker table. Genuinely helps even with
         | the restrictions it imposes.
        
         | eddd-ddde wrote:
         | Are you implying that it is unsafe? Or that it is unrelated to
         | safety properties?
        
           | mgaunard wrote:
           | It just shows memory safety is a joke, simply replacing a
           | class of bugs with another.
        
             | IshKebab wrote:
             | It isn't a joke. With memory safety bugs the value of an
             | object can unexpectedly be _any bit pattern_. That breaks
             | the assumptions of basically every language and leads to
             | pretty much anything happening.
             | 
             | If you have an array of objects of the same type and you
             | just pick the wrong one, then the data still has to be a
             | valid bit pattern. Yes it might still be a security bug,
             | but it's much less likely because you aren't completely
             | subverting the language.
             | 
             | Surely you don't think all bugs are the same because they
             | are all bugs?
        
               | mgaunard wrote:
               | It takes a lot more for a program to be correct than
               | having valid bit patterns.
               | 
               | To begin with, the whole point of classes is to maintain
               | invariants. Guaranteeing that a location in memory
               | matches the valid bit patterns of its members is far from
               | sufficient.
        
               | IshKebab wrote:
               | > It takes a lot more for a program to be correct than
               | having valid bit patterns.
               | 
               | Obviously. I never said otherwise. What's your point?
        
         | ptx wrote:
         | Maybe we need to introduce the term "value safety" to
         | complement "type safety".
         | 
         | If a language is merely type-safe, then it might be OK to
         | silently replace a value with a different one of the same type,
         | sure, fine. Who cares if the program transmits the wrong
         | message to the wrong recipient as long as it's definitely some
         | message and some recipient?
         | 
         | But a _value-safe_ language, I suggest, is one that doesn 't
         | pull this kind of switcheroo.
        
         | jcarrano wrote:
         | Reminds me of a video I saw about "cloning" on Super Mario
         | 64[1]. SM64 uses slots and one can mess around with slot
         | deletions to get random objects.
         | 
         | [1] https://www.youtube.com/watch?v=X2AhyDI58-I
        
       | hawski wrote:
       | That is very informational. Thank you.
       | 
       | I am interested in Vale and it feels very promising, though
       | because my interested in bootstrapping I don't like that it is
       | written in Scala. I know, that is shallow, but that's a thing
       | that limits my enthusiasm.
       | 
       | If you are like me and don't like jumping around between notes
       | and text and you prefer to read the notes anyway, here is a
       | little snippet you can run in Web Inspector's Console:
       | document.querySelectorAll(".slice-contents a[data-
       | noteid]").forEach(e => {document.querySelectorAll('.slice-notes
       | [data-noteid="' + e.attributes["data-noteid"].nodeValue + '"]
       | p').forEach(p => {p.style.fontSize = 'smaller';
       | e.parentNode.insertBefore(p, e)}); e.remove() })
       | 
       | It will replace note links with notes themselves making them
       | smaller, because they will not always fit smoothly.
        
       | tialaramex wrote:
       | The list gets very woolly by the end. CHERI exists (though not at
       | volume), Cornucopia Reloaded is a research paper, "plus some
       | techniques to prevent use-after-free on the stack" is entirely
       | hand waving.
       | 
       | It is really good as food for thought though.
        
       | nemetroid wrote:
       | No mention of RCU?
        
         | gpderetta wrote:
         | RCU, despite the name, is indeed a reclamation algorithm, but
         | not a general one. I.e. you would use RCU (or some other
         | deferred reclamation algorithm like hazard pointers) for
         | specific data structures when you do not have generalized
         | garbage collection.
         | 
         | A generalized RCU is just a tracing GC.
        
           | nemetroid wrote:
           | The part where the article limits itself to general
           | techniques eludes me.
        
       | DanielHB wrote:
       | I am not experienced with rust and borrow checkers, but my
       | impression is that borrow checkers also statically ensures
       | thread/async safety while most other memory safety systems don't.
       | Is this accurate?
        
         | PeterWhittaker wrote:
         | The first part - that the Rust borrow checker and overall
         | memory model ensures thread/async safety - is true. I cannot
         | speak to the second part - that other systems don't have this
         | assurance.
        
           | tialaramex wrote:
           | Just the borrowck isn't enough, you need the Send and Sync
           | marker traits. Marker traits are something lots of languages
           | _could_ do but they 'd be useless (or always unsafe) without
           | a lot of other machinery Rust had already.
        
             | kibwen wrote:
             | And Sync might just be there for performance, I think you
             | could get away with only Send if you didn't mind some
             | additional copying?
        
           | DanielHB wrote:
           | > that other systems don't have this assurance
           | 
           | My understanding is that most (all?) GC languages are memory
           | safe, but do not ensure statically verifiable thread safety
           | at all. Like Java, Go, C#, Python, etc.
        
         | kibwen wrote:
         | The borrow checker is only one component of the means by which
         | Rust statically enforces thread safety. If you design a
         | language that doesn't allow pointers to be shared across
         | threads at all, then you wouldn't need a borrow checker.
         | Likewise if you have an immutable-only language. What's
         | interesting about Rust is that it actually supports this
         | safely, which is still unbelievable sometimes (like being able
         | to send references to the stack to other threads via
         | std::thread::scoped).
        
           | nemaar wrote:
           | > If you design a language that doesn't allow pointers to be
           | shared across threads at all, then you wouldn't need a borrow
           | checker.
           | 
           | Is that actually true? I'm pretty sure you need the borrow
           | checker even for single threaded Rust to prevent use after
           | frees.
        
             | chlorion wrote:
             | Theres even more than just UAFs that you have to worry
             | about in a single threaded context, but yes you are
             | correct.
             | 
             | Here's a good post that talks about why shared mutability
             | even in single threaded contexts is dangerous:
             | https://manishearth.github.io/blog/2015/05/17/the-problem-
             | wi...
        
             | kibwen wrote:
             | Let me clarify: "wouldn't need a borrow checker for the
             | specific requirement of ensuring thread safety". Clearly
             | the borrow checker is quite useful in single-threaded
             | contexts on its own. :P The point is just that it's
             | perfectly valid to have a language that doesn't have
             | "reference semantics" at all.
        
             | jerf wrote:
             | You may have to define your terms more carefully and extend
             | past "doesn't allow pointers", but it can be true. Look to
             | Erlang; the key to Erlang's safety isn't actually its
             | immutable values, but the fact that no messages can travel
             | between the various "processes" (threads in conventional
             | parlance, just not "OS threads") that carry any sort of
             | reference or pointer is what makes it so Erlang is safe
             | without any concept of borrow checking. It also
             | semantically copies all values (there are some internal
             | optimizations for certain types of values that make it so
             | they are technically not copied, but at the language level,
             | it's all copies) and each process is GC'd, so in terms of
             | the Rust borrow checker I mention this _only_ in the
             | context of cross-thread sharing safety.
             | 
             | But in general, if threads can't communicate any sort of
             | pointer or reference that allows direct, unmediated access
             | to the same value that some other thread will see, there's
             | no need for a "borrow checker" for thread safety.
             | 
             | (Note that "but what if I have a thing that is just a token
             | for a value that people can potentially read and write from
             | anywhere?" is not an exception to this, because in this
             | context, such access would not be _unmediated_. This access
             | would still require messages to and from the  "holding"
             | process. This sort of safety won't stop you from basically
             | deliberately re-embedding your own new sorts of races and
             | unsafe accesses in at the higher level of your own code, it
             | just won't be a data race in the same way that multiple
             | threads reading and writing through the same pointer is a
             | data race at the lower level. The main solution to this
             | problem is "Doctor, it hurts when I do this." -> "Don't do
             | that.")
        
       | alexisread wrote:
       | Previous discussion:
       | 
       | https://news.ycombinator.com/item?id=40146615
       | 
       | https://news.ycombinator.com/item?id=41974185
        
         | dang wrote:
         | Thanks! Macroexpanded:
         | 
         |  _Borrow Checking, RC, GC, and the Eleven (!) Other Memory
         | Safety Approaches_ -
         | https://news.ycombinator.com/item?id=41974185 - Oct 2024 (1
         | comment)
         | 
         |  _Borrow Checking, RC, GC, and the Eleven () Other Memory
         | Safety Approaches_ -
         | https://news.ycombinator.com/item?id=40146615 - April 2024 (68
         | comments)
        
       | amelius wrote:
       | I like many of the ideas of Rust, but I still think it is an
       | unsuitable language for most projects.
       | 
       | The problem is that it is very easy to write non-GC'd code in a
       | GC'd language, but the other way around it is much much harder.
       | 
       | Therefore, I think the fundamental choice of Rust to not support
       | a GC is wrong.
        
         | FridgeSeal wrote:
         | It does have a GC.
         | 
         | It just runs at compile time. Bonus feature, it helpfully
         | prevents a number of common bugs too.
        
           | amelius wrote:
           | GC is short for automatic GC.
           | 
           | If you have to do it yourself, then it does not "have" a GC.
        
             | FridgeSeal wrote:
             | Ssshhh you're ruining my silly and "hasn't gone over
             | terribly well" joke.
        
         | f1shy wrote:
         | Yet another PoV: for some things with critical timing or so, GC
         | might be a problem. But most of the time, it isn't. The
         | performance/predictability topic could also be reviewed...
         | 
         | I was talking with a colleague about that, he said "in C I know
         | exactly where things are when" And I replied that under any OS
         | with virtual memory, you have basically no clue where are
         | things at any time, in the N levels of cache, and you cannot do
         | accurate time predictions anyway... [1]
         | 
         | I'm convinced today GC is the way to go for almost all. And I
         | was until 5 years ago or so, totally opposed to that view.
         | 
         | [1] https://news.ycombinator.com/item?id=42456310
        
           | pjmlp wrote:
           | Even with critical timing, real time GCs exist for decades
           | now, PTC and Aicas are two surviving companies selling
           | software tooling for embedded markets, including their own
           | JVM implementations, with AOT compilers, bare metal
           | deployments and real time GC.
           | 
           | Many of their customers are factory processes and military
           | deployments with weapons control, two scenarios where any
           | kind of stall might produce deadly results.
        
             | f1shy wrote:
             | Thank you so much! Now I have a more powerful argument,
             | when at work somebody says a motor ECU cannot use GC,
             | because of time constraints! Of course I assume there might
             | be a catch from the cost PoV, where Mil may be ok,
             | automotive consumer goods may be a problem (today)
        
               | pjmlp wrote:
               | Here,
               | 
               | https://www.ptc.com/en/products/developer-tools/perc
               | 
               | https://www.aicas.com/wp/products-services/jamaicavm/
        
             | dwattttt wrote:
             | While I agree with the point, the word "surviving" doesn't
             | do that point any favours.
        
               | pjmlp wrote:
               | Sadly in modern times many devs aren't willing to pay for
               | their tools...
               | 
               | Yet they surely appreciate the income they earn by using
               | them.
        
         | 3836293648 wrote:
         | You've got this one wrong. Rust is designed for a specific use
         | case. Most projects are not that use case. Therefore the choice
         | to use Rust is wrong.
         | 
         | If GC is an option and you want all the nice parts of Rust, use
         | OCaml
        
           | in-pursuit wrote:
           | I think OPs general point, although maybe not what they
           | stated is correct: it's easy to write GC'd code. It's "easy"
           | to write code with manual memory management. It's "easy" to
           | write RC code. But it's hard to write borrow checker code.
           | And that will probably limit adoption, even though the goals
           | of Rust are good.
        
           | pjmlp wrote:
           | If Go designers weren't so much anti-modern language design,
           | in many scenarios where people are rushing to do RIIR, they
           | would be better served with Go, if those modern features were
           | part of the language.
           | 
           | Having said that, there are still OCaml (as you noted),
           | Haskell, .NET languages with Native AOT, JVM languages with
           | GraalVM/OpenJ9, D, Nim, Swift, ....
           | 
           | And if one wants to get fancy with type systems, Idris,
           | Dafny, FStar,..
        
           | amelius wrote:
           | > If GC is an option and you want all the nice parts of Rust,
           | use OCaml
           | 
           | So are you saying it would be possible to use a hypothetical
           | "non-GC-enabled" OCaml compiler that complains if GC'd code
           | is invoked/generated, and it would be a similar experience as
           | using Rust?
        
             | johnnyjeans wrote:
             | No, simply because GC isn't an optional part of ocaml's
             | design. You can "disable" it only in the sense that you
             | stop any frees from happening.
             | 
             | In order to do away with Garbage collection you'd need to
             | completely change the type system, and while you might get
             | a result which ends up looking a lot like ocaml, it won't
             | be ocaml any more than ocaml is sml.
             | 
             | This _is_ what Rust started out as, by the way. In my
             | opinion, it should have stayed that way. Post-Mozilla Rust
             | has been nothing but a gigantic disappointment.
        
               | zozbot234 wrote:
               | Early versions of Rust "started out" _with_ GC: the
               | language was broadly similar to Golang, though with a lot
               | of influence from Ocaml and functional languages more
               | broadly. The comprehensive use of borrowck to avoid GC
               | altogether was only implemented around 2014 or so, hence
               | shortly before the Rust  "1.0" release in 2015.
        
           | amelius wrote:
           | > Most projects are not that use case. Therefore the choice
           | to use Rust is wrong.
           | 
           | Do you think that projects that have a large GUI component
           | should be written in Rust?
           | 
           | What if a project has both a "systems" and a GUI component to
           | it?
        
       | naasking wrote:
       | More people need to read up on C#'s ref's:
       | 
       | https://em-tg.github.io/csborrow/
       | 
       | These kinda-sorta fall under borrow checking or regions, just
       | without any annotations. Then again, Ada/Spark's strategy also
       | technically falls under Tofte-Talpin regions:
       | 
       | https://www.cs.cornell.edu/people/fluet/research/substruct-r...
        
         | mastax wrote:
         | Yeah C# is very well designed for gradually introducing low
         | level concepts for performance.
        
           | jimbob45 wrote:
           | The reverse is arguably the one trait Rust is missing that is
           | holding it back from mass adoption. You can write high-level
           | C# code and seamlessly introduce low-level concepts into it
           | as you see fit. However, although you can write low-level
           | Rust code easily, introducing high-level concepts is very
           | painful and, in many cases, impossible.
        
           | akkad33 wrote:
           | Also F#
        
             | neonsunset wrote:
             | As of now - F# needs more work w.r.t. support for [<Struct;
             | IsByRefLike>] types and byref<'T>s. There's a reason ref
             | lifetime analysis in Roslyn is quite a sophisticated part
             | of its implementation even if most developers aren't aware
             | of it.
             | 
             | F# has other cool features like `inline` bindings and
             | [<IlineIfLambda>] function attributes but I found it more
             | difficult to work with for systems programming when
             | attempting to stay within safe constructs.
             | 
             | But if you don't need byrefs, ref structs and spans, then
             | regular pointer-based or even array-based code is quite
             | pleasant to write, for example https://benchmarksgame-
             | team.pages.debian.net/benchmarksgame/...
             | 
             | With that said - F# is an incredible language, but for
             | systems programming you are likely to get better results in
             | C#. I hope eventually someone contributes subsequent spec
             | and compiler work to change this (and if you think you have
             | time and could try it - please do! it's a nice small
             | community).
        
               | akkad33 wrote:
               | Doesn't F# have byrefs, inrefs and outrefs. How are they
               | different from C# refs? Asking as someone with surface
               | level knowledge of both languages
               | https://learn.microsoft.com/en-us/dotnet/fsharp/language-
               | ref...
        
         | bob1029 wrote:
         | Here's an example where ref returns make a big impact:
         | 
         | https://github.com/prasannavl/WinApi
         | 
         | The performance of this interop layer is so close to native
         | it's difficult to argue for doing things the much more painful
         | way.
        
         | Someone wrote:
         | Isn't that a form of _"5. a mechanism where a pointer cannot
         | point to an object that is more deeply scoped than itself"_?
         | 
         | Also, since it's an error, I guess it must be different from
         | _gcc_ and _clang_ (and, likely other C compilers) _-Wreturn-
         | local-addr_ warnings
         | (https://www.emmtrix.com/wiki/Clang:Flag/-Wreturn-local-addr),
         | which can have false positives.
         | 
         | What's the difference? Is the language stricter, disallowing
         | some constructs that are valid, but hard to prove or is the
         | error not always triggered for buggy code?
        
           | naasking wrote:
           | > Isn't that a form of "5. a mechanism where a pointer cannot
           | point to an object that is more deeply scoped than itself"?
           | 
           | That constraint is just a Tofte-Talpin region restriction,
           | eg. lexically scoped regions.
        
       | xmcqdpt2 wrote:
       | Not a fan of the framing of the article. Firstly, there are
       | millions of Mayans alive today,
       | 
       | https://en.wikipedia.org/wiki/Maya_peoples
       | 
       | and secondly, the reason why the pre-Colombian cultural texts and
       | script are not in use today, even by the people who speak the 28
       | Mayan languages currently in use, is because of genocide by
       | Columbus and those that followed. The Catholic church destroyed
       | every piece of Mayan script they could get their hands on.
       | 
       | The article reads like the author is not aware of these basic
       | facts of American geography and history.
        
         | constantcrying wrote:
         | I thought it was just annoying to read. Irrelevant analogies
         | never helped me understand anything.
        
       | 4ad wrote:
       | > Interaction nets are a very fast way to manage purely immutable
       | data without garbage collection or reference counting.[...] HVM
       | starts with affine types (like move-only programming), but then
       | adds an extremely efficient lazy .clone() primitive, so it can
       | strategically clone objects instead of referencing them.
       | 
       | This is wrong, Interaction nets (and combinators) can model any
       | kind of computational systems, including ones that use mutation.
       | In fact, ICs are not really about types at all, although they do
       | come from a generalization of Girard's proofs nets, which came
       | from work in linear logic.
       | 
       | The interesting thing about ICs is that they are beta-optimal
       | (any encoding of a computation will be done in the minimum number
       | of steps required -- there is no _useless_ work being done), and
       | maximum-parallel with only local synchonization (all reduction
       | steps are local, and all work that _can_ be parallelized _will_
       | be parallelized).
       | 
       | Additionally ICs have the property that any encoding of a
       | different computational system in ICs will preserve the
       | asymptotic behavior of all programs written for the encoded
       | computational system. In fact, ICs are the only computational
       | system with this property.
       | 
       | Interaction nets absolutely require garbage collection in the
       | general sense. However, interaction combinators are linear and
       | all garbage collection is explicit (but still exists). HVMs
       | innovation is that by restricting the class of programs encoded
       | in the ICs you can get very cheap lambda duplication and eschew
       | the need for complex garbage collection while also reducing the
       | overhead of implementing ICs on regular CPUs (no croissants or
       | brackets, see Asperti[1] for what that means).
       | 
       | Having a linear language with the above restriction allows for a
       | very efficient implementation with a very simple GC, while
       | maximizing the benefits of ICs. In principle _any_ language can
       | be implemented on top of ICs, but to get most benefits you want a
       | language with these properties. It 's not that HVM starts with
       | affine types and an efficient lazy clone operation, it's that a
       | linear language allows extremely efficient lazy cloning
       | (including lambda cloning) to be implemented on top of ICs, and
       | the result of that is HVM.
       | 
       | > The HVM runtime implements this for Haskell.
       | 
       | This is very wrong. HVM has nothing to do with Haskell. HVM3 is
       | written in C[2], HVM2 has three implementations, one in C[3], one
       | in Rust[4], and a CUDA[5] one. HVM1 was just a prototype and was
       | written in Rust[6].
       | 
       | HOC[7], the company behing HVM provides two languages that
       | compile to HVM, Bend[8], and Kind[9]. Bend is a usual functional
       | language, while Kind is a theorem prover based on self types.
       | 
       | Haskell is not involved in any of these things except that the
       | HVM compiler (not runtime) is written in Haskell, but that is
       | irrelevant, before Haskell it used to be written in TypeScript
       | and then in Agda (Twitter discussion, sorry, no reference). It's
       | an implementation detail, it's not something the user sees.
       | 
       | Please note that HVM adds some stuff on top of ICs that makes it
       | not strictly beta-optimal, but nevertheless the stuff added is
       | useful in practice and the practical downgrade from theoretical
       | behaviour is minimal.
       | 
       | [1] Andrea Asperti, The Optimal Implementation of Functional
       | Programming Languages, ISBN-13: 978-0060815424
       | 
       | [2]
       | https://github.com/HigherOrderCO/HVM3/blob/main/src/HVML/Run...
       | 
       | [3] https://github.com/HigherOrderCO/HVM/blob/main/src/hvm.c
       | 
       | [4] https://github.com/HigherOrderCO/HVM/blob/main/src/hvm.rs
       | 
       | [5] https://github.com/HigherOrderCO/HVM/blob/main/src/hvm.cu
       | 
       | [6] https://github.com/HigherOrderCO/HVM1
       | 
       | [7] https://higherorderco.com
       | 
       | [8] https://github.com/HigherOrderCO/bend
       | 
       | [9] https://github.com/HigherOrderCO/kind
        
       | pizlonator wrote:
       | The way you make garbage collection deterministic is not by doing
       | regions but by making it concurrent. That's increasingly common,
       | though fully concurrent GCs are not as common as "sorta
       | concurrent" ones because there is a throughput hit to going fully
       | concurrent (albeit probably a smaller one than if you partitioned
       | your heap as the article suggests).
       | 
       | Also, no point in calling it "tracing garbage collection". Its
       | just "garbage collection". If you're garbage collecting, you're
       | tracing.
        
         | jakewins wrote:
         | Do you have any recommended reading material on this?
         | 
         | Intuitively it feels like making it concurrent should do the
         | opposite of making GC deterministic! I'd love to read something
         | showing that intuition is wrong
        
           | pizlonator wrote:
           | Garbage collection handbook
           | 
           | https://gchandbook.org/
           | 
           | If you want to see my latest concurrent GC, see
           | 
           | https://github.com/pizlonator/llvm-project-
           | deluge/blob/delug...
           | 
           | https://github.com/pizlonator/llvm-project-
           | deluge/blob/delug...
        
         | roetlich wrote:
         | > Also, no point in calling it "tracing garbage collection".
         | 
         | You're against more explicit naming just for the sake of it? In
         | the literature reference counting is also referred to as a type
         | of garbage collection, and doesn't involve tracing. If you
         | talking about a specific context you can probably drop the
         | "tracing", but in a general article like this it would just be
         | very confusing?
         | 
         | This way, someone can google "tracing garbage collection", and
         | will find the relevant wikipedia article:
         | https://en.wikipedia.org/wiki/Tracing_garbage_collection
        
           | pizlonator wrote:
           | The literature does not always put the "tracing" in front of
           | the "garbage collection".
           | 
           | For example, nobody says that Objective-C is garbage
           | collected just because it has ARC. Nobody says that C++ is
           | garbage collected even though shared_ptr is widespread. And
           | systems that do tracing GC just call it GC (see for example h
           | ttps://www.oracle.com/webfolder/technetwork/tutorials/obe/j..
           | .)
           | 
           | To think clearly about the tradeoff between GC and RC it's
           | important to acknowledge the semantic differences:
           | 
           | - GC definitely collects dead cycles.
           | 
           | - RC knows exactly when objects die, which allows for
           | sensible destructor semantics and things like CoW.
           | 
           | - it's possible to use RC as an optimization in a GC, but
           | then you end up with GC semantics and you still have tracing
           | (hence: if it's got tracing, it's a garbage collector).
           | 
           | It's a recent fad to say that RC is a kind of GC, but I don't
           | think it ever took off outside academia. Folks who write GCs
           | call them GCs. Folks who do shared_ptr or ARC say that they
           | don't use GC.
           | 
           | And its good if this fad dies because saying that RC is a
           | kind of GC causes folks to overlook the massive semantic
           | elephant in the room: if you use a GC then you can't swap it
           | for RC because you'd leak memory (RC would fail to delete
           | cycles), and if you use RC and swap it for a GC then you'd
           | leak resources (your destructors would no longer get called
           | when you expect them to).
           | 
           | On the other hand, it is possible to change the guts of an RC
           | impl without anyone noticing. And it's possible to change the
           | guts of a GC while preserving compatibility. So these are
           | really two different worlds.
        
             | roetlich wrote:
             | > The literature does not always put the "tracing" in front
             | of the "garbage collection".
             | 
             | Not always, but often enough that an introductory article
             | that presents an overview of different memory managment
             | techniques should maybe use the longer name to avoid
             | confusion.
             | 
             | And I kinda agree with you, using the name "garbage
             | collection" for RC doesn't really make sense, there is no
             | metaphorical garbage truck driving around to collect your
             | unused memory. :)
             | 
             | What's your opinion on the term "RC with cycle detection"
             | that some use for things like Python's GC?
             | 
             | > And it's possible to change the guts of a GC while
             | preserving compatibility.
             | 
             | Moving to a conservative GC might also introduce memory
             | leaks, if you're unlucky. But yes, "tracing" gc and rc
             | obeviously behave very differently, and have very different
             | performance considerations.
        
               | pizlonator wrote:
               | > Not always, but often enough that an introductory
               | article that presents an overview of different memory
               | managment techniques should maybe use the longer name to
               | avoid confusion.
               | 
               | Referring to garbage collection as tracing garbage
               | collection creates more confusion and should be avoided.
               | 
               | It confuses folks into thinking that there is some
               | garbage collection that isn't tracing. There's no such
               | thing.
               | 
               | > What's your opinion on the term "RC with cycle
               | detection" that some use for things like Python's GC?
               | 
               | Depends on how you detect cycles. Python uses a garbage
               | collector. Therefore I would say that python has a GC and
               | is a GC'd language.
               | 
               | > Moving to a conservative GC might also introduce memory
               | leaks, if you're unlucky.
               | 
               | Folks who adopt conservatism in production do so only if
               | they have a story for avoiding those leaks. (That's what
               | we did in JavaScriptCore.)
               | 
               | > But yes, "tracing" gc and rc obeviously behave very
               | differently, and have very different performance
               | considerations.
               | 
               | Just call it "GC" and everyone will know what you mean.
               | No need to be a contrarian and put "tracing" in front.
               | 
               | And it's not perf considerations if it's the difference
               | between your program running at all and crashing. Failing
               | to collect all cycles as RC does would cause a program
               | written in a GC'd language to simply crash if it ran for
               | more than just a short while. Failing to invoke
               | destructors the way RC'd programs expect, which would
               | happen if you tried to switch to GC, will cause
               | observably different behavior in addition to possible
               | performance issues.
        
               | roetlich wrote:
               | > And it's not perf considerations if it's the difference
               | between your program running at all and crashing.
               | 
               | Yeah, that's what I meant to include by "behave very
               | differently". I don't think we disagree on anything
               | technical here. The problem is if you are currently
               | googling for garbage collection you will mostly get
               | garbage results. Here's duckduckgo to avoid my search
               | bubble:
               | https://duckduckgo.com/?q=garbage+collection+programming
               | 
               | The first result is the wikipedia article: https://en.wik
               | ipedia.org/wiki/Garbage_collection_%28computer... It's
               | pretty bad, under "Strategies" it lists three: Tracing,
               | Reference Counting and Escape Analysis. I'm sure these
               | three are similar things.
               | 
               | The second result is this blog post, also listing
               | rereference counting as gc:
               | https://www.freecodecamp.org/news/a-guide-to-garbage-
               | collect...
               | 
               | And the third result looks okay. Searching for "tracing
               | garbage collection" has better results. The text in
               | question already uses "gc" most of the time, and has a
               | footnote saying:
               | 
               | > By "garbage collection", we're referring to tracing
               | garbage collection.
               | 
               | I think that's as clear as it gets, without going on rant
               | about the names of things. You are clearly an expert in
               | garbage collectors, but most people in the target
               | audience of that article are not. The article compares
               | the differences between rc and gc. If someone then goes
               | and reads the wikipedia articles about either of those
               | they will be very confused because wikipedia will tell
               | them rc is gc. A "fad" like this can't be undone, once a
               | usage of a word becomes this popular you can't undo it.
               | 
               | Okay, sorry, this was too long, and we agree to like 99%
               | anyway. Have a nice day! :)
        
               | pizlonator wrote:
               | That reminds me of the time I went on a first date and
               | she asked, "so what do you do exactly?" and I said I work
               | on "garbage collection". You should have seen her face!
        
         | azornathogron wrote:
         | Using concurrent GC has various advantages but in no way makes
         | it _deterministic_.
        
           | pizlonator wrote:
           | True, not by itself.
           | 
           | But concurrent GC is the basis for making deterministic GC,
           | since it gives you the option of scheduling GC work whenever
           | you like rather than pausing the world.
           | 
           | Some concurrent GCs are also deterministic while others
           | aren't. I've written both kinds.
        
             | gpderetta wrote:
             | you have some very non-standard definition of determinism.
        
               | pizlonator wrote:
               | How so?
        
       | bluGill wrote:
       | Why is garbage collection called memory safety? Garbage
       | collection in whatever form is only memory safe if it doesn't
       | free memory that will still be used. (which means if you actually
       | get all your free calls correct C is memory safe - most long
       | lived C code bases have been beat on enough that they get this
       | right for even the obscure paths).
       | 
       | Use after free is important, but in my experience not common and
       | not too hard to track down when it happens (maybe I'm lucky? - we
       | generally used a referenced counted GC for the cases where
       | ownership is hard to track down in C++)
       | 
       | I'm more worried about other issues of memory safety that are not
       | addressed: write into someone else's buffer - which is generally
       | caused by write off the end of your buffer.
        
         | constantcrying wrote:
         | >Why is garbage collection called memory safety? Garbage
         | collection in whatever form is only memory safe if it doesn't
         | free memory that will still be used.
         | 
         | Yes. A garbage collector is only safe if it works correctly.
         | What an irrelevant observation. Nothing can guarantee that
         | something works correctly if it doesn't work correctly.
        
           | bluGill wrote:
           | Keep reading. That is all a garbage collector gives me. there
           | are lots of other things that that are memory unsafe that
           | garbage collectors don't give me.
        
         | foota wrote:
         | To answer your question, I'd say it's memory safe when it's a
         | part of the runtime. At some point, you're relying on your
         | runtime to be correct, so if it says it does garbage collection
         | then you can rely on it, in the same way you rely on the
         | allocator not to randomly trash your memory etc.,.
        
           | bluGill wrote:
           | You misunderstand. Sure that is a part of memory safe, but
           | why is the much larger problem of running off the end of the
           | buffer into something else not considered a larger part. In
           | my experience the later is a worse problem (the blame for
           | issues goes to someone else who's code is working perfectly
           | correct and so they spend months trying to find a logic error
           | before someone finally looks elsewhere - often the fix is
           | just a random fix by those who are at fault and so the team
           | will spend months more looking before closed as "doesn't
           | happen anymore, no idea why". Memory leaks by contrast are
           | hard to track down, but at least they leave obvious clues and
           | so the blame doesn't go to the wrong person.
        
       | the__alchemist wrote:
       | After pondering, my single favorite capability of rust is this:
       | fn modify(val: &mut u8) {           // ...       }
       | 
       | No other language appears to have this seemingly trivial
       | capability; their canonical alternatives are all, IMO, clumsier.
       | In light of the article, is this due to Rust's memory model, or
       | an unrelated language insight?
        
         | constantcrying wrote:
         | How is a mutable reference as an argument to a function in any
         | way unique? C++ has mutable references as arguments to
         | functions.
        
           | the__alchemist wrote:
           | You're right; you can do this in C++, although the patterns
           | I've seen usually involve pointers.
           | 
           | There is no equivalent in Python, Javascript, Typescript,
           | Java, or Kotlin.
        
             | constantcrying wrote:
             | >I've seen usually involve pointers.
             | 
             | If only C++ had a thing called "references", which
             | conveniently also used the "&" symbol and are a way to do
             | exactly this without pointers.
             | 
             | https://en.cppreference.com/w/cpp/language/reference
        
               | the__alchemist wrote:
               | I stand corrected on C++. Can you think of any other
               | languages? Of the ones I listed, I use all of, and am
               | continuously frustrated by this limitation.
        
               | constantcrying wrote:
               | >Can you think of any other languages?
               | 
               | Yes. E.g. every language where you can pass pointers can
               | do this. Reference are a construct around pointers.
               | 
               | >Of the ones I listed, I use all of, and am continuously
               | frustrated by this limitation.
               | 
               | Why? This is only a limitation for a few basic types. All
               | the language you mention allow you to modify objects
               | passed to a function.
               | 
               | If it ever were any issue though you can always return
               | the modified object and overwrite the original.
               | 
               | a = modify(a)
               | 
               | works even if you can only pass a value.
               | 
               | In practice this should never be a limitation, certainly
               | I have never encountered it as one.
        
               | the__alchemist wrote:
               | I challenge you to write the function signature I posted
               | in Python, Javascript, or Kotlin; I'm confident you will
               | not be able to, and the alternatives will be more
               | provincial and unergonomic.
               | 
               | Hint: Python and Javascript will use the type passed to
               | determine mutability. Kotlin will let you pass a
               | mutableStateFlow etc that can be used for, e.g., top-
               | level data structure, but not an arbitrary variable.
               | (e.g. a data structure field, or free variable).
               | 
               | The most popular languages, outside of C and C++, cannot
               | pass pointers. And, raw pointers are not ideal; they lose
               | type safety, memory safety etc.
               | 
               | There are always canonical patterns for a given language
               | to write a function like I did. I find them to be
               | universally more complicated and less explicit/versatile
               | than what I posted; they require code to permeate beyond
               | the function signature.
        
               | constantcrying wrote:
               | It works the moment you wrap the int in a class.
               | 
               | class A:                   a = 0
               | 
               | def modify (v):                   v.a = 2
               | 
               | a1 = A()
               | 
               | a1.a = 1
               | 
               | print(a1.a)
               | 
               | modify(a1)
               | 
               | print(a1.a)
        
               | the__alchemist wrote:
               | You're correct re the class wrapper (HN nesting limit).
               | Can you see why that solution is complicated, and not
               | versatile? (hint: This limitation will cause ripples of
               | complication throughout your data structures)
        
               | constantcrying wrote:
               | I explicitly excluded primitive types in my previous
               | comment.
               | 
               | I have never seem a codebase where this was an issue. The
               | most important concept in programming is defining your
               | data structures. Professional programs almost never
               | operate on primitive data structures, but always on
               | larger structures, like classes or structs.
               | 
               | This is why this is only a problem for you. I don't want
               | to be insulting, but you need to brush up on your data
               | structures. The need to pass a modifiable data structure
               | almost never arises, the idiom in your OP is extremely
               | uncommon and would likely not pass any code review.
        
               | the__alchemist wrote:
               | You have missed the point: Functions should not need to
               | be provincial to a specific data structure. Nor should
               | data structures include field wrappers to subvert a
               | language's sloppy mutation semantics.
               | 
               | The point is that with mutable references, you can modify
               | only the part of the struct you need to, using a general
               | function. You can use this same function to modify a
               | different part of the structure, or a different
               | structure, or a free variable.
               | 
               | Your inference that preferring mutable references to
               | wrapper structs implies insufficient knowledge of data
               | structures is puzzling and incorrect.
        
               | constantcrying wrote:
               | You don't need to wrap fields. You just pass the classes.
               | 
               | Think about why nobody but you is inconvenienced by this.
               | I am sorry, this is a _you_ problem, because you are
               | violating common idioms of the language.
               | 
               | There is basically never a reason not to pass your data
               | structure to a function, which is why what you are doing
               | is an incredibly niche functionality.
               | 
               | Also functions _should_ be particular to data structure.
               | That is why types exists.
        
               | the__alchemist wrote:
               | General functions are useful.
               | 
               | It is not a _me_ problem, which is why Rust has fixed
               | this. You are arguing that functions which mutate
               | parameters generally are not useful, which isn 't true.
               | You have implied that I have a lack of knowledge of data
               | structures, my code wouldn't pass code review. Reflect:
               | You are proposing it is preferable to have a simple
               | function accept a whole data structure when it may only
               | act on one or more of its fields, and may be used in
               | other places. That is a semantic misdirection, and
               | provincialization.
        
               | renox wrote:
               | Ada has inout parameters.
        
               | diath wrote:
               | Most low-level languages support this, references in C++,
               | pointers in C, ref keyword in C# and Dlang, and so on.
               | It's mostly higher-level languages that do not support
               | such things because it's in the VM semantics that dictate
               | how objects are passed (i.e. in Lua tables are passed by
               | reference, non-table values are passed by copy).
        
       | bitbasher wrote:
       | I'm kinda torn. It seems there are only three approaches.
       | 
       | 1. laissez-faire / manual memory management (c, c++, etc)
       | 
       | In this approach, the programmer decides everything.
       | 
       | 2. dictatorship / garbage collection (java, go, etc)
       | 
       | In this approach, the runtime decides everything.
       | 
       | 3. deterministic / lifetime memory management (rust, c with
       | arenas, etc)
       | 
       | In this approach, the problem determines everything.
        
         | ryao wrote:
         | There is another option, which is to use a sound static
         | analyzer that can prove the absence of memory safety issues
         | like astree, and fix things that cause it to complain until it
         | stops complaining:
         | 
         | https://www.absint.com/astree/index.htm
         | 
         | For those who think static analyzers cannot do that, notice the
         | word "sound". This is a different type of static analyzer than
         | the more common ones that do not catch everything.
         | 
         | Sadly, there is no open source option that works across a broad
         | range of software. NASA's IKOS is open source for example, but
         | it does not support multithreading and some other things that I
         | do not recall offhand, which makes it unable to catch all
         | memory safety bugs in software using the features that it does
         | not support. For now, people who want to use sound static
         | analyzers need to use closed source tools or restrict
         | themselves to a subset of what C/C++ can do so they can use
         | IKOS:
         | 
         | https://github.com/NASA-SW-VnV/ikos
        
           | voxl wrote:
           | Even in their own list of features "Memory Safety" does not
           | pop up, and none of the listed features indicate to me that
           | they would entail memory safety. Academics aren't publishing
           | droves of work on separation logic for a problem that can be
           | solved by 1980s static analyzers.
        
             | ryao wrote:
             | They say that they can prove the absence of classes of bugs
             | that make up memory safety:
             | 
             | * out-of-bounds array indexing,
             | 
             | * erroneous pointer manipulation and dereferencing (NULL,
             | uninitialized and dangling pointers),
             | 
             | * read access to uninitialized variables,
             | 
             | NIST gave them a glowing review:
             | 
             | https://nvlpubs.nist.gov/nistpubs/ir/2020/NIST.IR.8304.pdf
             | 
             | It is possible to make sound static analyzers that can
             | prove code to be free from memory safety bugs, but it is
             | difficult and you need to implement checks that either
             | complain or mathematically prove the absence of each class
             | to do it.
             | 
             | These tools have been used for years in the aviation and
             | nuclear industries, but almost nobody outside of those
             | industries knows anything about them. If others had broader
             | awareness of the existence of these tools, we could get
             | open source equivalents and memory safety issues would be a
             | thing of the past in C/C++ code.
             | 
             | Finally, your 1980s remark reveals enormous ignorance of
             | what the field of formal methods has produced. Astree was
             | not available in the 1980s. It took over 40 years of work
             | to make and only became available about 20 years ago. C++
             | support took much longer for it to add, with it only adding
             | it around 6 years ago if my recollection of the public
             | documentation is correct. Some other things in this space
             | are the Polyspace Code Prover and Frama-C.
        
       | westurner wrote:
       | From https://news.ycombinator.com/item?id=33560227#33563857 :
       | 
       | - Type safety > Memory management and type safety:
       | https://en.wikipedia.org/wiki/Type_safety#Memory_management_...
       | 
       | - Memory safety > Classification of memory safety errors:
       | https://en.wikipedia.org/wiki/Memory_safety#Classification_o...
       | 
       | - Template:Memory management
       | https://en.wikipedia.org/wiki/Template:Memory_management
       | 
       | - Category:Memory_management
       | https://en.wikipedia.org/wiki/Category:Memory_management
        
       | eklitzke wrote:
       | I don't understand why they say that reference counting is
       | "slow". Slow compared to what? Atomic increments/decrements to
       | integers are one of the fastest operations you can do on modern
       | x86 and ARM hardware, and except in pathological cases will
       | pretty much always be faster than pointer chasing done in a
       | traditional mark and sweep VMs.
       | 
       | This isn't to say reference counting is without problems (there
       | are plenty of them, inability to collect cyclical references
       | being the most well known), but I don't normally think of it as a
       | slow technique, particularly on modern CPUs.
        
         | gpderetta wrote:
         | Atomic reference counting per se is fairly slow compared to
         | other simple operations [1]. But the biggest issue with
         | reference counting is that it doesn't scale well in
         | multithreaded programs: even pure readers have to write to
         | shared memory locations. Also acquiring a new reference from a
         | shared atomic pointer is complex and need something like hazard
         | pointers or a lock.
         | 
         | [1] an atomic inc on x86 is typically ~30 clock cycles, doesn't
         | really pipeline well and will stall at the very least other
         | load operations.
        
       | immibis wrote:
       | MMM++ is a variation of standard malloc/free. You can still UAF,
       | but only to another object of the same type, which may or may not
       | prevent an exploit.
       | 
       | Something that's missing is full-on formal verification where you
       | write unrestricted C code and then mathematically prove it
       | doesn't have any bugs. Nobody does this because proving a C
       | program is correct is harder than mining a bitcoin block by hand,
       | but it's useful to anchor one end of the safety/freedom spectrum.
       | Many other approaches (such as borrow checking) can also be
       | viewed as variants of this where you restrict the allowed program
       | constructs to ones that are easier to prove things about.
        
       | lilyball wrote:
       | I don't see any mention of epoch-based garbage collection (see
       | crossbeam
       | https://docs.rs/crossbeam/latest/crossbeam/epoch/index.html).
       | Generational References sounds like a related concept but it's
       | not the same. I'm also surprised nobody's mentioned that one
       | lance corporal goat yet.
        
       ___________________________________________________________________
       (page generated 2024-12-20 23:02 UTC)