[HN Gopher] Allocgate: Restructuring how allocators work in Zig
___________________________________________________________________
Allocgate: Restructuring how allocators work in Zig
Author : todsacerdoti
Score : 60 points
Date : 2021-12-15 20:21 UTC (2 hours ago)
(HTM) web link (pithlessly.github.io)
(TXT) w3m dump (pithlessly.github.io)
| khiner wrote:
| This was a fantastic, thorough explanation, thank you for putting
| in the effort!
| celeritascelery wrote:
| I am not fully understanding how fat-pointers allow LLVM to
| devirtualize the function calls. If the allocators are
| polymorphic then a particular piece of code doesn't know which
| vtable it will get at run time, correct?
| Spex_guy wrote:
| It depends a lot, but in practice in Zig devirtualization is
| effectively constant propagation. The compiler needs to see the
| place where the vtable is created, and follow that to the place
| where virtual functions are called, ensuring along the way that
| nothing modifies the vtable. This is not possible for all uses
| of interfaces, but it is possible for many of them, especially
| ones where the interface is sort of "temporary" and you are
| usually passing around the implementation. These are the cases
| targeted by this change.
|
| The difference in results has to do with pointer provenance
| tracking and aliasing. With both approaches, the first call to
| an interface function will almost definitely be devirtualized.
| The problem is that that first call will also modify
| implementation state. If the implementation function is not
| inlined (which is common), this is tracked as a modification to
| the memory region containing the implementation state. But with
| the fieldParentPtr model, that's the same memory region
| containing the vtable! So this breaks constant propagation on
| the vtable and any later calls must always be fully virtual,
| even if the optimizer can see the whole way from vtable
| creation to virtual call.
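
To make the two layouts concrete, here is a minimal C++ sketch (not Zig, and not the Zig standard library's actual code). All names (`EmbeddedAllocator`, `EmbeddedBump`, `FatAllocator`, `VTable`, `Bump`) are hypothetical and exist only for illustration. In the first layout the function pointer lives inside the object that each call mutates; in the second, the vtable is a separate constant object, so writes to the allocator state cannot touch it.

```cpp
#include <cstddef>

// Layout 1: interface embedded in the implementation (fieldParentPtr style).
struct EmbeddedAllocator {
    void* (*allocFn)(EmbeddedAllocator* self, std::size_t n);
};

struct EmbeddedBump {
    EmbeddedAllocator iface;  // first member, so the cast below is valid
    unsigned char buf[256];
    std::size_t used;
};

void* embeddedBumpAlloc(EmbeddedAllocator* self, std::size_t n) {
    // Recover the implementation from a pointer to its interface field.
    auto* bump = reinterpret_cast<EmbeddedBump*>(self);
    if (n > sizeof(bump->buf) - bump->used) return nullptr;
    void* p = bump->buf + bump->used;
    bump->used += n;  // writes to the same object that holds allocFn
    return p;
}

// Layout 2: fat pointer = context pointer + pointer to an immutable vtable.
struct VTable {
    void* (*alloc)(void* ctx, std::size_t n);
};

struct FatAllocator {
    void* ctx;
    const VTable* vtable;
    void* alloc(std::size_t n) const { return vtable->alloc(ctx, n); }
};

struct Bump {
    unsigned char buf[256];
    std::size_t used;
};

void* bumpAlloc(void* ctx, std::size_t n) {
    auto* bump = static_cast<Bump*>(ctx);
    if (n > sizeof(bump->buf) - bump->used) return nullptr;
    void* p = bump->buf + bump->used;
    bump->used += n;  // writes only to Bump; the vtable is a separate const object
    return p;
}

// The vtable is a compile-time constant that no allocation call can modify.
constexpr VTable bumpVTable{bumpAlloc};

FatAllocator bumpAllocator(Bump& b) { return FatAllocator{&b, &bumpVTable}; }
```

Whether a given optimizer actually devirtualizes either form depends on inlining and alias analysis; the sketch only shows why the second layout gives it less to worry about.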
| ayende wrote:
| You aren't modifying the same object. When you have a fat
| pointer, LLVM can tell you are modifying the ptr, not the
| vtable.
|
| You can then cache the function call
|
| When your vtable is in the object you are mutating, it needs to
| read each time
| celeritascelery wrote:
| Does LLVM do the inline caching itself? Or does it just
| enable zig to do this optimization?
| ayende wrote:
| What happens is likely that the code gen can do the virtual
| lookup once, instead of on each loop iteration.
|
| This is LLVM, not Zig.
| Shadonototra wrote:
| why not just add an interface type instead of this giant mess?
| kristoff_it wrote:
| "allocgate" is actually a meme name that we purposely gave to
| this API change. In reality it's not a big deal and that's the
| superpower that a language v0 has: you can make breaking
| changes.
|
| As for the builtin interface type, why get locked into one
| particular implementation when you can have all of them by
| leaving the choice to the programmer.
| Shadonototra wrote:
| > As for the builtin interface type, why get locked into one
| particular implementation when you can have all of them by
| leaving the choice to the programmer.
|
| that's a very good point!
| williamstein wrote:
| I agree that it's not a big deal. I am not a Zig developer,
| but I've written a few thousand lines of Zig code in the last
| few months that extensively uses allocators all over the
| place, and it only took me a few minutes to update my code to
| work with this API change. Also, having seriously played
| around in Zig for a few months writing high performance pure
| mathematics and number theory code for fun, I *really,
| really* like it. Zig is a fantastic language for certain
| application domains.
| levzettelin wrote:
| Does anyone know if RAII types will ever be a thing in Zig?
| Spex_guy wrote:
| It's unlikely. RAII comes with a surprising amount of
| complexity. In order to have a reasonably complete language
| that has RAII and value types, you _must_ also have:
|
| - constructors
| - destructors
| - overloadable copy assignment operators
| - placement new
| - move semantics and rvalue references
|
| These features come together or not at all. If you lose any of
| them, the language becomes less complete. I think Rust and C++
| are doing a fine job of exploring the design space of languages
| that have this feature set, but it's too much complexity for
| Zig.
| steveklabnik wrote:
| > In order to have a reasonably complete language that has
| RAII and value types, you must also have: - constructors -
| destructors - overloadable copy assignment operators -
| placement new - move semantics and rvalue references
|
| Rust has RAII and value types, and does not have
| constructors, overloadable copy assignment operators,
| placement new, or rvalue references (though we do of course
| have a very similar notion to rvalue/lvalue in general, but
| that's not the same thing as "rvalue references" with
| relation to all of this). While it has move semantics,
| they're significantly different.
| Spex_guy wrote:
| I don't mean that these things need to manifest in exactly
| the same way as they do in C++, but analogous features are
| needed. You're right that rvalue references are not
| necessary, but some form of move semantics are. When I say
| constructors and destructors, I am really referring to
| having a concept of object lifetimes as part of the
| language. Zig does not have this, and is much simpler
| because of it.
|
| Edit: to clarify, the thing that makes a
| constructor/destructor useful in this case _is_ the
| property that it begins/ends an object lifetime according
| to the language. This lifetime reasoning certainly has
| benefits, like the ability to have const fields in C++ and
| the ability to do static checking of lifetimes in Rust.
| However it also comes with significant complexity, because
| move semantics are needed throughout the language, and
| begin/end lifetime tags are needed when implementing data
| structures that use preallocated backing arrays.
| steveklabnik wrote:
| Hm, personally I consider "object lifetimes exist" to be
| completely different than "constructors", which are a
| hook into a specific point in some sort of object
| lifetime cycle. Rust doesn't have the hook, so it doesn't
| have the feature. Note that I didn't put destructors on
| my list; the Drop trait does exist in Rust and is the
| same general idea as destructors.
|
| I guess that basically, to me at least, if you've
| stretched the definitions of these features far enough to
| include what Rust does, you don't really have a
| meaningful definition any more.
| tialaramex wrote:
| Rust doesn't end up with an overloadable copy assignment
| operator, it does something else which I would argue is
| cleverer (although you can't just add it to an existing
| language)
|
| Because Rust knows the lifetime of everything in your program
| (in Rust the lifetime of things is part of their type) the
| effect of the assignment operator = is to dispose of whatever
| was in the variable before, and _move_ the assigned item into
| the variable.
|
| Rust's Copy trait does _not_ alter the semantics of the
| assignment operators - you can't overload that. You promise
| that your type's in-memory representation is all that
| matters, and then if you move _from_ a variable the value in
| that variable is still live even though there was a copy
| made, usually that value would be dead because it was moved
| from.
|
| Rust's Clone trait behaves a little like a C++ copy
| constructor, except, it's an explicit trait, the only way to
| get a clone of x is to x.clone() or various moral equivalents
| e.g. Clone::clone(&x); so you're not getting one without
| explicitly asking for it.
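
For readers who know C++ better than Rust, the "move by default, clone only on request" behaviour described above can be approximated in C++ by deleting the copy operations and offering an explicit `clone()`. This is only a rough analogy: the `Buffer` class is hypothetical, and unlike Rust, a moved-from C++ object still exists and still has its destructor run.

```cpp
#include <string>
#include <utility>

class Buffer {
public:
    explicit Buffer(std::string data) : data_(std::move(data)) {}

    // No implicit copies: `Buffer b = a;` does not compile.
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    // Moves are allowed and cheap.
    Buffer(Buffer&&) noexcept = default;
    Buffer& operator=(Buffer&&) noexcept = default;

    // A copy has to be requested by name, similar in spirit to x.clone().
    Buffer clone() const { return Buffer(data_); }

    const std::string& data() const { return data_; }

private:
    std::string data_;
};
```

Usage would look like `Buffer b = a.clone();` for a deliberate copy and `Buffer c = std::move(a);` for a transfer of ownership.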
|
| Rust doesn't formally have Constructors, or from another
| perspective, any Rust code anywhere which wants to make a
| Thing, is a "Constructor" for that Thing. (safe) Rust won't
| let you do any of the shenanigans which are common in C++ like
| having two separate pieces of code share responsibility for
| initialising a data structure, in Rust when you make a Thing
| you need to explicitly set all the values in the Thing at
| once. If the easy way to write that involves temporaries, no
| matter: the compiler does have an optimiser and knows how to
| use it.
|
| It is idiomatic in Rust to provide a function named new() in
| the implementation of a structure which will make you one of
| that structure if doing so makes sense, but that function and
| its name aren't magic, it's just a convention. Rust's vector
| type Vec has a new() function but it also has a
| with_capacity(n) function, they're both "Constructors" in the
| C++ sense if you want to think about it that way,
| with_capacity() isn't calling new() to make the vector, that
| would be crazy.
| petertodd wrote:
| Worth noting that even though Rust doesn't let you have two
| separate pieces of code share responsibility for actually
| initializing a data structure, in practice that doesn't
| lead to much, if any, code duplication: a Foo::new() can
| usually be written as a wrapper around a call to
| Foo::new_with_options(x, y, z).
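
The same wrapper pattern can be sketched in C++ with static factory functions. This is an illustration only: `Foo`, its fields, and the default values are made up, and `create` stands in for Rust's `new` because `new` is a keyword in C++.

```cpp
#include <cstddef>

struct Foo {
    std::size_t capacity;
    bool zeroed;
    int alignment;

    // The one place where every field is set, all at once.
    static Foo withOptions(std::size_t capacity, bool zeroed, int alignment) {
        return Foo{capacity, zeroed, alignment};
    }

    // The convenience factory only chooses defaults and delegates.
    static Foo create() { return withOptions(0, false, 8); }
};
```

Only `withOptions` knows how to build a `Foo`, so adding a field means changing a single initializer.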
| Kranar wrote:
| Only C++ has all of what you mention. Plenty of languages,
| such as Ada, Rust, and Vale, provide RAII without introducing
| all that complexity. D has RAII but unfortunately it too
| carries much (but not all) of the complexity of C++.
| asddubs wrote:
| maybe since Zig still is making breaking changes, they can call
| it IIRA
| kristoff_it wrote:
| `defer` and `errdefer` give you 90% of what destructors & co
| give you for 0% of the complexity.
|
| For now the closest thing to RAII is this proposal, but
| there is no guarantee that it will be accepted.
|
| https://github.com/ziglang/zig/issues/782
| initplus wrote:
| I find in C++ at least reasoning about RAII is always
| surprisingly complex. The second you have to write a custom
| destructor you get into the weeds of reasoning about
| copy/move/copy-assignment etc.
|
| https://en.cppreference.com/w/cpp/language/rule_of_three
|
| C++ RAII also rubs up painfully against handle based APIs
| (looking at you Windows) in my experience. There is a lack of
| standardized RAII wrappers for handle types like there is for
| pointers. Yes you can pull in a custom handle RAII wrapper
| but at that point it's simpler to just manage manually.
|
| Defer on the other hand is simple. Anyone can understand it
| in five minutes.
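
As a rough illustration of what `defer` does, here is a C++17 emulation. Zig's `defer` is a language statement; this sketch fakes it with a small guard object whose destructor runs the deferred code, so it is itself built on RAII. The `Defer` class and `runWithDefers` function are hypothetical names.

```cpp
#include <string>
#include <utility>
#include <vector>

// Runs the stored callable when the guard goes out of scope.
template <typename F>
class Defer {
public:
    explicit Defer(F f) : f_(std::move(f)) {}
    ~Defer() { f_(); }
    Defer(const Defer&) = delete;
    Defer& operator=(const Defer&) = delete;

private:
    F f_;
};

// Records the order in which work and deferred cleanups happen.
std::vector<std::string> runWithDefers() {
    std::vector<std::string> log;
    {
        Defer a([&] { log.push_back("release A"); });
        Defer b([&] { log.push_back("release B"); });
        log.push_back("work");
    }  // guards run here, in reverse order of declaration
    return log;
}
```

The cleanup is written next to the acquisition and runs at scope exit in reverse order, which matches how Zig orders its deferred statements.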
| jcelerier wrote:
| > There is a lack of standardized RAII wrappers for handle
| types like there is for pointers.
|
| .. what's wrong with
|
|     #include <cstdio>
|     #include <memory>
|
|     // Requires C++20 (lambda in an unevaluated context).
|     template<typename T, auto Free>
|     using safe_handle_t =
|         std::unique_ptr<T, decltype([] (auto p) { Free(p); })>;
|     using file_handle = safe_handle_t<FILE, fclose>;
|
|     void file_example() { file_handle f{fopen("foo", "r")}; }
| 10000truths wrote:
| Now you have a double indirection. Worse, the handle
| probably already references dynamically allocated memory.
| So that's two dynamic memory allocations per "safe"
| handle. Which might be perfectly acceptable for non-
| critical programs, but it is certainly going to tank your
| memory usage efficiency and thrash your CPU cache if
| scaled up to millions of handles.
| fbkr wrote:
| There is no double indirection and allocation here, it
| stores the FILE* directly, not as a pointer to a FILE*.
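
The size claim can be checked directly. The sketch below uses a hypothetical `FileCloser` functor in place of the lambda so that it compiles before C++20. The C++ standard does not guarantee either result, but mainstream standard libraries (libstdc++, libc++, MSVC) store an empty deleter without taking extra space.

```cpp
#include <cstdio>
#include <memory>

// Stateless deleter: an empty class with no data members.
struct FileCloser {
    void operator()(std::FILE* f) const noexcept { std::fclose(f); }
};

using FileHandle = std::unique_ptr<std::FILE, FileCloser>;

// With a stateless deleter the handle is expected to be one pointer wide.
constexpr bool handleIsPointerSized() {
    return sizeof(FileHandle) == sizeof(std::FILE*);
}

// A function-pointer deleter has to be stored, so the handle grows.
constexpr bool functionPointerDeleterIsLarger() {
    return sizeof(std::unique_ptr<std::FILE, int (*)(std::FILE*)>) >
           sizeof(std::FILE*);
}
```

In both cases the `FILE*` returned by `fopen` is held directly; `unique_ptr` performs no allocation of its own.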
| dnautics wrote:
| I'm pinning my hopes that certain types can be marked as
| "shared resources", either in-lang, or through a minimalistic
| helper tool - and analyzed at a lower level (ZIR or AIR) for
| lifetime analysis.
| AndyKelley wrote:
| When people talk about RAII in relation to Zig I think they
| mean something slightly different than RAII, but then the
| conversation starts to become about what is the definition of
| RAII rather than whether the Zig language is lacking a certain
| kind of useful abstraction.
|
| Examples:
|
| [1]: https://news.ycombinator.com/item?id=29506814
|
| [2]:
| https://gist.github.com/andrewrk/190170bc1441839644c3f15725a...
| nikki93 wrote:
| Not saying this means you actually need to add constructors
| and destructors, but I think the main tricky bit is when you
| have e.g. a growable/shrinkable array and want to call something
| on it that removes elements and it should be calling
| destructors on those. AFAICT there's no clear place to add a
| `defer` that makes it happen at the right time at the lexical
| site those elements are originally added. I think this is
| where the main RAII complexity comes from vs. the lexical
| scope local variable scenario which is definitely handled by
| `defer`.
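
The container case can be seen in a small C++ sketch. `Tracked` is a hypothetical element type that counts how many live resources exist. When an element is erased from the `std::vector`, the container releases it; no cleanup statement appears at the place where the element was added.

```cpp
#include <vector>

// Each live Tracked holds one "resource", counted through *live.
class Tracked {
public:
    explicit Tracked(int* counter) : live_(counter) { ++*live_; }
    Tracked(const Tracked&) = delete;
    Tracked& operator=(const Tracked&) = delete;
    Tracked(Tracked&& other) noexcept : live_(other.live_) { other.live_ = nullptr; }
    Tracked& operator=(Tracked&& other) noexcept {
        if (this != &other) {
            release();
            live_ = other.live_;
            other.live_ = nullptr;
        }
        return *this;
    }
    ~Tracked() { release(); }

private:
    void release() {
        if (live_) {
            --*live_;
            live_ = nullptr;
        }
    }
    int* live_;
};

struct Counts {
    int afterInsert;
    int afterErase;
    int afterScope;
};

Counts eraseDemo() {
    int live = 0;
    Counts c{};
    {
        std::vector<Tracked> v;
        v.emplace_back(&live);
        v.emplace_back(&live);
        v.emplace_back(&live);
        c.afterInsert = live;
        v.erase(v.begin());  // the container releases the removed element
        c.afterErase = live;
    }  // the remaining elements are released with the vector
    c.afterScope = live;
    return c;
}
```

With only `defer`, the equivalent cleanup has to be written by whoever removes the element, for example inside the container's own remove function or at each call site.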
|
| All that said, not an argument for having that, just wanted
| to nuance it. I think not having it and keeping the language
| focused on its priorities can be good, for example.
___________________________________________________________________
(page generated 2021-12-15 23:00 UTC)