[HN Gopher] What Is in a Rust Allocator?
       ___________________________________________________________________
        
       What Is in a Rust Allocator?
        
       Author : sulami
       Score  : 78 points
       Date   : 2024-05-06 10:32 UTC (1 days ago)
        
 (HTM) web link (blog.sulami.xyz)
 (TXT) w3m dump (blog.sulami.xyz)
        
       | tialaramex wrote:
       | I have never investigated why, but each time I look at this API
       | it strikes me that the allocator isn't allowed to tell you how
       | much space it _actually_ gave you.
       | 
       | Suppose I'm an allocator which only has power-of-2 sized spaces
       | to hand out, and you're a growable array type and you want enough
       | space for 20 of your 52 byte entries. So that's 20x52 = 1040
       | bytes, but my smallest suitable space is 2048 bytes. I can give
       | you that space, but it seems like it'd be nice for me to say oh,
       | here's 2048 bytes, and now the growable array type knows its new
       | total capacity is not 20 but 39.
        
         | hawski wrote:
         | You may be familiar, but I just wanted to show how it is
         | available in many C implementations and is used, for example,
         | in QuickJS:
         | https://github.com/bellard/quickjs/blob/0c8fecab2392387d76a4...
        
         | filmor wrote:
         | The growable array type would use `realloc` when growing, which
         | can make use of this internal information without exposing it.
        
           | wongarsu wrote:
           | Like most implementations, if you add items to a rust `Vec`
           | one item at a time it doubles in size each time it hits its
           | capacity (to make sure insert is amortized to O(1) even
           | though each realloc potentially requires an O(n) memcpy).
           | With the described allocator that would never lead to a cheap
           | realloc.
           | 
           | It's even worse for types like HashMaps where reallocs can be
           | much more expensive than just a big memcpy.
        
             | tialaramex wrote:
             | Also, Rust's Vec has the correct size hinting API, so we
             | can say Vec::reserve(N) meaning that we expect to put as
             | many as N more things into this Vec soon, or
             | Vec::reserve_exact(X) meaning that we expect to put no more
             | than X more things into this Vec, ever
             | 
             | With this distinct hint, reserve(N) preserves the amortized
             | growth. Say we have total capacity 30, currently there are
             | 20 items in the Vec, and Vec::reserve(20). So the advise is
             | that the Vec may soon have up to 40 items, we should grow
             | it. But since we weren't promised that's as big as it'll
             | get, we grow it by doubling - to total capacity 60 items.
             | If this is a pattern, repeatedly reserving space for 20
             | items and then adding them, this grows 30 -> 60 -> 120 ->
             | 240 and so on, amortized growth as advertised - and growth
             | is avoided when we're only "reserving" capacity we already
             | have anyway.
             | 
             | Several languages, including C++ only provide the
             | Vec::reserve_exact(X) feature, but name it reserve. Any
             | such "reservation" is also implicitly promising there won't
             | be further growth, so for our example it grows 30 -> 40 ->
             | 60 -> 80 -> 100 -> 120 -> 140 ... our amortized growth
             | strategy was destroyed and our performance will be much
             | worse than O(1). In such a language students get taught not
             | to use the reservation API to signal what they know and so
             | they lose out on performance optimisations better than O(1)
             | 
             | I first ran into this feature _because_ I was wondering
             | about the over-allocation problem, and I realised this is
             | solving a different but also important problem.
        
         | khuey wrote:
         | Things like malloc_usable_size() exist but you're not supposed
         | to use the overage it tells you about so they're not very
         | useful APIs.
        
           | darby_eight wrote:
           | This can be instrumented by the compiler or more likely
           | runtime to detect buffer overruns. Imperfectly of course if
           | you share the page(s) with other allocations but it's worth
           | considering.
        
         | pitaj wrote:
         | The APIs do tell you how many bytes are actually allocated. For
         | instance, in the following                   fn allocate(&self,
         | layout: Layout) -> Result<NonNull<[u8]>, AllocError>;
         | 
         | You can get the length of the returned byte slice (`[u8]`) to
         | know how many bytes were actually allocated.
        
           | tialaramex wrote:
           | That's the (nightly) Allocator::allocate, but the API we were
           | looking at is (stable) GlobalAlloc::alloc
           | 
           | Yes I'm aware that I can ask how long a slice is.
        
             | pitaj wrote:
             | Ah gotcha. Yeah `GlobalAlloc` is known to be non-ideal.
             | Hopefully we can get a better API soon.
        
       ___________________________________________________________________
       (page generated 2024-05-07 23:01 UTC)