[HN Gopher] Ointers: A library for representing pointers where b...
       ___________________________________________________________________
        
       Ointers: A library for representing pointers where bits have been
       stolen (2021)
        
       Author : fanf2
       Score  : 113 points
       Date   : 2024-05-08 10:42 UTC (12 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | couchand wrote:
       | Am I misreading the bitmask code? It looks like (in addition to a
       | few other ideas) it's using the old "stick a few extra bits in an
       | aligned pointer", but it seems to be only manipulating high bits,
       | whereas aligned pointer zeroes are low-order bits.
       | 
       | I'd suggest a heavier investment in testing infrastructure.
        
         | pixelesque wrote:
         | It looks like it can use both traditional "tagged pointer"
         | alignment bits AND virtual address bits...
        
           | couchand wrote:
           | I see that in the readme but I don't see where that's handled
           | in the bitmasking. It appears to be universally masking high-
           | order bits here: https://github.com/irrustible/ointers/blob/1
           | 961f75bbb9818d72...
        
             | formerly_proven wrote:
             | Read the function above the function you linked to.
        
               | couchand wrote:
                | Oh, that's frightening: they shift the entire pointer
                | by the alignment.
        
               | db48x wrote:
               | Why? Alignment causes some low-order bits to be zero.
               | Shifting the pointer to the right drops those zeros on
               | the floor, leaving high-order bits zero (or sign extended
               | or whatever). Then you can put your tag up there instead.
               | Shifting the value left by the same amount drops the tag
               | and recovers the same pointer for use.
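The shift scheme described above can be sketched in a few lines of Rust, using a plain integer as a stand-in for an 8-byte-aligned address in a 48-bit address space (an illustrative sketch only, not the ointers API):

```rust
// Sketch of shift-based tag packing: drop the always-zero low bits by
// shifting right, park the tag in the vacated top bits, and recover the
// pointer by shifting left again. Integer stand-in, not a real pointer.
const ALIGN_BITS: u32 = 3; // 8-byte alignment => 3 always-zero low bits

fn pack(ptr: u64, tag: u64) -> u64 {
    assert!(ptr & ((1 << ALIGN_BITS) - 1) == 0, "pointer must be aligned");
    assert!(tag < (1 << ALIGN_BITS), "tag must fit in the stolen bits");
    // Shift right to drop the zero low bits, then place the tag on top.
    (ptr >> ALIGN_BITS) | (tag << (64 - ALIGN_BITS))
}

fn unpack(packed: u64) -> (u64, u64) {
    let tag = packed >> (64 - ALIGN_BITS);
    // Shifting left by the same amount pushes the tag out the top and
    // restores the zero low bits, recovering the original pointer.
    (packed << ALIGN_BITS, tag)
}

fn main() {
    let ptr = 0x7f12_3456_78a8_u64; // 8-byte aligned, fits in 48 bits
    let packed = pack(ptr, 0b101);
    assert_eq!(unpack(packed), (ptr, 0b101));
    println!("roundtrip ok");
}
```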
        
               | vlovich123 wrote:
               | Any idea why it's preferred to do the alignment route
               | instead of storing the bits in the upper part of the
               | pointer & masking?
        
               | db48x wrote:
               | Personal preference, architectural differences, phase of
               | the moon, etc, etc.
        
               | couchand wrote:
               | My guideline with pointer tricks is: if you're not just
               | masking bits it's too complicated for real use.
        
         | Sharlin wrote:
         | The stored representation is packed such that all the stealable
         | bits are contiguous. To get the original pointer value it's
         | unpacked first.
        
         | masklinn wrote:
         | 64 bit architectures don't actually have 64 bit address spaces,
         | both AMD64 and ARM64 have 48 bit address spaces by default
         | (some CPUs have extensions you can enable to request larger
         | address spaces e.g. LVA on ARM64 and 5LP / LA57 on AMD64 but
         | that's opt-in on a per-process basis).
         | 
         | So while you have 3 bits available at the bottom of the
         | pointer, there are 16 at the top. That's a lot more payload you
         | can smuggle. There are even CPU extensions which tell the
         | processor to ignore some of that (Linear Address Masking for
         | Intel, Upper Address Ignore for AMD, and Top Byte Ignore for
         | ARM).
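The top-bits variant is simpler still: since a 48-bit user-space address has zero high bits, a 16-bit payload can be OR'd in and masked back off (integer stand-in sketch; with LAM/UAI/TBI enabled the CPU would do the masking implicitly on dereference):

```rust
// Smuggling a 16-bit payload in the unused top bits of a 48-bit
// user-space address. Illustrative sketch on a plain integer.
const TAG_SHIFT: u32 = 48;
const ADDR_MASK: u64 = (1 << TAG_SHIFT) - 1;

fn tag_high(ptr: u64, tag: u16) -> u64 {
    assert!(ptr <= ADDR_MASK); // user-space pointers have zero top bits
    ptr | ((tag as u64) << TAG_SHIFT)
}

fn untag_high(packed: u64) -> (u64, u16) {
    // Mask off the payload to recover the address.
    (packed & ADDR_MASK, (packed >> TAG_SHIFT) as u16)
}

fn main() {
    let ptr = 0x7fff_dead_beef_u64;
    let packed = tag_high(ptr, 0xCAFE);
    assert_eq!(untag_high(packed), (ptr, 0xCAFE));
    println!("ok");
}
```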
        
           | vlovich123 wrote:
           | Small correction. That's true for 4-level paging where
           | virtual memory is capped at 256 TiB (LAM48). There's a
           | 5-level extension that reduces the wasted space to 7 bits
           | (LAM57) to allow for 128 PiB of virtual space.
           | 
            | I have no idea what the purpose of that extension is, given
            | that I don't believe you can attach that much secondary
            | storage to a CPU unless you go to tape, which is pointless
            | for virtual mapping.
        
             | masklinn wrote:
             | ...
             | 
             | I literally mentioned five-level paging in my comment.
             | 
             | But it's an opt-in extension, you need the kernel to
             | support it, and then you need to ask the kernel to enable
              | it on your behalf. So it's not much of an issue, you just
              | have to not use this sort of extension (it's not the only
              | one; ARM64 also has a similar one, though it only goes up
              | to 52-bit address spaces) if you want 16~19 bits of
              | tagging.
        
               | LegionMammal978 wrote:
               | You do need to opt-in when compiling the kernel, but on
               | Linux it doesn't take anything particularly special on
               | the program's side to enable it. The rule is that non-
               | fixed mmap() will only ever yield 48-bit addresses, until
               | the program specifies a 57-bit address as a hint, after
               | which it can yield 57-bit addresses at will. (This rule
               | has lots of curious consequences for the location of the
               | vDSO, when 5-level paging is enabled and an ELF program
               | uses big addresses for its segments.)
        
       | zgs wrote:
       | Every time I've seen similar done in the past, it has come back
       | to hurt the instigator.
        
         | pixelesque wrote:
         | I've used tagged pointers with no issues at all over the years
         | to "smuggle" / store values in...
        
         | speed_spread wrote:
         | If you build your whole app with it, yeah. If you use it
         | tactically for a certain data type it can be very nice. Just
          | don't try to dissimulate its specialness.
        
         | db48x wrote:
         | Every lisp system that ever existed uses this technique, and it
         | never hurt them any. Emacs or SBCL or Open Genera; they all
         | work perfectly well.
        
           | _flux wrote:
           | OCaml uses the lowest bit, which is nice because (aligned)
           | pointers work as-is.
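The OCaml encoding can be mimicked on plain integers: an immediate integer is stored as 2n+1, so its low bit is 1, while word-aligned pointers keep their low bit 0 and pass through unchanged (a sketch of the idea, not OCaml's actual runtime code):

```rust
// OCaml-style low-bit tagging sketch: ints are stored as 2n+1,
// aligned pointers (low bit 0) are stored as-is.
fn encode_int(n: i64) -> u64 {
    ((n << 1) | 1) as u64
}

fn is_int(word: u64) -> bool {
    word & 1 == 1
}

fn decode_int(word: u64) -> i64 {
    (word as i64) >> 1 // arithmetic shift preserves the sign
}

fn main() {
    assert!(is_int(encode_int(-42)));
    assert_eq!(decode_int(encode_int(-42)), -42);
    // An 8-byte-aligned address used as a word is recognisably a pointer.
    assert!(!is_int(0x7f00_0000_1000));
    println!("ok");
}
```

The price is that immediate integers lose one bit of range, which is why OCaml's native int is 63-bit on 64-bit platforms.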
        
           | lmm wrote:
           | > Every lisp system that ever existed uses this technique,
           | and it never hurt them any.
           | 
           | Wasn't it part of the reason they ended up with poor
           | "mechanical sympathy" on regular PCs, and got a bad
           | performance reputation as a result?
        
             | db48x wrote:
             | No. Their reputation for poor performance mostly arises
             | because the default data structure is a singly-linked list.
             | Although the language makes these lists extremely
             | convenient for the programmer to use, linked lists
             | invariably result in terrible cache locality. High-
             | performance systems that are written in lisp (and they do
             | exist), achieve that performance by avoiding lists and
             | using vectors and arenas instead.
             | 
             | Of course, it must be said that this performance penalty
             | rarely matters in practice. For most programs, the limiting
             | factor is not performance but the cost of implementation
              | and maintenance. Lisp has many powerful and convenient
             | features that make implementation faster and surer than in
             | any other language, and which usually greatly lower the
             | cost of maintenance as well. Thus by writing a new program
             | in Lisp you can get it off the ground and earning you money
             | faster than you could in almost any other language. You can
             | also add features more cheaply as time goes on, allowing
             | you to keep up with your competitors. It is only when your
             | system is nearing completion, and all necessary features
             | have been invented and implemented, that you should think
             | about rewriting select parts of that system in a systems
             | language where more efficiency is possible. Not only will
             | you only understand the system after it is written, but you
             | can measure the actual performance of the system to figure
             | out which parts you need to rewrite, and which parts have
             | acceptable performance already and thus do not need to be
             | rewritten.
        
               | jxf wrote:
               | > For most programs, the limiting factor is not
               | performance but the cost of implementation and
               | maintenance.
               | 
               | The limiting factor for what? Their commercial success,
               | or something else?
        
               | db48x wrote:
               | Success, yes. It need not be direct commercial success;
               | lots of programs are written purely for internal use
               | within some larger organization. And even when there are
               | known performance requirements, Lisp is often still a
               | good choice because the performance is handled by
               | specialized hardware. Consider games, for example. Time
               | to market and rapid iteration are both paramount, while
               | performance is taken care of not by writing a fast
               | renderer but by sending everything to the GPU to be
               | rendered on specialized hardware.
        
               | pfdietz wrote:
               | Cache locality is improved if the Lisp has a compacting
               | garbage collector, but you still need extra memory for
               | the cdr links. Lisp Machines had cdr-coding that allowed
               | those to be avoided but I don't think that's practical on
               | stock architectures.
        
               | db48x wrote:
               | True, but don't forget that one of the _other_ causes of
               | performance problems in lisp programs is the creation and
               | subsequent collection of garbage. If you know ahead of
               | time which of your data needs to be accessed or created
               | by the most performance-sensitive parts of your program,
               | then you can put that data into vectors to start with and
               | avoid all of the extra work.
        
               | pfdietz wrote:
               | Long-lived things end up in older generations and aren't
               | traversed much during GC, which is the moral equivalent
               | of being statically allocated.
               | 
               | There's overhead if you mutate old objects in a GCed
               | system due to card marking.
               | 
               | Lisp vectors are vectors of pointers, so there's still an
               | overhead of dereferencing through it. Presumably the
               | objects being pointed to end up compacted together,
               | eventually.
        
               | db48x wrote:
               | I should have been more clear; I meant using vectors as
               | arena allocators, so that all of your data ends up stored
               | contiguously with no extra indirection.
        
               | mik1998 wrote:
               | > linked lists invariably result in terrible cache
               | locality
               | 
               | With a compacting GC (ie most of the relevant ones) lists
                | often provide better memory locality than arrays. They
                | only lose at a large scale due to the extra 8 bytes per
                | node, which makes arrays occasionally cheaper for large
                | sequences.
        
               | nequo wrote:
               | Why do lists provide _better_ memory locality than
               | arrays? The array already stores its elements next to
               | each other in memory. The compacting GC helps arrange the
               | elements of the list close to each other too, right? Then
               | wouldn't the compacting GC at best even out the
               | performance difference for sequential traversal?
        
               | gpderetta wrote:
               | Even with perfect cache locality, a linked list will be
               | significantly slower to traverse than an array because of
               | worse pipelining and parallelization. Partially unrolled
               | lists can help, but at the cost of additional complexity.
               | 
               | Of course not all container accesses require traversal.
        
             | bitwize wrote:
             | I don't know what you mean. Maybe in the early early 80s
             | that was the case, but PCs were still 16-bit back then, and
             | would've been a poor fit for Lisp anyway.
             | 
             | One of the reasons why the Lisp machines died out is that
             | round about the mid-80s or so, compiler technology improved
             | for generic 32-bit processors, and it became possible to
             | run Lisp software on a VAX, 68k, or RISC CPU faster than it
             | ran on the Lisp machines' bespoke architecture. Back during
             | the first AI hypecycle, the makers of Golden Common Lisp
             | introduced the Hummingboard to the market, billed as an
             | inexpensive solution to have a "Lisp machine in a PC". It
             | was just a 386 on an ISA card (predating Compaq's 386 PC by
             | about a year) with gobs of memory on board; a special build
             | of Golden Common Lisp allowed code to be run on that CPU
             | rather than the main one.
        
               | coldtea wrote:
               | > _One of the reasons why the Lisp machines died out is
               | that round about the mid-80s or so, compiler technology
               | improved for generic 32-bit processors, and it became
               | possible to run Lisp software on a VAX, 68k, or RISC CPU
                | faster than it ran on the Lisp machines' bespoke
               | architecture._
               | 
               | I'd say Lisp machines died out because Lisp died out
               | (commercially). Other languages got popular, the second
               | AI winter didn't help at all either.
               | 
               | If Lisp itself had fared better (even if it was on
               | generic hardware), Lisp machines could have been saved
               | too, they could still use a VAX, 68k, or RISC CPU
               | underneath, and optimize for the developer experience. Or
               | they'd have turned Lisp machines from hardware into a
               | "Lisp machine" OS or IDE/REPL/etc environment for other
               | OSes succeeding. But none of that took off.
        
               | zozbot234 wrote:
               | > Or they'd have turned Lisp machines from hardware into
               | a "Lisp machine" OS
               | 
               | Emacs is that Lisp OS. But it's still lacking a good text
               | editor.
        
               | pfdietz wrote:
               | Ok, that made me laugh.
        
               | bitwize wrote:
               | > Or they'd have turned Lisp machines from hardware into
               | a "Lisp machine" OS or IDE/REPL/etc environment for other
               | OSes succeeding.
               | 
               | You mean like Allegro CL? LispWorks? Genera on DEC
               | Ultrix?
               | 
               | With the exception of Genera, these solutions are
               | available, maintained, and supported today. Even Genera
               | hung on for a good few years. Lisp _machines_ started to
               | wane a few years before the AI winter hit in full force,
               | because the idea of dedicated hardware to run Lisp
               | programs made sense when your alternative was Maclisp
               | struggling on a PDP-10, but became expensive, slow, and
               | fiddly compared to the generic boxes running fast 32-bit
               | processors with, again, much improved compiler tech.
               | Genera on DEC Alpha, even as an interpreted VM, was so
                | much faster than any of Symbolics's bespoke CPU
               | architectures that Symbolics just quit making hardware
               | and declared the Alpha version of Genera the official
               | upgrade path.
        
               | rjsw wrote:
               | One road not taken would have been to port the MIT/LMI/TI
               | Lisp Machine software to stock hardware, it was 32-bit
               | until the end so didn't need an Alpha as the host. A fast
               | 68020 with a custom MMU would have been fine. There was
               | only around 12k of "assembler" to rewrite to provide the
               | VM and simple RTOS.
        
               | lispm wrote:
               | Symbolics had its own PC implementation of Lisp: CLOE.
               | 
               | From the Lisp FAQ: "CLOE (Common Lisp Operating
               | Environment) is a cross-development environment for IBM
               | PCs (MSDOS) and Symbolics Genera. It includes CLOS,
               | condition error system, generational garbage collection,
               | incremental compilation, code time/space profiling, and a
               | stack-frame debugger. It costs from $625 to $4000 and
               | requires 4-8mn RAM and a 386 processor. "
               | 
               | Later they also ported Genera to DEC Alpha. Currently I
               | have access to an implementation which runs on ARM64 and
               | Intel x64.
        
             | lispm wrote:
              | No, the reasons for the less-than-stellar performance of
              | Lisp on 32-bit x86 CPUs were things like too few
              | registers.
             | 
             | On 64bit Intel CPUs this is much less of a problem and Lisp
             | runs roughly twice as fast in 64bit mode. I would think
             | that this even holds without taking advantage of wider data
             | words.
             | 
             | On other architectures this was less of a problem. 32bit
             | CPUs had more registers on a RISC or some other CISC CPUs.
        
             | samatman wrote:
             | It's not. Two counterexamples: LuaJIT's interpreter/VM, and
             | Wren's, both use NaN boxing, and are among the fastest VMs
             | out there (we're ignoring JITs as out of scope).
             | 
             | It isn't tagging pointers that makes things (some Lisps are
             | 'things', some are not) slow: it's pervasive abstraction
             | and indirection. Doing some masking on a pointer is
             | pipelined, out-of-order, one to three cycles, and is
             | absolutely dwarfed by cache misses and conditional
             | mispredictions, let alone the sort of pointer chasing which
             | is common in idiomatic Lisp (or indeed Python) code.
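A minimal NaN-boxing sketch, for the curious (the constants follow the IEEE-754 quiet-NaN layout; real VMs like LuaJIT add more type tags and must handle genuine NaNs produced by arithmetic, which this toy largely ignores):

```rust
// NaN-boxing sketch: any f64 bit pattern matching a quiet NaN with a
// nonzero payload carries a smuggled pointer; every other bit pattern
// is a legitimate number stored as-is.
const QNAN: u64 = 0x7ff8_0000_0000_0000;
const PAYLOAD_MASK: u64 = 0x0000_ffff_ffff_ffff; // 48 payload bits

fn box_number(x: f64) -> u64 {
    x.to_bits()
}

fn box_pointer(addr: u64) -> u64 {
    assert!(addr != 0 && addr <= PAYLOAD_MASK);
    QNAN | addr
}

fn is_pointer(word: u64) -> bool {
    // Quiet-NaN bits set AND a nonzero payload; a genuine NaN from
    // arithmetic has payload 0 under this toy scheme.
    word & QNAN == QNAN && word & PAYLOAD_MASK != 0
}

fn as_number(word: u64) -> f64 {
    f64::from_bits(word)
}

fn as_pointer(word: u64) -> u64 {
    word & PAYLOAD_MASK
}

fn main() {
    let w = box_number(1.5);
    assert!(!is_pointer(w));
    assert_eq!(as_number(w), 1.5);

    let p = box_pointer(0x7f12_3456_78a8);
    assert!(is_pointer(p));
    assert_eq!(as_pointer(p), 0x7f12_3456_78a8);
    println!("ok");
}
```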
        
           | rjsw wrote:
           | Maclisp on PDP-10 and Franz Lisp didn't, they used bibop type
           | checking instead of tagged pointers.
        
             | db48x wrote:
             | That's merely an elaboration on the same theme.
        
               | rjsw wrote:
               | It isn't, pointers in Franz Lisp can be used unmodified
               | in C.
        
               | db48x wrote:
               | But those pointers still have bits in them that indicate
               | the type of the object that they point to. It is merely
               | the case that the tag bits are chosen at run time so that
               | they correspond to allocated memory regions, rather than
               | being chosen ahead of time.
        
               | rjsw wrote:
               | None of the bits in the pointer have been "stolen"
               | though, which is the point of the referenced article,
               | they are all still available to access the full address
               | space of the CPU.
        
               | db48x wrote:
               | That doesn't mean that they aren't still there, and don't
               | tell you the type. It's still a tagged pointer, and
               | because all objects of the same type must be allocated
               | out of the same page (or collection of pages), it still
               | limits how much memory can be dedicated to objects of
               | that type.
        
           | gpderetta wrote:
           | And the modern take on this is NaN Boxing!
        
         | junon wrote:
         | VM writers use this egregiously. I've not heard of it causing
         | issues. It's not that complicated.
         | 
         | Lua, Lisps, many others.
        
         | masklinn wrote:
         | Apple's been using that for more than a decade:
         | https://www.mikeash.com/pyblog/friday-qa-2012-07-27-lets-bui...
         | http://www.sealiesoftware.com/blog/archive/2013/09/24/objc_e...
         | 
         | OCaml's also been using tagging to mark unboxed type pretty
         | much from the start, nearly 30 years ago:
         | https://dev.realworldocaml.org/runtime-memory-
         | layout.html#:~....
        
           | duskwuff wrote:
           | And it's not like Apple isn't aware of the dangers, either;
           | they used top-byte tagging on the 68000, and it took them a
           | lot of time and effort to migrate away from that to support
           | 16+ MB systems.
        
             | gpderetta wrote:
             | IIRC the issue was that 68000 would transparently mask the
             | top bits, so code could get away with avoiding the explicit
             | masking, which of course breaks when you move to a larger
             | address space.
             | 
             | More modern cpus enforce specific bit patterns for the
             | MSBs, so broken code would be caught immediately.
        
         | coldtea wrote:
         | This "trick" has been used since the dawn of time in major
         | platforms
        
           | adql wrote:
           | I dunno, pointers were kinda small on 8/16 bit platforms
        
         | titzer wrote:
         | All fast JavaScript VMs use some form of pointer tagging. V8
         | uses (compressed) tagged pointers and both JSC and Firefox used
         | NaN-boxing.
         | 
         | WasmGC has a i31ref as part of its reference hierarchy, which
         | is implemented with pointer tagging.
        
         | jstimpfle wrote:
         | It's common for libraries to use the 2 lowest bits in tree
         | structures e.g. RB tree. It's not a problem at all since the
         | data structure is controlled by the library. Even if tree nodes
         | are allocated/provided by the user, having a "4-byte aligned"
         | requirement in the API contract isn't a problem at all -- in
         | fact you need to work quite hard to allocate a structure with a
         | pointer that isn't at least 4 byte aligned (i386), or 8 byte
         | aligned (x64).
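The RB-tree trick looks like this, using integer stand-ins for 4-byte-aligned node addresses (a sketch of the general pattern, not any particular library's code):

```rust
// Storing a red/black colour bit in the low bit of a parent link,
// relying on nodes being at least 4-byte aligned.
const COLOR_BIT: u64 = 1;

fn make_link(parent: u64, red: bool) -> u64 {
    assert!(parent & 0b11 == 0, "node must be 4-byte aligned");
    parent | if red { COLOR_BIT } else { 0 }
}

fn parent_of(link: u64) -> u64 {
    link & !0b11 // mask off both stolen bits before dereferencing
}

fn is_red(link: u64) -> bool {
    link & COLOR_BIT != 0
}

fn main() {
    let parent = 0x7f00_0000_1040_u64;
    let link = make_link(parent, true);
    assert_eq!(parent_of(link), parent);
    assert!(is_red(link));
    println!("ok");
}
```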
        
           | a1369209993 wrote:
           | > in fact you need to work quite hard to allocate a structure
           | with a pointer that isn't at least 4 byte aligned (i386), or
           | 8 byte aligned (x64).
           | 
            | Well, no, actually; it's:
            | 
            |     p = malloc(size+1)+1;
           | 
           | It's just quite implausible that you'd do that _by accident_.
        
             | menaerus wrote:
              | Well, no, it isn't, since malloc returns a void*, and for
              | a void* the compiler cannot figure out the offset by
              | which to increment the address. For that operation to
              | succeed/compile you need to cast the result of malloc to
              | p* first. This means that the resulting alignment will
              | still be correct.
             | 
             | The actual danger is in reinterpreting the pointers, or
             | thereof some arbitrary chunk of memory, to a type that
             | doesn't necessarily satisfy the alignment requirements for
             | that given memory address.
        
         | adql wrote:
         | Any example ?
        
         | mbrubeck wrote:
         | This bitvec I wrote for Servo and Gecko uses this technique,
         | and has been shipping in Firefox releases for over six years
         | with no bugs found:
         | 
         | https://docs.rs/smallbitvec/
         | 
         | I'm pretty sure you can find several more examples in Firefox,
         | as well as other major browsers.
        
       | gizmo wrote:
       | Stealing bits from pointers makes sense sometimes, but in most
       | cases an easier and less error-prone way to increase memory
       | efficiency is to just use indices or relative pointers. Instead
       | of shaving individual bits of a pointer take advantage of the
       | fact that most pointers in a program are almost exactly alike.
       | 
       | Instead of storing a 64 bit prev and next pointer for an item in
       | a linked list just store e.g. two 16 bit offsets. Then the
       | next_ptr = (this_ptr & base_mask) + offset * alignment.
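That formula, as code, with one assumed layout: a 512 KiB-aligned arena of 8-byte slots, so a 16-bit offset can name any of 65,536 nodes sharing the same base (the numbers are illustrative, not a recommendation):

```rust
// Relative-pointer sketch: recover a full pointer from the arena base
// (high bits shared with `this_ptr`) plus a 16-bit slot offset.
const ALIGNMENT: u64 = 8; // each slot is 8 bytes
const ARENA_SIZE: u64 = (u16::MAX as u64 + 1) * ALIGNMENT; // 512 KiB
const BASE_MASK: u64 = !(ARENA_SIZE - 1); // arena is 512 KiB-aligned

fn next_ptr(this_ptr: u64, offset: u16) -> u64 {
    (this_ptr & BASE_MASK) + offset as u64 * ALIGNMENT
}

fn main() {
    let base = 0x7f00_0008_0000_u64; // 512 KiB-aligned arena base
    let this_ptr = base + 0x120; // some node inside the arena
    // A 16-bit offset stands in for a 64-bit next pointer:
    assert_eq!(next_ptr(this_ptr, 3), base + 24);
    assert_eq!(next_ptr(this_ptr, 0), base);
    println!("ok");
}
```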
       | 
       | Most of the time wasting memory by storing full 64bit pointers is
       | no big deal. But when it is, you can go a lot further than just
       | shaving off a few bits.
       | 
       | If you want to be extra clever you can encode values in the base
       | address of your pointers by taking advantage of how virtual
       | memory is allocated. Every process has its own virtual memory
       | pages and you can request memory pages at specific address
       | ranges. For instance if you align all your memory at a 32gb
       | boundary then you can use 32bit pointers + a base pointer without
       | even having to mask. You can also use virtual memory allocation
       | tricks to convey information about alignment, typing, whether a
       | pointer is allowed to be shared between threads and so on. And
       | again, without having to butcher the pointer itself.
        
         | jsheard wrote:
         | Considering how common it is to use indices in lieu of pointers
         | in Rust, it would be nice if it had native NonMax integer types
         | similar to the NonZero types it already has. NonZero gives you
         | the niche-filling optimization (e.g. Option<NonZero> and
         | NonZero are the same size) but with indices you _want_ zero to
         | be a valid value, so it would be natural to use the maximum as
         | the illegal value instead. You can kind of do it by offsetting
         | the index by 1 or inverting the bits then putting it in a
          | NonZero, but that's a small extra runtime cost on every use.
        
           | joseluis wrote:
           | You can do a handy struct wrapper over a private Nonzero that
           | xors the given value with the prohibited value (max in this
           | case) at the new constructor. And like so for the get method.
           | Xoring is very cheap. That's my favorite way of storing
           | indices/links for nodes, since you can wrap them in Option
           | for free.
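A sketch of that wrapper (the `Idx` name is hypothetical): XOR-ing with u32::MAX makes 0 representable and u32::MAX the single forbidden value, while `Option<Idx>` keeps the 4-byte niche optimization:

```rust
use std::num::NonZeroU32;

/// Hypothetical index type: stores `i ^ u32::MAX` in a NonZeroU32, so
/// index 0 is fine, u32::MAX is the one forbidden value, and
/// `Option<Idx>` is still 4 bytes thanks to the niche optimization.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Idx(NonZeroU32);

impl Idx {
    fn new(i: u32) -> Option<Idx> {
        // Fails only when i == u32::MAX, since MAX ^ MAX == 0.
        NonZeroU32::new(i ^ u32::MAX).map(Idx)
    }

    fn get(self) -> u32 {
        self.0.get() ^ u32::MAX // XOR again to undo the encoding
    }
}

fn main() {
    assert_eq!(std::mem::size_of::<Option<Idx>>(), 4); // niche kept
    assert_eq!(Idx::new(0).unwrap().get(), 0); // zero is a valid index
    assert!(Idx::new(u32::MAX).is_none()); // MAX is the sentinel
    println!("ok");
}
```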
        
         | zozbot234 wrote:
         | This requires using arenas extensively in your program or
         | perhaps playing tricks with virtual memory, because your system
         | allocator will give you arbitrary pointers by default and
         | you'll often want to point to an object from a different
         | allocation.
        
           | gizmo wrote:
           | Right. Which is why I favor doing your own memory management
           | entirely (which has some huge advantages) or not worrying
           | about memory at all and trusting a garbage collector. I don't
           | think there are many situations left where memory management
           | is too important to leave to a gc but not important enough to
           | do from scratch.
        
             | adql wrote:
             | Plenty of niche ways that are limited enough to not require
             | program wide architecture change
             | 
             | https://muxup.com/2023q4/storing-data-in-pointers#some-
             | real-...
        
         | gpderetta wrote:
         | Definitely indices are a great option, but then you need a base
         | pointer and a way to allocate off of it. That can add
         | significant complexity. So it is all a tradeoff.
        
         | dkjaudyeqooe wrote:
          | That's really not the application for this; it's tags, which
          | are endlessly useful for characterizing the pointer. For
          | instance you can use the MSBs to establish the type of what's
          | pointed to, or for reference counting. I use the LSB to
         | indicate if the value is a pointer or an unsigned integer,
         | whereby you can store both an (integer) key to a record in a
         | database, or a pointer to the record data in memory. You could
         | use another bit in the pointer to indicate that the record is
         | dirty.
         | 
          | Using unused bits in pointers can make your code more
          | efficient and simpler.
        
           | westurner wrote:
           | Can unused bits be used for signed pointers or CRCs?
        
             | aetherspawn wrote:
             | I would argue that we shouldn't bother to build CRC
             | checking of memory into programs and that ECC RAM should
             | just become the norm eventually.
             | 
             | 99.9999% of memory operations aren't CRC-d so trying to
             | fight against flaky hardware is a bit of an uphill battle.
        
       | pavlov wrote:
       | Technically it's an ointer on PowerPC and SPARC, but a pointe
       | everywhere else.
        
       | sojuz151 wrote:
        | This looks like something that debuggers would hate, because
       | following pointers would be broken
        
         | jxf wrote:
         | I was just thinking this. You'd either need to special-case
         | your debuggers or be able to use something that provided
         | bespoke hooks for pointer-following.
        
           | gpderetta wrote:
           | Wrap it in a class and teach the debugger to interpret it. It
           | might be possible with gdb and python pretty-printers for
           | example.
        
       | zozbot234 wrote:
       | Are there ways on Linux to make sure that all user-visible
       | pointers in a process are allocated within some subset of the
       | overall address space? For instance, imagine writing a 64-bit
       | program such that all userspace pointers are allocated within the
       | first 32 bits. This would allow for representing them in memory
       | using four bytes, as is commonly done on 32-bit architectures--
       | without adding more extensive or ad-hoc changes such as a new
       | "x32" binary format. Except that you could also limit to 24-bit
       | (giving 16M of address space), 16-bit (for a "tiny" code model)
       | or anything else. Of course, this would require some amount of
       | support in the linker/loader, userspace libraries and the kernel
       | itself. But the effort might be worthwhile if it can allow for
       | replacing more ad-hoc approaches.
        
         | fathyb wrote:
         | An allocator is free to only `mmap` addresses that are within a
         | range. I think jemalloc could allow you to do that using custom
         | arenas.
        
         | ksherlock wrote:
         | Linux mmap (since 2.6) has a MAP_32BIT flag, but only for
         | x86-64:
         | 
         | Put the mapping into the first 2 Gigabytes of the process
         | address space. This flag is supported only on x86-64, for
         | 64-bit programs. It was added to allow thread stacks to be
         | allocated somewhere in the first 2 GB of memory, so as to
         | improve context-switch performance on some early 64-bit
         | processors. Modern x86-64 processors no longer have this
         | performance problem, so use of this flag is not required on
         | those systems. The MAP_32BIT flag is ignored when MAP_FIXED is
         | set.
        
       | asb wrote:
       | I somehow hadn't come across this library, but have a whole blog
       | post on the various ways people store data in pointers (and
       | when+why it's safe) https://muxup.com/2023q4/storing-data-in-
       | pointers
        
       | queuebert wrote:
       | If the ointer points to a very large block of memory, is it an
       | oinker?
        
       | willcipriano wrote:
       | I never had the problem of too many pointers, but if I did I'd
       | try the "arena"[0] approach.
       | 
        | Preallocate a huge area of memory and use smaller indexes
        | relative to the start instead. Provided a pointer is bigger
        | than an int, you might be able to use this in that approach
        | and keep partial pointers instead.
       | 
       | [0]https://www.rfleury.com/p/untangling-lifetimes-the-arena-
       | all...
        
       | gumby wrote:
       | > What do you call a pointer with the high bits stolen? An
       | ointer!
       | 
        | What is the name of a pointer with the low bits stolen?
        | Machines
       | with strict word addressing will ignore the bottom two bits so
       | you can safely store what you want there.
       | 
       | Not sure I've used, much less programmed a machine with that
       | strict constraint in a while though. These days CPUs are all
       | designed to implement the C abstract machine. At least the GPU
       | folks are not subject to this constraint.
        
       | gwbas1c wrote:
       | Could something like this be used to lower memory footprint?
        
         | menaerus wrote:
          | Besides making some lock-free algorithms feasible at all and
          | others easier to implement, yes, it can also be used to
          | reduce the memory footprint at the cost of extra CPU cycles,
          | if you know that the metadata you want to track fits in
          | those 16+ bits.
        
       ___________________________________________________________________
       (page generated 2024-05-08 23:00 UTC)