[HN Gopher] In defense of linked lists
       ___________________________________________________________________
        
       In defense of linked lists
        
       Author : grep_it
       Score  : 561 points
       Date   : 2022-11-04 20:46 UTC (1 day ago)
        
 (HTM) web link (antirez.com)
 (TXT) w3m dump (antirez.com)
        
       | ordiel wrote:
       | For an "article" starting with a ramble about how the author's
       | life is being tragically affected by "the fate of Twitter", the
       | sentence
       | 
       | > get ready to read a sentimental post about a data structure
       | 
       | Makes a looot of sense
        
       | c-smile wrote:
       | Why do we need to defend a hammer against a sickle?
       | 
       | Each tool has its own value it was designed for ...
       | 
       | Try to imagine LISP without lists ...
        
         | bsder wrote:
         | > Try to imagine LISP without lists ...
         | 
         | It's called Clojure.
         | 
         | Getting rid of CONS (effectively removing singly linked lists)
         | is a huge benefit.
        
         | klysm wrote:
         | A list can have different memory representations with identical
         | semantics.
        
       | js2 wrote:
       | > Linked lists are conceptual. A node pointing to itself is the
       | most self centered thing I can imagine in computing: an ideal
       | representation of the more vulgar infinite loop. A node pointing
       | to NULL is a metaphor of loneliness. A linked list with tail and
       | head connected, a powerful symbol of a closed cycle.
       | 
       | Tongue-in-cheek, but this really made me smile. :-)
        
       | sytelus wrote:
        | Linked lists don't need a defense. It is unfortunate that the
        | job of the vast majority of developers is rather mundane. They
        | don't work on problems that require complex algorithms and data
        | structures beyond arrays and dictionaries. But interviewers keep
        | asking those questions and give these subjects a bad rep.
        | However, if you are working on building any real infrastructure
        | such as search, deep learning, cloud, crypto, game engines, OSes,
        | compilers etc., then you need linked lists, graphs, trees and a
        | myriad of algorithms surrounding them all the time. There are
        | clearly different tiers of developers: those who work on such
        | infrastructure and those who simply utilize the infrastructure
        | others have built.
        
       | nialv7 wrote:
       | It's odd that the author states they wrote the article in
       | response to arguments on Twitter, yet they didn't show us the
       | arguments this was a response to.
        
       | pharmakom wrote:
       | Immutable linked lists are a cornerstone of functional
       | programming.
       | 
        | However, Clojure has shown that the underlying implementation is
        | often better done using a tree of contiguous memory blocks.
        
       | sprawld wrote:
       | Donald Knuth's Dancing Links paper is a beautiful exploration of
       | the power of linked lists. He uses a table with doubly linked
       | lists (for rows and cols) to solve omino tiling. The linked lists
       | allow for an elegant unpicking of rows (and re-attaching when
       | backtracking)
       | 
       | The paper's a really enjoyable read, Knuth's tone throughout is
       | "look at this, this is fun"
       | 
       | https://arxiv.org/abs/cs/0011047
        
         | [deleted]
        
       | tunesmith wrote:
        | How _do_ you detect a cycle with linked lists? :) I actually have
        | a side project where I've run into this - it's a distributed
        | directed graph structure where the nodes can send messages to
        | each other. It's intended to be acyclic, but "walking the graph"
        | is really only used to send truth values, which are combined with
        | others to run Boolean logic. So in that case, the occasional
        | cycle doesn't matter, because if the calculated Boolean value
        | matches that of the node's internal state, it can just cease
        | propagating. So a structural loop won't turn into an infinite
        | processing loop.
       | 
       | The problem then becomes that I can't introduce NOT gates
       | anywhere in a cycle, because then the bit will continue flipping
       | and I'll get an infinite processing loop.
       | 
       | So it seems my only hope is external processes that continually
       | walk the graph and keep track of where it visited to try and
       | detect loops, and I don't like how that scales...
        
         | rightbyte wrote:
          | I would say the easiest way is to walk the graph and tag the
          | nodes as visited. If you don't visit every node each iteration,
          | use an increasing key you store in each node.
        
         | Smaug123 wrote:
         | One of the standard texts on the matter:
         | https://aphyr.com/posts/341-hexing-the-technical-interview
        
         | dragontamer wrote:
         | With "just" a linked list, you need a turtle (advance one-node
         | at a time), and a hare (advance two-nodes at a time).
         | 
         | If the turtle and hare ever meet, you have a cycle. Otherwise,
         | if the hare reaches the end of the list, you don't have a
         | cycle.
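
The turtle-and-hare check described above can be sketched in a few lines of C. This is a minimal sketch, assuming a simple singly linked `node` struct (not from the thread):

```c
#include <stdbool.h>
#include <stddef.h>

struct node {
    int value;
    struct node *next;
};

/* Floyd's tortoise-and-hare cycle check: the slow pointer advances one
 * node per step, the fast pointer two. If they ever meet, there is a
 * cycle; if the fast pointer falls off the end, there is none. */
bool has_cycle(const struct node *head)
{
    const struct node *slow = head;
    const struct node *fast = head;
    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;          /* one step  */
        fast = fast->next->next;    /* two steps */
        if (slow == fast)
            return true;
    }
    return false;
}
```

The check runs in O(n) time and, notably, O(1) space - no visited set is needed.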
        
           | timjver wrote:
           | Just in case performance matters, there is a more efficient
           | way: have the tortoise stay in place and advance the hare
           | only one node at a time, and assign the hare to the tortoise
           | every time a power of 2 number of steps have been made. This
           | is known as Brent's algorithm, and it requires fewer
           | advancements than the original tortoise and hare algorithm by
           | Floyd.
           | 
           | Another notable advantage of Brent's algorithm is that it
           | automatically finds the cycle length, rather than (in Floyd's
           | case) any multiple of the cycle length.
           | 
            | https://en.wikipedia.org/wiki/Cycle_detection#Brent's_algori...
        
             | Jtsummers wrote:
             | https://rosettacode.org/wiki/Cycle_detection
             | 
             | Cycle detection in (currently) 44 languages. Most (all?)
             | use Brent's algorithm. They're operating on an iterated
             | function but converting most of these to detect cycles in a
             | linked list would be straightforward.
        
         | klysm wrote:
         | Does the graph have static structure? I'm not sure if you're
         | describing something like an actor system.
        
           | tunesmith wrote:
           | Akka cluster sharding :) Shape of the graph can change as
           | processes are being run.
        
         | mdavidn wrote:
         | It's not difficult. Keep a pointer to a recently-visited node.
         | After visiting N nodes, update the pointer to the current node,
         | but also double N. If the pointer ever matches the next node,
         | the list has a cycle. The doubling of N ensures that even large
         | cycles will eventually be detected.
        
         | bewaretheirs wrote:
         | Set two pointers to the head of the list.
         | 
         | Step through the list by one element per iteration with one of
         | them, and with the other, step every other iteration.
         | 
         | If the two pointers are ever again equal you've found a cycle.
         | 
         | If you hit end of list, no cycle.
        
         | mst wrote:
         | Recycler.
         | 
         | To be clear, http://trout.me.uk/gc/recycler-overview.pdf
         | 
         | There's also a paper with the full algorithm in that directory.
         | 
         | It's how Nim's ORC collector works.
         | 
         | (note that yes, I know, this is for collection, not cycle
         | detection in general, but it's also brilliant and works)
        
         | quickthrower2 wrote:
         | Can you just store the first reference as a local variable and
         | then compare each time you move to a new node?
         | 
         | Or to save the comparisons make the thing the first node points
         | to have a stop flag of some sort.
        
           | mdavidn wrote:
           | That approach works only when the first node is included in
           | the cycle. Other cycles are possible.
        
           | Jtsummers wrote:
            | Consider this kind of cycle:
            | 
            |     Head -> Node1 -> Node2 -> Node3
            |               ^                 |
            |               +-----------------+
           | 
           | If it's a circular list then your option would work, but not
           | all cycles will produce circular lists.
        
             | quickthrower2 wrote:
             | Thanks. This seems so obvious now! Always draw it out I
             | guess.
        
         | ufo wrote:
         | There are a bunch of algorithms for detecting cycles. One of
         | the classics is the tortoise and hare algorithm, which uses
         | only O(1) space.
         | https://en.wikipedia.org/wiki/Cycle_detection#Floyd's_tortoi...
        
           | [deleted]
        
       | subarctic wrote:
       | Interesting, I have to disable HTTPS Everywhere on this site,
       | otherwise it redirects me to https://redis.io/news/138 and I get
       | a 404 page.
        
       | grog454 wrote:
        | There is no contiguous-memory, arbitrarily-resizable alternative
        | to a linked list in high-performance concurrency, is there?
        | https://docs.oracle.com/javase/7/docs/api/java/util/concurre...
        
       | tomcam wrote:
       | > Linked lists are simple. It is one of those rare data
       | structures, together with binary trees and hash tables and a few
       | more, that you can implement just from memory without likely
       | stepping into big errors.
       | 
       | One of the best engineers in Microsoft's devdiv told me that he
       | often gave a linked list implementation in C as a whiteboard
       | assignment for interviewees and that no one ever managed a bug-
       | free version. (I failed mine even after creating a full general
       | purpose implementation on my own just a couple years before.)
        
         | antirez wrote:
          | One thing is: you have 30 minutes of time, can compile and
          | test, and end with a working implementation. Another thing is
          | writing the code on a whiteboard, which is a process nobody
          | ever follows in practice: it's very easy to make mistakes this
          | way. Anyway, doubly linked lists are easy to get wrong without
          | compiling / testing.
        
       | scottlamb wrote:
       | > I thought of writing this blog post full of all the things I
       | love about linked lists.
       | 
        | The blog post is short, and I think the author covered just
        | about everything.
       | 
       | Yeah, linked lists are sometimes appropriate, especially
       | intrusive linked lists (the author called them "embedded" linked
       | lists). The hate they get is a backlash against CS programs that
       | introduce people to them early and don't emphasize enough the
       | reasons you should usually pick something else. If we have to err
       | on one side or the other, I'd rather it be telling people not to
       | use them.
       | 
       | btw:
       | 
       | > Redis can be wrong, but both Redis and the Linux kernel can't.
       | 
       | Yes, they can, and I'd rather avoid argument from authority
       | altogether.
        
         | pornel wrote:
          | Both Redis and Linux are C projects, and in C it's customary to
          | roll your own containers. Without generics, and with a culture
          | of avoiding small dependencies, it's preferable to roll your
          | own simple list rather than reach for more advanced reusable
          | data structures.
         | 
         | So I don't think it's a particularly enlightened decision by
         | these projects to use so many lists on their merit, but rather
         | a programming pattern especially common in the C language they
         | use.
        
           | colonwqbang wrote:
            | If you have a pointer to an element of a doubly linked list,
            | you can always remove it in a few processor cycles. You can
            | also always add or remove from the end or front of the list
            | in a few cycles.
           | 
           | This means that you can do those things under a spin lock or
           | with interrupts masked. You cannot really resize a dynamic
           | array under a tight spinlock, or modify a tree that should be
           | kept balanced.
           | 
           | Another use case is free lists, a list of preallocated
           | structures where you can grab one or put one back quickly
           | under spinlock.
           | 
           | These are some examples of how lists are used in kernel land.
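
The free-list pattern mentioned above can be sketched in a few lines of C. All names here are illustrative, and the locking is omitted - in kernel code the pop/push would sit under a spinlock or with interrupts masked:

```c
#include <stddef.h>

#define POOL_SIZE 64

/* A hypothetical preallocated object with an embedded free-list link. */
struct buf {
    struct buf *next_free;
    char data[256];
};

static struct buf pool[POOL_SIZE];
static struct buf *free_head;

/* Thread every pool entry onto the free list at init time. */
void pool_init(void)
{
    free_head = NULL;
    for (size_t i = 0; i < POOL_SIZE; i++) {
        pool[i].next_free = free_head;
        free_head = &pool[i];
    }
}

/* O(1) pop: a couple of pointer reads/writes, no allocator call. */
struct buf *pool_get(void)
{
    struct buf *b = free_head;
    if (b != NULL)
        free_head = b->next_free;
    return b;
}

/* O(1) push back. */
void pool_put(struct buf *b)
{
    b->next_free = free_head;
    free_head = b;
}
```

The point is that grab and put-back are bounded, tiny, and cannot fail once the pool exists - exactly what you want under a lock.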
        
             | pornel wrote:
             | I'm not saying there are no uses. Concurrent lock-free
              | algorithms are an especially useful case, and e.g. a linked
             | list of on-stack mutex waiters is a very clever approach
             | that I don't think has a better replacement. But I doubt
             | that lists are as common as they are because every place
             | really needs them.
             | 
              | VecDeque can cheaply pop/push front and end too. Arrays
             | have swap-remove. You may not need to grow if you
             | preallocated a capacity. Freelists can be a stack, or you
             | can choose a data structure that doesn't allocate objects
             | individually so you don't need the freelist in the first
             | place.
             | 
             | Another thing that is C-specific is that most objects are
             | treated as unmovable and their address is their permanent
             | identity. Without language features for prevention of
             | dangling pointers, moves are asking for trouble. This
             | favors linked lists and pointery data structures over
             | contiguous containers and inlined data.
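
Swap-remove, for reference, is a one-liner on a plain array. A sketch (on `int` for simplicity); note that the relative order of elements is not preserved:

```c
#include <stddef.h>

/* O(1) unordered removal: overwrite slot i with the last element and
 * shrink the array. Order is not preserved. */
void swap_remove(int *a, size_t *len, size_t i)
{
    a[i] = a[*len - 1];
    *len -= 1;
}
```

That's the trade: you give up ordering to get O(1) removal without any links at all.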
        
           | scottlamb wrote:
           | Meh, that's true of some C projects, but AFAIK Linux/Redis
           | have good reasons here, and if I were to criticize Linux for
           | something, it wouldn't be unwillingness to write more code.
           | 
           | I still prefer to evaluate each decision on its merits. Linux
           | has made some great decisions. Also some terrible ones--e.g.
           | fsync semantics [1], attitude to fuzzing. Redis probably has
           | too. They can agree on something and be wrong.
           | 
           | [1] https://wiki.postgresql.org/wiki/Fsync_Errors
        
         | romellem wrote:
         | > > Redis can be wrong, but both Redis and the Linux kernel
         | can't.
         | 
         | > Yes, they can, and I'd rather avoid argument from authority
         | altogether.
         | 
         | Pretty sure that was a joke, considering the overall light-
         | heartedness of the whole post.
        
           | scottlamb wrote:
           | Ahh, you're probably right. I missed it.
        
         | xxs wrote:
         | >Yes, they can, and I'd rather avoid argument from authority
         | altogether.
         | 
          | Proof by authority is the worst. Yet, the Linux kernel can have
          | valid reasons for it; but those are intrusive lists, with the
          | data itself holding a pointer to the next node.
        
         | zarzavat wrote:
         | The problem with linked lists is that they are rarely the ideal
         | solution to any problem. They are convenient _in C_ , because
         | of C's inadequacies, but in C++ and Rust they are marginalized
         | because the tools exist to use superior data structures.
         | 
          | You almost always want to unroll your linked lists, so that
          | items are fetched in blocks. But this rarely happens in C
         | because it's premature optimization _in C_. In C++ or Rust you
         | can write the code once in a generic data structure and enjoy
         | the benefits to the end of time.
         | 
         | That all said, I think we should still teach them in CS because
         | understanding linked lists is the first step to understanding
         | trees.
        
         | mac01021 wrote:
         | OS kernels tend to use them simply because they don't have
         | malloc at their disposal and so it's most practical to stick to
         | fixed size memory chunks, right?
        
           | scaramanga wrote:
           | Linux kernel's malloc is probably more resilient to the long-
           | term (as in, long running) problems you would get from
           | continually resizing an array with realloc in userspace (ie.
           | massive heap fragmentation and eventually dying in flames)
        
       | waynecochran wrote:
       | Another cool property about linked lists is that you can add an
       | element to the head of a list without mutating the previous list.
        | There are not many data structures that you can add an element to
       | and leave the previous version unchanged. This form of
       | immutability is a very nice feature that Lisp / functional
       | programmers enjoy. Because of this, you can concurrently add an
       | element to the head of a list to create new lists without having
       | a critical section in your code.
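
That Lisp-style cons can even be sketched in C (illustrative names; in a real Lisp a garbage collector reclaims unreachable cells, which this sketch omits):

```c
#include <stdlib.h>

/* A persistent cons cell: prepending allocates a fresh head and never
 * mutates the existing list, so older references remain valid. */
struct cell {
    int head;
    const struct cell *tail;
};

const struct cell *cons(int head, const struct cell *tail)
{
    struct cell *c = malloc(sizeof *c);
    if (c == NULL)
        abort();
    c->head = head;
    c->tail = tail;
    return c;
}
```

`cons(3, xs)` shares all of `xs` structurally: since no existing cell is ever written to, several readers can keep using `xs` while new heads are prepended to it.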
        
         | robocat wrote:
         | > you can concurrently add an element to the head of a list to
         | create new lists without having a critical section
         | 
         | The race is on: now you have two lists when before you had one?
        
           | rileyphone wrote:
           | Every element of the list, from its own point of view, is
           | another list.
        
       | jbandela1 wrote:
       | In my experience, the linked lists that are useful are intrusive
       | linked lists where the data you care about has a next/previous
        | embedded pointer, as opposed to just storing an arbitrary
       | object inside a linked list.
       | 
       | One example would be if your thread structure has these embedded
       | pointers, you can easily add/remove a thread to a linked list of
       | threads waiting for some resource without any allocation just by
       | pointer manipulation.
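
A condensed sketch of that pattern, loosely modeled on the Linux kernel's `list_head` idiom (names here are illustrative): the link node lives inside the object, so queueing a thread is pure pointer surgery with no allocation.

```c
#include <stddef.h>

/* Circular doubly linked list node, embedded inside the object itself. */
struct list_head {
    struct list_head *prev, *next;
};

void list_init(struct list_head *h) { h->prev = h->next = h; }

/* O(1) insert before the head sentinel, i.e. append at the tail. */
void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* O(1) unlink: no allocation, no search. */
void list_del(struct list_head *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = n;
}

/* Recover the containing object from its embedded link node. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical thread structure that can sit on a wait queue. */
struct thread {
    int tid;
    struct list_head wait_link;
};
```

Because the link is embedded, putting a thread on a wait queue can never fail for lack of memory - the storage already exists inside the thread.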
        
         | pclmulqdq wrote:
         | This is a very useful pattern when you have small collections
         | of large objects, where the vector/array implementation starts
         | to fall flat on its face. Kernels and high-performance servers
         | use these a lot for state tracking.
         | 
         | If you combine intrusive linking with a pool allocator, you can
         | have decent locality too.
        
           | scaramanga wrote:
           | Or how about an object that can't easily be relocated like,
           | you know, a stack for a kernel thread which also contains the
           | task struct and needs to be moved around between different
           | scheduler queues.
           | 
           | But yeah, I understand they took the task struct out of the
           | kernel stack now.
           | 
           | Edit: difficult being an understatement, you'd need to solve
           | the halting problem to be able to rewrite any self-
           | referential pointers to stack objects wherever they may be.
        
           | robotresearcher wrote:
            | That's absolutely right, but it's not just large objects. In
            | any situation where you follow links rarely compared to your
            | other memory accesses, the cache cost of the links becomes
            | negligible and the benefit of simplicity and O(1) operations
            | can shine. Vectors are for scrubbing through things without
            | changing their structure; lists and trees are the opposite.
           | 
           | Scrubbing through things quickly is empirically more common,
           | hence the standard advice to use vectors by default.
        
         | dclowd9901 wrote:
         | I know it's splitting hairs but doesn't that mean the structure
         | is, indeed, a linked list? A structure can be two things at
         | once.
        
         | antirez wrote:
          | This is a very popular use case inside kernel implementations.
        
           | colinfinck wrote:
           | FWIW, I recently implemented a safe Rust crate around the
           | Windows variant of intrusive linked lists [1] and spoke about
           | it at EuroRust [2].
           | 
           | I did it only for compatibility and would still prefer
           | vectors whenever I can use them. Nevertheless, I put some
            | effort into coming up with a useful and high-performing
           | implementation. To give an example: My `append` function is
           | O(1) and not a poor O(N) version that you rightfully
           | criticized in your tweet.
           | 
           | [1] https://colinfinck.de/posts/nt-list-windows-linked-lists-
           | in-... [2] https://www.youtube.com/watch?v=IxhZIyXOIw8
        
         | klysm wrote:
         | The other benefit of this kind of data structure is the ability
         | for one object to participate in multiple collections.
         | Otherwise you would have to do some kind of hash map structure
         | to point to them.
        
           | kelnos wrote:
           | Isn't that the opposite, though? If you store the next/prev
           | pointers in the data structure itself, you _can 't_ use the
           | same object in multiple collections.
           | 
           | If you have a standalone linked list data structure, then you
           | can, as long as you have a good means of tracking the
           | lifetime of the data itself.
        
             | nirs wrote:
             | Typically you keep a list entry in the struct:
             | https://man7.org/linux/man-
             | pages/man3/stailq.3.html#EXAMPLES
             | 
             | The same list entry can be moved between several lists
             | using the same type.
             | 
             | If the struct need to be in multiple lists in the same time
             | you can keep multiple list entries in the same struct:
             | https://github.com/freebsd/freebsd-
             | src/blob/69413598d2660054...
        
       | DeathArrow wrote:
        | While there's a certain use for linked lists - in operating
        | system kernels, for example - and as a programmer you absolutely
        | have to know how linked lists work, there are other linear data
        | structures which are more suited to general programming.
        | 
        | To give a C# example, arrays, lists and dictionaries (hash
        | tables) implement iterators so you can always know what the next
        | element in the collection is. Elements can be accessed by key or
        | index in O(1), and elements can be added in O(1). The case in
        | which you absolutely have to insert an element at a particular
        | position is rare, so linked lists are seldom used.
       | 
       | The same case is for C++, there is a vector class in STL but no
       | linked list class. Same for Java.
        
         | selimco wrote:
         | > The same case is for C++, there is a vector class in STL but
         | no linked list class. Same for Java.
         | 
         | Java does have a linked list in the standard library:
         | https://docs.oracle.com/en/java/javase/17/docs/api/java.base...
        
         | cyber_kinetist wrote:
         | > The same case is for C++, there is a vector class in STL but
         | no linked list class.
         | 
         | No, there is std::list.
         | 
         | https://en.cppreference.com/w/cpp/container/list
         | 
         | > Same for Java.
         | 
         | Also no, LinkedList exists.
         | 
         | https://docs.oracle.com/javase/7/docs/api/java/util/LinkedLi...
        
       | summerlight wrote:
       | https://twitter.com/cpuGoogle/status/1415061974820421635
       | 
        | A nice explanation of the rationale for using linked lists in OS
        | kernels, from a Fuchsia developer. In short, most typical
        | operations are almost guaranteed not to fail as long as the
        | invariants are not broken, which makes linked lists suitable for
        | cases where there's no fallback option left, like kernel code.
        
       | cbreynoldson wrote:
       | > You get what data structures made of links truly are: the
       | triviality of a single node that becomes a lot more powerful and
       | complex once it references another one.
       | 
       | I don't think beginners actually make this connection for a
       | while. Linked Lists are introduced analogously to arrays, sets,
       | etc. Beginners think about Linked Lists in terms of what they
       | already know.
       | 
       | As a beginner, I thought of Linked Lists purely as non-contiguous
       | arrays, even though there are deeper concepts behind them.
       | 
        | Unless beginners already have the perspective of "the part
        | having the same power as the whole", I don't think this
        | connection gets made for a while. Linked lists don't expose that
        | much possibility on their own.
        
       | armchairhacker wrote:
       | Linked lists are great to know and understand. Linked lists are
       | not good for performance. In most cases, an array or other data
       | structure is much more efficient than a linked list. But there
       | are cases where linked lists' immutability/segmentation or
       | simplicity to implement make them the better choice.
        
         | taeric wrote:
         | There are, interestingly, also times when fun facets of the
         | linked lists mutability can lead to benefits.
         | https://taeric.github.io/DancingLinks.html is my attempt at
         | exploring Knuth's DLX algorithm. I can't recommend enough that
         | you look into how he did it. Very fun and exciting use of
         | mutable doubly linked lists.
        
       | taeric wrote:
       | I suspect the problem is that many people think of stuff like
       | Java's LinkedList when they think of linked lists. As a standard
       | list implementation, I think it is easy to see that a doubly
       | linked list just isn't that strong. Especially with how
       | cumbersome the List interface makes it to take advantage of the
       | links.
       | 
        | That said, conceptually, linked items are everywhere, such that
        | you can easily list things by following the links. You probably
        | just don't bother keeping them in a "List" structure in your
        | code.
        
         | xxs wrote:
          | LinkedList is useless in Java. Aside from the lock-free
          | versions, linked lists are better avoided in any language
          | (with few exceptions).
        
           | taeric wrote:
            | I'll forever think of this tweet when I see Java's LinkedList.
           | https://twitter.com/joshbloch/status/583813919019573248
           | 
           | Funny, as I constantly have to tell folks to not bother using
           | it at the office. Everyone always assumes it has to be faster
           | than arraylist for whatever they happen to be doing this
           | time. I think the vast majority of the time it flat doesn't
           | matter, but I also fully expect LinkedList will lose most
           | speed tests.
        
             | xxs wrote:
             | LinkedList is terrible as the memory overhead is way too
             | high - if you need something better use an ArrayDeque.
        
             | smarks wrote:
             | Right. The main problem is that Java's List interface
             | affords access by int index, making it seem array-like.
             | Unfortunately get(i) on a LinkedList is O(n). This turns
             | apparently innocuous things quadratic, like looping over a
             | list by index.
        
               | xxs wrote:
                | Nothing to do with the interface - only
                | java.util.RandomAccess guarantees O(1). Iterating using
                | get(int) is a mistake regardless; stick to
                | java.util.Iterator or ListIterator if you need the index.
               | 
                | The main issue w/ the LinkedList is its memory footprint
                | (aside from the indirection on each access), along with
                | the increased GC costs (the GC has to iterate the nodes
                | too; cache misses - GC pauses). Even a half-empty
                | ArrayList is more memory friendly than any LinkedList.
                | ArrayDeque offers adds at both ends, if you wish to use
                | LinkedList as a queue.
        
         | jsight wrote:
         | I run into people that use LinkedList everywhere in their code
         | "for efficiency". It always confuses me. Are CS professors
         | teaching something that leads to this?
         | 
         | In practice, it seems unusual for it to have any advantages
         | over just using an ArrayList in typical JVM applications.
        
           | taeric wrote:
           | I blame an over reliance on Big O as an arbiter of what is a
           | fast algorithm. Linked lists are commonly taught as having
           | much better insertion speed than arrays.
           | 
           | This ignores constant factors and cache coherency, of course.
           | More, it also ignores that most lists that folks will make
           | have a VERY predictable access pattern. Typically just a one
           | way scan, if any traversal at all. With very few inserts, and
           | mostly just appending.
        
             | Narishma wrote:
             | Cache locality, not coherency.
        
       | forrestthewoods wrote:
       | The first problem with Linked Lists is you should almost never
       | use them. Do they have uses? Of course! However on modern
       | computers memory access is, roughly, more expensive than CPU
       | cycles. The Linked List has been demoted to a "special case" data
       | structure.
       | 
       | The second problem is Linked Lists are taught too early. They're
       | historically taught first in Data Structures 101. This results in
       | new programmers using LinkedLists first when they should be used
       | last.
        
       | readams wrote:
        | Linked lists are great. But they have the problem that, almost
        | always, whatever problem you're trying to solve would be better
        | solved with a regular resizable vector.
       | 
       | This includes problems that they should be great for, like insert
       | into the middle or the front.
       | 
       | The reason is that in practice the way computers actually work is
       | that there is an enormous time penalty for jumping around
       | randomly in memory, and it's large enough that it's often worth
       | paying a O(lg n) cost to switch to something that will be
       | contiguous in memory and allow the CPU to prefetch data.
       | 
       | There are exceptions when you really should use an actual linked
       | list, but your default should be a vector and not a linked list.
        
         | jstimpfle wrote:
         | That works if the data that you want to put in the sequence is
         | copyable (doesn't have "identity"), or if you can arrange so
         | that it gets created in the right spot from the start and you
         | can always choose the iteration order to be sequential in
         | memory. For many more complex structures, that is not the case.
        
           | ay wrote:
            | Could you give an example of data that is uncopyable? A
            | struct with self-referential pointers?
        
             | Const-me wrote:
             | For example, an std::mutex is uncopyable, and even
             | immovable.
        
             | AstralStorm wrote:
             | If the data is relatively big buffers, you really do not
             | want to copy them. Your memory bandwidth will thank you,
             | and performance will vastly increase. Cache locality will
             | be good anyway since you're referring to long blocks.
        
               | ay wrote:
               | This kind of data structure often has a buffer +
               | associated metadata. If the metadata+pointer fits into
                | 1-2 cache lines, storing it in a vector can be a win
               | vs. storing the next/prev pointers next to data +
               | metadata.
               | 
               | In any case, there is always only one authoritative
               | answer: "perf top". :)
        
             | hither_shores wrote:
             | It's about the meaning, not the raw bits. I might be able
             | to duplicate your signature, but that doesn't mean I can
             | sign documents for you.
        
             | jlokier wrote:
             | Example: Any object shared among multiple lists or other
             | data structures at the same time, or directly pointed to by
             | other objects, such that any of those views must reach the
             | same cheaply mutable state. This includes objects in the
             | object-oriented programing (OOP) model, actors in the actor
             | model, and file objects in a kernel which are in multiple
             | lists.
             | 
             | For those you have to store object pointers in the vector,
             | instead of copies of the objects. That works and is often
             | done, but it defeats or reduces the cache locality
              | performance improvement which motivates using a vector in
             | the first place.
             | 
             | On an in-order CPU an _intrusive_ list (pointers inside the
             | objects) can be traversed faster than a vector or B-tree of
             | object pointers, though a _non-intrusive_ list (list of
              | nodes containing object pointers) won't be faster. On an
              | out-of-order CPU with heavy speculative execution it is less
             | clear cut because the vector allows successive objects to
             | be dereferenced speculatively in parallel to a limited
             | extent, while this is not possible traversing an intrusive
             | list.
             | 
             | If the list is going to be traversed looking for objects
             | based on some filter criterion such as flags or a number
             | comparison, based on non-mutable state (or where it's ok
             | for mutation to be expensive), traversal can be sped up by
             | storing criterion data in the vector alongside the
             | pointers, or in a second vector, to avoid dereferencing
             | every touched object pointer during traversal.
        
               | gpderetta wrote:
               | Linked list are indeed great for secondary traversal
               | orders (or for subsets).
               | 
               | But I don't understand your comments re in-order CPUs:
               | vectors will be faster there too as they avoid the load-
               | load dependency. Most modern inorder CPUs are still
               | superscalar and fully pipelined.
        
         | lispm wrote:
          | What makes you think that list operations make the CPU jump
          | randomly in memory, and that implementors of, say, Lisp
          | systems haven't optimized memory management for linked lists?
          | A modern GC provides features like generations, areas,
          | copying and compacting.
          | 
          | On a Lisp Machine from the 80s/90s, the system saves a
          | type-sorted and locality-optimized memory image, from which
          | it later boots the Lisp system. In addition to the typical
          | GC optimizations (incl. integration with the paging system),
          | it also provided CDR coding, which allocates lists as
          | contiguous memory. The GC also creates those CDR-coded
          | lists. CDR coding fell out of fashion in modern Lisp
          | implementations, because the space and time savings aren't
          | great enough to justify the more complex implementation.
        
         | eru wrote:
         | > Linked lists are great. But they have the problem that,
         | almost always, whatever problem you're trying to solve would be
         | better done with a regular resizable vector.
         | 
          | One important exception is when you want your data structure
          | to be persistent.
         | 
         | See https://en.wikipedia.org/wiki/Persistent_data_structure
         | 
          | But for many applications there are also better persistent
          | data structures than linked lists around.
        
         | Smaug123 wrote:
         | Nobody has mentioned immutability yet - linked lists are easily
         | used as one of the simplest immutable persistent data
         | structures. Your definition of "better" appears to be solely
          | about keeping the CPU burning, but if CPU _isn't_ the
         | bottleneck, I prefer "immutable linked list" over "clone a
         | vector many times" merely on aesthetic and reason-ability
         | grounds.
        
           | mst wrote:
           | cons cells are a local maximum.
           | 
           | often they're sufficiently maximal.
        
           | yongjik wrote:
           | Well, of course if a solution is "cloning a vector many
           | times" then arguably you're using the vectors wrong - in
           | which case it's totally appropriate to find a different data
           | structure (or a better way to use vectors, if possible).
        
           | gleenn wrote:
           | Or you could go the Clojure route and do the tree-style
           | immutable vectors where the cost is log32(N) for most
           | operations, small enough in all the relevant operations, and
           | you get the amazing usability of immutable data structures
           | that are performant enough in most cases.
        
             | kazinator wrote:
             | How does that scale down to small sequences of under ten
             | items?
        
               | Jim_Heckler wrote:
               | an array is used for the last 1-32 elements of the vector
               | (the "tail") so there would be no trie at all, just the
               | tail
        
           | dan-robertson wrote:
           | In C where using the simplest data structures is valuable,
           | you'd likely do much better by upgrading your programming
           | language to something that offers a better data structure. If
           | you're using a higher level language where you can abstract
           | details about your sequence data structure, I think there's
           | very little benefit to using the simplest data structure and
           | you should instead have widely used libraries that offer a
           | good immutable sequence type, e.g. RRB trees.
           | 
           | One exception is ML style languages where singly linked lists
           | get a big syntax advantage (Haskell only half counts because
           | most of its lists are really just iterators). I think this
           | was, in hindsight, a mistake for a widely used language. It's
           | also a source of pointless pain for learners (eg I don't
           | think there's much benefit to people having to write lots of
           | silly tail-recursive functions or learning that you have to
           | build your list backwards and then reverse it). Another
           | exception would be some lisps where lists are everywhere and
           | the cons-cell nature of them makes it hard to change.
        
         | osigurdson wrote:
         | >> This includes problems that they should be great for, like
         | insert into the middle or the front.
         | 
         | How is a vector good for inserting elements in the middle? That
         | is O(N). Where is the O(lg n) cost coming from?
         | 
          | >> your default should be a vector and not a linked list
         | 
         | The default should be to understand the problem.
        
           | tsimionescu wrote:
           | > How is a vector good for inserting elements in the middle?
           | That is O(N). Where is the O(lg n) cost coming from?
           | 
           | Both an array and a linked list are O(n) for adding in the
            | middle. In an array, finding the middle element is O(1),
           | then moving the rest of the array is O(n) [you have to move
           | n/2 elements]. In a linked list, finding the middle element
           | is O(n) [you have to traverse n/2 elements], but adding it is
           | O(1).
           | 
           | So, asymptotically there is no difference. In practice
           | though, the linked list traversal will cause O(n) cache
           | misses to reach the middle element (since we are traversing
           | random memory), while the array move will only incur O(1)
           | cache misses (since we are accessing memory in order) - so,
           | the array will actually win in practice.
           | 
           | Edit: Note that I also don't know where the GP got the O(lg
           | n).
        
             | osigurdson wrote:
             | It seems that the case where it is necessary to find the
             | exact middle element and insert something wouldn't be very
             | common but OK, I'll humor this use case. Let's assume that
             | the list is huge and inserting at the middle is the only
             | thing we do. Just maintain a pointer to the middle element
             | and insert there: O(1) for find and insert. If it is
             | necessary to add/remove at other places in the list, it is
             | possible to maintain a doubly linked list and move the
             | middle pointer back and forth accordingly (still O(1)).
             | 
             | I suppose if the LL like structure is mostly read only with
             | rare inserts, not that big and holds simple data types or
             | structs and often needs to be read sequentially then an
             | array/vector/list would be better than a regular LL but
             | then it is obvious that an array is better anyway.
             | 
             | It's poor guidance to tell people that the vector is the
             | "go to" when you need a LL. Sad actually that this poor
             | guidance has been upvoted by so many people such that it is
             | the top comment, all in the name of some FUD about cache
             | misses.
        
               | tsimionescu wrote:
               | Let's take the case where we need to add to any element
               | of the structure, with uniform probability.
               | 
               | For an array, this means we'll have to move N elements if
               | we need to add at the beginning, N/2 if exactly in the
                | middle, or 0 if at the end. Either way, we'll need to
                | resize the array, which has some chance of requiring N
               | copies anyway - say, half the time. So, on average, we'll
               | perform 3N/4 moves per element added, requiring 1 cache
               | miss every time.
               | 
               | For a (doubly-)linked list, this means we'll have to
               | traverse 0 elements if we want to add right before Head,
                | 0 if we want to add right after Tail, and worst case
               | we'll need to traverse N/2 nodes if right in the middle -
               | so, we'll have to do on average N/4 traversals per
               | element added. Each of the nodes traversed will generally
               | be a cache miss, so N/4 cache misses. So, the Linked List
                | will indeed require a third the number of operations on
                | average, but many times more cache misses, so it should
                | be expected to lose. If you store the middle pointer as
                | well, you'll again halve the number of operations for
                | the list, but that still won't cover the time wasted on
                | cache misses.
               | 
               | Of course, if the distribution is not uniform and is in
               | fact skewed near a predictable element of the linked
               | list, at some level of skewed-ness vs cache speed
               | advantage, the Linked List will overtake the array.
               | 
               | Note that linked lists also have other advantages - for
               | example, they can be used as in the Linux kernel to store
               | the same object in multiple different lists without any
               | copying and without extra pointer indirections (at a
               | small cost of one extra pointer in the object per list it
               | could be a member of). They are also easier to pack into
               | a small address space that has to be dynamically
               | allocated (since you don't require a large contiguous
               | block).
        
           | llbeansandrice wrote:
           | I think by "middle" they mean "at the current position" not
           | at an arbitrary position in the middle when starting from
           | HEAD
        
         | huhtenberg wrote:
         | > almost always
         | 
          | Like any generalization, this one too is, of course, incorrect.
         | 
         | Outside of the realm of academia, the need to keep data pieces
         | in several _containers_ at the same time, with no heap
         | operations on insertion or removal and O(1) removal is very
         | common, especially in the kernel space and embedded contexts.
          | The only option that fits the bill is - you guessed it - the
          | linked list.
        
           | nostrademons wrote:
           | Note that this (and the article) describes an _intrusive_
           | linked list. _Extrusive_ linked lists (like you might see in
           | a class library or CS 101 project), where the node structure
           | is heap-allocated separately from the data that it points to,
           | have very few practical advantages over vectors, which is why
           | standard libraries are increasingly defaulting to vectors
            | even when the class is named "list".
        
             | majjgepolja wrote:
             | Is there a difference between intrusive and extrusive list
             | when the member type is a struct (value type)?
        
               | nostrademons wrote:
               | Yes. An intrusive list puts the pointers to next & prev
               | in the struct itself. IntrusiveList<Member>::iterator is
               | a Member*, then member->next points to another Member*.
               | An extrusive list puts the pointers in a separate node
               | structure, which then has a pointer to the actual member
               | type. ExtrusiveList<Member>::iterator is a
               | ExtrusiveListNode<Member>*, node->next points to another
               | ExtrusiveListNode*, and you access the actual Member*
               | with node->data. Basically it's a double indirection.
               | 
               | The advantage of extrusive linked lists is that you don't
               | need to modify the data layout of Member at all. All the
               | linked list manipulation happens on ListNodes, which have
               | a pointer to the actual member, which can be accessed
               | with just a cast. That's why they're so well-suited to
               | class libraries: you can easily write a generic
               | ExtrusiveLinkedList that works with any data that can be
               | represented with a pointer, hide all the traversal (and
               | the existence of individual ExtrusiveListNodes) in
               | accessor functions, and present a simple API with no
               | codegen or modifications to Member.
               | 
               | The advantages of intrusive linked lists are 1)
               | performance 2) no heap allocations 3) easy traversal when
               | given a Member*. It's only a single indirection rather
               | than a double, and usually that indirection is necessary
               | anyway to avoid copies. Insertion and deletion are just
               | pointer manipulation; you don't need to allocate new
               | space for the nodes. Oftentimes your functions will take
               | Member* anyway, and it's nice not to take or traverse the
               | container if you just need to peek at the next element
               | coming up.
               | 
               | The point of this subthread is that intrusive linked
               | lists have very clear use cases in the kernel & embedded
               | spaces, where these advantages are often critical.
               | Extrusive linked lists, however, have very few advantages
               | over vectors (which also require heap allocation but are
               | more cache & locality friendly). In a common business app
               | responding to user input, the difference between O(1)
               | amortized and O(1) is negligible; you can eat the
               | occasional pause for a vector resizing, because the user
               | isn't going to care about a few milliseconds. They both
               | require heap allocation. The vector will give you _much_
               | faster traversal, because subsequent reads all hit cache.
               | The vector takes less memory (1 pointer per element, with
               | a max of 50% overhead, while an extrusive doubly-linked
                | list is 3 pointers per element). There's just little
               | reason to use an extrusive linked list because the use-
               | cases where intrusive linked lists are not better are
               | precisely the same ones where vectors are better.
        
           | forrestthewoods wrote:
            | > Like any generalization, this one too is, of course,
            | incorrect.
           | 
           | In the real world of application development when choosing
           | between Vector and LinkedList the answer is almost always
           | Vector. There are, of course, certain problems where
           | LinkedList is always the correct answer.
           | 
           | However for all problem that have chosen either Vector or
           | LinkedList as their container of choice a super majority
           | should choose Vector.
        
             | Rioghasarig wrote:
             | > As any generalization this one too is, of course,
             | incorrect.
             | 
              | His statement wasn't a generalization, so repeating this
              | back does not make sense.
        
             | AstralStorm wrote:
             | There are essentially no vectors in most of the embedded
             | world, including RTOS. Why would you need that many extra
             | copies of data?
             | 
              | Copies are not free - not in terms of stack, nor in
              | terms of memory or its bandwidth. Both are in short
              | supply there.
             | 
             | For an example, look into how most of Zephyr RTOS is
             | implemented. It's mostly intrusive lists (called queues
             | there for some reason, but they support iteration) or ring
             | buffers (msgq or mpsc_pbuf or pipe). For bigger systems you
             | can use red-black trees.
             | 
             | There is no vector data structure, you can allocate a block
             | from the heap and manage its size manually, but it's rarely
             | done as most sizes are static and many data structures are
             | preallocated by the bootloader or loaded directly from
             | flash memory. Cache locality of these is of course possible
             | to manipulate manually by use of memory sections.
             | 
             | A list does not copy by default, which is an extremely
             | important performance characteristic. And it does not
             | require overallocation like a vector of pointers, plus has
             | a fast remove, especially at head or tail.
        
               | forrestthewoods wrote:
               | Extra copies, huh?
               | 
               | Feel free to replace the term "Vector" with "Contiguous
               | Memory". The relevant distinction in this conversation is
               | "contiguous vs node pointers". Embedded loves contiguous
               | blocks.
               | 
               | A dynamic array copies memory if it regrows, inserts,
               | removes. The cost of that operation depends on numerous
                | factors. But it doesn't result in "extra copies" lying
               | around.
               | 
               | Oh wait maybe you mean array operations perform extra
               | copies, not that they persist. In which case it comes
               | down to which is more expensive: memcpy some bytes or
               | cache inefficient pointer traversal.
               | 
               | It depends! But most problems for most programs are MUCH
               | faster if they avoid LinkedList. Many people, especially
               | new programmers, radically underestimate the penalty of
               | pointer chasing.
        
               | ernst_klim wrote:
               | > Embedded loves contiguous blocks.
               | 
               | Quite the opposite. When you have no dynamic allocations,
                | vectors are out of the question, while a pool of nodes
                | is fine and sound.
               | 
               | And quite often you can't copy elements from one place to
               | another in embedded or operating systems, since many
               | consumers can store a pointer to this element. Of course
                | this could be solved by using a vector of pointers, but
                | that completely ruins the "cache locality" argument.
        
         | tylerhou wrote:
         | If you're iterating, sure, use a vector. If you're pulling from
         | a queue and doing a lot of processing, maybe an RPC -- does the
         | ~70ns memory latency hit really matter that much? Probably not.
        
           | kibwen wrote:
           | Why impose that latency if you don't need to? It costs me
           | nothing to reach for a vector instead.
        
             | tylerhou wrote:
             | Without extra work you can't use a vector like a queue, and
             | a circular buffer doesn't necessarily support all types
              | and/or preserve iterators on insert. You could use a
              | deque or a blocked linked list. But IMO a list is fine if
              | it's not clearly a bottleneck.
        
             | AstralStorm wrote:
             | Costs you a dynamic memory allocation, code overhead and
             | higher memory use. There are good reasons why std::array
             | and plain arrays exist. Vectors are for when your data is
             | unbounded, which is actually a risk most of the time. For
             | truly big data, you want a special structure anyway.
        
           | pornel wrote:
           | If you're pulling from a queue use a ring buffer.
        
             | AstralStorm wrote:
             | Sometimes. Depends on whether you are filling in a data
             | buffer or trying to send an already calculated data buffer.
             | 
             | A standard ring buffer does copies/fill. A ring buffer of
             | pointers has similar performance characteristics to a
             | linked list.
        
             | Phelinofist wrote:
             | OT but does anyone know a general purpose ring buffer
             | implementation for Java?
        
               | grogers wrote:
               | LMAX disruptor: https://lmax-
               | exchange.github.io/disruptor/user-guide/index.h...
               | 
               | It's used by log4j2 and several other java libraries and
               | is quite solid. You can use it in blocking or polling
               | modes, so it's quite generic depending on your latency
               | requirements.
        
               | speed_spread wrote:
               | Unfortunately log4j2 is no longer a shining example of
               | good engineering, if you're trying to sell something...
               | 
               | I'd prefer saying LMAX is / was used as the core of high
               | frequency trading applications.
        
         | manv1 wrote:
         | It's surprising people are arguing about this. That fact alone
         | shows that lots of developers don't understand the relationship
         | between L1/L2 cache, data access patterns, and how prefetching
         | works.
         | 
         | That makes sense, given how abstract things have gotten.
         | Decades ago there was an article that showed the penalty for
         | non-linear data access was massive. That was before speculative
         | access, branch prediction and cache prefetching were standard
          | features. The performance hit today would be even greater (except
         | presumably for machines that have implemented the speculative
         | execution security mitigations).
        
           | eru wrote:
           | Well, ideally your programming language (and its standard
           | library) should provide you with basic data structures that
            | are cache-oblivious (or optimized for your architecture's
            | cache), so programmers wouldn't need to worry.
        
           | teo_zero wrote:
           | I think the argument of locality and linearity and cache hits
           | is valid, but specific for one access pattern: reading items
           | in order and adding to the tail. If that's what your code
           | ends up doing most of the time, then an array wins over a
           | linked list hands down.
           | 
           | As soon as you start adding or deleting items in the middle,
           | the game changes because the data you need to move is not
           | contained in the cache anymore (except for small arrays).
           | 
           | I can even imagine a workflow that uses both data structures:
           | an app could build up a linked list with the data it reads
           | from an external file, for example, where they can be
           | scattered and not in order; when the file is finally closed,
           | the data are considered read-only and rearranged into an
           | array for the remaining part of the execution.
           | 
            | So don't dismiss linked lists as inefficient "a priori" -
            | which is exactly antirez's point in the article.
        
             | tsimionescu wrote:
             | That analysis is lacking. Linked lists turn out to be
             | inferior to arrays for essentially any access pattern, at
             | least when the linked list can be assumed to be randomly
             | stored and not accidentally sorted.
             | 
             | Say you want to add an element in the middle of an
             | N-element ll/array.
             | 
             | For the linked list, this means you need to do 2 * N/2
             | memory reads (read value, read next pointer), which will
             | mean on average N/2 cache misses, to reach the element at
             | index N/2. Then you'll do O(1) operations to add your new
             | element. So, overall you've done O(N) operations and
             | incurred O(N) cache misses.
             | 
             | For an array, there are two cases:
             | 
             | 1. In the happy case, you'll be able to re-alloc the array
              | in-place, and then you'll have to move N/2 elements one to
              | the right to make room for your new element. You'll incur
             | one cache miss to read the start of the array, and one
             | cache miss to read the middle of the array, and one cache
             | miss for realloc() to check if there is enough space. The
             | move itself will not incur any more cache misses, since
             | your memory access pattern is very predictable. So, overall
             | you've done O(N) operations, but incurred only O(1) cache
             | misses - a clear win over the linked list case.
             | 
             | 2. Unfortunately, sometimes there is no more room to expand
             | the array in place, and you actually have to move the
             | entire array to a new place. Here you'll have to move N
             | elements instead of N/2, but you'll still incur only 1
             | cache miss for the same reason as above. Given the massive
             | difference between a cache miss and a memory access, doing
             | N operations with 1 cache miss should easily be faster than
             | doing N/2 operations with N/2 cache misses.
             | 
             | The only access patterns where the linked list should be
             | expected to have an advantage are cases where you are
             | adding elements very close to an element which you already
             | have "in hand" (e.g. at the head, or at the tail if you
              | also keep a pointer to the tail, or to the current element
             | if you're traversing the list while modifying it). In those
             | cases, the linked list would need O(1) operations and cache
             | misses, while the array would need O(N) operations and O(1)
             | cache misses, so it would start to lose out.
        
           | pencilguin wrote:
           | The fact is that in 95+% of cases, the performance doesn't
           | matter at all, and a linked list is fine.
           | 
           | The only problem is that use cases evolve, and what was fine
           | suddenly isn't, anymore.
        
           | couchand wrote:
           | As with any such subject, it's worth discussing the details
           | but probably not the generalities really (except perhaps in
           | the whimsical style of antirez).
           | 
           | You're right to point out that given the performance
           | characteristics of most modern CPUs, accessing memory
            | linearly is optimal. But it doesn't necessarily follow that
           | all collection data structures should be stored as a vector.
           | Consider a collection traversed very slowly relative to the
           | overall computation. There may not be any cache win if the
           | line's always evicted by the time you come back to it.
        
             | fiddlerwoaroof wrote:
             | I've always wondered about whether you could have your cake
             | and eat it too with linked lists where nodes are allocated
             | in contiguous blocks of N nodes. So, the normal case of
             | traversing the next pointer is not a discontinuous memory
             | access.
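One way to sketch this idea: hand nodes out of a contiguous arena, so following `next` usually lands a few bytes away rather than at a random heap address. `NodeArena` and `make` are illustrative names, and slots are never recycled, to keep the sketch short:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Node {
    int value;
    Node* next;
};

// Arena that hands nodes out of one contiguous, pre-reserved block, so
// consecutive allocations sit next to each other in memory. Capacity is
// fixed up front; growing would reallocate and invalidate node pointers,
// hence the assert.
class NodeArena {
    std::vector<Node> slab;   // one contiguous block of nodes
public:
    explicit NodeArena(std::size_t cap) { slab.reserve(cap); }
    Node* make(int v, Node* next) {
        assert(slab.size() < slab.capacity());  // sketch: fixed capacity
        slab.push_back(Node{v, next});
        return &slab.back();
    }
};
```

A list built this way still supports O(1) splicing, but a traversal mostly walks forward through one allocation.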
        
               | tsimionescu wrote:
               | That's quite likely to happen to a linked list when using
               | a compacting GC (such as Java's or C#'s generational
               | garbage collectors) (assuming the elements of the list
               | are only reachable from the preceding element, not from
               | many other places as well).
        
           | simplotek wrote:
           | > It's surprising people are arguing about this. That fact
           | alone shows that lots of developers don't understand the
           | relationship between L1/L2 cache, data access patterns, and
           | how prefetching works.
           | 
           | Your misconception lies in the assumption that the only
           | conceivable use case is iterating over a preallocated data
           | structure to do read-only operations.
           | 
           | Once you add real world usages, with CRUD operations
           | involving memory allocations/reallocations then your
           | misconception about cache crumbles, and you start to
           | understand why "people are arguing over this".
        
             | Dylan16807 wrote:
             | Which misconceptions? Messy memory allocations hurt lists
             | even more.
        
           | pizza wrote:
           | It's hard for me to blame developers for their misaligned
           | mental models since, as you yourself point out, the
           | relationship between your mental model of the hardware's
           | behavior and what it actually does is complex and not really
           | obvious. Mitigations etc means that the hardware you bought
           | might totally change its behavior in the future. If all you
           | can rely on is time measurements, it doesn't explain that the
           | software you wrote last year is now eg 40% slower due to bug
           | mitigations. I think it's easier to expect developers to put
           | in the effort once they're empowered with powerful tools to
           | explain how the software gets mapped onto the hardware.
           | 
           | I think tools that result in greater transparency at lower
           | levels will probably emerge, something like a compiler-
           | generated dynamic analysis + pre-flight checklist hybrid. But
           | since it's so low-level, it might actually affect results,
            | which may be a problem. I think Intel VTune seems interesting
           | in that regard.
        
         | djha-skin wrote:
          | This is only a problem with the implementation of linked
          | lists. The article talks about how linked lists are
          | conceptual. They make a nice data structure to work with in
          | code, and implementations can easily be conceived that keep
          | elements adjacent to each other.
        
         | jalino23 wrote:
         | by vector you mean regular js arrays?
        
           | josephg wrote:
           | Broadly yes, C++'s vector is the equivalent of JS arrays.
           | 
           | But the javascript array type is much more complicated. It
           | intelligently changes its internal structure based on how you
           | use it - which is pretty wild.
        
             | andrekandre wrote:
              | NSArray (Objective-C) also does something similar
             | 
             | https://news.ycombinator.com/item?id=2413656 (sadly the
             | article link is dead)
        
             | haimez wrote:
              | I recommend James Mickens' JavaScript talk on this matter:
             | https://m.youtube.com/watch?t=3m10s&v=D5xh0ZIEUOE
             | 
             | (Time stamped to the beginning of the arrays section)
        
             | LecroJS wrote:
             | Would you mind elaborating on js arrays intelligently
             | changing their internal structure based on use? My
             | background is only JS/Python and have never touched C++ so
             | I don't have the context of how these differ.
        
               | josephg wrote:
               | A c++ vector internally uses an array - which is a chunk
               | of memory where each item is just in the next spot in
               | memory in sequential order. Eg, the array [1,2,3] might
               | have 1 at memory address 100, 2 at memory address 101 and
               | 3 at memory address 102. Given an array index, you can
               | figure out where a value is stored in memory pretty
               | trivially.
               | 
               | When the array fills up, vector will allocate a fresh
               | array (thats bigger - often 2x bigger) and copy
                | everything over. This is slow - because it has to copy
                | everything - but much faster than you might think.
               | 
                | So that's how C++'s _vector_ (and Rust's _Vec_) work.
               | 
               | Javascript implementations (like v8) can swap out the
               | implementation behind the scenes based on how you're
               | using your array. The API is always the same - but the
               | way the array works in memory will be different based on
               | whatever is fastest given the way you're using your
               | array. (V8 doesn't know ahead of time what sort of data
               | will be in your array or what you're using it for, so it
               | guesses then changes things up if it guesses wrong).
               | 
               | Here's a blog post talking about some of the internal
               | details in V8:
               | 
               | https://itnext.io/v8-deep-dives-understanding-array-
               | internal...
               | 
               | The actual implementation is much more complex than this
               | blog post suggests. (I think they also optimize based on
               | push/pop vs shift/unshift vs other stuff).
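The growth scheme described above can be sketched in a few lines (names are illustrative; real implementations add exception safety, move semantics, and tuned growth factors):

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Minimal sketch of a growable array: when the backing buffer fills,
// allocate one twice as big and copy everything across.
class DynArray {
    std::unique_ptr<int[]> data = std::make_unique<int[]>(1);
    std::size_t cap = 1, len = 0;
public:
    void push_back(int v) {
        if (len == cap) {                        // full: grow by doubling
            auto bigger = std::make_unique<int[]>(cap * 2);
            for (std::size_t i = 0; i < len; ++i)
                bigger[i] = data[i];             // the O(n) copy step
            data = std::move(bigger);
            cap *= 2;
        }
        data[len++] = v;
    }
    int operator[](std::size_t i) const { return data[i]; }
    std::size_t size() const { return len; }
    std::size_t capacity() const { return cap; }
};
```

Because capacity doubles, each element is copied a bounded number of times over the array's lifetime, which is where the amortized O(1) append comes from.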
        
               | LecroJS wrote:
                | Thanks for taking the time to write that up - I
                | appreciate the explanation.
               | 
               | My only adjacent knowledge here was when I checked the V8
               | implementation of the `sort` method on the array
               | prototype and saw it changes the sorting algorithm used
               | based on array length (IIRC insertion sort for length <
               | 10 and merge sort otherwise).
               | 
               | Thanks for the V8 resource as well.
        
         | bjoli wrote:
         | While I generally agree with you, I have had multiple occasions
         | where amortized O(1) was unacceptable, whereas the overhead of
         | linked lists was ok.
        
           | eru wrote:
           | You can make dynamic arrays have worst case O(1) at the cost
           | of some extra overhead.
        
             | bjoli wrote:
             | Copy the array on every append? :)
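Not quite: the standard de-amortization trick (presumably what eru is alluding to) is to keep the old buffer alive and migrate a couple of elements into a double-sized buffer on every append, so no single push ever does O(n) copying. A sketch, with the caveat that the `resize` call here still initializes the new buffer in one go; a production version would use raw, uninitialized storage:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Dynamic array with worst-case O(1) append (modulo allocation): instead
// of copying everything at once on growth, migrate two old elements per
// push into a double-sized buffer.
class SteadyVec {
    std::vector<int> cur;       // old buffer, valid in [migrated, cur.size())
    std::vector<int> nxt;       // bigger buffer being filled; empty when idle
    std::size_t migrated = 0;   // prefix of cur already copied into nxt
    std::size_t n = 0;          // logical element count
public:
    SteadyVec() { cur.reserve(2); }

    std::size_t size() const { return n; }

    int& operator[](std::size_t i) {
        // During migration, [0, migrated) and [cur.size(), n) live in nxt.
        if (!nxt.empty() && (i < migrated || i >= cur.size())) return nxt[i];
        return cur[i];
    }

    void push_back(int v) {
        if (nxt.empty() && cur.size() == cur.capacity())
            nxt.resize(2 * cur.capacity());   // start a migration phase
        if (!nxt.empty()) {
            nxt[n++] = v;                     // append into the big buffer
            // Move at most two old elements per push: O(1) worst case.
            for (int k = 0; k < 2 && migrated < cur.size(); ++k, ++migrated)
                nxt[migrated] = cur[migrated];
            if (migrated == cur.size()) {     // done: retire the old buffer
                nxt.resize(n);
                cur = std::move(nxt);
                nxt.clear();
                migrated = 0;
            }
        } else {
            cur.push_back(v);
            ++n;
        }
    }
};
```

Migration always finishes well before the new buffer fills, since the new buffer has room for as many appends as there are old elements, and each append retires two of them.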
        
         | halayli wrote:
         | if you can replace linked list with vector then you don't need
         | a linked list and the two shouldn't be compared as it is almost
         | always obvious when you need one over the other. Link lists are
         | used in areas when there is a relationship between nodes like a
         | list of running processes. one of the frequent operations is to
         | remove a node from whatever position it's in and add it on
         | either side.
        
           | Sirened wrote:
           | Right. Like there are so many algorithms that I shudder to
           | think about how you'd implement them without linked lists.
           | Like...how about a buddy allocator? Sure you _could_ use a
           | vector for each but you 'd be copy and resizing huge swathes
           | of memory constantly for very little gain. Use the right tool
           | for the job!
        
             | tsimionescu wrote:
             | The thing is, the cost of copying memory is essentially the
             | same as the cost of reading it, and linked lists force you
             | to keep re-reading the same memory over and over (assuming
             | random access).
             | 
             | Adding an element in the middle of an array vs the middle
             | of a linked list actually has the same asymptotic
             | complexity (O(n)), but far better cache impact for the
             | array (since moving a contiguous block of n/2 elements
             | should only incur 1 cache miss, while reading n/2 randomly
             | distributed nodes of the list will incur on average
             | something like n/4 cache misses). Adding closer to the
             | beginning of the linked list is faster, but adding close to
             | the end of the vector is faster still, so overall with
             | random access, the vector should win.
        
         | DeathArrow wrote:
         | Also, when in doubt, do a quick benchmark with different
         | implementation.
        
         | imbnwa wrote:
         | I've always wondered why GSAP and CreateJS both rely on linked
         | lists for sequencing animations
        
         | xaedes wrote:
         | I learned to love linked lists as soon as I discovered that I
         | can just store them in vectors to get the performance of
         | guaranteed contiguous memory:                 //
         | LinkedListItem[k]: item[k], prev[k], next[k]
         | std::vector<T> item;       std::vector<uint> prev;
         | std::vector<uint> next;
         | 
         | Similar is used in transparency rendering with per-pixel
         | linked-lists.
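A fleshed-out sketch of the same idea: a doubly linked list whose nodes live in parallel vectors, linked by 32-bit indices instead of pointers. Freed slots are not recycled here, to keep it short; a real version would keep a free list of indices:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct VecList {
    static constexpr std::uint32_t NIL = UINT32_MAX;  // "no node"
    std::vector<int> item;
    std::vector<std::uint32_t> prev, next;            // index links
    std::uint32_t head = NIL, tail = NIL;

    std::uint32_t push_back(int v) {
        std::uint32_t k = (std::uint32_t)item.size();
        item.push_back(v);
        prev.push_back(tail);
        next.push_back(NIL);
        if (tail != NIL) next[tail] = k; else head = k;
        tail = k;
        return k;
    }

    void unlink(std::uint32_t k) {   // O(1), exactly as with pointer nodes
        if (prev[k] != NIL) next[prev[k]] = next[k]; else head = next[k];
        if (next[k] != NIL) prev[next[k]] = prev[k]; else tail = prev[k];
    }
};
```

As a bonus, 32-bit indices are half the size of 64-bit pointers, so more links fit per cache line.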
        
           | anthomtb wrote:
           | Interesting idea but how does C++ guarantee contiguous memory
           | for a vector? I just don't see how a data structure with an
           | arbitrary, dynamic size can also reside in a contiguous range
           | of address space.
        
             | zabzonk wrote:
             | the size of a c++ object is fixed at compile-time
        
             | fnbr wrote:
             | When the array is resized, it's moved to a new contiguous
             | block of memory: everything is copied or moved over. See:
             | https://stackoverflow.com/questions/8261037/what-happen-
             | to-p...
        
             | frankchn wrote:
             | Simple, you just allocate a bigger contiguous chunk of
             | memory and copy the entire vector over when the current
             | chunk maxes out.
        
         | [deleted]
        
         | nraynaud wrote:
         | I think a lot of graph edits are a nightmare to perform with an
         | adjacency list, and a breeze when you just edit linked
         | pointers.
         | 
         | Imagine the Euler Operators on a manifold (ie a 3D mesh) or on
         | a planar graph, but on a vector instead of on a DCEL.
        
         | chrisseaton wrote:
         | > almost always, whatever problem you're trying to solve would
         | be better done with a regular resizable vector
         | 
         | Resizing a vector brings the whole of it into cache,
         | potentially dragging all of it all the way from actual RAM.
         | Prepending an entry brings nothing into cache.
         | 
         | Cache is king.
        
           | tsimionescu wrote:
            | Sure, prepending or adding at the end (assuming you're keeping
           | a tail pointer, which is easy and common) are basically the
           | only cases where a linked list can beat an array.
        
             | chrisseaton wrote:
             | Yeah, and we deliberately prepend because that doesn't
             | involve writing to a field in a node which may not be in
             | cache. It's the consumer's job to reverse the list if they
             | ever actually need it.
        
           | jameshart wrote:
           | > Resizing a vector brings the whole of it into cache
           | 
           | I don't do low level code on modern systems; is there not a
            | way to just blit chunks of memory around without a CPU
            | looping its way through word by word, carrying out MOVQ
            | instructions that do a full RAM read/write?
           | 
           | If you're streaming data out of memory to a PCIe bus or
           | something, that wouldn't go via cpu cache, right? There's
           | some kind of DMA model, so there's something in the system
           | architecture that allows memory access to be done off-CPU (I
           | vaguely think of this as being what the northbridge was for
           | in the old school x86 architecture but I'm fuzzy on the
           | details?).
           | 
           | Mechanical copying of byte arrays feels like something a CPU
           | should be able to outsource (precisely because you don't want
           | to pollute your cache with that data)
        
             | chrisseaton wrote:
             | Not in architectures I've seen or maybe I'm missing the
             | instructions to do it? Even if there were and you can
             | bypass the CPU, hauling data out of RAM is still going to
             | take a long time.
        
               | NobodyNada wrote:
               | > maybe I'm missing the instructions to do it?
               | 
               | Yes, they're called "non-temporal" load and store
               | instructions. AFAIK most memcpy implementations will
               | automatically use non-temporal access when copying blocks
               | larger than a certain threshold, precisely to avoid
               | destroying the cache during bulk copies.
               | 
               | Additionally, a well-optimized memcpy implementation
               | (e.g. making good use of prefetch hints) should not
               | suffer _too_ much from RAM latency and be able to pretty
               | much max out the bandwidth available to the CPU core.
        
               | chrisseaton wrote:
               | You're thinking of like MOVNTDQA? Those still have to
               | reach into RAM - that's still double-digits nanoseconds
               | away and still creates a stall in the processor at _n_
               | complexity!! No thanks? I can allocate a new linked list
               | node with memory guaranteed already entirely in cache
                | (except for a TLB miss.)
               | 
               | > be able to pretty much max out the bandwidth available
               | to the CPU core
               | 
               | Why max out a slow link at _n_ complexity when you can
               | just avoid doing that entirely and use a linked list?
        
               | [deleted]
        
               | Dylan16807 wrote:
               | Surely you know the answer, if you're analyzing at this
               | level of detail?
               | 
               | It's because spending 1 millisecond per X operations
               | doing vector moves is preferable to spending 20
               | milliseconds per X operations stalling on list links.
               | 
               | If you set a good vector size from the start, the moves
               | will be rare to nonexistent. Or you could reserve some
               | address space and _never_ have to move the data.
        
             | tsimionescu wrote:
             | In the ideal case (and assuming a vector large enough that
             | it spans many pages), you would actually be able to work
             | with the OS's virtual memory manager to re-allocate and
             | move only the page that contains the address you want to
             | write to, while all other pages can stay untouched. But
             | there is quite a bit of careful work to do to implement
             | something like this and make sure you are not invalidating
             | pointers or even internal malloc() data structures.
        
           | KerrAvon wrote:
           | So use an array deque if you need O(1) prepending.
        
             | chrisseaton wrote:
                | Resizing the array in the array deque brings the entire
             | thing into cache. That's a disaster.
        
               | kragen wrote:
               | It's a disaster but it's an amortized O(1) disaster.
        
               | CyberDildonics wrote:
               | It's far from a disaster because the best way to deal
               | with a list is still to use a contiguous array.
               | 
               | Allocation isn't free. You have to do it sometime and
               | doing it all in bulk is going to have the same
               | consequences as just one small allocation anyway. If you
               | are mapping new pages into main memory, making system
               | calls or looping through a heap structure, you're not
               | only affecting the data cache but also the TLB while also
               | pointer chasing.
               | 
               | Making a list out of an allocation for every element is
               | performance suicide on modern computers. Even the
               | amortized resizing of an array is done at the speed of
               | memory bandwidth because of the prefetcher. Thinking that
               | is some sort of performance hindrance is absurd because
               | it outweighs the alternative by orders of magnitude. If
               | there really are hard limits on the time every insertion
               | can take, then the solution needs to be links of large
               | chunks of memory to still minimize allocations on every
               | element.
        
             | kragen wrote:
             | That's amortized O(1), not WCET O(1). For transformational
             | systems they're equivalent, for reactive systems they
             | aren't.
        
         | im3w1l wrote:
         | One data structure I recently found myself wanting is a tree-
         | based "array", with O(log n) access by index / insertion
         | anywhere / deletion anywhere.
         | 
         | Kinda weird how rare it seems to be.
        
           | vanviegen wrote:
           | That does sound useful, although I'm having some trouble
           | thinking of what for exactly. :-)
           | 
            | And indeed, I don't recall seeing something like that. At
            | first glance O(log n) seems easy to do in the average
            | case, but perhaps not in the worst case.
        
             | socksy wrote:
             | I suppose it's quite similar to the idea of Bagwell's
             | persistent data structures[1] -- those are particularly
             | useful for lock free coding and with a high enough
             | branching factor could be not as much overhead as a linked
             | list or just plain copying.
             | 
             | [1] https://hypirion.com/musings/understanding-persistent-
             | vector...
        
           | Munksgaard wrote:
           | That sounds like a pretty run-of-the-mill balanced binary
           | tree? Rust has a BTreeMap: https://doc.rust-
           | lang.org/std/collections/struct.BTreeMap.ht...
        
             | vanviegen wrote:
             | Inserting into an array increments the indexes of all
             | subsequent values. How would a regular BTree emulate that
             | behavior?
        
               | AstralStorm wrote:
               | It would rebalance on insert with O(log N) performance
               | but chunky constant factor, keeping the ordering so
               | access by index is also O(log N) with low constant
               | factor...
               | 
               | Which is why usually you would use a red-black tree
               | rather than a BTree, as it has much lower constant for
               | insertion and access by index. However higher for
               | traversal in order.
        
           | Koshkin wrote:
           | Well, doesn't the "index" need to be stored in the tree
           | itself for this to work? If so, the tree just represents an
           | ordered map. (Changing indexes after deletion or insertion
           | would be a problem, though.)
        
             | im3w1l wrote:
             | https://stackoverflow.com/a/837911 disregard the O(1)
             | discussion.
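One structure with exactly these properties is an implicit treap: a randomized balanced tree where every node stores its subtree size, giving expected O(log n) access, insertion, and deletion by position, with later indexes shifting automatically. A compact sketch (nodes freed only on erase, to keep it short):

```cpp
#include <cassert>
#include <cstdlib>

struct TNode {
    int val, pri, size = 1;
    TNode *l = nullptr, *r = nullptr;
    explicit TNode(int v) : val(v), pri(std::rand()) {}
};

int sz(TNode* t) { return t ? t->size : 0; }
void pull(TNode* t) { t->size = 1 + sz(t->l) + sz(t->r); }

// Split t into a = first k elements, b = the rest.
void split(TNode* t, int k, TNode*& a, TNode*& b) {
    if (!t) { a = b = nullptr; return; }
    if (sz(t->l) < k) { split(t->r, k - sz(t->l) - 1, t->r, b); a = t; }
    else              { split(t->l, k, a, t->l); b = t; }
    pull(t);
}

// Join two sequences, keeping the heap property on random priorities.
TNode* merge(TNode* a, TNode* b) {
    if (!a || !b) return a ? a : b;
    if (a->pri > b->pri) { a->r = merge(a->r, b); pull(a); return a; }
    b->l = merge(a, b->l); pull(b); return b;
}

// Insert v at position pos; later indexes shift up by one.
void insert_at(TNode*& root, int pos, int v) {
    TNode *a, *b;
    split(root, pos, a, b);
    root = merge(merge(a, new TNode(v)), b);
}

// Delete the element at position pos; later indexes shift down by one.
void erase_at(TNode*& root, int pos) {
    TNode *a, *b, *c;
    split(root, pos, a, b);
    split(b, 1, b, c);
    delete b;
    root = merge(a, c);
}

int get(TNode* t, int pos) {   // access by index, expected O(log n)
    while (true) {
        if (pos < sz(t->l)) t = t->l;
        else if (pos == sz(t->l)) return t->val;
        else { pos -= sz(t->l) + 1; t = t->r; }
    }
}
```

The same subtree-size trick works on any balanced tree (red-black, AVL, B-tree); the treap is just the shortest to write down.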
        
         | WhitneyLand wrote:
         | I wish the word vector didn't have so many different uses in
         | science and engineering.
        
         | layer8 wrote:
         | That depends on the number of elements and the size of the
         | payload data. Both approaches can be combined by using a linked
         | list of arrays.
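The combination layer8 describes is usually called an unrolled linked list. A minimal sketch (the chunk size of 8 and all names here are arbitrary choices for illustration):

```cpp
#include <array>
#include <cassert>
#include <memory>

// Each node carries a small array of elements, so a traversal touches
// far fewer cache lines than one-element-per-node, while appends still
// never copy existing elements.
struct Chunk {
    std::array<int, 8> vals;   // 8 elements per node (tunable)
    int count = 0;
    std::unique_ptr<Chunk> next;
};

struct UnrolledList {
    std::unique_ptr<Chunk> head;
    Chunk* tail = nullptr;

    void push_back(int v) {
        if (!tail || tail->count == (int)tail->vals.size()) {
            auto c = std::make_unique<Chunk>();
            Chunk* raw = c.get();
            if (tail) tail->next = std::move(c); else head = std::move(c);
            tail = raw;
        }
        tail->vals[tail->count++] = v;
    }

    int sum() const {          // traversal example: mostly linear scans
        int s = 0;
        for (const Chunk* c = head.get(); c; c = c->next.get())
            for (int i = 0; i < c->count; ++i) s += c->vals[i];
        return s;
    }
};
```

Tuning the chunk size against element size is exactly the payload-dependent trade-off mentioned above.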
        
           | marcosdumay wrote:
           | Yes.
           | 
            | There is a reason why it's hard to create either a bare
            | linked list or a bare array in a modern language: both are
            | bad extremes, and the optimal data structure is almost
            | always some combination of them.
        
         | andirk wrote:
         | Understanding, and even sympathizing, with the machine's toil
         | re: vectors vs manual linked list can separate a good engineer
         | from a great one. We learn linked lists, assembly, and even
         | machine code in Computer Science majors so that we know what's
         | happening under the hood and can more easily surmise the
         | runtime effects.
        
           | kibwen wrote:
           | All of these are indeed important, and will eventually (one
           | hopes) result in the student realizing that linked lists are
           | to be avoided because of their poor mechanical sympathy
           | resulting from their awful cache locality.
           | 
           | Valid uses for a linked list are extremely niche. Yes, you
           | can name some, and I can name several valid uses for bloom
           | filters. Just because a simple linked list is easy to
           | implement does not mean it should be used, any more than
           | bubble sort should be used for its simplicity.
        
             | throwawaymaths wrote:
             | Wait why do linked lists have bad cache locality? It
             | depends on how your heap is set up. For example, you could
             | have an allocator that gives relatively good locality by
             | having high granularity and only bailing out if say your LL
             | gets super long (so if your lists are usually short, they
             | could have awesome locality)
        
               | AstralStorm wrote:
               | If you are in that place, you're probably using a linked
               | list over small statically allocated memory, but you
               | still need random ordering and fast remove or reorder.
        
               | duskwuff wrote:
               | That only works if you allocate lists all at once and
               | never modify them.
               | 
               | If there are any other allocations taking place at the
               | same time -- say, allocations from other threads, or for
               | other data you had to create while building the list,
               | that locality is shot. Same goes if you insert more
               | elements to the list later, or if you perform an
               | operation which changes its order (like sorting it).
        
               | wtetzner wrote:
               | > That only works if you allocate lists all at once and
               | never modify them.
               | 
               | You can reorder and modify the contents of the list
               | without losing locality.
        
               | duskwuff wrote:
               | I mean, if you modify the contents of the list nodes in
               | place, sure. But at that point, you're just doing an
               | array with extra steps.
               | 
               | Changing any of the pointers, however, will make memory
               | prefetch much less efficient. CPUs like linear memory
               | access patterns; they're easy to predict. Chasing
               | pointers is much less predictable, even if the pointers
               | are all to the same region of memory.
        
               | wtetzner wrote:
               | It might still be cheaper than copying the values if
               | they're large.
        
               | cerved wrote:
                | That doesn't seem to make much sense to me. LLs are
                | useful in cases where you deal with changing lists,
                | where you'd otherwise be forced to allocate and
                | deallocate memory often. If you know beforehand the
                | exact or approximate size of the list and its
                | contents, you can store it contiguously in memory
                | with greater cache locality, because you're not
                | storing pointers in addition to data. Seems to me the
                | use case for LLs is perhaps as ill suited for cache
                | locality as the structure itself.
        
               | wtetzner wrote:
               | They're also useful when you want to cheaply reorder
               | nodes without copying the elements.
        
               | cerved wrote:
                | True. You could implement that using an array of
                | structs, where each struct links to other structs by
                | array index instead of by memory pointer (pointers
                | being 8 bytes in 64-bit applications). This way you
                | get the nice properties of both linked lists and
                | contiguously allocated arrays.
        
               | CalChris wrote:
               | If your list is short then you won't have a problem. But
               | if your list is long then traversing it can be a major
               | problem.
               | 
               | For example, if your list item is page aligned like the
               | _task_struct_ in the Linux kernel, rescheduling the next
               | process could traverse the process list. The process list
                | links will be at the same page offset, mapping to a
                | restricted set of the L1/L2/L3 cache lines, and will
                | thrash from memory. Worse, the TLB will thrash as
                | well. This was
               | actually the case for Linux until about 2000.
               | 
               | https://www.usenix.org/legacy/publications/library/procee
               | din...
        
             | mst wrote:
             | http://trout.me.uk/lisp/vlist.pdf
        
             | atemerev wrote:
             | An order book in electronic trading.
             | 
             | It is not too large (a few thousand of entries usually),
             | but it needs to be accessed and modified extremely fast.
             | There can be insertions and deletions anywhere, but it is
             | strongly biased towards front. It is ordered by price,
             | either ascending or descending.
             | 
             | Various "ordered queue" implementations are usually a poor
             | fit, as there are many insertions and deletions in the
             | middle, and they need to avoid any additional latency. The
              | same goes for vectors. Red-black trees etc. are too complex
             | (and slow) for such a small size.
             | 
             | So we start with a humble linked list, and optimize it
             | (unrolled, skip search etc). But we respect linked lists,
             | they work.
        
           | BeetleB wrote:
           | > Understanding, and even sympathizing, with the machine's
           | toil re: vectors vs manual linked list can separate a good
           | engineer from a great one.
           | 
           | I think your parent's point is that if you're a C++
           | programmer, recognizing that vectors outperform other data
           | structures for most use cases is what separates a good
           | engineer from a great one. Often even for things that require
           | lookup (i.e. vector outperforming a set).
           | 
           | During code reviews, a senior C++ programmer in my company
           | (formerly sat in the C++ standards committee) would always
           | demand benchmarks from the developer if they used anything
           | else. In most cases, using representative real world data,
           | they discovered that vectors outperformed even when the
           | theoretical complexity of the other data structure was
           | better.
           | 
           | A great engineer knows the limitation of theory.
           | 
           | Of course, my comment is only regarding C++. No guarantees
           | that vectors are great in other languages.
        
             | asveikau wrote:
             | > [V]ectors outperform other data structures ... even for
             | things that require lookup (i.e. vector outperforming a
             | set)
             | 
             | This is something I've heard before (not often benchmarked)
             | and it makes intuitive sense, and sometimes I've also told
             | it to myself in order to not feel bad for writing O(n)
             | where n is always small.
             | 
             | But somehow, reading this snippet, I'm reminded that the
             | vector does not maintain the semantic reminder that it's
             | used in a key/value use case. So it has me thinking there
             | should be a map/hashtable/dictionary interface that's
             | really just a vector with linear lookup.
             | 
             | Then I think, hmm, maybe the standard containers should
             | just do that. Maybe the authors of those should have some
             | heuristic of "when to get fancy".
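A sketch of the interface described above: map semantics backed by a plain vector with linear lookup. `TinyFlatMap` is an illustrative name, not a standard API (C++23's `std::flat_map` is the standardized relative, though that one keeps its keys sorted for binary search):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Map semantics, vector storage: lookups are linear scans, which beat a
// hash or tree for small n thanks to locality and trivial branching.
template <class K, class V>
class TinyFlatMap {
    std::vector<std::pair<K, V>> entries;
public:
    V* find(const K& key) {
        for (auto& e : entries)            // linear scan, cache friendly
            if (e.first == key) return &e.second;
        return nullptr;
    }
    void insert_or_assign(const K& key, V value) {
        if (V* p = find(key)) *p = std::move(value);
        else entries.emplace_back(key, std::move(value));
    }
    std::size_t size() const { return entries.size(); }
};
```

Keeping the key/value intent in the type name documents the use case even while the representation stays a vector.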
        
               | dabitude wrote:
               | These are flat_map, flat_set, flat_hash_map, etc ...
               | 
               | See: https://en.cppreference.com/w/cpp/header/flat_map
        
               | usefulcat wrote:
               | I believe that flat_map and flat_set at least are sorted,
               | so that binary_search/lower_bound can be used for
               | lookups.
               | 
               | Linear search generally does outperform for small
               | collections though (it's better for caches and the branch
               | predictor).
        
               | asveikau wrote:
               | C++23! The future is here.
        
               | bluGill wrote:
                | std::find_if works great if you want to treat a vector
               | like a map. Plus you can have different keys if that is
               | useful for your problem (look up employee by name, id, or
               | address?)
        
               | KerrAvon wrote:
                | I think the cases where lookup is faster for a
                | linearly scanned vector are those where the hash
                | computation overhead is greater than a linear scan.
                | Assuming you have a good hash
               | implementation, that should be true only for a small
               | number of entries, where you need to determine "small" by
               | measuring.
        
               | asveikau wrote:
               | I think it's about locality and quickness of compare.
               | 
               | So if you have a small data structure and/or large cache,
               | and the key is not complicated.
        
               | bluGill wrote:
                | Even if the key is cheap to compare, the fact that the
                | map's nodes aren't in cache means the vector will be
                | faster, just because we avoid the cache miss of loading
                | the value into memory.
        
             | GuB-42 wrote:
             | > Often even for things that require lookup (i.e. vector
             | outperforming a set)
             | 
             | I am always uncomfortable using higher algorithmic
             | complexity. Sure, for small data sets, the vector will
             | outperform the set, but it may not stay small forever. I
              | may waste a millisecond now, but at least I won't waste an
             | hour if the user decides to use 100x the amount of data I
             | initially considered. And using much more data than
             | expected is a very common thing. It is a robustness thing
             | more than optimization.
             | 
             | If you want both, there are staged algorithms, a typical
             | example is a sort algorithm that goes insertion ->
             | quicksort -> mergesort. Insertion sort is O(N^2), but it is
             | great for small arrays, quicksort is O(N log N) on average
             | but may become O(N^2) in some cases, and mergesort is
             | guaranteed O(N log N) and parallelizable, but it is also
             | the slowest for small arrays. For lists, that would be a
             | list of vectors.
        
               | jll29 wrote:
               | I like the staging idea, but curiously I haven't seen any
               | papers investigating this.
               | 
               | Obviously, there is an overhead cost of the staging
               | itself, which hits the use case for small data, and there
               | is the question "From what data size onwards to use what
               | algorithm?".
               | 
               | For example, I just had a look at sort in Clang's STL
               | implementation, which seems to set a boundary at N=1000.
               | Since the number is a multiple of 10, it looks like it
                | might be arbitrarily picked; in any case there are no
                | paper references in the comments to back it up.
               | 
               | Are people aware of formal studies that explore whether
               | it is worth the trouble to inspect data structure and
               | then to dispatch to different algorithms based on the
               | findings?
        
               | the_sleaze_ wrote:
               | You're saying you think there's still optimization to be
               | done re. the N=1000 constant?
               | 
               | Other than that, I'd suggest that big O complexity is
               | already telling you that the inspection is worth it.
               | Assuming inspection of an array is constant.
        
               | BeetleB wrote:
               | > Sure, for small data sets, the vector will outperform
               | the set, but it may not stay small forever.
               | 
               | In over 90% of SW applications written in C++, there is
               | little to no uncertainty on the size of the data set.
               | Most applications are written for a very specific
               | purpose, with fairly well known needs.
               | 
               | If my data set is size N, and I can benchmark and show
               | the vector outperforms the theoretical option up to, say,
               | 20N, it's a safe choice to go with vector. Doing these
               | analyses _is_ engineering.
               | 
               | In any case, it's almost trivial to swap out the vector
               | with other options when needed, if you plan for it.
        
             | zelphirkalt wrote:
             | I hope that senior engineer is also the one person, who
             | will refactor things, if ever there are more elements in
             | any of those vectors, making linear search slower than an
             | actual set or other data structure. Oh and of course that
             | senior engineer will also hopefully be the one monitoring
             | the lengths of all those vectors in production.
        
               | josefx wrote:
               | > vectors, making linear search slower
               | 
               | The standard library gives you plenty of tools to
               | maintain and search in a sorted vector. Often enough
               | still faster than maps.
               | 
               | > Oh and of course that senior engineer will also
               | hopefully be the one monitoring the lengths of all those
               | vectors in production.
               | 
               | Doesn't a std::map have more space overhead than a
               | vector?
        
               | BeetleB wrote:
               | I believe he was referring to the number of elements in a
               | vector, to know when to swap with a different data
               | structure - not the size in bytes.
               | 
               | But to your point - I once needed to do a key/value
               | lookup for a large data set, and so _of course_ I used a
               | map. Then I ran out of RAM (had 80GB of it too!). So I
               | switched to a vector and pre-sorted it and used
               | binary_search for lookups (data set was static - populate
                | once and the rest of the codebase didn't modify it). And
               | of course, the RAM consumption was less than half that of
               | the map.
        
               | sfink wrote:
               | Just describe your sorted list as "a balanced binary tree
               | with implicit child links" and then you get credit for
               | being fast, space-efficient, _and_ using fancy data
               | structure.
        
         | sqrt_1 wrote:
         | Here is a performance graph on removing items from a vector vs
         | list - slightly old (2017)
         | https://github.com/dtrebilco/Taren/blob/master/Articles/iter...
         | 
         | From the article
         | https://github.com/dtrebilco/Taren/blob/master/Articles/Eras...
        
           | devmunchies wrote:
            | That is removing an item at index N, right? (That's why it's
            | talking about iterators?) Linked lists are only fast at
            | append/pop with the first node.
        
         | hintymad wrote:
          | Isn't the linked list the fundamental data structure of LISP?
          | The sweet memories of cons() this and cdr() that are all
          | backed by linked lists?
        
           | tialaramex wrote:
           | So, two things. One: 1980s machines don't have the memory
            | access performance you see today. If N list chase operations
            | take the same time as N sequential reads, then the linked
            | list has the nice properties you were probably taught in
            | school and an actual in-memory representation of a linked
            | list is a good choice. On today's machines that list chase might
           | incur a sizeable cache stall, making it thousands or even
           | tens of thousands of times slower.
           | 
           | Two: Lisp doesn't actually care whether there are literally
           | linked lists behind everything. Today you would use a
           | different data structure reflecting the hardware you have.
        
             | pjmlp wrote:
              | Not only today, but since the mid-70s; yet somehow many
              | keep thinking Lisp only has lists.
        
             | lispm wrote:
             | > 1980s machines don't have the memory access performance
             | you see today
             | 
             | They had similar problems. CPU with tiny caches <-> caches
             | <-> expensive RAM in the range from 500kbytes to a few MB
             | <-> virtual memory paging to slow disks.
             | 
             | For example a typical Lisp Machine might have had 20 MB
             | RAM. But the Lisp image it ran was probably already much
             | larger. Thus paging spaces upwards from 60 megabytes were
             | not uncommon. I had a Lisp Machine with 40 MB RAM and have
              | used > 200 MB paging space. Disks were very slow. We are
             | talking about ESDI (2.5 Mbyte/sec or less) interfaces or
             | later 5-10 Mbyte/sec SCSI 1 and 2.
             | 
             | I had a 600MB ESDI disk inside a 1 MIPS Lisp Machine with 8
             | Megawords RAM of 36bit memory + 8 bit ECC.
             | 
              | Thus locality played an extremely large role for usable
             | performance. In the early days machines had to be rebooted
             | when they ran out of memory, since a garbage collection
             | could take a long time. Rebooting a machine was just a few
             | minutes. Doing a full GC over 200 MB virtual memory could
             | take half an hour.
             | 
             | When I was making a new Lisp image (called a world), the
             | size was upwards 50MB. 100 MB was common. A special command
             | ran for roughly 30 minutes and reordered the objects in
              | main memory to improve locality. Then another command saved
              | the image - which also took tens of minutes.
             | 
             | A big breakthrough in usability came with the introduction
             | of the Ephemeral Garbage Collector, which only touched RAM
             | and took care of the short lived objects, with some
             | hardware support to identify and track RAM pages with
             | changed content.
             | 
             | Features back then were:
             | 
             | * cdr coded lists which were allocated like vectors
             | 
             | * lots of other data structures like vectors, n-dimensional
             | arrays, hashtables, records, objects, ...
             | 
             | * an ephemeral garbage collector with hardware support
             | tracking changes in RAM pages
             | 
             | * incremental garbage collection
             | 
             | * a copying/compacting generational garbage collector with
             | type sorted memory regions
             | 
             | * cooperation between the garbage collector and the virtual
             | memory pager
             | 
             | * various manual or semi-manual memory management
             | facilities
             | 
             | * incremental memory image saves
             | 
              | The main reason to develop Lisp Machines in the late 70s was
             | to get Lisp development off of time-shared computers (with
             | limited shared RAM and virtual memory) onto single user
             | workstations, where RAM and virtual memory is not shared
             | between different users.
             | 
             | The same problem appeared then on UNIX machines, where Lisp
             | systems often were among the most memory hungry programs ->
             | thus they needed lots of RAM, which was expensive. Thus a
             | lot of virtual memory was used. But access to Lisp objects
             | in virtual memory was much slower than Lisp objects in RAM.
             | It took many years to have competitive GCs on those
             | machines.
             | 
             | When RAM got more affordable and larger, things improved.
        
           | pjmlp wrote:
           | No, only if you are speaking about the first implementations
           | of LISP until LISP 1.5 and similar timeframe.
           | 
           | By the time Xerox, MIT, TI started delving into Lisp
           | machines, the language had already gotten support for stack
           | allocation, value types, arrays, structures (records).
           | 
           | Same applies to its big brother Scheme.
        
           | ww520 wrote:
            | Cons cells can build a tree, though they can certainly build a list.
           | 
           | (a1 (b1 (c1 c2 c3) b2 (c4 c5)) a2 a3) is a tree.
        
         | sgtnoodle wrote:
         | I often write code destined to run on a microcontroller, and
         | linked lists are great for avoiding dynamic memory. It's nice
         | to be able to statically or transiently allocate storage
         | without having to bound the size of anything.
         | 
         | Of course, the same unintuitive tradeoffs tend to apply even if
         | there isn't much caching to worry about. A trivial linear
         | iteration is often faster than a more elaborate algorithm when
         | the input is small.
        
           | bitexploder wrote:
            | Carmack is a fan of this approach. Allocate some reasonably
            | large number as your design constraint, and if you start
            | approaching it, that is the time to ask why and whether this
            | data structure is still reasonable here.
        
           | KerrAvon wrote:
           | A memory-constrained microcontroller is actually one of the
           | valid exceptions to the rule.
        
             | Sirened wrote:
             | Well, yes, and a lot of people treat kernel development as
             | if they were developing for a constrained microcontroller.
             | I don't think this is even a bad thing, we want the kernel
             | to be as lean as possible because every page and cycle the
             | kernel burns is one that users don't get to use.
        
               | jstimpfle wrote:
                | I suppose a more important reason is the "not have to
                | bound anything" part. OS kernels almost by definition don't
               | know how many objects will be created. Implementing
               | queues as vectors of pointers will require dynamic
               | allocation and/or introduce many more failure points when
               | going out of resources. With linked lists, once you have
               | an object you can always append it to a linked-list
               | queue.
        
               | Dylan16807 wrote:
               | This is a memory vs. time tradeoff. And the time cost
               | gets worse as the machine gets bigger.
               | 
               | Designing for embedded is not the same thing as being
               | lean.
        
             | rcxdude wrote:
             | It's also an area where the costs are a lot less
              | significant: a Cortex-M4 and below often doesn't
             | have a cache (at least for RAM: flash often has a cache and
              | predictive lookup, but the performance difference is small
             | enough that the cache can basically render executing code
             | from flash as identical to RAM), and the RAM is about as
             | fast as the processor, so random access and sequential
             | access are pretty similar performance wise. (And with the
             | speed of modern microcontroller cores the size of the data
             | is rarely even an issue: most STM32s can iterate through
             | their entire RAM in under a millisecond).
             | 
             | That said, in my embedded work I still rarely find a linked
             | list to be a useful data structure (but I know FreeRTOS,
             | which I use a lot, uses them extremely heavily internally).
        
           | G3rn0ti wrote:
           | That's why the Amiga OS (Exec) used linked lists as its
           | primary data structure. It was one of the first micro
           | computer operating systems using dynamic memory mapping (i.e.
           | no fixed addresses for OS calls and data structures) and
           | supported multi tasking. Any object you received from Exec
           | basically was a pointer to a list node. And, yes, the Amiga's
           | biggest problem was memory fragmentation and leakage because
            | unreachable linked list nodes would start to accumulate
           | whenever working with a buggy program.
        
         | dmitrygr wrote:
         | Modern CPUs detect patterns of pointer chasing common in linked
         | list use and prefetch to cache just fine. Your comment would
         | have been valid a decade or more ago.
         | 
         | And _plenty_ of use cases are much better with linked lists
         | than resizable vectors. Eg: queues.
        
           | vbezhenar wrote:
           | RAM latency is huge. No amount of prefetching can fix that.
           | If anything, prefetching works better with arrays as RAM
           | reads multiple bytes at once.
        
           | xxs wrote:
           | Queues are best implemented with a pow2 cyclic array, and you
           | have a free deque, too.
        
             | attractivechaos wrote:
             | A ring-buffer based queue is mostly better than a linked
             | list but it is not necessarily the best. The typical STL
             | deque implementation [1], with all its complexity, is
             | usually faster than a ring buffer.
             | 
             | [1] https://stackoverflow.com/questions/6292332/what-
             | really-is-a...
        
           | tylerhou wrote:
           | I have some experience using/modifying linked-list benchmarks
           | (https://github.com/google/multichase) specifically to test
           | memory latency.
           | 
           | It is extremely difficult, maybe impossible, to design a
           | prefetcher that can predict the next cacheline(s) to prefetch
            | while traversing a linked list. I am not aware of a single
           | CPU that can do this consistently. For instance, if you run
           | multichase (a linked-list chaser) on GCP servers, you
           | generally get the expected memory latency (~70-100ns,
           | depending on the platform).
        
             | AstralStorm wrote:
             | Why let the CPU guess when you can tell it what you want?
             | (Prefetch a small arena of objects linked to by the list.)
        
           | gpderetta wrote:
            | Even with a perfect prefetcher a linked list would still be
           | slower to iterate than a vector as it is inherently serial
           | (unless your CPU does address prediction and I don't think
           | any CPU does).
        
           | Borealid wrote:
           | I think there's more to cache locality than prefetching. In
            | an array, consecutive elements will usually be adjacent in
            | physical memory (excepting page boundaries), so each cache
           | line load where a cache line is larger than the array element
           | will "overfetch" into the next element. This should mean
           | fewer total cache loads and thus more efficient use of memory
           | bandwidth for situations where it is constrained (such as
           | walking a list).
        
             | AstralStorm wrote:
             | Then again, if you have control over allocations you can
              | prefetch whole chunks of data to which the list refers,
              | making this point moot.
             | 
             | Often lists refer to arrays or sections of memory. The
             | performance loss if any appears in bigger structures where
             | you do want to have explicit control anyway.
        
           | howinteresting wrote:
           | For most use cases a ring buffer either with or without a
           | maximum length will do much better as a queue than a linked
           | list will.
           | 
           | There are some niche use cases where linked lists are good.
           | Lock-free queues are one, and another big set of use cases is
           | where you need hard O(1) insert even with a high constant
           | factor, not amortized O(1).
        
             | anonymoushn wrote:
             | High-performance SPSC queues are always implemented with an
             | array.
             | 
             | If you need hard O(1) insert you can usually just allocate
             | an array much bigger than you will need. The combination of
             | hard performance requirements and willingness to wait for
             | an allocator all the time is unusual.
        
             | gamegoblin wrote:
              | The place I see linked lists pop up the most is when
             | implementing LRU caches. You need a linked hashmap of sorts
             | where the hashmap gives O(1) lookup to the linked list node
             | in question, and then you can extract that node and move it
             | to the head of the queue in O(1) as well.
             | 
             | This is a special case of where they also appear: linked
             | hashmap implementations, where you want a hashmap that has
             | a consistent iteration order.
        
               | xxs wrote:
                | Having linked nodes in a hashmap is just that - I'd not
                | call it a linked list. You can do a moving median with a
               | binary tree and linked nodes, too.
        
               | josephg wrote:
               | I've seen them show up in lots of situations through
               | intrusive lists, which I think is the general form of the
               | example you're giving.
               | 
               | Intrusive lists are used in the Linux kernel, in physics
               | engines (eg Chipmunk2d) and game engines. The way Redis
               | uses them for TTL expiry follows the same pattern.
        
             | xxs wrote:
             | >where you need hard O(1) insert even with a high constant
             | factor
             | 
             | You will need a truly concurrent memory allocator for that,
             | too... or if you are just the kernel (one of the reasons
             | the linux kernel can benefit from linked lists)
             | 
             | Overall, allocating new objects/memory is not guaranteed
             | O(1).
        
           | spijdar wrote:
           | There are a lot of factors involved, but given a limited
           | number of _cache lines_ that can be stored in low level
            | cache, I would think there'd be at least some performance
           | penalty for prefetching from non-linear blocks of memory vs
           | the inherent spatial locality in an array/vector, if only
           | because it'd cause more old cache to be evicted.
        
             | AstralStorm wrote:
             | Or the opposite, prefetching a whole vector when you just
             | need one item, evicting more than needed unnecessarily.
             | 
             | There is a reason why non-embedded CPUs have sizable L2
             | caches now.
        
           | colinmhayes wrote:
            | Queues are generally implemented with a vector of
            | pointers to vectors. Nicely diagrammed here
           | https://stackoverflow.com/questions/6292332/what-really-
           | is-a... Ring buffers also work better than linked lists
        
             | waynesonfire wrote:
              | Are Erlang and the JDK general enough for you? Oh, both use
              | linked lists:
             | 
             | https://github.com/erlang/otp/blob/master/lib/stdlib/src/qu
             | e... https://github.com/openjdk-
             | mirror/jdk7u-jdk/blob/master/src/...
        
               | xxs wrote:
                | Why would you refer to LinkedBlockingQueue? You have
                | ArrayBlockingQueue, too. ArrayDeque is just better than
               | LinkedList.
        
           | [deleted]
        
           | anonymoushn wrote:
           | Even if your linked list is backed by an array and all the
           | elements are right next to each other, computing some cheap
           | function of the elements of a linked list is incredibly slow
           | compared to the alternative because of the unnecessary and
           | very long dependency chain.
           | 
           | Vectors are basically perfect for queues.
        
           | optymizer wrote:
           | I'm going to call you out on this one, because it's a bold
           | claim, and I'd love to see an explanation and some perf
           | numbers.
           | 
           | For example, I'm wondering how the CPU knows what the next
            | item in the list is so as to prefetch it. Unlike the next item in an
           | array, list items could be pointing anywhere in process
           | memory.
           | 
           | Also, what counts as a modern CPU in this context? Are we
           | talking latest generation desktop CPUs from Intel, or
           | anything after Core 2 Duo? How about mobile devices with ARM
           | CPUs? Would those just be ignored?
           | 
           | Is there a benchmark?
        
             | dmitrygr wrote:
             | Feast your eyes on the "Intel(r) 64 and IA-32 Architectures
             | Optimization Reference Manual": https://cdrdv2-public.intel
             | .com/671488/248966_software_optim... (newer link, as
             | requested by one of the child comments)
             | 
             | It details the prefetchers, how they work, what patterns
             | they recognize (section 3.7)
        
               | NavinF wrote:
               | Prefetching just one element ahead does fuck-all when ram
               | is 100x slower than cache especially if you need to
               | precharge and activate a new row which is always the case
               | when you're spraying tiny objects all over the place.
        
               | AstralStorm wrote:
               | So don't spray tiny objects around. :) Keep them nicely
               | tucked away in an arena.
               | 
               | Vectors actually tend to create a spray of tiny
               | objects... As opposed to a true dynamic array which has
               | limitations in resize. If you're lucky, they will keep a
               | single dynamic array per vector. You're usually not this
               | lucky with bigger data.
        
               | josephg wrote:
               | If you never free your objects, you may as well use an
                | array anyway - it'll be much simpler. And if you use
               | an arena with an object pool, you're back in cache miss
               | territory.
               | 
               | Vectors only "create a spray of tiny objects" if you have
               | a vector-of-references or something like that. And even
               | then, reading data requires 2 reads from memory (1 to
               | read the pointer in the vector, and another to read the
               | object). Finding item N in a linked list will require N
               | reads (since you need to read each item before the
               | target). Yes, resizing a vector is O(n). But every time I
               | measure memcpy, it always goes faster than I expect.
               | 
               | If you disagree, try to prove it with code. At least one
               | of us will learn something that way.
        
               | insanitybit wrote:
               | This is a great document that will absolutely confirm, in
               | a number of ways, that linked lists are going to be
               | terrible for prefetching.
        
               | AstralStorm wrote:
               | That depends more on what they link to. Not prefetching a
               | worthless list of pointers when you could prefetch data
               | instead is typically faster. Prefetchers have limited
               | capabilities and capacity for loads.
        
               | insanitybit wrote:
               | The prefetcher doesn't even have to get involved for
               | linear sequences, it would only get involved after a
               | cache miss threshold is met. The fastest and simplest
               | prefetcher, L1, is stream based and will work great for
               | sequential data.
        
               | [deleted]
        
               | josephg wrote:
               | It doesn't claim linked lists are fast. And it doesn't
               | have any benchmarking data.
               | 
               | Actually the only reference to linked lists I see is in
               | the prefetcher section (3.7.3), where it explicitly
               | recommends that consecutive items are stored in a single
                | 4-KByte page or in consecutive cache lines. They give a
               | positive code example using an array, like other
               | commenters are suggesting.
        
               | fuckstick wrote:
               | It's a link to a document on architectures well over a
               | decade old - the most recent mentioned is the original
               | Core architecture (from 2008). Honestly, based on your
               | first comment I thought you were implying something about
               | content dependent prefetch or other techniques that I am
               | familiar in the academic literature but unaware of ever
               | being used in mainstream hardware.
               | 
                | > Method 1:
                | > * Organize the data so consecutive accesses can usually
                | > be found in the same 4-KByte page.
                | > * Access the data in constant strides forward or
                | > backward (IP Prefetcher).
                | 
                | > Method 2:
                | > * Organize the data in consecutive lines.
                | > * Access the data in increasing addresses, in
                | > sequential cache lines.
               | 
               | Nothing new here that contradicts the GPs skepticism.
               | Certainly not enough evidence for you to be a dick.
        
               | dmitrygr wrote:
                | yup, it was a link to an old document with the same
                | title. updated. thanks.
               | thanks.
        
               | fuckstick wrote:
               | Ok, but the hardware data prefetch functionality seems
               | basically unchanged:
               | 
               | "Characteristics of the hardware prefetcher are: * It
               | requires some regularity in the data access patterns. --
               | If a data access pattern has constant stride, hardware
               | prefetching is effective if the access stride is less
               | than half of the trigger distance of hardware prefetcher.
               | -- If the access stride is not constant, the automatic
               | hardware prefetcher can mask memory latency if the
               | strides of two successive cache misses are less than the
               | trigger threshold distance (small- stride memory
               | traffic). -- The automatic hardware prefetcher is most
               | effective if the strides of two successive cache misses
               | remain less than the trigger threshold distance and close
               | to 64 bytes."
        
               | forrestthewoods wrote:
               | My benchmarks say you're wrong.
               | 
               | https://www.forrestthewoods.com/blog/memory-bandwidth-
               | napkin...
        
               | dmitrygr wrote:
               | You're joking right?
               | 
               | "I made an extra test that wraps matrix4x4 in
               | std::unique_ptr."
               | 
               | And that is your test? One piece of code? Not even
               | presenting disassembly? Who the hell knows what your
               | compiler wrought in response to your "std::unique_ptr"
               | and "matrix4x4" ?
        
               | forrestthewoods wrote:
               | Well I've provided 1 benchmark to your 0. I'd say I'm
               | ahead.
               | 
               | My code benchmark is actually far more pre-fetch friendly
               | than a LinkedList because it can prefetch more than one
               | "node" at a time. In a LinkedList you can't prefetch N+1
               | and N+2 at the same time because N+2 is dependent on the
               | result of N+1.
               | 
               | I'm always open to learning new things. Modern compiler
               | magic is deep and dark. If sometimes a compiler magically
               | makes it fast and sometimes slow that'd be quite
               | interesting! If you have any concrete examples I'd love
               | to read them.
        
               | attractivechaos wrote:
               | Sorry for my ignorance - a linked list matching the
               | performance of a vector is something new to me. I would
               | like to learn more. The best way to prove your point is
               | to show us a benchmark. However, I couldn't find one with
               | a quick google search.
        
             | porcoda wrote:
             | Google for Intel hardware prefetcher and you'll turn up
             | some info.
             | 
             | Re: benchmarks. Not likely exactly what you're looking for,
             | but the GigaUpdates Per Second benchmark
             | (https://en.wikipedia.org/wiki/Giga-updates_per_second) is
             | sort of the most-pathological-case of pointer jumping where
             | the jumps are random. Prefetchers don't do well with that
             | (for obvious reasons) - they tend to work best when there
             | is some pattern to the accesses. Linked data structures
             | often live somewhere in between - they may start with good,
             | regular stride patterns, but then over time as elements are
             | added and moved around, it turns into a tangle of
             | spaghetti.
             | 
             | Edit: I should add more. While prefetchers do exist in
             | hardware, they're a bit of a murky subject. Sometimes they
               | speed things up, sometimes they do so badly that they slow
               | everything down. They can be very workload dependent. And
             | to top it off, hardware vendors can sometimes be a bit
             | opaque when explaining what their prefetching hardware
             | actually does well on. It's probably safest to not assume
             | your hardware will help you when it comes to tangled
             | pointer jumping code, and if you can avoid it via a
             | different data structure it's probably good to do so. That
             | said, other data structures may entail other tradeoffs -
             | better cache usage for lookups, but potentially more
             | expensive insertions. As usual with data structures, it's
             | a game of balancing tradeoffs.
        
               | insanitybit wrote:
               | I don't buy this.
               | 
               | https://www.youtube.com/watch?v=WDIkqP4JbkE
               | 
               | Prefetching in CPUs predates this talk.
               | 
               | https://www.youtube.com/watch?v=Nz9SiF0QVKY
               | 
               | That talk is 2018.
               | 
               | Cache prefetching works best for linear accesses like
               | with a vector, not for random pointer lookups.
               | Prefetchers are also going to have an even harder time
               | with double indirection, since they're unable to see which
               | value is going to be fed to the lea.
               | 
               | Prefetching is also expensive in and of itself; the CPU
               | doesn't do it unless you're already incurring cache hits.
               | This makes cache misses _even worse_. There are also
               | multiple prefetchers in the CPU and _the L1 prefetcher_
               | is very basic and is only going to help with contiguous
               | accesses. Your allocator might be doing some nice things
               | for you that let one of the L2 prefetchers help; I would
               | expect that you'll see better L2 performance due to
               | prefetching.
               | 
               | If you have an example of a linked list being as fast as
               | a vector for iteration, please do show it.
        
               | porcoda wrote:
               | ... I didn't say it was for random pointer lookups.
               | That's the pathological case that was the primary reason
               | the Cray X-MT (formerly Tera MTA) was invented. As I
               | said, when you have pointers with some regularity (hence
               | my reference to regular or patterned striding), then you
               | _may_ get some benefit of prefetching when you're not
               | just dealing with linear contiguous access. The "may" is
               | important there - whether you do or don't get it is
               | workload / data dependent. As I also said, more likely
               | than not over time the data structure will lose
               | regularity as pointers get re-wired, and the benefits of
               | prefetching go out the window.
               | 
               | I'm not sure what you "don't buy" - I didn't say
               | definitively that prefetching does or does not always
               | help. Sometimes it does, and when it does, it's highly
               | dependent on the regularity of the way the pointers cause
               | you to stride through memory. I even basically agreed
               | with you -- to quote me, "It's probably safest to not
               | assume your hardware will help you when it comes to
               | tangled pointer jumping code". You appear to want to
               | argue for some reason.
        
               | insanitybit wrote:
               | > As I said, when you have pointers with some regularity
               | (hence my reference to regular or patterned striding),
               | then you may get some benefit of prefetching when you're
               | not just dealing with linear contiguous access.
               | 
               | I would need a source to believe this. The only way I can
               | imagine this working is that the L2 prefetcher might
               | incidentally work well with your allocator - your
               | allocator might try to hand out more or less colocated or
               | evenly strided data.
               | 
               | > I'm not sure what you "don't buy"
               | 
               | This statement, primarily:
               | 
               | > Modern CPUs detect patterns of pointer chasing common
               | in linked list use and prefetch to cache just fine
               | 
               | To me, this statement reads as:
               | 
               | 1. Prefetchers in CPUs can prefetch data by examining
               | loads that have yet to be executed
               | 
               | 2. That they do this well, or at least that's how I read
               | "just fine"
               | 
               |  _To my knowledge_ neither of those is true.
               | 
               | > You appear to want to argue for some reason.
               | 
               | I'm not trying to argue at all, I'm trying to learn. What
               | you're saying does not sound right to me, but it would be
               | extremely interesting to me if I learned that CPUs are
               | doing something that I'm not aware of.
               | 
               | If you're actually saying that an allocator may end up
               | striding allocations such that the prefetcher can
               | reliably predict where to load next, sure I'll buy that,
               | at least for L2. That's a fine thing to say, it just
               | isn't what I took away from your initial statement, which
               | is what I wanted to drill into because, again, it would
               | be very interesting to me.
               | 
               | Sorry if I'm coming off as argumentative, I was rushing
               | that previous comment, which isn't really fair to put on
               | you. I _genuinely_ am just curious, and if this is a case
               | of miscommunication, no worries at all. I'd just be
               | remiss if I didn't try to suss out if I'm wrong.
        
               | nice_byte wrote:
               | how do linked lists fit the regularity requirement? the
               | only regularity in this pattern of access is the offset
               | at which the next address to fetch is stored. this seems
               | like a nontrivial amount of logic and state for the
               | prefetcher to keep track of, unlike something like base *
               | scale + offset when e.g. traversing a 2d array.
        
               | monocasa wrote:
               | It is more complex for hardware to keep track of for
               | sure. It's been a feature of high performance cores for
               | more than a decade though. Kernels tend to be built on
               | linked lists, so it's typically a massive win.
        
               | optymizer wrote:
               | Look, people who are excited about this and just want to
               | learn are asking for proof. You can't just make a claim
               | like "it's been a feature of high-performance cores for
               | more than a decade" and provide zero evidence for the
               | claim, because
               | it's not common knowledge and explicitly contradicts what
               | is commonly taught.
               | 
               | I can time how long it takes to iterate through an array
               | and show that it is faster than iterating through a
               | linked list with the same number of elements.
               | 
               | Now, what optimization would I have to implement to end
               | up with a program that iterates through this linked list
               | faster than the equivalent array?
        
               | fuckstick wrote:
               | A bunch of people keep saying this with no substantiation
               | - the best we saw was a link to the latest Intel
               | optimization manual that quite specifically refutes this
               | claim. I even quoted it upthread.
               | 
               | Do you have anything substantial to add to that?
        
               | josephg wrote:
               | I'm sure CPUs will try their best to optimize these usage
               | patterns. But I suspect vector / array based approaches
               | will still always outcompete linked lists in actual
               | benchmarks.
               | 
               | That's always been my experience of them.
               | 
               | Linked lists still show up in the kernel, in redis, in
               | game engines and so on because you can't build fast
               | intrusive collections using arrays.
               | 
               | Making linked lists "as fast as we can make them" doesn't
               | mean they're competitive with a vector.
        
             | Sirened wrote:
             | Apple's M1 has a data dependent prefetcher (superset form
             | of pointer chase prefetcher), as discovered by the
             | researchers behind the Augury paper [1]. They discuss how
             | it works, how and when it activates, as well as a couple
             | other cool details which should more than answer your
             | curiosity. They don't have benchmarks for performance, but
             | these sorts of prefetchers do exist (and possibly in
             | hardware you're using right now!).
             | 
             | [1] https://www.prefetchers.info/augury.pdf
        
               | tylerhou wrote:
               | The prefetcher described in the paper, as implemented in
               | the M1, is nowhere near one that would be able to
               | prefetch linked-list nodes.
               | 
               | > We tested for the existence of four DMPs: both single-
               | and two-level versions of pointer-chasing and
               | indirection-based DMPs (Section IV). Our findings show
               | the existence of a single-level pointer-chasing DMP....
               | 
               | When activated, the prefetcher described will also issue
               | loads for pointers in a _contiguous_ array of pointers.
               | In particular, the pointer addresses are already known
               | by the core (because they themselves were presumably
               | prefetched far ahead of time by the stream/stride
               | prefetcher), so the core knows which addresses to
               | prefetch. But in a linked-list, the address of the next
               | node is not known by the core until _after_ the node is
               | retrieved from memory. I.e. there is an additional data
               | dependency, and it takes a round trip to memory to
               | resolve that data dependency.
               | 
               | It's the difference between prefetching B's in
               | 
               |     A0  A1  A2  A3  A4
               |      v   v   v   v   v
               |     B0  B1  B2  B3  B4
               | 
               | and prefetching B, C, D, E in
               | 
               |     A -> B -> C -> D -> E
               | 
               | The former is much easier to do than the latter as 1) the
               | A's are easy to prefetch, and 2) the B's can be fetched
               | in parallel. In the second, the core has to wait until B
               | resolves before it can fetch C, and it has to wait until
               | C resolves before it can fetch D, etc.
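               | A small illustration of that difference (hypothetical
               | code, not from the paper): with a contiguous array of
               | pointers you can even issue explicit prefetches several
               | elements ahead, because every address is already in hand;
               | a list offers no such address until the previous node's
               | cache line arrives. `__builtin_prefetch` is a GCC/Clang
               | builtin.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Sum values reached through a contiguous array of pointers (the "B's"
// case). All the B addresses are known up front, so we can request B[i+8]
// while still consuming B[i]; the memory round trips overlap rather than
// forming a serial dependency chain as they would in a linked list.
int64_t sum_via_pointer_array(const std::vector<int*>& ptrs) {
    int64_t sum = 0;
    const std::size_t n = ptrs.size();
    for (std::size_t i = 0; i < n; ++i) {
        if (i + 8 < n) __builtin_prefetch(ptrs[i + 8]);  // address known now
        sum += *ptrs[i];
    }
    return sum;
}
```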
        
         | ShredKazoo wrote:
         | Has there been any work on nudging the allocator towards
         | putting all the nodes of a linked list on the same page? Or
         | close together in memory more generally (to leverage the L1 &
         | L2 caches)?
        
           | tsimionescu wrote:
           | If you're allocating the entire linked list at once, that may
           | well happen in practice. Otherwise, there is no good way it
           | can happen automatically; but, compacting GCs will usually do
           | this for you (assuming that the list elements are only
           | reachable from the previous element).
        
             | ShredKazoo wrote:
             | Interesting thanks!
        
         | theCodeStig wrote:
         | Assuming Rust or another systems language.
         | 
         | Linked lists are optimal for head access; not random access.
         | For random access, yes a vector would be better.
        
         | waynesonfire wrote:
        
           | andrewmcwatters wrote:
           | Would you like to explain to the class?
        
             | [deleted]
        
           | klysm wrote:
           | Care to demonstrate why? I have the same experience
        
             | waynesonfire wrote:
             | the linked list has a wonderful capability that's
             | frequently seen in C where you can do O(1) removal from a
             | list. E.g. you can have an "object" detach itself. This is
             | not possible in Java's standard linked list implementation,
             | among other languages.
             | 
             | in general, inserting and removing elements from a vector
             | requires memory copying.
             | 
             | There are programming languages that have linked lists as
             | their fundamental data structure, like Lisp and Erlang.
             | 
             | For situations where I don't need random access, linked
             | lists are hard to beat.
             | 
             | Linked lists also make wonderful immutable data structures.
             | 
             | linked lists have wonderful filter and flatmap performance.
             | how much memory copying would a vector require?
        
               | klysm wrote:
               | Yes, but in practice the coefficients on the O() almost
               | always end up working in the dense array's favor because
               | of memory locality.
        
               | suremarc wrote:
               | This is assuming you already have a pointer to the
               | element you're removing, or a neighbor thereof.
               | Otherwise, you'd have to traverse a portion of the linked
               | list, which has a significant amount of per-element
               | overhead.
        
               | sakras wrote:
               | The problem with your analysis is that you're using big-O
               | notation. Big-O notation is based on an abstract machine
               | that is pretty unrepresentative of real computers. In
               | practice, the memcpy is always going to be faster (until
               | you get to really big sizes). Put another way, linked
               | list is O(1) with a massive constant, and memcpy is O(n)
               | with a tiny constant.
               | 
               | And actually this logic extends to other algorithms too.
               | My rule of thumb is that a linear search (despite being
               | O(n)) will be faster up to around n=100 than binary
               | search (which is O(log n)) just due to the way the CPU
               | operates.
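               | A quick sketch of the two contenders (function names are
               | made up; the actual crossover point varies by hardware and
               | element size, so measure rather than trusting any fixed n):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Straight-line scan over a sorted array: O(n), but every access is
// sequential and the loop branch is highly predictable.
int linear_search(const std::vector<int>& v, int key) {
    for (std::size_t i = 0; i < v.size(); ++i)
        if (v[i] >= key) return v[i] == key ? static_cast<int>(i) : -1;
    return -1;
}

// Classic O(log n) halving via std::lower_bound: fewer comparisons, but
// each one is a hard-to-predict branch with a cache-unfriendly jump.
int binary_search_idx(const std::vector<int>& v, int key) {
    auto it = std::lower_bound(v.begin(), v.end(), key);
    return (it != v.end() && *it == key) ? static_cast<int>(it - v.begin())
                                         : -1;
}
```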
        
           | zabzonk wrote:
           | yes, he does: linked lists are, except in a few specific
           | cases, the wrong solution.
        
           | [deleted]
        
           | koheripbal wrote:
           | This is not a substantive comment.
        
         | dragontamer wrote:
         | > regular resizable vector
         | 
         | At gross complexity/fragmentation cost to your memory allocator.
         | 
         | Linked Lists are definitely easier to use if you're ever in a
         | position where you're writing your own malloc(). The
         | infrastructure needed to cleanly resize arrays (and also: the
         | pointers all going bad as you do so) has a lot of faults IMO.
         | 
         | EDIT: In particular, linked list nodes have a fixed size, so a
         | hand-written malloc() implementation for them is extremely
         | simple.
        
           | taneq wrote:
           | More complexity than allocating/deallocating a struct for
           | each link in the list?
           | 
           | And the solution to "pointers to elements of a vector going
           | bad" is don't save pointers to elements of a vector.
        
           | ay wrote:
           | Doubling the vector size each time it needs to be enlarged
           | takes care of fragmentation/memory overhead, in practice. Or
           | preallocating if you know the rough size.
           | 
           | It does take some discipline to avoid the storage of
           | pointers, but once you get used to that it's quite fine.
           | 
           | Source: I work on VPP, which uses vectors quite extensively.
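           | One common shape of that discipline, sketched with invented
           | names: hand out indices ("handles") instead of raw pointers,
           | since an index survives the reallocation that a growth step
           | performs:

```cpp
#include <cstddef>
#include <vector>

// Toy pool that names elements by index. push_back may reallocate and
// move every element (invalidating any raw pointers into the buffer),
// but a handle keeps referring to the same logical slot.
struct Pool {
    using Handle = std::size_t;
    std::vector<int> items;

    Handle add(int v) { items.push_back(v); return items.size() - 1; }
    int& get(Handle h) { return items[h]; }
};
```

           | A pointer into `items` taken before a growth step would
           | dangle; the handle does not.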
        
             | saagarjha wrote:
             | Usually you want a smaller growth factor than that to allow
             | for more reuse. (Also: what's VPP?)
        
               | ay wrote:
               | I use doubling because it's simple to reason about and
               | hard to screw up the math.
               | 
               | What kind of growth factors heuristics worked for you the
               | best ?
               | 
               | VPP: rather fast user mode dataplane. https://fd.io/ is
               | the "marketing" site. https://wiki.fd.io/view/VPP is the
               | "less flashy" dev wiki.
        
               | sfink wrote:
               | I usually use doubling, but I just recently landed a
               | patch to expand by a factor of 8 instead, which sped up a
               | microbenchmark (of constructing a JSON string, fwiw) by
               | 50%.
               | 
               | Someone saw in a profile that we were spending a lot of
               | time in realloc in a task that involved building a string
               | in contiguous memory. But the final string was realloc'd
               | down to its actual size, so it was safe to get a lot more
               | aggressive with the growth to avoid copying. The overhead
               | was only temporary. It turns out that powers of 8 get big
               | fast, so there are now many fewer copies and the few that
               | happen are copying a small portion of the overall data,
               | before a final octupling that provided mostly unused
               | space that didn't cost much.
               | 
               | Know your problem, I guess?
        
               | ay wrote:
               | Very interesting. And indeed, especially sitting adjacent
               | to the very interesting "golden ratio" comment, this
               | shows that there is more than one definition of "best"
               | dependent on the task, and that one always needs to
               | verify their hunch by actual profiling.
        
               | Kranar wrote:
               | The optimal growth factor is the golden ratio (~1.618);
               | in practice many vectors use a growth factor of 1.5.
               | 
               | The reason for not going above the golden ratio is that
               | doing so prevents any previously allocated memory from
               | ever being reused. If you are always doubling the size
               | of your
               | vector, then it is never possible to reclaim/reuse any
               | previously allocated memory (for that vector) which means
               | every time your vector grows you are causing more and
               | more memory fragmentation, as opposed to using a growth
               | factor of 1.5 which results in memory compaction.
        
               | llbeansandrice wrote:
               | Why does that happen when doubling size? I don't
               | understand
        
               | Kranar wrote:
               | Sure we can go over both cases:
               | 
               | Case 1: Growth factor of 2 and an initial size of 10
               | bytes.
               | 
               | Start with an initial allocation of 10 bytes of memory.
               | 
               | On growth allocate 20 bytes and release the 10 bytes,
               | leaving a hole of 10 bytes.
               | 
               | On growth allocate 40 bytes and release the 20 bytes, the
               | hole in memory is now 30 bytes large (the initial 10 byte
               | hole + the new 20 byte hole).
               | 
               | On growth allocate 80 bytes and release the 40 bytes, the
               | hole is now 60 bytes.
               | 
               | On growth allocate 160 bytes and release the 80 bytes,
               | the hole is now 140 bytes.
               | 
               | So on so forth... using this strategy it is never
               | possible for the dynamic array to reclaim the hole it
               | left behind in memory.
               | 
               | Case 2: Growth factor of 1.5 and an initial size of 10
               | bytes.
               | 
               | Start with an initial allocation of 10 bytes of memory.
               | 
               | On growth allocate 15 bytes and release the 10 bytes,
               | leaving a hole of 10 bytes.
               | 
               | On growth allocate 22 bytes and release the 15 bytes, the
               | hole in memory is now 25 bytes large (the initial 10 byte
               | hole + the new 15 byte hole).
               | 
               | On growth allocate 33 bytes and release the 22 bytes, the
               | hole is now 47 bytes.
               | 
               | On growth allocate 50 bytes and release the 33 bytes, the
               | hole is now 80 bytes.
               | 
               | On growth reuse 75 bytes from the hole in memory left
               | over from previous growths, the hole is now 5 bytes.
               | 
               | With a growth factor of 1.5 (or anything less than the
               | golden ratio), the hole grows up to a point and then
               | shrinks, grows and shrinks, allowing the dynamic array to
               | reuse memory from past allocations.
               | 
               | With a growth factor of 2, the hole in memory continues
               | to grow and grow and grow.
        
               | Dylan16807 wrote:
               | 99.9% of the time, each allocation leaves a separate
               | hole. Your goal is to enable other allocations to use
               | those holes without further splitting them, not to get
               | your ever-growing vector to reuse memory.
        
               | sfink wrote:
               | This appears to assume that there will be no intervening
               | allocations that are allowed to use the same region of
               | memory, since otherwise your "hole" will be very
               | discontiguous.
               | 
               | But if that is the case, then why are you releasing
               | memory? That's just costing you time moving the data.
               | With a doubling mechanism:
               | 
               | Start with an initial allocation of 10 bytes of memory.
               | 
               | On growth allocate another 10, and don't copy anything,
               | leaving no hole at all.
               | 
               | On growth allocate another 20, and don't copy anything,
               | leaving no hole at all.
               | 
               | etc.
               | 
               | In what scenario would the growth factor of 1.5 actually
               | help? If you're restricting the allocation API to only
               | malloc then you can't grow the size of your allocation
               | and what you said might make sense as something the
               | allocator could take advantage of internally. But realloc
               | exists, and will hopefully try to extend an existing
               | allocation if possible. (If your earlier allocation was
               | fairly small, then the allocator might have put it in an
               | arena for fixed-size allocations so it won't be possible
               | to merge two of them, but once things get big they
               | generally get their own VMA area. Obviously totally
               | dependent on the allocator implementation, and there are
               | many.)
        
             | shaklee3 wrote:
             | good to know vpp is still alive. always interesting work
             | coming out of it, and it never seemed to take off.
        
               | ay wrote:
               | There are quite a few sizable businesses (hundreds of
               | MUSD) built on it / depending on it, that I know of :)
        
               | shaklee3 wrote:
               | out of curiosity, is it mostly Cisco still funding it?
               | what are the advantages over dpdk? mostly TCP?
        
               | ay wrote:
               | The funding - I believe organizationally it has now grown
               | into https://lfnetworking.org/about/members/ and fd.io is
               | just one of the projects.
               | 
               | Contributions-wise to VPP: varies from time to time. On
               | my very recent memory there were sizable IPSec
               | acceleration patches from folks at Intel. There is a fair
               | few one off bugfixes coming from smaller users who
               | scratch their own itch. Pim van Pelt [0] has
               | contributed/improved a few very cool and sizable features
               | (Linux control plane integration, VPP yaml configurator
               | [1])
               | 
               | As for the difference with DPDK - does it do L3 routing?
               | NAT? MAP? MPLS? DHCP? VPP can do all of this, and that's
               | of course just off the top of my memory...
               | 
               | https://docs.fd.io/vpp/22.10/aboutvpp/featurelist.html is
               | the autogenerated feature list based on the "claimed
               | official features" that are tracked via YAML files in the
               | source code.
               | 
               | [0] https://ipng.ch/s/articles/2021/08/12/vpp-1.html
               | [1] https://lists.fd.io/g/vpp-dev/topic/feedback_on_a_tool_vppcf...
        
           | asddubs wrote:
           | Or if you can't guarantee allocations will succeed and need
           | to handle that case. In that case you can just have the
           | object that's part of the list contain its list-link data
           | within itself (aka an intrusive list).
        
         | pfdietz wrote:
         | Compacting garbage collectors put linked lists into nearby
         | memory.
        
           | josephg wrote:
           | Even if memory was packed by a GC, linked lists still have to
           | pay the cost of allocating and garbage collecting every cell
           | of memory individually. I doubt a GC would solve linked
           | lists' woes.
           | 
           | Do you have some benchmarks demonstrating your claim?
        
             | throwawaymaths wrote:
             | Erlang GC is pretty good, plus there are some nice highly
             | local allocators
        
               | josephg wrote:
               | Be that as it may, it'll still take a benchmark to
               | convince me.
        
             | Jach wrote:
             | There are also prefetch instructions. I listened to
             | https://www.youtube.com/watch?v=SetLtBH43_U recently
             | (transcript: https://signalsandthreads.com/memory-management/),
             | part of it talked about some work in OCaml's
             | GC.
             | 
             | > ...each individual memory request that's not in cache
             | still has to wait the full 300 cycles. But if we can get 10
             | of them going at the same time, then we can be hitting a
             | new cache line every 30 cycles. I mean, that's not as good
             | as getting one every 16 cycles, but it's close. You're
             | actually able to get to a reasonable fraction of the raw
             | memory bandwidth of the machine while just traversing
             | randomly over this huge one gigabyte heap
             | 
             | https://github.com/ocaml/ocaml/pull/10195 shows the change
             | adding prefetching to the marking phase
             | (https://github.com/ocaml/ocaml/pull/9934 was done earlier
             | for sweeping). There are some benchmarks in the
             | thread/linked from the thread.
        
               | [deleted]
        
             | tsimionescu wrote:
             | I don't have a benchmark, but it's important to remember
             | that GCs only scan live objects, and they always have to
             | scan all live objects; meanwhile, allocation is normally an
             | extremely cheap operation (bump the allocation pointer).
             | It's true though that scanning a live linked list would be
             | expected to be slower than scanning a live array (though
             | some optimized scanning strategies may be possible).
             | 
             | Overall I couldn't tell what the impact on overall
             | performance would be, but I can tell you for sure that
             | deallocating an N-element linked list or array is an O(1)
             | operation for a copying garbage collector (which are the
             | most common kinds of compacting GCs), since they simply
             | will never reach nor copy any of the elements.
        
           | josefx wrote:
           | Will they? I don't think they reorder live objects relative
           | to each other, so any object allocated between two nodes
           | would still end up between those nodes after the GC ran.
        
             | pfdietz wrote:
             | It is very possible for a compacting garbage collector to
             | reorder live objects.
        
             | pencilguin wrote:
             | It might or might not. Commonly, not.
        
             | sfink wrote:
             | Disagreeing with other answers here, but usually they will,
             | and not because the authors tried to make it happen. It's
             | just sort of the default behavior.
             | 
             | A copying collector has to trace through the whole graph to
             | find everything reachable. The tracing order is usually
             | going to be depth first. (It's the easiest to code if
             | you're using recursion and therefore maintaining state with
             | the runtime's stack, but even with an explicit stack you'll
             | want to push and pop the end for locality.) Which means
             | that objects arranged into a linked list will be visited in
             | list order, and you generally copy as you visit an object
             | for the first time. So... reordering live objects relative
             | to each other "just happens". Even if it's a doubly-linked
             | list, since the back pointers have already been visited and
             | so don't matter.
             | 
             | This is less true if your objects have a lot of other
             | pointers in them. But that's just saying that if your
             | object graph isn't a list, then it won't end up laid out in
             | list order. It'll still end up in DFS order according to
             | the true graph.
        
         | [deleted]
        
         | kabdib wrote:
         | The first week into my prior job, I had to fix an in-production
         | bug involving quadratic vector growth that was tipping over
         | servers. The vector implementation guaranteed too much, namely
         | consecutive memory layout (which wasn't needed, and if we had
         | needed it, it would have been a mistake).
         | 
         | Decomposing that vector into recursive subvectors solved the
         | problem. Going from a few tens of millions of elements in a
         | single contiguous vector (with consequent -- and bad -- heap
         | fragmentation!) to nested vectors with a few thousand elements
         | each brought us back online again.
         | 
         | Which is to say: Vectors are nice. Lists are nice. Use
         | appropriate data structures for your scale, watch your semantic
         | dependencies, and don't get hung up on dogma.
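
The nested-vectors fix described above can be sketched in a few lines. This is a toy Python version with made-up names (`ChunkedVector`, `CHUNK`), not the actual implementation from the story; a real one would tune the chunk size to the allocator and workload:

```python
CHUNK = 4096  # elements per chunk; a real implementation tunes this

class ChunkedVector:
    """A two-level 'nested vector': elements live in fixed-size
    chunks, so growth never reallocates or copies existing elements."""

    def __init__(self):
        self.chunks = []  # top-level table of fixed-size chunks

    def append(self, x):
        # Only the small top-level table ever grows; existing chunks
        # are never moved, so the heap stays unfragmented.
        if not self.chunks or len(self.chunks[-1]) == CHUNK:
            self.chunks.append([])
        self.chunks[-1].append(x)

    def __getitem__(self, i):
        # O(1) random access: one division picks the chunk, the
        # remainder picks the offset within it.
        return self.chunks[i // CHUNK][i % CHUNK]

    def __len__(self):
        if not self.chunks:
            return 0
        return (len(self.chunks) - 1) * CHUNK + len(self.chunks[-1])
```

The trade-off versus a flat vector is one extra indirection per access in exchange for never copying tens of millions of elements on growth.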
        
           | AtlasBarfed wrote:
           | This sounds like you need a log structured merge tree.
        
           | mst wrote:
           | The VList paper - which I archived at
           | http://trout.me.uk/lisp/vlist.pdf to avoid having to google
           | the bloody thing every time - is an interesting hybrid.
        
             | simplotek wrote:
             | Thanks for referring to the article. Great read!
        
               | mst wrote:
               | http://trout.me.uk/lisp/ and http://trout.me.uk/gc/ are
               | both "selected stuff I knew I'd have trouble finding
               | again later" and I wish you great joy of them.
               | 
               | Still sad that John Shutt (creator of 'kernel lisp')
               | passed away while I was still procrastinating dropping
               | him an email - I think the only other 'thank you' I've
               | most regretted not sending in time was to Terry
               | Pratchett.
        
             | GeorgeWBasic wrote:
             | Thank you for that. That's a concept I remembered reading
             | about, but I couldn't remember the name and had no way to
             | find it again.
        
               | mst wrote:
               | trout.me.uk/lisp/ and trout.me.uk/gc/ are both basically
               | archives of "stuff I knew I'd suffer that problem with"
               | and it's a pleasure every time the contents come in handy
               | for people who aren't me.
               | 
               | Also back when I was still pretending to be a
               | mathematician I used both GWBASIC and UBASIC for assorted
               | purposes so thank -you- for the nostalgia kick.
        
           | bluGill wrote:
            | If your elements number in the millions, then a tree
            | might be better, depending on what you do. If you
            | traverse the whole thing often then a vector is best,
            | but at a million elements a tree will find an element
            | faster despite the cache misses.
            | 
            | I don't know where the line is, but in general you are
            | probably not close to it, so default to a vector.
            | However, as your point shows, sometimes you are over
            | that line.
        
             | sally_glance wrote:
             | And if you really need it you can even do both tree and
             | vector simultaneously - just make sure both consist of
             | pointers only so you avoid update/sync issues.
        
               | bluGill wrote:
               | You can, but probably shouldn't. The pointers means a
               | cache miss when you follow it. Putting everything in the
               | vector is the point, otherwise the map is potentially
               | faster
        
               | jcelerier wrote:
               | If you have to do a binary search in a million elements
               | you'll have cache misses _anyways_
        
               | bluGill wrote:
                | Sure, log(n) cache misses. With a linear search
                | through pointers you have n cache misses.
        
           | diabeetusman wrote:
           | Correct me if I'm wrong, but it sounds like you're describing
           | an Unrolled Linked List [1]? If so, neat!
           | 
           | [1]: https://en.wikipedia.org/wiki/Unrolled_linked_list
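
For reference, an unrolled linked list can be sketched quite compactly. This is an illustrative Python toy (`UnrolledList`, `NODE_CAP` are made-up names, and the node capacity is untuned), showing the core idea of packing several elements per node:

```python
NODE_CAP = 16  # elements per node; tuned to cache-line size in practice

class _Node:
    __slots__ = ("items", "next")
    def __init__(self):
        self.items = []   # up to NODE_CAP elements stored contiguously
        self.next = None

class UnrolledList:
    """Linked list where each node holds a small array, so traversal
    follows far fewer pointers per element than one-element nodes."""

    def __init__(self):
        self.head = self.tail = _Node()

    def append(self, x):
        # Start a new node only when the tail node is full.
        if len(self.tail.items) == NODE_CAP:
            node = _Node()
            self.tail.next = node
            self.tail = node
        self.tail.items.append(x)

    def __iter__(self):
        node = self.head
        while node is not None:
            yield from node.items
            node = node.next
```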
        
           | ndriscoll wrote:
           | Ironically, if you weren't using huge pages, the operating
           | system was already doing this for you anyway, and your
           | contiguous segments weren't contiguous.
           | 
           | This does suggest functionality that the OS should provide:
           | let the program remap memory somewhere else in its address
           | space to obtain a larger contiguous chunk of addresses
           | without copying data. This wouldn't help much if you have
           | direct pointers, but I think e.g. Java has a layer of
           | indirection that would let it benefit from remapping to grow
           | ArrayLists. I don't know if things already work this way. It
           | should make speculative fetches easier for the processor
           | since now it understands what you're doing via the memory
           | map.
           | 
           | Edit: taking this idea a little further is actually
           | fascinating: use the type system to separate objects into
           | pools based on type, and instead of pointers, use indexes
           | into your pool. You can move the pool for free to grow/shrink
           | it. I think you could even do this with pointers and full
           | hardware acceleration today if you used the high bits to tag
           | pointers with their types and used remapping at the
           | hypervisor level to move the pools (so the address in the
           | application level doesn't change). You could do some crazy
           | stuff if the application could coordinate multiple levels of
           | virtual memory.
        
             | pencilguin wrote:
             | OSes do already provide this ability: mmap.
        
               | ndriscoll wrote:
                | Maybe I'm just not seeing it, but I don't see how
                | mmap lets you remap an existing address to a new
                | one? Looks like there is mremap on Linux in any
                | case.
        
               | comex wrote:
               | If you want to stick to mmap, you can do it by creating a
               | file which doesn't have any actual disk backing (one way
               | to do this is with POSIX shm_open), and handling all
               | memory allocations by mmapping parts of that file rather
               | than mapping true anonymous memory. Then, 'remapping' an
               | address to a new one is a matter of calling mmap again
               | with the same file offset. Unlike mremap, this allows
               | mapping the same memory at different addresses
               | simultaneously, which can be useful for some purposes.
        
               | pencilguin wrote:
               | /dev/hugepages/ is a good place to put this if you have
               | enough control over the host.
        
               | pencilguin wrote:
               | Or, you can map new space onto the end of an existing
               | mapping, at a slightly greater cost of keeping track.
               | Mremap is usually better.
        
               | kabdib wrote:
               | That sounds like a lot of work unless you absolutely have
               | to have a contiguous allocation of every element.
               | 
               | If you don't need that, you can get "O(1.1)" access and
               | nice heap behavior with an indirection or two, and the
               | implementation is likely to be lots more portable and
               | standards-friendly than if you also dragged in a bunch of
               | OS mmap() semantics (which tend to have exciting,
               | platform-dependent misfeatures).
               | 
               | Also, I would have to reach for the mmap() man page. I
               | have malloc() right in front of me. :-)
        
             | colanderman wrote:
             | glibc realloc(3) does this on Linux, by way of mremap(2)
             | with MREMAP_MAYMOVE [1]. (Just confirmed on my local
             | machine using strace(1).)
             | 
             | Unfortunately, libstdc++ does not (to my knowledge) use
             | realloc(3) or mremap(2) for vector resize, nor could it in
             | many situations due to the semantics of C++.
             | 
             | [1] https://github.com/bminor/glibc/blob/15a94e6668a6d7c569
             | 7e805...
        
             | bee_rider wrote:
             | I somehow misread your comment the first time through and
             | thought you were suggesting the ability to ask the OS:
             | "Hey, I'm a pointer, is there anything in front of my block
             | of memory, if so can you move it elsewhere so that I can
             | expand?" This would of course be a pretty silly idea but it
             | would be funny to give the OS folks a whole new class of
             | use-after bugs to worry about.
             | 
             | Edit: Let's call it "memoveouttamyway"
        
               | FeepingCreature wrote:
               | Isn't that `realloc`? At the allocator level rather than
               | OS level, but still.
        
             | kabdib wrote:
             | Right, the ability to relocate a chunk of memory to a new
             | virtual address (with more capacity at the end) without
             | doing a physical copy would be an interesting way to solve
             | the problem.
             | 
             | I don't know if anyone has done research on integrating
             | data structures and OS functionality in this way.
             | Interesting idea.
             | 
             | We needed to get things running again rather quickly,
             | though :-)
        
               | cyber_kinetist wrote:
               | Some game engine developers have certainly used this
               | idea, to prevent any reallocations when the dynamic array
               | needs to be resized. Such as:
               | 
               | https://ruby0x1.github.io/machinery_blog_archive/post/vir
               | tua...
               | 
               | Some ECS implementations use this to reduce the overhead
               | of dynamic arrays, as well as ensuring pointer stability
               | of the items stored. For example entt:
               | 
               | https://skypjack.github.io/2021-06-12-ecs-baf-part-11/
               | 
               | And here's a library implementing a resizable, pointer-
               | stable vector using virtual memory functionality from the
               | OS:
               | 
               | https://github.com/mknejp/vmcontainer
        
             | amluto wrote:
             | You can certainly allocate memory at controlled addresses
             | using mmap or VirtualAlloc.
             | 
             | But you should _not_ play virtual memory games at runtime
             | in an application that you expect to be scalable. Unmapping
             | memory or changing its permissions requires propagating the
             | change to all threads in the process. On x86, the
             | performance impact is dramatic. On ARM64, it's not as bad
             | but it's still there.
             | 
             | Also, even 64 bits can go fast when you start using high
             | bits, separate pools for different purposes, etc.
        
             | comex wrote:
             | Linux has mremap() for this. macOS has vm_remap(). Not sure
             | about other operating systems.
        
           | pencilguin wrote:
           | Another name for that is "B* tree". Or anyway a thing with
           | similar performance characteristics.
        
             | cobalt wrote:
             | also what a deque (C++) does
        
               | pencilguin wrote:
               | True... except in MSVC. std::deque in MSVC is cursed.
               | 
               | Targeting MSVC, use boost::deque or something, instead.
               | Boost containers are generally fairly good, although they
               | are more or less unmaintained nowadays. (E.g.,
               | boost::unordered_map::reserve doesn't work.) Or, like
               | many of us, ignore performance on MS targets because
                | anybody who cares about performance is running
                | something else.
        
               | q-big wrote:
               | Why?
        
               | pencilguin wrote:
               | https://github.com/AcademySoftwareFoundation/openvdb/issu
               | es/...
               | 
               | Longer answer is that the size of the "blocks" is limited
               | to 512 bytes or one element, whichever is larger. So
               | unless your elements are really tiny, it is strictly a
               | pessimization.
        
               | cyber_kinetist wrote:
                | Many of the STL containers in MSVC are _incredibly_
                | cursed. Even std::vector is implemented in such a
                | weird way that you can't debug it easily with any
                | debugger other than Visual Studio. And it's
                | _incredibly_ slow in Debug mode, to the point that
                | it's unusable... (Because of these reasons I just
                | switched to using my own Vector<T> class recently.)
                | 
                | I think there's a very good reason why old gamedevs
                | who were stuck developing on Windows largely
                | ignored the STL and made their own containers /
                | utility functions: MSVC's STL was just _awful_.
        
               | simplotek wrote:
               | > Many of the STL containers in MSVC are incredibly
               | cursed. Even std::vector is implemented in such a weird
               | way, that you can't debug it easily with any debugger
               | other than Visual Studio. And it's incredibly slow on
               | Debug mode, up to the point that it's unusable...
               | (Because of these reasons I just switched to using my own
               | Vector<T> class recently)
               | 
               | Here's a link to an old HN discussion that offers some
               | insight on the performance of MSVC's STL in debug.
               | 
               | https://news.ycombinator.com/item?id=18939260
               | 
                | The "cursed" keyword hand-waves over the problem
                | and suggests it originates in the uncritical
                | propagation of ignorant beliefs. Sometimes
                | clarifying these is just a Google search away.
        
               | cyber_kinetist wrote:
               | Yes, I know that already. And good luck with dealing all
               | the linker/ABI issues from changing that setting.
        
               | brantonb wrote:
               | I always wondered why we weren't allowed to use the STL
               | in the Windows division in the 00's. This probably
               | explains it.
        
               | pencilguin wrote:
               | There was a lot of BS, as well. It is impossible to tease
               | out which was what.
        
               | ahtihn wrote:
               | > ignore performance on MS targets because anybody who
               | cares about performances is running something else.
               | 
               | That's a strange take. Surely games still represent a
               | significant chunk of the demand for C++ developers that
               | need to care about performance?
        
               | pencilguin wrote:
               | Thus, "many". Game coders would not be among those.
        
             | kabdib wrote:
             | Right, though we didn't need full tree semantics (just fast
             | random access and an append that didn't spool up the fans
             | on the servers).
             | 
              | I would hate to have to do a full balanced b-tree on
              | short notice. (We also had rather terrible extra
              | requirements that kept an off-the-shelf thing from,
              | say, the STL from working in our environment.)
        
               | samsquire wrote:
                | For a simple and readable B-tree implementation in
                | Python, see this:
               | 
               | https://github.com/samsquire/btree
               | 
               | I tried to keep the implementation as simple as I could.
        
               | dilap wrote:
               | simple vector implementations will do a full realloc and
               | recopy when you run out of space. so O(1) amortized but
               | O(N) every so often.
               | 
               | another implementation strategy, at the cost of using
               | some extra memory, is to allocate the new space, but only
               | copy over a bit at a time, each time a new element is
               | added -- so every add is still O(1). no long pauses.
               | 
               | (well, assuming the memory alloc is still fast. which i
               | think it should be in practice? tho i'm not super
               | familiar w/ that stuff.)
               | 
               | just curious, do you think that solution would have
               | worked for your situation too?
        
               | kabdib wrote:
               | That's exactly the strategy that was tipping over. Once
               | you start resizing vectors of tens of megabytes, it's
               | kind of over.
               | 
               | You can start playing games by accelerating capacity
               | allocation when a vector rapidly starts growing extremely
               | large. This doesn't solve the problem that vectors
               | sometimes guarantee too much for what you need.
        
               | dilap wrote:
               | Ah thanks, I didn't read closely enough the first time.
        
           | taf2 wrote:
            | My foreman always would tell me KISS (keep it simple,
            | stupid). Maybe there is another acronym for "just solve
            | the problem at hand silly" - j,s,p,a,h,s - doesn't
            | really roll off the tongue the same way...
        
       | Razengan wrote:
        | Seems to me that all these arguments are rooted in the way
        | computer memory is fundamentally implemented: could it not
        | be possible to re-architect it so that it's more agreeable
        | to arrays and lists, since that's how most data is used?
       | 
       | I often wonder about the separation of CPU and RAM, of ways it
       | could be done better. How about making it like neurons, where
       | each "memory cell" is also a teeny tiny computer itself? (How DO
       | neurons do internal computation anyway?)
        
       | fortran77 wrote:
       | I think the Rust people are de-emphasizing the importance of
       | linked lists due to the glaring hole in the Rust language that
       | makes it difficult to implement linked lists, and horribly
       | awkward to do doubly-linked lists in safe Rust
       | 
       | (See https://rcoh.me/posts/rust-linked-list-basically-impossible/
       | )
       | 
       | To save face, the Rust astroturfers are now skipping around
       | saying "Linked lists aren't important"
        
         | c-cube wrote:
         | The author is a known C programmer (he wrote redis). You're
         | imagining things.
        
       | klntsky wrote:
       | Not to mention path sharing (also called structural sharing) for
       | free.
        
       | chrisseaton wrote:
       | I've used linked lists where I wanted allocation of an almost-
       | write-only log data structure to be very fast - so not resizing a
       | buffer, and nobody cares about cache locality except for
       | allocation. In a system with a TLA buffer and bump-allocation,
       | this is absolutely ideal because allocation is local and very
       | fast, and does not cause any of the existing log to be brought
       | into cache.
       | 
       | "Nobody uses this data structure stuff in real programming at
       | work! You can forget it after college!"
        
         | [deleted]
        
         | eatonphil wrote:
         | Here's another example for the crowd: an "unbounded" fifo queue
         | in a fixed memory environment.
         | 
         | https://github.com/tigerbeetledb/tigerbeetle/blob/main/src/f...
        
       | gslepak wrote:
       | Anyone else getting a serious cert expired warning when visiting
       | this site?
       | 
       | The one I'm getting is also for the wrong name - registered for
       | redis.io and expired on `Fri, 07 Aug 2020 11:30:03 GMT`. Issued
       | by Let's Encrypt. Fingerprint: `07:BF:EA:59:DB:83:33:77:50:27:A8:
       | C5:2F:80:F6:CA:E6:EC:E2:7D:01:DB:72:7A:AF:EE:69:DD:EC:2D:DA:F3`
       | 
       | Seems like a MITM attack is going on.
        
         | gslepak wrote:
         | Oh, nevermind, link was submitted in HTTP, but I have mandatory
         | HTTPS enabled and the server is misconfigured.
        
       | scaramanga wrote:
       | My favorite typo/brainfart so far this week has to be "cache
       | obviousness" :)
        
       | m3kw9 wrote:
        | If you work on iOS, it's really hard to do a better job
        | than what the SDK has implemented in its array functions.
        
       | bitwize wrote:
       | Linked lists are one of those things, like textual protocols,
       | that people reach for because they're easy and fun but, from an
       | engineering standpoint, you should never, ever use unless you can
       | provide a STRONG justification for doing so. The locality of
       | reference characteristics of vectors mean that, except for
       | extreme cases, they beat linked lists almost every time on modern
       | cached CPUs.
        
       | ignoramous wrote:
       | > _Linked lists are simple. It is one of those rare data
       | structures, together with binary trees and hash tables and a few
       | more, that you can implement just from memory without likely
       | stepping into big errors._
       | 
       | Well, if one likes linked lists enough, they can replace binary
       | search trees / hash tables with a skip list. Super powerful and
       | easier to implement than splay trees, red black trees, avl trees,
       | augmented trees, what-have-you.
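
A skip list really is short enough to write from memory. Here is a minimal Python sketch (insert and membership only; `MAX_LEVEL`, `P`, and the class names are illustrative, not any library's API):

```python
import random

MAX_LEVEL = 16  # enough express lanes for millions of elements
P = 0.5         # probability of promoting a node one level up

class _Node:
    __slots__ = ("key", "forward")
    def __init__(self, key, level):
        self.key = key
        self.forward = [None] * level  # successor at each level

class SkipList:
    """Sorted linked list with probabilistic 'express lanes', giving
    expected O(log n) search/insert with no rebalancing logic."""

    def __init__(self):
        self.head = _Node(None, MAX_LEVEL)  # sentinel head
        self.level = 1

    def _random_level(self):
        lvl = 1
        while random.random() < P and lvl < MAX_LEVEL:
            lvl += 1
        return lvl

    def insert(self, key):
        # Walk down from the highest lane, recording the rightmost
        # node before the insertion point at each level.
        update = [self.head] * MAX_LEVEL
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        lvl = self._random_level()
        self.level = max(self.level, lvl)
        new = _Node(key, lvl)
        for i in range(lvl):  # splice the new node into each lane
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def __contains__(self, key):
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] and node.forward[i].key < key:
                node = node.forward[i]
        node = node.forward[0]
        return node is not None and node.key == key
```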
        
       | ComputerGuru wrote:
       | One of the only times linked lists are "undoubtedly the right
       | option" is if you're writing some sort of intrusive data
       | structure to avoid an allocation by reusing caller stack from
       | another location. Not sure how many run into that often.
        
         | docandrew wrote:
         | When I need to avoid an allocation it's because I'm writing an
         | allocator :) Overlaying the list nodes on top of a block of
         | free memory avoids the need for a secondary data structure to
         | keep track of free areas.
         | 
         | The other time allocations are undesirable, and therefore
         | linked lists are widely-used, is bare metal work when other
         | allocators just aren't available. You can do tricks like
         | representing pools of free and in-use objects with two lists
         | where the nodes occupy the same memory block. Allocations are a
         | fast O(1) unlink from the free list and O(1) re-link to the
         | used list, and frees are the opposite.
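
The free-list trick described above can be sketched with array indices standing in for pointers. A hypothetical Python version (`Pool` and its fields are made-up names; real bare-metal versions thread the list through the free blocks themselves):

```python
class Pool:
    """Fixed-capacity pool: the free list is threaded through the
    unused slots, so alloc and free are both O(1) link operations
    with no general-purpose allocator behind them."""

    NIL = -1  # index-based equivalent of a NULL pointer

    def __init__(self, capacity):
        self.data = [None] * capacity
        # Initially every slot is free: slot i links to slot i+1.
        self.next_free = list(range(1, capacity)) + [self.NIL]
        self.free_head = 0

    def alloc(self, value):
        if self.free_head == self.NIL:
            raise MemoryError("pool exhausted")
        i = self.free_head
        self.free_head = self.next_free[i]  # O(1) unlink from free list
        self.data[i] = value
        return i

    def free(self, i):
        self.data[i] = None
        self.next_free[i] = self.free_head  # O(1) relink onto free list
        self.free_head = i
```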
        
       | photochemsyn wrote:
       | For educational purposes (i.e. hooking easily into graphics
       | packages for visualization) you can make a linked list in Python,
       | using elements like:
       | 
        | class Node:
        |     def __init__(self, dataval=None):
        |         self.dataval = dataval
        |         self.nodepointer = None
       | 
       | The only reason to do this is for education/visualization of data
       | structures, although I'm wondering if this can be engineered to
       | cause a Python memory leak, via circularization of the linked
       | list, or if Python's GC would catch it. Also educational perhaps.
       | 
       | Where linked lists really seem to come into play is with custom
       | manual memory allocators in embedded programming, something of a
       | niche subject.
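
On the memory-leak question: CPython's reference counting alone cannot reclaim a circularized list, but the supplemental cycle collector (the `gc` module) does find and free it, which can be demonstrated directly:

```python
import gc

class Node:
    def __init__(self, dataval=None):
        self.dataval = dataval
        self.nodepointer = None

def make_cycle():
    # Circularize a two-node list; after this function returns,
    # each node's refcount stays >= 1, so refcounting alone leaks.
    a, b = Node("a"), Node("b")
    a.nodepointer = b
    b.nodepointer = a

gc.disable()      # rule out automatic collection during the demo
gc.collect()      # start from a clean slate
make_cycle()      # the cycle is now unreachable garbage
collected = gc.collect()  # returns the number of unreachable objects
assert collected >= 2     # the cycle collector reclaimed the nodes
gc.enable()
```

(Disabling `gc` entirely, or building a cycle of objects with `__del__` interactions in very old Pythons, is how such a structure could actually leak.)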
        
       | [deleted]
        
       | lanstin wrote:
       | I just used linked lists to make an LRU hard limit on the number
       | of items in an in memory store. Each usage, the pointers are
       | updated to move the item to the head. Each time the list is at
       | the limit, the tail is removed and freed. I may take advantage of
       | the machinery to make it go onto free list. This was Go and the
       | DLL thing is my first use of generics.
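
The structure described above (dict for O(1) lookup, doubly linked list for recency order) translates directly. A compact Python sketch under assumed names (`LRU`, `_Node`), not the commenter's Go code:

```python
class _Node:
    __slots__ = ("key", "value", "prev", "next")
    def __init__(self, key=None, value=None):
        self.key, self.value = key, value
        self.prev = self.next = None

class LRU:
    """Hard-limited store: dict gives O(1) lookup, the circular
    doubly linked list (with a sentinel) orders items by recency."""

    def __init__(self, limit):
        self.limit = limit
        self.map = {}
        self.head = _Node()  # sentinel; head.next is most recent
        self.head.prev = self.head.next = self.head

    def _unlink(self, node):
        node.prev.next, node.next.prev = node.next, node.prev

    def _push_front(self, node):
        node.next = self.head.next
        node.prev = self.head
        self.head.next.prev = node
        self.head.next = node

    def get(self, key):
        node = self.map.get(key)
        if node is None:
            return None  # note: also returned for a stored None value
        self._unlink(node)       # each use moves the item to the head
        self._push_front(node)
        return node.value

    def put(self, key, value):
        if key in self.map:
            node = self.map[key]
            node.value = value
            self._unlink(node)
            self._push_front(node)
            return
        if len(self.map) >= self.limit:
            tail = self.head.prev  # least recently used
            self._unlink(tail)     # evict and free in O(1)
            del self.map[tail.key]
        node = _Node(key, value)
        self.map[key] = node
        self._push_front(node)
```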
        
       | layer8 wrote:
       | I think I first learned about linked lists in the context of
       | programming for AmigaOS, which has a standardized form of
       | (intrusive) linked lists that is used throughout the system. I
       | remember being fascinated by the header node which serves as both
       | the head and the tail of a list due to a "clever" field layout:
       | https://wiki.amigaos.net/wiki/Exec_Lists_and_Queues
        
         | Narishma wrote:
         | Linked lists made more sense back then when memory access
         | wasn't as slow compared to CPU speed.
        
       | OOPMan wrote:
       | Oh lord, can anyone make a post without whining on about Twitter?
        
       | thedracle wrote:
        | There was a time when memory was expensive, allocation of
        | large blocks was slow, and allocation of small blocks was
        | much faster (malloc would find a small block faster than a
        | large one on a highly fragmented heap).
       | 
        | Before having gigantic caches that would engulf nearly any
        | sized contiguous list, linked lists were sort of in vogue,
        | frugal, and thought of as being fairly performant.
        
         | AstralStorm wrote:
         | That time is still there in embedded hardware. Even potent ARM
         | chips (except the cellphone ones maybe) have some KB of cache
         | at best, and similar amounts of SRAM. While code size is still
         | at a premium since XIP (execute-in-place from in-built flash)
         | is slower.
        
       | stephc_int13 wrote:
       | The good thing about linked list is how easy they are to
       | understand and use, almost trivial.
       | 
        | The bad thing is that their cost is not obvious and not
        | consistent (difficult to predict).
        | 
        | I tend to prefer simple arrays.
       | 
        | This talk by Mike Acton is a good introduction to why
        | reaching for the linked list as your go-to data structure
        | can lead to performance issues.
       | 
       | https://www.youtube.com/watch?v=rX0ItVEVjHc
        
       | belter wrote:
       | "Does anyone use LinkedList? I wrote it, and I never use it." -
       | https://news.ycombinator.com/item?id=33418705
        
         | int_19h wrote:
         | "Don't get me wrong; I love linked lists. They're as useful in
         | C as they are in Scheme. But LinkedList is another matter
         | entirely."
        
       | [deleted]
        
       | dreamcompiler wrote:
       | Lisp of course was originally invented for manipulating linked
       | lists and it came about in an era where cache locality wasn't an
       | issue because computers in 1959 were speed limited by their CPUs
       | rather than their memories.
       | 
       | Cdr coding [0] solves several problems with singly-linked (cons-
       | style) lists including cache locality and the 2x storage overhead
       | of these lists. But it's not typically used except in dedicated
       | hardware e.g. Lisp machines. It could in principle be revived
       | fairly easily on modern 64-bit Lisp systems on stock hardware --
       | especially if such systems provided immutable cons cells.
       | 
       | But it's probably not worthwhile because modern Lisp programmers
       | (in Common Lisp at least) don't use lists all that much. CL has
       | very nice adjustable arrays and hash tables that are easier and
       | faster than lists for most purposes.
       | 
       | [0] https://en.m.wikipedia.org/wiki/CDR_coding
        
       | cyber_kinetist wrote:
        | I think the whole "dynamic arrays" vs "linked lists" debate
        | is just a waste of time, since the two actually complement
        | each other.
       | 
        | - You can certainly implement linked lists using dynamic
        | arrays, so you make a minimal number of `malloc()` calls
        | and also store your items contiguously! You need to use
        | indices instead of pointers, since you will lose pointer
        | stability unless you use virtual memory (though the upside
        | is that a uint32_t is typically enough for most cases,
        | which takes up only half as much memory as a pointer).
       | 
        | - A lot of data structures use linked lists under the hood,
        | even ones that seem to use dynamic arrays on the surface.
        | Ever written a simple arena allocator? For the allocator to
        | find free space in the arena in O(1), you need to maintain
        | a free-list data structure under the hood. And guess what:
        | you need to write a linked list. Even your operating
        | system's `malloc()` itself probably uses linked lists under
        | the hood.
       | 
       | - Linked lists come up inside so many other data structures that
       | you could call them the 'backbone'. For example: in computer
       | graphics there is something called a half-edge data structure
       | [0], which stores the geometry data of a mesh in a form that's
       | compact, easy to iterate, and easy to manipulate (for instance,
       | you can insert/delete an edge/face in O(1) time, and more
       | complex mesh operations can be implemented as well). And guess
       | what? The vertices and half-edges are essentially stored in a
       | complex web of linked lists.
       | 
       | [0] https://jerryyin.info/geometry-processing-algorithms/half-
       | ed...
        
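The index-based approach from the first bullet can be sketched in Rust (a minimal illustration, not the commenter's code; the `IndexList` name and the `u32::MAX` sentinel are my own choices). Nodes live contiguously in a `Vec`, links are `u32` indices, and freed slots are recycled through an intrusive free list:

```rust
const NIL: u32 = u32::MAX; // sentinel index meaning "no node"

struct Node<T> {
    value: Option<T>, // None while the slot sits on the free list
    next: u32,        // index of the next node, or NIL
}

struct IndexList<T> {
    nodes: Vec<Node<T>>,
    head: u32,
    free: u32, // head of the free list of recycled slots
}

impl<T> IndexList<T> {
    fn new() -> Self {
        IndexList { nodes: Vec::new(), head: NIL, free: NIL }
    }

    // O(1): pop a slot off the free list if one exists, else grow the Vec.
    fn push_front(&mut self, value: T) {
        let idx = if self.free != NIL {
            let idx = self.free;
            self.free = self.nodes[idx as usize].next;
            self.nodes[idx as usize].value = Some(value);
            idx
        } else {
            self.nodes.push(Node { value: Some(value), next: NIL });
            (self.nodes.len() - 1) as u32
        };
        self.nodes[idx as usize].next = self.head;
        self.head = idx;
    }

    // O(1): unlink the head and push its slot onto the free list.
    fn pop_front(&mut self) -> Option<T> {
        if self.head == NIL {
            return None;
        }
        let idx = self.head;
        self.head = self.nodes[idx as usize].next;
        self.nodes[idx as usize].next = self.free;
        self.free = idx;
        self.nodes[idx as usize].value.take()
    }
}

fn main() {
    let mut list = IndexList::new();
    list.push_front("c");
    list.push_front("b");
    list.push_front("a");
    assert_eq!(list.pop_front(), Some("a"));
    list.push_front("z"); // reuses the freed slot: no new allocation
    assert_eq!(list.nodes.len(), 3);
}
```

Note that popping never shrinks the `Vec`; the slot just moves to the free list, which is exactly the arena-allocator pattern the second bullet describes.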
       | swellguy wrote:
       | I think this article could be more intelligently titled: "Don't
       | post, read, or respond to things on TWTR". The linked lists
       | aspect could be replaced with anything.
        
       | btilly wrote:
       | In scripting languages it is a truism that anything you would
       | want to do with a linked list is probably better done with an
       | array. And given the overhead of working in the language versus
       | things implemented in C, this is generally true.
       | 
       | However I have a fun use for linked lists where this is NOT true.
       | 
       | Anyone who has played around with algorithms has encountered
       | dynamic programming. So you can, for example, find the count of
       | subsets that sum to a particular number without actually
       | enumerating them all. But what if instead of counting them, you
       | wanted to actually find some?
       | 
       | The answer turns out to be that instead of using dynamic
       | programming to get a count, you use dynamic programming to build
       | up a data structure from which you can get the answer. And the
       | right data structure to use turns out to be...a type of linked
       | list! There is no faster array equivalent for what you get.
       | 
       | In other words, with a few extra fields for clarity, here is the
       | basic structure for subset sum of positive integers:
       |             {
       |                 current_sum: ...,
       |                 count_of_solutions: ...,
       |                 current_value: ...,
       |                 solutions_using_current_value: (link in one dim of linked list),
       |                 solutions_using_previous_values: (link in other dim of linked list),
       |             }
       | 
       | I leave figuring out the dynamic programming code to generate
       | this as a fun exercise. Likewise how to extract, say, the 500'th
       | subset with this sum. Both are easy if you understand linked
       | lists. If not...well...consider it education!
        
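One possible Rust reconstruction of the structure btilly sketches, with the links stored as arena indices (the field names follow the sketch above, but the build and extraction code is my own guess at the exercise, not the author's solution):

```rust
use std::collections::HashMap;

struct Node {
    current_value: u64,              // item considered at this node (0 = sentinel)
    count_of_solutions: u64,         // subsets of the prefix reaching this sum
    using_current: Option<usize>,    // link: reach (sum - value) with earlier items
    skipping_current: Option<usize>, // link: reach the same sum with earlier items
}

// Build the structure layer by layer; `reachable` maps each attainable
// sum to the node representing all ways of reaching it so far.
fn build(items: &[u64], target: u64) -> (Vec<Node>, Option<usize>) {
    let mut arena = vec![Node {
        current_value: 0,
        count_of_solutions: 1, // the empty subset
        using_current: None,
        skipping_current: None,
    }];
    let mut reachable: HashMap<u64, usize> = HashMap::from([(0, 0)]);
    for &v in items {
        let mut next = reachable.clone(); // sums still reachable by skipping v
        for (&s, &idx) in &reachable {
            let sum = s + v;
            if sum > target {
                continue;
            }
            let skip = reachable.get(&sum).copied();
            let count = arena[idx].count_of_solutions
                + skip.map_or(0, |j| arena[j].count_of_solutions);
            arena.push(Node {
                current_value: v,
                count_of_solutions: count,
                using_current: Some(idx),
                skipping_current: skip,
            });
            next.insert(sum, arena.len() - 1);
        }
        reachable = next;
    }
    let root = reachable.get(&target).copied();
    (arena, root)
}

// Extract the k-th (0-based) subset by walking the links: if k falls in
// the "skipping" branch the item is absent, otherwise it is present.
fn kth_subset(arena: &[Node], root: usize, mut k: u64) -> Vec<u64> {
    let mut out = Vec::new();
    let mut cur = Some(root);
    while let Some(i) = cur {
        let n = &arena[i];
        if n.using_current.is_none() {
            break; // sentinel: the empty prefix
        }
        let skipped = n.skipping_current.map_or(0, |j| arena[j].count_of_solutions);
        if k < skipped {
            cur = n.skipping_current;
        } else {
            k -= skipped;
            out.push(n.current_value);
            cur = n.using_current;
        }
    }
    out.reverse();
    out
}

fn main() {
    let (arena, root) = build(&[2, 3, 5, 7], 10);
    let root = root.unwrap();
    assert_eq!(arena[root].count_of_solutions, 2); // {2,3,5} and {3,7}
    assert_eq!(kth_subset(&arena, root, 0), vec![2, 3, 5]);
    assert_eq!(kth_subset(&arena, root, 1), vec![3, 7]);
}
```

As the comment says, the key point is that extracting any particular solution is a walk down the links; no array layout gives you that without materializing the subsets.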
       | Legion wrote:
       | > (oh, dear Twitter: whatever happens I'll be there as long as
       | possible - if you care about people that put a lot of energy in
       | creating it, think twice before leaving the platform)
       | 
       | You mean the people that all just got fired?
        
       | pugworthy wrote:
       | Linked lists are great. But they have the problem that, almost
       | always, whatever technical interview you have, someone asks you
       | to whiteboard how to reverse one.
       | 
       | And I answer, "I'd google it"
        
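For reference, the whiteboard classic looks like this in safe Rust (a minimal sketch; `from_slice` and `to_vec` are helpers added only for illustration):

```rust
// A minimal singly linked list and the classic reversal, done in safe
// Rust by moving each Box to the front of an accumulator list.
struct Node {
    val: i32,
    next: Option<Box<Node>>,
}

fn reverse(mut head: Option<Box<Node>>) -> Option<Box<Node>> {
    let mut prev = None;
    while let Some(mut node) = head {
        head = node.next.take(); // detach the rest of the list
        node.next = prev;        // hook the node onto the reversed prefix
        prev = Some(node);
    }
    prev
}

// Helper: build a list from a slice, preserving order.
fn from_slice(vals: &[i32]) -> Option<Box<Node>> {
    let mut head = None;
    for &v in vals.iter().rev() {
        head = Some(Box::new(Node { val: v, next: head }));
    }
    head
}

// Helper: collect a list back into a Vec for inspection.
fn to_vec(mut head: &Option<Box<Node>>) -> Vec<i32> {
    let mut out = Vec::new();
    while let Some(node) = head {
        out.push(node.val);
        head = &node.next;
    }
    out
}

fn main() {
    let list = reverse(from_slice(&[1, 2, 3, 4]));
    assert_eq!(to_vec(&list), vec![4, 3, 2, 1]);
}
```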
       | fnordpiglet wrote:
       | Linked lists also work great in a sharded distributed data
       | structure. You can make changes to the list structure by simply
       | changing reference at nodes and don't need to move anything.
        
       | sudarshnachakra wrote:
       | I guess linked lists are very useful for implementing queues
       | (particularly those that feed thread pools), where the costs of
       | a growable array aren't needed and cache locality doesn't matter
       | (continuing with the thread pool example: it's almost a
       | guarantee that having the next element in the cache of a CPU
       | that is not going to execute the next Runnable is a waste).
       | 
       | In Java in particular, both the array-based and the linked
       | implementations of blocking queues should perform equally well.
       | FWIW most queue implementations are linked lists.
        
         | NavinF wrote:
         | The best implementations typically have a queue per thread and
         | work stealing. The first threads to finish their assigned work
         | will grab items from other queues, but until then you get
         | perfect cache locality.
         | 
         | Java's queues and global threadpool queues in general are
         | pretty old hat.
        
       | Jtsummers wrote:
       | Language wars, editor wars, and now data structure wars. What
       | random technical thing will people go to virtual blows over next?
        
         | keepquestioning wrote:
         | https://www.reddit.com/r/slatestarcodex/comments/9rvroo/most...
        
           | B1FF_PSUVM wrote:
            | I'd note that the number of people creating content now is
            | probably two to six orders of magnitude larger than before
            | the internet, when it was just print/radio/TV/etc.
           | 
           | Most that Joe Random could expect to get in print would be a
           | Letter to the Editor, or a random passerby interview. The in-
           | crowd was even more "unusual".
        
       | TillE wrote:
       | The Windows kernel also has linked lists as one of its few
       | general-purpose data structures available to drivers.
       | 
       | Like any tool, it has its place. As long as you're not traversing
       | large linked lists, they're probably fine.
        
         | dilyevsky wrote:
          | Traversing a short list frequently is still going to have
          | terrible cache performance compared to something like a
          | vector. Just use vectors, folks
        
           | slaymaker1907 wrote:
           | Allocating a huge chunk of memory all at once when the array
           | grows can also cause a bunch of problems. Linked lists are
           | also much simpler in a multi threaded context.
        
             | dilyevsky wrote:
             | I'm not an expert in memory allocators but I suspect large
             | allocation is better than many small allocs due to less
             | fragmentation no? I'll grant you the multithreaded case but
             | if you're doing that your cache performance is probably
             | crap anyway =)
        
           | [deleted]
        
       | lacrosse_tannin wrote:
       | Linked lists don't care about you. They aren't even alive.
        
         | TheDudeMan wrote:
         | Is that a "pro" or a "con"?
        
       | adeptima wrote:
       | When people asking my opinion for Rust, I loved to share them the
       | Linkedin List implementation link: https://doc.rust-
       | lang.org/src/alloc/collections/linked_list....
       | 
       | And the first feedback is why so many unsafe blocks? What is it
       | Option<NonNull<Node<T>>> ? '_ ?
       | 
       | Another reason to share, if you can understand Linkedin List, you
       | are free to code in Rust ;)
        
         | dekhn wrote:
         | I don't know Rust but I would guess that an
         | Option<NonNull<Node<T>>> would be a Node of type T, that cannot
         | be set to null, but can be optional. This is a type I would
         | want to return from a function that could either return
         | nothing, or a pointer to a real thing.
         | 
         | As for the unsafety, I would assume the authors of collections
         | know what they are doing and do that for performance.
        
           | proto_lambda wrote:
           | `NonNull` is actually a wrapper around the pointer type, with
           | the added invariant that the pointer cannot be null. The
           | compiler is made aware of this, which allows
           | `Option<NonNull<T>>` to be just a plain pointer, where the
           | `None` variant of the option corresponds to a null pointer
           | and the `Some(ptr)` case corresponds to a non-null pointer.
        
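The layout guarantee proto_lambda describes is easy to check (sizes shown are for typical 64-bit targets, though the equalities themselves follow from the documented null pointer optimization):

```rust
use std::mem::size_of;
use std::ptr::NonNull;

fn main() {
    // NonNull<T> excludes the null bit pattern, so Option can use it to
    // encode None: the Option adds no space at all.
    assert_eq!(size_of::<Option<NonNull<u64>>>(), size_of::<*mut u64>());
    // References are non-null too, so the same optimization applies.
    assert_eq!(size_of::<Option<&u64>>(), size_of::<&u64>());
    // A raw pointer has no niche: wrapping it in Option needs an extra
    // discriminant, so the Option grows.
    assert!(size_of::<Option<*mut u64>>() > size_of::<*mut u64>());
}
```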
             | josephg wrote:
             | I really wish rust had better syntax for this. Raw C
             | pointers should probably all be Option<NonNull<T>> rather
             | than *mut T.
             | 
             | The latter is easier to type but worse in almost every way.
             | Ergonomics should guide you to the former, not the latter.
        
               | kibwen wrote:
               | It's an interesting suggestion to impose Option semantics
               | on raw pointers by default (presumably with some
               | alternative way to receive a nullable pointer, for
               | completeness). But in the meantime, in your own codebase,
               | it's easy enough to do `type Ptr<T> =
                | Option<NonNull<T>>`.
        
               | josephg wrote:
               | The problem there is that NonNull is exclusively a
               | wrapper around *mut T. Using it discards the distinction
               | with *const _ pointers. I think that has implications for
               | correctness with Miri.
        
               | andrewflnr wrote:
               | I think if you tried to do that, you'd basically be
               | mixing type-driven optimizations with the C FFI, which
               | sounds sketchy, to me at least. The null-optimization for
               | Option is just that, an optimization, and I don't like
               | taking it for granted where safety/security is at stake.
        
               | nickitolas wrote:
               | The layout is guaranteed and documented as thus:
               | https://doc.rust-lang.org/std/option/#representation
               | 
               | > Rust guarantees to optimize the following types T such
               | that Option<T> has the same size as T:
               | 
               | > - Box<U>
               | 
               | > - &U
               | 
               | > - &mut U
               | 
               | > - fn, extern "C" fn1
               | 
               | > - num::NonZero*
               | 
               | > - ptr::NonNull<U>
               | 
               | > - #[repr(transparent)] struct around one of the types
               | in this list.
               | 
               | > This is called the "null pointer optimization" or NPO.
               | 
               | > It is further guaranteed that, for the cases above, one
               | can mem::transmute from all valid values of T to
               | Option<T> and from Some::<T>(_) to T (but transmuting
               | None::<T> to T is undefined behaviour).
               | 
               | Although I'm not sure if the ABI is guaranteed to match
               | (Which could differ even if the layout matches AFAIK).
               | The ABI for Box is guaranteed to match since
               | https://blog.rust-lang.org/2020/01/30/Rust-1.41.0.html ,
               | and I would imagine NonNull is the same. Maybe you could
               | open an issue in the UCG asking if you're needing
               | confirmation.
        
               | josephg wrote:
                | That's fair.
               | 
               | I suppose one problem with this approach for C FFI is
               | that there's a lot of different values which could all be
               | "null pointers". Converting them all for Option would be
               | awkward and slow, and you wouldn't want to ever risk
               | getting this stuff wrong.
               | 
               | But pointers are also useful even if you aren't doing
               | FFI. Eg for implementing custom data structures. In that
               | case, Option<NonNull<T>> (Or even NonNull<T>) is usually
                | better than *mut T. But it's harder to type, and it
                | doesn't clearly tell you if the pointer should be *mut T
                | or *const T. NonNull<T> should be preferred _because_
                | security/safety is at stake for this sort of code.
        
         | nemothekid wrote:
         | > _When people asking my opinion for Rust, I loved to share
         | them the Linkedin List implementation link:_
         | 
          | This LinkedList obsession is a bit bizarre to me, and tends
          | to come from older programmers from a time when coding
          | interviews involved writing linked lists and balancing
          | B-trees.
         | To me though it also represents the stubbornness of C
         | programmers who refuse to consider things like growable vectors
         | a solved problem. My reaction to the LinkedList coders is not
          | "well Rust needs to maintain ownership"; it's _why does your
         | benchmark for how easy a language is involve how easy it is to
          | fuck around with raw pointers_?
         | 
          | LinkedLists are a tool, but to C programmers they are an
         | invaluable fundamental building block that shows up early in
         | any C programmers education due to how simple they are to
         | implement and the wide range of use cases they can be used for.
          | But they are technically an unsafe data structure, and if you're
         | willing to let some of that stubbornness go and finally accept
         | some guard rails, you have to be able to see that a data
         | structure like linkedlists will be harder to implement.
         | 
          | It has nothing to do with the language; implementing
         | LinkedLists with any sort of guardrails adds a ton of
         | complexity, either up front (e.g. borrowchecker) or behind the
         | scenes (e.g. a garbage collector). When you accept this fact,
         | it becomes ludicrous to imply that a LinkedList implementation
         | is a good benchmark for the ergonomics of a language like Rust.
        
         | bjourne wrote:
         | Wtf? That code looks like shit.
        
         | riwsky wrote:
         | Also the implementation is all wrong. The proper way to add an
         | element to a linkedin list is to send the element an email
         | titled "I just requested to connect", along with a "view
         | profile" and "accept" link.
        
         | chlorion wrote:
          | Considering that unsafe Rust basically just allows raw
          | pointer access (see link below), which is already a thing you
          | can do normally in C, I do not see how that's a very good
          | argument against it, honestly.
         | 
         | As for the Option type, it is exactly what it says. It's an
         | optional, non-nullable node generic over type T. I suppose
         | generics, optionals and the idea of non-nullable types might be
         | exotic at first, but this is not a problem with Rust as much as
         | it's a problem with C not being able to express these concepts
         | in the first place and instead expecting you to spray null
         | checks all over the place!
         | 
         | https://doc.rust-lang.org/book/ch19-01-unsafe-rust.html
        
           | scaramanga wrote:
           | Not to be overly contrary here but, you "need to spray null
            | checks" probably just as much in Rust as in C (in this
           | specific example, not in general). Since those prev/next
           | pointers are being modified in unsafe code where either
           | you're using NonNull::new() which contains the null check, or
           | you're using NonNull::new_unchecked() which means you need to
           | convince yourself all the invariants hold true. The situation
           | is roughly equal in C (ie. once you prove in a module that
           | fields cannot be NULL, no need to add lots of redundant extra
           | NULL checks in all users of the module).
        
         | andrewflnr wrote:
         | In the Rust ecosystem philosophy, linked lists are the kind of
         | thing you want to write once, very carefully, then re-use a
         | lot. That's exactly the kind of code where unsafe can be
         | reasonable. It's not like it would actually be safer in any
         | other language that didn't have an explicit unsafe keyword.
         | More aesthetically pleasing, perhaps, but not safer.
         | 
         | You're acting like it's some epic burn on Rust that there's
         | some ugly code in its stdlib. It's not. Stdlibs are like that.
         | Furthermore, as others have pointed out, linked lists in
         | particular are a pathological case for Rust, meaning you're
         | pointing and snickering at a special case of a special case.
         | Most people never have to write or look at that kind of code.
        
           | kragen wrote:
           | Functional programming languages like Haskell, ML, and Lisps
           | use linked lists very extensively and rarely have safety
           | problems as a result. On the contrary, because linked lists
           | allow you to build a list up efficiently without mutation,
           | they give you a good way to avoid all the safety problems
           | that come from mutation. In general just about any kind of
           | data structure in these languages is some sort of weird
           | linked list.
           | 
           | This isn't the best solution for all cases! Sometimes
           | performance is more important than safety, and there are
           | things you can't do as efficiently without mutable data. And
           | sometimes (especially for systems that are reactive rather
           | than transformational, in Harel and Pnueli's terms, or for
           | libraries) a garbage collector is an unacceptable cost.
           | 
           | And Rust is actually reasonably good at singly-linked lists;
           | it's just doubly-linked lists (which inherently feature
           | mutation) that are troublesome. Singly-linked lists with
           | sharing through Arc are a little more expensive in Rust than
           | in something like GHC, Ikarus Scheme, or OCaml, but that's
           | hardly a damning criticism of Rust.
           | 
           | But you seem to be claiming that linked lists are unsafe in
           | any language, and to anyone familiar with functional
           | programming, that claim is obvious nonsense. Did you intend
           | it?
        
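The mutation-free list-building kragen describes can be sketched with a persistent cons list in Rust, where `Rc` stands in for the garbage collector that Haskell or Scheme would provide (this code is an illustration of the idea, not from the comment):

```rust
use std::rc::Rc;

// A persistent cons list in the functional style: pushing onto the front
// shares the existing tail instead of copying or mutating it.
enum List {
    Cons(i32, Rc<List>),
    Nil,
}
use List::{Cons, Nil};

fn push(head: i32, tail: &Rc<List>) -> Rc<List> {
    Rc::new(Cons(head, Rc::clone(tail)))
}

fn sum(list: &Rc<List>) -> i32 {
    match &**list {
        Cons(v, rest) => *v + sum(rest),
        Nil => 0,
    }
}

fn main() {
    let shared = push(3, &Rc::new(Nil)); // [3]
    let a = push(1, &shared);            // [1, 3]
    let b = push(2, &shared);            // [2, 3], tail shared with `a`
    assert_eq!(sum(&a), 4);
    assert_eq!(sum(&b), 5);
    // Three owners of the tail: `shared`, `a`'s link, and `b`'s link.
    assert_eq!(Rc::strong_count(&shared), 3);
}
```

Because nothing is ever mutated, `a` and `b` can safely share the `[3]` tail; this is the structural sharing that makes cons lists so central to functional languages.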
             | andrewflnr wrote:
             | I should have been more specific about doubly linked lists.
             | And I think the implicit context for comparisons with Rust
             | is usually pointer-heavy non-GC languages, especially given
             | OP, but maybe I should have specified that, too. It's
             | certainly true that DLL's are tricky to write, and tricky
             | to mechanically prove correct, which is mainly what
             | 'unsafe' actually means.
        
               | kragen wrote:
               | Thank you for clarifying!
               | 
               | I would think that the implicit context for _safety_
               | comparisons with Rust would be other languages designed
               | with safety as a goal. Those languages are almost all
               | GCed! ML, Haskell, Scheme, Oberon, Coq, Erlang -- all
                | GCed. That's because GC helps enormously with type- and
               | memory-safety. In the example in question, it ensures
               | that your doubly-linked list manipulation cannot cause
               | undetected type errors, use-after-free errors, double-
               | free errors, or memory leaks; this last guarantee goes
               | beyond what Rust can offer in this case. In ML, Haskell,
               | and Oberon, it even ensures that all type errors are
               | detected at compile time instead of run time.
               | 
               | There are a few languages designed with safety as a goal
               | that do not normally use GC: Pascal, Ada, COBOL, and
               | MISRA C. But they suffer enormously in expressivity, are
               | not pointer-heavy, and are generally only viable
               | alternatives in restricted domains.
               | 
               | The usual pointer-heavy non-GC languages are C and C++,
               | and these have no concern whatsoever for safety, so it
               | seems like a weak-man argument to target linked-list
               | manipulation in them as unsafe relative to Rust.
               | 
               | The correctness criteria for doubly-linked lists are
               | indeed not within the capability of Rust's (or Haskell's,
               | or ML's, or Oberon's) static checking. That doesn't imply
               | that it's impossible to statically verify doubly-linked-
               | list manipulation code, and in fact I think Ada SPARK is
               | capable of doing it. Certainly it's within the scope of
               | what you can prove with Coq or Isabelle/HOL.
        
           | svnpenn wrote:
           | this comes off as extremely defensive. I don't feel like the
           | person you're responding to, intended any kind of "sick
           | burn". Calm down dude.
        
             | andrewflnr wrote:
             | Whatever. They're definitely pointing and snickering. I
             | don't think there's any way to read their post but as a
             | criticism of Rust the language based on this one piece of
             | stdlib code, and that's nonsense. That irritates me. So sue
             | me.
             | 
             | There are reasonable arguments to be had about Rust. I wish
             | people would do those instead of all the nonsense you
             | usually see.
        
               | svnpenn wrote:
               | on the contrary, I think more and more people are
               | unwilling to hear any criticism, even constructive
               | criticism, of "their thing".
               | 
                | Rust is not perfect. I won't go into its faults, but you
               | know what they are. I think overall, Rust is a great
               | language, but you have to be realistic and understand
               | that some stuff you ignore in favor of the holistic view,
               | others might not be able to. So sometimes maybe just take
               | the criticism and deal with it, or if you're in the
               | position, do something to fix the problem.
        
               | andrewflnr wrote:
               | Contrary to _what_ exactly that I said?
        
               | svnpenn wrote:
               | to this:
               | 
               | > I wish people would do those instead of all the
               | nonsense you usually see
               | 
               | you seem to want to place the blame squarely on the
               | critics, when the reality is that many people in your
               | position are seemingly intolerant of any criticism of
               | "their thing", even constructive criticism.
        
               | andrewflnr wrote:
               | Ah, I see, since I'm defending Rust in this sub-thread,
               | it's impossible for me to ask for constructive criticism
               | instead of bullshit without it being "seemingly
               | intolerant", while you're the reasonable one for doing
               | the exact same thing. Basically just assuming I'm an
               | idiot and/or arguing in bad faith. Lovely.
        
               | svnpenn wrote:
                | they linked to the source code. What bar do they have
                | to reach before you honor it as constructive? The point
                | is, there is no bar. Unless it's praise of Rust, it's
                | trolling in your eyes.
        
               | Dylan16807 wrote:
               | The source code of the standard library is usually an
               | awful way to show off any language. Pointing and gawking
               | at such a thing as criticism is very bad criticism.
        
               | andrewflnr wrote:
               | Yeah, I already made that point in my root comment. It
               | was pretty much my whole thesis, in fact. Doesn't matter,
               | apparently! svnpenn is determined to confuse me for
               | someone dumber. :)
        
         | SighMagi wrote:
         | There's so much code marked "unsafe" in there...
        
           | estebank wrote:
           | That's because in order to interact with pointers you need
           | unsafe, and not using unsafe in Rust requires a single
           | ownership tree _or_ reference counting. If you 're not keen
           | on either of those solutions, you shouldn't be writing a
           | Doubly Linked List in other languages either (because they
           | will have either pointers or reference counting/GC).
        
             | andrewflnr wrote:
             | "Those solutions" start off being hierarchical ownership or
             | reference counting, but turn into "pointers or reference
             | counting". Requiring pointers is implicit to the definition
             | of linked lists, and calling it out here is silly. And you
             | don't really need either ref counting (again, pretty
             | obviously, witness every real world implementation) or
             | hierarchical ownership outside Rust.
             | 
             | Conceptually, the list as a vague concept owns all the
             | nodes. The ownership just stops being expressible directly
             | through references the way Rust wants it to be. But you
             | don't have to re-invent hierarchical ownership to implement
             | DLLs in other languages, because it's not really there in
             | the problem. You just have to accept that they're always
             | going to suck under Rust's ownership model.
        
               | estebank wrote:
               | > Requiring pointers is implicit to the definition of
               | linked lists, and calling it out here is silly.
               | 
               | The usual complaints are that "Safe Rust doesn't let you
               | do this", when unsafe Rust is right there to let you
               | operate on any soup of pointers datastructure that you'd
               | want to implement.
               | 
               | > you don't really need either ref counting (again,
               | pretty obviously, witness every real world
               | implementation)
               | 
                | If you implement a DLL in a memory-managed language, you
               | effectively have the same behavior as you would in Rust
               | with Arc or Rc. The ownership of the nodes then belongs
               | to the GC or the RC value. You don't have to think about
               | it because you're abstracted from the lifetime of each
               | node by the language and runtime.
               | 
               | > or hierarchical ownership outside Rust
               | 
                | The ownership/borrow checker requires hierarchical
                | ownership to operate, but it doesn't stop you from
                | writing unsafe Rust.
               | 
               | > Conceptually, the list as a vague concept owns all the
               | nodes. The ownership just stops being expressible
               | directly through references the way Rust wants it to be.
               | But you don't have to re-invent hierarchical ownership to
               | implement DLLs in other languages, because it's not
               | really there in the problem. You just have to accept that
               | they're always going to suck under Rust's ownership
               | model.
               | 
               | Yeah, and that's what unsafe is for. Most code doesn't
               | need vague ownership, though. I'd go as far as saying
               | that the nudge towards clear ownership helps both
               | maintainability and performance.
               | 
               | My concern is with the prevalence of this idea that
               | unsafe Rust is not "real Rust" or that the existence and
               | use of unsafe Rust somehow precludes any benefits of
               | Rust.
        
               | db48x wrote:
               | Technically the pointers don't have to be actual
               | pointers. And if they're not actually pointers, then
               | Rust's ownership model and borrow checker will be
               | perfectly content; they'll barely trouble you at all. And
               | you won't even need any unsafe blocks.
               | 
               | What you do instead is use integers instead of pointers.
               | The integers are indexes into a Vec of list nodes, owned
               | by the linked list. Since the nodes are now owned only by
               | the Vec, which is in turn owned only by the linked list,
               | the borrow checker will not complain.
               | 
               | Some people object that this is cheating somehow, but
               | what is memory but a giant untyped global vector, shared
               | between all parts of your application? Pointers into that
               | giant shared array are just indexes with extra risk,
               | since you have to trust that they point to real nodes
               | that have been initialized. Plus, you often see users of
               | a linked list put the nodes into an arena allocator
               | anyway, especially in the kernel. The Vec in your Rust
               | implementation serves the same purpose as the arena
               | allocator.
        
           | tracker1 wrote:
           | There be dragons...
        
         | xdavidliu wrote:
         | > I loved to share them the Linkedin List implementation ... if
         | you can understand Linkedin List ...
         | 
         | is the "In" in "LinkedIn" deliberate here or just a typo that
         | was made twice?
        
         | mastax wrote:
         | C++'s STL linked list for comparison (libcxx).
         | 
         | https://github.com/llvm-mirror/libcxx/blob/master/include/li...
        
           | einpoklum wrote:
           | Rust's list only supports a default allocation mechanism. For
           | an actual comparison, you'll want to show us a Rust
           | implementation which is parametrized over allocators.
           | 
           | Having said that - yes, it's not pretty. About the same
           | length though.
        
             | mastax wrote:
             | Adding allocator support does add a bit more noise:
             | 
             | https://github.com/rust-
             | lang/rust/pull/103093/files#diff-8bd...
        
           | kortex wrote:
           | Oof. Yeah, standard lib code tends to be harder to read on
           | average than application code. But at least I can mentally
           | parse the Rust code. Heck it even looks like a linked list
            | impl. The C++ code looks barely better than line noise. I'm
           | sure I could wrap my head around it, but ho boy, it's far
           | worse than Rust by a long shot.
           | 
           | And this is with several years cpp experience, and maybe a
           | few months of Rust on and off.
        
         | hnov wrote:
         | Isn't a doubly linked list basically incompatible with Rust's
         | idea of memory ownership?
        
           | proto_lambda wrote:
           | It's definitely not a data structure that makes the borrow
           | checker happy. Luckily it's also not a data structure that's
           | required or desirable in 99% of cases, but the standard
           | library still offers an implementation for those 1% so people
           | don't try to write their own (turns out getting a doubly
           | linked list correct isn't quite as trivial as it may seem).
        
             | Waterluvian wrote:
             | I'm actually a bit surprised it's in the standard library
             | if it's so uncommonly needed. Isn't Rust's stdlib tiny with
             | a philosophy of letting the community fill in most things?
        
               | estebank wrote:
               | There were arguments made for its non-inclusion[1].
               | 
               | [1]: https://rust-unofficial.github.io/too-many-
               | lists/sixth.html
        
               | Waterluvian wrote:
               | I enjoyed reading this. A fun style.
        
           | zozbot234 wrote:
           | AIUI, GhostCell and related patterns can be used to implement
           | doubly linked lists in safe Rust. These patterns are quite
           | obscure and hard to use in practice, though.
        
           | tmtvl wrote:
           | From what I remember Rust has some kind of RefCell with weak
           | references. Those could be used to make DLLs, if anyone can
           | find a reason to use those. That said, you could also use
           | zippers...
        
             | slaymaker1907 wrote:
             | Now imagine writing the drop implementation for a DLL that
             | doesn't cause a stack overflow.
        
             | steveklabnik wrote:
             | RefCell<T> and weak references are two different things,
             | but rust has both, yes.
        
           | brundolf wrote:
           | It's at odds with it, but so are many other data structures
           | found in the standard library. The solution is usually to use
           | unsafe { } inside the data structure to allow for those
           | criss-crossing pointers (while exposing safe, checker-
           | friendly interfaces to the outside world)
           | 
           | This is a great resource that outlines several different
           | strategies for implementing linked lists in Rust:
           | https://rust-unofficial.github.io/too-many-lists/
        
         | kibwen wrote:
         | When explaining why the difficulty of modeling linked lists in
         | Rust doesn't matter, I like to share the following book, "Learn
         | Rust With Entirely Too Many Linked Lists", written by the woman
         | who wrote the implementation you've linked above: https://rust-
         | unofficial.github.io/too-many-lists/
        
           | adeptima wrote:
            | My first thought was that it was a joke ... ;)
           | 
           | Learn Rust with entirely too many linked lists (2019) -
           | https://news.ycombinator.com/item?id=22390662 - https://rust-
           | unofficial.github.io/too-many-lists/index.html
           | 
           | In this series I will teach you basic and advanced Rust
           | programming entirely by having you implement 6 linked lists.
           | In doing so, you should learn:
           | 
            | - The following pointer types: &, &mut, Box, Rc, Arc,
            | *const, *mut, NonNull(?)
           | 
           | - Ownership, borrowing, inherited mutability, interior
           | mutability, Copy
           | 
           | - All The Keywords: struct, enum, fn, pub, impl, use, ...
           | 
           | - Pattern matching, generics, destructors
           | 
           | - Testing, installing new toolchains, using miri
           | 
           | - Unsafe Rust: raw pointers, aliasing, stacked borrows,
           | UnsafeCell, variance
        
           | keepquestioning wrote:
           | I read this and noped out of learning Rust.
        
             | hinkley wrote:
             | Are you bragging about self imposed ignorance of advanced
             | programming topics?
             | 
             | I guess there's plenty of space in the world for PHP
             | programmers and their predecessors, the VB programmers.
             | So... congrats?
        
             | kelnos wrote:
             | So I glanced at it, expecting something much much worse,
             | but I was surprised to see it wasn't that bad. Even if you
             | look at the final code for the final implementation, it
             | doesn't seem much more complex than what you'd have to
             | write in C. And a good chunk of the code is implementing
             | traits from the stdlib, stuff that isn't strictly
             | necessary, but makes it easier to make the API of the deque
             | idiomatic and work well with other stdlib (and non-stdlib)
             | APIs.
             | 
             | And regardless, this book is teaching from the perspective
             | of intentional super-hard-mode. Most Rust developers will
             | never have to write that much unsafe code.
        
       | AtNightWeCode wrote:
        | People must stop giving attention to old garbage solutions
        | that should not be used. Never use a linked list if you don't
        | specifically need a linked list. You probably don't know why
        | you would need one, so don't go there.
        
         | int_19h wrote:
         | If you stop "giving attention" to old solutions, people will
         | simply reinvent them. Linked lists in particular are very easy
         | to reinvent because they're so simple, and because their
         | advantages are all evident even in a high-level language, while
         | disadvantages require a more low-level understanding of modern
         | compute performance.
        
         | tester756 wrote:
         | It could be way better advice if you tried
        
       | deafpolygon wrote:
        | Honestly, a lot of programming data structure concepts were
        | difficult to grasp /until/ I truly understood how a linked
        | list works. It's a very simple concept in hindsight, but as a
        | newbie programmer this was an alien concept. Linked list?
        | Just give me my array or built-in List! Then I understood how
        | to build a linked list and how it can be extended in various
        | ways. They can be used to solve interesting problems. Many
        | high-level languages provide pre-baked implementations of LL,
        | but most programmers don't understand how they work under the
        | surface. When I understood LL, I was able to apply that
        | understanding to trees.
        
       | GnarfGnarf wrote:
       | I have built my business on linked lists (genealogy trees). Paid
       | for my house :o)
        
       | visarga wrote:
       | That was my high school programming fun - implement linked lists
       | and other data structures in TurboPascal. Pascal is a language
       | that does not come with dynamic lists, only fixed size arrays and
       | memory allocations. I had to build linked lists just to have a
       | decent list primitive.
       | 
       | A few years later I discovered Perl that comes with dynamic size
       | lists, dictionaries and strings. Whoa, that was a treat, so very
       | nice to use and efficient. To this day I prefer constructs made
       | of lists and dicts 10x more than defining classes.
        
       | dragontamer wrote:
        | The most interesting thing about linked lists to me is how
        | trivially they can be made lock-free and multithreaded with
        | atomics.
       | 
       | An array/stack can be made multithreaded but it is non-trivial to
       | handle the edge-cases. In particular, the ABA problem is rather
       | difficult. I've seen many solutions to it (ex: a synchronization
       | variable that puts the stack into "push-only" mode and then "pop-
       | only" mode. There's no ABA-problem if all threads are pushing!)
       | 
        | However, pushing/popping from a linked list stack requires no
        | such synchronization at all. Simply compare-and-swap the head
        | (and on failure, try again). It's about as simple as you can
        | get when it comes to atomic / lock-free patterns.
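A minimal sketch of the push side in C11 atomics (type and function names are mine, not from the thread): the only synchronization is a compare-and-swap retry loop on the head pointer.

```c
#include <stdatomic.h>
#include <stddef.h>

struct node {
    int value;
    struct node *next;
};

/* Lock-free push: publish the node by CAS-ing it in as the new head.
   On CAS failure, `old` is refreshed with the current head and we
   retry, so the node's next pointer is always set before it becomes
   visible to other threads. */
void push(_Atomic(struct node *) *head, struct node *n)
{
    struct node *old = atomic_load(head);
    do {
        n->next = old;
    } while (!atomic_compare_exchange_weak(head, &old, n));
}
```

Pop looks symmetric but is exactly where the ABA discussion below comes in.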
        
         | charleslmunger wrote:
         | I did this for an object pool in re2j and saw even single
         | threaded performance improve.
         | 
         | https://github.com/google/re2j/blob/dc7d6e5d41225dc0825ea6fe...
         | 
         | Java doesn't suffer from pointer address ABA but I did have to
         | handle reinsertion (except when the stack had only one
         | element).
        
         | [deleted]
        
         | yeputons wrote:
         | > Simply compare-and-swap the head (and on failure, try again).
         | 
         | It doesn't help with ABA at all, does it? Unless you assume
         | that nodes are immutable and the same pointer is never reused,
         | of course.
        
           | dragontamer wrote:
           | I forgot about that other ABA issue.
           | 
           | Still, that's easily solved: 64-bit version number + 64-bit
           | pointer (or 32-bit version number + 32-bit pointer) for a
           | 128-bit (or 64-bit) compare-and-swap.
           | 
           | All modern CPUs support 128-bit CAS.
           | 
           | EDIT: The version number is incremented by +1 each time. It
           | is unlikely that you'd overflow 64-bit version number and
           | risk an ABA problem, though 32-bits can be overflowed
           | surprisingly quickly in today's computers.
           | 
           | --------
           | 
           | EDIT2: Note: I'm pretty sure (but not 100% sure) that the
           | Head of a linked-list stack can be "simply" compared-and-
           | swapped to remain lock-free and 100% valid (ie: 64-bit
           | compare and swap over the 64-bit pointer). I did not mean to
           | imply that you can "CAS any arbitrary node of a linked-list".
           | A fully generic linked list is possible and I've seen it with
           | the 128-bit CAS operator, but that wasn't what I was going
           | for originally.
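The 32-bit variant is easy to sketch in C: here the nodes live in an arena and the head word packs a 32-bit node index with a 32-bit version, so a plain 64-bit CAS covers both fields at once (the arena representation and all names are illustrative assumptions, not from the thread).

```c
#include <stdatomic.h>
#include <stdint.h>

/* Sentinel index marking the empty list. */
#define NODE_NONE 0xFFFFFFFFu

/* The head word: low 32 bits = node index, high 32 bits = version. */
static inline uint64_t pack(uint32_t index, uint32_t version)
{
    return ((uint64_t)version << 32) | index;
}
static inline uint32_t head_index(uint64_t word)   { return (uint32_t)word; }
static inline uint32_t head_version(uint64_t word) { return (uint32_t)(word >> 32); }

struct node { int value; uint32_t next; };

/* Pop from an index-based stack. The version bump on every successful
   CAS makes a recycled index distinguishable from the one we read,
   which is what defeats ABA. */
uint32_t pop(_Atomic uint64_t *head, struct node *arena)
{
    uint64_t old = atomic_load(head);
    uint32_t idx;
    do {
        idx = head_index(old);
        if (idx == NODE_NONE)
            return NODE_NONE;
    } while (!atomic_compare_exchange_weak(
                 head, &old,
                 pack(arena[idx].next, head_version(old) + 1)));
    return idx;
}
```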
        
             | murderfs wrote:
             | > EDIT2: Note: I'm pretty sure (but not 100% sure) that the
             | Head of a linked-list stack can be "simply" compared-and-
             | swapped to remain lock-free and 100% valid (ie: 64-bit
             | compare and swap over the 64-bit pointer).
             | 
             | No, you cannot. The problem is what you're comparing and
             | swapping into the head during a pop. You want to do the
             | moral equivalent of `current = list;
             | list.compare_exchange(current, current->next)`, but
             | current->next might have changed if someone else popped the
             | original head, pushed something else, and then pushed the
             | original head again.
             | 
             | You need double CAS or LL/SC or a more complicated scheme
             | to make this work.
        
               | AstralStorm wrote:
               | You can get away with single CAS if you don't need full
               | size pointers. Of course then you're actually using your
               | list to refer to an array.
        
               | dragontamer wrote:
               | It should be noted that on most x86 machines, they only
               | support 48-bit pointers, with 16 free bits you can use.
               | 
               | I think some Intel chips have 57-bit support. But no CPU
               | actually reads all 64-bits for some reason.
               | 
               | So yeah, you can shove count-bits, either 16 of them or 7
               | of them or so.
        
               | dragontamer wrote:
               | Fair catch.
               | 
               | That's obscure but it looks like you're correct in this
               | case.
        
               | murderfs wrote:
               | It might not be immediately obvious that this can happen,
               | but in practice, this scenario happens very frequently,
               | because if you free a node and then malloc a node, you're
               | very likely to get the same address back with most memory
               | allocators.
        
         | [deleted]
        
       | ww520 wrote:
        | Linked lists are useful when you are hurting for memory and
        | need precise memory allocation, like in the kernel. You trade
        | memory compactness for more time spent on certain operations.
        
         | AstralStorm wrote:
         | And code size, let's not forget about that one. Plus you do not
         | have a super smart memory allocator to prevent the hundreds of
         | vectors from spraying the heap.
        
       | ayhanfuat wrote:
        | For completeness, here's what Bjarne Stroustrup says on why
        | you should avoid using linked lists: https://youtu.be/YQs6IC-vgmo
        
         | db48x wrote:
         | This is a video everyone should watch.
        
       | raible wrote:
       | FWIW I usefully use a "fake" linked list by adding a .next field
       | to the struct in a compiler-allocated array of structs. On
       | initialization I set the list head to &item[0], and in a trivial
       | loop set the .next field of each entry (except the last) to
       | entry+1.
       | 
       | Why bother? Because I can then easily add a few extra structs to
       | the beginning of the (contiguously-allocated) linked-list without
       | having to reallocate the whole thing.
       | 
       | Sure, pointer chasing with separately allocated structs is
       | "slow", but I haven't yet measured to see if it's any different
       | when (almost all) items are contiguous.
       | 
       | If you would... - what sort of cache behavior should one expect
       | of this on a modern laptop CPU? - I haven't seen this approach
       | before, have you?
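If I'm reading the approach right, it amounts to something like this sketch (field and function names are mine):

```c
#include <stddef.h>

struct item {
    int value;
    struct item *next;
};

/* Thread a contiguous array into a singly linked list: each entry
   points at its array neighbor, and the last entry at NULL. */
void thread_array(struct item *items, size_t n)
{
    for (size_t i = 0; i + 1 < n; i++)
        items[i].next = &items[i + 1];
    if (n > 0)
        items[n - 1].next = NULL;
}
```

Prepending then costs one pointer store: set the new struct's .next to &items[0] and treat it as the head — no reallocation, and traversal still walks mostly contiguous memory.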
        
         | Sirened wrote:
         | It depends on what prefetchers your CPU has and the actual
         | underlying access pattern your accesses cause. If chasing next
         | leads to a constant stride run through the array, you'll get
         | identical performance to that of an iterative walk since
         | essentially every high performance CPU since like 2010 supports
         | stride prediction based on raw addresses. If .next sends you
         | bouncing all over the area, you'll get complicated behavior
         | since whether or not the data can be prefetched depends on the
         | performance/existence of a pointer prefetcher, which is less
         | common/more error prone. We know Apple's M1 have it due to some
         | researchers using it as a side channel [1] but you'll have to
         | do some digging on whether or not your laptop has one. Would
         | make a nice post here if you do make the benchmarks :)
         | 
         | [1] https://www.prefetchers.info/augury.pdf
        
       | zebb wrote:
       | > In defense of linked lists
       | 
       | In defense of what? A brigade of null pointer exceptions?
       | 
       | > Linked lists are conceptual. A node pointing to itself is the
       | most self centered thing I can imagine in computing: an ideal
       | representation of the more vulgar infinite loop. A node pointing
       | to NULL is a metaphor of loneliness. A linked list with tail and
       | head connected, a powerful symbol of a closed cycle.
       | 
       | Oh, this article is complete satire. Bravo, you had me
        
       | elteto wrote:
       | I much prefer deques to linked lists when I need O(1) push/pop
       | but still want to keep some semblance of cache locality. Although
       | you can't splice two deques together as easily as two linked
       | lists.
       | 
       | Sadly, you can't really control the block size in C++'s
       | std::deque so you can't communicate useful information to the
       | library about your use case. I think MSVC allocates such a small
       | block that it is effectively a linked list.
        
         | woojoo666 wrote:
          | I feel like to compare linked lists and deques, you would
          | have to give a specific implementation of a deque
        
           | Jtsummers wrote:
           | Correct, deques are abstract data types and you can implement
           | them multiple ways, including using linked lists or growable
           | arrays/vectors. So they aren't directly comparable to linked
           | lists since under the hood they can be implemented with
           | linked lists (or doubly linked lists at least). You could
           | compare the performance of different deque implementations,
           | though.
        
         | intrepidhero wrote:
            | How does a deque improve the situation with cache
            | locality versus a linked list?
        
           | klyrs wrote:
           | It depends on the implementation. Python's deque, for
           | example, is implemented as a two-level data structure; a
           | circularly linked list of blocks. If you've got page-sized
           | blocks, then your deque can be sufficiently local for most
           | cases. It introduces a little extra logic, and you'll want to
           | keep around a spare block to prevent thrashing memory if you
           | repeatedly push/pop at the block boundary, but it's often
           | worth the pain if the data structure is anywhere near a hot
           | loop.
        
           | zwkrt wrote:
            | Most deques are implemented under the hood with arrays. A
            | linked list usually requires a heap allocation for each
            | element.
            | 
            | This is speaking in generalities though, since you could
            | have an "alternatively implemented" deque or a linked
            | list entirely on the stack.
        
           | blibble wrote:
           | it's typically implemented as an array (e.g. circular buffer)
        
           | Shorel wrote:
            | It is usually implemented as linked arrays: for example,
            | arrays of 16 or 64 elements; if it grows beyond that
            | size, new arrays or vectors are used underneath.
           | 
           | Two consecutive elements are probably in the same array, and
           | that helps with cache locality.
        
         | layer8 wrote:
         | Note that a deque in general is an ADT that isn't necessarily
         | implemented in the way std::deque commonly is. A deque can be
         | implemented as a linked list.
        
       | javajosh wrote:
        | Linked lists are cool, but he missed one reason why: how
        | easily they become n-ary trees. A tree expressed like this is
        | quite elegant and useful in a variety of domains, my favorite
        | being a "trie": an n-ary tree that computes the label of each
        | node as the string you get from the previous label plus the
        | string in this node. It's a wild and hairy data structure,
        | but very cool once tamed.
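One classic encoding of that idea is first-child/next-sibling, where two list-style pointers per node handle any branching factor; a minimal sketch (names are mine):

```c
#include <stddef.h>

/* First-child / next-sibling: two list pointers encode an n-ary tree.
   A node's children form a singly linked list through next_sibling. */
struct tnode {
    const char *label;
    struct tnode *first_child;
    struct tnode *next_sibling;
};

/* Count nodes by recursing into both the child list and the sibling
   list — each pointer is just a linked-list traversal. */
size_t tree_size(const struct tnode *t)
{
    if (!t)
        return 0;
    return 1 + tree_size(t->first_child) + tree_size(t->next_sibling);
}
```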
        
       | koinedad wrote:
       | They are really cool. Deleting an element by assigning next to
       | next.next still blows my mind.
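In C, the whole deletion of a successor node really is one pointer assignment; a minimal sketch (names are mine):

```c
#include <stddef.h>

struct node {
    int value;
    struct node *next;
};

/* Unlink the node after `prev` in O(1): just point around it. The
   caller still owns the unlinked node and must free or recycle it. */
void remove_after(struct node *prev)
{
    if (prev && prev->next)
        prev->next = prev->next->next;
}
```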
        
         | Jtsummers wrote:
         | If you think that's neat, check out Dancing Links used for
         | backtracking algorithms:
         | https://en.wikipedia.org/wiki/Dancing_Links
        
       ___________________________________________________________________
       (page generated 2022-11-05 23:01 UTC)