[HN Gopher] A database without dynamic memory allocation
___________________________________________________________________
A database without dynamic memory allocation
Author : todsacerdoti
Score : 250 points
Date : 2022-10-13 15:23 UTC (7 hours ago)
(HTM) web link (tigerbeetle.com)
(TXT) w3m dump (tigerbeetle.com)
| pyrolistical wrote:
| This implies that changing buffer sizes requires a process
| restart. Which is a fine trade-off.
|
| This also implies all memory is allocated upfront. It would
| require the operating system to be smart enough to compress or
| page out blocks, or else the process will hold on to unused
| memory.
| haggy wrote:
| Per the article: "(Currently it's determined even before
| startup, at compile-time, but startup is the important point
| since that is the first time that memory can be allocated.)"
|
| Sooo are we sure you can even change this via process restart?
| This is saying the limits are built in at compile time...
| eatonphil wrote:
| They are currently compile time settings. That may change in
| the future to be more user-friendly. :) But what won't change
| is that the size is fixed at startup.
| Thaxll wrote:
| How does it work exactly? Is it a memory pool that uses 85% of
| available memory? If I have one row in my DB, is it going to
| preallocate 30GB of memory? I don't think it's a sane design; it
| reminds me of the JVM args with min / max.
| sakras wrote:
| The OS doesn't actually map/allocate the pages to you until you
| touch them, so if you preallocate 30 GB and only use one row,
| you'd only be using 4 KB (or whatever your page size is). This
| is the same as if you'd individually malloc'd the single row,
| since deep down malloc gets its memory from mmap (or brk)
| anyway.
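|
| A minimal C sketch of the idea (assuming 64-bit Linux with
| the default overcommit settings; sizes are illustrative):
|
|     #include <stdio.h>
|     #include <sys/mman.h>
|
|     int main(void) {
|         /* Reserve 30 GiB of address space. No physical
|            pages are committed yet. */
|         size_t size = 30ull << 30;
|         char *buf = mmap(NULL, size,
|                          PROT_READ | PROT_WRITE,
|                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
|         if (buf == MAP_FAILED) return 1;
|
|         /* Touching one byte faults in a single page;
|            resident memory stays around one page size. */
|         buf[0] = 42;
|         printf("reserved %zu bytes\n", size);
|         return 0;
|     }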
| jorangreef wrote:
| Joran from the TigerBeetle team here!
|
| We have a secret plan for this too. ;)
| allknowingfrog wrote:
| The article differentiates between "memory" and "on disk"
| storage, which is where the rows in the database would live.
| apavlo wrote:
| This is pretty much how all DBMSs work right now. You have to
| tell it how much memory it can use for its internal buffer
| pool.
| aidenn0 wrote:
| I've been an embedded software developer for almost 2 decades,
| and this style used to be near universal in embedded, but has
| become less so, much to my dismay. I even saw a project not too
| long ago where a large C++ STL list (i.e. a linked list) was
| created at startup, then later freed piecemeal, causing severe
| heap fragmentation in the poor embedded system's malloc
| implementation.
| hinkley wrote:
| I think it was when I was reading the zlib source code, while
| dinosaurs roamed outside my college window, that I first saw a
| variant of this: a library that statically allocated memory
| for a graceful shutdown routine before doing anything else, so
| that even if memory were exhausted by a bug in other code later
| on it could still exit cleanly.
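|
| A minimal sketch of that pattern in C (graceful_shutdown()
| is a hypothetical cleanup routine, not zlib's actual API):
|
|     #include <stdlib.h>
|
|     static void *emergency_reserve;  /* rainy-day fund */
|
|     static void graceful_shutdown(void) {
|         /* flush logs, close files, etc. */
|         exit(1);
|     }
|
|     void init(void) {
|         /* Grab the reserve before doing anything else. */
|         emergency_reserve = malloc(64 * 1024);
|     }
|
|     void *xmalloc(size_t n) {
|         void *p = malloc(n);
|         if (p == NULL) {
|             /* Return the reserve so the shutdown path
|                has memory to work with. */
|             free(emergency_reserve);
|             emergency_reserve = NULL;
|             graceful_shutdown();
|         }
|         return p;
|     }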
| melech_ric wrote:
| It's still done. I did the code review for the guy who wrote
| it for our server. If we're crashing due to an OOM situation
| we still have enough memory to write out a minidump.
| touisteur wrote:
| The pain here is to test and keep testing these paths. Same
| as some 'free disk space' monitoring...
| melech_ric wrote:
| I agree. However, in our case, that pain has proven to be
| less than not having those minidumps.
| touisteur wrote:
| Oh for me too, but I wish it were easier, like something
| supported seriously by my language of choice, my OS of
| choice or my libc of choice, with the flick of a flag...
| I don't retest the whole libc and kernel every time
| (although, err, someone should, right? :-)
| jorangreef wrote:
| Joran from the TigerBeetle team here!
|
| TigerBeetle uses Deterministic Simulation Testing to test
| and keep testing these paths. Fuzzing and static
| allocation are force multipliers when applied together,
| because you can now flush out leaks and deadlocks in
| testing, rather than letting them spill over into
| production.
|
| Without static allocation, it's a little harder to find
| leaks in testing, because the limits that would define a
| leak are not explicit.
| polskibus wrote:
| Can you provide some pointers to more information about the
| simulation testing?
| jorangreef wrote:
| Sure!
|
| Here's an overview with references to the simulator
| source, and links to resources from FoundationDB and
| Dropbox: https://github.com/tigerbeetledb/tigerbeetle/blob/main/docs/...
|
| We also have a $20k bounty that you can take part in,
| where you can run the simulator yourself.
| srcreigh wrote:
| What language? What does the server do?
| melech_ric wrote:
| C++ on Windows. It's an old codebase that has been used
| to run custom test hardware.
|
| We also target Linux with the same codebase, but the
| environment is a bit different. On Linux we're using a
| system slice and we're not doing minidumps.
| kotlin2 wrote:
| What environment is the server running in? Can you prevent
| the Linux OOM killer from killing your process?
| abeyer wrote:
| The OOM killer can be disabled completely on a system, or
| a process can be excluded from it by setting an OOM flag
| under `/proc/<pid>`. It's not generally considered good
| practice in production, but I think that's largely because
| most software does not do what is being suggested here...
| in the scenario where you have software that does, it seems
| like the right thing to do.
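|
| For reference, the flag lives at /proc/<pid>/oom_score_adj;
| writing -1000 excludes the process entirely (lowering the
| score needs CAP_SYS_RESOURCE or root). A minimal sketch in C:
|
|     #include <stdio.h>
|
|     /* Exclude the calling process from the OOM killer. */
|     int oom_exclude_self(void) {
|         FILE *f = fopen("/proc/self/oom_score_adj", "w");
|         if (!f) return -1;
|         fputs("-1000", f);
|         return fclose(f);
|     }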
| throwaway09223 wrote:
| It's definitely good practice in production and is often
| necessary.
|
| The techniques mentioned above will (perhaps
| surprisingly) not eliminate errors related to OOM, due to
| the nature of virtual memory. Your program can OOM at
| runtime even if you malloc all your resources up front,
| because allocating address space is not the same thing as
| allocating memory. In fact, memory can be deallocated
| (swapped out), and then your application may OOM when it
| tries to access memory it has previously used
| successfully.
|
| Without looking, I can confidently say that tigerbeetle
| does in fact dynamically allocate memory -- even if it
| does not call malloc at runtime.
| abeyer wrote:
| > It's definitely good practice in production and is
| often necessary.
|
| I'd be curious if you have any resources/references on
| what is considered good practice in that now, then.
|
| It's been a long time since I did much ops stuff outside
| a few personal servers, so it may well be my background
| is out of date... but I've certainly heard the opposite
| in the past. The argument tended to run along the lines
| that most software doesn't even attempt to recover and
| continue from a failed malloc, may not even be able to
| shut down cleanly at all if it did anyway, and the kernel
| may have more insight into how to best recover...so just
| let it do its thing.
| jorangreef wrote:
| We're aware of this, in fact, and do have a plan to
| address virtual memory. To be fair, it's really the
| kernel being dynamic here, not TigerBeetle.
| throwaway09223 wrote:
| Oh for sure. Not casting any shade at all on your
| work - I am really happy to see what you're doing. This
| kind of thing has a lot of value even in a virtual memory
| environment.
| colejohnson66 wrote:
| IIRC, .NET does something similar for OutOfMemoryExceptions.
| The exception is thrown _before_ actually running out of
| memory. That allows just enough space for the
| OutOfMemoryException object (and associated exception info)
| to be allocated.
| Merad wrote:
| I believe the runtime preallocates an OOME on startup, so
| if one is encountered it can just fill in the blanks and
| throw without needing any additional allocations.
| znpy wrote:
| Iirc perl also does this. I'm on the phone right now, but
| iirc $^M was a variable filled with about 20k of worthless
| data that could be freed in case of an out-of-memory
| condition.
| LunaSea wrote:
| Would you have some links to posts that talk about this style
| of memory management and allocation?
| malkia wrote:
| Game development still revolves around this. For example, when
| loading a level, it used to be that you had one huge arena: you
| load your "serialized" map at one end (say the top), and your
| other chunks at the bottom - you have some "flexibility" to pop
| from the top or the bottom, so early on you arrange for that.
| Alternatively you split your memory into big fixed-size chunks
| and stream these in and out (works well for textures, and primed
| sounds - e.g. the first 64kb or 128kb of a sound file).
| [deleted]
| aidenn0 wrote:
| I don't have some links, but it's more straightforward than
| you might think (and TFA covers some of it).
|
| 1. Bound every _whatever_ that requires memory to some value
| N
|
| 2. For all _whatevers_ that can have a lifetime for your
| entire program, do that
|
| 3. For any _whatevers_ that can't have a lifetime for your
| entire program (e.g. for a database server like TFA, a
| connection object would qualify), make a table of size N
|
| 4. When you need a _whatever_ (again using a connection
| object as an example, when accept() returns) grab an unused
| entry from the table. If there is no unused entry, do the
| appropriate thing (for a connection, it could be sending an
| error back to the client, just resetting the TCP connection,
| blocking, or logging the error and restarting the whole
| server[1]).
|
| 5. When you are done with a _whatever_ (e.g. the connection
| is closed) return it to the table.
|
| Those are the basics (a minimal C sketch follows after the
| footnote); there are some details that matter for doing this
| well e.g.:
|
| - You probably don't want to do a linear-probe for a free
| item in the table if N is larger than about 12
|
| - Don't forget that the call-stack is a form of dynamic
| allocation, so ensure that all recursion is bounded or you
| still can OOM in the form of a stack overflow.
|
| I should also note, that for certain values of N, doing
| things this way is normal:
|
| - When N=1 then it's just a global (or C "static" local
| variable).
|
| - When N=(number of threads) then it's just a thread-local
| variable
|
| 1: The last one sounds extreme, but for a well understood
| system, too many connections could mean something has gone
| terribly wrong; e.g. if you have X client instances running
| for the database, using connection pools of size Y, then you
| can have 2*X*Y live connections in your database; exceeding
| N=2*X*Y is a Bad Thing that recovering from might not be
| possible.
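|
| A minimal C sketch of steps 1-5, using connections as the
| _whatever_ (names and sizes are illustrative):
|
|     #include <stdbool.h>
|     #include <stddef.h>
|
|     #define MAX_CONNECTIONS 64  /* the N from step 1 */
|
|     struct connection {
|         bool in_use;
|         int fd;
|         /* ... per-connection state ... */
|     };
|
|     /* Step 3: a table of size N, alive for the whole
|        program. */
|     static struct connection table[MAX_CONNECTIONS];
|
|     /* Step 4: grab an unused entry (linear probe; fine
|        for small N, per the caveat above). */
|     struct connection *conn_acquire(int fd) {
|         for (size_t i = 0; i < MAX_CONNECTIONS; i++) {
|             if (!table[i].in_use) {
|                 table[i].in_use = true;
|                 table[i].fd = fd;
|                 return &table[i];
|             }
|         }
|         return NULL;  /* no free entry: reject/reset/etc. */
|     }
|
|     /* Step 5: return it to the table when done. */
|     void conn_release(struct connection *c) {
|         c->in_use = false;
|     }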
| hobs wrote:
| Missing the end of "- You probably don't want to do a
| linear-probe for a free item in the table if N is larger
| than about"
| aidenn0 wrote:
| Thanks, fixed
| ericbarrett wrote:
| I was curious about that too :) My guess is 16 items
| (edit: OP says 12) but I'm not an embedded developer, and
| I'm sure it's hardware-specific.
|
| The approach I've seen for larger fixed-sized tables is a
| free list: when the table is allocated, also initialize
| an array of the slot numbers, with an associated length
| index. To allocate, pop the last slot # off the array and
| decrement the length index. To free, push the slot # onto
| the end of the array and increment the length index.
| Checking for a free slot is then just testing for length
| index > 0.
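|
| A sketch of that free list in C (the slot type is
| illustrative):
|
|     #define N 1024
|
|     static int slots[N];      /* the fixed-size table  */
|     static int free_list[N];  /* stack of free slot #s */
|     static int free_len;      /* the length index      */
|
|     void pool_init(void) {
|         for (int i = 0; i < N; i++) free_list[i] = i;
|         free_len = N;
|     }
|
|     /* O(1): pop the last slot # off the array. */
|     int pool_alloc(void) {
|         return free_len > 0 ? free_list[--free_len] : -1;
|     }
|
|     /* O(1): push the slot # back onto the array. */
|     void pool_free(int s) {
|         free_list[free_len++] = s;
|     }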
| aidenn0 wrote:
| Yeah, a stack of free indexes is good. In the rare case
| of "lots of indexes, but little (extra) memory" a bitmap
| can be good because it's 1 bit per index rather than
| ⌈log2(N)⌉ bits per index and has a constant factor
| speed up equal to the word-size of the architecture.
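|
| A C sketch of the bitmap variant (using a GCC/Clang
| builtin to find the first zero bit in a word):
|
|     #include <stdint.h>
|
|     #define N 4096
|     static uint64_t used[N / 64];  /* 1 bit per index */
|
|     /* Scans 64 indexes per comparison -- the word-size
|        constant-factor speedup mentioned above. */
|     int bitmap_alloc(void) {
|         for (int w = 0; w < N / 64; w++) {
|             if (used[w] != UINT64_MAX) {
|                 int b = __builtin_ctzll(~used[w]);
|                 used[w] |= (uint64_t)1 << b;
|                 return w * 64 + b;
|             }
|         }
|         return -1;
|     }
|
|     void bitmap_free(int i) {
|         used[i / 64] &= ~((uint64_t)1 << (i % 64));
|     }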
| [deleted]
| 01100011 wrote:
| Nearly everything I learned about embedded systems
| programming came from either Embedded Systems Magazine
| articles (https://www.embedded.com/embedded-systems-design-magazine-ar...)
| or just being a paranoid SoB.
|
| Writing reliable systems means spending a lot of time
| thinking about what failures can occur and how, minimizing
| them and dealing with the cases you can't get rid of. Simple
| is nearly always better.
|
| It also helped to be an EE by education and not a CS major.
| CS majors can get blinded by their abstractions and forget
| that they're working on a real piece of HW with complex
| limitations which require design tradeoffs.
| blondin wrote:
| game development has been doing it this way for a long time
| too. the domain often dictates how we code.
| aaaronic wrote:
| Isn't this just a bunch of static object pools? That's extremely
| common in embedded programming. I'm glad to see someone
| recognized it could apply to a database.
|
| My students do this on the GameBoy Advance specifically so they
| don't need to learn about dynamic memory allocation to get useful
| things done early in my course. In fact, they only learn about
| dynamic memory allocation for academic reasons later on and never
| _need_ to use it in their games.
| schemescape wrote:
| Do you mind sharing the name of the course? It sounds
| interesting. (I didn't see anything in your profile.)
| victor106 wrote:
| > Doing this was possible because we sketched out all resource
| usage (network, disk, memory, CPU) in the design phase with back-
| of-the-envelope calculations.
|
| I wish all programs did this
| [deleted]
| chubot wrote:
| _Building a system that allocates memory once up-front requires
| up-front thinking._
|
| sqlite apparently has some support for this, although of course
| you make it more predictable if you control all the code.
|
| https://www.sqlite.org/malloc.html
|
| They even provide some nice math for you to create the fixed size
| heap!
|
| _The problem of dynamic memory allocation, and specifically the
| problem of a memory allocator breakdown, has been studied by J.
| M. Robson and the results published as_ J. M.
| Robson. "Bounds for Some Functions Concerning Dynamic Storage
| Allocation". Journal of the Association for Computing Machinery,
| Volume 21, Number 8, July 1974, pages 491-499.
| matheusmoreira wrote:
| Thanks for the reference, that Robson proof is amazing.
| Published in 1974... I wonder what other treasures are hidden
| in the literature.
| kristoff_it wrote:
| The idea of having programs that don't crash under memory
| pressure is one of the things that I really hope developers will
| start striving for more in the near future.
|
| A database that doesn't OOM crash is great, and the same is also
| true for more consumer-grade software.
|
| I have 32gb of RAM on my PC and whenever I'm encoding a video
| with DaVinci Resolve, I always see Firefox and Discord crash
| almost instantly.
| slaymaker1907 wrote:
| What would happen in your utopia is that DaVinci Resolve would
| fail to start because Firefox or Discord would reserve too much
| memory. Or maybe Firefox would demand you close open tabs
| before allowing you to open a new one. We have a solution to
| the problem you describe, it's called paging to disk and hoping
| that we won't need to read or write to that memory very often
| (which works great for something like Firefox where a bunch of
| pages are sitting idly in the background).
| kristoff_it wrote:
| Allocating upfront is only one technique.
|
| Somehow DaVinci Resolve is capable of using all the free
| memory it can find and not more than that, otherwise it would
| never be able to complete any rendering job. That's one
| "utopic" program that's pretty real.
|
| > We have a solution to the problem you describe, it's called
| paging to disk and hoping
|
| No. The solution is to be smart about memory usage, and to
| handle failures when trying to allocate memory. DaVinci would
| become unusable if it started swapping when rendering.
|
| Maybe for some programs swapping is a fine solution, but the
| argument that software engineers shouldn't concern themselves
| with designing for OOM conditions is absolutely ridiculous.
|
| That's not a 'utopia', that's what we should concretely be
| striving for.
| mananaysiempre wrote:
| It's difficult to see how that would be possible without a
| hard break from legacy systems when fork/exec practically
| forces overcommit and very dubious resource accounting on
| us. (One consumer system that handled OOM conditions with
| remarkable grace was Symbian, but of course it forced an
| extremely peculiar programming style.)
| kristoff_it wrote:
| That's a fair point, but it's also true that we're never
| going to break away from legacy systems until we start
| writing software that has a chance of functioning
| correctly when running out of resources.
|
| The good news is that we do have ecosystems where you
| don't have overcommit, like webassembly and embedded
| devices, and if you're writing libraries meant to also
| work on those platforms, then you will have some good
| building blocks that can be used to iterate towards more
| robust software.
|
| That's how you're expected to approach library writing in
| Zig.
| ok_dad wrote:
| Too bad we can't design some protocol for applications to
| "work together" to ensure each has the memory they need. Like
| some sort of "application collective memory management" where
| each application has certain memory it needs/wants and
| depending on the priority of that need and the current
| priorities of the other applications, it could know to reduce
| memory usage during those times when some other application
| has a higher priority. For one, it would help in your case,
| because background video processing might be a lower
| priority, or perhaps you want it to finish as fast as
| possible, so that Firefox and etc. just "freeze" themselves
| for a while and drop as much memory as is practicable (ie:
| deleting currently rendered pages from memory and re-
| rendering them later, or paging those out to disk for a
| while).
| mananaysiempre wrote:
| Related references:
|
| - Neal Walfield's papers on resource management[1,2]
|
| - The agoric papers[3] from the E people (who sadly now
| seem to be trying to steer the whole thing in the
| cryptocurrency direction)
|
| [1] http://walfield.org/papers/2009-walfield-viengoos-a-framewor...
|
| [2] http://walfield.org/papers/2011-walfield-smart-phones-need-s...
|
| [3] https://papers.agoric.com/papers/
| loriZyy wrote:
| Windows has had this for over 20 years:
|
| https://learn.microsoft.com/en-us/windows/win32/api/memoryap...
| kccqzy wrote:
| iOS has had something similar but simpler since the very
| beginning:
| https://developer.apple.com/documentation/uikit/uiapplicatio...
| kristoff_it wrote:
| Firefox should in theory be able to crash individual tabs
| to reclaim memory, but in practice I just see it die.
| littlestymaar wrote:
| Isn't it likely because it's not crashing by itself, and
| is being killed by the OS?
| colejohnson66 wrote:
| OOM is usually a _good_ reason to crash. Sure, a DB might be
| able to do something creative with its cache, but if you can't
| allocate, there's not much the average program can do anymore.
| I guess browsers can kill their child processes, but that's not
| a long-term solution. If you're out of memory, your computer
| will be lagging from paging/swapping memory in-and-out, so
| killing yourself helps relieve some pressure.
|
| Also, are you sure it's the programs failing and not some OOM
| killer coming around? Could your pagefile/swap be full?
| advisedwang wrote:
| The article says it rejects incoming connections. That seems
| like a much better behaviour than OOM - clients can back off
| and retry and keep the overall system stable.
| jorangreef wrote:
| Thanks!
|
| Joran from the TigerBeetle team here.
|
| This was in fact one of our motivations for static
| allocation--thinking about how best to handle overload from
| the network, while remaining stable. The Google SRE book
| has a great chapter on this called "Handling Overload" and
| this had an impact on us. We were thinking, well, how do we
| get this right for a database?
|
| We also wanted to make explicit what is often implicit, so
| that the operator has a clear sense of how to provision
| their API layer around TigerBeetle.
| lmm wrote:
| I can easily imagine that kind of design getting into a
| state where it's "up" but not accepting any connections
| indefinitely (if it's using just enough memory to run
| itself, but doesn't have enough to accept any connections).
| Crashing early is often a good way to reduce the state
| space and ensure consistent behaviour, since you will
| always have to handle crashing anyway.
| wahern wrote:
| Crashing and losing all existing connections often merely
| leads to an avalanche of subsequent requests,
| accelerating overload of your systems. I've seen this
| countless times.
|
| Being crash tolerant is one thing. But "crash early,
| crash often" is absolutely horrible advice.
| lmm wrote:
| > Crashing and losing all existing connections often
| merely leads to an avalanche of subsequent requests,
| accelerating overload of your systems. I've seen this
| countless times.
|
| Sure, but again, that's a scenario that you need to
| handle anyway.
|
| > Being crash tolerant is one thing. But "crash early,
| crash often" is absolutely horrible advice.
|
| I've found it to be good advice. It's a big part of why
| Erlang systems have been so reliable for decades.
| wahern wrote:
| In Erlang a "process" is more akin to a thread or fiber
| in most other languages, and "crashing" more like an
| exception that is caught just across a logical task
| boundary. Importantly, in Erlang a process "crashing"
| doesn't kill all other connections and tasks within the
| kernel-level Erlang process.
|
| And that's why Erlang is so resilient, precisely because
| the semantics of the language make it easier to _isolate_
| tasks and subtasks, _minimizing_ blast radius and the
| risk of reaching an inconsistent or unrecoverable state.
| I often use Lua (as a high-level glue language) to
| accomplish something similar: I can run a task on a
| coroutine, and if a constraint or assertion throws an
| error (exceptions in Lua are idiomatically used
| sparingly, similar to how they're used in Go), it's much
| easier to recover. This is true even for malloc failures,
| which can be caught at the same boundaries (pcall,
| coroutine.resume, etc) as any other error.[1] It's also
| common to use multiple separate Lua VM states in the same
| kernel process, communicating using sockets, data-copying
| channels, etc, achieving isolation behaviors even closer
| to what Erlang provides, along with the potential
| performance costs.
|
| [1] While more tricky in a language like Lua, as long as
| your steady-state--i.e. event loop, etc--is free of
| dynamic allocations, then malloc failure is trivially
| recoverable. The Lua authors are careful to document
| which interfaces and operations might allocate, and to
| keep a core set of operations allocation free. Of course,
| you still need to design your own components likewise,
| and ideally such that allocations for connection request,
| subtasks, etc can be front-loaded (RAII style), or if not
| then isolated behind convenient recovery points. Erlang
| makes much of this discipline perfunctory.
| yellowapple wrote:
| > Could your pagefile/swap be full?
|
| This presumes that one is indeed using a pagefile or swap
| partition. I know of quite a few folks who skip that on SSDs
| out of fear of SSD wear.
|
| Personally, in this day and age I don't really give a damn
| about SSD wear (that's what backups are for), so I'll happily
| create a swap partition, but not everyone is as comfortable
| with that.
| perryizgr8 wrote:
| Software for critical infrastructure often handles no/low
| memory situations in a graceful manner. I've worked on
| network switch software where we were expected to work
| indefinitely with zero remaining memory. A single packet drop
| is unacceptable, at 10-40G line rate. Any new malloc needs to
| be justified with a design document.
|
| So yeah, average user software can do a lot, it's just that
| we've given up on reliability and robustness as an industry.
| That's why sometimes your phone fails to make a 911 call, and
| sometimes you need to reset your car.
| slaymaker1907 wrote:
| When SQL Server goes OOM, it's often because the system can't
| free memory quickly enough, not that there is literally not
| enough memory to continue operating. Sure, it could be more
| aggressive with freeing memory, but at some point it just
| makes more sense to restart the entire system so that overall
| performance doesn't nosedive.
| pphysch wrote:
| It would be nice if you could specify e.g. a Postgres table be
| backed by a (really big) "ring buffer" rather than the usual
| storage.
|
| Maybe that breaks some relational features on the table, but
| being able to painlessly query the most recent X records for
| some interesting transient log data would be nice.
| ok_dad wrote:
| I ended up using a ring buffer in Redis recently to store the
| last N values of some database lookups that are slow. It
| works great! Didn't take that much logic, either, since Redis
| has easy transactions. I just have the "current buffer index
| 'i'" as a key-value, the "buffer length 'N'" as a key-value,
| then 'N' numbered entries as key-values. I prefix these with
| the data identifier that it caches, so you can have several
| ring buffers active. you can do this with a Postgres table
| just as easily, I assume!
|
| I also am building a ring buffer data structure for my hobby
| database that I use for personal stuff that uses FoundationDB
| [though I have been unhappy with the state of the FDB Go
| bindings, which are somewhat out of date versus the C client
| library for FDB and aren't very Go-like in design (the C
| libraries are the "gold standard" and they use FFI in all the
| other bindings to connect to those)].
|
| _Added later_ : it was a bit harder to deal with resizing
| the buffer, so when that happens I do some data
| shifing/dropping depending on which way we're resizing,
| effectively initializing the buffer anew because that was
| easier (again, a Redis transaction saves me a lot of
| concurrency work!), but lucky for me we would only resize
| each buffer at most 1 time per minute due to how our configs
| work.
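|
| For what it's worth, the core structure is tiny; an in-memory
| C sketch of the same "most recent N records" layout (the
| record type is illustrative):
|
|     #include <stddef.h>
|
|     #define N 1024  /* keep the most recent N records */
|
|     struct record { long id; double value; };
|
|     static struct record buf[N];
|     static size_t head;   /* index of the next write */
|     static size_t count;  /* valid records, <= N     */
|
|     void rb_append(struct record r) {
|         buf[head] = r;          /* overwrite the oldest */
|         head = (head + 1) % N;
|         if (count < N) count++;
|     }
|
|     /* k = 0 is the newest record, k = 1 the one before. */
|     struct record rb_recent(size_t k) {
|         return buf[(head + N - 1 - k) % N];
|     }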
| jandrewrogers wrote:
| Allocating all required memory during startup has been idiomatic
| for database architectures as long as I can remember. The obvious
| benefits to doing things this way are wide-ranging.
|
| There is a less obvious benefit in more schedule-driven database
| architectures. By implication, the code knows exactly how much of
| which resources are instantaneously available at all times since
| it is directly managing those resources. This enables the
| database to dynamically adapt its behavior and scheduling
| decisions in response to specific resource pressures or excess
| resource capacity. This is basically a fine-grained form of
| internal micro-backpressure, dynamically prioritizing subsets of
| the workload that minimize impact on the resources under pressure
| at any moment in time. In some versions, you can switch internal
| execution strategies to take advantage of currently under-
| utilized resources. If you outsource this to the OS, you have no
| idea how close or far you are from the limits, so it doesn't
| really work.
|
| This doesn't add much performance _per se_ but is great for
| minimizing tail latencies. It helps to create a robust internal
| resource equilibrium that handles transients and concurrency more
| smoothly and consistently than if the database couldn't make
| fine-grained decisions about what to do from a position of total
| resource awareness.
| com2kid wrote:
| > Allocating all required memory during startup has been
| idiomatic for database architectures as long as I can remember.
| The obvious benefits to doing things this way are wide-ranging.
|
| Allocating all required memory is, IMHO, a great practice for
| almost any type of program.
|
| Heap memory allocators are a type of performance optimization:
| they allow re-use of existing memory, but only if you are
| careful not to run out of memory. Heaps of course also allow
| for more efficient utilization of memory: if some software
| module isn't using that bit of RAM right now, let another bit
| of code use it.
|
| But like all performance optimizations, they make code messier,
| and they also lead to less reliable and harder to follow code.
|
| In the embedded world, code statically allocates all memory it
| needs up front. This means that code that reads from the
| external world has a fixed buffer size, and also likely a
| related max throughput, regardless of system conditions (e.g.
| lots of free memory laying around).
|
| But, this is also a good thing! It forces programmers to think
| about their limits up front, and it also forces thinking about
| error handling when those limits are broken!
|
| Sure it can require creative coding, loading large blobs of
| binary data requires more work when you stop assuming malloc
| can return infinite memory, but, hot take here, programs should
| stop assuming malloc can return infinite memory anyway.
| girvo wrote:
| > In the embedded world, code statically allocates all memory
| it needs up front.
|
| Except for things like the ESP32, which uses malloc pretty
| liberally within ESP-IDF. Networking is hard without it, it
| seems?
| astrange wrote:
| If you're running on a regular OS, you can't escape it by
| just not calling malloc.
|
| You'll still get lower performance if the system is low on
| memory, because you got swapped out - it doesn't have much if
| anything to do with how much you allocated. Wiring it is a
| different story, and if you're an audio app maybe you should
| allocate and mlock(), but not a database.
|
| > hot take here, programs should stop assuming malloc can
| return infinite memory anyway.
|
| Assuming this is a perfectly fine way to make reliable
| software.
|
| https://en.wikipedia.org/wiki/Crash-only_software
| jorangreef wrote:
| Static allocation has also made TigerBeetle's code cleaner,
| by eliminating branching at call sites where before a message
| might not always have been available. With static allocation,
| there's no branch because a message is always guaranteed to
| be available.
|
| It's also made TigerBeetle's code more reliable, because
| tests can assert that limits are never exceeded. This has
| detected rare leaks that might otherwise have only been
| detected in production.
| pierrebai wrote:
| I don't get your characterization. The statically sized
| arrays can never be fully used? That is doubtful. I suppose
| that if data X is only ever used 1:1 with data Y and you
| got a Y, then you are guaranteed to have an X, but that
| requires complete static analysis of the code base _and_
| certainty that no future code change will _ever_ affect
| that 1:1 relationship. That seems fragile.
|
| I mean, it's not like memory exhaustion is a common
| occurrence nowadays. These kinds of compromises, where one
| trades an almost non-existent failure mode for a fragile,
| assumption-ridden code base, do not sound like a wise
| choice.
|
| I mean, if the particular process is so tightly-integrated
| that you can know that you have a 1:1 relationship between
| X and Y, you can also tie those together with dynamic
| allocations. I find it hard to believe that you can easily
| statically show that X and Y allocations are tied in a
| static allocation scheme but that it would not be so under
| dynamic allocation?
| Twisol wrote:
| I think the grandparent was saying that dynamic allocation
| is a form of optimization, which also makes the code harder
| to follow. Your anecdote seems exactly in line with their
| suggestion.
| jorangreef wrote:
| Ah, missed that, thanks! I've updated the comment.
| parhamn wrote:
| > memory during startup has been idiomatic for database
| architectures as long as I can remember.
|
| How can this be true given most mainstream databases do not do
| this? Maybe a pattern, sure, but idiomatic, not so much. I only
| add this comment in defense of the poster, as yours seemed to
| diminish their achievement/efforts.
| dilyevsky wrote:
| Not just databases but really any kind of latency sensitive
| software especially with "realtime" requirements (e.g telecom
| gear)
| gavinray wrote:
| I like this idea a lot.
|
| Are there any other databases which do a similar thing?
|
| This seems like one of those things that has massive benefit if
| you're willing to invest in the effort + architect it like that
| from the start.
|
| (Unrelated note: I discovered last night that Zig has io_uring
| support and some nice wrappers implemented in "std". Which is
| nice for usecases like databases or networked services)
|
| https://github.com/ziglang/zig/blob/42a3b60c331cf01d1ab80eb7...
| kristoff_it wrote:
| > (Unrelated note: I discovered last night that Zig has
| io_uring support and some nice wrappers implemented in "std".
| Which is nice for usecases like databases or networked
| services)
|
| io_uring support was contributed to the Zig standard library by
| Joran, who is also the creator of TigerBeetle :^)
| SleepyMyroslav wrote:
| rant at fixed memory systems:
|
| In many fixed-size systems, data "re-used" as a result of a bug
| is still valid-looking data, and it will produce slightly
| different results that cannot be distinguished from real ones
| until the whole computation is redone outside the system. If the
| bad data reference is stable, it can even pass majority/stability
| tests for a long time. Poisoning destroyed values can save a lot
| of time, just because it will produce nonsensical results that
| get noticed faster.
|
| Also, coming from the games industry, I don't recognize this
| praise of fixed systems at all. Games prefer to fail fast. A
| restart can be easy on players if they can re-connect without
| losing any progress. The way to get players truly upset is to
| get player data mixed with bad values after a long play time, or
| to leave no easy way to recover progress/saved data.
| [deleted]
| slaymaker1907 wrote:
| Should be titled "without using malloc" or something. It's pretty
| much impossible to do anything with user input strings unless you
| have some sort of memory management. Maybe you have one giant
| buffer that you use for strings in a particular session, but that
| still involves memory management because even if you allocate the
| buffer at startup, using one buffer for multiple strings is just
| a type specific arena allocator.
|
| As someone who works on low level database code, this approach is
| going to be EXTREMELY wasteful. It's much better to have
| a more sophisticated understanding of memory use by
| subsystems and have some concept of required memory and caches.
| Each system allocates the minimum memory it needs at startup and
| then treats the rest of memory as caches which can be emptied as
| necessary to preserve fairness among all the various subsystems.
| throw827474737 wrote:
| > Should be titled "without using malloc" or something. It's
| pretty much impossible to do anything with user input strings
| unless you have some sort of memory management
|
| Title says "without dynamic memory allocation" == what means
| without malloc in this context, so static memory allocation ...
| where you do your own kind of memory management in fixed size
| arrays or more sophisticated if you want. It didn't say without
| memory management? (And article explains also why no strings or
| other unknown/arbitrary length stuff..)
| eatonphil wrote:
| Also, the article mentions the only entities are two fixed
| size types: accounts and transfers.
|
| There are no user strings in TigerBeetle. :)
|
| If anyone wants to see what you can store in accounts and
| transfers, check out the reference docs!
|
| https://docs.tigerbeetle.com/reference/accounts
|
| https://docs.tigerbeetle.com/reference/transfers
| kortex wrote:
| > using one buffer for multiple strings is just a type specific
| arena allocator.
|
| Sure, but there's a lot of advantages of entirely controlling
| memory allocation in-process.
|
| > this approach is going to be EXTREMELY wasteful.
|
| Could you elaborate why? Sure, you have to do a bit more
| juggling in-process, but I would imagine this would be
| potentially _more_ efficient than having conventional mallocs.
|
| In the environment it is designed to operate in, there is
| usually a fixed memory limit per process, so you might as well
| grab all that up front.
| jorangreef wrote:
| Thanks!
|
| Joran from the TigerBeetle team here.
|
| > I would imagine this would be potentially more efficient
| than having conventional mallocs.
|
| Yes, in our experience, static allocation means we sometimes
| use 10x less memory. For example, TigerBeetle's storage
| engine can theoretically address up to 100 TiB with less than
| 1 GiB of RAM for the LSM in-memory manifest, which is
| efficient.
|
| Because we think about memory so much upfront, in the design
| phase, we tend not to waste it.
| kristoff_it wrote:
| I think it should be noted that TigerBeetle is an extremely
| specialized database. It's not a general-purpose DB for storing
| arbitrary data, all it stores is numeric financial account
| information.
| sh1mmer wrote:
| Maybe I'm missing something but I would imagine query planning
| and caching is the hardest thing in terms of database memory
| usage. This post doesn't seem to address that.
|
| Even if the messages are fixed sized surely the memory cost of
| executing the queries depends on the complexity of the query, and
| the expected numbers of results returned.
|
| Are they just doing get/set style queries only?
| jorangreef wrote:
| Thanks!
|
| Joran from the TigerBeetle team here.
|
| TigerBeetle's storage engine is designed also for range
| queries, and we have some interesting ideas for our query
| engine in the works.
|
| To be sure, there are some tricky things that crop up, such as
| pipeline blockers for queries. However, we do have limits on
| literally everything, so that makes it easier--static
| allocation works best when it's applied virally.
|
| We wanted to go into so much more detail on all this but there
| was only space for so much. Stay tuned.
| pierrebai wrote:
| One of the first claims of the article is plainly false. Static
| allocation does not avoid use-after-free. It just avoids
| accessing freed memory. You can still have bugs and access dead
| data in those static arrays: really dead objects that your
| buggy code still references and accesses.
|
| Most benefits listed have nothing to do with memory allocations.
| For example "having a client open too many connections" can be
| limited by... limiting the number of open connections. Where and
| how that check is done might differ, but it's not related to
| memory allocations.
|
| There are BIG downfalls to static allocations:
|
| - You cannot support varied workloads. If one workload needs a
| lot of X and another needs a lot of Y, well, you already decided
| the maximum numbers of X and Y and cannot re-balance them.
|
| - If you run out of anything you thought was "sufficient",
| your program just fails.
|
| And I know, because I've worked on programs that did static
| initialization at startup, and they failed in just that way. I
| also tested the performance benefits: they were around 12%. So
| you do get some performance out of it, but the added complexity
| of having things fail outright where a normal program would just
| chug along, having to deal with these failures, and the usual
| roll-your-own data structures or libraries is not worth 12%.
| Because, remember, if you go out of your way to have static
| allocations, all 3rd-party libraries have to do it too, otherwise
| the benefit is wasted. Is that worth 12%? IMO, no.
|
| I mean, sure, your program is easier to analyze statically, just
| like if you gave up dynamic threads and instead just pin
| different parts of your program to different CPU core. Sure that
| solves some concurrency problems, but most programs don't do this
| for obvious reasons.
| AdamProut wrote:
| I'm not sure why this is being downvoted? Dividing up memory
| statically at startup/compile time has some nice benefits
| described in the article, but it also makes the database less
| flexible to changing workload requirements. For example, if a
| workload wants to do a big hash join that needs 90% of the
| memory at time X and then a big data load that needs most of
| the memory in a completely different subsystem at time Y - it's
| nice to be able to dynamically shift the memory around to match
| the workload. Doing static up-front allocations seems to block
| this from happening, and that is maybe a fine trade-off for
| TigerBeetleDB based on the types of workloads they're
| optimizing for, but it's definitely a valid disadvantage of
| their approach.
| sakras wrote:
| I like this a lot! I've been trying to find ways to avoid memory
| allocation as much as possible in my projects as well. So far
| I've settled on mmap-ing large slabs at a time and arena
| allocating (Linux's mremap makes growing the allocation really
| efficient too!).
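|
| A minimal sketch of the mremap trick (Linux-specific; error
| handling elided):
|
|     #define _GNU_SOURCE  /* for mremap */
|     #include <sys/mman.h>
|
|     /* Grow an anonymous arena; the kernel extends the
|        mapping in place when it can, instead of copying. */
|     void *arena_grow(void *base, size_t old_size,
|                      size_t new_size) {
|         return mremap(base, old_size, new_size,
|                       MREMAP_MAYMOVE);
|     }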
|
| However, come to think of it, just using a fixed amount is just as
| good if not better, since I'm always going to be constrained by
| memory in some way anyway. And if I allocate e.g. 80% of the
| system's memory upfront, it won't _actually_ be mapped by the OS
| until I touch it. There may be some interesting stuff in using
| huge pages for the first few pages and smaller and smaller
| pages as you approach your memory limit. Definitely will give
| this a try!
| vbezhenar wrote:
| > if I allocate e.g. 80% of the system's memory upfront, it
| won't _actually_ be mapped by the OS until I touch it
|
| What about people running vm.overcommit_memory=2?
| enedil wrote:
| Yes, this approach is followed as well, but now you're at the
| problem of writing a memory allocator yourself (which might
| of course turn out to be better for your usecase). So the issue
| of writing code that avoids allocation still prevails, as now
| you probably want to know what to allocate memory for, since if
| you set out to make a general-purpose allocator, it is
| likely it won't outperform libc malloc.
| sakras wrote:
| I agree on some level, but I think if you're intentional with
| your design you can get by with a very simple arena allocator
| where an allocation is a single subtraction from a pointer.
|
| In my experience, most programs have a few long-lived data
| structures that can be initialized up front, and then lots of
| small data structures that are created and destroyed
| frequently. If you do it right, you can put the long-lived
| guys at the beginning of the arena and then just wiggle your
| arena head pointer back and forth for the small guys.
|
| This does take some more thought and consideration, and is
| definitely a bit trickier than just using malloc when you
| need it, but for me the trade off is worth it (for one I gain
| some amount of satisfaction from imagining my data all neatly
| in a row).
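|
| A sketch of that "single subtraction" arena in C (the
| alignment and sizes are illustrative):
|
|     #include <stddef.h>
|     #include <stdint.h>
|
|     #define ARENA_SIZE (1 << 20)
|
|     static unsigned char arena[ARENA_SIZE];
|     static unsigned char *top = arena + ARENA_SIZE;
|
|     /* Allocation bumps `top` downward: a subtraction
|        plus an alignment mask. */
|     void *bump_alloc(size_t n) {
|         uintptr_t t = (uintptr_t)top;
|         if (n > t - (uintptr_t)arena) return NULL;
|         uintptr_t p = (t - n) & ~(uintptr_t)15;
|         if (p < (uintptr_t)arena) return NULL;
|         top = (unsigned char *)p;
|         return top;
|     }
|
|     /* Freeing the short-lived guys is just wiggling the
|        pointer back. */
|     void bump_reset(void) { top = arena + ARENA_SIZE; }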
| [deleted]
| [deleted]
| hinkley wrote:
| Listen to me now and hear me later:
|
| Static allocation doesn't eliminate use-after-free bugs, it just
| reduces them.
|
| If you have mutable data structures, it's still always possible
| to refer to something that used to have one meaning and now means
| something else. That's still a use-after-free bug.
|
| I was going to say that it's similar to how GC can still have
| memory leaks, but in fact it's the exact same problem:
| reachability of data that is no longer relevant/correct.
| slaymaker1907 wrote:
| Yep, I was thinking along the same lines. What if the database
| kept a reference to a connection for some timer or something
| and the connection memory later gets used for a different
| connection? GC languages don't have this problem because you
| would just allocate a fresh connection each time and the
| language itself guarantees that said physical memory is not
| reused unless it is unreachable.
| notacoward wrote:
| Even resource leaks are still possible. Marking an object in a
| static array as used is still allocation. Forget to mark it as
| unused later and it still remains unavailable. Do that enough
| times and you'll still run out. What this approach does is
| eliminate _unconstrained_ memory use. Leak-related failures
| will tend to surface earlier and more predictably, instead of
| at some distant unpredictable time when the system can't
| satisfy your program's infinite appetite. And there are
| resources besides memory.
|
| This approach is still useful, but it's no panacea. It can even
| leave various kinds of capacity unused, or it can become a
| tuning nightmare if you try to avoid that. It has long been
| common in embedded (as others have pointed out) due to the
| requirements and tradeoffs characteristic of that space, but
| for general-purpose computation it might not be the tradeoff
| you want to make when other techniques can solve the same
| problem without some of the drawbacks.
| anthomtb wrote:
| Pedantically, you could say static allocations eliminate use-
| after-free() bugs.
| hinkley wrote:
| use after free is really about the consequences of use-after-
| reuse. The bug is always there, but you don't notice the
| behavior until someone changes the data out from under you.
| Until someone reuses the memory allocation, your app appears
| to be working properly.
| the_mitsuhiko wrote:
| Based on my limited experience with using arenas and
| preallocated memory I find debugging memory issues
| significantly harder since it's super easy to accidentally
| reuse allocations and only notice much later.
| hinkley wrote:
| The first pointer arithmetic bug I fixed, I only narrowed it
| down once I realized that the function was getting garbage
| data. Several others since have been the same way. This data
| structure has a value that can't possibly be arrived at,
| therefore this data is fake.
|
| When you point to real, dead objects, the bug can be anywhere
| or everywhere. If you have a pattern of using impure
| functions everywhere this can be a nightmare to identify and
| fix at scale.
| avgcorrection wrote:
| Sounds like there should be some additional metadata memory
| summed in as well (into the total allocation at startup) to
| check for such issues.
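|
| One common form of that metadata is a per-slot generation
| counter: hand out (index, generation) pairs instead of raw
| pointers, and bump the counter on free. A C sketch:
|
|     #include <assert.h>
|     #include <stdint.h>
|
|     struct slot { uint32_t gen; int value; };
|     struct handle { uint32_t index, gen; };
|
|     static struct slot slots[256];
|
|     struct handle slot_acquire(uint32_t i) {
|         return (struct handle){ i, slots[i].gen };
|     }
|
|     void slot_release(uint32_t i) {
|         slots[i].gen++;  /* invalidate old handles */
|     }
|
|     int *slot_get(struct handle h) {
|         /* A stale handle's generation no longer matches,
|            so accidental reuse trips the assert. */
|         assert(slots[h.index].gen == h.gen);
|         return &slots[h.index].value;
|     }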
| gamegoblin wrote:
| At the risk of shilling for Big Rust (or at least any language
| with ownership semantics, really) -- well-designed Rust code
| can truly eliminate use-after-free. I was the lead engineer on
| a project that wrote an entire filesystem from scratch in Rust,
| and we did exactly this. 100K lines of code without any use-
| after-free bugs in the whole development process.
|
| One of the first bits of code we wrote in the project was a
| statically allocated pool of buffers such that the only way to
| free buffers back to the pool is for all references to that
| buffer being dropped.
| mytherin wrote:
| If anything, it might make use-after-frees more painful. Now
| the address sanitizer won't help you find them.
| sillysaurusx wrote:
| That jumped out to me too. The point of use after free is that
| your program has a logic error. The logic error doesn't go away
| just because it doesn't crash.
|
| There's a lot of wisdom in this post, but that made me wince.
| It's a bit like saying "Our system doesn't crash when we
| dereference a null pointer." Well, cool. What could possibly go
| wrong.
|
| Still, I think it's best to just ignore that line and read the
| rest of the post. It's an unfortunate distraction to the core
| idea (which does have real value). For example:
|
| > here is how we calculate the amount of memory needed to store
| messages for TigerBeetle's consensus protocol: <snip>
|
| > And if we get the calculation wrong? Well, since we enforce
| static allocation, we can check these calculations with
| assertions (i.e. that a free message is always available). If
| we then get the static allocation calculation wrong, this can
| be surfaced sooner through fuzzing, rather than having no
| limits and eventual resource exhaustion (and cascading
| failure!) in production. When combined with assertions, static
| allocation is a force multiplier for fuzzing!
|
| Being able to put a hard upper bound on the amount of memory
| your system can use is generally a sign of good design. The
| alternatives are somewhat messy but still work. (The HN clone I
| maintain https://www.laarc.io/ runs via a while-loop bash
| script, because it dies via OOM every couple weeks. But it
| works great, since every time it dies it just restarts.)
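|
| A toy illustration of that pattern (the sizing formula and
| names here are made up, not TigerBeetle's actual
| calculation):
|
|     #include <assert.h>
|     #include <stddef.h>
|
|     /* Suppose each of CLIENTS peers can have at most
|        PIPELINE messages in flight. */
|     enum {
|         CLIENTS = 32,
|         PIPELINE = 8,
|         MESSAGES_MAX = CLIENTS * PIPELINE
|     };
|
|     struct message { unsigned char bytes[4096]; };
|
|     static struct message pool[MESSAGES_MAX];
|     static struct message *free_list[MESSAGES_MAX];
|     static size_t free_count;
|
|     void pool_init(void) {
|         for (size_t i = 0; i < MESSAGES_MAX; i++)
|             free_list[i] = &pool[i];
|         free_count = MESSAGES_MAX;
|     }
|
|     struct message *message_acquire(void) {
|         /* If the sizing calculation is wrong, fuzzing
|            trips this assert long before production. */
|         assert(free_count > 0);
|         return free_list[--free_count];
|     }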
| jorangreef wrote:
| Thanks!
|
| Joran from the TigerBeetle team here.
|
| Appreciate your balanced comment.
|
| To be fair, we're certainly concerned about logic errors and
| buffer bleeds. The philosophy in TigerBeetle is always to
| downgrade a worse bug to a lesser one. For example, if it's a
| choice between correctness and liveness, we'll downgrade the
| potential correctness bug to a crash.
|
| In the specific case of message buffer reuse here, our last
| line of defense then is also TigerBeetle's assertions, hash
| chains and checksums. These exhaustively check all function
| pre/post-conditions, arguments, processing steps and return
| values. The assertion-function ratio is then also tracked for
| coverage, especially in critical sections like our consensus
| or storage engine.
|
| So--apologies for the wince! I feel it too, this would
| certainly be a nasty bug if it were to happen.
| the_mitsuhiko wrote:
| So you only reuse memory for objects with a checksum?
| Buffer bleed is scary if exploitable (see heartbleed) and
| I'm curious how you are protecting against it in practice.
| jorangreef wrote:
| It's defense-in-depth.
|
| We use what we have available, according to the context:
| checksums, assertions, hash chains. You can't always use
| every technique. But anything that can possibly be
| verified online, we do.
|
| Buffer bleeds also terrify me. In fact, I worked on
| static analysis tooling to detect zero day buffer bleed
| exploits in the Zip file format [1].
|
| However, to be clear, the heart of a bleed is a logic
| error, and therefore even memory safe languages such as
| JavaScript can be vulnerable.
|
| [1] https://news.ycombinator.com/item?id=31852389
| 0xbadcafebee wrote:
| For the first 5 years that I wrote C, I never used dynamic
| memory. (It was a hobby.) Later I found that dynamic memory
| allowed me to perform more operations. By only allocating what I
| needed, I could process more operations in the same amount of
| memory. I tripled the amount of operations due to not needing to
| restrict myself to an arbitrary size.
|
| The kernel has a feature that takes advantage of this same
| principle: memory overcommit. Programs are always asking for way
| more memory than they actually need. The kernel lies to those
| programs, and says, "yes, yes, I'll give you all that memory...
| sure thing..." even if it already said that to every other
| program. This allows more programs to continue operating since
| they aren't really using all that memory. When the system _does_
| run out of memory, it can swap rarely used memory out to a swap
| file; when that doesn't work, the application can receive an
| error, or the OOM killer can just kill the process. The result is
| more programs doing more work, and an occasional app crash.
|
| Virtual machine hypervisors do a similar trick with memory
| ballooning. If the guest OS has a balloon driver, the hypervisor
| can overcommit memory for the VM guests, and you can provision
| more VMs than there's actually memory for.
| polio wrote:
| It seems analogous to fractional reserve banking and the free
| economic performance it enables by leaning on a mostly reliable
| statistic about empirical peak concurrent usage of a shared
| resource.
| stephc_int13 wrote:
| In some areas, especially hard real-time/embedded or safety
| critical, dynamic memory allocation is strongly frowned upon, and
| for good reasons.
|
| Even if your application does not have stringent
| safety/predictability requirement, following the principles of
| Data Oriented Design leads naturally to a radically different
| approach to memory management.
|
| I am glad to see this kind of project and these design choices.
| jorangreef wrote:
| Thanks!
|
| Data Oriented Design runs like a river through TigerBeetle--
| it's always on our mind.
|
| By the way, have you seen Andrew Kelley's Handmade Seattle talk
| on Practical DOD? [1]
|
| [1] https://vimeo.com/649009599
| stephc_int13 wrote:
| Yes, I've seen this one, and a few others.
|
| I think DOD should be more like an engineering mindset than a
| strict set of rules.
|
| From my experience it can be impractical/cumbersome,
| especially when prototyping because DOD usually pays off in
| the long run.
| habibur wrote:
| > TigerBeetle is written in Zig
|
| This is the key part.
|
| Zig memory allocation is something like: bookmark, allocate, and
| then free everything after the bookmark when done. Someone pushed
| it to the limit.
| eatonphil wrote:
| Interestingly enough, the first version of TigerBeetle (that
| still followed much of the same design decisions) was written
| in Node! :)
|
| https://github.com/tigerbeetledb/tigerbeetle/blob/main/docs/...
| throwawaymaths wrote:
| *can be. You can do more traditional memory patterns too in
| zig. There is also an allocator in stdlib that directly wraps
| C's malloc. Likewise, if you're writing a plugin for another
| software or FFI for a VM language, you can wrap their
| allocators (if they provide them).
| commandlinefan wrote:
| Without looking through the implementation, I suspect that the
| consequence here is that it pre-allocates way more memory than it
| needs and then does some custom dynamic allocation with that.
| haggy wrote:
| They explain pretty clearly that this is static memory
| allocation. In other words, a calculated amount of (mostly)
| contiguous memory is allocated at startup and used very
| carefully, relying on disk to supply the dynamic nature of
| database load semantics.
|
| https://www.geeksforgeeks.org/difference-between-static-allo...
| francasso wrote:
| Welcome back to the 70s! Jokes aside I think there are use cases
| for this style of memory allocation even today (embedded systems
| for example). I just have never felt the need for it in a DB (or
| rather, I have many more desiderata for a DB than "I don't have
| to worry about OOM").
| Rochus wrote:
| > back to the 70s
|
| The sixties actually; that's how memory was managed in Fortran
| from its inception until the eighties.
| jorangreef wrote:
| Good to be back--Joran from the TigerBeetle team here!
|
| Static allocation does make for some extremely hard guarantees
| on p100 latency. For example, for a batch of 8191 queries, the
| performance plot is like Y=10ms, i.e. a flat line.
|
| And memory doesn't increase as throughput increases--it's just
| another flat line.
|
| I personally find it also to be a fun way of coding, everything
| is explicit and limits are well-defined.
| jokethrowaway wrote:
| Cool article and an interesting concept. It's definitely good to
| think out of the box - but I've rarely had to deal with a
| database OOMing (I assume because they have strategies to write
| to disk before that happens).
|
| There are plenty of strategies nowadays to avoid garbage
| collection and use-after-frees (rust comes to mind), I'm not
| convinced that working with a fixed amount of resources is the
| most operator-friendly approach.
| lewisjoe wrote:
| Can somebody eli5 what this really means? Specifically, how does
| code without dynamic allocation bring about predictability?
| Arelius wrote:
| In addition to the mentioned known maximum memory, it also has
| useful performance characteristics. Global memory allocators,
| by their nature, can create allocation patterns and layouts that
| are very sensitive to things like startup conditions, long
| runtimes, or exact inter-thread execution sequences. This can
| cause items that get allocated near each other in one run to be
| allocated very distantly, or chaotically in the heap, in another
| run, for instance.
|
| Now, given that modern CPUs rely heavily on their cache
| lines/hierarchies, branch predictors, and prefetchers, changes
| in layout can have significant performance implications;
| algorithms that run quickly when things are contiguous in
| memory can run orders of magnitude slower on layouts that are
| more random and have more cache line misses. And those layouts
| are often the result of long run-times in dynamically allocated
| systems, due to accumulated heap fragmentation.
|
| Now, static allocation neither completely solves these
| problems, nor is it the only solution, but it is the case that
| the techniques used for static allocation often have
| tremendous benefit for cache layout, and combined with other
| added benefits, like reduced OOM conditions, it's a powerful
| constraint.
|
| Edit: Not to mention that the previously noted heap
| fragmentation can also cause eventual failure-to-allocate
| conditions, even without true memory exhaustion.
| 0x5345414e wrote:
| It brings predictability because you know you won't run out of
| memory. If everything is statically allocated, the total amount
| of memory the application will use is known at compile time.
| Very important for memory restricted systems.
| 01100011 wrote:
| Dynamic allocation can fail. Memory can become fragmented or
| you could just exceed your budget.
| tannhaeuser wrote:
| What's so special about it? Static heap allocation (without
| malloc) was the norm with COBOL + ISAM; early SQL databases used
| eg ESQL/C with static or quasi-static allocation (by which I mean
| at most one malloc per field for the entire process runtime).
| Process-per-request was (and is) recommended practice if you
| value stability and security above all when using languages that
| don't have GC and/or have issues with memory fragmentation.
| felipelalli wrote:
| I love/hate the bad humor of HN.
| qwertox wrote:
| Just today I got presented an article in my "newsfeed" claiming
| that coroutines in C++ are the hottest thing right now in
| algorithmic trading, which I assume also relies on databases.
|
| It left me thinking because of the unpredictability in timing
| which arises from its use, but then again, the article closes
| with the sentence
|
| > Until then, it is important to note that no feature is given
| support by C++ if it has not been fine-tuned and optimized and we
| cannot expect to see total coroutine implementation until at
| least C++26. Nonetheless, if you're an algorithmic trader or C++
| developer, watch this space and watch it very closely.
|
| https://www.efinancialcareers.com/news/2022/10/algorithmic-t...
| valyagolev wrote:
| > Restrict the length of function bodies to reduce the
| probability of poorly structured code. We enforce a hard limit of
| 70 lines per function.
|
| Reminds me of Carmack on the benefits of long functions:
| http://number-none.com/blow/john_carmack_on_inlined_code.htm...
| dktoao wrote:
| "Clever" developer limitation like this almost invariably cause
| developers to come up with even more convoluted solutions to
| problems than needed. Don't disagree that long functions can be
| an indication that someone was copy pasting a bunch of stuff
| around in the codebase, but not always!
| jorangreef wrote:
| Joran from the TigerBeetle team here.
|
| The limit of 70 lines is actually a slight increase beyond
| the 60 line limit imposed by NASA's Power of Ten Rules for
| Safety Critical Software.
|
| In my experience, in every instance where we've refactored an
| overlong function, the result has almost always been safer.
| avgcorrection wrote:
| 70 lines is generous compared to Clean Code dogma/opinion.
| Gibbon1 wrote:
| Obligatory Hank Hill. I'll tell you hwhat. I've seen some
| poorly structured code with nothing but small functions. Like a
| state machine implemented via chains of callbacks that could
| all just live inside one big switch statement.
___________________________________________________________________
(page generated 2022-10-13 23:00 UTC)