[HN Gopher] What every systems programmer should know about conc...
___________________________________________________________________
What every systems programmer should know about concurrency (2020)
[pdf]
Author : ibobev
Score : 162 points
Date : 2024-12-12 21:46 UTC (1 days ago)
(HTM) web link (assets.bitbashing.io)
(TXT) w3m dump (assets.bitbashing.io)
| anacrolix wrote:
| Relaxed atomic operations
| mwkaufma wrote:
| That's section 10.2 of the linked article, yes.
| tobyhinloopen wrote:
| Every time I dip my toes in multithreading in C++, I just get
| into a world of random crashes and segfaults. It seems
| unreasonably hard, hah.
| saagarjha wrote:
| Concurrency is hard, unfortunately. C++ just makes it even
| harder than that.
| jazzypants wrote:
| I hear you have to be pretty tall to be able to do it right.
| [1]
|
| [1] https://bholley.net/blog/2015/must-be-this-tall-to-write-
| mul...
| menaerus wrote:
| CPU design makes it inherently hard. C or C++ is just a thin
| layer above it making no tradeoffs. If you can live with the
| tradeoff then Rust land or VM-language land is more
| appropriate.
| sumtechguy wrote:
| We had multi CPU stuff in the 90s. C/C++ was dead set on
| ignoring it for a long time. Every OS had its own way of
| handling it and all with subtle weird bits that did not act
| like other OS's. You could not even trust the stdlib or crt
| to get it right with their globals that could change
| underneath you.
|
| So it was left to the developer. It is much better now but
| for so long the problem was ignored and now we have decades
| of 'not sure if I can touch that code'. Also by default C/C++
| are fairly open about sharing memory. So it is very easy
| create race conditions on memory. It would be nice if the
| base language had a concept of 'locked to a thread' or 'i
| want to share this with other threads' then the compiler can
| flag where I have wandered into the weeds for a class so we
| could catch the race conditions at compile time, at least
| some sort of warning.
|
| Sharing semantics were awful for a long time. stdlib has done
| some very good things to help clean that up but it is still
| very easy to share between threads and cause yourself a
| headache.
| pionaryon wrote:
| You can get pretty far with keeping sharing to an absolute
| minimum and when you do need to share data, slap a lock free
| ringbuffer between them to communicate. Pretty simple to get
| right.
| anal_reactor wrote:
| Step 1: use concurrency on a very high level. For example,
| write an app, and then run 4 instances of it, each working on
| a quarter of your data.
|
| Step 2: when you absolutely need concurrency within one app,
| try using some library that already solved the issue. For
| example, there are lots of databases with pretty strong
| concurrency contracts while still being efficient.
|
| Step 3: if you absolutely need custom solution, use a
| library/language that provides reasonable tools out of the
| box and keep your logic to minimum.
|
| Following the first two steps will solve 95% of your
| concurrency issues, if you include step 3 it goes to 99%.
| adrian_b wrote:
| You are right, but what you call "lock free" is not the same
| as many other things that are called "lock free", even if
| indeed a ringbuffer needs no locks, so this may be confusing
| for newbies.
|
| I strongly dislike the term "lock free", which is really just
| a marketing term invented by people trying to promote the
| idea that some algorithms are better than those "lock-based",
| when in fact those "lock-free" algorithms were only choosing
| a different trade-off in performance, which can be better or
| worse, depending on the application.
|
| Even worse is that after the term "lock free" has become
| fashionable, it has also been applied to unrelated
| algorithms, so now it has become ambiguous, so you cannot
| know for sure what is meant by it, unless more details are
| provided.
|
| When accessing shared data structures, the accesses are most
| frequently done in one of three ways.
|
| The first is to use mutual exclusion, when the shared data
| structure is accessed within critical sections and only one
| thread can execute the critical section at a given time. This
| method is usually called as lock-based access.
|
| The second is to use optimistic access, when the shared data
| structure is accessed concurrently by many threads, but they
| are able to detect interference from the other concurrent
| accesses and they retry their accesses in such cases. This is
| what is most frequently referred as "lock free" access.
| Compared to mutual exclusion, this access method may be
| faster in the best cases, but it is much slower in the worst
| cases, so whether this is a good choice depends on the
| application.
|
| The third method happens when it is possible to partition the
| shared resource between the threads that access it
| concurrently, so their concurrent accesses can proceed
| without fear of interference. This partitioning is usually
| possible for arrays and for buffers a.k.a. FIFO memories
| a.k.a. message queues (including one-to-one, many-to-one,
| one-to-many and many-to-many message queues).
|
| So your "lock free ringbuffer" refers to the third method
| from above, which is very different from the "lock free"
| algorithms of the second kind from above.
|
| Whenever concurrent access to partitioned shared resources is
| possible, it is much better than accesses with mutual
| exclusion or optimistic accesses, which require either
| waiting or retrying, both of which are wasting CPU time.
|
| Therefore using correctly-implemented message queues or other
| kinds of shared buffers is usually the best method to achieve
| high levels of concurrency, in comparison with other kinds of
| shared data structures, because it avoids the bottlenecks
| caused by mutual exclusion or optimistic accesses.
| gpderetta wrote:
| FWIW, lock-free is an academic term and it is not really
| about performance.
| adrian_b wrote:
| The fact that it has been coined by some academics in
| their research papers is not in contradiction with the
| fact that it has been chosen exactly like any marketing
| term, to imply that something is better than it really
| is.
|
| The alternative term "optimistic access" describes much
| better the essence of those algorithms, while "lock free"
| attempts to hide their nature and to make them look like
| something that is guaranteed to be better (so receiving
| money for researching them is justified), because locks
| are supposed to be bad.
|
| "Lock free" and "wait free" have been buzzwords that have
| provided subjects for a huge number of research papers in
| the academia, most of which have been useless in
| practice, because the described algorithms have been
| frequently worse than the lock-based algorithms that they
| were supposed to replace.
| revskill wrote:
| Thanks.
| bitbasher wrote:
| If C++ makes it possible to blow your leg off, multithreading
| with C++ makes it possible to blow yourself up, along with half
| your neighborhood and some homes halfway around the country,
| unexplicitly.
| commandersaki wrote:
| Really? I don't share either sentiment. I found
| multithreading in C++ to be pretty mundane, and in fact a lot
| easier than it probably once was with lambdas. You definitely
| need to be on top of the multithreading primitives such as
| atomics, mutex, condition variables, shared_ptr, etc. But
| otherwise it's pretty straightforward.
| crabbone wrote:
| One of the problems I'm usually facing with this is the
| need to call library functions from C++ code. Especially
| TLS-related stuff. It doesn't matter what primitives C++
| has to offer, as long as you are using library code things
| will break.
| mattgreenrocks wrote:
| The standard library concurrency primitives are still too low-
| level for a lot of general purpose concurrency needs. IMO, the
| minimum level of abstraction that you should start with for
| many apps is a thread-safe queue that dispatches messages with
| immutable data to one or more worker threads (e.g. actor-like
| model). You can go lower-level, but it needs to be deliberately
| chosen.
| Jtsummers wrote:
| People keep reinventing those and thread pools over and over
| in C++. I've been researching one of our older systems
| (slated for decommissioning in 2021, as is typical that did
| not happen). In trying to understand it I have found many
| areas of concern around how they deal with concurrency, in
| particular they created their own queue and thread pool.
| Based on past experience, there's 50/50 chance for each that
| they were created correctly (with proper concurrency
| controls), and less than that that the tasks submitted to the
| thread pool themselves make use of proper concurrency
| controls rather than assuming that they can read/write
| whatever they want as if the system were single-threaded.
| rramadass wrote:
| That is just not true and you are being unnecessarily
| hyperbolic. Even when i was learning/doing concurrency/multi-
| threading (using pthreads) long ago, i never got "into a world
| of random crashes and segfaults". It was of course challenging
| but not too difficult. You structure your application following
| standard usage patterns given in a good book (eg. Butenhof's)
| and then plug in your app logic in the thread routines. With
| some experience things get clearer over time and you begin to
| have an intuitive feel for structuring multi-threaded code. The
| key is to stay at a high enough level of abstraction
| appropriate for your usecase (eg. mutexes/semaphores/condition
| variables) before diving into compiler and hardware level
| intrinsics/atomics/etc.
|
| A good book to study and get a handle on all aspects of
| Concurrent Programming is _Foundations of Multithreaded,
| Parallel, and Distributed Programming by Gregory Andrews_ -
| https://www2.cs.arizona.edu/~greg/mpdbook/
| nocman wrote:
| In fairness the person you were responding to was referring
| to their own personal experience. They certainly are not the
| first person to conclude that doing non-trivial concurrent
| programming is too difficult for them. I agree that it is
| achievable with an appropriate level of care and experience,
| but I know there are many very smart people that conclude
| that multithreaded programming in C++ is too difficult for
| their taste.
|
| Even Rich Hickey, when discussing concurrency in Java/C#/C++
| said "I am tired of trying to get right, because it is simply
| far too difficult."
|
| > In particular, talk about shared state, how we do it today,
| how you do it in C#, Java, or C++, what happens when we get
| into a multi-threaded context, and specifically what are some
| of the current solutions for that in those spaces. Locking,
| in particular. That is something I have done a lot of, over a
| long period of time, and I am tired of trying to get right,
| because it is simply far too difficult.
|
| ( https://github.com/matthiasn/talk-
| transcripts/blob/master/Hi... )
| neonsunset wrote:
| > Java/C#/C++
|
| These three have different memory models, concurrency
| primitives and adopted practices...it's odd that they are
| lobbed together.
| nocman wrote:
| It's not odd at all. The underlying problems to be solved
| are identical.
| nocman wrote:
| Also, Rich has many years of experience using all three
| of those languages.
| neonsunset wrote:
| The two out of three have evolved in this area since.
| itronitron wrote:
| what is a systems programmer?
| asgeir wrote:
| It often refers to programmers that develop things that other
| programmers use to construct applications. A systems programmer
| may work on compilers, operating systems, framework libraries,
| etc.
|
| They will also often work with lower-level abstractions, such
| as threads, so that the higher-level developers need not think
| about them in their own designs.
| covofeee wrote:
| Fuck ... according to that I am a systems programmer. Not the
| OS/Compiler kind!
| sethammons wrote:
| Realized I was a systems developer some 15 years ago and
| underpaid. Negotiated a 33% raise
| montalbano wrote:
| My definition:
|
| A person who writes software that is closely bound to the
| operating system interface and/or hardware for reasons of
| performance and/or functionality.
| lacedeconstruct wrote:
| I think its more of a person who builds "systems" for other
| programmers to use and work within
| montalbano wrote:
| That's a valid broader definition but in practice I find it
| too broad to be very useful.
| skitter wrote:
| A miserable pile of secrets.
| mdaniel wrote:
| I don't have enough upvotes to adequately reward you
| crabbone wrote:
| For the purpose of job interviews (and not necessarily any kind
| of sound definition), this means that the position you are
| looking at will have you dealing with software that implements
| some operating system services.
|
| To give you some examples: you might be working on an SDS
| product (software-defined storage), perhaps a filesystem, or
| maybe a network-attached block device etc. that operating
| system has a way of interacting with and exposing to other
| software as it's own service. To make it even more concrete:
| you can write a filesystem in a way that it interfaces with
| Linux kernel, and, once installed, users will interact with it
| through VFS interface (so, they don't even have to know they
| are interacting with your specific product). ZFS is a good
| example of this kind of software.
|
| Similarly, it could be SDN (network), eg. various VPN software
| s.a. OpenVPN. Or a users / permissions management software (eg.
| LDAP server, s.a. slapd). Software to manage system services
| (eg. systemd). Software to manage specialized hardware (on top
| of drivers) that, again, provides its services through the
| operating system (eg. nvidia-smi). Or, maybe it's a software
| that manages memory, terminals, virtualization, containers (eg.
| DMA over IB, tmux, QEMU, containerd).
|
| Sometimes this category is extended to system monitoring or
| higher-level management tools that still give some kind of
| general service similar to the one provided by the operating
| system. Think about SAR or mdadm kind of tools. Or, this could
| also be testing of operating system services (eg. FIO).
|
| Sometimes, this can also mean some kind of administrative tools
| that perform either more high-level management of operating
| system (eg. yum, the RHEL package manager), or management of
| groups of operating systems. Cloud tools can fall into this
| category (eg. boto3 AWS client). But, this is kind of
| stretching the definition.
|
| ----
|
| So, as per usual in the field of programming... we don't have
| an authoritative source and a definition for a widely used
| term. You just have to accept (at least for now), that
| different people mean different (but often significantly
| overlapping) things when they use the term and ask more
| questions to refine what they mean by it.
| LAC-Tech wrote:
| The high priest of a low cult
| mattgreenrocks wrote:
| I cannot fault this remark, however biting it may be.
| LAC-Tech wrote:
| https://en.wikipedia.org/wiki/Robert_S._Barton
|
| completely stolen :)
| zifpanachr23 wrote:
| These days? A programmer that does anything but CRUD for web
| apps.
|
| In the before times the terminology came from system 360 and
| descendent mainframe systems where the system programmers
| tended to be intimately involved with the deployment and
| programming of the operating system and associated "exits", and
| such tasks tended to presume some working knowledge of assembly
| language (operating systems were very crude at that time...and
| many shops tended to "customize" them if that makes sense...you
| would sometimes have serious problems with operating system
| level things and at that time the generic response would be
| "contact your systems programmer", who would either fix it or
| get in touch with IBM...so TLDR is basically a sysadmin that is
| good at assembly). Big companies could afford to have systems
| programmers on staff and needed them because every shop was
| different and computers were very custom and crude at that
| point in history.
|
| That's where the terminology came from originally but now it
| has expanded and can mean different things.
|
| The "systems programmers" terminology was mainly just a
| differentiation from the more generic "application programmers"
| (who would probably program in COBOL or something similar,
| doing business related tasks) or "system operators" (who would
| not usually be programmers at all, and would do a lot of what
| we would now consider sysadmin work).
|
| Nowadays it just means anything that isn't bog standard CRUD or
| frontend web app stuff. The terminology has evolved but has a
| somewhat similar meaning to when it originated. It's tough to
| understand exactly without the historical context which is why
| I have included that.
|
| For obvious reasons, this has some overlap with the more modern
| definition of "programmers that build systems that other
| programmed then use". Because the application programmers on
| those mainframe systems would be relying on the systems
| programmers to not screw up their assembly wizardry or else
| everything would fall apart.
| mdaniel wrote:
| > so TLDR is basically a sysadmin that is good at assembly
|
| So "devops" existed even in the mainframe era! :-D
| woile wrote:
| Is there a maybe ebook friendly version? I'd love to read this in
| the kindle
| BrentOzar wrote:
| Can't you send PDFs to a Kindle? Been a decade since I used
| one, but I vaguely remember that being a thing.
| woile wrote:
| Yes you can, but it's not the best format to read on, and a
| PDF with 2 columns is kind of even harder to read.
| utopcell wrote:
| Pdfs are perfect on the kindle scribe, so long as they are
| not color. This is my primary use for the device. On
| smaller kindles, pdfs are probably a pain to read though.
| bananatype wrote:
| The ebook has a Github repo, here:
|
| https://github.com/mrkline/concurrency-primer
|
| You can fork/clone/download the repo and then convert the ebook
| into Kindle or epub with Pandoc, like so (this is what I run):
|
| pandoc -o concurrency-primer.epub concurrency-primer.tex
|
| (I can share my converted file, but I don't know which file-
| sharing service is permitted on HN. If anyone can give me
| suggestions on this, I will be happy to upload it and share my
| epub file.)
| woile wrote:
| Thanks, yes, I ended up doing the same!
| msanlop wrote:
| Spent 4h yesterday looking for a bug in my atomics, and probably
| 4 more today :^)
| david-gpu wrote:
| Been there. I don't have advice, only commiseration.
| sebtron wrote:
| Last time this happened to me it turned out I made a logic
| error, the algorithm required a lock but I erroneously thought
| atomics would do.
| zelphirkalt wrote:
| Hm. Seems very C++ oriented. Talks about specific keywords of C++
| and their semantics. Not even mentioned "actor" in the whole
| document. "immutable" not found.
|
| Seems rather like an very incomplete view of a C++ programmer,
| wanting to raise it to "systems programmer".
| ajross wrote:
| To be fair most of the consensus-imposed[1] interface to memory
| ordering semantics was developed as part of the C++ memory
| model standardization process. GCC/clang have imported them as
| extensions for C code, etc...
|
| [1] Which is not to say "best". My standard complaint in these
| threads is that the C++ ordering semantics are a mess designed
| to make things easier for compiler writers to target
| architectures with variant ordering guarantees and not for
| algorithm developers to create correct code. _But lockless code
| is much harder to write than compilers!_ It 's a bad trade. ARM
| had this right decades ago with simple read/write barriers,
| which are vastly easier to understand when reading code.
| packetlost wrote:
| Rust also adopted the C/C++ memory ordering semantics, for
| better or worse.
| ajross wrote:
| Everyone has, even the hardware vendors are treating these
| rules as the golden standard to which they can optimize and
| conform. So it's too late. But it's still a shame.
|
| The acquire/release semantics make perfect sense _for
| locks_ , because that's what it was designed for. Designing
| anything else with them is extremely difficult, and in
| practice such work tends to lean heavily on the full-
| barrier "sequentially consistent" primitive instead. The
| older read/write barrier instructions were much simpler.
| (And x86 is simpler still, with just one reordering rule to
| worry about and a single idea of serializing instructions
| to regulate it).
| gpderetta wrote:
| I think it is just a generational/educational difference.
| I learned lock-free programming around the time c+11 was
| being standardized, and the acquire/release model seems
| very natural to me and easier to understand and model
| algorithms in it than the barrier based model.
| menaerus wrote:
| Hardware vendors adhering to C++ memory model to design
| their chips? I think you got it backwards or I
| misunderstood your point. Memory barriers exist because
| of how cache memory hierarchy and coherency protocols are
| implemented in multi-core chips. And not to "optimize and
| conform to C++ memory model".
|
| C++ memory model is there to make life of a programmer
| easier across different micro-architectures because
| memory model of the micro-architecture is what differs.
| immibis wrote:
| C tries to be everything to everyone. It has to be able to
| target every architecture, within certain limits. It took
| them a long time to standardize on two's-complement signed
| integers - they could only do that _after_ the whole industry
| had migrated to two 's-complement signed integers as they are
| the technically best way to represent signed integers.
|
| The x86 is the weird architecture that tried to keep existing
| programs working when they moved to multicore. Most other
| architectures implemented whatever is easier to implement in
| hardware, and compilers have to target that. And the compiler
| can't simulate a strong memory model on top of a weak one,
| because that means lots of overhead for every memory
| operation and that's not what C does.
| cjfd wrote:
| This document seems to be at a lower level. An actor model
| would be implemented in terms of these primitives. These
| primitives are available in many languages.
|
| Immutable is kind of irrelevant in a discussion of multi-
| threading because immutable data has no problems in this
| regard. Which, of course, in a way also makes it relevant
| because one prevents issues. Ultimately one cannot have just
| immutable data because the computer would not be allowed to do
| anything. This document would be about data that cannot be
| immutable.
|
| But the complaint sounds a bit like an article about assembly
| language getting the complaint that it does not explain how to
| implement a word processor.
| adrianN wrote:
| Of course you can have only immutable data, you just need
| suitable monads.
| hyperpape wrote:
| There's disagreement about what exactly makes you a systems
| programmer, but dealing with the reality that memory is
| finite is definitely a requirement.
|
| You need mutation or garbage collection (and systems
| programming is the kind of environment where that's not
| just "someone else's problem").
| kccqzy wrote:
| Plenty of languages have immutable data without monads or
| even being statically typed. Clojure is a fine example. And
| there is also the somewhat popular https://immutable-
| js.com/ that brings the benefits of immutable data to
| JavaScript.
| motorest wrote:
| > Hm. Seems very C++ oriented. Talks about specific keywords of
| C++ and their semantics. Not even mentioned "actor" in the
| whole document. "immutable" not found.
|
| The term "systems programmer" is right there in the title. They
| are also the very first words in the article. The examples are
| in C++, but the whole document focuses on both C and C++. The
| article clearly focuses on the primitives, as it should.
|
| It's undoubtedly a great article on the subject, but nitpickers
| have to nitpick, and one-uppers have to one-up.
|
| I look forward to read your article on the topic.
| mattgreenrocks wrote:
| > Seems rather like an very incomplete view of a C++
| programmer, wanting to raise it to "systems programmer".
|
| I'm not sure I understand this comment. C++ is one of the very
| few languages that can be used for programming everything from
| a web server to a garbage collector to firmware. There's no
| "raising" going on here.
| xslvrxslwt wrote:
| https://medium.com/nerd-for-tech/linus-torvalds-c-is-
| really-...
| swatcoder wrote:
| In this context, "systems programmer" means someone who writes
| software with tight, direct affinity to the underlying system
| (i.e. hardware, OS). C and C++ still dominate that world by
| huge measure.
|
| Presumably, you're thinking of systems in the more abstract
| sense: systems of more or less pure software components that
| model some domain. That's just not what this article is about.
| crabbone wrote:
| C++ is a very rare guest in that world. Not really worth
| mentioning. You could probably find more Python in that world
| than C++. Certainly more Shell than C++.
|
| You could absolutely get a job as a system programmer not
| knowing and never touching C++, but not knowing Shell will
| not get you anywhere in that field.
|
| It doesn't really have to be direct or tight (not even sure
| what that might mean). I'm guessing you were trying to say
| that it has to have minimal overhead compared to
| theoretically possible performance, in which case that's not
| true at all (a lot of Linux services and management tools are
| written in Shell, which just isn't known for great
| performance at all).
| swatcoder wrote:
| If it helps: among other examples, systems programming (as
| used here) characterizes much of what your Python and shell
| scripts _call_ , whether commands or libraries.
|
| C was the overwhelming language of choice for a very long
| time, but C++ made steady and verysignificant inroads in
| the last 20 years. Objective-C and Java had/have niches.
| Rust and Go have traction and continue to gain more. Swift
| is working towards it.
|
| Actors and immutable types, which the GP mentioned, have
| less signifance here because they largely trade transparent
| resource management (which is critically important in many
| of these projects) for correctness guarantees and
| sophisticated domain modeling. That said, we can expect
| there to be more room for them, especially where
| implementations of those features are zero/low-overhead.
| 9999_points wrote:
| Speaking of concurrency, whatever happened to that Parallella
| "super computer" project?
| codetrotter wrote:
| > whatever happened to that Parallella "super computer"
| project?
|
| I had a couple of those. (Well, still do - but not using them
| for anything now. Mine are the ones with 16 cores, that I got
| for backing them on Kickstarter when they did their original
| crowd funding) And I was wondering the same a while ago. This
| guy that was part of Adapteva, makers of Parallella, is working
| on some ASIC stuff now it looks like.
|
| https://github.com/aolofsson
|
| https://scholar.google.com/citations?user=qdDUg48AAAAJ&hl=en
|
| And since Adapteva website also says: "Adapteva is now Zero
| ASIC", I guess maybe some other people that were originally
| doing the Parallella thing are now doing ASIC things with that
| guy too.
|
| https://www.adapteva.com/
|
| The Parallella website is still up.
|
| https://parallella.org/
|
| And you can download the old Ubuntu images for the Parallela
| SBC from the GitHub releases. But they stopped adding anything
| new to it years ago.
|
| https://github.com/parallella/parabuntu/releases
| wslh wrote:
| Having worked with some of the concurrency issues mentioned and
| observed both failures and successes, I'd like to share a
| perspective: developers who struggle most with concurrency often
| lack a formal Computer Science background. This isn't to say that
| a CS degree is the only path to competence, but it provides
| structured exposure to concepts like concurrency through
| coursework, exams, and practical projects. These academic
| challenges can help solidify an understanding that's harder to
| acquire in a fragmentary way
| w10-1 wrote:
| This 2020 review carefully summarizes the overly complex and
| error-prone outcome of programming language memory model changes
| in the early 2010's. It's tempting to master such tricky details,
| but doing so almost inevitably leads to the costliest bugs ever:
| rare data corruption that's impossible to detect and reproduce.
|
| Russ Cox in 2021 offered a higher-level insightful history of
| evolving hardware and software support for concurrency, discussed
| just yesterday
| https://news.ycombinator.com/item?id=42397737
|
| His take (as I take it): systems programmers should only use
| sequentially-consistent atomics (and only via well-tested
| concurrent data structures).
|
| Easy peasy! ;p
___________________________________________________________________
(page generated 2024-12-13 23:01 UTC)