[HN Gopher] Let's write a setjmp
___________________________________________________________________
Let's write a setjmp
Author : jmillikin
Score : 207 points
Date : 2023-02-12 07:52 UTC (15 hours ago)
(HTM) web link (nullprogram.com)
(TXT) w3m dump (nullprogram.com)
| Simran-B wrote:
| Is it considered harmful, and if so, why?
| [deleted]
| pm215 wrote:
| Mostly for the usual "don't reinvent a wheel the language
| standard library already provides" reasons, I think. For
| instance glibc's x86 setjmp/sigsetjmp have been updated to
| support shadow stacks, but if you'd rolled your own you'd have
| to do that yourself:
| https://elixir.bootlin.com/glibc/glibc-2.35/source/sysdeps/i...
| j16sdiz wrote:
| Yes, but the author was trying to avoid using libc.
|
| > Yesterday I wrote that setjmp is handy and that it would be
| nice to have without linking the C standard library.
|
| Related article from the same author.
| https://nullprogram.com/blog/2023/02/11/
| pm215 wrote:
| Personally I think that's throwing the baby out with the
| bathwater for most use cases. You could rephrase my comment
| as "if you're already committed to reinventing half of
| libc's wheels, this one is not really any harder or more
| awkward than most, but if you're not aiming for that
| overall goal then reinventing just this one wheel is a bad
| plan" if you like.
| randomNumber7 wrote:
| It's like goto, but you can jump over function boundaries.
| Checking out the other comments here, it's not considered
| harmful...
| adrian_b wrote:
| "setjmp" and "longjmp" are the mechanism for implementing
| exceptions in C, i.e. they are just another form of writing
| "catch" and "throw".
|
| The implementation of exceptions, i.e. of jumps over multiple
| levels of nested functions and blocks, is much simpler in C
| than in C++, because there are no destructors that must be
| called when unwinding the stack, so it is enough to restore the
| CPU registers to the values correct for the program point where
| the exception must be caught.
|
| This is needed because at the point where the exception is
| thrown it is not known whether any of the nested functions that
| must be skipped has modified any of the registers that a
| function is expected to preserve and where the original values
| of the registers have been saved.
|
| Any kind of exceptions can be easily misused, which is why it
| is recommended to be careful with the use of "setjmp" and
| "longjmp", but they are not more harmful than the use of
| exceptions in any other language.
|
| The only additional problem of C is that since there are no
| implicitly called destructors, like in C++ and similar
| languages, in C the programmer must do what the compiler would
| do in C++.
|
| This means that if the nested functions that are skipped by a
| "longjmp" have allocated heap memory, opened files or sockets
| etc., such resources must be freed in the exception handler
| marked by a "setjmp".
|
| Therefore the C programmer must keep track of the resources
| that might have been allocated in the nested functions, e.g. by
| recording the allocations in some global table.
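A minimal sketch of the catch/throw pattern described above, assuming a single global handler and a hypothetical one-slot "resource table" (`tracked`). The names `parse` and `try_parse` are made up for illustration; real code would track more than one allocation.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

static jmp_buf handler;   /* the "catch" point */
static void *tracked;     /* hypothetical one-slot table of live allocations */

/* Nested function that may "throw" while it still owns heap memory. */
static void parse(int bad)
{
    tracked = malloc(64);
    assert(tracked != NULL);
    if (bad)
        longjmp(handler, 1);   /* "throw": skips the free() below */
    free(tracked);
    tracked = NULL;
}

/* Returns 0 on success, -1 if parse() bailed out via longjmp. */
static int try_parse(int bad)
{
    if (setjmp(handler) == 0) {   /* "try" */
        parse(bad);
        return 0;
    }
    /* "catch": nothing unwinds for us, so release what the callee held */
    free(tracked);
    tracked = NULL;
    return -1;
}
```

Calling `try_parse(1)` takes the handler branch, and the allocation made inside `parse` is released there rather than in `parse` itself.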
| aardvark179 wrote:
| As well as being harder to work with correctly, setjmp and
| longjmp are often a lot slower than other exception handling
| systems as you are paying the full cost of saving lots of
| register state even if an exception isn't thrown. I've seen
| this cause serious performance issues on Windows on x86-64 in
| the past.
|
| It also doesn't really compose well, so if library A is using
| it for exceptions and library B is doing something else
| clever then it's hard to coordinate those two uses.
| yvdriess wrote:
| Can you elaborate? Isn't the c++ exception system built on
| top of setjmp/longjmp?
|
| For setjmp to have a measurable performance impact, you
| have to be calling it an awful lot. It is just a handful of
| mov instructions without dependencies.
| bregma wrote:
| None of the popular C++ vendors implement C++ exceptions
| using setjmp/longjmp on any of the more (and very few of
| the less) common targets. Maintaining C++ language
| runtimes is a part of my day job and I am thankful they
| don't.
| jcranmer wrote:
| > Isn't the c++ exception system built on top of
| setjmp/longjmp?
|
| It is not. The dominant C++ exception system is based on
| "zero-cost exception handling", which refers to the fact
| that you do not call any extra code (like setjmp) until
| an exception is to be thrown. All of that logic is
| instead encapsulated in tables of exception-handling data
| and sophisticated unwind routines that look up those
| tables to figure out where to transfer control-flow to.
|
| > For setjmp to have a measurable performance impact, you
| have to be calling it an awful lot.
|
| If you built the C++ exception system on top of
| setjmp/longjmp, you would have to call setjmp on every
| instance that begins a try block. This includes implicit
| try blocks generated for every object that has a
| nontrivial destructor (which must be called if you unwind
| through the block). So yeah, you would be calling it an
| awful lot...
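To make the cost argument concrete, here is a rough sketch of what a setjmp-based scheme has to do. It is not how any real C++ runtime works; `struct frame`, `top`, `scope_with_cleanup` and the counters are hypothetical names. Every scope that needs cleanup registers its own `jmp_buf`, even when nothing is ever thrown.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

struct frame { jmp_buf env; struct frame *prev; };

static struct frame *top;   /* innermost active handler */
static int setjmp_calls;    /* how many "try" scopes were entered */
static int cleanups_run;    /* how many "destructors" ran */

static void throw_error(void)
{
    assert(top != NULL);    /* sketch: an uncaught throw is not handled */
    longjmp(top->env, 1);
}

static void leaf(int fail) { if (fail) throw_error(); }

/* Stands in for a C++ scope holding an object with a destructor. */
static void scope_with_cleanup(int fail)
{
    struct frame f;
    f.prev = top;
    top = &f;
    setjmp_calls++;
    if (setjmp(f.env) == 0) {
        leaf(fail);
        top = f.prev;
        cleanups_run++;     /* normal-path cleanup */
    } else {
        top = f.prev;
        cleanups_run++;     /* cleanup while unwinding */
        throw_error();      /* rethrow to the enclosing handler */
    }
}

/* Outer "try/catch": returns 1 if an error was caught, 0 otherwise. */
static int run(int fail)
{
    struct frame f;
    int caught;
    f.prev = top;
    top = &f;
    setjmp_calls++;
    if (setjmp(f.env) == 0) {
        scope_with_cleanup(fail);
        caught = 0;
    } else {
        caught = 1;
    }
    top = f.prev;
    return caught;
}
```

Even `run(0)`, where nothing is thrown, pays for two `setjmp` calls. Table-based unwinding avoids exactly that fast-path cost.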
| aardvark179 wrote:
| Depending on the ABI it's saving a lot of registers, and
| it is very easy to end up calling it a lot if you cannot
| guarantee the code that may be called by functions you
| call and you need to do any cleanup.
|
| C++ and other languages tend to make the fast path of no
| exception extremely fast and store additional data about
| functions that allow for stack unwinding when an
| exception is thrown.
| ur-whale wrote:
| > "setjmp" and "longjmp" are the mechanism for implementing
| exceptions in C
|
| Correct, but reductive.
|
| They can be used for many other things.
| someweirdperson wrote:
| > They can be used for many other things.
|
| The underlying primitive offers even more flexibility, but
| exceptions (in the sense of programming languages) can also
| be used for many other things than exceptions (in the sense
| of natural language, to handle exceptional cases).
| bogomipz wrote:
| >"... but exceptions (in the sense of programming
| languages) can also be used for many other things than
| exceptions ..."
|
| Can you elaborate? What are the other ways exceptions are
| used when not used to handle exceptional situations?
| adrian_b wrote:
| The more general POV is that "setjmp" defines a
| continuation and "longjmp" invokes it.
|
| While the most frequent use of explicit continuations is
| for implementing exceptions, you are right that there are
| many other programming techniques based on explicit
| continuations (for instance coroutines).
| rwmj wrote:
| I'm now wondering if anyone has written an actual
| continuation library in C (ie saving and restoring the
| whole stack). Is it possible?
| menaerus wrote:
| Coroutines are one good example.
| yvdriess wrote:
| yes and yes.
| speps wrote:
| I'm not sure you need to use assembly for writing the MSVC
| version, it supports the "naked" calling convention:
| https://learn.microsoft.com/en-us/cpp/cpp/rules-and-limitati...
|
| Similarly, it has the "noreturn" decl spec:
| https://learn.microsoft.com/en-us/cpp/cpp/noreturn
|
| It'd be interesting if that works the same as in GCC for the
| article's objective.
| flohofwoe wrote:
| MSVC doesn't support inline assembly for x86-64 (the declspecs
| are just hinting the compiler to not generate function
| entry/exit code and that it shouldn't be confused by the
| function not returning to the caller - not sure if
| declspec(naked) even works on x86-64 because I think it only
| makes sense with inline assembly).
| pjmlp wrote:
| Besides external Assembler, MSVC's way is to use intrinsics.
|
| https://learn.microsoft.com/en-us/cpp/intrinsics/compiler-in...
|
| Which is kind of nice and already proven in 1961 as a better way
| to do low level coding in Burroughs B5000, nowadays still sold as
| Unisys ClearPath MCP (naturally with improvements).
|
| Intrinsics can be better understood by the compiler type system,
| both for safety and optimization algorithms.
| flohofwoe wrote:
| Do the intrinsics allow to load and store registers from and to
| memory? That's kind of the whole point why assembly is
| required.
|
| Most cross-platform co-routine libraries I've seen for C or
| C++ use a small separate assembly file for the stack switching
| magic on MSVC, so if it's possible to do the same with
| intrinsics, then it's definitely not a well known technique ;)
|
| PS: at least the list of cross-platform intrinsics looks way
| too high level for a self-rolled setjmp/longjmp:
| https://learn.microsoft.com/en-us/cpp/intrinsics/intrinsics-...
| megous wrote:
| setjmp/longjmp don't switch stacks (they just "move" up the
| current stack)
| fanf2 wrote:
| This is fun :-) I was pleased to learn about __builtin_longjmp.
| There's a small aside in this article about the signal mask,
| which skates past another horrible abyss - which might even make
| it sensible to DIY longjmp.
|
| Some of the nastiness can be seen in the POSIX rationale for
| sigsetjmp
| (https://pubs.opengroup.org/onlinepubs/9699919799.2018edition...)
| which says that on BSD-like systems, setjmp and _setjmp
| correspond to sigsetjmp and setjmp on System V Unixes. The effect
| is that setjmp might or might not involve a system call to adjust
| the signal mask. The syscall overhead might be OK for exceptional
| error recovery, such as the arena out of memory example, but it's
| likely to be more troublesome if you are implementing coroutines.
|
| But why would they need to mess with the signal mask? Well, if
| you are using BSD-style signals or you are using sigaction
| correctly, a signal handler will run with its signal masked. If
| you decide to longjmp out of the handler, you also need to take
| care to unmask the signal. On BSD-like systems, longjmp does that
| for you.
|
| The problem is that longjmp out of a signal handler is basically
| impossible to do correctly. (There's a whole flamewar in the wg14
| committee documents on this subject.) So this is another example
| of libc being optimized for the unusual, broken case at the cost
| of the typical case.
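The signal-mask behaviour can be observed directly with POSIX `sigsetjmp`/`siglongjmp`, where the caller chooses whether the mask is saved. A small sketch, assuming a POSIX system; `mask_restored` is a made-up helper and SIGUSR1 is an arbitrary choice of signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

/* Returns 1 if siglongjmp put the signal mask back the way it was
 * when sigsetjmp was called, 0 if the later change survived the jump. */
static int mask_restored(int savemask)
{
    sigjmp_buf env;
    sigset_t usr1, cur;
    int blocked;

    sigemptyset(&usr1);
    sigaddset(&usr1, SIGUSR1);
    sigprocmask(SIG_UNBLOCK, &usr1, NULL);   /* start with SIGUSR1 unblocked */

    if (sigsetjmp(env, savemask) == 0) {
        sigprocmask(SIG_BLOCK, &usr1, NULL); /* change the mask, then jump */
        siglongjmp(env, 1);
    }

    sigprocmask(SIG_SETMASK, NULL, &cur);    /* query the current mask */
    blocked = sigismember(&cur, SIGUSR1);
    assert(blocked == 0 || blocked == 1);
    sigprocmask(SIG_UNBLOCK, &usr1, NULL);   /* leave things as we found them */
    return !blocked;
}
```

With a non-zero `savemask` the mask comes back; with zero it does not, which is the cheaper variant that avoids the extra mask handling.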
| flykespice wrote:
| I didn't know of setjmp usage until I studied Lua's source code;
| they use it everywhere as a replacement for exceptions in C.
| stevefan1999 wrote:
| I wonder if setjmp/longjmp can be implemented in hardware, i.e.
| introduce instructions that save and switch the current set of
| registers (a register window), so that switching is just a matter
| of saving and swapping a pointer.
|
| This way both setjmp and longjmp are basically few cycles and
| exception handling would be hella fast.
| CalChris wrote:
| The DEC VAX-11 instruction set had SVPCTX and LDPCTX for the
| kernel which made context switch simple. But they were reserved
| instructions; so they couldn't be used for _setjmp()_ and
| _longjmp()_. VAX-11 also had queue instructions which made
| rescheduling simple. This was basically:
|
|     SETIPL
|     SVPCTX
|     INSQUE
|     FFS
|     REMQUE
|     LDPCTX
|     SETIPL
|
| Yeah, there was a little more. See page 188 of ...
| http://bitsavers.informatik.uni-stuttgart.de/pdf/dec/vax/vms...
| snarfy wrote:
| Intel has e.g. push all/pop all instructions. It doesn't save
| everything needed for setjmp/longjmp (like fpu state) but does
| a decent job.
| smcameron wrote:
| > The solution is to qualify r with volatile, which forces the
| compiler to store the variable on the stack and never cache it in
| a variable.
|
| Should the last word of that sentence be "register"?
| magicalhippo wrote:
| This takes me back to the late 90s when I was a teen and decided
| to try to implement preemptive multi-threading in Turbo Pascal.
| I'd gotten a book on assembly so was somewhat versed in that, and
| took on the challenge.
|
| Having to save and restore the registers was obvious, but as I
| recall it the main challenge was figuring out the right sequence
| to do that while also preserving the CPU flags.
|
| I only used the timer interrupt to switch between threads in a
| round-robin way, so interactivity wasn't the best, but it worked.
| Was quite pleased with myself for accomplishing that.
|
| I think it was a pretty decent challenge, and one that was quite
| instructive. Not just did you need to know some details about the
| various CPU instructions (like which ones affected flags), but it
| was also a kind of a puzzle to arrange it all the right way.
| FpUser wrote:
| Ha, I did exactly the same in exactly the same language, but that
| was at the end of the 80s. Interactivity was just fine, as I'd
| increased the frequency of the timer interrupt to 4096 times/s and
| would pass control to the standard handler every so often so it
| would still run as standard at about 18 times/s. When not calling
| the old interrupt I would execute the thread switch and control logic.
| Turbo Pascal supported built-in assembly, so writing all the low
| level stuff was a piece of cake. I also used built-in assembly to
| implement graphics.
| thdespou wrote:
| This is probably the geekiest article I can digest as a Front end
| engineer.
| jagrsw wrote:
| While some programs/libs (most notably libjpeg) use
| setjmp/longjmp, I tend to avoid them:
|
| * Does setjmp/longjmp save/restore all registers (e.g. SIMD) of
|   modern CPU variations? Can inconsistencies happen here?
| * As a goto, it restores some context (registers) but not others
|   (memory), which can be counterintuitive. Does it work with
|   volatile variables?
| * Does it play well with less standard C features like
|   attribute(cleanup) or with C++ features like exception handling
|   or class destructors? To avoid issues, it may be best to stick
|   to very procedural and basic C when using setjmp/longjmp.
| * Besides registers and memory, there are also IO, FS, and Net (at
|   least) contexts, and these are not restored (they cannot be,
|   really), so this notion of "restore certain vars to their
|   original states" might not work well with certain types of code.
|
| Because some programming gods recommend using setjmp/longjmp, my
| hesitance is likely unfounded.
| Findecanor wrote:
| The only local variables you can count on being preserved are
| "volatile" variables initialised before the setjmp() call, and
| then not changed.
|
| That's all the guarantee the C standard gives. Anything else is
| specific to compiler/OS.
| adrian_b wrote:
| As you have phrased them, the standard guarantees seem weaker
| than they are actually specified in the C standard.
|
| Where a "longjmp" returns after a "setjmp", everything is
| preserved exactly like after a normal function call
| (including the registers specified by the ABI),
|
| "except that the values of objects of automatic storage
| duration that are local to the function containing the
| invocation of the corresponding setjmp macro that do not have
| volatile-qualified type and have been changed between the
| setjmp invocation and longjmp call are indeterminate."
|
| For the most frequent use case, when "longjmp" is used to
| throw exceptions, the only local variables that can become
| indeterminate, according to the standard, are those that have
| been passed as arguments to functions invoked inside the C
| equivalent of a "try" block, or which have been explicitly
| modified in another way there. Any local variable that is not
| assigned to or passed as an argument there is preserved.
|
| The standard behavior is normal, because where a "longjmp"
| returns it is not known where the execution of the nested
| functions has been aborted and whether any of their output
| parameters already store their expected final values or only
| some unpredictable temporary values.
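A small sketch of the rule quoted above. A local that is modified between the `setjmp` call and the `longjmp` has to be `volatile` to have a determinate value afterwards, while a local that is left alone needs no qualifier. `count_attempts` and `bail` are hypothetical names.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf env;

static void bail(void) { longjmp(env, 1); }

/* Retries via longjmp until the limit is reached; returns the count. */
static int count_attempts(void)
{
    volatile int attempts = 0;  /* modified after setjmp: must be volatile */
    int limit = 3;              /* never modified after setjmp: preserved */

    (void)setjmp(env);          /* every bail() resumes execution here */

    attempts = attempts + 1;
    assert(attempts <= limit);
    if (attempts < limit)
        bail();
    return attempts;
}
```

Without the `volatile`, a compiler is allowed to keep `attempts` in a register whose value is rolled back by `longjmp`, so the loop may never terminate.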
| adrian_b wrote:
| "setjmp/longjmp" must save/restore only those registers that
| are defined by the C ABI as being preserved across function
| calls.
|
| The program point to where "longjmp" jumps is viewed by the
| compiler as a function return point, so the compiler assumes
| that all the other registers hold undefined values.
|
| For most CPUs, the C ABI defines the SIMD registers as being
| _not_ preserved across function calls, so "setjmp" does not
| need to save them. Had some of them been defined as preserved
| registers, "setjmp" would have also saved those.
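As an illustration of "only the preserved registers", a hand-rolled buffer for the System V x86-64 ABI could look roughly like the sketch below. The struct name and field order are hypothetical and are not glibc's actual `jmp_buf` layout. The saved set is the callee-saved integer registers plus the stack pointer and the address to resume at.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout for a self-made setjmp on System V x86-64. */
struct sketch_jmp_buf {
    uint64_t rbx, rbp;            /* callee-saved general-purpose registers */
    uint64_t r12, r13, r14, r15;  /* callee-saved general-purpose registers */
    uint64_t rsp;                 /* stack pointer at the setjmp call */
    uint64_t rip;                 /* return address to jump back to */
};

/* Eight 64-bit slots, no SIMD state: 64 bytes in total. */
static_assert(sizeof(struct sketch_jmp_buf) == 8 * sizeof(uint64_t),
              "unexpected padding in sketch_jmp_buf");
```

An ABI that treats some vector registers as preserved would need extra slots for them, which is where a larger buffer comes from.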
|
| "setjmp/longjmp" cannot be used in any C++ program, unless all
| the nested functions that occur between a "setjmp" and a
| "longjmp" are C functions. This may happen in C libraries which
| are linked into C++ programs and which may use "setjmp/longjmp"
| internally, without interfering with C++ objects.
|
| As you say, "attribute(cleanup)" is not something specified in
| the C language standard, so you cannot expect that "longjmp"
| will invoke the cleanup function, unless you use a specific
| libc implementation, whose documentation explicitly says that
| its "longjmp" will invoke the cleanup functions, when the
| program is compiled with a certain C compiler. I am not aware
| of any such "longjmp" implementation.
|
| If you want to use exceptions in a C program, you must use
| "setjmp/longjmp". If you do not want to use exceptions in a C
| program, you have no need for them.
|
| The same happens in any other programming language. If you do
| not want to use exceptions in a C++ program, you have no need
| to use keywords like "try" and "throw".
| murdoze wrote:
| One of the largest C++ projects I worked on was compiled with
| exceptions disabled (-fno-exceptions).
|
| Still in production, for 20 years now.
| chaosite wrote:
| Google famously doesn't use exceptions for C++, and
| Google's C++ code is not exception-safe
| (https://google.github.io/styleguide/cppguide.html#Exceptions).
| flohofwoe wrote:
| I wonder though if setjmp/longjmp was designed with
| exceptions in mind, or just as a general escape hatch for the
| 'tamed' structured goto in C (my guess would be the latter).
| [deleted]
| comex wrote:
| > I am not aware of any such "longjmp" implementation.
|
| Clang on Windows behaves this way! That is, when targeting
| the Visual C++ runtime, as opposed to MinGW.
|
| With the Visual C++ toolchain and runtime, longjmp behaves
| like a C++ exception throw and goes through a whole stack
| unwinding routine, unlike on Unix (and MinGW) where it just
| sets a few registers and branches. This has the downside of
| being much slower, but the benefit that longjmp will run
| destructors of C++ local variables as it unwinds (as well as
| SEH `__finally` blocks), making it less of a footgun in C++
| code.
|
| Visual C++ itself doesn't support GCC extensions like
| `__attribute__((cleanup))`, but these days Clang is highly
| compatible with Visual C++ while also supporting GCC
| extensions.
|
| Edit: Windows also has the distinction of, on x86-64,
| treating the SIMD registers xmm6-xmm15 as callee-saved
| (preserved across function calls). As you mention, this means
| that setjmp has to save them, which is fine except that it
| bloats the size of jmp_buf.
| bregma wrote:
| Specifically, The C++ Programming Language standard ISO/IEC
| 14882:2011 18.10/4 [support.runtime] says this.
|
| > A setjmp/longjmp call pair has undefined behavior if
| replacing the setjmp and longjmp by catch and throw would
| invoke any non-trivial destructors for any automatic objects.
|
| Of course, the C++ language runtime is usually written in
| pure C, and at least one naive implementation of the C++
| try/catch mechanism uses setjmp/longjmp under the hood.
| smcameron wrote:
| > While some programs/libs (most notably libjpeg) use
| setjmp/longjmp
|
| Also libpng. What is it with these graphics guys that they
| can't just return an error code like a normal person?
| duskwuff wrote:
| libjpeg is a fairly old library (1991), so it isn't terribly
| surprising that it doesn't follow modern programming
| conventions.
|
| libpng is only slightly newer (1995), and may have adopted
| the pattern from libjpeg.
| mjburgess wrote:
| After learning about effect systems, and generalising effects, my
| view on `setjmp` has changed considerably -- it seems effect
| systems "effectively" offer a design pattern for their use in
| languages without them.
|
| i.e., it feels that there's an async-await.h, try-catch.h, etc. to
| be written which would serve as C-ish design patterns. I'd be
| interested, then, in the degree to which other langs can do the same.
|
| As a side point, I've not yet read a good defence of why
| programming in C is so fun and satisfying in this way. People
| reduce the Rust/C issue to memory safety... but isn't there
| something inherently wonderful about `while(*this++ = *that++)`
| (etc. etc.)?
|
| (A sense of fun beaten out of you by too many annotations
| guaranteeing its safety?)
| anonymousiam wrote:
| I once took an "Advanced C Programming" class (which would have
| been better named as "How to do OOP in a language never
| intended for it") where the instructor expressly prohibited the
| use of "*i++" and several other language elements because he
| thought they were confusing. I got into many arguments with the
| instructor throughout the course, and I figured I would get a
| poor grade, but he still gave me an A. My main disagreement
| with him was this: If the language elements are there and well
| defined, why prohibit their use? The course after all was
| "Advanced C", wasn't it?
| Karellen wrote:
| "Programs must be written for people to read, and only
| incidentally for machines to execute"
|
| -- Abelson & Sussman, _Structure and Interpretation of
| Computer Programs_ , 1984.
|
| It's possible to write English with bad structure, clumsy
| metaphors, obscure vocabulary, and non-non-non-usual
| idiosyncrasies. And other people might be technically capable
| of understanding it if they really want to, and try hard
| enough.
|
| After all, the language elements are there and well-defined,
| so why would anyone ever complain about "bad" writing?
|
| (Out of curiosity, do you object to people who say the use of
| "goto" should be seriously restricted, or even prohibited, in
| most programs? Do you specifically use gotos to make the
| point that they can still be useful and productive? The
| language element is there and well-defined.)
| badsectoracula wrote:
| That has been my argument for "preprocessor abuse" - there
| isn't such thing as preprocessor "abuse"[0], it is part of
| the language and provides some form of extensibility in a
| language with an already limited set of features.
|
| If anything the preprocessor needs more features (let me
| include files from macros and do loops dammit :-P).
|
| [0] ok, i can think of some uses that might count, like
| "#define BEGIN {", etc that serve no practical purpose, but i
| don't think anyone called these "preprocessor abuse".
| kevin_thibedeau wrote:
| You can have all that with m4. It's a perfectly suitable C
| preprocessor.
| lisper wrote:
| > isn't there something inherently wonderful about
| `while(*this++ = *that++)` (etc. etc.) ?
|
| There is something inherently wonderful about riding a
| motorcycle without a helmet too. That doesn't mean it is wise.
| tomcam wrote:
| About your last two questions yes and yes.
|
| I love C.
| flohofwoe wrote:
| These days the 'fashionable' way to implement async-await seems
| to be via compiler magic by transforming async functions into a
| 'switch-case state machine' and a hidden context pointer
| argument.
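Written by hand, the "switch-case state machine plus context pointer" shape looks roughly like this. It is a sketch in the style of protothreads, not what any particular compiler emits; `struct counter` and `counter_next` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* The explicit context: the resume point plus any "locals" that must
 * survive across suspensions. */
struct counter {
    int state;  /* 0 = not started, 1 = suspended in the loop, 2 = done */
    int n;
};

/* Resumes the coroutine. Returns 1 and stores a value in *out while
 * values remain, 0 once the sequence 0, 1, 2 is exhausted. */
static int counter_next(struct counter *c, int *out)
{
    assert(c != NULL && out != NULL);
    switch (c->state) {
    case 0:
        for (c->n = 0; c->n < 3; c->n++) {
            c->state = 1;
            *out = c->n;
            return 1;       /* "yield" */
    case 1:;                /* resume here on the next call */
        }
    }
    c->state = 2;
    return 0;
}
```

A caller starts from `struct counter c = {0, 0};` and calls `counter_next(&c, &v)` until it returns 0. No second stack is involved, which is why this form also works where stack switching is unavailable.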
|
| In vanilla C (without compiler magic) I've mostly seen it
| implemented via 'green threads' aka 'fibers' aka 'stack-
| switching', but TBH I'm not sure if this can be implemented
| with the standard setjmp/longjmp, I've mostly seen it
| implemented without (and instead use two small assembly
| functions for the context switch).
|
| One downside of stack-switching is that it doesn't work on
| WASM.
| chaosite wrote:
| stack switching with setjmp/longjmp can be implemented like
| this: https://stackoverflow.com/a/8817009/1116739
|
| But it's messy enough that you'd want a library/framework to
| help you handle it.
| mananaysiempre wrote:
| POSIX pre-2008 had (and Linux/Glibc and
| {Free,Net,DragonFly}BSD still have) <ucontext.h> with
| proper stack switching functions, used as a fallback in a
| number of coroutine libraries. The fallback status is due
| to a self-inflicted inefficiency: they save and restore the
| signal mask, thus still need to go through the kernel (why
| e.g. Linux does not put signal mask manipulation in the
| vDSO, I don't know). POSIX yanked them and now recommends
| rewriting to use POSIX threads instead, which is asinine.
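For reference, a minimal sketch of stack switching with the `<ucontext.h>` functions mentioned above. It assumes a platform that still ships them (glibc does); the 64 KiB stack size and the names `counter` and `run_counter` are arbitrary choices for illustration.

```c
#include <assert.h>
#include <ucontext.h>

static ucontext_t main_ctx, co_ctx;
static int yielded;   /* value handed from the coroutine to the caller */

/* Coroutine body: yields 1, 2, 3, then signals completion with -1. */
static void counter(void)
{
    for (int i = 1; i <= 3; i++) {
        yielded = i;
        swapcontext(&co_ctx, &main_ctx);   /* suspend, resume the caller */
    }
    yielded = -1;   /* returning continues at uc_link, i.e. main_ctx */
}

/* Drives the coroutine to completion and returns the sum of its values. */
static int run_counter(void)
{
    static char stack[64 * 1024];   /* the coroutine's own stack */
    int sum = 0;

    getcontext(&co_ctx);
    co_ctx.uc_stack.ss_sp = stack;
    co_ctx.uc_stack.ss_size = sizeof stack;
    co_ctx.uc_link = &main_ctx;
    makecontext(&co_ctx, counter, 0);

    for (;;) {
        swapcontext(&main_ctx, &co_ctx);   /* resume the coroutine */
        if (yielded < 0)
            break;
        sum += yielded;
    }
    assert(sum >= 0);
    return sum;
}
```

Each `swapcontext` also saves and restores the signal mask, which is the overhead the comment refers to.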
| gpderetta wrote:
| And 2008 was about the time that coroutines were
| gaining popularity again!
| Findecanor wrote:
| Putting each tasklet's stack in different places on the
| actual stack and jumping between them is inherently unsafe
| and not portable. You must be sure that each tasklet does
| not consume too much stack so that it does not overwrite
| another.
|
| On BSD Unices, you can only longjmp back up the stack.
| Otherwise, longjmp() will call longjmperror() and terminate
| the program.
| chaosite wrote:
| > Putting each tasklet's stack in different places on the
| actual stack and jumping between them is inherently
| unsafe and not portable. You must be sure that each
| tasklet does not consume too much stack so that it does
| not overwrite another.
|
| It's absolutely unsafe and a ridiculous thing to do, but what's
| the reasoning for it being unportable? Wouldn't it be just
| as unsafe anywhere that C compiles?
|
| > On BSD Unices, you can only longjmp back up the stack.
| Otherwise, longjmp() will call longjmperror() and
| terminate the program.
|
| The manpage claims that the semantics is not "only jump
| back up the stack", but rather that you can't longjmp to
| "[...] an environment that has already returned".
| Technically, the tasklet that we're longjmp'ing to never
| terminates, right?
| comex wrote:
| In any case, you definitely can't do it on Windows and
| Emscripten (WebAssembly), where longjmp invokes the same
| stack unwinding behavior as C++ exception handling,
| rather than just setting some registers and jumping.
| Windows has its own APIs for tasklets (fibers); no such
| luck on Emscripten.
| samsquire wrote:
| I wrote an unrolled switch statement in Java to simulate
| eager async/await across threads.
|
| https://github.com/samsquire/multiversion-concurrency-contro...
|
| The goal is that a compiler should generate this for you.
| This code is equivalent to the following:
|
|     task1:
|         while True:
|             handle1 = async task2();
|             handle2 = async task3();
|             print(await handle1)
|             print(await handle2)
|     task2:
|         n = 0
|         while True:
|             yield n++
|     task3:
|         n = 0
|         while True:
|             yield n++
|
| It doesn't actually run task2 and task3 both eagerly, I've
| not got around to scheduling the tasks on DIFFERENT threads.
| They currently queue onto a single thread, so task2 and task3
| are parallel to task1 (but maybe not at the same time) but
| task2 and task3 are not parallel to each other. This is my
| goal.
|
| I also ported the coroutines in this blog post
| https://blog.dziban.net/coroutines/ to GNU Assembler
| https://github.com/samsquire/assembly/blob/main/coroutines.S
|
| This might be useful to someone who wants to port this to C.
| This uses the stack switching idea. So they are stackful
| coroutines.
|
| There's also Tina a header only coroutine library
| https://slembcke.github.io/Tina
|
| I also played with protothreads http://dunkels.com/adam/pt/
|
| I also asked in a stackoverflow post how to use a thread pool
| with C++20 coroutines:
| https://stackoverflow.com/questions/74520133/how-can-i-pass-...
| Someone provided some example code, which seems to
| work with C++ coroutines :-)
| duped wrote:
| The big problem with setjmp/longjmp for fibers is that a call
| to longjmp is undefined behavior if the `jmp_buf` argument
| was created by a call to `setjmp` on a different thread (1).
| That means fibers cannot be easily relocated onto a different
| thread, making M:N threading tricky to implement and erasing
| a lot of the benefit of fibers.
|
| And that said, implementing a super-fast
| setcontext/swapcontext is like twenty lines of assembly with
| not too many gotchas, if you don't care about saving a few
| things that require syscalls.
|
| But all that said the real downside of stack switching is
| that it's overkill for coroutines that can be implemented as
| a finite state machine, unless the runtime supports growable
| stacks (otherwise you pay a big cost on fiber creation, and
| eat a lot of memory for many fibers). There are a few
| languages that do this and it's super cool, but C isn't one
| of them.
|
| WASM will almost certainly support stack switching, iirc
| there have been proposals for wasmtime to support it already?
|
| (1) https://pubs.opengroup.org/onlinepubs/9699919799/functions/l...
| lstodd wrote:
| > And that said, implementing a super-fast
| setcontext/swapcontext is like twenty lines of assembly
| with not too many gotchas, if you don't care about saving a
| few things that require syscalls.
|
| sigaltstack(2) wasn't all that prohibitive when I did that
| in Python 2.6 back then. Was it 2009?
|
| I seriously can't understand this obsession with FSMs. A
| naive setcontext()-based implementation outperformed both
| greenlets (with its crazy legacy of memcpy-ing parts of
| stack from stackless) and Tornado/Twisted (with them being
| pure-python and therefore lacking any means to force some
| async on client libraries, which one does in C) while
| letting everyone write some nice clean synchronous-looking
| code.
|
| 10 years later we end up with half a language hacked up and
| still nowhere near the ease of use that was coded in a week
| or so.
| ckastner wrote:
| > inherently wonderful about `while(*this++ = *that++)`
|
| As someone who hasn't used C as a primary language in some
| project for over a decade, I read this and (1) realize that LHS
| and RHS both post-increment, (2) don't remember if there is
| some UB I might be overlooking, (3) realize that the operator
| is assignment "=", not comparison "==", and I can't even
| remember what the loop termination criterion here would be.
| Until "*that++" is equivalent to false, or something?
|
| It may be beautiful to the experienced programmer, but I
| personally would consider this just "clever" (which is a
| criticism, not a compliment). It feels like someone needlessly
| tried to pack everything into a single line of code.
|
| This is something I'd write out more verbosely, if only to make
| reading it simpler. The compiler will probably generate the
| same machine code either way.
|
| I'm fully aware that to the seasoned C developer, my criticism
| might come across as naive. However, even the fully seasoned C
| developer can get careless, or become tired, and in C, every
| one of those little things can come back and bite you in those
| situations.
|
| Edit: removed the double pointer dereferencing remark, must
| have been an artifact from HN's special treatment of the
| asterisk.
|
| Edit-Edit: I was probably wrong. I don't think it's possible to
| create a more verbose version without it affecting performance.
| sfpotter wrote:
| Could we try to keep the topic on the article itself instead
| of complaints about C? It sucks to come and read the comments
| about this very nice article and have to scroll and scroll
| until I finally get to comments written by people who
| actually have something to say. This is a great blog, and the
| author puts a ton of effort into their posts. It's hard for
| me not to view comments like this as being a bit thoughtless
| and inconsiderate.
| mjburgess wrote:
| That's why comments collapse.
|
| There's little more useful to say here, other than I don't
| think most people would agree with your view of what HN is.
|
| It's a discussion forum, not an exegetical seminar. "On
| topic" is whatever the topic of discussion is; and this is
| not constrained to the article.
| dilap wrote:
| Since we're already way off-topic, allow me to share my
| idea for solving this perennial problem: When commenting,
| there is a little selector:
|
| [ ] My comment is on-topic and positive to neutral
|
| [ ] My comment is critical
|
| [ ] My comment is off-topic
|
| You've got to select one. When viewing, the comment
| thread defaults to just showing on-topic, non-negative
| comments, but you can see the other stuff, too, if you
| want.
|
| This solves two seemingly contradictory desires: the
| ability to read comments on things that interest you
| without having to fend off waves of negativity and wade
| through pools of offtopic text and the ability to speak
| freely.
| Nullabillity wrote:
| It sounds like you're looking for an upvote counter, not
| a comment section.
| dilap wrote:
| No, that's not it at all. The point is to give a small
| amount of structure to the comments section, so people
| can see what they want.
|
| Think of it as very roughly analogous to the different
| sections of a newspaper.
| subarctic wrote:
| That's not really how comment threads work. If you reply to
| a specific comment, you're replying to the stuff they said
| in that comment. The guy's not even complaining about C
| either, he's commenting on the bit of C code that the
| parent commenter wrote.
| [deleted]
| flohofwoe wrote:
| It's entirely valid C and (assuming this and that are byte
| pointers) copies a range of bytes until (and including) a
| zero byte is reached.
|
| With a sufficient warning level (e.g. -Wall on gcc, which
| should always be enabled anyway, together with -Wextra),
| compilers will complain about the '=' and ask you to add a
| pair of parentheses to make clear that this is actually
| intended:
| while( (*this++ = *that++) );
|
| It's also one of those cases where the C code matches the
| output assembly pretty well:
|
| https://www.godbolt.org/z/nz1jbz4Er
|
| As far as "obfuscated C" goes, this is a very tame example
| though, it's just a straightforward usage of language
| features, which might look strange only when coming from
| other languages that don't have pointers or a post-increment
| operator.
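|
| A minimal sketch of the idiom as a complete function, assuming
| byte pointers and a destination with room for the source plus
| its terminator. The name copy_str and the parameter names dst
| and src are made up for illustration (the thread uses this and
| that):

```c
/* Copies the NUL-terminated string at src into dst, terminator included.
   Hypothetical helper; the caller must guarantee dst is large enough. */
static char *copy_str(char *dst, const char *src)
{
    char *start = dst;
    /* The extra parentheses mark the assignment as intentional,
       which silences the -Wall warning discussed above. */
    while ((*dst++ = *src++))
        ;
    return start;
}
```

| Called as copy_str(buf, "abc"), it leaves "abc" plus the zero
| byte in buf and returns buf.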
| ckastner wrote:
| Doesn't that kind of prove my point?
|
| The code as written was described as beautiful, yet would
| not have passed -Wall.
|
| It's those little things that can easily get you in C, and
| there are so many little things to consider.
| kevin_thibedeau wrote:
| GCC warnings can be overly pedantic. It's set up to warn
| about common footguns but doesn't know what your intent
| is. In this case it's a common enough idiom to assign
| within a control statement that GCC has the extra parens
| escape hatch.
|
| You shouldn't just blindly let your tooling dictate how
| you work. It's a tool that's supposed to work _for_ you,
| not control you. -Wall and -Wextra are good baselines but
| I always disable some of their warnings because I don't
| need the hassle on known good code.
| flohofwoe wrote:
| That extra pair of parentheses doesn't make the code 'ugly' ;)
|
| And the code without them is still entirely valid
| standard C, the warning is essentially just a lint to
| protect against typos (similar to JS linters warning
| about '===' vs '==').
|
| PS: let's see if the alternatives would be any more
| readable:
|
|     char c;
|     while (c = *that++) {
|         *this++ = c;
|     }
|
| ...this is already buggy because it doesn't copy the
| final zero byte, so the test must happen inside the loop
| body. Let's also try to get rid of the post-increment:
|
|     while (true) {
|         char c = *that;
|         *this = c;
|         this += 1;
|         that += 1;
|         if (c == 0) {
|             break;
|         }
|     }
|
| ...hmm not really any more readable...
|
| Let's try with an index...
|
|     while (true) {
|         char c = that[i];
|         this[i] = c;
|         i += 1;
|         if (c == 0) {
|             break;
|         }
|     }
|
| ...might be a bit easier to grasp when used to other
| languages, but readability hasn't improved all that much
| I'd say...
|
| For reference, MUSL also just uses the original approach:
|
| https://github.com/esmil/musl/blob/master/src/string/strcpy....
| ckastner wrote:
| I was unclear, sorry: I didn't mean to say that the extra
| parentheses make it uglier, I meant to point out that
| something that was described as beautiful was actually
| flawed.
|
| The flaw was minor in this case because the identifier
| names and lack of body make the intention clear, but my
| point is that there are a lot of minor things in C that
| can come and bite you at any time.
|
| Edit: You are right, I don't see a way this could have
| been implemented more readably without sacrificing some
| performance.
|
| First thing I thought of was:
|
|     void cp(const char* from, char* to) {
|         while (*from) {
|             *to = *from;
|             to++;
|             from++;
|         }
|     }
|
| But that does not reduce to the original case.
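|
| One way to make this version behave like the original, sketched
| under the assumption that the destination has room for the
| terminator: write the zero byte explicitly after the loop. The
| name cp_terminated is hypothetical.

```c
/* Same loop as cp() above, plus the terminator that the loop skips. */
static void cp_terminated(const char *from, char *to)
{
    while (*from) {
        *to = *from;
        to++;
        from++;
    }
    *to = '\0'; /* the loop exits before the zero byte is copied */
}
```

| This costs one extra store after the loop, and the copy is no
| longer a single expression.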
| flohofwoe wrote:
| I've added a couple of examples trying to find a more
| readable version, which actually isn't trivial. Sorry for
| the 'post-edit' :)
|
| As for performance: I don't think such details matter
| much. First, compilers are pretty good at turning "readable
| but inefficient" code into the same optimal output (aka
| "zero cost abstraction").
|
| And a really performance-oriented strcpy() wouldn't
| simply copy byte by byte anyway, but try to move data in
| bigger chunks (like 64-bit or even SIMD registers).
| Whether this is then actually faster also depends on the
| CPU though.
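|
| As a hedged illustration of moving data in bigger chunks
| without hand-written SIMD: a two-pass copy that measures the
| string first and then hands the bulk move to memcpy(), which C
| libraries commonly implement with wide loads and stores. This
| swaps in a different technique from the byte loop, the name
| copy_bulk is made up, and whether it is actually faster depends
| on the libc and the CPU.

```c
#include <string.h>

/* Hypothetical helper: copy src (terminator included) into dst.
   dst must be large enough, and the two regions must not overlap. */
static char *copy_bulk(char *dst, const char *src)
{
    size_t n = strlen(src) + 1; /* +1 so the zero byte is copied too */
    memcpy(dst, src, n);
    return dst;
}
```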
| layer8 wrote:
| I would use a do-while, because at least one char is
| always copied:
|
|     char c;
|     do {
|         c = *src++;
|         *dst++ = c;
|     } while (c != 0);
|
| The post-increment is idiomatic enough in C-based
| languages that I wouldn't worry about that.
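|
| The do-while above, wrapped into a complete function as a
| sketch. The name copy_do_while and the returned start pointer
| are additions for illustration, and the caller is assumed to
| provide a large enough destination.

```c
/* Post-test loop: the body runs once even for an empty string,
   so the terminating zero byte is always copied. */
static char *copy_do_while(char *dst, const char *src)
{
    char *start = dst;
    char c;
    do {
        c = *src++;
        *dst++ = c;
    } while (c != 0);
    return start;
}
```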
| bee_rider wrote:
| I don't write much C, but to an outsider like me this is
| a pretty big improvement.
|
| It is a shame post-test loops aren't more popular, given
| the similarity to the assembly they output. Seems more
| mechanically sympathetic. Oh well, at least it is an
| excuse to whip out the goto.
| layer8 wrote:
| In Pascal, you could do
|
|     repeat
|         ...
|     until c = 0;
|
| which might be even clearer for this use case.
| jlg23 wrote:
| Yes, with -Wall it triggers a warning, but the warning is
| a false positive because assignment is indeed the
| intention.
| mjburgess wrote:
| `this` and `that` are arrows which range over a stream of
| data; `=` is copy, and `++` moves the arrow along the
| stream.
|
| This isn't a "clever one-liner"; it is a clear and precise
| syntax for expressing the operation the machine actually
| performs.
|
| while(copy(current(stream_a), current(stream_b)) and not
| end_of_stream(stream_a))
|
| You might prefer the above, but then, that's every other
| major language. The beauty of C is that the above code
| has to compile to something like the C version. C just
| allows you to actually express it.
| mypalmike wrote:
| C lacks the expression of many useful common CPU
| capabilities. Integer rotation and overflow checking come
| to mind immediately.
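|
| For context, a sketch of the usual workarounds, since C has no
| operator for either: a rotate spelled with shifts (GCC and
| Clang typically recognize this pattern and emit a rotate
| instruction) and an overflow check done before the addition.
| The names rotl32 and checked_add are made up for illustration,
| and a 32-bit int is assumed.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Rotate x left by n bits; n is reduced modulo 32 so that no
   shift count ever reaches the width of the type. */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31;
    return (x << n) | (x >> ((32 - n) & 31));
}

/* Stores a + b in *out and returns true, or returns false without
   touching *out if the sum would overflow int. The test happens
   before the addition because signed overflow is undefined. */
static bool checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}
```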
| umanwizard wrote:
| > it is a clear and precise syntax for expressing the
| operation the machine actually performs.
|
| No it's not. Your compiler will almost certainly
| translate this into vector instructions, at least.
| umanwizard wrote:
| For posterity, I was apparently wrong. It doesn't
| autovectorize, with gcc 12.2 -O3 on godbolt, at least.
| gpderetta wrote:
| > the output assembly pretty well
|
| Ironically, the compiler is likely to recognize this as a
| strcpy and replace it with a possibly vectorized
| implementation.
| flohofwoe wrote:
| I actually tried to make that happen, but was
| unsuccessful on GCC and Clang (I've seen this in the past
| for memcpy() though).
| naasking wrote:
| You might be interested in libhandler:
|
| https://github.com/koka-lang/libhandler
|
| Basically an effect handling library for C written by the folks
| developing the koka language.
| matthews2 wrote:
| Related and also good fun, writing your own fibers!
| https://graphitemaster.github.io/fibers/
| IncRnd wrote:
| I remember doing this 30ish years ago. It was all the rage.
| Lots of people did that to get multi-threading on the OS of the
| time.
___________________________________________________________________
(page generated 2023-02-12 23:00 UTC)