[HN Gopher] Asynchronous Error Handling Is Hard
       ___________________________________________________________________
        
       Asynchronous Error Handling Is Hard
        
       Author : hedgehog
       Score  : 29 points
       Date   : 2025-06-29 20:35 UTC (1 days ago)
        
 (HTM) web link (parallelprogrammer.substack.com)
 (TXT) w3m dump (parallelprogrammer.substack.com)
        
       | rorylaitila wrote:
       | Exceptions get a lot of hate, but of the three styles, I keep
       | coming back to exceptions. Ages ago I built an application with
       | error codes, and went back to exceptions, because I thought the
       | ceremony of error checking was not worth it. On occasion, I'll
       | use a get-last-error style, particularly when the error is
       | something the user is intended to address. But for most of my
       | applications (which are usually not libraries and are code under
       | my control) I like exceptions.
       | 
       | I always have a global error handler that logs and alerts on
       | anything uncaught. This allows me to code the happy path. Most of
       | the time, it's not worth figuring out how to continue processing
       | under every possible error, so to fail and bail is my default
       | approach. If I later determine that it's something that can be
       | handled to continue processing, then I update that code path to
       | handle that case.
       | 
       | Most of my code is web applications, so that is where I'm coming
       | from.
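The "global error handler that logs anything uncaught" approach described above could be sketched in Python roughly as follows. This is an illustration only, not the commenter's code: the logger name `app` and the helper names `log_uncaught` and `install_global_handler` are assumptions, and the alerting step is left as a comment.

```python
import logging
import sys

logger = logging.getLogger("app")  # hypothetical logger name


def log_uncaught(exc_type, exc_value, exc_traceback):
    """Last-resort handler: log the full traceback of anything uncaught."""
    if issubclass(exc_type, KeyboardInterrupt):
        # Let Ctrl-C keep its normal behaviour.
        sys.__excepthook__(exc_type, exc_value, exc_traceback)
        return
    logger.critical(
        "Uncaught exception",
        exc_info=(exc_type, exc_value, exc_traceback),
    )
    # An alerting call (email, pager, etc.) would go here; omitted in this sketch.


def install_global_handler():
    """Route every uncaught exception in the main thread through log_uncaught."""
    sys.excepthook = log_uncaught
```

With the hook installed once at startup, the rest of the code can stay on the happy path and add specific `except` clauses only where recovery turns out to be worthwhile.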
        
         | PaulHoule wrote:
         | Hell yeah. I typed in my first C program, a terminal emulator,
         | from a 1984 issue of _Byte_ magazine. It was painful seeing how
         | 5 lines of real logic were intertwined with 45 lines of error
         | handling logic that, in the end, did what exceptions did for
         | free -- it was a formative experience for me as a programmer
         | and when I saw exceptions in Java in 1995 (still in beta) they
         | made me so happy.
         | 
         | In the async case you can pass the Exception as an object as
         | opposed to throwing it but you're still left with the issue
         | that the failure of one "task" in an asynchronous program can
         | cause the failure of a supertask which is comprised of other
         | tasks and handling that involves some thinking. That's chess
         | whereas the stuff talked about in that article is Tic-Tac-Toe
         | in comparison.
        
           | rorylaitila wrote:
           | Yeah, I agree in the async case. What I do there is wrap the
           | async code in its own global error handler, so to speak.
           | That handler is logging to something that the outer process
           | can get-last-error from.
           | 
           | But I can get away with this also because I don't write async
           | heavy code. My web applications are thread-per-request
           | (Java). This fits 99% of the needs of business code, whose
           | processing nature is mostly synchronous.
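A rough Python sketch of the idea above (async work wrapped in its own catch-all handler that records the failure somewhere the outer code can query, get-last-error style). The commenter works in Java; the names `LastError`, `guarded`, and `flaky` are made up for illustration.

```python
import asyncio
import logging

logger = logging.getLogger("background")  # hypothetical logger name


class LastError:
    """Hypothetical holder the outer code can poll, get-last-error style."""

    def __init__(self):
        self.error = None


async def guarded(coro, last_error):
    """Run a coroutine under its own catch-all handler."""
    try:
        return await coro
    except Exception as exc:
        # asyncio.CancelledError derives from BaseException, so
        # cancellation still propagates past this handler.
        logger.exception("background work failed")
        last_error.error = exc
        return None


async def flaky():
    """Stand-in for async work that fails."""
    raise RuntimeError("upstream timed out")


async def main():
    last_error = LastError()
    await guarded(flaky(), last_error)
    return last_error.error  # the outer code inspects the recorded failure
```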
        
             | PaulHoule wrote:
             | By and large you want to avoid async if you can help it.
             | Sometimes you can't. The struggles Rustifarians have had
             | with it are a cautionary tale (the stack and borrow
             | checking go together like peanut butter and jelly). I used
             | to have a lot of fun writing async Python; friends told me
             | I was living dangerously. I finally rewrote my RSS
             | reader/bookmark manager/personal web crawler/image sorter
             | in sync style so some of it could run in Celery, and because
             | any blocking on the CPU on a 16-core machine is too much.
             | 
             | People used to worry about the 10k connection problem but
             | machines are bigger now, few services are really that big,
             | and fronting with nginx or something like that helps a lot.
             | (That image sorter serves images with IIS)
             | 
             | JavaScript is async and you gotta live with it because of
             | deployability. No fight with the App Store. No
             | InstallShield engineer. No army of IT people to deploy
             | updates. "Just works" on PC, Mac, Linux, tablet, game
             | consoles, VR headsets, etc. Kinda sad people are making
             | waitlist forms with frameworks that couldn't handle the
             | kind of knowledge graph editor and decision support
             | applications I was writing in 2006 but that's life.
        
         | bob1029 wrote:
         | Exceptions can even work with remote APIs.
         | 
         | If you reach into the enterprise bucket of tricks, technologies
         | like WCF/SOAP can propagate these across systems reliably. You
         | can even forward the remote stack traces by turning on some
         | scary flags in your app.config. When printing the final
         | exception using .ToString(), this creates a really magical
         | narrative of what the fuck happened.
         | 
         | The entire reason exceptions are good is because of stack
         | traces. It is amazing to me how many developers do not
         | understand that having a stack trace at the exact instant of a
         | bad thing is like having undetectable wall hacks in a
         | competitive CS:GO match.
        
           | rorylaitila wrote:
           | Yes, I've never quite understood the "But with exceptions
           | it's hard to debug why the error occurred after the fact, it's
           | better to be explicit in advance" argument. The stack trace
           | points exactly to the line. And usually, with the error message
           | and context, it's all I need. Maybe I'm missing something that
           | someone can fill me in on.
        
             | flysand7 wrote:
             | Yeah, this kinda becomes a problem when the library you are
             | using does not distribute its source code, so even _if_ you
             | get the line, this information is practically useless to
             | you.
             | 
             | This has been my biggest problem with exceptions: first, for
             | the reason outlined above, and second, for how much time you
             | actually end up spending on figuring out what the exception
             | for a certain situation is. "Oh you're making a database
             | insertion, what's the error that's thrown if you get a
             | constraint violation, I might want to handle that". And
             | then it's all an adventure, because there's no way to know
             | in advance. If the docs are good it's in the docs,
             | otherwise "just try it" seems to be the way to do it.
        
               | rorylaitila wrote:
               | Yeah I agree with that, opaque errors from libraries are
               | where this really sucks. The worst is when they swallow
               | the original error and throw a generic exception instead.
        
           | 9rx wrote:
           | _> It is amazing to me how many developers do not understand
           | that having a stack trace at the exact instant of a bad thing
           | is like having undetectable wall hacks in a competitive CS:GO
           | match._
           | 
           | Who doesn't understand that? If you aren't using exceptions
           | you are using wrapping instead, and said wrapping is merely
           | an alternative representation of what is ultimately the very
           | same thing. This idea isn't lost on anyone, even if they
           | don't use the call stack explicitly.
           | 
           | The benefit of wrapping over exceptions[1] is that each layer
           | of the stack gains additional metadata to provide context
           | around the whole execution. The tradeoff is that you need
           | code at each layer in the stack to assign the metadata
           | instead of being able to prepare the data structure all in
           | one place at the point of instantiation.
           | 
           | [1] Technically you could wrap exceptions in exceptions, of
           | course. This binary statement isn't quite right, but as
           | exceptions have proven to be useless if you find yourself
           | ending up here, with two stacks offering the same
           | information, we will assume for the sake of discussion that
           | the division is binary.
        
             | groestl wrote:
             | One could say the whole point of wrapping exceptions is to
             | add additional metadata _if such data is available_.
             | Otherwise, the most basic metadata is tracked
             | automatically: stack locations.
        
               | 9rx wrote:
               | Technically, the actual whole point of wrapping is to
               | avoid leaking implementation details. If you let
               | "FooLibraryException" bubble up, and then you stop using
               | Foo Library, then all of the users of your code are going
               | to end up broken waiting for "FooLibraryException" when
               | now you throw "BarLibraryException". This diminishes any
               | value exception handlers theoretically could provide
               | since you end up having to wrap everything at each step
               | anyway.
               | 
               | Checked exceptions were introduced to try to help with
               | that problem, giving you at least a compiler error if an
               | implementation changed from underneath you. But that
               | comes with its own set of problems and at this point most
               | consider it to be a bad idea.
               | 
               | Of course, many just throw caution to the wind and don't
               | consider the future, believing they'll have moved on by
               | then and it will be the next programmer's problem. Given
               | the context of discussion, we have assumed that is the
               | case.
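The wrapping discussed in this subthread (hide the library-specific error behind your own type, while keeping the original for debugging) might look like this in Python. `StorageError`, `foo_insert`, and `save` are hypothetical names; `FooLibraryException` is borrowed from the comment above as a stand-in for a third-party error.

```python
class StorageError(Exception):
    """Hypothetical public error type that callers are allowed to depend on."""


class FooLibraryException(Exception):
    """Stand-in for an error raised by a third-party 'Foo Library'."""


def foo_insert(record):
    """Pretend library call that always hits a constraint violation."""
    raise FooLibraryException("constraint violated: duplicate key")


def save(record):
    """Public API: callers only ever see StorageError."""
    try:
        foo_insert(record)
    except FooLibraryException as exc:
        # 'from exc' keeps the original error (and its traceback) reachable
        # as __cause__, so the wrapper adds context without swallowing it.
        raise StorageError(f"could not save record {record!r}") from exc
```

Callers catch `StorageError` and keep working if the library is swapped out, while the printed traceback still shows the underlying library error through the chained cause.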
        
         | PaulKeeble wrote:
         | Agreed, and even more heretical from me is that I quite like
         | declared exceptions. They make clear all the ways a method can
         | fail, and you can directly choose what to handle, often without
         | having to look at the docs to work out what they mean, because
         | the names tell you what you need to know. You can ignore them
         | and rethrow to catch globally, but you can also handle them.
         | 
         | Having used Go for years now, frankly I prefer exceptions. Way
         | too often there is nothing that can be done about an error
         | locally, but it produces noise and if branches all over the code
         | base. It's even worse to add an error later to a method than in
         | Java, because every method has to have code added, not just a
         | signature change. I really miss stack traces, and the current
         | state of the art in Go has us writing code to produce them in
         | every method.
        
           | bigstrat2003 wrote:
           | Yep, checked exceptions are the shit. You can of course abuse
           | them to create a monstrosity (as you can with anything), but
           | when used responsibly I think they are by far the best error
           | handling paradigm.
        
         | o11c wrote:
         | The problem with `getlasterror` and `errno` is that they're
         | global (thread-local, whatever).
         | 
         | But if you make them take a `context` object, there's no longer
         | a problem.
         | 
         | One interesting observation - you can use them even for the
         | initial "failed to allocate a context" by interpreting a NULL
         | pointer as always containing an "out of memory" error.
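The context-object variant of get-last-error described above is a C idiom. The following is only a loose Python transliteration to show the shape, with `None` playing the role of the NULL context. All names here (`Context`, `get_last_error`, `parse_port`) are invented for the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Context:
    """Per-caller error slot, replacing a global or thread-local errno."""

    last_error: Optional[str] = None


def get_last_error(ctx: Optional[Context]) -> Optional[str]:
    # A missing context is read as "out of memory", so even the failure
    # to create a context can be reported through the same call.
    if ctx is None:
        return "out of memory"
    return ctx.last_error


def parse_port(ctx: Context, text: str) -> int:
    """Return the port number, or -1 with the reason stored on ctx."""
    ctx.last_error = None
    try:
        port = int(text)
    except ValueError:
        ctx.last_error = f"not a number: {text!r}"
        return -1
    if not 0 < port < 65536:
        ctx.last_error = f"out of range: {port}"
        return -1
    return port
```

Because the error lives on the context rather than in global state, two callers with separate contexts cannot overwrite each other's last error.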
        
       | b0a04gl wrote:
       | in async code, errors belong to the task, not the caller.
       | 
       | in sync code, the caller owns the stack, so it makes sense they
       | own the error. but async splits that. now each async function
       | runs like a background job. that job should handle its own
       | failure (retry, fallback, log) because the caller usually can't
       | do much anyway.
       | 
       | write async blocks like isolated tasks. contain errors inside
       | unless the caller has a real decision to make. global error
       | handler picks up the rest
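One way to read the "task handles its own failure: retry, fallback, log" advice above is a small wrapper like the one below. It is a sketch under assumptions: the retry count, the linear backoff, and the names `run_isolated` and `make_flaky` are all illustrative choices, not something the comment specifies.

```python
import asyncio
import logging

logger = logging.getLogger("tasks")  # hypothetical logger name


async def run_isolated(make_attempt, *, retries=3, delay=0.01, fallback=None):
    """Run a task that owns its failures: retry, log, then fall back."""
    for attempt in range(1, retries + 1):
        try:
            return await make_attempt()
        except Exception as exc:
            logger.warning("attempt %d/%d failed: %s", attempt, retries, exc)
            if attempt < retries:
                await asyncio.sleep(delay * attempt)  # simple linear backoff
    # Every attempt failed: contain the error and hand back the fallback.
    return fallback


def make_flaky(failures):
    """Build a coroutine function that fails `failures` times, then succeeds."""
    state = {"left": failures}

    async def fetch():
        if state["left"] > 0:
            state["left"] -= 1
            raise ConnectionError("temporary failure")
        return "fresh data"

    return fetch
```

For example, `asyncio.run(run_isolated(make_flaky(2)))` succeeds on the third attempt, while a task that keeps failing returns whatever `fallback` was supplied instead of raising into the caller.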
        
         | EGreg wrote:
         | well, that's partially true
         | 
         | the caller is itself a task / actor
         | 
         | the thing is that the caller might want to roll back what
         | they're doing based on whether the subtask was rolled back...
         | and so on, backtracking as far as needed
         | 
         | ideally all the side effects should be queued up and executed
         | at the end only, after your caller has successfully heard back
         | from all the subtasks
         | 
         | for example... don't commit DB transactions, send out emails or
         | post transactions onto a blockchain until you know everything
         | went through. Exceptions mean rollback, a lot of the time.
         | 
         | on the other hand, "after" hooks are supposed to happen after a
         | task completes fully, and their failure shouldn't make the task
         | roll back anything. For really frequent events, you might want
         | to debounce, as happens for example with browser "scroll" event
         | listeners, which can't preventDefault anymore unless you set
         | them with {passive: false}!
         | 
         | PS: To keep things simple, consider using single-threaded
         | applications. I especially like PHP, because it's not only
         | single-threaded but it actually is shared-nothing. As soon as
         | your request handling ends, the memory is released. Unlike
         | Node.js you don't worry about leaking memory or secrets between
         | requests. But whether you use PHP or Node.js you are
         | essentially running on a single thread, and that means you can
         | write code that is basically sequentially doing tasks one after
         | the other. If you need to fan out and do a few things at a
         | time, you can do it with Node.js's Promise.all(), while with
         | PHP you kind of queue up a bunch of closures and then
         | explicitly batch-execute with e.g. curl_multi_ methods. Either
         | way ... you'll need to explicitly write your commit logic in
         | the end, e.g. on PHP's "shutdown handler", and your database
         | can help you isolate your transactions with COMMIT or ROLLBACK.
         | 
         | If you organize your entire code base around dispatching events
         | instead of calling functions, as I did, then you can easily
         | refactor it to do things like microservices at scale by using
         | signed HTTPS requests as a transport (so you can isolate
         | secrets, credentials, etc.) from the web server:
         | https://github.com/Qbix/Platform/commit/a4885f1b94cab5d83aeb...
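The "queue up side effects and execute them only at the end" idea from the comment above can be sketched as a small context manager. This is not taken from the linked code base; `UnitOfWork`, `defer`, `send_email`, and `place_order` are hypothetical names.

```python
class UnitOfWork:
    """Collects side effects and runs them only if the whole block succeeds."""

    def __init__(self):
        self._pending = []

    def defer(self, func, *args):
        """Queue a side effect instead of performing it immediately."""
        self._pending.append((func, args))

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            # No exception: flush the queued effects in order.
            for func, args in self._pending:
                func(*args)
        # On an exception the queue is simply dropped (the "rollback").
        self._pending.clear()
        return False  # never swallow the exception


sent = []  # stands in for a real mail gateway


def send_email(to):
    sent.append(to)


def place_order(items, fail=False):
    with UnitOfWork() as work:
        work.defer(send_email, "customer@example.com")
        if fail:
            raise ValueError("payment declined")
        work.defer(send_email, "warehouse@example.com")
```

Note the limit of the sketch: if one deferred effect fails after an earlier one has already run, nothing here undoes the earlier one.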
        
           | jpc0 wrote:
           | I liked where you started.
           | 
           | Any ASYNC operation, whether using coroutines or event based
           | actors or whatever else should be modelled as a network call.
           | 
           | You need a handle that will contain information about the
           | async call and will own the work it performs. You can have an
           | API that explicitly says "I don't care what happens to this
           | thing just that it happens" and will crash on failure. Or you
           | can handle its errors if there are any and importantly decide
           | how to handle those errors.
           | 
           | Oh and failing to allocate/create that handle should be a
           | breach of invariants and immediately crash.
           | 
           | That way you have all the control and flexibility, and async
           | error handling becomes trivial. You can use whatever async
           | pattern you want to manage async operations at that point as
           | well.
           | 
           | And you also know you have fundamentally done something
           | expensive in latency for the benefit of performance or access
           | to information, because if it was cheap you would have just
           | done it on the thread you are already using.
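In Python's asyncio, the "handle that owns the work" described above maps fairly naturally onto `asyncio.Task`. The sketch below shows the caller deciding per handle how to treat a failure; `lookup` and its error case are invented for illustration.

```python
import asyncio


async def lookup(key):
    """Stand-in for an expensive, network-like async operation."""
    await asyncio.sleep(0)
    if key == "missing":
        raise KeyError(key)
    return key.upper()


async def main():
    # Each Task is a handle that owns its work and remembers its outcome.
    ok = asyncio.create_task(lookup("a"))
    bad = asyncio.create_task(lookup("missing"))

    # Wait for both to finish; asyncio.wait itself does not raise task errors.
    await asyncio.wait([ok, bad])

    results = []
    for handle in (ok, bad):
        err = handle.exception()  # None if the task succeeded
        if err is None:
            results.append(handle.result())
        else:
            results.append(f"failed: {err!r}")  # the caller decides what to do
    return results
```

The "I don't care, just crash on failure" variant is simply `await handle`, which re-raises the task's exception in the caller.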
        
           | renox wrote:
           | > for example... don't commit DB transactions, send out
           | emails or post transactions onto a blockchain until you know
           | everything went through. Exceptions mean rollback, a lot of
           | the time.
           | 
           | But what if you need to send emails AND record it in a DB?
        
             | lelanthran wrote:
             | I had the same question, actually; it is _very_ common to
             | perform multiple point-of-no-return IO in a workflow, so
             | deferring all IO into a specific spot does not, in
             | practice, bring any advantages.
        
         | quietbritishjim wrote:
         | Structured concurrency [1] solves the issue of task (and
         | exception) ownership. In languages / libraries that support it,
         | when spawning a task you must specify some enclosing block that
         | owns it. That block, called a nursery or task group, can be a
         | long way outside the point where the task is spawned because
         | the nursery is an object in its own right, so it can be passed
         | into a function which can then call its start() method. All
         | errors are handled at the nursery level.
         | 
         | They were introduced in the Trio library [2] for Python, but
         | they're now also supported by Python's built in asyncio module
         | [3]. I believe the idea has spread to other languages too.
         | 
         | [1] https://vorpus.org/blog/notes-on-structured-concurrency-
         | or-g...
         | 
         | [2] https://trio.readthedocs.io/en/stable/
         | 
         | [3] https://docs.python.org/3/library/asyncio-task.html#task-
         | gro...
        
       | RS-232 wrote:
       | Async anything is hard!
        
       | innocentoldguy wrote:
       | One of the things I love most about Elixir is that it makes
       | asynchronous error handling easier than any other language I've
       | used. Asynchronous code used to be the source of many difficult
       | bugs in the teams I've worked with, but Elixir's (or, more
       | accurately, Erlang's) "let it crash" architecture helps eliminate
       | many of these issues.
        
       ___________________________________________________________________
       (page generated 2025-06-30 23:01 UTC)