[HN Gopher] Asynchronous Error Handling Is Hard
___________________________________________________________________
Asynchronous Error Handling Is Hard
Author : hedgehog
Score : 29 points
Date : 2025-06-29 20:35 UTC (1 day ago)
(HTM) web link (parallelprogrammer.substack.com)
| rorylaitila wrote:
| Exceptions get a lot of hate, but of the three styles, I keep
| coming back to exceptions. Ages ago I built an application with
| error codes, and went back to exceptions, because I thought the
| ceremony of error checking was not worth it. On occasion, I'll
| use a get-last-error style, particularly when the error is
| something the user is intended to address. But for most of my
| applications (which are usually not libraries and are code under
| my control) I like exceptions.
|
| I always have a global error handler that logs and alerts on
| anything uncaught. This allows me to code the happy path. Most of
| the time, it's not worth figuring out how to continue processing
| under every possible error, so to fail and bail is my default
| approach. If I later determine that it's something that can be
| handled to continue processing, then I update that code path to
| handle that case.
|
| Most of my code is web applications, so that is where I'm coming
| from.
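The "code the happy path, fail and bail" approach described above might look roughly like the sketch below. The names `handle_request` and `run_with_global_handler` are hypothetical and not taken from the commenter's code; the status codes simply stand in for whatever a web framework would return.

```python
import logging

log = logging.getLogger("app")


def handle_request(payload: dict) -> str:
    # Happy path only: a missing key simply raises KeyError.
    return f"hello {payload['name']}"


def run_with_global_handler(handler, payload: dict) -> tuple[int, str]:
    """Run a handler; log anything uncaught and bail with a generic error."""
    try:
        return 200, handler(payload)
    except Exception:
        # log.exception records the full stack trace of the failure.
        log.exception("unhandled error while processing request")
        return 500, "internal error"
```

If a specific failure later turns out to be recoverable, only that code path needs its own `try`/`except`; everything else keeps falling through to the outer handler.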
| PaulHoule wrote:
| Hell yeah. I typed in my first C program, a terminal emulator,
| from a 1984 issue of _Byte_ magazine. It was painful seeing how
| 5 lines of real logic were intertwined with 45 lines of error
| handling logic that, in the end, did what exceptions did for
| free -- it was a formative experience for me as a programmer
| and when I saw exceptions in Java in 1995 (still in beta) they
| made me so happy.
|
| In the async case you can pass the Exception as an object as
| opposed to throwing it but you're still left with the issue
| that the failure of one "task" in an asynchronous program can
| cause the failure of a supertask which is comprised of other
| tasks and handling that involves some thinking. That's chess
| whereas the stuff talked about in that article is Tic-Tac-Toe
| in comparison.
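As a minimal sketch of passing the exception as an object rather than throwing it, Python's `asyncio.gather(..., return_exceptions=True)` hands failures back as ordinary values, leaving the supertask to decide what one failed subtask means. The `fetch` and `supertask` coroutines are made up for illustration.

```python
import asyncio


async def fetch(name: str, fail: bool = False) -> str:
    await asyncio.sleep(0)  # stand-in for real I/O
    if fail:
        raise ValueError(f"{name} failed")
    return f"{name} ok"


async def supertask() -> tuple[list[str], list[BaseException]]:
    # With return_exceptions=True, failures come back as values in the
    # result list instead of being raised in the caller.
    results = await asyncio.gather(
        fetch("a"),
        fetch("b", fail=True),
        fetch("c"),
        return_exceptions=True,
    )
    ok = [r for r in results if not isinstance(r, BaseException)]
    errors = [r for r in results if isinstance(r, BaseException)]
    return ok, errors
```

The hard part the comment points at is still open here: whether the supertask should fail, retry, or continue with partial results is a policy decision this sketch does not make.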
| rorylaitila wrote:
| Yeah, I agree in the async case. What I do there is wrap the
| async code in its own global error handler, so to speak.
| That handler is logging to something that the outer process
| can get-last-error from.
|
| But I can get away with this also because I don't write async
| heavy code. My web applications are thread-per-request
| (Java). This fits 99% of the needs of business code, whose
| processing nature is mostly synchronous.
| PaulHoule wrote:
| By and large you want to avoid async if you can help it.
| Sometimes you can't. The struggles Rustifarians have had
| with it are a cautionary tale (the stack and borrow
| checking go together like peanut butter and jelly). I used
| to have a lot of fun writing async Python; friends told me
| I was living dangerously. I finally rewrote my RSS
| reader/bookmark manager/personal web crawler/image sorter
| in sync style so some of it could run in Celery, and because
| any blocking on the CPU on a 16-core machine is too much.
|
| People used to worry about the 10k connection problem but
| machines are bigger now, few services are really that big,
| and fronting with nginx or something like that helps a lot.
| (That image sorter serves images with IIS)
|
| JavaScript is async and you gotta live with it because of
| deployability. No fight with the App Store. No
| installshield engineer. No army of IT people to deploy
| updates. "Just works" on PC, Mac, Linux, tablet, game
| consoles, VR headsets, etc. Kinda sad people are making
| waitlist forms with frameworks that couldn't handle the
| kind of knowledge graph editor and decision support
| applications I was writing in 2006 but that's life.
| bob1029 wrote:
| Exceptions can even work with remote APIs.
|
| If you reach into the enterprise bucket of tricks, technologies
| like WCF/SOAP can propagate these across systems reliably. You
| can even forward the remote stack traces by turning on some
| scary flags in your app.config. When printing the final
| exception using .ToString(), this creates a really magical
| narrative of what the fuck happened.
|
| The entire reason exceptions are good is because of stack
| traces. It is amazing to me how many developers do not
| understand that having a stack trace at the exact instant of a
| bad thing is like having undetectable wall hacks in a
| competitive CS:GO match.
| rorylaitila wrote:
| Yes, I've never quite understood the "But with exceptions
| it's hard to debug why the error occurred after the fact, it's
| better to be explicit in advance" argument. The stack trace
| points exactly to the line. And usually, with the error message
| and context, it's all I need. Maybe I'm missing something that
| someone can inform me about.
| flysand7 wrote:
| Yeah, this kinda becomes a problem when the library you are
| using does not distribute its source code, so even _if_ you
| get the line, this information is practically useless to
| you.
|
| This has been my biggest problem with exceptions: first, for
| the reason outlined above, and second, for how much time you
| actually end up spending on figuring out what the exception
| for a certain situation is. "Oh you're making a database
| insertion, what's the error that's thrown if you get a
| constraint violation, I might want to handle that". And
| then it's all an adventure, because there's no way to know
| in advance. If the docs are good it's in the docs,
| otherwise "just try it" seems to be the way to do it.
| rorylaitila wrote:
| Yeah I agree with that, opaque errors from libraries are
| where this really sucks. The worst is when they swallow
| the original error and throw a generic exception instead.
| 9rx wrote:
| _> It is amazing to me how many developers do not understand
| that having a stack trace at the exact instant of a bad thing
| is like having undetectable wall hacks in a competitive CS:GO
| match._
|
| Who doesn't understand that? If you aren't using exceptions
| you are using wrapping instead, and said wrapping is merely
| an alternative representation of what is ultimately the very
| same thing. This idea isn't lost on anyone, even if they
| don't use the call stack explicitly.
|
| The benefit of wrapping over exceptions[1] is that each layer
| of the stack gains additional metadata to provide context
| around the whole execution. The tradeoff is that you need
| code at each layer in the stack to assign the metadata
| instead of being able to prepare the data structure all in
| one place at the point of instantiation.
|
| [1] Technically you could wrap exceptions in exceptions, of
| course. This binary statement isn't quite right, but as
| exceptions have proven to be useless if you find yourself
| ending up here, with two stacks offering the same
| information, we will assume for the sake of discussion that
| the division is binary.
| groestl wrote:
| One could say the whole point of wrapping exceptions is to
| add additional metadata _if such data is available_.
| Otherwise, the most basic metadata is tracked
| automatically: stack locations.
| 9rx wrote:
| Technically, the actual whole point of wrapping is to
| avoid leaking implementation details. If you let
| "FooLibraryException" bubble up, and then you stop using
| Foo Library, then all of the users of your code are going
| to end up broken waiting for "FooLibraryException" when
| now you throw "BarLibraryException". This diminishes any
| value exception handlers theoretically could provide
| since you end up having to wrap everything at each step
| anyway.
|
| Checked exceptions were introduced to try to help with
| that problem, giving you at least a compiler error if an
| implementation changed from underneath you. But that
| comes with its own set of problems and at this point most
| consider it to be a bad idea.
|
| Of course, many just throw caution to the wind and don't
| consider the future, believing they'll have moved on by
| then and it will be the next programmer's problem. Given
| the context of discussion, we have assumed that is the
| case.
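One way to wrap without discarding the original error is Python's exception chaining (`raise ... from ...`). In the sketch below, `FooLibraryError`, `StorageError`, `foo_insert` and `save` are all hypothetical names; the "library" is a stub so the example is self-contained.

```python
class FooLibraryError(Exception):
    """Stand-in for an exception type owned by a third-party library."""


class StorageError(Exception):
    """Error type owned by our own API, stable across library swaps."""


def foo_insert(row: dict) -> None:
    # Pretend the underlying library rejected the insert.
    raise FooLibraryError("unique constraint violated on 'id'")


def save(row: dict) -> None:
    try:
        foo_insert(row)
    except FooLibraryError as exc:
        # Callers only ever see StorageError; the original error and its
        # traceback stay reachable through __cause__.
        raise StorageError(f"could not save row {row['id']}") from exc
```

Callers catch `StorageError` regardless of which library sits underneath, and a printed traceback still shows the library error as the direct cause, so the wrapper adds context instead of swallowing it.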
| PaulKeeble wrote:
| Agreed and even more heretical from me is that I quite like
| declared exceptions. It makes the interface of a method clear
| in all the ways it can fail and you can directly choose what to
| handle often without having to look at the docs to work out
| what they mean, because the names tell you what you need to
| know. You can ignore them and rethrow or catch them globally,
| but you can also handle them.
|
| Having used Go for years now, frankly I prefer exceptions. Way
| too often there is nothing that can be done about an error
| locally, but it produces noise and if branches all over the code
| base. It's even worse to add an error later to a method than
| in Java, because every method has to have code added, not just a
| signature change. I really miss stack traces, and the current
| state of the art in Go has us writing code to produce them in
| every method.
| bigstrat2003 wrote:
| Yep, checked exceptions are the shit. You can of course abuse
| them to create a monstrosity (as you can with anything), but
| when used responsibly I think they are by far the best error
| handling paradigm.
| o11c wrote:
| The problem with `getlasterror` and `errno` is that they're
| global (thread-local, whatever).
|
| But if you make them take a `context` object, there's no longer
| a problem.
|
| One interesting observation - you can use them even for the
| initial "failed to allocate a context" by interpreting a NULL
| pointer as always containing an "out of memory" error.
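The comment is about C-style APIs, but the shape of the idea can be sketched in Python, with `None` playing the role of the NULL pointer. `Context`, `last_error` and `parse_port` are hypothetical names invented for this illustration.

```python
from typing import Optional

OUT_OF_MEMORY = "out of memory"


class Context:
    """Holds the last error for one logical session instead of a global."""

    def __init__(self) -> None:
        self.last_error: Optional[str] = None


def last_error(ctx: Optional[Context]) -> Optional[str]:
    # A missing context is read as "allocating the context failed".
    if ctx is None:
        return OUT_OF_MEMORY
    return ctx.last_error


def parse_port(ctx: Context, text: str) -> int:
    ctx.last_error = None
    if not text.isdigit():
        ctx.last_error = f"not a port number: {text!r}"
        return -1
    return int(text)
```

Because the error lives on the context object, two contexts in the same thread cannot overwrite each other's error state, which is the problem with a global or thread-local slot.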
| b0a04gl wrote:
| in async code, errors belong to the task, not the caller.
|
| in sync code, the caller owns the stack, so it makes sense they
| own the error. but async splits that. now each async function
| runs like a background job. that job should handle its own
| failure (retry, fallback, log) because the caller usually can't
| do much anyway.
|
| write async blocks like isolated tasks. contain errors inside
| unless the caller has a real decision to make. global error
| handler picks up the rest
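A task that contains its own failures (retry, then fallback, logging along the way) could be sketched as follows. The `flaky` coroutine and the `attempts` list exist only to simulate a transient error; none of these names come from the comment.

```python
import asyncio
import logging

log = logging.getLogger("tasks")


async def flaky(attempts: list[int]) -> str:
    # Simulated operation that fails on its first two calls.
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("temporary failure")
    return "fresh value"


async def self_contained_task(
    attempts: list[int], retries: int = 3, fallback: str = "cached value"
) -> str:
    for attempt in range(1, retries + 1):
        try:
            return await flaky(attempts)
        except ConnectionError as exc:
            log.warning("attempt %d failed: %s", attempt, exc)
    # Retries exhausted: degrade instead of propagating to the caller.
    return fallback
```

Only `ConnectionError` is contained here; anything unexpected still escapes to whatever global handler sits above the task.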
| EGreg wrote:
| well, that's partially true
|
| the caller is itself a task / actor
|
| the thing is that the caller might want to rollback what
| they're doing based on whether the subtask was rolled back...
| and so on, backtracking as far as needed
|
| ideally all the side effects should be queued up and executed
| at the end only, after your caller has successfully heard back
| from all the subtasks
|
| for example... don't commit DB transactions, send out emails or
| post transactions onto a blockchain until you know everything
| went through. Exceptions mean rollback, a lot of the time.
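A rough sketch of queuing side effects and running them only after every subtask has reported back. The "emails" are plain strings appended to a list, and `subtask` and `run_all` are hypothetical names.

```python
import asyncio
from typing import Callable


async def subtask(
    name: str,
    effects: list[Callable[[], None]],
    sent: list[str],
    fail: bool = False,
) -> None:
    await asyncio.sleep(0)  # stand-in for real work
    if fail:
        raise RuntimeError(f"{name} failed")
    # Queue the side effect instead of performing it now.
    effects.append(lambda: sent.append(f"email for {name}"))


async def run_all(fail_second: bool) -> list[str]:
    sent: list[str] = []
    effects: list[Callable[[], None]] = []
    try:
        await asyncio.gather(
            subtask("order", effects, sent),
            subtask("invoice", effects, sent, fail=fail_second),
        )
    except RuntimeError:
        return sent  # "rollback": the queued effects are never run
    for effect in effects:
        effect()
    return sent
```

This only defers the point of no return. If running the queued effects can itself fail partway through, that still needs its own handling.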
|
| on the other hand, "after" hooks are supposed to happen after a
| task completes fully, and their failure shouldn't make the task
| rollback anything. For really frequent events, you might want
| to debounce, as happens for example with browser "scroll" event
| listeners, which can't preventDefault anymore unless you set
| them with {passive: false}!
|
| PS: To keep things simple, consider using single-threaded
| applications. I especially like PHP, because it's not only
| single-threaded but it actually is shared-nothing. As soon as
| your request handling ends, the memory is released. Unlike
| Node.js you don't worry about leaking memory or secrets between
| requests. But whether you use PHP or Node.js you are
| essentially running on a single thread, and that means you can
| write code that is basically sequentially doing tasks one after
| the other. If you need to fan out and do a few things at a
| time, you can do it with Node.js's Promise.all(), while with
| PHP you kind of queue up a bunch of closures and then
| explicitly batch-execute with e.g. curl_multi_ methods. Either
| way ... you'll need to explicitly write your commit logic in
| the end, e.g. on PHP's "shutdown handler", and your database
| can help you isolate your transactions with COMMIT or ROLLBACK.
|
| If you organize your entire code base around dispatching events
| instead of calling functions, as I did, then you can easily
| refactor it to do things like microservices at scale by using
| signed HTTPS requests as a transport (so you can isolate
| secrets, credentials, etc.) from the web server:
| https://github.com/Qbix/Platform/commit/a4885f1b94cab5d83aeb...
| jpc0 wrote:
| I liked where you started.
|
| Any async operation, whether using coroutines or event-based
| actors or whatever else, should be modelled as a network call.
|
| You need a handle that will contain information about the
| async call and will own the work it performs. You can have an
| API that explicitly says "I don't care what happens to this
| thing just that it happens" and will crash on failure. Or you
| can handle its errors if there are any and importantly decide
| how to handle those errors.
|
| Oh and failing to allocate/create that handle should be a
| breach of invariants and immediately crash.
|
| That way you have all the control and flexibility and Async
| error handling becomes trivial, you can use whatever async
| pattern you want to manage async operations at that point as
| well.
|
| And you also know you have fundamentally done something
| expensive in latency for the benefit of performance or access
| to information, because if it was cheap you would have just
| done it on the thread you are already using.
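The "handle that owns the work" idea maps loosely onto a future. In this sketch, using Python's `concurrent.futures`, the caller either lets a failure propagate or decides how to handle it. `risky`, `must_succeed` and `result_or_default` are illustrative names, and the 5-second timeout is an arbitrary assumption.

```python
from concurrent.futures import Future, ThreadPoolExecutor


def risky(n: int) -> int:
    if n < 0:
        raise ValueError("negative input")
    return n * 2


def must_succeed(handle: Future) -> int:
    # "I only care that it happens": any failure is re-raised here.
    return handle.result(timeout=5)


def result_or_default(handle: Future, default: int) -> int:
    # The caller decides how to handle this particular error.
    try:
        return handle.result(timeout=5)
    except ValueError:
        return default
```

Usage would be along the lines of `result_or_default(pool.submit(risky, -1), 0)` with `pool` being a `ThreadPoolExecutor`; the handle returned by `submit` carries both the result and any exception raised by the work.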
| renox wrote:
| > for example... don't commit DB transactions, send out
| emails or post transactions onto a blockchain until you know
| everything went through. Exceptions mean rollback, a lot of
| the time.
|
| But what if you need to send emails AND record it in a DB?
| lelanthran wrote:
| I had the same question, actually; it is _very_ common to
| perform multiple point-of-no-return IO in a workflow, so
| deferring all IO into a specific spot does not, in
| practice, bring any advantages.
| quietbritishjim wrote:
| Structured concurrency [1] solves the issue of task (and
| exception) ownership. In languages / libraries that support it,
| when spawning a task you must specify some enclosing block that
| owns it. That block, called a nursery or task group, can be a
| long way outside the point where the task is spawned because
| the nursery is an object in its own right, so it can be passed
| into a function which can then call its start() method. All
| errors are handled at the nursery level.
|
| They were introduced in the Trio library [2] for Python, but
| they're now also supported by Python's built in asyncio module
| [3]. I believe the idea has spread to other languages too.
|
| [1] https://vorpus.org/blog/notes-on-structured-concurrency-or-g...
|
| [2] https://trio.readthedocs.io/en/stable/
|
| [3] https://docs.python.org/3/library/asyncio-task.html#task-gro...
| RS-232 wrote:
| Async anything is hard!
| innocentoldguy wrote:
| One of the things I love most about Elixir is that it makes
| asynchronous error handling easier than any other language I've
| used. Asynchronous code used to be the source of many difficult
| bugs in the teams I've worked with, but Elixir's (or, more
| accurately, Erlang's) "let it crash" architecture helps eliminate
| many of these issues.
___________________________________________________________________
(page generated 2025-06-30 23:01 UTC)