[HN Gopher] How Python asyncio works: recreating it from scratch
___________________________________________________________________
How Python asyncio works: recreating it from scratch
Author : jpjacobpadilla
Score : 232 points
Date : 2024-05-07 00:50 UTC (22 hours ago)
(HTM) web link (jacobpadilla.com)
(TXT) w3m dump (jacobpadilla.com)
| throwawaymaths wrote:
| Completely insane that python decided to hijack "def" to be used
| to create an object that very much is not a function. At the
| least they could have gone was create a different keyword
| maxbond wrote:
| It's a function, it's just been transformed into a function
| with a different return type (that of a generator or coroutine,
| you can see the type signatures here [1]). You could do
| something similar in vanilla Python rather than language-level
| sugar, eg using decorators.
|
| [1]
| https://docs.python.org/3/library/typing.html#typing.Generat...
|
| As a caveat, I point out the type signature for academic
| interest, I prefer to use the simpler Iterable and Awaitable
| types in type signatures.
| xmcqdpt2 wrote:
| I think the type signature and example makes the parent's
| point, frankly. It has three types, on of which is
| contravariant, two different usage of the yield keyword, and
| two exit points.
|
| Now granted most uses of generators are simpler than that,
| but still that's a lot of special syntax. And if you are
| writing Python without type-checking, that's a lot of
| bookkeeping you'd have to do when reading code.
| maxbond wrote:
| That type signature is gnarly, I agree. I've seen people
| get tripped up because they think the function _is_ the
| generator, and are confused about what behavior to expect
| from concurrent readers of different instances of the
| generator (because they don 't realize that they are
| different instances in the first place). So my position is
| more about how to work with Python effectively/understand
| it accurately than in defense of it's syntax.
| throwaway81523 wrote:
| They somewhat fixed that with "async def" but the whole thing
| is a huge tragedy anyway.
| cuu508 wrote:
| Do you mean they fixed the hijacking by adding the "async"
| keyword?
|
| By the whole thing do you mean the async functionality in
| Python in general? What is the tragedy?
| jononomo wrote:
| I'm also curious to know what the tragedy is.
| darthrupert wrote:
| Dunno if they meant this, but I feel that adding async to
| a language/ecosystem where it was earlier not used makes
| everything awkward for a long time. Especially if it's
| done liberally like python did, with both sync and async
| styles still supported.
|
| It feels like the worst of both worlds at times. I think
| Rust suffers from this too, whereas Node.js has managed
| to make async the default mode of operation for enough
| things that it doesn't feel so bad. Especially with
| Typescript.
| throwaway81523 wrote:
| Obligatory video, NSFW (lots of swearing):
|
| https://www.youtube.com/watch?v=bzkRVzciAZg
|
| "Node.js is bad ass rock star tech".
| jononomo wrote:
| I actually learned a lot from watching that video.
| andrewstuart wrote:
| >> whereas Node.js has managed to make async the default
| mode of operation for enough things that it doesn't feel
| so bad
|
| Javascript is inherently async
| throwaway81523 wrote:
| Yes, I mean the async keyword fixes the hijacking although
| the hijacked version is still there for compatibility. So
| much for Python 3 breaking stuff for the purpose of fixing
| old cruft.
|
| The tragedy is the whole Python 3 debacle, and if they were
| going to make a break like that, they should have
| introduced BEAM-like concurrency instead of async.
|
| For reasons I don't understand, the Python community seems
| to have an intense phobia and hatred towards threads. I've
| always used threads instead of async though, and by writing
| in an Erlang-like style (message passing and no shared
| mutable data) you can keep things reliable. This has worked
| ok for me with 1000s of threads, though not millions or
| anything like that.
|
| Python by now is mostly legacy cruft, it feels to me
| sometimes. Everything else of course has its own problems.
| bormaj wrote:
| As someone who started learning to code with C++ followed
| by Python and then Rust, proper threading approaches
| weren't drilled into my head until I started working with
| Rust. I agree with you that threading with message
| passing is a safer approach. Threading does add a level
| of complexity that isn't always necessary.
|
| AFAIK it's not outlined in a PEP, but my intuition is
| that Python was designed to be accessible and "simple".
| It wasn't built to be a systems programming language. So
| for all its faults with the GIL I don't think it should
| be a major point of contention. For any task it's always
| about choosing the right tool for the job.
|
| There is considerable effort at the moment to remove the
| GIL in future versions and implement better threading
| support. At 3.11/3.12, it has come a long way in terms of
| performance.
| hot_gril wrote:
| The problem with threads in Python is they're 1:1 with OS
| threads in every major implementation. It's not like
| Golang where you have greenthreading, and you don't even
| get all the benefits of regular threads due to the GIL.
| It's been too long since I've used Erlang, but the way I
| remember it, the concurrency is almost like using RPCs.
| Which is great in some applications, but Python users
| want shared mutable data for simple things and will make
| microservices for larger systems.
|
| For a simple language, Python has so many different ways
| to do concurrency, some of which are misleading to new
| users. Event loop ("async") would've been nice and simple
| from the start, but they were late to that party.
| Everything vaguely related to packaging is also
| confusing, from the __init__.py files to the
| PIP/Conda/whatever stuff. They had a whole breaking
| Py2->3 transition and still didn't fix that.
| globular-toast wrote:
| What's the actual problem with it? Or is this just a language
| "purity" thing?
| zo1 wrote:
| Are you talking about decorators? Very confused trying to parse
| what you're saying in the context of python.
| cqqxo4zV46cp wrote:
| What? There is no tragedy here. It sounds like it just put DX
| semantics ahead of...what? some sort of language purity? I
| can't say that I've ever thought "gosh! that async function
| is...an async function, and that's tripped me up!" before. Not
| even once.
| hot_gril wrote:
| The bigger problem with Python to begin with was that you have
| to use def for functions, meaning you can't inline them like
| with JS arrow funcs or C++ lambdas. And this isn't a language
| purity thing, it's just inconvenient.
| krackers wrote:
| This seems to "busy wait" when sleeping, i.e. the event loop
| keeps running even if nothing is currently runnable. I remember
| reading about another toy implementation which handled it
| correctly (the way I believe it's actually done in asyncio) by
| tracking the next-runnable times of a task in sorted order, and
| if nothing is currently runnable you can sleep the event loop.
| And then this was extended so that instead of just next-runnable
| depending on wall-clock time, the task could have a dependency on
| a socket or something, so that you can use select w/timeout.
| chippiewill wrote:
| In asyncio itself you can have custom event loop
| implementations in addition to the out of the box one.
|
| One popular implementation is uvloop
| (https://github.com/MagicStack/uvloop) which basically just
| implements the loop using libuv which takes care of doing stuff
| like `select` as you describe.
| LtWorf wrote:
| I've encountered bugs with it. Unix sockets not being created
| at all. So I went back to the regular loop.
| theelous3 wrote:
| There's nothing really wrong with this on one level. Your event
| loop doesn't even have to loop, it can just start by running
| your main and end when that ends. Imagine starting a server
| which has a while true loop that waits over a socket, and on
| some shutdown condition or interrupt ends, ending the program.
| No busy waiting, no messing with sleeps or sockets from the
| event loops perspective. Run until complete vs run forever.
|
| Better off, if making some toy loop, to not bother with the
| forever case.
| coolThingsFirst wrote:
| Whats the best book to learn about things like these?
| kodablah wrote:
| Asyncio allows you to replace the event loop with an
| implementation of your own. For Temporal Python we represent
| workflows as a custom, durable asyncio event loops so things like
| asyncio.sleep are durable timers (i.e. code can resume on another
| machine, so you can sleep for weeks). Here is a post explaining
| how it's done: https://temporal.io/blog/durable-distributed-
| asyncio-event-l....
|
| The biggest problem with asyncio is how easy/common it is in
| Python to be able to block the asyncio thread with synchronous
| calls, gumming up the whole system. Python sorely needs a static
| analysis tool that can build a call graph to help detect if a
| known thread-blocking call is called directly or directly from an
| async def.
| guappa wrote:
| Good luck with that. A simple read() might block or not,
| depending on what the descriptor is and how it's configured.
| How would you statically detect this?
| kodablah wrote:
| Like many static analyzers in ambiguous situations, you have
| to decide whether the possibility of false positives is worth
| the safety. In this case I'd flag it knowing that it could be
| marked ignored as needed (same with open() if it was used).
|
| I have toyed with a MyPy plugin to do this, but the arbitrary
| state storage/cache is a bit limited to do good call graph
| and function coloring (metadata can be on types but not
| functions). And there's just not many other good _extensible_
| static analysis libraries. Language like Go have static
| analysis libraries that let you store/cache arbitrary
| state/facts on all constructs.
| guappa wrote:
| You must also detect "too long loops"... which is
| dangerously close to solving the halting problem.
| kodablah wrote:
| This and other similar CPU-bound operations may be
| unreasonable to detect at analysis time. I would
| encourage users to run with asyncio debug mode in tests
| to help catch these:
| https://docs.python.org/3/library/asyncio-dev.html#debug-
| mod.... Granted that's runtime, but there are limitations
| on what analysis can perform, it's just meant to help
| catch obvious issues.
| chippiewill wrote:
| Temporal is seriously cool.
|
| When I found out how it had implemented the asyncio event loop
| it was a real mind blown moment.
| hot_gril wrote:
| Maybe this is a bad idea, but... What if, instead of the
| current way where every call (including native stuff) is sync
| by default, it were the other way around? You'd quickly
| whitelist basic things like arithmetic and data struct access
| as sync, then you could maybe detect other things that should
| be sync if the event loop is spinning suspiciously fast.
| tda wrote:
| There is a brilliant talk by David Beazley on asyncio. I used it
| to write a Discrete Even Simulation tool. Pretty cool to be able
| to make your own implementation of asyncio and replace the system
| clock with a simulated time
| JosephRedfern wrote:
| Link to David Beazley's asyncio talk:
| https://youtube.com/watch?v=Y4Gt3Xjd7G8
| theelous3 wrote:
| dabeaz's async talks are imo the most important things you
| can consume to understand events loops and async, followed
| closely by njsmith's analysis and criticism of different
| methods of async invocation and call structures:
| https://vorpus.org/blog/notes-on-structured-concurrency-
| or-g...
| guappa wrote:
| No mention of poll()? It's not at all how asyncio works then.
| xmcqdpt2 wrote:
| I would have preferred if the article did the same without using
| yield at all, because frankly that's where I feel the real magic
| is.
|
| See here for a much more in-depth description of coroutines in
| python,
|
| https://aosabook.org/en/500L/a-web-crawler-with-asyncio-coro...
| tusharsadhwani wrote:
| This is brilliant stuff, a very nice high level explanation that
| skips everything that would make it too boring for a first time
| reader.
|
| Adding other resources at the end which explain how it _really_
| works under the hood would be great.
| goon12 wrote:
| This is a similar article that I thought was pretty good:
| https://tenthousandmeters.com/blog/python-behind-the-scenes-...
___________________________________________________________________
(page generated 2024-05-07 23:01 UTC)