[HN Gopher] Python Asyncio
___________________________________________________________________
Python Asyncio
Author : simonpure
Score : 192 points
Date : 2022-11-10 14:54 UTC (8 hours ago)
(HTM) web link (superfastpython.com)
(TXT) w3m dump (superfastpython.com)
| PointyFluff wrote:
| Why do you inject quotes randomly throughout the
| document?
|
| Why is only 1/3 of the screen used for content?
| nineplay wrote:
| Not mentioned: The behavior is different between python 3.6 and
| 3.8. That was fun to debug.
|
| Python is simply not the right language for asynchronous
| programming. Things that are easy in many other languages are
| hard and don't work as you'd expect. If you are in a python-only
| shop, brush up on your presentation skills and see if you can
| convince them to branch out.
|
| Or write a shell script. You're better off with "./read_socket.sh
| &" than you are with any python library.
| brrrrrm wrote:
| All of distributed machine learning would benefit from async,
| but it's unfortunately a Python-only ecosystem for the time
| being. Probably gonna be a couple years before that has a
| chance of changing
| nine_k wrote:
| How would ML benefit from async? I thought everything
| interesting in things like TF or PyTorch is done by native
| code which is free to use multithreading and anything at all.
|
| (I know little about ML.)
| brrrrrm wrote:
| Specifically distributed would be helped (multiple nodes).
|
| Right now the typical paradigm is this really clunky lock-
| step across nodes waiting for each other (everything is
| blocking). If your hardware is non-homogeneous (both
| interconnects _and_ accelerators) or your program needs to
| be run in a pipelined way, you're fighting an extremely
| uphill battle.
|
| Go, JavaScript etc are the usual languages of choice for
| this type of stuff because it's easy to express non-
| blocking code.
| nine_k wrote:
| I suspect multiprocessing + shared memory could help
| this. The stdlib has provisions for both, but a
| coordination layer is needed.
|
| That would be closest to true multithreading.
| brrrrrm wrote:
| it's not a matter of performance but expressibility.
| const y = await remote_model_a(x) // different machine
| const z = await remote_model_b(y) // different machine
| await z.backward()
|
| is trivially pipelined when run in parallel. With
| multithreading suddenly the backing C++ library has to be
| aware of this and figure things out for you
| dwrodri wrote:
| This isn't necessarily a direct reply but instead another
| anecdote which reinforces OP's claim that asynchronous
| programming is a thorn in the side of Python.
|
| I was tasked with improving a performance-critical object
| detection codebase for video that was written in Python. Before
| attempting a complete rewrite, I identified that outside of FFI
| calls related to PyTorch/OpenCV, some of the biggest latency
| overhead was due to the fact that the network I/O for fetching
| data to process was blocking the actual inference, so I tried
| to decouple the two so they could happen concurrently.
|
| My attempt at doing this with asyncio went horrendously at
| best. There were no mature client libraries for the cloud
| services we used at the time. I was digging through PyTorch
| forums to figure out the relationship between the GIL and CUDA
| Streams. Boundless headaches.
|
| Then I realized, I could just separate the code that loaded the
| data onto the machine from the inference code into two separate
| code bases and just have them communicate over really simple
| signals on a UNIX socket. Two different processes, sending a
| simple fixed-size string back and forth, and then I just used
| multiprocessing.Pool for parallelizing the downloads.
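A stripped-down sketch of that two-process pattern (all names are invented stand-ins, not the poster's code): a loader process parallelizes downloads with multiprocessing.Pool and announces each finished chunk over a UNIX socket with a fixed-size message; the fork start method is assumed so the socket can be inherited.

```python
import multiprocessing
import socket

MSG_SIZE = 8  # fixed-size signal messages keep framing trivial


def fake_download(url):
    # Stand-in for fetching one chunk of video from the network.
    return f"data:{url}"


def recv_exact(sock, n):
    # SOCK_STREAM can split or merge writes; read exactly n bytes.
    buf = b""
    while len(buf) < n:
        buf += sock.recv(n - len(buf))
    return buf


def loader(sock, urls):
    # Parallelize downloads, signal the other process per chunk.
    with multiprocessing.Pool(2) as pool:
        for _chunk in pool.imap(fake_download, urls):
            sock.sendall(b"READY".ljust(MSG_SIZE))
    sock.sendall(b"DONE".ljust(MSG_SIZE))
    sock.close()


def inference(sock):
    # Placeholder for the inference side: count announced chunks.
    handled = 0
    while recv_exact(sock, MSG_SIZE).rstrip() != b"DONE":
        handled += 1
    return handled


if __name__ == "__main__":
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    ctx = multiprocessing.get_context("fork")  # sockets inherited across fork
    p = ctx.Process(target=loader, args=(a, ["u1", "u2", "u3"]))
    p.start()
    print(inference(b))
    p.join()
```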
| [deleted]
| ollien wrote:
| What changed between those two releases? 3.6 was the last time
| I worked with asyncio
| [deleted]
| nineplay wrote:
| Some of it is here
|
| https://tryexceptpass.org/article/asyncio-in-37/
|
| One of my ongoing frustrations with python isn't just the lack
| of backwards compatibility but that if you inherit a script
| there's no way to know which version of the language it was
| written for. If you're lucky you can track down the
| developers, but IME even they often say "I'm not sure, let me
| run python --version".
| gmadsen wrote:
| That is an absurd exaggeration, or you don't realize how many
| low-level socket issues are abstracted away.
| nineplay wrote:
| To follow up, the contents of read_socket.sh
|
| ---
|
| #!/bin/bash
|
| python3.8 read_one_socket.py $1
|
| ---
|
| Use the python socket libraries if you must, but don't use
| python asynchronously.
| packetlost wrote:
| The asyncio module was heavily refactored in 3.7, and the
| async/await syntax (introduced in 3.5) became proper reserved
| keywords, a *huge* improvement over the 3.6-and-below
| implementation.
|
| I don't agree with this, but I will concede that async Python
| is rarely the correct choice.
| est wrote:
| Missing yet very important topics: redis & db drivers in async.
| Or even async ORM.
| [deleted]
| m3047 wrote:
| I do Redis in a thread pool. Did this before aioredis came out;
| I've kept doing it because I know it and it works. I've used
| the same pattern for a few other things, it works so well.
| ok_dad wrote:
| The only thing I ever use async in Python for is batch requests,
| like hundreds or thousands of little requests. I collect them up
| in a task list with context, run them with one of the "run all of
| these and return the results or errors" functions in the library,
| then process the results serially.
|
| For anything that doesn't fit this simple IO pattern,
| like CPU-bound workers, I prefer to use processes or multiple
| copies of the app synchronized with a task queue.
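The "run all of these and return the results or errors" step described above maps onto asyncio.gather with return_exceptions=True; a minimal sketch, with an invented stand-in for one small request:

```python
import asyncio


async def little_request(i):
    # Hypothetical stand-in for one small HTTP call.
    if i == 2:
        raise ValueError("bad request")
    await asyncio.sleep(0.01)
    return i


async def main():
    tasks = [little_request(i) for i in range(4)]
    # return_exceptions=True places errors in the result list
    # instead of raising, so results can be processed serially.
    return await asyncio.gather(*tasks, return_exceptions=True)


results = asyncio.run(main())
print(results)
```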
| samsquire wrote:
| I hope to learn how to use async code more effectively.
| Coroutines are very interesting. Structured concurrency is very
| useful in defining and understanding concurrency. I wrote a
| multithreaded userspace scheduler in Java, C, and Rust which
| multiplexes lightweight threads over kernel threads. It is a
| 1:M:N scheduler with 1 scheduler thread, M kernel threads, and N
| lightweight threads. This is similar to golang, which is P:M:N
| https://GitHub.com/samsquire/preemptible-thread
|
| I am deeply interested in parallel and asynchronous code. I write
| about it on my journal (link in my profile)
|
| I am curious if anybody has any ideas on how you would build an
| interpreter that is multithreaded - with each interpreter running
| in its own thread and objects being sent between threads
| without copying or marshalling. I think Java does it but I have
| yet to ask how it does it. Maybe I'll ask Stack Overflow.
|
| I wrote a parallel imaginary assembly interpreter that is backed
| by an actor framework which can send and receive messages in
| mailboxes.
|
| Here's some code:
|
|         threads 25
|         <start>
|         mailbox numbers
|         mailbox methods
|         set running 1
|         set current_thread 0
|         set received_value 0
|         set current 1
|         set increment 1
|         :while1
|         while running :end
|         receive numbers received_value :send
|         receivecode methods :send
|         :send
|         add received_value current
|         addv current_thread 1
|         modulo current_thread 25
|         send numbers current_thread increment :while1
|         sendcode methods current_thread :print
|         endwhile :while1
|         jump :end
|         :print
|         println current
|         return
|         :end
|
| This is 25 threads that each send integers to each other as fast
| as they can. The sendcode instruction can cause the other thread
| to run some code. It can get up to 1.7 million requests per
| second without the sendcode and receivecode. With method sending
| it gets ~600,000 requests per second
| xwowsersx wrote:
| I'm not in much of a position to evaluate Python's asyncio
| since I really have not used it very much. However, over the last
| few days, I started to dig into it and tried to get some (what I
| think are) very basic examples working and really struggled. That
| alone is not fully dispositive, because I've used many async
| implementations in various languages and each of them has a bit
| of a learning curve and its own wrinkles you have to ramp up
| on. That said, my limited experience at least tells me that
| Python's async has a UX that leaves a lot to be desired. I've
| found other languages make it a lot easier to do concurrency,
| parallelism, async vs sync, blocking vs non-blocking and all
| that. I don't really know if the issue is poor documentation, the
| semantics of the APIs themselves...really not sure. I do know
| that I'm left with the feeling that "this seems like it's going
| to be a pretty big PITA".
| matsemann wrote:
| Asyncio is annoying, and often unexpectedly slow. You think
| things are parallel, but then one misbehaving coroutine can hog
| your CPU, bringing everything to a halt. The GIL makes it useless
| for anything other than _heavily_ IO-bound tasks.
|
| And yeah, Python's documentation is useless. Never _how_ to use
| stuff, only a listing of everything that's possible to do / the
| API. Unfortunately that style is being mimicked by most other
| Python projects as well.
| rpep wrote:
| I think the big point of interest will come with Python 3.12,
| where it's proposed (PEP 684, part of the faster CPython
| effort) that the per-process GIL become a per-interpreter
| GIL.
| andrewstuart wrote:
| It IS easy.
|
| The Python project just doesn't make it obvious how to do it
| easily.
| btilly wrote:
| It is easy if you know what to do. But you'll CONSTANTLY run
| into things where it is obvious that this is a toy that
| hasn't been thought through.
|
| I've found random slowdowns of a factor of 10 for no reason.
|
| A context manager needs to have __enter__ and __exit__
| methods. An async context manager needs to have __aenter__
| and __aexit__ methods. The error you get if you use a context
| manager as an async context manager or vice versa is big and
| scary and offers no clue to most programmers of what the
| problem is.
|
| Using async with SQLAlchemy it isn't hard to wind up exiting
| the event loop before all cleanup has happened. They fixed
| the main cause of this, but it isn't hard to still trigger
| it.
|
| And so on.
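A minimal sketch of the two context manager protocols described above (class names invented), showing how mixing them fails at runtime with a version-dependent error:

```python
import asyncio


class SyncCM:
    def __enter__(self):
        return "sync"

    def __exit__(self, exc_type, exc, tb):
        return False


class AsyncCM:
    async def __aenter__(self):
        return "async"

    async def __aexit__(self, exc_type, exc, tb):
        return False


async def main():
    async with AsyncCM() as v:
        print(v)
    # Using the sync protocol where the async one is required fails
    # at runtime; the exception type (TypeError or AttributeError)
    # depends on the Python version and is opaque to many readers.
    try:
        async with SyncCM():
            pass
    except (TypeError, AttributeError) as e:
        print("mixing fails:", type(e).__name__)


asyncio.run(main())
```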
| xwowsersx wrote:
| Right. So you think it's mainly an education/documentation
| issue? I'm totally open to that in fact being the case. One
| really simple use case I had was just annotating a method to
| run async and then calling that in a loop elsewhere which did
| speed up a bunch of work I was doing pretty dramatically.
|         import asyncio
|
|         def background(f):
|             def wrapped(*args, **kwargs):
|                 return asyncio.get_event_loop().run_in_executor(
|                     None, f, *args, **kwargs)
|             return wrapped
|
|         @background
|         def my_io_bound_function(): ...
| m3047 wrote:
| You need to run the executor somewhere ( loop.run_forever()
| or loop.run_until_complete() ); that can be in the current
| thread or a separate thread. Keep in mind Python is still
| conceptually single-core.
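A runnable shape of that, assuming the loop's default thread-pool executor:

```python
import asyncio
import time


def blocking_io():
    # Simulates a blocking call (e.g. a sync HTTP client).
    time.sleep(0.05)
    return 42


async def main():
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default ThreadPoolExecutor.
    return await loop.run_in_executor(None, blocking_io)


# asyncio.run() drives the loop (run_until_complete under the hood).
print(asyncio.run(main()))
```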
|
| Things I've found particularly useful:
|
| * Optional / debug logging of coroutine start/exit.
|
| * Stats logging, including count, runtime, request queue
| (multiple instances of the same function) depth.
|
| * Printing tracebacks within the coroutine context.
|
| Haven't expended any effort on decorators, good for you.
| xwowsersx wrote:
| Those do sound useful, thanks for this. Will keep in mind
| as I continue exploring.
| m3047 wrote:
| I think I hit a foul ball there, as run_in_executor() is
| for a threadpool. So yeah, you are running inside of
| run_forever() or run_until_complete(), we just don't see
| it here, and that's where you're sending things to run in a
| different thread. ;-)
| xwowsersx wrote:
| Ah got it, thanks for the correction and info.
| harpiaharpyja wrote:
| I don't know why curio hasn't seen greater adoption. Having used
| both, it seems superior to asyncio in most respects.
| jmatthews wrote:
| Just to swim upstream. For http requests and bounded IO I've
| found asyncio to be straightforward and a game changer. In the
| context of call an endpoint with a data payload and have the
| endpoint process it with outbound http calls it is a 10x'er for
| very little complexity and no external libraries.
| bityard wrote:
| I bet the actual content here is great, but I find the page to be
| unreadable with the large text, huge amounts of whitespace, and
| quotes (with Amazon referral links) sprinkled in between every
| other sentence.
| mpeg wrote:
| In case the author reads this, there is an error in the "How to
| Execute a Blocking I/O or CPU-bound Function in Asyncio?" [0]
|
| It reads:
|
| > The asyncio.to_thread() function creates a ThreadPoolExecutor
| behind the scenes to execute blocking calls.
|
| > As such, the asyncio.to_thread() function is only appropriate
| for IO-bound tasks.
|
| It should say it's only appropriate for CPU-bound tasks.
|
| [0]: https://superfastpython.com/python-
| asyncio/#How_to_Execute_a...
| wrigby wrote:
| I think the article is correct here, actually - if you need to
| run a CPU-bound task, you'll need a ProcessPoolExecutor.
| mpeg wrote:
| Well, that section is talking about CPU-bound tasks, and both
| threads and processes are valid choices for CPU-bound tasks,
| with different tradeoffs.
|
| I think there is often a lot of confusion around that because
| a lot of tutorials simply say to use threads for IO-bound
| tasks and processes for CPU-bound tasks, without going deeper
| into the differences. Threads can easily share memory which
| often increases complexity and requires locks to run safely,
| with processes you avoid locking (including the infamous
| Python GIL) but also have to pass around data which might
| hurt performance.
|
| Regardless, it still doesn't make sense to say threads are
| only appropriate for IO-bound tasks, the official
| documentation [0] seems to lean towards preferring threads
| for mixing IO and CPU-bound tasks, and processes for purely
| CPU-bound tasks.
|
| [0]: https://docs.python.org/3/library/concurrent.futures.htm
| l#th...
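For reference, a small sketch of the asyncio.to_thread() function (Python 3.9+) that the quoted section describes:

```python
import asyncio
import time


def blocking_io():
    # A blocking call that mostly waits (and so releases the GIL).
    time.sleep(0.05)
    return "done"


async def main():
    # to_thread() submits the call to the default ThreadPoolExecutor,
    # keeping the event loop responsive; for GIL-holding CPU work,
    # a ProcessPoolExecutor via run_in_executor() is the alternative.
    return await asyncio.to_thread(blocking_io)


print(asyncio.run(main()))
```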
| wrigby wrote:
| I think the section is addressing both CPU and IO-bound
| tasks:
|
| > How to Execute a Blocking I/O or CPU-bound Function in
| Asyncio?
|
| But to be precise, we should differentiate between Python's
| interpreter threads and OS threads. In general, OS threads
| are a great way to parallelize CPU-bound tasks (with the
| issues of locking you mention), but Python's interpreter
| threads are not (because of the GIL).
| danuker wrote:
| I second this. Python has a Global Interpreter Lock, so only
| one thread executes Python bytecode at a time.
|
| Async and threads only help you with I/O (where the OS
| affords waiting for multiple operations in parallel).
|
| To execute CPU-bound code in parallel you need multiple
| processes.
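The process-based route described above can be sketched with concurrent.futures (the worker function here is invented):

```python
import concurrent.futures


def cpu_bound(n):
    # Pure-Python arithmetic holds the GIL the whole time, so
    # threads can't run it in parallel; separate processes each
    # get their own interpreter and GIL.
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    with concurrent.futures.ProcessPoolExecutor() as pool:
        results = list(pool.map(cpu_bound, [10_000] * 4))
    print(results)
```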
|
| I think this is a major disadvantage of Python, because
| processes are much costlier to spawn, and if you implement
| long-running workers to avoid frequent spawning, you have to
| incur serialization/deserialization costs, because shared
| memory support is very rudimentary (in essence, just fixed-
| size numeric data).
| mpeg wrote:
| It's just awkwardly worded, I think. It probably should
| include a mention of ProcessPoolExecutor in the following
| bit where it explains the run_in_executor function to make
| more sense. Especially as the whole section talks about
| passing CPU-bound task to a thread pool, but then
| recommends not to use a thread pool.
|
| Though I think the GIL isn't really that big a deal for
| most CPU-bound code; with plenty of third-party libraries,
| and even within the python standard library, a lot of the
| code is not native python and therefore releases the
| GIL.
| hoten wrote:
| This is really wordy. Anyone have an alternative introduction to
| asyncio?
| whalesalad wrote:
| Gevent is the only way I will do async in Python. Everything else
| ends up being a nightmare, and gevent is a lot more performant
| than I ever expect it to be for a given situation.
| chaostheory wrote:
| Is this negligible in performance until the GIL radically
| changes? I find parallel processing has much larger increases in
| speed.
| yoyohello13 wrote:
| Since async is single threaded the GIL shouldn't come into
| play.
| evilturnip wrote:
| Question though: asyncio is implemented as threads, which is
| where the GIL chokes, right?
| L3viathan wrote:
| asyncio is _not_ implemented as threads. It has features
| where it can wrap a sync function inside a thread in order
| to turn it into an async function, but if you just write
| "normal" async code, all your code is running in a single
| thread.
| [deleted]
| evilturnip wrote:
| Ah, ok that makes more sense.
| aivarsk wrote:
| A must-read before diving into asyncio:
|
| Async Python is slower than "sync" Python under a realistic
| benchmark. A bigger worry is that async frameworks go a bit
| wobbly under load. https://calpaterson.com/async-python-is-not-
| faster.html
| grodes wrote:
| async is not comparable to threads:
|
| - async is concurrency
| - threads are parallelism
|
| They are not exclusive; the best approach is to be multi-
| threaded while each of the threads uses concurrency to prevent
| blocking the CPU.
| gpderetta wrote:
| Threads are also for concurrency. In fact in python because of
| the GIL they are pretty much only for concurrency.
| ollien wrote:
| > - async is concurrency - threads are parallelism
|
| Even this isn't necessarily true, especially given you can
| limit a process to a single core (or, if you use a single-core
| system - lest we forget those exist!), or, in Python's case, the
| GIL! You can still have parallelism with asynchronous
| programming, it's just not necessarily guaranteed. IIRC tokio
| lets you spin up runtimes on different threads which (putting
| aside the above caveat) _can_ run in parallel.
| samwillis wrote:
| There is very little in everyday Python usage that benefits from
| Asyncio. Two cases in webdev are long running requests (Websockets,
| SSE, long polling), and processing multiple backend IO processes
| in parallel. However the latter is very rare; you may think you
| have multiple DB requests that could use asyncio, but most of the
| time they are dependent on each other.
|
| Almost all of the time a normal multithreaded Python server is
| perfectly sufficient and much easier to code.
|
| My recommendation is to only use it where it is really REALLY
| required.
| jacob019 wrote:
| I don't know what you use Python for every day. Sure for some
| utility scripts it doesn't matter. I use it to run my ecommerce
| business, and for a variety of other plumbing, mostly for
| passing messages around between users, APIs, databases,
| printers, etc. Async programming is a must for just about
| everything. I guess the alternative would be thread pools, or
| process pools, like back in the day; but that has a lot of
| downsides. It is slower, way more resource intensive, and state
| sharing/synchronization becomes a major issue; you can only
| handle as many concurrent tasks as you have threads, and you're
| going to use that memory all the time; not to mention all the
| code needs to be thread safe. Most of our systems use gevent,
| but we've started using asyncio for new projects.
| samwillis wrote:
| Same as you, e-commerce store, along with other mostly web
| projects, quite a bit of real-time and long running requests.
| I also use Gevent extensively (I prefer it to asyncio).
|
| We use Gevent or Asyncio in a small number of routes that
| either have long running requests, SSE in our case, or have a
| very large number of backend-facing IO requests we can
| parallelise (a few back office screens and processes).
|
| The complexity that asyncio would add to the code for the
| 95% of routes that don't need it would add a lot of
| unnecessary overhead to development.
|
| My point is, I don't believe it adds any value 95% of the
| time, but it does, sometimes significantly, 5% of the time. I
| think it's better to only use it where it's really needed and
| stick to old fashioned, non-concurrent, code everywhere
| else.
| jacob019 wrote:
| Ok, I can agree with that. Gevent doesn't really add any
| cognitive overhead or complexity at all. Asyncio sure does,
| and makes the code hard to read too. Long ago I settled on
| a stack that includes using gevent for anything that
| listens on a socket, never regretted it.
| jsmith45 wrote:
| Sure. As a general rule, the style of coding used in
| async-await approaches is primarily about one of two things.
|
| The first purpose is allowing more throughput at the expense of
| per-request latency (typically each request will take longer
| than with equivalent sync code).
|
| The main scenario where an async version could potentially
| complete sooner than a sync version is when the code is
| able to start multiple async tasks and then await them as a
| group. For example, if your task needs to make 10 http requests
| and makes those requests sequentially like one would in sync
| code, it will be slower. If one starts all ten calls and then
| awaits the results, then you _might_ see a speedup on the
| overall request.
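The start-then-await-as-a-group idea can be sketched with asyncio.gather and a simulated request (the fetch function is invented):

```python
import asyncio
import time


async def fetch(i):
    # Stand-in for one HTTP request; sleep simulates network latency.
    await asyncio.sleep(0.05)
    return i


async def sequential():
    return [await fetch(i) for i in range(10)]


async def grouped():
    # Start all ten tasks, then await them as a group.
    return await asyncio.gather(*(fetch(i) for i in range(10)))


t0 = time.perf_counter()
asyncio.run(sequential())
t_seq = time.perf_counter() - t0

t0 = time.perf_counter()
out = asyncio.run(grouped())
t_grp = time.perf_counter() - t0

print(out)
print(t_seq > t_grp)  # sequential ~0.5s, grouped ~0.05s
```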
|
| The other main purpose is when working with a UI framework where
| there is a main thread, and certain operations can only occur
| on the main thread. Use of async/await pattern helps avoid
| accidentally blocking the main thread, which can kill
| application responsiveness. This is why the pattern is used in
| javascript, and was one of the headline scenarios when C# first
| introduced this pattern. (The alternative being other methods
| of asynchrony which typically include use of callbacks, which
| can make the code harder to develop or understand).
|
| But basically, unless you have UI blocking problems, or are
| concerned about the number of requests per second you can
| handle, async-await patterns may be better avoided. It being
| even more costly in python than it is in some other languages
| does not really help.
| kissgyorgy wrote:
| Fully agree with this! Just stick to Flask/Django if you're
| doing a simple website/API. Async also makes DB connections
| slower; here is a great piece from Mike Bayer from years ago:
| https://techspot.zzzeek.org/2015/02/15/asynchronous-python-a...
| nicolaslem wrote:
| Agreed, the SaaS Python application I maintain does SSE with
| threads. It is not the prettiest thing but it works and threads
| are cheaper than rewriting the whole thing to be async.
| dividuum wrote:
| I never understood why gevent wasn't integrated more into
| default Python. I think at the moment it still requires
| monkey patching, so non-aware functions use the gevent
| variants. But once you've done that, you can gevent.spawn() a
| normal python function and it also runs concurrent (though
| not parallel) and is suspended when blocking I/O happens. No
| function coloring required. I'm sure there is some reason why
| that wasn't done. Anyone knows?
| usrbinbash wrote:
| Completely agree with this.
|
| I have several python services running as glue between some of
| our components. They all use threading. I find it a lot easier
| to reason about.
| ddorian43 wrote:
| Too bad Python couldn't go the gevent road. At least Java is
| doing so with Project Loom, and it looks like it will be very
| successful.
| greyman wrote:
| Maybe off-topic, but my advice would be: if you need this guide,
| consider switching to another language, if that's possible. In
| our company we switched to Go, and all those asyncio problems
| were magically solved.
| akdor1154 wrote:
| I love Python and fully agree with you. :(
| pantsforbirds wrote:
| I've had absolutely no problems using async programming in
| python. It's extremely easy to set up a script in 100-200 lines
| of code to do something like:
|
| Pull the rows from postgres that match query <x>, process the
| data, and push each row as an event into rabbitmq. In <200 lines
| of code I was easily processing 25k rows per second, and it only
| took me a few minutes to figure out the script.
| austinpena wrote:
| Agreed. Moving my Python data analytics to a gRPC server that I
| call from a Go service has been _so much_ easier to manage and
| debug.
| vodou wrote:
| I've never bothered to learn Python asyncio. When Python 3.5 came
| out I just thought it looked overly complex. Coming from a C/C++
| background on Linux I just use the select package for waiting on
| blocking I/O, mainly sockets. Do you think there is something to
| gain for me by learning asyncio?
| eestrada wrote:
| Personally, I don't think there is a benefit. If select is
| working for you, asyncio doesn't add anything performance wise.
| It is just meant to look more synchronous in how you write the
| code. But, using select and either throwing work onto a
| background thread or doing work quickly (if it isn't CPU bound)
| can be just as clear to read, if not clearer. Sometimes "async"
| and "await" calls only obfuscate the logic more.
| brrrrrm wrote:
| Important to note:
|
| > They are suited to non-blocking I/O with subprocesses and
| sockets, however, blocking I/O and CPU-bound tasks can be used in
| a simulated non-blocking manner using threads and processes under
| the covers.
|
| If you're using it for anything besides slow async I/O, you're
| going to have to do some heavy lifting.
|
| I've also found the actual asyncio implementation in CPython to
| be slow. Measuring _purely_ event-loop overhead (and doing
| little/nothing in the async spawned tasks), it's 120x slower
| than JavaScript on my machine.
| https://twitter.com/bwasti/status/1572339846122991617
| bjt2n3904 wrote:
| Ok. This is what has been a huge hangup for me.
|
| It really seems that if you're doing asyncio, you must do
| EVERYTHING async, it's like asyncio takes over (infects?) the
| entire program.
| quietbritishjim wrote:
| That's exactly the opposite of what it says really: if you
| need to do CPU stuff, then you can do that, it just won't be
| using asyncio. So it doesn't really infect your whole
| program.
|
| You could easily have, say, a thread to do all your asyncio
| stuff, another to do some CPU-intensive stuff (so long as it
| releases the GIL) and yet another to do some blocking I/O, e.g.
| interacting with a database with its own blocking APIs. In
| that case, you could spawn separate threads and exchange
| messages between them like usual.
|
| Where this falls down a little is if you _want_ asyncio to
| infect your whole program - or more accurately, if you want to
| use it in all of the task-management code at the top level of
| your app. In that case, you would use asyncio's thread API
| [1], although it's a bit clunky when you want to maintain
| state bound to a specific worker thread (like SQLite database
| objects).
|
| [1] https://docs.python.org/3/library/asyncio-
| task.html#running-...
| brrrrrm wrote:
| I think the issue is more like this:
|
| 1. You jump into a huge codebase and find a very useful
| function you want to use in your code (it uses asyncio)
|
| 2. Your code doesn't use asyncio, so you start converting
| functions to be async (or else you can't use the `await`
| keyword)
|
| 3. It turns out your function is called by a bunch of
| different users, some of which do not use the asyncio
| runner
|
| 4. You end up having several meetings with upstream teams
| asking them to use the asyncio runner
|
| 5. A single team says "no"
|
| 6. Scrap the project and just rewrite that original
| function you found synchronously
| gpderetta wrote:
| Depending on the details you can often call an async
| function from a sync function by just running the event
| loop; it is just a few lines of code.
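A minimal sketch of that sync-to-async bridge, using asyncio.run() (which is exactly what refuses to nest inside an already-running loop):

```python
import asyncio


async def fetch_value():
    await asyncio.sleep(0)
    return 7


def sync_wrapper():
    # Fine from plain sync code; raises RuntimeError if an event
    # loop is already running in this thread, which is the
    # non-reentrancy problem described here.
    return asyncio.run(fetch_value())


print(sync_wrapper())
```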
|
| The problem is that the event loop is not reentrant [1],
| so if your sync function is being called from an async
| function, things do not work. Why would you ever do this
| you might think? Well, asyncio might be an implementation
| detail of whatever environment you are using - Jupyter or
| ipython, for example.
|
| [1] there are... hacks to make this sort of work of
| course.
| nineplay wrote:
| > there are... hacks to make this sort of work of course.
|
| The python motto.
| gpderetta wrote:
| Monkey patch like there is no tomorrow (so you do not
| have to maintain it)!
| quietbritishjim wrote:
| I guess I was focusing more on the CPU task problem
| mentioned at the top. But you're talking about blocking
| IO - more specifically, blocking IO that you want to
| convert to async (not just run in a worker thread using
| the thread-to-async API I mentioned).
|
| What you've written is true, but "infects" a smaller
| proportion of the code than you'd expect, in my
| experience. In fact, the better organised the program is,
| the less needs to change when converting to async. For
| example, reading from a connection your top-level async
| function would often be something like this:
|         async def read_and_handle_messages(connection):
|             while True:
|                 next_message_bytes = await connection.get_next_message()
|                 next_message_parsed = my_parser.parse(next_message_bytes)
|                 my_message_handler(next_message_parsed)
|
| All the application-specific code is in the parser and
| the message handler but neither of those are async. (If
| you need to send a response, your message handler could
| post a message to an async queue that is read by a
| separate writer task.) In your hypothetical scenario,
| you'd maybe write two wrapper functions, one for
| blocking and one for async, which both read the message,
| parse it and handle it. But those only need to be two
| versions of a three-line function while all the
| substantial code is completely shared.
|
| I find that there's very little actual async code in
| async programs that I write, even with no blocking IO
| historical baggage. Admittedly, that's partly because I
| organise programs to reach that goal, but it actually
| works out for the best because all the async-task
| lifecycle management ends up in one place, which makes
| following overall program flow particularly easy.
| dekhn wrote:
| No, the point is that asyncio is _viral_, polluting large
| existing systems with its paradigm.
|
| Modern frameworks should be orthogonal and compositional.
| Even orthogonality (when you use multiple frameworks, each
| framework solves a problem along a different "axis" and
| does not interact with any of the other axes) is optional.
|
| Compositionality means I should be able to combine several
| systems without one affecting the other unnecessarily.
| async does not fulfill that requirement.
| jnwatson wrote:
| That's 80% due to the "what color is your function" problem.
|
| There are bridges from async to sync and vice versa, but they
| must be used very sparingly, as they aren't nestable.
| nrmitchi wrote:
| Yes. Unless you (and your team) are well-versed with python,
| how async works, and all the dependencies you use, it is
| very easy to accidentally do something blocking in async and
| lose most of the benefits.
| acedTrex wrote:
| Yes async is viral, its all or nothing
| maxbond wrote:
| As someone who has written an application with sync and
| async Python components within the same process, it is all,
| nothing, or pain.
|
| (If you really, really need to do this - triple check - use
| queues & worker threads to decouple the sync parts and keep
| them out of your event loop thread, as a sort of sync-async
| impedance match.)
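One way to sketch that queue-and-worker-thread impedance match with stdlib pieces (all names invented): a plain thread consumes a sync queue and hands results back to the loop with call_soon_threadsafe, the one loop method that is safe to call from another thread.

```python
import asyncio
import queue
import threading


def sync_worker(jobs, loop, results):
    # Runs in a plain thread, completely outside the event loop.
    while True:
        item = jobs.get()
        if item is None:
            break
        # Hand the result back to the loop thread safely.
        loop.call_soon_threadsafe(results.put_nowait, item * 2)


async def main():
    jobs = queue.Queue()          # sync side
    results = asyncio.Queue()     # async side
    loop = asyncio.get_running_loop()
    t = threading.Thread(target=sync_worker, args=(jobs, loop, results))
    t.start()
    for i in range(3):
        jobs.put(i)
    out = [await results.get() for _ in range(3)]
    jobs.put(None)  # shut the worker down
    t.join()
    return out


print(asyncio.run(main()))
```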
| nicolaslem wrote:
| To be fair this is not really asyncio's fault, the same
| happens in most languages unless the runtime does some magic
| like scheduling blocking calls in a background thread (which
| can also be done with asyncio).
| brrrrrm wrote:
| I feel the same way, it's a lot like `const` in C++.
| Basically it's enforcing usage for every upstream user to
| ensure there's no inefficiency (such as waiting for things
| like the network).
|
| JavaScript has an extremely nice "out" (I'm sure other
| languages do too):
|
|         async function() {
|             await async_thing()
|             do_sync_stuff()
|         }
|
| can be written
|
|         function() {
|             async_thing().then(do_sync_stuff)
|         }
|
| which is quite intuitive and helps glue code together. I
| think the callback-based thinking really clarifies user
| intent ("just tell me when it's done") without impacting
| performance.
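For comparison, asyncio's closest analogue to the JavaScript `.then()` escape hatch is `create_task` plus `add_done_callback`; unlike JavaScript, a loop must already be running, so this sketch wraps everything in `asyncio.run`.

```python
import asyncio

results = []

async def async_thing():
    await asyncio.sleep(0.01)
    return "done"

def do_sync_stuff(task: asyncio.Task):
    # Runs in the event loop thread once the task finishes,
    # much like a .then() callback.
    results.append(task.result())

async def main():
    # Fire-and-callback, roughly async_thing().then(do_sync_stuff)
    task = asyncio.get_running_loop().create_task(async_thing())
    task.add_done_callback(do_sync_stuff)
    await asyncio.sleep(0.05)  # give the background task time to finish

asyncio.run(main())
print(results)  # ['done']
```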
| gpderetta wrote:
| The problem is that async doesn't enforce anything. You can
| still block from an async function.
| brrrrrm wrote:
| what do you mean "block"? If you call an async function
| from a non-async context it doesn't block.
| gpderetta wrote:
| Nothing prevents you from calling a blocking function
| (for example good old non-async socket functions) from an
| async function.
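The point is easy to demonstrate: a coroutine that calls blocking `time.sleep` (a stand-in for a blocking socket call) freezes every other task on the loop, while offloading the same call with `asyncio.to_thread` (Python 3.9+) does not. A rough, timing-based sketch:

```python
import asyncio
import time

async def ticker(ticks):
    # Should tick every 10 ms if the event loop is free.
    for _ in range(5):
        await asyncio.sleep(0.01)
        ticks.append(time.monotonic())

async def blocking_coro():
    time.sleep(0.3)  # blocking call: freezes the whole event loop

async def polite_coro():
    await asyncio.to_thread(time.sleep, 0.3)  # same wait, pushed to a thread

async def run(coro_fn):
    ticks = []
    start = time.monotonic()
    await asyncio.gather(ticker(ticks), coro_fn())
    # How many ticks landed while the 0.3 s "work" was still going on?
    return sum(1 for t in ticks if t - start < 0.25)

blocked = asyncio.run(run(blocking_coro))
polite = asyncio.run(run(polite_coro))
print(blocked, polite)  # 0 ticks while blocked; all 5 while polite
```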
| evilduck wrote:
| The propagation/infection of async handling is just
| indicative of the nature of the problem. Try writing async
| code in any other language and you see similar patterns
| emerge.
| m3047 wrote:
| I seem to use asyncio a lot, so maybe it's just good for internet
| plumbing. Things I've used it for:
|
| * A Postfix TCP table.
|
| * A milter.
|
| * DNS request forwarding.
|
| * Reading data from a Unix domain socket and firing off dynamic
| DNS updates.
|
| * A DNS proxy for Redis.
|
| * A netflow agent.
|
| * A stream feeder for Redis.
|
| https://github.com/search?q=user%3Am3047+asyncio&type=Reposi...
|
| By the way you can't use it for disk I/O, but you can try to use
| it for e.g. STDOUT:
| https://github.com/m3047/shodohflo/blob/5a04f1df265d84e69f10...
|             class UniversalWriter(object):
|                 """Plastering over the differences between file
|                 descriptors and network sockets."""
| gilad wrote:
| what about aiofiles[0] for disk I/O?
|
| [0] https://github.com/Tinche/aiofiles
| m3047 wrote:
| Not sure. A cursory look suggests it runs file ops in a
| thread pool.
|
| The problem that I'm aware of is at a deeper level and has to
| do with the ability (or lack thereof) to set nonblocking on
| file descriptors associated with disk files.
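Regular files are always reported "ready" by the selector, so asyncio has no true non-blocking mode for them; the common workaround (and what aiofiles' thread pool amounts to, per the comment above) is to do the read in a thread. A hedged sketch:

```python
import asyncio
import tempfile

async def read_file_async(path):
    # Disk files can't be set non-blocking in a useful way, so the
    # read is pushed onto a worker thread; the loop stays responsive.
    def read():
        with open(path, "rb") as f:
            return f.read()
    return await asyncio.to_thread(read)

async def main():
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"hello asyncio")
        path = f.name
    return await read_file_async(path)

print(asyncio.run(main()))  # b'hello asyncio'
```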
| dekhn wrote:
| I've used python since 1995 and I can say that async is one of
| the worst things I've seen put into python since then. I've used
| a wide range of frameworks (twisted, gevent, etc) as well as
| threads and even if async is a good solution (I don't think it
| is) it broke awscli for quite some time (through aiobotocore and
| related package dependencies). It's too late in the game for
| long-term breaks like that or any backward-incompatible changes
| impacting users.
| btown wrote:
| Yep, it's 2022 and gevent is still the only solution for async
| & high concurrency that Just Works with the entire ecosystem of
| Python libraries without code changes. There's definitely some
| compute overhead compared to async, but we save so much
| developer time having effortless concurrency and never being
| worried that, say, using a slow third-party API over the web
| will slow down other requests.
| quietbritishjim wrote:
| Not complete - doesn't include Task Groups [1]
|
| In fairness they were only included in asyncio as of Python 3.11,
| which was released a couple of weeks ago.
|
| These were an idea originally from Trio [2] where they're called
| "nurseries" instead of "task groups". My view is that you're
| better off using Trio, or at least anyio [3] which gives a Trio-
| like interface to asyncio. One particularly nice thing about Trio
| (and anyio) is that there's no way to spawn background tasks
| except to use task groups i.e. there's no analogue of asyncio's
| create_task() function. That is good because it guarantees that
| no task is ever left accidentally running in the background and
| no exception left silently uncaught.
|
| [1] https://docs.python.org/3/library/asyncio-task.html#task-
| gro...
|
| [2] https://github.com/python-trio/trio
|
| [3] https://anyio.readthedocs.io/en/latest/
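A minimal task-group example (Python 3.11+; guarded here so it is skipped on older interpreters):

```python
import asyncio
import sys

async def fetch(i):
    await asyncio.sleep(0.01)  # stand-in for an IO-bound call
    return i * 10

async def main():
    # The task group guarantees every child task is awaited (or
    # cancelled, with exceptions propagated) before the block exits,
    # so no task is left silently running in the background.
    async with asyncio.TaskGroup() as tg:
        tasks = [tg.create_task(fetch(i)) for i in range(3)]
    return [t.result() for t in tasks]

if sys.version_info >= (3, 11):
    print(asyncio.run(main()))  # [0, 10, 20]
```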
| dang wrote:
| Ok, we've taken completeness out of the title above.
| throwaway81523 wrote:
| I have such a feeling of tragedy about Python. I wish it had
| migrated to BEAM or implemented something similar, instead of
| growing all this async stuff. Whenever I see anything about
| Python asyncio, I'm reminded of gar1t's hilarious but NSFW rant
| about node.js, https://www.youtube.com/watch?v=bzkRVzciAZg .
| Content warning: lots of swearing, mostly near the end.
| rogers12 wrote:
| That's a blast from the past. Incredible that 10 years ago
| people thought using a thread per request was a good idea
| because anything else was too hard.
| Steltek wrote:
| Python, more than other languages, ends up being used in
| utility scripts on end user devices, where you'll find just
| about everything. Python's async seems to still have a large
| swing in available features. Unlike async methods in other
| languages, semi-serious dabbling in it just for fun is courting
| a lot of headaches later.
| djha-skin wrote:
| Just in time, I've been needing to implement s3 upload
| asynchronously.
| PyWoody wrote:
| Are you sure it's worth the complexity? Threads will drop the
| GIL for the duration of the network-bound I/O.
| djha-skin wrote:
| Good point!
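The thread-based alternative the parent comment alludes to can be sketched like this; `upload` is a hypothetical stand-in for a network-bound call such as an S3 put, simulated here with `time.sleep`:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def upload(key):
    # Stand-in for a network-bound call (e.g. boto3 put_object);
    # the GIL is released while the thread waits on the socket.
    time.sleep(0.1)
    return f"uploaded {key}"

keys = [f"file-{i}" for i in range(8)]

start = time.monotonic()
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(upload, keys))
elapsed = time.monotonic() - start

# All 8 "uploads" overlap: ~0.1 s total rather than ~0.8 s sequential.
print(results[0], f"{elapsed:.2f}s")
```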
| xtreak29 wrote:
| Does aioboto3 help here?
|
| https://aioboto3.readthedocs.io/en/latest/usage.html#upload
| ltbarcly3 wrote:
| Python asyncio is pretty awful. The libraries are of extremely
| poor quality, and the slightest mistake can lead to blocking the
| event loop. After a few years of dealing with it I refuse to
| continue and am just using threads.
| andrewstuart wrote:
| I love Python async - it's a complete game changer for certain
| types of applications.
|
| I find Python async to be fun and exciting and interesting and
| powerful.
|
| BUT it is a big power tool and there's so much in it that it's
| hard to work out how to drive it right.
|
| I have pretty good experience with Python and javascript.
|
| I prefer Python to javascript when writing async code.
|
| Specific example: I spent hours trying to drive some processes
| via stdin/stdout/stderr with javascript and it kept failing for
| reasons I couldn't determine.
|
| Switched to Python async and it just worked.
|
| The most frustrating thing about async Python is that it has been
| improving greatly. That means that it's not obvious what "the
| right way" is, ie using the latest techniques. This is actually a
| really big problem for async Python. I'm fairly competent with
| it, but still have to spend ages working out if I'm doing it "the
| right way/the latest way".
|
| The Python project really owes it to its users to have a short
| cookbook that shows the easiest, most modern recommended way to
| do common tasks. Somehow this cookbook must give the reader
| instant 100% confidence that they are reading the very latest
| official recommendations and thinking on simple asyncio
| techniques.
|
| Without such a "latest and greatest techniques of async Python
| cookbook" it's too easy to get lost in years of refinement and
| improvement and lower and higher level techniques.
|
| The Python project should address this, it's a major ease of use
| problem.
|
| Ironically, Python's years of async refinement mean there are
| many, many ways to get the same things done, conflicting with
| Python's "one right way to do it" philosophy.
|
| It can be solved with documentation that drives people to the
| simplest most modern approaches.
| arecurrence wrote:
| The way that this is formatted... I initially thought it was a
| Haiku :)
| kissgyorgy wrote:
| After 2 years of using asyncio in production, I recommend
| avoiding it if you can. With async programming, you take on the
| complexity of concurrent programming, which is way harder than
| you can imagine.
|
| Also nobody mentions this for some reason, but asyncio doesn't
| make your programs faster, in fact it makes everything 100x
| SLOWER (we measured it multiple times, compared the same thing to
| the sync version), but makes your program concurrent (NOT
| parallel), where the tradeoff is you use more CPU, maybe less
| memory, and can saturate your bandwidth better. Never do anything
| CPU-bound in an async loop!
| qbasic_forever wrote:
| This isn't a problem specific to Python, though. You hint at it:
| async programming (a single-threaded worker loop) is not the right
| way to deal with CPU-bound tasks. NodeJS or any other async
| language would have the same problems. You want multiprocessing
| or multithreading, where work is distributed to different CPU
| cores at the same time. Python gives you the ability to use any
| of those three paradigms. Choose wisely.
| powersnail wrote:
| Async in general (not just python) doesn't speed up any
| computation.
|
| The value is allowing you to do something else while waiting
| for IO to finish. If you are constantly doing IO, async is a
| good tool. Otherwise, it won't help.
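That overlap is easy to see with two simulated IO waits: run concurrently, they take roughly the duration of the longest one, not the sum.

```python
import asyncio
import time

async def fake_io(name, delay):
    await asyncio.sleep(delay)  # stand-in for a network round-trip
    return name

async def main():
    start = time.monotonic()
    # Both waits are in flight at once; the loop just switches between them.
    results = await asyncio.gather(fake_io("a", 0.1), fake_io("b", 0.1))
    return results, time.monotonic() - start

results, elapsed = asyncio.run(main())
print(results, f"{elapsed:.2f}s")  # ['a', 'b'] in ~0.1 s, not 0.2 s
```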
| whakim wrote:
| This just sounds like you fundamentally misunderstood how
| asyncio works and applied it to a use-case which it wasn't
| suited to. Why would you use async if you're doing a bunch of
| CPU-bound work?
| [deleted]
| number6 wrote:
| Asyncio is not meant for CPU-bound tasks but for IO-bound tasks.
| You should have used multiprocessing. The problem you describe
| is exactly what asyncio is used for: saturating your bandwidth
| better.
|
| Maybe the right approach would have been a thread pool? Plus you
| don't have to refactor the task. Just let it run sync, but you
| can also make it async.
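One hedged sketch of "let it run sync, but make it also async": `asyncio.to_thread` (Python 3.9+) runs an unmodified sync function in the default thread pool, so no refactoring of the task itself is needed; `legacy_task` is an invented example.

```python
import asyncio
import time

def legacy_task(x):
    # Unmodified sync code; no rewrite to async required.
    time.sleep(0.05)
    return x + 1

async def main():
    # to_thread hands each call to the default thread pool and
    # returns an awaitable, so the event loop stays responsive.
    return await asyncio.gather(
        *(asyncio.to_thread(legacy_task, i) for i in range(4))
    )

print(asyncio.run(main()))  # [1, 2, 3, 4]
```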
| matsemann wrote:
| In other languages the async paradigm work for multiple kind
| of workflows, not just heavily IO bound ones.
| qwertox wrote:
| But it only does so because you can move out that extra
| work from the async thread into a thread pool.
|
| In Python there's no benefit in doing this due to the GIL,
| unless you're using a module which implements its own
| multithreading (for example in C). Python is not alone with
| this issue.
|
| In other languages you're basically stepping out of the
| async paradigm in order to use threading in parallel. You
| can wait for result of the thread in the async loop without
| blocking.
|
| I really enjoy using asyncio in Python for things where I
| have to do a lot of stuff in parallel, like executing
| remote scripts on a dozen servers in parallel via
| AsyncSSH, or for low-workload servers which query databases.
|
| In any case, what keeps me hooked on Python like a junkie
| is the `reload(module)` which I invoke via an inotify hook
| every time a file/module changes:
|
| server.py
|
| server_handler.py
|
| reloader.py
|
| server.py loads reloader.py (which sets up inotify) and
| that one takes care of reloading server_handler.py whenever
| it changes. server.py defers all the request/response
| handling to server_handler.py which can be edited on the
| fly so that the next request executes the new code. Instant
| hot reloading.
|
| This also works very nicely with asyncio (like aiohttp), and
| one ends up with very readable code.
|
| If you need performance, then use Java, Rust, Go or C/C++,
| but for prototyping and tooling I absolutely love this
| approach with Python.
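A dependency-free sketch of the hot-reload idea: the commenter hooks inotify, but this version polls mtimes instead, and the helper name `make_reloader` is invented for illustration; the reload mechanism is the same `importlib.reload` call.

```python
import importlib
import os
import types

def make_reloader(module: types.ModuleType):
    # Hypothetical helper, not from the comment: reload the module when
    # its source file's mtime changes. An inotify hook would call
    # maybe_reload() on a change event instead of polling.
    path = module.__file__
    last = os.path.getmtime(path)

    def maybe_reload():
        nonlocal last
        mtime = os.path.getmtime(path)
        if mtime != last:
            importlib.reload(module)  # re-executes the module's source in place
            last = mtime
        return module

    return maybe_reload
```

A server loop would call `maybe_reload()` before dispatching each request, so the next request executes the freshly edited handler code.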
| number6 wrote:
| Can you give me an example which languages do this?
| matsemann wrote:
| In Kotlin for instance you can execute your coroutines
| with different "dispatchers", which have different
| behaviors. You can for instance use a dispatcher with
| multiple threads in a threadpool, run it in the main
| thread, a specific thread, in threads with low/high
| priority etc. Basically allowing you to write code using
| the async/coroutine paradigm, but then control its
| execution from the outside.
|
| So I've also used coroutines in Kotlin for CPU bound
| tasks with a nice speedup, where multiple sub-tasks get
| executed in parallel and then gathered when needed to
| produce new tasks etc., in a granular way that would be
| very intrusive doing with threadpools. Here I basically
| took the existing coroutine code and slapped some more
| threads on it.
|
| With that said, I'm not entirely sold on Kotlin's way
| either. Both Kotlin and Python have the "what color is
| your function" problem.
| tmh88j wrote:
| > In Kotlin for instance you can execute your coroutines
| with different "dispatchers", which have different
| behaviors. You can for instance use a dispatcher with
| multiple threads in a threadpool, run it in the main
| thread, a specific thread, in threads with low/high
| priority etc. Basically allowing you to write code using
| the async/coroutine paradigm, but then control its
| execution from the outside.
|
| I'm not very experienced with Kotlin, but that sounds
| similar to python's run_in_executor [1]
|
| [1] https://docs.python.org/3/library/asyncio-
| eventloop.html#asy...
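A small `run_in_executor` sketch in the spirit of Kotlin's dispatchers: the executor you pass in decides where the work runs (a `ProcessPoolExecutor` could be substituted to sidestep the GIL for CPU-bound work); `work` is an invented stand-in for a blocking call.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

def work(x):
    time.sleep(0.02)  # stand-in for blocking work
    return x * x

async def main():
    loop = asyncio.get_running_loop()
    # Like choosing a dispatcher: the explicit executor controls where
    # the blocking calls execute, decoupled from the coroutine code.
    with ThreadPoolExecutor(max_workers=4) as pool:
        futs = [loop.run_in_executor(pool, work, i) for i in range(4)]
        return await asyncio.gather(*futs)

print(asyncio.run(main()))  # [0, 1, 4, 9]
```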
| jerf wrote:
| "Asyncio is not ment for CPU Bound Task but for IO Bound
| Tasks."
|
| You can amplify this; pure Python is not meant for CPU-bound
| tasks. If you have a pure Python task, you've CPU optimized
| it somewhat, and it's still too slow, the answer is, get out
| of pure Python.
|
| There's half-a-dozen options that still leave you essentially
| in Python land (just not _pure_ Python anymore) like cython
| or NumPy, and of course dozens of other languages that leave
| Python entirely.
|
| To accelerate a pure Python task to the speed that you can
| get out of a single thread of a more efficient language, you
| have to perfectly parallelize your task across upwards of 40
| CPUs (conservatively!), because that's how slow pure Python
| is. asyncio isn't a solution to this, but neither is any sort
| of multiprocessing of the pure Python either.
|
| This is not criticism. This is just engineering reality. Pure
| Python is a fun language, but a very slow one.
| JosephRedfern wrote:
| Depending on the specifics, a threadpool probably wouldn't
| help with a CPU-bound task due to GIL limitations.
| `multiprocessing.Pool` might (though process pools aren't
| without their overheads).
| dekhn wrote:
| any concurrency system that is properly designed supports
| both types of tasks.
| number6 wrote:
| I only know C#: https://learn.microsoft.com/en-
| us/dotnet/csharp/async#recogn...
|
| There you also have to act differently based the task
| dekhn wrote:
| The docs on that page explicitly say you just have to
| provide a different parameter (basically, to force it to
| use a thread-like instead of select-like approach).
|
| The docs on that page also point you to a different task
| library which does what I'd expect:
|
| "If the work is appropriate for concurrency and
| parallelism, also consider using the Task Parallel
| Library."
|
| I checked those docs and that is a framework that makes
| sense.
| phs2501 wrote:
| Not sure how that's different than Python asyncio's "run
| blocking stuff with `run_in_executor`, the result
| (future) of which integrates into the async event loop?
| dekhn wrote:
| it's threaded with the option to use 1:M, N:N or N:M.
| It's a mental model that includes both asyncio and
| threads, while async will always only be for IO-blocking
| calls that can be multiplexed.
| harpiaharpyja wrote:
| I've experienced something similar using asyncio as well, but I
| think you're wrong to blame the inherent complexity of
| concurrent programming.
|
| I've used curio as an alternative concurrent programming engine
| for a med-large project (and many small ones). In comparison to
| asyncio, it's a joy to use.
| kissgyorgy wrote:
| What I'm saying is: if you don't need concurrency, don't trade
| off simplicity. For REST APIs and websites, it's probably not
| needed.
| dec0dedab0de wrote:
| It used to get brought up every time asyncio was mentioned on
| here.
|
| I either use multiprocessing, or celery. using async/await
| looks ugly, doesn't solve most problems I would want it to, and
| feels unstable.
|
| Maybe once development around it settles down I'll revisit, but
| I can't imagine wanting to use it in its current form.
| jpicard wrote:
| asyncio gets a lot of hype, but I prefer concurrent.futures. It
| feels simpler and more general since I don't have to make
| everything awaitable.
| wiredfool wrote:
| My opinion is that async/evented is a (useful) performance
| hack.
|
| You wouldn't do it that way for any reason other than that you
| can get better performance that way than by using threading or
| other pre-emptive multitasking.
|
| It's not a better developer experience than writing straight
| forward code where threads are linear and interruptions happen
| transparently to the flow of the code. There are foot guns
| everywhere, with long running tasks that don't yield or places
| where there are hidden blocking actions.
|
| It reminds me a bit of the old system 6 mac cooperative
| multitasking. It was fine, and significantly faster because
| your program would only yield when you let it do it, so
| critical sections could be guaranteed not to context shift.
| However, you could bring the entire machine to a halt by
| holding down the mouse button, as eventually an event handler
| would get stuck waiting for mouse up.
|
| Pre-emptive multitasking was a huge step forward -- it made
| things a bit slower on average, but the tail latency was
| greatly improved, because all the processes were guaranteed at
| least some slice of the machine.
| samwillis wrote:
| "Performance" is too wide a term, asyncio does not improve
| performance in the most widely understood interoperation,
| "speed". Talking about performance benefits of asyncio causes
| people to misunderstand where it is best used.
|
| "Scalability" is a better word to use when talking about
| asyncio. Along with describing complex concurrent programming
| such as for a GUI, where the added syntactical complexity if
| outweighed by the reduced boilerplate of traditional GUI
| programming.
| wiredfool wrote:
| Yeah, talking speed in the manner of C1M sorts of things --
| server stuff, where Python tends to be, rather than the GUI
| side. It's demonstrably faster for IO not to be context
| switching with threads -- but if threads were equal in weight
| to events, I don't see the event programming style being more
| productive than the threaded style. I can see where events are
| useful in a GUI context, but UI thread + thread pool dispatch
| still works well, if your framework supports it.
|
| Just this week, I got to the bottom of a performance issue
| in django because the developers were using async, and then
| doing eleventy billion db queries to import a big csv,
| thereby blocking all other requests.
|
| One of the questions I ask on our programming interviews is
| "how would you make this (cpu bound) thing go faster" --
| Async is definitely a low quality answer to that, when
| things like indexing or hash lookup vs array scan are still
| on the table.
| acedTrex wrote:
| This is an insane take: if you are doing async you already HAVE
| a need for concurrent programming. Async just makes that
| simpler to read and write.
| kissgyorgy wrote:
| I'm pretty sure a lot of people don't understand the
| tradeoffs and don't really need concurrent code. There are
| really only a couple of use-cases where it's really handy:
| for example an API gateway, where you only get and send HTTP
| requests. Other than that, I don't see how it's worth the
| insane complexity (compared to sync code).
|
| You can write web servers in a sync manner; Flask or Django is
| way better if you only need an API, and choosing FastAPI for a
| simple REST API is a huge mistake.
| thunky wrote:
| > FastAPI for a simple REST API is a huge mistake
|
| FastAPI can be sync or async, although sync will have "a
| small performance penalty"
|
| https://github.com/tiangolo/fastapi/issues/260#issuecomment
| -...
| walls wrote:
| What kind of APIs are you writing that don't involve db
| lookups or calls to other services?
| samwillis wrote:
| Asyncio does not make your db calls or io faster, it does
| make your code more complex.
|
| The only thing asyncio gives you is scalability to many,
| MANY concurrent io operations. You have to either be
| pushing seriously large traffic or doing a lot of long
| running websocket/sse/long polling type requests.
|
| The only other use case is lots of truly concurrent io
| within one request/response cycle. But again that is
| unusual, most apis have low single digit db queries that
| are usually dependent on one another removing any
| advantage of async.
| mixmastamyk wrote:
| Choosing fastapi for a fast api is a mistake?
| kissgyorgy wrote:
| Choosing unnecessary complexity for a simple task is a
| mistake. FastAPI is not faster than Django or Flask for a
| single request, quite the opposite.
| hdjjhhvvhga wrote:
| There are other valid reasons than speed:
|
| https://christophergs.com/python/2021/06/16/python-flask-
| fas...
| qbasic_forever wrote:
| Async python by design is not concurrent, nothing is
| executing at the same time on different cores/threads.
| There's one worker loop and one task running on it at a time.
| srik wrote:
| Concurrency is not Parallelism. You should watch this
| quintessential talk about the difference -
| https://www.youtube.com/watch?v=oV9rvDllKEg
| dekhn wrote:
| This is Rob's perspective, but it's not universally
| shared. In my mind, all concurrency is a form of
| parallelism (parallel tasks, but not parallel threads)
| while not all parallelism is concurrency. I have
| frequently solved concurrency problems with thread pools,
| rather than using non-blocking IO, because the
| programming paradigm is a lot simpler.
| _bohm wrote:
| I think you mean it is not parallel. Cooperative
| multitasking is absolutely a type of concurrency
| calpaterson wrote:
| Unfortunately the majority of people who use asyncio in the
| real world do not have a need for concurrent programming.
| They are drawn to it because they believe it is faster,
| which, generally, it is not. That isn't to say there aren't
| reasonable usecases - but most people using asyncio are
| writing webapps which spend a lot of time in the CPU
| generating or outputting JSON/GraphQL/HTML.
| nine_k wrote:
| I think this may be influenced by Node. With JS and Node,
| async is cheap and ergonomic, so it's very widely used,
| even if the latency gains are marginal.
|
| With Python and asyncio, neither assumption holds, but the
| idea of shaving off some latency persists: if we rewrite
| everything using asyncio, our DB accesses can be
| parallelized, yay! This may or may not be a net win in
| latency; reworking your DB access patterns may gain you
| more.
| kissgyorgy wrote:
| > reworking your DB access patterns may gain you more.
|
| Nope:
| https://techspot.zzzeek.org/2015/02/15/asynchronous-
| python-a...
| [deleted]
| nine_k wrote:
| I don't say that asyncio gives you nothing! Go for it
| with your DB access if you have already gone async in
| your design.
|
| But often the lower-hanging fruit is doing more joins in
| the DB, fetching fewer columns, and writing your query
| more thoughtfully. Async only helps if you can do
| something while waiting for the DB to complete your
| query. A common anti-pattern, exacerbated by ORMs, is
| doing a bunch of small queries while passing bits of data
| between them in Python; moving that to asyncio won't help
| much, if any.
| calpaterson wrote:
| > But often the lower-hanging fruit is doing more joins
| in the DB, fetching fewer columns, and writing your query
| more thoughtfully.
|
| Reluctant to add a "me-too" comment but I think you have
| it there. For whatever reason people are much keener to
| investigate different ways of scheduling IO than they are
| to investigate what the query plan looks like and why.
| _bohm wrote:
| It may not be the norm, but there are definitely
| apps/services out there that require heavy-duty database
| queries and can have their throughput bottlenecked by
| synchronous DB IO latency, and would benefit from asyncio
| on the database side. Also, SQLAlchemy supports asyncio
| now.
| throwawaymaths wrote:
| > Async just makes that simpler to read and write.
|
| Really? I definitely had a huge struggle reading async
| python, because generators and yields break the concept of
| "what is a function" and struggled writing async python
| because of function coloring.
| andrewstuart wrote:
| Day to day Python async should be using generators and
| yields.
| nine_k wrote:
| Generators are great for predictable concurrency. Things
| like asynchronous _I/O_, with timeouts and stuff,
| require more support on the language and stdlib level.
| dekhn wrote:
| I am really not a fan of generators and yields for the same
| reason. But over time I've come to see it as ever-so-
| slightly more elegant syntax sugar for a class that
| maintains internal state to sequentially return specific
| values in response to a series of calls to a function.
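That "syntax sugar" view can be made concrete: the generator below and the hand-written iterator class behave identically; the generator just keeps its state in a suspended frame instead of explicit attributes.

```python
def countdown_gen(n):
    # Generator: the state (n) lives in the suspended frame
    # between next() calls.
    while n > 0:
        yield n
        n -= 1

class CountdownIter:
    # Equivalent hand-written iterator: the same state,
    # kept explicitly as an attribute.
    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        value = self.n
        self.n -= 1
        return value

print(list(countdown_gen(3)), list(CountdownIter(3)))  # [3, 2, 1] [3, 2, 1]
```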
| throwawaymaths wrote:
| The least they could have done is sugared it to something
| that isn't "def", like "gen" or "defgen"
| jborean93 wrote:
| Unless I'm misunderstanding your comment they do, it's
| `async def` to denote a coroutine.
| nicwolff wrote:
| It's the buried `yield` in a loop somewhere, magically
| changing your function into a generator, that can be
| confusing - it's on the original coder to have the
| docstring say `"""Generator yielding records from DB."""`
| or something.
| dekhn wrote:
| They're referring to generator functions. A generator
| function typically opens a resource and then uses a for
| loop to read (for example) a line at a time and yield it.
| Once you yield, control returns to the caller (like a
| return), but subsequent calls to next() on the returned
| expression pick back up at the yield point. So function
| calls maintain state behind the scenes. It's not just a
| simple call-and-return model anymore.
|
| To me this violates a core principle: the principle of
| least surprise. It completely changes how I have to
| reason about function execution, especially if there are
| conditionals and different yields.
| jacob019 wrote:
| After using gevent for about a decade I have started using
| asyncio for new projects, just because it's in the standard
| library and has the official async blessing of the Python gods.
| Indeed it is way harder. I'm always coming up against little
| gotchas that take time to debug and fix. Part of me enjoys the
| challenge of learning something new and solving little puzzles.
| It's getting easier, especially as I build up a collection of
| in-house async libraries for various things. As for
| performance, it's not too bad for mostly io bound tasks, which
| is why one uses async in the first place. Some tight loop
| benchmarks for message passing with other processes show it to
| be about half the speed of gevent in my case, which is fine.
| It's nice to be able to deploy async microservices without
| installing gevent, and there's a certain value to the
| discipline that it imposes. I like how I am able to bring non-
| async code into the async world using threading. I imagine the
| performance would improve quite a bit with pypy, perhaps
| exceeding that of gevent. Gevent makes it so damn easy, I've
| been spoiled. I was disappointed when asyncio came out, as I
| would have preferred the ecosystem moved in the gevent
| direction instead; but I'm coming around. It's super annoying
| how the python ecosystem has been bifurcated with asyncio. You
| really have to choose one way or another at the beginning of a
| project and stick with it.
|
| And yeah, async programming (in Python) isn't really for CPU
| bound stuff. You might benefit from multiprocessing and just
| use asyncio to coordinate, which is what it excels at. PyPy can
| really help with CPU bound stuff too, if the code is mostly
| pure Python.
| doliveira wrote:
| The clue is in the name: asyncIO is for, well, I/O bound
| stuff...
| m3047 wrote:
| > 100x SLOWER (we measured it multiple times, compared the same
| thing to the sync version)
|
| I'd need to know the methodology. Asyncio is "gated", meaning
| that coroutines execute in tranches. When you queue up a bunch
| of coroutines from another coroutine, they don't execute right
| away instead they go in a list. Then when the current tranche
| completes the next one starts. There's some tidying up which
| occurs with every tranche.
|
| As an (only one) example of "measuring the wrong thing", if you
| queue up a single task 1000 times which takes as its argument a
| unique timer instance it matters whether you queue them up one
| at a time as they finish running or queue them all up at once.
|         #!/usr/bin/python3
|         # (c) 2022 Fred Morris, Tacoma WA USA. Apache 2.0 license.
|         """Illustrating the tranche effect in asyncio."""
|         from time import time
|         import asyncio
|
|         N = 1000
|
|         class TimerInstance(object):
|             def __init__(self, accumulator):
|                 self.accumulator = accumulator
|                 self.start = time()
|
|             def stop(self):
|                 self.accumulator.cumulative += time() - self.start
|
|         class Timer(object):
|             def __init__(self):
|                 self.cumulative = 0.0
|
|             def timer(self):
|                 return TimerInstance(self)
|
|         async def a_task(timer):
|             timer.stop()
|
|         def main():
|             loop = asyncio.get_event_loop()
|
|             timing = Timer()
|             overall = time()
|             for i in range(N):
|                 loop.run_until_complete(loop.create_task(a_task(timing.timer())))
|             print('Sequential: {} Overall: {}'.format(timing.cumulative, time() - overall))
|
|             timing = Timer()
|             overall = time()
|             for i in range(N):
|                 loop.create_task(a_task(timing.timer()))
|             loop.stop()
|             loop.run_forever()
|             print('Tranche: {} Overall: {}'.format(timing.cumulative, time() - overall))
|
|         if __name__ == '__main__':
|             main()
|
|         # ./tranche-demo.py
|         Sequential: 0.02229022979736328 Overall: 0.04146838188171387
|         Tranche: 6.084041595458984 Overall: 0.012317180633544922
| amelius wrote:
| Async programming is Windows 3.1 style cooperative multitasking.
| cturner wrote:
| Different trade-offs. It is much easier to pursue cache
| efficiency in a cooperative multitasking setting than with
| preemptive threads.
| leveraction wrote:
| I have used asyncio through aiohttp, and I have been pretty happy
| with it, but I also started with it from the beginning, so that
| probably made things a little easier.
|
| My setup is a bunch of microservices that each run an aiohttp web
| server based api for calls from the browser where communications
| between services are done async using rabbitmq and a hand rolled
| pub/sub setup. Almost all calls are non-blocking, except for
| calls to Neo4j (sadly, they block, but Neo4j is fast, so it's not
| really a problem).
|
| With an async api I like the fact that I can make very fast https
| replies to the browser while queuing the resulting long-running
| job and then responding back to the Vue-based SPA client over a
| web socket connection. This gives the interface a really snappy
| feel.
|
| But Complex? Oh yes.
|
| But the upside is that it is also a very flexible architecture,
| and I like the code isolation that you get with microservices.
| Nevertheless, more than once I have thought about whether I would
| choose it all again knowing what I know now. Maybe a monolithic
| flask app would have been a lot easier if less sexy. But where's
| the fun in that?
| meitham wrote:
| I share your sentiment and have been using aiohttp for five
| years and pretty happy with it. My current project is a web
| service with blocking SQL server backend, so I tend to do
| loop.run_in_executor for every DB.execute statement. But now
| I'm considering just running a set of lightweight asyncio-streams
| subprocesses with a simple bespoke protocol that takes a SQL
| statement and returns the result JSON-encoded, to move away from
| threads.
| senko wrote:
| > With an async api I like the fact that I can make very fast
| https replies to the browser while queing the resulting long
| running job and then responding back to the Vue based SPA
| client over a web socket connection. This gives the interface a
| really snappy feel.
|
| How does this compare to doing the same with eg. Django
| Channels (or other ASGI-aware frameworks)?
|
| I have yet to find a use case compelling enough to dive into
| async in Python (doesn't help that I also work in JS and Go so
| I just turn to them in cases where I could maybe use
| asyncio). This is not to say it's useless, just that I'm still
| searching for a problem this is the best solution for.
| leveraction wrote:
| I have never used Django, so I cannot say but if Channels
| handles the websocket connection back to the client, then I
| assume you could send back a quick 200 notification as the
| http response to the user and then send the real results
| later over the socket. I think these would be equivalent.
|
| I have also never used Go, but I am comfortable saying that
| Python async is much easier to use than JS async. I find JS
| to be as frustrating as it is unavoidable.
|
| Using aiohttp as an api is not bad at all. Once you have the
| event loop up and running, it's a lot like writing sync code.
| Someone else made a comment about the fact that Python has
| too many ways to do async because everything keeps evolving
| so fast. I think this is true. The first time I ever looked at
| async in Python it was so nasty I basically gave up and
| reconsidered Flask, but came back around later because I so
| despised the idea of blocking my server that I was compelled
| to give it another go. The next time around was a lot easier
| because the libraries were so much improved.
|
| I think a lot of people think that async Python is harder
| than it is (now).
___________________________________________________________________
(page generated 2022-11-10 23:02 UTC)