Asyncio, twisted, tornado, gevent walk into a bar...
... they pay, they leave, they drink, they order.
Aug 22, 2023
Summary
Concurrency has a lot to do with sharing one resource, and Python has
dedicated tools to deal with that depending on the resource you must
share.
If you have to share one CPU while waiting on the network, the
specialized tools for this are asyncio, twisted, trio, gevent, etc.
Asyncio is the current standard for this, but tornado, gevent and
twisted solved the problem more than a decade ago, while trio and
curio are showing us what the future could look like.
But chances are, you should use none of them.
Different forms of concurrency
As Rob Pike said, concurrency "is about dealing with lots of things
at once", unlike parallelism, which is about "doing lots of things at
once".
The typical analogy is this:
* concurrency is having two lines of customers ordering from one
cashier;
* parallelism is having two lines of customers ordering from two
cashiers.
Which means, if you think about it, that concurrency has a lot to do
with sharing one resource.
The question is, which resource?
In a computer, you may have to share different things:
* Battery charge.
* CPU calculation power.
* RAM space.
* Disk space and throughput.
* Network throughput.
* File system handles.
* User input.
* Screen real estate.
* ...
Tons of software is written solely to deal with the fact that we have
to share.
asyncio, twisted, tornado and gevent are such tools, specialized to
let you share a single CPU core more efficiently between several
things that access the network.
Now, it seems a bit counterintuitive to tie "CPU core" and "network"
performance together.
But when you talk to the network, you send messages to the outside
world, and then things get out of your control. The outside world
can answer each message very quickly, or not. You can't affect it
that much.
Meanwhile, your program sits there, waiting for the answer to the
message. And while it sits, what does it do in Python by default? It
keeps the CPU core for itself.
Sure, this waiting and sitting is often only a few milliseconds, so
to a human it seems very quick at first glance. But one millisecond
is a huge amount of time for a computer that does billions of things
in the blink of an eye.
asyncio, twisted, tornado and gevent have one trick up their sleeve:
they can send a message to the network, and while waiting for the
response, wake up another part of the program to do some other work.
And they can do that with many messages in a row. While waiting for
the network, they can let other parts of the program use the CPU
core.
Note that they can only speed up waiting on the network. They will
not run two calculations at the same time (they can't use several CPU
cores like multiprocessing does), and they won't speed up waiting on
other types of I/O (like when you use threads to avoid blocking on
user input or disk writes).
All in all, they are good for writing things like bots (web crawlers,
chat bots, network sniffers, etc.) and servers (web servers, proxies,
...). For maximum benefit, it's possible to use them inside other
concurrency tools, such as multiprocessing or multithreading. You can
perfectly well have 4 processes, each of them containing 4 threads (so
16 threads in total), and each thread running its own asyncio loop, as
sketched below.
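Here is a minimal sketch of that layered setup. The structure (processes containing threads, each with its own loop) is the point; the sleep-based workload is a stand-in I made up for real network calls:

import asyncio
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

async def do_network_work(task_id):
    await asyncio.sleep(0.1)  # stand-in for a real network call
    return task_id

def thread_worker(thread_id):
    # Each thread gets its own event loop via asyncio.run()
    return asyncio.run(do_network_work(thread_id))

def process_worker(process_id):
    # Each process runs 4 threads, each with its own asyncio loop
    with ThreadPoolExecutor(max_workers=4) as threads:
        return list(threads.map(thread_worker, range(4)))

if __name__ == "__main__":
    # 4 processes x 4 threads = 16 threads, each running an asyncio loop
    with ProcessPoolExecutor(max_workers=4) as processes:
        print(list(processes.map(process_worker, range(4))))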
But let's not get sidetracked, and focus on the question at hand:
what are asyncio, tornado and twisted?
A concrete example
Let's take a few URLs from a completely random web site. Our task
will be to get all the titles of all those pages.
Here is how to do that synchronously, with only the stdlib:
import re
import time
from urllib.request import urlopen, Request

urls = [
    "https://www.bitecode.dev/p/relieving-your-python-packaging-pain",
    "https://www.bitecode.dev/p/hype-cycles",
    "https://www.bitecode.dev/p/why-not-tell-people-to-simply-use",
    "https://www.bitecode.dev/p/nobody-ever-paid-me-for-code",
    "https://www.bitecode.dev/p/python-cocktail-mix-a-context-manager",
    "https://www.bitecode.dev/p/the-costly-mistake-so-many-makes",
    "https://www.bitecode.dev/p/the-weirdest-python-keyword",
]

title_pattern = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE)

# We'll pretend to be Firefox or substack is going to kick us
user_agent = (
    "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/116.0"
)

# Let's time how long all this takes
global_start_time = time.time()

for url in urls:
    # let's also time each page processing
    start_time = time.time()
    with urlopen(Request(url, headers={"User-Agent": user_agent})) as response:
        html_content = response.read().decode("utf-8")
    match = title_pattern.search(html_content)
    title = match.group(1) if match else "Unknown"
    print(f"URL: {url}\nTitle: {title}")
    end_time = time.time()
    elapsed_time = end_time - start_time
    print(f"Time taken: {elapsed_time:.4f} seconds\n")

global_end_time = time.time()
global_elapsed_time = global_end_time - global_start_time
print(f"Total time taken: {global_elapsed_time:.4f} seconds")
If we run the script, we can see that, individually, each page load
is not that long:
URL: https://www.bitecode.dev/p/relieving-your-python-packaging-pain
Title: Relieving your Python packaging pain - Bite code!
Time taken: 0.6022 seconds
URL: https://www.bitecode.dev/p/hype-cycles
Title: XML is the future - Bite code!
Time taken: 0.1813 seconds
URL: https://www.bitecode.dev/p/why-not-tell-people-to-simply-use
Title: Why not tell people to "simply" use pyenv, poetry or anaconda
Time taken: 0.9496 seconds
URL: https://www.bitecode.dev/p/nobody-ever-paid-me-for-code
Title: Nobody ever paid me for code - Bite code!
Time taken: 0.3314 seconds
URL: https://www.bitecode.dev/p/python-cocktail-mix-a-context-manager
Title: Python cocktail: mix a context manager and an iterator in equal parts
Time taken: 0.2849 seconds
URL: https://www.bitecode.dev/p/the-costly-mistake-so-many-makes
Title: The costly mistake so many make with numpy and pandas
Time taken: 0.3622 seconds
URL: https://www.bitecode.dev/p/the-weirdest-python-keyword
Title: The weirdest Python keyword - Bite code!
Time taken: 0.5032 seconds
Total time taken: 3.2149 seconds
However, the sum of all of them is quite a lot, because all network
accesses are sequential: the program waits for each one to finish
before starting the next.
Now consider the equivalent with asyncio (this requires installing
httpx, since asyncio doesn't come with an HTTP client):
import asyncio
import re
import time

import httpx

urls = [
    "https://www.bitecode.dev/p/relieving-your-python-packaging-pain",
    "https://www.bitecode.dev/p/hype-cycles",
    "https://www.bitecode.dev/p/why-not-tell-people-to-simply-use",
    "https://www.bitecode.dev/p/nobody-ever-paid-me-for-code",
    "https://www.bitecode.dev/p/python-cocktail-mix-a-context-manager",
    "https://www.bitecode.dev/p/the-costly-mistake-so-many-makes",
    "https://www.bitecode.dev/p/the-weirdest-python-keyword",
]

title_pattern = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE)

user_agent = (
    "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/116.0"
)

# fetch_url() is the concurrency unit for this program. We can start many
# of them and each will wait for the response while letting other tasks run.
async def fetch_url(url):
    start_time = time.time()
    async with httpx.AsyncClient() as client:
        response = await client.get(url, headers={"User-Agent": user_agent})
    match = title_pattern.search(response.text)
    title = match.group(1) if match else "Unknown"
    print(f"URL: {url}\nTitle: {title}")
    end_time = time.time()
    elapsed_time = end_time - start_time
    print(f"Time taken for {url}: {elapsed_time:.4f} seconds\n")

async def main():
    global_start_time = time.time()
    await asyncio.gather(*map(fetch_url, urls))
    global_end_time = time.time()
    global_elapsed_time = global_end_time - global_start_time
    print(f"Total time taken for all URLs: {global_elapsed_time:.4f} seconds")

asyncio.run(main())
While the code is more complex, the performance is much better:
# URL: https://www.bitecode.dev/p/hype-cycles
# Title: XML is the future - Bite code!
# Time taken for https://www.bitecode.dev/p/hype-cycles: 0.1750 seconds
# URL: https://www.bitecode.dev/p/python-cocktail-mix-a-context-manager
# Title: Python cocktail: mix a context manager and an iterator in equal parts
# Time taken for https://www.bitecode.dev/p/python-cocktail-mix-a-context-manager: 0.1656 seconds
# URL: https://www.bitecode.dev/p/the-weirdest-python-keyword
# Title: The weirdest Python keyword - Bite code!
# Time taken for https://www.bitecode.dev/p/the-weirdest-python-keyword: 0.1636 seconds
# URL: https://www.bitecode.dev/p/the-costly-mistake-so-many-makes
# Title: The costly mistake so many make with numpy and pandas
# Time taken for https://www.bitecode.dev/p/the-costly-mistake-so-many-makes: 0.1803 seconds
# URL: https://www.bitecode.dev/p/nobody-ever-paid-me-for-code
# Title: Nobody ever paid me for code - Bite code!
# Time taken for https://www.bitecode.dev/p/nobody-ever-paid-me-for-code: 0.2661 seconds
# URL: https://www.bitecode.dev/p/why-not-tell-people-to-simply-use
# Title: Why not tell people to "simply" use pyenv, poetry or anaconda
# Time taken for https://www.bitecode.dev/p/why-not-tell-people-to-simply-use: 0.2938 seconds
# URL: https://www.bitecode.dev/p/relieving-your-python-packaging-pain
# Title: Relieving your Python packaging pain - Bite code!
# Time taken for https://www.bitecode.dev/p/relieving-your-python-packaging-pain: 0.5334 seconds
# Total time taken for all URLs: 0.5335 seconds
In the end, we are 6 times faster, and with more URLs, the advantage
would grow.
That's because, while the code of all those tasks cannot run at the
same time, asyncio at least makes sure that when one task waits on
the network, it switches to let another task run.
The await and the async with in fetch_url() tell asyncio that, at
those lines, we are going to ask something from the network, and so
it can switch to another fetch_url() task.
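You can see this switching with a toy example (my own illustration, not from the benchmark above): two tasks each await a one-second sleep, yet the whole thing takes about one second, not two.

import asyncio

async def task(name):
    print(f"{name} starts")
    # await hands control back to the event loop,
    # which is free to run another task in the meantime
    await asyncio.sleep(1)
    print(f"{name} resumes")

async def main():
    # Total time: about 1 second, not 2
    await asyncio.gather(task("A"), task("B"))

asyncio.run(main())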
So what's the deal with asyncio, twisted, gevent, trio and all that
stuff?
All those libraries solve the same problem, but there are so many of
them that it can be confusing.
Let's dive in.
asyncio
asyncio is the modern module for asynchronous network programming
provided with the python stdlib since 3.4. In other words, it's the
default stuff at your disposal if you want to code something without
waiting on the network.
asyncio replaces the old, deprecated asyncore module. It is quite low
level, so while you can manually code most network-related things
with it, you are still at the level of TCP or UDP. If you want
higher-level protocols, like FTP, HTTP or SSH, you have to either
code them yourself or install a third-party library.
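To make "low level" concrete, here is a sketch of an HTTP GET written by hand over asyncio's TCP streams. The open_connection() API is real; example.com and plain port 80 are my choices for illustration:

import asyncio

async def raw_http_get(host):
    # asyncio hands you a raw TCP stream; the HTTP protocol is up to you
    reader, writer = await asyncio.open_connection(host, 80)
    request = f"GET / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    writer.write(request.encode())
    await writer.drain()
    status_line = await reader.readline()
    print(status_line.decode().strip())  # e.g. "HTTP/1.1 200 OK"
    writer.close()
    await writer.wait_closed()

asyncio.run(raw_http_get("example.com"))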
Because asyncio is the default solution, it has the biggest ecosystem
of 3rd party libs, and pretty much everything async strives to be
compatible with it, directly or through compatibility layers like
anyio.
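As a quick taste of what such a compatibility layer looks like, here is a small sketch using anyio's task groups; the same code runs on the asyncio or trio backend (the worker function is my own placeholder):

import anyio

async def worker(n):
    await anyio.sleep(0.1)  # backend-agnostic sleep
    print(f"worker {n} done")

async def main():
    # Structured-concurrency API shared by both backends
    async with anyio.create_task_group() as tg:
        for n in range(3):
            tg.start_soon(worker, n)

anyio.run(main)                    # asyncio backend by default
# anyio.run(main, backend="trio")  # same code, trio backend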
Twisted
20 years ago, there was no asyncio, there was no async/await, nodejs
didn't exist and Python 3 was half a decade away. But it was the
.com bubble, and everything needed to be connected now. And so was
born twisted, the grandfather of all the asynchronous frameworks we
have today. Twisted's ecosystem grew to include everything, from mail
to ssh.
To this day, twisted is still a robust and versatile tool. But you do
pay the price of its age. It doesn't follow PEP8 very well, and the
design leans on the heavy side.
Here is a typical asyncio http request:
import asyncio

import httpx

async def fetch_url(url):
    async with httpx.AsyncClient() as client:
        response = await client.get(url)
    print("Response received")

asyncio.run(fetch_url("https://www.bitecode.dev/p/relieving-your-python-packaging-pain"))
And here is the code you will find in the twisted docs:
from twisted.internet import reactor
from twisted.web.client import Agent
from twisted.web.http_headers import Headers

agent = Agent(reactor)

d = agent.request(
    b"GET",
    b"https://www.bitecode.dev/p/relieving-your-python-packaging-pain",
    Headers({"User-Agent": ["Twisted Web Client Example"]}),
    None,
)

def cbResponse(ignored):
    print("Response received")

d.addCallback(cbResponse)

def cbShutdown(ignored):
    reactor.stop()

d.addBoth(cbShutdown)

reactor.run()
To be fair, the code can be reduced to:
from twisted.internet import reactor
from twisted.web.client import Agent

agent = Agent(reactor)
d = agent.request(b"GET", b"https://www.bitecode.dev/p/relieving-your-python-packaging-pain")
d.addCallback(lambda ignored: print("Response received"))
d.addBoth(lambda ignored: reactor.stop())
reactor.run()
But in the end, you still get a lot of drawbacks:
* The doc, as you noticed, doesn't have your back.
* There are too many things to know. Want SSL support? You need to
install twisted[tls], not twisted. You have to use bytes, not
strings. You have to manually tear down the reactor.
* Twisted does support async/await, but I leave you to figure out
how to convert this snippet to use it. You'll see.
* ChatGPT will not help you much. Most twisted code is old, and uses
the old APIs and code style. You'll get... twisted results.
* Google will find many examples still using yield and other
contraptions.
* The deferred system is like "futures" or "promises" in other
tools. Until it's not.
* High-level libs are hard to find.
For the last point, you may think it's not important, but it's easy
to find httpx or aiohttp for asyncio. However, unless you know the
ecosystem, good luck finding out about the existence of treq, the
de facto high-level HTTP lib. It turns the code into something much
nicer:
import treq
from twisted.internet import reactor

def done(response):
    print("Response received")
    reactor.stop()

deferred = treq.get("https://www.bitecode.dev/p/relieving-your-python-packaging-pain")
deferred.addCallback(done)

reactor.run()
I do appreciate that asynchronous operations are automatically
scheduled, though. Having to call asyncio.create_task() is one of the
things I hate in Python, and having more natural, JS-like calls is
much simpler.
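Here is a minimal illustration of that complaint (my own example): in asyncio, calling a coroutine function does nothing by itself; you have to explicitly hand it to the event loop.

import asyncio

async def ping():
    print("ping")

async def main():
    coro = ping()  # nothing runs yet, this just creates a coroutine object
    # You must explicitly schedule it (or await it) for it to execute
    task = asyncio.create_task(coro)
    await task

asyncio.run(main())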
In short, the product is good, but the developer experience is not.
And I say that as a co-author of Expert Twisted.
When I was younger, I got an interview to work for Jamendo. The boss
said they were interested in my profile solely because I had heard of
Twisted. Not because I knew how to use it. Heard of it. It was so
hard to use that few people had.
Also, I got rejected because I wanted to work from home. Post COVID
it seems funny, doesn't it?
Tornado
Tornado was developed after Twisted, by FriendFeed, in that weird
2005-2015 web dev period where everything needed to be social and
web scale. It was like Twisted, but touted to be faster, and it was
higher level. Out of the box, the HTTP story is way nicer.
Today, you are unlikely to use Tornado unless you work at Facebook or
contribute to jupyter. After all, if you want to make async web
things, the default tool is FastAPI in 2023.
gevent
Gevent is a weird one for me. It came about in 2009, the same year as
Tornado, but with a fundamentally different design. Instead of
attempting to provide an asynchronous API, it decided to do black
magic. When you use gevent, you call from gevent import monkey;
monkey.patch_all() and it changes the underlying mechanisms of Python
networking, making everything non-blocking.
I used to fear gevent at the time:
* You had to compile it from source, or use eggs, and that had too
many ways to fail.
* Monkey patching is brittle by nature, so you rolled the dice
every time you used it.
* Task switching is implicit, and you never knew what dragon would
await.
So I avoided it like the plague.
Ironically, today, gevent is becoming quite appealing.
Thanks to wheels, installing it is simple and robust. Monkey patching
has had more than a decade to be polished, so it's now quite reliable.
And the implicit task switching becomes just a trade-off against
async/await colored functions.
Because of the way gevent works, you can take a blocking script, and
with very few modifications, make it async. Let's take the original
stdlib one, and convert it to gevent:
import re
import time

import gevent
from gevent import monkey

# We magically patch everything.
# THIS MUST BE DONE BEFORE IMPORTING URLLIB
monkey.patch_all()

from urllib.request import Request, urlopen

urls = [
    "https://www.bitecode.dev/p/relieving-your-python-packaging-pain",
    "https://www.bitecode.dev/p/hype-cycles",
    "https://www.bitecode.dev/p/why-not-tell-people-to-simply-use",
    "https://www.bitecode.dev/p/nobody-ever-paid-me-for-code",
    "https://www.bitecode.dev/p/python-cocktail-mix-a-context-manager",
    "https://www.bitecode.dev/p/the-costly-mistake-so-many-makes",
    "https://www.bitecode.dev/p/the-weirdest-python-keyword",
]

title_pattern = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE)

user_agent = (
    "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/116.0"
)

# We move the fetching into a function so we can isolate it into a green thread
def fetch_url(url):
    start_time = time.time()
    headers = {"User-Agent": user_agent}
    with urlopen(Request(url, headers=headers)) as response:
        html_content = response.read().decode("utf-8")
    match = title_pattern.search(html_content)
    title = match.group(1) if match else "Unknown"
    print(f"URL: {url}\nTitle: {title}")
    end_time = time.time()
    elapsed_time = end_time - start_time
    print(f"Time taken: {elapsed_time:.4f} seconds\n")

def main():
    global_start_time = time.time()
    # Here is where we turn the synchronous calls into async ones:
    # one green thread per URL
    greenlets = [gevent.spawn(fetch_url, url) for url in urls]
    gevent.joinall(greenlets)
    global_end_time = time.time()
    global_elapsed_time = global_end_time - global_start_time
    print(f"Total time taken: {global_elapsed_time:.4f} seconds")

main()
No async, no await. No special lib except for gevent. In fact, it
would work with the requests lib just as well. Very few modifications
are needed, for a net perf gain:
URL: https://www.bitecode.dev/p/the-weirdest-python-keyword
Title: The weirdest Python keyword - Bite code!
Time taken: 0.1896 seconds
URL: https://www.bitecode.dev/p/relieving-your-python-packaging-pain
Title: Relieving your Python packaging pain - Bite code!
Time taken: 0.2071 seconds
URL: https://www.bitecode.dev/p/python-cocktail-mix-a-context-manager
Title: Python cocktail: mix a context manager and an iterator in equal parts
Time taken: 0.1955 seconds
URL: https://www.bitecode.dev/p/why-not-tell-people-to-simply-use
Title: Why not tell people to "simply" use pyenv, poetry or anaconda
Time taken: 0.2764 seconds
URL: https://www.bitecode.dev/p/nobody-ever-paid-me-for-code
Title: Nobody ever paid me for code - Bite code!
Time taken: 0.3167 seconds
URL: https://www.bitecode.dev/p/the-costly-mistake-so-many-makes
Title: The costly mistake so many make with numpy and pandas
Time taken: 0.4341 seconds
URL: https://www.bitecode.dev/p/hype-cycles
Title: XML is the future - Bite code!
Time taken: 0.4432 seconds
The only danger is calling gevent.monkey.patch_all() too late: you
get a cryptic error that crashes your program.
So I'm much more likely to use gevent in 2023 than in 2009, as it now
has very good value, especially for utility scripts. It's the HTMX of
async libs: it's simple, does a lot for a low cost, and you can use
your old toolbox.
trio
For many years, the very talented dev and speaker David Beazley has
been showing unease with asyncio's design, making more and more
experiments and public talks about what an alternative could look
like. It culminated in the excellent Die Threads presentation, live
coding the sum of all those ideas, which would eventually become the
curio library. Watch it. It's so good.
Meanwhile, Nathaniel J. Smith published "Notes on structured
concurrency, or: Go statement considered harmful", an article that
made ripples in the async-loving community. The article is quite
complex, but the core idea is simple: spawning a coroutine (or
goroutine, or green thread) is like calling goto. For the youngsters
among the readers, that's a reference to a famous letter by Edsger
Dijkstra.
In short, it states that every time you start an async task, just
like with goto, you jump somewhere else, in some other part of the
code. Which means it's very hard to know where a coroutine comes
from, when it started, where it's going, or when it's going to stop.
Scopes and lifespans are suddenly opaque, which makes reasoning about
the whole program difficult.
And according to him, this is a problem with the design of our
asyncio API, not with the nature of async itself.
Nathaniel didn't just come with a problem, he also brought a
solution: a new kind of design for async handling, inspired by
Beazley's concepts, with a few twists.
This solution grew into a library, trio.
Trio is not compatible with asyncio, nor with gevent or twisted, by
default. This means it's also its own little async island.
But in exchange, it provides a very different internal take on how to
deal with this kind of concurrency, one where every coroutine is tied
to an explicit scope, and everything can be awaited easily, or
canceled.
The code isn't that different from your typical asyncio script:
import re
import time

import httpx
import trio

urls = [
    "https://www.bitecode.dev/p/relieving-your-python-packaging-pain",
    "https://www.bitecode.dev/p/hype-cycles",
    "https://www.bitecode.dev/p/why-not-tell-people-to-simply-use",
    "https://www.bitecode.dev/p/nobody-ever-paid-me-for-code",
    "https://www.bitecode.dev/p/python-cocktail-mix-a-context-manager",
    "https://www.bitecode.dev/p/the-costly-mistake-so-many-makes",
    "https://www.bitecode.dev/p/the-weirdest-python-keyword",
]

title_pattern = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE)

user_agent = (
    "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/116.0"
)

async def fetch_url(url):
    start_time = time.time()
    async with httpx.AsyncClient() as client:
        headers = {"User-Agent": user_agent}
        response = await client.get(url, headers=headers)
    match = title_pattern.search(response.text)
    title = match.group(1) if match else "Unknown"
    print(f"URL: {url}\nTitle: {title}")
    end_time = time.time()
    elapsed_time = end_time - start_time
    print(f"Time taken for {url}: {elapsed_time:.4f} seconds\n")

async def main():
    global_start_time = time.time()
    # That's the biggest API difference
    async with trio.open_nursery() as nursery:
        for url in urls:
            nursery.start_soon(fetch_url, url)
    global_end_time = time.time()
    global_elapsed_time = global_end_time - global_start_time
    print(f"Total time taken for all URLs: {global_elapsed_time:.4f} seconds")

if __name__ == "__main__":
    trio.run(main)
Thanks to anyio, it can even use httpx like asyncio does. So what's
the big deal?
Well, for one, it's a tad faster:
URL: https://www.bitecode.dev/p/python-cocktail-mix-a-context-manager
Title: Python cocktail: mix a context manager and an iterator in equal parts
Time taken for https://www.bitecode.dev/p/python-cocktail-mix-a-context-manager: 0.1029 seconds
URL: https://www.bitecode.dev/p/hype-cycles
Title: XML is the future - Bite code!
Time taken for https://www.bitecode.dev/p/hype-cycles: 0.1203 seconds
URL: https://www.bitecode.dev/p/nobody-ever-paid-me-for-code
Title: Nobody ever paid me for code - Bite code!
Time taken for https://www.bitecode.dev/p/nobody-ever-paid-me-for-code: 0.1137 seconds
URL: https://www.bitecode.dev/p/the-costly-mistake-so-many-makes
Title: The costly mistake so many make with numpy and pandas
URL: https://www.bitecode.dev/p/relieving-your-python-packaging-pain
Title: Relieving your Python packaging pain - Bite code!
Time taken for https://www.bitecode.dev/p/the-costly-mistake-so-many-makes: 0.1074 seconds
Time taken for https://www.bitecode.dev/p/relieving-your-python-packaging-pain: 0.1286 seconds
URL: https://www.bitecode.dev/p/why-not-tell-people-to-simply-use
Title: Why not tell people to "simply" use pyenv, poetry or anaconda
Time taken for https://www.bitecode.dev/p/why-not-tell-people-to-simply-use: 0.1220 seconds
URL: https://www.bitecode.dev/p/the-weirdest-python-keyword
Title: The weirdest Python keyword - Bite code!
Time taken for https://www.bitecode.dev/p/the-weirdest-python-keyword: 0.1883 seconds
Total time taken for all URLs: 0.2133 seconds
Also, because it doesn't create nor schedule coroutines immediately
(notice that nursery.start_soon(fetch_url, url) is not
nursery.start_soon(fetch_url(url))), it will also consume less
memory. But the most important part is the nursery:
# That's the biggest API difference
async with trio.open_nursery() as nursery:
    for url in urls:
        nursery.start_soon(fetch_url, url)
The with block scopes all the tasks, meaning everything that is
started inside that context manager is guaranteed to be finished (or
terminated) when it exits. First, the API is better than expecting
the user to wait manually as with asyncio.gather: in trio, you cannot
start concurrent coroutines without a clear scope; it doesn't rely
on the coder's discipline. But under the hood, the design is also
different. The whole group of coroutines you start can be canceled
easily, because trio always knows where things begin and end. As soon
as things get complicated, code with a curio-like design becomes
radically simpler than code with an asyncio-like design.
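For instance, here is a sketch of that easy cancellation (my own example; move_on_after and open_nursery are real trio APIs): every task in the nursery is cancelled cleanly when the deadline hits.

import trio

async def chatter(name):
    while True:  # would run forever on its own
        await trio.sleep(1)
        print(f"{name} is still working...")

async def main():
    # When the 3-second deadline hits, the cancel scope cancels
    # everything started in the nursery, leaving no orphan tasks
    with trio.move_on_after(3):
        async with trio.open_nursery() as nursery:
            nursery.start_soon(chatter, "A")
            nursery.start_soon(chatter, "B")
    print("all tasks cancelled, moving on")

trio.run(main)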
Would I recommend using trio in prod for now? No.
The asyncio ecosystem and compat advantage is too good to pass on.
But it is inspiring changes in asyncio's design itself in very
positive ways. E.g., in 3.11, we got asyncio.TaskGroup, a concept
similar to nurseries, and exception groups were added to Python. This
can't change the underlying asyncio design, but the API is way nicer.
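Here is what that looks like, as a minimal sketch with placeholder URLs of my own (the TaskGroup API itself is real, Python 3.11+):

import asyncio

async def fetch_url(url):
    await asyncio.sleep(0.1)  # stand-in for a real request
    print(f"done: {url}")

async def main():
    # Nursery-like scoping, straight from the stdlib (Python 3.11+)
    async with asyncio.TaskGroup() as tg:
        for url in ["https://a.example", "https://b.example"]:
            tg.create_task(fetch_url(url))
    # Here, all tasks are guaranteed to be finished or cancelled

asyncio.run(main())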
Fast API
Fast API is not in the same category as the other libs. It's not
supposed to be part of this article, because it's a completely
different beast, but people asked questions about it and seemed
confused, so I decided to add a little section to clarify.
Fast API is a high-level web framework like flask, except it happens
to be async, unlike flask. It comes with the added benefit of using
type hints and pydantic to generate schemas.
It's not a building block like twisted, gevent, trio or asyncio. In
fact, it's built on top of asyncio. It's in the same group as flask,
bottle, django, pyramid, etc. Although it's a micro-framework, so
it's focused on routing, data validation and API delivery.
I use Fast API when I want to make a quick little web API. It
basically replaced flask for everything I used it for, unless my
clients or co-workers ask for it, of course.
But don't get me wrong, Django is my web framework of choice, as I
very rarely need a fully async web framework, and with django-ninja,
it's very easy to build a Web API almost like Fast API.
Also, and this will be the topic of another controversial article, I
strongly believe beginners should start their first serious project
with django and not flask, despite the fact most people see it the
other way around. Flask is fine for learning, or for serious projects
if you know what you are doing. In the middle lie troubles.
What to use?
So which async lib to use?
Well, probably none of them.
Those tools serve a very niche purpose, and most people don't
encounter it very often.
In fact, I would dare to say that the vast majority of developers are
not working on problems where network performance is an issue that
hasn't been solved in a better way. At least at their scale.
If you are doing a web site, blocking frameworks like Django or Flask
are fine, and you'll need a task queue no matter your stack. Small to
medium companies rarely build services that would need more than
that. Even a lot of big companies probably don't need more.
If you are doing calculations, this is CPU related, and they will not
help you.
If you need to get a few URLs fast, a ThreadPoolExecutor is likely
the Pareto solution 99% of the time, as shown below.
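For example, a sketch of that Pareto solution, reusing two of the article's URLs (the worker function and pool size are my choices):

from concurrent.futures import ThreadPoolExecutor
from urllib.request import Request, urlopen

urls = [
    "https://www.bitecode.dev/p/hype-cycles",
    "https://www.bitecode.dev/p/the-weirdest-python-keyword",
]

def fetch(url):
    # The same blocking code as before, just run in a thread pool
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request) as response:
        return url, len(response.read())

with ThreadPoolExecutor(max_workers=8) as pool:
    for url, size in pool.map(fetch, urls):
        print(f"{url}: {size} bytes")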
If you need an industrial crawler, scrapy is there for you.
You have to understand that async programming is hard, and no matter
how good the tooling is, it's going to make your code more difficult
to manage. It has a high price.
Ok, but let's say you are sure you need some async lib, then which
one?
If you are asking this question, then just go with asyncio. It's
standard, so you'll get more doc, more 3rd party components, more
resources in general.
I'll add that you probably should not use asyncio manually. Use a
higher-level asyncio-based lib, or better, a framework. Async is hard
enough as it is.
Some of you may think: "but wait, for this particular task, I do need
async, and I did my research, and asyncio is not the proper tool".
But then, you don't need the help of this article to decide, you
already have the skills to make an educated choice.
Get a Future object in the next article!