Over the past year, my team has been transitioning from Flask to
aiohttp. We’re making this transition because of the situations where
non-blocking I/O should, in theory, scale better:
- large numbers of simultaneous connections
- remote HTTP requests with long response times
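The intuition behind both points can be shown with a stdlib-only sketch: a single event loop can wait on hundreds of slow operations at once, because each `await` yields control instead of blocking. Here `asyncio.sleep` stands in for a slow remote HTTP call.

```python
import asyncio
import time

async def fake_request(delay):
    # Stands in for a slow remote HTTP call; awaiting yields the event loop
    # so other "requests" make progress in the meantime.
    await asyncio.sleep(delay)
    return delay

async def main():
    # 500 concurrent "requests", each spending 0.1s in I/O wait.
    results = await asyncio.gather(*(fake_request(0.1) for _ in range(500)))
    return len(results)

start = time.perf_counter()
count = asyncio.run(main())
elapsed = time.perf_counter() - start
# Wall time is roughly 0.1s plus overhead, not 500 x 0.1s,
# and no extra threads were created.
```

The same 500 waits on a thread-per-request model would need 500 threads, each with its own stack, which is where the memory argument below comes from.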
There is agreement that asyncio scales better memory-wise: a green thread
in Python consumes less memory than a system thread.
However, performance under latency and load is more contentious, and the
best way to settle the question is a practical experiment.
So I forked py-frameworks-benchmark and designed one.
The conditions of the web application, and the work performed, are identical:
- a route on a web server that makes an HTTP request to an nginx server
  returning HTML, then returns the response as JSON
- a wrk benchmark run, with 400 concurrent connections for 20 seconds
- running under gunicorn, with two worker processes.
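As a concrete picture of the workload, here is a minimal sketch of the route (not the fork’s exact code) written against aiohttp; the upstream URL and route path are assumptions standing in for the nginx server.

```python
import aiohttp
from aiohttp import web

async def handler(request):
    upstream = request.app["upstream"]  # assumed nginx URL
    # The remote request awaits without blocking the event loop, so this
    # worker can service other connections while nginx responds.
    async with aiohttp.ClientSession() as session:
        async with session.get(upstream) as resp:
            body = await resp.text()
    # Return the result as JSON, as in the experiment.
    return web.json_response({"length": len(body)})

def make_app(upstream_url):
    app = web.Application()
    app["upstream"] = upstream_url
    app.router.add_get("/test", handler)
    return app

if __name__ == "__main__":
    # Assumes nginx is serving HTML on localhost:8080.
    web.run_app(make_app("http://localhost:8080/"))
```

A benchmark run against it would then look something like `wrk -c 400 -d 20s http://localhost:8000/test` (port assumed).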
The variants are:
- flask + meinheld
- flask + gevent
- flask + multithreading, varying the thread count from 10 to 1000
- aiohttp
You can probably get a sense that aiohttp can serve more requests than any
other variant. To get a real sense of how the threading implementations
scale, we can plot the request count against the number of threads.
Interestingly, the meinheld worker didn’t scale well at all.
Gevent handled requests faster than any threading implementation.
But nothing handled nearly as many requests as aiohttp.
These are the results on my machine. I’d strongly suggest you try the experiment
for yourself: the code is available in my fork.
If anyone has any improvements on the multithreading side, or can explain the discrepancy in performance, I’d love to understand more.