Over the past year, my team has been moving from Flask to aiohttp. We're making this transition because there are several situations where non-blocking I/O should, in theory, scale better:
- large numbers of simultaneous connections
- remote http requests with long response times
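To make the second point concrete, here is a minimal sketch using only the standard library. It is not part of the benchmark: `asyncio.sleep` stands in for a slow remote response, and the names `slow_remote_call` and `run_many` are made up for illustration. The idea is that one thread can wait on hundreds of slow calls at once.

```python
import asyncio
import threading
import time


async def slow_remote_call(i, delay=0.05):
    # asyncio.sleep is a stand-in for waiting on a slow upstream server.
    await asyncio.sleep(delay)
    # Record which OS thread ran this coroutine.
    return i, threading.get_ident()


async def run_many(n=400):
    start = time.perf_counter()
    # Schedule all n calls concurrently and wait for every result.
    results = await asyncio.gather(*(slow_remote_call(i) for i in range(n)))
    elapsed = time.perf_counter() - start
    return results, elapsed


results, elapsed = asyncio.run(run_many())
thread_ids = {tid for _, tid in results}
print(f"{len(results)} calls on {len(thread_ids)} thread(s) in {elapsed:.2f}s")
```

Run one after another, 400 calls of 50 ms each would take about 20 seconds. Here they overlap on a single thread, so the total should stay close to the delay of a single call.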
There is agreement that asyncio scales better memory-wise: a coroutine, much like a green thread, consumes less memory than a system thread.
However, performance in terms of latency and load is a bit more contentious. The best way to find out is to run a practical experiment, so I forked py-frameworks-benchmark and designed one.
The Experiment
The web application and the work it performs are identical across all variants:
- a route on a web server that:
  1. returns the response as JSON
  2. makes an HTTP request to an nginx server, which returns HTML
- a wrk benchmark run, with 400 concurrent requests for 20 seconds
- running under gunicorn, with two worker processes.
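As a rough sketch of what the aiohttp side of such a route could look like (not the code from the fork): the handler below fetches a page from an upstream server and returns a small JSON document. `NGINX_URL`, the JSON fields, and the shared-session setup are all assumptions made for illustration.

```python
import aiohttp
from aiohttp import web

# Assumption: a hypothetical address for the nginx server serving HTML.
NGINX_URL = "http://127.0.0.1:8080/"


async def client_session_ctx(app):
    # Create one shared HTTP client session at startup...
    app["session"] = aiohttp.ClientSession()
    yield
    # ...and close it on shutdown.
    await app["session"].close()


async def handle(request):
    session = request.app["session"]
    # Non-blocking request: other handlers keep running while this one waits.
    async with session.get(NGINX_URL) as resp:
        status = resp.status
        body = await resp.text()
    # The JSON shape here is illustrative only.
    return web.json_response({"status": status, "length": len(body)})


def create_app():
    app = web.Application()
    app.cleanup_ctx.append(client_session_ctx)
    app.router.add_get("/", handle)
    return app


if __name__ == "__main__":
    web.run_app(create_app(), port=8000)
```

The Flask variants would do the same work with a blocking HTTP client, so each in-flight request occupies a thread (or greenlet) for the whole upstream round trip.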
The Variants
The variants are:
- aiohttp
- flask + meinheld
- flask + gevent
- flask + multithreading, varying from 10 to 1000 threads.
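For anyone reproducing the setup, the sketch below shows one way the variants could be launched and load-tested. It only builds command lines. The worker-class values are my assumption about how each variant maps onto gunicorn, the module name `app:app` is hypothetical, and the fork's actual scripts may differ.

```python
WORKERS = 2  # from the experiment: two gunicorn worker processes

# Assumed gunicorn worker class for each variant.
WORKER_FLAGS = {
    "aiohttp": ["--worker-class", "aiohttp.GunicornWebWorker"],
    "flask+meinheld": ["--worker-class", "egg:meinheld#gunicorn_worker"],
    "flask+gevent": ["--worker-class", "gevent"],
    "flask+threads": ["--worker-class", "gthread"],
}


def gunicorn_command(variant, app="app:app", threads=None):
    """Build the gunicorn command line for one variant."""
    cmd = ["gunicorn", "--workers", str(WORKERS)] + WORKER_FLAGS[variant]
    if threads is not None:
        # Only meaningful for the threaded variant.
        cmd += ["--threads", str(threads)]
    return cmd + [app]


def wrk_command(url="http://127.0.0.1:8000/"):
    """Build the wrk command: 400 connections for 20 seconds."""
    return ["wrk", "-c", "400", "-d", "20s", url]


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(" ".join(gunicorn_command("flask+threads", threads=n)))
    print(" ".join(wrk_command()))
```

Keeping the worker count fixed at two means the only thing that changes between runs is the concurrency model inside each worker.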
You can probably get a sense that aiohttp can serve more requests than any other variant. To get a real sense of how threads scale, we can put the request count on a chart:
Interestingly, the meinheld worker didn't scale very well at all. Gevent handled requests faster than any threading implementation.
But nothing handled nearly as many requests as aiohttp.
These are the results on my machine. I'd strongly suggest you try the experiment for yourself: the code is available in my fork.
If anyone has any improvements on the multithreading side, or can explain the discrepancy in performance, I'd love to understand more.