Aiohttp vs Multithreaded Flask for High I/O Applications

Over the past year, my team has been transitioning from Flask to
aiohttp, because there are several situations where non-blocking I/O
should theoretically scale better:

  • large numbers of simultaneous connections
  • remote http requests with long response times
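Both bullet points reduce to the same property, which can be sketched with the standard library alone, no framework required: a single thread can keep many slow I/O waits in flight at once (here `asyncio.sleep` stands in for the remote http request):

```python
import asyncio
import time

async def slow_remote_call():
    # stand-in for a remote http request with a long response time
    await asyncio.sleep(0.05)
    return {"status": "ok"}

async def handle_all(n):
    # n simultaneous connections, one coroutine each, all on one thread
    return await asyncio.gather(*(slow_remote_call() for _ in range(n)))

loop = asyncio.new_event_loop()
start = time.monotonic()
results = loop.run_until_complete(handle_all(400))
elapsed = time.monotonic() - start
loop.close()
# 400 concurrent 50ms waits finish in roughly 50ms of wall time,
# because no coroutine blocks the thread while waiting
```

A thread pool doing the same blocking waits needs one thread per in-flight request to match this.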

There is agreement that asyncio scales better memory-wise: a coroutine
in Python consumes far less memory than a system thread.

However, performance under latency and load is more contentious. The
best way to find out is to run a practical experiment, so I forked
py-frameworks-benchmark and designed one.

The Experiment

The conditions of the web application, and the work performed, are identical:

  • a route on a web server that: 1. performs an http request to an nginx
    server returning back html, and 2. returns the response as json
  • a wrk benchmark run, with 400 concurrent requests for 20 seconds
  • running under gunicorn, with two worker processes
  • python3.6

The Variants

The variants are:

  • aiohttp
  • flask + meinheld
  • flask + gevent
  • flask + multithreading, varying from 10 to 1000.


(latency columns in milliseconds; duration in seconds)

variant         min     p50      p99      p99.9    max      mean     duration  requests
aiohttp         163.27  247.72   352.75   404.59   1414.08  257.59   20.10     30702
flask:gevent    85.02   945.17   6587.19  8177.32  8192.75  1207.66  20.08     7491
flask:meinheld  124.99  2526.55  6753.13  6857.55  6857.55  3036.93  20.10     190
flask:10        163.05  4419.11  4505.59  4659.46  4667.55  3880.05  20.05     1797
flask:20        110.23  2368.20  3140.01  3434.39  3476.06  2163.02  20.09     3364
flask:50        122.17  472.98   3978.68  8599.01  9845.94  541.13   20.10     4606
flask:100       118.26  499.16   4428.77  8714.60  9987.37  556.77   20.10     4555
flask:200       112.06  459.85   4493.61  8548.99  9683.27  527.02   20.10     4378
flask:400       121.63  526.72   3195.23  8069.06  9686.35  580.54   20.06     4336
flask:800       127.94  430.07   4503.95  8653.69  9722.19  514.47   20.09     4381
flask:1000      184.76  732.21   1919.72  5323.73  7364.60  786.26   20.04     4121

You can probably get a sense that aiohttp can serve more requests than any
other variant. To get a real sense of how the threads scale, we can put the
request counts on a chart:


The interesting note is that the meinheld worker didn’t scale very well at all.
Gevent handled requests faster than any threading implementation.

But nothing handled nearly as many requests as aiohttp.

These are the results on my machine. I’d strongly suggest you try the experiment
for yourself: the code is available in my fork.

If anyone has any improvements on the multithreading side, or can explain the discrepancy in performance, I’d love to understand more.

MongoDB Streaming Pattern, Allowing for Batching

An interesting problem arose at work today, regarding how to build an
aggregate of changes to a MongoDB collection.

A more general version of the problem is:

  1. you have a document which has multiple buckets it could
    belong to. Say, an animal with an arbitrary set of tags,
    such as [“mammal”, “wings”], and a discrete location, one of
    [“backyard”, “frontyard”, “house”].

    an example document could look like:

    { "name": "Cat",
      "location": "house",
      "tags": ["mammal", "ears"]
    }
  2. Make it easy to retrieve the count per location, by tag. So:

     { "tag": "mammal",
       "location": {
         "house": 10,
         "backyard": 4,
         "frontyard": 2
       }
     }

The animal location is updated regularly, so the aggregates
can change over time.

A First Attempt

The simplest way to perform this is to rely on Mongo to retrieve all
animals that match the tag by indexing the tag field, then handling
the query and count in the application.

This works well for small scales. However, performing the action in
this way requires a scanning query per aggregate, and that must scan
every document returned to perform the aggregate. So, O(matched_documents):

from collections import defaultdict

def return_count_by_tag(tag_name):
    result = {
        "tag": tag_name,
        "location": defaultdict(int)
    }
    # project only the location field; still one scan per tag
    for doc in db.animals.find({"tag": tag_name}, {"location": 1}):
        result["location"][doc["location"]] += 1

    return result

In our case, we needed to return an answer for every tag, within a
minute. We were able to scale the approach with this constraint in
mind to 35,000 tags and 120,000 documents. At that point, the
application was unable to build the aggregates fast enough.

The New Strategy

The main disadvantage of the previous design is that the aggregate
counts do not need to be calculated at read time: if we can ensure
consistent count updates as the location actually changes per
document, we can perform O(tag_count) updates per document instead.

The comparative complexity over a minute is:

  • old: len(distinct_tags) * len(average_animals_per_tag)
  • new: len(updates_per_minute) * len(average_tag_count_per_animal)

So, if we have:

  • 30,000 tags
  • 120,000 animals
  • 40 animals average per tag
  • (40 * 30,000) / (120,000) = 10 tags per animal
  • 10000 updates a minute

The number of documents touched is:

old: 30k * 40 = 1.2 million reads
new: 10k * 10 = 100,000 writes
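The arithmetic above can be checked directly with the figures from the list:

```python
distinct_tags = 30000
total_animals = 120000
avg_animals_per_tag = 40
updates_per_minute = 10000

# (40 * 30,000) / 120,000 = 10 tags per animal
avg_tags_per_animal = (avg_animals_per_tag * distinct_tags) // total_animals

old_reads = distinct_tags * avg_animals_per_tag        # documents scanned per minute
new_writes = updates_per_minute * avg_tags_per_animal  # documents written per minute
```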

So, we can scale a bit better by handling writes over reads. This
becomes an even better ratio if the updates occur at a less frequent rate.

So, the stream processing works by:

  1. every desired change is enqueued into a queue (in Mongo, this can
    be implemented as a capped collection)
  2. a worker process pulls from the queue, and processes the results.

The worker process:

  1. reads a watermark value of where it had processed
    previously (Mongo ObjectIds increase relative to time and insertion
    order, so it can be used as the watermark)
  2. performs the work required
  3. saves the results to the aggregate collection
  4. writes the new watermark value of where it finished processing.
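The loop above can be sketched in memory. Real code would use pymongo against a capped collection, with ObjectIds as watermarks, but plain integers and dicts show the shape (all names here are illustrative):

```python
# change queue: in Mongo this would be a capped collection, and _id an
# ObjectId (which increases with insertion order, so it can be a watermark)
queue = [
    {"_id": 1, "tags": ["mammal"], "location": "house"},
    {"_id": 2, "tags": ["mammal", "wings"], "location": "backyard"},
    {"_id": 3, "tags": ["mammal"], "location": "house"},
]
aggregates = {}            # (tag, location) -> count
state = {"watermark": 0}   # persisted alongside the aggregates

def process_pending():
    # 1. read the watermark of where we processed previously
    watermark = state["watermark"]
    # 2. perform the work required, for each change newer than the watermark
    for change in (c for c in queue if c["_id"] > watermark):
        for tag in change["tags"]:
            key = (tag, change["location"])
            aggregates[key] = aggregates.get(key, 0) + 1
        watermark = change["_id"]
    # 3. / 4. save the results and the new watermark
    state["watermark"] = watermark

process_pending()
```

Because the watermark only advances after the work is saved, re-running the worker processes nothing it has already seen.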

You could also delete records as you process them, but that can cause
issues if you need to read a record again, or if multiple workers
need them.

Starting from Scratch

So how do we allow starting from scratch? Or, rebuilding the
aggregates if an issue occurs?

There could be a function that performs the whole collection
calculation, dumps it to the collection, and sets the watermark to
whatever the most recent object is in the queue.

Unfortunately, this process and the worker process cannot run at the
same time. If they do, the aggregate collection will be corrupted: the
rebuild queries an older version of the data while the worker applies
updates to the live aggregates, and those updates are then overwritten
by the rebuild’s stale copy.

Thus, we must ensure that the update worker does not run at the same
time as the batch worker.

A locking strategy

In Mongo, the locking is decided by the database, and a user has no
control over that. Thus, we must implement our own locking functionality by
using Mongo primitives.

The same record that holds the watermark could also hold the lock. To
survive a worker dying halfway and never releasing the lock, we record
a lock owner, so the same process type can begin the operation again:

{ "name": "pet-aggregates",
  "watermark": ObjectId("DEADBEEF"),
  "lock": {
      "type": "update" // could also be type: bulk
  }
}
Using this type of lock, the possible failure scenarios are:

  1. update process lock, failure, and update doesn’t run again:
    This requires manually looking at the issue, resolving, and restarting the queue.
  2. bulk process lock, failure, and bulk doesn’t run again:
    This requires manually looking at the issue, resolving, and restarting the queue.
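The acquire/release cycle can be sketched in memory. In Mongo, the acquire step would be a single atomic find_one_and_update whose filter only matches an unlocked (or same-owner) record; the field names here are illustrative:

```python
state = {"name": "pet-aggregates", "watermark": None, "lock": None}

def try_acquire(record, lock_type):
    # succeeds if unlocked, or if the same process type is retrying
    # after a crash, so a dead worker cannot block its replacement
    if record["lock"] is None or record["lock"]["type"] == lock_type:
        record["lock"] = {"type": lock_type}
        return True
    return False

def release(record):
    record["lock"] = None
```

With pymongo, `try_acquire` would be a `find_one_and_update` with a filter like `{"name": "pet-aggregates", "lock": None}`, relying on Mongo's document-level atomicity instead of in-process checks.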

deepmerge: deep merge dictionaries, lists and more in Python

Introducing deepmerge. It’s a library designed to provide simple
controls around a merging system for basic Python data structures like dicts and lists.

It provides a few common cases for merging (like always merge + override, or raise an exception):

from deepmerge import always_merger

base = {
    "a": ["b"],
    "c": 1,
    "nested": {
        "nested_dict": "value",
        "nested_list": ["a"]
    }
}

nxt = {
    "new_key": "new_value",
    "nested": {
        "nested_dict": "new_value",
        "nested_list": ["b"],
        "new_nested_key": "value"
    }
}

always_merger.merge(base, nxt)
assert base == {
    "a": ["b"],
    "c": 1,
    "new_key": "new_value",
    "nested": {
        "nested_dict": "new_value",
        "nested_list": ["a", "b"],
        "new_nested_key": "value"
    }
}

deepmerge allows customization as well, for when you want to specify
the merging strategy:

from deepmerge import Merger

my_merger = Merger(
    # pass in a list of tuples, with the
    # strategies you are looking to apply
    # to each type.
    [
        (list, ["prepend"]),
        (dict, ["merge"])
    ],
    # next, choose the fallback strategies,
    # applied to all other types:
    ["override"],
    # finally, choose the strategies in
    # the case where the types conflict:
    ["override"]
)
base = {"foo": ["bar"]}
nxt = {"bar": "baz"}
my_merger.merge(base, nxt)
assert base == {"foo": ["bar"], "bar": "baz"}

For each strategy choice, pass in a list of strings specifying built in strategies,
or a function defining your own:

def merge_sets(merger, path, base, nxt):
    base |= nxt
    return base

def merge_list(merger, path, base, nxt):
    # keep the base list, unless it is empty
    if len(nxt) > 0:
        return base
    return nxt

my_merger = Merger(
    [
        (list, merge_list),
        (dict, "merge"),
        (set, merge_sets)
    ],
    ["override"],
    ["override"]
)

That’s it! Give it a try, and Pull Requests are always encouraged.

The CID Pattern: a strategy to keep your web service code clean

The Problem

Long term maintenance of a web application will, at some point,
require changes. Code grows with the functionality it serves, and
an increase in functionality is inevitable.

It is impossible to foresee what sort of changes are required, but there are
changes that are common and are commonly expensive:

  • changing the back-end datastore of one or more pieces of data
  • adding additional interfaces for a consumer to request or modify data

It is possible to prevent some of these changes with some foresight,
but it is unlikely to prevent all of them. As such, we can try to
encapsulate and limit the impact of these changes on other code bases.

Thus, every time I start on a new project, I practice CID (Consumer-Internal-Datasource).

CID Explained

CID is an acronym for the three layers of abstraction that should be
built out from the beginning of an application. The layers are described as:

  • The consumer level: the interface that your consumers interact with
  • The internal level: the interface that application developers interact with most of the time
  • The datasource level: the interface that handles communication with the database and other APIs

Let’s go into each of these in detail.

Consumer: the user facing side

The consumer level handles translating and verifying the consumer
format into something that makes more sense internally. In the
beginning, this level could be razor thin, as the consumer format
probably matches the internal format completely. However, other
responsibilities that might occur at this layer are:

  • schema validation
  • converting to whatever format the consumer desires, such as json
  • speaking whatever transport protocol is desired, such as HTTP or a Kafka stream

As the application grows, the internal format might change, or a new
API version may need to be introduced, with its own schema. At that
point, it makes sense to split the consumer schema from the internal
schema, ending up with something like:

class PetV1():
    def to_internal(self): ...    # converts PetV1 to the internal representation
    def from_internal(self): ...  # in case you need to return pet objects back as V1

class PetV2():
    def to_internal(self): ...    # converts PetV2 to the internal representation
    def from_internal(self): ...  # in case you need to return pet objects back as V2

class PetInt():
    # the internal representation, used within the internal level
    ...

Datastore: translates internal to datastore

Some of the worst refactorings I’ve encountered are the ones involving
switching datastores. It’s a linear problem: as the database
interactions increase, so do the lines of code needed to perform those
interactions, and each of those lines must be modified when switching
datastores or altering the way they are called.

It’s also difficult to get a read on where the most expensive queries
lie. When your application has free-form queries all over the code,
someone must look at each call, interpret its cost, and ensure
performance is acceptable for the new source.

If any layer should be abstracted, it’s the datastore. Abstracting the
datastore in a client object makes multiple refactors simpler:

  • adding an index and modifying queries to hit that index
  • switching datasources
  • putting the database behind another web service
  • adding timeouts and circuit breakers
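As a sketch, a datastore-level client might look like the following (the class, method, and connection names are hypothetical; the point is that every query lives in exactly one place):

```python
class PetDatastore:
    """The only code in the application allowed to talk to the pet store."""

    def __init__(self, conn):
        # conn could be a psycopg2 connection, a pymongo database, or an
        # HTTP client for another web service: callers never know which
        self._conn = conn

    def get_by_name(self, name):
        # the single home of the "pets by name" query; adding an index
        # or a timeout means revisiting only this method
        return self._conn.query("pets", {"name": name})

    def save(self, pet):
        return self._conn.upsert("pets", pet)
```

Swapping datasources then means writing a new `conn` (or a new datastore class) with the same surface, rather than hunting down queries across the code base.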

Internal: the functional developer side

The consumer and datastore layers abstract away any refactoring that
only affects the way the user interacts with the application, or the
way data is stored. That leaves the final layer to focus on just the
application logic.

The internal layer stitches together the consumer and datastore
layers, and performs whatever other transformations or logic need to
be performed. By abstracting away any modification to the schema that
had to be done on the consumer or datastore side (including keeping
multiple representations for the API), you’re afforded a layer that
deals exclusively with application behavior.

An Example of a CID application

A theoretical organization for a CID application is:

    consumer/
    - HTTPPetV1
    - HTTPPetV2
    - SQSPetV1
    internal/
    # only a single internal representation is needed.
    - Pet
    datasource/
    # showcasing a migration from Postgres to MongoDB
    - PetPostgres
    - PetMongoDB

Example Where CID helps

So I’ve spent a long time discussing the layers and their
responsibilities. If we go through all of this trouble, where does
this actually help?

Adding a new API version

  • add a new API schema
  • convert to internal representation

Modifying the underlying database

  • modify the datasource client.

Complex Internal Representations

If you need to keep some details in a Postgres database, and store
other values within memcache for common queries, this can be
encapsulated in the datasource layer.

All too often the internal representations attempt to deal with this
type of complexity, which makes it much harder to understand the
application code.

Maintaining Multiple API versions

Without clearly separating how an object is structured internally from
how consumers consume it, the details of the consumer leaks into the
internal representation.

For example, while attempting to support two API versions, someone writes
some branched code to get the data they want. This pattern continues
in multiple parts of the code dealing with that data, until it
becomes hard to get a complete understanding of what in V1 is
consumed, and what in V2 is consumed.

Final Thoughts

David Wheeler is quoted as saying:

All problems in computer science can be solved by another level of indirection.

Indirection is handy because it encapsulates: you do not need a
complete understanding of the implementation to move forward.

At the same time, too much indirection causes the inability to
understand the complete effect of a change.

Balance is key, and using CID helps guide indirection where
it could help the most.

KeyError in self._handlers: a journey deep into Tornado’s internals

If you’ve worked with tornado, you may have encountered a traceback of
a somewhat bewildering error:

Traceback (most recent call last):
    File "/usr/local/lib/python2.7/site-packages/tornado/", line 832, in start
fd_obj, handler_func = self._handlers[fd]
KeyError: 16

A few other people have been confused as well. After some digging, and a
combination of learning about the event loop, fork, and epoll, the answer
finally came into focus.


If you’re looking for the solution, don’t call or start IOLoops before
an os.fork. This happens in web servers like gunicorn, as well as
tornado.multiprocess, so be aware of that caveat as well.

But why does this happen?

As I mentioned previously, this is a combination of behaviour all
across the system, python and tornado stack. Let’s start with
learning more about that error specifically.

The code the traceback refers to occurs in the IOLoop:

# tornado/
while self._events:
    fd, events = self._events.popitem()
    fd_obj, handler_func = self._handlers[fd]
    handler_func(fd_obj, events)

What are these variables? You can read the IOLoop code yourself, but effectively:

  • _handlers maps file descriptors to the callbacks that should be called once an async event completes.
  • _events maps file descriptors to the events that have occurred and need to be handled.

What is an FD?

The handlers and events are both keyed off of file descriptors. In a
few words, file descriptors represent a handle to some open file. In
unix, a pattern has propagated where a lot of resources (devices,
cgroups, active/inactive state) are referenced via file descriptors:
it became a lingua franca for low level resources because a lot of
tooling knows how to work with file descriptors, and writing and
reading to a file is simple.

They’re useful for tornado because sockets also have a file descriptor
representing them. So the tornado ioloop can wait for an event
affecting a socket, then pass that socket to a handler when a socket
event fires (e.g. some new data came into the socket buffer).

What modifies the events and handlers?

A KeyError on _handlers means there’s a key in _events that is not in
_handlers: some code is causing events to be added to the ioloop
without registering a handler for them at the same time. So how does
that happen in the code?

A good starting point is looking at where _handlers and _events are
modified in the code. In all of the tornado code, there are only a
couple of places:

# tornado/
def add_handler(self, fd, handler, events):
    fd, obj = self.split_fd(fd)
    self._handlers[fd] = (obj, stack_context.wrap(handler))
    self._impl.register(fd, events | self.ERROR)

# tornado/
def remove_handler(self, fd):
    fd, obj = self.split_fd(fd)
    self._handlers.pop(fd, None)
    self._events.pop(fd, None)
    try:
        self._impl.unregister(fd)
    except Exception:
        gen_log.debug("Error deleting fd from IOLoop", exc_info=True)

Looking at these pieces, the code is pretty solid:

  • handlers are added only in add_handler, where they are also registered with _impl
  • handlers are removed only in remove_handler, which removes them from _events, _handlers and _impl
  • events are added to _events from the results of _impl.poll()

So removing a handler always makes sure that _events no longer has the
fd, and unregisters it from this impl thing too.

But what is impl? Could impl be adding fd’s for events that don’t have handlers?

impl: polling objects

It turns out _impl is chosen based on the OS. There is a little bit of
indirection here: the IOLoop class extends tornado’s Configurable
object, which selects the implementation class via the method
configurable_default:

# tornado/
def configurable_default(cls):
    if hasattr(select, "epoll"):
        from tornado.platform.epoll import EPollIOLoop
        return EPollIOLoop
    if hasattr(select, "kqueue"):
        # Python 2.6+ on BSD or Mac
        from tornado.platform.kqueue import KQueueIOLoop
        return KQueueIOLoop
    from tornado.platform.select import SelectIOLoop
    return SelectIOLoop

And each of these loop implementations passes its own polling object into the impl argument:

class EPollIOLoop(PollIOLoop):
    def initialize(self, **kwargs):
        super(EPollIOLoop, self).initialize(impl=select.epoll(), **kwargs)

Looking at select.epoll, it follows the interface of a polling object: a
class in the Python standard library that has the ability to poll for
changes to file descriptors. If something happens to a file descriptor
(e.g. a socket receiving data), the polling object returns
the file descriptors that were triggered.

Different architectures have different polling objects
implemented. The available ones in tornado by default are:

  • epoll (Linux)
  • kqueue (OSX / BSD)
  • select (Windows)

In our case, this was happening on Linux, so we’ll look at epoll.


So what is epoll? It’s documented in the Python standard library, but
it’s a wrapper around the epoll Linux system calls.

The ioloop code actually looks like:

  • wait for epoll to return a file descriptor that has an event
  • execute the handler (which will presumably register another handler if another step is required, or not if it’s complete)
  • repeat.

epoll has two different triggering modes, and the one tornado uses is
edge-triggered: it only fires when a CHANGE occurs, versus while a
specific level holds. In other words, it will only trigger when new
data is available: if the user decides to do nothing with the data,
epoll will not trigger again.

epoll works by registering file descriptors for the epoll object to
listen to. You can also stop listening to file descriptors as well.
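The register/poll/unregister cycle is easy to see with the standard library directly (Linux only, since epoll is a Linux system call; a socketpair stands in for a client connection):

```python
import select
import socket

r, w = socket.socketpair()
rfd = r.fileno()

epoll = select.epoll()               # the same object tornado uses as impl
epoll.register(rfd, select.EPOLLIN)  # analogous to IOLoop.add_handler

w.send(b"ping")                      # new data arrives in the socket buffer
events = epoll.poll(timeout=1)       # returns a list of (fd, event_mask)

epoll.unregister(rfd)                # analogous to IOLoop.remove_handler
epoll.close()
r.close()
w.close()
```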

So epoll works great for an event loop. But is it possible to somehow
register file descriptors to the epoll/impl object without using the
method above?

epoll and os.fork

It isn’t possible to register things outside of the impl
object. But os.fork can cause some weird behaviour here. The way
one interfaces with epoll is via file descriptors: you have an
fd to the epoll object, and you use Linux system calls to work
with it.

As mentioned previously, file descriptors are a common way to reference
some object when using Linux kernel system calls.

Another common system call is fork. The
documentation of fork specifies that fork is equivalent to:

  • copying the memory of the current process to a new space
  • spawning a new process that uses the new copy.

This is fine for most objects in memory, but what about file
descriptors, which reference some object outside of the memory space
of the current process?

In the case of file descriptors, the file descriptor is also cloned to
the new fork. In other words, both the parent and the child process
will have a reference to the same file descriptor.

So, what does this mean for epoll, which is just another file
descriptor under the hood? Well, you can probably guess.

It gets shared.
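A pipe makes this easy to observe: after os.fork, the parent and the child hold references to the very same underlying descriptors.

```python
import os

read_fd, write_fd = os.pipe()   # two plain file descriptors

pid = os.fork()
if pid == 0:
    # child: writes to the descriptor it inherited from the parent
    os.write(write_fd, b"hello from the child")
    os._exit(0)

os.waitpid(pid, 0)
data = os.read(read_fd, 1024)   # the parent reads what the child wrote
os.close(read_fd)
os.close(write_fd)
```

Replace the pipe with an epoll object and you have exactly the tornado situation: one kernel object, visible from two processes.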

How the bug works

So this is the crux of the issue. When an os.fork occurs, the parent
and the child share the SAME epoll. For an IOLoop created by the
parent process, the child process uses the same epoll as well!

So, that allows a condition like this:

  1. parent creates an IOLoop loop_1, with an epoll epoll_1
  2. parent calls os.fork; the child’s copy of loop_1 shares the same epoll_1
  3. parent starts the ioloop, and waits on epoll_1.poll()
  4. child adds a handler for fd_2 to epoll_1
  5. parent gets back fd_2, doesn’t have a handler for it, and raises the KeyError.

This will happen eventually any time a new ioloop is not created for a child process.

Here’s a repro script. I couldn’t figure out a good way to kill this
gracefully, so be warned this will need to be killed externally.

import logging
import select
import socket
import os
import time
import tornado.ioloop
import tornado.httpclient
import tornado.web

serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
serversocket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
serversocket.bind(('', 8080))
serversocket.listen(1)

# the ioloop (and its epoll fd) is created BEFORE the fork
loop = tornado.ioloop.IOLoop.current()

if os.fork():
    # parent: register a handler on the shared epoll, then trigger an
    # event by connecting to the socket
    handler = lambda *args, **kwargs: None
    loop.add_handler(serversocket.fileno(), handler, select.EPOLLIN)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(('', 8080))
    time.sleep(60)
else:
    # child: starts the inherited ioloop, receives the event for an fd
    # it has no handler for, and raises the KeyError
    loop.start()

How about gunicorn or tornado.multiprocess?

So how to avoid this in gunicorn or tornado.multiprocess, which uses
an os.fork? The best practice is to not start the ioloop until AFTER
the fork: calling ioloop.Instance() or current() will create an ioloop whose ioloop will be shared
by any child ioloop, without explicitly clearing it.

Gunicorn calls a fork as it’s spawning a worker:

# gunicorn/
def spawn_worker(self):
    self.worker_age += 1
    worker = self.worker_class(self.worker_age, self.pid, self.LISTENERS,
                               self.app, self.timeout / 2.0,
                               self.cfg, self.log)
    self.cfg.pre_fork(self, worker)
    pid = os.fork()
    if pid != 0:
        self.WORKERS[pid] = worker
        return pid


Tornado is an awesome framework, but it’s not simple. However, thanks
to well documented pieces, it’s possible to diagnose even complex
issues like this, and do a bit of learning along the way.

Also, os.fork is not a complete guarantee that you’ll get a unique
instance of every object you use. Beware file descriptors.

Introducing transmute-core: quickly create documented, input validating APIs for any web framework

A majority of my career has been spent on building web services in
Python. Specifically, internal ones that have minimal or no UIs, and
speak REST (or
at least are rest-ish).

With each new service, I found myself re-implementing work to
make user-friendly REST APIs:

  • validation of incoming data, and descriptive errors when a field does not
    match the type or is otherwise invalid.
  • documenting said schema, providing UIs or wiki pages allowing users to
    understand what the API provides.
  • handling serialization to and from multiple content types (json, yaml)

This is maddening work to do over and over again, and details are
often missed: sometimes yaml is not supported for a particular API, or
there is a specific field that is not validated. Someone will ask about
an API that you changed, and you forgot to document the new parameter.
It’s hard to scale API maintenance when you keep forgetting some minute
piece of boilerplate.

This was further exacerbated by using different web frameworks for
different projects. Every framework provides its own REST plugin or
library, and often there’s a lack of functional parity, or declaring
an API is completely different and requires learning multiple
approaches.
So with this monumental pain, what if I told you can get an API that:

  • validates incoming data types
  • supports multiple content types
  • has a fully documented UI

Just by writing a vanilla Python function? And what if I told you
this can work for YOUR Python framework of choice in 100 statements
of Python code?

Well, that’s what the transmute framework is.

How it works

transmute-core is
a library that provides tools to quickly implement rest APIs. It’s
designed to be consumed indirectly, through a thin layer that adapts
it to the style of the individual framework.

HTTP Endpoints

Here is an example of a GET endpoint in flask:

import flask_transmute

# flask-like decorator.
@flask_transmute.route(app, paths='/multiply')
# tell transmute what types are, which ensures validations
@flask_transmute.annotate({"left": int, "right": int, "return": int})
# the function is a vanilla Python function
def multiply(left, right):
    return left * right

And one in aiohttp, the web framework that uses Python 3’s asyncio:

import aiohttp_transmute

# tell transmute what types are, which ensures validations
# Python3.5+ supports annotations natively
# request is provided by aiohttp.
def multiply(request, left: int, right: int) -> int:
    return left * right

aiohttp_transmute.route(app, multiply)

Both do the following:

  • generate a valid route in the target framework
  • detect the content type (yaml or json), and parse the body
  • verify that input parameters match the types specified, returning a 400
    status code and details if not
  • write back yaml or json, depending on the content type

Note that we don’t have to deal with the content type serialization,
read from request objects, or returning a valid response object:
that’s all handled by transmute. This keeps the functions cleaner in
general: it looks similar to any other Python function.

Complex Schemas via Schematic (or any validation framework)

Primitive types in the parameters are OK, but it’s often true that
more complex types are desired.

Schema declaration and validation has multiple solutions already, so
transmute defers this to other libraries. By default, transmute uses
schematics:
from schematics.models import Model
from schematics.types import StringType, IntType

class Card(Model):
    name = StringType()
    price = IntType()

# passing in a schematics model as the type enables
# validation and creation of the object when converted
# to an API.
@annotate({"card": Card})
def submit_card(card):
    ...

Of course, some may prefer other solutions, like marshmallow. In that
case, transmute-core provides a TransmuteContext for users to customize
and use their own implementation of transmute’s serializers:

from transmute_core import TransmuteContext, default_context

context = TransmuteContext(serializers=MySerializer())

route(app, fn, context=context)

# alternatively, you could modify the default context directly
# (be careful about where this code is called: it needs
# to happen before any routes are constructed)
default_context.serializers = MySerializer()

Documentation via Swagger

Swagger / OpenAPI allows one to define a REST API using json. Transmute generates
swagger json files based on the transmute routes added to an app, and transmute-core provides the static CSS and JavaScript
files required to render a nice documentation interface for it:

from flask_transmute import add_swagger

# reads all the transmute routes that have been added, extracts their
# swagger definitions, and generates a swagger json and an HTML page that renders it.
add_swagger(app, "/swagger.json", "/swagger")

This also means clients can be auto-generated as well: swagger has a
large number of open source projects dedicated to parsing and
generating swagger clients. However, I haven’t explored this too
deeply.
Lightweight Framework Implementations

Earlier in this post, it is mentioned that there should be a wrapper
around transmute-core for your framework, as the style of adding
routes and extracting values from requests may vary.

A goal of transmute was to make the framework-specific code as thin as
possible: this allows more re-use and common behavior across the
frameworks, enabling developers across frameworks to improve
functionality for everyone.

Two reference implementations exist, and they are very thin. As of this writing, they are at:

  • flask-transmute: 166 lines of code, 80 statements
  • aiohttp-transmute: 218 lines of code, 103 statements (a little bloated to support legacy APIs)

A one-page example for flask integration is also provided, to
illustrate what is required to create a new one. That’s 200 LOC with
comments, a little more than 100 without.


Frameworks are always a means to an end: it’s about reducing the
effort between what you want to build and actually building it.

I love great, well designed APIs. And dealing with the minutiae of
some detail I missed in boilerplate content type handling or object
serialization was draining the enjoyment out of authoring them. Since
I’ve started using transmute for all of my projects, it’s let me focus
on what I care about most: actually writing the functional code, and
designing the great interfaces that let people use them. For the most part,
it feels like just writing another function in Python.

The auto-documentation is freeing from both sides: as an author I can
keep my documentation in line with my implementation, because my
implementation is the source. For consumers, they’re immediately
provided with a simple UI where they can rapidly iterate with the API
call they would like to make.

It’s also great knowing I can use transmute in the next framework,
whatever that may be: I can take all the work and behavior that’s
embedded in transmute, with a module or two’s worth of code.


Give it a shot! Issues and PRs are welcome, and I’d love to see someone
apply transmute to another framework.

Global logging with flask

As of December 2016, Flask has a built-in
logger that it instantiates for you. Unfortunately, this misses the
errors and other log messages emitted by the other libraries your
application may also be using.
It would be nice to have a single logger, one that captures BOTH
library AND app logs. For those who want a global logger, this may
take a few concepts to get right. You have to:

  1. undo flask’s logging
  2. set up your own logging
  3. set log levels, as the default may not suit you.

Combined, this ends up looking like:

import logging
import sys
from flask import Flask, current_app

LOG = logging.getLogger("my_log")
LOG2 = logging.getLogger(__name__ + ".toheunateh")
app = Flask(__name__)

@app.route("/")
def route():
    current_app.logger.info("flask logger: foo")
    LOG.info("log: foo")
    LOG2.info("log2: foo")
    return "hello!"

# create your own custom handler and formatter.
# you can also use logging.basicConfig() to get
# the python default.
out_hdlr = logging.StreamHandler(sys.stdout)
fmt = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
out_hdlr.setFormatter(fmt)
# append the handler to the global (root) logger,
# and set a level, as the default may not suit you.
logging.getLogger().addHandler(out_hdlr)
logging.getLogger().setLevel(logging.INFO)
# removing flask's handler and
# re-adding propagation ensures that
# the root handler gets the messages again.
app.logger.handlers = []
app.logger.propagate = True

And you get the right messages. Voila!