Goodbye Pelican, Hello WordPress!

First of all, sorry to all of those who came here through Google and were redirected to the homepage. I tried my best to preserve URLs, but I couldn’t figure out a great way to do that.

For recurring readers, you may have noticed the site has changed. That’s because this blog is now powered by WordPress!

I’m generally not a fan of heavy-handed systems, but the user experience eventually convinced me this was the right route. I’m now using WordPress, and even the paid edition.

Why I chose WordPress

WordPress as a platform provides a lot of tools to simplify the blog authoring experience. With Pelican, my blog writing experience was the following:

  1. Create a new file in reStructuredText, and add some boilerplate.
  2. Add images by copying the image to the images/ directory, then adding the image link by hand into the file.
  3. Re-render the post over and over again.
  4. Call the execute script, which handles publishing the files to GitHub.

The disadvantages of the platform were:

  1. Iteration was slow, especially quickly adding and manipulating images.
  2. The experience was desktop-only, and Git-based to boot, so I had to have enough time to clone (or pull or push) a git repository and fire up a text editor. Not great for just jotting down a quick note.

WordPress reduces this whole process, and supports both mobile and desktop:

  1. Create a new post in the UI.
  2. Add images by just selecting the file; I can do basic modifications like crop and rotate directly in WordPress.
  3. Click “publish”.

Overall, the reduced friction has let me write posts more frequently, as well as use it as a place for notes in the meantime.

There are also other benefits:

  • several themes available, so I can restyle quickly
  • mobile app
  • SEO friendly

And probably more features I have yet to discover.

So, welcome to my new WordPress blog!

Crafting pelican-export in 6 hours.

Over the past two or three days, I spent some deep work time writing pelican-export, a tool to export posts from the Pelican static site generator to WordPress (with some easy hooks to add more targets). Overall I was happy with the project, not only because it was successful, but because I was able to get to something complete in a pretty short period of time: six hours. Reflecting, I owe this to the techniques I’ve learned for prototyping quickly.

Here’s a timeline of how I iterated, with some analysis.

[20 minutes] Finding Prior Art

Before I start any project, I try to at least do a few quick web searches to see if what I want already exists. Searching for “pelican to wordpress” pulled up this blog post:

Which pointed at a git repo:

Fantastic! Something exists that I can use. Even if it doesn’t work off the bat, I can probably fix it, use it, and be on my way.

[60m] Trying to use pelican-to-wordpress

I started by cloning the repo and looking through the code. From here I got some great ideas to quickly build this integration (e.g. discovering the python-wordpress-xmlrpc library). Unfortunately the code only supported Markdown (my posts are in reStructuredText), and there were a few things I wasn’t a fan of (constants, including a password, in a file), so I decided to start doing some light refactoring.

I started organizing things into a package structure, and tried to use the Pelican Python package itself to do things like read the file contents (saving me the need to parse the text myself). While looking for those docs, I stumbled upon some issues in the Pelican repository suggesting that, for exporting, one would want to write a plugin:

At this point, I decided to explore plugins.

[60m] Scaffolding and plugin structure.

Looking through the plugin docs, writing a plugin seemed much easier than trying to read in the Pelican posts myself. I had limited success with instantiating a Pelican reader object directly, as it expects specific configuration variables.

So I started authoring a real package. Copying in package scaffolding from another repo, I added the minimum integration needed to actually install the plugin into Pelican and run it.

[60m] Rapid iteration with pdb.

At that point, I added a pdb statement into the integration so I could quickly look at the data structures. Using that, I crafted the code to migrate post formats in a few minutes:

    def process_post(self, content) -> Optional[WordPressPost]:
        """Create a WordPress post based on Pelican content."""
        if content.status == "draft":
            return None
        post = WordPressPost()
        post.title = content.title
        post.slug = content.slug
        post.content = content.content
        # this conversion is required, as pelican uses a SafeDatetime
        # that python-wordpress-xmlrpc doesn't recognize as a valid date.
        post.date = datetime.fromisoformat(content.date.isoformat())
        post.term_names = {
            "category": [],
        }
        if hasattr(content, "tags"):
            post.term_names["post_tag"] = [tag.name for tag in content.tags]
        return post

I added a similar pdb statement to the “finalized” Pelican signal, and tested the client with hard-coded values. I was done as far as functionality was concerned!

[180m] Code cleanup and publishing

The bulk of my time after that was just smaller cleanup that I wanted to do from a code hygiene standpoint. Things like:

  • [70m] making the WordPress integration an interface, so it’s easy to hook in other exporters.
  • [40m] adding a configuration pattern to enable hooking in other exporters.
  • [10m] renaming the repo to its final name of pelican-export
  • [30m] adding readme and documentation.
  • [30m] publishing the package to pypi.

This was half of my time! It’s interesting how much time is spent just ensuring the right structure and practices for the long term.


I took every shortcut in my book to arrive at something functional as quickly as I could. Techniques that saved me tons of time were:

  • Looking for prior art. Brainstorming how to do the work myself would have meant investigating potential avenues and evaluating how long each would take. Having an existing example, even if it didn’t work for me, helped me ramp up on the problem quickly.
  • Throwing code away. I had a significant amount of modified code in my forked exporter. But continuing down that route would have required a significant investment in hacking on and understanding the Pelican library. Seeing that the plugin route existed, and testing it out, saved me several hours of trying to hack an interface to private Pelican APIs.
  • Using pdb to live-write code. In Python especially, there’s no replacement for just introspecting and trying things. Authoring just enough code to integrate as a plugin gave me a fast feedback loop, and throwing in a pdb statement to quickly learn the data structures helped me find the ideal structure in about 10 minutes.

There was also a fair bit of Python expertise that drove down the coding time, but what’s interesting is that the biggest contributors to time savings were process: knowing the tricks for taking the right code approach, and iterating quickly, helped me get this done in effectively a single work day.

Tech Notes: Debugging LLVM + Rust

I’m working on a programming language, writing the compiler in Rust. I’m stuck at the moment on a segfault that occurs with the following IR (generated by my compiler):

; ModuleID = 'main'
source_filename = "main"

define void @main() {
  %result = call i64 @fib(i64 1)
}

define i64 @fib(i64) {
  %alloca = alloca i64
  store i64 %0, i64* %alloca
  %load = load i64, i64* %alloca
  switch i64 %load, label %switchcomplete [
    i64 0, label %case
    i64 1, label %case1
  ]

switchcomplete:                                   ; preds = %case1, %entry, %case
  %load2 = load i64, i64* %alloca
  %binop = sub i64 %load2, 1
  %result = call i64 @fib(i64 %binop)
  %load3 = load i64, i64* %alloca
  %binop4 = sub i64 %load3, 2
  %result5 = call i64 @fib(i64 %binop4)
  %binop6 = add i64 %result, %result5
  ret i64 %binop6

case:                                             ; preds = %entry
  ret i64 0
  br label %switchcomplete

case1:                                            ; preds = %entry
  ret i64 1
  br label %switchcomplete
}

This segfaults whenever I run my compiler, which currently compiles the code and immediately executes it in LLVM’s MCJIT.


When I run my code under the debugger, I find that the segfault doesn’t occur at the same point (at least) as it does when I run my app on the command line.

VS Code’s debugger returns:

so something is happening during the FPPassManager. Apparently the FPPassManager is what handles generating code for functions (gleaned from reading the source code).

getNumSuccessors was a bit nebulous for me: what does this function actually do? I wasn’t familiar with the term “successor”; it must be something specific to LLVM. Some Googling finds:

So I guess a successor refers to a statement that immediately follows the existing statement. Core.h in LLVM specifies that the getNumSuccessors calls are for terminators. So what precisely is a terminator?

Looking through the LLVM source code again, a terminator is the classification for instructions that terminate a BasicBlock. The list from LLVM 9 looks like:

  /* Terminator Instructions */
  LLVMRet            = 1,
  LLVMBr             = 2,
  LLVMSwitch         = 3,
  LLVMIndirectBr     = 4,
  LLVMInvoke         = 5,
  /* removed 6 due to API changes */

Looking at the traceback, the crash is specifically occurring in updatePostDominatedByUnreachable. The source code for that is:

/// Add \p BB to PostDominatedByUnreachable set if applicable.
void BranchProbabilityInfo::updatePostDominatedByUnreachable(const BasicBlock *BB) {
  const Instruction *TI = BB->getTerminator();
  if (TI->getNumSuccessors() == 0) {
    if (isa<UnreachableInst>(TI) ||
        // If this block is terminated by a call to
        // @llvm.experimental.deoptimize then treat it like an unreachable since
        // the @llvm.experimental.deoptimize call is expected to practically
        // never execute.

The actual error occurs on the first assembly instruction of the function:

; id = {0x00012806}, range = [0x000000000093fbb0-0x000000000093fc3b), name="llvm::TerminatorInst::getNumSuccessors() const", mangled="_ZNK4llvm14TerminatorInst16getNumSuccessorsEv"
; Source location: unknown
555555E93BB0: 0F B6 47 10                movzbl 0x10(%rdi), %eax
555555E93BB4: 48 8D 15 81 3B D5 01       leaq   0x1d53b81(%rip), %rdx
555555E93BBB: 83 E8 18                   subl   $0x

I can’t read assembler very well. But since this is a method, the first instruction most likely has to do with loading the current object from memory. Most likely, then, getNumSuccessors is receiving a pointer to something it doesn’t expect: probably a null pointer dereference.

My hunch now is I have a basic block without a terminator statement, causing the JIT pass to fail.

There was a missing return statement on the main function. Adding that didn’t change anything.

Fixing the blocks to have only single terminators did indeed fix the issue! Ultimately, figuring out that a validator existed, and heeding its error messages, led to the solution.

Tech Notes: Updating Unity for Cerebrawl

I’m interested in starting a habit of note taking while I take on some pretty difficult tasks, maybe as a learning experience for myself or others if they find it valuable.

Today, I’ll be tackling Updating Cerebrawl’s Unity from 5.6 to 2018.3.

This is actually pretty late in the journey: I’ve got a branch of 2018.3 working, I just need to figure out how to reconcile that with the month-and-a-half’s worth of changes that were made in the meantime.

My upgrade path thus far has been a combination of the following tools:

  • vscode, when I need to go look at live code
  • sourcetree, when I need to do some fine-grained change picking
  • Unity, to see if things run.

Errors Again

Pulling up my branch again, there are errors around the lack of a TMPro namespace. It seems that TextMeshProUGUI doesn’t exist for TextMeshPro 1.3. Something to look into later, but for now commenting that out should be fine.

Next, I ran into a duplicate tk2dSkin.dll. It looks like that now goes in the “tk2d” directory rather than “TK2DROOT”, so I just deleted the old one.

Cherry-Picking the New Changes

We had to revert the 2018 Unity changes previously. Last time I tried to merge in the master branch (I use git-svn, so it’s effectively the SVN tree), I think git got confused because I had reverted a bunch of the changes I had made, breaking everything and requiring me to apply those changes again.

This time, I should only pull in the changes made after that point. I created another branch to keep my working changes from being broken and lost in history when I merge in other changes.

I can use git cherry-pick to specifically pick up diffs in that version range:

git cherry-pick b813563..5646829

Ran into multiple errors cherry-picking. The resolution was to pick up the incoming changes again and again (these are Unity asset files, so not ones I needed to touch for the update).

Once those were done, I switched back to the Unity editor and let it load again.

It Works!

Huzzah! For the most part everything has migrated over. The biggest challenge on this one was upgrading tk2d toolkit, which was broken by newer Unity versions.

Merging Changes In

I hit another snag trying to merge files in. git-svn attempted to rebase my changes on top of the existing branch, which doesn’t work very well, as it tries to merge diffs again.

My best hope is to basically construct a changeset that is all of the changes I made on what’s in SVN today. To do so I run:

git svn fetch
git checkout master
git reset --hard git-svn 
git clean -xdf
git checkout feature/merge-unity-2018
git reset --soft master
git commit

Finally, a git svn dcommit, and all the changes have been made!

From Emacs to Atom

I want to start this post by stating I have nothing but respect, admiration, and love for the Emacs community. Emacs’ extensibility, community packages, and its choice to effectively be an editor built on a Lisp VM are amazing, and anyone choosing Emacs as an editor is investing in something that can grow with their needs.

Nevertheless, there are compelling reasons to switch to Atom. I am saying goodbye to Emacs, and have started using Atom as my main text editor.

My History with Emacs

I learned about Emacs during my college years (roughly 2008), when I happened to attend a house party for a family friend. The friend was a software developer who retired many years ago, but upon learning of my interest in software, he began to regale me with story after story of how much he does in Emacs.

“I check my e-mail with Emacs.”

“I built a program that opens my garage door from Emacs.”

“I share the same editor across multiple computers using the remote Emacs client.”

In all honesty, I wasn’t really impressed by the idea of a program you use to literally do everything, but it seemed like a great kernel for a text editor. I was using vim at the time, and what I always lamented (which I know others would call vim’s greatest strength) was the fact that it could only be used to modify text. When I write code, I do so much more than write the code itself. I wanted an environment that made executing additional tasks seamless:

  • Interact with version control (git push / pull, commit, add)
  • Code search
  • Running command line scripts
  • Running a REPL and unit tests

Thus I dove into Emacs. The built-in terminal emulator, and the ability to build whole programs in a single .el file and load them up, adding significant functionality to the editor, meant a lot of my needs were met quickly and without significant effort. This became even easier with the release of Emacs 24, which included a built-in library to retrieve and install third-party packages, reducing the need to copy and paste.

I continued pretty happily for several years. I made a couple videos showcasing my Emacs setup, published my dotfiles, and wrote some tutorials as well.

Enter Atom

In 2015, Atom was released: an editor inspired by the flexibility of Emacs, but built on HTML5 technologies. Using web technologies to build massive native applications is not an advantage in every respect, but there are strong wins in some important areas.

A Powerful UI Framework

HTML, CSS, and JavaScript have all grown to support the massive range of uses that websites are now put to, and the result of tackling such breadth is a powerful, general system for laying out windows and styling them appropriately. Combine that with highly optimized runtimes to render said windows (web browsers), and you have a system that is not just developer-friendly but also user-friendly.

Large Pool of Experienced Developers

A large portion of software engineers are web developers, and thus work in web technologies. The ability to transfer even some of this expertise as one is extending their editor removes a large chunk of the learning curve.

Impressed, But Not Sold

Atom was conceptually the editor I always wanted: the power of Emacs, a flexible UI framework, and a core built on technologies that I knew well and could contribute to. If I had been starting from scratch, I might have chosen Atom, even when it had just come out of beta.

However, I wasn’t starting from scratch. There were years of expertise in elisp: finding the right packages, learning to use them, and familiarizing myself with keybindings and the Emacs way of doing things. It didn’t make sense to throw those out the window for a nascent editor.

The Catalyst: Atom IDE and Language Server Integration

Since Atom came out, another text editor has entered the scene: VSCode. Similar in design to Atom, VSCode took a more opinionated approach to how an editor should be organized and what tools to use (in the vein of Visual Studio). The more open world of Atom wasn’t a first priority (for example, VSCode did not support more than three text windows at a time until recently).

However, VSCode did directly lead to the creation of the Language Server Protocol, which enables any text editor to take advantage of IDE-like features, as long as it builds an interface to a JSON-RPC based API.

When Atom implemented its language client, it was impressive, and it made me want to try Atom. But making the switch would require porting all of my existing tools to find equivalents, and most likely learning a new set of keybindings. I already had all of that in Emacs. However, there was a final factor that really made me switch.

Community Critical Mass

For almost any tool or program, you’ll find one that is better in almost every significant way, yet has not taken off. As much as we’d like to believe software engineering is a purely merit-based field, the reality is that it depends on socio-economic factors as much as every other discipline. Market and mind-share matter.

The most impressive part of the Language Server Protocol is not that it was built, it’s who built it. Facebook was a major contributor, teaming up with GitHub to build a real IDE experience for Atom.

Facebook’s business practices aside, they have a giant and talented engineering base. With Facebook engineers supporting a plugin like Atom IDE, there’s a strong chance that integration will be improved and supported for years to come. And Atom is also a blessed project at GitHub.

I love Emacs, but it’s primarily supported by volunteers who have other full-time jobs. It’s very difficult to get such a group of developers to implement something like language server support, then maintain it and contribute back for years to come.

And the active community around Atom is larger. As of October 2018, here are the counts of packages on the major package repositories per editor:

Unfortunately, Emacs does not have the development community in the same way Atom and VSCode can. That’s a conversation worth diving into, but it doesn’t change the state of the world today.

Migrating to Atom

So, I migrated my Emacs setup to Atom. Since I was a relatively late adopter, a majority of my desired features were already part of the editor, or were available as an extension.

I don’t think it’s valuable to dive into exactly what my setup looks like, but if you’d like to learn more, you can check out an Atom plugin I’m working on:

I am now using Atom 100% of the time, and I haven’t opened Emacs in about a year. The migration process took a couple of weeks.

The Future

Today, I have a lot invested in Atom, and I like my experience. Language server integration addressed a missing pain point, and that ecosystem (along with Atom’s integrations with it) is getting better every day.

The biggest loss I faced with Atom was performance: due to its reliance on a browser-based renderer, performance suffers versus draw calls in a native GUI. There are also improvements that could be made to Atom to ensure more UI actions are non-blocking.

The Atom team has been working on xray, a text editor designed for performance, whose improvements will be incorporated back into the editor.

VSCode has also been a lot better on the performance front than Atom (though still orders of magnitude slower than native editors). I tried it out recently and found the performance gain imperceptible for me, so it’s probably not worth losing my extension and keybinding knowledge.




The Why of Disp Pt. 1: The Syntax

Over the past few weeks, I’ve spent some intensive time on Disp, a programming language that looks syntactically like Lisp, with the goal of making large codebases easier to manage.

There are a lot of ideas that went into its design, so I want to lay them out in a series here. I’m looking for feedback, so don’t hesitate to reply if you disagree or have ideas. Also, please check out the RFCs and leave some thoughts.

This first post in the series is about the choice of Lisp + indentation. Specifically, Disp syntax looks something like:

(No highlighting, unfortunately; Disp is its own fun syntax.)

It’s very Lisp-like: you’ll see the standard parens, which represent a function invocation. But you’ll also see that some parens are missing, with indentation in their place.

There are two extra rules to manage these syntactical changes:

  • every newline is considered an implicit expression
  • indentation means that you are providing a list to the previous, less indented statement.

It means the following two forms are identical to the parser:

To help reduce the parentheses, every statement on a new line is considered to be an expression, and a list argument can be represented with an indented block.
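As a toy illustration of these two rules, here is a hypothetical sketch in Python (not Disp's actual parser) that mechanically desugars the indentation-based form into the fully parenthesized one:

```python
def indent_of(line):
    """Depth of a line, measured in leading tabs."""
    return len(line) - len(line.lstrip("\t"))

def parse_block(lines):
    """Each line becomes an implicit expression; lines indented one
    level deeper form a list handed to the line above them."""
    exprs, i = [], 0
    while i < len(lines):
        ind, head = indent_of(lines[i]), lines[i].strip()
        # collect the indented block belonging to this line
        j = i + 1
        block = []
        while j < len(lines) and indent_of(lines[j]) > ind:
            block.append(lines[j])
            j += 1
        exprs.append((head, parse_block(block)))
        i = j
    return exprs

def render(node):
    """Write a parsed expression back out with explicit parentheses."""
    head, children = node
    if children:
        return "({} ({}))".format(head, " ".join(render(c) for c in children))
    return "({})".format(head)

source = ["if condition", "\tprint a", "\tprint b"]
rendered = render(parse_block(source)[0])
# the indented form desugars to: (if condition ((print a) (print b)))
```

A line with no indented block under it simply becomes a parenthesized expression on its own, which is the first rule in isolation.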

There were a couple reasons for this choice:


A major complaint with Lisp is the number of parentheses required, making it hard to see where parentheses begin and where they end. Lisp experts say you get used to it, and that it’s not a major issue in the long run. There are also plugins for many editors that color-match the parentheses, which helps as well.

However, considering the language will be responsible for parsing anyway, it seemed intuitive to remove unneeded symbols as long as the purpose remains clear to a reader. The easiest removal was the parentheses surrounding a newline statement: at that point, the syntax looks very similar to a non-Lisp programming language:

The indentation rules enable an almost Python-like syntax: in many cases a block is represented by a list of statements or expressions, so allowing one to enumerate them by indentation results in a similar level of readability, as far as expressions go:

Semantic Indentation for Consistency

Many languages are moving to the single-idiomatic-formatting paradigm, which does a great job of quelling style discussions that ultimately have minor benefits for the reader but consume a large amount of time. I think this is a must for a language for large organizations, and Disp continues in that vein.

Adding semantic meaning to indentation is almost a self-fulfilling prophecy: by adding semantic meaning, style becomes more consistent, and it’s possible to add semantic meaning to indentation because style is consistent. It would be a waste not to use it semantically.

Indentation is also used to improve readability and denote blocks of code in most styleguides, so it’s not a far stretch to use it as such.

Tabs instead of Spaces for Indentation

I’m sure this choice is a bit more controversial, but Disp uses tabs for indentation instead of spaces (unfortunately, many code snippets here use spaces, because it’s hard to type tabs in a browser).

There are many different flavors of indentation to choose from. Python, for example, is extremely lenient and allows a mix of both. In the name of consistency and simplified parsing, it made sense to choose a single one.

Tabs were chosen to allow developers to modify the tab-width settings in their IDE, choosing the spacing that is most legible to them.


Thanks for reading! I’m looking for any help to improve the readability or remove unneeded syntax. There’s more in this series coming up, so stay tuned.





Using Rust functions in LLVM’s JIT

LLVM is an amazing framework for building high-performance programming languages, and Rust has some great bindings with llvm-sys. One challenge was getting functions authored in Rust exposed to LLVM. To make this happen, there are a few steps to walk through.

1. Exposing the Rust functions as C externs

When LLVM interfaces with shared libraries, it uses the C ABI to do so. Rust provides a way to do this out of the box, using the `extern "C"` declaration:

pub extern "C" fn foo() {}

This instructs the Rust compiler that the function should be exposed in a way where it can be found and used as a library. This applies even when building an executable binary.

The big gotcha here is ensuring that you declare the function as public, AND that it is public in the main module too. If the function is located in a child module, you will need to re-export it in the main file:

// src/

mod my_mod {
    pub extern "C" fn foo() {
        println!("I'm a shared library call");
    }
}

// note the pub here.
pub use self::my_mod::foo;

Aiohttp vs Multithreaded Flask for High I/O Applications

Over the past year, my team has been making the transition from Flask to
aiohttp. We’re making this
transition because of the situations where non-blocking I/O
theoretically scales better:

  • large numbers of simultaneous connections
  • remote http requests with long response times
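To make the second point concrete, here is a toy stdlib-only sketch (simulated latency stands in for the real upstream http call; this is not the benchmark itself) of why many concurrent, slow requests favor an event loop over a bounded thread pool:

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

SIMULATED_LATENCY = 0.05  # stand-in for the slow upstream http call
N_REQUESTS = 200

def blocking_request():
    time.sleep(SIMULATED_LATENCY)  # the thread is pinned while waiting
    return "ok"

async def nonblocking_request():
    await asyncio.sleep(SIMULATED_LATENCY)  # the loop runs others meanwhile
    return "ok"

def run_threaded(n, workers):
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda _: blocking_request(), range(n)))
    return results, time.perf_counter() - start

def run_async(n):
    async def gather_all():
        return await asyncio.gather(*(nonblocking_request() for _ in range(n)))
    start = time.perf_counter()
    results = asyncio.run(gather_all())
    return list(results), time.perf_counter() - start

threaded_results, threaded_time = run_threaded(N_REQUESTS, workers=10)
async_results, async_time = run_async(N_REQUESTS)
# 200 waits of 0.05s serialized 20-deep across 10 threads takes ~1s;
# asyncio overlaps all 200 waits and finishes in roughly one latency.
```

The thread pool can only wait on `workers` requests at once, while the event loop overlaps every wait; this is the "long response time" advantage the bullets describe.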

There is agreement that asyncio scales better memory-wise: a green thread
in Python consumes less memory than a system thread.

However, performance for latency and load is a bit more contentious. The best way to find
out is to run a practical experiment.

To find out, I forked py-frameworks-benchmark, and designed an experiment.

The Experiment

The conditions of the web application, and the work performed, are identical:

  • a route on a web server that: 1. performs an http request to an nginx server returning back html, and 2. returns the response as json
  • a wrk benchmark run, with 400 concurrent requests for 20 seconds
  • running under gunicorn, with two worker processes.
  • python3.6

The Variants

The variants are:

  • aiohttp
  • flask + meinheld
  • flask + gevent
  • flask + multithreading, varying from 10 to 1000.


variant          min     p50      p99      p99.9    max      mean     duration  requests
aiohttp          163.27  247.72   352.75   404.59   1414.08  257.59   20.10     30702
flask:gevent     85.02   945.17   6587.19  8177.32  8192.75  1207.66  20.08     7491
flask:meinheld   124.99  2526.55  6753.13  6857.55  6857.55  3036.93  20.10     190
flask:10         163.05  4419.11  4505.59  4659.46  4667.55  3880.05  20.05     1797
flask:20         110.23  2368.20  3140.01  3434.39  3476.06  2163.02  20.09     3364
flask:50         122.17  472.98   3978.68  8599.01  9845.94  541.13   20.10     4606
flask:100        118.26  499.16   4428.77  8714.60  9987.37  556.77   20.10     4555
flask:200        112.06  459.85   4493.61  8548.99  9683.27  527.02   20.10     4378
flask:400        121.63  526.72   3195.23  8069.06  9686.35  580.54   20.06     4336
flask:800        127.94  430.07   4503.95  8653.69  9722.19  514.47   20.09     4381
flask:1000       184.76  732.21   1919.72  5323.73  7364.60  786.26   20.04     4121

You can probably get a sense that aiohttp can serve more requests than any
of the others. To get a real sense of how threads scale, we can put the request count on
a chart:


The interesting note is that the meinheld worker didn’t scale well at all.
Gevent handled requests faster than any of the threading implementations.

But nothing handled nearly as many requests as aiohttp.

These are the results on my machine. I’d strongly suggest you try the experiment
for yourself: the code is available in my fork.

If anyone has any improvements on the multithreading side, or can explain the discrepancy in performance, I’d love to understand more.

MongoDB Streaming Pattern, Allowing for Batching

An interesting problem arose at work today, regarding how to build an
aggregate of changes to a MongoDB collection.

A more general version of the problem is:

  1. You have a document which has multiple buckets it could
    belong to. Say, an animal with an arbitrary set of tags,
    such as [“mammal”, “wings”], and a discrete location from [“backyard”, “frontyard”, “house”].

    An example document could look like:

    { "name": "Cat",
      "location": "house",
      "tags": ["mammal", "ears"] }
  2. Make it easy to retrieve the sum of each location, by tag. So:

     { "tag": "mammal",
       "location": {
         "house": 10,
         "backyard": 4,
         "frontyard": 2
       }
     }

The animal location is updated regularly, so the aggregates
can change over time.

A First Attempt

The simplest way to perform this is to rely on Mongo to retrieve all
animals that match the tag (by indexing the tag field), then handling
the query and count in the application.

This works well at small scale. However, performing the action
this way requires a scanning query per aggregate, which must scan
every document returned to build the aggregate. So, O(matched_documents):

def return_count_by_tag(tag_name):
    aggregate = {
        "tag": tag_name,
        "location": defaultdict(int),
    }
    for doc in db.animals.find({"tags": tag_name}, {"location": 1}):
        aggregate["location"][doc["location"]] += 1

    return aggregate

In our case, we needed to return an answer for every tag, within a
minute. We were able to scale the approach with this constraint in
mind to 35,000 tags and 120,000 documents. At that point, the
application was unable to build the aggregates fast enough.

The New Strategy

The main disadvantage of the previous design is that the aggregate
counts do not need to be calculated on read: if we can ensure
consistent count updates as the location actually changes per
document, we can perform O(tags_per_animal) updates per document instead.

The comparative complexity over a minute is:

  • old: len(distinct_tags) * len(average_animals_per_tag)
  • new: len(updates_per_minute) * len(average_tag_count_per_animal)

So, if we have:

  • 30,000 tags
  • 120,000 animals
  • 40 animals average per tag
  • (40 * 30,000) / (120,000) = 10 tags per animal
  • 10000 updates a minute

The number of documents touched is:

old: 30k * 40 = 1.2 million reads
new: 10k * 10 = 100,000 writes
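The arithmetic above can be double-checked quickly (same numbers as the list above):

```python
tags = 30_000
animals = 120_000
animals_per_tag = 40
updates_per_minute = 10_000

tags_per_animal = (animals_per_tag * tags) // animals  # = 10
old_reads = tags * animals_per_tag                     # reads per minute, old design
new_writes = updates_per_minute * tags_per_animal      # writes per minute, new design
```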

So, we can scale a bit better by handling writes over reads. This
becomes an even better ratio if the updates occur at a less frequent
rate.

So, the stream processing works by:

  1. every desired changes is enqueued into a queue (in Mongo, this can
    be implemented as a capped collection)
  2. a worker process pulls from the queue, and processes the results.

The worker process:

  1. reads a watermark value of where it had processed
    previously (Mongo ObjectIds increase relative to time and insertion
    order, so it can be used as the watermark)
  2. performs the work required
  3. saves works to the collection
  4. writes the watermark value of where it had finished processing.
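A toy in-memory sketch of that worker loop (a plain list and dict stand in for the capped collection and the watermark record; all names here are illustrative):

```python
# stand-in for the capped collection: (object_id, update) pairs in
# insertion order, since ObjectIds increase relative to time
queue = [
    (1, {"animal": "Cat", "location": "house"}),
    (2, {"animal": "Dog", "location": "backyard"}),
    (3, {"animal": "Cat", "location": "frontyard"}),
]
# stand-in for the record holding the watermark and the aggregates
state = {"watermark": 0, "counts": {}}

def run_worker():
    # 1. read the watermark of where we processed previously
    watermark = state["watermark"]
    for object_id, update in queue:
        if object_id <= watermark:
            continue  # already processed on a previous run
        # 2./3. perform the work required and save it
        counts = state["counts"].setdefault(update["animal"], {})
        counts[update["location"]] = counts.get(update["location"], 0) + 1
        # 4. record how far we have processed
        state["watermark"] = object_id

run_worker()
run_worker()  # safe to re-run: nothing past the watermark, nothing recounted
```

Because the watermark advances with each processed record, re-running the worker is idempotent, which is the property the deletion-based alternative below gives up.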

You could also delete records as you process them, but it can cause
issues if you need to read a record again, or if multiple workers need them.

Starting from Scratch

So how do we allow starting from scratch? Or, rebuilding the
aggregates if an issue occurs?

There could be a function that performs the whole-collection
calculation, dumps it to the aggregate collection, and sets the watermark to
the most recent object in the queue.

Unfortunately, this process and the worker process cannot run at the
same time. If they do, the aggregate collection can be corrupted:
the rebuild could query an older version of the collection, while
updates are applied to the original aggregate copy, only to be overwritten
by the stale copy from the rebuild.

Thus, we must ensure that the update worker does not run at the same
time as the batch worker.

A locking strategy

In Mongo, locking is decided by the database, and a user has no
control over it. Thus, we must implement our own locking functionality
using Mongo primitives.

The same record that holds the watermark can also hold the lock. To
ensure that we can survive a worker dying halfway and not releasing
the lock, we can record a lock owner, ensuring the same process type
can begin an operation again:

{ "name": "pet-aggregates",
  "watermark": ObjectId("DEADBEEF"),
  "lock": {
      "type": "update" // could also be type: bulk
  }
}
Using this type of lock, the possible failure scenarios are:

  1. update process lock, failure, and update doesn’t run again:
    This requires manually looking at the issue, resolving, and restarting the queue.
  2. bulk process lock, failure, and bulk doesn’t run again:
    This requires manually looking at the issue, resolving, and restarting the queue.

deepmerge: deep merge dictionaries, lists and more in Python

Introducing deepmerge: a library designed to provide simple
controls around a merging system for basic Python data structures like dicts and lists.

It provides a few common cases for merging (like always merge + override, or raise an exception on conflict):

from deepmerge import always_merger, merge_or_raise

base = {
    "a": ["b"],
    "c": 1,
    "nested": {
        "nested_dict": "value",
        "nested_list": ["a"],
    },
}

nxt = {
    "new_key": "new_value",
    "nested": {
        "nested_dict": "new_value",
        "nested_list": ["b"],
        "new_nested_key": "value",
    },
}

always_merger.merge(base, nxt)
assert base == {
    "a": ["b"],
    "c": 1,
    "new_key": "new_value",
    "nested": {
        "nested_dict": "new_value",
        "nested_list": ["a", "b"],
        "new_nested_key": "value",
    },
}

deepmerge allows customization as well, for when you want to specify
the merging strategy:

from deepmerge import Merger

my_merger = Merger(
    # pass in a list of tuples, with the
    # strategies you are looking to apply
    # to each type.
    [
        (list, ["prepend"]),
        (dict, ["merge"]),
    ],
    # next, choose the fallback strategies,
    # applied to all other types:
    ["override"],
    # finally, choose the strategies in
    # the case where the types conflict:
    ["override"],
)
base = {"foo": ["bar"]}
nxt = {"bar": "baz"}
my_merger.merge(base, nxt)
assert base == {"foo": ["bar"], "bar": "baz"}

For each strategy choice, pass in a list of strings specifying built-in strategies,
or a function defining your own:

def merge_sets(merger, path, base, nxt):
    base |= nxt
    return base

def merge_list(merger, path, base, nxt):
    if len(nxt) > 0:
        base.extend(nxt)
    return base

return Merger(
    [
        (list, merge_list),
        (dict, "merge"),
        (set, merge_sets),
    ],
    ["override"],
    ["override"],
)

That’s it! Give it a try, and Pull Requests are always encouraged.