Crafting pelican-export in 6 hours.

Over the past two or three days, I spent some deep work time writing pelican-export, a tool to export posts from the pelican static site generator to WordPress (with some easy hooks to add more targets). Overall I was happy with the project, not only because it was successful, but because I was able to get to something complete in a pretty short period of time: 6 hours. Reflecting, I owe this to the techniques I’ve learned to prototype quickly.

Here’s a timeline of how I iterated, with some analysis.

[20m] Finding Prior Art

Before I start any project, I try to at least do a few quick web searches to see if what I want already exists. Searching for “pelican to wordpress” pulled up this blog post:

https://code.zoia.org/2016/11/29/migrating-from-pelican-to-wordpress/

Which pointed at a git repo:

https://github.com/robertozoia/pelican-to-wordpress

Fantastic! Something exists that I can use. Even if it doesn’t work off the bat, I can probably fix it, use it, and be on my way.

[60m] Trying to use pelican-to-wordpress

I started by cloning the repo and looking through the code. From here I got some great ideas for quickly building this integration (e.g. discovering the python-wordpress-xmlrpc library). Unfortunately the code only supported Markdown (my posts are in reStructuredText), and there were a few things I wasn’t a fan of (like constants, including the password, living in a file), so I decided to start doing some light refactoring.

I started organizing things into a package structure, and tried to use the Pelican Python package itself to do things like read the file contents (saving me the need to parse the text myself). While looking for those docs, I stumbled upon some issues in the pelican repository suggesting that, for exporting, one would want to write a plugin:

https://github.com/getpelican/pelican/issues/2143

At this point, I decided to explore plugins.

[60m] Scaffolding and plugin structure.

Looking through the plugin docs, it seemed much easier than trying to read in the pelican posts myself: I had limited success with instantiating a pelican reader object directly, as it expects specific configuration variables.

So I started authoring a real package. Copying in package scaffolding like setup.py from another repo, I added the minimum integration needed to actually install the plugin into pelican and run it.
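That minimum integration is mostly a register() hook. Here’s a minimal sketch of the wiring, where process_articles is a hypothetical handler of my own naming; the signals module and the signal name come from pelican itself:

# a minimal sketch of the plugin wiring; process_articles is hypothetical,
# while pelican provides the signals module and the signal name.
from pelican import signals


def process_articles(article_generator):
    # called once pelican has read and parsed all articles
    for article in article_generator.articles:
        print(article.title)


def register():
    # pelican invokes register() when it loads the plugin
    signals.article_generator_finalized.connect(process_articles)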

[60m] Rapid iteration with pdb.

At that point, I added a pdb statement to the integration, so I could quickly look at the data structures. Using that, I crafted the code to migrate post formats in a few minutes:

    # module-level imports this method relies on:
    #   from datetime import datetime
    #   from typing import Optional
    #   from wordpress_xmlrpc import WordPressPost
    def process_post(self, content) -> Optional[WordPressPost]:
        """Create a wordpress post based on pelican content"""
        if content.status == "draft":
            return None
        post = WordPressPost()
        post.title = content.title
        post.slug = content.slug
        post.content = content.content
        # this conversion is required, as pelican uses a SafeDateTime
        # that python-wordpress-xmlrpc doesn't recognize as a valid date.
        post.date = datetime.fromisoformat(content.date.isoformat())
        post.term_names = {
            "category": [content.category.name],
        }
        if hasattr(content, "tags"):
            post.term_names["post_tag"] = [tag.name for tag in content.tags]
        return post

I added a similar pdb statement to the “finalized” pelican signal, and tested the client with hard-coded values. I was done as far as functionality was concerned!
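A hard-coded client test looks roughly like the following sketch. Client and NewPost are python-wordpress-xmlrpc’s API; the URL and credentials here are placeholders:

from wordpress_xmlrpc import Client
from wordpress_xmlrpc.methods.posts import NewPost

# placeholder URL and credentials, for illustration only
client = Client("https://example.com/xmlrpc.php", "user", "password")
client.call(NewPost(post))  # post is the WordPressPost built by process_post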

[180m] Code cleanup and publishing

The bulk of my time after that was just smaller cleanup that I wanted to do from a code hygiene standpoint. Things like:

  • [70m] making the wordpress integration an interface, so it’s easy to hook in other exporters.
  • [40m] adding a configuration pattern to enable hooking in other exporters.
  • [10m] renaming the repo to its final name of pelican-export
  • [30m] adding readme and documentation.
  • [30m] publishing the package to pypi.

This was half of my time! It’s interesting how much of the work is spent just ensuring the right structure and practices for the long term.

Takeaways

I took every shortcut in my book to arrive at something functional, as quickly as I could. Techniques that saved me tons of time were:

  • Looking for prior art. Brainstorming how to do the work myself would have meant investigating potential avenues and evaluating how long each would take. Having an existing example, even if it didn’t work for me, helped me ramp up on the problem quickly.
  • Throwing code away. I had a significant amount of modified code in my forked exporter. But continuing down that route would have meant a significant investment in hacking on and understanding the pelican library. Seeing that the plugin route existed, and testing it out, saved me several hours of trying to hack an interface to private pelican APIs.
  • Using pdb to write code live. In Python especially, there’s no replacement for just introspecting and trying things. Authoring just enough code to integrate as a plugin gave me a fast feedback loop, and throwing in a pdb statement to quickly learn the data structures helped me find the ideal structure in about 10 minutes.

There was also a fair bit of Python expertise that I used to drive down the coding time, but interestingly, the biggest contributors to the time savings were process: knowing the tricks for choosing the right approach to the code, and iterating quickly, helped me get this done in effectively a single work day.

Aiohttp vs Multithreaded Flask for High I/O Applications

Over the past year, my team has been making the transition from Flask to
aiohttp. We’re making this transition because of the many situations
where non-blocking I/O theoretically scales better:

  • large numbers of simultaneous connections
  • remote http requests with long response times

There is agreement that asyncio scales better memory-wise: an asyncio
coroutine (effectively a green thread) consumes less memory than a
system thread.

However, performance for latency and load is a bit more contentious.
The best way to find out is to run a practical experiment, so I forked
py-frameworks-benchmark and designed one.

The Experiment

The conditions of the web application, and the work performed, are identical:

  • a route on a web server that (1) makes an http request to an nginx
    server that returns html, and (2) returns the response as json (a
    sketch follows this list)
  • a wrk benchmark run, with 400 concurrent requests for 20 seconds
  • running under gunicorn, with two worker processes.
  • python3.6
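For concreteness, here’s a minimal sketch of what the aiohttp variant’s route looks like under these conditions; the URLs, ports, and module name are placeholder assumptions, and the actual benchmark code lives in the fork linked below:

# a minimal sketch of the aiohttp variant; URLs and ports are placeholders
from aiohttp import ClientSession, web


async def handler(request):
    # query the nginx server over http (a shared session would be more
    # idiomatic, but this keeps the sketch self-contained)...
    async with ClientSession() as session:
        async with session.get("http://localhost:8080/") as resp:
            body = await resp.text()
    # ...and return the response as json
    return web.json_response({"body": body})


app = web.Application()
app.router.add_get("/", handler)

# run with:       gunicorn app:app --worker-class aiohttp.GunicornWebWorker --workers 2
# benchmark with: wrk -c 400 -d 20s http://localhost:8000/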

The Variants

The variants are:

  • aiohttp
  • flask + meinheld
  • flask + gevent
  • flask + multithreading, with thread counts varying from 10 to 1000.

Results

variant         min (ms)  p50 (ms)  p99 (ms)  p99.9 (ms)  max (ms)  mean (ms)  duration (s)  requests
aiohttp           163.27    247.72    352.75      404.59   1414.08     257.59         20.10     30702
flask:gevent       85.02    945.17   6587.19     8177.32   8192.75    1207.66         20.08      7491
flask:meinheld    124.99   2526.55   6753.13     6857.55   6857.55    3036.93         20.10       190
flask:10          163.05   4419.11   4505.59     4659.46   4667.55    3880.05         20.05      1797
flask:20          110.23   2368.20   3140.01     3434.39   3476.06    2163.02         20.09      3364
flask:50          122.17    472.98   3978.68     8599.01   9845.94     541.13         20.10      4606
flask:100         118.26    499.16   4428.77     8714.60   9987.37     556.77         20.10      4555
flask:200         112.06    459.85   4493.61     8548.99   9683.27     527.02         20.10      4378
flask:400         121.63    526.72   3195.23     8069.06   9686.35     580.54         20.06      4336
flask:800         127.94    430.07   4503.95     8653.69   9722.19     514.47         20.09      4381
flask:1000        184.76    732.21   1919.72     5323.73   7364.60     786.26         20.04      4121

You can probably get a sense that aiohttp can serve more requests than
any other variant. To get a real sense of how threads scale, we can put
the request count on a chart:

[chart: requests served per variant over the 20-second run]

The interesting note is that the meinheld worker didn’t scale very well at all.
Gevent handled requests faster than any threading implementation.

But nothing handled nearly as many requests as aiohttp.

These are the results on my machine. I’d strongly suggest you try the experiment
for yourself: the code is available in my fork.

If anyone has any improvements on the multithreading side, or can explain the discrepancy in performance, I’d love to understand more.

The CID Pattern: a strategy to keep your web service code clean

The Problem

Long-term maintenance of a web application will, at some point,
require changes. Code grows with the functionality it serves, and
an increase in functionality is inevitable.

It is impossible to foresee what sort of changes are required, but there are
changes that are common and are commonly expensive:

  • changing the back-end datastore of one or more pieces of data
  • adding additional interfaces for a consumer to request or modify data

It is possible to prevent some of these changes with some foresight,
but it is unlikely that all of them can be prevented. As such, we can
try to encapsulate these changes and limit their impact on the rest of
the code base.

Thus, every time I start on a new project, I practice CID (Consumer-Internal-Datasource).

CID Explained

CID is an acronym for the three layers of abstraction that should be
built out from the beginning of an application. The layers are described as:

  • The consumer level: the interface that your consumers interact with
  • The internal level: the interface that application developers interact with most of the time
  • The datasource level: the interface that handles communication with the database and other APIs

Let’s go into each of these in detail.

Consumer: the user-facing side

The consumer level handles translating and verifying the consumer-facing
format into something that makes more sense internally. In the
beginning, this level could be razor thin, as the consumer format
probably matches the internal format completely. However, other
responsibilities that might live at this layer are:

  • schema validation
  • converting to whatever format the consumer desires, such as json
  • speaking whatever transport protocol is desired, such as HTTP or a Kafka stream

As the application grows, the internal format might change, or a new
API version may need to be introduced, with its own schema. At that
point, it makes sense to split the consumer schema from the internal
schema, ending up with something like:

class PetV1:
    def to_internal(self):
        ...  # converts a V1 pet to the internal representation

    @classmethod
    def from_internal(cls, pet):
        ...  # in case you need to return pet objects back as V1


class PetV2:
    def to_internal(self):
        ...  # converts a V2 pet to the internal representation

    @classmethod
    def from_internal(cls, pet):
        ...  # in case you need to return pet objects back as V2


class PetInt:
    """The internal representation, used within the internal level."""

Datasource: translates internal to the datastore

Some of the worst refactorings I’ve encountered are the ones involving
switching datastores. It’s a linear problem: as the database
interactions increase, so do the lines of code needed to perform those
interactions, and each line must be modified when switching or altering
the way the datastore is called.

It’s also difficult to get a read on where the most expensive queries
lie. When your application has free-form queries all over the code, it
requires someone to look at each call and interpret the cost, and to
ensure performance is acceptable for the new source.

If any layer should be abstracted, it’s the datasource. Abstracting the
datastore behind a client object makes multiple refactors simpler:

  • adding an index and modifying queries to hit that index
  • switching datasources
  • putting the database behind another web service
  • adding timeouts and circuit breakers
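As a sketch, such a client object for the Pet example might look like the following; PetStore and the connection’s fetch_one/execute methods are hypothetical placeholders:

class PetStore:
    def __init__(self, connection):
        self._connection = connection  # the only code that knows the backend

    def get(self, pet_id):
        row = self._connection.fetch_one(
            "SELECT id, name, breed FROM pets WHERE id = %s", (pet_id,)
        )
        return PetInt(id=row[0], name=row[1], breed=row[2])

    def save(self, pet):
        self._connection.execute(
            "INSERT INTO pets (id, name, breed) VALUES (%s, %s, %s)",
            (pet.id, pet.name, pet.breed),
        )

Switching from Postgres to MongoDB then means rewriting this one class, rather than hunting down queries across the code base.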

Internal: the functional developer side

The consumer and datasource layers abstract away any refactoring that
only affects the way the user interacts with the application, or the
way data is stored. That leaves the final layer to focus on just the
behavior.

The internal layer stitches together consumer and datasource, and
performs whatever other transformations or logic need to be performed.
By abstracting out any modification to the schema that had to be done
for the consumer or datasource (including keeping multiple
representations for the API), you’re afforded a layer that deals
exclusively with application behavior.
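As a sketch of that stitching, reusing the classes sketched above (adopt_pet and the status field are hypothetical):

def adopt_pet(pet_v1, store):
    pet = pet_v1.to_internal()       # consumer level: V1 schema -> internal
    pet.status = "adopted"           # internal level: pure application behavior
    store.save(pet)                  # datasource level: persistence details hidden
    return PetV1.from_internal(pet)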

An Example of a CID application

A theoretical organization for a CID application is:

root:
  consumers:
    - HTTPPetV1
    - HTTPPetV2
    - SQSPetV1
  internal:
    # only a single internal representation is needed.
    - Pet
  datasource:
    # showcasing a migration from Postgres to MongoDB
    - PetPostgres
    - PetMongoDB

Examples Where CID Helps

So I’ve spent a long time discussing the layers and their
responsibilities. If we go through all of this trouble, where does
this actually help?

Adding a new API version

  • add a new API schema
  • convert to internal representation

Modifying the underlying database

  • modify the datasource client.

Complex Internal Representations

If you need to keep some details in a Postgres database, and store
other values within memcache for common queries, this can be
encapsulated in the datasource layer.

All too often the internal representation attempts to deal with this
type of complexity, which makes it much harder to understand the
application code.

Maintaining Multiple API versions

Without clearly separating how an object is structured internally from
how consumers consume it, the details of the consumer leak into the
internal representation.

For example, when attempting to support two API versions, someone
writes some branched code to get the data they want. This pattern
continues across multiple parts of the code dealing with that data,
until it becomes hard to get a complete understanding of what in V1 is
consumed, and what in V2 is consumed.

Final Thoughts

David Wheeler is quoted as saying:

All problems in computer science can be solved by another level of indirection.

Indirection is handy because it encapsulates: you do not need a
complete understanding of the implementation to move forward.

At the same time, too much indirection makes it difficult to
understand the complete effect of a change.

Balance is key, and using CID helps guide indirection where
it could help the most.

Hierarchical Naming

One of the most interesting artifacts of most programming languages using English conventions is variable naming. Today I contend that:

English Grammar is a Terrible Programming Default

Consider how you would specify that a room is for guests in English,
or a car is designed to be sporty. In both cases, the specifier comes
before the object or category:

  • Sports Car
  • Guest Room
  • Persian Cat

Since programming languages are primarily based on English, it’s a natural default to name your variables in a similar order:

  • PersianCat
  • TabbyCat
  • SiameseCat

To further qualify your classes, one prepends additional information:

  • RedTabbyCat
  • BlueTabbyCat
  • BlackTabbyCat

And the pattern continues: as more qualifiers are added, more is prepended to the name.

This reads well, if our main goal is to make software read as close
to English as possible. However, software has goals that are more
important than grammatical correctness: organization and searchability.

Naming should have qualifiers last

Consider instead appending qualifiers to the end, as with a namespace:

  • CatPersian
  • CatTabby
  • CatSiamese
  • CatTabbyRed
  • CatTabbyBlue
  • CatTabbyBlack

It’s still legible to an English speaker: it’s clear the adjectives are inverted. It also provides a couple of other advantages:

Sortability

If you sorted all class names next to each other, the groupings would happen naturally:

  • CatTabbyBlack
  • CatTabbyBlue
  • CatTabbyRed
  • PimentoLoaf
  • Truck

In contrast to the previous example:

  • BlackTabbyCat
  • BlueTabbyCat
  • PimentoLoaf
  • RedTabbyCat
  • Truck

Clear correlation while scanning

If you’re trying to look through a table of values quickly,
using the reverse-adjective writing shows a clear organization, even when unsorted.

  • CatTabbyBlue
  • PimentoLoaf
  • CatPersian
  • Truck
  • CatTabbyRed

In contrast to:

  • BlueTabbyCat
  • PimentoLoaf
  • PersianCat
  • Truck
  • RedTabbyCat

Conclusion

Our variable naming convention wasn’t deliberate: it was an artifact
of the language it was modeled on. Let’s adopt conventions that come
from a logical foundation, like a more search-friendly ordering of
class qualifiers.

Test Classes Don’t Work

Test Classes don’t work as a test structure.

It’s worth clarifying what I mean by the test class. I’m
speaking specifically about the following structure of a test:

  • having a test class, that contains the setup and teardown method for test fixtures
  • putting multiple tests in that class
  • having the execution of a test look something like:
    * run setup
    * execute test
    * run teardown

More or less, something like:

class TestMyStuff:

    def setUp(self):
        self.fixture_one = create_fixture()
        self.fixture_two = create_another_fixture()

    def tearDown(self):
        teardown_fixture(self.fixture_one)
        teardown_fixture(self.fixture_two)

    def test_my_stuff(self):
        result = something(self.fixture_one)
        assert result.is_ok

This pattern is prevalent across testing suites, since they follow the
XUnit pattern of test design.

Why Test Classes are the Norm

Removing the setup and teardown of your test fixtures from the test
body keeps things clean. When looking at code, you only want to look at
context that’s relevant to you, otherwise it’s harder to identify what
should be focused on:

def test_my_stuff():
    fixture = create_fixture()

    try:
        result = something(fixture)
        assert result.is_ok
    finally:
        teardown_fixture(fixture)

So, it makes sense to have setup and teardown methods. A lot of the
time, you’ll have common sets of test fixtures, and you want to share
them without explicitly specifying them every time. Most languages
provide object-oriented programming, which allows state that is
accessible by all methods. Classes are a good vessel to give a test
access to a set of test fixtures.

When You Have a Hammer…

The thing about object oriented programming is, it’s almost always a
single inheritance model, and multiple inheritance gets ugly
quickly. It’s not very easy to compose test classes together. In the
context of test classes, why would you ever want to do that?

Test fixtures. Tests depend on a variety of objects, and you don’t
want to have to duplicate the setup of the same test fixtures across
multiple classes. Even when you factor it out, it gets messy quickly:

class TestA():
    def setUp(self):
        self.fixture_a = create_fixture_a()
        self.fixture_b = create_fixture_b()

    def tearDown(self):
        teardown_fixture(self.fixture_a)
        teardown_fixture(self.fixture_b)

    def test_my_thing(self):
        ...


class TestB():
    def setUp(self):
        self.fixture_b = create_fixture_b()

    def tearDown(self):
        teardown_fixture(self.fixture_b)

    def test_my_other_thing(self):
        ...

class TestC():
    def setUp(self):
        self.fixture_b = create_fixture_b()
        self.fixture_c = create_fixture_c()

    def tearDown(self):
        teardown_fixture(self.fixture_b)
        teardown_fixture(self.fixture_c)

    def test_my_other_other_thing(self):
        ...

At this rate, a test class per test would become necessary, each with
the same code to set up and tear down the exact same fixtures.

To avoid this, there needs to be a test system that:

  • has factories for test fixtures
  • requires as little code as possible to choose the necessary
    fixtures, and to clean them up.

A Better Solution: Dependency Injection

In a more general sense, a test fixture is a dependency for a
test. If a system existed that handled the creation and teardown of
dependencies, it would be possible to keep only the genuinely unique
logic in the test body.

Effectively, this is the exact description of a dependency injection
framework: specify the dependencies necessary, and the framework
handles the rest.

In Python, for example, py.test has this capability. I declare a common fixture
somewhere, and can consume it implicitly in any test function:

# example adapted from the py.test fixture page.
import pytest

@pytest.fixture
def smtp(request):
    import smtplib
    server = smtplib.SMTP("merlinux.eu")
    # addfinalizer can be used to hook into the fixture cleanup process
    request.addfinalizer(lambda: server.quit())
    return server

def test_ehlo(smtp):
    response, msg = smtp.ehlo()
    assert response == 250
    assert 0 # for demo purposes

With pytest, you can even use fixtures while generating other fixtures!
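For example, a quick sketch with hypothetical user and session fixtures:

import pytest

@pytest.fixture
def user():
    # a hypothetical base fixture
    return {"name": "alice"}

@pytest.fixture
def session(user):
    # this fixture consumes the user fixture above
    return {"user": user, "token": "abc123"}

def test_session_has_user(session):
    assert session["user"]["name"] == "alice"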

It’s a beautiful concept, and a cleaner example of how test fixtures
could be handled. No more awkward test class container to handle creation
and teardown of fixtures.

As always, thoughts and comments are appreciated.

How I Design Test Suites

At Zillow, I’ve done a lot of work on the design and development of
the test infrastructure we use for full-stack tests. It’s always fun
to watch your tool become popular, but even more interesting is the
discussions around test suite design that come with it.

Many discussions later, I have a good idea of what I want in a test suite.
Here’s what I think about:

Tests are a question of cost

At the end of the day, tests have a cost. Each and every test has a
value / cost ratio. Things that increase the value of a test include:

  • consistency: given the same inputs, give the same results, every time.
  • speed: the faster the test is, the faster the feedback. The faster
    the feedback, the faster one can take action, and the more often we
    can execute the tests to get feedback.

In contrast, the things that increase the cost of a test include:

  • maintenance time: maintenance takes time, and development time is expensive.
    probably the biggest cost to consider.
  • cpu / memory to execute the test: although arguably cheap in this world
    of cloud providers, cpu and memory are real concerns, and tests that use
    a lot of these resources are expensive.
  • the time to execute the test: time is a huge cost, especially as the
    technology world we live in demands more changes, more
    quickly. Depending on how fast you ship, tests that take too long will
    be prohibitively expensive, and thus not used.

When I look at the value of a test, I look at these factors. In
practice, I’ve found that the most important metric of them all is
maintenance time: tests that require little to no maintenance survive
refactors, rewrites, and pretty much anything that could happen to
code besides deprecation.

On the other hand, the more the test requires maintenance, the more likely
it’ll suffer one of two outcomes:

  • the test is thrown out because it takes too much time to maintain,
    despite the value.
  • the test is not given the time it needs, and continues to fall into
    disarray until it is ignored.

Basically: low maintenance tests last forever, high maintenance tests probably won’t.

Designing cheap tests

So how do we make tests that require little to no maintenance? From what I’ve observed, there are two types of maintenance:

  • functional maintenance, which modifies the test to reflect changes in the code itself
    • e.g. for a web page, the login form fields are modified
  • operational maintenance, which requires keeping a service dependency in a good state to test.
    • e.g. for an office application with cloud sync, keeping the cloud syncing service up.

Functional maintenance is unavoidable: as code changes, one must
ensure that any tests that validate that code are kept up to date. In
addition, for most tests, functional maintenance is relatively cheap
in time: except in the cases of extreme redesigns or refactorings, the
changes tend to be small in nature.

Operational maintenance costs can vary wildly, and it can become very
expensive. Tests that have multiple dependencies can become a game of
juggling an environment where all of those are functional. It becomes
even harder if there’s a small team maintaining this environment:
executing the tests consistently requires a production-quality
environment, and that’s more difficult the more services there are to
maintain.

However, unlike functional maintenance, operational maintenance is,
for the most part, avoidable. Taking advantage of heavy mocking, it’s
possible to remove dependencies like databases and APIs. Google
Testing Blog has a good article about this.
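As a sketch of the idea in Python, using the standard library’s unittest.mock (get_pet_name and the store interface are hypothetical):

from unittest.mock import Mock

def get_pet_name(store):
    # hypothetical code under test: it only depends on a datastore interface
    return store.get("pet-1")["name"]

def test_get_pet_name():
    store = Mock()
    store.get.return_value = {"name": "Fido"}  # no real database required
    assert get_pet_name(store) == "Fido"
    store.get.assert_called_once_with("pet-1")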

Summary: tests with fewer operational dependencies are cheaper to maintain.

What kind of test distribution: the testing pyramid

When testing software, there are multiple levels at which one could author tests:

  • at the “unit” level, typically written in the same language and validating a single function or behaviour
  • at the integration level, typically written in the same language, and validating the communication between your code and an external application
  • at the end-to-end level, not necessarily written in the same language, and validating a complete workflow that a user would be performing.

Although all are important and should be included in a test suite,
each test is not created equally. Going back to the idea that tests
with the least maintenance will last the longest, we should be trying
to have as many of those as possible.

Unit tests are the cheapest. They:

  • have no dependencies (or else they would at least be considered an integration test),
  • run quickly (no waiting for network, or other delay from communication)

If we could capture all behaviour of our application with just unit
tests, that would be perfect. Unfortunately, many things can go wrong
when composing multiple pieces of these units together, so some level
of integration and end-to-end tests will be needed. But the larger
tests should be fewer in number, since they are harder to maintain.

A good model for visualizing the distribution is the “testing pyramid”,
as explained by Martin Fowler and Google: the more expensive tests are
fewer in number, while the cheaper tests are much more common.

How many tests should be in a suite

Adequate test coverage varies wildly between applications: medical
software that monitors heart rate should probably have a lot more
coverage than a non-critical social media website. The only common
rule of thumb I’ve found is: add the absolute minimum number of tests
to achieve your desired confidence in quality.

Testing is important, but at the end of the day, it’s not a
user-facing feature. On the other hand, quality is. Adding additional
tests does increase quality, but it comes at the cost of development
and maintenance time that could go toward other features that help
your application provide value. A properly sized test suite comes
right up to the line of too little testing, and hovers around it. This
gives developers as much time as possible for features, while ensuring
that an important feature (quality) is not neglected.

Summary

  • the best tests are the cheapest tests: low maintenance, quick to
    execute, and light on CPU/RAM resources.
  • the cheapest tests have the fewest dependencies on other
    applications, like DBs or APIs.
  • try to keep test coverage at as low a level as possible; a cheap
    test is worth ten expensive ones.
  • expensive tests validate the whole infrastructure, so they’re almost
    always necessary: refer to the test pyramid for a rough sketch of a
    good distribution.
  • never add more or less coverage than you need: more coverage results
    in more maintenance that detracts from development time, and less coverage means an application
    whose quality is not up to the desired standards.
  • how much coverage do I need? It depends on how critical the
    application is, and how critical it is that it continues to work. A
    payment path needs high quality, so it should have high coverage. The
    alignment of a button on a dialog three pages deep probably needs
    less quality assurance.

How do you design your test suite?

Book Report: Refactoring by Martin Fowler

Refactoring is a book covering the basic tenets of refactoring as
laid out by Martin Fowler: a very smart person with some very good
ideas about code in general.

First, the interesting thing about refactoring as defined by this book
is that it doesn’t encompass all code cleanup. It explicitly defines
refactoring as a disciplined practice that involves:

  • a rigorous test suite to ensure code behaves as desired beforehand.
  • a set of steps that ensures that, at every step, the code works as before.

There are a lot of gems in this book. ‘Refactoring’ not only covers the
basic tenets of refactoring, but also provides a great set of
guidelines for writing code that is easy for future maintainers to
understand.

The Indicators for Refactoring

After showing a great example of a step-by-step refactoring of code
that excellently preserves functionality, the next chapter describes
several code smells that indicate the need for a refactor:

  • duplicate code: a common red flag for anyone familiar with the
    age-old adage DRY (Don’t Repeat Yourself).
  • long methods: definitely a good sign for a refactor. I can’t recall
    how many methods I’ve read where I’ve barely been able to keep mental
    track of what’s really going on.
  • strong coupling: definitely not an easy one to catch when you’re
    hacking away hardcore at something. Sometimes it takes a real
    objective look at your code to find that the two classes or methods
    you’ve been working with should really be one, or maybe organized
    separately.

Aside from this, the book explicitly describes several situations
which indicate the need to consider refactoring. That said (and Martin
also admits this), it’s not even close to outlining every single
situation where refactoring is necessary. After all, programming,
despite requiring a very logical and objective mind, can be a very
subjective practice.

The Actual Refactorings

After going over the smells, the next chapters finally describe the
actual refactorings themselves. Each description is very rigorous,
covering motivation, explicit steps, and examples. It’s a very good
reference to cover all of your bases, and like any book that describes
patterns, is a good one to keep somewhere when tackling particularly
difficult refactoring tasks.

A lot of the refactors were ones I was already familiar with, but
there were some interesting cases I didn’t really think a lot about, that
‘Refactoring’ helped me assess more deeply:

Replace Temp with Query

The summary of this description is to replace temporary variables with
a method that generates the state desired:

def shift_left(digits, value):
    multiplier = 2 ** digits
    return value * multiplier

After:

def shift_left(digits, value):
    return value * _power_of_two(digits)

def _power_of_two(digits):
    return 2 ** digits

This is a trivial example, and not necessarily representative of a
real refactoring. However, using a ‘query method’ to generate state
helps prevent several bad patterns from emerging:

  • modifying the local variable to be different from the initial intention
  • misusing the variable somewhere else

It’s a good example of a refactoring that helps ensure the variable is
actually temporary, and is not misused.

Introduce Explaining Variable

At the end of the day, good code is 90% about making it easier for
others to read. Code that works is great, but code that can not be
understood or maintained is not going to last when that code is
encountered a second time.

Explaining variables really help here. This is the idea of making
ambiguous code more clearer by assigning results to named variables that
express the intent a lot better:

def interest(amount, percentage, period):
    return amount * (1.414 ** (percentage / period))

After:

def interest(amount, percentage, period):
    growth_factor = 1.414
    return amount * (growth_factor ** (percentage / period))

Having very descriptive variables can make understanding the code a
lot easier.

Remove Assignment to Parameters

This is basically saying: avoid mutating input parameters:

def multiply(x, y):
    x *= y
    return x

After:

def multiply(x, y):
    result = x * y
    return result

This is nice because it makes it easier to work with input parameters
later: mutating values that have clear intent can lead to misuse of
those variables later (because you assume no one changed them, or that
they still describe what their names suggest). Avoiding the mutation
could be inefficient, but compiler optimizers can get rid of these
inefficiencies anyway, so why make it more confusing to a potential
consumer?

Duplicate Observed Data

This is basically pushing for a decoupling of data stored on both a
client (interface) and a publisher. There are a lot of cases where the
client will store data that’s almost identical to an object that
already exists and has all the information stored neatly. Reducing the
duplication of data is always a good thing.
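A minimal sketch of the spirit of this, with hypothetical Pet and PetView classes: rather than copying fields onto the view, the view holds a reference to the domain object:

class Pet:
    def __init__(self, name, breed):
        self.name = name
        self.breed = breed

class PetView:
    def __init__(self, pet):
        # reference the single source of truth instead of copying its fields
        self.pet = pet

    def render(self):
        return f"{self.pet.name} ({self.pet.breed})"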

Separate Query from Modifier

There are a lot of methods that not only format or retrieve data, but
also mutate it. This refactoring suggests separating them:

def retrieve_name(log_object):
    log_object.access_count += 1
    return [str(x) for x in log_object.names]

After:

def increment_access_count(log_object):
    log_object.access_count += 1

def retrieve_name(log_object):
    return [str(x) for x in log_object.names]

# the caller now composes the two explicitly:
increment_access_count(log_object)
names = retrieve_name(log_object)

I can’t count the number of times I’ve wanted just one specific part
of what a function performs. Refactorings such as this one really give
you modular pieces that can be stitched together as necessary.

The General Refactoring Principles

The book scatters some great gems about what a good refactoring looks
like, and it’s very similar to what is commonly known to be good code:

  • mostly self-documenting: code that is so legible that you barely
    even need comments to understand what it’s doing: intelligible
    variable and function names, written like plain English more than
    code.
  • modular: each function is split into small, singularly functional
    units.
  • taking advantage of the principles and idioms of the language at
    hand: ‘Refactoring’ was written with object-oriented languages in
    mind, so it advocates strong utilization of OOP. Utilize the
    programming language’s strengths.

Any step that takes your code in that direction (whilst preserving
functionality) is a good example of a refactoring.

How to Allocate Time to Refactor

‘Refactoring’ also stresses an appropriate time to refactor code:
constantly. Martin Fowler argues refactoring should occur during the
development process, and that time should be added to estimates to give
space for refactoring. I’ve never been given explicit amounts of time
to refactor code, and most of the time, you won’t be. The best thing to
do is to push yourself to refactor whenever it’s appropriate. The book
also warns against going overboard: only refactor what you need. It’s a
very agile approach to the idea of refactoring.

Conclusion

Ultimately, ‘Refactoring’ doesn’t blow my mind and introduce me to
some life-changing concept. That said, it definitely changed my
mindset about refactoring. Refactoring should:

  • be done as you go
  • move the code toward being easily comprehensible
  • move the code toward being easily extendable
  • have a strong set of testing around it to preserve functionality

As I was about to tackle a fairly large refactoring, it was a great
read to organize my thoughts about my methodologies, practices, and
goals.

I don’t recommend reading every word, but reading the chapters that
explain the philosophies and glancing over the refactoring patterns
was more than worth the time spent.