Crafting pelican-export in 6 hours.
Over the past two or three days, I spent some deep work time on writing pelican-export, a tool to export posts from the pelican static blog creator to WordPress (with some easy hooks to add more). Overall I was happy with the project, not only because it was successful, but because I was able to get to something complete in a pretty short period of time: 6 hours. Reflecting, I owe this to the techniques I’ve learned to prototype quickly.
Here’s a timeline of how I iterated, with some analysis.
[20m] Finding Prior Art
Before I start any project, I try to do at least a few quick web searches to see if what I want already exists. Searching for “pelican to wordpress” pulled up this blog post:
https://code.zoia.org/2016/11/29/migrating-from-pelican-to-wordpress/
Which pointed at a git repo:
https://github.com/robertozoia/pelican-to-wordpress
Fantastic! Something exists that I can use. Even if it doesn’t work off the bat, I can probably fix it, use it, and be on my way.
[60m] Trying to use pelican-to-wordpress
I started by cloning the repo and looking through the code. From there I got some great ideas for quickly building this integration (e.g. discovering the python-wordpress-xmlrpc library). Unfortunately, the code only supported Markdown (my posts are in reStructuredText), and there were a few things I wasn’t a fan of (constants, including the password, stored in a file), so I decided to start doing some light refactoring.
I started organizing things into a package structure, and tried to use the Pelican Python package itself to do things like read the file contents (saves me the need to parse the text myself). While looking for those docs, I stumbled upon some issues in the pelican repository, suggesting that for exporting, one would want to write a plugin:
https://github.com/getpelican/pelican/issues/2143
At this point, I decided to explore plugins.
[60m] Scaffolding and plugin structure.
Looking through the plugin docs, writing a plugin seemed much easier than trying to read in the pelican posts myself. I had limited success instantiating a pelican reader object directly, as it expects specific configuration variables.
So I started authoring a real package. Copying in the package scaffolding like setup.py from another repo, I added the minimum integration I needed to actually install the plugin into pelican and run it.
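For readers who have not written a pelican plugin, the minimum integration can be sketched roughly as follows. This is not the actual pelican-export code: the handler names, the `collected` list, and the choice of the `content_object_init` signal are assumptions made for illustration. What pelican itself provides is the `register()` entry point convention and the `pelican.signals` module.

```python
"""Minimal sketch of a pelican plugin module (hypothetical, not pelican-export itself)."""

# Content objects seen during the build; filled in by the signal handler.
collected = []


def collect_content(content):
    """Handler for content_object_init: remember each content object.

    The signal fires for every content object (articles, pages, ...),
    so a real plugin would likely filter here.
    """
    collected.append(content)


def on_finalized(pelican_instance):
    """Handler for finalized: runs once, at the end of the pelican run."""
    for content in collected:
        # A real exporter would push each post to its destination here.
        print(f"would export: {getattr(content, 'title', '<untitled>')}")


def register():
    """Entry point that pelican calls when it loads the plugin."""
    # Imported here so the handlers above can be exercised without pelican installed.
    from pelican import signals

    signals.content_object_init.connect(collect_content)
    signals.finalized.connect(on_finalized)
```

With a module like this installed, adding its name to the `PLUGINS` list in the pelican settings file should be enough for pelican to call `register()` on the next build.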
[60m] Rapid iteration with pdb.
At that point, I added a pdb statement into the integration, so I could quickly look at the data structures. Using that I crafted the code to migrate post formats in a few minutes:
<code>def process_post(self, content) -> Optional[WordPressPost]:
    """Create a wordpress post based on pelican content"""
    if content.status == "draft":
        return None
    post = WordPressPost()
    post.title = content.title
    post.slug = content.slug
    post.content = content.content
    # this conversion is required, as pelican uses a SafeDateTime
    # that python-wordpress-xmlrpc doesn't recognize as a valid date.
    post.date = datetime.fromisoformat(content.date.isoformat())
    post.term_names = {
        "category": [content.category.name],
    }
    if hasattr(content, "tags"):
        post.term_names["post_tag"] = [tag.name for tag in content.tags]
    return post</code>
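The date round trip in that snippet is worth a closer look. The sketch below uses a stand-in `datetime` subclass (`CustomDateTime` is hypothetical, playing the role of pelican's own date type) to show how passing through an ISO 8601 string yields a plain `datetime` with the same value.

```python
from datetime import datetime, timezone


class CustomDateTime(datetime):
    """Stand-in for a library-specific datetime subclass (hypothetical)."""


def to_plain_datetime(value: datetime) -> datetime:
    """Return a plain datetime with the same value, dropping any subclass."""
    # isoformat() serializes to a string; fromisoformat() called on the
    # base class builds a plain datetime from that string.
    return datetime.fromisoformat(value.isoformat())


original = CustomDateTime(2020, 5, 17, 9, 30, tzinfo=timezone.utc)
plain = to_plain_datetime(original)

print(type(original).__name__)  # CustomDateTime
print(type(plain).__name__)     # datetime
print(plain == original)        # True
```

The string round trip is not the only way to do this, but it is a one-liner and keeps the timezone offset intact.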
I added a similar pdb statement to the “finalized” pelican signal, and tested the client with hard-coded values. I was done as far as functionality was concerned!
[180m] Code cleanup and publishing
The bulk of my time after that was just smaller cleanup that I wanted to do from a code hygiene standpoint. Things like:
- [70m] making the WordPress integration an interface, so it’s easy to hook in other exporters.
- [40m] adding a configuration pattern to enable hooking in other exporters.
- [10m] renaming the repo to its final name of pelican-export
- [30m] adding readme and documentation.
- [30m] publishing the package to pypi.
This was half of my time! Interesting how much time is spent just ensuring the right structure and practices for the long term.
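To make the interface and configuration items more concrete, here is one way such a structure could look. It is a sketch under assumptions, not the real pelican-export design: `Exporter`, `InMemoryExporter`, `EXPORTERS`, `build_exporter`, `run_export`, and the shape of the config mapping are all hypothetical names chosen for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Type


class Exporter(ABC):
    """Hypothetical interface that every export target implements."""

    def __init__(self, settings: Dict[str, Any]) -> None:
        self.settings = settings

    @abstractmethod
    def process_post(self, content: Any) -> Any:
        """Convert one content object; return None to skip it."""

    @abstractmethod
    def finalize(self, posts: List[Any]) -> None:
        """Publish the converted posts to the target."""


class InMemoryExporter(Exporter):
    """Toy exporter used only to demonstrate the interface."""

    def __init__(self, settings: Dict[str, Any]) -> None:
        super().__init__(settings)
        self.published: List[Any] = []

    def process_post(self, content: Any) -> Any:
        if content.get("status") == "draft":
            return None
        return {"title": content["title"]}

    def finalize(self, posts: List[Any]) -> None:
        self.published.extend(posts)


# Registry of available exporters, keyed by the name used in configuration.
EXPORTERS: Dict[str, Type[Exporter]] = {"memory": InMemoryExporter}


def build_exporter(config: Dict[str, Any]) -> Exporter:
    """Instantiate the exporter named in a config mapping."""
    exporter_cls = EXPORTERS[config["type"]]
    return exporter_cls(config.get("settings", {}))


def run_export(exporter: Exporter, contents: List[Any]) -> None:
    """Convert every content object, drop skipped ones, then publish."""
    posts = [exporter.process_post(c) for c in contents]
    exporter.finalize([p for p in posts if p is not None])


exporter = build_exporter({"type": "memory", "settings": {}})
run_export(
    exporter,
    [
        {"title": "Hello", "status": "published"},
        {"title": "WIP", "status": "draft"},
    ],
)
print(exporter.published)  # [{'title': 'Hello'}]
```

The point of the split is that adding a new target means writing one subclass and one registry entry, while the code that walks the posts stays untouched.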
Takeaways
I took every shortcut in my book to arrive at something functional, as quickly as I could. Techniques that saved me tons of time were:
- Looking for prior art. Brainstorming how to do the work myself would have meant investigating potential avenues and evaluating how long each would take. Having an existing example, even if it didn’t work for me, helped me ramp up on the problem quickly.
- Throwing code away. I had a significant amount of modified code in my forked exporter. But continuing down that route would have involved a significant investment in hacking on and understanding the pelican library. Seeing that the plugin route existed, and testing it out, saved me several hours of trying to hack an interface to private pelican APIs.
- Using pdb to live write code. In Python especially, there’s no replacement for just introspecting and trying things. Authoring just enough code to integrate as a plugin gave me a fast feedback loop, and throwing in a pdb statement to quickly learn the data structure helped me find the ideal structure in about 10 minutes.
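As a rough illustration of the pdb technique, the sketch below drops into the debugger from inside a hook and shows the kind of introspection one would do at the prompt. The `handle_content` hook, the `describe` helper, and the `DEBUG_EXPORT` environment variable are hypothetical; only `breakpoint()`, `vars()`, and `dir()` are standard Python.

```python
import os
from types import SimpleNamespace


def describe(obj):
    """Map each public attribute name to the type name of its value."""
    return {
        name: type(value).__name__
        for name, value in vars(obj).items()
        if not name.startswith("_")
    }


def handle_content(content):
    """Hypothetical hook where the debugger would be dropped in."""
    if os.environ.get("DEBUG_EXPORT") == "1":
        # Opens pdb here; try `pp vars(content)` or `dir(content)` at the prompt.
        breakpoint()
    return describe(content)


sample = SimpleNamespace(title="Hello", slug="hello", tags=["python"])
print(handle_content(sample))  # {'title': 'str', 'slug': 'str', 'tags': 'list'}
```

Gating the breakpoint behind an environment variable is a design choice for this sketch, so that the same code can run unattended. For a quick one-off session, a bare `breakpoint()` that gets deleted afterwards works just as well.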
There was also a fair bit of Python expertise that I used to drive down the coding time, but what’s interesting is that the biggest contributors to time savings were about process: knowing how to pick the right approach to the code, and iterating quickly, helped me get this done in effectively a single work day.