DIY Static Site Generator in Python
Tagged: python software
A huge burden has been lifted from my shoulders thanks to changing my blog over from a Rails 4 application to a custom static site generator. I want to tell you about it because I believe it illustrates a transition common to many novice developers of my generation - the transition from high to low abstraction, from building with blocks to building with tools, from lots of complexity to a minimal set of moving parts.
I'll start with the juicy part: why? Why are these transitions even any good?
I feel like most developers in my generation start climbing the mountain from top to bottom. There are tons of frameworks and ready-made blocks that can get you up and running in a matter of hours. I personally started with Michael Hartl's Ruby on Rails Tutorial back when Rails 3.2 was out. In it you create a simple Twitter clone in a matter of hours. You don't even know what's happening - all the database stuff, all the HTTP stuff, all the transformations of data from format A to B are taken care of. It's really great, it's DRY, and it's fast. It's also a great method for creating a blog, like I did 2 years ago. Nowadays Rails is old and people start with React or Meteor or other new and shiny things. These things pick up where Rails left off and take you another step above the mundane low-level bits and pieces.
This isn't bad - delivering working code faster means winning. However, you quickly hit obstacles as soon as you build something other than a Twitter clone or a blog. These obstacles are things like aligning application logic with business logic, technical debt, scaling, adding features, switching environments, recovering from outages, etc. They are a combinatorial explosion and trying to fit them all in your head won't really work. You have to shift a level on the Dreyfus model. You gotta go down from, say, ActiveRecord, and into SQL-land. Or from a Controller to the HTTP protocol. Branch out from churning out Rails-flavored OOP to functional programming in Python. Replace a million 3rd party libraries with a few hundred lines of hand-crafted code.
Having done this, you will then understand why you want to use a relational database vs. a NoSQL solution for a particular problem, why you might want to ditch MVC for something else, why your server is taking 400ms instead of 50ms to produce a response. You're in a position to understand how the amazing power tools you've been using really work and whether sometimes you might just do with a simple hand-operated drill instead.
Simplicity is the game. Simpler machines will be easier to maintain and faster to build. More likely than not, they will also function better. We're talking more time for new features for your product, more time writing that blog instead of maintaining it, more time for contributing to open source projects instead of wrestling with dependencies, and the list goes on. Most coders have an eye for elegance and simplicity is a major dimension of that quality.
Why even strive towards this? For me it's about unlocking the doors to building even cooler things. Plus it also means you gain the ability to help out others on a similar path. Let's not forget about the satisfaction of a job well done, too.
Before, my blog had a Gemfile that listed a bunch of dependencies that had to be satisfied. Each one was a ticking security- or bug-bomb. Then there was the database itself and everything that it entails - another process running on a server, another vendor to be locked in with (MySQL shivers). Finally, there was the application logic itself along with everything related to presenting it - css/js minifiers, unit tests, route definitions, etc. We're talking about a freakin' blog here! The CRUDest of the CRUD.
Now, after pruning the idea of a blog down into its components, I'm left with under 400 lines of code with two 3rd party libraries (one for parsing markdown and one for compiling templates), no database at all (the file system's tree structure handles all of that), no asset pipelines (it's a few handfuls of js and css), no deployment script (ok, it's one line of bash that uses rsync). I've tried to stick to a somewhat functional style that aids in transforming a tree of markdown files into a tree of html documents. Transformations and trees work wonderfully with the functional style and Python supports it enough to make it work. I've yet to add tags to posts and really flesh out the non-blog related functionality, but it all fits into this idea of transforming trees so I'm really looking forward to it.
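To make the tree-to-tree idea a bit more concrete, here is a minimal sketch of what such a transformation could look like. It is not the actual code behind this blog: the names `to_html`, `target_path`, `render_tree` and `build` are made up for illustration, the template is a stdlib `string.Template` rather than a real template library, and `to_html` is a crude placeholder that only wraps paragraphs where a real build would call a markdown parser.

```python
from html import escape
from pathlib import Path
from string import Template

# Hypothetical page template; a real site would use a template library here.
PAGE = Template("<html><body>$body</body></html>")


def to_html(text):
    """Placeholder converter (an assumption, not real markdown parsing):
    wraps each blank-line-separated block in a <p> element."""
    blocks = (block.strip() for block in text.split("\n\n"))
    return "\n".join(f"<p>{escape(block)}</p>" for block in blocks if block)


def target_path(src, src_root, out_root):
    """Map src_root/a/b.md to out_root/a/b.html, mirroring the source tree."""
    return out_root / src.relative_to(src_root).with_suffix(".html")


def render_tree(src_root, out_root, convert=to_html):
    """Lazily yield (output path, html) pairs for every .md file under src_root."""
    for src in sorted(src_root.rglob("*.md")):
        body = convert(src.read_text(encoding="utf-8"))
        yield target_path(src, src_root, out_root), PAGE.substitute(body=body)


def build(src_root, out_root, convert=to_html):
    """Write the rendered tree to disk and return the list of written paths."""
    written = []
    for out, html in render_tree(Path(src_root), Path(out_root), convert):
        out.parent.mkdir(parents=True, exist_ok=True)
        out.write_text(html, encoding="utf-8")
        written.append(out)
    return written
```

Calling something like `build("content", "public")` (both directory names are hypothetical) would mirror every markdown file under `content` as an html file under `public`. The only side effects live in `build`; everything else is a plain function from input to output, which is what makes swapping in a real markdown converter via the `convert` argument straightforward.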
Maintenance is reduced and I don't need to service 3rd party dependencies at all. I've offloaded all the actual work onto Nginx, which is pretty darn good at serving static files. Plus, now I can host this blog anywhere that can serve static files. This all boils down to one thing - I have more time for things that really matter.
You can check out the code here. There's still some work to be done, but I'm happy to be able to write again.