My Experiences Creating a Build System

It's a meme!

Ian Bicking created an article with the same title as this article where he describes writing a build system to deploy software. And nothing spells fun like everybody piling on with their own experiences creating a build system!

I created Buildit after taking a run at the same problem via pymake. Essentially both are Make clones where the control system is Python rather than some domain-specific language. Pymake was no fun to use because it left the problem of properly computing dependent variable replacements up to the user. For example, if you have a path named "base_dir", pymake would make its users join that explicitly in Python to create "base_dir"-relative variables "etc_dir", "var_dir", etc. Buildit does a much better job of this: the job can be done in the configuration language itself, and so Buildit supplanted Pymake.
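
To make the "dependent variable replacement" problem concrete, here is a minimal sketch of resolving "base_dir"-relative values from configuration alone. The `${name}` reference syntax and the `resolve` helper are assumptions made up for this illustration; they are not Buildit's or pymake's actual syntax or API.

```python
import re

# Hypothetical reference syntax: ${name}. Not Buildit's real syntax.
_REF = re.compile(r"\$\{(\w+)\}")


def resolve(variables):
    """Expand ${name} references until every value is fully resolved."""
    resolved = {}

    def expand(name, stack=()):
        if name in resolved:
            return resolved[name]
        if name in stack:
            chain = " -> ".join(stack + (name,))
            raise ValueError("circular reference: " + chain)
        if name not in variables:
            raise KeyError(name)
        # Recursively expand each reference found in this variable's value.
        value = _REF.sub(
            lambda m: expand(m.group(1), stack + (name,)),
            variables[name],
        )
        resolved[name] = value
        return value

    for name in variables:
        expand(name)
    return resolved


config = {
    "base_dir": "/opt/app",
    "etc_dir": "${base_dir}/etc",
    "var_dir": "${base_dir}/var",
    "log_dir": "${var_dir}/log",
}
paths = resolve(config)
```

With something like this, the user only writes the declarative `config` mapping; nobody has to join paths by hand in Python, and a circular definition fails before any build task runs.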

Buildit is a perhaps slightly-too-literal port of Make to Python. It's perhaps a better library on which to base a proper developer build system than it is a build system in its own right. When I developed it, I saw building software environments in terms of Make-like tasks. Since then I've seen other ways to do it, which has been useful. If I were to do it all over, I'd do it differently.

Things I like about buildit:

  • Familiarity: it's a Make-like system in Python and shares some concepts with Make.
  • Flexibility: because it's so general, it can do just about anything.
  • Speed. A properly-written buildit build will finish in a few milliseconds if it doesn't need to do anything (if it has already been run once and nothing needs changing).
  • Variable resolution. It does a pretty good job at resolving variable values before any build tasks start. zc.buildout also does well at this.
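
The speed point above comes from the Make-style freshness check: a task is skipped when its target already exists and is at least as new as everything it depends on. A rough sketch of that idea follows; `is_up_to_date` and `run_task` are hypothetical names for illustration, not functions from buildit.

```python
import os


def is_up_to_date(target, dependencies):
    """Return True when target exists and is no older than any dependency."""
    if not os.path.exists(target):
        return False
    target_mtime = os.path.getmtime(target)
    return all(os.path.getmtime(dep) <= target_mtime for dep in dependencies)


def run_task(target, dependencies, action):
    """Run action only if target is missing or stale; report whether it ran."""
    if is_up_to_date(target, dependencies):
        return False
    action()
    return True
```

When nothing has changed, each task costs only a few `stat` calls, which is why a no-op run can finish almost immediately.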

Things I like about zc.buildout:

  • Reusable build steps are packaged as "recipes" and usable without installing any software at all.
  • Creating a build usually doesn't imply editing any code at all; instead you create declarative configuration. It's even more declarative than a Makefile. This is a good fit to hand off to customers.
  • I don't have to think much about whether I'm doing it the right way or not. Not My Job; existing recipes do it the way they do it. Drool.
  • I can usually run any given build over and over and it typically does the right thing (although that might take a long time). This is also true of properly-written buildit and Make recipes, but it's easier to create poorly-written ones in buildit/Make that do things you don't expect on the second and third runs.

Things I don't like about buildit:

  • Ian's right: buildit's "namespace-aware" variable substitution stuff is suboptimal. There are a lot of "save me typing" shortcuts in the way that variables get referred to in replaceable task elements, which can cause confusion. It's hard to figure out just what a task is going to do if you put it in more than one namespace. If you need a task to do something slightly different in one namespace than in some other, you often don't know whether you need to change Python code or an .INI file somewhere or what. Ian also complained that he couldn't make a variable value dynamic based on the OS environment and therefore had to know about namespaces and such, and he's of course right. Before I got sucked into the zc.buildout world, that's something I wanted to change.
  • It has a complex end user "UI". You need to know about both Python and its own little mini DSL and sometimes even the buildit internals in order to really understand what's going on.

Things I don't like about zc.buildout:

  • Script generation. I'd rather it made N virtualenvs, if only so you could get out of buildout-land for easier experimentation when you needed to temporarily try out a package in the working environment. One of these days, I'll get around to writing a zc.buildout recipe that does this as a replacement for zc.recipe.egg.

But all in all, using zc.buildout has been a pretty nice experience. I suppose there's a bit of "I finally get it" hiding in that observation.

I was never completely comfortable with the way buildit worked. There's probably a better way to do all of it. I dunno what that is. It's hard to make a system that is meant to be extended to do imperative things from a completely declarative configuration. zc.buildout doesn't really do all of it right either, but it gets it more right than buildit does as an extensible framework. In any case, I've mostly given up on buildit, and we've switched over to using zc.buildout for new builds. I am usually the "build bitch" on any given project we do, so I get to choose the tool. Any innovation I put into a build system will likely go into creating zc.buildout recipes.

Ian describes fassembler as "does a list of things, top to bottom [ .. with ... ] no effort [ .. towards ..] detecting changes in the build, or changes in the settings, or anything else". In that case, why bother writing a build system at all instead of just using a shell script? I don't think Ian actually meant this; I think fassembler does detect changes in the build (is this file already written? do I need to make this symlink? etc). So it does make the same sorts of decisions that buildit and Make do in order to skip work, because redoing it would cause more problems than it solved. Maybe it is just more interactive about how to deal with conflicts: it asks the user. That's sort of ok by me. I'd actually rather that it just gave up and exited so it could be used in a buildbot sort of thing, but it's not really all that important either way.
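
The "is this file already written?" and "do I need to make this symlink?" checks can be sketched as small idempotent steps that look at the current state before acting. This is only an illustration of the idea; `ensure_file` and `ensure_symlink` are hypothetical helpers, not taken from fassembler, buildit, or zc.buildout, and the sketch assumes a platform where symlinks are available.

```python
import os


def ensure_file(path, content):
    """Write content to path unless it already holds exactly that content."""
    if os.path.exists(path):
        with open(path) as f:
            if f.read() == content:
                return False  # already correct, nothing to do
    with open(path, "w") as f:
        f.write(content)
    return True


def ensure_symlink(source, link):
    """Point link at source, replacing a stale link; leave a correct one alone."""
    if os.path.islink(link):
        if os.readlink(link) == source:
            return False  # already correct, nothing to do
        os.remove(link)
    elif os.path.exists(link):
        # A real file is in the way: give up rather than guess.
        raise RuntimeError("%s exists and is not a symlink" % link)
    os.symlink(source, link)
    return True
```

Raising on a conflict corresponds to the "give up and exit" behaviour preferred above; an interactive tool would ask the user at that point instead.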

I think continued innovation in Python build systems is important. Nothing we have is near perfect.

The most important features of a development build system (as we use it) are:

  • Being able to run a given build over and over again without needing to start from scratch, and having the result of every run match the intent of the build config. This deals with the fact that N developers need a way to get "the latest stuff" without installing it by hand. "The latest stuff" changes rapidly and frequently in the first few weeks of a project, and it's not just custom code that changes; it's library dependencies and such.
  • Being able to preserve data between build runs (e.g. database tables).
  • Having the build take a reasonable amount of time each time it runs. It doesn't have to take absolutely the shortest amount of time, but it can't take the full from-scratch build time (e.g. 20 minutes) on every invocation.

Everything else is sort of secondary. We usually end up using the development build system to deploy to production too, and there are some different requirements driving "important features" there, but most of the value we get out of automated builds is by avoiding fallout from manually configured environments on developer systems.

Created by chrism
Last modified 2008-06-21 11:28 AM

shell scripts

When you asked "why not use a shell script" it reminded me why not, and I added another section to that post. Shell scripts swallow errors by default, and if you do the Right Thing with respect to errors the result is still horrible error handling. Build systems have to handle errors right.

This was probably a big part of my frustration with BuildIt. When I used zc.buildout it was also horrible, though I'm sure it's improved some (though whether it has improved *enough* is a separate issue).
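
As an illustration of the error-handling point, here is a minimal sketch of a step runner that stops at the first failing command instead of carrying on the way a plain shell script does by default. `run_step` and `run_build` are hypothetical names; this is not how fassembler, BuildIt, or zc.buildout actually implement it.

```python
import subprocess
import sys


def run_step(argv):
    """Run one build command; raise CalledProcessError on a nonzero exit."""
    subprocess.run(argv, check=True)


def run_build(steps):
    """Run steps in order, stopping at the first failure. Returns an exit code."""
    for argv in steps:
        try:
            run_step(argv)
        except subprocess.CalledProcessError as exc:
            message = "step failed (exit %d): %s" % (exc.returncode, " ".join(argv))
            print(message, file=sys.stderr)
            return exc.returncode
    return 0
```

Because the failure arrives as an exception carrying the command and exit status, the build can report exactly which step broke rather than continuing with a half-built environment.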

my experiences using ian's build system

I assume you weren't serious about the shell script comment. Fassembler is about 4000 lines of python, not counting specific builds, requirements files, and the like.

Fassembler is by no means perfect; I've listed some things I dislike about it in my comments on Ian's blog.
But as it's matured I've been pretty happy with it.

For some reason people can't seem to wrap their heads around "no effort [ .. towards ..] detecting changes in the build, or changes in the settings, or anything else."

We can do that because we leverage existing tools. As Ian said, "We don't have a dependency system because we run all steps on every build." I think maybe people are making a mental leap that it's all fassembler all the way down. Of course not. A step to build Zope 2 might be "./configure && make". That's already going to be pretty fast on the second run. Similarly, "easy_install foopackage" is pretty fast on the second run.

There are a few ad-hoc freshness checks in our actual build, as I noted in my comment on Ian's post. In practice this is needed rarely enough, and is easy enough, that fassembler doesn't try to abstract it. There are some conveniences for common tasks that need to preserve data, e.g. there's a fassembler task to create a mysql database with a given name only if it doesn't exist.

As for the interactive questions and "I'd actually rather that it just gave up and exited so it could be used in a buildbot sort of thing": there is a --no-interactive command line option, which translates to "everywhere we need an answer, just use the default choice provided by the build author, and if there is no default, die." We use this with buildbot without problems. The interactive features are most useful for production deployments and upgrades, where you really want somebody paying attention and making decisions. It could use some improvement, like getting smarter about detecting trivial things that the user's always going to approve.
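
The rule described above ("use the default choice provided by the build author, and if there is no default, die") could be sketched roughly as follows. The `ask` function, its parameters, and `NoDefaultError` are hypothetical names for this illustration, not fassembler's real API.

```python
class NoDefaultError(Exception):
    """Raised when a non-interactive run reaches a question with no default."""


_NO_DEFAULT = object()  # sentinel so that None can be a legitimate default


def ask(question, default=_NO_DEFAULT, interactive=True, reader=input):
    """Return the user's answer, or the build author's default."""
    if not interactive:
        if default is _NO_DEFAULT:
            raise NoDefaultError("no default for: " + question)
        return default
    while True:
        answer = reader(question + " ").strip()
        if answer:
            return answer
        if default is not _NO_DEFAULT:
            return default  # empty input accepts the default
        # No default and no answer: ask again.
```

Under buildbot the build would call `ask(..., interactive=False)` everywhere, so a missing default surfaces as an immediate failure instead of a hung prompt.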

Rebuilds may be slower than a properly written makefile or buildit config, but they're plenty fast enough:
On my home desktop, bootstrapping fassembler itself takes about 30 seconds.
Building opencore and its dependencies takes 243 seconds.
Re-building it with --no-interactive takes 31 seconds.