The Case for Better Build Systems

A lot of my work, these days, focuses on figuring out how to improve how people develop software in ways that reduce the amount of time developers have to spend doing work outside of development and that improve the quality of their work. This post has been sitting in my drafts folder for the last year. It does a good job of explaining how I locate my work *and* makes a case for high quality generic build system tooling that I continue to feel is compelling.


Incidentally, it turns out that I wrote an introductory post about build systems 6 years ago. Go past me.

Canonically, build systems describe the steps required to produce artifacts as a system (graph) of dependencies [1]. These dependencies run between source files (code) and artifacts (programs and packages,) with intermediate artifacts along the way, all expressed in terms of the files they are or create. Though different development environments, programming languages, and kinds of software have different.
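To make the graph model concrete, here is a minimal sketch in Python. The file names and the `build_order` helper are hypothetical, invented for illustration; the only real API used is the standard library's `graphlib`. It maps each artifact to the files it depends on and asks for an order in which everything can be produced, and it refuses cyclic graphs rather than ignoring them.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical project: each key is an artifact, each value is the set of
# files it is built from (its direct dependencies).
DEPENDENCIES = {
    "app": {"main.o", "util.o"},
    "main.o": {"main.c", "util.h"},
    "util.o": {"util.c", "util.h"},
}


def build_order(graph):
    """Return one valid order in which to produce every node in the graph."""
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as err:
        # The second element of args holds the nodes that form the cycle.
        raise ValueError(f"cyclic dependency: {err.args[1]}") from err


order = build_order(DEPENDENCIES)
print(order)
```

Any order this returns puts sources before the objects compiled from them, and objects before the program that links them. A real build tool does far more, but the underlying data structure is about this simple.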

While the canonical view is that "build systems are about producing files," the truth is that the challenge of contemporary _software_ development isn't really just about producing files. Everything from test automation to deployment is something that we can think about as a kind of build system problem.

Let's unwind for a moment. The work of "making software" breaks down into a collection of--reasonably disparate--tasks, which include:

  • collecting requirements (figuring out what people want,)
  • project planning (figuring out how to break larger collections of functionality into more reasonable units.)
  • writing new code in existing code bases.
  • exploring unfamiliar code and making changes.
  • writing tests for code you've recently written, or for areas of the code base that have recently changed.
  • rewriting existing code with functionally equivalent code (refactoring,)
  • fixing bugs discovered by users.
  • fixing bugs discovered by an automated test suite.
  • releasing software (deploying code.)

Within these tasks developers do a lot of little experiments and tests: make a change, then see what its impact is by doing something like compiling the code, running the program, or running a test program. The goal, therefore, of developer productivity projects is to automate these processes and shorten the time it takes to do any of these tasks, in particular the feedback loop between "making a change" and seeing if that change had an impact. The more complex the system that you're developing, with regards to distinct components, dependencies, target platforms, compilation model, and integrations, the more time you spend in any one of these loops and the less productive you can be.

Build systems are uniquely positioned to support the development process: they're typically purpose-built per project (sharing common infrastructure,) most projects already have one, and they provide an ideal environment for the kind of incremental development of additional functionality and tooling. The model of build systems: the description of processes in terms of dependency graphs and the optimization for local environments means.

The problem, I think, is that build systems tend to be pretty terrible, or at least many suffer a number of classic flaws:

  • implicit assumptions about the build or development environment which make it difficult to start using.
  • unexpressed dependencies on services or systems that the build requires to be running in a certain configuration.
  • improperly configured dependency graphs which end up requiring repeated work.
  • incomplete expression of dependencies which require users to manually complete operations in particular orders.
  • poorly configured defaults which make for overly complex invocations for common tasks.
  • operations distributed among a collection of tools, with little integration between compilation, test automation, release automation, and other functions.

By improving the quality, correctness, and usability of build systems, we:

  • improve the experience for developers every day,
  • make it really easy to optimize basically every aspect of the development process,
  • reduce the friction for including new developers in a project's development process.

I'm not saying "we need to spend more time writing build automation tools" (like make, ninja, cmake, and friends,) or that the existing ones are bad and hard to use (they, by and large, are,) but that they're the first and best hook we have into developer workflows. A high quality, trustable, tested, and easy to use build system for a project makes development easier, makes continuous integration easy and maintainable, and ultimately improves the ability of developers to spend more of their time focusing on important problems.

[1] Ideally build systems describe a directed acyclic graph, though many projects have build systems with cyclic dependency graphs that they ignore in some way.

In Favor of Fast Builds

This is an entry in my loose series of posts about build systems.

I've been thinking recently about why I've come to think that build systems are so important, and this post is mostly just me thinking aloud about this issue and related questions.

Making Builds Efficient

Writing a build system for a project is often relatively trivial: once you capture the process and figure out the base dependencies, you can write scripts and makefiles to automate it. The problem is that the most rudimentary build systems aren't terribly efficient, for two main reasons:

1. It's difficult to stumble into a build process that is easy to parallelize, so these rudimentary solutions often depend on a series of steps happening in a specific order.

2. It's easier to write a build system that rebuilds too much rather than too little for subsequent builds. From the perspective of build tool designers, this is the correct behavior; but it means that it takes more work to ensure that you only rebuild what you need to.
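The first point can be illustrated with a small sketch. The step names and the `parallel_batches` helper are hypothetical; the sketch uses Python's standard `graphlib` to group steps into batches, where every step in a batch could run at the same time. The same four steps are described twice: once as a strict sequence, and once with only their real dependencies declared.

```python
from graphlib import TopologicalSorter


def parallel_batches(graph):
    """Group nodes into batches; everything in a batch can run concurrently."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        # get_ready() returns every node whose dependencies are all done.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Hypothetical steps written as a strict sequence: each waits on the last.
SEQUENTIAL = {"b": {"a"}, "c": {"b"}, "d": {"c"}}

# The same steps with only their real dependencies declared.
EXPLICIT = {"b": {"a"}, "c": {"a"}, "d": {"b", "c"}}

print(parallel_batches(SEQUENTIAL))  # [['a'], ['b'], ['c'], ['d']]
print(parallel_batches(EXPLICIT))    # [['a'], ['b', 'c'], ['d']]
```

The sequential description needs four rounds no matter how many cores are available; the explicit one needs three, because "b" and "c" no longer wait on each other. On a real project the difference is usually much larger.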

As a corollary, you need to test build systems and approaches with significantly large systems, where "rebuilding too much" is detectable.

Making a build system efficient isn't too hard, but it does require some amount of testing and experimentation, and often it centers on having explicit dependencies, so that the build tool (e.g. Make, SCons, Ninja, etc.) can build output files in the correct order and only build when a dependency changes. [1]
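The "only build when a dependency changes" check is, at its simplest, a timestamp comparison. The following is a rough sketch of that idea in Python, not the implementation used by any particular tool; `is_stale` is a hypothetical helper name.

```python
import os


def is_stale(target, dependencies):
    """Return True if target is missing or older than any of its dependencies."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    # Rebuild when at least one input was modified after the output was.
    return any(os.path.getmtime(dep) > target_mtime for dep in dependencies)
```

The check is only as good as the dependency list passed to it: leave an input out and the target will not rebuild when it should; list too many and it rebuilds when it doesn't need to.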

The Benefits of a Fast Build

  1. Fast builds increase overall personal productivity.

    You don't have to wait for a build to complete, and you're not tempted to context switch during the build, so you stay focused on your work.

  2. Fast builds increase quality.

    If your build system (and to a similar extent, your test system,) runs efficiently, it's possible to detect errors earlier in the development process, which will prevent errors and defects. A tighter feedback loop on the code you write is helpful.

  3. Fast builds democratize the development process.

    If builds are easy to run, and require minimal cajoling and intervention, it becomes much more likely that many people

    This is obviously most prevalent in open source communities and projects, but it is probably true of all development teams.

  4. Fast builds promote freshness.

    If the build process is frustrating, then anyone who might run the build will avoid it and run the build less frequently, and on the whole the development effort loses important feedback and data.

    Continuous integration systems help with this, but they require significant resources, are clumsy solutions, and above all, CI attempts to solve a slightly different problem.

Optimizing Builds

Steps you can take to optimize builds:

(Note: I'm by no means an expert in this, so feel free to add or edit these suggestions.)

  • A large number of smaller jobs that can complete independently of other tools are easy to run in parallel. If the jobs that create a product take longer and are more difficult to split into components, then the build will be slower, particularly on more powerful hardware.
  • Incremental builds are a huge win, particularly for larger processes. Most of the reasons why you want "fast builds" only require fast rebuilds and partial builds, not necessarily the full "clean builds." While fast initial builds are not unimportant, they account for a small percentage of use.
  • Manage complexity.

There are a lot of things you can do to make builds smarter, which should theoretically make builds faster.

Examples of this kind of complexity include storing dependency information in a database, or using hashing rather than "mtime" to detect staleness, or integrating the build automation with other parts of the development tool chain, or using a more limited method to specify build processes.
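As an illustration of the hashing approach, here is a minimal sketch that records a SHA-256 digest for each input in a small JSON file and reports which inputs have changed since the last run. The helper names and the JSON state file are assumptions made for the example; real tools keep this information in their own formats.

```python
import hashlib
import json
import os


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def changed_inputs(paths, state_file):
    """Return the inputs whose content differs from the recorded digests.

    The record in state_file is rewritten with the current digests.
    """
    previous = {}
    if os.path.exists(state_file):
        with open(state_file) as handle:
            previous = json.load(handle)
    current = {path: file_digest(path) for path in paths}
    with open(state_file, "w") as handle:
        json.dump(current, handle)
    return [path for path in paths if previous.get(path) != current[path]]
```

Unlike an "mtime" check, this ignores a file that was touched or checked out again without its content changing. The cost is that every input has to be read on every run, and there is now a state file that can go missing or out of date.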

The problem, or the potential problem, is that you lose simplicity, and it's possible that something in this "smarter and more complex" system can break or slow down under certain pressures, or can have enough overhead to render these optimizations unproductive.

[1] It's too easy to use wild-cards so that the system must rebuild a given output if any of a number of input files change. Some of this is unavoidable, and generally there are more input files than output files, but particularly with builds that have intermediate stages, or more complex relationships between files, it's important to attend to these files.