Buildcloth v0.2.0 Release

Today I did a (slightly) more formal release of a software project that I’ve been working on pretty consistently for the last two months. It’s an extension and elaboration of buildcloth, and the groundwork for some other projects I’ve been working on.

While there are some new fixes and improvements to the initial meta-build tool components of the project, which I was working on five months ago, this release goes even further and includes a complete build automation tool.

The deal with this is that I’d been running a build system for months that had a bunch of very small tasks, and performance was awful for no really good reason. Well, for one reason: process creation. Each task needed to create its own shell, run, and exit, which was awful. The solution was to run each task (which was ultimately just a Python function) in a Python multiprocessing pool. The new version of buildcloth is an attempt to build some common infrastructure around this practice.
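The core of the pattern is tiny; here’s a minimal sketch (illustrative names, not buildcloth’s actual API):

```python
import multiprocessing

def render_page(source):
    # A build task is just a Python function: no new shell,
    # no interpreter startup, per task.
    return source.upper()  # stand-in for real build work

if __name__ == "__main__":
    sources = ["a.txt", "b.txt", "c.txt"]
    # Worker processes are created once and reused for every task,
    # instead of paying process-creation overhead per task.
    with multiprocessing.Pool() as pool:
        results = pool.map(render_page, sources)
    print(results)
```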

It still needs some real-world testing, there are missing features I’d like to add, and there’s always more documentation to write, but it’s good enough that I wanted to get it out there so that people could start using it and giving feedback.

I’ll post more on the experience and lessons learned here in a bit.

Onward and Upward!

Uphill Forever

In many ways this is the follow-up to hard is good and the post I promised recounting the lessons of the buildcloth v0.2.0 release.

This release of buildcloth is, in some ways, the first real piece of software I’ve written from the ground up. I’ve written a bunch of code: I’ve implemented a decent amount of functionality as extensions and additions to other programs, written some very small programs, and written an endless number of throwaway scripts, but never something quite on this scale. The remainder of this post is a short self-interview about the release and what I learned from it.

What’s the coolest thing about Buildcloth 0.2.0?

The buildsystem feature is pretty awesome, mostly because it makes it possible to have legitimate, honest-to-goodness build systems running within a Python project. Integration is a sweet thing.

Why make another build tool? Aren’t there enough of those already?

I started working on this for a few reasons. First, because the MongoDB documentation build process was lagging under process creation overhead, and using buildcloth as a meta-build system was clearly not holding up.

Second, I wanted to write a static site generator that used a fully concurrent internal model. My initial plan was to use the buildcloth meta-build system, but that clearly wouldn’t hold up at scale, so I needed something more: a real build automation layer.

Finally, there aren’t actually a lot of generalized build automation tools: Make, Ninja, SCons, Waf, and Rake, plus a small cluster of Java tools (Ant, Maven, sbt, Gradle). See the Wikipedia list of build automation tools.

How can I use Buildcloth? Is Buildcloth right for my project?

If you want, you can use Buildcloth as a Make replacement, using the buildc front end. For build systems that already have Python code to wrap or implement build steps, Buildcloth may be much more efficient than using something like Make. For other kinds of builds, the benefits may be less pronounced.

You can also use Buildcloth as a library in your own Python programs if you need a way of running build jobs in a parallel, dependency-aware mode.

I’ve started to think that buildcloth is really a sort of embeddable, small-scale, local version of something like Celery. Or maybe it’s just a collection of decent wrappers around multiprocessing.Pool. Regardless, there aren’t a lot of really intuitive tools around that make async processing/parallel execution easy and fun in Python, so if you need to do this kind of processing work, take a look.

How well tested is Buildcloth?

There’s a complete unit test suite with 400 tests (at last count) that should ensure that things stay stable as the project continues to develop. In this respect buildcloth is really well tested (particularly for a project of its age). In other respects, it’s less well tested.

Now that things are comparatively stable, I’m pretty eager to begin using the buildsystem and make sure all of the higher-level aspects work well.

What aspects need the most improvement?

I need to devise a way to save some state between builds so that buildcloth can check whether a target needs to be rebuilt by seeing if the content of the dependent files has changed. Currently (like many tools), buildcloth rebuilds based on a comparison of modification times from the file system, which don’t necessarily indicate that a rebuild is required.
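For illustration, the state could be as simple as a JSON file mapping dependency paths to content hashes. A sketch of the idea (hypothetical file name and helpers, not anything buildcloth does today):

```python
import hashlib
import json
import os

STATE_FILE = ".build_state.json"  # hypothetical location

def file_hash(path):
    # Hash file contents; unlike mtime, this only changes
    # when the bytes actually change.
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def needs_rebuild(target, dependencies):
    state = {}
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            state = json.load(f)
    if not os.path.exists(target):
        return True
    # Rebuild only if some dependency's content hash changed
    # since the last recorded build.
    return any(state.get(dep) != file_hash(dep) for dep in dependencies)

def record_build(dependencies):
    # Simplified: a real version would merge with existing state.
    state = {dep: file_hash(dep) for dep in dependencies}
    with open(STATE_FILE, "w") as f:
        json.dump(state, f)
```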

This isn’t really a buildcloth problem, but I also find myself frustrated by the error reporting of tasks running inside the multiprocessing pool. I’m thinking of wrapping tasks right before calling them in a way that will capture output and make it easier to kill zombie tasks and dead pools.
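Roughly, the wrapper would funnel a task’s output and any traceback back through the pool as data. A minimal sketch of the sort of thing I mean:

```python
import io
import traceback
from contextlib import redirect_stdout

def wrap_task(task, *args):
    # Run a task inside a worker, capturing its stdout and any
    # traceback so failures surface in the parent process instead
    # of disappearing into the pool.
    buffer = io.StringIO()
    try:
        with redirect_stdout(buffer):
            result = task(*args)
        return {"ok": True, "result": result, "output": buffer.getvalue()}
    except Exception:
        return {"ok": False, "result": None,
                "output": buffer.getvalue() + traceback.format_exc()}
```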

What would you do differently next time?

I wrote the implementation in a very bottom-up sort of way, and as a result the design and the test suite feel a little bottom-up too. In the long term I think it was the right decision, but in the medium term it will lead to some awkwardness.

Furthermore, build systems are fundamentally static, and there’s no good way to “add jobs to the top of the pipe.” I don’t have a good answer to this problem yet, but it shouldn’t be insurmountable.

Hard is Good

In the hard part of software you could easily choose to read the word “hard” as “sucky,” which makes it seem like a big whine on the topic of “polishing your work is hard and annoying,” but really, that’s unfair.

I think it’s probably better to read the word “hard” as “important” rather than “sucky”.

While you may be able to get someone to try your software on the basis of its core implementation or design, people keep using software because it’s reliable (i.e. “has tests”), understandable (i.e. “has documentation”), and easy to operate (i.e. “has a user interface”).

Furthermore, doing the hard work of adding infrastructure to a project is what allows software to grow in awesome ways. When you do the hard work, you make it possible to:

  • add functionality and utility without needing to rewrite large amounts of code.
  • add options and knobs to expose features and behaviors in response to users’ needs.

Basically infrastructure begets agility.

The Hard Part of Software

I’ve been writing a build system tool that allows users to specify concurrent build processes using a lightweight, Python-based system that minimizes overhead.

Progress is decent. I hope to use this to replace the hodgepodge of Fabric and Makefiles for my work and personal projects. I have a decent spec (3 hours), an initial implementation of the internal parts (3 hours), a good first draft of a command line utility (1.5 hours), and internal/API documentation (10 hours), but none of the unit tests or procedural and conceptual documentation. In essence, the hard stuff.

Basically what happened is that I spent a lot of time thinking about the problem, a little bit of time coding, and, if all goes according to plan, a lot of time writing rather droll code and good if uninteresting documentation.

Which is, all things considered, what all software boils down to.

Writing the core implementation is (often) this intense impassioned process that is necessarily flow-like, because there’s a bunch of state that you have to keep hot in your mind while solving hard problems, and if your attention drifts too far, you start breaking things.

Not that flow-like states are the best or only way to write code for core functionality, but it works and it’s enjoyable.

Everything else is different:

  • Writing documentation is an exercise in context switching: you have to read code, or poke at a running program to figure out how it works, and then turn that information on its head so you can tell people how to use it.

    It’s fun, but it’s much more fussy.

  • Writing tests is similarly hard: it’s also about balancing “how it works” and “how it’s used,” but rather than describing something for future users, tests are about defining what constitutes “correctness” and what’s incorrect.1

    Writing test-code is intellectually challenging work, and requires many of the same base skills as writing implementation-code but requires a different kind of focus and thinking.

  • There’s a lot of code that remains once the core logic exists, including: user interfaces, logging, testing, managing edge cases, optimization, and tuning the parameters of the behavior (business logic tweaks).

Which isn’t to say that any particular portion of the work is more or less difficult or important. But if you don’t work in this world every day, it’s easy to see the hard initial work as “the real part of software development” and allow all the other work to sort of fade into the background. Which is unfair, and I think representative of a larger misunderstanding of how software works and gets made.

Another project for another day.

Onward and Upward!


  1. This assumes that you’re not writing code in a test-driven manner, which I think is probably statistically likely, if somewhat non-ideal. ↩︎

Topic Based Authoring Failure

I wrote, a long time ago, about /technical-writing/atomicity, which is (more or less) the same as topic based authoring. Both describe the process of breaking information into the smallest coherent blocks and then using the documentation toolkit to compile the final resource.

Topic based approaches to documentation promise reduced maintenance costs and greater documentation reuse. I’m not sure if anyone’s used “ease of authorship” as an argument in favor of topic based approaches (they’re conceptually a bit difficult for the author), but you get the feeling that it was part of the intention.

The obvious parallel is object orientation in programming, and I think the comparison is useful: both promise reuse and collaboration through modularity and modern tool chains. While object oriented programming predates topic based authoring, both have been around for a while, and even if you aren’t an adherent of object orientation or topic-based authoring, I think it’s impossible to approach programming or documentation without being influenced by these paradigms.

Unless you’re working with a really small resource, without some topic-based approach you end up with redundant documentation that loses consistency and becomes a maintenance nightmare.

The downfalls of “topics” don’t negate their overall utility, but they are significant:

  • topic based authoring makes it harder for non-writers to contribute to the documentation. This makes it more challenging to keep documentation up to date and can hurt overall accuracy.
  • topics force writers to focus on the “micro” documentation at the expense of the “macro” documentation experience. The content is clear, the completeness is good, but the overall experience for users is awful.
  • topic-centrism sometimes leads to deeper hierarchies which leads to duplicated content across the hierarchy as “cousin” nodes address related concepts.

What’s the solution? I’m not sure there is a single one, but:

  • it’s important to avoid duplication by having great support for “single sourcing” (inlining/inclusion) and simple cross referencing.
  • isolate all content in concrete topical units.
  • start with flat global organization and add interpage hierarchy only when necessary.
  • use as much intrapage organization and hierarchy as you need.
  • build great reference material first. Everything else is gloss, and you should layer the gloss on top of strong reference rather than try and build reference under an existing structure.

Build Stages

For work, I’ve been revising our build system so that less of the build definition happens in Makefiles and more of it happens in Python scripts. This post is an elaboration.

I’m a complete partisan of reusing standard tools, and so moving away from Make felt like a big/hard jump. However:

1. Process creation is expensive, and every “job” starts a new shell and process, which takes time.

2. Most of the build logic was in Python anyway: over time most shell lines called Python code rather than commands directly. This seems like a common artifact of more complex build processes.

Beyond this, the generation of the Makefiles itself was encoded in Python code.

For our project, at least, we were indirecting through Make sort of for the hell of it.

3. It turns out that multiprocessing in Python is crazy easy to use.

The transition isn’t complete, of course: we’re still using Make to handle dependencies between groups of tasks, and there’s no particular rush or need to rid ourselves of Make entirely, but the gains are huge. Things build faster, by one or two orders of magnitude in some cases. There’s less flakiness. Rebuild times are much faster, and there are fewer moving parts.

Great win!

This has me thinking about ways of doing build systems in a generic, maintainable way without relying on something like Make. I have a prototype on my laptop at the moment that provides a way to specify build processes concurrently. The rest of this post will be a high level overview of the design of this system. Please provide feedback and enjoy!

Build systems are basically collections of tasks expressed in a graph structure. The tools exist to enforce and encode the graph structure, or, less abstractly, to ensure that tasks run in the proper order. If you’re painting a wall, the build system ensures that you spackle, apply the primer, and then apply the final coat, in that order.
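To make that concrete, here’s a toy version of the wall-painting graph and its ordering in Python (illustrative only, and skipping cycle detection for brevity):

```python
# Tasks mapped to the tasks they depend on: a tiny dependency graph.
graph = {
    "spackle": [],
    "primer": ["spackle"],
    "final coat": ["primer"],
}

def run_order(graph):
    # A simple topological sort: schedule a task only after
    # everything it depends on has already been scheduled.
    done, order = set(), []
    def visit(task):
        for dep in graph[task]:
            if dep not in done:
                visit(dep)
        if task not in done:
            done.add(task)
            order.append(task)
    for task in graph:
        visit(task)
    return order

print(run_order(graph))  # ['spackle', 'primer', 'final coat']
```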

There are, as near as I can tell, three different kinds of relationships among groups of tasks in a build process:

1. There are groups of tasks that don’t depend on each other and can run concurrently with each other.

2. There are some tasks or groups of tasks that must run strictly before or after another group of tasks, never concurrently with it.

3. There are sequences of tasks that must run in a specific order, but can run at the same time as other tasks or sequences of tasks.
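One way to model all three of these relationships explicitly is a staged structure: stages run in order, and the task sequences inside each stage run concurrently. A toy sketch (illustrative, not buildcloth’s actual interface):

```python
import multiprocessing

def spackle(): print("spackle")
def primer(): print("primer")
def trim(): print("trim")
def paint(): print("paint")

def run_sequence(tasks):
    # Within a sequence, tasks run strictly in order (relationship 3).
    for task in tasks:
        task()

def run_build(stages):
    # Stages run one after another (relationship 2); the sequences
    # inside a stage run concurrently with each other (relationship 1).
    for stage in stages:
        with multiprocessing.Pool() as pool:
            pool.map(run_sequence, stage)

if __name__ == "__main__":
    run_build([
        [[spackle, primer], [trim]],  # two independent sequences
        [[paint]],                    # a barrier: runs after stage one
    ])
```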

What Make and related systems do is provide a mechanism to specify “dependency” relationships between files (and, after a fashion, tasks) or groups of files/tasks. Make then takes the dependency information and runs tasks more or less according to one of those patterns. In many ways, my project is an experiment to see if it’s possible to “outsmart Make” by generalizing the kinds of operations and forcing users to specify the concurrency constraints of the tasks explicitly, rather than letting the concurrency emerge out of the dependency graph. Thoughts:

  • This depends on users being able to intuit the actual dependencies abstractly, rather than relying on the emergent properties of Make. Arguably, Make also requires you to think abstractly about the potential concurrent modeling of the build, but allows you to avoid it in some situations.
  • If some large portion of the compilation process relies on external processes, the performance gains will probably be more modest. Process creation is still expensive, but it’s probably marginally cheaper to use subprocess than it is to start a full shell, as in the sketch below.
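For what it’s worth, the difference is just whether a shell sits between you and the program; a small illustration:

```python
import subprocess

# Calling the program directly: one process, no shell in between.
subprocess.run(["echo", "building module.o"], check=True)

# The Make-style route spawns /bin/sh, which then spawns the program:
subprocess.run("echo building module.o", shell=True, check=True)
```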

In addition to the basic machinery, I’ve written a few helper functions to read build definitions from a YAML file, which will produce a usable build system. I’ll release this once I’ve written some tests, there’s better logging, and there’s some basic README-level documentation.

Onward and Upward!

On Generic Build Systems

I spent a bunch of time this week reworking a bunch of my work project’s build system. We’ve gone from having most of the heavy lifting done by Make to doing only the general high-level orchestration with Make and all of the heavy lifting with (simple) custom Python code.

The logic in the previous system was:

  • Make is everywhere, stable, and consistent. In the spirit of making the project as compatible and accessible as possible, it made sense to use common tools and restrict dependencies.
  • Concurrency and parallelism are both super hard, and Make provided a way to model the build in a knowable way and to parallelize the build as much as possible. Before starting this project, I’d spent two years as a writer working with a build system that was not concurrent and ran in one thread, and I was eager to avoid this problem.

It turns out:

  • If you write Makefiles by hand, they’re the inverse of portable, in the same way that “Portable Bash Script” is a thing that can lead only to insanity.

    As Make-based build systems grow, the only thing you can do to preserve your sanity is wrap up all build instructions as scripts of some kind and/or generate the Makefiles programmatically, using some sort of meta-build tool.

    Complexity abounds.

  • Forcing you to model your build process as a graph is actually not a bad thing, and frankly is Make’s strongest selling point. Make doesn’t enforce graphs (and how could it, really?), but if you pay attention to the ordering and build performance, it’s not hard to keep things running in parallel.

    By contrast, Make’s parallel execution is SO BAD. I think the problem is mostly shell/process creation overhead rather than scheduling. Regardless, for build systems with lots of small pieces, you lose a lot to overhead.

So I took the logic that I’d been using to generate the Makefiles, implemented some simple mtime-based dependency checking, and used it to call the build functions directly in Python.
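The mtime check itself is the standard Make-style test; something like this sketch (not the exact code):

```python
import os

def is_stale(target, dependencies):
    # Rebuild if the target is missing or any dependency was
    # modified after the target was last built.
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(dep) > target_mtime
               for dep in dependencies)
```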

The results were huge: speed gains of 300x, using 30% of the code, and better processor utilization. Can’t argue with that. Even the steps that required external (i.e. non-Python) subprocesses were considerably faster.

So I was thinking: build systems tend to be big sources of blight. They tend to be hard to maintain and require a bunch of specialized knowledge that’s distinct from the actual domain knowledge of a project, and most generic build tools (like Make) have problems. So what gives?

If the generic tools had better performance or were significantly easier to maintain, there’d be a rather convincing argument in their favor. As it is, I’m not really seeing it.

Micro Events

I’ve been enjoying blogging over on the micro tychoish site and thought I’d catalog these posts here.

More to come!