What is it That You Do?

The longer I have this job, the more difficult it is to explain what I do. I say, "I'm a programmer," and you'd think that I write code all day, but that doesn't map onto what my days look like; the longer I do this, the less code I actually end up writing. I think the complexity of this seemingly simple question grows from the fact that building software involves a lot more than writing code, particularly as projects become more complex.

I'd venture to say that most code is written and maintained by one person, and typically used by a very small number of people (often on behalf of many more people,) though this is difficult to quantify. Single-maintainer software is still software, and there are lots of interesting problems there, but as much as anything else I'm interested in the problems adjacent to multi-author code bases and multi-operator software development. [1]

Fundamentally, I'm interested in the following questions:

  • How can (sometimes larger) groups of people collaborate to build something that's bigger than the scope of any of their work?
  • How can we build software in a way that lets individual developers focus most of the time on the features and concerns that are the most important to them and their users? [2]

The software development process, regardless of the scope of the problem, has a number of different aspects:

  • Operations: How does this software execute, and how do we know that it's successful when it runs?
  • Behavior: What does it do, and how do we ensure it has the correct behavior?
  • Interface: How will users interact with the process, and how do we ensure a consistent experience across versions and users' environments?
  • Product: Who are the users? What features do they want? Which features are the most important?

Sometimes we can address these questions by writing code, but often there's a lot of talking to users, other developers, and other people who work in software development organizations (e.g. product managers, support, etc.) not to mention writing a lot of English (documentation, specs, and the like.)

I still don't think that I've successfully answered the framing question, except to paint a large picture of the kinds of work that go into making software and to describe some of my specific domain interests. This ends up boiling down to:

  • I write a lot of documents describing new features and improvements to our software. [product]
  • I think a lot about how our product works as it grows (scaling), and what kinds of changes we can make now to make that process smoother. [operations]
  • I think about how to help the more junior members of my team focus on the aspects of their jobs that they enjoy the most, and how to illustrate broader contexts for them. [mentoring]
  • I look for ways to take the problems we're solving today and build solutions that balance the immediate requirements with longer-term maintainability and reuse. [operations/infrastructure]

The actual "what" I'm spending my time boils down to reading a bunch of code, meeting with my teamates, meeting with users (who are also coworkers.) And sometimes writing code. If I'm lucky.

[1]I think the single-author and/or single-operator class is super interesting and valuable, particularly because it includes a lot of software outside of the conventional disciplinary boundaries of software development, and includes things like macros, spreadsheets, small-scale databases, and IT/operations ("scripting") work.
[2]It's very easy to spend most of your time as a developer writing infrastructure code of some sort, to address either internal concerns (logging, data management and modeling, integrating with services) or project/process automation (build, test, operations) concerns. Infrastructure isn't bad, but it isn't the same as working on product features.

The Case for Better Build Systems

A lot of my work these days focuses on figuring out how to improve the way people develop software, in ways that reduce the amount of time developers have to spend doing work outside of development and that improve the quality of their work. This post has been sitting in my drafts folder for the last year; it does a good job of explaining how I locate my work *and* makes a case for high quality generic build system tooling that I continue to find compelling.


Incidentally, it turns out that I wrote an introductory post about build systems six years ago. Go, past me.

Canonically, build systems describe the steps required to produce artifacts as a system (graph) of dependencies. [1] These dependencies run between source files (code) and artifacts (programs and packages), with intermediate artifacts along the way, all expressed in terms of the files they are or create. Different development environments, programming languages, and kinds of software have different conventions and tools for this, but the underlying model is the same.

While the canonical view is that "build systems are about producing files," the truth is that the challenge of contemporary software development isn't really just about producing files. Everything from test automation to deployment is something that we can think about as a kind of build system problem.

Let's unwind for a moment. The work of "making software" breaks down into a collection of--reasonably disparate--tasks, which include:

  • collecting requirements (figuring out what people want,)
  • project planning (figuring out how to break larger collections of functionality into more reasonable units.)
  • writing new code in existing code bases.
  • exploring unfamiliar code and making changes.
  • writing tests for code you've recently written, or for areas of the code base that have recently changed.
  • rewriting existing code with functionally equivalent code (refactoring,)
  • fixing bugs discovered by users.
  • fixing bugs discovered by an automated test suite.
  • releasing software (deploying code.)

Within these tasks, developers do a lot of little experiments and tests: make a change, then see what its impact is by doing something like compiling the code, running the program, or running a test suite. The goal of developer productivity work, therefore, is to automate these processes and shorten the time it takes to do any of these tasks--in particular, the feedback loop between making a change and seeing whether that change had an impact. The more complex the system you're developing, with regard to distinct components, dependencies, target platforms, compilation model, and integrations, the more time you spend in any one of these loops and the less productive you can be.

Build systems are uniquely positioned to support the development process: they're typically purpose built per-project (sharing common infrastructure,) most projects already have one, and they provide an ideal place to incrementally develop additional functionality and tooling. The model build systems use--describing processes in terms of dependency graphs, and optimizing for local environments--also means they map naturally onto most of the tasks listed above.
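
To make the dependency-graph model concrete, here is a minimal sketch (in Go, with hypothetical target names) of what sits at the core of nearly every build tool: targets depend on other targets, and a "build" is just a depth-first walk of that graph that runs each action at most once. A real tool would also compare file timestamps and detect cycles, but the shape of the problem is the same:

    package main

    import (
        "fmt"
        "os"
    )

    // Target is one node in the build graph.
    type Target struct {
        Name   string
        Deps   []string
        Action func() error // how to produce the artifact
    }

    // Graph maps target names to their definitions.
    type Graph map[string]*Target

    // Build resolves a target's dependencies depth-first, running each
    // action at most once per invocation.
    func (g Graph) Build(name string, done map[string]bool) error {
        if done[name] {
            return nil
        }
        t, ok := g[name]
        if !ok {
            return fmt.Errorf("unknown target %q", name)
        }
        for _, dep := range t.Deps {
            if err := g.Build(dep, done); err != nil {
                return err
            }
        }
        if err := t.Action(); err != nil {
            return fmt.Errorf("building %s: %v", name, err)
        }
        done[name] = true
        return nil
    }

    func main() {
        say := func(msg string) func() error {
            return func() error { fmt.Println(msg); return nil }
        }

        g := Graph{
            "lib.o": {Name: "lib.o", Action: say("compile lib.o")},
            "app.o": {Name: "app.o", Action: say("compile app.o")},
            "app": {
                Name:   "app",
                Deps:   []string{"lib.o", "app.o"},
                Action: say("link app"),
            },
        }

        if err := g.Build("app", map[string]bool{}); err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }
    }

Most of what build systems get wrong isn't this core model; it's everything around it.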

The problem, I think, is that build systems tend to be pretty terrible, or at least many suffer from a number of classic flaws:

  • implicit assumptions about the build or development environment which make it difficult to start using.
  • unexpressed dependencies on services or systems that the build requires to be running in a certain configuration.
  • improperly configured dependency graphs which end up requiring repeated work.
  • incomplete expression of dependencies which require users to manually complete operations in particular orders.
  • poorly configured defaults which make for overly complex invocations for common tasks.
  • operations distributed among a collection of tools with little integration, so that compilation, test automation, release automation, and other functions each end up in their own silo.

By improving the quality, correctness, and usability of build systems, we:

  • improve the experience for developers every day,
  • make it really easy to optimize basically every aspect of the development process,
  • reduce the friction for including new developers in a project's development process.

I'm not saying "we need to spend more time writing build automation tools" (like make, ninja, cmake, and friends,) or that the existing ones are bad and hard to use (they are, by and large,) but rather that build systems are the first and best hook we have into developer workflows. A high quality, trustworthy, tested, and easy to use build system for a project makes development easier, makes continuous integration easy and maintainable, and ultimately lets developers spend more of their time focusing on important problems.

[1]Ideally, build systems describe a directed acyclic graph, though many projects have build systems with cyclic dependency graphs that they ignore in some way.

Non-Trad Software Engineer

It happened gradually, and it wasn't entirely an intentional thing, but at some point I became a software engineer. While a lot of people become software engineers, many of them have formal backgrounds in engineering, or have taken classes or done programs to support this retooling (e.g. bootcamps or programming institutes.)

I skipped that part.

I wrote scripts from time to time for myself, because there were things I wanted to automate. Then I was working as a technical writer and had to read code that other people had written for my job. Somewhere in there I was responsible for managing the publication workflow, and wrote a couple of build systems.

And then it happened.

I don't think it's the kind of thing that is right for everyone, but I was your typical nerdy/bookish kid who wasn't great in math class, and I suspect that making software is the kind of thing that a lot of people could do. I don't think that my experience is particularly replicable, but I have learned a number of useful (and important) things along the way, and as I've started writing more about what I'm working on now, I realize that I've skipped over some of the fundamentals. [1]

Formal education in programming, from what I've been able to gather, strikes me as really weird: there are sort of two main ways of teaching people about software and computer science. Option one is that you start with a very theoretical background that focuses on data structures, the performance of algorithms, or the internals of how core technologies function (operating systems, compilers, databases, etc.) Option two is that you spend a lot of time learning about (a) programming language and about how to solve problems using programming.

The first is difficult because the theory [2] is not particularly applicable except in very rare cases, and then only at a high level that is easy to back-fill as needed. The second is also challenging, as idioms change between languages and most generic programming tasks are easily delegated to libraries. The crucial skill for programming is the ability to learn new languages and solve problems in the context of existing systems, and developing a curriculum to build those skills is hard.

The topics that I'd like to write about include:

  • Queue behavior, particularly in the context of distributed systems.
  • Observability/Monitoring and Logging, particularly for reasonable operations at scale.
  • build systems and build automation.
  • unit-testing, test automation, and continuous integration.
  • interface design for users and other programmers.
  • maintaining and improving legacy systems.

These are, of course, primarily focused on the project of making software rather than computer science or computing in the abstract. I'm particularly interested (practically) in figuring out what kinds of experiences and patterns are important for new programmers to learn, regardless of background. [3] I hope you all find it interesting as well!

[1]This is, at least in part, because I mostly didn't blog very much during this process. Time being finite and all.
[2]In practice, theoretical insights come up pretty infrequently and are mostly useful for providing shorthand for characterizing a problem in more abstract terms. Most of the time, you're better off intuiting things anyway, because programming is predominantly a pragmatic exercise. For the exceptions, there are a lot of nerds around (both at most companies and on the internet) who can figure out the proper name for a phenomenon, and then you can look it up on Wikipedia.
[3]A significant portion of my day-to-day work recently has involved mentoring new programmers. Some have traditional backgrounds or formal technical education and many don't. While everyone has something to learn, I often find that because my own background is so atypical it can be hard for me to outline the things that I think are important, and to identify the high level concepts that are important from more specific sets of experiences.

Evergreen Intro

Almost two years ago, I switched teams at work to join the team behind Evergreen, which is a homegrown continuous integration tool that we use organization-wide to host and support development efforts and operational automation. It's been great.

At a high level, Evergreen takes changes that developers make to source code repositories and runs a set of tasks for each of those changes on a wide variety of systems; it's a key part of the system that allows us to verify that the software we write works on computers other than the ones we interact with directly. There are a number of CI systems in the world, but Evergreen has a number of interesting features:

  • it runs tasks in parallel, fanning out tasks to a large pool of machines to shorten the "wall clock" time for task execution.
  • tasks execute on ephemeral systems managed by Evergreen in response to demands of incoming work.
  • the service maintains a queue of work and handles task dispatching and results collection.
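
To give a feel for what the fan-out in the first and third points above buys you, here's a toy sketch in Go (an illustration of the model, not Evergreen's actual implementation): when tasks run concurrently across a pool of workers, the "wall clock" time is governed by the slowest task rather than the sum of all of them.

    package main

    import (
        "fmt"
        "sync"
        "time"
    )

    func main() {
        // Hypothetical tasks; in a real CI system each would be a compile or
        // test job dispatched to its own host.
        tasks := []string{"lint", "unit-tests", "compile-linux", "compile-macos"}

        results := make(chan string, len(tasks))
        var wg sync.WaitGroup

        start := time.Now()
        for _, name := range tasks {
            wg.Add(1)
            go func(name string) {
                defer wg.Done()
                time.Sleep(100 * time.Millisecond) // stand-in for real work
                results <- name + ": passed"
            }(name)
        }
        wg.Wait()
        close(results)

        for r := range results {
            fmt.Println(r)
        }
        // Elapsed time is roughly one task's worth, not four.
        fmt.Println("elapsed:", time.Since(start))
    }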

This focus on large-scale task parallelism and on managing host pools gives Evergreen the ability to address larger continuous integration workloads with lower maintenance overhead. This is totally my jam: we get to affect both the development workflow and engineering policies for basically everyone, and improving operational efficiency is a leading goal.

My previous gig was more operational, on a sibling team, so it's been really powerful to be able to address problems relating to application scale and drive the architecture from the other side. I wrote a blog post for a work-adjacent outlet about the features and improvements, but this is my blog, and I think it'd be fun to have some space to explore "what I've been working on," rather than focusing on Evergreen as a product.

My first order of business, after becoming familiar with the code base, was to work on logging. When I started learning Go, I wrote a logging library (I even blogged about it), and using this library has allowed us to "get serious about logging." It was a long play, but we now have highly structured logging, which has become a centerpiece of our observability story, and we've been able to use centralized log aggregation services (and even shop around!) As our deployment grows, centralized logging is the thing that has kept everything together.
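
To illustrate what "highly structured" means in practice (a generic sketch, not the actual API of the library mentioned above): every log message is a document with named fields rather than an interpolated string, so a centralized aggregation service can index, filter, and aggregate on those fields.

    package main

    import (
        "encoding/json"
        "os"
        "time"
    )

    // logEvent emits one structured record per line, adding a timestamp and
    // severity alongside whatever fields the caller supplies.
    func logEvent(level, msg string, fields map[string]interface{}) {
        record := map[string]interface{}{
            "ts":    time.Now().UTC().Format(time.RFC3339),
            "level": level,
            "msg":   msg,
        }
        for k, v := range fields {
            record[k] = v
        }
        json.NewEncoder(os.Stdout).Encode(record)
    }

    func main() {
        // Hypothetical event: the field names here are made up for the example.
        logEvent("info", "task dispatched", map[string]interface{}{
            "task_id":  "compile_linux_64",
            "host":     "build-host-12",
            "queue_ms": 142,
        })
    }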

Recently, I've been focusing on how the application handles "offline" or background work. Historically the application has had a number of loosely coupled "cron-job"-like operations that all happened on a single machine at a regular interval. I'm focusing on how to move these systems into more tightly coupled, event-driven operations that can be distributed to a larger cluster of machines. Amboy is a big part of this, but there have been other changes related to this project.
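
The pattern, roughly, looks like the following sketch (a generic illustration of the queue-and-worker model, not Amboy's actual API): units of background work become jobs on a queue, and a pool of workers--which could live on one machine or many--pulls jobs off and runs them, rather than a pile of cron jobs firing on a single host.

    package main

    import (
        "fmt"
        "sync"
    )

    // Job is a single unit of background work.
    type Job interface {
        ID() string
        Run() error
    }

    // cleanupJob is a hypothetical job that might prune stale records.
    type cleanupJob struct{ name string }

    func (j cleanupJob) ID() string { return j.name }
    func (j cleanupJob) Run() error {
        fmt.Println("running", j.name)
        return nil
    }

    // runPool drains the queue with a fixed number of concurrent workers.
    func runPool(queue <-chan Job, workers int) {
        var wg sync.WaitGroup
        for i := 0; i < workers; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                for job := range queue {
                    if err := job.Run(); err != nil {
                        fmt.Println("job", job.ID(), "failed:", err)
                    }
                }
            }()
        }
        wg.Wait()
    }

    func main() {
        queue := make(chan Job, 16)
        for i := 0; i < 4; i++ {
            queue <- cleanupJob{name: fmt.Sprintf("cleanup-%d", i)}
        }
        close(queue)
        runPool(queue, 2)
    }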

On the horizon, I'm also starting to think about how to reduce the cost of exposing data and operations to clients and users in a way that's lightweight, flexible, and relatively inexpensive in developer time. Right now there's a lot of technical debt, a myriad of different ways to describe interfaces, and inconsistent client coverage. Nothing insurmountable, but definitely the next frontier of growing pains.


The theme here is: how do we take an application that works and does something really cool, and turn it into a robust piece of software that can both scale as needs grow and provide a platform for developing new features with increasing levels of confidence, stability, and speed?

The conventional wisdom is that it's easy to build features fast-and-loose without a bunch of infrastructure, and that as you scale, the speed of feature development inevitably slows down. I'm pretty convinced that this slowdown isn't inevitable, and I'm excited to explore the ways that improved common infrastructure can reduce the impact of this ossification and lead to more nimble and expansive feature development.

We'll see how it goes! And I hope to be able to find the time to write about it more here.

Deleuze and Grove

I've been reading two non-fiction books intermittently in the last little bit: Andy Grove's High Output Management and Deleuze and Guattari's What is Philosophy?. Not only is reading non-fiction somewhat novel for me, but I'm sort of delighting in the juxtaposition. And I'm finding both books pretty compelling.

These are fundamentally materialist works. Grove writes from his experience as a manager, and it's a book about organizing that focuses on personal and organizational effectiveness, with a lot of corporate high-tech company examples. But the fact that it's a high-tech company that works on actually producing things means that he's thinking a lot about production and material constraints. It's particularly interesting because discussions of technology and management often lead to popular writing that's handwavey and abstract: Grove's book is not that in the slightest.

Deleuze is more complex, and Guattari definitely tempers the materialism, though less in the case of What is Philosophy? than in the earlier books. Having said that, I think What is Philosophy? is really an attempt both to justify philosophy in and for itself and to discuss the project of knowledge (concept) creation in material, mechanistic terms.

To be honest, this is the thing that I find the most compelling about Deleuze in general: he's undeniably materialist in his outlook and approach, but his work often--thanks to Guattari, I think--focuses on issues central to non-materialist thought: interiority, subjectivity, experience, and identity, without losing the need to explore systems, mechanisms, and interfaces between and among related components and concepts.

I talked with a coworker about the fact that I've been reading both of these pieces together, and he said something to the effect of "yeah, Grove rambles a bunch but has a lot of good points, which is basically the same as Deleuze." Fair. I'd even go a bit further and say that these are both books that, despite their specialized topics and focus, are really deep down books for everyone, and guides for being in the world.

Read them both.

Today's Bottleneck

Computers are always getting faster. From the perspective of the casual observer it may seem like every year all of the various specs keep going up, and systems are faster. [1] In truth, progress isn't uniform across all systems and subsystems, and thinking about this progression of technology gives us a chance to think about the constraints that developers [2] and other people who build technology face.

For most of the past year, I've used a single laptop for all of my computing work, and while it's been great, in this time I lost touch with the comparative speed of systems. No great loss, but it wasn't until I started using other machines on a regular basis that I was reminded that hardware can affect performance.

For most of the past decade, processors have been fast. While some processors are theoretically faster and some have other features like virtualization extensions and better multitasking capacities (i.e. hyperthreading and multi-core systems) the improvements have been incremental at best.

Memory (RAM) mostly manages to keep up with the processors, so there's no real bottleneck between RAM and the processor. Although RAM capacities are growing, at current volumes extra RAM mostly just means that services and systems which previously had to be distributed across machines because of RAM density can now all run on one server. In general: "ho hum."

Disks are another story all together.

While disks got somewhat faster over this period, they didn't get much faster, and so for a long time disks were the bottleneck in computing speed. To address this problem, a number of things changed:

  • We designed systems for asynchronous operation.

Basically, folks spilled a lot of blood and energy to make sure that systems could continue to do work while waiting for the disk to read or write data. This involves using a lot of event loops, queuing systems, and so forth. (There's a small sketch of this idea at the end of this list.)

These systems are really cool; the only problem is that they require us to be smarter about some aspects of software design and deployment. This doesn't fix the tons of legacy code sitting around, or the fact that a lot of tools and programmers are struggling to keep up.

  • We started to build more distributed systems so that any individual spinning disk is responsible for writing/reading less data.

  • We hacked disks themselves to get better performance.

    There are some ways you can eke out a bit of extra performance from spinning disks: namely RAID-10, hardware RAID controllers, and using smaller platters. RAID-10 uses multiple drives (four at minimum) to provide simple redundancy and roughly double performance. Smaller platters require less movement of the disk arm, so you get a bit more out of the hardware.

    Now, with affordable solid state disks (SSDs,) all of these disk-related speed problems are basically moot. So what are the next bottlenecks for computers and performance?

  • Processors. It may be the case that processors turn into the next slow-to-improve bottleneck. There are a lot of expectations on processors these days: high speed, low power consumption, low temperature, and a high degree of parallelism (cores and hyperthreading.) But these expectations necessarily conflict.

    The main route to innovation is to make the processors themselves smaller, which does increase performance and helps control heat and power consumption, but there is a practical limit to how small a processor can get.

    Also, no matter how fast you make the processor, it's irrelevant unless the software is capable of taking advantage of it.

  • Software.

    We're still not great at building software with asynchronous components. "Non-blocking" systems do make it easier to work well with slower disks, but we still don't have a lot of software that does a great job of using the parallelism of a processor, so some operations are slow and will remain slow because a single-threaded process must grind through a long task and can't share the work.

  • Network overhead.

    While I think better software is the bigger problem, network throughput could also be a huge issue. Internet endpoints (your connection) have gotten much faster in the past few years. That's a good thing, indeed, but there are a number of problems:

  • Transfer speeds aren't keeping up with data growth or data storage, and if that trend continues, we're going to end up with a lot of data that only exists in one physical location, which is a recipe for catastrophic data loss.

    I think we'll get back to a point where moving physical media around will begin to make sense. Again.

  • Wireless data speeds and architectures (particularly 802.11x, but also wide-area wireless,) have become ubiquitous, but they aren't really sufficient for serious use. The fact that our homes, public places, and even offices (in some cases) aren't wired to give people the opportunity to plug in will begin to hurt.
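
As a coda to the asynchronous-operation point at the top of this list, here's a tiny sketch in Go (with a made-up output file) of the basic move: start the slow disk write in the background, keep doing useful work, and only wait for the result when you actually need it. Event-loop systems accomplish the same thing with callbacks instead of goroutines.

    package main

    import (
        "fmt"
        "os"
    )

    func main() {
        done := make(chan error, 1)

        // Kick off the slow disk write in the background.
        go func() {
            data := make([]byte, 64<<20) // pretend this is 64MB of real output
            done <- os.WriteFile("/tmp/example-output.bin", data, 0644)
        }()

        // Meanwhile, keep doing useful work instead of blocking on the disk.
        for i := 0; i < 3; i++ {
            fmt.Println("handling request", i)
        }

        // Collect the result of the write only when we need it.
        if err := <-done; err != nil {
            fmt.Println("write failed:", err)
        }
    }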

Thoughts? Other bottlenecks? Different reading of the history?

[1]By contrast, software seems like it's always getting slower, and while this is partially true, there are additional factors at play, including feature growth, programmer efficiency, and legacy support requirements.
[2]Because developers control, at least to some extent, how everyone uses and understands technology, the constraints on the way they use computers are important to everyone.

Novel Automation

This post is a follow-up to the interlude in the /posts/programming-tutorials post, which is part of an ongoing series of posts on programmer training and related issues in technological literacy and education.

In short, creating novel automations is hard. The process would have to look something like:

  1. Realize that you have an unfulfilled software need.
  2. Decide what the proper solution to that need is. Make sure the solution is sufficiently flexible to be able to support all required complexity.
  3. Then sit down, open an empty buffer and begin writing code.

Not easy. [1]

Something I've learned in the past few years is that the above process is relatively uncommon for actual working programmers: most of the time you're adding a few lines here and there, testing various changes or adding small features built upon other existing systems and features.

If this is how programming work is actually done, then the kinds of methods we use to teach programmers how to program should bear some resemblance to the actual work that programmers do. As an attempt at a case study, here's my own recent experience:

I've been playing with Buildbot for a few weeks now out of personal curiosity, and because it may be useful to automate some stuff for the Cyborg Institute. Buildbot has its merits and frustrations, but this post isn't really about Buildbot. Rather, the experience of doing Buildbot work has taught me something about programming and about "building things," including:

  • When you set up buildbot, it generates a python configuration file where all buildbot configuration and "programming" goes.

    As a bit of a sidebar, I've been using a base configuration derived from the Buildbot configuration for Buildbot itself; the default generated configuration is less clean and a bit bigger, and I'd assumed that I was configuring a Buildbot in the "normal way."

    Turns out I haven't, and this hurts my (larger) argument slightly.

    I like the idea of having a very programmatic interface for systems that must integrate with other components, and I really like the idea of a system that produces a good starting template. I'm not sure what this does for overall maintainability in the long term, but it makes getting started and using the software in a meaningful way much more possible.

  • Organizing my Buildbot configuration as I have, modeled on the "metabuildbot," has nicely illustrated the idea that software is just a collection of modules that interact with each other in a defined way. Nothing more, nothing less.

  • Distributed systems are incredibly difficult for anyone to conceptualize properly, and I think most of the frustration with Buildbot stems from this.

  • Buildbot provides an immediate object lesson on the trade-offs between simplicity and terseness on the one hand and maintainability and complexity on the other.

    This point relates to the previous one. Because distributed systems are hard, it's easy to configure something that's too complex and that isn't what you want at all in your Buildbot before you realize that what you actually need is something else entirely.

    This doesn't mean that there aren't nightmarish Buildbot configs (there are,) but the lesson is quite valuable.

  • There's something interesting and instructive in the way that Buildbot's user experience lies somewhere between "an application," that you install and use, and a program that you write using a toolkit.

    It's clearly not exactly either, and both at the same time.

I suspect some web-programming systems may be similar, but I have relatively little experience with systems like these. And frankly, I have little need for these kinds of systems in any of my current projects.

Thoughts?

[1]Indeed, this may explain the prevalence of people writing code, getting it working, and then rewriting it from the ground up: writing things from scratch is objectively hard, whereas rewriting and iterating is considerably easier. And the end result is often, but not always, better.

In Favor of PDF

This is really a short rant, and should come as a surprise to no one.

I hate DOC files and RTF files, to say nothing of ODF, DOCX, and their ilk, because they have two necessarily conflicting properties:

1. They're oriented at producing documents on paper. Which is crazy. Paper is an output, but it's not the only output in common use, so it's nuts that generic document representation formats would be so tightly coupled with paper.

2. The rendering of the content is editor-specific, particularly with regards to display options. If I compile a document and send it to you, I have no guarantee whatsoever about the presentation or display of the document on your system--fonts, page breaks, and so forth--particularly if I'm not certain that your system is similarly configured.

This is particularly idiotic with respect to 1.

It's not that PDF is great, or especially usable, but it's consistent and behaves as expected. Furthermore, it does a good job of appropriately expressing the limitations of paper.

So use PDF and accept no substitutions.