work | tychoish

The longer that I have this job, the more difficult it is to explain what I do. I say, “I’m a programmer,” and you’d think that I write code all day, but that doesn’t map onto what my days look like, and the longer I do this, the less code I actually end up writing. I think the complexity of this seemingly simple question grows from the fact that building software involves a lot more than writing code, particularly as projects become more complex.

I’d venture to say that most code is written and maintained by one person, and typically used by a very small number of people (often on behalf of many more people), though this is difficult to quantify. Single-maintainer software is still software, and there are lots of interesting problems there, but as much as anything else I’m interested in the problems adjacent to multi-author code-bases and multi-operator software development.¹

Fundamentally, I’m interested in the following questions:

How can (sometimes larger) groups of people collaborate to build something that’s bigger than the scope of any of their work?
How can we build software in a way that lets individual developers focus most of the time on the features and concerns that are the most important to them and their users?²

The software development process, regardless of the scope of the problem, has a number of different aspects:

Operations: How does this software execute, and how do we know that it’s successful when it runs?
Behavior: What does it do, and how do we ensure it has the correct behavior?
Interface: How will users interact with the process, and how do we ensure a consistent experience across versions and users' environments?
Product: Who are the users? What features do they want? Which features are the most important?

Sometimes we can address these questions by writing code, but often there’s a lot of talking to users, other developers, and other people who work in software development organizations (e.g. product managers, support, etc.), not to mention writing a lot of English (documentation, specs, and the like).

I still don’t think that I’ve successfully answered the framing question, except to paint a large picture of what kinds of work go into making software, and to describe some of my specific domain interests. This ends up boiling down to:

I write a lot of documents describing new features and improvements to our software. [product]
I think a lot about how our product works as it grows (scaling), and what kinds of changes we can make now to make that process more smooth. [operations]
How can I help the more junior members of my team focus on the aspects of their jobs that they enjoy the most, and help illustrate broader contexts to them? [mentoring]
How can we take the problems we’re solving today and build the solution that balances the immediate requirements with longer term maintainability and reuse? [operations/infrastructure]

The actual “what” I’m spending my time on boils down to reading a bunch of code, meeting with my teammates, and meeting with users (who are also coworkers). And sometimes writing code. If I’m lucky.

I think the single-author and/or single-operator class is super interesting and valuable, particularly because it includes a lot of software outside of the conventional disciplinary boundaries of software and includes things like macros, spreadsheets, small scale databases, and IT/operations (“scripting”) work. ↩︎
It’s very easy to spend most of your time as a developer writing infrastructure code of some sort, to address either internal concerns (logging, data management and modeling, integrating with services) or project/process automation (build, test, operations) concerns. Infrastructure isn’t bad, but it isn’t the same as working on product features. ↩︎

Almost two years ago, I switched teams at work to join the team behind Evergreen, which is a homegrown continuous integration tool that we use organization-wide to host and support development efforts and operational automation. It’s been great.

From the high level, Evergreen takes changes that developers make to source code repositories and runs a set of tasks for each of those changes on a wide variety of systems, and is a key part of the system that allows us to verify that the software we write works on computers other than the ones that we interact with directly. There are a number of CI systems in the world, but Evergreen has a number of interesting features:

it runs tasks in parallel, fanning out tasks to a large pool of machines to shorten the “wall clock” time for task execution.
tasks execute on ephemeral systems managed by Evergreen in response to demands of incoming work.
the service maintains a queue of work and handles task dispatching and results collection.
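The queue-and-dispatch shape described in the list above can be sketched in a few lines of Go. This is a minimal, in-process illustration and not Evergreen's actual implementation: the `Task` and `Result` types and the `runAll` function are hypothetical names invented for this example, and goroutines stand in for the pool of machines.

```go
package main

import (
	"fmt"
	"sync"
)

// Task is a hypothetical unit of work; a real CI task carries far more state.
type Task struct {
	ID   int
	Name string
}

// Result is a hypothetical record of what a worker produced for a task.
type Result struct {
	TaskID int
	Output string
}

// runAll fans tasks out to a fixed pool of workers and collects every result.
// workers must be at least 1. Results arrive in completion order, not
// submission order.
func runAll(tasks []Task, workers int) []Result {
	queue := make(chan Task)
	results := make(chan Result)

	// Start the pool: each worker pulls from the shared queue until it closes.
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for t := range queue {
				results <- Result{TaskID: t.ID, Output: "ran " + t.Name}
			}
		}()
	}

	// Dispatch: feed the queue, then signal that no more work is coming.
	go func() {
		for _, t := range tasks {
			queue <- t
		}
		close(queue)
	}()

	// Close the results channel once every worker has finished.
	go func() {
		wg.Wait()
		close(results)
	}()

	out := make([]Result, 0, len(tasks))
	for r := range results {
		out = append(out, r)
	}
	return out
}

func main() {
	tasks := []Task{{1, "compile"}, {2, "unit-tests"}, {3, "lint"}}
	for _, r := range runAll(tasks, 2) {
		fmt.Println(r.TaskID, r.Output)
	}
}
```

With more workers the same set of tasks finishes in less wall clock time, up to the point where there are as many workers as tasks.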

This focus on larger scale task parallelism and managing host pools gives Evergreen the ability to address larger scale continuous integration workflows with a lower maintenance overhead. This is totally my jam: we get to affect the development workflow and engineering policies for basically everyone, and improving operational efficiency is a leading goal.

My previous gig was more operational, on a sibling team, so it’s been really powerful to be able to address problems relating to application scale and drive the architecture from the other side. I wrote a blog post for a work-adjacent outlet about the features and improvements, but this is my blog, and I think it’d be fun to have some space to explore “what I’ve been working on,” rather than focusing on Evergreen as a product.

My first order of business, after becoming familiar with the code base, was to work on logging. When I started learning Go, I wrote a logging library (I even blogged about it), and using this library has allowed us to “get serious about logging.” While it was a long play, we now have highly structured logging, which has allowed the entire logging system to become a centerpiece in our observability story, and we’ve been able to use centralized log aggregation services (and even shop around!). As our deployment grows, centralized logging is the thing that has kept everything together.

Recently, I’ve been focusing on how the application handles “offline” or background work. Historically the application has had a number of loosely coupled “cron-job” like operations that all happened on a single machine at a regular interval. I’m focusing on how to move these systems into more tightly coupled, event-driven operations that can be distributed to a larger cluster of machines. Amboy is a big part of this, but there have been other changes related to this project.
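As a rough sketch of the event-driven shape: instead of a periodic sweep that rescans everything, the code that notices an event submits a small job to a queue. The example below is a hypothetical, in-process illustration and does not show Amboy's API; the `Job` interface, `Queue` type, and `cleanupJob` are names invented here, and a version that spans a cluster would need the queue backed by shared storage rather than a Go channel.

```go
package main

import (
	"fmt"
	"sync"
)

// Job is a hypothetical unit of background work with a stable identifier.
type Job interface {
	ID() string
	Run() error
}

// cleanupJob is an example job created in response to a "host went away" event.
type cleanupJob struct{ host string }

func (j cleanupJob) ID() string { return "cleanup-" + j.host }
func (j cleanupJob) Run() error {
	fmt.Println("cleaning up", j.host)
	return nil
}

// Queue accepts jobs from event handlers, skips IDs it has already seen,
// and runs the rest on a pool of workers.
type Queue struct {
	mu   sync.Mutex
	seen map[string]bool
	jobs chan Job
	wg   sync.WaitGroup
}

// NewQueue starts the workers. capacity bounds how many jobs can wait.
func NewQueue(workers, capacity int) *Queue {
	q := &Queue{seen: map[string]bool{}, jobs: make(chan Job, capacity)}
	for i := 0; i < workers; i++ {
		q.wg.Add(1)
		go func() {
			defer q.wg.Done()
			for j := range q.jobs {
				if err := j.Run(); err != nil {
					fmt.Println("job failed:", j.ID(), err)
				}
			}
		}()
	}
	return q
}

// Put enqueues a job and reports whether it was accepted. A job whose ID was
// already submitted is dropped, so repeated events do not repeat the work.
// Put blocks while the queue is full and must not be called after Close.
func (q *Queue) Put(j Job) bool {
	q.mu.Lock()
	if q.seen[j.ID()] {
		q.mu.Unlock()
		return false
	}
	q.seen[j.ID()] = true
	q.mu.Unlock()
	q.jobs <- j
	return true
}

// Close stops accepting jobs and waits for the workers to drain the queue.
func (q *Queue) Close() {
	close(q.jobs)
	q.wg.Wait()
}

func main() {
	q := NewQueue(2, 16)
	q.Put(cleanupJob{host: "host-1"}) // event: host-1 terminated
	q.Put(cleanupJob{host: "host-1"}) // duplicate event, ignored
	q.Put(cleanupJob{host: "host-2"})
	q.Close()
}
```

Giving each job a stable ID is one way to let many machines react to the same event without doing the work twice.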

On the horizon, I’m also starting to think about how to reduce the cost of exposing data and operations to clients and users in a way that’s lightweight and flexible, and relatively inexpensive for developer time. Right now there’s a lot of technical debt, a myriad of different ways to describe interfaces, and inconsistent client coverage. Nothing insurmountable, but definitely the next frontier of growing pains.

The theme here is “how do we take an application that works and does something really cool, and turn it into a robust piece of software” that can scale as needs grow and also provide a platform for developing new features with increasing levels of confidence, stability, and speed.

The conventional wisdom is that it’s easy to build features fast-and-loose without a bunch of infrastructure, and that as you scale the speed of feature development slows down. I’m pretty convinced that this is untrue and am excited to explore the ways that improved common infrastructure can reduce the impact of this ossification and lead to more nimble and expansive feature development.

We’ll see how it goes! And I hope to be able to find the time to write about it more here.

Topic: Work

What is it That You Do?

Evergreen Intro