A Common Failure

I've been intermittently working on a common lisp library to produce a binary encoding of arbitrary objects, and I think I'm going to be abandoning the project. This is an explanation of that decision and an reflection on my experience.

Why Common Lisp?

First, some background. I've always thought that Common Lisp is a language with a bunch of cool features and selling points, but unsurprisingly, I've never really had the experience of writing more than some one-off bits of code in CL, which isn't surprising. This project was a good experience for really digging into writing and managing a conceptually larger project which was a good kick in the pants to learn more.

The things I like:

  • the implementations of the core runtime are really robust and high quality, and make it possible to imagine running your code in a bunch of different contexts. Even though it's a language with relatively few users, it feels robust in a way. The most common implementations also have ways of producing fully self contained static binaries (like Go, say), which makes the thought of distributing software seem reasonable.
  • quicklisp, a package/library management tool is new (in the last decade or so,) has really raised the level of the ecosystem. It's not as complete as I'd expect in many ways, but quicklisp changed CL from something quaint to something that you could actually imagine using.
  • the object system is really nice. There isn't quite compile time-type checking on the values of slots (attributes) of objects, though you can opt in. My general feeling is that I can pretty easily get the feeling of writing statically typed code with all of the freedom of writing dynamic code.
  • multiple dispatch, and the conceptual approach to genericism, is amazing and really simplifies flow control in a lot of cases. You implement the methods you need, for the appropriate types/objects and then just write the logic you need, and the function call machinery just does the right thing. There's surprisingly little conditional logic, as a result.

Things I don't like:

  • there are all sorts of things that don't quite have existing libraries, and so I find myself wanting to do things that require more effort than necessary. This project to write a binary encoding tool would have been a component in service of a much larger project. It'd be nice if you could skip some of the lower level components, or didn't have your design choices so constrained by gaps in infrastructure.
  • at the same time, the library ecosystem is pretty fractured and there are common tools around which there aren't really consensus. Why are there so many half-finished YAML and JSON libraries? There are a bunch of HTTP server (!) implementations, but really you need 2 and not 5?
  • looping/iteration isn't intuitive and difficult to get common patterns to work. The answer, in most cases is to use (map) with lambdas rather than loops, in most cases, but there's this pitfall where you try and use a (loop) and really, that's rarely the right answer.
  • implicit returns seem like an over sight, hilariously, Rust also makes this error. Implicit returns also make knowing what type a function or method returns quite hard to reason about.

Writing an Encoder

So the project I wrote was an attempt to write really object oriented code as a way to writing an object encoder to a JSON-like format. Simple enough, I had a good mental model of the format, and my general approach to doing any kind of binary format processing is to just write a crap ton of unit tests and work somewhat iteratively.

I had a lot of fun with the project, and it gave me a bunch of key experiences which make me feel comfortable saying that I'm able to write common lisp even if it's not a language that I feel maximally comfortable in (yet?). The experiences that really helped included:

  • producing enough code to really have to think about how packaging and code organization worked. I'd written a function here and there, before, but never something where I needed to really understand and use the library/module/packaging (e.g. systems and libraries.) infrastructure.
  • writing and running lots of tests. I don't always follow test-driven development closely, but writing lots of tests is part of my process, and being able to watch the layers of this code come together was a lot of fun and very instructive.
  • this project for me, was mostly about API design and it was nice to have a project that didn't require much design in terms of the actual functionality, as object encoding is pretty straight forward.

From an educational perspective all of my goals were achieved.

Failure Mode

The problem is that it didn't work out, in the final analysis. While the library I constructed was able to encode and decode objects and was internally correct, I never got it to produce encoding that other implementations of the same specification could reliably read, and the ability to read data encoded by other libraries only worked in trivial cases. In the end:

  • this was mostly a project designed to help me gain competence in a programming language I don't really know, and in that I was successful.
  • adding this encoding format isn't particularly useful to any project I'm thinking of working on in the short term, and doesn't directly enable me to do anything in particular.
  • the architecture of the library would not be particularly performant in practice, as the encoding process didn't deploy a buffer pool of any kind, and it would have been harder than not to back fill that in, and I wasn't particularly interested in that.
  • it didn't work, and the effort to debug the issue would be more substantive than I'm really interested in doing at this point, particularly given the limited utility.

So here we are. Onto a different project!

There's No Full Stack

Software engineers use terms like "backend" and "frontend" to describe areas of expertice and focus, and the thought is that these terms map roughly onto "underlying systems and business logic" and "user interfaces." The thought, is that these are different kinds of work and no person can really specialize on "everything."

But it's all about perspective. Software is build in layers: there are frontends and backends at almost every level, so the classification easily breaks down if you look at it too hard. It's also the case that that logical features, from the perspective of the product and user, require the efforts of both disciplines. Often development organizations struggle to hand projects off between groups of front-end and back-end teams. [1]

[1]In truth this problem of coordination between frontend and backend teams is really that it forces a waterfall-like coordination between teams, which is always awkward. The problem isn't that backend engineers can't write frontend code, but that having different teams requires a handoff that is difficult to manage correctly, and around that handoff processes and management happens.

Backend/Frontend is also a poor way to organize work, as often it forces a needless boundary between people and teams wokring on related projects. Backend work (ususally) has to be completed first, and if that slips (or estimation is off) then the front end work has to happen in a crunch. Even if timing goes well, it's difficult to maintain engineering continuity through the handoff and context is often lost in the process.

In response to splitting projects and teams into front and backend, engineers have developed this idea of "full stack" engineering. This typically means "integrated front end and backend development." A noble approach: keep the same engineer on the project from start to finish, and avoid an awkward handoff or resetting context halfway through a project. Historic concerns about "front end and backend being in different languages" are reduced both by the advent of back-end javascript, and a realization that programmers often work in multiple languages.

While full stack sounds great, it's a total lie. First, engineers by and large cannot maintain context on all aspects of a system, so boundaries end up appearing in different places. A full stack engineer might end up writing front end and the APIs on the backed that the front end depends on, but not the application logic that supports the feature. Or an engineer might focus only a very specific set of features, but not be able to branch out very broadly. Second, specialization is important for allowing engineers to focus and be productive, and while context switching projects between engineers, having engineers that must context switch regularly between different disciplines is bad for those engineers. In short you can't just declare that engineers will be able to do it all.'

Some, particularly larger, teams and prodcuts can get around the issue entirely by dividing ownership and specialization along functional boundaries rather than by engineering discipline, but there can be real technical limitations, and getting a team to move to this kind of ownership model is super difficult. Therefore, I'd propose a different organization or a way of dividing projects and engineering that avoids both "frontend/backend" as well as the idea of "full stack":

  • feature or product engineers, that focus on core functionality delivered to users. This includes UI, supporting backend APIs, and core functionality. The users of these teams are the users of the product. These jobs have the best parts of "full stack" type orientation, but draw an effective "lower" boundary of responsibility and allow feature-based specialization.
  • infrastructure or product platform engineers, that focus on deployment, operations and supporting internal APIs. These teams and engineers should see their users as feature and product engineers. These engineers should fall somewhere between "backend engineers," and the "devops" and "sre" -type roles of the last decade, and cover the area "above" systems (e.g. not inclusive of machine management and access provisioning,) and below features.

This framework helps teams scale up as needs and requirements change: Feature teams can be divided and parallelized and focus in functionality slices, while, infrastructure teams divide easily into specialties (e.g. networking, storage, databases, internal libraries, queues, etc.) and along service boundaries. Teams are in a better position to handle continuity of projects, and engineers can maintain context and operate using more agile methods. I suspect that, if we look carefully, many organizations and teams have this kind of de facto organization, even if they use different kind of terminology.

Thoughts?

What is it That You Do?

The longer that I have this job, the more difficult it is to explain what I do. I say, "I'm a programmer," and you'd think that I write code all day, but that doesn't map onto what my days look like, and the longer it seems the less code I actually end up writing. I think the complexity of this seemingly simple question grows from the fact that building software involves a lot more than writing code, particularly as projects become more complex.

I'd venture to say that most code is written and maintained by one person, and typically used by a very small number of pepole (often on behalf of many more people,) though this is difficult to quantify. Single maintainer software is still software, and there are lots of interesting problems, but as much as anything else I'm interested in the problems adjacent to multi-author code-bases and multi-operator software development. [1]

Fundamentally, I'm interested in the following questions:

  • How can (sometimes larger) groups of people collaborate to build something that's bigger than the scope of any of their work?
  • How can we build software in a way that lets individual developers focus most of the time on the features and concerns that are the most important to them and their users. [2]

The software development process, regardless of the scope of the problem, has a number of different aspects:

  • Operations: How does is this software execute and how do we know that its successful when it runs?
  • Behavior: What does it do, and how do we ensure it has the correct behavior?
  • Interface: How will users interact with the process, and how do we ensure a consistent experience across versions and users' environment?
  • Product: Who are the users? What features do they want? Which features are the most important?

Sometimes we can address these questions by writing code, but often there's a lot of talking to users, other developers, and other people who work in software development organizations (e.g. product managers, support, etc.) not to mention writing a lot of English (documentation, specs, and the like.)

I still don't think that I've successfully answered the framing question, except to paint a large picture of what kinds of work goes into making software, and described some of my specific domain interests. This ends up boiling down to:

  • I write a lot of documents describing new features and improvements to our software. [product]
  • I think a lot about how our product works as it grows (scaling), and what kinds of changes we can make now to make that process more smooth. [operations]
  • How can I help the more junior members of my team focus on the aspects of their jobs that they enjoy the most, and help illustrate broader contexts to them. [mentoring]
  • How can we take the problems we're solving today and build the solution that balances the immediate requirements with longer term maintainability and reuse. [operations/infrastructure]

The actual "what" I'm spending my time boils down to reading a bunch of code, meeting with my teamates, meeting with users (who are also coworkers.) And sometimes writing code. If I'm lucky.

[1]I think the single-author and/or single-operator class is super interesting and valuable, particularly because it includes a lot of software outside of the conventional disciplinary boundaries of software and includes things like macros, spreadsheets, small scale database, and IT/operations ("scripting") work.
[2]It's very easy to spend most of your time as a developer writing infrastructure code of some sort, to address either internal concerns (logging, data management and modeling, integrating with services) or project/process automation (build, test, operations) concerns. Infrastructure isn't bad, but it isn't the same as working on product features.

The Case for Better Build Systems

A lot of my work, these days, focuses on figuring out how to improve how people develop software in ways that reduces the amount of time developers have to spend doing work outside of development and that improves the quality of their work. This post, has been sitting in my drafts folder for the last year, and does a good job of explaining how I locate my work **and* makes a case for high quality generic build system tooling that I continue to feel is compelling.*


Incidentally, it turns out that I wrote an introductory post about buildsystems 6 years ago. Go past me.

Canonically, build systems described the steps required to produce artifacts, as system (graph) of dependencies [1] and these dependencies are between source files (code) and artifacts (programs and packages) with intermediate artifacts all in terms of the files they are or create. Though different development environments, programming languages, and kinds of software have different.

While the canonical "build systems are about producing files," the truth is that the challenge of contemporary _software_ development isn't really just about producing files. Everything from test automation to deployment is something that we can think about as a kind of build system problem.

Let's unwind for a moment. The work of "making software," breaks down into a collection of--reasonably disparate--tasks, which include:

  • collecting requirements (figuring out what people want,)
  • project planning (figuring out how to break larger collections of functionality into more reasonable units.)
  • writing new code in existing code bases.
  • exploring unfamiliar code and making changes.
  • writing tests for code you've recently written, or areas of the code base that have recently chaining.
  • rewriting existing code with functionally equivalent code (refactoring,)
  • fixing bugs discovered by users.
  • fixing bugs discovered by an automated test suite.
  • releasing software (deploying code.)

Within these tasks developers do a lot of little experiments and tests. Make a change, see what it's impact is by doing something like compiling the code, running the program or running a test program. The goal, therefore, of the project of developer productivity projects is to automate these processes and shorten the time it takes to do any of these tasks. In particular the feedback loop between "making a change" and seeing if that change had an impact. The more complex the system that you're developing, with regards to distinct components, dependencies, target platforms, compilation model, and integration's, them the more time you spend in any one of these loops and the less productive you can be.

Build systems are uniquely positioned to suport the development process: they're typically purpose built per-project (sharing common infrastructure,) most projects already have one, and they provide an ideal environment to provide the kind of incremental development of additional functionality and tooling. The model of build systems: the description of processes in terms of dependency graphs and the optimization for local environments means.

The problem, I think, is that build systems tend to be pretty terrible, or at least many suffer a number of classic flaws:

  • implicit assumptions about the build or development environment which make it difficult to start using.
  • unexpressed dependencies on services or systems that the build requires to be running in a certain configuration.
  • improperly configured dependency graphs which end up requiring repeated work.
  • incomplete expression of dependencies which require users to manually complete operations in particular orders.
  • poorly configured defaults which make for overly complex invocations for common tasks.
  • operations distributed among a collection of tools with little integration so that compilation, test automation, release automation, and other functions.

By improving the quality, correctness, and usability of build systems, we:

  • improve the experience for developers every day,
  • make it really easy to optimize basically every aspect of the development process,
  • reduce the friction for including new developers in a project's development process.

I'm not saying "we need to spend more time writing build automation tools" (like make, ninja, cmake, and friends,) or that the existing ones are bad and hard to use (they, by and large are,) but that they're the first and best hook we have into developer workflows. A high quality, trustable, tested, and easy to use build system for a project make development easier, continuous integration easy and maintainable, and ultimately improve the ability of developers to spend more of their time focusing on important problems.

[1]ideally build systems describe directed acylcic graph, though many projects have buildsystems with cyclic dependency graphs that they ignore in some way.

Non-Trad Software Engineer

It happened gradually, and it wasn't entirely an intentional thing, but at some point I became a software engineer. While a lot of people become software engineers, many of them have formal backgrounds in engineering, or have taken classes or done programs to support this retooling (e.g. bootcamps or programming institutes.)

I skipped that part.

I wrote scripts from time to time for myself, because there were things I wanted to automate. Then I was working as a technical writer and had to read code that other people had written for my job. Somewhere in there I was responsible for managing the publication workflow, and write a couple of build systems.

And then it happened.

I don't think it's the kind of thing that is right for everyone, but I was your typical, nerdy/bookish kid who wasn't great in math class, and I suspect that making software is the kind of thing that a lot of people could do. I don't think that my experience is particularly replicable, but I have learned a number of useful (and important) things, and I realize as I've started writing more about what I'm working on now, I realize that I've missed some of the fundamentals [1]

Formal education in programming, from what I've been able to gather strikes me as really weird: there are sort of two main ways of teaching people about software and computer science: Option one is that you start with a very theoretical background that focuses on data structures, the performance of algorithms, or the internals of how core technologies function (operating systems, compilers, databases, etc.) Option two, is that you spend a lot of time learning about (a) programming language and about how to solve problems using programming.

The first is difficult, because the theory [2] is not particularly applicable except invery rare cases and only at the highest level which is easy to back-fill as needed. The second is also challenging, as idioms change between languages and most generic programming tasks are easily delegated to libraries. The crucial skill for programming is the ability to learn new languages and solve problems in the context of existing systems, and developing a curriculum to build those skills is hard.

The topics that I'd like to write about include:

  • Queue behavior, particularly in the context of distributed systems.
  • Observability/Monitoring and Logging, particularly for reasonable operations at scale.
  • build systems and build automation.
  • unit-testing, test automation, and continuous integration.
  • interface design for users and other programmers.
  • maintaining and improving legacy systems.

These are, of course, primarily focused on the project of making software rather than computer science or computing in the abstract. I'm particularly interested (practically) in figuring out what kinds of experiences and patterns are important for new programmers to learn, regardless of background. [3] I hope you all find it interesting as well!

[1]This is, at least in part, because I mostly didn't blog very much during this process. Time being finite and all.
[2]In practice, theoretical insights come up pretty infrequently and are mostly useful for providing shorthand for characterizing a problem in more abstract terms. Most of the time, you're better off intuiting things anyway because programming is predominantly a pragmatic exercise. For the exceptions, there are a lot of nerds around (both at most companies and on the internet) who can figure out what the proper name is for a phenomena and then you can look on wikipedia.
[3]A significant portion of my day-to-day work recently has involved mentoring new programmers. Some have traditional backgrounds or formal technical education and many don't. While everyone has something to learn, I often find that because my own background is so atypical it can be hard for me to outline the things that I think are important, and to identify the high level concepts that are important from more specific sets of experiences.

Evergreen Intro

Almost two years ago, I switched teams at work to join the team behind evergreen which is a homegrown continuous integration tool that we use organization wide to host and support development efforts and operational automation. It's been great.

From the high level, Evergreen takes changes that developers make to source code repositories and runs a set of tasks for each of those changes on a wide variety of systems, and is a key part of the system that allows us to verify that the software we write works on computers other than the ones that we interact with directly. There are a number of CI systems in the world, but Evergreen has a number of interesting features:

  • it runs tasks in parallel, fanning out tasks to a large pool of machines to shorten the "wall clock" time for task execution.
  • tasks execute on ephemeral systems managed by Evergreen in response to demands of incoming work.
  • the service maintains a queue of work and handles task dispatching and results collection.

This focus on larger scale task parallelism and managing host pools gives Evergreen the ability to address larger scale continuous integration workflows with a lower maintenance overhead. This is totally my jam: we get to both affect the development workflow and engineering policies for basically everyone and improving operational efficiency is a leading goal.

My previous gig was more operational, on a sibling team, so it's been really powerful to be able to address problems relating to application scale and drive the architecture from the other side. I wrote a blog post for a work-adjacent outlet about the features and improvements, but this is my blog, and I think it'd be fun to have some space to explore "what I've been working on," rather than focusing on Evergren as a product.

My first order of business, after becoming familiar with the code base, was to work on logging. When I started learning Go, I wrote a logging library (I even bloged about it), and using this library has allowed us to "get serious about logging." While it was a long play, we now have highly structured logging which has allowed the entire logging system to become a centerpiece in our observably story, and we've been able to use centralized log aggregation services (and even shop around!) As our deployment grows, centralized logging is the thing that has kept everything together.

Recently, I've been focusing on how the application handles "offline" or background work. Historically the application has had a number of loosely coupled "cron-job" like operations that all happened on single machine at a regular interval. I'm focusing on how to move these systems into more tightly coupled, event-driven operations that can be distributed to a larger cluster of machines. Amboy is a big part of this, but there have been other changes related to this project.

On the horizon, I'm also starting to think about how to reduce the cost of exposing data and operations to clients and users in a way that's lightweight and flexible, and relatively inexpensive for developer time. Right now there's a lot of technical debt, a myriad of different ways to describe interfaces, and inconsistent client coverage. Nothing insurmountable, but definitely the next frontier of growing pains.


The theme here is "how do we take an application that works and does something really cool," and turn it into a robust piece of software that can both scale as needs grown, but also provide a platform for developing new features with increasing levels of confidence, stability, and speed.

The conventional wisdom is that it's easy to build features fast-and-loose without a bunch of infrastructure, and that as you scale the speed of feature development slows down. I'm pretty convinced that this is untrue and am excited to explore the ways that improved common infrastructure can reduce the impact of this ossification and lead to more nimble and expansive feature development.

We'll see how it goes! And I hope to be able to find the time to write about it more here.

Deleuze and Grove

I've been reading, two books non-fiction intermittently in the last little bit: Andy Grove's High Output Management and Deleuze and Guatteri's What is Philosophy?. Not only is reading non-fiction somewhat novel for me, but I'm sorting delighting in the juxtaposition. And I'm finding both books pretty compelling.

These are fundamentally materialist works. Grove's writing from his experience as a manager, but it's a book about organizing that focuses on personal and organizational effectiveness, with a lot of corporate high-tech company examples. But the fact that it's a high-tech company that works on actually producing things, means that he's thinking a lot about production and material constraints. It's particularly interesting because the discussion technology and management often lead to popular writing that's handwavey and abstract: this is not what Grove's book is in the slightest.

Deleuze is more complex, and Guatteri definitely tempers the materialism, though less in the case of What is Philosophy than the earlier books. Having said that, I think What is Philosophy is really an attempt to both justify philosophy in and for itself, but also to discuss the project of knowledge (concept) creation in material, mechanistic terms.

To be honest this is the thing that I find the most compelling about Deleuze in general: he's undeniably materialist in his outlook and approach, but but his work often--thanks to Guatteri, I think--focuses on issues central to non-materialist thought: interiority, subjectivity, experience, and identity. Without loosing the need to explore systems, mechanisms, and interfaces between and among related components and concepts.

I talked with a coworker about the fact that I've been reading both of these pieces together, and he said something to the effect of "yeah, Grove rambles a bunch but has a lot of good points, which is basically the same as Deleuze." Fair. I'd even go a bit further and say that these are both books, that are despite their specialized topics and focus, are really deep down books for everyone, and guides for being in the world.

Read them both.

Today's Bottleneck

Computers are always getting faster. From the perspective of the casual observer it may seem like every year all of the various specs keep going up, and systems are faster. [1] In truth, progress isn't uniform across all systems and subsystems, and thinking about this progression of technology gives us a chance to think about the constraints that developers [2] and other people who build technology face.

For most of the past year, I've used a single laptop, for all of my computing work, and while it's been great, in this time I lost touch with the comparative speed of systems. No great loss, but I found myself surprised to learn that all computers did not have the same speed: It wasn't until I started using other machines on a regular basis that I remembered that hardware could affect performance.

For most of the past decade, processors have been fast. While some processors are theoretically faster and some have other features like virtualization extensions and better multitasking capacities (i.e. hyperthreading and multi-core systems) the improvements have been incremental at best.

Memory (RAM) manages to mostly keep up with the processors, so there's no real bottleneck between RAM and the processor. Although RAM capacities are growing, at current volumes extra RAM just means services/systems that had to be distributed given RAM density can all run on one server. In general: "ho hum."

Disks are another story all together.

While disks got faster over this period, they didn't get much faster during this period, and so for a long time disks were the bottle neck in computing speed. To address this problem, a number of things changed:

  • We designed systems for asynchronous operation.

Basically, folks spilled a lot of blood and energy to make sure that systems could continue to do work while waiting for the disk to reading or writing data. This involves using a lot of event loops, queuing systems, and so forth.

These systems are really cool, the only problem is that it means that we have to be smarter about some aspects of software design and deployment. This doesn't fix the tons of legacy sitting around, or the fact that a lot of tools and programmers are struggling to keep up.

  • We started to build more distributed systems so that any individual spinning disk is responsible for writing/reading less data.

  • We hacked disks themselves to get better performance.

    There are some ways you can eek out a bit of extra performance from spinning disks: namely RAID-10, hardware RAID controllers, and using smaller platters. RAID approaches use multiple drives (4) to provide simple redundancy and roughly double performance. Smaller platters require less movement of the disk arm, and you get a bit more out of the hardware.

    Now, with affordable solid state disks (SSDs,) all of these disk related speed problems are basically moot. So what are the next bottlenecks for computers and performance:

  • Processors. It might be the case that processors are going to be the slow to develop bottleneck. There are a lot of expectations on processors these days: high speed, low power consumption, low temperature, high amount of parallelism (cores and hyperthreading.) But these expectations are necessarily conflicting.

    The main route to innovation is to make the processors themselves smaller, which does increase performance and helps control heat and power consumption, but there is a practical limit to the size of a processor.

    Also, no matter how fast you make the processor, it's irrelevant unless the software is capable of taking advantage of the feature.

  • Software.

    We're still not great at building software with asynchronous components. "Non-blocking" systems do make it easier to have systems that work better with slower disks. Still, we don't have a lot of software that does a great job of using the parallelism of a processor, so it's possible to get some operations that are slow and will remain slow because a single threaded process must grind through a long task and can't share it.

  • Network overhead.

    While I think better software is a huge problem, network throughput could be a huge issue. The internet endpoints (your connection) has gotten much faster in the past few years. That's a good thing, indeed, but there are a number of problems:

  • Transfer speeds aren't keeping up with data growth or data storage, and if that trend continues, we're going to end up with a lot of data that only exists in one physical location, which leads to catastrophic data loss.

    I think we'll get back to a point where moving physical media around will begin to make sense. Again.

  • Wireless data speeds and architectures (particularly 802.11x, but also wide area wireless,) have become ubiquitous, but aren't really sufficient for serious use. The fact that our homes, public places, and even offices (in some cases) aren't wired correctly to be able to provide opportunities to plug in will begin to hurt.

Thoughts? Other bottlenecks? Different reading of the history?

[1]By contrast, software seems like its always getting slower, and while this is partially true, there are additional factors at play, including feature growth, programmer efficiency, and legacy support requirements.
[2]Because developers control, at least to some extent, how everyone uses and understands technology, the constrains on the way they use computers id important to everyone.