Isolation Reading

I spent a few days last week isolating, after attending a larger social event, in a friend’s apartment in a (mostly) unfamiliar neighborhood, and I got to spend those days enjoying a dear friend’s book collection.

I don’t have many paper books left: enough moves and small New York City apartments, combined with a vague personal preference for e-ink, have left me with only about 100 books. But I do sometimes enjoy reading paper books when I’m visiting someone else. My perfect vacation has always been some combination of “drinking too much coffee and reading books,” and given that I’m in kind of an in-between moment job-wise right now, this was actually pretty much perfect.

I started out the week reading Slouching Towards Bethlehem (Joan Didion) and finished it reading the first half of The God of Small Things (Arundhati Roy). It was pretty much everything.


I’ve always been an admirer of Didion, but I’d never read Bethlehem; I’ve long meant to spend a few years trawling luxuriously through her backlist, but hadn’t gotten around to it. The writing is perfect in exactly the austere but precise way that I’ve come to expect. I’m doubly impressed that she was so young when these essays were published.

I had it in my mind that this book was an account of the state of the counterculture in the 60s, and the title essay definitely is that, but having read the entire book over the course of a few days, I’m left with the impression that it’s really a big “why I left New York City in my late 20s” essay, combined with a love letter to California from a returning native child, who remembers “the (really) old California” and what is by now “the (simply) old California.”

The “why I left New York City in my late 20s” story is pretty familiar, and it’s actually nice to see, now 60 years on, that people coming to New York in their 20s and then burning out, or not figuring out how to be in New York sustainably, is a very old story indeed. I’m also, of course, heartened that she returned to the city for the last 25 years of her life. I hope that this proves to be an enduring pattern for my generation as well.

I was also struck by the way that the reflection on (and really, critique of) the counterculture managed to be very early and yet consistent with what a lot of people would say later. The critique itself isn’t particularly surprising to my eyes, but the date is a bit.


The God of Small Things is, of course, lush in all the ways that Slouching is austere, almost to the point of whiplash.

I typically find this sort of lush, non-linear book to be a bit Extra. Lovely, to be sure, but the lushness and non-linearity can so easily distract from the plot or the characters or the impact. Lush, non-linear prose has also started to feel faddish, and, at least for me, like a signifier of a certain kind of academic/“art school” approach to prose. This is not true at all of Small Things: the story directly and explicitly explores childhood memories and trauma in ways that are reflected both in the characters and in the storytelling. It extremely works.

As is, I suppose, the intent, the book and its writing have me thinking a lot about imperialism1 and the history therein, and I think there’s a way that the non-linearity of the storytelling manages to engage the fundamental question:2 “why do people fight for their servitude as if it were their salvation?”

I’m not done yet with the book, but I’m excited to dig in more.


The next book on my friend’s bookshelf that I’m excited by is a collection of Grace Paley stories and essays. I haven’t really started it yet, but I think I will soon.


  1. I wrote this sentence as “post/colonialism,” but I think there are so many layers and intersections here, with echoes and impacts much larger than the history of the British in India, which isn’t (and shouldn’t be!) at the center of the story, despite its outsized and irrefutable impact. ↩︎

  2. In a bit of my own non-linearity, I’ve been working on an essay that plays with this famous quote/question from Deleuze (derived from Reich, derived from Spinoza). The full(ish) quote is, “the fundamental problem of political philosophy is still precisely the one that Spinoza saw so clearly, and that Wilhelm Reich rediscovered: ‘Why do men fight for their servitude as stubbornly as though it were their salvation?’ How can people possibly reach the point of shouting: ‘More taxes! Less bread!’? As Reich remarks, the astonishing thing is not that some people steal or that others occasionally go out on strike, but rather that all those who are starving do not steal as a regular practice, and all those who are exploited are not continually out on strike.” ↩︎

Software Engineering for 2.0

I’ve been thinking about what I do as a software engineer for a while: there seems to be a common thread through the kinds of projects and teams that I’m drawn toward, and I wanted to write a few blog posts on this topic to collect my thoughts and see if these ideas resonate with anyone else.

I’ve never been particularly interested in building new and exciting features. Hackathons have never held any particular appeal, and the things I really enjoy working on are on the spectrum of “stabilize this piece of software,” or “make this service easy to operate,” or “refactor this code to support future development,” and less “design and build some new feature.” Which isn’t to say that I don’t like building new features or writing code, but that I’m more driven by the code and supporting my teammates than I am by the feature.

I think it’s great that I’m different from software engineers who are really focused on the features, because the tension between our interests pushes both classes of software engineer to do great things. Feature development keeps software and products relevant and addresses users' needs. Stabilization work makes projects last and reduces the incidence of failures that distract from feature work, and when there’s consistent attention paid to aligning infrastructure1 work with feature development over the long term, infrastructure engineers can significantly lower the cost of implementing a feature.

The kinds of projects that fall into these categories include the following areas:

  • managing application state and workload in larger distributed contexts. This has involved designing and implementing things like configuration management, deployment processes, queuing systems, and persistence layers.
  • concurrency control patterns and process lifecycle. In programming environments where threads are available, it takes some work to ensure that processes can shut down safely and that errors can be communicated between threads and processes. Providing mechanisms to shut down cleanly, communicate abort signals to worker threads, and handle communication between threads in regular and expected ways is really important. Concurrency is a great tool, but you have to be able to manage it safely, predictably, and in discrete parts of the code (see the sketch after this list).
  • programming model and ergonomic APIs and services. No developer produces a really compelling set of abstractions on the first draft, particularly when they’re focused on delivering different kinds of functionality. The revision and iteration process helps everyone build better software.
  • test infrastructure and improvements. No one thinks tests should take a long time or report results non-deterministically, and yet so many tests do. The challenge is that tests often look good, seem reasonable, and run stably when you write them, but their runtimes compound over time, or orthogonal changes make them slower. Sometimes adding an extra check in some pre-flight test-infrastructure code ends up causing tests that had been just fine, thank you, to become problems. Maintaining and structuring test infrastructure has been a big part of what I’ve ended up doing. Often, working back from the tests, it’s possible to see how a changed interface or an alternate factoring of code would make core components easier to test, and a cleanup pass over the tests on some regular cadence improves things. Faster, more reliable tests make it possible to develop with greater confidence.
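
As a sketch of the concurrency-lifecycle work described above, here’s a minimal Go program (all of the names are hypothetical, not from any particular project) in which a worker exits cleanly either when its queue drains or when an abort signal cancels its context:

    package main

    import (
        "context"
        "fmt"
        "os/signal"
        "syscall"
        "time"
    )

    // worker processes jobs until its context is canceled or the jobs
    // channel closes, then signals a clean shutdown by closing done.
    func worker(ctx context.Context, jobs <-chan int, done chan<- struct{}) {
        defer close(done)
        for {
            select {
            case <-ctx.Done():
                return // abort signal: stop taking work and exit cleanly
            case j, ok := <-jobs:
                if !ok {
                    return // queue drained
                }
                fmt.Println("processed job", j)
            }
        }
    }

    func main() {
        // Cancel the context on SIGINT/SIGTERM so the worker shuts down cleanly.
        ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGINT, syscall.SIGTERM)
        defer stop()

        jobs := make(chan int)
        done := make(chan struct{})
        go worker(ctx, jobs, done)

        for i := 0; i < 3; i++ {
            jobs <- i
        }
        close(jobs)

        // Bound how long we wait for in-progress work to finish.
        select {
        case <-done:
        case <-time.After(time.Second):
            fmt.Println("timed out waiting for worker")
        }
    }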

In practice this has included:

  • changing the build system for a project to produce consistent artifacts, and regularizing the deployment process to avoid problems during deploy.
  • writing a queuing system without any extra service-level dependencies (e.g. building on the project’s existing database infrastructure), and then refactoring (almost) all arbitrary parallel workloads to use the new queuing system.
  • designing and implementing runtime feature flagging systems so operators could toggle features or components on-and-off via configuration options rather than expensive deploys.
  • replacing bespoke implementations with components provided by libraries, or improving implementation quality by replacing components in place, with the goal of making new implementations more testable or performant (or both!).
  • plumbing contexts (e.g. Golang’s service contexts) through codebases to be able to control the lifecycle of concurrent processes.
  • implementing and migrating structured logging systems and building observability systems based on these tools to monitor fleets of application services.
  • refactoring tests to reuse expensive test infrastructure, or using table-driven tests to reduce test duplication (sketched below).
  • managing processes' startup and shutdown code to avoid corrupted states and efficiently terminate and resume in-progress work.
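
To make the table-driven-tests item above concrete, here’s a minimal Go sketch; Clamp is a hypothetical stand-in for whatever function the duplicated tests covered:

    package clamp

    import "testing"

    // Clamp limits v to the range [lo, hi].
    func Clamp(v, lo, hi int) int {
        if v < lo {
            return lo
        }
        if v > hi {
            return hi
        }
        return v
    }

    func TestClamp(t *testing.T) {
        // Each case that would otherwise be its own copy-pasted test
        // becomes one row in the table.
        cases := []struct {
            name      string
            v, lo, hi int
            want      int
        }{
            {"below", -5, 0, 10, 0},
            {"inside", 5, 0, 10, 5},
            {"above", 50, 0, 10, 10},
        }
        for _, tc := range cases {
            t.Run(tc.name, func(t *testing.T) {
                if got := Clamp(tc.v, tc.lo, tc.hi); got != tc.want {
                    t.Errorf("Clamp(%d, %d, %d) = %d, want %d", tc.v, tc.lo, tc.hi, got, tc.want)
                }
            })
        }
    }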

When done well (or just done at all), this kind of work has always paid clear dividends for teams, even when under pressure to produce new features, because the work on the underlying platform reduces the friction for everyone doing work on the codebase.


  1. It’s something of an annoyance that the word “infrastructure” is overloaded: it often refers to the discipline of running software, rather than to the parts of a piece of software that support the execution and implementation of the business logic of user-facing features. Code has and needs infrastructure too, and a lot of the work of providing that infrastructure is also software development, not operational work, though clearly all of these boundaries are somewhat porous. ↩︎

Systems Administrators are the Problem

For years now, the idea of the terrible stack, or the dynamic duo of Terraform and Ansible, from this tweet, has given me a huge amount of joy, basically any time someone mentions either Terraform or Ansible, which happens rather a lot. It’s not exactly that I think Terraform or Ansible are terrible: the configuration management problems that these pieces of software are trying to solve are real, and actually terrible, and having tools that help regularize the problem of configuration management definitely improves things. And yet the tools leave something to be desired.

Why care so much about configuration management?

Configuration matters because every application needs some kind of configuration: a way to connect to a database (or similar), a place to store its output, and inevitably other things, like dependencies, feature flags, and so forth.
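
Even that baseline implies something like the following sketch (in Go; the APP_* environment variable names are invented for illustration), and real applications accumulate many more knobs:

    package main

    import (
        "fmt"
        "os"
    )

    // Config captures the baseline every application seems to need: a
    // database, somewhere to write output, and a feature flag or two.
    type Config struct {
        DatabaseURL string
        OutputDir   string
        EnableBeta  bool
    }

    // fromEnv reads configuration from the environment, with defaults.
    func fromEnv() Config {
        cfg := Config{
            DatabaseURL: os.Getenv("APP_DATABASE_URL"),
            OutputDir:   "/var/lib/app",
            EnableBeta:  os.Getenv("APP_ENABLE_BETA") == "true",
        }
        if dir := os.Getenv("APP_OUTPUT_DIR"); dir != "" {
            cfg.OutputDir = dir
        }
        return cfg
    }

    func main() {
        fmt.Printf("%+v\n", fromEnv())
    }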

And that’s the simple case. While most things are probably roughly simple, it’s very easy to have requirements that go beyond this a bit, and it turns out that while a development team might--but only might--avoid requirements that qualify as “weird,” every organization has something.

As a developer, configuration and deployment often matter a great deal, and it’s pretty common to need to make changes in this area of the code. While it’s possible to architect things so that configuration can be managed within an application (say), this takes longer and isn’t always easy to implement, and if your application requires escalated permissions, or needs a system configuration value set, then it’s easy to get stuck.

And there’s no real way to avoid it: if you don’t have a good way to manage configuration state, then infrastructure becomes bespoke and fragile, and this is bad. Sometimes people suggest using image-based distribution (so-called “immutable infrastructure”), but this tends to be slow (images are large and can take a while to build), and you still have to capture configuration in some way.

But how did we get here?

I think I could weave a really convincing, and likely true, story about the discipline of system administration and software operations in general and its history, but rather than go overboard, I think the following factors are pretty important:

  • computers used to be very expensive and difficult to operate, so it made sense to have people who were primarily responsible for operating them, and this role has more or less persisted forever.
  • service disruptions can be very expensive, so it’s useful for organizations to have people who are responsible for “keeping the lights on” and for troubleshooting operational problems when things go wrong.
  • most computer systems depend on state of some kind--files on disks, the data in databases--and managing that state can be quite delicate.
  • recent trends in computing make it possible to manipulate infrastructure--computers themselves, storage devices, networks--with code, which means we have this unfortunate dualism of infrastructure, where it’s kind of code but also kind of data, and it’s hard to know what the right thing to do is.

Why not just use <xyz>?

This isn’t fair, really--and you know it’s gonna be good when someone trivializes an adjacent problem domain with a question like this--but this is my post, so you must endure it, because the idea that there’s another technology or way of framing the problem that makes this all better is incredibly persistent.

Usually <xyz>, in recent years, has been “Kubernetes” or “docker” or “containers,” but it sort of doesn’t matter; in the past the solutions were platforms-as-a-service (e.g. AppEngine, etc.) or backends-as-a-service (e.g. Parse, etc.). So let’s run down some answers:

  • “bake configuration into the container/virtual machine/etc., and then you won’t have state,” is a good idea, except it means that if you need to change configuration very quickly it becomes quite hard, because you have to rebuild and deploy an image, which can take a long time; and then there are the problems of how you get secrets like credentials into the service.
  • “use a service for your platform needs,” is a good solution, except that it can be pretty inflexible, particularly if you have an application that wasn’t designed for the service, or you need to use some kind of off-the-shelf service or tool (a message bus, a cache, etc.) that wasn’t designed to run in this kind of environment. It’s also the case that the hard cost of using platforms-as-a-service can be pretty high.
  • “serverless” approaches have something of a bootstrapping problem: how do you manage the configuration of the provider? How do you get secrets into the execution units?

What’s so terrible about these tools?

  • The tools can’t decide if configuration should be described programmatically, using general-purpose programming languages and frameworks (e.g. Chef, many deployment tools), or using some kind of declarative structured tool (Puppet, Ansible), or some kind of ungodly hybrid (e.g. Helm, anything with HCL). I’m not sure that there’s a good answer here. I like being able to write code, and I think YAML-based DSLs aren’t great; but capturing configuration in code creates a huge amount of difficult-to-test code. Regardless, you need to find ways of testing that code inexpensively, and doing this in a way that’s useful can be hard (see the sketch after this list).
  • Many tools are opinionated and have strong idioms, in hopes of making infrastructure more regular and easier to reason about. This is cool and a good idea, but it makes it harder to generalize. While concepts like immutability and idempotency are great properties for configuration systems to have, they’re difficult to enforce, and so maybe developing patterns and systems with weaker opinions that are easy to comply with, and idioms that can be applied iteratively, would be more useful.
  • Tools are willing to do things to your systems that you’d never do by hand, including a number of destructive operations (Terraform is particularly guilty of this), which erodes trust and inspires otherwise bored ops folks to write (or recapitulate) their own systems, which is why so many different configuration management tools emerge.
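
For what “testing that code inexpensively” can look like, here’s a hedged Go sketch (all names are hypothetical): render the configuration into a plain data structure with no side effects, and assert properties on it, so the feedback loop never touches real infrastructure:

    package config

    import "testing"

    // Render is a stand-in for whatever turns inputs into a concrete
    // plan: resolved values only, no side effects, cheap to compute.
    func Render(env string) map[string]string {
        cfg := map[string]string{
            "log_level": "info",
            "replicas":  "3",
        }
        if env == "dev" {
            cfg["replicas"] = "1"
        }
        return cfg
    }

    // The test exercises the rendering logic without provisioning
    // anything, which keeps the feedback loop fast and cheap.
    func TestRenderDevIsSmall(t *testing.T) {
        cfg := Render("dev")
        if cfg["replicas"] != "1" {
            t.Errorf("dev replicas = %s, want 1", cfg["replicas"])
        }
    }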

Maybe the tools aren’t actually terrible; maybe the organizational factors that lead to the entrenchment of operations teams (incumbency, incomplete cost analysis, difficult-to-meet stability requirements) also lead to the entrenchment of the kinds of processes that require tools like this (though causality could easily flow in the opposite direction, with the same effect).

API Ergonomics

I touched on the idea of API ergonomics in Values for Collaborative Codebases, but I think the topic is due a bit more exploration. Typically you think about an API as being “safe” or “functionally complete” or “easy to use,” but “ergonomic” is a bit far afield from the standard way that people think and talk about APIs (in my experience).

I think part of the confusion is that “API” gets used in a couple of different contexts, but let’s say that an API here is the collection of nouns (types, structures) and verbs (methods, functions) used to interact with a concept (hardware, library, service). APIs can be conceptually really large (e.g. all of a database, a public service) or quite small, exposing only a few simple methods (e.g. a data serialization library, or some kind of hashing process). I think some of the confusion is that people also use the term API to refer to the ways that services access data (e.g. REST, etc.), and while I have no objection to this formulation, service API design and class or library API design feel like related but different problems.

Ergonomics, then, is really about making choices in the design of an API so that:

  • functionality is discoverable during programming. If you’re writing in a language with good code-completion tools, make sure methods and functions are well located, and named in a way that takes advantage of completion. Chainable APIs are awesome for this (see the sketch after this list).
  • use clear naming for functions and arguments that describe your intent and their use.
  • types should imply semantic intent. If your programming language has a sense of mutability (e.g. passing references versus concrete types in Go, or const (for all its failings) in C++), then make sure you use these markers to both enforce correct behavior and communicate intent.
  • do whatever you can to encourage appropriate use and discourage inappropriate use, by taking advantage of encapsulation features (interfaces, non-exported/private functions, etc.), and by passing data into and out of the API with strongly/explicitly-typed objects (e.g. return POD classes, or enumerated values or similar, rather than numeric or string types).
  • reduce the complexity of the surface area by exporting the smallest reasonable API, and also avoiding ambiguous situations, as with functions that take more than one argument of a given type, which leads to cases where users can easily (and legally) do the wrong thing.
  • increase the safety of the API by removing, reducing, or being explicit about its use of global state. Avoid providing APIs that are not thread-safe. Avoid throwing exceptions (or equivalents) in your API that you expect users to handle. If users pass nil pointers into an API, it’s OK to throw an exception (or let the runtime do it), but there shouldn’t be exceptions that originate in your code that need to be handled outside of it.
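
As a hedged sketch of a few of these ideas in Go (all of the names are hypothetical): a chainable builder behind a small exported surface, with an enumerated type instead of a bare string:

    package report

    // Format is an enumerated type: callers can't pass an arbitrary
    // string where a format is expected.
    type Format int

    const (
        Text Format = iota
        JSON
    )

    // Builder is the small exported surface; the struct behind it is
    // unexported, so the internals can change freely.
    type Builder interface {
        WithTitle(title string) Builder
        WithFormat(f Format) Builder
        Build() (string, error)
    }

    func New() Builder { return &builder{format: Text} }

    type builder struct {
        title  string
        format Format
    }

    // Chainable methods read well and are easy for code completion to surface.
    func (b *builder) WithTitle(title string) Builder { b.title = title; return b }
    func (b *builder) WithFormat(f Format) Builder    { b.format = f; return b }

    func (b *builder) Build() (string, error) {
        if b.format == JSON {
            return `{"title": "` + b.title + `"}`, nil
        }
        return "# " + b.title, nil
    }

A caller writes something like report.New().WithTitle("Q3").WithFormat(report.JSON).Build(): completion surfaces each step, and the Format type makes illegal formats hard to express.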

Ergonomic interfaces feel good to use, but they also improve quality across the ecosystem of connected products.

Common Gotchas

This is a post I wrote a long time ago and never posted, but I’ve started getting back into doing some work in Common Lisp and thought it’d be good to send this one off.

On my recent “(re)learn Common Lisp” journey, I’ve happened across a few things that I’ve found frustrating or confusing. This post is a collection of them, in hopes that other people won’t have to struggle with them:

  • When implementing an existing generic function for a class of your own, if you want other callers to use your method implementation, you must import the generic function; otherwise other callers will (might?) fall back to another method. This makes sense in retrospect, but definitely wasn’t clear on the first go.

  • As a related follow-on: you don’t have to define a generic function in order to write or use a method, and I’ve found that using methods is actually quite nice for doing some type checking. At the same time, it can get you into a pickle if you later add the generic function and it’s not exported/imported as you want.

  • Property lists seem cool for a small, lightweight mapping, but they’re annoying to handle as part of public APIs, mostly because they’re indistinguishable from regular lists. Association lists are preferable, and maybe, with make-hash-table, even hash tables.

  • Declaring data structures inline is particularly gawky. I sometimes want to build a list or an alist in-line, and it’s difficult to do that in a terse way that doesn’t involve building the structure programmatically. I’ve been writing (list (cons "a" t) (cons "b" nil)) sorts of things, which I don’t love.

    You could render this as:

    `(("a" . t) ("b" . nil)) 
    

    Having said that, I’ve always found the back-tick hard to read, so I tend to disprefer it.

  • If you have a variadic macro (i.e. one that takes &rest args), or really any kind of macro, and you have its arguments in a list, there’s no way, outside of eval, to call the macro. This is super annoying, and it makes macros significantly less appealing as parts of public APIs. My current conclusion is that macros are great when you want to add syntax to make the code you’re writing clearer, or to introduce a new paradigm, but for things that could also be a function, or are thin wrappers around a function, just use a function.

How to Choose a Programming Language

I talk to lots of people about software and programming: people who are trying to make a technical decision for a new project or who are interested in learning something new, and some form of the question “what programming language should I learn?” or “what’s the best language for this new project?” comes up a lot.

These are awful questions, because there is no singular right answer, and in some senses all answers are wrong. This post will be an exploration of some decent answers to this question, and some useful ways to think about the differences between programming languages.

  • If you already build and maintain software in one programming language, build new components in the language you already use. Adding new tools and technologies increases the maintenance burden for all engineers, and since software tends to stick around for a long time, so does that cost.

  • Sometimes the software you want to write must target a specific runtime or environment, and there’s really only one reasonable choice. The prototypical examples are things like iOS apps (Swift), Android apps (Kotlin), or things that run in the browser (JavaScript), although:

  • Given things like React Native and Electron, it’s reasonable to just write JavaScript for all GUI code, although in practice this often means TypeScript. While it used to make sense to write GUI code in the various native toolkits, at this point it seems to make sense to just figure out ways of doing it all in JS.

  • If you already know how to program in one language and want to learn something new, but don’t have a specific project in mind, attempt to learn something that’s quite different from what you already know: if you’re comfortable in something like Python, try to learn something like Go or Rust. If you’re primarily a Java programmer, something like JavaScript or Python might be an interesting change of pace.

    The same basic idea applies to selecting languages that will be used by teams: choose a tool that’s complementary to what you’re already doing, and that could provide value.

  • If you’re more familiar with a few programming languages, or don’t feel you need to learn a new language for professional reasons, pick something fun and off the wall: OCaml! Common Lisp! Rust! Haskell! Scheme! Elixir! It doesn’t matter; in these cases you can probably learn new languages when you need them. The point is to learn something that’s radically different, and to help you think about computers and programming in radically different ways.

  • Choose the language that people working on similar projects are already using. For instance, if you’re doing a lot of data science, using Python makes a lot of sense; if you’re writing tools that you expect banks (say) to use, something that runs on the JVM is a good bet. The idea here is you may be able to find more well developed tools and resources relevant to the kinds of problems you encounter.

  • When starting a new project, when there isn’t a lot of prior art in the area where you’re working, or when you want to avoid recapitulating some flaw in the existing tools, you end up having a lot of freedom. In general:

    • Think about concurrency and workload characteristics. Is the workload CPU-, network-, or IO-bound? Is the application multithreaded, or could it take advantage of parallelism within processes? There are different kinds of concurrency and different execution models, so this isn’t always super cut-and-dried: theoretically, languages that have “real threads” (C/C++, Java, Rust, Common Lisp, etc.) or a close-enough approximation (Go) are better, but for workloads that are network-bound, event-driven systems (e.g. Python’s Tornado and Node.js) work admirably.
    • How will you distribute and run the application? Some languages can produce static binaries that include all of their dependencies, which can simplify aspects of the distribution and execution process; but for software where you control the runtime (e.g. services deployed on some kind of container-based platform), this might matter less.
    • Are there strong real-time requirements? If so, and you’re considering a garbage-collected language, make sure that the GC pauses aren’t going to be a problem. It’s also the case that all GCs are not the same, so having a clear idea of what the tolerances are will help you evaluate the options.
    • Is this software going to be maintained by a team, and if so, what kinds of tools will they need in order to succeed and be productive? Would static typing help? What’s the developer tooling and experience like? Are there libraries that you’d expect to need that are conspicuously missing?

Have fun! Build cool things!

Spheres of Alignment

This is a post in my alignment series. See the introductory post Finding Alignment for more context.


I think, in practice, most of what managers do--and indeed all leadership--is about building alignment. The core concept of alignment is having a shared understanding of the problem space and its context, combined with relevant goals and objectives, and a grasp of how the context connects to those objectives. Alignment isn’t just “agreement” or “understanding the solution”; it really centers on this connection between context and goals. Alignment shows up in many different situations and interactions:

  • a small working group (2-4 people) who are building or developing something. The thing can be any kind of work product: a piece of software, documentation, a business process, a marketing campaign, a sales deal. When you have more than one person working on something and they’re not aligned, each person may be able to complete a piece of work as delegated or assigned, but lacks the ability to (reliably) continue on to the next piece of work after finishing a narrow task, or to assess whether a line of work is still germane to the goals as things develop. If we view people’s roles in projects as machines, and they perform assigned tasks well, then alignment isn’t super critical; but if you need people to make decisions and act upon them, then they have to be aligned as a group, or the project runs a huge risk of stalling out as each contributor pulls in a different direction.
  • one person aligning with the rest of their team, to understand how their background and personal goals contribute to and interact with the team’s context and goals. Individuals all bring unique skills and interests that are (hopefully) useful to their teams, and teams (e.g. their leaders) need to understand how to effectively use those skills and interests to support the team’s goals. This happens over conversation and in the context of someone’s participation in a team over time; it doesn’t need to take a lot of time on a regular basis, but it cannot be entirely abandoned.
  • managers need to align their teams with the company’s objectives. This takes the form of making sure that the projects that the team is working on (and will work on in the future,) support the organization and company’s larger goals.
  • at every level, each team needs to align with its peer teams and the organization it belongs to. This is true in organizations with 30 people and 3-4 teams, and in organizations of 2,000 people and dozens of teams.

Alignment is hierarchical, and it’s largely the responsibility of leaders to monitor alignment above and below them, and to understand whether their teams or specific contributors are falling out of alignment. This doesn’t mean that it isn’t participatory and discursive: individuals can impact the direction, goals, or alignment of their teams, but there must be well-formed organizational goals (that they can understand!), and they must be supported by their team in order to actualize in this dimension. And despite alignment being hierarchical, and even though individuals and teams must align upward, building and maintaining alignment in all directions is actually the responsibility of leadership at all levels.

It’s easy to frame this as “you must align with the goals sent from above,” but this couldn’t be further from the truth. Some organizations function like this, but it’s probably not healthy for anyone, because the kinds of alignment it builds are fleeting and tactical. Teams and contributors do need to align with broader goals (up), but their job is not building alignment; their job is building whatever their specialty is. Attending to organizational health and alignment is the concern of leadership, whose work must center on building alignment. At almost every level, the alignment goes both ways: you work with the leaders above you to align your own work and team, and you work with the people you collaborate with and mentor to build alignment.

When it works, even though it takes a while, it helps teams and organizations work really well.

Easy Mode, Hard Mode

I’ve been thinking recently about the way we organize software development projects. I’ve been using this model of “hard mode” vs. “easy mode” a bit, and thought it might be useful to expound upon it.


Often users or business interests come to software developers and make a feature request: “I want it to be possible to do this thing faster, or in combination with another operation, or while avoiding this class of errors.” Sometimes (often!) these are reasonable requests, and sometimes they’re even easy to implement, but sometimes a seemingly innocuous feature or improvement is really hard; sometimes engineering work just requires hard work. This isn’t really a problem, and hard work can be quite interesting. It is, perhaps, an indication of an architectural flaw when many or most easy requests require disproportionately hard work.

It’s also the case that it’s possible to frame the problem in ways that make the work of developing software easier or harder. Breaking problems into smaller constituent problems makes them easier to deal with. Improving the quality of the abstractions and testing infrastructure around a problematic area of code makes it easier to change that area later.

I’ve definitely been on projects where the only way to develop features and make improvements is to have a large amount of experience with the problem domain and the codebase, and where engineers have to spend a lot of concentrated time building features and fighting against the state of the code and its context. This is writing software in “hard mode”: not only is the work harder than it needs to be, but features take longer to develop than users would like. This mode of development makes it very hard to find and retain engineers, because of the long ramping period and the consistently frustrating nature of the work--frustration that’s often compounded by the expectation or assumption that easy requests are easy to produce.

In some ways the theme of my engineering career has been taking “hard mode” projects and reducing the barriers to entry in codebases and projects so that they become “easy mode” projects: changing the organization of the code, adding abstractions that make it easier to develop meaningful features without rippling effects in other parts of the code, improving operational observability to facilitate debugging, and restructuring project infrastructure to reduce development friction. In general, I think of the hallmarks of “easy mode” projects as:

  • abstractions and library functions exist for common tasks. For most pieces of “internet infrastructure” (network-attached services), developers should be able to add behavior without needing to deal with the nitty-gritty of thread pools or socket abstractions (say). If you’re adding a new REST request, you should be able to just write business logic and not need to think about the application’s threading model (say). If something happens often (say, retrying failed requests against an upstream API), you should be able to rely on an existing tool to orchestrate retries (see the sketch after this list).
  • APIs and tools are safe and ergonomic. Developers writing code in your project should be able to call into existing APIs and trust that they behave reasonably and handle errors reasonably. This means that methods should do what they say, and that exported/public interfaces should be difficult to use improperly (e.g. with expected exception handling/safety, as well as thread safety and nil semantics, as appropriate). While it’s useful to interact with external APIs defensively, you can reduce the amount of effort by being less defensive with internal/proximal APIs.
  • well-supported code and operational infrastructure. It should be easy to deploy and test changes to the software, the tests should run quickly, and when there’s a problem there should be a limited number of places you need to look to figure out what’s happening. Making tests more reliable, improving error reporting and tracing, and exposing more information to metrics systems all make the behavior of the system easier to understand in the long term.
  • changes are scoped to support incremental development. While there’s lots of core technical and code-infrastructure work that supports making projects more “easy mode,” a lot of this is about the way that development teams decide to structure projects. This isn’t technical, usually, but has more to do with planning cadences, release cadences, and scoping practices. There are easier and harder ways of making changes, and it’s often worthwhile to ask yourself “could we make this easier?” The answer, I’ve found, is often “yes.”
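
As one concrete instance of the first hallmark, here’s a minimal sketch (in Go, with hypothetical names; not a recommendation of any particular library) of the kind of retry helper a project might provide so that nobody hand-rolls retry loops against upstream APIs:

    package retry

    import (
        "context"
        "time"
    )

    // Do runs op up to attempts times, backing off between tries, and
    // stops early if the context is canceled.
    func Do(ctx context.Context, attempts int, base time.Duration, op func(context.Context) error) error {
        var err error
        for i := 0; i < attempts; i++ {
            if err = op(ctx); err == nil {
                return nil
            }
            // Exponential backoff: base, 2*base, 4*base, ...
            select {
            case <-time.After(base << i):
            case <-ctx.Done():
                return ctx.Err()
            }
        }
        return err
    }

With something like this in the codebase, “retry this request” becomes a one-line call (e.g. retry.Do(ctx, 3, 100*time.Millisecond, fetchUser), with fetchUser standing in for any operation) instead of another bespoke loop.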

Moving a project from hard mode to easy mode is often in large part about investing in managing technical debt, but it’s also a choice: we can prioritize things that make our projects easier, and we can make small changes to the way we approach specific projects that all move them toward being easier. The first step is always that choice.