Towards Faster Emacs Start Times

While I mostly run emacs as a daemon managed by systemd, I like to keep my configuration nimble enough that I can just start emacs and have an ad-hoc session for editing some files.1 Or that’s the goal. This has been a long-running project for me, but early in quarantine times I got things tightened up to the point that I feel comfortable using EDITOR=emacs just about all the time. On my (very old) computers, emacs starts in well under 2 seconds, and does better on faster machines. If your emacs setup takes a long time to start up, this article is for you! Let’s see if we can get the program to start faster.

Measure!

The emacs-init-time function in emacs will tell you how long it took emacs to start up. This tends to understate the actual start time, because some start-up activity gets pushed into various hooks that run after startup. Just to keep an eye on things, I have a variant of the following as the first line in my config file:

(add-hook 'after-init-hook
          (lambda ()
            (message "emacs (%d) started in %s" (emacs-pid) (emacs-init-time))))

It’s also possible to put additional information in this hook (and in reality I use something that pushes to a desktop notification tool, rather than just to message/logging).

You can also lower this number by putting more work into the after-init hook, but that sort of defeats the purpose: if you get the init time down to a second but have 10 seconds of init-hook runtime, that’s probably not much of a win, particularly if the init-hook tasks are blocking.
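For small, non-blocking pieces of setup, though, the hook is a fine place. For example, assuming you use recentf, a sketch of deferring it out of the critical startup path:

```elisp
;; enable recentf after init rather than during startup
(add-hook 'after-init-hook (lambda () (recentf-mode 1)))
```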

The second thing to do is have a macro2 available that allows you to measure the runtime of a block of code, something like:

(defmacro with-timer (name &rest body)
  `(let ((time (current-time)))
     ,@body
     (message "%s: %.06f" ,name (float-time (time-since time)))))

You can then do something like:

(with-timer "mode-line-setup"
  (set-up-mode-line))

And then look in the *Messages* buffer to see how long various things take, so you know where to focus your efforts.

Strategies

  • use-package makes it possible to do much less during application startup: it can load big lisp packages only when you’re going to use them, and registers everything so you never feel the difference. Audit your use-package forms and ensure that they all declare one of the following keywords, which ensures that things are loaded upon use, rather than at start time:
    • :bind registers specific keybindings as autoloads, so that the package is only loaded when you use the keybinding.
    • :commands registers function names as autoloads, like :bind, so the commands/functions are available but again the package isn’t loaded until it’s used.
    • :mode attaches the package to a specific file extension or extensions, so the package doesn’t get loaded until you open a file of that type.
    • :after defers loading a package until another package, which it depends on or is only used with, has loaded.
    • :hook associates something from this package with another package’s hook, with the same effect of deferring loading of one package until the other is used.
  • Avoid having the default *scratch* buffer trigger loading as much as possible. It’s very tempting to have *scratch* default to org-mode or something, but if you keep it in fundamental-mode you can avoid loading most of your packages until you actually use them, and avoid pulling in too much too quickly.
  • Keep an eye on startup for GUI, terminal, and (if you use them) daemon instances of emacs. There are some subtleties that may be useful or important. It’s also the case that terminal startup times are often much shorter than GUI times.
  • Think about strategies for managing state (e.g. recentf, session-mode, and desktop-mode). I’ve definitely had times when I really wanted emacs to be able to restart and leave me exactly where I was when I shut down. These session features are good, but the saved state often gets crufty, and causes a lot of expense at start-up time. I still use desktop and session, but less so.
  • Fancy themes and modelines come at some cost. I was recently able to save a lot of start-up time by omitting a call to spaceline-compile in my configuration. Theme setup can also take a bit of time in GUI mode (I think!), but I dropped automatic theme configuration so that my instances would play better with terminal mode.3
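As an illustration of the deferral keywords above, a use-package declaration might look like the following (assuming markdown-mode and magit are installed; the binding is just an example):

```elisp
;; loaded only when you open a matching file
(use-package markdown-mode
  :mode ("\\.md\\'" . markdown-mode))

;; loaded only the first time you run the binding or command
(use-package magit
  :bind (("C-x g" . magit-status))
  :commands (magit-blame))
```

In both cases emacs registers the autoloads at startup, which is cheap, and pays the real loading cost only on first use.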

Helpful Settings

I think of these four settings as the “start up behavior” settings from my config:

(setq inhibit-startup-echo-area-message "tychoish")
(setq inhibit-startup-message t)
(setq initial-major-mode 'fundamental-mode)
(setq initial-scratch-message nil)

I’ve had the following JIT settings in my config for a long time, and they tend to make things more peppy, which can only help during startup. This may also just be cargo-cult stuff:

(setq jit-lock-stealth-time nil)
(setq jit-lock-defer-time 0.05)
(setq jit-lock-stealth-load 200)

If you use desktop-mode to save state and re-open buffers, excluding some modes from this behavior is good. In my case, avoiding reopening (and rereading) potentially large org-mode buffers was helpful. The desktop-modes-not-to-save value is useful here, as in the following snippet from my configuration:

(add-to-list 'desktop-modes-not-to-save 'dired-mode)
(add-to-list 'desktop-modes-not-to-save 'Info-mode)
(add-to-list 'desktop-modes-not-to-save 'org-mode)
(add-to-list 'desktop-modes-not-to-save 'info-lookup-mode)
(add-to-list 'desktop-modes-not-to-save 'fundamental-mode)

Notes


  1. Ok, ok, I live with a vim user and the “quick startup time” situation is very appealing. ↩︎

  2. I tweaked this based on something I found on stack overflow, which stole it from this listserv post. ↩︎

  3. Mixed GUI/terminal mode with themes can be kind of finicky, because theme settings are global and color backgrounds often won’t play well with terminal frames. You can say “if this is a GUI mode,” but that doesn’t play well with daemons, and hooking into frame creation has never worked as seamlessly as I would expect, because of the execution context of the hook functions (and slowing down frame creation can defeat part of the delight of the daemon). ↩︎

Staff Engineering

In August of 2019 I became a Staff Engineer, which is what a lot of companies are calling their “level above Senior Engineer” role these days. Engineering leveling is a weird beast, which is probably a post unto itself. Despite my odd entry into a career in tech, my path in the last 4 or 5 years has been pretty conventional; however, somehow, despite having an increasingly normal career trajectory, explaining what I do on a day-to-day basis has not gotten easier.

Staff Engineers are important for scaling engineering teams, but lots of teams get by without them, and unlike more junior engineers, who have broadly similar job roles, there are a lot of different ways to be a Staff Engineer, which only muddies things. This post is a reflection on some key aspects of my experience, organized into topics that I hope will be useful for people who may be interested in becoming staff engineers or managing such a person. If you’re also a Staff Engineer and your experience is different, I wouldn’t be particularly surprised.

Staff Engineers Help Teams Build Great Software

Lots of teams function just fine without Staff Engineers, and teams can build great products without having contributors in Staff-type roles. Indeed, because Staff Engineers vary a lot, the utility of having more senior individual contributors on a team depends a lot on the specific engineer and the team in question: finding a good fit is even harder than usual. In general, having senior technical leadership can help teams by:

  • giving people managers more space and time to focus on the team organization, processes, and people. Particularly in small organizations, team managers often pick up technical leadership.
  • providing connections and collaborations between groups and efforts. While almost all senior engineers have a “home team” and are directly involved in a few specific projects, they also tend to have broader scope, and so can help coordinate efforts between different projects and groups.
  • increasing the parallelism of teams, by providing the kind of infrastructure that allows a team to pursue multiple streams of development at one time.
  • supporting the career path and growth of more junior engineers, both through direct mentoring and because additional technical leadership capacity creates opportunities for growth for everyone on the team.

Staff Promotions Reflect Organizational Capacity

Like other promotions, getting promoted to Staff Engineer requires experience and a history of effectiveness, but it is less straightforward than earlier promotions. This is in part because the ways we think about leveling and job roles (i.e. describing the professional activities and capabilities along several dimensions for each level) become complicated when there are lots of different ways to be a Staff Engineer. Pragmatically, these kinds of promotions often depend on other factors:

  • the existence of other Staff Engineers in the organization, which makes it more likely that there’s an easy comparison for a candidate.
  • past experience of managers with getting Staff+ promotions for engineers. Engineering Managers without this kind of experience may have difficulty creating the right kinds of opportunities within their organizations and advocating for these kinds of promotions.
  • organizational maturity and breadth to support the workload of a Staff Engineer: there are ways to partition teams and organizations that preclude some of the kinds of higher level concerns that justify having Staff Engineers, and while having senior technical leadership is often useful, if the organization can’t support it, it won’t happen.
  • teams with a sizable population of more junior engineers, particularly where the team is growing, will have more opportunity and need for Staff Engineers. Teams that are on balance more senior, or that are small and relatively static, tend to have less opportunity for the kind of broadly synthetic work that tends to lead to Staff promotions.

There are also, of course, some kinds of technical achievements and professional characteristics that Staff Engineers often have, and I’m not saying that anyone in the right organizational context can be promoted, exactly. However, without the right kind of organizational support and context, even the most exceptional engineers will never be promoted.

Staff Promotions are Harder to Get Than Equivalent Management Promotions

In many organizations it’s true that Staff promotions are often much harder to get than equivalent promotions to peer-level management positions: the organizational contexts required to support promoting Engineers into management roles are much easier to create, particularly as organizations grow. As you hire more engineers you need more Engineering Managers. There are other factors:

  • managers control promotions, and it’s easier for them to recapitulate their own career paths in their reports than to think about the Staff role, so more Engineers tend to be pushed towards management than Senior IC roles. It’s also probable that meta-managers benefit organizationally from having more front-line managers in their organizations than more senior ICs, which exacerbates this bias.
  • from an output perspective, Senior Engineers can write the code that Staff Engineers would otherwise write, in a way that Engineering Management tends to be difficult to avoid or do without. In other terms, management promotions are often more critical from the organization’s perspective and therefore prioritized over Staff promotions, particularly during growth.
  • cost. Staff Engineers are expensive, often more expensive than managers particularly at the bottom of the brackets, and it’s difficult to imagine that the timing of Staff promotions are not impacted by budgetary requirements.

Promoting a Staff Engineer is Easier than Hiring One

Because there are many valid ways to do the Staff job, and so much of the job is about leveraging context and building broader connections between different projects, people with more organizational experience and history often have an advantage over fresh industry hires. In general:

  • Success as a Staff Engineer in one organization does not necessarily translate to success at another.
  • The conventions within the industry hiring process are good at selecting junior engineers, and there are fewer conventions for more senior roles, which means that candidates are not assessed for skills and experiences relevant to their day-to-day work, while also being penalized for (often) being unexceptional at the kind of problems that junior engineering interviews focus on. While interview processes are imperfect assessment tools in all cases, they’re particularly bad at more senior levels.
  • Senior engineering contributors have the potential for huge impact on product development and engineering outcomes, all of which requires a lot of trust on the part of the organization, and that kind of trust is often easier to build with someone who already has organizational experience.

This isn’t to say that it’s impossible to hire Staff Engineers; I’m just deeply dubious of the hiring process for these kinds of roles, having both interviewed for them and interviewed candidates for them. I’ve also watched more than one senior contributor fail to really get along with a team or other leadership after being hired externally, for reasons that end up making sense in retrospect. It’s really hard.

Staff Engineers Don’t Not Manage

Most companies have a clear distinction between the career trajectories of people involved in “management” and senior “individual contributor” roles (like Staff Engineers), with managers responsible for leading teams and humans, and ICs responsible for technical aspects. This seems really clear on paper but is incredibly messy in practice. The decisions that managers make about team organization and prioritization have necessary technical implications, while it’s difficult to organize larger-scale technical initiatives without awareness of the people and teams. Sometimes Staff Engineers end up doing actual management on a temporary basis in order to fill gaps as organizations change or to cover parental leave.

It’s also the case that a huge part of the job for many Staff Engineers involves direct mentorship of junior engineers, which can include leading specific projects, conversations about career trajectories and growth, as well as conversations about specific technical topics. This has a lot of overlap with management, and that’s fine. The major difference is that senior contributors share responsibility for the people they mentor with their actual managers, and tend to focus mentoring on smaller groups of contributors.

Staff Engineers aren’t (or shouldn’t be!) managers, even when they are involved in broader leadership work, and even if the specific engineer is capable of doing management work: putting ICs in management roles takes time away from their (likely more valuable) technical projects.

Staff Engineers Write Boring and Tedious But Easy Code

While this is perhaps not a universal view, I feel pretty safe in suggesting that Staff Engineers should be directly involved in development projects. While there are lots of ways to be involved in development (technical design, architecture, reviewing code and documents, project planning, and so forth), I think it’s really important that Staff Engineers be involved with code-writing and similar activities. This makes it easy to stay grounded and relevant, and also makes it possible to do a better job at all of the other kinds of engineering work.

Having said that, it’s almost inevitable that the kinds of contribution to the code that you make as a Staff Engineer are not the same kinds of contributions that you make at other points in your career. Your attention is probably pulled in different directions. Where a junior engineer can spend most of their day focusing on a few projects and writing code, Staff Engineers:

  • consult with other teams.
  • mentor other engineers.
  • build long and medium term plans for teams and products.
  • break larger projects apart and design APIs between components.

All of this “other engineering work” takes time, and the broader portfolio of concerns means that more junior engineers often have more time and attention to focus on specific programming tasks. The result is that the kind of code you end up writing tends to be different:

  • fixing problems and bugs in systems that require a lot of context. The bugs are often not very complicated themselves, but understanding the implications of one component with regard to other components can make them difficult.
  • projects to enable future development work, including building infrastructure or designing an interface and connecting an existing implementation to that interface ahead of some larger effort: the kind of “refactor things to make it possible to write a new implementation” work.
  • writing small isolated components to support broader initiatives, such as exposing existing information via new APIs, and building libraries to facilitate connections between different projects or components.
  • projects that support the work of the team as a whole: tools, build and deployment systems, code clean up, performance tuning, test infrastructure, and so forth.

These kinds of projects can amount to rather a lot of development work, but they definitely have their own flavor. As I approached Staff and certainly since, the kind of projects I had attention for definitely shifted. I actually like this kind of work rather a lot, so that’s been quite good for me, but the change is real.

There’s definitely a temptation to give Staff Engineers big projects that they can go off and work on alone, and I’ve seen lots of teams and engineers attempt this: sometimes these projects work out, though more often the successes feel like an exception. There’s no “right kind” of way to write software as a Staff Engineer, and sometimes senior engineers get to work on bigger “core projects.” Having said that, if a Staff Engineer is working on the “other engineering” aspects of the job, there’s just limited time to do big development projects in a reasonable time frame.


New Beginnings: Deciduous Platform

I left my job at MongoDB (8.5 years!) at the beginning of the summer, and started a new job at the beginning of the month. I’ll be writing and posting more about my new gig, career paths in general, reflections on what I accomplished on my old team, the process of interviewing as a software engineer, as well as the profession and industry over time. For now, though, I want to write about one of the things I’ve been working on this summer: making a bunch of the open source libraries that I worked on more generally usable. I’ve been calling this the deciduous platform,1 which now has its own github organization! So it must be real.

The main modification in these forks, aside from adding a few features that had been on my list for a while, has been to update the buildsystem to use go modules2 and rewrite the history of the repository to remove all of the old vendoring. I expect to continue development on some aspects of these over time, though the truth is that these libraries were quite stable and were nearly in maintenance mode anyway.

Background

The team was responsible for a big monolithic application: development had begun in 2013, which was early for Go, and while everything worked, it was a bit weird. My efforts when I joined in 2015 focused mostly on stabilization, architecture, and reliability. While the application worked, mostly, it was clear that it suffered from a few problems, which I believe were the result of originating early in the history of Go. First, because no one had tried to write big applications in Go yet, the patterns weren’t well established, and so the team ended up writing code that worked but was difficult to maintain, with bespoke solutions to a number of generic problems like running workloads in the background or managing APIs. Second, Go’s standard library tends to be really solid, but also tends towards being a little low-level for most day-to-day tasks, so things like logging and process management end up requiring more code3 than is reasonable.

I taught myself to write Go by working on a logging library, and worked on a distributed queue library. One of the things that I realized early was that breaking the application into “microservices” would have been both difficult and of minimal benefit,4 so I went with the approach of creating a well-factored monolith, which included a lot of application-specific work, but also building a small collection of libraries and internal services to provide useful abstractions and separations for application developers and projects.

This allowed for a certain level of focus, both for the team creating the infrastructure, but also for the application itself: the developers working on the application mostly focused on the kind of high level core business logic that you’d expect, while the infrastructure/platform team really focused on these libraries and various integration problems. The focus wasn’t just organizational: the codebases became easier to maintain and features became easier to develop.

This experience has led me to think that architecture decisions may not be well captured by the monolith/microservice dichotomy; rather, there’s this third option that centers on internal architecture, platforms, and the possibility for developer focus and velocity.

Platform Overview

While there are 13 or so repositories in the platform, really there are 4 major libraries: grip, a logging library; jasper, a process management framework; amboy, a (possibly distributed) worker queue; and gimlet, a collection of tools for building HTTP/REST services.

The tools all work pretty well together, and combine to provide an environment where you can focus on writing the business logic for your HTTP services and background tasks, with minimal boilerplate to get it all running. It’s pretty swell, and makes it possible to spin up (or spin out) well factored services with similar internal architectures, and robust internal infrastructure.

I wanted to write a bit about each of the major components, addressing why I think these libraries are compelling and the kinds of features that I’m excited to add in the future.

Grip

Grip is a structured-logging-friendly library, and is broadly similar to other third-party logging systems. There are two main underlying interfaces, representing logging targets (Sender) and messages, as well as a higher-level “journal” interface for use during programming. It’s pretty easy to write new message types or backends, which means you can use grip to capture all kinds of arbitrary messages in a consistent manner, and also send those messages wherever they’re needed.

Internally, it’s quite nice to be able to just send messages to specific log targets, using configuration within an application rather than needing to operationally manage log output. Operations folks shouldn’t be stuck dealing with just managing logs, after all, and it’s quite nice to just send data directly to Splunk or Sumologic. We also used the same grip fundamentals to send notifications and alerts to Slack channels, email lists, or even to create Jira Issues, minimizing the amount of clunky integration code.

There are some pretty cool projects in and around grip:

  • support for additional logging targets. The deciduous version of grip adds twitter as an output target as well as creating desktop notifications (e.g. growl/libnotify), but I think it would also be interesting to add fluent/logstash connections that don’t have to transit via standard error.
  • While structured logging is great, I noticed that we ended up logging messages automatically in the background as a method of metrics collection. It would be cool to add some kind of “intercepting sender” that handled some of these structured metrics, and was able to expose this data in a format that the conventional tools these days (Prometheus, and others) can handle. Some of this code would clearly need to be in grip, and other aspects clearly fall into other tools/libraries.
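As a rough, self-contained sketch of that two-interface design (the names here are illustrative, not grip’s actual API):

```go
package main

import "fmt"

// Composer is a lazily-rendered log message.
type Composer interface {
	String() string
	Loggable() bool // whether the message should be sent at all
}

// Sender is a logging target: standard error, a file, Splunk, Slack, etc.
type Sender interface {
	Send(Composer)
}

// stringMessage is a trivial Composer implementation.
type stringMessage string

func (m stringMessage) String() string { return string(m) }
func (m stringMessage) Loggable() bool { return len(m) > 0 }

// consoleSender writes messages to standard output with a prefix.
type consoleSender struct{ prefix string }

func (s consoleSender) Send(m Composer) {
	if m.Loggable() {
		fmt.Println(s.prefix + m.String())
	}
}

// memorySender captures messages, e.g. for testing or buffering.
type memorySender struct{ messages []string }

func (s *memorySender) Send(m Composer) {
	if m.Loggable() {
		s.messages = append(s.messages, m.String())
	}
}

// Log sends one message to every configured target.
func Log(msg Composer, targets ...Sender) {
	for _, t := range targets {
		t.Send(msg)
	}
}

func main() {
	Log(stringMessage("hello"), consoleSender{prefix: "app: "})
}
```

The useful property is that application code only ever talks to Log (or a journal-style wrapper), while the set of targets is pure configuration.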

Amboy

Amboy is an interface for doing things with queues. The interfaces are simple, and you have:

  • a queue that has some way of storing and dispatching jobs.
  • implementations of jobs, which are responsible for executing your business logic; with a base implementation that you can compose into your job types, all you really need to implement is a Run() method.
  • a queue “group” which provides a higher level abstraction on top of queues to support segregating workflows/queues in a single system to improve quality of service. Group queues function like other queues but can be automatically managed by the processes.
  • a runner/pool implementation that provides the actual thread pool.

There’s a type registry for job implementations and versioning in the schema for jobs so that you can safely round-trip a job between machines and update the implementation safely without ensuring the queue is empty.

This turns out to be incredibly powerful for managing background and asynchronous work in applications. The package includes a number of in-memory queues for managing workloads in ephemeral utilities, as well as a distributed MongoDB-backed queue for running multiple copies of an application with a shared queue (or queues). There’s also a layer of management tools for introspecting and managing the state of jobs.
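A minimal, illustrative version of that job/queue contract (not amboy’s real API) might look like:

```go
package main

import (
	"fmt"
	"sync"
)

// Job is the unit of work; implementations carry the business logic.
type Job interface {
	ID() string
	Run()
}

// funcJob adapts a closure to the Job interface.
type funcJob struct {
	id string
	fn func()
}

func (f funcJob) ID() string { return f.id }
func (f funcJob) Run()       { f.fn() }

// Queue stores jobs and dispatches them to a fixed-size worker pool.
type Queue struct {
	jobs chan Job
	wg   sync.WaitGroup
}

func NewQueue(workers, capacity int) *Queue {
	q := &Queue{jobs: make(chan Job, capacity)}
	for i := 0; i < workers; i++ {
		q.wg.Add(1)
		go func() {
			defer q.wg.Done()
			for j := range q.jobs {
				j.Run()
			}
		}()
	}
	return q
}

// Put enqueues a job for execution.
func (q *Queue) Put(j Job) { q.jobs <- j }

// Close stops accepting work and waits for in-flight jobs to finish.
func (q *Queue) Close() {
	close(q.jobs)
	q.wg.Wait()
}

func main() {
	q := NewQueue(2, 10)
	q.Put(funcJob{id: "greet", fn: func() { fmt.Println("ran: greet") }})
	q.Close()
}
```

The real library layers type registration, serialization, and pluggable storage backends on top of this shape, so jobs can round-trip between processes rather than living in a single process’s channel.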

While Amboy is quite stable, there is a small collection of work that I’m interested in:

  • a queue implementation that stores jobs in a local Badger database on disk, to provide single-system restartability for jobs.
  • a queue implementation that stores jobs in PostgreSQL, mirroring the MongoDB job functionality, to support more backend options.
  • queue implementations that use messaging systems (Kafka, AMQP) as backends. There exists an SQS implementation, but all of these systems have less strict semantics for process restarts than the database options, and the databases can easily handle on the order of a hundred thousand jobs an hour.
  • changes to the queue API to remove a few legacy methods that return channels instead of iterators.
  • improve the semantics for closing a queue.

While Amboy has provisions for building architectures with workers running on multiple processes, rather than having queues running multiple threads within the same process, it would be interesting to develop more fully-fledged examples of this.

Jasper

Jasper provides a high-level set of tools for managing subprocesses in Go, adding a highly ergonomic API (in Go), as well as exposing process management as a service to facilitate running processes on remote machines. Jasper also manages and tracks the state of running processes, which reduces pressure on calling code to track the state of processes.
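The process-tracking idea can be sketched with os/exec (again, illustrative names rather than Jasper’s real interface):

```go
package main

import (
	"fmt"
	"os/exec"
	"sync"
)

// Manager tracks the processes it starts, so calling code doesn't have to.
type Manager struct {
	mu    sync.Mutex
	procs map[string]*exec.Cmd
}

func NewManager() *Manager {
	return &Manager{procs: make(map[string]*exec.Cmd)}
}

// Start launches a command and registers it under an ID.
func (m *Manager) Start(id, name string, args ...string) error {
	cmd := exec.Command(name, args...)
	if err := cmd.Start(); err != nil {
		return err
	}
	m.mu.Lock()
	m.procs[id] = cmd
	m.mu.Unlock()
	return nil
}

// Wait blocks until the identified process exits.
func (m *Manager) Wait(id string) error {
	m.mu.Lock()
	cmd, ok := m.procs[id]
	m.mu.Unlock()
	if !ok {
		return fmt.Errorf("no process with id %q", id)
	}
	return cmd.Wait()
}

// Running reports how many processes the manager is tracking.
func (m *Manager) Running() int {
	m.mu.Lock()
	defer m.mu.Unlock()
	return len(m.procs)
}

func main() {
	m := NewManager()
	if err := m.Start("hello", "echo", "hello"); err != nil {
		fmt.Println("start failed:", err)
		return
	}
	m.Wait("hello")
	fmt.Println("tracked processes:", m.Running())
}
```

The interesting step beyond this sketch is putting the Manager behind a service boundary (REST, gRPC), so the same interface can manage processes on remote machines.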

The package currently exposes Jasper services over REST, gRPC, and MongoDB’s wire protocol, and there is also code to support using SSH as a transport so that you don’t need to expose these remote services publicly.

Jasper is, perhaps, the most stable of the libraries, but I am interested in thinking about a couple of extensions:

  • using jasper as PID 1 within a container to be able to orchestrate workloads running in containers, with (some) support for lower-level container orchestration.
  • write configuration file-based tools for using jasper to orchestrate buildsystems and distributed test orchestration.

I’m also interested in cleaning up some of the MongoDB-specific code (i.e. the code that downloads MongoDB versions for use in test harnesses), and perhaps re-envisioning that as client code that uses Jasper rather than as a part of Jasper.

Gimlet

I’ve written about gimlet here before, when I started the project, and it remains a pretty useful and ergonomic way to define and register HTTP APIs. In the past few years, it’s grown to add more authentication features, as well as a new “framework” for defining routes. This makes it possible to define routes by implementing an interface that:

  • makes it very easy to produce paginated routes, and provides some helpers for managing content.
  • separates the parsing of inputs from executing the results, which can make route definitions easy to test without integration tests.
  • could eventually be rehomed on top of the chi router. The current implementation uses Negroni and gorilla mux (though neither is exposed in the interface), but I think it’d be nice to have this be optional, and chi looks pretty nice.
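The parse/execute separation described above can be sketched roughly like this (hypothetical names, not gimlet’s actual interface):

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// RouteHandler separates reading the request from producing the response,
// so the Execute logic can be unit tested without any HTTP machinery.
type RouteHandler interface {
	Parse(r *http.Request) error
	Execute() (interface{}, error)
}

// greetRoute is a toy route: ?name=... -> {"greeting": "hello, ..."}
type greetRoute struct{ name string }

func (g *greetRoute) Parse(r *http.Request) error {
	g.name = r.URL.Query().Get("name")
	if g.name == "" {
		return fmt.Errorf("missing name parameter")
	}
	return nil
}

func (g *greetRoute) Execute() (interface{}, error) {
	return map[string]string{"greeting": "hello, " + g.name}, nil
}

// handle adapts a RouteHandler to net/http; a real framework would
// construct a fresh handler per request (i.e. a factory).
func handle(h RouteHandler) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if err := h.Parse(r); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		out, err := h.Execute()
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		json.NewEncoder(w).Encode(out)
	}
}

func main() {
	// Execute can be exercised directly, with no server involved:
	g := &greetRoute{name: "world"}
	out, _ := g.Execute()
	fmt.Println(out)
	_ = handle // wire into an http.ServeMux in a real service
}
```

Because Parse and Execute are independent, route tests can construct the handler struct directly and assert on Execute’s return value, leaving the HTTP plumbing to a handful of adapter tests.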

Other Great Tools

The following libraries are definitely smaller, but I think they’re really cool:

  • birch is a builder for programmatically building BSON documents and MongoDB’s extended JSON format. It’s built upon an earlier version of the BSON library. While it’s unlikely to be as fast at scale, for many operations (like finding a key in a document) the interface is great for constructing payloads.
  • ftdc provides a way to generate (and read,) MongoDB’s diagnostic data format, which is a highly compressed timeseries data format. While this implementation could drift from the internal implementation over time, the format and tool remain useful for arbitrary timeseries data.
  • certdepot provides a way to manage a certificate authority with the certificates stored in a centralized store. I’d like to add other storage backends over time.

And more

Notes


  1. My old team built a continuous integration tool called evergreen, which is itself a pun (using “green” to indicate passing builds; most CI systems are not ever-green). Many of the tools and libraries that we built got names with tree puns, and somehow “deciduous” seemed like the right plan. ↩︎

  2. For arcane reasons, all of these tools had to build with an old version of Go (1.10) that didn’t support modules, so we had an annoying vendoring solution that wasn’t compatible with modules. ↩︎

  3. Go tends to be a pretty verbose language, and I think most of the time this creates clarity; however, for common tasks it has the feeling of offering a poor abstraction, or forcing you to write duplicated code. While I don’t believe that more-terse code is better, I think there’s a point where the extra verbosity for rote operations just creates the possibility for more errors. ↩︎

  4. The team was small, and as an internal tools team, unlikely to grow to the size where microservices offered any kind of engineering efficiency (at some cost,) and there weren’t significant technical gains that we could take advantage of: the services of the application didn’t need to be globally distributed and the boundaries between components didn’t need to scale independently. ↩︎

Tycho Emacs Config Kit

So I made a thing, or at least, I took a thing I’ve built and made it presentable for other people. I’m talking, of course, about my Emacs configuration.

Check it out at github.com/tychoish/.emacs.d! The README is pretty complete, and giving it a whirl is simple; just do something like:

mv ~/.emacs.d/ ~/emacs.d-archive
git clone --recurse-submodules git@github.com:tychoish/.emacs.d.git ~/.emacs.d/

If you’re using Emacs 27, you might also be able to clone it into ~/.config/emacs, for similar effect. It doesn’t matter much to me. Let me know what you think!


The tl;dr of “what’s special about this config” is that it has:

  • a good set of defaults out of the box.
  • a lot of great whiz-bang features enabled and configured.
  • very quick start times, thanks to lazy-loading of libraries.
  • great support for running as a daemon, and even running multiple daemons at once.

I’ve sporadically produced copies of my emacs config for other folks to use, but those were always ad hoc and difficult to reproduce and maintain. I’ve been working on and off to put more polish on things, so making the config usable for other people seemed like a natural step at this point.

I hope other people find it useful, but also I think it’s a good exercise for me in keeping things well maintained, and it doesn’t bother me much one way or another.

Discussion

I think a lot about developer experience: I’ve spent my career, thus far, working on infrastructure products (databases, CI systems, build tools, release engineering,) that help improve developer experience and productivity, and even my day-to-day work tends to focus mostly on problems that impact the way that my teams write software: system architecture, service construction, observability, test infrastructure, and similar.

No matter how you slice it, the editor is a huge factor in the developer experience, and can have a big impact on productivity, and there are so many different editors with wildly different strengths. Editor experience is also really hard to improve: unlike other kinds of developer infrastructure (like build systems, or even programming languages,) the field conceptualizes editors as personal: your choice of editor is seen to reflect on you (it probably doesn’t), and your ability to use your editor well is seen as a reflection of your skills as a developer (it isn’t). It’s also the case that because editors are so personal, it’s very difficult to produce an editor with broad appeal, and the most successful editors tend to be very configurable and can easily lack good defaults, which means that even great editors are terrible out of the box, which mostly affects would-be and new developers.

Nevertheless, being able to have an editor that you’re comfortable with and that you can use effectively without friction does make it easier to build software, so even if folks often conceptualize editors in the wrong way, improving the editing experience is important. I think that there are two areas for improvement:

  • editor configurations should--to some extent--be maintained at the project (or metaproject) level, rather than at the level of the individual engineer. The hard part here is how to balance individual preference with providing a consistent experience, particularly for new developers.1
  • there should be more “starter kits” (like this one! or the many other starter kits, but also for (neo)vim, vscode, and others) that help bootstrap the experience, while also figuring out ways to allow layering other project-based extensions on top of a base configuration.

Also, I want to give chemacs a shout out, for folks who want to try out other base configurations.


  1. There are two kinds of new developers with different experiences but some overlap: folks who have development experience in general but no experience with a specific project, and folks who are new to all development. ↩︎

Running Emacs

OK, this is a weird post, but after reading a post about running emacs with systemd,1 I’ve realized that my take on how I run and manage processes is a bit unique, and worth enumerating and describing. The short version is that I regularly run multiple instances of emacs, under systemd, in daemon mode, and it’s pretty swell. Here’s the rundown:

  • On Linux systems, I use a build of emacs with the lucid toolkit, rather than GTK, because of this bug with GTK, which I should write more about at some point. Basically, if the X session crashes with GTK emacs, even if you don’t have windows open, the daemon will crash, even if the daemon isn’t started from a GUI session. The lucid toolkit doesn’t have this problem.

  • I run the process under my user’s systemd instance, rather than under the PID 1 systemd instance. I like keeping things separate. Run the following command to ensure that your user’s systemd starts at boot rather than at login:

    sudo loginctl enable-linger $(whoami)
    
  • I have a systemd service file named emacs@.service in my ~/.config/systemd/user/ directory that looks like this:

    [Unit]
    Description=Emacs-tychoish: the extensible, self-documenting text editor.
    
    [Service]
    Type=forking
    ExecStart=/usr/bin/emacs --daemon=%i --chdir %h
    ExecStop=/usr/bin/emacsclient --server-file=%i --eval "(progn (setq kill-emacs-hook 'nil) (kill-emacs))"
    Restart=always
    TimeoutStartSec=0
    
    [Install]
    WantedBy=default.target
    

    I then start emacs daemons:

    systemctl --user start emacs@work
    systemctl --user start emacs@personal
    systemctl --user start emacs@chat
    

    To enable them so that they start at boot:

    systemctl --user enable emacs@work
    systemctl --user enable emacs@personal
    systemctl --user enable emacs@chat
    
    Though to be honest, I use different names for daemons.
    
  • I have some amount of daemon-specific code, which might be useful:

    (setq server-use-tcp t)
    
    (if (equal (daemonp) nil)
        (setq tychoish-emacs-identifier "solo")
      (setq tychoish-emacs-identifier (daemonp)))
    
    ;; this makes erc configs work less noisily. There's probably no harm in
    ;; turning down the logging
    (if (equal (daemonp) "chat")
        (setq gnutls-log-level 0)
      (setq gnutls-log-level 1))
    
    (let ((csname (if (eq (daemonp) nil)
                      "generic"
                    (daemonp))))
      (setq recentf-save-file (concat user-emacs-directory system-name "-" csname "-recentf"))
      (setq session-save-file (concat user-emacs-directory system-name "-" csname "-session"))
      (setq bookmark-default-file (concat user-emacs-directory system-name "-" csname "-bookmarks"))
      (setq helm-c-adaptive-history-file (concat user-emacs-directory system-name "-" csname "-helm-c-adaptive-history"))
      (setq desktop-base-file-name (concat system-name "-" csname "-desktop-file"))
      (setq desktop-base-lock-name (concat system-name "-" csname "-desktop-lock")))
    

    Basically this just sets up some session-specific information to be saved to different files, to avoid colliding per-instance.
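    The repeated concat calls could be factored into a small helper. This is just a sketch of mine, not part of the actual config; the tychoish-session-file name is a placeholder:

```elisp
;; Sketch only: a hypothetical helper that builds the same
;; "<host>-<daemon>-<suffix>" paths as the setq calls above.
(defun tychoish-session-file (suffix)
  (concat user-emacs-directory system-name "-"
          (or (daemonp) "generic") "-" suffix))

(setq recentf-save-file (tychoish-session-file "recentf")
      bookmark-default-file (tychoish-session-file "bookmarks"))
```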

  • Additionally, I use the tychoish-emacs-identifier from above to provide some contextual information about which emacs daemon/window I’m currently in:

    (setq frame-title-format '(:eval (concat tychoish-emacs-identifier ":" (buffer-name))))
    
    (spaceline-define-segment daemon tychoish-emacs-identifier)
    (spaceline-emacs-theme 'daemon 'word-count)
    

    Also, on the topic of configuration, I do have a switch statement that loads different mu4e configurations in different daemons.
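    That dispatch can be sketched roughly like this; the tychoish-setup-*-mail function names are placeholders, not the real config:

```elisp
;; Sketch: choose a mu4e configuration based on the daemon name.
;; The setup functions here are hypothetical placeholders.
(cond ((equal (daemonp) "work") (tychoish-setup-work-mail))
      ((equal (daemonp) "personal") (tychoish-setup-personal-mail))
      (t nil)) ;; other sessions get no mu4e setup
```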

  • To start emacs sessions, I use operations in the following forms:

    # create a new emacs frame. Filename optional.
    emacsclient --server-file=<name> --create-frame --no-wait <filename>
    
    # open a file in an existing (last-focus) frame/window. Filename required.
    emacsclient --server-file=<name> --no-wait <filename>
    
    # open a terminal emacs mode. Filename optional.
    emacsclient --server-file=<name> --tty <filename>
    

    That’s a lot to type, so I use aliases in my shell profile:

    alias e='emacsclient --server-file=personal --no-wait'
    alias ew='emacsclient --server-file=personal --create-frame --no-wait'
    alias et='emacsclient --server-file=personal --tty'
    

    I create a set of aliases for each daemon prefixing e/ew/et with the first letter of the daemon name.
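    That alias generation can be sketched as a loop in a bash profile; the daemon names here (work/personal/chat) are examples, not a prescription:

```shell
# Sketch: define e/ew/et-style aliases per daemon, prefixed with the
# first letter of the daemon name (e.g. we/wew/wet for "work").
# Daemon names are illustrative.
for daemon in work personal chat; do
  prefix=$(printf '%s' "$daemon" | cut -c1)
  alias "${prefix}e=emacsclient --server-file=${daemon} --no-wait"
  alias "${prefix}ew=emacsclient --server-file=${daemon} --create-frame --no-wait"
  alias "${prefix}et=emacsclient --server-file=${daemon} --tty"
done
```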

And that’s about it. When I’ve used OS X, I’ve managed something similar using launchd, but the configuration files are a bit less elegant. On OS X, I tend to install emacs with the Cocoa toolkit, using homebrew.

Using multiple daemons is cool, though not required, for a number of reasons:

  • you can have good separation between personal things and professional/work things, which is always nice, but particularly gratifying during the pandemic when it’s easy to work forever.
  • Maintaining separation of email. While mu4e has profiles and contexts, and that’s great, I like a firmer boundary, and being able to maintain separate email databases.
  • Running emacs lisp applications that do a lot of networking, or other blocking operations. The main case where this matters in my experience is running a big erc instance, or something else that isn’t easily broken into async/subprocesses.

  1. And commenting! ↩︎

Emacs and LSP Mode

I’ve been using emacs for a long time, and it’s mostly pretty unexceptional. A few months ago, I started using LSP mode, and it’s been pretty great… mostly. This post reflects on some of the dustier corners, and some of the broader implications.

History and Conflict

Language Server is a cool idea that (as far as I know) grew out of VS Code, and provides a way to give a text editor “advanced, IDE-like” features without having to implement those features per-language in the editor. Instead, editor developers implement generic integration for the protocol, and language developers produce a single process that can process source code and support the editor features. Sounds great!

There are challenges:

  • The LSP protocol uses JSON for encoding data between the server and emacs. This is a reasonable choice, and makes development easy, but it’s not a perfect fit. Until version 27, which hasn’t reached an official release yet (but is quite usable on Linux), emacs parsed JSON in lisp rather than C, which made everything rather slow; emacs 27 helps a lot.

  • There is some overhead for the lsp, particularly as you attempt to scale to very large numbers of files or very large files. Emacs is really good with large files, an artifact of having had to support editing larger files on computers 20 or 30 years ago, and lsp feels sluggish in these situations.

    I think some of this overhead probably depends on the implementation of the language servers themselves, and some percentage is probably the protocol itself: http-style statelessness is great in big distributed web services, but for the local use cases where responsiveness matters, it’s less good. Similarly, JSON’s transparency and ubiquity are great, but for streaming-type use cases, perhaps less so.

I’ve not poked around the protocol, or any implementations, extensively, so I’m very prepared to be wrong about these points.

There’s also some tension about moving features, like editor support for a language, out of lisp and into an external process. I’m broadly sympathetic: there are a bunch of high-quality language integrations in emacs lisp, and for basic things, the emacs lisp ones might be faster. Having said that, there’s a lot of interface inconsistency between the support for different languages in emacs, and LSP support feels spiffy by comparison. Also, at least for python (jedi,) go (gocode,) C/C++ (irony,) and lisp (slime,) the non-lsp services were already using external processes for some portion of their functionality.

It’s also the case that if people spend their emacs-lisp-writing time doing things other than writing support for various development environments it’s all for the better anyway.

Suggestions / Notes

  • Things will break or get wedged; call lsp-restart-workspace to restart the server (I think). It helps.
  • use lsp-file-watch-ignored to avoid having the language server process files from vendored code, intermediate/generated code, or similar, in cases where you might have a large number of files that aren’t actually material to your development work.
  • I believe some (many? all?) servers are sourced from the PATH (e.g. clang/llvm, gopls), which means you are responsible for updating them and their dependencies.
  • I’ve found it useful to update lsp-mode itself more often than I typically update other emacs packages. I also try and keep the language servers as up to date as possible in coordination with lsp-mode updates.
  • If Emacs 27 can work on your system, it’s noticeably faster. Do that.
  • If you have an existing setup for a language environment, the LSP features end up layering over existing functionality, and that can have unexpected results.
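The file-watch suggestion above might look like this in practice; the directory patterns are illustrative (and note that newer lsp-mode versions split this setting into lsp-file-watch-ignored-directories and lsp-file-watch-ignored-files):

```elisp
;; Sketch: ignore vendored and generated trees so the server doesn't
;; watch files that aren't material to development. Patterns are
;; examples; adjust for your projects.
(with-eval-after-load 'lsp-mode
  (add-to-list 'lsp-file-watch-ignored "[/\\\\]vendor\\'")
  (add-to-list 'lsp-file-watch-ignored "[/\\\\]node_modules\\'"))
```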

My Configuration

None of this is particularly exciting, but if you use use-package, then the following might be an interesting starting point. I stole a lot of this from other places, so shrug.

I must say that I don’t really use dap or treemacs, because I tend to keep things pretty barebones.

(use-package helm-lsp
  :ensure t
  :after (lsp-mode)
  :commands (helm-lsp-workspace-symbol)
  :init (define-key lsp-mode-map [remap xref-find-apropos] #'helm-lsp-workspace-symbol))

(use-package lsp-mode
  :diminish (lsp-mode . "lsp")
  :bind (:map lsp-mode-map
              ("C-c C-d" . lsp-describe-thing-at-point))
  :hook ((python-mode . lsp-deferred)
         (js-mode . lsp-deferred)
         (go-mode . lsp-deferred))
  :init
  (setq lsp-auto-guess-root t          ; Detect project root
        lsp-log-io nil
        lsp-enable-indentation t
        lsp-enable-imenu t
        lsp-keymap-prefix "C-l"
        lsp-file-watch-threshold 500
        lsp-prefer-flymake nil)        ; Use lsp-ui and flycheck

  (defun lsp-on-save-operation ()
    (when (or (boundp 'lsp-mode)
              (boundp 'lsp-deferred))
      (lsp-organize-imports)
      (lsp-format-buffer))))

(use-package lsp-clients
  :ensure nil
  :after (lsp-mode)
  :init (setq lsp-clients-python-library-directories '("/usr/local/" "/usr/")))

(use-package lsp-ui
  :ensure t
  :after (lsp-mode)
  :commands lsp-ui-doc-hide
  :bind (:map lsp-ui-mode-map
              ([remap xref-find-definitions] . lsp-ui-peek-find-definitions)
              ([remap xref-find-references] . lsp-ui-peek-find-references)
              ("C-c u" . lsp-ui-imenu))
  :init (setq lsp-ui-doc-enable t
              lsp-ui-doc-use-webkit nil
              lsp-ui-doc-header nil
              lsp-ui-doc-delay 0.2
              lsp-ui-doc-include-signature t
              lsp-ui-doc-alignment 'at-point
              lsp-ui-doc-use-childframe nil
              lsp-ui-doc-border (face-foreground 'default)
              lsp-ui-peek-enable t
              lsp-ui-peek-show-directory t
              lsp-ui-sideline-update-mode 'line
              lsp-ui-sideline-enable t
              lsp-ui-sideline-show-code-actions t
              lsp-ui-sideline-show-hover nil
              lsp-ui-sideline-ignore-duplicate t)
  :config
  (add-to-list 'lsp-ui-doc-frame-parameters '(right-fringe . 8))

  ;; `C-g' to close doc
  (advice-add #'keyboard-quit :before #'lsp-ui-doc-hide)

  ;; Reset `lsp-ui-doc-background' after loading theme
  (add-hook 'after-load-theme-hook
            (lambda ()
              (setq lsp-ui-doc-border (face-foreground 'default))
              (set-face-background 'lsp-ui-doc-background
                                   (face-background 'tooltip))))

  ;; WORKAROUND Hide mode-line of the lsp-ui-imenu buffer
  ;; @see https://github.com/emacs-lsp/lsp-ui/issues/243
  (defadvice lsp-ui-imenu (after hide-lsp-ui-imenu-mode-line activate)
    (setq mode-line-format nil)))

;; Debug
(use-package dap-mode
  :diminish dap-mode
  :ensure t
  :after (lsp-mode)
  :functions dap-hydra/nil
  :bind (:map lsp-mode-map
              ("<f5>" . dap-debug)
              ("M-<f5>" . dap-hydra))
  :hook ((dap-mode . dap-ui-mode)
         (dap-session-created . (lambda (&rest _) (dap-hydra)))
         (dap-terminated . (lambda (&rest _) (dap-hydra/nil)))))

(use-package lsp-treemacs
  :after (lsp-mode treemacs)
  :ensure t
  :commands lsp-treemacs-errors-list
  :bind (:map lsp-mode-map
              ("M-9" . lsp-treemacs-errors-list)))

(use-package treemacs
  :ensure t
  :commands (treemacs)
  :after (lsp-mode))

Common Lisp Grip, Project Updates, and Progress

Last week, I did a release, I guess, of cl-grip, which is a logging library that I wrote after reflecting on common lisp logging earlier. I wanted to write up some notes about it that aren’t covered in the read me, and also talk a little bit about what else I’m working on.

cl-grip

This is a really fun and useful project, and it was the right size for me to really get into and practice a bunch of different areas (packages! threads! testing!), and I think it’s useful to boot. The read me is pretty comprehensive, but I thought I’d collect some additional color here:

Really, at its core, cl-grip isn’t a logging library; it’s a collection of interfaces that make it easy to write logging and messaging tools, which is a really cool basis for an idea. (I’ve been working on and with a similar system in Go for years.)
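To illustrate the shape of that idea (this is purely illustrative and not cl-grip’s actual API), an interface-first logger might look like a generic function that new output targets specialize:

```lisp
;; Illustrative only -- not cl-grip's real API. The point is that
;; "logging" reduces to a protocol that output targets implement.
(defgeneric send-message (sender message)
  (:documentation "Deliver MESSAGE via SENDER."))

(defclass stream-sender ()
  ((out :initarg :out :initform *standard-output* :reader sender-out)))

(defmethod send-message ((sender stream-sender) message)
  (format (sender-out sender) "~&~a~%" message))
```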

As a result, there are interfaces and plumbing for doing most logging-related things, but no actual implementations. I was very excited to leave out the “log rotation handling feature,” which feels like an anachronism at this point, though it’d be easy enough to add that kind of handler in if needed. Although I’m going to let it stew for a little while, I’m excited to expand upon it in the future:

  • additional message types, including capturing stack frames for debugging, or system information for monitoring.
  • being able to connect and send messages directly to likely targets, including systemd’s journal and splunk collectors.
  • a collection of more absurd output targets to cover “alerting” type workloads, like desktop notifications, SUMP, and Slack targets.

I’m also excited to see if other people are interested in using it. I’ve submitted it to Quicklisp and Ultralisp, so give it a whirl!

See the cl-grip repo on github.

Eggqulibrium

At the behest of a friend, I’ve been working on an “egg equilibrium” solver, the idea being to provide a tool that, given a bunch of recipes that use partial eggs (yolks and whites), can provide optimal solutions that use a fixed set of eggs.

So far I’ve implemented some prototypes that, given a number of egg parts, collect recipes until there are no partial eggs in use, so that there are no leftovers. I’ve also implemented the “if I have these partial eggs, what can I make to use them all” case. I’ve also implemented a rudimentary CLI interface (that was a trip!) and a really simple interface to recipe data (both parsing from a CSV format and an in-memory format that makes solving the equilibrium problem easier.)
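The core balance check is easy to sketch. This is my own toy version, not the actual eggquilibrium code: a recipe consumes some yolks and some whites, and a set of recipes is “in equilibrium” when the totals match, so only whole eggs get used:

```lisp
;; Toy sketch, not the real implementation: a recipe is a list of
;; (name yolks whites); equilibrium means total yolks = total whites.
(defun in-equilibrium-p (recipes)
  (let ((yolks 0) (whites 0))
    (dolist (r recipes (= yolks whites))
      (incf yolks (second r))
      (incf whites (third r)))))

;; e.g. (in-equilibrium-p '(("custard" 4 0) ("meringue" 0 4))) => T
```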

I’m using it as an opportunity to learn different things, and find a way to learn more about things I’ve not yet touched in lisp (or anywhere really,) so I’m thinking about:

  • building a web-based interface using some combination of caveman, parenscript, and related tools. This could include features like “user submitted databases,” as well as links to the sources of the recipes, in addition to the basic “web forms, APIs, and table rendering.”
  • storing the data in a database (probably SQLite, mostly) both to support persistence and other more advanced features, but also because I’ve not done database things from Lisp at all.

See the eggquilibrium repo on github; it’s still pretty rough, but perhaps it’ll be interesting!

Other Projects

  • Writing more! I’m trying to be less obsessive about blogging, though I think it’s useful (and perhaps interesting for you all too.) I’ve been writing a bunch and not posting very much of it. My goal is to mix sort of grandiose musing on technology and engineering with discussions of Lisp, Emacs, and programming projects.
  • Working on producing texinfo output from cl-docutils! I’ve been toying around with the idea of writing a publication system targeted at producing books--long-form non-fiction, collections of essays, and fiction--rather than the blogs or technical resources that most such tools are focused on. This is sort of part 0 of this process.
  • Hacking on some Common Lisp projects; I’m particularly interested in Nyxt and StumpWM.

Common Lisp and Logging

I’ve made the decision to write all of my personal project code in Common Lisp. See this post for some of the background for this decision.

It didn’t take me long to say “I think I need a logging package,” and I quickly found this wonderful comparison of CL logging libraries, and only a little longer to be somewhat disappointed.

In general, my requirements for a logger are:

  • straightforward API for logging.
  • levels for filtering messages by importance
  • library in common use and commonly available.
  • easy to configure output targets (e.g. system’s journal, external services, etc).
  • support for structured logging.

I think my rationale is pretty clear: loggers should be easy to use because the more information that can flow through the logger, the better. Assigning a level to all log messages is great for practical filtering, and it’s ubiquitous enough that it’s really part of having a good API. While I’m not opposed to writing my own logging system,1 I think I’d rather not in this case: there’s too much that’s gained by using the conventional choice.

Configurable outputs and structured logging are stretch goals, but frankly they’re the most powerful features you can add to a logger. Lots of logging work is spent crafting well-formed logging strings, when really you just want some kind of arbitrary map, which makes it easier to make use of logging at scale, which is to say, when you’re running higher workloads and multiple copies of an application.
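A minimal sketch of what “an arbitrary map” buys you; the function name and output format here are illustrative, not any particular library’s API:

```lisp
;; Sketch: a structured log line is just key/value pairs serialized
;; uniformly, instead of a hand-crafted format string.
(defun log-structured (level &rest fields)
  (format t "~&level=~a~{ ~a=~a~}~%"
          (string-downcase (symbol-name level))
          (loop for (k v) on fields by #'cddr
                collect (string-downcase (symbol-name k))
                collect v)))

;; (log-structured :info :op "fetch" :count 42)
;; prints: level=info op=fetch count=42
```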

Ecosystem

I’ve dug in a bit to a couple of loggers, sort of using the framework above to evaluate the state of the existing tools. Here are some notes:

log4cl

My analysis of the CL logging packages is basically that log4cl is the most mature and actively maintained tool, but beyond the basic fundamentals, its “selling” features are… interesting.2 The big selling features:

  • integration with the developer’s IDE (slime,) which makes it possible to use the logging system almost like a debugger. This is actually a phenomenal feature, particularly for development and debugging. The downside is that it wouldn’t be totally unreasonable to use it in production, and that’s sort of terrifying.
  • it attempts to capture a lot of information about logging call sites so you can better map back from log messages to the state of the system when the call was made. Again, this makes it a debugging tool, and that’s awesome, but it’s overhead, and frankly I’ve never found it difficult to grep through code.
  • lots of attention to log rotation and log file management. There’s not a lot of utility in writing log data to files directly. In most cases you want to write to standard out: the program is being used interactively, and users may want to see the log of what happens. In cases where you’re running in a daemon mode (and these days, often you’re not,) systemd or similar just captures your output. Even then, you’re probably in a situation where you want to send the log output to some other process (e.g. an external service, or some kind of local socket for fluentd/etc.)
  • hierarchical organization of log messages is just less useful than annotation with metadata, in practice, and using hierarchical methods to filter logs into different streams or reduce logging volumes just obscures things and makes it harder to understand what’s happening in a system.

Having said that, the API surface area is big, and it’s a bit confusing to just start using the logger.

a-cl-logger

The acl package is pretty straightforward, and has a lot of features that I think are consistent with my interests and desires:

  • support for JSON output,
  • internal support for additional output formats (e.g. logstash,)
  • a simpler API

It comes with a couple of pretty strong drawbacks:

  • testing is limited.
  • it’s SBCL only, because it relies on SBCL fundamentals in collecting extra context about log messages. There’s a pending pull request to add ECL compile support, but it looks like it wouldn’t be quite that simple.
  • collecting contextual information comes at a cost, and I think logging should err on the side of higher performance, making expensive things optional, because it’s hard to opt into/out of logging later.

Conclusion

So where does that leave me?

I’m not really sure.

I created a scratch project to write a simple logging project, but I’m definitely not prioritizing working on that over other projects. In the meantime I’ll probably end up just not logging very much, or maybe giving log4cl a spin.

Notes


  1. When I started writing Go I did this, I wrote a logging tool, for a bunch of reasons. While I think it was the right decision at the time, I’m not sure that it holds up. Using novel infrastructure in projects makes integration a bit more complicated and creates overhead for would be contributors. ↩︎

  2. To be honest, I think that log4cl is a fine package, and really a product of an earlier era, and it makes the standard assumptions about the way that logs were used, that makes sense given a very different view of how services should be operated. ↩︎