I left my job at MongoDB (8.5 years!) at the beginning of the summer,
and started a new job at the beginning of the month. I’ll be writing
and posting more about my new gig, career paths in general, reflections
on what I accomplished on my old team, the process of interviewing as a
software engineer, as well as the profession and industry over time. For
now, though, I want to write about one of the things I’ve been working
on this summer: making a bunch of the open source libraries that I
worked on more generally usable. I’ve been calling this the deciduous
platform, which now has its own
GitHub organization! So it must be real.
The main modification in these forks, aside from adding a few features
that had been on my list for a while, has been to update the buildsystem
to use go modules and rewrite the history of the repository to
remove all of the old vendoring. I expect to continue development on
some aspects of these over time, though the truth is that these
libraries were quite stable and were nearly in maintenance mode anyway.
Background#
The team was responsible for a big monolithic application:
development had begun in 2013, which was early for Go, and while
everything worked, it was a bit weird. My efforts when I joined in 2015
focused mostly on stabilization, architecture, and reliability. While
the application worked, mostly, it was clear that it suffered from a few
problems, which I believe were the result of originating early in the
history of Go: First, because no one had tried to write big applications
yet, the patterns weren’t well established, and so the team ended up
writing code that worked but that was difficult to maintain, and ended
up with bespoke solutions to a number of generic problems like running
workloads in the background or managing APIs. Second, Go’s standard
library tends to be really solid, but also tends towards being a little
low level for most day-to-day tasks, so things like logging and process
management end up requiring more code than is reasonable.
I taught myself to write Go by working on a logging library, and worked
on a distributed queue library. One of the things that I realized early
was that breaking the application into “microservices” would have
been both difficult and offered minimal benefit, so I went with the
approach of creating a well factored monolith, which included a lot of
application specific work, but also building a small collection of
libraries and internal services to provide useful abstractions and
separations for application developers and projects.
This allowed for a certain level of focus, not just for the team creating
the infrastructure but also for the application itself: the developers
working on the application mostly focused on the kind of high level core
business logic that you’d expect, while the infrastructure/platform
team really focused on these libraries and various integration problems.
The focus wasn’t just organizational: the codebases became easier to
maintain and features became easier to develop.
This experience has led me to think that architecture decisions may not
be well captured by the monolith/microservice dichotomy, but rather
that there’s this third option that centers on internal architecture,
platforms, and the possibility for developer focus and velocity.
While there are 13 or so repositories in the platform, really there are
4 major libraries: grip, a logging library; jasper, a process management
framework; amboy, a (possibly distributed) worker queue; and gimlet, a
collection of tools for building HTTP/REST services.
The tools all work pretty well together, and combine to provide an
environment where you can focus on writing the business logic for your
HTTP services and background tasks, with minimal boilerplate to get it
all running. It’s pretty swell, and makes it possible to spin up (or
spin out) well factored services with similar internal architectures,
and robust internal infrastructure.
I wanted to write a bit about each of the major components, addressing
why I think these libraries are compelling and the kinds of features
that I’m excited to add in the future.
Grip#
Grip is a structured-logging
friendly library, and is broadly similar to other third-party logging
systems. There are two main underlying interfaces, representing logging
targets (Sender) and messages, as well as a higher level “journal”
interface for use during programming. It’s pretty easy to write new
message types or backends, which means you can use grip to capture all kinds
of arbitrary messages in a consistent manner, and also send those
messages wherever they’re needed.
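The Sender/message split is easy to sketch. This is not grip’s actual API, just a minimal self-contained illustration of the two interfaces described above; the names (Composer, fields, consoleSender) are mine:

```go
package main

import "fmt"

// Composer is a hypothetical stand-in for a message interface: a
// message knows how to render itself and whether it should be logged.
type Composer interface {
	String() string
	Loggable() bool
}

// Sender is a hypothetical stand-in for a logging-target interface.
type Sender interface {
	Send(Composer)
}

// fields is a structured message: a map of key/value pairs.
type fields map[string]interface{}

func (f fields) String() string { return fmt.Sprint(map[string]interface{}(f)) }
func (f fields) Loggable() bool { return len(f) > 0 }

// consoleSender writes loggable messages to standard output; a Splunk
// or Slack target would implement the same interface.
type consoleSender struct{}

func (consoleSender) Send(m Composer) {
	if m.Loggable() {
		fmt.Println(m.String())
	}
}

func main() {
	var s Sender = consoleSender{}
	s.Send(fields{"op": "request", "duration_ms": 42})
}
```

Because the application only talks to the Sender interface, swapping the console target for Splunk, Slack, or email is a configuration change rather than a code change.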
Internally, it’s quite nice to be able to just send messages to
specific log targets, using configuration within an application rather
than needing to operationally manage log output. Operations folks
shouldn’t be stuck dealing with just managing logs, after all, and
it’s quite nice to just send data directly to Splunk or Sumologic. We
also used the same grip fundamentals to send notifications and alerts to
Slack channels, email lists, or even to create Jira Issues, minimizing
the amount of clunky integration code.
There are some pretty cool projects in and around grip:
- support for additional logging targets. The deciduous version of grip
  adds Twitter as an output target as well as creating desktop
  notifications (e.g. growl/libnotify,) but I think it would also be
  interesting to add fluent/logstash connections that don’t have to
  transit via standard error.
- While structured logging is great, I noticed that we ended up logging
messages automatically in the background as a method of metrics
collection. It would be cool to be able to add some kind of
“intercepting sender” that handled some of these structured metrics,
and was able to expose this data in a format that the conventional
tools these days (prometheus, others,) can handle. Some of this code
would clearly need to be in Grip, and other aspects clearly fall into
other tools/libraries.
Amboy#
Amboy is an interface for doing
things with queues. The interfaces are simple, and you have:
- a queue that has some way of storing and dispatching jobs.
- implementations of jobs, which are responsible for executing your
  business logic, with a base implementation that you can easily
  compose into your job types; all you really need to implement is a
  Run()
  method.
- a queue “group” which provides a higher level abstraction on top of
queues to support segregating workflows/queues in a single system to
improve quality of service. Group queues function like other queues
but can be created and managed automatically by the system.
- a runner/pool implementation that provides the actual thread pool.
There’s a type registry for job implementations and versioning in the
schema for jobs so that you can safely round-trip a job between machines
and update the implementation safely without ensuring the queue is
empty.
This turns out to be incredibly powerful for managing background and
asynchronous work in applications. The package includes a number of
in-memory queues for managing workloads in ephemeral utilities, as well
as a distributed MongoDB-backed queue for running multiple copies of an
application with a shared queue. There’s also a layer of management
tools for introspecting and managing the state of jobs.
While Amboy is quite stable, there is a small collection of work that
I’m interested in:
- a queue implementation that stores jobs in a local Badger database
  on-disk to provide single-system restartability for jobs.
- a queue implementation that stores jobs in PostgreSQL, mirroring the
  MongoDB job functionality, to support additional storage backends.
- queue implementations that use messaging systems (Kafka, AMQP) for
  backends. There exists an SQS implementation, but all of these systems
  have less strict semantics for process restarts than the database
  options, which can easily handle on the order of a hundred
  thousand jobs an hour.
- changes to the queue API to remove a few legacy methods that return
channels instead of iterators.
- improve the semantics for closing a queue.
While Amboy has provisions for building architectures with workers
running on multiple processes, rather than having queues running
multiple threads within the same process, it would be interesting to
develop more fully-fledged examples of this.
Jasper#
Jasper provides a high level
set of tools for managing subprocesses in Go, adding a highly ergonomic
API (in Go,) as well as exposing process management as a service to
facilitate running processes on remote machines. Jasper also
manages/tracks the state of running processes, and can reduce pressures
on calling code to track the state of processes.
The package currently exposes Jasper services over REST, gRPC, and
MongoDB’s wire protocol, and there is also code to support using SSH as
a transport so that you don’t need to expose these remote services
publicly.
Jasper is, perhaps, the most stable of the libraries, but I am
interested in thinking about a couple of extensions:
- using jasper as PID 1 within a container to be able to orchestrate
  workloads running on containers, and include (some) support for lower
  level container orchestration.
- writing configuration file-based tools for using jasper to orchestrate
  buildsystems and distributed test orchestration.
I’m also interested in cleaning up some of the MongoDB-specific code
(i.e. the code that downloads MongoDB versions for use in test
harnesses,) and perhaps re-envisioning that as client code that uses
Jasper rather than as a part of Jasper.
Gimlet#
I’ve written about gimlet here
before when I started the
project,
and it remains a pretty useful and ergonomic way to define and register
HTTP APIs. In the past few years, it’s grown to add more authentication
features, as well as a new “framework” for defining routes. This makes
it possible to define routes by implementing an interface that:
- makes it very easy to produce paginated routes, and provides some
helpers for managing content
- separates the parsing of inputs from executing the results, which can
make route definitions easy to test without integration tests.
I’d also like to rehome the functionality on top of the chi
router. The current implementation
uses Negroni and gorilla mux (but neither is exposed in the
interface), but I think it’d be nice to have this be optional, and
chi looks pretty nice.
The following libraries are definitely smaller, but I think they’re
really cool:
- birch is a builder for
programmatically building BSON documents, and MongoDB’s extended JSON
format. It’s built upon an earlier version of the BSON library. While
it’s unlikely to be as fast at scale, for many operations (like
finding a key in a document), the interface is great for constructing
payloads.
- ftdc provides a way to generate
(and read,) MongoDB’s diagnostic data format, which is a highly
compressed timeseries data format. While this implementation could
drift from the internal implementation over time, the format and tool
remain useful for arbitrary timeseries data.
- certdepot provides a way
to manage a certificate authority with the certificates stored in a
centralized store. I’d like to add other storage backends over time.
And more…
Notes#