tychoish

In Favor of an Application Infrastructure Framework

2018-07-27 – tychoish

The byproduct of a lot of my work on Evergreen over the past few years has been that I’ve amassed a small collection of reusable components in the form of libraries that address important but not particularly core functionality. While I think the actual features and scale that we’ve achieved are great, the infrastructure that we built has been particularly exciting.

It turns out that I’ve written about a number of these components already here, even. Though I think my initial posts were about these components in their more proof-of-concept stage, now (finally!) we’re using them all in production so they’re a bit more hardened.

The first, grip, is a logging framework. Initially, I thought a high-level logging framework with plug-able backends was going to be really compelling. While configurable back-ends have been good for using grip as the primary toolkit for writing messaging and user-facing alerting, the most compelling feature has been structured logging.

Most of the logging that we do now (thanks to grip) passes structures (e.g. maps) to the logger with key/value data. In combination with log aggregation services/tools (like ELK, splunk, or sumologic), we can basically take care of nearly all of our application observability (monitoring) use cases in one stop. It includes easy to use system and golang runtime metrics collection, all using an easy push-based collection, and can also power alert escalation. After having maintained an application using this kind of event driven structured logging system, I have a hard time thinking about running applications without it.
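
To make the idea of key/value logging concrete, here is a minimal sketch using only the Go standard library. It is not grip’s API: the `Fields` type and `formatEvent` function are hypothetical names invented for illustration. The point is only that a map of fields can be emitted as one JSON line that a log aggregation service can index.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// Fields is a hypothetical name for the key/value payload handed to a logger.
type Fields map[string]interface{}

// formatEvent renders a structured message as a single JSON line.
// encoding/json sorts map keys, so the output order is stable.
func formatEvent(f Fields) (string, error) {
	out, err := json.Marshal(f)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	line, err := formatEvent(Fields{
		"message":     "task finished",
		"task_id":     "t-42",
		"duration_ms": 1530,
	})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(line)
}
```

Because every event is a flat set of named fields, the aggregation side can filter and alert on `task_id` or `duration_ms` without parsing free-form strings.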

Next we have amboy, which is a queue system. Like grip, all of the components are plug-able, so it supports in-memory (ephemeral) queues, distributed queues, dependency graph systems and priority queue implementations as well as a number of different execution models. The most powerful thing that amboy affords us is a single and clear abstraction for defining “background” execution and workloads.

In Go it’s easy to spin up a goroutine to do some work in the background, and it’s super easy to implement worker pools to parallelize the processing of simple tasks. The problem is that as systems grow, it becomes pretty hard to track this complexity in your own code, and we discovered that our application was essentially bifurcated between offline (e.g. background) and online (e.g. request-driven) work. To address this problem, we defined all of the background work as small, independent units of work, which can be easily tested, and as a result there is essentially no ad hoc concurrency in the application except what runs in the queues.
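
As a rough sketch of what “small, independent units of work” can look like, the example below defines a job interface and runs jobs on a fixed pool of goroutines. This is not amboy’s API: `Job`, `funcJob`, `runAll`, and `errFailed` are hypothetical names, and a real queue would add persistence, retries, and dependency handling.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errFailed is a placeholder error used by the example jobs.
var errFailed = errors.New("job failed")

// Job is a hypothetical interface for one independent unit of background work.
type Job interface {
	ID() string
	Run() error
}

// funcJob adapts a plain function to the Job interface.
type funcJob struct {
	id string
	fn func() error
}

func (j funcJob) ID() string { return j.id }
func (j funcJob) Run() error { return j.fn() }

// runAll executes jobs on a fixed pool of worker goroutines and returns
// each job's error (nil on success), keyed by job ID.
func runAll(jobs []Job, workers int) map[string]error {
	if workers < 1 {
		workers = 1
	}
	queue := make(chan Job)
	results := make(map[string]error, len(jobs))
	var mu sync.Mutex
	var wg sync.WaitGroup

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range queue {
				err := j.Run()
				mu.Lock()
				results[j.ID()] = err
				mu.Unlock()
			}
		}()
	}
	for _, j := range jobs {
		queue <- j
	}
	close(queue)
	wg.Wait()
	return results
}

func main() {
	jobs := []Job{
		funcJob{id: "a", fn: func() error { return nil }},
		funcJob{id: "b", fn: func() error { return errFailed }},
	}
	for id, err := range runAll(jobs, 2) {
		fmt.Println(id, err)
	}
}
```

Since each job only exposes `Run`, it can be unit tested by calling `Run` directly, and the code that schedules jobs never needs to know what they do.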

The end result of having a unified way to characterize background work is that scaling the application became much less complicated. We can build new queue implementations, without needing to think about the business logic of the background work itself, and we add capacity by increasing the resources of worker machines without needing to think about the architecture of the system. Delightfully, the queue metaphor is independent of external services, so we can run the queue in memory backed by a heap or hash map with executors running in dedicated goroutines if we want, and also scale it out to use databases or dedicated queue services with additional process-based workers, as needed.

The last component, gimlet, addresses building HTTP interfaces, and provides tools for registering routes, writing responses, managing middleware and authentication, and defining routes in a way that’s easy to test. Gimlet is just a wrapper around some established tools like negroni and gorilla/mux, all built on established standard-library foundations. Gimlet has allowed us to unify a bunch of different approaches to these problems, and has lowered the barrier to entry for most of our interfaces.
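
For a sense of what “routes that are easy to test” means in practice, here is a minimal sketch built only on the standard library (`net/http` and `net/http/httptest`). It does not use gimlet, negroni, or gorilla/mux; `statusHandler`, `newRouter`, and `get` are hypothetical names, and the `/status` route is made up for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// statusHandler is a hypothetical route that writes a small JSON response.
func statusHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	_ = json.NewEncoder(w).Encode(map[string]string{"status": "ok"})
}

// newRouter registers routes in one place so the server and tests share them.
func newRouter() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/status", statusHandler)
	return mux
}

// get exercises a handler in memory, without opening a network socket.
func get(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := get(newRouter(), "/status")
	fmt.Println(code, body)
}
```

Keeping route registration in a single constructor is the design choice that matters here: a test can build the same router the server uses and drive it with recorded requests.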

There are other infrastructural problems still on the table: tools for building inter-system communication and RPC when you can’t communicate via a queue or a shared database (I’ve been thinking a lot about gRPC and protocol buffers for this,) and also about object-mapping and database access patterns, which I don’t really have an answer for.¹

Nevertheless, with the observability, background tasks, and HTTP interface problems well understood and supported, it definitely frees developers to spend more of their time focused on core problems of importance to users and the goals of the project. Which is a great place to be.

I built a database migration tool called anser which is mostly focused on integrating migration workflows into production systems so that migrations are part of the core code and can run without affecting production traffic, and while these tools have been useful, I haven’t seen a clear path between this project and meaningfully simplifying the way we manage access to data. ↩︎

Why Write

2018-07-26 – tychoish

I’ve had a blog¹ for more than 15 years, and I’ve found this experience to be generally quite rewarding. I’ve learned a lot about writing, and enjoyed the opportunity to explore a number of different topics² in great detail. While I haven’t blogged as much in recent years, I’ve been thinking in the past few weeks about getting back into writing more regularly, which has led me to reflect on my writing in the past and my goals for this in the future.

First, the blog as a genre has changed fundamentally in the last 17 or 18 years. In 2000 or 2001, blogs were independent things that grew out of communities (e.g. MetaFilter, or web-diary/journaling) and were maintained by independent writers or small groups. Then the tooling got better, the community got better and eventually started to segment based on topic, and finally the press³ gained competence in the form.

Publishing a blog today is a vastly different proposition than it was even in the recent past.

True to my form, this leaves me with a collection of divergent thoughts:

maybe the “write everything in one blog even though the topics are not really of interest to any specific group” approach that I’ve always taken isn’t the right one anymore. More distinct blogs means more writing (maybe a good thing).
having a writing practice is good for focusing thoughts, but also for sharing and distributing understanding, and I think that sharing understanding is a really important part of learning and growing, and I miss having a structure for these kinds of notes.
perhaps, it would make sense to outsource/hire a freelancer to take care of some editing and marketing-adjacent work, which is more required if you want to engage with users more consistently but that I find distracting and outside of my ability to focus on properly. The problem then is figuring out how to fund that work in a longer-term/sustainable way.
In the past RSS has been a (the?) leading way to distribute content to serious readers, but that isn’t true now and likely hasn’t been true for years. So while I feel able to write a lot of things, I don’t know what the best way to engage with regular readers is.
I used to think that I wanted to organize group blogs, and while I think that blog-discussions are fun, and I think there is merit to combined efforts, I’m less interested in doing the organizing myself.
There was a period where I wasn’t blogging much because my day job was very writing focused, and I needed side projects that didn’t involve the English language, and I spent a long time focusing on learning programming, which took a lot of time. Now that work mostly involves programming, and only a little writing, and I’ve had some time to recover as a writer, it feels like I have some space.
I’m not really sure how to host a blog in 2018. The old set up and server I have is more than functional, but there are a lot of services, tools, and patterns that I’m not familiar with and I have some learning to do, even though I probably mostly just want to write things.

In one form or another, though. The archives are all here. ↩︎
I’ve written blogs about Philosophy, Hand Knitting, Technology, Documentation, Programming, Science Fiction, Folk Music and Dance, and Economics. ↩︎
Both old media institutions (newspapers, television companies, book and magazine publishers) and new institutions that grew out of blogging itself (e.g. HuffPo, Gawker, etc.) ↩︎

Evergreen Intro

2018-07-25 – tychoish

Almost two years ago, I switched teams at work to join the team behind evergreen which is a homegrown continuous integration tool that we use organization wide to host and support development efforts and operational automation. It’s been great.

From the high level, Evergreen takes changes that developers make to source code repositories and runs a set of tasks for each of those changes on a wide variety of systems, and is a key part of the system that allows us to verify that the software we write works on computers other than the ones that we interact with directly. There are a number of CI systems in the world, but Evergreen has a number of interesting features:

it runs tasks in parallel, fanning out tasks to a large pool of machines to shorten the “wall clock” time for task execution.
tasks execute on ephemeral systems managed by Evergreen in response to demands of incoming work.
the service maintains a queue of work and handles task dispatching and results collection.

This focus on larger scale task parallelism and managing host pools gives Evergreen the ability to address larger scale continuous integration workflows with a lower maintenance overhead. This is totally my jam: we get to both affect the development workflow and engineering policies for basically everyone and improving operational efficiency is a leading goal.

My previous gig was more operational, on a sibling team, so it’s been really powerful to be able to address problems relating to application scale and drive the architecture from the other side. I wrote a blog post for a work-adjacent outlet about the features and improvements, but this is my blog, and I think it’d be fun to have some space to explore “what I’ve been working on,” rather than focusing on Evergreen as a product.

My first order of business, after becoming familiar with the code base, was to work on logging. When I started learning Go, I wrote a logging library (I even blogged about it), and using this library has allowed us to “get serious about logging.” While it was a long play, we now have highly structured logging which has allowed the entire logging system to become a centerpiece in our observability story, and we’ve been able to use centralized log aggregation services (and even shop around!) As our deployment grows, centralized logging is the thing that has kept everything together.

Recently, I’ve been focusing on how the application handles “offline” or background work. Historically the application has had a number of loosely coupled “cron-job” like operations that all happened on a single machine at a regular interval. I’m focusing on how to move these systems into more tightly coupled, event-driven operations that can be distributed to a larger cluster of machines. Amboy is a big part of this, but there have been other changes related to this project.

On the horizon, I’m also starting to think about how to reduce the cost of exposing data and operations to clients and users in a way that’s lightweight and flexible, and relatively inexpensive for developer time. Right now there’s a lot of technical debt, a myriad of different ways to describe interfaces, and inconsistent client coverage. Nothing insurmountable, but definitely the next frontier of growing pains.

The theme here is “how do we take an application that works and does something really cool” and turn it into a robust piece of software that can both scale as needs grow and provide a platform for developing new features with increasing levels of confidence, stability, and speed.

The conventional wisdom is that it’s easy to build features fast-and-loose without a bunch of infrastructure, and that as you scale the speed of feature development slows down. I’m pretty convinced that this is untrue and am excited to explore the ways that improved common infrastructure can reduce the impact of this ossification and lead to more nimble and expansive feature development.

We’ll see how it goes! And I hope to be able to find the time to write about it more here.

Combating Legacy Code

2018-07-24 – tychoish

I wrote some notes for a post about a software project I worked on a year and a half ago that I think is pretty cool, but I was on writing hiatus. Even better, the specific code in question is now no longer in use. Still, I think it serves as a useful parable, so I will attempt to reflect.

Go’s logging¹ support in the standard library works, and it successfully achieves its goals on its own terms. The problem is that it’s incredibly simple and lacks a number of features that are standard in most logging systems.² So as a result, I’m not surprised that most applications of consequence either use a couple of more fully featured logging packages or end up writing a large number of logging wrappers.

The fact that my project at work was using a special logging library is not particularly surprising, particularly because the project is old for a Go project. The logging library in question is a log4j-inspired package, that had been developed by a different group internally, but was no longer being used by that group. It worked, but there were a host of problems.³

I’d also written a logging package myself which was a definite improvement on the state of the art. I had two chief problems:

how to convince teammates to make the change,
how to make the change without disrupting ongoing work or the functioning of the system which had to be always deploy-able.

Here’s what I did…

First, I learned as much as I could about the existing system, its history and how we used it. I read a lot of code, documentation (such as it was), and also related bug reports, feature requests, and history.

Second, I implemented wrappers for my system that (mostly) cloned the interfaces for the existing library in my own package. It’s called slogger, and it’s still there, though I hope to delete it soon. I wanted to make it possible to make the switch⁴ in the project initialization without needing to change every last logging statement.⁵

Then, we actually made the change so that logging used the new code internally but wrapped by the old interfaces. I think there were a couple of very obvious bugs early on, but frankly none of them are so memorable that I could describe them any more.

Finally, we went through and updated all of the logging statements. It was a big change, and impacted all of the code, but it happened quite late in the process and there were no bugs, because it was the least interesting or radical part of the project.
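
The wrapper step described above (keeping the old call signatures while routing everything through the new logger) can be sketched roughly as follows. None of this is the real slogger or grip code: `Sender`, `LegacyLogger`, `Logf`, and `memorySender` are hypothetical names chosen to show the shape of the adapter.

```go
package main

import (
	"fmt"
	"strings"
)

// Sender is a hypothetical stand-in for the new logging backend.
type Sender interface {
	Send(level string, fields map[string]interface{})
}

// LegacyLogger mimics the shape of an older printf-style logging API.
type LegacyLogger struct {
	backend Sender
}

// Logf keeps the old call signature but forwards to the new backend,
// so existing call sites compile unchanged during the transition.
func (l *LegacyLogger) Logf(level, format string, args ...interface{}) {
	l.backend.Send(strings.ToLower(level), map[string]interface{}{
		"message": fmt.Sprintf(format, args...),
	})
}

// memorySender records messages in memory so the behavior is easy to inspect.
type memorySender struct {
	lines []string
}

func (m *memorySender) Send(level string, fields map[string]interface{}) {
	m.lines = append(m.lines, fmt.Sprintf("[%s] %v", level, fields["message"]))
}

func main() {
	backend := &memorySender{}
	logger := &LegacyLogger{backend: backend}
	logger.Logf("WARN", "retrying %d of %d", 2, 5)
	fmt.Println(backend.lines[0])
}
```

Once every call site has been migrated to the new interface, an adapter like `LegacyLogger` has no remaining users and can be deleted.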

And then we had a new logger. It’s been great. With the new tool we’ve been able to easily add support for more structured approaches to logging and collecting log output in a variety of third party services.

In summary:

replacing legacy subsystems can be a good way to improve the functionality of your project.
change is hard, but there are ways to make changes easier and less disruptive. They often involve doing even more work.
write code to facilitate transitions, and then delete it later.
the larger a change is, the less risky it should be. While there are lots of small-and-low risk changes you can make, the inverse should be true as rarely as possible.

This is to say, application logging facility. ↩︎
This includes filtering by log level, different formatting options, (semi) structured logging, conditional logging, buffering, and other options. ↩︎
Hilariously something in the way we were using the logger was tripping up the race detector. While the logger did a decent job of providing the file name and line number of the logging statement, it was pretty focused on printing content to a file/standard output. ↩︎
Potentially this should have been behind a feature flag, though I think I didn’t actually use a feature flag. ↩︎
The short version here is, “interfaces are great.” ↩︎

Consciousness Rising

2016-11-24 – tychoish

The subtitle of this post should be “or, how the internet learned about intersectionality,” but while I love a good pretentious academic title, I don’t think that’s particularly representative of my intent here.

Sometime in the last 5 or 10 years, the popular discourse on justice on the internet learned about intersectionality. Which is great. Intersectionality, generally, is the notion that a single identity isn’t sufficient to explain an individual’s social experience, particularly vis a vis privilege. Cool.

This is really crucial and really important for understanding how the world works, but in totally understandable and plain ways. People have a lot of different identities which lead to many different experiences, perspectives, and understandings. All of these identities, experiences, perspectives, and understandings interact with each other in a big complex system.

Therefore our analysis of our experiences, thought, understandings, and identities, must explore identities (ET AL) not only on their own terms, but in conversation with each other and with other aspects of experience.

Intersectionality is incredibly important. It’s also incredibly useful as a critical tool because it makes it possible for our thought to reflect actual lived experiences and the way that various aspects of experience interact to create culture and society.¹

While intersectionality is an interesting and important concept that could certainly support an entire blog post, I’m more interested in the genealogy of this concept in the popular critical discourse.

I know that I read a lot about intersectionality in college (in 2004-2007), I know that the papers I read were at least 10 years old, and I know that intersectionality wasn’t an available concept to political conversations on the internet at the time in the way that it is now.²

Concepts take a long time, centuries sometimes, to filter into general awareness, so the delay itself isn’t particularly notable. Even the specific route isn’t that interesting in and for itself. Rather, I’m interested in how a concept proliferates and what is required for a concept to become available to a more popular discourse.

If intersectionality was an available concept in the academic literature, what changes and evolutions in thought--both about intersectionality and in the broader context--needed to happen for that concept to become available more broadly?

I think it’s particularly exciting to trace the recent intellectual history of a specific concept in discourse, because it might give us insight into the next concepts that will help inform our discourse and things we can do to facilitate this process in the future for new concepts and perspectives.

As we understand the history of this proliferation, we can also understand its failures and inefficiencies and attempt to deploy new strategies that resolve those shortcomings.

A lot of arguments in favor of intersectional analysis and perspectives are political, and raise the very real critique that analysis that is not intersectional tends to recapitulate normative cultural assumptions. I’d argue, additionally, that intersectionality is really the only way to pull apart experiences and thoughts and understand fundamentally how culture works. It’s not just good politics, but required methodology for learning about our world and our lives. ↩︎
I admit that this post is based on the conceit that there was a point when the popular discourse (on the internet) was unaware of intersectionality followed linearly by another point where the concept of intersectionality was available generally. This isn’t how the dissemination of concepts into discourses works, and I’m aware that I’ve oversimplified the idea somewhat. This is more about the process of popularization. ↩︎

Cache Maintenance

2016-11-11 – tychoish

Twice this fall I’ve worked on code that takes a group of files and ensures that the total size of the files is less than a given size. The operation is pretty simple: identify all the files and their size (recursively, or not but accounting for the size of directories), sort them, and delete files from the front or back of the populated list until you’ve reached the desired size.

If you have a cache and you’re constantly adding content to it, eventually you will either need an infinite amount of storage or you’ll have to delete something.

But what to delete? And how?

Presumably you use some items in the cache more often than others, and some files change very often while others change very rarely, and in many cases, use and change frequency are orthogonal.

For the cases that I’ve worked on, the first case, frequency of use, is the property that we’re interested in. If we haven’t used a file in a while relative to the other files, the chances are it’s safe to delete.

The problem with access time (atime) is that while most file systems have a concept of atime, most of them don’t update it. Which makes sense: if every time you read a file you have to update the metadata, then every read operation becomes a write operation, and everything becomes slow.

Relative access time, or relatime, helps some. Here atime is updated, but only if the file has been modified since the last recorded access or if it’s been more than 24 hours since the last update. The problem, of course, is that if the cache is write-once-read-many and operates with a time granularity of less than a day, then relatime is often just creation time. That’s no good.

The approach I’ve been taking is to use the last modification time, (mtime), and to intentionally update mtime (e.g. using touch or a similar operation,) after cache access. It’s slightly less elegant than it could be, but it works really well and requires very little overhead.
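
As an illustration of the “touch after access” idea, the sketch below reads a file and then sets its modification time to now with `os.Chtimes`. The helper names (`touch`, `readCached`, `demo`) are hypothetical and are not taken from the lru package; `demo` exists only to show the effect on a temporary file.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// touch sets both the access and modification times of path to now.
func touch(path string) error {
	now := time.Now()
	return os.Chtimes(path, now, now)
}

// readCached reads a cache entry, then bumps its mtime to record the access.
func readCached(path string) ([]byte, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	return data, touch(path)
}

// demo creates a temporary entry, backdates it by two days, reads it through
// readCached, and reports the mtime before and after the read.
func demo() (before, after time.Time, err error) {
	f, err := os.CreateTemp("", "cache-entry-*")
	if err != nil {
		return before, after, err
	}
	path := f.Name()
	defer os.Remove(path)
	if err = f.Close(); err != nil {
		return before, after, err
	}

	before = time.Now().Add(-48 * time.Hour)
	if err = os.Chtimes(path, before, before); err != nil {
		return before, after, err
	}
	if _, err = readCached(path); err != nil {
		return before, after, err
	}
	info, err := os.Stat(path)
	if err != nil {
		return before, after, err
	}
	return before, info.ModTime(), nil
}

func main() {
	before, after, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("mtime advanced:", after.After(before))
}
```

The cost is one extra metadata write per cache read, which the application controls explicitly instead of depending on how the file system is mounted.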

Armed with these decisions, all you need is a thing that crawls a file system, collects objects, and stores their size and time, so we know how large the cache is and can maintain an ordered list of file objects by mtime. The ordered list of files should be a heap, but the truth is that you build and sort the structure once, remove the “lowest” (oldest) items until the cache is the right size, and then throw it all away, so you’re not really doing many heap-ish operations.
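
The sort-once-then-trim approach can be sketched as a pure function over in-memory records. This is an illustration under assumptions, not the lru package’s implementation: `entry`, `newEntry`, and `selectForRemoval` are hypothetical names, and the function only chooses which paths to delete, leaving the actual `os.Remove` calls (or a dry run) to the caller.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// entry is a hypothetical record for one cached file.
type entry struct {
	Path  string
	Size  int64
	MTime time.Time
}

// newEntry builds an entry from a size in bytes and an mtime in Unix seconds.
func newEntry(path string, size, unixSec int64) entry {
	return entry{Path: path, Size: size, MTime: time.Unix(unixSec, 0)}
}

// selectForRemoval returns the paths to delete, oldest first, so that the
// remaining entries total at most maxSize bytes.
func selectForRemoval(entries []entry, maxSize int64) []string {
	// Copy before sorting so the caller's slice is left untouched.
	sorted := append([]entry(nil), entries...)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].MTime.Before(sorted[j].MTime)
	})

	var total int64
	for _, e := range sorted {
		total += e.Size
	}

	var remove []string
	for _, e := range sorted {
		if total <= maxSize {
			break
		}
		remove = append(remove, e.Path)
		total -= e.Size
	}
	return remove
}

func main() {
	entries := []entry{
		newEntry("cache/b", 40, 200),
		newEntry("cache/a", 50, 100),
		newEntry("cache/c", 30, 300),
	}
	fmt.Println(selectForRemoval(entries, 80))
}
```

A single sort is enough here because the list is built, trimmed, and discarded in one pass; a heap would only pay off if entries were added and removed continuously.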

Therefore, I present lru. Earlier this summer I wrote a less generic implementation of the same principle, and was elbows deep into another project when I realized I needed another cache pruning tool. Sensing a trend, I decided to put a little more time into the project and built it out as a library that other people can use, though frankly I’m mostly concerned about my future self.

The package has two types, a Cache type that incorporates the core functionality and FileObject which represents items in the cache.

Operation is simple. You can construct and add items to the cache manually, or you can use DirectoryContents or TreeContents which build caches from a starting file system point. DirectoryContents looks at the contents of a single directory (skipping sub-directories optionally) and returns a Cache object with those contents. If you do not skip directories, each directory has, in the cache, the total size of its contents.

TreeContents recurses through the tree and ignores directories, and returns a Cache object with all of those elements. TreeContents does not clean up empty directories.

Once you have a Cache object, use its Prune method with the maximum size of the cache (in bytes), any objects to exclude, and an optional dry-run flag, to prune the cache down until it’s less than or equal to the max size.

Done.

I’m not planning any substantive changes to the library at this time as it meets most of my needs but there are some obvious features:

a daemon mode where the cache object can “watch” a file system (using inotify or similar) and add items to or update existing items in the cache. Potentially using fsnotify.
an option to delete empty directories encountered during pruning.
options to use other time data from the file system when possible, potentially using the times library.

With luck, I can go a little while longer without doing this again. With a little more luck, you’ll find lru useful.

Deleuze and Grove

2016-11-10 – tychoish

I’ve been reading two non-fiction books intermittently in the last little bit: Andy Grove’s High Output Management and Deleuze and Guattari’s What is Philosophy?. Not only is reading non-fiction somewhat novel for me, but I’m sort of delighting in the juxtaposition. And I’m finding both books pretty compelling.

These are fundamentally materialist works. Grove is writing from his experience as a manager, but it’s a book about organizing that focuses on personal and organizational effectiveness, with a lot of corporate high-tech company examples. But the fact that it’s a high-tech company that works on actually producing things means that he’s thinking a lot about production and material constraints. It’s particularly interesting because discussions of technology and management often lead to popular writing that’s handwavey and abstract: this is not what Grove’s book is in the slightest.

Deleuze is more complex, and Guattari definitely tempers the materialism, though less in the case of What is Philosophy than the earlier books. Having said that, I think What is Philosophy is really an attempt to both justify philosophy in and for itself, but also to discuss the project of knowledge (concept) creation in material, mechanistic terms.

To be honest this is the thing that I find the most compelling about Deleuze in general: he’s undeniably materialist in his outlook and approach, but his work often--thanks to Guattari, I think--focuses on issues central to non-materialist thought: interiority, subjectivity, experience, and identity. Without losing the need to explore systems, mechanisms, and interfaces between and among related components and concepts.

I talked with a coworker about the fact that I’ve been reading both of these pieces together, and he said something to the effect of “yeah, Grove rambles a bunch but has a lot of good points, which is basically the same as Deleuze.” Fair. I’d even go a bit further and say that both of these books, despite their specialized topics and focus, are really deep down books for everyone, and guides for being in the world.

Read them both.

Isolation and Ideology Change

2016-11-10 – tychoish

Following the 2016 election my father, who is a much more active participant in Facebook than I, said something to the effect of “don’t mourn; organize. I had a long winded post on the topic of ‘don’t celebrate; organize’, but the bottom line is the same: organize.”

I’d append to this just to make clear that I’m of the opinion that self care, survival and the care for and survival of our communities is crucial. Which sometimes means celebration and sometimes means mourning and sometimes means a quiet night at home with the and friends.

At the 2016 New England Sacred Harp Convention a friend of mine gave a lesson for those members of the community who were unable to attend because of profound illness, which was delivered in conjunction with a lesson in memorial for members of the community who had died in the last year. These lessons are a common and enduring tradition of Sacred Harp conventions.

The lesson focused on isolation, and the ways that illness, care-giving (and indeed dying, death, and grief) are isolating. But it went on to discuss the ways that we combat isolation, through connections to people and communities, and by the project of meaning making.

Connection and meaning making are related, of course, and are central to why I sing. I mean I also enjoy the music, but it’s the connection with other singers, and the ways that our practices in and around singing are about making meaning.

I heard this almost 6 weeks ago, but I keep coming back to this in a number of different contexts. There’s a lot in the world that either directly isolates, or provokes feelings of isolation.

Bottom line, the way that we can fight isolation is by forming connections and by working to create meaning in our lives.

I was talking on Wednesday with a couple of friends, one of whom was most distraught at the seeming impossibility of progress. “What can I do? There are all these people, and I’m not sure anything I can do will have any effect.” I think this distress is incredibly common and reasonable, given the size of the task and the amount of time any person has in the world.

The task of effecting change is huge on its own, but the project is compounded by its scale: there are a lot of people in the world and a lot of different views. It’s difficult to even know where to begin.

I think fundamentally this kind of distress is about the isolation created by the experience of difference, by the size of the task.

There are tools that we can use for managing and fighting our own isolation: building connections to each other, creating meaning in our lives and in our social spheres.

Interestingly, these are also the same methods that we use to organize, to build consciousness, and to change ideologies.

On Wednesday, I said that (for the most part) people are just people: the way that thought changes is through meaningful relationships, conversation, and through additional opportunities to make meaning and to form connections in a larger context.

Seek out people and experiences that are different. Stay safe. Listen. Learn. Talk. Teach. Share your experiences with people who are like you. Work hard. Take breaks. Remember that people are, for the most part, just people, and we’re all alone in this together. All of us.