How to Write Performant Software and Why You Shouldn't

I said a thing on Twitter that I like, and I realized that I hadn't really written (or ranted) much about performance engineering, and it seemed like a good thing to do. Let's get to it.

Making software fast is pretty easy:

  • Measure the performance of your software at two distinct levels:

    • Figure out how to isolate specific operations, as in a unit test, get the code to run many times, and measure how long the operations take.
    • Run meaningful units of work, as in integration tests, to understand how the components of your system come together.

    If you're running a service, tracking the timing of real operations in production can also be useful, but you need a lot of traffic for this to be really useful. Run these measurements regularly, and track the timing of operations over time so you know when things actually change.

  • When you notice something is slow, identify the slow thing and make it faster. This sounds silly, but the things that are slow usually fall into one of a few common cases:

    • an operation that you expected to be quick and in memory actually does I/O (either to a disk or to the network).
    • an operation allocates more memory than you expect, or allocates memory more often than you expect.
    • there's a loop that takes more time than you expect, because you expected the number of iterations to be small (10?) and instead there are hundreds or thousands of iterations.

    Combine these and you can get some really weird effects, particularly over time. An operation that used to be quick gets slower over time because the number of items iterated over grows, or because a function called in a loop that used to be an in-memory-only operation now accesses the database, or something like that. The memory-based ones can be trickier (but also end up being less common, at least with more recent programming runtimes).
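Here's a minimal sketch of the "isolate an operation and run it many times" measurement, using Python's standard timeit module. The lookup_all function and the list-versus-set comparison are made up for illustration. They stand in for any operation whose cost grows with the number of items it iterates over.

```python
import timeit


def lookup_all(keys, table):
    """Hypothetical operation under test: check every key against a collection."""
    return [key in table for key in keys]


def measure(operation, repeat=3, number=10):
    """Return the best observed per-call time, in seconds, over repeated runs."""
    timings = timeit.repeat(operation, repeat=repeat, number=number)
    return min(timings) / number


if __name__ == "__main__":
    # The same operation at growing input sizes: membership checks against a
    # list scan every element, so the total work grows much faster than n.
    for size in (10, 100, 1000):
        keys = list(range(size))
        as_list = list(range(size))
        as_set = set(as_list)
        slow = measure(lambda: lookup_all(keys, as_list))
        fast = measure(lambda: lookup_all(keys, as_set))
        print(f"n={size:>5}  list={slow:.6f}s  set={fast:.6f}s")
```

Recording numbers like these on a schedule, rather than once, is what lets you see when an operation that used to be quick starts to drift.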

Collect data, and when something gets slower you should fix it.

Well no.

Most of the time slow software doesn't really matter. The appearance of slowness or fastness is rarely material to the user's experience or the bottom line. If software gets slower, most of the time you should just let it get slower:

  • Computers get faster and cheaper over time, so most of the time, as long as your rate of slowdown is slow and steady, it's usually fine to just ride it out. Obviously big slowdowns are a problem, but a few percent year-over-year is so rarely worth fixing.

    It's also the case that runtimes and compilers are almost always getting faster (because compiler developers are, in the best way possible, total nerds), so upgrading the compiler/runtime regularly often offsets regular slowdown over time.

  • In the vast majority of cases, the thing that makes software slow is doing I/O (disks or network), so what your code does is unlikely to matter much, and you can solve the problem by changing how traffic flows through your system.

    For UX (e.g. web/front-end) code, the logic is a bit different, because slow code actually impacts user experience, and humans notice things. The solution here, though, is often not about making the code faster, but about increasingly pushing a lot of code to "the backend" (e.g. avoid processing data on the front end, and just make sure the backend can always hand you exactly the data you need and want).

  • Code that's fast is often harder to read and maintain: to make code faster, you often have to be careful and avoid using certain features of your programming language or runtime (e.g. avoiding heap allocations, or reducing the size of allocations by encoding data in more terse ways, etc.), or avoid libraries that are "slower" or that use certain abstractions, all of which makes your code less conventional, more difficult to read, and harder to debug. Programmer time is almost always more expensive than compute time, so unless the slowdown is big or causing problems, it's rarely worth making code harder to read.

    Sometimes, making things faster is actually required. Maybe you have a lot of data that you need to get through pretty quickly and there's no way around it, or you have some classically difficult algorithm problem (graph search, say), but in the course of generally building software this happens pretty rarely, and again most of the time pushing the problem "up" (from the front end to the backend, and similarly from the backend to the database) solves whatever problems you might have.

  • There are obviously pathological counter-examples, usually related to things that happen in loops, but a lot of operations never have to be fast because they sit right next to another operation that's much slower:

    • Lots of people analyze logging tools for speed, and this is almost always silly: all log messages have to be written somewhere (I/O), and generally there has to be something to serialize messages (a mutex or functional equivalent) because you want to write only one message at a time to the output. Even if you have a really "fast logger" on its own terms, you're going to hit the I/O or the serializing nature of the problem. Use a logger that has the features you need and is easy to use; speed doesn't matter.
    • Anything in HTTP request routing and processing. Because request processing sits next to network operations, often to a database as well as to the client, any sort of gain from using "a faster web framework" is probably immeasurable. Use the ones with the clearest feature set.
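To make the logging point concrete, here's a rough sketch of what nearly every logger reduces to: some formatting, then a lock, then a write. The SerializedLogger class and run_workers helper are hypothetical names invented for this illustration, not any real library's API.

```python
import io
import threading


class SerializedLogger:
    """Hypothetical minimal logger: one lock guarding one output stream."""

    def __init__(self, stream):
        self._stream = stream
        self._lock = threading.Lock()

    def log(self, message):
        line = f"{message}\n"  # formatting: the part a "fast logger" optimizes
        with self._lock:  # serialization point: one writer at a time
            self._stream.write(line)  # I/O: typically the dominant cost


def run_workers(logger, workers=4, messages=100):
    """Log from several threads at once; every write still goes through the lock."""

    def work(worker_id):
        for i in range(messages):
            logger.log(f"worker={worker_id} msg={i}")

    threads = [threading.Thread(target=work, args=(n,)) for n in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


if __name__ == "__main__":
    output = io.StringIO()  # stand-in for a file or socket
    run_workers(SerializedLogger(output))
    print(len(output.getvalue().splitlines()), "lines written")
```

However cheap the formatting step gets, every caller still waits its turn at the lock and then pays for the write, which is why the logger's own speed so rarely shows up in measurements.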

Pattern Fragment 1

This is the follow up to Pattern Fragment 0

After all of the shaping for the body of the sweater, you'll have 256 stitches. The goal is to have 196 stitches total for the yoke section, or 98 stitches front and back. This gives me a yoke width of 14 inches, which I know fits me well. Your shoulder width may turn out to be deeply personal, modify as needed to accommodate your personal shoulders.

Put 14 stitches on holders at each underarm, this should be the 7 stitches before and after your round beginning and middle markers. Cast on 10 steek stitches using the backward loop (e-wrap) method above each steek. These are the underarms.

On the next row, after creating the steeks, decrease one body stitch into the first and last steek stitches, and continue these decreases in alternating rows, 7 times (14 total rows), until there are 98 stitches ready for the yoke.

The division between "stitches set aside" and "stitches decreased" at the beginning of the yoke is flexible, as long as you've finished shaping the yoke before it's about 2 inches long.

Finished a Sweater

I finished knitting a sweater a bit ago, and it's pretty cool. Some thoughts:

  • The cuffs ended up being a touch too wide, but it's workable. I think this sweater is really good for wearing over an oxford, and as such slightly wider cuffs may be fine.
  • I used the placket / open neck line reminiscent of 1/4 zip sweaters, but chose for the first time to do garter stitch rather than ribbing for the horizontal parts of the plackets, which worked pretty well, though I might choose to execute them differently in the future. Having said that, I think I want to explore different neck shapes.
  • I didn't do any kind of lower body shaping, which is fine, particularly on such a boxy garment, but waist shaping is a good thing that I'd use again in the future.
  • This is the first drop shouldered garment I've made since the knitting hiatus. It was comforting, but I suspect I'll not knit another for quite a while.
  • I'd knit this sweater before using exactly these colors, albeit in a thicker weight yarn, and a few times with different color combinations. It was really fun and familiar.
  • HD Shetland Yarn is pretty awesome. This was the first time I'd used it for non-stranded knitting, and it was great fun to knit.
  • The previous couple of sweaters that I'd made were both knit at about 9 stitches to the inch, and this was about 7 stitches to the inch, which means that it felt like it went really fast. It's wild how we acclimate to things.

Continuous Integration is Harder Than You Think

I've been working on continuous integration systems for a few years, and while the basic principle of CI is straightforward, it seems that most CI deployments are not. This makes sense: project infrastructure is an easy place to defer maintenance during the development cycle, and projects often prioritize feature development and bug fixing over tweaking the buildsystem or test infrastructure, but I almost think there's something more. This post considers what makes CI hard, and perhaps provides a bit of unsolicited advice.

The Case for CI

I suppose I don't really have to sell anyone on the utility or power of CI: running a set of tests on your software regularly allows developers and teams to catch bugs early, and saves a bucket of developer time, and that is usually enough. Really, though, CI ends up giving you the leverage to solve a number of really gnarly engineering problems:

  • how to release software consistently and regularly.
  • how to support multiple platforms.
  • how to manage larger codebases.
  • anything with distributed systems.
  • how to develop software with larger numbers of contributors.

Doing any of these things without CI isn't really viable, particularly at scale. This isn't to say that they "come free" with CI, but that CI is often the right place to build the kind of infrastructure required to manage distributed systems problems or release complexity.

Buildsystems are Crucial

One thing that I see teams doing sometimes is addressing their local development processes and tooling with a different set of expectations than they do in CI, and you can totally see and understand how this happens: the CI processes always start from a clean environment, and you often want to handle failures in CI differently than you might handle a failure locally. It's really easy to write a shell script that only runs in CI, and then things sort of accumulate, and eventually there emerges a class of features and phenomena that only exist for and because of CI.

The solution is simple: invest in your buildsystem, [1] and ensure that there is minimal (or no!) indirection between your buildsystem and your CI configuration. But buildsystems are hard, and in a lot of cases, test harnesses aren't easily integrated into build systems, which complicates the problem for some. Having a good build system isn't particularly about picking a good tool, though there are definitely tradeoffs for different tools, the problem is mostly in capturing logic in a consistent way, providing a good interface, and ensuring that the builds happen as efficiently as possible.

Regardless, I'm a strong believer in centralizing as much functionality in the buildsystem as possible and making sure that CI just calls into build systems. Good build systems:

  • allow you to build or rebuild (or test/subtest) only subsets of work, to allow quick iteration during development and debugging.
  • center around a model of artifacts (things produced) and dependencies (requires-type relationships between artifacts).
  • have clear defaults, automatically detect dependencies and information from the environment, and perform any required set up and teardown for the build and/or test.
  • provide a unified interface for the developer workflow, including building, testing, and packaging.
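The artifacts-and-dependencies model, and the ability to rebuild only a subset of work, can be sketched with the standard library's graphlib module (Python 3.9+). The artifact names below are hypothetical, and a real buildsystem would track much more (timestamps, hashes, commands), so take this as an illustration of the shape of the idea rather than an implementation.

```python
from graphlib import TopologicalSorter

# Hypothetical artifacts, each mapped to the artifacts it requires.
DEPENDENCIES = {
    "parser.o": {"parser.c"},
    "main.o": {"main.c"},
    "app": {"parser.o", "main.o"},
    "test-report": {"app"},
    "package.tar": {"app"},
}


def affected_by(changed, dependencies):
    """Return every artifact that directly or transitively requires `changed`."""
    affected = set()
    frontier = {changed}
    while frontier:
        frontier = {
            target
            for target, requires in dependencies.items()
            if requires & frontier and target not in affected
        }
        affected |= frontier
    return affected


def rebuild_plan(changed, dependencies):
    """List only the affected artifacts, in an order that respects dependencies."""
    affected = affected_by(changed, dependencies)
    order = TopologicalSorter(dependencies).static_order()
    return [artifact for artifact in order if artifact in affected]


if __name__ == "__main__":
    # Changing parser.c leaves main.o alone and rebuilds everything downstream.
    print(rebuild_plan("parser.c", DEPENDENCIES))
```

When CI and developers both ask the same graph "what needs to happen for this change?", there's much less room for CI-only behavior to accumulate.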

The upside is that the effort that you put into the development of a buildsystem pays dividends not just for managing the complexity of CI deployments, but also in making local development stable and approachable for new developers.

[1]Canonically buildsystems are things like makefiles (or cmake, scons, waf, rake, npm, maven, ant, gradle, etc.) that are responsible for converting your source files into executables, but the lines get blurry in a lot of languages/projects. For Golang, the go tool plays the part of the buildsystem and test harness without much extra configuration, and many environments have a pretty robust separation between building and testing.

T-Shaped Matrices

There's a temptation with CI systems to exercise your entire test suite with a comprehensive and complete range of platforms, modes, and operations. While this works great for some smaller projects, "completism" is not the best way to model the problem. When designing and selecting your tests and test dimensions, consider the following goals and approaches:

  • on one, and only one, platform run your entire test suite. This platform should probably be very close to the primary runtime of your environment (e.g. when developing a service that runs on Linux, your tests should run in a system that resembles the production environment), or possibly your primary development environment.
  • for all platforms other than your primary platform, run only the tests that are directly related to that runtime/platform (e.g. anything that might be OS or processor specific), plus some small subset of "verification" or acceptance tests. I would expect that these tests should easily be able to complete in 10% of the time of a "full build."
  • consider operational variants (e.g. if your product has multiple major-runtime modes, or some kind of pluggable sub-system) and select the set of tests which verifies these modes of operations.
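As a sketch of the guidelines above, here's one way to generate a matrix that runs everything on the primary platform and only platform-specific and acceptance suites elsewhere. The platform names, suite names, and the Suite structure are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Suite:
    """Hypothetical description of a test suite and why it exists."""

    name: str
    platform_specific: bool = False  # exercises OS- or processor-specific code
    acceptance: bool = False  # small "verification" subset


SUITES = [
    Suite("unit"),
    Suite("integration"),
    Suite("performance"),
    Suite("filesystem", platform_specific=True),
    Suite("smoke", acceptance=True),
]


def t_shaped_matrix(platforms, primary, suites):
    """Map each platform to the suite names it should run."""
    matrix = {}
    for platform in platforms:
        if platform == primary:
            selected = suites  # the long "narrow down": the full suite, once
        else:
            # the "wide across": only what the platform itself could break
            selected = [s for s in suites if s.platform_specific or s.acceptance]
        matrix[platform] = [s.name for s in selected]
    return matrix


if __name__ == "__main__":
    platforms = ["linux-x86_64", "linux-arm64", "macos", "windows"]
    for platform, names in t_shaped_matrix(platforms, "linux-x86_64", SUITES).items():
        print(f"{platform:>13}: {', '.join(names)}")
```

Operational variants (the third guideline) could be modeled the same way, as another flag on the suite that selects it for a given mode.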

In general the shape of the matrix should be t-shaped, or "wide across" with a long "narrow down." The danger more than anything is in running too many tests, which is a problem because:

  • more tests increase the chance of a false negative (caused by the underlying systems infrastructure, service dependencies, or even flakey tests,) which means you risk spending more time chasing down problems. Running tests that provide signal is good, but the chance of false negatives is a liability.
  • responsiveness of CI frameworks is important but incredibly difficult, and running fewer things can improve responsiveness. While parallelism might help with some kinds of runtime limitations with larger numbers of tests, it incurs overhead and is expensive.
  • actual failures become redundant and difficult to attribute in "complete matrices." A test of certain high level systems may pass or fail consistently along all dimensions, creating more noise when something fails. With any degree of non-determinism or chance of a false-negative, running tests more than once just makes it difficult to attribute failures to a specific change or an intermittent bug.
  • some testing dimensions don't make sense, leading to wasted time addressing test failures. For example when testing an RPC protocol library that supports both encryption and authentication, it's not meaningful to test the combination of "no-encryption" and "authentication," although the other three combinations might be interesting.
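The RPC example in the last point can be written down directly: enumerate the full product of the two dimensions, then drop the combination that isn't meaningful. The rpc_test_modes function and the dictionary keys are hypothetical names for this sketch.

```python
from itertools import product


def rpc_test_modes():
    """Enumerate encryption/authentication modes, skipping the meaningless one."""
    modes = []
    for encryption, authentication in product((False, True), repeat=2):
        if authentication and not encryption:
            # "no-encryption" with "authentication": the combination the
            # example above describes as not meaningful to test.
            continue
        modes.append({"encryption": encryption, "authentication": authentication})
    return modes


if __name__ == "__main__":
    for mode in rpc_test_modes():
        print(mode)
```

Encoding exclusions like this next to the matrix definition keeps the reasoning visible, so nobody later "fixes" the matrix by adding the missing cell back.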

The ultimate goal, of course, is to have a test matrix that you are confident will catch bugs when they occur, is easy to maintain, and helps you build confidence in the software that you ship.

Conclusion

Many organizations have teams dedicated to maintaining buildsystems and CI, and that's often appropriate: keeping CI alive is of huge value. It's also very possible for CI and related tools to accrue complexity and debt in ways that are difficult to maintain, even with dedicated teams: taking a step back and thinking about CI, buildsystems, and overall architecture strategically can be very powerful, and really improve the value provided by the system.

Get More Done

It's really easy to overthink the way that we approach our work and manage our own time and projects. There is no shortage of tools, services, books, and methods for organizing your days and work, and while there are a lot of good ideas out there, it's easy to get stuck fiddling with how you work, at the expense of actually getting work done. While I've definitely thought about this a lot over time, for a long time, I've mostly just done things and not really worried much about the things on my to-do list. [1]

I think about the way that I work similarly to the way that I think about the way I work with other people. The way you work alone is different from collaboration, but a lot of the principles of thinking about big goals and smaller actionable items are pretty transferable.

My suggestions here are centered around the idea that you have a todo list, and that you spend a few moments a day looking at that list, but actually I think the way I think about my work is really orthogonal to any specific tools. For years, most of my personal planning has revolved around making a few lists in a steno pad once or twice a day, [2] though I've been trying to do more digital things recently. I'm not sure I like it. Again, tools don't matter.

[1]Though, to be clear, I've had the pleasure and benefit of working in an organization that lives-and-dies by a bug tracking system, with a great team of folks doing project management. So there are other people who manage sprints, keep an eye on velocity, and make sure that issues don't get stuck.
[2]My general approach is typically to have a "big projects" or "things to think about" list and a "do this next" list, with occasional lists about all the things in a specific big project. In retrospect these map reasonably well to SCRUM/Agile concepts, but it also makes sense.

Smaller Tasks are Always Better

It's easy to plan projects from the "top down," and identify the major components and plan your work around those components, and the times that I run into trouble are always the times when my "actionable pieces" are too big. Smaller pieces help you build momentum, allow you to move around to different areas as your attention and focus change, and help you use available time effectively (when you want).

It's easy to find time in-between meetings, or while the pasta water is boiling, to do something small and quick. It's also very easy to avoid starting something big until you have a big block of unfettered time. The combination of these factors makes bigger tasks liabilities, and more likely to take even longer to complete.

Multi-Task Different Kinds of Work

I read a bunch of articles that suggest that the way to be really productive is to figure out ways of focusing and avoiding context switches. I've even watched a lot of coworkers organize their schedules and work around these principles, and it's always been something of a mystery for me. It's true that too much multi-tasking and context switching can lead to a fragmented experience and make some longer/complicated tasks harder to really dig into, but it's possible to manage the costs of context switching, by breaking apart bigger projects into smaller projects and leaving notes for your (future) self as you work.

Even if you don't do a lot of actual multitasking within a given hour or day of time, it's hard to avoid really working on different kinds of projects on the scale of days or weeks, and I've found that having multiple projects in flight at once actually helps me get more done. In general I think of this as the idea that more projects in flight means that you finish things more often, even if the total number of projects completed is the same in the macro context.

Regardless, different stages of a project require different kinds of attention and energy, and having a few things in flight increases the chance that when you're in the mood to do some research, or editing, or planning, you have a project with that kind of work all queued up. I prefer to be able to switch to different kinds of work depending on my attention and mood. In general my work falls into the following kinds of activities:

  • planning (e.g. splitting up big tasks, outlining, design work,)
  • generative work (e.g. writing, coding, etc.)
  • organizational (email, collaboration coordination, user support, public issue tracking, mentoring, integration, etc.)
  • polishing (editing, writing/running tests, publication prepping,)
  • reviewing (code review, editing, etc.)

Do the Right Things

My general approach is "do lots of things and hope something sticks," which makes the small assumption that all of the things you do are important. It's fine if not everything is the most important, and it's fine to do things a bit out of order, but it's probably a problem if you do lots of things without getting important things done.

So I'm not saying establish a priority for all tasks and execute them in strictly that priority, at all. Part of the problem is just making sure that the things on your list are still relevant, and still make sense. As we do work and time passes, we have to rethink or rechart how we're going to complete a project, and that reevaluation is useful.

Prioritization and task selection is incredibly hard, and it's easy to cast "prioritization" in over simplified terms. I've been thinking about prioritization, for my own work, as being a decision based on the following factors:

  • deadline (when does this have to be done: work on things that have hard deadlines or expected completion times, ordered by expected completion date, to avoid needing to cram at the last moment.)
  • potential impact (do things that will have the greatest impact before lesser impact, this is super subjective, but can help build momentum, and give you a chance to decide if lower-impact items are worth doing.)
  • time availability fit (do the biggest thing you can manage with the time you have at hand, as smaller things are easier to fit in later,)
  • level of understanding (work on the things that you understand the best, and give yourself the opportunity to plan things that you don't understand later. I sometimes think about this as "do easy things first," but that might be too simple.)
  • time outstanding (how long ago was this task created: do older things first to prevent them from becoming stale.)
  • number of things (or people) that depend on this being done (work on things that will unblock other tasks or collaborators before things that don't have any dependencies, to help increase overall throughput.)

Maintain a Pipeline of Work

Productivity, for me, has always been about getting momentum on projects and being able to add more things. For work projects, there's (almost) always a backlog of tasks, and the next thing is usually pretty obvious, but sometimes this is harder for personal projects. I've noticed a tendency in myself to prefer "getting everything done" on my personal todo list, which I don't think is particularly useful. Having a pipeline or backlog of work is great:

  • there's always something next to do, and there isn't a moment when you've finished and have to think about new things.
  • keeping a list of things that you are going to do in the more distant future lets you start thinking about how bigger pieces fit together without needing to start working on them.
  • you can add big things to your list(s) and then break them into smaller pieces as you make progress.

As an experiment, think about your todo list not as a set of items that you'd like to finish completely, but as a list that shouldn't be shorter than a certain number of items (say 20 or 30?) with a target rate of completion (10 a week?). You should choose your own numbers, and set goals based on what you see yourself getting done over time.

Lessons from the Knitting Hiatus

I took years and years off of knitting: life and priorities change and I must confess that a few years of living in a very small apartment with very active cats made it difficult to have the space to really get into knitting. Anyway, it was really nice to have a hobby sitting on the metaphoric shelf that I could get right back into without a big learning curve.

The interesting thing, I think, is that the hiatus made some subtle changes to the way that I approach knitting things, at least relative to what I remember.

  • I'm less opposed to garter stitch, and have been using little bits of it here and there in some projects.
  • I've gotten much better at wrap-and-turn short rows in stocking stitch: they now look pretty good, and I remember them always looking terrible.
  • I knit the yoke of a sweater back and forth, which is a thing that I would have found unimaginable.
  • Knitting plain stocking stitch in the round has always been a great joy of mine, but in the last couple of months I've done it rather a lot, knitting 3, or so, plain sweaters, which I've found quite captivating. I always seemed to feel like I needed some kind of patterning (color work, lace, cables etc) to keep things interesting, and that doesn't really seem to be the case.
  • I've yet to knit anything post-hiatus on needles other than US 0s (which are quite small,) and it doesn't seem to bug me very much. I continue to make progress on projects and rounds with 250-340 (or so) stitches don't seem oppressively long.
  • Having little gaps between the sleeve of a sweater and the body at the "bottom corner" always used to be a big problem, and these days I haven't needed to sew up these gaps at all, which is kind of novel.
  • My cast on edges have gotten better: I've managed to get edges that are as elastic as they need to be, and all of the usual problems (a twist, mistakes in ribbing, problems in counting, misjudging the length of the long-tail) haven't been a problem at all.

Of course some things didn't change:

  • I still don't really like to do things that involve knitting rows very much, and would prefer to knit as much as possible in the round.
  • My taste in yarns seems to be heavy on the "boring fine wool" and while I've been looking around at the kinds of yarns that are available and popular, I am (for the most part,) pretty content to stick to the really simple and boring yarn.
  • I haven't yet vanquished a number of old fears/struggles, like making an EPS-style sweater that I really like, knitting sleeves from the cuff up, and cardigans. Many of these things are on my list of things to explore more in the future, but we'll see how I fare.

It's all very curious! I'm excited to see if anything else changes!

Open Source Emacs Configuration Improvements

In retrospect I'm not totally sure why I released my emacs configuration to the world. I find tweaking Emacs Lisp to be soothing, and in 2020 these kinds of projects are particularly welcome. I've always thought about making it public: I feel like I get a lot out of Emacs, and I'm super aware that it's very hard for people who haven't been using Emacs forever to get a comparable experience. [1]

I also really had no idea of what to expect, and while it's still really recent, I've noticed a few things which are worth remarking:

  • Making your code usable for other people really does make it easy for people to find bugs. While it's likely that there are bugs that people never noticed, I found a few things very quickly:

    • Someone reported higher than expected CPU use, and I discovered that there were a number of functions that ran regularly in timers, and I was able to quickly tune some knobs in order to reduce average CPU use by a lot. This is likely to be great both for the user in question, but also because it'll help battery life.
    • The config includes a git submodule (!) with the contents of all third-party packages, mostly to reduce friction for people getting started. Downloading all of the packages fresh from the archive would take a few minutes, and the git clone is just faster. I realized, when someone ran into some problems when running with emacs 28 (e.g. the development/mainline build,) that the byte-compilation formats were different, which made the emacs27 files not work on emacs28. I pushed a second branch.

    More than anything the experience of getting bug reports and feedback has been great. It both makes it possible to focus time because the impact of the work is really clear, and it also makes it clear to me that I've accumulated some actually decent Emacs Lisp skills, without really noticing it. [2]

  • I was inspired to make a few structural improvements.

    • For a long time, including after the initial release, I had a "settings" file, and a "local functions" file that held code that I'd written or copied from one place or another, and I finally divided them all into packages named tychoish-<thing>.el which allowed me to put all or most of the configuration into use-package forms, which is more consistent and also helps startup time a bit, and makes the directory structure a bit easier.
    • I also cleaned up a bunch of local snippets that I'd been carrying around, which wasn't hurting anything but is a bit more clear in the present form.
  • I believe that I've hit the limit, with regards to startup speed. I'd really like to get a GUI emacs instance to start (with no buffers) in less than a second, but it doesn't seem super plausible. I got really close. At this point there are a few factors that constrain startup time:

    • Raw CPU speed. I have two computers, and the machine with the newer CPU is consistently 25% faster than the slow computer.
    • While the default configuration doesn't do this, my personal configuration sets a font (this is reasonable), but it seems that the time to do this is sometimes observable, and proportional to the number of fonts you have installed on the system. [3]
    • Dependencies during the early load. I was able to save about 10% time by moving a function between packages to reduce the packages that startup code depended upon. There's just a limit to how much you can clean up here.

    Having said that, these things can drift pretty easily. I've added some helper macros with-timer and with-slow-op-timer that I can use to report the timing of operations during startup to make sure that things don't slow down.

    Interestingly, I've experimented with byte-compiling my local configuration and I haven't really noticed much of a speedup at this scale, so for ease I've been leaving my own lisp directory unbytecompiled.

  • With everything in order, there's not much to edit! I guess I'll have other things to work on, but I have made a few improvements, generally:

    • Using the alert package for desktop notification, which allowed me to delete a legacy package I've been using. Deleting code is awesome.
    • I finally figured out how to really take advantage of projectile, which is now configured correctly, and has been a lot of help in my day-to-day work.
    • I've started using ERC more, and only really using my irssi (in screen) session as a fallback. My IRC/IM setup is a bit beyond the scope of this post but ERC has been a bit fussy to use on machines with intermittent connections, but I think I've been able to tweak that pretty well and have an experience that's quite good.

It's been interesting! And I'm looking forward to continuing to do this!

[1]Sure, other editors also have long setup curves, but Emacs is particularly gnarly in this regard, and I think adoption by new programmers is definitely constrained by this fact.
[2]I never really thought of myself as someone who wrote Emacs Lisp: I've never really written a piece of software in Emacs, it's always been a function here or there, or modifying some snippet from somewhere. I don't know if I have a project or a goal that would involve writing more emacs software, but it's nice to recognize that I've accidentally acquired a skill.
[3]On Windows and macOS systems this may not matter, but you may have more fonts installed than you need. I certainly did. Be aware that web browsers often download their own fonts separately from system fonts, so having fonts installed is really relevant to your GTK/QT/UI use and not actually to the place where you're likely doing most of your font interaction (e.g. the browser).

Knitting Pictures

I've never been really good at the blogging+picture game, and while maybe once upon a time it was technical limitation--taking photos and getting them online was complicated--anymore it's probably not. To this end, I've started a knitting specific Instagram account as a kind of photoblog for knitting things. It's @gestaltknitting, if you're interested.


While I took this picture a while ago, I must confess that my knitting basically looks the same now.

The same, not because I've made no progress, but because sleeves take a while and it's just plain knitting, so unless you have a very discerning eye, you might miss the details.

Indeed, I really want my next project to also have a lot of plain knitting with black yarn: I expect the photographs will be captivating. Perhaps it will be enjoyable for people to be able to spot the different patterns of embedded cat hair in the sweaters.


I get that knitting is visual for a lot of people, and I do like a smart looking sweater as much as the next guy, but I've always felt somewhat resistant to this view: knitting is about the process and the act more than it is about the product, and so the things that are most exciting aren't the visuals.

While it's gotten much easier to take high quality pictures, my intention for this book that I've been writing is that it mostly would not be a book with a lot of pictures, though we'll see: if anything, I suspect that diagrams and cartoons may be more effective for this kind of application.

Having said that, it's nice to see what other people are knitting, and I like the way that the ephemeral nature of instagram stories make it less daunting to post in-progress updates on projects. So I've definitely been enjoying that.

We'll see!