Editor Themes

It's no real secret that I'm red-green colorblind. It's not really a major life obstacle: I've got a wardrobe of clothes in colors that I can easily tell apart, I've developed a number of heuristics for guessing colors that are right enough, and mostly it just creates funny stories where I get a confused look if I try to describe something that might be purple, or have to convince a coworker to read a graph for me.

One challenge, historically, has been text editor color themes: so often they end up with some kind of low-contrast situation that's hard to read, or two different aspects of code that should be highlighted differently but aren't. I've tried lots of themes out, and I would always end up going back to the default emacs themes, which might not have been great, but which I always found usable.

Until Protesilaos Stavrou's Modus Themes, that is.

These are super compelling, and I never really knew how good an editor could look until I started using them. Like, is this what it's really like for the rest of you all the time? Things are clear: I never get confused between type names and function names any more, and there are rarely situations where I feel like the highlighting color and the text color are the same, which used to happen all the time.

The real win, I think, is that Modus makes dark themes accessible to me in a way that they never were before. For the most part, the "dark themes" that have been so popular recently are just impossible for me to see clearly. At the same time, I find it less taxing to spend time in front of screens with darker backgrounds, so being able to do most of my work in an environment that's easy to read is a real gain. I tend to keep to light backgrounds when using a laptop and dark backgrounds otherwise.

The second piece is that I think I've caved in and decided to increase the default font size in my text editor, at least when I'm using my desktop/external monitor. I think my vision is as good as it's ever been, though I should probably get that checked out post-pandemic. There's a balance between "small fonts let you see more of the file you're working on" and "larger fonts let you focus on the area that you're editing." When I'm writing English, the focus is great; when writing software I tend to want more context. There's also a balance between keeping an entire line visible at once and the ideal words-per-line limit that makes text easier to read. So there's some tuning there, depending on what your workload looks like.

I guess if there's any lesson in this, it's that comfort matters, and you shouldn't push yourself into uncomfortable display situations if you can avoid it.

Does Anyone Actually Want Serverless?

Cloud computing, and with it most of tech, has been really hot on the idea of "serverless" computing, which is to say, services and applications that are deployed, provisioned, and priced separately from conventional "server" resources (memory, storage, bandwidth.) The idea is that we can build and expose ways of deploying and running applications and services, even low-level components like "databases" and "function execution," in ways that mean that developers and operators can avoid thinking about computers qua computers.

Serverless is the logical extension of "platform as a service" offerings, which have been an oft-missed goal for a long time. You write high-level applications and code that is designed to run in some kind of sandbox, with external services provided in an à la carte model via integrations with other products or services. The PaaS, then, can take care of everything else: load balancing incoming requests, redundancy to support higher availability, and any kind of maintenance of the lower-level infrastructure. Serverless is often just PaaS but more: provide a complete stack of services to satisfy needs (databases, queues, background work, authentication, caching, on top of the runtime,) and then change the pricing model to be based on requests/utilization rather than time or resources.
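
To make the shape of this concrete, here's a minimal sketch of what application code tends to look like in this model, assuming an AWS-Lambda-style Python handler (the event fields and the work it does are hypothetical):

    # A function-as-a-service handler sketch, AWS-Lambda-style signature;
    # the event fields and the "work" are hypothetical.
    import json

    def handler(event, context):
        # The platform owns routing, scaling, and the runtime; the
        # application only ever sees a parsed request payload.
        body = json.loads(event.get("body") or "{}")

        # Durable state lives in managed services (a database, a queue,)
        # not on a server the application owns.
        record = {"user": body.get("user"), "note": body.get("note")}

        return {"statusCode": 200, "body": json.dumps({"saved": record})}

    print(handler({"body": '{"user": "sam", "note": "hi"}'}, None))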

Fundamentally, this allows the separation of concerns between "writing software" and "running software," and allows much if not all of the running of software to be delegated to service providers. This kind of separation is useful for developers, and in general runtime environments seem like the kind of thing that most technical organizations shouldn't need to focus on: outsourcing may actually be good, right?

Well maybe.

Let's be clear, serverless platforms primarily benefit the provider of the services for two reasons:

  • serverless models allow providers to build offerings that are multi-tenant, which gives the provider the ability to reap the benefits of managing request load dynamically and sharing resources between services/clients.
  • utilization pricing for services is always going to be higher than commodity pricing for the underlying components. Running your own servers ("metal") is cheaper than using cloud infrastructure over time, but capacity planning, redundancy, and management overhead make that difficult in practice. The proposition is that while serverless may cost more per-unit, it has lower management costs for users (fewer people in "ops" roles,) and is more flexible if request patterns change. (There's a back-of-the-envelope sketch of this tradeoff below.)
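
As a back-of-the-envelope sketch of that tradeoff (every number here is made up for illustration; the crossover is the point):

    # Hypothetical break-even arithmetic: utilization pricing vs. flat
    # server pricing plus the people-cost of running servers.
    PRICE_PER_MILLION_REQUESTS = 5.00    # serverless, utilization-priced
    SERVER_MONTHLY_COST = 200.00         # one provisioned server
    OPS_OVERHEAD_MONTHLY = 800.00        # people-time to run the servers
    CAPACITY_PER_SERVER = 50_000_000     # requests/month one server absorbs

    def serverless_cost(requests):
        return requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS

    def server_cost(requests):
        servers = max(1, -(-requests // CAPACITY_PER_SERVER))  # ceiling
        return servers * SERVER_MONTHLY_COST + OPS_OVERHEAD_MONTHLY

    for requests in (1_000_000, 100_000_000, 10_000_000_000):
        print(f"{requests:>14,}: serverless ${serverless_cost(requests):>10,.2f}"
              f" vs servers ${server_cost(requests):>10,.2f}")

At low volume the utilization pricing wins easily; at sustained high volume the flat-priced servers do, which is the per-unit argument above.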

So we know why the industry seems to want serverless to be a thing, but does it actually make sense?

Maybe?

Makers of software strive (or ought to strive) to make their software easy to run, and having very explicit expectations about the runtime environment makes software easier to run. Similarly, being able to write code without needing to manage the runtime, monitoring, or logging, while using packaged services for caching, storage, and databases, seems like a great boon.

The downsides to software producers, however, are plentiful:

  • vendor lock-in is real: it places your application at the mercy of an external provider, both as they do maintenance and as their API and platform evolve on their timeframe.
  • hosted systems mean that it's difficult to do local development and testing: either every developer needs their own sandbox (at some expense and management overhead), or you have to maintain a separate runtime environment for development.
  • applications cannot have service levels which exceed the service level agreements of their underlying providers. If your serverless platform has an SLA which is less robust than the SLA of your application, you're in trouble.
  • when something breaks, there are few operational remedies available. Upstream timeouts are often not changeable and most forms of manual intervention aren't available.
  • pricing probably only makes sense for organizations operating at small scale (most organizations, particularly for greenfield projects,) and is rarely viable at any kind of larger scale.
  • some problems and kinds of software just don't work in a serverless model: big data sets that exceed reasonable RAM requirements, data processing problems which aren't easily parallelizable, workloads with long running operations, or workloads that require lower level network or hardware access.
  • most serverless systems will incur some overhead over dedicated/serverful alternatives, and therefore have worse performance/efficiency and potentially less predictable performance, especially in very high-volume situations.

Where does that leave us?

  • Many applications and bespoke tooling should probably use serverless tools. Particularly if your organization is already committed to a specific cloud ecosystem, this can make a lot of sense.
  • Prototypes unequivocally make sense to build on off-the-shelf, potentially serverless tooling, particularly for things like runtimes.
  • If and when you begin to productionize applications, find ways to provide useful abstractions between the deployment system and the application. These kinds of architectural choices help address concerns about lock-in and make it easier to do development work without external dependencies. (There's a small sketch of this kind of seam after this list.)
  • Think seriously about your budget for doing operational work, holistically, if possible, and how you plan to manage serverless components (access, cost control, monitoring and alerting, etc.) in connection with existing infrastructure.
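
On the abstraction point above, a minimal sketch of the kind of seam I mean, in Python: the application codes against a small hypothetical interface, so a local backend can stand in for the hosted one during development and testing:

    # A hypothetical storage seam: application code depends on this small
    # protocol, never on a specific provider's SDK.
    from typing import Protocol

    class BlobStore(Protocol):
        def put(self, key: str, data: bytes) -> None: ...
        def get(self, key: str) -> bytes: ...

    class LocalBlobStore:
        """In-memory stand-in for local development and tests."""

        def __init__(self) -> None:
            self._data: dict[str, bytes] = {}

        def put(self, key: str, data: bytes) -> None:
            self._data[key] = data

        def get(self, key: str) -> bytes:
            return self._data[key]

    # A production implementation would wrap the provider's SDK behind
    # the same two methods; application code never knows the difference.
    def save_report(store: BlobStore, name: str, body: bytes) -> None:
        store.put(f"reports/{name}", body)

    save_report(LocalBlobStore(), "daily", b"all quiet")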

Serverless is interesting, and I think it's worth asking "what if application development happened in a very controlled environment with a very high-level set of APIs?" There are clearly a lot of cases where it makes a lot of sense, and a bunch of situations where it's clearly a suspect call. And it's early days, so we'll see in a few years how things work out. In any case, thinking critically about infrastructure is always a good plan.

The Kubernetes Cloud Mainframe

I made a tongue-in-cheek comment on twitter a while back that k8s is just the contemporary API for mainframe computing, but as someone who is both very skeptical of and very excited about the possibilities of kube, this feels like something I want to expand upon.

A lot of my day-to-day work has some theoretical overlap with kube, including batch processing, service orchestration, and cloud resource allocation. Lots of people I encounter are also really excited by kube, and it's interesting to balance that excitement with my understanding of the system, and to watch how Kubernetes (as a platform) impacts the way that we develop applications.

I also want to be clear that my comparison to mainframes is not a disparagement: not only do I think there's a lot of benefit to be gained by thinking about the historical precedents of our paradigms, I would also assert that the trends in infrastructure over the last 10 or 15 years (e.g. virtualization, containers, cloud platforms) have focused on bringing mainframe paradigms to a commodity environment.

Observations

  • clusters are usually functionally static. I know that the public clouds have autoscaling abilities, but having really elastic infrastructure requires substantial additional work, and there are some reasonable upper limits on the number of nodes, which makes it hard to actually operate elastically. It's probably also the case that elastic infrastructure has always been (mostly) a pipe dream at most organizations.
  • some things remain quite hard, chiefly in my mind:
    • autoscaling, both of the cluster itself and of the components running within the cluster. Usage doesn't always follow easy-to-detect patterns, so figuring out ways to make infrastructure elastic may take a while to converge or become common. Indeed, VMs and clouds were originally thought to be able to provide some kind of elastic/autoscaling capability, and by and large, most cloud deployments do not autoscale.
    • multi-tenancy, where multiple different kinds of workloads and use-cases run on the same cluster, is very difficult to schedule for reliably or predictably, which leads to a need to overprovision more for mixed workloads.
  • Kubernetes does not eliminate the need for an operations team or vendor support for infrastructure or platforms.
  • decentralization has costs, and putting all of the cluster configuration in etcd imposes some limitations, mostly around performance. While I think decentralization is correct for Kubernetes in many ways, application developers may need systems that have lower latency and tighter scheduling abilities.
  • The fact that you can add applications to an existing cluster, or host a collection of small applications, is mostly a symptom of clusters being overprovisioned. This probably isn't bad, and it's almost certainly the case that you can reduce the overprovisioning bias with kube, to some degree.

Impact and Predictions

  • applications developed for Kubernetes will eventually become difficult or impossible to imagine or run without Kubernetes. This has huge impacts on developer experience and test experience. I'm not sure that this is a problem, but I think it's a hell of a dependency to pick up. This was true of applications that targeted mainframes as well.
  • Kubernetes will eventually replace vendor specific APIs for cloud infrastructure for most higher level use cases.
  • Kubernetes will primarily be deployed by Cloud providers (RedHat/IBM, Google, AWS, Azure, etc.) rather than by infrastructure teams.
  • Right now, vendors are figuring out what kinds of additional services users and applications need to run in Kubernetes, but eventually there will be another layer of tooling on top of Kubernetes:
    • logging and metrics collection.
    • deployment operations and configuration, particularly around coordinating dependencies.
    • authentication and credential management.
    • low-latency offline task orchestration.
  • At some point, we'll see a move toward multi-cluster orchestration, or more powerful approaches to workload isolation within a single cluster.

Conclusion

Kubernetes is great, and it's been cool to see how, in the last couple of years, it's emerged to bring together things like cloud infrastructure and container orchestration. At the same time, it (of course!) doesn't solve all of the problems that developers have with their infrastructure, and I'm really excited to see how people build upon Kubernetes to address some of those higher-level concerns, and make it easier to build software on top of the resulting platforms.

Programming in the Common Lisp Ecosystem

I've been writing more and more Common Lisp recently, and I reflected a bunch on the experience in a recent post, which I recently followed up on.

Why Ecosystems Matter

Most of my thinking and analysis of CL comes down to the ecosystem: the language has some really compelling (and fun!) features, so the question of whether to use it really comes down to the ecosystem. There are two main reasons to care about ecosystems in programming languages:

  • a vibrant ecosystem cuts down the time that an individual developer or team has to spend doing infrastructural work to get started. Ecosystems provide everything from libraries for common tasks to conventions and established patterns for the big fundamental application choices, not to mention things like easily discoverable answers to common problems.

    The time between "I have an idea" and "I have running (proof-of-concept quality) code" matters so much. Everything is possible to a point, but friction between "idea" and "working prototype" can be a big problem.

  • a bigger and more vibrant ecosystem makes it more tenable for companies/sponsors (of all sizes) to choose Common Lisp for various projects, and there's a little bit of a chicken-and-egg problem here, admittedly. Companies and sponsors want to be confident that they'll be able to efficiently replace engineers if needed, integrate lisp components into larger systems, and get support when they hit problems. These concerns are all kind of intangible (and reasonable!), and the larger and more vibrant the ecosystem, the less risk there is.

    In many ways, recent developments in technology more broadly make lisp slightly more viable, as a result of making it easier to build applications that use multiple languages and tools. Things like microservices, better generic deployment orchestration tools, and greater adoption of IDLs (including Swagger, Thrift, and gRPC,) all make language choice less monolithic at the organization level.

Great Things

I've really enjoyed working with a few projects and tools. I'll probably write more about these individually in the near future, but in brief:

  • chanl provides CSP-style channels. As a current/recovering Go programmer, this library is very familiar and great to have. In some ways, the API provides a bit more of the introspection and flexibility that I've always wanted in Go.
  • lake is a buildsystem tool in the tradition of make, but with a few additional great features, like target namespacing, a clear distinction between "file targets" and "task targets," as well as support for SSH operations, which makes it a reasonable replacement for things like fabric and other basic deployment tools.
  • cl-docutils provides the basis for a document processing system. I'm particularly partial because I've been using the Python (reference) implementation for years, and the CL implementation is really quite good and quite easy to extend.
  • roswell is really great for getting started with CL, and also for making it possible to test library code against different implementations and versions of the language. I'm a touch iffy on using it to install packages into its own directory, but it's pretty great.
  • ASDF is the "buildsystem" component of CL, comparable to setuptools in Python, and it (particularly in the latest versions,) is really great. I like the ability to produce binaries directly from asdf, and the "package-inferred" system is a great addition (basically, giving Python-style automatic package discovery.)
  • There's a full Apache Thrift implementation. While I'm not presently working on anything that would require a legit RPC protocol, for integrating CL components into larger ecosystems, having the option is useful.
  • Hunchensocket adds websockets! Web sockets are a weird little corner of any stack, but it's nice to have the option of being able to do this kind of programming. Also, CL seems like a really good platform for this kind of work.
  • make-hash makes constructing hash tables easier; it's sort of needlessly gawky otherwise.
  • ceramic provides bridges between CL and Electron for delivering desktop applications based on web technologies in CL.

I kept thinking that there wouldn't be good examples of various things (there's a Kafka driver! there's support for various other Apache ecosystem components!) but there are, and that's great. There are gaps, of course, but fewer, I think, than you'd expect.

The Dark Underbelly

The biggest problem in CL is probably discoverability: lots of folks are building great tools, and it's hard to really find out about those projects.

I thought about phrasing this as a kind of list of things that would be good candidates for bounties, or something of the like. Also, if I've missed something, please let me know! I've tried to look for a lot of things, but discovery is hard.

Quibbles

  • rove doesn't seem to handle multi-threaded test runs effectively. The support is listed in the readme, but I was able to write really trivial tests that crashed the test harness.
  • Chanl would be super lovely with some kind of concept of cancellation (like contexts in Go,) and while it's nice to have a bit more thread introspection, given that the threads are somewhat heavier weight, being able to avoid resource leaks seems like a good plan.
  • There doesn't seem to be any library capable of producing YAML-formatted data. I don't have a specific need, but it'd be nice.
  • it would be nice to have some way of configuring the quicklisp client to prefer quicklisp (stable) but also use ultralisp (or another source) when packages are available there.
  • Putting the capacity in asdf to produce binaries easily is great, and the only thing missing from buildapp/cl-launch is multi-entry binaries. That'd be swell. It might also be easier, as an alternative, to have support for git-style sub-commands in a command-line parser (which doesn't easily exist at the moment), but one command per binary seems difficult to manage.
  • there are no available implementations of a multi-reader single-writer mutex, which seems like an oversight, and yet, here we are.

Bigger Projects

  • There are no encoders/decoders for data formats like Apache Parquet, and the protocol buffers implementation doesn't support proto3. Neither of these is a particular deal breaker, but having good tools for dealing with common developments lowers the cost and risk of using CL in more applications.
  • No support for HTTP/2, and therefore no gRPC. Having the ability to write software in CL with the knowledge that it'll be able to integrate with other components is good for the ecosystem.
  • There is no great modern MongoDB driver. There were a couple of early implementations, but there have since been important changes to the MongoDB protocol. A clearer interface for producing BSON might be useful too.
  • I've looked for libraries and tools to integrate with and manage things like systemd, docker, and k8s. The k8s gap seems easiest to close, as things like cube can be generated from updated swagger definitions, but there's less for the others.
  • Application delivery remains a bit of an open question. I'm particularly interested in being able to produce binaries that target other platforms/systems (cross compilation,) but there's also a class of problems related to being able to ship tools once built.
  • I'm eagerly waiting, and a bit concerned about, the plight of the current implementations around the move of Darwin to ARM in the intermediate term. My sense is that the transition won't be super difficult, but it seems like a thing.

How to Write Performant Software and Why You Shouldn't

I said a thing on twitter that I like, and I realized that I hadn't really written (or ranted) much about performance engineering, and it seemed like a good thing to do. Let's get to it.

Making software fast is pretty easy:

  • Measure the performance of your software at two distinct levels:

    • figure out how to isolate specific operations, as in unit tests, get the code to run many times, and measure how long the operations take.
    • Run meaningful units of work, as in integration tests, to understand how the components of your system come together.

    If you're running a service, tracking the actual timing of real operations over time can also be useful, but you need a lot of traffic for it to be meaningful. Run these measurements regularly, and track the timing of operations over time so you know when things actually change. (There's a minimal sketch of both levels after this list.)

  • When you notice something is slow, identify the slow thing and make it faster. This sounds silly, but the things that are slow usually fall into one of a few common cases:

    • an operation that you expected to be quick and in-memory actually does I/O (either to a disk or to the network,)
    • an operation allocates more memory than you expect, or allocates memory more often than you expect.
    • there's a loop that takes more time than you expect, because you expected the number of iterations to be small (10?) and instead there are hundreds or thousands.

    Combine these and you can get some really weird effects, particularly over time: an operation that used to be quick gets slower as the collection it iterates over grows, or a function called in a loop that used to be an in-memory-only operation now accesses the database, or something like that. The memory-based ones can be trickier (but also end up being less common, at least with more recent programming runtimes.)
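
As a minimal sketch of the two measurement levels, using Python's standard library (parse_record is a placeholder for whatever operation you actually care about):

    # Micro- and macro-level timing; parse_record stands in for your code.
    import time
    import timeit

    def parse_record(line: str) -> dict:
        key, _, value = line.partition("=")
        return {key: value}

    # Level 1: isolate a single operation and run it many times.
    per_call = timeit.timeit(lambda: parse_record("a=1"), number=100_000)
    print(f"parse_record: {per_call / 100_000 * 1e6:.2f}us per call")

    # Level 2: time a meaningful unit of work end to end.
    start = time.perf_counter()
    records = [parse_record(f"key{i}={i}") for i in range(100_000)]
    print(f"batch: {time.perf_counter() - start:.3f}s for {len(records)} records")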

Collect data, and when something gets slower you should fix it.

Well no.

Most of the time, slow software doesn't really matter. The appearance of slowness or fastness is rarely material to users' experience or the bottom line. If software gets slower, most of the time you should just let it get slower:

  • Computers get faster and cheaper over time, so as long as your rate of slowdown is slow and steady, it's usually fine to just ride it out. Obviously big slowdowns are a problem, but a few percent year-over-year is rarely worth fixing.

    It's also the case that runtimes and compilers are almost always getting faster (because compiler developers are, in the best way possible, total nerds,) so upgrading the compiler/runtime regularly often offsets regular slowdown over time.

  • In the vast majority of cases, the thing that makes software slow is doing I/O (disks or network,) so what your code does is unlikely to matter much, and you can often solve the problem by changing how traffic flows through your system. (There's a minimal sketch of this effect after this list.)

    For UX (e.g. web/front-end) code, the logic is a bit different, because slow code actually impacts user experience, and humans notice things. The solution here, though, is often not about making the code faster, but about pushing more of the work to "the backend" (e.g. avoid processing data on the front end, and just make sure the backend can always hand you exactly the data you need and want.)

  • Code that's fast is often harder to read and maintain: to make code faster, you often have to be careful and avoid certain features of your programming language or runtime (e.g. avoiding heap allocations, or reducing the size of allocations by encoding data in more terse ways, etc.,) or avoid libraries that are "slower" or that use certain abstractions, all of which makes your code less conventional, more difficult to read, and harder to debug. Programmer time is almost always more expensive than compute time, so unless the slowness is big or causing problems, it's rarely worth making code harder to read.

    Sometimes, making things faster actually is required. Maybe you have a lot of data that you need to get through pretty quickly and there's no way around it, or you have some classically difficult algorithmic problem (graph search, say,) but in the course of generally building software this happens pretty rarely, and again, most of the time pushing the problem "up" (from the front end to the backend, from the backend to the database, and similar,) solves whatever problems you might have.

  • There are obviously pathological counter-examples, usually related to things that happen in loops, but a lot of operations never have to be fast because they sit right next to another operation that's much slower:

    • Lots of people analyze logging tools for speed, and this is almost always silly: all log messages have to be written somewhere (I/O), and generally something has to serialize messages (a mutex or functional equivalent) because you want to write only one message at a time to the output. So even if you have a really "fast logger" on its own terms, you're going to hit the I/O or the serializing nature of the problem anyway. Use a logger that has the features you need and is easy to use; speed doesn't matter.
    • anything in HTTP request routing and processing. Because request processing sits next to network operations, often between a database and the client, any sort of gain from using "a faster web framework" is probably immeasurable. Use the one with the clearest feature set.
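
To make the I/O point above concrete, here's the most common version of the problem: a per-item round trip hiding in a loop versus a single batched request. The fetch functions are hypothetical stand-ins, with a sleep simulating network latency:

    import time

    LATENCY = 0.05  # pretend per-request network latency, in seconds

    def fetch_user(user_id):
        time.sleep(LATENCY)  # one round trip per call
        return {"id": user_id}

    def fetch_users(user_ids):
        time.sleep(LATENCY)  # one round trip for the whole batch
        return [{"id": uid} for uid in user_ids]

    def slow_report(user_ids):
        # N round trips: the loop multiplies the latency by len(user_ids).
        return [fetch_user(uid) for uid in user_ids]

    def fast_report(user_ids):
        # one round trip: the latency is paid once, whatever the size.
        return fetch_users(user_ids)

With twenty users, slow_report pays the latency twenty times over; no amount of tuning the loop body competes with removing the round trips.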

Continuous Integration is Harder Than You Think

I've been working on continuous integration systems for a few years, and while the basic principle of CI is straightforward, it seems that most CI deployments are not. This makes sense: project infrastructure is an easy place to defer maintenance during the development cycle, and projects often prioritize feature development and bug fixing over tweaking the buildsystem or test infrastructure, but I think there's something more to it. This post is a consideration of what makes CI hard, and perhaps a bit of unsolicited advice.

The Case for CI

I suppose I don't really have to sell anyone on the utility or power of CI: running a set of tests on your software regularly allows developers and teams to catch bugs early, and saves a bucket of developer time, and that is usually enough. Really, though, CI ends up giving you the leverage to solve a number of really gnarly engineering problems:

  • how to release software consistently and regularly.
  • how to support multiple platforms.
  • how to manage larger codebases.
  • anything with distributed systems.
  • how to develop software with larger numbers of contributors.

Doing any of these things without CI isn't really viable, particularly at scale. This isn't to say that they "come free" with CI, but CI is often the right place to build the kind of infrastructure required to manage distributed systems problems or release complexity.

Buildsystems are Crucial

One thing that I see teams doing sometimes is addressing their local development processes and tooling with a different set of expectations than they have for CI, and you can totally see and understand how this happens: CI processes always start from a clean environment, and you often want to handle failures in CI differently than you might handle a failure locally. It's really easy to write a shell script that only runs in CI, and then things sort of accumulate, and eventually there emerges a class of features and phenomena that only exist for and because of CI.

The solution is simple: invest in your buildsystem, [1] and ensure that there is minimal (or no!) indirection between your buildsystem and your CI configuration. But buildsystems are hard, and in a lot of cases test harnesses aren't easily integrated into build systems, which complicates the problem. Having a good build system isn't particularly about picking a good tool; though there are definitely tradeoffs between tools, the problem is mostly in capturing logic in a consistent way, providing a good interface, and ensuring that builds happen as efficiently as possible.

Regardless, I'm a strong believer in centralizing as much functionality in the buildsystem as possible and making sure that CI just calls into build systems. Good build systems:

  • allow you to build or rebuild (or test/subtest) only subsets of work, to allow quick iteration during development and debugging.
  • center around a model of artifacts (things produced) and dependencies (requires-type relationships between artifacts).
  • have clear defaults, automatically detect dependencies and information from the environment, and perform any required set up and teardown for the build and/or test.
  • provide a unified interface for the developer workflow, including building, testing, and packaging.

The upside is that the effort you put into the development of a buildsystem pays dividends not just in managing the complexity of CI deployments, but also in making local development stable and approachable for new developers.
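
To illustrate the artifact/dependency model (a toy sketch, not any particular tool; the targets and actions are hypothetical):

    # A toy buildsystem core: targets name artifacts, declare dependencies,
    # and build in dependency order, each at most once.
    def build(name, targets, built=None):
        built = set() if built is None else built
        if name in built:
            return
        for dep in targets[name].get("deps", []):
            build(dep, targets, built)  # dependencies first
        print(f"building {name}")
        targets[name]["action"]()       # produce the artifact
        built.add(name)

    # CI should just call these entry points, with no CI-only logic.
    targets = {
        "compile": {"deps": [], "action": lambda: None},
        "test": {"deps": ["compile"], "action": lambda: None},
        "package": {"deps": ["compile"], "action": lambda: None},
    }

    build("test", targets)  # -> building compile, building test

The point isn't the implementation, it's the interface: local development and CI both call the same entry points, so nothing exists only in CI.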

[1]Canonically, buildsystems are things like makefiles (or cmake, scons, waf, rake, npm, maven, ant, gradle, etc.) that are responsible for converting your source files into executables, but the lines get blurry in a lot of languages/projects. For Golang, the go tool plays the part of the buildsystem and test harness without much extra configuration, and many environments have a pretty robust separation between building and testing.

T-Shaped Matrices

There's a temptation with CI systems to exercise your entire test suite with a comprehensive and complete range of platforms, modes, and operations. While this works great for some smaller projects, "completism" is not the best way to model the problem. When designing and selecting your tests and test dimensions, consider the following goals and approaches:

  • on one, and only one, platform, run your entire test suite. This platform should probably be very close to the primary runtime of your environment (e.g. when developing a service that runs on Linux, your tests should run in a system that resembles the production environment,) or possibly your primary development environment.
  • for all platforms other than your primary platform, run only the tests that are directly related to that runtime/platform (e.g. anything that might be OS- or processor-specific,) plus some small subset of "verification" or acceptance tests. I would expect these tests to easily complete in 10% of the time of a "full build."
  • consider operational variants (e.g. if your product has multiple major runtime modes, or some kind of pluggable sub-system) and select the set of tests which verifies these modes of operation. (A small selection sketch follows this list.)
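
A minimal sketch of this kind of selection, with hypothetical platform and tag names:

    # T-shaped selection: the full suite on the primary platform, only
    # platform-tagged plus smoke tests elsewhere. Names are hypothetical.
    ALL_TESTS = [
        {"name": "test_core_logic", "tags": {"smoke"}},
        {"name": "test_file_paths", "tags": {"windows"}},
        {"name": "test_epoll_backend", "tags": {"linux"}},
        {"name": "test_full_integration", "tags": set()},
    ]

    def select_tests(platform, primary="linux"):
        if platform == primary:
            return ALL_TESTS              # the deep bar of the T
        return [t for t in ALL_TESTS      # the wide, shallow top
                if platform in t["tags"] or "smoke" in t["tags"]]

    for plat in ("linux", "windows", "macos"):
        print(plat, [t["name"] for t in select_tests(plat)])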

In general, the shape of the matrix should be T-shaped: "wide across" the platforms, with a long "narrow down" on the primary platform. The danger, more than anything, is in running too many tests, which is a problem because:

  • more tests increase the chance of a false negative (caused by the underlying infrastructure, service dependencies, or even flaky tests,) which means you risk spending more time chasing down problems. Running tests that provide signal is good, but the chance of false negatives is a liability.
  • responsiveness of CI frameworks is important but incredibly difficult, and running fewer things can improve responsiveness. While parallelism might help with some kinds of runtime limitations as the number of tests grows, it incurs overhead and is expensive.
  • actual failures become redundant and difficult to attribute in "complete matrices." A test of certain high-level systems may pass or fail consistently along all dimensions, creating more noise when something fails. With any degree of non-determinism or chance of a false negative, running tests more than once just makes it more difficult to attribute failures to a specific change or an intermittent bug.
  • some testing dimensions don't make sense, leading to wasted time addressing test failures. For example when testing an RPC protocol library that supports both encryption and authentication, it's not meaningful to test the combination of "no-encryption" and "authentication," although the other three axes might be interesting.

The ultimate goal, of course, is to have a test matrix that you are confident will catch bugs when they occur, is easy to maintain, and helps you build confidence in the software that you ship.

Conclusion

Many organizations have teams dedicated to maintaining buildsystems and CI, and that's often appropriate: keeping CI alive is of huge value. It's also very possible for CI and related tools to accrue complexity and debt in ways that are difficult to maintain, even with dedicated teams: taking a step back and thinking strategically about CI, buildsystems, and overall architecture can be very powerful, and really improve the value provided by the system.

Get More Done

It's really easy to overthink the way that we approach our work and manage our own time and projects. There is no shortage of tools, services, books, and methods for organizing your days and work, and while there are a lot of good ideas out there, it's easy to get stuck fiddling with how you work at the expense of actually getting work done. While I've definitely thought about this a lot over time, for a long time I've mostly just done things and not really worried much about the things on my to-do list. [1]

I think about the way that I work alone similarly to the way that I think about working with other people. The way you work alone is different from collaboration, but a lot of the principles of thinking about big goals and smaller actionable items are pretty transferable.

My suggestions here are centered around the idea that you have a todo list, and that you spend a few moments a day looking at that list, but actually I think the way I think about my work is really orthogonal to any specific tools. For years, most of my personal planning has revolved around making a few lists in a steno pad once or twice a day, [2] though I've been trying to do more digital things recently. I'm not sure I like it. Again, tools don't matter.

[1]Though, to be clear, I've had the pleasure and benefit of working in an organization that lives-and-dies by a bug tracking system, with a great team of folks doing project management. So there are other people who manage sprints, keep an eye on velocity, and make sure that issues don't get stuck.
[2]My general approach is typically to have a "big projects" or "things to think about" list and a "do this next" list, with occasional lists of all the things in a specific big project. In retrospect these map reasonably well to SCRUM/Agile concepts, but it also just makes sense.

Smaller Tasks are Always Better

It's easy to plan projects from the "top down," identifying the major components and planning your work around those components, and the times that I run into trouble are always the times when my "actionable pieces" are too big. Smaller pieces help you build momentum, allow you to move around to different areas as your attention and focus change, and help you use available time effectively (when you want.)

It's easy to find time in-between meetings, or while the pasta water is boiling, to do something small and quick. It's also very easy to avoid starting something big until you have a big block of unfettered time. The combination of these factors makes bigger tasks liabilities, and more likely to take even longer to complete.

Multi-Task Different Kinds of Work

I read a bunch of articles that suggest that the way to be really productive is to figure out ways of focusing and avoiding context switches. I've even watched a lot of coworkers organize their schedules and work around these principles, and it's always been something of a mystery to me. It's true that too much multi-tasking and context switching can lead to a fragmented experience and make some longer/complicated tasks harder to really dig into, but it's possible to manage the costs of context switching by breaking apart bigger projects into smaller projects and leaving notes for your (future) self as you work.

Even if you don't do a lot of actual multitasking within a given hour or day of time, it's hard to avoid really working on different kinds of projects on the scale of days or weeks, and I've found that having multiple projects in flight at once actually helps me get more done. In general I think of this as the idea that more projects in flight means that you finish things more often, even if the total number of projects completed is the same in the macro context.

Regardless, different stages of a project require different kinds of attention and energy, and having a few things in flight increases the chance that when you're in the mood to do some research, or editing, or planning, you have a project with that kind of work all queued up. I prefer to be able to switch to different kinds of work depending on my attention and mood. In general, my work falls into the following kinds of activities:

  • planning (e.g. splitting up big tasks, outlining, design work,)
  • generative work (e.g. writing, coding, etc.)
  • organizational (email, collaboration coordination, user support, public issue tracking, mentoring, integration, etc.)
  • polishing (editing, writing/running tests, publication prepping,)
  • reviewing (code review, editing, etc.)

Do the Right Things

My general approach is "do lots of things and hope something sticks," which makes the small assumption that all of the things you do are important. It's fine if not everything is the most important, and it's fine to do things a bit out of order, but it's probably a problem if you do lots of things without getting important things done.

So I'm not saying to establish a priority for every task and execute them strictly in that order, at all. Part of the problem is just making sure that the things on your list are still relevant, and still make sense. As we do work and time passes, we have to rethink or re-chart how we're going to complete a project, and that reevaluation is useful.

Prioritization and task selection is incredibly hard, and it's easy to cast "prioritization" in over simplified terms. I've been thinking about prioritization, for my own work, as being a decision based on the following factors:

  • deadline (when does this have to be done: work on things that have hard deadlines or expected completion times, ordered by expected completion date, to avoid needing to cram at the last moment.)
  • potential impact (do things that will have the greatest impact before lesser impact, this is super subjective, but can help build momentum, and give you a chance to decide if lower-impact items are worth doing.)
  • time availability fit (do the biggest thing you can manage with the time you have at hand, as smaller things are easier to fit in later,)
  • level of understanding (work on the things that you understand the best, and give yourself the opportunity to plan things that you don't understand later. I sometimes think about this as "do easy things first," but that might be too simple.)
  • time outstanding (how long ago was this task created: do older things first to prevent them from becoming stale.)
  • number of things (or people) that depend on this being done (work on things that will unblock other tasks or collaborators before things that don't have any dependencies, to help increase overall throughput.)

Maintain a Pipeline of Work

Productivity, for me, has always been about getting momentum on projects and being able to add more things. For work projects, there's (almost) always a backlog of tasks, and the next thing is usually pretty obvious, but sometimes this is harder for personal projects. I've noticed a tendency in myself to prefer "getting everything done" on my personal todo list, which I don't think is particularly useful. Having a pipeline or backlog of work is great:

  • there's always something next to do, and there isn't a moment when you've finished and have to think about new things.
  • keeping a list of things that you are going to do in the more distant future lets you start thinking about how bigger pieces fit together without needing to start working on them.
  • you can add big things to your list(s) and then break them into smaller pieces as you make progress.

As an experiment, think about your todo list not as a set of items you'd like to finish, but as a list that shouldn't be shorter than a certain number of items (say 20 or 30?) with a rate of completion (10 a week?), though you should choose your own numbers, and set goals based on what you see yourself getting done over time.

Open Source Emacs Configuration Improvements

In retrospect, I'm not totally sure why it took me so long to release my emacs configuration to the world. I find tweaking Emacs Lisp to be soothing, and in 2020 these kinds of projects are particularly welcome. I've always thought about making it public: I feel like I get a lot out of Emacs, and I'm super aware that it's very hard for people who haven't been using Emacs forever to get a comparable experience. [1]

I also really had no idea what to expect, and while it's still really recent, I've noticed a few things which are worth remarking on:

  • Making your code usable for other people really does make it easy for people to find bugs. While it's likely that there are bugs that people never noticed, I found a few things very quickly:

    • Someone reported higher-than-expected CPU use, and I discovered that there were a number of functions that ran regularly in timers, and I was able to quickly tune some knobs to reduce average CPU use by a lot. This is likely to be great for the user in question, and it'll also help battery life.
    • The config includes a git submodule (!) with the contents of all third-party packages, mostly to reduce friction for people getting started: downloading all of the packages fresh from the archive would take a few minutes, and the git clone is just faster. I realized, when someone ran into problems running with emacs 28 (i.e. the development/mainline build,) that the byte-compilation formats were different, which made the emacs27 files not work on emacs28. I pushed a second branch.

    More than anything the experience of getting bug reports and feedback has been great. It both makes it possible to focus time because the impact of the work is really clear, and it also makes it clear to me that I've accumulated some actually decent Emacs Lisp skills, without really noticing it. [2]

  • I was inspired to make a few structural improvements.

    • For a long time, including after the initial release, I had a "settings" file and a "local functions" file that held code that I'd written or copied from one place or another. I finally divided them all into packages named tychoish-<thing>.el, which allowed me to put all or most of the configuration into use-package forms, which is more consistent, helps startup time a bit, and makes the directory structure a bit easier.
    • I also cleaned up a bunch of local snippets that I'd been carrying around, which wasn't hurting anything, but things are a bit clearer in the present form.
  • I believe that I've hit the limit with regard to startup speed. I'd really like to get a GUI emacs instance to start (with no buffers) in less than a second, but it doesn't seem super plausible. I got really close. At this point there are a few factors that constrain startup time:

    • Raw CPU speed. I have two computers, and the machine with the newer CPU is consistently 25% faster than the slow computer.
    • While the default configuration doesn't do this, my personal configuration sets a font (this is reasonable,) but it seems that the time to do this is sometimes observable, and proportional to the number of fonts you have installed on the system. [3]
    • Dependencies during the early load. I was able to save about 10% by moving a function between packages to reduce the packages that startup code depended upon. There's just a limit to how much you can clean up here.

    Having said that, these things can drift pretty easily. I've added some helper macros with-timer and with-slow-op-timer that I can use to report the timing of operations during startup to make sure that things don't slow down.

    Interestingly, I've experimented with byte-compiling my local configuration and I haven't really noticed much of a speedup at this scale, so for ease I've been leaving my own lisp directory un-byte-compiled.

  • With everything in order, there's not much to edit! I guess I'll have other things to work on, but I have made a few improvements, generally:

    • Using the alert package for desktop notification, which allowed me to delete a legacy package I've been using. Deleting code is awesome.
    • I finally figured out how to really take advantage of projectile, which is now configured correctly, and has been a lot of help in my day-to-day work.
    • I've started using ERC more, and only really use my irssi (in screen) session as a fallback. My IRC/IM setup is a bit beyond the scope of this post, but ERC has been a bit fussy to use on machines with intermittent connections; I think I've been able to tweak that pretty well and now have an experience that's quite good.

It's been interesting! And I'm looking forward to continuing to do this!

[1]Sure, other editors also have long setup curves, but Emacs is particularly gnarly in this regard, and I think adoption by new programmers is definitely constrained by this fact.
[2]I never really thought of myself as someone who wrote Emacs Lisp: I've never really written a piece of software in Emacs, it's always been a function here or there, or modifying some snippet from somewhere. I don't know if I have a project or a goal that would involve writing more emacs software, but it's nice to recognize that I've accidentally acquired a skill.
[3]On Windows and macOS systems this may not matter, but you may have more fonts installed than you need. I certainly did. Be aware that web browsers often download their own fonts separately from system fonts, so the fonts you have installed really only affect your GTK/QT/UI use, and not the place where you're likely doing most of your font interaction (e.g. the browser.)