How to Become a Self-Taught Programmer

i am a self taught programmer. i'm not sure i'd recommend it to anyone else: there are so many different curricula and training programs that are well tested and very efficacious. for lots of historical reasons, many programmers end up being all or mostly self taught: in the early days because programming was vocational and people learned on the job, then because people learned programming on their own before entering cs programs, and more recently because the demand for programmers (and the rate of change) has outstripped formal training for the kinds of programming work that are most in demand these days. and knowing that it's possible (and potentially cheaper) to teach yourself, it seems like a tempting option.

this post, then, is a collection of suggestions, guidelines, and pointers for anyone attempting to teach themselves to program:

  • focus on learning one thing (programming language and problem domain) at a time. there are so many different things you could learn, and people who know how to program seem to have an endless knowledge of different things. knowing one set of tools and one area (e.g. "web development in javascript," or "system administration in python,") gives you the framework to expand later, and the truth is that you'll be able to learn additional things more easily once you have a framework to build upon.

  • when learning something in programming, always start with a goal. have some piece of data that you want to explore or visualize, have a set of files that you want to organize, or something that you want to accomplish. learning how to program without a goal means that you don't end up asking the kinds of questions you need to form the right kinds of associations.
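    for example, a goal might be as small as tidying a downloads folder. a minimal sketch in python (the directory path and layout here are hypothetical):

      from pathlib import Path

      def organize_by_extension(directory):
          """move every file in `directory` into a subfolder named for its extension."""
          root = Path(directory).expanduser()
          for item in list(root.iterdir()):
              if item.is_file():
                  # files without an extension go into a "misc" folder
                  dest = root / (item.suffix.lstrip(".") or "misc")
                  dest.mkdir(exist_ok=True)
                  item.rename(dest / item.name)

      # hypothetical usage: organize_by_extension("~/Downloads")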

  • programming is all about learning: if you program for long enough you'll end up learning many different languages and tools, and being able to pick up new things is the real skill.

  • being able to clearly capture what you were thinking when you write code is basically a programming superpower.

  • programming is about understanding problems [1] abstractly and building abstractions around various kinds of problems. being able to break these problems apart into smaller core issues, and to think abstractly enough about the problem that you can solve both the version in front of you and related versions you'll meet in the future, are crucial skills.

  • collect questions and curiosities as you encounter them, and use them to guide your background reading, but don't feel like you have to understand everything immediately, or hunt down the answer to every unfamiliar term the moment you hear or see it. if you're reasonably rigorous about going back and looking things up, you'll build a good base of knowledge over time.

  • always think about the users of your software as you build, at every level. even if you're building software for your own use, think about the future version of yourself that will use it, imagine that other people might use the interfaces and functions that you write, and think about the assumptions they might bring to the table. think about the output that your program, script, or function produces, and how someone would interact with that output.

  • think about the function as the fundamental building block of your software. lower level forms (i.e. statements) are required, but functions are the unit where meaning is created in the context of a program. functions (or methods) take input (arguments, usually, but sometimes also an object in the case of methods) and produce some output, sometimes with some kind of side-effect. the best functions:

    • clearly indicate side-effects when possible.
    • have a mechanism for reporting on error conditions (exceptions, return values,)
    • avoid dependencies on external state, beyond what is passed as arguments.
    • are as short as possible.
    • use names that clearly describe the behavior and operations of the function.

    if programming were human language (english,) you'd strive to construct functions that were simple sentences, not paragraphs, but also more than a couple of words or phrases, and you would want these sentences to be easy to understand with limited context. if you have good functions, interfaces become clearer and easier to use, and code becomes easier to read, debug, and test.
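    to make this concrete, a small sketch in python of what these properties look like in practice (the function and its types are hypothetical):

      def parse_port(value: str) -> int:
          """parse a tcp port number from a string.

          no external state: everything the function needs arrives as an
          argument. errors are reported with an exception rather than a
          sentinel value, and the name describes the behavior.
          """
          port = int(value)  # raises ValueError for non-numeric input
          if not 0 < port <= 65535:
              raise ValueError(f"port out of range: {port}")
          return port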

  • avoid being too weird. many programmers are total nerds, and you may be too, and that's ok, but it's easier to learn how to do something if there's prior art that you can learn from and copy. on a day-to-day basis, a lot of programming work is just doing something until you get stuck and then googling for the answer. if you're doing something weird (using a programming language that's less widely used, or working in a problem space that's a bit outside the mainstream,) it can be harder to find answers to your problems.

Notes

[1]I use the term "problem" to cover both things like "connecting two components internally" and also more broadly "satisfying a requirement for users," and programming often includes both of these kinds of work.

Does Anyone Actually Want Serverless?

Cloud computing, and with it most of tech, has been really hot on the idea of "serverless" computing, which is to say, services and applications that are deployed, provisioned, and priced separately from conventional "server" resources (memory, storage, bandwidth.) The idea is that we can build and expose ways of deploying and running applications and services, even low-level components like "databases" and "function execution", in ways that mean that developers and operators can avoid thinking about computers qua computers.

Serverless is the logical extension of "platform as a service" offerings, which have been an oft-missed goal for a long time. You write high-level applications and code that is designed to run in some kind of sandbox, with external services provided in some kind of à la carte model via integrations with other products or services. The PaaS, then, can take care of everything else: load balancing incoming requests, redundancy to support higher availability, and any kind of maintenance on the lower-level infrastructure. Serverless is often just PaaS but more: provide a complete stack of services to satisfy needs (databases, queues, background work, authentication, caching, on top of the runtime,) and then change the pricing model to be based on requests/utilization rather than time or resources.
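To make the "function execution" model concrete, a handler in such a system often looks something like this sketch (modeled loosely on the Python handler convention of AWS Lambda behind an HTTP gateway; the event fields here are assumptions):

    import json

    def handler(event, context):
        """Entry point invoked by the platform; no server code in sight.

        The platform handles routing, scaling, and the runtime; this function
        only sees a request-shaped `event` and returns a response-shaped dict.
        """
        params = event.get("queryStringParameters") or {}
        name = params.get("name", "world")
        return {
            "statusCode": 200,
            "body": json.dumps({"greeting": f"hello, {name}"}),
        }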

Fundamentally, this allows a separation of concerns between "writing software" and "running software," and allows much if not all of the running of software to be delegated to service providers. This kind of separation is useful for developers, and in general runtime environments seem like the kind of thing that most technical organizations shouldn't need to focus on: outsourcing may actually be good, right?

Well maybe.

Let's be clear, serverless platforms primarily benefit the provider of the services for two reasons:

  • serverless models allow providers to build offerings that are multi-tenant, and give providers the ability to reap the benefits of managing request load dynamically and sharing resources between services/clients.
  • utilization pricing for services is always going to be higher than commodity pricing for the underlying components. Running your own servers ("metal") is cheaper than using cloud infrastructure, over time, but capacity planning, redundancy, and management overhead make that difficult in practice. The proposition is that while serverless may cost more per unit, it has lower management costs for users (fewer people in "ops" roles,) and is more flexible if request patterns change. A back-of-envelope comparison follows below.
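To illustrate the shape of that tradeoff, a quick sketch (every number here is hypothetical, purely for illustration):

    # hypothetical monthly costs, purely illustrative
    server_cost = 500.0          # fixed: provisioned servers + ops overhead
    per_million_requests = 25.0  # utilization-priced serverless rate

    def monthly_cost(requests_millions):
        return {
            "serverless": per_million_requests * requests_millions,
            "servers": server_cost,  # flat until you need more capacity
        }

    # below the break-even volume serverless wins; above it, servers do,
    # if (and only if) you can staff the management overhead
    break_even = server_cost / per_million_requests  # 20M requests/month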

So we know why the industry seems to want serverless to be a thing, but does it actually make sense?

Maybe?

Makers of software strive (or ought to strive) to make their software easy to run, and having very explicit expectations about the runtime environment makes software easier to run. Similarly, being able to write code without needing to manage the runtime, monitoring, and logging, while using packaged services for caching, storage, and databases, seems like a great boon.

The downsides to software producers, however, are plentiful:

  • vendor lock-in is real: it places your application at the mercy of an external provider, both as they do maintenance and as their API and platform evolve on their timeframe.
  • hosted systems mean that it's difficult to do local development and testing: either every developer needs their own sandbox (at some expense and management overhead), or you have to maintain a separate runtime environment for development.
  • applications cannot have service levels which exceed the service level agreements of their underlying providers. If your serverless platform has an SLA which is less robust than the SLA of your application, you're in trouble.
  • when something breaks, there are few operational remedies available. Upstream timeouts are often not changeable and most forms of manual intervention aren't available.
  • pricing probably only makes sense for organizations operating at small scale (most organizations, particularly for greenfield projects,) and is rarely viable at any kind of scale.
  • some problems and kinds of software just don't work in a serverless model: big data sets that exceed reasonable RAM requirements, data processing problems which aren't easily parallelizable, workloads with long running operations, or workloads that require lower level network or hardware access.
  • most serverless systems will incur some overhead over dedicated/serverful alternatives and therefore have worse performance/efficiency, and potentially less predictable performance, especially in very high-volume situations.

Where does that leave us?

  • Many applications and bespoke tooling should probably use serverless tools. Particularly if your organization is already committed to a specific cloud ecosystem, this can make a lot of sense.
  • Prototypes should unequivocally rely on off-the-shelf, potentially serverless tooling, particularly for things like runtimes.
  • If and when you begin to productionize applications, find ways to provide useful abstractions between the deployment system and the application. These kinds of architectural choices help address concerns about lock-in and make it easier to do development work without external dependencies.
  • Think seriously about your budget for doing operational work, holistically, if possible, and how you plan to manage serverless components (access, cost control, monitoring and alerting, etc.) in connection with existing infrastructure.

Serverless is interesting, and it's worth asking "what if application development happened in a very controlled environment with a very high-level set of APIs?" There are clearly a lot of cases where it makes a lot of sense, and then a bunch of situations where it's clearly a suspect call. It's early days, so we'll see in a few years how things work out. In any case, thinking critically about infrastructure is always a good plan.

The Kubernetes Cloud Mainframe

I made a tongue-in-cheek comment on twitter a while back that k8s is just the contemporary API for mainframe computing, but as someone who is both very skeptical of and very excited about the possibilities of kube, this feels like something I want to expand upon.

A lot of my day-to-day work has some theoretical overlap with kube, including batch processing, service orchestration, and cloud resource allocation. Lots of people I encounter are also really excited by kube, and it's interesting to balance that excitement with my understanding of the system, and to watch how Kubernetes (as a platform) impacts the way that we develop applications.

I also want to be clear that my comparison to mainframes is not a disparagement: not only do I think there's a lot of benefit to be gained by thinking about the historical precedents of our paradigms, I would also assert that the trends in infrastructure over the last 10 or 15 years (e.g. virtualization, containers, cloud platforms) have focused on bringing mainframe paradigms to a commodity environment.

Observations

  • clusters are usually functionally static. I know that the public clouds have autoscaling abilities, but having really elastic infrastructure requires substantial additional work, and there are some reasonable upper limits in terms of numbers of nodes, which makes it hard to actually operate elastically. It's probably also the case that elastic infrastructure has always been (mostly) a pipe dream at most organizations.
  • some things remain quite hard, chiefly in my mind:
    • autoscaling, both of the cluster itself and of the components running within the cluster. Usage doesn't always follow easy-to-detect patterns, so figuring out ways to make infrastructure elastic may take a while to converge or become common. Indeed, VMs and clouds were originally thought to be able to provide some kind of elastic/autoscaling capability, and by and large, most cloud deployments do not autoscale.
    • multi-tenancy, where multiple different kinds of workloads and use-cases run on the same cluster, is very difficult to schedule for reliably or predictably, which leads to a need to overprovision more for mixed workloads.
  • kubernetes does not eliminate the need for an operations team or vendor support for infrastructure or platforms.
  • decentralization has costs, and putting all of the cluster configuration in etcd imposes some limitations, mostly around performance. While I think decentralization is correct in many ways for Kubernetes, application developers may need systems that have lower latency and tighter scheduling abilities.
  • The fact that you can add applications to an existing cluster, or host a collection of small applications, is mostly a symptom of clusters being overprovisioned. This probably isn't bad, and it's almost certainly the case that you can reduce the overprovisioning bias with kube, to some degree.

Impact and Predictions

  • applications developed for Kubernetes will eventually become difficult or impossible to run without Kubernetes. This has huge impacts on developer experience and test experience. I'm not sure that this is a problem, but I think it's a hell of a dependency to pick up. This was true of applications that targeted mainframes as well.
  • Kubernetes will eventually replace vendor specific APIs for cloud infrastructure for most higher level use cases.
  • Kubernetes will primarily be deployed by Cloud providers (RedHat/IBM, Google, AWS, Azure, etc.) rather than by infrastructure teams.
  • Right now, vendors are figuring out what kinds of additional services users and applications need to run in Kubernetes, but eventually there will be another layer of tooling on top of Kubernetes:
    • logging and metrics collection.
    • deployment operations and configuration, particularly around coordinating dependencies.
    • authentication and credential management.
    • low-latency offline task orchestration.
  • At some point, we'll see a move toward multi-cluster orchestration, or more powerful approaches to workload isolation within a single cluster.

Conclusion

Kubernetes is great, and it's been cool to see how, particularly in the last couple of years, it's emerged to bring together things like cloud infrastructure and container orchestration. At the same time, it (of course!) doesn't solve all of the problems that developers have with their infrastructure, and I'm really excited to see how people build upon Kubernetes to address some of those higher level concerns, and make it easier to build software on top of the resulting platforms.

Continuous Integration is Harder Than You Think

I've been working on continuous integration systems for a few years, and while the basic principle of CI is straightforward, it seems that most CI deployments are not. This makes sense: project infrastructure is an easy place to defer maintenance during the development cycle, and projects often prioritize feature development and bug fixing over tweaking the buildsystem or test infrastructure, but I think there's something more to it. This post is a consideration of what makes CI hard, and perhaps provides a bit of unsolicited advice.

The Case for CI

I suppose I don't really have to sell anyone on the utility or power of CI: running a set of tests on your software regularly allows developers and teams to catch bugs early, and saves a bucket of developer time, and that is usually enough. Really, though, CI ends up giving you the leverage to solve a number of really gnarly engineering problems:

  • how to release software consistently and regularly.
  • how to support multiple platforms.
  • how to manage larger codebases.
  • how to do anything with distributed systems.
  • how to develop software with larger numbers of contributors.

Doing any of these things without CI isn't really viable, particularly at scale. This isn't to say that they "come free" with CI, but CI is often the right place to build the kind of infrastructure required to manage distributed systems problems or release complexity.

Buildsystems are Crucial

One thing that I see teams doing sometimes is approaching their local development processes and tooling with a different set of expectations than they have for CI, and you can totally see and understand how this happens: CI processes always start from a clean environment, and you often want to handle failures in CI differently than you might handle a failure locally. It's really easy to write a shell script that only runs in CI, and then things sort of accumulate, and eventually there emerges a class of features and phenomena that only exist for and because of CI.

The solution is simple: invest in your buildsystem, [1] and ensure that there is minimal (or no!) indirection between your buildsystem and your CI configuration. But buildsystems are hard, and in a lot of cases test harnesses aren't easily integrated into build systems, which complicates the problem. Having a good build system isn't particularly about picking a good tool (though there are definitely tradeoffs between tools); the problem is mostly in capturing logic in a consistent way, providing a good interface, and ensuring that builds happen as efficiently as possible.

Regardless, I'm a strong believer in centralizing as much functionality in the buildsystem as possible and making sure that CI just calls into build systems (a minimal sketch of this follows below). Good build systems:

  • allow you to build or rebuild (or test/subtest) only subsets of work, to allow quick iteration during development and debugging.
  • center around a model of artifacts (things produced) and dependencies (requires-type relationships between artifacts).
  • have clear defaults, automatically detect dependencies and information from the environment, and perform any required set up and teardown for the build and/or test.
  • provide a unified interface for the developer workflow, including building, testing, and packaging.

The upside is that the effort you put into the development of a buildsystem pays dividends not just in managing the complexity of CI deployments, but also in making local development stable and approachable for new developers.
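As promised, a minimal sketch of the "CI just calls the buildsystem" idea, assuming a hypothetical Python project (the task names and underlying commands are assumptions): developers and CI both run the same entry point, so there is no CI-only logic to drift out of sync.

    #!/usr/bin/env python3
    """tasks.py: a single, hypothetical entry point for build/test/lint.

    Developers run `python tasks.py test` locally; CI runs exactly the
    same command, keeping the CI configuration to a one-liner.
    """
    import subprocess
    import sys

    TASKS = {
        "build": ["python", "-m", "build"],
        "test": ["python", "-m", "pytest", "-q"],
        "lint": ["python", "-m", "flake8"],
    }

    def main():
        task = sys.argv[1] if len(sys.argv) > 1 else "test"
        if task not in TASKS:
            sys.exit(f"unknown task: {task} (choose from {sorted(TASKS)})")
        sys.exit(subprocess.run(TASKS[task]).returncode)

    if __name__ == "__main__":
        main()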

[1]Canonically buildsystems are things like makefiles (or cmake, scons, waf, rake, npm, maven, ant, gradle, etc.) that are responsible for converting your source files into executables, but the lines get blurry in a lot of languages/projects. For Golang, the go tool plays the part of the buildsystem and test harness without much extra configuration, and many environments have a pretty robust separation between building and testing.

T-Shaped Matrices

There's a temptation with CI systems to exercise your entire test suite with a comprehensive and complete range of platforms, modes, and operations. While this works great for some smaller projects, "completism" is not the best way to model the problem. When designing and selecting your tests and test dimensions, consider the following goals and approaches (a small selection sketch follows the list):

  • on one, and only one, platform run your entire test suite. This platform should probably be very close to the primary runtime of your environment (e.g. when developing a service that runs on Linux, your tests should run in a system that resembles the production environment,) or possibly your primary development environment.
  • for all platforms other than your primary platform, run only the tests that are either directly related to that runtime/platform (e.g. anything that might be OS or processor specific,) plus some small subset of "verification" or acceptance tests. I would expect that these tests should easily be able to complete in 10% of the time of a "full build,"
  • consider operational variants (e.g. if your product has multiple major runtime modes, or some kind of pluggable sub-system) and select the set of tests which verify these modes of operation.
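A sketch of how that selection logic might look, with hypothetical platform names and suite labels:

    # hypothetical platform/suite selection implementing the "T" shape
    PRIMARY = "linux-x86_64"

    def suites_for(platform):
        """Full suite on the primary platform; platform-specific tests plus
        a small smoke subset everywhere else."""
        if platform == PRIMARY:
            return ["unit", "integration", "acceptance", "platform"]
        return ["platform", "smoke"]  # aim for ~10% of a full build's runtime

    for p in (PRIMARY, "macos-arm64", "windows-x86_64"):
        print(p, suites_for(p))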

In general the shape of the matrix should be T-shaped: "wide across" the platforms, with one long "narrow down" column for the full suite. The danger more than anything is in running too many tests, which is a problem because:

  • more tests increase the chance of a false negative (caused by the underlying systems infrastructure, service dependencies, or even flakey tests,) which means you risk spending more time chasing down problems; see the back-of-envelope calculation after this list. Running tests that provide signal is good, but the chance of false negatives is a liability.
  • responsiveness of CI frameworks is important but incredibly difficult, and running fewer things can improve responsiveness. While parallelism might help with some kinds of runtime limitations when running larger numbers of tests, it incurs overhead and expense.
  • actual failures become redundant, and difficult to attribute, in "complete matrices." A test of certain high level systems may pass or fail consistently along all dimensions, creating more noise when something fails. With any degree of non-determinism or chance of a false negative, running tests more than once just makes it more difficult to attribute failures to a specific change or an intermittent bug.
  • some testing dimensions don't make sense, leading to wasted time addressing test failures. For example when testing an RPC protocol library that supports both encryption and authentication, it's not meaningful to test the combination of "no-encryption" and "authentication," although the other three axes might be interesting.
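To make the false-negative point concrete, a quick back-of-envelope calculation (the flake rate is a made-up illustration):

    # if each test independently "flakes" with probability p, the chance that
    # a run of n tests reports at least one spurious failure is 1 - (1 - p)^n
    p = 0.001  # hypothetical 0.1% flake rate per test

    for n in (100, 1_000, 10_000):
        print(n, round(1 - (1 - p) ** n, 3))
    # 100 tests  -> ~0.095
    # 1,000      -> ~0.632
    # 10,000     -> ~1.0 (a spurious failure on nearly every run)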

The ultimate goal, of course, is to have a test matrix that you are confident will catch bugs when they occur, is easy to maintain, and helps you build confidence in the software that you ship.

Conclusion

Many organizations have teams dedicated to maintaining buildsystems and CI, and that's often appropriate: keeping CI alive is of huge value. It's also very possible for CI and related tools to accrue complexity and debt in ways that are difficult to maintain, even with dedicated teams: taking a step back and thinking about CI, buildsystems, and overall architecture strategically can be very powerful, and really improve the value provided by the system.

Get More Done

It's really easy to overthink the way that we approach our work and manage our own time and projects. There is no shortage of tools, services, books, and methods for organizing your days and work, and while there are a lot of good ideas out there, it's easy to get stuck fiddling with how you work at the expense of actually getting work done. While I've definitely thought about this a lot over time, for a long time I've mostly just done things and not really worried much about the things on my to-do list. [1]

I think about the way I work alone much as I think about the way I work with other people. Working alone is different from collaboration, but a lot of the principles of thinking about big goals and smaller actionable items are pretty transferable.

My suggestions here are centered around the idea that you have a todo list and that you spend a few moments a day looking at that list, but actually I think the way I think about my work is orthogonal to any specific tool. For years, most of my personal planning has revolved around making a few lists in a steno pad once or twice a day, [2] though I've been trying to do more digital things recently. I'm not sure I like it. Again, tools don't matter.

[1]Though, to be clear, I've had the pleasure and benefit of working in an organization that lives-and-dies by a bug tracking system, with a great team of folks doing project management. So there are other people who manage sprints, keep an eye on velocity, and make sure that issues don't get stuck.
[2]My general approach is typically to have a "big projects" or "things to think about" list and a "do this next" list, with occasional lists about all the things in a specific big project. In retrospect these map reasonably well to SCRUM/Agile concepts, which also makes sense.

Smaller Tasks are Always Better

It's easy to plan projects from the "top down," identifying the major components and planning your work around those components, and the times that I run into trouble are always the times when my "actionable pieces" are too big. Smaller pieces help you build momentum, allow you to move around to different areas as your attention and focus change, and help you use available time effectively (when you want.)

It's easy to find time in-between meetings, or while the pasta water is boiling, to do something small and quick. It's also very easy to avoid starting something big until you have a big block of unfettered time. The combination of these factors makes bigger tasks liabilities, and more likely to take even longer to complete.

Multi-Task Different Kinds of Work

I read a bunch of articles that suggest that the way to be really productive is to figure out ways of focusing and avoiding context switches. I've even watched a lot of coworkers organize their schedules and work around these principles, and it's always been something of a mystery to me. It's true that too much multi-tasking and context switching can lead to a fragmented experience and make some longer/complicated tasks harder to really dig into, but it's possible to manage the costs of context switching by breaking bigger projects apart into smaller projects and leaving notes for your (future) self as you work.

Even if you don't do a lot of actual multitasking within a given hour or day of time, it's hard to avoid really working on different kinds of projects on the scale of days or weeks, and I've found that having multiple projects in flight at once actually helps me get more done. In general I think of this as the idea that more projects in flight means that you finish things more often, even if the total number of projects completed is the same in the macro context.

Regardless, different stages of a project require different kinds of attention and energy, and having a few things in flight increases the chance that when you're in the mood to do some research, or editing, or planning, you have a project with that kind of work all queued up. I prefer to be able to switch to different kinds of work depending on my attention and mood. In general my work falls into the following kinds of activities:

  • planning (e.g. splitting up big tasks, outlining, design work,)
  • generative work (e.g. writing, coding, etc.)
  • organizational (email, collaboration coordination, user support, public issue tracking, mentoring, integration, etc.)
  • polishing (editing, writing/running tests, publication prepping,)
  • reviewing (code review, editing, etc.)

Do the Right Things

My general approach is "do lots of things and hope something sticks," which makes the small assumption that all of the things you do are important. It's fine if not everything is the most important, and it's fine to do things a bit out of order, but it's probably a problem if you do lots of things without getting important things done.

So I'm not saying establish a priority for all tasks and execute them in strictly that priority, at all. Part of the problem is just making sure that the things on your list are still relevant, and still make sense. As we do work and time passes, we have to rethink or rechart how we're going to complete a project, and that reevaluation is useful.

Prioritization and task selection are incredibly hard, and it's easy to cast "prioritization" in oversimplified terms. I've been thinking about prioritization, for my own work, as a decision based on the following factors (a toy scoring sketch follows the list):

  • deadline (when does this have to be done: work on things that have hard deadlines or expected completion times, ordered by expected completion date, to avoid needing to cram at the last moment.)
  • potential impact (do things that will have the greatest impact before lesser impact, this is super subjective, but can help build momentum, and give you a chance to decide if lower-impact items are worth doing.)
  • time availability fit (do the biggest thing you can manage with the time you have at hand, as smaller things are easier to fit in later,)
  • level of understanding (work on the things that you understand the best, and give yourself the opportunity to plan things that you don't understand later. I sometimes think about this as "do easy things first," but that might be too simple.)
  • time outstanding (how long ago was this task created: do older things first to prevent them from becoming stale.)
  • number of things (or people) that depend on this being done (work on things that will unblock other tasks or collaborators before things that don't have any dependencies, to help increase overall throughput.)
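None of this reduces cleanly to a formula, but as a thought experiment, here's a sketch of scoring tasks on these factors (the fields and weights are entirely made up; the point is the factors, not the numbers):

    from dataclasses import dataclass

    @dataclass
    class Task:
        days_to_deadline: float  # smaller -> more urgent
        impact: int              # 1-5, admittedly subjective
        fits_available_time: bool
        understanding: int       # 1-5, how well I understand the work
        age_days: int            # time outstanding
        blocks: int              # tasks/people waiting on this

    def priority(t: Task) -> float:
        # made-up weights, purely illustrative
        return (
            10.0 / max(t.days_to_deadline, 0.5)
            + 2.0 * t.impact
            + (1.0 if t.fits_available_time else 0.0)
            + 0.5 * t.understanding
            + 0.05 * t.age_days
            + 1.5 * t.blocks
        )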

Maintain a Pipeline of Work

Productivity, for me, has always been about getting momentum on projects and being able to add more things. For work projects, there's (almost) always a backlog of tasks, and the next thing is usually pretty obvious, but sometimes this is harder for personal projects. I've noticed a tendency in myself to prefer "getting everything done" on my personal todo list, which I don't think is particularly useful. Having a pipeline or backlog of work is great:

  • there's always something next to do, and there isn't a moment when you've finished and have to think about new things.
  • keeping a list of things that you are going to do in the more distant future lets you start thinking about how bigger pieces fit together without needing to start working on them.
  • you can add big things to your list(s) and then break them into smaller pieces as you make progress.

As an experiment, think about your todo list not as a set of items you'd like to finish, but as a list that shouldn't drop below a certain length (say 20 or 30? items) with a target rate of completion (10 a week?), though you should choose your own numbers, and set goals based on what you see yourself getting done over time.

Staff Engineering

In August of 2019 I became a Staff Engineer, which is what a lot of companies are calling their "level above Senior Engineer" role these days. Engineering leveling is a weird beast, which is probably a post unto itself. Despite my odd entry into a career in tech, my path in the last 4 or 5 years has been pretty conventional; however, somehow, despite having an increasingly normal career trajectory, explaining what I do on a day to day basis has not gotten easier.

Staff Engineers are important for scaling engineering teams, but lots of teams get by without them, and unlike more junior engineers, who have broadly similar job roles, there are a lot of different ways to be a Staff Engineer, which only muddies things. This post is a reflection on some key aspects of my experience, organized into topics that I hope will be useful for people who may be interested in becoming staff engineers or managing such a person. If you're also a Staff Engineer and your experience is different, I wouldn't be particularly surprised.

Staff Engineers Help Teams Build Great Software

Lots of teams function just fine without Staff Engineers, and teams can build great products without having contributors in Staff-type roles. Indeed, because Staff Engineers vary a lot, the utility of having more senior individual contributors on a team depends a lot on the specific engineer and the team in question: finding a good fit is even harder than usual. In general, having senior technical leadership can help teams by:

  • giving people managers more space and time to focus on the team organization, processes, and people. Particularly in small organizations, team managers often pick up technical leadership.
  • providing connections and collaborations between groups and efforts. While almost all senior engineers have a "home team" and are directly involved in a few specific projects, they also tend to have broader scope, and so can help coordinate efforts between different projects and groups.
  • increasing the parallelism of teams, by providing the kind of infrastructure that allows a team to pursue multiple streams of development at one time.
  • supporting the career path and growth of more junior engineers, both as a result of direct mentoring and because having more technical leadership capacity creates opportunities for growth for everyone on the team.

Staff Promotions Reflect Organizational Capacity

Like other promotions, getting promoted to Staff Engineer requires experience and a history of effectiveness, but it's less straightforward than other promotions. This is in part because the ways we think about leveling and job roles (i.e. describing the professional activities and capabilities along several dimensions for each level,) become complicated when there are lots of different ways to be a Staff Engineer. Pragmatically, these kinds of promotions often depend on other factors:

  • the existence of other Staff Engineers in the organization makes it more likely that there's an easy comparison for a candidate.
  • past experience of managers getting Staff+ promotions for engineers. Engineering Managers without this kind of experience may have difficulty creating the right kinds of opportunities within their organizations and advocating for these kinds of promotions.
  • organizational maturity and breadth to support the workload of a Staff Engineer: there are ways to partition teams and organizations that preclude some of the kinds of higher level concerns that justify having Staff Engineers, and while having senior technical leadership is often useful, if the organization can't support it, it won't happen.
  • teams with a sizable population of more junior engineers, particularly where the team is growing, will have more opportunity and need for Staff Engineers. Teams that are, on balance, more senior, or that are small and relatively static, tend to have less opportunity for the kind of broadly synthetic work that tends to lead to Staff promotions.

There are also, of course, some kinds of technical achievements and professional characteristics that Staff Engineers often have, and I'm not saying that anyone in the right organizational context can be promoted, exactly. However, without the right kind of organizational support and context, even the most exceptional engineers will never be promoted.

Staff Promotions are Harder to Get Than Equivalent Management Promotions

In many organizations it's true that Staff promotions are often much harder to get than equivalent promotions to peer-level management positions: the organizational contexts required to support the promotion of engineers into management roles are much easier to create, particularly as organizations grow. As you hire more engineers you need more Engineering Managers. There are other factors:

  • managers control promotions, and it's easier for them to recapitulate their own career paths in their reports than to think about the Staff role, and so more engineers tend to be pushed towards management than senior IC roles. It's also probable that meta-managers benefit organizationally from having more front-line managers in their organizations than more senior ICs, which exacerbates this bias.
  • from an output perspective, Senior Engineers can write the code that Staff Engineers would otherwise write, whereas Engineering Management is difficult to avoid or do without. In other terms, management promotions are often more critical from the organization's perspective and are therefore prioritized over Staff promotions, particularly during growth.
  • cost. Staff Engineers are expensive, often more expensive than managers, particularly at the bottom of the brackets, and it's difficult to imagine that the timing of Staff promotions is not impacted by budgetary requirements.

Promoting a Staff Engineer is Easier than Hiring One

Because there are many valid ways to do the Staff job, and so much of the job is about leveraging context and building broader connections between different projects, people with more organizational experience and history often have an advantage over fresh industry hires. In general:

  • Success as a Staff Engineer in one organization does not necessarily translate to success at another.
  • The conventions within the industry hiring process are good at selecting junior engineers, and there are fewer conventions for more senior roles, which means that candidates are not assessed for the skills and experiences that are relevant to their day-to-day work, while also being penalized for (often) being unexceptional at the kinds of problems that junior engineering interviews focus on. While interview processes are imperfect assessment tools in all cases, they're particularly bad at more senior levels.
  • Senior engineering contributors have the potential to have huge impact on product development and engineering outcomes, all of which requires a bunch of trust on the part of the organization, and that kind of trust is often easier to build with someone who already has organizational experience.

This isn't to say that it's impossible to hire Staff engineers, I'm just deeply dubious of the hiring process for these kinds of roles having both interviewed for these kinds of roles and also interviewed candidates for them. I've also watched more than one senior contributor not really get along well with a team or other leadership after being hired externally, and for reasons that end up making sense in retrospect. It's really hard.

Staff Engineers Don't Not Manage

Most companies have a clear distinction between the career trajectories of people involved in "management" and senior "individual contributor" roles (like Staff Engineers,) with managers involved in leadership for teams and humans, and ICs involved in technical aspects. This seems really clear on paper but is incredibly messy in practice. The decisions that managers make about team organization and prioritization have necessary technical implications, while it's difficult to organize larger scale technical initiatives without awareness of the people and teams. Sometimes Staff Engineers end up doing actual management on a temporary basis in order to fill gaps as organizations change, or to cover parental leave.

It's also the case that a huge part of the job for many Staff Engineers involves direct mentorship of junior engineers, which can involve leading specific projects, conversations about career trajectories and growth, as well as conversations about specific technical topics. This has a lot of overlap with management, and that's fine. The major difference is that senior contributors share responsibility for the people they mentor with their actual managers, and tend to focus mentoring on smaller groups of contributors.

Staff Engineers aren't (or shouldn't be!) managers, even when they are involved in broader leadership work, and even if the specific engineer is capable of doing management work: putting ICs in management roles takes time away from their (likely more valuable) technical projects.

Staff Engineers Write Boring and Tedious But Easy Code

While this is perhaps not a universal view, I feel pretty safe in suggesting that Staff Engineers should be directly involved in development projects. While there are lots of ways to be involved in development (technical design, architecture, reviewing code and documents, project planning and development, and so forth,) I think it's really important that Staff Engineers be involved with code-writing and similar activities. This makes it easy to stay grounded and relevant, and also makes it possible to do a better job at all of the other kinds of engineering work.

Having said that, it's almost inevitable that the kinds of contribution to the code that you make as a Staff Engineer are not the same kinds of contributions that you make at other points in your career. Your attention is probably pulled in different directions. Where a junior engineer can spend most of their day focusing on a few projects and writing code, Staff Engineers:

  • consult with other teams.
  • mentor other engineers.
  • build long and medium term plans for teams and products.
  • break larger projects apart and design APIs between components.

All of this "other engineering work" takes time, and the broader portfolio of concerns means that more junior engineers often have more time and attention to focus on specific programming tasks. The result is that the kind of code you end up writing tends to be different:

  • fixing problems and bugs in systems that require a lot of context. The bugs are often not very complicated themselves, but require understanding the implications of one component with regard to other components, which can make them difficult.
  • projects to enable future development work, including building infrastructure or designing an interface and connecting an existing implementation to that interface ahead of some larger effort: the kind of "refactor things to make it possible to write a new implementation" work.
  • writing small isolated components to support broader initiatives, such as exposing existing information via new APIs, and building libraries to facilitate connections between different projects or components.
  • projects that support the work of the team as a whole: tools, build and deployment systems, code clean up, performance tuning, test infrastructure, and so forth.

These kinds of projects can amount to rather a lot of development work, but they definitely have their own flavor. As I approached Staff and certainly since, the kind of projects I had attention for definitely shifted. I actually like this kind of work rather a lot, so that's been quite good for me, but the change is real.

There's definitely a temptation to give Staff Engineers big projects that they can go off and work on alone, and I've seen lots of teams and engineers attempt this: sometimes these projects work out, though more often the successes feel like an exception. There's no "right kind" of way to write software as a Staff Engineer, and sometimes senior engineers get to work on bigger "core projects." Having said that, if a Staff Engineer is working on the "other engineering" aspects of the job, there's just limited time to do big development projects in a reasonable time frame.

A Common Failure

I've been intermittently working on a common lisp library to produce a binary encoding of arbitrary objects, and I think I'm going to be abandoning the project. This is an explanation of that decision and a reflection on my experience.

Why Common Lisp?

First, some background. I've always thought that Common Lisp is a language with a bunch of cool features and selling points, but I've never really had the experience of writing more than some one-off bits of code in CL, which isn't surprising. This project was a good experience for really digging into writing and managing a conceptually larger project, and a good kick in the pants to learn more.

The things I like:

  • the implementations of the core runtime are really robust and high quality, and make it possible to imagine running your code in a bunch of different contexts. Even though it's a language with relatively few users, it feels solid. The most common implementations also have ways of producing fully self-contained static binaries (like Go, say), which makes the thought of distributing software seem reasonable.
  • quicklisp, a package/library management tool that is relatively new (from the last decade or so,) has really raised the level of the ecosystem. It's not as complete as I'd expect in many ways, but quicklisp changed CL from something quaint to something that you could actually imagine using.
  • the object system is really nice. There isn't quite compile-time type checking on the values of slots (attributes) of objects, though you can opt in. My general feeling is that I can pretty easily get the feel of writing statically typed code with all of the freedom of writing dynamic code.
  • multiple dispatch, and the conceptual approach to genericism, is amazing and really simplifies flow control in a lot of cases. You implement the methods you need, for the appropriate types/objects and then just write the logic you need, and the function call machinery just does the right thing. There's surprisingly little conditional logic, as a result.

Things I don't like:

  • there are all sorts of things that don't quite have existing libraries, and so I find myself wanting to do things that require more effort than necessary. This project to write a binary encoding tool would have been a component in service of a much larger project. It'd be nice if you could skip some of the lower level components, or didn't have your design choices so constrained by gaps in infrastructure.
  • at the same time, the library ecosystem is pretty fractured, and there are common tools around which there isn't really consensus. Why are there so many half-finished YAML and JSON libraries? There are a bunch of HTTP server (!) implementations, but really, you need 2 and not 5.
  • looping/iteration isn't intuitive, and it's difficult to get common patterns to work. The answer in most cases is to use (map) with lambdas rather than loops, but there's this pitfall where you try to use a (loop) and, really, that's rarely the right answer.
  • implicit returns seem like an oversight; hilariously, Rust also makes this error. Implicit returns also make it quite hard to reason about what type a function or method returns.

Writing an Encoder

So the project I wrote was an attempt to write really object oriented code as a way of writing an object encoder for a JSON-like format. Simple enough: I had a good mental model of the format, and my general approach to doing any kind of binary format processing is to just write a crap ton of unit tests and work somewhat iteratively.
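The testing approach translates to any language; here's a sketch of the kind of round-trip test I mean, in Python for illustration (the inline codec is a stand-in for the library under test):

    import json
    import unittest

    # stand-in codec, purely for illustration; the real library's binary
    # encode/decode functions would replace these
    def encode(value) -> bytes:
        return json.dumps(value).encode("utf-8")

    def decode(data: bytes):
        return json.loads(data.decode("utf-8"))

    class RoundTripTest(unittest.TestCase):
        def test_round_trip(self):
            # the core invariant: decode(encode(x)) == x for supported types
            for value in [None, True, 42, 3.14, "text", [1, 2, 3], {"k": "v"}]:
                with self.subTest(value=value):
                    self.assertEqual(decode(encode(value)), value)

    if __name__ == "__main__":
        unittest.main()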

I had a lot of fun with the project, and it gave me a bunch of key experiences which make me feel comfortable saying that I'm able to write common lisp even if it's not a language that I feel maximally comfortable in (yet?). The experiences that really helped included:

  • producing enough code to really have to think about how packaging and code organization worked. I'd written a function here and there before, but never something where I needed to really understand and use the packaging infrastructure (e.g. systems and libraries.)
  • writing and running lots of tests. I don't always follow test-driven development closely, but writing lots of tests is part of my process, and being able to watch the layers of this code come together was a lot of fun and very instructive.
  • this project, for me, was mostly about API design, and it was nice to have a project that didn't require much design in terms of the actual functionality, as object encoding is pretty straightforward.

From an educational perspective all of my goals were achieved.

Failure Mode

The problem is that it didn't work out, in the final analysis. While the library I constructed was able to encode and decode objects and was internally correct, I never got it to produce encoding that other implementations of the same specification could reliably read, and the ability to read data encoded by other libraries only worked in trivial cases. In the end:

  • this was mostly a project designed to help me gain competence in a programming language I don't really know, and in that I was successful.
  • adding this encoding format isn't particularly useful to any project I'm thinking of working on in the short term, and doesn't directly enable me to do anything in particular.
  • the architecture of the library would not be particularly performant in practice, as the encoding process didn't deploy a buffer pool of any kind; it would have been hard to backfill that in, and I wasn't particularly interested in doing so.
  • it didn't work, and the effort to debug the issue would be more substantial than I'm really interested in undertaking at this point, particularly given the limited utility.

So here we are. Onto a different project!

There's No Full Stack

Software engineers use terms like "backend" and "frontend" to describe areas of expertise and focus, and the thought is that these terms map roughly onto "underlying systems and business logic" and "user interfaces." The assumption is that these are different kinds of work, and that no person can really specialize in "everything."

But it's all about perspective. Software is built in layers: there are frontends and backends at almost every level, so the classification easily breaks down if you look at it too hard. It's also the case that logical features, from the perspective of the product and user, require the efforts of both disciplines. Often development organizations struggle to hand projects off between groups of front-end and back-end teams. [1]

[1]In truth, the problem of coordination between frontend and backend teams is really that it forces a waterfall-like coordination between teams, which is always awkward. The problem isn't that backend engineers can't write frontend code, but that having different teams requires a handoff that is difficult to manage correctly, and processes and management grow up around that handoff.

Backend/frontend is also a poor way to organize work, as it often forces a needless boundary between people and teams working on related projects. Backend work (usually) has to be completed first, and if that slips (or estimation is off) then the frontend work has to happen in a crunch. Even if timing goes well, it's difficult to maintain engineering continuity through the handoff, and context is often lost in the process.

In response to splitting projects and teams into front and backend, engineers have developed this idea of "full stack" engineering. This typically means "integrated front end and backend development." A noble approach: keep the same engineer on the project from start to finish, and avoid an awkward handoff or resetting context halfway through a project. Historic concerns about "front end and backend being in different languages" are reduced both by the advent of back-end javascript, and a realization that programmers often work in multiple languages.

While full stack sounds great, it's a total lie. First, engineers by and large cannot maintain context on all aspects of a system, so boundaries end up appearing in different places. A full stack engineer might end up writing the front end and the APIs on the backend that the front end depends on, but not the application logic that supports the feature. Or an engineer might focus on only a very specific set of features, and not be able to branch out very broadly. Second, specialization is important for allowing engineers to focus and be productive: while context switching projects between engineers has costs, having engineers who must context switch regularly between different disciplines is bad for those engineers. In short, you can't just declare that engineers will be able to do it all.

Some teams and products, particularly larger ones, can get around the issue entirely by dividing ownership and specialization along functional boundaries rather than by engineering discipline, but there can be real technical limitations, and getting a team to move to this kind of ownership model is super difficult. Therefore, I'd propose a different organization, or a way of dividing projects and engineering that avoids both "frontend/backend" and the idea of "full stack":

  • feature or product engineers, that focus on core functionality delivered to users. This includes UI, supporting backend APIs, and core functionality. The users of these teams are the users of the product. These jobs have the best parts of "full stack" type orientation, but draw an effective "lower" boundary of responsibility and allow feature-based specialization.
  • infrastructure or product platform engineers, that focus on deployment, operations and supporting internal APIs. These teams and engineers should see their users as feature and product engineers. These engineers should fall somewhere between "backend engineers," and the "devops" and "sre" -type roles of the last decade, and cover the area "above" systems (e.g. not inclusive of machine management and access provisioning,) and below features.

This framework helps teams scale up as needs and requirements change: feature teams can be divided and parallelized and focused on slices of functionality, while infrastructure teams divide easily into specialties (e.g. networking, storage, databases, internal libraries, queues, etc.) and along service boundaries. Teams are in a better position to handle continuity of projects, and engineers can maintain context and operate using more agile methods. I suspect that, if we look carefully, many organizations and teams have this kind of de facto organization, even if they use different kinds of terminology.

Thoughts?