Staff Engineering

In August of 2019 I became a Staff Engineer, which is what a lot of companies are calling their "level above Senior Engineer" role these days. Engineering leveling is a weird beast, which probably a post onto itself. Despite my odd entry into a career in tech, my path in the last 4 or 5 years has been pretty conventional; however, somehow, despite having an increasingly normal career trajectory, explaining what I do on a day to day basis has not gotten easier.

Staff Engineers are important for scaling engineering teams, but lots of teams get by with out them, and unlike more junior engineers who have broadly similar job roles, there are a lot of different ways to be a Staff Engineer, which only muddies things. This post is a reflection on some key aspects of my experience organized in to topics that I hope will be useful for people who may be interested in becoming staff engineers or managing such a person. If you're also a Staff Engineer and your experience is different, I wouldn't be particularly surprised.

Staff Engineers Help Teams Build Great Software

Lots of teams function just fine without Staff Engineers and teams can build great products without having contributors in Staff-type roles. Indeed, because Staff Engineers vary a lot, the utility of having more senior individual contributors on a team depends a lot of the specific engineer and the team in question: finding a good fit is even harder than usual. In general, having Senior Technical leadership can help teams by:

  • giving people managers more space and time to focus on the team organization, processes, and people. Particularly in small organizations, team managers often pick up technical leadership.
  • providing connections and collaborations between groups and efforts. While almost all senior engineers have a "home team" and are directly involved in a few specific projects, they also tend to have broader scope, and so can help coordinate efforts between different projects and groups.
  • increasing the parallelism of teams, and can provide the kind of infrastructure that allows a team to persue multiple streams of development at one time.
  • supporting the career path and growth of more junior engineers, both as a result of direct mentoring, but also by enabling the team to be more successful by having more technical leadership capacity creates opportunities for growth for everyone on the team.

Staff Promotions Reflect Organizational Capacity

In addition to experience and a history of effusiveness, like other promotions, getting promoted to Staff Engineer is less straight forward than other promotions. This is in part because the ways we think about leveling and job roles (i.e. to describe the professional activities and capabilities along several dimensions for each level,) become complicated when there are lots of different ways to be a Staff Engineer. Pragmatically, these kind of promotions often depend on other factors:

  • the existence of other Staff Engineers in the organization make it more likely that there's an easy comparison for a candidate.
  • past experience of managers getting Staff+ promotions for engineers. Enginering Managers without this kind of experience may have difficulty creating the kinds of opportunities within their organizations and for advocating these kinds of promotions.
  • organizational maturity and breadth to support the workload of a Staff Engineer: there are ways to partition teams and organizations that preclude some of the kinds of higher level concerns that justify having Staff Engineers, and while having senior technical leadership is often useful, if the organization can't support it, it won't happen.
  • teams with a sizable population of more junior engineers, particularly where the team is growing, will have more opportunity and need for Staff Engineers. Teams that are on the balance more senior, or are small and relatively static tend to have less opportunity for the kind of broadly synthetic work that tends to lead to Staff promotions.

There are also, of course, some kinds of technical achievements and professional characteristics that Staff Engineers often have, and I'm not saying that anyone in the right organizational context can be promoted, exactly. However, without the right kind of organizational support and context, even the most exceptional engineers will never be promoted.

Staff Promotions are Harder to Get Than Equivalent Management Promotions

In many organizations its true that Staff promotions are often much harder to get than equivalent promotions to peer-level management postions: the organizational contexts required to support the promotion of Engineers into management roles are much easier to create, particularly as organizations grow. As you hire more engineers you need more Engineering Managers. There are other factors:

  • managers control promotions, and it's easier for them to recapitulate their own career paths in their reports than to think about the Staff role, and so more Engineers tend to be pushed towards management than Senior IC roles. It's also probably that meta-managers benefit organizationally from having more front-line managers in their organizations than more senior ICs, which exacerbates this bias.
  • from an output perspective, Senior Engineers can write the code that Staff Engineers would otherwise write, in a way that Engineering Management tends to be difficult to avoid or do without. In other terms, management promotions are often more critical from the organization's perspective and therefore prioritized over Staff promotions, particularly during growth.
  • cost. Staff Engineers are expensive, often more expensive than managers particularly at the bottom of the brackets, and it's difficult to imagine that the timing of Staff promotions are not impacted by budgetary requirements.

Promoting a Staff Engineer is Easier than Hiring One

Because there are many valid ways to do the Staff job, and so much of the job is about leveraging context and building broader connections between different projects, people with more organizational experience and history often have an advantage over fresh industry hires. In general:

  • Success as a Staff Engineer in one organization does not necessarily translate to success at another.
  • The conventions within the process for industry hiring, are good at selecting junior engineers, and there are fewer conventions for more Senior roles, which means that candidates are not assessed for skills and experiences that are relevant to their day-to-day while also being penalized for (often) being unexceptional at the kind of problems that junior engineering interviews focus on. While interview processes are imperfect assessment tools in all cases, they're particularly bad at more senior levels.
  • Senior engineering contributors have a the potential to have huge impact on product development, engineering outcomes, all of which requires a bunch of trust on the part of the organization, and that kind of trust is often easier to build with someone who already has organizational experience

This isn't to say that it's impossible to hire Staff engineers, I'm just deeply dubious of the hiring process for these kinds of roles having both interviewed for these kinds of roles and also interviewed candidates for them. I've also watched more than one senior contributor not really get along well with a team or other leadership after being hired externally, and for reasons that end up making sense in retrospect. It's really hard.

Staff Engineers Don't Not Manage

Most companies have a clear distinction between the career trajectories of people involved in "management" and senior "individual contributor" roles (like Staff Engineers,) with managers involved in leadership for teams and humans, with ICs involved in technical aspects. This seems really clear on paper but incredibly messy in practice. The decisions that managers make about team organization and prioritization have necessary technical implications; while it's difficult to organize larger scale technical initiatives without awareness of the people and teams. Sometimes Staff Engineers end up doing actual management on a temporary basis in order to fill gaps as organizations change or cover parental leave

It's also the case that a huge part of the job for many Staff Engineer's involves direct mentorship of junior engineers, which can involve leading specific projects, conversations about career trajectories and growth, as well as conversations about specific technical topics. This has a lot of overlap with management, and that's fine. The major differences is that senior contributors share responsibility for the people they mentor with their actual managers, and tend to focus mentoring on smaller groups of contributors.

Staff Engineers aren't (or shouldn't be!) managers, even when they are involved in broader leadership work, even if the specific engineer is capable of doing management work: putting ICs in management roles, takes time away from their (likely more valuable) technical projects.

Staff Engineers Write Boring and Tedious But Easy Code

While this is perhaps not a universal view, I feel pretty safe in suggesting that Staff Engineers should be directly involved in development projects. While there are lots of ways to be involved in development: technical design, architecture, reviewing code and documents, project planning and development, and so fort, I think it's really important that Staff Engineers be involved with code-writing, and similar activies. This makes it easy to stay grounded and relevant, and also makes it possible to do a better job at all of the other kinds of engineering work.

Having said that, it's almost inevitable that the kinds of contribution to the code that you make as a Staff Engineer are not the same kinds of contributions that you make at other points in your career. Your attention is probably pulled in different directions. Where a junior engineer can spend most of their day focusing on a few projects and writing code, Staff Engineers:

  • consult with other teams.
  • mentor other engineers.
  • build long and medium term plans for teams and products.
  • breaking larger projects apart and designing APIs between components.

All of this "other engineering work" takes time, and the broader portfolio of concerns means that more junior engineers often have more time and attention to focus on specific programming tasks. The result is that the kind of code you end up writing tends to be different:

  • fixing problems and bugs in systems that require a lot of context. The bugs are often not very complicated themselves, but require understanding the implication of one component with regards to other components, which can make them difficult.
  • projects to enable future development work, including building infrastructure or designing an interface and connecting an existing implementation to that interface ahead of some larger effort. This kind of "refactor things to make it possible to write a new implementation."
  • writing small isolated components to support broader initiatives, such as exposing existing information via new APIs, and building libraries to facilitate connections between different projects or components.
  • projects that support the work of the team as a whole: tools, build and deployment systems, code clean up, performance tuning, test infrastructure, and so forth.

These kinds of projects can amount to rather a lot of development work, but they definitely have their own flavor. As I approached Staff and certainly since, the kind of projects I had attention for definitely shifted. I actually like this kind of work rather a lot, so that's been quite good for me, but the change is real.

There's definitely a temptation to give Staff Engineers big projects that they can go off and work on alone, and I've seen lots of teams and engineers attempt this: sometimes these projects work out, though more often the successes feel like an exception. There's no "right kind" of way to write software as a Staff Engineer, sometimes senior engineer's get to work on bigger "core projects." Having said that, if a Staff Engineer is working on the "other engineering" aspects of the job, there's just limited time to do big development projects in a reasonable time frame.

A Common Failure

I've been intermittently working on a common lisp library to produce a binary encoding of arbitrary objects, and I think I'm going to be abandoning the project. This is an explanation of that decision and an reflection on my experience.

Why Common Lisp?

First, some background. I've always thought that Common Lisp is a language with a bunch of cool features and selling points, but unsurprisingly, I've never really had the experience of writing more than some one-off bits of code in CL, which isn't surprising. This project was a good experience for really digging into writing and managing a conceptually larger project which was a good kick in the pants to learn more.

The things I like:

  • the implementations of the core runtime are really robust and high quality, and make it possible to imagine running your code in a bunch of different contexts. Even though it's a language with relatively few users, it feels robust in a way. The most common implementations also have ways of producing fully self contained static binaries (like Go, say), which makes the thought of distributing software seem reasonable.
  • quicklisp, a package/library management tool is new (in the last decade or so,) has really raised the level of the ecosystem. It's not as complete as I'd expect in many ways, but quicklisp changed CL from something quaint to something that you could actually imagine using.
  • the object system is really nice. There isn't quite compile time-type checking on the values of slots (attributes) of objects, though you can opt in. My general feeling is that I can pretty easily get the feeling of writing statically typed code with all of the freedom of writing dynamic code.
  • multiple dispatch, and the conceptual approach to genericism, is amazing and really simplifies flow control in a lot of cases. You implement the methods you need, for the appropriate types/objects and then just write the logic you need, and the function call machinery just does the right thing. There's surprisingly little conditional logic, as a result.

Things I don't like:

  • there are all sorts of things that don't quite have existing libraries, and so I find myself wanting to do things that require more effort than necessary. This project to write a binary encoding tool would have been a component in service of a much larger project. It'd be nice if you could skip some of the lower level components, or didn't have your design choices so constrained by gaps in infrastructure.
  • at the same time, the library ecosystem is pretty fractured and there are common tools around which there aren't really consensus. Why are there so many half-finished YAML and JSON libraries? There are a bunch of HTTP server (!) implementations, but really you need 2 and not 5?
  • looping/iteration isn't intuitive and difficult to get common patterns to work. The answer, in most cases is to use (map) with lambdas rather than loops, in most cases, but there's this pitfall where you try and use a (loop) and really, that's rarely the right answer.
  • implicit returns seem like an over sight, hilariously, Rust also makes this error. Implicit returns also make knowing what type a function or method returns quite hard to reason about.

Writing an Encoder

So the project I wrote was an attempt to write really object oriented code as a way to writing an object encoder to a JSON-like format. Simple enough, I had a good mental model of the format, and my general approach to doing any kind of binary format processing is to just write a crap ton of unit tests and work somewhat iteratively.

I had a lot of fun with the project, and it gave me a bunch of key experiences which make me feel comfortable saying that I'm able to write common lisp even if it's not a language that I feel maximally comfortable in (yet?). The experiences that really helped included:

  • producing enough code to really have to think about how packaging and code organization worked. I'd written a function here and there, before, but never something where I needed to really understand and use the library/module/packaging (e.g. systems and libraries.) infrastructure.
  • writing and running lots of tests. I don't always follow test-driven development closely, but writing lots of tests is part of my process, and being able to watch the layers of this code come together was a lot of fun and very instructive.
  • this project for me, was mostly about API design and it was nice to have a project that didn't require much design in terms of the actual functionality, as object encoding is pretty straight forward.

From an educational perspective all of my goals were achieved.

Failure Mode

The problem is that it didn't work out, in the final analysis. While the library I constructed was able to encode and decode objects and was internally correct, I never got it to produce encoding that other implementations of the same specification could reliably read, and the ability to read data encoded by other libraries only worked in trivial cases. In the end:

  • this was mostly a project designed to help me gain competence in a programming language I don't really know, and in that I was successful.
  • adding this encoding format isn't particularly useful to any project I'm thinking of working on in the short term, and doesn't directly enable me to do anything in particular.
  • the architecture of the library would not be particularly performant in practice, as the encoding process didn't deploy a buffer pool of any kind, and it would have been harder than not to back fill that in, and I wasn't particularly interested in that.
  • it didn't work, and the effort to debug the issue would be more substantive than I'm really interested in doing at this point, particularly given the limited utility.

So here we are. Onto a different project!

There's No Full Stack

Software engineers use terms like "backend" and "frontend" to describe areas of expertice and focus, and the thought is that these terms map roughly onto "underlying systems and business logic" and "user interfaces." The thought, is that these are different kinds of work and no person can really specialize on "everything."

But it's all about perspective. Software is build in layers: there are frontends and backends at almost every level, so the classification easily breaks down if you look at it too hard. It's also the case that that logical features, from the perspective of the product and user, require the efforts of both disciplines. Often development organizations struggle to hand projects off between groups of front-end and back-end teams. [1]

[1]In truth this problem of coordination between frontend and backend teams is really that it forces a waterfall-like coordination between teams, which is always awkward. The problem isn't that backend engineers can't write frontend code, but that having different teams requires a handoff that is difficult to manage correctly, and around that handoff processes and management happens.

Backend/Frontend is also a poor way to organize work, as often it forces a needless boundary between people and teams wokring on related projects. Backend work (ususally) has to be completed first, and if that slips (or estimation is off) then the front end work has to happen in a crunch. Even if timing goes well, it's difficult to maintain engineering continuity through the handoff and context is often lost in the process.

In response to splitting projects and teams into front and backend, engineers have developed this idea of "full stack" engineering. This typically means "integrated front end and backend development." A noble approach: keep the same engineer on the project from start to finish, and avoid an awkward handoff or resetting context halfway through a project. Historic concerns about "front end and backend being in different languages" are reduced both by the advent of back-end javascript, and a realization that programmers often work in multiple languages.

While full stack sounds great, it's a total lie. First, engineers by and large cannot maintain context on all aspects of a system, so boundaries end up appearing in different places. A full stack engineer might end up writing front end and the APIs on the backed that the front end depends on, but not the application logic that supports the feature. Or an engineer might focus only a very specific set of features, but not be able to branch out very broadly. Second, specialization is important for allowing engineers to focus and be productive, and while context switching projects between engineers, having engineers that must context switch regularly between different disciplines is bad for those engineers. In short you can't just declare that engineers will be able to do it all.'

Some, particularly larger, teams and prodcuts can get around the issue entirely by dividing ownership and specialization along functional boundaries rather than by engineering discipline, but there can be real technical limitations, and getting a team to move to this kind of ownership model is super difficult. Therefore, I'd propose a different organization or a way of dividing projects and engineering that avoids both "frontend/backend" as well as the idea of "full stack":

  • feature or product engineers, that focus on core functionality delivered to users. This includes UI, supporting backend APIs, and core functionality. The users of these teams are the users of the product. These jobs have the best parts of "full stack" type orientation, but draw an effective "lower" boundary of responsibility and allow feature-based specialization.
  • infrastructure or product platform engineers, that focus on deployment, operations and supporting internal APIs. These teams and engineers should see their users as feature and product engineers. These engineers should fall somewhere between "backend engineers," and the "devops" and "sre" -type roles of the last decade, and cover the area "above" systems (e.g. not inclusive of machine management and access provisioning,) and below features.

This framework helps teams scale up as needs and requirements change: Feature teams can be divided and parallelized and focus in functionality slices, while, infrastructure teams divide easily into specialties (e.g. networking, storage, databases, internal libraries, queues, etc.) and along service boundaries. Teams are in a better position to handle continuity of projects, and engineers can maintain context and operate using more agile methods. I suspect that, if we look carefully, many organizations and teams have this kind of de facto organization, even if they use different kind of terminology.

Thoughts?

What is it That You Do?

The longer that I have this job, the more difficult it is to explain what I do. I say, "I'm a programmer," and you'd think that I write code all day, but that doesn't map onto what my days look like, and the longer it seems the less code I actually end up writing. I think the complexity of this seemingly simple question grows from the fact that building software involves a lot more than writing code, particularly as projects become more complex.

I'd venture to say that most code is written and maintained by one person, and typically used by a very small number of pepole (often on behalf of many more people,) though this is difficult to quantify. Single maintainer software is still software, and there are lots of interesting problems, but as much as anything else I'm interested in the problems adjacent to multi-author code-bases and multi-operator software development. [1]

Fundamentally, I'm interested in the following questions:

  • How can (sometimes larger) groups of people collaborate to build something that's bigger than the scope of any of their work?
  • How can we build software in a way that lets individual developers focus most of the time on the features and concerns that are the most important to them and their users. [2]

The software development process, regardless of the scope of the problem, has a number of different aspects:

  • Operations: How does is this software execute and how do we know that its successful when it runs?
  • Behavior: What does it do, and how do we ensure it has the correct behavior?
  • Interface: How will users interact with the process, and how do we ensure a consistent experience across versions and users' environment?
  • Product: Who are the users? What features do they want? Which features are the most important?

Sometimes we can address these questions by writing code, but often there's a lot of talking to users, other developers, and other people who work in software development organizations (e.g. product managers, support, etc.) not to mention writing a lot of English (documentation, specs, and the like.)

I still don't think that I've successfully answered the framing question, except to paint a large picture of what kinds of work goes into making software, and described some of my specific domain interests. This ends up boiling down to:

  • I write a lot of documents describing new features and improvements to our software. [product]
  • I think a lot about how our product works as it grows (scaling), and what kinds of changes we can make now to make that process more smooth. [operations]
  • How can I help the more junior members of my team focus on the aspects of their jobs that they enjoy the most, and help illustrate broader contexts to them. [mentoring]
  • How can we take the problems we're solving today and build the solution that balances the immediate requirements with longer term maintainability and reuse. [operations/infrastructure]

The actual "what" I'm spending my time boils down to reading a bunch of code, meeting with my teamates, meeting with users (who are also coworkers.) And sometimes writing code. If I'm lucky.

[1]I think the single-author and/or single-operator class is super interesting and valuable, particularly because it includes a lot of software outside of the conventional disciplinary boundaries of software and includes things like macros, spreadsheets, small scale database, and IT/operations ("scripting") work.
[2]It's very easy to spend most of your time as a developer writing infrastructure code of some sort, to address either internal concerns (logging, data management and modeling, integrating with services) or project/process automation (build, test, operations) concerns. Infrastructure isn't bad, but it isn't the same as working on product features.

Leadership in Distributed Social Networks

Let us understand "social networks," to mean networks of people interacting in a common group or substrate rather than the phenomena that exists on a certain class of websites (like Facebook): we can think of this as the "conventional sense" of the term.

As I watch Anonymous, the 'hacktivist" group, to say nothing of the movements in Egypt and Tunisia, I'm fascinated by the way that such an ad hoc group can appear to be organized and coherent on the outside without appearing to have real leadership. External appearance and reality is different, of course, and I'm not drawing a direct parallel between what Anonymous is doing and what happened in Egypt, but there is a parallel. I think we're living in a very interesting moment. New modes of political and social organizing do not manifest themselves often.

We still have a lot to learn about what's happened recently in Egypt/Tunisia/Libya and just as much to learn about Anonymous. In a matter of months or years, we could very easily look back on this post and laugh at its naivete. Nevertheless, at least for the moment, there are a couple of big things that I think are interesting and important to think about:

  • We have movements that are lead, effectively, by groups rather than individuals, and if individuals are actually doing leadership work, they are not taking credit for that work. So movements that are not lead by egos.
  • These are movements that are obviously technologically very aware, but not in a mainstream sort of way. Anonymous uses (small?) IRC networks and other collaborative tools that aren't quite mainstream yet. The Egyptian protesters in the very beginning had UStream feeds of Tahrir Square, and I'd love to know how they were handling for internal coordination and communication.
  • I think the way that these movements "do ideology," is a bit unique and non conventional. I usually think of ideology as being a guiding strategy from which practice springs. I'm not sure that's what's happening here.
  • The member activists, who are doing the work in these movements are not professional politicians or political workers.

The more I ponder this, the more I realize how improbable these organizations are and the more I am impressed by the ability of these groups to be so effective. In addition to all of my other questions, I'm left wondering: how will this kind of leadership (or non-leadership) method and style influence other kinds of movements and projects.