Values for Collaborative Codebases
After, my post on what I do for work I thought it'd be good to describe the kinds of things that make software easy to work on and collaborative. Here's the list:
- Minimal Documentation. As a former technical writer this is sort of painful, but most programning environments (e.g. languages) have idioms and patterns that you can follow for how to organize code, run tests and build artifacts. it's ok if your project has exceptional requirements that require you to break the rules in some way, but the conventions should be obvious and the justification for rule-breaking should be plain. If you adhere to convention, you don't need as much documentation. It's paradoxical, because better documentation is supposed to facilitate accessibility, but too much documentation is sometimes an indication that things are weird and hard.
- Make it Easy To Run. I'd argue that the most difficult (and frustrating) part of writing software is getting it to run everywhere that you might want it to run. Writing software that runs, even does what you want on your own comptuer is relatively, easy: making it work on someone else's computer is hard. One of the big advantages of developing software that runs as web apps means that you (mostly) get to control (and limit) where the software runs. Making it possible to easily run a piece of software on any computer it might reasonably run (e.g. developer's computers, user's computers and/or production environments.) Your software itself, should be responsible for managing this environment, to the greatest extent possible. This isn't to say that you need to use containers or some such, but having packaging and install scripts that use well understood idioms and tools (e.g. requirements.txt, virtualenv, makefiles, etc.) is good.
- Clear Dependencies. Software is all built upon other pieces of software, and the versions of those libraries are important to record, capture, and recreate. Now it's generally a good idea to update dependencies regularly so you can take advantage of improvements from upstream providers, but unless you regularly test against multiple versions of your dependencies (you don't), and can control all of your developer and production environments totally (you can't), then you should provide a single, centralized way of describing your dependencies. Typically strategies involve: vendoring dependencies, using lockfiles (requirements.txt and similar) or build system integration tends to help organize this aspect of a project.
- Automated Tests. Software requires some kind of testing to ensure that it has the intended behavior, and tests that can run quickly and automatically without requiring users to exercise the software manually is absolutely essential. Tests should run quickly, and it should be possible to run a small group of tests very quickly to support iterative development on a specific area of the code. Indeed, most software development can and should be oriented toward the experience of writing tests and exercising new features with tests above pretty much everything else. The best test suites exercise the code at many levels, ranging from very small unit tests to ensure the correct behavior of the functions and methods, to higher level tests that test the functionality of higher-order functions and methods, and full integration tests that test the entire system.
- Continuous Integration. Continuous integration system's are tools that support development and ensure that changes to a code pass a more extensive range of tests than are readily available to developers. CI systems are also useful for automating other aspects of a project (releases, performance testing, etc.) A well maintained CI environment provide the basis for commonality for larger projects with a larger number for projects larger groups of developers, and is all but required to ensure a well supported automated test system and well managed dependency.
- Files and Directories Make Sense. Code is just text in files, and software is just a lot of code. Different languages and frameworks have different expectations about how code is organized, but you should be able to have a basic understanding of the software and be able to browse the files, and be able to mostly understand what components are and how they relate to each other based on directory names, and should be able to (roughly) understand what's in a file and how the files relate to eachother based on their names. In this respect, shorter files, when possible are nice, are directory structures that have limited depth (wide and shallow,) though there are expections for some programming language.
- Isolate Components and Provide Internal APIs. Often when we talk about
APIs we talk about the interface that users access our software, but larger
systems have the same need for abstractions and interfaces internally that we
expose for (programmatic) users. These APIs have different forms from public
ones (sometimes,) but in general:
- Safe APIs. The APIs should be easy to use and difficult to use incorrectly. This isn't just "make your API thread safe if your users are multithreaded," but also, reduce the possibility for unintended side effects, and avoid calling conventions that are easy to mistake: effects of ordering, positional arguments, larger numbers of arguments, and complicated state management.
- Good API Ergonomics. The ergonomics of an API is somewhat ethereal, but it's clear when an API has good ergonomics: writing code that uses the API feels "native," it's easy to look at calling code and understand what's going on, and errors that make sense and are easy to handle. It's not simply enough for an API to be safe to use, but it should be straightforward and clear.