Principles of Test Oriented Software Development

I want to like test-driven-development (TDD), but realistically it's not something that I ever actually do. Part of this is because TDD, as canonically described is really hard to actually pratice: TDD involves writing tests before writing code, writing tests which must fail before the implementation is complete or correct, and then using the tests to refactor the code. It's a nice idea, and it definitely leads to better test coverage, but the methodology forces you to iterate inefficiently on aspects of a design, and is rarely viable when extending existing code bases. Therefore, I'd like to propose a less-dogmatic alternative: test-oriented-development. [1]

I think, in practice, this largely aligns with the way that people write software, and so test oriented development does not describe a new way of writing code or of writing tests, but rather describes the strategies we use to ensure that the code we write is well tested and testable. Also, I think providing these strategies in a single concrete form will present a reasonable and pragmatic alternative to TDD that will make the aim of "developing more well tested software" more achievable.

  1. Make state explicit. This is good practice for all kinds of development, but generally, don't put data in global variables, and pass as much state (configuration, services, etc.) into functions and classes rather than "magicing" them.
  2. Methods and functions should be functional. I don't tend to think of myself as a functional programmer, as my tendencies are not very ideological, in this regard, but generally having a functional approach simplifies a lot of decisions and makes it easy to test systems at multiple layers.
  3. Most code should be internal and encapsulated. Packages and types with large numbers of exported or public methods should be a warning sign. The best kinds of tests can provide all desired coverage, by testing interfaces themselves,
  4. Write few simple tests and varry the data passed to those tests. This is essentially a form of "table driven testing," where you write a small sequence of simple cases, and run those tests with a variety of tests. Having test infrastructure that allows this kind of flexibility is a great technique.
  5. Begin writing tests as soon as possible. Orthodox TDD suggests that you should start writing tests first, and I think that this is probably one of the reasons that TDD is so hard to adopt. It's also probably the case that orthodox TDD emerged when prototyping was considerably harder than it is today, and as a result TDD just feels like friction, because it's difficult to plan implementations in a test-first sort of way. Having said that, start writing tests as soon as possible.
  6. Experiment in tests. Somehow, I've managed to write a lot of code without working an interactive debugger into my day-to-day routine, which means I do a lot of debugging by reading code, and also by writing tests to try and replicate production phenomena in more isolated phenomena. Writing and running tests in systems is a great way to learn about them.
[1]Sorry that this doesn't lead to a better acronym.

Pattern Fragment 3

This is the follow up to Pattern Fragment 0, Pattern Fragment 1, and Pattern Fragment 2.

Starting at the "end" of the back of the neck, join new yarn, and with a short circular needle pick up stitches around the steek. Be sure to pick up the first/last stitch of the steek. If you did not set aside a stitch at the bottom of the neck, increase one stitch at the bottom point of the neck.

When you've completed picking up stitches and have knit (plain) across the back of the neck, begin knitting in Knit 1 Purl One Ribbing, being sure to mirror left-to-right at the bottom of the knit (i.e. if the last stitch before your "point" or "bottom of the neck stitch" is a knit, then you should knit the stitch right after the point.) Always knit the "point" stitch. Additionally, on this first row you should do a centered double decrease at the point.

My favorite centered double decrease is a "slip 2 together, knit 1, lift the two slipped stitches over the knit stitch."

Continue knitting the collar, in this ribbing, doing one double decrease every other row, for an inch and a half. Bind off normally.

Editor Themes

It's not real secret that I'm red-green colorblind. It's not really a major life obstacle: I've got a wardrobe of clothes in colors that I can easily tell apart, I have developed a number of heuristics for guessing colors that are right enough, and mostly it just creates funny stories where I get a confused look if I try to describe something that might be purple, or have to convince a coworker into reading a graph for me.

One challenge historically, however, has been various kinds of text editing color themes: so often they end up having some kind of low contrast situation that's hard to read, or two different aspects of code that should be highlighted differently but aren't. I've tried lots of themes out, and I would always end up just going back and using default emacs themes, which might not have been great, but I always found it useable.

Until Protesilaos Stavrou's Modus Themes, that is.

These are super compelling and I never really knew how good an editor could look until I started using it. Like is this what it's really like for the rest of you all the time? Things are clear: I never get confused between type names and function names any more. There are rarely situations where I feel like the highlighting color and text color are the same, which used to happen all the time.

The real win, I think, is that Modus' makes dark themes accessible to me, in a way that they never were before. For the most part "dark themes" which have been so popular recently, are just impossible to see clearly (for me), I also find that it's less taxing to spend time in front of screens when darker backgrounds, so being able to spend most of my time doing work in an environment that's easy to read. I tend to keep to light backgrounds when using a laptop and dark backgrounds otherwise.

The second piece is that, I think I've caved in and decided to increase the size of the font, by default in my text editor, at least when I'm using my desktop/external monitor. I think my vision is as good as it's been, though I should probably get that checked out post-pandemic. I think there's a balance between "small fonts let you see more of the file you're working on," and "larger fonts let you focus on the area that you're editing." When I'm writing English, the focus is great, and when writing software I tend to want more context. There's a balance also in wanting to keep an entire line length viable at once, and an ideal words-per-line limit for text that's useful for making things easier to read. So there's some tuning there, depending on what your workload looks like.

I guess if there is any lesson in this it's that: Comfort matters, and you shouldn't push yourself into uncomfortable display situations if you can.

Does Anyone Actually Want Serverless?

Cloud computing, and with it most of tech, has been really hot on the idea of "serverless" computing, which is to say, services and applications that are deployed, provisioned, and priced separate from conventional "server" resources (memory, storage, bandwidth.) The idea is that we can build and expose ways of deploying and running applications and services, even low-level components like "databases" and "function execution", in ways that mean that developers and operators can avoid thinking about computers qua computers.

Serverless is the logical extension of "platform as a service" offerings that have been an oft missed goal for a long time. You write high-level applications and code that is designed to run in some kind of sandbox, with external services provided in some kind of ala carte model via integrations with other products or services. The PaaS, then, can take care of everything else: load balancing incoming requests, redundancy to support higher availability, and any kind of maintains on the lower level infrastructure. Serverless is often just PaaS but more: provide a complete stack of services to satisfy needs (databases, queues, background work, authentication, caching, on top of the runtime,) and then change the pricing model to be based on request/utilization rather than time or resources.

Fundamentally, this allows the separation of concerns between "writing software," and "running software," and allows much if not all of the running of software to be delegated to service providers. This kind of separation is useful for developers, and in general runtime environments seem like the kind of thing that most technical organizations shouldn't need to focus on: outsourcing may actually be good right?

Well maybe.

Let's be clear, serverless platforms primarily benefit the provider of the services for two reasons:

  • serverless models allow providers to build offerings that are multi-tenant, and give provider the ability to reap the benefit of managing request load dynamically and sharing resources between services/clients.
  • utilization pricing for services is always going to be higher than commodity pricing for the underlying components. Running your on servers ("metal") is cheaper than using cloud infrastructure, over time, but capacity planning, redundancy, and management overhead, make that difficult in practice. The proposition is that while serverless may cost more per-unit, it has lower management costs for users (fewer people in "ops" roles,) and is more flexible if request patterns change.

So we know why the industry seems to want serverless to be a thing, but does it actually make sense?

Maybe?

Makers of software strive (or ought to) make their software easy to run, and having very explicit expectations about the runtime environment, make software easier to run. Similarly, being able to write code without needing to manage the runtime, monitoring, logging, while using packaged services for caching storage and databases seems like a great boon.

The downsides to software producers, however, are plentiful:

  • vendor lock-in is real, not just because it places your application at the mercy of an external provider, as they do maintenance, or as their API and platform evolves on their timeframe.
  • hosted systems, mean that it's difficult to do local development and testing: either every developer needs to have their own sandbox (at some expense and management overhead), or you have to maintain a separate runtime environment for development.
  • application's cannot have service levels which exceed the service level agreements of their underlying providers. If your serverless platform has an SLA which is less robust than the SLA of your application you're in trouble.
  • when something breaks, there are few operational remedies available. Upstream timeouts are often not changeable and most forms of manual intervention aren't available.
  • pricing probably only makes sense for organizations operating at either small scale (most organizations, particularly for greenfield projects,) but is never really viable for any kind of scale, and probably doesn't make sense in any situation at scale.
  • some problems and kinds of software just don't work in a serverless model: big data sets that exceed reasonable RAM requirements, data processing problems which aren't easily parallelizable, workloads with long running operations, or workloads that require lower level network or hardware access.
  • most serverless systems will incur some overhead over dedicated/serverfull alternatives and therefore have worse performance/efficiency, and potentially less predictable performance, especially in very high-volume situations.

Where does that leave us?

  • Many applications and bespoke tooling should probably use serverless tools. Particularly if your organization is already committed to a specific cloud ecosystem, this can make a lot of sense.
  • Prototypes, unequivocally make sense to rely on off-the-shelf, potentially serverless tooling, particularly for things like runtimes.
  • If and when you begin to productionize applications, find ways to provide useful abstractions between the deployment system and the application. These kinds of architectural choices help address concerns about lock-in and making it easy to do development work without dependencies.
  • Think seriously about your budget for doing operational work, holistically, if possible, and how you plan to manage serverless components (access, cost control, monitoring and alerting, etc.) in connection with existing infrastructure.

Serverless is interesting, and I think it's interesting to say "what if application development happened in a very controlled environment with a very high level set of APIs." There are clearly a lot of cases where it makes a lot of sense, and then a bunch of situations where it's clearly a suspect call. And it's early days, so we'll see in a few years how things work out. In any case, thinking critically about infrastructure is always a good plan.

The Kubernetes Cloud Mainframe

I made a tongue-in-cheek comment on twitter a while back that, k8s is just the contemporary API for mainframe computing., but as someone who is both very skeptical and very excited about the possibilities of kube, this feels like something I want to expand upon.

A lot of my day-to-day work has some theoretical overlap with kube, including batch processing, service orchestration, and cloud resource allocation. Lots of people I encounter are also really excited by kube, and its interesting to balance that excitement with my understanding of the system, and to watch how Kubernetes(as a platform) impacts the way that we develop applications.

I also want to be clear that my comparison to mainframes is not a disparagement, not only do I think there's a lot of benefit to gain by thinking about the historic precedents of our paradigm. I would also assert that the trends in infrastructure over the last 10 or 15 years (e.g. virtualization, containers, cloud platforms) have focused on bringing mainframe paradigms to a commodity environment.

Observations

  • clusters are static ususally functionally. I know that the public clouds have autoscaling abilities, but having really elastic infrastructure requires substantial additional work, and there are some reasonable upper-limits in terms of numbers of nodes, which makes it hard to actually operate elastically. It's probably also the case that elastic infrastructure has always been (mostly) a pipe dream at most organizations.
  • some things remain quite hard, chiefly in my mind:
    • autoscaling, both of the cluster itself and of the components running within the cluster. Usage patterns are don't always follow easy to detect patterns, so figuring out ways to make infrastructure elastic may take a while to converse or become common. Indeed, VMs and clouds were originally thought to be able to provide some kind of elastic/autoscaling capability, and by and large, most cloud deployments do not autoscale.
    • multi-tenancy, where multiple different kinds of workloads and use-cases run on the same cluster, is very difficult to schedule for reliably or predictably, which leads to a need to overprovision more for mixed workloads.
  • kubernettes does not eliminate the need for an operations team or vendor support for infrastructure or platforms.
  • decentralization has costs, and putting all of the cluster configuration in etcd imposes some limitations, mostly around performance. While I think decentralization is correct, in many ways for Kubernetes, applications developers may need systems that have lower latency and tighter scheduling abilities.
  • The fact that you can add applications to an existing cluster, or host a collection of small applications is mostly a symptom of clusters being over provisioned. This probably isn't bad, and it's almost certainly the case that you can reduce the overprovisioning bias with kube, to some degree.

Impact and Predictions

  • applications developed for kubernettes will eventually become difficult or impossible to imagine or run without kubernettes. This has huge impacts on developer experience and test experience. I'm not sure that this is a problem, but I think it's a hell of a dependency to pick up. This was true of applications that target mainframes as well.
  • Kubernetes will eventually replace vendor specific APIs for cloud infrastructure for most higher level use cases.
  • Kubernetes will primarily be deployed by Cloud providers (RedHat/IBM, Google, AWS, Azure, etc.) rather than by infrastructure teams.
  • Right now, vendors are figuring out what kinds of additional services users and applications need to run in Kubernetes, but eventually there will be another layer of tooling on top of Kubernetes:
    • logging and metrics collection.
    • deployment operations and configuration, particularly around coordinating dependencies.
    • authentication and credential management.
    • low-latency offline task orchestration.
  • At some point, we'll see a move multi-cluster orchestration, or more powerful tools approach to workload isolation within a single cluster.

Conclusion

Kubernetes is great, and it's been cool to see how, really in the last couple of years, it's emerged to really bring together things like cloud infrastructure and container orchestration. At the same time, it (of course!) doesn't solve all of the problems that developers have with their infrastructure, and I'm really excited to see how people build upon Kubernetes to achieve some of those higher level concerns, and make it easier to build software on top of the resulting platforms.

Programming in the Common Lisp Ecosystem

I've been writing more and more Common Lips recently and while I reflected a bunch on the experience in a recent post that I recently followed up on .

Why Ecosystems Matter

Most of my thinking and analysis of CL comes down to the ecosystem: the language has some really compelling (and fun!) features, so the question really comes down to the ecosystem. There are two main reasons to care about ecosystems in programming languages:

  • a vibrant ecosystem cuts down the time that an individual developer or team has to spend doing infrastructural work, to get started. Ecosystems provide everything from libraries for common tasks as well as conventions and established patterns for the big fundamental application choices, not to mention things like easily discoverable answers to common problems.

    The more time between "I have an idea" to "I have running (proof-of-concept quality) code running," matters so much. Everything is possible to a point, but the more friction between "idea" and "working prototype" can be a big problem.

  • a bigger and more vibrant ecosystem makes it more tenable for companies/sponsors (of all sizes) to choose to use Common Lisp for various projects, and there's a little bit of a chicken and egg problem here, admittedly. Companies and sponsors want to be confidence that they'll be able to efficiently replace engineers if needed, integrate or lisp components into larger ecosystems, or be able to get support problems. These are all kind of intangible (and reasonable!) and the larger and more vibrant the ecosystem the less risk there is.

    In many ways, recent developments in technology more broadly make lisp slightly more viable, as a result of making it easier to build applications that use multiple languages and tools. Things like microservices, better generic deployment orchestration tools, greater adoption of IDLs (including swagger, thrift and GRPC,) all make language choice less monolithic at the organization level.

Great Things

I've really enjoyed working with a few projects and tools. I'll probably write more about these individually in the near future, but in brief:

  • chanl provides. As a current/recovering Go programmer, this library is very familiar and great to have. In some ways, the API provides a bit more introspection, and flexibility that I've always wanted in Go.
  • lake is a buildsystem tool, in the tradition of make, but with a few additional great features, like target namespacing, a clear definition between "file targets" and "task targets," as well as support for SSH operations, which makes it a reasonable replacement for things like fabric, and other basic deployment tools.
  • cl-docutils provides the basis for a document processing system. I'm particularly partial because I've been using the python (reference) implementation for years, but the implementation is really quite good and quite easy to extend.
  • roswell is really great for getting started with CL, and also for making it possible to test library code against different implementations and versions of the language. I'm a touch iffy on using it to install packages into it's own directory, but it's pretty great.
  • ASDF is the "buildsystem" component of CL, comparable to setuptools in python, and it (particularly the latest versions,) is really great. I like the ability to produce binaries directly from asdf, and the "package-inferred" is a great addition (basically, giving python-style automatic package discovery.)
  • There's a full Apache Thrift implementation. While I'm not presently working on anything that would require a legit RPC protocol, being able to integrate CL components into larger ecosystem, having the option is useful.
  • Hunchensocket adds websockets! Web sockets are a weird little corner of any stack, but it's nice to be able to have the option of being able to do this kind of programming. Also CL seems like a really good platform to do
  • make-hash makes constructing hash tables easier, which is sort of needlessly gawky otherwise.
  • ceramic provides bridges between CL and Electron for delivering desktop applications based on web technologies in CL.

I kept thinking that there wouldn't be good examples of various things, (there's a Kafka driver! there's support for various other Apache ecosystem components,) but there are, and that's great. There's gaps, of course, but fewer, I think, than you'd expect.

The Dark Underbelly

The biggest problem in CL is probably discoverability: lots of folks are building great tools and it's hard to really know about the projects.

I thought about phrasing this as a kind of list of things that would be good for supporting bounties or something of the like. Also if I've missed something, please let me know! I've tried to look for a lot of things, but discovery is hard.

Quibbles

  • rove doesn't seem to work when multi-threaded results effectively. It's listed in the readme, but I was able to write really trivial tests that crashed the test harness.
  • Chanl would be super lovely with some kind of concept of cancellation (like contexts in Go,) and while it's nice to have a bit more thread introspection, given that the threads are somewhat heavier weight, being able to avoid resource leaks seems like a good plan.
  • There doesn't seem to be any library capable of producing YAML formated data. I don't have a specific need, but it'd be nice.
  • it would be nice to have some way of configuring the quicklisp client to be able to prefer quicklisp (stable) but also using ultralisp (or another source) if that's available.
  • Putting the capacity in asdf to produce binaries easily is great, and the only thing missing from buildapp/cl-launch is multi-entry binaries. That'd be swell. It might also be easier as an alternative to have support for some git-style sub-commands in a commandline parser (which doesn't easily exist at the moment'), but one-command-per-binary, seems difficult to manage.
  • there are no available implementations of a multi-reader single-writer mutex, which seems like an oversite, and yet, here we are.

Bigger Projects

  • There are no encoders/decoders for data formats like Apache Parquet, and the protocol buffers implementation don't support proto3. Neither of these are particular deal breakers, but having good tools dealing with common developments, lowers to cost and risk of using CL in more applications.
  • No support for http/2 and therefore gRPC. Having the ability to write software in CL with the knowledge that it'll be able to integrate with other components, is good for the ecosystem.
  • There is no great modern MongoDB driver. There were a couple of early implementations, but there are important changes to the MongoDB protocol. A clearer interface for producing BSON might be useful too.
  • I've looked for libraries and tools to integrate and manage aspects of things like systemd, docker, and k8s. k8s seems easiest to close, as things like cube can be generated from updated swagger definitions, but there's less for the others.
  • Application delievery remains a bit of an open. I'm particularly interested in being able to produce binaries that target other platforms/systems (cross compilation,) but there are a class of problems related to being able to ship tools once built.
  • I'm eagerly waiting and concerned about the plight of the current implementations around the move of ARM to Darwin, in the intermediate term. My sense is that the transition won't be super difficult, but it seems like a thing.

Pattern Fragment 2

This is the follow up to Pattern Fragment 0 and Pattern Fragment 1.

After the yoke decreases, in addition to the steeks, there should be 196 stitches in total, or 98 stitches on the front and back of the neck.

Knit the yoke section plain, until it is--in total--3 inches deep. On the front of the sweater, knit 49 stitches (half), cast on 10 steek stitches, and continue knitting round marking the stitches. Knit the next round plan, and then decrease one stitch on either side of the steek, every other round, 21 or 22 times to shape the neck (42 or 44 rounds). Knit plain from here to the end of the sweater. After the first 2-3 inches of decreases, you may choose to space out the decreases more for a gradual slope, though I wouldn't.

Meanwhile, when the yoke is 7.5 inches deep, set aside at least 26 stitches in the middle of the back for back-of-neck-shaping, cast on a 10 stitch steak, and then decrease on alternating sides of the steek over the next inch and a half, until the number of stitches decreased at the front is exactly equal to the number of stitches decreased at the back.

When the yoke is 9 inches deep, in the last round bind off the middle two stitch of both of the armhole steeks, ending with knitting across the back one last time. Turn the work inside out and using a three-needle bind off, join and bind off the shoulders.

Knitting off the Cone

I have, for a long time, done rather a lot of knitting from yarn directly off of cones, which is maybe a bit weird or at least uncommon, so I thought I'd elaborate a bit more:

  • Theoretically a cone of yarn, which often contains at least 250 grams or more of yarn, has fewer breaks in it than you'd have with an equivalent weight of yarn packaged in skeins or balls. This isn't always true, as cones of yarn do have breaks, sometimes, but if you have a construction that doesn't require you break the yarn very much you can probably save a lot of weaving in by knitting off of a cone.
  • Cones of yarn are often not quite ready for use: most often the yarn hasn't received its final wash, which often means that the spinning oil is still in the wool. This is potentially only true for yarn that's undyed or dyed before being spun, and not the case for yarn that's dyed after being spun. It's also likely the case that the yarn will be wound onto the cone slightly tighter than it would be otherwise. The effect is that the yarn will be a bit limp relative to it's final state. The color can also change a bit. You can knit with the unwashed yarn, but know that the final product will require a bit more washing, and the texture can change.
  • Typically the kind of yarn that's available on cones is boring, which is to say that there are less varieties in general but also of different colors. I think this is actually a great thing: knitting in more plain colors and simple smooth yarns draws attention to the knitting itself, which is often my goal.
  • Cones of yarn feel like buying yarn in bulk, and buying yarn by the pound or kilo (!) means that you can really get a feel for the yarn and it's behavior and knit with it for more than one project. Make a few sweaters, or many pairs of socks. See what happens!
  • Because yarn on cones is often used as a method of distributing undyed yarn to dyers in bulk, you can select materials on the basis of fiber content in a way that can be difficult when you also have to balance color considerations.

The clear solution to this problem is, of course, to wind the yarn off the cone into a hank (typically using a niddy nody or similar,) avoid tying the yarn too tightly, soak and wash the yarn gently with wool wash, and then hang it up to dry, and then wind it back into balls. I never do this. I should, but realistically I never do.