The Kubernetes Cloud Mainframe

I made a tongue-in-cheek comment on Twitter a while back that k8s is just the contemporary API for mainframe computing. As someone who is both very skeptical of and very excited about the possibilities of kube, this feels like something I want to expand upon.

A lot of my day-to-day work has some theoretical overlap with kube, including batch processing, service orchestration, and cloud resource allocation. Lots of people I encounter are really excited by kube, and it's interesting to balance that excitement with my understanding of the system, and to watch how Kubernetes (as a platform) impacts the way that we develop applications.

I also want to be clear that my comparison to mainframes is not a disparagement: not only do I think there's a lot of benefit to be gained by thinking about the historical precedents of our current paradigm, I would also assert that the trends in infrastructure over the last 10 or 15 years (e.g. virtualization, containers, cloud platforms) have focused on bringing mainframe paradigms to commodity environments.

Observations

  • clusters are, functionally, usually static. I know that the public clouds have autoscaling capabilities, but truly elastic infrastructure requires substantial additional work, and there are reasonable upper limits on the number of nodes, which makes it hard to actually operate elastically. It's probably also the case that elastic infrastructure has always been (mostly) a pipe dream at most organizations.
  • some things remain quite hard, chiefly in my mind:
    • autoscaling, both of the cluster itself and of the components running within the cluster. Usage patterns don't always follow easy-to-detect cycles, so figuring out ways to make infrastructure elastic may take a while to converge or become common (see the sketch after this list). Indeed, VMs and clouds were originally thought to provide some kind of elastic/autoscaling capability, and by and large, most cloud deployments do not autoscale.
    • multi-tenancy, where multiple different kinds of workloads and use cases run on the same cluster, is very difficult to schedule for reliably or predictably, which leads to a need to overprovision for mixed workloads.
  • Kubernetes does not eliminate the need for an operations team or for vendor support for infrastructure or platforms.
  • decentralization has costs, and putting all of the cluster configuration in etcd imposes some limitations, mostly around performance. While I think decentralization is, in many ways, correct for Kubernetes, application developers may need systems with lower latency and tighter scheduling abilities.
  • the fact that you can add applications to an existing cluster, or host a collection of small applications, is mostly a symptom of clusters being overprovisioned. This probably isn't bad, and it's almost certainly the case that you can reduce the overprovisioning bias with kube, to some degree.
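To make the autoscaling observation concrete: here is a minimal sketch, assuming the official kubernetes Python client and a kubeconfig at the default location, that reads the cluster size and the state of any HorizontalPodAutoscalers. The point is that the knobs exist, but nothing here decides when to scale for you.

    # A minimal sketch, assuming the official `kubernetes` Python client and
    # a working kubeconfig; it only reads state, it decides nothing.
    from kubernetes import client, config

    config.load_kube_config()  # reads ~/.kube/config

    core = client.CoreV1Api()
    print(f"cluster size: {len(core.list_node().items)} nodes")  # usually static

    autoscaling = client.AutoscalingV1Api()
    for hpa in autoscaling.list_horizontal_pod_autoscaler_for_all_namespaces().items:
        # an HPA declares bounds, but the workload still has to fit the cluster
        print(hpa.metadata.namespace, hpa.metadata.name,
              hpa.spec.min_replicas, hpa.spec.max_replicas,
              hpa.status.current_replicas)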

Impact and Predictions

  • applications developed for Kubernetes will eventually become difficult or impossible to imagine or run without Kubernetes. This has huge impacts on developer experience and testing experience. I'm not sure that this is a problem, but I think it's a hell of a dependency to pick up. This was true of applications that targeted mainframes as well.
  • Kubernetes will eventually replace vendor specific APIs for cloud infrastructure for most higher level use cases.
  • Kubernetes will primarily be deployed by Cloud providers (RedHat/IBM, Google, AWS, Azure, etc.) rather than by infrastructure teams.
  • Right now, vendors are figuring out what kinds of additional services users and applications need to run in Kubernetes, but eventually there will be another layer of tooling on top of Kubernetes:
    • logging and metrics collection.
    • deployment operations and configuration, particularly around coordinating dependencies.
    • authentication and credential management.
    • low-latency offline task orchestration.
  • At some point, we'll see a move toward multi-cluster orchestration, or more powerful approaches to workload isolation within a single cluster.

Conclusion

Kubernetes is great, and it's been cool to see how, really in the last couple of years, it's emerged to bring together things like cloud infrastructure and container orchestration. At the same time, it (of course!) doesn't solve all of the problems that developers have with their infrastructure, and I'm excited to see how people build upon Kubernetes to address some of those higher-level concerns, and to make it easier to build software on top of the resulting platforms.

Xen and KVM: Failing Differently Together

When I bought what is now my primary laptop, I had intended to use the extra flexibility to learn the prevailing (industrial-grade) virtualization technology. While that project would have been edifying on its own, I also hoped to use the extra flexibility to do some more consistent testing and development work.

This spawned a Xen laptop project, but the truth is that Xen is incredibly difficult to get working, and eventually the "new laptop" just became the "every day laptop," and I let go of the laptop Xen project. In fact, until very recently I'd pretty much given up on doing virtualization things entirely, but for various reasons beyond the scope of this post I've been inspired to begin tinkering with virtualization solutions again.

As a matter of course, I found myself trying KVM in a serious way for the first time. This experience both generated a new list of annoyances and reminded me about all the things I didn't like about Xen. I've collected these annoyances and thoughts into the following post. I hope that these thoughts will be helpful for people thinking about virtualization pragmatically, and will also help identify some of the larger pain points with the current solutions.

Xen Hardships: It's all about the Kernel

Xen is, without a doubt, the more elegant solution from a design perspective, and it has a history of being the more robust and usable tool. Performance is great, and Xen hosts can have uptimes in excess of a year or two.

The problem is that dom0 support has, for the past 2-3 years, been in shambles, and the situation isn't improving very rapidly. For years, the only way to run a Xen box was to use an ancient kernel with a frightening set of patches, or a more recent kernel with ancient patches forward-ported. Or you could use cutting-edge kernel builds with reasonably unstable Xen support.

A mess, in other words.

Now that Debian Squeeze (6.0) has a pv-ops dom0 kernel, things might look up, but other than that kernel (which I've not had any success with, though that may be me), basically the only way to run Xen is to pay Citrix [1] or build your own kernel from scratch. Again, results will be mixed (particularly given the non-existent documentation), maintenance costs are high, and a lot of energy will be duplicated.

What to do? Write documentation and work with the distributions so that if someone says "I want to try using Xen," they'll be able to get something that works.

KVM Struggles: It's all about the User Experience

The great thing about KVM is that it just works. "sudo modprobe kvm kvm-intel" is basically the only thing between most people and a KVM host. No reboot required. To be completely frank, the prospect of doing industrial-scale virtualization on top of nothing but the Linux kernel, with a wild module in it, gives me the willies; it's inelegant as hell. But for now, it's pretty much the best we have.

The problem is that it really only half works, which is to say that while you can have hypervisor functionality and a booted virtual machine with a few commands, it's not incredibly functional in practical systems. There aren't really good management tools, getting even basic networking configured off the bat is a struggle, and qemu as the "front end" for KVM leaves me writhing in anger and frustration. [2]
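For a sense of what qemu-as-front-end looks like in practice, here's a hedged sketch of the sort of invocation involved, assuming a hypothetical qcow2 image named guest.img, and using the stock user-mode networking that's only really suited to desktop testing:

    # A sketch of driving qemu/KVM from a script; guest.img is a hypothetical
    # disk image, and all of the flags are stock qemu options.
    import subprocess

    subprocess.run([
        "qemu-system-x86_64",
        "-enable-kvm",                  # acceleration via the kvm kernel module
        "-m", "1024",                   # guest memory, in MB
        "-drive", "file=guest.img,format=qcow2",
        "-nographic",                   # serial console, no graphical window
        "-net", "nic", "-net", "user",  # default user-mode (slirp) networking
    ], check=True)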

Xen is also subject to these concerns, particularly around networking. At the same time, Xen's basic administrative tools make more sense, and domUs can be configured outside of interminable, non-paradigmatic command-line switches.

The core of this problem is that KVM isn't very Unix-like; it's a problem that pervades the entire tool, and it's probably rooted in the history of its development.

What to do? First, KVM does a wretched job of anticipating actual real-world use cases, and it needs to do better at that. For instance, it sets up networking in a way that's pretty much only good for software testing and GUI interfaces, but sticking the kernel on the inside of the VM makes it horrible for kernel testing. Sort out the use cases, and there ought to be associated tooling that makes common networking configurations easy.

Second, KVM needs to at least pretend to be Unix-like. I want config files with sane configurations, and I want otherwise mountable disk images that can be easily mounted by the host.

Easy right?

[1]The commercial vendor behind Xen, under whose stewardship the project seems to have mostly stalled. I suspect that the commercial distribution is Red Hat 5-based, which is pretty dead-end. Citrix doesn't seem to be very keen on using "open source" to generate a sales channel, and also seems somewhat hesitant to put energy into making Xen easier to run for existing Linux/Unix users.
[2]libvirtd and Virt-Manager work pretty well, though they're not particularly flexible, and they're not a simple command-line interface with a configuration file system.

City Infrastructure

I'm always interested in how the lessons that people learn in IT trickle down to other kinds of work and problems. This is one of the reasons that I [1] am so interested in what developers are interested in: if you want to know what's happening in the technology space, it's best to start at the top of the food chain. For this reason, this article from IBM, which addresses the use of IT/data center management tools outside of the data center, was incredibly interesting to me.

When you think about it, it makes sense. IT involves a lot of physical assets, even more virtual assets, and when projects and systems grow big enough, it can be easy to lose track of what you have, much less what state it's in at any given time. Generalized, this is a prevalent issue in many kinds of complex systems.

As an aside, I'm a little curious about when software that provides asset management and monitoring features will scale down to the personal level. That'll be interesting too. There are the beginnings of this kind of thing (e.g. iTunes, and git-annex), but only the beginnings.

I'm left with the following questions:

  • Obviously, moving from managing and monitoring networked devices to managing and monitoring infrastructure objects like water filtration systems, storm water drainage, the electrical grid, snow removal, etc. presents a serious challenge for the developers of these tools, and this adaptation will likely improve the tools. I'm more interested in how cities improve in this equation, and not simply with regard to operating efficiencies. What do we learn from all this hard data on cities?
  • Will cities actually be able to become more efficient, or will they need to expand to include an additional layer of management that nullifies the advances? There are also concerns that additional efficiency could push the "carrying capacity" of cities to unsustainable levels.
  • Can the conclusions from automated city-wide reporting lead to advancements in quality of service if we get better at identifying defective practices and equipment? In this vein, how cities share data with one another will also be quite interesting.

I'd love to hear from you!

[1]RedMonk also makes a similar argument.

mcabber and IM

I've always rather enjoyed this post that I wrote about instant messaging programs. My issue is that I use IM a lot. A lot. I communicate with colleagues and friends over IM; if you want to get a hold of me, it's often the best way to do it, and for a lot of communications I frankly prefer it to the phone.

Nevertheless, IM clients, on the whole, are mostly pretty bad. Right? They distract, they filter information poorly, they take up a lot of room on the screen, and they are, as a class, pretty inconsistent. This is probably because no one really expected people to use instant messaging technologies in a serious way.

But here we are.

The leading IM client, really, for people who rely on this kind of thing is Pidgin (and the other libpurple-based clients), which makes it possible to connect to lots of different services at the same time while having only one roster/buddy list. It's a good solution to the "multiple networks/accounts" problem, but the truth is the quality of the implementation varies, and the user interface is... awkward and rigid [1].

There is, however, this program called mcabber that provides the ability to connect to one XMPP account in a terminal-based (ncurses) environment. It's not perfect, nothing is, but it's a lot better than the other options.

I've not been able to switch to it for all of my IMing needs, for a couple of reasons mostly related to my XMPP server/provider, but I have used it exclusively for a number of days and the experience is pretty good. Reasons I like it:

  • Everything lives in one window, and chat windows have equal footing, like buffers in emacs, say. That's really nifty.

  • The key shortcuts are really simple, and quite intuitive. That's key in a terminal application.

  • I can run it in screen, and pull the screen over ssh to my laptop when I'm not in front of my computer. Though xmpp is generally really great about multiple connections, sometimes it's best to not have to deal with that fussing.

  • I like that it supports PGP encryption, though I don't have anyone with whom I can have encrypted conversations; but that always seems to be a minor detail.

This isn't to say that it's all good, though. There are some problems that I've had. To be fair, my complaints here are much fewer than with most other XMPP clients, so that's a good thing, right? Complaints:

  • No support for service discovery, which means you have to install Psi, basically, if you're serious about getting the most out of XMPP. This is... unfortunate.

  • It takes a lot of work to get it configured and to get key bindings set up. I'm mostly of the opinion that it's best to give users a set of key bindings to start with and then let them customize as needed. The default screen layout isn't particularly useful or economical, either (put the status window at the top, and make it smaller, for starters).

  • It doesn't support multiple connections/user IDs. This is a biggie; while I don't (really) object to the fact that it doesn't support other protocols, I think the reality is that most users probably have, and need to use, more than one identity at once, so that's a noticeable hole.

  • I'd also like an easy interface for producing system notifications: give me a setting to pipe messages (username, an excerpt of x characters, time) to a pipe (|) and I'd be very happy indeed (see the sketch after this list).

  • I couldn't decide how I feel about the fact that there's only one "text entry field" (mini-buffer) shared by all outgoing message buffers; it meant that, particularly with scrollback, it was easy to cross contexts unintentionally.
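
To illustrate the notification pipe wished for above: a sketch where the FIFO path and the tab-separated line format are entirely hypothetical (mcabber offers no such setting today), with notify-send doing the actual notifying:

    # Hypothetical consumer for a wished-for mcabber notification pipe; the
    # FIFO path and the username<TAB>excerpt<TAB>time format are invented.
    import subprocess

    FIFO = "/tmp/mcabber-events"  # hypothetical; would be created with mkfifo

    with open(FIFO) as events:
        for line in events:
            user, excerpt, when = line.rstrip("\n").split("\t", 2)
            # hand each message to the desktop notification daemon
            subprocess.run(["notify-send", f"{user} at {when}", excerpt])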

My last complaint isn't so much a complaint about the software itself as about the instant messaging space in general. Basically, I think there has to be a way to filter this kind of communication: ways to set up client-independent rules regarding statuses, auto-responders, and notification levels. I've said that this should be "along the lines of procmail," but I don't quite know what, even then, it would look like. But someday it's coming, and I for one can't wait.
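
For what it's worth, here's a rough sketch of what procmail-ish rules for IM might look like; the Rule shape and the actions are invented for illustration, since no client I know of exposes anything like this:

    # An invented sketch of procmail-style IM filtering; nothing here
    # corresponds to a real client's API.
    import re
    from dataclasses import dataclass

    @dataclass
    class Rule:
        pattern: str     # regex matched against the sender's JID
        action: str      # "notify", "silence", or "autorespond"
        reply: str = ""  # canned reply, used only by "autorespond"

    RULES = [
        Rule(r".*@work\.example\.com$", "notify"),           # always surface work
        Rule(r"^newsbot@", "silence"),                       # never interrupt me
        Rule(r".*", "autorespond", reply="afk, back soon"),  # catch-all
    ]

    def dispatch(sender: str) -> Rule:
        """Return the first rule whose pattern matches the sender's JID."""
        for rule in RULES:
            if re.match(rule.pattern, sender):
                return rule
        return Rule(r".*", "notify")  # sane default if nothing matches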

But all in all, it's a great program, and you should check mcabber out if you're intense about jabber/xmpp and instant messaging (and living in the console).

[1]So there was a pretty notorious fork threat in the Pidgin project a while back over a sort of deterministic user interface decision, which I return to every now and then as an example of both intra-project FOSS politics gone awry and user interface design gone awry. I'm not dredging up the flame war, because the truth is I really hate GUI applications writ large, so on some level it's nothing specific.