Documentation Emergence

I stumbled across a link somewhere along the way to a thread about the Pyramid project's documentation planning process. It's neat to see a community coming to what I think is the best possible technical outcome. In the course of this conversation Iain Duncan said something that I think is worth exploring in a bit more depth. The following is directly from the list, edited only slightly:

I wonder whether some very high level tutorials on getting into Pyramid that look at the different ways you can use it would be useful? I sympathize with Chris and the other documenters because just thinking about this problem is hard: How do you introduce someone to Pyramid easily without putting blinders on them for Pyramid's flexibility? I almost feel like there need to be 2 new kinds of docs:

  • easy to follow beginner docs for whatever the most common full stack scaffold is turning out to be (no idea what this is!)
  • some mile high docs on how you can lay out pyramid apps differently and why you want to be able to do that. For example, I feel like hardly anyone coming to Pyramid from the new docs groks why the zca under the hood is so powerful and how you can tap into it.

Different sets of users have different needs from documentation. I think my "Multi-Audience Documentation" post also addresses this issue.

I don't think there are good answers and good processes that always work for documentation projects. The targeted users and audience change a lot depending on the kind of technology at play. The needs of users (and thus of the documentation) vary with the technical complexity and nature of every project or product. I think, as the above example demonstrates, there's additional complexity for software whose primary users are very technically adept (e.g. systems administrators) or are even software developers themselves.

The impulse to have "beginner documentation," and "functional documentation," is a very common solution for many products and reflects two main user needs:

  • to understand how to use something. In other words, "getting started," documentation and tutorials.
  • to understand how something works. In other words the "real" documentation.

I think it's feasible to do both kinds of documentation within a single resource, but the struggle then revolves around making sure that the right kind of users find the content they need. That's a problem of documentation usability and structure. But it's not without challenges; let's think about those in the comments.

I also find myself thinking a bit about the differences between web-based documentation resources and conventional manuals in PDF or dead-tree editions. I'm not sure how to resolve these challenges, or even what the right answers are, but I think the questions are very much open.

Xen and KVM: Failing Differently Together

When I bought what is now my primary laptop, I had intended to use the extra flexibility to learn the prevailing (industrial-grade) virtualization technology. While that project would have been edifying on its own, I also hoped to use the extra flexibility to do some more consistent testing and development work.

This project spawned a Xen laptop project, but the truth is that Xen is incredibly difficult to get working, and eventually the "new laptop" just became the "every day laptop," and I let go of the laptop Xen project. In fact, until very recently I'd pretty much given up on doing virtualization things entirely, but for various reasons beyond the scope of this post I've been inspired to begin tinkering with virtualization solutions again.

As a matter of course, I found myself trying KVM in a serious way for the first time. This experience both generated a new list of annoyances and reminded me about all the things I didn't like about Xen. I've collected these annoyances and thoughts into the following post. I hope that these thoughts will be helpful for people thinking about virtualization pragmatically, and also help identify some of the larger pain points with the current solution.

Xen Hardships: It's all about the Kernel

Xen is, without a doubt, the more elegant solution from a design perspective and it has a history of being the more robust and usable tool. Performance is great, and Xen hosts can have up-times in excess of a year or two.

The problem is that dom0 support has, for the past 2-3 years, been in shambles, and the situation isn't improving very rapidly. For years, the only way to run a Xen box was to use an ancient kernel with a set of patches that was frightening, or a more recent kernel with ancient patches forward ported. Or you could use cutting edge kernel builds, with reasonably unstable Xen support.

A mess in other words.

Now that Debian Squeeze (6.0) has a pv-ops dom0 kernel, things might look up. But other than that kernel (which I've not had any success with, though that may be me), basically the only way to run Xen is to pay Citrix [1] or build your own kernel from scratch. Again, results will be mixed (particularly given the non-existent documentation), maintenance costs are high, and a lot of energy will be duplicated.

What to do? Write documentation and work with the distributions so that if someone says "I want to try using Xen," they'll be able to get something that works.

KVM Struggles: It's all about the User Experience

The great thing about KVM is that it just works. "sudo modprobe -a kvm kvm-intel" is basically the only thing between most people and a KVM host. No reboot required. To be completely frank, the prospect of doing industrial-scale virtualization on top of nothing but the Linux kernel with a wild module in it gives me the willies and is inelegant as hell. For now, it's pretty much the best we have.
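As a small illustration of the "just works" claim, here is a minimal sketch (mine, not part of any KVM tooling) that checks which KVM module a Linux host would want. It assumes the usual /proc/cpuinfo layout, where "vmx" marks Intel VT-x and "svm" marks AMD-V; the function name kvm_module_for is hypothetical.

```python
from pathlib import Path

# CPU flags reported in /proc/cpuinfo for hardware virtualization,
# mapped to the matching KVM module: "vmx" is Intel VT-x, "svm" is AMD-V.
VIRT_FLAGS = {"vmx": "kvm-intel", "svm": "kvm-amd"}


def kvm_module_for(cpuinfo_text):
    """Return the KVM module matching the CPU flags, or None if none match."""
    for line in cpuinfo_text.splitlines():
        if not line.startswith("flags"):
            continue
        # Lines look like "flags : fpu vme ... vmx ..."
        _, _, rest = line.partition(":")
        flags = set(rest.split())
        for flag, module in VIRT_FLAGS.items():
            if flag in flags:
                return module
    return None


def main():
    cpuinfo = Path("/proc/cpuinfo")
    if not cpuinfo.exists():
        print("no /proc/cpuinfo here; this sketch assumes a Linux host")
        return
    module = kvm_module_for(cpuinfo.read_text())
    if module is None:
        print("no hardware virtualization flags found")
    else:
        print(f"try: sudo modprobe {module}")
        # /dev/kvm appears once the modules are loaded.
        print("/dev/kvm present:", Path("/dev/kvm").exists())


if __name__ == "__main__":
    main()
```

If the flags are missing, virtualization support may simply be switched off in the firmware, which this sketch cannot tell apart from an unsupported CPU.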

The problem is that it really only half works, which is to say that while you can have hypervisor functionality and a booted virtual machine with a few commands, it's not incredibly functional in practical systems. There aren't really good management tools, getting even basic networking configured off the bat is a struggle, and qemu as the "front end" for KVM leaves me writhing in anger and frustration. [2]

Xen is also subject to these concerns, particularly around networking. At the same time, Xen's basic administrative tools make more sense, and domU's can be configured outside of interminable non-paradigmatic command line switches.

The core of this problem is that KVM isn't very Unix-like, and it's a problem that is rooted in its core and pervades the entire tool, and it's probably rooted in the history of its development.

What to do? First, KVM does a wretched job of anticipating actual real-world use cases, and it needs to do better at that. For instance, it sets up networking in a way that's pretty much only good for software testing and GUI interfaces, but sticking the kernel on the inside of the VM makes it horrible for kernel testing. Sort out the use cases, and there ought to be associated tooling that makes common networking configurations easy.

Second, KVM needs to at least pretend to be Unix-like. I want config files with sane configurations, and I want otherwise mountable disk images that can be easily mounted by the host.
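To make the config-file wish concrete, here is a rough sketch of the kind of thin wrapper I have in mind. The INI layout, the key names, and the qemu_command helper are all made up for illustration; the only real parts are the qemu options themselves (-enable-kvm, -name, -m, -smp, -drive), and the binary name is an assumption that varies by distribution.

```python
import configparser
import shlex

# A hypothetical guest definition; none of these key names are standard.
EXAMPLE_CONFIG = """
[guest]
name = testvm
memory_mb = 1024
cpus = 2
disk = /srv/vm/testvm.img
"""


def qemu_command(config_text, binary="qemu-system-x86_64"):
    """Translate the hypothetical INI guest definition into a qemu argv list."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    guest = parser["guest"]
    return [
        binary,
        "-enable-kvm",                 # use the KVM modules for acceleration
        "-name", guest["name"],
        "-m", guest["memory_mb"],      # qemu reads a bare number as megabytes
        "-smp", guest["cpus"],
        "-drive", f"file={guest['disk']},format=raw",
    ]


if __name__ == "__main__":
    # Print the command rather than running it, so the sketch is safe to try.
    print(shlex.join(qemu_command(EXAMPLE_CONFIG)))
```

I picked a raw image here because a raw file is the kind of disk image a host has the best chance of mounting directly, which is the other half of the wish above.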

Easy right?

[1]The commercial vendor behind Xen, under whose stewardship the project seems to have mostly stalled. And I suspect that the commercial distribution is Red Hat 5-based, which is pretty dead-end. Citrix doesn't seem to be very keen on using "open source," to generate a sales channel, and also seems somewhat hesitant to put energy into making Xen easier to run for existing Linux/Unix users.
[2]libvirtd and Virt Manager work pretty well, though they're not particularly flexible, and they're not a simple command line interface and a configuration file system.

Is Dropbox the Mobile File System Standard?

I've started using Dropbox on my Android devices recently (and my laptop as a result, [1]) and I'm incredibly impressed with the software and with the way that this service is a perfect example of the kind of web services that we need to see more of. While I have some fairly uninteresting concerns about data security and relying on a service that I'm not administrating personally, I think it's too easy to get caught up in the implications of where the data lives and forget the implications of having "just works" file syncing between every computer.

I used to think that the thing that kept mobile devices from being "real" was the fact that they couldn't sell "post-file system" computer use. I'm not sure that we're ready to do away with the file system metaphor yet. I think Dropbox is largely successful because it brings files back and makes them available in a way that makes sense for mobile devices.

The caveat is that it provides a file system in a way that makes sense in the context of these kinds of "file systemless" platforms. Dropbox provides access to files, but in a way that doesn't require applications (or users) to have a firm awareness of "real files." Best of all, Dropbox (or similar) can handle all of the synchronization, so that every application doesn't need to have its own system.

This might mean that Dropbox is the first functionally Unix-like mobile application. I think (and hope) that Dropbox's success will prove to be an indicator for future development. Not that there will be more file syncing services, but that mobile applications and platforms will have applications that "do one thing well," and provide a functionality upon which other applications can build awesome features.


This isn't to say that there aren't other important issues with Dropbox. Where your data lives does matter, who controls the servers that your data lives on is important. Fundamentally, Dropbox isn't doing anything technologically complicated. When I started writing the post, I said "oh, it wouldn't be too hard to get something similar set up," and while Dropbox does seem like the relative leader, it looks like there is a fair amount of competition. That's probably a good thing.

So despite the concerns about relying on a proprietary vendor and about trusting your data on someone else's server, data has to go somewhere. As long as users have choices and options, and there are open ways of achieving the same ends, I think that these issues are less important than many others.

[1]To be fair, I'm using it to synchronize files to the Android devices, and not really to synchronize files between machines: I have a server for simple file sharing, and git repositories for the more complex work. So it's not terribly useful for desktop-to-desktop sharing. But for mobile devices? Amazing.

Issue Tracking and the Health of Open Source Software

I read something recently that suggested that the health of an open source project and its community could be largely assessed by reviewing the status of the bug tracker. I'm still trying to track down the citation for this remark. This basically says that vital, active projects have regularly updated bugs that are clearly described, and that bugs are easy to search and easy to submit.

I'm not sure that free software communities and projects can be so easily assessed or that conventional project management practices are the only meaningful way to judge a project's health. While we're at it, I don't know that it's terribly useful to focus too much attention or importance on project management. Having said that, the emergence of organizational structure is incredibly fascinating, and could probably tolerate more investigation.

As a starting point, I'd like to offer two conjectures:

  • First, that transparent issue tracking is a reasonably effective means of "customer service," or user support. If the bug tracker contains answers to questions that people encounter during use, and provides a productive way to resolve issues with the software, it helps with support self-service. Obviously some users and groups of users are better at this than others.
  • Second, issue tracking is perhaps the best way to do bottom-up project/product management and planning in the open, particularly since these kinds of projects lack formal procedures and designated roles to do this kind of organizational work.

While the overriding goal of personal task management is to break things into the smallest manageable work units, the overriding goal of issue tracking systems is to track the most intellectually discrete issues within a single project through the development process. Thus, issue tracking systems have requirements that are either much less important in personal systems or actively counter-intuitive for other uses. They are:

  • Task assignment, so that specific issues can be assigned to different team members. Ideally this means a specific developer can "own" a specific portion of the project and actually be able to work and coordinate efforts on the project.
  • Task prioritization, so that important or crucial issues get attention before "nice to have" items are addressed.
  • Issue comments and additional attached information, to track progress and support information sharing among teams, particularly over long periods of time with asynchronous elements.
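The three requirements above can be sketched as a tiny data model. This is only an illustration of the shape of the problem, not a proposal for a real tracker: the Issue class, its field names, and the priority scale are all assumptions of mine.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Issue:
    title: str
    priority: int = 3                   # assumed scale: 1 = crucial, 5 = "nice to have"
    assignee: Optional[str] = None      # task assignment: who "owns" this issue
    comments: List[str] = field(default_factory=list)

    def assign(self, developer):
        """Hand the issue to a specific team member."""
        self.assignee = developer

    def comment(self, author, text):
        """Append to the running discussion so context survives over time."""
        self.comments.append(f"{author}: {text}")


def work_queue(issues):
    """Task prioritization: crucial issues sort ahead of "nice to have" ones."""
    return sorted(issues, key=lambda issue: issue.priority)


if __name__ == "__main__":
    issues = [
        Issue("Polish the logo", priority=5),
        Issue("Crash on startup", priority=1),
    ]
    issues[1].assign("alice")
    issues[1].comment("bob", "traceback attached")
    for issue in work_queue(issues):
        print(issue.priority, issue.title, issue.assignee, issue.comments)
```

Note how little of this a personal task list needs: with one person there is nobody to assign to and nobody to discuss with, which is the difference the list above is getting at.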

While it's nice to be able to integrate tasks and notes (this is really the core of org-mode's strength) issue tracking systems need to be able to accommodate error output and discussion from a team on the best solution, as well as discussion about the ideal solution.

The truth is that a lot of projects don't do a very good job of using issue tracking systems, despite how necessary and important bug trackers are. The prefabricated systems can be frustrating and difficult to use, and most of the minimalist systems [1] are hard to use in groups. [2] The first person to write a fully featured, lightweight, and easy to use issue tracking system will be incredibly successful. Feel free to submit a patch to this post if you're aware of a viable system along these lines.

[1]I'm thinking about using ikiwiki or org-mode to track issues, but ditz suffers from the same core problem.
[2]Basically, they either sacrifice structure or concurrency features or both. Less structured systems rely on a group of people to capture the same sort of information in a regular way (unlikely) or they capture less information; neither option is tenable. Without concurrency (because they store things in single flat files) people can't use them to manage collaboration, which makes them awkward personal task tracking systems.

Packaging Technology Creates Value

By eliminating the artificial scarcity of software, open source software forces businesses and technology developers to think differently about their business models. There are a few ways that people have traditionally built businesses around free and open source software. There are pros and cons to every business model, but to review, the basic ideas are:

  • Using open source software as a core and building a thin layer of proprietary technology on top of the open source core. Sometimes this works well enough (e.g. SugarCRM, OS X,) and sometimes this doesn't seem to work as well (e.g. MySQL, etc.)
  • Selling services around open source software. This includes support contracts, training services, and infrastructure provisioning. Enterprises and other organizations and projects need expertise to make technology work, and the fact that open source doesn't bundle licensing fees with support contracts doesn't make the support (and other services) less useful or needed for open source.
  • Custom development services. Often open source projects provide a pretty good framework for a technology, but require some level of customization to fit the needs and requirements of the "business case." The work can be a bit uneven, as with all consulting, but the need and the service are both quite real. While the custom code may end up back in the upstream, sometimes this doesn't quite happen for a number of reasons. Custom development obviously overlaps with service and thin-proprietarization, but is distinct: it doesn't revolve around selling proprietary software, and it doesn't involve user support or systems administration. These distinctions can get blurry in some cases.

In truth, when you consider how proprietary software actually conveys value, it's really the same basic idea as the three models above. There's just this minor mystification around software licenses, but other than that, the business of selling software and services around software doesn't vary that much.

James Governor of Red Monk suggests a fourth option: Packaging technology.

The packaging model is likely just an extension of the "services" model, but it draws attention to the ways that companies can create real value not just by providing services and not just by providing a layer of customization, but by spending time attending to the whole experience, rather than the base technology. It also draws some attention to the notion that reputation matters.

I suppose it makes sense: when businesses (and end users) pay for proprietary software, while the exchange is nominally "money" for "license" usage rights, in reality there are services and other sources of value. Thus it is incumbent upon open source developers and users to find all of the real sources of value that can be conveyed in the exchange of money for software, and find ways to support themselves and the software. How hard can it be?

Leadership in Distributed Social Networks

Let us understand "social networks" to mean networks of people interacting in a common group or substrate rather than the phenomenon that exists on a certain class of websites (like Facebook): we can think of this as the "conventional sense" of the term.

As I watch Anonymous, the "hacktivist" group, to say nothing of the movements in Egypt and Tunisia, I'm fascinated by the way that such an ad hoc group can appear to be organized and coherent on the outside without appearing to have real leadership. External appearance and reality are different, of course, and I'm not drawing a direct parallel between what Anonymous is doing and what happened in Egypt, but there is a parallel. I think we're living in a very interesting moment. New modes of political and social organizing do not manifest themselves often.

We still have a lot to learn about what's happened recently in Egypt/Tunisia/Libya and just as much to learn about Anonymous. In a matter of months or years, we could very easily look back on this post and laugh at its naivete. Nevertheless, at least for the moment, there are a couple of big things that I think are interesting and important to think about:

  • We have movements that are led, effectively, by groups rather than individuals, and if individuals are actually doing leadership work, they are not taking credit for that work. So these are movements that are not led by egos.
  • These are movements that are obviously technologically very aware, but not in a mainstream sort of way. Anonymous uses (small?) IRC networks and other collaborative tools that aren't quite mainstream yet. The Egyptian protesters in the very beginning had UStream feeds of Tahrir Square, and I'd love to know how they were handling internal coordination and communication.
  • I think the way that these movements "do ideology" is a bit unique and unconventional. I usually think of ideology as being a guiding strategy from which practice springs. I'm not sure that's what's happening here.
  • The member activists who are doing the work in these movements are not professional politicians or political workers.

The more I ponder this, the more I realize how improbable these organizations are and the more I am impressed by the ability of these groups to be so effective. In addition to all of my other questions, I'm left wondering: how will this kind of leadership (or non-leadership) method and style influence other kinds of movements and projects?

Talk Proposals

At POSSCON there were a lot of talks, most of which did little to interest me. I don't think this was the fault of the conference: I'm a weirdo. I tend to be developer-grade geeky, but am still not a developer and I wasn't otherwise representative of the general audience. By the end, I was starting to think that the thing most people talk about at conferences isn't very cutting edge. I don't think it's just POSSCON (surely not!) but I've not been to enough conferences to be able to speak definitively. In any case, I'd like to propose in open forum (i.e. this wiki,) a number of conference presentations that I'd like to see or would be willing to present.

If you're interested in any of these presentations, or want to help/inspire me to work up notes, please create or add to the wiki pages linked to below.

Emacs Productivity and Production, Org-Mode and Beyond

Emacs, with its extensive feature list, endless customizations, and arcane approach to user interface, is often the butt of many jokes. While some of this is certainly valid, there are many incredibly innovative and intensely useful pieces of software written for Emacs. This talk would center on the org-mode package, but would branch out to talk about workflows and automation in Emacs and using Emacs to help people make awesome work.

The Year of The Linux Desktop: Amazing Window Manager Paradigms

I'm always distraught by the way that discussion of "The Linux Desktop" revolves around convincing people that the major desktop environments (KDE/GNOME) either are feature comparable to the Windows/OS X desktop or are able to "out-Windows" and "out-OS X" each other/Windows/OS X. Both of these propositions seem somewhat tenuous and unlikely to be convincing in the long run, and do little to inspire enthusiasm for the platform. This is sad because there is a lot of very interesting activity in the Linux desktop space. This talk would present and explore a couple of projects in the tiling window manager space and explain why this kind of software is what should drive adoption of the Linux desktop.

Cloud Independence, Infrastructure, and Administration

The "cloud computing" paradigm and the shift to thinking about technology resources as service based raises some interesting questions about software/computing freedom and the shape of data ownership in the contemporary moment. This talk would address these questions, provide an overview of how to "go it alone," and explain how to be responsible for managing and administering your own personal "cloud infrastructure."

The Inevitability of Open Source

I recently attended POSSCON as part of my day-job. I don't usually blog directly about this kind of stuff ("You like to keep your church and state separate," a fellow attendee said, which fits.) But I had a number of awesome conversations with the speakers, attendees and sponsors that may spawn a series of brief posts here. POSSCON is a regional open source convention that drew developers, leaders of information technology departments, and IT consultants of various types.

I had a number of conversations that revolved around the adoption of open source in opposition to proprietary systems. People asked questions like "what do we have to do to get more people to use open source software?" and many people apologized for doing work with proprietary software for mostly economic reasons (e.g. "I have a .NET development job," or "people need windows administration and I can't turn away work.")

This led me to have one of three reactions:

1. Working with any specific (proprietary) technology, particularly because you have to make ends meet, should never require excusing. There are cases where "working with proprietary technology" may be more like "building a business model on proprietary technology," and that sort of thing needs to be watched out for, but I don't think it's morally ambiguous to make a living.

2. I'm not sure that the success of technology, particularly open source, is determined solely on the basis of adoption rates. Successful technology is technology that efficiently allows people/cyborgs to do work, not overwhelmingly ubiquitous technology.

3. In many many contexts, open source technology has triumphed over proprietary alternatives: Linux-based systems are the dominant UNIX-like operating system. OpenSSH is the dominant SSH implementation (and remote terminal protocol/implementation). Darwin/FreeBSD is incredibly successful (as Mac OS X.) Other domains where open source packages have very high (dominating) adoption rates: OpenSSL, gcc, perl/python/php/ruby (web development), Apache/Lighttpd/nginx (web servers) etc.

While I think the end-user desktop isn't unimportant, I think there may be merit in playing to the strengths of open source (servers, infrastructure, developers.) Additionally, it seems more productive to have the discussion about "how do we advance open source," couched in terms of a battle for technological dominance in which open source has already won.

And Free Software/Open Source has won. While there remain sectors and domains where non-free software remains prevalent and business models that don't value users' freedom, I think that most people who know anything about technology will say that all paths forward lead toward a greater level of software freedom.

Maybe this is a symptom of the situation in which I work and maybe I'm being too optimistic, but I don't think so. Thoughts?