Collaborative Technology

I agreed to work on an article for a friend about the collaborative technology "stuff" that I've been thinking about for a long time. I don't have an archive that covers this subject, but perhaps I should, because I think I've written about the technology that allows people to make things with other people a number of times, though I have yet to pull together these ideas into some sort of coherent post or essay.

This has been my major post-graduation intellectual challenge. I have interests, even some collected research, and no real way to turn these half-conceptualized projects into a "real paper." So I've proposed working with a friend to collect and develop a report that's more concrete and more comprehensive than the kind of work that I've been attempting to accomplish on the blog. Blogging is great, don't get me wrong, but I think it leads to certain kinds of thinking and writing (at least as I do it), and sometimes other kinds of writing and thinking are required.

Regarding this project, I want to think about how technology like "git" (a distributed version control system) and even tools like wikis shape the way that groups of people can collaborate with each other. I think there's an impulse to say "look at the possibilities that these tools create! This brave new world is entirely novel, and not only changes the way I am able to complete my work, but how I look at problems, and makes it so much easier for me to get things done." At the same time, the technology can only promote a way of working; it doesn't necessarily enforce a way of working, nor does any particular kind of technology really remove the burdens and challenges of "getting things done." More often, perhaps, new kinds of technology, like distributed version control, are responsible for raising the level of abstraction and allowing us (humans) to attend to higher-order concerns.

Then, moving up from the technology, I think looking at how people use technology in this class allows us to learn a great deal about how work is accomplished. We can get an idea of when work is being done, and an idea of how quality control efforts are implemented. Not only does this allow us to demystify the process of creation, but having a clearer idea of how things are made could allow us to become better makers.

The todo list, then, is something like:

  • Condense the above into something that resembles a thesis/argument.
  • Become a little more familiar with the git-dm ("data mining") tool that the Linux Foundation put together for their "state of Kernel development."
  • Develop some specific questions to address. I think part of my problem above and heretofore has been that I'm saying "there's something interesting here, if we looked," rather than "I think w kind of projects operate in x ways, where y projects will operate in z ways."
  • Literature review. I've done some of this, but I've felt like I need to do even more basic methodological and basic theory reading. And even though an unread Patterns of Culture is on my bookshelf, I don't need to read that to begin reading articles.
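
As a warm-up for the git-dm item above, here is a minimal Python sketch of the kind of counting involved. This is not git-dm and it doesn't borrow git-dm's interface; it's a toy that assumes you've already saved the output of `git log --format='%an|%ad' --date=short` as text. The function name `summarize_log` and the sample data are made up for illustration.

```python
from collections import Counter


def summarize_log(log_text):
    """Count commits per author and per month from 'name|YYYY-MM-DD' lines."""
    by_author = Counter()
    by_month = Counter()
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # rpartition splits on the last "|", so names containing "|" survive.
        author, _, date = line.rpartition("|")
        by_author[author] += 1
        by_month[date[:7]] += 1  # keep only "YYYY-MM"
    return by_author, by_month


# Hypothetical sample; real input would come from:
#   git log --format='%an|%ad' --date=short
SAMPLE_LOG = """\
Alice|2010-05-02
Bob|2010-05-17
Alice|2010-06-01
"""

authors, months = summarize_log(SAMPLE_LOG)
print(authors.most_common())
print(sorted(months.items()))
```

Counts like these only tell you when commits landed and under whose name, which is a long way from "how work is accomplished," but they're a concrete place to start asking more specific questions.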

That's a start. Feedback is always useful. I'll keep you posted as I progress.

Saved Searches and Notmuch Organization

I've been toying around with the Notmuch Email Client, which is a nifty piece of software that provides a very minimalist and powerful email system inspired by the organizational model of Gmail.

Mind you, I don't think I've quite gotten it.

Notmuch says, basically, build searches (e.g. "views") to filter your email so you can process your email in the manner that makes the most sense to you, without needing to worry about organizing and sorting email. It has the structure for "tagging," which makes it easy to mark status for managing your process (e.g. read/unread, reply-needed), and the ability to save searches. And that's about it.

Functionally, tags and saved searches do the work that mailboxes do in terms of the intellectual organization of email. Similarly, the ability to save searches makes it possible to do a good measure of "preprocessing." In the same way that Gmail changes the email paradigm by saying "don't think about organizing your email, just do what you need to do," Notmuch says "do less with your email, don't organize it, and trust that the machine will be able to help you find what you need when the time comes."
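
To make the tags-plus-saved-searches idea concrete, here's a toy model in Python. To be clear, this is not Notmuch's API or its query syntax; the message list, the `search` and `view` functions, and the saved search names are all hypothetical. It only shows that a "mailbox" can be a named query over tags rather than a place where mail is filed.

```python
# Toy model only: every name here is hypothetical, and none of it is Notmuch's API.
MESSAGES = [
    {"subject": "Patch review", "from": "coworker@example.com", "tags": {"unread", "work"}},
    {"subject": "Dinner?", "from": "friend@example.com", "tags": {"reply-needed"}},
    {"subject": "Build log", "from": "ci@example.com", "tags": {"work"}},
]


def search(messages, require=(), exclude=()):
    """Return messages carrying every tag in `require` and none in `exclude`."""
    require, exclude = set(require), set(exclude)
    return [
        m for m in messages
        if require <= m["tags"] and not (exclude & m["tags"])
    ]


# A "saved search" is just a named query; nothing is ever moved into a folder.
SAVED_SEARCHES = {
    "inbox": {"require": {"unread"}},
    "to-answer": {"require": {"reply-needed"}},
    "work-archive": {"require": {"work"}, "exclude": {"unread"}},
}


def view(name, messages=MESSAGES):
    """Run the saved search called `name` against the messages."""
    return search(messages, **SAVED_SEARCHES[name])


for name in SAVED_SEARCHES:
    print(name, [m["subject"] for m in view(name)])
```

Changing a message's tags changes which views it shows up in, and that's the whole "filing" system.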


I've been saying variations of the following for years, but I think on some level it hasn't stuck for me. Given contemporary technology, it doesn't make sense to organize any kind of information that could conceivably be found with search tools. Notmuch proves that this works, and although I've not been able to transfer my personal email over, I'm comfortable asserting that notmuch is a functional approach to email. To be fair, I don't feel like my current email processing and filtering scheme is that broken, so I'm a bad example.

The questions that this raises, which I don't have particularly good answers for, are as follows:

  • Are there good tools for the "don't organize when you can search" crew, for non-email data? And I'm not just talking about search engines themselves (as there are a couple: xapian, namazu), or ungainly desktop GUIs (which aren't without utility), but the proper command-line tools, emacs interfaces, and web-based interfaces?
  • Are conventional search tools the most expressive way of specifying what we want to find when filtering or looking for data? Are there effective improvements that can be made?
  • I think there's intellectual value created by organizing and cataloging information "manually," and "punting to search" seems like it removes the opportunity to develop good and productive information architectures (if we may be so bold). Is there a solution that provides the ease of search without giving up the benefits that librarianism brings to information organization?

Creating Useful Archives

I've done a little tweaking to the archives for dialectical futurism recently, including creating a new archive for science fiction and writing. Being who I am, this has inspired a little thought regarding the state and use of blog archives.

The latest iteration of this blog has avoided the now common practice of having large endless lists of posts organized by publication month or by haphazardly assigned category and tag systems. While these succeed at providing a complete archive of every post written, they don't add any real value to a blog or website. I'm convinced that one feature of successful blogs moving forward will be archives that are curated and convey additional value beyond the content of the site.

Perhaps blogs as containers for a number of posts will end up being more ephemeral than I'm inclined to think, and will therefore not require very much in the way of archives. Perhaps Google's index will be sufficient for most people's uses. Maybe. I remain unconvinced.

Heretofore, I have made archives for tychoish as quasi-boutique pieces: collections of the best posts that address a given topic. This is great from the perspective of thinking about blog posts as a collection of essays, but I've started to think that this may be less useful if we think of blogs as a collection of resources that people might want to have access to beyond their initial ephemeral form.

Right now my archives say "see stuff from the past few months, and several choice topics on which I wrote vaguely connected sequences of posts." The problem with the list of posts from the last few months is that there's not a lot of useful information beyond the title and the date. The problem with the topical archives is that they're not up to date, they're not comprehensive even for recent posts, and there's little "preview" of a given post beyond its title. In the end I think the possibility of visiting a topical archive looking for a specific post and not finding it is pretty large.

In addition to editorial collecting, I think archives, guides, or indexes of a given body of information ought to provide some sort of comprehensive method for accessing information. There has to be some middle ground.

I think the solution involves a lot of hand mangling of content, templates, and posts. I'm fairly certain that my current publication system is not up for the task without a fair amount of mangling and beating. As much as I want to think that this is a problem in search of the right kind of automation, I'm not sure that's really the case. I'm not opposed to editing things by hand, but it would significantly increase the amount of work involved in making any given post.

There is, I suspect, no easy solution here.

The Old Projects Project

Before a road trip, by now a couple of months ago, I installed a copy of nginx on my laptop in the hope of doing some web development and working on other projects when I was in the car. For the uninitiated (you mean you don't all write technical documentation for web developers and systems administrators?!?) nginx is an incredibly powerful web server. As of June 11th, foucualt, the server that hosts the Cyborg Institute and tychoish, runs nginx.

This is, almost always, I think, a losing proposition.

I never get any sort of substantial (or insubstantial) work done during my road trips up and down the Northeast Corridor. Not that that's a bad thing, but I also expect that there'll be more awake-time when I'm not driving or gossiping.

And there never is.

So the web server sat unused for a long time on my laptop, but recently I've been playing with it a bit and I've finally gotten a number of cool things set up. I have a local "git web" instance which makes it easier to track progress on local and private projects that are stored in git. Perhaps more importantly, I have set up quick local ikiwiki instances for a number of projects. They're easy to configure, quick to set up, and while I suppose I could hack something together for myself, there's something nifty about being able to take an alternate view of some content and also being able to really preview changes to your work before publishing them.

Also, and this is the real reason for this post, by virtue of this development I have revisited a few projects that had been lingering in the home directory of my computer for far too long. Which has been a powerful and useful exercise.

By which I mean, it's been painful.

Besides "the novel," which has been the lingering and dragging front burner project for a year, there are a number of quasi-serial stories that have lingered in some state of incompleteness for a couple of years now. I'm kind of amazed both at how foreign these stories seem to me in terms of the style (good to know that I'm a better writer than I was a few years ago), and also at how quickly I can fall right back into the story and tell you every little thing about the world, situation, and moment where I left off.

The mind is, indeed, an amazing thing.

Where my strategy for the past year has been to "plow through and finish the novel," I think my tactic this summer will be to move all of my projects forward in some way. Small daily writing goals for the novel, combined with somewhat less regular (but more specific) goals with regards to other projects. In the next two months I want to have a fairly active and varied writing schedule worked out that isn't based around the monthly (or so) weekend binges that I've been using for most of the last year.

That's the plan at any rate.

Organize Your Thoughts More Betterly

I've been working with a reader and friend on a project to build a tool for managing information for humanities scholars and others who deal with textual data, and I've been thinking about the problem of information management a bit more seriously. Unlike numerical or more easily categorized data, how to take a bunch of textual information--either of your own production or a library of your own collection--and manage it is far from a solved problem.

The technical limitation--from a pragmatic perspective--is that you need to have an understanding not only of the specific tasks in front of you, but a grasp of the entire collection of information you work with in order to effectively organize, manage, and use the texts as an aggregate.

"But wait," you say. "Google solved this problem a long time ago, you don't need a deterministic information management tool, you need to brute force the problem with enough raw data, some clever algorithms, and search tools," you explain. And on some level you'd be right. The problem is of course, you can't create knowledge with Google.

Google doesn't give us the ability to discover information that's new, or powerful. Google works best when we know exactly what we're looking for: the top results in Google are most likely to be the resources that the most people know and are familiar with. Google's good, and useful, and a wonderful tool that more people should probably use, but Google cannot lead you into novel territory.

Which brings us back to local information management tools. When you can collect, organize, and manipulate data in your own library you can draw novel conclusions. When the information is well organized, and you can survey a collection in useful and meaningful ways, you can see holes and collect more, and you can search tactically and within subsets of articles. I've been talking for more than a year about the utility of curation in the creation of value online, and fundamentally I think the same holds true for personal information collections.

Which brings us back to the ways we organize information. And my firm conclusion that we don't have a really good way of organizing information. Everything that I'm aware of either relies on search, and therefore only allows us to find what we already know we're looking for, or requires us to understand our final conclusions during the preliminary phase of our investigations.

The solution to this problem is thus twofold: First, we need tools that allow us to work with and organize the data for our projects, full stop. Wikis and never-ending text files don't really address all of the different ways we need to work with and organize information. Second, we need tools that are tailored to the way researchers who deal in text work with information, from collection and processing to quoting and citation, rather than focusing on the end stage of this process. These tools should allow our conceptual framework for organizing information to evolve as the project evolves.

I'm not sure what that looks like for sure, but I'd like to find out. If you're interested, do help us think about this!

(Also, see this post `regarding the current state of the Cyborg Institute <http://www.cyborginstitute.com/2010/06/a-report-from-the-institute/>`_.)

Strategies for Organizing Wiki Content

I've been trying to figure out wikis for a long time. It always strikes me that the wiki is probably the first truly unique (and successful) textual form of the Internet age. And there's a lot to figure out. The technological innovation of the wiki is actually remarkably straightforward, [1] and, while difficult, the community-building aspects of wikis are straightforward too. [2] The piece of the wiki puzzle that I can't nail down in a pithy sentence or two is how to organize information effectively on a wiki.

That's not entirely true.

The issue, I think, is that there are a number of different ways to organize content for a wiki, and no one organizational strategy seems to be absolutely perfect, and I've never been able to settle on a way of organizing wiki pages that I am truly happy with. The goals of a good wiki "information architecture" (if I may be so bold) are as follows:

  • Clarity: It should be immediately clear to the readers and writers of a wiki where a page should be located in the wiki. If there's hierarchy, it needs to fit your subject area perfectly and require minimal effort to grok. Because you want people to focus on the content rather than the organization, and we don't tend to focus on organizational systems when they're clear.
  • Simplicity: Wikis have a great number of internal links and can be (and are) indexed manually as needed, so as the proprietor of a wiki you probably need to do a lot less "infrastructural work" than you think you need to. Less is probably more in this situation.
  • Intuitiveness: Flowing from the above, wikis ought to strive to be intuitive in their organization. Pages should answer questions that people have, and then provide additional information out from there. One shouldn't have to dig in a wiki for pages, and if there are categories or some sort of hierarchy, there shouldn't be overlap at the tips of various trees.

Strategies that flow from this are:

  • In general, write content on a very small number of pages, and expand outward as you have content for those pages (by chopping up existing pages as it makes sense and using this content to spur the creation of new pages).
  • Use one style of links/hierarchy (wikish and ciwiki fail at this). You don't want people to think: Should this be a camel case link? Should this be a regular one word link? Should this be a multiple word link with dash separated words or underscore separated words? One convention to rule them all.
  • Realize that separate hierarchies of content within a single wiki effectively create separate wikis and sites within a single wiki, and that depending on your software, it can be non-intuitive to link between different hierarchies.
  • As a result: use as little hierarchy and structure as possible. Hierarchy creates possibilities where things can go wrong and where confusion can happen. At some point you'll probably need infrastructure to help make the navigation among pages more intuitive, but that point is always later than you think it's going to be.
  • Avoid reflexivity. This is probably generalizable to the entire Internet, but in general people aren't very interested in how things work and the way you're thinking about your content organization. They're visiting your wiki to learn something or share some information, not to think through the meta crap with you. Focus on that.
  • Have content on all pages, and have relatively few pages which only serve to point visitors at other pages. Your main index page is probably well suited as a traffic intersection without additional content, but in most cases you probably only need a very small number of these pass through pages. In general, make it so your wikis have content everywhere.
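
The "one convention to rule them all" point is easy to enforce mechanically. Here's a small Python sketch, assuming (my choice, not something any particular wiki engine requires) that the convention is lowercase words separated by dashes and that page names are plain ASCII. The function name `canonical_link` is made up for this example.

```python
import re


def canonical_link(name):
    """Normalize a page name to lowercase, dash-separated words."""
    # Split CamelCase boundaries: "WikiPage" becomes "Wiki Page".
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    # Treat any run of non-alphanumeric characters (spaces, underscores,
    # dashes) as a single separator.
    words = re.split(r"[^A-Za-z0-9]+", spaced)
    return "-".join(word.lower() for word in words if word)


for candidate in ("WikiPage", "wiki_page", "Wiki page", "wiki-page"):
    print(candidate, "->", canonical_link(candidate))
```

Every spelling of the same page name collapses to a single link target, so nobody has to stop and wonder which style the wiki uses.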

... and other helpful suggestions which I have yet to figure out. Any suggestions from wiki maintainers?

[1] There are a number of very simple and lightweight wiki engines, including some that run in only a few lines of Perl. Once we had the tools to build dynamic websites (CGI, circa 1993/1994), the wiki became a trivial implementation.
[2] The general principle of building a successful community-edited wiki is basically to pay attention to the community in the early stages. Your first few contributors are very important, contributions have to be invited and nurtured, and communities don't just happen. In the context of wikis, in addition to supporting the first few contributors, the founders also need to construct a substantive seed of content.

A Git of One's Own

My most sincere apologies to Virginia Woolf for the title.

We use a lot of git at work, and I've earned a bit of a reputation as "the git guy," both at work and amongst the folks who read the blog. So, I suppose it should come as no surprise that a coworker (hi stan!) said "You should write something about using git when it's just one person." And I said "well sure, but it's not nearly as interesting as you think it's going to be." He didn't seem to mind, so here I am.

Let's back up for a second.

Git is a tool that programmers use to facilitate collaboration. It stores versions of computer code (and associated files) and saves incremental sets of changes to those files, so that programmers can easily experiment with changes without destroying code, and so that teams of programmers (sometimes even large teams) can all work and develop on a single code base without stepping on each other's toes, or duplicating efforts because they end up working on different versions of the code.

Git makes a number of innovations that make version control with git much preferable (at least in my experience) to other tools, but fundamentally that's what git does. Among the innovations that make it awesome: it's fast, and it can take "diverged branches" and merge them together painlessly and almost automatically. It's great and mind-bending, and I think it really forces us to rethink all sorts of assumptions about authorship, and the coherency of "texts" in general.

But I'm famous for using git all alone, with just me. Here are the lessons and conclusions that I'd draw from my experiences over the past... two or three (or so) years of using git:

  • Use fewer features. Git can do all sorts of funny stuff with branches and merges, but the truth is that when you're working alone you don't really want to have to do much with branches. Because they're the really novel feature (at least in terms of their usability) in git, everyone wants to use them, but they add complexity, and there are other approaches to managing files and content with git that are probably preferable.
  • Resist the temptation to store binary files in git. It'll work, but you won't be really happy with it.
  • Even though you don't need to have a remote repository to push your git repositories to, keep an off-site repository in almost every case. You get incremental backups for free with git, and remote backups are nearly free.
  • Use a tool like gitosis (it's probably in your distribution's repository, and you should use that version) to manage repositories. It's overkill for your use case, but it makes things easier in terms of creating repositories. Perhaps consider something like girocco if you want even more overkill, and more of a web-based interface.
  • There are git tools for most text editors and graphical tools that you may choose to use, but don't, at least until you understand what's going on behind the scenes. Learn git commands, and do stuff from the command line, as you'll be much better (in the long term) at fixing things as issues come up.
  • If you need to maintain multiple machines, think of each machine as a collaborator, and it's probably easiest to have a centralized group of repositories that you can push to in order to keep these machines up to date.
  • If you're using git to manage configuration files (which is great) I strongly recommend having a "sub-home" directory in a git repository with your configuration files, with symbolic links in your home directory pointing to the files in the repository. This strikes me as the right balance between utility and control, without being a total pain in the ass. As it were.
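
The configuration-file tip is the one that's easiest to show rather than describe. What follows is a rough Python sketch, not a tool I actually use: the `link_dotfiles` function and the `config` directory name are invented for the example, and it assumes a POSIX-ish system where creating symbolic links doesn't need special privileges. It links every top-level file in the repository into the home directory and refuses to overwrite anything that already exists.

```python
import os
import tempfile


def link_dotfiles(repo_dir, home_dir):
    """Symlink each top-level file in repo_dir into home_dir.

    Skips the .git directory (and any other directory) and never replaces
    a file or link that already exists in home_dir.
    """
    linked = []
    for name in sorted(os.listdir(repo_dir)):
        source = os.path.join(repo_dir, name)
        target = os.path.join(home_dir, name)
        if name == ".git" or not os.path.isfile(source):
            continue
        if os.path.lexists(target):
            continue  # never clobber what is already there
        os.symlink(source, target)
        linked.append(name)
    return linked


# Demonstration in throwaway directories, so no real home directory is touched.
with tempfile.TemporaryDirectory() as tmp:
    home = os.path.join(tmp, "home")
    repo = os.path.join(home, "config")  # hypothetical "sub-home" repository
    os.makedirs(repo)
    for name in (".emacs", ".gitconfig"):
        with open(os.path.join(repo, name), "w") as handle:
            handle.write("# placeholder\n")
    print(link_dotfiles(repo, home))
    print(os.path.islink(os.path.join(home, ".emacs")))
```

The files themselves live in the repository, so committing and pushing works as usual, and a new machine only needs a clone plus one run of something like this.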

And that's about it.

Ritual, Velocity, and Getting Things Done

I finished chapter eight of the novel that I've been working on for... Oh? Way. Too. Fucking. Long. And you want to know how I'm even comfortable asserting that I'm done with the initial draft of Chapter 8? I wrote an entire scene from Chapter 9 without saying "wow, I need another scene in Chapter 8 so that the story works out." Because I've finished Chapter 8 at least three times, but this time I'm pretty sure. There are a couple of interesting, or at least quasi interesting factors that I think are worth some attention.

For starters, I made some progress on the novel. I know that I don't have endless time to write fiction, and in addition to a day job that requires a bunch of my time and brain cycles, I write the blog, and work on other side projects: fiction, programming related things, the Cyborg Institute, dancing, singing, and occasionally sleeping. These are all projects that are important to me, and I think create value for me (and I hope in their own ways, you as well) so I don't want to sound as if I'm complaining about being too busy and overextended (perhaps I am), but as a result I think I'm being pragmatic to accept a slower pace of development.

At the same time, damn I need to finish this thing. It's good, I'm finally back in a place where I don't hate the story, but at the same time I'm very aware that I've learned a lot.

And perhaps that's the problem with taking so long to finish novels. Not so much that writing is a race, but if you aren't able to pull it off in a reasonable period of time, say 12 months or so, by the time you get to the end you know so much more about the way you work, about how to put together stories, and how to write, that creating a cohesive work becomes an actual challenge. At least for me, the thing I probably want more than anything right now is a chance to work on other fiction projects, to take the lessons that I've learned from writing this story and apply them to writing other projects. I have a great idea for a new story lying around in a text file, but I'm not touching it yet.

The two most important things about being a writer, as far as I'm concerned, are actually writing things (done!) and finishing things (at which I think I get a middling B). So just starting new projects on a whim isn't exactly an option either. So in light of all this, what's my strategy? Fairly simple...

I've set a recurring task in my org system to write 100 words a day on the novel. Just 100 words. And if I know it's not going to happen I can mark the task as "skipped" or do it "late." But the truth is that 100 words is the kind of thing I can do in only a few moments, so it's not only a regular reminder to write, but also an eminently reasonable goal. Not only do 100-word segments add up (in a way that 0-word segments never do), but the real trick is that in my mind I'm not trying to write very much, just enough to get started. If I don't, at least I've made a little progress. If I do, then all the better.

In addition to the regular writing task for the fiction project, I've also started keeping a journal using this method. I've created a recurring task for keeping the journal, and I find this method tends to have a positive effect on my productivity. To-do lists are great for remembering and prioritizing tasks when you have a lot of balls up in the air, but they often fail at tracking real life in a reasonable way. The journal provides a good way to keep track of, and recognize the importance of, all the things that we spend time doing but that don't often have an opportunity to be captured in the to-do list before they get done. I think of it as a sort of inverse to-do list.

It doesn't always work, of course. There are days when I don't get to either one of these tasks, and there are some days where I catch up on one or the other of them. But it's a good practice, and I focus on the things that are important: actually producing something and then also building and maintaining a habit.

Because I don't know how else things get done. Not that I'd be unwilling to listen if you have a better solution. See you in comments...