The Hard (GNU/Linux) Truth

Backstory: Chris is something of an operating system junkie, and likes to play around with things. I think he’s weird, but whatever. Also, he bought a netbook several months ago, and after much persistence on my part (and some epic failures of Ubuntu installations), he finally installed Arch Linux, and it worked amazingly well. Here’s a funny (ok, mildly amusing?) conversation about his latest system plan, with only minor editing for understandability and comedic value:

Chris: I was thinking that I’d move some stuff off of my second internal hard drive and install the alpha version of Ubuntu to see how it works.

tycho: How it works? Like crap. It’s Ubuntu, so it’s meant to be easy to install and usable, not fresh, robust, and hardened. Besides, it’s an alpha; if you want stability, just install Arch and be done with it.

Chris: [silence and pause]

tycho: Now that we’ve resolved this quandary, what’s next?

Chris: [sighs and laughs] Nothing, really. [pause] I’m downloading Arch now, asshole.

tycho: [laughs] You’re welcome.

I don’t actually use Arch, because Ubuntu has been simple and I’ve yet to have a problem with it, but I would use Arch if I needed it, and I (seem to) recommend it to all my friends who are really geeky and are having problems with Debian/Ubuntu.

shrug

Agile Writing

Let’s put this in the category of “tycho writing about software development in an attempt to draw a conclusion beyond software development.” Often I find this an annoying impulse, as software can be meaningful in and of itself, and its practices aren’t always relevant elsewhere. On the other hand, most of my work is (at least theoretically) not software, so I find myself doing this kind of thing more than I’d really like. So be it.

Agile Development refers to a set of practices that encourages developers to review their progress regularly, to write code in testable units, and to consult with the client regularly, letting the client lead the design process to reflect the reality that requirements, contexts, and possibilities change as a result of the ongoing development process. Extreme Programming (XP) is probably the most famous subset of Agile Development, and I think both are interesting (and popular) because they promote a kind of flexibility and respond to (and draw from) the creative impulse. XP takes the iterative/test-driven Agile philosophy and does “wacky” things like “pair programming,” where two developers take turns typing and monitoring the coding process. I’ve, of course, not really worked in these situations, but I am fascinated by the possibilities.

I often think about the implications of these kinds of methodologies on the work I do (writing). I have yet to be convinced that this is an entirely productive impulse, but that never stops me.

The key feature of Agile development--to my mind--is that it’s built around multiple iterations. Rather than concentrating on getting all of the details right the first time, the goal is to get something working and then expand/refactor/revise, getting review at each pass, so that through successive iterations you end up with a solid, relevant, and sturdy result. Once you have iterations, getting customer review is easier (because there’s something to evaluate), testing is easier, and collaboration is easier.

Writers already have a sense of drafts, and in that sense this is the way we always work. In another sense, we don’t seek feedback on most drafts, so while we might revise in a couple of “lumps,” editorial collaboration is pretty minimal during the writing process. That’s not a bad thing, just a commentary on the analogy. Writing collaboratively is also damned hard, and so collaborations are more often based on structural divisions (eg. “you write parts one, four, five, and seven; and I’ll write two, three, six, and eight”), or, in larger groups, require dedicated editorial nodes/contributors to organize logistics.

True story: I wrote an academic paper with someone (we lived next door to each other at the time), and as I remember it, we tended to do something very much like “pair programming”: I’d drive (type) and she’d navigate (read over my shoulder), or she’d type and I’d pace, though I think I tended toward the typing role for any number of reasons. It worked, but we had (and have) such different approaches to writing that thinking about it sort of boggles the mind.

In another sense, posting rough drafts of work on the Internet (Critical Futures; Cory Doctorow’s podcast; Sam Starbuck’s projects; etc.) is another way to get the kind of ongoing feedback that features so prominently in the Agile/XP methodologies.


The truth is that I had expected to talk about how programming and writing are fundamentally different, and how, while Agile and XP are really powerful ways to think about the creation of programs, the creation of novels, stories, and essays can’t work that way.

While I was able to find some parallels, and examples to the contrary, there are so many features of the way that I write, the way that I create, that run quite counter to the “agile way”:

  • I don’t do iterative drafting very well. I write something, I run through it twice, someone else gives feedback, I run through it once more, and it’s either good enough to do something with at that point, or I abandon it.
  • We mysticize the creative process, particularly for “artistic” creation. I don’t particularly think of myself as an artist, but regardless: because we’re not very good at articulating our creative processes (and are generally unwilling to change the way we work much), there isn’t a lot of willingness to change how we write.
  • Collaboration is a challenge because of the aforementioned mysticism, and because individuals are (in most cases) capable of writing the long forms (novels, screenplays) by themselves, collaboration isn’t a vital necessity. The counter-example would be what happens in the writing rooms of television shows, I suppose, though I haven’t worked in those situations. Not that I’d be opposed, if someone wanted to hire me to do that ;).
  • Writers make their money (at least as we’re taught to think) by selling publication rights. Iterative work requires frequent publication, which discourages working this way. Obviously there are other business models and other kinds of writing, but generally speaking…

Writing this has inspired me to move back in the direction of posting to Critical Futures, and to work harder on collaborative projects. I’ve been stuck in my own writing, as life and an iterative hump have combined to really take me out of the game for a while. While I doubt any change in methodology could really make me less linear, it is helpful to think about process in new and different ways. In point of fact, everyone works eclectically anyway, but just thinking about how we/I work has some worth. Of that much I’m sure.

Onward and Upward!

WordPress Limitations

WordPress is great software, and I’ve been a user for many years. Many years: it used to be called “b2,” and I used it then as well. These days there are a lot of more powerful content management systems, a lot of systems that are much more flexible than WordPress, and often I get the feeling that other platforms attempt to define themselves in contrast to WordPress. In the larger sense, this post is an attempt to resist that temptation while also exploring the limitations of WordPress.

WordPress is a pure blogging engine: it provides interfaces for writers to publish weblogs (blogs), manage content (to some degree), and generate pages based on templates. Before WordPress, blogging was done either with hand-edited text files or with systems that compiled static HTML from some sort of database.1 WordPress was an improvement because it’s easy to install, it’s reliable, and pages generate dynamically on viewing, rather than only when the site owner hits “save” or “rebuild.” In the end, we discovered that systems where managing “websites” was divorced from (even simple) server management had a great democratizing effect on content, and that’s sort of the core of WordPress.

Because WordPress is designed to be a blogging platform, it doesn’t need to be as flexible as more generalized content management systems. Flexibility comes at the cost of complexity, and the developers decided that in some cases less was, in fact, more. There were a lot of things that you could do with b2 (albeit with some hacking) because the site generation/templating system was much less rigid; at the same time, it was much easier to end up with sites with broken links and bad pages, particularly as you changed from theme to theme. That’s bad, and it seems pretty reasonable to me to want to avoid it.

The end result is a program that does almost everything you could want it to do, as long as you only want a blog; if you try to stretch it too far, it simply won’t work. Well, it will work, but the advantage of using WordPress to manage a website that isn’t a blog (or very similar to one) disappears quickly when you have to impose informal limitations on how you enter content into the system in order to generate well-formatted pages. It’s a slippery slope, and you’d be surprised how quickly a site goes from being a standard WordPress site to requiring customized themes and specialized content-entry patterns. Pretty soon, a lot of the things that make WordPress “simple” and “easy” aren’t really available to your new site. That’s the limitation of WordPress.

Knowing where that line is, is often the largest challenge in WordPress development: being able to say, “you know, this is the kind of site that you really want to be building with Django, or Drupal, or Rails, or Expression Engine,” or even, “you know, this is the kind of site that we could probably do more effectively using flat files and PHP includes.” WordPress is great, and in the cases where it’s well suited to the task at hand, it’s the ideal solution. In other situations? Less so.

Onward and Upward!


  1. Interestingly, this whole “static site compiling” business is making a comeback, because it turns out that dynamic page generation doesn’t scale as well as we thought it would five or six years ago. So we have static site compilers and complex caching tools. What goes around comes around, I guess. ↩︎

Sapir-Whorf Hypothesis and Computer Programming

There’s this idea in linguistic/cognitive anthropology that the limits of linguistic possibility bound what we’re able to think about: if we lack words for a given thing or concept, it’s really hard to even conceive of it. I’ll get to the strengths and limits of the hypothesis in a bit, but let’s just say the reception of these ideas (i.e. “linguistic relativism”) is somewhat mixed. Nevertheless, it’s had a great impact on me and the kinds of ideas I deal in.

For instance, though I’m not an active programmer, I talk to programmers a bunch, I tweak code from time to time, and I’ve tried to learn programming enough times that I sort of get the basics well enough to know what’s going on. If there’s one theme to my interest in open source and software development, it’s looking at the software and tools that developers use, in part because of issues related to linguistic relativism. Basically, if the people who develop programming languages, and software itself, don’t provide for possibilities, developers and users won’t be able to think about those things downstream. Or at least that’s the theory.*

The problem with linguistic relativism in general is that it’s really hard to test, and we run into causality issues. “Describe this thing that you don’t know about!” is a bad interview tactic, and we run into questions like: is it really language that limits knowability, or is it some other combination of typical experiences that limits both knowability and language? I’ve read a number of papers from a couple of different scholars, and I almost always end up in the “relativist camp,” but that might be a personality feature.

In computer science, I suppose it’s also not quite so cut and dried. Questions like “does the advancement of things like hardware limit technical possibility more than programming languages and tools do?” come up, but I think for the most part the case is clearer: Erlang’s concurrency model makes things possible, and makes programmers think in ways that they’re not prone to thinking otherwise. Git’s method of promoting collaboration requires people to think differently about authorship and collaboration. Maybe. I mean, it makes sense to me.

These low-level tools shape what’s possible at the higher levels, not simply in that a programming language implements features that are then used to build higher-level applications, but in that if you teach someone to program in emacs-lisp (say), they’ll think about building software in a very different way from the folks who learn to program in Java. Or Perl. Or PHP.
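To make that a little concrete, here’s a toy sketch of my own (the function name is invented for this example, nothing canonical): in emacs-lisp the program is the running environment, so you grow the editor from inside while it runs, which is a very different mental model from compiling and deploying an artifact.

;; A toy example (the name is mine, made up for illustration):
;; evaluate this in the *scratch* buffer and emacs immediately gains
;; a new interactive command, with no compile/deploy cycle in sight.
(defun tychoish-count-buffer-lines ()
  "Report the number of lines in the current buffer."
  (interactive)
  (message "This buffer has %d lines."
           (count-lines (point-min) (point-max))))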

And those differences work their way down the software food chain: what programmers are able to code limits the software that people like you and me use on a day-to-day basis. That’s terribly important.

I think the impulse when talking about open source and free software is to talk about subjects that are (more) digestible to non-technical users, and to provide examples of software projects that are more easily understood (e.g. Firefox and OpenOffice rather than gcc and the Linux kernel, say). This strikes me as the wrong impulse: we could focus on talking about more technical projects, and then use abstractions and metaphors to reach a more general audience as needed. I’m not saying I’ve mastered this, but I’m trying, and I think we’ll ultimately learn a lot more this way. Or so I hope. There is always much to learn.

Onward and Upward!

* In a previous era/maturity state I would have been somewhat guarded about this subject, which I think is going to be a major theme in, oh, the rest of my life. But this is open source, and knowledge wants to be free and all that. Actually, less altruistically, I’m much more worried that it’s a lackluster research question than I am that someone is going to “snipe it from me,” and I think suggestions and challenges would be really productive.

Committing From the Bottom Up

My blog-reading eyes/ears tend to perk up when I see someone writing about git, as this piece of software fascinates me in a potentially unhealthy sort of way. I read a post the other day that talked a bunch about git; about centralized SCM tools like SVN and CVS; and about the other distributed SCM, Bazaar. If that last sentence was Greek to you, don’t worry: I’m heading into a pretty general discussion. Here’s the background:

Version control or source control management systems (VCS/SCM) are tools that programmers use to store the code of a program or project as they develop it. These tools store versions of a code base, which has a lot of benefits: programmers can work concurrently on a project and distribute their changes regularly, to avoid duplicating effort or working on divergent editions of the code. SCMs also save your history: in case you change something that you didn’t intend to, you can go back to known working states, or “revive” older features that you’d deleted. It’s a good thing, and I’d wager that most programmers use some sort of system for this task.1

The basic unit of any version control system is the “commit,” which represents a set of changes that a given developer chooses to “check in” to the system. There are two basic models of VCS/SCM: the centralized client/server system and the distributed system. In a centralized system, the history is stored on a central server or machine, and a group of developers all send changes to and pull changes from that central “repository.” Distributed systems give every developer in a project a copy of the full history, and the capability of sending or pulling changes from any other developer in the system.


There’s a lot of discussion about the various merits of distributed and centralized version control systems, and a lot of it ends up being hashed out over technological features, like speed and the relative ease of various operations, or over process features, like what a system allows or promotes in terms of workflow. While these discussions are interesting, they’re too close to the actual programs to see something that I think is pretty interesting.

In centralized systems, “the commit” is something that serves the project’s management. If done right (so the theory goes), in a centralized system only a select few have access to submit changes, since the central server’s only way of reconciling diverging versions of a code base is to accept the first submitted change (a poor solution), and the more developers you have, the greater the chance of version collisions. As a result, there’s a lot less committing. In big projects you still have to mail patches around, because only a few people can commit changes; in smaller teams, people are more likely to “put off committing,” because frequent commits of incremental changes are more likely to confuse teammates, and committing amounts to publication.

In distributed systems, since the whole “repository” is stored locally, committing changes to your repository and publishing changes to collaborators are separate operations. As a result, there’s less incentive for developers to avoid creating commits for incremental changes. Rather than marking complete working states, with a lot of changes in every individual commit, commits in a distributed system mark points of experimentation.

This difference is really critical. Commits in a centralized system serve the person who “owns” the repository, whereas in a distributed system they serve the developer. There are other aspects of these programs that affect the way developers relate to their code, but I think on some fundamental level this is really important.

Also, I don’t want to make the argument that “bottom-up distribution = good and top-down centralization = bad,” as I think it’s more complicated than that. It’s also possible to use distributed technology in centralized workflows, and if you use a centralized system with the right team, the top-down limitation isn’t particularly noticeable. But as a starting point, it’s an interesting analysis.


  1. So common are they that I was surprised to learn that the Linux kernel (a massive project) spent many, many years without any formal system to manage these functions. They used “tar balls and patches, for years,” which is amazing. ↩︎

org-mode snippets

I promised that I’d post some of the stuff from my .emacs file that makes my org-mode system work. Here we are.

There are some basic settings that I use with all major modes in emacs. Basically, I want to attach the spell checker and automatic line filling (the minor modes flyspell-mode and auto-fill-mode). These lines do this:

(add-hook 'org-mode-hook 'turn-on-auto-fill)
(add-hook 'org-mode-hook 'flyspell-mode)
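(The same pattern generalizes, by the way. If you wanted the spell checker and auto-fill everywhere you write prose, not just in org-mode, something like the following would do it; consider it a sketch rather than part of my actual setup:)

(add-hook 'text-mode-hook 'turn-on-auto-fill)
(add-hook 'text-mode-hook 'flyspell-mode)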

I also attached “.org” as the file extension for org-mode. This setting does that:

(setq auto-mode-alist
      (cons '("\\.org$" . org-mode) auto-mode-alist))
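(An equivalent, and arguably tidier, spelling of the same thing uses add-to-list; a sketch of the alternative, not what I actually run:)

(add-to-list 'auto-mode-alist '("\\.org$" . org-mode))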

The following is a list of basic org-mode settings that I’ve found helpful. In sequence: I keep my org-mode files in the ~/org/ directory, with codex.org as my general catch-all file; I like my agenda views to include todos even if they’re not date-specific (this is a great boon); and I’ve included the diary in the agenda views for grins, though I’m not yet smart enough to really make the most of that one.

The odd-levels-only and hide-leading-stars settings are aesthetic only, and can be changed (and converted from) at any point, but I like them.

The org-todo-keywords setting allows you to specify alternate todo statuses. I’ve found that this sorting is useful, and it allows me to visually separate things I need to write from chores and other more clerical tasks. The pipe separates finished statuses from open statuses. I debated for a long time about whether “deferred” should be “done” or “not done,” but decided that with “pending” available, I was safe to use “deferred” for “not my problem any more” items.

(setq org-directory "~/org/")
(setq org-default-notes-file (concat org-directory "/codex.org"))
(setq org-agenda-include-all-todo t) ; agenda shows todos without dates
(setq org-agenda-include-diary t)    ; fold the diary into agenda views
(setq org-hide-leading-stars t)      ; aesthetics, as above
(setq org-odd-levels-only t)
(setq org-todo-keywords
      '((sequence "TODO"
                  "WRITE"
                  "REVIEW"
                  "PENDING" "|" ; the pipe separates open from finished states
                  "DEFERRED"
                  "DELEGATED"
                  "DONE")))

The next bit is something that I got from Jack. It creates org-mode headlines with time stamps, which you can use to keep a journal file recording daily activities.

The first block sets up which file the journal lives in, and the second sets up the entry command. My main complaint is that I’m not very habitual about using it.

(defvar org-journal-file "~/org/journal.org"
   "Path to OrgMode journal file.")
(defvar org-journal-date-format "%Y-%m-%d"
   "Date format string for journal headings.")

(defun org-journal-entry ()
  "Create a new diary entry for today or append to an existing one."
  (interactive)
  (switch-to-buffer (find-file org-journal-file))
  (widen)
  (let ((today (format-time-string org-journal-date-format)))
    (goto-char (point-min))
    ;; If there's no heading for today yet, create one at the top.
    (unless (org-goto-local-search-headings today nil t)
      (org-insert-heading)
      (insert today)
      (insert "\n\n  \n"))
    (goto-char (point-min))
    ;; Show today's entry, narrow to it, and leave point at the end,
    ;; indented and ready for a new line of text.
    (org-show-entry)
    (org-narrow-to-subtree)
    (goto-char (point-max))
    (backward-char 2)
    (unless (= (current-column) 2)
      (insert "\n\n  "))))

The integration between remember-mode functionality and org-mode is one of those things that makes org-mode amazing and awe-inspiring. The sad part is that it takes some setup to make it work right, and therefore doesn’t work straight out of the box.

I’d explain the template syntax at more length if I understood it better; I’ve noted the basics in the comments below, but I should look into it properly.

(require 'remember)
(setq remember-annotation-functions '(org-remember-annotation))
(setq remember-handler-functions '(org-remember-handler))
(add-hook 'remember-mode-hook 'org-remember-apply-template)

;; Each template is: name, selection key, template string, target
;; file, and (optionally) a headline to file the entry under. In the
;; template strings, %? leaves the cursor, %i inserts the active
;; region, %a inserts a link back to where remember was invoked, and
;; %U inserts an inactive timestamp.
(setq org-remember-templates
      '(("todo" ?t "* TODO %?\n  %i\n  %a" "~/org/codex.org" "Tasks")
        ("notes" ?n "* %?\n  %i\n  %a" "~/org/codex.org" "Inbox and Notes")
        ("blog" ?b "* %U %?\n\n  %i\n  %a" "~/org/blog.org")
        ("technology" ?s "* %U %?\n\n  %i\n  %a" "~/org/technology.org")
        ("fiction" ?f "* %U %?\n\n  %i\n  %a" "~/org/fiction.org")))

Finally, key bindings that make org-mode functionality accessible whenever I need it in emacs. I should do something to raise emacsclient windows from other applications, but I’ll deal with that later. There aren’t that many bindings, and I put the org-mode stuff under control-c (C-c).

(global-set-key "\C-ca" 'org-agenda)
(global-set-key "\C-cr" 'org-remember)
(global-set-key "\C-cj" 'org-journal-entry)

And that’s it. If you use org-mode, what’s the killer snippet that I’ve forgotten? If you don’t use org-mode but are curious, what should I talk about next? If you’re still not clear on what org-mode is, ask, as I should work on getting better at explaining it.

Thanks for reading. Cheers!

Open Source Userland

Free software and open source users/developers are an evangelical bunch. I think a lot of this is because hackers want other people to use the software that they spend their time working on, and of course some of it is because of the ethical systems that pervade the free software movement. And of course we want both to expand the user base of certain pieces of software within the open source world (eg. getting vim users to use emacs) and to get people using proprietary systems (like Windows/OS X/Microsoft Office) to use free/open systems (like Linux/BSD/emacs).

The biggest challenge in the second project is usability, and I think both prospective users and developers (and people like me) often wonder: “Is open source usable for non-technical users?” This is a question that I don’t have an answer for. On the one hand, yes, GNOME--for instance--is really usable. I don’t think it’s particularly innovative software, nor is it clever in the way that OS X sometimes is, but it is, on the whole, very functional.

Very often open source, in its entirety, is judged on the basis of its usability, which strikes me as pretty ironic, as I’d wager most open source projects--and without a doubt the most influential ones--are not “user applications.” In terms of importance, the kernels, the programming languages, the libraries, the servers, and the frameworks are far more successful, powerful, and robust than programs like OpenOffice, or GNOME, or even--frankly--Firefox.

I suspect this is the case because lower-level stuff is easier to get right, and because hackers end up working with computers at a very low level, so it makes sense that the itches they’re scratching with open source would be at a lower level too. The “cause of free software” is also more directly served by these lower-level projects: open source depends on users recognizing the value of hacking on code, which is more likely to be realized in low-level projects.

This makes the project of evangelizing non-technical users more difficult, because the argument isn’t exactly “switch programs to (potentially) better ones,” but rather “become more involved in the technology you use,” which is a much different argument. And I think the “usability” question often serves as a point of mystification in that much different argument.


My original intent with this post was to explore how some of the biggest open source user applications are in fact sponsored (in whole or in part) by really big companies. Novell puts considerable resources into GNOME and KDE; Sun obviously backs OpenOffice; Firefox and Mozilla grew out of Netscape/AOL and get a lot of money from Google.

More than anything, I wonder what to make of this. Certainly there is also backing for lower-level projects: Sun and Java/MySQL; countless companies and kernel development; 37signals and Ruby on Rails; and so forth. But it feels more substantial for user applications, somehow.

I wouldn’t go so far as to suggest that corporations are attempting to re-mystify technology in open source. I think it’s much more likely that businesses know that having viable desktop environments will be advantageous to them in the long run, and that since hackers are less likely (on the whole) to work in the user-application space, the key contributions of corporate-backed developers are more noticeable there.

But maybe there’s something else there too. I’m not sure. Isn’t the world grand?

Onward and Upward!

Linux Emergence

Here’s a little bit about emergence/systems theory and open source, as promised in my change process post (http://tychoish.com/posts/theories-of-change) a while back.

I was reading this article about Linux and complexity theory last week, which I think is a pretty good frame for any discussion of open source and Linux from a systems approach. Still, there are a few things I take issue with. First, the article is 3+ years old, and so it doesn’t have the benefit of seeing what’s happened with Ubuntu Linux, netbooks, the AGPL, Rails and web 2.0, let alone things like Drupal and the last two major iterations of Firefox, which for all their assorted faults have really changed the face of open source.

Secondly, the empirical study focuses on kernel development, and takes the kernel to represent the entire Linux ecosystem, which it doesn’t do very well. Kernel development is really specialized, really niche, and despite its size it represents very little of what people think about when they talk about “Linux.” GNOME, KDE, the GNU toolchain, X11, python/ruby/perl, let alone the superstructural elements that projects like Debian, Gentoo, and Arch represent, are really more important than the kernel. Debian and Gentoo more or less work with various BSD kernels, and given enough money--to inspire interest and caring--full BSD-based releases wouldn’t be technologically difficult. The dominance of the Linux kernel in the free operating system space is--I think--largely the result of momentum, and of the fact that the kernel is damn good and there’s not a lot of need for another kernel option.

In any case, the paper is in a lot of ways a 21st-century review of Eric S. Raymond’s “The Cathedral and the Bazaar.” Raymond’s argument is taken as a statement in favor of bottom-up organization (Linux, and the bazaar) in open source projects, and the author uses this insight to explore kernel development with some good old systems theory.

While it’s very true that there isn’t a monolithic organization behind kernel development, it’s not exactly a free-for-all. Linus (or whoever is at the top of a project) provides a measure of top-down structure, and this makes it easier for contributions to bubble up from the bottom. I think it’s no mistake that many open source projects have consistent “dictator”-type leaders.

This says nothing of the barriers to entry for most kinds of development in the open source world, which aren’t trivial (commit access on projects that use non-distributed version control, eg. Drupal), let alone the burden of engineering knowledge for kernel-level development and other lower-level projects. These largely informal standards nonetheless constrain what happens in kernel development.

Even if a lot of the day-to-day work on the kernel isn’t done by Linus himself, his presence, and the importance of his branch/tree, gives the project structure--particularly before the advent of git, but even now. And I’m not saying that this is a bad thing--quite the contrary, top-down forces are often a good thing--but I do think that overly romantic depictions of open source development as bottom-up/anarchical aren’t productive.

Does this mean that Linux is a Cathedral? Or that it’s headed that way? Not at all. But I don’t think the Bazaar is necessarily as bottom-up as Raymond (and those who have followed) thought it was. Open source is also much more commercial now than it was ten or twenty years ago; that can’t not have an impact.

Just thoughts…

Onward and Upward!