Enterprise Linux Community

Ok. I can't be the only one. [1]

I look at open source projects like OpenSolaris, Alfresco, Resin, Magento, OpenSuSE, Fedora, and MySQL, among others, and I wonder, "What's this community around these projects that people are always talking about?" Sure, I can download the source code under licenses that I'm comfortable with, and sure, they talk about a community, but what does that mean?

What, as a company, does it mean to say that the software you develop (and likely own all the rights to) is "open source" and "supported by a community"?

If I were sensible, I'd probably stop writing this post here. From the perspective of the users of and participants in open source software, this is the core question, both because it dictates what we can expect from free software and open source, and more importantly because it has been historically ill-defined.

There are two additional, but related, questions that lurk around this question, at least in my mind:

1. Why are new open source projects only seen as legitimate if the developers are able to build a business around the project?

2. What does it mean to be a contributor to open source in this world, and what do contributors in "the community" get from contributing to commercial projects?

There are of course exceptions to this rule: the Debian Project, the Linux kernel itself, GNU packages, and most open source programming languages, among others. I'd love to know if I've missed a class of software in this list--and there's one exception that I'll touch on in a moment--but the commonality here is that these projects are so low level that it seems too hard to build businesses around them directly.

When "less technical" free software projects began to take off, I think a lot of people said "I don't know if this open source thing will work when the users of the software aren't hackers," because after all, what does open source code do for non-hackers? While it's true that there are fringe benefits that go beyond the simple "free as in beer" quality of open source for non-hacker users, these benefits are not always obvious. In a lot of ways the commercialization around open source software helps add a buffer between upstreams and end users. This is why I included Debian in the list above: Debian is very much a usable operating system, but in practice it's often an upstream of other distributions--Ubuntu, Maemo, and so forth.

The exception that I mentioned is, to my mind, projects like Drupal and web development frameworks like Ruby on Rails and Django. These communities aren't sponsored or driven by venture capital funded companies, though the leader of the Drupal community has taken VC money for a Drupal-related startup. I think the difference here is that the economic activity around these projects is consulting based: people use Drupal/Django/Rails to build websites (which aren't, by and large, open source) for clients. In a lot of ways these are much closer to the "traditional free software business model," as envisioned in the eighties and nineties, than what seems to prevail at the moment.

So to summarize the questions:

  • What, as a company, does it mean to say that the software you develop (and likely own all the rights to) is "open source" and "supported by a community"?
  • What does it mean to participate in and contribute to a community around a commercial product that you don't have any real stake in?
  • How does the free software community, which is largely technical and hacker centered, move beyond its roots to deal with and serve end users?
  • How do we legitimize projects that aren't funded with venture capital money?

Onward and Upward!

[1] I think, and hope, that this is the post I meant to write when I started writing my post on the work of open source.

Analyzing the Work of Open Source

This post covers the role and purpose (and utility!) of analysts and spectators in the software development world, particularly in the open source subset of it. My inspiration for this post comes from a video of Coté's.

In the video, Coté says (basically) that open source projects need to be able to justify the "business case" for their project, to explain what innovation the project seeks to provide the world. This is undoubtedly a good thing, and I think we should probably all be able to explore, clearly explain, and even justify the projects we care about and work on in terms of their external worth.

Project leaders and developers should be able to explain and justify the greater utility of their software clearly. Without question. At the same time, problems arise when all we focus on is the worth: people become oblivious to how things work, and become unable to successfully participate in informed decisions about the technology that they use. Users without an understanding of how a piece of technology functions are less able to take full advantage of that technology.

As an aside: one of the things that took me forever to get used to about working with developers is the terms in which they describe their future projects. They speak in the future tense with much more ease than I would ever consider: "the product will have this feature" and "it will be architected in such a way." From the outside this kind of talk seems unrealistic and grandiose, but I've learned that programmers tend to see their projects evolving in real time, and so this kind of language is really more representative of their current state of mind than of their intentions or lack of communication skills.

Returning for a moment to the importance of being able to communicate the business case of the projects and technology that we create: as we force the developers of technology to focus on the business cases for the technology they develop, we also make it so that the only people who are capable of understanding how software works, or how software is created, are the people who develop software. And while I'm all in favor of specialization, I do think that the returns diminish quickly.

And beyond the fact that this leads to technology that simply isn't as good or as useful in the long run, it also strongly limits the ability of observers and outsiders ("analysts") to provide a service for the developers of the technology beyond simply communicating their business case to the outside world. It restricts all understanding of technology to journalism, rather than the sort of "rich and chewy" (anthropological?) understanding that might be possible if we worked to understand the technology itself.

I clearly need to work a bit more to develop this idea, but I think it connects with a couple of previous arguments that I've put forth in these pages: one regarding Whorfism in Programming, and another on constructing rich arguments.

I look forward to your input as I develop this project. Onward and Upward!

The Successful Failure of OpenID

Just about the time I was ready to call OpenID a total failure, something clicked, and if you asked how I thought "OpenID was doing," I'd have to say that it's largely a success. But it certainly took long enough to get here.

Let's back up and give some context.

OpenID is a system for distributing and delegating authentication for web services to third party sites. To the end user, rather than signing into a website with your username and password, you sign in with your profile URL on some secondary site that you actually log into. The site you're trying to log into asks the secondary site "is this legit?"; the secondary site prompts you (usually just the first time, though each OpenID provider may function differently here), and then you're good to go.
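
To make that handshake a bit more concrete, here's a deliberately simplified sketch of the first leg of an OpenID 1.x login from the relying party's side, in Python. The function names and the return URL are things I've made up for illustration; association and signature verification are omitted entirely:

from urllib.parse import urlencode

def discover(user_supplied_url):
    # Stub for illustration: in practice you fetch the page and parse its
    # openid.server and openid.delegate link tags (see the parsing sketch
    # below). Hardcoded here to the values from this site.
    return ("http://www.livejournal.com/openid/server.bml",
            "http://tychoish.livejournal.com/")

def begin_login(user_supplied_url):
    server, delegate = discover(user_supplied_url)
    params = {
        "openid.mode": "checkid_setup",        # "please authenticate this user"
        "openid.identity": delegate or user_supplied_url,
        "openid.return_to": "https://example.com/openid/return",
    }
    # Redirect the browser to this URL; the provider prompts the user
    # (usually only the first time) and sends back a signed assertion.
    return server + "?" + urlencode(params)

print(begin_login("http://tychoish.com/"))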

Additionally--and this is the part that I really like about OpenID--you can delegate the OpenID of a given page to a secondary host. So on tychoish.com you'll find the following tags in the header of the document:

<link rel="openid.server" href="http://www.livejournal.com/openid/server.bml" />
<link rel="openid.delegate" href="http://tychoish.livejournal.com/" />

So I tell a third party site "I wanna sign in with http://tychoish.com/ as my OpenID"; it goes and sees that I've delegated tychoish.com's OpenID to LiveJournal (incidentally the initiators of OpenID, if memory serves), and LiveJournal handles the authentication and validation for me. If at some point I decide that LiveJournal isn't doing what I need it to, I can change these tags to a new provider, and all the third party sites go talk to the new provider as if nothing happened. And it's secure, because I control tychoish.com and thus maintain a provider-independent identity, while still making use of these third party servers. Win.
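
For the curious, here's a minimal sketch of that discovery step, using only Python's standard library (a production consumer would use a proper OpenID library rather than anything this naive):

from html.parser import HTMLParser

class OpenIDLinkParser(HTMLParser):
    """Collect openid.server and openid.delegate link tags from a page."""
    def __init__(self):
        super().__init__()
        self.endpoints = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attrs = dict(attrs)
        if attrs.get("rel") in ("openid.server", "openid.delegate"):
            self.endpoints[attrs["rel"]] = attrs.get("href")

page = """
<link rel="openid.server" href="http://www.livejournal.com/openid/server.bml" />
<link rel="openid.delegate" href="http://tychoish.livejournal.com/" />
"""

parser = OpenIDLinkParser()
parser.feed(page)
# The consumer site now knows where to send the authentication request,
# and which identity to ask the server about.
print(parser.endpoints)
# {'openid.server': 'http://www.livejournal.com/openid/server.bml',
#  'openid.delegate': 'http://tychoish.livejournal.com/'}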

The thing is that OpenID never really caught on--or, I should say, it took a very long time to be taken seriously--even though managing a single set of authentication credentials and a common identity across a number of sites has a lot of benefits for users. There are a number of reasons for this, in my understanding:

1. Third party vendors wanted to keep big user databases with email addresses. OpenID means, depending on the implementation, that you can bypass the traditional sign up method. This isn't a technological requirement, but it can be confusing in some instances. By giving up the "traditional" value associated with sponsoring account creation, OpenID seemed like a threat to traditional web businesses. There were ways around this, but it's confusing, and as is often the case, a dated business model trumped an inspiring new one.

2. There was, and is, some FUD around security. People thought that if they weren't responsible for the authentication process, they wouldn't be able to ensure that only the people who were supposed to get into a given account could do so, particularly since the only identifying information associated with an account was a publicly accessible URL. Nevertheless it works, and I think people used these details to make the system feel less secure than it actually is.

3. There are some legitimate technological concerns that need to be sorted out, particularly around account creation; this is the main confusion cited above. If someone signs up for an account with an OpenID, do they get a username and have to enter that, or do we just use the OpenID URL? Is there an email address or password associated with the account? What if they get locked out and need to get into the account, but there's no email? What if they need to change their OpenID provider/location at some point? These are legitimate concerns, but they're solvable problems.

4. Some users have had a hard time grokking it: because OpenID breaks with the conventional usage model and makes signing into sites so simple, it can be hard to wrap your head around.

What's fascinating about this is that eventually it did succeed. More even than my joy at finally getting to use OpenID, I think OpenID presents an interesting lesson in the eventual success of emergent technological phenomena. Google accounts, Flickr accounts, and AIM accounts all provide OpenIDs. And although "Facebook Connect" is not using OpenID technology, it's conceptually the same. Sites like StackOverflow have OpenID-only authentication, and it's becoming more popular.

OpenID succeeded not because of a campaign to teach everyone that federated identity by way of OpenID was the future and the way we should interact with web services, but rather because the developers of web applications learned that this was the easier and more effective way to do things. And I suspect that in as much as 80% or 90% of cases, when people use OpenID they don't have a clue that that's the technology they're using. And that's probably an OK thing.

The question that lingers in my mind as I end this post is: does this story parallel any other optimistic technology that we're interested in right now? Might some other "Open*" technology take away a strategic lesson from the tactical success of OpenID? I'd love to see that.

Onward and Upward!

File System Metaphors

The file system is dead. Long live the file system.

We live in an interesting time. There are two sets of technologies that aim to accomplish two very different goals. On the one hand we have things like Amazon's S3, Hadoop, NoSQL, and a host of technologies that destroy the file system metaphor as we know it today. The future, if you believe it, lies in storing all data in some sort of distributed key/value store-based system. And then, on the other hand, we have things like FUSE that attempt to translate various kinds of interfaces and data systems onto the file system metaphor.

OK, so the truth is that the opposition between the "let's replace file systems with non-file-based data stores" folks and the "let's use the file system as a metaphor for everything" folks is totally contrived. How data is stored and how we interact with data are very different (and not always connected) problems.
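
To make the distinction concrete, here's a contrived sketch (the class names are invented for illustration): the same flat key/value store can sit behind a file-path metaphor without either layer caring how the other works.

class KeyValueStore:
    """An S3-style store: opaque keys, no hierarchy."""
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data[key]

class FileSystemView:
    """A FUSE-style view of the same store: paths are just keys."""
    def __init__(self, store):
        self._store = store

    def write(self, path, contents):
        self._store.put(path, contents)

    def read(self, path):
        return self._store.get(path)

    def listdir(self, prefix):
        # Fake a directory listing by filtering on the key prefix,
        # which is roughly what S3-backed FUSE layers actually do.
        return [k for k in self._store._data if k.startswith(prefix)]

store = KeyValueStore()
fs = FileSystemView(store)
fs.write("/notes/todo.txt", "write blog post")
print(fs.listdir("/notes/"))          # ['/notes/todo.txt']
print(store.get("/notes/todo.txt"))   # same data, different metaphor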

Let's lay down some principles:

  • There are (probably) more tools to interact with, organize, manage, and manipulate files and file system objects than there are for any other data storage system in contemporary technology.

  • Most users of computers have some understanding of file systems and how they work, though clearly there is a great diversity of degrees here.

  • In nearly every case, only one system can have access to a given file system at a time. In these days of massive parallel computing and ever larger computer networks (with their associated latency), this has become a rather substantial limitation.

  • From the average end user's perspective, it's probably the case that file systems provide too much flexibility, and can easily become disorganized.

  • There are all sorts of possible problems regarding consistency, backups, and data corruption that all data storage systems must address, but that present larger problems as file systems need to scale to handle bigger sets of data, more users, and attach to systems that are more geographically disparate.

Given these presumptions, my personal biases and outlook, and a bit of extrapolation, here's a basic feature set for an "information storage system." These features transcend the storage engine/interface boundary a bit--you've been warned--and a rough sketch in code follows the list.

  • Multiple people and systems need to be able to access and edit the same objects concurrently.

  • Existing tools need to be able to work in some capacity, perhaps via FUSE-like systems: file managers, mv, ls, and cp should just work.

  • There ought to be some sort of off-network capability, so that a user can lose a network connection without losing access to his or her data.

  • Search indexing and capabilities should be baked into the lowest levels of the system so that people can easily find information.

  • There ought to be some sort of user-facing metadata system that can affect not just sort order, but can also attach to actions, create notifications, or manipulate the data for easier use.
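
And, as promised, here's a hypothetical sketch of what a sliver of that feature set might look like as an interface. Every name here is invented for illustration; a real system would need actual merging, indexing, and notification machinery:

from dataclasses import dataclass, field

@dataclass
class StoredObject:
    key: str
    contents: bytes
    version: int = 0
    metadata: dict = field(default_factory=dict)  # user-facing metadata

class InformationStore:
    def __init__(self):
        self._objects = {}
        self._terms = {}  # word -> set of keys, a toy full-text index

    def write(self, key, contents, metadata=None):
        # Versions instead of locks, so multiple writers can edit the
        # same object concurrently and conflicts get merged afterward.
        prior = self._objects.get(key)
        version = prior.version + 1 if prior else 0
        self._objects[key] = StoredObject(key, contents, version, metadata or {})
        # Search is baked in at write time, not bolted on afterward;
        # this is also where metadata-triggered actions would hook in.
        for word in contents.decode(errors="ignore").split():
            self._terms.setdefault(word.lower(), set()).add(key)

    def read(self, key):
        return self._objects[key].contents

    def search(self, word):
        return self._terms.get(word.lower(), set())

store = InformationStore()
store.write("notes/lisp", b"why bother with lisp", {"tags": ["blog"]})
print(store.search("lisp"))  # {'notes/lisp'}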

These sorts of features are of course not new ideas. My sygn project is one example, as is haven, as is this personal information management proposal.

Now all we need to do is figure some way to build it.

Why Bother With Lisp?

I'm quite fond of saying "I'm not a programmer or software developer" on this blog, and while I don't think there's a great chance that I'll be employed as a developer, it's becoming more apparent that the real difference between me and a "real developer" is vanishingly small. Stealth developer, or something. In any case, my ongoing tooling around with Common Lisp, and more recently the tumble manager project, have given me opportunities to think about Lisp and about why I enjoy it.

This post started when a friend asked me, "so, should I learn Common Lisp?" And my first response was something to the effect of "no, are you crazy?" or, alternately, "well, if you really want to." And then I came to my senses and offered a more reasonable answer that I think some of you might find useful.

Let us start by asking "Should You Study Common Lisp?"

Yes! There are a number of great reasons to use Common Lisp:

  • There are a number of good open source implementations of the Common Lisp language, including a couple of very interesting and viable options. They're also stable: SBCL, which is among the more recent entrants to the field, is more than a decade old.
  • There are sophisticated development tools, notably SLIME (for emacs), which connects and integrates emacs with the Lisp process, as well as advanced REPLs (i.e. interactive modes). So getting started isn't difficult.
  • Common Lisp supports many different approaches to programming: dynamic typing, garbage collection, macros, and so forth. Indeed, contemporary "advanced" languages like Ruby and Python borrow a lot from Lisp, so it's not an "archaic" language by any means.
  • CL is capable of very high performance, so the chance of saying "damn, I wish I'd written this in a faster language" down the road is slim. Most implementations run on most platforms of any consequence, which is nice.
  • You're probably tired of hearing that "learning Lisp will make you a better programmer in any language," but it's probably true on some level.

The reasons to not learn Lisp or to avoid using it are also multiple:

  • "Compiled" Lisp binaries are large compared to similarly functional programs in other languages. While most CL implementations will compile native binaries, they also have to compile in most of themselves.
  • Lisp is totally a small niche language, and we'd be dumb to assume that it's ever going to take off. It's "real" by most measurements, but it's never really going to be popular or widely deployed in the way that other contemporary languages are.
  • Other programmers will think you're weird.

Having said all of that, I think we should still start projects in CL and expand the amount of software written in the language. Here's why my next programming project is going to be written in Lisp:

  • I enjoy it. I suspect this project, like many projects you may be considering, is something of an undertaking, and I don't want to work in an environment I don't enjoy simply because it's popular or ubiquitous.
  • Although Lisp isn't very popular, it's popular enough that all of the things that you might want to do in your project have library support. So it's not exactly a wasteland.
  • The Common Lisp community is small, but it's dedicated and fairly close knit. Which means you may be able to get some exposure for your application in the CL community, simply because your app is written in CL. This is a question of scale, but it's easier to stand out in a smaller niche.

Of course there are some advantages to "sticking with the crowd" and choosing a different platform to develop your application in:

  • If you want other people to contribute to your project, it's probably best to pick a language that the people who might be contributing to your application already know.
  • While there are libraries for most common things that you might want to do with Common Lisp, there might not be libraries for very new or very esoteric tasks or interfaces. Which isn't always a problem, but can be depending on your domain.
  • The binary size problem will be an issue if you plan to deploy in limited conditions (we're talking a 15 megabyte base size for SBCL, which is a non-issue in most cases, but might matter in some).
  • If you run into a problem, you might have a hard time finding an answer. This is often not the case, but it's a risk.

Onward and Upward!

Where is Java Today?

A few weeks ago a coworker walked into my office to talk about the architecture of a project, complete with diagrams, numbers I didn't grasp (nor really need to), and examples of potential off the shelf components that would make up the stack of the application at hand. I asked scores of questions, and I think it was a productive encounter. A normal day, really. I seem to be the guy developers come to to pitch ideas for feedback; I'm not sure why, but I think the experience of talking through a programming or design problem tends to be a productive learning experience for everyone. In any case, the details aren't terribly important.

What stuck in my head is that an off the shelf, but non-trivial, part of the system was written in Java.

We all inhaled sharply.


I don't know what it is about Java, but the moment I find out that an application is written in it, I have a nearly visceral reaction. And I don't think it's just me.

Java earned a terrible reputation in the 90s because, although it was trumpeted as the next big thing, every user facing application in Java sucked: first you had to download a lot of software (and hope that you got the right version of the dependency), and then when you ran the app it took a while to start up and looked like crap. And then your system ground to a halt and the app crashed. But these problems have been fixed: the dependency issue is clearer with the GPLing of Java, GUI bindings for common platforms are a bit stronger, computers have gotten a lot faster, and perhaps most importantly, the hopes of using Java as the cross platform application development environment have been dashed. I think it's probably fair to say that most Java these days runs on the server side, so we don't have to interact with it in the same hands on way.

This isn't to say that administering Java components in server operations is without problems: Java apps tend to run a bit hot (in terms of RAM) and can be a bit finicky, but Java applications seem to fit in a bit better in these contexts, and certainly have been widely deployed here. Additionally, I want to be very clear: I don't want to blame the language for the poor programs that happen to be written in it.

Here are the (hopefully not too leading) questions:

1. Is the "write once, run everywhere" thing that Java did in the beginning still relevant for server-based applications? It's a server application, after all; you wouldn't be losing much by targeting a more concrete native platform.

2. Is the fact that Java is statically typed more of a hindrance in terms of programmer time? And will the comparative worth of Java's efficiency wear off as computers continue to get more powerful?

Conventional wisdom holds that statically typed apps "run faster" but take longer to develop. This is the argument used by Python/Perl/Ruby/etc. proponents, and I don't know how the dynamics of these arguments shift in response to the action of Moore's Law.

3. One of the great selling points of Java is that it executes code in a "managed" environment, which provides some security and safety to the operator of the system. Does the emergence of system-level virtualization tools make the sandboxing features of the JVM less valuable?

4. I don't think my experiences are particularly typical, but all of the Java applications I've done any sort of administrative work with have been incredibly resource intensive. This might be a product of the problem domains. Using Java is often like slinging a sledgehammer around, and so many problems these days don't really require a sledgehammer.

5. At this point, the amount of "legacy" Java code in use is vast. I sometimes have trouble understanding whether Java's current state is the result of all of the tools that have already been invested in the platform, or the result of actually interesting and exciting developments in the platform, like Clojure. Is Clojure (as an example) popular because Lisp is cool again and people have finally come to their senses (heh, unlikely), or because it's been bootstrapped by Java and provides a more pain free coding experience for Java developers?

Anyone have ideas on these points? Questions that you think I'm missing?

Input in the Next Wave

In response mostly to my own commentary on the iPad, I'd like to lead a collective brainstorming of input and computer interaction modalities in "the next wave."

What's the next wave? That thing that's always coming "soon" but isn't quite here yet; the thing that we are starting to see glimpses of, but don't really know. Accept for a moment that things like Blackberries, netbooks, Kindles, iPads, iPhones, and the like are the harbingers of this next wave.

The "make or break" feature of all these new and shiny things is the input method: how we get stuff from our heads into a format that a computer can do something with. While I'm a particularly... textual sort of guy, the "input question" is something everyone who uses technology will eventually come to care about. Blackberries sell because they speak "messaging" and because most of them have hardware keyboards. The iPad, with its bigger onscreen keyboard and external keyboard dock, is--to my mind--an admission that the little onscreen keyboard of the iPhone doesn't work if you want to enter more than 50 or 60 characters at any given time.

I love a good hardware keyboard. A lot. And I'm not just talking about the kind on a Blackberry, but a real keyboard. The truth is I can't even quite bring myself to justify one of the little "netbooks," on the principle that everything I do involves massive amounts of typing. And fundamentally, at the moment, there doesn't seem to be a good way of getting data into a computer system that doesn't involve a keyboard. Clearly this can't hold out forever, so I'd like to pose two questions:

  1. What kind of computer interfaces will replace the command line?

So in 2010 most people interact with their computers by way of the mouse and a lot of pretty pictures. Even mobile environments like the iPhone/iPad/etc. and the Blackberry have some sort of a pointer that the user has to manipulate.

But the truth is that this kind of modality has always been inefficient: switching between the mouse and the keyboard is the greatest time sink in current user interfaces. Graphical environments require increasingly sophisticated graphics hardware, they require users to memorize interfaces in a visual way that may not be intuitive (even if we're accustomed to it), and they have incredibly high development costs relative to other kinds of software. Furthermore, most of us use a lot of text-based interfaces whether we know it or not: Google is a command line interface, as are most web browsers' address bars. And although my coworkers and I are hardly typical, we all have a handful of terminals open at any given time.

Clearly shells (e.g. bash, zsh, and the like) are not going to be around forever, but I think they're going to be around until we find some sort of solution that can viably replace the traditional shell. We need computer interfaces that are largely textual, keyboard driven, powerful, modern, lightweight, fast, and designed to be used interactively. I'm not sure what that looks like, but I know that it needs to exist.

  2. What kind of interfaces will replace the keyboard for data entry?

When I was writing the iPad reflection, I thought it might be cool to have an input device that was mostly on the back of the device: you hold the device in both hands, your fingers make contact with some sort of sensors on the back, your thumbs touch something on the front, and some sort of on-screen interface provides feedback to make up for the fact that you can't see "the keys."

I'd be inclined to think that this would be QWERTY derived, but that's as much habit as anything. I'm a pretty good touch typist--not perfect, and not the fastest, but I don't have to think at all about typing; it just happens. But I don't know or think that the QWERTY keyboard is going to be the interface modality of the future. While I do want to learn Dvorak typing, I haven't managed to feel inspired enough to do it, and I think it's more productive to think about replacements for the keyboard itself rather than alternate layouts.

Thoughts?

If Open Source is Big Business Then Whither the Community?

I've been thinking recently about the relationship and dynamic between the corporations and "enterprises" which participate in and reap benefits from open source/free software and the quasi-mythic "communities" that are responsible for the creation and maintenance of the software. Additionally this post may be considered part of my ongoing series on cooperative economics.

When people, ranging from business types to IT professionals to programmers and beyond, talk about open source software, they talk about a community: often small to medium sized groups of people who all contribute small amounts of time to creating software. And we're not just talking about dinky little scripts that make publishing blogs easier (or some such); we're talking about a massive amount of software: entire operating systems, widely used implementations of nearly all relevant programming languages, and so forth. On some level, the core of this question is: who are these people, and how do they produce software?

On the surface the answer to these questions is straightforward. The people who work on open source software are professional programmers, students, geeks, and hacker/tinkerer types who need their computers to do something novel, and so they write software. This works as a model for thinking about who participates in open source if we assume that the reason people contribute to open source projects is that their individual contributions are too small to build business models around. This might explain some portion of open source contributions, but it feels incomplete to me.

There are a number of software projects that use open source/free software licenses, with accessible source code, supported by "communities," which are nonetheless developed almost entirely by single companies. MySQL, Alfresco, and Resin, among others, serve as examples of these kinds of projects, which are open source by many definitions and yet don't particularly strike me as "community" projects. Is the fact that this software provides source code meaningful or important?

Other questions...

1. If there are companies making money from open source code bases, particularly big companies in a business directly related to software, does this affect the participation of people who are not employed by that company in the project?

In my mind I draw distinctions between technology businesses that use/sell/support open source software (e.g. Red Hat, the late MySQL AB, etc.) and businesses that do something else but use open source software (i.e. everyone with a Linux server in the basement, every business with a website that runs on Apache, etc.).

2. Does corporate personhood extend to the open source community? Are corporate developers contributing as people, or as representatives of their company?

I largely expect that it's the former; however, I'd be interested in learning more about the various factors which affect the way these contributors are perceived.

3. Do people participate in open source because it is fun, or for the enjoyment of programming?

4. Has software become so generic that open source is the current evolution of the industry standards group? Do we write open source software for the same reason that industries standardized the size and threading of bolts?

5. Are potential contributors disinclined to contribute to software that is controlled by a single entity, or where projects

6. Is the cost of forking a software project too high to make that a realistic outcome of releasing open source software?

Conversely, were forks ever effective?

7. Do communities actually form around software targeted at "enterprise" users, and if so, in what ways are those communities different from the communities that form around niche window managers, or even community projects like Debian?

I don't, of course, have answers yet, but I think these questions are important, and I'd love to hear if you have any ideas about finding answers to them, or additional related questions that I've missed.