database market

This post is the spiritual sequel to my (slight) diatribe against database-powered websites of a few weeks ago, and a continuation of my thoughts regarding the acquisition of Sun Microsystems by Oracle. Just to add a quick subtitle: Oracle is a huge vendor of database software, and about 18 months ago (or so) Sun acquired MySQL, which is the largest and most successful open-source competitor to Oracle’s products.

With all this swirling around in my head I’ve been thinking about the future of database technology. Like ya’do…

For many years, 15 at least, relational database systems (RDBMSes) have ruled without much opposition. This is the arena where Oracle has succeeded, MySQL is an example of this kind of system, and on the whole these systems accomplish what they set out to do very well.

The issue, and this is what I touched on the last time around, is that these kinds of systems don’t “bend” well. Which is to say, if you have a system that needs flexibility, or that is storing a lot of dissimilar sorts of data, the relational model stops making sense. Relational databases are big collections of connected tabular data, and unless the data is regular and easily tabulated, it’s a big mess.
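To make the mismatch concrete, here’s a minimal Python sketch (all the fields and records are hypothetical) contrasting a rigid one-table layout with a document-style layout for dissimilar media:

```python
# Hypothetical example: cataloging dissimilar media (a book and a film)
# in one rigid table forces lots of empty placeholder columns, while a
# document model stores only what each item actually has.

# Relational-style: one schema must fit everything.
rows = [
    # (id, title, author, director, runtime_min, isbn)
    (1, "A Novel", "Jane Doe", None, None, "978-0-00-000000-0"),
    (2, "A Film",  None, "John Roe", 94, None),
]

# Document-style: each record carries its own shape.
docs = [
    {"title": "A Novel", "author": "Jane Doe", "isbn": "978-0-00-000000-0"},
    {"title": "A Film", "director": "John Roe", "runtime_min": 94},
]

# The document records need no placeholder values at all.
assert all(None not in doc.values() for doc in docs)
```

The more kinds of items you add, the more `None`-filled columns the table grows, while the documents stay clean.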

So we’re starting to see things like CouchDB, Google’s BigTable, Étoilé’s CoreObject, and MonetDB, which manage data in a much more flexible and potentially multi-dimensional way. Which is good when you need to merge dissimilar kinds of data.

So I can tell the winds are blowing in a new direction, but this is very much outside the boundaries of my expertise or familiarity. This leads me to two obvious questions:

1. For people in the know: What’s happening with database engines, and with the software that is built upon these database systems? I suspect there’s always going to be a certain measure of legacy data around, and developers who are used to developing against RDBMSes aren’t going to let go of that easily.

At the same time, there’s a lot of rumbling that suggests that something new is going to happen. Does anyone have a sense of where that’s going?

2. For people who lost me when I said the word database: In a lot of ways, I think this has a huge impact on how we use computers and what technology is able to do in the near term. Computers are really powerful today. In the nineties the revolution in computing was that hardware became vastly more powerful than it had been before; in the aughts it became cheaper. In the teens--I’d wager--it’ll become more useful, and the evolution of database systems is a huge part of this next phase of development.

Pragmatic Library Science

Before I got started down my current career path--that would be the information management/workflow/web strategy/technology and cultural analyst path--I worked in a library.

I suppose I should clarify somewhat as the image you have in your mind is almost certainly not accurate, both of what my library was like and of the kind of work I did.

I worked in a research library at the big local (private) university, and I worked not in the part of the library where students went to get their books, but in the “overflow area” where the special collections, the book preservation unit, and the catalogers all worked. What’s more, the unit I worked with had an archival collection of film/media resources from a few documentary filmmakers/companies, so we didn’t really have books either.

Nevertheless it was probably one of the most instructive experiences I’ve had. There are things about the way archives work, particularly archives with difficult collections, that no one teaches you in those “how to use the library” and “welcome to the Library of Congress/Dewey Decimal classification systems” lessons you get in grade school/college. The highlights?

  • Physical and intellectual organization - While archives keep track of, and organize, all sorts of information about their collections, the organization of this material “on the shelf” doesn’t always reflect this.

    Space is a huge issue in archives, and as long as you have a record of “where” things are, there’s a lot of incentive to store things in the way that takes up the least physical space. Store photographs separately from oversized maps, separately from file boxes, separately from video cassettes, separately from CDs (and so forth).

  • “Series” and intellectual cataloging - This took me a long time to get my head around, but archivists have a really great way of taking a step back, looking at the largest possible whole, and then creating an ad-hoc organization and categorization of that whole, so as to describe it in maximum detail and make particular things easier to find. Letters from a specific time period. Pictures from another era.

  • An acceptance that perfection can’t be had. Perhaps this is a symptom of working with a collection that had only been archived for several years, or working with a collection that had been established with one large gift, rather than as a depository for a working collection. In any case, our goal--it seemed--was to take what we had and make it better: more accessible, more clearly described, easier to process later, rather than to make the whole thing absolutely perfect. It’s a good way to think about an organizational project.

In fact, a lot of what I did was to take files that the film producers had on their computers and make them useful. I copied disks off of old media, I took copies of files and (in many cases, manually) converted them to usable file formats, I created indexes of digital holdings. Stuff like that. No books were harmed or affected in these projects, and yet, I think I was able to make a productive contribution to the project as a whole.

The interesting thing, I think, is that when I’m looking through my own files, and helping other people figure out how to manage all the information--data, really--they have, I find that it all boils down to the same sorts of problems that I worked with in the library: How to balance “work-spaces” with storage spaces. How to separate intellectual and physical organizations. How to create usable catalogs and indexes of a collection. How to lay everything out so that you can, without “hunting around,” lay your hands on anything in your collection in a few moments. And ultimately, how to do this without spending very much energy on “upkeep.”
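The separation of intellectual and physical organization is easy to sketch in code. Here’s a toy Python catalog (every item ID, series name, and location below is hypothetical) where the intellectual question “what’s in this series?” is answered independently of where anything physically sits:

```python
# A toy archival catalog: intellectual organization (series) is kept
# separate from physical organization (wherever things fit on a shelf).
catalog = {
    "corr-1975-03": {"series": "Correspondence, 1970-1979",
                     "location": "Box 12, Shelf C"},
    "photo-0041":   {"series": "Production stills",
                     "location": "Flat file 3, Drawer 2"},
    "tape-0007":    {"series": "Raw interview footage",
                     "location": "Cold storage, Bin 9"},
}

def find(catalog, series):
    """Answer an intellectual question with physical locations."""
    return [(item_id, rec["location"])
            for item_id, rec in catalog.items()
            if rec["series"] == series]

# One intellectual series can be scattered anywhere physically,
# and a shelf can hold items from many series.
print(find(catalog, "Production stills"))
```

Re-shelve an item and only its `location` changes; the series description, and every finding aid built on it, stays put.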

Does it make me a dork that I find this all incredibly interesting and exciting?

jaunty upgrade

So I’ve upgraded to the latest version of Ubuntu, 9.04 “Jaunty Jackalope,” and I thought I’d post some thoughts on the matter.

On my desktop the upgrade (i.e. sudo apt-get dist-upgrade) was a bit touch and go, but I managed to save things (booting the old kernel and logging into a root shell to fix the upgrade, which mysteriously held back some packages it shouldn’t have), and here I am. The laptop was easier to upgrade, and I suspect this has something to do with the binary-blob video card drivers I’m using on the desktop.

On the whole, I can’t say I’ve toyed with the updates to GNOME terribly much, so I don’t know what to say, but I suspect that they are in fact newer, better, and worthwhile if you’re using Intrepid and/or interested in trying out Ubuntu. It’s really a great OS, and it does a great job--in my experience--of just working with minimal fussing.

I’m not sure that I’d choose an Ubuntu distribution again knowing what I know today. At the same time, I don’t know that I’d know as much as I do about Linuxes today without it, and given that this still works, I’m not switching.

My jaunty upgrade, however, inspired a few changes to my setup, and I’m quite happy with those improvements. They are:

  • I switched to using rxvt-unicode as a terminal emulator (I had been using gnome-terminal). I really like this, because the terminal is low-resource (and can run daemonized, so you can have a lot of windows open). It’s hard as hell to set up (in my experience), but you can learn from my .Xdefaults file if you’re interested.
  • I started (finally) using gnome-color-chooser and gtk-chtheme (debian package names) to remove gnome-settings-daemon from my list of background programs, while still having windows that don’t look like 1992.
  • I stopped using vimperator in firefox, opting instead for firemacs (to control keybindings) and LoL for keyboard navigation (hit-a-hint links). Having access to the Awesome bar is a good thing indeed.

Still on the list of things to update?

  • I need to upgrade to the latest awesome version, as I’m behind.
  • I need to actually ditch gdm, which irritates me still.

Lamenting Project Xanadu

I’ve been reading stevenf.com recently, and I have to say that it’s among my favorite current blogs (by people I don’t know). Geeky, but it doesn’t revolve around code snippets; simple and minimal, but in all of the right ways. And a bunch of fun. Anyway, he posted an article a while ago that got me thinking, called “it’s my xanadu.” Go ahead, read it, and then come back.

That’s a great idea, isn’t it? I’ve been thinking a lot about data management and the way we represent, store, access, and use knowledge on the computer, so stuff like this gets me more excited than it really should, all things being equal. My good friend Joseph Spiros is even working on a program that would implement something very much like Xanadu and the system that stevenf described.

First order of business should probably be to explain what Project Xanadu is for those of you who don’t know.

Xanadu was the first hypertext system designed around the recognition that text in digital formats is a different experience and proposition than analog text. Proposed by Theodor Holm Nelson in the 1970s (with floundering development in the 1980s), Xanadu was to be something amazing. It had features that contemporary hypertext systems still lack. I think everyone has their own list of “things from Xanadu that I want now,” but for me the big sells were:

  • Links went both ways: If you clicked on a link, you could reverse direction and see what documents and pages linked to the current page. This means that links couldn’t break or point to the wrong page, among other things.

  • Dynamic transclusions. Beyond simply being able to quote text statically, Xanadu would have been able to include a piece of text from another page such that it dynamically represented the most current revision of the included page. For example, I include paragraph 13 of page A (A.13) somewhere on page B; later you change A.13 to fix a typo, and the change is reflected in page B. I think links could also reference specific versions of a page/paragraph (so users could, from page B, access newer and older versions of A.13).

  • Micropayments. The system would have had (built in) a system for compensating “content creators/originators” via a system to collect very small amounts of money from lots of people.

Needless to say, it didn’t work out. It turns out that these features are really hard to implement in an efficient way--or at least they were in the eighties--because of the computing requirements and the very monolithic nature of the system. Instead we have a hypertext system that:

  • Is built around a (real or virtual) system of files, rather than documents.

  • Has no unified structural system.

  • Must rely on distributed organizational systems (tagging, search engine indexes.)

  • Is not version-aware, and its pages are not self-correcting.

  • Relies on outmoded business models.

To be fair, much of the conceptual work on the system was done before the Internet was anything like it is today, and indeed many of these features we can more or less hack into the web as we know it now: wikis have “backlinks,” Google’s link: search is in effect much like Xanadu-links, dynamic generation gets us (mostly) transclusions within one site (sort of), and PayPal allows for micropayments, after a fashion.
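Two of those hacks are simple enough to sketch. Here’s a minimal Python toy (the pages and link table are hypothetical, and a real system would work at the level of servers, not dicts) showing a backlink index and by-reference transclusion:

```python
# A toy model of two Xanadu ideas: backlinks and dynamic transclusion.
# Pages are lists of paragraphs; a transclusion is a (page, index) tuple
# that is resolved against the *current* text at render time.
pages = {
    "A": ["First paragraph of A.", "Second paragraph of A."],
    "B": ["B quotes A by reference:", ("A", 1)],  # transclude A's 2nd para
}

links = {"B": ["A"]}  # ordinary forward links

def backlinks(target):
    """Invert the forward-link table, like a wiki's 'what links here'."""
    return [src for src, dests in links.items() if target in dests]

def render(name):
    """Resolve transclusions so B always shows A's latest revision."""
    out = []
    for chunk in pages[name]:
        if isinstance(chunk, tuple):          # (page, paragraph-index)
            src, i = chunk
            out.append(pages[src][i])
        else:
            out.append(chunk)
    return " ".join(out)

pages["A"][1] = "Second paragraph of A, with a typo fixed."
assert "typo fixed" in render("B")   # the fix propagates to B
assert backlinks("A") == ["B"]
```

The hard part Xanadu faced isn’t this logic, it’s doing it efficiently across every server on a network, which is exactly what the web never standardized.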

But it’s not baked into the server, as it would have been in Xanadu. This is both the brilliance and the downfall of Xanadu: by “baking” features into the server, hypertext would have been more structured, easier to navigate, easier to collaborate on, share, and concatenate across different texts, and, within a structure, easier to write.

And yet, in a lot of cases, I (and clearly others) think that Xanadu is worth considering, even adopting: indeed, I think we could probably make some fairly solid predictions about the future of hypertext and content on the Internet, let alone information management in general, based on what was in Xanadu.

That’s about all I have, but for those of you who are familiar with Xanadu I’d love to hear what you “miss most” about Xanadu, if you’re game.

Laptop Usage

My grandmother reminds me that she got a laptop in the late eighties.1 It’s massive by today’s standards (particularly in comparison to my 12-inch ThinkPad), but it had a great keyboard, and she remembers using it rather effectively. It ran WordPerfect 5.1 back in the day, but I think it also had StarWriter/StarOffice (which the astute will recognize as the predecessor code-base for today’s OpenOffice). It probably weighed ten or fifteen pounds, and I think she even brought it between work and home several times a week (using a luggage cart); but this was before her days on the Internet, and like all good things this laptop has gone to the land beyond.

For my college years, and a few years after, I was a laptop-only computer user. It didn’t make sense to have a computer that I’d have to move so damn frequently, it wasn’t like I was playing games or doing anything that would require a desktop, and I loved having only one machine to keep current and up to date. Laptop-only seems like a definite trend among the Internet-hipster/start-up-monkey crowd. And it’s admirable; for these folks (who are likely, and appropriately, Apple users), a laptop tax of 400 dollars isn’t too much on top of the Apple tax they’ve already bought into.

And then along came the “netbook” phenomenon, which posits that most of the time we don’t really need a desktop-grade laptop when we’re on the run. There’s a lot of merit to this model as well. We don’t really need to carry around powerhouses to check our email in coffee shops, and for folks like me, for whom the vast majority of computing is pretty lightweight, building a system around a primary desktop computer and a sufficient-but-not-supercharged laptop makes a lot of sense.

So what kind of laptop system do you use?


  1. Turns out it was a Toshiba T100. Here’s another picture/account ↩︎

Database Powered Websites

For the last, call it, eight years, the way to have a website has been to generate it dynamically, with most of the content pulled from a database (MySQL or PostgreSQL, or some such) on (nearly) every reload. It’s a great model, in a lot of ways, because it represents the (near) ultimate separation of content and display, and it means that static pages don’t have to be regenerated whenever you create new content. Dynamic, database-driven websites were going to be the future, and I think history has proven this argument.

At the same time, I think a further analysis is required.

Just to be super clear, the advantages of database-driven websites are that:

  • Pages don’t have to be regenerated when content changes.
  • Content is more customizable and can be pulled together in ad-hoc settings.

I would argue, however, that there are some fundamental weaknesses that this system of deploying websites promotes:

  1. Database-driven websites increase the complexity of website software by a magnitude or two. I can hack together a static website in my sleep (most people can); working with a database requires a much more specialized system that is harder for website owners to maintain. While separating content from display is often an effort to make systems easier to understand and change, in point of fact, databases make website maintenance a specialized task.
  2. Database-driven websites have a lot of overhead. Because pages need to be regenerated regularly, they require beefy hardware to work correctly. On top of this, database systems need to be cached in such a way that they’re not quite as dynamic as they once were.
  3. Databases are mostly server-side technologies, which means that the dynamic client-side scripting (ECMAScript/JavaScript and AJAX/AHAH) that is all the rage these days (and what people most often mean when they say “dynamic”) doesn’t actually depend on databases nearly as much as you’d think.
  4. Given the very structured nature of databases, websites often need to develop their content structure with near prescience regarding what’s going to happen on their site for the next five years. This is complicated and difficult, and it often means that the same content system needs to be redeveloped (often at great cost) far too often.
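The caching in point 2 is worth pausing on, because once it’s in place the “dynamic” site is mostly serving frozen copies anyway. A minimal Python sketch (the article store and page template are hypothetical):

```python
# Once rendered pages are memoized, a "dynamic" site is effectively
# serving static copies; the database is only hit on the first request.
from functools import lru_cache

ARTICLES = {"/about": "All about this site."}  # stand-in for the database

@lru_cache(maxsize=None)
def render(path):
    # In a real stack this line would be an expensive database query.
    return f"<html><body>{ARTICLES[path]}</body></html>"

first = render("/about")    # computed, "database" consulted
second = render("/about")   # identical bytes, straight from the cache
assert first is second
```

Which raises the obvious question: if the output is static in practice, why not make it static in fact?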

In light of this I’ve been thinking, increasingly, that the future of websites will likely be powered by a much different kind of website software. Here are some observations about the future of the web:

  • Structured data formats and plain text files are the future. Stored in/with formats like YAML, we can (very easily) have flexible structures that can adapt to the changing needs of a website owner.
  • Some very large sites (e.g. Facebook, Wikipedia) will likely always be powered by databases, because in situations where a single website has more than 100,000 pieces of content, databases begin to make sense. Remarkably few single websites have that much content, though, particularly if engineered correctly.
  • Most content on the web doesn’t change very often. We regenerate pages thousands and thousands of times a day that are unlikely to be updated more than a dozen times a day.
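The core of a static-site compiler that respects those observations is small. Here’s a Python sketch (paths and markup are hypothetical, and real generators like Jekyll do much more) that rebuilds a page only when its source is newer than its output, make-style:

```python
# A sketch of static compilation: regenerate a page only when its
# source file has changed since the output was last written.
import os

def needs_rebuild(src, dest):
    """True if the output is missing or older than the source."""
    if not os.path.exists(dest):
        return True
    return os.path.getmtime(src) > os.path.getmtime(dest)

def build(src, dest):
    """Compile one source file into one (hypothetical) HTML page."""
    if not needs_rebuild(src, dest):
        return  # content unchanged: serve the page we already have
    with open(src) as f:
        text = f.read()
    with open(dest, "w") as f:
        f.write(f"<html><body><pre>{text}</pre></body></html>")
```

Run that over a directory of text files on each edit and you get the dozen-rebuilds-a-day site, instead of the thousands-of-regenerations-a-day one.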

This is not to say that there aren’t several challenges to the prospect of websites powered by static-site generators/compilers. They are:

  • Some content will likely always be compiled using some very basic server-side includes, and dynamic content will continue to be generated using JavaScript or something similar.
  • Authentication and web-based security will likely also need to be built into webservers directly (the direction I think things are going in anyway), and complicated access control (for intranets, say) may still require databases.
  • Web-based interfaces for editing (ubiquitous, in-page “edit this” links) and commenting systems often need more dynamic functionality than other content. We need to both figure out a way to do commenting/annotation in a static-based system and find a way to do commenting in a more socially productive manner (existing locally-hosted commenting systems are, I think, fundamentally broken).
  • Concurrent editing. Wiki engines address this to some degree, and I think we need additional productive ways of addressing it in a truly user-friendly manner that doesn’t rely on overpowered databases for what is probably an edge case.

Thoughts? I’m not describing technology that doesn’t exist, but I am suggesting that the current “way of doing things” isn’t the future of how we will “do content on the web.” The tools are out there; all that’s missing is a simple, user-friendly method for pulling all this content together.

I’m really interested in seeing what people are coming up with.

Links and Old Habits

So I’ve noticed that my impulse on these shorter blog posts (codas) is to just do my normal essay thing, only shorter, which is more of an old habit than something productive or intentional. Weird. To help break out of this bad habit, I’m going to post some links that I’ve collected recently.

I saw a couple of articles on user experience issues that piqued my interest; perhaps they’ll pique yours as well: Agile Product Design on Agile Development and UX Practice, and On Technology, User Experience and the need for Creative Technologists.

Cheetah Template Engine for Python. This isn’t an announcement, but I’ve been toying around with the idea of reimplementing Jekyll in Python (to learn, and because I like Python more than Ruby). Cheetah seems to be the Python template engine that looks the coolest/best. I need to read more about it, of course.
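The fill-in-the-blanks idea at the heart of any Jekyll-alike is tiny. This isn’t Cheetah (whose own syntax differs); as a stand-in, the stdlib’s string.Template shows the core mechanism, with the layout and post data below being purely hypothetical:

```python
# The kernel of a templating step: merge a content dict into a layout.
# string.Template is a stdlib stand-in for a real engine like Cheetah.
from string import Template

layout = Template("<html><head><title>$title</title></head>"
                  "<body>$body</body></html>")

post = {"title": "jaunty upgrade", "body": "<p>So I've upgraded...</p>"}
page = layout.substitute(post)

assert "<title>jaunty upgrade</title>" in page
```

A real engine adds loops, conditionals, and includes on top of this, but the substitution step is the whole contract between content files and layouts.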

I didn’t get to go to DrupalCon (alas), but there were a few sessions that piqued my interest, or at least that I’d like to look into, mostly because the presenters are people I know/watch/respect: Sacha Chua’s Rocking your Development Environment, Liza Kindred’s Business and Open Source, and James Walker’s Why I Hate Drupal.

Sacha’s because I’m always interested in how developers work, and we have emacs in common. Liza’s because Open Source business models are really (perversely) fascinating, even if I think the Drupal world is much less innovative (commercially) than you’d initially think. Finally, given how grumpy I’m prone to being, how could walkah’s talk not be on my list?

Anyone have something good for me?

getting emacs daemon to work right

In the latest/forthcoming version of GNU Emacs there is a daemon mode. The intent is to make it easier to run everything in one instance of Emacs rather than starting new instances. Less overhead on your system, and everyone’s happy. Without daemon mode, you can still run in server mode and get nearly the same effect, but you end up with one Emacs frame that’s the “host,” which means that if you’re dumb and close it (or it’s running inside another process which crashes…) all of the frames close.

I suppose, as an interjection, that my attempt to explain why this is cool to a generalized audience is somewhat of a lost cause.

In any case, there’s a problem: daemon mode doesn’t behave like anyone would want it to. This post explains it better than anything else I’ve read thus far, but it’s not all that clear. Basically, when you start emacs --daemon it doesn’t load your .emacs init file correctly, so things like display settings are off-kilter and can’t really be fixed. I resorted to running my “host” Emacs frame inside of a terminal/screen session, because that worked and was basically the same from my perspective.

Nevertheless, I’ve discovered “the fix” to get the emacs daemon to work like you’d expect. Run the following command at the command line (changing username and font/font size):

emacs -u [USERNAME] --daemon --eval "(setq default-frame-alist '((font-backend . \"xft\") (font . \"[FONT]-[SIZE]\")))" -l ~/.emacs

And then, open an emacsclient.

and there was much rejoicing

There are a couple of things that I can’t get to work reliably. Most notably, though the emacsclient works in terminal instances, it has some sort of problem attaching new clients to X after an X crash/restart. No clue what that’s about, but needing to reboot every time X goes down is a bummer. Other than that? Emacs bliss. For the moment.