Beyond SQL and Database Technology

2010-01-15 – tychoish

People have been thinking about databases recently. Even I’ve been thinking about databases, and I’m not particularly prone to thinking about databases. It’s fair given the ongoing Oracle/Sun drama, and even the mainstream press coverage of the NoSQL movement. I’d like to take a step back and think a bit more honestly and holistically about the database application, about this “NoSQL” phenomenon, and about the evolving role of relational database management systems in our technology “ecosystems.”

(Seriously folks this is what I think about for fun in my free time.)

I’ve been mulling over the notion that databases, like MySQL and PostgreSQL and Oracle’s RDBMS products, are not particularly “Unix-like.” Sure, they run on Unix systems, and look and feel like Unix applications, but the niche they fill--providing quick access to structured data with a specialized query language--doesn’t jibe with the Unix philosophies: small specialized tools for precise tasks, “plain text” as the lingua franca of system tools, and so forth.

Databases solve a problem. Indeed they solve a problem in a very functional and workable manner. I don’t want to suggest that the relational database model is somehow broken; however, I would like to suggest that industrial strength database systems are over utilized, and have become the go-to solution for storing and interacting with data of any kind, even in cases where they’re not a good fit for the job at hand.

I’m not the first person to suggest this, not by a long shot. The NoSQL “movement” addresses this issue from a couple of different directions. It’s true that NoSQL refers to a collection of practices and approaches related to providing systems for storing data that go above and beyond the type and model of a database system. In the end NoSQL is about addressing the scaling problem: what happens when we have so much data that it can’t easily fit in one database system, or in situations where a centralized model is unmaintainable for any number of reasons. I think NoSQL is also relevant as we think about storing data that doesn’t easily fit into RDBMSes: I’ve seen a lot of very poorly architected database systems that suffer from a “square peg in round hole” problem.

Indeed, as we try to put all of our data in these RDBMSes, particularly data that doesn’t fit very well, these databases lose their ability to scale. The complex logic required to pull more complex data back out of a database and reassemble it for use and analysis is computationally expensive and doesn’t scale particularly well.
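To make the “reassembly” cost a bit more concrete, here is a minimal sketch using Python’s built-in sqlite3 module. The schema (posts, tags, comments) and the load_post helper are made up for illustration and don’t come from any real system. The point is only that one logical record gets scattered across tables and has to be stitched back together with several queries, whereas a document-style store would hand the same structure back in one piece.

```python
import json
import sqlite3

# Hypothetical schema: one "post" record spread across three tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tags (post_id INTEGER, tag TEXT);
    CREATE TABLE comments (post_id INTEGER, author TEXT, body TEXT);
""")
conn.execute("INSERT INTO posts VALUES (1, 'Beyond SQL')")
conn.executemany("INSERT INTO tags VALUES (1, ?)", [("databases",), ("nosql",)])
conn.executemany(
    "INSERT INTO comments VALUES (1, ?, ?)",
    [("alice", "First"), ("bob", "Second")],
)


def load_post(conn, post_id):
    """Reassemble one logical record from three tables (three queries)."""
    row = conn.execute(
        "SELECT title FROM posts WHERE id = ?", (post_id,)
    ).fetchone()
    if row is None:
        return None
    tags = [
        tag
        for (tag,) in conn.execute(
            "SELECT tag FROM tags WHERE post_id = ? ORDER BY tag", (post_id,)
        )
    ]
    comments = [
        {"author": author, "body": body}
        for author, body in conn.execute(
            "SELECT author, body FROM comments WHERE post_id = ? ORDER BY rowid",
            (post_id,),
        )
    ]
    return {"title": row[0], "tags": tags, "comments": comments}


relational = load_post(conn, 1)

# The document-style alternative: store and fetch the whole record as one blob.
document = json.dumps(relational)
roundtrip = json.loads(document)
```

Neither approach is “right” in general. The relational version makes it easy to query across posts, while the document version makes it cheap to get one post back out, which is roughly the trade-off at issue here.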

But let’s focus for a moment on the scaling question, apart from the data modeling and storage question. The real problem at the core of the scaling question is: we need a way, a thing, that allows multiple systems to access a shared data store in a reliable and consistent manner.

The ongoing work around clustered file systems seems to address this issue from a much different direction, and perhaps a more interesting perspective. Beyond a certain point--and it’s a fuzzy point--database systems basically become file system replacements. So rather than work on making databases more like file systems, the thought is (I assume) let’s make file systems a bit more “database like.” Like I said, I don’t know a lot about the ins-and-outs of clustered file systems, but I think, in addition to worrying and thinking about the future of current database systems, we need to also think about the future of these very scalable, clustered file systems.

I’m not sure what the next-generation data storage technology really looks like. The NoSQL stuff is a step in the right direction, but I’m not sure if it’s a large enough step in a lot of ways, as its focus is a bit narrow. To be honest, I’m not incredibly familiar with the work that’s going on in the clustered file system space. Nonetheless, I think it’s important to not just think about the future of the relational database platforms as such, but about the model and the underlying problems that these kinds of data storage methods address, and to think about other possible ways of addressing the original issues.

Time Management

2010-01-13 – tychoish

I’ve not written here about time management and productivity very much recently. I’m a pretty busy guy, I work a lot, I live an hour away (minimum) from my social life, and I have a lot of things simmering in various stages. While I might accept the challenge that I’m spreading myself a bit too thin (I’m working on it!) I feel like the largest challenge isn’t that my attention is too divided, or even that I don’t have enough time to do the things I want to do.

Rather, I think my biggest challenge at the moment is that I’m not particularly good at using very short blocks of time to get things done. The twenty minutes of free time I have in the morning before work, the time after work when I’m too tired of looking at words to write, but not tired enough to go to bed. One of the great things about having lots of projects is that they’re all in different stages and require different kinds of attention.

As a result, I’m taking the following strategies in an effort to use time a bit more effectively:

1. Stub out projects during binges, fill in the gaps in the interstitial time.

One of the problems with writing in short little bits and pieces throughout the week is that writing is often a game of momentum, and it’s hard to really build up speed and absorb yourself in a project, however big or small, in a few moments. In most cases, the hardest thing to do in writing is figure out “Okay, so what do I need to write here.” Given this, it’s incredibly easy (at least for me) to become enchanted with the successful binge and the ability to bury myself in a long writing session for hours on end; because that tends to work well, I’m prone to just not trying to write in the interstitial moments.

My approach, recently, and one that I need to pay a bit more attention to maintaining, has been to use binges to start projects, to do a lot of free writing and note taking, and then with things mostly sketched out and “stubbed out” (to borrow a term from the wiki world), it’s easier to write things during the week.

2. Work on keeping the “to do” list more populated.

I suppose this is really an extension of the above, but I sometimes find myself avoiding adding items to my todo list if I’m close to “clearing the decks” (OCD much?) and I sometimes fall out of the habit of really using my todo list as a method for planning my day and week out. Todo lists have three main purposes that I can see: first they help with remembering things that you might otherwise forget. That’s not something I struggle with in any major way. The second is to do some organizational work up front so that big and hugely imposing tasks seem much more manageable when you sit down to work. Finally, they should all but moot the question “what should I work on now?”

In any case, a todo list is completely useless when left unmaintained and underpopulated. There’s always something that needs doing, and there’s often time to do something. Todo lists take the thought out of figuring out what to do in those spaces, and they can’t really do that when they’re not kept up to date.

3. Get away from the computer when I’ve lost the ability to concentrate.

When I’m tired or bored (or both) and don’t think I could muster the ability to concentrate on a sentence, I rarely muster the courage to get up and do something else. Instead, I usually tab into a web browser and dawdle away the evening doing something like refreshing Facebook endlessly, or some-such. While there’s nothing wrong with a little bit of harmless perusal of the Internet, it’s too easy to get sucked in and then not get other stuff done. I think of this as the “cut your losses” strategy. I’m not terribly good at it most of the time, but when it works I’m pleased.

4. Read more, particularly when I’ve “run out of words.”

At the end of a day, I sit down on the couch with my laptop, and I find whatever emacs buffer open that I’d been hacking or writing away on in the morning (or the previous evening), and I think “Dear god, I couldn’t possibly write anything more,” which is a fair feeling: I stare at emacs buffers and hack away on words all day as it is, coming home to do more of this, even if the topics are a bit different, is sometimes difficult. I enjoy writing a bunch and find it to be a very rewarding experience so this isn’t always a problem, but when it is, it is.

My goal is to avoid wasting time because I’m bored or tired and to use this time to read instead. I tend to find reading to be refreshing and I feel like I don’t have enough time to read as it is, so this solves a few problems. I think if we look honestly at our days and our goals, most of us might be able to find a way to get more of the things done that we want to get done, with such a strategy.

We’ll see how that works.

Starting a Collaboration

2010-01-12 – tychoish

Alternate Title: “How to start a collaborative writing project or die trying.”

Step 1: Lock yourself in your office, fire up Emacs, and write an initial draft from beginning to end yourself.

Step 2: Post it on the Internet.

Step 3: Encourage contributions and hands on feedback from your collaborators. Have a piece of cake.

You may think that I’m kidding, but it’s true. I think there’s a misconception that the way to write something with other people follows a path that might look like: having a meeting to establish the common goals and an outline of what needs to get said, and then another meeting to divide up who is going to write what, and then people go back and write their little parts, and then you mash them all up and everyone rewrites their part till it meshes with the other parts, and then you pray it says what you need it to say, and doesn’t need further revising--except it sort of does, so you repeat the whole process over again, to revise the text, except with editing instead of writing. And because each stage requires endless conversation, when you don’t have the benefit of face-to-face meetings, things can take a long time: so long, in fact, that most people will have probably lost interest long before something has been written. The short of it is that this method, though very democratic and open seeming, isn’t.

I think there’s a fear that when a single person puts a lot of individual energy into a text (or any kind of project, really) and doesn’t consult with collaborators at every turn, it somehow becomes not a collaboration. This is emphatically not true. There is a significant difference between endless group process and the collection of meaningful feedback; a real distinction between a text created with a process that involves many people, and a text that many people can agree represents their interest, purposes, and needs.

I think, though I’m not certain, that one could replace the words “writing” and “text” in the above with “programming” and “code,” but I don’t know for sure.

At work, I have a motto: “you can’t edit it if it doesn’t exist yet.” The more interesting thing, I think, in every context is when you go off into your own office, fire up the emacs, write something, and then say “so how does this look?” People sometimes say, “nice, but you used ‘setup’ as a verb in the third paragraph,” or “ok, but you left out a section about flux capacitors in section two, and I think that’s crucial for understanding most of section three,” but these are problems that are fairly easily addressed.

Now it could be the case that I’m just that awesome (unlikely), but I think it boils down to the fact that most people don’t understand how to make texts. I also think that a lot of “group process,” can be obsoleted by an individual who can produce something, and has a good sense of the group’s desires, and who knows how to check in with various group members at the right moments. While these skills can be listed quite effectively, and it’s true that there is no rocket science involved: some things are easier said than done.

Not every collaboration works, and there are a lot of variables at play in any situation where a group of people must come together to make something, but in nearly every situation beginning with “hey, I want to make something with you, look at this draft,” is better than “I was thinking about making something with you but I wanted to get your feedback first.”

Just sayin'…

End User RSS

2010-01-11 – tychoish

I’m very close to declaring feed reader bankruptcy. And not just a simple “I don’t think I’ll ever catch up with my backlog,” but rather that I’ll pull out of the whole RSS reading game altogether. Needless to say, because of the ultimate subject matter--information collection and utilization and cultural participation on the Internet--and my own personal interests and tendencies this has provoked some thinking… Here goes nothing:

Problems With RSS

Web 2.0 in a lot of ways introduced the world to ubiquitous RSS. There were now feeds for everything. Awesome right?

I suppose.

My leading problem with RSS is probably a lack of good applications to read RSS with. It’s not that there aren’t some good applications for RSS, it’s that RSS is too general of a format, and there are too many different kinds of feeds, and so we get these generic applications that simply take the chronology of RSS items from a number of different feeds and present them as if they were emails or one giant feed, with some basic interface niceties. RSS readers, at the moment, make it easier to consume media in a straightforward manner without unnecessary mode switching, and although RSS is accessed by way of a technological “pull,” the user experience is essentially “push.” The problem then, is that feed reading applications don’t offer a real benefit to their users beyond a little bit of added efficiency.

Coming up a close second is the fact that the publishers of RSS sometimes have silly ideas about user behaviors with regards to RSS. For instance there’s some delusion that if you truncate the content of posts in RSS feeds, people will click on links and visit your site, and generate ad revenue. Which is comical. I’m much more likely to stop reading a feed if full text isn’t available than I am to click through to the site. This is probably the biggest single problem that I see with RSS publication. In general, I think publishers should care as much about the presentation of their content in their feed as they do about the presentation of content on their website. While it’s true that it’s “easier” to get a good looking feed than it is to get a good looking website, attending to the feed is important.

The Solution

Web 2.0 has allowed (and expected) us to have RSS feeds for nearly everything on our sites. Certainly there are so many more RSS feeds than anyone really cares to read. More than anything this has emphasized the way that RSS has become the “stealth data format of the web,” and I think it’s pretty clear that, for all its warts, RSS is not a format that normal people are really meant to interact with.

Indeed, in a lot of ways the success of Facebook and Twitter has been a result of the failure of RSS-ecosystem software to present content to us in a coherent and usable way.

Personally, I still have a Google Reader account, but I’m trying to cull my collection of feeds and wean myself from consuming all feeds in one massive stew. I’ve been using notifixlite for any feed where I’m interested in getting the results in very-near-real time. Google alerts, microblogging feeds, etc.

I’m using the planet function in ikiwiki, particularly in the cyborg institute wiki, as a means of reading a collection of feeds. This isn’t a lot better than the conventional feed reader, but it might be a start. I’m looking at plagger for the next step.

I hope the next “thing” in this space is some feed readers that add intelligence to the process of presenting the news. “Intelligent” features might include:

Noticing the order you read feeds/items and attempting to present items to you in that order.
Removing duplicate, or nearly duplicate items from presentation.
Integrate--as appropriate--with the other ways that you typically consume information: reading email and instant messaging (in my case.)
Provide notifications for new content in an intelligent sort of way. I don’t need an instant message every time a flickr tag that I’m interested in watching updates, but it might be nice if I could set these notifications up on a per-folder or per-feed manner. Better yet, the feed reader might be able to figure this out.
Integrate with feedback mechanisms in a clear and coherent way. Both via commenting systems (so integration with something like Disqus might be nice, or the ability to auto-fill a comment form), and via email.
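As a rough illustration of the “removing duplicate, or nearly duplicate items” idea from the list above, here is a minimal sketch in Python. It assumes feed items have already been parsed into dictionaries with "title" and "link" keys (that shape is my assumption, not any particular feed library’s API), and it treats two items as duplicates if they share a link or if their normalized titles are at least 90% similar according to the standard library’s difflib. The threshold is arbitrary and would need tuning against real feeds.

```python
from difflib import SequenceMatcher


def normalize(title: str) -> str:
    """Lowercase a title and collapse runs of whitespace."""
    return " ".join(title.lower().split())


def dedupe(items, threshold=0.9):
    """Return items in order, dropping exact-link and near-duplicate-title repeats.

    `items` is assumed to be a list of dicts with "title" and "link" keys.
    """
    kept = []
    for item in items:
        title = normalize(item["title"])
        is_duplicate = any(
            item["link"] == other["link"]
            or SequenceMatcher(None, title, normalize(other["title"])).ratio()
            >= threshold
            for other in kept
        )
        if not is_duplicate:
            kept.append(item)
    return kept
```

This compares every new item against everything already kept, so it is quadratic in the number of items. That is fine for a personal reading list, though probably not for an aggregator at any real scale.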

It’d be a start at any rate. I look forward to thinking about this more with you in any case. How do you read RSS? What do you wish your feed reader would do that it doesn’t?

The Blog, Next in Lisp

2010-01-07 – tychoish

Here’s a crazy idea: in addition to posting an RSS feed, say I start posting the content of the blog as Common Lisp code. Not to replace any format that I currently publish in, but as an additional output option. Entries might look something like this:

;; A sketch only: none of the tychoish: operators below exist yet.
(tychoish:blog-post
  (tychoish:meta-data
    :title "The Blog, Next in Lisp"
    :author "tycho garen"
    ;; publication date as an ISO 8601 date string
    :pubtime "2010-01-07"
    (tychoish:blog-tags '(lisp cyborg crazy))
    (tychoish:archive-collection '(programing)))
  (tychoish:blog-content (markdown)
    "Here's a crazy idea: in addition to posting an RSS feed, say I start
    posting the content of the blog as Common Lisp code. Not, to replace
    any format that I currently publish in, but as an additional output
    option. Entries might look something like this: [...]"))

That’s pretty. In a lispy sort of way. I’m not sure that it’s actually correct, and it makes calls to functions that don’t exist, of course. But I hope you can get the gist enough to see where I’m going with this, and maybe enough to correct my newbish mistakes.

By my count there need to be functions for: blog-post, meta-data, blog-tags, archive-collection, blog-content, and markdown. And of course it’s missing some notion of what these functions might do. I’m not terribly sure what they could do: build a better indexing system for the site (lord knows I need it), or more easily create a Lisp-based content reader/browser that’s like a feed reader but more in some way that I haven’t envisioned.

In a lot of ways, this isn’t any different from RSS. And it is RSS, basically, except you don’t have to parse it into some format that your programming language can understand (assuming you’re programming with Lisp, of course, but you are, aren’t you?) because it is your programming language. At least in my mind this has a lot in common with the Sygn Project in that both projects focus on providing some sort of loose standard that allows us to share and use data openly and freely, using formats that are easy (enough) to construct by hand, are human readable, and are easy to process and use programmatically.

In any case, it shouldn’t be terribly hard to generate this format. The question is: does seeing the data like this present possibilities to anyone? And, while we’re at it, if anyone wants to help define some of the more basic functions, that might be awesome. I look forward to hearing from you all.
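To back up the “shouldn’t be terribly hard to generate” claim, here is a rough sketch of a generator written in Python (on the assumption that the blog’s build tooling isn’t itself Lisp). The render_post function and its parameters are invented for illustration. It emits the same hypothetical tychoish: forms as the entry above, writes the publication time as a plain date string (my assumption), and only handles the string escaping that Common Lisp’s reader needs: backslashes and double quotes.

```python
def lisp_string(text: str) -> str:
    """Quote text as a Common Lisp string literal.

    Only backslash and double quote need escaping inside a CL string.
    """
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def lisp_list(symbols) -> str:
    """Render a sequence of plain symbol names as a quoted list, e.g. '(a b)."""
    return "'(" + " ".join(symbols) + ")"


def render_post(title, author, pubtime, tags, collections, body) -> str:
    """Render one blog post as the (hypothetical) tychoish:blog-post form."""
    return "\n".join([
        "(tychoish:blog-post",
        "  (tychoish:meta-data",
        f"    :title {lisp_string(title)}",
        f"    :author {lisp_string(author)}",
        f"    :pubtime {lisp_string(pubtime)}",
        f"    (tychoish:blog-tags {lisp_list(tags)})",
        f"    (tychoish:archive-collection {lisp_list(collections)}))",
        "  (tychoish:blog-content (markdown)",
        f"    {lisp_string(body)}))",
    ])


print(render_post(
    "The Blog, Next in Lisp",
    "tycho garen",
    "2010-01-07",
    ["lisp", "cyborg", "crazy"],
    ["programing"],
    "Here's a crazy idea: ...",
))
```

Tag and collection names are passed through untouched, so this assumes they are already valid Lisp symbol names (no spaces or parentheses); anything fancier would need its own quoting.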

Corporate Government

2010-01-06 – tychoish

I was talking to someone, probably a coworker, and I was saying something about a “government,” except I slipped and said “corporation.” An easy mistake, if not a common one, perhaps, and certainly somewhat telling. In this post I want to discuss a number of ideas that have been lingering about in my thoughts regarding the role of corporations on culture and technology, and the role of corporate structures on the conveyance of cultural values.

In a lot of ways this post is the successor to my posts on Martian Economics and Transformational Economics.

Maybe that’s a bit much for one blog post. In brief:

Corporations and Governments

We can think of both corporations and governments as falling into the larger category of “formal social institutions,” or “formal collective institutions.” They are both, at least in theory, productive beyond the ability of a single individual, and both dominate the shape and course of our lives to significant degrees.

Corporate Structure and Cultural Transmission

How corporations, and governments more obviously, are structured and behave is--I would argue--a means of creating and transmitting cultural values.

Praxis and IBM: Autonomy and Bottom Up Organization

I use these two examples--one fictional, one actual--as possible illustrations of a different way of thinking about corporate organization. Both of these examples represent large institutions engaged in diverse operations, that are organized (I think) with a great number of quasi-autonomous operations and divisions, which contribute to a common project but have the freedom to operate in independent and encapsulated ways. This strikes me as a unique modality.

This leads to a lot of questions, and not very many good answers. I suppose that’s not intrinsically a bad thing.

One of the biggest problems with corporations as far as I’m concerned is that by virtue of their fiduciary responsibility they have no obligation to operate in a sustainable manner or in the common interests of either their employees or the public.

Can the harmful potential inherent in corporate person-hood be offset by certain types of organization?
Is there a better way to manage and organize our political society that balances the power of governments and corporations, and that is sustainable and efficient?
We talk, and think, a lot about how the Internet affects how people use technology, and how the Internet creates new possibilities for business. How does the Internet change the way we organize our work lives? Has technology made smaller corporate operations more sustainable and able to compete?
Are there alternatives, like loose and autonomous cells in corporate organizations, that might be able to address the concerns regarding efficiency and sustainability?

And so forth…

Common Lisp, Practically

2010-01-05 – tychoish

So in the emacs session running on my laptop (13 days plus) I have a number of buffers open, a great many of which include the entirety of Practical Common Lisp thanks to emacs-w3m, which I’ve been working through slowly. I’ve written here about how I find Lisp to be intriguing and grok-able in a way that other programming languages aren’t really.

My exposure to lisp isn’t great. I hack about with my emacs code, and I do a little bit of tweaking with the window manager that I use (written in common lisp), StumpWM, but other than that I don’t actually have much experience. What follows is a series of reflections that I have with regards to lisp:

Although there’s a lot of really amazing capabilities in Common Lisp, and a lot of open source energies behind Lisp… Lisp isn’t flourishing.

This shouldn’t be a great surprise to anyone; lisp is sort of the epitome of “Programming Languages that don’t get enough respect.” Having said that, there are a lot of lisp projects that aren’t really well maintained at all. Even things that would just be standard and maintained for other languages (various common libraries and the like) haven’t been touched in a few years. While it’s not a huge worry, it is a bit disconcerting. Even so, I don’t think lisp is ever really going to go anywhere, and Common Lisp seems like a pretty darn good spec. But I don’t have any real exposure to Scheme, and Arc isn’t really real yet, I guess.

Lisp works funny, particularly for people who only have a passing familiarity with programming.

We’re used to programming languages that either pass the source code through an interpreter (e.g. Python, Ruby, PHP, Perl, and I suppose Java and C#), compile it into some sort of intermediate bytecode, run that code on a virtual machine, and then output stuff; conversely, there are languages which compile down to some sort of native binary and then execute directly on the hardware. Examples of this second class of languages include: C, C++, and Haskell. Sorry if my examples or descriptions of the execution model aren’t particularly precise.

When you run lisp code, you define stuff and load it into the memory of a lisp process, and then stuff happens as the program runs. It’s compiled to native code (I’m pretty sure at least,) but there aren’t binaries, in the conventional sense. To get a “binary,” you have to dump the memory of the program, and pretty much the entire lisp process into a blob. So the base size for executables is way bigger than one might expect. I’ve also had some success at running scripts with sbcl shebangs from the terminal. That’s pretty nifty, not that I’ve really done very much of that, but it’s nice to know that it’s possible.

Web programming in Lisp. I’m not so sure about that.

So you might see lisp code, and think: “So. Many. God. Damn. Parentheses.” and you’d be right. But even well formatted HTML is considerably less “human readable” than Lisp, and I don’t think there’s a lot of room for debate there. But when you think about it, Lisp actually makes a fair amount of sense for the web.

I’ve actually done a little bit of poking around and from what I can see, the actual architecture and deployments of lisp aren’t terribly bad. There are Apache modules that will pass requests back to a single lisp process (mod_lisp, similar to how FastCGI works), and there’s always the option of running high-performance, CL-specific web-application servers and just proxying requests to those servers from Apache. Lisp is, or can be, pretty damn fast by contemporary standards, and although there’s a lot of under-maintained lisp infrastructure, the basics are covered, including database connectors and JavaScript facilities, which might not be incredibly enticing, but all the parts are there.

I mean, having said that, I’m not a web developer, or really much of a developer in general, but it’s fun to think about, and even if I only use Lisp to hack on various things here and there, I’m still learning a bunch from the book and that seems more than worthwhile.

Another One Already

2010-01-04 – tychoish

Alternate titles for this post include, “Depending on When You Start Counting,” and “Happy New Year.”

I must confess that I don’t do holidays very well. It’s not that I’m a huge curmudgeon (though I probably am) or that I don’t like celebrations (though I don’t much.) More, I think it’s that I’m mostly a homebody, and given the option, will in most cases choose to spend any given evening at home writing and hanging out with the cats. Furthermore, I’m generally of the opinion that formal excuses are not needed for spending time with your friends and family. Nevertheless there is a certain sort of cultural momentum around holidays like New Years, and it’s hard to avoid them entirely.

For many years, my annual cycle has largely been on the academic calendar. Indeed for a few years after I graduated formally, I still took a few classes, and was wrapped up in applying for graduate school and enough of my friends were still in school that I seemed to stay on the academic year. This year, I noted the beginning of the academic year, mostly because for the first time, really the first time in a long time, I wasn’t in school, I wasn’t trying to be in school, and that wasn’t a bad thing. I still have a lot of academic habits: the impulse to review and summarize my work about every four months, the way I structure and organize my work is very reminiscent of a sort of academic way of looking at things. Shrug.

But it’s a new year, depending on when you start counting, and given what a ride 2009 was, it seems like a bit of reflection is in order. The most significant thing was the fact that I took a job half way across the country and moved in late June. This has led to a number of interesting developments: I met a number of people in “real life” who had previously been on-line friends, I’ve learned a lot about my skills and abilities and myself as a writer, I’ve developed a circle of friends that delight me.

This isn’t to say that it was a stellar year. I spent six months not working (really,) and a lot of time unsure about what I was going to do for work, let alone “my career,” all my friends were in graduate school and most of them weren’t anywhere near where I was, and so forth.

But I kept hacking away at various projects, kept thinking and writing about new things, and did my best to seize opportunities when they arose. And somehow it all worked out. In retrospect it’s all very weird, to think how monumental this year has been, and the ways that I’ve really pushed myself to do things that don’t seem very “me” like. In the end I’m pleased with where I am and how far I’ve come.

But perhaps, more significantly, I’m excited to see what the next year holds.

As it should be, I suppose. I hope you all are doing well in this regard and I look forward to talking with you throughout the year to come.
