Jekyll and Automation

As this blog ambles forward, albeit haltingly, I find that the process of generating the site has become a much more complicated proposition. I suppose that’s the price of success, or at least the price of verbosity.

Here’s the problem: I really cannot abide dynamically generated publication systems. There are more things that can go wrong, they can be somewhat inflexible, they don’t always scale very well, and they seem like horrible overkill for what I do. At the same time, I have a huge quantity of static content in this site, and it needs to be generated and managed in some way. It’s an evolving problem, and perhaps one that isn’t of great specific interest to the blog, but I’ve learned some things in the process, and I think it’s worthwhile to do a little bit of rehashing and extrapolating.

The fundamental problem is that the job that rebuilds tychoish.com takes a long time to run. This is mostly a result of the time it takes to convert the Markdown text to HTML: a couple of minutes for the full build. There are a couple of solutions. The first would be to pass the build script some information about when files were modified and then have it only rebuild those files. This is effective but ends up being complicated: version control systems don’t tend to version mtime and, importantly, there are pages in the site--like archives--which can fall out of date without some sort of metadata cache between builds. The second solution is to provide very limited automatically generated archives and only regenerate the last 100 or so posts, and supplement the limited archive with more manual archives. That’s what I’ve chosen to do.
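A minimal sketch of the first option, for comparison with the one chosen above. It assumes content hashes stand in for mtime (since version control does not preserve modification times) and that the metadata cache is a small JSON manifest. The function names, the manifest format, and the `*.md` pattern are all hypothetical; this is not how the site's actual build script works.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def changed_sources(source_dir, cache_file):
    """Return the Markdown files whose contents differ from the cached digests.

    The cache (a hypothetical JSON manifest) is rewritten on every call, so
    the next run compares against the state seen by this one.
    """
    cache_path = Path(cache_file)
    old = json.loads(cache_path.read_text()) if cache_path.exists() else {}
    new = {str(p): file_digest(p) for p in sorted(Path(source_dir).rglob("*.md"))}
    cache_path.write_text(json.dumps(new, indent=2))
    return [name for name, digest in new.items() if old.get(name) != digest]
```

A build script could convert only the files this returns. Aggregate pages such as archives would still need their own handling, which is the complication described above.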

The problem is that even the last 100 or so entries take a dozen seconds or more to regenerate. This might not seem like a lot to you, but the truth is that at an interactive terminal, 10-20 seconds feels interminable. I’d spent a lot of time recently trying to fix the underlying problem--the time that it took to regenerate the HTML--when I realized that the problem wasn’t really that the rebuilds took forever; it was that I had to wait for them to finish. The solution: background the task and send a message to my IM client when the rebuild completes.
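The background-and-notify idea can be sketched roughly as follows. This assumes the build is an external command and that the notification is just a callable; the post doesn't say which IM client or build command is involved, so `build_in_background` and `notify` are hypothetical stand-ins.

```python
import subprocess
import threading


def build_in_background(command, notify):
    """Run `command` in a worker thread and call `notify(message)` when it exits.

    `notify` is a placeholder for whatever delivers the message (an IM
    client, a desktop notification, or plain `print`).
    """

    def worker():
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode == 0:
            notify("rebuild succeeded")
        else:
            notify(f"rebuild failed ({result.returncode})")

    thread = threading.Thread(target=worker)
    thread.start()
    return thread
```

Calling `build_in_background(["make", "site"], print)` (with `make site` standing in for the real build command) returns immediately; the message arrives whenever the build finishes, which is the whole point.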

The lesson: don’t optimize anything that you don’t have to optimize, and if it annoys you, find a better way to ignore it.

At the same time I’ve purchased a new domain, and I would kind of like to be able to publish something more or less instantly, without hacking on it like crazy. But I’m an edge case. I wish there were a static site generator, like my beloved Jekyll, that provided great flexibility and generated static content in a smart and efficient manner. Most of these site compilers, however, are crude tools with very little logic for smart rebuilding, and really, given the profiles of most sites that they are used to build, this makes total sense.


I realize that this post comes off as pretty complaining, and even so, I’m firmly of the opinion that this way of producing content for the web is the most sane method that exists. I’ve been talking with a friend for a little while about developing a way to build websites and we’ve more or less come upon a similar model. Even my day job project uses a system that runs on the same premise.

Since I started writing this post, I’ve even taken this one step further. In the beginning I had to watch the process build. Then I basically kicked off the build process and sent it to the background and had it send me a message when it was done. Now, I have rebuilds scheduled in cron, so that the site does an automatic rebuild (the long process) a few times a day, and quick rebuilds a few times an hour.

Is this less efficient in the long run? Without a doubt. But processor cycles are cheap, and the builds are only long in the subjective sense. In the end I’d rather not even think that builds are going on, and let the software do all of the thinking and worrying.

Organize Your Thoughts More Betterly

I’ve been working with a reader and friend on a project to build a tool for managing information for humanities scholars and others who deal with textual data, and I’ve been thinking about the problem of information management a bit more seriously. Unlike numerical or more easily categorized data, how to manage a body of textual information--either of your own production or a library of your own collection--is far from a solved problem.

The technical limitation--from a pragmatic perspective--is that you need to have an understanding not only of the specific tasks in front of you, but a grasp of the entire collection of information you work with in order to effectively organize, manage, and use the texts as an aggregate.

“But wait,” you say. “Google solved this problem a long time ago, you don’t need a deterministic information management tool, you need to brute force the problem with enough raw data, some clever algorithms, and search tools,” you explain. And on some level you’d be right. The problem is of course, you can’t create knowledge with Google.

Google doesn’t give us the ability to discover information that’s new, or powerful. Google works best when we know exactly what we’re looking for: the top results in Google are most likely to be the resources that the most people know and are familiar with. Google’s good, and useful, and a wonderful tool that more people should probably use, but Google cannot lead you into novel territory.

Which brings us back to local information management tools. When you can collect, organize, and manipulate data in your own library you can draw novel conclusions. When the information is well organized, and you can survey a collection in useful and meaningful ways, you can see holes and collect more, you can search tactically, and within subsets of articles to provide. I’ve been talking for more than a year about the utility of curation in the creation of value on-line, and fundamentally I think the same holds true for personal information collections.

Which brings us back to the ways we organize information. And my firm conclusion that we don’t have a really good way of organizing information. Everything that I’m aware of either relies on search, and therefore only allows us to find what we already know we’re looking for, or requires us to understand our final conclusions during the preliminary phase of our investigations.

The solution to this problem is thus twofold. First, we need tools that allow us to work with and organize the data for our projects, full stop. Wikis and never-ending text files don’t really address all of the different ways we need to work with and organize information. Second, we need tools that are tailored to the way researchers who deal in text work with information, from collection and processing to quoting and citation, rather than focusing on the end stage of this process. These tools should allow our conceptual framework for organizing information to evolve as the project evolves.

I’m not sure what that looks like, but I’d like to find out. If you’re interested, do help us think about this!

(Also, see this post `regarding the current state of the Cyborg Institute <http://www.cyborginstitute.com/2010/06/a-report-from-the-institute/>`_.)

The Schedule

Wow. Hello blog.

I’m pretty busy. It even seems sort of cliche to complain about such things on one’s blog, but I think being busy has coincided with a somewhat larger reevaluation of nearly everything.

Wait, no. I’m not quitting blogging.

I’m actually really proud of the Knowing Mars launch, and it feels really good to have that project “done,” even if I think it needs a major revision, and I have a lot more fiction on my plate that I don’t want to just “let go” like that.

I’m also somewhat displeased with the kind of blog posts that I’ve been writing recently. It seems that I’ve been writing about some basic ideas: my disdain for the way the web functions as a user interface, some general work flow topics, some basic cyber-culture topics. Halfway through most of these blog posts I mostly lose interest, and I suspect you have as well.

I’ve had a post in my “write this soon list,” about digging in deeper and striving for a more rich engagement with the topics I try and cover here, and I’ve pretty much failed with that. In any case, this post was supposed to be more about the things that are on my schedule:

I’ve been doing a lot offline these past few weeks. It’s May and that means it’s Morris Dancing season. I seem to have joined an interesting phenomenon called “Maple Morris” (more reflection on that when I’ve processed a bit more), and there are the usual Mayday festivities and the Midwest Morris Ale. And then there are a bunch of singing conventions, which are a great deal of fun and fulfilling, and then there are contra dancing things, but none of these things translate to quiet weekends alone writing. Or even quiet evenings around writing. At least very often. Some highlights of the recent past and near future:

  • Maple Morris; A my-generation Morris dancing event, last weekend in Boston. A bunch of Morris dancers in my general age range got together to dance some really challenging dances and to sing great songs. I was totally overwhelmed.
  • The Midwest Morris Ale; My regular annual Morris dancing ale. This is my 9th consecutive ale (and my 10th anniversary of dancing Morris.)
  • Since last September, I’ve gone to an all- or multi-day Sacred Harp singing convention most months, and there’s one on my calendar every month between now and this September.
  • I’m going to “Youth Dance Weekend” in Vermont in September, which I’ve never been to, but I think it’ll be a great deal of fun, and I’m very much looking forward to it. I’ve not been contra dancing as much, but that’s not a huge problem for me.
  • I’m moving to Philadelphia in the summer, which means a drastically longer commute, but an easier to orchestrate social life, and a better work/life balance. This means apartment hunting and all that jazz.

While this means less writing time and time for taking care of my own projects, it doesn’t mean that I don’t have any writing time. Sure writing takes time, but the largest challenge as a writer is in using the time I/we already have effectively, and getting the most out of those opportunities.

It also, I think, means finding a way to develop a writing (and blogging) habit that:

  • Doesn’t revolve around a fixed daily publication schedule. I still want to write essays, but I need to write essays when I have a compelling argument for an essay, rather than around the same core of ideas that I’ve been running around for the last year.
  • I need to be able to put the blog on the back burner while I focus on things like writing fiction, or hacking projects, or Cyborg Institute stuff. The blog is great, and I love writing the blog, but it’s far too easy for me to fall into a pattern where the blog becomes the project, rather than the journal in support of the project.
  • I need to organize my projects and tasks into clumps of work that are easier to manage in shorter periods of time. This is probably a reorganization problem that needs to mostly occur within my head.

So where does that leave us? I have a few posts piled up that I’ll parcel out over the next few weeks, though on the whole there will probably be less posting by me around here. I’m probably going to do more posts along the lines of “here’s what I’ve been up to, go read my work elsewhere.” There will be some guest posts and I’ve already begun working with some writers for that. Beyond that, I guess we’ll both be able to be surprised.

Knowing Mars, a Novella

I don’t know about this. But here it goes, anyway.

I’m pleased to announce the complete publication of my novella “`Knowing Mars <http://tychogaren.com/mars/>`_” on `tychogaren.com <http://tychogaren.com/>`_.

“Knowing Mars” is an important story for me. I wrote it after I graduated from college, after I didn’t go to graduate school the first time, and in a lot of ways it was the project that got me started down the path of being a “real writer,” post-graduation. I’d written fiction before college, and mostly avoided writing fiction in college, and then right as I was finishing college I started writing stories again. It was strange for a while, but it was delightful to be able to tell stories and be so much better at it than I was the first time.

I find that this is fundamentally a recurring issue. When I started writing “Knowing Mars,” I felt like I was starting out light years ahead of what I had written four or five years before. Now, I feel like the stuff I’m working on now is light years ahead of “Knowing Mars.” This is probably an encouraging sign.

I suppose you’d like to know a bit more about the story. I’ll leave most of the details to the reading, but basically it’s a sort of superhero/cyberpunk story that explores themes related to diaspora, political organization, historical narrative, and gender. I don’t know if I’ve ever described it as such before.

The story is available in multiple formats. Each chapter is available in full HTML as part of tychogaren.com and in plain Markdown-formatted text. Furthermore, the complete text of the novella is available in both of these formats, and in a simple un-styled HTML version that should be ideal for conversion to various electronic reading platforms. If you want to read the novella but would find another format easier to process, talk to me about it and I’ll get something pulled together for you.

All “full html” versions of the text have comments enabled using the same system as the blog. I look forward to your comments. Thanks for reading, and stay tuned for more fiction.

Onward and Upward!

In Favor of Unpopular Technologies

This post ties together a train of thought that I started in “The Worst Technologies Always Win” and “Who Wants to be a PHP Developer” with the ideas in the “Ease and the Stack” post. Basically, I’ve been thinking about why the unpopular technologies, or even unpopular modes of using technologies are so appealing and seem to (disproportionately) capture my attention and imagination.

I guess it would first be useful to outline a number of core values that seems to guide my taste in technologies:

  • Understandable

I’m not really a programmer, so in a lot of ways it’s not feasible to expect that I’d be able to expand or enhance the tools I use. At the same time, I feel like even for complex tasks, I prefer using tools whose workings I have a chance of understanding. I’m not sure if this creates value in the practical sense; however, I tend to think that I’m able to make better use of technologies when I understand their fundamental underpinnings.

  • Open and Standard

In a way that flows from “understandable,” I find open source and standardized technology to be more useful. Not in the sense that open source technology is inherently more useful because source code is available (though sometimes that’s true), but more in the sense that software developed in the open tends to have a lot of the features and values that I find important. And of course, knowing that my data and work are stored in a format that isn’t locked into a specific vendor allows me to relax a bit about the technology.

  • Simple

Simpler technologies are easier to understand and easier--for someone with my skill set--to customize and adopt. This is a good thing. Fundamentally most of what I do with a computer is pretty simple, so there’s not a lot of reason to use overly complicated tools.

  • Task Oriented

I’m a writer. I spend a lot of time on the computer, but nearly everything I do with the computer is related to writing. Taking notes, organizing tasks, reading articles, manipulating texts for publication, communicating with people about various things that I’m working on. The software I use supports this, and the most useful software in my experience focuses on helping me accomplish these tasks. This is opposed to programs that are feature or function oriented. I don’t need software that could do a bunch of things that I might need to do, I need tools that do exactly what I need. If they do other additional things, that’s nearly irrelevant.

The problem with this is that although these seem like fine ideals and values for software development, they are fundamentally unprofitable. Who makes money selling simple, easy to understand software with limited niche-targeted feature sets? No one. The problem is that this kind of software and technology makes a lot of sense, and so we keep seeing technologies that have these values that seem like they could beat the odds and become dominant, and then they don’t. Either they drop task orientation for a wider feature set, or something with more money behind it comes along, or the engineers get bored and build something that’s more complex, and the unpopular technologies shrivel up.

What to do about it?

  • Learn more about the technologies you use. Even, and especially, if you’re not a programmer.
  • Develop simple tools and share them with your friends.
  • Work toward task oriented computing, and away from feature orientation.

The Worst Technologies Always Win

This post is the culmination of two things:

1. Who wants to be a PHP Developer?

2. An ongoing conversation I’ve had with a number of coworkers about the substandard technologies that always seem to triumph over the “better” options.

The examples of the success of inferior technologies are bountiful. MySQL’s prevalence despite some non-trivial technical flaws (around clustering, around licensing as highlighted by the Oracle merger); PHP as the de facto glue language of the web despite the fact that every other language in its class is probably a better programming language (e.g. Python, Perl); VHS over Betamax; Blu-ray (which is proprietary and a physically less durable medium) over HD DVD; and so forth.

The factors are (of course) multiple:

  • Marketing.

People have to know about technologies at some stage in their development if the technology is to take off. I’m not sure what that crucial point is, and frankly marketing is something that not only do I not understand, but I don’t really think anyone understands. Having said that, I think it’s clear that technologies don’t compete simply on their technical merits, and this factor is a recognition of that.

  • Timing.

When a technology appears, relative to the availability of alternatives and the needs and interests of the market, matters a great deal, and can sometimes tip the balance between competing technologies. Arguably MySQL beat PostgreSQL not because it was better, but because it existed in a firm way a little bit earlier. Linux “won” market share over BSD, because BSD wasn’t quite fully free/open source (or available) in 1990-1992 when Linux was taking off.

  • Momentum.

A project that doesn’t look like it has energy and a large team behind it is probably doomed to fail on some level, not because it’s a bad technology, but because potential users of a technology need to feel confident that it’s going to stick around. If no one is excited about it, then it’ll never win, even if it’s superior in the final analysis.

There’s nothing, really, to be done. I think the truly superior technologies might benefit by paying attention to these factors when it matters, but then the developers of said technologies are probably less interested in marketing and proper timing than the competition. Which is perhaps as it should be.

I guess the lingering questions that I’ll leave you with relate to thinking about the ways that open source and free software relate to other technologies. It strikes me that while there’s a pretty good balance between open source and proprietary technologies in the examples I provided above, all of the open source technologies were commercialized very early on and very intensely in a way that none of their competitors really were. Are “winning technologies” a mystification of the proprietary technology world? Can community-based open source and free software technologies innovate and “win” in relation to their competitors?

I look forward to sorting out the answers in the comments. Onward and Upward!

Who wants to be a PHP Developer?

So PHP is this programming language that’s widely used, and often reviled by systems administrators and people who fancy themselves “real programmers.” And yet, I think, while the “real programmers” were busy being “real,” PHP got something very fundamental right that explains its success despite the disdain.

I should interject with some context. First, I think this is another in my ongoing series of posts regarding linguistic relativism and computer programming. Second, for those of you who don’t spend your days in this space, PHP is a programming language designed specifically for use in the context of the web, and it has only comparatively recently emerged as a possibility for “general programming tasks,” in contrast to other languages in “the space” (Ruby, Python, Perl, etc.) which started as general purpose languages and have since become common for web programming. Also, as a computer language, there’s nothing particularly innovative about PHP, which earns it no small amount of ire.

So here’s the thing. PHP is easy. It’s designed to be easy. The syntax is familiar to people who know even a little Perl or other C-like languages. Although the language has had object oriented support for several years, most PHP applications aren’t written in an object oriented manner, and in a number of contexts that makes things a bit easier to understand.

And here’s the thing that I seem to notice in the context of administration: compared to other languages and frameworks, PHP is dead simple to deploy. Sure, everything under big loads becomes complex, and sure PHP applications consume more server resources than perhaps they should, but basically you configure a web-server to process PHP code, and then you write your code, inside of your page, and the web server generates what you need it to, and it just works. You don’t have to screw around with writing boilerplate CGI stuff, you don’t have to screw around with cgi-bins/ and script aliases which were never intuitive, you don’t need special servers, it just works.

And I already know that someone is going to tell me that there’s a Perl module that lets you use Perl in the same way, or that Python and Ruby don’t make you write CGI boilerplate either (or that there’s a Perl module to write the CGI boilerplate). And I know these things, but I’m not sure that it matters anymore. PHP, as a language, is written around the needs of web development, and there’s merit in that.

I’m not saying, “go forth and write your next application in PHP”; I don’t even know if dynamic web applications are worth writing anymore. I am saying that despite all of the dreck in the PHP space, there are some things that are incredibly worthwhile that the current generation of web developers may miss.

That is all. For now. Onward and Upward!

Ease and The Stack

As if I needed a new project, this post introduces a new project that’s floating around in my mind. I was having a conversation with a friend about how I use the computer, and I realized that while I’ve talked about various elements of how I use computers (the short story: peculiarly), I’ve not really talked about the holistic experience. As I started to talk about the various components and how they connect and work together, I realized that without an example it was about as clear as mud.

So, in light of this, I’ve decided to make a “tychoish stack,” which won’t be anything particularly novel, but a repackaging of the software--mostly configurations and little bits here and there--that I use on a daily basis. My stumpwm configuration. The highlights of my Emacs configuration, and a few install scripts to make it all work together. An SSH configuration file that will make your life much easier, a list of packages that you’ll want to install on common operating systems (Debian/Ubuntu and Arch Linux), and--because I am who I am--a fair piece of writing about best practices and how to use these tools effectively.


I was talking with Chris the other day in one of our never-ending conversations regarding his ever-changing choices of desktop operating systems. “Windows just feels more polished to me right now,” he said after a stint with the latest Ubuntu or one of its derivatives.

To which I said, “of course it does.” They pay bunches of people lots of money to make sure that Windows is polished, and it’s a high priority given the failure of Vista and the direction of the market. The reason I use Linux full time is not because I want a better, more polished experience. I use Linux full time because I want something very specific: a window manager that stays out of my way and doesn’t distract me with “chrome,” Emacs buffers that run in the way that I expect them to, package management tools that allow my system to work and function day in and day out, the ability to customize all of these functions to suit the evolving needs of my work, and nothing else that I have to mess around with.

This is something that I can only get from a UNIX system, and the more I play with different systems, the more I’m inclined to think that the only way to get this is with Arch Linux. But that’s just me. I’ve played with a bunch of different operating systems (that’s an aspect of my day job) and I’ve spent time using OS X, and it just doesn’t work for me. I don’t need smooth, I don’t want polish, I just want something that lets me work.

Different computer systems make sense for different people. There’s no problem with that assertion, I think.


The problem, of course, is that my setup doesn’t scale. I can go from a bare Arch installation to a working version of my system in a few hours by using rsync to copy over my home directory, updating a few git repositories, installing a list of packages, and creating a half dozen symbolic links, but building this from the bottom up would take a long time.
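As an illustration of the kind of install script such a stack might include, here is a minimal sketch of the symbolic-link step only. The `link_dotfiles` function and the example mapping are hypothetical; the post doesn't list the half dozen links or where their targets live. It assumes a POSIX system and deliberately refuses to overwrite anything that already exists.

```python
import os
from pathlib import Path


def link_dotfiles(links, home):
    """Create each symlink in `links` (link name -> target path) under `home`.

    Existing files or links are left alone, so the script is safe to re-run.
    Returns the list of link names that were actually created.
    """
    created = []
    for name, target in links.items():
        link = Path(home) / name
        if link.exists() or link.is_symlink():
            continue  # never clobber something already in place
        link.parent.mkdir(parents=True, exist_ok=True)
        os.symlink(target, link)
        created.append(name)
    return created
```

For example, `link_dotfiles({".emacs": "/path/to/repo/emacs-config.el"}, Path.home())` (with a made-up repository path) would link one file and report it. The rsync, git, and package-installation steps would sit alongside this in the same spirit.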

The goal of this project then, is to make that process easier. I’ve done a bunch of work to get a setup that does what I need it to do, I know which applications work, I know how to plug everything together to make it easier to manage. I want to build a stack so that you all can take it, learn from what I’ve done, and spend the time customizing it to what you do, rather than going through the trouble of building it up yourselves.

How’s that sound?