Pieces of the Data Puzzle

I'd like to talk a bit about what I've come up with in terms of data management/research organizational tools. So in my last post on this subject I mentioned that I had hacked together a shell script that did a lot of what I needed. Rather than look for the "one right" OS X tool (which I don't think really exists at this point,) I've worked on collecting extant tools and programs and figuring some way of making something that would really continue to work into the future and that would be more tailored to my particular workflow.

I'm going to outline and present what I'm doing for the public betterment anyone else looking for something like this. Also I hope that people with a similar sort of need/workflow might be able to contribute tools or offer enhancements.

Let me outline the pieces of the puzzle, first:

  • OS X/UNIX shell. [1] Having access to all of these great unix tools makes it reasonably easy to write scripts to automate the key parts of this workflow. Also, as most of my system reilies on Apache (which comes installed on OS X) and web servers, having this kind of low level access to the system is great.
  • ikiwiki. So this is a great little program that takes a directory of text files, and turns them into a wiki/website via a markdown interpreter. It also connects and automates through subversion, an open source version management tool (it can also use git and others if that's more your speed.) This part of the project is probably the geekiest and most difficult part to get working, particularly if you don't have access a good package installer. But it's possible and totally worth it.
  • Several bash/shell scripts that I have written to insert data and clips into the wiki.
    • One script that, with a command in the form of, clip [FileName] [Space delinated tags] will create a page in a research clipping section of the wiki with the contents of the clipboard, and the proper notation for tag organization and the date of collection. The script then opens the file in the text editor. I wish it could capture, more automatically the citation information (author/url) but I think this would require the browser to expose more information than it currently does. But it's pretty good.
    • One script that create new "tag indexes," which makes it easier to see all the pages tagged with specific labels. If I were more clever I could probably tie the scripts together so that whenever I added a new tag that didn't already exist that it would generate the new tag page. Except that this wouldn't cover all instances of new tag pages, so it's ok to have separate tag pages. This also helps control "tag sprawl," and prevent metadata from getting out of control.
  • A set very quick functions that let me append text any text file, as well as a quick command to append lines to the end of a general "inbox" or "codex.txt" file for quick thoughts, notes, todos, and tasks. This is outside of the wiki and not a new tool, but it works.
  • Fluid - This is a really nifty program that uses OS X tools to build "programs" built around single, site specific websites. Basically this is the ideal bridge between "web apps" and "desktop apps," particularly once Google Gears begins to work with this kind of app [2] amazing things are going to happen. Since I'm running the wiki locally, this is moot. There are other non-mac options like Mozilla Prisim, though I don't have experience with it. This makes the wiki more useable and open I think.

Here's a file with the relevant scripts and ikiwiki templates that I'm using. It's rough and kludgy, but if people are interested or willing to contribute (feedback, knowhow, etc.), I'd certainly be willing to work on making this more polished and accessible.

ETA: I just discovered bsag's textmate plugin, which might be a little more prime-time ready than my script, but I think with ikiwiki's tagging function and potentially a recourse to spotlight/etc indexing, my solution works better for me. But, bsag is awesome and the text mate bundle is totally way more hard core than my kludge.

[1]If you use a windows machine, or don't have easy access to a shell, my recommendation is that rather than fight windows to try and get all these things installed and working, I would see about getting server space somewhere where you'd get hosting that would include shell/ssh access in a UNIX environment. Probably the cost is pretty low, and you could use an account like this to back up data, host your email, and so forth.
[2]I've written here before a couple of times about how much I really dislike the experience of using online/web applications. Programs like this really solve this complaint. My major complaint is still the fact that there aren't good offline options for web apps, which is why I mention Gears. This is mooted by the fact that in this instance we're/I'm running the server locally, so even without a connection the wiki is accessible.
comments powered by Disqus