Is Dropbox the Mobile File System Standard?

I've started using Dropbox on my Android devices recently (and my laptop as a result, [1]) and I'm incredibly impressed with the software and with the way that this service is a perfect example of the kind of web services that we need to see more of. While I have some fairly uninteresting concerns about data security and relying on a service that I'm not administrating personally, I think it's too easy to get caught up in the implications of where the data lives and forget the implications of having file syncing that "just works" between every computer.

I used to think that the thing that kept mobile devices from being "real" was the fact that they couldn't sell "post-file system" computer use. I'm not sure that we're ready to do away with the file system metaphor yet. I think Dropbox is largely successful because it brings files back and makes them available in a way that makes sense for mobile devices.

The caveat is that it provides a file system in a way that makes sense in the context of these kinds of "file systemless" platforms. Dropbox provides access to files, but in a way that doesn't require applications (or users) to have a firm awareness of "real" files. Best of all, Dropbox (or similar) can handle all of the synchronization, so that every application doesn't need to have its own system.
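
To make that last point concrete, here's a minimal sketch of what an application looks like when something else owns the syncing. Everything here is hypothetical: `save_note`, `load_note`, and the idea of a `sync_dir` folder that a sync client happens to watch are my own illustrative names, not any real Dropbox API. The point is that the application only does ordinary file I/O.

```python
import tempfile
from pathlib import Path


def save_note(sync_dir: Path, name: str, text: str) -> Path:
    """Write a note as a plain file inside a (hypothetical) synced folder.

    The application knows nothing about syncing: it assumes some other
    program watches sync_dir and copies changes to other devices.
    """
    sync_dir.mkdir(parents=True, exist_ok=True)
    path = sync_dir / f"{name}.txt"
    path.write_text(text, encoding="utf-8")
    return path


def load_note(sync_dir: Path, name: str) -> str:
    """Read a note back; it may have been written on another device."""
    return (sync_dir / f"{name}.txt").read_text(encoding="utf-8")


if __name__ == "__main__":
    # A temporary directory stands in for the synced folder here.
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp) / "notes"
        save_note(folder, "todo", "write more posts")
        print(load_note(folder, "todo"))
```

There's no network code, no conflict handling, and no account setup in the application itself, which is exactly the appeal.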

This might mean that Dropbox is the first functionally Unix-like mobile application. I think (and hope) that Dropbox's success will prove to be an indicator for future development. Not that there will be more file syncing services, but that mobile applications and platforms will have applications that "do one thing well," and provide a functionality upon which other applications can build awesome features.


This isn't to say that there aren't other important issues with Dropbox. Where your data lives does matter, and who controls the servers that your data lives on is important. Fundamentally, Dropbox isn't doing anything technologically complicated. When I started writing the post, I said "oh, it wouldn't be too hard to get something similar set up," and while Dropbox does seem like the relative leader, it looks like there is a fair amount of competition. That's probably a good thing.

So despite the concerns about relying on a proprietary vendor and about trusting your data on someone else's server, data has to go somewhere. As long as users have choices and options, and there are open ways of achieving the same ends, I think that these issues are less important than many others.

[1]To be fair, I'm using it to synchronize files to the Android devices, and not really to synchronize files between machines: I have a server for simple file sharing, and git repositories for the more complex work. So it's not terribly useful for desktop-to-desktop sharing. But for mobile devices? Amazing.

I want ZFS in the Kernel

The background:

Sun Microsystems developed this file system called "ZFS," which is exceptionally awesome in its capabilities and possibilities. The problem is that it was developed and released as part of the OpenSolaris project, which has a licensing incompatibility with the Linux Kernel. Both are open source, but there is a technical (and not altogether uncommon) conflict in the terms of the licenses that makes it impossible to combine code from both licenses in a single executable.

Basically the GPL, under which the Linux Kernel is distributed, says that if you distribute a binary (executable) under the terms of the GPL, you have to make the source code available, and "the source code" means all of the files that you used to make that binary. By contrast, ZFS's license says "here are all the files that we used to make this binary; if you change them when you make your binary and give that binary to other people, you have to give them your changes, but if you add additional files, you don't have to give those out to people."

Apparently the idea behind the ZFS license (i.e. the CDDL, and the MPL from which it derives) is that it allows for easier embedding of ZFS (and other technologies) in proprietary code, because the resulting binary isn't covered by the CDDL in most cases. Even though the CDDL is incredibly confusing, apparently it's more "business friendly," but I digress from my original point.

And so if Linux users want to run ZFS, they have to run it as a user-space process (i.e. not in the kernel), which is suboptimal, or they have to run Solaris in a virtualized environment (difficult), or something. There's also a ZFS-like file system called "btrfs," which can be included in the kernel (interestingly, developed by Oracle, who of course now own ZFS itself), but it is not production ready.

What I'm about to propose is an end run around the GPL. It seems to me that combining the source code violates neither license, and distributing source code violates no license. Compiling the source code for your own use violates no license. I mean it's annoying and would require a bit of bootstrapping to get a Linux+zfs system up and running, but this is the kind of thing that Gentoo Linux users do all the time, and isn't a huge technological barrier.

It feels a bit creepy of course, but I think it works. The logic has also been used before. We'll call it the "PGP loophole."

PGP is an encryption system that's damn good: so good, in fact, that when it was first released, there were export restrictions on the source code because it qualified as military-grade munitions in America. Source code. How peculiar. In any case there were lawsuits, and PGP source was released outside of America by printing it in a book, which could be disassembled and scanned into a computer and then compiled. Books were never and--as far as I know--are not classified as munitions, and so they could be exported. Of course I'm not a lawyer, but it strikes me that Linux+zfs and PGP in the 1990s may be in analogous situations.

And I think, because this proposal centers around the distribution of source code and only source code, this kind of distribution is fully within the spirit of free software. Sure it's pretty easy, even for the "good guys," to run afoul by distributing a binary, but this would be easy to spot, and there are already suitable enforcement mechanisms in place: for the Linux kernel generally, and Oracle's legal department, which we can assume will take care of itself.

Or Oracle could release ZFS under GPL. Either solution works for me.

Systems Review

I wrote in my post on the one true system about the informal systems that we use to interface the way we interact with knowledge and information in the "real world" with the way we represent that information on our computers. Exploring these systems lies at the core of the cyborg question, but today's essay [1] is more about how our logic systems adapt as we use computers and as the kinds of information we need to store change and grow.

As near as I can tell there are a few kinds of "systems review" that we tend to do. Theoretically if you develop a system that's flexible and that accounts for all of your future information needs, then you shouldn't have to modify your system very much. Theoretically this is a good thing: better to spend time "doing things," rather than thinking about "how you're going to do things."

The sad truth is that this doesn't work out very well pragmatically: we change our work habits, and our information changes, and our projects change, and our informal logic for interacting with our computers fails to address the problems, and eventually everything spirals out of control. This is pretty abstract, but every time you see hundreds of icons stacked on your (or someone else's) desktop, or you find yourself with hundreds of unsorted (and unread!) email messages, or you have to hunt through half a dozen places for a PDF, you are witnessing the symptoms of a flawed system.

The only way to address this is to review your "systems," and make sure that you capture any problem before information spirals out of control:

  1. Have an overflow bin, but only one overflow bin. This is important, but counterintuitive. By overflow bin, I mean somewhere that unfile-able items are placed. This hopefully alleviates the pressure to file away information that hasn't been fully processed, or that doesn't fit into your system, or might be ambiguously filed [2].

  2. Do major reviews of your system only infrequently. By major review, I mean, think about how you use your information, what has worked, and what hasn't, and then use this as a model for revising your system. Don't do it regularly, even if you know that something isn't working. Think of this as something that you do only about twice as often as you get a new computer. As part of this major review process:

    Keep a regular journal when you aren't in the process of updating procedures. Keep track of what works and what doesn't. Often I've found that I have ideas about how things should change, but the changes aren't the kind of thing that I could reasonably make during normal work. These insights/problems are useful eventually, even if they aren't always immediately relevant. So record them for later.

  3. Do minor reviews regularly. Look in the "overflow bin" from item one and see what's falling through the cracks, and file things that do need to be filed. The GTD folks call this a "weekly review," although GTD-task processing is only part of "the system." [3] It depends on what kind of information you're managing, but staying on top of and in touch with your "stuff" is important.

  4. Be sure to "touch" your information regularly. While I'm in favor of keeping information even when it's not apparently useful (you never know), I also agree with the idea that information is only really useful if you use it. I've often found myself falling into the trap where I'll stockpile stuff "to read later," which of course rarely happens. Avoid this and browse from time to time.
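
For what it's worth, the minor review in item three is easy to prompt with a few lines of Python. This is only a sketch under my own assumptions: that the overflow bin is a single directory of files, and that "falling through the cracks" means "not modified in about a week." The name `stale_items` and the seven-day default are illustrative, not part of any established tool.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def stale_items(bin_dir, max_age_days=7, now=None):
    """Return the names of files in bin_dir last modified more than
    max_age_days ago, sorted alphabetically.

    `now` is a Unix timestamp; it defaults to the current time and is
    only a parameter so the function is easy to test.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    return sorted(
        p.name
        for p in Path(bin_dir).iterdir()
        if p.is_file() and p.stat().st_mtime < cutoff
    )


if __name__ == "__main__":
    # Hypothetical location for the overflow bin; adjust to taste.
    inbox = Path.home() / "inbox"
    if inbox.is_dir():
        for name in stale_items(inbox):
            print("needs filing:", name)
```

Run from a weekly reminder, something like this only tells you what has been sitting too long; the actual filing is still up to you.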

I mean in the end, I'm just a guy who writes more than he should, and has a pile of digital information that's probably a bit too big, but this is how I do things, and I think the lessons I've learned (and continue to learn) may be helpful to some of you. Reviewing and thinking about systems beforehand is, if nothing else, instructive.

Onward and Upward!

[1]I use the word "essay" under the terms of its slightly less common meaning "to make an attempt," rather than the terms that describe a genre of writing.
[2]Borrowing from the Python programming language motto, somewhat, "Every bit of information in your system should have one, and ideally only one, obvious location." Now of course, we can categorize information on many different axes, so the key isn't to pound data/information; it's to build your system around a consistent axis.
[3]The system, for me, represents everything from the way we store bookmarks online, to notes that we collect as we work, to tasks and other time-sensitive data, to the way that we store resources like PDFs and papers. Though we don't have "one" system for all these things, and we're not likely to revise them all at the same time, on some conceptual level it's all the same thing.