Git Mail

"So basically, what I'm doing is, sorting the email on the server in a git repository, and then pushing and pulling to that as needed, or just working their over ssh." I explained

"So what you're saying, is that you've basically reinvented IMAP," Chris said.

"Yeah, pretty much, except that this works," I said.

"If you say so."


I do say so. So this is what it comes down to:

IMAP got one thing right: we need a way of accessing our email that works on public computers, is machine independent, that works offline and online, and that keeps all these mail reading environments synchronized.

The problem is that IMAP is incredibly flakey and inconsistent. Messages that you've read suddenly become unread, messages that you've moved suddenly pop back into your inbox, it's slow, and if you don't have the right mail client and a server that's tweaked in the right way, it might not really work at all.

If IMAP worked as well as it could, we'd all use it because, ideally it's the best way to manage email. Everything is stored remotely but cached locally for offline use and backup, you can use multiple machines without worry. Instead I suspect most people who need these features use webmail if they need multi-machine email accounts,[^webmail] and that's great, if it works for you. Gmail and the like are great pieces of software, I don't mean to begrudge webmail, I just don't enjoy the experience of the browser, and seek to avoid it except for browsing.

So while I'd given up on email that synced really well in the last couple of months (preferring to use mutt and procmail locally,) I still longed for this kind of email set up.

So I thought a bit and figured some things out, and here's what I came up with. (A step by step explanation of how I get email):

1. A bunch of email addresses (for different contexts) are forwarded to a gmail account, which despamifies my email, and uses it's own filters to sort and forward mail (using +address aliases) to a secret email account on the web server.

2. The webserver uses procmail to sort and deliver the email into a non-public (obviously) folder/git repository.

3. I have a series of scripts to manage the git/mail interaction depending on if I'm on an SSH connection or not, mostly this is straightforward, but here's how the sync works (commands from the local perspective in parenthesis):

  1. via ssh I add and commit any new mail that may have arrived since the last sync to the repository on the server:

    ssh <foo> git add * && git commit
    
  2. Locally I commit any change that's been made to my mail directory since the last sync, and pull down the new mail from the server:

    git add * && git commit && git pull
    
  3. Push my changes up, and everything merges and the repositories look the same now. and I reset the index of the remote server so to reflect the changes: [1]

    ssh <foo> git push && ssh <foo> git reset --hard
    
  4. Rejoice. [2]

  5. Set up a cronjob or a launchd daemon to do this automatically every, say 15 minutes, and then you get the illusion of having "push" email.

    If you get growl to notify you of the results of the pull (end of step 2) you'll be able to see whether there's new mail or not.

    I think you could probably set much of this up in post-commit/post-push hooks. And I think it goes without saying that if there are more than one "client" repository in play, you'll want to pull changes more often than you sync.

So why would I (and you?) want to do this? Easy:

  • It's quick. Git is fast, by design, and since email files have a lot of redundancy in the headers and what not, git can save a lot of space in the "tube" and speed up your email downloads.

  • It's robust and flexible. If my server stooped working for some reason, I could set up another, or simply start using fetch-mail again. If the server went back up, I just push the changes up, and I'm back in business.

  • Git handles renames implicitly, and this is a really cool feature that I don't think gets quite enough mention. Basically if I move or rename a file, and add the new file to git, (git add * does this) git realizes that I've moved the file. so as files move and get named other things, git realizes that it's happened, deals with it and moves on.

  • Understandably, if you delete a file locally git won't delete it from the repository "head" unless you tell it to. This bash script takes care of that:

    for i in `git status | grep deleted`; do
       git rm --quiet  $i;
    done
    
  • It's secure, or at least is if you do it right. All the email gets downloaded over SSH, so no clear text passwords (if you're using public key authentication) and encrypted data transmission.

  • Oh yeah, and it's not flakey like IMAP. Totally worthwhile.

  • If there is some sort of syncing problem, you know where it is, as opposed to IMAP making an executive decision without telling you and un-sorting all your email.

Any questions?

Onward and Upward!

[1]Git is really good at merging, and because it can track renaming implicitly, it does pretty well in this situation. There are some things you can do to basically ensure that you never have a conflicted merge (because after all, you're never really editing these files, and so rarely concurrently. for the most part). Remember after pushing to the server, always reset it's index. Under normal conditions we'd not push to a repository that had an index, so this usually isn't an issue, but here where it's important that the server's repository have files, keeping that index "right" means you won't undo your changes in successive commits.
[2]My first instinct was to focus on pushing changes up early, and it turns out that this is the wrong thing to do, as it means potentially that the state of the upstream repository could be wonky when you want to pull. Pulling before you push, locally, prevents having more than one kind of change in any one commit, and makes the whole situation a bit more error proof.
comments powered by Disqus