6 Awesome Arch Linux Tricks

A couple of years ago I wrote “Why Arch Linux Rocks” and “Getting the most from Arch Linux.” I’ve made a number of attempts to get more involved in the Arch project and community since then, but mostly I’ve been too busy working (and using Arch to do that work) to contribute much. Then a few weeks ago I needed to do something minor with my system--I forget what--and found myself thinking “this Arch thing is pretty swell, really.”

This post is a collection of the clever little things that make Arch great.


abs

I’m using abs as a macro for all of the things about the package build system that I enjoy.

Arch packages are easy to build for users: you download a few files, read the bash script in the PKGBUILD file, and run the makepkg command. Done. Arch packages are also easy to specify for developers: just define a “build()” function and some variables in the PKGBUILD file.
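
For a sense of the scale involved, here’s a minimal sketch of a PKGBUILD for a hypothetical autotools-based project; the package name, URL, and checksum handling are placeholders, not a real package:

# Maintainer: Your Name <you@example.com>
pkgname=examplepkg
pkgver=1.0
pkgrel=1
pkgdesc="A placeholder package, to show the shape of a PKGBUILD"
arch=('i686' 'x86_64')
url="http://example.com/"
license=('GPL')
source=("http://example.com/$pkgname-$pkgver.tar.gz")
md5sums=('SKIP')  # or generate real checksums with "makepkg -g"

build() {
  cd "$srcdir/$pkgname-$pkgver"
  ./configure --prefix=/usr
  make
  make DESTDIR="$pkgdir" install
}

Run makepkg in the same directory, and pacman can install the resulting package.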

Arch may not have as many packages as Debian, but I think it’s clear that you don’t need comprehensive package coverage when making packages is trivially easy.

If you use Arch and you don’t frequent the AUR, or if you ever find yourself doing “./configure; make; make install”, then you’re wasting your time or jeopardizing the stability of your server.

yaourt

The default package management tool for Arch Linux, pacman, is a completely sufficient utility. This puts pacman ahead of a number of other similar tools, but to be honest I’m not terribly wild about it. Having said that, I think that yaourt is a great thing. It provides a wrapper around all of pacman’s functionality and adds support for AUR/ABS packages in a completely idiomatic manner. The reduction in cost of installing this software is quite welcome.

It’s not “official” or supported, because it’s theoretically possible to really screw up your system with yaourt, but if you’re cautious, you should be good.

yaourt -G

The main yaourt functions that I use regularly are “-Ss”, which provides a search of the AUR, and the “-G” option. “-G” just downloads the tarball with the package specification (e.g. the PKGBUILD and associated files) from the AUR and untars the archive into the current directory.

With that accomplished, it’s trivial to build and install the package, and you get to keep a record of the build files for future reference and possible tweaking. Basically, this is the way to take the tedium out of getting packages from the AUR, while giving you more control and oversight of package installation.
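
In practice, for a hypothetical AUR package (the name here is just an example), the whole cycle looks something like:

yaourt -G some-aur-package    # fetch and untar the build files into ./some-aur-package
cd some-aur-package
less PKGBUILD                 # read what you're about to build
makepkg -si                   # build the package and install it (with dependencies) via pacman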

rc.conf

If you’ve installed Arch, then you’re already familiar with the rc.conf file. In case you didn’t catch how it works, rc.conf is a bash script that defines certain global configuration values, which in turn control certain aspects of the boot process and process initialization.

I like that it’s centralized, I like that you can do all kinds of wild network configuration in the script, and I like that everything is in one place.
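
A stripped-down rc.conf looks something like the following sketch; the values here are illustrative rather than a recommendation:

# fragment of /etc/rc.conf
LOCALE="en_US.UTF-8"
TIMEZONE="America/New_York"
HOSTNAME="ganymede"

# static network configuration; leave address empty to use DHCP
interface=eth0
address=192.168.1.50
netmask=255.255.255.0
gateway=192.168.1.1

# daemons started at boot, in order; @ means "start in the background"
DAEMONS=(syslog-ng network netfs @crond @sshd)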

netcfg

In point of fact, one of the primary reasons I switched to Arch Linux full time was the network configuration tool, netcfg. Like the rc.conf setup, netcfg works by having network configuration files (profiles) which define a number of variables that are sourced by netcfg when initiating a network connection.

It’s all in bash, of course, and it works incredibly well. I like having network management easy to configure, and setup in a way that doesn’t require a management daemon.
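
A wireless profile in /etc/network.d/ is just a file of variable assignments along these lines (the ESSID and key are obviously placeholders); bring it up with “netcfg <profile-name>”, where the profile name is the file name:

CONNECTION='wireless'
DESCRIPTION='Example home wireless profile'
INTERFACE='wlan0'
SECURITY='wpa'
ESSID='example-network'
KEY='not-my-real-passphrase'
IP='dhcp'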

Init System

Previous points have touched on this, but the “BSD-style” init system is perfect. It works quickly, and boot-ups are stunningly fast: even without an SSD I get to a prompt in less than a minute, and probably not much more than 30 seconds. With an SSD it’s even better. The points that you should know:

  • Daemon control scripts (i.e. init scripts) are located in /etc/rc.d. There’s a pretty useful “library” of shell functions in /etc/rc.d/functions and a good template file in /etc/rc.d/skel for use when building your own control scripts (see the sketch after this list). The convention is to have clear and useful output and easy-to-understand scripts, and with the provided material this is pretty easy.

  • In /etc/rc.conf there’s a DAEMONS variable that holds an array. Place the names of daemons (corresponding to their /etc/rc.d file names) in this array to start them at boot time. Daemons are started synchronously by default (i.e. the order of items in this array matters, and each control script must exit before the next one runs). However, if a daemon’s name is prefixed by an @ sign, the process is started in the background and the init process moves to the next item in the array without waiting.

    Start-up dependency issues are yours to address, but using ordering and background start-up, this is trivial to manage, and background start-ups lead to fast boot times.
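
Here’s a rough sketch of what a control script built from the skel template looks like; “mydaemon” is a placeholder, and the helper functions (stat_busy, stat_done, add_daemon, and so on) are the ones provided by /etc/rc.d/functions:

#!/bin/bash
# /etc/rc.d/mydaemon -- illustrative only

. /etc/rc.conf
. /etc/rc.d/functions

case "$1" in
  start)
    stat_busy "Starting mydaemon"
    /usr/bin/mydaemon && add_daemon mydaemon && stat_done || stat_fail
    ;;
  stop)
    stat_busy "Stopping mydaemon"
    killall mydaemon && rm_daemon mydaemon && stat_done || stat_fail
    ;;
  restart)
    $0 stop
    sleep 1
    $0 start
    ;;
  *)
    echo "usage: $0 {start|stop|restart}"
esac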

Xen and KVM: Failing Differently Together

When I bought what is now my primary laptop, I had intended to use the extra capacity to learn the prevailing (industrial-grade) virtualization technology. While that project would have been edifying on its own, I also hoped to use that flexibility to do some more consistent testing and development work.

This spawned a Xen laptop project, but the truth is that Xen is incredibly difficult to get working, and eventually the “new laptop” just became the “everyday laptop,” and I let go of the laptop Xen project. In fact, until very recently I’d pretty much given up on doing virtualization things entirely, but for various reasons beyond the scope of this post I’ve been inspired to begin tinkering with virtualization solutions again.

As a matter of course, I found myself trying KVM in a serious way for the first time. This experience both generated a new list of annoyances and reminded me about all the things I didn’t like about Xen. I’ve collected these annoyances and thoughts into the following post. I hope that these thoughts will be helpful for people thinking about virtualization pragmatically, and also help identify some of the larger pain points with the current solutions.

Xen Hardships: It’s all about the Kernel

Xen is, without a doubt, the more elegant solution from a design perspective, and it has a history of being the more robust and usable tool. Performance is great, and Xen hosts can have uptimes in excess of a year or two.

The problem is that dom0 support has, for the past 2-3 years, been in shambles, and the situation isn’t improving very rapidly. For years, the only way to run a Xen box was to use an ancient kernel with a frightening set of patches, or a more recent kernel with ancient patches forward-ported. Or you could use cutting-edge kernel builds with reasonably unstable Xen support.

A mess in other words.

Now that Debian Squeeze (6.0) has a pv-ops dom0 kernel, things might look up, but other than that kernel (which I’ve not had any success with, though that may be me), basically the only way to run Xen is to pay Citrix¹ or build your own kernel from scratch. Again, results will be mixed (particularly given the non-existent documentation), maintenance costs are high, and a lot of energy will be duplicated.

What to do? Write documentation and work with the distributions so that if someone says “I want to try using Xen,” they’ll be able to get something that works.

KVM Struggles: It’s all about the User Experience

The great thing about KVM is that it just works. “sudo modprobe kvm kvm-intel” is basically the only thing between most people and a KVM host. No reboot required. To be completely frank, the prospect of doing industrial-scale virtualization on top of nothing but the Linux kernel with a wild module in it gives me the willies; it’s inelegant as hell. For now, though, it’s pretty much the best we have.
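
For the curious, a minimal smoke test looks roughly like this; the disk image path is a placeholder, the module is kvm-amd on AMD hardware, and on some distributions the binary is called qemu-kvm instead:

sudo modprobe kvm-intel          # pulls in the kvm module as a dependency
lsmod | grep kvm                 # confirm the modules are loaded
qemu-system-x86_64 -enable-kvm -m 1024 -drive file=/path/to/guest.img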

The problem is that it really only half works, which is to say that while you can have hypervisor functionality and a booted virtual machine with a few commands, it’s not incredibly functional in practical systems. There aren’t really good management tools, getting even basic networking configured off the bat is a struggle, and qemu as the “front end” for KVM leaves me writhing in anger and frustration.²

Xen is also subject to these concerns, particularly around networking. At the same time, Xen’s basic administrative tools make more sense, and domUs can be configured outside of interminable, non-paradigmatic command line switches.

The core of this problem is that KVM isn’t very Unix-like. It’s a problem rooted in its core that pervades the entire tool, and it’s probably a product of the history of its development.

What to do? First, KVM does a wretched job of anticipating actual real-world use cases, and it needs to do better at that. For instance, it sets up networking in a way that’s pretty much only good for software testing and GUI interfaces, but sticking the kernel on the inside of the VM makes it horrible for kernel testing. Sort out the use cases, and there ought to be associated tooling that makes common networking configurations easy.

Second, KVM needs to at least pretend to be Unix-like. I want config files with sane configurations, and I want otherwise mountable disk images that can be easily mounted by the host.

Easy right?


  1. The commercial vendor behind Xen, under whose stewardship the project seems to have mostly stalled. And I suspect that the commercial distribution is Red Hat 5-based, which is pretty dead-end. Citrix doesn’t seem to be very keen on using “open source” to generate a sales channel, and also seems somewhat hesitant to put energy into making Xen easier to run for existing Linux/Unix users. ↩︎

  2. The libvirtd and Virt Manager combination works pretty well, though it’s not particularly flexible, and it’s not a simple command line interface with a configuration file system. ↩︎

Ikiwiki Tasklist Update

I added a few lines to a script that I use to build my task list, and for the first time ever, I opened a file with code in it, added a feature, tested it, and it worked. Here’s the code, with enough context so it makes sense (explained later if you don’t want to spend the time parsing it):

# strip the option flags (-c, -p, -s) from the argument list
ARG=`echo "$@" | sed -r 's/\s*-[cps]\s*//g'`
# the first remaining argument is the directory to crawl
WIKI_DIR="`echo $ARG | cut -d " " -f 1`"
# the second is the output file: a full path, a name with an extension,
# or a bare name ($EXT is set at the top of the script)
if [ "`echo $ARG | cut -d " " -f 2 | grep -c /`" = 1 ]; then
   TODO_PAGE="`echo $ARG | cut -d " " -f 2`"
elif [ "`echo $ARG | cut -d " " -f 2 | grep -c $EXT`" = 1 ]; then
   TODO_PAGE="$WIKI_DIR/`echo $ARG | cut -d " " -f 2`"
else
   TODO_PAGE="$WIKI_DIR/`echo $ARG | cut -d " " -f 2`.$EXT"
fi

This is from the section of the script that processes the arguments and options on the command line. Previously, commands were issued such that:

ikiwiki-tasklist [-c -p -s] [DIR_TO_CRAWL] [OUTPUT TODO FILE NAME]

My goal with the options was to have something that “felt like” a normal command with option switches and had a lot of flexibility. For the two fields that followed, however, I didn’t provide as much initial flexibility. The directory to crawl for tasks (i.e. “[DIR_TO_CRAWL]") was specified the way it is now, but the output file was 1) assumed to have an extension specified in a variable at the top of the script, and 2) automatically placed in the top level of the destination directory.

It worked pretty well, but with the advent of a new job I realized that I needed some compartmentalization: I needed to fully use the tasklist system for personal and professional tasks without getting one set of items mixed in with the other. Better control of the output location is key to that.

The modification detects if the output file looks like a path rather than a file name. If it senses a path, it creates the task list at the path specified, with no added extension. If a file name specifies the extension, then you won’t get “.ext.ext” files. And the original behavior is preserved.
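
So, with hypothetical paths (and assuming the extension variable at the top of the script is set to “mdwn”), all three of these now do what you’d expect:

ikiwiki-tasklist ~/wiki tasks               # old behavior: writes ~/wiki/tasks.mdwn
ikiwiki-tasklist ~/wiki tasks.mdwn          # extension given: writes ~/wiki/tasks.mdwn, no doubled extension
ikiwiki-tasklist ~/wiki ~/work/todo.mdwn    # path given: writes exactly ~/work/todo.mdwn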


I’m a hacker by inclination: I take code that I find and figure out how to use it. Sometimes I end up writing or rewriting code, but I’m not really a programmer. My own code, at least until recently, has tended to be somewhat haphazard, and until now (more or less) I’ve not felt like I could write code from scratch that was worth maintaining and enhancing in any meaningful way.

Apparently some of that’s changed.

I’ve made a few additional changes to the scripts, but most of these feel more trivial and can be described as “I learned how to write slightly tighter shell scripts.” If you’re using it you might want to update: the ikiwiki tasklist page is up to date.

Constraints for Mobile Software

This post is mostly just an overview of Epistle by Matteo Villa, which is--to my mind--the best Android note taking application ever. By the time you read this I will have an Android tablet (it’s still in transit as I write this), and that’s a topic that deserves its own post.

Epistle is a simple notes application with two features that sealed the deal:

1. It knows markdown, and by default provides a compiled rich text view of notes before providing a simple notes editing interface. While syntax highlighting would be nice, we’ll take what we can get.

2. It’s a nice, simple application. There’s nothing clever or fancy going on. This simplicity means that the interface is clean and it just edits text.

For those on the other side, there’s Paragraft, which seems similar. In my heart of hearts I’m probably still holding out for the tablet equivalent¹ of emacs. In the meantime, I think developing a text editing application that provides a number of paradigmatic text editing features and advances them for the touch screen would be an incredibly welcome development.

In the end there’s much work to be done, and the tools are good enough to get started.


  1. I want to be clear to say equivalent and not replacement, because while I’d like to be able to use emacs and have that kind of slipstream writing experience on an embedded device, what I really want is something that is flexible, can be customized, and lets me do all the work that I need to do, without hopping between programs, without breaking focus, and that makes inputting and manipulating text a joy. And an application that we can trust (i.e. open source, by a reputable developer), in a format we can trust (i.e. plain text). Doesn’t need to be emacs and doesn’t need lisp, but I wouldn’t complain about the lisp. ↩︎

Publishing System Requirements

Like issue tracking systems, documentation publication systems are never quite perfect. There are dozens of options, and most of them are horrible and difficult to use for one reason or another. Rather than outline why these systems are less than ideal, I want to provide a list of basic requirements that I think every documentation publishing system¹ should have.

Requirements

  • Tag System. You have to be able to identify and link different pieces of content together in unique and potentially dynamic ways across a number of dimensions. Tagging systems, particularly those that can access and create lists of “other posts with similar tags,” are essential for providing some much needed organization to projects that are probably quite complex. Tagging systems should provide some way of supporting multiple tag namespaces. Also, operations affecting tags need to be really efficient, from both the user’s and the software’s perspective, or else they won’t work at realistic scales.
  • Static Generation. Content needs to be statically generated. There are so many good reasons to have static repositories: it allows you to plan releases (which is good if you need to coordinate documentation releases with software releases), and most documentation changes infrequently anyway. The truth is this feature alone isn’t so important, but static generation makes the next several features possible or easier.
  • Development Builds. As you work on documentation, it’s important to be able to see what the entire resource will look like when published. This is a mass-preview mode, if you will. The issue here is that, unlike some kinds of web-based publications, documentation often needs to be updated in batches, and it’s useful to be able to see those changes all at once because they can all interact in peculiar ways. These test builds need to be possible locally, so that work isn’t dependent on a network connection, or shared infrastructure.
  • Verification and Testing. While building “self-testing” documentation is really quite difficult (see also /technical-writing/dexy,) I think publication systems should be able to “run tests” against documents and provide reports, even if the tests are just to make sure that new software versions haven’t been released, or that links still work. It’s probably also a good idea to be able to verify that certain conventions are followed in terms of formatting: trailing white space, optional link formats, tagging conventions, required metadata, and so forth.
  • Iteration Support. Documents need to be released and revised on various schedules as new versions and products are developed. Compounding this problem, old documentation (sometimes) needs to hang around for backwards compatibility and legacy support. Document systems need to have flexible ways to tag documents as out of date, or establish pointers that say “the new version of this document is located here.” It’s possible to build this off of the tag system, but it’s probably better for it to be a separate piece of data.
  • Version Control. These systems are great for storing content, facilitating easy collaboration, and supporting parallel work. Diffs are a great way to provide feedback for writers, and having history is useful for being able to recreate and trace your past thinking when you have to revisit a document or decision weeks and months later.
  • Lightweight Markup. It’s dumb to make people write pure XML in pretty much every case. With rST, Markdown, pandoc, and the like, there’s no reason to write XML. Ever. End of story.
  • Renaming and Reorganization. As document repositories grow and develop, it seems inevitable that the initial sketch of the organization for the body of work will change. Documents will need to be moved, URLs will need to be redirected or rewritten, and links will need to be updated. The software needs to support this directly.
  • Workflow Support. Documentation systems need to be able to facilitate editorial workflows and reviews. This should grow out of some combination of a private tag name space and a reporting feature for contributions, which can generate lists of pages to help groups distribute labor and effort.

This might just be a quirk of my approach, but I tend to approach documentation, in terms of process and tooling, as if it were programming and writing software. They aren’t identical tasks, of course, but there are a lot of functional similarities, definitely enough to take advantage of the tooling and advances (i.e. make, git, etc.) that programmers have been able to build for themselves. Am I missing something or totally off base?


  1. Think knowledge bases, documentation sites, and online manuals. I’m generally of the opinion that one should be able to publish all of these materials using the same tool. ↩︎

9 Awesome SSH Tricks

Sorry for the lame title. I was thinking the other day about how awesome SSH is, and how it’s probably one of the most crucial pieces of technology that I use every single day. Here’s a list of nine things that I think are particularly awesome and perhaps a bit off the beaten path.

Update: (2011-09-19) There are some user-submitted ssh-tricks on the wiki now! Please feel free to add your favorites. Also the hacker news thread might be helpful for some.

SSH Config

I used SSH regularly for years before I learned about the config file that you can create at ~/.ssh/config to tell ssh how you want it to behave.

Consider the following configuration example:

Host example.com *.example.net
    User root
Host dev.example.net
    User shared
    Port 220
Host test.example.com
    User root
    UserKnownHostsFile /dev/null
    StrictHostKeyChecking no
Host t
    HostName test.example.org
Host *
    Compression yes
    CompressionLevel 7
    Cipher blowfish
    ServerAliveInterval 600
    ControlMaster auto
    ControlPath /tmp/ssh-%r@%h:%p

I’ll cover some of the settings in the “Host *” block, which apply to all outgoing ssh connections, in other items in this post. But basically, you can use this file to create shortcuts with the ssh command, to control what username is used to connect to a given host, and to set the port number if you need to connect to an ssh daemon running on a non-standard port. See “man ssh_config” for more information.
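
With the “Host t” block in the example above, for instance, the shortcut works like this:

ssh t    # equivalent to "ssh test.example.org", with the "Host *" settings applied as well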

Control Master/Control Path

This is probably the coolest thing that I know about in SSH. Set the “ControlMaster” and “ControlPath” options as above in the ssh configuration. Anytime you connect to a host that matches that configuration, a “master session” is created. Subsequent connections to the same host will then reuse the master connection rather than renegotiate and create a separate connection. The result is greater speed and less overhead.

This can cause problems if you want to do port forwarding, as this must be configured on the original connection; otherwise it won’t work.
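
Recent OpenSSH versions also let you inspect or tear down a master connection with the -O flag; using a host name from the config example above:

ssh -O check test.example.org    # is there a live master connection for this host?
ssh -O exit test.example.org     # close the master (and everything multiplexed over it)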

SSH Keys

While ControlMaster/ControlPath is the coolest thing you can do with SSH, key-based authentication is probably my favorite. Basically, rather than force users to authenticate with passwords, you can use a secure cryptographic method to gain (and grant) access to a system. Deposit a public key on servers far and wide, while keeping a “private” key secure on your local machine. And it just works.

You can generate multiple keys, to make it more difficult for an intruder to gain access to multiple machines by breaching a specific key or machine. You can specify which keys and key files are used to connect to specific hosts in the ssh config file (see above). Keys can also be (optionally) encrypted locally with a passphrase, for additional security. Once I understood how secure the system is (or can be), I found myself thinking “I wish you could use this for more than just SSH.”
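
Getting set up is a two-command affair; the host name here is a placeholder:

ssh-keygen -t rsa                       # creates ~/.ssh/id_rsa and ~/.ssh/id_rsa.pub
ssh-copy-id user@server.example.com     # appends the public key to the remote authorized_keys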

SSH Agent

Most people start using SSH keys because they’re easier and it means that you don’t have to enter a password every time you want to connect to a host. But the truth is that in most cases you don’t want unencrypted private keys that have meaningful access to systems, because once someone has a copy of the private key they have full access to the system. That’s not good.

But the truth is that typing in passwords is a pain, so there’s a solution: the ssh-agent. Basically, you authenticate to the ssh-agent locally, which decrypts the key and does some magic, so that whenever the key is needed for connecting to a host you don’t have to enter your password. ssh-agent manages the local encryption on your key for the current session.
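
In its simplest form it looks like this (many desktop environments start an agent for you, in which case only the ssh-add is needed):

eval "$(ssh-agent)"       # start an agent and point this shell at it
ssh-add ~/.ssh/id_rsa     # decrypt the key once; prompts for the passphrase
ssh-add -l                # list the keys the agent is currently holding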

SSH Reagent

I’m not sure where I found this amazing little function, but it’s great. Typically, ssh-agents are attached to the current session (like the window manager), so that when the window manager dies, the ssh-agent loses the decrypted bits from your ssh key. That’s nice, but it also means that if you have processes that exist outside of your window manager’s session (e.g. screen sessions), they lose the ssh-agent and get trapped without access to one, so you end up having to restart would-be-persistent processes or run a large number of ssh-agents, which is not ideal.

Enter “ssh-reagent.” Stick this in your shell configuration (e.g. ~/.bashrc or ~/.zshrc) and run ssh-reagent whenever you have an agent session running and a terminal that can’t see it.

ssh-reagent () {
  # look through agent sockets left behind by other sessions
  for agent in /tmp/ssh-*/agent.*; do
      export SSH_AUTH_SOCK=$agent
      # if the agent answers, we've found a working one
      if ssh-add -l > /dev/null 2>&1; then
         echo Found working SSH Agent:
         ssh-add -l
         return
      fi
  done
  echo Cannot find ssh agent - maybe you should reconnect and forward it?
}

It’s magic.

SSHFS and SFTP

Typically we think of ssh as a way to run a command or get a prompt on a remote machine. But SSH can do a lot more than that, and the OpenSSH package (probably the most popular implementation of SSH these days) has a lot of features that go beyond just “shell” access. Here are two cool ones:

SSHFS creates a mountable file system using FUSE of the files located on a remote system over SSH. It’s not always very fast, but it’s simple and works great for quick operations on local systems, where the speed issue is much less relevant.
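
Usage is about as simple as mounting gets; the paths here are placeholders:

mkdir -p ~/mnt/remote
sshfs user@host.example.com:/home/user ~/mnt/remote    # mount the remote home directory over SSH
fusermount -u ~/mnt/remote                             # unmount when you're done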

SFTP replaces FTP (which is plagued by security problems) with a similar tool for transferring files between two systems that’s secure (because it works over SSH) and just as easy to use. In fact, most recent OpenSSH daemons provide SFTP access by default.

There’s more, like a full VPN solution in recent versions, secure remote file copy, port forwarding, and the list could go on.

SSH Tunnels

SSH includes the ability to connect a port on your local system to a port on a remote system, so that to applications on your local system the local port looks like a normal local port, but when accessed, the service running on the remote machine responds. All traffic is really sent over ssh.

I set up an SSH tunnel from my local system to the outgoing mail server on my server. I tell my mail client to send mail to localhost (without mail server authentication!), and it magically goes to my personal mail relay, encrypted over ssh. The applications of this are nearly endless.
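
The command for a setup like that looks roughly like this; the host name is a placeholder, and I’ve used a high local port so it doesn’t require root:

ssh -f -N -L 2525:127.0.0.1:25 user@mail.example.com
# then point the mail client's SMTP server at localhost, port 2525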

Keep Alive Packets

The problem: unless you’re doing something with an SSH connection, it doesn’t send any packets, and as a result the connections can be pretty resilient to network disturbances. That’s not a problem in itself, but it does mean that unless you’re actively using an SSH session, it can go silent, causing your local network’s NAT to eat a connection that it thinks has died but hasn’t. The solution is to set the “ServerAliveInterval [seconds]” option in the SSH configuration, so that your ssh client sends a “dummy packet” at a regular interval and the router thinks the connection is active even if it’s particularly quiet. It’s good stuff.

/dev/null .known_hosts

A lot of what I do in my day job involves deploying new systems, testing something out, and then destroying that installation and starting over in the same virtual machine. So my “test rigs” have a few IP addresses, and I can’t readily deploy keys on these hosts. Every time I redeploy, SSH’s host-key checking tells me that a different system is responding for the host. In most cases that’s the symptom of some sort of security error, and in most cases knowing this is a good thing, but in some cases it can be very annoying.

These configuration values tell your SSH session to save keys to /dev/null (i.e. drop them on the floor) and to not ask you to verify an unknown host:

UserKnownHostsFile /dev/null
StrictHostKeyChecking no

This probably saves me a little annoyance and a minute or two every day or more, but it’s totally worth it. Don’t set these values for hosts that you actually care about.
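
If you only need this occasionally, you can also pass the same values ad hoc with -o instead of putting them in the config file (the address is a placeholder):

ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no root@192.0.2.15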


I’m sure there are other awesome things you can do with ssh, and I’d love to hear more. Onward and Upward!

Persistent SSH Tunnels with AutoSSH

Rather than authenticate to an SMTP server to send email, which is fraught with potential security issues and hassles, I use an SSH tunnel to the machine running my mail server. This is automatic, easy to configure both for the mail server and mail client, and incredibly secure. It’s good stuff.

The downside, if there is one, is that the tunnel has to be active to be able to send email messages, and SSH tunnels sometimes disconnect a bit too silently, particularly on unstable (wireless) connections. I (and others, I suspect) have had some success with integrating the tunnel connection with pre- and post-connection hooks, so that the network manager automatically creates a tunnel after connecting to the network, but this is a flawed solution that produces uneven results.

Recently I’ve discovered this program called “AutoSSH,” which creates an SSH tunnel and tests it regularly to ensure that the tunnel is functional. If it isn’t, AutoSSH recreates the tunnel. Great!

First start off by getting a copy of the program. It’s not part of the OpenSSH package, so you’ll need to download it separately. It’s in every package management repository that I’ve tried to get it from, so installation will probably involve one of the following commands at your system’s command line:

apt-get install autossh
pacman -S autossh
yum install autossh
port install autossh

When that’s done, you’ll issue a command that resembles the following:

autossh -M 25 -f tychoish@foucault.cyborginstitute.net -L 25:127.0.0.1:25

Really, the important part here is the “autossh -M 25” part of the command. This tells autossh to watch (“monitor”) port number 25 on the local system for a tunnel. The rest of the command (i.e. “-f tychoish@foucault.cyborginstitute.net -L 25:127.0.0.1:25") is just a typical call to the ssh program.

Things to remember:

  • If you need to create a tunnel on a local port numbered lower than 1024, you’ll need to run the autossh command as root.
  • SSH port forwarding only forwards traffic from a local port to a remote port, through an SSH connection. All traffic is transmitted over the wire on port 22. Unless you establish multiple tunnels, only traffic sent to the specific local port will be forwarded.
  • Perhaps it’s obvious, but there has to be some service listening on the specified remote end of the tunnel, or else the tunnel won’t do anything.
  • In a lot of ways, depending on your use case, autossh can obviate the need for much more complex VPN setups for a lot of deployments. Put an autossh command in an @reboot cronjob, with an account that has ssh keys generated, and just forget about it for encrypting things like database traffic and the like (see the sketch below).
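
A crontab entry for that last case might look like this sketch, with placeholder host and ports (here forwarding a local port to a remote MySQL server):

@reboot autossh -M 20000 -f -N -L 3306:127.0.0.1:3306 user@db.example.com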

Onward and Upward!

Jekyll Publishing

I wrote about my efforts to automate my publishing workflow a couple of weeks ago (egad!), and I wanted to follow that up with a somewhat more useful elucidation of how all of the gears work around here.

At first I had a horrible scheme set up that depended on regular builds triggered by cron, which is a functional, if inelegant, solution. There are a lot of tasks where you can give the appearance of “real time” responsiveness by scheduling brute-force jobs regularly enough. The truth is, however, that it’s not quite the same, and I knew that there was a better way.

Basically the “right way” to solve this problem is to use the “hooks” provided by the git repositories that I use to store the source of the website. Hooks, in this context, refer to a number of scripts which are optionally run before or after various operations on the repositories, and which allow you to attach actions to the operations you perform on your git repositories. In effect, you can say “when I git push, do these other things” or “before I git commit, check for these conditions, and if they’re not met, reject the commit” and so forth. The possibilities can be a bit staggering.
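
A hook is just an executable script in the repository’s .git/hooks/ directory; the trivial post-commit hook mentioned below, for instance, is nothing more than:

#!/bin/bash
# .git/hooks/post-commit -- make it executable with "chmod +x"
git push origin master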

In this case what happens is: when I commit to the tychoish.com repositories, a script runs that synchronizes the appropriate local branches and publishes the changes to the server. It then sends me an xmpp message saying that this operation is in progress. This runs as the post-commit hook, and for smaller sites it could simply be “git push origin master”. Because tychoish is a large site, and I don’t want to be rebuilding it constantly, I do the following:

#!/bin/bash

# This script is meant to be run in a cron job to perform a rebuilding
# of the slim contents of a jekyll site.
#
# This script can be run several times an hour to greatly simplify the
# publishing routine of a jekyll site.

cd ~/sites/tychoish.com/

# Saving and Fetching Remote Updates from tychoish.com
git pull >/dev/null &&

# Local Adding and Committing
git checkout master >/dev/null 2>&1
git add .
git commit -a -q -m "$HOSTNAME: changes prior to a slim rebuild" >/dev/null 2>&1

# Local "full-build" Branch Mangling
git checkout full-build >/dev/null 2>&1 &&
git merge master &&

# Local "slim-bild" Branch Magling and Publishing
git checkout slim-build >/dev/null 2>&1 &&
git merge master &&
git checkout master >/dev/null 2>&1 &&
git push --all

# echo done

Then on the server, once the copy of the repo on the server is current with the changes pushed to it, the following code runs as the post-update hook:

#!/bin/bash
#
# An example hook script to prepare a packed repository for use over
# dumb transports.
#
# To enable this hook, make this file executable by "chmod +x post-update".

unset GIT_DIR
unset GIT_WORK_TREE

export GIT_DIR
export GIT_WORK_TREE

cd /path/to/build/tychoish.com
git pull origin;

/path/to/scripts/jekyll-rebuild-tychoish-auto-slim &

exit

When the post-update hook runs, it runs in the context of the repository that you just pushed to, and unless you do the magic (technical term, it seems), the GIT_DIR and GIT_WORK_TREE variables are stuck in the environment and the commands you run fail. So basically this is a fancy git pull in a third repository (the one that the site is built from). The script jekyll-rebuild-tychoish-auto-slim looks like this:

#!/bin/bash
# to be run on the server

# setting the variables
SRCDIR=/path/to/build/tychoish.com/
DSTDIR=/path/to/public/tychoish/
SITENAME=tychoish
BUILDTYPE=slim
DEFAULTBUILD=slim

build-site(){
 cd ${SRCDIR}
 git checkout ${BUILDTYPE}-build >/dev/null 2>&1
 git pull source >/dev/null 2>&1

 /var/lib/gems/1.8/bin/jekyll ${SRCDIR} ${DSTDIR} >/dev/null 2>&1
 echo \<jekyll\> completed \*${BUILDTYPE}\* build of ${SITENAME} | xmppipe garen@tychoish.com

 git checkout ${DEFAULTBUILD}-build >/dev/null 2>&1
}

build-site;

This does the needful site rebuilding and sends me an xmpp message when the build has completed. The xmppipe command I use is really the following script:

#!/usr/bin/perl
# pipes standard in to an xmpp message, sent to the JIDs on the commandline
#
# usage: bash$ echo "message body" | xmppipe garen@tychoish.com
#
# code shamelessly stolen from:
# http://stackoverflow.com/questions/170503/commandline-jabber-client/170564#170564

use strict;
use warnings;

use Net::Jabber qw(Client);

my $server = "tychoish.com";
my $port = "5222";
my $username = "bot";
my $password = "";    # set this to the bot account's password
my $resource = "xmppipe";
my @recipients = @ARGV;

my $clnt = new Net::Jabber::Client;

my $status = $clnt->Connect(hostname=>$server, port=>$port);

if (!defined($status)) {
  die "Jabber connect error ($!)\n";
}
my @result = $clnt->AuthSend(username=>$username,
password=>$password,
resource=>$resource);

if ($result[0] ne "ok") {
  die "Jabber auth error: @result\n";
}

my $body = '';
while (<STDIN>) {
  $body .= $_;
}
chomp($body);

foreach my $to (@recipients) {
 $clnt->MessageSend(to=>$to,
 subject=>"",
 body=>$body,
 type=>"chat",
 priority=>10);
}

$clnt->Disconnect();

Mark the above as executable and put it in your path somewhere. You’ll want to install the Net::Jabber Perl module, if you haven’t already.

One final note: if you’re using a tool like gitosis to manage your git repositories, all of the hooks will be executed by the gitosis user. This means that this user will need to have write access to the “build” copy of the repository and the public directory as well. You may be able to finesse this with the +s “switch uid” bit, or some clever use of the gitosis user group.

The End.