9 Awesome SSH Tricks

Sorry for the lame title. I was thinking the other day about how awesome SSH is, and how it's probably one of the most crucial pieces of technology that I use every single day. Here's a list of nine things that I think are particularly awesome and perhaps a bit off the beaten path.

Update: (2011-09-19) There are some user-submitted ssh-tricks on the wiki now! Please feel free to add your favorites. Also the hacker news thread might be helpful for some.

SSH Config

I used SSH regularly for years before I learned about the config file that you can create at ~/.ssh/config to tell ssh how you want it to behave.

Consider the following configuration example:

Host example.com *.example.net
    User root

Host dev.example.net
    User shared
    Port 220

Host test.example.com
    User root
    UserKnownHostsFile /dev/null
    StrictHostKeyChecking no

Host t
    HostName test.example.org

Host *
    Compression yes
    CompressionLevel 7
    Cipher blowfish
    ServerAliveInterval 600
    ControlMaster auto
    ControlPath /tmp/ssh-%r@%h:%p

I'll cover some of the settings in the "Host *" block, which apply to all outgoing ssh connections, in other items in this post. Basically, you can use the config file to create shortcuts for the ssh command: to control what username is used to connect to a given host, what port number to use if you need to connect to an ssh daemon running on a non-standard port, and so forth. See "man ssh_config" for more information.
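
For example, given the configuration above, the following two commands are equivalent:

ssh t
ssh test.example.org

and "ssh dev.example.net" will connect as the "shared" user on port 220 without any extra flags.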

Control Master/Control Path

This is probably the coolest thing that I know about in SSH. Set the "ControlMaster" and "ControlPath" options as above in the ssh configuration. Any time you connect to a host that matches that configuration, a "master session" is created. Subsequent connections to the same host then reuse the master connection rather than renegotiating and creating a separate connection. The result is greater speed and less overhead.

This can cause problems if you want to do port forwarding, as forwarding must be configured on the original (master) connection; otherwise it won't work.
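
Since the config above already sets ControlMaster and ControlPath for all hosts, you can see the effect with something like this (using the test host from the example configuration):

time ssh test.example.com true     # the first connection negotiates normally and becomes the master
time ssh test.example.com true     # subsequent connections reuse the master and return almost instantly
ssh -O check test.example.com      # ask the master connection whether it's still alive
ssh -O exit test.example.com       # tear down the master connection when you're finished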

SSH Keys

While ControlMaster/ControlPath is the coolest thing you can do with SSH, key-based authentication is probably my favorite. Basically, rather than force users to authenticate with passwords, you can use a secure cryptographic method to gain (and grant) access to a system. Deposit a public key on servers far and wide, while keeping a "private" key secure on your local machine. And it just works.

You can generate multiple keys, to make it more difficult for an intruder to gain access to multiple machines by breaching a specific key or machine. You can specify which keys and key files to use when connecting to specific hosts in the ssh config file (see above). Keys can also (optionally) be encrypted locally with a passphrase for additional security. Once I understood how secure the system is (or can be), I found myself thinking "I wish you could use this for more than just SSH."
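
The basic workflow looks something like the following; the key file name and host are just placeholders:

ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa_example    # generate a key pair; add a passphrase when prompted
ssh-copy-id -i ~/.ssh/id_rsa_example.pub root@example.com    # append the public key to the server's authorized_keys

Then you can pin that key to the host in ~/.ssh/config:

Host example.com
    IdentityFile ~/.ssh/id_rsa_example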

SSH Agent

Most people start using SSH keys because they're easier and it means that you don't have to enter a password every time you want to connect to a host. But the truth is that in most cases you don't want unencrypted private keys that have meaningful access to systems, because once someone has a copy of the private key they have full access to the system. That's not good.

But the truth is that typing in passphrases is a pain, so there's a solution: the ssh-agent. Basically, you authenticate to the ssh-agent locally, which decrypts the key and does some magic, so that whenever the key is needed for connecting to a host you don't have to enter your passphrase. The ssh-agent manages the local encryption on your key for the current session.
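
In practice, that looks something like this:

eval "$(ssh-agent)"    # start an agent and point the current shell at it (sets SSH_AUTH_SOCK)
ssh-add                # enter your passphrase once; the agent holds the decrypted key in memory
ssh-add -l             # list the keys the agent is currently managing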

SSH Reagent

I'm not sure where I found this amazing little function, but it's great. Typically, ssh-agents are attached to the current session (like the window manager), so that when the window manager dies, the ssh-agent loses the decrypted bits from your ssh key. That's nice, but it also means that processes that exist outside of your window manager's session (e.g. screen sessions) lose the ssh-agent and get trapped without access to one, so you end up having to restart would-be-persistent processes or run a large number of ssh-agents, which is not ideal.

Enter "ssh-reagent." stick this in your shell configuration (e.g. ~/.bashrc or ~/.zshrc) and run ssh-reagent whenever you have an agent session running and a terminal that can't see it.

ssh-reagent () {
  # Look for agent sockets left behind by forwarded or detached sessions.
  for agent in /tmp/ssh-*/agent.*; do
      export SSH_AUTH_SOCK="$agent"
      # If the agent answers, we have found a usable socket.
      if ssh-add -l > /dev/null 2>&1; then
         echo "Found working SSH Agent:"
         ssh-add -l
         return
      fi
  done
  echo "Cannot find ssh agent - maybe you should reconnect and forward it?"
}

It's magic.

SSHFS and SFTP

Typically we think of ssh as a way to run a command or get a prompt on a remote machine. But SSH can do a lot more than that, and OpenSSH, which is probably the most popular implementation of SSH these days, has a lot of features that go beyond just "shell" access. Here are two cool ones:

SSHFS uses FUSE to create a mountable file system out of the files located on a remote system, over SSH. It's not always very fast, but it's simple and works great for quick operations, particularly on local networks where the speed issue is much less relevant.
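
Using it is about as simple as mounting gets; the user, host, and paths here are just examples:

mkdir -p ~/mnt/example
sshfs tychoish@example.com:/home/tychoish ~/mnt/example    # mount the remote home directory locally
fusermount -u ~/mnt/example    # unmount it when you're done (on Linux)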

SFTP replaces FTP (which is plagued by security problems) with a similar tool for transferring files between two systems that's secure (because it works over SSH) and just as easy to use. In fact, most recent OpenSSH daemons provide SFTP access by default.
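
For example (again, the host and file names are illustrative):

sftp tychoish@example.com    # interactive session with ftp-style commands (ls, get, put)
sftp tychoish@example.com:notes/draft.txt .    # or fetch a single file non-interactively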

There's more, like a full VPN solution in recent versions, secure remote file copy, port forwarding, and the list could go on.

SSH Tunnels

SSH includes the ability to connect a port on your local system to a port on a remote system, so that to applications on your local system the forwarded port looks like any other local port, but when accessed, the service running on the remote machine responds. All the traffic is really sent over ssh.

I set up an SSH tunnel from my local system to the outgoing mail server on my server. I tell my mail client to send mail to a server on localhost (without mail server authentication!), and it magically goes to my personal mail relay, encrypted over ssh. The applications of this are nearly endless.
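
The setup is a single command; the mail host and the local port here are stand-ins for your own:

ssh -f -N -L 2525:127.0.0.1:25 tychoish@mail.example.com

Then the mail client's SMTP server is just 127.0.0.1 on port 2525.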

Keep Alive Packets

The problem: unless you're doing something with SSH, it doesn't send any packets, which can make connections pretty resilient to network disturbances. That's not a problem in itself, but it does mean that an idle SSH session can go silent long enough that your local network's NAT router eats a connection that it thinks has died, but hasn't. The solution is to set "ServerAliveInterval [seconds]" in the SSH configuration, so that your ssh client sends a "dummy packet" at a regular interval and the router continues to treat the connection as active even when it's particularly quiet. It's good stuff.
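
A minimal sketch of the relevant configuration, assuming a sixty-second interval suits your network (ServerAliveCountMax, which isn't mentioned above, controls how many unanswered probes the client tolerates before giving up):

Host *
    ServerAliveInterval 60
    ServerAliveCountMax 3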

/dev/null .known_hosts

A lot of what I do in my day job involves deploying new systems, testing something out, and then destroying that installation and starting over on the same virtual machine. So my "test rigs" have a few IP addresses, and I can't readily deploy keys on these hosts. Every time I redeploy, SSH's host-key checking tells me that a different system is responding for the host. In most cases that's the symptom of some sort of security problem, and knowing about it is a good thing, but in this case it's just very annoying.

These configuration values tell your SSH session to save host keys to /dev/null (i.e. drop them on the floor) and to not ask you to verify an unknown host:

UserKnownHostsFile /dev/null
StrictHostKeyChecking no

This probably only saves me a little annoyance and a minute or two every day, but it's totally worth it. Don't set these values for hosts that you actually care about.
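
If you'd rather not put this in the config file, the same options work for a single connection:

ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no root@test.example.com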


I'm sure there are other awesome things you can do with ssh, and I'd love to hear more. Onward and Upward!

Against Open Stacks

I have misgivings about OpenStack. OpenStack is an open source "cloud" or infrastructure/virtualization platform that allows providers to create on-demand computing instances, as if "in the cloud," but running on their own systems. This kind of thing is generally referred to as a "private cloud," but as with all things in the "cloud space," this is a relatively nebulous concept.

To disclose: I am employed by a company that does work in this space, which isn't the company responsible for OpenStack. I hope this provides a special perspective, but I am aware that my judgment is very likely clouded. As it were.

Let us start from the beginning, and talk generally about what's on the table here. Recently the technology that allows us to virtualize multiple instances on a single piece of hardware has gotten a lot more robust, easy to use, and performant. At the same time, for the most part the (open source) "industrial-grade" virtualization technology isn't particularly easy to use or configure. It can be done, of course, but it's non-trivial. These configurations and the automation to glue it all together--and the quality therein--are how cloud providers differentiate themselves.

On some level "the Cloud" as a phenomenon is about the complete conversion of hardware into a commodity. Not only is hardware cheap, but it's so cheap that we can do most of what hardware does in software. The open sourcing of this software as "OpenStack" pushes the barrier one step further and says that the software is a commodity as well.

It was bound to happen at some point; it's just a curious move, and probably one that's indicative of something else in the works.

The OpenStack phenomenon is intensely interesting for a couple of reasons. First, it has a lot of the aspects of some contemporary commercial uses of open source: the project has one contributor, and initial development grew out of the work of one company that developed the software for internal use and then said "hrm, I guess we can open source it." Second, if I understand correctly, little or none of OpenStack is software that wasn't already open source (aside from a bunch of glue and scripts), which is abnormal.

I'm not sure where this leads us. I've been mulling over what this all means for a while, and have largely ended up here: it's an interesting move, if an incredibly weird one, and it's hard to really understand what's going on.

Persistent SSH Tunnels with AutoSSH

Rather than authenticate to an SMTP server to send email, which is fraught with potential security issues and hassles, I use an SSH tunnel to the machine running my mail server. This is automatic, easy to configure for both the mail server and the mail client, and incredibly secure. It's good stuff.

The downside, if there is one, is that the tunnel has to be active to be able to send email messages, and SSH tunnels sometimes disconnect a bit too silently, particularly on unstable (wireless) connections. I (and others, I suspect) have had some success with integrating the tunnel connection with pre- and post-connection hooks, so that the network manager automatically creates a tunnel after connecting to the network, but this is a flawed solution that produces uneven results.

Recently I've discovered this program called "AutoSSH," which creates an SSH tunnel and tests it regularly to ensure that the tunnel is functional. If it isn't, AutoSSH recreates the tunnel. Great!

First start off by getting a copy of the program. It's not part of the OpenSSH package, so you'll need to download it separately. It's in every package management repository that I've tried to get it from, so installation will probably involve one of the following commands at your system's command line:

apt-get install autossh
pacman -S autossh
yum install autossh
port install autossh

When that's done, you'll issue a command that resembles the following:

autossh -M 25 -f tychoish@foucault.cyborginstitute.net -L 25:127.0.0.1:25 -N

Really, the important part here is the "autossh -M 25" part of the command. This tells autossh to use ("monitor") port number 25 on the local system to test the tunnel. The rest of the command (e.g. "-f tychoish@foucault.cyborginstitute.net -L 25:127.0.0.1:25 -N") is just a typical call to the ssh program.

Things to remember:

  • If you need to create a tunnel on a local port numbered lower than 1024 (a privileged port), you'll need to run the autossh command as root.
  • SSH port forwarding only forwards traffic from a local port to a remote port through an SSH connection; all of the traffic travels over the wire on the ssh port (22 by default). Unless you establish multiple tunnels, only traffic sent to the specific local port will be forwarded.
  • Perhaps it's obvious, but there has to be some service listening on the specified remote end of the tunnel, or else the tunnel won't do anything.
  • In a lot of ways, depending on your use case, autossh can obviate the need for much more complex VPN setups for a lot of deployments. Put an autossh command in an @reboot cronjob, under an account that has ssh keys generated, and just forget about it for encrypting things like database traffic and the like (see the sketch after this list).
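
A sketch of that last idea; the user, host, and database port are made up, and -M 20000 just picks an otherwise unused local port for autossh's monitor channel:

# in the crontab (crontab -e) of a user whose ssh key is authorized on the remote host:
@reboot autossh -M 20000 -f -N -L 5432:127.0.0.1:5432 tunnel@db.example.com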

Onward and Upward!

Wikis are not Documentation

It seems I'm writing a minor series on the current status (and possible future direction?) of technical writing and documentation efforts, both to establish a foundation for my own professional relevancy and for its own sake, because I think documentation has the potential to shape the way that people are able to use technology. I started out with Technical Writing Appreciation, and this post will address a few sore points regarding the use of wikis as a tool for constructing documentation.

At the broadest level, I think there's a persistent myth regarding the nature of the wiki and the creation of content in a wiki, quite apart from their potential use in documentation projects. Wikis are easy to install and create. It is easy to say "I'm making a wiki, please contribute!" It is incredibly difficult to take a project idea and wiki software and turn that into a useful and vibrant community and resource. Perhaps these challenges arise from the fact that wikis require intense stewardship and attention, and this job usually falls to a very dedicated leader or a small core of lead editors. Also, since authorship on wikis is diffuse and not often credited, getting this kind of leadership, and therefore successfully starting communities around wiki projects, can be very difficult.

All wikis are like this. At the same time, I think the specific needs of technical documentation make these issues even more pronounced. This isn't to say that wiki software can't power documentation teams, but the "wiki process," as we might think of it, is particularly unsuited to documentation.

One thing that I think is a nearly universal truth of technical writing is that the crafting of texts is the smallest portion of the effort of making documentation. Gathering information, background and experience in a particular tool or technology is incredibly time consuming. Narrowing all this information down into something that is useful to someone is a considerable task. The wiki process is really great for the evolutionary process of creating a text, but it's not particularly conducive to facilitating the kind of process that documentation must go through.

Wikis basically say, "here's a simple editing interface without any unnecessary structure: go and edit; we don't care about the structure or organization, you can take care of that as a personal/social problem." Fundamentally, documentation requires the opposite approach: once a project is underway and some decisions have been made, organization isn't the kind of thing you want to have to wrestle with manually, and structure is very necessary. Wikis might be useful content generation and publication tools, but they are probably not suited to supporting the workflow of a documentation project.

What then?

I think the idea of a structured wiki, as presented by TWiki, has potential, but I don't have a lot of experience with it. My day-job project uses an internally developed tool, and a lot of internal procedures to enforce certain conventions. I suspect there are publication, collaboration, and project management tools that are designed to solve this problem, but I'm not particularly familiar with anything specific. In any case, it's not a wiki.

Do you have thoughts? Have I missed something? I look forward to hearing from you in comments!

Technical Writing Appreciation

I'm a technical writer. This is a realization that has taken me some time to appreciate and understand fully.

Technical writing is one of those things that creators of technology (a term I will use liberally) all agree is required, but it's also something that's very difficult to do properly. I think this difficulty springs from the following concerns: What constitutes describing a technology or process in too much detail? Not enough? Can all users of a technology make use of the same documentation set? If the users are too diverse, what is the best way to make sure that their needs are addressed: do we write parallel documentation for all kinds of users, or do we try to bring less advanced users up to speed so that the core documentation is useful to everyone?

The answers to these questions vary, of course, with the needs of the product being documented and its use cases, but I think resolving these concerns presents a considerable challenge to any kind of technical documentation project, and the way that the documentation developers resolve these issues can have a profound effect not only on the documentation itself but also on its value and usefulness. As I've been thinking about the utility and value of technical writing (a professional hazard), I've come up with a brief taxonomy of approaches to technical writing:

  • First, there's the document everything approach. Starting with a full list of features (or even an application's source), the goal here is to make sure that there's no corner unturned. We might think of this as the "manual" approach, as the goal is to produce a comprehensive manual. These are great reference materials, particularly when indexed effectively, but the truth is that they're really difficult for users to engage with, even though they may have all the answers to a user's questions (e.g. "RTFM"). I suspect that the people who write this kind of documentation either work closely with developers or are themselves developers.
  • Second, there's what I think of as the systems or solutions document, which gives up comprehensiveness, and perhaps even isolation to a single tool or application, and documents outcomes and processes. These aren't as detailed, and so might not answer underlying questions, but when completed effectively they provide an ideal entry point into using a new technology. In contrast to the "manual," these documents are either slightly more general interest or more like "white papers." This class of documentation, thus, not only explains how to accomplish specific goals but also illuminates technical possibilities and opportunities that may not be clear from a function-based documentation approach. I strongly suspect that the producers of this kind of documentation are very rarely the people who develop the application itself.
  • In contrast to the above, documentation written for education and training purposes may look like either a "manual" or a "white paper," but it has a fundamentally different organization and set of requirements. Documentation that supports training is often (I suspect) developed in concert with the training program itself, and needs to impart a deeper understanding of how a system works (like the content of a manual), but it doesn't need to be comprehensive, and it needs to mirror the general narrative and goals of the training program.
  • Process documentation, finally, is most like solution documentation, but rather than capture unrealized technological possibilities or describe potentially hypothetical goals, these kinds of documents capture largely institutional knowledge to more effectively manage succession (both by future iterations of ourselves and by our replacements). These documents have perhaps the most limited audience, but they are incredibly valuable both as an archive (e.g. "How did we used to do $*?") and for maintaining consistency, particularly amongst teams, as well as for specific tasks.

I think the fundamental lesson regarding documentation here isn't that every piece of technology needs lots and lots of documentation, but rather that depending on the goals for a particular technology development program or set of tools, different kinds of documentation may be appropriate, and more useful in some situations.

As a secondary conclusion, or a direction for more research: I'd be interested in figuring out if there are particular systems that allow technical writers (and development teams) to collect multiple kinds of information and produce multiple sets of documentation with different organizations: being able to automatically generate different wholes out of documentation "objects," if we may be so bold.

I must look into this. Onward and Upward!

Saved Searches and Notmuch Organization

I've been toying around with the Notmuch email client, which is a nifty piece of software that provides a very minimalist and powerful email system inspired by the organizational model of Gmail.

Mind you, I don't think I've quite gotten it.

Notmuch says, basically, build searches (e.g. "views") to filter your email so you can process your email in the manner that makes the most sense to you, without needing to worry about organizing and sorting email. It has the structure for "tagging," which makes it easy to mark status for managing your process (e.g. read/unread, reply-needed), and the ability to save searches. And that's about it.

Functionally, tags and saved searches do the work that mailboxes do in terms of the intellectual organization of email. Similarly, the ability to save searches makes it possible to do a good measure of "preprocessing." In the same way that Gmail changed the email paradigm by saying "don't think about organizing your email, just do what you need to do," notmuch says "do less with your email: don't organize it, and trust that the machine will be able to help you find what you need when the time comes."
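
From the command line, that amounts to something like the following; the tag and the query are just illustrations (and the saved searches live just as comfortably in the emacs interface):

notmuch new    # index whatever mail has arrived since the last run
notmuch tag +reply-needed -- tag:unread AND from:editor@example.com
notmuch search tag:reply-needed    # a "saved search" is just a query you keep around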


I've been saying variations of the following for years, but I think on some level it hasn't stuck for me. Given contemporary technology, it doesn't make sense to organize any kind of information that could conceivably be found with search tools. Notmuch proves that this works, and although I've not been able to transfer my personal email over, I'm comfortable asserting that notmuch is a functional approach to email. To be fair, I don't feel like my current email processing and filtering scheme is that broken, so I'm a bad example.

The questions that this raises, which I don't have particularly good answers for, are as follows:

  • Are there good tools for the "don't organize when you can search" crew for non-email data? And I'm not just talking about the search engines themselves (there are a couple: xapian, namazu) or ungainly desktop GUIs (which aren't without utility), but proper command-line tools, emacs interfaces, and web-based interfaces?
  • Are conventional search tools the most expressive way of specifying what we want to find when filtering or looking for data? Are there effective improvements that can be made?
  • I think there's intellectual value created by organizing and cataloging information "manually," and "punting to search" seems like it removes the opportunity to develop good and productive information architectures (if we may be so bold.) Is there a solution that provides the ease of search without giving up the benefits that librarianism brings to information organization?

In Favor of Unpopular Technologies

This post ties together a train of thought that I started in "The Worst Technologies Always Win" and "Who Wants to be a PHP Developer" with the ideas in the "Ease and the Stack" post. Basically, I've been thinking about why the unpopular technologies, or even unpopular modes of using technologies are so appealing and seem to (disproportionately) capture my attention and imagination.

I guess it would first be useful to outline a number of core values that seem to guide my taste in technologies:

  • Understandable

I'm not really a programmer, so in a lot of ways it's not feasible to expect that I'd be able to expand or enhance the tools I use. At the same time, even for complex tasks, I prefer tools whose workings I have a chance of understanding. I'm not sure if this creates value in a practical sense; however, I tend to think that I'm able to make better use of technologies when I understand their fundamental underpinnings.

  • Openness and Standards

In a way that flows from "understandable," I find open source and standardized technologies to be more useful. Not in the sense that open source technology is inherently more useful because source code is available (though sometimes that's true), but more in the sense that software developed in the open tends to have a lot of the features and values that I find important. And of course, knowing that my data and work are stored in a format that isn't locked into a specific vendor allows me to relax a bit about the technology.

  • Simple

Simpler technologies are easier to understand and easier--for someone with my skill set--to customize and adopt. This is a good thing. Fundamentally most of what I do with a computer is pretty simple, so there's not a lot of reason to use overly complicated tools.

  • Task Oriented

I'm a writer. I spend a lot of time on the computer, but nearly everything I do with the computer is related to writing. Taking notes, organizing tasks, reading articles, manipulating texts for publication, communicating with people about various things that I'm working on. The software I use supports this, and the most useful software in my experience focuses on helping me accomplish these tasks. This is opposed to programs that are feature or function oriented. I don't need software that could do a bunch of things that I might need to do, I need tools that do exactly what I need. If they do other additional things, that's nearly irrelevant.

The problem with this is that, although these seem like fine ideals and values for software development, they are fundamentally unprofitable. Who makes money selling simple, easy to understand software with limited, niche-targeted feature sets? No one. The trouble is that this kind of software and technology makes a lot of sense, so we keep seeing technologies with these values that seem like they could beat the odds and become dominant, and then they don't. Either they drop task orientation for a wider feature set, or something with more money behind it comes along, or the engineers get bored and build something more complex, and the unpopular technologies shrivel up.

What to do about it?

  • Learn more about the technologies you use. Even, and especially, if you're not a programmer.
  • Develop simple tools and share them with your friends.
  • Work toward task oriented computing, and away from feature orientation.

The Successful Failure of OpenID

Just about the time I was ready to call OpenID a total failure, something clicked, and if you asked how I thought "OpenID was doing," I'd have to say that it's largely a success. But it certainly took long enough to get here.

Let's back up and give some context.

OpenID is a system for distributing and delegating authentication for web services to third party sites. Basically, as an end user, rather than signing into a website with your username and password, you sign in with your profile URL on some secondary site that you actually log into. The site you're trying to log into asks the secondary site "is this legit?", the secondary site prompts you (usually just the first time, though each OpenID provider may function differently here), and then you're good to go.

Additionally, and this is the part that I really like about OpenID, you can delegate the OpenID of a given page to a secondary host. So on tychoish.com you'll find the following tags in the header of the document:

<link rel="openid.server" href="http://www.livejournal.com/openid/server.bml" />
<link rel="openid.delegate" href="http://tychoish.livejournal.com/" />

So I tell a third party site "I wanna sign in with http://tychoish.com/ as my OpenID," it goes and sees that I've delegated tychoish.com's OpenID to LiveJournal (incidentally, the initiators of OpenID, if memory serves), and LiveJournal handles the authentication and validation for me. If at some point I decide that LiveJournal isn't doing what I need it to, I can change these tags to a new provider, and all the third party sites go talk to the new provider as if nothing happened. And it works because I control tychoish.com and so maintain a provider-independent identity, while still making use of these third party servers. Win.

The thing is that OpenID never really caught on. Though managing a single set of authentication credentials and a common identity across a number of sites has a lot of benefits for users, it never really took off. Or I should say, it took a very long time to be taken seriously. There are a number of reasons for this, in my understanding:

1. Third party vendors wanted to keep big user databases with email addresses. OpenID means, depending on the implementation, that you can bypass the traditional sign up method. This isn't a technological requirement, but it can be confusing in some instances. By giving up the "traditional" value associated with sponsoring account creation, OpenID seemed like a threat to traditional web businesses. There were ways around this, but it's confusing, and as is often the case, a dated business model trumped an inspiring one.

2. There was, and is, some FUD around security. People thought that if they weren't responsible for the authentication process, they wouldn't be able to ensure that only the people who were supposed to get into a given account were able to do so, particularly since the only identifying information associated with an account was a publicly accessible URL. Nevertheless, it works, and I think these details were used to make people feel like the system isn't/wasn't secure.

3. There are some legitimate technological concerns that need to be sorted out, particularly around account creation. This is the main confusion cited above. If someone signs up for an account with an OpenID, do they get a username and have to enter that, or do we just use the OpenID URL? Is there an email address or password associated with the account? What if they get locked out and need to get into the account but there's no email? What if they need to change their OpenID provider/location at some point? These are legitimate concerns, but they're solvable problems.

4. Some users have had a hard time groking it. Because it breaks with the conventional usage model, even though it makes signing into sites simpler, it's a bit hard to grok.

What's fascinating about this is that eventually it did succeed. Even more than the joy at the fact that I finally get to use OpenID, I think OpenID presents an interesting lesson in the eventual success of emergent technological phenomena. Google accounts, Flickr accounts, and AIM accounts all provide OpenID. And although "Facebook Connect" is not using OpenID technology, it's conceptually the same. Sites like StackOverflow have OpenID-only authentication, and it's becoming more popular.

OpenID succeeded not because of the campaign to teach everyone that federated identity via OpenID was the future and the way we should interact with web services, but rather because the developers of web applications learned that this was the easier and more effective way to do things. And I suspect that in as much as 80% or 90% of cases where people use OpenID, they don't have a clue that that's the technology they're using. And that's probably an ok thing.

The question that lingers in my mind as I end this post is: does this parallel any other technology that we're optimistic about right now? Might some other "Open*" technology take away a strategic lesson from the tactical success of OpenID? I'd love to see that.

Onward and Upward!