Longer Forms

A friend asked me a question (several weeks ago by publication) on a technical topic and I spent most of the next few days writing a missive on database administration strategy. That seemed like a normal response. I was delighted to find that: I liked the voice, I enjoyed writing the longer document, and there are a dozen or so other related topics that I wanted to explore. So, apparently, I'm writing a book. This is exactly what I need: more projects. Not.

But it's a good thing: I find the writing inspiring and invigorating. I have a perspective and collection of knowledge that hasn't been collected and presented in a single place. I like long form writing. The larger piece might also be a good contribution to my portfolio (such as it is.)

I think this kind of writing suits my attention span.

This has left me without a lot of spare time for blogging, and (as I'm prone to do every so often) has me rethinking the future of my efforts on tychoish.com and as a blogger. This is boring for all of you, but I'll give some higher level stuff here and we can follow up in comments:

  • Blogging is fun, and even though I've not been posting regularly, I'm always writing blog posts. Sometimes I find myself writing posts in emails to friends, but I'm never really going to stop writing blog posts.

  • The general explosion of blog publishing that we saw a few years ago has declined. Audience fragmentation happened, readership got entrenched. I feel like I weathered the storm pretty well and I'm really happy with the site and readers I have, but I'm also pretty confident that blogging isn't going to be the means by which I "level up." [1]

  • eBooks have finally happened. For the last decade most people have been saying that ebooks are great for reference material (given search-ability,) and for providing an introduction to a text that people will eventually buy in a paper edition. That may be true, but I think it's changing rapidly, and with kindles and tablets and smart-phones, I think eBooks have effectively won, such as it is.

    In another ten years, perhaps, we'll just call them books.

  • I'm pretty clear that keeping a blog, and perhaps most of the writing I do in my spare time, is for my own enjoyment and betterment and helps to develop a personal portfolio and account of my work. I have no (real) interest in using my writing on tychoish.com, or any other site that I maintain, as a way of supporting myself to any greater or lesser extent.

I want to be in the business of writing things and working with technology and ideas and people, not the business of publishing. While the line is not always clear between "writing projects that you publish yourself online," and "new media publisher," I want to stay away from the latter as much as possible.

So I think this means that most of my "tychoish," writing time will go to writing this book project, and to fiction, and once my blog post backlog is fully depleted (heh,) most of my postings will either be announcements/progress-reports or a bunch of shorter more off-the-cuff notes.

Here's hoping at least.

[1]I can't really believe that I just used "level up" in this context.

Technical Writing Fiction

On Outer Alliance Podcast #8, David Levine talked about having worked as a technical writer for some 15 years and then said something to the effect of "It's a point of great personal pride that I've never put a bulleted list in a piece of fiction."

I laughed out loud. Perhaps frightening a woman walking her dog nearby.

In most ways, the kind of writing that I do for work, API references, tutorials, administration overviews, best-practice descriptions, is very different from the kinds of things I write away from work, or at least I like to think so.

The truth is that I've learned a bunch about writing and about communicating in general from writing documentation. While my "professional background," doesn't include formal technological training, I definitely "broke in" because I was familiar with technology and could write, rather than being a particularly skilled or trained writer. Any more (just 2.5 years on,) I think the inverse is more true, but that's conjecture.

Technical writing has definitely shaped the evolution of my taste: a couple years ago, I found myself most drawn to complex tightly constructed prose in fiction. These days I mostly go for sparse clear concise prose that isn't particularly ornamented. Perhaps it's only really possible to tune the internal editor for one kind of style at a time.

Having said that, I will confess to feeling--and resisting--the urge to put a bulleted list or some other structured convention of software manuals in fiction.

It's the little things, really.

Multi-Audience Documentation

I've written before about different types of documentation, and the different purposes and goals that each type serves. Rather than rehash what documentation is, I'm interested in using this post to think about ways of managing and organizing the documentation process to produce better documentation more easily, with the end goal of being able to increase both maintainability and usability of documentation resources.

Different groups of users--say: administrators, end-users, and developers--interact with technology in overlapping but distinct ways. For some technologies, the differences between the classes of users are not significant and one set of documentation is probably good for everyone, plus or minus a very small set. In most other cases, multiple resources are required to be able to address the unique needs of different user groups. Figuring out effective ways to address the different kinds of questions that various groups of users ask, in a way that makes sense to those users, is often the primary challenge in writing documentation.

Having said that, writing different sets of documentation for different users is a lot of work, but given time it's not insurmountable. The problem comes after six months or more (say), or a couple of new releases, when it's time to update the documentation and there are three manuals to update instead of one. This is pretty much horrible. Not only is it more work, but the chance of errors skyrockets, and it's just a mess.

The solution, to my mind, is to figure out ways to only ever have to write one set of documentation. While it might make theoretical sense to split the material into multiple groups, do everything you can to avoid splitting the documentation. Typically, a well indexed text can be used by multiple audiences if it's easy enough for users to skip to, and read only, the material they need.

The second class of solutions revolves around taking a more atomic approach to writing documentation. In my own work this manifests in two ways:

  • Setting yourself up for success: understanding how software is going to be updated, or how use is likely to change over time allows you to construct documents that are organized in a way that makes them easy to update. For example: Separate processes from reference material, and split up long processes into logical chunks that you can interlink to remove redundancies.

    Unfortunately, in many cases, it's necessary to learn enough about a project and the different use patterns before you have the background needed to predict what the best structure of the documentation ought to be.

  • Separate structure from content: This is a publishing system requirement, at the core, but using this kind of functionality must be part of the writer's approach. Writers need to build documentation so that the organization (order, hierarchy, etc.) is not implicit in the text, but can be rearranged and reformed as needed. This means writing documentation "atoms" in a structurally generic way. Typically this also leads to better content. As a matter of implementation, a documentation resource would require a layer of "meta files" that provide the organization, which would be added at build time.

In effect this approach follows what most systems are doing anyway, but in practice we need another processing layer. Sphinx is pretty close in many ways but most document formats and build systems don't really have support for this kind of project organization (or they require enough XML tinkering to render them unfeasible.) Once everything's in place and once all of the atoms exist, producing documents for a different audience is just a matter of compiling a new organization layer and defining an additional output target.
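The organization layer described above can be sketched in a few lines of Python. This is only an illustration of the idea, not a description of how Sphinx or any other tool works: the atom names, the ATOMS and MANUALS tables, and the build_manual and unused_atoms helpers are all hypothetical, and in a real project each atom would live in its own source file rather than in a dictionary.

```python
"""Sketch: documentation "atoms" plus a per-audience organization layer."""

# Hypothetical atoms: in a real project each would be its own source file.
ATOMS = {
    "install": "Install the package from your distribution's repository.",
    "configure": "Edit the configuration file to set the listening port.",
    "api-auth": "Every API request must carry an authentication token.",
    "backup": "Schedule a nightly dump of the data directory.",
}

# Hypothetical "meta files": ordered lists of atom names, one per audience.
MANUALS = {
    "administrators": ["install", "configure", "backup"],
    "developers": ["install", "api-auth"],
}


def build_manual(audience, atoms=ATOMS, manuals=MANUALS):
    """Assemble one manual by resolving its outline against the atoms."""
    outline = manuals[audience]
    missing = [name for name in outline if name not in atoms]
    if missing:
        # A build tool should fail loudly when an outline names a missing atom.
        raise KeyError(f"{audience}: missing atoms {missing}")
    return "\n\n".join(atoms[name] for name in outline)


def unused_atoms(atoms=ATOMS, manuals=MANUALS):
    """Report atoms that no manual includes."""
    used = {name for outline in manuals.values() for name in outline}
    return sorted(set(atoms) - used)


if __name__ == "__main__":
    for audience in MANUALS:
        print(f"== {audience} ==")
        print(build_manual(audience))
        print()
```

Adding a manual for a new audience means adding one more entry to MANUALS; the atoms themselves don't change.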

This also produces problems for organization. If content is segregated into hundreds of files for a standard-book-length manual (rather than dozens, say) keeping the files organized is a challenge. If nothing else, build tools will need to do a lot more error checking and hopefully documentation writers will develop standard file organizations and practices that will keep things under control.

Thoughts? Onward and Upward!

Why You Don't Want Programmers to Write Your Documentation

So the documentation sucks. Hire someone to make the documentation suck less.

Simple enough, right?

Right.

Just don't hire a programmer to write documentation, even though this seems to be a pretty common impulse. There are a lot of reasons, but here are some of the most important from my perspective:

  • Programmers focus too closely on the code they write, or might write, to be able to describe and document entire projects. It's really hard to get programmers to approach documentation from the biggest possible frame.
  • Programmers have a hard time organizing larger scale documentation resources, because they approach it as a database problem rather than a cognitive/use problem.
  • Programmers solve problems by writing code, not by documenting it. You can push programmers to write notes and you can push the best programmers who can write to work on documentation; but unless you dedicate an engineer to writing documentation full time (which is a peculiar management decision) documentation will always come second.
  • I'd wager that every organization large enough to have documentation that sucks is probably large enough to have enough documentation work for a full time technical writer.
  • Engineers, particularly those who are familiar with a piece of technology, do this really interesting thing where they explain phenomena from the most basic assumptions when prompted to describe something, but regularly skip crucial steps in processes and parts of explanations if they think they're obvious.

Interesting cognitive phenomena do not make for good documentation.

What am I missing?

Publishing System Requirements

Like issue tracking systems, documentation publication systems are never quite perfect. There are dozens of options, and most of them are horrible and difficult to use for one reason or another. Rather than outline why these systems are less than ideal, I want to provide a list of basic requirements that I think every documentation publishing system [1] should have.

Requirements

  • Tag System. You have to be able to identify and link different pieces of content together in unique and potentially dynamic ways across a number of dimensions. Tagging systems, particularly those that can access and create lists of "other posts with similar tags," are essential for providing some much needed organization to projects that are probably quite complex. Tagging systems should provide some way of supporting multiple tag namespaces. Also, operations affecting tags need to be really efficient, from both the user's and the software's perspective, or else it won't work at realistic scales.
  • Static Generation. Content needs to be statically generated. There are so many good reasons to have static repositories. Static generation allows you to plan releases (which is good if you need to coordinate documentation releases with software releases), and most documentation changes infrequently anyway. The truth is this feature alone isn't so important, but static generation makes the next several features possible or easier.
  • Development Builds. As you work on documentation, it's important to be able to see what the entire resource will look like when published. This is a mass-preview mode, if you will. The issue here is that, unlike some kinds of web-based publications, documentation often needs to be updated in batches, and it's useful to be able to see those changes all at once because they can all interact in peculiar ways. These test builds need to be possible locally, so that work isn't dependent on a network connection, or shared infrastructure.
  • Verification and Testing. While building "self-testing" documentation is really quite difficult (see also /technical-writing/dexy,) I think publication systems should be able to run tests against documents and provide reports, even if the tests are just to make sure that new versions of software haven't been released, or that links still work. It's probably also a good idea to be able to verify that certain conventions are followed in terms of formatting: trailing white space, optional link formats, tagging conventions, required metadata, and so forth.
  • Iteration Support. Documents need to be released and revised on various schedules as new versions and products are developed. Compounding this problem, old documentation (sometimes,) needs to hang around for backwards compatibility and legacy support. Document systems need to have flexible ways to tag documents as out of date, or establish pointers that say "the new version of this document is located here." It's possible to build this off of the tag system, but it's probably better for it to be a separate piece of data.
  • Version Control. These systems are great for storing content, facilitating easy collaboration, and supporting parallel work. Diffs are a great way to provide feedback for writers, and having history is useful for being able to recreate and trace your past thinking when you have to revisit a document or decision weeks and months later.
  • Lightweight Markup. It's dumb to make people write pure XML in pretty much every case. With rst, markdown, pandoc, and the like there's no reason to write XML. Ever. End of story.
  • Renaming and Reorganization. As document repositories grow and develop, it seems inevitable that the initial sketch of the organization for the body of work will change. Documents will need to be moved, URLs will need to be redirected or rewritten, and links will need to be updated. The software needs to support this directly.
  • Workflow Support. Documentation systems need to be able to facilitate editorial workflows and reviews. This should grow out of some combination of a private tag name space and a reporting feature for contributions, which can generate lists of pages to help groups distribute labor and effort.
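To make the "Verification and Testing" requirement a little more concrete, here is a minimal sketch of the kind of convention check I have in mind, written in Python. Everything in it is an assumption made for the example: the idea that documents begin with "key: value" metadata lines, the REQUIRED_METADATA list, and the check_document helper are hypothetical and don't correspond to any existing publishing system.

```python
"""Sketch: convention checks a publishing system might run at build time."""
import re

# Hypothetical convention: every document starts with "key: value" metadata
# lines, and these keys are required.
REQUIRED_METADATA = ("title", "tags")


def check_document(name, text):
    """Return a list of human-readable problems found in one document."""
    problems = []
    lines = text.splitlines()

    # Trailing white space check.
    for number, line in enumerate(lines, start=1):
        if line != line.rstrip():
            problems.append(f"{name}:{number}: trailing white space")

    # Required metadata check: collect keys from the leading header block.
    found = set()
    for line in lines:
        match = re.match(r"^(\w+):\s*\S", line)
        if not match:
            break  # the header block ends at the first non-metadata line
        found.add(match.group(1))
    for key in REQUIRED_METADATA:
        if key not in found:
            problems.append(f"{name}: missing metadata '{key}'")

    return problems


if __name__ == "__main__":
    sample = "title: Install\n\nBody text. \n"
    for problem in check_document("install.txt", sample):
        print(problem)
```

A build could run a check like this over every source file and refuse to publish (or just print a report) when the list of problems isn't empty. Link checking and version checks would fit the same shape, but need network access, so I've left them out of the sketch.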

This might just be a quirk of my approach, but I tend to approach documentation, in terms of process and tooling, as if it were programming and writing software. They aren't identical tasks, of course, but there are a lot of functional similarities, and definitely enough to take advantage of the tooling and advances (i.e. make, git, etc.) that programmers have been able to build for themselves. Am I missing something or totally off base?

[1]Think knowledge bases, documentation sites, and online manuals. I'm generally of the opinion that one should be able to publish all of these materials using the same tool.

Managing Emacs Configuration

This document outlines the use of emacs' require and provide functions to help new users understand how to better configure the text editor. While there are a number of different strategies for organizing emacs configuration files and lisp systems and there is no single dominant best practice, consider this document if you find your .emacs or init.el file growing out of control.

Background and Overview

After using emacs for any period of time, one begins to develop a rather extensive emacs configuration. Emacs comes with very little default configuration and a large number of configuration possibilities. Because writers, programmers, and researchers of all persuasions and backgrounds use emacs for a large array of tasks and work profiles, the need for customization is often quite high.

Rather than have a massive emacs configuration with thousands of lines, I've broken the configuration into a number of files that are easier to manage, easier to troubleshoot, and easier to make sense of. These files are then linked together and loaded using emacs' native require function. This document explains that organizational principle and provides the code needed to duplicate my configuration.

I store all of my emacs configuration in a folder that I will refer to as ~/emacs/. In actuality this is a sub-folder within a git repository that I use to store all of my configuration folders, and you should modify this location to suit your own needs. Additionally, I have the habit of prepending the characters tycho- to every function and emacs file name of my own writing. This namespace trick helps keep my customization separate from emacs' own functions or the functions of loaded packages and prevents unintended consequences in most cases. You might want to consider a similar practice.

Configuring .emacs

My .emacs file is really a symbolic link to the ~/emacs/config/$HOSTNAME.el file. This allows the contents of .emacs to be in version control and, if you have your emacs configuration on multiple machines, lets you share the same basic configuration across them along with whatever machine-specific configuration you require. To create this symlink, issue the following command:

ln -s ~/emacs/config/$HOSTNAME.el ~/.emacs

Make sure that all required files and directories exist. My .emacs file, regardless of its actual location, is very minimal because the meat of the configuration is in ~/emacs/tycho-init.el. Take the following skeleton for ~/.emacs:

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;
;; Startup and Behavior Controls
;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

(setq load-path (cons "~/emacs" load-path))

(setq custom-file "~/emacs/custom.el")
(add-to-list 'load-path "~/emacs/snippet/")
(add-to-list 'load-path "/usr/share/emacs/site-lisp/slime/")

(require 'tycho-display)

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;
;; Machine Specific Configuration Section
;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

(tycho-font-medium)
(setq bookmark-default-file "~/garen/emacs/bookmarks/arendt"
      w3m-session-file "~/garen/emacs/bookmarks/w3m-session-arendt"
      bookmark-save-flag 1)

(if (file-directory-p "~/garen/emacs/backup")
    (setq backup-directory-alist '(("." . "~/garen/emacs/backup")))
  (message "Directory does not exist: ~/garen/emacs/backup"))

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;
;; Load the real init
;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

(require 'tycho-init)

(menu-bar-mode -1)

The first setq defines the load path. Like other configuration paths, this is the list of directories where emacs will look for files to load when you use require later. load-path does not crawl a directory hierarchy, so if you store emacs lisp in subdirectories of ~/emacs/, you'll need to add those directories here. To see the value of load-path, use "C-h v" in emacs. I then define custom.el as its own file, by setting custom-file, to prevent customize from saving configuration in my init file. Then I use require to load a number of display-related functions (from the file ~/emacs/tycho-display.el), including the tycho-font-medium function.

Then I have a number of machine-specific configuration options set, mostly to keep multiple machines from overwriting state files.

Finally, I load the file with the real configuration with the (require 'tycho-init) sexp. The configuration is located in the ~/emacs/tycho-init.el file. The file closes with the (menu-bar-mode -1) sexp, which is the last part of the configuration to evaluate and ensures that there isn't a menu-bar at all.

Require and Provide

require, however, does not simply load .el files in the load path. Rather, the file needs to be announced to emacs. Accomplish this with a provide call in the file. For ~/emacs/tycho-display.el the relevant parts are as follows:

(provide 'tycho-display)

(defun tycho-font-medium ()
  (interactive)
  (setq default-frame-alist '((font-backend . "xft")
                              (font . "Inconsolata-13")
                              (vertical-scroll-bars . 0)
                              (menu-bar-lines . 0)
                              (tool-bar-lines . 0)
                              (alpha 86 84)))
  (tool-bar-mode -1)
  (scroll-bar-mode -1))

;; Bind the function defined above (note the tycho- prefix).
(global-set-key (kbd "C-c f m") 'tycho-font-medium)

(setq-default inhibit-startup-message 't
              initial-scratch-message 'nil
              save-place t
              scroll-bar-mode nil
              tool-bar-mode nil
              menu-bar-mode nil
              scroll-margin 0
              indent-tabs-mode nil
              flyspell-issue-message-flag 'nil
              size-indication-mode t
              scroll-conservatively 25
              scroll-preserve-screen-position 1
              cursor-in-non-selected-windows nil)

The provide call identifies this file as the location of the tycho-display functionality. tycho-font-medium describes the font and display parameters that I called in the .emacs file. And the file ends with a keybinding to call that function and a number of default settings.

Init and Conclusion

While the tycho-init.el file holds all of the interesting configuration options, functions and settings, it's mostly beyond the scope of this document. When you download contributed emacs lisp files from emacswiki, put them in ~/emacs/ and put the require call in tycho-init.el. By convention provide names map to file names, but be sure to check files to ensure that this is the case.

Using this setup as a framework, you can create--without confusion--a number of configuration files to properly collect and organize your settings, emacs modes, and other emacs code and functions that you've gotten from other users. Good luck!


Public Transit Information Overload: A Lesson

Philadelphia is replacing, or at least promising to replace, the trains that run the commuter rail system. The new trains are 35-40 years newer than the usual fare, and are replete with "new technologies," one of which is an automated (I believe GPS-based) announcement system, which figures out what station is next, and which line you're on. This is great in theory, but there's a problem.

This system gives you too much information. Trains in Philly are named by their terminus, and all trains converge (and pass through) downtown. There's history here which makes things a bit easier to understand if you're a transit geek, but after every stop--including outlying stops--the train tells you what line you're on, and which stops it makes or skips. The problems:

  • At most outlying stations, you can tell by the station you're at, which line you're on. It's sometimes useful to know where the train you're on is headed, but the trains only tell you that on the outside of the train, until you get downtown, when the announcements change from "where you've been," to "where you're going."
  • The "this is the train you're on," announcements don't change as you pass stops, so you hear where the train's been at every stop, even after you've passed the relevant stops. The announcements make sense, as there are five or six "main line" stops that some trains stop at and others don't, so as you're heading towards doubtful stops, it's useful information; once you're past them: less so.
  • All announcements are displayed on screens in written form and read by a speech synthesizer. I understand the accessibility concerns, but there are still conductors and I'm not sure that the information is presented in a way that is usable by people who don't already have a significant understanding of the transit system.

Given this background, as a technical writer, and someone who geeks out on information presentation, I felt that there are a number of things that can be learned from this case:

  • More information is sometimes confusing, and can make concepts harder to grasp.
  • Figuring out what people need to know in any given situation is more important (and more difficult) than figuring out what is true or correct.
  • Sometimes multi-modal presentation may not actually add value in proportion with the amount of annoyance it generates.
  • Presentation matters. The speech synthesizer does not sound very good and it's inefficient.

make all dexy

See "Why The World Is Ready For Dexy" and "Dexy and Literate Documentation" as well as the technical writing section of the tychoish wiki for some background to this post.

The brief synopsis: dexy is a new tool for handling the process of the documentation work flow between writing and publication. It takes snippets of code, and bits of documentation and passes these atomic chunks through filters to generate technical articles, manuals, and other high quality resources. It's exciting because it may provide a way to write and manage the documentation process in a manner that is more effective than many other options and has the potential to produce better quality technical texts.

The reason, I think, is that dexy treats documentation like code. This is different, fundamentally, from systems that expect that developers write documentation. The former has interesting implications about the way technical writers work, and the latter is nearly always a foolhardy proposition doomed to end in failure.

Documentation has a lot in common with code: documentation is written and revised in response to software versions, so the process of iterations has a lot in common. Documentation should typically be compiled, and the build process should produce a static product, between iterations. Documentation, like code, must also be maintained and fixed in response to new bugs and use-cases as they are found/developed.

If we accept this analogy, Dexy begins to look more like a tool like make which is used to manage compilation of code. make figures out what source files have changed, and what needs to be rebuilt in order to produce some sort of binary code. That doesn't sound like a very important tool, but it is. make makes it easy to automate tasks with dependencies, without writing a bunch of novel code to check to see what's been done and what has yet to be done, particularly when build processes need to be done in parallel. Furthermore, make is one of these typical "UNIX-like" utilities that does only one thing (but does it very well) and ties together functionality from a number of different kinds of programs (compilers, configuration tools, etc.)
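The core of what make does, deciding whether something needs rebuilding, comes down to comparing timestamps. The Python sketch below illustrates just that one idea. It is not how make or Dexy is implemented, the needs_rebuild and build helpers are names I've made up for the example, and real make also handles chains of dependencies and parallel jobs, which this ignores.

```python
"""Sketch: the timestamp comparison at the heart of a make-style rebuild."""
import os


def needs_rebuild(target, sources):
    """Return True if target is missing or older than any of its sources."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(src) > target_mtime for src in sources)


def build(target, sources, action):
    """Run action(target, sources) only when the target is out of date."""
    if needs_rebuild(target, sources):
        action(target, sources)
        return True
    return False
```

Here action stands in for whatever actually produces the output: a compiler for code, or a text-processing filter for a piece of documentation. The value of a tool like make is that you declare targets and sources once and never write this bookkeeping yourself.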

Dexy is kind of like that. It manages a build process. It ties together a group of existing tools, thereby saving developer time and building something that can be more flexible and consistent.

This is, however, an imperfect analogy: I think Dexy isn't "make for documentation," because it would be possible to use make to manage the build process for documentation as well as code. [1] Dexy manages text processing; make can work one level up--if needed--to build coherent texts from Dexy-processed documentation units. Dexy and make are glue that turns documentation and code into useful resources.

There are obviously some situations where this developer-like workflow may be overly complicated. For starters, Dexy, like make, really provides a framework for building documents. A portion of creating every project and text in this manner would necessarily go to developing build-related infrastructure. It's not a huge burden, but it's the kind of thing that requires a little bit of thought, and maybe some good "default" or base configuration for new projects and texts. Dexy is a great first step into a new way of thinking about and working with documentation, but there is much work yet to be done.

Onward and Upward!

[1]I should thank, or perhaps blame, a coworker for planting this idea in my mind.