NYC Subway Strategy

I have a question for game theory/urban planning/transit geeks, in part for practical reasons, and in part for a story I’m developing:

Is there some sort of resource that explains “most efficient” rapid transit rider strategies, perhaps from a game theory perspective?

I’ve been living in NYC for almost a year, with frequent visits for about 6 months before that, and I’ve learned things like:

  • The physical layouts of a number of station complexes and transfer points, to facilitate quick/easy transfers.

  • A faltering sense of when to take an express and when to take a local, and when it makes sense to switch.

  • A decent sense of which route will be more direct/quicker in a given situation.

  • An acceptable sense of which part of the train you need to be on.

    I’m interested in knowing if there is any work aimed at a general audience that addresses any of these questions, in particular:

  • The express/local decision making logic, particularly given situations like:

  • When does it make sense to walk ~5-10 blocks to an express stop (possibly in the wrong direction,) rather than walk a shorter distance to a local stop that would require a transfer, assuming your destination is an express stop? (The sketch after this list frames the comparison.)

  • At what point in a journey does it make the most sense to transfer between trains?

  • How does time-of-day affect the logic?

  • Specific differences (if any) for navigating subway lines with multiple converging services. (e.g. the “M” with regards to the F and E, as well as the R with regards to the N in Manhattan.)

  • Logic for transferring between non-parallel train services that intersect at multiple points in a given journey (e.g. in Brooklyn, vs. Manhattan, if needed for inter-borough trips; E/M; 4,5,6 vs N, R.)

  • Logic for transferring between services that run on the same track (e.g. in NYC: N/R, E/C, F/M, etc.) with respect to how your journey overlays the route divergence and convergences.
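
As a starting point for the express/local question, here’s a toy comparison of expected travel times. Every number is a made-up assumption for illustration--not real MTA data--and the model ignores real-world wrinkles like service changes and crowding:

#!/bin/sh
# toy expected-time model for the express-vs-local decision;
# all of these figures are hypothetical assumptions
WALK_EXPRESS=10   # minutes to walk to the express stop
WALK_LOCAL=3      # minutes to walk to the local stop
WAIT=5            # assumed average wait for either train
RIDE_EXPRESS=12   # express ride to the destination
RIDE_LOCAL=16     # local ride to the same destination
TRANSFER=4        # assumed transfer penalty (walk + second wait)

express=$((WALK_EXPRESS + WAIT + RIDE_EXPRESS))
local_route=$((WALK_LOCAL + WAIT + RIDE_LOCAL + TRANSFER))

echo "express: $express minutes; local+transfer: $local_route minutes"

With these (arbitrary) numbers the express wins, 27 minutes to 28, but shave a couple of minutes off the walk and the local comes out ahead--which is why the decision is so situational.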

The Million Words of Crap

There’s this saying--I tracked it down once but have forgotten the source again--that before you can write anything good, you have to write a million words of crap.

Well, I’ve done it. I’m guessing that most writers hit this number without much difficulty fairly early on in their lives/careers, but it’s hard to keep records and do proper accounting of this data. And let’s be frank, it’s kind of a stupid thing to track. While I’m definitely a better writer these days than I was even 2 years ago, I know that I have a lot more to learn.

In any case, I have a firm account of a million words of crap that I’ve written in the last 10 years or so:

  • tychoish.com (this wiki, mostly rhizome) accounts for a bit more than 800,000 words.
  • I wrote a novel in 2002 and 2003 that now exists primarily as a paper volume in a zippered binder that I use as a foot rest. That was 100,000 words.
  • Since September 26, 2011 I’ve written a bit under 101,500 words for my work. The number is slightly inflated on account of code samples, some repetitive sections, and a few articles that colleagues wrote and I did significant editing/rewriting on.
  • Between June 2009 and May 2011, I wrote somewhere between 100,000 and 200,000 words for a previous job (again some measure of redundant content, other people’s work, and code samples makes this figure hard to track.)
  • The Knowing Mars novella that I wrote between July 2007 and ~March 2008 is about 35,000 words.
  • The novel that I finished the first draft of a few weeks ago is about 85,000 words.
  • I’m 35,000 words into the first draft of a technical book-like object.
  • I have 10,000 words in a couple of other projects.

Which puts me way, way, way over (what, 1,300,000?) I wonder when I broke the million-word mark.

Some observations on writing.

  • I still make all kinds of writing mistakes that I find embarrassing and difficult. Particularly with fiction (which probably only accounts for 200,000-250,000 words,) I feel incredibly clumsy.

  • At the same time, as long as I have a clear idea of what a piece of text needs to say, I’m reasonably comfortable pulling together a draft without much fuss. Sometimes figuring out what I need to say involves a bunch of reading and a long walk, but it’s possible.

  • I’m getting better at writing less. Writing concisely is hard work, and it’s easy to get into the habit of generating “word stew” that doesn’t say anything useful and is impenetrable to read.

  • Writing is hard. I’ve always been an awful speller and I have some dyslexic moments, but--as longtime readers of the blog have surely noticed--I’m getting way better at writing cleaner copy.

    Editing other people’s writing on a regular basis has also been enlightening: it puts things in perspective and lets me see value in what I can do with words. Also, it’s cool to be able to help other people write.

  • This is totally a correlation, but I pretty much never write anything longhand, and I think I’m a better writer for it. The downside is that I’m incredibly dependent on my own computer and its setup to do anything.

Onward and Upward!

Loops and Git Automation

This post provides a few quick overviews of cool bits of shell script that I’ve written or put together recently. Nothing earth shattering, but perhaps interesting nonetheless.

Making a Bunch of Symbolic Links

I have two collections of Maildir folders to store my mail: one in a ~/work/mail folder for my work mail, and ~/mail for my personal projects. I want the work mailboxes to be symbolically linked to the personal ones (some work email, notably from GitHub, comes to a personal address, and I want to be able to refile as necessary.) I moved the work maildirs last week and needed to create about 15 symbolic links, so I wrote the following little loop:

cd ~/mail
for i in $(find ~/work/mail -maxdepth 1 -mindepth 1 -type d); do
   # link each work maildir into ~/mail with a "work." prefix
   ln -s "$i" "work.$(basename "$i")"
done

Should be useful, no?

Commit all Git Changes

For a long time, I used the following bit of code to provide the inverse operation of “git add .”. Where “git add .” adds all uncommitted changes to the staging area for the next commit, the following snippet automatically removes all files that are no longer present on the file-system from the staging area for the next commit.

if [ "`git ls-files -d | wc -l`" -gt "0" ]; then
  git rm --quiet `git ls-files -d`
fi

This is great: if you forget to use “git mv” or you delete a file using rm, you can run this operation and pretty quickly have git catch up with the state of reality. In retrospect I’m not really sure why I put the error-checking if statement in there, though it does keep git rm from running (and complaining) when there are no deleted files.

There are two other implementations of this basic idea that I’m aware of:

for i in $(git ls-files -d); do
  git rm --quiet "$i"
done

It turns out you can do pretty much the same thing using the xargs command, and you end up with something that’s a bit more succinct (and, thanks to the null-delimited output, safe for filenames with whitespace):

git ls-files --deleted -z | xargs -0 git rm --quiet

I’m not sure why; I think it’s because I started being a Unix nerd after Linux dropped the argument number limit, and as a result I’ve never really gotten a chance to become familiar with xargs. While I sometimes sense that a problem is xargs-shaped, I almost never run into “too many arguments” errors, and I always attempt other solutions first.

A Note About xargs

If you’re familiar with xargs skip this section. Otherwise, it’s geeky story time.

While this isn’t currently an issue on Linux, some older UNIX systems (including older versions of Linux,) had a limitation where you could only pass a certain number of arguments to a command. If you had too many, the command would produce an error, and you had to find another way.

I’m not sure what the number was, and the specific number isn’t particularly important to the story. Generally, I understand that this problem would crop up when attempting to take the output of a command like find and pipe or pass it to another command like grep. I’m not sure if you can trigger “too many arguments” errors with globbing (i.e. *) but like I said this kind of thing is pretty uncommon these days.

One of the “other ways” was to use the xargs command, which basically takes a very long list of arguments and passes them along, in batches, to another command. My gut feeling is that xargs can do some things, like the above, a bit more robustly, but that isn’t experimentally grounded. Thoughts?
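
If you want to see the batching behavior for yourself, here’s a contrived demonstration (the numbers are arbitrary; seq and the -n flag are standard fare):

# feed ten arguments to echo, at most three per invocation
seq 1 10 | xargs -n 3 echo

This prints “1 2 3”, “4 5 6”, “7 8 9”, and “10” on four separate lines: xargs ran echo four times, packing up to three arguments into each call until it ran out.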

Onward and Upward!

Cron is the Wrong Solution

Cron is great, right? For the uninitiated, if there are any of you left, cron is a task scheduler that makes it possible to run various scripts and programs at specified intervals. This means that you can write programs that “do a thing” in a stateless way and set them to run regularly, without having to consider any logic regarding when to run or any kind of state tracking. Cron is simple and the right way to do a great deal of routine automation, but there are caveats.
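
For concreteness, a crontab entry looks like this (the script path is a hypothetical placeholder):

# minute hour day-of-month month day-of-week command
*/5 * * * * $HOME/bin/do-a-thing.sh

That runs the script every five minutes, forever, with no further logic.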

At times I’ve had scads of cron jobs, and while they work, from time to time I find myself going through the list of cron tasks on various systems and removing most of them, or finding better approaches.

The problems with cron are simple:

  • It’s often a sledgehammer, and it’s very easy to put something in a cron job that needs a little more delicacy.

  • While it’s possible to capture the output of cron tasks (typically via email,) the feedback from cron jobs is hard to follow. So it’s hard to detect errors, performance deterioration, inefficiencies, or bugs proactively.

  • It’s too easy to set something to run every minute or every couple of minutes. A task that seems relatively lightweight when you run it once can end up being expensive in the aggregate when it has to run a thousand times a day.

    This isn’t to say that there aren’t places where cron is absolutely the right solution, but often there are better options. For instance:

  • Include simple tests and logic in the cron task so it can determine whether it actually needs to run before doing any real work. (There’s a sketch of this after the list.)

  • Make things very easy to invoke on demand rather than automatically running them regularly.

    I’ve begun to find little scripts bound to dmenu, or an easily called emacs-lisp function, preferable for a lot of tasks that I’d otherwise set in a cron job.

  • Write real daemons. It’s hard and you have to make sure that they don’t error out or quit unexpectedly--which requires at least primitive monitoring--but a little bit of work here can go a long way.
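
As a sketch of the first approach--everything here is a hypothetical placeholder for your actual task--a cron script can test for pending work and exit cheaply when there’s nothing to do:

#!/bin/sh
# hypothetical cron task: skip the expensive operation unless
# the queue directory actually contains work
QUEUE="$HOME/queue"

if [ -z "$(find "$QUEUE" -type f -print -quit 2>/dev/null)" ]; then
   exit 0   # nothing queued; bail out without doing anything
fi

process-queue "$QUEUE"   # stand-in for the real operation

The guard runs in milliseconds, so even an every-minute schedule stays cheap when the task has nothing to do.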

Onward and Upward!

Git In Practice

Most people don’t use git particularly well. It’s a capable piece of software that supports a number of different workflows, but because it doesn’t mandate any particular workflow it’s possible to use git productively for years without ever really touching some features.

And some of the features--in my experience mostly those related to more manual branching, merging, and history manipulation operations--are woefully underutilized. Part of this is because GitHub, which is responsible for facilitating much of git’s use, promotes a specific workflow that makes it possible to do most of the (minimally required) branch operations on the server side, with the help of a much more constrained interface. GitHub makes git usable by making it possible to get most of the benefit of git without needing to mess with SHA1 hashes, or anything difficult on the command-line.

That’s a good thing. Mostly.

Nevertheless, a few operations remain hard with git: I sometimes encounter situations that I have to try a few times before I get right, and there are commands where I always have to check the man page to figure out how to specify the references--and even then I’m sometimes still confused. So maybe I (or we?) can spend a little bit of time figuring out which processes remain hard with git, and see if there’s a way to make them a bit more streamlined.

Here’s my list:

  • Reorder all commits since commit x.

    This is basically: find the commit before the earliest one that you want to change, then run git rebase -i <commit hash> to reorder the commits--even though the interactive rebase lists the commits in the order (oldest first) that I find most un-intuitive.

  • Create local branches to track remote branches or repositories.

    Setup the remotes, if necessary, and then run: git branch --track <local-branch-name> <remote>/<branch-name> and git config branch.{name}.push {local-branch}:master.

  • Stash all local changes and switch branches.

    It would also be nice if you could figure out a way for git (or a helper) to see any open files in your text editor and save/close them if needed. (The stash commands themselves are in the sketch after this list.)

  • Pull a commit from the history of one branch into another branch without pulling anything else.

    I think this is cherry-pick? It might also be nice to pull a series of commits from one branch, collapse them into one commit in the destination branch, and then commit that. (Also sketched below.)

  • Pretty much every time I’ve tried to use the merge command to get something other than what I would have expected from “pull,” it has ended tragically.
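
For the stash and cherry-pick items above, here’s my understanding of the basic commands; take it as a sketch (the branch names and hashes are hypothetical) rather than a definitive recipe:

# stash local changes, switch branches, then bring the changes back
git stash
git checkout other-branch
git stash pop

# pull a single commit from another branch's history
git cherry-pick <commit-hash>

# pull a series of commits and collapse them into a single commit
git cherry-pick --no-commit <first-hash>^..<last-hash>
git commit

The range form applies each commit to the working tree and index without committing, so everything lands in one commit at the end.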

Reader suggestions:

  • Put your process/procedural frustrations with git here.

How about we work on figuring out how to solve these problems in comments?

Assisted Editing

I learned about artbollocks-mode.el from Sacha Chua’s post, and it’s pretty freaking amazing.

Basically, it does some processing of your writing--while you work--to highlight passive sentences and affected jargon.1 And that’s all. There are some functions for generating statistics about your writing, but I find I don’t use that functionality often. You can enable it all of the time, or just turn it on when you’re doing editing.

After a few weeks, I’ve noticed a marked improvement in the quality of my output. I leave it on all the time, but I’m pretty good at resisting the urge to edit while I’m writing--or at least I’m pretty good at picking up again after going back to tweak a wording. In general it’s hard to keep more than a few things in mind during an editing pass.

It turns out that the instant feedback on passive sentences, even though it’s not perfect, is great for improving the quality of my content the first time out. And it’s even better for doing editing work: it’s harder to ignore a passive sentence when the editor highlights a screen full of them for you.

It’s of course important to be able to ignore its suggestions from time to time, and it’s no harder to ignore than “flyspell-mode” (the on-the-fly spell checker in emacs.)


  1. This is perhaps the clumsiest part of the default distribution, as jargon is terribly specific to the kind of writing you’re doing, and it turns out that one of the “art critic”/post-modern words (e.g. “node”) is a word that I end up using (acceptably, I think) in a technical context when describing a clustered system. There’s a difference between a technical lexicon and jargon, and regular expressions aren’t terribly sensitive to this, so the actual list of words that you need to call yourself out on varies a bit from person to person. But once you customize it, it’s great. ↩︎

Allowable Complexity

I’m not sure I’d fully realized it before, but the key problems in systems administration--at least the kind that I interact with the most--are really manifestations of a tension between complexity and reliability.

Complex systems are often more capable and flexible, so goes the theory. At the same time, complexity often leads to operational failure, as a larger number of moving parts leads to more potential points of failure. I think it’s an age-old engineering problem and I doubt that there are good practical answers.

I’ve been working on a writing project where I’ve been exploring a number of fundamental systems administration problem domains, so this kind of thing is on my mind. It seems that the way to address the hard questions often comes back to “what are the actual requirements, and are you willing to pay the premiums to make the complex system reliable?”

Trade-offs around complexity also happen in software development proper: in the last few months I’ve heard more than a few developers weigh the complexity of using dynamic languages like Python for very large scale projects. While the questions and implications manifest differently for code, it seems like this is part of the same problem.

Rather than prattle on about various approaches, I’m just going to close out this post with a few open questions/thoughts:

  • What’s the process for determining requirements that accounts for actual required complexity?

  • How do things that had previously been complex become less complex?

    Perhaps someone just has to write the code in C or C++ and let it mature for a few years before administrators accept it as stable?

  • Is there a corresponding complexity threshold in software development and within software itself? (Likely yes,) and is it related to something intrinsic to particular design patterns, or to tooling (i.e. programming language implementations, compilers, and so forth)?

Might better developer tooling allow us to write programs of larger scope in dynamic languages?

Reader submitted questions:

  • Your questions here.

Answers, or attempts thereat in comments.

Persistent Emacs Daemons

I’ve been subject to a rather annoying emacs bug for months. Basically, if you start emacs with the --daemon switch and the X11 session exits while any emacs frames are open, the emacs process dies. No warning. The whole point, to my mind, of the daemon mode is to allow emacs sessions to persist beyond the current X11 session.

This shouldn’t happen. I think this is the relevant bug report, but I seem to remember that the issue has something to do with the way that GTK interacts with the X11 session and emacs’s frames. It’s something of a deadlock: the GTK project has no real need to fix the bug (and/or it’s a behavior that they rely on for other uses,) and it might not really be possible or feasible for emacs to work around this issue.1

I also think that it’s probably fair to say that daemon mode represents a small minority of all emacs usage.

Regardless, I’ve figured out the workaround:

Turns out, it’s totally possible to build GNU emacs without GTK by using the “Lucid” build--which is to say, use the windowing toolkit built for Lucid Emacs (i.e. XEmacs,) rather than GTK. Using the code below, I was able to get an emacs experience with the new build that seems identical2 to the one I used to get with GTK, except without the frustrating crashes every time that X11 spazzed when I decided to unplug a monitor or some such. A welcome improvement, indeed.
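
If you’re building from source, the relevant configure flag selects the Lucid toolkit. This is a sketch of the standard GNU emacs build procedure--your prefix, dependencies, and invocation may differ:

# configure emacs to use the Lucid toolkit instead of GTK
./configure --with-x-toolkit=lucid
make
sudo make install

# then start the daemon and attach graphical clients as usual
emacs --daemon
emacsclient -c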

The following emacs-lisp covers all of the relevant configuration of the “look and feel” of my emacs session. Install the Inconsolata font if you haven’t already; you’ll be glad you did.

(setq-default inhibit-startup-message t
              initial-scratch-message nil
              save-place t
              scroll-bar-mode nil
              tool-bar-mode nil
              menu-bar-mode nil
              scroll-margin 0
              indent-tabs-mode nil
              flyspell-issue-message-flag nil
              size-indication-mode t
              scroll-conservatively 25
              scroll-preserve-screen-position 1
              cursor-in-non-selected-windows nil)

(setq default-frame-alist '((font-backend . "xft")
                            (font . "Inconsolata-14")
                            (vertical-scroll-bars . 0)
                            (menu-bar-lines . 0)
                            (tool-bar-lines . 0)
                            (alpha 86 84)))

(tool-bar-mode -1)
(scroll-bar-mode -1)
(menu-bar-mode -1)

Hope this helps you and/or anyone else who might have run into this problem.


  1. I’d like to add the citation and more information here, but can’t find it. ↩︎

  2. To be fair, I mostly don’t use the GUI elements in emacs, though having emacs instances outside of the terminal is nice for displaying images when using emacs-w3m, and for having a little bit of additional display flexibility for some more rich modes. ↩︎