The Editing Hole

I’m stuck in an editing hole, and not only am I not editing the things I need to edit, I’m not getting anything done.

I’m at a point where I have about 25 things on my personal task list, and 16 of them are editing-related tasks: edit the article in this file, edit this fiction, edit this documentation, edit these would-be blog posts, and so forth. It seems like I went on something of a six-month writing bender, and while I did a little bit of editing during this period, I have clearly fallen behind.

There are a number of factors:

1. I’ve been making a point of putting editing tasks on the todo list because I want to make sure that I actually finish projects rather than just abandon them. I’ve not been particularly good with follow through in the past few years, so that’s been a big personal improvement project.

The sad part is that my editing queue is probably 10-20 times larger, but I’ve got some projects on the less-actionable back burner.

2. I’m not a very good editor.

I’m awful at copy (or otherwise) editing my own work. While I know that I’ve become better at this in the past few years, I still know that it’s not perfect, so it seems sort of futile, which makes it hard to get inspired to do editing.

3. I find writing to be rewarding, and given the choice I will probably always choose to write new stuff. While this is clearly a learned response to the kind of work, this doesn’t make the effect any less real.

4. I find editing to be really difficult work.

This is probably related to #2, but editing wears me out. I find it difficult to spend long periods of time editing, which makes it difficult to make any really substantial progress on the pile of editing tasks.

As a result, I take a long time to edit things, and I’m most effective at editing in short bursts. I often want to break up longer editing tasks with other kinds of work just to keep a clear mindset. After a week or so of this, I have almost everything else done on the list, leaving me with a big pile of editing that looks even bigger for the lack of other things on the list.

So now, I’m trying the following:

1. I’m working on making editing tasks smaller, which will turn into more editing tasks, but it’ll be possible to face editing tasks in units less than 10 or 20 pages.

2. Make more tasks for other projects.

There are two ways to get stuff done: 1) you can be really focused and work on one project at a time until it’s finished, or 2) you can be working on a lot of projects in parallel, and when you start to lose focus, you switch to another kind of project. The idea is that you end up getting more done because you’re being productive more of the time. I subscribe to the second theory.

Here’s hoping it works!

Onward and Upward!

Git Feature Requests

  • The ability to mark a branch “diverged,” to prevent (or warn about) attempted merges from master (for example) into a maintenance branch.

  • The ability to create and track dedicated topic branches, and complementary tooling to encourage rebasing commits in these sorts of branches. We might call them “patch sets” or “sets” rather than “branches.” Also, it might be useful to think about using/displaying these commits, when published, in a different way.

  • Represent merge commits as hyperlinks to the user, when possible. I think GitHub’s “network graph” and similar visualizations are great for showing how commits and branches interact and relate to each other.

    This would probably require some additional or modified output from “git log”.

  • Named stashes.

  • Branched stashes (perhaps this is closer to what I’m thinking about for the request regarding topic branches.)

  • The ability to check out “working copies” of different points/branches from a single repository at the same time, using “native” git utilities.

    Related: “shelf” functionality is scriptable, but this too needs to be easier and better supported.

    I think legit is a step in the right direction, but it’s weird and probably makes it more difficult to understand what’s happening with git conceptually, as opposed to the above features, which would provide more appropriate conceptual metaphors for the work that would-be git users need to do.
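
As a rough illustration of the first request, the “diverged” check could be sketched in Python. Everything here is hypothetical: git has no notion of a “diverged” marker, and the names `DIVERGED` and `check_merge` are made up for this sketch. A wrapper script or hook would have to supply the branch names.

```python
# Hypothetical sketch: git itself has no "diverged" marker for branches.
# Maps a protected branch to the branches that should not be merged into it.
DIVERGED = {
    "maint-1.0": {"master"},
}


def check_merge(target, source, diverged=DIVERGED):
    """Return a warning string if merging `source` into `target` is flagged,
    or None if nothing is known to be wrong with the merge."""
    if source in diverged.get(target, set()):
        return f"warning: {target} is marked as diverged from {source}"
    return None


# Merging master into the maintenance branch triggers the warning;
# merging the other way does not.
print(check_merge("maint-1.0", "master"))
print(check_merge("master", "maint-1.0"))
```

Whether the wrapper refuses the merge or only prints the warning is a policy choice that this sketch leaves to the caller.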

Limitations of GitHub Forks

Assumptions:

  1. git is pretty awesome, but it’s conceptually complex. As a result using git demands a preexisting familiarity with git itself or some sort of wrapper to minimize the conceptual overhead.
  2. The collaboration methods (i.e. hosting) provided by git, which are simple by design to allow maximum flexibility, do not provide enough structure to be practically useful. As a result providers like GitHub (and BitBucket and gitorious) offer a valuable service that makes it easier--or even possible--for people to use git.

Caveats:

  • there are problems with using centralized repository services controlled by third parties, particularly for open source/free software projects.

    There are ways that GitHub succeeds and fails in this regard, but this dynamic is too complex to fully investigate within the scope of this post.

  • If you use GitHub as designed, and the way that most projects use GitHub, then you have a very specific and particular view of how Git works.

    While this isn’t a bad thing, it’s less easy to use git in some more distributed workflows as a result. This isn’t GitHub’s fault so much as it is an artifact of people not really knowing how git itself works.

Assertions:

  1. GitHub’s “fork” model[^fork] disincentivizes people from working in “topic” branches.

  2. By making it really easy for people to publish their branches, GitHub disincentivizes the most productive use of the “git rebase” command that leads to clean and clear histories.

  3. There’s no distinction between a “soft fork” where you create a fork for the purpose of submitting a patch (i.e. a “pull request”) and a “hard fork,” where you actually want to break the relationship with the original project.

    This is mostly meaningful in context of the other features that GitHub provides, notably the “Network” chart, and the issue tracker. In a soft-fork that I would intend to merge back in, I’d like the issues to “come with,” the repository, or at least connect in some way to the “parent.” For hard forks, it might make sense to leave the old issues behind. The same with the network chart, which is incredibly powerful, but it’s not great at guessing how your repository relates to the rest of its “social network.”

The solution: keep innovating, keep fighting lock-in, and don’t let GitHub dictate how you work.

Making Things Easier

I spent a lot of time in the past few months thinking about “automation” as a project to turn things that take a long time and require a lot of human intervention into things that just do themselves, and I think this is the wrong approach.

While total automation is an admirable goal, it’s difficult, both because it requires more complex software to deal with edge cases and because it’s hard to iterate into a fully automated solution.

Let’s back up for a moment and talk about automation in general.

Computers are great at automating things. When you figure out exactly how to accomplish something digitally (e.g. polling an information source for an update, transforming data, testing a system or tool), writing a program to perform this function is a great idea, not least because it reduces the workload on actual people (i.e. you). I think the difference between people who are “good with computers” and people who are “great with computers” is the ability to spot opportunities for these kinds of automations, and potentially implement them.

To my mind the most important reason to automate tasks is to ensure consistency and to make it more likely that tedious tasks get done.

Having said this, rather than develop complete task automations for common functions, the better solution is probably to approach automation from the bottom up: instead of automating a complete process, automate smaller pieces, particularly the most repetitive and invariable parts, and then provide a way for people to trigger the (now simplified) task.

The end result is a system that’s more flexible, easier to write, and less prone to failure under weird edge cases. Perhaps this is also a manifestation of “worse is better.”
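
To make the bottom-up idea a little more concrete, here is a minimal Python sketch. The step names and the `run` helper are invented for illustration: each function automates one small, invariable piece, and a person decides which pieces to trigger and in what order.

```python
# Hypothetical sketch of "bottom-up" automation: small registered steps
# that a person triggers by name, rather than one end-to-end process.
STEPS = {}


def step(func):
    """Register a small automated step under its function name."""
    STEPS[func.__name__] = func
    return func


@step
def normalize_whitespace(text):
    # Collapse runs of whitespace within each line and trim both ends.
    return "\n".join(" ".join(line.split()) for line in text.splitlines())


@step
def add_final_newline(text):
    # Make sure the text ends with exactly the newline it needs.
    return text if text.endswith("\n") else text + "\n"


def run(names, text):
    """Run only the steps a person asked for, in the order given."""
    for name in names:
        text = STEPS[name](text)
    return text


print(repr(run(["normalize_whitespace", "add_final_newline"], "a   b  \nc")))
```

Because no step knows about the others, a weird edge case only requires skipping or replacing one small piece, not rewriting the whole process.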

Thoughts?

Onward and Upward!

Distributed Bug Tracking

The free software/open source/software development world needs a distributed bug tracking story. Because the current one sucks.


The State of the Art

There are a number of tools written between 2006 and 2010 or so that provide partial or incomplete solutions to the problem. Almost isn’t quite good enough. The “resources” section of this post contains an overview of the most important (in my judgment) representatives of the current work in the area, with a bit of editorializing.

In general these solutions are good starts, and I think they allow us (or me) a good starting point for thinking about what distributed bug tracking could be like. Someday.

Bug tracking needs are diverse, which creates a significant design challenge for any system in this space. There are many existing solutions that everyone hates, and I suspect most would-be developers and innovators in the space would like to avoid opening this can of worms.

Another factor is that, while most people have come to the conclusion that distributed source control tools are the “serious” contemporary tool for managing source code, the benefits of distributed bug tracking haven’t yet propagated in the same way. Many folks have begun to come to terms with the fact that some amount of tactical centralization is inevitable, required, and even desirable[1] in the context of issue tracking systems.

Add to this the frequent requirement that non-developer users often need to track and create issues, and the result is that we’ve arrived at something of an impasse.

Requirements

A distributed bug tracking system would need:

  • A good way to provide short, unique identifiers for individual issues and comments so that users can discuss issues canonically.

  • An interface contained in a single application, script, or binary, that you could distribute with the application.

  • A simple/lightweight web-based interface so that users can (at least) review, search, and reference issues from a web browser.

    Write access would also be good, but is less critical. Also, it might be more practical (both from a design and a workflow perspective,) to have users submit bugs on the web into a read-only “staging queue,” that developers/administrators would then formally import into the project. This formalizes a certain type of triage approach that many projects may find useful.

  • To be separable from the source code history, either by using a branch, or by using pre-commit hooks to ensure that you never commit changes to code/content and the bugs at the same time.

  • To be editable, and to interact with commonly accessible tools that users already use. Email, command line tools, the version control systems, potentially documentation systems, build systems, testing frameworks and so forth.

  • To be built on reliable tools.[2]

  • To provide an easy way to customize your “views” on bugs for a particular team or project. In other words, each team can freely decide which extra fields get attached to their bugs, along with which fields are visible by default, which are required, and so on--without interfering with other projects.
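
As one example of what the first requirement might look like in practice, here is a minimal Python sketch of short, unique identifiers. It assumes that each issue gets a long content hash as its full identifier and that people refer to it by the shortest prefix that is still unambiguous. That scheme, and the names `issue_id` and `short_ids`, are my assumptions for illustration and not how any particular tool works.

```python
import hashlib


def issue_id(title, author, created):
    """Derive a stable full identifier from an issue's content
    (a hypothetical scheme, used here only for illustration)."""
    data = "\n".join([title, author, created]).encode("utf-8")
    return hashlib.sha1(data).hexdigest()


def short_ids(full_ids, minimum=4):
    """Map each full id to the shortest prefix (at least `minimum`
    characters long) that no other id in the collection shares."""
    result = {}
    for full in full_ids:
        length = minimum
        while any(
            other != full and other.startswith(full[:length])
            for other in full_ids
        ):
            length += 1
        result[full] = full[:length]
    return result


ids = [
    issue_id("Crash on empty input", "alice", "2012-04-01"),
    issue_id("Typo in README", "bob", "2012-04-02"),
]
for full, short in short_ids(ids).items():
    print(short, full)
```

One caveat with this approach is that a short prefix can become ambiguous after a merge brings in new issues, so a real tool would probably need to accept longer prefixes as well.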

The Future of the Art

  1. We (all) need to work on building new and better tools to help solve the distributed issue tracking problem. This will involve:
    • learning from the existing attempts,
    • continuing to develop and solidify the above requirements,
    • (potentially) testing and developing a standard (yaml/json?) based data storage format that is easy to parse and easy to merge, and that multiple tools can use,
    • developing some simple prototype tools, potentially as a suite of related utilities (a la early versions of git) that facilitate interaction with the git database, with an eye towards flexibility and extensibility.
  2. While there are implications for free software hosting as well as vendor independence and network service autonomy (a la `Franklin Street Statement <http://autonomo.us/2008/07/franklin-street-statement/>`_.) I think the primary reason to pursue distributed bug tracking has more to do with productivity and better engineering practices, and less with the policy. In summary:
    • Bug database systems that run locally and are fast[3] and always available.
    • Tools that permit offline interaction with issue database.
    • Tools that allow users to connect issues to branches.
    • Tools that make it possible to component-ize bug databases in parallel with software.
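
To give a sense of what an easy-to-parse, easy-to-merge storage format might involve, here is a small Python sketch using JSON. The field names are hypothetical, and the one-file-per-issue layout is an assumption of mine. The idea is that serializing deterministically (sorted keys, one field per line) keeps diffs small, so ordinary line-based merges tend to work when two people change different fields.

```python
import json


def dump_issue(issue):
    """Serialize an issue deterministically: sorted keys, one field per line."""
    return json.dumps(issue, indent=2, sort_keys=True, ensure_ascii=False) + "\n"


def load_issue(text):
    """Parse an issue back from its serialized form."""
    return json.loads(text)


# Hypothetical issue record; a real format would need an agreed set of fields.
issue = {
    "id": "a1b2c3",
    "title": "Crash on empty input",
    "status": "open",
    "comments": [],
}

text = dump_issue(issue)
assert load_issue(text) == issue  # the format round-trips
print(text)
```

The same approach should carry over to YAML. The important property is that any tool writing the file produces identical output for identical data.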

Resources

(With commentary.)

  • dist-bugs mailing list

    This is the canonical source for discussion around distributed bug tracking.

  • Bugs Everywhere

    This is among the most well-developed solutions, holistically speaking. “be” is written in Python and can generate output for the web. It uses its own data format, and has a pretty good command line tool. The generated HTML output is probably not very fast at scale (none are), but I have not tested it.

  • Ditz

    Ditz is a very well-developed solution. Ditz is implemented in Ruby, has a web interface, has a command line tool, uses a basic YAML data format, and stores data in a branch. Current development is slow, getting it up and running is non-trivial, and my sense is that there isn’t a very active community of contributors. There are likely reasons for this, but they are beyond the scope of this overview.

  • pitz

    Pitz is a Python re-implementation of Ditz, and while the developer(s?) have produced a “release,” the “interface” is a Python shell, and to interact with the database you basically have to write commands in Python syntax. From a data perspective, however, Pitz, like Ditz, is quite developed. While Pitz stores data in-tree, I think it’s an important source of ideas/examples/scaffolding.

  • Artemis

    This is a really clever solution that uses Maildirs to store issues. As a result you can interact with and integrate Artemis issues with your existing email client. Pull down changes, and see new bugs in your email, without any complicated email and list server setups.

    The huge caveat is that it’s implemented as a plugin for Mercurial, and so can’t be used with git projects. Also, all data resides in the tree.

  • git-issues

    In most ways, git-issues is my favorite: it’s two Python files, 1700 lines of code, stores issues outside of the source branch, and has a good command line interface. On the downside, it uses XML (which shouldn’t matter, but I think probably does, at least in terms of attracting developers,) and doesn’t have a web-based interface. It’s also currently un-maintained.

  • Prophet/sd

    SD, which is based on a distributed database named Prophet, is a great solution. The primary issue is that it’s currently unmaintained and is not as feature complete as it should be. Also, a lot of SD focuses on synchronizing with existing centralized issue trackers, potentially at the expense of developing other tools.


  1. It seems that you want centralized issue databases, or at least the fact that centralized issue databases appear canonical is a major selling point for issue tracking software in general. Otherwise, everyone would have their own text file with a bunch of issues, and that would suck. ↩︎

  2. Because I don’t program (much) and it’s easy to criticize architectural decisions from afar, I don’t want to explicitly say “we need to write this in Python for portability reasons” or something that would be similarly unfounded. At the same time, adoption and ease of use is crucial here, both for developers and users. Java and Ruby (and maybe Perl,) for various reasons, add friction to the adoption possibilities. ↩︎

  3. “Is Jira/Bugzilla/etc. slow for you today?” ↩︎

2012 New Haven Singing

I went to a fabulous all-day Sacred Harp singing in New Haven, Connecticut last weekend. It was great. Thoughts:

Size and Space

This wasn’t a huge singing. There are plenty of singings in the Northeast that have higher attendance, but that doesn’t matter: there was something nice about getting to sing with the people assembled.

It helped that the room was great for singing, and it was the perfect size for the crowd. Sacred Harp singers are often big folklore geeks or big music geeks, but I think deep down, we’re all really huge room geeks, because a room that just sounds and feels good makes all the difference in the world.

Note to self: Go to more awesome regional singings in great rooms in the future.

Heat

It was a pleasantly warm spring day, and with 80 or so people in a room, it got warm. While this did subtract something from the comfort level, it also set the mood somehow, and changed the tone of the day. Also, when the air is a bit more humid (but it only needs to be a bit), and it’s not as drafty and cold, it’s easier to keep your voice warmed up. The end result: I (or one) will sing better between April and September.

My Voice Part

In an unusual move for me, I spent 3 out of 4 sessions singing tenor. While I don’t have a “super bass” voice, I’m definitely on the bass side of baritone. In Sacred Harp, theoretically everyone can sing tenor, and it’s fun to mix things up a bit, since songs sound different in different parts. I am also finding that having a sense of another part makes it possible to have a richer sense of the music. Highly recommended.

Also, every time I sing tenor it takes me 25 minutes to remember (or remind my body) how to do it, so it was particularly nice to have a good long spell of singing to both figure out how to sing the part but also to get more comfortable with it.

My local singing community has been a bit bereft of basses lately, so I haven’t had much opportunity to actually sing tenor as much as I might like recently, so it was a particularly good change of pace.

Song Selection

I had something of an epiphany about leading and choosing your song at a singing.

While choosing a song that you like and enjoy hearing is obviously a part of the process, I think song choice is more about choosing the right song for the moment, and figuring out what will sound best next, given the previous few songs.

I used to obsess a great deal about what song I would lead, and study it (at least some) before the singing even began. This weekend, I came with a few songs that I’d been thinking about but changed at the last moment when I thought that the song I had picked out wouldn’t fit very well.

The Origins of Morris Dancing

When you’re out in the world Morris Dancing, everyone asks what you’re doing. Actually, more typically people ask “What country is this from?” but I don’t know what that means. In any case, this past weekend I witnessed the following exchange between an elderly spectator and the foreman of a certain New York City Men’s Team as the tour was moving between stands:

Spectator: What is this and where is it from?

J.D.: It’s English Morris Dancing, do you know what that is?

Spectator: No.

J.D.: Do you care?

Spectator (pauses, unsure of how to continue): Yes, is it like cricket or football?

J.D.: Cricket.

(At this point the spectator continued about his way satisfied and the tour continued to the next stand.)

This exchange, as these usually go, was pretty good. And the cricket part is totally right. There’s actually, as I understand it, a lot of connection between the history of cricket and Morris Dancing: village teams would play cricket and dance Morris, Morris kits were often cricket uniforms with ribbons and bells, and before cricket, most Morris dancers wore black pants.

Still.

Lies About Documentation...

... that developers tell.

  1. All the documentation you’d need is in the test cases.
  2. My comments are really clear and detailed.
  3. I’m really interested and committed to having really good documentation.
  4. This code is easy to read because it’s so procedural.
  5. This doesn’t really need documentation.
  6. I’ve developed a really powerful way to extract documentation from this code.
  7. The documentation is up to date.
  8. We’ve tested this and nothing’s changed.
  9. This behavior hasn’t changed, and wouldn’t affect users anyway.
  10. The error message is clear.
  11. This entire document needs to be rewritten to account for this change.
  12. You can document this structure with a pretty clear table.

      Often this is true; more often, these kinds of comments assume that it’s possible to convey 3-5 dimensional matrices clearly on paper/computer screens.

  13. I can do that.
  14. I will do that.
  15. No one should need to understand.