Rhizome
Git Feature Requests
The ability to mark a branch "diverged," to prevent (or warn about) attempted merges from master (for example) into a maintenance branch.
The ability to create and track dedicated topic branches, and complementary tooling to encourage rebasing commits in these sorts of branches. We might call them "patch sets" or "sets" rather than "branches." Also, it might be useful to think about using/displaying these commits, when published, in a different way.
Represent merge commits as hyperlinks to the user, when possible. I think GitHub's "network graph" and similar visualizations are great for showing how commits and branches interact and relate to each other.
This would probably require some additional or modified output from "git log".
Named stashes.
Branched stashes (perhaps this is closer to what I'm thinking about for the request regarding topic branches.)
The ability to check out "working copies" of different points/branches from a single repository at the same time, using "native" git utilities.
Related: "shelf" functionality is scriptable, but this too needs to be easier and better supported.
I think legit is a step in the right direction, but it's weird, and it probably makes it more difficult to understand what's happening with git conceptually. The features above, by contrast, would provide more appropriate conceptual metaphors for the work that would-be git users need to do.
The Limitations of the GitHub Fork Model
Assumption:
git is pretty awesome, but it's conceptually complex. As a result, using git demands a preexisting familiarity with git itself or some sort of wrapper to minimize the conceptual overhead.
The collaboration methods (i.e. hosting) provided by git, which are simple by design to allow maximum flexibility, do not provide enough structure to be practically useful. As a result, providers like GitHub (and BitBucket and gitorious) offer a valuable service that makes it easier--or even possible--for people to use git.
Caveats:
There are problems with using centralized repository services controlled by third parties, particularly for open source/free software projects.
There are ways that GitHub succeeds and fails in this regard, but this dynamic is too complex to fully investigate within the scope of this post.
If you use GitHub as designed, and the way that most projects use GitHub, then you have a very specific and particular view of how Git works.
While this isn't a bad thing, it makes git harder to use in some more distributed workflows. This isn't GitHub's fault so much as it is an artifact of people not really knowing how git itself works.
Assertion:
GitHub's "fork" model disincentivizes people from working in "topic" branches.
By making it really easy for people to publish their branches, GitHub disincentivizes the most productive use of the "git rebase" command that leads to clean and clear histories.
There's no distinction between a "soft fork," where you create a fork for the purpose of submitting a patch (i.e. a "pull request"), and a "hard fork," where you actually want to break the relationship with the original project.
This is mostly meaningful in context of the other features that GitHub provides, notably the "Network" chart, and the issue tracker. In a soft-fork that I would intend to merge back in, I'd like the issues to "come with," the repository, or at least connect in some way to the "parent." For hard forks, it might make sense to leave the old issues behind. The same with the network chart, which is incredibly powerful, but it's not great at guessing how your repository relates to the rest of its "social network."
The solution: keep innovating, keep fighting lock-in, and don't let GitHub dictate how you work.
Making Things Easier
I spent a lot of time in the past few months thinking about "automation" as a project to turn things that take a long time and require a lot of human intervention into things that just do themselves, and I think this is the wrong approach.
While total automation is an admirable goal, it's difficult, both because it requires more complex software to deal with edge cases and because it's hard to iterate into a fully automated solution.
Let's back up for a moment and talk about automation in general.
Computers are great at automating things. When you figure out exactly how to accomplish something digitally (e.g. polling an information source for an update, transforming data, testing a system or tool), writing a program to perform this function is a great idea, because it reduces the workload on actual people (i.e. you). I think the difference between people who are "good with computers" and people who are "great with computers" is the ability to spot opportunities for these kinds of automations, and potentially implement them.
To my mind the most important reason to automate tasks is to ensure consistency and to make it more likely that tedious tasks get done.
Having said this, rather than develop complete task automations for common functions, the better solution is probably to approach automation from the bottom up: instead of automating a complete process, automate smaller pieces, particularly the most repetitive and invariable parts, and then provide a way for people to trigger the (now simplified) task.
The end result is a system that's more flexible, easier to write, and less prone to failure under weird edge cases. Perhaps this is also a manifestation of "worse is better."
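As a rough illustration of the bottom-up approach, here is a minimal sketch in Python. Each small, invariable piece is its own function, and a person decides which pieces to trigger and in what order. The step names ("normalize", "dedupe") and the `step`/`run` helpers are hypothetical, made up for this example rather than taken from any real tool.

```python
"""Bottom-up automation sketch: small automated steps, triggered by a person."""

# Registry of small steps. The step names used below are hypothetical examples.
STEPS = {}


def step(name):
    """Register a function as a named, individually triggerable step."""
    def register(func):
        STEPS[name] = func
        return func
    return register


@step("normalize")
def normalize(lines):
    # The repetitive, invariable part: strip whitespace and drop blank lines.
    return [line.strip() for line in lines if line.strip()]


@step("dedupe")
def dedupe(lines):
    # Remove repeated lines while keeping the first occurrence, in order.
    seen = set()
    unique = []
    for line in lines:
        if line not in seen:
            seen.add(line)
            unique.append(line)
    return unique


def run(names, data):
    """Run only the steps a person asks for, in the order given."""
    for name in names:
        if name not in STEPS:
            raise KeyError(f"unknown step: {name}")
        data = STEPS[name](data)
    return data
```

Calling `run(["normalize", "dedupe"], lines)` chains two pieces, while `run(["normalize"], lines)` triggers just one. Nothing here tries to handle the whole process or its edge cases; the person running it stays in the loop.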
Thoughts?
Onward and Upward!
Supporting Distributed Bug Tracking
The free software/open source/software development world needs a distributed bug tracking story. Because the current one sucks.
The State of the Art
There are a number of tools written between 2006 and 2010 or so that provide partial or incomplete solutions to the problem. Almost isn't quite good enough. The "resources" section of this post contains an overview of the most important (in my judgment) representatives of the current work in the area, with a bit of editorializing.
In general these solutions are good starts, and I think they allow us (or me) a good starting point for thinking about what distributed bug tracking could be like. Someday.
Bug tracking needs are diverse, which creates a significant design challenge for any system in this space. There are many existing solutions that everyone hates, and I suspect most would-be developers and innovators in the space would like to avoid opening this can of worms.
Another factor is that, while most people have come to the conclusion that distributed source control tools are the "serious" contemporary tool for managing source code, the benefits of distributed bug tracking haven't yet propagated in the same way. Many folks have begun to come to terms with the fact that some amount of tactical centralization is inevitable, required, and even desirable[1] in the context of issue tracking systems.
Add to this the frequent requirement that non-developer users often need to track and create issues, and the result is that we've arrived at something of an impasse.
Requirements
A distributed bug tracking system would need:
A good way to provide short, unique identifiers for individual issues and comments so that users can discuss issues canonically.
An interface contained in a single application, script, or binary, that you could distribute with the application.
A simple/lightweight web-based interface so that users can (at least) review, search, and reference issues from a web browser.
Write access would also be good, but is less critical. Also, it might be more practical (both from a design and a workflow perspective,) to have users submit bugs on the web into a read-only "staging queue," that developers/administrators would then formally import into the project. This formalizes a certain type of triage approach that many projects may find useful.
To be separable from the source code history, either by using a branch, or by using pre-commit hooks to ensure that you never commit changes to code/content and the bugs at the same time.
To be editable, and to interact with commonly accessible tools that users already use. Email, command line tools, the version control systems, potentially documentation systems, build systems, testing frameworks and so forth.
To be built on reliable tools.[2]
To provide an easy way to customize your "views" on bugs for a particular team or project. In other words, each team can freely decide which extra fields get attached to their bugs, along with which fields are visible by default, which are required, and so on -- without interfering with other projects.
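One way to approach the first requirement above (short, unique identifiers) is to borrow git's habit of abbreviating hashes. The sketch below is only an illustration under assumptions: the choice of SHA-1, the fields that feed the hash, and the `issue_id`/`short_ids` names are all hypothetical, not taken from any of the existing tools.

```python
import hashlib


def issue_id(title, author, created):
    """Full identifier: a SHA-1 hex digest of the issue's creation data.

    Hashing the creation data means two people working offline will not
    hand out the same id unless they create an identical issue.
    """
    payload = "\n".join([title, author, created]).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()


def short_ids(full_ids, minimum=4):
    """Map each full id to its shortest prefix that no other id shares."""
    result = {}
    for full in full_ids:
        length = minimum
        # Grow the prefix until it no longer matches any other id.
        while length < len(full) and any(
            other != full and other.startswith(full[:length])
            for other in full_ids
        ):
            length += 1
        result[full] = full[:length]
    return result
```

For example, `short_ids(["abcd1234", "abcd5678", "ff001122"])` gives the prefixes "abcd1", "abcd5", and "ff00". The catch is that a prefix is only unique relative to the issues you currently know about; after pulling issues from another clone, a short form may need to grow by a character or two, so tools should accept longer prefixes as well.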
The Future of the Art
We (all) need to work on building new and better tools to help solve the distributed issue tracking problem. This will involve:
learning from the existing attempts,
continuing to develop and solidify the above requirements,
(potentially) testing and developing a standard (yaml/json?) data storage format that is easy to parse and easy to merge, and that multiple tools can use,
developing some simple prototype tools, potentially as a suite of related utilities (a la early versions of git) that facilitate interaction with the git database, with an eye towards flexibility and extensibility.
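To make "easy to parse, and easily merged" a little more concrete, here is a minimal sketch assuming one JSON file per issue with flat fields. The field names and the `dump_issue`/`merge_issues` helpers are hypothetical; a real format would also need to handle comments, nested data, and so on.

```python
import json


def dump_issue(issue):
    """Serialize with sorted keys and one field per line, so diffs stay small."""
    return json.dumps(issue, indent=2, sort_keys=True) + "\n"


def merge_issues(base, ours, theirs):
    """Three-way merge of flat issue dicts.

    Returns (merged, conflicts). A conflicting field keeps our value and is
    listed in conflicts. Assumption: a value of None means "field absent".
    """
    merged = {}
    conflicts = []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t or t == b:
            value = o  # both sides agree, or only we changed the field
        elif o == b:
            value = t  # only they changed the field
        else:
            value = o  # both changed it differently: keep ours, report it
            conflicts.append(key)
        if value is not None:
            merged[key] = value
    return merged, conflicts
```

If one developer closes an issue while another edits its title, `merge_issues` keeps both changes and reports no conflicts. Writing one field per line with sorted keys also means the version control system's own line-based merge will often do the right thing before a tool like this is needed at all.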
While there are implications for free software hosting as well as vendor independence and network service autonomy (a la the Franklin Street Statement), I think the primary reason to pursue distributed bug tracking has more to do with productivity and better engineering practices, and less with policy. In summary:
Bug database systems that run locally and are fast[3] and always available.
Tools that permit offline interaction with the issue database.
Tools that allow users to connect issues to branches.
Tools that make it possible to component-ize bug databases in parallel with software.
Resources
(With commentary.)
-
This is the canonical source for discussion around distributed bug tracking.
-
Speaking holistically, this is among the most well-developed solutions. "be" is written in Python and can generate output for the web. It uses its own data format, and has a pretty good command line tool. The HTML output generation is probably not very fast at scale (none are), but I have not tested it.
-
Ditz is a very well-developed solution. It is implemented in Ruby, has a web interface and a command line tool, uses a basic YAML data format, and stores data in a branch. Current development is slow, getting it up and running is non-trivial, and my sense is that there isn't a very active community of contributors. There are likely reasons for this, but they are beyond the scope of this overview.
-
Pitz is a Python re-implementation of Ditz, and while the developer(s?) have produced a "release," the "interface" is a Python shell, and to interact with the database you basically have to write commands in Python syntax. From a data perspective, however, Pitz, like Ditz, is quite developed. While Pitz stores data in-tree, I think it's an important source of ideas/examples/scaffolding.
-
This is a really clever solution that uses Maildirs to store issues. As a result you can interact with and integrate Artemis issues with your existing email client. Pull down changes, and see new bugs in your email, without any complicated email and list server setups.
The huge caveat is that it's implemented as a plugin for Mercurial, and so can't be used with git projects. Also, all data resides in the tree.
-
In most ways, git-issues is my favorite: it's two Python files, 1700 lines of code, stores issues outside of the source branch, and has a good command line interface. On the downside, it uses XML (which shouldn't matter, but I think probably does, at least in terms of attracting developers), and doesn't have a web-based interface. It's also currently unmaintained.
-
SD, which is based on a distributed database named Prophet, is a great solution. The primary issue is that it's currently unmaintained and is not as feature complete as it should be. Also, a lot of SD focuses on synchronizing with existing centralized issue trackers, potentially at the expense of developing other tools.
[1] It seems that you want centralized issue databases, or at least the fact that centralized issue databases appear canonical is a major selling point for issue tracking software in general. Otherwise, everyone would have their own text file with a bunch of issues, and that would suck.
[2] Because I don't program (much) and it's easy to criticize architectural decisions from afar, I don't want to explicitly say "we need to write this in Python for portability reasons" or something that would be similarly unfounded. At the same time, adoption and ease of use are crucial here, both for developers and users. Java and Ruby (and maybe Perl), for various reasons, add friction to the adoption possibilities.
[3] "Is Jira/Bugzilla/etc. slow for you today?"