Limitations of GitHub Forks

Assumptions:

  1. git is pretty awesome, but it's conceptually complex. As a result, using git demands either a preexisting familiarity with git itself or some sort of wrapper to minimize the conceptual overhead.
  2. The collaboration methods (i.e. hosting) provided by git, which are simple by design to allow maximum flexibility, do not provide enough structure to be practically useful. As a result, providers like GitHub (and BitBucket and gitorious) offer a valuable service that makes it easier--or even possible--for people to use git.

Caveats:

  • There are problems with using centralized repository services controlled by third parties, particularly for open source/free software projects.

    There are ways that GitHub succeeds and fails in this regard, but this dynamic is too complex to fully investigate within the scope of this post.

  • If you use GitHub as designed, and the way that most projects use GitHub, then you have a very specific and particular view of how Git works.

    While this isn't a bad thing, it's less easy to use git in some more distributed workflows as a result. This isn't GitHub's fault so much as it is an artifact of people not really knowing how git itself works.

Assertions:

  1. GitHub's "fork" model[^fork] disincentivizes people from working in "topic" branches.

  2. By making it really easy for people to publish their branches, GitHub disincentivizes the most productive use of the "git rebase" command, the use that leads to clean and clear histories.

  3. There's no distinction between a "soft fork" where you create a fork for the purpose of submitting a patch (i.e. a "pull request") and a "hard fork," where you actually want to break the relationship with the original project.

    This is mostly meaningful in the context of the other features that GitHub provides, notably the "Network" chart and the issue tracker. In a soft fork that I intend to merge back in, I'd like the issues to "come with" the repository, or at least connect in some way to the "parent." For hard forks, it might make sense to leave the old issues behind. The same goes for the network chart, which is incredibly powerful but not great at guessing how your repository relates to the rest of its "social network."

The solution: keep innovating, keep fighting lock-in, and don't let GitHub dictate how you work.

Documentation Emergence

I stumbled across a link somewhere along the way to a thread about the Pyramid project's documentation planning process. It's neat to see a community coming to what I think is the best possible technical outcome. In the course of this conversation Iain Duncan said something that I think is worth exploring in a bit more depth. The following is directly from the list, edited only slightly:

I wonder whether some very high level tutorials on getting into Pyramid that look at the different ways you can use it would be useful? I sympathize with Chris and the other documenters because just thinking about this problem is hard: How do you introduce someone to Pyramid easily without putting blinders on them for Pyramid's flexibility? I almost feel like there need to be 2 new kinds of docs:

  • easy to follow beginner docs for whatever the most common full stack scaffold is turning out to be (no idea what this is!)
  • some mile high docs on how you can lay out pyramid apps differently and why you want to be able to do that. For example, I feel like hardly anyone coming to Pyramid from the new docs groks why the zca under the hood is so powerful and how you can tap into it.

Different sets of users have different needs from documentation. I think my "Multi-Audience Documentation" post also addresses this issue.

I don't think there are good answers and good processes that always work for documentation projects. Target users and audiences change a lot depending on the kind of technology at play. The needs of users (and thus of the documentation) vary with the technical complexity and nature of each project or product. I think, as the above example demonstrates, there's additional complexity for software whose primary users are very technically adept (e.g. systems administrators) or are even software developers themselves.

The impulse to have "beginner documentation" and "functional documentation" is a very common solution for many products and reflects two main user needs:

  • to understand how to use something. In other words, "getting started" documentation and tutorials.
  • to understand how something works. In other words, the "real" documentation.

I think it's feasible to do both kinds of documentation within a single resource, but the struggle then revolves around making sure that the right kind of users find the content they need. That's a problem of documentation usability and structure. But it's not without challenges; let's think about those in the comments.

I also find myself thinking a bit about the differences between web-based documentation resources and conventional manuals in PDF or dead-tree editions. I'm not sure how to resolve these challenges, or even what the right answers are, but I think the questions are very much open.

Issue Tracking and the Health of Open Source Software

I read something recently that suggested that the health of an open source project and its community could be largely assessed by reviewing the status of the bug tracker. I'm still trying to track down the citation for this remark. The claim, basically, is that vital, active projects have regularly updated bugs that are clearly described, and that their bugs are easy to search and easy to submit.

I'm not sure that free software communities and projects can be so easily assessed or that conventional project management practices are the only meaningful way to judge a project's health. While we're at it, I don't know that it's terribly useful to focus too much attention or importance on project management. Having said that, the emergence of organizational structure is incredibly fascinating, and could probably tolerate more investigation.

As a starting point, I'd like to offer two conjectures:

  • First, that transparent issue tracking is a reasonably effective means of "customer service," or user support. If the bug tracker contains answers to questions that people encounter during use, it provides a productive way to resolve issues with the software and helps with support self-service. Obviously some users and groups of users are better at this than others.
  • Second, issue tracking is perhaps the best way to do bottom-up project/product management and planning in the open, particularly since these kinds of projects lack formal procedures and designated roles to do this kind of organizational work.

While the overriding goal of personal task management is to break things into the smallest manageable work units, the overriding goal of issue tracking systems is to track the most intellectually discrete issues within a single project through the development process. Thus, issue tracking systems have requirements that are either much less important in personal systems or actively counter-intuitive for other uses. They are:

  • Task assignment, so that specific issues can be assigned to different team members. Ideally this means a specific developer can "own" a specific portion of the project and actually be able to work and coordinate efforts on the project.
  • Task prioritization, so that the more important or crucial issues get attention before "nice to have" items are addressed.
  • Issue comments and additional attached information, to track progress and support information sharing among teams, particularly over long periods of time with asynchronous elements.
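To make those three requirements concrete, here's a minimal sketch in Python of the kind of record an issue tracker might keep. Everything in it is hypothetical and made up for illustration: the Issue, Comment, and Priority names, the three priority levels, and the work_queue helper are assumptions, not the data model of any real tracker.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import IntEnum
from typing import List, Optional


class Priority(IntEnum):
    """Hypothetical priority levels; a lower value means more urgent."""
    CRITICAL = 0
    NORMAL = 1
    NICE_TO_HAVE = 2


@dataclass
class Comment:
    """One entry in an issue's discussion (error output, proposals, etc.)."""
    author: str
    body: str
    created: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


@dataclass
class Issue:
    """A single tracked issue with an owner, a priority, and a comment trail."""
    title: str
    priority: Priority = Priority.NORMAL
    assignee: Optional[str] = None
    comments: List[Comment] = field(default_factory=list)

    def assign(self, developer: str) -> None:
        # Task assignment: one developer "owns" the issue.
        self.assignee = developer

    def comment(self, author: str, body: str) -> None:
        # Comments accumulate over time, which supports asynchronous work.
        self.comments.append(Comment(author, body))


def work_queue(issues: List[Issue]) -> List[Issue]:
    """Return issues most urgent first; ties keep their original order."""
    return sorted(issues, key=lambda issue: issue.priority)
```

Even a toy like this shows why personal task lists make poor issue trackers: the owner, the priority, and the running discussion all belong to the issue itself, and all of them need to be visible to (and editable by) more than one person.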

While it's nice to be able to integrate tasks and notes (this is really the core of org-mode's strength) issue tracking systems need to be able to accommodate error output and discussion from a team on the best solution, as well as discussion about the ideal solution.

The truth is that a lot of projects don't do a very good job of using issue tracking systems, despite how necessary and important bug trackers are. The prefabricated systems can be frustrating and difficult to use, and most of the minimalist systems [1] are hard to use in groups. [2] The first person to write a fully featured, lightweight, and easy to use issue tracking system will be incredibly successful. Feel free to submit a patch to this post if you're aware of a viable system along these lines.

[1] I'm thinking about using ikiwiki or org-mode to track issues, but ditz suffers from the same core problem.
[2] Basically, they either sacrifice structure or concurrency features or both. Less structured systems rely on a group of people to capture the same sort of information in a regular way (unlikely) or they capture less information; neither option is tenable. Without concurrency (because they store things in single flat files) people can't use them to manage collaboration, which makes them awkward personal task tracking systems.

Micro-Entrepreneurship, Good Enough, and Crowd Sourcing

I read this post by one of the partners in one of the coolest web services around; you should open that in a new window and then come back.

Back? Great!

Lars proposes crowd-funding as a way to support free software development. Basically, run a "sponsor me to develop stuff" program, rather than funding free software through a start-up built around a single project or by working for a big vendor.

It's a nifty idea, and it's got me thinking about micro-entrepreneurship. This would be where you make or do things, but not on a big scale. The businesses you create are small, and probably aren't completely full-time equivalent, but in aggregate it's good enough. While this is not the most prominent form of entrepreneurship on the internet, my sense is that it's way bigger than most people think.

We're too used to seeing multi-million dollar venture capital fundraising, IPOs, and big acquisition deals to notice the multitude of people who are making a few to several tens of thousands of dollars doing much smaller amounts of work.

I suppose I could write a whole post on good enough economics in the vein of this post on patronage from James Governor, but I'll just leave a placeholder link to a wiki page, in case someone else wants to fill things in.

Thoughts:

  • Service businesses don't scale particularly well: any individual worker can only produce so much work, and it's hard to make individuals any more productive. In light of that, large service-based firms are unlikely to form.
  • Most people have pretty specialized skills and abilities. Self-employment, particularly when it's full time, makes it difficult for people to spend most of their time doing what they're best at. Specialization and differing skills are also what create a market for service-based endeavors.
  • Lacking health care and other benefits of traditional employment, it's hard for people to be more self-employed and less conventionally-employed. Given this, doing entrepreneurial projects on a smaller scale makes more sense.
  • Some kinds of entrepreneurial activities are attractive because, while they may not produce the same level of income as a salaried position, they allow more freedom and flexibility. This is the conventional justification for self-employment, and also the reason that most aligns with the "good enough" policy.

The problem with these kinds of "little businesses" is that it's too easy to focus on income-earning work (e.g. freelance and client work) at the expense of doing basic work (e.g. developing core free software, doing basic research, writing fiction). While the crowd sourcing notion makes a lot of sense, it requires a lot of faith in the crowd. I'm also unsure of how sustainable it is: while individuals can justify small amounts of money for such purposes, organizations cannot. Without organizational support, revenue is much lower, and it probably puts the larger financial burden on the smaller users, relatively speaking.

The Inevitability of Open Source

I recently attended POSSCON as part of my day-job. I don't usually blog directly about this kind of stuff ("You like to keep your church and state separate," a fellow attendee said, which fits.) But I had a number of awesome conversations with the speakers, attendees and sponsors that may spawn a series of brief posts here. POSSCON is a regional open source convention that drew developers, leaders of information technology departments, and IT consultants of various types.

I had a number of conversations that revolved around the adoption of open source in opposition to proprietary systems. People asked questions like "what do we have to do to get more people to use open source software?" and many people apologized for doing work with proprietary software for mostly economic reasons (e.g. "I have a .NET development job," or "people need Windows administration and I can't turn away work.")

This led me to have one of three reactions:

1. Working with any specific (proprietary) technology, particularly because you have to make ends meet, should never require excusing. There are cases where "working with proprietary technology" may be more like "building a business model on proprietary technology," and that sort of thing needs to be watched out for, but I don't think it's morally ambiguous to make a living.

2. I'm not sure that the success of technology, particularly open source, is determined solely on the basis of adoption rates. Successful technology is technology that efficiently allows people/cyborgs to do work, not overwhelmingly ubiquitous technology.

3. In many many contexts, open source technology has triumphed over proprietary alternatives: Linux-based systems are the dominant UNIX-like operating system. OpenSSH is the dominant SSH implementation (and remote terminal protocol/implementation). Darwin/FreeBSD is incredibly successful (as Mac OS X.) Other domains where open source packages have very high (dominating) adoption rates: OpenSSL, gcc, perl/python/php/ruby (web development), Apache/Lighttpd/nginx (web servers) etc.

While I think the end-user desktop isn't unimportant, I think there may be merit in playing to the strengths of open source (servers, infrastructure, developers.) Additionally, it seems more productive to have the discussion about "how do we advance open source," couched in terms of a battle for technological dominance in which open source has already won.

And Free Software/Open Source has won. While there remain sectors and domains where non-free software remains prevalent and business models that don't value users' freedom, I think that most people who know anything about technology will say that all paths forward lead toward a greater level of software freedom.

Maybe this is a symptom of the situation in which I work and maybe I'm being too optimistic, but I don't think so. Thoughts?

A Modest Blogging Proposal

I'm going to present this post somewhat out of order. Here's the proposal.

I want to think about moving this blog (or starting another?) to be a blog/wiki hybrid, and at the very least, moving forward I'd like the "discussion" or comments link to point to a wiki page rather than a comments thread.


I've been thinking about the blog recently. I really enjoy writing posts, and there are days when I rely on writing a blog post to get me thinking and moving in the morning (or afternoon!) and kinds of projects that I don't think I would be able to work on if it weren't for having the space to write and the opportunity to have conversations with you all.

I've done some work recently to streamline and simplify the publishing process, which does a lot to make me more likely to post on the fly, but I'm not sure that the tone of this site, or the current design, or my own habits would really support a different kind of publishing schedule.

As an aside, I think the technological shift that made blogs possible was the arrival of "content management systems" and website building tools that made updating a website with new content incredibly simple. While blogging has come to mean many other things and is defined by a number of different features, having the ability to publish on very short notice has a large effect on the way people write blog content.

So here's the thing about blog comments: I don't think that they're used particularly well, and there are some important flaws in so many of the options around. First, the best systems, like the ones used on LiveJournal, IntenseDebate, and Disqus (which I use on this site), are all proprietary systems that depend on an external service to function. The worse systems all have independent authentication methods and often lack proper threading (which most commenters aren't terribly good at using anyway), and it's very difficult to prevent all these systems from being filled with spam.

What's more, people don't really comment that much. At least for most blogs.

  • better comment systems, better discussions.
  • catering to people who want to write a lot in comments.
  • allowing the conversation to grow from comments, in productive rather than purely discursive ways.

And once you've moved comments into a wiki, why not move the rest of the blog as well? My preferred engine, ikiwiki, has support for blog-like content so while there would be some work involved, it wouldn't be a major hassle to manage. And the worst case scenario is that the old content remains in the old system, which might not be a bad thing in the end.

Anyone out there in reader land have any thoughts on the subject? While I'll probably make some sort of revision to the way I blog/maintain tychoish.com, any such change is probably a month or two in the future.

Analyzing the Work of Open Source

This post covers the role and purpose (and utility!) of analysts and spectators in the software development world, particularly in the open source subset of that. My inspirations for this post come from:


In the video Coté says (basically,) open source projects need to be able to justify the "business case" for their project, to explain what's the innovation that this project seeks to provide the world. This is undoubtedly a good thing, and I think we should probably all be able to explore and clearly explain and even justify the projects we care about and work on in terms of their external worth.

Project leaders and developers should be able to explain and justify the greater utility of their software clearly. Without question. At the same time, problems arise when all we focus on is the worth. People become oblivious to how things work, and become unable to successfully participate in informed decisions about the technology that they use. Users without an understanding of how a piece of technology functions are less able to take full advantage of that technology.

As an aside: One of the things that took me forever to get used to about working with developers is the terms in which they describe their future projects. They use the imperative case with much more ease than I would ever consider: "the product will have this feature" and "It will be architected in such a way." From the outside this kind of talk seems to be unrealistic and grandiose, but I've learned that programmers tend to see their projects evolving in real time, and so this kind of language is really more representative of their current state of mind than of their intentions or a lack of communication skills.

Returning for a moment to the importance of being able to communicate the business case of the projects and technology that we create. As we force the developers of technology to focus on the business cases for the technology they develop we also make it so that the only people who are capable of understanding how software works, or how software is created, are the people who develop software. And while I'm all in favor of specialization, I do think that the returns diminish quickly.

And beyond the fact that this leads to technology that simply isn't as good or as useful, in the long run, it also strongly limits the ability of observers and outsiders ("analysts") to provide a service for the developers of the technology beyond simply communicating their business case to the outside world. It restricts all understanding of technology to journalism rather than the sort of "rich and chewy" (anthropological?) understanding that might be possible if we worked to understand the technology itself.

I clearly need to work a bit more to develop this idea, but I think it connects with a couple of previous arguments that I've put forth in these pages: one regarding Whorfism in Programming, and another on constructing rich arguments.

I look forward to your input as I develop this project. Onward and Upward!

Independent Web Services

So much of the time, when we talk about network services, technological/software freedom, and this idea of "Cloud" computing, there's a bunch of debate: "is it a good idea?" "are we giving up too much freedom?" "how does this work out economically?" "what about privacy in the cloud?" While these are important questions, without doubt, I fear that they're too ethereal, and we end up tussling with a bunch of questions about the future and present of computing that might not be entirely worth debating (at least for the moment.)

Let's take 2 assertions, to start:

1. There are some applications--things we do with technology--that work best when these applications are running on high performance servers that have consistent connections to the Internet, that we can access regardless of where we are in the world.

2. The only way to have control over your data and computing experience is to be responsible for the administration and maintenance of these services yourself.

Huh?

I mean to say that if we care about our autonomy and our freedom as we use computers in the contemporary age (i.e. in the era of cloud computing), the only thing to be done is to run our own services. If the fact that Google has all of your data scares you, run your own mail server. If the fact that all of your microblogging output is on Twitter bothers you, run your own status.net instance. And so forth.

If we really care about having power over our technological experiences, we must take responsibility for services on the Internet. We can say "wouldn't it be nice if service providers weren't such dicks with our data," or "wouldn't it be nice if software developers wrote networked software that respected our freedom." And while it would be nice, these convinces don't in and of themselves

Control over technology and autonomy in the networked context ultimately means that we as users have to:

  • Administer networked servers that provide us with the services that we want and need to do whatever it is that we do.
  • Participate in some exchange for networked services (i.e. pay for service, either in cash or by way of access to data.)

That's hard! Computers should get easier to use not harder, right?

Leading question there, but...

Yes. One of the leading arguments for consumer-"Cloud Computing" is that by accessing computer services (software) in the browser, developers can provide a more structured and "safe" user experience. At least that's how I understand it.

While this is a great thing in terms of making computers more accessible, no argument from me, I think we must be careful to avoid confusing "ease of use" with technologically limiting. I fervently believe that it's possible to design powerful software that is also easy to use, and I think that as often as not, a confusing technology is an opportunity to provide a teaching experience as much as it presents an opportunity to improve a given technology.

And if it comes down to it, there are situations where it doesn't matter so much if you're the one entering the commands into the server. It doesn't much matter if you are the one managing the server or if you've hired someone to configure it for you. As I think about it, there's probably something of a niche here for people to offer management services in a very boutique sort of style.

If we have to contract to people to do our administration for us, is that really a step in the right direction?

I think it is. At the moment we pay for our networked computing services (e.g. Gmail) by looking at Google's ads next to our mail and giving Google access to the aggregate of our mail spools so that they can mine it for whatever data they need. The other price that we pay for these services is "lock in:" once we commit to using a service it's quite difficult to change to an alternate provider. Since these are "real costs," it seems reasonable to expect and want to pay (money) for services that don't have these costs. Which is where cooperative and boutique-style services make a lot of sense.

I'm not a systems administrator, I just want to do [the thing that I do] and not have to tinker with my computer. This is a lousy idea.

And that's a lousy question.

To dig in a bit further: I don't think that "doing the [whatever you do]" would necessarily require a lot of tinkering. It might, of course, and the chances are that we've all had to tinker with our technology at one point or another. In most cases tinkering is an upfront rather than ongoing cost. Ideally, the other benefit of having full control over your network services is that you'll be able to use services which are more tailored to [the thing you do] than the one-size-fits-all application provided by a third party.

Ok, so what's the stack look like?

I'm not sure. There's clearly a common set of tasks that we currently do in the networked context. I'm not sure what the application is, exactly, but here's a beginning of what this "application stack" looks like.

  • An XMPP Server like Prosody.im, with PyAIMt and other conventional IM network transports.
  • Some sort of Email Service: Citadel springs instantly to mind as an "all in one solution," but some combination of postfix+procmail+fetchmail+horde/squirrelmail seems to make sense.
  • A web server, either for hosting personal websites, or with some sort of authentication scheme (digest?) for sharing files with yourself. The truth is that web servers are pretty darn lightweight and it doesn't make sense to not install one. Having said that, people see "web hosting," and probably often think "Well, I don't really need web hosting," when that's almost beside the point.
  • SSH and some system for FUSE (or FUSE-like) mount points, so that you can use and store remote files.
  • There's probably a host of web-based applications that would need to be installed as a matter of course: some sort of web-based RSS reader, wiki-like note taking, bookmarking, some sort of notification service, etc.
  • [your suggestion here.]