Shimgo Hugo

In an effort to relaunch tychoish with a more contemporary theme and a publishing tool that (hopefully) will support a more regular posting schedule, I also wrote a nifty Go library for dealing with reStructuredText, which may be useful, and which I think illustrates something about build systems.

In my (apparently still) usual style, there’s some narrative lead-in that takes a bit to get through.


Over the past couple of weeks, I redesigned and redeployed my blog. The system it replaced was somewhat cobbled together and missing a number of features (e.g. archives, RSS feeds, social features, etc.); to add insult to injury, publishing was pretty slow, and it was difficult to manage a pipeline of posts.

In short, I didn’t post much. I’ve written things from time to time that I never did a great job of actually posting, and it was hard to get people to read them, which was further demotivating. I’ve been reading a lot of interesting things, I’m not writing that much for work any more, and I’ve been doing enough things recently that I want to write about them. See this Twitter thread I posted a while ago on the topic.

So I started playing around again. Powering this blog is hard, because I have a lot of content1 and I very much want to use reStructuredText.2 There’s this thing called hugo, which seems to be pretty popular. I’ve been using static site generators for years and prefer the approach. It also helps that I worked with Steve (hugo’s original author) during its initial development, and, whether by coincidence or as a result of our conversations and a couple of very small early contributions, a number of things I cared about were included in its design:

  • support for multiple text markup formats, including reStructuredText (I cobbled together the rst support),
  • customizable page metadata formats (I think I pushed for support of alternate front-matter formats, specifically YAML, and might have made a few prototype commits on this),
  • the ability to schedule posts in the future (I think we talked about this).

I think I also whinged a bunch in those days about performance. I’ve written about this here before, but one of the classic problems with static site generators is that no one expects sites with one or two thousand posts/content atoms, so they’re developed against relatively small corpora and then have performance that doesn’t really scale.

Hugo is fast, mostly because Go is fast, and in most cases that’s good enough, but not in my case, and particularly not with the rst implementation as it stood. After all this preamble, we’ve gotten to the interesting part: a tool I’m calling shimgo.


The initial support for rst in hugo is straightforward. Every time hugo encounters an rst file, it calls the rst2html utility that is installed with docutils, passing it the content of the file on standard input and parsing the content we need from its output. It’s not pretty, it’s not smart, but it works.
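The shell-out pattern looks roughly like the following sketch. The helper name is mine, and `cat` stands in for `rst2html` so the example runs without docutils installed; the point is that every file pays for a fresh process:

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
)

// renderViaShell pipes content to an external filter on stdin and
// returns whatever the filter writes to stdout. This is the pattern
// the original rst support uses: one fork/exec per file.
func renderViaShell(command string, content []byte) ([]byte, error) {
	cmd := exec.Command(command)
	cmd.Stdin = bytes.NewReader(content)
	var out bytes.Buffer
	cmd.Stdout = &out
	if err := cmd.Run(); err != nil {
		return nil, err
	}
	return out.Bytes(), nil
}

func main() {
	// "cat" is a stand-in for "rst2html" so this runs anywhere.
	html, err := renderViaShell("cat", []byte("some *rst* content"))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s\n", html)
}
```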

It also works slowly: publishing all of tychoish took about 3 minutes.

I attempted an rst-to-markdown translation of my existing content and ran that through hugo’s markdown parsers, just to get comparative timings: 3-ish seconds.

reStructuredText is a bit slower to parse than markdown, on account of its comparative strictness and the fact that the toolchain is in Python and not Go, but this difference seemed absurd.

There’s a go-rst project to write a pure-go implementation of reStructuredText, but I’ve kept my eye on that project for a couple of years, and it’s a lot of work that is pretty far off. While I do want to do more to support this project, I wanted to get a new blog up and running in a few weeks, not years.

Based on the differences in timing, and some intuition from years of writing build systems, I made a wager with myself: the Python rst implementation is likely slow, but it’s not that slow, and I was losing most of the time to process creation, teardown, and context switching. Processing a single file is pretty quick, but the overhead gets to be too much at scale.

I built a little prototype: a very small HTTP service that took rst in a POST request and returned processed HTML. Now there was one long-running process, and instead of calling fork/exec a bunch, we just had a little bit of (local) network overhead.

Faster: 20 seconds.

I decided I could deal with it.

What remains is making it production worthy for hugo. While it was good enough for me, I very much don’t want to be in the position of maintaining a single-feature fork of a project in active development, and frankly the existing rst support already has a difficult-to-express external dependency. Adding an HTTP service would be a hard sell.

This brings us to shimgo: the idea is to package everything needed to implement the above solution in an external Go package, behind a functional interface, so that hugo’s maintainers don’t need to know anything about its workings.
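The public surface can then be a single function, with sync.Once hiding the service startup so callers never know a server exists. This is the shape of the idea, not shimgo’s actual API; the renderer here is a trivial stand-in:

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

var (
	startOnce sync.Once
	render    func(string) string
)

// startService stands in for booting the background docutils service;
// sync.Once guarantees it runs at most once, on the first conversion.
func startService() {
	render = func(rst string) string {
		return "<p>" + strings.TrimSpace(rst) + "</p>"
	}
}

// ConvertRST is the entire interface a caller (e.g. hugo) would see.
func ConvertRST(content string) string {
	startOnce.Do(startService)
	return render(content)
}

func main() {
	fmt.Println(ConvertRST("hello world"))
}
```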

Isn’t abstraction wonderful?

So here we are. I’m still working on getting this patch mainlined, and there is some polish needed for shimgo itself (mostly the README file and some documentation), but it works, and if you’re doing anything with reStructuredText in Go, you ought to give shimgo a try.


  1. While I think it would be reasonable to start afresh, I think the whole point of having archives is that you mostly just leave them around. ↩︎

  2. It’s not the most popular markup language, but I’ve used it more than any other text markup, and I find the fact that other languages (e.g. markdown) vary a lot between implementations to be distressing. Admittedly the fact that there aren’t other implementations of rst is also distressing, but on the balance somewhat less so. ↩︎

Going Forward

I wrote a post about moving on from being a technical writer, and I’ve definitely written some since then about programming and various side projects, but I haven’t really done the kind of public reflection on this topic that I’ve done historically about many other things.

When I switched to a programming team, I knew some things about computers, and I was a decent Python programmer. The goal, then, was to teach myself a second programming language (Go) and learn how to make “real” software with other people, or on teams with other people. Both of those projects are going well: I think I’ve become pretty solid as a Go programmer, and although it’s hard to say what “real” software is, or whether I’m good at making it, all indications are positive.

This weekend, for various reasons, I’ve been reviving a project that I did some work on this fall and winter and have abandoned for about 6 months. It’s been both troubling (there are parts that are truly terrible) and kind of rewarding to see, just from looking at the code, how much I’ve grown as a programmer.

Cue, then, I guess, the self-reflective interlude.

My reason for wanting to learn--really learn--a second programming language was to make sure that all the things I knew about system design, algorithms, and data structures were generalizable, and not rooted in the semantics of a specific language or even a specific implementation of that language. I was also interested in the process of learning new programming languages, so that I’d have some experience with the learning process itself, which may come in handy in the future.

Learning Go, I think, helped me achieve or realize these goals. While I haven’t really set out to learn a third language yet, it feels tractable. I’ve also noticed some changes in other aspects of my interests.

I used to be really interested in programming qua programming, and I thought a lot about programming languages. While I can still evaluate programming languages, and have my share of opinions about “the way things work,” I’m less concerned with specific syntax or implementation. I think a lot about build tools, platform support, deployment models, and distribution methods and stories, rather than what a language can do or how you have to write it. Or: how you make it, ship it, and run it.

I’ve also gotten less interested in UNIX-esque systems administration and operations, which is historically a thing I’ve been quite interested in. These days, I find myself thinking more about the following kinds of problems:

  • build systems, the tools that build software from source files (and sometimes test it!), and the ways to do this efficiently and sensibly. Build systems are quite hard because in a lot of ways they’re the point through which your software (as software) interacts with all of the platforms it runs on. Efficient build systems have a huge impact on developer productivity, which is a big interest of mine.
  • developer productivity, a big catch-all category, but it’s almost always true that people are more expensive than computers, so working on tools and features (like better build systems, or automating various aspects of the development process) tends to pay for itself quickly.
  • continuous integration and deployment, again connected to developer productivity, but taking the “automate building and testing” story to its logical conclusion. CD environments mean you deploy changes much more often, but you also have to trust the automated systems and make sure that project leadership and management are just as automated as the development experience.
  • internal infrastructure, as in “internal services and tools that all applications need,” like logging, queuing systems, abstractions for persistence, deployment systems, testing, and exposed interfaces (e.g. RPC systems, REST/HTTP, or command line option parsing). Having good tools for these generic aspects of an application makes writing actual features for users easier. I’m also increasingly convinced that the way to improve applications and systems is to improve these lower-level components and their interfaces.

Free Software and open source are still important, as is UNIX, but these kinds of developer productivity and automation issues sit a level above that. I’ve changed in the last 5 years, software has changed in the last 5 years, the way we run software on systems has changed in the last 5 years. I’m super excited to see what kinds of things I can do in this space, and where I end up in 5 years.

I’m also interested in thinking about ways to write about this. I’d written drafts of a number of posts about learning how to program and about systems administration, and now that I’m finding and making more time for writing, I don’t really know what kind of writing on these topics I’m interested in doing, or how to do it in a way that anyone would want to read.

We shall see. Regardless, I hope that I’m back, now.

Works In Progress

I’ve posted about some of these projects before, and I used to regularly post little overviews of the things that I’m working on, but I’ve not done a lot of this recently. Time to do some backfill.

I think it’s probably good to take a step back from time to time and inventory and evaluate the priorities of various projects. Not to mention the fact that I usually say “I’m not really doing much these days,” when this isn’t really true. Here goes:

Mango

This project is private at the moment, because it’s mostly useful as an experimental piece of testing infrastructure for work. The idea is to use the same underlying infrastructure to start, stop, and configure processes, while providing REST and command line interfaces for all of these operations.

We have a lot of distinct software that does this internally and it’s always fragile and limited. While grand discussions of code reuse are sort of silly, in this case, it’s a bit annoying that we’ve reinvented this wheel a dozen times… And have to make different changes to a dozen tools as configurations change.

This was also my first project written in Go, which means it’s been a great learning experience, and it’s the place where a number of other Go packages that have become useful in their own right got their start.

Future work:

  • Write all the tests.
  • Make the REST interface feature compatible with one of the legacy tools it aims to supplant.
  • Make a new REST interface that’s more sensible and might be easier to use in more circumstances.
  • Figure out better ways to block to give the appearance of synchronous operations, even though internally the operations are non-blocking.

Gimlet

Gimlet Blog Post

Gimlet Github

This is really just some convenience work to make it easy to build REST interfaces in Go, without needing to suffer through tools designed to support complete “full-stack” web applications. It’s built on the same Negroni/Gorilla mux stack that I think everyone uses, and it’s very net/http compliant, but with an API that makes it easy (even fun) to provide high quality JSON+HTTP interfaces.

It struck me, while working on part of Mango, that this chunk of the code didn’t have anything to do with the actual core application and was all about making a REST-like application happen. So I split it out, for everyone’s pleasure/suffering.

Future work:

  • Documentation.
  • More tests.
  • Exposed API stabilization and versioning.
  • Develop story for authentication, sessions and SSL termination.

Grip

Grip Blog Post

Grip Github

Grip is a logging package for Go that attempts to resolve my constant feeling of “I miss x feature of the Python logging package.” It isn’t feature-comparable with Python’s logging (but that’s okay), and since I was writing a logging package anyway, I got to add some nifty features.

Future Work:

  • More documentation.
  • Better examples, and potentially support for “print this message x% of the time.”
  • Support for logging to conventional syslog sources in addition to systemd’s logging.

Archer/Dropkick

I’ve wanted to work on a tool to unify a number of personal operations and scripts in a single system. The problem I’m trying to solve is that I use a number of different computers with some frequency, including laptops, desktops, servers, and test systems, and I want to be able to describe their configuration, and synchronize files and git data between machines, with ease.

My first approach was getting a bunch of random system setup scripts out of a makefile and into a configuration file that a Go program knew how to read and process, and then to expand from there.

I haven’t gotten to the git repository management stuff, because I was working on the Gitgone project.

Future Work:

  • add better support for creating and managing containers and images using systemd-nspawn and docker.
  • support for setting up git repositories
  • support for syncing automatically (i.e. dropbox-like functionality -> dropkick).
  • report status of repositories via a REST API
  • triggering syncs on remote systems.

Gitgone

Gitgone Github

The idea here is to provide a consistent and full-featured way to access and interact with git repositories from Go, without needing to wrap the git command yourself (or worse, learn the ins and outs of git2go). This is largely modeled on a similar component of libgiza that does the same sort of thing in Python.

The cool thing about this project is that it’s built abstractly, so that you can use one interface and switch between a libgit2-based implementation and one that wraps the git command itself.

Future Work:

  • complete implementation using libgit2
  • write more extensive tests.
  • add support for creating repository tags.
  • provide access to the log.

Novel Project

I’ve been, sporadically, working on notes for a new novel. It’s not going anywhere fast, and I’m not even to the point where I’m outlining plot.

I’m trying to tell a story about urban development and how smaller local communities/groups participate in larger communities/groups. How does urban development in place A impact nation building more globally, and what does this all look like to people as they get to work in the morning and build neighborhood institutions like gyms and restaurants and grocery stores?

But there’s a lot of work to do, and while thinking about the project is fun, I don’t feel ready to commit to a writing project of this scope. I’m also not sure how publishable this project would be (and even if it’s publishable, whether I’d be willing to do all of that work).

Software projects are much harder to justify and prioritize than writing projects.

Get a Grip

I made another Go(lang) thing. Grip is a set of logging tools modeled on Go’s standard library logging, with some additional (related) features, including:

  • level-based logging, with the ability to set a minimum threshold to exclude low-priority log messages (i.e. debugging).
  • error capture/logging, to log Go error objects.
  • error aggregation, for continue-on-error situations: when you want to attempt a bunch of operations and return an error if any of them failed, rather than stopping after the first failure.
  • logging to the systemd journal, with fallback to standard library logging to standard output.

There are helper functions for logging with different kinds of default string formatting, as well as functions that take error objects, and a “lazy” logging method that takes a simple interface for building log messages at log time rather than at operation time.
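The “lazy” approach can be sketched with an interface whose message-building method is only invoked when the priority check passes, so expensive formatting is skipped for filtered messages. These are hypothetical names, not grip’s exact API:

```go
package main

import "fmt"

// Composer builds its message only when asked, so expensive
// formatting is deferred until (and unless) the message is emitted.
type Composer interface {
	Resolve() string
}

type expensiveMessage struct {
	data []int
}

func (m expensiveMessage) Resolve() string {
	// imagine this serialization being costly
	return fmt.Sprintf("payload=%v", m.data)
}

type logger struct {
	threshold int
}

// Log resolves the message only if the level clears the threshold.
func (l logger) Log(level int, m Composer) {
	if level < l.threshold {
		return // message never built
	}
	fmt.Println(m.Resolve())
}

func main() {
	l := logger{threshold: 2}
	l.Log(1, expensiveMessage{data: []int{1, 2, 3}}) // filtered, never formatted
	l.Log(3, expensiveMessage{data: []int{4, 5, 6}}) // formatted and printed
}
```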

None of these features are terribly exciting, and the systemd support just wraps the systemd library from CoreOS. But I’m a big fan of log levels and priority filtering, so it’s nice to have a tool for that.

In the future, I’d like to add more generic syslog support, if that’s useful, and potentially tools for better categorical logging. There’s also a good deal of repeated code, and it might be nice to use this as an excuse to write a code generator with the go tool.

Pull requests and feedback are, of course, welcome.

Said on the Train

I finished, on the train this week, reading Freud and the Non-European by Edward Said (on the recommendation of zmagg), and it was one of the better reading experiences I’ve had in a while.

Said is brilliant, and says really complex, important, hard things in a clear and approachable style. He’s also frustratingly correct, which isn’t really a problem, but as an engaged and independent reader I occasionally realize that my internal monologue in response is an unintelligent “yep yep” chorus, and I feel like I’ve fallen down on the job of being a good reader.

I might have a bit of a complex.

The thing is that he actually is very right, and does an amazing job of meeting Freud in his historical context, respecting him, in that context, for the audacity of his mission and the power of his insights to encourage us to think about culture, its impact on human motivation, and how personal and cultural histories combine to produce identity and inspire behavior. Or, more simply: that self-hood and experience are products of history and context.

Without, of course, in any way excusing the flaws in Freud’s methods, biases, basis in fact (or lack thereof), or utility (or lack thereof) in the care of the mentally ill.

More than that, Said uses Freud, his ideas about Jewish identity, and his standing as an example of a certain phenotype of late-19th-century Jewishness, to help contextualize (roughly) contemporary thinking about Jewish identity and Israeli culture and statehood.

It’s roughly brilliant.


I’ve long struggled with any kind of theory that engages seriously with Freud or his intellectual successors: there’s so much crap around Freud that it sort of feels like throwing good energy after bad to try and justify or resuscitate the tradition. And it hurts when Freudian ideas are used to support what are otherwise really interesting intellectual projects.

If nothing else, Said gives a good example of how a successful intellectual interaction with Freud can occur, and of what kinds of parameters and context promote that kind of successful and productive interaction.

Maybe someday, I’ll learn how to be a quarter the reader that Said was. If I’m lucky.

In the mean time, I’m just going to keep reading things on the train.

Have a Gimlet: A Go JSON/HTTP API Toolkit

Look folks, I made a thing!

It’s called gimlet, and it’s a Go(lang) tool for making JSON/HTTP APIs (i.e. REST with JSON). Give it a whirl!

It’s actually less a tool and more of a toolkit, or just “a place to put all of the annoying infrastructure that you’ll inevitably need when you want to build a JSON/HTTP interface, but that has nothing to do with whatever your API/application does”: routing, and serializing and deserializing JSON.

Nothing hard, nothing terribly interesting, and certainly nothing you couldn’t do another way. But it’s almost certainly true that this layer of application infrastructure is totally orthogonal to whatever your application is actually doing, so you should focus on that, and probably use something like Gimlet.

Background

I’m using the term JSON/HTTP APIs for services where you send and receive JSON data over HTTP. Sometimes people call these REST APIs, and that’s not inaccurate, but I think REST is a bit more complicated, and not exactly the core paradigm that I’m pursuing with Gimlet.

Sending and receiving JSON over HTTP makes a lot of sense: there are great tools for parsing JSON, and HTTP is a decent high-level protocol for interprocess communication between simple data applications. Look up “microservices” at your leisure.

Go is a great language for this: it has a lot of tooling that anticipates these kinds of applications, and the deployment model is really friendly to operations teams and systems. Also, the static typing and reasonable separation of private and public interfaces are particularly lovely.

So it should be no surprise that there are a lot of tools for building web applications, frameworks even. Things like gorilla and negroni are great and provide a very useful set of tools for building Go web apps. Indeed, Gimlet uses components of each of them.

The issue, and the reason for Gimlet, is that all of these tools assume that you’re building a web application, with web pages, static resources, form handling, session state, and other things that are totally irrelevant to writing JSON/HTTP interfaces.

So then, Gimlet is a tool for building these kinds of APIs: it’s simple, it uses Negroni and Gorilla’s mux, and it does pretty much everything you need except actually write your code.

Example

Set up the app with some basic configuration:

import "github.com/tychoish/gimlet"

app := gimlet.NewApp()
app.SetPort(9001)
app.SetDefaultVersion(1)

This sets the port on which the HTTP server will listen for requests and configures the default version of the API. You do want all of your endpoints prefixed with “/v<number>”, right? Routes in the default version of the API are also available without the prefix, as are routes registered with version 0.

Then register some routes:

app.AddRoute("/<path>").Version(<int>).Get().Handler(http.HandlerFunc)
app.AddRoute("/<path>").Version(<int>).Post().Handler(http.HandlerFunc)

app.AddRoute returns an API route object with a set of chainable methods for defining routes. If you add multiple HTTP methods (GET, POST, and the like) to a route, Gimlet automatically defines a route with the same handler for each method.

For handlers, I typically write functions that take arguments from the top-level context (database connections, application configuration, etc.) and return http.HandlerFunc objects. For example:

func helloWorld(config *Configuration) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		input := make(map[string]interface{})
		response := make(map[string]interface{})

		if err := gimlet.GetJSON(r, &input); err != nil {
			gimlet.WriteErrorJSON(w, map[string]string{"error": err.Error()})
			return
		}

		// do stuff here

		gimlet.WriteJSON(w, response)
	}
}

Gimlet has the following functions for parsing JSON out of the body of a request and adding JSON output to the body of a response:

  • WriteJSONResponse(w http.ResponseWriter, code int, data interface{})
  • GetJSON(r *http.Request, data interface{})

These read or write data using the interface{} value (typically a pointer to a struct). The following three provide consistent response writers for common status codes:

  • WriteJSON(w http.ResponseWriter, data interface{}) // 200
  • WriteErrorJSON(w http.ResponseWriter, data interface{}) // 400
  • WriteInternalErrorJSON(w http.ResponseWriter, data interface{}) // 500

Finally, when you’ve written your app, kick it all off with the following:

err := app.Run()
if err != nil {
   fmt.Println(err)
   os.Exit(1)
}

And that’s it. Enjoy, and tell me in the comments or on the issues feed if you find something broken or confusing. Contributions welcome, of course.

A New Era

I’m writing the draft of this on an airplane bound for Ireland, for a week of singing. Though I travel often, taking weekend jaunts to folk festivals, singing conventions, Morris dancing tours, and so forth, I don’t really vacation often. I find travel and managing the logistics of being in unfamiliar places stressful, and my idea of a good time has a lot to do with sipping a cup of coffee,1 writing something, and reading a book.

Even though I love spending time at home, it’s still important to (sometimes) leave, try new things, and exist somewhere differently for a little while to reset and reflect a bit. With luck, this vacation thing will become something I feel comfortable doing, at least occasionally.

While it certainly wasn’t part of the initial plan, it turns out that this trip is pretty well timed: it denotes a relatively significant change in my life, and I think it’s a fitting celebration of a period of large changes.


I am no longer a technical writer.

When I return to work, I’m joining the core engineering team to work on build infrastructure and systems projects. I’ll be automating our release process and maintaining continuous integration systems, along with an eclectic set of other projects (some of which may involve some technical writing. There are some things you can never escape.)

The truth though, is that this change has been a long time coming: it feels pretty natural. I’ve been working on the documentation build system for a while, and that’s increasingly been the most fun part of my job, so it’ll be good to spend more time doing that kind of work and learn from folks who know more about this kind of thing. Also, build infrastructure and packaging is incredibly important to how people use software and how engineers work, which have been consistent interests of mine for years.

Also, through the last release process, I found that I’m burning out on writing documentation. I can do it, and I’m not bad at it, but after writing, editing, and shepherding more than a million words of documentation (over several thousand pages), I sometimes feel like I’ve seen it all. I’m interested in seeing the new ideas and perspectives that will prosper in my absence, and eager to see how the foundation I’ve built stands up without me around. It was time.


The decision to change jobs happened rather suddenly. While I’ve built a narrative (see above), in reality something clicked and I realized it was time. Ten days later there was a plan. I said to myself, “Shit, I thought I was done with major life changes for a while.”

In the last year, I’ve bought an apartment, moved to Brooklyn, and reorganized my local family grouping.2 The good news is that I find myself in a good state, and there’s nothing left to change.3

It’s been a rocky year, nothing is going to change that. Even if in retrospect I find myself satisfied with my actions and decisions, and even if I come out the other end better for my struggle, this is a year I wouldn’t care to repeat.

And even so, I’m excited about the future, about continuing to do interesting work professionally, about enjoying my city and local geography, to surround myself with top notch humans, and to make cool things.

These are early days, and there’s work to do.


In the mean time, I’ll be over here, enjoying something different for a little while.


  1. Astute readers of the blog will note that I am historically a tea drinker. I changed to coffee in late June 2014: I discovered that I didn’t mind the taste as much as I thought, and I like a slightly more potent caffeine delivery system. ↩︎

  2. Break ups suck, each in its own special fucked up way. ↩︎

  3. I’ve started singing more tenor, I guess, so maybe there’s more to change after all. ↩︎

Style Chameleon

When I started my current job there were three major problems with the documentation:

  • There was too much duplicated content, so it was difficult to know where to point people.
  • Given that there were always multiple versions of the product in use, it was hard to figure out which paragraph referred to which version, particularly as the product changed.
  • Each page felt like it was written by someone else (it was!) and the reading experience could be quite jarring.

The first two were huge tasks, but the solutions were pretty straightforward: build a documentation system structured so that it could theoretically hold all the information (replace lots of information repositories with a single repository), and then use maintenance branches in a version control system to snapshot and fork off old versions as they’re released.

Done.

The last problem is harder. Much harder.

I did a pretty good job, at first, of just writing everything myself, which meant that the first drafts all sounded like they were written by one person, because they were. This doesn’t scale. The next step was to edit the hell out of all contributions that weren’t mine.

This also doesn’t scale.

There are some canonical solutions: write a long style guide and try to get people to comply with it; use templates and standard formats for documents so that everything shares common structures and forms. It’s still hard to enforce, but it’s something.

I put a lot of time into content reuse systems that had additional structure. It helps, and reduces some editorial overhead, but has the same weak points as the conventional solution.

At some point, you need actual humans to edit and make sure things are clear and consistent across the entire corpus. If you don’t have the resources for good editors, then either writers have to spend a significant amount of time editing, which is a huge time suck (and slows progress), or you start to get drift.

I’m not sure that we have a good answer yet. In the meantime, some notes:

Appendices

Style in Group Processes

  1. As you innovate and improve it’s really hard to resist the impulse to go back to older pieces to revise them. You should make passes through all your content on some sort of schedule, but you have to give yourself permission and allowance to go back and fix style later.
  2. Common style is less about “being right,” and more about figuring out the kind of communication that’s appropriate for the target audience(s) and using effective structure to support them. In short: compromise.
  3. Style is about the entire reading experience, not just the syntax or the typesetting: it is both of these things, as well as others. For some kinds of texts, the trick is often to write as few words as possible, and to make it so that people can scan through documents quickly while reading only the sections that are relevant to them.

Personal Observations

It’s actually interesting to write blog posts again; I’m (re?)discovering a style of writing that I was very comfortable with in the past but haven’t really exercised recently. It’s also interesting to see how my own writing and writing process have changed as a result of writing so much technical material.