An update on the book

July 7, 2009 Django, Meta

So, the repository for the second edition of Practical Django Projects is not yet done, but due to the general clamor I’m opening up public access; you can browse it, or check out a copy of the code, from its page on Bitbucket. You’ll probably want to have a look over the README file displayed on that page, since it provides helpful information on how the repository works.

Right now the first three chapters’ worth of code (covering the first project in the book — a very simple content-management system) are in the repository; the rest is in the process of being added, and will likely come in spurts as I push bunches of commits from my local copy up to the public repository. But before I push any more commits I’d like to take a moment to explain the workflow I’m using for this, since it’s both mildly interesting and somewhat helpful in understanding the time involved in getting this to work properly.

The repository you see on Bitbucket right now says it was created a little less than two days ago. And it was, but also it wasn’t. The first attempt at this repository was created last year, when I started writing the second edition of the book. The thinking was that this would provide the eventual model for the public repository, but that during the writing process it would be useful for doing tech review. But that fizzled out, and for a while it wasn’t updated; it ended up containing a fun mishmash of corrections and backtracking edits that did more harm to its utility than good.

So as soon as I had a copy of the printed book in my hands — my first definitive copy of the text to work from, since there were a few changes still to be made to the book’s content even on the final proofs — I deleted the repository and re-created it, and started going through the book, following the general plan I’d laid out: each code listing in the book would be a changeset in the repository, and readers would be able to step through the repository one change at a time as they went through the book. Once all the code was in, I’d add a bunch of tags to the repository to mark things like where chapters begin and end.

And all was well for a little while until I noticed that I’d made a mistake in one of the early changesets and pushed it into the copy of the repository on Bitbucket. Crap.

Being a stickler and a perfectionist, I once again dropped the repository and started over, this time from just before the spot where I’d screwed up. And all was well for a little while until I did it again.

So on Sunday, I dropped and re-created the repository again, and started it over from scratch with a new workflow:

1. I have a local copy of the repository, cloned from Bitbucket. We’ll call this Repository A.
2. I have another local copy of the repository, cloned from Repository A. We’ll call this Repository B.
3. Repository A is treated, essentially, as untouchable. I do not edit files in that repository.
4. Repository B is where all the action happens; I pick up the book, make the changes in the next code listing, test them to make sure they work, and commit.
5. Every so often, typically at useful break points, I pause and push from Repository B to Repository A.
6. Having done that, I test again, then finally push from Repository A up to Bitbucket.
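
For anyone who wants to see those steps as commands, here is a minimal sketch in Python that only builds the Mercurial command lines and does not run them. The URL, the directory names `repo-a` and `repo-b`, and the commit message are placeholders, not the real ones. The `hg update` in Repository A is an assumption about how the second round of testing would see the new files, since pushing into a local repository does not touch its working directory.

```python
from typing import List

# Placeholder values: the post does not name the real URL or directories.
BITBUCKET_URL = "https://bitbucket.org/USER/REPO"
REPO_A = "repo-a"  # the untouchable staging copy
REPO_B = "repo-b"  # the scratch copy where the editing happens

Command = List[str]


def setup_commands() -> List[Command]:
    """Clone Bitbucket into A, then clone A into B."""
    return [
        ["hg", "clone", BITBUCKET_URL, REPO_A],
        ["hg", "clone", REPO_A, REPO_B],
    ]


def commit_listing(message: str) -> Command:
    """Commit one tested code listing in B (-A also picks up new files)."""
    return ["hg", "-R", REPO_B, "commit", "-A", "-m", message]


def publish_commands() -> List[Command]:
    """Push B to A, refresh A's working copy for testing, then push A to Bitbucket."""
    return [
        ["hg", "-R", REPO_B, "push"],    # B's default path is A
        ["hg", "-R", REPO_A, "update"],  # assumption: needed before testing in A
        ["hg", "-R", REPO_A, "push"],    # A's default path is Bitbucket
    ]


if __name__ == "__main__":
    steps = setup_commands() + [commit_listing("Listing 1")] + publish_commands()
    for cmd in steps:
        print(" ".join(cmd))
```

Because each clone records where it came from as its default path, a bare `hg push` in B goes to A and a bare `hg push` in A goes to Bitbucket, which is what keeps the two stages separate.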

The advantage of this is that I’m free to screw up as much as I need to; if I do, I blow away the local Repository B, and re-clone from Repository A at a point before the mistake. And if a problem does creep into Repository A, I’m still doing another round of testing to try to catch it before it becomes public; in that case I blow away both repositories, re-clone from Bitbucket and try again.
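
The two recovery paths can be sketched the same way. This is an illustration under assumptions, not a record of the exact commands used: the names `repo-a` and `repo-b` and the URL are placeholders, and the optional `-r` argument to `hg clone` is one possible way to stop at a revision before the mistake (a plain clone of Repository A works too if the mistake was never pushed there).

```python
from typing import List, Optional

# Placeholder values, not the real URL or directory names.
BITBUCKET_URL = "https://bitbucket.org/USER/REPO"
REPO_A = "repo-a"
REPO_B = "repo-b"


def recovery_steps(mistake_in_a: bool,
                   last_good_rev: Optional[str] = None) -> List[str]:
    """Return the steps, in order, for recovering from a bad changeset.

    mistake_in_a: True if the bad changeset was already pushed from B to A.
    last_good_rev: optional revision in A to stop at when re-cloning B.
    """
    steps = [f"delete {REPO_B}"]
    if mistake_in_a:
        # Both local copies are suspect: start again from the public history.
        steps.append(f"delete {REPO_A}")
        steps.append(f"hg clone {BITBUCKET_URL} {REPO_A}")
        steps.append(f"hg clone {REPO_A} {REPO_B}")
    else:
        # Only B is bad: re-clone it from the still-clean Repository A.
        rev = f"-r {last_good_rev} " if last_good_rev else ""
        steps.append(f"hg clone {rev}{REPO_A} {REPO_B}")
    return steps


if __name__ == "__main__":
    for step in recovery_steps(mistake_in_a=False):
        print(step)
```

Nothing here rewrites history. The bad changesets simply never leave the copy that gets thrown away.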

The disadvantage of this is that it’s a timesink. Taking all of the testing and repository shuffling into account, each changeset can end up representing ten minutes’ worth of work or more (especially if I find an error; larger gaps between commits on the public history usually happen because of this). I pushed just shy of twenty commits tonight, for example, and it shows up as three hours on the timeline, or roughly nine minutes per changeset. Oh, and these are from the easy, beginner portion of the book. Some of the later stuff’s going to be really fun.
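
The per-changeset figure is simple division, shown here as a small sketch. The count of 19 in the second call is an assumption standing in for "just shy of twenty"; the post does not give the exact number.

```python
def minutes_per_changeset(hours: float, changesets: int) -> float:
    """Average minutes of timeline per changeset."""
    if changesets <= 0:
        raise ValueError("changesets must be positive")
    return hours * 60 / changesets


print(minutes_per_changeset(3, 20))            # 9.0
print(round(minutes_per_changeset(3, 19), 1))  # 9.5 (19 is an assumed count)
```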

But I’ve finally got a workflow that lets me keep a clean history in the public repository while still being able to correct mistakes, and that’s what counts.

I do realize that certain other version-control tools would let me reach back and just edit history to suit whatever I’d like it to be, but I’m also quite heavily committed to Mercurial at this point; not being able to edit history, though it requires contortions at times, is generally a good thing (as is the overall simplicity of working with Mercurial). And I do realize that I could get the same effect in Mercurial with a patch queue, but that’s a much bigger can of worms than I’d care to deal with right now.

Anyway, that’s what’s up right now. Three chapters’ worth of code so far, all tested against current Django trunk, with more on the way. For those who desire it, a feed of commits to the repository is available, so you can track it in real time.

Meanwhile, I think the overall effect of this approach ends up looking rather nice. If somebody can come up with a way to make it easier to manage, I’d love for this to become the usual way for technical books to publish their code.