Handle choices the right way

Published November 2, 2007. Filed under: Django.

A lot of common use cases involve a model field which needs to accept values from a restricted set of choices; for example, a field for selecting a US state should, logically, only allow values which correspond to actual US states. In Django’s ORM, this is represented by using the “choices” argument when defining the field, and generally this provides a fairly easy solution.

But it’s not always perfect: while string-based values (such as those for a US state field, which are — in Django’s implementation — simply two-letter postal abbreviations like “VA” or “KS”) work pretty well with this system, numeric values are a bit trickier. As an example, consider the Entry model I use for my blog (the full source code, if you’re interested, is in a Google Code repository), which has a “status” field to let me differentiate three different types of entries:

  1. “Live” entries are those which appear publicly on the site.
  2. “Draft” entries are works in progress, and don’t yet appear publicly; once I’ve finished writing them, I’ll change the status to “Live”.
  3. “Hidden” entries are entries that, for whatever reason, I no longer want to display publicly, but don’t want to delete from the database.

As an aside, I highly recommend not deleting content from your database; you never know when you might want it back again, and although there are ways to implement “undelete”-like functionality, it’s often simplest just to have a way to toggle public display on and off.

Now, translating this set of options into a choices tuple is fairly easy:

STATUS_CHOICES = (
    (1, 'Live'),
    (2, 'Draft'),
    (3, 'Hidden'),
)

There isn’t really an easy way to come up with string-based abbreviations for these values (at least, not ones that can be sensibly internationalized), so using integer values is the way to go. Then it’s a simple matter to add the choices to the model:

from django.db import models

class Entry(models.Model):
    # ...some other fields here...
    status = models.IntegerField(choices=STATUS_CHOICES)

And we can improve that a bit by defaulting entries to being “Live”:

status = models.IntegerField(choices=STATUS_CHOICES, default=1)

And from there it’s fairly easy to filter out entries that aren’t live, by querying like so:

live_entries = Entry.objects.filter(status=1)

You could even implement a custom manager which overrides get_query_set() to only return live entries (which I’ve done here).

But there’s a big problem with this: now the application is relying heavily on a magic number.

Bad magic

In general, a “magic number” is any numeric constant (or any otherwise-meaningless value) which is referenced literally in your code. In this case, we’re using the “Live” status value — 1 — in at least two places already: once in the declaration of the model field (to provide a default value) and, even if we write a custom manager for returning live entries, at least one more time in order to filter on status=1. And that’s just if we want to have conveniences for working with the live entries; if we ever need, say, a manager method or a QuerySet of drafts, now we get to go drop another number into the code in a couple of places.

This is problematic for several reasons; first and foremost it violates the DRY principle, by requiring you to repeat the same magic integer value in multiple places. It also runs the risk of violating the closely-related “Once and Only Once” principle, because it’s easy to fall into the trap of writing filter(status=1) in more than one place. Finally, it creates a maintenance headache: you need to keep track of what the “magic” value is and a list of every single place it’s used (since any future change needs to happen in all of those places at once).

Removing the magic

But that leaves us with a problem: how do we reference this value without hard-coding the magic number all over the place? In languages with enumerated types (such as enum in C and its relatives), this is a fairly easy problem to solve; in one of those languages we could just declare an enum with names for the different options, and let the language handle the underlying values.

But Python doesn’t really have anything resembling enum. We could import the STATUS_CHOICES tuple and filter like this:

live_entries = Entry.objects.filter(status=STATUS_CHOICES[0][0])

But, though we’re no longer hard-coding the integer value, we’re now relying on the precise definition of the STATUS_CHOICES tuple; if some other value (say, one for editorial approval by an administrator) ever creeps into the first slot, anything which references STATUS_CHOICES[0][0] is going to break.
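To make that fragility concrete, here is a minimal sketch in plain Python (no Django required). The “Needs approval” status and its value of 4 are hypothetical, invented only to show what happens when something new lands in the first slot:

```python
# The original choices: the first slot happens to hold the "Live" value.
STATUS_CHOICES = (
    (1, 'Live'),
    (2, 'Draft'),
    (3, 'Hidden'),
)
live_value_before = STATUS_CHOICES[0][0]

# Hypothetical later revision: an approval status is added at the front.
REVISED_STATUS_CHOICES = (
    (4, 'Needs approval'),
    (1, 'Live'),
    (2, 'Draft'),
    (3, 'Hidden'),
)
live_value_after = REVISED_STATUS_CHOICES[0][0]

print(live_value_before)  # 1, which is the "Live" value
print(live_value_after)   # 4, which is no longer the "Live" value
```

Nothing raises an error here; code indexing into the tuple would simply start filtering on the wrong status, which is arguably worse than a crash.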

What we can do, however, is just define a set of constants:

LIVE_STATUS = 1
DRAFT_STATUS = 2
HIDDEN_STATUS = 3

And from there we can redefine the STATUS_CHOICES tuple to rely on these constants:

STATUS_CHOICES = (
    (LIVE_STATUS, 'Live'),
    (DRAFT_STATUS, 'Draft'),
    (HIDDEN_STATUS, 'Hidden'),
)

And, similarly, we can start importing and referencing these constants. For example, the status field could now be written like so:

status = models.IntegerField(choices=STATUS_CHOICES, default=LIVE_STATUS)

And anything which needs to filter for live entries can use it:

live_entries = Entry.objects.filter(status=LIVE_STATUS)

And we can identify drafts by filtering for status=DRAFT_STATUS or hidden entries by filtering for status=HIDDEN_STATUS. This significantly cleans up our code, in two ways:

  1. We no longer have “magic numbers” lying around which need to be updated separately; a single change to the definition of one of the constants is all that’s needed to bring the code up to date (the database is another matter, but it can be fixed with a single UPDATE query).
  2. The code is now much clearer: a query for status=1 might mean anything, but a query for status=LIVE_STATUS is almost self-explanatory.
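A side benefit, sketched here in plain Python: since a choices tuple is just a sequence of (value, label) pairs, the same constants can also be used to look up the human-readable label. The names STATUS_LABELS and status_label are hypothetical helpers for illustration, not part of Django or of the blog application:

```python
LIVE_STATUS = 1
DRAFT_STATUS = 2
HIDDEN_STATUS = 3

STATUS_CHOICES = (
    (LIVE_STATUS, 'Live'),
    (DRAFT_STATUS, 'Draft'),
    (HIDDEN_STATUS, 'Hidden'),
)

# dict() accepts a sequence of (key, value) pairs, so the choices tuple
# converts directly into a value-to-label lookup table.
STATUS_LABELS = dict(STATUS_CHOICES)


def status_label(value):
    """Hypothetical helper: return the display label for a status value."""
    return STATUS_LABELS[value]


print(status_label(LIVE_STATUS))
print(status_label(HIDDEN_STATUS))
```

Because both the tuple and the lookup are built from the named constants, changing a constant’s value in one place keeps everything in sync.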

Encapsulation

There’s one more change, though, that’d make this even better: encapsulating the status choices inside the Entry model. These choices logically “belong” to the Entry model, after all, so it shouldn’t be necessary to define them separately or import them separately from Entry itself. So move the constants, and the choices tuple, inside the Entry class:

class Entry(models.Model):
    LIVE_STATUS = 1
    DRAFT_STATUS = 2
    HIDDEN_STATUS = 3
    STATUS_CHOICES = (
        (LIVE_STATUS, 'Live'),
        (DRAFT_STATUS, 'Draft'),
        (HIDDEN_STATUS, 'Hidden'),
    )
    # ...some other fields here...
    status = models.IntegerField(choices=STATUS_CHOICES, default=LIVE_STATUS)

Now we can just import the Entry model and query like so:

live_entries = Entry.objects.filter(status=Entry.LIVE_STATUS)
draft_entries = Entry.objects.filter(status=Entry.DRAFT_STATUS)

We can also do comparisons of actual entries to the constant values:

if entry_object.status == Entry.LIVE_STATUS:
    # do something with live entry
    pass
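The reason this works is ordinary Python scoping rather than anything Django-specific: names assigned earlier in a class body are visible to later statements in that same body, so STATUS_CHOICES can refer to LIVE_STATUS directly. The following is a rough sketch using a plain class with no database behind it; EntryStandIn and is_live are hypothetical names used only for this illustration:

```python
class EntryStandIn:
    """Plain-Python stand-in for the Entry model (hypothetical, no ORM)."""
    LIVE_STATUS = 1
    DRAFT_STATUS = 2
    HIDDEN_STATUS = 3
    # Later statements in the class body can see the names defined above.
    STATUS_CHOICES = (
        (LIVE_STATUS, 'Live'),
        (DRAFT_STATUS, 'Draft'),
        (HIDDEN_STATUS, 'Hidden'),
    )

    def __init__(self, status=LIVE_STATUS):
        self.status = status

    def is_live(self):
        # Compare against the named constant, never a bare integer.
        return self.status == self.LIVE_STATUS


entries = [
    EntryStandIn(),
    EntryStandIn(EntryStandIn.DRAFT_STATUS),
    EntryStandIn(EntryStandIn.HIDDEN_STATUS),
]
# In-memory equivalent of filtering for live entries.
live = [e for e in entries if e.is_live()]
print(len(live))
```

Outside the class, the constants are reached as attributes (EntryStandIn.DRAFT_STATUS), mirroring how Entry.DRAFT_STATUS is used in the queries above.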

Go forth and enumerate

Though it involves a bit more typing up-front (since the constants need to be defined first, then the choices tuple needs to be defined based on them), this is generally the best solution for handling integer-based choices in Django applications; the ability to reference the constants as attributes of the model class, like Entry.LIVE_STATUS, instead of hard-coding magic numbers or dealing with a separate data structure from the class, is about as clean as this is going to get.