A lot of common use cases involve a model field which needs to accept values from a restricted set of choices; for example, a field for selecting a US state should, logically, only allow values which correspond to actual US states. In Django’s ORM, this is represented by using the “choices” argument when defining the field, and generally this provides a fairly easy solution.
But it’s not always perfect: while string-based values (such as those for a US state field, which in Django’s implementation are simply two-letter postal abbreviations like “VA” or “KS”) work pretty well with this system, numeric values are a bit trickier. As an example, consider the Entry model I use for my blog (the full source code, if you’re interested, is in a Google Code repository), which has a “status” field to let me differentiate three different types of entries: “Live”, “Draft” and “Hidden”.
As an aside, I highly recommend not deleting content from your database; you never know when you might want it back again, and although there are ways to implement “undelete”-like functionality, it’s often simplest just to have a way to toggle public display on and off.
Now, translating this set of options into a choices tuple is fairly easy:
STATUS_CHOICES = (
    (1, 'Live'),
    (2, 'Draft'),
    (3, 'Hidden'),
)
There’s really not any easy way to come up with a string-based abbreviation for these values (at least, not one that can be sensibly internationalized), so using integer values is the way to go. Then it’s a simple matter to add it to the model:
class Entry(models.Model):
    # ...some other fields here...
    status = models.IntegerField(choices=STATUS_CHOICES)
And we can improve that a bit by defaulting entries to being “Live”:
status = models.IntegerField(choices=STATUS_CHOICES, default=1)
And from there it’s fairly easy to filter out entries that aren’t live, by querying like so:
live_entries = Entry.objects.filter(status=1)
You could even implement a custom manager which overrides get_query_set() (spelled get_queryset() since Django 1.6) to only return live entries (which I’ve done here).
But there’s a big problem with this: now the application is relying heavily on a magic number.
In general, a “magic number” is any numeric constant (or any otherwise-meaningless value) which is referenced literally in your code. In this case, we’re using the “Live” status value — 1 — in at least two places already: once in the declaration of the model field (to provide a default value) and, even if we write a custom manager for returning live entries, at least one more time in order to filter on status=1. And that’s just if we want to have conveniences for working with the live entries; if we ever need, say, a manager method or a QuerySet of drafts, now we get to go drop another number into the code in a couple of places.
This is problematic for several reasons; first and foremost it violates the DRY principle, by requiring you to repeat the same magic integer value in multiple places. It also runs the risk of violating the closely-related “Once and Only Once” principle, because it’s easy to fall into the trap of writing filter(status=1) in more than one place. Finally, it creates a maintenance headache: you need to keep track of what the “magic” value is and a list of every single place it’s used (since any future change needs to happen in all of those places at once).
But that leaves us with a problem: how do we reference this value without hard-coding the magic number all over the place? In languages with enumerated types (such as enum in C and its relatives), this is a fairly easy problem to solve; in one of those languages we could just declare an enum with names for the different options, and let the language handle the underlying values.
But Python, at the time of writing, didn’t really have anything resembling enum (the standard library’s enum module only arrived later, in Python 3.4). We could import the STATUS_CHOICES tuple and filter like this:
live_entries = Entry.objects.filter(status=STATUS_CHOICES[0][0])
But, though we’re no longer hard-coding the integer value, we’re now relying on the precise definition of the STATUS_CHOICES tuple; if some other value (say, one for editorial approval by an administrator) ever creeps into the first slot, anything which references STATUS_CHOICES[0][0] is going to break.
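To make that failure mode concrete, here is a small, Django-free sketch. The extra “Needs approval” status and its value of 4 are hypothetical, invented only for illustration:

```python
STATUS_CHOICES = (
    (1, 'Live'),
    (2, 'Draft'),
    (3, 'Hidden'),
)

# Today, the first slot happens to hold the "Live" value.
live_value = STATUS_CHOICES[0][0]

# Hypothetical later revision: a new status is placed in the first slot.
REVISED_STATUS_CHOICES = (
    (4, 'Needs approval'),
) + STATUS_CHOICES

# The same index expression now silently yields a different status.
shifted_value = REVISED_STATUS_CHOICES[0][0]
shifted_label = REVISED_STATUS_CHOICES[0][1]
```

Nothing raises an error here; code that indexes into the tuple simply starts filtering on the wrong status, which is arguably worse than a crash.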
What we can do, however, is just define a set of constants:
LIVE_STATUS = 1
DRAFT_STATUS = 2
HIDDEN_STATUS = 3
And from there we can redefine the STATUS_CHOICES tuple to rely on these constants:
STATUS_CHOICES = (
    (LIVE_STATUS, 'Live'),
    (DRAFT_STATUS, 'Draft'),
    (HIDDEN_STATUS, 'Hidden'),
)
And, similarly, we can start importing and referencing these constants. For example, the status field could now be written like so:
status = models.IntegerField(choices=STATUS_CHOICES, default=LIVE_STATUS)
And anything which needs to filter for live entries can use it:
live_entries = Entry.objects.filter(status=LIVE_STATUS)
And we can identify drafts by filtering for status=DRAFT_STATUS or hidden entries by filtering for status=HIDDEN_STATUS. This significantly cleans up our code, in two ways:
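First, each value is now defined in exactly one place, so changing it means editing a single line of code (plus, for rows already stored in the database, an UPDATE query).
Second, the code is easier to read: a query for status=1 might mean anything, but a query for status=LIVE_STATUS is almost self-explanatory.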
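Both points can be seen without a database. The following is only a rough sketch: the entries list and the titles_with_status() helper are hypothetical stand-ins for real rows and a real ORM query.

```python
LIVE_STATUS = 1
DRAFT_STATUS = 2
HIDDEN_STATUS = 3

STATUS_CHOICES = (
    (LIVE_STATUS, 'Live'),
    (DRAFT_STATUS, 'Draft'),
    (HIDDEN_STATUS, 'Hidden'),
)

# Hypothetical stand-in for rows in the database: (title, status) pairs.
entries = [
    ('First post', LIVE_STATUS),
    ('Half-finished idea', DRAFT_STATUS),
    ('Retired post', HIDDEN_STATUS),
    ('Second post', LIVE_STATUS),
]


def titles_with_status(rows, status):
    """Return the titles of rows whose status equals the given constant."""
    return [title for title, row_status in rows if row_status == status]


# The call site names the status instead of repeating a bare integer.
live_titles = titles_with_status(entries, LIVE_STATUS)
```

If the integer behind LIVE_STATUS ever changes, only the constant’s definition needs editing; the call site stays exactly as it is.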
There’s one more change, though, that’d make this even better: encapsulating the status choices inside the Entry model. These choices logically “belong” to the Entry model, after all, so it shouldn’t be necessary to define them separately or import them separately from Entry itself. So move the constants, and the choices tuple, inside the Entry class:
class Entry(models.Model):
    LIVE_STATUS = 1
    DRAFT_STATUS = 2
    HIDDEN_STATUS = 3
    STATUS_CHOICES = (
        (LIVE_STATUS, 'Live'),
        (DRAFT_STATUS, 'Draft'),
        (HIDDEN_STATUS, 'Hidden'),
    )
    # ...some other fields here...
    status = models.IntegerField(choices=STATUS_CHOICES, default=LIVE_STATUS)
Now we can just import the Entry model and query like so:
live_entries = Entry.objects.filter(status=Entry.LIVE_STATUS)
draft_entries = Entry.objects.filter(status=Entry.DRAFT_STATUS)
We can also do comparisons of actual entries to the constant values:
if entry_object.status == Entry.LIVE_STATUS:
    # do something with live entry
    pass
Though it involves a bit more typing up-front (since the constants need to be defined first, then the choices tuple needs to be defined based on them), this is generally the best solution for handling integer-based choices in Django applications; the ability to reference the constants as attributes of the model class, like Entry.LIVE_STATUS, instead of hard-coding magic numbers or dealing with a separate data structure from the class, is about as clean as this is going to get.
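As a quick, Django-free illustration of how the class-level constants read in practice, here is a minimal sketch that uses a plain Python class as a stand-in for the model. The is_live() and status_label() helpers are hypothetical names, not part of the Entry model above; on a real Django model, a field with choices already gets a get_status_display() method that performs the label lookup.

```python
class Entry:
    """Plain-Python stand-in for the Django model (no database involved)."""

    LIVE_STATUS = 1
    DRAFT_STATUS = 2
    HIDDEN_STATUS = 3
    STATUS_CHOICES = (
        (LIVE_STATUS, 'Live'),
        (DRAFT_STATUS, 'Draft'),
        (HIDDEN_STATUS, 'Hidden'),
    )

    def __init__(self, title, status=LIVE_STATUS):
        # The default is resolved in the class body, mirroring
        # default=LIVE_STATUS on the model field.
        self.title = title
        self.status = status

    def is_live(self):
        # Hypothetical helper: compare against the named constant.
        return self.status == self.LIVE_STATUS

    def status_label(self):
        # Hypothetical helper: look up the human-readable label.
        return dict(self.STATUS_CHOICES)[self.status]
```

Everything a caller needs (the values, the labels and the default) now travels with the class itself, so a single import of Entry is enough.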
We use a very basic Enum object to handle this. It works nicely and reduces typing over your examples (same idea, though).
Used as:
The operator overloading lets you add options for newforms easily, especially if you aren’t using form_for_model:
You can use it similarly to what you had:
It’s missing a few methods, notably __contains__, __nonzero__ and maybe a few others, but we don’t use them and removed them.
That’s not a bad solution either; I’d seen a third-party enum implementation, but it seemed like it might not be a good fit for this sort of thing.
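I have been using something like this:
class MyChoices:
    def __init__(self, **entries):
        # Expose each name as an attribute (e.g. statuses.BIDDING_STARTED).
        self.__dict__.update(entries)
        # Build (value, name) pairs suitable for a field's choices argument.
        self.choices = [(val, key) for key, val in entries.items()]

    def get_choices(self):
        # Called from the field definition to supply the choices.
        return self.choices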
I use it like this (in models.py): statuses = MyChoices(BIDDING_STARTED=10, BIDDING_ENDED=20)
And then: status = models.IntegerField(default=statuses.BIDDING_STARTED, choices=statuses.get_choices())
Sorry I screwed up the formatting earlier.
I have posted the above code snippet at djangosnippets.org instead.
One could argue that ‘Hidden’, ‘Live’, and ‘Draft’ are display-oriented values. While this isn’t necessarily problematic for blogging software, I find it best to use statuses that describe state while remaining content-agnostic; e.g.: ‘ACTIVE’, ‘INACTIVE’, ‘DELETED’, ‘WAIT4CHG’, ‘R2UPLOAD’, ‘R2DOWNLOAD’, etc. This allows for the re-use of statuses across your models and keeps your front-end decisions out of your database—after all, you might want to have a view that exposes those non-public entries, at which point they wouldn’t be ‘Hidden’ anymore (although you’d still want them ‘private’).
And one could argue that trying to find truly generic descriptions of object state is a fool’s errand ;)
As for the field being “display-oriented”, bear in mind the specific application in which it’s being used: the manner in which an entry will or will not be publicly viewable is an integral part of a weblogging application, and should not be dismissed merely because of concerns about theoretical purity.
Not truly generic, but at least not indicative of display. A very practical change from ‘Hidden’ to ‘Private’ would do for my tastes. But I understand we don’t all like the same tea.
Well, I kind of agree with Matthew. I mean this is a pretty pedantic article about tidying up loose ends and the “right” way of doing things. To that end, Matthew’s comment is useful and was meant in the right spirit. I have a constants module which contains classes of this nature and I like to keep the names meaningful yet generic enough to re-use across projects.
The previous comment was not meant to imply anything negative. Very nice article, and much appreciated!
Rick, I think it’s just my philosophy background speaking; too many people spent time on the wild goose chase of “well, if we just come up with really good categorization for everything, it’ll solve all these problems”. So I get a bit wary when I hear talk of trying to find good generic names for things.
Now, it’s true that sometimes there are more generic things which make sense for a larger group of models, but it feels (to me, at least) that more often the values are more tightly coupled to a single specific model; in the example of blog entries, I think that’s the case, simply because there is — at this point — a fairly established set of conventions for how blogging software works.