Managers versus class methods

February 25, 2008 Django, Python

In the triumphant return of “James answers questions from the django-users list”, today I’d like to take a few moments to talk about something that’s recently become something of a hot topic, spawning not one but two threads, as well as a couple of off-list email discussions: what, exactly, is appropriate to put into a method on a custom manager as opposed to, say, a class method, and when and how can you tell?

This is a somewhat tricky question to answer, since there’s a substantial grey area where personal preference will be the deciding factor, but I’d like to at least make a start on the question and offer up some general guidelines to go with my own personal preferences. First, though, let’s quickly review the subject of class methods in Python and managers in Django.

Class versus instance methods

When you’re defining a class in Python, there are three types of methods you can write as part of it: instance methods, class methods and static methods. The third type, static methods, really isn’t relevant to this discussion (and is somewhat problematic to explain, due to confusion on the part of people coming from, say, Java who have pre-conceived notions of what a “static” method is good for), so we’ll skip over it.

Of the other two, instance methods are by far the most common things you’ll be working with; any time you define a method on a class in Python, it’ll be an instance method unless you explicitly tell Python to treat it otherwise. Class methods have to be created using the classmethod decorator and, by convention, use cls instead of self for the name of their first argument. The essential difference — as their names imply — is that an instance method is generally called from an instance of the class, while a class method is generally called from the class itself. You can call an instance method from the class, or a class method from an instance, if you really want to, but there’s often not much point to doing so.

Generally, when you have a method that logically belongs to a class (by virtue of what it does, or by working with instances of the class) but either doesn’t act on just a single instance or changes the class itself, it’ll make sense to write it as a class method; when it only does something specific to a single instance of the class, it’ll make sense to write it as an instance method.
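To make the distinction concrete, here is a minimal sketch in plain Python (no Django involved). The Note class and its summary() and from_line() methods are hypothetical names invented purely for illustration.

```python
class Note:
    """A tiny plain-Python class used to contrast the two kinds of method."""

    def __init__(self, title, body):
        self.title = title
        self.body = body

    def summary(self, length=10):
        # Instance method: works on the data of one particular Note.
        return self.body[:length]

    @classmethod
    def from_line(cls, line):
        # Class method: receives the class itself as `cls`, and here builds
        # a new instance from a "title: body" string.
        title, body = line.split(":", 1)
        return cls(title.strip(), body.strip())


# Called on the class, because there is no instance yet.
note = Note.from_line("Hello: managers versus class methods")
print(note.title)

# Called on an instance, because it only concerns this one Note.
print(note.summary())
```

The class method is a natural fit for from_line() because its whole job is to produce an instance, so there is nothing for self to refer to when it is called.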

Managers

You might expect, then, that when you define a Django model class the methods for querying against the class would be class methods. But, in fact, they’re not; you generally perform queries through an attribute on the class, often named objects. For example, consider this simple model representing a blog entry:

from django.db import models

class Entry(models.Model):
    title = models.CharField(max_length=250)
    pub_date = models.DateTimeField()
    body = models.TextField()
    is_live = models.BooleanField()  # True once the entry should show up in public views

The correct way to query for a list of all entries is not Entry.all(), but rather Entry.objects.all(). The attribute objects is called a “manager”, and it’s an instance of the class django.db.models.Manager; it’s where all the default methods for performing queries against the entire model class — all(), get(), filter(), etc. — will end up. If you’re interested in specifics, the Django model documentation covers the way the default manager is set up, and how to add your own custom managers to a model.

One immediate advantage of this approach is that it keeps the model class’ namespace relatively uncluttered; you don’t have to worry about accidentally defining something on your model which conflicts with the name of a query method (it’d be awfully annoying if you could never have a field named, say, count or values). But there are two much larger benefits from having these methods on a separate class:

It opens up the ability to define multiple sets of query behaviors, and choose between them as needed. So, for example, the Entry class might get two managers: one which queries on all entries, for use in the admin interface, and another which only queries on entries with is_live=True for public views.
It provides an easy way to encapsulate patterns of behavior; if you commonly need to have a set of extra query methods, for example, or want to change the behavior of existing methods, you only need to write that code once — in a Manager subclass — and can reuse it on multiple models, rather than having to duplicate the logic every time you want to use it.

Given this, managers are pretty clearly a better option for this sort of functionality than normal class methods.
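The first of those benefits can be sketched without a database. What follows is a deliberately simplified, in-memory stand-in for the manager pattern, not Django’s implementation: ToyManager, LiveManager, base_rows() and the _rows list are all assumptions made up for this illustration. In a real project you would subclass django.db.models.Manager instead.

```python
class ToyManager:
    """Holds the query methods for one model class (in-memory stand-in)."""

    def __init__(self, rows):
        self._rows = rows  # shared list playing the role of the database table

    def base_rows(self):
        # Subclasses override this to change what every query starts from.
        return list(self._rows)

    def all(self):
        return self.base_rows()

    def filter(self, **conditions):
        return [
            row for row in self.base_rows()
            if all(getattr(row, name) == value for name, value in conditions.items())
        ]


class LiveManager(ToyManager):
    """Same query methods, but only ever sees entries with is_live=True."""

    def base_rows(self):
        return [row for row in super().base_rows() if row.is_live]


class Entry:
    _rows = []
    objects = ToyManager(_rows)  # everything, e.g. for an admin interface
    live = LiveManager(_rows)    # only live entries, e.g. for public views

    def __init__(self, title, is_live):
        self.title = title
        self.is_live = is_live
        Entry._rows.append(self)


Entry("Draft post", is_live=False)
Entry("Published post", is_live=True)

print([e.title for e in Entry.objects.all()])
print([e.title for e in Entry.live.all()])
print([e.title for e in Entry.live.filter(title="Draft post")])
```

Because the restriction lives in one overridden method, every query method on LiveManager inherits it; the class itself gains only two attributes, objects and live.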

What belongs on a manager?

It’s fairly easy to see from this that methods which perform queries against the entire table a particular model represents, or which change the basic query behavior for a particular model, belong in a manager. What’s not so easy is deciding what else belongs in a manager; this is the grey area previously alluded to. Some people seem to take the stance that anything which isn’t a query against the entire table should be a class method instead of a manager method, but I tend to disagree with that. In fact, I personally tend to avoid class methods on models in general, favoring manager methods whenever possible, for a couple of reasons:

I don’t want to get into a situation where some bits of class-level functionality are defined on the class and others on a manager; I like having one place to look and see everything that’s available. And a whole lot of class-level functionality is already in the manager, so I just add methods there unless I have a compelling reason not to.
More than once I’ve been bitten by something I thought would be specific to a single model but which turned out not to be (one notable case is fetching a list of “most commented” objects for a given model), and which ended up as a method on a manager class that got reused in several places; writing a manager method from the start would, in those cases, have saved me a bit of refactoring.
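The “most commented” case shows why reuse matters. The sketch below is a plain-Python stand-in rather than Django code; MostCommentedManager, its most_commented() method, the Link class and the comment_count attribute are hypothetical names chosen for illustration.

```python
class MostCommentedManager:
    """Written once, attached to any class whose objects have comment_count."""

    def __init__(self, rows):
        self._rows = rows  # in-memory list standing in for a database table

    def most_commented(self, num=5):
        # Highest comment counts first, limited to `num` results.
        ranked = sorted(self._rows, key=lambda obj: obj.comment_count, reverse=True)
        return ranked[:num]


class Entry:
    _rows = []
    objects = MostCommentedManager(_rows)

    def __init__(self, title, comment_count):
        self.title = title
        self.comment_count = comment_count
        Entry._rows.append(self)


class Link:
    _rows = []
    objects = MostCommentedManager(_rows)  # same manager class, no duplicated logic

    def __init__(self, url, comment_count):
        self.url = url
        self.comment_count = comment_count
        Link._rows.append(self)


Entry("Quiet post", 1)
Entry("Busy post", 12)
Link("https://example.com/a", 7)
Link("https://example.com/b", 3)

print([entry.title for entry in Entry.objects.most_commented(1)])
print([link.url for link in Link.objects.most_commented()])
```

Had most_commented() been written as a class method on Entry, giving Link the same behavior would have meant copying it or refactoring it out later.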

But even if you don’t want to go all the way down that path, I can recommend a couple of guidelines for when to use a manager method:

If there will be cases where you need to create a model instance, but with some specialized logic that doesn’t necessarily apply in all cases, do it in a manager as a factory-style method. The auth app already does this, for example, exposing a create_user() method which handles hashing the password. Since Python doesn’t let you overload constructors, this is a whole lot cleaner and simpler than writing a custom __init__() with complex branching logic to handle your different cases.
If there’s even the hint of a possibility that you might want to reuse a piece of functionality, do it in a manager method; reusing the same custom manager on multiple models is better than writing the same method multiple times (and since you can subclass managers more or less infinitely, it’s fairly easy to implement reusable versions of even fairly complex functionality).
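A factory-style manager method might look roughly like the following. This is a plain-Python sketch of the idea, not the auth app’s code: ToyUserManager, the simplified User class, the salt handling and the PBKDF2 parameters are all assumptions made for illustration only.

```python
import hashlib
import hmac
import os


def _hash_password(raw_password, salt):
    # Illustrative parameters only; this is not how Django's auth app does it.
    return hashlib.pbkdf2_hmac("sha256", raw_password.encode("utf-8"), salt, 100_000)


class User:
    def __init__(self, username, salt, password_hash):
        # The plain constructor stays simple: it just stores what it is given.
        self.username = username
        self.salt = salt
        self.password_hash = password_hash

    def check_password(self, raw_password):
        candidate = _hash_password(raw_password, self.salt)
        return hmac.compare_digest(candidate, self.password_hash)


class ToyUserManager:
    def create_user(self, username, raw_password):
        # Factory method: the specialized creation logic (hashing) lives here
        # instead of in branching code inside User.__init__().
        salt = os.urandom(16)
        return User(username, salt, _hash_password(raw_password, salt))


# Attach the manager to the class, mirroring the `objects` attribute.
User.objects = ToyUserManager()

user = User.objects.create_user("alice", "s3cret")
print(user.check_password("s3cret"))
print(user.check_password("wrong"))
```

Code that already has a salt and hash in hand (loading a stored record, say) can still call User() directly, while code creating a brand-new account goes through create_user(); neither path needs to know about the other.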

And of course, this goes together with a general rule that any sort of custom queries or customized query behavior should go into manager methods; handling that sort of “table-level” interaction is what managers are best at.

Also, when in doubt, poke around in the Django codebase; several of the applications in django.contrib have custom managers which add interesting functionality to their models, and which might give you some ideas for whether and how to implement something as a manager method.