Managers versus class methods

An entry published by James Bennett on February 25, 2008. Part of the categories Django and Python. 10 comments posted.

In the triumphant return of “James answers questions from the django-users list”, today I’d like to take a few moments to talk about something that’s recently become a hot topic, spawning not one but two threads, as well as a couple of off-list email discussions: what, exactly, is appropriate to put into a method on a custom manager as opposed to, say, a class method, and when and how can you tell?

This is a somewhat tricky question to answer, since there’s a substantial grey area where personal preference will be the deciding factor, but I’d like to at least make a start on the question and offer up some general guidelines to go with my own personal preferences. First, though, let’s quickly review the subject of class methods in Python and managers in Django.

Class versus instance methods

When you’re defining a class in Python, there are three types of methods you can write as part of it: instance methods, class methods and static methods. The third type, static methods, isn’t really relevant to this discussion (and is somewhat problematic to explain, due to confusion on the part of people coming from, say, Java who have preconceived notions of what a “static” method is good for), so we’ll skip over it.

Of the other two, instance methods are by far the most common things you’ll be working with; any time you define a method on a class in Python, it’ll be an instance method unless you explicitly tell Python to treat it otherwise. Class methods have to be created using the classmethod decorator and, by convention, use cls instead of self for the name of their first argument. The essential difference — as their names imply — is that an instance method is generally called from an instance of the class, while a class method is generally called from the class itself. You can call an instance method from the class, or a class method from an instance, if you really want to, but there’s often not much point to doing so.

Generally, when you have a method that logically belongs to a class (by virtue of what it does, or by working with instances of the class) but that doesn’t act on just a single instance, or that changes the class itself, it makes sense to write it as a class method; when it only does something specific to a single instance of the class, it makes sense to write it as an instance method.
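
As a minimal sketch of that distinction, here is a plain Python class (not a Django model). The Note class and its word_count and from_text methods are hypothetical names invented purely for this illustration.

```python
class Note:
    """A plain Python class, invented for illustration (not a Django model)."""

    def __init__(self, title, body):
        self.title = title
        self.body = body

    def word_count(self):
        # Instance method: works with the data of one particular note.
        return len(self.body.split())

    @classmethod
    def from_text(cls, text):
        # Class method: receives the class as cls and builds a new instance,
        # treating the first line of the text as the title.
        title, _, body = text.partition("\n")
        return cls(title, body)


# The class method is called on the class, the instance method on an instance.
note = Note.from_text("Reminder\nread the manager documentation")
print(note.title)         # Reminder
print(note.word_count())  # 4
```

Here from_text logically belongs to Note but has no single instance to act on (it creates one), while word_count only makes sense for one specific note.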

Managers

You might expect, then, that when you define a Django model class the methods for querying against the class would be class methods. But, in fact, they’re not; you generally perform queries through an attribute on the class, often named objects. For example, consider this simple model representing a blog entry:

from django.db import models

class Entry(models.Model):
    title = models.CharField(max_length=250)
    pub_date = models.DateTimeField()
    body = models.TextField()
    is_live = models.BooleanField()

The correct way to query for a list of all entries is not Entry.all(), but rather Entry.objects.all(). The attribute objects is called a “manager”, and it’s an instance of the class django.db.models.Manager; it’s where all the default methods for performing queries against the entire model class — all(), get(), filter(), etc. — will end up. If you’re interested in specifics, the Django model documentation covers the way the default manager is set up, and how to add your own custom managers to a model.

One immediate advantage of this approach is that it keeps the model class’ namespace relatively uncluttered; you don’t have to worry about accidentally defining something on your model which conflicts with the name of a query method (it’d be awfully annoying if you could never have a field named, say, count or values). But there are two much larger benefits from having these methods on a separate class:

  1. It opens up the ability to define multiple sets of query behaviors, and choose between them as needed. So, for example, the Entry class might get two managers: one which queries on all entries, for use in the admin interface, and another which only queries on entries with is_live=True for public views.
  2. It provides an easy way to encapsulate patterns of behavior; if you commonly need to have a set of extra query methods, for example, or want to change the behavior of existing methods, you only need to write that code once — in a Manager subclass — and can reuse it on multiple models, rather than having to duplicate the logic every time you want to use it.

Given this, managers are pretty clearly a better option for this sort of functionality than normal class methods.
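
The two benefits above can be sketched without Django at all. What follows is a toy stand-in written in plain Python, not Django's implementation: ToyManager, LiveOnlyManager, ToyEntry, ToyLink and the base_rows hook are all hypothetical names, and the "table" is just a list. It only aims to show the shape of the idea: query methods live on a separate object, one class can carry several managers, and a manager subclass can be reused across classes.

```python
class ToyManager:
    """Toy stand-in for a manager; this is NOT Django's Manager class."""

    def __init__(self, rows):
        self._rows = rows  # a plain list plays the role of the database table

    def base_rows(self):
        # Hook (hypothetical name) that subclasses override to change the
        # default set of rows every query method starts from.
        return list(self._rows)

    def all(self):
        return self.base_rows()

    def count(self):
        return len(self.base_rows())


class LiveOnlyManager(ToyManager):
    """Reusable behavior: only expose rows whose is_live attribute is true."""

    def base_rows(self):
        rows = super().base_rows()
        return [row for row in rows if row.is_live]


class ToyEntry:
    _rows = []
    objects = ToyManager(_rows)    # sees every entry
    live = LiveOnlyManager(_rows)  # sees only entries with is_live=True

    def __init__(self, title, is_live, count=0):
        self.title = title
        self.is_live = is_live
        # An attribute named "count" is fine here: the query method of the
        # same name lives on the managers, not on this class.
        self.count = count
        self._rows.append(self)


class ToyLink:
    _rows = []
    objects = ToyManager(_rows)
    live = LiveOnlyManager(_rows)  # same manager class, different "table"

    def __init__(self, url, is_live):
        self.url = url
        self.is_live = is_live
        self._rows.append(self)


ToyEntry("Draft post", is_live=False)
ToyEntry("Published post", is_live=True, count=3)
ToyLink("https://example.com/", is_live=True)

print([e.title for e in ToyEntry.objects.all()])  # ['Draft post', 'Published post']
print([e.title for e in ToyEntry.live.all()])     # ['Published post']
print(ToyEntry.objects.count(), ToyEntry.live.count())  # 2 1
print(ToyLink.live.count())  # 1
```

In current Django versions, the rough counterpart of the base_rows hook is overriding get_queryset() on a Manager subclass; the model documentation mentioned above has the specifics.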

What belongs on a manager?

It’s fairly easy to see from this that methods which perform queries against the entire table a particular model represents, or which change the basic query behavior for a particular model, belong in a manager. What’s not so easy is deciding what else belongs in a manager; this is the grey area previously alluded to. Some people seem to take the stance that anything which isn’t a query against the entire table should be a class method instead of a manager method, but I tend to disagree with that. In fact, I personally tend to avoid class methods on models in general, favoring manager methods whenever possible, for a couple of reasons:

  1. I don’t want to get into a situation where some bits of class-level functionality are defined on the class and others on a manager; I like having one place to look and see everything that’s available. And a whole lot of class-level functionality is already in the manager, so I just add methods there unless I have a compelling reason not to.
  2. More than once I’ve been bitten by something I thought would be specific to a single model but turned out not to be (one notable case is fetching a list of “most commented” objects for a given model), and ended up as a method on a manager class that got reused in several places; writing a manager method from the start would, in those cases, have saved me a bit of refactoring.

But even if you don’t want to go all the way down that path, I can recommend a couple of guidelines for when to use a manager method:

And of course, this goes together with a general rule that any sort of custom queries or customized query behavior should go into manager methods; handling that sort of “table-level” interaction is what managers are best at.

Also, when in doubt, poke around in the Django codebase; several of the applications in django.contrib have custom managers which add interesting functionality to their models, and which might give you some ideas for whether and how to implement something as a manager method.

On February 25, 2008, Hugh Bien said:

Whenever a method has nothing to do with a particular instance, I usually end up sticking it in the model’s Manager. Right now, I can’t think of any sort of method I’d make as a class method on the model.

On February 25, 2008, Petar said:

Thanks James,

this post made it all a bit clearer for me.

On February 25, 2008, Tane Piper said:

I agree with Petar, this has made understanding where to use a manager simpler. I’m still relatively new to Python and Django, but every day I see something new that just blows me away with the power.

On February 25, 2008, Sean said:

I’ve been wondering about this. I have some models in a ‘common’ app that represent common bits of data that are pointed to by FKs in models in several other apps. An example is Timescale that encapsulates information about an event time and duration (i.e. start/end date and time plus some repetition options).

Now when I come to build a query within a manager method for the, say, Event model I want to put the (somewhat intricate) logic which will add the relevant filters to a queryset to code owned by the Timescale model. I do this with a staticmethod in Timescale called ‘add_time_filter(qs, time, relation_name)’ (the relation name allows the calling code to specify the name of the FK field).

When I came up with this it didn’t seem entirely like idiomatic Django, and I also end up having to call qs.filter with the gnarly constructs like qs.filter({‘%s__start_time__lte’ % relation_name, time}) to get relation_name into the dynamic kwargs of filter.

I guess you’re saying that I could do this in a custom manager for the Timescale model which my Event model could subclass, but I’d still be left with the gnarly {} stuff to allow runtime specification of the FK fieldname, and I’d be adding an extra class just for that when it sits fairly well as a staticmethod in the Timescale class at the moment.

Any thoughts on a better way to handle this case of encapsulating query logic for models in the model class when the model is primarily brought into queries by joins? (I hope this is a reasonable description of the use case I’m getting at - if not I could post a cut-down code snippet to demonstrate.)

On February 25, 2008, Sean said:

The gnarly code I refer to should of course look like qs.filter(**{'%s__start_time__lte' % relation_name: time}) when I do proper Markdown escaping.

On February 25, 2008, James Bennett said:

Sean, I’ve written up an example of how to handle your case (albeit with somewhat contrived models since I don’t know the details of your setup) and posted it to djangosnippets. Let me know if it helps.

On February 25, 2008, Sean said:

James - ah, yes. That’s nice, using model introspection to determine the FK field rather than passing it explicitly (but of course caching so it’s a one-off op per manager instance). With this the benefit of a base class manager rather than staticmethods becomes much more evident. Fits my use case perfectly, and definitely feels more idiomatic. Thanks!

On February 25, 2008, Sean said:

Well, when I say perfectly I meant almost perfectly!

Two issues arise as I adapt my code to your superior methodology:

1) Some of my models point at more than one ‘shared’ model in this way. For example Event has FKs pointing to both Timescale and Location. Both Timescale and Location have some reasonably intricate logic involved in constructing queries against them, which I want to encapsulate in both cases. One approach might be to create a TimedLocatedObjectManager which handles both - I don’t yet have a case where I use one without the other. This seems a tad brittle though. An alternative might be to define classes similar to your ReviewedObjectManager, but not have them subclass Manager and call them something like ‘TimedObjectMixin’ and ‘LocatedObjectMixin’, and then define EventManager by multiply-inheriting from Manager, TimedObjectMixin and LocatedObjectMixin. Sound reasonable or overcomplicating matters?

2) Relatedly, I want my equivalents of your rating_equals method to allow filtering of an existing queryset, allowing the chaining I need to fully handle what I describe in (1). This seems more straightforward - I just have the method take an optional ‘queryset’ parameter (defaulting to None, in which case self.filter is used as in your example).

IMO the nice thing about really working through a good model design like this, even though it creates somewhat more complexity in the model definitions, is it creates higher-level units of abstractions for constructing queries in the views, and adheres better to DRY and encapsulation principles.

On February 26, 2008, Marco Pantaleoni said:

Hi James, let me express my appreciation for your posts, always insightful and interesting!

“The essential difference, as their names imply, is that an instance method is generally called from an instance of the class, while a class method is generally called from the class itself. You can call an instance method from the class, or a class method from an instance, if you really want to, but there’s often not much point to doing so.”

I think that it’s very often useful to call a class method from an instance (from an instance method). For example, to access properties shared by all the instances. Think about controlled access to a map of all instances, just to name a typical case.

On February 26, 2008, Tim Chase said:

The one time I’ve found it helpful to use classmethods instead of managers: when the filtering required external parameters. Particularly by users. Thus, I might have something like

class MyExample(Model):
    # field definitions
    owner = ForeignKey(User)

    @classmethod
    def allowed(cls, user):
        # rather complex logic happens once here
        # instead of every referencing view
        return cls.objects.filter(owner=user)

which allows me to do things like

MyExample.allowed(request.user).filter(…)

which, for some of my applications is what I need to do.

If there’s some way to dynamically pass this information to a manager, it would make more sense to have a manager.

The other reason to use managers is it allows introspection to find all the attributes of an object that can be treated like managers.

-tim
