Performance tips

November 27, 2007 Django

Nearly two years ago, Jacob posted a round-up of useful tips for Django performance, geared mainly at the non-Django portions of your stack; suggestions like having dedicated media and database servers, memcached, plenty of RAM and database tuning really aren’t Django-specific. Two years later, all of his tips are still relevant and will still have an impact on the performance of your Django-based stack.

That still leaves open the question of how to squeeze every last bit of performance out of Django itself, and out of your Django-based applications, so today let’s take a look at some useful and suitably general techniques you might want to try once you know you need them.

Caching

It should go without saying that caching can be your best friend; anything which cuts down on the number of database queries or expensive calculations you need to do will inevitably pay off in lighter server load and faster response times. As Jacob says, use memcached; it’s blazing fast, and if it’s good enough for the millions who use LiveJournal, it’s good enough for you.

But the stock site-wide cache isn’t always ideal. For example, on all of our news sites we allow logged-in users to post comments, take part in interactive chats and use other features which involve changes to the content we display to them. And even if they aren’t heavily interacting, we still need to change some parts of each page in response to a logged-in user’s identity. The stock site-wide caching provided by Django makes this problematic: if a user posts a comment, for example, it won’t show up until the page’s cache expires, and we can forget about displaying things like Hello, {{ user.username }} at the top of the page.

The solution is to set CACHE_MIDDLEWARE_ANONYMOUS_ONLY=True in your Django settings file. This lets the site-wide cache work as normal for users who aren’t logged in (and hence aren’t interacting with the site in a way the cache will interfere with), while letting logged-in users see a page that’s generated fresh each time. Anything you store through the low-level cache interface isn’t affected by this setting, which is handy for caching expensive queries that happen regardless of who the user is.
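As a rough illustration, a settings fragment along these lines would turn on the behavior described above. Only CACHE_MIDDLEWARE_ANONYMOUS_ONLY comes from the text; the memcached address and the timeout are assumptions, and the setting names are the ones used by Django releases from around the time of writing.

```python
# settings.py (fragment). Everything except CACHE_MIDDLEWARE_ANONYMOUS_ONLY
# is an illustrative assumption.

# Assumed: a memcached instance listening locally on its default port.
CACHE_BACKEND = "memcached://127.0.0.1:11211/"

# Assumed: keep each cached page for ten minutes.
CACHE_MIDDLEWARE_SECONDS = 600

# The setting discussed above: only serve cached pages to anonymous visitors.
CACHE_MIDDLEWARE_ANONYMOUS_ONLY = True
```

The site-wide cache middleware itself still has to be enabled for any of this to take effect.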

Also, remember that you can hook into the low-level cache API from any Python code in your application; anywhere you have an expensive database query or a calculation that takes a lot of work, you should consider using the low-level API to stuff the result into your cache.
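A minimal sketch of that pattern might look like the following. The helper name cached_call and the DictCache stand-in are hypothetical and exist only so the example runs on its own; in a real project you would pass in the cache object from django.core.cache, which offers the same get() and set() calls.

```python
class DictCache:
    """Hypothetical in-memory stand-in with the same get/set shape as Django's cache."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value, timeout=None):
        # A real backend would expire the entry after `timeout` seconds;
        # this stand-in simply keeps it forever.
        self._data[key] = value


def cached_call(cache, key, compute, timeout=300):
    """Return the cached value for `key`, computing and storing it on a miss."""
    value = cache.get(key)
    if value is None:  # a miss (note: a legitimately cached None also looks like a miss)
        value = compute()
        cache.set(key, value, timeout)
    return value


calls = []  # records how many times the expensive work actually ran


def expensive_report():
    calls.append(1)  # stands in for a slow query or calculation
    return {"total_comments": 1234}


demo_cache = DictCache()
first = cached_call(demo_cache, "comment_report", expensive_report)
second = cached_call(demo_cache, "comment_report", expensive_report)
```

The second call is served from the cache, so the expensive function only runs once.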

Database queries

Django’s ORM is extremely handy and — in most cases — won’t cause you any sort of performance problems. There are two situations, though, where you might want to take a little extra care:

You need to get all of the objects in a complex relationship which would ordinarily involve many separate queries as relationship-spanning attributes are accessed.
You need to work with extremely large numbers of records, and just need the data.

In the first case, the select_related() method can be a huge help, though not on every database. Some databases (MySQL, for example) are often actually happier with large numbers of small queries, while others, like PostgreSQL, will perform better on a smaller number of more complex queries. If you’re noticing a bottleneck in your application from accessing related objects, select_related() can often get you an important performance boost, so try it and measure the result on your own database.
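To make the trade-off concrete, here is a small sketch that uses Python’s sqlite3 module instead of the ORM, purely to count queries. The entry and author tables are hypothetical. The first function mirrors what happens when you loop over something like Entry.objects.all() and touch entry.author on each object; the second mirrors the single joined query that select_related() asks the database for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE entry (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER)"
)
conn.executemany("INSERT INTO author (id, name) VALUES (?, ?)",
                 [(1, "alice"), (2, "bob")])
conn.executemany("INSERT INTO entry (id, title, author_id) VALUES (?, ?, ?)",
                 [(1, "First", 1), (2, "Second", 2), (3, "Third", 1)])

queries = []  # every SQL statement we run gets logged here


def run(sql, params=()):
    queries.append(sql)
    return conn.execute(sql, params).fetchall()


def entries_lazy():
    """One query for the entries, then one more per entry for its author."""
    result = []
    for title, author_id in run("SELECT title, author_id FROM entry ORDER BY id"):
        name = run("SELECT name FROM author WHERE id = ?", (author_id,))[0][0]
        result.append((title, name))
    return result


def entries_joined():
    """A single, more complex query that fetches entries and authors together."""
    return run(
        "SELECT entry.title, author.name FROM entry "
        "JOIN author ON author.id = entry.author_id ORDER BY entry.id"
    )


lazy_rows = entries_lazy()
lazy_query_count = len(queries)

del queries[:]
joined_rows = entries_joined()
joined_query_count = len(queries)
```

Both functions return the same rows; the difference is three extra round trips for three entries in the lazy version, which grows with the number of rows.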

In the second case, Django’s ORM can hurt more than it helps: instantiating a Django model object isn’t a trivial process, and each model object also consumes a certain amount of memory. If you’re retrieving extremely large numbers of objects, you’re going to start noticing this overhead. If all you need is to get the data and do something with it, you can use the values() method to get back a ValuesQuerySet, a special subclass of the normal QuerySet which doesn’t instantiate any model objects. Instead, iterating, slicing or indexing into it will yield dictionaries containing the field names and values, and dictionaries are far simpler to create. You can also pass field names as arguments to values(), and the resulting dictionaries will only contain values for those fields. The slightly altered SELECT statement executed in this case probably won’t change the performance of the query noticeably, but it will result in less memory use, since fewer pieces of data have to be kept around in memory.
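Code that consumes such a result only has to deal with plain dictionaries, as in this small sketch. The Entry model and its word_count field are hypothetical; with Django, the rows would come from something like Entry.objects.values("word_count"), and a list of dictionaries of the same shape stands in for it here so the example runs on its own.

```python
def total_words(rows):
    """Sum the word_count field over an iterable of dictionary rows."""
    return sum(row["word_count"] for row in rows)


# Stand-in for the dictionaries a values("word_count") call would yield.
sample_rows = [
    {"word_count": 120},
    {"word_count": 80},
    {"word_count": 300},
]

total = total_words(sample_rows)
```

Because the function only reads dictionary keys, no model instances ever need to be built for it.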

Templates

Templates can be extremely tricky to profile unless you know what you’re doing, because there’s not a whole lot of information available on how long you’re spending in different parts of a template. In general, though, there are a few things you should be aware of which — once you identify a problematic segment of a template — may help you out a bit.

The heavy-handed solution is simply to cache part of a template, using the (somewhat new) template fragment caching available through Django’s {% cache %} tag. You can also write custom template tags which access the cache API, potentially cutting down on expensive work or time-consuming queries.

You can also use the “with” tag to avoid the overhead of repeatedly resolving a variable or calling an expensive method. Because of the way Django’s template variable resolution works, each occurrence of a variable in your template will normally result in a separate Variable instance and a separate resolution; this can be an expensive process for complex variables (e.g., in a variable name like foo.bar.baz.quux, Django has to try each of several methods of resolution on each “part” of the variable) or for variables which resolve to methods to be called (and which might in turn have to do some work to return a result). The with tag avoids this repeated work by ensuring there’s only a single copy of the Variable instance between the with and endwith tags, which means it only needs to resolve once.
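The cost being described can be sketched in plain Python. The resolve() function below is a deliberately simplified imitation of dotted-name lookup, not Django’s actual implementation, and the Blog class is hypothetical. It models the effect of the with tag (resolve once, reuse the result) rather than its internal mechanism; in a template this would be written roughly as {% with blog.entry_count as total %} ... {% endwith %}.

```python
def resolve(path, context):
    """Simplified dotted-name lookup: try several strategies on each part."""
    current = context
    for part in path.split("."):
        try:
            current = current[part]               # dictionary lookup
        except (TypeError, KeyError, AttributeError, IndexError):
            try:
                current = getattr(current, part)  # attribute lookup
            except AttributeError:
                current = current[int(part)]      # list-index lookup
        if callable(current):
            current = current()                   # method call
    return current


class Blog:
    def __init__(self):
        self.calls = 0

    def entry_count(self):
        self.calls += 1  # stands in for an expensive query
        return 42


blog = Blog()
context = {"blog": blog}

# Using the variable twice means resolving it, and calling the method, twice.
repeated = [resolve("blog.entry_count", context),
            resolve("blog.entry_count", context)]
calls_when_repeated = blog.calls

# Resolving once and reusing the result, as the with tag allows.
blog.calls = 0
context["total"] = resolve("blog.entry_count", context)
reused = [resolve("total", context), resolve("total", context)]
calls_when_reused = blog.calls
```

Both approaches produce the same values, but the expensive method runs twice in the first case and once in the second.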

Complex template logic can also be problematic; for loops are especially troublesome because they involve quite a bit of setup and have to create, track and update several loop-related variables. All of that takes time, so a template which involves lots of for loops (especially nested loops) should be carefully examined to see if there’s not a simpler way to express the same logic. Writing a custom template tag for a particularly complex piece of display logic is also something to consider; shifting some of the work into Python code can provide important performance improvements (and the shortcut decorators for “simple” and “inclusion” tags can make this a fairly simple process).

General tips

Predicting in advance where the performance bottlenecks will be in an application is almost impossible; generally, the database will be a major factor, so limiting the number of queries you perform, and constructing them carefully to ensure you don’t over-tax your DB, are important first steps. But there are plenty of other places where you can run into issues; learning how to use and work with your database’s logging facilities, the Python profiler and Django’s own debugging information are all key to identifying and resolving bottlenecks in your applications.
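As a starting point for the Python profiler, something like the following works for any callable. The functions handle_request and slow_sum are hypothetical stand-ins for your own view and whatever it calls; the cProfile and pstats modules are part of the standard library.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately slow work, standing in for a real bottleneck."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def handle_request():
    """Hypothetical stand-in for a view function."""
    return slow_sum(200000)


profiler = cProfile.Profile()
profiler.enable()
result = handle_request()
profiler.disable()

# Collect the five most expensive entries, sorted by cumulative time.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

The report lists each function with its call count and time spent, which is usually enough to show where to look next.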

The best advice I can give here, though, is not to get caught up in trying to optimize performance until you know whether you need to; until your application is actually up and running, and seeing real-world traffic levels, it’s going to be difficult or impossible to predict how the application will perform or when and where it will run into problems. Trying to optimize your code before you know how it actually performs is a recipe for needless over-engineering at best, and disaster — when you find that the real performance bottleneck was lurking somewhere else entirely — at worst.