Be careful with your URL patterns

Tonight in the Django IRC channel, someone stumbled across a seemingly-odd error when trying to use a generic view:

TypeError: object_list() got multiple values for keyword argument 'queryset'

The problem turned out to be the URL pattern which was routing to the generic view. Consider a simple example, as might be found in a weblog application:

from django.conf.urls.defaults import *
from weblog.models import Entry

info_dict = {
    'queryset': Entry.objects.all()
}

urlpatterns = patterns('',
    (r'^(index|weblog)/$', 'django.views.generic.list_detail.object_list', info_dict)
)

The idea here is that either of two different URLs, index/ or weblog/, should route to the object_list generic view and display a list of entries. At first glance this looks all right; it’s even somewhat clever in using the regular expression to handle two potential URLs instead of writing two full patterns which do essentially the same thing. But it’ll actually raise the TypeError listed above.

The reason for this is that parentheses in a regular expression also capture the values they match. Django passes captured values as positional arguments, so the object_list view is getting, as its first positional argument, the bit of text which matched the (index|weblog) part of the regular expression. The first positional argument defined (after request) in the argument signature of object_list, as it turns out, is queryset, but the supplied info_dict dictionary is also going to try to pass that as a keyword argument, which means that the view does indeed end up getting two values for the queryset argument. And at that point Python steps in and raises the TypeError.
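The failure can be reproduced without Django at all. What follows is a minimal sketch in plain Python: object_list here is a hypothetical stand-in whose first argument after request is queryset, and dispatch is a made-up, heavily simplified model of what the URL resolver does with captured groups, not Django’s actual code.

```python
import re


def object_list(request, queryset, paginate_by=None):
    # Hypothetical stand-in for the generic view: queryset is the
    # first argument after request.
    return "listing %r" % (queryset,)


def dispatch(pattern, path, view, extra_kwargs):
    # Simplified model of URL dispatch (an assumption, not Django's code):
    # unnamed captured groups become positional arguments, and the extra
    # dictionary from the URL pattern becomes keyword arguments.
    match = re.match(pattern, path)
    if match is None:
        return None
    return view("request", *match.groups(), **extra_kwargs)


# A plain list stands in for Entry.objects.all().
info_dict = {"queryset": ["entry one", "entry two"]}

try:
    dispatch(r"^(index|weblog)/$", "index/", object_list, info_dict)
except TypeError as exc:
    # The captured text "index" already filled the queryset slot
    # positionally, so the keyword argument is a second value for it.
    error_message = str(exc)
    print(error_message)
```

The exact wording of the message varies between Python versions, but it names queryset as the argument that received more than one value.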

In retrospect I suppose this isn’t the most obvious thing in the world, because the common practice in Django applications is to match keyword arguments from the URL (using the ?P construct); in fact, I don’t think I’ve ever seen real-world code which relies on matching positional arguments from a URL. But it’s important to keep in mind that — because a URL pattern is a regular expression, and any captured value will end up being an argument to the view function — using bare parentheses in a URL to specify alternate means of matching can be a risky proposition.
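For comparison, a named group delivers its value under a name rather than by position. The sketch below uses only the re module; the year group and the archive_year function are invented for illustration and are not part of the weblog example.

```python
import re


def archive_year(request, year, queryset=None):
    # Hypothetical view: year arrives by name, so it only collides with
    # the extra dictionary if that dictionary also has a "year" key.
    return "entries from %s" % year


match = re.match(r"^(?P<year>\d{4})/$", "2007/")

# groupdict() maps each named group to the text it captured.
captured = match.groupdict()

# Named captures and the extra dictionary are both passed as keywords.
result = archive_year("request", queryset=["entry"], **captured)
print(captured, result)
```

Because the captured value is bound to year by name, the position of queryset in the view’s signature no longer matters.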

If you find yourself needing to use this regex feature without the risk of accidentally screwing up your view’s arguments, use non-capturing parentheses. The example pattern above could then be written like so:

(r'^(?:index|weblog)/$', 'django.views.generic.list_detail.object_list', info_dict)

and — because the ?: construct avoids capturing anything which would then be interpreted as an argument — it would work as expected.
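The difference between the two kinds of parentheses is easy to check directly with Python’s re module. This is a small sketch, independent of Django, using the two patterns from the examples above:

```python
import re

capturing = re.match(r"^(index|weblog)/$", "weblog/")
non_capturing = re.match(r"^(?:index|weblog)/$", "weblog/")

# Both patterns match the same URLs...
assert capturing is not None and non_capturing is not None

# ...but only the first one produces a captured group.
print(capturing.groups())      # ('weblog',)
print(non_capturing.groups())  # ()
```

With nothing captured, there is nothing to pass along as a positional argument, so the only value supplied for queryset is the one in info_dict.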

The ultimate moral of the story, though, is to take care with URL patterns, and to be sure that you understand both how Python’s regular expressions work and how Django handles URL dispatch. Sound knowledge of both can prevent this sort of issue, or at least make it much easier to diagnose if you run up against it.

Comments

stubblechin
October 14, 2007

Interesting gotcha. I would never have run into it myself because I wouldn’t have tried that, but that’s why it’s interesting. It’s educational to see what sort of shapes people try to beat their code into!

Malcolm Tredinnick
October 14, 2007

I use positional parameters to views a lot when I’m writing my own view. No need for all that extra typing with the ?P and trying to keep the name synced in two places.

The general takeaway here is your last point: good reg-exp practice is not to capture when you don’t need to, so using (?:…) should be the first choice, not the last. It’s more efficient under the covers. It’s just that, historically, reg-exp engines didn’t have this feature, but that was the 1980s and early 1990s. It’s 2007 now and people don’t use punch-cards any longer, either.

Ludvig Ericson
October 24, 2007

I’ve done matching on positionals, but I’ve always thought of the regular expression part as “something I’d pass to re.search” and thus I’ve never had that issue.

I still believe there somehow needs to be a way to narrow it down to a failure in passing the aggregated arguments from the info dict and regex groups, maybe wrap the view calling in a try/except that’ll add that it failed when it was calling the view, because IMO as it is now, it isn’t immediately apparent that it failed there, especially so since the code that calls the view isn’t very self-explaining, so the traceback becomes sort of cryptic.

Just my $0.02.
