Newforms, part 2

Published November 23, 2007. Filed under: Django.

Yesterday we took a look at how Django’s newforms library works and explored the different components, how they fit together and how the overall process of form validation takes place inside a newforms form. And while that’s all useful knowledge, it’s helpful to have some practical examples to see all of the various bits in action, so today we’ll look at a simple example which shows off some of the features, building it up step-by-step.

The specific example I’ll be using here is a user-registration form. About a year ago I wrote an example of how to do this using Django’s old forms system; what we’ll build today is the newforms equivalent. If you’re interested in a more extensible and generic implementation of this feature, feel free to check out django-registration, a generic user-registration application (based on newforms) which handles this in a pretty clean way.

First things first

We’re going to need a form, so we start by importing the newforms library and creating a form class; for now we don’t have anything in it, but this at least gets it set up:

from django import newforms as forms


class RegistrationForm(forms.Form):
    pass

Like all newforms forms, this one inherits from django.newforms.Form.

Now, a form with no fields isn’t all that useful, so let’s add them. For user registration we’ll want at least three fields — username, email, and password — so let’s start with those three. Now the form looks like this:

from django import newforms as forms

class RegistrationForm(forms.Form):
    username = forms.CharField()
    email = forms.EmailField()
    password = forms.CharField()

We use CharField for the username and password, and the specialized EmailField — which validates that its input looks like an email address — for the email address.

Based on this, we could already write a view which displays the form, checks that the data is valid, then reads the information out of its cleaned_data dictionary to create a user (say, via the utility method create_user() on Django’s bundled User model). But there are some problems with this:

  1. Django’s User model requires the username to be unique, and we’re not checking for that.
  2. The User model also places a maximum length of 30 characters on the username (it becomes a VARCHAR(30) in the database), and we’re not enforcing that.
  3. The User model imposes one more requirement on the username field: it must conform to a regular expression which only permits certain alphanumeric characters.
  4. The password will be a plain input type="text", but what we really want is an input type="password".
  5. It’s generally good practice to have users type in the password twice just to make sure a typo doesn’t result in them thinking the password is something other than what it actually is.
  6. Right now, any view which uses this form needs to read out cleaned_data and manually save the new user; it’d be nice to put that into the form so that multiple views don’t have to duplicate this logic.

Validating the username

Enforcing the 30-character maximum for usernames is relatively easy, so let’s start with that. The CharField for forms accepts a max_length argument just like the one for the model CharField, so we can simply add it and the CharField will automatically validate the length of the username for us:

from django import newforms as forms

class RegistrationForm(forms.Form):
    username = forms.CharField(max_length=30)
    email = forms.EmailField()
    password = forms.CharField()

Validating the uniqueness of the username is going to involve a tiny bit more work; we could write a subclass of CharField, maybe called UsernameField, which checks the database and verifies that the username isn’t taken, but that’d probably be overkill for something like this. Instead, we can add a clean_username() method to the form class, and it will automatically be called as part of the validation of the username field. So let’s do that (and notice that we need to import the User model to perform this check):

from django import newforms as forms
from django.contrib.auth.models import User

class RegistrationForm(forms.Form):
    username = forms.CharField(max_length=30)
    email = forms.EmailField()
    password = forms.CharField()
    
    def clean_username(self):
        try:
            user = User.objects.get(username=self.cleaned_data['username'])
        except User.DoesNotExist:
            return self.cleaned_data['username']
        raise forms.ValidationError(u'This username is already taken. Please choose another.')

The logic here is pretty simple. First we look in self.cleaned_data for the username, which is guaranteed to be there whenever clean_username() is called. The clean() method of the CharField runs first; if it raises a validation error, clean_username() is never called (field-specific validation stops as soon as it hits the first ValidationError). If it doesn’t raise an error, it places the value into cleaned_data for the next stage of validation to work with.

Once we’ve read out the username, we just do a query for a User with that username; if it raises DoesNotExist, then the username isn’t in use and we return the value, but if the query does find something we raise a ValidationError with an appropriate message.
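To make that ordering concrete, here is a minimal plain-Python model of the per-field steps just described. It is only a sketch and not Django's code: ValidationError, check_length, check_not_taken, clean_field and the TAKEN set are hypothetical stand-ins for the field's clean(), the form's clean_username() and the User table.

```python
class ValidationError(Exception):
    """Stand-in for forms.ValidationError (hypothetical, not Django's class)."""


TAKEN = {"alice"}  # pretend these usernames already exist in the database


def check_length(value):
    # Plays the role of CharField.clean() with max_length=30.
    if len(value) > 30:
        raise ValidationError("Username is too long.")
    return value


def check_not_taken(value):
    # Plays the role of clean_username(); it only runs if check_length passed.
    if value in TAKEN:
        raise ValidationError("This username is already taken. Please choose another.")
    return value


def clean_field(name, raw_value, cleaned_data, errors):
    """Run the field's own check, then the field-specific hook, stopping at the first error."""
    try:
        cleaned_data[name] = check_length(raw_value)
        cleaned_data[name] = check_not_taken(cleaned_data[name])
    except ValidationError as exc:
        errors[name] = str(exc)
        cleaned_data.pop(name, None)  # a failed field leaves no value behind


cleaned, errors = {}, {}
clean_field("username", "bob", cleaned, errors)
```

After this runs, cleaned holds the username and errors is empty; calling clean_field() with "alice" or with an over-long value leaves cleaned empty and records a single error for the field.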

And it’s important to note the specific validation message used; it’d be tempting to do something like this:

raise forms.ValidationError(u'The username "%s" is already taken. Please choose another.' % self.cleaned_data['username'])

This would result in a message like

'The username "bob" is already taken. Please choose another.'

But that’s a dangerous thing to do: even though Django now has autoescaping on by default in its template system, it can be turned off, which means that we’re echoing form input directly back into the HTML and potentially opening up a cross-site scripting vulnerability. So it’s best to never echo back the value in an error message; instead use a more generic error like the “This username is already taken” message in clean_username() above.
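To see why echoing the value is risky, consider what happens when the submitted "username" is itself markup. The sketch below uses only Python's standard-library html.escape as a stand-in for template autoescaping (Django's own escaping is a separate implementation), and the hostile input is made up for illustration.

```python
from html import escape

# A hostile "username" submitted through the form (made-up example input).
submitted = '<script>alert("xss")</script>'

# Interpolating the raw value puts live markup into the error message.
unsafe_message = 'The username "%s" is already taken.' % submitted

# With escaping applied, the markup is neutralised before it reaches the page.
escaped_message = escape(unsafe_message)

# The generic message never contains user input, so escaping leaves it unchanged.
generic_message = 'This username is already taken. Please choose another.'
```

If escaping is ever switched off, unsafe_message goes to the browser as-is, while generic_message stays harmless either way.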

Finally, we need to check that the username corresponds to a regular expression which only permits certain alphanumeric characters. There are two ways we can do this:

  1. Switch to using RegexField instead of CharField, and pass in the regular expression to have it automatically validated.
  2. Keep the CharField, but add the regular-expression validation in clean_username().

In either case, we need to import the regular expression Django uses for this validation, which lives in django.core.validators and is called alnum_re:

from django.core.validators import alnum_re

Then to switch to a RegexField all we have to do is change the definition of the username field like so:

username = forms.RegexField(regex=alnum_re, max_length=30)

Or to put the validation into clean_username():

def clean_username(self):
    # Check the format first; if it fails we never touch the database.
    if not alnum_re.search(self.cleaned_data['username']):
        raise forms.ValidationError(u'Usernames can only contain letters, numbers and underscores')
    try:
        user = User.objects.get(username=self.cleaned_data['username'])
    except User.DoesNotExist:
        # No existing user has this name, so it is free to use.
        return self.cleaned_data['username']
    raise forms.ValidationError(u'This username is already taken. Please choose another.')

We do the regular-expression check first, because if it fails there’s no point trying to look up the username in the database, which saves a query.
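As a quick illustration of what such a pattern accepts, the sketch below compiles a pattern on the assumption that alnum_re is equivalent to ^\w+$ (check django.core.validators in your Django version to confirm). The re.ASCII flag and the is_valid_username helper are also assumptions, added so the example only accepts ASCII letters, digits and underscores on Python 3.

```python
import re

# Assumed equivalent of django.core.validators.alnum_re; verify against your Django version.
username_re = re.compile(r'^\w+$', re.ASCII)


def is_valid_username(value):
    """Return True if the pattern matches the given value."""
    return username_re.search(value) is not None


# Print each candidate alongside whether it passes the check.
for candidate in ('bob', 'bob_42', 'bob smith', 'bob!', ''):
    print(repr(candidate), is_valid_username(candidate))
```

Only the first two candidates pass: a space, punctuation, or an empty string all cause the match to fail.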

Either of these methods (using RegexField or putting the validation in clean_username()) is fine, and ultimately it’s your choice how you handle it. I’ve gone back and forth on this a couple of times in django-registration; for now I’m keeping the CharField simply because that imposes one less restriction on subclasses (which might want to add their own additional restrictions on usernames). If you’re ever implementing this yourself, feel free to use whichever solution works best for you.

Improving and validating the password field

For the password, we need two separate changes:

  1. We want to have it generate an input type="password" for security.
  2. We want to make the user enter the password twice to catch typos.

The first is easy to do: we can simply tell the password field to use the PasswordInput widget instead of the default TextInput used by CharField:

password = forms.CharField(widget=forms.PasswordInput())

To accomplish the second change — entering the password twice and verifying it was entered the same both times — we’ll need another field of exactly the same type. So let’s rename the password field to password1, and add a password2 of exactly the same type:

password1 = forms.CharField(widget=forms.PasswordInput())
password2 = forms.CharField(widget=forms.PasswordInput())

Another useful trick is an optional argument accepted by PasswordInput: render_value, which defaults to True and determines whether the widget will have a value filled in when the form is re-displayed after a validation error. So if you’d like to have those two inputs rendered “empty” after a validation error, you can pass render_value=False:

password1 = forms.CharField(widget=forms.PasswordInput(render_value=False))
password2 = forms.CharField(widget=forms.PasswordInput(render_value=False))

And now that the password fields are using the correct HTML, we just need to validate that the values entered in the two fields match. Once again, there are two ways we could do this:

  1. We could add a clean_password2() method which raises a ValidationError if password1 and password2 don’t match.
  2. Since it involves multiple fields, we could implement this in the form’s clean() method.

This is another one that I tend to waffle on; in the last released version of django-registration, it’s implemented as clean_password2(), but in the development version I’m experimenting with doing it in the form’s clean() method. Since we’ve already seen a custom validation method for a specific field (the clean_username() method above), let’s look at how it works in clean():

def clean(self):
    if 'password1' in self.cleaned_data and 'password2' in self.cleaned_data:
        if self.cleaned_data['password1'] != self.cleaned_data['password2']:
            raise forms.ValidationError(u'You must type the same password each time.')
    return self.cleaned_data

There are two important things to note here:

  1. With clean_username() above we were able to rely on the value already being in cleaned_data, because field-specific validation stops after the first ValidationError is raised for that field. The form-level clean() gets called regardless of prior errors, however, which means we can’t rely on the values of the password1 and password2 fields being in cleaned_data. So before attempting anything with those values, we need to make sure we have them.
  2. The clean() method, if it doesn’t raise a ValidationError, should simply return cleaned_data; if other fields already raised errors this won’t interfere with them, so it’s safe to do that regardless of whether there were values for the password fields.

Also, the error message raised here if the passwords don’t match won’t end up associated with a specific field. If we’d done this as a clean_password2() method, it would have ended up in the form’s errors dictionary under the key password2, but an error from the form’s clean() ends up in a special location in the errors dictionary which needs to be accessed via the method non_field_errors(). So in a template, to display this error message we’d use

{{ form.non_field_errors }}

Instead of (for example):

{{ form.errors.password2 }}
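The whole flow described in this section can be modelled in a few lines of plain Python. This is a simplified sketch and not Django's implementation: ValidationError, required, full_clean and the NON_FIELD key are hypothetical names chosen for illustration.

```python
class ValidationError(Exception):
    """Stand-in for forms.ValidationError (hypothetical)."""


NON_FIELD = "non_field"  # hypothetical key for errors not tied to one field


def required(value):
    # Minimal field-level check: reject empty input.
    if not value:
        raise ValidationError("This field is required.")
    return value


def clean(cleaned_data):
    # Form-level check: only compare when both values survived field validation.
    if "password1" in cleaned_data and "password2" in cleaned_data:
        if cleaned_data["password1"] != cleaned_data["password2"]:
            raise ValidationError("You must type the same password each time.")
    return cleaned_data


def full_clean(data):
    cleaned_data, errors = {}, {}
    for name in ("password1", "password2"):
        try:
            cleaned_data[name] = required(data.get(name, ""))
        except ValidationError as exc:
            errors[name] = str(exc)
    # The form-level step runs even if a field above already failed.
    try:
        cleaned_data = clean(cleaned_data)
    except ValidationError as exc:
        errors[NON_FIELD] = str(exc)
    return cleaned_data, errors


_, mismatch = full_clean({"password1": "secret", "password2": "secrte"})
_, missing = full_clean({"password1": "secret"})
```

In the mismatch case the only error sits under the non-field key. In the missing case clean() still runs, finds no password2 to compare, and adds nothing beyond the field's own error.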

Saving the User from the form

All that’s left to implement now is saving a User object from the form; we’d like to get this into the form class if at all possible, because that means views don’t have to re-implement and duplicate the logic for this. We’ll handle it in a method named save(), which is somewhat conventional for Django forms (there are helper methods in newforms for automatically generating a form from a model class or instance, and they use save() as well). Here’s what it looks like:

def save(self):
    new_user = User.objects.create_user(username=self.cleaned_data['username'],
                                        email=self.cleaned_data['email'],
                                        password=self.cleaned_data['password1'])
    return new_user

This simply uses User.objects.create_user() — a helper method which takes a username, email address and password, and creates, saves and returns a User object — to create the User, then returns it in case the view which used the form wants access to the object.

Views which use the form should be checking is_valid() before trying to call save(), but if you’re paranoid you can enforce this by adding a check inside save() and raising an appropriate exception (in this case, ValueError is the best candidate) if someone’s trying to save from an invalid form:

if not self.is_valid():
    raise ValueError("Cannot save from an invalid form")

Whether you do this or not is up to you, and depends largely on how much you trust people who’ll be writing views to go with your forms.

Other useful touches, and the final form

Each newforms field lets you attach a label, via the keyword argument label; if this isn’t specified it will be generated from the name of the field. It’s generally a good idea to supply labels for each field in your form, since that helps to make them a bit friendlier (a label like “password1” isn’t all that helpful for a user) and also opens up the ability to mark them for translation using Django’s internationalization framework.

Each widget also accepts an argument called attrs, which is a dictionary that becomes HTML attributes and values on the rendered form input; this is often useful for adding HTML class names, for example, and so you might want to take advantage of that as well.
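As a rough picture of what attrs does, the sketch below turns a dictionary into HTML attributes on an input tag. render_input is a hypothetical helper written for this illustration and is not part of newforms; Django's real widgets may order or format attributes differently.

```python
from html import escape


def render_input(name, input_type="text", attrs=None):
    """Build an input tag, adding each attrs entry as an HTML attribute."""
    final_attrs = {"type": input_type, "name": name}
    final_attrs.update(attrs or {})
    rendered = " ".join(
        '%s="%s"' % (key, escape(str(value))) for key, value in final_attrs.items()
    )
    return "<input %s />" % rendered


# Extra attributes, such as a CSS class, ride along with the standard ones.
tag = render_input("username", attrs={"class": "required", "maxlength": 30})
```

With a real widget the same dictionary would be passed as the attrs argument when constructing it.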

Here’s the final version of the form, with labels added:

from django import newforms as forms
from django.core.validators import alnum_re
from django.contrib.auth.models import User


class RegistrationForm(forms.Form):
    username = forms.CharField(label=u'Username', max_length=30)
    email = forms.EmailField(label=u'E-mail address')
    password1 = forms.CharField(label=u'Password',
                                widget=forms.PasswordInput(render_value=False))
    password2 = forms.CharField(label=u'Password (again)',
                                widget=forms.PasswordInput(render_value=False))

    def clean_username(self):
        if not alnum_re.search(self.cleaned_data['username']):
            raise forms.ValidationError(u'Usernames can only contain letters, numbers and underscores')
        try:
            user = User.objects.get(username=self.cleaned_data['username'])
        except User.DoesNotExist:
            return self.cleaned_data['username']
        raise forms.ValidationError(u'This username is already taken. Please choose another.')

    def clean(self):
        if 'password1' in self.cleaned_data and 'password2' in self.cleaned_data:
            if self.cleaned_data['password1'] != self.cleaned_data['password2']:
                raise forms.ValidationError(u'You must type the same password each time.')
        return self.cleaned_data

    def save(self):
        new_user = User.objects.create_user(username=self.cleaned_data['username'],
                                            email=self.cleaned_data['email'],
                                            password=self.cleaned_data['password1'])
        return new_user

And here’s a simple view which uses it, checking the request method to see if data was submitted, displaying errors if the form is invalid and redirecting after a successful registration:

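from django.http import HttpResponseRedirect
from django.shortcuts import render_to_response

# RegistrationForm is the form class defined above.


def register(request):
    if request.method == 'POST':
        # Bind the submitted data to the form and validate it.
        form = RegistrationForm(request.POST)
        if form.is_valid():
            new_user = form.save()
            return HttpResponseRedirect('/users/%s/' % new_user.username)
    else:
        # No data submitted yet: show an empty form.
        form = RegistrationForm()
    return render_to_response('registration.html',
                              { 'form': form })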

One other thing to note: this will almost never be a problem for a real-world site, but there is always a tiny possibility that two people might try to register the same username at the same time. Assuming an extremely precise coincidence — two forms would have to be submitted at almost exactly the same moment, probably within less than a second of one another — it’s possible that one would create the User with that username after the other had already finished validation but before it managed to save. The probability of two identical submissions coming in close enough to one another to trigger this is extremely low, so you’ll probably never have to worry about it, but if you’re processing huge numbers of registrations in short periods of time it might be worth looking into workarounds for this (transaction isolation at the database level is the most effective).
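One complementary safeguard, separate from the isolation-level approach mentioned above, is to let the database's uniqueness constraint act as the final arbiter and handle the resulting error. The sketch below simulates the race with Python's built-in sqlite3 module; the users table and the username_available and register helpers are hypothetical stand-ins, not Django APIs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username VARCHAR(30) UNIQUE NOT NULL)")


def username_available(username):
    # The validation step: look the name up, much as clean_username() does.
    row = conn.execute(
        "SELECT 1 FROM users WHERE username = ?", (username,)
    ).fetchone()
    return row is None


def register(username):
    """Insert the user; return False if another request won the race."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO users (username) VALUES (?)", (username,))
    except sqlite3.IntegrityError:
        return False
    return True


# Both "requests" validate before either one saves.
first_ok = username_available("bob")
second_ok = username_available("bob")

first_saved = register("bob")
second_saved = register("bob")  # the UNIQUE constraint rejects the duplicate
```

Both validation checks pass, but only the first insert succeeds. In a Django view the analogous move would be to catch the database's integrity error around form.save() and re-display the form; the exact exception class depends on the database backend and Django version.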

And that’s a wrap

I’ve been working on a project recently which involves a bit more advanced use of newforms (including generating dynamic forms on-the-fly), but it’s not quite ready for public review yet; once it is, I’d like to come back and add a third article covering that topic. For now, though, you should have a pretty good understanding of both the theoretical way that newforms works and practical methods for designing forms and making use of the various validation features.