Newforms, part 2

An entry published by James Bennett on November 23, 2007. Part of the category Django. 23 comments posted.

Yesterday we took a look at how Django’s newforms library works and explored the different components, how they fit together and how the overall process of form validation takes place inside a newforms form. And while that’s all useful knowledge, it’s helpful to have some practical examples to see all of the various bits in action, so today we’ll look at a simple example which shows off some of the features, building it up step-by-step.

The specific example I’ll be using here is a user-registration form. About a year ago I wrote an example of how to do this using Django’s old forms system; what we’ll build today is the newforms equivalent. If you’re interested in a more extensible and generic implementation of this feature, feel free to check out django-registration, a generic user-registration application (based on newforms) which handles this in a pretty clean way.

First things first

We’re going to need a form, so we start by importing the newforms library and creating a form class; for now we don’t have anything in it, but this at least gets it set up:

from django import newforms as forms


class RegistrationForm(forms.Form):
    pass

Like all newforms forms, this one inherits from django.newforms.Form.

Now, a form with no fields isn’t all that useful, so let’s add them. For user registration we’ll want at least three fields — username, email, and password — so let’s start with those three. Now the form looks like this:

from django import newforms as forms

class RegistrationForm(forms.Form):
    username = forms.CharField()
    email = forms.EmailField()
    password = forms.CharField()

We use CharField for the username and password, and the specialized EmailField — which validates that its input looks like an email address — for the email address.

Based on this, we could already write a view which displays the form, checks that the data is valid, then reads the information out of its cleaned_data dictionary to create a user (say, via the utility method create_user() on Django’s bundled User model). But there are some problems with this:

  1. Django’s User model requires the username to be unique, and we’re not checking for that.
  2. The User model also places a maximum length of 30 characters on the username (it becomes a VARCHAR(30) in the database), and we’re not enforcing that.
  3. The User model imposes one more requirement on the username field: it must conform to a regular expression which only permits certain alphanumeric characters.
  4. The password will be a plain input type="text", but what we really want is an input type="password".
  5. It’s generally good practice to have users type in the password twice just to make sure a typo doesn’t result in them thinking the password is something other than what it actually is.
  6. Right now, any view which uses this form needs to read out cleaned_data and manually save the new user; it’d be nice to put that into the form so that multiple views don’t have to duplicate this logic.

Validating the username

Enforcing the 30-character maximum for usernames is relatively easy, so let’s start with that. The CharField for forms accepts a max_length argument just like the one for the model CharField, so we can simply add it and the CharField will automatically validate the length of the username for us:

from django import newforms as forms

class RegistrationForm(forms.Form):
    username = forms.CharField(max_length=30)
    email = forms.EmailField()
    password = forms.CharField()

Validating the uniqueness of the username is going to involve a tiny bit more work; we could write a subclass of CharField, maybe called UsernameField, which checks the database and verifies that the username isn’t taken, but that’d probably be overkill for something like this. Instead, we can add a clean_username() method to the form class, and it will automatically be called as part of the validation of the username field. So let’s do that (and notice that we need to import the User model to perform this check):

from django import newforms as forms
from django.contrib.auth.models import User

class RegistrationForm(forms.Form):
    username = forms.CharField(max_length=30)
    email = forms.EmailField()
    password = forms.CharField()
    
    def clean_username(self):
        try:
            user = User.objects.get(username=self.cleaned_data['username'])
        except User.DoesNotExist:
            return self.cleaned_data['username']
        raise forms.ValidationError(u'This username is already taken. Please choose another.')

The logic here is pretty simple. First we look in self.cleaned_data for the username, which is guaranteed to be there whenever clean_username() is called. The clean() method of the CharField runs first; if it raises a validation error, clean_username() is never called (field-specific validation stops as soon as it hits the first ValidationError). If it doesn’t raise an error, it places the value into cleaned_data for the next stage of validation to work with.

Once we’ve read out the username, we just do a query for a User with that username; if it raises DoesNotExist, then the username isn’t in use and we return the value, but if the query does find something we raise a ValidationError with an appropriate message.
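To make that ordering concrete, here’s a minimal sketch in plain Python. It isn’t Django’s actual implementation: ValidationError, clean_char() and validate_username() are hypothetical stand-ins, and a set of taken names plays the part of the database query. The point is just that the uniqueness check only runs once the field’s own validation has succeeded.

```python
class ValidationError(Exception):
    """Stand-in for forms.ValidationError (hypothetical, not Django's class)."""


def clean_char(value, max_length=None):
    # Stand-in for CharField.clean(): a required value with an optional length cap.
    if not value:
        raise ValidationError('This field is required.')
    if max_length is not None and len(value) > max_length:
        raise ValidationError('Ensure this value has at most %d characters.' % max_length)
    return value


def validate_username(raw_value, taken_usernames):
    """Run the field's own clean step, then the clean_username()-style check."""
    cleaned_data = {}
    errors = {}
    try:
        # Step 1: the field's own validation.
        cleaned_data['username'] = clean_char(raw_value, max_length=30)
        # Step 2: only reached if step 1 passed, so the key is guaranteed to exist.
        if cleaned_data['username'] in taken_usernames:
            raise ValidationError('This username is already taken. Please choose another.')
    except ValidationError as exc:
        # The first error wins; the value is dropped from cleaned_data.
        errors['username'] = str(exc)
        cleaned_data.pop('username', None)
    return cleaned_data, errors
```

Calling validate_username() with a 31-character name reports the length error and never consults the set of taken names, which mirrors the way clean_username() is skipped when the CharField has already complained.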

And it’s important to note the specific validation message used; it’d be tempting to do something like this:

raise forms.ValidationError(u'The username "%s" is already taken. Please choose another.' % self.cleaned_data['username'])

This would result in a message like

'The username "bob" is already taken. Please choose another.'

But that’s a dangerous thing to do: even though Django now has autoescaping on by default in its template system, it can be turned off, which means that we’re echoing form input directly back into the HTML and potentially opening up a cross-site scripting vulnerability. So it’s best to never echo back the value in an error message; instead use a more generic error like the “This username is already taken” message in clean_username() above.
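If you want to see the difference for yourself, here’s a small sketch using only the standard library of a current Python. The render_error() function is a hypothetical stand-in for a template rendering one error message, not Django’s template engine, and its autoescape flag just imitates having autoescaping on or off.

```python
from html import escape


def taken_message(username):
    # The tempting version: interpolate the submitted value into the error.
    return 'The username "%s" is already taken. Please choose another.' % username


def render_error(message, autoescape=True):
    # Hypothetical stand-in for a template rendering a single error message.
    if autoescape:
        message = escape(message)
    return '<li>%s</li>' % message


submitted = '<script>alert(1)</script>'

# With escaping off, the submitted markup lands in the page untouched.
unsafe = render_error(taken_message(submitted), autoescape=False)

# With escaping on, the angle brackets become harmless entities.
safe = render_error(taken_message(submitted), autoescape=True)
```

The generic message sidesteps the whole question: it contains nothing the user typed, so it’s safe whether or not escaping happens to be on.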

Finally, we need to check that the username corresponds to a regular expression which only permits certain alphanumeric characters. There are two ways we can do this:

  1. Switch to using RegexField instead of CharField, and pass in the regular expression to have it automatically validated.
  2. Keep the CharField, but add the regular-expression validation in clean_username().

In either case, we need to import the regular expression Django uses for this validation, which lives in django.core.validators and is called alnum_re:

from django.core.validators import alnum_re

Then to switch to a RegexField all we have to do is change the definition of the username field like so:

username = forms.RegexField(regex=alnum_re, max_length=30)

Or to put the validation into clean_username():

def clean_username(self):
    # Cheap check first: reject bad characters before touching the database.
    if not alnum_re.search(self.cleaned_data['username']):
        raise forms.ValidationError(u'Usernames can only contain letters, numbers and underscores')
    try:
        # username__exact is the explicit spelling of the default lookup;
        # it behaves the same as username=...
        user = User.objects.get(username__exact=self.cleaned_data['username'])
    except User.DoesNotExist:
        return self.cleaned_data['username']
    raise forms.ValidationError(u'This username is already taken. Please choose another.')

We do the regular-expression check first, because if it fails there’s no point trying to look up the username in the database, which saves a query.

Either of these methods — using RegexField or putting the validation in clean_username() — is fine, and ultimately it’s your choice as to how you handle it; I’ve gone back and forth on this a couple times in django-registration, and for now I’m keeping the CharField simply because that imposes one less restriction on subclasses (which might want to add their own additional restrictions on usernames), but if you’re ever implementing this for yourself you should feel free to use whichever solution works best for you.
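For a feel of what the regular-expression check accepts, here’s a sketch that doesn’t need Django at all. I’m assuming alnum_re is equivalent to the pattern ^\w+$ (check django.core.validators in your version to be sure); ALNUM_RE and is_valid_username() are names made up for this example, and the re.ASCII flag is there so a current Python restricts \w to ASCII letters, digits and underscores.

```python
import re

# Assumed equivalent of django.core.validators.alnum_re; verify against your Django.
ALNUM_RE = re.compile(r'^\w+$', re.ASCII)


def is_valid_username(value):
    # Same shape as the check in clean_username(): search() returns None on failure.
    return ALNUM_RE.search(value) is not None
```

With this pattern, is_valid_username('bob_42') passes, while 'bob smith', 'bob!' and the empty string are all rejected.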

Improving and validating the password field

For the password, we need two separate changes:

  1. We want to have it generate an input type="password" for security.
  2. We want to make the user enter the password twice to catch typos.

The first is easy to do: we can simply tell the password field to use the PasswordInput widget instead of the default TextInput used by CharField:

password = forms.CharField(widget=forms.PasswordInput())

To accomplish the second change — entering the password twice and verifying it was entered the same both times — we’ll need another field of exactly the same type. So let’s rename the password field to password1, and add a password2 of exactly the same type:

password1 = forms.CharField(widget=forms.PasswordInput())
password2 = forms.CharField(widget=forms.PasswordInput())

Another useful trick is an optional argument accepted by PasswordInput: render_value, which defaults to True and determines whether the widget will have a value filled in when the form is re-displayed after a validation error. So if you’d like to have those two inputs rendered “empty” after a validation error, you can pass render_value=False:

password1 = forms.CharField(widget=forms.PasswordInput(render_value=False))
password2 = forms.CharField(widget=forms.PasswordInput(render_value=False))

And now that the password fields are using the correct HTML, we just need to validate that the values entered in the two fields match. Once again, there are two ways we could do this:

  1. We could add a clean_password2() method which raises a ValidationError if password1 and password2 don’t match.
  2. Since it involves multiple fields, we could implement this in the form’s clean() method.

This is another one that I tend to waffle on; in the last released version of django-registration, it’s implemented as clean_password2(), but in the development version I’m experimenting with doing it in the form’s clean() method. Since we’ve already seen a custom validation method for a specific field (the clean_username() method above), let’s look at how it works in clean():

def clean(self):
    if 'password1' in self.cleaned_data and 'password2' in self.cleaned_data:
        if self.cleaned_data['password1'] != self.cleaned_data['password2']:
            raise forms.ValidationError(u'You must type the same password each time.')
    return self.cleaned_data

There are two important things to note here:

  1. With clean_username() above we were able to rely on the value already being in cleaned_data, because field-specific validation stops after the first ValidationError is raised for that field. The form-level clean() gets called regardless of prior errors, however, which means we can’t rely on the values of the password1 and password2 fields being in cleaned_data. So before attempting anything with those values, we need to make sure we have them.
  2. The clean() method, if it doesn’t raise a ValidationError, should simply return cleaned_data; if other fields already raised errors this won’t interfere with them, so it’s safe to do that regardless of whether there were values for the password fields.
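Both points can be tried out on a plain dictionary. This is only a sketch: ValidationError is a stand-in class and clean_passwords() is a hypothetical free function with the same logic as the clean() method above, so you can feed it the different shapes cleaned_data might have.

```python
class ValidationError(Exception):
    """Stand-in for forms.ValidationError (hypothetical, not Django's class)."""


def clean_passwords(cleaned_data):
    # Only compare when both values survived field-level validation.
    if 'password1' in cleaned_data and 'password2' in cleaned_data:
        if cleaned_data['password1'] != cleaned_data['password2']:
            raise ValidationError('You must type the same password each time.')
    # In every other case, hand the data back untouched.
    return cleaned_data
```

If password2 failed its own validation and is missing, clean_passwords({'password1': 'secret'}) just returns the dictionary; without the membership check it would die with a KeyError instead of producing a useful validation message.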

Also, the error message raised here if the passwords don’t match won’t end up associated with a specific field. If we’d done this as a clean_password2() method, it would have ended up in the form’s errors dictionary under the key password2, but an error from the form’s clean() ends up in a special location in the errors dictionary which needs to be accessed via the method non_field_errors(). So in a template, to display this error message we’d use

{{ form.non_field_errors }}

Instead of (for example):

{{ form.errors.password2 }}
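Here’s a rough picture of where the two kinds of errors end up, again in plain Python rather than real newforms code. As far as I know the special key is '__all__', but treat that as an assumption; record_error() and non_field_errors() are hypothetical helpers written for this sketch.

```python
# Assumed value of the special key; check your Django version.
NON_FIELD_ERRORS = '__all__'


def record_error(errors, message, field=None):
    # Field-specific errors go under the field name, form-level ones under the special key.
    key = field if field is not None else NON_FIELD_ERRORS
    errors.setdefault(key, []).append(message)
    return errors


def non_field_errors(errors):
    # Same idea as form.non_field_errors(): only the form-level messages.
    return errors.get(NON_FIELD_ERRORS, [])


errors = {}
# What a clean_password2() method would have produced:
record_error(errors, 'You must type the same password each time.', field='password2')
# What the form-level clean() produces:
record_error(errors, 'You must type the same password each time.')
```

After both calls, errors has one message under 'password2' and one under the special key, and non_field_errors(errors) returns only the latter.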

Saving the User from the form

All that’s left to implement now is saving a User object from the form; we’d like to get this into the form class if at all possible, because that means views don’t have to re-implement and duplicate the logic for this. We’ll handle it in a method named save(), which is somewhat conventional for Django forms (there are helper methods in newforms for automatically generating a form from a model class or instance, and they use save() as well). Here’s what it looks like:

def save(self):
    new_user = User.objects.create_user(username=self.cleaned_data['username'],
                                        email=self.cleaned_data['email'],
                                        password=self.cleaned_data['password1'])
    return new_user

This simply uses User.objects.create_user() — a helper method which takes a username, email address and password, and creates, saves and returns a User object — to create the User, then returns it in case the view which used the form wants access to the object.

Views which use the form should be checking is_valid() before trying to call save(), but if you’re paranoid you can enforce this by adding a check inside save() and raising an appropriate exception (in this case, ValueError is the best candidate) if someone’s trying to save from an invalid form:

if not self.is_valid():
    raise ValueError("Cannot save from an invalid form")

Whether you do this or not is up to you, and depends largely on how much you trust people who’ll be writing views to go with your forms.

Other useful touches, and the final form

Each newforms field lets you attach a label, via the keyword argument label; if this isn’t specified it will be generated from the name of the field. It’s generally a good idea to supply labels for each field in your form, since that helps to make them a bit friendlier (a label like “password1” isn’t all that helpful for a user) and also opens up the ability to mark them for translation using Django’s internationalization framework.

Each widget also accepts an argument called attrs, which is a dictionary that becomes HTML attributes and values on the rendered form input; this is often useful for adding HTML class names, for example, and so you might want to take advantage of that as well.
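To show roughly what those two features do, here’s a sketch that imitates them without Django. Neither function is newforms code: default_label() approximates how a label is derived from a field name, and render_input() approximates how an attrs dictionary turns into HTML attributes (sorted here only to keep the output predictable).

```python
from html import escape


def default_label(field_name):
    # Approximation of the automatic label: capitalize, underscores become spaces.
    return field_name[0].upper() + field_name[1:].replace('_', ' ')


def render_input(name, input_type='text', attrs=None):
    # Merge the widget's attrs over the basic attributes, then flatten to HTML.
    final_attrs = {'type': input_type, 'name': name}
    final_attrs.update(attrs or {})
    flat = ''.join(' %s="%s"' % (key, escape(str(value)))
                   for key, value in sorted(final_attrs.items()))
    return '<input%s />' % flat
```

So default_label('password1') gives 'Password1', which is exactly the kind of unhelpful label worth overriding, and render_input('username', attrs={'class': 'required'}) gives <input class="required" name="username" type="text" />.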

Here’s the final version of the form, with labels added:

from django import newforms as forms
from django.core.validators import alnum_re
from django.contrib.auth.models import User


class RegistrationForm(forms.Form):
    username = forms.CharField(label=u'Username', max_length=30)
    email = forms.EmailField(label=u'E-mail address')
    password1 = forms.CharField(label=u'Password',
                                widget=forms.PasswordInput(render_value=False))
    password2 = forms.CharField(label=u'Password (again)',
                                widget=forms.PasswordInput(render_value=False))

    def clean_username(self):
        if not alnum_re.search(self.cleaned_data['username']):
            raise forms.ValidationError(u'Usernames can only contain letters, numbers and underscores')
        try:
            user = User.objects.get(username=self.cleaned_data['username'])
        except User.DoesNotExist:
            return self.cleaned_data['username']
        raise forms.ValidationError(u'This username is already taken. Please choose another.')

    def clean(self):
        if 'password1' in self.cleaned_data and 'password2' in self.cleaned_data:
            if self.cleaned_data['password1'] != self.cleaned_data['password2']:
                raise forms.ValidationError(u'You must type the same password each time.')
        return self.cleaned_data

    def save(self):
        new_user = User.objects.create_user(username=self.cleaned_data['username'],
                                            email=self.cleaned_data['email'],
                                            password=self.cleaned_data['password1'])
        return new_user

And here’s a simple view which uses it, checking the request method to see if data was submitted, displaying errors if the form is invalid and redirecting after a successful registration:

from django.http import HttpResponseRedirect
from django.shortcuts import render_to_response

# RegistrationForm is the form class defined above.

def register(request):
    if request.method == 'POST':
        form = RegistrationForm(request.POST)
        if form.is_valid():
            new_user = form.save()
            return HttpResponseRedirect('/users/%s/' % new_user.username)
    else:
        form = RegistrationForm()
    return render_to_response('registration.html',
                              { 'form': form })

One other thing to note: this will almost never be a problem for a real-world site, but there is always a tiny possibility that two people might try to register the same username at the same time. Assuming an extremely precise coincidence — two forms would have to be submitted at almost exactly the same moment, probably within less than a second of one another — it’s possible that one would create the User with that username after the other had already finished validation but before it managed to save. The probability of two identical submissions coming in close enough to one another to trigger this is extremely low, so you’ll probably never have to worry about it, but if you’re processing huge numbers of registrations in short periods of time it might be worth looking into workarounds for this (transaction isolation at the database level is the most effective).
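One safety net worth sketching is different from transaction isolation: let the database’s unique constraint make the final decision and catch the error it raises. The example below uses an in-memory SQLite database as a stand-in for the real one; the users table and the create_user() function are made up for this sketch and have nothing to do with Django’s User model.

```python
import sqlite3


def create_user(conn, username):
    """Insert a user; return False if the unique constraint rejects the name."""
    try:
        with conn:  # commits on success, rolls back if the insert fails
            conn.execute('INSERT INTO users (username) VALUES (?)', (username,))
    except sqlite3.IntegrityError:
        # Somebody else got there first, even though validation passed earlier.
        return False
    return True


conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE users (username VARCHAR(30) UNIQUE NOT NULL)')

first = create_user(conn, 'bob')   # the registration that wins the race
second = create_user(conn, 'bob')  # the one that validated but saved too late
```

In a real view you’d treat the failed insert the same way as a failed clean_username(): re-display the form with the “already taken” message rather than letting the exception turn into a server error.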

And that’s a wrap

I’ve been working on a project recently which involves a bit more advanced use of newforms (including generating dynamic forms on-the-fly), but it’s not quite ready for public review yet; once it is, I’d like to come back and add a third article covering that topic. For now, though, you should have a pretty good understanding of both the theoretical way that newforms works and practical methods for designing forms and making use of the various validation features.

On November 24, 2007, Sean Stoops said:

In your first code-block under ‘Saving the User from the form’, I believe you have a typo. You have User.objects.create( … ) instead of User.objects.create_user( … ). This has been a really nice writeup on newforms, however. Thanks for taking the time to thoroughly explain all this.

On November 24, 2007, kevin said:

it is very good!

On November 24, 2007, Jakub Stolarski said:

Under “def clean()” block there is: “There are two mportant things to note here:” should be “important” :)

And one question. You wrote that we shouldn’t use “%s” in error messages, because it isn’t escaped. But this form with error message will get the same person, who put malicious code, so we shouldn’t bother. Am I right?

On November 24, 2007, Amit Upadhyay said:

You should have taken this opportunity to explain how to combine the two password fields into one MatchingPasswordField, or should we look forward to part 3? :-)

On November 24, 2007, David said:

James, it looks like the labels you mentioned for the final version didn’t make it in.

On November 24, 2007, James Bennett said:

Jakub, no, you’re not right at all about the XSS problem ;)

Generally, you should always assume that an attacker will find a way to exploit a potential XSS hole; in this specific case the easiest attack would involve a combination of XSS with other techniques, but it’s certainly possible that an attacker could cause malicious script to execute in the context of your site’s registration form.

On November 24, 2007, James Bennett said:

Amit, I don’t really see a need to combine them into one field for a one-off use like this.

On November 24, 2007, b23 said:

thanks, great job again!

you start with the name “usename” (without “r”) and end with “username”, maybe this could confuse.

On November 24, 2007, Dan said:

Thanks very much for this informative article. However I’m still struggling with something that is probably very simple. How would you use newforms to handle an object that has related objects?

A good example of this is the “Poll” application given in the Django tutorial. How would you re-write the “vote” view using newforms?

http://www.djangoproject.com/documentation/tutorial04/#write-a-simple-form

On November 24, 2007, James Bennett said:

Dan, newforms includes a ModelChoiceField which accepts a QuerySet from which it will generate choices; the value returned from that field will be a model object suitable for assigning to a ForeignKey on your model.

On November 24, 2007, Rudolph said:

You mention that there is a tiny possibility that two people might try to register the same username at almost same time. A more probable problem is someone double clicking on the submit button (I have seen this several times). The first request will be processed and succeeds, the second request will validate (because the first request is not committed yet) but just before Django commits this request the database will raise an exception because of a duplicate key violation. You can solve it at two levels (that I know of):

  * a piece of javascript that replaces the submit button with something else (image, or a piece of text) at the onclick event, this prevents double clicks for all people that have Javascript enabled,
  * catch the exception of the duplicate key, check if it’s the same user submitting, get the user object created at the previous request and continue with the code. It’s very important to be sure that it’s the same visitor, the session id might help.

The first solution is very easy to implement but not bulletproof. I’m thinking about a generic way to solve this in the form processing code.

On November 24, 2007, Dan said:

Either I wasn’t specific enough or I’m totally clueless… Let me try again…

In the case of the tutorial, the Poll object has no ForeignKey. Instead, the model Choices has a foreign key that relates back to Poll. What I want is a form for a Poll object that also contains a representation of any Choices objects that happen to be related to my Poll object.

Is there a simple way to do this (a la form_from_model or form_from_instance), or do I have to build a custom Poll form class by hand?

On November 24, 2007, Thejaswi Puthraya said:

Aagh!!! I too had written a post on custom newforms-fields and their validation at http://thejaswi.info/blog/2007/11/18/ recently but your article is brilliant. Your blog is a must for all beginners and pros alike. Your blog is an additional django doc. Keep up the great work!!!

On November 25, 2007, heru said:

FREAKIN AWESOME

On November 25, 2007, noname said:

Great posts.

It would be very nice if you would write a short introduction on how to use ModelChoiceField. I’m completely new to Django and Python and this is the one thing that I can’t get working: How should I pass the queryset to a particular form instance to get it working?

This doesn’t seem to work:

def __init__(self, queryset):
    self.question = forms.ModelChoiceField(queryset)
    super(SomeForm, self).__init__()

On November 27, 2007, Amit Upadhyay said:

James, sure its not needed, but it serves as a good example for multi field, everyone understands the validation and the motivation in this case, and can be reused at least among registration and change password forms, so there is a little bit of use too. Or ditch it, but do a article on multi-field and custom widgets please! :-)

On November 27, 2007, rp said:

“But that’s a dangerous thing to do: even though Django now has autoescaping on by default in its template system, it can be turned off, which means that we’re echoing form input directly back into the HTML and potentially opening up a cross-site scripting vulnerability.”

Any safety concerns in using some simple custom replacement like:

lt; for every

to achieve echoing? (hope i get this displayed correctly into my comment)

On November 27, 2007, rp said:

I meant the html “ampersand-l-t-;” for every “less than character”

On November 28, 2007, Joshua Works said:

Hey James,

I noticed all the oldforms complex validators are still in validators.py (RequiredIfOtherFieldEquals, e.g.), but there is no documentation on their use with newforms. Is it easy to take advantage of these still? Could you share an example? Thanks for all your work.

On November 28, 2007, James Bennett said:

Joshua, they’re still there because the old forms system — which is still used by some parts of Django as newforms gets finished — relies on them. They will be going away, though at this point I couldn’t tell you exactly how they’re going to be replaced.

On November 30, 2007, VK said:

First of all - many thanks for the great post! I have a question. You use these 2 statements:

user = User.objects.get(username=self.cleaned_data[‘username’])

and

user = User.objects.get(username__exact=self.cleaned_data[‘username’])

Is the second one a typo or there is some deeper magic?

On December 12, 2007, Merric said:

Fantastic. I was struggling to work out the various clean methods. You’ve made a difficult subject so much easier to understand. Many thanks.

On March 28, 2008, Dave said:

Just wanted to let you kno that between your user reg. code and your info on new forms i am doin great in my comp sci class … thanks alot!

