Testing Django applications in 2018

April 2, 2018 Django, Python

I spend a lot of time writing Django applications. At each of my last three jobs I worked with Django, and I’m the primary maintainer of quite a few open-source Django applications. Which means I’ve written a lot of tests for code that uses Django. And although Django provides a lot of useful tools for testing, there are areas where it doesn’t prescribe or even suggest how you should do things, and over the years I’ve found myself going back and forth on different approaches and patterns. Judging from some of the posts I see on Django-related forums and social media, I’m not the only one in that situation.

So I want to talk a bit today about how I’m testing my Django applications here in 2018. There are some changes from the last time I wrote about this (if you want to go dig it up in my archives), and I’m also going to go into a bit more detail than I have previously.

And before I go any further, I want to reiterate something I say fairly frequently here: this is my blog, where I publish my opinions. They work for me right now. They might not work for you, and that’s OK! If something else works better for you, stick with it.

Writing tests: unittest style

There are two main ways people write tests in the Python world. One is to use the unittest module from the Python standard library, or something derived from it. This involves writing test cases as classes (deriving, directly or otherwise, from unittest.TestCase) and using the assertions provided by TestCase and any subclasses of it you might be using.

The other is to use pytest, or tools built on top of it. Although you can write unittest-style test-case classes and use them with pytest, people mostly seem to prefer using it to write tests as standalone functions, and using the base Python assert statement instead of lots of custom assertions.
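To make the contrast concrete, here is a minimal sketch of the same check written in both styles. The slugify() function is a hypothetical stand-in for whatever code is under test; it is not from any of the projects mentioned here.

```python
import unittest


def slugify(title):
    """Hypothetical function under test: lowercase and hyphenate a title."""
    return "-".join(title.lower().split())


# unittest style: a TestCase subclass using the assertion methods it provides.
class SlugifyTests(unittest.TestCase):
    def test_spaces_become_hyphens(self):
        self.assertEqual(slugify("Testing Django in 2018"), "testing-django-in-2018")


# pytest style: a standalone function using a bare assert statement.
def test_spaces_become_hyphens():
    assert slugify("Testing Django in 2018") == "testing-django-in-2018"
```

pytest can collect and run both of these; python -m unittest will only pick up the class.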

The unittest module in Python comes in for a lot of criticism, largely because some parts of it are “un-Pythonic”. It borrows a lot of its design from the JUnit testing framework in Java (in fact, it started life as a port of JUnit to Python), which is why it uses classes for everything, and Java-style names on standard methods (like tearDown() instead of the Pythonic teardown()). Writing tests with unittest and its derivatives can also feel verbose sometimes.

Nevertheless, I write tests in the unittest style. Admittedly, this is an easier choice when working with Django since Django’s own testing utilities build on top of unittest, but there is a pytest-django package which provides pytest-style tools for Django.

This is a personal preference; while I respect that pytest is very good at what it does, and that a lot of people love it, it’s just not my cup of tea. I agree with one common criticism — that its default output can be cluttered and hard to read through, especially with multiple test failures — and I find that once I start using pytest’s more advanced features it becomes a lot harder to reason about what my tests are doing, because of all the automatic dependency-injection work (I won’t use the “m” word!) it’s doing behind the scenes. I also often find myself pining for unittest-style assertion methods, since they’re often very convenient at encapsulating complex ideas, and since it can sometimes be awkward to structure a test in such a way that a bare assert will work.
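As a small illustration of the kind of assertion methods in question, the sketch below uses two from the standard library's unittest.TestCase. The test class and its data are made up for the example.

```python
import unittest


class ConvenienceAssertionTests(unittest.TestCase):
    def test_same_items_any_order(self):
        # Passes when both sequences hold the same items, regardless of order.
        self.assertCountEqual(
            ["forms", "models", "views"],
            ["views", "forms", "models"],
        )

    def test_error_is_raised(self):
        # Passes only if the block raises ValueError with a matching message.
        with self.assertRaisesRegex(ValueError, "invalid literal"):
            int("not a number")
```

Each of these would take a few extra lines (sorting, counting, or a try/except) to express with a bare assert.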

So I stick to the unittest style, and to using Django’s provided subclass of TestCase and the handy-dandy assertions Django provides.

In a couple of non-Django-related libraries I do use pytest as a test runner while still using the unittest approach to writing my tests. The alternative would be python -m unittest, relying on the unittest module’s automatic test discovery, but it’s a bit fiddly to make that work when also using the coverage run command from coverage.py.

(there was another option — nose — once upon a time, but it’s now unmaintained and hasn’t had a release in almost three years)

Also, a note on how I organize tests: test discovery tools typically look for your tests in a module named tests. Which can be either a file named tests.py, or a directory named tests containing an __init__.py file and one or more files with tests in them. It’s actually pretty rare for me to use the single-file approach; I almost always end up splitting tests across multiple files in a tests directory.

Mostly this is to make it easier for me to organize and later find tests. For Django applications, I typically end up writing one test file for each component of Django I’m using, named for that component. So, for example, tests for any custom methods/managers on models will go in a test_models.py, tests for forms in a test_forms.py, and so on.

Also, pre-emptively setting up tests in a directory with multiple files is a good way to remind myself that the number of tests in my apps should grow over time, and that I should take advantage of the room I’m giving them to grow into.

Test runner: using Django “standalone”

Of course, writing tests is just the start; next I need to be able to run them. And with tests for Django applications, that can be a bit tricky, since Django requires settings in order to work at all. If you’re planning to deploy Django, you already have a settings file specifying everything you plan to use, and you can run manage.py test to execute the tests for all your applications (or pass arguments to fine-tune exactly what will run).

But what if you’re developing a single application to be distributed and re-used in a bunch of places? How do you get its tests to run without setting up an entire project, settings file and all?

The answer is that you don’t actually need a full settings file and project to run Django. As long as you know what you’re doing, you can run Django “standalone” with only minimal configuration. Here’s how it works:

1. Define a dictionary containing some settings. You don’t need a full Django settings file here, just the specific things needed to get Django up and running: basically DATABASES (which you can point at a temporary or in-memory SQLite database), and usually INSTALLED_APPS and ROOT_URLCONF. Django’s defaults fill in the rest.
2. Use the normal settings import (from django.conf import settings), and then call settings.configure(), passing in your pre-defined settings using dictionary unpacking. If your settings were in, say, SETTINGS_DICT, you’d call settings.configure(**SETTINGS_DICT).
3. Finally, import django and call django.setup() (no arguments) to populate the application registry.

And that’s it. After those three steps, you can import and use any component of Django you’d like.
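The three steps can be sketched roughly as follows. This is an illustration under assumptions, not the actual pwned-passwords-django script: the contents of SETTINGS_DICT and the function name configure_standalone_django are made up, and the imports sit inside the function only so the sketch can be loaded on a machine without Django installed.

```python
# Hypothetical minimal settings; adjust INSTALLED_APPS for the app under test.
SETTINGS_DICT = {
    "DATABASES": {
        "default": {
            "ENGINE": "django.db.backends.sqlite3",
            "NAME": ":memory:",  # in-memory SQLite database
        }
    },
    "INSTALLED_APPS": [
        "django.contrib.contenttypes",
        "django.contrib.auth",
    ],
    # An app with its own URLs would also set ROOT_URLCONF here.
}


def configure_standalone_django(settings_dict):
    """Configure Django from a plain dictionary, then populate the app registry."""
    # Imported here so this sketch can be loaded without Django installed.
    import django
    from django.conf import settings

    # settings.configure() may only be called once per process.
    if not settings.configured:
        settings.configure(**settings_dict)

    django.setup()  # populates the application registry
    return settings
```

After calling configure_standalone_django(SETTINGS_DICT), imports such as django.contrib.auth.models should work as they would inside a full project.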

From there, it’s just a matter of invoking Django’s test runner; since it’s being done standalone instead of via manage.py test, this needs to be done manually, but it’s not much work. Here’s the relevant code from the test runner script for pwned-passwords-django:

import sys

from django.conf import settings  # already configured by the steps above
from django.test.utils import get_runner

# Now we instantiate a test runner...
TestRunner = get_runner(settings)

# And then we run tests and return the results.
test_runner = TestRunner(verbosity=2, interactive=True)
failures = test_runner.run_tests(['pwned_passwords_django.tests'])
sys.exit(failures)

This grabs a Django (unittest-based) test runner class, instantiates it and runs the test suite — the tests to run are the argument to the run_tests() method — and then exits with a status code indicating how many tests failed (remember: on Unix-y operating systems, an exit status of zero is “success”).
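One caveat about using the raw failure count as the exit status: Unix-y systems only keep the low 8 bits of an exit status, so a count of exactly 256 would wrap around to zero and look like success. A common variation collapses the count to 0 or 1 first. The helper below is a hypothetical sketch of that idea, not part of the script quoted above.

```python
def exit_status_for(failures):
    """Map a test-failure count to a conventional 0/1 process exit status."""
    # bool() collapses any nonzero count to True; int() turns that into 1.
    return int(bool(failures))
```

It would be used as sys.exit(exit_status_for(failures)) in place of sys.exit(failures).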

One thing worth mentioning here is that I’m using verbosity=2 when setting up the test runner. This makes the test output longer, but I strongly prefer it since it shows the full setup of the run, including which database migrations were applied, and outputs one line per test showing which tests were run.

Managing test environments with tox

Most of my open-source Django applications are written to be distributed and used by anyone who finds them useful. Which means supporting multiple versions of Django and Python simultaneously. My general policy is that each new release I make of one of my apps should support every version of Django that’s receiving upstream bugfix or security support at the time, and should support every version of Python supported by those versions of Django.

Right now that’s as simple a task as it ever is, since only two versions of Django (1.11 and 2.0) are receiving upstream support; Django 1.8, a long-term support release, reached end-of-life yesterday. But it still adds up to seven different combinations of versions of Django and Python to test against. I have Travis CI set up to run tests automatically whenever I push a commit, and Travis makes it easy to specify the combinations and get tests running against all of them without much manual work. But I also like running my tests locally.

Which is where tox comes in. I use a combination of tox and pyenv to also run my tests locally against multiple versions of Django and Python.

For an example, here’s the full tox.ini config file for pwned-passwords-django:

[tox]
envlist =
    {py27}-django{111}
    {py34,py35,py36}-django{111,20}
    docs

[testenv:docs]
basepython = python
changedir = docs
deps =
    sphinx
    sphinx_rtd_theme
commands =
    sphinx-build -b html -d {envtmpdir}/doctrees . {envtmpdir}/html

[testenv]
setenv =
    PYTHONWARNINGS=module::DeprecationWarning
commands =
    coverage run setup.py test
    coverage report -m
    flake8 pwned_passwords_django
deps =
    -rtest_requirements.txt
    django111: Django>=1.11,<2.0
    django20: Django>=2.0

[travis]
python =
    2.7: py27
    3.4: py34
    3.5: py35
    3.6: py36, docs

Walking through that section by section:

The tox section holds the envlist option, a list of environments to run tests in. It uses tox’s generative brace syntax, which expands out all the various combinations without my having to write them manually. Notice there’s one non-Django environment here, called docs; that runs a build of the documentation, to make sure it doesn’t error out.

The testenv:docs section provides specific configuration for the docs test environment. It tells tox that this environment needs Sphinx (the tool I use for all my technical documentation) and the Read the Docs Sphinx theme, and that it should use sphinx-build as the “test” command.

The testenv section configures all the other test environments. Here, the deps option tells tox to install everything in the file test_requirements.txt (which will install coverage.py, flake8 and a standalone version of unittest.mock), and also differentiates which version of Django to install for the 1.11 and 2.0 environments.

The test commands invoke the test suite with coverage measurement, print the coverage report, and run flake8.

Finally, the travis section specifies what to do when running on Travis CI (using the tox-travis plugin for tox).

There are a couple other things I’ll talk about from this file, but not just yet.

Test entry point and tool configuration

There are two final pieces to my testing setup. One is the use of a setup.cfg file in the packaged application, to supply additional configuration information for coverage.py and flake8; the relevant sections look like this:

[coverage:run]
include = pwned_passwords_django/*
omit = pwned_passwords_django/tests/*

[coverage:report]
fail_under = 100
exclude_lines =
    pragma: no cover
omit =
    pwned_passwords_django/runtests.py
    pwned_passwords_django/__init__.py
    pwned_passwords_django/tests/*

[flake8]
exclude = locale,__pycache__,*.pyc,templates,runtests.py
max-complexity = 10

This tells coverage.py which parts of the code to pay attention to and which to exclude, and tells it to fail the run if the coverage comes back under 100%. It also tells flake8 to exclude some files and filename patterns from its analysis, and to enforce a limit on the cyclomatic complexity score of any functions or methods it finds in the code. I usually set that at 10, and try to avoid hitting up against that limit (the most complex piece of pwned-passwords-django, currently, has a score of 5).
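For a rough feel of what a complexity score measures, the sketch below counts branching constructs in a piece of source code and adds one. This is a deliberate simplification and an assumption on my part: it is not the algorithm flake8's complexity check uses, and the function name approx_complexity and the sample code are made up for the example.

```python
import ast

# Constructs that add a decision point in this simplified count.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)

SAMPLE = '''
def classify(n):
    if n < 0:
        return "negative"
    for d in range(2, n):
        if n % d == 0:
            return "composite"
    return "other"
'''


def approx_complexity(source):
    """Return 1 plus the number of branching constructs found in the source."""
    tree = ast.parse(source)
    branches = sum(isinstance(node, _BRANCH_NODES) for node in ast.walk(tree))
    return 1 + branches
```

The sample function has two if statements and one for loop, so approx_complexity(SAMPLE) gives 4, comfortably under a limit of 10.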

(setup.cfg is a semi-standard file for Python packages, by the way; it can specify a lot of things related to packaging itself, but because it’s so often present many other Python tools will read it to get configuration data)

Finally, in the setup.py packaging script, I use setuptools.setup() instead of distutils.core.setup(), and pass the test_suite argument, setting it to 'pwned_passwords_django.runtests.run_tests', which is the standalone test-runner script I mentioned above. This allows the command setup.py test — a relatively standard entry point for testing Python packages — to work, which in turn is why the tox.ini file uses a test command of coverage run setup.py test.

Things I’ve learned the hard way

I know coverage is a contentious metric, but I always eventually enable it, and I always eventually require 100% coverage to pass a test run.

My usual process for this is to start out by just writing tests. I’ll build out tests for individual functions/classes/methods, then start working on integration tests for the various components, then take a manual look for anything that should be tested but isn’t. Only then do I look at a coverage report, and usually by then I’m already at or very close to 100% coverage, and with (hopefully) meaningful tests rather than tests written to try to game the coverage score.

I don’t use coverage reports as a measure of quality — that’s something I only determine by actually reading the tests — but I do use them as a “canary”. I want to know if the coverage suddenly changes after a new commit, because that’s a sign of a problem; it means I’ve added a new code path without any tests, or I’ve introduced a bug (either in application code or tests) which causes some existing code path to no longer be executed. Either way, there’s a problem and I want to fix it. This is definitely something that’s caught bugs for me in the past.

There’s also a small but important line in the tox.ini file above that I’ve started adding. It’s this:

setenv =
    PYTHONWARNINGS=module::DeprecationWarning

This sets the environment variable PYTHONWARNINGS during the test run. This controls Python’s behavior when warnings are generated, and is equivalent to running Python with the -W command line flag. Python’s warnings system is for situations where code does something that’s not an error, but that also isn’t good, and that the author of the code might want to know about. Django uses this to provide a heads-up about use of deprecated features and APIs which are scheduled to be removed (and tells you which version they’ll be removed in). But there’s a problem: by default, Python silences DeprecationWarning, so unless you tweak the warning settings you won’t see Django telling you that you’ve used something deprecated.

In this case, I’m saying I want to see DeprecationWarning, but only the first instance of it that occurs in each module (since code that triggers a DeprecationWarning once is likely to do it multiple times, cluttering up the output).
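The effect of the “module” action can be sketched with the standard library's warnings module directly. Here old_api() is a hypothetical deprecated function standing in for a deprecated Django API, and count_reported() is a made-up helper that applies a filter action in code instead of through PYTHONWARNINGS.

```python
import warnings


def old_api():
    # Hypothetical deprecated function; stacklevel=2 attributes the
    # warning to the line that called old_api().
    warnings.warn("old_api() is deprecated", DeprecationWarning, stacklevel=2)


def count_reported(action):
    """Call old_api() three times and count how many warnings get reported."""
    with warnings.catch_warnings(record=True) as caught:
        # Same idea as PYTHONWARNINGS=<action>::DeprecationWarning.
        warnings.simplefilter(action, DeprecationWarning)
        old_api()
        old_api()
        old_api()
    return len(caught)
```

With the “module” action the three calls should be reported once, since they come from the same module with the same message; with “always” all three are reported, and with “ignore” none are.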

Since one of the stories I told last time around involved some failing tests that wouldn’t have failed if I’d seen and paid attention to a DeprecationWarning, I’m now making that a standard part of my testing setup. The other thing that helps spot problems is Django’s system check framework, which will execute automatically during test runs, but to be safe I want the combination of system-check output and visible deprecation warnings.

And that’s a wrap

That’s the full tour of how I test. I realize in retrospect that when it’s all written out like this it sounds a lot more complicated than it really is. Compared to what I’d be doing anyway, it only adds two files (tox.ini and the runtests.py script) to an application. And although it does pull in multiple tools, all of them are pretty straightforward to use and understand.

And the end result is that on my laptop, I just pop over to the directory with my application’s code in it, run tox, and watch the full matrix of tests run (or pass the -e argument if I only want to run a subset of the environment matrix). And whenever I commit and push, a few minutes later I see a build on Travis CI, and (hopefully) the little green “build passing” badge. And then I get a nice feeling of confidence that my code works.

There are some more complex topics I haven’t gone into, like my tendency to write a base TestCase class for each application, or thoughts on when to use mocks versus when not to, but those can be saved for another day.

And finally, I’ll say it one more time: what I’ve written here is what works for me, and I’m sharing it in the hope it’ll be useful to someone else, but there’s no guarantee it’ll work for you. If it doesn’t, or if you already have a testing approach that you like, then that’s OK; you should do what works for you. Django doesn’t really constrain how you write or run tests, and I think that’s probably for the best, since there isn’t and likely never will be a one-size-fits-all solution. The only prescriptive thing I’ll say here is you should make sure you have tests, and that they’re meaningful tests and you run them regularly. Any set of tools and techniques you choose for accomplishing that is just fine.