A Python 3.11 “gotcha”

November 8, 2022 Django, Python

Recently at work I’ve been doing a bit of performance tuning on a service that’s getting ready to launch. It’s been built mostly on the tried-and-true principle of “first make it correct, then make it fast”, and really more like “then make it fast if necessary”. This is important because you generally want to have an idea of your performance goals up-front, and if you’re already hitting them then you should not spend a bunch more time trying to micro-optimize your way into being as fast as possible.

The actual performance tuning part of this wasn’t that exciting; there were just a couple database columns that needed indexes, and once they’d been added, the response times were back to where I wanted them to be.

But then I decided, since I was already in there doing some performance work, to check out Python 3.11, which includes some performance improvements. I’m always happy to have my code become better for free just from upgrading Python or a library I’m using, so I figured it was worth at least taking a look at.

Don’t rush to upgrade Python

Before I go any further, I should make clear that even if I’d seen a major performance improvement, I likely would not have pushed to upgrade this service immediately to Python 3.11. Upgrading to Python bugfix releases is fine, and security updates of course should be applied immediately, but with feature releases like 3.11.0 it’s generally best to wait a bit.

There are several reasons for this, but the most important one is: many packages in the Python ecosystem take a little while to get out their own corresponding releases after a new Python release, or to publish new compiled binaries for any extension modules they include.

So rushing to upgrade Python can often be an exercise in frustration, which is why I usually wait until the first bugfix release (3.11.1, due to be released about two months after 3.11.0) before seriously considering an upgrade to the latest Python.

Applying the upgrade

This was pretty simple. The service in question is fully Docker-ized, and uses the “boring” dependency management approach I’ve written about previously, with most common dev tasks driven by a Makefile. So the first thing I did was change the Dockerfile, which is Debian-flavored.

This:

FROM python:3.10-slim-bullseye

Became this:

FROM python:3.11-slim-bullseye

Then I rebuilt my local containers, and went through a couple cycles of finding a package that didn’t yet have a 3.11-compatible binary .whl, removing it temporarily, and trying again. There were fewer of these than I expected, and soon I had a local instance running on Python 3.11. I ran a quick load test (the service includes a testing profile for use with Locust, so that was easy), checked the numbers, and decided there wasn’t enough of an improvement to justify trying to rush an upgrade. Which is about what I expected; this was mostly being done just to see what Python 3.11 can do, not because I thought it was going to end in actually trying to upgrade right away.

So I added back the packages I’d removed, recompiled the dependency lockfiles, and switched the Dockerfile back to Python 3.10.

And the container rebuild to go back to 3.10… failed.

The story of why it failed is moderately interesting, and some of you reading this may already be able to figure it out. But if you want a bit more time to think, I’ll explain a bit about Python packaging, and then explain the failure.

Chickens and eggs

I’ve written at some length about the different concerns that make up “packaging” in Python, and I won’t repeat all of that here. But one of those concerns is how, given some code, you produce a distributable artifact.

Once upon a time, the way you did this was by writing a setup.py script that used the distutils module in the standard library. Then setuptools came along as a more featureful drop-in replacement. And if you want to, you can actually just call setuptools.setup() in your setup.py, and put all the configuration in a static setup.cfg file, which setuptools will automatically read. This has meant that a lot of popular tools, especially linters and formatters and other things that analyze a project’s code, have adopted setup.cfg as a place to put their own configuration.

Over the past few years, the Python packaging ecosystem has been focused on specifying and standardizing the behaviors of different parts of the “packaging” toolchain, in ways which allow multiple implementations of the core APIs and functionality. But one particularly tricky part is how to specify a package builder that isn’t setuptools, or even just how to specify that you need setuptools plus some additional libraries or tools — the only place to put the configuration for that is your setup.py script, which means you run into a bit of a weird situation where you have to execute setup.py to find out how you’re supposed to execute setup.py.

The solution to this was PEP 518, which specified that information about the desired package build system should be put in a static configuration file named pyproject.toml.

But just as setup.cfg had become a sort of dumping-ground place where tons of non-packaging-related tools let you specify configuration, pyproject.toml very quickly was latched onto as the new “put all your configuration here” file. Many popular Python tools now support reading their configuration from a pyproject.toml, if present, and some, like the Black code formatter, only support pyproject.toml.

Amusingly, this all happened despite the fact that no TOML-parsing module was included in the Python standard library, so everything that wanted to read pyproject.toml had to have a dependency on a third-party TOML module, typically the tomli package.

But now comes Python 3.11, which added a tomllib module to the standard library.
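To make that concrete, here is a minimal sketch of how a tool might read a pyproject.toml on both sides of the 3.11 boundary. It assumes the tomli package is installed on older Pythons; the file contents, the read_pyproject helper, and the specific values are made up for illustration and are not taken from any real project.

```python
try:
    import tomllib  # standard library from Python 3.11 onward
except ModuleNotFoundError:
    import tomli as tomllib  # third-party package with the same API

# Illustrative file contents only; the requirements and settings are
# assumptions for this sketch.
PYPROJECT = """
[build-system]
requires = ["setuptools>=61", "wheel"]
build-backend = "setuptools.build_meta"

[tool.black]
line-length = 88
"""


def read_pyproject(text):
    """Split parsed pyproject.toml data into build and tool configuration."""
    data = tomllib.loads(text)
    build_system = data.get("build-system", {})
    tool_config = data.get("tool", {})
    return build_system, tool_config


build_system, tool_config = read_pyproject(PYPROJECT)
print(build_system["requires"])             # ['setuptools>=61', 'wheel']
print(build_system["build-backend"])        # setuptools.build_meta
print(tool_config["black"]["line-length"])  # 88
```

The [build-system] table is the part PEP 518 was written for; the [tool.black] table is an example of the “put all your configuration here” usage that grew up around it.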

What happened

The last piece of the puzzle is the fact that Python’s packaging tools support conditional dependencies. You can use environment markers in a dependency specification to declare that you only want that dependency when running on a particular operating system, or only on particular Python versions or implementations, or machine type, or various other factors.

And so a lot of packages that read from a pyproject.toml changed their tomli dependency to be conditional on the Python version. For example, the Black code formatter mentioned above specifies it like this:

 tomli>=1.1.0; python_full_version < '3.11.0a7'

In other words, Black depends on tomli version 1.1.0 or greater, but only when the Python version is below 3.11.0a7, the alpha release in which the standard-library tomllib module was added.
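If you want to see how that marker plays out across versions, here is a rough standard-library-only approximation. Real installers evaluate markers with proper version-comparison rules rather than like this; the needs_tomli function and the TOMLLIB_ADDED constant are names invented for this sketch.

```python
import sys

# Stand-in for the marker  python_full_version < '3.11.0a7',  written in the
# same shape as sys.version_info: (major, minor, micro, releaselevel, serial).
TOMLLIB_ADDED = (3, 11, 0, "alpha", 7)


def needs_tomli(version_info=None):
    """Return True if the given interpreter predates the stdlib tomllib."""
    if version_info is None:
        version_info = sys.version_info
    # Release levels happen to sort alphabetically in release order:
    # "alpha" < "beta" < "candidate" < "final".
    return tuple(version_info) < TOMLLIB_ADDED


print(needs_tomli((3, 10, 8, "final", 0)))  # True
print(needs_tomli((3, 11, 0, "alpha", 6)))  # True
print(needs_tomli((3, 11, 0, "final", 0)))  # False
```

The middle case is why the marker names a specific alpha rather than just 3.11: earlier 3.11 pre-releases still needed the third-party package.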

And this was the source of my build failure: I’d removed some dependencies to get a 3.11 container to build, and then to drop back to 3.10 I added them back and recompiled my dependency tree using pip-compile. But the way this project is set up, that gets run in the container… which still had Python 3.11 at that point. So it compiled a dependency tree that didn’t include tomli, because no package required tomli on 3.11. As soon as I tried to build a 3.10 container, though, pip correctly determined that the third-party tomli was now required, tried to pull it in, and failed. If you’re following my recommended “boring” dependency approach, you’ll be using pip install in hash-checking mode (--require-hashes), which errors out unless the entire dependency tree is pinned to exact versions with hashes. Since tomli wasn’t pinned and hashed, pip refused to install anything.
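The whole sequence can be modeled in a few lines. This is a toy simulation, not how pip or pip-compile are implemented: the REQUIREMENTS table, the function names, and the placeholder hashes are all hypothetical, and version checks are reduced to (major, minor) tuples.

```python
# Hypothetical requirements: (package name, first Python version on which
# the package is no longer needed, or None if it is always needed).
REQUIREMENTS = [
    ("black", None),
    ("tomli", (3, 11)),  # only needed before Python 3.11
]


def needed(python_version):
    """Names of the packages required on the given (major, minor) version."""
    return {
        name
        for name, unneeded_from in REQUIREMENTS
        if unneeded_from is None or python_version < unneeded_from
    }


def compile_lockfile(python_version):
    """Pin only what the compiling interpreter needs (placeholder hashes)."""
    return {
        name: f"sha256:<hash of {name}>"
        for name in sorted(needed(python_version))
    }


def install(lockfile, python_version):
    """Refuse to install if any needed package lacks a pinned hash."""
    missing = sorted(needed(python_version) - set(lockfile))
    if missing:
        raise RuntimeError(f"not pinned with hashes: {', '.join(missing)}")
    return sorted(needed(python_version))


# Compile under 3.11, then try to install under 3.10: the failure I hit.
lock_311 = compile_lockfile((3, 11))
print(sorted(lock_311))  # ['black']
try:
    install(lock_311, (3, 10))
except RuntimeError as exc:
    print(exc)  # not pinned with hashes: tomli

# Compiling under the same version you install on works as expected.
print(install(compile_lockfile((3, 10)), (3, 10)))  # ['black', 'tomli']
```

The point of the model is that a lockfile is only valid for the interpreter version it was compiled under, because conditional dependencies are resolved at compile time.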

Fixing this wasn’t too hard once I read the initial error message and realized what was going on. In fact, there’s no reason why I should have run into this at all, since the easy solution — use git to revert the dependency lockfiles to the correct contents for Python 3.10 — is what I should have done in the first place rather than manually restoring and rebuilding the lockfiles.

But I still found it amusing at the time, and maybe this writeup will save someone from having to scratch their head wondering why they’re having trouble testing an upgrade/downgrade of Python — although the tomllib module is specific to Python 3.11, this general issue is something you could run into any time a new standard-library module sees instant wide adoption in the ecosystem.