A Python 3.11 “gotcha”
Recently at work I’ve been doing a bit of performance tuning on a service that’s getting ready to launch. It’s been built mostly on the tried-and-true principle of “first make it correct, then make it fast”, and really more like “then make it fast if necessary”. This is important because you generally want to have an idea of your performance goals up-front, and if you’re already hitting them then you should not spend a bunch more time trying to micro-optimize your way into being as fast as possible.
The actual performance tuning part of this wasn’t that exciting; there were just a couple database columns that needed indexes, and once they’d been added, the response times were back to where I wanted them to be.
But then I decided, since I was already in there doing some performance work, to check out Python 3.11, which includes some performance improvements. I’m always happy to have my code become better for free just from upgrading Python or a library I’m using, so I figured it was worth at least taking a look at.
Don’t rush to upgrade Python
Before I go any further, I should make clear that even if I’d seen a major performance improvement, I likely would not have pushed to upgrade this service immediately to Python 3.11. Upgrading to Python bugfix releases is fine, and security updates of course should be applied immediately, but with feature releases like 3.11.0 it’s generally best to wait a bit.
There are several reasons for this, but the most important one is: many packages in the Python ecosystem take a little while to get out their own corresponding releases after a new Python release, or to publish new compiled binaries for any extension modules they include.
So rushing to upgrade Python can often be an exercise in frustration, which is why I usually wait until the first bugfix release (3.11.1, due to be released about two months after 3.11.0) before seriously considering an upgrade to the latest Python.
Applying the upgrade
This was pretty simple. The service in question is fully Docker-ized, and uses the “boring” dependency management approach I’ve written about previously, with most common dev tasks driven by a Makefile. So the first thing I did was change the Dockerfile, which is Debian-flavored, to use Python 3.11.
Then I rebuilt my local containers, and went through a couple cycles of finding a package that didn’t yet have a 3.11-compatible binary .whl, removing it temporarily, and trying again. There were fewer of these than I expected, and soon I had a local instance running on Python 3.11. I ran a quick load test (the service includes a testing profile for use with Locust, so that was easy), checked the numbers, and decided there wasn’t enough of an improvement to justify trying to rush an upgrade. Which is about what I expected; this was mostly being done just to see what Python 3.11 can do, not because I thought it was going to end in actually trying to upgrade right away.
So I added back the packages I’d removed, recompiled the dependency lockfiles, and switched the Dockerfile back to Python 3.10.
And the container rebuild to go back to 3.10… failed.
The story of why it failed is moderately interesting, and some of you reading this may already be able to figure it out. But if you want a bit more time to think, I’ll explain a bit about Python packaging, and then explain the failure.
Chickens and eggs
I’ve written at some length about the different concerns that make up “packaging” in Python, and I won’t repeat all of that here. But one of those concerns is how, given some code, you produce a distributable artifact.
Once upon a time, the way you did this was by writing a setup.py script that used the distutils module in the standard library. Then setuptools came along as a more featureful drop-in replacement. And if you want to, you can actually just call setuptools.setup() in your setup.py, and put all the configuration in a static setup.cfg file, which setuptools will automatically read. This has meant that a lot of popular tools, especially linters and formatters and other things that analyze a project’s code, have adopted setup.cfg as a place to put their own configuration.
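To make the “static configuration” idea concrete, here is a minimal sketch of a setup.cfg that holds both packaging metadata and a linter's settings, read with the standard library's configparser. The project name and the flake8 setting are made up for illustration; the point is only that the file is plain INI that any tool can read, not that this is how setuptools itself processes it.

```python
import configparser

# Hypothetical setup.cfg contents: packaging metadata for setuptools,
# plus an unrelated tool (flake8) keeping its own settings alongside.
SETUP_CFG = """
[metadata]
name = example-service
version = 1.0.0

[options]
python_requires = >=3.10

[flake8]
max-line-length = 88
"""

parser = configparser.ConfigParser()
parser.read_string(SETUP_CFG)

# setuptools cares about [metadata] and [options]; flake8 only about [flake8].
print(parser["metadata"]["name"])
print(parser["flake8"]["max-line-length"])
print(parser.sections())
```

Each tool simply ignores the sections it does not recognize, which is what made the file so convenient as a shared home for configuration.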
Over the past few years, the Python packaging ecosystem has been focused on specifying and standardizing the behaviors of different parts of the “packaging” toolchain, in ways which allow multiple implementations of the core APIs and functionality. But one particularly tricky part is how to specify a package builder that isn’t setuptools, or even just how to specify that you need setuptools plus some additional libraries or tools. The only place to put the configuration for that is your setup.py script, which means you run into a bit of a weird situation where you have to execute setup.py to find out how you’re supposed to execute setup.py.
The solution to this was PEP 518, which specified that information about the desired package build system should be put in a static configuration file named pyproject.toml.
But just as setup.cfg had become a sort of dumping-ground place where tons of non-packaging-related tools let you specify configuration, pyproject.toml very quickly was latched onto as the new “put all your configuration here” file. Many popular Python tools now support reading their configuration from a pyproject.toml, if present, and some, like the Black code formatter, only support pyproject.toml.
Amusingly, this all happened despite the fact that no TOML-parsing module was included in the Python standard library, so everything that wanted to read pyproject.toml had to have a dependency on a third-party TOML module, typically the tomli package.
But now comes Python 3.11, which added a tomllib module to the standard library.
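Code that has to parse TOML on both sides of that change often picks its parser based on the running Python version. The following is a minimal sketch of that idea: the helper names and the pyproject.toml contents are hypothetical, and on Python 3.10 or earlier it assumes the third-party tomli package is installed.

```python
import importlib
import sys

# Hypothetical pyproject.toml contents: a [build-system] table as described
# by PEP 518, plus configuration for a non-packaging tool.
PYPROJECT = """
[build-system]
requires = ["setuptools>=61", "wheel"]

[tool.black]
line-length = 88
"""


def toml_module_name(version_info=sys.version_info):
    """Name of the TOML parser module to import for a given Python version."""
    return "tomllib" if tuple(version_info)[:2] >= (3, 11) else "tomli"


def read_build_requires(text):
    """Return the build-system requirements declared in pyproject.toml text."""
    # Both modules expose loads(), so the same call works with either one.
    # On Python < 3.11 this raises ModuleNotFoundError if tomli is missing.
    parser = importlib.import_module(toml_module_name())
    return parser.loads(text)["build-system"]["requires"]
```

Calling read_build_requires(PYPROJECT) returns the list under requires, whichever of the two parsers ends up doing the work.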
The last piece of the puzzle is the fact that Python’s packaging tools support conditional dependencies. You can use environment markers in a dependency specification to declare that you only want that dependency when running on a particular operating system, or only on particular Python versions or implementations, or machine type, or various other factors.
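As a rough illustration of what those markers are compared against, this sketch collects the values of a few marker variables for the interpreter it runs on, using the standard-library expressions that PEP 508 gives for them. The function name is my own; installers compute these values internally.

```python
import os
import platform
import sys


def marker_environment():
    """Values of a few PEP 508 marker variables for this interpreter."""
    return {
        "os_name": os.name,
        "sys_platform": sys.platform,
        "platform_machine": platform.machine(),
        "platform_python_implementation": platform.python_implementation(),
        # Major.minor only, e.g. "3.10"
        "python_version": ".".join(platform.python_version_tuple()[:2]),
        # Full version string, e.g. "3.10.8"
        "python_full_version": platform.python_version(),
    }


for name, value in marker_environment().items():
    print(f"{name}: {value}")
```

The values change with the interpreter running the code, which is exactly why the same dependency specification can resolve differently on two Python versions.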
And so a lot of packages that read from a pyproject.toml changed their tomli dependency to be conditional on the Python version. For example, the Black code formatter mentioned above specifies it like this:
tomli>=1.1.0; python_full_version < '3.11.0a7'
In other words, Black depends on tomli version 1.1.0 or greater, but only when the Python version is below 3.11.0a7, since that’s the exact moment when the standard-library tomllib module was added.
And this was the source of my build failure: I’d removed some dependencies to get a 3.11 container to build, and then to drop back to 3.10 I added them back and recompiled my dependency tree using pip-compile. But the way this project is set up, that gets run in the container… which still had Python 3.11 at that point. So it compiled a dependency tree that didn’t include tomli, because no package required tomli on 3.11. As soon as I tried to build a 3.10 container, though, pip correctly attempted to pull in the third-party tomli and failed. If you’re following my recommended “boring” dependency approach, you’ll be using pip install in a mode which errors out unless the entire dependency tree is pinned to exact versions with hashes. Since tomli wasn’t pinned and hashed, pip refused to install anything.
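The sequence of events can be reproduced with a toy model. This is only a sketch of the logic: the requirement list, the function names, and the use of plain version tuples in place of real markers are all simplifications of mine, and the actual resolution done by pip-compile and pip is far more involved.

```python
# Toy model: each requirement is (name, applies), where applies is a predicate
# over a (major, minor) Python version standing in for an environment marker.
REQUIREMENTS = [
    ("black", lambda py: True),
    ("tomli", lambda py: py < (3, 11)),  # roughly: tomli; python_version < "3.11"
]


def compile_lockfile(py_version):
    """Pin only the requirements whose marker is true for py_version."""
    return {name for name, applies in REQUIREMENTS if applies(py_version)}


def install(lockfile, py_version):
    """Fail, as a strict pinned install would, if a needed package is unpinned."""
    needed = {name for name, applies in REQUIREMENTS if applies(py_version)}
    missing = needed - lockfile
    if missing:
        raise RuntimeError(f"unpinned requirements: {sorted(missing)}")
    return sorted(needed)


lock_from_311 = compile_lockfile((3, 11))  # compiled inside the 3.11 container
try:
    install(lock_from_311, (3, 10))        # then installed on Python 3.10
except RuntimeError as exc:
    print(exc)  # prints: unpinned requirements: ['tomli']
```

Compiling the lockfile with the same Python version that will later install from it, as in install(compile_lockfile((3, 10)), (3, 10)), succeeds in this model.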
Fixing this wasn’t too hard once I read the initial error message and realized what was going on. In fact, there’s no reason why I should have run into this at all, since the easy solution, using git to revert the dependency lockfiles to the correct contents for Python 3.10, is what I should have done in the first place rather than manually restoring and rebuilding the lockfiles.
But I still found it amusing at the time, and maybe this writeup will save someone from having to scratch their head wondering why they’re having trouble testing an upgrade/downgrade of Python. Although the tomllib module is specific to Python 3.11, this general issue is something you could run into any time a new standard-library module sees instant wide adoption in the ecosystem.