Recently Armin Ronacher (whose blog you should be reading if you do anything at all involving Python and the web) has published a couple of good articles poking at the current state of WSGI, the standard interface for Python web applications. Some of his comments dovetail nicely into concerns I’ve been trying to put into words for a while now, so I’m glad he’s posting on the subject and providing some context.
In short, I’ve come to have some rather severe misgivings about WSGI — both as currently constituted and as it’s likely to be in the future — which break down into three areas:
So. Allow me to explain.
My biggest gripe with WSGI from an HTTP standpoint is simply that it seems to have a track record of muddling or outright punting on some of the more interesting/complex features of HTTP. One simple example is the common optimization of compressing (via gzip) the outgoing HTTP response, which usually provides a significant boost in performance as seen by end users of web applications. WSGI seems at first to forbid applications from applying this optimization:
Note: applications and middleware must not apply any kind of Transfer-Encoding to their output, such as chunking or gzipping; as “hop-by-hop” operations, these encodings are the province of the actual web server/gateway.
But gzipping of the response body is indicated by the Content-Encoding header, which per the HTTP spec is not a “hop-by-hop” header. So is a WSGI application or middleware allowed to gzip an outgoing response body? I wish I knew. Django ships an optional middleware class which applies gzipping to suitable outgoing responses. The Pylons book provides a detailed example of how to write a WSGI middleware for gzipping and seems to encourage its use. And I know there are other implementations of this feature in the wild. But I have a sneaking suspicion that WSGI intends to forbid this feature to any part of the stack other than the server, which would greatly diminish its utility (applications and middlewares are far more likely to have access to information which lets them make reliable judgments about when to apply gzipping, while servers can only apply some relatively blind heuristics).
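To make the question concrete, here is a minimal sketch of what such a gzipping middleware tends to look like. It is not Django's or the Pylons book's implementation; the names `gzip_middleware` and `hello_app` are made up for illustration, and the code assumes Python 3 conventions (bytes bodies, native-string headers). It sets Content-Encoding, never Transfer-Encoding, and it deliberately ignores q-values, content types and already-encoded responses.

```python
import gzip


def gzip_middleware(app):
    """Wrap a WSGI app and gzip its body when the client says it accepts gzip.

    Illustrative only: buffers the whole body and does a naive substring
    check on Accept-Encoding.
    """
    def wrapped(environ, start_response):
        captured = {}
        written = []

        def capture(status, headers, exc_info=None):
            # Hold status and headers back until the body has been compressed.
            captured["status"] = status
            captured["headers"] = list(headers)
            return written.append  # stands in for the legacy write() callable

        result = app(environ, capture)
        try:
            chunks = list(result)
        finally:
            if hasattr(result, "close"):
                result.close()
        body = b"".join(written) + b"".join(chunks)

        headers = captured["headers"]
        if "gzip" in environ.get("HTTP_ACCEPT_ENCODING", ""):
            body = gzip.compress(body)
            # Drop headers that no longer describe the body, then add our own.
            headers = [(k, v) for k, v in headers
                       if k.lower() not in ("content-length", "content-encoding")]
            headers.append(("Content-Encoding", "gzip"))
            headers.append(("Vary", "Accept-Encoding"))
            headers.append(("Content-Length", str(len(body))))
        start_response(captured["status"], headers)
        return [body]
    return wrapped


def hello_app(environ, start_response):
    # Hypothetical application used only to exercise the middleware.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello " * 100]
```

Note that this sketch has to read the entire response body before it can compress it, which is exactly the kind of behavior that causes trouble for streamed responses.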
Meanwhile, chunked transfer — which does involve an actual hop-by-hop header — is clearly forbidden, even though HTTP handles it just fine and it’s an integral part of certain techniques for long-polling (e.g., Comet) applications. As such, the spec basically derails the idea of building those types of applications within WSGI. When I mentioned this in passing in a thread on the Python web-sig list it was suggested that servers could simply look for certain signs that a response should be chunked, but that’s insufficient: for one thing, it requires the server to make guesses about what the application author meant to do. For another, it presupposes that any middlewares involved in the request/response cycle will implement the same heuristics and avoid attempts to consume the response body (since it’s likely that the response body will be some sort of iterable which can only be consumed once, or which poses a threat of prohibitive resource use or a gateway timeout if consumed all in one go).
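The middleware half of that problem can be shown in a few lines. The sketch below is hypothetical throughout (`make_stream_app`, `buffering_middleware` and `first_chunk` are names invented here, and a real long-polling application would block between events rather than loop over a range). It records when each chunk is produced, so you can see that a well-meaning middleware which joins the body forces every event to be generated before the server can send the first byte.

```python
def make_stream_app(n_events, log):
    """Build an app that produces its body lazily and logs each chunk it yields."""
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])

        def events():
            for n in range(n_events):
                log.append("produced %d" % n)
                yield ("event %d\n" % n).encode("ascii")
        return events()
    return app


def buffering_middleware(app):
    """A middleware that reads the whole body, e.g. to compute Content-Length."""
    def wrapped(environ, start_response):
        body = b"".join(app(environ, start_response))  # consumes the iterable
        return [body]
    return wrapped


def first_chunk(app):
    """Stand-in for a server that sends the first chunk as soon as it exists."""
    body = iter(app({}, lambda status, headers, exc_info=None: None))
    return next(body)


direct_log = []
direct = first_chunk(make_stream_app(3, direct_log))

buffered_log = []
buffered = first_chunk(buffering_middleware(make_stream_app(3, buffered_log)))
```

With the application called directly, only the first event has been produced when the first chunk goes out. With the buffering middleware in the chain, all three events are produced first and arrive as a single chunk; with an open-ended event stream the join would never return at all.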
If WSGI applications could use the Transfer-Encoding header we’d have an easy way to signal this to servers and middlewares, and although the implementation still wouldn’t be simple it would at least have the capability to be much more reliable. Django currently has several open tickets related to this very issue, and I wouldn’t be surprised if other libraries and frameworks face similar problems.
For one final example I’ll pick on a genuinely hard problem which I’ll be harping on again in a bit: character encoding. The WSGI spec impresses upon its readers (or upon this reader, at least) the overwhelming desire for everybody to just quiet down and use ISO-8859-1 instead of whatever character set is actually convenient. Although it does go so far as to mimic HTTP’s ability to use MIME-encoding for non-latin-1 characters, its requirements regarding string types then go on to place heavy burdens not only on Python implementations which have native Unicode strings, but also on applications which might want to do useful things like, say, implement a subclass of str which knows whether it needs to be HTML-escaped or not (which bit Django when we implemented auto-escaping; the “solution” is a throwaway upcast to str to satisfy the WSGI spec’s overzealous type checking).
Problems like these leave me with the feeling that WSGI simply isn’t up to the job of providing Python’s “one obvious way to do” HTTP.
Meanwhile, as an interface for programmers to actually implement and work with in their applications, WSGI feels like nothing so much as a blast from the past, and is designed from the ground up to give this impression. The original standard for web programming was, of course, CGI, and its programming model looked like this:
WSGI’s programming model, meanwhile, is as follows:
environ — containing keys which indicate various aspects of the request.
The parallels here are deliberate: at heart, WSGI is CGI, and this is allegedly a good thing. But it means that WSGI inherits a collection of pathological edge cases, and lays them all squarely at the application author’s feet.
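For readers who have not written against the raw interface, a minimal WSGI application looks roughly like this. The sketch assumes Python 3 conventions (bytes body, native-string headers); the name `application` is just the customary one, and the response text is made up for illustration.

```python
def application(environ, start_response):
    # CGI-style variables arrive as plain dictionary keys.
    method = environ.get("REQUEST_METHOD", "GET")
    path = environ.get("PATH_INFO", "/")
    query = environ.get("QUERY_STRING", "")

    body = ("%s %s?%s\n" % (method, path, query)).encode("utf-8")

    # Status line and headers go out through the callback...
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    # ...and the body is returned as an iterable of bytestrings.
    return [body]
```

Everything beyond this (query parsing, cookies, form data, uploads) is left to whoever writes the callable. To try it out, the standard library's `wsgiref.simple_server.make_server` can serve it.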
If there were a good, solid, standard implementation of WSGI request parsing that everybody used (say, in a module in the Python standard library), this wouldn’t be as big a problem. But the reference implementation of WSGI is incomplete, and the bits of the standard library which would obviously supplement it have issues (once again, Armin provides useful explanations).
In a way, WSGI has been a victim of its own success. It’s been so heavily promoted as an easy and standard way to implement Python web applications that many developers have simply jumped in head-first and rolled their own solutions, which often turn out to be incomplete or incorrect: the intersection of HTTP, web server quirks, CGI backwards-compatibility and the Python standard library is full of “fun” situations which make the development of solid, reliable WSGI stacks far more complex and subtle than the marketing materials would have you believe. The result is that while there are a few good (as in, mostly complete and free of major bugs) implementations, most people aren’t using them. And even if people tried to use them, dependencies between third-party Python packages are a whole ‘nother world of pain.
But that’s really just the tip of the iceberg. WSGI’s insistence on the CGI programming model, for example, means that non-trivial WSGI stacks have to burn a lot of cycles doing useless work, since every application and every middleware in the chain has to do its own parsing of the environ when invoked. Parsing once and handing the result down the stack is not how WSGI is meant to work.
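A small sketch of the duplicated work, with hypothetical component names (`language_middleware`, `greeting_app`) and a `parse_log` list added purely to count the parses. Both components need the query string, and since the spec gives them no shared parsed representation, each one parses it for itself.

```python
from urllib.parse import parse_qs

parse_log = []  # records which component parsed the query string


def parse_query(environ, who):
    parse_log.append(who)
    return parse_qs(environ.get("QUERY_STRING", ""))


def language_middleware(app):
    """Middleware that reads ?lang=... and adds a Content-Language header."""
    def wrapped(environ, start_response):
        lang = parse_query(environ, "middleware").get("lang", ["en"])[0]

        def add_language(status, headers, exc_info=None):
            extra = [("Content-Language", lang)]
            return start_response(status, list(headers) + extra, exc_info)
        return app(environ, add_language)
    return wrapped


def greeting_app(environ, start_response):
    """Application that reads ?name=... from the very same query string."""
    name = parse_query(environ, "application").get("name", ["world"])[0]
    body = ("hello, %s\n" % name).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [body]
```

One request through `language_middleware(greeting_app)` leaves two entries in `parse_log`. Multiply that by cookies, form bodies and a deeper middleware chain and the cost adds up.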
WSGI’s curious insistence on compatibility with CGI also means that, here in 2009, the Python web-development world still hasn’t been able to significantly improve on 1997’s application programming model. Various libraries and frameworks have implemented useful, normalized abstractions and simpler object-oriented APIs for HTTP requests and responses, but adherence to WSGI rules out (in the practical sense that arbitrary WSGI components can’t be relied on to have any knowledge of these abstractions or APIs) meaningful reuse and interoperability. As a result, the only thing we can really count on is an interface that’s so low-level and complex that the first thing most people do is try to hide it under something easier to work with.
And, of course, character encoding rears its ugly head again. WSGI requires that (in Python 2.x) everything be of type str or StringType. In other words, bytestrings. I’ve heard that some popular libraries and frameworks do their best to handle encoding quirks and just let application authors deal with Unicode (translating to/from bytestrings at the boundaries), but once again it’s something that has to be handled repetitively and independently at each point in the processing chain. Current discussion on the future of WSGI seems to be favoring a continuation of this “feature” into Python 3.x, where it will grow from being incredibly annoying to being downright dangerous, since Python 3.x does not allow you to be as promiscuous about mixing Unicode and bytes as Python 2.x. The result, if implemented, is likely to be lots of preventable type errors and lots of subtler, hard-to-diagnose problems resulting from the mismatch of a Unicode-based string type and incompatible byte-based WSGI environments.
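Both failure modes are easy to reproduce in Python 3. The values below are made up for illustration: a request path that arrives as UTF-8 bytes, and a native string that some component wants to combine with it.

```python
path_bytes = b"/caf\xc3\xa9"   # bytes as they arrive on the wire (UTF-8 here)
prefix = "/static"             # a native (Unicode) string in Python 3

# The loud failure: Python 3 refuses to mix the two types.
try:
    prefix + path_bytes
except TypeError as exc:
    mixing_error = exc

# The quiet failure: decoding with the wrong charset raises nothing,
# it just produces the wrong text.
wrong = path_bytes.decode("iso-8859-1")
right = path_bytes.decode("utf-8")
```

The `TypeError` is the preventable kind of problem. The ISO-8859-1 decode is the subtle kind: it always succeeds, so the corrupted path travels on through the stack until a person notices it.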
Finally, WSGI simply cannot live up to what’s expected of it in terms of providing an interface which allows arbitrary Python web components to be interoperable enough to compose useful applications. While this is not technically a goal of WSGI — which aims simply to provide a way for Python applications to speak HTTP — it’s become something of a major focus in the last couple of years. Given that the same thing seems to be happening elsewhere (e.g., with Ruby and Rack), I suspect there’s a corollary to Zawinski’s Law at work here: every gateway interface expands until it looks sort of like a framework API.
The problem is that WSGI isn’t and never will be a framework API, and attempts to use it in place of one are pretty much doomed to eternal complication. One obvious issue is that the only way to add additional processing in WSGI is by introducing middlewares between the server and the application, creating a sort of onion-skin model where a request passes from the server, through one or more layers of middleware, finally arriving at the application which emits a response, which goes back through the middlewares and out to the server. This sounds like a workable model, but it really isn’t, and Django has learned that lesson the hard way: we have an onion-skin middleware system, and it’s resulted in things which are one logical unit of functionality being broken up into separate physical chunks of code because otherwise you get into catch-22 situations where there’s no one order of middlewares that will do what you want (e.g., Middleware A needs to be invoked before Middleware B in both request and response processing).
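The ordering constraint falls directly out of function wrapping, as this sketch shows. All names are hypothetical, and "response" here simply means "after the wrapped callable returns", which is a simplification of real response processing.

```python
def make_middleware(name, trace):
    """Build a middleware that records when it sees the request and the response."""
    def middleware(app):
        def wrapped(environ, start_response):
            trace.append("%s: request" % name)
            result = app(environ, start_response)
            trace.append("%s: response" % name)
            return result
        return wrapped
    return middleware


def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


trace = []
a = make_middleware("A", trace)
b = make_middleware("B", trace)

# A is the outer layer of the onion, B the inner one.
stack = a(b(app))
stack({}, lambda status, headers, exc_info=None: None)
```

After the call, `trace` reads A request, B request, B response, A response. Swapping the wrapping order flips both halves at once, so no arrangement puts A ahead of B on the way in and on the way out.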
This is, I think, why some sort of “lifecycle” method API is a frequent request for WSGI. Having an officially-blessed way to write some code and insert it at a precise point in the processing chain would be incredibly useful and open up a lot of functionality that’s either difficult or impossible to obtain right now.
A deeper issue is that the only official way for servers, middlewares and applications to communicate with each other is by passing around the WSGI environ (on the incoming request) or the HTTP headers (on the outgoing response). This opens up a couple nasty cans of worms:
Trying to go outside WSGI by setting up side channels doesn’t help with either of these problems. And the only other solution seems to be monkeypatching (for example, if a middleware wants to signal to some component that it should enter a non-default or debugging mode, it could reach in and directly tweak some bit of the relevant code), but that’s a “solution” that’s likely to be worse than the problem.
Unfortunately, I don’t think any of these problems have simple solutions. An easier-said-than-done summary corresponding to my main bullet points might be:
Of course, I could be completely wrong about all of this, and if I am I’m sure someone will helpfully point that out or offer alternate ideas. So if you’ve got ‘em, fire away :)
Well put.
Time is due for a WSGI successor and it’s good to see the people in charge starting to talk about it. I’m looking forward to the outcome of this.
Zerost: great writeup. :)
First, “Content-Encoding: gzip” is just fine; WSGI never intended to restrict it.
Second, the problem of accidental buffering by middleware or other components will continue to exist regardless of any new standards for declaring whether a response should be streamed or not. The “proof” I offer is several years of providing streaming in CherryPy, carefully honoring it in all distro’d code, and watching people still get tripped up.
Regarding the CherryPy approach of allowing the server to chunk the response when no Content-Length response header is provided by the application: I don’t think of it as a “guess”. What else should a server do if no Content-Length is given? Buffer it and provide its own Content-Length? You want to make the problem I just mentioned above even more prevalent? ;) A client that receives an HTTP response that has no Content-Length knows it’s either chunked or ends when the conn closes; I’ve found continuing that semantic up through WSGI works very well.
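As a rough illustration of that server-side rule (this is not CherryPy's code; `needs_chunking` and `chunk_encode` are hypothetical helpers), a server can check the application's headers for Content-Length and, when it is absent and the client speaks HTTP/1.1, frame each block from the response iterable as an HTTP/1.1 chunk:

```python
def needs_chunking(headers):
    """True when the application did not declare a Content-Length."""
    return not any(name.lower() == "content-length" for name, _ in headers)


def chunk_encode(body_iter):
    """Frame each block as an HTTP/1.1 chunk: hex size, CRLF, data, CRLF."""
    for block in body_iter:
        if not block:
            continue  # a zero-length chunk would end the body early
        yield b"%x\r\n" % len(block) + block + b"\r\n"
    yield b"0\r\n\r\n"  # terminating chunk, with no trailers
```

Because each block is framed as it comes off the iterable, nothing is buffered. The application's choice of block boundaries is what ends up on the wire.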
Third, yes, reparsing the environ for each middleware is a huge burden. I already mentioned two years ago the costs in building your app like an onion, and six months before that the problem of middleware ordering. I continue to encourage the Django and Pylons teams to move their middleware-graph-composition step from startup time to request time in order to avoid some of the costs (no matter that ‘middleware’ means two different things in those frameworks—they both need more trees and less onions). I realize the functional design of WSGI, where you can only traverse the graph by calling it, actively works against this in one of the few Python domains where the performance of every single function call can be of very high importance. Web servers have to work like theatre productions: do everything you possibly can before the curtain goes up to make sure there are no delays during the show. Django doesn’t need to wait for a new spec to realize these gains.
Because of the ordering problem, I would think any new interface that had the goal of “composing useful applications” would have to define multiple processing stages, so that logic which needs to run, say, before the request body is read, or after the response has been written out, can be effectively isolated to that stage, as CherryPy 3’s hooks and tools do. Or you could skip the 10 years waiting for such a spec and just build Django 2 on top of CherryPy 3 this week. ;)
You may think that WSGI forbids chunking, but this isn’t true; it just doesn’t support it well enough. The WSGI spec clearly allows an application to return an iterator as its output. It just doesn’t specify how and when this iterator should be flattened.
Take a look at this project: http://code.google.com/p/evserver/ (disclaimer: I’m the author). Also, this extension (which in fact is compatible with WSGI), can help: http://mail.python.org/pipermail/web-sig/2008-May/003439.html
I’d like to suggest simply providing high quality HTTP request & response message classes, and a more robust HTTP server in the standard library. At this point, I don’t see why WSGI needs to exist at all (I’ve been trying to find a convincing reason for oh, 5 years now?), and we can happily take off and nuke the bugger from orbit.
For those that need simplicity, using the existing CGI interface would suffice. For those that need more, running an application persistently and with its own web server (probably behind a proxy) is a simpler, much more obvious choice than passing random function objects around that return 2-tuples of crap.
As for ‘middleware’, I grimace every time I read that term. I don’t know where it came from, I don’t know what it means. If we need some convention for discrete modules to manipulate shared data, well, they’re called interfaces, and perhaps all we need is one or two abstract base classes added to the stdlib that unifies those modules.
What about a WSGI and layer-on-top-of-WSGI openspace at djangocon?
Robert’s suggestion about chunking is good; simply documenting it in PEP 333 would be sufficient IMHO. The criticism about Content-Encoding just isn’t correct; you should correct this in the main article. I’m not sure what you are talking about with ISO-8859-1, except perhaps an overly optimistic reading of the HTTP spec (its inherited notion about the encoding of headers). This is just a suggestion; because WSGI works with bytes, it’s not directly enforced. I suppose it is true that PEP 333 doesn’t document the actual realities of HTTP header encoding. Eh.
As for a library that handles the issues you bring up, there is WebOb. It handles:
Getting something into the standard library does not seem to me to be a very good path right now. The standard library (a) isn’t retroactive, (b) doesn’t allow for regular releases, (c) is based on a consensus process which doesn’t actively include many of the people interested in the problem (I for one don’t track Python-Dev). We could have an ad hoc standard of sorts if a substantial portion of the Python web community agreed on something. If that worked, and the result was something really stable, maybe in a while we could talk about actually putting it in the standard library; but if Python packaging is hard now, the standard library is only a big step backwards from that. (And I think figuring out packaging is also within shooting distance — if that’s what is holding up Django on using external libraries, consider resolving that instead of avoiding external libraries entirely.)
Yes, WSGI has some issues. You’ve covered some of them well here.
But we have WebOb. WebOb is great, and cleanly solves all of the issues you’ve addressed (and more). Why does it matter whether or not there is an “official” WSGI-like standard when we have WebOb? As Ian Bicking points out, putting things in the standard library doesn’t really help anything. It’s not like a WSGI 2.0 spec would be any more or less likely to be used than WebOb is now.
Incidentally, though it is a thought experiment and not tested code (and it has a couple things missing), here’s an attempt to mimic the Django request and response objects with a WebOb subclass: http://svn.pythonpaste.org/Paste/WebOb/branches/ianb-decorator-experiment/webob/django.py
For what it’s worth, I can understand most of your issues with WSGI. Is WSGI perfect? Of course not. It was a first attempt at standardizing an interface in the simplest way possible. It has been hugely successful in that regard, in my opinion. We make heavy use of WSGI through Pylons/TurboGears in ShootQ.com, and have implemented several parts of the application using WSGI middleware.
That being said, I think it’s time for the community to move on to higher-level abstractions, and I think WebOb is the perfect place for that to happen. WSGI is a perfectly reasonable low-level standard, and has served its purpose quite well. Now, it’s time for the community to build new de-facto standards on top of it, to further increase integration between frameworks.
Currently, the main obstacle to this is Django, to be quite honest. The other major frameworks seem to embrace WSGI because it allows for interoperability and the sharing of code and abstractions. Django isn’t as interested in WSGI because they have higher level abstractions already available to them. I’d love to see a world where Django, Pylons, TurboGears, etc. all share a common request/response object, at the very least.
Nice post, James. It’s leading to some interesting conversation and thoughts!
As far as gzipping goes, I believe that is way faster when done on the web server side. Can you name a situation where you would want more control than you can have on the web server?
I’m working with a django developer to train me up a bit on Python and Django. I’m having trouble understanding WSGI though, and this article was actually more informative than Nuovo was. I think I understand how Python is interacting with the web server now. I’m planning on using NGINX in front of Apache for my application.