Media and performance

June 23, 2008 Django

Ever since last September when I moved this site off the shared-hosting account which had been handling it from its initial launch, I’ve been using separate services to handle static files — “media” in common Django parlance — instead of using the same web server instance, or a separate instance running on the same physical server as the rest of the site. Specifically, I’m using Amazon S3.

When I first explained this a few months ago, I got a bit of pushback and a few questions, both in comments and in private emails, about why the Django documentation recommends doing this; for example, the mod_python deployment docs say that:

Django doesn’t serve media files itself; it leaves that job to whichever Web server you choose.

We recommend using a separate Web server — i.e., one that’s not also running Django — for serving media.

And with my most recent redesign/server move I’ve been getting similar questions again, because I’ve switched from mod_python to mod_wsgi, in daemon mode, for serving Django, and mod_wsgi’s documentation says that:

Because the WSGI applications in daemon mode are being run in their own processes, the impact on the normal Apache child processes used to serve up static files and host applications using Apache modules for PHP, Perl or some other language is minimal.

The general impression people get, however, is that serving both media and Django from the same server will have an adverse impact on performance, which causes some misunderstandings and hides the real issue. So I’d like to recap and explain in a bit more detail why, exactly, it’s considered important to have a separate server (or a separate service, in the case of something like S3) handling your media.

The wrong kind of performance analysis

The biggest point of confusion on this topic is simply a misunderstanding of what’s meant by “performance” in this context. Unfortunately, “performance” is a bit of a loaded word; a lot of folks hear that handling Django and media from the same server is bad for performance and make some assumptions based on that, typically that it means you’ll be using more memory or more CPU time or that the I/O overhead of serving files will slow things down, or… well, a lot of things that aren’t necessarily accurate.

While it’s definitely true that most web servers can and should be tuned for performance, and that media serving is generally best handled by a configuration that’s different from what you’d want for serving a dynamic application like Django, these sorts of concerns really should be the least of your worries if you’re debating how to handle your media.

So the first thing to do is to stop thinking about “performance” in these terms and start thinking about it in the right terms.

The right kind of performance analysis

To understand why it’s a good thing to have a separate media server, it might be better to talk about a different term: capacity. To see what that means, let’s consider a typical server setup for Django. For that, we’ll need some numbers to work with:

Assume a VPS with 256MB of RAM available.
Assume that everything that’s not web serving — database server, operating-system services, etc. — can fit into 128MB, leaving 128MB for the web server.
Assume that the deployment strategy is Apache/mod_python, using a process-based model, and that each Apache process will want about 16MB when fully loaded up with a Python interpreter, Django, application code, etc.
Assume that a full request/response cycle takes half a second.

These numbers are, of course, purely hypothetical, but — except for the last one — they’re not too far off from what you might see in a real setup. The actual request/response time will, of course, vary quite a bit depending on factors like the complexity of each type of request, network latency and bandwidth at the remote client, but for purposes of illustrating this point I’ll run with half a second as a decent average. In the real world, you’d probably want to minimize these effects by using caching and/or by sticking a lightweight proxy in front of the actual application server (Perlbal and nginx are good choices) to let Apache only talk over your local network while the proxy deals with the outside world.

So. Let’s do some math.

Given k megabytes of available RAM and a need for n megabytes per Apache process, obviously we can have at most k / n Apache processes. In this hypothetical case it’s 128 / 16 = 8 Apache processes maximum.

In a process-based model you can handle one request per process at a time, which means that this configuration can handle eight concurrent requests. A full request/response cycle of the application is half a second, so we can handle sixteen requests per second, working out to a maximum capacity of not quite 1.4 million requests per day. Not bad.
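For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The constant and function names are my own, chosen for illustration, and the figures are the hypothetical ones assumed above, not measurements from a real server.

```python
# Hypothetical figures taken from the assumptions listed above.
WEB_SERVER_RAM_MB = 128      # RAM left over for the web server
RAM_PER_PROCESS_MB = 16      # one fully loaded Apache/mod_python process
SECONDS_PER_REQUEST = 0.5    # one full request/response cycle
SECONDS_PER_DAY = 24 * 60 * 60


def max_processes(available_mb, per_process_mb):
    """At most k / n whole processes fit into k megabytes of RAM."""
    return available_mb // per_process_mb


def requests_per_second(processes, seconds_per_request):
    """Each process handles exactly one request at a time."""
    return processes / seconds_per_request


processes = max_processes(WEB_SERVER_RAM_MB, RAM_PER_PROCESS_MB)
per_second = requests_per_second(processes, SECONDS_PER_REQUEST)
per_day = per_second * SECONDS_PER_DAY

print(processes)    # 8
print(per_second)   # 16.0
print(per_day)      # 1382400.0
```

Changing any of the constants shows how sensitive the ceiling is to memory per process and to response time.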

But what happens if we let those Apache processes serve both the application and static media files? Well, now we have to consider two types of requests: requests which involve a response from the application, and requests which involve serving a file off disk. And these two types are not equal in importance: people aren’t dropping by just to read our stylesheet, they’re coming to see whatever it is our application does. Given that, let’s say that our “effective capacity” is the maximum number of application requests we can serve. So long as we keep media separate, our effective capacity is about 1.4 million application requests per day.

If we start handling media in the same server instance, in the best possible case we might have something approximating one media request to serve for each application request we serve. Maybe we’re not using any images or JavaScript at all, just a stylesheet, or maybe we have CSS and JavaScript inline in style and script elements and use a mind-boggling sprite to keep the number of images down to one. In this case each client will either have two concurrent connections, meaning we can support four concurrent clients instead of eight, or will issue two consecutive requests.

If we stick to a half-second response time estimate, the numbers come out the same either way: before, we were able to do sixteen application requests per second, and now we can hopefully average eight. We were able to do about 1.4 million application requests per day, and now the best we can look forward to is about 700,000.

By handling both types of requests in the same server instance, we just cut our effective capacity in half. And I doubt very much that anyone would ever really get things down to just a single media request per application request (even with aggressive use of HTTP caching features for static files), which means that the typical case will be even worse:

If we have two media requests per application request, we’ve cut our effective capacity by about 67%.
If we have three media requests per application request, we’ve cut our effective capacity by 75%.
If we have four media requests per application request, we’ve cut our effective capacity by 80%.
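Those percentages all come from the same simple relationship, sketched below in Python. It assumes, as the estimate above does, that a media request ties up a process for just as long as an application request; the function name is hypothetical and the baseline is the sixteen-requests-per-second figure from earlier.

```python
def effective_capacity_fraction(media_per_app_request):
    """Share of request slots still available for application requests.

    Assumes every request, media or application, occupies a process
    for the same amount of time.
    """
    return 1 / (1 + media_per_app_request)


# Sixteen requests per second for a full day, from the earlier estimate.
BASELINE_PER_DAY = 16 * 24 * 60 * 60

for media in range(1, 5):
    fraction = effective_capacity_fraction(media)
    cut_percent = round((1 - fraction) * 100)
    # Integer division keeps the daily figure exact.
    per_day = BASELINE_PER_DAY // (1 + media)
    print(media, cut_percent, per_day)
```

With one media request per application request the loop reports a 50% cut and 691,200 application requests per day, and the cut grows to 67%, 75% and 80% as the media count goes from two to four.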

Ouch.

And the picture doesn’t get any better if we change things around:

If we switch to mod_wsgi and use daemon mode, the capacity problem doesn’t go away, because an Apache process that’s serving a file is an Apache process that can’t be talking to a mod_wsgi daemon process at the same time.
If we switch to a threaded model the capacity problem doesn’t go away, because we’re still cutting the number of web-serving units — either processes or threads — available at any given moment to service an application request.
If we drop a proxy out in front and have it handle the static files the capacity problem doesn’t go away, because any process or thread in the proxy which is serving a file can’t be talking to one of the application server processes at the same time. As some of the comments have pointed out, the performance of some of the better proxying servers makes this last one moot. But hey, at that point you’re using multiple servers anyway.

The moral of the story

Even when you factor in normal variations in request/response times, even if you switch to a threaded server model, even if you switch from Apache/mod_python to some other deployment option, even if you take client-side caching of static files into account, two clear and simple facts remain:

Handling both media and application through the same server instance (or through the same proxy in front of one or more separate servers) cuts your effective capacity, probably by a significant amount.
Any change — for better or for worse — in average request/response times as a result of serving media (e.g., slow or fast disk I/O, hot or cold filesystem caches, small or large files, etc.) is likely to be drowned out by the simple fact that you are serving media, and that every media request you handle cuts your effective capacity by some fraction of an application request.

There’s simply no way around this.

The conclusion to take away from this is that you really do want to have at least a separate server instance for your media. Whether that means another HTTP daemon running on the same box or a completely separate machine or hosting service is up to you and depends on factors not covered here (e.g., available system resources, time to spend on system administration tasks, costs of separate media hosting, etc.).
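On the Django side, pointing a site at a separate media server is mostly a matter of one setting. The fragment below is only a sketch: MEDIA_URL is Django's setting for the base URL of media files, but the hostname is hypothetical and stands in for whatever server or service actually hosts your files. Later Django releases also provide STATIC_URL for static assets, so check which setting your version expects.

```python
# settings.py (fragment)
# "media.example.com" is a hypothetical hostname used for illustration;
# replace it with the server or service that really hosts your files.
MEDIA_URL = "http://media.example.com/"
```

The web server configuration for the media host itself is a separate matter and isn't shown here.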

In several posts and presentations, I’ve seen people who manage large sites make an obvious point: particular pieces of your stack (programming language, web server, database server, etc.) don’t “scale” all by themselves. Overall architectures scale. What I’m hopefully explaining here is that an architecture which handles media and application in the same server instance does not scale, or at least doesn’t scale as gracefully as one that keeps them separate: even if you have the fastest language, the fastest web server, the fastest database, the fastest operating system, the fastest hardware, the fastest everything, all tuned until they scream, it won’t make this problem go away. Serving media and application from the same server instance will always cut your effective capacity and thus will always mean you’re using more resources (and, hence, paying more money) to meet the demands of your traffic than if you keep them separate.

What about low-end hosting?

The usual objection to all of this, of course, is from someone who’s using a low-cost shared hosting service which doesn’t allow a separate web server to be used for media; often this goes hand in hand with a complaint that obtaining a separate service for media hosting would be prohibitively expensive.

If this is coupled with a situation where concerns about capacity simply don’t matter — e.g., if it’s expected that an application will receive very little or only sporadic traffic — then most of the above is irrelevant, and using the facilities the host offers for delegating to either Django or static files from the same server instance shouldn’t be a problem.

But it’s worth pointing out once again that dedicated media hosting can be had at a ludicrously cheap rate; my bill from S3 for an entire year of service is going to come to less than the price of a decent lunch, for example (even with the use of S3 for both media and backups, it comes out to about 40 cents per month). At those prices, it’s hard to argue that media hosting is too expensive, and doing things right to start with leads to an easier process later, if and when you eventually need to grow.

So really, there’s no good argument against separating these concerns, and some very good arguments in favor of it. What are you waiting for?