Media and performance

An entry published by James Bennett on June 23, 2008, part of the category Django. 12 comments posted.

Ever since last September when I moved this site off the shared-hosting account which had been handling it from its initial launch, I’ve been using separate services to handle static files — “media” in common Django parlance — instead of using the same web server instance, or a separate instance running on the same physical server as the rest of the site. Specifically, I’m using Amazon S3.

When I first explained this a few months ago, I got a bit of pushback and a few questions, both in comments and in private emails, about why the Django documentation recommends doing this; for example, the mod_python deployment docs say that:

Django doesn’t serve media files itself; it leaves that job to whichever Web server you choose.

We recommend using a separate Web server — i.e., one that’s not also running Django — for serving media.

And with my most recent redesign/server move I’ve been getting similar questions again, because I’ve switched from mod_python to mod_wsgi, in daemon mode, for serving Django, and mod_wsgi’s documentation says:

Because the WSGI applications in daemon mode are being run in their own processes, the impact on the normal Apache child processes used to serve up static files and host applications using Apache modules for PHP, Perl or some other language is minimal.

The general impression people get, however, is that serving both media and Django will have an adverse impact on performance, which causes some misunderstandings and hides the real issue. So I’d like to recap and explain in a bit more detail why, exactly, it’s considered important to have a separate server (or a separate service, in the case of something like S3) handling your media.

The wrong kind of performance analysis

The biggest point of confusion on this topic is simply a misunderstanding of what’s meant by “performance” in this context. Unfortunately, “performance” is a bit of a loaded word; a lot of folks hear that handling Django and media from the same server is bad for performance and make some assumptions based on that, typically that it means you’ll be using more memory or more CPU time or that the I/O overhead of serving files will slow things down, or… well, a lot of things that aren’t necessarily accurate.

While it’s definitely true that most web servers can and should be tuned for performance, and that media serving is generally best handled by a configuration that’s different from what you’d want for serving a dynamic application like Django, these sorts of concerns really should be the least of your worries if you’re debating how to handle your media.

So the first thing to do is to stop thinking about “performance” in these terms and start thinking about it in the right terms.

The right kind of performance analysis

In order to understand why it’s a good thing to have a separate media server, it might be better to talk about a different term: capacity. In order to understand that, let’s consider a typical server setup for Django. So we’ll need some numbers to work with:

  - RAM available for Apache: 128 megabytes
  - RAM needed per Apache process: 16 megabytes
  - Average time for a full request/response cycle: half a second

These numbers are, of course, purely hypothetical, but — except for the last one — they’re not too far off from what you might see in a real setup. The actual request/response time will, of course, vary quite a bit depending on factors like the complexity of each type of request, network latency and bandwidth at the remote client, but for purposes of illustrating this point I’ll run with half a second as a decent average. In the real world, you’d probably want to minimize these effects by using caching and/or by sticking a lightweight proxy in front of the actual application server (Perlbal and nginx are good choices) to let Apache only talk over your local network while the proxy deals with the outside world.

So. Let’s do some math.

Given k megabytes of available RAM and a need for n megabytes per Apache process, obviously we can have at most k / n Apache processes. In this hypothetical case it’s 128 / 16 = 8 Apache processes maximum.

In a process-based model you can handle one request per process at a time, which means that this configuration can handle eight concurrent requests. A full request/response cycle of the application is half a second, so we can handle sixteen requests per second, working out to a maximum capacity of not quite 1.4 million requests per day. Not bad.
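
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the hypothetical numbers used in this post (128 MB of RAM, 16 MB per process, half a second per request); the function names are made up for the example.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def max_processes(ram_mb, mb_per_process):
    """k / n: how many Apache processes fit in the available RAM."""
    return ram_mb // mb_per_process


def requests_per_day(processes, seconds_per_request):
    """One request per process at a time, so throughput scales with process count."""
    per_second = processes / seconds_per_request
    return round(per_second * SECONDS_PER_DAY)


processes = max_processes(128, 16)
print(processes)                         # 8
print(requests_per_day(processes, 0.5))  # 1382400, i.e. not quite 1.4 million
```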

But what happens if we let those Apache processes serve both the application and static media files? Well, now we have to consider two types of requests: requests which involve a response from the application, and requests which involve serving a file off disk. And these two types are not equal in importance: people aren’t dropping by just to read our stylesheet, they’re coming to see whatever it is our application does. Given that, let’s say that our “effective capacity” is the maximum number of application requests we can serve. So long as we keep media separate, our effective capacity is about 1.4 million application requests per day.

If we start handling media in the same server instance, in the best possible case we might have something approximating one media request to serve for each application request we serve. Maybe we’re not using any images or JavaScript at all, just a stylesheet, or maybe we have CSS and JavaScript inline in style and script elements and use a mind-boggling sprite to keep the number of images down to one. In this case each client will either have two concurrent connections, meaning we can support four concurrent clients instead of eight, or will issue two consecutive requests.

If we stick to a half-second response time estimate, the numbers come out the same either way: before, we were able to do sixteen application requests per second, and now we can hopefully average eight. We were able to do about 1.4 million application requests per day, and now the best we can look forward to is about 700,000.
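
As a rough sketch, the same calculation can be parameterized by the number of media requests that accompany each application request. It assumes, as the text does, that every request of either type ties up one process for the same half second; the function name and the `media_per_app_request` parameter are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def effective_capacity(processes, seconds_per_request, media_per_app_request):
    """Application requests per day when the same processes also serve media.

    Assumes every request, media or application, occupies one process
    for seconds_per_request.
    """
    total_per_second = processes / seconds_per_request
    # Only one out of every (1 + media_per_app_request) requests is an
    # application request; the rest are spent serving files.
    app_per_second = total_per_second / (1 + media_per_app_request)
    return round(app_per_second * SECONDS_PER_DAY)


print(effective_capacity(8, 0.5, 0))  # 1382400: media served elsewhere
print(effective_capacity(8, 0.5, 1))  # 691200: one media request per application request
```

Passing a larger value for `media_per_app_request` shows how quickly the figure drops as a page pulls in more files.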

By handling both types of requests in the same server instance, we just cut our effective capacity in half. And I doubt very much that anyone would ever really get things down to just a single media request per application request (even with aggressive use of HTTP caching features for static files), which means that the typical case will be even worse:

Ouch.

And the picture doesn’t get any better if we change things around:

The moral of the story

Even when you factor in normal variations in request/response times, even if you switch to a threaded server model, even if you switch from Apache/mod_python to some other deployment option, even if you take client-side caching of static files into account, two clear and simple facts remain:

  1. Handling both media and application through the same server instance (or through the same proxy in front of one or more separate servers) cuts your effective capacity, probably by a significant amount.
  2. Any change — for better or for worse — in average request/response times as a result of serving media (e.g., slow or fast disk I/O, hot or cold filesystem caches, small or large files, etc.) is likely to be drowned out by the simple fact that you are serving media, and that every media request you handle cuts your effective capacity by some fraction of an application request.

There’s simply no way around this.

The conclusion to take away from this is that you really do want to have at least a separate server instance for your media. Whether that means another HTTP daemon running on the same box or a completely separate machine or hosting service is up to you and depends on factors not covered here (e.g., available system resources, time to spend on system administration tasks, costs of separate media hosting, etc.).
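
On the Django side, pointing a site at a separate media server is usually a matter of one setting. The fragment below is a minimal sketch, not this site's actual configuration: `MEDIA_URL` is a real Django setting, but the hostname is a hypothetical placeholder.

```python
# settings.py (sketch). The hostname below is a hypothetical placeholder;
# substitute whatever separate server or service actually hosts the files.
MEDIA_URL = "http://media.example.com/"
```

Templates and code that build media links from `MEDIA_URL` then send browsers to the separate host instead of the application server.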

In several posts and presentations, I’ve seen people who manage large sites make an obvious point: particular pieces of your stack — programming language, web server, database server, etc. — don’t “scale” all by themselves. Overall architectures scale. What I’m hopefully explaining here is that an architecture which handles media and application in the same server instance does not scale, or at least doesn’t scale as gracefully as one that doesn’t: even if you have the fastest language, the fastest web server, the fastest database, the fastest operating system, the fastest hardware, the fastest everything, all tuned until they scream, it won’t make this problem go away. Serving media and application from the same server instance will always cut your effective capacity and thus will always mean you’re using more resources (and, hence, paying more money) to meet the demands of your traffic than if you keep them separate.

What about low-end hosting?

The usual objection to all of this, of course, is from someone who’s using a low-cost shared hosting service which doesn’t allow a separate web server to be used for media; often this goes hand in hand with a complaint that obtaining a separate service for media hosting would be prohibitively expensive.

If this is coupled with a situation where concerns about capacity simply don’t matter — e.g., if it’s expected that an application will receive very little or only sporadic traffic — then most of the above is irrelevant, and using the facilities the host offers for delegating to either Django or static files from the same server instance shouldn’t be a problem.

But it’s worth pointing out once again that dedicated media hosting can be had at a ludicrously cheap rate; my bill from S3 for an entire year of service is going to come to less than the price of a decent lunch, for example (even with the use of S3 for both media and backups, it comes out to about 40 cents per month). At those prices, it’s hard to argue that media hosting is too expensive, and doing things right to start with leads to an easier process later, if and when you eventually need to grow.

So really, there’s no good argument against separating these concerns, and some very good arguments in favor of it. What are you waiting for?

On June 23, 2008, Martin said:

Hey, your new design doesn’t have any media! Except the CSS and the JS…

On June 23, 2008, Eric Florenzano said:

I agree with the idea in general, but I also think that there’s a good middle ground:

If you set up nginx or perlbal in front of your apache/mod_wsgi server, you can have literally tens of thousands of extremely lightweight static-media-serving threads if the url matches some regex, while proxying back to the apache setup if it doesn’t match.

This is what I do with all of my sites, and it has worked extremely well. There are also many added benefits of having this proxy server, such as solving the spoonfeeding problem, caching (like you mentioned), and also lightweight redirects that don’t ever hit an Apache process (think redirecting http://www.b-list.org/ to http://b-list.org/).

So even though it may only be 40 cents per month for S3, for medium-sized sites that already have a sensible reverse proxy setup, I think a great solution that costs 40 cents less is to let the proxy server itself handle the static media.

On June 23, 2008, Ben said:

Nice explanation of the reasons for offloading static content. It will be nice to have an intelligible piece that I can direct people toward when they come asking questions. Thanks for that.

One thing that I’m still curious about is how you handle putting media onto S3.

On June 23, 2008, David, biologeek said:

@Ben: that’s one of the goals of the new storage backend http://code.djangoproject.com/ticket/5361 with the appropriate S3 storage http://code.djangoproject.com/ticket/6390

On June 23, 2008, Eric Florenzano said:

By the way, I realize that with this:

If we drop a proxy out in front and have it handle the static files the capacity problem doesn’t go away, because any process or thread in the proxy which is serving a file can’t be talking to one of the application server processes at the same time.

You’re addressing what I was talking about in my previous comment. However, when you have tens to hundreds of thousands of proxy/media threads available, and only tens to hundreds of apache/mod_wsgi processes, the sheer number of leftover threads negates the threat of proxy thread saturation.

On June 23, 2008, Simon Willison said:

“If we drop a proxy out in front and have it handle the static files the capacity problem doesn’t go away, because any process or thread in the proxy which is serving a file can’t be talking to one of the application server processes at the same time.”

I disagree. As Eric Florenzano points out, nginx doesn’t have this problem because nginx doesn’t use one process / thread per connection. nginx will quite happily serve up 1000+ static file requests per second using just 4MB of RAM and simultaneously proxy the dynamic requests through to Apache/mod_python running on the backend. It solves the problem you describe perfectly.

There’s still a performance benefit to be had in serving static files from a separate server though: it works around the two connection per host limit in most browsers which means they can load more resources in parallel.

On June 23, 2008, Nicholas Riley said:

Simon Willison said pretty much what I wanted to say regarding nginx, lighttpd and other event-based web servers…although there’s no reason why “static.mydomain.com” and “www.mydomain.com” (or whatever) can’t be served from the same nginx to get around the two connection per host limit.

I’ve seen a few sites where flaky static file serving (especially large JavaScript libraries, which can’t be loaded asynchronously like images) significantly slows down page loading. Typically, backend static servers aren’t being effectively load balanced or monitored, so some backend servers end up slower or flakier than others. So don’t forget to monitor the reliability of your static serving infrastructure just as you do your application server(s)!

On June 23, 2008, pytechd said:

One minor snag is when your entire application is HTTPS-only you run into performance problems with a separate service — each page then requires two separate connections. On slower computers, the additional time can add up quickly.

Any tips to avoid that?

On June 23, 2008, Rob Hudson said:

James,

I notice that your CSS is both set for cache busting (has a date stamp on it) and is gzipped on S3. I know Amazon S3 doesn’t do content encoding negotiation so this potentially breaks for older browsers (which ones exactly I need to look up) unless you’re doing the negotiation on the page and providing different URLs to the CSS file.

I was wondering if you could elaborate on both how you set up cache busting and how you upload media and set the appropriate headers so S3 knows to serve with the appropriate content-encoding header. Do you have automated approaches for either of these? Are they in Python? Are there tips you learned?

Thanks, Rob

On June 23, 2008, Max Battcher said:

Hey Rob, here’s James’ post on S3 uploading that answers some of your questions: http://www.b-list.org/weblog/2008/feb/07/media/

I asked the question about content negotiation in the comments there but I don’t see a response. I don’t think that S3 supports it though and I think that James has decided to go with the trade-off of only supporting certain browsers for media requests.

On June 24, 2008, James Bennett said:

Rob, I just serve it gzipped no matter what. As far as I know the last browser to have problems with gzipped CSS was Netscape 4, and people using it wouldn’t be getting much out of my stylesheet anyway.

On June 24, 2008, mike bayer said:

was this site actually having performance issues serving a single CSS file off the same server as the dynamic content ? or are there a few hundred gigs of streaming video content I’m not aware of ?

