Stack Overflow: Does Django scale?
[+435] [25] Roee Adler
[2009-05-20 05:07:55]
[ django web-applications scalability ]
[ http://stackoverflow.com/questions/886221/does-django-scale ]

I'm building a web application with Django. The reasons I chose Django were:

Now that I'm getting closer to publishing my work, I'm starting to be concerned about scale. The only information I found about the scaling capabilities of Django is provided by the Django team (I'm not saying anything to disregard them, but this is clearly not objective information...).

My questions:

(3) Might want to fix "speed was the main factor" to clarify if you're talking about execution speed or development effort. It sounds like development effort, which makes sense. - S.Lott
(1) Would be interesting to compare this with RoR. - Kozyarchuk
Disqus uses Pylons. - ajkumar25
Maybe OT, but you can use pypy to speed up django. - pevik
[+263] [2009-05-20 07:41:16] Van Gale [ACCEPTED]
  1. "What are the largest sites built on Django today?"

    There isn't any single place that collects information about traffic on Django built sites, so I'll have to take a stab at it using data from various locations. First, we have a list of Django sites on the front page of the main Django project page [1] and then a list of Django built sites at djangosites.org [2]. Going through the lists and picking some that I know have decent traffic we see:

  2. "Can Django deal with 100,000 users daily, each visiting the site for a couple of hours?"

    Yes, see above.

  3. "Could a site like Stack Overflow run on Django?"

    My gut feeling is yes but, as others answered and Mike Malone mentions in his presentation, database design is critical. Strong proof might also be found at www.cnprog.com if we can find any reliable traffic stats. Anyway, it's not just something that will happen by throwing together a bunch of Django models :)

There are, of course, many more sites and bloggers of interest, but I have got to stop somewhere!

Dec 2009 UPDATE:

Blog post about using Django to build the high-traffic site michaelmoore.com [16], described as a top 10,000 website [17]. See also the Quantcast stats [18] and compete.com stats [19].

[1] http://www.djangoproject.com/
[2] http://www.djangosites.org/
[3] http://disqus.com
[4] http://pycon.blip.tv/file/4880330/
[5] http://curse.com/
[6] http://www.quantcast.com/curse.com
[7] http://cramer.io/2007/05/23/rapid-development-serving-500000-pageshour/
[8] http://tabblo.com/
[9] http://www.quantcast.com/tabblo.com
[10] http://nedbatchelder.com/blog/200902/infrastructure_for_modern_web_sites.html
[11] http://chesspark.com/
[12] http://www.alexa.com/siteinfo/chesspark.com
[13] http://pownce.com/
[14] http://www.alexa.com/siteinfo/pownce.com
[15] http://www.slideshare.net/road76/scaling-django
[16] http://web.archive.org/web/20130307032621/http://concentricsky.com/blog/2009/oct/michaelmoorecom
[17] http://www.alexa.com/siteinfo/http%3A%2F%2Fmichaelmoore.com
[18] http://www.quantcast.com/michaelmoore.com
[19] http://siteanalytics.compete.com/michaelmoore.com/

(31) Curse moved to ASPNET for reasons I don't know. - Oli
Gah, thanks Oli, I hadn't seen any news about that :( - Van Gale
(10) I seem to remember curse going through an acquisition about 6 months ago - that might have influenced their decision to change technologies. I wouldn't put it down to any inadequacy in django. - Bayard Randel
@Oli a quick google search didn't turn up anything about Curse moving to ASP.NET (all I can seem to find about their tech is related to Django). Do you have a link? - TM.
(1) Response headers seem to indicate ASP.NET:
    $ curl -I curse.com
    HTTP/1.1 301 Moved Permanently
    Content-Type: text/html; charset=UTF-8
    Location: curse.com
    Server: Microsoft-IIS/7.0
    X-Powered-By: ASP.NET
    Date: Thu, 03 Dec 2009 22:16:05 GMT
    Content-Length: 144
    Connection: close
    - David J. Liszewski
Also, Mahalo.com - Zack
many pbs.org sites use Django - Evgeny
(1) why curse.com moved to asp.net???? - vernomcrp
(1) vernomcrp: I spoke with a curse.com Django developer some months ago, and the rewrite of the website in ASP.NET was a decision of the leaders (Why? I don't know.) - Kedare
Hunch.com is built on Django and has significant traffic - 828
(27) Bitbucket runs on django - miki725
(22) Instagram also uses Django. Another site would be BringIt.com - CppLearner
(12) This answer sorely needs an update to mention Disqus and Instagram. These two are huge, well-known, highly-scaled apps. - Dustin Rasener
Answering the question “Could a site like Stack Overflow run on Django?”: As far as I know, Russian clone of StackOverflow, HashCode, runs on Django. - kirelagin
(2) Could update this answer for 2013 :) there's some big ones now! - jdero
Also YouTube, with a few calls to C++ modules - Tom
(1) Pixabay.com - a huge image sharing website with lots of traffic - runs on Django, too. Alexa 2.800 / about 10 GB traffic per hour with 400.000 images served per day. And lots of room for scaling up :-) - Simon Steinberger
(1) The link to Mike Malone's "Scaling Django WebApps" presentation is broken, this works, I'm not sure if it is the same one slideshare.net/road76/scaling-django - Dan R
(1) Note: the high-profile companies mentioned use a variety of technology layered on top of Django to achieve scalability. Few languages can achieve high scalability without a helping hand from distributed architecture or heavy caching. Django is a "sync" framework, which can create more bottlenecks than an "async" one (see Node.js). Varnish is every web server's friend. - Tim Selaty Jr.
Pixabay - an image sharing platform - is built on Django: currently Alexa 2000 / 125.000 visitors per day. - Simon Steinberger
1
[+141] [2009-05-20 11:16:35] S.Lott

We're doing load testing now. We think we can support 240 concurrent requests (a sustained rate of 120 hits per second 24x7) without any significant degradation in the server performance. That would be 432,000 hits per hour. Response times aren't small (our transactions are large) but there's no degradation from our baseline performance as the load increases.

We're using Apache front-ending Django and MySQL. The OS is Red Hat Enterprise Linux (RHEL). 64-bit. We use mod_wsgi in daemon mode for Django. We've done no cache or database optimization other than to accept the defaults.

We're all in one VM on a 64-bit Dell with (I think) 32 GB RAM.

Since performance is almost the same for 20 or 200 concurrent users, we don't need to spend huge amounts of time "tweaking". Instead we simply need to keep our base performance up through ordinary SSL performance improvements, ordinary database design and implementation (indexing, etc.), ordinary firewall performance improvements, etc.

What we do measure is our load test laptops struggling under the insane workload of 15 processes running 16 concurrent threads of requests.
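
The figures in this answer can be cross-checked with simple arithmetic. The sketch below is not part of the original load test. It assumes the system is in a steady state, so that Little's law (requests in flight = throughput × average response time) applies, and the function names are made up for illustration.

```python
def hits_per_hour(hits_per_second):
    """Convert a sustained per-second rate into an hourly total."""
    return hits_per_second * 3600


def implied_response_seconds(concurrent_requests, hits_per_second):
    """Little's law rearranged: average time in system = L / lambda.

    Assumes steady state; this is an inference, not a measured value.
    """
    return concurrent_requests / hits_per_second


hourly = hits_per_hour(120)                        # sustained 120 hits/s
avg_response = implied_response_seconds(240, 120)  # 240 requests in flight
client_threads = 15 * 16                           # load-test processes x threads

print(hourly, avg_response, client_threads)
```

Under that assumption, 240 requests in flight at 120 hits per second implies an average response time of about two seconds, which fits the remark that response times aren't small. The 15 processes of 16 threads each also add up to exactly the 240 concurrent requests quoted above.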


(1) just curious, is this a single server setup? what are the specs on the server(s)? - monkut
It's one VM on a 64-bit Dell with 32 GB RAM (AFAIK). - S.Lott
Also curious: is your DB running on the same machine, or a separate server? - Jarret Hardie
(3) One VM with Apache, Django and MySQL. mod_wsgi. RHEL. - S.Lott
2
[+106] [2010-08-01 12:03:20] Neil

Not sure about the number of daily visits but here are a few examples of large Django sites:

Screencast on how to deploy django with scaling in mind http://ontwik.com/python/django-deployment-workshop-by-jacob-kaplan-moss/

Here is a link to a list of high-traffic Django sites on Quora [19].

[1] http://disqus.com/
[2] http://djangocon.blip.tv/file/4135225/
[3] http://bitbucket.org
[4] http://code.djangoproject.com/wiki/DjangoSuccessStoryBitbucket
[5] http://lanyrd.com/
[6] http://lanyrd.com/colophon/
[7] http://support.mozilla.com/
[8] https://github.com/mozilla/kitsune
[9] https://addons.mozilla.org/
[10] https://github.com/mozilla/zamboni
[11] http://python.mirocommunity.org/video/1866/djangocon-2010-switching-addon
[12] http://www.theonion.com/
[13] http://www.reddit.com/r/django/comments/bhvhz/the_onion_uses_django_and_why_it_matters_to_us
[14] http://www.guardian.co.uk/
[15] http://www.guardian.co.uk/info/developer-blog/2011/feb/08/computing-apps
[16] http://instagr.am/
[17] https://pinterest.com
[18] http://www.rdio.com/
[19] http://www.quora.com/Django/What-is-the-highest-traffic-website-built-on-top-of-Django

3
[+62] [2009-11-16 04:00:13] jacobian

What's the "largest" site that's built on Django today? (I measure size mostly by user traffic)

In the US, Mahalo [1]. I'm told they handle roughly 10 million uniques a month.

Abroad, the Globo [2] network (a network of news, sports, and entertainment sites in Brazil); Alexa ranks them in the top 100 globally (around 80th currently).

Other notable Django users include PBS, National Geographic, Discovery, NASA (actually a number of different divisions within NASA), and the Library of Congress.

Can Django deal with 100k users daily, each visiting the site for a couple of hours?

Yes -- but only if you've written your application right, and if you've got enough hardware. Django's not a magic bullet.

Could a site like StackOverflow run on Django?

Yes (but see above).

Technology-wise, easily: see soclone [3] for one attempt. Traffic-wise, compete pegs StackOverflow at under 1 million uniques per month. I can name at least a dozen Django sites with more traffic than SO.

[1] http://mahalo.com/
[2] http://globo.com/
[3] http://code.google.com/p/soclone/

4
[+53] [2009-05-20 06:33:46] Paolo Bergantino

Playing devil's advocate a little bit:

You should check the DjangoCon 2008 Keynote [1], delivered by Cal Henderson [2], titled "Why I hate Django" where he pretty much goes over everything Django is missing that you might want to do in a high traffic website. At the end of the day you have to take this all with an open mind because it is perfectly possible to write Django apps that scale, but I thought it was a good presentation and relevant to your question.

[1] http://www.youtube.com/watch?v=i6Fr65PFqfk
[2] http://en.wikipedia.org/wiki/Cal%5FHenderson

(3) Also, Flickr wasn't built in a day. - Deniz Dogan
(14) It appears that several of the issues Cal harped on are now standard features: docs.djangoproject.com/en/dev/topics/db/multi-db - Dolph
5
[+25] [2009-05-20 05:22:37] Bayard Randel

The largest django site I know of is the Washington Post [1], which would certainly indicate that it can scale well.

Good design decisions probably have a bigger performance impact than anything else. Twitter is often cited as a site which embodies the performance issues with another dynamic interpreted language based web framework, Ruby on Rails - yet Twitter engineers have stated that the framework isn't as much an issue as some of the database design choices they made early on.

Django works very nicely with memcached and provides some classes for managing the cache, which is where you would resolve the majority of your performance issues. In reality, what you deliver on the wire is almost more important than your backend; using a tool like YSlow is critical for a high-performance web application. You can always throw more hardware at your backend, but you can't change your users' bandwidth.
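
As a rough illustration of the caching point, here is a minimal sketch of the cache-aside pattern. It is not Django's own cache code: `TinyCache`, `load_profile_from_db`, and `get_profile` are hypothetical names, and `TinyCache` is an in-process stand-in so the example runs without a memcached server. Its `get`/`set` methods only mimic the general shape of a cache client.

```python
import time


class TinyCache:
    """Hypothetical in-process stand-in for a memcached-style client."""

    def __init__(self):
        self._store = {}

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._store[key]
            return default
        return value

    def set(self, key, value, timeout=300):
        self._store[key] = (value, time.monotonic() + timeout)


cache = TinyCache()
db_calls = []  # records every simulated database hit


def load_profile_from_db(user_id):
    """Pretend to run an expensive query (hypothetical helper)."""
    db_calls.append(user_id)
    return {"id": user_id, "name": f"user{user_id}"}


def get_profile(user_id):
    """Cache-aside: try the cache first, fall back to the database."""
    key = f"profile:{user_id}"
    profile = cache.get(key)
    if profile is None:
        profile = load_profile_from_db(user_id)
        cache.set(key, profile, timeout=60)
    return profile


first = get_profile(42)   # miss: goes to the "database"
second = get_profile(42)  # hit: served from the cache
print(len(db_calls))
```

The second call never reaches the simulated database, which is the whole benefit. The usual trade-off is staleness: a cached value can be up to `timeout` seconds out of date unless it is invalidated on write.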

[1] http://www.washingtonpost.com/

(1) Isn't only part of washingtonpost.com run on Django? The Django frontpage seems to indicate it's only projects.washingtonpost.com/congress - Xiong Chiamiov
(2) You are perhaps confusing the Washington Post with the Washington Times. I believe the Times is all on Django, but it is a much smaller paper. - Eli
6
[+22] [2009-05-20 06:27:28] Daniel Roseman

I was at the EuroDjangoCon conference the other week, and this was the subject of a couple of talks - including from the founders of what was the largest Django-based site, Pownce (slides from one talk here [1]). The main message is that it's not Django you have to worry about, but things like proper caching, load balancing, database optimisation, etc.

Django actually has hooks for most of those things - caching, in particular, is made very easy.

[1] http://immike.net/files/scaling%5Fdjango.pdf

7
[+17] [2009-05-21 01:36:22] razenha

Scaling web apps is not about web frameworks or languages; it's about your architecture. It's about how you handle your browser cache, your database cache, how you use non-standard persistence providers (like CouchDB [1]), how well tuned your database is, and a lot of other stuff...

Don't bother...

[1] http://couchdb.apache.org/

A downvote? Why? - razenha
(2) +1 I didn't downvote, but maybe it's because of the don't bother? - Roee Adler
Web frameworks do matter! Look how fast Tornado is compared to other Python web frameworks: tornadoweb.org/documentation#performance - jpartogi
I didn't downvote, but I suppose you went a little off topic, as they were discussing the merits of Django, and you cannot use every database, tune your database, and use CouchDB to its fullest in every given framework. Unless, of course, you rewrite big chunks of it. - ZJR
(3) @ZJR I don't believe I went off-topic. He asked if Django can scale; I said yes, because almost all modern web frameworks, regardless of the language, can scale if you use the right architectural approach - razenha
Yeah, blocking vs nonblocking IO does matter, as per the Tornado example. Although having said that, Tornado's not a web framework, but your application will need to be written in a way that takes advantage of nonblocking IO. - Robert Grant
8
[+15] [2009-05-20 05:23:35] jess

I'm sure you're looking for a more solid answer, but the most obvious objective validation I can think of is that Google pushes Django for use with its App Engine [1] framework. If anybody knows about and deals with scalability on a regular basis, it's Google. From what I've read, the most limiting factor seems to be the database back-end, which is why Google uses their own...

[1] http://en.wikipedia.org/wiki/Google_App_Engine

Promoting Django/Python may be more related to Google's policy of promoting Python as its choice of 'Other' language after C++? - GuruM
9
[+11] [2012-01-10 21:29:53] Milind

I think we might as well add Apple's App of the Year for 2011, Instagram [1], to the list; it uses Django intensively.

[1] http://instagram-engineering.tumblr.com/post/13649370142/what-powers-instagram-hundreds-of-instances-dozens-of

10
[+9] [2009-05-25 08:36:10] GvS

Could a site like Stack Overflow run on Django?

Chinese version of Stack Overflow is using Django:

http://stackoverflow.com/questions/694966/impressed-or-angry-at-http-www-cnprog-com


11
[+6] [2009-05-20 05:24:22] coulix

Yes it can. It could be Django with Python or Ruby on Rails. It will still scale.

There are a few different techniques. First, caching is not scaling. You could have several application servers balanced with nginx as the front end, in addition to hardware balancer(s). To scale on the database side, you can go pretty far with read slaves in MySQL / PostgreSQL if you go the RDBMS way.
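
For the read-slave idea, Django versions with multiple-database support let you route queries through a database router. The sketch below follows the router method names Django looks for (`db_for_read`, `db_for_write`, `allow_relation`, `allow_migrate`), but the aliases `primary`, `replica1`, and `replica2` are assumptions: they would have to match entries in your own `DATABASES` setting, and the class would have to be listed in `DATABASE_ROUTERS`.

```python
import random


class PrimaryReplicaRouter:
    """Send reads to a replica and writes to the primary.

    The database aliases below are hypothetical and must match
    whatever is configured in the project's DATABASES setting.
    """

    replicas = ["replica1", "replica2"]

    def db_for_read(self, model, **hints):
        # Spread read queries across the replicas.
        return random.choice(self.replicas)

    def db_for_write(self, model, **hints):
        # All writes go to the single primary.
        return "primary"

    def allow_relation(self, obj1, obj2, **hints):
        # Every alias holds the same data, so relations are fine.
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Schema changes run only on the primary; replication copies them.
        return db == "primary"


router = PrimaryReplicaRouter()
print(router.db_for_write(None), router.db_for_read(None))
```

One design caveat: replicas lag behind the primary, so a read issued right after a write may not see it. Code that needs read-your-own-writes behaviour usually pins those reads to the primary.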

Some good examples of heavy traffic websites in Django could be:

  • Pownce [1] when they were still there.
  • Disqus (generic shared comments manager)
  • All the newspaper related websites: Washington Post and others.

You can feel safe.

[1] http://en.wikipedia.org/wiki/Pownce

(1) Just sayin... dead social networks make a bad scalability example :) - ZJR
(2) I don't think Pownce's death was related to a scalability issue. - Kedare
12
[+6] [2009-05-20 05:30:36] monkut

If you haven't already, I recommend reading the section on scaling in The Django Book:

http://www.djangobook.com/en/1.0/chapter20/

Or the newer version:

http://www.djangobook.com/en/2.0/chapter12/


I read it, thanks, but as I mentioned, I was looking for information not coming from Django or the Django-book. - Roee Adler
13
[+5] [2009-05-20 05:35:26] Beep beep boop boop

Note that if you're expecting 100K users per day, that are active for hours at a time (meaning max of 20K+ concurrent users), you're going to need A LOT of servers. SO has ~15,000 registered users, and most of them are probably not active daily. While the bulk of traffic comes from unregistered users, I'm guessing that very few of them stay on the site more than a couple minutes (i.e. they follow google search results then leave).

For that volume, expect at least 30 servers ... which is still a rather heavy 1,000 concurrent users per server.
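
One way to arrive at a figure like 20K+ concurrent users is sketched below. The answer does not say how it was derived, so the inputs are assumptions: a 2-hour session per user and traffic concentrated in a 10-hour daily window. The function name is hypothetical.

```python
def estimated_concurrency(daily_users, session_hours, active_window_hours):
    """Average number of simultaneous users during the active window.

    Assumes sessions are spread evenly over the window, which real
    traffic rarely is; treat the result as a rough lower bound on peak.
    """
    return daily_users * session_hours / active_window_hours


spread_over_day = estimated_concurrency(100_000, 2, 24)    # evenly over 24 h
busy_window = estimated_concurrency(100_000, 2, 10)        # assumed 10 h window
capacity_of_30_servers = 30 * 1_000                        # 1,000 users each

print(round(spread_over_day), busy_window, capacity_of_30_servers)
```

With these assumptions the busy-window estimate is 20,000 concurrent users, and 30 servers at 1,000 users each would cover 30,000, leaving some headroom for peaks above the average.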


(2) It appears from the podcast that SO uses just 3 servers. But SO is built using C#, not Python, so it rips. - S.Lott
(1) Obviously the question will be: how powerful are those servers? - mamcx
14
[+5] [2009-05-20 11:48:27] Glader

Another example is rasp.yandex.ru, a Russian transport timetable service. Its traffic meets your requirements.


15
[+5] [2009-09-29 01:27:05] Koliber Services

I have been using Django for over a year now, and am very impressed with how it manages to combine modularity, scalability and speed of development. Like with any technology, it comes with a learning curve. However, this learning curve is made a lot less steep by the excellent documentation from the Django community. Django has been able to handle everything I have thrown at it really well. It looks like it will be able to scale well into the future.

BidRodeo Penny Auctions [1] is a moderately sized Django powered website. It is a very dynamic website and does handle a good number of page views a day.

[1] http://www.bidrodeo.com/

16
[+5] [2009-11-16 18:51:51] mazelife

Here's a list of some relatively high-profile things built in Django:

  1. The Guardian's "Investigate your MP's expenses" [1] app

  2. Politifact.com (here's a blog post [2] talking about the (positive) experience). The site won a Pulitzer.

  3. NY Times' Represent [3] app

  4. EveryBlock [4]

  5. Peter Harkins, one of the programmers over at WaPo, lists all the stuff they’ve built with Django [5] on his blog

  6. It's a little old, but someone from the LA Times gave a basic overview [6] of why they went with Django.

  7. The Onion's AV Club was recently moved from (I think) Drupal to Django.

I imagine a number of these sites probably get well over 100k hits per day. Django can certainly do 100k hits/day and more. But YMMV in getting your particular site there depending on what you're building.

There are caching options at the Django level (for example caching querysets and views in memcached [7] can work wonders) and beyond (upstream caches like Squid [8]). Database server specifications will also be a factor (and usually the place to splurge), as is how well you've tuned it. Don't assume, for example, that Django's going to set up indexes properly. Don't assume that the default PostgreSQL [9] or MySQL [10] configuration is the right one.
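
To make the indexing point concrete, the sketch below checks a query plan before and after adding an index. It uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL or MySQL, so the plan text will differ on those systems; the `posts` table, its columns, and the index name are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT)"
)
conn.executemany(
    "INSERT INTO posts (author_id, title) VALUES (?, ?)",
    [(i % 100, f"post {i}") for i in range(1000)],
)

QUERY = "SELECT * FROM posts WHERE author_id = ?"


def query_plan(connection, sql, params):
    """Return SQLite's human-readable plan for a query as one string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The plan description is the last column of each row.
    return " | ".join(row[-1] for row in rows)


plan_before = query_plan(conn, QUERY, (7,))  # no index: full table scan
conn.execute("CREATE INDEX idx_posts_author ON posts (author_id)")
plan_after = query_plan(conn, QUERY, (7,))   # planner can now use the index

print(plan_before)
print(plan_after)
```

In a Django project the equivalent check is to look at the SQL your views generate and run it through your database's own `EXPLAIN`. Django creates indexes for primary keys and foreign keys, but other columns you filter on need one declared explicitly, for example with `db_index=True` on the field.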

Furthermore, you always have the option of having multiple application servers running Django if that is the slow point, with a software or hardware load balancer in front.

Finally, are you serving static content on the same server as Django? Are you using Apache or something like nginx [11] or lighttpd [12]? Can you afford to use a CDN [13] for static content? These are things to think about, but it's all very speculative. 100k hits/day isn't the only variable: how much do you want to spend? How much expertise do you have managing all these components? How much time do you have to pull it all together?

[1] http://mps-expenses.guardian.co.uk/
[2] http://www.mattwaite.com/posts/2007/aug/22/announcing-politifact/
[3] http://prototype.nytimes.com/represent/
[4] http://www.everyblock.com/
[5] http://push.cx/2009/washington-post-update
[6] http://www.poynter.org/column.asp?id=52=150818
[7] http://en.wikipedia.org/wiki/Memcached
[8] http://en.wikipedia.org/wiki/Squid_%28software%29
[9] http://en.wikipedia.org/wiki/PostgreSQL
[10] http://en.wikipedia.org/wiki/MySQL
[11] http://en.wikipedia.org/wiki/Nginx
[12] http://en.wikipedia.org/wiki/Lighttpd
[13] http://en.wikipedia.org/wiki/Content_delivery_network

17
[+4] [2009-08-22 08:45:45] Anders Rune Jensen

If you have a site with some static content, then putting a Varnish [1] server in front will dramatically increase your performance. Even a single box can then easily spit out 100 Mbit/s of traffic.

Note that with dynamic content, using something like Varnish becomes a lot more tricky.

[1] http://en.wikipedia.org/wiki/Varnish_%28software%29

(1) Problem here is varnish will dramatically increase the performance of everything. And faster frameworks will still be faster. - ZJR
18
[+4] [2010-01-26 02:00:18] orokusaki

If it means anything, Django is run on Python (platitude intended)

YouTube is built on Python.

YouTube has about 500 million hits per month and about 90 million users per month.


(1) But youtube is not built with django. Python might be fast, but not so for django. - jpartogi
(3) Yea, but the point was that as Django grows, it's sitting on a good foundation for speed re factoring and with Google out there working on projects like Unladen Swallow, it'll just get better. - orokusaki
19
[+3] [2009-05-20 05:22:30] tomeedee

My experience with Django is minimal but I do remember in The Django Book they have a chapter where they interview people running some of the larger Django applications. Here is a link. [1] I guess it could provide some insights.

It says curse.com is one of the largest Django applications with around 60-90 million page views in a month.

[1] http://www.djangobook.com/en/1.0/appendixA/

(1) curse.com urls now end in .aspx... (dunno if they craft them) - ZJR
20
[+3] [2009-05-27 14:00:18] Ed Menendez

You can definitely run a high-traffic site in Django. Check out this pre-Django 1.0 but still relevant post here: http://menendez.com/blog/launching-high-performance-django-site/


21
[+3] [2010-01-08 10:15:00] siddu

Check out this micro news aggregator called EveryBlock [1].

It's entirely written in Django. In fact they are the people who developed the Django framework itself.

[1] http://everyblock.com

22
[+2] [2010-11-25 01:20:04] Ashwin

Spreading tasks evenly (in short, optimizing every aspect, including databases, files, images, CSS, etc.) and balancing the load across several other resources becomes necessary once your site/application starts growing. Or you make some more room for it to grow. Technologies like CDNs and cloud hosting are a must for huge sites. Just developing and tweaking an application won't give you one hundred percent satisfaction; other components also play an important role.


23
[0] [2014-09-08 15:25:31] redsnapper

I develop high-traffic sites using Django for the national broadcaster in Ireland. It works well for us. Developing a high-performance site is about more than just choosing a framework. A framework will only be one part of a system that is as strong as its weakest link. Using the latest framework 'X' won't solve your performance issues if the problem is slow database queries or a badly configured server or network.


24
[0] [2014-11-09 12:15:29] gmourier

The question is not whether Django can scale.

The right approach is to understand which network design patterns and tools to put under your Django/Symfony/Rails project so that it scales well.

Some ideas:

  • Multiplexing.
  • Reverse proxy. Ex: Nginx, Varnish
  • Memcached-style (in-memory) session storage. Ex: Redis
  • Clustering your project and database for load balancing and fault tolerance. Ex: Docker
  • Use a third party to store assets. Ex: Amazon S3

Hope it helps a bit. This is my tiny rock on the mountain.


25