Stack Overflow: How does Facebook achieve good performance?
[+119] [14] Hafizul Amri
[2010-10-08 15:09:41]
[ php mysql performance facebook ]
[ http://stackoverflow.com/questions/3891873/how-does-facebook-achieve-good-performance ] [DELETED]

Almost everyone has a Facebook account, even people who are not familiar with the Internet. With millions of people actively using Facebook, updating their status, replying to messages, uploading photos and so on, how do Facebook's pages still load so quickly?

I was told that Facebook was built using only PHP and MySQL, so how can Facebook's performance be so good?

(38) @Hafizul Why do you keep changing the title to one with wrong grammar? "Why is Facebook has a very good performance?" is grammatically incorrect. - NullUserException
Far more impressive than the examples you give is the number of images they serve: facebook.com/note.php?note_id=76191543919 - 550k per second at peak rate! And that was April '09, so I'm guessing you could double that now. - Richard H
(4) It's interesting how different the answers are. - CiscoIPPhone
(4) How about the thousands of images that are posted to them every second? The short answer is they have tons of smart people writing great code and tons of hardware to handle the load. Their engineers have tons of great articles on some of the specifics of their technology solutions: facebook.com/Engineering - Hardwareguy
@Cisco They all revolve around HipHop, Hadoop and Memcached. I think that pretty much sums it all up. - NullUserException
(1) @NullUserException Oops, sorry, I didn't know you could also edit the question. I think that was my last grammar mistake. - Hafizul Amri
(2) @Hafizul Amri: Yup, the FAQ says "Like Wikipedia, this site is collaboratively edited, and all edits are tracked." Don't worry though — usually when the community chips in with edits they're with good intentions (like correcting grammar) :) - BoltClock
(4) Thousands of millions of active users? Facebook is big but not quite that big!....yet - Macros
Anyway, +1 for perspective (i.e. considering Facebook is a really, really large-scale web app, it is pretty fast by its standards). - BoltClock
@Macros Actually I meant hundreds, but I mixed it up with the word "thousands". - Hafizul Amri
@BoltClock stackoverflow has a lot of interesting features. I like it. - Hafizul Amri
(4) good performance? seriously? - DGM
(1) @DGM: You need to look at the statement with the right perspective, and not as, say, an end user ("why is Facebook so laggy today" etc). OP says "With millions people actively using Facebook" (emphasis mine), so considering the massive usage its response time is pretty darn quick. - BoltClock
Facebook has good performance? Have you ever used Facebook? Really? - Ed S.
Thank you for all the answers! Very useful! - Hafizul Amri
@Ed Swangren So, you think facebook doesn't have good performance ? What gives ? - Sathya
@Ed and @DGM. Really? .. or baiting? - Josh Smeaton
(2) Every day I open Gmail and Facebook at the same time. Facebook is always the first one to come up, while Gmail is still churning away. Yes, Facebook has good performance. - Kyralessa
[+140] [2010-10-08 15:13:39] NullUserException [ACCEPTED]

There's no single reason, but a whole lot of reasons:

  1. Heavy use of caching [1] (APC and memcached [2]), which drastically cuts processing time. Slide 12 compares load time with APC (~130 ms) versus without it (4050 ms). That's about 30x faster!

  2. Use of HipHop [3], which converts PHP into C++ code that is then compiled to native machine code, which runs much more efficiently than interpreted PHP.

  3. Facebook uses PHP and MySQL, but that's not the only thing they use. For example, they use Erlang [4] for their chat, Hadoop clusters [5] for some of their storage. If you go visit their careers page [6], you'll see they are hiring developers with experience in C++, Java, Python, and others.

  4. Facebook has data distributed across many, many servers. In June 2010, FB had 60,000 servers [7]. (Think that's too many? Google had half a million... 5 years ago.)

  5. Facebook sends as little traffic as possible [8]: they use CDNs to deliver static content and gzip to compress data. Cookies, JavaScript, HTML - everything is cut back to reduce the number of bytes sent over the network. They use a technology they call "BigPipe", which sends the page in pieces as they become ready rather than as one whole page.

to mention a few...

[1] http://www.scribd.com/doc/3871729/Facebook-Performance-Caching
[2] http://memcached.org/
[3] http://developers.facebook.com/blog/post/358
[4] http://www.facebook.com/note.php?note_id=14218138919
[5] http://www.facebook.com/note.php?note_id=16121578919
[6] http://www.facebook.com/careers/
[7] http://downloadsquad.switched.com/2010/06/29/facebook-doubles-its-server-count-from-30-000-to-60-000-in-just-6-months/
[8] http://www.facebook.com/note.php?note_id=307069903919
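
To make point 1 concrete, here is a minimal cache-aside sketch in Python. It is not Facebook's code: a plain dictionary stands in for memcached, and `CacheAside`, `slow_profile_lookup`, the 60-second TTL and the 10 ms simulated query are all assumptions made up for illustration. The idea is simply to check the cache first and only hit the slow data source on a miss.

```python
import time


class CacheAside:
    """Check the cache first; fall back to the slow source only on a miss."""

    def __init__(self, load_from_db, ttl_seconds=60.0):
        self._load = load_from_db   # hypothetical slow lookup, e.g. a SQL query
        self._ttl = ttl_seconds     # assumed expiry time, chosen arbitrarily
        self._store = {}            # key -> (expires_at, value); stands in for memcached
        self.hits = 0
        self.misses = 0

    def get(self, key):
        now = time.monotonic()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        # Miss or expired entry: load from the slow source and remember the result.
        self.misses += 1
        value = self._load(key)
        self._store[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        # Call this after a write so that stale data is not served.
        self._store.pop(key, None)


def slow_profile_lookup(user_id):
    # Placeholder for a database query; the sleep simulates query latency.
    time.sleep(0.01)
    return {"id": user_id, "name": f"user-{user_id}"}


cache = CacheAside(slow_profile_lookup)
first = cache.get(42)    # miss: goes to the simulated database
second = cache.get(42)   # hit: served from memory
```

The second call never touches the simulated database, which is where the large speedups come from. The hard part in practice is invalidation: every write path has to call something like `invalidate` or the cache will serve stale data.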

(3) My impression from reading that article on Hadoop is that they're using it for archival storage, statistics gathering, data mining, etc., not so much for real-time serving of live content. - David Gelhar
Related: ustream.tv/recorded/4409735 (a talk about HipHop given by a Facebook developer). - HoLyVieR
(3) Facebook is still the largest single user of MySQL. - Josh Smith
(39) I have yet to encounter a good hiphop performance on Facebook. rapbasement.com/kanye-west/… - littlegreen
Wow I didn't quite expect this response. I'll update this answer with more details when I get a chance. - NullUserException
(1) All that and it's still slow. - Ed S.
NullUserException: I just asked an employee there, and he said that pretty much all their data is stored in MySQL and Hadoop is used for batch processing. - Gabe
(1) I'm skeptical that HipHop would be the #1 reason for their good performance. Most requests on Facebook probably end up being I/O bound, so I doubt HipHop does much to lower the latency. Your #2 is probably more important. Their system architecture and how work gets distributed strikes me as more important than any one component. - Doug
@NullUserException they do seem to use MySQL facebook.com/video/… - Trufa
100th +1. Nice answer. - Salman A
[+32] [2010-10-08 15:18:53] Xeoncross

Ultimate reason: http://memcached.org/

They claim 98% of everything you see on Facebook is from their massive memcache server cluster.
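
If the 98% figure is read as a cache hit rate (as one of the comments below suggests), a rough back-of-the-envelope calculation shows why it matters so much. The sketch below is only arithmetic; the 1 ms cache latency and 50 ms database latency are assumed numbers picked for illustration, not measurements from Facebook.

```python
def expected_latency_ms(hit_rate, cache_ms, backend_ms):
    """Average read latency when a fraction `hit_rate` of reads is served from cache."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cache_ms + (1.0 - hit_rate) * backend_ms


def backend_share(hit_rate):
    """Fraction of reads that still reach the database."""
    return 1.0 - hit_rate


# Assumed latencies: 1 ms for a cache read, 50 ms for a database read.
with_cache = expected_latency_ms(0.98, cache_ms=1.0, backend_ms=50.0)
without_cache = expected_latency_ms(0.0, cache_ms=1.0, backend_ms=50.0)

print(f"average with 98% hits: {with_cache:.2f} ms")
print(f"average with no cache: {without_cache:.2f} ms")
print(f"reads reaching the database: {backend_share(0.98):.0%}")
```

Under these assumptions the average read drops from 50 ms to roughly 2 ms, and only about one read in fifty reaches the database, so the database tier can be far smaller than the raw traffic would suggest.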


(18) Do you have sources? - NullUserException
(3) there is no ultimate reason... - galambalazs
(4) "They claim"... :) - BoltClock
(3) They have 12 clients/users listed on their homepage, none of which is Facebook. - JD Isaacks
(1) Interestingly, according to their post on September's 2.5-hour outage, their cache (memcached?) cluster failed. - Stefan Lasiewski
(5) @John : memcached.org may not mention it, but Facebook says that "Memcached was not originally developed at Facebook, but we have become the largest user of the technology". - Stefan Lasiewski
(1) Actually, they claim they have 98% hit rate. They have no reason to have all of their data [in] memcached. Source: somewhere in archives of the fb dev blog. - analytik
[+19] [2010-10-08 15:54:43] Josh Clemm

Check out http://facebook.com/techtalks.

They have some great videos describing many of their various optimizations. For instance, there's a talk on memcached [1] (which helps speed up common key gets) and their front-end optimizations [2] (doing lazy loading of Javascript, etc).

The amazing part is how large everything is at facebook. With millions of users and thousands of servers, even a seemingly small optimization can end up saving them millions of dollars or gigabytes of memory.

[1] http://www.facebook.com/video/video.php?v=631826881803
[2] http://www.facebook.com/video/video.php?v=631826881803#!/video/video.php?v=596368660334

[+14] [2010-10-08 20:44:56] tnotstar
+1 That article paints a good picture of the inherent system complexities and, importantly (in the context of the question), what the system consists of to facilitate speed. - John K
[+7] [2010-10-08 20:55:51] nos

By

  • Caching
  • Having many servers
  • Having many smart people working on making it fast.

[+6] [2010-10-08 17:03:30] mellowsoon

This article talks about the inner workings of Facebook: http://royal.pingdom.com/2010/06/18/the-software-behind-facebook/


[+5] [2010-10-08 15:17:49] SB.

Facebook was not only using MySQL - it started out using Cassandra [1], and is migrating over to HBase [2]. Applications like FB need a highly scalable database.

[1] http://cassandra.apache.org/
[2] http://hbase.apache.org/

(2) Cassandra was just an experiment they tried that didn't work out. They didn't start out with it, and they barely ever used it. - Gabe
Really? Wikipedia seems to indicate that they had a 200 node cluster deployed and used it for inbox searches. - SB.
SB: Yes, that is exactly correct. They have tens of thousands of MySQL servers running virtually everything except a 200-node cluster used for inbox searches, which to me qualifies as "barely ever used". - Gabe
Yea good point. Didn't realize they went crazy with sharding MySQL. - SB.
@Gabe, or anyone else: Do you know if Facebook originally intended to use Cassandra for more things than "only" [the 200-node cluster used for inbox searches] ? - LeoMaheo
@Leo: Cassandra was just a project somebody played with. - Gabe
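
For readers unfamiliar with the sharding mentioned in the comments above, here is a minimal sketch of hash-based shard routing in Python. It only illustrates the general idea of spreading rows across many MySQL servers; it is not Facebook's actual scheme, and the host names, the shard count of 8 and the `user:` key format are all hypothetical.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index using a hash that is stable across processes."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    # Python's built-in hash() is randomized per process for strings,
    # so a cryptographic digest is used to get a repeatable value.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Hypothetical host names; a real deployment would load these from configuration.
SHARDS = [f"mysql-shard-{i:02d}.example.internal" for i in range(8)]


def host_for_user(user_id: int) -> str:
    """Pick the database host that owns a given user's rows."""
    return SHARDS[shard_for(f"user:{user_id}", len(SHARDS))]


print(host_for_user(1001))
```

The weakness of plain modulo routing is that changing the number of shards remaps most keys, which is why large deployments tend to use a lookup table or consistent hashing instead.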
[+3] [2010-10-08 18:00:37] Amirouche Douda

Watch this presentation [1] by Aditya Agarwal, Director of Engineering at Facebook. It covers Facebook's architecture and its major components (LAMP (PHP, MySQL), Memcache, Thrift, Scribe).

[1] http://www.infoq.com/presentations/Facebook-Software-Stack

[+2] [2010-10-08 20:30:39] jerome

It's worth looking into how Facebook selectively loads front end JS so that there is very little latency in the responsiveness of the UI.


[+1] [2010-10-08 15:14:56] Joachim VR

They have a compiled version of PHP, in fact. My guess would have to be: insane amounts of crazy hardware, brutally efficient code, and a database structure optimized with caching, denormalization, clustering...


(6) I don't think they have insane amounts of crazy hardware. Insane amounts of hardware, yes. But like Google, they use relatively cheap computers and leverage the power of cloud computing. - NullUserException
(3) @NullUserException Crazy hardware probably wouldn't be a good idea anyway :) - Richard H
[+1] [2010-10-08 17:23:54] Andrew Vit

Maybe a small part of the solution overall, but they also optimize for fast client-side rendering. They contracted CSS expert Nicole Sullivan [1] to do some optimization based on her OOCSS techniques [2].

[1] http://www.stubbornella.org/
[2] http://www.stubbornella.org/content/2010/07/01/top-5-mistakes-of-massive-css/

[0] [2010-10-08 16:48:49] bobdiaes

Because they have a lot of money. Hiring smart developers and buying tons of servers every week is quite costly.


[0] [2010-10-08 20:23:26] Phill Pafford

XHP is a PHP extension which augments the syntax of the language such that XML document fragments become valid PHP expressions. - GIT checkout [1] and Wiki [2]

[1] http://github.com/facebook/xhp/
[2] http://github.com/facebook/xhp/wiki

(1) How does this affect site serving performance? - NikiC
Here is an overview of its performance: toys.lerdorf.com/archives/54-A-quick-look-at-XHP.html - Phill Pafford
[0] [2010-10-09 03:53:16] Stefan Lasiewski

There is a lot of information about their technology in the Facebook Group Facebook Engineering [1], including discussions of their MySQL cluster, XHP, memcached, etc.

It comes down to having enough money to hire smart staff who write efficient programs running on many many computers.

[1] http://www.facebook.com/Engineering#!/Engineering?v=info
