Google – A study in Scalability and A little systems horse sense


Google’s Jeff Dean gave an excellent talk at Stanford as part of EE380 – it is well worth one’s time to listen: very informative, instructive and innovative. As I listened, I jotted down a few quick notes.

  • Interesting comparison of the scale in search from 1999 to 2010
    • Docs and queries are up 1000X, while the query latency has decreased 5X
    • Interesting to hear that in 1999 it used to take a month or two to update the web page store; now that latency has been reduced ~50,000X, down to seconds!
  • They have had 7 significant revisions in 11 years
  • Trivia: They encounter very expensive queries – for example, “circle of death” requires ~30 GB of I/O
  • Trivia: In 2004, they did a rethink and rebuilt the systems infrastructure from scratch
  • He discussed encodings a little – an informative discussion of byte-aligned variable-length & group encoding schemes << I have to try it out … (a toy sketch of the group encoding idea appears after these notes)
  • Trivia: They have had long-distance links taken down by wild dogs, sharks, dead horses and (in Oregon) drunken hunters!
  • Jeff talked at length about MapReduce, with an interesting set of statistics on MapReduce over time
    • MapReduce at Google is now at ~4 million jobs, processing ~1,000 PB with 130 PB of intermediate data and 45 PB of output
    • Data has doubled while the number of machines has stayed constant from ’07 to ’10.
    • Machine usage has quadrupled while job completions have doubled from ’07 to ’10
    • Trivia: Jeff shared an anecdote where the network engineers were rewiring the network while Jeff & Co were running MapReduce. They lost machines in a strange pattern and wondered what was going on; but the job succeeded, a little slower than normal, and of course the machines came back up! Only after the fact did they hear about the network rewiring!
  • He is working on a project called Spanner, which spans data centers. Looks like this is one of their hairy problems; Dean also mentioned this during the Q & A. All their systems work well inside a datacenter, but have no way of spanning datacenters. They have manual methods & task-specific tools to copy data across datacenters, monitor tasks across datacenters and so forth – but no systematic infrastructure.
    • Declarative policies, common namespaces, transactions, strong and weak consistency, and automation are all parts that Spanner addresses
  • I think the most important part of the talk was the final section on experiences & patterns
    • Break large systems into smaller services. << We have heard this from Amazon as well. The Google page touches 200+ services. (Same with the Amazon page)
    • One should be able to estimate performance based on back-of-the-envelope calculations (see the quick worked example after these notes)
    • I am compelled to insert Jeff’s “Numbers Everyone Should Know” as it is a very useful chart. I hope Jeff doesn’t mind. [Update 11/17/10] Thanks to Kevin Le, this chart comes from Norvig’s blog.
    • Identify common problems & build general systems to address them. Very important not to be all things to all people.
      • Paraphrasing Jeff, “That final feature will drive you over the edge of complexity”!
    • Don’t design to scale infinitely – consider 5X – 50X growth. But > 100X requires redesign << very insightful
    • He likes the centralized master design & so does not suggest a fully distributed system. Of course, the master should not be in the data plane but can be a control plane artifact
    • He also likes multiple small units of work per machine rather than one mongo job running on a big machine. Smaller units of work are easier to recover, load balance and scale. << agreed!
  • He concludes the talk by saying there is lots of interesting “Big Data” available.
    • I have seen this emergence of Data Scientists from multiple sources, here and here. << I agree, my main focus as well …
  • A couple of insights from the Q & A session
    • They run chained sequences of MapReduces, each implementing part of a larger algorithm as a sequence of steps, rather than a Map-Reduce-Reduce-… pattern (a toy sketch of this chaining appears after these notes)
    • Predictive search is more of a resource issue than a fundamental change to the underlying infrastructure
    • He wishes they had incorporated distributed transactions as part of their infrastructure, for example in GFS et al. Many internal apps have rolled their own. BTW, Spanner has distributed transactions
  • Overall an excellent talk, as always … and a big thanks to Jeff Dean …
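
As a follow-up to the encodings item above, here is a minimal, illustrative sketch of the group-varint idea: pack integers in groups of four, with a single prefix byte holding four 2-bit length codes. This is my toy reconstruction of the concept, not Google’s implementation, and the function names are my own.

```python
# Toy group varint: four unsigned ints per group, one tag byte whose four
# 2-bit fields encode how many bytes (1..4) each value occupies.

def encode_group(values):
    """Encode exactly four unsigned ints (< 2**32) into bytes."""
    assert len(values) == 4
    tag = 0
    body = bytearray()
    for i, v in enumerate(values):
        nbytes = max(1, (v.bit_length() + 7) // 8)   # 1..4 bytes
        tag |= (nbytes - 1) << (2 * i)               # store length code in tag
        body += v.to_bytes(nbytes, "little")
    return bytes([tag]) + bytes(body)

def decode_group(buf):
    """Decode one group from the front of buf; returns (values, bytes_consumed)."""
    tag = buf[0]
    pos = 1
    values = []
    for i in range(4):
        nbytes = ((tag >> (2 * i)) & 0x3) + 1
        values.append(int.from_bytes(buf[pos:pos + nbytes], "little"))
        pos += nbytes
    return values, pos

if __name__ == "__main__":
    encoded = encode_group([1, 300, 70000, 4_000_000_000])
    decoded, used = decode_group(encoded)
    print(len(encoded), "bytes ->", decoded)   # 11 bytes -> [1, 300, 70000, 4000000000]
```

The win over plain one-byte-at-a-time varints is that the decoder reads one tag byte and then does four fixed-width loads, with far fewer branches per value.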
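And here is the kind of back-of-the-envelope estimate that the “Numbers Everyone Should Know” chart enables. The latency values below are approximate figures from the commonly circulated version of the chart (different versions vary slightly), and the 30-thumbnail scenario is a simplified illustration in the spirit of the example in the slides, not Jeff’s exact numbers.

```python
# Back-of-the-envelope sketch using approximate "Numbers Everyone Should Know"
# latencies (order-of-magnitude only; versions of the chart differ slightly).

NS = 1e-9
LATENCY = {
    "l1_cache_ref":           0.5 * NS,
    "branch_mispredict":      5 * NS,
    "l2_cache_ref":           7 * NS,
    "mutex_lock_unlock":      25 * NS,
    "main_memory_ref":        100 * NS,
    "compress_1k_zippy":      3_000 * NS,
    "send_2k_over_1gbps":     20_000 * NS,
    "read_1mb_from_memory":   250_000 * NS,
    "round_trip_in_dc":       500_000 * NS,
    "disk_seek":              10_000_000 * NS,
    "read_1mb_from_network":  10_000_000 * NS,
    "read_1mb_from_disk":     20_000_000 * NS,
    "packet_ca_nl_ca":        150_000_000 * NS,
}

# Estimate: render a results page of 30 image thumbnails, each a 256 KB file.
n_images, size_mb = 30, 256 / 1024

# Design A: read the images from disk one after another.
serial = n_images * (LATENCY["disk_seek"] + size_mb * LATENCY["read_1mb_from_disk"])

# Design B: issue all 30 reads in parallel across disks.
parallel = LATENCY["disk_seek"] + size_mb * LATENCY["read_1mb_from_disk"]

print(f"serial:   ~{serial * 1e3:.0f} ms")    # ~450 ms
print(f"parallel: ~{parallel * 1e3:.0f} ms")  # ~15 ms
```

Even this crude arithmetic is enough to choose between the two designs before writing any real code, which is exactly the point of the chart.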
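Finally, a toy in-memory sketch of the chained-MapReduce pattern mentioned in the Q & A: each stage is a plain map pass followed by a reduce pass, and one stage’s output becomes the next stage’s input. The helper names are illustrative only – this is not a real MapReduce API.

```python
# Chained MapReduce, in-memory toy: stage 1 counts words, stage 2 groups
# words by their count, consuming stage 1's output.

from collections import defaultdict

def run_mapreduce(records, map_fn, reduce_fn):
    # Map phase: emit (key, value) pairs and group by key.
    groups = defaultdict(list)
    for rec in records:
        for key, value in map_fn(rec):
            groups[key].append(value)
    # Reduce phase: one reduce call per key.
    return [reduce_fn(k, vs) for k, vs in groups.items()]

# Stage 1: word count over documents.
def map_words(doc):
    for word in doc.split():
        yield word.lower(), 1

def reduce_counts(word, counts):
    return word, sum(counts)

# Stage 2: invert to (count, [words]), consuming stage 1's output.
def map_by_count(pair):
    word, count = pair
    yield count, word

def reduce_words(count, words):
    return count, sorted(words)

if __name__ == "__main__":
    docs = ["the quick brown fox", "the lazy dog", "the fox"]
    stage1 = run_mapreduce(docs, map_words, reduce_counts)
    stage2 = run_mapreduce(stage1, map_by_count, reduce_words)
    print(stage2)
    # [(3, ['the']), (1, ['brown', 'dog', 'lazy', 'quick']), (2, ['fox'])]
```

In production the stages would of course be separate MapReduce jobs reading and writing files (e.g. on GFS), but the shape of the pipeline is the same: simple M/R steps chained together rather than one job with multiple reduce phases.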

[Update 11/12/10] Last year Jeff gave a similar talk on scalability at WSDM 2009. Video [here] and slides [here]. Notes [here] and [here].

[Update 5/11/12] Question on Quora: “What are the numbers that every computer engineer should know, according to Jeff Dean?”
