Facebook Infrastructure @ New Years Eve – A study in Scalability

Another interesting article on how Facebook is preparing for New Year’s Eve, this time from our own San Jose Mercury News, by Mike Swift.

Interesting points:

  • New Year is one of the busiest times for social network sites as people post pictures & exchange best wishes

CEO Mark Zuckerberg has long been focused on having the digital horsepower to support unbridled growth — are a key reason behind the .. network’s success

  • It received > 1 B photo uploads during Halloween 2010
  • Since then Facebook has added 200 million more members, so New Year’s Eve 2012 could see more than 1.5 B uploads !
  • My favorite quote from the article:

The primary reason Friendster died was because it couldn’t handle the volume of usage it had. … They (Mark,Dustin and Sean) always talked about not wanting to be ‘Friendstered,’ and they meant not being overwhelmed by excess usage that they hadn’t anticipated

  • The engineers at Facebook just finished a preflight checklist and are geared up for the scale
  • In terms of scale “Facebook now reaches 55 percent of the global Internet audience, according to Internet metrics firm comScore and accounts for one in every seven minutes spent online around the world.”
  • From a Big Data perspective, Facebook data has all the essential properties viz. Connected & Contextual in addition to large scale – Volume & Velocity (see my earlier blog on big data)
  • Facebook has the “Emergency Parachutes” which let the site degrade gracefully  (for example display smaller photos when the site is heavily loaded)
  • Their infrastructure instrumentation is legendary (for example, the MySQL talk here)
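The “Emergency Parachutes” idea above (serve smaller photos when the site is heavily loaded) can be sketched in a few lines. This is only an illustration of graceful degradation, not how Facebook implements it; the load thresholds, the pixel widths and the function name are all made-up assumptions.

```python
# Hypothetical thresholds and widths; none of these values come from the article.
# Each entry is (load at or above this value, photo width in pixels to serve).
DEGRADED_WIDTHS = [(0.90, 180), (0.70, 480)]
FULL_WIDTH = 960


def photo_width_for_load(load: float) -> int:
    """Pick a photo width for the current load (0.0 = idle, 1.0 = saturated)."""
    if not 0.0 <= load <= 1.0:
        raise ValueError("load must be between 0.0 and 1.0")
    # Check the most severe threshold first so the smallest photo wins under heavy load.
    for threshold, width in DEGRADED_WIDTHS:
        if load >= threshold:
            return width
    return FULL_WIDTH
```

The point of the design is that the site keeps answering every request and only the quality of the answer drops, instead of some requests failing outright.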

To manage Facebook’s data infrastructure, you kind of need to have this sense of amnesia. Nothing you learned or read about earlier in your career applies here …

And finally, Our New Year Wishes to all readers & well wishers of this blog 

Social Networking – The next ERP ?

We can probably find enough evidence to argue this point.  An interesting waypoint is the project ESME


  • Enterprise Social Messaging Experiment (ESME) is an Open Source tool designed by Siemens IT Solutions and Services together with SAP Community specialists.
  • One client to the ESME system is,… yep you guessed it – ABAP, which is the programming language for SAP. So literally the ESME is an extension to ERP !
  • And as SAP matures its cloud products and (inevitably) moves into SaaS/Cloud models, an ERP-based social media interface which leverages the multi-tenant capabilities (thus deriving the social graph across enterprises) is not far behind !
  • ESME is a “self organizing communication group” which is interesting, as this is the only way collaboration/communication can scale. They need to acquire context as well as intelligently derive connectivity inferences
  • “It serves to identify company employees with particular knowledge or expertise, and networks these experts together so that they can exchange information”
  • And it is an open source Apache project. I might contribute … maybe an *OpenSocial* interface, which the project lacks now. I also need to see how they are organizing the knowledge graph and social graph.


ESME is written in *Scala* – a programming language which combines the object oriented world and the functional programming world. It also has primitives from Erlang, a very scalable language system – see my blog for a quick review


Cheers & happy Pi day (3/14)

Marc Andreessen with Charlie Rose – Innovation, mobility, Social Media & Viral platforms

A very informative interview – Charlie asked interesting questions and Marc had equally insightful answers & discussions.

Video and full text at http://seekingalpha.com/article/121915-marc-andreessen-on-charlie-rose-internet-and-new-media-companies

For the attention challenged, my bullet notes:

  1. Future of newspapers

    • Two words – kill it ! Stop Printing newspapers !
    • Fundamental structural change happening in the newspaper business. It is happening in all branches of the media industry but the newspaper is at the front
    • Investors have seen thru the transition. But the industry is still trying to survive.
    • An interesting analogy : Chronic pain vs acute pain – How many years of chronic pain vs one year of acute pain for transition ;o)
    • Acute pain will be acute but inevitable. But need to build for future.
    • Wrote a blog New York Times deathwatch ! “What is with you & NY Times ?” Charlie asks. Blog post no longer there
  2. Social Media Industry

    • Facebook: Facebook has 175 million users, half of them use it every day; many use it 50 times a day. On its way to 500 million users ! Marc is on the board of directors. 135 million active users equates to the 6th most populous country in the world !
      They are taking a more organic growth model; if they had taken the normal advertising route, they would have made over a billion dollars in ad revenue. Facebook has tremendous potential; for example, they could monetize the home page. It is just a question of how they choose to extract the value. They want to build a long term business, eyes way on the horizon and a big vision (to connect everybody on the planet (what about beyond?))
    • Ning: Ning has crossed 20 million users adding 2 million users a month. There are a million social networks on Ning !
    • YouTube, Facebook et al – under-monetized assets
    • Twitter as a real-time electronic nervous system – says you could twitter when a plane lands in water. Maybe people did, but I wouldn’t be twittering if my plane crash landed on water ;o)
      Story of twitter – Evan Williams had a podcasting company ; raised ~3.5 million; didn’t succeed and returned all (Evan made up the difference!) Twitter was a side business at that time, it took off. So they changed focus, closed the podcast operation and focused on Twitter
    • Social networking is here to stay and its potential is just beginning. Marc is big on “viral” applications
    • The Obama campaign employed the social networking approach and philosophy as the engine for fund raising, volunteer coordination
    • Viacom suing YouTube is the wrong strategy – They should be using it to distribute their videos ! Every time there is a Viacom video on YouTube, there should be a buy button! It is a distribution channel that brings traffic to their properties !
      Napster – 20 million people showed up. If music industry had a buy button they would have been successful. When people line up, find a way to monetize it
  3. Innovation

    • More opportunity than ever before – Cascading effect – every new layer of technology makes another layer of innovation possible and that keeps rolling
      There is an interesting discussion of Intel’s transformation from a memory chip maker to a microprocessor maker in around 1985; was not an obvious bet to make, but they had to do it to escape the overhang of Japanese memory makers who were crushing Intel.
    • Innovation Cycle: Silicon Graphics went out of business due to Intel’s microprocessor and that freed up engineers to work on nVidia/ATI, which in turn is posing challenges to Intel in the video and graphics business
  4. Mobility, iPhone & the new landscape

    • Usually people talk about a new idea for long time, finally the technology comes together and the thing takes off – internet in ‘95 is an example, mobile is in that stage now
    • iPhone is a template every other vendor will copy. For the first time, with the iPhone, there is a real OS, an SDK and an application delivery infrastructure – the 1st time all of these come together over a fast network
    • iPhone itself is fantastic – beams from the future as Marc characterizes it – and inspired a lot more creative thinking around it
    • He mentioned an investment of his, Qik [http://qik.com/], where any phone can be the source of live streaming video to any device or other phone; it will be very effective as phones get HiDef video capability in 2 years
  5. The Magic Business

    • Bill Joy once said : some products have the “it works” feature !
    • There were more than thirty-five search ventures before Google; but Google search really worked in terms of the core technology plus they unlocked the ad business model.
    • Marc characterizes this as the “Magic Business” which happens once in 10 years or so – Cisco was a magic business, Intel was one, so was Microsoft and even Amazon. With Magic Business, one goes for scale and size. People had written AMZ off in 2002, but Bezos had the fortitude and foresight to stick with the long vision
  6. New form factor

    • Marc believes Kindle is the new form factor along with iPhone and netbook; each with a different but effective purpose.
    • Kindle is the web-pad, a 7” form factor, the next opportunistic screen size which people will use for video, telephony and conferencing.
    • Most probably the next new product from Apple would be this 7” e-Book, conference, web appliance !
  7. New VC Firm with a slightly different focus

    • Marc is starting a new VC fund with Ben Horowitz. They have invested their own money in the last 3 years in 36 companies
      They focus on smaller companies – 100K-200K; may be 500K to million. Marc is of the opinion that a whole generation of startups do not need very much money (“very much” defined as  200K – 1.5 million)
    • His new VC firm’s name – Andreessen Horowitz ; can be a law firm or a vc firm! Abbreviates to A to Z and will get listed first in yellow pages – could be a good name for a tow truck business as well!
  8. Impact of the recent economy related challenges

    • During the 2001 recession, we were the nose of the dog, this time we are the tail. Companies in the Valley do not generally run on debt financing and so are affected the least. But the big recession will impact sales. Silicon Valley will be the tragic beneficiary from damage in other industries – like banking et al
    • <KS>
      • I thought the discussion on new types of banks was a little asymptotic but the concept of new way of just-in-time credit scoring and credit provisioning by Bill me Later is interesting.
      • In a tangential discussion, Marc referred to “Good Banks, bad banks and ugly Assets” and ideas by Paul Romer
    • </KS>
    • Innovation will continue; tons of innovation will be bottled up in the next 5 years. Companies like Google, YouTube and Facebook developed thru the last bust. Look for return in 7-10 years from today’s funding.



Future apps for iPhones – social tagging !

A very interesting blog on next applications for iPhone. I have been a proponent of social tagging, not only as a commentary but also as a marker of history. I have been associated with social tagging projects with UCLA for a couple of years, after participating in the Urban Sensing Summit. We even had some ideas for doing social tagging at the Olympics ! I agree with Alex that the time has come for collaborative apps at the phone level and the iPhone as a platform can make it happen !

facebook a phone book ? A comparison of facebook & open social …

This blog [or here ] got me thinking, what exactically[1] is a social network platform ? Murdoch is quoted as saying “MySpace is a place for self-expression, where users’ MySpace pages become their home on the Internet. It is where they discover people, content, and culture — where they share information, communicate, and consume. Facebook, on the other hand, tends to be a web utility, similar to a phonebook.”

As described here, the war of the social network interfaces has begun. One main reason for the war is the control of architecture – companies have concluded that she who controls the interfaces controls the domain; social networks being a nascent domain with vast potential … the math is clear ..

Naturally there is more to the social network wars of 2007, between Opensocial and Facebook, with Microsoft/IBM/Sun waiting in the wings. In order to get a fundamental understanding and evaluate the significance, we have to look at the offerings, systemically, from a much broader perspective, in spite of the rhetoric by the executives and popular press.

Till now, in some sense, social networks have been a consumer artifact, a new phenomenon employed mainly for the social graph – that too in a very limited fashion. Only a handful of serious apps exist in this domain, again for now. This analysis is being done in that context; there is another adjacent domain, enterprise social, and also the knowledge graph ! Both of which have many flavors; I will touch upon some, but not in any detail ….



Programming Model




Completeness of Information Model









As of now, facebook has more mindshare. facebook definitely is developer friendly – the facebook developers Wiki is much more useful than the opensocial docs.

In short …

There is no question that the social graph is a platform rather than a feature or an application. The presentation by Reid Hoffman & Tim O’Reilly at the Graphical Social patterns 07 clearly articulates this domain.


Epilogue a.k.a Locus & Trajectory

The domain of social networking doesn’t stop here. IMHO, it has just started. There are two adjacent and complementary domains to the social networks – semantic web and topic maps. Maybe this is a topic for another blog … Let me stop here in the name of brevity (and go back and fill-in the details)…

Ref: [1] From the signoff of Detroit weatherman Sonny Eliot

facebook act 2 scene 1

I think facebook is maturing too fast; but that is the nature of the beast, one rolls with the momentum. Given that premise, what would next scene be ? Two thoughts:

i) Explore into next level of information maturity order.

Information Maturity Order

Most probably facebook is in 3rd order and now is the time to add context. Some candidates include the knowledge graph, skills graph, …

ii) facebook/e (facebook – Enterprise version)

It is no secret that enterprises see value in leveraging the social graph (and the derivatives thereof, like the knowledge graph). facebook on some form of enterprise class infrastructure as an offering would be a good sized market. An “enterprise class infrastructure” means assuring the required security, privacy, availability, extensibility/integration, multi-tenancy (if hosted in a collective farm), compliance verifiability et al. What enterprises get is an innovative infrastructure which can replace the corporate employee directory. Otherwise they will implement some form of social network with fewer features and it will take more time. Basically a social network SaaS ! Of course, facebook should not touch the rest of the corporate “stuff” that hangs off of an employee directory, like entitlement. Once the business and technology model is defined/refined, I can see a good revenue stream for facebook. Again part of growing up ;o)

Realistically, in order not to slow down the mainstream, the facebook/e might have to be a distinct entity, still feeding off of the mainline – in some sense distinct but not separate …

Let me know what you all think … with all the smart people around this company, most probably both these are on their way …

Publishing, Search, Fulfilment and Conversation as four pillars of any software system …

I am sure it has happened to you as well : After thinking thru & working feverishly on a problem and having been convinced that one has THE solution, suddenly one comes across an insightful discussion that totally changes one’s point-of-view !

It has happened to me occasionally and it happened again today when I read JP’s insightful blog “Facebook and the Enterprise: Part 4: Four Pillars”

In the name of brevity let me get to the point with a brief executive background …

What is a good story without a little rambling eh ? ;o)

I am part of a team at the San Jose Education Foundation working towards creating a destination for teachers – local schools first and then, of course, world domination ;o)

We were following three vectors – roughly speaking -

  • SIMS(Student Info management System),
  • Sharing Teacher Innovations &
  • Teacher tools.

Of these, after talking with the teachers and working with the board, we zeroed in on the domain of lesson plans and teacher tools around it. What we need is a destination for teachers, which would offer value for their precious time – so LessonOPOLY is born ! BTW, our fearless Director Of Innovation Gina came up with that name !

Incidentally this is a poster child for Web 2.0 because, we want a derivative based [1] collaborative creation of lesson plans as well as some form of social networking/interactivity.

We have this romantic notion of two classes from different geographies taking a field trip (to the same place, at the same time, by design not by accident) based on a shared lesson plan and coordinating the trip thru LessonOPOLY so that the students can meet and share their learning experience !

In short, the main features we want to offer to teachers are :

  1. Collaborative creation of lessons,
  2. A derivative based, repurposable, remixable, on-demand interchange multi-media content capability (it sounds more lofty than it actually is ;o))
  3. Ability to meaningfully stitch a fabric (of lessons, in this case) from a vast sea of materials which differ in content grade, content relevance, granularity and format diversity [2]
  4. Ability to perform semantic search on the federated content with contextual attributes meaningful to teachers &
  5. Finally, of course, social networking and other second order artifacts (For example, the above mentioned simultaneous field trip based on a shared lesson plan !)

For #3 and #4, what we realized was that there is lots of content available, much of it in lesson plan format ! But still it is not easy for a teacher to get to the right one to use for a particular lesson or an objective. And teachers, being as overworked as they are, would rather spend their time face to face with students than wading thru 100,000 items on a search page !

Our solution was a semantic search based on technologies like clustering, Kernel Methods and SVMs, with attribute based tagging. I was convinced that this is a search problem, until … I read JP’s blog and saw the words publishing, search and conversation in a coherent/cohesive way that was relevant to our work ! I would possibly add context in the mix as well …

After reading JP, I realized that what we are looking at is publishing (based on a derivative based approach [1]), search and conversation ! In fact, the derivative based lesson plan creation is a conversation, not the traditional voice/im/e-mail type, but definitely a conversation or a narrative as mentioned by Dr. Norvig here (at the bottom of the interview)

And as JP mentions, we are building a context based visualization tool to enrich and improve the vast sea of content out there ! This is not a find problem, but an information overload problem ! In fact, we are anti-search ! We want to throw out stuff, not find them !

Yet Brutus is an honorable man and this is still a semantic search overlay for a derivative / tagging based collaborative content creation & aggregation. We still need multi-variate, multi-modal clustering. But my Support Vector Machines, Kernel Methods, Kalman Filters and Clusters should search for (and make probabilistic inferences on) irrelevance (based on context, of course, which would include location, temporal attributes, abstract past interactions and wisdom of the crowds) !
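To make the “anti-search” idea concrete, here is a minimal sketch of throwing out irrelevant lessons based on context. It deliberately swaps the SVMs, Kernel Methods and clustering mentioned above for a plain tag-mismatch count, so it only shows the shape of the idea. The tag names (grade, curriculum, level), the sample lessons, the function names and the 0.34 cutoff are all hypothetical.

```python
def irrelevance(lesson_tags: dict, context: dict) -> float:
    """Fraction of the teacher's context attributes that the lesson fails to match."""
    if not context:
        return 0.0
    misses = sum(1 for key, wanted in context.items() if lesson_tags.get(key) != wanted)
    return misses / len(context)


def throw_out(lessons: list, context: dict, max_irrelevance: float = 0.34) -> list:
    """Keep only lessons whose irrelevance score is at or below the cutoff."""
    return [lesson for lesson in lessons
            if irrelevance(lesson["tags"], context) <= max_irrelevance]


# Hypothetical sample data, tagged along the lines of footnote [1].
lessons = [
    {"title": "Fractions with pizza",
     "tags": {"grade": 3, "curriculum": "CA", "level": "introductory"}},
    {"title": "Fractions and ratios",
     "tags": {"grade": 6, "curriculum": "CA", "level": "advanced"}},
    {"title": "Counting coins",
     "tags": {"grade": 3, "curriculum": "main", "level": "introductory"}},
]
context = {"grade": 3, "curriculum": "CA", "level": "introductory"}
kept = throw_out(lessons, context)
```

With this sample data the 6th grade advanced lesson misses two of the three context attributes and is discarded, while the other two survive. A real system would learn the irrelevance score rather than count mismatches, and the context would also carry location, time and past interactions.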

Thanks JP for the insights ! Your observations are true not only for enterprises (i.e. where we work) but also where we play & learn !

P.S : I think this is possible more now than before. See a discussion on semantic technologies …

Update [Oct 2, 2007] : Came across Sramana’s blog which also talks about Context and Vertical Search as pillars of Web 3.0.

[1] By derivative, I mean collaborative, revision based creation, like a Wiki but which includes multi media ! Then what we see is not 50 intermixed versions of the same materials, but a systemic organization of work by multiple folks, all trying to create a set of coherent lesson plans for their students. Plus tagging, not only about the utility but also the appropriateness for a particular purpose (for example introductory vs advanced, for CA curriculum vs main curriculum, for 3rd graders vs 6th graders, as a first course vs as a second course, …)
[2] http://www.hewlett.org/Programs/Education/OER/OpenContent/Hewlett+OER+Report.htm

[3] A great discussion on the top down approach for semantic web is at http://www.readwriteweb.com/archives/the_top-down_semantic_web.php

Dimensions & pillars of social networking …

Read a good article 35 Perspectives on Online Social Networking from a fellow wordpress blogger. Very interesting. And that led to another article The Four Pillars of Social Software.

Both writings raise interesting dimensions -

The last pillar “Scale kills conversation, so Protect Conversations From Scale” is very relevant (and very difficult !)

“Finding way for people to self-organize, split up and reform dynamically, and form affinities with groups ” is congruent to the graph primitives in ad-hoc networks, one of my passions !

When a social network group scales in a scale-free manner, one would like it to morph into sub-graphs (to keep the conversation, one really cannot shout to 10,000 folks and manage the threads ;o(), still maintaining the connectivity thru sentinel nodes and thus keeping the “bigger” network alive.

Each network has its own form, function and utility (Book [1] below explains the dimensions of organization very well), so we cannot arbitrarily create sub-groups and forget about the original organization.
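As a rough sketch of the sub-graph idea above: a large group is split into smaller sub-groups, each with a sentinel node, and the sentinels are linked so the “bigger” network stays connected. The function names, the star layout inside each sub-group and the chain between sentinels are assumptions for illustration. The split here is by list position only, which is exactly the arbitrary grouping cautioned against; a real split would follow the network's own organization.

```python
from collections import deque


def split_with_sentinels(members, max_size):
    """Split members into sub-groups of at most max_size; the first member of each is its sentinel."""
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    groups = [members[i:i + max_size] for i in range(0, len(members), max_size)]
    edges = set()
    for group in groups:
        sentinel = group[0]
        # Inside a sub-group everyone is linked to the sentinel (a star, for simplicity).
        edges.update((sentinel, member) for member in group[1:])
    # Sentinels are chained so the sub-groups remain part of one bigger network.
    sentinels = [group[0] for group in groups]
    edges.update(zip(sentinels, sentinels[1:]))
    return groups, edges


def is_connected(members, edges):
    """Breadth-first search over undirected edges; True if every member is reachable."""
    if not members:
        return True
    neighbours = {member: set() for member in members}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, queue = {members[0]}, deque([members[0]])
    while queue:
        for nxt in neighbours[queue.popleft()] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return len(seen) == len(members)
```

For example, `split_with_sentinels([f"user{i}" for i in range(10)], 4)` yields three sub-groups whose conversations stay small, while `is_connected` confirms that a message can still travel across the whole network via the sentinels.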


Couple of good books for summer reading (in addition to The Deathly Hallows, of course ;o))

1. Everything Is Miscellaneous: The Power of the New Digital Disorder
2. Watch This, Listen Up, Click Here: Inside the 300 Billion Dollar Business Behind the Media You Constantly Consume
3. The Black Swan: The Impact of the Highly Improbable
4. Herd: How to Change Mass Behaviour by Harnessing Our True Nature


Urban sensing, community (wireless) networks, intelligent edges & social tagging (Part V of X)

We still haven’t talked about social tagging, community networks and urban sensing. We will now !

First I have attached my presentation for my talk “Urban Sensing, the place in-between work and play 2010 & Olympics 2008” at CINACON 2006 (Chinese Information & Networking Association). I will explain the presentation as a starting point and then continue with a discussion on our exciting collaboration with UCLA (REMAP/CENS). The project is just starting, and I will report the progress from time to time …