Context:
I conducted a tutorial at OSCON 2012, “The Art of Social Media Analysis with Twitter & Python”. The slides are on SlideShare and the Python/MongoDB/NetworkX programs are on GitHub. The next day I was fortunate to be interviewed by Mac Slocum; Mac has a way of asking interesting questions. Thanks, Mac!
This is a series of blogs annotating the slides with notes, as required. Some things are detailed in the slides, but the slides miss a lot of what I talked about at the tutorial. I am planning on adding the notes in a series of six blogs. This is Part 1 of 6.
The hands-on project, patterns & code ended up handling ~970,000 unique users and a social graph with ~500,000 cliques. Some Twitter REST API runs took 19 hrs to complete, and the MongoDB database was ~6GB on an m2.large AWS instance. I will point out some of the interesting big data patterns related to the Twitter API and the social graph.
Status:
- Aug 4, 2012 – Part 1 Completed
- Aug 4, 2012 – Part 2 Completed
- Aug 4, 2012 – Part 3 Completed
- Aug 5, 2012 – Part 4 Being contemplated
Prelude:
Twitter is at a fork: it has achieved a certain amount of status and popularity, not to mention utility and value to society. We are all slowly adapting to the medium and finding ways of using it. My thoughts on the recent changes in API “branding”:
Twitter Tips – A Baker’s Dozen:
The slides capture the detailed bullet points.
Big Data with TwitterAPI – Twitter Tips
In the next blog, we will look at the Big Data Pipeline for a Twitter API ecosystem and then move on to the APIs and Twitter Object Models.
Cheers
<k/>