
Room With A View of the Thames

Copy of my post in LinkedIn


View From My Room

London has changed a lot since I visited last! Interesting constructions – at least in this part of town.

Am at the Aloft London Excel, ready for our tutorial at StrataHadoop London, “Building machine-learning apps with Spark”. I will be talking about Apache Spark/GraphX along with my esteemed colleagues Jayant & Vartika.

Got a good room with a view of the Thames!

Took the Norwegian flight OAK–LGW; a good alternative for getting to London. Am also flying Norwegian to Gothenburg.

The OAK–LGW flight was on a 787-8! Good plane – I really like the windows: a lot bigger … and it has the hi-tech electrochromic dimming system (or “sunglasses”, as it is colloquially called!) that replaces the window shade – you can always see outside.

Judging Lego Robotics @FirstLegoLeague World Competition-2016

P.S: Copy of my blog in LinkedIn

As usual, was fortunate enough to be a Robot Design Judge at the FLL Robotics World Festival this year. I have been judging FLL Robotics for more than a decade and plan to do so in the future. Also been building 5000+ piece Lego sets!

Here is a summary from ~150 pictures, visiting and talking with teams from all around the world – Asia, Europe, Australia, Africa, the Far East and, of course, the Americas.

If you are involved with an FLL team next year (anywhere in the world) and need my help, ping me – would be happy to share insights. Have some thoughts for potential teams at the end of this blog.




The FLL events were at the Edward Jones Dome and the America’s Center













What would you want AI to do, if it could do whatever you wanted it to do ?

P.S: Copy of my blog in LinkedIn

Note: I am capturing interesting updates at the end of the blog.


Exponential Advances:

An interesting article in Nature points out that exponential advances in technology can result in a very different world very soon.

IBM X Prize:



And the IBM AI X Prize is offering a chance to showcase powerful ideas that tackle challenges.

Got me thinking … What would we want our machines/AI to do?

I am interested in your thoughts. Please comment on what you would like AI to do.

Earlier I had written about us not wanting our machines to be like us; understand us – maybe, help us – definitely, but imitate us – absolutely not …

So what does that mean ?

  • Driving cars ? – Definitely
  • Image recognition, translation and similar tasks ? – Absolutely
  • Write like Shakespeare just by feeding all the plays to a neural network like an LSTM ? – Definitely not !

I see folks training deep learning systems by feeding them Shakespeare plays and seeing what the AI can write. A good exercise, but is that something we would get an X Prize for? Of course, that is putting the cart before the horse!
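For context, the setup I am referring to is roughly the following: a character-level LSTM trained on the concatenated plays. Here is a minimal sketch in Python/Keras; the corpus file name and the hyper-parameters are placeholders of mine, not from any particular experiment.

```python
# Minimal character-level LSTM sketch (illustrative hyper-parameters).
# Assumes a plain-text file "shakespeare.txt" with the concatenated plays.
import numpy as np
from tensorflow.keras import layers, models

text = open("shakespeare.txt", encoding="utf-8").read()
chars = sorted(set(text))
char_to_idx = {c: i for i, c in enumerate(chars)}

seq_len, step = 60, 3
X, y = [], []
for i in range(0, len(text) - seq_len, step):
    X.append([char_to_idx[c] for c in text[i:i + seq_len]])
    y.append(char_to_idx[text[i + seq_len]])
X, y = np.array(X), np.array(y)

model = models.Sequential([
    layers.Embedding(len(chars), 64),                 # characters -> vectors
    layers.LSTM(128),                                 # sequence model
    layers.Dense(len(chars), activation="softmax"),   # next-character distribution
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
model.fit(X, y, batch_size=128, epochs=1)             # more epochs for "Shakespeare"
```

Generation is then just repeated sampling of the next character from the softmax: syntactically Shakespearean, perhaps, but with no story behind it, which is exactly the point.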

We don’t write just by memorizing the dictionary and Elements of Style !!

  • We write because we have a story to tell.
  • The story comes before writing;
  • Experience & imagination comes before a story …
  • A good story requires both narrative power and powerful content, with its own anti-climax and, of course, the hanging chads ;o)
  • Which the current AI systems do not possess …
  • Already we have robots (Google Atlas) that can walk like a human – leaving aside the goofy gait – which, of course, is more a mechanical/balance problem than an AI challenge
  • Robots can drive way better than a human
  • They translate a lot better than humans can (Of course language semantics is a lot more mechanical than storytelling)
  • Robots and AI machines do all kinds of stuff (even though the Mercedes assembly plant found that they cannot handle versatile customization!)

Is there anything remaining for an AI prize? One wonders …

In the article “How Google’s impressive new robot demo will fuel your nightmares”, at 2:09, the human (very rudely) pushes the robot to the ground and the robot gets up on its own! That proves that we have solved the mechanical aspects of human anatomy w.r.t. movement & balance.

[Update 3/17/16] Looks like Google is pushing Boston Dynamics out of the fold !



But a meta question remains.

  • Would the robot be upset at the human ?
  • Would it know the difference – if it was pushed to keep it out of harm’s way (say, from a falling object) vs. out of spite ?
  • And, if we later hug the robot (as the author suggests we do) would it feel better ?
  • Will it forget the insult ?

So there is something to be done after all !

Impart into our AI the capability to imagine, the ability to understand what life is; to feel sadness & joy; to understand what it is to struggle through a loss, …

This is important – for example, if we want robots to act as companions for the sick, the elderly and the disabled, maybe even the occasionally lonely, the desolate and, for that matter, even the joyous!

If the AI cannot comprehend sadness, how can it offer condolences as a companion ? Wouldn’t understanding our state of mind help it to help us better? 


In many ways, by helping AI to understand us, the ultimate utility might not be whether AI really comprehends us or not, but whether we get to understand ourselves better in the process !! And that might be the best outcome of all of these innovations.

As H2O.ai chief SriSatish points out,

Over the past 100 years, we’ve been training humans to be as punctual and predictable as machines; … we’re so used to being machines at work—AI frees us up to be humans again ! – Well said SriSatish

With these points in mind, it is interesting to speculate what the AI X-Prize TED talks would look like in 2017; in 2018. And what better way to predict the future than to invent it ? I am planning on working on one or two submissions …

And what says thee ?

[Update 3/12/16] Very interesting article in GoGameGuru about AlphaGo’s 3rd win.


  • AlphaGo’s strength was simply remarkable and it was hard not to feel Lee’s pain
  • Having answered many questions about AlphaGo’s strengths and weakness, and exhausting every reasonable possibility of reversing the game, Lee was tired and defeated. He resigned after 176 moves.
  • It’s time for broader discussion about how human society will adapt to this emerging technology !!

And Jason Millar @guardian agrees.


Maybe all is not lost after all, WSJ says … !


[Update 3/9/16] Rolling Stone has a 2-part report – Inside the Artificial Intelligence Revolution. They end the report with a very ominous statement.



[Update 3/4/16] Baidu Chief Scientist Andrew Ng has insightful observations

  • “What I see today is that computer-driven cars are a fundamentally different thing than human-driven cars and we should not treat them the same”- so true !

[Update 3/6/16] An interesting post from Tom Davenport about Cognitive Computing.

  • Good insights into what Cognitive Computing is, as a combination of Intelligence(Algorithms), Inference(Knowledge) and Interface (Visualization, Recommendation, Prediction,…)
  • IMHO, Cognitive Computing is more than analytics over unstructured data; it also has touches of AI in there.
  • The reason being, Cognitive Computing understands humans – whether it is about buying patterns, the way different bodies react to drugs, the various forms of diseases, or even the way humans work and interact
  • And that knowledge is the difference between Analytics and Cognitive Computing !

I like Cognitive Computing as an important part of AI; probably that is where most of the applications are … again, understanding humans rather than being humans!



The Master Algorithm (Book Review) a.k.a Data the Final Frontier

P.S: Copy of my blog in LinkedIn

Book Review of “The Master Algorithm”

Prof. Pedro Domingos has done a masterful job of unboxing Machine Learning – and unboxing is the right word!

A very insightful book that would bring tears (of joy, not misery) to the eyes of Data Scientists and Data Engineers; not to mention the C-Suite execs who would acquire deep wisdom of the data kind (am not sure if they would shed tears, they would if they could….)

And if you haven’t read the book yet, you should run – not walk – to the nearest store (or to the nearest Amazon web site with a speedy DNS) and buy one (or more!)

While you are waiting for the book to arrive (by second-day shipping – y’all have Prime shipping, don’t you?) you could prime yourself for the intellectual feast by reading these two resources:


The book can be consumed at at least two levels – first, as an insight into the domain of algorithms, data and machine learning; but the more exciting level is as an inspiration and a guidepost into techniques and mechanisms that augment the current models one is working on – a natural extension of Prof. Domingos’ call to action …

I’d like to give you a parting gift … the great undiscovered ocean stretches into the distance, the gift is a boat – Machine Learning – and it’s time to set sail

My trek through the book was the latter – and what an incredible journey it was! As Prof. Domingos says,

Before we can learn deep truths with machine learning, we have to discover deep truths about machine learning …

and the book does the latter – in spades!

“Society is changing, one learning algorithm at a time” – the prologue runs like a Bond movie (a Tron-esque Master Algorithm/MCP as the next head of Spectre, anyone?), expanding this idea into various modern-day successes, for example “The candidate with the best voter model wins” (ref. my blog All The President’s Data Scientists)

Main Ideas:

The main thesis of the book is around the Five Tribes of Machine Learning and the Master Algorithm that unifies them all (& more …). The central hypothesis of the book is like so:

 All knowledge – present, past & future – can be derived from data, by a single, universal learning algorithm – the Master Algorithm



The language is poetic and picturesque, weaving through a lot of deep concepts, conveying the art of the possible and the probable, tickling the imagination of the uninitiated as well as the practitioner.

The analogies are very real and reflect the fundamental principles of Machine Learning and Big Data, viz.:

  • Learning Algorithms are the seeds, Data the soil & Learned programs the grown plants
  • Machine Learning cartons in the supermarket labelled ‘Just Add Data’
  • Every field needs data commensurate with the complexity of the phenomenon it studies
  • Perceptrons – mathematically unimpeachable, searing in its clarity and disastrous in its effects
  • ramblings of a drunkard, locally coherent even if globally meaningless
  • MCMC as drowning our sorrows in alcohol, getting punch drunk & stumbling around all night
  • SVM as a fat snake slithering through a minefield, or dimensionality reduction as arranging books on a shelf!

The book is full of nuggets of wisdom and insights; let me highlight a couple:

  • The S-curve as the basis of evolving systems, “the most important curve in the world”; quoting Hemingway’s The Sun Also Rises on going bankrupt – “Two ways – gradually and then suddenly!” – the S-curve, of course. Also, it is the S-curve, not the Singularity, that will explain the evolution of AI
  • The progression from Hopfield’s deterministic spin glass, to work on probabilistic neurons by Hinton, et al.
  • Nature (the program) evolves for the nurture (the data) it gets, and Baldwinian evolution, i.e. “behaviors that are first learned become genetically hardwired” – a strong case for the important step of model evolution after deployment (I had talked about this at The Best of the Worst in Big Data – see slide #7, video of the pyata talk)
  • Power laws, where things get better with time – “except, of course, Windows, which gets slower with every version!”
  • The jobs machines are good at: “credit applications and car assembly rather than stumbling around a construction site”. The key is that machines can’t be like us and vice versa; humans are good at tasks that require complex context & common sense, and we don’t compete with the machines – viz. “you don’t outrun a horse, you ride it!” – well said, Prof. Domingos. I also have similar thoughts about AI.


Absolutely worth reading, in the genre of Stephen Baker’s “Final Jeopardy” (my book review) & Steven Levy’s “In The Plex” (my book review), to name a few. It is instructive to see how much the domain of Machine Learning has evolved in the span of ~4 years!


Works that blend multiple genres are hard to create but provide endless enjoyment. I enjoyed 3 in the last couple of weeks – Prof. Domingos’ The Master Algorithm, the movie Bahubali and the songs (a juxtaposition of Sanskrit/ vernacular) and of course, Spectre (the movie & the motion picture soundtrack)

And am planning the next set of book reviews in a somewhat orthogonal domain – FinTech. Actually, am pursuing the MS-CFRM at UWA!

Illuminae (and S – I have both !) belong to a new meta genre – books that give you a multi-dimensional on-line experience; the inverse (or transpose – am watching MIT 18.06) of e-books, that is, you read them like an e-book, but in the physical form !


Twitter 2.0 = Curated Signals + Applied Intelligence + Stratified Inference

P.S: Copy of my blog in LinkedIn

Exec Summary:

One possible trajectory and locus (“product cadence”) for Twitter 2.0 is to be a platform – to tell stories at different levels of abstraction – from basic curated signals, to aggregated intelligence (i.e. trends, positions, sentiments and issues), and finally the higher order of exposing stratified inference built on the signals and intelligence.

For example, CPC advertisers might want to know “Who is an NBA fan” for personalized ad campaigns based on the interest graph (we did a similar project a few years ago, based on Twitter data)

All without sacrificing the core Twitter consumption experience, but adding different dimensions to Twitter consumption …

Constituents like political campaigns (pardon the pun ;o)) can consume the platform at different levels of sophistication. All the (potential!) President’s Data Scientists can run multiple models over the signals, while all the President’s DevOps can build dashboards for the strategists to consume the curated inferences.

Twitter Network != Facebook Network; Twitter Graph != LinkedIn Graph, i.e. Twitter is an interest graph, not a social graph. If so, why can’t Twitter expose the interest graph as a first-class entity, with appropriate intelligence?

Twitter is the right platform for ad-hoc, ephemeral spaces to exchange quick notes.

[Update : Julia Boorstin’s blog What’s Next for Twitter also echoes many of my recommendations below]


For various reasons I have been contemplating Twitter in general, and specifically political campaigns as an example of an ecosystem where Twitter has lots of potential.

Twitter has been in the news recently with the CEO change as well as the stock dip. Time for Twitter 2.0 ? Definitely !

Interestingly, I had written about Twitter 2.0 in 2011 and most of it is still true! I will include relevant parts from that blog.

For the technically minded who are into the gory details, please refer to materials from my 2012 OSCON tutorial [Social Network Analysis with Twitter]

What do campaigns want ?

They want curated inference (which they can directly consume for actionable outcomes) and curated intelligence (for overlaying specialized models over the exposed signals at different orders). All the President’s Data Scientists would have interesting data science models over the Twitter signals. A general model is like so:

Twitter 2.0 – Trajectory & Locus

Right now Twitter is an agora for pure message-based interactions; but it has a lot more potential – to be a platform (of course, without sacrificing the essential nature of the medium)! To get there, it needs to be proactive, providing different levels of abstraction – from the basic curated signals, to aggregated intelligence (i.e. trends, positions, sentiments and issues), to the higher level of stratified inference. It should also provide congruence between Twitter and the rest of the world, i.e. how indicative the twitter-verse is of the general population.

Topic Streams a.k.a TweeTopics

I use Twitter for 3 things – to keep current with topics that interest me, keep in touch with friends & acquaintances and finally publish things that I am interested in – many times as a bookmark !

It is almost impossible to follow topics. The List functionality never worked for me. It should be as easy to follow & unfollow topics as it is to follow people. In this day and age, it is not that hard to run the tweets through a set of analytics engines, cluster them by subject and offer the topics with the same semantics as people! The current interaction semantics are very relevant – that is what makes Twitter Twitter.
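To make the “cluster them by subjects” idea concrete, here is a minimal sketch of one way to do it (Python/scikit-learn). The sample tweets and the number of topics are made up, and a real topic stream would need language detection, hashtag and entity features, de-duplication and much more.

```python
# Sketch: cluster tweets into rough "topics" with TF-IDF + k-means.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

tweets = [
    "Spark GraphX tutorial at Strata London",
    "Warriors win game 5, Curry on fire",
    "New deep learning paper on LSTMs",
    "Cavaliers fight back in the NBA finals",
]

vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(tweets)

k = 2  # number of followable "topic streams" to offer
km = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)

# The top terms per cluster act as a label for each topic stream
terms = vectorizer.get_feature_names_out()
for c in range(k):
    top = km.cluster_centers_[c].argsort()[::-1][:3]
    print(f"topic {c}:", [terms[i] for i in top])
```

Following a topic would then simply mean subscribing to one of these clusters, with the same follow/unfollow semantics as people.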

There were some thoughts about tweet threading – I think that defeats the purpose; tweets are stateless and that attribute is very important.

Twitter is different from Facebook and LinkedIn: it is not a social graph but an interest graph. Many of the traditional network mechanisms & mechanics, like network diameter & degrees of separation, might not make sense. But others, like cliques and bipartite graphs, do.

Why can’t Twitter expose the interest graph, with appropriate intelligence ?
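As a toy illustration of that interest graph, here is a small NetworkX sketch of a bipartite user-to-interest graph and its projection onto users who share interests; the users and interests are invented.

```python
# Sketch: Twitter as a bipartite user <-> interest graph (toy data),
# projected onto users so that shared interests become weighted edges.
import networkx as nx
from networkx.algorithms import bipartite

users = ["alice", "bob", "carol"]
interests = ["NBA", "MachineLearning", "Spark"]

B = nx.Graph()
B.add_nodes_from(users, bipartite=0)
B.add_nodes_from(interests, bipartite=1)
B.add_edges_from([
    ("alice", "NBA"), ("alice", "MachineLearning"),
    ("bob", "MachineLearning"), ("bob", "Spark"),
    ("carol", "NBA"),
])

# Users connected by the number of interests they share
G = bipartite.weighted_projected_graph(B, users)
print(G.edges(data=True))                       # e.g. alice and bob share one interest
print("NBA fans:", [u for u in users if B.has_edge(u, "NBA")])
```

The “Who is an NBA fan” question from the exec summary then becomes a simple neighborhood query on this graph.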

Topic Spaces a.k.a. TweetSpaces a.k.a TweetRooms

Twitter is the right platform for ad-hoc, ephemeral spaces to exchange quick notes.

This was my observation in 2011, and it is still true.

IM is too heavyweight and not that easy for quick things like “Where is that meeting room?” or “Which seat are you in?” or “What should we discuss next?” et al. A one-to-many exchange, between people who are spatially (and even temporally) in separate spaces – they might be in a plane, on a call or even in a hallway! It should be easy to add a “!” tag and shout the info. Yep, folks need to know what the ! tag is. Actually, come to think of it, we could have many types of tags using the ‘$’, ‘@’, ‘%’, ‘^’, ‘&’ and ‘*’ characters with different semantics!

Time for a “Tweet Mark-up Language” ?
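A rough idea of what such a Tweet Mark-up Language could look like, expressed as a parser. This is just a sketch, and the meanings assigned to the sigils are invented for illustration.

```python
# Sketch of a "Tweet Mark-up Language": sigil-prefixed tags with different
# semantics. The meanings assigned to the sigils are invented for illustration.
import re

SIGILS = {
    "#": "topic",    # existing hashtag semantics
    "@": "mention",  # existing mention semantics
    "!": "shout",    # ephemeral, room-style announcement
    "$": "entity",   # ticker / named entity
}

TAG_RE = re.compile(r"([#@!$])(\w+)")

def parse_tags(tweet):
    """Return (semantic, tag) pairs for every sigil-prefixed token."""
    return [(SIGILS[s], t) for s, t in TAG_RE.findall(tweet)]

print(parse_tags("!room42 moving to @alice for the #Spark demo, watch $TWTR"))
# [('shout', 'room42'), ('mention', 'alice'), ('topic', 'Spark'), ('entity', 'TWTR')]
```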

These are some of my quick thoughts, what says thee ?

An excursion into ranking the NBA with Elo

P.S: Copy of my blog in LinkedIn

Ranking and odds-making are among the oldest professions, probably dating to around AD 69 – the Romans applying inference to predicting gladiatorial shows! Fast forward: the recent NBA finals have become more interesting (from a Data Science perspective, of course) after the Cavaliers’ win!

Update (for those who were here before): The Game 4 win by GSW (see the Update section at the end) shows how Elo adjusts for a larger Margin of Victory without oscillation!

One interesting algorithm is the Elo ranking, which has seen application in chess, computer games, the NFL, the NBA and Facebook! In the movie The Social Network, Eduardo Saverin writes the Elo formula on the glass, responding to Mark Zuckerberg’s call for the algorithm – the picture says it all!

An Introduction to Elo:

Leaving Eduardo and Mark Zuckerberg aside for a moment and moving on to the world of LeBron James and Steph Curry, Elo ranks teams or individuals in chess, basketball, computer games et al. The rank goes up or down as one wins or loses.

If a team is expected to win and it wins, its Elo rank goes up by a small amount; the gains are higher when a lower-ranked opponent wins against a stronger team, with adjustments made for the margin of victory.

After every season, the rank reverts toward a norm of 1505 (for basketball) – but basketball teams being stable year-to-year, the folks at 538 use a blend of 75% carry-over and 25% reversion to the norm. We won’t deal with this now, but I did check it in my R program.
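The mechanics are simple enough to sketch in a few lines. Below is a minimal Python version of the Elo update as I read 538's NBA variant (my own analysis was in R). The constants (K = 20, roughly 100 Elo points of home-court advantage, a point spread of about the Elo gap divided by 28, and the margin-of-victory multiplier) reflect my reading of their published methodology, so treat them as assumptions rather than gospel.

```python
# Minimal Elo sketch. Constants follow my reading of 538's NBA methodology:
# K = 20, ~100 Elo points of home-court advantage, point spread ~ Elo gap / 28,
# and a margin-of-victory (MOV) multiplier that dampens blowout swings.
K = 20.0
HOME_ADV = 100.0

def expected(elo_a, elo_b):
    """Probability that A beats B."""
    return 1.0 / (1.0 + 10 ** ((elo_b - elo_a) / 400.0))

def point_spread(elo_home, elo_away, neutral=False):
    """Predicted margin for the home team, in points."""
    gap = elo_home - elo_away + (0 if neutral else HOME_ADV)
    return gap / 28.0

def mov_multiplier(margin, elo_diff_winner):
    """MOV multiplier: big wins count more, but with diminishing returns."""
    return ((margin + 3) ** 0.8) / (7.5 + 0.006 * elo_diff_winner)

def update(elo_winner, elo_loser, margin):
    """New (winner, loser) ratings after one game."""
    delta = K * mov_multiplier(margin, elo_winner - elo_loser) * (1.0 - expected(elo_winner, elo_loser))
    return elo_winner + delta, elo_loser - delta

# Game 1 of the 2015 finals, GSW at home: GSW 1802, CLE 1712
print(round(point_spread(1802, 1712), 2))   # ~6.79, in line with the 6.78 in the game log below
```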

Back to the main feature … NBA

The current NBA finals are a dream series for Elo – the thrills and chills of Elo can be observed! Viz. a good matchup, with the seemingly stronger team winning the 1st game as expected; and boom, games 2 & 3 won by the (not so) weaker team!

You can see (below) Elo at its best, viz. capturing the transition, giving credit (and a higher rank) where it is due.

I had done Elo for the NFL, and wasn’t going to try the NBA after game 1, but now it looks like a good exercise in data algorithmics …

Fortunately, Nate Silver & his team have curated the basketball data from 1946 onward and explained their methodology. Thanks, guys.

I downloaded the data and did some R programming.

An ugly graph plotting Elo rating for the 2015 season for GSW (black) & CLE (blue).

We can definitely see that GSW is the stronger team, but CLE (Cavaliers) is getting stronger recently – especially as it wins over stronger teams.

Let us trace the stats summary, i.e. the Elo rank of the teams, the point spread predictions, the actual score and the response from the algorithm (the short sketch after the list below reproduces these spreads) …

Stuff that brings tears to the eyes of a Data Scientist !

  • Going to Game 1, Elo said – GSW: 1802; CLE: 1712; Point Spread: GSW by 6.78 points. Actual – GSW by 8 points
  • Nothing fancy; the Elo rank of GSW goes up by a little, CLE goes down a little
  • Going to Game 2, Elo said – GSW: 1806; CLE: 1708; Point Spread: GSW by 7.07 points. Actual – CLE by 2 points
  • Now, Elo kicks in! CLE gains more Elo (because they won over a stronger team), GSW loses more
  • Going to Game 3, Elo said – GSW: 1798; CLE: 1716; Point Spread: GSW by 2.92 points. Actual – CLE by 5 points
  • GSW’s Elo goes down; CLE’s future brightens; GSW still has a slim lead
  • Going to Game 4, Elo says – GSW: 1791; CLE: 1723; Point Spread: GSW by 2.3 points! <- We are here (June 10, 2015)
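Those spreads fall out of the Elo gaps directly. Using the point_spread helper from the sketch above (divide-by-28 plus 100 for home court, which are my assumptions), the listed numbers reproduce to within rounding:

```python
# Reproducing the listed spreads from the Elo gaps (GSW's rank first).
# Small differences come from rounding in the published ratings.
games = [
    ("Game 1", 1802, 1712, False),  # at GSW                -> ~6.79 (post: 6.78)
    ("Game 2", 1806, 1708, False),  # at GSW                -> ~7.07 (post: 7.07)
    ("Game 3", 1798, 1716, True),   # raw gap, no home adj. -> ~2.93 (post: 2.92)
]
for name, gsw, cle, neutral in games:
    print(name, round(point_spread(gsw, cle, neutral), 2))
```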

I will update with more Elo stats after Games 4, 5, 6 & 7 … (am sure it has the possibility to go to 7!)

6/11/15 : See Updates below

Incidentally Nate Silver’s tweets have an unintended consequence ! They are motivating Steph ! I am hoping this is the beginning of GSW’s path to a title …


  • [Update 6/11/15 10:31 PM] Actual: GSW by 21 points!
  • Nate’s tweets worked!
  • It is instructive to see the Elo graph. Even though the margin (21 points) is much larger (than the 2 & 5 points from the earlier games), the Elo doesn’t go up by a huge amount. This is good, because we don’t want Elo to oscillate, but it should still account for the larger-than-normal point spread. The Margin of Victory multiplier adjusts for that. Interesting to see the graph below – as Nate says, in one game GSW regained their old position.
  • Going to Game 5, Elo says – GSW: 1810; CLE: 1704; Point Spread: GSW by 7.35 points (with home court advantage – refer to the formula (above) for details)! <- Now, we are here (June 11, 2015)
  • [Update 6/14/15] Game 5, GSW by 13!
  • Going to Game 6, Elo says – GSW: 1814; CLE: 1701; Point Spread: GSW by 4.04 points (without home court advantage – refer to the formula (above) for details)! <- Now, we are here (June 14, 2015)



Of Byzantine Failures, Unintended Consequences & Architecture Heuristics

P.S: Copy of my blog in LinkedIn

Way …. back in 2007, I gave a talk on Architecture Heuristics – we talked about Byzantine failures, systems with strong bones and the politics of systems architectures.

One would think that all this is way behind us! Apparently not so! There is a software bug in the 787 GCU (generator control unit)! The root cause – yep, you guessed it – integer overflow!

The plane’s electrical generators fall into a failsafe mode if kept continuously powered on for 248 days. The 787 has four such main generator-control units that, if powered on at the same time, could fail simultaneously and cause a complete electrical shutdown
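The 248-day figure is consistent with a signed 32-bit counter ticking in hundredths of a second. A quick back-of-the-envelope check (the tick size is my assumption, not Boeing's published detail):

```python
# Back-of-the-envelope: a signed 32-bit counter incremented every 1/100 s
# overflows after roughly 248 days of continuous power.
INT32_MAX = 2**31 - 1      # largest signed 32-bit value
TICKS_PER_SECOND = 100     # centisecond counter (assumption)
seconds = INT32_MAX / TICKS_PER_SECOND
print(round(seconds / 86_400, 1))   # ~248.6 days
```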

And a self-parking car hits pedestrians because …

Keeping the car safe is included as a standard feature, but keeping pedestrians safe isn’t. …

Interesting … whatever happened to the prime directive? And pedestrian recognition – an option in self-parking cars? What next? The steering wheel as an option?

And we keep on building machines that are software-intensive! The Ford GT has more code than a 787!

Back to Architecture Heuristics …

  1. Select technologies that you can dance with & Be flexible in scaling as you grow
  2. Embrace Failure & Influence Scalability
  3. Build systems with good bones (my slides from 2007 still look relevant!)
  4. Solve the right problems
  5. While we build complex AI systems, remember that our ingenuity is hard to beat – even by the smart machines that we build !
  6. And, those who don’t learn from history should read these recommendations – they are still valid!
  7. … Of course, pay that extra $3,000 and buy the Pedestrian Detection option – you might drive the car in this world (where we humans reside – at least for now), not on Mars!
