A Glimpse of Google, NASA & Peter Norvig + The Restaurant at the End of the Universe


I came across an interesting talk by Google’s Peter Norvig at NASA.

Of course, you should listen to the talk – let me blog about a couple of points that are of interest to me:

Algorithms that get better with Data

Peter had two good points:

Norvig-01

  • Algorithms behave differently as they churn through more data. For example, in the figure, the Blue algorithm was better at a training-set size of one million. If one had stopped at that scale, one would be tempted to optimize that algorithm for better performance
  • But as the scale increased, the purple algorithm started showing promise – in fact the blue one starts deteriorating at larger scale. The old adage “don’t do premature optimization” is true here as well. 
  • Norvig-02
  • In general, Google prefers algorithms that get better with data. Not all algorithms are like that, but Google likes to go after the ones with this type of performance characteristic.
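The crossover described above can be sketched with two made-up learning curves. This is only an illustration under stated assumptions: the function names, the power-law shape and every constant are hypothetical and are not taken from the talk or its figures. In this simplified model the blue curve merely plateaus rather than deteriorates.

```python
# Hypothetical learning curves: test error as a function of training-set size n.
# All constants are invented for illustration; they are NOT from Norvig's talk.

def blue_error(n: float) -> float:
    # Learns quickly, but flattens out at a relatively high error floor.
    return 0.10 + 5.0 * n ** -0.5

def purple_error(n: float) -> float:
    # Learns slowly, but keeps improving as the data grows.
    return 0.02 + 3.0 * n ** -0.25

def best_at(n: float) -> str:
    """Name of the curve with the lower error at training-set size n."""
    return "blue" if blue_error(n) < purple_error(n) else "purple"

def crossover(sizes):
    """First size in `sizes` at which purple beats blue, or None."""
    for n in sizes:
        if purple_error(n) < blue_error(n):
            return n
    return None

sizes = [10 ** k for k in range(3, 10)]
for n in sizes:
    print(f"{n:>13,d}  blue={blue_error(n):.3f}  "
          f"purple={purple_error(n):.3f}  best={best_at(n)}")
print("purple first wins at n =", crossover(sizes))
```

Had the experiment stopped at a million examples, the blue curve would have looked like the one worth tuning; only the larger sizes reveal the other curve's advantage.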

There is no serendipity in Google Search or Google Translate

  • There is no serendipity in search – it is just rehashing. It is good for finding things, but not at all useful for understanding, interpolation & ultimately inference. I think Intelligent Search is an oxymoron ;o)
  • Same with Google Translate. Google Translate takes all its cues from the web – it wouldn’t help us communicate with either the non-human inhabitants of this planet or any life form from other planets/milky ways.
    • In that sense, I am a little disappointed with Google’s Translation Engines.  OTOH, I have only a minuscule view of the work at Google.

The future of human-machine & Augmented Cognition

And, don’t belong to the B-Ark !

The Curious Case of the Data Scientist Profession


Data Science & the profession of a Data Scientist are being debated, rationalized, defined and refactored … I think the domain & the profession are maturing and our understanding of the Mythical Data Scientist is getting more pragmatic.

Now to the highlights:

1. Data Scientist is multi-faceted & contextual

  • Two points – It requires a multitude of skills & different skill sets in different situations; and it is definitely a team effort.
  • This tweet sums it all up
  • DataScienceTeam
  • Sometimes a Data Scientist has to tell a good business story to make an impact; other times the algorithm wins the day
    • Harlan in his blog identifies four combinations – Data Business Person, Data Creative, Data Engineer & Data Researcher
      • I don’t fully agree with the diagram – it has a lot less programming & a little more math; math is usually built into the ML algorithms and the implementation is embedded in math libraries developed by the optimization specialists. A Data Scientist shouldn’t be twiddling with the math libraries
    • I had proposed the idea of a Data Science Engineer last year with similar thoughts; and elaborated more at “Who or what is a Data Scientist?”
    • The BAH Field Guide suggests the following mix:
    • Data Scienc 03
    • I would prefer to see more ML than M. ML is the higher form of applied M and also includes Statistics
  • Domain Expertise and the ability to identify the correct problems are very important skills of a Data Scientist, says John Forman.
  • Or as Rachel Schutt at Columbia quotes:
    • Josh Wills (Cloudera)
      • Data Scientist (noun): Person who is better at statistics than any software engineer & better at software engineering than any statistician

    • Will Cukierski (Kaggle) retorts
      • Data Scientist (noun): Person who is worse at statistics than any statistician & worse at software engineering than any software engineer

2. The Data Scientist team should be building data products

3.  To tell the data story effectively, the supporting cast is essential

  • As Vishal puts it in his blog,
    • Data must be there & processable – the story definitely depends on the data
    • Processes & buy-in from management – many times, it is not the inference that is the bottleneck but the business processes that need to be changed to implement the inferences & insights
    • As the BAH Field Guide puts it:
    • Data Scienc 04
    • DS01

4. Pay attention to how the Data Science team is organized

5. Data Science is a continuum of Sophistication & Maturity – a marathon rather than a sprint

Let me stop here, I think the blog is getting long already …

Is it still “Artificial” Intelligence, if our Computers learn -to think- from the workings of our Brain ?



  • In fact that would be Natural Intelligence ! Intelligence is intelligence – it is a way of processing information to arrive at inferences, recommendations, predictions and so forth …

Maybe it is that Contemporary AI is actually just NI !

Point #1 : Machines are thinking like humans rather than acting like Humans

  • Primitives inspired by Computational Neuroscience, like DeepLearning, are becoming mainstream. We are no longer enamored with Expert Systems that learn the rules & replace humans. We would rather have our machines help us chug through the huge amount of data.

We would rather interact with them via Google Glass – a two-way, highly interactive medium that acts as a sensor array as well as augments cognition with a digital overlay over the real world

  • In fact, till now, our computers were mere brutes, without the elegance and finesse of the human touch !
  • Now the computers are diverging from Newtonian determinism to probabilistic generative models.
  • Instead of using only greedy algorithms, the machines are now being introduced to Genetic Algorithms & Simulated Annealing. They now realize that the local minima a greedy search settles into are not the answer for every problem, and that exhaustive brute force does not scale.
  • They now have knowledge graphs and have the capability to infer based on graph traversals and associated logic
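To make the greedy versus Simulated Annealing contrast concrete, here is a minimal sketch on a toy one-dimensional cost landscape. Everything in it is an assumption made for illustration: the `LANDSCAPE` values, the function names, and the temperature schedule (`t0`, `cooling`) are hypothetical, and annealing is a randomized heuristic that does not guarantee the global minimum.

```python
import math
import random

# Toy cost landscape (invented for illustration). Index = state, value = cost.
# Local minima sit at indices 2, 7 and 12; the global minimum is index 12.
LANDSCAPE = [9, 7, 5, 6, 8, 6, 4, 2, 3, 5, 4, 1, 0, 2, 6]

def cost(x: int) -> int:
    return LANDSCAPE[x]

def neighbors(x: int) -> list:
    # A state can move one step left or right, staying inside the landscape.
    return [i for i in (x - 1, x + 1) if 0 <= i < len(LANDSCAPE)]

def greedy(start: int) -> int:
    """Always step to the cheapest neighbor; stop when nothing improves."""
    x = start
    while True:
        candidate = min(neighbors(x), key=cost)
        if cost(candidate) >= cost(x):
            return x  # stuck: no neighbor is strictly better
        x = candidate

def anneal(start: int, rng: random.Random, steps: int = 2000,
           t0: float = 5.0, cooling: float = 0.995) -> int:
    """Simulated annealing: sometimes accept a worse move, less often as t cools."""
    x = best = start
    t = t0
    for _ in range(steps):
        candidate = rng.choice(neighbors(x))
        delta = cost(candidate) - cost(x)
        # Downhill moves are always taken; uphill moves with prob. exp(-delta/t).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            x = candidate
        if cost(x) < cost(best):
            best = x  # remember the best state seen so far
        t *= cooling
    return best

g = greedy(0)
a = anneal(0, random.Random(0))
print("greedy    -> index", g, "cost", cost(g))
print("annealing -> index", a, "cost", cost(a))
```

Starting from index 0, the greedy walk halts at the first dip it meets (index 2), while the annealing walk is allowed to climb out of dips early on, when the temperature is still high.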

Of course, deterministic transactional systems have their important place – we don’t want a probabilistic bank balance!

Point #2 : We don’t even want our machines to be like us

  • The operative word is “Augmented Cognition” – our machines should help us where we are not strong and augment our capabilities. More later …
  • Taking a cue from the contemporary media, “Person Of Interest” is a better model than “I, Robot” or “Almost Human” – a Mr. Spock, rather than a Sonny; Logical but resorts to the improbable and the random, when the impossible has been eliminated !

Point #3 : Now we are able to separate Interface from Inference & Intelligence

AI-03

  • New Yorker asks, “Why can’t my computer understand me?” Finding answers to questions like “Can an alligator run the hundred-meter hurdles?” is syntax.
  • NLP (Natural Language Processing) and its first cousin NLU (Natural Language Understanding) are not intelligence, they are interface.
  • In fact, the team that built IBM Watson realized that “they didn’t need a genius, … but build the world’s most impressive dilettante … battling the efficient human mind with spectacular flamboyant inefficiency”.

Taking this line of thought to its extreme, one can argue that Google (Search) itself is a case in point of an ostentatious and elaborate infrastructure for what it does … no intelligence whatsoever – Artificial or Natural ! It should have been based on a knowledge graph rather than a referral graph. Of course, in a few years, they would have made huge progress, no doubt.

  • BTW, Stephen Baker has captured the “Philosophy of an Intelligent Machine” very well.
  • I have been & am keeping track of the progress by Watson.
  • Since then, IBM Watson itself has made rapid progress in the areas of Knowledge Traversal & Contextual Probabilistic Inferences, i.e. ingesting large volumes of unstructured data/knowledge & reasoning about it
  • I am not trivializing the effort and the significance of machines understanding the nuances of human interactions (speech, sarcasm, slang, irony, humor, satire et al); but we need to realize that this is not an indication of intelligence or a measure of what machines can do.

Human Interface is not Human Intelligence, same with machines. They need not look like us, walk like us, or even talk like us. They just need to augment us where we are not strong … with the right interface, of course

  • Gary Marcus, in the New Yorker article “Can Super Mario Save AI”, says “Human brains are remarkably inefficient in some key ways: our memories are lousy; our grasp of logic is shallow, and our capacity to do arithmetic is dismal. Our collective cognitive shortcomings are so numerous … And yet, in some ways, we continue to far outstrip the very silicon-based computers that so thoroughly kick our carbon-based behinds in arithmetic, logic, and memory …”

Well said, Gary. Humans & Machines should learn from each other and complement … not mimic each other … And there is nothing Artificial about it …

I really wish we take “Artificial” out of AI – Just incorporate what we are learning about ourselves into our computers & leave it at that !

Finally:

AI-04-01

The Art of an Insightful Recommendation


  • I have been working on multiple aspects of recommendation including AI & DeepLearning
  • Came across an insightful talk by Eric Colson of Stitch Fix at Strata 2013 titled “Committing to Recommendation Algorithms”
  • Short, succinct & very informative. Slides
  • It is only ~8 min. So I urge you all to watch it.
  • I took down some notes and created a couple of collages out of the presentations.
  • Strong Algorithms

  • StitchFix-01
  • Human Judgement

StitchFix-02

You see, their value proposition goes beyond convenience.
StitchFix-03
They provide a shopping experience beyond the casual encounter in a store or browsing a web page - the ability to find things that one wouldn’t have found on one’s own – and that is priceless!

Of Building Data Products


  • [Update 11/28/13] Notes from the blog by Jon, “Data Driven Disruption at Shutterstock”, on what a data products company is
    1. Data is your product, regardless of what you sell
    2. Data is your lens into your business – Jon echoes Peter’s insights viz. invest in data access; feel the pulse of the business & iterate
    3. Data creates your growth
  • Back to the main feature, Peter’s talk
  • A very insightful & informative talk by Peter Skomoroch of LinkedIn via Zipfian Academy
  • It is short & succinct, only 37 minutes. I urge all to watch
  • The slides of the talk “Developing Data Products” are at slideshare
  • Quick Notes:
    • A Data Product understands the world through inferential probabilistic models built on data
      • So collecting right data through “thoughtful” data design is very important
      • The data determines & precedes the feature set & the intelligence of your app
        • LinkedIn is a prime example – as they get more data, the app has become more intelligent, intuitive and ultimately more useful
        • Offer progressively sophisticated products, leveraging the data & insights, across the different user population segments – customer segmentation & stratification is not just for retail !
    • More data is good (see the “Unreasonable Effectiveness of Data” Distinguished Lecture by Peter Norvig); but for complex models, a deep understanding of the models and feature engineering would eventually be necessary (beyond the “black box”)
      • Data products about people are usually complex, in terms of models as well as the data


[Update 12/13/13] Remember, a data product usually has three layers – Interface, Inference & Intelligence

One Band to Rule them all – Nike FuelBand+ vs. Fitbit Flex


Background:

  • I had the Fitbit Flex for a few months. But am not that satisfied with it. Is Nike FuelBand+ any better ? Plan to find out, by wearing them both for a week or two. Am tracking a few factors, pl suggest more …

IMG_1557

Summary:

  • The Good
    • The style & form factor of Nike FuelBand+
      • I am sorry guys, Fitbit is ugly ;o(
    • Fitbit sends you e-mail when the battery is low ! Good work guys
    • Hours won feature of Nike+
      • Good concept. Basically, it wants you to move periodically
    • Fitbit is a lot lighter than Nike+
  • The Bad
    • Nike Fuel, while a good concept, is a black box. Couldn’t internalize it
    • Hours won notification of Nike+ is useless. It shows on the band’s display, but I miss it most of the time. Others notice the scrolling display !
  • The Ugly
    • The metrics like steps et al from both bands are so different that they can’t be compared
    • Fitbit is maturing. They add more devices in a short span and there is a fundamental difference between the versions. It would have been good if they had an upgrade plan
    • Tracking sleep on both is not that useful

Band Log:

  • Stardate 91454.39 Day 1 : Both devices fully charged
  • Stardate 91461.40 Day 3 : Have been wearing both devices for 3 days. Updating the comparison.
  • Stardate : Updated Summary above.

Comparison: (I will start filling in as I go along):

  • Aesthetics – Fashion vs. Functionality:

    • Of course, FuelBand+ is a lot better looking.
    • Fitbit should definitely change its looks.
  • Dashboard:

    • First cut, Fitbit looks better, maybe because I am used to it.
    • Day 3: Am used to Nike+ dashboard as well. The Nike+ dashboard on iPhone 5S looks very functional
  • Track:

    • Nike has only one main trackable feature NikeFuel against a goal.
    • Fitbit has multiple features – each one with its own goals
    • Two Screens after Day 1 below
    • Day-01-fb Day-01-n
    • Nike
      • Nike Fuel Graph : My Nike Fuel goal is 2500. Haven’t yet figured out how it is calculated
      • Hours Won is interesting. Keeps you moving
      • Steps, Calories – Informational. But they are different for the devices.
    • Fitbit
      • Very Active Minutes – 30 min is my goal. Good metric to track
      • Distance, Steps, Calories – Normal metrics
  • Gamification

    • Both devices have badges, trophies, buddy system (friends, groups). I haven’t explored them yet.
    • May be I will buddy with a turtle and feel good ;o)

  • Inactivity

    • Fitbit does nothing except log it
    • Nike has an alert mechanism. Let us see if it works
  • Accuracy:

    • I will jot down counters from both devices. Let us see how they stack up
    • Here are the screen shots for 2 days. I took the shots at the same time.
    • The devices do not agree at all. I think Fitbit is far off on the plus side while Nike might be closer.
      • This was my first concern with Fitbit and I had contacted their support. Didn’t get a satisfactory response. The support folks are very good, but I think this is a technical fault.
    • Day-02-fb  Day-02-n
    • Day-03-fb  Day-03-n
  • Sleep

    • Fitbit’s sleep tracking is a little awkward. You need to log the time you went to bed & the time you woke up. As far as I can tell, if you miss one day it is gone
    • Nike has the session feature, I haven’t yet tested it
  • Battery Life and ease of charging:

    • Fitbit sends an e-mail when the battery is low. Very cute and useful.
    • Let us see what Nike Fuelband+ has in store for us
  • Conclusions after 3 days

    • I think I will go with Nike+
      • Nike is more mature in many ways
      • Nike Fuel is a better motivation than FitBit
      • Overall Nike has a better form factor & a better app
      • Fitbit has better counters & goals for each metric. But they are not cohesive
      • Tracking sleep, while kludgy, is better with Fitbit ! (I never thought I would say this ;o))
      • Fitbit has the premium subscription ($50/yr) that gives more analytics.
        • But am not sure it is worth the price. I think it is an overkill.
        • And Nike has the analytics feature in the base product. Of course, Nike might add a paid feature set
      • Fitbit Flex lacks the display
        • I bought the Fitbit Flex and they have the Fitbit Force. Came out within 3-4 months after they introduced the Flex. I think they should have provided an upgrade path
        • I think Fitbit needs at least a couple more product revs to add a better display
  • Not so fast !

    • 11/11/13 : Nike+ FuelBand crashed ! I get a page full of errors when I connect it – even the Nike web site is crashing ! Looks like the FuelBand crashes the Nikeplus site !
    • NikeError
    • I couldn’t find a way to report this to Nike. Finally sent them a mail via their site & twitter
    • I also contacted the Fitbit guys to see if I can swap out the flex for their force. They released Flex too soon.
    • Let us see how the response is from both the companies …
    • 11/11/13 : Night
      • Heard from both. So support is good from Nike and Fitbit.
      • Nike’s twitter support came back. Mail via web site is stuck somewhere
      • Fitbit (via e-mail) politely refused to swap my Flex for Force.
        • I understand – I bought the device 5 months ago.
      • Nike Plus web site is up; my FuelBand Plus doesn’t crash the site anymore
      • Fuelband SE got reset & reinitialized. It is up & running (Unfortunately we can’t say the same about me ;o( I am static & typing)
      • One good side effect – In the process I discovered where Nike has the manual for FuelBand SE. The links are not that obvious. 

Big Data on the other side of the Trough of Disillusionment


5. Don’t implement a technology infrastructure but the end-to-end pipeline a.k.a. Bytes To Business

Simple Reason : Business doesn’t care about a shiny infrastructure, but about capabilities they can take to market …

AI-Arch-21-P199

4. Think Business Relevance and agility from multiple points of view

Aggregate Even Bigger Datasets, Scenarios and Use Cases

  • Be flexible, tell your stories, leveraging smart data, based on ever changing crisp business use cases & requirements

3. Big Data cuts across enterprise silos – facilitate organization change and adoption

  • Data has always been siloed, with each function having its own datasets – transactional as well as data marts
  • Big Data, by definition, is heterogeneous & multi-schema
  • Data refresh, source of truth, organizational politics and even fear come into the picture. Deal with them in a positive way

2. Build Data Products

1. tbd

  • One more for the road …