The Art of an Artificial Intelligence Pipeline & Jen-Hsun Huang’s nVidia GTC Keynote


The background of this blog is the GPU Technology Conference’16 Amsterdam keynote by nVIDIA CEO Jen-Hsun Huang. Extremely eloquent, very knowledgeable, articulate and passionate – all great ingredients for a memorable keynote and you won’t be disappointed. He keeps the energy up for 2 hours – no curtains, no diversions, a feat on its own ! (P.S: I have heard from other folks that JH’s whiteboard talks are exponentially more informative and eloquent).

Popping the stack back to the main feature, the goal of this blog is to address a very thin slice of the talk – the AI backbone for a scalable infrastructure beyond the lab experiments. I have talked with many folks, and the topic of a scalable AI pipeline/infrastructure gets lost in the hyperbolic discussions on Artificial Intelligence and the rest …

First let us take a quick detour and put down some thoughts on an AI Backbone in four visualizations, nothing less-nothing more ….

nvidia-03
nvidia-05

With this background, let us take a look at the relevant slides from Jen-Hsun Huang’s eloquent GTC’16 Keynote at Amsterdam.

I. Computing is evolving from a CPU model to a GPU Model

  • Since 2012, GPU-based systems have surpassed human-level performance in fields like image recognition, translation and so forth

There is a new computing model – the GPU Computing pipeline with four components:

  1. Training,
  2. The network master models a.k.a the AI backbone,
  3. Inferencing in the data center (AI applications) and
  4. Inferencing in IoT/devices (eg. autonomous cars)

II. Training

  • This is where new architectures are tested with large amounts of data, transfer learning with pre-trained models and so forth.
  • I like nVidia’s DGX-1 or AWS GPU clouds. You can also build a local cluster with nVidia GPUs like the Titan-X (like mine, below).

III. The Master Models

  • Reference Models, curated data and other stuff live here. Engineers can interact with the models, morph them to newer domains (using Transfer Learning, for example) and so forth. They will train using the DGX-1s or GPU clouds

IV. Scoring the models at various contexts

  • Now we reap the benefits of our hard work, the applications ! They have two flavors – either run in the datacenter or run on devices – cars, drones, phones et al.
  • The Datacenter inferencing is relatively straightforward – host in a cluster or in the cloud. The hosting infrastructure can use GPUs.
  • The device inferencing is a little trickier – in my world, it is the Drive PX-2 for autonomous cars (you can see it in my picture of the desk)

The new Xavier Architecture is interesting – tastes better, fewer calories (er … power).

P.S: BTW, I like the view from the camera on the top right corner of the podium ! The slides have an elegant 3-D look !

V. Epilogue

Interesting domain, with a future.

“… something really big is around the corner … a brand new revolution, what people call the AI revolution, the beginning of the fourth industrial revolution … “

As you can see I really liked the keynote. Digressing, another informative & energetic presenter is Mobileye’s CTO/Chairman Prof. Amnon Shashua. I usually take time to listen and take notes.


The Bridges of Pittsburgh County … That Autonomous Cars can’t SLAM through !


Same as my post on LinkedIn

Autonomous cars do bring out interesting nuances to the normal things that we take for granted and don’t think twice about !

Business Insider’s article “Here’s why self-driving cars can’t handle bridges” fits this category.

“Bridges are really hard,… and there are like 500 bridges in Pittsburgh.”

Of course, it is the infinite (or near infinite) context that we humans can process and machines aren’t even close … But, one would think bridges would be easier – no distractions, well designed straight roads; of course with the current GPS accuracy, the car might think that it is in the water and start rolling out its fins !!

“You have a lot of infrastructure on the bridge above the level of the car that we as humans take into account, … But when you sense those things with a sensor that doesn’t have the domain knowledge that we do … you could imagine that the girders coming up from the side of the bridge and that kind of thing would be disturbing or possibly confusing.”

Pittsburgh does have bridges, lots of them … There is even a BBC documentary! Even how the city deals with the bridges is interesting.

In fact Pittsburgh is called “The City of Bridges”, even though some have different interpretations (we will come to that discussion in a minute)

While we are on the subject, I do have a couple of books for the Uber Car to read ! It can even order them through its robotic friend Alexa ! or drive to wherever fine books are sold, on its own time – Uber might not pay for the impromptu solo drive.

  1. Bob Regan’s Book is the first one to read
  2. Next is Pittsburgh’s Bridges (Images of America)
  3. The book Bridges… Pittsburgh at the Point… a Journey Through History gives interesting perspectives the riders would enjoy (of course, the ones with enquiring minds…)
  4. Finally, the hardcore bridge fans would be thrilled to hear from Pittsburgh’s Bridges: Architecture and Engineering

Now, to SLAM, it is the set of algorithms collectively called Simultaneous Localization And Mapping – a very interesting topic by itself.

In short, a SLAM system needs known points in addition to unknown points, to reason about & figure out its trajectory – bridges have fewer known points it can rely on …

We can definitely employ Deep Learning ConvNets as well as traditional computer vision; adding a dash of contextualization would be a good start … that is a topic for another time (sooner than later…). Probably an interesting opportunity for bridges.ai or openbridges.org

For those snappy Machine Learning experts, there is even a Pittsburgh Bridges Data Set at UCI, to start with ! Probably nowhere near the data needed to train modern Convolutional Nets, but one can augment the images with algorithms like Flip, Jitter, Random Crop and Scale et al.
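To make the augmentation idea concrete, here is a minimal sketch of Flip, Jitter, Random Crop and Scale on a tiny grayscale image held as a list of rows. It is only an illustration: the function names, the nearest-neighbour resize, the brightness-only jitter and the parameter values are my assumptions, not anything prescribed by the UCI data set or a particular library.

```python
import random


def flip_horizontal(img):
    """Mirror every row left-to-right."""
    return [row[::-1] for row in img]


def jitter(img, max_delta, rng):
    """Add a small random brightness offset per pixel, clamped to 0..255."""
    return [[min(255, max(0, p + rng.randint(-max_delta, max_delta)))
             for p in row] for row in img]


def random_crop(img, crop_h, crop_w, rng):
    """Cut a crop_h x crop_w window at a random position."""
    top = rng.randint(0, len(img) - crop_h)
    left = rng.randint(0, len(img[0]) - crop_w)
    return [row[left:left + crop_w] for row in img[top:top + crop_h]]


def scale_nearest(img, out_h, out_w):
    """Resize to out_h x out_w with nearest-neighbour sampling."""
    in_h, in_w = len(img), len(img[0])
    return [[img[r * in_h // out_h][c * in_w // out_w]
             for c in range(out_w)] for r in range(out_h)]


def augment(img, rng, copies=4):
    """Return `copies` randomly perturbed variants of one image (needs h, w >= 2)."""
    h, w = len(img), len(img[0])
    variants = []
    for _ in range(copies):
        v = flip_horizontal(img) if rng.random() < 0.5 else img
        v = random_crop(v, h - 1, w - 1, rng)  # drop one row and one column
        v = scale_nearest(v, h, w)             # scale back to the original size
        v = jitter(v, 10, rng)                 # hypothetical jitter strength
        variants.append(v)
    return variants
```

Calling `augment(img, random.Random(0), copies=5)` turns one image into five same-sized variants; real pipelines would do the same with an image library and far larger images.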

If we think Pittsburgh is difficult, wait until Uber starts autonomous driving in Amsterdam ! While Pittsburgh has 446 bridges, many sources put Amsterdam with over 1000 bridges that cars can travel. There are many bicycle and pedestrian bridges in Amsterdam that an Uber car wouldn’t be interested in – except, of course, to pick up the tired pedestrians ;o). The which-city-has-max-number-of-bridges discussions can be followed here:

  1. http://www.wtae.com/Just-How-Many-Bridges-Are-There-In-Pittsburgh/7685514
  2. http://nolongerslowblog.blogspot.com/2014/02/what-city-has-most-bridges-and-why-is.html
  3. https://www.quora.com/Which-city-has-the-most-bridges

Trivia:

  • The bridges in Pittsburgh are not painted Yellow (as one might tend to think) but Aztec Gold !
  • And yep, it is Allegheny County ! But Pittsburgh rhymes better ;o)

Cheers

 

5 Lessons on AI from the Tesla Autopilot Fatality


Unfortunately it takes extreme repercussions for us to feel in our bones, the limitations of our technologies.

Three points :

  1. I have included relevant links about this incident at the end of the blog (incl the AutoPilot v8 with Radar). Informative read
  2. One of my parents died in an automobile accident; so I do know, first hand, the human toll – I do not take this lightly; in fact the reverse is true
  3. And the views expressed in my writing are my own and do not reflect any organization I am part of … now or in the future …

 

Lesson 1 : Our machines inherit our faults (so far …)

As I pointed out in one of my AI blogs:

Robot-06

 

Lesson 2 : Many domains are not forgiving to byzantine failures

We are learning that painful lesson whether they are rockets, airplanes or cars. Even though we freak out if Snapchat is down for an hour, we can survive that, but not these. The drivers need to understand the downside of technologies and be alert.

Lesson 3 : Mission Critical Systems should have redundancy, over coverage & independence

For example multiple sensor sources & probably independent situational interpretation. I saw the following from somewhere where the Japanese Ministry talks about “correcting the wrong train of thoughts”:

Lesson 4 : Swarm Intelligence

Lesson 5 : This might lead to some level of Standardization & Legalization

  • Standardization of components & protocols
  • Legislation/Standardization of algorithms or semantic behaviors incl image recognition, policies and pragmas …
  • Even driver education and certification to drive autonomous vehicles !

Robot-05

Reference:

  1. http://fortune.com/2016/07/03/teslas-fatal-crash-implications/
  2. http://www.latimes.com/business/technology/la-fi-hy-tesla-google-20160701-snap-story.html
  3. https://cleantechnica.com/2016/07/02/tesla-autopilot-fatality-timeline-facts/
  4. https://www.teslamotors.com/blog/your-autopilot-has-arrived
  5. https://www.quora.com/How-does-Teslas-Autopilot-work-What-are-the-sensors-that-power-it
  6. http://www.gocomics.com/pcandpixel/2016/07/01
  7. Tesla’s Response to Fortune’s Article http://fortune.com/2016/07/06/tesla-fortune-response-autopilot/
  8. http://www.greencarreports.com/news/1104892_tesla-autopilot-crash-what-one-model-s-owner-has-to-say
  9. http://money.cnn.com/2016/07/07/technology/tesla-autopilot-name/
  10. http://fortune.com/2016/07/11/elon-musk-tesla-self-driving-cars/
  11. http://fortune.com/self-driving-cars-silicon-valley-detroit/
  12. http://in.reuters.com/article/us-autos-selfdriving-investment-idINKCN0ZS0CQ
  13. http://www.latimes.com/opinion/editorials/la-ed-self-driving-cars-20160710-snap-story.html
  14. http://gizmodo.com/teslas-autopilot-driving-mode-is-a-legal-nightmare-1783280289
  15. http://www.bbc.com/news/technology-36783345
  16. http://www.freep.com/story/money/cars/2016/07/14/consumer-reports-tesla-disable-autopilot/87074826/
  17. http://www.usatoday.com/story/money/cars/2016/07/17/what-tesla-autopilot-crash-means-self-driving-cars/87219126/
  18. http://fortune.com/2016/07/17/tesla-rethinking-radar-system/
  19. https://www.wired.com/2016/08/hackers-fool-tesla-ss-autopilot-hide-spoof-obstacles/
  20. https://www.autoindustrylawblog.com/2016/08/02/connected-and-autonomous-vehicles-full-speed-ahead-or-tapping-the-brakes/
  21. http://www.usatoday.com/story/money/cars/2016/08/10/tesla-model-s-autopilot-china-crash/88510532/
  22. Finally AutoPilot v8 with Radar ! https://www.tesla.com/blog/upgrading-autopilot-seeing-world-radar/

 

Yann LeCun @deeplearningcdf Collège de France


I am spending this weekend with Yann LeCun (virtually, of course) studying the excellent video Lectures and slides at the College de France. A set of 8 lectures by Yann LeCun (BTW pronounced as LuCaan) and 6 guest lectures. The translator does an excellent job – especially as it involves technical terms and concepts !

(I will post a few essential slides for each video …)

Inaugural Reading – Deep Learning: a Revolution in Artificial Intelligence

LeCunn-01

My favorite slide – of course !!! And the DGX-1 !!

Missing Pieces of AI – interesting …

Reasoning, attention, episodic memory and rational behavior based on a value system are my focus for autonomous vehicles (cars & drones!)

Convnets are everywhere !

Probably the most important slides of the entire talk – the future of AI.

Parse it a couple of times, it is loaded with things that we should pay attention to …

Can AI beat hardwired bad behaviors ?

LeCunn-06

I agree here, here, here and here – we don’t want AI to imitate us, but take us to higher levels !

Stay tuned for the rest of the video summaries …..

Google’s Jeff Dean on Scalable Predictive DeepLearning – A Kibitzer’s notes from RecSys 2014 (Note :


It is always interesting to hear from Jeff and understand what he is up to. I have blogged about his earlier talks at XLDB and at Stanford. Jeff Dean’s Keynote at RecSys2014 was no exception. The talk was interesting, the Q&A was stimulating and the links to papers … now we have more work ! – I have a reading list at the end.

Of course, you should watch it (YouTube Link) and go thru his keynote slides at the ACM Conference on Information and Knowledge Management. Highlights of his talk, from my notes …

dean-recsys-01

  • Build a system with simple algorithms and then throw lots of data – let the system build the abstractions. Interesting line of thought;
  • I remember hearing about it from Peter Norvig as well, i.e. Google is interested in algorithms that get better with data
  • An effective recommendation system requires context, i.e. an understanding of the user’s surroundings, previous behavior of the user, previous aggregated behavior of many other users and finally textual understanding.

dean-recsys-02-01


  • He then elaborated on one of the areas they are working on: semantic embeddings, paragraph vector and similar mechanisms

dean-recsys-03

Interesting concept of embedding similar things such that they are nearby in a high dimensional space!
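A tiny sketch of what “nearby” means here, using cosine similarity between vectors. The 3-dimensional vectors and the words below are made up purely for illustration (real embeddings are learned and have hundreds of dimensions), and the helper names are my own.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, embeddings):
    """Return the other key whose vector is most similar to the query's."""
    return max((k for k in embeddings if k != query),
               key=lambda k: cosine_similarity(embeddings[query], embeddings[k]))


# Hypothetical toy embeddings, hand-written for this example only.
toy = {
    "car": [0.9, 0.1, 0.0],
    "truck": [0.8, 0.2, 0.1],
    "banana": [0.0, 0.1, 0.9],
    "apple": [0.1, 0.2, 0.8],
}

print(nearest("car", toy))     # truck
print(nearest("banana", toy))  # apple
```

Similar things end up pointing in nearly the same direction, so a nearest-neighbour lookup in the embedding space doubles as a “things like this” query.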

  • Jeff then talked about using LSTM (Long Short-Term Memory) Neural Networks for translation.

pic09-01

  • Notes from Q & A:
    • The async training of the model and random initialization means that different runs will result in different models; but results are within epsilon
    • Currently, they are handcrafting the topology of these networks, i.e. how many layers, how many nodes, the connections et al. Evolving the architecture (for example adding a neuron when an interesting feature is discovered) is still a research topic.
      • Between ages of 2 & 4, our brain creates 500K neurons / sec and from 5 to 15, starts pruning them !
    • The models are opaque and do not have explainability. One way Google is approaching this is by building tools that introspect the models … interesting
    • These models work well for classification as well as ranking. (Note : I should try this – may be for a Kaggle competition. 2015 RecSys Challenge !)
    • Training CTR system on a nightly basis ?
    • Connections & Scale of the models
      • Vision : Billions of connections
      • Language embeddings : 1000s of millions of connections
      • If one has less data, one should have fewer parameters; otherwise the model will overfit
      • Rule of thumb : For sparse representations, one parameter per record
    • Paragraph vector can capture granular levels while a deep LSTM might be better in capturing the details – TBD
    • Debugging is still an art. Check the modelling; factor into smaller problems; see if different data is required
    • RBMs and energy based models have not found their way into Google’s production; NNs are finding applications
    • Simplification & Complexity : NNs, once you get them working, form these nice “Algorithmically simple computation mechanisms” in a darkish-brown box ! Fewer sub systems, less human engineering ! At a different axis of complexity
    • Embedding editorial policies is not easy, better to overlay them … [Note : We have an architecture where the pre and post processors annotate the recommendations/results from a DL system]
  • There are some interesting papers on both the topics that Jeff mentioned (This is my reading list for the next few months! Hope it is useful to you as well !):
    1. Efficient Estimation of Word Representations in Vector Space [Link]
    2. Paragraph vector : Distributed Representations of Sentences and Documents [Link]
    3. [Quoc V. Le’s home page]
    4. Distributed Representations of Words and Phrases and their Compositionality [Link]
    5. Deep Visual-Semantic Embedding Model [Link]
    6. Sequence to Sequence Learning with Neural Networks [Link]
    7. Building high-level features using large scale unsupervised learning [Link]
    8. word2vec Tool for computing continuous distributed representations of words [Link]
    9. Large Scale Distributed Deep Networks [Link]
    10. Deep Neural Networks for Object Detection [Link]
    11. Playing Atari with Deep Reinforcement Learning [Link]
    12. Papers by Google’s Deep Learning Team [Link to Vincent Vanhoucke’s Page]
    13. And, last but not least, Jeff Dean’s Page

The talk was cut off after ~45 minutes. Am hoping they would publish the rest and the slides. Will add pointers when they are on-line. Drop me a note if you catch them …

Update [10/12/14 21:49] : They have posted the second half ! Am watching it now !

 Context : I couldn’t attend the RecSys 2014; luckily they have the sessions on YouTube. Plan to watch, take notes & blog the highlights; Recommendation Systems are one of my interest areas.

  • Next : Netflix’s CPO Neil Hunt’s Keynote
  • Next + 1 : Future of Recommender Systems
  • Next + 2 : Interesting Notes from rest of the sessions
  • Oh man, I really missed the RecSysTV session. We are working on some addressable recommendations. Already reading the papers. Didn’t see the video for the RecSysTV sessions ;o(

Is it still “Artificial” Intelligence, if our Computers learn -to think- from the workings of our Brain ?


Image

  • In fact that would be Natural Intelligence ! Intelligence is intelligence – it is a way of processing information to arrive at inferences, recommendations, predictions and so forth …

Maybe it is that Contemporary AI is actually just NI !

Point #1 : Machines are thinking like humans rather than acting like Humans

  • Primitives inspired by Computational Neuroscience like DeepLearning are becoming mainstream. We are no longer enamored with Expert Systems that learn the rules & replace humans. We would rather have our machines help us chug through the huge amount of data.

We would rather interact with them via Google Glass – a two-way, highly interactive medium that acts as a sensor array as well as augments cognition with a digital overlay over the real world

  • In fact, till now, our computers were mere brutes, without the elegance and finesse of the human touch !
  • Now the computers are diverging from Newtonian determinism to probabilistic generative models.
  • Instead of using greedy algorithms, the machines are now being introduced to Genetic Algorithms & Simulated Annealing. They now realize that neither exhaustive brute force nor the local minima found by greedy search are the answers for all problems.
  • They now have knowledge graphs and have the capability to infer based on graph traversals and associated logic
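The greedy-versus-annealing point can be sketched in a few lines. This is a toy under my own assumptions: a hand-made 1-D cost landscape with one local and one global minimum, a geometric cooling schedule, and arbitrary step counts and temperatures.

```python
import math
import random

# Hypothetical cost landscape: local minimum at index 2 (cost 2),
# global minimum at index 8 (cost 0), separated by a ridge at index 5.
LANDSCAPE = [5, 3, 2, 3, 6, 8, 6, 3, 0, 2, 5]


def greedy_descent(values, start):
    """Always move to the better neighbour; stop when neither improves."""
    x = start
    while True:
        neighbours = [n for n in (x - 1, x + 1) if 0 <= n < len(values)]
        best = min(neighbours, key=lambda n: values[n])
        if values[best] >= values[x]:
            return x
        x = best


def simulated_annealing(values, start, rng, steps=5000, t_start=5.0, t_end=0.01):
    """Random walk that sometimes accepts uphill moves, less often as it cools."""
    x = best = start
    for i in range(steps):
        t = t_start * (t_end / t_start) ** (i / steps)  # geometric cooling
        candidate = x + rng.choice((-1, 1))
        if not 0 <= candidate < len(values):
            continue
        delta = values[candidate] - values[x]
        # Downhill moves are always taken; uphill with probability exp(-delta/t).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            x = candidate
            if values[x] < values[best]:
                best = x
    return best  # best position seen during the walk
```

Starting from index 0, `greedy_descent(LANDSCAPE, 0)` stops at the local minimum (index 2), while `simulated_annealing(LANDSCAPE, 0, random.Random(42))` is allowed to climb over the ridge and will typically end up at the global minimum; being stochastic, it offers no guarantee on any single run.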

Of course, deterministic transactional systems have their important place – we don’t want a probabilistic bank balance!

Point #2 : We don’t even want our machines to be like us

  • The operative word is “Augmented Cognition” – our machines should help us where we are not strong and augment our capabilities. More later …
  • Taking a cue from the contemporary media, “Person Of Interest” is a better model than “I,Robot” or “Almost Human” – a Mr.Spock, rather than a Sonny; Logical but resorts to the improbable and the random, when the impossible has been eliminated !

Point #3 : Now we are able to separate Interface from Inference & Intelligence

AI-03

  • New Yorker asks, “Why can’t my computer understand me?” Finding answers to questions like “Can an alligator run the hundred-meter hurdles?” is syntax.
  • NLP (Natural Language Processing) and its first cousin NLU (Natural Language Understanding) are not intelligence, they are interface.
  • In fact, the team that built IBM Watson realized that “they didn’t need a genius, … but build the world’s most impressive dilettante … battling the efficient human mind with spectacular flamboyant inefficiency”.

Taking this line of thought to its extreme, one can argue that Google (Search) itself is the case in point of an ostentatious and elaborate infrastructure for what it does … no intelligence whatsoever – Artificial or Natural ! It should have been based on knowledge graph rather than a referral graph. Of course, in a few years, they would have made huge progress, no doubt.

  • BTW, Stephen Baker has captured the “Philosophy of an Intelligent Machine” very well.
  • I have been & am keeping track of the progress by Watson.
  • Since then, IBM Watson itself has made rapid progress in the areas of Knowledge Traversal & Contextual Probabilistic Inferences i.e. ingest large volume of unstructured data/knowledge & reason about it
  • I am not trivializing the effort and the significance of machines to understand the nuances of human interactions (speech, sarcasm, slang, irony, humor, satire et al); but we need to realize that, that is not an indication of intelligence or a measure of what machines can do.

Human Interface is not Human Intelligence, same with machines. They need not look like us, walk like us, or even talk like us. They just need to augment us where we are not strong … with the right interface, of course

  • Gary Marcus, in the New Yorker article “Can Super Mario Save AI”, says “Human brains are remarkably inefficient in some key ways: our memories are lousy; our grasp of logic is shallow, and our capacity to do arithmetic is dismal. Our collective cognitive shortcomings are so numerous … And yet, in some ways, we continue to far outstrip the very silicon-based computers that so thoroughly kick our carbon-based behinds in arithmetic, logic, and memory …”

Well said Gary. Humans & Machines should learn from the other and complement … not mimic each other … And there is nothing Artificial about it …

I really wish we take “Artificial” out of AI – Just incorporate what we are learning about ourselves into our computers & leave it at that !

Finally:

AI-04-01

Deep Learning for the masses


Updates:

Back to the main feature …

An interesting blog in GigaOm by Derrick Harris on Deep Learning for the masses. What interested me most was Jeremy Howard from Kaggle.

DL-Dean-12

  • “…It’s going to enable whole new classes of products that have never existed before …”
  • But there’s a catch: deep learning is really hard. So far, only a handful of teams in hundreds of Kaggle competitions have used it. Most of them have included Geoffrey Hinton or have been associated with him.
    • Yep, it is hard. We are trying to bootstrap an application system and haven’t even scratched the surface – so it seems
  • If data scientists in places outside Google could simply (a relative term if ever there was one) input their multidimensional data and train models to learn it, that could make other approaches to predictive modeling all but obsolete.
    • Yep. Deep Learning is being applied in image recognition, translation et al. It would be interesting to see how the technologies can be applied to retail, banking, manufacturing et al

I also think the broader architecture of the three amigos viz Interface, Inference & Intelligence needs to come together

Finally,

Smarter Models = Smarter Apps – Yep, definitely !