How GPUs Power Comcast's X1 Voice Remote and Smart Video Analytics
Jan Neumann, Comcast Labs DC
May 10th, 2017
Comcast Applied Artificial Intelligence Lab
Areas: Media & Video Analytics, Smart TV, Deep Learning, Data Science, Voice & NLP, Smart Home, Recommendations & Search, Smart Internet
Today: How Comcast Uses AI to Evolve and Reinvent the TV Experience
Areas: Media & Video Analytics, Smart TV, Deep Learning, Data Science, Voice & NLP, Smart Home, Recommendations & Search, Smart Internet
AI for Content Discovery: Voice Search
Content sources: Online Video, Netflix, Live TV
X1 Smart TV with Voice
Query: "HBO"
Pipeline: Voice remote -> ASR -> query -> NLP modules -> Answer Selector -> action -> Set-top Box -> TV
Open NLP: Multiple Domains with Voice
Pipeline: query -> Domain Selector -> domains (TV, HOME, NEWS, CUSTOMER CARE, ...) -> Answer Selector -> response
Open NLP: Multiple Domains with Voice
Query: "turn on the heat"
Domain Selector scores: TV = 0.15, HOME = 0.80, NEWS = 0.02, CUSTOMER CARE = 0.03
Threshold = 0.10
Selected = {TV, Home}
Applicable = {TV, Home}
Precision = 100%, Recall = 100%
Open NLP: Multiple Domains with Voice
Query: "Show me my password"
Domain Selector scores: TV = 0.04, HOME = 0.03, NEWS = 0.03, CUSTOMER CARE = 0.90
Threshold = 0.10
Selected = {Customer Care}
Applicable = {Customer Care}
Precision = 100%, Recall = 100%
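The two threshold examples above can be reproduced with a minimal Python sketch. The function names are hypothetical, the score-to-domain pairing follows the slide layout as read here, and treating a score equal to the threshold as selected is an assumption, since the slides do not say whether the comparison is strict.

```python
def select_domains(scores, threshold=0.10):
    """Return every domain whose score meets the threshold (>= is an assumption)."""
    return {domain for domain, score in scores.items() if score >= threshold}


def precision_recall(selected, applicable):
    """Precision and recall of the selected domain set against the applicable set."""
    hits = len(selected & applicable)
    precision = hits / len(selected) if selected else 1.0
    recall = hits / len(applicable) if applicable else 1.0
    return precision, recall


# Scores as shown on the two slides.
heat_scores = {"TV": 0.15, "HOME": 0.80, "NEWS": 0.02, "CUSTOMER CARE": 0.03}
password_scores = {"TV": 0.04, "HOME": 0.03, "NEWS": 0.03, "CUSTOMER CARE": 0.90}

heat_selected = select_domains(heat_scores)
password_selected = select_domains(password_scores)

heat_pr = precision_recall(heat_selected, {"TV", "HOME"})
password_pr = precision_recall(password_selected, {"CUSTOMER CARE"})
```

Lowering the threshold trades precision for recall: more domains receive the query, so fewer applicable domains are missed, but more non-applicable ones are consulted.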
Domain Selector in Practice
Cascade of Deep Learning Models of increasing complexity
Query: "HBO"
- Entity Detection Service: YES -> SEND TO DOMAIN; NO -> Simple Model
- Simple Model: YES -> SEND TO DOMAIN; NO -> Complex Model
- Complex Model: YES -> SEND TO DOMAIN; NO -> DO NOT SEND TO DOMAIN
Domain Selector in Practice
Cascade of Deep Learning Models of increasing complexity
Query: "Show me funny comedies"
- Entity Detection Service: YES -> SEND TO DOMAIN; NO -> Simple Model
- Simple Model: YES -> SEND TO DOMAIN; NO -> Complex Model
- Complex Model: YES -> SEND TO DOMAIN; NO -> DO NOT SEND TO DOMAIN
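One reading of the cascade is that each stage either accepts the query (send to domain) or defers to the next, more expensive stage, and only a "no" from the last stage rejects it. The Python sketch below illustrates that control flow only. The three stage functions, the entity catalog, and the keyword list are hypothetical stand-ins, not Comcast's models, and which stage accepts a given query here is a property of those stand-ins rather than a claim about the slides.

```python
from typing import Callable, List, Optional, Tuple

Stage = Tuple[str, Callable[[str], bool]]

KNOWN_ENTITIES = {"hbo"}                         # hypothetical entity catalog
TV_KEYWORDS = {"comedies", "movies", "episodes"}  # hypothetical keyword list


def entity_detection(query: str) -> bool:
    """Cheapest stage: exact match against a catalog of known entities."""
    return query.lower() in KNOWN_ENTITIES


def simple_model(query: str) -> bool:
    """Stand-in for a small model: fires on any known keyword."""
    return any(word in TV_KEYWORDS for word in query.lower().split())


def complex_model(query: str) -> bool:
    """Stand-in for the most expensive model; this placeholder never accepts."""
    return False


def route(query: str, stages: List[Stage]) -> Tuple[bool, Optional[str]]:
    """Run stages in order of cost; stop at the first one that accepts."""
    for name, accepts in stages:
        if accepts(query):
            return True, name   # SEND TO DOMAIN
    return False, None          # DO NOT SEND TO DOMAIN


CASCADE: List[Stage] = [
    ("entity detection", entity_detection),
    ("simple model", simple_model),
    ("complex model", complex_model),
]

hbo_result = route("HBO", CASCADE)
comedy_result = route("Show me funny comedies", CASCADE)
heat_result = route("turn on the heat", CASCADE)
```

The point of ordering stages by cost is that easy queries exit early, so the expensive model only runs on the queries the cheaper stages could not decide.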
X1 Smart TV with Voice
Query: "who plays the oracle in matrix"
Pipeline: Voice remote -> ASR -> query -> NLP modules -> action -> Set-top Box -> TV
QA module (within NLP modules): Question (text) -> QA -> Answer (id or text)
First-order Question Answering
Given:
- Question in natural-language form, q
- Structured knowledge base that contains a list of facts [subject, relation (attribute), object]
Return:
- Answer to q
Assuming:
- q is answerable by a single fact.
- The source entity is mentioned in q.
- The answer is a neighbor of the source entity node.
Example graph (nodes labeled subject, object, and attribute): Tom Hanks, 9/1/1956, Keanu Reeves, Matrix, Neo
Question Answering with Knowledge Graph
Question: "How old is Tom Hanks?"
- Extract names / titles -> Entities [e_1, ..., e_N]
- Predict relation (trained model) -> relation r
- Structured query: Subj = e_1, Rel = r, Obj = ?
- Search the knowledge graph (subj, rel, obj) -> e_1 r e_2
- Generate answer -> text answer
Question Answering with Knowledge Graph
Question: "How old is Tom Hanks?"
- Extract names / titles -> Entities [e_1, ..., e_N] = [Tom Hanks]
- Predict relation (trained model) -> relation r = birth
- Structured query: Subj = Tom Hanks, Rel = birth, Obj = ?
- Search the knowledge graph (subj, rel, obj) -> Tom Hanks, birth, 1956
- Generate answer -> "Tom Hanks is 59 years old." (an overlapping text layer on the slide reads "55 years old")
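The structured-query step can be sketched in Python as a lookup over subject-relation-object triples. This is a toy illustration under stated assumptions: the list-of-tuples graph, the `search` function, and the second fact are hypothetical, and a production knowledge graph would use an index rather than a linear scan. Turning the birth year into an age needs a reference date, which the slide does not state, so the sketch stops at the retrieved fact.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, object]  # (subject, relation, object)

KNOWLEDGE_GRAPH: List[Triple] = [
    ("Tom Hanks", "birth", 1956),            # fact shown on the slide
    ("Matrix", "stars", "Keanu Reeves"),     # illustrative only, relation name assumed
]


def search(subject: str, relation: str,
           graph: List[Triple] = KNOWLEDGE_GRAPH) -> Optional[object]:
    """Answer the query Subj=subject, Rel=relation, Obj=? with a single fact."""
    for subj, rel, obj in graph:
        if subj == subject and rel == relation:
            return obj
    return None  # no single fact answers the query


birth_year = search("Tom Hanks", "birth")
missing = search("Tom Hanks", "spouse")
```

Because first-order QA assumes the answer is a direct neighbor of the source entity, one lookup is enough; no multi-hop traversal is involved.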
Question Answering with Knowledge Graph using Recurrent Neural Networks (RNNs)
Pipeline: Question -> Entity Detection (names / titles [e_1, ..., e_N]) and Predict Relation (relation r) -> Structured Query: subj = e, rel = r, obj = ?
- Entity Detection ~ Tagging: an RNN with memory labels each word of "where Tom Hanks was born" as where = NA, Tom = Subj, Hanks = Subj, was = NA, born = NA
- Relation Prediction ~ Classification: an RNN with memory reads the whole question "where Tom Hanks was born" and outputs "place of birth"
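Once the tagger and the relation classifier have produced their outputs, assembling the structured query is a small bookkeeping step. The Python sketch below assumes the per-word tags and the predicted relation from the slide as given inputs; `extract_subject` and the dictionary layout of the query are hypothetical names chosen for illustration.

```python
from typing import Dict, List, Optional


def extract_subject(words: List[str], tags: List[str],
                    subject_tag: str = "Subj") -> str:
    """Join the first contiguous run of words tagged as the subject."""
    span: List[str] = []
    for word, tag in zip(words, tags):
        if tag == subject_tag:
            span.append(word)
        elif span:
            break  # the subject span has ended
    return " ".join(span)


# Tagger and classifier outputs as shown on the slide.
words = "where Tom Hanks was born".split()
tags = ["NA", "Subj", "Subj", "NA", "NA"]
relation = "place of birth"

# obj is None because it is the unknown the knowledge graph search fills in.
query: Dict[str, Optional[str]] = {
    "subj": extract_subject(words, tags),
    "rel": relation,
    "obj": None,
}
```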
Recurrent Neural Networks
Layers (bottom to top): input word -> hidden (memory) -> output
- Input "washington": output LOC = 0.39, PER = 0.61
- Input "heights": output LOC = 0.89, PER = 0.11
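The role of the hidden memory can be illustrated with a minimal Elman-style RNN tagger in plain Python. This is a sketch under stated assumptions: the weights and the two-dimensional word vectors are arbitrary toy values, not trained parameters, so the probabilities it produces will not match the slide. What it shows is that the output for "heights" depends on the hidden state left behind by "washington".

```python
import math
from typing import Dict, List

LABELS = ["LOC", "PER"]


def softmax(xs: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix: List[List[float]], vector: List[float]) -> List[float]:
    """Multiply a row-major matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def rnn_tag(inputs, w_in, w_rec, w_out) -> List[Dict[str, float]]:
    """Emit one label distribution per input word, carrying the hidden state forward."""
    hidden = [0.0] * len(w_rec)
    outputs = []
    for x in inputs:
        pre = [a + b for a, b in zip(matvec(w_in, x), matvec(w_rec, hidden))]
        hidden = [math.tanh(p) for p in pre]          # updated memory
        outputs.append(dict(zip(LABELS, softmax(matvec(w_out, hidden)))))
    return outputs


# Toy word vectors and weights (arbitrary values for illustration only).
EMBED = {"washington": [1.0, 0.0], "heights": [0.0, 1.0]}
W_IN = [[0.5, -0.3], [0.1, 0.8]]    # hidden x input
W_REC = [[0.2, 0.4], [-0.6, 0.3]]   # hidden x hidden
W_OUT = [[0.7, -0.2], [-0.5, 0.9]]  # label x hidden

in_context = rnn_tag([EMBED["washington"], EMBED["heights"]], W_IN, W_REC, W_OUT)
alone = rnn_tag([EMBED["heights"]], W_IN, W_REC, W_OUT)
```

Comparing `in_context[1]` with `alone[0]` shows the two distributions for "heights" differ, which is the context effect the slide's LOC/PER numbers illustrate.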
AI for Content Discovery: Automatic Content Analysis
Most metadata is at the asset level
- Genres
- Credits
- Synopsis
- Keywords
Much more data exists within the asset
- Chapters
- Moments
- Annotations
Levels of granularity: Movie, Chapter, Scene, Shot, Frame
Why is this useful?
- What are the best moments on TV?
- In-game highlight navigation
- Who is in this scene?
- Search & Recommendations
How does Automatic Content Analysis work?
Input: Video
Analysis: Computer Vision, Audio Analysis, Natural Language Processing, AI & Machine Learning
Output: Chaptering, Scene-level Annotations, Frame-level Annotations
Why is it possible now?
- Big Data
- Cloud/GPU Computing
- Better Algorithms (Deep Learning)
Chart: Large-scale image recognition performance
Super-human accuracy in speech and image recognition!
- Big Data
- Cloud/GPU Computing
- Better Algorithms (Deep Learning)
Chart: Large-scale image recognition performance
New experiences!
- Big Data
- Cloud/GPU Computing
- Better Algorithms (Deep Learning)
Example Application: In-Game Highlights
- Place highlights over games recorded onto customers' DVRs for football, baseball, hockey, basketball, and soccer.
- The In-Game Highlights feature for the NFL was released on Comcast X1 last fall.
Customer testimonial: "I'll record as many games as I can. When I don't want to watch the whole game, it's a great way to do it."
AI for Content Discovery: Personalization
Personalized Entertainment Experiences
What is popular right now? + What do you like? = Personalized Recommendations
What should I watch right now?
Deep learning-based recommender system for Live TV
- Training a joint embedding space to combine the scores
- Channel- and program-based recommendations
- Time-dependent recommendations
- Trending/popular and personal favorite channels, programs, sports teams
- Rich content descriptions from automatic content analysis
Inputs to the Live TV Recommender System: Favorite Channels, Favorite Programs, Content Descriptions, Collaborative Filtering, Trending Popularity
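The idea of combining several signals into one ranking can be sketched in Python as follows. This is a deliberately simplified stand-in: the slide describes a trained joint embedding space, whereas this sketch swaps in a fixed weighted sum so the combination step is easy to follow. The signal names mirror the slide's inputs, while the weights, program titles, and signal values are hypothetical.

```python
from typing import Dict, List

# Hypothetical fixed weights; in a learned system these would come from training.
WEIGHTS: Dict[str, float] = {
    "favorite_channel": 0.3,
    "favorite_program": 0.3,
    "collaborative": 0.2,
    "trending": 0.1,
    "content": 0.1,
}


def combined_score(signals: Dict[str, float],
                   weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of per-signal scores; missing signals count as zero."""
    return sum(weights[name] * signals.get(name, 0.0) for name in weights)


def recommend(candidates: Dict[str, Dict[str, float]], top_k: int = 2) -> List[str]:
    """Rank the programs airing right now by combined score, best first."""
    ranked = sorted(candidates,
                    key=lambda title: combined_score(candidates[title]),
                    reverse=True)
    return ranked[:top_k]


# Hypothetical signals for programs airing at the moment of the request.
now_airing = {
    "Program A": {"favorite_channel": 1.0, "favorite_program": 0.0,
                  "collaborative": 0.4, "trending": 0.9, "content": 0.2},
    "Program B": {"favorite_channel": 0.0, "favorite_program": 1.0,
                  "collaborative": 0.7, "trending": 0.1, "content": 0.6},
    "Program C": {"favorite_channel": 0.0, "favorite_program": 0.0,
                  "collaborative": 0.2, "trending": 0.3, "content": 0.1},
}

top_picks = recommend(now_airing)
```

Restricting the candidates to what is airing now is what makes the recommendation time-dependent in this sketch; the same viewer would get a different list an hour later.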
Deep Learning Infrastructure
Deep Learning Infrastructure
- Deep learning frameworks: Keras, TensorFlow, Theano, PyTorch, Caffe (older models)
- All deployments use nvidia-docker
- Thanks to the NVIDIA solutions team for help with best practices
- All deep learning training is done on multi-GPU servers
- NVIDIA Tesla (production) and 8x Titan X (dev) GPUs
- NVIDIA DGX-1 for large-scale training (video and NLP)
Next steps
- Container scheduler: Kubernetes and HashiCorp Nomad
- Network compression/simplification for increased efficiency (TensorRT)
Deep Learning-based ML is applied everywhere at Comcast
Machine Learning, Data Science, Big Data, AI: Improving Customer Experience Everywhere at Comcast/NBCU
Areas: High Speed Internet, Video, IP Telephony, Home Security / Automation, Universal Parks, Media Properties
For more info see: dclabs.comcast.com