Machine Learning Techniques at the core of AlphaGo success


1 Machine Learning Techniques at the core of AlphaGo success
Stéphane Sénécal, Orange Labs
Paris Machine Learning Applications Group Meetup, 14/09/

2 Some facts... (1/3)
AlphaGo: a computer program, designed by Google DeepMind, that plays the game of Go.

3 Some facts... (2/3)
Breakthrough! AlphaGo defeated the European Go champion Fan Hui in 2015, 5 games to 0!
Google DeepMind video: Ground-breaking AlphaGo masters the game of Go

4 Some facts... (3/3)
Breakthrough!!! AlphaGo defeated the world-class professional Go player Lee Se-dol, 4 games to 1!!! (match ended 15 March 2016)

5 Questions... (1/2)
Game of Go?
- What is the game of Go?
- Why is it a complex game to play?

6 Questions... (2/2)
AlphaGo Machine Learning (ML) system?
- How is AlphaGo built?
- How does it work?
- What are the main ML techniques constituting the system?

7 Machine Learning at the core of AlphaGo success
Outline:
1. (Context: AlphaGo and its success)
2. Survey of the game of Go and of its complexity
3. High-level introduction to the AlphaGo ML system
4. Take-away messages, references

8 Sections: The Game of Go; Complexity of Go; Reducing the Complexity
Go (1/3): How to play?
Board with a grid of lines. Players take turns placing black and white stones on the intersections of the lines on the board (in the pictured example, numbers represent game turns).

9 Go (2/3): Aim of the Game
Conquer a larger part of the board than your opponent. Your part counts the stones you placed on the board plus the stones that could be added inside your own walls.

10 Go (2/3): Aim of the Game
Counting: 22 vs 27, so black wins this game by 5 points.

11 Go (3/3): Game Example (272 moves)

12 Complexity? (1/4)
Go is a game with perfect information:
- each player can see all of the pieces on the board at all times
- it is therefore possible to determine the game outcome under the hypothesis of perfect play by the players
Optimal value function:
- input: any board configuration
- output: the outcome of the game, for example +1 if you win and -1 if your opponent wins

13 Complexity? (2/4)
Playing Go perfectly?
The game can be solved by computing the optimal value function in a search tree. This tree contains $b^d$ possible sequences of moves, where:
- $b$ = the tree's breadth: number of possible moves per position
- $d$ = the tree's depth: game length

14 Complexity (3/4): Search Tree
Tic-Tac-Toe example (tree breadth = 3, tree depth = 3)
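To make the idea of solving a game by exhaustive tree search concrete, here is a minimal sketch that computes the optimal value function for full Tic-Tac-Toe. The board encoding and the function names (`winner`, `optimal_value`) are assumptions made for this illustration and do not come from the slides; a draw is given the value 0 in addition to the +1/-1 outcomes.

```python
from functools import lru_cache

# Board: tuple of 9 cells ("X", "O" or " "), indexed row by row.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def optimal_value(board, player):
    """Outcome under perfect play for `player`, who is about to move:
    +1 = win, 0 = draw, -1 = loss."""
    opponent = "O" if player == "X" else "X"
    if winner(board) == opponent:      # the previous move ended the game
        return -1
    moves = [i for i, cell in enumerate(board) if cell == " "]
    if not moves:                      # full board, nobody won
        return 0
    # Exhaustive search: try every move, assume the opponent replies optimally.
    return max(-optimal_value(board[:i] + (player,) + board[i + 1:], opponent)
               for i in moves)


EMPTY = (" ",) * 9
print(optimal_value(EMPTY, "X"))  # 0: Tic-Tac-Toe is a draw under perfect play
```

This exhaustive approach only works because the Tic-Tac-Toe tree is tiny; the same code structure is hopeless for Go, which is the point of the following slides.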

15 Complexity... (4/4)
For classical and popular games:
- Chess: $b \approx 35$ and $d \approx 80$, so $b^d \approx 10^{123}$
- Go: $b \approx 250$ and $d \approx 150$, so $b^d \approx 10^{360}$
These magnitudes exceed the number of atoms in the Universe, so exhaustive search of optimal game strategies is infeasible.
Huge search space for choosing efficient game strategies:
- difficulty of evaluating board configurations (i.e. predicting the outcome of the game from a board configuration)
- difficulty of selecting moves
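The orders of magnitude can be checked with a few lines of arithmetic. The sketch below works with logarithms, since $\log_{10}(b^d) = d \log_{10} b$; the figure of roughly $10^{80}$ atoms is a commonly quoted estimate and is an assumption here, not a number taken from the slides.

```python
import math


def log10_tree_size(breadth, depth):
    """Return log10(b^d) = d * log10(b), without building huge integers."""
    return depth * math.log10(breadth)


ATOMS_LOG10 = 80  # assumption: commonly quoted rough estimate, 10^80 atoms

for name, b, d in [("Chess", 35, 80), ("Go", 250, 150)]:
    exponent = log10_tree_size(b, d)
    print(f"{name}: b^d is about 10^{exponent:.1f}, "
          f"that is 10^{exponent - ATOMS_LOG10:.1f} times the atom estimate")
```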

16 Reducing the Complexity
Searching in the tree can be simplified via two intuitive approaches:
- reducing the depth of the search tree
- reducing the breadth of the search tree

17 Reducing the Complexity: Tree Depth (1/3)
Reduction of tree depth by board configuration evaluation:
- truncate the search tree at a given level

18 Reducing the Complexity: Tree Depth (2/3)
Reduction of tree depth by board configuration evaluation:
- replace the true optimal value function by an approximation for the subtree below the cut
- this approximation predicts the outcome of the game from the current board configuration

19 Reducing the Complexity: Tree Depth (3/3)
Reduction of tree depth by board configuration evaluation:
- truncate the search tree at a given level
- replace the true optimal value function by an approximation for the subtree below the cut
- this approximation predicts the outcome of the game from the current board configuration
Performance: leads to efficient (superhuman!) performance in games like Chess, Checkers/Draughts and Othello, but was believed to be intractable for Go due to its complexity.
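Depth truncation can be sketched on a toy game rather than on Go. In the hypothetical take-away game below, players alternately remove 1 to 3 stones and whoever takes the last stone wins; the search stops after `depth` plies and calls an `evaluate` function in place of the subtree below the cut. The game, the function names and the heuristic are all assumptions chosen for illustration.

```python
def truncated_negamax(stones, depth, evaluate):
    """Value for the player to move: +1 = win, -1 = loss.
    The tree is cut after `depth` plies and `evaluate` replaces the subtree."""
    if stones == 0:
        return -1.0                 # the opponent just took the last stone
    if depth == 0:
        return evaluate(stones)     # approximate value below the cut
    return max(-truncated_negamax(stones - take, depth - 1, evaluate)
               for take in (1, 2, 3) if take <= stones)


def heuristic(stones):
    """Hypothetical evaluation function. For this toy game it happens to be
    exact (multiples of 4 are lost for the player to move); in Go no such
    closed form is known and the approximation must be learned."""
    return -1.0 if stones % 4 == 0 else 1.0


print(truncated_negamax(21, 2, heuristic))   # 1.0: the player to move wins
print(truncated_negamax(20, 2, heuristic))   # -1.0: the player to move loses
```

The quality of the truncated search depends entirely on the quality of `evaluate`, which is why a good approximation of the value function matters so much.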

20 Reducing the Complexity: Tree Breadth (1/2)
Reduction of tree breadth by move selection:
- instead of performing an exhaustive search among all possible moves,
- sample an efficient move from a probability distribution (a "policy") over all possible moves from the current board configuration

21 Reducing the Complexity: Tree Breadth (2/2)
Reduction of tree breadth by move selection:
- instead of performing an exhaustive search among all possible moves,
- sample an efficient move from a probability distribution (a "policy") over all possible moves from the current board configuration
Performance: leads to efficient (superhuman!) performance in games like Backgammon and Scrabble, but only to a weak amateur playing level in Go.
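Breadth reduction can be illustrated with a policy stored as a plain dictionary. The sketch below shows two options: sampling one move according to the policy, or keeping only the most probable moves for further search. The move names and probabilities are made up for the example.

```python
import random


def sample_move(policy, rng):
    """Draw one move from a policy given as {move: probability}."""
    moves = list(policy)
    weights = [policy[m] for m in moves]
    return rng.choices(moves, weights=weights, k=1)[0]


def top_moves(policy, k):
    """Keep only the k most probable moves, discarding the rest of the breadth."""
    return sorted(policy, key=policy.get, reverse=True)[:k]


# Hypothetical policy over four candidate moves (probabilities sum to 1).
policy = {"D4": 0.55, "Q16": 0.30, "C3": 0.10, "K10": 0.05}

rng = random.Random(0)          # seeded for reproducibility
print(sample_move(policy, rng))
print(top_moves(policy, 2))     # ['D4', 'Q16']
```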

22 Sections: Deep Neural Networks; AlphaGo Deep Learning Architecture; AlphaGo ML Training/Learning Techniques
Google DeepMind?
Reducing the depth and breadth of the search tree with classical approaches is not efficient enough for playing Go at a professional level!
Quick review of Google DeepMind's article [Silver et al. 2016]

23 AlphaGo Summary
Reducing the complexity with deep neural networks:
- evaluation of board configurations (prediction of the game outcome for a given board configuration, reduces tree depth): value networks
- selection of moves (reduces tree breadth): policy networks
Deep neural networks trained by a combination of:
- supervised learning from a dataset of human expert games
- reinforcement learning from a dataset of games of self-play
(The search algorithm in the tree uses Monte Carlo simulation techniques with value networks and policy networks.)
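The Monte Carlo ingredient can be sketched independently of neural networks: a position is evaluated by playing many simulated games to the end and averaging the outcomes. The example below uses uniformly random playouts on a hypothetical take-away game (remove 1 to 3 stones, taking the last stone wins). This is an assumption for illustration only and not AlphaGo's actual search, which guides the simulations with its policy and value networks.

```python
import random


def rollout(stones, rng):
    """Play uniformly random moves until the game ends (requires stones >= 1).
    Return +1 if the player to move at the start wins, -1 otherwise."""
    sign = 1                             # +1 while the starting player moves
    while True:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return sign                  # whoever just moved took the last stone
        sign = -sign


def monte_carlo_value(stones, n_rollouts, rng):
    """Average rollout outcome: a noisy estimate of the position's value."""
    return sum(rollout(stones, rng) for _ in range(n_rollouts)) / n_rollouts


rng = random.Random(0)                   # seeded for reproducibility
print(monte_carlo_value(10, 1000, rng))  # some value between -1 and 1
```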

24 Starting Point: Neural Networks

25 Deep Neural Networks
Recent advances in Machine Learning (Artificial Intelligence)
Deep Learning: deep/convolutional neural networks
- improve performance for pattern recognition applications in computer vision
- construct increasingly abstract and localized representations of image data
Core idea to design the AlphaGo ML system: employ a similar architecture/model for the game of Go

26 Example of Convolutional Neural Network (1/2): Modeling and Training/Learning

27 Example of a Convolution Kernel
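Since the kernel figure did not survive the transcription, here is a small stand-in example. It slides a 2 × 2 kernel over a 4 × 4 image and computes the weighted sum at each position, which is the operation a convolutional layer applies (strictly speaking a cross-correlation, without padding and with stride 1). The image and the edge-detecting kernel are made up for this sketch.

```python
def convolve2d(image, kernel):
    """'Valid' 2D cross-correlation of two lists of lists (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


# Image with a dark left half (0) and a bright right half (1).
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]

# Kernel responding to a dark-to-bright transition from left to right.
vertical_edge = [[-1, 1],
                 [-1, 1]]

print(convolve2d(image, vertical_edge))  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The output is non-zero only where the kernel straddles the edge, which is the sense in which a kernel detects a localized pattern.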

28 Convolutional Neural Network (2/2): Prediction/Testing
Predicted classes and scores shown in the figure: Samoyed 16; Papillon 5.7; Pomeranian 2.7; Arctic fox 1.0; Eskimo dog 0.6; white wolf 0.4; Siberian husky

29 AlphaGo in a Nutshell
Deep learning architecture:
- picture the board configuration as an image
- use convolutional neural networks to build a representation of the board configuration
Deep neural networks are used to reduce the depth and breadth of the search tree:
- evaluating board configurations and predicting game outcomes via value networks (reduces the depth of the search tree)
- sampling possible moves from policy networks (reduces the breadth of the search tree)
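Picturing the board as an image amounts to encoding it as a stack of 2D planes, much like the color channels of a picture. The sketch below uses three binary planes (own stones, opponent stones, empty points) on a 3 × 3 toy board. The choice of planes and the function name `encode_board` are assumptions for illustration; the actual system uses a richer set of input features.

```python
def encode_board(board, to_play):
    """Encode a board (list of strings over 'B', 'W', '.') as three binary
    planes: stones of the player to move, opponent stones, empty points."""
    opponent = "W" if to_play == "B" else "B"

    def plane(symbol):
        return [[1 if cell == symbol else 0 for cell in row] for row in board]

    return [plane(to_play), plane(opponent), plane(".")]


board = ["B..",
         ".W.",
         "..B"]

for p in encode_board(board, "B"):
    print(p)
```

Each plane has the same height and width as the board, so the stack can be fed to a convolutional network exactly like a multi-channel image.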

30 AlphaGo Deep Neural Networks Models (1/2)
Value network (reduces tree depth):
- takes an image representation of the board configuration as input
- passes it to a convolutional neural network model (estimated by regression)
- outputs a numerical approximation of the optimal value function
The value predicts the expected game outcome for a given board configuration.

31 AlphaGo Deep Neural Networks Models (2/2)
Policy network (reduces tree breadth):
- takes an image representation of the board configuration as input
- passes it to a convolutional neural network model (estimated by supervised learning or by reinforcement learning)
- outputs a probability distribution for sampling efficient moves given the board configuration
The policy is a probability map over the board for sampling efficient moves.
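The two networks differ mainly in their output layer. As a minimal sketch, assuming the convolutional layers have already produced one raw score per board point (for the policy) and one scalar score (for the value), the output stages could look as follows. The masked softmax, the tanh squashing and all names are illustrative assumptions, not a description of the exact AlphaGo layers.

```python
import math


def policy_head(logits, legal):
    """Masked softmax: turn raw scores into a probability distribution
    over legal moves only (at least one move must be legal)."""
    top = max(l for l, ok in zip(logits, legal) if ok)   # for numerical stability
    exps = [math.exp(l - top) if ok else 0.0 for l, ok in zip(logits, legal)]
    total = sum(exps)
    return [e / total for e in exps]


def value_head(score):
    """Squash a real-valued score into (-1, 1): close to +1 predicts a win,
    close to -1 predicts a loss."""
    return math.tanh(score)


logits = [2.0, 0.5, -1.0, 0.0]          # hypothetical scores for 4 board points
legal = [True, True, False, True]       # the third point is not playable

print(policy_head(logits, legal))       # probabilities, 0.0 for the illegal move
print(value_head(0.3))                  # predicted outcome between -1 and 1
```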

32 AlphaGo ML Training/Learning Global Scheme/Pipeline

33 Reinforcement Learning Framework (1/2)

34 Reinforcement Learning Framework (2/2)
Reinforcement learning goal: maximize rewards by choosing, according to a policy, adequate actions for given observations.
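The observation/action/reward loop can be sketched on a toy environment. In the hypothetical corridor below, the observation is the agent's position, the actions are one step left (-1) or right (+1), each step costs 0.1 and reaching the right end pays +1. The environment, the reward values and the two hand-written policies are assumptions for illustration; a learning algorithm would search for the policy with the highest total reward.

```python
def run_episode(policy, n_cells=5, max_steps=20):
    """Agent-environment loop: observe the position, choose an action with
    `policy`, receive a reward; return the total reward of the episode."""
    position, total_reward = 0, 0.0
    for _ in range(max_steps):
        action = policy(position)                      # action from observation
        position = max(0, min(n_cells - 1, position + action))
        if position == n_cells - 1:                    # goal reached
            total_reward += 1.0
            break
        total_reward -= 0.1                            # small cost per step
    return total_reward


def always_right(observation):
    return +1


def always_left(observation):
    return -1


print(run_episode(always_right))   # about 0.7: three step costs, then the goal
print(run_episode(always_left))    # about -2.0: twenty step costs, no goal
```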

35 Reinforcement Learning for Computer Go

36 AlphaGo Reinforcement Learning Framework

37 AlphaGo Reinforcement Learning Framework
The reinforcement learning policy network optimizes the final outcome of games of self-play against its previous versions.
(Reinforcement learning combined with deep neural networks is also efficient for learning how to play classic video games!)
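Optimizing the final outcome of self-play games can be written as a policy-gradient update. As a sketch in the spirit of [Silver et al. 2016], with notation that does not appear on the slides: $p_\rho(a_t \mid s_t)$ is the probability the policy network with parameters $\rho$ assigns to the move $a_t$ played in position $s_t$, and $z_t = \pm 1$ is the final outcome of the game from the point of view of the player who moved at step $t$.

```latex
\Delta\rho \;\propto\; \frac{\partial \log p_\rho(a_t \mid s_t)}{\partial \rho}\, z_t
```

Moves played in games that were eventually won ($z_t = +1$) become more probable, and moves played in lost games ($z_t = -1$) become less probable.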

38 Sections: Take Away Messages; References
Key/Take Away Messages (1/2)
Computer Go: tractable in theory but quite complex in practice, since it requires searching in a tree of sequences of moves.
Core idea: picture the board configurations as images and use deep neural networks to build an approximate search tree that is easier to solve.
Performing training/learning efficiently requires:
- ad hoc and efficient algorithms
- massive datasets: 30M expert moves for initializing the reinforcement learning policy network for the games vs Fan Hui
- huge computational resources: 1202 CPUs, plus GPUs, for playing the games vs EU Go champion Fan Hui

39 Key/Take Away Messages (2/2)
Deep neural networks in AlphaGo aim at reducing depth and breadth in the original search tree:
- by evaluating board configurations via value networks (predicting the outcomes of the games)
- by sampling game moves from policy networks (computed in particular with reinforcement learning)
AlphaGo, Computer Go and Artificial Intelligence: playing Go is a very specific task, with 2 convenient properties:
- possibility to generate games and to perform self-play
- stationary problem: game rules do not change over time (as in computer vision and natural language processing)
But general AI still remains an open and hard problem!

40 AlphaGo and Beyond... (1/2)
- David Silver et al. (2016), Mastering the game of Go with deep neural networks and tree search, Nature 529, 28 January 2016
- Volodymyr Mnih et al. (2015), Human-level control through deep reinforcement learning (video games), Nature 518, 26 February 2015
- Richard Sutton and Andrew Barto (1998), Reinforcement learning: an introduction, MIT Press

41 AlphaGo and Beyond... (2/2)
- Yann LeCun et al. (1990), Handwritten digit recognition with a back-propagation network, in Proc. of NIPS, 1990
- Geoffrey Hinton, Simon Osindero and Yee-Whye Teh (2006), A fast learning algorithm for deep belief nets, Neural Computation 18(7), 2006
- Yann LeCun, Yoshua Bengio and Geoffrey Hinton (2015), Deep learning (review), Nature 521, 28 May 2015

42 Thank you!
Thanks for your attention! Questions? (stephane.senecal@orange.com)
Credits: Anaëlle Laurans, Vincent Lemaire, Henri Sanson, Mikael Orange Labs and Demis Hassabis@DeepMind!
This work is supported by the collaborative research projects ANR NETLEARN (ANR-13-INFR-0004) and EU H2020 5G-PPP COGNET

Neural Networks and Tree Search

Neural Networks and Tree Search Mastering the Game of Go With Deep Neural Networks and Tree Search Nabiha Asghar 27 th May 2016 AlphaGo by Google DeepMind Go: ancient Chinese board game. Simple rules, but far more complicated than Chess

More information

Go in numbers 3,000. Years Old

Go in numbers 3,000. Years Old Go in numbers 3,000 Years Old 40M Players 10^170 Positions The Rules of Go Capture Territory Why is Go hard for computers to play? Brute force search intractable: 1. Search space is huge 2. Impossible

More information

CME 213 SPRING Eric Darve

CME 213 SPRING Eric Darve CME 213 SPRING 2017 Eric Darve MPI SUMMARY Point-to-point and collective communications Process mapping: across nodes and within a node (socket, NUMA domain, core, hardware thread) MPI buffers and deadlocks

More information

Experiments with Tensor Flow

Experiments with Tensor Flow Experiments with Tensor Flow 06.07.2017 Roman Weber (Geschäftsführer) Richard Schmid (Senior Consultant) A Smart Home? 2 WEBGATE WELTWEIT WebGate USA Boston WebGate Support Center Brno, Tschechische Republik

More information

Artificial Intelligence. Game trees. Two-player zero-sum game. Goals for the lecture. Blai Bonet

Artificial Intelligence. Game trees. Two-player zero-sum game. Goals for the lecture. Blai Bonet Artificial Intelligence Blai Bonet Game trees Universidad Simón Boĺıvar, Caracas, Venezuela Goals for the lecture Two-player zero-sum game Two-player game with deterministic actions, complete information

More information

Foundations of Artificial Intelligence

Foundations of Artificial Intelligence Foundations of Artificial Intelligence 45. AlphaGo and Outlook Malte Helmert and Gabriele Röger University of Basel May 22, 2017 Board Games: Overview chapter overview: 40. Introduction and State of the

More information

Lecture 17: Neural Networks and Deep Learning. Instructor: Saravanan Thirumuruganathan

Lecture 17: Neural Networks and Deep Learning. Instructor: Saravanan Thirumuruganathan Lecture 17: Neural Networks and Deep Learning Instructor: Saravanan Thirumuruganathan Outline Perceptron Neural Networks Deep Learning Convolutional Neural Networks Recurrent Neural Networks Auto Encoders

More information

Applications of Reinforcement Learning. Ist künstliche Intelligenz gefährlich?

Applications of Reinforcement Learning. Ist künstliche Intelligenz gefährlich? Applications of Reinforcement Learning Ist künstliche Intelligenz gefährlich? Table of contents Playing Atari with Deep Reinforcement Learning Playing Super Mario World Stanford University Autonomous Helicopter

More information

Neural Networks. Single-layer neural network. CSE 446: Machine Learning Emily Fox University of Washington March 10, /10/2017

Neural Networks. Single-layer neural network. CSE 446: Machine Learning Emily Fox University of Washington March 10, /10/2017 3/0/207 Neural Networks Emily Fox University of Washington March 0, 207 Slides adapted from Ali Farhadi (via Carlos Guestrin and Luke Zettlemoyer) Single-layer neural network 3/0/207 Perceptron as a neural

More information

Topics in AI (CPSC 532L): Multimodal Learning with Vision, Language and Sound. Lecture 12: Deep Reinforcement Learning

Topics in AI (CPSC 532L): Multimodal Learning with Vision, Language and Sound. Lecture 12: Deep Reinforcement Learning Topics in AI (CPSC 532L): Multimodal Learning with Vision, Language and Sound Lecture 12: Deep Reinforcement Learning Types of Learning Supervised training Learning from the teacher Training data includes

More information

CME 213 SPRING Eric Darve

CME 213 SPRING Eric Darve CME 213 SPRING 2017 Eric Darve Final project Final project is about implementing a neural network in order to recognize hand-written digits. Logistics: Preliminary report: Friday June 2 nd Final report

More information

PSU Student Research Symposium 2017 Bayesian Optimization for Refining Object Proposals, with an Application to Pedestrian Detection Anthony D.

PSU Student Research Symposium 2017 Bayesian Optimization for Refining Object Proposals, with an Application to Pedestrian Detection Anthony D. PSU Student Research Symposium 2017 Bayesian Optimization for Refining Object Proposals, with an Application to Pedestrian Detection Anthony D. Rhodes 5/10/17 What is Machine Learning? Machine learning

More information

Demystifying Machine Learning

Demystifying Machine Learning Demystifying Machine Learning Dmitry Figol, WW Enterprise Sales Systems Engineer - Programmability @dmfigol CTHRST-1002 Agenda Machine Learning examples What is Machine Learning Types of Machine Learning

More information

Convolutional Neural Networks

Convolutional Neural Networks Lecturer: Barnabas Poczos Introduction to Machine Learning (Lecture Notes) Convolutional Neural Networks Disclaimer: These notes have not been subjected to the usual scrutiny reserved for formal publications.

More information

Advanced Introduction to Machine Learning, CMU-10715

Advanced Introduction to Machine Learning, CMU-10715 Advanced Introduction to Machine Learning, CMU-10715 Deep Learning Barnabás Póczos, Sept 17 Credits Many of the pictures, results, and other materials are taken from: Ruslan Salakhutdinov Joshua Bengio

More information

Convolutional Restricted Boltzmann Machine Features for TD Learning in Go

Convolutional Restricted Boltzmann Machine Features for TD Learning in Go ConvolutionalRestrictedBoltzmannMachineFeatures fortdlearningingo ByYanLargmanandPeterPham AdvisedbyHonglakLee 1.Background&Motivation AlthoughrecentadvancesinAIhaveallowed Go playing programs to become

More information

Machine Learning in WAN Research

Machine Learning in WAN Research Machine Learning in WAN Research Mariam Kiran mkiran@es.net Energy Sciences Network (ESnet) Lawrence Berkeley National Lab Oct 2017 Presented at Internet2 TechEx 2017 Outline ML in general ML in network

More information

Deep Learning. Yee Whye Teh (Oxford Statistics & DeepMind)

Deep Learning. Yee Whye Teh (Oxford Statistics & DeepMind) Deep Learning Yee Whye Teh (Oxford Statistics & DeepMind) http://csml.stats.ox.ac.uk/people/teh What is Machine Learning? Information Structure Prediction Decisions Actions data What is Machine Learning?

More information

Deep Reinforcement Learning

Deep Reinforcement Learning Deep Reinforcement Learning 1 Outline 1. Overview of Reinforcement Learning 2. Policy Search 3. Policy Gradient and Gradient Estimators 4. Q-prop: Sample Efficient Policy Gradient and an Off-policy Critic

More information

CS 4510/9010 Applied Machine Learning. Deep Learning. Paula Matuszek Fall copyright Paula Matuszek 2016

CS 4510/9010 Applied Machine Learning. Deep Learning. Paula Matuszek Fall copyright Paula Matuszek 2016 CS 4510/9010 Applied Machine Learning 1 Deep Learning Paula Matuszek Fall 2016 Beyond Simple Neural Nets 2 In the last few ideas we have seen some surprisingly rapid progress in some areas of AI Image

More information

DEEP LEARNING REVIEW. Yann LeCun, Yoshua Bengio & Geoffrey Hinton Nature Presented by Divya Chitimalla

DEEP LEARNING REVIEW. Yann LeCun, Yoshua Bengio & Geoffrey Hinton Nature Presented by Divya Chitimalla DEEP LEARNING REVIEW Yann LeCun, Yoshua Bengio & Geoffrey Hinton Nature 2015 -Presented by Divya Chitimalla What is deep learning Deep learning allows computational models that are composed of multiple

More information

Reinforcement Learning and Optimal Control. ASU, CSE 691, Winter 2019

Reinforcement Learning and Optimal Control. ASU, CSE 691, Winter 2019 Reinforcement Learning and Optimal Control ASU, CSE 691, Winter 2019 Dimitri P. Bertsekas dimitrib@mit.edu Lecture 1 Bertsekas Reinforcement Learning 1 / 21 Outline 1 Introduction, History, General Concepts

More information

Machine Learning with Python

Machine Learning with Python DEVNET-2163 Machine Learning with Python Dmitry Figol, SE WW Enterprise Sales @dmfigol Cisco Spark How Questions? Use Cisco Spark to communicate with the speaker after the session 1. Find this session

More information

Introduction to Deep Learning

Introduction to Deep Learning ENEE698A : Machine Learning Seminar Introduction to Deep Learning Raviteja Vemulapalli Image credit: [LeCun 1998] Resources Unsupervised feature learning and deep learning (UFLDL) tutorial (http://ufldl.stanford.edu/wiki/index.php/ufldl_tutorial)

More information

A Fast Learning Algorithm for Deep Belief Nets

A Fast Learning Algorithm for Deep Belief Nets A Fast Learning Algorithm for Deep Belief Nets Geoffrey E. Hinton, Simon Osindero Department of Computer Science University of Toronto, Toronto, Canada Yee-Whye Teh Department of Computer Science National

More information

Supervised Learning of Classifiers

Supervised Learning of Classifiers Supervised Learning of Classifiers Carlo Tomasi Supervised learning is the problem of computing a function from a feature (or input) space X to an output space Y from a training set T of feature-output

More information

Deep Learning Workshop. Nov. 20, 2015 Andrew Fishberg, Rowan Zellers

Deep Learning Workshop. Nov. 20, 2015 Andrew Fishberg, Rowan Zellers Deep Learning Workshop Nov. 20, 2015 Andrew Fishberg, Rowan Zellers Why deep learning? The ImageNet Challenge Goal: image classification with 1000 categories Top 5 error rate of 15%. Krizhevsky, Alex,

More information

Slides credited from Dr. David Silver & Hung-Yi Lee

Slides credited from Dr. David Silver & Hung-Yi Lee Slides credited from Dr. David Silver & Hung-Yi Lee Review Reinforcement Learning 2 Reinforcement Learning RL is a general purpose framework for decision making RL is for an agent with the capacity to

More information

Deep Learning Basic Lecture - Complex Systems & Artificial Intelligence 2017/18 (VO) Asan Agibetov, PhD.

Deep Learning Basic Lecture - Complex Systems & Artificial Intelligence 2017/18 (VO) Asan Agibetov, PhD. Deep Learning 861.061 Basic Lecture - Complex Systems & Artificial Intelligence 2017/18 (VO) Asan Agibetov, PhD asan.agibetov@meduniwien.ac.at Medical University of Vienna Center for Medical Statistics,

More information

Machine Learning in WAN Research

Machine Learning in WAN Research Machine Learning in WAN Research Mariam Kiran mkiran@es.net Energy Sciences Network (ESnet) Lawrence Berkeley National Lab Oct 2017 Presented at Internet2 TechEx 2017 Outline ML in general ML in network

More information

APHID: Asynchronous Parallel Game-Tree Search

APHID: Asynchronous Parallel Game-Tree Search APHID: Asynchronous Parallel Game-Tree Search Mark G. Brockington and Jonathan Schaeffer Department of Computing Science University of Alberta Edmonton, Alberta T6G 2H1 Canada February 12, 1999 1 Running

More information

A Series of Lectures on Approximate Dynamic Programming

A Series of Lectures on Approximate Dynamic Programming A Series of Lectures on Approximate Dynamic Programming Dimitri P. Bertsekas Laboratory for Information and Decision Systems Massachusetts Institute of Technology Lucca, Italy June 2017 Bertsekas (M.I.T.)

More information

Potential Midterm Exam Questions

Potential Midterm Exam Questions Potential Midterm Exam Questions 1. What are the four ways in which AI is usually viewed? Which of the four is the preferred view of the authors of our textbook? 2. What does each of the lettered items

More information

A Sparse and Locally Shift Invariant Feature Extractor Applied to Document Images

A Sparse and Locally Shift Invariant Feature Extractor Applied to Document Images A Sparse and Locally Shift Invariant Feature Extractor Applied to Document Images Marc Aurelio Ranzato Yann LeCun Courant Institute of Mathematical Sciences New York University - New York, NY 10003 Abstract

More information

A Sparse and Locally Shift Invariant Feature Extractor Applied to Document Images

A Sparse and Locally Shift Invariant Feature Extractor Applied to Document Images A Sparse and Locally Shift Invariant Feature Extractor Applied to Document Images Marc Aurelio Ranzato Yann LeCun Courant Institute of Mathematical Sciences New York University - New York, NY 10003 Abstract

More information

Convolutional Neural Networks. Computer Vision Jia-Bin Huang, Virginia Tech

Convolutional Neural Networks. Computer Vision Jia-Bin Huang, Virginia Tech Convolutional Neural Networks Computer Vision Jia-Bin Huang, Virginia Tech Today s class Overview Convolutional Neural Network (CNN) Training CNN Understanding and Visualizing CNN Image Categorization:

More information

Human-level Control Through Deep Reinforcement Learning (Deep Q Network) Peidong Wang 11/13/2015

Human-level Control Through Deep Reinforcement Learning (Deep Q Network) Peidong Wang 11/13/2015 Human-level Control Through Deep Reinforcement Learning (Deep Q Network) Peidong Wang 11/13/2015 Content Demo Framework Remarks Experiment Discussion Content Demo Framework Remarks Experiment Discussion

More information

Machine Learning. The Breadth of ML Neural Networks & Deep Learning. Marc Toussaint. Duy Nguyen-Tuong. University of Stuttgart

Machine Learning. The Breadth of ML Neural Networks & Deep Learning. Marc Toussaint. Duy Nguyen-Tuong. University of Stuttgart Machine Learning The Breadth of ML Neural Networks & Deep Learning Marc Toussaint University of Stuttgart Duy Nguyen-Tuong Bosch Center for Artificial Intelligence Summer 2017 Neural Networks Consider

More information

Object Detection Lecture Introduction to deep learning (CNN) Idar Dyrdal

Object Detection Lecture Introduction to deep learning (CNN) Idar Dyrdal Object Detection Lecture 10.3 - Introduction to deep learning (CNN) Idar Dyrdal Deep Learning Labels Computational models composed of multiple processing layers (non-linear transformations) Used to learn

More information

Introduction to Deep Q-network

Introduction to Deep Q-network Introduction to Deep Q-network Presenter: Yunshu Du CptS 580 Deep Learning 10/10/2016 Deep Q-network (DQN) Deep Q-network (DQN) An artificial agent for general Atari game playing Learn to master 49 different

More information

Unsupervised Deep Learning for Scene Recognition

Unsupervised Deep Learning for Scene Recognition Unsupervised Deep Learning for Scene Recognition Akram Helou and Chau Nguyen May 19, 2011 1 Introduction Object and scene recognition are usually studied separately. However, research [2]shows that context

More information

Data Structures and Algorithms

Data Structures and Algorithms Data Structures and Algorithms Session 26. April 29, 2009 Instructor: Bert Huang http://www.cs.columbia.edu/~bert/courses/3137 Announcements Homework 6 due before last class: May 4th Final Review May 4th

More information

Supervised Learning with Neural Networks. We now look at how an agent might learn to solve a general problem by seeing examples.

Supervised Learning with Neural Networks. We now look at how an agent might learn to solve a general problem by seeing examples. Supervised Learning with Neural Networks We now look at how an agent might learn to solve a general problem by seeing examples. Aims: to present an outline of supervised learning as part of AI; to introduce

More information

Autoencoders, denoising autoencoders, and learning deep networks

Autoencoders, denoising autoencoders, and learning deep networks 4 th CiFAR Summer School on Learning and Vision in Biology and Engineering Toronto, August 5-9 2008 Autoencoders, denoising autoencoders, and learning deep networks Part II joint work with Hugo Larochelle,

More information

Computer Vision Lecture 16

Computer Vision Lecture 16 Computer Vision Lecture 16 Deep Learning for Object Categorization 14.01.2016 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de leibe@vision.rwth-aachen.de Announcements Seminar registration period

More information

Automation.

Automation. Automation www.austech.edu.au WHAT IS AUTOMATION? Automation testing is a technique uses an application to implement entire life cycle of the software in less time and provides efficiency and effectiveness

More information

Why equivariance is better than premature invariance

Why equivariance is better than premature invariance 1 Why equivariance is better than premature invariance Geoffrey Hinton Canadian Institute for Advanced Research & Department of Computer Science University of Toronto with contributions from Sida Wang

More information

Deep Learning. Volker Tresp Summer 2015

Deep Learning. Volker Tresp Summer 2015 Deep Learning Volker Tresp Summer 2015 1 Neural Network Winter and Revival While Machine Learning was flourishing, there was a Neural Network winter (late 1990 s until late 2000 s) Around 2010 there

More information

Recurrent Neural Network (RNN) Industrial AI Lab.

Recurrent Neural Network (RNN) Industrial AI Lab. Recurrent Neural Network (RNN) Industrial AI Lab. For example (Deterministic) Time Series Data Closed- form Linear difference equation (LDE) and initial condition High order LDEs 2 (Stochastic) Time Series

More information

Convolutional Neural Networks. CSC 4510/9010 Andrew Keenan
