GPU-BASED A3C FOR DEEP REINFORCEMENT LEARNING
|
|
- Roderick Blair
- 5 years ago
- Views:
Transcription
1 GPU-BASED A3C FOR DEEP REINFORCEMENT LEARNING M. Babaeizadeh,, I.Frosio, S.Tyree, J. Clemons, J.Kautz University of Illinois at Urbana-Champaign, USA NVIDIA, USA An ICLR 2017 paper A github project
2 GPU-BASED A3C FOR DEEP REINFORCEMENT LEARNING M. Babaeizadeh,, I.Frosio, S.Tyree, J. Clemons, J.Kautz University of Illinois at Urbana-Champaign, USA NVIDIA, USA
3 GPU-BASED A3C FOR DEEP REINFORCEMENT LEARNING M. Babaeizadeh,, I.Frosio, S.Tyree, J. Clemons, J.Kautz University of Illinois at Urbana-Champaign, USA NVIDIA, USA
4 GPU-BASED A3C FOR DEEP REINFORCEMENT LEARNING 4
5 GPU-BASED A3C FOR DEEP REINFORCEMENT LEARNING Learning to accomplish a task Image from 5
6 GPU-BASED A3C FOR DEEP REINFORCEMENT LEARNING Definitions ~R t S t R, t Environment Agent S t RL agent Observable status S t Reward R t Action a t Policy 6
7 GPU-BASED A3C FOR DEEP REINFORCEMENT LEARNING Definitions ~R t S t Deep RL agent 7
8 GPU-BASED A3C FOR DEEP REINFORCEMENT LEARNING Definitions ~R t Dp( ) S t 8
9 GPU-BASED A3C FOR DEEP REINFORCEMENT LEARNING 9
10 Agent 16 Agent 2 Agent 1 GPU-BASED A3C FOR DEEP REINFORCEMENT LEARNING Asynchronous Advantage Actor-Critic (Mnih et al., arxiv: v2, 2015) Dp( ) p ( ) Master model 10
11 GPU-BASED A3C FOR DEEP REINFORCEMENT LEARNING 11
12 GPU-BASED A3C FOR DEEP REINFORCEMENT LEARNING The GPU 12
13 GPU-BASED A3C FOR DEEP REINFORCEMENT LEARNING LOW OCCUPANCY (33%) LOW OCCUPANCY 13
14 Batch size GPU-BASED A3C FOR DEEP REINFORCEMENT LEARNING HIGH OCCUPANCY (100%) 14
15 GPU-BASED A3C FOR DEEP REINFORCEMENT LEARNING HIGH OCCUPANCY (100%), LOW UTILIZATION (40%) time time 15
16 GPU-BASED A3C FOR DEEP REINFORCEMENT LEARNING HIGH OCCUPANCY (100%), HIGH UTILIZATION (100%) time time 16
17 GPU-BASED A3C FOR DEEP REINFORCEMENT LEARNING BANDWIDTH LIMITED time time 17
18 Agent 16 Agent 2 Agent 1 Dp( ) p ( ) Master model + 18
19 MAPPING DEEP PROBLEMS TO A GPU REGRESSION, CLASSIFICATION, REINFORCEMENT LEARNING data 100% utilization / occupancy status, reward Pear, pear, pear, pear, Empty, empty, Fig, fig, fig, fig, fig, fig, Strawberry, Strawberry, Strawberry, labels? action 19
20 Agent 16 Agent 2 Agent 1 A3C Master model 20
21 Agent 16 Agent 2 Agent 1 A3C Master model 21
22 Agent 16 Agent 2 Agent 1 A3C Master model 22
23 Agent 16 Agent 2 Agent 1 A3C Master model 23
24 Agent 16 Agent 2 Agent 1 A3C Dp( ) Master model 24
25 Agent 16 Agent 2 Agent 1 A3C Dp( ) p ( ) Master model 25
26 GA3C GPU-based A3C 26 El Capitan big wall, Yosemite Valley
27 Agent N Agent 2 Agent 1 a t GA3C (INFERENCE) prediction queue S t predictors {a t } {S t } Master model 27
28 Agent N Agent 2 Agent 1 GA3C (TRAINING) Dp( ) Master model S t,r t R 0 R 1 R 2 R 4 {S t,r t } training queue trainers 28
29 Agent N Agent 2 Agent 1 a t GA3C prediction queue {a t } {S t } Dp( ) S t predictors Master model R 0 R 1 R 2 R 4 {S t,r t } training queue trainers 29
30 GA3C GPU-based A3C 30 El Capitan big wall, Yosemite Valley
31 GA3C Learn how to balance GPU-based A3C 31 El Capitan big wall, Yosemite Valley
32 Agent N Agent 2 Agent 1 GA3C: PREDICTIONS PER SECOND (PPS) a t prediction queue {a t } {S t } Dp( ) S t predictors Master model R 0 R 1 R 2 R 4 {S t,r t } training queue trainers 32
33 Agent N Agent 2 Agent 1 GA3C: TRAININGS PER SECOND (TPS) a t prediction queue {a t } {S t } Dp( ) S t predictors Master model R 0 R 1 R 2 R 4 {S t,r t } training queue trainers 33
34 AUTOMATIC SCHEDULING Balancing the system at run time ATARI Boxing ATARI Pong N P = # predictors, N T = # trainers, N A = # agents, TPS = training per seconds 34
35 THE ADVANTAGE OF SPEED More frames = faster convergence 35
36 LARGER DNNS For real world applications (e.g. robotics, automotive) A3C (ATARI) Others (robotics) Conv 16 8x8 filters, stride 4 Conv 32 4x4 filters, stride 2 FC 256 Timothy P. Lillicrap et al., Continuous control with deep reinforcement learning, International Conference on Learning Representations, Conv 32 8x8 filters, stride 1, 2, 3, 4 Conv 32 4x4 filters, stride 2? Conv 64 4x4 FC 256 filters, stride 2 S. Levine et al., End-to-end training of deep visuomotor policies, Journal of Machine Learning Research, 17:1-40,
37 PPS GA3C VS. A3C*: PREDICTIONS PER SECONDS * Our Tensor Flow implementation on a CPU x 11x 12x 20x 45x A3C GA3C 1 Small DNN Large DNN, stride 4 Large DNN, stride 3 Large DNN, stride 2 Large DNN, stride 1 37
38 Utilization (%) CPU & GPU UTILIZATION IN GA3C For larger DNNs CPU % GPU % Small DNN Large DNN, stride 4 Large DNN, stride 3 Large DNN, stride 2 Large DNN, stride 1 38
39 Agent N Agent 2 Agent 1 GA3C POLICY LAG Asynchronous playing and training prediction queue {a t } {S t } Dp( ) S t predictors Master model DNN A R 0 R 1 R 2 R 4 {S t,r t } training queue DNN B trainers 39
40 STABILITY AND CONVERGENCE SPEED Reducing policy lag through min training batch size 40
41 GA3C (45x faster) Balancing computational resources, speed and stability. GPU-based A3C 41 El Capitan big wall, Yosemite Valley
42 RESOURCES THEORY CODING M. Babaeizadeh, I. Frosio, S. Tyree, J. Clemons, J. Kautz, Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU, ICLR 2017 (available at GvBcxl¬eId=r1VGvBcxl). GA3C, a GPU implementation of A3C (open source at A general architecture to generate and consume training data. 42
43 QUESTIONS QUESTIONS ATARI 2600 Policy lag Multiple GPUs Why TensorFlow Replay memory Github: ICLR 2017: Reinforcement Learning through Asynchronous Advantage Actor-Critic on 43 a GPU.
44 BACKUP SLIDES
45 POLICY LAG IN GA3C Potentially large time lag between training data generation and network update 45
46 SCORES 46
47 Training a larger DNN 47
48 BALANCING COMPUTATIONAL RESOURCES The actors: CPU, PCI-E & GPU Trainings per second Training queue size Prediction queue size 48
GUNREAL: GPU-accelerated UNsupervised REinforcement and Auxiliary Learning
GUNREAL: GPU-accelerated UNsupervised REinforcement and Auxiliary Learning Koichi Shirahata, Youri Coppens, Takuya Fukagai, Yasumoto Tomita, and Atsushi Ike FUJITSU LABORATORIES LTD. March 27, 2018 0 Deep
More informationSlides credited from Dr. David Silver & Hung-Yi Lee
Slides credited from Dr. David Silver & Hung-Yi Lee Review Reinforcement Learning 2 Reinforcement Learning RL is a general purpose framework for decision making RL is for an agent with the capacity to
More informationarxiv: v2 [cs.lg] 16 May 2017
EFFICIENT PARALLEL METHODS FOR DEEP REIN- FORCEMENT LEARNING Alfredo V. Clemente Department of Computer and Information Science Norwegian University of Science and Technology Trondheim, Norway alfredvc@stud.ntnu.no
More informationIntroduction to Deep Q-network
Introduction to Deep Q-network Presenter: Yunshu Du CptS 580 Deep Learning 10/10/2016 Deep Q-network (DQN) Deep Q-network (DQN) An artificial agent for general Atari game playing Learn to master 49 different
More informationarxiv: v3 [cs.lg] 2 Mar 2017
Published as a conference paper at ICLR 7 REINFORCEMENT LEARNING THROUGH ASYN- CHRONOUS ADVANTAGE ACTOR-CRITIC ON A GPU Mohammad Babaeizadeh Department of Computer Science University of Illinois at Urbana-Champaign,
More informationTopics in AI (CPSC 532L): Multimodal Learning with Vision, Language and Sound. Lecture 12: Deep Reinforcement Learning
Topics in AI (CPSC 532L): Multimodal Learning with Vision, Language and Sound Lecture 12: Deep Reinforcement Learning Types of Learning Supervised training Learning from the teacher Training data includes
More informationKnowledge Transfer for Deep Reinforcement Learning with Hierarchical Experience Replay
Knowledge Transfer for Deep Reinforcement Learning with Hierarchical Experience Replay Haiyan (Helena) Yin, Sinno Jialin Pan School of Computer Science and Engineering Nanyang Technological University
More informationAccelerating Reinforcement Learning in Engineering Systems
Accelerating Reinforcement Learning in Engineering Systems Tham Chen Khong with contributions from Zhou Chongyu and Le Van Duc Department of Electrical & Computer Engineering National University of Singapore
More informationImplementing Long-term Recurrent Convolutional Network Using HLS on POWER System
Implementing Long-term Recurrent Convolutional Network Using HLS on POWER System Xiaofan Zhang1, Mohamed El Hadedy1, Wen-mei Hwu1, Nam Sung Kim1, Jinjun Xiong2, Deming Chen1 1 University of Illinois Urbana-Champaign
More informationarxiv: v1 [cs.cv] 2 Sep 2018
Natural Language Person Search Using Deep Reinforcement Learning Ankit Shah Language Technologies Institute Carnegie Mellon University aps1@andrew.cmu.edu Tyler Vuong Electrical and Computer Engineering
More informationNVIDIA FOR DEEP LEARNING. Bill Veenhuis
NVIDIA FOR DEEP LEARNING Bill Veenhuis bveenhuis@nvidia.com Nvidia is the world s leading ai platform ONE ARCHITECTURE CUDA 2 GPU: Perfect Companion for Accelerating Apps & A.I. CPU GPU 3 Intro to AI AGENDA
More informationNeural Episodic Control. Alexander pritzel et al (presented by Zura Isakadze)
Neural Episodic Control Alexander pritzel et al. 2017 (presented by Zura Isakadze) Reinforcement Learning Image from reinforce.io RL Example - Atari Games Observed States Images. Internal state - RAM.
More informationA Brief Introduction to Reinforcement Learning
A Brief Introduction to Reinforcement Learning Minlie Huang ( ) Dept. of Computer Science, Tsinghua University aihuang@tsinghua.edu.cn 1 http://coai.cs.tsinghua.edu.cn/hml Reinforcement Learning Agent
More informationCombining Deep Reinforcement Learning and Safety Based Control for Autonomous Driving
Combining Deep Reinforcement Learning and Safety Based Control for Autonomous Driving Xi Xiong Jianqiang Wang Fang Zhang Keqiang Li State Key Laboratory of Automotive Safety and Energy, Tsinghua University
More informationHigh Performance Computing
High Performance Computing 9th Lecture 2016/10/28 YUKI ITO 1 Selected Paper: vdnn: Virtualized Deep Neural Networks for Scalable, MemoryEfficient Neural Network Design Minsoo Rhu, Natalia Gimelshein, Jason
More informationShrinath Shanbhag Senior Software Engineer Microsoft Corporation
Accelerating GPU inferencing with DirectML and DirectX 12 Shrinath Shanbhag Senior Software Engineer Microsoft Corporation Machine Learning Machine learning has become immensely popular over the last decade
More informationDeep Learning Accelerators
Deep Learning Accelerators Abhishek Srivastava (as29) Samarth Kulshreshtha (samarth5) University of Illinois, Urbana-Champaign Submitted as a requirement for CS 433 graduate student project Outline Introduction
More informationCME 213 SPRING Eric Darve
CME 213 SPRING 2017 Eric Darve MPI SUMMARY Point-to-point and collective communications Process mapping: across nodes and within a node (socket, NUMA domain, core, hardware thread) MPI buffers and deadlocks
More informationMarkov Decision Processes. (Slides from Mausam)
Markov Decision Processes (Slides from Mausam) Machine Learning Operations Research Graph Theory Control Theory Markov Decision Process Economics Robotics Artificial Intelligence Neuroscience /Psychology
More informationA performance comparison of Deep Learning frameworks on KNL
A performance comparison of Deep Learning frameworks on KNL R. Zanella, G. Fiameni, M. Rorro Middleware, Data Management - SCAI - CINECA IXPUG Bologna, March 5, 2018 Table of Contents 1. Problem description
More informationIntro to Deep Learning. Slides Credit: Andrej Karapathy, Derek Hoiem, Marc Aurelio, Yann LeCunn
Intro to Deep Learning Slides Credit: Andrej Karapathy, Derek Hoiem, Marc Aurelio, Yann LeCunn Why this class? Deep Features Have been able to harness the big data in the most efficient and effective
More informationA NEW COMPUTING ERA JENSEN HUANG, FOUNDER & CEO GTC CHINA 2017
A NEW COMPUTING ERA JENSEN HUANG, FOUNDER & CEO GTC CHINA 2017 TWO FORCES DRIVING THE FUTURE OF COMPUTING 10 7 Transistors (thousands) 10 6 10 5 1.1X per year 10 4 10 3 10 2 1.5X per year Single-threaded
More informationDeep Reinforcement Learning
Deep Reinforcement Learning 1 Outline 1. Overview of Reinforcement Learning 2. Policy Search 3. Policy Gradient and Gradient Estimators 4. Q-prop: Sample Efficient Policy Gradient and an Off-policy Critic
More informationWhen Network Embedding meets Reinforcement Learning?
When Network Embedding meets Reinforcement Learning? ---Learning Combinatorial Optimization Problems over Graphs Changjun Fan 1 1. An Introduction to (Deep) Reinforcement Learning 2. How to combine NE
More informationDevice Placement Optimization with Reinforcement Learning
Device Placement Optimization with Reinforcement Learning Azalia Mirhoseini, Anna Goldie, Hieu Pham, Benoit Steiner, Mohammad Norouzi, Naveen Kumar, Rasmus Larsen, Yuefeng Zhou, Quoc Le, Samy Bengio, and
More informationParallel Stochastic Gradient Descent: The case for native GPU-side GPI
Parallel Stochastic Gradient Descent: The case for native GPU-side GPI J. Keuper Competence Center High Performance Computing Fraunhofer ITWM, Kaiserslautern, Germany Mark Silberstein Accelerated Computer
More informationMachine Learning in WAN Research
Machine Learning in WAN Research Mariam Kiran mkiran@es.net Energy Sciences Network (ESnet) Lawrence Berkeley National Lab Oct 2017 Presented at Internet2 TechEx 2017 Outline ML in general ML in network
More informationMachine Learning in WAN Research
Machine Learning in WAN Research Mariam Kiran mkiran@es.net Energy Sciences Network (ESnet) Lawrence Berkeley National Lab Oct 2017 Presented at Internet2 TechEx 2017 Outline ML in general ML in network
More informationImplementing Deep Learning for Video Analytics on Tegra X1.
Implementing Deep Learning for Video Analytics on Tegra X1 research@hertasecurity.com Index Who we are, what we do Video analytics pipeline Video decoding Facial detection and preprocessing DNN: learning
More informationIntelligent Hybrid Flash Management
Intelligent Hybrid Flash Management Jérôme Gaysse Senior Technology&Market Analyst jerome.gaysse@silinnov-consulting.com Flash Memory Summit 2018 Santa Clara, CA 1 Research context Analysis of system &
More informationDeep Learning with Tensorflow AlexNet
Machine Learning and Computer Vision Group Deep Learning with Tensorflow http://cvml.ist.ac.at/courses/dlwt_w17/ AlexNet Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton, "Imagenet classification
More informationDEEP NEURAL NETWORKS CHANGING THE AUTONOMOUS VEHICLE LANDSCAPE. Dennis Lui August 2017
DEEP NEURAL NETWORKS CHANGING THE AUTONOMOUS VEHICLE LANDSCAPE Dennis Lui August 2017 THE RISE OF GPU COMPUTING APPLICATIONS 10 7 10 6 GPU-Computing perf 1.5X per year 1000X by 2025 ALGORITHMS 10 5 1.1X
More informationDeep Neural Networks for Market Prediction on the Xeon Phi *
Deep Neural Networks for Market Prediction on the Xeon Phi * Matthew Dixon 1, Diego Klabjan 2 and * The support of Intel Corporation is gratefully acknowledged Jin Hoon Bang 3 1 Department of Finance,
More informationEmbedded GPGPU and Deep Learning for Industrial Market
Embedded GPGPU and Deep Learning for Industrial Market Author: Dan Mor GPGPU and HPEC Product Line Manager September 2018 Table of Contents 1. INTRODUCTION... 3 2. DIFFICULTIES IN CURRENT EMBEDDED INDUSTRIAL
More informationThroughput-Optimized OpenCL-based FPGA Accelerator for Large-Scale Convolutional Neural Networks
Throughput-Optimized OpenCL-based FPGA Accelerator for Large-Scale Convolutional Neural Networks Naveen Suda, Vikas Chandra *, Ganesh Dasika *, Abinash Mohanty, Yufei Ma, Sarma Vrudhula, Jae-sun Seo, Yu
More informationUnrolling Inference: The Recurrent Inference Machine
Unrolling Inference: The Recurrent Inference Machine Max Welling University of Amsterdam / Qualcomm Canadian Institute for Advanced Research ML @ UvA (2 fte) (12fte) Machine Learning in Amsterdam (3fte)
More informationSCALABLE DISTRIBUTED DEEP LEARNING
SEOUL Oct.7, 2016 SCALABLE DISTRIBUTED DEEP LEARNING Han Hee Song, PhD Soft On Net 10/7/2016 BATCH PROCESSING FRAMEWORKS FOR DL Data parallelism provides efficient big data processing: data collecting,
More informationCafeGPI. Single-Sided Communication for Scalable Deep Learning
CafeGPI Single-Sided Communication for Scalable Deep Learning Janis Keuper itwm.fraunhofer.de/ml Competence Center High Performance Computing Fraunhofer ITWM, Kaiserslautern, Germany Deep Neural Networks
More informationCS885 Reinforcement Learning Lecture 9: May 30, Model-based RL [SutBar] Chap 8
CS885 Reinforcement Learning Lecture 9: May 30, 2018 Model-based RL [SutBar] Chap 8 CS885 Spring 2018 Pascal Poupart 1 Outline Model-based RL Dyna Monte-Carlo Tree Search CS885 Spring 2018 Pascal Poupart
More informationAn introduction to Machine Learning silicon
An introduction to Machine Learning silicon November 28 2017 Insight for Technology Investors AI/ML terminology Artificial Intelligence Machine Learning Deep Learning Algorithms: CNNs, RNNs, etc. Additional
More informationOptimazation of Traffic Light Controlling with Reinforcement Learning COMP 3211 Final Project, Group 9, Fall
Optimazation of Traffic Light Controlling with Reinforcement Learning COMP 3211 Final Project, Group 9, Fall 2017-2018 CHEN Ziyi 20319433, LI Xinran 20270572, LIU Cheng 20328927, WU Dake 20328628, and
More informationConvolutional Neural Networks. Computer Vision Jia-Bin Huang, Virginia Tech
Convolutional Neural Networks Computer Vision Jia-Bin Huang, Virginia Tech Today s class Overview Convolutional Neural Network (CNN) Training CNN Understanding and Visualizing CNN Image Categorization:
More informationLSD-NET: LOOK, STEP AND DETECT FOR JOINT NAV-
LSD-NET: LOOK, STEP AND DETECT FOR JOINT NAV- IGATION AND MULTI-VIEW RECOGNITION WITH DEEP REINFORCEMENT LEARNING Anonymous authors Paper under double-blind review ABSTRACT Multi-view recognition is the
More informationPOINT CLOUD DEEP LEARNING
POINT CLOUD DEEP LEARNING Innfarn Yoo, 3/29/28 / 57 Introduction AGENDA Previous Work Method Result Conclusion 2 / 57 INTRODUCTION 3 / 57 2D OBJECT CLASSIFICATION Deep Learning for 2D Object Classification
More informationDeep Learning Inference in Facebook Data Centers: Characterization, Performance Optimizations, and Hardware Implications
Deep Learning Inference in Facebook Data Centers: Characterization, Performance Optimizations, and Hardware Implications Jongsoo Park Facebook AI System SW/HW Co-design Team Sep-21 2018 Team Introduction
More informationECS289: Scalable Machine Learning
ECS289: Scalable Machine Learning Cho-Jui Hsieh UC Davis Sept 22, 2016 Course Information Website: http://www.stat.ucdavis.edu/~chohsieh/teaching/ ECS289G_Fall2016/main.html My office: Mathematical Sciences
More informationIntroduction to Reinforcement Learning. J. Zico Kolter Carnegie Mellon University
Introduction to Reinforcement Learning J. Zico Kolter Carnegie Mellon University 1 Agent interaction with environment Agent State s Reward r Action a Environment 2 Of course, an oversimplification 3 Review:
More informationDeep Learning Inference as a Service
Deep Learning Inference as a Service Mohammad Babaeizadeh Hadi Hashemi Chris Cai Advisor: Prof Roy H. Campbell Use case 1: Model Developer Use case 1: Model Developer Inference Service Use case
More informationHuman-level Control Through Deep Reinforcement Learning (Deep Q Network) Peidong Wang 11/13/2015
Human-level Control Through Deep Reinforcement Learning (Deep Q Network) Peidong Wang 11/13/2015 Content Demo Framework Remarks Experiment Discussion Content Demo Framework Remarks Experiment Discussion
More informationResearch Faculty Summit Systems Fueling future disruptions
Research Faculty Summit 2018 Systems Fueling future disruptions Wolong: A Back-end Optimizer for Deep Learning Computation Jilong Xue Researcher, Microsoft Research Asia System Challenge in Deep Learning
More informationDIGITS DEEP LEARNING GPU TRAINING SYSTEM
DIGITS DEEP LEARNING GPU TRAINING SYSTEM AGENDA 1 Introduction to Deep Learning 2 What is DIGITS 3 How to use DIGITS Practical DEEP LEARNING Examples Image Classification, Object Detection, Localization,
More informationEFFICIENT INFERENCE WITH TENSORRT. Han Vanholder
EFFICIENT INFERENCE WITH TENSORRT Han Vanholder AI INFERENCING IS EXPLODING 2 Trillion Messages Per Day On LinkedIn 500M Daily active users of iflytek 140 Billion Words Per Day Translated by Google 60
More informationMachine Learning Workshop
Machine Learning Workshop {Presenters} Feb. 20th, 2018 Theory of Neural Networks Architecture and Types of Layers: Fully Connected (FC) Convolutional Neural Network (CNN) Pooling Drop out Residual Recurrent
More informationTensorFlow: A System for Learning-Scale Machine Learning. Google Brain
TensorFlow: A System for Learning-Scale Machine Learning Google Brain The Problem Machine learning is everywhere This is in large part due to: 1. Invention of more sophisticated machine learning models
More informationDeep Learning on Modern Architectures. Keren Zhou 4/17/2017
Deep Learning on Modern Architectures Keren Zhou 4/17/2017 HPC Software Stack Application Algorithm Data Layout CPU GPU MIC Others HPC Software Stack Deep Learning Algorithm Data Layout CPU GPU MIC Others
More informationMask R-CNN. By Kaiming He, Georgia Gkioxari, Piotr Dollar and Ross Girshick Presented By Aditya Sanghi
Mask R-CNN By Kaiming He, Georgia Gkioxari, Piotr Dollar and Ross Girshick Presented By Aditya Sanghi Types of Computer Vision Tasks http://cs231n.stanford.edu/ Semantic vs Instance Segmentation Image
More informationDEEP LEARNING AND DIGITS DEEP LEARNING GPU TRAINING SYSTEM
DEEP LEARNING AND DIGITS DEEP LEARNING GPU TRAINING SYSTEM AGENDA 1 Introduction to Deep Learning 2 What is DIGITS 3 How to use DIGITS Practical DEEP LEARNING Examples Image Classification, Object Detection,
More informationCellular Network Traffic Scheduling using Deep Reinforcement Learning
Cellular Network Traffic Scheduling using Deep Reinforcement Learning Sandeep Chinchali, et. al. Marco Pavone, Sachin Katti Stanford University AAAI 2018 Can we learn to optimally manage cellular networks?
More informationPoseidon: An Efficient Communication Architecture for Distributed Deep Learning on GPU Clusters
Poseidon: An Efficient Communication Architecture for Distributed Deep Learning on GPU Clusters Hao Zhang Zeyu Zheng, Shizhen Xu, Wei Dai, Qirong Ho, Xiaodan Liang, Zhiting Hu, Jianliang Wei, Pengtao Xie,
More informationCNN optimization. Rassadin A
CNN optimization Rassadin A. 01.2017-02.2017 What to optimize? Training stage time consumption (CPU / GPU) Inference stage time consumption (CPU / GPU) Training stage memory consumption Inference stage
More informationSemantic Segmentation
Semantic Segmentation UCLA:https://goo.gl/images/I0VTi2 OUTLINE Semantic Segmentation Why? Paper to talk about: Fully Convolutional Networks for Semantic Segmentation. J. Long, E. Shelhamer, and T. Darrell,
More informationR goes Mobile: Efficient Scheduling for Parallel R Programs on Heterogeneous Embedded Systems
R goes Mobile: Efficient Scheduling for Parallel R Programs on Heterogeneous Embedded Systems, Andreas Lang Olaf Neugebauer, Peter Marwedel 03/07/2017 SFB 876 Parallel Machine Learning Algorithms Challenge:
More informationOnto Petaflops with Kubernetes
Onto Petaflops with Kubernetes Vishnu Kannan Google Inc. vishh@google.com Key Takeaways Kubernetes can manage hardware accelerators at Scale Kubernetes provides a playground for ML ML journey with Kubernetes
More informationMIXED PRECISION TRAINING: THEORY AND PRACTICE Paulius Micikevicius
MIXED PRECISION TRAINING: THEORY AND PRACTICE Paulius Micikevicius What is Mixed Precision Training? Reduced precision tensor math with FP32 accumulation, FP16 storage Successfully used to train a variety
More informationLayer-wise Performance Bottleneck Analysis of Deep Neural Networks
Layer-wise Performance Bottleneck Analysis of Deep Neural Networks Hengyu Zhao, Colin Weinshenker*, Mohamed Ibrahim*, Adwait Jog*, Jishen Zhao University of California, Santa Cruz, *The College of William
More informationCERN openlab & IBM Research Workshop Trip Report
CERN openlab & IBM Research Workshop Trip Report Jakob Blomer, Javier Cervantes, Pere Mato, Radu Popescu 2018-12-03 Workshop Organization 1 full day at IBM Research Zürich ~25 participants from CERN ~10
More informationSelf Programming Networks
Self Programming Networks Is it possible for to Learn the control planes of networks and applications? Operators specify what they want, and the system learns how to deliver CAN WE LEARN THE CONTROL PLANE
More informationDNNBuilder: an Automated Tool for Building High-Performance DNN Hardware Accelerators for FPGAs
IBM Research AI Systems Day DNNBuilder: an Automated Tool for Building High-Performance DNN Hardware Accelerators for FPGAs Xiaofan Zhang 1, Junsong Wang 2, Chao Zhu 2, Yonghua Lin 2, Jinjun Xiong 3, Wen-mei
More informationExperiments with Tensor Flow
Experiments with Tensor Flow 06.07.2017 Roman Weber (Geschäftsführer) Richard Schmid (Senior Consultant) A Smart Home? 2 WEBGATE WELTWEIT WebGate USA Boston WebGate Support Center Brno, Tschechische Republik
More informationWu Zhiwen.
Wu Zhiwen zhiwen.wu@intel.com Agenda Background information OpenCV DNN module OpenCL acceleration Vulkan backend Sample 2 What is OpenCV? Open Source Compute Vision (OpenCV) library 2500+ Optimized algorithms
More informationCS 234 Winter 2018: Assignment #2
Due date: 2/10 (Sat) 11:00 PM (23:00) PST These questions require thought, but do not require long answers. Please be as concise as possible. We encourage students to discuss in groups for assignments.
More informationMACHINE LEARNING CLASSIFIERS ADVANTAGES AND CHALLENGES OF SELECTED METHODS
MACHINE LEARNING CLASSIFIERS ADVANTAGES AND CHALLENGES OF SELECTED METHODS FRANK ORBEN, TECHNICAL SUPPORT / DEVELOPER IMAGE PROCESSING, STEMMER IMAGING OUTLINE Introduction Task: Classification Theory
More informationFully Convolutional Networks for Semantic Segmentation
Fully Convolutional Networks for Semantic Segmentation Jonathan Long* Evan Shelhamer* Trevor Darrell UC Berkeley Chaim Ginzburg for Deep Learning seminar 1 Semantic Segmentation Define a pixel-wise labeling
More information10703 Deep Reinforcement Learning and Control
10703 Deep Reinforcement Learning and Control Russ Salakhutdinov Machine Learning Department rsalakhu@cs.cmu.edu Policy Gradient II Used Materials Disclaimer: Much of the material and slides for this lecture
More informationUsing Reinforcement Learning to Optimize Storage Decisions Ravi Khadiwala Cleversafe
Using Reinforcement Learning to Optimize Storage Decisions Ravi Khadiwala Cleversafe Topics What is Reinforcement Learning? Exploration vs. Exploitation The Multi-armed Bandit Optimizing read locations
More informationHPE Deep Learning Cookbook: Recipes to Run Deep Learning Workloads. Natalia Vassilieva, Sergey Serebryakov
HPE Deep Learning Cookbook: Recipes to Run Deep Learning Workloads Natalia Vassilieva, Sergey Serebryakov Deep learning ecosystem today Software Hardware 2 HPE s portfolio for deep learning Government,
More informationImageNet Classification with Deep Convolutional Neural Networks
ImageNet Classification with Deep Convolutional Neural Networks Alex Krizhevsky Ilya Sutskever Geoffrey Hinton University of Toronto Canada Paper with same name to appear in NIPS 2012 Main idea Architecture
More informationAutonomous Driving Solutions
Autonomous Driving Solutions Oct, 2017 DrivePX2 & DriveWorks Marcus Oh (moh@nvidia.com) Sr. Solution Architect, NVIDIA This work is licensed under a Creative Commons Attribution-Share Alike 4.0 (CC BY-SA
More informationNvidia Jetson TX2 and its Software Toolset. João Fernandes 2017/2018
Nvidia Jetson TX2 and its Software Toolset João Fernandes 2017/2018 In this presentation Nvidia Jetson TX2: Hardware Nvidia Jetson TX2: Software Machine Learning: Neural Networks Convolutional Neural Networks
More informationXilinx ML Suite Overview
Xilinx ML Suite Overview Yao Fu System Architect Data Center Acceleration Xilinx Accelerated Computing Workloads Machine Learning Inference Image classification and object detection Video Streaming Frame
More information1 Overview Definitions (read this section carefully) 2
MLPerf User Guide Version 0.5 May 2nd, 2018 1 Overview 2 1.1 Definitions (read this section carefully) 2 2 General rules 3 2.1 Strive to be fair 3 2.2 System and framework must be consistent 4 2.3 System
More informationSolving the Non-Volatile Memory Conundrum for Deep Learning Workloads
Solving the Non-Volatile Memory Conundrum for Deep Learning Workloads Ahmet Inci and Diana Marculescu Department of Electrical and Computer Engineering Carnegie Mellon University ainci@andrew.cmu.edu Architectures
More information3D Deep Learning on Geometric Forms. Hao Su
3D Deep Learning on Geometric Forms Hao Su Many 3D representations are available Candidates: multi-view images depth map volumetric polygonal mesh point cloud primitive-based CAD models 3D representation
More informationRich feature hierarchies for accurate object detection and semantic segmentation
Rich feature hierarchies for accurate object detection and semantic segmentation Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik Presented by Pandian Raju and Jialin Wu Last class SGD for Document
More informationLearning algorithms for physical systems: challenges and solutions
Learning algorithms for physical systems: challenges and solutions Ion Matei Palo Alto Research Center 2018 PARC 1 All Rights Reserved System analytics: how things are done Use of models (physics) to inform
More informationPouya Kousha Fall 2018 CSE 5194 Prof. DK Panda
Pouya Kousha Fall 2018 CSE 5194 Prof. DK Panda 1 Motivation And Intro Programming Model Spark Data Transformation Model Construction Model Training Model Inference Execution Model Data Parallel Training
More informationReinforcement Learning (2)
Reinforcement Learning (2) Bruno Bouzy 1 october 2013 This document is the second part of the «Reinforcement Learning» chapter of the «Agent oriented learning» teaching unit of the Master MI computer course.
More informationSURREAL: Open-Source Reinforcement Learning Framework and Robot Manipulation Benchmark
SURREAL: Open-Source Reinforcement Learning Framework and Robot Manipulation Benchmark Linxi Fan Yuke Zhu Jiren Zhu Zihua Liu Orien Zeng Anchit Gupta Joan Creus-Costa Silvio Savarese Li Fei-Fei Stanford
More informationNUMA-Aware Data-Transfer Measurements for Power/NVLink Multi-GPU Systems
NUMA-Aware Data-Transfer Measurements for Power/NVLink Multi-GPU Systems Carl Pearson 1, I-Hsin Chung 2, Zehra Sura 2, Wen-Mei Hwu 1, and Jinjun Xiong 2 1 University of Illinois Urbana-Champaign, Urbana
More informationTENSORFLOW: LARGE-SCALE MACHINE LEARNING ON HETEROGENEOUS DISTRIBUTED SYSTEMS. by Google Research. presented by Weichen Wang
TENSORFLOW: LARGE-SCALE MACHINE LEARNING ON HETEROGENEOUS DISTRIBUTED SYSTEMS by Google Research presented by Weichen Wang 2016.11.28 OUTLINE Introduction The Programming Model The Implementation Single
More informationAdministrivia. HW0 scores, HW1 peer-review assignments out. If you re having Cython trouble with HW2, let us know.
Administrivia HW0 scores, HW1 peer-review assignments out. HW2 out, due Nov. 2. If you re having Cython trouble with HW2, let us know. Review on Wednesday: Post questions on Piazza Introduction to GPUs
More informationCS Deep Reinforcement Learning HW4: Model-Based RL due October 18th, 11:59 pm
CS294-112 Deep Reinforcement Learning HW4: Model-Based RL due October 18th, 11:59 pm 1 Introduction The goal of this assignment is to get experience with model-learning in the context of RL, and to use
More informationPOWERING THE AI REVOLUTION JENSEN HUANG, FOUNDER & CEO GTC 2017
POWERING THE AI REVOLUTION JENSEN HUANG, FOUNDER & CEO GTC 2017 LIFE AFTER MOORE S LAW 10 7 40 Years of Microprocessor Trend Data 10 6 10 5 Transistors (thousands) 1.1X per year 10 4 10 3 1.5X per year
More informationDeep Learning for Embedded Security Evaluation
Deep Learning for Embedded Security Evaluation Emmanuel Prouff 1 1 Laboratoire de Sécurité des Composants, ANSSI, France April 2018, CISCO April 2018, CISCO E. Prouff 1/22 Contents 1. Context and Motivation
More informationEnd-to-End Training of Acoustic Models for Large Vocabulary Continuous Speech Recognition with TensorFlow
End-to-End Training of Acoustic Models for Large Vocabulary Continuous Speech Recognition with TensorFlow Ehsan Variani, Tom Bagby, Erik McDermott, Michiel Bacchiani Google Inc, Mountain View, CA, USA
More informationWITH the recent advance [1] of deep convolutional neural
1 Action-Driven Object Detection with Top-Down Visual Attentions Donggeun Yoo, Student Member, IEEE, Sunggyun Park, Student Member, IEEE, Kyunghyun Paeng, Student Member, IEEE, Joon-Young Lee, Member,
More informationMulti-dimensional Parallel Training of Winograd Layer on Memory-Centric Architecture
The 51st Annual IEEE/ACM International Symposium on Microarchitecture Multi-dimensional Parallel Training of Winograd Layer on Memory-Centric Architecture Byungchul Hong Yeonju Ro John Kim FuriosaAI Samsung
More informationAdaDepth: Unsupervised Content Congruent Adaptation for Depth Estimation
AdaDepth: Unsupervised Content Congruent Adaptation for Depth Estimation Introduction Supplementary material In the supplementary material, we present additional qualitative results of the proposed AdaDepth
More informationIBM Leading High Performance Computing and Deep Learning Technologies
IBM Leading High Performance Computing and Deep Learning Technologies Yubo Li ( 李玉博 ) Chief Architect, on Cloud IBM Research -- China email: liyubobj@cn.ibm.com QQ: 395238640 GTC China 2016 Sept. 13, 2016
More informationDirect Multi-Scale Dual-Stream Network for Pedestrian Detection Sang-Il Jung and Ki-Sang Hong Image Information Processing Lab.
[ICIP 2017] Direct Multi-Scale Dual-Stream Network for Pedestrian Detection Sang-Il Jung and Ki-Sang Hong Image Information Processing Lab., POSTECH Pedestrian Detection Goal To draw bounding boxes that
More information