Deep Learning Requirements for Autonomous Vehicles
Transcription
1 Deep Learning Requirements for Autonomous Vehicles. Pierre Paulin, Director of R&D, Synopsys Inc. Chipex, 1 May
2 Agenda
- Deep learning and convolutional neural networks for embedded vision
- Automotive
- Why deep learning vs. conventional computer vision?
- State-of-the-art CNN algorithms
- Implementing vision processing on embedded systems
- Introduction to the DesignWare EV6x Embedded Vision Processor with CNN
3 Problem to Solve: Humans are Fallible Drivers
Annual global road crash statistics:
- Nearly 1.3 million people die in road crashes each year, on average 3,287 deaths a day
- An additional million are injured or disabled
- Road crashes cost USD $518 billion globally, costing individual countries 1-2% of their annual GDP
- About 94% of accidents are caused by human error (2% environment, 2% mechanical, 2% margin error)
Cameras in automobiles:
- Rear view camera (Intuitive Parking Assist)
- Front camera (pedestrian detection, AEB/Automatic Emergency Braking)
- Surround view cameras
- Interior camera (drowsiness / gaze detection)
Goal: to be more responsive than a human
4 Autonomous Driving Evolution
[Figure: Global Autonomous Vehicle Sales Forecast, in millions of vehicles, by autonomy level L0 to L5]
5 Technical Challenge: Inferring Information from a 2D Image (Frame)
Example requirements:
- Performance priority: What is the highest performance I can get for pedestrian detection using an ADAS front camera? Measured in TOPS or TMAC/s, and/or FPS
- Power priority: How much CNN performance can I get for a power budget of 200 mW? Measured in TMAC/s/W or TOP/s/W
- Area priority: How much embedded vision performance can I get in 2 mm²? Measured in mm² or TMAC/mm²
Other concerns:
- External memory (DDR) bandwidth
- Software tools / time to market
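The power-priority question above reduces to simple arithmetic once an efficiency figure is known. The sketch below is illustrative only: the efficiency (2.0 TMAC/s/W), the per-frame network cost (1.5 GMAC) and the 75% utilization are assumed values, not figures from the presentation. Only the 200 mW budget comes from the slide.

```python
def achievable_tmacs(power_budget_w: float, efficiency_tmacs_per_w: float) -> float:
    """Throughput in TMAC/s that fits in a power budget at a given efficiency."""
    return power_budget_w * efficiency_tmacs_per_w


def frames_per_second(tmacs: float, gmacs_per_frame: float, utilization: float = 1.0) -> float:
    """Frame rate sustained by `tmacs` TMAC/s for a network costing `gmacs_per_frame` GMAC per frame."""
    return tmacs * 1e12 * utilization / (gmacs_per_frame * 1e9)


budget_w = 0.2                  # 200 mW power budget (from the slide)
assumed_efficiency = 2.0        # TMAC/s/W, hypothetical
assumed_network_cost = 1.5      # GMAC per frame, hypothetical
assumed_utilization = 0.75      # fraction of peak MACs doing useful work, hypothetical

tmacs = achievable_tmacs(budget_w, assumed_efficiency)
fps = frames_per_second(tmacs, assumed_network_cost, assumed_utilization)
print(f"{tmacs:.2f} TMAC/s within {budget_w * 1000:.0f} mW, about {fps:.0f} frames/s")
```

The same two functions can be turned around for the performance-priority case: fix the required frame rate and solve for the throughput, then for the power.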
6 Traditional Computer Vision for Object Detection
Histogram of Oriented Gradients (HoG) example
- In the past, most pattern recognition tasks were performed on vector processing units, with programs hand-tuned for feature extraction followed by shallow learning
- Histogram of Oriented Gradients (HoG): object appearance and shape within an image can be described by the distribution of intensity gradients or edge directions
- Shallow learning: e.g. SVM
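To make the HoG idea concrete, here is a minimal sketch of the histogram for a single cell. It is a simplification under stated assumptions: central-difference gradients, unsigned orientations in 9 bins of 20 degrees, magnitude-weighted votes without bin interpolation, and no block normalization. A full HoG descriptor adds those steps and then feeds the result to a shallow classifier such as an SVM. The function name `hog_cell_histogram` is made up for this example.

```python
import math


def hog_cell_histogram(cell, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations for one cell.

    `cell` is a list of rows of grayscale intensities. Border pixels are skipped
    because central differences need both neighbours.
    """
    height, width = len(cell), len(cell[0])
    bin_width = 180.0 / n_bins
    hist = [0.0] * n_bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]   # horizontal gradient
            gy = cell[y + 1][x] - cell[y - 1][x]   # vertical gradient
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0  # unsigned orientation
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist


# An 8x8 cell whose intensity rises from left to right: every gradient is horizontal.
ramp = [[10 * x for x in range(8)] for _ in range(8)]
hist = hog_cell_histogram(ramp)
print(hist)
```

For the left-to-right ramp, all gradient energy falls into the first bin (orientations near 0 degrees), which is the sense in which the histogram summarizes edge direction.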
7 Recent EV Market Trends & Perspective
- Displacement of traditional vision algorithms by deep learning for improved accuracy, e.g. pedestrian detection (HoG) and face detection (Viola-Jones) moving to Convolutional Neural Network (CNN) based deep learning approaches
- Rapid evolution of deep learning technology toward a wide set of applications: classification, localization, detection, semantic segmentation
8 Deep Learning using CNN
[Figure: DNN network or graph; CNN architecture]
Convolutional Neural Networks:
- Take advantage of the spatial structure of the image
- Shared weights/biases reduce the number of parameters
- Consist of convolution, pooling and fully connected layers
- Easier to train than a fully connected network
- Current standard for embedded vision object detection
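The parameter saving from shared weights can be checked with a short count. The layer sizes below (a 32x32x3 input, sixteen 3x3 filters, output of the same spatial size) are assumptions chosen for illustration, not values from the presentation.

```python
def conv_params(kernel_h: int, kernel_w: int, in_channels: int, out_channels: int) -> int:
    """Weights plus one bias per filter; the same filter is reused at every image position."""
    return kernel_h * kernel_w * in_channels * out_channels + out_channels


def dense_params(n_inputs: int, n_outputs: int) -> int:
    """One weight per input/output pair plus one bias per output."""
    return n_inputs * n_outputs + n_outputs


# Assumed example layer: 32x32 RGB input mapped to 16 feature maps of 32x32.
height, width, in_ch, out_ch = 32, 32, 3, 16

conv_count = conv_params(3, 3, in_ch, out_ch)
dense_count = dense_params(height * width * in_ch, height * width * out_ch)

print(f"convolutional layer:   {conv_count:,} parameters")
print(f"fully connected layer: {dense_count:,} parameters")
print(f"ratio: about {dense_count // conv_count:,}x")
```

The convolutional count does not depend on the image size at all, which is a large part of why CNNs are easier to train and to fit into embedded memory than fully connected networks of similar output size.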
9 CNN for Object Detection
[Figure: low-level features, mid-level features, high-level features, classifier]
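A low-level feature stage can be sketched as one convolution, a ReLU and a pooling step. This is only an illustration: the vertical-edge kernel below is hand-picked, whereas a trained CNN learns its filters, and later layers combine such responses into mid- and high-level features. As in most CNN frameworks, the "convolution" here is computed as a cross-correlation.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` without padding and return the response map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[y + i][x + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def relu(feature_map):
    """Keep positive responses, zero out the rest."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """Downsample by taking the maximum of each non-overlapping 2x2 window."""
    return [
        [
            max(feature_map[y][x], feature_map[y][x + 1],
                feature_map[y + 1][x], feature_map[y + 1][x + 1])
            for x in range(0, len(feature_map[0]) - 1, 2)
        ]
        for y in range(0, len(feature_map) - 1, 2)
    ]


# 6x6 test image: dark left half, bright right half (a vertical edge).
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
# Hand-picked vertical-edge kernel (an assumption; real CNN filters are learned).
edge_kernel = [[-1, 0, 1] for _ in range(3)]

features = max_pool_2x2(relu(conv2d_valid(image, edge_kernel)))
print(features)
```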
10 CNN-based Object Classification
Top 5 classes:
1. moped
2. motor scooter, scooter
3. barrow, garden cart, lawn cart, wheelbarrow
4. tricycle, trike, velocipede
5. crash helmet
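A top-5 list like the one above is typically obtained by turning the classifier's raw scores into probabilities and sorting them. In the sketch below, the five class names are from the slide, while the two extra classes and all score values are invented for illustration.

```python
import math


def softmax(scores):
    """Convert raw scores to probabilities (shifted by the maximum for numerical stability)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(labels, scores, k=5):
    """Return the k most probable (label, probability) pairs, most probable first."""
    ranked = sorted(zip(labels, softmax(scores)), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


labels = [
    "crash helmet",
    "tricycle, trike, velocipede",
    "moped",
    "park bench",                                   # hypothetical extra class
    "motor scooter, scooter",
    "barrow, garden cart, lawn cart, wheelbarrow",
    "traffic light",                                # hypothetical extra class
]
scores = [1.1, 1.9, 6.2, -0.5, 5.4, 2.7, -1.3]       # hypothetical network outputs

for rank, (label, prob) in enumerate(top_k(labels, scores), start=1):
    print(f"{rank}: {label} ({prob:.3f})")
```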
11 Deep Learning Approaches Human Levels of Accuracy
ImageNet Large Scale Visual Recognition Challenge (ILSVRC) results, classification error (the chart also marks the human error level):
- Traditional computer vision (shallow): 28%, 26%
- AlexNet, 8 layers: 16%
- ZF, 8 layers: 12%
- VGG, 19 layers: 7.3%
- GoogLeNet, 22 layers: 6.7%
- ResNet, 152 layers: 3.6%
- CUImage: 3.0%
- BDAT (2017)
ILSVRC competition:
- Research teams compete to achieve higher accuracy on several visual recognition tasks
- Algorithms must identify images belonging to one of a thousand categories
- 100% accuracy and reliability is not realistic
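The slide labels its chart only as "classification error". ILSVRC classification results are conventionally reported as top-5 error, so the sketch below assumes that metric: an image counts as correct if the true label is among the five highest-ranked guesses. The tiny dataset is invented for illustration.

```python
def top5_error(ranked_predictions, ground_truth):
    """Fraction of images whose true label is missing from the first five guesses."""
    misses = sum(
        1
        for guesses, truth in zip(ranked_predictions, ground_truth)
        if truth not in guesses[:5]
    )
    return misses / len(ground_truth)


# Hypothetical ranked guesses for four images (class names are placeholders).
ranked_predictions = [
    ["moped", "scooter", "wheelbarrow", "tricycle", "helmet", "car"],
    ["car", "truck", "bus", "van", "tram", "pedestrian"],
    ["dog", "cat", "fox", "wolf", "bear", "horse"],
    ["sign", "light", "pole", "tree", "bench", "cyclist"],
]
ground_truth = ["scooter", "pedestrian", "dog", "bench"]

print(f"top-5 error: {top5_error(ranked_predictions, ground_truth):.0%}")
```

In the second image the true label is ranked sixth, so it counts as the single miss out of four.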
12 YOLO: Object Detection and Localization
13 CNN-based Denoiser
[Figure: before and after images]
14 ICNet for Real-Time Semantic Segmentation
Source: "ICNet for Real-Time Semantic Segmentation on High-Resolution Images", Hengshuang Zhao, Xiaojuan Qi, Xiaoyong Shen, Jianping Shi, Jiaya Jia, 27 Apr
15 Neural Networks for Radar Waveform Recognition
- An automatic radar waveform recognition system to detect, track and locate low probability of intercept (LPI) radars
- The detected signals are processed into binary images, which are resized for the CNN
- The finished binary images are used in the CNN and feature extraction
Ming Zhang, Ming Diao, Lipeng Gao and Lutao Liu
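The preprocessing described above (signal image to binary image, then resize to the CNN input size) can be sketched as follows. The slide does not say how the images are produced, thresholded or resampled, so the fixed threshold, the nearest-neighbour resize and the small 4x4 input are all assumptions made for illustration.

```python
def binarize(image, threshold):
    """Map each value to 1 if it reaches the threshold, else 0."""
    return [[1 if value >= threshold else 0 for value in row] for row in image]


def resize_nearest(image, out_h, out_w):
    """Nearest-neighbour resize of a 2D list to out_h x out_w."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


# Hypothetical 4x4 energy map standing in for a processed radar signal.
energy_map = [
    [0.1, 0.2, 0.9, 0.8],
    [0.0, 0.3, 0.7, 0.6],
    [0.9, 0.8, 0.1, 0.2],
    [0.7, 0.6, 0.0, 0.3],
]

binary = binarize(energy_map, threshold=0.5)
cnn_input = resize_nearest(binary, 8, 8)   # 8x8 is an assumed CNN input size
for row in cnn_input:
    print("".join(str(v) for v in row))
```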
16 Choosing a Processor for Embedded Vision
[Figure: embedded vision at the intersection of computer vision, machine learning and embedded systems]
Embedded systems: on-chip, cost efficient, energy efficient, real-time
Processor options, each rated on performance, power and area:
- GPU. Leading GPU vendor: GoogLeNet at 138 fps / 119 W
- FPGA
- VDSP
- VDSP + CNN. Synopsys EV6x: GoogLeNet at 200 fps / 0.5 W (16 nm FFC)
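The two GoogLeNet figures quoted on this slide can be put on a common scale as frames per second per watt. The sketch uses only the numbers from the slide; measurement conditions (batch size, precision, what the power figure includes) are not stated there, so the ratio is indicative rather than a like-for-like benchmark.

```python
def fps_per_watt(fps: float, power_w: float) -> float:
    """Energy efficiency expressed as frames per second per watt."""
    return fps / power_w


gpu_efficiency = fps_per_watt(138, 119)    # leading GPU vendor, as quoted
ev6x_efficiency = fps_per_watt(200, 0.5)   # Synopsys EV6x in 16 nm FFC, as quoted

print(f"GPU:  {gpu_efficiency:.2f} fps/W")
print(f"EV6x: {ev6x_efficiency:.0f} fps/W")
print(f"ratio: about {ev6x_efficiency / gpu_efficiency:.0f}x")
```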
17 DesignWare EV6x
18 DesignWare EV6x Embedded Vision Processor IP
Scalable hardware-software solution for high accuracy vision processing
Wide vector DSP processing:
- 8-, 16- and 32-bit datatypes
- Up to 776 OPs/cycle (16b)
- Up to 256 MACs/cycle (16b)
- Easy OpenCL C programming
Easy system integration:
- Control and communication
- Low area and power
- C/C++ programming
High-performance CNN engine (scalable: 880, 1760 or 3520 MAC engine):
- Up to 3520 MACs/cycle
- Dedicated memory architecture, DMA
- Multi-dimension parallelism
- Supports 8- or 12-bit processing
- Automatic programming tools
Shared memory:
- Low latency access from all EV cores and CNN engine
- Shared memory visible to host
Fast, easy SoC connection:
- To host processor
- To access frame data
Background memory access:
- Load next frame in advance from on- or off-chip frame buffer
Software: MetaWare EV compilers / debuggers (C/C++, OpenCL C), simulators (fast NSIM, EV VDK), CNN mapping tool, libraries (OpenCV) and API (OpenVX)
[Block diagram: Vision CPU (1/2/4 cores) with 32-bit scalar, SFPU, vector DSP and VFPU; CNN engine with Convolution (Conv. 2D), Classification (Conv. 1D) and DMA; Sync & Debug; Streaming Transfer Unit; shared memory; AXI interconnect]
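The MACs/cycle figures on this slide become throughput numbers once a clock frequency is chosen. The slide does not give one, so the 1 GHz clock below is purely an assumption for illustration. The MAC counts (880, 1760 and 3520 for the CNN engine, 256 for the vector DSP at 16 bits) are from the slide.

```python
def peak_tmacs(macs_per_cycle: int, clock_hz: float) -> float:
    """Peak multiply-accumulate throughput in TMAC/s."""
    return macs_per_cycle * clock_hz / 1e12


ASSUMED_CLOCK_HZ = 1.0e9              # hypothetical clock, not from the slide
CNN_ENGINE_CONFIGS = (880, 1760, 3520)
VECTOR_DSP_MACS_16B = 256

peaks = {macs: peak_tmacs(macs, ASSUMED_CLOCK_HZ) for macs in CNN_ENGINE_CONFIGS}
for macs, tmacs in peaks.items():
    print(f"CNN engine with {macs} MACs/cycle: {tmacs:.2f} TMAC/s peak")

dsp_peak = peak_tmacs(VECTOR_DSP_MACS_16B, ASSUMED_CLOCK_HZ)
print(f"Vector DSP (16b): {dsp_peak:.3f} TMAC/s peak")
print(f"Largest CNN engine vs vector DSP: {CNN_ENGINE_CONFIGS[-1] / VECTOR_DSP_MACS_16B:.2f}x MACs per cycle")
```

The MACs-per-cycle ratio is independent of the assumed clock, which is why it is the more robust of the printed numbers.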
19 Denoiser Filter Results
Close comparison of 12-bit and 8-bit accuracy, original versus denoised
- PSNR: Peak Signal to Noise Ratio
- SSIM: Structural SIMilarity index
- ED: Euclidean Difference
SSIM: 93.5% SSIM: 81.7% Original Noise added PSNR: 28.59, SSIM: , ED: Denoised (12b Fixed Point) PSNR: 21.98, SSIM: , ED: Denoised (8b Fixed Point)
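PSNR, the first metric named on this slide, is computed from the mean squared error between a reference and a test signal. The sketch below also shows why bit width matters, under a deliberately simplified assumption: it quantizes the signal values themselves to 8 and 12 bits, whereas the slide's denoiser runs its CNN arithmetic in fixed point. The ramp signal is synthetic.

```python
import math


def psnr(reference, test, peak=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / mse)


def quantize(values, bits):
    """Round values in [0, 1] to the nearest level representable with `bits` bits."""
    levels = (1 << bits) - 1
    return [round(v * levels) / levels for v in values]


# Synthetic ramp from 0.0 to 1.0 (an assumption standing in for image data).
signal = [i / 1000 for i in range(1001)]

psnr_8b = psnr(signal, quantize(signal, 8))
psnr_12b = psnr(signal, quantize(signal, 12))
print(f"8-bit:  {psnr_8b:.1f} dB")
print(f"12-bit: {psnr_12b:.1f} dB")
```

Each extra bit halves the quantization step, so the 12-bit version tracks the reference more closely and scores a higher PSNR.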
20 EV6 Vector DSP and CNN Engine: Benefits of Specialization
DesignWare EV62 + CNN1760:
- Scalar units, vector DSPs, CNN engine with 1760 12b MACs
- More MACs, higher accuracy, for same area
Competing implementations:
- Option A: scalar unit, vector unit, CNN engine
- Option B: scalar unit, vector unit
Role of each unit:
- Scalar: running stacks, decision making, etc.
- Vector: pixel processing, scaling / pyramids, filtering, etc.
- CNN engine: object classification, detection, localization, scene segmentation, etc.
21 Summary
Fast growing automotive applications: ADAS and autonomous driving
- Deep learning techniques, like convolutional neural networks, offer the highest accuracy for object classification, detection, and scene segmentation
- CNN is replacing traditional computer vision algorithms
- Specialized CNN architecture offers area and power efficiencies, and higher accuracy for image quality improvement applications
DesignWare EV6x processors:
- Unified multicore processor for automotive vision processing
- Scalar + vector DSP + CNN engine
- State-of-the-art convolutional neural network (CNN)
22 Thank You
More informationSDA: Software-Defined Accelerator for Large- Scale DNN Systems
SDA: Software-Defined Accelerator for Large- Scale DNN Systems Jian Ouyang, 1 Shiding Lin, 1 Wei Qi, Yong Wang, Bo Yu, Song Jiang, 2 1 Baidu, Inc. 2 Wayne State University Introduction of Baidu A dominant
More informationImplementing Deep Learning for Video Analytics on Tegra X1.
Implementing Deep Learning for Video Analytics on Tegra X1 research@hertasecurity.com Index Who we are, what we do Video analytics pipeline Video decoding Facial detection and preprocessing DNN: learning
More informationDeep learning for object detection. Slides from Svetlana Lazebnik and many others
Deep learning for object detection Slides from Svetlana Lazebnik and many others Recent developments in object detection 80% PASCAL VOC mean0average0precision0(map) 70% 60% 50% 40% 30% 20% 10% Before deep