An introduction to Machine Learning silicon


An introduction to Machine Learning silicon November 28 2017

Insight for Technology Investors

AI/ML terminology

Artificial Intelligence, Machine Learning, Deep Learning, Algorithms: CNNs, RNNs, etc.

Additional terms

Location
  Cloud: processing done in data farms
  Edge: processing done in local devices

Types of machine learning
  Model: a mathematical approximation of a collection of input data
  Training: in deep learning, data-sets are used to create a model
  Inference: using a model to check against new data

Neural Networks (NNs) outperform humans

[Chart: classification error by year, with the human error level marked for comparison. Data for ImageNet Large Scale Visual Recognition Challenge. Image source: Synopsys]

  2010: 28% (shallow)
  2011: 26% (shallow)
  2012: 16% (AlexNet, 8 layers)
  2013: 12% (ZF, 8 layers)
  2014: 7.3% (VGG, 19 layers)
  2014: 6.7% (GoogleNet, 22 layers)
  2015: 3.6% (ResNet, 152 layers)
  2016: 3% (CUImage)

Deep networks, introduced in 2012, resulted in big improvements.
Error rates have now stabilized at ~3%.

Machine Learning training

[Diagram: Training data → Model]

For each piece of data used to train the model, millions of model parameters are adjusted. The process is repeated many times until the model delivers satisfactory performance.
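The training loop described on this slide can be sketched in a few lines. This is a minimal illustration only, assuming a toy model with two parameters (`w` and `b`) instead of millions, a made-up data set drawn from y = 2x + 1, and plain stochastic gradient descent as the adjustment rule; the function name `train` and the learning rate are hypothetical choices, not anything taken from the presentation.

```python
def train(data, epochs=200, lr=0.05):
    """Fit y ~ w*x + b by nudging the parameters once per training example."""
    w, b = 0.0, 0.0                      # model parameters, initially untrained
    for _ in range(epochs):              # the process is repeated many times
        for x, y in data:                # for each piece of training data...
            err = (w * x + b) - y        # compare the model's output to the target
            w -= lr * err * x            # ...adjust every parameter a little
            b -= lr * err
    return w, b


# Hypothetical training set: 21 points sampled from y = 2x + 1 (assumption).
data = [(k / 10.0, 2.0 * (k / 10.0) + 1.0) for k in range(-10, 11)]
w, b = train(data)
print(f"learned w={w:.3f}, b={b:.3f}")
```

A real deep learning model follows the same pattern, but the per-example adjustment touches millions of parameters, which is why training is so much more compute-intensive than inference.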

Machine Learning inference

[Diagram: Input → Model → Output, with example outputs labelled "97.4% confidence" and "96.4% confidence"]

When new data is presented to the trained model, large numbers of multiply-add operations are performed using the new data and the model parameters. The process is performed once.
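To make the multiply-add description concrete, here is a small sketch of a single inference pass. It assumes a hypothetical one-layer classifier with hand-picked parameters (`WEIGHTS`, `BIASES`) and uses a softmax to turn scores into a confidence value; none of these names or numbers come from the presentation.

```python
import math


def dense(inputs, weights, biases):
    """One fully connected layer; returns the outputs and the multiply-add count."""
    outputs, macs = [], 0
    for row, bias in zip(weights, biases):
        acc = bias
        for x, w in zip(inputs, row):
            acc += x * w                 # one multiply-add per (input, parameter) pair
            macs += 1
        outputs.append(acc)
    return outputs, macs


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical trained parameters: 3 inputs, 2 output classes (assumption).
WEIGHTS = [[0.5, -1.0, 2.0], [-0.5, 1.0, -2.0]]
BIASES = [0.1, -0.1]


def infer(inputs):
    """Run the model once on new data; return (class index, confidence, MAC count)."""
    scores, macs = dense(inputs, WEIGHTS, BIASES)
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best], macs


label, confidence, macs = infer([1.0, 0.0, 1.0])
print(f"class {label}, confidence {confidence:.1%}, {macs} multiply-adds")
```

Unlike training, the parameters are only read, never updated, and the data passes through the model a single time. The multiply-add count grows with the number of parameters, which is the workload that GPUs and accelerators are designed to parallelise.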

Why is on-device ML driving AI to the Edge?

  Bandwidth
  Power
  Cost
  Latency
  Privacy

Inference everywhere

  Mobile
  Automotive
  Robotics
  Drones
  IoT
  Surveillance
  Augmented reality
  Shipping & logistics

Processor options for Machine Learning workloads

A System-on-Chip contains multiple compute engines

Main processor (CPU)
  A versatile compute engine for running rich software. The main CPU runs the device's operating system, applications and user interface. It also manages the flow of data to specialist processors in the device.

Graphics processor (GPU)
  Used for generating 2D/3D images and executing highly-parallelised workloads such as neural network arithmetic.

Digital signal processors (DSPs)
  A specialist form of CPU, optimised for analysing waveforms. Useful for radio control, sensor readings, audio and image processing.

Accelerators
  Heavily-optimised data processors for frequently-used tasks, e.g. encryption, video, computer vision.

Comparing processor options for Machine Learning

[Table: each processor option is marked as weak or good, relative to alternatives, for training and for inference]

Rows: CPU, DSP, GPU, Accelerator, FPGA
Column groups: Training, Inference
Criteria: Usability, Hardware cost, Power efficiency, Hardware cost, Power efficiency, Flexibility, Programmability

Notes (1 to 3):
  1 = High volume, evolving workload
  2 = High volume, stable workload
  3 = Low volume, evolving workload

Notes on GPU (1, 2):
  1 = A client device that requires a GPU for graphics
  2 = A device that uses a GPU for ML work only

Processor options for various sizes of chip

Machine Learning demands (accuracy, response time) vary by use case.
All use cases can default to a CPU.
A GPU is often a good all-rounder solution.
Accelerators are useful when it is essential to either maximize response speed or minimize power consumption.

[Chart axes: performance; silicon area / power consumption. Processors shown: Cortex-M, Cortex-A (little CPU), Cortex-A (big CPU) and GPU, with accelerators alongside. Use cases shown: keyword detection, speech recognition, visual object recognition, visual object detection.]

Arm's ML computing platform

[Diagram: software and hardware stack for edge devices, with each component marked as provided by Arm or provided by third-party]

  Applications: AI applications (ML, CV, speech recognition, etc.)
  Neural network frameworks (e.g. TensorFlow, Caffe, AndroidNN)
  Optional Spirit libraries & model sets
  Stable SW interfaces
  Compute library
  Spirit metadata library
  Arm DS-5 / Keil tools / compilers / drivers
  SVE, CPU, CPU, GPU
  Spirit Computer Vision
  Partner IP: DSPs, FPGAs, accelerators

Machine Learning is driving all of Arm's technology roadmap

  Processor design
  Software support
  Computer vision

The Arm trademarks featured in this presentation are registered trademarks or trademarks of Arm Limited (or its subsidiaries) in the US and/or elsewhere. All rights reserved. All other marks featured may be trademarks of their respective owners. www.arm.com/company/policies/trademarks