Achieving on Mobile Devices

Similar documents
Snapdragon NPE Overview

Preparing for Mass Market Virtual Reality: A Mobile Perspective. Qualcomm Technologies, Inc. September 16, 2017

Perform. Travis Lanier Sr. Director, Product Management Qualcomm Technologies,

The Mobile Future of extended Reality (XR) Hugo Swart Senior Director, Product Management Qualcomm Technologies, Inc.

The Future of Mobility. Keith Kressin Senior Vice President, Product Management Qualcomm Technologies,

Immersion. Tim Leland Vice President, Product Management Qualcomm Technologies,

Making XR a reality for everyone

Emerging Vision Technologies: Enabling a New Era of Intelligent Devices

Qualcomm Snapdragon Technologies

Qualcomm Snapdragon 450 Mobile Platform

IoT with 5G Technology

Leading the world to 5G

Ultra-low Power Always-On Computer Vision

Welcome to the 5G age

Making always-on vision a reality. Dr. Evgeni Gousev Sr. Director, Engineering Qualcomm Technologies, Inc. September 22,

Machine Learning for Selected SI & PI Problems. Timothy Michalka Sr. Director, Engineering Qualcomm Technologies, Inc. 18-Oct-2017

New Technologies for UAV/UGV

November, Qualcomm s 5G vision Qualcomm Technologies, Inc. and/or its affiliates.

Qualcomm Snapdragon 710 mobile platform

5G Design and Technology. Durga Malladi SVP Engineering Qualcomm Technologies, Inc. October 19 th, 2016

Making 5G NR a commercial reality

Smartphone-powered future

Mobile: the foundation of the digital economy

An introduction to Machine Learning silicon

Mobile technology: A catalyst for change. Kedar Kondap Vice President, Product Management Qualcomm Technologies, Inc.

RISC-V: Opportunities and Challenges in SoCs

Making an on-device personal assistant a reality

Future Networked Car Geneva Auto Show 2018

Global 5G spectrum update

Making Mobile 5G a Commercial Reality. Peter Carson Senior Director Product Marketing Qualcomm Technologies, Inc.

A New Foundation for the Connected Home. Qualcomm Technologies, Inc. June

The promise of higher spectrum bands for 5G. Rasmus Hellberg PhD Senior Director, Technical Marketing Qualcomm Technologies, Inc.

802.11ax: Meeting the demands of modern networks. Gopi Sirineni, Vice President Qualcomm Technologies, Inc. April 19,

Making 5G NR a reality

Hear, Listen now Speak

Heterogeneous Computing Made Easy:

Enabling a Richer Multimedia Experience with GPU Compute. Roberto Mijat Visual Computing Marketing Manager

May 2015 Qualcomm Technologies, Inc Qualcomm Technologies, Inc. and/or its affiliated companies. All Rights Reserved.

5 GHz for consumers. Guillaume Lebrun Director 7 th June 2016

5G Spectrum Access. Wassim Chourbaji. Vice President, Government Affairs and Public Policy EMEA Qualcomm Technologies Inc.

Innovative Wireless Technologies for Mobile Broadband

5G and Automotive Cellular Vehicle-to-Everything (C-V2X) March 2017

Embedded Software: Its Growing Influence on the Hardware world

Making 5G NR a reality

How Qualcomm Wireless Reach M&E Catalyzes SGBs. Lauren H Reed Staff Analyst Government Affairs 1

Date: 13 June Location: Sophia Antipolis. Integrating the SIM. Dr. Adrian Escott. Qualcomm Technologies, Inc.

24th MONDAY. Overview 2018

Towards 5G NR Commercialization

Way-Shing Lee Vice President, Technology Qualcomm Technologies, Inc. July 16, Expanding mobile technologies for the Internet of Things

Inference Optimization Using TensorRT with Use Cases. Jack Han / 한재근 Solutions Architect NVIDIA

Snapdragon S4 System on Chip

VR Development Platform

TEXAS INSTRUMENTS DEEP LEARNING (TIDL) GOES HERE FOR SITARA PROCESSORS GOES HERE

How to Build Optimized ML Applications with Arm Software

Qualcomm AllPlay Smart Media Platform

Building the 5G future Cristiano Amon President, Qualcomm Incorporated

Smart Wearables: Enriching The Digital Sixth Sense. Pankaj Kedia Sr. Director & Business Lead, Smart Wearables Segment Qualcomm Technologies, Inc.

Comprehensive Arm Solutions for Innovative Machine Learning (ML) and Computer Vision (CV) Applications

Qualcomm WiPower Flexible Wireless Charging

Spectrum for 4G and 5G. Qualcomm Technologies, Inc. October, 2016

Qualcomm press conference

How will 5G transform Industrial IoT?

How to Build Optimized ML Applications with Arm Software

The Internet of Everything. Pete Lancia Sr. Dir., Marketing

Mobile AI. Mérouane Debbah Mathematical and Algorithmic Sciences Lab, Huawei, France. 11 h of June, 2018


Machine learning for the Internet of Things

Autonomous Driving Solutions

Open Standards for Vision and AI Peter McGuinness NNEF WG Chair CEO, Highwai, Inc May 2018

The Benefits of GPU Compute on ARM Mali GPUs

Barclay s 2009 Worldwide Wireless and Wireline Conference

Artificial Intelligence Enriched User Experience with ARM Technologies

GTC Interaction Simplified. Gesture Recognition Everywhere: Gesture Solutions on Tegra

Sanjeev Athalye, Sr. Director, Product Management Qualcomm Technologies, Inc.

5G NR to high capacity and

LINARO CONNECT 23 HKG18 George Grey, Linaro CEO

A NEW COMPUTING ERA JENSEN HUANG, FOUNDER & CEO GTC CHINA 2017

Enabling the A.I. Era: From Materials to Systems

NVIDIA DGX SYSTEMS PURPOSE-BUILT FOR AI

Leading the World to 5G NR

Deep Learning on Arm Cortex-M Microcontrollers. Rod Crawford Director Software Technologies, Arm

BOUNDLESS XR WITH PHOTOREALISTIC VISUALS

Movidius Neural Compute Stick

Unleashing the benefits of GPU Computing with ARM Mali TM Practical applications and use-cases. Steve Steele, ARM

The Path to Embedded Vision & AI using a Low Power Vision DSP. Yair Siegel, Director of Segment Marketing Hotchips August 2016

New York Analyst Day. November 13, 2008

Snapdragon S4 System on Chip

Spectrum for 4G and 5G. Qualcomm Technologies, Inc. July, 2017

CSR102x Bluetooth Smart Product Line Overview

Internet of Everything Qualcomm Brings M2M to the World

CSR102x Starter Development Kit

IoT Market: Three Classes of Devices

mbed OS Update Sam Grove Technical Lead, mbed OS June 2017 ARM 2017

From Boolean Algebra to Smart Glass

DesignWare IP for IoT SoC Designs

NeoNet: Object centric training for image recognition

Bring Intelligence to the Edge with Intel Movidius Neural Compute Stick

Wu Zhiwen.

Neural Network Exchange Format

The need for collaboration across industries in Sweden. Torbjörn Lundahl Research Director at Ericsson Research Program manager 5G for Sweden

Transcription:

March 2018 @qualcomm_tech Achieving AI @Scale on Mobile Devices Qualcomm Technologies, Inc.

Mobile is the largest computing platform in the world > 8.5 Billion Cumulative smartphone unit shipments forecast between 2017 2021 Source: Gartner Mar. 17 2

30 Years of driving the evolution of wireless #1 Fabless semiconductor company #1 in 3G/4G LTE modem 842M MSM chipsets shipped FY 16 Source: Qualcomm Incorporated data. Currently, Qualcomm semiconductors are products of Qualcomm Technologies, Inc. or its subsidiaries. MSM is a product of Qualcomm Technologies, Inc. IHS, Mar. 17; Strategy Analytics, Mar. 17 3

Modem 3G (CDMA, GSM) 4G/LTE 5G Connectivity Bluetooth Smart Bluetooth Mesh 802.11ac/ax 802.11ad 802.11n 802.15.4 DSRC Powerline GNSS/Location Peripherals (PCIe, USB.) Qualcomm Technologies success is based on Technology leadership Connectivity RF Power amps Acoustic filters RF switches Other Machine learning NoC/MemCtrl Power management Computing Security Touch Fingerprint Processors CPU GPU DSP Multimedia Video Speech Display processing Media processing Camera processing Audio processing Computer Vision AR / VR 4

Chipset complexity is growing dramatically More frequent and more complex design cycles Comparative scale Circa 2000 <10 Million Transistors on a chip 1 chipset per year Today >3 Billion Transistors on a chip ~10 chipsets per year Qualcomm Snapdragon is a product of Qualcomm Technologies, Inc. Source: Qualcomm internal data 5

A mobile processor today Snapdragon 835 Highly integrated and complex SoC using 10nm process technology Snapdragon X16 LTE World s first announced gigabit-class LTE modem Snapdragon X16 LTE modem Wi-Fi Adreno 540 Graphics Processing Unit (GPU) Qualcomm Adreno Visual Processing 25% faster graphics rendering 60x more display colors* Qualcomm Hexagon DSP Snapdragon Neural Processing Engine support Hexagon 682 DSP HVX All-Ways Aware Display Processing Unit (DPU) Qualcomm Spectra 180 Camera Video Processing Unit (VPU) Qualcomm Spectra Camera ISP Smooth zoom Fast-autofocus True to life colors Qualcomm Aqstic Audio Kryo 280 CPU Qualcomm Kryo 280 CPU Our most power efficient architecture to date Qualcomm Location Suite Qualcomm Mobile Security Qualcomm Mobile Security First to support full biometric suite * Compared to Snapdragon 820 Qualcomm Adreno, Qualcomm Kryo, Qualcomm Hexagon, Qualcomm Spectra, Qualcomm Aqstic, Qualcomm Location Suite, Qualcomm All-Ways Aware, and Qualcomm Mobile Security are products of Qualcomm Technologies, Inc. 6

Smart homes Mobile computing Mobile scale changes everything Smart cities Automotive Healthcare Industrial IoT Networking Wearables Extended reality Rapid replacement cycles Superior scale Integrated and optimized technologies 7

Intelligence is moving to the device Server/Cloud Training Execution/Inference Devices Execution/Inference Training (emerging) 8

On-device intelligence is paramount Process data closest to the source, complement the cloud Privacy Reliability Low latency Efficient use of network bandwidth 9

Mobile is becoming the pervasive AI platform 10

Qualcomm Technologies is accelerating on-device AI Making efficient on-device machine learning possible for highly responsive, private, and intuitive user experiences 11

Qualcomm Artificial Intelligence Platform The platform for efficient on-device machine learning Audio intelligence Visual intelligence Intuitive security A high-performance platform designed to support myriad intelligenton-device-capabilities that utilize: Qualcomm Snapdragon mobile platform s heterogeneous compute capabilities within a highly integrated SoC Innovations in machine learning algorithms and enabling software Development frameworks to minimize the time and effort for integrating customer networks with our platform Qualcomm Artificial Intelligence Platform and Qualcomm Snapdragon are products of Qualcomm Technologies, Inc. 12

Making on-device intelligence pervasive Focusing on high performance HW/SW and optimized network design Efficient hardware Developing heterogeneous compute to run demanding neural networks at low power and within thermal limits Selecting the right compute block for the right task Algorithmic advancements Algorithmic research that benefits from state-of-the-art deep neural networks Optimization for space and runtime efficiency Software tools Software accelerated run-time for deep learning SDK/development frameworks 13

Algorithmic enhancements for space and runtime efficiency Improve performance by addressing model complexity Neural network optimizations for embedded Required operations per image Improved network architecture 40000 Focus on memory and storage Reduce bit widths Model compression Leverage sparsity GOPS 30000 20000 27x Reduction in complexity Architecture learning 10000 0 Original QCOM QTI V1 V1 QCOM QTI V2 V2 QCOM QTI V3 V3 Network versions Source: Qualcomm Research. Results from running a semantic segmentation network for automotive. 14

Snapdragon Neural Processing SDK Software accelerated runtime for the execution of deep neural networks on device Qualcomm Kryo CPU Qualcomm Adreno GPU Qualcomm Hexagon DSP Efficient execution on Snapdragon Takes advantage of Snapdragon heterogeneous computing capabilities Runtime and libraries accelerate deep neural net processing on all engines: CPU, GPU, and DSP with vector extensions Model framework/network support Convolutional neural networks and LSTMs Support for Caffe/Caffe2, TensorFlow, and user/developer defined layers Optimization/Debugging tools Offline network conversion tools Offline conversion tools Analyze performance Sample code Ease of integration Debug and analyze network performance API and SDK documentation with sample code Ease of integration into customer applications Qualcomm Kryo, Qualcomm Adreno and Qualcomm Hexagon are products of Qualcomm Technologies, Inc. Available at: developer.qualcomm.com 15

Robust software and tools simplify development Object classification Mobile GoogleNet/ Inception NP SDK Face detection Scene segmentation Auto SSD Alexnet Natural language understanding Home ResNet SqueezeNet Speaker recognition Camera Security/Authentication XR Resource management 16

Model to runtime workflow: Training and inference Training: Machine learning experts build and train their network to solve their particular problem Model building and training Backward propagation No Deep learning frameworks (e.g., Caffe2, TensorFlow) Training data Test data Data 1 Data 1 Model testing Good enough match? Yes Model file Static weights and biases Inference: NPE enables the network to run on Snapdragon devices NPE Enabled App Application.dlc file NPE runtime Optimized NPE Model (.dlc file) Model optimization tools NPE Model (.dlc file) Model conversion tools Optional (quantization, compression, etc.) 17

High-level software architecture for the Snapdragon NPE Application code User developed NPE API Model loader Future OS Application NPE library Compute runtimes Profiling logging Symphony CPU Float QSML User defined layers Float GPU Half float OpenCL driver Model debug Fixed 8 DSP DL container Fixed 16 FastRPC driver OS support: ARM Android ARM Linux x86 Linux User space Kernel space GPU kernel mode driver DSP kernel mode driver HW Kryo CPU Adreno GPU Hexagon DSP 18

User Defined Layer (UDL) workflow Supports prototyping of layers not yet supported by the Snapdragon NPE NPE workflow Model file (static weights and biases) Model conversion tool NPE model (.dlc file) Model conversion tool.dlc file NPE runtime NPE workflow with UDL Model file (static weights and biases) UDL Model conversion tool UDL handler NPE model (.dlc file) UDL Model Conversion Tool.dlc file UDL NPE runtime User-provided implementation User-defined layer weights and parameters UDL implementation Note: there is runtime performance cost to inserting a UDL into a network, related to context switching. Execution context is transferred to user in CPU control plane. 19

Converting and quantizing a model snpe-<framework>-to-dlc snpe-dlc-optimize Native model (e.g., Caffe) Model conversion tool NPE model (.dlc file) Model optimization tool NPE model (.dlc file) Only needed for certain features (e.g., activation quantization) Images Images Images Optional (quantization, compression, etc.) NPE enabled app snpe-<framework>-to-dlc (<framework> = Caffe Caffe2 TensorFlow) Input is the model in native framework format Output is a converted but not optimized NPE DLC file snpe-dlc-optimize Converts non-quantized DLC models into 8-bit quantized DLC models Additionally implements further optimizations such as SVD compression Quantized model is necessary for fixed-point NPE runtimes (e.g. DSP) 20

API example usage Application NPE NPE provides a simple C++ API with the following functionality Load a DLC model and select the runtime Execute the model Debug support Dump the output of all layers in a model Initialization Repeat new SNPEBuilder ( DLContainer ) Get input data and prepare InputTensor SNPEBuilder Object(SBO) SBO->setRunTimeProcessor( RuntimeMode ) SBO->setUdlBundle( UdlFactoryContextBundle ) SBO->build( ) UdlFactory( UdlContext ) IUDL object (IUDL) IUDL::Setup( ) Snapdragon NPE Object (Snapdragon NPE) See SNPEBuilder.hpp for all builder APIs For each UDL read from DLContainer Collect performance metrics InputTensor = SNPEFactory: :gettensorfactory().createtensor() Per-layer timing Green API calls are only required for UDLs Run time SNPE->execute( InputTensor, OutputTensorMap ) IUDL->execute( InputTensor, OutputTensor ) Process OutputTensorMap Process network See NativeCpp/BatchRun/ example in NPE 21

100s of Millions of units AI + Qualcomm Technologies Massive scale Learn more at: developer.qualcomm.com Based on Snapdragon Platforms capable of supporting Snapdragon NPE today. 22

Qualcomm Technologies and Facebook collaboration for massive AI scale 23

Facebook + Qualcomm Technologies On-device AI with Snapdragon Facebook and Qualcomm s Caffe2 collaboration Demonstrated Caffe2 acceleration with NPE at F8 2017 5x performance upside on GPU (compared to CPU) Announced commercial support of Caffe2 in July through Qualcomm Developer Network Facebook AML has integrated the NPE with Caffe2 Future Caffe2/NPE research and development Continue to work closely with Facebook to optimize key networks for maximum ondevice performance Enhancements to Caffe2 allowing Snapdragon specific SoC optimizations More advanced AI-powered XR applications On-device machine learning is made possible by the Qualcomm Snapdragon NPE which does the heavy lifting needed to run neural networks more efficiently on Snapdragon devices. Source: XDA F8 2017 24

Enhancing the Facebook experience through on-device AI More engaging social media with AI and AR Augmented reality features potentially powered by AI Style transfer and filters Frames and masks Photo and live videos, including 360 Contextual awareness (e.g. location/sensor metadata) On-device acceleration benefits Smooth UI with increased frame rate Increased battery life *Requires network connection and will support up to 20 hours of battery life 25

What s next 26

Distributed computing architectures AI hardware What does the future look like? AI acceleration research 27

AI offers enhanced experiences and new capabilities for smartphones True personal assistance Superior photography Extended battery life Natural user interfaces Enhanced connectivity Enhanced security A new development paradigm where things repeatedly improve 28

AI will bring XR closer to the ultimate level of immersion Creating physical presence in real or imagined worlds Visuals Rendering techniques Interactions Natural UI Depth estimation Battery efficiency Managing workload concurrency Sounds Audio filtering and cleanup 29

AI is revolutionizing the car of the future Redefining the in-car experience Natural user interfaces Personalization Driver awareness monitoring Paving the road to autonomy Surround view perception Sensor fusion Path planning Decision making 30 30

Reference What s next? Specialized hardware Algorithmic advancements Improved optimization strategies 31

Thank you! Follow us on: For more information, visit us at: www.qualcomm.com & www.qualcomm.com/blog Nothing in these materials is an offer to sell any of the components or devices referenced herein. 2018 Qualcomm Technologies, Inc. and/or its affiliated companies. All Rights Reserved. Qualcomm is a trademark of Qualcomm Incorporated, registered in the United States and other countries. Other products and brand names may be trademarks or registered trademarks of their respective owners. References in this presentation to Qualcomm may mean Qualcomm Incorporated, Qualcomm Technologies, Inc., and/or other subsidiaries or business units within the Qualcomm corporate structure, as applicable. Qualcomm Incorporated includes Qualcomm s licensing business, QTL, and the vast majority of its patent portfolio. Qualcomm Technologies, Inc., a wholly-owned subsidiary of Qualcomm Incorporated, operates, along with its subsidiaries, substantially all of Qualcomm s engineering, research and development functions, and substantially all of its product and services businesses, including its semiconductor business, QCT.