Lucida Sirius and DjiNN Tutorial

1 Lucida Sirius and DjiNN Tutorial
Speakers: Johann Hauswald, Michael A. Laurenzano, Yunqi Zhang
Organizers: Johann Hauswald, Michael A. Laurenzano, Yunqi Zhang, Lingjia Tang, Jason Mars

2 Before We Begin
Conference wifi
  SSID: HPCA_PPoPP_CGO
  Password: BARCELONA2016
Suggested prerequisites
  VirtualBox
  Chrome
  VM with demo software: lucida-djinn-tutorial.tgz (USB sticks)
Hardware resources
  ~6GB RAM, 50GB disk

3 Schedule
9:00-9:15 Welcome
Section 1 - Lucida
  9:15-9:30 Introduction to Lucida
Section 2 - Lucida Suite
  9:30-9:45 Core Algorithmic Components of IPAs
  9:45-10:00 Hands-on: Lucida Suite
Section 3 - DjiNN and Tonic
  10:30-10:45 Deep Learning in Intelligent Web Services
  10:45-11:00 Tonic Suite: Background and Hands-on
Section 4 - LucidaEco
  11:00-11:15 Introduction to LucidaEco
  11:15-12:00 Hands-on: Build Your Own IPA

4 Topic: Introduction to Lucida Speaker: Johann Hauswald


6 Rise of the Wearables
[Figure callouts: 40%, $80bn]

7 Questions Arise
What are the impacts on our data centers?
Can we scale to millions (or billions) of users?
How do we design the right systems?

8 The Hard Place: Build an end-to-end intelligent personal assistant.
Sirius [ASPLOS 15]

9 [Figure: end-to-end IPA pipeline. Components: Mobile Users, Server, Automatic Speech-Recognition, Query Classifier, Question-Answering, Image Matching, Image Database, Search Database, Execute Action, Display Answer. Data labels: Voice, Image, Image Data, Question, Action, Question or Action, Answer.]

10 Lucida Hierarchy
Users
Query Taxonomy: Voice Command (VC), Voice Query (VQ), Voice-Image Query (VIQ)
IPA Services: Automatic Speech Recognition (ASR), Question Answering (QA), Image Matching (IMM)
Tasks: Signal Processing, Natural Language Processing, Image Processing
Algorithmic Components
  Signal Processing: HMM/GMM or HMM/DNN
  Natural Language Processing: Regular Expression, Stemmer, Conditional Random Fields
  Image Processing: Feature Extraction, Feature Description

11 Three Horsemen of Lucida
ASR
  Speech -> Feature Extraction -> Feature Vectors -> Speech Decoder -> "Who was elected 44th President?"
  Speech Decoder: HMM, DNN scoring or GMM scoring, scored HMM states, Viterbi Search
  Trained Data: Acoustic Model, Language Model, Word Dictionary
  DNN: Input Layer, Hidden Layers, Output Layer
IMM
  Image -> SURF Feature Extraction -> Keypoints -> SURF Feature Description -> Image Descriptors -> Descriptor Database
  Feature Extraction: Calculate Hessian Matrix, Build Scale-Space, Find Keypoints
  Feature Description: Haar Wavelet, Orientation Assignment, Keypoint Descriptor
QA
  "Who was elected 44th president?" -> Input Filter -> Document Selector (Websearch) -> Document Filters -> Scores -> "Barack Obama"
  Regex: ^[0-9,th]$ (example token: 44th)
  Stemmer: elected -> elect (-ed)
  CRF: elected (VERB), 44th (NUM), president (N)

12 Project Status
Recent publication: BayMax [ASPLOS 16]
Open source community: lucida-ai.slack.com
Student projects
  Independent studies
  Summer interns

13 Topic: Core Algorithmic Components of IPAs Speaker: Yunqi Zhang

14 How does Lucida work?
Users
Query Taxonomy: Voice Command (VC), Voice Query (VQ), Voice-Image Query (VIQ)
IPA Services: Automatic Speech Recognition (ASR), Question Answering (QA), Image Matching (IMM)
[Figure callout: CMU Sphinx]

15 How does Lucida work?
Users
Query Taxonomy: Voice Command (VC), Voice Query (VQ), Voice-Image Query (VIQ)
IPA Services: Automatic Speech Recognition (ASR), Question Answering (QA), Image Matching (IMM)
Tasks: Signal Processing, Natural Language Processing, Image Processing

16 How does Lucida work?
Tasks: Signal Processing, Natural Language Processing, Image Processing
Algorithmic Components
  Gaussian Mixture Model (GMM) or Deep Neural Network (DNN): 85% or 78%
  Regular Expression, Stemmer, Conditional Random Fields: 85%
  Feature Extraction, Feature Description: 97%
7 kernels: 92% of total execution of Lucida

17 What is Lucida Suite?
Suite of 7 workloads extracted from the end-to-end Lucida intelligent personal assistant
Suite includes:
  Standalone workloads
  Pretrained models (when applicable)
  Inputs for all workloads
Available at:

18 Objectives of Lucida Suite
Represent emerging datacenter workloads
  Speech recognition, computer vision, natural language processing
Cover key components in the application
  Suite represents 92% of the total cycles
Implementations
  Single thread CPU
  Pthread CPU
  GPU
Easy to use
  Implemented in C/C++ and CUDA
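
The 92% coverage figure also bounds how much an accelerator can help end to end. The sketch below applies Amdahl's law under two simplifying assumptions that are not from the slides: the remaining 8% of cycles is not accelerated at all, and every kernel gets the same speedup. The function name is hypothetical.

```python
def amdahl_speedup(accelerated_fraction, kernel_speedup):
    """End-to-end speedup when only part of the execution is accelerated.

    accelerated_fraction: share of total cycles spent in the accelerated
        kernels (0.92 for the suite, per the slide).
    kernel_speedup: assumed uniform speedup of those kernels.
    """
    serial_part = 1.0 - accelerated_fraction
    return 1.0 / (serial_part + accelerated_fraction / kernel_speedup)


if __name__ == "__main__":
    for s in (2, 10, 100):
        print(s, round(amdahl_speedup(0.92, s), 2))
```

With 92% coverage, the end-to-end speedup can never exceed 1 / 0.08 = 12.5x, however fast the kernels become.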

19 Lucida Suite
Service                       Workload                   Task                         Platforms
Image Matching                Feature Extraction         Computer Vision              Pthread, GPU
Image Matching                Feature Description        Computer Vision              Pthread, GPU
Automatic Speech Recognition  GMM Scoring                Speech Recognition           Pthread, GPU
Automatic Speech Recognition  DNN Scoring                Speech Recognition           Pthread, GPU
Question Answer               Stemmer                    Natural Language Processing  Pthread, GPU
Question Answer               Regular Expression         Natural Language Processing  Pthread
Question Answer               Conditional Random Fields  Natural Language Processing  Pthread

20 Image Matching

21 [Figure: image matching pipeline]
Image -> SURF Feature Extractor -> Keypoints -> SURF Feature Descriptor -> Image Descriptors -> Descriptor Database
SURF Feature Extractor: Calculate Hessian Matrix, Build Scale-Space, Find Keypoints
SURF Feature Descriptor: Haar Wavelet, Orientation Assignment, Keypoint Descriptor
Accelerated workload: 97%

22 Automatic Speech Recognition

23 Automatic Speech Recognition (ASR)
Recognition Process
Stages: Speech, Pre-processing, Feature Extraction, Feature Vectors, HMM state scoring (DNN or GMM), Viterbi Algorithm
DNN: Input Layer, Hidden Layers, Output Layer
Output: "capital"

24 Automatic Speech Recognition (ASR)
Recognition Process
DNN: Input Layer, Hidden Layers, Output Layer
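
The Viterbi step in the recognition process can be sketched in a few lines of Python. This is an illustration of the textbook algorithm, not the decoder shipped with Lucida. The function name, the list-based model layout, and the toy numbers in the usage note are all assumptions. Scores are kept in the log domain so that long utterances do not underflow.

```python
import math


def viterbi(log_init, log_trans, log_emit):
    """Return (best state path, log-probability of that path).

    log_init[s]     : log P(first state = s)
    log_trans[p][s] : log P(next state = s | current state = p)
    log_emit[t][s]  : acoustic log-score of state s at frame t
                      (in ASR this would come from GMM or DNN scoring)
    """
    n_states = len(log_init)
    # best[s] = score of the best path that ends in state s at the current frame
    best = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    backptr = []  # one list of predecessor states per frame after the first
    for frame in log_emit[1:]:
        prev_choice, new_best = [], []
        for s in range(n_states):
            p = max(range(n_states), key=lambda q: best[q] + log_trans[q][s])
            prev_choice.append(p)
            new_best.append(best[p] + log_trans[p][s] + frame[s])
        backptr.append(prev_choice)
        best = new_best
    # Trace the best final state back to the first frame.
    last = max(range(n_states), key=lambda s: best[s])
    path = [last]
    for prev_choice in reversed(backptr):
        path.append(prev_choice[path[-1]])
    path.reverse()
    return path, best[last]
```

For a toy two-state model with uniform start probabilities, self-transition probability 0.9, and per-frame scores (0.9, 0.1), (0.9, 0.1), (0.05, 0.95), the function returns the path [0, 0, 1].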

25 Question Answering

26 Question Answering
Question: "Who was elected 44th president of the United States?"
Input Filters
  Candidate Phrases: (*) was elected; (*) was the 44th
Document Database (e.g., wikipedia)
Document Filters
  Sentences in Candidate Documents: wikipedia/barack_obama, wikipedia/presidents_of_the_us
Answer Selector
Answer: Barack Obama

27 Question Answering
Question: "Who was elected 44th president of the United States?"
Input Filters
  Candidate Phrases: (*) was elected; (*) was the 44th
Document Database (e.g., wikipedia)
Document Filters
  Sentences in Candidate Documents: wikipedia/barack_obama, wikipedia/presidents_of_the_us
  85% of the execution: Conditional Random Fields, Stemmer, Regular Expression
Answer Selector
Answer: Barack Obama
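
Two of the three QA kernels, Regular Expression and Stemmer, are easy to illustrate on the example question. The sketch below is a toy, not the OpenEphyra implementation: the ordinal pattern, the three-entry suffix list, and the function names are assumptions chosen for readability, and a production stemmer such as Porter's applies many more rules. Conditional Random Fields are left out because they need a trained model.

```python
import re

# Assumed pattern: flags tokens that look like ordinals such as "44th".
ORDINAL = re.compile(r"^[0-9]+(st|nd|rd|th)$")

# Assumed suffix list, checked in this order.
SUFFIXES = ("ing", "ed", "s")


def naive_stem(word):
    """Strip one common suffix, keeping at least three characters of stem."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def analyze(question):
    """Return (token, stem, is_ordinal) for each alphanumeric token."""
    tokens = re.findall(r"[A-Za-z0-9]+", question.lower())
    return [(tok, naive_stem(tok), bool(ORDINAL.match(tok))) for tok in tokens]


if __name__ == "__main__":
    for row in analyze("Who was elected 44th president?"):
        print(row)
```

On the example question, "elected" stems to "elect" and "44th" is the only token flagged as an ordinal, which mirrors the annotations shown on the earlier pipeline slide.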

28 Topic: Hands-on: Lucida Suite Speaker: Yunqi Zhang

29 Lucida Suite

30 Setup
Prerequisites
  Install VirtualBox
  Install Chrome
Copy the tutorial VM
  USB thumb drive

31 Configure VirtualBox
Unpack lucida-djinn-demo.tgz into ~/VirtualBox VMs/
Add lucida-djinn-demo to VirtualBox
  Machine -> Add
Port forwarding
  Machine -> Settings -> Network -> Adapter 1 -> Advanced -> Port Forwarding

32 Stand up the VM
Start!
username/password: demo/demo
Open the Terminal application
cd ~/hpca-demo/source-code/lucida/lucida-suite
Follow along with Yunqi!

33 Topic: Deep Learning in Intelligent Web Services Speaker: Michael A. Laurenzano

34 10,000 foot view of Deep Neural Networks (DNN)
What is deep about a DNN?
  More layers
  Often, bigger layers also
Example DNN in Lucida ASR
  Input Layer, Hidden Layers, Output Layer
  Output: HMM State, p(hmmstate | AcousticModel)
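
The forward pass of such a network, from input features to a probability for each HMM state, can be sketched with the standard library alone. This is a minimal illustration, not the Lucida or DjiNN code: the sigmoid hidden activation, the layer layout, and all function names are assumptions, since the slides do not specify them. The softmax output layer is what turns raw scores into a distribution over HMM states.

```python
import math


def dense(inputs, weights, biases):
    """One fully connected layer; weights[j] holds the input weights of unit j."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def sigmoid(values):
    """Element-wise logistic activation (assumed for the hidden layers)."""
    return [1.0 / (1.0 + math.exp(-v)) for v in values]


def softmax(values):
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def forward(features, layers):
    """Run features through sigmoid hidden layers and a softmax output layer.

    layers is a list of (weights, biases) pairs; the last pair is the
    output layer, whose units correspond to HMM states.
    """
    activation = features
    for weights, biases in layers[:-1]:
        activation = sigmoid(dense(activation, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(activation, weights, biases))
```

"Deeper" in the sense of the slide simply means more (weights, biases) pairs in the list, and "bigger" means more rows per weight matrix.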

35 Why DNN?
Emerging in a number of domains
  Big data analytics
  Video/image/audio recognition
  Language semantics/translation
Many components of IPAs like Lucida can leverage DNN

36 DNNs in the IPA pipeline
[Figure: end-to-end IPA pipeline. Components: Mobile Users, Server, Automatic Speech-Recognition, Query Classifier, Question-Answering, Image Matching, Image Database, Search Database, Execute Action, Display Answer. Data labels: Voice, Image, Image Data, Question, Action, Question or Action, Answer.]

37 Studying DNNs as an architect is a challenge

38 Studying DNNs as an architect is a challenge
What should my DNN do?
Network configuration
  How many layers?
  How to configure those layers?
  Interconnections?
  Activation functions?
Model training
  When are the results accurate enough?
  How to choose a training set?
[Figure: DNN with Input Layer, Hidden Layers, Output Layer; output HMM State, p(hmmstate | AcousticModel)]

39 DNN as a Service
Unified, highly optimized appliance for DNN
Services: Image Classification, Digit Recognition, Facial Recognition, Speech Recognition, Natural Language Processing

40 Introducing DjiNN and Tonic [ISCA 15]
DjiNN
  Infrastructure for DNN as a service
  Extensible to a number of DNN services
  CPU-optimized implementation
  GPU-optimized implementation
Tonic Suite
  7 complete DNN services covering important IPA use cases
  Configured networks and trained models are provided
  No machine learning expertise required

41 Topic: Tonic Suite Components Speaker: Johann Hauswald

42 Tonic Suite
Image Processing
  Image classification (IMC)
  Facial recognition (FACE)
  Digit recognition (DIG)
Speech Processing
  Automatic speech recognition (ASR)
Natural Language Processing
  Part-of-speech tagging (POS)
  Chunking (CHK)
  Named entity recognition (NER)
Example: "It's business, Superman"
  POS: business (noun), Superman (P. noun)
  CHK: It's (VP, B-NP), business (NP, I-NP)
  NER: Superman (PERSON)

43 Deep Neural Networks (DNNs)
Network Architecture: Input -> Convolutional layer -> Pooling layer -> Fully Connected layer
Feature Maps
Inference: 0.1 Spiderman, 0.7 Superman, 0.2 Batman
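
The convolutional and pooling layers named on this slide each produce a feature map from the one before. A minimal pure-Python sketch of both operations follows. It assumes a single channel, stride 1, no padding, and non-overlapping pooling windows, none of which is specified by the slides, and the function names are hypothetical.

```python
def conv2d(image, kernel):
    """Valid cross-correlation (what DNN frameworks call convolution), stride 1."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; rows or columns that do not fill a window are dropped."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [[max(feature_map[r * size + i][c * size + j]
                 for i in range(size) for j in range(size))
             for c in range(cols)]
            for r in range(rows)]
```

Applied to the 3x3 image [[1, 2, 3], [4, 5, 6], [7, 8, 9]] with the kernel [[1, 0], [0, 1]], conv2d yields the 2x2 feature map [[6, 8], [12, 14]], and max_pool reduces that to [[14]].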

44 Deep Neural Networks (DNNs)
Network Architecture: speech features -> "Who, is, this" -> word vectors (w_0, w_1, ..., w_k per word) -> Fully Connected layer
Inference: Who (PRONOUN), is (VERB), this (PRONOUN)

45 Tonic Suite Applications
Users
Applications
  Image Task: IMC, DIG, FACE
  Speech Recognition (ASR) Task
  Natural Language Processing Task: POS, CHK, NER
DjiNN DNN Service
  DNN Architecture and Trained Models for IMC, DIG, FACE, ASR, POS, CHK, NER
Example: "It's business, Superman"
  POS: business (noun), Superman (P. noun)
  CHK: It's (VP, B-NP), business (NP, I-NP)
  NER: Superman (PERSON)

46 Basic characterization
Single CPU
Not all DNN services are created equal
Accelerating DNN can provide significant speedup

47 Tonic Suite

48 Topic: Introduction to LucidaEco Speaker: Michael A. Laurenzano

49 What is LucidaEco?
The next step in the evolution of Lucida
Open source end-to-end IPA ecosystem
Rich set of infrastructure and AI tools
Easily compose customized, scalable, production-quality IPAs
[Figure: Frontend Applications; Query Pipeline Manager; Micro-Services (Speech Services, Image Services, Text Services). Data labels: Image, Speech, Text, Query, Response.]

50 Micro-service Architecture
Designed to achieve
  Scalability
  Modularity
  Extensibility
Independently deployable intelligent services
Common interface - create, learn, infer

51 Micro-service Design
Common interface to each micro-service
  Create - new knowledge base/model
  Learn - add information to KB/model
  Infer - retrieve knowledge
Inter-service communication
  RPC abstraction (Facebook's Thrift) for most
  websockets for streaming audio
  Language/platform issues hidden from micro-service
Virtualized, platform-independent deployment
  docker (container-based)
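
The create/learn/infer interface can be pictured as an abstract base class that every service implements. The sketch below is a hypothetical Python rendering for illustration only: the class names, method signatures, and the toy keyword-matching service are assumptions, and the Thrift RPC layer that would sit in front of such a class is omitted.

```python
from abc import ABC, abstractmethod


class MicroService(ABC):
    """Hypothetical common interface shared by every micro-service."""

    @abstractmethod
    def create(self, kb_id):
        """Create a new, empty knowledge base or model."""

    @abstractmethod
    def learn(self, kb_id, item):
        """Add information to an existing knowledge base or model."""

    @abstractmethod
    def infer(self, kb_id, query):
        """Retrieve knowledge relevant to the query."""


class KeywordQA(MicroService):
    """Toy text service: returns stored sentences that share a word with the query."""

    def __init__(self):
        self._kbs = {}

    def create(self, kb_id):
        self._kbs.setdefault(kb_id, [])

    def learn(self, kb_id, item):
        self._kbs[kb_id].append(item)

    def infer(self, kb_id, query):
        words = set(query.lower().split())
        return [s for s in self._kbs[kb_id] if words & set(s.lower().split())]


if __name__ == "__main__":
    svc = KeywordQA()
    svc.create("demo")
    svc.learn("demo", "Barack Obama was elected 44th president")
    print(svc.infer("demo", "who was elected"))
```

Because callers only depend on the three methods, a speech or image service could replace the toy class without changing the code that orchestrates the pipeline.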

52 Query Pipeline Manager
Orchestrate the pipeline of IPA components
Goals
  Optimize computational footprint
  Hardware accelerators and heterogeneity
  Micro-service co-location

53 LucidaEco is Growing
Micro-service ecosystem
  ASR (Kaldi, Sphinx, different languages)
  QA (OpenEphyra, Qanta, YodaQA)
  Machine Translation, Sentiment, Summarization
  Image matching, object recognition
Frontend apps
  web, mobile, tablet; others coming
LucidaEco as a research platform
  AI algorithms/implementations in an end-to-end pipeline
  Scalability - acceleration, virtualization, high-bw communication substrates
Burgeoning community of developers/researchers
  Join us!

54 Topic: Build Your Own IPA Speakers: Michael A. Laurenzano and Johann Hauswald

55 The Architecture of Your IPA
Host Machine
  VM (VirtualBox)
    Docker Container 1, Docker Container 2
      Services: OpenEphyra QA with Your KB; Kaldi ASR

56 Setup
Prerequisites
  Install VirtualBox
  Install Chrome
Copy the tutorial VM
  USB thumb drive

57 Configure VirtualBox
Unpack lucida-djinn-demo.tgz into ~/VirtualBox VMs/
Add lucida-djinn-demo to VirtualBox
  Machine -> Add
Port forwarding
  Machine -> Settings -> Network -> Adapter 1 -> Advanced -> Port Forwarding

58 Run Your IPA
Start!
username/password: demo/demo
Open the Terminal application
cd ~/hpca-demo/source-code/lucida/
Running the demo
  $ docker-compose up
ON THE HOST:

59 Build Your Own IPA

60 Thank You!!!
Tutorial Website
lucida.ai
Lucida Suite
Tonic Suite
LucidaEco

Sirius: An Open End-to-End Voice and Vision Personal Assistant and Its Implications for Future Warehouse Scale Computers

Sirius: An Open End-to-End Voice and Vision Personal Assistant and Its Implications for Future Warehouse Scale Computers Sirius: An Open End-to-End Voice and Vision Personal Assistant and Its Implications for Future Warehouse Scale Computers Johann Hauswald, Michael A. Laurenzano, Yunqi Zhang, Cheng Li, Austin Rovinski,

More information

System Design for Intelligent Web Services

System Design for Intelligent Web Services System Design for Intelligent Web Services by Johann-Alexander Hauswald A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Computer Science and

More information

Winter 2018 Prof. Satish Narayanasamy Special thanks to Babak Falsafi (EPFL) for ecocloud slides

Winter 2018 Prof. Satish Narayanasamy  Special thanks to Babak Falsafi (EPFL) for ecocloud slides EECS 570 Applications Winter 2018 Prof. Satish Narayanasamy http://www.eecs.umich.edu/courses/eecs570/ Special thanks to Babak Falsafi (EPFL) for ecocloud slides Slides developed in part by Profs. Falsafi,

More information

SIRIUS IMPLICATIONS FOR FUTURE WAREHOUSE-SCALE COMPUTERS

SIRIUS IMPLICATIONS FOR FUTURE WAREHOUSE-SCALE COMPUTERS ... SIRIUS IMPLICATIONS FOR FUTURE WAREHOUSE-SCALE COMPUTERS... DEMAND IS EXPECTED TO GROW SIGNIFICANTLY FOR CLOUD SERVICES THAT DELIVER Johann Hauswald Michael A. Laurenzano Yunqi Zhang Cheng Li Austin

More information

DjiNN and Tonic: DNN as a Service and Its Implications for Future Warehouse Scale Computers

DjiNN and Tonic: DNN as a Service and Its Implications for Future Warehouse Scale Computers DjiNN and Tonic: DNN as a Service and Its Implications for Future Warehouse Scale Computers Johann Hauswald Yiping Kang Michael A. Laurenzano Quan Chen Cheng Li Trevor Mudge Ronald G. Dreslinski Jason

More information

A Scalable Speech Recognizer with Deep-Neural-Network Acoustic Models

A Scalable Speech Recognizer with Deep-Neural-Network Acoustic Models A Scalable Speech Recognizer with Deep-Neural-Network Acoustic Models and Voice-Activated Power Gating Michael Price*, James Glass, Anantha Chandrakasan MIT, Cambridge, MA * now at Analog Devices, Cambridge,

More information

Automatic Speech Recognition (ASR)

Automatic Speech Recognition (ASR) Automatic Speech Recognition (ASR) February 2018 Reza Yazdani Aminabadi Universitat Politecnica de Catalunya (UPC) State-of-the-art State-of-the-art ASR system: DNN+HMM Speech (words) Sound Signal Graph

More information

Characterization and Benchmarking of Deep Learning. Natalia Vassilieva, PhD Sr. Research Manager

Characterization and Benchmarking of Deep Learning. Natalia Vassilieva, PhD Sr. Research Manager Characterization and Benchmarking of Deep Learning Natalia Vassilieva, PhD Sr. Research Manager Deep learning applications Vision Speech Text Other Search & information extraction Security/Video surveillance

More information

GPU ACCELERATED DATABASE MANAGEMENT SYSTEMS

GPU ACCELERATED DATABASE MANAGEMENT SYSTEMS CIS 601 - Graduate Seminar Presentation 1 GPU ACCELERATED DATABASE MANAGEMENT SYSTEMS PRESENTED BY HARINATH AMASA CSU ID: 2697292 What we will talk about.. Current problems GPU What are GPU Databases GPU

More information

A Deep Learning primer

A Deep Learning primer A Deep Learning primer Riccardo Zanella r.zanella@cineca.it SuperComputing Applications and Innovation Department 1/21 Table of Contents Deep Learning: a review Representation Learning methods DL Applications

More information

Treadmill: Attributing the Source of Tail Latency through Precise Load Testing and Statistical Inference

Treadmill: Attributing the Source of Tail Latency through Precise Load Testing and Statistical Inference Treadmill: Attributing the Source of Tail Latency through Precise Load Testing and Statistical Inference Yunqi Zhang, David Meisner, Jason Mars, Lingjia Tang Internet services User interactive applications

More information

How GPUs Power Comcast's X1 Voice Remote and Smart Video Analytics. Jan Neumann Comcast Labs DC May 10th, 2017

How GPUs Power Comcast's X1 Voice Remote and Smart Video Analytics. Jan Neumann Comcast Labs DC May 10th, 2017 How GPUs Power Comcast's X1 Voice Remote and Smart Video Analytics Jan Neumann Comcast Labs DC May 10th, 2017 Comcast Applied Artificial Intelligence Lab Media & Video Analytics Smart TV Deep Learning

More information

Speech Technology Using in Wechat

Speech Technology Using in Wechat Speech Technology Using in Wechat FENG RAO Powered by WeChat Outline Introduce Algorithm of Speech Recognition Acoustic Model Language Model Decoder Speech Technology Open Platform Framework of Speech

More information

M.Tech Student, Department of ECE, S.V. College of Engineering, Tirupati, India

M.Tech Student, Department of ECE, S.V. College of Engineering, Tirupati, India International Journal of Scientific Research in Computer Science, Engineering and Information Technology 2018 IJSRCSEIT Volume 3 Issue 5 ISSN : 2456-3307 High Performance Scalable Deep Learning Accelerator

More information

Treadmill: Tail Latency Measurement at Microsecond-level Precision. Yunqi Zhang Johann Hauswald David Meisner Jason Mars Lingjia Tang

Treadmill: Tail Latency Measurement at Microsecond-level Precision. Yunqi Zhang Johann Hauswald David Meisner Jason Mars Lingjia Tang Treadmill: Tail Latency Measurement at Microsecond-level Precision Yunqi Zhang Johann Hauswald David Meisner Jason Mars Lingjia Tang Schedule Welcome Section 1: Tail latency 08:40 ~ 09:00 Overview of data

More information

Deep Learning Accelerators

Deep Learning Accelerators Deep Learning Accelerators Abhishek Srivastava (as29) Samarth Kulshreshtha (samarth5) University of Illinois, Urbana-Champaign Submitted as a requirement for CS 433 graduate student project Outline Introduction

More information

Baymax: QoS Awareness and Increased Utilization for Non-Preemptive Accelerators in Warehouse Scale Computers

Baymax: QoS Awareness and Increased Utilization for Non-Preemptive Accelerators in Warehouse Scale Computers Baymax: QoS Awareness and Increased Utilization for Non-Preemptive Accelerators in Warehouse Scale Computers Quan Chen 1 Hailong Yang 1 Jason Mars Lingjia Tang Clarity Lab, University of Michigan - Ann

More information

DEEP LEARNING ACCELERATOR UNIT WITH HIGH EFFICIENCY ON FPGA

DEEP LEARNING ACCELERATOR UNIT WITH HIGH EFFICIENCY ON FPGA DEEP LEARNING ACCELERATOR UNIT WITH HIGH EFFICIENCY ON FPGA J.Jayalakshmi 1, S.Ali Asgar 2, V.Thrimurthulu 3 1 M.tech Student, Department of ECE, Chadalawada Ramanamma Engineering College, Tirupati Email

More information

Deep learning in MATLAB From Concept to CUDA Code

Deep learning in MATLAB From Concept to CUDA Code Deep learning in MATLAB From Concept to CUDA Code Roy Fahn Applications Engineer Systematics royf@systematics.co.il 03-7660111 Ram Kokku Principal Engineer MathWorks ram.kokku@mathworks.com 2017 The MathWorks,

More information

SMiTe: Precise QoS Prediction on Real-System SMT Processors to Improve Utilization in Warehouse Scale Computers

SMiTe: Precise QoS Prediction on Real-System SMT Processors to Improve Utilization in Warehouse Scale Computers SMiTe: Precise QoS Prediction on Real-System SMT Processors to Improve Utilization in Warehouse Scale Computers Yunqi Zhang, Michael A. Laurenzano, Jason Mars, Lingjia Tang Clarity-Lab Electrical Engineering

More information

NVIDIA PLATFORM FOR AI

NVIDIA PLATFORM FOR AI NVIDIA PLATFORM FOR AI João Paulo Navarro, Solutions Architect - Linkedin i am ai HTTPS://WWW.YOUTUBE.COM/WATCH?V=GIZ7KYRWZGQ 2 NVIDIA Gaming VR AI & HPC Self-Driving Cars GPU Computing 3 GPU COMPUTING

More information

Prophet: Precise QoS Prediction on Non-Preemptive Accelerators to Improve Utilization in Warehouse-Scale Computers

Prophet: Precise QoS Prediction on Non-Preemptive Accelerators to Improve Utilization in Warehouse-Scale Computers Prophet: Precise QoS Prediction on Non-Preemptive Accelerators to Improve Utilization in Warehouse-Scale Computers Quan Chen 1 Hailong Yang 1 Minyi Guo Ram Srivatsa Kannan? Jason Mars? Lingjia Tang? Department

More information

Recurrent Neural Networks. Deep neural networks have enabled major advances in machine learning and AI. Convolutional Neural Networks

Recurrent Neural Networks. Deep neural networks have enabled major advances in machine learning and AI. Convolutional Neural Networks Deep neural networks have enabled major advances in machine learning and AI Computer vision Language translation Speech recognition Question answering And more Problem: DNNs are challenging to serve and

More information

High-Performance Data Loading and Augmentation for Deep Neural Network Training

High-Performance Data Loading and Augmentation for Deep Neural Network Training High-Performance Data Loading and Augmentation for Deep Neural Network Training Trevor Gale tgale@ece.neu.edu Steven Eliuk steven.eliuk@gmail.com Cameron Upright c.upright@samsung.com Roadmap 1. The General-Purpose

More information

SDA: Software-Defined Accelerator for Large- Scale DNN Systems

SDA: Software-Defined Accelerator for Large- Scale DNN Systems SDA: Software-Defined Accelerator for Large- Scale DNN Systems Jian Ouyang, 1 Shiding Lin, 1 Wei Qi, Yong Wang, Bo Yu, Song Jiang, 2 1 Baidu, Inc. 2 Wayne State University Introduction of Baidu A dominant

More information

Voice, Image, Video : AI in action with AWS. 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Voice, Image, Video : AI in action with AWS. 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Voice, Image, Video : AI in action with AWS A long heritage of machine learning at Amazon Personalized recommendations Fulfillment automation and inventory management Drones Voice driven interactions Inventing

More information

Characterization of Speech Recognition Systems on GPU Architectures

Characterization of Speech Recognition Systems on GPU Architectures Facultat d Informàtica de Barcelona Master in Innovation and Research in Informatics High Performance Computing Master Thesis Characterization of Speech Recognition Systems on GPU Architectures Author:

More information

SDA: Software-Defined Accelerator for Large- Scale DNN Systems

SDA: Software-Defined Accelerator for Large- Scale DNN Systems SDA: Software-Defined Accelerator for Large- Scale DNN Systems Jian Ouyang, 1 Shiding Lin, 1 Wei Qi, 1 Yong Wang, 1 Bo Yu, 1 Song Jiang, 2 1 Baidu, Inc. 2 Wayne State University Introduction of Baidu A

More information

Exploiting the OpenPOWER Platform for Big Data Analytics and Cognitive. Rajesh Bordawekar and Ruchir Puri IBM T. J. Watson Research Center

Exploiting the OpenPOWER Platform for Big Data Analytics and Cognitive. Rajesh Bordawekar and Ruchir Puri IBM T. J. Watson Research Center Exploiting the OpenPOWER Platform for Big Data Analytics and Cognitive Rajesh Bordawekar and Ruchir Puri IBM T. J. Watson Research Center 3/17/2015 2014 IBM Corporation Outline IBM OpenPower Platform Accelerating

More information

ImageNet Classification with Deep Convolutional Neural Networks

ImageNet Classification with Deep Convolutional Neural Networks ImageNet Classification with Deep Convolutional Neural Networks Alex Krizhevsky Ilya Sutskever Geoffrey Hinton University of Toronto Canada Paper with same name to appear in NIPS 2012 Main idea Architecture

More information

Lecture 7: Neural network acoustic models in speech recognition

Lecture 7: Neural network acoustic models in speech recognition CS 224S / LINGUIST 285 Spoken Language Processing Andrew Maas Stanford University Spring 2017 Lecture 7: Neural network acoustic models in speech recognition Outline Hybrid acoustic modeling overview Basic

More information

ABC-CNN: Attention Based CNN for Visual Question Answering

ABC-CNN: Attention Based CNN for Visual Question Answering ABC-CNN: Attention Based CNN for Visual Question Answering CIS 601 PRESENTED BY: MAYUR RUMALWALA GUIDED BY: DR. SUNNIE CHUNG AGENDA Ø Introduction Ø Understanding CNN Ø Framework of ABC-CNN Ø Datasets

More information

Hidden Markov Models. Gabriela Tavares and Juri Minxha Mentor: Taehwan Kim CS159 04/25/2017

Hidden Markov Models. Gabriela Tavares and Juri Minxha Mentor: Taehwan Kim CS159 04/25/2017 Hidden Markov Models Gabriela Tavares and Juri Minxha Mentor: Taehwan Kim CS159 04/25/2017 1 Outline 1. 2. 3. 4. Brief review of HMMs Hidden Markov Support Vector Machines Large Margin Hidden Markov Models

More information

Building the Most Efficient Machine Learning System

Building the Most Efficient Machine Learning System Building the Most Efficient Machine Learning System Mellanox The Artificial Intelligence Interconnect Company June 2017 Mellanox Overview Company Headquarters Yokneam, Israel Sunnyvale, California Worldwide

More information

Why data science is the new frontier in software development

Why data science is the new frontier in software development Why data science is the new frontier in software development And why every developer should care Jeff Prosise jeffpro@wintellect.com @jprosise Assertion #1 Being a programmer is like being the god of your

More information

Why DNN Works for Speech and How to Make it More Efficient?

Why DNN Works for Speech and How to Make it More Efficient? Why DNN Works for Speech and How to Make it More Efficient? Hui Jiang Department of Electrical Engineering and Computer Science Lassonde School of Engineering, York University, CANADA Joint work with Y.

More information

AI & Machine Learning at Amazon

AI & Machine Learning at Amazon DevDay Berlin AI & Machine Learning at Amazon Ian Massingham IanMmmm Developer Technology Evangelism, Amazon Web Services ianm@amazon.com 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

More information

CloudSwyft Learning-as-a-Service Course Catalog 2018 (Individual LaaS Course Catalog List)

CloudSwyft Learning-as-a-Service Course Catalog 2018 (Individual LaaS Course Catalog List) CloudSwyft Learning-as-a-Service Course Catalog 2018 (Individual LaaS Course Catalog List) Microsoft Solution Latest Sl Area Refresh No. Course ID Run ID Course Name Mapping Date 1 AZURE202x 2 Microsoft

More information

Nvidia Jetson TX2 and its Software Toolset. João Fernandes 2017/2018

Nvidia Jetson TX2 and its Software Toolset. João Fernandes 2017/2018 Nvidia Jetson TX2 and its Software Toolset João Fernandes 2017/2018 In this presentation Nvidia Jetson TX2: Hardware Nvidia Jetson TX2: Software Machine Learning: Neural Networks Convolutional Neural Networks

More information

Optimizing Out-of-Core Nearest Neighbor Problems on Multi-GPU Systems Using NVLink

Optimizing Out-of-Core Nearest Neighbor Problems on Multi-GPU Systems Using NVLink Optimizing Out-of-Core Nearest Neighbor Problems on Multi-GPU Systems Using NVLink Rajesh Bordawekar IBM T. J. Watson Research Center bordaw@us.ibm.com Pidad D Souza IBM Systems pidsouza@in.ibm.com 1 Outline

More information

Implementing Deep Learning for Video Analytics on Tegra X1.

Implementing Deep Learning for Video Analytics on Tegra X1. Implementing Deep Learning for Video Analytics on Tegra X1 research@hertasecurity.com Index Who we are, what we do Video analytics pipeline Video decoding Facial detection and preprocessing DNN: learning

More information

Inference Optimization Using TensorRT with Use Cases. Jack Han / 한재근 Solutions Architect NVIDIA

Inference Optimization Using TensorRT with Use Cases. Jack Han / 한재근 Solutions Architect NVIDIA Inference Optimization Using TensorRT with Use Cases Jack Han / 한재근 Solutions Architect NVIDIA Search Image NLP Maps TensorRT 4 Adoption Use Cases Speech Video AI Inference is exploding 1 Billion Videos

More information

GPU Accelerated Model Combination for Robust Speech Recognition and Keyword Search

GPU Accelerated Model Combination for Robust Speech Recognition and Keyword Search GPU Accelerated Model Combination for Robust Speech Recognition and Keyword Search Wonkyum Lee Jungsuk Kim Ian Lane Electrical and Computer Engineering Carnegie Mellon University March 26, 2014 @GTC2014

More information

Scale-Out Acceleration for Machine Learning

Scale-Out Acceleration for Machine Learning Scale-Out Acceleration for Machine Learning Jongse Park Hardik Sharma Divya Mahajan Joon Kyung Kim Preston Olds Hadi Esmaeilzadeh Alternative Computing Technologies (ACT) Lab Georgia Institute of Technology

More information

Cisco UCS-Mini with B200 M4 Blade Servers High Capacity/High Performance Citrix Virtual Desktop and App Solutions

Cisco UCS-Mini with B200 M4 Blade Servers High Capacity/High Performance Citrix Virtual Desktop and App Solutions In Collaboration with Intel May 2015 Cisco UCS-Mini with B200 M4 Blade Servers High Capacity/High Performance Citrix Virtual Desktop and App Solutions Presenters: Frank Anderson Desktop Virtualization

More information

Big Data Systems on Future Hardware. Bingsheng He NUS Computing

Big Data Systems on Future Hardware. Bingsheng He NUS Computing Big Data Systems on Future Hardware Bingsheng He NUS Computing http://www.comp.nus.edu.sg/~hebs/ 1 Outline Challenges for Big Data Systems Why Hardware Matters? Open Challenges Summary 2 3 ANYs in Big

More information

Maximizing Server Efficiency from μarch to ML accelerators. Michael Ferdman

Maximizing Server Efficiency from μarch to ML accelerators. Michael Ferdman Maximizing Server Efficiency from μarch to ML accelerators Michael Ferdman Maximizing Server Efficiency from μarch to ML accelerators Michael Ferdman Maximizing Server Efficiency with ML accelerators Michael

More information
