Adaptable Computing: The Future of FPGA Acceleration. Dan Gibbons, VP Software Development, June 6, 2018



1 Adaptable Computing: The Future of FPGA Acceleration. Dan Gibbons, VP Software Development, June 6, 2018

2 Adaptable Accelerated Computing

3 Three Big Trends

4 The Evolution of Computing: Trend to Heterogeneous Architectures with Acceleration of New Workloads
Mainframe Era, PC Era, Mobile Era, Pervasive Intelligence Era

5 The Need for Adaptable Intelligence
The intelligent connected world needs adaptable accelerated computing.
Everything Intelligent & Connected; Deployed at Global Scale; Dynamic Needs & Rapid Innovation


6 Why it Matters: Personalized Medicine Example
Whole genome diagnosis to treat critically ill newborns
Analysis reduced from 1 day to 20 minutes
Patient-specific genomics dynamically optimized
Medical data and research need to be securely accessed across the globe

7 The FPGA Advantage


8 The FPGA Advantage for Machine Learning Inference
Adaptive Architecture
> Customer dataflow, precision, optimizations
Custom Memory Hierarchy
> Keeps data inside vs. external memory bottleneck
Workload + ML Inference
> Unleashes the power of on-chip system dataflow
[Figure: Layer 1, Layer 2, Layer 3 shown on an FPGA and on a GPU]


9 Powerful FPGA Optimizations: Precision
Impact of Precision on Performance
> Similar accuracy
> 10+ years of research
> Active research area (binary, variable, bit serial)
[Figure labels: int1, int4, int8; TPU, GPU, CPU]
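To make the precision trade-off concrete, here is a minimal sketch of symmetric linear quantization, one common way to map floating-point values to narrow integers. The slide does not say which scheme is used, so the scheme itself, the function names `quantize` and `quantized_dot`, and the random test data are all assumptions for illustration.

```python
import random


def quantize(values, bits):
    """Symmetric linear quantization to signed integers of the given width.

    Hypothetical helper: returns the integer codes and the scale factor.
    Requires bits >= 2; 1-bit (binary) networks use a different encoding.
    """
    if bits < 2:
        raise ValueError("this sketch needs at least 2 bits")
    qmax = 2 ** (bits - 1) - 1              # 127 for int8, 7 for int4
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def quantized_dot(a, b, bits):
    """Dot product accumulated on integers, then rescaled to a float."""
    qa, scale_a = quantize(a, bits)
    qb, scale_b = quantize(b, bits)
    acc = sum(x * y for x, y in zip(qa, qb))   # integer multiply-accumulate
    return acc * scale_a * scale_b


# Illustrative data only: random weights and activations, not a real model.
random.seed(0)
weights = [random.uniform(-1, 1) for _ in range(256)]
activations = [random.uniform(-1, 1) for _ in range(256)]
exact = sum(w * x for w, x in zip(weights, activations))

for bits in (8, 4):
    approx = quantized_dot(weights, activations, bits)
    print(f"int{bits}: exact={exact:.4f} approx={approx:.4f}")
```

Narrower integers need less logic and memory per multiply-accumulate, which is the performance side of the trade-off; the printed values show how far each width drifts from the floating-point result on this toy input.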

10 Powerful FPGA Optimizations: Compression
30x to 50x compression rate without impacting accuracy (AlexNet)
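The slide does not name the compression method. As a rough illustration of how ratios of this order can arise, the sketch below assumes magnitude pruning followed by low-bit sparse storage. The function names, the 10% keep fraction, and the bit widths are hypothetical choices, not the configuration behind the AlexNet figure.

```python
def prune_by_magnitude(weights, keep_fraction):
    """Zero every weight except the largest-magnitude ones.

    Keeps roughly keep_fraction of the weights; ties at the threshold
    may keep a few more.
    """
    keep = max(1, int(len(weights) * keep_fraction))
    threshold = sorted((abs(w) for w in weights), reverse=True)[keep - 1]
    return [w if abs(w) >= threshold else 0.0 for w in weights]


def compression_ratio(weights, value_bits, index_bits, dense_bits=32):
    """Dense storage size divided by sparse storage size.

    Sparse storage is modeled as one (value, index) pair per nonzero weight.
    """
    nonzero = sum(1 for w in weights if w != 0.0)
    dense_size = len(weights) * dense_bits
    sparse_size = nonzero * (value_bits + index_bits)
    return float("inf") if sparse_size == 0 else dense_size / sparse_size


# Illustrative data: 1000 distinct weights, keep the top 10%,
# store each survivor as a 4-bit value plus a 4-bit relative index.
weights = [float(i) for i in range(1, 1001)]
pruned = prune_by_magnitude(weights, 0.1)
print(compression_ratio(pruned, value_bits=4, index_bits=4))
```

This model counts storage only; whether accuracy survives pruning depends on retraining, which the sketch does not cover.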

11 FPGA Advantage: Deterministic Latency
Batch Inference (GPU DNN)
> Parallel batch of data to feed SIMD
> High batch => low latency, higher throughput
> Lower compute efficiency at low batch
Batch-less Inference (FPGA DNN)
> Low and deterministic latency
> High throughput regardless of batch size
> Consistent compute efficiency
[Figure: Inputs 1-4, Results 1-4, and Latency 1-4 for the batch and batch-less cases]
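The latency picture on this slide can be sketched with a simple timing model. It assumes inputs arrive at a fixed interval, that a batched device waits for a full batch and returns all results together, and that a batch-less device accepts a new input on every arrival. The function names and all timing values are hypothetical, not measurements from the talk.

```python
def batched_latencies(batch_size, arrival_interval, batch_compute_time):
    """Per-input latency when the device waits for a full batch.

    Input i arrives at i * arrival_interval. Compute starts when the last
    input arrives, and every result is returned at the same finish time.
    """
    start = (batch_size - 1) * arrival_interval
    finish = start + batch_compute_time
    return [finish - i * arrival_interval for i in range(batch_size)]


def batchless_latencies(count, per_input_time):
    """Per-input latency when each input is processed as it arrives.

    Assumes the pipeline can accept a new input at every arrival, so no
    input waits behind another.
    """
    return [per_input_time for _ in range(count)]


# Illustrative numbers in arbitrary time units.
print("batched:   ", batched_latencies(4, arrival_interval=2.0, batch_compute_time=5.0))
print("batch-less:", batchless_latencies(4, per_input_time=3.0))
```

In this model the first input of a batch waits longest, so latency varies with position in the batch, while the batch-less case gives every input the same latency.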

12 ML Inference Integrated with Other Workloads
Live video summary using CNN & RNN
[Figure: blocks on the FPGA: Multi-format Video Decoder, Scaler, Color Space Converter, Convolutional Net, PCIe, RNN / LSTM]
Example output: "Large Rabbit choking squirrel in forest"


13 Adaptable Compute Use Cases Across the Datacenter
Compute: ML Inference, Database / Big Data Analytics, Video Transcoding, Financial Services Analytics, Genomics
Storage: Compression, Encryption, Key-Value Store, ML Inference, Database / Big Data Analytics
Networking: IPSec/SSL, OVS Offload, Bare Metal Services, Security Monitoring

14 Zynq SoCs: Adaptable Computing on the Edge
ZCU102 Development Platform
4 CNN Models: Joint Detect, Ped SSD, Face Detect, Traffic SSD
3 Live Inputs + File IO: HDMI, USB 3, MIPI, SD Card
Under 10 Watts!

15 Xilinx Enables Adaptable Accelerated Computing

16 XILINX FPGA as a Service goes wide
[Figure: six provider launches: Nov 2016, Nov 2016, Jul 2017, Aug 2017, Sep 2017, Oct 2017]

17 Towards Software as a Service (SaaS)
[Figure labels: Enterprise SaaS, SW API, Accelerated SW, SDAccel, FaaS]


18 Breakout in Programming for Acceleration
Optimal acceleration results require platform performance, compiler efficiency, and programming proficiency.
High Performance Platform; Advanced Compiler; Productive IDE & Optimized Libraries; User Onboarding

19 Rich Stack Integrated with Frameworks
[Figure labels: Machine Learning; Video Transcoding; Data Analytics; Open Frameworks; Accelerated Libraries; Database Analytics; Development Environment; Platforms; On Premise Boards]

20 Transformation Through Innovation
[Figure labels: First MPSoC & RFSoC; ACAP; First 3D FPGA & HW/SW Programmable SoC; Graphic of MPSoC, RFSoC; Virtex-2 Pro; World's First FPGA; First Virtex FPGA]


21 The Era of Heterogeneous Computing Architectures is Here
> FPGAs are uniquely suited for adaptable accelerated computing
> Xilinx is leading the way with platforms, tools, applications and FaaS
> Now is the opportunity for application development and deployment
