Inference
- Chrystal Cain
- 5 years ago
Transcription
1 Inference. Graham Schelle, PhD, Principal Engineer, Xilinx Research Labs
2 Xilinx Headlines: Twitch Chooses Xilinx to Enable its Broadcast-quality Livestream of esports
3 Agenda: Xilinx Adaptive Architectures; Inference Architectures; Open Source
5 Xilinx Adaptive Architectures: Traditionally, FPGAs for massively data-parallel applications. In 2011, Zynq introduced (ZU+ in 2015): ARM CPUs added for embedded applications.
6 Xilinx Adaptive Architectures: Alveo & Versal. Diagram labels: Cortex-A72s, AI Engines, Network On Chip. In 2018, Alveo introduced: accelerator cards for data center workloads. Coming in 2019, Versal Platform: adaptive compute acceleration platform (ACAP).
7 Inference Architectures
9 Inference Architectures: Evolving Frameworks (Andrej Karpathy on Twitter). Increasing, Evolving Workloads: new acceleration needs & algorithms; ML infused in many applications; adaptable HW a key benefit. Move to Lower Precision: ML inference moving to INT8 & lower; better Perf/W with similar accuracy; Xilinx devices natively support variable precision. Compressed Networks: higher performance with reduced compute / memory needs; pruning & load balancing to match network requirements.
11 Inference Architectures: Evolving Workloads. 1. Inference is hard. 2. Huge variation in compute and memory requirements. 3. Models typically don't fit into cache. Increasing, Evolving Workloads: new acceleration needs & algorithms; ML infused in many applications; adaptable HW a key benefit. Move to Lower Precision: ML inference moving to INT8 & lower; better Perf/W with similar accuracy; Xilinx devices natively support variable precision. Compressed Networks: higher performance with reduced compute / memory needs; pruning & load balancing to match network requirements.
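To make the "Move to Lower Precision" point concrete, here is a minimal sketch of symmetric per-tensor INT8 quantization in plain Python. The function names, the example weights, and the choice of a single scale per tensor are all assumptions made for illustration; the slides do not say which quantization scheme the Xilinx tools use.

```python
def quantize_int8(values):
    """Map floats to integer codes in [-127, 127] using one shared scale.

    Hypothetical helper: symmetric, per-tensor quantization chosen for brevity.
    """
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from the integer codes."""
    return [c * scale for c in codes]


# Illustrative weights only; not taken from any model in the talk.
weights = [0.5, -1.27, 0.031, 0.0, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("codes:", codes)
print("scale:", scale)
print("max reconstruction error:", max_error)
```

The reconstruction error of each value is bounded by half the scale, which is one way to see why accuracy can stay similar while each weight shrinks from 32 bits to 8.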
12 Inference Architectures: Precision vs Power. [Figure: LSTM test error (%) vs estimated power consumption (W), by weight/activation bit width (W/A): 2/3, 3/3, 2/4, 3/4, 4/4, 2/8, 3/8, 4/8, 8/8, with the Pareto-optimal points marked. Panel labels: FPGA, ASIC. Target device ZU7EV; ambient temperature 25 C; 12.5% toggle rate; 0.5 static probability; power reported for PL accelerated block only.] Sources: Bill Dally (Stanford), Cadence Embedded Neural Network Summit, February 1; Michaela Blott, Hot Chips 2018 Tutorial, Overview of Deep Learning and Computer Architectures for Accelerating DNNs; Rybalkin, V., Pappalardo, A., Ghaffar, M.M., Gambardella, G., Wehn, N. and Blott, M. "FINN-L: Library Extensions and Design Tradeoff Analysis for Variable Precision LSTM Networks on FPGAs."
13 Xilinx Cloud Inference - ML Suite: overlays with xdnn, built in programmable logic; high utilization; throughput or latency variants; CPU offload for new layer exploration; xdnn w/ xfdnn compiler; on-prem and cloud boards.
14 Xilinx Edge Inference - DeePhi: Learning both Weights and Connections for Efficient Neural Networks, NeurIPS 2015 (2013) (2016); EIE: Efficient Inference Engine on Compressed Deep Neural Network, ISCA 2016; ESE: Efficient Speech Recognition Engine with Compressed LSTM on FPGA, FPGA 2017.
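The compressed-network work cited on this slide relies on pruning small weights. As a rough illustration of the idea only, the sketch below zeroes the smallest-magnitude weights in a list. The function name, the `sparsity` parameter, and the example values are hypothetical, and the sketch leaves out the retraining step that the published methods use to recover accuracy.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest |w|.

    Hypothetical helper for illustration; not the procedure from the cited papers.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest magnitude.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


# Illustrative weights only.
weights = [0.9, -0.05, 0.4, 0.01, -0.7, 0.02]
pruned = prune_by_magnitude(weights, sparsity=0.5)
nonzero = sum(1 for w in pruned if w != 0.0)

print("pruned:", pruned)
print("non-zero weights kept:", nonzero)
```

Fewer non-zero weights mean fewer multiply-accumulates and less memory traffic, which is the "reduced compute / memory needs" benefit, provided the hardware can skip the zeros.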
16 Cloud & Edge Integration
17 Xilinx and Open Source
18 Xilinx and Open Source: PYNQ; Quantized Neural Networks; Xilinx Runtime for PCIe Attached FPGAs. More on
22 Python is increasingly the Language of Choice. Top Programming Languages, IEEE Spectrum, July 17, July 18, To date. Python is listed as an embedded language for the first time. Python is the fastest growing language: driven by data science, AI, ML and academia. Copyright 2018 Xilinx
27 PYNQ: Python Productivity for Zynq. Stack: Jupyter notebooks, browser-based interface; Jupyter web server; IPython kernel; Ubuntu-based Linux; ARM A9 / A53; overlays/designs; ZU+ fabric. PYNQ enables JupyterLab on Zynq and ZU+. FPGA designs delivered as Python packages. Delivered as SD card image.
28 PYNQ Community: ML, Non-ML & Academic Partners
32 Xilinx open source engagements related to today's TVM meeting: University of Washington, UC San Diego, Xilinx Research, MicroPython, UC Berkeley.
33 Finally, Xilinx & building new open source communities: Cloud Free Trials; pynq.io/community; DAC2019 Design Contest; OpenHW Design Contest.
34 Summary. Xilinx: great for exploring and deploying inference. Xilinx Open Source: we're actively engaging with TVM and other communities. Visit: Boulder, Colorado
35 Adaptable. Intelligent.
38 Edge Inference to Cloud Acceleration: Inference - Edge, Automotive, At The Edge. [Figure: vehicle sensor layout around an ADAS/AD Central Module. Labels: Surround-View Camera Back, Left, Right and Front; Short-Range Radar (three); Forward-Looking Camera; Drive Monitor Camera; Long-Range Lidar.]
39 Edge to Cloud Inference: Xilinx Platforms. Edge Devices (ZCU104, PYNQ-Z1, Ultra96, ZCU102): custom I/O, ARM CPUs. Cloud Platforms: power efficient, PCIe, networking.
42 Edge to Cloud Inference: IIoT Latency/Data Example. Example IIoT control rates: distance NYC to LA: 2,800 miles; speed of light: 186,000 miles/s; round trip: 2*2800/186,000 ≈ 30 ms; required control rate = 10 ms. E.g. Power 8TB/Month
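The round-trip figure on this slide can be checked with a few lines of arithmetic. The sketch below assumes the denominator in the slide's expression is the quoted speed of light (186,000 miles/s) and ignores fiber, routing, and processing delays, so it is a lower bound. The constant and function names are illustrative.

```python
DISTANCE_MILES = 2_800               # NYC to LA, from the slide
SPEED_OF_LIGHT_MILES_PER_S = 186_000  # from the slide
REQUIRED_CONTROL_PERIOD_MS = 10       # from the slide


def round_trip_ms(distance_miles, speed=SPEED_OF_LIGHT_MILES_PER_S):
    """Best-case round-trip propagation time in milliseconds."""
    return 2 * distance_miles / speed * 1000


def max_one_way_miles(period_ms, speed=SPEED_OF_LIGHT_MILES_PER_S):
    """Farthest distance whose round trip still fits inside period_ms."""
    return period_ms / 1000 * speed / 2


rtt_ms = round_trip_ms(DISTANCE_MILES)
meets_deadline = rtt_ms <= REQUIRED_CONTROL_PERIOD_MS
reach_miles = max_one_way_miles(REQUIRED_CONTROL_PERIOD_MS)

print(f"round trip: {rtt_ms:.1f} ms")
print(f"meets {REQUIRED_CONTROL_PERIOD_MS} ms control period: {meets_deadline}")
print(f"max one-way distance for that period: {reach_miles:.0f} miles")
```

Even at the speed of light in a vacuum, a coast-to-coast round trip takes about three times the 10 ms budget, which is the slide's argument for running the control loop's inference at the edge.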
融入 Python 生态的 Zynq 软硬件设计框架
Python Productivity for Zynq 融入 Python 生态的 Zynq 软硬件设计框架 陆佳华 Xilinx 教育与创新生态高级经理 joshual@xilinx.com Python is increasingly the Language of Choice Top Programming Languages, IEEE Spectrum, July 18 July 17
More informationAdaptable Intelligence The Next Computing Era
Adaptable Intelligence The Next Computing Era Hot Chips, August 21, 2018 Victor Peng, CEO, Xilinx Pervasive Intelligence from Cloud to Edge to Endpoints >> 1 Exponential Growth and Opportunities Data Explosion
More informationXilinx ML Suite Overview
Xilinx ML Suite Overview Yao Fu System Architect Data Center Acceleration Xilinx Accelerated Computing Workloads Machine Learning Inference Image classification and object detection Video Streaming Frame
More informationXilinx Machine Learning Strategies For Edge
Xilinx Machine Learning Strategies For Edge Presented By Alvin Clark, Sr. FAE, Northwest The Hottest Research: AI / Machine Learning Nick s ML Model Nick s ML Framework copyright sources: Gospel Coalition
More informationDAC 2018 FPGA design contest
DAC 2018 FPGA design contest Naveen Purushotham, Xilinx Jingtong Hu, University of Pittsburgh Bei Yu, Chinese University of Hong Kong Xinyi Zhang, University of Pittsburgh Agenda Welcome DAC Contest Committee
More informationXilinx ML Suite Overview
Xilinx ML Suite Overview Jim Heaton Sr. FAE Deep Learning explores the study of algorithms that can learn from and make predictions on data Deep Learning is Re-defining Many Applications Cloud Acceleration
More informationVersal: AI Engine & Programming Environment
Engineering Director, Xilinx Silicon Architecture Group Versal: Engine & Programming Environment Presented By Ambrose Finnerty Xilinx DSP Technical Marketing Manager October 16, 2018 MEMORY MEMORY MEMORY
More informationAdaptable Computing The Future of FPGA Acceleration. Dan Gibbons, VP Software Development June 6, 2018
Adaptable Computing The Future of FPGA Acceleration Dan Gibbons, VP Software Development June 6, 2018 Adaptable Accelerated Computing Page 2 Three Big Trends The Evolution of Computing Trend to Heterogeneous
More informationESE: Efficient Speech Recognition Engine for Sparse LSTM on FPGA
ESE: Efficient Speech Recognition Engine for Sparse LSTM on FPGA Song Han 1,2, Junlong Kang 2, Huizi Mao 1, Yiming Hu 3, Xin Li 2, Yubin Li 2, Dongliang Xie 2, Hong Luo 2, Song Yao 2, Yu Wang 2,3, Huazhong
More informationRecurrent Neural Networks. Deep neural networks have enabled major advances in machine learning and AI. Convolutional Neural Networks
Deep neural networks have enabled major advances in machine learning and AI Computer vision Language translation Speech recognition Question answering And more Problem: DNNs are challenging to serve and
More informationSoC Systeme ultra-schnell entwickeln mit Vivado und Visual System Integrator
SoC Systeme ultra-schnell entwickeln mit Vivado und Visual System Integrator FPGA Kongress München 2017 Martin Heimlicher Enclustra GmbH Agenda 2 What is Visual System Integrator? Introduction Platform
More informationAn introduction to Machine Learning silicon
An introduction to Machine Learning silicon November 28 2017 Insight for Technology Investors AI/ML terminology Artificial Intelligence Machine Learning Deep Learning Algorithms: CNNs, RNNs, etc. Additional
More informationVersal: The New Xilinx Adaptive Compute Acceleration Platform (ACAP) in 7nm
Engineering Director, Xilinx Silicon Architecture Group Versal: The New Xilinx Adaptive Compute Acceleration Platform (ACAP) in 7nm Presented By Kees Vissers Fellow February 25, FPGA 2019 Technology scaling
More informationOnto Petaflops with Kubernetes
Onto Petaflops with Kubernetes Vishnu Kannan Google Inc. vishh@google.com Key Takeaways Kubernetes can manage hardware accelerators at Scale Kubernetes provides a playground for ML ML journey with Kubernetes
More informationRevolutionizing the Datacenter
Power-Efficient Machine Learning using FPGAs on POWER Systems Ralph Wittig, Distinguished Engineer Office of the CTO, Xilinx Revolutionizing the Datacenter Join the Conversation #OpenPOWERSummit Top-5
More informationDNNBuilder: an Automated Tool for Building High-Performance DNN Hardware Accelerators for FPGAs
IBM Research AI Systems Day DNNBuilder: an Automated Tool for Building High-Performance DNN Hardware Accelerators for FPGAs Xiaofan Zhang 1, Junsong Wang 2, Chao Zhu 2, Yonghua Lin 2, Jinjun Xiong 3, Wen-mei
More informationDeep Learning Accelerators
Deep Learning Accelerators Abhishek Srivastava (as29) Samarth Kulshreshtha (samarth5) University of Illinois, Urbana-Champaign Submitted as a requirement for CS 433 graduate student project Outline Introduction
More informationScaling Convolutional Neural Networks on Reconfigurable Logic Michaela Blott, Principal Engineer, Xilinx Research
Scaling Convolutional Neural Networks on Reconfigurable Logic Michaela Blott, Principal Engineer, Xilinx Research Nick Fraser (Xilinx & USydney) Yaman Umuroglu (Xilinx & NTNU) Giulio Gambardella (Xilinx)
More informationBandwidth-Centric Deep Learning Processing through Software-Hardware Co-Design
Bandwidth-Centric Deep Learning Processing through Software-Hardware Co-Design Song Yao 姚颂 Founder & CEO DeePhi Tech 深鉴科技 song.yao@deephi.tech Outline - About DeePhi Tech - Background - Bandwidth Matters
More informationSmartNICs: Giving Rise To Smarter Offload at The Edge and In The Data Center
SmartNICs: Giving Rise To Smarter Offload at The Edge and In The Data Center Jeff Defilippi Senior Product Manager Arm #Arm Tech Symposia The Cloud to Edge Infrastructure Foundation for a World of 1T Intelligent
More informationFast Hardware For AI
Fast Hardware For AI Karl Freund karl@moorinsightsstrategy.com Sr. Analyst, AI and HPC Moor Insights & Strategy Follow my blogs covering Machine Learning Hardware on Forbes: http://www.forbes.com/sites/moorinsights
More informationSmall is the New Big: Data Analytics on the Edge
Small is the New Big: Data Analytics on the Edge An overview of processors and algorithms for deep learning techniques on the edge Dr. Abhay Samant VP Engineering, Hiller Measurements Adjunct Faculty,
More informationArtificial Intelligence Enriched User Experience with ARM Technologies
Artificial Intelligence Enriched User Experience with ARM Technologies Daniel Heo Senior Segment Manager Mobile, BSG, ARM ARM Tech Forum Singapore July 12 th 2017 Global AI survey: the world is ready 71
More informationAltera SDK for OpenCL
Altera SDK for OpenCL A novel SDK that opens up the world of FPGAs to today s developers Altera Technology Roadshow 2013 Today s News Altera today announces its SDK for OpenCL Altera Joins Khronos Group
More informationSDACCEL DEVELOPMENT ENVIRONMENT. The Xilinx SDAccel Development Environment. Bringing The Best Performance/Watt to the Data Center
SDAccel Environment The Xilinx SDAccel Development Environment Bringing The Best Performance/Watt to the Data Center Introduction Data center operators constantly seek more server performance. Currently
More informationSoC Systeme ultra-schnell entwickeln mit Vivado und Visual System Integrator
SoC Systeme ultra-schnell entwickeln mit Vivado und Visual System Integrator Embedded Computing Conference 2017 Matthias Frei zhaw InES Patrick Müller Enclustra GmbH 5 September 2017 Agenda Enclustra introduction
More informationFINN: A Framework for Fast, Scalable Binarized Neural Network Inference
FINN: A Framework for Fast, Scalable Binarized Neural Network Inference Yaman Umuroglu (XIR & NTNU), Nick Fraser (XIR & USydney), Giulio Gambardella (XIR), Michaela Blott (XIR), Philip Leong (USydney),
More informationBringing Intelligence to Enterprise Storage Drives
Bringing Intelligence to Enterprise Storage Drives Neil Werdmuller Director Storage Solutions Arm Santa Clara, CA 1 Who am I? 28 years experience in embedded Lead the storage solutions team Work closely
More information在数据中心中加速 AI - Xilinx 机器学习套件 (Xilinx ML Suite )
赛灵思高级主任 DSP/ 机器学习专家赛灵思高级主任 DSP/ 机器学习专家 赛灵思技术日 XILINX TECHNOLOGY DAY 在数据中心中加速 AI - Xilinx 机器学习套件 (Xilinx ML Suite ) 王宏强赛灵思资深主任 DSP/ 机器学习专家 2019 年 3 月 19 日 机器学习推断是赛灵思的长项 TRAINING Input cat =? labels dog
More informationMicrosoft Ignite 2018 HPE Accelerates Data Insight and Action Across the Enterprise with Latest Edgeline Capabilities
Microsoft Ignite 2018 HPE Accelerates Data Insight and Action Across the Enterprise with Latest Edgeline Capabilities Ron Neyland, Sr. Director Engineering HPE IoT & Converged Edge Systems September 2018
More informationDeep Learning on Arm Cortex-M Microcontrollers. Rod Crawford Director Software Technologies, Arm
Deep Learning on Arm Cortex-M Microcontrollers Rod Crawford Director Software Technologies, Arm What is Machine Learning (ML)? Artificial Intelligence Machine Learning Deep Learning Neural Networks Additional
More informationInference Optimization Using TensorRT with Use Cases. Jack Han / 한재근 Solutions Architect NVIDIA
Inference Optimization Using TensorRT with Use Cases Jack Han / 한재근 Solutions Architect NVIDIA Search Image NLP Maps TensorRT 4 Adoption Use Cases Speech Video AI Inference is exploding 1 Billion Videos
More informationXPU A Programmable FPGA Accelerator for Diverse Workloads
XPU A Programmable FPGA Accelerator for Diverse Workloads Jian Ouyang, 1 (ouyangjian@baidu.com) Ephrem Wu, 2 Jing Wang, 1 Yupeng Li, 1 Hanlin Xie 1 1 Baidu, Inc. 2 Xilinx Outlines Background - FPGA for
More informationFPGA 加速机器学习应用. 罗霖 2017 年 6 月 20 日
FPGA 加速机器学习应用 罗霖 Andy.luo@Xilinx.com 2017 年 6 月 20 日 Xilinx The All Programmable Company XILINX - Founded 1984 Headquarters Research and Development Sales and Support Manufacturing $2.21B FY16 revenue
More informationA New Era of Hardware Microservices in the Cloud. Doug Burger Distinguished Engineer, Microsoft UW Cloud Workshop March 31, 2017
A New Era of Hardware Microservices in the Cloud Doug Burger Distinguished Engineer, Microsoft UW Cloud Workshop March 31, 2017 Moore s Law Dennard Scaling has been dead for a decade Moore s La is o er
More informationIs your IT Infrastructure Ready for Machine Learning & Artificial Intelligence?
BRKPAR-2955 Is your IT Infrastructure Ready for Machine Learning & Artificial Intelligence? Hoseb Dermanilian, EMEA BDM, NetApp Arnaud BASSALER, CSE, Cisco Systems Agenda Introduction AI, Machine Learning
More informationSOFTWARE HARDWARE CODESIGN ACCELERATION FOR EFFICIENT NEURAL NETWORK. ...Deep learning and neural
... SOFTWARE HARDWARE CODESIGN FOR EFFICIENT NEURAL NETWORK ACCELERATION... Kaiyuan Guo Tsinghua University and DeePhi Song Han Stanford University and DeePhi Song Yao DeePhi Yu Wang Tsinghua University
More informationSystem-on-Chip Architecture for Mobile Applications. Sabyasachi Dey
System-on-Chip Architecture for Mobile Applications Sabyasachi Dey Email: sabyasachi.dey@gmail.com Agenda What is Mobile Application Platform Challenges Key Architecture Focus Areas Conclusion Mobile Revolution
More informationDEEP NEURAL NETWORKS CHANGING THE AUTONOMOUS VEHICLE LANDSCAPE. Dennis Lui August 2017
DEEP NEURAL NETWORKS CHANGING THE AUTONOMOUS VEHICLE LANDSCAPE Dennis Lui August 2017 THE RISE OF GPU COMPUTING APPLICATIONS 10 7 10 6 GPU-Computing perf 1.5X per year 1000X by 2025 ALGORITHMS 10 5 1.1X
More informationThe Emerging Computational Landscape of Neural Networks
The Emerging Computational Landscape of Neural Networks Michaela Blott Principal Engineer, Xilinx Research August 2018 Background Xilinx Research - Ireland Ivo Bolsens CTO Since 13 years Part of the worldwide
More informationWelcome. Altera Technology Roadshow 2013
Welcome Altera Technology Roadshow 2013 Altera at a Glance Founded in Silicon Valley, California in 1983 Industry s first reprogrammable logic semiconductors $1.78 billion in 2012 sales Over 2,900 employees
More informationSmart Ultra-Low Power Visual Sensing
Smart Ultra-Low Power Visual Sensing Manuele Rusci*, Francesco Conti * manuele.rusci@unibo.it f.conti@unibo.it Energy-Efficient Embedded Systems Laboratory Dipartimento di Ingegneria dell Energia Elettrica
More informationOpenCV on Zynq: Accelerating 4k60 Dense Optical Flow and Stereo Vision. Kamran Khan, Product Manager, Software Acceleration and Libraries July 2017
OpenCV on Zynq: Accelerating 4k60 Dense Optical Flow and Stereo Vision Kamran Khan, Product Manager, Software Acceleration and Libraries July 2017 Agenda Why Zynq SoCs for Traditional Computer Vision Automated
More informationAnalyzing the Disruptive Impact of a Silicon Compiler
THE ELECTRONICS RESURGENCE INITIATIVE Analyzing the Disruptive Impact of a Silicon Compiler Andreas Olofsson 1947 Source: Wikipedia, Computer Museum 2017 Source: AMD Defense Advanced Research Project Agency
More informationMATLAB/Simulink 기반의프로그래머블 SoC 설계및검증
MATLAB/Simulink 기반의프로그래머블 SoC 설계및검증 이웅재부장 Application Engineering Group 2014 The MathWorks, Inc. 1 Agenda Introduction ZYNQ Design Process Model-Based Design Workflow Prototyping and Verification Processor
More informationA Scalable Speech Recognizer with Deep-Neural-Network Acoustic Models
A Scalable Speech Recognizer with Deep-Neural-Network Acoustic Models and Voice-Activated Power Gating Michael Price*, James Glass, Anantha Chandrakasan MIT, Cambridge, MA * now at Analog Devices, Cambridge,
More informationUnified Deep Learning with CPU, GPU, and FPGA Technologies
Unified Deep Learning with CPU, GPU, and FPGA Technologies Allen Rush 1, Ashish Sirasao 2, Mike Ignatowski 1 1: Advanced Micro Devices, Inc., 2: Xilinx, Inc. Abstract Deep learning and complex machine
More informationAccelerating Data Center Workloads with FPGAs
Accelerating Data Center Workloads with FPGAs Enno Lübbers NorCAS 2017, Linköping, Sweden Intel technologies features and benefits depend on system configuration and may require enabled hardware, software
More informationLegUp: Accelerating Memcached on Cloud FPGAs
0 LegUp: Accelerating Memcached on Cloud FPGAs Xilinx Developer Forum December 10, 2018 Andrew Canis & Ruolong Lian LegUp Computing Inc. 1 COMPUTE IS BECOMING SPECIALIZED 1 GPU Nvidia graphics cards are
More informationVTA: Open & Flexible DL Acceleration. Thierry Moreau TVM Conference, Dec 12th 2018
VTA: Open & Flexible DL Acceleration Thierry Moreau TVM Conference, Dec 12th 2018 TVM Stack High-Level Differentiable IR Tensor Expression IR LLVM CUDA Metal TVM Stack High-Level Differentiable IR Tensor
More informationBringing the benefits of Cortex-M processors to FPGA
Bringing the benefits of Cortex-M processors to FPGA Presented By Phillip Burr Senior Product Marketing Manager Simon George Director, Product & Technical Marketing System Software and SoC Solutions Agenda
More information借助 SDSoC 快速開發複雜的嵌入式應用
借助 SDSoC 快速開發複雜的嵌入式應用 May 2017 What Is C/C++ Development System-level Profiling SoC application-like programming Tools and IP for system-level profiling Specify C/C++ Functions for Acceleration Full System
More informationAccelerating Binarized Convolutional Neural Networks with Software-Programmable FPGAs
Accelerating Binarized Convolutional Neural Networks with Software-Programmable FPGAs Ritchie Zhao 1, Weinan Song 2, Wentao Zhang 2, Tianwei Xing 3, Jeng-Hau Lin 4, Mani Srivastava 3, Rajesh Gupta 4, Zhiru
More information金融商品取引アルゴリズムのハードウェアアクセラレ ーションに関する研究 [ 課題研究報告書 ] Description Supervisor: 田中清史, 情報科学研究科, 修士
JAIST Reposi https://dspace.j Title 金融商品取引アルゴリズムのハードウェアアクセラレ ーションに関する研究 [ 課題研究報告書 ] Author(s) 小林, 弘幸 Citation Issue Date 2018-03 Type Thesis or Dissertation Text version author URL http://hdl.handle.net/10119/15214
More informationSoftware Defined Hardware
Software Defined Hardware For data intensive computation Wade Shen DARPA I2O September 19, 2017 1 Goal Statement Build runtime reconfigurable hardware and software that enables near ASIC performance (within
More informationSimplify System Complexity
Simplify System Complexity With the new high-performance CompactRIO controller Fanie Coetzer Field Sales Engineer Northern South Africa 2 3 New control system CompactPCI MMI/Sequencing/Logging FieldPoint
More informationEmbedded GPGPU and Deep Learning for Industrial Market
Embedded GPGPU and Deep Learning for Industrial Market Author: Dan Mor GPGPU and HPEC Product Line Manager September 2018 Table of Contents 1. INTRODUCTION... 3 2. DIFFICULTIES IN CURRENT EMBEDDED INDUSTRIAL
More informationIntegrated Workflow to Implement Embedded Software and FPGA Designs on the Xilinx Zynq Platform Puneet Kumar Senior Team Lead - SPC
Integrated Workflow to Implement Embedded Software and FPGA Designs on the Xilinx Zynq Platform Puneet Kumar Senior Team Lead - SPC 2012 The MathWorks, Inc. 1 Agenda Integrated Hardware / Software Top
More informationNVIDIA FOR DEEP LEARNING. Bill Veenhuis
NVIDIA FOR DEEP LEARNING Bill Veenhuis bveenhuis@nvidia.com Nvidia is the world s leading ai platform ONE ARCHITECTURE CUDA 2 GPU: Perfect Companion for Accelerating Apps & A.I. CPU GPU 3 Intro to AI AGENDA
More informationBringing Intelligence to Enterprise Storage Drives
Bringing Intelligence to Enterprise Storage Drives Neil Werdmuller Director Storage Solutions Arm Santa Clara, CA 1 Who am I? 28 years experience in embedded Lead the storage solutions team Work closely
More informationFinancial Analytics Acceleration
Financial Analytics Acceleration Presented By Name GEORGI GAYDADJIEV Title Director of Maxeler IoT-Labs Date Dec 10, 2018 FPGA technology is getting traction among Datacenter providers and is expected
More informationDNNDK User Guide. UG1327 (v1.0) January 22, 2019
DNNDK User Guide Revision History The following table shows the revision history for this document. Section General updates 1/22/2019 Version 1.0 Revision Summary Initial Xilinx release. DNNDK User Guide
More informationAccelerating Implementation of Low Power Artificial Intelligence at the Edge
Accelerating Implementation of Low Power Artificial Intelligence at the Edge A Lattice Semiconductor White Paper November 2018 The emergence of smart factories, cities, homes and mobile are driving shifts
More informationSDA: Software-Defined Accelerator for Large- Scale DNN Systems
SDA: Software-Defined Accelerator for Large- Scale DNN Systems Jian Ouyang, 1 Shiding Lin, 1 Wei Qi, Yong Wang, Bo Yu, Song Jiang, 2 1 Baidu, Inc. 2 Wayne State University Introduction of Baidu A dominant
More informationDNNDK User Guide. UG1327 (v 2.08 Beta) December 12, 2018
DNNDK User Guide Revision History The following table shows the revision history for this document. Section General updates General updates Revision Summary 11/15/2018 Initial Xilinx release. 12/12/2018
More informationOCP Engineering Workshop - Telco
OCP Engineering Workshop - Telco Low Latency Mobile Edge Computing Trevor Hiatt Product Management, IDT IDT Company Overview Founded 1980 Workforce Approximately 1,800 employees Headquarters San Jose,
More informationEnd to End Optimization Stack for Deep Learning
End to End Optimization Stack for Deep Learning Presenter: Tianqi Chen Paul G. Allen School of Computer Science & Engineering University of Washington Collaborators University of Washington AWS AI Team
More informationSDA: Software-Defined Accelerator for Large- Scale DNN Systems
SDA: Software-Defined Accelerator for Large- Scale DNN Systems Jian Ouyang, 1 Shiding Lin, 1 Wei Qi, 1 Yong Wang, 1 Bo Yu, 1 Song Jiang, 2 1 Baidu, Inc. 2 Wayne State University Introduction of Baidu A
More informationThe OpenVX Computer Vision and Neural Network Inference
The OpenVX Computer and Neural Network Inference Standard for Portable, Efficient Code Radhakrishna Giduthuri Editor, OpenVX Khronos Group radha.giduthuri@amd.com @RadhaGiduthuri Copyright 2018 Khronos
More informationDr. Jean-Laurent PHILIPPE, PhD EMEA HPC Technical Sales Specialist. With Dell Amsterdam, October 27, 2016
Dr. Jean-Laurent PHILIPPE, PhD EMEA HPC Technical Sales Specialist With Dell Amsterdam, October 27, 2016 Legal Disclaimers Intel technologies features and benefits depend on system configuration and may
More informationAgenda. Introduction Network functions virtualization (NFV) promise and mission cloud native approach Where do we want to go with NFV?
August, 2018 Agenda Introduction Network functions virtualization (NFV) promise and mission cloud native approach Where do we want to go with NFV? 2 Miroslaw Walukiewicz I m from Gdansk, Poland. 25 years
More informationWorld s most advanced data center accelerator for PCIe-based servers
NVIDIA TESLA P100 GPU ACCELERATOR World s most advanced data center accelerator for PCIe-based servers HPC data centers need to support the ever-growing demands of scientists and researchers while staying
More informationComprehensive Arm Solutions for Innovative Machine Learning (ML) and Computer Vision (CV) Applications
Comprehensive Arm Solutions for Innovative Machine Learning (ML) and Computer Vision (CV) Applications Helena Zheng ML Group, Arm Arm Technical Symposia 2017, Taipei Machine Learning is a Subset of Artificial
More informationMaking Sense of Artificial Intelligence: A Practical Guide
Making Sense of Artificial Intelligence: A Practical Guide JEDEC Mobile & IOT Forum Copyright 2018 Young Paik, Samsung Senior Director Product Planning Disclaimer This presentation and/or accompanying
More informationVoice, Image, Video : AI in action with AWS. 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Voice, Image, Video : AI in action with AWS A long heritage of machine learning at Amazon Personalized recommendations Fulfillment automation and inventory management Drones Voice driven interactions Inventing
More informationA So%ware Developer's Journey into a Deeply Heterogeneous World. Tomas Evensen, CTO Embedded So%ware, Xilinx
A So%ware Developer's Journey into a Deeply Heterogeneous World Tomas Evensen, CTO Embedded So%ware, Xilinx Embedded Development: Then Simple single CPU Most code developed internally 10 s of thousands
More informationLecture 12: Model Serving. CSE599W: Spring 2018
Lecture 12: Model Serving CSE599W: Spring 2018 Deep Learning Applications That drink will get you to 2800 calories for today I last saw your keys in the store room Remind Tom of the party You re on page
More informationEnabling FPGAs in Hyperscale Data Centers
J. Weerasinghe; IEEE CBDCom 215, Beijing; 13 th August 215 Enabling s in Hyperscale Data Centers J. Weerasinghe 1, F. Abel 1, C. Hagleitner 1, A. Herkersdorf 2 1 IBM Research Zurich Laboratory 2 Technical