efpga for Neural Network based Image Recognition

Similar documents
Revolutionizing RISC-V based application design possibilities with GLOBALFOUNDRIES. Gregg Bartlett Senior Vice President, CMOS Business Unit

Virtual Array Architecture for efpga April 5, 2018

Enabling Safe, Secure, Smarter Cars from Silicon to Software. Jeff Hutton Synopsys Automotive Business Development

An introduction to Machine Learning silicon

Turning an Automated System into an Autonomous system using Model-Based Design Autonomous Tech Conference 2018

FPGA How do they work?

The Path to Embedded Vision & AI using a Low Power Vision DSP. Yair Siegel, Director of Segment Marketing Hotchips August 2016

Accelerating Implementation of Low Power Artificial Intelligence at the Edge

Programmable Logic Devices

A new Computer Vision Processor Chip Design for automotive ADAS CNN applications in 22nm FDSOI based on Cadence VP6 Technology

A Lightweight YOLOv2:

CPE/EE 422/522. Introduction to Xilinx Virtex Field-Programmable Gate Arrays Devices. Dr. Rhonda Kay Gaede UAH. Outline

Creating Affordable and Reliable Autonomous Vehicle Systems

Field Program mable Gate Arrays

Brainchip OCTOBER

A new Computer Vision Processor Chip Design for automotive ADAS CNN applications in 22nm FDSOI SOI Symposium Santa Clara, Apr.

Fast Hardware For AI

Xilinx Machine Learning Strategies For Edge

Accelerating FPGA/ASIC Design and Verification

Scalable and Modularized RTL Compilation of Convolutional Neural Networks onto FPGA

OCP Engineering Workshop - Telco

Table 1: Example Implementation Statistics for Xilinx FPGAs

Smart Ultra-Low Power Visual Sensing

MIPI : Advanced Driver Assistance System

Accelerating your Embedded Vision / Machine Learning design with the revision Stack. Giles Peckham, Xilinx

Cover TBD. intel Quartus prime Design software

Cover TBD. intel Quartus prime Design software

NVIDIA'S DEEP LEARNING ACCELERATOR MEETS SIFIVE'S FREEDOM PLATFORM. Frans Sijstermans (NVIDIA) & Yunsup Lee (SiFive)

XPU A Programmable FPGA Accelerator for Diverse Workloads

Zynq-7000 All Programmable SoC Product Overview

9 REASONS WHY THE VIVADO DESIGN SUITE ACCELERATES DESIGN PRODUCTIVITY

EITF35: Introduction to Structured VLSI Design

Why the Self-Driving Revolution Hinges on one Enabling Technology: LiDAR

Chapter 2. FPGA and Dynamic Reconfiguration ...

A C H I E V E B O T H W I T H K E Y S I G H T. Company Profile

CONTACT: ,

The Changing Face of Edge Compute

Implementing Flexible Interconnect Topologies for Machine Learning Acceleration

Hardware-Software Co-Design and Prototyping on SoC FPGAs Puneet Kumar Prateek Sikka Application Engineering Team

CMPE 415 Programmable Logic Devices Introduction

MS Diploma and Semester Projects offered at the Microelectronic Systems Laboratory during the Spring 2015

Intelligent Interconnect for Autonomous Vehicle SoCs. Sam Wong / Chi Peng, NetSpeed Systems

Chapter 5: ASICs Vs. PLDs

3D Time-of-Flight Image Sensor Solutions for Mobile Devices

Презентация на SEMICON России 2014 Company Introduction. Jens Benndorf, Managing Director, COO

Speedcore efpga Datasheet (DS003) Speedcore efpga

Towards Fully-automated Driving. tue-mps.org. Challenges and Potential Solutions. Dr. Gijs Dubbelman Mobile Perception Systems EE-SPS/VCA

Signs of Intelligent Life: AI Simplifies IoT

ISE Design Suite Software Manuals and Help

Basic FPGA Architectures. Actel FPGAs. PLD Technologies: Antifuse. 3 Digital Systems Implementation Programmable Logic Devices

FPGA for Complex System Implementation. National Chiao Tung University Chun-Jen Tsai 04/14/2011

Open Standards for Vision and AI Peter McGuinness NNEF WG Chair CEO, Highwai, Inc May 2018

Specializing Hardware for Image Processing

easic Technology & Nextreme Architecture

Artificial Intelligence Enriched User Experience with ARM Technologies

Frequently Asked Questions (FAQ)

Designing and Prototyping Digital Systems on SoC FPGA The MathWorks, Inc. 1

Adaptable Intelligence The Next Computing Era

Xilinx Corporate Overview

Utilizing SDSoC to Port Convolutional Neural Network to a Space-grade FPGA

SYSTEMS ON CHIP (SOC) FOR EMBEDDED APPLICATIONS

Design of a Processing Structure of CNN Algorithm using Filter Buffers

Overview. Design flow. Principles of logic synthesis. Logic Synthesis with the common tools. Conclusions

ESL design with the Agility Compiler for SystemC

Design and Verification of FPGA Applications

FPGA 加速机器学习应用. 罗霖 2017 年 6 月 20 日

ASYNC Rik van de Wiel COO Handshake Solutions

Algorithmic Memory Increases Memory Performance By an Order of Magnitude

Bring Intelligence to the Edge with Intel Movidius Neural Compute Stick

Catapult: A Reconfigurable Fabric for Petaflop Computing in the Cloud

FPGA Based Digital Design Using Verilog HDL

Use the advantages of embedded VisualApplets: Realize custom image processing on the FPGAs of your cameras in shortest time!

The Use Of Virtual Platforms In MP-SoC Design. Eshel Haritan, VP Engineering CoWare Inc. MPSoC 2006

direct hardware mapping of cnns on fpga-based smart cameras

Making Sense of Artificial Intelligence: A Practical Guide

Executive Brief: The CompassIntel A-List Index A-List in Artificial Intelligence Chipset

Indian Silicon Technologies 2013

AIScaleCDP2 IP Core Preliminary Datasheet

THE NVIDIA DEEP LEARNING ACCELERATOR

Pushing the Boundaries of Moore's Law to Transition from FPGA to All Programmable Platform Ivo Bolsens, SVP & CTO Xilinx ISPD, March 2017

Design Techniques for Implementing an 800MHz ARM v5 Core for Foundry-Based SoC Integration. Faraday Technology Corp.

100M Gate Designs in FPGAs

So you think developing an SoC needs to be complex or expensive? Think again

Support Triangle rendering with texturing: used for bitmap rotation, transformation or scaling

Embedded Real-Time Video Processing System on FPGA

Introducing the FX-14 ASIC Design System. Embargoed until November 10, 2015

Comprehensive Arm Solutions for Innovative Machine Learning (ML) and Computer Vision (CV) Applications

FPGA Technology and Industry Experience

Ten Reasons to Optimize a Processor

EE 4755 Digital Design Using Hardware Description Languages

DEEP NEURAL NETWORKS CHANGING THE AUTONOMOUS VEHICLE LANDSCAPE. Dennis Lui August 2017

Scaling Convolutional Neural Networks on Reconfigurable Logic Michaela Blott, Principal Engineer, Xilinx Research

May 26, 2017 Data Sheet Version: Table 1: Example Implementation Statistics for Xilinx FPGAs. Slices/CLB 1 (FFs/ LUTs) IOB CMT BRAM 2 DSP48A

Object Detection. Part1. Presenter: Dae-Yong

INTRODUCTION TO FPGA ARCHITECTURE

Exploring System Coherency and Maximizing Performance of Mobile Memory Systems

Edge Computing and the Next Generation of IoT Sensors. Alex Raimondi

ECE5775 High-Level Digital Design Automation, Fall 2018 School of Electrical Computer Engineering, Cornell University

Introduction to Field Programmable Gate Arrays

Preparing for Mass Market Virtual Reality: A Mobile Perspective. Qualcomm Technologies, Inc. September 16, 2017

Transcription:

efpga for Neural Network based Image Recognition June 26, 2018 Yoan Dupret Managing Director Menta yoan.dupret@menta-efpga.com Copyright @ 2018 Menta S.A.S.

Menta Overview 17 employees 11 years of R&D 10 Tapeout on 4 nodes at 3 Foundries Some partners Some customers 6/26/2018 2

Fully flexible standard cells efpga efpga IP Origami Tool suite Origami Programmer From RTL to bitstream Origami Designer efpga IP definition 6/26/2018 3

Products offering Wearables/IoT Smartphones Aerospace & Defense Networking Automotive Wireless Base Station Data Center SMALL MEDIUM LARGE XLARGE <2K LC ]2K to 6K[ LC ]6K to 60K[ LC >60K LC efpga IPs examples Small Medium Large XLArge Name M5S0.5 M5S2 M5M5 M5L15 M5L40 M5XL65 M5XL130 # LC 538 1 690 5 120 15 232 40 141 65 434 ~130 000 # MAC 0 6 10 0 0 0 ~1000 # CDSP 0 0 0 21 48 18 0 SRAM (kb) 0 0 1 311 0 2 097 2 359 5 252 # IOs 208 384 1 032 2 304 3 744 4 640 6 500 Interface / Sensor Hub Consumer Electronics Communication Signal Processing Industrial Aerospace & Defense Security Automotive Data Center & HPC 6/26/2018 4

5 th generation efpga IP High density & performances Patented MLUTs - LUT6 based Fully flexible Number of elbs ASIC like embs: type, quantity & size ecbs: Catalogue of DSP blocks (MAC, CDSP) Custom blocks Number of IOs No specific interface Data: AXI, AHB, proprietary buses, direct connections, etc. Configuration: SPI, AXI/AHB, JTAG, etc. Various ASIC like power management options Standard scan chain. TC > 99.7% 100% 3 rd party standard cells based Robust verification flow GDSII delivery in 1 to 6 months Other models possible 6/26/2018 5

Custom blocks Embedded customers arithmetic blocks Optimization of performances & area Integration: Complete integration possible (auto-inference, STA) or As a black box Custom block to target AI applications under definition (Kortiq / Menta) 6/26/2018 6

Origami Programmer From RTL to bitstream Inputs: Application RTL IEEE Verilog / SystemVerilog / VHDL Constraints: timing & IOs Outputs: Bitstream Simulation model Timing reports GUI interface: STA Resources visualisation Congestion maps Move and reallocate resources Soft IPs generator Various distribution models possible 6/26/2018 7

efpga IP requirements & Menta answers Seamless SoC Integration Menta 100% standard cells No specific interface. High yield and reliability Menta is using Flops instead of SRAM bitcells for bitstream storage Area sensitivity Support of any kind of arithmetic blocks (DSPs) and SRAM (i.e. BRAM / LRAM) IP is fully flexible. Specification software available: Origami Designer PPA comparable to competitors Testability & test cost Standard scan chain DfT with TC > 99.7% (patented) Fully verifiable in customer EDA environment Flexible Origami Programmer business model 6/26/2018 8

AI inference at the edge Training Cloud Data collection Inference Edge Devices Data Center 6/26/2018 9 Icons made by Freepik & SimpleIcon from www.flaticon.com

Inference on edge devices requirements Cost for high volume production (mobile phones, security camera) Flexibility over lifetime (cars, robots) Power consumption (all) 6/26/2018 10

One fit all solution ASIC GPU FPGA DSP Cost +++ ++ - ++ Flexibility NA ++ +++ + Power consumption +++ - + + efpga with custom Block Modules running highly efficient CNN algorithms Menta + Kortiq Cost ++ Flexibility +++ Power consumption ++ 6/26/2018 11

About Kortiq Munich based company specializing in IP cores for hardware acceleration of machine learning algorithms IP portfolio includes IP cores for accelerating decision trees, support vector machines, artificial neural networks and ensemble classifiers Latest addition to the IP portfolio is the AIScale, universal CNN accelerator IP core More information available at www.kortiq.com 6/26/2018 12

Kortiq IP on FPGA Smallest & most efficient CNN accelerator Works with compressed CNNs and feature maps Employs novel all zeros skipping technology Capable of accelerating all standard CNN layers, including convolutional, pooling, adding, fully-connected Supports on-the-fly, dynamic changing of CNN architecture that needs to be accelerated Implementable using any Xilinx FPGA family 6/26/2018 13

ADAS vision application Pilot project with a well-known car manufacturer Phase 1 traffic sign detection and recognition Phase 2 person detection Phase 3 fully autonomous driving system 6/26/2018 14

Phase 1 Details Single Shot MultiBox Detector (SSD) method, using various CNN architectures as the base classifier Input image size 300x300 Target frame rate > 25 fps VGG16SSD CNN Visualization made by GenProto 6/26/2018 15

Menta Kortiq Cooperation Kortiq s ML IP cores on Menta efpga New Custom Block, AICB New AI Reconfigurable Architecture, AIRA ML SoC based on AIRA 6/26/2018 16

Thank you 6/26/2018 17