A new Computer Vision Processor Chip Design for automotive ADAS CNN applications in 22nm FDSOI based on Cadence VP6 Technology

Similar documents
A new Computer Vision Processor Chip Design for automotive ADAS CNN applications in 22nm FDSOI SOI Symposium Santa Clara, Apr.

SDIP ASIC Update Tensilica Day Hannover Martin Zeller (DCT)

Презентация на SEMICON России 2014 Company Introduction. Jens Benndorf, Managing Director, COO

The Path to Embedded Vision & AI using a Low Power Vision DSP. Yair Siegel, Director of Segment Marketing Hotchips August 2016

NVIDIA'S DEEP LEARNING ACCELERATOR MEETS SIFIVE'S FREEDOM PLATFORM. Frans Sijstermans (NVIDIA) & Yunsup Lee (SiFive)

Introduction to Sitara AM437x Processors

MYC-C437X CPU Module

EyeCheck Smart Cameras

MYC-C7Z010/20 CPU Module

Throughput-Optimized OpenCL-based FPGA Accelerator for Large-Scale Convolutional Neural Networks

TEGRA K1 AND THE AUTOMOTIVE INDUSTRY. Gernot Ziegler, Timo Stich

MIPI : Advanced Driver Assistance System

Small is the New Big: Data Analytics on the Edge

MYD-C437X-PRU Development Board

MYD-C7Z010/20 Development Board

Your Strategic Partner for Renesas RZ/G1x Products & Solutions

IOT-GATE-iMX7 Datasheet

SoC Platforms and CPU Cores

. Micro SD Card Socket. SMARC 2.0 Compliant

THE LEADER IN VISUAL COMPUTING

Autonomous Driving Solutions

Kontron s ARM-based COM solutions and software services

Nvidia Jetson TX2 and its Software Toolset. João Fernandes 2017/2018

Full Linux on FPGA. Sven Gregori

Lesson 6 Intel Galileo and Edison Prototype Development Platforms. Chapter-8 L06: "Internet of Things ", Raj Kamal, Publs.: McGraw-Hill Education

S100 Series. Compact Smart Camera. High Performance: Dual Core Cortex-A9 processor and Xilinx FPGA. acquisition and preprocessing

Designing Multi-Channel, Real-Time Video Processors with Zynq All Programmable SoC Hyuk Kim Embedded Specialist Jun, 2014

Low-Power Neural Processor for Embedded Human and Face detection

S2C K7 Prodigy Logic Module Series

Simplify System Complexity

Close to the Edge How Neural Network inferencing is migrating to specialised DSPs in State of the Art SoCs. Marcus Binning Sept 2018 Lund

pcduino V3B XC4350 User Manual

Designing with NXP i.mx8m SoC

FEATURES. APPLICATIONS Machine Vision Embedded Instrumentation Motion Control Traffic Monitoring Security

Software Driven Verification at SoC Level. Perspec System Verifier Overview

SOM IB8000 Quad Core SOM (System-On-Module) Rev 1.3

Hugo Cunha. Senior Firmware Developer Globaltronics

COM-RZN1D - Hardware Manual

INTEGRATING COMPUTER VISION SENSOR INNOVATIONS INTO MOBILE DEVICES. Eli Savransky Principal Architect - CTO Office Mobile BU NVIDIA corp.

ARM+DSP - a winning combination on Qseven

Area View Kit DC AVK. Features. Applications

Intel Galileo gen 2 Board

M2-SM6-xx - i.mx 6 based SMARC Modules

ECE 471 Embedded Systems Lecture 3

Revolutionizing the Datacenter

CM10 Rugged COM Express with TI Sitara ARM Cortex-A15

Renesas Synergy MCUs Build a Foundation for Groundbreaking Integrated Embedded Platform Development

XPU A Programmable FPGA Accelerator for Diverse Workloads

Zynq-7000 All Programmable SoC Product Overview

Introducing the AM57x Sitara Processors from Texas Instruments

Scalable and Modularized RTL Compilation of Convolutional Neural Networks onto FPGA

DevKit7000 Evaluation Kit

Reducing Time-to-Market with i.mx6-based Qseven Modules

Industrial Vision Days 2010 C. Strampe: ATOM oder DSP? Embedded Lösungen im Vergleich

C7 Player. Overview. Specifications

IoT, Wearable, Networking and Automotive Markets Driving External Memory Innovation Jim Cooke, Sr. Ecosystem Enabling Manager, Embedded Business Unit

December 1, 2015 Jason Kridner

NXP-Freescale i.mx6 MicroSoM i4pro. Quad Core SoM (System-On-Module) Rev 1.3

OK335xS Users Manual Part I - Introduction

Atlys (Xilinx Spartan-6 LX45)

ECE 471 Embedded Systems Lecture 2

TIOVX TI s OpenVX Implementation

. SMARC 2.0 Compliant

ARM Cortex-A9 ARM v7-a. A programmer s perspective Part1

RunBMC - A Modular BMC Mezzanine Card BUV - Bring Up Vehicle For BMC Mezzanine. Eric Shobe & Jared Mednick Hardware Engineer - Salesforce

Intelop. *As new IP blocks become available, please contact the factory for the latest updated info.

DEEP NEURAL NETWORKS CHANGING THE AUTONOMOUS VEHICLE LANDSCAPE. Dennis Lui August 2017

SOM i1 Single Core SOM (System-On-Module) Rev 1.5

UltraZed -EV Starter Kit Getting Started Version 1.3

Embedded Vision Solutions

Hardware Specification. Figure 1-2 ZYNQ-7000 Device Family 2 / 9

LinkSprite Technologies,.Inc. pcduino V2

ECE 471 Embedded Systems Lecture 2

Homework 9: Software Design Considerations

Agile Hardware Design: Building Chips with Small Teams

Introduction to the Itron Riva Dev Kits

Arm Technology in Automotive Geely Automotive Shanghai Innovation Center

A176 Cyclone. GPGPU Fanless Small FF RediBuilt Supercomputer. IT and Instrumentation for industry. Aitech I/O

Brainchip OCTOBER

Deep Learning Requirements for Autonomous Vehicles

Embedded Linux Conference San Diego 2016

Designing a Multi-Processor based system with FPGAs

M100 GigE Series. Multi-Camera Vision Controller. Easy cabling with PoE. Multiple inspections available thanks to 6 GigE Vision ports and 4 USB3 ports

Adaptive processor architectures for detector applications

Smart Ultra-Low Power Visual Sensing

Ten (or so) Small Computers

Simplify System Complexity

Compact form factor. High speed MXM edge connector. Processor. Max Cores 4. Max Thread 4. Memory. Graphics. Video Interfaces.

Altera SDK for OpenCL

NXP-Freescale i.mx6. Dual Core SOM (System-On-Module) Rev 1.5

Deep Learning on Arm Cortex-M Microcontrollers. Rod Crawford Director Software Technologies, Arm

AT-501 Cortex-A5 System On Module Product Brief

M100 GigE Series. Multi-Camera Vision Controller. Easy cabling with PoE. Multiple inspections available thanks to 6 GigE Vision ports and 4 USB3 ports

Revolutionizing RISC-V based application design possibilities with GLOBALFOUNDRIES. Gregg Bartlett Senior Vice President, CMOS Business Unit

Ettus Research Update

BACH PRO AUDIO NETWORKING & DSP PRODUCT CATALOG September 16, 2016

Convolutional Neural Networks: Applications and a short timeline. 7th Deep Learning Meetup Kornel Kis Vienna,

A Closer Look at the Epiphany IV 28nm 64 core Coprocessor. Andreas Olofsson PEGPUM 2013

November 3, 2015 Jason Kridner

Design Choices for FPGA-based SoCs When Adding a SATA Storage }

Transcription:

Dr.-Ing Jens Benndorf (DCT) Gregor Schewior (DCT) A new Computer Vision Processor Chip Design for automotive ADAS CNN applications in 22nm FDSOI based on Cadence VP6 Technology Tensilica Day 2017 16th Feb. 2017, Leibniz University Hanover, Germany

DCT Company Profile Dream Chip Technologies Positioned as a Fabless Microelectronic Engineering Company for medium to large SoC designs covering the whole range from Architecture, Specification, Design, Verification to GDSII Technologies: 130nm, 40nm, 28nm, 22nm FDX, 14/16nm FF 60 Employees/ 52 Engineers with 10 20 years SoC design experience Based in Hanover (HQ) and Hamburg, Germany Member of Silicon Saxony/ Germany! Cadence Design Center Partner for Tensilica Dream Chip Technologies Confidential 2

Assisted Driving requires Cameras, Radar and Ultrasonic Dream Chip Technologies GmbH Dream Chip Technologies Confidential Source: Texas Instruments

Use Case #1: Digital Mirroring Dream Chip Technologies Confidential 4

Use Case #1: Digital Mirroring - The Multiview Automotive multi camera systems for Bird-View, Rear-View and Panorama-View are a major part of today s emerging technologies to make driving more safe and comfortable and to move towards autonomous vehicles. Dream Chip Technologies Confidential 5

Use Case #2 : 360 deg Top View Camera Dream Chip Technologies Confidential 6

Use Case #3 : Pedestrian detection via CNNs Dream Chip Technologies Confidential 7

Dream Chip Technologies Confidential Dream Chip Technologies GmbH

Introduction Image Sensor Processing Overview Dream Chip Technologies Confidential 9

Heterogeneous Cores for Image Sensor Processing CPU GPU (VP) DSP (VP, CPU) Vector DSP (VP, HW) Dream Chip Technologies Confidential 10

Example: 360 deg Top View Tasks: Fish-Eye Lens correction, Stitching, Warping, Photometric synchronization for ADAS surround sensors Mapping: HDMIin, HD-SDI HDMIout Dream Chip Technologies Confidential 11

Dream Chip Technologies Confidential Dream Chip Technologies GmbH

Low Power CNN Architecture Proposal Dream Chip Technologies GmbH Lock-Step ARM A53: Object L. VP6: Pixel Level HEs: Pixel Level I/O IF Memory IF Dream Chip Technologies Confidential 13

Vision P6 architecture VLIW & SIMD ALU Ops Memory width # of vector registers 32 5 issue slots 64way 8-bit 32way 16-bit 16way 32-bit 64 32-bit 128 16-bit 256 8-bit 1024-bits 2 vector load/store units SuperGather Bus interface idma Target frequency Optional 32 non-contiguous locations read/written per instruction AXI4 no alignment restrictions, local memory to local memory transfers, 800 MHz @ 28 nm 1.1 GHz @ 16 nm Vector floating point, ECC Dream Chip Technologies Confidential 14

Dream Chip Technologies Confidential Dream Chip Technologies GmbH

Why CNNs? [Kaiming He, Xiangyu Zhang, Shaoqing Ren, & Jian Sun. Deep Residual Learning for Image Recognition. CVPR 2016.] Human: 5.1% Dream Chip Technologies Confidential 16

Why CNNs? ADAS video applications using CNNs Traffic sign recognition Pedestrian detection Image segmentation / Scene labeling Object localization Self driving cars Dream Chip Technologies Confidential 17

What is a CNN? Special case of a neural network a deep learning based approach for highquality object detection Neural network: System of interconnected artificial neurons inspired by biological neural system Neurons are the basic computation units of the brain connected with synapses Connections have numeric weights, tuned during training process Properly trained network will respond correctly when presented image or pattern to recognize Drawing of biological neuron Mathematical model of neuron Dream Chip Technologies Confidential 18

What is a CNN? cont. Neural network organized in multiple layers Fully connected layers CNNs have additional layers Convolutional layers Pooling / subsampling layers Non-linear layers Dream Chip Technologies Confidential 19

What is a CNN? cont. Convolutional layer: Motivated by visual cortex Contains cells responsible for detecting light in small, overlapping sub-regions of the visual field, called receptive fields k x k x D multiply-accumulates (MAC) required to create one element of one output feature Convolution outputs 3-dimensional Multiple convolutional layers Results in a lot of MACs per image (see next slide) Dream Chip Technologies Confidential 20

AlexNet CNN 5 convolutional layers 3 fully connected layers 60M weighs ~800M multiply-accumulate to process one 227x227x3 image Trigger function ReLU: f(x)=max(0,x) Dream Chip Technologies Confidential 21

The SIP Dream Chip Technologies GmbH Dream Chip Technologies Confidential

Assembling the SOM... Dream Chip Technologies GmbH Dream Chip Technologies Confidential

System on Module Overview DCT ADAS Heterogeneous Multi-Core Chip (22nm FDSOI Global Foundries) Board-to-board header with chip interfaces Expandable flash storage Power management and measurement Real Time Clock (RTC) Chip power supplies included Benefits Reduced application-specific baseboard complexity Interfaces customizable to application requirements Expandable application flash storage Power consumption measurement Only Single 12 VDC power supply required System-on-Module features Embedded 4GB LP-DDR4 2400 RAM 128MB ARM Cortex A53 storage 32MB ARM Cortex R5 storage Gigabit Ethernet PHY Power Management IC Real Time Clock (RTC) Interfaces Four 300MB/s video input interfaces One 300MB/s video output interface Gigabit Ethernet Dual Quad-SPI for application storage UART, I2C, SPI, and GPIO Dimensions 194mm x 100mm Dream Chip Technologies Confidential 24

4xHDMI Application Board Overview DCT ADAS Quad-HDMI Base Board Four HDMI 1.4b inputs One HDMI 1.4 output Custom high-speed headers available Remote power management Periodic power measurement Gigabit Ethernet CAN 2.0B USB UART Video Genlock generation Benefits Official reference design Prepared for custom sensor interfaces Remote system control Base board features Four ADV7611 HDMI 1.4b receivers One ADV7511 HDMI 1.4 transmitter Video data rates up to 1080p60 Two Intel MAX10 10M08DC FPGAs Four high-speed interface headers MCP2515 CAN 2.0B controller Gigabit Ethernet jack for SoM Micro-USB UART to SoM Micro-USB to System Controller Video Genlock generation Video input synchronization True output genlock possible Dimensions 200mm x 180mm Dream Chip Technologies Confidential 25

Software Development Kit Overview DCT ADAS Software Development Kit LEDE distribution with stable Linux 4.4.42 32-bit and 64-bit flavors available Tensilica LX7-IVP6 development support Kernel API drivers Benefits Official Software Development Kit Kernel drivers available Video buffer framework Multi-core processing examples SDK features Complete ARM build environment LEDE distribution (lede-project.org) GNU ARM gcc 5.4.0 Linux 4.4.42 u-boot 2017.01 musl libc 1.1.15 32-bit and 64-bit flavors all changes against respective mainline versions Kernel Drivers for QSPI, UART, I2C, Ethernet video framework Tensilica LX7-VP6 support Firmware control Debug access Dream Chip Technologies Confidential 26

Timeline, next steps 2014 2015 2016 2017 2018 FDSOI Technology Development (ST, GF) 28nm (ST) 22nm (GF) Forerunner (DCT) 12nm (GF) Tape Out1 Tape Out2 Prototype 1 Tape Out3 (planning) Prototype 2 Dream Chip Technologies Confidential 27

Upcoming: MWC Demo (IMS LUH) Dream Chip Technologies GmbH Dream Chip Technologies Confidential

Thank You! Please contact jens.benndorf@dreamchip.de Dream Chip Technologies GmbH Steinriede 10 D-30827 Garbsen/Hannover Germany ++49-5131-90805-0 Dream Chip Technologies Confidential