Multi-Core SoCs for ADAS and Image Recognition Applications

Takashi Miyamori, Senior Manager
Embedded Core Technology Development Department
Center for Semiconductor Research & Development
Storage Device Solution Company, TOSHIBA Corporation
September 9, 2016
Copyright 2014, 2015 Toshiba Corporation

Background

Image recognition technology is used in a variety of products:
- Face
- Forward Collision Warning
- Backover Prevention
- Face Recognition for Door Security
- Pedestrian
- Driver Monitoring
- Hand Gesture
- Traffic Sign Recognition
- Lane Change Assistance

The Top 10 Causes of Death

Rank  2015                                    2030
1     Ischaemic heart disease                 Ischaemic heart disease
2     Stroke                                  Stroke
3     Lower respiratory infections            Chronic obstructive pulmonary disease
4     Chronic obstructive pulmonary disease   Lower respiratory infections
5     Diarrhoeal diseases                     Diarrhoeal diseases
6     HIV/AIDS                                Trachea, bronchus, lung cancers
7     Trachea, bronchus, lung cancers         Road injury (1.9 million)
8     Diabetes mellitus                       HIV/AIDS
9     Road injury (1.4 million)               Diarrhoeal diseases
10    Hypertensive heart disease              Hypertensive heart disease

Source: World Health Organization (WHO), Projections of mortality and causes of death, 2015 and 2030

- More than 1 million people die in traffic accidents every year.
- 50 million people are injured in non-fatal accidents.

How can we decrease road injuries?

ADAS (Advanced Driver Assistance System) Market Growth

- ADAS CAGR: 18%
- cf. Overall Automotive Electronic System: 6%; Powertrain System: 8%; Steering/Brake System: 3%

[Chart: ADAS volume in M units (axis 0 to 180) by system type, 2011 to 2020. Legend labels are truncated in the source: Night V, Adapti, Drows, LDWS, Blindsp, Parkin, Distanc]

Source: Strategy Analytics, Advanced Driver Assistance Systems Demand Forecast

Roadmap of Image Recognition SoCs

- 1st Generation: T5BG3XBG
- 2nd Generation (TMPV750 Series): TMPV7502XBG, TMPV7504XBG, TMPV7506XBG
- 3rd Generation (TMPV750 Series): TMPV7528XBG
- 4th Generation (TMPV760 Series): TMPV7608XBG
- Next Generation

Timeline: 2004, 2013, 2014, 2016, 2017, 2018

Euro NCAP Requirements (NCAP: New Car Assessment Programme):
- Lane Departure Warning
- Forward Collision Warning
- Pedestrian (Daytime)
- Pedestrian (Daytime & Night)
- Cyclists (Daytime & Night)

Image Recognition SoCs

1st Gen. T5BG3XBG [CICC 2004]
- 0.13 µm CMOS, 1W@1.5V
- Core x3 + HWA x1
- Single application (Lane, Forward Collision Warning, or Night Vision)

2nd Gen. TMPV7506XBG [ISSCC 2012]
- 40nm CMOS, 1W@1.1V
- Core x4 + HWA x6
- 464GOPS, 617GOPS/W
- 4 apps. execution including Pedestrian Detection (PD)

4th Gen. TMPV7608XBG [ISSCC 2015]
- 40nm CMOS, 3W@1.1V
- Core x8 + HWA x14
- 1900GOPS, 564GOPS/W
- 8 apps. execution including PD at night-time

Demo Video of 2nd Generation SoC

- Lane
- Pedestrian and Vehicle (Daytime)

Multiple ADAS Applications (4th Gen.)

- Vehicle (including motorbikes, etc.)
- Pedestrian (including wheelchairs)
- Traffic Sign Recognition
- Lane

New features of 4th Gen.:
- Automatic High Beam
- General Obstacle
- Traffic Light Recognition

Design Concept of Image Recognition SoCs

Requirements:
- High performance with low power consumption
- High accuracy of object recognition

Heterogeneous multi-core architecture:
- Energy-efficient multi-core
- Hardware accelerators for performance bottlenecks and frequently used tasks
- Highly accurate image recognition features

A highly parallelized approach was adopted rather than a high clock frequency approach.

Image Recognition Processing

Pipeline stages:
1. Preprocessing (input: full-size image): Distortion Correction, Noise Reduction; produces Pyramid image, Integral image, Gradient image
2. Feature Extraction (on ROI): Haar-Like, HOG, CoHOG, etc.
3. Classification/Tracking (on ROI features, using a dictionary): SVM, Random Forest, etc.
4. Clustering/Decision-Making (on symbols): for example, a "Be Careful!" warning

From the first stage to the last:
- Data size: large to small
- Algorithm: simple to complex
- Processing unit: HW accelerator to multi-core
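The preprocessing stage above names integral and pyramid images. The following is a minimal pure-Python sketch of both, under the assumption of a small grayscale image stored as a list of rows; the function names are illustrative and are not taken from the Toshiba SoC software. It shows why these intermediate images suit a hardware accelerator: they are simple, regular operations over the full-size image.

```python
def integral_image(img):
    """Return a summed-area table with one extra zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of the pixels in rows top..bottom-1 and columns left..right-1."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def downsample_2x(img):
    """One pyramid level: average each 2x2 block (integer division)."""
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) // 4
             for x in range(0, len(img[0]) - 1, 2)]
            for y in range(0, len(img) - 1, 2)]


# Hypothetical 4x4 test image.
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
ii = integral_image(image)
center_sum = box_sum(ii, 1, 1, 3, 3)   # sum of the central 2x2 block
half_size = downsample_2x(image)       # next pyramid level
```

Once the integral image exists, any rectangular sum (as used by Haar-like features) costs four lookups regardless of the rectangle size.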

Hardware Accelerators

- 1st Gen.: T5BG3XBG
- 2nd Gen.: TMPV7506XBG, 5 types of 6 HWAs
- 4th Gen.: TMPV7608XBG, 8 types of 14 HWAs

Accelerator types:
- Affine Trans. Accelerator
- Template Matching Accelerator
- Pyramid Image Accelerator
- Filter Accelerator
- SfM Accelerator (SfM: Structure from Motion)

Image recognition feature descriptors:
- Histogram Accelerator
- CoHOG Accelerator
- HOX Accelerator (HOX: Histogram of X)

Block Diagram of TMPV7608XBG (4th Gen.)

- Processor clusters: 4-Core #1 and 4-Core #2, each with Core #1 to Core #4
- SRAM: 1.5MB
- Accelerators: Affine Acc. #1, Filter Acc. #1, SfM Acc., Pyramid Acc. #1, HOX Acc. #1, Matching Acc. #1, Histogram Acc., CoHOG Acc.
- Interconnect: two crossbar switches, each 128b x 133MHz
- 32b RISC
- Interfaces: LPDDR2 I/F, Video In (4ch), Video Out (2ch), CAN I/F, Misc I/F
- Memory bandwidth: LPDDR2 64-bit x 2ch: 12.8GB/sec
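The bandwidth figures in the block diagram can be checked with simple arithmetic. The sketch below assumes that "64-bit x 2ch" means two independent 64-bit channels; under that assumption, 12.8GB/sec implies 800 mega-transfers per second per channel. That transfer rate is inferred here and is not stated on the slide. The crossbar figure is the raw width times clock for one port.

```python
def bandwidth_gb_per_s(bus_bits, channels, transfers_per_s):
    """Peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return bus_bits / 8 * channels * transfers_per_s / 1e9


# LPDDR2: bytes moved per transfer across both channels (assumed 2 x 64-bit).
bytes_per_transfer = 64 // 8 * 2

# Transfer rate implied by the slide's 12.8 GB/s figure.
implied_transfers_per_s = 12.8e9 / bytes_per_transfer

# Cross-check: plugging the implied rate back in reproduces 12.8 GB/s.
lpddr2_bw = bandwidth_gb_per_s(64, 2, implied_transfers_per_s)

# One 128-bit crossbar port clocked at 133 MHz.
crossbar_bw = bandwidth_gb_per_s(128, 1, 133e6)

print(f"implied LPDDR2 rate: {implied_transfers_per_s / 1e6:.0f} MT/s")
print(f"crossbar port: {crossbar_bw:.3f} GB/s")
```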

4-Core Processor Clusters (4th Gen.)

- Each cluster has four 3-way VLIW processors
- 3-way VLIW: 32b RISC core + 2-way coprocessor
- Coprocessor supports:
  - Integer 8/4/2-parallel 8/16/32-bit SIMD instructions
  - Double precision floating-point instructions

[Diagram: each 3-way VLIW processor has a 16KB I$, a 16KB D$, a 64KB Data RAM, and a DMAC; the four processors share a 256KB L2$. Datapath: a 32b RISC core (Dec./RF, ALU, D$/D-RAM) plus a 2-way coprocessor, where each way has a decoder, a coprocessor RF, an integer 64-b SIMD unit (ALU, MUL, Acc.), and a DP scalar FP ALU]
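The "8/4/2-parallel 8/16/32-bit" wording follows from dividing a 64-bit SIMD word into lanes. The sketch below emulates a packed add in plain Python to illustrate the lane arithmetic. It assumes wraparound within each lane; the actual coprocessor instruction set (for example, whether it saturates) is not described in the slides, and the helper names are hypothetical.

```python
def pack(values, lane_bits):
    """Pack lane values (lane 0 in the lowest bits) into one integer word."""
    mask = (1 << lane_bits) - 1
    word = 0
    for i, v in enumerate(values):
        word |= (v & mask) << (i * lane_bits)
    return word


def unpack(word, lane_bits, width=64):
    """Split a word back into its lane values."""
    mask = (1 << lane_bits) - 1
    return [(word >> (i * lane_bits)) & mask for i in range(width // lane_bits)]


def packed_add(a, b, lane_bits, width=64):
    """Add two words lane by lane; carries never cross a lane boundary."""
    mask = (1 << lane_bits) - 1
    result = 0
    for i in range(width // lane_bits):
        shift = i * lane_bits
        lane_sum = (((a >> shift) & mask) + ((b >> shift) & mask)) & mask
        result |= lane_sum << shift
    return result


# Lanes per 64-bit word for 8-, 16-, and 32-bit elements.
lanes = [64 // bits for bits in (8, 16, 32)]

# Eight 8-bit pixels brightened by 10 in one packed operation.
pixels = [250, 10, 20, 30, 40, 50, 60, 70]
brightened = unpack(packed_add(pack(pixels, 8), pack([10] * 8, 8), 8), 8)
```

The first pixel wraps from 250 to 4 in this sketch, which is why image pipelines often prefer saturating arithmetic where the hardware offers it.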

Improvement by Color-Based Feature (4th Gen.)

- Euro NCAP will adopt an Automatic Emergency Braking (AEB) test (with pedestrian and cyclist detection) in darkness in 2018.
- Recognition accuracy is improved by a new image feature:
  - Color-based image features describe shape and texture
  - Extension to the CoHOG feature
- When a background object has luminance similar to the pedestrian's, color information can still reveal the pedestrian's boundary.

Four Color-Based Feature Descriptors* (4th Gen.)

- Four feature descriptors work complementarily: Color Histogram, CoHD, Color-CoHOG, and CoHED
- They are designed to capture different types of information

[Figure: the descriptors Color Histogram, CoHD, Color-CoHOG, CoHED, and CoHOG arranged by color type (Direct Color, Relative Color, Monochrome) and by the information they capture (Distribution, Texture, Shape)]

- CoHOG: Co-occurrence histograms of oriented gradients
- CoHED: Co-occurrence histograms of pairs of edge orientations and color differences
- CoHD: Co-occurrence histograms of color differences

*) Satoshi Ito and Susumu Kubota, "Object Classification Using Heterogeneous Co-occurrence Features," Computer Vision - ECCV 2010, Lecture Notes in Computer Science, Volume 6315, 2010, pp. 701-714
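To make "co-occurrence histograms of oriented gradients" concrete, here is a heavily simplified pure-Python sketch. It quantizes the gradient orientation at each pixel into bins and then counts how often a pair of orientations occurs at a fixed pixel offset. This is an illustration only: it uses a single offset over the whole image and central differences, which are assumptions made here, and it does not reproduce the accelerator's implementation or the color-based variants (CoHD, Color-CoHOG, CoHED).

```python
import math


def orientation_map(img, bins=8):
    """Quantize the gradient orientation of each interior pixel into bins.

    Pixels with zero gradient (and border pixels) are marked None.
    """
    h, w = len(img), len(img[0])
    out = [[None] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = img[y][x + 1] - img[y][x - 1]
            gy = img[y + 1][x] - img[y - 1][x]
            if gx == 0 and gy == 0:
                continue  # flat region: no orientation
            angle = math.atan2(gy, gx) % (2 * math.pi)
            out[y][x] = int(angle / (2 * math.pi) * bins) % bins
    return out


def cooccurrence(orient, dx, dy, bins=8):
    """Count orientation pairs (o(p), o(p + offset)) over the image."""
    hist = [[0] * bins for _ in range(bins)]
    h, w = len(orient), len(orient[0])
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                a, b = orient[y][x], orient[y2][x2]
                if a is not None and b is not None:
                    hist[a][b] += 1
    return hist


# Hypothetical 5x5 image containing one vertical edge.
edge = [[0, 0, 10, 10, 10] for _ in range(5)]
orient = orientation_map(edge)
horizontal_pairs = cooccurrence(orient, dx=1, dy=0)
vertical_pairs = cooccurrence(orient, dx=0, dy=1)
```

A full descriptor would concatenate such matrices over many offsets and image regions, which is why the feature vector is large and a dedicated accelerator is attractive.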

Performance of HOX Accelerator (4th Gen.)

[Figure: Accuracy of Pedestrian Recognition. False Negative Rate versus False Positive Rate (lower is better) for CoHOG and for the 4 color features, each on the daytime dataset and on the night-time dataset (original datasets)]

Accuracy at night using the HOX accelerator is almost the same as accuracy at daytime using the CoHOG accelerator.
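The plot's two axes can be computed from classifier scores as follows. This is a generic sketch with made-up scores and labels, not data from the presentation; sweeping the threshold traces out one curve of the kind shown in the figure.

```python
def error_rates(scores, labels, threshold):
    """Return (false negative rate, false positive rate) at a threshold.

    labels: 1 for pedestrian, 0 for non-pedestrian.
    A sample is classified as pedestrian when score >= threshold.
    """
    fn = sum(1 for s, l in zip(scores, labels) if l == 1 and s < threshold)
    fp = sum(1 for s, l in zip(scores, labels) if l == 0 and s >= threshold)
    positives = sum(labels)
    negatives = len(labels) - positives
    return fn / positives, fp / negatives


# Hypothetical classifier outputs for four pedestrians and four non-pedestrians.
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.6, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

loose = error_rates(scores, labels, 0.5)
strict = error_rates(scores, labels, 0.65)
```

Raising the threshold lowers the false positive rate, usually at the cost of a higher false negative rate; a better feature moves the whole curve toward the origin.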

Demo Video of 4th Generation SoC

- Traffic Sign Recognition
- Moving Object
- Vehicle
- Head Light
- Pedestrian
- Bicycle
- Distance Estimation
- Lane

3D Reconstruction (SfM: Structure from Motion)

- TMPV7608XBG (4th generation SoC) supports 3D reconstruction using a single camera for general obstacle detection
- Multiple images are taken from different viewing angles as the camera moves

[Figure: single camera, camera motion, 3D reconstruction, 3D shape]

Principle of SfM: Triangulation

- Step 1) Find corresponding points
- Step 2) Estimate movement of camera
- Step 3) Depth is estimated by triangulation

[Figure: the same scene point observed at Time: t and Time: t+1]
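Step 3 can be illustrated with the simplest possible geometry. The sketch below assumes the camera translates purely sideways between time t and t+1, with no rotation, so that depth follows from similar triangles as Z = f * b / d. Real structure from motion on a vehicle must handle general camera motion estimated in Step 2; this special case and all parameter names are assumptions for illustration only.

```python
def depth_from_disparity(focal_px, baseline_m, x_t, x_t1):
    """Depth (metres) of a point seen at column x_t, then x_t1.

    focal_px:   focal length in pixels
    baseline_m: sideways camera movement between the two frames, in metres
    x_t, x_t1:  image column of the corresponding point at time t and t+1
    """
    disparity = abs(x_t - x_t1)
    if disparity == 0:
        raise ValueError("zero disparity: point is at infinity or camera did not move")
    return focal_px * baseline_m / disparity


# Hypothetical numbers: 1000 px focal length, camera moved 0.5 m.
near = depth_from_disparity(1000.0, 0.5, 120.0, 100.0)   # large shift in the image
far = depth_from_disparity(1000.0, 0.5, 102.0, 100.0)    # small shift in the image
```

Nearby points shift more between frames than distant ones, which is what lets a single moving camera recover 3D shape.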

Performance and Power Consumption (4th Gen.)

                               2nd gen. [ISSCC2012]   4th gen. [ISSCC2015]
Total performance (GOPS)       464                    1900
Power consumption (mW)         749                    3368
Performance per Watt (GOPS/W)  617                    564

Conditions: Vdd=1.1V, Process=Center, Temp=25°C

Power consumption for real apps. on 4th gen.: 8 applications, 1.4W (typical condition)
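The performance-per-watt row can be recomputed from the other two rows. The sketch below does that with the figures from the table above. Note that the 2nd gen. value computed from the rounded inputs comes out near 619 GOPS/W rather than the quoted 617, presumably because the slide used unrounded inputs; that explanation is a guess.

```python
def gops_per_watt(gops, power_mw):
    """Energy efficiency from total performance and power in milliwatts."""
    return gops / (power_mw / 1000.0)


second_gen = gops_per_watt(464, 749)
fourth_gen = gops_per_watt(1900, 3368)

# Generation-over-generation scaling of performance and power.
perf_scale = 1900 / 464
power_scale = 3368 / 749

print(f"2nd gen: {second_gen:.1f} GOPS/W, 4th gen: {fourth_gen:.1f} GOPS/W")
print(f"performance x{perf_scale:.2f}, power x{power_scale:.2f}")
```

Performance grew by roughly the same factor as power, so efficiency stayed in the same range while the number of concurrent applications doubled.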

Chip Micrograph and Features (4th Gen.)

Process:    40nm CMOS LP
Chip size:  105.6mm² (9.60mm x 11.0mm)

Peak performance:
Block            Clock    Performance
Multi-core (x2)  266MHz   93.6GOPS
Affine (x3)      266MHz   134.9GOPS
Pyramid (x2)     266MHz   352.2GOPS
CoHOG            266MHz   25.5GOPS
HOX (x2)         266MHz   373.5GOPS
Filter (x2)      180MHz   230.4GOPS
Histogram        266MHz   5.3GOPS
Matching (x2)    266MHz   594.6GOPS
SfM              266MHz   90.0GOPS
Total                     1900GOPS

Perf./W:    564GOPS/W
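The per-block figures above add up to the quoted total, and summing them also shows how much of the peak comes from the accelerators rather than the processor clusters. The sketch below only re-adds the numbers from the slide; it treats each listed value as the combined figure for all instances of that block, which is what makes the total come out to 1900.

```python
peak_gops = {
    "Multi-core (x2)": 93.6,
    "Affine (x3)": 134.9,
    "Pyramid (x2)": 352.2,
    "CoHOG": 25.5,
    "HOX (x2)": 373.5,
    "Filter (x2)": 230.4,
    "Histogram": 5.3,
    "Matching (x2)": 594.6,
    "SfM": 90.0,
}

total = sum(peak_gops.values())

# Fraction of peak performance delivered by hardware accelerators.
accelerator_share = (total - peak_gops["Multi-core (x2)"]) / total

# Die area from the quoted dimensions, in square millimetres.
area_mm2 = 9.60 * 11.0

print(f"total: {total:.1f} GOPS, accelerators: {accelerator_share:.1%}")
```

Roughly 95% of the peak figure comes from the accelerators, which matches the design concept of offloading bottleneck tasks from the cores.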

Conclusion

Multi-core SoCs dedicated to ADAS and image recognition applications

- High performance with low power consumption
  - Energy-efficient multi-core
  - Highly parallelized hardware accelerators
- Highly accurate recognition
  - CoHOG and four color-based features (Color Histogram, CoHD, Color-CoHOG, and CoHED)
  - Accurate pedestrian detection even at night
  - Structure from Motion (SfM) for general obstacle detection

Toshiba Electronics Europe GmbH