Speed Sign Detection Using Convolutional Neural Network Accelerator IP User Guide


Speed Sign Detection Using Convolutional Neural Network Accelerator IP FPGA-RD-02035-1.0 May 2018

Contents
  Acronyms in This Document
  Introduction
  Reference Design Overview
    Block diagram
  CNN Accelerator Engine
  SD Card Loader
  AXI Slave and DDR3 Memory Interface
  CSI2 to DVI Interface
  Video Processing Module
  Related Documentation
    Soft IP Document
    Diamond Document
  Hardware Requirements
  References
  Technical Support Assistance
  Revision History

Figures
  Figure 1.1. Embedded Vision Development Kit
  Figure 2.1. Speed Sign Detection Reference Design Block Diagram
  Figure 3.1. CNN Accelerator IP Core Generation GUI
  Figure 4.1. Neural Network Compiler Tool Output File Generation Flow
  Figure 7.1. Speed Sign Detection Design *.yml File Snippet

Acronyms in This Document
A list of acronyms used in this document.

  Acronym    Definition
  AXI        Advanced eXtensible Interface
  CNN        Convolutional Neural Network
  DRAM       Dynamic Random Access Memory
  DVI        Digital Visual Interface
  FPGA       Field Programmable Gate Array
  GUI        Graphical User Interface
  pASSP      Programmable Application Specific Standard Product
  PIP        Picture In Picture
  SD Card    Secure Digital (Memory) Card
  SPI        Serial Peripheral Interface

Introduction
This document describes the Speed Sign Detection machine learning neural network reference design. This reference design can be implemented on Lattice's Embedded Vision Development Kit, featuring the Lattice CrossLink pASSP and ECP5 FPGA.

[Figure 1.1. Embedded Vision Development Kit. The figure shows the stackable Modular Video Interface Platform (VIP) and its three boards:
  CrossLink Input Bridge Board: LIF-MD6000 pASSP, two Sony IMX 214 cameras, 2:1 CSI-2 MUX
  ECP5 Processor Board: ECP5-85 FPGA, image signal processing, sensor interface, IONOS ISP pipeline
  HDMI Output Bridge Board: SiI1136 HDMI ASSP, non-HDCP output
Kit features listed in the figure: all-inclusive demo system with video sources, prototyping header, easy programming via USB interface.]

Reference Design Overview

Block diagram
Figure 2.1 shows the block diagram of the Speed Sign Detection reference design.

[Figure 2.1. Speed Sign Detection Reference Design Block Diagram. Blocks inside the ECP5: DDR3 Control (ddr3_ip_inst), AXI Slave (axi2lattice128), CNN Accelerator Engine (lsc_ml_wrap), SD Loader (Sd_spi), CSI2_to_DVI_top, and Video Processing (crop_downscale). External components: DRAM, Micro SD card, camera, and HDMI TX. Signals labeled in the figure: Frame Data (32x32, 90x90) and Result. Legend: Lattice IP (Clarity), Face Tracking Demo Support Design Modules, External Components.]

The reference design uses the ECP5-85 FPGA, which contains the following major blocks:
  CNN accelerator engine
  SD card to SPI interface
  AXI Slave interface
  DDR3 memory interface
  CSI2 to DVI interface
  Video processing module

CNN Accelerator Engine
The Lattice Semiconductor CNN Accelerator IP Core can be used through the Diamond Clarity IP Designer. Engine configuration parameters can be set using the Clarity Designer's IP core configuration GUI, as shown in Figure 3.1.

[Figure 3.1. CNN Accelerator IP Core Generation GUI]

For detailed information about the Lattice Semiconductor CNN Accelerator IP core, such as the input data format, output data format, and command format, refer to CNN Accelerator IP Core (FPGA-IPUG-02037). For command generation by the Lattice Neural Network Compiler, refer to Lattice Neural Network Compiler Software (FPGA-UG-02052).

SD Card Loader
The SD card interface in this design is used to load the command data into DRAM for execution by the CNN Accelerator IP. The SD card contains a file that is generated by the Lattice Neural Network Compiler Tool. The tool analyzes and compiles a trained neural network, such as one generated by Caffe or TensorFlow, for use with selected Lattice Semiconductor FPGA products.

The Lattice Neural Network Compiler Tool outputs three files:
  A hardware configuration file (*.yml) that contains information on the fixed-point converted network and memory allocation.
  A firmware file (*.lscml) in ASCII format that contains the weights from a trained model file. This file must be converted to binary format before it is loaded onto the SD card.
  A firmware file (*.bin) in binary format that can be loaded directly onto the SD card.

For detailed operating instructions, refer to Lattice Neural Network Compiler Software (FPGA-UG-02052). Figure 4.1 shows the output file generation flow of the Neural Network Compiler Tool.

[Figure 4.1. Neural Network Compiler Tool Output File Generation Flow. The figure shows a trained model (Caffe: *.proto, *.caffemodel; TensorFlow: *.pb) and SampleImage.jpeg as inputs, along with the following labeled elements: HW Configuration Generator, *.yml, *.lscml Firmware (ASCII), Hardware Simulation, *.bin Firmware (Binary).]

AXI Slave and DDR3 Memory Interface
The AXI interface allows command code to be written to DRAM before the CNN Accelerator IP Core starts executing. Input data may also be written to DRAM. The CNN Accelerator IP Core reads the command code from DRAM and performs calculations using its internal sub-execution engines. Intermediate data may also be transferred to and from DRAM, as directed by the command code.

CSI2 to DVI Interface
This module implements a bridge function that converts the camera's MIPI CSI input data to DVI output using the Lattice CrossLink pASSP and the SiI1136 HDMI transmitter.

Video Processing Module
The crop_downscale module provides the functions needed to manage inputting data, receiving output data, and generating a composite image for output to the HDMI interface. Four examples are included in the design:
  crop_downscale.v crops input to 32 × 32
  crop_downscale_key.v crops input to 90 × 90
  crop_downscale_sign.v crops input to 128 × 128
  crop_downscale_keyl.v crops input to 224 × 224

The Speed Sign Detection demo uses crop_downscale_sign.v. Key functions of the code include:
  Capturing a downscaled image from the camera input module and saving it to a frame buffer.
  Writing the frame buffer data into the CNN accelerator engine during the blanking period.
  Buffering the output after completion of the image data processing.
  Creating a Picture In Picture (PIP) bounding box with green borders, and outputting the composite image.

The output from the CSI2_to_DVI_top module is a stream of data that reflects the camera image. The input image is then downscaled to 128 × 128 pixels, stored in a frame buffer, and passed to the output. Image data is written from the frame buffer into the CNN accelerator engine before processing starts. The data is then formatted for compatibility with the trained network. The *.yml file provides the majority of the information needed to understand how the input data should be prepared. A snippet of the code in the *.yml file for the Speed Sign Detection design is shown in Figure 7.1.

[Figure 7.1. Speed Sign Detection Design *.yml File Snippet]
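The crop, downscale, and bounding-box steps described in this section can be sketched in software as follows. This is only an illustrative Python model, not the crop_downscale_sign.v implementation: the function names, the 640 × 480 test frame, the 384 × 384 crop window, the nearest-neighbour sampling, and the 2-pixel border thickness are all assumptions chosen for the example. Only the 128 × 128 output size and the green border come from the text above.

```python
def center_crop(frame, size):
    """Return the centered size x size window of a row-major frame."""
    height, width = len(frame), len(frame[0])
    if size > height or size > width:
        raise ValueError("crop window larger than frame")
    top = (height - size) // 2
    left = (width - size) // 2
    return [row[left:left + size] for row in frame[top:top + size]]


def downscale(frame, out_size):
    """Nearest-neighbour downscale of a square frame to out_size x out_size."""
    step = len(frame) / out_size
    return [
        [frame[int(y * step)][int(x * step)] for x in range(out_size)]
        for y in range(out_size)
    ]


def draw_border(frame, top, left, size, color=(0, 255, 0), thickness=2):
    """Overlay a square outline (green by default) on a copy of the frame."""
    out = [list(row) for row in frame]
    for y in range(top, top + size):
        for x in range(left, left + size):
            on_edge = (
                y < top + thickness or y >= top + size - thickness
                or x < left + thickness or x >= left + size - thickness
            )
            if on_edge:
                out[y][x] = color
    return out


# Hypothetical 640 x 480 RGB test frame (the real camera resolution differs).
frame = [[(x % 256, y % 256, 0) for x in range(640)] for y in range(480)]

crop = center_crop(frame, 384)       # assumed crop window: 384 x 384
small = downscale(crop, 128)         # 128 x 128 input for the CNN engine
boxed = draw_border(frame, (480 - 384) // 2, (640 - 384) // 2, 384)
```

In this model, `small` plays the role of the frame buffer contents that are written to the CNN accelerator engine, and `boxed` plays the role of the composite image sent to the HDMI output.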

The entries in the Figure 7.1 snippet are interpreted as follows:
  Input Size: [1, 3, 128, 128]. Indicates one input array consisting of 3 layers of dimensions 128 × 128.
  memblks: 3. Total number of memory blocks needed.
  depth_per_mem: 1. Number of memory blocks allocated to each memory layer.
  frac: 8. Determines the number of bits allocated to the fractional component. The bit count equals the minimum number of bits needed to represent this value minus 1. In this case, 3 bits are needed to represent 8 - 1 = 7.
  num_ebr: 16. Number of memory blocks. Note that, despite the variable name, this does not tie directly to the number of Embedded Block RAM (EBR) blocks used in the design.
  ebr_blk_size: 16384. Defines the size of the memory blocks in bytes. Note that the blocks have a width of 16 bits and a variable depth.

The CNN accelerator engine's result ports, o_we and o_dout[15:0], can be used to output any number of results. The designer can add a read command to read any data, depending on the neural network design. In the Speed Sign Detection design, speedsign_post.v accepts the 16-bit output data from the CNN accelerator engine and generates the confidence level for each of the pre-trained speed limit signs. The results are then overlaid onto the left side of the output image stream as a magnified bar chart.
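The arithmetic behind these entries can be checked with a short Python sketch. The helper names are hypothetical. The fractional-bit rule and the block geometry follow the descriptions above, while the signed, saturating 16-bit fixed-point conversion is an assumption made for illustration. The actual data format is defined in CNN Accelerator IP Core (FPGA-IPUG-02037).

```python
from math import prod


def frac_bits(frac):
    """Bits needed to represent (frac - 1), per the rule stated for 'frac'."""
    return (frac - 1).bit_length()


def block_depth(block_bytes, width_bits=16):
    """Number of width_bits-wide words that fit in one memory block."""
    return block_bytes * 8 // width_bits


def to_fixed(value, fbits, word_bits=16):
    """ASSUMPTION: signed fixed point with saturation at the word limits."""
    scaled = round(value * (1 << fbits))
    lo = -(1 << (word_bits - 1))
    hi = (1 << (word_bits - 1)) - 1
    return max(lo, min(hi, scaled))


def from_fixed(raw, fbits, word_bits=16):
    """ASSUMPTION: read a 16-bit bus value as signed fixed point."""
    raw &= (1 << word_bits) - 1
    if raw >= 1 << (word_bits - 1):
        raw -= 1 << word_bits
    return raw / (1 << fbits)


input_size = [1, 3, 128, 128]        # from the *.yml snippet
n_values = prod(input_size)          # total input values: 1 * 3 * 128 * 128
fbits = frac_bits(8)                 # frac: 8
depth = block_depth(16384)           # ebr_blk_size: 16384 bytes, 16 bits wide

print(n_values, fbits, depth)        # prints "49152 3 8192"
```

Under these assumptions, `to_fixed(1.5, fbits)` yields 12, and `from_fixed(0xFFF8, fbits)` yields -1.0. Whether the o_dout[15:0] results use the same scaling is not stated in this document.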

Related Documentation

Soft IP Document
  CNN Accelerator IP Core (FPGA-IPUG-02037)

Diamond Document
For more information on Lattice Diamond Software, visit the Lattice website at: www.latticesemi.com/products/designsoftwareandip

Hardware Requirements
  Lattice Embedded Vision Development Kit (LF-EVDK1-EVN)
  Mini-USB Cable (included in the Lattice Embedded Vision Development Kit)
  12 V Power Supply (included with the Kit)
  HDMI Cable
  HDMI Monitor (1080p60)
  Micro-SD Card Adapter (MICROSD-ADP-ENV)
  Micro-SD Card. Standard Micro-SD card only.

References
For more information on the FPGA device, visit http://www.latticesemi.com/products/fpgaandcpld/ecp5
For complete information on Lattice Diamond Project-Based Environment, Design Flow, Implementation Flow and Tasks, as well as on the Simulation Flow, see the Lattice Diamond.

Technical Support Assistance
Submit a technical support case through www.latticesemi.com/techsupport.

Revision History
Revision 1.0, May 2018
  First release.

7th Floor, 111 SW 5th Avenue, Portland, OR 97204, USA
T 503.268.8000
www.latticesemi.com