Avnet Speedway Design Workshop


Accelerating Your Success
Avnet Speedway Design Workshop: Creating FPGA-based Co-Processors for DSPs Using Model-Based Design Techniques
Lecture 4: FPGA Co-Processor Architectures and Verification
V10_1_2_0

Model-Based Design Flow
- Develop Executable Spec in Simulink
- Design Exploration for Targeting Hardware
- Partition Between DSP and FPGA Co-Processor
- Verify Hardware in HW Co-simulation
- Implement Stand-Alone Video System

We are now at the hardware co-simulation phase of the model-based design flow.

The Problem We Wish to Solve
Verifying hardware-based video processing systems with traditional HDL simulators is tedious and slow. We propose a Simulink-based verification methodology using FPGA hardware co-simulation, which operates on entire video frames and dramatically shortens simulation times.

Agenda
- FPGA model verification using hardware co-simulation with System Generator and Simulink

Testbench to Verify Model Partition
[Diagram: Simulink testbench. Input stimuli (video frames) drive the top-level model (Subsystem A, Subsystem B, Subsystem C) and the partitioned function block (System Generator Subsystem C, FPGA); both outputs go to Compare and Display.]
- Insert FPGA-based subsystem into original model
- Side-by-side comparison against equivalent Simulink block

In lab 3, we partitioned the video stabilization algorithm by implementing SAD in the FPGA. We built the SAD as a System Generator model and verified its functionality in isolation, using test images. Now that we have decided how to partition the model between DSP and FPGA, we use a Simulink testbench to verify that the partitioned model gives the same simulation results as the original, by inserting the FPGA block for side-by-side comparison.

Note Simulink's ability to propagate 2-D arrays as frames in the model, which eases the modeling and simulation of video and image processing algorithms.

This verification methodology exercises the FPGA functionality only; the DSP is not involved yet because we are still at the verification stage. Once the FPGA co-processor has been validated in simulation, we will move towards a stand-alone system in lab 5, where all functionality on the left side of the above diagram will be implemented in the DSP.
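The side-by-side comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the workshop's Simulink testbench: `compare_side_by_side`, `reference_block`, `fpga_block`, and `buggy_block` are hypothetical names, and the pixel inversion is a stand-in for whatever function Subsystem C actually performs.

```python
from typing import Callable, List

Frame = List[List[int]]  # one video frame as a 2-D array of pixel values


def compare_side_by_side(
    frames: List[Frame],
    reference: Callable[[Frame], Frame],
    dut: Callable[[Frame], Frame],
) -> List[int]:
    """Feed the same frames to both blocks; return indices of mismatching frames."""
    mismatches = []
    for index, frame in enumerate(frames):
        if reference(frame) != dut(frame):
            mismatches.append(index)
    return mismatches


# Hypothetical stand-ins for the original block and its FPGA replacement.
def reference_block(frame: Frame) -> Frame:
    return [[255 - pixel for pixel in row] for row in frame]


def fpga_block(frame: Frame) -> Frame:
    # Same function written the way hardware might do it (bitwise invert).
    return [[pixel ^ 0xFF for pixel in row] for row in frame]


def buggy_block(frame: Frame) -> Frame:
    # Deliberately drops the least significant bit to show a detected mismatch.
    return [[(255 - pixel) & 0xFE for pixel in row] for row in frame]


frames = [[[1, 3], [5, 7]], [[0, 10], [20, 30]]]
print(compare_side_by_side(frames, reference_block, fpga_block))
print(compare_side_by_side(frames, reference_block, buggy_block))
```

An empty list means the replacement block matched the reference on every frame, which is the pass condition the Compare block checks in the testbench.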

Testbench for Video Stabilization
[Diagram: Simulink testbench. Input stimuli (video frames) drive the video stabilization model: motion estimation with Sum-of-Absolute-Differences (SAD), location estimation (updated template, updated ROI), Image Translate, and Display. The relative motion vector is computed from frame to frame. A second, FPGA-based SAD block feeds a Compare block.]
- Insert FPGA-based SAD into original model
- Compare with Video & Image Processing Blockset SAD

This is the verification process described in the previous slide, applied to the video stabilization model. The technique we use to include the FPGA co-processor in the testbench is called hardware co-simulation.
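The slides do not spell out the SAD computation itself, so the following is a minimal sketch of standard sum-of-absolute-differences block matching in plain Python. The function names (`sad`, `best_match`) and the tiny test image are assumptions made for illustration; the workshop's System Generator implementation is not shown in the source.

```python
from typing import List, Tuple

Image = List[List[int]]


def sad(template: Image, window: Image) -> int:
    """Sum of absolute differences between two equally sized 2-D arrays."""
    return sum(
        abs(t - w)
        for t_row, w_row in zip(template, window)
        for t, w in zip(t_row, w_row)
    )


def best_match(template: Image, roi: Image) -> Tuple[Tuple[int, int], int]:
    """Slide the template over the region of interest (ROI).

    Returns the (row, col) offset with the minimum SAD and that SAD value.
    Ties are resolved in favor of the first offset in row-major order.
    """
    t_rows, t_cols = len(template), len(template[0])
    best_offset, best_value = (0, 0), None
    for r in range(len(roi) - t_rows + 1):
        for c in range(len(roi[0]) - t_cols + 1):
            window = [row[c:c + t_cols] for row in roi[r:r + t_rows]]
            value = sad(template, window)
            if best_value is None or value < best_value:
                best_offset, best_value = (r, c), value
    return best_offset, best_value


template = [[9, 8], [7, 6]]
roi = [
    [0, 0, 0, 0],
    [0, 0, 9, 8],
    [0, 0, 7, 6],
    [0, 0, 0, 0],
]
print(best_match(template, roi))
```

Tracking how the minimum-SAD offset moves from one frame to the next gives the relative motion vector that the stabilization model uses.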

Hardware Co-Simulation in Xilinx DSP Tools
An enabling technology for:
- Simulation acceleration
- On-chip verification
- Run-time debugging
- Software/hardware interaction
- Benchmarking (Fmax and power)
Rapid prototyping of FPGA signal processing designs on an actual board

Xilinx Hardware Co-Simulation
- Bit-true and cycle-true validation in FPGA
- Accelerates simulation by orders of magnitude
- Ethernet link @ 100 Mbps full duplex achieves 70 Mbps data payload
[Diagram: MATLAB / Simulink on the host PC and an FPGA development board, linked by JTAG / Ethernet / PCI. Stimuli go to the FPGA, results return to Simulink, and the user model runs in the FPGA.]

Hardware co-simulation concept slide: Xilinx System Generator for DSP features a powerful simulation mode called hardware co-simulation, shown above. The benefits:
- Immediate proof of concept of a DSP model directly in hardware
- Simulation acceleration, especially on large models such as a 16K FFT
- Simulation of IP cores delivered in netlist format

Simulink runs on the PC, sending input data over the Ethernet interface to the model running in the FPGA. The input data consists of a Simulink signal source. Processed results output from the model are sent back for display in Simulink. Shown above at the top right is the hardware co-simulation block, which stands for all or part of the System Generator model after hardware has been generated for the target board.

(All Avnet Xilinx development boards have JTAG support in Xilinx System Generator for DSP. Avnet V5 LX50, V5 95SXT and Spartan-3A DSP / DaVinci are supported for hardware co-simulation over Ethernet @ 100 Mbps full-duplex.)

Anticipated questions:
- Is hardware co-simulation running at the full board system clock rate?
- What is the difference between single-step and free-running modes for hardware co-simulation?
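To put the 70 Mbps payload figure in perspective, the arithmetic below estimates how long one video frame takes to cross the link. The frame size (640x480 pixels, 8 bits per pixel) is an assumption chosen for illustration and does not come from the slides; only the 70 Mbps payload rate does. Protocol overhead beyond that figure is ignored.

```python
def frame_transfer_seconds(
    width: int, height: int, bits_per_pixel: int, payload_bps: float
) -> float:
    """Time to move one frame across a link with the given payload rate."""
    frame_bits = width * height * bits_per_pixel
    return frame_bits / payload_bps


ETHERNET_PAYLOAD_BPS = 70e6  # payload rate quoted on the slide

# Assumed frame format, for illustration only.
seconds = frame_transfer_seconds(640, 480, 8, ETHERNET_PAYLOAD_BPS)
print(f"one-way transfer: {seconds * 1e3:.1f} ms per frame")
print(f"upper bound on one-way frame rate: {1 / seconds:.1f} frames/s")
```

The same function can be called with a different rate or frame size to see how strongly the link, rather than the FPGA, bounds co-simulation throughput for frame-based designs.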

Xilinx Hardware Co-Simulation Tool Flow
[Diagram: tool flow from MATLAB and Simulink (algorithm and system design) to the Avnet Spartan3A-DSP DaVinci Development Kit. DSP path: Real-Time Workshop Embedded Coder, Targets, Links; generate C / ASM; verify with Link for CCS and Code Composer. FPGA path: Xilinx System Generator for DSP; generate HDL with ISE; verify with hardware co-simulation. Video frames are exchanged with the FPGA.]
- Supports burst transfers
- Efficient for frame-based processing, e.g. video

System Generator provides hardware co-simulation, making it possible to incorporate a design running in an FPGA directly into a Simulink simulation. Compiling a model for hardware co-simulation automatically creates a bitstream and associates it with a System Generator co-simulation block. When the design is simulated in Simulink, stimuli are sent to the compiled hardware model in the FPGA and results are sent back to Simulink. This allows the compiled portion to be tested in actual hardware and can speed up simulation dramatically.

The above diagram represents a Simulink testbench that generates video frames in Simulink, transfers them to the FPGA for real-time processing, and sends back the results for display in Simulink. The ability to pass entire video frames to a portion of the model residing in the FPGA, and to retrieve the results from the FPGA for display in Simulink, makes hardware co-simulation very efficient for video system testbenches. This is a test methodology between Simulink and the FPGA co-processor only; the DSP is not involved.

Hardware Co-Simulation / Compilation
1. Start with a model that is ready to be compiled for hardware co-simulation.
2. Select an appropriate compilation target from the System Generator block dialog box.

Hardware Co-Simulation / Compilation (2)
3. Modify MAP/PAR settings using a custom XFlow options file.
4. Use a lower clock frequency to achieve timing closure.
5. Press Generate to generate the bitstream.

Hardware Co-Simulation / Compilation (3)
6. The compilation creates a new library containing a parameterized run-time co-simulation block.
7. Add the co-simulation run-time block to a SysGen model.

The co-simulation block executes in the FPGA, while receiving input from and sending results back to Simulink.

Hardware Co-Simulation / JTAG
- Burst-transfer support
  - 1 Mbps down to the board
  - 0.5 Mbps back from the board
- Selectable cable speeds
- I/O buffering performance boost
  - Improvement is relative to the number of ports
- Platform USB support
- Additional boards can be added to the JTAG flow using a wizard in less than 20 minutes

Hardware Co-Simulation / Ethernet
Two flavors:
- Network-based
  - Remote access
  - 10/100 Base-T network
  - Ethernet-based configuration
- Point-to-Point
  - Requires a direct connection between host PC and FPGA
  - 10/100/1000 Base-T
  - Ethernet or JTAG-based (i.e., Platform USB or PC4) configuration
Avnet Spartan-3A DSP DaVinci supports Point-to-Point 10/100 Base-T.

Future support is planned for co-simulation over GigE.

Hardware Co-Simulation / Shared Memory
[Diagram labels: System Generator User Design; System Generator FROM FIFO with ports Data_in, full, wr_en, wr_dat_count, empty, rd_en, Data_out, rd_dat_count; Host PC running System Generator, C Program, or MATLAB Program.]
Shared memory allows burst vector and 2-D frame transfers between FPGA and host PC.

Xilinx hardware co-simulation supports burst transfers between Simulink and the FPGA. Burst transfers improve simulation performance because fewer transactions are required over the course of a simulation. The general idea is to transmit data vectors (or frames) to and from the hardware platform in a single transaction, as opposed to multiple transactions when using single-word reads and writes. This can improve simulation performance dramatically, especially over GigE.

During a simulation cycle, shared memory allows the following sequence of events to transfer bidirectional data between FPGA and host PC.

Bursting data from host PC to FPGA:
- Buffer a series of scalar input data values into a vector.
- Transfer the vector data to a shared memory FIFO residing on the FPGA using a burst transfer.
- The FIFO allows crossing clock domains between bursty incoming data over Ethernet and the FPGA system clock.
- Use the FPGA, in free-running mode, to sequentially process the entire input buffer.

Bursting data from FPGA to host PC (not shown for clarity):
- Use the FPGA to write the processed output data into an output buffer.
- Transfer the contents of the output buffer back into Simulink and reconstruct the data as a vector.
- Unbuffer the vector into a series of output scalar values.

In similar fashion, the host PC can exchange 2-D arrays or frames with the FPGA. This is especially useful for hardware co-simulation of video systems.

The shared memory functionality depicted in the diagram is the System Generator FROM FIFO block, which receives data from the host as input stimuli to the FPGA design under test. A companion TO FIFO block is available (not shown).
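The buffer, burst, process, unbuffer sequence above can be modeled in software to see why bursting reduces the transaction count. This is a toy model under stated assumptions, not the Xilinx shared-memory API: `CountingLink`, `buffer_samples`, and `run_cosim` are hypothetical names, and a Python deque stands in for the on-chip FIFO.

```python
from collections import deque
from typing import Callable, List, Tuple


class CountingLink:
    """Toy host-to-board link that counts transactions, one per send."""

    def __init__(self) -> None:
        self.transactions = 0
        self.fifo: deque = deque()  # stands in for the shared-memory FIFO

    def send(self, words: List[int]) -> None:
        self.transactions += 1
        self.fifo.extend(words)


def buffer_samples(samples: List[int], burst_size: int) -> List[List[int]]:
    """Group scalar samples into vectors of at most burst_size words."""
    return [samples[i:i + burst_size] for i in range(0, len(samples), burst_size)]


def run_cosim(
    samples: List[int], burst_size: int, process: Callable[[int], int]
) -> Tuple[List[int], int]:
    """Return the processed samples and the total number of link transactions."""
    to_board, from_board = CountingLink(), CountingLink()
    outputs: List[int] = []
    for vector in buffer_samples(samples, burst_size):
        to_board.send(vector)                       # burst down to the board
        out_buffer = []
        while to_board.fifo:                        # "FPGA" drains the input FIFO
            out_buffer.append(process(to_board.fifo.popleft()))
        from_board.send(out_buffer)                 # burst back to the host
        while from_board.fifo:                      # unbuffer into scalars
            outputs.append(from_board.fifo.popleft())
    return outputs, to_board.transactions + from_board.transactions


samples = list(range(8))
print(run_cosim(samples, 1, lambda x: 2 * x))  # single-word transfers
print(run_cosim(samples, 4, lambda x: 2 * x))  # burst transfers of 4 words
```

Both runs produce the same outputs; only the number of transactions changes, which is the effect the slide attributes to burst transfers.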

Hardware Co-Simulation / Shared Memory
[Diagram: Simulink testbench. Input stimuli (video frames) drive the top-level model (Subsystem A, Subsystem B, Subsystem C) and the partitioned function block (System Generator Subsystem C, FPGA), now wrapped with FROM FIFO and TO FIFO blocks; both outputs go to Compare and Display.]
- Shared memory FIFOs added to System Generator design
- Bolster co-simulation performance with burst transfers

To use vector-based transfers in hardware co-simulation, you must augment your model with input and output buffers to stream data through the target FPGA. Shared FIFOs provide the abstractions necessary to implement these buffers. In lab 4, you will add To and From Shared FIFO blocks to the FPGA SAD design to transfer template and Region-of-Interest 2-D arrays from Simulink to the FPGA, and SAD results, in the form of minimum SAD coordinates plus a 3x3 SAD neighborhood, from the FPGA to Simulink.

It is important to note that hardware co-simulation can operate in one of two modes: single-stepped or free-running. We choose to operate in free-running mode, meaning the FPGA is clocked by a physical oscillator on the board at its full rate, or some convenient sub-multiple. The shared FIFOs expose empty/full ports for flow control between the FIFOs and Simulink; these ports provide a way to stall the SAD computations when the input buffer is empty. This technique allows the FPGA to execute at high speed as long as input data is available in the From FIFOs.
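The empty/full flow control described above can be illustrated with a small cycle-by-cycle model. This is a sketch under assumptions, not the System Generator FIFO block: `SharedFifo` and `free_running` are hypothetical names, the host is assumed to deliver one word every `host_period` cycles, and adding 1 to each word is a placeholder for the SAD computation.

```python
from collections import deque
from typing import List, Tuple


class SharedFifo:
    """Bounded FIFO exposing empty/full flags for flow control."""

    def __init__(self, depth: int) -> None:
        self.depth = depth
        self.words: deque = deque()

    @property
    def empty(self) -> bool:
        return len(self.words) == 0

    @property
    def full(self) -> bool:
        return len(self.words) >= self.depth

    def write(self, word: int) -> bool:
        """Store a word unless the FIFO is full; report whether it was stored."""
        if self.full:
            return False
        self.words.append(word)
        return True

    def read(self) -> int:
        return self.words.popleft()


def free_running(
    inputs: List[int], depth: int, host_period: int, cycles: int
) -> Tuple[List[int], int]:
    """Simulate a free-running consumer that stalls while the FIFO is empty."""
    fifo = SharedFifo(depth)
    pending = deque(inputs)
    results: List[int] = []
    stalls = 0
    for cycle in range(cycles):
        # Host side: deliver a word every host_period cycles, if there is room.
        if pending and cycle % host_period == 0 and not fifo.full:
            fifo.write(pending.popleft())
        # FPGA side: runs every cycle, but stalls when no input is available.
        if fifo.empty:
            stalls += 1
        else:
            results.append(fifo.read() + 1)  # placeholder for the real computation
    return results, stalls


print(free_running([10, 20, 30], depth=2, host_period=2, cycles=6))
```

With the host slower than the FPGA clock, the consumer spends the in-between cycles stalled rather than reading invalid data, which is the behavior the empty flag is there to guarantee.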

Summary
- FPGA model verification using hardware co-simulation with System Generator and Simulink
- Proceed to lab 4: Verifying FPGA co-processor functions using hardware co-simulation

Reference Slides

Accelerating Verification through Hardware
- Hardware co-verification removes the simulation bottleneck
- Up to 1000x simulation performance improvement
- Automates FPGA and board setup process

Simulation Time (Seconds)
Design                  Software   HW Co-Sim   Increase
Beamformer              113        2.5         45X
OFDM BER Test           742        0.75        989X
DUC CFR                 731        23          32X
Color Space Converter   277        4           69X
Video Scaler            10422      92          113X

- System Generator supports hardware-in-the-loop (HIL) flows to over 20 different boards.
- HIL is supported for both the Simulink and MATLAB modeling environments.
- JTAG, PCI and Ethernet based HW co-sim is supported.
- Additional boards can be added to the flow using a wizard in less than 20 minutes.
- Provides simulation speedups of up to 1000x.

Discovery Questions
- Will the FPGA be used for algorithm acceleration or as an FPGA prototype?
- Will the FPGA be replaced in the final product?
- What benefit will replacing the FPGA bring?
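The "Increase" column is simply software simulation time divided by hardware co-simulation time. The short check below recomputes it from the slide's figures. It reads the OFDM BER Test row as 742 s in software and 0.75 s in hardware, which is an interpretation of the flattened slide text; that split is the one consistent with the quoted 989X.

```python
def speedup(software_seconds: float, hw_cosim_seconds: float) -> float:
    """Ratio of software simulation time to hardware co-simulation time."""
    return software_seconds / hw_cosim_seconds


# (software seconds, hardware co-simulation seconds) taken from the slide.
timings = {
    "Beamformer": (113, 2.5),
    "OFDM BER Test": (742, 0.75),  # assumed split of the garbled "742.75"
    "DUC CFR": (731, 23),
    "Color Space Converter": (277, 4),
    "Video Scaler": (10422, 92),
}

for design, (software, hardware) in timings.items():
    print(f"{design}: {round(speedup(software, hardware))}X")
```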

Hardware Co-Simulation Framework
[Diagram: three stages labeled Design, Compilation, and Simulation. Design: Simulink block diagram, SysGen block diagram. Compilation outputs: bitstream, HW CoSim project file, HW CoSim library block. Simulation: under MATLAB, the M-Hwcosim API (HW CoSim test-bench, xlhwcosimsimulate, xlhwcosimconfig, Hwcosim, Shmem, Shfifo); under Simulink, the HW CoSim block. Both use the Hardware Co-Simulation Engine, which connects over PCI (XtremeDSP Kit), JTAG, Point-to-point Ethernet, or Network-based Ethernet.]

To conclude this discussion of Xilinx hardware co-simulation, the above diagram shows a comprehensive framework in which hardware co-simulation can be scripted in detail under control of MATLAB, or run entirely under Simulink. Scripting using the MATLAB API is an advanced technique new in System Generator for DSP 10.1 and is beyond the scope of this course. More information on the MATLAB API is available in the on-line help of Xilinx System Generator for DSP 10.1, under Programmatic Access \ M-Code Access to Hardware Co-Simulation.

We will focus exclusively on hardware co-simulation in Simulink for the remainder of the course.