Mojtaba Mahdavi, DSP Design Course, EIT Department, Lund University, Sweden

High Level Synthesis with Catapult
Mojtaba Mahdavi

Outline
  o High Level Synthesis
  o HLS Design Flow in Catapult
  o Data Types
  o Project Creation
  o Design Setup
  o Data Flow Analysis
  o Resource Allocation
  o Scheduling
  o Architecture Mapping
  o Synthesis
  o Result Analysis
  o Design Verification

High Level Synthesis (HLS)
HLS works at a higher abstraction level than RTL.
High-level synthesis bridges the hardware and software domains.
It generates hardware from C/C++ code or another high-level language.

A common misconception:
  o Synthesizing hardware from C++ gives users the freedom to describe algorithms in any desired style of C++ coding.
Remember that you are still describing hardware, using a high-level language!
  o A "poor" description leads to sub-optimal hardware.

It is important to know hardware concepts to get optimal results.
Successful projects require hardware engineers, not software engineers.
  o Hardware designers can work at a higher level of abstraction while creating high-performance hardware.

High-Level Synthesis Benefits
Develop algorithms at a high level, e.g. at the C level
  o Reduced design time
  o Reduced time for design changes
  o Easy design exploration
  o Faster time to market
Verify at the higher level
  o Easier and faster verification
Easy optimization exploration
  o Create multiple hardware implementation alternatives from the same C source code using optimization directives
Easy to maintain, easy to reuse
  o Create readable and portable C source code

Some Limitations
Not all algorithms are suited to hardware implementation.
Sequential code, control logic, and large loops with function calls typically will not benefit much.

Catapult
Catapult is a high-level synthesis product from Mentor Graphics.
Catapult takes C/C++ and SystemC as inputs and generates register transfer level (RTL) code.
The target device can be an FPGA or an ASIC.

The concepts that you have learned in the DSP Design course are used in the Catapult software:
  o Folding/unfolding
  o Retiming/pipelining
  o Bit-level optimization
  o and more

Traditional Design Flow vs. Catapult Flow

Top-level Design
As with RTL designs, HLS needs to know the "top" level of your design.
  o The highest level of your design is the top module.
The top module specifies:
  o The design's interfaces with the outside world
  o Port definitions: direction, bit widths, and data types
[Figure: a top module containing input ports, Block 1, Block 2, Block 3, a memory, and output ports]

Registered Outputs
In general, high-level synthesis builds synchronous designs by default.
All outputs of the top-level design are registered.
  o This guarantees that timing is met when connecting to another design.

Design Example
Consider the following simple design as an example to introduce:
  o The concepts in the HLS process
  o The corresponding steps in the Catapult design flow

Port Width, Port Direction
The bit widths of the top-level ports, excluding clock and reset, are implied by their data types.
The direction of a port is implied by how the interface variable is used in the C++ code.
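The later slides describe the design example as accumulating a, b, c, and d with three additions. A minimal sketch consistent with that description is shown here; the function name accumulate and the use of plain int ports are assumptions, not taken from the slides. The inputs are only read, so a tool would infer input ports for them; dout is only written, so it would become an output port. On a platform where int is 32 bits, each port would be 32 bits wide.

```cpp
// Hypothetical top-level function for the design example
// (name, signature, and data types are assumptions).
// a, b, c, d are only read  -> they would become input ports.
// dout is only written      -> it would become an output port.
// No clock, reset, or enable appears in the source.
void accumulate(int a, int b, int c, int d, int &dout) {
    int t1 = a + b;   // first add
    int t2 = t1 + c;  // second add, depends on t1
    dout = t2 + d;    // third add, depends on t2
}
```

Calling accumulate(1, 2, 3, 4, out) in ordinary C++ leaves 10 in out, which is the behavior the generated hardware is expected to reproduce.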

Control Ports
C++ code has no concept of timing. No clocks, enables, or resets are described in the design source files.
  o These signals are added by the synthesis tool.
Details such as reset polarity and reset type are controlled by setting design constraints.

Data Types
Important data types:
  o Integer: int, ac_int
  o Fixed-point: ac_fixed
To use these data types, include the following header files in the design source files:
  o ac_int.h
  o ac_fixed.h
It is also possible to define user data types.
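To make the idea of a bit-accurate integer concrete without depending on the Catapult headers, the sketch below emulates in plain C++ what a W-bit unsigned value holds when results wrap around. It is only a stand-in: the helper wrap_unsigned is hypothetical and is not part of ac_int.h. In a real design one would instead declare a variable with a type such as ac_int<4, false>.

```cpp
#include <cstdint>

// Stand-in for a W-bit unsigned integer (NOT the ac_int implementation):
// keeps only the low W bits, the way a W-bit hardware register would.
template <unsigned W>
std::uint32_t wrap_unsigned(std::uint64_t value) {
    static_assert(W >= 1 && W <= 32, "width must be between 1 and 32 bits");
    const std::uint64_t mask = (std::uint64_t{1} << W) - 1;  // W ones
    return static_cast<std::uint32_t>(value & mask);
}
```

For example, wrap_unsigned<4>(9 + 9) yields 2, because 18 does not fit in 4 bits. Choosing the smallest width that still holds every value is what keeps the generated datapath small.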

Create a New Project

Create a New Design File

Add Input Files

Set Top Module

Design Setup

Data Flow Graph Analysis
The HLS process starts by analyzing the data dependencies between the various steps in the algorithm.
This analysis produces a Data Flow Graph (DFG).

Design Example
In the DFG, each node is an operation; the connections between nodes specify the data dependencies and the order of operations.
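One way to picture a DFG in code is as a list of operation nodes, each recording which earlier nodes it depends on. The sketch below assumes the design example is a chain of three additions, as the scheduling slides suggest; the DfgNode type and both function names are hypothetical and do not reflect how Catapult stores its graph.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One node of a data flow graph: an operation plus the nodes whose results it consumes.
struct DfgNode {
    std::string op;                 // e.g. "add"
    std::vector<int> predecessors;  // indices of the nodes this one depends on
};

// Assumed DFG for dout = ((a + b) + c) + d: three adds in a chain.
std::vector<DfgNode> build_example_dfg() {
    return {
        {"add", {}},   // node 0: t1 = a + b (depends only on primary inputs)
        {"add", {0}},  // node 1: t2 = t1 + c
        {"add", {1}},  // node 2: dout = t2 + d
    };
}

// Depth of a node = length of the longest dependency chain ending at it.
// Assumes every predecessor index is smaller than the node's own index.
std::vector<int> dependency_depth(const std::vector<DfgNode>& dfg) {
    std::vector<int> depth(dfg.size(), 1);
    for (std::size_t i = 0; i < dfg.size(); ++i) {
        for (int p : dfg[i].predecessors) {
            assert(p >= 0 && static_cast<std::size_t>(p) < i);
            if (depth[p] + 1 > depth[i]) depth[i] = depth[p] + 1;
        }
    }
    return depth;
}
```

For the assumed chain the depths are 1, 2, and 3: no two additions are independent, so they cannot be executed in parallel within one iteration.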

Resource Allocation
Each operation in the extracted DFG is mapped onto a hardware resource.
  o Each resource corresponds to a physical implementation of the operator in hardware.
Some operators may have multiple hardware resource implementations
  o with different area, delay, and latency specifications.
Timing and area constraints are applied to the chosen implementation.
All resources are selected from a technology-specific, pre-characterized library.
  o The library is already determined in the Design Setup step.
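The selection among several implementations of one operator can be sketched as a small search over a library. The policy below (smallest area among the implementations that fit in the clock period) is only one simple possibility and is not claimed to be Catapult's algorithm; the Resource type, the function names, and all delay and area numbers are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One pre-characterized implementation of an operator (numbers are illustrative).
struct Resource {
    std::string name;
    double delay_ns;
    double area;
};

// Pick the smallest-area implementation whose delay fits in the clock period.
// Returns the index into `library`, or -1 if no implementation is fast enough.
int select_resource(const std::vector<Resource>& library, double clock_period_ns) {
    assert(clock_period_ns > 0.0);
    int best = -1;
    for (std::size_t i = 0; i < library.size(); ++i) {
        if (library[i].delay_ns > clock_period_ns) continue;  // too slow
        if (best < 0 || library[i].area < library[best].area) {
            best = static_cast<int>(i);
        }
    }
    return best;
}

// Hypothetical adder library with three area/delay trade-offs.
std::vector<Resource> example_adder_library() {
    return {
        {"add_small", 6.0, 100.0},
        {"add_medium", 3.0, 180.0},
        {"add_fast", 1.5, 320.0},
    };
}
```

With a 5 ns clock period this picks add_medium: add_small is too slow, and add_fast costs more area than needed.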

Scheduling
HLS adds "time" to the design during the "scheduling" step.
Scheduling determines the clock cycle in which each operation in the DFG is performed.
  o Based on the target clock frequency, registers are added between operations.

This process is similar to what RTL designers would call pipelining:
  o Inserting registers to reduce combinational delays.
Note that this is not the same as "loop pipelining", which is discussed later.

In our design example, assume that:
  o An "add" operation takes 3 ns
  o The clock period is 5 ns
Each add operation is then scheduled in its own clock cycle (C1, C2, and C3), because two chained adds (6 ns) would not fit in one 5 ns period.
  o Registers are inserted automatically between the adders.
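The 3 ns / 5 ns example can be reproduced with a very small scheduling sketch for a chain of dependent operations. It is a simplification under stated assumptions: it ignores register setup time and other overheads, handles only a linear chain, and the function name schedule_chain is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Assign each operation of a dependency chain to a clock cycle (1-based: C1, C2, ...).
// An operation is chained into the current cycle only if the accumulated
// combinational delay still fits in the clock period; otherwise a register
// boundary is inserted and the operation starts a new cycle.
std::vector<int> schedule_chain(const std::vector<double>& op_delays_ns,
                                double clock_period_ns) {
    std::vector<int> cycle_of_op;
    int cycle = 1;
    double used_ns = 0.0;
    for (double d : op_delays_ns) {
        assert(d > 0.0 && d <= clock_period_ns);  // each op must fit in one cycle
        if (used_ns + d > clock_period_ns) {
            ++cycle;        // insert a register before this operation
            used_ns = 0.0;
        }
        used_ns += d;
        cycle_of_op.push_back(cycle);
    }
    return cycle_of_op;
}
```

With three 3 ns adds and a 5 ns clock the result is cycles 1, 2, 3, matching C1, C2, and C3. Relaxing the clock to 10 ns would let all three adds (9 ns in total) share a single cycle.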

Timing Constraints

Hardware Implementation
The hardware generated from the schedule varies depending on how the design is constrained in terms of:
  o Resource allocation
  o The amount of loop pipelining

Loop Pipelining
"Loop pipelining" allows a new iteration of a loop to start before the current iteration has finished.
  o The execution of the loop iterations is overlapped.
  o Design performance increases because iterations run in parallel.
The top-level function call has an implied loop, also known as the main loop.
  o Each iteration of the implied loop corresponds to one execution of the schedule.

Initiation Interval (II)
The amount of overlap between loop iterations is determined by the "Initiation Interval (II)".
  o II is the number of clock cycles between the starts of consecutive iterations; it also determines the number of pipeline stages.

Case 1: No Pipelining
The design is left unconstrained.
  o There is only a single pipeline stage.
  o There is no overlap between the executions of successive iterations of the main loop.
Data is written every four clock cycles.
  o Latency of three clock cycles
  o Throughput of four clock cycles

No operations overlap.
The resulting hardware uses a single adder to accumulate a, b, c, and d.
  o This reduces the overall area.

Case 2: Pipelining with II = 1
Pipelining with II = 1 starts a new iteration every clock cycle.
  o Latency of three clock cycles
  o Throughput of one clock cycle
Three adders are required in hardware, since all three pipeline stages can be active at the same time.
  o More area

Iteration 1 is started in C2 and iteration 2 is started in C3 (counting the initial iteration, started in C1, as iteration 0).
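The two cases can be compared with a small arithmetic sketch. It assumes iterations are numbered from 0, that each iteration performs one add per cycle over three cycles, and that the unpipelined case behaves like an initiation interval of 4 (one result every four cycles). The adder count is only a rough estimate of how many iterations are in flight at once; both function names are hypothetical.

```cpp
#include <cassert>

// Clock cycle (1-based, C1 = 1) in which iteration `k` (0-based) of the main
// loop starts when a new iteration is launched every `ii` cycles.
int start_cycle(int k, int ii) {
    assert(k >= 0 && ii >= 1);
    return 1 + k * ii;
}

// Rough number of adders needed: how many iterations can be active at once
// when each iteration occupies `stages` consecutive cycles.
// This is ceil(stages / ii).
int adders_needed(int stages, int ii) {
    assert(stages >= 1 && ii >= 1);
    return (stages + ii - 1) / ii;
}
```

With II = 1, start_cycle(1, 1) is 2 and start_cycle(2, 1) is 3, and adders_needed(3, 1) is 3. With one iteration every four cycles, adders_needed(3, 4) is 1, which matches the single shared adder of the unpipelined case.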

Architecture Specification

Design Schedule

Design Synthesis

Design Schematic

Result Analysis

HDL Generation

Test Bench
A test bench should:
  o Include your design source file
  o Call your design
  o Generate inputs to the design
  o Write the outputs to the console
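The four test bench steps can be sketched in plain C++ as follows. To keep the example self-contained, the design is defined inline as a stand-in instead of being included from a separate source file; the name accumulate, its signature, and the helper run_testbench are assumptions, and the comparison against an expected value is an extra that the slide does not require.

```cpp
#include <iostream>

// Stand-in for the design under test. In a real project this would come from
// including the design source file (name and signature are assumptions).
void accumulate(int a, int b, int c, int d, int &dout) {
    dout = a + b + c + d;
}

// Drives the design with a few input vectors, prints each result, and
// returns the number of mismatches against a reference value.
int run_testbench() {
    int mismatches = 0;
    for (int i = 0; i < 4; ++i) {
        const int a = i, b = 2 * i, c = 3 * i, d = 4 * i;  // generate inputs
        int dout = 0;
        accumulate(a, b, c, d, dout);                       // call the design
        const int expected = 10 * i;                        // reference value
        std::cout << "i=" << i << " dout=" << dout
                  << " expected=" << expected << '\n';      // write to console
        if (dout != expected) ++mismatches;
    }
    return mismatches;
}
```

A main function would typically just return run_testbench(), so that a non-zero exit code signals a failing run.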

Design Verification
The SCVerify flow in Catapult automatically generates the verification infrastructure.
  o It verifies the functionality of the HLS-generated RTL against the user's original source code.
SCVerify supports the Mentor QuestaSim/ModelSim, Synopsys VCS, and Cadence IUS/NCSim simulation environments.
