Tools for Reconfigurable Supercomputing. Kris Gaj George Mason University

Similar documents
An Implementation Comparison of an IDEA Encryption Cryptosystem on Two General-Purpose Reconfigurable Computers

Mohamed Taher The George Washington University

ECE 699: Lecture 12. Introduction to High-Level Synthesis

Master s Thesis Presentation Hoang Le Director: Dr. Kris Gaj

Reconfigurable Hardware Implementation of Mesh Routing in the Number Field Sieve Factorization

Implementation of Elliptic Curve Cryptosystems over GF(2 n ) in Optimal Normal Basis on a Reconfigurable Computer

ISE Design Suite Software Manuals and Help

A Framework to Improve IP Portability on Reconfigurable Computers

Agenda. Introduction FPGA DSP platforms Design challenges New programming models for FPGAs

Intro to System Generator. Objectives. After completing this module, you will be able to:

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

RECONFIGURABLE COMPUTING: A DESIGN AND IMPLEMENTATION STUDY OF ELLIPTIC CURVE METHOD OF FACTORING USING SRC CARTE-C AND CELOXICA HANDEL-C

Basic Xilinx Design Capture. Objectives. After completing this module, you will be able to:

Evaluation of the RTL Synthesis Tools for FPGA/PLD Design. M.Matveev. Rice University. August 10, 2001

Performance and Overhead in a Hybrid Reconfigurable Computer

MATLAB/Simulink 기반의프로그래머블 SoC 설계및검증

Implementing MATLAB Algorithms in FPGAs and ASICs By Alexander Schreiber Senior Application Engineer MathWorks

Cover TBD. intel Quartus prime Design software

End User Update: High-Performance Reconfigurable Computing

DESIGN STRATEGIES & TOOLS UTILIZED

Cover TBD. intel Quartus prime Design software

Optimize DSP Designs and Code using Fixed-Point Designer

In the past few years, high-performance computing. The Promise of High-Performance Reconfigurable Computing

AccelDSP tutorial 2 (Matlab.m to HDL for Xilinx) Ronak Gandhi Syracuse University Fall

Method We follow- How to Get Entry Pass in SEMICODUCTOR Industries for 3rd year engineering. Winter/Summer Training

Graduate Institute of Electronics Engineering, NTU FPGA Design with Xilinx ISE

On Using Simulink to Program SRC-6 Reconfigurable Computer

CHAPTER-IV IMPLEMENTATION AND ANALYSIS OF FPGA-BASED DESIGN OF 32-BIT FPAU

DSP Builder Handbook Volume 1: Introduction to DSP Builder

Model-Based Design for Video/Image Processing Applications

Integrated Workflow to Implement Embedded Software and FPGA Designs on the Xilinx Zynq Platform Puneet Kumar Senior Team Lead - SPC

In the past few years, high-performance computing. The Promise of High-Performance Reconfigurable Computing

FPGA: What? Why? Marco D. Santambrogio

AccelDSP Synthesis Tool

Evolution of CAD Tools & Verilog HDL Definition

Can High-Level Synthesis Compete Against a Hand-Written Code in the Cryptographic Domain? A Case Study

Using FPGAs in Supercomputing Reconfigurable Supercomputing

Hardware-Software Co-Design and Prototyping on SoC FPGAs Puneet Kumar Prateek Sikka Application Engineering Team

MOJTABA MAHDAVI Mojtaba Mahdavi DSP Design Course, EIT Department, Lund University, Sweden

A Rapid Prototyping Methodology for Algorithm Development in Wireless Communications

SECURE PARTIAL RECONFIGURATION OF FPGAs. Amir S. Zeineddini Kris Gaj

FPGA Based Digital Design Using Verilog HDL

FPGAs Provide Reconfigurable DSP Solutions

Case Study on DiaHDL: A Web-based Electronic Design Automation Tool for Education Purpose

DSP Builder Handbook Volume 1: Introduction to DSP Builder

Modeling a 4G LTE System in MATLAB

PlanAhead Release Notes

IMPLICIT+EXPLICIT Architecture

SimXMD Simulation-based HW/SW Co-debugging for field-programmable Systems-on-Chip

Introduction to DSP/FPGA Programming Using MATLAB Simulink

Tutorial on FPGA Design Flow based on Xilinx ISE Webpack and ModelSim. ver. 1.3

University of Massachusetts Amherst Department of Electrical & Computer Engineering

Hardware Oriented Security

IP Module Evaluation Tutorial

Intel Quartus Prime Pro Edition Software and Device Support Release Notes

Optimizing HW/SW Partition of a Complex Embedded Systems. Simon George November 2015.

FPGA-Based Rapid Prototyping of Digital Signal Processing Systems

Parallel Programming of High-Performance Reconfigurable Computing Systems with Unified Parallel C

SDR Spring KOMSYS-F6: Programmable Digital Devices (FPGAs)

Model-Based Design for effective HW/SW Co-Design Alexander Schreiber Senior Application Engineer MathWorks, Germany

Accelerating FPGA/ASIC Design and Verification

SimXMD: Simulation-based HW/SW Co-Debugging for FPGA Embedded Systems

An Overview of a Compiler for Mapping MATLAB Programs onto FPGAs

FPGA Implementation and Validation of the Asynchronous Array of simple Processors

SimXMD Co-Debugging Software and Hardware in FPGA Embedded Systems

Circuit Design and Simulation with VHDL 2nd edition Volnei A. Pedroni MIT Press, 2010 Book web:

Actel Libero TM Integrated Design Environment v2.3 Structural Schematic Flow Design Tutorial

ChipScope Demo Instructions

ESL design with the Agility Compiler for SystemC

CAD SUBSYSTEM FOR DESIGN OF EFFECTIVE DIGITAL FILTERS IN FPGA

Using ChipScope. Overview. Detailed Instructions: Step 1 Creating a new Project

Park Sung Chul. AE MentorGraphics Korea

V8-uRISC 8-bit RISC Microprocessor AllianceCORE Facts Core Specifics VAutomation, Inc. Supported Devices/Resources Remaining I/O CLBs

ESE Back End 2.0. D. Gajski, S. Abdi. (with contributions from H. Cho, D. Shin, A. Gerstlauer)

SpartanMC. SpartanMC. Quick Guide

Support for Programming Reconfigurable Supercomputers

II. LITERATURE SURVEY

Tutorial - Using Xilinx System Generator 14.6 for Co-Simulation on Digilent NEXYS3 (Spartan-6) Board

An Optimized Hardware Architecture for the Montgomery Multiplication Algorithm

Synthesis Options FPGA and ASIC Technology Comparison - 1

NEW FPGA DESIGN AND VERIFICATION TECHNIQUES MICHAL HUSEJKO IT-PES-ES

An introduction to CoCentric

FPGA Solutions: Modular Architecture for Peak Performance

FPGA briefing Part II FPGA development DMW: FPGA development DMW:

[Sub Track 1-3] FPGA/ASIC 을타겟으로한알고리즘의효율적인생성방법및신기능소개

Tutorial on FPGA Design Flow based on Aldec Active HDL. Ver 1.5

Virtex-6 FPGA ML605 Evaluation Kit FAQ June 24, 2009

Simulink Design Environment

OUTLINE RTL DESIGN WITH ARX

Implementation of Elliptic Curve Cryptosystems over GF(2 n ) in Optimal Normal Basis on a Reconfigurable Computer

Experiment 3. Digital Circuit Prototyping Using FPGAs

UNIVERSITY OF CALIFORNIA, SAN DIEGO. Reconfigurable Computing Platform. A thesis submitted in partial satisfaction of the

MODEL BASED HARDWARE DESIGN WITH SIMULINK HDL CODER

DSP Flow for SmartFusion2 and IGLOO2 Devices - Libero SoC v11.6 TU0312 Quickstart and Design Tutorial

Chapter 9: Integration of Full ASIP and its FPGA Implementation

Model-Based FPGA Embedded-Processor Systems Design Methodologies: Modeling, Syntheses, Implementation and Validation

NCSA Reconfigurable Systems Summer Institute July, Michael Babst DSPlogic, Inc x705. DSPlogic Proprietary

Reconfigurable supercomputing has shown

Advanced module: Video en/decoder on Virtex 5

FPGA Design Flow 1. All About FPGA

Transcription:

Tools for Reconfigurable Supercomputing Kris Gaj George Mason University 1

Application Development for Reconfigurable Computers Program Entry Platform mapping Debugging & Verification Compilation Execution 2

Tasks Addressed in This Presentation Program Entry Platform mapping Debugging & Verification Compilation Execution 3

Program Entry Program 4

Platform Mapping SW/HW Partitioning Program Software (executed in the microprocessor system) Hardware (executed in the reconfigurable processor system) 5

SW/HW Partitioning & Coding Traditional Approach Specification SW/HW Partitioning SW Coding HW Coding SW Compilation HW Compilation SW Profiling HW Profiling 6

SW/HW Partitioning & Coding New Approach Specification SW/HW Coding SW/HW Partitioning SW Compilation HW Compilation SW Profiling HW Profiling 7

Program Entry for FPGA Accelerator Boards Traditional Software Hardware HDL Graphical Data Flow Diagram HLL Extended Software Hardware 8 Increased productivity Increased capability to describe parallel execution

Program Entry for Reconfigurable Computers Star Bridge Software Hardware HDL porting EDIF Graphical Data Flow Diagram COM objects HLL SRC Software Hardware HDL macros Increased productivity Increased capability to describe parallel execution 9

Examples of Software Environments for Reconfigurable Computers DSP-oriented Corefire from Annapolis Microsystems Xtreme DSP from Xilinx Inc. & MathWorks General-purpose Viva from Star Bridge Systems SRC Software Environment from SRC Computers, Inc. 10

CoreFire FPGA Application Builder Design viewer Library Cores window Message Window 11 Diagram Editor

Xtreme DSP Environment 12

Star Bridge Software Environment User input Graphical User Interface VIVA Netlists.ngo files Xilinx Place & Route Application executable 13.bin files Configuration bitstreams

Sheets Library Object 14

SRC Compilation Process Application sources Macro sources.c or.f files.vhd or.v files HDL sources.v files Logic synthesis µp Compiler MAP Compiler Netlists.ngo files Object files.o files.o files Place & Route Linker Application executable.bin files Configuration bitstreams 15

Cray XD1 Traditional Design Flow Cores HDL RA I/F, Metadata QDR SRAM I/F 0100010101 1010101011 0100101011 Synthesize Implement 0101011010 1001110101 0110101010 Download Binary File VHDL, Verilog, C Synplicity, Leonardo, Precision, Xilinx ISE Simulate Xilinx ISE From Command line or Application Verify Modelsim 16 Xilinx ChipScope Source: [Cray, MAPLD04]

Cray XD1 New design flow 17 Source: [Cray, MAPLD04]

SGI Altix Design Flow (HDLs) IA-32 Linux Machine.v,.vhd Design Entry (Verilog, VHDL).v,.vhd Design Synthesis (Synplify Pro, Amplify).edf Design iterations.v,.vhd Design Verification Behavioral Simulation (VCS, Modelsim) Metadata Processing (Python) Design Implementation (ISE).ncd,.pcf Static Timing Analysis (ISE Timing Analyzer).cfg.bin Altix Device Programming (RASC Abstraction Layer, Device Manager, Device Driver).c Real-time Verification (gdb) 18

SGI Altix Design Flow (HLLs) IA-32 Linux Machine.v,.vhd HLL Design Entry (Handel-C, Impulse C, Mitrion C, Viva) Metadata Processing (Python) RTL Generation and Integration with Core Services.v,.vhd.cfg Design Synthesis (Synplify Pro, Amplify).edf Design Implementation (ISE).bin.v,.vhd.ncd,.pcf Design Verification Behavioral Simulation (VCS, Modelsim) Static Timing Analysis (ISE Timing Analyzer) Altix Device Programming (RASC Abstraction Layer, Device Manager, Device Driver).c Real-time Verification (gdb) 19

Platform Mapping FPGA mapping Program Hardware FPGA 1 FPGA 2 Software FPGA 3 FPGA 4 20

Example of FPGA Mapping FPGA 1 FPGA 2 FPGA multiply divide multiply divide add add FPGA 1 FPGA 2 multiply divide add 21

FPGA Mapping in SRC FPGA1.mc void fpga1(int64_t a, int64_t b, int64_t *sum, int mapno) { int64_t c, temp; } FPGA2.mc send_to_bridge(b); c = a * const1; recv_from_bridge(&temp); *sum = temp+mult; void fpga2() { int64_t a, d; } recv_from_bridge(&a); d = a/const2; send_to_bridge(d); Makefile MAPFILES = FPGA1.mc FPGA2.mc PRIMARY = FPGA1.mc SECONDARY = FPGA2.mc CHIP2 = FPGA2.mc a multiply add b FPGA 1 FPGA 2 22 sum divide

FPGA Mapping in VIVA TM By changing the attributes one can specify where an object is to be located 23

Platform Mapping FPGA-FPGA data transfer & synchronization Program Hardware FPGA 1 FPGA 2 Software FPGA 3 FPGA 4 24

FPGA-FPGA Data Transfer in SRC FPGA1.mc void fpga1(int64_t a, b, c, *d) { } FPGA2.mc send_to_bridge(a, b, c); computation1 recv_from_bridge(d); void fpga2() { int64_t a,b,c,d; a b c d FPGA 1 FPGA 2 computation 1 64 64 computation 2 } recv_from_bridge(&a, &b, &c); computation2 send_to_bridge(d); 25

FPGA-FPGA Data Transfer in SRC 32 words 64 FIFO 64 bits FIFO 32 words 64 64 64 bits Bridge Port 26

FPGA-FPGA Data Transfer in VIVA TM 27 Special partitioning objects placed between the modules to be synthesized automatically map the relevant lines between the FPGAs. For designs mapped over several FPGAs: The system description must include those FPGAs over which the design is to be mapped,

Platform Mapping Use of Internal and External Memories Program Hardware OCM FPGA 1 FPGA 2 Software SM FPGA 3 OCM On-Chip Memory LM Local Memory SM Shared Memory LM 28 FPGA 4

Using On-Chip Memory (OCM) in SRC void sum(int64_t a[], int *c, int mapno) { BANK_A_ALLOC(AL, int64_t, SIZE); FPGA ocm_a [SIZE]; int i; cm2obm_0(al, a, bytelength); AL[] SM (OBM) 64 OCM ocm_a[] wait_server_0(); for(i=0; i<size; i++) { c 32 computations ocm_a[i] = AL[i]; } for(i=0; i<size; i++) { tmp = ocm_a[i] + tmp; } } 29

Using On-Chip Memory (OCM) in VIVA TM Special Objects under the Memory Subsystem of the library allows the programmer to use the on chip memory of the Xilinx Virtex II chip 30

Platform Mapping I/O Program Hardware OCM FPGA 1 FPGA 2 Software SRC SM StarBridge 31 FPGA 3 LM FPGA 4

Run Time Reconfiguration in SRC Program in C or Fortran Main program Function_1(a, d, e) Function_2(d, e, f) Function_1 Macro_1(a, b, c) Macro_2(b, d) Macro_2(c, e) Function_2 Macro_3(s, t) Macro_1(n, b) Macro_4(t, k) 32 FPGA contents after the Function_1 call FPGA b Macro_2 d a Macro_1 c Macro_2 e

Run-time Reconfiguration in VIVA TM 33 Reconfigurati on is possible by using the spawn object. By specifying the FileName attribute a VIVA executable (.vex file) or a VIVA project can be loaded onto the same or a different FPGA.

Ideal Program Entry Function Program Entry 34

Actual Program Entry Preferred Architectures Function SW/HW Partitioning Use of FPGA Resources (multipliers, µp cores) Sequence of Run-time Reconfigurations Program Entry FPGA Mapping Data Transfers & Synchronization SW/HW Interface Use of Internal and External Memories 35

Evolution and the current status of tools Not implemented Manual Entry µp-fpga Partitioning FPGA-FPGA Partitioning µp-fpga Data Transfer Compiler Automated SRC Star Bridge and other vendors FPGA-FPGA Data Transfer Computation-Data transfer Overlapping Choosing component version......... 36

Library Development - SRC LLL (ASM) HLL (C, Fortran) HLL (C, Fortran) µp system FPGA system HDL (VHDL, Verilog) Library Developer 37 HLL (C, Fortran) HLL (C, Fortran) Application Programmer

Library Development - StarBridge HLL, LLL (C++, ASM) GDF (Viva) GDF (Viva) µp system FPGA system HDL (VHDL, Verilog) Library Developer 38 GDF (Viva) GDF (Viva) Application Programmer

Debugging & Verification 39

CoreFireTM FPGA Application Debugger 40

Corefire Simulation Insert Debug Modules During Design Editing Step Through Design Using Data Flow One Step = One Module View Value and Status of Each Debug Module Waveform or Table of Values Read and Write Directly to Registers Read and Write Directly to Memory 41

X86 System in VIVA TM The FileIn Object as it appears when the x86 system is loaded 42

X86 System in VIVA TM FileIn object as it appears when the FPGA system description is loaded. 43

Debugging in VIVA TM Data can be viewed with the help of widgets, which are basically input and output horns placed in a worksheet. Various display options are available to view data, options to include the kind of view desired by the viewer and the data viewed can be switched between HEX or INT. 44

MAP Board Execution Application Subroutine For MAP User Logic Wrapper ComList Code Code MAP Runtime Library MAP Board On-board Memory Control Processor DMA Engine ComList Processor Registers & Flags User FPGAs User Logic Macro Logic Macro Logic Macro Logic Data & Flags Macro Logic 45

MAP Emulator + DFG Simulator Application Subroutine For MAP User Logic Wrapper ComList Code Code MAP Runtime Library Emulator On-board Memory Control Processor DMA Engine ComList Processor Registers & Flags User FPGAs User Logic Macro C Code Macro C Code Macro C Code Data & Flags Macro C Code 46

MAP Emulator + Verilog Simulator Application Subroutine For MAP User Logic Wrapper ComList Code Code MAP Runtime Library Emulator VCS On-board Memory Control Processor DMA Engine ComList Processor Registers & Flags User FPGAs User Logic Macro Verilog Macro Verilog Macro Verilog Data & Flags Macro Verilog 47

Summary Program Entry Program entry model SRC Star Bridge Xtreme DSP Corefire HLL GDF HLL, GDF HLL, GDF Programming languages HLL C, Fortran Matlab Java GDF - VIVA IIADL Simulink Corefire Application Builder HDL VHDL, Verilog EDIF VHDL, Verilog? 48

Summary Partitioning & Data Transfer FPGA Mapping FPGA-FPGA Data Transfer SRC Star Bridge Xtreme DSP Corefire Separate HLL functions send-to bridge, recv-from bridge macro System attributes of objects special data transfer objects such as PE1 =>PE2_50 Separate design sheets interface library components Separate design sheets interface library components 49

Summary Synchronization SRC Star Bridge Xtreme DSP Corefire Implicit Explicit: Go-Done- Busy-Wait Explicit: done, empty, full, etc. Implicit 50

Summary Run-time reconfiguration Run-time reconfiguration SRC Star Bridge Xtreme DSP sequence of MAP function calls using spawn objects associated with VIVA executables Corefire?? 51

Summary Use of Internal Resources Using internal component s of FPGA SRC Star Bridge Xtreme DSP Corefire block RAM s Arrays defined inside MAP functions Special objects under the memory subsystem Memory block sets Memory operators and resources multipliers Library functions Objects under arithmetic subsystem Math block sets Modules in math library 52

Summary Data Types Data types SRC Star Bridge Xtreme DSP Corefire Unsigned integers 8,16,32,64 bits 1-128 bits - 32, 64 bits Signed integers 8,16,32,64 bits 1-128 bits - 8 bits Fixed point - fix16, fix32 signed, unsigned variable size Floating point Single & double precision Arrays 8, 16, 32, 64 bit User defined types 32-bit single precision vectors of 1-128 bits 64-bit double precision - yes user-defined precision options - 32-bit single precision - - Complex - - - 16-bit signed integers, float - 53

Summary Libraries Libraries SRC Star Bridge Xtreme DSP Corefire arithmetic yes yes yes yes logic yes yes yes yes storage implicit yes yes yes memory allocation and access yes yes yes yes control yes yes yes - data transfer yes yes yes yes debugging and profiling yes yes yes yes DSP - - yes yes communication - - yes - User defined components Macros in VHDL or Verilog Objects in VIVA Black boxes in Simulink Macros in Corefire 54

Summary Debugging & Verification Debugging and verification SRC Star Bridge Xtreme DSP Corefire software emulation yes yes yes no HDL simulation yes no yes no 55

Summary Third Party Tools & I/O Use of external tools logic synthesis SRC Star Bridge Xtreme DSP Corefire Synplicity Synplify Pro - Synplicity Synplify Pro, Mentor Graphics Leonardo Spectrum, Xilinx XST MAP, PAR Xilinx ISE Xilinx ISE Xilinx ISE Xilinx ISE µp compilation Intel - Matlab - Schematic capture HDL simulation Components of the environment Input and output - VIVA Simulink Application Builder VCS - ModelSim - Editors, compiler, DFG behavioral simulator, VCS HDL simulator standard HLL i/o functions, files VIVA (program entry, compiler, debugging and verification, execution environment) Simulink program entry, HDL simulator, synthesis compiler, place and route tool - Application builder, debugger widgets, files files, block sets files, waveforms, tables 56

SRC Star Bridge Sashisu Bajracharya (GMU) Esmail Chitalwala (GWU) Esam El-Araby (GWU) Miaoqing Huang (GWU) Allen Michalski (USC) Nghi Nguyen (GMU) Proshanta Saha (GWU) Nandkishore Sastry (GMU) Chang Shu (GMU) Mohamed Taher (GWU) Acknowledgements 57