Adaptive Computing Systems (ACS) Domain for Implementing DSP Algorithms in Reconfigurable Hardware
John Zaino, Eric Pauer, Ken Smith, Paul Fiore, Jairam Ramanathan, Cory Myers
{john.c.zaino, ken.smith, paul.d.fiore,
Fourth Biennial Ptolemy Miniconference, March 2001

Objective/Approach/Process

Reconfigurable computing technology offers significant performance gains (e.g. 10X ops per watt and/or ops per cubic inch) over general-purpose programmable solutions without the need to develop custom hardware. Today, however, developing a working implementation requires hardware design expertise, and producing a good implementation requires many slow iterations between an algorithm designer and a hardware developer.

- Objective: reduce the design time for an initial implementation to hours, and for an optimized implementation to days, for a range of signal processing applications.
- Approach: provide the algorithm developer with tools to help analyze algorithms, understand their implications for hardware, and rapidly implement the chosen solutions.
- In the process, isolate the algorithm developer from the hardware designer through a set of library elements that provide well-defined interfaces to both communities.

Direct mapping of algorithm to adaptive computing system implementation. Automatic implementation.

03/07/01 Page - 2
Technical Attributes

- Development of the Adaptive Computing Systems domain under Ptolemy Classic
- Allows alternative implementations from the same dataflow graph
- Provides floating-point simulation, fixed-point simulation, C code generation, and VHDL code generation
- Released the first three versions of the ACS domain in Ptolemy Classic
- End-to-end capability to map a signal processing dataflow graph to a working reconfigurable computing implementation
- Design space exploration automated
- Bit width optimization theory (Markovian modeling) developed for algorithm analysis
- Bit width optimization tool implemented to trade signal-to-noise ratio against hardware complexity
- Pipeline alignment and scheduling algorithms implemented
- Automatically generate algorithm-specific sequencer and memory control logic
- Uni-rate and multi-rate signal processing
- Single- and multi-FPGA implementations
- Smart Generators: parameterizable algorithmic blocks

Analysis and Mapping in ACS Environment

[Figure: block diagram of the ACS flow from the dataflow graph to the adaptive computing resource, organized around a common database in Ptolemy. Stages and their annotations:]
- Algorithm Analysis (bit width analysis, noise distribution analysis, precision analysis, floating-point and fixed-point simulation, algorithm rearrangement, alternative implementations): SNR analysis, alternative implementations, functional approximations
- Algorithm Mapping (performance modeling, performance metrics, partitioning and mapping, automatic scheduling, allocated functions): timing and sizing estimation, scheduling, partitioning across multiple FPGAs
- Smart Generators (generator selection, VHDL, interface libraries, device programming): device program, interface program
Design Approach

[Figure: design approach, split into design time and run time. Design time: a signal processing algorithm is represented as dataflow in the Ptolemy environment, then passes through analysis and simulation (algorithm analysis, algorithm mapping, smart generators, common database in Ptolemy), application interface generation, VHDL, interface libraries, logic generation, floorplanner, routing, and device program. Run time: application software, compute libraries, hardware configuration library, run-time manager, operating system, and device driver on the host processor, together with the reconfigurable hardware. Legend: enhanced or new capability; existing tool or hardware.]

Program Progress

Algorithm analysis
- Representation for alternative implementations was incorporated as part of Ptolemy integration
- Side-by-side simulation capability incorporated as part of Ptolemy integration
- Developed bit width optimization theory for algorithm analysis and extended it to include multiple devices and constraints
- Implemented wordlength optimization tool

Algorithm mapping
- Cost analysis included as part of wordlength analysis
- Implemented uni-rate and multi-rate pipeline alignment and scheduling algorithm for signal processing dataflow graphs
- One-to-one and one-to-many mapping of functions to blocks supported
Program Progress

Smart generators
- Implemented portable logic synthesis methodology with VHDL as first target
- Integrated Xilinx Core Generators (4000 Series) capability within VHDL code generation
- Implemented smart generators for state machine sequencer and memory control (address generator)

Ptolemy integration
- Released initial version of the Adaptive Computing Systems domain in Ptolemy in June 1998. Second release April 1999. Third release in August.
- ACS domain supports alternative implementations from a common interface: floating-point simulation, fixed-point simulation, C code generation, and VHDL code generation.

Demonstration
- Selected Annapolis Micro Systems Wildforce™ board for demonstrations
- Established ACS demonstration environment for Solaris
- Integrated Wildforce™ board under Ptolemy
- Demonstrated Winograd-based FSK receiver and FFT-based signal detector
- Procured and installed Annapolis Micro Systems Wildstar™ board under Solaris
- SHARP/HRR (High Range Resolution Radar ATR) algorithm modeled; hardware development and testing nearing completion

ACS Domain

- Determined that extending old domains could not be justified
- New paradigm for Ptolemy, e.g. multiple implementations of a single star
ACS Domain

- New ACS domain to facilitate movement among simulation and code/design generation
- Corona contains the interface specification
- Core contains an implementation
- ACS stars are composed of one corona and multiple cores
- Core selection via targeting defines the implementation

[Figure: one corona with four cores, each selected by a target: floating-point simulation core, fixed-point simulation core, C code generation core, FPGA design generation core.]

Selecting Among Alternative Implementations

- Alternative implementations are represented as targets, with cores for each star/functional block
- Targets can have parameters
- Floating-point simulation, fixed-point simulation, C code generation, and FPGA design generation are available
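The corona/core split described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Ptolemy Classic code: the class names `Corona` and `Star`, the target strings, and the `quantize` helper are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Corona:
    """Interface specification shared by every implementation of a star."""
    name: str
    inputs: List[str]
    outputs: List[str]


@dataclass
class Star:
    """One corona plus one core per target (hypothetical structure)."""
    corona: Corona
    cores: Dict[str, Callable] = field(default_factory=dict)

    def add_core(self, target: str, impl: Callable) -> None:
        self.cores[target] = impl

    def fire(self, target: str, *args):
        # Core selection via targeting defines the implementation.
        if target not in self.cores:
            raise KeyError(f"{self.corona.name} has no core for target {target!r}")
        return self.cores[target](*args)


def quantize(value: float, frac_bits: int) -> float:
    """Round to a fixed-point grid with `frac_bits` fractional bits."""
    scale = 1 << frac_bits
    return round(value * scale) / scale


# An adder star with four cores, mirroring the four targets on the slide.
add = Star(Corona("Add", inputs=["a", "b"], outputs=["sum"]))
add.add_core("floating_point_sim", lambda a, b: a + b)
add.add_core("fixed_point_sim", lambda a, b: quantize(a + b, 2))
add.add_core("c_codegen", lambda a, b: f"sum = {a} + {b};")
add.add_core("fpga_design", lambda a, b: f"sum <= {a} + {b};")

print(add.fire("floating_point_sim", 0.5, 0.3125))
print(add.fire("fixed_point_sim", 0.5, 0.3125))
print(add.fire("c_codegen", "a", "b"))
```

The same graph of stars can then be run against any target by changing a single string, which is the property the slides attribute to the ACS domain.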
Algorithm Analysis

[Figure: multiple representations of the same algorithm and the resulting implementation trades. Panels show a frequency estimator built from I/Q angle computation with N delays versus N/3 and 2N/3 delays, a loop filter P(z) with a quantizer, and FIR structures labeled Basic FIR, Systolic, Bit Serial, Low Power, Reduce Taplength, Reduce Wordlength, and Multirate.]

The three-tap FIR example in the figure computes successive outputs as

$$Y_n = a_0 X_n + a_1 X_{n-1} + a_2 X_{n-2}$$
$$Y_{n+1} = a_0 X_{n+1} + a_1 X_n + a_2 X_{n-1}$$
$$Y_{n+2} = a_0 X_{n+2} + a_1 X_{n+1} + a_2 X_n$$

Perform trade-offs:
- Precision (float vs. fixed, wordlengths)
- Speed
- Size/area
- Latency

Wordlength Optimization

[Figure: wordlength optimization analysis relating dynamic range, quantization noise (SNR), and hardware cost to optimal design choices.]
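The trade between quantization noise and hardware cost can be illustrated with a brute-force sweep over wordlengths for the three-tap FIR above. This is a minimal sketch, not the Markovian-model-based tool the slides describe: the coefficient values, the test signal, and the cost proxy (taps times the square of the fractional wordlength) are assumptions chosen only for illustration.

```python
import math


def fir(x, coeffs):
    """Direct-form FIR: y[n] = sum_k a_k * x[n-k], with x[n] = 0 for n < 0."""
    return [sum(a * x[n - k] for k, a in enumerate(coeffs) if n - k >= 0)
            for n in range(len(x))]


def quantize(value, frac_bits):
    """Round to a fixed-point grid with `frac_bits` fractional bits."""
    scale = 1 << frac_bits
    return round(value * scale) / scale


def snr_db(reference, test):
    """Signal-to-noise ratio of `test` against the floating-point reference."""
    signal = sum(r * r for r in reference)
    noise = sum((r - t) ** 2 for r, t in zip(reference, test))
    return math.inf if noise == 0 else 10 * math.log10(signal / noise)


coeffs = [0.2, 0.5, 0.3]                       # illustrative a0, a1, a2
x = [math.sin(0.1 * n) for n in range(200)]    # illustrative input
reference = fir(x, coeffs)                     # floating-point simulation


def evaluate(frac_bits):
    """Fixed-point simulation at one wordlength; returns (SNR in dB, cost)."""
    qc = [quantize(a, frac_bits) for a in coeffs]
    qx = [quantize(v, frac_bits) for v in x]
    y = [quantize(v, frac_bits) for v in fir(qx, qc)]
    cost = len(coeffs) * frac_bits ** 2        # hypothetical multiplier-area proxy
    return snr_db(reference, y), cost


for bits in (4, 8, 12, 16):
    snr, cost = evaluate(bits)
    print(f"{bits:2d} fractional bits: SNR = {snr:6.1f} dB, cost = {cost}")
```

Each added bit buys SNR at a growing hardware cost, so a designer would pick the smallest wordlength that still meets the SNR requirement.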
Algorithm Mapping

Objectives
- Performance modeling: provide feedback on utilization, throughput, efficiency, etc. Feedback should be used by algorithm analysis capabilities.
- Partitioning and mapping: break large dataflow graphs into groups and map those groups across multiple devices and across time.
- Automatic scheduling: automatically determine firing sequence, optimal mappings, and sequence of configurations.

Progress
- Cost analysis included as part of wordlength analysis
- Implemented uni-rate and multi-rate pipeline alignment and scheduling algorithms
- Memory allocation support

Automatic Scheduling

Pipeline alignment and schedule determination are required for logic synthesis.

[Figure: the algorithm dataflow graph (instances I with pipeline delays P, nodes N, input and output ports) is modified by the sequencer generator, which adds memories, a multiplexer, a delay, and load-enable (LDEN) and select (SEL) signals to the netlist. In the FPGA datapath, variables A, B, C are located in RAM bank 1 and D, E in RAM bank 2. The final algorithm schedule lists the node activation sequence for N1 to N9 together with SEL, LD1 to LD3, PORT1, and PORT2. Legend: I = instance, N = node, P = pipeline delays.]
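Pipeline alignment can be sketched as a longest-path pass over the dataflow graph: each block starts when its latest input is valid, and earlier inputs are padded with delay registers. The sketch below assumes a uni-rate acyclic graph and uses hypothetical names (`align_pipelines`, `latency`, `preds`); it is not the ACS scheduler itself.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def align_pipelines(latency, preds):
    """Align the inputs of every block in a uni-rate dataflow graph.

    latency: block name -> pipeline delay in clock cycles
    preds:   block name -> list of predecessor block names
    Returns (ready, pad), where ready[b] is the cycle at which b's output
    is valid and pad[(p, b)] is the number of delay registers to insert
    on the edge p -> b.
    """
    graph = {b: set(preds.get(b, ())) for b in latency}
    ready, pad = {}, {}
    for block in TopologicalSorter(graph).static_order():
        inputs = preds.get(block, [])
        # The block can only fire once its slowest input has arrived.
        start = max((ready[p] for p in inputs), default=0)
        for p in inputs:
            pad[(p, block)] = start - ready[p]
        ready[block] = start + latency[block]
    return ready, pad


# Illustrative graph: the input feeds a 2-cycle multiplier and, directly,
# a 1-cycle adder that also consumes the multiplier output.
latency = {"in": 0, "mult": 2, "add": 1, "out": 0}
preds = {"mult": ["in"], "add": ["mult", "in"], "out": ["add"]}

ready, pad = align_pipelines(latency, preds)
print(ready)
print(pad)
```

Here the direct edge from `in` to `add` needs two delay registers so that both adder inputs belong to the same sample, and the result is valid three cycles after the input.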
Processing Model

- Well matched to the Ptolemy Synchronous Dataflow (SDF) domain
- Unit or block token produce and consume amounts
- Netlist structure determines execution order constraints
- Pipeline delay information required to determine absolute timing
- Delays are set to align pipelines for maximum throughput
- Delay can be automatically determined from block parameters
- Combination of fully synchronous model and tagged synchronous models
- No handshaking or tags, but data is not always valid
- Data validity is implicit in timing of latch signals
- Memory access fits the same model
- Data from common memory demuxed into separate streams running at lower rate
- Data to common memory multiplexed to a single port
- Multiple FPGAs introduce additional pipeline delays
- Multi-rate parameterized execution

Smart Generators

Objectives
- Parameterized libraries: generate node implementations for specified bit widths and parameter values
- Hierarchical representations: provide generators that can recursively call other generators
- Interface generation: automatically generate software to move data between the general-purpose processor and the reconfigurable platform and to manage sequences of configurations
- General synthesis: provide a device-independent representation of the implementation

Progress
- Implemented portable logic synthesis methodology with VHDL as first target
- Integrated Xilinx Core Generators (4000 Series) capability within VHDL code generation
- Implemented smart generators for state machine and memory control
- Hierarchical generation

[Figure: smart generator flow from allocated functions through generator selection, VHDL, interface libraries, and device programming to the adaptive computing resource, around the common database in Ptolemy.]
Multi-FPGA Capability

Design generation for single or multiple FPGAs.

[Figure: single-FPGA implementation (FPGA logic) compared with a multi-FPGA implementation (logic and routing views for FPGA 1 through FPGA 4).]

Winograd DFT-Based FSK Communications Receiver

[Figure: FPGA implementation.]
Results from FPGA-target / Back-end Tools

[Figure: generated VHDL, generated schedule, and FPGA design.]

Hardware-in-the-Loop

[Figure: SDF galaxy.]

The SDF Wildforce™ star executes the complete FPGA design in hardware on the Annapolis Wildforce FPGA board.
Processing Results

[Figure: processing results.]

SHARP*/HRR Algorithm

[Figure: a test vector and a template vector (one per target, per azimuth, per elevation) with non-linearity, shift, and least squares fit stages producing a modeling error.]

Algorithm
    Given test vector
      For each template
        For each shift
          Compute least squares error
      Select template with minimum error

Can be done with correlation if templates are suitably pre-processed.

* System-oriented High Range Resolution (HRR) Automatic Recognition Program

Complexity
- 70 data points per vector
- Number of shifts = (in range)
- Number of templates = 3,600/class
- 86 sec/class for shifts on a Sun Ultra 5 (360 MHz) workstation
- Expect 30x improvement
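The template search loop above can be written out directly. The sketch below makes two simplifying assumptions that the slides do not state: shifts are circular, and the "least squares error" is a plain sum of squared differences with no gain or offset fit. The function names and the toy vectors are hypothetical.

```python
def shifted(vector, shift):
    """Circularly rotate `vector` right by `shift` samples (assumed shift model)."""
    shift %= len(vector)
    return list(vector[-shift:]) + list(vector[:-shift]) if shift else list(vector)


def squared_error(test, template):
    """Sum of squared differences between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(test, template))


def classify(test, templates, shifts):
    """Return (error, template name, shift) of the best match.

    For each template and each shift, compute the squared error against
    the test vector and keep the minimum.
    """
    best = None
    for name, template in templates.items():
        for shift in shifts:
            error = squared_error(test, shifted(template, shift))
            if best is None or error < best[0]:
                best = (error, name, shift)
    return best


# Toy data: two templates and a noisy copy of template "A" shifted by 2.
templates = {
    "A": [0.0, 1.0, 3.0, 1.0, 0.0, 0.0],
    "B": [0.0, 2.0, 2.0, 2.0, 0.0, 0.0],
}
test_vector = [0.0, 0.1, 0.0, 1.0, 2.9, 1.1]

error, name, shift = classify(test_vector, templates, range(-2, 3))
print(name, shift, round(error, 3))
```

The work grows as templates times shifts times vector length, which is why the slide notes that the inner loop can be replaced by a correlation when the templates are suitably pre-processed.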
SHARP/HRR Algorithm

[Figure: schedule and FPGA design for the normalization stage and for the correlation stage, with normalization results ( vs. SW) and correlation results across range shifts (typical expected).]

ACS Tools - Facts and Figures

- 23 functional blocks (ACS stars) developed
- ~100 lines of code needed for a new block/star
- Two ACS architectures supported: Wildforce™ (4062XL) and Wildstar™ (XCV1000) (in progress)
- ~6,000 lines of C++ code developed
- ~5 min to generate VHDL for a five-FPGA design
- Explore ~1000 bit-width combinations in 1 minute
- Ptolemy Classic runs under Solaris and Linux
More informationESE532: System-on-a-Chip Architecture. Today. Message. Clock Cycle BRAM
ESE532: System-on-a-Chip Architecture Day 20: April 3, 2017 Pipelining, Frequency, Dataflow Today What drives cycle times Pipelining in Vivado HLS C Avoiding bottlenecks feeding data in Vivado HLS C Penn
More informationAn Ultra Low-Power WOLA Filterbank Implementation in Deep Submicron Technology
An Ultra ow-power WOA Filterbank Implementation in Deep Submicron Technology R. Brennan, T. Schneider Dspfactory td 611 Kumpf Drive, Unit 2 Waterloo, Ontario, Canada N2V 1K8 Abstract The availability of
More informationSimulink Design Environment
EE219A Spring 2008 Special Topics in Circuits and Signal Processing Lecture 4 Simulink Design Environment Dejan Markovic dejan@ee.ucla.edu Announcements Class wiki Material being constantly updated Please
More informationModel-Based Design for effective HW/SW Co-Design Alexander Schreiber Senior Application Engineer MathWorks, Germany
Model-Based Design for effective HW/SW Co-Design Alexander Schreiber Senior Application Engineer MathWorks, Germany 2013 The MathWorks, Inc. 1 Agenda Model-Based Design of embedded Systems Software Implementation
More informationRTL Coding General Concepts
RTL Coding General Concepts Typical Digital System 2 Components of a Digital System Printed circuit board (PCB) Embedded d software microprocessor microcontroller digital signal processor (DSP) ASIC Programmable
More informationVHDL. VHDL History. Why VHDL? Introduction to Structured VLSI Design. Very High Speed Integrated Circuit (VHSIC) Hardware Description Language
VHDL Introduction to Structured VLSI Design VHDL I Very High Speed Integrated Circuit (VHSIC) Hardware Description Language Joachim Rodrigues A Technology Independent, Standard Hardware description Language
More informationFPGA Implementation of Multiplierless 2D DWT Architecture for Image Compression
FPGA Implementation of Multiplierless 2D DWT Architecture for Image Compression Divakara.S.S, Research Scholar, J.S.S. Research Foundation, Mysore Cyril Prasanna Raj P Dean(R&D), MSEC, Bangalore Thejas
More informationAbstract A SCALABLE, PARALLEL, AND RECONFIGURABLE DATAPATH ARCHITECTURE
A SCALABLE, PARALLEL, AND RECONFIGURABLE DATAPATH ARCHITECTURE Reiner W. Hartenstein, Rainer Kress, Helmut Reinig University of Kaiserslautern Erwin-Schrödinger-Straße, D-67663 Kaiserslautern, Germany
More informationA Hardware/Software Co-design Flow and IP Library Based on Simulink
A Hardware/Software Co-design Flow and IP Library Based on Simulink L.M.Reyneri, F.Cucinotta, A.Serra Dipartimento di Elettronica Politecnico di Torino, Italy email:reyneri@polito.it L.Lavagno DIEGM Università
More informationHDL Cosimulation August 2005
HDL Cosimulation August 2005 Notice The information contained in this document is subject to change without notice. Agilent Technologies makes no warranty of any kind with regard to this material, including,
More informationThe CompSOC Design Flow for Virtual Execution Platforms
NEST COBRA CA104 The CompSOC Design Flow for Virtual Execution Platforms FPGAWorld 10-09-2013 Sven Goossens*, Benny Akesson*, Martijn Koedam*, Ashkan Beyranvand Nejad, Andrew Nelson, Kees Goossens* * Introduction
More informationIntroduction to FPGA Design with Vivado High-Level Synthesis. UG998 (v1.0) July 2, 2013
Introduction to FPGA Design with Vivado High-Level Synthesis Notice of Disclaimer The information disclosed to you hereunder (the Materials ) is provided solely for the selection and use of Xilinx products.
More informationExercises in DSP Design 2016 & Exam from Exam from
Exercises in SP esign 2016 & Exam from 2005-12-12 Exam from 2004-12-13 ept. of Electrical and Information Technology Some helpful equations Retiming: Folding: ω r (e) = ω(e)+r(v) r(u) F (U V) = Nw(e) P
More informationA Stream Compiler for Communication-Exposed Architectures
A Stream Compiler for Communication-Exposed Architectures Michael Gordon, William Thies, Michal Karczmarek, Jasper Lin, Ali Meli, Andrew Lamb, Chris Leger, Jeremy Wong, Henry Hoffmann, David Maze, Saman
More informationVerilog for High Performance
Verilog for High Performance Course Description This course provides all necessary theoretical and practical know-how to write synthesizable HDL code through Verilog standard language. The course goes
More informationFloating-Point Bitwidth Analysis via Automatic Differentiation
Floating-Point Bitwidth Analysis via Automatic Differentiation Altaf Abdul Gaffar 1, Oskar Mencer 2, Wayne Luk 1, Peter Y.K. Cheung 3 and Nabeel Shirazi 4 1 Department of Computing, Imperial College, London
More informationSoC Design for the New Millennium Daniel D. Gajski
SoC Design for the New Millennium Daniel D. Gajski Center for Embedded Computer Systems University of California, Irvine www.cecs.uci.edu/~gajski Outline System gap Design flow Model algebra System environment
More informationXilinx System Generator v Xilinx Blockset Reference Guide. for Simulink. Introduction. Xilinx Blockset Overview.
Xilinx System Generator v1.0.1 for Simulink Introduction Xilinx Blockset Overview Blockset Elements Xilinx Blockset Reference Guide Printed in U.S.A. Xilinx System Generator v1.0.1 Reference Guide About
More informationFPGA Architecture Overview. Generic FPGA Architecture (1) FPGA Architecture
FPGA Architecture Overview dr chris dick dsp chief architect wireless and signal processing group xilinx inc. Generic FPGA Architecture () Generic FPGA architecture consists of an array of logic tiles
More informationIMPLICIT+EXPLICIT Architecture
IMPLICIT+EXPLICIT Architecture Fortran Carte Programming Environment C Implicitly Controlled Device Dense logic device Typically fixed logic µp, DSP, ASIC, etc. Implicit Device Explicit Device Explicitly
More informationEE382V: System-on-a-Chip (SoC) Design
EE382V: System-on-a-Chip (SoC) Design Lecture 8 HW/SW Co-Design Sources: Prof. Margarida Jacome, UT Austin Andreas Gerstlauer Electrical and Computer Engineering University of Texas at Austin gerstl@ece.utexas.edu
More informationHRL: Efficient and Flexible Reconfigurable Logic for Near-Data Processing
HRL: Efficient and Flexible Reconfigurable Logic for Near-Data Processing Mingyu Gao and Christos Kozyrakis Stanford University http://mast.stanford.edu HPCA March 14, 2016 PIM is Coming Back End of Dennard
More informationHDL Cosimulation May 2007
HDL Cosimulation May 2007 Notice The information contained in this document is subject to change without notice. Agilent Technologies makes no warranty of any kind with regard to this material, including,
More informationAdvanced Design System DSP Synthesis
Advanced Design System 2002 DSP Synthesis February 2002 Notice The information contained in this document is subject to change without notice. Agilent Technologies makes no warranty of any kind with regard
More informationA Normal I/O Order Radix-2 FFT Architecture to Process Twin Data Streams for MIMO
2402 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 24, NO. 6, JUNE 2016 A Normal I/O Order Radix-2 FFT Architecture to Process Twin Data Streams for MIMO Antony Xavier Glittas,
More information