Optimal Partition with Block-Level Parallelization in C-to-RTL Synthesis for Streaming Applications
1 Optimal Partition with Block-Level Parallelization in C-to-RTL Synthesis for Streaming Applications
Authors: Shuangchen Li, Yongpan Liu, X. Sharon Hu, Xinyu He, Pei Zhang, and Huazhong Yang
2013/01/23
2 Outline
- Introduction
- Overview
- MILP-Based Solution
- Heuristic Solution
- Experimental Evaluation
- Conclusions and Future Work
3 Introduction: Background
- Application complexity is increasing
- Well-developed software libraries
- Low speed, high power
- MPSoC architecture
- Hardware design complexity
4 Introduction: Background (cont'd)
- How to rapidly design hardware from existing software algorithms?
- This challenge is not new. However, the design gap is ever increasing.
- [Figure: progression of EDA tools by synthesis level (physical, gate, RTL, system level) over time; gates/chip on a log scale for technology capabilities (Moore's law) versus HW design productivity, with the HW design gap between them [29]]
5 Introduction: Motivation
- C2RTL tools are promising
  - A number of C2RTL tools are available
  - Many success stories:
    - A DVB-SH Turbo decoder [8]
    - A face detection system [9]
    - 3G/4G MIMO wireless systems [7]
6 Introduction: Motivation (cont'd)
- However, state-of-the-art C2RTL tools suffer from:
  - Low quality of results (QoR) for large C programs
  - Limited system-level optimization options
- Examples: Reed-Solomon decoding [28], a JPEG encoder [10]

              Flatten approach   Hierarchical approach   Speedup
  Cycles      42,475,202         4,070,                  x
  Clock                                                  x
8 Overview: Our work
- Given:
  - A large C program for a streaming application
  - System constraints (latency, area, ...)
- Determine:
  - Partition: how to partition the code into pipelined blocks
  - Parallelization: which blocks should be parallelized
- The objectives:
  - Improve the quality of the synthesis results
  - Provide more system-level optimization options
9 Overview: Design flow
- Inputs: the C program to be synthesized, plus throughput and area constraints
- STEP 1: Extract the parameters of the N functions (we use eXCite here)
- STEP 2: Determine (optimize) partition and parallelization
- STEP 3: Synthesize each block with a C2RTL tool (eXCite)
- STEP 4: Construct the complete system by assembling the modules into a single design
- [Figure: functions F1 ... FN are partitioned into Block 1 ... Block m, and Block 1 (F1, F2) is parallelized into Block 1-1 and Block 1-2. Structure of the final system: a controller and processing elements PE1 (Module 1-1, Module 1-2), PE2 (Module 2), ..., PEm (Module m) connected by FIFOs]
10 Overview: An example
- Given a C program in the straight-line style
- Given constraints: system throughput and area
- Partition: which functions should be synthesized together as one pipeline stage
- Parallelization: which synthesized modules should be parallelized
- C program: main(){ F1(a,b); F2(b,c); F3(c,d); F4(d,e); F5(e,f); F6(f,g); F7(g,h); F8(h,i); }
- Synthesized HDL after partition, with the modules connected by FIFOs:
  - Module 1 (from F1, F2, F3)
  - Module 2 (from F4)
  - Module 3 (from F5, F6)
  - Module 4 (from F7, F8)
11 Overview: Challenges
- The design space is large:
  - Partition has a great impact on throughput and area
  - Parallelization has a great impact on throughput and area
- It is important to consider partition and parallelization simultaneously:
  - The constraints apply to the system after both partition and parallelization
  - If they are optimized separately, it is not clear how to apply the constraints to each problem individually
- [Figure: Pareto-optimal solutions for a GSM case, area ($a_{all}$) versus latency ($r_{all}^{-1}$), with BLP (W. BLP) and without BLP (W.O. BLP)]
12 Overview: Related work

  Work                                 Application  Input     Target  Partition  Parallelization
  A. Hagiescu et al., DAC 2009 [11]    Stream       StreamIt  MSoPC   Manually   Heuristic
  J. Cong et al., DATE 2012 [12]       Stream       C         FPGA    Manually   ILP
  Y. Liu et al., InTech book [13]      Stream       C         FPGA    Manually   Heuristic
  Y. Hara et al., IEICE [14]           General      C         FPGA    ILP        N/A
  This work                            Stream       C         FPGA    Both MILP and heuristic (considered simultaneously)

- A somewhat related line of work maps C programs to MPSoCs (software mapping), where:
  - Blocks (or tasks) can be assigned to the same processor
  - The processor area is given
13 Overview: Our Contribution
- A novel MILP-based formulation
  - Finds a partition and parallelization solution with maximum throughput or minimum area while satisfying a given area or throughput constraint, respectively
- An efficient heuristic algorithm
  - Overcomes the scalability challenge facing the MILP formulation
- Validation of the proposed methods
  - FPGA-based accelerators developed for seven streaming applications
15 MILP-Based Solution: Formulation
- Given: function parameters (Para)
  - Area and throughput of each function
- Determine ($x_n$):
  - Which functions should be clustered to form blocks
  - Which blocks should be parallelized
- Objective: minimize area $a_{all}(x_n, Para)$ or maximize throughput $r_{all}(x_n, Para)$
- Subject to:
  - Area constraint ($a_{all} < A_{req}$)
  - Throughput constraint ($r_{all} > R_{req}$)
  - Connectivity constraints
16 MILP-Based Solution: Variable
- We use $\{x_n\} \in \mathbb{Z}$ to represent partition and parallelization:
  - Partition: if $x_n = 0$, $F_n$ and $F_{n+1}$ are in the same block
  - Parallelization: if $x_n \ne 0$, the parallelism degree of the block containing $F_n$ is $x_n$
- We also use binary variables $\{y_{i,j}\}$ to represent partition:
  - $y_{i,j} = 1$ means $F_i, F_{i+1}, \dots, F_j$ are clustered
- [Figure: for F1 ... F7, $x_n$ = {0, 2, 1, 0, 1, 0, 3} gives F1 F2 (two copies), F3, F4 F5, and F6 F7 (three copies)]
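The encoding is easier to follow when it is decoded mechanically. The sketch below is only one reading of the slide: it assumes a non-zero $x_n$ marks $F_n$ as the last function of a block and gives that block's parallelism degree. The function name `decode_partition` is hypothetical and does not come from the original work.

```python
def decode_partition(x):
    """Turn the vector x_1..x_N into a list of (first, last, degree) blocks.

    Assumed reading of the slide:
      x[n-1] == 0 : F_n and F_{n+1} belong to the same block
      x[n-1] != 0 : F_n closes a block whose parallelism degree is x[n-1]
    """
    if not x or x[-1] == 0:
        raise ValueError("the last function must close a block (x_N != 0)")
    blocks = []
    first = 1  # 1-based index of the first function in the current block
    for n, degree in enumerate(x, start=1):
        if degree != 0:
            blocks.append((first, n, degree))
            first = n + 1
    return blocks


# The example vector from the slide, for F1..F7.
blocks = decode_partition([0, 2, 1, 0, 1, 0, 3])
print(blocks)  # [(1, 2, 2), (3, 3, 1), (4, 5, 1), (6, 7, 3)]
```

The result matches the figure: F1 and F2 form a block with two copies, F3 and the block F4 F5 are not replicated, and F6 F7 form a block with three copies.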
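17 MILP-Based Solution: Details
- To calculate throughput $r_{all}(x_n, Para)$:

$$r_{all} \le r_{i,j} \quad \text{if } y_{i,j} = 1 \tag{1}$$

$$r_{i,j} = \begin{cases} x_j y_{i,j} / T_{i,j} & x_j y_{i,j} \le P_{i,j} \\ 1 / \max\{T^{in}_{i,j}, T^{out}_{i,j}\} & \text{otherwise} \end{cases} \tag{2}$$

- To calculate area $a_{all}(x_n, Para)$:

$$a^{le/mem}_{all} = a^{le/mem}_{fifo} + \sum_{i=1}^{N} \sum_{j=i}^{N} \left( (x_j - 1)\, O^{le/mem} + x_j A^{le/mem}_{i,j} \right) y_{i,j} \tag{3}$$

- Connectivity constraints:

$$x_n \ge 1 \ \text{when } \sum_{i=1}^{n} y_{i,n} \ne 0, \qquad x_n = 0 \ \text{when } \sum_{i=1}^{n} y_{i,n} = 0 \tag{4}$$

$$\sum_{i=1}^{j-1} y_{i,j-1} = \sum_{i=j}^{N} y_{j,i} \quad \forall j \in [2, N] \tag{5}$$

$$\sum_{i=1}^{j} y_{i,j} + \sum_{i=j}^{N} y_{j,i} - y_{j,j} \le 1 \quad \forall j \in [1, N] \tag{6}$$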
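To make Equations (1) to (3) concrete, the sketch below evaluates one candidate solution. It rests on several assumptions that the slide does not state: the two branches of Equation (2) are combined as a minimum (the computation rate saturates at the I/O rate), a single resource type is tracked instead of separate le/mem figures, the FIFO area is a constant, and all parameter values are invented for illustration. The function name `evaluate` is hypothetical.

```python
def evaluate(blocks, T, T_in, T_out, A, overhead, a_fifo):
    """Return (r_all, a_all) for a list of (i, j, degree) blocks.

    T, T_in, T_out, A are dicts keyed by (i, j); all values are hypothetical.
    """
    rates = []
    area = a_fifo
    for i, j, degree in blocks:
        # Eq. (2): 'degree' parallel copies, capped by the I/O rate (assumption).
        compute_rate = degree / T[i, j]
        io_rate = 1.0 / max(T_in[i, j], T_out[i, j])
        rates.append(min(compute_rate, io_rate))
        # Eq. (3): replication overhead plus the replicated block area.
        area += (degree - 1) * overhead + degree * A[i, j]
    # Eq. (1): the slowest block bounds the system throughput.
    return min(rates), area


# Two blocks: F1..F2 with two copies, F3 alone.
r_all, a_all = evaluate(
    blocks=[(1, 2, 2), (3, 3, 1)],
    T={(1, 2): 8.0, (3, 3): 5.0},
    T_in={(1, 2): 2.0, (3, 3): 1.0},
    T_out={(1, 2): 1.0, (3, 3): 2.0},
    A={(1, 2): 100, (3, 3): 60},
    overhead=10,
    a_fifo=20,
)
print(r_all, a_all)  # 0.2 290
```

Here the unreplicated block F3 is the bottleneck (rate 1/5), and the area is 20 + (10 + 200) + 60 = 290.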
19 Heuristic Solution: Overview
- Motivation: MILP is not scalable
  - Bad feasible regions may incur a long running time even when N is small
- Consider partition and parallelization separately (constructive algorithm):
  - Parallelization before partition to increase throughput: Incx()
  - Partition for the given parallelization to reduce area: Clust()
- Implement Incx() and Clust() in an iterative way with backtracking
20 Heuristic Solution: Algorithm
- [Flowchart, boxes in order:]
  - Do Incx() until $R_{req}$ is satisfied
  - Clust()
  - Calculate $r_{all}$ and $a_{all}$
  - Decision: does $a_{all}$ violate $A_{req}$? (Yes / No)
  - Incx()
  - Backtrack to the last parallelization strategy
  - Decision: is this situation considered yet? (Yes / No)
  - Done
- Incx(): parallelization before partition, to increase throughput
- Clust(): partition for the given parallelization, to reduce area
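The first box of the flow, repeating Incx() until $R_{req}$ is met, could look like the following sketch. It assumes that Incx() raises the parallelism degree of the slowest function by one, that a function with degree d and per-item time t sustains a rate of d / t, and that I/O limits are ignored. The name `incx_until` and the `max_degree` guard are hypothetical additions, and the backtracking part of the flow is not modelled.

```python
def incx_until(times, r_req, max_degree=64):
    """Raise per-function parallelism until every function meets r_req.

    times[n] is the (hypothetical) time function n needs per data item.
    """
    degrees = [1] * len(times)
    while True:
        rates = [d / t for d, t in zip(degrees, times)]
        # The bottleneck is the function with the lowest sustained rate.
        bottleneck = min(range(len(times)), key=rates.__getitem__)
        if rates[bottleneck] >= r_req:
            return degrees
        if degrees[bottleneck] >= max_degree:
            raise RuntimeError("throughput target unreachable within max_degree")
        degrees[bottleneck] += 1  # one Incx() step


degrees = incx_until([4.0, 10.0, 2.0], r_req=0.5)
print(degrees)  # [2, 5, 1]
```

In this toy run the slow second function ends up with five copies and the first with two, after which all three functions sustain exactly 0.5 items per time unit.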
21 Heuristic Solution: Algorithm (cont'd)
- Incx(), parallelization before partition:
  - Increase the parallelization degree of the bottleneck function
- Clust(), partition under the given parallelization:
  - Model the blocks and their connections as a graph
  - Convert the problem to a shortest-path problem
- [Figure: graph from Begin to End for three functions, with block nodes $B_{1,1}$, $B_{2,2}$, $B_{3,3}$, $B_{1,2}$, $B_{2,3}$, $B_{1,3}$ and edge weights 0 and $A_{1,1}$, $A_{2,2}$, $A_{3,3}$, $A_{1,2}$, $A_{2,3}$]
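The shortest-path idea behind Clust() can be sketched with a small dynamic program. This is a simplified equivalent of the slide's graph, not the exact construction: node k stands for "functions 1..k are already covered", and an edge from k to j corresponds to the block $F_{k+1} \dots F_j$ with its area as the weight. The function name `clust`, the `block_cost` callback, and the area values are hypothetical; an infeasible block (for example one that would miss the throughput target) is given infinite cost.

```python
import math


def clust(n, block_cost):
    """Minimum-area partition of F_1..F_n into consecutive blocks.

    block_cost(i, j) returns the area of block F_i..F_j, or math.inf
    when that block is not allowed.
    """
    best = [math.inf] * (n + 1)  # best[k]: cheapest cover of F_1..F_k
    prev = [None] * (n + 1)      # prev[k]: first function of the last block
    best[0] = 0.0
    for j in range(1, n + 1):
        for i in range(1, j + 1):
            cost = best[i - 1] + block_cost(i, j)
            if cost < best[j]:
                best[j], prev[j] = cost, i
    if math.isinf(best[n]):
        return math.inf, []
    # Walk back from the end node to recover the chosen blocks.
    blocks, j = [], n
    while j > 0:
        i = prev[j]
        blocks.append((i, j))
        j = i - 1
    return best[n], blocks[::-1]


# Hypothetical areas: merging functions saves area, but F1..F3 is infeasible.
areas = {(1, 1): 50, (2, 2): 40, (3, 3): 30,
         (1, 2): 70, (2, 3): 55, (1, 3): math.inf}
cost, blocks = clust(3, lambda i, j: areas[(i, j)])
print(cost, blocks)  # 100.0 [(1, 2), (3, 3)]
```

Because the functions form a chain, the graph is acyclic and one left-to-right pass is enough; a general shortest-path routine such as Dijkstra's algorithm would give the same answer.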
23 Experiments: Setup
- 7 benchmarks [21]:
  - ADPCM
  - JPEG encoder/decoder
  - AES encryption/decryption
  - GSM
  - Filter Groups
- Environment:
  - C2RTL: eXCite
  - Logic synthesis: Quartus II (Cyclone II)
  - Simulation: ModelSim
- Flow:
  - eXCite C2RTL tool: modeling
  - Our solution: optimize partition and parallelization
  - eXCite C2RTL tool: implement hardware
  - Altera Quartus tool: area evaluation
  - Mentor ModelSim tool: throughput evaluation
24 Experiments: Validate proposed method
- Min. area for the GSM case
- Heuristic solutions differ from the MILP results by 2.3% on average
25 Experiments: Validate proposed method (cont'd)
- Min. area for the 7 benchmarks
- Heuristic solutions differ by 7.5% on average
26 Experiments: Running time
- Running time: the heuristic solutions are worse by 7.2% on average
28 Conclusions and Future Work
- Conclusions:
  - Our work adopts a hierarchical framework with automatic C-code partition and block-level parallelization
  - Both an MILP-based solution and a heuristic solution are proposed
  - Experimental results obtained from seven real applications show that our approaches are effective
- Future work:
  - Extend the solution to C programs with feedback
  - Take power into consideration
29 References
- [1]-[27] are listed in the paper
- [28] Comparison of high level design methodologies for algorithmic IPs: Bluespec and C-based synthesis, Ph.D. dissertation, MIT, 2009
- [29] ITRS roadmap on Design, 2011 Edition
30 Thank you!
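31 MILP-Based Solution: Linearization
- Linearize $x_j y_{i,j}$ by introducing $z_{i,j} = x_j y_{i,j}$:

$$-M y_{i,j} \le z_{i,j} \le M y_{i,j}$$

$$x_j - M(1 - y_{i,j}) \le z_{i,j} \le x_j + M(1 - y_{i,j})$$

- Linearize Equation (1):

$$r_{all} \le r_{i,j} + M(1 - y_{i,j}), \quad 1 \le i \le j \le N$$

- Linearize Equation (2):

$$r_{i,j} \le z_{i,j} / T_{i,j}, \qquad r_{i,j} \le 1 / \max\{T^{in}_{i,j}, T^{out}_{i,j}\}$$

- Linearize Equation (4):

$$\sum_{i=1}^{n} y_{i,n} \le x_n, \qquad M \sum_{i=1}^{n} y_{i,n} \ge x_n, \qquad y_{i,j} \text{ binary}$$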
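A quick brute-force check shows why the big-M constraints on $z_{i,j}$ reproduce the product $x_j y_{i,j}$. The sketch below is only an illustration: it restricts $z$ to integers, picks a small hypothetical bound `BIG_M = 8`, and assumes $0 \le x_j \le M$. The helper name `feasible_z` is not from the original work.

```python
def feasible_z(x, y, big_m):
    """All integer z that satisfy the two big-M constraint pairs."""
    return [
        z
        for z in range(-big_m, big_m + 1)
        if -big_m * y <= z <= big_m * y
        and x - big_m * (1 - y) <= z <= x + big_m * (1 - y)
    ]


BIG_M = 8
# For every binary y and every x in [0, M], the only feasible z is x * y.
checks = [
    feasible_z(x, y, BIG_M) == [x * y]
    for y in (0, 1)
    for x in range(0, BIG_M + 1)
]
print(all(checks))  # True
```

With $y_{i,j} = 0$ the first pair forces $z_{i,j} = 0$ and the second pair is slack; with $y_{i,j} = 1$ the second pair forces $z_{i,j} = x_j$ and the first pair is slack.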
32 Experiments: Validate proposed method (cont'd)
- Min. area or max. throughput for GSM
Joint International Mechanical, Electronic and Information Technology Conference (JIMET 2015) FPGA Implementation of Rate Control for JPEG2000 Shijie Qiao1, a *, Aiqing Yi1, b and Yuan Yang1,c 1 Department
More informationAdvanced ALTERA FPGA Design
Advanced ALTERA FPGA Design Course Description This course focuses on advanced FPGA design topics in Quartus software. The first part covers advanced timing closure problems, analysis and solutions. The
More information2.5G Reed-Solomon II MegaCore Function Reference Design
2.5G Reed-Solomon II MegaCore Function Reference Design AN-642-1.0 Application Note The Altera 2.5G Reed-Solomon (RS) II MegaCore function reference design demonstrates a basic application of the Reed-Solomon
More informationDSP Builder Handbook Volume 1: Introduction to DSP Builder
DSP Builder Handbook Volume 1: Introduction to DSP Builder DSP Builder Handbook 101 Innovation Drive San Jose, CA 95134 www.altera.com HB_DSPB_INTRO-4.0 Document last updated for Altera Complete Design
More informationALTERA FPGAs Architecture & Design
ALTERA FPGAs Architecture & Design Course Description This course provides all theoretical and practical know-how to design programmable devices of ALTERA with QUARTUS-II design software. The course combines
More informationOutline. EECS Components and Design Techniques for Digital Systems. Lec 11 Putting it all together Where are we now?
Outline EECS 5 - Components and Design Techniques for Digital Systems Lec Putting it all together -5-4 David Culler Electrical Engineering and Computer Sciences University of California Berkeley Top-to-bottom
More informationA Novel Design Framework for the Design of Reconfigurable Systems based on NoCs
Politecnico di Milano & EPFL A Novel Design Framework for the Design of Reconfigurable Systems based on NoCs Vincenzo Rana, Ivan Beretta, Donatella Sciuto Donatella Sciuto sciuto@elet.polimi.it Introduction
More informationOASIS Network-on-Chip Prototyping on FPGA
Master thesis of the University of Aizu, Feb. 20, 2012 OASIS Network-on-Chip Prototyping on FPGA m5141120, Kenichi Mori Supervised by Prof. Ben Abdallah Abderazek Adaptive Systems Laboratory, Master of
More informationTSIU03, SYSTEM DESIGN LECTURE 2
LINKÖPING UNIVERSITY Department of Electrical Engineering TSIU03, SYSTEM DESIGN LECTURE 2 Mario Garrido Gálvez mario.garrido.galvez@liu.se Linköping, 2018 1 From 1bit to several bits. TODAY - Review of
More informationAvailable online at ScienceDirect. The 4th International Conference on Electrical Engineering and Informatics (ICEEI 2013)
Available online at www.sciencedirect.com ScienceDirect Procedia Technology 11 ( 2013 ) 206 211 The 4th International Conference on Electrical Engineering and Informatics (ICEEI 2013) FPGA Implementation
More informationAn approach to accelerate UVM based verification environment
An approach to accelerate UVM based verification environment Sachish Dhar DWIVEDI/Ravi Prakash GUPTA Hardware Emulation and Verification Solutions ST Microelectronics Pvt Ltd Outline Challenges in SoC
More informationDual-Core Execution: Building A Highly Scalable Single-Thread Instruction Window
Dual-Core Execution: Building A Highly Scalable Single-Thread Instruction Window Huiyang Zhou School of Computer Science University of Central Florida New Challenges in Billion-Transistor Processor Era
More informationA Modified CORDIC Processor for Specific Angle Rotation based Applications
IOSR Journal of VLSI and Signal Processing (IOSR-JVSP) Volume 4, Issue 2, Ver. II (Mar-Apr. 2014), PP 29-37 e-issn: 2319 4200, p-issn No. : 2319 4197 A Modified CORDIC Processor for Specific Angle Rotation
More informationThe Lekha 3GPP LTE Turbo Decoder IP Core meets 3GPP LTE specification 3GPP TS V Release 10[1].
Lekha IP Core: LW RI 1002 3GPP LTE Turbo Decoder IP Core V1.0 The Lekha 3GPP LTE Turbo Decoder IP Core meets 3GPP LTE specification 3GPP TS 36.212 V 10.5.0 Release 10[1]. Introduction The Lekha IP 3GPP
More informationHardware/Software Codesign
Hardware/Software Codesign 3. Partitioning Marco Platzner Lothar Thiele by the authors 1 Overview A Model for System Synthesis The Partitioning Problem General Partitioning Methods HW/SW-Partitioning Methods
More information