Folding. Lan-Da Van ( 范倫達 ), Ph. D. Department of Computer Science National Chiao Tung University Taiwan, R.O.C. Fall,
1 Folding. Lan-Da Van ( 范倫達 ), Ph. D., Department of Computer Science, National Chiao Tung University, Taiwan, R.O.C. Fall, 2010. ldvan@cs.nctu.edu.tw
2 Outline Introduction Folding Transformation Register Minimization Techniques Register Minimization in Folded Architecture Conclusions VLSI-DSP-6-2
3 Introduction (1/2)
- The folding transformation systematically determines the control circuits in DSP architectures where multiple algorithm operations are time-multiplexed onto a single functional unit.
- It is used to synthesize DSP architectures that can be operated with single or multiple clocks.
- It reduces the number of hardware functional units (FUs) by a factor of N at the expense of increasing the computation time by a factor of N.
- Folding can lead to an architecture that uses a large number of registers, so register minimization techniques are also presented.
4 Introduction (2/2)
6 Folding Transformation (1/3)
A systematic technique for designing control circuits for hardware in which several algorithm operations are time-multiplexed on a single functional unit.
Notation:
- $U$, $V$: nodes (operations) of the original DFG
- $H_U$, $H_V$: nodes (functional units) of the folded DFG
- $W(x)$: x-th iteration of node $W$
- $U \xrightarrow{e} V$: an edge $e$ from node $U$ to node $V$
- $w(e)$: number of delays on edge $e$
- Folding factor $N$: number of operations that share one FU
- Folding set: an ordered set of operations that are executed by the same FU
  - The position of an operation $U$ in its folding set is the folding order of $U$.
  - The folding sets are typically obtained from a scheduling and allocation algorithm (ref. Appendix B).
  - The folding sets represent the underlying folding transformation.
7 Folding Transformation (2/3)
- $P_U$: number of pipeline stages of $H_U$. $P_U = 0$ indicates that $H_U$ is not pipelined.
- $D_F(U \xrightarrow{e} V)$ (folding equation): number of cycles for which the result of $H_U$ must be stored:
  $$D_F(U \xrightarrow{e} V) = [N(l + w(e)) + v] - [Nl + u + P_U] = N w(e) - P_U + v - u$$
- A folding equation can evaluate to a negative $D_F$ before the DFG is retimed for folding.
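The folding equation can be evaluated mechanically for each edge. The following is a minimal sketch, not part of the slides: the function name `folding_delay` and the sample edge values are assumptions chosen for illustration.

```python
def folding_delay(N, w, P_U, u, v):
    """Cycles for which the result of H_U must be stored for edge U -> V.

    N    : folding factor
    w    : number of delays on the edge in the original DFG
    P_U  : pipeline stages of the functional unit that executes U
    u, v : folding orders of U and V (0 .. N-1)
    """
    # The result of U is available at N*l + u + P_U and is consumed
    # at N*(l + w) + v; the difference does not depend on l.
    return N * w - P_U + v - u


# Hypothetical edge: N = 4, one delay, 1-stage unit, u = 3, v = 1.
print(folding_delay(4, 1, 1, 3, 1))   # 4 - 1 + 1 - 3 = 1
# The same edge without a delay yields a negative value.
print(folding_delay(4, 0, 1, 3, 1))   # 0 - 1 + 1 - 3 = -3
```

Because the iteration index l cancels, a single evaluation per edge is enough to describe every iteration.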
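8 Folding Transformation (3/3)
- Original DFG: iteration $l$ of $U$ feeds iteration $l + w(e)$ of $V$ through an edge with $w(e)$ delays.
- After folding by $N$: $H_U$ executes $U$ at time unit $Nl + u$, its result passes through the $P_U$ pipeline stages and $D_F$ delays, and $H_V$ executes $V$ at time unit $N(l + w(e)) + v$.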
9 Folding Retimed Biquad Filter (1/2)
- Folding factor $N = 4$.
- Folding sets $S_1 = \{4, 2, 3, 1\}$ and $S_2 = \{5, 8, 6, 7\}$, where $S_1$ contains all addition operations and $S_2$ contains all multiplication operations.
- Assume that addition and multiplication require 1 and 2 u.t., respectively.
- 1-stage pipelined adders and 2-stage pipelined multipliers are available.
10 Folding Retimed Biquad Filter (2/2): folding equations
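11 Retiming (1/3)
- What happens if a folding equation $D_F$ is negative? A negative $D_F$ would require a negative number of delays, which cannot be implemented.
- Remedy: retime the original DFG (move its delay elements) prior to folding.
- Constraint on every edge of the retimed DFG, where $D_F'$ is the folding equation after retiming and $w_r(e)$ is the retimed delay count:
  $$D_F'(U \xrightarrow{e} V) = N w_r(e) - P_U + v - u \ge 0 \quad (1)$$
- Substituting $w_r(e) = w(e) + r(V) - r(U)$ into (1) gives, in terms of the folding equation $D_F$ of the original DFG,
  $$r(U) - r(V) \le \frac{D_F(U \xrightarrow{e} V)}{N}$$
- Since the retiming values of the nodes are restricted to integers, this can be rewritten as
  $$r(U) - r(V) \le \left\lfloor \frac{D_F(U \xrightarrow{e} V)}{N} \right\rfloor$$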
12 Retiming (2/3)
Example:
- $D_F(1 \to 2) = N w(e) - P_U + v - u = 4(0) - 1 + 1 - 3 = -3$
- $r(1) - r(2) \le \lfloor D_F(1 \to 2)/N \rfloor = \lfloor -3/4 \rfloor = -1$
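The floor in the retiming constraint matters for negative values, as the small sketch below illustrates. The helper names `retiming_bound` and `retimed_folding_delay` are hypothetical, and the retiming values passed in the last call are assumed only to demonstrate a feasible choice.

```python
def retiming_bound(D_F, N):
    """Upper bound on r(U) - r(V) for an edge whose un-retimed folding delay is D_F."""
    # Python's // rounds toward negative infinity, which matches floor(D_F / N).
    return D_F // N


def retimed_folding_delay(D_F, N, r_u, r_v):
    """Folding delay after retiming, using w_r(e) = w(e) + r(V) - r(U)."""
    return D_F + N * (r_v - r_u)


print(retiming_bound(-3, 4))                 # floor(-3/4) = -1
# Assumed retiming values with r(U) - r(V) = -1, which meets the bound:
print(retimed_folding_delay(-3, 4, -1, 0))   # -3 + 4 = 1
```

Truncating toward zero instead of flooring would give a bound of 0 here, which would leave the folding delay negative.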
13 Retiming (3/3)
Retiming values: r(1) = -1, r(2) = 0, r(3) = -1, r(4) = 0, r(5) = -1, r(6) = -1, r(7) = -2, r(8) = -1
15 Lifetime Analysis
- Lifetime analysis is a procedure used to compute the minimum number of registers required to implement a DSP algorithm in hardware.
  - Linear lifetime analysis
  - Circular lifetime analysis
- In lifetime analysis, the number of live variables at each time unit is computed, and the maximum number of live variables over all time units is determined.
- Forward-backward register allocation technique
16 Linear Lifetime Analysis
- Variables {a, b, c}: the number of live variables per time unit is {0, 1, 2, 2, 2, 2, 2, 2}, so max{0, 1, 2, 2, 2, 2, 2, 2} = 2.
- Periodicity is implicit; the chart shows three iterations with N = 6.
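Counting live variables per time unit is straightforward to sketch in code. The lifetimes below are assumptions chosen only so that the counts match the ones quoted on the slide; the slide's own chart is not reproduced in this transcript. The sketch also assumes that a variable produced at time `t_in` and last consumed at `t_out` is live during time units `t_in + 1` through `t_out`.

```python
def live_counts(lifetimes, horizon):
    """Number of live variables at each time unit 0 .. horizon.

    lifetimes: {name: (t_in, t_out)}; a variable is assumed to be live
    during time units t_in + 1 .. t_out.
    """
    counts = [0] * (horizon + 1)
    for t_in, t_out in lifetimes.values():
        for t in range(t_in + 1, t_out + 1):
            counts[t] += 1
    return counts


# Assumed lifetimes (hypothetical, picked to reproduce the quoted counts).
lifetimes = {"a": (0, 4), "b": (1, 7), "c": (4, 7)}
counts = live_counts(lifetimes, horizon=7)
print(counts)        # [0, 1, 2, 2, 2, 2, 2, 2]
print(max(counts))   # 2
```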
17 Matrix Transpose Example (1/3)
Input matrix and its transpose:
  a b c        a d g
  d e f   ->   b e h
  g h i        c f i
Matrix transposer: input stream i h g f e d c b a, output stream i f c h e b g d a (the rightmost sample is first in time).
18 Matrix Transpose Example (2/3)
- $T_{zlout}$: zero-latency output time
- $T_{diff} = T_{zlout} - T_{input}$
- $T_{output} = T_{zlout} + \max\{-T_{diff}\}$
19 Matrix Transpose Example (3/3)
Figures: linear lifetime chart and circular lifetime chart. The minimum number of registers is 4.
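The lifetime table and register count for the 3 x 3 transposer can be reproduced with a short script. This is a sketch under stated assumptions: samples arrive one per time unit in row-major order starting at time 0, a variable is live during time units T_input + 1 through T_output, and the function names are hypothetical.

```python
def transpose_lifetimes(n=3):
    """Lifetime table {name: (T_input, T_output)} for a streamed n x n transpose."""
    names = [chr(ord("a") + k) for k in range(n * n)]
    # Element (r, c) arrives at time r*n + c (row-major input order).
    t_input = {names[r * n + c]: r * n + c for r in range(n) for c in range(n)}
    # Zero-latency output time: element (r, c) leaves at position c*n + r.
    t_zlout = {names[r * n + c]: c * n + r for r in range(n) for c in range(n)}
    # max{-T_diff} is the smallest latency that makes every lifetime non-negative.
    latency = max(t_input[x] - t_zlout[x] for x in names)
    return {x: (t_input[x], t_zlout[x] + latency) for x in names}


def min_registers(lifetimes, N):
    """Circular lifetime analysis: fold the live counts onto a period of N."""
    live = [0] * N
    for t_in, t_out in lifetimes.values():
        for t in range(t_in + 1, t_out + 1):
            live[t % N] += 1
    return max(live)


table = transpose_lifetimes(3)
print(table["c"])                  # (2, 10)
print(min_registers(table, N=9))   # 4
```

Folding the counts modulo N is what distinguishes the circular analysis from the linear one: lifetimes that extend past the end of one iteration overlap the start of the next.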
20 Procedures of Forward-Backward Register Allocation
Steps:
- Step 1: Determine the minimum number of registers using lifetime analysis.
- Step 2: Input each variable at the time step corresponding to the beginning of its lifetime.
- Step 3: Allocate each variable in a forward manner until it is dead or it reaches the last register.
- Step 4: Since the allocation is periodic, the allocation of the current iteration repeats itself in subsequent iterations. Thus, the register positions occupied in the current iteration are hashed out at the same positions N time units later.
- Step 5: If a variable reaches the last register and is still alive, allocate it to a register in a backward manner.
- Step 6: Repeat Steps 4 and 5 as required until the allocation is complete.
21 Register Allocation for Matrix Transpose Example
23 Procedures of Register Minimization in Folded Architectures
Steps:
- Step 1: Perform retiming for folding.
- Step 2: Write the folding equations.
- Step 3: Use the folding equations to construct a lifetime table.
- Step 4: Draw the lifetime chart and determine the required number of registers.
- Step 5: Perform forward-backward register allocation.
- Step 6: Draw the folded architecture that uses the minimum number of registers.
24 Folding Architecture Example
25 Folded Architecture for Matrix Transpose Example
26 Biquad Filter Example (1/4)
Step 1: Retiming. Folding the original DFG is invalid because the following folding equations are negative:
- $D_F(1 \to 2) = -3$
- $D_F(6 \to 4) = -4$
- $D_F(8 \to 4) = -3$
- $D_F(7 \to 3) = -3$
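27 Biquad Filter Example (2/4)
Step 2: Folding equations, $D_F(U \to V) = N w(e) - P_U + v - u$:
- $D_F(1 \to 2) = 4(1) - 1 + 1 - 3 = 1$
- $D_F(1 \to 5) = 4(1) - 1 + 0 - 3 = 0$
- $D_F(1 \to 6) = 4(1) - 1 + 2 - 3 = 2$
- $D_F(1 \to 7) = 4(1) - 1 + 3 - 3 = 3$
- $D_F(1 \to 8) = 4(2) - 1 + 1 - 3 = 5$
- $D_F(3 \to 1) = 4(0) - 1 + 3 - 2 = 0$
- $D_F(4 \to 2) = 4(0) - 1 + 1 - 0 = 0$
- $D_F(5 \to 3) = 4(0) - 2 + 2 - 0 = 0$
- $D_F(6 \to 4) = 4(1) - 2 + 0 - 2 = 0$
- $D_F(7 \to 3) = 4(1) - 2 + 2 - 3 = 1$
- $D_F(8 \to 4) = 4(1) - 2 + 0 - 1 = 1$
Step 3: Construct the lifetime table:
- $T_{input} = u + P_U$
- $T_{output} = u + P_U + \max_V \{D_F(U \to V)\}$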
28 Biquad Filter Example (3/4)
Step 4: Draw the lifetime chart.
Step 5: Register allocation.
Folding factor = 4. The minimum number of registers is 2.
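Steps 2 to 4 for the biquad filter can be checked with a short script. This is a sketch under stated assumptions: folding orders are taken as positions in the folding sets S1 = {4, 2, 3, 1} and S2 = {5, 8, 6, 7}, edge delays are the w(e) factors of the retimed folding equations in Step 2, and a variable is taken to be live during time units T_input + 1 through T_output. The function names are hypothetical.

```python
N = 4
# Folding order = position in the folding set.
order = {4: 0, 2: 1, 3: 2, 1: 3,    # S1: adders, 1 pipeline stage
         5: 0, 8: 1, 6: 2, 7: 3}    # S2: multipliers, 2 pipeline stages
stages = {n: 1 for n in (1, 2, 3, 4)}
stages.update({n: 2 for n in (5, 6, 7, 8)})

# (U, V, w(e)) for each edge of the retimed DFG.
edges = [(1, 2, 1), (1, 5, 1), (1, 6, 1), (1, 7, 1), (1, 8, 2), (3, 1, 0),
         (4, 2, 0), (5, 3, 0), (6, 4, 1), (7, 3, 1), (8, 4, 1)]


def lifetime_table():
    """{U: (T_input, T_output)} with T_output = T_input + max over V of D_F(U -> V)."""
    table = {}
    for U in sorted({u for u, _, _ in edges}):
        t_in = order[U] + stages[U]
        d_max = max(N * w - stages[U] + order[V] - order[U]
                    for u, V, w in edges if u == U)
        table[U] = (t_in, t_in + d_max)
    return table


def min_registers(table):
    """Circular lifetime analysis with period N."""
    live = [0] * N
    for t_in, t_out in table.values():
        for t in range(t_in + 1, t_out + 1):
            live[t % N] += 1
    return max(live)


table = lifetime_table()
print(table[1])               # (4, 9)
print(min_registers(table))   # 2
```

Node 2 has no outgoing edge in the list, so it contributes no entry to the table.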
29 Biquad Filter Example (4/4)
Step 6: Folded architecture.
30 IIR Filter Example (1/4)
Step 1: Retiming. Folding the original DFG is invalid because the following folding equations are negative:
- $D_F(3 \to 1) = -3$
- $D_F(4 \to 1) = -2$
31 IIR Filter Example (2/4)
Step 2: Folding equations, $D_F(U \to V) = N w(e) - P_U + v - u$:
- $D_F(1 \to 2) = 4(1) - \dots = 0$
- $D_F(2 \to 3) = 4(1) - \dots = 5$
- $D_F(2 \to 4) = 4(1) - \dots = 2$
- $D_F(3 \to 1) = 4(1) - \dots = 1$
- $D_F(4 \to 1) = 4(2) - \dots = 0$
Step 3: Construct the lifetime table:
- $T_{input} = u + P_U$
- $T_{output} = u + P_U + \max_V \{D_F(U \to V)\}$
32 IIR Filter Example (3/4)
Step 4: Draw the lifetime chart.
Step 5: Register allocation.
Folding factor = 2. The minimum number of registers is 3.
33 IIR Filter Example (4/4)
Step 6: Folded architecture.
34 Conclusions
- Presented a systematic transformation for time-multiplexed architectures.
- Explored folding techniques to reduce the number of functional units.
- Explored register minimization techniques to reduce the number of registers.
35 References
- K. K. Parhi, VLSI Digital Signal Processing Systems: Design and Implementation, Wiley.
- S. Y. Huang, Handout of text book.
More informationSardar Patel University S Y BSc. Computer Science CS-201 Introduction to Programming Language Effective from July-2002
Sardar Patel University S Y BSc. Computer Science CS-201 Introduction to Programming Language Effective from July-2002 2 Practicals per week External marks :80 Internal Marks : 40 Total Marks :120 University
More informationECE 545 Fall 2013 Final Exam
ECE 545 Fall 2013 Final Exam Problem 1 Develop an ASM chart for the circuit EXAM from your Midterm Exam, described below using its A. pseudocode B. table of input/output ports C. block diagram D. interface
More informationMaximally and Arbitrarily Fast Implementation of Linear and Feedback Linear Computations
30 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 19, NO. 1, JANUARY 2000 Maximally and Arbitrarily Fast Implementation of Linear and Feedback Linear Computations Miodrag
More informationVLSI Programming 2016: Lecture 3
VLSI Programming 2016: Lecture 3 Course: 2IMN35 Teachers: Kees van Berkel c.h.v.berkel@tue.nl Rudolf Mak r.h.mak@tue.nl Lab: Kees van Berkel, Rudolf Mak, Alok Lele www: http://www.win.tue.nl/~wsinmak/education/2imn35/
More informationComputer Science 160 Translation of Programming Languages
Computer Science 160 Translation of Programming Languages Instructor: Christopher Kruegel Code Optimization Code Optimization What should we optimize? improve running time decrease space requirements decrease
More informationOutline. Introduction to Structured VLSI Design. Signed and Unsigned Integers. 8 bit Signed/Unsigned Integers
Outline Introduction to Structured VLSI Design Integer Arithmetic and Pipelining Multiplication in the digital domain HW mapping Pipelining optimization Joachim Rodrigues Signed and Unsigned Integers n-1
More informationOPTIMIZATION OF FIR FILTER USING MULTIPLE CONSTANT MULTIPLICATION
OPTIMIZATION OF FIR FILTER USING MULTIPLE CONSTANT MULTIPLICATION 1 S.Ateeb Ahmed, 2 Mr.S.Yuvaraj 1 Student, Department of Electronics and Communication/ VLSI Design SRM University, Chennai, India 2 Assistant
More informationVLSI DESIGN OF FLOATING POINT ARITHMETIC & LOGIC UNIT
VLSI DESIGN OF FLOATING POINT ARITHMETIC & LOGIC UNIT 1 DHANABAL R, 2 BHARATHI V, 3 G.SRI CHANDRAKIRAN, 4 BHARATH BHUSHAN REDDY.M 1 Assistant Professor (Senior Grade), VLSI division, SENSE, VIT University,
More informationDynamic Pipeline Design of an Adaptive Binary Arithmetic Coder
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: ANALOG AND DIGITAL SIGNAL PROCESSING, VOL. 48, NO. 9, SEPTEMBER 2001 813 Dynamic Pipeline Design of an Adaptive Binary Arithmetic Coder Shiann Rong Kuang,