Folding. Lan-Da Van ( 范倫達 ), Ph. D. Department of Computer Science National Chiao Tung University Taiwan, R.O.C. Fall,

1 Folding. Lan-Da Van (范倫達), Ph.D., Department of Computer Science, National Chiao Tung University, Taiwan, R.O.C. Fall, 2010. ldvan@cs.nctu.edu.tw

2 Outline: Introduction; Folding Transformation; Register Minimization Techniques; Register Minimization in Folded Architecture; Conclusions. VLSI-DSP-6-2

3 Introduction (1/2). The folding transformation systematically determines the control circuits in DSP architectures in which multiple algorithm operations are time-multiplexed onto a single functional unit. It is used to synthesize DSP architectures that operate with single or multiple clocks. It reduces the number of hardware functional units (FUs) by a factor of N at the expense of increasing the computation time by a factor of N. Folding can lead to an architecture that uses a large number of registers, which is why register minimization techniques are also presented.

4 Introduction (2/2)

5 Outline: Introduction; Folding Transformation; Register Minimization Techniques; Register Minimization in Folded Architecture; Conclusions.

6 Folding Transformation (1/3). Folding is a systematic technique for designing control circuits for hardware in which several algorithm operations are time-multiplexed on a single functional unit. Notation: $U$, $V$: nodes (operations) of the original DFG. $H_U$, $H_V$: nodes (functional units) of the folded DFG. $W(x)$: the $x$-th iteration of node $W$. $U \xrightarrow{e} V$: an edge $e$ from node $U$ to node $V$. $w(e)$: the number of delays on edge $e$. Folding factor $N$: the number of operations that share one FU. Folding set: an ordered set of operations that are executed by the same FU; the position of an operation $U$ in its folding set is the folding order of $U$. Folding sets are typically obtained from a scheduling and allocation algorithm (ref. Appendix B). The folding sets represent the underlying folding transformation.

7 Folding Transformation (2/3). $P_U$: the number of pipeline stages of $H_U$; $P_U = 0$ indicates that $H_U$ is not pipelined. $D_F(U \xrightarrow{e} V)$ (the folding equation): the number of cycles for which the result of $H_U$ must be stored,
$$D_F(U \xrightarrow{e} V) = [N(l + w(e)) + v] - [Nl + P_U + u] = N w(e) - P_U + v - u,$$
where $u$ and $v$ are the folding orders of $U$ and $V$. A negative value of $D_F$ is possible before the DFG is retimed for folding.
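The folding equation is simple enough to evaluate directly. The following is a minimal sketch in Python (the slides themselves contain no code); the function name `folding_delay` is hypothetical. The sample values are an assumption taken from the biquad example used in this deck: N = 4, a 1-stage adder (P_U = 1), and folding orders counted from 0 in S1 = {4, 2, 3, 1}, so node 1 has u = 3 and node 2 has v = 1.

```python
def folding_delay(n, w, p_u, u, v):
    """Return D_F(U -> V) = N*w(e) - P_U + v - u."""
    return n * w - p_u + v - u


# Edge 1 -> 2 of the biquad example (assumed values, see the note above).
d_with_delay = folding_delay(n=4, w=1, p_u=1, u=3, v=1)     # one delay on the edge
d_without_delay = folding_delay(n=4, w=0, p_u=1, u=3, v=1)  # no delay on the edge

print(d_with_delay, d_without_delay)
```

A negative result, as in the second call, signals that the edge cannot be folded as it stands and that the DFG has to be retimed first.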

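8 Folding Transformation (3/3). Figure: in the original DFG, the edge from $U$ to $V$ carries $w(e)$ delays, so the result of iteration $l$ of $U$ is used by iteration $l + w(e)$ of $V$. After folding by $N$, $H_U$ executes this operation at time unit $Nl + u$ and its result is available $P_U$ cycles later; $H_V$ consumes it at time unit $N(l + w(e)) + v$, so the result is stored for $D_F$ cycles on the edge from $H_U$ to $H_V$.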

9 Folding Retimed Biquad Filter (1/2). Folding factor $N = 4$. Folding sets: $S_1 = \{4, 2, 3, 1\}$ and $S_2 = \{5, 8, 6, 7\}$, where $S_1$ contains all addition operations and $S_2$ contains all multiplication operations. Assume that addition and multiplication require 1 and 2 u.t., respectively, and that 1-stage adders and 2-stage pipelined multipliers are available.

10 Folding Retimed Biquad Filter (2/2). Folding equations.

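11 Retiming (1/3). What happens if a folding equation $D_F$ is negative? A negative $D_F$ would require a negative number of delays, which cannot be implemented, so the original DFG is retimed (its delay elements are moved) prior to folding. Constraint for every edge $U \xrightarrow{e} V$, where $D_F'$ is the folding equation of the retimed DFG and $w_r(e)$ is the retimed edge weight:
$$D_F'(U \xrightarrow{e} V) = N w_r(e) - P_U + v - u \ge 0 \quad (1)$$
Substituting $w_r(e) = w(e) + r(V) - r(U)$ into (1) gives
$$r(U) - r(V) \le \frac{D_F(U \xrightarrow{e} V)}{N}.$$
Since the retiming values of the nodes are restricted to be integers, this can be rewritten as
$$r(U) - r(V) \le \left\lfloor \frac{D_F(U \xrightarrow{e} V)}{N} \right\rfloor.$$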

12 Retiming (2/3). Example: $D_F(1 \to 2) = N w(e) - P_U + v - u = 4(0) - 1 + 1 - 3 = -3$, so $r(1) - r(2) \le \lfloor D_F(1 \to 2)/N \rfloor = \lfloor -3/4 \rfloor = -1$.
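The integer constraint can be checked numerically. The sketch below is an illustration in Python and is not part of the original slides; `retiming_bound` and `retimed_weight` are hypothetical helper names. It relies on Python's `//` operator, which rounds toward negative infinity and therefore matches the floor in the constraint. The original edge 1 → 2 is assumed to carry no delay, which is the value consistent with $D_F(1 \to 2) = -3$.

```python
def retiming_bound(d_f, n):
    """Upper bound on r(U) - r(V): floor(D_F / N)."""
    return d_f // n  # floor division, correct for negative d_f as well


def retimed_weight(w, r_u, r_v):
    """Edge weight after retiming: w_r(e) = w(e) + r(V) - r(U)."""
    return w + r_v - r_u


N = 4
bound = retiming_bound(-3, N)        # edge 1 -> 2 before retiming
r1, r2 = -1, 0                       # one choice of retiming values that meets the bound
assert r1 - r2 <= bound

w_r = retimed_weight(0, r1, r2)      # assumed: no delay on the original edge
d_f_retimed = N * w_r - 1 + 1 - 3    # N*w_r(e) - P_U + v - u with P_U = 1, v = 1, u = 3

print(bound, w_r, d_f_retimed)
```

With these retiming values the edge gains one delay and its folding equation becomes non-negative.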

13 Retiming (3/3). A solution of the retiming constraints: r(1) = -1, r(2) = 0, r(3) = -1, r(4) = 0, r(5) = -1, r(6) = -1, r(7) = -2, r(8) = -1.

14 Outline: Introduction; Folding Transformation; Register Minimization Techniques; Register Minimization in Folded Architecture; Conclusions.

15 Lifetime Analysis. Lifetime analysis is a procedure used to compute the minimum number of registers required to implement a DSP algorithm in hardware. Two forms are covered: linear lifetime analysis and circular lifetime analysis. In lifetime analysis, the number of live variables at each time unit is computed, and the maximum number of live variables at any time unit is determined. Registers are then assigned with the forward-backward register allocation technique.

16 Linear Lifetime Analysis. Figure: lifetime chart for the variables {a, b, c}; the number of registers is max{0, 1, 2, 2, 2, 2, 2, 2} = 2. Periodicity is implicit in the chart; three iterations are shown with N = 6.

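17 Matrix Transpose Example (1/3). The 3×3 matrix $\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}$ is transposed to $\begin{bmatrix} a & d & g \\ b & e & h \\ c & f & i \end{bmatrix}$. The matrix transposer receives the input stream i h g f e d c b a and produces the output stream i f c h e b g d a (streams are drawn with the first sample, a, on the right).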

18 Matrix Transpose Example (2/3). $T_{zlout}$ = zero-latency output time; $T_{diff} = T_{zlout} - T_{input}$; $T_{output} = T_{zlout} + \max\{-T_{diff}\}$.

19 Matrix Transpose Example (3/3). Figures: linear lifetime chart and circular lifetime chart. The minimum number of registers is 4.
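The lifetime table and the register count for the matrix transposer can be reproduced with a short script. This is a sketch in Python under stated assumptions, not the slides' own procedure: the samples a to i are assumed to arrive at time units 0 to 8, a variable is counted as live in time units $T_{input} + 1$ through $T_{output}$, and the counts are wrapped modulo the period N = 9 to account for the periodic schedule. The names `lifetime_table` and `min_registers` are hypothetical.

```python
def lifetime_table(order_in, order_out):
    """Return ({var: (T_input, T_output)}, latency) for a stream permutation."""
    t_input = {v: t for t, v in enumerate(order_in)}
    t_zlout = {v: t for t, v in enumerate(order_out)}      # zero-latency output time
    t_diff = {v: t_zlout[v] - t_input[v] for v in order_in}
    latency = max(-d for d in t_diff.values())             # max{-T_diff}
    table = {v: (t_input[v], t_zlout[v] + latency) for v in order_in}
    return table, latency


def min_registers(table, period):
    """Maximum number of live variables over one period (circular count)."""
    live = [0] * period
    for t_in, t_out in table.values():
        for t in range(t_in + 1, t_out + 1):
            live[t % period] += 1
    return max(live)


# Input arrives row by row; the transpose is read out column by column.
table, latency = lifetime_table("abcdefghi", "adgbehcfi")
registers = min_registers(table, period=9)

print(latency, table["c"], registers)
```

Under these assumptions the latency is 4 time units and the maximum number of live variables is 4, which agrees with the minimum register count stated on the slide.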

20 Procedures of Forward-Backward Register Allocation. Steps:
Step 1: Determine the minimum number of registers using lifetime analysis.
Step 2: Input each variable at the time step that corresponds to the beginning of its lifetime.
Step 3: Allocate each variable in a forward manner until it is dead or it reaches the last register.
Step 4: Since the allocation is periodic, the allocation of the current iteration repeats itself in subsequent iterations. Thus, hash (mark as occupied) the register positions at a period of N.
Step 5: If a variable reaches the last register and is still alive, allocate it to a register in a backward manner.
Step 6: Repeat Steps 4 and 5 as required until the allocation is completed.

21 Register Allocation for Matrix Transpose Example

22 Outline: Introduction; Folding Transformation; Register Minimization Techniques; Register Minimization in Folded Architecture; Conclusions.

23 Procedures of Register Minimization in Folded Architectures. Steps:
Step 1: Perform retiming for folding.
Step 2: Write the folding equations.
Step 3: Use the folding equations to construct a lifetime table.
Step 4: Draw the lifetime chart and determine the required number of registers.
Step 5: Perform forward-backward register allocation.
Step 6: Draw the folded architecture that uses the minimum number of registers.

24 Folding Architecture Example

25 Folded Architecture for Matrix Transpose Example

26 Biquad Filter Example (1/4). Step 1: Retiming. Before retiming, the folding is invalid: $D_F(1 \to 2) = -3$, $D_F(6 \to 4) = -4$, $D_F(8 \to 4) = -3$, $D_F(7 \to 3) = -3$.

27 Biquad Filter Example (2/4). Step 2: Folding equations, $D_F(U \to V) = N w(e) - P_U + v - u$. Step 3: Construct the lifetime table, with $T_{input} = u + P_U$ and $T_{output} = u + P_U + \max_V \{D_F(U \to V)\}$.
$D_F(1 \to 2) = 4(1) - 1 + 1 - 3 = 1$
$D_F(1 \to 5) = 4(1) - 1 + 0 - 3 = 0$
$D_F(1 \to 6) = 4(1) - 1 + 2 - 3 = 2$
$D_F(1 \to 7) = 4(1) - 1 + 3 - 3 = 3$
$D_F(1 \to 8) = 4(2) - 1 + 1 - 3 = 5$
$D_F(3 \to 1) = 4(0) - 1 + 3 - 2 = 0$
$D_F(4 \to 2) = 4(0) - 1 + 1 - 0 = 0$
$D_F(5 \to 3) = 4(0) - 2 + 2 - 0 = 0$
$D_F(6 \to 4) = 4(1) - 2 + 0 - 2 = 0$
$D_F(7 \to 3) = 4(1) - 2 + 2 - 3 = 1$
$D_F(8 \to 4) = 4(1) - 2 + 0 - 1 = 1$
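Steps 2 to 4 for the biquad filter can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the slides' own tooling: folding orders are counted from 0 within each folding set, the edge weights are the retimed values $w(e)$ that appear in the folding equations of this example, and a variable is counted as live in time units $T_{input} + 1$ through $T_{output}$, wrapped modulo N. All identifiers are hypothetical.

```python
N = 4
PIPELINE = {"add": 1, "mul": 2}                          # P_U per unit type
FOLDING_SETS = {"add": [4, 2, 3, 1], "mul": [5, 8, 6, 7]}
# (U, V, w(e)) for the retimed biquad filter
EDGES = [(1, 2, 1), (1, 5, 1), (1, 6, 1), (1, 7, 1), (1, 8, 2), (3, 1, 0),
         (4, 2, 0), (5, 3, 0), (6, 4, 1), (7, 3, 1), (8, 4, 1)]

# Folding order and pipeline depth of every node.
order, stages = {}, {}
for kind, nodes in FOLDING_SETS.items():
    for position, node in enumerate(nodes):
        order[node] = position
        stages[node] = PIPELINE[kind]

# Step 2: folding equations D_F(U -> V) = N*w(e) - P_U + v - u.
d_f = {(u, v): N * w - stages[u] + order[v] - order[u] for u, v, w in EDGES}

# Step 3: lifetime table, T_input = u + P_U, T_output = T_input + max D_F.
lifetimes = {}
for (u, v), delay in d_f.items():
    t_in = order[u] + stages[u]
    t_out = max(lifetimes.get(u, (t_in, t_in))[1], t_in + delay)
    lifetimes[u] = (t_in, t_out)

# Step 4: maximum number of live variables over one period.
live = [0] * N
for t_in, t_out in lifetimes.values():
    for t in range(t_in + 1, t_out + 1):
        live[t % N] += 1

print(d_f)
print(lifetimes)
print(max(live))
```

With these inputs every folding equation is non-negative, the formula evaluates $D_F(6 \to 4)$ to 0, and the maximum number of live variables is 2, which matches the minimum register count stated for this example.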

28 Biquad Filter Example (3/4). Step 4: Draw the lifetime chart. Step 5: Register allocation. Folding factor = 4. The minimum number of registers is 2.

29 Biquad Filter Example (4/4). Step 6: Folded architecture.

30 IIR Filter Example (1/4). Step 1: Retiming. Before retiming, the folding is invalid: $D_F(3 \to 1) = -3$, $D_F(4 \to 1) = -2$.

31 IIR Filter Example (2/4). Step 2: Folding equations, $D_F(U \to V) = N w(e) - P_U + v - u$. Step 3: Construct the lifetime table, with $T_{input} = u + P_U$ and $T_{output} = u + P_U + \max_V \{D_F(U \to V)\}$.
$D_F(1 \to 2) = 4(1) - \cdots = 0$
$D_F(2 \to 3) = 4(1) - \cdots = 5$
$D_F(2 \to 4) = 4(1) - \cdots = 2$
$D_F(3 \to 1) = 4(1) - \cdots = 1$
$D_F(4 \to 1) = 4(2) - \cdots = 0$

32 IIR Filter Example (3/4). Step 4: Draw the lifetime chart. Step 5: Register allocation. Folding factor = 2. The minimum number of registers is 3.

33 IIR Filter Example (4/4). Step 6: Folded architecture.

34 Conclusions. A systematic transformation for time-multiplexed architectures was presented. Folding techniques were explored to reduce the number of functional units. A register minimization technique was explored to reduce the number of registers.

35 References. K. K. Parhi, VLSI Digital Signal Processing Systems: Design and Implementation, Wiley. S. Y. Huang, Handout of textbook.


16.10 Exercises. 372 Chapter 16 Code Improvement. be translated as 372 Chapter 16 Code Improvement 16.10 Exercises 16.1 In Section 16.2 we suggested replacing the instruction r1 := r2 / 2 with the instruction r1 := r2 >> 1, and noted that the replacement may not be correct

More information

Homework #2 Solution Due Date: Friday, March 24, 2004

Homework #2 Solution Due Date: Friday, March 24, 2004 Department of Electrical and Computer Engineering University of Wisconsin Madison ECE 734 VLSI Array Structures for Digital Signal Processing Homework #2 Solution Due Date: Friday, March 24, 2004 This

More information

Research Article Design of Synthesizable, Retimed Digital Filters Using FPGA Based Path Solvers with MCM Approach: Comparison and CAD Tool

Research Article Design of Synthesizable, Retimed Digital Filters Using FPGA Based Path Solvers with MCM Approach: Comparison and CAD Tool VLSI Design Volume 204, Article ID 28070, 8 pages http://dx.doi.org/0.55/204/28070 Research Article Design of Synthesizable, Retimed Digital Filters Using FPGA Based Path Solvers with MCM Approach: Comparison

More information

MRPF: An Architectural Transformation for Synthesis of High-Performance and Low-Power Digital Filters

MRPF: An Architectural Transformation for Synthesis of High-Performance and Low-Power Digital Filters MRPF: An Architectural Transformation for Synthesis of High-Performance and Low-Power Digital Filters Hunsoo Choo, Khurram Muhammad, Kaushik Roy Electrical & Computer Engineering Department Texas Instruments

More information

Rate-Optimal Unfolding of Balanced Synchronous Data-Flow Graphs

Rate-Optimal Unfolding of Balanced Synchronous Data-Flow Graphs Rate-Optimal Unfolding of Balanced Synchronous Data-Flow Graphs Timothy W. O Neil Dept. of Computer Science, The University of Akron, Akron OH 44325 USA toneil@uakron.edu Abstract Many common iterative

More information

Example 1: Give the coordinates of the points on the graph.

Example 1: Give the coordinates of the points on the graph. Ordered Pairs Often, to get an idea of the behavior of an equation, we will make a picture that represents the solutions to the equation. A graph gives us that picture. The rectangular coordinate plane,

More information

CHAPTER 3 METHODOLOGY. 3.1 Analysis of the Conventional High Speed 8-bits x 8-bits Wallace Tree Multiplier

CHAPTER 3 METHODOLOGY. 3.1 Analysis of the Conventional High Speed 8-bits x 8-bits Wallace Tree Multiplier CHAPTER 3 METHODOLOGY 3.1 Analysis of the Conventional High Speed 8-bits x 8-bits Wallace Tree Multiplier The design analysis starts with the analysis of the elementary algorithm for multiplication by

More information

Simulink-Hardware Flow

Simulink-Hardware Flow 5/2/22 EE26B: VLSI Signal Processing Simulink-Hardware Flow Prof. Dejan Marković ee26b@gmail.com Development Multiple design descriptions Algorithm (MATLAB or C) Fixed point description RTL (behavioral,

More information

Implementation of Two Level DWT VLSI Architecture

Implementation of Two Level DWT VLSI Architecture V. Revathi Tanuja et al Int. Journal of Engineering Research and Applications RESEARCH ARTICLE OPEN ACCESS Implementation of Two Level DWT VLSI Architecture V. Revathi Tanuja*, R V V Krishna ** *(Department

More information

Verilog for High Performance

Verilog for High Performance Verilog for High Performance Course Description This course provides all necessary theoretical and practical know-how to write synthesizable HDL code through Verilog standard language. The course goes

More information

Defect Tolerance in VLSI Circuits

Defect Tolerance in VLSI Circuits Defect Tolerance in VLSI Circuits Prof. Naga Kandasamy We will consider the following redundancy techniques to tolerate defects in VLSI circuits. Duplication with complementary logic (physical redundancy).

More information

CS 151 Final. (Last Name) (First Name)

CS 151 Final. (Last Name) (First Name) CS 151 Final Name Student ID Signature :, (Last Name) (First Name) : : Instructions: 1. Please verify that your paper contains 20 pages including this cover. 2. Write down your Student-Id on the top of

More information

Tree Structure and Algorithms for Physical Design

Tree Structure and Algorithms for Physical Design Tree Structure and Algorithms for Physical Design Chung Kuan Cheng, Ronald Graham, Ilgweon Kang, Dongwon Park and Xinyuan Wang CSE and ECE Departments UC San Diego Outline: Introduction Ancestor Trees

More information

High Performance Integer DCT Architectures for HEVC

High Performance Integer DCT Architectures for HEVC 2017 30th International Conference on VLSI Design and 2017 16th International Conference on Embedded Systems High Performance Integer DCT Architectures for HEVC Mohamed Asan Basiri M, Department of Computer

More information

EECS 151/251A Fall 2017 Digital Design and Integrated Circuits. Instructor: John Wawrzynek and Nicholas Weaver. Lecture 14 EE141

EECS 151/251A Fall 2017 Digital Design and Integrated Circuits. Instructor: John Wawrzynek and Nicholas Weaver. Lecture 14 EE141 EECS 151/251A Fall 2017 Digital Design and Integrated Circuits Instructor: John Wawrzynek and Nicholas Weaver Lecture 14 EE141 Outline Parallelism EE141 2 Parallelism Parallelism is the act of doing more

More information

Word-Level Equivalence Checking in Bit-Level Accuracy by Synthesizing Designs onto Identical Datapath

Word-Level Equivalence Checking in Bit-Level Accuracy by Synthesizing Designs onto Identical Datapath 972 PAPER Special Section on Formal Approach Word-Level Equivalence Checking in Bit-Level Accuracy by Synthesizing Designs onto Identical Datapath Tasuku NISHIHARA a), Member, Takeshi MATSUMOTO, and Masahiro

More information

VHDL for Synthesis. Course Description. Course Duration. Goals

VHDL for Synthesis. Course Description. Course Duration. Goals VHDL for Synthesis Course Description This course provides all necessary theoretical and practical know how to write an efficient synthesizable HDL code through VHDL standard language. The course goes

More information

Efficient complex multiplication and fast fourier transform (FFT) implementation on the ManArray architecture

Efficient complex multiplication and fast fourier transform (FFT) implementation on the ManArray architecture ( 6 of 11 ) United States Patent Application 20040221137 Kind Code Pitsianis, Nikos P. ; et al. November 4, 2004 Efficient complex multiplication and fast fourier transform (FFT) implementation on the

More information

Sample Solutions to Homework #4

Sample Solutions to Homework #4 National Taiwan University Handout #25 Department of Electrical Engineering January 02, 207 Algorithms, Fall 206 TA: Zhi-Wen Lin and Yen-Chun Liu Sample Solutions to Homework #4. (0) (a) Both of the answers

More information

Sardar Patel University S Y BSc. Computer Science CS-201 Introduction to Programming Language Effective from July-2002

Sardar Patel University S Y BSc. Computer Science CS-201 Introduction to Programming Language Effective from July-2002 Sardar Patel University S Y BSc. Computer Science CS-201 Introduction to Programming Language Effective from July-2002 2 Practicals per week External marks :80 Internal Marks : 40 Total Marks :120 University

More information

ECE 545 Fall 2013 Final Exam

ECE 545 Fall 2013 Final Exam ECE 545 Fall 2013 Final Exam Problem 1 Develop an ASM chart for the circuit EXAM from your Midterm Exam, described below using its A. pseudocode B. table of input/output ports C. block diagram D. interface

More information

Maximally and Arbitrarily Fast Implementation of Linear and Feedback Linear Computations

Maximally and Arbitrarily Fast Implementation of Linear and Feedback Linear Computations 30 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 19, NO. 1, JANUARY 2000 Maximally and Arbitrarily Fast Implementation of Linear and Feedback Linear Computations Miodrag

More information

VLSI Programming 2016: Lecture 3

VLSI Programming 2016: Lecture 3 VLSI Programming 2016: Lecture 3 Course: 2IMN35 Teachers: Kees van Berkel c.h.v.berkel@tue.nl Rudolf Mak r.h.mak@tue.nl Lab: Kees van Berkel, Rudolf Mak, Alok Lele www: http://www.win.tue.nl/~wsinmak/education/2imn35/

More information

Computer Science 160 Translation of Programming Languages

Computer Science 160 Translation of Programming Languages Computer Science 160 Translation of Programming Languages Instructor: Christopher Kruegel Code Optimization Code Optimization What should we optimize? improve running time decrease space requirements decrease

More information

Outline. Introduction to Structured VLSI Design. Signed and Unsigned Integers. 8 bit Signed/Unsigned Integers

Outline. Introduction to Structured VLSI Design. Signed and Unsigned Integers. 8 bit Signed/Unsigned Integers Outline Introduction to Structured VLSI Design Integer Arithmetic and Pipelining Multiplication in the digital domain HW mapping Pipelining optimization Joachim Rodrigues Signed and Unsigned Integers n-1

More information

OPTIMIZATION OF FIR FILTER USING MULTIPLE CONSTANT MULTIPLICATION

OPTIMIZATION OF FIR FILTER USING MULTIPLE CONSTANT MULTIPLICATION OPTIMIZATION OF FIR FILTER USING MULTIPLE CONSTANT MULTIPLICATION 1 S.Ateeb Ahmed, 2 Mr.S.Yuvaraj 1 Student, Department of Electronics and Communication/ VLSI Design SRM University, Chennai, India 2 Assistant

More information

VLSI DESIGN OF FLOATING POINT ARITHMETIC & LOGIC UNIT

VLSI DESIGN OF FLOATING POINT ARITHMETIC & LOGIC UNIT VLSI DESIGN OF FLOATING POINT ARITHMETIC & LOGIC UNIT 1 DHANABAL R, 2 BHARATHI V, 3 G.SRI CHANDRAKIRAN, 4 BHARATH BHUSHAN REDDY.M 1 Assistant Professor (Senior Grade), VLSI division, SENSE, VIT University,

More information

Dynamic Pipeline Design of an Adaptive Binary Arithmetic Coder

Dynamic Pipeline Design of an Adaptive Binary Arithmetic Coder IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: ANALOG AND DIGITAL SIGNAL PROCESSING, VOL. 48, NO. 9, SEPTEMBER 2001 813 Dynamic Pipeline Design of an Adaptive Binary Arithmetic Coder Shiann Rong Kuang,

More information

0.1 Unfolding. (b) (a) (c) N 1 y(2n+1) v(2n+2) (d)

0.1 Unfolding. (b) (a) (c) N 1 y(2n+1) v(2n+2) (d) 171 0.1 Unfolding It is possible to transform an algorithm to be expressed over more than one sample period. his is called unfolding and may be beneficial as it gives a higher degree of flexibility when

More information