Sungmin Bae, Hyung-Ock Kim, Jungyun Choi, and Jaehong Park. Design Technology Infrastructure Design Center System-LSI Business Division

2
1. Motivation
2. Design flow
3. Parallel multiplier
4. Coarse-grained structural placement methodology
5. Experimental results
6. Future work

3 Data-flow (design structure) awareness is crucial to enhancing physical design quality: timing, area, congestion, power, etc.
- Structured datapath placement is mostly done manually.
- Placement tools are generally thought to perform poorly on datapath designs.
- Design effort: days to weeks.
[Figure: control granularity of data-flow aware physical design, from coarser to finer: floorplan, memory macro placement, structured datapath placement. Example datapath: Sum = A + B.]

4 We have added another methodology to data-flow aware physical design: automated extraction and mapping for a synthesized parallel multiplier (Sum = A * B).
[Figure: control granularity, from coarser to finer: floorplan, memory macro placement, coarse-grained structured datapath placement (logic synthesis with a datapath template, then automated datapath extraction and mapping), structured datapath placement.]

5 Steps of the flow:
- Identify cells of a synthesized parallel multiplier to be structurally placed
- Inherent structural location extraction of the cells
- Analyze data-flow of the multiplier
- Structurally mapping the cells on a logical 2-D array
- Physical bit-slice alignment of the cells
- Generate structural relative placement directives
- Guide structural placement during global placement
[Figure: design flow chart. Logic synthesis takes RTL code, a technology library, and timing/area constraints through parsing/elaboration, arithmetic operation extraction, high-level arithmetic optimizations, the datapath generator with structural templates (multiplier), non-arithmetic logic high-level optimizations, and technology-independent and technology-dependent optimizations, and produces an optimized gate-level netlist. Structure extraction and mapping (structural location inference/cell mapping, physical-aware bit-slice alignment, structural relative placement directives) then feeds global placement with coarse-grained structural placement. A "Result satisfactory?" decision ends the flow on Yes; its No branches loop back, one of them through user dataflow analysis.]

6 A parallel multiplier is one of the most abundant arithmetic circuits in today's multimedia-feature-intensive SoCs. A parallel multiplier largely consists of three parts:
- Partial product generation
- Partial product reduction
- Carry-propagating adder (final adder)
[Figure: 4 x 4 multiplication in dot notation. The multiplicand and multiplier bits (Y3..Y0, X3..X0) produce the partial product rows X0Y3..X0Y0, X1Y3..X1Y0, X2Y3..X2Y0, X3Y3..X3Y0, which pass through partial product reduction and the final adder to give the final product S7..S0.]
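
The dot-notation picture can be reproduced in a few lines. This is a minimal sketch for an unsigned non-Booth multiplier; the function names are illustrative and do not come from the slides. Each partial product X_i AND Y_j carries weight 2^(i+j), so summing the weighted bits gives the product that the reduction tree and the final adder would compute in hardware.

```python
def partial_products(x, y, n=4):
    """Return rows[i][j] = X_i AND Y_j for an n x n unsigned multiplication."""
    x_bits = [(x >> i) & 1 for i in range(n)]
    y_bits = [(y >> j) & 1 for j in range(n)]
    return [[x_bits[i] & y_bits[j] for j in range(n)] for i in range(n)]


def product_from_rows(rows):
    """Sum the partial products; the bit at row i, position j has weight 2**(i + j)."""
    return sum(bit << (i + j)
               for i, row in enumerate(rows)
               for j, bit in enumerate(row))


rows = partial_products(13, 11)
print(rows)
print(product_from_rows(rows))
```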

7 Partial product generation
- Non-Booth: generates the logical product (AND) of a multiplicand bit and a multiplier bit.
- Booth (radix-4): reduces the number of partial products by half.
Partial product reduction
- Carry-save addition: reduces every column to 2 output rows using compressor cells.
Carry-propagate adder (final adder)
- Carry-lookahead adder: adds the 2 output rows.
[Figure: multiplication in dot notation (multiplicand, multiplier, partial products, partial product reduction, final adder, final product), with cell-level views of non-Booth and Booth partial product generation (Xi, Yj, PPij), 3:2 compressors for partial product reduction, and a 3-bit carry-propagate adder built from full adders (FA) and a carry-lookahead unit (P, G, C signals).]
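
Two of these building blocks are easy to check numerically. The sketch below is illustrative only, not the synthesized hardware: it assumes a two's-complement multiplier with an even bit width for the radix-4 Booth recoding, and it models a 3:2 compressor bitwise on non-negative Python integers.

```python
def booth_radix4_digits(y, n):
    """Recode an n-bit two's-complement multiplier (n even) into n/2 digits in -2..2."""
    bits = [(y >> k) & 1 for k in range(n)]
    digits, prev = [], 0                      # prev plays the role of bit y[-1] = 0
    for k in range(0, n, 2):
        digits.append(bits[k] + prev - 2 * bits[k + 1])
        prev = bits[k + 1]
    return digits                             # y == sum(d * 4**k for k, d in enumerate(digits))


def compress_3_2(a, b, c):
    """Bitwise 3:2 compressor (full adder per column): a + b + c == s + carry."""
    s = a ^ b ^ c
    carry = ((a & b) | (a & c) | (b & c)) << 1
    return s, carry


def carry_save_reduce(rows):
    """Apply 3:2 compression until only 2 rows remain."""
    rows = list(rows)
    while len(rows) > 2:
        a, b, c = rows.pop(), rows.pop(), rows.pop()
        rows.extend(compress_3_2(a, b, c))
    return rows


print(booth_radix4_digits(6, 4))              # half as many digits as multiplier bits
print(carry_save_reduce([13, 26, 104]))       # two rows left for the final adder
```

The two rows returned by `carry_save_reduce` still have to be summed by a carry-propagate adder, which is the role of the final adder.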

8 The methodology performs the following steps:
1. Identify cells of a synthesized parallel multiplier to be structurally placed
   - The PI cells from the partial product generation
   - The PO cells from the final adder
2. Inherent structural location extraction of the cells
   - Tagging structural locations for the PI and PO cells
3. Analyze data-flow of the multiplier
4. Structurally mapping the cells on a logical 2-D array
5. Physical bit-slice alignment of the cells
6. Generate structural relative placement directives
7. Guide structural placement during global placement
[Figure: design flow chart, with structure extraction and mapping between logic synthesis and global placement.]

9 The PI cells from the partial product generation
- The PI cells are retrieved as the immediate fan-out cone cells of the input nets.
- The set of nets used to collect the PI cells depends on the type of partial product generation.
  - Non-Booth: multiplicand and multiplier input nets
  - Booth: multiplicand input nets
[Figure: 4 x 4 multiplication in dot notation with the partial product generation stage highlighted, and the non-Booth and Booth partial product cells (Xi, Yj, PPij).]
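
A minimal sketch of this collection step, assuming a toy netlist held as a dictionary from cell name to its input and output net names. The data model and the function name are hypothetical; the slides do not describe the tool's internal representation.

```python
def collect_pi_cells(cells, multiplicand_nets, multiplier_nets, pp_type):
    """Return the cells directly driven by the seed input nets.

    cells: {cell_name: {"inputs": [net, ...], "outputs": [net, ...]}}
    pp_type: "non-booth" seeds from both operands, "booth" from the multiplicand only.
    """
    seeds = set(multiplicand_nets)
    if pp_type == "non-booth":
        seeds |= set(multiplier_nets)
    return sorted(name for name, cell in cells.items()
                  if seeds & set(cell["inputs"]))


toy = {
    "and00": {"inputs": ["x0", "y0"], "outputs": ["p00"]},
    "enc0": {"inputs": ["x0", "x1"], "outputs": ["e0"]},
    "sel00": {"inputs": ["y0", "e0"], "outputs": ["q00"]},
    "fa0": {"inputs": ["p00", "q00"], "outputs": ["s0"]},
}
print(collect_pi_cells(toy, ["y0"], ["x0", "x1"], "booth"))
```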

10 After the PI cells are extracted, they are tagged with the 2-D location of a partial product row and column, in two steps: row inference and column inference.
Row inference
- The row of a PI cell can be inferred from its topologically closest multiplier inputs.
- i indicates the ith row of the partial product generator.
Notation
- PIrow(Ck) : the row number of the PI cell Ck
- PIcol(Ck) : the column number of the PI cell Ck
- Bmd(Ck) : the closest multiplicand bit of Ck
- Bmr(Ck) : the closest multiplier bit of Ck
- PPtype : the partial product type
[Figure: non-Booth and Booth partial product cells (Xi, Yj, PPij).]
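
Row inference can be sketched as a breadth-first search through the fan-in of a PI cell. The netlist model is the same hypothetical dictionary form as before. The mapping from the closest multiplier bit to a row is an assumption here, since the slide's formula did not survive transcription: the bit index itself for non-Booth, and the bit index divided by two for radix-4 Booth.

```python
from collections import deque


def closest_input_bit(cells, start, input_bits):
    """Breadth-first search through the fan-in of `start`.

    input_bits: {net_name: bit_index}. Returns the bit index of the nearest
    input net, or None. Ties at the same cell resolve to the lowest bit,
    which is an arbitrary choice made for this sketch.
    """
    driver = {net: name for name, c in cells.items() for net in c["outputs"]}
    queue, seen = deque([start]), {start}
    while queue:
        name = queue.popleft()
        hits = [input_bits[n] for n in cells[name]["inputs"] if n in input_bits]
        if hits:
            return min(hits)
        for net in cells[name]["inputs"]:
            src = driver.get(net)
            if src is not None and src not in seen:
                seen.add(src)
                queue.append(src)
    return None


def pi_row(cells, cell, multiplier_bits, pp_type):
    """Assumed rule: row = Bmr for non-Booth, Bmr // 2 for radix-4 Booth."""
    bmr = closest_input_bit(cells, cell, multiplier_bits)
    if bmr is None:
        return None
    return bmr if pp_type == "non-booth" else bmr // 2
```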

11 Column inference
- The column of a PI cell can be inferred from its topologically closest, bit-slice-aligned multiplier output bit.
- Topological order propagation is restricted to follow only the same-weight bit-slice along the CSA tree, ignoring the carry-out pins of the compressor cells.
[Figure: CSA tree of 3:2 compressors over the 4 x 4 partial products X0Y0..X3Y3, marking Column[i] and Column[i+1] above the product bits S7..S0.]
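
The restricted propagation can be sketched as a forward breadth-first search that never leaves a cell through a carry-out pin. This is a minimal illustration on a hypothetical netlist model in which each cell maps output pin names to nets; the pin name "CO" is an assumption, not a library fact.

```python
from collections import deque


def pi_column(cells, start, output_bits, carry_pins=("CO",)):
    """Follow fan-out from `start`, skipping carry-out pins, to the nearest product bit.

    cells: {name: {"inputs": [net, ...], "outputs": {pin: net, ...}}}
    output_bits: {net_name: product_bit_index}
    """
    loads = {}
    for name, cell in cells.items():
        for net in cell["inputs"]:
            loads.setdefault(net, []).append(name)
    queue, seen = deque([start]), {start}
    while queue:
        name = queue.popleft()
        nets = [net for pin, net in cells[name]["outputs"].items()
                if pin not in carry_pins]
        for net in nets:
            if net in output_bits:
                return output_bits[net]
        for net in nets:
            for nxt in loads.get(net, []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return None
```

Passing `carry_pins=()` disables the restriction, which lets the search drift into the next-higher bit-slice through a carry.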

12 The PO cells are part of the final carry-propagating adder.
- The PO cells are retrieved as the immediate fan-in cone cells of the output nets.
- Each PO cell is tagged with its corresponding multiplier output bit.
[Figure: the final adder stage of the multiplier, shown as a 3-bit carry-propagate adder with full adders (inputs A2..A0, B2..B0, sums S2..S0) and a carry-lookahead unit (P2..P0, G2..G0, C0..C3).]
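
The PO side mirrors the PI collection. A minimal sketch, using the same hypothetical dictionary netlist with lists of input and output nets; when a cell drives several product bits, tagging it with the lowest one is an arbitrary choice for this example.

```python
def collect_po_cells(cells, output_bits):
    """Return {cell_name: product_bit} for cells that directly drive an output net.

    cells: {name: {"inputs": [net, ...], "outputs": [net, ...]}}
    output_bits: {net_name: product_bit_index}
    """
    tags = {}
    for name, cell in cells.items():
        bits = [output_bits[net] for net in cell["outputs"] if net in output_bits]
        if bits:
            tags[name] = min(bits)
    return tags
```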

14 Step 4: Structurally mapping the cells on a logical 2-D array
- Using the inferred row and column numbers.

15 The PI cells are mapped onto a logical 2-D array according to their tagged row and column numbers.
- However, the number of cells inferred to the same location can be uneven because of the local nature of logic synthesis optimizations.
- If enough slots are allocated for all the cells, the 2-D array may have an uncontrollable aspect ratio, which may degrade placement quality.
- The maximum number of columns is constrained to control the array dimensions. The number of rows is fixed.
- Some mis-mappings are allowed.
- Slots are shared between adjacent columns.
- Spacing is left between the rows of the 2-D array so that non-guided cells can be placed close to their inherent structural locations.

16 Min-cost max-flow based cell mapping maximizes the number of mapped PI cells with minimum mis-mapping cost for a given 2-D array.
- An initial 2-D slot array may not fully contain all the PI cells.
- Empty slots can be shared between adjacent bit-slice columns.
- Dummy (empty) column slots are added iteratively during the mapping, at the columns with the worst mis-mapping costs.
The slots of each column are divided into three types with different mapping cost weights:
- Non-shared : mapping weight γown
- Shared : mapping weight γshared
- Dummy : mapping weight γdummy
Mis-mapping cost : γx * |rowcell - rowslot|, where x is own, shared, or dummy
[Figure: flow network from the PI cells (PI Cell[i-1,0], PI Cell[i,0], PI Cell[i+1,0]) to the slots of Column[i-1], Column[i], and Column[i+1], including shared slots and dummy slots, with edge costs Cost, CostSH, and CostDS and slot capacities j, m, and k.]
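
The mapping idea can be sketched with a small self-contained min-cost max-flow solver. This is an illustration under stated assumptions, not the authors' implementation: the graph layout (source to cell, cell to eligible slot, slot to sink), the slot tuple format, and the cost γ * |row_cell - row_slot| are readings of the slide, and all names are hypothetical.

```python
def min_cost_max_flow(n, edges, s, t):
    """Successive shortest paths with Bellman-Ford.

    edges: list of (u, v, capacity, cost). Returns (flow, cost, flow_per_edge).
    """
    arcs = []                       # arc = [head, residual capacity, cost]
    out = [[] for _ in range(n)]    # out[u] = indices of arcs leaving u
    for u, v, cap, cost in edges:
        out[u].append(len(arcs))
        arcs.append([v, cap, cost])
        out[v].append(len(arcs))    # the reverse arc sits at index i ^ 1
        arcs.append([u, 0, -cost])
    inf = float("inf")
    flow = total = 0
    while True:
        dist, prev = [inf] * n, [-1] * n
        dist[s] = 0
        for _ in range(n - 1):
            changed = False
            for u in range(n):
                if dist[u] == inf:
                    continue
                for i in out[u]:
                    v, cap, cost = arcs[i]
                    if cap > 0 and dist[u] + cost < dist[v]:
                        dist[v], prev[v], changed = dist[u] + cost, i, True
            if not changed:
                break
        if dist[t] == inf:
            break
        path, v = [], t
        while v != s:               # walk back along the shortest path
            path.append(prev[v])
            v = arcs[prev[v] ^ 1][0]
        push = min(arcs[i][1] for i in path)
        for i in path:
            arcs[i][1] -= push
            arcs[i ^ 1][1] += push
        flow += push
        total += push * dist[t]
    return flow, total, [arcs[2 * k + 1][1] for k in range(len(edges))]


def map_cells(cells, slots):
    """cells: {name: (row, col)}; slots: [(row, allowed_cols, capacity, gamma)]."""
    names = sorted(cells)
    source, sink = 0, 1
    cell_node = {c: 2 + i for i, c in enumerate(names)}
    edges = [(source, cell_node[c], 1, 0) for c in names]
    labels = [None] * len(names)
    for j, (row, cols, cap, gamma) in enumerate(slots):
        slot_node = 2 + len(names) + j
        edges.append((slot_node, sink, cap, 0))
        labels.append(None)
        for c in names:
            r, col = cells[c]
            if col in cols:         # a shared slot lists both adjacent columns
                edges.append((cell_node[c], slot_node, 1, gamma * abs(r - row)))
                labels.append((c, j))
    _, cost, used = min_cost_max_flow(2 + len(names) + len(slots), edges, source, sink)
    return cost, {lab[0]: lab[1] for lab, f in zip(labels, used) if lab and f}


cells = {"a": (0, 1), "b": (0, 1), "c": (0, 0)}
slots = [(0, {0}, 1, 1), (0, {1}, 1, 1), (1, {0, 1}, 1, 2)]  # last one is a shared slot
print(map_cells(cells, slots))
```

In the toy data, two cells compete for the single row-0 slot of column 1, so one of them spills into the shared row-1 slot and pays the shared weight times one row of mis-mapping.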

17 HPWL is considered as a tiebreaker for the mapping, to compensate for its net-connection blindness.
- The mapping is formulated as a linear program over the weighted sum of the min-cost max-flow cost CostMA(ci) and the HPWL minimization cost CostHPWL(ni).
  - CostMA(ci) : weighted sum of the mis-mapping cost of cell ci
  - CostHPWL(ni) : weighted HPWL cost of net ni
- Dummy column slots are gradually added at the columns with the worst mis-mapping cost, and the linear program is solved iteratively.
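
One way the combined objective could be written is sketched below. The scalar weights \alpha and \beta are assumptions (the slide does not give the weighting), and the second expression is the standard half-perimeter wirelength definition over the pin positions (x_p, y_p) of net n_i.

```latex
\min \;\; \alpha \sum_{i} \mathrm{Cost}_{MA}(c_i) \;+\; \beta \sum_{i} \mathrm{Cost}_{HPWL}(n_i),
\qquad
\mathrm{HPWL}(n_i) = \Bigl(\max_{p \in n_i} x_p - \min_{p \in n_i} x_p\Bigr)
                   + \Bigl(\max_{p \in n_i} y_p - \min_{p \in n_i} y_p\Bigr)
```

Choosing \beta much smaller than \alpha would make the HPWL term matter only between mappings of equal mis-mapping cost, which matches its stated role as a tiebreaker.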

19 The logically mapped PI and PO cells are then bit-slice aligned with respect to their physical dimensions.
- Strict bit-slice alignment : each column width is decided by the widest cell in the column.
  - The cell alignment size is uncontrollable.
- Compression alignment : this generates a compact cell cluster.
  - It cannot ensure vertical bit-slice alignment.
[Figure: two 3 x 5 cell arrays (rows i-2 to i, columns j-1 to j+3), one strictly aligned and one compressed.]
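
The trade-off between the two alignments can be made concrete with a toy width matrix. This sketch only models horizontal positions; the function names and the misalignment metric (spread of left edges within a column) are illustrative assumptions.

```python
def strict_alignment(widths):
    """widths[r][c] = cell width. Every column is as wide as its widest cell."""
    col_widths = [max(col) for col in zip(*widths)]
    xs, x = [], 0
    for w in col_widths:
        xs.append(x)
        x += w
    return [list(xs) for _ in widths], x      # same x positions in every row


def compression_alignment(widths):
    """Pack each row left to right with no padding."""
    positions, total = [], 0
    for row in widths:
        xs, x = [], 0
        for w in row:
            xs.append(x)
            x += w
        positions.append(xs)
        total = max(total, x)
    return positions, total


def misalignment(positions):
    """Per column, the spread between the leftmost and rightmost cell origin."""
    return [max(col) - min(col) for col in zip(*positions)]


widths = [[2, 4, 2], [4, 2, 2], [2, 2, 4]]
for align in (strict_alignment, compression_alignment):
    pos, total = align(widths)
    print(align.__name__, total, misalignment(pos))
```

On this input the strict array is wider but perfectly aligned, while the compressed array is narrower but misaligned in its later columns, which is the tension a width-bounded alignment has to balance.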

20 Our method combines the advantages of the aforementioned methods.
- It aligns the columns within a maximum width constraint.
- It minimizes bit-slice misalignment while ensuring a maximum alignment width.
[Figure: 3 x 5 cell array (rows i-2 to i, columns j-1 to j+3) showing the misalignment at each column and the maximum width constraint.]

21 Step 6: Generate structural relative placement directives
- The relative row and column locations of the cells
- The column spaces between the cells

22 After the bit-slice alignment, the structural locations and the cell spacings are transformed into structural relative placement directives.
- Relative row and column locations of the cells
- Cell spaces between the cells
To accommodate the cell spaces, the number of array columns is set to twice that of the logical 2-D array. Compression-based alignment is used to align the cells. An estimated dataflow direction is used to set the initial orientations of the arrays for global placement.
[Figure: relative placement array with alternating cell slots and space slots holding cells Ci-2,j to Ci,j+3 and the cell spacing between them.]
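
The doubled-column layout can be sketched as a plain data structure. The tuple format below is hypothetical and is not the directive syntax of any P&R tool: even columns hold cells, and the odd column to the right of each cell holds the spacing that follows it.

```python
def build_rp_array(mapping, spacing, n_rows, n_cols):
    """Build a relative placement array with 2 * n_cols columns.

    mapping: {(row, col): cell_name} from the logical 2-D array.
    spacing: {(row, col): gap after that cell}, e.g. from bit-slice alignment.
    """
    array = [[None] * (2 * n_cols) for _ in range(n_rows)]
    for (r, c), cell in mapping.items():
        array[r][2 * c] = ("cell", cell)           # cell slot
    for (r, c), gap in spacing.items():
        if gap > 0:
            array[r][2 * c + 1] = ("space", gap)   # space slot
    return array


rp = build_rp_array({(0, 0): "u1", (0, 1): "u2", (1, 0): "u3"}, {(0, 0): 2}, 2, 2)
for row in rp:
    print(row)
```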

24 Structural relative placement directives hold the locations of the PI and PO cells. Non-guided cells are attracted to the PI and PO cells.
[Figure: placement results for a 13*12 non-Booth multiplier and a 32*16 Booth multiplier.]

25 We implemented the proposed methodology in Tcl, with CLP as the linear program solver. Commercial logic synthesis and P&R tools were used with industrial designs.
- About 2%, 42%, and 2% improvements in critical path delay, total negative slack, and total wire-length, respectively.
- D11 degraded the physical implementation quality: about 25% of its inputs were pruned by constant propagation, which left it insufficiently regular for the approach.
[Table: per-design results for 11 designs (D) and their average, with columns Design, # Mults, Area ratio, CPD, TNS, Wirelength. The values are missing from the transcript.]

26 [Figure: a snapshot of D10.]

27 Future work will focus on:
- Extending the methodology to other synthesized datapath circuits.
- Developing regularity-measuring methods to avoid structurally mapping insufficiently regular multipliers.
- Adding more awareness of the surroundings to further automate the methodology.


More information

Mapping Algorithms to Hardware By Prawat Nagvajara

Mapping Algorithms to Hardware By Prawat Nagvajara Electrical and Computer Engineering Mapping Algorithms to Hardware By Prawat Nagvajara Synopsis This note covers theory, design and implementation of the bit-vector multiplication algorithm. It presents

More information

Review of Last lecture. Review ALU Design. Designing a Multiplier Shifter Design Review. Booth s algorithm. Today s Outline

Review of Last lecture. Review ALU Design. Designing a Multiplier Shifter Design Review. Booth s algorithm. Today s Outline Today s Outline San Jose State University EE176-SJSU Computer Architecture and Organization Lecture 5 HDL, ALU, Shifter, Booth Algorithm Multiplier & Divider Instructor: Christopher H. Pham Review of Last

More information

ECE 5745 Complex Digital ASIC Design Topic 13: Physical Design Automation Algorithms

ECE 5745 Complex Digital ASIC Design Topic 13: Physical Design Automation Algorithms ECE 7 Complex Digital ASIC Design Topic : Physical Design Automation Algorithms Christopher atten School of Electrical and Computer Engineering Cornell University http://www.csl.cornell.edu/courses/ece7

More information

More complicated than addition. Let's look at 3 versions based on grade school algorithm (multiplicand) More time and more area

More complicated than addition. Let's look at 3 versions based on grade school algorithm (multiplicand) More time and more area Multiplication More complicated than addition accomplished via shifting and addition More time and more area Let's look at 3 versions based on grade school algorithm 01010010 (multiplicand) x01101101 (multiplier)

More information

A novel technique for fast multiplication

A novel technique for fast multiplication INT. J. ELECTRONICS, 1999, VOL. 86, NO. 1, 67± 77 A novel technique for fast multiplication SADIQ M. SAIT², AAMIR A. FAROOQUI GERHARD F. BECKHOFF and In this paper we present the design of a new high-speed

More information

ICS 252 Introduction to Computer Design

ICS 252 Introduction to Computer Design ICS 252 Introduction to Computer Design Placement Fall 2007 Eli Bozorgzadeh Computer Science Department-UCI References and Copyright Textbooks referred (none required) [Mic94] G. De Micheli Synthesis and

More information

ECE468 Computer Organization & Architecture. The Design Process & ALU Design

ECE468 Computer Organization & Architecture. The Design Process & ALU Design ECE6 Computer Organization & Architecture The Design Process & Design The Design Process "To Design Is To Represent" Design activity yields description/representation of an object -- Traditional craftsman

More information

Managing Dynamic Reconfiguration Overhead in Systems-on-a-Chip Design Using Reconfigurable Datapaths and Optimized Interconnection Networks

Managing Dynamic Reconfiguration Overhead in Systems-on-a-Chip Design Using Reconfigurable Datapaths and Optimized Interconnection Networks Managing Dynamic Reconfiguration Overhead in Systems-on-a-Chip Design Using Reconfigurable Datapaths and Optimized Interconnection Networks Zhining Huang, Sharad Malik Electrical Engineering Department

More information

Overview. EECS Components and Design Techniques for Digital Systems. Lec 16 Arithmetic II (Multiplication) Computer Number Systems.

Overview. EECS Components and Design Techniques for Digital Systems. Lec 16 Arithmetic II (Multiplication) Computer Number Systems. Overview EE 15 - omponents and Design Techniques for Digital ystems Lec 16 Arithmetic II (Multiplication) Review of Addition Overflow Multiplication Further adder optimizations for multiplication LA in

More information

An Interconnect-Centric Design Flow for Nanometer Technologies

An Interconnect-Centric Design Flow for Nanometer Technologies An Interconnect-Centric Design Flow for Nanometer Technologies Jason Cong UCLA Computer Science Department Email: cong@cs.ucla.edu Tel: 310-206-2775 URL: http://cadlab.cs.ucla.edu/~cong Exponential Device

More information

Virtex-II Architecture

Virtex-II Architecture Virtex-II Architecture Block SelectRAM resource I/O Blocks (IOBs) edicated multipliers Programmable interconnect Configurable Logic Blocks (CLBs) Virtex -II architecture s core voltage operates at 1.5V

More information

CSE140 L. Instructor: Thomas Y. P. Lee January 18,2006. CSE140L Course Info

CSE140 L. Instructor: Thomas Y. P. Lee January 18,2006. CSE140L Course Info CSE4 L Instructor: Thomas Y. P. Lee January 8,26 CSE4L Course Info Lectures Wedesday :-:2AM, HSS33 Lab Assignment egins TA s JinHua Liu (jhliu@cs.ucsd.edu) Contact TAs if you re still looking for a lab

More information

Clock Tree Resynthesis for Multi-corner Multi-mode Timing Closure

Clock Tree Resynthesis for Multi-corner Multi-mode Timing Closure Clock Tree Resynthesis for Multi-corner Multi-mode Timing Closure Subhendu Roy 1, Pavlos M. Mattheakis 2, Laurent Masse-Navette 2 and David Z. Pan 1 1 ECE Department, The University of Texas at Austin

More information

CS 5803 Introduction to High Performance Computer Architecture: Arithmetic Logic Unit. A.R. Hurson 323 CS Building, Missouri S&T

CS 5803 Introduction to High Performance Computer Architecture: Arithmetic Logic Unit. A.R. Hurson 323 CS Building, Missouri S&T CS 5803 Introduction to High Performance Computer Architecture: Arithmetic Logic Unit A.R. Hurson 323 CS Building, Missouri S&T hurson@mst.edu 1 Outline Motivation Design of a simple ALU How to design

More information

Laboratory 6. - Using Encounter for Automatic Place and Route. By Mulong Li, 2013

Laboratory 6. - Using Encounter for Automatic Place and Route. By Mulong Li, 2013 CME 342 (VLSI Circuit Design) Laboratory 6 - Using Encounter for Automatic Place and Route By Mulong Li, 2013 Reference: Digital VLSI Chip Design with Cadence and Synopsys CAD Tools, Erik Brunvand Background

More information

Computer Arithmetic Multiplication & Shift Chapter 3.4 EEC170 FQ 2005

Computer Arithmetic Multiplication & Shift Chapter 3.4 EEC170 FQ 2005 Computer Arithmetic Multiplication & Shift Chapter 3.4 EEC170 FQ 200 Multiply We will start with unsigned multiply and contrast how humans and computers multiply Layout 8-bit 8 Pipelined Multiplier 1 2

More information

Learning Outcomes. Spiral 2-2. Digital System Design DATAPATH COMPONENTS

Learning Outcomes. Spiral 2-2. Digital System Design DATAPATH COMPONENTS 2-2. 2-2.2 Learning Outcomes piral 2-2 Arithmetic Components and Their Efficient Implementations I understand the control inputs to counters I can design logic to control the inputs of counters to create

More information

Eliminating Routing Congestion Issues with Logic Synthesis

Eliminating Routing Congestion Issues with Logic Synthesis Eliminating Routing Congestion Issues with Logic Synthesis By Mike Clarke, Diego Hammerschlag, Matt Rardon, and Ankush Sood Routing congestion, which results when too many routes need to go through an

More information

Paper ID # IC In the last decade many research have been carried

Paper ID # IC In the last decade many research have been carried A New VLSI Architecture of Efficient Radix based Modified Booth Multiplier with Reduced Complexity In the last decade many research have been carried KARTHICK.Kout 1, MR. to reduce S. BHARATH the computation

More information

VARUN AGGARWAL

VARUN AGGARWAL ECE 645 PROJECT SPECIFICATION -------------- Design A Microprocessor Functional Unit Able To Perform Multiplication & Division Professor: Students: KRIS GAJ LUU PHAM VARUN AGGARWAL GMU Mar. 2002 CONTENTS

More information

To design a 4-bit ALU To experimentally check the operation of the ALU

To design a 4-bit ALU To experimentally check the operation of the ALU 1 Experiment # 11 Design and Implementation of a 4 - bit ALU Objectives: The objectives of this lab are: To design a 4-bit ALU To experimentally check the operation of the ALU Overview An Arithmetic Logic

More information

International Journal of Engineering and Techniques - Volume 4 Issue 2, April-2018

International Journal of Engineering and Techniques - Volume 4 Issue 2, April-2018 RESEARCH ARTICLE DESIGN AND ANALYSIS OF RADIX-16 BOOTH PARTIAL PRODUCT GENERATOR FOR 64-BIT BINARY MULTIPLIERS K.Deepthi 1, Dr.T.Lalith Kumar 2 OPEN ACCESS 1 PG Scholar,Dept. Of ECE,Annamacharya Institute

More information

MULTIPLICATION TECHNIQUES

MULTIPLICATION TECHNIQUES Learning Objectives EE 357 Unit 2a Multiplication Techniques Perform by hand the different methods for unsigned and signed multiplication Understand the various digital implementations of a multiplier

More information

APPLICATION NOTE. Constant Coefficient Multipliers for the XC4000E. Introduction. High Performance = Constant Coefficient

APPLICATION NOTE. Constant Coefficient Multipliers for the XC4000E. Introduction. High Performance = Constant Coefficient APPLICATION NOTE Constant Coefficient Multipliers for the XC000E XAPP 05 December 11, 1996 (Version 1.1) Application Note by Ken Chapman Summary This paper identifies two points at which constant coefficient

More information

FILTER SYNTHESIS USING FINE-GRAIN DATA-FLOW GRAPHS. Waqas Akram, Cirrus Logic Inc., Austin, Texas

FILTER SYNTHESIS USING FINE-GRAIN DATA-FLOW GRAPHS. Waqas Akram, Cirrus Logic Inc., Austin, Texas FILTER SYNTHESIS USING FINE-GRAIN DATA-FLOW GRAPHS Waqas Akram, Cirrus Logic Inc., Austin, Texas Abstract: This project is concerned with finding ways to synthesize hardware-efficient digital filters given

More information

CS Computer Architecture. 1. Explain Carry Look Ahead adders in detail

CS Computer Architecture. 1. Explain Carry Look Ahead adders in detail 1. Explain Carry Look Ahead adders in detail A carry-look ahead adder (CLA) is a type of adder used in digital logic. A carry-look ahead adder improves speed by reducing the amount of time required to

More information

Outline. Introduction to Structured VLSI Design. Signed and Unsigned Integers. 8 bit Signed/Unsigned Integers

Outline. Introduction to Structured VLSI Design. Signed and Unsigned Integers. 8 bit Signed/Unsigned Integers Outline Introduction to Structured VLSI Design Integer Arithmetic and Pipelining Multiplication in the digital domain HW mapping Pipelining optimization Joachim Rodrigues Signed and Unsigned Integers n-1

More information

Announcements. Midterm 2 next Thursday, 6-7:30pm, 277 Cory Review session on Tuesday, 6-7:30pm, 277 Cory Homework 8 due next Tuesday Labs: project

Announcements. Midterm 2 next Thursday, 6-7:30pm, 277 Cory Review session on Tuesday, 6-7:30pm, 277 Cory Homework 8 due next Tuesday Labs: project - Fall 2002 Lecture 20 Synthesis Sequential Logic Announcements Midterm 2 next Thursday, 6-7:30pm, 277 Cory Review session on Tuesday, 6-7:30pm, 277 Cory Homework 8 due next Tuesday Labs: project» Teams

More information

Digital VLSI Design. Lecture 7: Placement

Digital VLSI Design. Lecture 7: Placement Digital VLSI Design Lecture 7: Placement Semester A, 2016-17 Lecturer: Dr. Adam Teman 29 December 2016 Disclaimer: This course was prepared, in its entirety, by Adam Teman. Many materials were copied from

More information

Planning for Local Net Congestion in Global Routing

Planning for Local Net Congestion in Global Routing Planning for Local Net Congestion in Global Routing Hamid Shojaei, Azadeh Davoodi, and Jeffrey Linderoth* Department of Electrical and Computer Engineering *Department of Industrial and Systems Engineering

More information

FPGA for Complex System Implementation. National Chiao Tung University Chun-Jen Tsai 04/14/2011

FPGA for Complex System Implementation. National Chiao Tung University Chun-Jen Tsai 04/14/2011 FPGA for Complex System Implementation National Chiao Tung University Chun-Jen Tsai 04/14/2011 About FPGA FPGA was invented by Ross Freeman in 1989 SRAM-based FPGA properties Standard parts Allowing multi-level

More information

Week 7: Assignment Solutions

Week 7: Assignment Solutions Week 7: Assignment Solutions 1. In 6-bit 2 s complement representation, when we subtract the decimal number +6 from +3, the result (in binary) will be: a. 111101 b. 000011 c. 100011 d. 111110 Correct answer

More information

ISSN (Online)

ISSN (Online) Proposed FAM Unit with S-MB Techniques and Kogge Stone Adder using VHDL [1] Dhumal Ashwini Kashinath, [2] Asst. Prof. Shirgan Siddharudha Shivputra [1] [2] Department of Electronics and Telecommunication

More information

FPGA Implementation of Multiplier for Floating- Point Numbers Based on IEEE Standard

FPGA Implementation of Multiplier for Floating- Point Numbers Based on IEEE Standard FPGA Implementation of Multiplier for Floating- Point Numbers Based on IEEE 754-2008 Standard M. Shyamsi, M. I. Ibrahimy, S. M. A. Motakabber and M. R. Ahsan Dept. of Electrical and Computer Engineering

More information

TOPIC : Verilog Synthesis examples. Module 4.3 : Verilog synthesis

TOPIC : Verilog Synthesis examples. Module 4.3 : Verilog synthesis TOPIC : Verilog Synthesis examples Module 4.3 : Verilog synthesis Example : 4-bit magnitude comptarator Discuss synthesis of a 4-bit magnitude comparator to understand each step in the synthesis flow.

More information

Introduction to Electronic Design Automation. Model of Computation. Model of Computation. Model of Computation

Introduction to Electronic Design Automation. Model of Computation. Model of Computation. Model of Computation Introduction to Electronic Design Automation Model of Computation Jie-Hong Roland Jiang 江介宏 Department of Electrical Engineering National Taiwan University Spring 03 Model of Computation In system design,

More information

High-Level Synthesis

High-Level Synthesis High-Level Synthesis 1 High-Level Synthesis 1. Basic definition 2. A typical HLS process 3. Scheduling techniques 4. Allocation and binding techniques 5. Advanced issues High-Level Synthesis 2 Introduction

More information

Embedded Soc using High Performance Arm Core Processor D.sridhar raja Assistant professor, Dept. of E&I, Bharath university, Chennai

Embedded Soc using High Performance Arm Core Processor D.sridhar raja Assistant professor, Dept. of E&I, Bharath university, Chennai Embedded Soc using High Performance Arm Core Processor D.sridhar raja Assistant professor, Dept. of E&I, Bharath university, Chennai Abstract: ARM is one of the most licensed and thus widespread processor

More information

Binary Arithmetic. Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T.

Binary Arithmetic. Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. Binary Arithmetic Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. MIT 6.004 Fall 2018 Reminder: Encoding Positive Integers Bit i in a binary representation (in right-to-left order)

More information

CAD Flow for FPGAs Introduction

CAD Flow for FPGAs Introduction CAD Flow for FPGAs Introduction What is EDA? o EDA Electronic Design Automation or (CAD) o Methodologies, algorithms and tools, which assist and automatethe design, verification, and testing of electronic

More information

Floorplan Management: Incremental Placement for Gate Sizing and Buffer Insertion

Floorplan Management: Incremental Placement for Gate Sizing and Buffer Insertion Floorplan Management: Incremental Placement for Gate Sizing and Buffer Insertion Chen Li, Cheng-Kok Koh School of ECE, Purdue University West Lafayette, IN 47907, USA {li35, chengkok}@ecn.purdue.edu Patrick

More information

An instruction set processor consist of two important units: Data Processing Unit (DataPath) Program Control Unit

An instruction set processor consist of two important units: Data Processing Unit (DataPath) Program Control Unit DataPath Design An instruction set processor consist of two important units: Data Processing Unit (DataPath) Program Control Unit Add & subtract instructions for fixed binary numbers are found in the

More information

An Efficient Design of Sum-Modified Booth Recoder for Fused Add-Multiply Operator

An Efficient Design of Sum-Modified Booth Recoder for Fused Add-Multiply Operator An Efficient Design of Sum-Modified Booth Recoder for Fused Add-Multiply Operator M.Chitra Evangelin Christina Associate Professor Department of Electronics and Communication Engineering Francis Xavier

More information

MAPLE: Multilevel Adaptive PLacEment for Mixed Size Designs

MAPLE: Multilevel Adaptive PLacEment for Mixed Size Designs MAPLE: Multilevel Adaptive PLacEment for Mixed Size Designs Myung Chul Kim, Natarajan Viswanathan, Charles J. Alpert, Igor L. Markov, Shyam Ramji Dept. of EECS, University of Michigan IBM Corporation 1

More information

Floorplan and Power/Ground Network Co-Synthesis for Fast Design Convergence

Floorplan and Power/Ground Network Co-Synthesis for Fast Design Convergence Floorplan and Power/Ground Network Co-Synthesis for Fast Design Convergence Chen-Wei Liu 12 and Yao-Wen Chang 2 1 Synopsys Taiwan Limited 2 Department of Electrical Engineering National Taiwan University,

More information

Chapter 3 Arithmetic for Computers

Chapter 3 Arithmetic for Computers Chapter 3 Arithmetic for Computers 1 Arithmetic Where we've been: Abstractions: Instruction Set Architecture Assembly Language and Machine Language What's up ahead: Implementing the Architecture operation

More information

MULTIPLE OPERAND ADDITION. Multioperand Addition

MULTIPLE OPERAND ADDITION. Multioperand Addition MULTIPLE OPERAND ADDITION Chapter 3 Multioperand Addition Add up a bunch of numbers Used in several algorithms Multiplication, recurrences, transforms, and filters Signed (two s comp) and unsigned Don

More information

TRILOBYTE SYSTEMS. Consistent Timing Constraints with PrimeTime. Steve Golson Trilobyte Systems.

TRILOBYTE SYSTEMS. Consistent Timing Constraints with PrimeTime. Steve Golson Trilobyte Systems. TRILOBYTE SYSTEMS Consistent Timing Constraints with PrimeTime Steve Golson Trilobyte Systems http://www.trilobyte.com 2 Physical implementation Rule #1 Do not change the functionality Rule #2 Meet the

More information

IEEE-754 compliant Algorithms for Fast Multiplication of Double Precision Floating Point Numbers

IEEE-754 compliant Algorithms for Fast Multiplication of Double Precision Floating Point Numbers International Journal of Research in Computer Science ISSN 2249-8257 Volume 1 Issue 1 (2011) pp. 1-7 White Globe Publications www.ijorcs.org IEEE-754 compliant Algorithms for Fast Multiplication of Double

More information

Verilog for Combinational Circuits

Verilog for Combinational Circuits Verilog for Combinational Circuits Lan-Da Van ( 范倫達 ), Ph. D. Department of Computer Science National Chiao Tung University Taiwan, R.O.C. Fall, 2014 ldvan@cs.nctu.edu.tw http://www.cs.nctu.edu.tw/~ldvan/

More information

FastPlace 2.0: An Efficient Analytical Placer for Mixed- Mode Designs

FastPlace 2.0: An Efficient Analytical Placer for Mixed- Mode Designs FastPlace.0: An Efficient Analytical Placer for Mixed- Mode Designs Natarajan Viswanathan Min Pan Chris Chu Iowa State University ASP-DAC 006 Work supported by SRC under Task ID: 106.001 Mixed-Mode Placement

More information

ECE 341. Lecture # 6

ECE 341. Lecture # 6 ECE 34 Lecture # 6 Instructor: Zeshan Chishti zeshan@pdx.edu October 5, 24 Portland State University Lecture Topics Design of Fast Adders Carry Looakahead Adders (CLA) Blocked Carry-Lookahead Adders Multiplication

More information

CHAPTER 3 METHODOLOGY. 3.1 Analysis of the Conventional High Speed 8-bits x 8-bits Wallace Tree Multiplier

CHAPTER 3 METHODOLOGY. 3.1 Analysis of the Conventional High Speed 8-bits x 8-bits Wallace Tree Multiplier CHAPTER 3 METHODOLOGY 3.1 Analysis of the Conventional High Speed 8-bits x 8-bits Wallace Tree Multiplier The design analysis starts with the analysis of the elementary algorithm for multiplication by

More information