Concordia University
FLOATING POINT ADDERS AND MULTIPLIERS

Lecture #4

In this lecture we will go over the following concepts:
1) Floating point number representation
2) Accuracy and dynamic range; IEEE standard
3) Floating point addition
4) Rounding techniques
5) Floating point multiplication
6) Architectures for FP addition
7) Architectures for FP multiplication
8) Comparison of two FP architectures
9) Barrel shifters

Single and double precision data formats of the IEEE 754 standard

(a) IEEE single precision data format: Sign S (1 bit) | biased Exponent E (8 bits) | unsigned fraction p (23 bits)
(b) IEEE double precision data format: Sign S (1 bit) | biased Exponent E (11 bits) | unsigned fraction p (52 bits)

Format parameters of the IEEE 754 Floating Point Standard

Parameter                               Single Precision   Double Precision
Format width in bits                    32                 64
Precision (p) = fraction + hidden bit   23 + 1             52 + 1
Exponent width in bits                  8                  11
Maximum value of exponent               +127               +1023
Minimum value of exponent               -126               -1022
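As a small illustration of the single precision layout, the following sketch splits a value into its sign, biased exponent and fraction fields. The function name `decode_single` is hypothetical and chosen only for this example; the field widths and the bias of 127 come from the table above.

```python
import struct

def decode_single(x):
    """Split x (rounded to IEEE single precision) into its three fields."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31                 # 1-bit sign
    exponent = (bits >> 23) & 0xFF    # 8-bit biased exponent
    fraction = bits & 0x7FFFFF        # 23-bit fraction, hidden bit not stored
    return sign, exponent, fraction

sign, exponent, fraction = decode_single(-6.5)
# -6.5 = -1.625 * 2^2, so the unbiased exponent is 2 and the fraction is 0.101 (binary)
print(sign, exponent - 127, format(fraction, "023b"))
```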

Range of floating point numbers

[Figure: number line. From left to right: Overflow | Negative numbers (within range) | Underflow around 0 (denormalized) | Positive numbers (within range) | Overflow]

Exceptions in IEEE 754

Exception        Remarks
Overflow         Result can be ±∞ or default maximum value
Underflow        Result can be 0 or denormal
Divide by Zero   Result can be ±∞
Invalid          Result is NaN
Inexact          System specified rounding may be required

Operations that can generate Invalid Results

Operation               Remarks
Addition/Subtraction    An operation of the type ∞ ± ∞
Multiplication          An operation of the type 0 x ∞
Division                Operations of the type 0/0 and ∞/∞
Remainder               Operations of the type x REM 0 and ∞ REM y
Square Root             Square root of a negative number

IEEE compatible floating point multipliers

Algorithm

Step 1
Calculate the tentative exponent of the product by adding the biased exponents of the two numbers and subtracting the bias ($E_1 + E_2 - \text{bias}$). The bias is 127 for the single precision and 1023 for the double precision IEEE data format.

Step 2
If the signs of the two floating point numbers are the same, set the sign of the product to +, else set it to -.

Step 3
Multiply the two significands. For p-bit significands the product is 2p bits wide (p, the width of the significand data field, includes the leading hidden bit (1)). The product of the significands falls within the range $1 \le \text{product} < 4$.

Step 4
Normalize the product if the MSB of the product is 1 (i.e. product $\ge 2$), by shifting the product right by 1 bit position and incrementing the tentative exponent. Evaluate exception conditions, if any.

Step 5
Round the product if $R(M_0 + S)$ is true, where $M_0$ and $R$ represent the pth and (p+1)st bits from the left end of the normalized product, and the sticky bit ($S$) is the logical OR of all the bits to the right of the R bit. If the rounding condition is true, a 1 is added at the pth bit (from the left side) of the normalized product. If all p MSBs of the normalized product are 1's, rounding can generate a carry-out. In that case normalization (step 4) has to be done again.
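The five steps can be sketched in a few lines of Python. This is a minimal sketch, assuming single precision, normal operands and results, and round to nearest even; exception handling is left out. The function name `fp_mul` and the (sign, exponent, fraction) tuple layout are hypothetical. Instead of physically shifting the product right in step 4, the sketch selects a different p-bit window of the 2p-bit product, which has the same effect.

```python
BIAS, P = 127, 24   # single precision bias and significand width (hidden bit included)

def fp_mul(a, b):
    """Multiply two normal numbers given as (sign, biased exponent, fraction)."""
    (s1, e1, f1), (s2, e2, f2) = a, b
    exp = e1 + e2 - BIAS                         # Step 1: tentative exponent
    sign = s1 ^ s2                               # Step 2: sign of the product
    m1, m2 = (1 << 23) | f1, (1 << 23) | f2      # restore the hidden bits
    prod = m1 * m2                               # Step 3: 2p-bit product in [1, 4)
    if prod >> (2 * P - 1):                      # Step 4: MSB set, product >= 2
        shift = P
        exp += 1
    else:
        shift = P - 1
    sig = prod >> shift                          # p MSBs of the normalized product
    r = (prod >> (shift - 1)) & 1                # round bit, the (p+1)st bit
    s = (prod & ((1 << (shift - 1)) - 1)) != 0   # sticky: OR of the remaining bits
    if r and ((sig & 1) or s):                   # Step 5: R(M0 + S)
        sig += 1
        if sig >> P:                             # carry-out: normalize again
            sig >>= 1
            exp += 1
    return sign, exp, sig & 0x7FFFFF

# Case-1 operands from the worked example later in these notes
op1 = (0, 0b10000001, 0b00000000101000111101011)
op2 = (0, 0b10000000, 0b10101100110011001100110)
sign, exp, frac = fp_mul(op1, op2)
print(sign, format(exp, "08b"), format(frac, "023b"))
```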

Operands Multiplication and Rounding

[Figure 2.4 - Significand multiplication, normalization and rounding: two p-bit input significands (leading 1 plus p - 1 lower order bits) are multiplied into a 2p-bit result with carry-out (Cout); after the normalization shift, the p-bit significand field holds the p - 1 higher order bits and M0, followed by R and S.]

What's the best architecture?
Architecture Consideration

A Simple FP Multiplier

[Block diagram. Inputs: Sign1, Sign2, Exp1, Exp2, Significand1, Significand2. Blocks: Exponent & Sign Logic; Significand Multiplier; Normalization Logic; Rounding Logic; Correction Shift; Result Flags Logic; Result Selector. Outputs: Flags, IEEE Product.]

A Dual Path FP Multiplier

[Block diagram with pipeline stages labeled 1st, 2nd and 3rd (critical path). Inputs: floating point numbers (exponents and significands). Blocks: Exponent Logic; Control / Sign Logic; Significand Multiplier (Partial Product Processing); Bypass Logic; CPA / Rounding Logic; Sticky Logic; Path 2; Exponent Incrementer; Result Selector / Normalization Logic; Result Integration / Flag Logic. Outputs: flag bits, IEEE product.]

Case-1: Normal Number

           S  Exponent  Significand
Operand1   0  10000001  00000000101000111101011
Operand2   0  10000000  10101100110011001100110
Result     0  10000010  10101101110111110011100

Case-2: Normal Number

           S  Exponent  Significand
Operand1   0  10000000  00001100110011001100110
Operand2   0  10000000  00001100110011001100110
Result     0  10000001  00011010001111010110111

Comparison of 3 types of FP multipliers using 0.22 micron CMOS technology

                                 Area (cell)   Power (mW)   Delay (ns)
Single Data Path FPM             2288.5        204.5        69.2
Double Data Path FPM             2997          94.5         68.81
Pipelined Double Data Path FPM   3173          105          42.26

IEEE compatible floating point adders

Algorithm

Step 1
Compare the exponents of the two numbers ($E_1 \le E_2$ or $E_1 > E_2$) and calculate the absolute value of the difference between the two exponents ($|E_1 - E_2|$). Take the larger exponent as the tentative exponent of the result.

Step 2
Shift the significand of the number with the smaller exponent right by a number of bit positions equal to the exponent difference. Two of the shifted-out bits of the aligned significand are retained as the guard (G) and round (R) bits, so for p-bit significands the effective width of the aligned significand must be p + 2 bits. Append a third bit, the sticky bit (S), at the right end of the aligned significand. The sticky bit is the logical OR of all shifted-out bits.

Step 3
Add/subtract the two signed-magnitude significands using a p + 3 bit adder. Call the result SUM.

Step 4
Check SUM for a carry out (Cout) from the MSB position during addition. If a carry out is detected, shift SUM right by one bit position and increment the tentative exponent by 1. During subtraction, check SUM for leading zeros. Shift SUM left until the MSB of the shifted result is a 1, and subtract the leading zero count from the tentative exponent. Evaluate exception conditions, if any.

Step 5
Round the result if the logical condition $R(M_0 + S')$ is true, where $M_0$ and $R$ represent the pth and (p + 1)st bits from the left end of the normalized significand. The new sticky bit ($S'$) is the logical OR of all bits to the right of the R bit. If the rounding condition is true, a 1 is added at the pth bit (from the left side) of the normalized significand. If all p MSBs of the normalized significand are 1's, rounding can generate a carry-out. In that case normalization (step 4) has to be done again.
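The addition steps can be sketched as follows. This is a minimal sketch, assuming single precision, normal operands and results, and round to nearest even; exception handling is omitted. The names `to_fields` and `fp_add` are hypothetical. One simplification is labeled in the code: the operands are ordered by magnitude (exponent, then significand) rather than by exponent alone, so the subtraction never produces a negative SUM.

```python
import struct

P = 24  # single precision significand width, hidden bit included

def to_fields(x):
    """(sign, biased exponent, fraction) of x rounded to single precision."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF

def fp_add(a, b):
    """Add two normal numbers given as (sign, biased exponent, fraction)."""
    (s1, e1, f1), (s2, e2, f2) = a, b
    m1, m2 = (1 << 23) | f1, (1 << 23) | f2
    # Step 1 (simplified): make operand 1 the larger magnitude.
    if (e1, m1) < (e2, m2):
        s1, e1, m1, s2, e2, m2 = s2, e2, m2, s1, e1, m1
    d, exp = min(e1 - e2, P + 3), e1
    # Step 2: align with three extra bits G, R, S; shifted-out bits are ORed into S.
    big, small = m1 << 3, m2 << 3
    lost = small & ((1 << d) - 1)
    small = (small >> d) | (1 if lost else 0)
    # Step 3: add or subtract the magnitudes with a (p + 3)-bit datapath.
    total = big + small if s1 == s2 else big - small
    if total == 0:
        return 0, 0, 0
    # Step 4: normalize (1-bit right shift on carry out, left shifts otherwise).
    if total >> (P + 3):
        total = (total >> 1) | (total & 1)   # keep the sticky information
        exp += 1
    while not total >> (P + 2):
        total <<= 1
        exp -= 1
    # Step 5: round if R(M0 + S'), then renormalize on carry-out.
    sig, r, sticky = total >> 3, (total >> 2) & 1, total & 3
    if r and ((sig & 1) or sticky):
        sig += 1
        if sig >> P:
            sig >>= 1
            exp += 1
    return s1, exp, sig & 0x7FFFFF

print(fp_add(to_fields(1.5), to_fields(2.25)) == to_fields(3.75))
```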

Floating Point Addition of Operands with Rounding

[Fig. 1 - Aligned significands: two p-bit significand fields (p - 1 higher order bits plus a0 and b0), extended by the G, R and S bits, shown before addition.]

[Fig 2.6 - Significand addition, normalization and rounding: the result of significand addition with carry-out (Cout) and the G, R, S bits before the normalization shift, and the normalized significand (p - 1 higher order bits, M0, R, S) before rounding.]

IEEE Rounding

IEEE default rounding mode: round to nearest - even

Significand   Rounded Result   Error      Significand   Rounded Result   Error
X0.00         X0.              0          X1.00         X1.              0
X0.01         X0.              -1/4       X1.01         X1.              -1/4
X0.10         X0.              -1/2       X1.10         X1. + 1          +1/2
X0.11         X1.              +1/4       X1.11         X1. + 1          +1/4
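The rounding table can be reproduced with a short sketch. It assumes the significand is held as an integer with two fractional bits, and the function name `round_nearest_even` is hypothetical. The error is reported as rounded value minus exact value, matching the signs in the table.

```python
from fractions import Fraction

def round_nearest_even(bits, frac_bits=2):
    """Round an integer carrying `frac_bits` fractional bits; ties go to even."""
    kept = bits >> frac_bits                  # integer part (the "X0" or "X1")
    rem = bits & ((1 << frac_bits) - 1)       # discarded fractional bits
    half = 1 << (frac_bits - 1)
    if rem > half or (rem == half and kept & 1):
        kept += 1                             # round up; on a tie only when LSB is 1
    error = Fraction(kept * (1 << frac_bits) - bits, 1 << frac_bits)
    return kept, error

for bits in range(8):                         # 0.00, 0.01, ..., 1.11 with X = 0
    kept, error = round_nearest_even(bits)
    print(format(bits, "03b"), "->", kept, "error", error)
```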

What's the best architecture?
Architecture Consideration

Floating Point Adder Architecture

Triple Path Floating Point Adder

[Fig 4.2 - Block diagram of the TDPFADD. Inputs: floating point numbers (exponents and significands). Blocks: Exponent Logic; Control Logic; Data Selector; Pre-alignment (Right Barrel Shifter) / Complementer; Data Selector / Pre-align (0/1 Bit Right Shifter); two Adder / Rounding Logic blocks; Bypass Logic; Exponent Incr/Decr; Exponent Subtractor; Leading Zero Counting logic; two Result Selectors; Normalization (1 bit Right/Left Shifter); Normalization (Left Barrel Shifter); Result Integration / Flag Logic. Outputs: Flags, IEEE Sum.]

Pipelined Triple Paths Floating Point Adder TPFADD

[Block diagram with pipeline stages labeled 1st to 5th. Inputs: floating point numbers (exponents and significands).
 1st: Exponent Logic; Control Logic.
 2nd: Data Selector; Pre-alignment (Barrel Shifter Right) / Complementer; Data Selector / Pre-align (0/1 Bit Right Shifter).
 3rd (critical path): two Adder / Rounding Logic blocks; Bypass Logic.
 4th: Exponent Incr/Decr; Exponent Subtractor; Leading Zero Counter; two Result Selectors.
 5th: Normalization (1 bit Right/Left Shifter); Normalization (Barrel Shifter Left); Result Integration / Flag Logic.
 Outputs: Flag, IEEE Sum.]

[Figure: path-selection diagram with states I, J and K and transitions labeled BP (bypass), LZA and LZB.]

FPADDer with Leading Zero Anticipation Logic

[Block diagram. Inputs: exponents e1, e2; significands s1, s2; sign1, sign2. Blocks: Control; exponent difference; compare; 0/1 selectors; right shifter; sign control; two bit inverters; LZA logic; LZA counter; 56b adder; rounding control; exponent subtract; left shift; incrementer; selector; exponent incrementer; compensation shifter. Outputs: sign, exponent, significand.]

Improvements to previous Designs

[Block diagram: the LZA-based adder (Control, exponent difference, compare, right shifter, sign control, bit inverters, LZA logic, LZA counter, 56b adder, rounding control, exponent subtract, left shift, incrementer, selector, exponent incrementer, compensation shifter), annotated to show the changes. Several blocks are marked X and the rounding control, incrementer and selector are marked "S & M".]

Improvements in FADD from Previous Designs

[Block diagram: the same LZA-based adder blocks (Control, exponent difference, right shifter, sign control, bit inverters, LZA logic, LZA counter, 56b adder, rounding control, exponent subtract, left shift, incrementer, selector, exponent incrementer). Two blocks are marked ABSENT, apparently the compare block and the compensation shifter.]

Comparison of synthesis results for IEEE 754 single precision FP addition using the Xilinx 4052XL-1 FPGA

Parameters                                SIMPLE        TDPFADD       PIPE/TDPFADD
Maximum delay, D (ns)                     327.6         213.8         101.11
Average power, P (mW) @ 2.38 MHz          1836          1024          382.4
Area A, total number of CLBs (#)          664           1035          1324
Power-Delay Product (ns . 10 mW)          7.7 * 10^4    4.31 * 10^4   3.82 * 10^4
Area-Delay Product (10 # . ns)            2.18 * 10^4   2.21 * 10^4   1.34 * 10^4
Area-Delay^2 Product (10 # . ns^2)        7.13 * 10^6   4.73 * 10^6   1.35 * 10^6
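The area-delay figures in this comparison follow directly from the delay and area rows. The sketch below recomputes them, assuming that the unit "10 #" means the raw product is divided by 10; the dictionary and function names are hypothetical.

```python
# (maximum delay in ns, area in CLBs), copied from the comparison above
designs = {
    "SIMPLE": (327.6, 664),
    "TDPFADD": (213.8, 1035),
    "PIPE/TDPFADD": (101.11, 1324),
}

def area_delay(delay_ns, clbs):
    """Area-Delay product in units of 10 CLB.ns (assumed reading of the unit)."""
    return clbs * delay_ns / 10

def area_delay2(delay_ns, clbs):
    """Area-Delay^2 product in units of 10 CLB.ns^2 (same assumption)."""
    return clbs * delay_ns ** 2 / 10

for name, (delay, area) in designs.items():
    print(f"{name:13s} AD = {area_delay(delay, area):.3g}  AD^2 = {area_delay2(delay, area):.3g}")
```

The pipelined design has the largest area but the smallest area-delay and area-delay^2 products, because its delay reduction outweighs its extra CLBs.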

Architecture Consideration

Straightforward IEEE floating-point addition algorithm:
1. Exponent subtraction.
2. Alignment.
3. Significand addition.
4. Conversion.
5. Leading-one detection.
6. Normalization.
7. Rounding.

[Figure: flow diagram of steps 1 to 7.]

Advantages:
1. Positive result, eliminate complement
2. Comparison // Alignment
3. Full normalization // Rounding

Architecture Consideration Cont.

[Two block diagrams of double-path adders operating on opa and opb (S, EXP., SIGNIF.). Each has a FAR path (effective addition, and effective subtraction with d > 1) and a CLOSE path (effective subtraction with d = 0, 1), merged by a MUX. The first diagram uses 2's COMP adders followed by separate rounding blocks; the second uses compound adders. Other blocks shown: sign / add-sub control, exponent difference d, right shifter, 1-bit shifters, bit inverters, LOP, normalization.]

Compared to the single path:
- Reduced latency
  FAR data-path: no conversion, no full normalization, no LOP
  CLOSE data-path: no full alignment
- Reduced total path delay: eliminates the comparator
- Increased area: two 2's COMP adders

The latency of the floating-point addition can be improved if the rounding is combined with the addition/subtraction.

Main Blocks

What blocks are considered?
- Compound Adder with Flagged Prefix Adder (New)
- LOP with Concurrent Position Correction (New)
- Alignment Shifter
- Normalization Shifter

How can a compound adder compute fastest?
Compound Adder

Compound Adder

The compound adder simultaneously computes the sum and the sum plus one; the correctly rounded result is then obtained by selecting between them according to the requirements of the rounding mode.

Effective addition:
  $A + B$
  $A + B + 1$

Effective subtraction:
  $A + \overline{B} + 1 = A - B$
  $A + \overline{B} = A - B - 1$
  $\overline{A + \overline{B} + 1} = B - A - 1$
  $\overline{A + \overline{B}} = B - A$
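The subtraction identities rely on the fact that inverting every bit of an n-bit word X gives -X - 1 modulo 2^n. The sketch below checks them exhaustively for 8-bit words. It is a behavioral model only: the word width `N` and the helper names `inv` and `compound_add` are hypothetical, and a real compound adder shares carry logic between the two outputs rather than adding twice.

```python
N = 8
MASK = (1 << N) - 1

def inv(x):
    """Bitwise inversion of an N-bit word."""
    return ~x & MASK

def compound_add(a, b):
    """Return (a + b, a + b + 1) modulo 2^N, the two compound adder outputs."""
    total = (a + b) & MASK
    return total, (total + 1) & MASK

for a in range(1 << N):
    for b in range(1 << N):
        s, s_plus_1 = compound_add(a, inv(b))
        assert s_plus_1 == (a - b) & MASK            # A + ~B + 1 = A - B
        assert s == (a - b - 1) & MASK               # A + ~B     = A - B - 1
        assert inv(s_plus_1) == (b - a - 1) & MASK   # ~(A + ~B + 1) = B - A - 1
        assert inv(s) == (b - a) & MASK              # ~(A + ~B)  = B - A
print("all identities hold for", N, "bit words")
```

The last identity is what lets a design obtain the magnitude of a negative difference with a bit inverter instead of a second two's complement addition.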

Compound Adder Cont.

Rounding Block

Round to nearest (uses Sum, Sum+1):
  if g=1 and ((LSB=1) OR (r+s=1))
    Add 1 to the result
  else
    Truncate at LSB

Round toward zero (uses Sum):
  Truncate

Round toward +Infinity (uses Sum, Sum+1 and Sum+2):
  if sign=positive
    if any bits to the right of the result LSB=1
      Add 1 to the result
    else
      Truncate at LSB
  if sign=negative
    Truncate at LSB

Round toward -Infinity (uses Sum, Sum+1 and Sum+2):
  if sign=negative
    if any bits to the right of the result LSB=1
      Add 1 to the result
    else
      Truncate at LSB
  if sign=positive
    Truncate at LSB
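The rounding decisions listed above can be written as a single predicate that says whether 1 has to be added at the LSB of the truncated magnitude. This is a minimal sketch: the function name `needs_increment` and the mode strings are hypothetical, `sign` is 0 for positive and 1 for negative, and r+s is read as the logical OR of r and s.

```python
def needs_increment(mode, sign, lsb, g, r, s):
    """True when 1 must be added at the LSB of the truncated result."""
    if mode == "nearest":
        # Round up above the halfway point, or on a tie when the LSB is 1.
        return bool(g and (lsb or r or s))
    if mode == "zero":
        return False                      # always truncate
    inexact = bool(g or r or s)           # any bit set to the right of the LSB
    if mode == "+inf":
        return sign == 0 and inexact
    if mode == "-inf":
        return sign == 1 and inexact
    raise ValueError(f"unknown rounding mode: {mode}")

print(needs_increment("nearest", 0, 1, 1, 0, 0))   # tie with odd LSB rounds up
```

With a compound adder this predicate only drives the selection between the precomputed outputs, so no separate incrementer sits on the critical path.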

What about shifting? How to shift several bits at once?
Barrel Shifters

Right Shift Barrel Shifter

Shift and Rotate Barrel Shifter

[Circuit diagram: four 4-to-1 MUXes with data inputs D3 D2 D1 D0, select lines S1 S0, and outputs Y3 Y2 Y1 Y0.]

Select     Output              Operation
S1  S0     Y3  Y2  Y1  Y0
0   0      D3  D2  D1  D0      No shift
0   1      D2  D1  D0  D3      Rotate once
1   0      D1  D0  D3  D2      Rotate twice
1   1      D0  D3  D2  D1      Rotate 3 times
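The select table can be modeled with one 4-to-1 MUX per output bit. In this sketch the function name `rotate_barrel` is hypothetical and the data word is a list indexed by bit position, so `d[0]` is D0 and `d[3]` is D3; output Yi picks the input that sits k positions below it, wrapping around.

```python
def rotate_barrel(d, s1, s0):
    """4-bit rotate: d[i] is input Di, the returned list holds Yi at index i."""
    k = 2 * s1 + s0                          # rotate amount from the select lines
    return [d[(i - k) % 4] for i in range(4)]

data = ["D0", "D1", "D2", "D3"]
for s1 in (0, 1):
    for s0 in (0, 1):
        y = rotate_barrel(data, s1, s0)
        print(s1, s0, " ".join(reversed(y)))   # printed as Y3 Y2 Y1 Y0
```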

Distributed Barrel Shifter

[Circuit diagram: three rows of eight 2-to-1 MUXes.
 Row 1 (select S0): MUX i chooses between x(i+1) and x(i), producing k7 ... k0.
 Row 2 (select S1): MUX i chooses between k(i+2) and k(i), producing w7 ... w0.
 Row 3 (select S2): MUX i chooses between w(i+4) and w(i), producing y7 ... y0.]

Paths of the distributed Barrel Shifter

Please note that in this case, if we have 8 bits of data, then the MUX inputs with an index greater than 7 should be set to a desired value.
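A behavioral sketch of such a three-row shifter follows, assuming 8-bit data and rows that shift right by 1, 2 and 4 positions under select bits S0, S1 and S2. The function name `log_right_shift` and the `fill` parameter are hypothetical; `fill` stands for the desired value fed to MUX inputs whose index exceeds 7 (0 for a logical shift, 1 to shift in ones).

```python
def log_right_shift(x, amount, fill=0, width=8):
    """Right shift by 0..7 using three rows of 2-to-1 MUXes."""
    bits = [(x >> i) & 1 for i in range(width)]      # bits[i] is bit i of x
    for row in range(3):                             # rows controlled by S0, S1, S2
        step = 1 << row                              # shift distance of this row
        if (amount >> row) & 1:                      # select bit picks the shifted input
            bits = [bits[i + step] if i + step < width else fill
                    for i in range(width)]
    return sum(bit << i for i, bit in enumerate(bits))

print(format(log_right_shift(0b10110100, 3), "08b"))          # zeros shifted in
print(format(log_right_shift(0b10110100, 3, fill=1), "08b"))  # ones shifted in
```

Any shift amount from 0 to 7 passes through exactly three MUX levels, which is what makes this structure attractive for the alignment and normalization shifters.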

A Normalization Shifter for FP Arithmetic

Block Diagram of the Right Shifter & GRS-bit Generation Component

The end
Thank you for your attendance