Learning Outcomes. Spiral 2 2. Digital System Design DATAPATH COMPONENTS

Similar documents
Learning Outcomes. Spiral 2 2. Digital System Design DATAPATH COMPONENTS

Learning Outcomes. Spiral 2-2. Digital System Design DATAPATH COMPONENTS

MULTIPLICATION TECHNIQUES

EE 109 Unit 6 Binary Arithmetic

11.1. Unit 11. Adders & Arithmetic Circuits

Overview. EECS Components and Design Techniques for Digital Systems. Lec 16 Arithmetic II (Multiplication) Computer Number Systems.

EE 101 Lab 5 Fast Adders

Arithmetic Logic Unit. Digital Computer Design

Mark Redekopp, All rights reserved. EE 352 Unit 8. HW Constructs

Arithmetic Circuits. Nurul Hazlina Adder 2. Multiplier 3. Arithmetic Logic Unit (ALU) 4. HDL for Arithmetic Circuit

ECE 30 Introduction to Computer Engineering

Chapter 3 Arithmetic for Computers

Week 7: Assignment Solutions

T insn-mem T regfile T ALU T data-mem T regfile

DIGITAL TECHNICS. Dr. Bálint Pődör. Óbuda University, Microelectronics and Technology Institute

ECE 341 Midterm Exam

Computer Architecture and Organization

Semester Transition Point. EE 109 Unit 11 Binary Arithmetic. Binary Arithmetic ARITHMETIC

Chapter 3 Part 2 Combinational Logic Design

Lecture 5. Other Adder Issues

ECE 341 Midterm Exam

Department of Electrical and Computer Engineering University of Wisconsin - Madison. ECE/CS 352 Digital System Fundamentals.

Principles of Computer Architecture. Chapter 3: Arithmetic

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 3. Arithmetic for Computers Implementation

Number Systems and Computer Arithmetic

VTU NOTES QUESTION PAPERS NEWS RESULTS FORUMS Arithmetic (a) The four possible cases Carry (b) Truth table x y

CS/COE0447: Computer Organization

CS/COE0447: Computer Organization

2.1. Unit 2. Integer Operations (Arithmetic, Overflow, Bitwise Logic, Shifting)

Binary Adders: Half Adders and Full Adders

Timing for Ripple Carry Adder

Real Digital Problem Set #6

Binary Arithmetic. Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T.

*Instruction Matters: Purdue Academic Course Transformation. Introduction to Digital System Design. Module 4 Arithmetic and Computer Logic Circuits

Chapter 5. Digital Design and Computer Architecture, 2 nd Edition. David Money Harris and Sarah L. Harris. Chapter 5 <1>

Computer Architecture Set Four. Arithmetic

Digital Circuit Design and Language. Datapath Design. Chang, Ik Joon Kyunghee University

ECE 341 Final Exam Solution

CAD4 The ALU Fall 2009 Assignment. Description

Jan Rabaey Homework # 7 Solutions EECS141

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Digital Computer Arithmetic ECE 666

Digital Logic & Computer Design CS Professor Dan Moldovan Spring 2010

Addition and multiplication

ECE331: Hardware Organization and Design

EE878 Special Topics in VLSI. Computer Arithmetic for Digital Signal Processing

This Unit: Arithmetic. CIS 371 Computer Organization and Design. Readings. Pre Class Exercise

Chapter 3 Part 2 Combinational Logic Design

Revision: August 31, E Main Suite D Pullman, WA (509) Voice and Fax

An Efficient Fused Add Multiplier With MWT Multiplier And Spanning Tree Adder

Combinational Circuit Design

Basic Arithmetic (adding and subtracting)

Microcomputers. Outline. Number Systems and Digital Logic Review

Outline. Introduction to Structured VLSI Design. Signed and Unsigned Integers. 8 bit Signed/Unsigned Integers

Tailoring the 32-Bit ALU to MIPS

Parallel logic circuits

DIGITAL ARITHMETIC: OPERATIONS AND CIRCUITS

Chapter 4. Combinational Logic

CS/COE 0447 Example Problems for Exam 2 Spring 2011

Chapter 3: part 3 Binary Subtraction

Binary Adders. Ripple-Carry Adder

CPE 335 Computer Organization. MIPS Arithmetic Part I. Content from Chapter 3 and Appendix B

MASSACHUSETTS INSTITUTE OF TECHNOLOGY Department of Electrical Engineering and Computer Sciences

Arithmetic Circuits. Design of Digital Circuits 2014 Srdjan Capkun Frank K. Gürkaynak.

CS 5803 Introduction to High Performance Computer Architecture: Arithmetic Logic Unit. A.R. Hurson 323 CS Building, Missouri S&T

ECE 645: Lecture 1. Basic Adders and Counters. Implementation of Adders in FPGAs

Chapter 3. ALU Design

Arithmetic and Logical Operations

Writing Circuit Descriptions 8

An FPGA based Implementation of Floating-point Multiplier

EKT 422/4 COMPUTER ARCHITECTURE. MINI PROJECT : Design of an Arithmetic Logic Unit

At the ith stage: Input: ci is the carry-in Output: si is the sum ci+1 carry-out to (i+1)st state

ECE468 Computer Organization & Architecture. The Design Process & ALU Design

9/6/2011. Multiplication. Binary Multipliers The key trick of multiplication is memorizing a digit-to-digit table Everything else was just adding

Each DSP includes: 3-input, 48-bit adder/subtractor

PESIT Bangalore South Campus

Computer Arithmetic Multiplication & Shift Chapter 3.4 EEC170 FQ 2005

Fundamentals of Computer Systems

1. Prove that if you have tri-state buffers and inverters, you can build any combinational logic circuit. [4]

L11: Major/Minor FSMs

Binary Addition. Add the binary numbers and and show the equivalent decimal addition.

Integer Multiplication and Division


We are quite familiar with adding two numbers in decimal

University of California at Berkeley College of Engineering Department of Electrical Engineering and Computer Sciences. Spring 2010 May 10, 2010

University of Illinois at Chicago. Lecture Notes # 10

CS Computer Architecture. 1. Explain Carry Look Ahead adders in detail

Control Unit: Binary Multiplier. Arturo Díaz-Pérez Departamento de Computación Laboratorio de Tecnologías de Información CINVESTAV-IPN

Binary Multiplication

CPE300: Digital System Architecture and Design

Hardware Design Environments. Dr. Mahdi Abbasi Computer Engineering Department Bu-Ali Sina University

Data Representation Type of Data Representation Integers Bits Unsigned 2 s Comp Excess 7 Excess 8

COMP 303 Computer Architecture Lecture 6

Number Systems. Readings: , Problem: Implement simple pocket calculator Need: Display, adders & subtractors, inputs

EC2303-COMPUTER ARCHITECTURE AND ORGANIZATION

Chapter 4 Arithmetic Functions

LECTURE 4. Logic Design

Area Efficient, Low Power Array Multiplier for Signed and Unsigned Number. Chapter 3

Cpr E 281 FINAL PROJECT ELECTRICAL AND COMPUTER ENGINEERING IOWA STATE UNIVERSITY. FINAL Project. Objectives. Project Selection

Laboratory Exercise 6

ECE 341 Midterm Exam

Transcription:

2-2. 2-2.2 Learning Outcomes piral 2 2 Arithmetic Components and Their Efficient Implementations I know how to combine overflow and subtraction results to determine comparison results of both signed and unsigned numbers I understand how combination multipliers can be built I understand how hierarchical carry lookahead logic can be used to produce logarithmic time delay for an adder Mark Redekopp 2-2.3 2-2.4 Digital ystem Design Control (CU) and Datapath Unit (DPU) paradigm eparate logic into datapath elements that operate on data and control elements that generate control signals for datapath elements Datapath: s, muxes, comparators, counters, registers (shift, with enables, etc.), memories, FIFO s Control Unit: tate machines/sequencers DATAPATH COMPONENT clk reset Control ignals Control Condition ignals Data Inputs Datapath Data Outputs

2-2.5 2-2.6 Overflow Detecting Overflow Helps Us Perform Comparison OVERFLOW & COMPARION Overflow occurs when the result of an arithmetic operation is too large to be represented with the given number of bits Unsigned overflow occurs when adding or subtracting unsigned numbers igned (2 s complement overflow) overflow occurs when adding or subtracting 2 s complement numbers 2-2.7 2-2.8 Unsigned Overflow 2 s Complement Overflow + 7 = 7 With 4-bit unsigned numbers we can only represent 5. Thus, we say overflow has occurred. Overflow occurs when you cross this discontinuity +5 +4 +3 Plus 7 +2 + + +9 +8 + +2 +3 +4 +5 +6 +7 5 + 7 = +2-6 + -4 = - With 4-bit 2 s complement numbers we can only represent -8 to +7. Thus, we say overflow has occurred. -4-3 -5-2 -6 - -7-8 + +2 +6 +7 +3 +5 +4 Overflow occurs when you cross this discontinuity

Testing for Overflow Most fundamental test Check if answer is wrong (i.e. Positive + Positive yields a negative) Unsigned overflow test [Different for add or sub] Addition: If carry out of final position equals '' ubtraction: If carry out of final addition equals '' igned (2 s complement) overflow test [ame for add or sub] Only occurs if two positives are added and result is negative or two negatives are added and result is positive Alternate test: if carry in and carry out of final position are different 2-2.9 Testing for Unsigned Overflow Unsigned Overflow has occurred if Unsigned Addition: If final carry out = Unsigned ubtraction: If final carry out = + - + - 2-2. Testing for Unsigned Overflow Unsigned Overflow has occurred if Unsigned Addition: If final carry out = Unsigned ubtraction: If final carry out = Final carry-out =, thus overflow - Final carry-out =, thus no overflow + + Final carry-out =, thus no overflow + - Final carry-out =, thus overflow + 2-2. Testing for 2 s Comp. Overflow 2 s Complement Overflow Occurs If Test : If pos. + pos. = neg. or neg. + neg. = pos. Test 2: If carry in to MB position and carry out of MB position are different + + (5) (6) (3) (2) 2-2.2 (-4) + (-7) (-2) + (-6)

2-2.3 2-2.4 Testing for 2 s Comp. Overflow Checking for Overflow 2 s Complement Overflow Occurs If Test : If pos. + pos. = neg. or neg. + neg. = pos. Test 2: If carry in to MB position and carry out of MB position are different Produce additional outputs to indicate if unsigned (UOV) or signed (OV) overflow has occurred Carry-in to MB and carry-out of MB position are different Overflow! Carry-in to MB and carry-out of MB position are same No Overflow! (5) + (6) (-5) + (3) (2) (5) Carry-in to MB and carry-out of MB position are different Overflow! Carry-in to MB and carryout of MB position are same No Overflow! (-4) + (-7) (+5) (-2) + (-6) (-8) 2-2.5 Comparison Via ubtraction 2-2.6 COMPARION uppose we want to compare two numbers: A & B uppose we let DIFF = A B what could the result tell us If DIFF <, then A < B If DIFF =, then A=B IF DIFF >, then A > B How would we know DIFF ==? If all bits of our answer are check with a NOR gate. How would we know DIFF < (i.e. negative)? igned: Check MB! (but what about overflow) Unsigned: Huh? In unsigned there are no negative results

2-2.7 2-2.8 Computing A<B from "Negative" Result Unsigned Comparator Unsigned Perform A B If A B would yield a negative result, this will appear as "overflow" in an unsigned subtraction And we know unsigned subtraction overflow occurs if Cout = o just check if Cout= igned Perform A B If there is no overflow (V=), simply check if MB = But if there is overflow?? Recall overflow has the effect of flipping the sign of the result to the opposite of what it should be. o if there is overflow (V=) check is MB = (i.e. positive) ummary: A B is "truly' negative if V= & MB= or V= & MB= A comparator can be built by using a subtractor A=B A A[3:] A>B DIFF[3:] Res[3:] ubtractor B B[3:] C4 A<B 2-2.9 2-2.2 igned Comparator A comparator can be built by using a subtractor A=B A A[3:] Res[3:] DIFF[3:] A>B ADDER TIMING B ubtractor B[3:] C4 A<B

2-2.2 2-2.22 Addition s Timing Be sure to connect first to = + = A chain of full adders presents an interesting timing analysis problem To correctly compute its own um and Carry out, each full adder requires the carry out bit from the previous full adder Because hardware works in parallel, the full adders further down the chain may momentarily produce the wrong outputs because the carry has not had time to propagate to them C out 2-2.23 2-2.24 Timing Example Assume that we were adding one set of inputs and then change to a new set of inputs: Old inputs: Old inputs: + = = New inputs: + = = Timing At the time just before we enter the new input values, all carries are s New inputs: Old inputs: + = = Time - C out

2-2.25 2-2.26 Timing Now we enter the new inputs and all the s starting adding their respective inputs + = = Time Timing Each adder computes from the current inputs (notice the sum of is incorrect at this point) + = = Time New inputs: Due to propagation delay, the carries are still from the old inputs Now the carries are all based off the new inputs 2-2.27 2-2.28 Timing Timing The carry is rippling through each adder + = = Time 2 The carry is rippling through each adder + = = Time 3

2-2.29 2-2.3 Timing Ripple Carry Only after the carry propagates through all the adders is the sum valid and correct + = = Time 4 The longest path through a chain of full adders is the carry path We say that the carry ripples through the adder time C C C C C C out out in out in out C 4 C 3 C 2 C C 2-2.3 2-2.32 Ripple Carry Delay Glitches An n bit ripple carry adder has a worst case delay proportional to n (i.e. n bits => n columns of addition => n full adders) Transient, incorrect output values due to differing arrival times of gate inputs

2-2.33 2-2.34 Output Glitches Critical Path Delay of the carry causes glitches on the sum bits Glitch = momentarily, incorrect output value early late C C out out Critical Path = Longest possible delay path Assume t sum = 5 ns, t carry = 4 ns 6 ns Co 2 ns Co 8 ns Co 4 ns Co 7 ns 3 ns 9 ns 5 ns Critical Path 3 2-2.35 Unsigned Multiplication Review 2-2.36 ame rules as decimal multiplication Multiply each bit of Q by M shifting as you go An m bit * n bit mult. produces an m+n bit result (i.e. n bit * n bit produces 2*n bit result) Notice each partial product is a shifted copy of M or (zero) MULTIPLIER * M (Multiplicand) Q (Multiplier)

2-2.37 2-2.38 Unsigned Multiplication Review igned Multiplication Techniques ame rules as decimal multiplication Multiply each bit of Q by M shifting as you go An m bit * n bit mult. produces an m+n bit result (i.e. n bit * n bit produces 2*n bit result) Notice each partial product is a shifted copy of M or (zero) * + M (Multiplicand) Q (Multiplier) PP(Partial Products) P (Product) When adding signed (2 s comp.) numbers, some new issues arise Must sign extend partial products (out to 2n bits) Without ign Extension Wrong Answer! * + = -7 = +6 = +54 With ign Extension Correct Answer! * + = -7 = +6 = -42 2-2.39 2-2.4 igned Multiplication Techniques Combinational Multiplier Also, must worry about negative multiplier MB of multiplier has negative weight If MB=, multiply by (i.e. take 2 s comp. of multiplicand) With ign Extension but w/o consideration of MB Wrong Answer! * + = -4 = -6 = -4 Place Value: -8 Multiply by - With ign Extension and w/ consideration of MB Correct Answer! * + = -4 = -6 = +24 Partial Product (PP i ) Generation Multiply Q[i] * M if Q[i]= => PP i = if Q[i]= => PP i = M

Combinational Multiplier 2-2.4 Multiplication Overview 2-2.42 Partial Product (PP i ) Generation Multiply Q[i] * M if Q[i]= => PP i = if Q[i]= => PP i = M AND gates can be used to generate each partial product M[3] M[2] M[] M[] M[3] M[2] M[] M[] if Q[ i]= M[3] M[2] M[] M[] if Q[ i]= Multiplication approaches: equential: hift and Add produces one product bit per clock cycle time (usually slow) Combinational: Array multiplier uses an array of adders Can be as simple as N ripple carry adders for an NxN multiplication AND Gate Array produces partial product terms m3 m2 m m x q3 q2 q q m3q m2q mq mq m3q m2q mq mq - m3q2 m2q2 mq2 mq2 - - + m3q3 m2q3 mq3 mq3 - - - p7 p6 p5 p4 p3 p2 p p Combinational Multiplier 2-2.43 Array Multiplier 2-2.44 Partial Products must be added together Combinational multipliers require long propagation delay through the adders propagation delay is proportional to the number of partial products (i.e. number of bits of input) and the width of each adder Can this be a HA? Maximum delay =? Do you look for the longest path or the shortest path between any input and output? Compare with the delay of a shift and add method

2-2.45 2-2.46 Pipelined Multiplier Carry ave Multiplier Now try to pipeline the previous design m3q2 m2q2 mq2 mq2 m2q3 mq3 mq3 m3q3 m3q m2q mq mq m3q m2q mq mq Instead of propagating the carries to the left in the same row, carries are now sent down to the next stage to reduce stage delay and facilitate pipelining Co HA Co Co Co HA CA s m3q m3q m2q mq mq m2q mq mq Co Co Co Co Co Co Co HA m3q2 m2q2 mq2 mq2 Co Co Co P[7] Co P[6] Co P[5] Co P[4] Co HA Determine the maximum stage delay to decide the pipeline clock rate. Assume zero delay for stage latches. How does the latency of the pipeline compare with the simple combinational array of the previous stage? P[3] P[2] P[] P[] The upper three stages are 3 bit Carry ave s (CA s) each with 2 gate delays. The last stage is a Ripple Carry (RCA) which requires longer delay. It can be replaced by a CLA for larger multipliers. RCA m3q3 Co P[7] P[6] m2q3 mq3 mq3 Co Co Co Co Co P[5] P[4] P[3] P[2] P[] P[] 2-2.47 2-2.48 Carry ave s Carry ave (3,2) s Consider the decimal addition of 47 + 96 + 58 = 2 One way is to add 47 to 96 to get 43 and then add 58 Here the ten s column cannot be added until the carry is produced In the carry save style, we add the one s column and ten s column simultaneous 4 7 + 9 6 2 4 3 + 5 8 2 3 6 5 4 4 7 9 6 + 5 8 2 2 + 8 _ 2 4 3 A carry save adder is also called a (3,2) adder or a (3,2) counter (refer to Computer Arithmetic Algorithms by Israel Koren) as it takes three vectors, adds them up, and reduces them to two vectors, namely a sum vector and a carry vector CA s are based on the principle that carries do not have to be added as soon as possible, but can be combined in a later step An n bit CA consist of n disjoint full adders + _ Carry vector um vector A[3] B[3] C[3] A[2] B[2] C[2] A[] B[] C[] A[] B[] C[] Co Z Co Z Co Z Co C[4] [3] C[3] [2] C[2] [] C[] [] Z

2-2.49 2-2.5 Propagation Delay Propagation Delay + + 2-2.5 2-2.52 Propagation Delay Propagation Delay + +

2-2.53 2-2.54 Propagation Delay Propagation Delay + + 2-2.55 2-2.56 Critical Path Combinational Multiplier Critical Path = Longest possible delay path Assume t sum = 5 ns, t carry = 4 ns 6 ns 2 ns 8 ns 4 ns 7 ns 3 ns 9 ns 5 ns Critical Path

Combinational Multiplier 2-2.57 Combinational Multiplier 2-2.58 Combinational Multiplier 2-2.59 Combinational Multiplier 2-2.6

Combinational Multiplier 2-2.6 Combinational Multiplier 2-2.62 Combinational Multiplier 2-2.63 Combinational Multiplier 2-2.64

2-2.65 2-2.66 Critical Paths Combinational Multiplier Analysis Large Area due to (n ) m bit adders n because the first adder adds the first two partial products and then each adder afterwards adds one more partial product Propagation delay is in two dimensions proportional to m+n Critical Path Critical Path 2 2-2.67 Ripple Carry s 2-2.68 Ripple carry adders (RCA) are slow due to carry propagation At least 2 levels of logic per full adder Carry Lookahead s T ADDER 6 5 4 3 2

Fast s 2-2.69 Carry Lookahead Logic 2-2.7 Rather than calculating one carry at a time and passing it down the chain, can we compute a group of carries at the same time To do this, let us define some new signals for each column of addition: p i = Propagate: This column will propagate a carry in (if there is one) to the carry out. p i is true when A i or B i is => p i = A i + B i g i = Generate: This column will generate a carry out whether or not the carry in is g i is true when A i and B i is => g i = A i B i Using these signals, we can define the carry out (c i+ ) as: c i+ = g i + p i c i Define each carry in terms of p i, g i and the initial carry in (c ) and not in terms of carry chain (intermediate carries: c,c2,c3, ) c = c2 = c3 = c4 = Carry Lookahead Logic 2-2.7 Carry Lookahead Analogy 2-2.72 Define each carry in terms of p i, g i and the initial carry in (c ) and not in terms of carry chain (intermediate carries: c,c2,c3, ) c = g + p c c2 = g + p c = g + p g + p p c c3 = c4 = Consider the carry chain like a long tube broken into segments. Each segment is controlled by a valve (propagate signal) and can insert a fluid into that segment (generate signal) The carry out of the diagram below will be true if g is true or p is true and g is true, or p, p and c is true

2-2.73 2-2.74 Carry Lookahead Use carry lookahead logic to generate all the carries in one shot and then create the sum Example 4 bit CLA shown below 2-2.75 2-2.76 Carry Lookahead 4 bit s Use carry lookahead logic to generate all the carries in one shot and then create the sum Example 4 bit CLA shown below 3 3 3 3 74L283 chip implements a 4 bit adder using CLA methodology A 3 A 2 A A = A + B 3 B 2 B B = B 4 3 2 = A 3 B 3 A 2 B 2 A B A B 74L283 3 2 4

6 Bit CLA But how would we make a 6 bit adder? hould we really just chain these fast 4 bit adders together? Or can we do better? 2-2.77 2-2.78 A[5:2] B[5:2] A[:8] B[:8] A[7:4] B[7:4] A[3:] B[3:] C C6 C2 PG PG PG PG [5:2] [:8] [7:4] [3:] C8 7 5 3 C4 6-bit RCA Delay = 6*2 = 32 gate delays Delay of the above adder design = 3+2+2+3 = gates Let us improve by looking ahead at a higher level to produce C6, C2, C8, C4 in parallel What s the difference between the equation for G here and C4 on the previous slides Define P and G as the overall Propagate and Generate signals for a set of 4 bits P = p3 p2 p p G = g3 + p3 g2 + p3 p2 g + p3 p2 p g REVIEW ON OUR OWN FOR CLA LAB 2-2.79 2-2.8 6 bit CLA Closer Look 6 Bit CLA Each 4 bit CLA only propagates its overall carry in if each of the 4 columns propagates: P = p3 p2 p p P = p7 p6 p5 p4 P2 = p p p9 p8 P3 = p5 p4 p3 p2 Each 4 bit CLA generates a carry if any column generates and the more significant columns propagate G = g3 + (p3 g2) + (p3 p2 g)+(p3 p2 p g) G3 = g5 + (p5 g4) + (p5 p4 g3)+(p5 p4 p3 g2) The higher order CLL logic (producing C4,C8,C2,C6) then is realized as: (C4) =>C = G + (P c) (C6) => C4 = G3 + (P3 G2) + (P3 P2 G) +(P3 P2 P G)+ (P3 P2 P P c) These equations are exactly the same CLL logic we derived earlier C6 Understanding 6 bit CLA hierarchy PG G c5 p3 g3 c4 P* G* CLL PG CLL PG CLL P G CLL C2 c3 p2 g2 C8 CLL p g Delay = = = Delay in producing pi,gi = = Delay in producing Pi*,Gi* = = Delay in producing C4,C8,C2,C6 = = Delay in producing c5 = = Delay in producing 5 c2 C4 c p g c C

2-2.8 2-2.82 64 Bit CLA ummary We can reuse the same CLL logic to build a 64 bit CLA c63 s35 C6 C56 C52 C44 C4 C36 C28 C24 C2 C2 C8 C4 C ou should now be able to build: Fast s Comparators PG G CLL PG CLL PG CLL P G CLL C48 C32 C6 p3 g3 c4 P* G* c3 p2 g2 c2 CLL p g c p g c = = Delay in producing 63 Is the delay in producing s63 the same as in s35? = = Delay in producing 2 = = Delay in producing = = Delay in producing pi*,gi* = = Delay in producing Pj**,Gj** = = Delay in producing C48 = = Delay in producing C6 = = Delay in producing C63 = = Delay in producing 63 = Total Delay