Computer Organization and Structure. Bing-Yu Chen National Taiwan University

Computer Organization and Structure Bing-Yu Chen National Taiwan University

Arithmetic for Computers Addition and Subtraction Gate Logic and K-Map Method Constructing a Basic ALU Arithmetic Logic Unit Multiplication and Division Floating Point

Arithmetic operation a 32 ALU result b 32 32 2

Integer Addition & Subtraction Example: 7 + 6 (just like in grade school) () () () () () (carries)......... () () () () () () Subtraction = Add negation of second operand Example: 7 6 = 7 + ( 6) +7: 6: +: 3

Overflow Overflow if result out of range Adding +ve and ve operands no overflow two +ve operands overflow if result sign is two ve operands overflow if result sign is Subtracting two +ve or two ve operands no overflow +ve from ve operand overflow if result sign is ve from +ve operand overflow if result sign is Consider the operations A + B, and A B Can overflow occur if B is? Can overflow occur if A is? 4

Detecting Overflow Operation Operand A A+B Operand B Result indicating overflow A+B A-B A-B 5

Dealing with Overflow An exception (interrupt) occurs On overflow, invoke exception handler Save PC in exception program counter (EPC) register Jump to predefined handler address mfc (move from coprocessor reg) instruction can retrieve EPC value, to return after corrective action Some languages (e.g., C) ignore overflow new MIPS instructions: addu, addiu, sub note: addiu still sign-extends! note: sltu, sltiu for unsigned comparisons 6

An ALU (Arithmetic Logic Unit) build an ALU to support andi & ori instructions just build a bit ALU, and use 32 of them operation op a b res a b result Possible Implementation (sum-of-products): 7

Review: The NOT, AND, OR Gates Description Gates Truth Table NOT If X = then X = If X = then X = X X X X AND Z = if X and Y are both X Y Z X Y Z OR Z = if X or Y (or both) are X Y Z X Y Z 8

Review: NAND, NOR, XOR, XNOR 6 functions of two variables: X Y F F F2 F3 F4 F5 F6 F7 X Y X Y X Y X, X', Y, Y', X Y, X+Y,, only half of the possible functions F8 F9 F Y F F2 X F3 F4 F5 9

Review: NAND, NOR NAND Description Gates Truth Table Z = if X is or Y is X Y Z X Y Z NOR Z = if both X and Y are X Y Z X Y Z

Review: XOR, XNOR XOR XOR: X or Y but not both ("inequality", "difference") X Y XY XY XNOR: X and Y are the same ("equality", "coincidence") X Y XY XY Description Gates Truth Table Z = if X has a different value than Y X Y Z X Y Z XNOR Z = if X has the same value as Y X Y Z X Y Z

Review: Truth Tables Tabulate all possible input combinations and their associated output values Example: half adder adds two binary digits to form Sum and Carry A B Sum Carry NOTE: plus is with a carry of in binary Example: full adder adds two binary digits and Carry in to form Sum and Carry Out A B C in S um C out 2

Deriving Boolean Equations from Truth Tables for Half Adder OR'd together product terms for each truth table row where the function is if input variable is, it appears in complemented form; if, it appears uncomplemented A B Sum Carry Sum = A B + A B Carry = A B 3

4 Example: Full Adder A B Cin Sum Cout Sum = A B Cin + A B Cin + A B Cin + A B Cin Cout = A B Cin + A B Cin + A B Cin + A B Cin

Reducing the Complexity of Boolean Equations each product term in the above equation covers exactly two rows in the truth table; several rows are "covered" by more than one term B C in A C in A B A B C in C out Cout = A Cin + B Cin + A B 5

Review: Gates & Net most widely used primitive building block in digital system design Standard Logic Gate Representation INVERTER A AND Net B SUM OR Net 2 CARRY Net: electrically connected collection of wires 6

Two-Level Simplification Key Tool: The Uniting Theorem A (B' + B) = A A B F F = A B' + A B = A (B' + B) = A B's values change within the on-set rows B is eliminated, A remains A's values don't change within the on-set rows Essence of Simplification: find two element subsets of the ON-set where only one variable changes its value. This single varying variable can be eliminated! 7

Karnaugh Map Method K-map is an alternative method of representing the truth table that helps visualize adjacencies in up to 6 dimensions Beyond that, computer-based methods are needed 2-variable K-map 3-variable K-map A B C AB 2 3 2 3 A B 6 7 4 5 C CD AB 3 2 4 5 7 6 A B 2 3 5 4 8 9 D 4-variable K-map 8

Karnaugh Map Method Numbering Scheme:,,, Gray Code only a single bit changes from code word to next code word 9

K-Map Method Examples A B A asserted, unchanged B varies A B Cin AB F = A A B complemented, unchanged A varies AB C G = B' A B Cout = A B + B Cin + A Cin B F(A,B,C) = A 2

K-Map Method Examples, 3 Variables AB C A F(A,B,C) = Sm(,4,5,7) F = B' C' + A C In the K-map, adjacency wraps from left to right and from top to bottom C AB B A F' simply replace 's with 's and vice versa F'(A,B,C) = Sm(,2,3,6) F' = B C' + A' C B compare with the method of using DeMorgan's Theorem and Boolean Algebra to reduce the complement! 2

K-map Method Examples: 4 variables AB CD A F(A,B,C,D) = Sm(,2,3,5,6,7,8,,,4,5) F = C + A' B D + B' D' C D B find the smallest number of the largest possible subcubes that cover the ON-set 22

K-map Example: Don't Cares Don't Cares can be treated as 's or 's if it is advantageous to do so AB CD C A X X X D F = A'D + B' C' D w/o don't cares F = C' D + A' D w/ don't cares by treating this DC as a "", a 2-cube can be formed rather than one -cube AB CD A X In PoS form: F = D (A' + C') same answer as above, but fewer literals B C X X B D 24

25 Design Example: Two Bit Comparator Block Diagram and Truth Table A 4-Variable K-map for each of the 3 output functions F F 2 F 3 D C B A =, >, < F A B = C D F 2 A B < C D F 3 A B > C D A B C D N N 2

Design Example: Two Bit Comparator AB CD A AB CD A AB CD A C D C D C D B B B K-map for F K-map for F 2 K-map for F 3 F = A' B' C' D' + A' B C' D + A B C D + A B' C D' F2 = A' B' D + A' C + B' C D F3 = B C' D' + A C' + A B D' (A xnor C)(B xnor D) 26

27 Design Example: Two Bit Adder Block Diagram and Truth Table A 4-variable K-map for each of the 3 output functions X Y Z D C B A + N 3 A B C D N N 2 X Y Z

Design Example: Two Bit Adder AB CD A AB CD A AB CD A C D C D C D B K-map for X X = A C + B C D + A B D Z = B D' + B' D = B xor D Y = A' B' C + A B' C' + A' B C' D + A' B C D' + A B C' D' + A B C D = B' (A xor C) + A' B (C xor D) + A B (C xnor D) = B' (A xor C) + B (A xor B xor C) B K-map for Y 's on diagonal suggest XOR! Y K-Map not minimal as drawn B K-map for Z gate count reduced if XOR available 28

Two Level Simplification Definition of Terms implicant: single element of the ON-set or any group of elements that can be combined together in a K-map prime implicant: implicant that cannot be combined with another implicant to eliminate a term essential prime implicant: if an element of the ON-set is covered by a single prime implicant, it is an essential prime Objective: grow implicants into prime implicants cover the ON-set with as few prime implicants as possible essential primes participate in ALL possible covers 29

Examples to Illustrate Terms AB CD A 6 Prime Implicants: A' B' D, B C', A C, A' C' D, A B, B' C D C D essential Minimum cover = B C' + A C + A' B' D AB CD C B A B D 5 Prime Implicants: B D, A B C', A C D, A' B C, A' C' D essential Essential implicants form minimum cover 3

Examples to Illustrate Terms AB CD A Prime Implicants: B D, C D, A C, B' C C D essential Essential primes form the minimum cover B 3

Algorithm: Minimum Sum of Products Expression from a K-Map. Choose an element of ON-set not already covered by an implicant 2. Find "maximal" groupings of 's and X's adjacent to that element. Remember to consider top/bottom row, left/right column, and corner adjacencies. This forms prime implicants (always a power of 2 number of elements). Repeat Steps and 2 to find all prime implicants 3. Revisit the 's elements in the K-map. If covered by single prime implicant, it is essential, and participates in final cover. The 's it covers do not need to be revisited 4. If there remain 's not covered by essential prime implicants, then select the smallest number of prime implicants that cover the remaining 's 32

Example AB CD A AB CD A AB CD A X X X C X X D C X X D C X X D B B B Initial K-map Primes around A' B C' D' Primes around A B C' D 33

Example AB CD A AB CD A AB CD A X X X C X X D C X X D C X X D B B B Primes around A B C' D Primes around A B' C' D' Essential Primes with Min Cover 34

Different Implementations Not easy to decide the best way to build something Don't want too many inputs to a single gate Don't want to have to go through too many gates for our purposes, ease of comprehension is important Let's look at a -bit ALU for addition: c in a b + sum c out = ab + ac in + bc in sum = a xor b xor c in c out How could we build a -bit ALU for add, and, and or? How could we build a 32-bit ALU? 35

Review: The Multiplexor Selects one of the inputs to be the output, based on a control input d d c a b c a b note: we call this a 2-input mux even though it has 3 inputs! Lets build our ALU using a MUX 36

-bit ALU for AND & OR a operation result b 37

Building a 32 bit ALU CarryIn operation a c in operation a b c in ALU c out result result a b c in ALU c out result 2 b + c out a3 b3 c in ALU3 c out result3 38

What about Subtraction? binvert c in operation a 2 result b + c out 39

Tailoring the ALU to the MIPS Need to support the set-on-less-than instruction (slt) remember: slt is an arithmetic instruction produces a if rs < rt and otherwise use subtraction: (a-b) < implies a < b Need to support test for equality (beq $t5, $t6, $t7) use subtraction: (a-b) = implies a = b 4

Supporting slt without overflow binvert c in operation a b + 2 less 3 result c out 4

Supporting slt with overflow & set binvert c in operation a b + 2 less 3 overflow detection result set overflow 42

A 32 bit ALU binvert CarryIn operation bit ALU without overflow & set a b c in ALU result c out a b c in ALU c out result bit ALU with overflow & set a3 b3 c in ALU3 result3 set overflow 43

Test for Equality binvert operation control lines: = and = or = add = subtract = slt = beq (use zero as the output) a b a b a3 b3 c in ALU c out c in ALU c out c in ALU3 set result result result3 zero overflow 44

The ALU Symbol ALU operation a zero ALU result overflow b CarryOut 45

Multiplication more complicated than addition accomplished via shifting and addition more time and more area Let's look at 3 versions based on grade school algorithm (multiplicand) x (multiplier) negative numbers: convert and multiply 46

Multiplication (multiplicand) x (multiplier) 47

Multiplication Hardware: First Version Multiplicand shift left 64 bits 64-bit ALU Multiplier shift right 32 bits Product write control test 64 bits 48

Multiplication Algorithm: First Version start Multiplier=. test Multiplier Multiplier= a. add Multiplicand to Product and place the result in Product register 2. shift the Multiplicand register left bit 3. shift the Multiplier register right bit 32nd repetition? no: < 32 repetitions yes: 32 repetitions done 49

Multiplication Example: First Version iteration step Multiplier Multiplicand Product initial value a: Prod = Prod + Mcand 2: shift left Multiplicand 3: shift right Multiplier 2 a: Prod = Prod + Mcand 2: shift left Multiplicand 3: shift right Multiplier 3 : no operation 2: shift left Multiplicand 3: shift right Multiplier 4 : no operation 2: shift left Multiplicand 3: shift right Multiplier 5

Multiplication Hardware: Second Version Multiplicand 32 bits Multiplier shift right 32 bits 32-bit ALU Product 64 bits shift right write control test 5

Multiplication Algorithm: Second Version start Multiplier=. test Multiplier Multiplier= a. add Multiplicand to the left half of the Product and place the result in the left half of the Product register 2. shift the Product register right bit 3. shift the Multiplier register right bit 32nd repetition? no: < 32 repetitions yes: 32 repetitions done 52

Multiplication Example: Second Version iteration step Multiplier Multiplicand Product initial value a: Prod = Prod + Mcand 2: shift right Product 3: shift right Multiplier 2 a: Prod = Prod + Mcand 2: shift right Product 3: shift right Multiplier 3 : no operation 2: shift right Product 3: shift right Multiplier 4 : no operation 2: shift right Product 3: shift right Multiplier 53

Multiplication Hardware: Third Version Multiplicand 32 bits 32-bit ALU Product 64 bits shift right write control test 54

Multiplication Algorithm: Third Version start Product=. test Product Product= a. add Multiplicand to the left half of the Product and place the result in the left half of the Product register 2. shift the Product register right bit 32nd repetition? no: < 32 repetitions yes: 32 repetitions done 55

Multiplication Example: Third Version iteration step Multiplicand Product initial value a: Prod = Prod + Mcand 2: shift right Product 2 a: Prod = Prod + Mcand 2: shift right Product 3 : no operation 2: shift right Product 4 : no operation 2: shift right Product 56

MIPS Multiplication Two 32-bit registers for product HI: most-significant 32 bits LO: least-significant 32-bits Instructions mult rs, rt / multu rs, rt 64-bit product in HI/LO mfhi rd / mflo rd Move from HI/LO to rd Can test HI value to see if product overflows 32 bits mul rd, rs, rt Least-significant 32 bits of product > rd 57

Division Dividend = Quotient x Divisor + Remainder - - 58

Division Hardware: First Version Divisor shift right 64 bits 64-bit ALU Quotient shift left 32 bits Remainder write control test 64 bits 59

Division Algorithm: First Version start. subtract the Divisor register from the Remainder register and place the result in the Remainder register Remainder>= 2a. shift the Quotient register to the left, setting the new rightmost bit to Test Remainder Remainder< 2b. restore the original value by adding the Divisor register to the Remainder register and place the sum in the Remainder register. Also shift the Quotient register to the left, setting the new least significant bit to 3. shift the Divisor register right bit 33rd repetition? no: < 33 repetitions done yes: 33 repetitions 6

Division Example: First Version iteration step Quotient Divisor Remainder initial values : Rem = Rem - Div 2b: Rem < +Div, sll Q, Q= 3: shift Div right 2 : Rem = Rem - Div 2b: Rem < +Div, sll Q, Q= 3: shift Div right 3 : Rem = Rem - Div 2b: Rem < +Div, sll Q, Q= 3: shift Div right 4 : Rem = Rem - Div 2a: Rem sll Q, Q= 3: shift Div right 5 : Rem = Rem - Div 2a: Rem sll Q, Q= 3: shift Div right 6

Division Hardware: Second Version Divisor 32 bits Quotient shift left 32 bits 32-bit ALU Remainder 64 bits shift left write control test 62

Division Hardware: Third Version Divisor 32 bits 32-bit ALU Remainder shift right shift left write control test 64 bits 63

Division Algorithm: Third Version start. shift the Remainder register left bit 2. subtract the Divisor register from the left half of the Remainder register and place the result in the left half of the Remainder register Remainder>= 3a. shift the Remainder register to the left, setting the new rightmost bit to Test Remainder Remainder< 3b. restore the original value by adding the Divisor register to the left half of the Remainder register and place the sum in the left half of the Remainder register. Also shift the Remainder register to the left, setting the new rightmost bit to 33rd repetition? no: < 33 repetitions yes: 33 repetitions done. shift left half of Remainder right bit 64

Division Example: Third Version iteration step Divisor Remainder initial value shift Rem left 2: Rem = Rem Div 3b: Rem < +Div, sll R, R = 2 2: Rem = Rem Div 3b: Rem < +Div, sll R, R = 3 2: Rem = Rem Div 3a: Rem +Div, sll R, R = 4 2: Rem = Rem Div 3a: Rem +Div, sll R, R = shift left half of Rem right 65

MIPS Division Use HI/LO registers for result HI: 32-bit remainder LO: 32-bit quotient Instructions div rs, rt / divu rs, rt No overflow or divide-by- checking Software must perform checks if required Use mfhi, mflo to access result 66

Right Shift and Division Left shift by i places multiplies an integer by 2 i Right shift divides by 2 i? Only for unsigned integers For signed integers Arithmetic right shift: replicate the sign bit e.g., 5 / 4 2 >> 2 = 2 = 2 Rounds toward c.f. 2 >>> 2 = 2 = +62 67

Floating Point We need a way to represent numbers with fractions, e.g., 3.459265 very small numbers, e.g.,. very large numbers, e.g., 3.5576 x 9 Like scientific notation 2.34 56 +.2 4 +987.2 9 In binary ±.xxxxxxx 2 2 yyyy Types float and double in C 68

Floating Point Standard Defined by IEEE Std. 754-985 Developed in response to divergence of representations Portability issues for scientific code Now almost universally adopted Two representations Single precision (32-bit) Double precision (64-bit) 69

IEEE Floating-Point Format single: 8 bits double: bits S Exponent x ( ) S single: 23 bits double: 52 bits Fraction ( Fraction) 2 (Exponent Bias) S: sign bit ( non-negative, negative) Normalize significand:. significand < 2. Always has a leading pre-binary-point bit, so no need to represent it explicitly (hidden bit) Significand is Fraction with the. restored Exponent: excess representation: actual exponent + Bias Ensures exponent is unsigned Single: Bias = 27; Double: Bias = 23 7

Single-Precision Range Exponents and reserved Smallest value Exponent: actual exponent = 27 = 26 Fraction: significand =. ±. 2 26 ±.2 38 Largest value Exponent: actual exponent = 254 27 = +27 Fraction: significand 2. ±2. 2 +27 ±3.4 +38 7

Double-Precision Range Exponents and reserved Smallest value Exponent: actual exponent = 23 = 22 Fraction: significand =. ±. 2 22 ±2.2 38 Largest value Exponent: actual exponent = 246 23 = +23 Fraction: significand 2. ±2. 2 +23 ±.8 +38 72

Floating-Point Precision Relative precision all fraction bits are significant Single: approx 2 23 Equivalent to 23 log 2 23.3 6 decimal digits of precision Double: approx 2 52 Equivalent to 52 log 2 52.3 6 decimal digits of precision 73

Floating-Point Example Represent.75.75 = ( ). 2 2 S = Fraction = 2 Exponent = + Bias Single: + 27 = 26 = 2 Double: + 23 = 22 = 2 Single: Double: 74

Floating-Point Example What number is represented by the single-precision float S = Fraction = 2 Exponent = 2 = 29 x = ( ) ( + 2 ) 2 (29 27) = ( ).25 2 2 = 5. 75

Floating-Point Addition 9.999 x +.6 x - =?..6 x - =.6 x align decimal points shift number with smaller exponent 2. 9.999 +.6 =.5 add significands 3..5 x =.5 x 2 normalize result & check for over/underflow 4..5 x 2 =.2 x 2 round and renormalize if necessary 76

Floating-Point Addition Now consider a 4-digit binary example. 2 2 +. 2 2 2 (.5 +.4375). Align binary points Shift number with smaller exponent. 2 2 +. 2 2 2. Add significands. 2 2 +. 2 2 =. 2 2 3. Normalize result & check for over/underflow. 2 2 4, with no over/underflow 4. Round and renormalize if necessary. 2 2 4 (no change) =.625 77

Floating-Point Addition start. compare the exponents of the two numbers. shift the smaller number to the right until its exponent would match the larger exponent 2. add the significands 3. normalize the sum, either shifting right and incrementing the exponent or shifting left and decrementing the exponent overflow or underflow? yes no exception 4. round the significand to the appropriate number of bits no still normalized? done yes 78

Floating-Point Addition Sign Exponent Fraction Sign Exponent Fraction small ALU exponent difference control shift right big ALU incrementor difference shift left or right rounding hardware Sign Exponent Fraction 79

Floating-Point Multiplication. x x 9.2 x -5 =?. addition of exponents + -5 = 5 2. multiplication of significands. x 9.2 =.22 3..22 x 5 =.22 x 6 4..22 x 6 =.2 x 6 5. determination of sign + 8

Floating-Point Multiplication Now consider a 4-digit binary example. 2 2. 2 2 2 (.5.4375). Add exponents + 2 = 3 2. Multiply significands. 2. 2 =. 2. 2 2 3 3. Normalize result & check for over/underflow. 2 2 3 (no change) with no over/underflow 4. Round and renormalize if necessary. 2 2 3 (no change) 5. Determine sign: +ve ve ve. 2 2 3 =.2875 8

Floating-Point Multiplication start. add the biased exponents of the two numbers, subtracting the bias from the sum to get the new biased exponent 2. multiply the significands 3. normalize the product if necessary, shifting if right and incrementing the exponent overflow or underflow? yes no 4. round the significand to the appropriate number of bits exception no still normalized? yes 5. set the sign of the product to positive if the signs of the original operands are the same; if they differ make the sign negative 82 done

Rounding with Guard Digits 2.56 x + 2.34 x 2 =? with Guard Digits (2 extra bits).256 x 2 + 2.34 x 2 = 2.3656 x 2 = 2.37 x 2 without Guard Digits.2 x 2 + 2.34 x 2 = 2.36 x 2 83

FP Instructions in MIPS Separate FP registers 32 single-precision: $f, $f, $f3 Paired for double-precision: $f/$f, $f2/$f3, FP instructions operate only on FP registers Programs generally don t do integer ops on FP data, or vice versa More registers with minimal code-size impact FP load and store instructions lwc, ldc, swc, sdc e.g., ldc $f8, 32($sp) 84

FP Instructions in MIPS Single-precision arithmetic add.s, sub.s, mul.s, div.s e.g., add.s $f, $f, $f6 Double-precision arithmetic add.d, sub.d, mul.d, div.d e.g., mul.d $f4, $f4, $f6 Single- and double-precision comparison c.xx.s, c.xx.d (xx is eq, lt, le, ) Sets or clears FP condition-code bit e.g. c.lt.s $f3, $f4 Branch on FP condition code true or false bct, bcf e.g., bct TargetLabel 85

FP Example: F to C C code: float f2c (float fahr) { return ((5./9.)*(fahr - 32.)); } fahr in $f2, result in $f, literals in global memory space MIPS code: f2c: lwc $f6, const5($gp) lwc2 $f8, const9($gp) div.s $f6, $f6, $f8 lwc $f8, const32($gp) sub.s $f8, $f2, $f8 mul.s $f, $f6, $f8 jr $ra 86