Chapter 5. Digital Design and Computer Architecture, 2 nd Edition. David Money Harris and Sarah L. Harris. Chapter 5 <1>

Chapter 5 Digital Design and Computer Architecture, 2 nd Edition David Money Harris and Sarah L. Harris Chapter 5 <>

Chapter 5 :: Topics Introduction Arithmetic Circuits umber Systems Sequential Building Blocks Memory Arrays Logic Arrays Chapter 5 <2>

Introduction Digital building blocks: Gates, multiplexers, decoders, registers, arithmetic circuits, counters, memory arrays, logic arrays Building blocks demonstrate hierarchy, modularity, and regularity: Hierarchy of simpler components Well-defined interfaces and functions Regular structure easily extends to different sizes Will use these building blocks in Chapter 7 to build microprocessor Chapter 5 <3>

-Bit Adders Half Adder Full Adder A B A B C out + C out + C in S S A B S = C out = C out S C in A B C out S S = C out = Chapter 5 <4>

-Bit Adders Half Adder Full Adder A B A B C out + C out + C in S S A B S = C out = C out S C in A B C out S S = C out = Chapter 5 <5>

-Bit Adders Half Adder Full Adder A B A B C out + C out + C in S S A B C out S = A B C out = AB S C in A B C out S S = A B C in C out = AB + AC in + BC in Chapter 5 <6>

Multibit Adders (CPAs) Types of carry propagate adders (CPAs): Ripple-carry (slow) Carry-lookahead (fast) Prefix (faster) Carry-lookahead and prefix adders faster for large adders but require more hardware Symbol A B C out + S C in Chapter 5 <7>

Ripple-Carry Adder Chain -bit adders together Carry ripples through entire chain Disadvantage: slow A 3 B 3 A 3 B 3 A B A B C out + C + 3 C 29 C + C + C in S 3 S 3 S S Chapter 5 <8>

Ripple-Carry Adder Delay t ripple = t FA where t FA is the delay of a -bit full adder Chapter 5 <9>

Carry-Lookahead Adder Compute carry out (C out ) for k-bit blocks using generate and propagate signals Some definitions: Column i produces a carry out by either generating a carry out or propagating a carry in to the carry out Generate (G i ) and propagate (P i ) signals for each column: Column i will generate a carry out if A i AD B i are both. G i = A i B i Column i will propagate a carry in to the carry out if A i OR B i is. P i = A i + B i The carry out of column i (C i ) is: C i = A i B i + (A i + B i )C i- = G i + P i C i- Chapter 5 <>

Carry-Lookahead Addition Step : Compute G i and P i for all columns Step 2: Compute G and P for k-bit blocks Step 3: C in propagates through each k-bit propagate/generate block Chapter 5 <>

Carry-Lookahead Adder Example: 4-bit blocks (G 3: and P 3: ) : Generally, G 3: = G 3 + P 3 (G 2 + P 2 (G + P G ) P 3: = P 3 P 2 P P G i:j = G i + P i (G i- + P i- (G i-2 + P i-2 G j ) P i:j = P i P i- P i-2 P j C i = G i:j + P i:j C j- Chapter 5 <2>

32-bit CLA with 4-bit Blocks B 3:28 A 3:28 B 27:24 A 27:24 B 7:4 A 7:4 B 3: A 3: C out 4-bit CLA Block C 27 4-bit CLA Block C 23 C 7 4-bit CLA Block C 3 4-bit CLA Block C in S 3:28 S 27:24 S 7:4 S 3: B 3 A 3 B 2 A 2 B A B A + C 2 + C + C + C in S 3 S 2 S S G 3: G 3 P 3 G 2 P 2 G P G C out C in P 3: P 3 P 2 P P Chapter 5 <3>

Carry-Lookahead Adder Delay For -bit CLA with k-bit blocks: t CLA = t pg + t pg_block + (/k )t AD_OR + kt FA t pg : delay to generate all P i, G i t pg_block : delay to generate all P i:j, G i:j t AD_OR : delay from C in to C out of final AD/OR gate in k-bit CLA block An -bit carry-lookahead adder is generally much faster than a ripple-carry adder for > 6 Chapter 5 <4>

Prefix Adder Computes carry in (C i- ) for each column, then computes sum: S i = (A i Å B i ) Å C i Computes G and P for -, 2-, 4-, 8-bit blocks, etc. until all G i (carry in) known log 2 stages Chapter 5 <5>

Prefix Adder Carry in either generated in a column or propagated from a previous column. Column - holds C in, so G - = C in, P - = Carry in to column i = carry out of column i-: C i- = G i-:- G i-:- : generate signal spanning columns i- to - Sum equation: S i = (A i Å B i ) Å G i-:- Goal: Quickly compute G :-, G :-, G 2:-, G 3:-, G 4:-, G 5:-, (called prefixes) Chapter 5 <6>

Prefix Adder Generate and propagate signals for a block spanning bits i:j: G i:j = G i:k + P i:k G k-:j P i:j = P i:k P k-:j In words: Generate: block i:j will generate a carry if: upper part (i:k) generates a carry or upper part propagates a carry generated in lower part (k-:j) Propagate: block i:j will propagate a carry if both the upper and lower parts propagate the carry Chapter 5 <7>

Prefix Adder Schematic 5 4 3 2 9 8 7 6 5 4 3 2-4:3 2: :9 8:7 6:5 4:3 2: :- 4: 3: :7 9:7 6:3 5:3 2:- :- 4:7 3:7 2:7 :7 6:- 5:- 4:- 3:- 4:- 3:- 2:- :- :- 9:- 8:- 7:- 5 4 3 2 9 8 7 6 5 4 3 2 Legend i i:j i A i B i P i:k P k-:j G i:k G k-:j G i-:- A i B i P i:i G i:i P i:j G i:j S i Chapter 5 <8>

Prefix Adder Delay t PA = t pg + log 2 (t pg_prefix ) + t XOR t pg : delay to produce P i G i (AD or OR gate) t pg_prefix : delay of black prefix cell (AD-OR gate) Chapter 5 <9>

Adder Delay Comparisons Compare delay of: 32-bit ripple-carry, carry-lookahead, and prefix adders CLA has 4-bit blocks 2-input gate delay = ps; full adder delay = 3 ps Chapter 5 <2>

Adder Delay Comparisons Compare delay of: 32-bit ripple-carry, carry-lookahead, and prefix adders CLA has 4-bit blocks 2-input gate delay = ps; full adder delay = 3 ps t ripple t CLA t PA = t FA = 32(3 ps) = 9.6 ns = t pg + t pg_block + (/k )t AD_OR + kt FA = [ + 6 + (7)2 + 4(3)] ps = 3.3 ns = t pg + log 2 (t pg_prefix ) + t XOR = [ + log 2 32(2) + ] ps =.2 ns Chapter 5 <2>

Subtracter Symbol A B - Y Implementation A B + Y Chapter 5 <22>

Comparator: Equality Symbol Implementation A 3 B 3 A 4 = B 4 A 2 B 2 A Equal Equal B A B Chapter 5 <23>

Comparator: Less Than A B - [-] A < B Copyright 27 Elsevier Chapter 5 <24> 5-<24>

Arithmetic Logic Unit (ALU) A B F 2: Function A & B A B ALU Y 3 F A + B not used A & ~B A ~B A - B SLT Copyright 27 Elsevier Chapter 5 <25> 5-<25>

ALU Design A B F 2: Function A & B A B F 2 A + B not used C out Zero Extend 3 + [-] S 2 2 F : A & ~B A ~B A - B SLT Copyright 27 Elsevier Y Chapter 5 <26> 5-<26>

Set Less Than (SLT) Example A B Configure 32-bit ALU for SLT operation: A = 25 and B = 32 F 2 C out + [-] S Zero Extend 2 3 2 F : Y Copyright 27 Elsevier Chapter 5 <28> 5-<28>

Set Less Than (SLT) Example C out Zero Extend 3 A + [-] S 2 Y B 2 Configure 32-bit ALU for SLT operation: A = 25 and B = 32 F 2 F : A < B, so Y should be 32-bit representation of (x) F 2: = F 2 = (adder acts as subtracter), so 25-32 = -7-7 has in the most significant bit (S 3 = ) F : = multiplexer selects Y = S 3 (zero extended) = x. Copyright 27 Elsevier Chapter 5 <29> 5-<29>

Shifters Logical shifter: shifts value to left or right and fills empty spaces with s Ex: >> 2 = Ex: << 2 = Arithmetic shifter: same as logical shifter, but on right shift, fills empty spaces with the old most significant bit (msb). Ex: >>> 2 = Ex: <<< 2 = Rotator: rotates bits in a circle, such that bits shifted off one end are shifted into the other end Ex: ROR 2 = Ex: ROL 2 = Copyright 27 Elsevier Chapter 5 <3> 5-<3>

Shifters Logical shifter: Ex: >> 2 = Ex: << 2 = Arithmetic shifter: Ex: >>> 2 = Ex: <<< 2 = Rotator: Ex: ROR 2 = Ex: ROL 2 = Chapter 5 <3>

Shifter Design A 3 A 2 A A shamt : 2 S : Y 3 shamt : A 3: >> Y 3: 2 4 4 S : Y 2 S : Y S : Y Chapter 5 <32>

Shifters as Multipliers, Dividers A << = A 2 Example: << 2 = ( 2 2 = 4) Example: << 2 = (-3 2 2 = -2) A >>> = A 2 Example: >>> 2 = (8 2 2 = 2) Example: >>> 2 = (-6 2 2 = -4) Chapter 5 <33>

Multipliers Partial products formed by multiplying a single digit of the multiplier with multiplicand Shifted partial products summed to form result Decimal 23 x 42 46 + 92 966 multiplicand multiplier partial products result + Binary x 23 x 42 = 966 5 x 7 = 35 Chapter 5 <34>

4 x 4 Multiplier A B 4 x 4 P 8 A 3 A 2 A A B B A 3 A 2 A A x B 3 B 2 B B B 2 A 3 B A 2 B A B A B A 3 B A 2 B A B A B A 3 B 2 A 2 B 2 A B 2 A B 2 B 3 + A 3 B 3 A 2 B 3 A B 3 A B 3 P 7 P 6 P 5 P 4 P 3 P 2 P P P 7 P 6 P 5 P 4 P 3 P 2 P P Chapter 5 <35>

4 x 4 Divider B 3 B 2 B A 3 B Legend R B Q 3 R B C out C in D R' C out + D C in A 2 Q 2 Q Q R 3 R 2 R A A R R' A/B = Q + R/B Algorithm: R = for i = - to R = {R <<. A i } D = R - B if D <, Q i =, R =R else Q i =, R =D R =R Chapter 5 <36>

umber Systems umbers we can represent using binary representations Positive numbers Unsigned binary egative numbers Two s complement Sign/magnitude numbers What about fractions? Chapter 5 <37>

umbers with Fractions Two common notations: Fixed-point: binary point fixed Floating-point: binary point floats to the right of the most significant Chapter 5 <38>

Fixed-Point umbers 6.75 using 4 integer bits and 4 fraction bits:. 2 2 + 2 + 2 - + 2-2 = 6.75 Binary point is implied The number of integer and fraction bits must be agreed upon beforehand Chapter 5 <39>

Fixed-Point umber Example Represent 7.5 using 4 integer bits and 4 fraction bits. Chapter 5 <4>

Signed Fixed-Point umbers Representations: Sign/magnitude Two s complement Example: Represent -7.5 using 4 integer and 4 fraction bits Sign/magnitude: Two s complement: Chapter 5 <42>

Signed Fixed-Point umbers Representations: Sign/magnitude Two s complement Example: Represent -7.5 using 4 integer and 4 fraction bits Sign/magnitude: Two s complement:. +7.5: 2. Invert bits: 3. Add to lsb: + Chapter 5 <43>

Floating-Point umbers Binary point floats to the right of the most significant Similar to decimal scientific notation For example, write 273 in scientific notation: 273 = 2.73 2 In general, a number is written in scientific notation as: M = mantissa B = base E = exponent ± M B E In the example, M = 2.73, B =, and E = 2 Chapter 5 <44>

Floating-Point umbers bit 8 bits 23 bits Sign Exponent Mantissa Example: represent the value 228 using a 32-bit floating point representation We show three versions final version is called the IEEE 754 floating-point standard Chapter 5 <45>

Floating-Point Representation. Convert decimal to binary (don t reverse steps & 2!): 228 = 2 2. Write the number in binary scientific notation : 2 =. 2 2 7 3. Fill in each field of the 32-bit floating point number: The sign bit is positive () The 8 exponent bits represent the value 7 The remaining 23 bits are the mantissa bit 8 bits 23 bits Sign Exponent Mantissa Chapter 5 <46>

Floating-Point Representation 2 First bit of the mantissa is always : 228 = 2 =. 2 7 So, no need to store it: implicit leading Store just fraction bits in 23-bit field bit 8 bits 23 bits Sign Exponent Fraction Chapter 5 <47>

Floating-Point Representation 3 Biased exponent: bias = 27 ( 2 ) Biased exponent = bias + exponent Exponent of 7 is as: 27 + 7 = 34 = x 2 The IEEE 754 32-bit floating-point representation of 228 bit 8 bits 23 bits Sign Biased Exponent Fraction in hexadecimal: x4364 Chapter 5 <48>

Floating-Point Example Write -58.25 in floating point (IEEE 754) Chapter 5 <49>

Floating-Point Example Write -58.25 in floating point (IEEE 754). Convert decimal to binary: 58.25 =. 2 2. Write in binary scientific notation:. 2 5 3. Fill in fields: Sign bit: (negative) 8 exponent bits: (27 + 5) = 32 = 2 23 fraction bits: bit 8 bits 23 bits Sign Exponent Fraction in hexadecimal: xc269 Chapter 5 <5>

Floating-Point: Special Cases umber Sign Exponent Fraction X - a X non-zero Chapter 5 <5>

Floating-Point Precision Single-Precision: 32-bit sign bit, 8 exponent bits, 23 fraction bits bias = 27 Double-Precision: 64-bit sign bit, exponent bits, 52 fraction bits bias = 23 Chapter 5 <52>

Floating-Point: Rounding Overflow: number too large to be represented Underflow: number too small to be represented Rounding modes: Down Up Toward zero To nearest Example: round. (.57825) to only 3 fraction bits Down:. Up:. Toward zero:. To nearest:. (.625 is closer to.57825 than.5 is) Chapter 5 <53>

Floating-Point Addition. Extract exponent and fraction bits 2. Prepend leading to form mantissa 3. Compare exponents 4. Shift smaller mantissa if necessary 5. Add mantissas 6. ormalize mantissa and adjust exponent if necessary 7. Round result 8. Assemble exponent and fraction back into floating-point format Chapter 5 <54>

Floating-Point Addition Example Add the following floating-point numbers: x3fc x45 Chapter 5 <55>

Floating-Point Addition Example. Extract exponent and fraction bits bit 8 bits 23 bits Sign Exponent Fraction bit 8 bits 23 bits Sign Exponent Fraction For first number (): S =, E = 27, F =. For second number (2): S =, E = 28, F =. 2. Prepend leading to form mantissa :. 2:. Chapter 5 <56>

Floating-Point Addition Example 3. Compare exponents 27 28 = -, so shift right by bit 4. Shift smaller mantissa if necessary shift s mantissa:. >> =. ( 2 ) 5. Add mantissas. 2 +. 2. 2 Chapter 5 <57>

Floating Point Addition Example 6. ormalize mantissa and adjust exponent if necessary. 2 =. 2 2 7. Round result o need (fits in 23 bits) 8. Assemble exponent and fraction back into floating-point format S =, E = 2 + 27 = 29 = 2, F =.. bit 8 bits 23 bits Sign Exponent Fraction in hexadecimal: x498 Chapter 5 <58>

Counters Increments on each clock edge Used to cycle through numbers. For example,,,,,,,,,, Example uses: Digital clock displays Program counter: keeps track of current instruction executing Symbol Implementation CLK CLK Q Reset + r Reset Q Chapter 5 <59>

Shift Registers Shift a new bit in on each clock edge Shift a bit out on each clock edge Serial-to-parallel converter: converts serial input (S in ) to parallel output (Q :- ) Symbol: Implementation: CLK Q S in S out S in S out Q Q Q 2 Q - Chapter 5 <6>

Shift Register with Parallel Load When Load =, acts as a normal -bit register When Load =, acts as a shift register ow can act as a serial-to-parallel converter (S in to Q :- ) or a parallel-to-serial converter (D :- to S out ) Load Clk S in D D D 2 D - S out Q Q Q 2 Q - Chapter 5 <6>

Memory Arrays Efficiently store large amounts of data 3 common types: Dynamic random access memory (DRAM) Static random access memory (SRAM) Read only memory (ROM) M-bit data value read/ written at each unique -bit address Address Array M Data Chapter 5 <62>

Memory Arrays 2-dimensional array of bit cells Each bit cell stores one bit address bits and M data bits: 2 rows and M columns Depth: number of rows (number of words) Width: number of columns (size of word) Array size: depth width = 2 M Address Array M Data Address Data Address 2 Array depth 3 Data width Chapter 5 <63>

Memory Array Example 2 2 3-bit array umber of words: 4 Word size: 3-bits For example, the 3-bit word at address is Address Data Address 2 Array depth 3 Data width Chapter 5 <64>

Memory Arrays Address 24-word x 32-bit Array 32 Data Chapter 5 <65>

Memory Array Bit Cells wordline bit bitline bitline = bitline = wordline = bit = wordline = bit = bitline = bitline = wordline = bit = wordline = bit = (a) (b) Chapter 5 <66>

Memory Array Bit Cells bitline wordline bit bitline = bitline = Z wordline = bit = wordline = bit = bitline = bitline = Z wordline = bit = wordline = bit = (a) (b) Chapter 5 <67>

Memory Array Wordline: like an enable single row in memory array read/written corresponds to unique address only one wordline HIGH at once 2:4 Decoder bitline 2 bitline bitline wordline 3 Address 2 wordline 2 bit = bit = bit = wordline bit = bit = bit = bit = bit = bit = wordline bit = bit = bit = Data 2 Data Data Chapter 5 <68>

Types of Memory Random access memory (RAM): volatile Read only memory (ROM): nonvolatile Chapter 5 <69>

RAM: Random Access Memory Volatile: loses its data when power off Read and written quickly Main memory in your computer is RAM (DRAM) Historically called random access memory because any data word accessed as easily as any other (in contrast to sequential access memories such as a tape recorder) Chapter 5 <7>

ROM: Read Only Memory onvolatile: retains data when power off Read quickly, but writing is impossible or slow Flash memory in cameras, thumb drives, and digital cameras are all ROMs Historically called read only memory because ROMs were written at manufacturing time or by burning fuses. Once ROM was configured, it could not be written again. This is no longer the case for Flash memory and other types of ROMs. Chapter 5 <7>

Types of RAM DRAM (Dynamic random access memory) SRAM (Static random access memory) Differ in how they store data: DRAM uses a capacitor SRAM uses cross-coupled inverters Chapter 5 <72>

Robert Dennard, 932 - Invented DRAM in 966 at IBM Others were skeptical that the idea would work By the mid-97 s DRAM in virtually all computers Chapter 5 <73>

DRAM Data bits on capacitor Dynamic because the value needs to be refreshed (rewritten) periodically and after read: Charge leakage from the capacitor degrades the value Reading destroys the value bitline bitline wordline bit wordline bit Chapter 5 <74>

DRAM wordline bitline wordline bitline bit = + + bit = Chapter 5 <75>

SRAM wordline bit bitline wordline bitline bitline Chapter 5 <76>

Memory Arrays Review 2:4 Decoder bitline 2 bitline bitline wordline 3 Address 2 wordline 2 wordline wordline bit = bit = bit = bit = bit = bit = bit = bit = bit = bit = bit = bit = Data 2 Data Data DRAM bit cell: SRAM bit cell: bitline bitline bitline wordline wordline Chapter 5 <77>

ROM: Dot otation 2:4 Decoder wordline bitline Address 2 bit cell containing bitline Data 2 Data Data wordline bit cell containing Chapter 5 <78>

Fujio Masuoka, 944 - Developed memories and high speed circuits at Toshiba, 97-994 Invented Flash memory as an unauthorized project pursued during nights and weekends in the late 97 s The process of erasing the memory reminded him of the flash of a camera Toshiba slow to commercialize the idea; Intel was first to market in 988 Flash has grown into a $25 billion per year market Chapter 5 <79>

ROM Storage Address 2 2:4 Decoder Address Data depth Data 2 Data Data width Chapter 5 <8>

ROM Logic Address 2:4 Decoder 2 Data 2 = A Å A Data = A + A Data = A A Data 2 Data Data Chapter 5 <8>

Example: Logic with ROMs Implement the following logic functions using a 2 2 3-bit ROM: X = AB Y = A + B Z = A B A, B 2 2:4 Decoder X Y Z Chapter 5 <82>

Example: Logic with ROMs Implement the following logic functions using a 2 2 3-bit ROM: X = AB Y = A + B Z = A B A, B 2 2:4 Decoder X Y Z Chapter 5 <83>

Logic with Any Memory Array 2:4 Decoder bitline 2 bitline bitline wordline 3 Address 2 wordline 2 wordline wordline bit = bit = bit = bit = bit = bit = bit = bit = bit = bit = bit = bit = Data 2 = A Å A Data 2 Data Data Data = A + A Data = A A Chapter 5 <84>

Logic with Memory Arrays Implement the following logic functions using a 2 2 3-bit memory array: X = AB Y = A + B Z = A B Chapter 5 <85>

Logic with Memory Arrays Implement the following logic functions using a 2 2 3-bit memory array: X = AB Y = A + B Z = A B A, B 2 2:4 Decoder wordline 3 wordline 2 wordline wordline bit = bit = bit = bit = bitline 2 bitline bitline bit = bit = bit = bit = bit = bit = bit = bit = X Y Z Chapter 5 <86>

Logic with Memory Arrays Called lookup tables (LUTs): look up output at each input combination (address) 4-word x -bit Array Truth Table A B Y A B 2:4 Decoder A A bit = bit = bit = bit = bitline Y Chapter 5 <87>

Multi-ported Memories Port: address/data pair 3-ported memory 2 read ports (A/RD, A2/RD2) write port (A3/WD3, WE3 enables writing) Register file: small multi-ported memory CLK A A2 WE3 RD RD2 M M M A3 WD3 Array Chapter 5 <88>

SystemVerilog Memory Arrays // 256 x 3 memory module with one read/write port module dmem( input logic clk, we, input logic [7:] a, input logic [2:] wd, output logic [2:] rd); logic [2:] RAM[255:]; assign rd = RAM[a]; always @(posedge clk) if (we) RAM[a] <= wd; endmodule Chapter 5 <89>

Logic Arrays PLAs (Programmable logic arrays) AD array followed by OR array Combinational logic only Fixed internal connections FPGAs (Field programmable gate arrays) Array of Logic Elements (LEs) Combinational and sequential logic Programmable internal connections Chapter 5 <9>

PLAs X = ABC + ABC Y = AB Inputs M AD ARRAY Implicants OR ARRAY A B C P Outputs OR ARRAY ABC ABC AB AD ARRAY X Y Chapter 5 <9>

PLAs: Dot otation Inputs M AD ARRAY Implicants OR ARRAY A B C P Outputs OR ARRAY ABC ABC AB AD ARRAY X Y Chapter 5 <92>

FPGA: Field Programmable Gate Array Composed of: LEs (Logic elements): perform logic IOEs (Input/output elements): interface with outside world Programmable interconnection: connect LEs and IOEs Some FPGAs include other building blocks such as multipliers and RAMs Chapter 5 <93>

General FPGA Layout Chapter 5 <94>

LE: Logic Element Composed of: LUTs (lookup tables): perform combinational logic Flip-flops: perform sequential logic Multiplexers: connect LUTs and flip-flops Chapter 5 <95>

Altera Cyclone IV LE Chapter 5 <96>

Altera Cyclone IV LE The Altera Cyclone IV LE has: four-input LUT registered output combinational output Chapter 5 <97>

LE Configuration Example Show how to configure a Cyclone IV LE to perform the following functions: X = ABC + ABC Y = AB Chapter 5 <98>

LE Configuration Example Show how to configure a Cyclone IV LE to perform the following functions: X = ABC + ABC Y = AB (A) (B) (C) (X) data data 2 data 3 data 4 X X X X X X X X LUT output C A B data data 2 data 3 data 4 LUT X LE (A) (B) (Y) data data 2 data 3 X X X X data 4 X X X X LUT output A data B data 2 data 3 Y data 4 LUT LE 2 Chapter 5 <99>

FPGA Design Flow Using a CAD tool (such as Altera s Quartus II) Enter the design using schematic entry or an HDL Simulate the design Synthesize design and map it onto FPGA Download the configuration onto the FPGA Test the design Chapter 5 <>