Midterm I March 3, 1999 CS152 Computer Architecture and Engineering

Similar documents
Midterm I March 12, 2003 CS152 Computer Architecture and Engineering

Midterm I October 6, 1999 CS152 Computer Architecture and Engineering

Midterm I March 1, 2001 CS152 Computer Architecture and Engineering

University of California College of Engineering Computer Science Division -EECS. CS 152 Midterm I

361 datapath.1. Computer Architecture EECS 361 Lecture 8: Designing a Single Cycle Datapath

COMP303 - Computer Architecture Lecture 8. Designing a Single Cycle Datapath

The Big Picture: Where are We Now? EEM 486: Computer Architecture. Lecture 3. Designing a Single Cycle Datapath

CS3350B Computer Architecture Winter Lecture 5.7: Single-Cycle CPU: Datapath Control (Part 2)

CS 61C: Great Ideas in Computer Architecture Datapath. Instructors: John Wawrzynek & Vladimir Stojanovic

CS61C : Machine Structures

Midterm I March 12, 2003 CS152 Computer Architecture and Engineering

CpE242 Computer Architecture and Engineering Designing a Single Cycle Datapath

CS 110 Computer Architecture Single-Cycle CPU Datapath & Control

CS 61C: Great Ideas in Computer Architecture. MIPS CPU Datapath, Control Introduction

Lecture #17: CPU Design II Control

COMP303 Computer Architecture Lecture 9. Single Cycle Control

CSE 141 Computer Architecture Summer Session Lecture 3 ALU Part 2 Single Cycle CPU Part 1. Pramod V. Argade

ECE468 Computer Organization and Architecture. Designing a Single Cycle Datapath

UC Berkeley CS61C : Machine Structures

Outline. EEL-4713 Computer Architecture Designing a Single Cycle Datapath

CPU Organization (Design)

Midterm I SOLUTIONS October 6, 1999 CS152 Computer Architecture and Engineering

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Single- Cycle CPU Datapath & Control Part 2

UC Berkeley CS61C : Machine Structures

361 control.1. EECS 361 Computer Architecture Lecture 9: Designing Single Cycle Control

MIPS-Lite Single-Cycle Control

CS61C : Machine Structures

The Processor: Datapath & Control

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Lecture 28: Single- Cycle CPU Datapath Control Part 1

CPU Design Steps. EECC550 - Shaaban

Working on the Pipeline

CS61C : Machine Structures

UC Berkeley CS61C : Machine Structures

Review. N-bit adder-subtractor done using N 1- bit adders with XOR gates on input. Lecture #19 Designing a Single-Cycle CPU

CS 61C: Great Ideas in Computer Architecture Control and Pipelining

Single Cycle CPU Design. Mehran Rezaei

CS61C : Machine Structures

Ch 5: Designing a Single Cycle Datapath

CS359: Computer Architecture. The Processor (A) Yanyan Shen Department of Computer Science and Engineering

Full Datapath. CSCI 402: Computer Architectures. The Processor (2) 3/21/19. Fengguang Song Department of Computer & Information Science IUPUI

CS 61C: Great Ideas in Computer Architecture Lecture 12: Single- Cycle CPU, Datapath & Control Part 2

ECE170 Computer Architecture. Single Cycle Control. Review: 3b: Add & Subtract. Review: 3e: Store Operations. Review: 3d: Load Operations

Major CPU Design Steps

Adding Support for jal to Single Cycle Datapath (For More Practice Exercise 5.20)

CS3350B Computer Architecture Quiz 3 March 15, 2018

EEM 486: Computer Architecture. Lecture 3. Designing Single Cycle Control

inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture 18 CPU Design: The Single-Cycle I ! Nasty new windows vulnerability!

How to design a controller to produce signals to control the datapath

Designing a Multicycle Processor

CPSC614: Computer Architecture

Chapter 4. The Processor. Computer Architecture and IC Design Lab

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Single- Cycle CPU Datapath Control Part 1

CSCI 402: Computer Architectures. Fengguang Song Department of Computer & Information Science IUPUI. Today s Content

Lecture 6 Datapath and Controller

CS152 Computer Architecture and Engineering Lecture 10: Designing a Single Cycle Control. Recap: The MIPS Instruction Formats

Instructor: Randy H. Katz hcp://inst.eecs.berkeley.edu/~cs61c/fa13. Fall Lecture #18. Warehouse Scale Computer

Guerrilla Session 3: MIPS CPU

inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture 19 CPU Design: The Single-Cycle II & Control !

EECS150 - Digital Design Lecture 10- CPU Microarchitecture. Processor Microarchitecture Introduction

University of California, Berkeley College of Engineering Department of Electrical Engineering and Computer Science

CENG 3420 Lecture 06: Datapath

CS 152 Computer Architecture and Engineering. Lecture 10: Designing a Multicycle Processor

University of California, Berkeley College of Engineering

Processor (I) - datapath & control. Hwansoo Han

EECS150 - Digital Design Lecture 9- CPU Microarchitecture. Watson: Jeopardy-playing Computer

King Fahd University of Petroleum and Minerals College of Computer Science and Engineering Computer Engineering Department

CENG 3420 Computer Organization and Design. Lecture 06: MIPS Processor - I. Bei Yu

CS 351 Exam 2 Mon. 11/2/2015

The MIPS Processor Datapath

Computer Architecture I Midterm I (solutions)

Recap: A Single Cycle Datapath. CS 152 Computer Architecture and Engineering Lecture 8. Single-Cycle (Con t) Designing a Multicycle Processor

Chapter 4. The Processor

Department of Electrical Engineering and Computer Sciences Fall 2003 Instructor: Dave Patterson CS 152 Exam #1. Personal Information

Midterm Questions Overview

CPU Organization Datapath Design:

CS 61c: Great Ideas in Computer Architecture

361 div.1. Computer Architecture EECS 361 Lecture 7: ALU Design : Division

COMP303 - Computer Architecture Lecture 10. Multi-Cycle Design & Exceptions

EECS 151/251A Fall 2017 Digital Design and Integrated Circuits. Instructor: John Wawrzynek and Nicholas Weaver. Lecture 13 EE141

Computer Science 61C Spring Friedland and Weaver. The MIPS Datapath

A Processor. Kevin Walsh CS 3410, Spring 2010 Computer Science Cornell University. See: P&H Chapter , 4.1-3

Tailoring the 32-Bit ALU to MIPS

Processor. Han Wang CS3410, Spring 2012 Computer Science Cornell University. See P&H Chapter , 4.1 4

Lecture Topics. Announcements. Today: Integer Arithmetic (P&H ) Next: The MIPS ISA (P&H ) Consulting hours. Milestone #1 (due 1/26)

CS 4200/5200 Computer Architecture I

Chapter 4. The Processor

Laboratory 5 Processor Datapath

CS 61C Summer 2016 Guerrilla Section 4: MIPS CPU (Datapath & Control)

University of California, Berkeley College of Engineering

CPU Organization Datapath Design:

EEM 486: Computer Architecture. Lecture 2. MIPS Instruction Set Architecture

Midterm Questions Overview

Grading Results Total 100

Systems Architecture

ECE232: Hardware Organization and Design. Computer Organization - Previously covered

Lecture 12: Single-Cycle Control Unit. Spring 2018 Jason Tang

CPU Organization Datapath Design:

CSE 141 Computer Architecture Summer Session I, Lecture 3 Performance and Single Cycle CPU Part 1. Pramod V. Argade

The Processor: Datapath and Control. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Review: Organization. CS152 Computer Architecture and Engineering Lecture 2. Review of MIPS ISA and Design Concepts

Transcription:

University of California, Berkeley College of Engineering Computer Science Division EECS Spring 1999 John Kubiatowicz Midterm I March 3, 1999 CS152 Computer Architecture and Engineering Your Name: SID Number: Discussion Section: Problem Possible Score 1 15 2 15 3 20 4 20 5 30 Total 1

[ This page left for π ] 3.141592653589793238462643383279502884197169399375105820974944 2

Problem 1: Performance Problem 1a: Name the three principle components of runtime that we discussed in class. How do they combine to yield runtime? Now, you have analyzed a benchmark that runs on your company s processor. This processor runs at 300MHz and has the following characteristics: Instruction Type Frequency (%) Cycles Arithmetic and logical 40 1 Load and Store 30 2 Branches 20 3 Floating Point 10 5 Your company is considering a cheaper, lower-performance version of the processor. Their plan is to remove some of the floating-point hardware to reduce the die size. The wafer on which the chip is produced has a diameter of 10cm, a cost of $2000, and a defect rate of 1 / (cm 2 ). The manufacturing process has an 80% wafer yield and a value of 2 for α. Here are some equations that you may find useful: π ( wafer diameter/2) 2 π wafer diameter dies/wafer = die area 2 die area α defects per unit area die area die yield = wafer yield 1+ α The current procesor has a die size of 12mm 12mm. The new chip has a die size of 10mm 10mm, and floating point instructions will take 12 cycles to execute. Problem 1b: What is the CPI and MIPS rating of the original processor? 3

Problem 1c: What is the CPI and MIPS rating of the new processor? Problem 1d: What is the original cost per (working) processor? Problem 1e: What is the new cost per (working) processor? Problem 1f: What is the improvement (if any) in price per performance? 4

Problem 2: Delay For a Full Adder A key component of an ALU is a full adder. A symbol for a full adder is: A B C out Full Adder S C in Problem 2a: Implement a full adder using as few 2-input AND, OR, and XOR gates as possible. Keep in mind that the Carry In signal may arrive much later than the A or B inputs. Thus, optimize your design (if possible) to have as few gates between Carry In and the two outputs as possible: 5

Assume the following characteristics for the gates: AND: Input load: 150fF, Propagation delay: TPlh=0.2ns, TPhl=0.5ns, Load-Dependent delay: TPlhf=.0020ns, TPhlf=.0021ns OR: Input load: 100fF Propagation delay: TPlh=0.5ns, TPhl=0.2ns Load-Dependent delay: TPlhf=.0020ns, TPhlf=.0021ns XOR: Input load: 200fF, Propagation delay: TPlh=.8ns, TPhl=.8ns Load-Dependent delay: TPlhf=.0040ns,TPhlf=.0042ns Problem 2b: Compute the input load for each of the 3 inputs to your full adder: Problem 2c: Identify two critical paths from the inputs to the Sum and the Carry Out signal. Compute the propagation delays for these critical paths based on the information given above. (You will have 2 numbers for each of these two paths): Problem 2d: Compute the Load Dependent delay for your two outputs. 6

Problem 3: Division Here is pseudo-code for an unsigned division algorithm. It is essentially the last divider (#3) that we developed in class. Assume that quotient and remainder are 32-bit global values, and the inputs divisor and dividend are also 32 bits. divide(dividend, divisor) { int count; /* Missing initialization instructions. */ } ROL64(remainder,quotient,0); while (count > 0) { count--; if (remainder divisor) { remainder = remainder divisor; temp = 1; } else { temp = 0; } ROL64(remainder,quotient,temp); } /* Something missing here */ The ROL64(hi,lo,inbit) pseudo-instruction treats hi and lo together as a 64-bit register. Assume that inbit contains only 0 or 1. ROL64 shifts the combined register left by one position, filling in the single bit from inbit into the lowest bit of lo. Problem 3a: Implement ROL64 as 5 MIPS instructions. Assume that $t0, $t1, and $t2 are the arguments. Hint: what happens if you use signed slt on unsigned numbers? Problem 3b: This divide algorithm is incomplete. It is missing some initialization and some final code. What is missing? 7

Problem 3c: Assume that you have a MIPS processor that is missing the divide instruction. Implement the above divide operation as a procedure. Assume dividend and divisor are in $a0 and $a1 respectively, and that remainder and quotient are returned in registers $v0 and $v1 respectively. You can use ROL64 as a pseudo-instruction that takes 3 registers. Don t use any other pseudo-instructions, however. Make sure to adher to MIPS register conventions, and optimize the loop as much as possible. Problem 3d: What is the CPI of your divide procedure (i.e. what is the total number of cycles to perform a divide)? Assume that each MIPS instruction takes 1 cycle. 8

[ This page left for scratch ] 9

Problem 4: New instructions for a single-cycle data path The Single-Cycle datapath developed in class is shown below (similar to the one in the book): Inst Memory Adr Rs <21:25> <11:15> <16:20> Rt Rd <0:15> Imm16 Instruction<31:0> imm16 4 PC Ext Adder Adder npc_sel 00 PC Mux Clk RegDst 1 0 Rs Rt RegWr 5 5 5 busw 32 Clk imm16 Rd Rt Rw Ra Rb 32 32-bit Registers 16 busa busb Extender 32 Equal 32 32 0 Mux 1 ALUctr 3 = ALU MemWr 32 32 WrEn Adr Data In Data Clk Memory MemtoReg 0 Mux 1 ExtOp ALUSrc It supports the following instructions. (Note that, as in the virtual machine, the branch is not delayed.) op rs rt rd shamt funct = MEM[PC] op rs rt Imm16 = MEM[PC] INST Register Transfers ADDU R[rd] R[rs] + R[rt]; PC PC + 4 SUBU R[rd] R[rs] - R[rt]; PC PC + 4 ORI R[rt] R[rs] + zero_ext(imm16); PC PC + 4 LW R[rt] MEM[ R[rs] + sign_ext(imm16)]; PC PC + 4 SW MEM[R[rs] + sign_ext(imm16)] R[rs]; PC PC + 4 BEQ if ( R[rs] == R[rt] ) then PC PC + sign_ext(imm16) 00 else PC PC + 4 Consider adding the following instructions: ADDIU, XOR, JAL, and BGEZAL (branch on greater than or equal to zero and link). This last instruction branches if the value of register rs 0; further, it saves the address of the next instruction in $ra (like JAL). Remember that the JAL format has a 6-bit opcode + 26 bits used as an offset... 10

Problem 4a: Describe/sketch the modifications needed to the datapath for each instruction. Try to add as little hardware as possible. Make sure that you are very clear about your changes. Also assume that the original datapath had only enough functionality to implement the original 5 instructions: Problem 4b: Specify control for each of the new instructions. Make sure to include values (possibly x ) for all of the control points. 11

Problem 5: Multiplication Problem 5a: Draw the datapath for a multi-cycle, 32-bit x 32-bit unsigned multiplier. Try to minimize hardware resources. Assume a single 32-bit adder. Further, to be consistent with MIPS, call the two 32-bit registers that contain the result hi and lo. Draw the control for this multiplier as a round oval with all control signals labeled. Problem 5b: Describe the algorithm for performing a multiplication with this hardware. You can use pseudo-code. Make sure to include any initialization that might be required. Also, make sure that any loops are labeled with the number of iterations. 12

[ Problem 5 continued ] Single-bit Booth encoding results from noticing that a sequence of ones can be represented by two non-zero values at the endpoints: 111111 = 1000000 1 = 100001 The encoding uses three symbols, namely: 1, 0,. (The 1 stands for -1 ). A more complicated example of Booth encoding, used on a two s-compliment number is the following: 1111111100111011 = 0000000101001101 To perform Booth encoding, we build a circuit that is able to recognize the beginning, middle, and end of a string of ones. When encoding a string of bits, we start at the far right. For each new bit, we base our decision of how to encode it on both the current bit and the previous bit (to the right). Problem 5c: Write a table describing the this encoding. It should have 2 input bits (current and previous) and should output a 2 bit value which is the two s compliment value of the encoded digit (representing 1,0, or 1): 13

Problem 5d: Modify your datapath to do 32x32 bit signed multiplication by Booth-encoding the multiplier (the operand which is shifted during multiplication). Draw the Booth-encoder as a black-box in your design that takes two bits as input and produces a 2-bit, two s complement encoding on the output. Assume that you have a 32-bit ALU that can either add or subtract. (Hint: Be careful about the sign-bit of the result during shifts. Also, be careful about the initial value of the previous bit.) Explain how how your algorithm is different from the previous multiplier. 14

Problem 5e: By encoding two bits at a time, we could potentially speed up multiplication a lot. To do this, you simply use your table from problem 5c on two successive bits at the same time. So, we are going to make a table that has three input bits (namely the 2 bits for encoding and one previous bit) and which has a single radix-2 symbol as an output. This output will be one of: 3,2,1,0,1,2, 3. To help in this, you will first list the two encoded bits that result from using the table in (5c) to encode bits In1/In0 and bits In0/Prev. Don t bother with two s complement, just use the symbols: 1,0, and 1. Call these results Enc1 and Enc0. Then, treat these two encoded bits as representing a composite number, with the left digit being in the 2s place (call this the output-2 column). So, a sample line from your table will be: Example: In1 In0 Prev Enc1 Enc0 Output 1 0 1 1 1 1 (this is 2x(-1) + 1 = -1) Write the complete 8-entry table (note again that you are leaving the output symbols as one of 3,2,1,0,1,2, 3 ): Problem 5f: Notice that 3 and 3 never show up in the output column. Explain why this is true. Also, explain why this is good for implementing radix-4 (two-bit at a time) multiplication. 15

Problem 5g [Extra Credit]: Draw a datapath that does signed multiplication, two bits at a time and include the multiplication algorithm. Draw the two-bit Booth encoder as a black box which implements output from the table in problem 5f. Make sure that you describe how the 5 possible output symbols (i.e. 2,1,0,1, and 2 ) are encoded (hint: two s complement is not the best solution here). As before, assume that you have a 32-bit ALU that can add and subtract: 16