Clever Signed Adder/Subtractor. Five Components of a Computer. The CPU. Stages of the Datapath (1/5) Stages of the Datapath : Overview

Similar documents
UC Berkeley CS61C : Machine Structures

CS61C : Machine Structures

CS3350B Computer Architecture Winter 2015

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Lecture 27: Single- Cycle CPU Datapath Design

Truth Table! Review! Use this table and techniques we learned to transform from 1 to another! ! Gate Diagram! Today!

Computer Science 61C Spring Friedland and Weaver. The MIPS Datapath

Review. N-bit adder-subtractor done using N 1- bit adders with XOR gates on input. Lecture #19 Designing a Single-Cycle CPU

inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture 18 CPU Design: The Single-Cycle I ! Nasty new windows vulnerability!

CS61C : Machine Structures

CS 61C: Great Ideas in Computer Architecture Datapath. Instructors: John Wawrzynek & Vladimir Stojanovic

CS61C : Machine Structures

UC Berkeley CS61C : Machine Structures

CS61C : Machine Structures

UC Berkeley CS61C : Machine Structures

UCB CS61C : Machine Structures

CS61C : Machine Structures

UC Berkeley CS61C : Machine Structures

CS61C : Machine Structures

CS61C : Machine Structures

CS 61C: Great Ideas in Computer Architecture. MIPS CPU Datapath, Control Introduction

CS61C : Machine Structures

Review. Pipeline big-delay CL for faster clock Finite State Machines extremely useful You ll see them again in 150, 152 & 164

I-Format Instructions (3/4) Define fields of the following number of bits each: = 32 bits

UCB CS61C : Machine Structures

Direct-Mapped Cache Terminology. Caching Terminology. TIO Dan s great cache mnemonic. Accessing data in a direct mapped cache

Review: New-School Machine Structures. Review: Direct-Mapped Cache. TIO Dan s great cache mnemonic. Memory Access without Cache

UCB CS61C : Machine Structures

CS61C : Machine Structures

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

CS61C : Machine Structures

Chapter 4. The Processor

CENG 3420 Lecture 06: Datapath

Lecture #6 Intro MIPS; Load & Store

Review: Abstract Implementation View

UCB CS61C : Machine Structures

Review. Motivation for Input/Output. What do we need to make I/O work?

CS61C : Machine Structures

UCB CS61C : Machine Structures

CS61C : Machine Structures

CS61C : Machine Structures

! CS61C : Machine Structures. Lecture 22 Caches I. !!Instructor Paul Pearce! ITʼS NOW LEGAL TO JAILBREAK YOUR PHONE!

ECE232: Hardware Organization and Design

Chapter 4. The Processor

ECE232: Hardware Organization and Design. Computer Organization - Previously covered

Computer Science 324 Computer Architecture Mount Holyoke College Fall Topic Notes: MIPS Instruction Set Architecture

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

EECS150 - Digital Design Lecture 10- CPU Microarchitecture. Processor Microarchitecture Introduction

CENG 3420 Computer Organization and Design. Lecture 06: MIPS Processor - I. Bei Yu

Chapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.

CS61C : Machine Structures

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Single- Cycle CPU Datapath Control Part 1

CS61C : Machine Structures

The Processor: Datapath and Control. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

ECE 486/586. Computer Architecture. Lecture # 7

Topic Notes: MIPS Instruction Set Architecture

CS61C : Machine Structures

CS3350B Computer Architecture Quiz 3 March 15, 2018

Processor. Han Wang CS3410, Spring 2012 Computer Science Cornell University. See P&H Chapter , 4.1 4


CS61C : Machine Structures

Review. Lecture #9 MIPS Logical & Shift Ops, and Instruction Representation I Logical Operators (1/3) Bitwise Operations

CS61C - Machine Structures. Lecture 6 - Instruction Representation. September 15, 2000 David Patterson.

Processor (I) - datapath & control. Hwansoo Han

The Processor (1) Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University

EECS150 - Digital Design Lecture 9- CPU Microarchitecture. Watson: Jeopardy-playing Computer

Midterm. Sticker winners: if you got >= 50 / 67

Chapter 4. The Processor

CS61C : Machine Structures

Chapter 4. The Processor Designing the datapath

CS61C : Machine Structures

The MIPS Processor Datapath

Chapter 4. The Processor. Instruction count Determined by ISA and compiler. We will examine two MIPS implementations

LECTURE 5. Single-Cycle Datapath and Control

CS61C : Machine Structures

Mark Redekopp and Gandhi Puvvada, All rights reserved. EE 357 Unit 15. Single-Cycle CPU Datapath and Control

CpE242 Computer Architecture and Engineering Designing a Single Cycle Datapath

CPU Organization (Design)

COMPUTER ORGANIZATION AND DESIGN. The Hardware/Software Interface. Chapter 4. The Processor: A Based on P&H

Chapter 4. The Processor

CS31001 COMPUTER ORGANIZATION AND ARCHITECTURE. Debdeep Mukhopadhyay, CSE, IIT Kharagpur. Instructions and Addressing

Computer Science 324 Computer Architecture Mount Holyoke College Fall Topic Notes: MIPS Instruction Set Architecture

Lecture 10: Simple Data Path

Review. Lecture 18 Running a Program I aka Compiling, Assembling, Linking, Loading

Chapter 4. The Processor. Computer Architecture and IC Design Lab

Computer and Information Sciences College / Computer Science Department The Processor: Datapath and Control

COMP303 - Computer Architecture Lecture 8. Designing a Single Cycle Datapath

ECE260: Fundamentals of Computer Engineering

Lecture Topics. Announcements. Today: Single-Cycle Processors (P&H ) Next: continued. Milestone #3 (due 2/9) Milestone #4 (due 2/23)

CS 351 Exam 2 Mon. 11/2/2015

UCB CS61C : Machine Structures

UC Berkeley CS61C : Machine Structures

Inequalities in MIPS (2/4) Inequalities in MIPS (1/4) Inequalities in MIPS (4/4) Inequalities in MIPS (3/4) UCB CS61C : Machine Structures

are Softw Instruction Set Architecture Microarchitecture are rdw

ECE468 Computer Organization and Architecture. Designing a Single Cycle Datapath

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Lecture 28: Single- Cycle CPU Datapath Control Part 1

UCB CS61C : Machine Structures

Lecture 33 Caches III What to do on a write hit? Block Size Tradeoff (1/3) Benefits of Larger Block Size

ELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 4: Datapath and Control

Chapter 2. Instructions:

CS222: Processor Design

Transcription:

Lecturer SOE Dan Garcia inst.eecs.berkeley.edu/~cs61c UCB CS61C : Machine Structures Lecture 24 Introduction to CPU design Hi to Vitaly Babiy from Albany, NY! 2008-03-21 Stanford researchers developing 3D camera with 12,616 lenses! This enables the reconstruction of the normal picture and a depth-map. It would also be possible to defocus the image after the fact, in software. Lots of uses for this: security (facial recognition), 3D scanning for VR worlds, 3D modeling of objects, etc. Very cool! news-service.stanford.edu/news/2008/ march19/camera%20-031908.html Clever Signed Adder/Subtractor A - B = A + (-B); how do we make -B? An xor is a Conditional Inverter! x y xor 0 0 0 0 1 1 1 0 1 1 1 0 CS61C L19 Running a Program II (2) Five Components of a Computer Processor Control Datapath Computer Memory (passive) (where programs, data live when running) Devices Input Output Keyboard, Mouse Disk (where programs, data live when not running) Display, Printer The CPU Processor (CPU): the active part of the computer, which does all the work (data manipulation and decision-making) Datapath: portion of the processor which contains hardware necessary to perform operations required by the processor (the brawn) Control: portion of the processor (also in hardware) which tells the datapath what needs to be done (the brain) CS61C L19 Running a Program II (3) CS61C L19 Running a Program II (4) Stages of the Datapath : Overview Problem: a single, atomic block which executes an instruction (performs all necessary operations beginning with fetching the instruction) would be too bulky and inefficient Solution: break up the process of executing an instruction into stages, and then connect the stages to create the whole datapath smaller stages are easier to design easy to optimize (change) one stage without touching the others Stages of the Datapath (1/5) There is a wide variety of MIPS instructions: so what general steps do they have in common? Stage 1: Instruction Fetch no matter what the instruction, the 32-bit instruction word must first be fetched from memory (the cache-memory hierarchy) also, this is where we Increment PC (that is, PC = PC + 4, to point to the next instruction: byte addressing so + 4) CS61C L19 Running a Program II (5) CS61C L19 Running a Program II (6)

CS61C L19 Running a Program II (7) Stages of the Datapath (2/5) Stage 2: Instruction Decode upon fetching the instruction, we next gather data from the fields (decode all necessary instruction data) first, read the opcode to determine instruction type and field lengths second, read in data from all necessary registers for add, read two registers for addi, read one register for jal, no reads necessary Stages of the Datapath (3/5) Stage 3: (Arithmetic-Logic Unit) the real work of most instructions is done here: arithmetic (+, -, *, /), shifting, logic (&, ), comparisons (slt) what about loads and stores? lw $t0, 40($t1) the address we are accessing in memory = the value in $t1 PLUS the value 40 so we do this addition in this stage CS61C L19 Running a Program II (8) Stages of the Datapath (4/5) Stages of the Datapath (5/5) Stage 4: Memory Access actually only the load and store instructions do anything during this stage; the others remain idle during this stage or skip it all together since these instructions have a unique step, we need this extra stage to account for them as a result of the cache system, this stage is expected to be fast Stage 5: Register Write most instructions write the result of some computation into a register examples: arithmetic, logical, shifts, loads, slt what about stores, branches, jumps? don t write anything into a register at the end these remain idle during this fifth stage or skip it all together CS61C L19 Running a Program II (9) CS61C L19 Running a Program II (10) Generic Steps of Datapath Administrivia Homework 5 due tonight rd rs rt Have a GREAT and RELAXING spring break! CS61C L19 Running a Program II (11) CS61C L19 Running a Program II (12)

CS61C L19 Running a Program II (13) Datapath Walkthroughs (1/3) add $r3,$r1,$r2 # r3 = r1+r2 Stage 2: decode to find it s an add, then read registers $r1 and $r2 Stage 3: add the two values retrieved in Stage 2 Stage 4: idle (nothing to write to memory) Stage 5: write result of Stage 3 into register $r3 Example: add Instruction 2 add r3, r1, r2 reg[2] reg[1]+reg[2] CS61C L19 Running a Program II (14) Datapath Walkthroughs (2/3) slti $r3,$r1,17 Stage 2: decode to find it s an slti, then read register $r1 Stage 3: compare value retrieved in Stage 2 with the integer 17 Stage 4: idle Stage 5: write the result of Stage 3 in register $r3 Example: slti Instruction slti r3, r1, reg[1]<17? CS61C L19 Running a Program II (15) CS61C L19 Running a Program II (16) Datapath Walkthroughs (3/3) sw $r3, 17($r1) Stage 2: decode to find it s a sw, then read registers $r1 and $r3 Stage 3: add 17 to value in register $r1 (retrieved in Stage 2) Stage 4: write value in register $r3 (retrieved in Stage 2) into memory address computed in Stage 3 Stage 5: idle (nothing to write into a register) Example: sw Instruction SW r3, 17(r1) reg[3] reg[1]+ MEM[r1+17]<=r CS61C L19 Running a Program II (17) CS61C L19 Running a Program II (18)

CS61C L19 Running a Program II (19) Why Five Stages? (1/2) Could we have a different number of stages? Yes, and other architectures do So why does MIPS have five if instructions tend to idle for at least one stage? The five stages are the union of all the operations needed by all the instructions. There is one instruction that uses all five stages: the load Why Five Stages? (2/2) lw $r3, 17($r1) Stage 2: decode to find it s a lw, then read register $r1 Stage 3: add 17 to value in register $r1 (retrieved in Stage 2) Stage 4: read value from memory address compute in Stage 3 Stage 5: write value found in Stage 4 into register $r3 CS61C L19 Running a Program II (20) Example: lw Instruction Datapath Summary LW r3, 17(r1) reg[1]+ MEM[r1+17] The datapath based on data transfers required to perform instructions A controller causes the right transfers to happen rd rs rt opcode, funct Controller CS61C L19 Running a Program II (21) CS61C L19 Running a Program II (22) What Hardware Is Needed? (1/2) What Hardware Is Needed? (2/2) PC: a register which keeps track of memory addr of the next instruction General Purpose Registers used in Stages 2 (Read) and 5 (Write) MIPS has 32 of these Memory used in Stages 1 (Fetch) and 4 (R/W) cache system makes these two stages as fast as the others, on average used in Stage 3 something that performs all necessary functions: arithmetic, logicals, etc. we ll design details later Miscellaneous Registers In implementations with only one stage per clock cycle, registers are inserted between stages to hold intermediate data and control signals as they travels from stage to stage. Note: Register is a general purpose term meaning something that stores bits. Not all registers are in the register file. CS61C L19 Running a Program II (23) CS61C L19 Running a Program II (24)

CS61C L19 Running a Program II (25) CPU clocking (1/2) For each instruction, how do we control the flow of information though the datapath? Single Cycle CPU: All stages of an instruction are completed within one long clock cycle. The clock cycle is made sufficient long to allow each instruction to complete all stages without interruption and within one cycle. CPU clocking (2/2) Multiple-cycle CPU: Only one stage of instruction per clock cycle. The clock is made as long as the slowest stage. For each instruction, how do we control the flow of information though the datapath? Several significant advantages over single cycle execution: Unused stages in a particular instruction can be skipped OR instructions can be pipelined (overlapped). CS61C L19 Running a Program II (26) Peer Instruction Peer Instruction A. If the destination reg is the same as the source reg, we could compute the incorrect value! B. We re going to be able to read 2 registers and write a 3 rd in 1 cycle C. Datapath is hard, Control is easy CS61C L19 Running a Program II (27) 0: FFF 1: FFT 2: FTF 3: FTT 4: TFF 5: TFT 6: TTF 7: TTT A. Truth table for mux with 4-bits of signals has 2 4 rows B. We could cascade N 1-bit shifters to make 1 N-bit shifter for sll, srl C. If 1-bit adder delay is T, the N-bit adder delay would also be T CS61C L19 Running a Program II (28) 1: FFF 2: FFT 3: FTF 4: FTT 5: TFF 6: TFT 7: TTF 8: TTT Peer Instruction Answer A. Truth table for mux with 4-bits of signals controls 16 inputs, for a total of 20 inputs, so truth table is 2 20 rows FALSE B. We could cascade N 1-bit shifters to make 1 N-bit shifter for sll, srl TRUE C. What about the cascading carry? FALSE A. Truth table for mux with 4-bits of signals is 2 4 rows long B. We could cascade N 1-bit shifters to make 1 N-bit shifter for sll, srl C. If 1-bit adder delay is T, the N-bit adder delay would also be T CS61C L19 Running a Program II (29) 1: FFF 2: FFT 3: FTF 4: FTT 5: TFF 6: TFT 7: TTF 8: TTT And In conclusion N-bit adder-subtractor done using N 1-bit adders with XOR gates on input XOR serves as conditional inverter CPU design involves Datapath,Control Datapath in MIPS involves 5 CPU stages 1. Instruction Fetch 2. Instruction Decode & Register Read 3. (Execute) 4. Memory 5. Register Write CS61C L19 Running a Program II (30)