Systems Architecture


Systems Architecture, Lecture 15: A Simple Implementation of MIPS
Jeremy R. Johnson, Anatole D. Ruslanov, William M. Mongan
Some or all figures from Computer Organization and Design: The Hardware/Software Interface, Third Edition, by David Patterson and John Hennessy, are copyrighted material (COPYRIGHT 2004 MORGAN KAUFMANN PUBLISHERS, INC. ALL RIGHTS RESERVED).
Lec 15 Systems Architecture 1

Introduction
Objective: To understand how to implement the MIPS instruction set.
- Combine components (registers, memory, ALU) and add control
- Fetch-execute cycle
Topics
- Sequential logic (elements with state) and timing (edge triggered)
  - Memory
  - Registers
- Datapath components: instruction memory, PC, adder, register file, ALU, data memory
- Implement a subset of MIPS in a single-cycle computer
- Shortcomings of a single-cycle computer

The Processor: Datapath & Control
Implementation of MIPS, simplified to contain only:
- memory-reference instructions: lw, sw
- arithmetic-logical instructions: add, sub, and, or, slt
- control flow instructions: beq, j
Generic implementation:
- use the program counter (PC) to supply the instruction address
- get the instruction from memory
- read registers
- use the instruction to decide exactly what to do

Instruction Execution
- PC -> instruction memory, fetch instruction
- Register numbers -> register file, read registers
- Depending on instruction class
  - Use ALU to calculate
    - Arithmetic result
    - Memory address for load/store
    - Branch target address
  - Access data memory for load/store
  - PC <- target address or PC + 4
12/22/2011 Chapter 4 The Processor 4

Abstract View
Two types of functional units:
- elements that operate on data values (combinational)
- elements that contain state (sequential)

Multiplexers
- Can't just join wires together
- Use multiplexers

Control

Timing
- Clocks are used in synchronous logic: when should an element that contains state be updated?
- Edge-triggered timing
[Figure: clock waveform showing the rising edge, the falling edge, and the cycle time]

Edge-Triggered Timing
- State is updated at a clock edge
- Within one clock cycle:
  - Read the contents of some state elements
  - Send the values through some combinational logic
  - Write the results to one or more state elements
[Figure: State element 1 -> Combinational logic -> State element 2, within one clock cycle]
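
The read, compute, write pattern on this slide can be sketched in a few lines of Python. This is only an illustration: the two state elements `a` and `b` and the logic between them are made up for the example. The point is that the combinational logic sees only the values stored before the edge, and all state elements change together at the edge.

```python
def clock_cycle(state: dict) -> dict:
    """Model one clock cycle of a hypothetical two-register circuit."""
    # Combinational logic reads only the values held before the clock edge.
    next_a = state["b"]
    next_b = state["a"] + state["b"]
    # At the clock edge every state element is updated at the same instant.
    return {"a": next_a, "b": next_b}


state = {"a": 1, "b": 2}
state = clock_cycle(state)
print(state)  # {'a': 2, 'b': 3}
```

Because `next_b` is computed from the old value of `a`, the result does not depend on the order in which the two registers are written.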

Logic Design Basics (4.2 Logic Design Conventions)
- Information encoded in binary
  - Low voltage = 0, High voltage = 1
  - One wire per bit
  - Multi-bit data encoded on multi-wire buses
- Combinational element
  - Operate on data
  - Output is a function of input
- State (sequential) elements
  - Store information

Combinational Elements
- AND gate: Y = A & B
- Adder: Y = A + B
- Multiplexer: Y = S ? I1 : I0
- Arithmetic/Logic Unit: Y = F(A, B)
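
As a rough software analogy, each combinational element is a pure function of its inputs. The sketch below assumes 32-bit buses (hence the mask) and uses a made-up string selector for the ALU function F; neither detail comes from the slide.

```python
MASK32 = 0xFFFFFFFF  # assumed bus width: results are truncated to 32 bits


def and_gate(a: int, b: int) -> int:
    return a & b


def adder(a: int, b: int) -> int:
    return (a + b) & MASK32


def mux(s: int, i0: int, i1: int) -> int:
    # Y = S ? I1 : I0
    return i1 if s else i0


def alu(f: str, a: int, b: int) -> int:
    # f is a hypothetical selector standing in for the ALU function input F.
    operations = {
        "and": a & b,
        "or": a | b,
        "add": (a + b) & MASK32,
        "sub": (a - b) & MASK32,
    }
    return operations[f]


print(mux(1, 5, 9))             # 9
print(hex(alu("sub", 3, 5)))    # 0xfffffffe (two's complement of -2)
```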

Sequential Elements
- Register: stores data in a circuit
  - Uses a clock signal to determine when to update the stored value
  - Edge-triggered: update when Clk changes from 0 to 1
[Figure: register with inputs D and Clk and output Q; timing diagram of Clk, D, and Q]

Sequential Elements
- Register with write control
  - Only updates on clock edge when write control input is 1
  - Used when stored value is required later
[Figure: register with inputs D, Clk, and Write and output Q; timing diagram of Clk, Write, D, and Q]
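
A register with write control can be modelled as an object whose stored value changes only when a rising edge arrives while Write is 1. The class and method names below are illustrative, not taken from the slides.

```python
class Register:
    """Edge-triggered register with a write-enable input (illustrative model)."""

    def __init__(self, value: int = 0) -> None:
        self.q = value  # output Q: the currently stored value

    def rising_edge(self, d: int, write: int = 1) -> None:
        # Called once per 0 -> 1 transition of Clk.
        # The stored value changes only if the write control input is 1.
        if write:
            self.q = d


reg = Register()
reg.rising_edge(7, write=0)
print(reg.q)  # 0: write was 0, so the old value is kept
reg.rising_edge(7, write=1)
print(reg.q)  # 7
```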

Clocking Methodology
- Combinational logic transforms data during clock cycles
  - Between clock edges
  - Input from state elements, output to state element
  - Longest delay determines clock period

Components for Simple Implementation
Functional units needed for each instruction:
- Instruction memory (input: instruction address; output: instruction)
- Program counter (PC)
- Adder (output: sum)
- Registers (inputs: read register 1, read register 2, and write register, each a 5-bit register number, plus write data and RegWrite; outputs: read data 1, read data 2)
- ALU (inputs: two data operands and the 3-bit ALU control; outputs: Zero, ALU result)
- Data memory unit (inputs: address, write data, MemWrite, MemRead; output: read data)
- Sign-extension unit (16 bits in, 32 bits out)

Instruction Fetch
- PC: 32-bit register
- Adder: increment by 4 for next instruction
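
A minimal sketch of the fetch step, assuming the instruction memory is modelled as a Python dict keyed by byte address (an assumption for illustration only). The two instruction words are the encodings of `add $t0,$s1,$s2` and `lw $t0,256($t1)`.

```python
MASK32 = 0xFFFFFFFF


def fetch(pc: int, instruction_memory: dict) -> tuple:
    """Return the instruction at PC and the incremented PC."""
    instruction = instruction_memory[pc]
    next_pc = (pc + 4) & MASK32  # each instruction is 4 bytes long
    return instruction, next_pc


# Hypothetical program image: byte address -> 32-bit instruction word.
program = {0: 0x02324020, 4: 0x8D280100}

instruction, pc = fetch(0, program)
print(hex(instruction), pc)  # 0x2324020 4
```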

R-Format Instructions
- Read two register operands
- Perform arithmetic/logical operation
- Write register result

Load/Store Instructions
- Read register operands
- Calculate address using 16-bit offset
  - Use ALU, but sign-extend offset
- Load: Read memory and update register
- Store: Write register value to memory
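
The address calculation for lw and sw can be sketched as follows. The function names are illustrative; the arithmetic is just base register plus sign-extended 16-bit offset, truncated to 32 bits.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed two's complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(base: int, offset16: int) -> int:
    """Address used by lw/sw: base register value plus sign-extended offset."""
    return (base + sign_extend16(offset16)) & MASK32


print(hex(effective_address(0x1000, 0x0100)))  # 0x1100
print(hex(effective_address(0x1000, 0xFFFC)))  # 0xffc (offset is -4)
```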

Branch Instructions
- Read register operands
- Compare operands
  - Use ALU, subtract and check Zero output
- Calculate target address
  - Sign-extend displacement
  - Shift left 2 places (word displacement)
  - Add to PC + 4
    - Already calculated by instruction fetch
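
The branch steps above can be sketched as two small functions: one for the target address and one for the choice between the target and PC + 4. Names and the example addresses are illustrative.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value: int) -> int:
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc: int, offset16: int) -> int:
    """(PC + 4) + (sign-extended word displacement shifted left by 2)."""
    pc_plus_4 = (pc + 4) & MASK32
    return (pc_plus_4 + (sign_extend16(offset16) << 2)) & MASK32


def next_pc_beq(pc: int, rs_value: int, rt_value: int, offset16: int) -> int:
    # The ALU subtracts the operands; Zero is asserted when they are equal.
    zero = ((rs_value - rt_value) & MASK32) == 0
    return branch_target(pc, offset16) if zero else (pc + 4) & MASK32


print(hex(branch_target(0x00400000, 25)))      # 0x400068 (PC + 4 + 100)
print(hex(next_pc_beq(0x00400000, 5, 6, 25)))  # 0x400004 (not taken)
```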

Branch Instructions
[Figure: datapath for branch instructions, with these callouts]
- Shift left 2: just re-routes wires
- Sign extend: sign-bit wire replicated

Composing the Elements
- First-cut data path does an instruction in one clock cycle
  - Each datapath element can only do one function at a time
  - Hence, we need separate instruction and data memories
- Use multiplexers where alternate data sources are used for different instructions

R-Type/Load/Store Datapath

Full Datapath

Adding Control
- Selecting the operations to perform (ALU, read/write, etc.)
- Controlling the flow of data (multiplexor inputs)
- Information comes from the 32 bits of the instruction

Format  Fields
R       op | rs | rt | rd | shamt | funct
I       op | rs | rt | 16-bit address
J       op | 26-bit address

MIPS Instructions

add $t0,$s1,$s2
  000000  10001  10010  01000  00000  100000
  op      rs     rt     rd     shamt  funct

lw $t0,256($t1)
  100011  01001  01000  0000 0001 0000 0000
  op      rs     rt     offset
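
The two encodings on this slide can be reproduced by packing the fields with shifts. The helper names are illustrative; the register numbers ($t0 = 8, $t1 = 9, $s1 = 17, $s2 = 18) follow the standard MIPS register convention and match the bit patterns shown.

```python
def encode_r(op: int, rs: int, rt: int, rd: int, shamt: int, funct: int) -> int:
    """Pack an R-format instruction: op(6) rs(5) rt(5) rd(5) shamt(5) funct(6)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i(op: int, rs: int, rt: int, imm16: int) -> int:
    """Pack an I-format instruction: op(6) rs(5) rt(5) immediate(16)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm16 & 0xFFFF)


T0, T1, S1, S2 = 8, 9, 17, 18  # register numbers

add_word = encode_r(0b000000, S1, S2, T0, 0, 0b100000)  # add $t0,$s1,$s2
lw_word = encode_i(0b100011, T1, T0, 256)               # lw $t0,256($t1)

print(f"{add_word:032b}")  # 00000010001100100100000000100000
print(f"{lw_word:032b}")   # 10001101001010000000000100000000
```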

MIPS Instructions Continued

beq $s1,$s2,25 => 100   (word offset 25 is byte offset 100)
  000100  10001  10010  0000 0000 0001 1001
  op      rs     rt     offset

j 1024 => 4096   (word address 1024 is byte address 4096, combined with bits 31-28 of PC+4)
  000010  00 0000 0000 0000 0100 0000 0000
  op      address
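
Going the other way, the fields can be pulled back out of a 32-bit word with shifts and masks. The `decode` helper below is a hypothetical name; the two test words are the beq and j encodings from this slide written in hexadecimal.

```python
def decode(word: int) -> dict:
    """Split a 32-bit MIPS instruction word into its possible fields."""
    return {
        "op": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "imm": word & 0xFFFF,          # I-format: 16-bit offset
        "address": word & 0x03FFFFFF,  # J-format: 26-bit word address
    }


beq_fields = decode(0x12320019)  # beq $s1,$s2,25
j_fields = decode(0x08000400)    # j 1024

# Both fields count words, so the byte distance is the field value times 4.
print(beq_fields["imm"], beq_fields["imm"] * 4)        # 25 100
print(j_fields["address"], j_fields["address"] * 4)    # 1024 4096
```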

Determining ALU Control Bits
ALUOp determined by instruction

Control lines  Function
000            and
001            or
010            add
110            sub
111            slt

Instruction  ALUOp  Instruction  funct   ALU     ALU
opcode              operation            action  control
LW           00     load word    xxxxxx  add     010
SW           00     store word   xxxxxx  add     010
BEQ          01     branch eq    xxxxxx  sub     110
R-type       10     add          100000  add     010
R-type       10     sub          100010  sub     110
R-type       10     and          100100  and     000
R-type       10     or           100101  or      001
R-type       10     slt          101010  slt     111

ALU Control
- Must describe hardware to compute 3-bit ALU control input
  - given instruction type: 00 = lw, sw; 01 = beq; 10 = arithmetic
  - function code for arithmetic
- ALUOp computed from instruction type
- Describe it using a truth table (can turn into gates)
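
One way to read the ALU control truth table is as a small lookup function: ALUOp alone decides the operation for loads, stores, and branches, and the funct field decides it for R-type instructions. This is a behavioural sketch rather than the gate-level logic; the function name is illustrative.

```python
# funct field -> 3-bit ALU control, used only when ALUOp = 10 (R-type)
FUNCT_TO_ALU_CONTROL = {
    0b100000: 0b010,  # add
    0b100010: 0b110,  # sub
    0b100100: 0b000,  # and
    0b100101: 0b001,  # or
    0b101010: 0b111,  # slt
}


def alu_control(alu_op: int, funct: int = 0) -> int:
    """Return the 3-bit ALU control value for a given ALUOp and funct."""
    if alu_op == 0b00:  # lw, sw: add to form the address (funct is don't-care)
        return 0b010
    if alu_op == 0b01:  # beq: subtract and test Zero (funct is don't-care)
        return 0b110
    if alu_op == 0b10:  # R-type: operation selected by funct
        return FUNCT_TO_ALU_CONTROL[funct]
    raise ValueError(f"unsupported ALUOp: {alu_op:02b}")


print(f"{alu_control(0b10, 0b101010):03b}")  # 111 (slt)
```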

Datapath with Control
[Figure: single-cycle datapath with control. The Control unit decodes Instruction [31-26] and drives RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, and RegWrite. Instruction [25-21] and Instruction [20-16] select the read registers; a multiplexer picks Instruction [20-16] or Instruction [15-11] as the write register; Instruction [15-0] is sign-extended from 16 to 32 bits; Instruction [5-0] feeds ALU control.]

Control Line Settings
8 control lines (control read/write and multiplexors)

Instruction  RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp
R-format     1       0       0         1         0        0         0       Func Code
lw           0       1       1         1         1        0         0       add
sw           X       1       X         0         0        1         0       add
beq          X       0       X         0         0        0         1       sub
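
The control line settings can be written down as a lookup table keyed by opcode. In this sketch `None` stands for a don't-care (X), and ALUOp uses the 2-bit codes from the ALU control slides (00 = add, 01 = sub, 10 = use funct). The sw opcode 101011 is the standard MIPS value and does not appear on the slides.

```python
X = None  # don't-care

# opcode -> control line settings
CONTROL = {
    0b000000: dict(name="R-format", RegDst=1, ALUSrc=0, MemtoReg=0, RegWrite=1,
                   MemRead=0, MemWrite=0, Branch=0, ALUOp=0b10),
    0b100011: dict(name="lw", RegDst=0, ALUSrc=1, MemtoReg=1, RegWrite=1,
                   MemRead=1, MemWrite=0, Branch=0, ALUOp=0b00),
    0b101011: dict(name="sw", RegDst=X, ALUSrc=1, MemtoReg=X, RegWrite=0,
                   MemRead=0, MemWrite=1, Branch=0, ALUOp=0b00),
    0b000100: dict(name="beq", RegDst=X, ALUSrc=0, MemtoReg=X, RegWrite=0,
                   MemRead=0, MemWrite=0, Branch=1, ALUOp=0b01),
}


def main_control(instruction: int) -> dict:
    """Look up the control lines from Instruction [31-26]."""
    opcode = (instruction >> 26) & 0x3F
    return CONTROL[opcode]


print(main_control(0x8D280100)["name"])  # lw
```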

R-Type Instruction

Load Instruction

Branch-on-Equal Instruction

Implementing Jumps
Jump format: opcode 2 in bits 31:26, address in bits 25:0
- Jump uses word address
- Update PC with concatenation of
  - Top 4 bits of old PC
  - 26-bit jump address
  - 00
- Need an extra control signal decoded from opcode
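
The concatenation can be sketched with a mask and a shift. This version takes the top 4 bits from PC + 4, which is what the earlier encoding slide indicates with [+PC+4[31-28]]; the function name and example addresses are illustrative.

```python
MASK32 = 0xFFFFFFFF


def jump_target(pc: int, address26: int) -> int:
    """Concatenate PC+4[31-28], the 26-bit jump address, and two zero bits."""
    upper4 = ((pc + 4) & MASK32) & 0xF0000000
    return upper4 | ((address26 & 0x03FFFFFF) << 2)


print(jump_target(0x00400000, 1024))       # 4096
print(hex(jump_target(0x10000008, 1024)))  # 0x10001000
```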

Datapath With Jumps Added

Shortcomings of a Single Cycle Implementation
- Limits reuse of hardware components
  - each functional unit can be used only once per cycle
  - e.g. separate instruction and data memories are required
- Inefficient
  - clock cycle determined by longest possible path in the machine
- E.g. assume time for:
  - Memory units = 200 ps
  - ALU and adders = 100 ps
  - Register file (read or write) = 50 ps

Instruction  Instruction  Register  ALU        Data    Register  Total
class        memory       read      operation  memory  write
R-type       200          50        100        0       50        400 ps
Load word    200          50        100        200     50        600 ps
Store word   200          50        100        200     0         550 ps
Branch       200          50        100        0       0         350 ps
Jump         200          0         0          0       0         200 ps
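
The totals in the table follow from adding the delays of the units each instruction class uses, and the single-cycle clock period is the largest total. A short sketch, with illustrative dictionary names:

```python
UNIT_DELAY_PS = {
    "instruction memory": 200,
    "register read": 50,
    "ALU operation": 100,
    "data memory": 200,
    "register write": 50,
}

# Functional units on the path of each instruction class.
UNITS_USED = {
    "R-type": ["instruction memory", "register read", "ALU operation", "register write"],
    "Load word": ["instruction memory", "register read", "ALU operation",
                  "data memory", "register write"],
    "Store word": ["instruction memory", "register read", "ALU operation", "data memory"],
    "Branch": ["instruction memory", "register read", "ALU operation"],
    "Jump": ["instruction memory"],
}


def class_latency_ps(instruction_class: str) -> int:
    return sum(UNIT_DELAY_PS[unit] for unit in UNITS_USED[instruction_class])


latencies = {name: class_latency_ps(name) for name in UNITS_USED}
clock_period_ps = max(latencies.values())  # set by the slowest class (load word)

print(latencies)
print(clock_period_ps)  # 600
```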

Single Cycle Model is inefficient!
Assume 25% loads, 10% stores, 45% ALU instructions, 15% branches, and 5% jumps.

CPU execution time = Instruction count x CPI x Clock cycle time

$$\text{Performance ratio} = \frac{\text{CPU performance (multicycle impl.)}}{\text{CPU performance (single-cycle impl.)}} = \frac{\text{CPU exec. time (single-cycle impl.)}}{\text{CPU exec. time (multicycle impl.)}}$$

$$= \frac{600}{600 \times 25\% + 550 \times 10\% + 400 \times 45\% + 350 \times 15\% + 200 \times 5\%} = \frac{600\ \text{ps}}{447.5\ \text{ps}} \approx 1.34$$

The denominator is the average time per instruction when each instruction class takes only as long as its own path requires, so that implementation is about 1.34 times faster.
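
The arithmetic can be checked with a few lines of Python. The per-class times are the totals from the previous slide; the variable names are illustrative.

```python
# Time per instruction class in ps, and the assumed instruction mix.
latency_ps = {"load": 600, "store": 550, "alu": 400, "branch": 350, "jump": 200}
mix = {"load": 0.25, "store": 0.10, "alu": 0.45, "branch": 0.15, "jump": 0.05}

# Single cycle: every instruction takes the longest time (600 ps).
single_cycle_ps = max(latency_ps.values())

# Weighted average when each class takes only the time it needs.
average_ps = sum(latency_ps[name] * mix[name] for name in mix)

ratio = single_cycle_ps / average_ps
print(f"{average_ps:.1f} ps, ratio {ratio:.2f}")  # 447.5 ps, ratio 1.34
```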