Processor Design CSCE Instructor: Saraju P. Mohanty, Ph. D. NOTE: The figures, text etc included in slides are borrowed

Size: px
Start display at page:

Download "Processor Design CSCE Instructor: Saraju P. Mohanty, Ph. D. NOTE: The figures, text etc included in slides are borrowed"

Transcription

1 Lecture 3: General Purpose Processor Design CSCE 665 Advanced VLSI Systems Instructor: Saraju P. ohanty, Ph. D. NOTE: The figures, tet etc included in slides are borrowed from various books, websites, authors pages, and other sources for academic purpose only. The instructor does not claim any originality.

2 Lecture Outline General Purpose Processor Program Eecution Construction of a Simple IPS Processor Single Cycle Processor ulticycle Processor Pipelined Processor 2

3 High Level Language Program Levels of Representation Compiler Assembly Language Program temp = v[k]; v[k] = v[k+]; v[k+] = temp; lw $5, ($2) lw $6, 4($2) sw $6, ($2) Assembler sw $5, 4($2) achine Language Program achine Interpretation Control Signal Specification OP[:3] <= InstReg[9:] & ASK 3

4 Eecution Cycle Fetch Decode Operand Fetch Eecute Result Store Net Obtain instruction from program storage Determine required actions and instruction size Locate and obtain operand Compute value or status Deposit s in storage for later use Determine successor instruction 4

5 The Processor: path & Control Let us look at an implementation of the IPS Simplified to contain only: -reference instructions: lw, sw arithmetic-logical instructions: add, sub, and, or, slt control flow instructions: beq, j Generic Implementation: use the program counter (PC) to supply instruction address get the instruction from read registers use the instruction to decide eactly what to do 5

6 ore Implementation Details Abstract / Simplified View: Register # PC ress Registers Register # Register # ress Two types of functional units: elements that operate on values (combinational) elements that contain state (sequential) 6

7 Register File: Operation Built using D flip-flops (Combinational in nature) register number register number 2 Register Register Register n Register n u u R ead 2 7

8 Register File: Operation We still use the real clock to determine when to write C D Register Register number n-to- decoder n C D Register n Register C Register n D C D Register n 8

9 Register File: Block Diagram register number register number 2 register Register file 2 Three ress ports One Input Port Two Output t Ports One Control Signal 9

10 a: emory Functional Units - I After an instruction ti address is put, the instruction ti residing at the address appears at the output port b: Program Counter -- A simple upcounter c: er -- A 2 s complement adder address PC Sum a. b. Program counter c. er

11 a: Register File Functional Units - II It s construction, read, and write operations as discussed previously b: (Arithmetic ti & Logic Unit) Recall the design we have discussed in last two classes Note the Zero output control 5 3 register Register 5 numbers register 2 Zero Registers 5 register 2 Reg a. Register File b.

12 a: emory Unit Functional Units -III Similar to emory Unit, only that it can written into as well Two input ports for address and, one output port (for read out) Two control signals: for and operations b: Sign Etension Unit Etends 6-bit input operand to 32 bits em ress 6 Sig n 32 etend em a. m emory unit b. Sign-etension unit 2

13 path for Fetch (Piece I) Fetching s and Incrementing the program counter by 4 4 PC ad dre ss 3

14 path for R-type s (Piece II) path for R-type s register register 2 Registers register 2 3 operation Zero Reg 4

15 path for Load/Store (Piece III) path for load or store () Register Access; (2) emory ress calculation; (3) / (4) into Register file (if the instruction is a load) 3 operation register register 2 Zero Registers register 2 Reg ress em 6 Sign 32 etend em 5

16 path for Branch (Piece IV) Unit Shift left 2 adds at the low-order end of the sign-etended offset Control logic is used to decide whether the incremented PC or branch target should replace the PC based on the Zero output of the PC + 4 from instruction path Sum Branch target Shift left 2 register register 2 Registers register Reg 2 6 Sign 32 etend operation 3 To branch Zero control logic 6

17 path Construction Strategy Now, we have pieces of path that are capable of performing distinct functions We want to stitch them together to yield a final path that can eecute all the instructions (lw, sw, add, sub, and, or, slt, beq, j) We will use multipleors (or mues for short) for stitching the path 7

18 i register l path Construction (erge Pieces II & III) Piece II + Piece III ri r register register 2 Registers register 2 register register 2 Registers register Reg Reg 2 6 Sign 32 etend 3 operation Zero 3 operation Zero ress em em 8

19 And you will get.. Rule: Whenever we have more than one input feeding a functional unit, introduce a multipleor (this gives rise to a control signal, more later..) Registers register register 2 register 2 Reg Src u operation 3 em Zero ress emtoreg u 6 32 Sign etend em 9

20 path Construction contd (merge Piece I) Just tack the Fetch and PC increment logic at the front! 4 PC address Registers register register 2 register Reg 2 Src u operation 3 em Zero ress emtoreg u 6 Sign 32 etend em 2

21 atapath Construction (Finally, merge Piece IV) PCSrc u 4 Shift left 2 PC address Registers register register 2 register Reg 2 6 Sign 32 etend Src u 3 operation em Zero ress em emtoreg u 2

22 Final path u 4 Reg Shift left 2 PCSrc PC address [3 ] [25 2] [2 6] [5 ] [5 ] register Src register 2 Zero 2 u u RegDst flows through various paths under the influence of control signals There are seven control signals (of type,, or u Select) register Registers 6 Sign 32 etend ti [5 ] control Op em ress em emtoreg u 22

23 Defining the Control.. Selecting the operations to perform (, read/write, etc.) Controlling the flow of (multipleor inputs) Information comes from the 32 bits of the instruction Eample: add $8, $7, $8 Format: op rs rt rd shamt funct 's operation based on instruction type and function code We will design two control units: () Control to generate appropriate function select signals for the (2) ain Control to generate signals for functional units other than the 23

24 Control - Truth Table & Implementation Describe it using a truth table (can turn into gates): Op Funct field Operation Op Op F5 F4 F3 F2 F F X X X X X X X X X X X X X X X X X X X X X X X X X X X X Op Op Op control block F3 Operation2 F (5 ) F2 F F Operation Operation Operation 24

25 Designing the ain Control 4 [3 26] Control RegDst Branch em emtoreg Op em Src Reg Shift left 2 u PC address [3 ] [25 2] [2 6] [5 ] register register 2 Registers 2 register u u Zero ress u [5 ] 6 Sign 32 etend control [5 ] 25

26 Control Signals and their Effects Signal Name Effect When deasserted Effect when asserted RegDst The register destination number for the The register destination number for the register comes from the rt field (bits 2-6) register comes from the rd field (bits 5-) Reg NONE The register on the register input is written with the value on the input Src The second operand comes from the The second operand is the sign-etended second register file output ( 2) lower 6 bits of the instruction PCSrc The PC is replaced by the output of the adder that The PC is replaced by the output of the adder computes the value of PC + 4. that computes the branch target em None emory contents t designated d by the address input are put on the output em None contents designated by the address input are replaced by the value on the input emtoreg The value fed to the register input The value fed to the register input comes comes from the from the. 26

27 ain Control: Truth Table & Implementation emto- Reg em em RegDst Src Reg Branch Op p R-format lw sw X X beq X X Inputs Op5 O p 4 Op3 Op2 Op O p R -fo r m a t Iw s w b e q Outputs RegDst Src emtoreg R e g W rite em em Branch Op OpO 27

28 Our Simple Control Structure All of the logic is combinational We wait for everything to settle down, and the right thing to be done might not produce right answer right away we use write signals along with clock to determine when to write Cycle time determined by length of the longest path State element Combinational logic State element 2 Clock cycle We are ignoring some details like setup and hold times 28

29 We designed a Single Cycle Implementation Calculate cycle time assuming negligible delays ecept: (2ns), and adders (2ns), register file access (ns) PCSrc u 4 Reg Shift left 2 PC address [3 ] [25 2] [2 6] u [5 ] RegDst [5 ] register register 2 2 register Registers 6 Sign 32 etend Src u control Zero em ress em emtoreg u [5 ] Op 29

30 How does the single cycle path works? Let us understand this by highlighting the portions of the path when an R-type instruction is eecuted For an R-type instruction we go through the following phases: Phase : Fetch Phase 2: Register File Phase 3: eecution Phase 4: the Result into the Register File NOTE: All the four phases are completed in only ONE clock cycle and hence it is a single cycle implementation 3

31 R-type Phase ( Fetch) u 4 [3 26] Control RegDst Branch em emtoreg Op em Src Reg Shift left 2 PC address [3 ] ti [25 2] 2] [2 6] [5 ] register register 2 Registers 2 register u u Zero ress u [5 ] 6 Sign 32 etend control [5 ] 3

32 R-type Phase 2 (Register ) u 4 [3 26] Control RegDst Branch em emtoreg Op em Src Reg Shift left 2 PC address [3 ] ti [25 2] 2] [2 6] [5 ] register register 2 Registers 2 register u u Zero ress u [5 ] 6 Sign 32 etend control [5 ] 32

33 R-type Phase 3 ( eecution) u 4 [3 26] Control RegDst Branch em emtoreg Op em Src Reg Shift left 2 PC address [3 ] [25 2] [2 6] [5 ] register register 2 Registers 2 u register u Zero ress u [5 ] 6 Sign 32 etend control [5 ] 33

34 R-type Phase 4 ( the Result) u 4 [3 26] Control RegDst Branch em emtoreg Op em Src Reg Shift left 2 PC address [3 ] ti [25 2] 2] [2 6] [5 ] register register 2 Registers 2 register u u Zero ress u [5 ] 6 Sign 32 etend control [5 ] 34

35 How do we handle jump? [25 ] Shift Jump address [3 ] left PC+4 [3 28] [3 26] Control RegDst Jump Branch em emtoreg Op em Src Reg Shift left 2 u u PC address [3 ] [25 2] [2 6] [5 ] register register 2 Zero Registers 2 u register u ress u [5 ] 6 Sign 32 etend control [5 ] 35

36 Where we are headed Single Cycle Problems: what if we had a more complicated instruction like floating point? wasteful of area One Solution: use a smaller ccletime cycle have different instructions take different numbers of cycles a multicycle path: 36

37 Single Cycle Implementation: Summary All instructions are eecuted in only clock cycle We built a single cycle path from scratch We designed appropriate controller to generate correct correct signals All instructions are not born equal; that some require more work, some less => disadvantage of single cycle implementation is that the slowest instruction determines the clock cycle width In reality, no body implements single cycle approach Suggestion: STARE, STARE, STARE at the single cycle path to familiarize yourself Given the single cycle path, you should be able to highlight active portions of the path for any given instruction ( just as we did for an R-type instruction in the class) 37

38 Single Cycle Implementation - Recap Single Cycle Problems: what if we had a more complicated instruction like floating point? wasteful of area Cycle width determined by the slowest instruction One Solution: use a smaller cycle time have different instructions ti take different numbers of cycles a multicycle path: 38

39 ulticycle Approach High Level View PC ress emory or register emory register Register # Registers Register # Register # A B Out 39

40 Single Cycle, ultiple Cycle, vs. Pipeline Clk Cycle Single Cycle Implementation: Cycle 2 Load Store Waste Clk Cycle Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 Cycle 8 Cycle 9 Cycle ultiple Cycle Implementation: Load Store Ifetch Reg Eec em Wr Ifetch Reg Eec em R-type Ifetch Pipeline Implementation: Load Ifetch Reg Eec em Wr Store Ifetch Reg Eec em Wr R-type Ifetch Reg Eec em Wr 4

41 Basic Idea IF: fetch u ID: decode/ register file read EX: Eecute/ address calculation E: emory access WB: back 4 Shift left 2 PC ress register register 2 Registers 2 register u Zero ress u 6 Sign etend 32 4

42 u Corrected path IF/ID ID/EX EX/E E/WB 4 Shift left 2 PC ress Instructio on register register 2 Registers 2 register u Zero ress u 6 Sign etend 32 42

43 PCSrc path with Control u Control ID/EX WB EX/E WB E/WB IF/ID EX WB PC 4 ress ruction Instr Reg register register 2 Registers 2 register Shift left 2 u Src Zero Branch em ress em mtoreg u [5 ] 6 32 Sign etend 6 control em [2 6] [5 ] Op u RegDst 43

44 Pipelining Summary Pipelining doesn t help latency of single task, it helps throughput of entire workload ultiple tasks operating simultaneously using different resources Potential speedup = Number pipe stages Pipeline rate limited by slowest pipeline stage Unbalanced lengths of pipe stages reduces speedup Time to fill pipeline and time to drain it reduces speedup Three types of pipeline hazards: structural,, and control/branch Stalling helps any kind of hazard hazard solutions: Stalling, forwarding, Hazard detection Control or Branch Hazard solutions: Stalling, Branch Prediction, Delayed Branching 44

Lecture 5: The Processor

Lecture 5: The Processor Lecture 5: The Processor CSCE 26 Computer Organization Instructor: Saraju P. ohanty, Ph. D. NOTE: The figures, text etc included in slides are borrowed from various books, websites, authors pages, and

More information

Lecture 6: Pipelining

Lecture 6: Pipelining Lecture 6: Pipelining i CSCE 26 Computer Organization Instructor: Saraju P. ohanty, Ph. D. NOTE: The figures, text etc included in slides are borrowed from various books, websites, authors pages, and other

More information

CENG 3420 Computer Organization and Design. Lecture 06: MIPS Processor - I. Bei Yu

CENG 3420 Computer Organization and Design. Lecture 06: MIPS Processor - I. Bei Yu CENG 342 Computer Organization and Design Lecture 6: MIPS Processor - I Bei Yu CEG342 L6. Spring 26 The Processor: Datapath & Control q We're ready to look at an implementation of the MIPS q Simplified

More information

CPE 335 Computer Organization. Basic MIPS Architecture Part I

CPE 335 Computer Organization. Basic MIPS Architecture Part I CPE 335 Computer Organization Basic MIPS Architecture Part I Dr. Iyad Jafar Adapted from Dr. Gheith Abandah slides http://www.abandah.com/gheith/courses/cpe335_s8/index.html CPE232 Basic MIPS Architecture

More information

Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1

Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1 Lecture 3: The Processor (Chapter 4 of textbook) Chapter 4.1 Introduction Chapter 4.1 Chapter 4.2 Review: MIPS (RISC) Design Principles Simplicity favors regularity fixed size instructions small number

More information

CENG 3420 Lecture 06: Datapath

CENG 3420 Lecture 06: Datapath CENG 342 Lecture 6: Datapath Bei Yu byu@cse.cuhk.edu.hk CENG342 L6. Spring 27 The Processor: Datapath & Control q We're ready to look at an implementation of the MIPS q Simplified to contain only: memory-reference

More information

Review: Abstract Implementation View

Review: Abstract Implementation View Review: Abstract Implementation View Split memory (Harvard) model - single cycle operation Simplified to contain only the instructions: memory-reference instructions: lw, sw arithmetic-logical instructions:

More information

Full Datapath. CSCI 402: Computer Architectures. The Processor (2) 3/21/19. Fengguang Song Department of Computer & Information Science IUPUI

Full Datapath. CSCI 402: Computer Architectures. The Processor (2) 3/21/19. Fengguang Song Department of Computer & Information Science IUPUI CSCI 42: Computer Architectures The Processor (2) Fengguang Song Department of Computer & Information Science IUPUI Full Datapath Branch Target Instruction Fetch Immediate 4 Today s Contents We have looked

More information

EEC 483 Computer Organization

EEC 483 Computer Organization EEC 483 Compter Organization Chapter 4.4 A Simple Implementation Scheme Chans Y The Big Pictre The Five Classic Components of a Compter Processor Control emory Inpt path Otpt path & Control 2 path and

More information

comp 180 Lecture 25 Outline of Lecture The ALU Control Operation & Design The Datapath Control Operation & Design HKUST 1 Computer Science

comp 180 Lecture 25 Outline of Lecture The ALU Control Operation & Design The Datapath Control Operation & Design HKUST 1 Computer Science Outline of Lecture The Control Operation & Design The Datapath Control Operation & Design HKST 1 Computer Science Control After the design of partial single IPS datapath, we need to add the control unit

More information

CSCI 402: Computer Architectures. Fengguang Song Department of Computer & Information Science IUPUI. Today s Content

CSCI 402: Computer Architectures. Fengguang Song Department of Computer & Information Science IUPUI. Today s Content 3/6/8 CSCI 42: Computer Architectures The Processor (2) Fengguang Song Department of Computer & Information Science IUPUI Today s Content We have looked at how to design a Data Path. 4.4, 4.5 We will design

More information

Single-Cycle Examples, Multi-Cycle Introduction

Single-Cycle Examples, Multi-Cycle Introduction Single-Cycle Examples, ulti-cycle Introduction 1 Today s enu Single cycle examples Single cycle machines vs. multi-cycle machines Why multi-cycle? Comparative performance Physical and Logical Design of

More information

The Processor. Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut. CSE3666: Introduction to Computer Architecture

The Processor. Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut. CSE3666: Introduction to Computer Architecture The Processor Z. Jerry Shi Department of Computer Science and Engineering University of Connecticut CSE3666: Introduction to Computer Architecture Introduction CPU performance factors Instruction count

More information

Computer and Information Sciences College / Computer Science Department The Processor: Datapath and Control

Computer and Information Sciences College / Computer Science Department The Processor: Datapath and Control Computer and Information Sciences College / Computer Science Department The Processor: Datapath and Control Chapter 5 The Processor: Datapath and Control Big Picture: Where are We Now? Performance of a

More information

Designing a Pipelined CPU

Designing a Pipelined CPU Designing a Pipelined CPU CSE 4, S2'6 Review -- Single Cycle CPU CSE 4, S2'6 Review -- ultiple Cycle CPU CSE 4, S2'6 Review -- Instruction Latencies Single-Cycle CPU Load Ifetch /Dec Exec em Wr ultiple

More information

Chapter 4 (Part II) Sequential Laundry

Chapter 4 (Part II) Sequential Laundry Chapter 4 (Part II) The Processor Baback Izadi Division of Engineering Programs bai@engr.newpaltz.edu Sequential Laundry 6 P 7 8 9 10 11 12 1 2 A T a s k O r d e r A B C D 30 30 30 30 30 30 30 30 30 30

More information

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle

More information

What do we have so far? Multi-Cycle Datapath (Textbook Version)

What do we have so far? Multi-Cycle Datapath (Textbook Version) What do we have so far? ulti-cycle Datapath (Textbook Version) CPI: R-Type = 4, Load = 5, Store 4, Branch = 3 Only one instruction being processed in datapath How to lower CPI further? #1 Lec # 8 Summer2001

More information

CO Computer Architecture and Programming Languages CAPL. Lecture 18 & 19

CO Computer Architecture and Programming Languages CAPL. Lecture 18 & 19 CO2-3224 Computer Architecture and Programming Languages CAPL Lecture 8 & 9 Dr. Kinga Lipskoch Fall 27 Single Cycle Disadvantages & Advantages Uses the clock cycle inefficiently the clock cycle must be

More information

Chapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.

Chapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor. COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor The Processor - Introduction

More information

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition The Processor - Introduction

More information

Chapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts

Chapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts CS359: Computer Architecture Chapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts Yanyan Shen Department of Computer Science and Engineering Shanghai Jiao Tong University Parallel

More information

Ch 5: Designing a Single Cycle Datapath

Ch 5: Designing a Single Cycle Datapath Ch 5: esigning a Single Cycle path Computer Systems Architecture CS 365 The Big Picture: Where are We Now? The Five Classic Components of a Computer Processor Control Memory path Input Output Today s Topic:

More information

ELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 4: Datapath and Control

ELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 4: Datapath and Control ELEC 52/62 Computer Architecture and Design Spring 217 Lecture 4: Datapath and Control Ujjwal Guin, Assistant Professor Department of Electrical and Computer Engineering Auburn University, Auburn, AL 36849

More information

Pipelining: Basic Concepts

Pipelining: Basic Concepts Pipelining: Basic Concepts Prof. Cristina Silvano Dipartimento di Elettronica e Informazione Politecnico di ilano email: silvano@elet.polimi.it Outline Reduced Instruction Set of IPS Processor Implementation

More information

ﻪﺘﻓﺮﺸﻴﭘ ﺮﺗﻮﻴﭙﻣﺎﻛ يرﺎﻤﻌﻣ MIPS يرﺎﻤﻌﻣ data path and ontrol control

ﻪﺘﻓﺮﺸﻴﭘ ﺮﺗﻮﻴﭙﻣﺎﻛ يرﺎﻤﻌﻣ MIPS يرﺎﻤﻌﻣ data path and ontrol control معماري كامپيوتر پيشرفته معماري MIPS data path and control abbasi@basu.ac.ir Topics Building a datapath support a subset of the MIPS-I instruction-set A single cycle processor datapath all instruction actions

More information

CSE 2021 Computer Organization. Hugh Chesser, CSEB 1012U W9-W

CSE 2021 Computer Organization. Hugh Chesser, CSEB 1012U W9-W CSE 22 Computer Organization Hugh Chesser, CSEB 2U Agenda Topics:. Single Cycle Review (Sample Exam/Quiz Q) 2. ultiple cycle implementation Patterson: Section 4.5 Reminder: Quiz #2 Next Wednesday (November

More information

T = I x CPI x C. Both effective CPI and clock cycle C are heavily influenced by CPU design. CPI increased (3-5) bad Shorter cycle good

T = I x CPI x C. Both effective CPI and clock cycle C are heavily influenced by CPU design. CPI increased (3-5) bad Shorter cycle good CPU performance equation: T = I x CPI x C Both effective CPI and clock cycle C are heavily influenced by CPU design. For single-cycle CPU: CPI = 1 good Long cycle time bad On the other hand, for multi-cycle

More information

LECTURE 5. Single-Cycle Datapath and Control

LECTURE 5. Single-Cycle Datapath and Control LECTURE 5 Single-Cycle Datapath and Control PROCESSORS In lecture 1, we reminded ourselves that the datapath and control are the two components that come together to be collectively known as the processor.

More information

ECE473 Computer Architecture and Organization. Processor: Combined Datapath

ECE473 Computer Architecture and Organization. Processor: Combined Datapath Computer Architecture and Organization Processor: Combined path Lecturer: Prof. Yifeng Zhu Fall, 2014 Portions of these slides are derived from: Dave Patterson CB 1 Where are we? Want to build a processor

More information

Pipelining. Chapter 4

Pipelining. Chapter 4 Pipelining Chapter 4 ake processor rns faster Pipelining is an implementation techniqe in which mltiple instrctions are overlapped in eection Key of making processor fast Pipelining Single cycle path we

More information

The overall datapath for RT, lw,sw beq instrucution

The overall datapath for RT, lw,sw beq instrucution Designing The Main Control Unit: Remember the three instruction classes {R-type, Memory, Branch}: a) R-type : Op rs rt rd shamt funct 1.src 2.src dest. 31-26 25-21 20-16 15-11 10-6 5-0 a) Memory : Op rs

More information

CPE 335. Basic MIPS Architecture Part II

CPE 335. Basic MIPS Architecture Part II CPE 335 Computer Organization Basic MIPS Architecture Part II Dr. Iyad Jafar Adapted from Dr. Gheith Abandah slides http://www.abandah.com/gheith/courses/cpe335_s08/index.html CPE232 Basic MIPS Architecture

More information

Processor (I) - datapath & control. Hwansoo Han

Processor (I) - datapath & control. Hwansoo Han Processor (I) - datapath & control Hwansoo Han Introduction CPU performance factors Instruction count - Determined by ISA and compiler CPI and Cycle time - Determined by CPU hardware We will examine two

More information

The Processor: Datapath & Control

The Processor: Datapath & Control Chapter Five 1 The Processor: Datapath & Control We're ready to look at an implementation of the MIPS Simplified to contain only: memory-reference instructions: lw, sw arithmetic-logical instructions:

More information

Processor Architecture. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Processor Architecture. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University Processor Architecture Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Moore s Law Gordon Moore @ Intel (1965) 2 Computer Architecture Trends (1)

More information

Chapter 4. The Processor

Chapter 4. The Processor Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified

More information

Chapter 4. The Processor

Chapter 4. The Processor Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware 4.1 Introduction We will examine two MIPS implementations

More information

Computer Architecture. Lecture 6.1: Fundamentals of

Computer Architecture. Lecture 6.1: Fundamentals of CS3350B Computer Architecture Winter 2015 Lecture 6.1: Fundamentals of Instructional Level Parallelism Marc Moreno Maza www.csd.uwo.ca/courses/cs3350b [Adapted from lectures on Computer Organization and

More information

Chapter 5: The Processor: Datapath and Control

Chapter 5: The Processor: Datapath and Control Chapter 5: The Processor: Datapath and Control Overview Logic Design Conventions Building a Datapath and Control Unit Different Implementations of MIPS instruction set A simple implementation of a processor

More information

Systems Architecture

Systems Architecture Systems Architecture Lecture 15: A Simple Implementation of MIPS Jeremy R. Johnson Anatole D. Ruslanov William M. Mongan Some or all figures from Computer Organization and Design: The Hardware/Software

More information

Improve performance by increasing instruction throughput

Improve performance by increasing instruction throughput Improve performance by increasing instruction throughput Program execution order Time (in instructions) lw $1, 100($0) fetch 2 4 6 8 10 12 14 16 18 ALU Data access lw $2, 200($0) 8ns fetch ALU Data access

More information

Chapter Six. Dataı access. Reg. Instructionı. fetch. Dataı. Reg. access. Dataı. Reg. access. Dataı. Instructionı fetch. 2 ns 2 ns 2 ns 2 ns 2 ns

Chapter Six. Dataı access. Reg. Instructionı. fetch. Dataı. Reg. access. Dataı. Reg. access. Dataı. Instructionı fetch. 2 ns 2 ns 2 ns 2 ns 2 ns Chapter Si Pipelining Improve perfomance by increasing instruction throughput eecutionı Time lw $, ($) 2 6 8 2 6 8 access lw $2, 2($) 8 ns access lw $3, 3($) eecutionı Time lw $, ($) lw $2, 2($) 2 ns 8

More information

PS Midterm 2. Pipelining

PS Midterm 2. Pipelining PS idterm 2 Pipelining Seqential Landry 6 P 7 8 9 idnight Time T a s k O r d e r A B C D 3 4 2 3 4 2 3 4 2 3 4 2 Seqential landry takes 6 hors for 4 loads If they learned pipelining, how long wold landry

More information

Mark Redekopp and Gandhi Puvvada, All rights reserved. EE 357 Unit 15. Single-Cycle CPU Datapath and Control

Mark Redekopp and Gandhi Puvvada, All rights reserved. EE 357 Unit 15. Single-Cycle CPU Datapath and Control EE 37 Unit Single-Cycle CPU path and Control CPU Organization Scope We will build a CPU to implement our subset of the MIPS ISA Memory Reference Instructions: Load Word (LW) Store Word (SW) Arithmetic

More information

What do we have so far? Multi-Cycle Datapath

What do we have so far? Multi-Cycle Datapath What do we have so far? lti-cycle Datapath CPI: R-Type = 4, Load = 5, Store 4, Branch = 3 Only one instrction being processed in datapath How to lower CPI frther? #1 Lec # 8 Spring2 4-11-2 Pipelining pipelining

More information

Chapter 4 The Processor 1. Chapter 4A. The Processor

Chapter 4 The Processor 1. Chapter 4A. The Processor Chapter 4 The Processor 1 Chapter 4A The Processor Chapter 4 The Processor 2 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware

More information

Pipelined Datapath. Reading. Sections Practice Problems: 1, 3, 8, 12

Pipelined Datapath. Reading. Sections Practice Problems: 1, 3, 8, 12 Pipelined Datapath Lecture notes from KP, H. H. Lee and S. Yalamanchili Sections 4.5 4. Practice Problems:, 3, 8, 2 ing Note: Appendices A-E in the hardcopy text correspond to chapters 7- in the online

More information

Chapter 4. The Processor

Chapter 4. The Processor Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified

More information

Processor Architecture

Processor Architecture Processor Architecture Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu SSE2030: Introduction to Computer Systems, Spring 2018, Jinkyu Jeong (jinkyu@skku.edu)

More information

ECE260: Fundamentals of Computer Engineering

ECE260: Fundamentals of Computer Engineering Datapath for a Simplified Processor James Moscola Dept. of Engineering & Computer Science York College of Pennsylvania Based on Computer Organization and Design, 5th Edition by Patterson & Hennessy Introduction

More information

COMPUTER ORGANIZATION AND DESIGN

COMPUTER ORGANIZATION AND DESIGN COMPUTER ORGANIZATION AND DESIGN 5 Edition th The Hardware/Software Interface Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count CPI and Cycle time Determined

More information

CSE 2021 Computer Organization. Hugh Chesser, CSEB 1012U W12-M

CSE 2021 Computer Organization. Hugh Chesser, CSEB 1012U W12-M CSE 22 Computer Organization Hugh Chesser, CSEB 2U W2- Graphical Representation Time 2 6 8 add $s, $t, $t IF ID E E Decode / Execute emory Back fetch from / stage into the instruction register file. Shading

More information

Pipelining Analogy. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Speedup = 8/3.5 = 2.3.

Pipelining Analogy. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Speedup = 8/3.5 = 2.3. Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup =2n/05n+15 2n/0.5n 1.5 4 = number of stages 4.5 An Overview

More information

CPU Organization (Design)

CPU Organization (Design) ISA Requirements CPU Organization (Design) Datapath Design: Capabilities & performance characteristics of principal Functional Units (FUs) needed by ISA instructions (e.g., Registers, ALU, Shifters, Logic

More information

Advanced Computer Architecture Pipelining

Advanced Computer Architecture Pipelining Advanced Computer Architecture Pipelining Dr. Shadrokh Samavi Some slides are from the instructors resources which accompany the 6 th and previous editions of the textbook. Some slides are from David Patterson,

More information

ECE232: Hardware Organization and Design

ECE232: Hardware Organization and Design ECE232: Hardware Organization and Design Lecture 14: One Cycle MIPs Datapath Adapted from Computer Organization and Design, Patterson & Hennessy, UCB R-Format Instructions Read two register operands Perform

More information

CSE140: Components and Design Techniques for Digital Systems

CSE140: Components and Design Techniques for Digital Systems CSE4: Components and Design Techniques for Digital Systems Tajana Simunic Rosing Announcements and Outline Check webct grades, make sure everything is there and is correct Pick up graded d homework at

More information

Chapter 4. The Processor. Instruction count Determined by ISA and compiler. We will examine two MIPS implementations

Chapter 4. The Processor. Instruction count Determined by ISA and compiler. We will examine two MIPS implementations Chapter 4 The Processor Part I Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations

More information

Outline Marquette University

Outline Marquette University COEN-4710 Computer Hardware Lecture 4 Processor Part 2: Pipelining (Ch.4) Cristinel Ababei Department of Electrical and Computer Engineering Credits: Slides adapted primarily from presentations from Mike

More information

Lecture 7 Pipelining. Peng Liu.

Lecture 7 Pipelining. Peng Liu. Lecture 7 Pipelining Peng Liu liupeng@zju.edu.cn 1 Review: The Single Cycle Processor 2 Review: Given Datapath,RTL -> Control Instruction Inst Memory Adr Op Fun Rt

More information

Topic #6. Processor Design

Topic #6. Processor Design Topic #6 Processor Design Major Goals! To present the single-cycle implementation and to develop the student's understanding of combinational and clocked sequential circuits and the relationship between

More information

COMPUTER ORGANIZATION AND DESIGN

COMPUTER ORGANIZATION AND DESIGN ARM COMPUTER ORGANIZATION AND DESIGN Edition The Hardware/Software Interface Chapter 4 The Processor Modified and extended by R.J. Leduc - 2016 To understand this chapter, you will need to understand some

More information

Computer Organization and Structure. Bing-Yu Chen National Taiwan University

Computer Organization and Structure. Bing-Yu Chen National Taiwan University Computer Organization and Structure Bing-Yu Chen National Taiwan University The Processor Logic Design Conventions Building a Datapath A Simple Implementation Scheme An Overview of Pipelining Pipelined

More information

COMPUTER ORGANIZATION AND DESIGN. The Hardware/Software Interface. Chapter 4. The Processor: A Based on P&H

COMPUTER ORGANIZATION AND DESIGN. The Hardware/Software Interface. Chapter 4. The Processor: A Based on P&H COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface Chapter 4 The Processor: A Based on P&H Introduction We will examine two MIPS implementations A simplified version A more realistic pipelined

More information

LECTURE 3: THE PROCESSOR

LECTURE 3: THE PROCESSOR LECTURE 3: THE PROCESSOR Abridged version of Patterson & Hennessy (2013):Ch.4 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU

More information

Chapter 4. The Processor. Computer Architecture and IC Design Lab

Chapter 4. The Processor. Computer Architecture and IC Design Lab Chapter 4 The Processor Introduction CPU performance factors CPI Clock Cycle Time Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS

More information

Pipelined Datapath. Reading. Sections Practice Problems: 1, 3, 8, 12 (2) Lecture notes from MKP, H. H. Lee and S.

Pipelined Datapath. Reading. Sections Practice Problems: 1, 3, 8, 12 (2) Lecture notes from MKP, H. H. Lee and S. Pipelined Datapath Lecture notes from KP, H. H. Lee and S. Yalamanchili Sections 4.5 4. Practice Problems:, 3, 8, 2 ing (2) Pipeline Performance Assume time for stages is ps for register read or write

More information

Single Cycle Datapath

Single Cycle Datapath Single Cycle atapath Lecture notes from MKP, H. H. Lee and S. Yalamanchili Section 4.-4.4 Appendices B.7, B.8, B.,.2 Practice Problems:, 4, 6, 9 ing (2) Introduction We will examine two MIPS implementations

More information

CPS104 Computer Organization and Programming Lecture 19: Pipelining. Robert Wagner

CPS104 Computer Organization and Programming Lecture 19: Pipelining. Robert Wagner CPS104 Computer Organization and Programming Lecture 19: Pipelining Robert Wagner cps 104 Pipelining..1 RW Fall 2000 Lecture Overview A Pipelined Processor : Introduction to the concept of pipelined processor.

More information

ECE331: Hardware Organization and Design

ECE331: Hardware Organization and Design ECE331: Hardware Organization and Design Lecture 35: Final Exam Review Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Material from Earlier in the Semester Throughput and latency

More information

COMP2611: Computer Organization. The Pipelined Processor

COMP2611: Computer Organization. The Pipelined Processor COMP2611: Computer Organization The 1 2 Background 2 High-Performance Processors 3 Two techniques for designing high-performance processors by exploiting parallelism: Multiprocessing: parallelism among

More information

Chapter 4. The Processor

Chapter 4. The Processor Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified

More information

COSC 6385 Computer Architecture - Pipelining

COSC 6385 Computer Architecture - Pipelining COSC 6385 Computer Architecture - Pipelining Fall 2006 Some of the slides are based on a lecture by David Culler, Instruction Set Architecture Relevant features for distinguishing ISA s Internal storage

More information

CS 152 Computer Architecture and Engineering Lecture 4 Pipelining

CS 152 Computer Architecture and Engineering Lecture 4 Pipelining CS 152 Computer rchitecture and Engineering Lecture 4 Pipelining 2014-1-30 John Lazzaro (not a prof - John is always OK) T: Eric Love www-inst.eecs.berkeley.edu/~cs152/ Play: 1 otorola 68000 Next week

More information

SI232 Set #20: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life. Chapter 6 ADMIN. Reading for Chapter 6: 6.1,

SI232 Set #20: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life. Chapter 6 ADMIN. Reading for Chapter 6: 6.1, SI232 Set #20: Laundry, Co-dependency, and other Hazards of Modern (Architecture) Life Chapter 6 ADMIN ing for Chapter 6: 6., 6.9-6.2 2 Midnight Laundry Task order A 6 PM 7 8 9 0 2 2 AM B C D 3 Smarty

More information

EE 457 Unit 6a. Basic Pipelining Techniques

EE 457 Unit 6a. Basic Pipelining Techniques EE 47 Unit 6a Basic Pipelining Techniques 2 Pipelining Introduction Consider a drink bottling plant Filling the bottle = 3 sec. Placing the cap = 3 sec. Labeling = 3 sec. Would you want Machine = Does

More information

Outline. Combinational Element. State (Sequential) Element. Clocking Methodology. Input/Output of Elements

Outline. Combinational Element. State (Sequential) Element. Clocking Methodology. Input/Output of Elements Outline ombinational Element ombinational & sequential logic Single-cycle PU ulti-cycle PU Examples of ombinational Elements State (Sequential) Element! "#$$$ #$$ #$$ #$ # & & ) *.// * - + #3, * + - locking

More information

Lecture Topics. Announcements. Today: Single-Cycle Processors (P&H ) Next: continued. Milestone #3 (due 2/9) Milestone #4 (due 2/23)

Lecture Topics. Announcements. Today: Single-Cycle Processors (P&H ) Next: continued. Milestone #3 (due 2/9) Milestone #4 (due 2/23) Lecture Topics Today: Single-Cycle Processors (P&H 4.1-4.4) Next: continued 1 Announcements Milestone #3 (due 2/9) Milestone #4 (due 2/23) Exam #1 (Wednesday, 2/15) 2 1 Exam #1 Wednesday, 2/15 (3:00-4:20

More information

4. What is the average CPI of a 1.4 GHz machine that executes 12.5 million instructions in 12 seconds?

4. What is the average CPI of a 1.4 GHz machine that executes 12.5 million instructions in 12 seconds? Chapter 4: Assessing and Understanding Performance 1. Define response (execution) time. 2. Define throughput. 3. Describe why using the clock rate of a processor is a bad way to measure performance. Provide

More information

3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle?

3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle? CSE 2021: Computer Organization Single Cycle (Review) Lecture-10b CPU Design : Pipelining-1 Overview, Datapath and control Shakil M. Khan 2 Single Cycle with Jump Multi-Cycle Implementation Instruction:

More information

The Processor: Datapath and Control. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

The Processor: Datapath and Control. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University The Processor: Datapath and Control Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Introduction CPU performance factors Instruction count Determined

More information

ECE369. Chapter 5 ECE369

ECE369. Chapter 5 ECE369 Chapter 5 1 State Elements Unclocked vs. Clocked Clocks used in synchronous logic Clocks are needed in sequential logic to decide when an element that contains state should be updated. State element 1

More information

CS 110 Computer Architecture. Pipelining. Guest Lecture: Shu Yin. School of Information Science and Technology SIST

CS 110 Computer Architecture. Pipelining. Guest Lecture: Shu Yin.   School of Information Science and Technology SIST CS 110 Computer Architecture Pipelining Guest Lecture: Shu Yin http://shtech.org/courses/ca/ School of Information Science and Technology SIST ShanghaiTech University Slides based on UC Berkley's CS61C

More information

EC 413 Computer Organization - Fall 2017 Problem Set 3 Problem Set 3 Solution

EC 413 Computer Organization - Fall 2017 Problem Set 3 Problem Set 3 Solution EC 413 Computer Organization - Fall 2017 Problem Set 3 Problem Set 3 Solution Important guidelines: Always state your assumptions and clearly explain your answers. Please upload your solution document

More information

The Processor (1) Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University

The Processor (1) Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University The Processor (1) Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu EEE3050: Theory on Computer Architectures, Spring 2017, Jinkyu Jeong (jinkyu@skku.edu)

More information

Pipelined Datapath. One register file is enough

Pipelined Datapath. One register file is enough ipelined path The goal of pipelining is to allow multiple instructions execute at the same time We may need to perform several operations in a cycle Increment the and add s at the same time. Fetch one

More information

The Pipelined MIPS Processor

The Pipelined MIPS Processor 1 The niversity of Texas at Dallas Lecture #20: The Pipeline IPS Processor The Pipelined IPS Processor We complete our study of AL architecture by investigating an approach providing even higher performance

More information

COMPUTER ORGANIZATION AND DESIGN

COMPUTER ORGANIZATION AND DESIGN COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle

More information

Unpipelined Machine. Pipelining the Idea. Pipelining Overview. Pipelined Machine. MIPS Unpipelined. Similar to assembly line in a factory

Unpipelined Machine. Pipelining the Idea. Pipelining Overview. Pipelined Machine. MIPS Unpipelined. Similar to assembly line in a factory Pipelining the Idea Similar to assembly line in a factory Divide instruction into smaller tasks Each task is performed on subset of resources Overlap the execution of multiple instructions by completing

More information

Lecture 4: Review of MIPS. Instruction formats, impl. of control and datapath, pipelined impl.

Lecture 4: Review of MIPS. Instruction formats, impl. of control and datapath, pipelined impl. Lecture 4: Review of MIPS Instruction formats, impl. of control and datapath, pipelined impl. 1 MIPS Instruction Types Data transfer: Load and store Integer arithmetic/logic Floating point arithmetic Control

More information

CS359: Computer Architecture. The Processor (A) Yanyan Shen Department of Computer Science and Engineering

CS359: Computer Architecture. The Processor (A) Yanyan Shen Department of Computer Science and Engineering CS359: Computer Architecture The Processor (A) Yanyan Shen Department of Computer Science and Engineering Eecuting R-type Instructions 7 Instructions ADD and subtract add rd, rs, rt sub rd, rs, rt OR Immediate:

More information

Single Cycle Datapath

Single Cycle Datapath Single Cycle atapath Lecture notes from MKP, H. H. Lee and S. Yalamanchili Section 4.1-4.4 Appendices B.3, B.7, B.8, B.11,.2 ing Note: Appendices A-E in the hardcopy text correspond to chapters 7-11 in

More information

ECE170 Computer Architecture. Single Cycle Control. Review: 3b: Add & Subtract. Review: 3e: Store Operations. Review: 3d: Load Operations

ECE170 Computer Architecture. Single Cycle Control. Review: 3b: Add & Subtract. Review: 3e: Store Operations. Review: 3d: Load Operations ECE7 Computer Architecture Single Cycle Control Review: 3a: Overview of the Fetch Unit The common operations Fetch the : mem[] Update the program counter: Sequential Code: < + Branch and Jump: < something

More information

CC 311- Computer Architecture. The Processor - Control

CC 311- Computer Architecture. The Processor - Control CC 311- Computer Architecture The Processor - Control Control Unit Functions: Instruction code Control Unit Control Signals Select operations to be performed (ALU, read/write, etc.) Control data flow (multiplexor

More information

1048: Computer Organization

1048: Computer Organization 48: Compter Organization Lectre 5 Datapath and Control Lectre5A - simple implementation (cwli@twins.ee.nct.ed.tw) 5A- Introdction In this lectre, we will try to implement simplified IPS which contain emory

More information

CpE242 Computer Architecture and Engineering Designing a Single Cycle Datapath

CpE242 Computer Architecture and Engineering Designing a Single Cycle Datapath CpE242 Computer Architecture and Engineering Designing a Single Cycle Datapath CPE 442 single-cycle datapath.1 Outline of Today s Lecture Recap and Introduction Where are we with respect to the BIG picture?

More information

Major CPU Design Steps

Major CPU Design Steps Datapath Major CPU Design Steps. Analyze instruction set operations using independent RTN ISA => RTN => datapath requirements. This provides the the required datapath components and how they are connected

More information

EECS 322 Computer Architecture Improving Memory Access: the Cache

EECS 322 Computer Architecture Improving Memory Access: the Cache EECS 322 Computer Architecture Improving emory Access: the Cache Instructor: Francis G. Wolff wolff@eecs.cwru.edu Case Western Reserve University This presentation uses powerpoint animation: please viewshow

More information

Lecture 8: Control COS / ELE 375. Computer Architecture and Organization. Princeton University Fall Prof. David August

Lecture 8: Control COS / ELE 375. Computer Architecture and Organization. Princeton University Fall Prof. David August Lecture 8: Control COS / ELE 375 Computer Architecture and Organization Princeton University Fall 2015 Prof. David August 1 Datapath and Control Datapath The collection of state elements, computation elements,

More information