Prof. Kozyrakis. 1. (10 points) Consider the following fragment of Java code:



EE8 Winter 25 Homework #2 Solutions
Due Thursday, Feb 2, 5 PM

1. (10 points) Consider the following fragment of Java code:

    for (i=0; i<=60; i=i+3) a[i] = b[i] + c;

Assume that a and b are arrays of words and the base address of a is in $a0 and the base address of b is in $a1. Register $t0 is associated with variable i and register $s0 is associated with the value of c. You may also assume that any address constants you need are available to be loaded from memory. Write the code for MIPS. How many instructions are executed during the running of this code if there are no array out-of-bounds exceptions thrown? How many memory data references will be made during execution?

Hint: To indicate branching to error handling code you may use syntax such as:

    bne $t, $t, DescriptionOfError

Solution: To test for loop termination, the (address) constant 241 is needed. Assume that it is placed in memory when the program is loaded. This solution assumes that the memory addresses storing the lengths of arrays are in $a2 and $a3 for a and b respectively:

          lw   $t8, AddressConstant241($zero)  # $t8 = 241
          lw   $t7, 0($a2)                     # $t7 = length of a[]
          lw   $t6, 0($a3)                     # $t6 = length of b[]
          add  $t0, $zero, $zero               # initialize i = 0
    Loop: slt  $t4, $t0, $zero                 # $t4 = 1 if i < 0
          bne  $t4, $zero, IndexOutOfBounds    # if i < 0, goto Error
          slt  $t4, $t0, $t6                   # $t4 = 0 if i >= length
          beq  $t4, $zero, IndexOutOfBounds    # if i >= length, goto Error
          slt  $t4, $t0, $t7                   # $t4 = 0 if i >= length
          beq  $t4, $zero, IndexOutOfBounds    # if i >= length, goto Error
          add  $t1, $a1, $t0                   # $t1 = address of b[i]
          lw   $t2, 0($t1)                     # $t2 = b[i]
          add  $t2, $t2, $s0                   # $t2 = b[i] + c
          add  $t3, $a0, $t0                   # $t3 = address of a[i]
          sw   $t2, 0($t3)                     # a[i] = b[i] + c
          addi $t0, $t0, 12                    # i = i + 12
          slt  $t4, $t0, $t8                   # $t4 = 1 if $t0 < 241, i.e., i <= 60
          bne  $t4, $zero, Loop                # goto Loop if i <= 60

The number of instructions executed is 288. The number of data references made is 45. Exception and termination checks must be handled correctly (as above).

2. (5 points) Suppose we have made the following measurements of average CPI for instructions:

    Instruction           Average CPI
    Arithmetic            1.0 clock cycles
    Data transfer         1.7 clock cycles
    Conditional branch    2.5 clock cycles
    Jump                  2.2 clock cycles

Compute the effective CPI for MIPS. Use the Core MIPS instruction frequencies for SPEC2006int in Figure 3.28 (on page 236 of the 5th edition of the textbook) to obtain the instruction mix.

Solution: Effective CPI = Sum of (CPI of instruction type * Frequency of execution)

The average instruction frequencies for SPEC2000int and SPEC2000fp are:

    0.457 (arithmetic and logic)
    0.338 (data transfer)
    0.17  (conditional branch)
    0.018 (jump)

Thus, the effective CPI:

    1.0 * 0.457 + 1.7 * 0.338 + 2.5 * 0.17 + 2.2 * 0.018 = 1.496 (round off to 1.5)

Dividing this answer by 0.98 (to get 1.53) is also fine, as the total instruction percent does not add up to 1.
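The weighted sum in problem 2 can be checked with a short sketch. It assumes the per-class CPIs are 1.0, 1.7, 2.5 and 2.2 and the frequencies are 0.457, 0.338, 0.17 and 0.018, which is the reading consistent with the stated result of 1.496; the dictionary keys and function name are illustrative, not from the assignment.

```python
# Assumed per-class CPI values and instruction frequencies (see lead-in).
CPI = {"arithmetic": 1.0, "data_transfer": 1.7, "cond_branch": 2.5, "jump": 2.2}
FREQ = {"arithmetic": 0.457, "data_transfer": 0.338, "cond_branch": 0.17, "jump": 0.018}


def effective_cpi(cpi, freq):
    """Effective CPI = sum over classes of (class CPI * class frequency)."""
    return sum(cpi[name] * freq[name] for name in cpi)


raw_cpi = effective_cpi(CPI, FREQ)
# The frequencies do not add up to 1, so the result may also be
# normalized by their total.
normalized_cpi = raw_cpi / sum(FREQ.values())

print(f"effective CPI: {raw_cpi:.3f}")
print(f"normalized by total frequency: {normalized_cpi:.3f}")
```

Normalizing by the unrounded frequency total gives a slightly different figure than dividing by the rounded 0.98 used in the solution; either is a reasonable way to account for the missing instruction classes.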

3. (5 points) Computer A has an overall CPI of 0.9 and can be run at a clock rate of 1.8 GHz. Computer B has a CPI of 2.6 and can be run at a clock rate of 2.4 GHz. We have a particular program we wish to run. When compiled for computer A, this program has exactly , instructions. How many instructions would the program need to have when compiled for Computer B, in order for the two computers to have exactly the same execution time for this program?

Solution:

    Time = InstrCount * CPI * Clock Cycle Time
    Time for A = InstrCountA * 0.9 * (1 / 1.8 GHz)
    Time for B = InstrCountB * 2.6 * (1 / 2.4 GHz)

If the two execution times should be equal, then:

    InstrCountB = InstrCountA * (2.4 GHz * 0.9) / (1.8 GHz * 2.6)

Note that the instruction count is much lower for computer B than for computer A on the same program. To achieve this in real life, one would need a dramatically different architecture (e.g. B is a CISC machine) or a much more aggressive compiler for B.
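The equal-execution-time condition in problem 3 can be sketched as a small function. The function name is hypothetical, and the numbers in the usage lines are illustrative assumptions only: the instruction count for computer A is not legible in this copy, and the CPI and clock values are one plausible reading.

```python
def equal_time_instr_count(instr_a, cpi_a, clock_a_hz, cpi_b, clock_b_hz):
    """Instruction count B needs so that Time_B equals Time_A.

    Time = InstrCount * CPI / ClockRate, so setting Time_A == Time_B and
    solving for InstrCount_B gives the expression below.
    """
    time_a = instr_a * cpi_a / clock_a_hz
    return time_a * clock_b_hz / cpi_b


# Illustrative inputs (assumptions, not the assignment's confirmed values).
example_instr_b = equal_time_instr_count(
    instr_a=1_000_000, cpi_a=0.9, clock_a_hz=1.8e9, cpi_b=2.6, clock_b_hz=2.4e9
)
print(round(example_instr_b))
```

With these inputs B needs fewer than half as many instructions as A, which matches the note that B's instruction count must be much lower.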

4. (10 points) Consider the following idea: Let's modify the instruction set architecture and remove the ability to specify an offset for memory access instructions. Specifically, all load-store instructions with nonzero offsets would become pseudoinstructions and would be implemented using two instructions. For example:

    addi $at, $t1, 104    # add the offset to a temporary
    lw   $t0, $at         # new way of doing lw $t0, 104($t1)

What changes would you make to the single-cycle datapath and control if this simplified architecture were to be used?

Solution: The key is recognizing that we no longer have to go through the ALU and then to memory. We would not want to add zero using the ALU; instead we want to provide a path directly from the Read data 1 output of the Register File to the read/write address lines of the memory (assuming the instruction format does not change). The output of the ALU would no longer connect to memory. The control does not need to change, but some of the control signals are now don't cares. Assuming we are not implementing addi or addiu, it is possible to remove the ALUSrc control signal and the multiplexer that it controls, thus having just the data from the Read data 2 output (of the Register File) going into the ALU. This results in additional optimizations to ALU control.

5. (10 points) MIPS chooses to simplify the structure of its instructions. The way we implement complex instructions through the use of MIPS instructions is to decompose such complex instructions into multiple simpler MIPS ones. Show how MIPS can implement the instruction swap $rs, $rt, which swaps the contents of registers $rs and $rt, in software, i.e., using MIPS instructions. Consider the case in which there is an available register that may be used as well as the case in which no such register exists. If the implementation of this instruction in hardware will increase the clock period of a single-instruction implementation by 8%, what percentage of swap operations in the instruction mix would recommend implementing it in hardware? What if the clock period would increase by 15%?

Solution:

Available register ($rd) case: swap $rs,$rt can be implemented as follows:

    addi $rd,$rs,0
    addi $rs,$rt,0
    addi $rt,$rd,0

No available register case:

    sw   $rs,temp($r0)
    addi $rs,$rt,0
    lw   $rt,temp($r0)

Alternate solution:

    xor $rs,$rs,$rt
    xor $rt,$rs,$rt
    xor $rs,$rs,$rt

Clock cycle tradeoff evaluation: Software takes three cycles, and hardware takes one cycle. Let Rs be the ratio of swaps in the code mix. Also, assume a base CPI = 1 (which it is for the MIPS). Now:

Avg time per instruction:

    (Software): Rs*3*T + (1 - Rs)*1*T = (2Rs + 1) * T
    (Hardware): T_hw, the lengthened clock period

Hardware implementation makes sense only if:

    T_hw <= (2Rs + 1) * T

8% increase in clock period: Clock period = 1.08 * T, i.e. if swap instructions are greater than 4% of the instruction mix (Rs >= 0.04), then a hardware implementation would be preferable.

15% increase in clock period: Clock period = 1.15 * T, i.e. if swap instructions are greater than 7.5% of the instruction mix, then a hardware implementation would be preferable.
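The break-even point in problem 5 follows from setting the lengthened hardware clock period equal to the software average time per instruction. The sketch below assumes a base CPI of 1 and clock-period increases of 8% and 15%, the values consistent with the 4% and 7.5% thresholds in the solution; the function names are illustrative.

```python
def software_time(swap_ratio, period=1.0):
    """Average time per instruction when swap is expanded to 3 instructions."""
    return (swap_ratio * 3 + (1 - swap_ratio) * 1) * period


def hardware_time(increase, period=1.0):
    """Time per instruction when swap is in hardware and the clock is slower."""
    return (1 + increase) * period


def break_even_swap_ratio(increase):
    """Solve (1 + increase) * T == (2 * Rs + 1) * T for Rs."""
    return increase / 2


for increase in (0.08, 0.15):
    rs = break_even_swap_ratio(increase)
    print(f"{increase:.0%} slower clock -> hardware pays off above Rs = {rs:.1%}")
```

Above the break-even ratio the hardware version is faster on average; below it, the three-instruction software sequence wins.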

6. (20 points) The following C program is compiled into MIPS objects with no optimization and with O2 optimization.

    int A[], B[];
    main() {
      int i;
      int c = ;
      for (i=0; i < 100; i++)
        A[i] = B[i] + c;
    }

Unoptimized Code

     0: li gp, 0
     4: addi gp, gp, 0
     8: add gp, gp, t9
     c: addi sp, sp, -24
    10: sw gp, (sp)
    14: sw fp, 2(sp)
    18: sw gp, 6(sp)
    1c: move fp, sp
    20: li v,
    24: sw v, 2(fp)
    28: sw zero, 8(fp)
    2c: lw v, 8(fp)
    30: slti v, v,
    34: bne v, zero, 3c
    38: j 88
    3c: lw v, 8(fp)
    40: move v, v
    44: sll v, v, 2
    48: lw v, (gp)
    4c: add v, v, v
    50: lw v, 8(fp)
    54: move a, v
    58: sll v, a, 2
    5c: lw a, 4(gp)
    60: add v, v, a
    64: lw a, (v)
    68: lw v, 2(fp)
    6c: add a, a, v
    70: sw a, (v)
    74: lw v, 8(fp)
    78: addi v, v,
    7c: move v, v
    80: sw v, 8(fp)
    84: j 2c
    88: move sp, s8
    8c: lw fp, 2(sp)
    90: addi sp, sp, 24
    94: jr ra

Optimized with O2

     0: li gp, 0
     4: addi gp, gp, 0
     8: add gp, gp, t9
     c: li a2,
    10: move a, zero
    14: lw a, (gp)
    18: lw v, 4(gp)
    1c: lw v, (v)
    20: addi v, v, 4
    24: addi a, a,
    28: add v, v, a2
    2c: sw v, (a)
    30: slti v, a,
    34: addi a, a, 4
    38: bne v, zero, 1c
    3c: jr ra

a. (10 points) Please identify the optimizations used by the compiler to transform the code from the unoptimized version into the optimized one and point out where they are applied.

Note: the 0s seen in the first few lines in both versions of the function are only placeholders for unknown constants, so you should not assume that gp is initialized to 0. Furthermore, t9 in both versions contains the offset between gp and the address storing the pointer to array A.

Solution:

Copy propagation: Instructions 40, 54 and 7c are removed.

Arithmetic identity/algebraic simplification: Since (i+1)*4 == (i*4)+4, instructions 40 and 4c, and 54 and 60, which compute the new A[i] and B[i] addresses, are transformed to 34 and 20 respectively.

Leaf routine optimization: It is a leaf routine and there is no need to save and restore fp and gp. There is also no need to store i and c on the stack since they are only used locally. As a result no stack space needs to be allocated. Thus instructions c-18, 24, 3c, 50, 68, 74, 80 and 88-90 in the unoptimized code are removed, and 28-2c are reduced to instruction 10 in the optimized version.

Loop invariant code motion: Since the arrays A and B are in static memory, instructions 48 and 5c that load the base addresses of A and B are moved above the loop (instructions 14-18 in the optimized code) to reduce the number of dynamic instructions.

Loop inversion: Since the lower and upper bounds of the for loop are constants, the loop can be transformed into a while loop that has a lower loop overhead. Thus, instructions 30-38 and 84 are transformed to 30 and 38 in the optimized version.

b. (7 points) Please compute the number of dynamic instructions and show the instruction mix (types: ALU, Branch, Memory) for both versions of the code.

Unoptimized version: 11 (before loop) + 22 (in loop) * 100 + 7 (after loop) = 2218

    ALU     1009/2218 = 46%
    Branch   202/2218 = 9%
    Memory  1007/2218 = 45%

Optimized version: 7 (before loop) + 8 (in loop) * 100 + 1 (after loop) = 808

    ALU     505/808 = 62%
    Branch  101/808 = 13%
    Memory  202/808 = 25%

c. (3 points) In the optimized code, find the code or data references that need to be resolved by the linker.

The constants in instructions 0 and 4, which initialize $gp to point to the middle of the static data area of memory. The register $t9 accounts for the offset between the initial value of $gp and where the base address of the first array is stored. The branch at 38 is not PC-relative, so this needs to be resolved by the linker.
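The dynamic instruction counts in part b can be reproduced with a short sketch. It assumes a loop trip count of 100 and the per-phase instruction counts 11/22/7 (unoptimized) and 7/8/1 (optimized); the per-class split for the optimized code is counted from the mnemonics in the listing (5 ALU and 2 loads before the loop; 5 ALU, 2 memory and 1 branch per iteration; 1 jr after it). All names are illustrative.

```python
TRIPS = 100  # assumed loop trip count


def dynamic_count(before, in_loop, trips, after):
    """Dynamic instructions = setup + per-iteration body * trips + exit."""
    return before + in_loop * trips + after


def mix_percent(counts):
    """Share of each instruction class as a percentage of the total."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


unopt_total = dynamic_count(before=11, in_loop=22, trips=TRIPS, after=7)
opt_total = dynamic_count(before=7, in_loop=8, trips=TRIPS, after=1)

# Optimized code, split by class (assumption: counted from the listing).
opt_counts = {
    "ALU": 5 + 5 * TRIPS,
    "Branch": 1 * TRIPS + 1,
    "Memory": 2 + 2 * TRIPS,
}

print(unopt_total, opt_total)
for name, pct in mix_percent(opt_counts).items():
    print(f"{name}: {pct:.1f}%")
```

The optimized version executes a little over a third as many instructions as the unoptimized one, mostly because the stack loads and stores inside the loop are gone.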

7. (5 points) Using the figure below, show all the necessary data and control path for instruction jalr rd, rs in the single-cycle MIPS processor discussed in lecture.

[Figure: single-cycle MIPS datapath with control unit. The figure appears twice; the second copy carries the added labels "Jalr" and "PC + 4".]

8. (15 points) It happens quite often that we wish to index through and access each element of an array. Absent from MIPS, but present in other assembly languages/instruction sets, are load/store commands which also increment the indexing register. For example, lwinc $rt, offset($rs) would perform the normal load and subsequently increment $rs by 4. Please either describe in words, or show in the figure below, all necessary modifications needed to support these instructions in the single-cycle MIPS processor discussed in lecture.

    load / store    Rs       Rt       Offset
    31:26           25:21    20:16    15:0

The datapath requires an additional ALU to increment the content of the $rs register (Read data 1) by 4 (7 points). The output of this is fed back to the register file, which needs a second write port (8 points) because two writes to the register file are required in a single cycle. The new write port will be controlled by a new signal, "Write 2." We assume that the destination register for the second write is always the same as Read register 1 ($rs). This way "Write 2" indicates that there is a second write to the register file, to the register identified by "Read register 1," and the data is fed through Write data 2.

Adding a second register file would be incorrect since then the contents of the two would have to be kept consistent.

9. (20 points) The popular x86 instruction set by Intel allows arithmetic instructions to directly access memory for one of their source operands. The primary benefit is that fewer instructions will be executed because we won't have to first load that source operand into a register. The primary disadvantage is that the cycle time will have to increase to account for the additional time to read memory during the arithmetic instruction. Consider adding a new instruction to the MIPS ISA:

    addm $t2, $t3, $t4    // $t2 = $t3 + Memory[$t4]

a). (5 Points) Consider the single-cycle MIPS processor datapath shown below. Show the datapath changes needed to implement addm. Describe each change in 1-2 sentences. Name control signals, but don't worry about their values for now.

b). (5 Points) Determine the control signals necessary to implement addm in the single-cycle MIPS processor. For each control signal specify in the following table whether it needs to be 0, 1, or X (don't care) to implement addm. There are additional lines for the control signals needed for datapath changes you made in 2.c. The ALUop control signal can take one of the following values: add, sub, or, X.
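The intended behavior of the two proposed instructions, lwinc from problem 8 and addm from problem 9, can be modeled in a few lines. This is a behavioral sketch only, not a datapath description: the register file is a dictionary keyed by register name, memory is a dictionary keyed by byte address, and the function names and the choice that the increment wins when rt and rs are the same register are assumptions, since the assignment does not cover that case.

```python
MASK32 = 0xFFFFFFFF  # keep results within 32 bits, as in a MIPS register


def lwinc(regs, mem, rt, rs, offset):
    """lwinc $rt, offset($rs): normal load, then $rs = $rs + 4."""
    base = regs[rs]
    regs[rt] = mem[(base + offset) & MASK32]  # first register write (load)
    regs[rs] = (base + 4) & MASK32            # second register write (increment)


def addm(regs, mem, rd, rs, rt):
    """addm $rd, $rs, $rt: $rd = $rs + Memory[$rt]."""
    regs[rd] = (regs[rs] + mem[regs[rt]]) & MASK32


# Illustrative state: two words at addresses 100 and 104.
regs = {"$t0": 0, "$t1": 100, "$t2": 0, "$t3": 5, "$t4": 100}
mem = {100: 7, 104: 9}

addm(regs, mem, "$t2", "$t3", "$t4")   # $t2 = 5 + mem[100]
lwinc(regs, mem, "$t0", "$t1", 0)      # $t0 = mem[100], $t1 advances by 4
first_loaded, after_first = regs["$t0"], regs["$t1"]
lwinc(regs, mem, "$t0", "$t1", 0)      # $t0 = mem[104], $t1 advances again
```

The two writes inside lwinc correspond to the two register-file write ports called for in the problem 8 solution, and the memory read feeding the adder in addm is why that instruction lengthens the single-cycle critical path.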

The single-cycle design from last time

The single-cycle design from last time lticycle path Last time we saw a single-cycle path and control nit for or simple IPS-based instrction set. A mlticycle processor fies some shortcomings in the single-cycle CPU. Faster instrctions are not

More information

The final datapath. M u x. Add. 4 Add. Shift left 2. PCSrc. RegWrite. MemToR. MemWrite. Read data 1 I [25-21] Instruction. Read. register 1 Read.

The final datapath. M u x. Add. 4 Add. Shift left 2. PCSrc. RegWrite. MemToR. MemWrite. Read data 1 I [25-21] Instruction. Read. register 1 Read. The final path PC 4 Add Reg Shift left 2 Add PCSrc Instrction [3-] Instrction I [25-2] I [2-6] I [5 - ] register register 2 register 2 Registers ALU Zero Reslt ALUOp em Data emtor RegDst ALUSrc em I [5

More information

PART I: Adding Instructions to the Datapath. (2 nd Edition):

PART I: Adding Instructions to the Datapath. (2 nd Edition): EE57 Instrctor: G. Pvvada ===================================================================== Homework #5b De: check on the blackboard =====================================================================

More information

The extra single-cycle adders

The extra single-cycle adders lticycle Datapath As an added bons, we can eliminate some of the etra hardware from the single-cycle path. We will restrict orselves to sing each fnctional nit once per cycle, jst like before. Bt since

More information

Review. A single-cycle MIPS processor

Review. A single-cycle MIPS processor Review If three instrctions have opcodes, 7 and 5 are they all of the same type? If we were to add an instrction to IPS of the form OD $t, $t2, $t3, which performs $t = $t2 OD $t3, what wold be its opcode?

More information

EXAMINATIONS 2003 END-YEAR COMP 203. Computer Organisation

EXAMINATIONS 2003 END-YEAR COMP 203. Computer Organisation EXAINATIONS 2003 COP203 END-YEAR Compter Organisation Time Allowed: 3 Hors (180 mintes) Instrctions: Answer all qestions. There are 180 possible marks on the eam. Calclators and foreign langage dictionaries

More information

Review Multicycle: What is Happening. Controlling The Multicycle Design

Review Multicycle: What is Happening. Controlling The Multicycle Design Review lticycle: What is Happening Reslt Zero Op SrcA SrcB Registers Reg Address emory em Data Sign etend Shift left Sorce A B Ot [-6] [5-] [-6] [5-] [5-] Instrction emory IR RegDst emtoreg IorD em em

More information

Computer Architecture Chapter 5. Fall 2005 Department of Computer Science Kent State University

Computer Architecture Chapter 5. Fall 2005 Department of Computer Science Kent State University Compter Architectre Chapter 5 Fall 25 Department of Compter Science Kent State University The Processor: Datapath & Control Or implementation of the MIPS is simplified memory-reference instrctions: lw,

More information

CS 251, Winter 2018, Assignment % of course mark

CS 251, Winter 2018, Assignment % of course mark CS 25, Winter 28, Assignment 3.. 3% of corse mark De onday, Febrary 26th, 4:3 P Lates accepted ntil : A, Febrary 27th with a 5% penalty. IEEE 754 Floating Point ( points): (a) (4 points) Complete the following

More information


EXAMINATIONS 2010 END OF YEAR NWEN 242 COMPUTER ORGANIZATION EXAINATIONS 2010 END OF YEAR COPUTER ORGANIZATION Time Allowed: 3 Hors (180 mintes) Instrctions: Answer all qestions. ake sre yor answers are clear and to the point. Calclators and paper foreign langage

More information

Quiz #1 EEC 483, Spring 2019

Quiz #1 EEC 483, Spring 2019 Qiz # EEC 483, Spring 29 Date: Jan 22 Name: Eercise #: Translate the following instrction in C into IPS code. Eercise #2: Translate the following instrction in C into IPS code. Hint: operand C is stored

More information

TDT4255 Friday the 21st of October. Real world examples of pipelining? How does pipelining influence instruction

TDT4255 Friday the 21st of October. Real world examples of pipelining? How does pipelining influence instruction Review Friday the 2st of October Real world eamples of pipelining? How does pipelining pp inflence instrction latency? How does pipelining inflence instrction throghpt? What are the three types of hazard

More information

EEC 483 Computer Organization

EEC 483 Computer Organization EEC 483 Compter Organization Chapter 4.4 A Simple Implementation Scheme Chans Y The Big Pictre The Five Classic Components of a Compter Processor Control emory Inpt path Otpt path & Control 2 path and

More information

CS 251, Winter 2018, Assignment % of course mark

CS 251, Winter 2018, Assignment % of course mark CS 25, Winter 28, Assignment 4.. 3% of corse mark De Wednesday, arch 7th, 4:3P Lates accepted ntil Thrsday arch 8th, am with a 5% penalty. (6 points) In the diagram below, the mlticycle compter from the

More information

CS 251, Spring 2018, Assignment 3.0 3% of course mark

CS 251, Spring 2018, Assignment 3.0 3% of course mark CS 25, Spring 28, Assignment 3. 3% of corse mark De onday, Jne 25th, 5:3 P. (5 points) Consider the single-cycle compter shown on page 6 of this assignment. Sppose the circit elements take the following

More information

CS 251, Winter 2019, Assignment % of course mark

CS 251, Winter 2019, Assignment % of course mark CS 25, Winter 29, Assignment.. 3% of corse mark De Wednesday, arch 3th, 5:3P Lates accepted ntil Thrsday arch th, pm with a 5% penalty. (7 points) In the diagram below, the mlticycle compter from the corse

More information

Review: Computer Organization

Review: Computer Organization Review: Compter Organization Pipelining Chans Y Landry Eample Landry Eample Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold Washer takes 3 mintes A B C D Dryer takes 3 mintes

More information

Computer Architecture

Computer Architecture Compter Architectre Lectre 4: Intro to icroarchitectre: Single- Cycle Dr. Ahmed Sallam Sez Canal University Based on original slides by Prof. Onr tl Review Compter Architectre Today and Basics (Lectres

More information

Lecture 7. Building A Simple Processor

Lecture 7. Building A Simple Processor Lectre 7 Bilding A Simple Processor Christos Kozyrakis Stanford University http://eeclass.stanford.ed/ee8b C. Kozyrakis EE8b Lectre 7 Annoncements Upcoming deadlines Lab is de today Demo by 5pm, report

More information

Computer Architecture

Computer Architecture Compter Architectre Lectre 4: Intro to icroarchitectre: Single- Cycle Dr. Ahmed Sallam Sez Canal University Spring 25 Based on original slides by Prof. Onr tl Review Compter Architectre Today and Basics

More information

Enhanced Performance with Pipelining

Enhanced Performance with Pipelining Chapter 6 Enhanced Performance with Pipelining Note: The slides being presented represent a mi. Some are created by ark Franklin, Washington University in St. Lois, Dept. of CSE. any are taken from the

More information

The multicycle datapath. Lecture 10 (Wed 10/15/2008) Finite-state machine for the control unit. Implementing the FSM

The multicycle datapath. Lecture 10 (Wed 10/15/2008) Finite-state machine for the control unit. Implementing the FSM Lectre (Wed /5/28) Lab # Hardware De Fri Oct 7 HW #2 IPS programming, de Wed Oct 22 idterm Fri Oct 2 IorD The mlticycle path SrcA Today s objectives: icroprogramming Etending the mlti-cycle path lti-cycle

More information

1048: Computer Organization

1048: Computer Organization 48: Compter Organization Lectre 5 Datapath and Control Lectre5A - simple implementation ( 5A- Introdction In this lectre, we will try to implement simplified IPS which contain emory

More information

Pipelining. Chapter 4

Pipelining. Chapter 4 Pipelining Chapter 4 ake processor rns faster Pipelining is an implementation techniqe in which mltiple instrctions are overlapped in eection Key of making processor fast Pipelining Single cycle path we

More information

What do we have so far? Multi-Cycle Datapath

What do we have so far? Multi-Cycle Datapath What do we have so far? lti-cycle Datapath CPI: R-Type = 4, Load = 5, Store 4, Branch = 3 Only one instrction being processed in datapath How to lower CPI frther? #1 Lec # 8 Spring2 4-11-2 Pipelining pipelining

More information

Comp 303 Computer Architecture A Pipelined Datapath Control. Lecture 13

Comp 303 Computer Architecture A Pipelined Datapath Control. Lecture 13 Comp 33 Compter Architectre A Pipelined path Lectre 3 Pipelined path with Signals PCSrc IF/ ID ID/ EX EX / E E / Add PC 4 Address Instrction emory RegWr ra rb rw Registers bsw [5-] [2-6] [5-] bsa bsb Sign

More information

Chapter 6 Enhancing Performance with. Pipelining. Pipelining. Pipelined vs. Single-Cycle Instruction Execution: the Plan. Pipelining: Keep in Mind

Chapter 6 Enhancing Performance with. Pipelining. Pipelining. Pipelined vs. Single-Cycle Instruction Execution: the Plan. Pipelining: Keep in Mind Pipelining hink of sing machines in landry services Chapter 6 nhancing Performance with Pipelining 6 P 7 8 9 A ime ask A B C ot pipelined Assme 3 min. each task wash, dry, fold, store and that separate

More information

Exceptions and interrupts

Exceptions and interrupts Eceptions and interrpts An eception or interrpt is an nepected event that reqires the CPU to pase or stop the crrent program. Eception handling is the hardware analog of error handling in software. Classes

More information

Hardware Design Tips. Outline

Hardware Design Tips. Outline Hardware Design Tips EE 36 University of Hawaii EE 36 Fall 23 University of Hawaii Otline Verilog: some sbleties Simlators Test Benching Implementing the IPS Actally a simplified 6 bit version EE 36 Fall

More information

Chapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts

Chapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts CS359: Compter Architectre Chapter 3 & Appendi C Pipelining Part A: Basic and Intermediate Concepts Yanyan Shen Department of Compter Science and Engineering Shanghai Jiao Tong University 1 Otline Introdction

More information

EEC 483 Computer Organization

EEC 483 Computer Organization EEC 83 Compter Organization Chapter.6 A Pipelined path Chans Y Pipelined Approach 2 - Cycle time, No. stages - Resorce conflict E E A B C D 3 E E 5 E 2 3 5 2 6 7 8 9 c.y9@csohio.ed Resorces sed in 5 Stages

More information

Chapter 6: Pipelining

Chapter 6: Pipelining CSE 322 COPUTER ARCHITECTURE II Chapter 6: Pipelining Chapter 6: Pipelining Febrary 10, 2000 1 Clothes Washing CSE 322 COPUTER ARCHITECTURE II The Assembly Line Accmlate dirty clothes in hamper Place in

More information

Lecture 13: Exceptions and Interrupts

Lecture 13: Exceptions and Interrupts 18 447 Lectre 13: Eceptions and Interrpts S 10 L13 1 James C. Hoe Dept of ECE, CU arch 1, 2010 Annoncements: Handots: Spring break is almost here Check grades on Blackboard idterm 1 graded Handot #9: Lab

More information

1048: Computer Organization

1048: Computer Organization 8: Compter Organization Lectre 6 Pipelining Lectre6 - pipelining ( 6- Otline An overview of pipelining A pipelined path Pipelined control Data hazards and forwarding Data hazards

More information

Solutions for Chapter 6 Exercises

Solutions for Chapter 6 Exercises Soltions for Chapter 6 Eercises Soltions for Chapter 6 Eercises 6. 6.2 a. Shortening the ALU operation will not affect the speedp obtained from pipelining. It wold not affect the clock cycle. b. If the

More information

Overview of Pipelining

Overview of Pipelining EEC 58 Compter Architectre Pipelining Department of Electrical Engineering and Compter Science Cleveland State University Fndamental Principles Overview of Pipelining Pipelined Design otivation: Increase

More information

CSE Introduction to Computer Architecture Chapter 5 The Processor: Datapath & Control

CSE Introduction to Computer Architecture Chapter 5 The Processor: Datapath & Control CSE-45432 Introdction to Compter Architectre Chapter 5 The Processor: Datapath & Control Dr. Izadi Data Processor Register # PC Address Registers ALU memory Register # Register # Address Data memory Data

More information

1048: Computer Organization

1048: Computer Organization 48: Compter Organization Lectre 5 Datapath and Control Lectre5B - mlticycle implementation ( 5B- Recap: A Single-Cycle Processor PCSrc 4 Add Shift left 2 Add ALU reslt PC address

More information


POWER-OF-2 BOUNDARIES Page 5 Monday, Jne 17, 5:6 PM CHAPTER 3 POWER-OF- BOUNDARIES 3 1 Ronding Up/Down to a Mltiple of a Known Power of Ronding an nsigned integer down to, for eample, the net smaller mltiple of

More information

Computer Architecture. Lecture 6: Pipelining

Computer Architecture. Lecture 6: Pipelining Compter Architectre Lectre 6: Pipelining Dr. Ahmed Sallam Based on original slides by Prof. Onr tl Agenda for Today & Net Few Lectres Single-cycle icroarchitectres lti-cycle and icroprogrammed icroarchitectres

More information

Lecture 6: Microprogrammed Multi Cycle Implementation. James C. Hoe Department of ECE Carnegie Mellon University

Lecture 6: Microprogrammed Multi Cycle Implementation. James C. Hoe Department of ECE Carnegie Mellon University 8 447 Lectre 6: icroprogrammed lti Cycle Implementation James C. Hoe Department of ECE Carnegie ellon University 8 447 S8 L06 S, James C. Hoe, CU/ECE/CALC, 208 Yor goal today Hosekeeping nderstand why

More information

Lecture 9: Microcontrolled Multi-Cycle Implementations

Lecture 9: Microcontrolled Multi-Cycle Implementations 8-447 Lectre 9: icroled lti-cycle Implementations James C. Hoe Dept of ECE, CU Febrary 8, 29 S 9 L9- Annoncements: P&H Appendi D Get started t on Lab Handots: Handot #8: Project (on Blackboard) Single-Cycle

More information

Instruction fetch. MemRead. IRWrite ALUSrcB = 01. ALUOp = 00. PCWrite. PCSource = 00. ALUSrcB = 00. R-type completion

Instruction fetch. MemRead. IRWrite ALUSrcB = 01. ALUOp = 00. PCWrite. PCSource = 00. ALUSrcB = 00. R-type completion . (Chapter 5) Fill in the vales for SrcA, SrcB, IorD, Dst and emto to complete the Finite State achine for the mlti-cycle datapath shown below. emory address comptation 2 SrcA = SrcB = Op = fetch em SrcA

More information

CS/COE1541: Introduction to Computer Architecture

CS/COE1541: Introduction to Computer Architecture CS/COE1541: Introduction to Computer Architecture Dept. of Computer Science University of Pittsburgh 1 Computer Architecture? Application pull Operating

More information

Lab 8 (All Sections) Prelab: ALU and ALU Control

Lab 8 (All Sections) Prelab: ALU and ALU Control Lab 8 (All Sections) Prelab: and Control Name: Sign the following statement: On my honor, as an Aggie, I have neither given nor received nathorized aid on this academic work Objective In this lab yo will

More information

CSE 141 Computer Architecture Summer Session I, Lectures 10 Advanced Topics, Memory Hierarchy and Cache. Pramod V. Argade

CSE 141 Computer Architecture Summer Session I, Lectures 10 Advanced Topics, Memory Hierarchy and Cache. Pramod V. Argade CSE 141 Compter Architectre Smmer Session I, 2004 Lectres 10 Advanced Topics, emory Hierarchy and Cache Pramod V. Argade CSE141: Introdction to Compter Architectre Instrctor: TA: Pramod V. Argade (p2argade@cs.csd.ed)

More information

PS Midterm 2. Pipelining

PS Midterm 2. Pipelining PS idterm 2 Pipelining Seqential Landry 6 P 7 8 9 idnight Time T a s k O r d e r A B C D 3 4 2 3 4 2 3 4 2 3 4 2 Seqential landry takes 6 hors for 4 loads If they learned pipelining, how long wold landry

More information

Winter 2013 MIDTERM TEST #2 Wednesday, March 20 7:00pm to 8:15pm. Please do not write your U of C ID number on this cover page.

Winter 2013 MIDTERM TEST #2 Wednesday, March 20 7:00pm to 8:15pm. Please do not write your U of C ID number on this cover page. page of 7 University of Calgary Departent of Electrical and Copter Engineering ENCM 369: Copter Organization Lectre Instrctors: Steve Noran and Nor Bartley Winter 23 MIDTERM TEST #2 Wednesday, March 2

More information
