Slide Set 7 for Lecture Section 01


Slide 1/86: Slide Set 7 for Lecture Section 01, for ENCM 369 Winter 2017
Steve Norman, PhD, PEng
Electrical & Computer Engineering, Schulich School of Engineering, University of Calgary
February 2017

Slide 2/86: Contents
  - The multicycle processor (textbook Section 7.4)
  - Introduction to Pipelining
  - 5 pipeline stages for our MIPS subset
  - Pipeline Hazards
  - Making pipelining work in hardware
  - Hardware features to manage data hazards
  - Hardware changes to manage control hazards
  - Exceptions

Slide 4/86: The multicycle processor (textbook Section 7.4)
ENCM 369 will not cover Section 7.4 in detail, because terms at Canadian universities are short! That's too bad, because the multicycle design has some interesting aspects...
  - It shows how a computer can use a single memory array for both instructions and data.
  - It makes very efficient use of the ALU: the ALU gets used to compute three different results for every instruction.
  - The control unit is sequential; it's a really nice and practical example of a finite state machine (FSM).

Slide 6/86: Introduction to Pipelining
Before we start to learn about pipelining, let's review a model we will call the one-instruction-at-a-time model:
  Step 1: Processor reads instruction from memory and updates PC.
  Step 2: Processor executes the instruction.
The processor performs Step 1, Step 2, Step 1, Step 2, ..., forever (or until the power is turned off).
This model correctly predicts the results produced by sequences of instructions in assembly language code. Also, the model accurately describes the organization of the processors of textbook Sections 7.3 and 7.4.

Slide 7/86: The one-instruction-at-a-time model and modern processors
The model DOES NOT accurately describe the organization of modern processors!
At a given moment in time, a modern processor will be working on many different instructions; this allows much greater speed than one-instruction-at-a-time processing.
However, the processor must produce results as if instructions were being handled one at a time.

Slide 8/86: Remark
Your instructor thinks that "as if" is a very short and very useful summary of many of the important ideas related to modern computer system designs.
Modern processor chips often process instructions in ways that are hard for humans to understand, but nevertheless do what skilled coders want in time- and energy-efficient ways.

Slide 9/86: The Laundry Analogy
This analogy is taken from Computer Organization and Design, by David Patterson and John Hennessy, which was the ENCM 369 textbook for many years.
You have many loads of laundry to do, with these four resources:
  - a washing machine
  - a dryer
  - a folding unit (you)
  - a putting-away unit (your roommate)
(In real life not very many students would ask their roommates to put away laundry for them, but let's just follow Patterson and Hennessy here.)

Slide 10/86: The Laundry Analogy, continued
Let's assume that each step in processing laundry takes 30 minutes. (In real life, this is close to correct for washers but unfortunately not at all correct for dryers.)
Suppose you have four loads of dirty laundry. If you process each load completely before starting the next, how long does it take to finish all four loads?

Slide 11/86: Processing four loads of laundry, one at a time...
Load   6:00pm     8:00pm     10:00pm    midnight   2:00am
1st    W D F PA
2nd               W D F PA
3rd                          W D F PA
4th                                     W D F PA
(W = washer, D = dryer, F = folding, PA = putting away; each step takes 30 minutes.)
The work takes EIGHT HOURS in total! And each resource (washer, dryer, etc.) is IDLE for three-quarters of the time.
There is an obvious way to speed this up...

Slide 12/86: Processing four loads of laundry, making better use of resources...
As soon as one load is out of the washer, the washer is free for the next load. The same is true for all of the other resources. So we can schedule the work this way...
Load   6:00  6:30  7:00  7:30  8:00  8:30  9:00   (pm; start time of each 30-minute step)
1st    W     D     F     PA
2nd          W     D     F     PA
3rd                W     D     F     PA
4th                      W     D     F     PA
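The two laundry schedules can be compared with a little arithmetic. The sketch below is only an illustration: the function names are made up for this example, and it assumes every step takes exactly the same time, as the analogy does.

```python
def sequential_minutes(loads, steps, step_minutes):
    """Total time when each load finishes every step before the next load starts."""
    return loads * steps * step_minutes


def pipelined_minutes(loads, steps, step_minutes):
    """Total time when a new load enters the first step every step_minutes.

    The first load needs `steps` time slots; each later load finishes
    one slot after the load ahead of it.
    """
    return (steps + loads - 1) * step_minutes
```

With 4 loads, 4 steps and 30 minutes per step, `sequential_minutes(4, 4, 30)` gives 480 minutes (the eight hours of the one-at-a-time schedule), while `pipelined_minutes(4, 4, 30)` gives 210 minutes, so the overlapped schedule finishes in three and a half hours.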

Slide 13/86: The concept of pipelining in digital logic design
A pipelined system is a collection of stages, each with a simple role to perform. When a stage is finished producing its current output, it can pass that output to the next stage and receive new input.
In the laundry analogy, the washer stage receives a load of dirty clothes as input, and produces a load of wet, clean clothes as output, which gets passed as input to the dryer stage.
In Harris and Harris, this year's textbook, pipelining is introduced in Section 3.6, along with an analogy to baking cookies. That section is short and worth reading.

Slide 14/86: Pipelined execution of instructions
In a pipelined processor, an instruction is like a single load of laundry. Processing an instruction can start long before processing of the preceding instruction is finished.
To divide the work of processing an instruction across a number of pipeline stages, that work has to be broken down into simple steps that take roughly equal amounts of time. Each step must fit into a single clock cycle.

Slide 16/86: 5 pipeline stages for our MIPS subset
The subset is: ADD, SUB, SLT, AND, OR, LW, SW, BEQ.
The stages are:
  Fetch: Read instruction from I-Mem and update PC.
  Decode: Determine outputs of Control Unit and read GPRs from R-File.
  Execute: Get a result from the ALU.
  Memory: D-Mem access for loads and stores.
  Writeback: Write to a GPR at the end of a load or an R-type instruction.
In some stages, for some instructions, nothing happens. What are some examples of that?

Slide 17/86: Example sequence of instructions in our 5-stage MIPS subset pipeline
    # This sequence is not practical code,
    # but it makes for a simple example.
    ADD $t2, $t0, $t1
    LW  $t4, ($t3)
    SW  $t5, ($t6)
    SUB $t9, $t7, $t8
Let's suppose we have a 1 GHz clock, so the clock period is 1 ns. How long will it take from the beginning of the ADD instruction to the end of the SUB instruction?

Slide 18/86: Pipelined processing for example instruction sequence
time   0 ns  1 ns  2 ns  3 ns  4 ns  5 ns  6 ns  7 ns  8 ns
ADD    IF    ID    EX    MEM   WB
LW           IF    ID    EX    MEM   WB
SW                 IF    ID    EX    MEM   WB
SUB                      IF    ID    EX    MEM   WB
(Each stage occupies the 1 ns interval that starts at the time shown above it.)
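A diagram like this one can be generated mechanically when there are no hazards. The following is a minimal sketch under that assumption (no stalls, one instruction entering the pipeline per clock cycle); `ideal_schedule` and `finish_time` are names invented for this illustration.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def ideal_schedule(instructions, period_ns=1):
    """Return [(name, [(stage, start_ns, end_ns), ...]), ...] for a hazard-free pipeline.

    Instruction number i (counting from 0) enters IF at time i * period_ns
    and moves to the next stage on every clock cycle.
    """
    rows = []
    for i, name in enumerate(instructions):
        stages = [
            (stage, (i + k) * period_ns, (i + k + 1) * period_ns)
            for k, stage in enumerate(STAGES)
        ]
        rows.append((name, stages))
    return rows


def finish_time(rows):
    """Time at which the last stage of the last instruction ends."""
    return max(end for _, stages in rows for _, _, end in stages)
```

For the four-instruction example with a 1 ns clock period, `finish_time(ideal_schedule(["ADD", "LW", "SW", "SUB"]))` evaluates to 8, matching the 8 ns span of the diagram.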

Slide 19/86: The single-cycle processor starts one instruction per clock cycle. A pipelined processor also starts one instruction per clock cycle. Why will a pipelined design allow much greater instruction throughput? (The diagram below provides a hint at the answer!)
[Timing diagram with traces for: CLK, PC output, Instruction, main decoder outputs, R-File outputs, ALU decoder outputs, ALU result, D-Mem RD output, $s1 contents]

Slide 20/86: An example 3-instruction sequence in a pipelined processor
The sequence is...
    lw $t0, 20($t1)
    or $t2, $t3, $t4
    sw $t5, 40($t6)
Let's use the Pipeline Basics handout to track all the steps in processing these instructions.

Slide 22/86: Pipeline Hazards
These can be defined as situations that prevent throughput of one instruction per clock cycle.
There are three main kinds: structural hazards, data hazards, and control hazards.

Slide 23/86: Structural Hazards
A structural hazard occurs when a unit within a computer is asked to do two (or more) incompatible things at the same time.
Example: In a computer with a single memory unit, the processor can't do the Fetch step of one instruction while also doing the Memory step of an earlier LW or SW instruction.

Slide 24/86: Solution to Structural Hazards
Design the instruction set and hardware so that this kind of hazard does not occur.
Example: Have separate Instruction and Data Memories, so Fetch can be simultaneous with Memory of an earlier instruction.
(Note: When we get to textbook Chapter 8, we'll see that for modern processors, separation of instructions and data really means having separate caches for instructions and data.)

Slide 25/86: Data Hazards: Are inputs to instructions up-to-date?
Example:
    add $t0, $t1, $t2
    sub $t4, $t3, $t0
The destination of ADD is a source for SUB. The Writeback step of ADD will happen later than the Decode step of SUB, so there is a risk that SUB will use old, wrong data from $t0.
Remember, the processor must produce results as if one instruction completes before the next instruction starts!

Slide 26/86: Control Hazards: What instruction address should be used in the next Fetch?
Example...
        beq $t0, $t1, L1
        and $t4, $t2, $t3
        ... more instructions ...
    L1: lw  $t5, ($t6)
Which instruction should be fetched after BEQ is fetched? AND or LW? The processor will not know until the $t0 == $t1 comparison is done!

Slide 27/86: Assumption about Register File in textbook Section 7.5, related to data hazards
Writes to the Register File occur in the first half of a clock cycle, and reads from the Register File occur in the second half.
To enable this behaviour, what choices can be made about flip-flops, Data Memory, and other clocked components?
What are the consequences regarding GPR reads and writes that happen within the same clock cycle?

Slide 28/86: Edge-triggering for pipelined computers in Section 7.5
Updates to GPRs in the Register File happen in response to negative clock edges.
Updates to PC, Data Memory, and pipeline registers happen in response to positive clock edges.
[Waveform: system clock, switching between 1 and 0]
This is NOT applicable to the single-cycle design of Section 7.3 and the multicycle design of Section 7.4!

Slide 29/86: Review: 5 pipeline stages for our MIPS subset
  Fetch: Read instruction from Instruction Memory; do PC = PC + 4.
  Decode: Determine Control Unit outputs appropriate for instruction opcode; copy two GPR values out of Register File.
  Execute: Do computation in ALU.
  Memory: Read or write Data Memory.
  Writeback: Update a GPR in the Register File.

Slide 30/86: Solutions to data hazards, first of three: stalling the pipeline
Example:
    add $t0, $t1, $t2
    sub $t4, $t3, $t0

Slide 31/86: Solutions to data hazards, second of three: forwarding
Example A, from previous slide:
    add $t0, $t1, $t2
    sub $t4, $t3, $t0
Example B:
    lw  $t0, ($t1)
    add $t2, $t2, $t3
    slt $t6, $t0, $t5

Slide 32/86: Solutions to data hazards, third of three: combine stalling and forwarding
Example:
    lw  $t0, ($t1)
    add $t3, $t0, $t2
Can forwarding by itself solve this data hazard?

Slide 33/86: Control Hazards (repeat of earlier example)
What instruction address should be used in the next Fetch step after the Fetch step of a branch instruction?
Example...
        beq $t0, $t1, L1
        and $t4, $t2, $t3
        ... more instructions ...
    L1: lw  $t5, ($t6)
Which instruction should be fetched after BEQ is fetched? AND or LW? The processor will not know until the $t0 == $t1 comparison is done!

Slide 34/86: Control hazard illustration
BEQ                F D E M W
next instruction     F D E M W
Here "next" means next in time, not necessarily next location in Instruction Memory.
Why will it be difficult to do the Fetch step for the next instruction just one clock cycle after the Fetch step for BEQ? (There are multiple reasons.)

Slide 35/86: Four kinds of solutions for control hazards
  1. Stall: Delay the Fetch step for the next instruction until the address of the next instruction is known.
  2. Predict: Guess what the address of the next instruction will be, and act on the guess without delay. Check that the guess was correct; if not, cancel instructions that have incorrectly entered the pipeline.
  3. Delayed branch and jump rules
  4. Conditional instructions

Slide 36/86: Dynamic branch prediction
This is widely used in modern processors (but mostly not used in low-power embedded processors).
A large and complex branch prediction circuit is dedicated to recording information about recently-encountered branch instructions.
For each branch instruction, its target address is recorded along with a prediction about whether the branch will be taken.

Slide 37/86: Dynamic branch prediction, continued
When a branch instruction is encountered, the branch prediction circuit can quickly supply a guess for the next PC value, and instruction fetch can occur without delay.
If a guess is wrong, some instructions will have to be cancelled, and clock cycles will be lost.
This system is called dynamic because a taken/not-taken prediction will be changed if it has recently been more often wrong than right.

Slide 38/86: Branch prediction code example
p and past_last are of type int*. count is an int.
    do {
        if (*p < 0)
            count++;
        p++;
    } while (p != past_last);
p walks through an array of int elements, and count records how many of those elements are negative.

Slide 39/86: Branch prediction code example, continued
Let's suppose that there are a lot of array elements, and most of them are negative...
    L1: lw    $t0, ($a0)
        slt   $t1, $t0, $zero    # $t1 = (*p < 0)
        beq   $t1, $zero, L2     # branch if !(*p < 0)
        addiu $t9, $t9, 1        # count++
    L2: addiu $a0, $a0, 4        # p++
        bne   $a0, $t8, L1       # branch if p != past_last
As the processor runs the loop, what predictions will it make about the BEQ and BNE instructions?
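One way to explore the question about BEQ and BNE is to simulate a simple predictor on the loop's branch outcomes. The slides do not say which prediction scheme is used, so the 2-bit saturating counter below is an assumption chosen because it is a common textbook scheme; the class and function names are invented for this sketch.

```python
class TwoBitCounter:
    """Saturating counter in 0..3; states 2 and 3 predict 'taken'. (Assumed scheme.)"""

    def __init__(self, state=1):
        self.state = state

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        # Move one step toward the actual outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def count_mispredictions(outcomes, counter=None):
    """Count wrong guesses for one branch instruction over a list of outcomes."""
    if counter is None:
        counter = TwoBitCounter()
    wrong = 0
    for taken in outcomes:
        if counter.predict() != taken:
            wrong += 1
        counter.update(taken)
    return wrong


def loop_outcomes(values):
    """Taken/not-taken history of BEQ and BNE for one run of the loop over values."""
    # BEQ skips count++ exactly when the element is NOT negative.
    beq = [not (v < 0) for v in values]
    # BNE jumps back to L1 on every pass except the last one.
    bne = [True] * (len(values) - 1) + [False]
    return beq, bne
```

For an array that is mostly negative, such as `[-1] * 9 + [5] + [-1] * 10`, this model predicts BEQ as not taken and BNE as taken almost every time: BEQ is mispredicted once (at the single non-negative element) and BNE twice (on the first pass, before the counter has warmed up, and on loop exit).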

Slide 40/86: Delayed branch and jump rules
This kind of solution to control hazards is older and less sophisticated than branch prediction.
This is a feature of the real MIPS instruction set, but is NOT enabled by default in MARS and other MIPS simulators used for education!
The idea is that one instruction of useful work can get started in the clock cycle needed to make a branch decision and compute a branch or jump target address.
Details are in the two paragraphs at the bottom of the page in the Control Hazard Solutions document.

Slide 41/86: MIPS delayed branch example
What will the flow of instructions be if $t0 != 0? What will it be if $t0 == 0?
        slt  operands
        beq  $t0, $zero, L1
        add  operands
        lw   operands
    L1: or   operands
        sub  operands

Slide 42/86: Examples of MIPS delayed jumps
    # C code: i = foo(17); ... suppose i is in $s0.
    jal   foo
    addiu $a0, $zero, 17    # Argument set up after call starts!
    addu  $s0, $v0, $zero   # $ra points to this instruction.

    # Example return from nonleaf procedure...
    jr    $ra
    addiu $sp, $sp, 32      # Deallocate stack after return starts!
If you ever do assembly language programming for real MIPS processors, or need to read MIPS compiler output, be aware of delayed branches and jumps!

Slide 43/86: Conditional instructions
    if (a < b)
        c = a;
    else
        c = b;
Suppose this if-else code is inside a loop. Translating this with a branch and a jump could cause a lot of lost clock cycles, especially if branch prediction does a poor job.
Suppose that a, b, and c are ints in $s0, $s1, $s2. Let's see how this can be coded with MIPS move conditional instructions movn and movz.
By the way, ARM instruction sets have very rich collections of conditional instructions.
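To make the idea concrete, here is a small model of one possible branch-free translation. It is a sketch, not the translation worked out in lecture: the use of $t0 as a scratch register and the helper names are assumptions. The MIPS instruction being modelled is shown in the comment on each line.

```python
def slt(regs, rd, rs, rt):
    """SLT rd, rs, rt: rd = 1 if rs < rt (signed), else 0."""
    regs[rd] = 1 if regs[rs] < regs[rt] else 0


def movn(regs, rd, rs, rt):
    """MOVN rd, rs, rt: copy rs into rd only if rt is nonzero."""
    if regs[rt] != 0:
        regs[rd] = regs[rs]


def movz(regs, rd, rs, rt):
    """MOVZ rd, rs, rt: copy rs into rd only if rt is zero."""
    if regs[rt] == 0:
        regs[rd] = regs[rs]


def if_else_without_branches(a, b):
    """Model of: if (a < b) c = a; else c = b;  with a, b, c in $s0, $s1, $s2."""
    regs = {"$s0": a, "$s1": b, "$s2": 0, "$t0": 0}
    slt(regs, "$t0", "$s0", "$s1")    # slt  $t0, $s0, $s1   # $t0 = (a < b)
    movn(regs, "$s2", "$s0", "$t0")   # movn $s2, $s0, $t0   # if (a < b) c = a
    movz(regs, "$s2", "$s1", "$t0")   # movz $s2, $s1, $t0   # else c = b
    return regs["$s2"]
```

All three instructions always execute, so there is no branch for the processor to predict; exactly one of the two conditional moves actually changes $s2.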

Slide 45/86: Making pipelining work in hardware
The textbook presents a sequence of designs, starting from Figure 7.45. The earliest designs are incomplete and incorrect in many ways. Later designs get closer and closer to being complete and correct.
Recommendation: Read the corresponding subsections of Section 7.5 carefully and observe how new features get added and existing features get modified.

Slide 46/86: Remarks on Textbook Figure 7.47
This computer handles R-type, LW, and SW instructions correctly, except when there are data hazards.
It makes an attempt to handle BEQ, but doesn't get it right. This computer works as if three delay-slot instructions should be processed before a branch is taken.

Slide 47/86: D flip-flops: What's the point? (Repeat slide from Slide Set 6)
This is important! Knowing what a D flip-flop does is as important as knowing the truth tables for NOT, AND, and OR.
A clock cycle is a span of time from one active edge of a clock to the next active edge.
A D flip-flop captures the value of the input bit D at the end of a clock cycle, and makes that captured bit value available on Q throughout the next clock cycle.
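The capture-then-hold behaviour can be modelled in a few lines. This is a behavioural sketch only (it ignores setup and hold times and propagation delay), and the class and function names are invented for the illustration.

```python
class DFlipFlop:
    """Behavioural model: Q holds one bit for a whole clock cycle."""

    def __init__(self, q=0):
        self.q = q  # value visible on Q during the current cycle

    def clock_edge(self, d):
        """Active clock edge: capture D; it appears on Q for the next cycle."""
        self.q = d


def trace_q(d_values, initial_q=0):
    """Return the Q value seen during each cycle, given D at the end of each cycle."""
    ff = DFlipFlop(initial_q)
    seen = []
    for d in d_values:
        seen.append(ff.q)   # Q is stable throughout this cycle
        ff.clock_edge(d)    # edge that ends this cycle and starts the next
    return seen
```

Feeding D values 1, 0, 1, 1 into a flip-flop that starts at 0 gives Q values 0, 1, 0, 1: Q is always the D value from the end of the previous cycle.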

Slide 48/86: Pipeline registers
Prominent in all of the Section 7.5 designs are pipeline registers made of D flip-flops.
The pipeline registers are not 32 bits wide; they're much wider than that.
They have clock inputs; the register outputs change only on active clock edges.
At the end of each clock cycle, each pipeline register collects information from one pipeline stage, and makes that information available to the next stage throughout the next clock cycle.

Slide 49/86: A sketch of a pipelined datapath
This is essentially textbook Figure 7.46 with the wiring removed to reduce clutter. Note the highlighted pipeline registers!
[Sketch, left to right, with clocked components marked CLK:
  F stage: PC, I-Mem, +4 adder
  F/D pipeline register
  D stage: R-File, SignExt
  D/E pipeline register
  E stage: <<2 shifter, ALU, adder
  E/M pipeline register
  M stage: D-Mem
  M/W pipeline register
  W stage]

Slide 50/86: Review: Edge-triggering for pipelined computers in Section 7.5
Updates to GPRs in the Register File happen in response to negative clock edges.
Updates to PC, Data Memory, and pipeline registers happen in response to positive clock edges.
[Waveform: system clock, switching between 1 and 0]

Slide 51/86: Tracing an instruction through the datapath of Figure 7.46
Let's trace an R-type instruction: SLT $2, $4, $5. We'll assume that this instruction is located at address 0x0040_0030 in Instruction Memory.
For now, we'll look at the datapath only. We'll consider control later, after we have seen the whole datapath.

Slide 52/86: SLT $2, $4, $5 located at 0x0040_0030: F stage
[Schematic of the F stage, showing: multiplexer (inputs 0 and 1), PC, PCF, I-Mem, +4 adder producing PCPlus4F, F/D pipeline register, and PCBranchM arriving from the M stage]
How many DFFs are there in the F/D register?
What values get written into the F/D register at the end of the Fetch clock cycle of the SLT?

Slide 53/86: SLT $2, $4, $5 located at 0x0040_0030: D stage
[Schematic of the D stage, showing: F/D pipeline register, InstrD with bit fields 25:21, 20:16, 15:11, and 15:0, R-File with WE3, SignExt, PCPlus4D, WriteRegW and ResultW arriving at the R-File, D/E pipeline register]
How many DFFs are there in the D/E register?
What gets into the D/E register at the end of the Decode clock cycle?
What is going on with WriteRegW and ResultW?

Slide 54/86: SLT $2, $4, $5 located at 0x0040_0030: E stage
[Schematic of the E stage, showing: D/E pipeline register, RtE, RdE, SignImmE, PCPlus4E, two multiplexers (inputs 0 and 1), SrcAE, SrcBE, ALU, <<2 shifter, adder, WriteDataE, WriteRegE, E/M pipeline register]
How many DFFs are there in the E/M register?
For the SLT instruction, what useful information gets written into the E/M register at the end of the Execute clock cycle?
What useful information gets written into the E/M register in the cases of LW, SW and BEQ?

Slide 55/86: SLT $2, $4, $5 located at 0x0040_0030: M stage
[Schematic of the M stage, showing: E/M pipeline register, ZeroM, ALUOutM, WriteDataM, WriteRegM, PCBranchM, D-Mem with WE, M/W pipeline register]
How many DFFs are there in the M/W register?
For the SLT instruction, what useful information gets written into the M/W register at the end of the Memory clock cycle?
What happens in the M stage for LW, SW, and BEQ?

Slide 56/86: SLT $2, $4, $5 located at 0x0040_0030: W stage
[Schematic of the W stage, showing: M/W pipeline register, ALUOutW, ReadDataW, multiplexer (inputs 0 and 1) producing ResultW, WriteRegW]
For the SLT instruction, what happens in the Writeback stage? Let's draw part of a schematic to help explain it.
What would be the same and what would be different for an LW instruction in the W stage?

Slide 57/86: Pipelined control for the Figure 7.46 datapath
Perhaps surprisingly, we can use exactly the same control unit that was designed for the single-cycle machine. We can drop the Control Unit into the Decode stage.
However, now we must organize the control signals so that each one arrives at the correct time wherever it is needed on the datapath! For example...
  Q1: RegWrite = 1 is generated for LW. When should that value of RegWrite arrive at the R-File?
  Q2: MemWrite = 1 is generated for SW. When should that value of MemWrite arrive at D-Mem?
  Q3: What general method can we use to get the timing correct for all of the control signals?
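One general method is to let control signals travel through the pipeline registers alongside the instruction they belong to. The sketch below models that idea; it is a simplification in which each pipeline register carries the whole set of signals (real hardware would only pass on the signals still needed downstream), and `run_control_pipeline` is a name made up for this example.

```python
def run_control_pipeline(decoded):
    """Trace control signals through the D/E, E/M and M/W pipeline registers.

    decoded: list of control-signal dicts produced in Decode, one per clock cycle.
    Returns one dict per cycle giving the signals visible in stages D, E, M, W
    (None means no instruction is in that stage).
    """
    d_e = e_m = m_w = None
    history = []
    for ctrl in decoded + [None, None, None]:   # extra cycles drain the pipeline
        history.append({"D": ctrl, "E": d_e, "M": e_m, "W": m_w})
        # Positive clock edge: each register captures from the stage before it.
        m_w, e_m, d_e = e_m, d_e, ctrl
    return history
```

If LW is decoded in cycle 0 and SW in cycle 1, then in cycle 3 the LW signals (including RegWrite = 1) are visible in the W stage while the SW signals (including MemWrite = 1) are visible in the M stage, which is when the R-File and D-Mem respectively need them.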

Slide 58/86: Control circuit for pipelined datapath of Figure 7.46
[Schematic: the Control Unit takes opcode (Instr 31:26) and funct (Instr 5:0) and produces RegWriteD, MemtoRegD, MemWriteD, BranchD, ALUControlD, ALUSrcD, RegDstD.
 The D/E pipeline register passes on RegWriteE, MemtoRegE, MemWriteE, BranchE, ALUControlE, ALUSrcE, RegDstE.
 The E/M pipeline register passes on RegWriteM, MemtoRegM, MemWriteM, BranchM; BranchM and ZeroM (from ALU) produce PCSrcM.
 The M/W pipeline register passes on RegWriteW (to R-File) and MemtoRegW.
 All three pipeline registers are clocked by CLK.]
Let's make a few notes about how this circuit works.

Slide 59/86: How much progress have we made so far?
Reminder: processor designs near the beginning of Section 7.5 are incomplete and partly incorrect. Processor designs get better and better as corrections and improvements are made.
The datapath and control system we have just looked at in detail are combined in the textbook in the computer of Figure 7.47. That computer can't deal with data hazards and handles BEQ incorrectly.

Slide 61/86: Hardware features to manage data hazards
Let's start by reviewing two of the more complicated kinds of data hazard.
For example #2 of the Hazard Examples document...
first ADD     F D E M W
second ADD      F D E M W
SUB               F D E M W
Let's illustrate why forwarding by itself won't work for example #4 in Hazard Examples...

Slide 62/86: Hardware for forwarding
This incomplete sketch of an upgraded Execute stage allows a lot of choice for ALU A and B inputs!
[Sketch: ID/EX pipeline register (clocked by CLK) supplying two GPR values and the LW/SW offset; multiplexers controlled by ForwardAE and ForwardBE (from the Hazard Unit) and by ALUSrcE (inputs 0 and 1) choosing ALU inputs A and B; ALUOutM and ResultW available as forwarded values; WriteDataE]

Slide 63/86: Hardware for forwarding, continued
Q1: What should the values of ForwardAE and ForwardBE be in the case where no forwarding is needed?
Consider this sequence:
    LW  R8, 0(R4)
    AND R9, R10, R11
    SUB R12, R8, R9
Q2: What should the values of ForwardAE and ForwardBE be when SUB is in the EX stage?
Q3: What inputs does the Hazard Unit need in order to decide correctly on the values of ForwardAE and ForwardBE?

Slide 64/86: Hazard Unit for computer of textbook Figure 7.50
[Diagram: Hazard Unit with inputs RsE, RtE, WriteRegM, RegWriteM, WriteRegW, RegWriteW and outputs ForwardAE, ForwardBE]
What are RsE and RtE, and how are they used by the Hazard Unit?
A complete description of the logic in this version of the Hazard Unit can be found on pages 416 and 418 in the textbook.
Note: The computer of Figure 7.50 properly handles data hazards that can be solved using forwarding only. It is not capable of solving data hazards that require stalls.
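The kind of decision the Hazard Unit makes for each ALU source can be sketched as a small function. This is an illustration written from the inputs listed on the slide, not a copy of the textbook's logic; the function name and the symbolic return values ("ALUOutM", "ResultW", "RegFile") are assumptions used instead of the actual multiplexer select encodings.

```python
def forward_select(src_reg_e, write_reg_m, reg_write_m, write_reg_w, reg_write_w):
    """Choose where one ALU operand should come from while its instruction is in E.

    src_reg_e is RsE (for ForwardAE) or RtE (for ForwardBE).
    Register 0 is never forwarded because it always reads as zero.
    """
    # The instruction in M is the more recent writer, so it takes priority.
    if src_reg_e != 0 and reg_write_m and src_reg_e == write_reg_m:
        return "ALUOutM"
    if src_reg_e != 0 and reg_write_w and src_reg_e == write_reg_w:
        return "ResultW"
    return "RegFile"   # no forwarding needed
```

Take the sequence LW R8, 0(R4); AND R9, R10, R11; SUB R12, R8, R9. When SUB is in the E stage, AND is in M (WriteRegM = 9) and LW is in W (WriteRegW = 8). With these inputs, `forward_select(8, 9, 1, 8, 1)` returns "ResultW" for the A operand and `forward_select(9, 9, 1, 8, 1)` returns "ALUOutM" for the B operand.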

Slide 65/86: Hardware for data hazard stalls
This is an example of what is called a load-use data hazard:
    LW  $8, 0($9)
    ADD $16, $17, $8
    SUB $18, $4, $5
We've already seen that a one-cycle stall is needed so that the M stage result of LW can be forwarded to the E stage of ADD.
The need for a stall can be detected in the D stage of ADD.
Let's draw a diagram to show how LW, ADD, and SUB will be processed.

Slide 66/86: To make this work in hardware, we must enhance some of the registers in the system...
  - Add an EN (enable) input to the PC. If EN is turned off the PC is frozen and does not update on a positive clock edge.
  - Add a similar EN input to the F/D pipeline register.
  - Add a CLR (clear) input to the D/E pipeline register. If CLR is turned on, the instruction arriving in the register is converted to a harmless NOP.
These changes are sketched in an incomplete schematic on the next slide...

Slide 67/86
[Incomplete schematic: PC with EN input driven by StallF; F/D register with EN input driven by StallD; D/E register with CLR input driven by FlushE; an extension to the Hazard Unit with inputs RsD, RtD, RtE, MemtoRegE; PC and both registers are clocked by CLK]
For clarity, the schematic above only shows Hazard Unit inputs and outputs that are used to effect the stall for LW instructions. See textbook Figure 7.53 for a complete schematic.
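The stall decision itself can be sketched from the inputs and outputs named in the schematic. The function below is a hedged illustration rather than the textbook's exact equations: it assumes that MemtoRegE = 1 identifies an LW in the E stage and that RtE is the register that LW will write.

```python
def load_use_stall(rs_d, rt_d, rt_e, mem_to_reg_e):
    """Decide whether the instruction in D must wait one cycle for an LW in E.

    A stall is needed when the LW destination (RtE) matches either source
    register (RsD or RtD) of the instruction currently being decoded.
    """
    lwstall = bool(mem_to_reg_e) and (rs_d == rt_e or rt_d == rt_e)
    return {
        "StallF": lwstall,  # freeze the PC (EN off)
        "StallD": lwstall,  # freeze the F/D register (EN off)
        "FlushE": lwstall,  # clear the D/E register, inserting a NOP
    }
```

For LW $8, 0($9) followed by ADD $16, $17, $8, the call `load_use_stall(17, 8, 8, 1)` turns on all three outputs. If the instruction right after LW were SUB $18, $4, $5 instead, `load_use_stall(4, 5, 8, 1)` would leave them all off.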

Slide 68/86
For a complete description of all of the logic used to effect the stall for LW instructions, see the textbook.
In lecture, it's really only possible to present a sketch of that logic.

Slide 70/86: Hardware changes to manage control hazards
ENCM 369 will NOT cover this material in depth, and there will be NO lab exercises or midterm or final exam questions on it!
The Figure 7.53 processor is excellent regarding data hazards, but handles BEQ instructions poorly: three instructions follow a BEQ into the pipeline before the branch decision gets made.
Why does that happen? The Figure 7.53 processor makes the branch decision in the Memory stage. (Check the location of the AND gate... )

Slide 71/86: Redesign to make branch instructions work better
The processor of Figure 7.56 moves the branch decision from the Memory stage to the Decode stage, and the branch target address generation from the Execute stage to the Decode stage.
So, only one instruction follows BEQ into the pipeline before a branch is taken, which is better, but making the decision in the Decode stage causes new data hazards!
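The link between the decision stage and the number of instructions that follow BEQ into the pipeline can be written down directly. This sketch assumes one instruction is fetched per cycle behind the branch and that every one of them is cancelled when the branch is taken (no branch prediction, no delay slots); the names are invented for the example.

```python
STAGE_ORDER = ["F", "D", "E", "M", "W"]


def instructions_behind_branch(decision_stage):
    """Instructions fetched after BEQ by the time BEQ reaches decision_stage."""
    return STAGE_ORDER.index(decision_stage)


def cycles_lost(taken_branches, decision_stage):
    """Clock cycles wasted if every instruction behind a taken branch is cancelled."""
    return taken_branches * instructions_behind_branch(decision_stage)
```

With the decision in the Memory stage, `instructions_behind_branch("M")` is 3; moving it to the Decode stage gives `instructions_behind_branch("D")`, which is 1. Under these assumptions, 1000 taken branches would cost 3000 cycles in the first design and 1000 in the second.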

Slide 72/86: Redesign to make branch instructions work better, continued
Example #6 from the Hazard Examples document, with an extra instruction...
    LW  $17, 0($4)
    BEQ $17, $0, some_other_label
    ADD $2, $5, $6
What is needed to get the LW result into the Decode step of BEQ?
If the branch is taken, what should happen to ADD? (Assume that we're designing a computer that does NOT have a delayed branch rule.)

Slide 73/86: Redesign to make branch instructions work better: Remarks
It's hard to process branches with perfect accuracy without losing lots of cycles due to hazards!
Therefore, dynamic branch prediction can save a lot of cycles if most guesses are correct.
Also, conditional instructions such as MIPS movn and movz are sometimes better choices than branch instructions.

Slide 75/86: Exceptions: General Concepts
An exception is an event that changes flow of instructions in a way that is quite different from a branch or jump.
So, obviously, an exception causes a special kind of PC update.
But an exception can also cause a change in privilege: a switch from a user program to operating system kernel software.

76 ENCM 369 Winter 2017 Slide Set 7 for Lecture Section 01 slide 76/86 Privilege: user program vs. kernel A user program has rights to read and write memory allocated to that program and to read and write registers. That's all it can do by itself, but it can also ask for help from the kernel. The kernel controls hardware like disks and network interfaces. The kernel has power over all memory in the computer and can start and stop all other programs.

77 ENCM 369 Winter 2017 Slide Set 7 for Lecture Section 01 slide 77/86 Two meanings for exception
The concept of an exception in discussion of hardware or assembly language code is NOT THE SAME as the concept of an exception in a high-level language like C++, Java, or Python!
Exception-related keywords in C++: try, catch, throw
Exception-related keywords in Java: try, catch, finally, throw, throws
Exception-related keywords in Python: try, except, raise

78 ENCM 369 Winter 2017 Slide Set 7 for Lecture Section 01 slide 78/86 Two meanings for exception, continued High-level language exception: a special kind of jump (possibly involving a return through one or more procedure calls) to code that is set up to handle an error condition. Do NOT try to connect the above concept to hardware exceptions. If you do, your brain will hurt and your understanding of both kinds of exceptions will be damaged.

79 ENCM 369 Winter 2017 Slide Set 7 for Lecture Section 01 slide 79/86 Exceptions in Hardware and Assembly Language: 3 Main Categories
1. The processor notices that a program has tried to do a bad thing.
2. A program intentionally generates the exception.
3. Interrupts: hardware external to the processor sends a signal to the processor asking for attention.

80 ENCM 369 Winter 2017 Slide Set 7 for Lecture Section 01 slide 80/86 Examples of program-trying-a-bad-thing exceptions
Instruction fetched with opcode that does not make sense to processor ("undefined instruction").
Addition or subtraction of integers resulted in overflow (e.g., MIPS ADD, SUB, ADDI, but not ADDU, SUBU, ADDIU).
Attempt to access memory a program is not permitted to access.
Attempt to access memory with invalid address (e.g., LW data address is not a multiple of 4).
(Note: Memory units in Chapter 7 computers don't have the capability to report memory access errors, but memory systems in real computers usually do.)
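Two of the conditions above are easy to state as checks. The Python sketch below is illustrative only: the function names are hypothetical, and real hardware detects these conditions with logic circuits, not software.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def add_overflows(a, b):
    """True if the signed sum a + b does not fit in 32-bit two's complement.

    This is the condition under which MIPS ADD raises an exception;
    ADDU computes the same bit pattern but never raises one.
    """
    return not (INT32_MIN <= a + b <= INT32_MAX)


def lw_address_is_aligned(address):
    """True if a word address is a multiple of 4, as LW requires."""
    return address % 4 == 0


print(add_overflows(0x7FFFFFFF, 1))       # largest positive int plus one
print(add_overflows(-1, 1))               # ordinary signed addition
print(lw_address_is_aligned(0x10010002))  # not a multiple of 4
```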

81 ENCM 369 Winter 2017 Slide Set 7 for Lecture Section 01 slide 81/86 Programs intentionally causing exceptions This mainly happens with system calls. Examples: MIPS syscall instruction, similar instructions in other instruction sets. A user program asks the operating system kernel to provide a service.

82 ENCM 369 Winter 2017 Slide Set 7 for Lecture Section 01 slide 82/86 Examples of Interrupts Laptop user presses key on keyboard. Desktop user moves a mouse. Smartphone or tablet user taps finger on a touchscreen. A data packet arrives on a network interface. A disk controller reports that a write operation on a disk has completed.

83 ENCM 369 Winter 2017 Slide Set 7 for Lecture Section 01 slide 83/86 What happens when an exception occurs? The processor will start executing instructions that form an exception handler (like a procedure, but not exactly the same). Before starting the exception handler, the processor must record some essential information in some special-purpose registers... Let's make some notes about this essential information.

84 ENCM 369 Winter 2017 Slide Set 7 for Lecture Section 01 slide 84/86 Exceptions and pipelines Due to time limitations and lack of textbook support, we will not look in detail at this topic, just give a quick sketch. Useful terms: exception victim and flushing. The victim of an exception is either the instruction that caused the exception or, when there is an interrupt, the first instruction in the pipeline that will not be allowed to complete. To flush an instruction in a pipeline means ensuring that the instruction does not update system state, such as register file or memory contents.

85 ENCM 369 Winter 2017 Slide Set 7 for Lecture Section 01 slide 85/86 Exceptions and pipelines: key challenges
Instructions that enter a pipeline before the victim must be allowed to complete.
The victim and the instructions that followed the victim in the pipeline must be flushed.
The address of the victim must be identified. This is NOT easy, because in a pipelined system, the PC probably will NOT be pointing to the victim.

86 ENCM 369 Winter 2017 Slide Set 7 for Lecture Section 01 slide 86/86 Example of MIPS exception processing
Suppose there is an exception when the LW instruction (at address 0x0040_0090) is in the Memory stage. What should happen?
Scenario 1: Exception is caused by $t0 not being a multiple of 4.
Scenario 2: Exception is an interrupt, unrelated to this program.
    # Code running in a 5-stage pipeline, an actual MIPS computer, not a Ch. 7 machine!
    andi  $t2, $s4, 0xFF
    sll   $t3, $t2, 8
    or    $s2, $s2, $t3
    lw    $t1, ($t0)
    addiu $t0, $t0, 4
    sw    $t1, ($s0)
    addiu $s0, $s0, 4
    slt   $t4, $t0, $s7
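For Scenario 1, the "complete or flush" rule for a victim can be sketched in Python. This is a rough illustration under stated assumptions, not a description of real MIPS hardware: only the LW address comes from the slide, the other addresses are derived by assuming consecutive 4-byte instructions, and the function name is hypothetical.

```python
# Pipeline contents when LW is in the Memory stage, oldest instruction first.
# Only the LW address is given; the others assume consecutive 4-byte instructions.
PIPELINE = [
    ("Writeback", 0x0040008C, "or $s2, $s2, $t3"),
    ("Memory",    0x00400090, "lw $t1, ($t0)"),
    ("Execute",   0x00400094, "addiu $t0, $t0, 4"),
    ("Decode",    0x00400098, "sw $t1, ($s0)"),
    ("Fetch",     0x0040009C, "addiu $s0, $s0, 4"),
]


def split_at_victim(pipeline, victim_stage):
    """Return (instructions that complete, instructions flushed, victim address)."""
    stages = [stage for stage, _, _ in pipeline]
    v = stages.index(victim_stage)
    completes = [text for _, _, text in pipeline[:v]]  # entered before the victim
    flushed = [text for _, _, text in pipeline[v:]]    # the victim and all younger
    return completes, flushed, pipeline[v][1]


completes, flushed, victim_address = split_at_victim(PIPELINE, "Memory")
print(completes)
print(flushed)
print(hex(victim_address))
```

In Scenario 1 the victim is the LW itself, so only the OR completes and the recorded victim address is the LW address, not the PC value at that moment. For Scenario 2 the choice of victim is a design decision, which could be explored by passing a different stage name.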


More information

Thomas Polzer Institut für Technische Informatik

Thomas Polzer Institut für Technische Informatik Thomas Polzer tpolzer@ecs.tuwien.ac.at Institut für Technische Informatik Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup =

More information

Computer Organization and Structure. Bing-Yu Chen National Taiwan University

Computer Organization and Structure. Bing-Yu Chen National Taiwan University Computer Organization and Structure Bing-Yu Chen National Taiwan University The Processor Logic Design Conventions Building a Datapath A Simple Implementation Scheme An Overview of Pipelining Pipelined

More information

COMPUTER ORGANIZATION AND DESI

COMPUTER ORGANIZATION AND DESI COMPUTER ORGANIZATION AND DESIGN 5 Edition th The Hardware/Software Interface Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count Determined by ISA and compiler

More information

Working on the Pipeline

Working on the Pipeline Computer Science 6C Spring 27 Working on the Pipeline Datapath Control Signals Computer Science 6C Spring 27 MemWr: write memory MemtoReg: ALU; Mem RegDst: rt ; rd RegWr: write register 4 PC Ext Imm6 Adder

More information

University of Calgary Department of Electrical and Computer Engineering ENCM 369: Computer Organization Instructor: Steve Norman

University of Calgary Department of Electrical and Computer Engineering ENCM 369: Computer Organization Instructor: Steve Norman page of 9 University of Calgary Department of Electrical and Computer Engineering ENCM 369: Computer Organization Instructor: Steve Norman Winter 26 FINAL EXAMINATION (with corrections) Location: ICT 2

More information

Processor (I) - datapath & control. Hwansoo Han

Processor (I) - datapath & control. Hwansoo Han Processor (I) - datapath & control Hwansoo Han Introduction CPU performance factors Instruction count - Determined by ISA and compiler CPI and Cycle time - Determined by CPU hardware We will examine two

More information

Binvert Operation (add, and, or) M U X

Binvert Operation (add, and, or) M U X Exercises 5 - IPS datapath and control Questions 1. In the circuit of the AL back in lecture 4, we included an adder, an AND gate, and an OR gate. A multiplexor was used to select one of these three values.

More information

CS 61C Summer 2016 Guerrilla Section 4: MIPS CPU (Datapath & Control)

CS 61C Summer 2016 Guerrilla Section 4: MIPS CPU (Datapath & Control) CS 61C Summer 2016 Guerrilla Section 4: MIPS CPU (Datapath & Control) 1) If this exam were a CPU, you d be halfway through the pipeline (Sp15 Final) We found that the instruction fetch and memory stages

More information

Computer Organization and Structure

Computer Organization and Structure Computer Organization and Structure 1. Assuming the following repeating pattern (e.g., in a loop) of branch outcomes: Branch outcomes a. T, T, NT, T b. T, T, T, NT, NT Homework #4 Due: 2014/12/9 a. What

More information

Lecture Topics. Announcements. Today: Single-Cycle Processors (P&H ) Next: continued. Milestone #3 (due 2/9) Milestone #4 (due 2/23)

Lecture Topics. Announcements. Today: Single-Cycle Processors (P&H ) Next: continued. Milestone #3 (due 2/9) Milestone #4 (due 2/23) Lecture Topics Today: Single-Cycle Processors (P&H 4.1-4.4) Next: continued 1 Announcements Milestone #3 (due 2/9) Milestone #4 (due 2/23) Exam #1 (Wednesday, 2/15) 2 1 Exam #1 Wednesday, 2/15 (3:00-4:20

More information

EITF20: Computer Architecture Part2.2.1: Pipeline-1

EITF20: Computer Architecture Part2.2.1: Pipeline-1 EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle

More information

3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle?

3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle? CSE 2021: Computer Organization Single Cycle (Review) Lecture-10b CPU Design : Pipelining-1 Overview, Datapath and control Shakil M. Khan 2 Single Cycle with Jump Multi-Cycle Implementation Instruction:

More information

CSEE W3827 Fundamentals of Computer Systems Homework Assignment 3 Solutions

CSEE W3827 Fundamentals of Computer Systems Homework Assignment 3 Solutions CSEE W3827 Fundamentals of Computer Systems Homework Assignment 3 Solutions 2 3 4 5 Prof. Stephen A. Edwards Columbia University Due June 26, 207 at :00 PM ame: Solutions Uni: Show your work for each problem;

More information

Slides for Lecture 6

Slides for Lecture 6 Slides for Lecture 6 ENCM 501: Principles of Computer Architecture Winter 2014 Term Steve Norman, PhD, PEng Electrical & Computer Engineering Schulich School of Engineering University of Calgary 28 January,

More information

CSE 141 Computer Architecture Spring Lectures 11 Exceptions and Introduction to Pipelining. Announcements

CSE 141 Computer Architecture Spring Lectures 11 Exceptions and Introduction to Pipelining. Announcements CSE 4 Computer Architecture Spring 25 Lectures Exceptions and Introduction to Pipelining May 4, 25 Announcements Reading Assignment Sections 5.6, 5.9 The Processor Datapath and Control Section 6., Enhancing

More information

Laboratory Pipeline MIPS CPU Design (2): 16-bits version

Laboratory Pipeline MIPS CPU Design (2): 16-bits version Laboratory 10 10. Pipeline MIPS CPU Design (2): 16-bits version 10.1. Objectives Study, design, implement and test MIPS 16 CPU, pipeline version with the modified program without hazards Familiarize the

More information

CS Computer Architecture Spring Week 10: Chapter

CS Computer Architecture Spring Week 10: Chapter CS 35101 Computer Architecture Spring 2008 Week 10: Chapter 5.1-5.3 Materials adapated from Mary Jane Irwin (www.cse.psu.edu/~mji) and Kevin Schaffer [adapted from D. Patterson slides] CS 35101 Ch 5.1

More information

CS 251, Winter 2018, Assignment % of course mark

CS 251, Winter 2018, Assignment % of course mark CS 251, Winter 2018, Assignment 5.0.4 3% of course mark Due Wednesday, March 21st, 4:30PM Lates accepted until 10:00am March 22nd with a 15% penalty 1. (10 points) The code sequence below executes on a

More information

CS 61C Fall 2016 Guerrilla Section 4: MIPS CPU (Datapath & Control)

CS 61C Fall 2016 Guerrilla Section 4: MIPS CPU (Datapath & Control) CS 61C Fall 2016 Guerrilla Section 4: MIPS CPU (Datapath & Control) 1) If this exam were a CPU, you d be halfway through the pipeline (Sp15 Final) We found that the instruction fetch and memory stages

More information

Lecture 9. Pipeline Hazards. Christos Kozyrakis Stanford University

Lecture 9. Pipeline Hazards. Christos Kozyrakis Stanford University Lecture 9 Pipeline Hazards Christos Kozyrakis Stanford University http://eeclass.stanford.edu/ee18b 1 Announcements PA-1 is due today Electronic submission Lab2 is due on Tuesday 2/13 th Quiz1 grades will

More information

EIE/ENE 334 Microprocessors

EIE/ENE 334 Microprocessors EIE/ENE 334 Microprocessors Lecture 6: The Processor Week #06/07 : Dejwoot KHAWPARISUTH Adapted from Computer Organization and Design, 4 th Edition, Patterson & Hennessy, 2009, Elsevier (MK) http://webstaff.kmutt.ac.th/~dejwoot.kha/

More information