Module 4c: Pipelining
- Leona Thomas
- 6 years ago
Transcription
1 Module 4c: Pipelining. References: Stallings, Computer Organization and Architecture; Morris Mano, Computer Organization and Architecture; Patterson and Hennessy, Computer Organization and Design.
2 Pipeline: The performance of a computer can be increased by increasing the performance of the CPU. This can be done by executing more than one task at a time, a procedure referred to as pipelining. The concept of pipelining is to allow the processing of a new task to begin even though the processing of the previous task has not ended.
3 The root of the single-cycle processor's problems: the cycle time has to be long enough for the slowest instruction. Solution: break the instruction into smaller steps and execute each step (instead of the entire instruction) in one cycle. Cycle time: the time it takes to execute the longest step, so keep all the steps similar in length. This is the essence of the multiple-cycle processor.
4 The advantages of the multiple-cycle processor: the cycle time is much shorter; different instructions take different numbers of cycles to complete (a load takes five cycles, a jump only takes three); a functional unit can be used more than once per instruction.
5 Pipeline: A single task is divided into several small independent processes. [Figure: tasks T1, T2 and T3 passing through Segment 1, Segment 2 and Segment 3.]
6 Analogy: Pipelined Laundry. Non-pipelined approach: 1. run 1 load of clothes through the washer; 2. run the load through the dryer; 3. fold the clothes (optional step for students); 4. put the clothes away (also optional). Two loads? Start all over.
7 Analogy: Pipelined Laundry. While the first load is drying, put the second load in the washing machine. When the first load is being folded and the second load is in the dryer, put the third load in the washing machine.
8 [Figure: task order (loads A, B, C, D) against time, starting at 6 PM.] Non-pipelined: 16 units of time. Pipelined: 7 units of time. Notice that all the processes have the same length.
9 Pipeline: Space-Time Diagram. Illustrates the behaviour of a pipeline (S = segment; P = process; "-" = empty slot):
Clock: 1   2   3   4   5   6   7
S1:    P1  P2  P3  P4  -   -   -
S2:    -   P1  P2  P3  P4  -   -
S3:    -   -   P1  P2  P3  P4  -
S4:    -   -   -   P1  P2  P3  P4
10 Pipelining Lessons: Pipelining doesn't help the latency of a single task; it helps the throughput of the entire workload. Multiple tasks operate simultaneously using different resources. Potential speedup = number of pipe stages. The pipeline rate is limited by the slowest pipeline stage. Unbalanced lengths of pipe stages reduce speedup. The time to fill the pipeline and the time to drain it reduce speedup. Stall for dependences.
11 Design Issues: Since the segments are connected to each other in sequence, the next segment cannot start execution until it has received the result from the previous segment (in this case, pipelining is not ideal). So the cycle time of all segments must be the same. However, the execution time of each segment is not the same. Therefore, for synchronization, the cycle time of the pipeline is based on the longest execution time of any segment in the pipeline.
12 Pipeline performance: Degree of Speedup. Let $t_n$ be the cycle time without pipelining and $t_p$ the cycle time with pipelining. An ideal pipeline divides a task into $k$ independent sequential processes; each process requires $t_p$ time units to complete, so the task itself requires $k\,t_p$ time units. For $n$ iterations of the task, the execution times are: with no pipelining, $n\,t_n$ time units; with pipelining, $k\,t_p + (n-1)\,t_p$ time units. The degree of speedup is thus $S = \frac{\text{execution time without pipelining}}{\text{execution time with pipelining}} = \frac{n\,t_n}{[k + (n-1)]\,t_p}$. If $n$ is much larger than $k$ ($n \gg k$) and $t_n = k\,t_p$, the maximum speedup is $S_{max} = k$. In an ideal pipeline, each process requires the same amount of time.
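The speedup formula on this slide can be checked numerically. The following is a minimal sketch; the function name `pipeline_speedup` and the sample values (k = 4, t_p = 10) are illustrative assumptions, not part of the slides.

```python
def pipeline_speedup(n, k, t_n, t_p):
    """Return (non-pipelined time, pipelined time, speedup) for n tasks.

    n   : number of tasks
    k   : number of pipeline segments
    t_n : cycle time without pipelining
    t_p : cycle time with pipelining
    """
    time_non_pipelined = n * t_n
    time_pipelined = (k + n - 1) * t_p  # k cycles to fill, then one task per cycle
    return time_non_pipelined, time_pipelined, time_non_pipelined / time_pipelined


# Ideal case (assumed values): t_n = k * t_p, so the speedup approaches k as n grows.
k, t_p = 4, 10
for n in (1, 10, 1000, 1_000_000):
    _, _, s = pipeline_speedup(n, k, k * t_p, t_p)
    print(f"n = {n:>7}: speedup = {s:.4f}")
```

With a single task there is no gain at all, and the speedup only gets close to k once n is much larger than k, which is the n >> k condition stated above.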
13 The Laundromat: In a laundromat the following processes occur: run 1 load of clothes through the washer: 8; run the load through the dryer: 4; fold the clothes: 5; put the clothes away: 3. Answer these: Determine the cycle time for the pipelined and non-pipelined processes. Determine the execution time for both pipelined and non-pipelined execution. What is the maximum speedup possible? What is the real speedup value?
14 Non-Ideal Pipeline Structure: Example. The operands pass through all segments in a fixed sequence. Segments are separated by registers that hold the intermediate results between stages. Each segment consists of a circuit that performs a sub-operation. [Figure: Data In -> R1 -> S1 -> R2 -> S2 -> ... -> Rm -> Sm -> Data Out, with a Control Unit driving Segment 1 to Segment m.]
15 Example 1: Pipeline (Non-ideal case). Given a 4-segment pipeline in which each segment has a delay time as follows: - Segment 1 : 40 ns - Segment 2 : 25 ns - Segment 3 : 45 ns - Segment 4 : 45 ns. The delay time for the interface register is 5 ns. Calculate: i) the cycle time of the non-pipelined and pipelined systems, ii) the execution time for 100 tasks, iii) the real speedup, and iv) the maximum speedup.
16 Example 1: Pipeline. [Figure: Data Input -> Segment 1 (40 ns) -> Segment 2 (25 ns) -> Segment 3 (45 ns) -> Segment 4 (45 ns) -> Data Output, driven by the Control Unit.] i) Cycle time: $t_n$ = (40 + 25 + 45 + 45 + 5) ns = 160 ns; $t_p$ = the longest delay for execution + interface delay = (45 + 5) ns = 50 ns.
17 Example 1: Pipeline. ii) Execution time for 100 tasks: for the pipelined system, $k\,t_p + (n-1)\,t_p = (k + (n-1))\,t_p$ = ((4 * 50) + (99 * 50)) ns = (50 * 103) ns = 5150 ns. For the non-pipelined system, the total execution time for 100 tasks = $n\,t_n$ = 100 * 160 ns = 16000 ns. iii) The real speedup for 100 tasks: speedup = execution time for non-pipeline / execution time for pipeline = 16000 / 5150 ≈ 3.11. iv) Maximum speedup: $S_{max} = k = 4$.
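The arithmetic of Example 1 can be reproduced in a few lines. This is a sketch under the slide's own convention that the non-pipelined cycle time totals 160 ns (the four segment delays plus one 5 ns register delay); the variable names are illustrative.

```python
# Values taken from Example 1.
segment_delays_ns = [40, 25, 45, 45]
register_delay_ns = 5
n_tasks = 100

k = len(segment_delays_ns)

# i) Cycle times. Assumption: t_n is the sum of all segment delays plus one
#    register delay, which gives the 160 ns stated on the slide.
t_n = sum(segment_delays_ns) + register_delay_ns
t_p = max(segment_delays_ns) + register_delay_ns  # slowest segment + register

# ii) Execution times for n tasks.
time_pipelined = (k + n_tasks - 1) * t_p
time_non_pipelined = n_tasks * t_n

# iii) Real speedup and iv) maximum speedup as quoted on the slide (S_max = k).
real_speedup = time_non_pipelined / time_pipelined
max_speedup = k

print(f"t_n = {t_n} ns, t_p = {t_p} ns")
print(f"pipelined: {time_pipelined} ns, non-pipelined: {time_non_pipelined} ns")
print(f"real speedup = {real_speedup:.2f}, maximum speedup = {max_speedup}")
```

The real speedup stays below 4 because the segments are unbalanced: the 25 ns segment idles for part of every 50 ns cycle.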
18 Instruction Pipeline: The instruction cycle clearly shows the sequence of operations that take place in order to execute a single instruction. A good design goal of any system is to have all of its components performing useful work all of the time (high efficiency). Following the instruction cycle in a sequential fashion does not permit this level of efficiency. Analogy: automobile assembly line. Perform all tasks concurrently, but on different (sequential) instructions. The result is temporal parallelism: the instruction pipeline. This causes the instruction fetch and execute phases to overlap and perform simultaneous operations.
19 Implementation of Instruction Pipeline: Case 1. Divide the instruction cycle into two processes: instruction fetch (fetch cycle) and everything else (execution phase). While one instruction is in execution, the next instruction is pre-fetched. This assumes the memory bus will be idle at some point during the execution phase. It reduces the time to fetch an instruction to zero (ideal situation). Instruction prefetching is also known as fetch overlap.
20 Implementation of Instruction Pipeline: Case 1. [Figure: Sequential: Fetch #1, Execute #1, Fetch #2, Execute #2 run one after another. Pipeline: Fetch #2 overlaps with Execute #1.]
21 Implementation of Instruction Pipeline: Case 1. Fetch and execute are not the same size and take different times to operate, so it is not really neat or ideal. Execution is generally longer than fetch. Whenever the execution unit is not using memory, the control increments the program counter and reads the consecutive instructions from memory (efficient): fetch and put into a queue. But sometimes execution needs to branch, so all the prefetching is useless, e.g. it needs instruction 123 rather than 102: flush the queue and fetch again. Also, if there is a branch, fetch must wait for the address from execute, and execute must wait until that address is fetched.
22 Implementation of Instruction Pipeline: Case 2. Complex computer instructions require other, finer phases. Having more stages can further increase speedup. Example: use a 6-stage pipeline:
- Fetch instruction (FI): read the next instruction into a buffer
- Decode instruction (DI): determine the opcode and operand
- Calculate operands (CO): calculate the effective address of the operand
- Fetch operands (FO): fetch the operand from memory
- Execute instruction (EI)
- Write (store) operand (WO): store the result in memory
The various stages will be of more nearly equal duration. Let's assume equal duration.
23 Implementation of Instruction Pipeline: Case 2. Use multiple execution functional units to parallelize the actual execution phase of several instructions. Use branching strategies to minimize branch impact.
24 Space Time Diagram: A 6-stage pipeline can reduce the execution time of 9 instructions from 54 to 14 time units. Non-pipelined: 9 instructions x 6 stages = 54 time units.
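The missing diagram for this slide can be regenerated programmatically. The sketch below assumes the ideal case only (every instruction passes through every stage, with no stalls, branches or memory conflicts); the helper name `space_time_diagram` is hypothetical.

```python
def space_time_diagram(stage_names, instructions):
    """Build an ideal space-time diagram (no stalls, no hazards).

    Returns (rows, total_cycles), where rows is a list of
    (stage name, list of cells), one cell per clock cycle.
    """
    k, n = len(stage_names), len(instructions)
    total_cycles = k + n - 1
    rows = []
    for s, name in enumerate(stage_names):
        cells = ["-"] * total_cycles
        for i, ins in enumerate(instructions):
            # Instruction i (0-based) occupies stage s during cycle i + s.
            cells[i + s] = ins
        rows.append((name, cells))
    return rows, total_cycles


stages = ["FI", "DI", "CO", "FO", "EI", "WO"]
instrs = [f"I{i}" for i in range(1, 10)]  # nine instructions, I1..I9

rows, cycles = space_time_diagram(stages, instrs)
for name, cells in rows:
    print(f"{name:<3}" + "".join(f"{c:>4}" for c in cells))
print("pipelined time units:    ", cycles)
print("non-pipelined time units:", len(stages) * len(instrs))
```

For six stages and nine instructions this gives the 14 and 54 time units quoted on the slide.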
25 Comments on diagram: It assumes each instruction goes through all 6 stages, which is not true; for example, a load instruction doesn't need the WO stage. It assumes there are no memory conflicts; for example, FI, FO and WO involve memory access and cannot do it simultaneously, unless the FO or WO stage is null or the value needed is already in cache, in which case this is not a problem. It assumes no interrupt or branching happens. It assumes no data dependency, where the CO stage may depend on the contents of a register that could be altered by a previous instruction still in the pipeline.
26 6-stage CPU Instruction Pipeline
27 Example 2: The estimated timings for each of the stages of an instruction pipeline: Instruction Fetch: 2 ns; Register Read: 1 ns; ALU Operation: 2 ns; Data Memory: 2 ns; Register Write: 1 ns.
28 [Figure: program execution order (mov ax, num; mov bx, num2; mov cx, num3) against time, each instruction passing through instruction fetch, reg, ALU, data access and reg. Non-pipelined: a new instruction starts every 8 ns. Pipelined: every stage is given 2 ns, so a new instruction starts every 2 ns.]
29 Pipeline Limitations
30 Pipeline Limitations: Difficulties are sometimes called limitations or hazards. In general there are 3 major difficulties that cause an instruction pipeline to deviate from its normal operation (i.e. affect pipeline performance). Resource conflict (or resource hazard or structural hazard): caused by access to memory by 2 segments at the same time. Data dependency (or data hazard): a conflict arises when an instruction depends on the result of a previous instruction, but this result is not yet available. Branch difficulties (or control hazard): arise from branch and other instructions that change the PC value. Pipeline depth is often not included in the list, but it does have an effect on pipeline performance (so it will be discussed).
31 Pipeline Limitations: Hazards can cause the pipeline to stall. Eliminating a hazard often requires that some instructions in the pipeline be allowed to proceed while others are delayed. When an instruction is stalled, instructions issued later than the stalled instruction are stopped, while the ones issued earlier must continue. No new instructions are fetched during the stall.
32 Pipeline Limitation: Pipeline depth. If the speedup is based on the number of stages, why not build lots of stages? Each stage uses latches at its input (output) to buffer the next set of inputs, so more stages means more hardware overhead. If the stage granularity is reduced too much, the latches and their control become a significant hardware overhead. There is also a time overhead in the propagation time through the latches, which limits the rate at which data can be clocked through the pipeline. The logic to handle memory and register use and to control the overall pipeline increases significantly with increasing pipeline depth.
33 6 Deep i.e. 6 stages
34 Pipeline Limitation: Data dependency. Pipelining must ensure that computed results are the same as if the computation was performed in strict sequential order. With multiple stages, two instructions in execution in the pipeline may have data dependencies; the pipeline must be designed to prevent this from producing wrong results. Data dependencies limit when an instruction can be input to the pipeline. Data dependency is shown in the following portion of a program: A = B + C; D = E + A; C = G x H; A = D / H; D needs A but can't read the value of A because it is still being written by the previous instruction.
35 Say that this is a 5-stage pipeline: I1: ADD AX, BX ; [AX] <-- [AX] + [BX]. I2: SUB AX, CX ; [AX] <-- [AX] - [CX]. The ADD instruction does not update register AX until stage 5, but the SUB instruction needs the value at the beginning of stage 3: it cannot fetch something that is not ready. To maintain correct operation, the pipeline must stall for 2 clock cycles. This results in inefficient pipeline usage.
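The 2-cycle stall can be derived from the stage numbers. The sketch below uses a simple timing model that is an assumption, not something the slides spell out: stages are numbered from 1, a result becomes readable only in the cycle after its write stage completes, and there is no forwarding. The function name `raw_stall_cycles` is hypothetical.

```python
def raw_stall_cycles(write_stage, read_stage, distance=1):
    """Stall cycles needed by a consumer instruction (no forwarding).

    write_stage : stage in which the producer updates the register
    read_stage  : stage at whose beginning the consumer needs the value
    distance    : how many instructions after the producer the consumer
                  is issued (1 = the very next instruction)

    Assumed model: the producer entering the pipeline at cycle c finishes its
    write stage at the end of cycle c + write_stage - 1; without stalls the
    consumer reaches its read stage at cycle c + distance + read_stage - 1.
    """
    return max(0, write_stage - read_stage - distance + 1)


# ADD writes AX in stage 5; the following SUB needs AX at the start of stage 3.
print(raw_stall_cycles(write_stage=5, read_stage=3, distance=1))

# One or two independent instructions in between reduce or remove the stall.
print(raw_stall_cycles(write_stage=5, read_stage=3, distance=2))
print(raw_stall_cycles(write_stage=5, read_stage=3, distance=3))
```

Under this model, separating the two instructions with independent work is equivalent to replacing stall cycles with useful cycles, which is the idea behind instruction reordering.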
36 Solutions to Data Dependency. Hardware interlocks: an interlock is a circuit that detects instructions with data dependency and inserts the required delays to resolve conflicts. Operand forwarding: uses a circuit to detect a possible conflict, then uses extra hardware to keep results for future use down the pipeline, allowing the result of the ALU to be sent directly as input to the ALU for use by other ALU operations in the next instruction cycle. Delayed load (NOP): the compiler is used to detect the conflict and reorder instructions to delay loading of the conflicting data, using NOP instructions.
37 Solution to Data Dependency: Operand Forwarding. Allowing the result of the ALU to be used as input to the ALU. [Figure: Src1, Src2 -> ALU Operation -> RSLT -> Operand Store, with a forwarding data path from the ALU result back to the ALU input.]
38 Solution to Data Dependency: Delay Load (NOP). Space-time diagram ("-" = empty slot):
Clock:               1    2    3    4    5    6    7    8
Fetch Instruction:   I1   NOP  I2   I3   I4   -    -    -
Decode Instruction:  -    I1   NOP  I2   I3   I4   -    -
Execute Instruction: -    -    I1   NOP  I2   I3   I4   -
Store Result:        -    -    -    I1   NOP  I2   I3   I4
39 Pipeline Limitation: Conflict of Resources. Occurs when two segments need to access memory at the same time. Can be solved by implementing modular memory. [Figure: space-time diagram of I1 to I5 through Fetch Instruction, Decode Instruction, Execute Instruction and Store Result. In one clock cycle the fetch of I4, the fetch of an indirect operand for I3 and the store/write of I1 all need access to memory at the same time.]
40 5-stage pipeline, ideal case. We have a conflict with 2 instructions needing the same resource. Assume that memory has a single port, so data reads and writes can only happen one at a time. Assume that the source operand for I1 is in memory. A delay in any stage can cause pipeline stalls. The FI stage for I3 must idle for 1 clock cycle before beginning.
41 Handling Resource Conflict: This scenario describes a memory conflict caused by the instruction fetch of I3 and the memory-resident operand fetch of I1.
42 Handling Resource Conflict: Harvard architecture alleviates this issue.
43 Pipeline Limitation 4: Branching. For the pipeline to have the desired operational speedup, we must feed it with long strings of instructions: there must be a steady flow of instructions. However, 15-20% of instructions in an assembly-level stream are (conditional) branches. Of these, 60-70% take the branch to a target address. For conditional branches, it is impossible to determine whether the branch will be taken or not until it is actually executed. The impact of branches is that the pipeline never really operates at its full capacity, limiting the performance improvement that is derived from the pipeline.
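To get a feel for these percentages, the following sketch estimates the average number of cycles per instruction. The model is an assumption for illustration only: an otherwise ideal pipeline completes one instruction per cycle, and every taken branch wastes a fixed number of cycles (here 3, following the k - 1 rule the later slides use for a 4-segment pipeline). The function name is hypothetical.

```python
def average_cycles_per_instruction(branch_fraction, taken_fraction, penalty_cycles):
    """Ideal pipeline (1 cycle per instruction) plus the cost of taken branches.

    branch_fraction : fraction of instructions that are branches
    taken_fraction  : fraction of those branches that are taken
    penalty_cycles  : cycles wasted per taken branch (assumed constant)
    """
    return 1.0 + branch_fraction * taken_fraction * penalty_cycles


penalty = 3  # assumption: k - 1 wasted cycles for a 4-segment pipeline
for branch_fraction in (0.15, 0.20):
    for taken_fraction in (0.60, 0.70):
        cpi = average_cycles_per_instruction(branch_fraction, taken_fraction, penalty)
        print(f"branches {branch_fraction:.0%}, taken {taken_fraction:.0%}: "
              f"{cpi:.2f} cycles per instruction")
```

Even under this simple model, the pipeline spends a noticeable share of its cycles on flushed work, which is why the branching solutions on the following slides matter.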
44 Example 4: Branching. [Figure: space-time diagram over Fetch Instruction, Decode Instruction, Execute Instruction and Store Result, marking the instructions that need to be flushed out and the resulting idle slots.]
45 Solution for Branching: delayed branch (using NOP); rearranging the instructions; implementation of an instruction queue.
46 Solution for Branching: Delay Branch Using NOP. When the compiler detects a branch, it automatically inserts several NOPs so that there are no interruptions in the pipeline. Example: the sequence I4 MOV, I5 INC, I6 ADD, I7 RET, I8 KK becomes MOV, INC, ADD, RET, NOP, NOP, NOP, KK. The number of NOPs used is 1 less than the number of segments: if k = number of segments, then NOPs to be used = k - 1 = 4 - 1 = 3.
47 Solution for Branching: Delay Branch Using NOP. Space-time diagram ("-" = empty slot):
Clock:                1    2    3    4    5    6    7    8    9    10   11
Fetch Inst. (FI):     I4   I5   I6   I7   NOP  NOP  NOP  KK   -    -    -
Decode Inst. (DI):    -    I4   I5   I6   I7   NOP  NOP  NOP  KK   -    -
Execute Instr. (EI):  -    -    I4   I5   I6   I7   NOP  NOP  NOP  KK   -
Store Result:         -    -    -    I4   I5   I6   I7   NOP  NOP  NOP  KK
4 segments, 3 NOPs: 3 pipeline clock cycles are wasted per segment.
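The padding rule (k - 1 NOPs after each branch) is mechanical enough to sketch as a small pass over an instruction list. The set of branch mnemonics and the function name are assumptions made for illustration; a real compiler would identify branches from the instruction set definition.

```python
# Assumption: which mnemonics count as branches for this illustration.
BRANCH_MNEMONICS = {"JMP", "RET"}


def insert_branch_nops(program, num_segments):
    """Return a copy of program with (num_segments - 1) NOPs after each branch."""
    padded = []
    for instruction in program:
        padded.append(instruction)
        mnemonic = instruction.split()[0].upper()
        if mnemonic in BRANCH_MNEMONICS:
            padded.extend(["NOP"] * (num_segments - 1))
    return padded


# The sequence from the slide, for a 4-segment pipeline.
print(insert_branch_nops(["MOV", "INC", "ADD", "RET", "KK"], num_segments=4))
```

For the slide's sequence this places three NOPs between RET and KK, matching the padded sequence in the diagram.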
48 Solution for Branching: Rearranging Instructions. Tasks are rearranged so that the pipeline can operate efficiently. Original order: Mov, Add, Inc, Ret. Rearranged order: Ret, Mov, Add, Inc. How to arrange? By bringing the branch instruction up a few notches. The number of notches = number of segments - 1 = k - 1 = 4 - 1 = 3. Space-time diagram ("-" = empty slot):
Clock:                1    2    3    4    5    6    7    8    9
Fetch Instruction:    Ret  Mov  Add  Inc  I1   I2   I3   I4   I5
Decode Instruction:   -    Ret  Mov  Add  Inc  I1   I2   I3   I4
Execute Instruction:  -    -    Ret  Mov  Add  Inc  I1   I2   I3
Store Result:         -    -    -    Ret  Mov  Add  Inc  I1   I2
49 Solution for Branching: Rearranging Instructions. Original order: I4 MOV, I5 INC, I6 ADD, I7 RET, I8 KK. Rearranged order: I7 RET, I4 MOV, I5 INC, I6 ADD, KK. How to arrange? By bringing the branch instruction up a few notches. The number of notches = number of segments - 1 = k - 1 = 4 - 1 = 3. Space-time diagram ("-" = empty slot):
Clock:                1    2    3    4    5    6    7    8
Fetch Inst. (FI):     I7   I4   I5   I6   KK   -    -    -
Decode Inst. (DI):    -    I7   I4   I5   I6   KK   -    -
Execute Instr. (EI):  -    -    I7   I4   I5   I6   KK   -
Store Result:         -    -    -    I7   I4   I5   I6   KK
50 Example: Rearranging Instructions. Pipeline with branching hazard:
I1  MOV AX,BX
I2  ADD BX,CX
I3  INC CX
I4  JMP LOSSY
I5  MOV DX,1
I6  INC BX
:
I11 LOSSY: DEC AX
Space-time diagram (I5 and I6 are flushed out; "-" = idle slot):
Clock:                1    2    3    4    5    6    7    8    9    10
Fetch Inst. (FI):     I1   I2   I3   I4   I5   I6   I11  -    -    -
Decode Inst. (DI):    -    I1   I2   I3   I4   I5   -    I11  -    -
Execute Instr. (EI):  -    -    I1   I2   I3   I4   -    -    I11  -
Store Result:         -    -    -    I1   I2   I3   I4   -    -    I11
51 Example: Rearranging Instructions. Bring the JMP instruction up by 3 notches.
Before: I1 MOV AX,BX; I2 ADD BX,CX; I3 INC CX; I4 JMP LOSSY; I5 MOV DX,1; I6 INC BX; ... ; I11 LOSSY: DEC AX
After:  I4 JMP LOSSY; I1 MOV AX,BX; I2 ADD BX,CX; I3 INC CX; I5 MOV DX,1; I6 INC BX; ... ; I11 LOSSY: DEC AX
Pipeline after rearranging the instructions ("-" = empty slot):
Clock:                1    2    3    4    5    6    7    8
Fetch Inst. (FI):     I4   I1   I2   I3   I11  -    -    -
Decode Inst. (DI):    -    I4   I1   I2   I3   I11  -    -
Execute Instr. (EI):  -    -    I4   I1   I2   I3   I11  -
Store Result:         -    -    -    I4   I1   I2   I3   I11
52 Solution for Branching: Instruction Queue. [Figure: an instruction queue holding S1, S2, JMP, S4 feeds the pipeline Fetch Instruction -> Fetch Operand -> Execute Instruction -> Store Result.] Prefetch target instruction: prefetch both possible next instructions in the case of a conditional branch. When the FI segment detects a JMP, the address of the next instruction for the JMP is determined. S4 is deleted and the new instruction is fetched.
53 Example 5: Instruction Pipeline. Given a pipeline that consists of 5 segments: Fetch Instruction, Decode Instruction, Fetch Operand, Execute Instruction and Store Result. Say n = 3:
I_n:    MOV AX, NUM ; AX <-- NUM
I_n+1:  ADD BX, AX  ; BX <-- BX + AX
I_n+2:  JMP LOOP1   ; I_n+10
I_n+3:  :
:
I_n+10: LOOP1:
54 Draw the space-time diagram for the execution of the instructions. Identify the branching and data dependency hazards.
I3  MOV AX,NUM ; AX <-- NUM
I4  ADD BX, AX ; BX <-- BX + AX
I5  JMP LOOP1  ; I13 (here we have a branching problem)
I6  :
:
I13 LOOP1:
When I5 (JMP) is executed, all fetched instructions (I6, I7, I8) are flushed. The pipeline is idle until the JMP is completed and I13 is fetched and completed ("-" = idle slot):
Clock:                1    2    3    4    5    6    7    8    9    10   11
Fetch Inst. (FI):     I3   I4   I5   I6   I7   I8   I13  -    -    -    -
Decode Inst. (DI):    -    I3   I4   I5   I6   I7   -    I13  -    -    -
Fetch Opr. (FO):      -    -    I3   I4   I5   I6   -    -    I13  -    -
Execute Instr. (EI):  -    -    -    I3   I4   I5   -    -    -    I13  -
Store Result:         -    -    -    -    I3   I4   I5   -    -    -    I13
55 Draw the space-time diagram for the execution of the instructions. Identify the branching and data dependency hazards.
I3  MOV AX,NUM ; AX <-- NUM
I4  ADD BX, AX ; BX <-- BX + AX (here we have a data dependency problem)
I5  JMP LOOP1  ; I13
I6  :
:
I13 LOOP1:
I4 needs AX, but AX is still being processed by I3 (the diagram marks where this data dependency occurs):
Clock:                1    2    3    4    5    6    7    8    9    10   11
Fetch Inst. (FI):     I3   I4   I5   I6   I7   I8   I13  -    -    -    -
Decode Inst. (DI):    -    I3   I4   I5   I6   I7   -    I13  -    -    -
Fetch Opr. (FO):      -    -    I3   I4   I5   I6   -    -    I13  -    -
Execute Instr. (EI):  -    -    -    I3   I4   I5   -    -    -    I13  -
Store Result:         -    -    -    -    I3   I4   I5   -    -    -    I13
56 Solve the data dependency problem using NOP instructions: insert a NOP between I3 and I4.
I3  MOV AX,NUM ; AX <-- NUM
NOP
I4  ADD BX, AX ; BX <-- BX + AX
I5  JMP LOOP1  ; I13
I6  :
:
I13 LOOP1:
Space-time diagram ("-" = empty slot):
Clock:                1    2    3    4    5    6    7    8    9
Fetch Inst. (FI):     I3   NOP  I4   I5   I13  -    -    -    -
Decode Inst. (DI):    -    I3   NOP  I4   I5   I13  -    -    -
Fetch Opr. (FO):      -    -    I3   NOP  I4   I5   I13  -    -
Execute Instr. (EI):  -    -    -    I3   NOP  I4   I5   I13  -
Store Result:         -    -    -    -    I3   NOP  I4   I5   I13
57 Solve the branching problem by rearranging the instructions. The number of notches = number of segments - 1 = 5 - 1 = 4.
I3  MOV AX,NUM ; AX <-- NUM
NOP
I4  ADD BX, AX ; BX <-- BX + AX
I5  JMP LOOP1  ; I13
I6  :
:
I13 LOOP1:
Order before: I3, NOP, I4, I5, ... Order after: I5, I2, I3, NOP, I4, ...
Space-time diagram ("-" = empty slot):
Clock:                1    2    3    4    5    6    7    8    9    10
Fetch Inst. (FI):     I5   I2   I3   NOP  I4   I13  -    -    -    -
Decode Inst. (DI):    -    I5   I2   I3   NOP  I4   I13  -    -    -
Fetch Opr. (FO):      -    -    I5   I2   I3   NOP  I4   I13  -    -
Execute Instr. (EI):  -    -    -    I5   I2   I3   NOP  I4   I13  -
Store Result:         -    -    -    -    I5   I2   I3   NOP  I4   I13
58 Solve the branching problem using NOP instructions. The number of NOPs = number of segments - 1 = 5 - 1 = 4.
I3  MOV AX,NUM ; AX <-- NUM
NOP
I4  ADD BX, AX ; BX <-- BX + AX
I5  JMP LOOP1  ; I13
I6  :
:
I13 LOOP1:
Space-time diagram ("-" = empty slot):
Clock:                1    2    3    4    5    6    7    8    9    10   11   12   13
Fetch Inst. (FI):     I3   NOP  I4   I5   NOP  NOP  NOP  NOP  I13  -    -    -    -
Decode Inst. (DI):    -    I3   NOP  I4   I5   NOP  NOP  NOP  NOP  I13  -    -    -
Fetch Opr. (FO):      -    -    I3   NOP  I4   I5   NOP  NOP  NOP  NOP  I13  -    -
Execute Instr. (EI):  -    -    -    I3   NOP  I4   I5   NOP  NOP  NOP  NOP  I13  -
Store Result:         -    -    -    -    I3   NOP  I4   I5   NOP  NOP  NOP  NOP  I13
59 Try this out. Given a pipeline that consists of 4 segments: Fetch Instruction, Decode Instruction, Execute Instruction and Store Result. Use this information to answer the following questions.
I1  MOV AX,BX
I2  SUB BX,2
I3  ADD BX, NUMB
I4  JMP LOSSY
I5  INC BX
I6  DEC NUMB
I7  LOSSY: DEC AX
- Draw the space-time diagram for the execution.
- Label clearly the data dependency problem, the flushed instructions and the empty/idle slots.
- Solve the data dependency using NOP instructions.
- Solve the branching problem by rearranging the instructions.
- Draw a new space-time diagram.
60 Draw the space-time diagram for the execution. Label the data dependency problem, the flushed instructions and the empty/idle slots.
I1  MOV AX,BX
I2  SUB BX,2
I3  ADD BX, NUMB
I4  JMP LOSSY
I5  INC BX
I6  DEC NUMB
I7  LOSSY: DEC AX
Space-time diagram (I5 and I6 are the flushed instructions; "-" = empty/idle slot; the original figure also marks the data dependency problem):
Clock:                1    2    3    4    5    6    7    8    9    10
Fetch Inst. (FI):     I1   I2   I3   I4   I5   I6   I7   -    -    -
Decode Inst. (DI):    -    I1   I2   I3   I4   I5   -    I7   -    -
Execute Instr. (EI):  -    -    I1   I2   I3   I4   -    -    I7   -
Store Result:         -    -    -    I1   I2   I3   I4   -    -    I7
61 Solutions for the instruction sequence I1 MOV AX,BX; I2 SUB BX,2; I3 ADD BX, NUMB; I4 JMP LOSSY; I5 INC BX; I6 DEC NUMB; I7 LOSSY: DEC AX.
A. Solve the data dependency using NOP instructions: I1, I2, NOP, I3, I4
B. Solve the branching problem by rearranging the instructions: I1, I4, I2, NOP, I3
C. Draw a new space-time diagram ("-" = empty slot):
Clock:                1    2    3    4    5    6    7    8    9
Fetch Inst. (FI):     I1   I4   I2   NOP  I3   I7   -    -    -
Decode Inst. (DI):    -    I1   I4   I2   NOP  I3   I7   -    -
Execute Instr. (EI):  -    -    I1   I4   I2   NOP  I3   I7   -
Store Result:         -    -    -    I1   I4   I2   NOP  I3   I7
62 Arithmetic Pipeline: A complex arithmetic operation is subdivided into several task segments. Every parameter involved in the arithmetic operation is processed in the segments, under the control of the CPU clock. Examples: $S_i = A_i * B_i + C_i$; $Z_i = A_i * B_i + C_i * D_i$; $K_i = A_i * B_i + C_i / D_i$.
63 Example 6: Designing Arithmetic Pipeline. Design an arithmetic pipeline for the operation $S = A_i * B_i + C_i$ where i = 1, 2, 3, ..., n. Specifications: 3-segment pipeline; each segment is allowed to process only two tasks. Components: Register (R), Adder (A), Multiplier (M).
64 Example 6: $S = A_i * B_i + C_i$ where i = 1, 2, 3, ..., n. The sequence of operations (following the standard mathematical order of operations): access the required data for the registers; multiplication; addition.
65 Example 6: $S = A_i * B_i + C_i$. Segment definition:
Segment 1: Reg 1 <-- A_i, Reg 2 <-- B_i
Segment 2: Multiply: Reg 3 <-- A_i * B_i, Reg 4 <-- C_i
Segment 3: Add: Reg 5 <-- Reg 3 + C_i, S <-- Reg 5
Notice that every segment executes only 2 subtasks.
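The segment definition can be exercised with a small clock-by-clock simulation. This is a sketch, not the slides' design: the register names r1 to r5 mirror Reg 1 to Reg 5, the function name is hypothetical, and the segments are evaluated last-to-first within each clock so that every register is read before it is overwritten, imitating simultaneous clocked updates.

```python
def arithmetic_pipeline(a, b, c):
    """Simulate S_i = A_i * B_i + C_i on a 3-segment pipeline.

    Returns the results in order; the loop runs n + 2 clocks
    (k + n - 1 with k = 3 segments).
    """
    n = len(a)
    r1 = r2 = r3 = r4 = r5 = None  # interface registers start empty
    results = []
    for clock in range(n + 2):
        # Segment 3: add the product and C_i held by the previous stage.
        if r3 is not None:
            r5 = r3 + r4
            results.append(r5)  # S <-- Reg 5
        # Segment 2: multiply and load C_i for the operands loaded last clock.
        if r1 is not None:
            r3, r4 = r1 * r2, c[clock - 1]
        else:
            r3 = r4 = None
        # Segment 1: load the next pair of operands, if any remain.
        if clock < n:
            r1, r2 = a[clock], b[clock]
        else:
            r1 = r2 = None
    return results


print(arithmetic_pipeline([1, 2, 3], [4, 5, 6], [7, 8, 9]))
```

The first result appears only after the pipeline has filled (three clocks); after that one result is produced per clock.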
More informationCPU Structure and Function
CPU Structure and Function Chapter 12 Lesson 17 Slide 1/36 Processor Organization CPU must: Fetch instructions Interpret instructions Fetch data Process data Write data Lesson 17 Slide 2/36 CPU With Systems
More informationEITF20: Computer Architecture Part2.2.1: Pipeline-1
EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle
More informationEE 457 Unit 6a. Basic Pipelining Techniques
EE 47 Unit 6a Basic Pipelining Techniques 2 Pipelining Introduction Consider a drink bottling plant Filling the bottle = 3 sec. Placing the cap = 3 sec. Labeling = 3 sec. Would you want Machine = Does
More informationPipelining: Overview. CPSC 252 Computer Organization Ellen Walker, Hiram College
Pipelining: Overview CPSC 252 Computer Organization Ellen Walker, Hiram College Pipelining the Wash Divide into 4 steps: Wash, Dry, Fold, Put Away Perform the steps in parallel Wash 1 Wash 2, Dry 1 Wash
More informationChapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified
More informationPIPELINE AND VECTOR PROCESSING
PIPELINE AND VECTOR PROCESSING PIPELINING: Pipelining is a technique of decomposing a sequential process into sub operations, with each sub process being executed in a special dedicated segment that operates
More informationIntroduction to Pipelining. Silvina Hanono Wachman Computer Science & Artificial Intelligence Lab M.I.T.
Introduction to Pipelining Silvina Hanono Wachman Computer Science & Artificial Intelligence Lab M.I.T. L15-1 Performance Measures Two metrics of interest when designing a system: 1. Latency: The delay
More informationPipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome
Thoai Nam Pipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome Reference: Computer Architecture: A Quantitative Approach, John L Hennessy & David a Patterson,
More informationComputer organization by G. Naveen kumar, Asst Prof, C.S.E Department 1
Pipelining and Vector Processing Parallel Processing: The term parallel processing indicates that the system is able to perform several operations in a single time. Now we will elaborate the scenario,
More informationPipelining, Instruction Level Parallelism and Memory in Processors. Advanced Topics ICOM 4215 Computer Architecture and Organization Fall 2010
Pipelining, Instruction Level Parallelism and Memory in Processors Advanced Topics ICOM 4215 Computer Architecture and Organization Fall 2010 NOTE: The material for this lecture was taken from several
More informationUnpipelined Machine. Pipelining the Idea. Pipelining Overview. Pipelined Machine. MIPS Unpipelined. Similar to assembly line in a factory
Pipelining the Idea Similar to assembly line in a factory Divide instruction into smaller tasks Each task is performed on subset of resources Overlap the execution of multiple instructions by completing
More informationSISTEMI EMBEDDED. Computer Organization Pipelining. Federico Baronti Last version:
SISTEMI EMBEDDED Computer Organization Pipelining Federico Baronti Last version: 20160518 Basic Concept of Pipelining Circuit technology and hardware arrangement influence the speed of execution for programs
More informationUNIT- 5. Chapter 12 Processor Structure and Function
UNIT- 5 Chapter 12 Processor Structure and Function CPU Structure CPU must: Fetch instructions Interpret instructions Fetch data Process data Write data CPU With Systems Bus CPU Internal Structure Registers
More informationLecture 5: Instruction Pipelining. Pipeline hazards. Sequential execution of an N-stage task: N Task 2
Lecture 5: Instruction Pipelining Basic concepts Pipeline hazards Branch handling and prediction Zebo Peng, IDA, LiTH Sequential execution of an N-stage task: 3 N Task 3 N Task Production time: N time
More informationCS6303 Computer Architecture Regulation 2013 BE-Computer Science and Engineering III semester 2 MARKS
CS6303 Computer Architecture Regulation 2013 BE-Computer Science and Engineering III semester 2 MARKS UNIT-I OVERVIEW & INSTRUCTIONS 1. What are the eight great ideas in computer architecture? The eight
More informationCPE Computer Architecture. Appendix A: Pipelining: Basic and Intermediate Concepts
CPE 110408443 Computer Architecture Appendix A: Pipelining: Basic and Intermediate Concepts Sa ed R. Abed [Computer Engineering Department, Hashemite University] Outline Basic concept of Pipelining The
More informationPipelining Analogy. Pipelined laundry: overlapping execution. Parallelism improves performance. Four loads: Non-stop: Speedup = 8/3.5 = 2.3.
Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup =2n/05n+15 2n/0.5n 1.5 4 = number of stages 4.5 An Overview
More information1 Hazards COMP2611 Fall 2015 Pipelined Processor
1 Hazards Dependences in Programs 2 Data dependence Example: lw $1, 200($2) add $3, $4, $1 add can t do ID (i.e., read register $1) until lw updates $1 Control dependence Example: bne $1, $2, target add
More informationPipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome
Pipeline Thoai Nam Outline Pipelining concepts The DLX architecture A simple DLX pipeline Pipeline Hazards and Solution to overcome Reference: Computer Architecture: A Quantitative Approach, John L Hennessy
More informationHPC VT Machine-dependent Optimization
HPC VT 2013 Machine-dependent Optimization Last time Choose good data structures Reduce number of operations Use cheap operations strength reduction Avoid too many small function calls inlining Use compiler
More informationDepartment of Computer and IT Engineering University of Kurdistan. Computer Architecture Pipelining. By: Dr. Alireza Abdollahpouri
Department of Computer and IT Engineering University of Kurdistan Computer Architecture Pipelining By: Dr. Alireza Abdollahpouri Pipelined MIPS processor Any instruction set can be implemented in many
More informationWorking on the Pipeline
Computer Science 6C Spring 27 Working on the Pipeline Datapath Control Signals Computer Science 6C Spring 27 MemWr: write memory MemtoReg: ALU; Mem RegDst: rt ; rd RegWr: write register 4 PC Ext Imm6 Adder
More informationFull Datapath. Chapter 4 The Processor 2
Pipelining Full Datapath Chapter 4 The Processor 2 Datapath With Control Chapter 4 The Processor 3 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory
More informationChapter 4 (Part II) Sequential Laundry
Chapter 4 (Part II) The Processor Baback Izadi Division of Engineering Programs bai@engr.newpaltz.edu Sequential Laundry 6 P 7 8 9 10 11 12 1 2 A T a s k O r d e r A B C D 30 30 30 30 30 30 30 30 30 30
More informationControl Hazards - branching causes problems since the pipeline can be filled with the wrong instructions.
Control Hazards - branching causes problems since the pipeline can be filled with the wrong instructions Stage Instruction Fetch Instruction Decode Execution / Effective addr Memory access Write-back Abbreviation
More informationCOMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle
More informationChapter 9. Pipelining Design Techniques
Chapter 9 Pipelining Design Techniques 9.1 General Concepts Pipelining refers to the technique in which a given task is divided into a number of subtasks that need to be performed in sequence. Each subtask
More informationECE-7 th sem. CAO-Unit 6. Pipeline and Vector Processing Dr.E V Prasad
ECE-7 th sem. CO-Unit 6 Pipeline and Vector Processing Dr.E V Prasad 12.10.17 Contents Parallel Processing Pipelining rithmetic Pipeline Instruction Pipeline RISC Pipeline Vector Processing rray Processors
More informationChapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified
More informationEast Tennessee State University Department of Computer and Information Sciences CSCI 4717 Computer Architecture TEST 3 for Fall Semester, 2005
Points missed: Student's Name: Total score: /100 points East Tennessee State University Department of Computer and Information Sciences CSCI 4717 Computer Architecture TEST 3 for Fall Semester, 2005 Section
More informationLECTURE 3: THE PROCESSOR
LECTURE 3: THE PROCESSOR Abridged version of Patterson & Hennessy (2013):Ch.4 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU
More informationSuggested Readings! Recap: Pipelining improves throughput! Processor comparison! Lecture 17" Short Pipelining Review! ! Readings!
1! 2! Suggested Readings!! Readings!! H&P: Chapter 4.5-4.7!! (Over the next 3-4 lectures)! Lecture 17" Short Pipelining Review! 3! Processor components! Multicore processors and programming! Recap: Pipelining
More informationLecture 5: Pipelining Basics
Lecture 5: Pipelining Basics Biggest contributors to performance: clock speed, parallelism Today: basic pipelining implementation (Sections A.1-A.3) 1 The Assembly Line Unpipelined Start and finish a job
More information6x86 PROCESSOR Superscalar, Superpipelined, Sixth-generation, x86 Compatible CPU
1-6x86 PROCESSOR Superscalar, Superpipelined, Sixth-generation, x86 Compatible CPU Product Overview Introduction 1. ARCHITECTURE OVERVIEW The Cyrix 6x86 CPU is a leader in the sixth generation of high
More informationLecture 6: Pipelining
Lecture 6: Pipelining i CSCE 26 Computer Organization Instructor: Saraju P. ohanty, Ph. D. NOTE: The figures, text etc included in slides are borrowed from various books, websites, authors pages, and other
More informationDC57 COMPUTER ORGANIZATION JUNE 2013
Q2 (a) How do various factors like Hardware design, Instruction set, Compiler related to the performance of a computer? The most important measure of a computer is how quickly it can execute programs.
More informationPipelined Processor Design
Pipelined Processor Design Pipelined Implementation: MIPS Virendra Singh Computer Design and Test Lab. Indian Institute of Science (IISc) Bangalore virendra@computer.org Advance Computer Architecture http://www.serc.iisc.ernet.in/~viren/courses/aca/aca.htm
More informationChapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts
CS359: Computer Architecture Chapter 3 & Appendix C Pipelining Part A: Basic and Intermediate Concepts Yanyan Shen Department of Computer Science and Engineering Shanghai Jiao Tong University Parallel
More informationComputer Systems Architecture Spring 2016
Computer Systems Architecture Spring 2016 Lecture 01: Introduction Shuai Wang Department of Computer Science and Technology Nanjing University [Adapted from Computer Architecture: A Quantitative Approach,
More informationPractice Problems (Con t) The ALU performs operation x and puts the result in the RR The ALU operand Register B is loaded with the contents of Rx
Microprogram Control Practice Problems (Con t) The following microinstructions are supported by each CW in the CS: RR ALU opx RA Rx RB Rx RB IR(adr) Rx RR Rx MDR MDR RR MDR Rx MAR IR(adr) MAR Rx PC IR(adr)
More informationCS 3510 Comp&Net Arch
CS 3510 Comp&Net Arch Pipeline Dr. Ken Hoganson 2010 Enhancing Performance We observed that we can obtain better performance in executing instructions, if a single cycle accomplishes multiple operations:
More informationCOMPUTER ORGANIZATION AND DESIGN
COMPUTER ORGANIZATION AND DESIGN 5 Edition th The Hardware/Software Interface Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count CPI and Cycle time Determined
More informationChapter 9 Pipelining. Jin-Fu Li Department of Electrical Engineering National Central University Jungli, Taiwan
Chapter 9 Pipelining Jin-Fu Li Department of Electrical Engineering National Central University Jungli, Taiwan Outline Basic Concepts Data Hazards Instruction Hazards Advanced Reliable Systems (ARES) Lab.
More informationCSE 141 Computer Architecture Spring Lectures 11 Exceptions and Introduction to Pipelining. Announcements
CSE 4 Computer Architecture Spring 25 Lectures Exceptions and Introduction to Pipelining May 4, 25 Announcements Reading Assignment Sections 5.6, 5.9 The Processor Datapath and Control Section 6., Enhancing
More informationOverview. Appendix A. Pipelining: Its Natural! Sequential Laundry 6 PM Midnight. Pipelined Laundry: Start work ASAP
Overview Appendix A Pipelining: Basic and Intermediate Concepts Basics of Pipelining Pipeline Hazards Pipeline Implementation Pipelining + Exceptions Pipeline to handle Multicycle Operations 1 2 Unpipelined
More informationChapter 12. CPU Structure and Function. Yonsei University
Chapter 12 CPU Structure and Function Contents Processor organization Register organization Instruction cycle Instruction pipelining The Pentium processor The PowerPC processor 12-2 CPU Structures Processor
More informationINSTITUTO SUPERIOR TÉCNICO. Architectures for Embedded Computing
UNIVERSIDADE TÉCNICA DE LISBOA INSTITUTO SUPERIOR TÉCNICO Departamento de Engenharia Informática Architectures for Embedded Computing MEIC-A, MEIC-T, MERC Lecture Slides Version 3.0 - English Lecture 05
More informationChapter 4. Instruction Execution. Introduction. CPU Overview. Multiplexers. Chapter 4 The Processor 1. The Processor.
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor The Processor - Introduction
More informationCOMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition The Processor - Introduction
More informationC.1 Introduction. What Is Pipelining? C-2 Appendix C Pipelining: Basic and Intermediate Concepts
C-2 Appendix C Pipelining: Basic and Intermediate Concepts C.1 Introduction Many readers of this text will have covered the basics of pipelining in another text (such as our more basic text Computer Organization
More informationUNIT V: CENTRAL PROCESSING UNIT
UNIT V: CENTRAL PROCESSING UNIT Agenda Basic Instruc1on Cycle & Sets Addressing Instruc1on Format Processor Organiza1on Register Organiza1on Pipeline Processors Instruc1on Pipelining Co-Processors RISC
More informationProcessor Architecture
ECPE 170 Jeff Shafer University of the Pacific Processor Architecture 2 Lab Schedule Ac=vi=es Assignments Due Today Wednesday Apr 24 th Processor Architecture Lab 12 due by 11:59pm Wednesday Network Programming
More informationEECS150 - Digital Design Lecture 09 - Parallelism
EECS150 - Digital Design Lecture 09 - Parallelism Feb 19, 2013 John Wawrzynek Spring 2013 EECS150 - Lec09-parallel Page 1 Parallelism Parallelism is the act of doing more than one thing at a time. Optimization
More informationOrange Coast College. Business Division. Computer Science Department. CS 116- Computer Architecture. Pipelining
Orange Coast College Business Division Computer Science Department CS 116- Computer Architecture Pipelining Recall Pipelining is parallelizing execution Key to speedups in processors Split instruction
More informationComputer Architecture
Lecture 3: Pipelining Iakovos Mavroidis Computer Science Department University of Crete 1 Previous Lecture Measurements and metrics : Performance, Cost, Dependability, Power Guidelines and principles in
More informationPipeline and Vector Processing 1. Parallel Processing SISD SIMD MISD & MIMD
Pipeline and Vector Processing 1. Parallel Processing Parallel processing is a term used to denote a large class of techniques that are used to provide simultaneous data-processing tasks for the purpose
More informationCOMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle
More informationDSP VLSI Design. Pipelining. Byungin Moon. Yonsei University
Byungin Moon Yonsei University Outline What is pipelining? Performance advantage of pipelining Pipeline depth Interlocking Due to resource contention Due to data dependency Branching Effects Interrupt
More informationCSCI 402: Computer Architectures. Fengguang Song Department of Computer & Information Science IUPUI. Today s Content
3/6/8 CSCI 42: Computer Architectures The Processor (2) Fengguang Song Department of Computer & Information Science IUPUI Today s Content We have looked at how to design a Data Path. 4.4, 4.5 We will design
More informationRISC & Superscalar. COMP 212 Computer Organization & Architecture. COMP 212 Fall Lecture 12. Instruction Pipeline no hazard.
COMP 212 Computer Organization & Architecture Pipeline Re-Cap Pipeline is ILP -Instruction Level Parallelism COMP 212 Fall 2008 Lecture 12 RISC & Superscalar Divide instruction cycles into stages, overlapped
More information