Pipeline and Vector Processing
- Marianna Strickland
1 Pipeline and Vector Processing (Chapter 9 of the Mano book)
2 Parallel Processing. Throughput: the amount of processing that can be accomplished during a given interval of time.
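4 Flynn's classification of computers: SISD: Single Instruction stream, Single Data stream. SIMD: Single Instruction stream, Multiple Data stream. MISD: Multiple Instruction stream, Single Data stream. MIMD: Multiple Instruction stream, Multiple Data stream.
5 Flynn's classification is based on separating the operation of the control unit from that of the data-processing unit. It does not cover every existing kind of computer; pipeline processing, for example, does not fit it. In this chapter: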
6 Laundry Example (by David Patterson). Four loads of clothes: A, B, C, D. Task: each load has to be washed, dried, and folded. Resources: the washer takes 30 minutes, the dryer takes 40 minutes, the folder takes 20 minutes.
7 Sequential Laundry. [Figure: timeline from 6 PM to midnight, task order A, B, C, D] Sequential laundry takes 6 hours for 4 loads. If they learned pipelining, how long would laundry take?
8 Pipelined Laundry. [Figure: timeline from 6 PM to midnight, task order A, B, C, D] Pipelined laundry takes 3.5 hours for 4 loads.
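The 6-hour and 3.5-hour totals can be checked with a small calculation. The following is a minimal sketch, assuming each stage handles one load at a time and a load may wait between stages until the next stage is free; the function names are hypothetical.

```python
# Stage times in minutes from the laundry example: wash, dry, fold.
STAGES = [30, 40, 20]


def sequential_minutes(loads, stages=STAGES):
    """Each load passes through every stage before the next load starts."""
    return loads * sum(stages)


def pipelined_minutes(loads, stages=STAGES):
    """Loads overlap: a stage starts a load as soon as both are available."""
    # finish[s] is the time at which stage s finished its most recent load.
    finish = [0] * len(stages)
    for _ in range(loads):
        ready = 0  # time at which this load leaves the previous stage
        for s, duration in enumerate(stages):
            start = max(ready, finish[s])  # wait for the load and for the stage
            finish[s] = start + duration
            ready = finish[s]
    return finish[-1]


if __name__ == "__main__":
    print(sequential_minutes(4) / 60)  # hours for 4 loads, one after another
    print(pipelined_minutes(4) / 60)   # hours for 4 loads, overlapped
```

In this model the 40-minute dryer is the slowest stage, so it sets the rate at which finished loads come out of the pipeline.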
9 Example:
10 How instructions are executed in a pipeline:
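A space-time diagram is a common way to visualize pipelined execution. The sketch below is an illustration only, assuming k segments that each take one clock cycle and n instructions entering one per cycle; `space_time` is a hypothetical helper, not something defined in the slides.

```python
def space_time(k, n):
    """Return one text row per segment; each column is one clock cycle."""
    cycles = k + n - 1  # cycles needed for n instructions to drain k segments
    rows = []
    for seg in range(k):
        cells = []
        for cycle in range(cycles):
            task = cycle - seg  # index of the instruction in this segment now
            label = f"I{task + 1}" if 0 <= task < n else "--"
            cells.append(label.rjust(3))
        rows.append(f"S{seg + 1}:" + "".join(" " + c for c in cells))
    return rows


if __name__ == "__main__":
    for row in space_time(k=4, n=6):
        print(row)
```

Each instruction moves one segment to the right per cycle, so the last instruction leaves the last segment after k + n - 1 cycles.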
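11 Speedup: $S = \dfrac{n\,t_n}{(k + n - 1)\,t_p}$. If $n \gg k$, then $k + n - 1 \approx n$, so $S = t_n / t_p$; if in addition $t_n = k\,t_p$, then $S = k$. Here $n$ is the number of instructions, $t_n$ is the time for an instruction to execute in the non-pipelined processor, $k$ is the number of segments, and $t_p$ is the clock cycle time of a pipeline stage.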
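The speedup can be evaluated numerically. This is a minimal sketch, assuming the usual model in which n instructions need (k + n - 1) cycles in a k-segment pipeline and, unless stated otherwise, that the non-pipelined time per instruction is t_n = k * t_p. The sample values (n = 100, k = 4, t_p = 20 ns) are illustrative, not taken from the slides.

```python
def pipeline_speedup(n, k, t_p, t_n=None):
    """Speedup of a k-segment pipeline over a non-pipelined unit for n tasks."""
    if t_n is None:
        t_n = k * t_p  # assumption: one task takes k stage times without pipelining
    nonpipelined_time = n * t_n
    pipelined_time = (k + n - 1) * t_p
    return nonpipelined_time / pipelined_time


if __name__ == "__main__":
    print(pipeline_speedup(n=100, k=4, t_p=20))      # modest n: below k
    print(pipeline_speedup(n=10**6, k=4, t_p=20))    # n >> k: approaches k
```

As n grows, the fill time of k - 1 cycles becomes negligible and the result approaches k.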
12 In systems with multiple functional units, we can replicate the arithmetic units used by the instruction being executed in order to take advantage of pipelining.
13 What can be done in SIMD systems: in the figure above, one arithmetic instruction is executed simultaneously on four different data items. It is as if the system were a pipelined system with four pipelines.
14 Areas of pipelining: arithmetic pipeline, instruction pipeline. Arithmetic pipeline: floating-point computations (chap. 10), fixed-point multiplication (chap. 10), similar computations.
15 Example: floating-point add/subtract
18 Instruction pipeline. Simple example: two-segment pipeline. [Figure: memory, FIFO buffer, fetch instruction, execute instruction] Benefit: reduction in the access time to memory. Instruction pipeline:
19 Instruction pipeline problems: 1. Different segments may take different times. 2. Some segments are skipped for certain operations (for example, register-mode instructions). 3. Two or more segments may require memory access at the same time (resolved with separate modules for data and instructions). 4. Direct or conditional jump operations require skipping some instructions.
20 Example: four-segment instruction pipeline. Another memory module
22 Timing of the instruction pipeline with a branch instruction
23 Pipeline conflicts:
24 2. Data dependency. Data dependency: add r1,r2,r3; sub r4,r1,r3; and r6,r1,r7. Address dependency: mov r1,[r2]; sub r4,[r1],r3; and r6,[r1],r7.
25 Resolving data dependencies. Hardware interlock: a circuit that detects instructions whose source operands are destinations of instructions farther up in the pipeline, and inserts the required delays. Operand forwarding: special hardware that detects a conflict and avoids it by routing the data through special paths between pipeline segments (for example, a path from the ALU output to the destination). Delayed load: a compiler method in which the compiler reorders the instructions as necessary to delay the load.
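The interlock and delayed-load ideas both come down to keeping a consumer far enough behind its producer. The sketch below is a simplified illustration, not a model of any particular processor: `distance` is a hypothetical parameter giving the number of issue slots that must separate a register write from a later read, and bubbles are shown as `None`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    op: str
    dest: str
    srcs: tuple


def insert_stalls(program, distance=3):
    """Return the issue order with None bubbles wherever a source is not ready."""
    issued = []      # final issue order; None marks an inserted delay slot
    last_write = {}  # register name -> issue slot of its most recent writer
    for ins in program:
        # Earliest slot at which every source register is safe to read.
        ready = max(
            (last_write[r] + distance for r in ins.srcs if r in last_write),
            default=0,
        )
        while len(issued) < ready:
            issued.append(None)
        last_write[ins.dest] = len(issued)
        issued.append(ins)
    return issued


if __name__ == "__main__":
    program = [
        Instr("add", "r1", ("r2", "r3")),
        Instr("sub", "r4", ("r1", "r3")),  # reads r1 written by add
        Instr("and", "r6", ("r1", "r7")),  # also reads r1
    ]
    for slot in insert_stalls(program):
        print("nop" if slot is None else f"{slot.op} {slot.dest}")
```

Operand forwarding would correspond to a smaller `distance`, since the result reaches the consumer without waiting for the register write.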
26 3. Branch difficulties. Prefetch target instruction: prefetch the target instruction in addition to the instruction following the branch. Branch target buffer (BTB): stores previously executed branch instructions. Loop buffer: an extension of the BTB, made of fast registers that hold all the instructions of a loop. Branch prediction: uses additional logic to guess the outcome of a conditional branch. Delayed branch: the compiler rearranges the instructions so that useful instructions run during the branch execution cycle; used in RISC processors.
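Branch prediction can be illustrated with a 2-bit saturating counter. This scheme is a common textbook choice and is an assumption here, since the slides only say that additional logic guesses the outcome; the function names are hypothetical.

```python
def predict(counter):
    """Counter states 0-1 predict not taken; states 2-3 predict taken."""
    return counter >= 2


def update(counter, taken):
    """Move one step toward 3 on a taken branch, toward 0 otherwise."""
    return min(counter + 1, 3) if taken else max(counter - 1, 0)


def accuracy(outcomes, counter=2):
    """Fraction of branch outcomes that the counter predicts correctly."""
    correct = 0
    for taken in outcomes:
        if predict(counter) == taken:
            correct += 1
        counter = update(counter, taken)
    return correct / len(outcomes)


if __name__ == "__main__":
    # A loop branch taken nine times and then not taken, executed twice.
    loop_branch = ([True] * 9 + [False]) * 2
    print(accuracy(loop_branch))
```

Because one not-taken outcome moves the counter only one step, the predictor still guesses "taken" when the loop is entered again.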
27 Example: RISC pipeline. One clock cycle per instruction. Fixed-length instruction format. Register-to-register operations. Two memory modules: instruction memory and data memory. The compiler is used to optimize the pipeline.
28 Types of instructions: data manipulation (operates on registers), data transfer (only load and store), program control. Three-segment instruction pipeline, in which the ALU is used for: data manipulation, evaluating the effective address, calculating the branch address.
29 Example of delayed load in RISC
31 Example of delayed branch in RISC
35 Vector Processing: for problems that need a vast number of computations
37 Interleaved memory: memory modules connected to a common bus
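One common arrangement is low-order interleaving, in which consecutive addresses fall in consecutive modules. The slide does not say which scheme it uses, so the mapping below is an assumption, and `interleave` is a hypothetical helper.

```python
def interleave(address, modules=4):
    """Map a word address to (module number, address inside that module)."""
    # Low-order interleaving: the remainder selects the module,
    # the quotient selects the word within it.
    return address % modules, address // modules


if __name__ == "__main__":
    for address in range(8):
        module, offset = interleave(address)
        print(f"address {address} -> module {module}, offset {offset}")
```

With this mapping, a run of consecutive addresses is spread across all modules, so their accesses can overlap.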
38 Supercomputers. Array processor: a processor that performs computations on large arrays of data. Attached array processor: an auxiliary processor attached to a general-purpose computer (to improve numerical computation performance). SIMD array processor.
39 VAX 11 computer and FSP-164/MAX (Floating-Point Systems)
40 [Figure: processing element (PE) with ALU, floating-point unit, and working registers; an enable signal for each PE]
41 1.8 Measuring, Reporting, and Summarizing Performance (by D. Patterson). When we say one computer is faster than another, what do we mean?
42 Some definitions: The phrase "X is faster than Y" is used here to mean that the response time or execution time is lower on X than on Y for the given task. In particular, "X is n times faster than Y" will mean: $\dfrac{\text{Execution time}_Y}{\text{Execution time}_X} = n$. The most straightforward definition of time is called wall-clock time, response time, or elapsed time, which is the latency to complete a task, including disk accesses, memory accesses, input/output activities, operating system overhead: everything. With multiprogramming, however, the processor works on other programs while one program waits for I/O, so elapsed time no longer measures the work done for a single program. CPU time recognizes this distinction and means the time the processor is computing, not including the time waiting for I/O or running other programs.
43 Wall-clock time, response time, elapsed time: the latency to complete a task, including disk accesses, memory accesses, input/output activities, operating system overhead, ... CPU time: the time the CPU is computing, excluding I/O and, with multiprogramming, the time spent running other programs; often further divided into user and system CPU times. User CPU time: the CPU time spent in the program. System CPU time: the CPU time spent in the operating system. With multiprogramming, the response time seen by the user is the elapsed time of the program, not the CPU time.
44 UNIX time command output: 90.7u 12.9s 2:39 65%. 90.7 - seconds of user CPU time. 12.9 - seconds of system CPU time. 2:39 - elapsed time (159 seconds). 65% - percentage of elapsed time that is CPU time: (90.7 + 12.9)/159.
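The 65% figure follows directly from the other three numbers. A minimal sketch of the arithmetic, assuming the elapsed time is given as minutes:seconds; `cpu_fraction` is a hypothetical helper and does not parse real `time` output.

```python
def cpu_fraction(user_s, system_s, elapsed):
    """Share of elapsed time spent as CPU time; elapsed is 'minutes:seconds'."""
    minutes, seconds = elapsed.split(":")
    elapsed_s = int(minutes) * 60 + float(seconds)
    return (user_s + system_s) / elapsed_s


if __name__ == "__main__":
    share = cpu_fraction(90.7, 12.9, "2:39")
    print(f"{share:.0%}")
```

The remaining share of the elapsed time was spent waiting for I/O or running other programs.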
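45 CPU Execution Time
$\text{CPU time} = \text{CPU clock cycles for a program} \times \text{Clock cycle time} = \dfrac{\text{CPU clock cycles for a program}}{\text{Clock rate}}$
Instruction count (IC) = number of instructions executed
Clock cycles per instruction (CPI): $\text{CPI} = \dfrac{\text{CPU clock cycles for a program}}{\text{IC}}$
CPI is one way to compare two machines with the same instruction set, since the instruction count would be the same.
46 CPU Execution Time (cont'd)
$\text{CPU time} = \text{IC} \times \text{CPI} \times \text{Clock cycle time}$
$\text{CPU time} = \dfrac{\text{IC} \times \text{CPI}}{\text{Clock rate}}$
$\text{CPU time} = \dfrac{\text{Instructions}}{\text{Program}} \times \dfrac{\text{Clock cycles}}{\text{Instruction}} \times \dfrac{\text{Seconds}}{\text{Clock cycle}} = \dfrac{\text{Seconds}}{\text{Program}}$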
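The CPU time equation is easy to evaluate and to invert. The sketch below uses made-up values (2 billion instructions, CPI of 2.2, 1 GHz clock) purely for illustration; the function names are hypothetical.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time in seconds: IC x CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz


def measured_cpi(execution_time_s, clock_rate_hz, instruction_count):
    """CPI recovered from a measured run: execution time / cycle time / IC."""
    clock_cycle_time = 1.0 / clock_rate_hz
    return execution_time_s / clock_cycle_time / instruction_count


if __name__ == "__main__":
    seconds = cpu_time(2_000_000_000, 2.2, 1_000_000_000)
    print(seconds)
    print(measured_cpi(seconds, 1_000_000_000, 2_000_000_000))
```

Halving any one of the three factors, with the other two unchanged, halves the CPU time.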
47 How to Calculate the 3 Components? Clock cycle time: given in the specification of the computer (clock rate in advertisements). Instruction count: count instructions in the loop of a small program; use a simulator to count instructions; use a hardware counter in a special register (Pentium II). CPI: calculate Execution time / Clock cycle time / Instruction count; or use a hardware counter in a special register (Pentium II).
48 Another Way to Calculate CPI. First calculate the CPI for each individual instruction type (add, sub, and, etc.): CPI_i. Next calculate the frequency of each instruction type: Freq_i = IC_i / IC. Finally, multiply these two for each instruction type and add the products to get the final CPI:
$\text{CPI} = \sum_{i=1}^{n} \dfrac{\text{IC}_i}{\text{IC}} \times \text{CPI}_i$
Op                     ALU    Load   Store  Branch
Freq_i                 50%    20%    10%    20%
CPI_i
Prod
% Time (Prod / 2.2)    23%    45%    14%    18%
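The weighted-sum calculation can be sketched as follows. The per-class CPI values used here (1, 5, 3, 2) are an assumption: they are back-computed from the Freq_i row, the % Time row, and the total of 2.2, because the table's own CPI_i and Prod entries are missing.

```python
# (frequency, CPI) per instruction class; the CPI values are assumed, see above.
MIX = {
    "ALU": (0.50, 1),
    "Load": (0.20, 5),
    "Store": (0.10, 3),
    "Branch": (0.20, 2),
}


def overall_cpi(mix):
    """Sum of Freq_i x CPI_i over all instruction classes."""
    return sum(freq * cpi for freq, cpi in mix.values())


def time_share(mix):
    """Fraction of execution time spent in each instruction class."""
    total = overall_cpi(mix)
    return {op: freq * cpi / total for op, (freq, cpi) in mix.items()}


if __name__ == "__main__":
    print(round(overall_cpi(MIX), 2))
    for op, share in time_share(MIX).items():
        print(f"{op}: {share:.0%}")
```

Under these assumed values, loads account for the largest share of time even though ALU operations are the most frequent.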
49 Choosing Programs to Evaluate Performance. Ideally, run typical programs with typical input before purchase, or even before building the machine: an engineer uses a compiler; an author uses a word processor, a drawing program, and compression software. Workload: the mixture of programs and OS commands that users run on a machine. Few can do this: they don't have access to the machine to benchmark it before purchase, and they don't know the future workload.
50 Benchmarks. Different types of benchmarks: Real programs (e.g., MSWord, Excel, Photoshop, ...). Kernels: small pieces from real programs (Linpack, ...). Toy benchmarks: short, easy to type and run (Quicksort, Puzzle, ...). Synthetic benchmarks: code that matches the frequency of key instructions and operations to real programs (Whetstone, Dhrystone). Industry standards are needed so that different processors can be fairly compared. Companies exist that create these benchmarks: typical code used to evaluate systems.
51 Benchmark Suites. SPEC (Standard Performance Evaluation Corporation): originally focusing on CPU performance (SPEC CPU2000); graphics benchmarks: SPECviewperf, SPECapc; server benchmarks: SPECSFS, SPECWEB. PC benchmarks (Winbench 99, Business Winstone 99, High-end Winstone 99, CC Winstone 99). Transaction processing benchmarks. Embedded benchmarks.
52 Comparing and Summarising Performance: An Example. [Table: execution times in seconds of programs P1 and P2, and their total, on computers A, B, and C] A is 20 times faster than C for program P1. C is 50 times faster than A for program P2. B is 2 times faster than C for program P1. C is 5 times faster than B for program P2. What can we learn from these statements? Taken individually, they tell us nothing about the relative performance of computers A, B, and C! One approach to summarising relative performance: use the total execution times of the programs.
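The total-execution-time summary can be sketched with sample numbers. The times below are hypothetical: they were chosen only so that the four ratio statements hold, and are not claimed to be the values in the original table.

```python
# Hypothetical execution times in seconds, consistent with the four statements.
TIMES = {
    "A": {"P1": 1, "P2": 1000},
    "B": {"P1": 10, "P2": 100},
    "C": {"P1": 20, "P2": 20},
}


def total_times(times):
    """Total execution time per computer across all programs."""
    return {machine: sum(per_program.values()) for machine, per_program in times.items()}


def fastest(times):
    """Computer with the lowest total execution time."""
    totals = total_times(times)
    return min(totals, key=totals.get)


if __name__ == "__main__":
    print(total_times(TIMES))
    print(fastest(TIMES))
```

With these numbers the per-program ratios point in opposite directions, while the totals give a single ordering for this particular workload.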
53 Amdahl's Law. Suppose that we make an enhancement to a machine that will improve its performance. Speedup is the ratio:
$\text{Speedup} = \dfrac{\text{ExTime for entire task without enhancement}}{\text{ExTime for entire task using enhancement}} = \dfrac{\text{Performance for entire task using enhancement}}{\text{Performance for entire task without enhancement}}$
Amdahl's Law states that the performance improvement that can be gained by a particular enhancement is limited by the amount of time that the enhancement can be used.
54 Amdahl's Law gives us a quick way to find the speedup from some enhancement, which depends on two factors: 1. The fraction of the computation time in the original computer that can be converted to take advantage of the enhancement. For example, if 20 seconds of the execution time of a program that takes 60 seconds in total can use an enhancement, the fraction is 20/60. This value, which we will call Fraction_enhanced, is always less than or equal to 1. 2. The improvement gained by the enhanced execution mode; that is, how much faster the task would run if the enhanced mode were used for the entire program. This value is the time of the original mode over the time of the enhanced mode. If the enhanced mode takes, say, 2 seconds for a portion of the program, while it is 5 seconds in the original mode, the improvement is 5/2. We will call this value, which is always greater than 1, Speedup_enhanced.
55 Computing Speedup. Fraction_enhanced = fraction of execution time in the original machine that can be converted to take advantage of the enhancement (e.g., 10/30). Speedup_enhanced = how much faster the enhanced code will run (e.g., 10/2 = 5). The execution time of the enhanced program is the sum of the old execution time of the unenhanced part of the program and the new execution time of the enhanced part (e.g., 10/5 = 2):
$\text{ExTime}_{\text{new}} = \text{ExTime}_{\text{unenhanced}} + \dfrac{\text{ExTime}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}}$
56 The enhanced part of the program is Fraction_enhanced, so the times are:
$\text{ExTime}_{\text{unenhanced}} = \text{ExTime}_{\text{old}} \times (1 - \text{Fraction}_{\text{enhanced}})$
$\text{ExTime}_{\text{enhanced}} = \text{ExTime}_{\text{old}} \times \text{Fraction}_{\text{enhanced}}$
Factor out ExTime_old and divide by Speedup_enhanced:
$\text{ExTime}_{\text{new}} = \text{ExTime}_{\text{old}} \times \left[ (1 - \text{Fraction}_{\text{enhanced}}) + \dfrac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}} \right]$
Overall speedup is the ratio of ExTime_old to ExTime_new:
$\text{Speedup}_{\text{overall}} = \dfrac{1}{(1 - \text{Fraction}_{\text{enhanced}}) + \dfrac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}}}$
57 An Example. An enhancement runs 10 times faster and affects 40% of the execution time. Fraction_enhanced = 0.40, Speedup_enhanced = 10, Speedup_overall = ?
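The example can be worked through with a direct transcription of Amdahl's Law. This is a minimal sketch; `amdahl_speedup` is a hypothetical name.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup: 1 / ((1 - F) + F / S)."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    unenhanced_part = 1.0 - fraction_enhanced
    enhanced_part = fraction_enhanced / speedup_enhanced
    return 1.0 / (unenhanced_part + enhanced_part)


if __name__ == "__main__":
    # Values from the example: 40% of the time is affected, sped up 10 times.
    print(round(amdahl_speedup(0.40, 10), 4))
    # Even an extremely fast enhancement is capped by the unenhanced 60%.
    print(round(amdahl_speedup(0.40, 1e9), 4))
```

For these inputs the denominator is 0.6 + 0.04 = 0.64, so the overall speedup is about 1.56, and it can never exceed 1/0.6 however fast the enhanced part becomes.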
More informationDHANALAKSHMI SRINIVASAN INSTITUTE OF RESEARCH AND TECHNOLOGY. Department of Computer science and engineering
DHANALAKSHMI SRINIVASAN INSTITUTE OF RESEARCH AND TECHNOLOGY Department of Computer science and engineering Year :II year CS6303 COMPUTER ARCHITECTURE Question Bank UNIT-1OVERVIEW AND INSTRUCTIONS PART-B
More informationExercise 1 Advanced Computer Architecture. Exercise 1
Folie a: Name Advanced Computer Architecture Department of Electrical Engineering and Information Technology Institute for g Dipl.-Ing. M.A. Lebedev Institute for BB 321, Tel: 0203 379-1019 E-mail: michail.lebedev@uni-due.de
More informationLecture Outline. CPE 631: Introduction. Introduction. A short history of computing
Lecture Outline CPE 63: Introduction Electrical and Computer Engineering University of Alabama in Huntsville Aleksandar Milenkovic, milenka@ece.uah.edu http://www.ece.uah.edu/~milenka Evolution of Computer
More informationCMSC411 Epic Cheat Sheet Draft Summer 2014
CMSC411 Epic Cheat Sheet Draft Summer 2014 Table of Contents (Locations in 5thEd) ISAs and MIPS, Appendix B Classifications B.2 Encodings B.3 MIPS B.9 Quantitative CPU Analysis Ch. 1.8 1.9 Amdahl's Law
More informationComputer Performance. Reread Chapter Quiz on Friday. Study Session Wed Night FB 009, 5pm-6:30pm
Computer Performance He said, to speed things up we need to squeeze the clock Reread Chapter 1.4-1.9 Quiz on Friday. Study Session Wed Night FB 009, 5pm-6:30pm L15 Computer Performance 1 Why Study Performance?
More informationIC220 Slide Set #5B: Performance (Chapter 1: 1.6, )
Performance IC220 Slide Set #5B: Performance (Chapter 1: 1.6, 1.9-1.11) Measure, Report, and Summarize Make intelligent choices See through the marketing hype Key to understanding underlying organizational
More informationThe Role of Performance
Orange Coast College Business Division Computer Science Department CS 116- Computer Architecture The Role of Performance What is performance? A set of metrics that allow us to compare two different hardware
More informationPipeline Review. Review
Pipeline Review Review Covered in EECS2021 (was CSE2021) Just a reminder of pipeline and hazards If you need more details, review 2021 materials 1 The basic MIPS Processor Pipeline 2 Performance of pipelining
More information3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle?
CSE 2021: Computer Organization Single Cycle (Review) Lecture-10b CPU Design : Pipelining-1 Overview, Datapath and control Shakil M. Khan 2 Single Cycle with Jump Multi-Cycle Implementation Instruction:
More informationUpdated Exercises by Diana Franklin
C-82 Appendix C Pipelining: Basic and Intermediate Concepts Updated Exercises by Diana Franklin C.1 [15/15/15/15/25/10/15] Use the following code fragment: Loop: LD R1,0(R2) ;load R1 from address
More informationComputer and Information Sciences College / Computer Science Department CS 207 D. Computer Architecture
Computer and Information Sciences College / Computer Science Department CS 207 D Computer Architecture The Computer Revolution Progress in computer technology Underpinned by Moore s Law Makes novel applications
More informationThe bottom line: Performance. Measuring and Discussing Computer System Performance. Our definition of Performance. How to measure Execution Time?
The bottom line: Performance Car to Bay Area Speed Passengers Throughput (pmph) Ferrari 3.1 hours 160 mph 2 320 Measuring and Discussing Computer System Performance Greyhound 7.7 hours 65 mph 60 3900 or
More informationCS61C : Machine Structures
inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures CS61C L41 Performance I (1) Lecture 41 Performance I 2004-12-06 Lecturer PSOE Dan Garcia www.cs.berkeley.edu/~ddgarcia Sour Roses! Cal s best season
More informationLecture 2: Computer Performance. Assist.Prof.Dr. Gürhan Küçük Advanced Computer Architectures CSE 533
Lecture 2: Computer Performance Assist.Prof.Dr. Gürhan Küçük Advanced Computer Architectures CSE 533 Performance and Cost Purchasing perspective given a collection of machines, which has the - best performance?
More informationWhat is Good Performance. Benchmark at Home and Office. Benchmark at Home and Office. Program with 2 threads Home program.
Performance COMP375 Computer Architecture and dorganization What is Good Performance Which is the best performing jet? Airplane Passengers Range (mi) Speed (mph) Boeing 737-100 101 630 598 Boeing 747 470
More informationINSTITUTO SUPERIOR TÉCNICO. Architectures for Embedded Computing
UNIVERSIDADE TÉCNICA DE LISBOA INSTITUTO SUPERIOR TÉCNICO Departamento de Engenharia Informática Architectures for Embedded Computing MEIC-A, MEIC-T, MERC Lecture Slides Version 3.0 - English Lecture 05
More informationWhich is the best? Measuring & Improving Performance (if planes were computers...) An architecture example
1 Which is the best? 2 Lecture 05 Performance Metrics and Benchmarking 3 Measuring & Improving Performance (if planes were computers...) Plane People Range (miles) Speed (mph) Avg. Cost (millions) Passenger*Miles
More informationDefining Performance. Performance 1. Which airplane has the best performance? Computer Organization II Ribbens & McQuain.
Defining Performance Performance 1 Which airplane has the best performance? Boeing 777 Boeing 777 Boeing 747 BAC/Sud Concorde Douglas DC-8-50 Boeing 747 BAC/Sud Concorde Douglas DC- 8-50 0 100 200 300
More informationECE260: Fundamentals of Computer Engineering
Pipelining James Moscola Dept. of Engineering & Computer Science York College of Pennsylvania Based on Computer Organization and Design, 5th Edition by Patterson & Hennessy What is Pipelining? Pipelining
More informationChapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified
More informationLecture: Benchmarks, Pipelining Intro. Topics: Performance equations wrap-up, Intro to pipelining
Lecture: Benchmarks, Pipelining Intro Topics: Performance equations wrap-up, Intro to pipelining 1 Measuring Performance Two primary metrics: wall clock time (response time for a program) and throughput
More informationComputer Architecture
Computer Architecture Architecture The art and science of designing and constructing buildings A style and method of design and construction Design, the way components fit together Computer Architecture
More information