Advanced Computer Architecture Week 1: Introduction. ECE 154B Dmitri Strukov


1 Advanced Computer Architecture
Week 1: Introduction
ECE 154B, Dmitri Strukov

2 Outline
- Course information
- Trends (in technology, cost, performance) and issues

3 Course organization
- Class website: er2015/home.htm
- Instructor office hours: Wed, 2:00 pm - 4:00 pm
- Teaching assistant: David McCarthy, office hours: Friday, 2:00 pm - 4:00 pm

4 Textbook
- Computer Architecture: A Quantitative Approach, John L. Hennessy and David A. Patterson, Fifth Edition, Morgan Kaufmann, 2012, ISBN:
- Modern Processor Design: Fundamentals of Superscalar Processors, John Paul Shen and Mikko H. Lipasti, Waveland Press, 2013, ISBN:

5 Class topics and tentative schedule
- Computer fundamentals (historical trends, performance metrics): 1 week
- Memory hierarchy design: 2 weeks
- Instruction-level parallelism (static and dynamic scheduling, speculation): 2 weeks
- Data-level parallelism (vector, SIMD and GPUs): 2 weeks
- Thread-level parallelism (shared-memory architectures, synchronization and cache coherence): 2 weeks
- Warehouse-scale computers or detailed analysis of a specific µP: 1 week

6 Ultimate goal of the class
- To get the intuition behind the main techniques for improving performance
- To understand advanced microprocessors such as
  - ARM Cortex A8
  - Intel Core i7
  - Tesla GPU
This is what you are supposed to know already: the 5-stage MIPS pipeline.

7 This is what we will learn in this class!

8 Grading
- Projects: 70%
- Final: 30%
Project course work will involve program performance analysis and architectural optimizations for superscalar processors using the SimpleScalar simulation tools.
A number of problems will be assigned before the final (but not graded).

9 Course prerequisites ECE 154A or equivalent 9

10 A bit of history: ENIAC - Electronic Numerical Integrator And Computer,

11 VLSI Developments
1946: ENIAC (Electronic Numerical Integrator And Computer)
- Floor area: 140 m^2
- Performance: multiplication of two 10-digit numbers in 2 ms
2011: High-performance microprocessor
- Chip area: mm^2 (for multi-core)
- Board area: 200 cm^2; improvement of 10^4
- Performance: 64-bit multiply in a few ns; improvement of

12 Computer trends: Performance of a (single) processor
The next series of questions centers on understanding this important graph.

13 Question
Q1: What is the performance shown in the figure, and how do we define it?
- A1a: Performance is typically related to how fast a certain task can be executed, i.e. it is the reciprocal of execution time:
  Performance = 1 / ExecTime
  - Wall-clock time: includes all system overheads
  - CPU time: only computation time
  ExecTime = IC * CCT * <CPI>
  (IC is the instruction count, CCT the clock cycle time, <CPI> the average number of cycles per instruction)
- A1b: There are many different performance metrics today because µPs are used in different applications
- What kind of metrics?
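
The performance equation above can be tried out numerically. The following is a minimal Python sketch; the function names and the workload numbers are illustrative assumptions, not values from the slides.

```python
def exec_time(inst_count: float, cpi: float, cct: float) -> float:
    """ExecTime = IC * <CPI> * CCT, with CCT given in seconds."""
    return inst_count * cpi * cct


def performance(exec_time_s: float) -> float:
    """Performance is the reciprocal of execution time."""
    return 1.0 / exec_time_s


# Hypothetical workload (assumption): 1e9 instructions,
# average CPI of 1.5, 2 GHz clock (CCT = 0.5 ns).
t = exec_time(1e9, 1.5, 0.5e-9)
print(f"ExecTime = {t:.2f} s, Performance = {performance(t):.2f} tasks/s")
```

Halving any one of the three factors halves the execution time, which is why the later slides look at instruction count, CPI and clock cycle time separately.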

14 Measuring Performance
Typical performance metrics:
- Execution time (or latency)
- Throughput
  Q2: How is throughput related to latency?
  A2: In general these are two different concepts. Throughput can be improved by providing more parallelism, but it can also be improved by reducing latency. For example, with no parallelism, throughput is inversely proportional to latency.
- Energy
  Q3: Is the energy metric the same as the power consumption one?
  A3: Power = energy / time, so in general the two are equivalent only when the execution time is the same.
- Response time
The typical way to measure performance is to run a benchmark (i.e. a collection of applications representative of the workload of the tested hardware):
- Kernels (e.g. matrix multiply)
- Toy programs (e.g. sorting)
- Synthetic benchmarks (e.g. Dhrystone)
- Benchmark suites (e.g. SPEC06fp, TPC-C)
Speedup of X relative to Y = Execution time of Y / Execution time of X
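
A short Python sketch of the speedup definition and of the energy-versus-power point in A3. The two machines and their numbers are hypothetical and chosen only for illustration.

```python
def speedup(exec_time_y: float, exec_time_x: float) -> float:
    """Speedup of X relative to Y = ExecTime(Y) / ExecTime(X)."""
    return exec_time_y / exec_time_x


def energy_joules(power_watts: float, exec_time_s: float) -> float:
    """Energy = power * time (the rearranged form of Power = energy / time)."""
    return power_watts * exec_time_s


# Hypothetical machines (assumption): X is faster but draws more power.
x_time, x_power = 10.0, 20.0   # seconds, watts
y_time, y_power = 15.0, 15.0

print("Speedup of X relative to Y:", speedup(y_time, x_time))
print("Energy on X (J):", energy_joules(x_power, x_time))
print("Energy on Y (J):", energy_joules(y_power, y_time))
```

In this made-up case Y has the lower power but the higher energy for the task, so ranking by power and ranking by energy disagree once the execution times differ.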

15 Bandwidth vs. Latency
Bandwidth or throughput
- Total work done in a given time
- 10,000-25,000X improvement for processors
- X improvement for memory and disks
Latency or response time
- Time between start and completion of an event
- 30-80X improvement for processors
- 6-8X improvement for memory and disks

16 Computer trends: Performance of a (single) processor 16

17 Questions: Reasons behind performance improvement?
Q4: Why was it improving originally (from ~1978 to ~1984 on the figure)?
A4: Moore's law and the resulting increase in clock frequency

18 CMOS improvements: Transistor density: 4x / 3 yrs Die size: 10-25% / yr 18

19 Scaling with Feature Size (for short channel devices before running into leakage problems)
Let's
1) scale all the dimensions of the transistors and wires down by a factor of s, and
2) scale the supply voltage V down by a factor of s (together with the threshold voltage Vth).
Then
- Density: ~ s^2
- Logic gate capacitance Cgate (traditionally the dominating parasitic): ~ 1/s
- Saturation current I_ON: ~ 1/s
- Gate delay Tgate: ~ Cgate*V/I_ON ~ 1/s
- Clock frequency: ~ s, i.e. it is inversely proportional to gate delay. Clock cycle time is typically around ten or more logic gate delays.
See, e.g., page 124 of Digital Integrated Circuits by Jan Rabaey et al., 2nd edition.

20 Frequency Scaling with Feature Size
If s is the scaling factor, then
- Density: ~ s^2
- Voltage V: ~ 1/s
- Logic gate capacitance C (traditionally dominating): ~ 1/s
- Saturation current I_ON: ~ 1/s
- Gate delay: ~ C*V/I_ON ~ 1/s
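
The scaling rules above can be checked with a few lines of Python. This is only a sketch: all quantities are relative to the unscaled technology (value 1.0), and the function name and the sample values of s are assumptions.

```python
def scale(s: float) -> dict:
    """Relative change of each quantity when dimensions and V shrink by s."""
    c_gate = 1.0 / s          # gate capacitance ~ 1/s
    voltage = 1.0 / s         # supply voltage ~ 1/s
    i_on = 1.0 / s            # saturation current ~ 1/s
    gate_delay = c_gate * voltage / i_on   # ~ C*V/I_ON
    return {
        "density": s ** 2,
        "c_gate": c_gate,
        "voltage": voltage,
        "i_on": i_on,
        "gate_delay": gate_delay,
        "clock_frequency": 1.0 / gate_delay,
    }


for s in (1.0, 1.4, 2.0):     # sample scaling factors (assumption)
    r = scale(s)
    print(f"s={s}: density x{r['density']:.2f}, "
          f"gate delay x{r['gate_delay']:.2f}, "
          f"frequency x{r['clock_frequency']:.2f}")
```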

21 Computer trends: Performance of a (single) processor 21

22 Question:
Q5: Reasons behind further performance improvement? What happened in 1986?
A5: The move from CISC to RISC, which enabled additional architectural improvements (see next slide)
Review: Dimensions of ISA
(1) Class of ISA: register-memory vs. load-store
(2) Memory addressing: byte addressable
(3) Addressing modes (how operands and memory addresses are specified): register, immediate, displacement, indirect, indexed, absolute
(4) Types and sizes of operands: byte, half-word, word
(5) Operations: data transfer, arithmetic/logical, control and floating point
(6) Control flow instructions: conditional branches, unconditional jumps, returns
(7) Encoding an ISA: variable versus fixed length

23 Question: Reasons behind performance improvement? What happened in 1986? CISC to RISC
ExecTime = IC * CCT * <CPI>
Q6: How are these terms affected by this move, and in particular how are the terms in the performance equation affected by pipelining?
A6 (values relative to the single-cycle design; N is the number of stages or steps):

Design                        Inst count   CPI                                     CCT
Single cycle (SC)             1            1                                       1
Multi cycle (MC)              1            between 1 and N (closer to N than 1)    > 1/N
Multi cycle pipelined (MCP)   1            > 1                                     > 1/N

24 Question:
Pipelining improves performance (instructions per cycle, relative to a non-pipelined multi-cycle processor) by overlapping instructions. It is one kind of instruction-level parallelism (ILP).
Q7: Problems with improving ILP? What are the problems in pipelines?
- A7: Clock cycle is determined by the slowest component
  - What is typically the slowest component? Memory
- A7: Data and control hazards (pipeline stalls and flushes)
Further improvement in ILP?
- A7: Limited parallelism in ILP

25 Memory Wall problem
- DRAM access (main memory) could take hundreds of cycles
- Memory hierarchy to the rescue: it alleviates the problem
- We will spend much time later in class reviewing advanced techniques for reducing the effective access time to main memory

26 Bandwidth and Latency Performance Milestones
[Figure: log-log plot of bandwidth and latency milestones; annotation: CPU high, memory low ("Memory Wall")]
- Processor: 286, 386, 486, Pentium, Pentium Pro, Pentium 4 (21x, 2250x)
- Ethernet: 10Mb, 100Mb, 1000Mb, Mb/s (16x, 1000x)
- Memory module: 16-bit plain DRAM, Page Mode DRAM, 32b, 64b, SDRAM, DDR SDRAM (4x, 120x)
- Disk: 3600, 5400, 7200, 10000, RPM (8x, 143x)
Bandwidth is much easier to improve. Why?

28 ILP techniques

29 Summary of Trends in Technology
Integrated circuit technology
- Transistor density: 35%/year
- Die size: 10-20%/year
- Integration overall: 40-55%/year
DRAM capacity: 25-40%/year (slowing)
Flash capacity: 50-60%/year
- 15-20X cheaper/bit than DRAM
Magnetic disk technology: 40%/year
- 15-25X cheaper/bit than Flash
- X cheaper/bit than DRAM

30 Computer trends: Performance of a (single) processor The area of high performance chip has been close to ~ cm^2, why? 30

31 Question:
Q8: Why did the die size grow by only 10% / year? The performance of a single processor could be improved by using more hardware (larger cache, more sophisticated branch prediction, etc.)
[Figure: drawing a single-crystal Si ingot from the furnace, then slicing it into wafers and patterning them; 8-inch MIPS64 R20K wafer (564 dies)]

32 Trends in Cost
- Cost driven down by the learning curve
  - Yield
- DRAM: price closely tracks cost
- Microprocessors: price depends on volume
  - 10% less for each doubling of volume

33 Integrated Circuits Costs
$$\text{IC cost} = \frac{\text{Die cost} + \text{Testing cost} + \text{Packaging cost}}{\text{Final test yield}}$$
$$\text{Die cost} = \frac{\text{Wafer cost}}{\text{Dies per wafer} \times \text{Die yield}}$$
- Final test yield: fraction of packaged dies that pass the final testing stage
- Die yield: fraction of good dies on a wafer
- Defects per unit area = defects per square cm (2010)
- N = process-complexity factor = (40 nm, 2010)
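
A hedged Python sketch of the cost model on this slide. The slide does not spell out the die-yield formula, so the sketch assumes the model from the course textbook (Hennessy and Patterson, Fifth Edition): die yield = wafer yield / (1 + defects per unit area * die area)^N. The defect density, wafer cost and all other numbers below are hypothetical; only N = 11.5 is taken from the next slide.

```python
def die_yield(defects_per_cm2: float, die_area_cm2: float,
              n: float, wafer_yield: float = 1.0) -> float:
    """Assumed textbook model: wafer_yield / (1 + defects * area)^N."""
    return wafer_yield / (1.0 + defects_per_cm2 * die_area_cm2) ** n


def die_cost(wafer_cost: float, dies_per_wafer: float, yield_: float) -> float:
    """Die cost = wafer cost / (dies per wafer * die yield)."""
    return wafer_cost / (dies_per_wafer * yield_)


def ic_cost(die_cost_: float, testing_cost: float,
            packaging_cost: float, final_test_yield: float) -> float:
    """IC cost = (die + testing + packaging cost) / final test yield."""
    return (die_cost_ + testing_cost + packaging_cost) / final_test_yield


# Hypothetical inputs (assumptions): 0.03 defects/cm^2, $5000 wafer.
for area, dies in ((1.0, 600), (2.0, 280)):
    y = die_yield(0.03, area, n=11.5)
    c = die_cost(5000.0, dies, y)
    print(f"area={area} cm^2: yield={y:.3f}, die cost=${c:.2f}, "
          f"IC cost=${ic_cost(c, 2.0, 3.0, 0.95):.2f}")
```

Because a larger die both lowers the yield and reduces the number of dies per wafer, die cost grows faster than linearly with area under this model.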

34 [Figure: die yield / wafer yield and die cost (arbitrary units) plotted against die area (cm^2), for N = 11.5 and two values of defects per unit area; annotated as the answer to the question]

35 ASIC vs. µP
[Figure: total cost ($) versus volume for a µP and an ASIC, with an NRE of $1 M (10^6) and curves labeled IC cost = $100 and IC cost = $]
- NRE: non-recurring engineering cost
- Total cost = NRE/volume + IC cost
- This is just an example of NRE cost; it may vary by much. In general, total cost for a µP > that of an ASIC.
Q9:
- What is typically denser for the same task, an ASIC or a µP? ASIC
- What is typically more energy efficient and faster? ASIC
- What costs less to produce, an ASIC or a µP? Depends on volume (see graph above)
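
The formula Total cost = NRE/volume + IC cost implies a break-even volume between two design options. A minimal Python sketch, where "design A" and "design B" and all dollar figures are hypothetical placeholders rather than the values behind the slide's graph:

```python
def total_cost_per_unit(nre: float, volume: float, ic_cost: float) -> float:
    """Total cost = NRE / volume + IC cost (per unit)."""
    return nre / volume + ic_cost


def break_even_volume(nre_a: float, ic_cost_a: float,
                      nre_b: float, ic_cost_b: float) -> float:
    """Volume at which both options cost the same per unit.

    Solves nre_a / v + ic_cost_a == nre_b / v + ic_cost_b for v.
    """
    return (nre_a - nre_b) / (ic_cost_b - ic_cost_a)


# Hypothetical options (assumption): A has high NRE and a cheap chip,
# B has low NRE and an expensive chip.
nre_a, ic_a = 1_000_000.0, 10.0
nre_b, ic_b = 100_000.0, 100.0

v = break_even_volume(nre_a, ic_a, nre_b, ic_b)
print(f"Break-even volume: {v:.0f} units")
for volume in (1_000, 100_000):
    print(volume,
          total_cost_per_unit(nre_a, volume, ic_a),
          total_cost_per_unit(nre_b, volume, ic_b))
```

Below the break-even volume the low-NRE option is cheaper per unit; above it the high-NRE option wins, which is the "depends on volume" answer to Q9.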

36 Major computing platforms
- Application Specific Integrated Circuit
- Field Programmable Gate Array
- Microprocessor
[Figure: the platforms ordered along two opposing axes, density/speed and flexibility]
In this class, the focus is on microprocessors only.

37 Computer trends: Performance of a (single) processor 37

38 Questions: Reasons behind performance improvement?
Q10: What happened after 2002 on the performance figure?
- A10: Power wall
- A10: End of ILP
  - Limits to pipelining
  - Limits to superscalar

39 Power consumption
Problem: get power in, get power out
Thermal Design Power (TDP)
- Characterizes sustained power consumption
- Used as a target for the power supply and cooling system
- Lower than peak power, higher than average power consumption
Examples
- Intel consumed ~ 2 W
- 3.3 GHz Intel Core i7 consumes 130 W
Typical max temperatures: ~70 C
Maximum power density
- fan-based cooling: 200 W/cm^2
- water-based cooling: 1000 W/cm^2

40 Fourier's law in 1D
Similar to Ohm's law when replacing
- thermal conductance with electrical conductance
- heat source (total generated power) with current source
- temperature with voltage
[Figure: thermal circuit (ambient temperature Tlow, thermal conductance K, heat flux Q, chip temperature Thigh) next to its electrical analogue (Vlow, conductance 1/R, current I, Vhigh)]
Thigh = Tlow + Q/K
Vhigh = Vlow + IR
Temperature is roughly (in a 1D lumped model) linearly proportional to Q, the total dissipated power.
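
The lumped 1D model Thigh = Tlow + Q/K is easy to evaluate. A small Python sketch; the thermal conductance values and the ambient temperature are hypothetical assumptions, and only the 130 W figure echoes the power-consumption slide.

```python
def chip_temperature(t_ambient_c: float, power_w: float,
                     conductance_w_per_k: float) -> float:
    """Thigh = Tlow + Q / K in the 1D lumped model."""
    return t_ambient_c + power_w / conductance_w_per_k


def max_power(t_max_c: float, t_ambient_c: float,
              conductance_w_per_k: float) -> float:
    """Largest Q that keeps the chip at or below t_max_c."""
    return (t_max_c - t_ambient_c) * conductance_w_per_k


# Hypothetical cooling solution (assumption): K = 2.6 W/K, 25 C ambient.
print(chip_temperature(25.0, 130.0, 2.6))
# Power budget for a 70 C limit with a weaker cooler, K = 2 W/K.
print(max_power(70.0, 25.0, 2.0))
```

The linear relation means that, for a fixed cooling solution, every extra watt raises the chip temperature by the same amount, 1/K.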

41 Scaling with Feature Size (for short channel devices before running into leakage problems)
Let's
1) scale all the dimensions of the transistors and wires down by a factor of s, and
2) scale the supply voltage V down by a factor of s (together with the threshold voltage Vth).
Then
- Density: ~ s^2
- Logic gate capacitance Cgate (traditionally the dominating parasitic): ~ 1/s
- Saturation current I_ON: ~ 1/s
- Gate delay Tgate: ~ Cgate*V/I_ON ~ 1/s
- Clock frequency f: ~ s, i.e. it is inversely proportional to gate delay. Clock cycle time is typically around ten or more logic gate delays.
- Power (dynamic component only): ~ 1/2 Ctotal*V^2*f ~ 1 (if the chip area remains the same, power scaling is the same as power density scaling)
There is no issue with power (or temperature) scaling as such; the problem is that the supply voltage is no longer scaled down by a factor of s. Why? See the next slides.
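
A Python sketch of why dynamic power stays constant under this scaling, and what happens once the supply voltage stops scaling. It assumes a fixed chip area, so that Ctotal scales as density times Cgate, i.e. s^2 * (1/s) = s; the function name and the flag are illustrative assumptions.

```python
def dynamic_power_scaling(s: float, voltage_scales: bool = True) -> float:
    """Relative dynamic power ~ Ctotal * V^2 * f for a fixed chip area."""
    c_total = s ** 2 * (1.0 / s)            # density * Cgate ~ s
    voltage = 1.0 / s if voltage_scales else 1.0
    frequency = s                           # f ~ 1 / gate delay ~ s
    return c_total * voltage ** 2 * frequency


for s in (1.0, 2.0, 4.0):
    print(f"s={s}: power x{dynamic_power_scaling(s):.2f} with V scaling, "
          f"x{dynamic_power_scaling(s, voltage_scales=False):.2f} without")
```

With V scaling the s factors cancel; with V held constant (and f still assumed to rise as s) the same chip area dissipates roughly s^2 times more dynamic power, which is the power wall in miniature.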

42 Static vs. dynamic power
[Figure: transistor switch model with on-resistance Ron and off-resistance Roff, labeled with static and dynamic power]
- Static power is permanent
- Dynamic power is consumed only when switching
- Leakage (static power) increases exponentially when lowering V! It cannot be neglected anymore.
- Leakage power ~ V^2/Roff
- Roff/Ron ~ Exp(V)

43 Techniques for Reducing Power Consumption
- Do nothing well
  - Low power state for DRAM, disks
  - Energy proportionality concept (don't consume energy when no work is done): very important for data centers, for which power is a huge portion of the running cost
- Power gating to reduce the static component
- Dynamic Voltage-Frequency Scaling
  - Since saturation current I_ON ~ V^2, f ~ 1/Tgate ~ I_ON / (Cgate*V) ~ V
  - Lowering the voltage reduces the dynamic power consumption and the energy per operation, but decreases performance because of the increased CCT
  Q11: Any benefits for multiprocessors?
  A11: If a task is easily parallelizable, then running it on m processors in parallel at lower V (say V/m) and slower f (say f/m) can lead to the same execution time but much lower dynamic power: Ctotal*V^2*f ~ 1/m^3 per processor, i.e. ~ 1/m^2 summed over the m processors (not accounting for static power)
- Overclocking, turning off cores
- Race-to-halt
- Thermal capacitance / turbo mode
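
The argument in A11 can be checked numerically. The Python sketch below normalizes the single-processor case to V = f = C = 1 and assumes a perfectly parallelizable task; the function and key names are illustrative assumptions. It reports both the per-processor dynamic power (~1/m^3) and the sum over all m processors (~1/m^2).

```python
def parallel_dvfs(m: int) -> dict:
    """m processors, each at V/m and f/m, sharing a perfectly parallel task."""
    voltage = 1.0 / m
    frequency = 1.0 / m
    power_per_processor = voltage ** 2 * frequency   # C*V^2*f with C = 1
    total_power = m * power_per_processor
    # Each processor does 1/m of the work at 1/m of the speed.
    exec_time = (1.0 / m) / frequency
    return {
        "power_per_processor": power_per_processor,
        "total_power": total_power,
        "exec_time": exec_time,
    }


for m in (1, 2, 4):
    r = parallel_dvfs(m)
    print(f"m={m}: per-processor power {r['power_per_processor']:.4f}, "
          f"total power {r['total_power']:.4f}, exec time {r['exec_time']:.2f}")
```

Static power is ignored here, as on the slide; with leakage included, the extra m - 1 processors would eat into the savings.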

44 Reducing energy consumption: Choice of optimal voltage supply

45 Other problems with scaling: Transistors and Wires
Feature size
- Minimum size of a transistor or wire in the x or y dimension
- 10 microns in 1971 to 0.032 microns in 2011
- Transistor performance scales linearly
- Wire delay does not improve with feature size!
- There is always a need for long wires
  - Problem related to Rent's rule (number of pins versus number of gates)

46 Questions: Reasons behind performance improvement?
Q10: What happened after 2002 on the performance figure?
- A10: Power wall
- A10: End of ILP
  - Limits to pipelining
  - Limits to superscalar
  - Will discuss it in detail after covering advanced ILP topics

47 What is next: Current Trends in Architecture
- Cannot continue to leverage instruction-level parallelism (ILP)
  - Single processor performance improvement ended in 2003
- New ways of improving performance:
  - Data-level parallelism (DLP)
  - Thread-level parallelism (TLP)
  - Request-level parallelism (RLP)
- These require explicit restructuring of the application

48 New applications appear: Classes of computers now
- Personal Mobile Device (PMD)
  - e.g. smart phones, tablet computers
  - Emphasis on energy efficiency and real-time
- Desktop Computing
  - Emphasis on price-performance
- Servers
  - Emphasis on availability, scalability, throughput
- Clusters / Warehouse Scale Computers
  - Used for Software as a Service (SaaS)
  - Emphasis on availability and price-performance
  - Sub-class: supercomputers; emphasis on floating-point performance and fast internal networks
- Embedded Computers
  - Emphasis on price

49 Dark silicon
- Only some parts of a chip are active at a time
- Q12: Do specialized cores make sense now in a general-purpose microprocessor?
- Qualcomm Zeroth chip

50 Summary of trends in µP

51 Not covered in class 51

52 Quantitative Principles of Design
- Take Advantage of Parallelism
- Principle of Locality
- Focus on the Common Case
  - Amdahl's Law
  - E.g. common case supported by special hardware; uncommon cases in software
- The Performance Equation

53 1. Parallelism
How to improve performance?
- (Super)-pipelining
- Powerful instructions
  - MD-technique: multiple data operands per operation
  - MO-technique: multiple operations per instruction
- Multiple instruction issue
  - single instruction-program stream
  - multiple streams (or programs, or tasks)

54 Flynn's Taxonomy
- Single instruction stream, single data stream (SISD)
- Single instruction stream, multiple data streams (SIMD)
  - Vector architectures
  - Multimedia extensions
  - Graphics processor units
- Multiple instruction streams, single data stream (MISD)
  - No commercial implementation
- Multiple instruction streams, multiple data streams (MIMD)
  - Tightly-coupled MIMD
  - Loosely-coupled MIMD

55 MIPS Pipeline
Five stages, one step per stage
1. IF: Instruction fetch from memory
2. ID: Instruction decode & register read
3. EX: Execute operation or calculate address
4. MEM: Access memory operand
5. WB: Write result back to register
[Figure: a lw instruction passing through IFetch, Dec, Exec, Mem, WB in cycles 1 to 5]

56 Review from Last Lecture
[Figure: (a) task-time diagram and (b) space-time diagram of a pipelined multi-cycle design, showing the start-up and drainage regions; legend: f = fetch, r = reg read, a = ALU op, d = data access, w = writeback]
[Figure: clock timing of instructions 1 to 4 in a single-cycle design versus a multi-cycle design (3, 5, 3 and 4 cycles), showing time allotted, time needed and time saved]
Execution time = 1 / Performance = Inst count x CPI x CCT
N = number of stages for the pipelined design, or approximately the maximum number of steps for MC
$$\text{CPI}_{\text{ideal MCP}} = \frac{N}{\text{InstCount}} + 1 - \frac{1}{\text{InstCount}}$$
- Large N and/or small InstCount result in a worse CPI
- Running a single instruction is no faster with pipelining (i.e. latency for a single instruction is not reduced)

Design                        Inst count   CPI                                     CCT
Single cycle (SC)             1            1                                       1
Multi cycle (MC)              1            between 1 and N (closer to N than 1)    > 1/N
Multi cycle pipelined (MCP)   1            > 1                                     > 1/N

What are the other issues affecting CCT and CPI for MC and MCP?
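
The ideal pipelined CPI follows from counting cycles: an N-stage pipeline without stalls needs N cycles for the first instruction and one more cycle for each following instruction, so CPI = (N + InstCount - 1) / InstCount = N/InstCount + 1 - 1/InstCount. A minimal Python sketch (the function name and the sample instruction counts are assumptions):

```python
def ideal_pipelined_cpi(n_stages: int, inst_count: int) -> float:
    """CPI of an ideal (stall-free) pipeline: (N + InstCount - 1) / InstCount."""
    total_cycles = n_stages + inst_count - 1   # fill the pipe, then 1 per instr
    return total_cycles / inst_count


for count in (1, 5, 100, 1_000_000):
    print(f"N=5, InstCount={count}: CPI = {ideal_pipelined_cpi(5, count):.4f}")
```

For a single instruction the CPI equals N, so nothing is gained; only for long instruction streams does the CPI approach 1.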

57 Pipelined Instruction Execution
[Figure: four instructions in program order, each passing through Ifetch, Reg, ALU, DMem, Reg, overlapped across clock cycles 1 to 7]

58 Limits to pipelining
Hazards prevent the next instruction from executing during its designated clock cycle
- Structural hazards: attempt to use the same hardware to do two different things at once
- Data hazards: instruction depends on the result of a prior instruction still in the pipeline
- Control hazards: caused by the delay between the fetching of instructions and decisions about changes in control flow (branches and jumps)
[Figure: overlapped pipelined execution of four instructions (Ifetch, Reg, ALU, DMem, Reg) over time in clock cycles]

59 2. The Principle of Locality
Programs access a relatively small portion of the address space at any instant of time.
Two different types of locality:
- Temporal locality (locality in time): if an item is referenced, it will tend to be referenced again soon (e.g., loops, reuse)
- Spatial locality (locality in space): if an item is referenced, items whose addresses are close by tend to be referenced soon (e.g., straight-line code, array access)
For the last 30 years, hardware has relied on locality for memory performance.
[Figure: processor (P), cache ($), memory (MEM)]

60 Memory Hierarchy Levels

Level             Capacity          Access time             Cost
CPU registers     100s bytes        ps ( ns)
L1 and L2 cache   10s-100s KBytes   ~1 ns - ~10 ns          ~$100s / GByte
Main memory       GBytes            80 ns - 200 ns          ~$10 / GByte
Disk              10s TBytes        10 ms (10,000,000 ns)   ~$0.1 / GByte
Tape              infinite          sec-min                 ~$0.1 / GByte

Staging / transfer units between levels (upper levels are faster, lower levels are larger):
- Registers <-> L1 cache: instructions and operands, 1-8 bytes, managed by the program/compiler
- L1 cache <-> L2 cache: blocks, bytes, managed by the cache controller
- L2 cache <-> memory: blocks, bytes, managed by the cache controller
- Memory <-> disk: pages, 4K-8K bytes, managed by the OS
- Disk <-> tape: files, GBytes, managed by the user/operator (still needed?)

61 3. Focus on the Common Case
- Favor the frequent case over the infrequent case
  - E.g., the instruction fetch and decode unit is used more frequently than the multiplier, so optimize it first
  - E.g., if a database server has 50 disks per processor, storage dependability dominates system dependability, so optimize it first
- The frequent case is often simpler and can be done faster than the infrequent case
  - E.g., overflow is rare when adding 2 numbers, so improve performance by optimizing the more common case of no overflow
  - This may slow down overflow handling, but overall performance is improved by optimizing for the normal case
- What is the frequent case? How much is performance improved by making that case faster? => Amdahl's Law

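62 Amdahl's Law
[Figure: execution time split into serial part, parallel part, serial part]
$$\text{Speedup}_{\text{overall}} = \frac{T_{\text{exec,old}}}{T_{\text{exec,new}}} = \frac{1}{(1 - \text{Fraction}_{\text{enhanced}}) + \dfrac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}}}$$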

63 Amdahl's Law
Floating point instructions are improved to run 2 times faster, but only 10% of actual instructions are FP.
$T_{\text{exec,new}}$ = ?
$\text{Speedup}_{\text{overall}}$ = ?

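64 Amdahl's Law
Floating point instructions are improved to run 2X faster, but only 10% of actual instructions are FP.
$$T_{\text{exec,new}} = T_{\text{exec,old}} \times (0.9 + 0.1/2) = 0.95 \times T_{\text{exec,old}}$$
$$\text{Speedup}_{\text{overall}} = \frac{1}{0.95} = 1.053$$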
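
The worked example above can be reproduced with a short Python sketch of Amdahl's Law. The function and parameter names are illustrative assumptions; the 10% / 2X inputs are the ones from the slide.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup = 1 / ((1 - F) + F / S)."""
    return 1.0 / ((1.0 - fraction_enhanced)
                  + fraction_enhanced / speedup_enhanced)


# Slide example: 10% of instructions are FP and they become 2X faster.
print(f"{amdahl_speedup(0.10, 2.0):.3f}")

# Upper bound for the same 10% fraction with an arbitrarily fast FP unit.
print(f"{amdahl_speedup(0.10, 1e12):.3f}")
```

Even an infinitely fast floating-point unit cannot push the overall speedup past 1/0.9, about 1.11, because the other 90% of the instructions are untouched.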

65 Amdahl's law 65

66 Principles of Computer Design The Processor Performance Equation 66

67 Principles of Computer Design Different instruction types having different CPIs 67
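
When instruction types have different CPIs, the average CPI is the cycle count summed over the types divided by the total instruction count, and it plugs into ExecTime = IC * CCT * <CPI>. The Python sketch below uses a hypothetical instruction mix; the names and numbers are assumptions, not data from the slide.

```python
def average_cpi(mix: list[tuple[float, float]]) -> float:
    """Weighted CPI: sum(IC_i * CPI_i) / sum(IC_i).

    mix is a list of (instruction count, CPI) pairs, one per instruction type.
    """
    total_instructions = sum(count for count, _ in mix)
    total_cycles = sum(count * cpi for count, cpi in mix)
    return total_cycles / total_instructions


# Hypothetical mix (assumption): ALU, load/store, branch.
mix = [(50e6, 1.0), (30e6, 2.0), (20e6, 3.0)]
cpi = average_cpi(mix)
cct = 0.5e-9                      # assumed 2 GHz clock
inst_count = sum(count for count, _ in mix)
print(f"<CPI> = {cpi:.2f}, ExecTime = {inst_count * cpi * cct:.4f} s")
```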

68 Acknowledgements Some of the slides contain material developed and copyrighted by Henk Corporaal (TU/e) and instructor material for the textbook 68


More information

Computer Architecture. Minas E. Spetsakis Dept. Of Computer Science and Engineering (Class notes based on Hennessy & Patterson)

Computer Architecture. Minas E. Spetsakis Dept. Of Computer Science and Engineering (Class notes based on Hennessy & Patterson) Computer Architecture Minas E. Spetsakis Dept. Of Computer Science and Engineering (Class notes based on Hennessy & Patterson) What is Architecture? Instruction Set Design. Old definition from way back

More information

EE 4980 Modern Electronic Systems. Processor Advanced

EE 4980 Modern Electronic Systems. Processor Advanced EE 4980 Modern Electronic Systems Processor Advanced Architecture General Purpose Processor User Programmable Intended to run end user selected programs Application Independent PowerPoint, Chrome, Twitter,

More information

ECE 154A. Architecture. Dmitri Strukov

ECE 154A. Architecture. Dmitri Strukov ECE 154A Introduction to Computer Architecture Dmitri Strukov Lecture 1 Outline Admin What this class is about? Prerequisites ii Simple computer Performance Historical trends Economics 2 Admin Office Hours:

More information

EE282 Computer Architecture. Lecture 1: What is Computer Architecture?

EE282 Computer Architecture. Lecture 1: What is Computer Architecture? EE282 Computer Architecture Lecture : What is Computer Architecture? September 27, 200 Marc Tremblay Computer Systems Laboratory Stanford University marctrem@csl.stanford.edu Goals Understand how computer

More information

ECE 154A Introduction to. Fall 2012

ECE 154A Introduction to. Fall 2012 ECE 154A Introduction to Computer Architecture Fall 2012 Dmitri Strukov Lecture 10 Floating point review Pipelined design IEEE Floating Point Format single: 8 bits double: 11 bits single: 23 bits double:

More information

Lecture 2: Computer Performance. Assist.Prof.Dr. Gürhan Küçük Advanced Computer Architectures CSE 533

Lecture 2: Computer Performance. Assist.Prof.Dr. Gürhan Küçük Advanced Computer Architectures CSE 533 Lecture 2: Computer Performance Assist.Prof.Dr. Gürhan Küçük Advanced Computer Architectures CSE 533 Performance and Cost Purchasing perspective given a collection of machines, which has the - best performance?

More information

Memory hierarchy review. ECE 154B Dmitri Strukov

Memory hierarchy review. ECE 154B Dmitri Strukov Memory hierarchy review ECE 154B Dmitri Strukov Outline Cache motivation Cache basics Six basic optimizations Virtual memory Cache performance Opteron example Processor-DRAM gap in latency Q1. How to deal

More information

CISC 662 Graduate Computer Architecture Lecture 16 - Cache and virtual memory review

CISC 662 Graduate Computer Architecture Lecture 16 - Cache and virtual memory review CISC 662 Graduate Computer Architecture Lecture 6 - Cache and virtual memory review Michela Taufer http://www.cis.udel.edu/~taufer/teaching/cis662f07 Powerpoint Lecture Notes from John Hennessy and David

More information

Lecture 29 Review" CPU time: the best metric" Be sure you understand CC, clock period" Common (and good) performance metrics"

Lecture 29 Review CPU time: the best metric Be sure you understand CC, clock period Common (and good) performance metrics Be sure you understand CC, clock period Lecture 29 Review Suggested reading: Everything Q1: D[8] = D[8] + RF[1] + RF[4] I[15]: Add R2, R1, R4 RF[1] = 4 I[16]: MOV R3, 8 RF[4] = 5 I[17]: Add R2, R2, R3

More information

Pipelining, Instruction Level Parallelism and Memory in Processors. Advanced Topics ICOM 4215 Computer Architecture and Organization Fall 2010

Pipelining, Instruction Level Parallelism and Memory in Processors. Advanced Topics ICOM 4215 Computer Architecture and Organization Fall 2010 Pipelining, Instruction Level Parallelism and Memory in Processors Advanced Topics ICOM 4215 Computer Architecture and Organization Fall 2010 NOTE: The material for this lecture was taken from several

More information

CMSC 411 Computer Systems Architecture Lecture 2 Trends in Technology. Moore s Law: 2X transistors / year

CMSC 411 Computer Systems Architecture Lecture 2 Trends in Technology. Moore s Law: 2X transistors / year CMSC 411 Computer Systems Architecture Lecture 2 Trends in Technology Moore s Law: 2X transistors / year Cramming More Components onto Integrated Circuits Gordon Moore, Electronics, 1965 # on transistors

More information

Multiple Issue ILP Processors. Summary of discussions

Multiple Issue ILP Processors. Summary of discussions Summary of discussions Multiple Issue ILP Processors ILP processors - VLIW/EPIC, Superscalar Superscalar has hardware logic for extracting parallelism - Solutions for stalls etc. must be provided in hardware

More information

Modern Computer Architecture

Modern Computer Architecture Modern Computer Architecture Lecture3 Review of Memory Hierarchy Hongbin Sun 国家集成电路人才培养基地 Xi an Jiaotong University Performance 1000 Recap: Who Cares About the Memory Hierarchy? Processor-DRAM Memory Gap

More information

45-year CPU Evolution: 1 Law -2 Equations

45-year CPU Evolution: 1 Law -2 Equations 4004 8086 PowerPC 601 Pentium 4 Prescott 1971 1978 1992 45-year CPU Evolution: 1 Law -2 Equations Daniel Etiemble LRI Université Paris Sud 2004 Xeon X7560 Power9 Nvidia Pascal 2010 2017 2016 Are there

More information

ECE 587 Advanced Computer Architecture I

ECE 587 Advanced Computer Architecture I ECE 587 Advanced Computer Architecture I Instructor: Alaa Alameldeen alaa@ece.pdx.edu Spring 2015 Portland State University Copyright by Alaa Alameldeen and Haitham Akkary 2015 1 When and Where? When:

More information

Memory Hierarchy. Maurizio Palesi. Maurizio Palesi 1

Memory Hierarchy. Maurizio Palesi. Maurizio Palesi 1 Memory Hierarchy Maurizio Palesi Maurizio Palesi 1 References John L. Hennessy and David A. Patterson, Computer Architecture a Quantitative Approach, second edition, Morgan Kaufmann Chapter 5 Maurizio

More information

Computer Architecture

Computer Architecture Computer Architecture Architecture The art and science of designing and constructing buildings A style and method of design and construction Design, the way components fit together Computer Architecture

More information

Lecture 1: CS/ECE 3810 Introduction

Lecture 1: CS/ECE 3810 Introduction Lecture 1: CS/ECE 3810 Introduction Today s topics: Why computer organization is important Logistics Modern trends 1 Why Computer Organization 2 Image credits: uber, extremetech, anandtech Why Computer

More information

Chapter 1. Instructor: Josep Torrellas CS433. Copyright Josep Torrellas 1999, 2001, 2002,

Chapter 1. Instructor: Josep Torrellas CS433. Copyright Josep Torrellas 1999, 2001, 2002, Chapter 1 Instructor: Josep Torrellas CS433 Copyright Josep Torrellas 1999, 2001, 2002, 2013 1 Course Goals Introduce you to design principles, analysis techniques and design options in computer architecture

More information

Review: Performance Latency vs. Throughput. Time (seconds/program) is performance measure Instructions Clock cycles Seconds.

Review: Performance Latency vs. Throughput. Time (seconds/program) is performance measure Instructions Clock cycles Seconds. Performance 980 98 982 983 984 985 986 987 988 989 990 99 992 993 994 995 996 997 998 999 2000 7/4/20 CS 6C: Great Ideas in Computer Architecture (Machine Structures) Caches Instructor: Michael Greenbaum

More information

EITF20: Computer Architecture Part2.2.1: Pipeline-1

EITF20: Computer Architecture Part2.2.1: Pipeline-1 EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle

More information

Computer Architecture A Quantitative Approach, Fifth Edition. Chapter 2. Memory Hierarchy Design. Copyright 2012, Elsevier Inc. All rights reserved.

Computer Architecture A Quantitative Approach, Fifth Edition. Chapter 2. Memory Hierarchy Design. Copyright 2012, Elsevier Inc. All rights reserved. Computer Architecture A Quantitative Approach, Fifth Edition Chapter 2 Memory Hierarchy Design 1 Introduction Programmers want unlimited amounts of memory with low latency Fast memory technology is more

More information

Chapter 1: Fundamentals of Quantitative Design and Analysis

Chapter 1: Fundamentals of Quantitative Design and Analysis 1 / 12 Chapter 1: Fundamentals of Quantitative Design and Analysis Be careful in this chapter. It contains a tremendous amount of information and data about the changes in computer architecture since the

More information

Adapted from David Patterson s slides on graduate computer architecture

Adapted from David Patterson s slides on graduate computer architecture Mei Yang Adapted from David Patterson s slides on graduate computer architecture Introduction Ten Advanced Optimizations of Cache Performance Memory Technology and Optimizations Virtual Memory and Virtual

More information

Memory Hierarchy. Maurizio Palesi. Maurizio Palesi 1

Memory Hierarchy. Maurizio Palesi. Maurizio Palesi 1 Memory Hierarchy Maurizio Palesi Maurizio Palesi 1 References John L. Hennessy and David A. Patterson, Computer Architecture a Quantitative Approach, second edition, Morgan Kaufmann Chapter 5 Maurizio

More information

COMPUTER ORGANIZATION AND DESI

COMPUTER ORGANIZATION AND DESI COMPUTER ORGANIZATION AND DESIGN 5 Edition th The Hardware/Software Interface Chapter 4 The Processor 4.1 Introduction Introduction CPU performance factors Instruction count Determined by ISA and compiler

More information

The Processor: Instruction-Level Parallelism

The Processor: Instruction-Level Parallelism The Processor: Instruction-Level Parallelism Computer Organization Architectures for Embedded Computing Tuesday 21 October 14 Many slides adapted from: Computer Organization and Design, Patterson & Hennessy

More information

CS550. TA: TBA Office: xxx Office hours: TBA. Blackboard:

CS550. TA: TBA   Office: xxx Office hours: TBA. Blackboard: CS550 Advanced Operating Systems (Distributed Operating Systems) Instructor: Xian-He Sun Email: sun@iit.edu, Phone: (312) 567-5260 Office hours: 1:30pm-2:30pm Tuesday, Thursday at SB229C, or by appointment

More information

Performance, Power, Die Yield. CS301 Prof Szajda

Performance, Power, Die Yield. CS301 Prof Szajda Performance, Power, Die Yield CS301 Prof Szajda Administrative HW #1 assigned w Due Wednesday, 9/3 at 5:00 pm Performance Metrics (How do we compare two machines?) What to Measure? Which airplane has the

More information

Chap. 4 Multiprocessors and Thread-Level Parallelism

Chap. 4 Multiprocessors and Thread-Level Parallelism Chap. 4 Multiprocessors and Thread-Level Parallelism Uniprocessor performance Performance (vs. VAX-11/780) 10000 1000 100 10 From Hennessy and Patterson, Computer Architecture: A Quantitative Approach,

More information

EITF20: Computer Architecture Part2.1.1: Instruction Set Architecture

EITF20: Computer Architecture Part2.1.1: Instruction Set Architecture EITF20: Computer Architecture Part2.1.1: Instruction Set Architecture Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Instruction Set Principles The Role of Compilers MIPS 2 Main Content Computer

More information

Let!s go back to a course goal... Let!s go back to a course goal... Question? Lecture 22 Introduction to Memory Hierarchies

Let!s go back to a course goal... Let!s go back to a course goal... Question? Lecture 22 Introduction to Memory Hierarchies 1 Lecture 22 Introduction to Memory Hierarchies Let!s go back to a course goal... At the end of the semester, you should be able to......describe the fundamental components required in a single core of

More information

Final Lecture. A few minutes to wrap up and add some perspective

Final Lecture. A few minutes to wrap up and add some perspective Final Lecture A few minutes to wrap up and add some perspective 1 2 Instant replay The quarter was split into roughly three parts and a coda. The 1st part covered instruction set architectures the connection

More information

Computer Architecture. Lecture 6.1: Fundamentals of

Computer Architecture. Lecture 6.1: Fundamentals of CS3350B Computer Architecture Winter 2015 Lecture 6.1: Fundamentals of Instructional Level Parallelism Marc Moreno Maza www.csd.uwo.ca/courses/cs3350b [Adapted from lectures on Computer Organization and

More information

Module 5 Introduction to Parallel Processing Systems

Module 5 Introduction to Parallel Processing Systems Module 5 Introduction to Parallel Processing Systems 1. What is the difference between pipelining and parallelism? In general, parallelism is simply multiple operations being done at the same time.this

More information

Copyright 2012, Elsevier Inc. All rights reserved.

Copyright 2012, Elsevier Inc. All rights reserved. Computer Architecture A Quantitative Approach, Fifth Edition Chapter 2 Memory Hierarchy Design 1 Introduction Introduction Programmers want unlimited amounts of memory with low latency Fast memory technology

More information

Computer Architecture. A Quantitative Approach, Fifth Edition. Chapter 2. Memory Hierarchy Design. Copyright 2012, Elsevier Inc. All rights reserved.

Computer Architecture. A Quantitative Approach, Fifth Edition. Chapter 2. Memory Hierarchy Design. Copyright 2012, Elsevier Inc. All rights reserved. Computer Architecture A Quantitative Approach, Fifth Edition Chapter 2 Memory Hierarchy Design 1 Programmers want unlimited amounts of memory with low latency Fast memory technology is more expensive per

More information

EITF20: Computer Architecture Part2.2.1: Pipeline-1

EITF20: Computer Architecture Part2.2.1: Pipeline-1 EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle

More information

Computer and Information Sciences College / Computer Science Department CS 207 D. Computer Architecture. Lecture 9: Multiprocessors

Computer and Information Sciences College / Computer Science Department CS 207 D. Computer Architecture. Lecture 9: Multiprocessors Computer and Information Sciences College / Computer Science Department CS 207 D Computer Architecture Lecture 9: Multiprocessors Challenges of Parallel Processing First challenge is % of program inherently

More information

LECTURE 5: MEMORY HIERARCHY DESIGN

LECTURE 5: MEMORY HIERARCHY DESIGN LECTURE 5: MEMORY HIERARCHY DESIGN Abridged version of Hennessy & Patterson (2012):Ch.2 Introduction Programmers want unlimited amounts of memory with low latency Fast memory technology is more expensive

More information

Copyright 2012, Elsevier Inc. All rights reserved.

Copyright 2012, Elsevier Inc. All rights reserved. Computer Architecture A Quantitative Approach, Fifth Edition Chapter 2 Memory Hierarchy Design 1 Introduction Programmers want unlimited amounts of memory with low latency Fast memory technology is more

More information

EITF20: Computer Architecture Part2.2.1: Pipeline-1

EITF20: Computer Architecture Part2.2.1: Pipeline-1 EITF20: Computer Architecture Part2.2.1: Pipeline-1 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Pipelining Harzards Structural hazards Data hazards Control hazards Implementation issues Multi-cycle

More information

How What When Why CSC3501 FALL07 CSC3501 FALL07. Louisiana State University 1- Introduction - 1. Louisiana State University 1- Introduction - 2

How What When Why CSC3501 FALL07 CSC3501 FALL07. Louisiana State University 1- Introduction - 1. Louisiana State University 1- Introduction - 2 Computer Organization and Design Dr. Arjan Durresi Louisiana State University Baton Rouge, LA 70803 durresi@csc.lsu.edu d These slides are available at: http://www.csc.lsu.edu/~durresi/csc3501_07/ Louisiana

More information

ECE 588/688 Advanced Computer Architecture II

ECE 588/688 Advanced Computer Architecture II ECE 588/688 Advanced Computer Architecture II Instructor: Alaa Alameldeen alaa@ece.pdx.edu Fall 2009 Portland State University Copyright by Alaa Alameldeen and Haitham Akkary 2009 1 When and Where? When:

More information

ECE 588/688 Advanced Computer Architecture II

ECE 588/688 Advanced Computer Architecture II ECE 588/688 Advanced Computer Architecture II Instructor: Alaa Alameldeen alaa@ece.pdx.edu Winter 2018 Portland State University Copyright by Alaa Alameldeen and Haitham Akkary 2018 1 When and Where? When:

More information

EE282H: Computer Architecture and Organization. EE282H: Computer Architecture and Organization -- Course Overview

EE282H: Computer Architecture and Organization. EE282H: Computer Architecture and Organization -- Course Overview : Computer Architecture and Organization Kunle Olukotun Gates 302 kunle@ogun.stanford.edu http://www-leland.stanford.edu/class/ee282h/ : Computer Architecture and Organization -- Course Overview Goals»

More information

PERFORMANCE MEASUREMENT

PERFORMANCE MEASUREMENT Administrivia CMSC 411 Computer Systems Architecture Lecture 3 Performance Measurement and Reliability Homework problems for Unit 1 posted today due next Thursday, 2/12 Start reading Appendix C Basic Pipelining

More information

Lec 25: Parallel Processors. Announcements

Lec 25: Parallel Processors. Announcements Lec 25: Parallel Processors Kavita Bala CS 340, Fall 2008 Computer Science Cornell University PA 3 out Hack n Seek Announcements The goal is to have fun with it Recitations today will talk about it Pizza

More information

Performance! (1/latency)! 1000! 100! 10! Capacity Access Time Cost. CPU Registers 100s Bytes <10s ns. Cache K Bytes ns 1-0.

Performance! (1/latency)! 1000! 100! 10! Capacity Access Time Cost. CPU Registers 100s Bytes <10s ns. Cache K Bytes ns 1-0. Since 1980, CPU has outpaced DRAM... EEL 5764: Graduate Computer Architecture Appendix C Hierarchy Review Ann Gordon-Ross Electrical and Computer Engineering University of Florida http://www.ann.ece.ufl.edu/

More information

Advanced Computer Architecture

Advanced Computer Architecture Advanced Computer Architecture 1 L E C T U R E 0 J A N L E M E I R E Course Objectives 2 Intel 4004 1971 2.3K trans. Intel Core 2 Duo 2006 291M trans. Where have all the transistors gone? Turing Machine

More information

CSEE W4824 Computer Architecture Fall 2012

CSEE W4824 Computer Architecture Fall 2012 CSEE W4824 Computer Architecture Fall 2012 Lecture 8 Memory Hierarchy Design: Memory Technologies and the Basics of Caches Luca Carloni Department of Computer Science Columbia University in the City of

More information

An Introduction to Parallel Architectures

An Introduction to Parallel Architectures An Introduction to Parallel Architectures Andrea Marongiu a.marongiu@unibo.it Impact of Parallel Architectures From cell phones to supercomputers In regular CPUs as well as GPUs Parallel HW Processing

More information

CS6303 Computer Architecture Regulation 2013 BE-Computer Science and Engineering III semester 2 MARKS

CS6303 Computer Architecture Regulation 2013 BE-Computer Science and Engineering III semester 2 MARKS CS6303 Computer Architecture Regulation 2013 BE-Computer Science and Engineering III semester 2 MARKS UNIT-I OVERVIEW & INSTRUCTIONS 1. What are the eight great ideas in computer architecture? The eight

More information

Processor Architecture. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Processor Architecture. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University Processor Architecture Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Moore s Law Gordon Moore @ Intel (1965) 2 Computer Architecture Trends (1)

More information

Tutorial 11. Final Exam Review

Tutorial 11. Final Exam Review Tutorial 11 Final Exam Review Introduction Instruction Set Architecture: contract between programmer and designers (e.g.: IA-32, IA-64, X86-64) Computer organization: describe the functional units, cache

More information

Copyright 2012, Elsevier Inc. All rights reserved.

Copyright 2012, Elsevier Inc. All rights reserved. Computer Architecture A Quantitative Approach, Fifth Edition Chapter 2 Memory Hierarchy Design Edited by Mansour Al Zuair 1 Introduction Programmers want unlimited amounts of memory with low latency Fast

More information

CS 654 Computer Architecture Summary. Peter Kemper

CS 654 Computer Architecture Summary. Peter Kemper CS 654 Computer Architecture Summary Peter Kemper Chapters in Hennessy & Patterson Ch 1: Fundamentals Ch 2: Instruction Level Parallelism Ch 3: Limits on ILP Ch 4: Multiprocessors & TLP Ap A: Pipelining

More information

Introduction. CSCI 4850/5850 High-Performance Computing Spring 2018

Introduction. CSCI 4850/5850 High-Performance Computing Spring 2018 Introduction CSCI 4850/5850 High-Performance Computing Spring 2018 Tae-Hyuk (Ted) Ahn Department of Computer Science Program of Bioinformatics and Computational Biology Saint Louis University What is Parallel

More information

The Computer Revolution. Classes of Computers. Chapter 1

The Computer Revolution. Classes of Computers. Chapter 1 COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition 1 Chapter 1 Computer Abstractions and Technology 1 The Computer Revolution Progress in computer technology Underpinned by Moore

More information

Computer Architecture. Introduction. Lynn Choi Korea University

Computer Architecture. Introduction. Lynn Choi Korea University Computer Architecture Introduction Lynn Choi Korea University Class Information Lecturer Prof. Lynn Choi, School of Electrical Eng. Phone: 3290-3249, 공학관 411, lchoi@korea.ac.kr, TA: 윤창현 / 신동욱, 3290-3896,

More information

Designing for Performance. Patrick Happ Raul Feitosa

Designing for Performance. Patrick Happ Raul Feitosa Designing for Performance Patrick Happ Raul Feitosa Objective In this section we examine the most common approach to assessing processor and computer system performance W. Stallings Designing for Performance

More information