Design of Digital Circuits Lecture 16: Out-of-Order Execution. Prof. Onur Mutlu ETH Zurich Spring April 2018

Size: px
Start display at page:

Download "Design of Digital Circuits Lecture 16: Out-of-Order Execution. Prof. Onur Mutlu ETH Zurich Spring April 2018"

Transcription

1 Desig of Digital Circuits Lecture 16: Out-of-Order Executio Prof. Our Mutlu ETH Zurich Sprig April 2018

2 Ageda for Today & Next Few Lectures Sigle-cycle Microarchitectures Multi-cycle ad Microprogrammed Microarchitectures Pipeliig Issues i Pipeliig: Cotrol & Data Depedece Hadlig, State Maiteace ad Recovery, Out-of-Order Executio Other Executio Paradigms 2

3 Remider: Optioal Homeworks Posted olie 3 Optioal Homeworks Optioal Good for your learig =homeworks 3

4 Readigs Specifically for Today Smith ad Sohi, The Microarchitecture of Superscalar Processors, Proceedigs of the IEEE, 1995 More advaced pipeliig Iterrupt ad exceptio hadlig Out-of-order ad superscalar executio cocepts Optioal: Kessler, The Alpha Microprocessor, IEEE Micro

5 Lecture Aoucemet Moday, April 30, :15-17:15 CAB G 61 Apéro after the lecture J Prof. We-Mei Hwu (Uiversity of Illiois at Urbaa-Champaig) D-INFK Distiguished Collouium Iovative Applicatios ad Techology Pivots A Perfect Storm i Computig 5

6 Geeral Suggestio Atted the Computer Sciece Distiguished Collouia Happes o some Modays 16:15-17:15, followed by a apero Great way of learig about key developmets i the field Ad, meetig leadig researchers 6

7 Review: I-Order Pipelie with Reorder Buffer Decode (D): Access regfile/rob, allocate etry i ROB, check if istructio ca execute, if so dispatch istructio (sed to fuctioal uit) Execute (E): Istructios ca complete out-of-order Completio (R): Write result to reorder buffer Retiremet/Commit (W): Check for exceptios; if oe, write result to architectural register file or memory; else, flush pipelie ad start from exceptio hadler I-order dispatch/executio, out-of-order completio, i-order retiremet E Iteger add F D R E E E E Iteger mul E E E E E E E E FP mul E E E E E E E E... R W Load/store 7

8 Remember: Register Reamig with a Reorder Buffer Output ad ati depedecies are ot true depedecies WHY? The same register refers to values that have othig to do with each other They exist due to lack of register ID s (i.e. ames) i the ISA The register ID is reamed to the reorder buffer etry that will hold the register s value Register ID à ROB etry ID Architectural register ID à Physical register ID After reamig, ROB etry ID used to refer to the register This elimiates ati ad output depedecies Gives the illusio that there are a large umber of registers 8

9 Remember: Reamig Example Assume Register file has a poiter to the reorder buffer etry that cotais or will cotai the value, if the register is ot valid Reorder buffer works as described before Where is the latest defiitio of R3 for each istructio below i seuetial order? LD R0(0) à R3 LD R3, R1 à R10 MUL R1, R2 à R3 MUL R3, R4 à R11 ADD R5, R6 à R3 ADD R7, R8 à R12 9

10 Reorder Buffer Tradeoffs Advatages Coceptually simple for supportig precise exceptios Ca elimiate false depedeces Disadvatages Reorder buffer eeds to be accessed to get the results that are yet to be writte to the register file CAM or idirectio à icreased latecy ad complexity Other solutios aim to elimiate the disadvatages History buffer Future file Checkpoitig We will ot cover these 10

11 Out-of-Order Executio (Dyamic Istructio Schedulig)

12 A I-order Pipelie E Iteger add F D E E E E Iteger mul E E E E E E E E FP mul R W E E E E E E E E... Cache miss Dispatch: Act of sedig a istructio to a fuctioal uit Reamig with ROB elimiates stalls due to false depedecies Problem: A true data depedecy stalls dispatch of youger istructios ito fuctioal (executio) uits 12

13 Ca We Do Better? What do the followig two pieces of code have i commo (with respect to executio i the previous desig)? IMUL R3 ß R1, R2 ADD R3 ß R3, R1 ADD R4 ß R6, R7 IMUL R5 ß R6, R8 ADD R7 ß R9, R9 LD R3 ß R1 (0) ADD R3 ß R3, R1 ADD R4 ß R6, R7 IMUL R5 ß R6, R8 ADD R7 ß R9, R9 Aswer: First ADD stalls the whole pipelie! ADD caot dispatch because its source registers uavailable Later idepedet istructios caot get executed How are the above code portios differet? Aswer: Load latecy is variable (ukow util rutime) What does this affect? Thik compiler vs. microarchitecture 13

14 Prevetig Dispatch Stalls Problem: i-order dispatch (schedulig, or executio) Solutio: out-of-order dispatch (schedulig, or executio) Actually, we have see the basic idea before: Dataflow: fetch ad fire a istructio oly whe its iputs are ready We will use similar priciples, but ot expose it i the ISA Aside: Ay other way to prevet dispatch stalls? 1. Compile-time istructio schedulig/reorderig 2. Value predictio 3. Fie-graied multithreadig 14

15 Out-of-order Executio (Dyamic Schedulig) Idea: Move the depedet istructios out of the way of idepedet oes (s.t. idepedet oes ca execute) Rest areas for depedet istructios: Reservatio statios Moitor the source values of each istructio i the restig area Whe all source values of a istructio are available, fire (i.e. dispatch) the istructio Istructios dispatched i dataflow (ot cotrol-flow) order Beefit: Latecy tolerace: Allows idepedet istructios to execute ad complete i the presece of a log-latecy operatio 15

16 I-order vs. Out-of-order Dispatch I order dispatch + precise exceptios: F D E E E E R W F D STALL E R W F STALL D E R W F D E E E E E R W F D STALL E R W IMUL R3 ß R1, R2 ADD R3 ß R3, R1 ADD R1 ß R6, R7 IMUL R5 ß R6, R8 ADD R7 ß R3, R5 Out-of-order dispatch + precise exceptios: F D E E E E R W F D F D WAIT E R E R W W F D E E E E R W F D WAIT E R W 16 vs. 12 cycles 16

17 Eablig OoO Executio 1. Need to lik the cosumer of a value to the producer Register reamig: Associate a tag with each data value 2. Need to buffer istructios util they are ready to execute Isert istructio ito reservatio statios after reamig 3. Istructios eed to keep track of readiess of source values Broadcast the tag whe the value is produced Istructios compare their source tags to the broadcast tag à if match, source value becomes ready 4. Whe all source values of a istructio are ready, eed to dispatch the istructio to its fuctioal uit (FU) Istructio wakes up if all sources are ready If multiple istructios are awake, eed to select oe per FU 17

18 Tomasulo s Algorithm for OoO Executio OoO with register reamig iveted by Robert Tomasulo Used i IBM 360/91 Floatig Poit Uits Read: Tomasulo, A Efficiet Algorithm for Exploitig Multiple Arithmetic Uits, IBM Joural of R&D, Ja What is the major differece today? Precise exceptios Provided by Patt, Hwu, Shebaow, HPS, a ew microarchitecture: ratioale ad itroductio, MICRO Patt et al., Critical issues regardig HPS, a high performace microarchitecture, MICRO OoO variats are used i most high-performace processors Iitially i Itel Petium Pro, AMD K5 Alpha 21264, MIPS R10000, IBM POWER5, IBM z196, Oracle UltraSPARC T4, ARM Cortex A15 18

19 Two Humps i a Moder Pipelie TAG ad VALUE Broadcast Bus F D S C H E D U L E E Iteger add E E E E Iteger mul E E E E E E E E FP mul E E E E E E E E... R E O R D E R W Load/store i order out of order i order Hump 1: Reservatio statios (schedulig widow) Hump 2: Reorderig (reorder buffer, aka istructio widow or active widow) 19

20 Two Humps i a Moder Pipelie TAG ad VALUE Broadcast Bus F D S C H E D U L E E Iteger add E E E E Iteger mul E E E E E E E E FP mul E E E E E E E E... R E O R D E R W Load/store i order out of order i order Hump 1: Reservatio statios (schedulig widow) Hump 2: Reorderig (reorder buffer, aka istructio widow or active widow) Photo credit: 20

21 Geeral Orgaizatio of a OOO Processor Smith ad Sohi, The Microarchitecture of Superscalar Processors, Proc. IEEE, Dec

22 Tomasulo s Machie: IBM 360/91 from memory from istructio uit FP registers load buffers store buffers operatio bus FP FU FP FU reservatio statios to memory Commo data bus 22

23 Register Reamig Output ad ati depedecies are ot true depedecies WHY? The same register refers to values that have othig to do with each other They exist because ot eough register ID s (i.e. ames) i the ISA The register ID is reamed to the reservatio statio etry that will hold the register s value Register ID à RS etry ID Architectural register ID à Physical register ID After reamig, RS etry ID used to refer to the register This elimiates ati- ad output- depedecies Approximates the performace effect of a large umber of registers eve though ISA has a small umber 23

24 Tomasulo s Algorithm: Reamig Register reame table (register alias table) tag value valid? R0 R1 R2 R3 R4 R5 R6 R7 R8 R

25 Tomasulo s Algorithm If reservatio statio available before reamig Istructio + reamed operads (source value/tag) iserted ito the reservatio statio Oly reame if reservatio statio is available Else stall While i reservatio statio, each istructio: Watches commo data bus (CDB) for tag of its sources Whe tag see, grab value for the source ad keep it i the reservatio statio Whe both operads available, istructio ready to be dispatched Dispatch istructio to the Fuctioal Uit whe istructio is ready After istructio fiishes i the Fuctioal Uit Arbitrate for CDB Put tagged value oto CDB (tag broadcast) Register file is coected to the CDB Register cotais a tag idicatig the latest writer to the register If the tag i the register file matches the broadcast tag, write broadcast value ito register (ad set valid bit) Reclaim reame tag o valid copy of tag i system! 25

26 A Exercise MUL R3 ß R1, R2 ADD R5 ß R3, R4 ADD R7 ß R2, R6 ADD R10 ß R8, R9 MUL R11 ß R7, R10 ADD R5 ß R5, R11 F D E W Assume ADD (4 cycle execute), MUL (6 cycle execute) Assume oe adder ad oe multiplier How may cycles i a o-pipelied machie i a i-order-dispatch pipelied machie with imprecise exceptios (o forwardig ad full forwardig) i a out-of-order dispatch pipelied machie imprecise exceptios (full forwardig) 26

27 Exercise Cotiued 27

28 Exercise Cotiued 28

29 Exercise Cotiued MUL R3 ß R1, R2 ADD R5 ß R3, R4 ADD R7 ß R2, R6 ADD R10 ß R8, R9 MUL R11 ß R7, R10 ADD R5 ß R5, R11 29

30 How It Works 30

31 Cycle 0 31

32 Cycle 2 32

33 Cycle 3 33

34 Cycle 4 34

35 Cycle 7 35

36 Cycle 8 36

37 Some Questios What is eeded i hardware to perform tag broadcast ad value capture? à make a value valid à wake up a istructio Does the tag have to be the ID of the Reservatio Statio Etry? What ca potetially become the critical path? Tag broadcast à value capture à istructio wake up How ca you reduce the potetial critical paths? 37

38 Dataflow Graph for Our Example MUL R3 ß R1, R2 ADD R5 ß R3, R4 ADD R7 ß R2, R6 ADD R10 ß R8, R9 MUL R11 ß R7, R10 ADD R5 ß R5, R11 38

39 State of RAT ad RS i Cycle 7 39

40 Dataflow Graph 40

41 Desig of Digital Circuits Lecture 16: Out-of-Order Executio Prof. Our Mutlu ETH Zurich Sprig April 2018

42 We did ot cover the followig slides i lecture. These are for your preparatio for the ext lecture.

43 A Exercise, with Precise Exceptios MUL R3 ß R1, R2 ADD R5 ß R3, R4 ADD R7 ß R2, R6 ADD R10 ß R8, R9 MUL R11 ß R7, R10 ADD R5 ß R5, R11 F D E R W Assume ADD (4 cycle execute), MUL (6 cycle execute) Assume oe adder ad oe multiplier How may cycles i a o-pipelied machie i a i-order-dispatch pipelied machie with reorder buffer (o forwardig ad full forwardig) i a out-of-order dispatch pipelied machie with reorder buffer (full forwardig) 43

44 Out-of-Order Executio with Precise Exceptios Idea: Use a reorder buffer to reorder istructios before committig them to architectural state A istructio updates the RAT whe it completes executio Also called froted register file A istructio updates a separate architectural register file whe it retires i.e., whe it is the oldest i the machie ad has completed executio I other words, the architectural register file is always updated i program order O a exceptio: flush pipelie, copy architectural register file ito froted register file 44

45 Out-of-Order Executio with Precise Exceptios TAG ad VALUE Broadcast Bus F D S C H E D U L E E Iteger add E E E E Iteger mul E E E E E E E E FP mul E E E E E E E E... R E O R D E R W Load/store i order out of order i order Hump 1: Reservatio statios (schedulig widow) Hump 2: Reorderig (reorder buffer, aka istructio widow or active widow) 45

46 Two Humps i a Moder Pipelie TAG ad VALUE Broadcast Bus F D S C H E D U L E E Iteger add E E E E Iteger mul E E E E E E E E FP mul E E E E E E E E... R E O R D E R W Load/store i order out of order i order Hump 1: Reservatio statios (schedulig widow) Hump 2: Reorderig (reorder buffer, aka istructio widow or active widow) Photo credit: 46

47 Moder OoO Executio w/ Precise Exceptios Most moder processors use the followig Reorder buffer to support i-order retiremet of istructios A sigle register file to store all registers Both speculative ad architectural registers INT ad FP are still separate Two register maps Future/froted register map à used for reamig Architectural register map à used for maitaiig precise state 47

48 A Example from Moder Processors Boggs et al., The Microarchitecture of the Petium 4 Processor, Itel Techology Joural,

49 Eablig OoO Executio, Revisited 1. Lik the cosumer of a value to the producer Register reamig: Associate a tag with each data value 2. Buffer istructios util they are ready Isert istructio ito reservatio statios after reamig 3. Keep track of readiess of source values of a istructio Broadcast the tag whe the value is produced Istructios compare their source tags to the broadcast tag à if match, source value becomes ready 4. Whe all source values of a istructio are ready, dispatch the istructio to fuctioal uit (FU) Wakeup ad select/schedule the istructio 49

50 Summary of OOO Executio Cocepts Register reamig elimiates false depedecies, eables likig of producer to cosumers Bufferig eables the pipelie to move for idepedet ops Tag broadcast eables commuicatio (of readiess of produced value) betwee istructios Wakeup ad select eables out-of-order dispatch 50

51 OOO Executio: Restricted Dataflow A out-of-order egie dyamically builds the dataflow graph of a piece of the program which piece? The dataflow graph is limited to the istructio widow Istructio widow: all decoded but ot yet retired istructios Ca we do it for the whole program? Why would we like to? I other words, how ca we have a large istructio widow? Ca we do it efficietly with Tomasulo s algorithm? 51

52 Recall: Dataflow Graph for Our Example MUL R3 ß R1, R2 ADD R5 ß R3, R4 ADD R7 ß R2, R6 ADD R10 ß R8, R9 MUL R11 ß R7, R10 ADD R5 ß R5, R11 52

53 Recall: State of RAT ad RS i Cycle 7 53

54 Recall: Dataflow Graph 54

55 Questios to Poder Why is OoO executio beeficial? What if all operatios take sigle cycle? Latecy tolerace: OoO executio tolerates the latecy of multi-cycle operatios by executig idepedet operatios cocurretly What if a istructio takes 500 cycles? How large of a istructio widow do we eed to cotiue decodig? How may cycles of latecy ca OoO tolerate? What limits the latecy tolerace scalability of Tomasulo s algorithm? Active/istructio widow size: determied by both schedulig widow ad reorder buffer size 55

56 Geeral Orgaizatio of a OOO Processor Smith ad Sohi, The Microarchitecture of Superscalar Processors, Proc. IEEE, Dec

57 A Moder OoO Desig: Itel Petium 4 57 Boggs et al., The Microarchitecture of the Petium 4 Processor, Itel Techology Joural, 2001.

58 Itel Petium 4 Simplified Mutlu+, Ruahead Executio, HPCA

59 Alpha Kessler, The Alpha Microprocessor, IEEE Micro, March-April

60 MIPS R10000 Yeager, The MIPS R10000 Superscalar Microprocessor, IEEE Micro, April

61 IBM POWER4 Tedler et al., POWER4 system microarchitecture, IBM J R&D,

62 IBM POWER4 2 cores, out-of-order executio 100-etry istructio widow i each core 8-wide istructio fetch, issue, execute Large, local+global hybrid brach predictor 1.5MB, 8-way L2 cache Aggressive stream based prefetchig 62

63 IBM POWER5 Kalla et al., IBM Power5 Chip: A Dual-Core Multithreaded Processor, IEEE Micro

64 Hadlig Out-of-Order Executio of Loads ad Stores

65 Recall: Registers versus Memory So far, we cosidered maily registers as part of state What about memory? What are the fudametal differeces betwee registers ad memory? Register depedeces kow statically memory depedeces determied dyamically Register state is small memory state is large Register state is ot visible to other threads/processors memory state is shared betwee threads/processors (i a shared memory multiprocessor) 65

66 Memory Depedece Hadlig (I) Need to obey memory depedeces i a out-of-order machie ad eed to do so while providig high performace Observatio ad Problem: Memory address is ot kow util a load/store executes Corollary 1: Reamig memory addresses is difficult Corollary 2: Determiig depedece or idepedece of loads/stores eed to be hadled after their (partial) executio Corollary 3: Whe a load/store has its address ready, there may be youger/older loads/stores with udetermied addresses i the machie 66

67 Memory Depedece Hadlig (II) Whe do you schedule a load istructio i a OOO egie? Problem: A youger load ca have its address ready before a older store s address is kow Kow as the memory disambiguatio problem or the ukow address problem Approaches Coservative: Stall the load util all previous stores have computed their addresses (or eve retired from the machie) Aggressive: Assume load is idepedet of ukow-address stores ad schedule the load right away Itelliget: Predict (with a more sophisticated predictor) if the load is depedet o the/ay ukow address store 67

68 Hadlig of Store-Load Depedeces A load s depedece status is ot kow util all previous store addresses are available. How does the OOO egie detect depedece of a load istructio o a previous store? Optio 1: Wait util all previous stores committed (o eed to check for address match) Optio 2: Keep a list of pedig stores i a store buffer ad check whether load address matches a previous store address How does the OOO egie treat the schedulig of a load istructio wrt previous stores? Optio 1: Assume load depedet o all previous stores Optio 2: Assume load idepedet of all previous stores Optio 3: Predict the depedece of a load o a outstadig store 68

69 Memory Disambiguatio (I) Optio 1: Assume load depedet o all previous stores + No eed for recovery -- Too coservative: delays idepedet loads uecessarily Optio 2: Assume load idepedet of all previous stores + Simple ad ca be commo case: o delay for idepedet loads -- Reuires recovery ad re-executio of load ad depedets o mispredictio Optio 3: Predict the depedece of a load o a outstadig store + More accurate. Load store depedecies persist over time -- Still reuires recovery/re-executio o mispredictio Alpha : Iitially assume load idepedet, delay loads foud to be depedet Moshovos et al., Dyamic speculatio ad sychroizatio of data depedeces, ISCA Chrysos ad Emer, Memory Depedece Predictio Usig Store Sets, ISCA

70 Memory Disambiguatio (II) Chrysos ad Emer, Memory Depedece Predictio Usig Store Sets, ISCA Predictig store-load depedecies importat for performace Simple predictors (based o past history) ca achieve most of the potetial performace 70

71 Data Forwardig Betwee Stores ad Loads We caot update memory out of program order à Need to buffer all store ad load istructios i istructio widow Eve if we kow all addresses of past stores whe we geerate the address of a load, two uestios still remai: 1. How do we check whether or ot it is depedet o a store 2. How do we forward data to the load if it is depedet o a store Moder processors use a LQ (load ueue) ad a SQ for this Ca be combied or separate betwee loads ad stores A load searches the SQ after it computes its address. Why? A store searches the LQ after it computes its address. Why? 71

72 Out-of-Order Completio of Memory Ops Whe a store istructio fiishes executio, it writes its address ad data i its reorder buffer etry Whe a later load istructio geerates its address, it: searches the reorder buffer (or the SQ) with its address accesses memory with its address receives the value from the yougest older istructio that wrote to that address (either from ROB or memory) This is a complicated search logic implemeted as a Cotet Addressable Memory Cotet is memory address (but also eed size ad age) Called store-to-load forwardig logic 72

73 Store-Load Forwardig Complexity Cotet Addressable Search (based o Load Address) Rage Search (based o Address ad Size) Age-Based Search (for last writte values) Load data ca come from a combiatio of Oe or more stores i the Store Buffer (SQ) Memory/cache 73

74 Food for Thought for You May other desig choices for OoO egies Should reservatio statios be cetralized or distributed across fuctioal uits? What are the tradeoffs? Should reservatio statios ad ROB store data values or should there be a cetralized physical register file where all data values are stored? What are the tradeoffs? Exactly whe does a istructio broadcast its tag? 74

CMSC22200 Computer Architecture Lecture 9: Out-of-Order, SIMD, VLIW. Prof. Yanjing Li University of Chicago

CMSC22200 Computer Architecture Lecture 9: Out-of-Order, SIMD, VLIW. Prof. Yanjing Li University of Chicago CMSC22200 Computer Architecture Lecture 9: Out-of-Order, SIMD, VLIW Prof. Yajig Li Uiversity of Chicago Admiistrative Stuff Lab2 due toight Exam I: covers lectures 1-9 Ope book, ope otes, close device

More information

Design of Digital Circuits Lecture 17: Out-of-Order, DataFlow, Superscalar Execution. Prof. Onur Mutlu ETH Zurich Spring April 2018

Design of Digital Circuits Lecture 17: Out-of-Order, DataFlow, Superscalar Execution. Prof. Onur Mutlu ETH Zurich Spring April 2018 Desig of Digital Circuits Lecture 17: Out-of-Order, DataFlow, Superscalar Executio Prof. Our Mutlu ETH Zurich Sprig 2018 27 April 2018 Ageda for Today & Next Few Lectures Sigle-cycle Microarchitectures

More information

Computer Architecture Lecture 12: Out-of-Order Execution (Dynamic Instruction Scheduling)

Computer Architecture Lecture 12: Out-of-Order Execution (Dynamic Instruction Scheduling) 18-447 Computer Architecture Lecture 12: Out-of-Order Execution (Dynamic Instruction Scheduling) Prof. Onur Mutlu Carnegie Mellon University Spring 2015, 2/13/2015 Agenda for Today & Next Few Lectures

More information

Computer Architecture Lecture 14: Out-of-Order Execution. Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 2/18/2013

Computer Architecture Lecture 14: Out-of-Order Execution. Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 2/18/2013 18-447 Computer Architecture Lecture 14: Out-of-Order Execution Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 2/18/2013 Reminder: Homework 3 Homework 3 Due Feb 25 REP MOVS in Microprogrammed

More information

Computer Architecture: Out-of-Order Execution II. Prof. Onur Mutlu Carnegie Mellon University

Computer Architecture: Out-of-Order Execution II. Prof. Onur Mutlu Carnegie Mellon University Computer Architecture: Out-of-Order Execution II Prof. Onur Mutlu Carnegie Mellon University A Note on This Lecture These slides are partly from 18-447 Spring 2013, Computer Architecture, Lecture 15 Video

More information

15-740/ Computer Architecture Lecture 10: Out-of-Order Execution. Prof. Onur Mutlu Carnegie Mellon University Fall 2011, 10/3/2011

15-740/ Computer Architecture Lecture 10: Out-of-Order Execution. Prof. Onur Mutlu Carnegie Mellon University Fall 2011, 10/3/2011 5-740/8-740 Computer Architecture Lecture 0: Out-of-Order Execution Prof. Onur Mutlu Carnegie Mellon University Fall 20, 0/3/20 Review: Solutions to Enable Precise Exceptions Reorder buffer History buffer

More information

CS252 Spring 2017 Graduate Computer Architecture. Lecture 6: Out-of-Order Processors

CS252 Spring 2017 Graduate Computer Architecture. Lecture 6: Out-of-Order Processors CS252 Sprig 2017 Graduate Computer Architecture Lecture 6: Out-of-Order Processors Lisa Wu, Krste Asaovic http://ist.eecs.berkeley.edu/~cs252/sp17 WU UCB CS252 SP17 2 WU UCB CS252 SP17 Last Time i Lecture

More information

Computer Architecture Lecture 13: State Maintenance and Recovery. Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 2/15/2013

Computer Architecture Lecture 13: State Maintenance and Recovery. Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 2/15/2013 18-447 Computer Architecture Lecture 13: State Maintenance and Recovery Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 2/15/2013 Reminder: Homework 3 Homework 3 Due Feb 25 REP MOVS in Microprogrammed

More information

Computer Architecture Lecture 15: Load/Store Handling and Data Flow. Prof. Onur Mutlu Carnegie Mellon University Spring 2014, 2/21/2014

Computer Architecture Lecture 15: Load/Store Handling and Data Flow. Prof. Onur Mutlu Carnegie Mellon University Spring 2014, 2/21/2014 18-447 Computer Architecture Lecture 15: Load/Store Handling and Data Flow Prof. Onur Mutlu Carnegie Mellon University Spring 2014, 2/21/2014 Lab 4 Heads Up Lab 4a out Branch handling and branch predictors

More information

15-740/ Computer Architecture Lecture 8: Issues in Out-of-order Execution. Prof. Onur Mutlu Carnegie Mellon University

15-740/ Computer Architecture Lecture 8: Issues in Out-of-order Execution. Prof. Onur Mutlu Carnegie Mellon University 15-740/18-740 Computer Architecture Lecture 8: Issues in Out-of-order Execution Prof. Onur Mutlu Carnegie Mellon University Readings General introduction and basic concepts Smith and Sohi, The Microarchitecture

More information

This Unit: Dynamic Scheduling. Can Hardware Overcome These Limits? Scheduling: Compiler or Hardware. The Problem With In-Order Pipelines

This Unit: Dynamic Scheduling. Can Hardware Overcome These Limits? Scheduling: Compiler or Hardware. The Problem With In-Order Pipelines This Uit: Damic Schedulig CSE 560 Computer Sstems Architecture Damic Schedulig Slides origiall developed b Drew Hilto (IBM) ad Milo Marti (Uiversit of Peslvaia) App App App Sstem software Mem CPU I/O Code

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor Advanced Issues

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor Advanced Issues COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 4 The Processor Advaced Issues Review: Pipelie Hazards Structural hazards Desig pipelie to elimiate structural hazards.

More information

CMSC Computer Architecture Lecture 12: Virtual Memory. Prof. Yanjing Li University of Chicago

CMSC Computer Architecture Lecture 12: Virtual Memory. Prof. Yanjing Li University of Chicago CMSC 22200 Computer Architecture Lecture 12: Virtual Memory Prof. Yajig Li Uiversity of Chicago A System with Physical Memory Oly Examples: most Cray machies early PCs Memory early all embedded systems

More information

15-740/ Computer Architecture Lecture 12: Issues in OoO Execution. Prof. Onur Mutlu Carnegie Mellon University Fall 2011, 10/7/2011

15-740/ Computer Architecture Lecture 12: Issues in OoO Execution. Prof. Onur Mutlu Carnegie Mellon University Fall 2011, 10/7/2011 15-740/18-740 Computer Architecture Lecture 12: Issues in OoO Execution Prof. Onur Mutlu Carnegie Mellon University Fall 2011, 10/7/2011 Reviews Due next Monday Mutlu et al., Runahead Execution: An Alternative

More information

Multi-Threading. Hyper-, Multi-, and Simultaneous Thread Execution

Multi-Threading. Hyper-, Multi-, and Simultaneous Thread Execution Multi-Threadig Hyper-, Multi-, ad Simultaeous Thread Executio 1 Performace To Date Icreasig processor performace Pipeliig. Brach predictio. Super-scalar executio. Out-of-order executio. Caches. Hyper-Threadig

More information

Precise Exceptions and Out-of-Order Execution. Samira Khan

Precise Exceptions and Out-of-Order Execution. Samira Khan Precise Exceptions and Out-of-Order Execution Samira Khan Multi-Cycle Execution Not all instructions take the same amount of time for execution Idea: Have multiple different functional units that take

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Part A Datapath Design

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Part A Datapath Design COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter The Processor Part A path Desig Itroductio CPU performace factors Istructio cout Determied by ISA ad compiler. CPI ad

More information

CMSC22200 Computer Architecture Lecture 8: Out-of-Order Execution. Prof. Yanjing Li University of Chicago

CMSC22200 Computer Architecture Lecture 8: Out-of-Order Execution. Prof. Yanjing Li University of Chicago CMSC22200 Computer Architecture Lecture 8: Out-of-Order Execution Prof. Yanjing Li University of Chicago Administrative Stuff! Lab2 due tomorrow " 2 free late days! Lab3 is out " Start early!! My office

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Single-Cycle Disadvantages & Advantages

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Single-Cycle Disadvantages & Advantages COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 4 The Processor Pipeliig Sigle-Cycle Disadvatages & Advatages Clk Uses the clock cycle iefficietly the clock cycle must

More information

Master Informatics Eng. 2017/18. A.J.Proença. Memory Hierarchy. (most slides are borrowed) AJProença, Advanced Architectures, MiEI, UMinho, 2017/18 1

Master Informatics Eng. 2017/18. A.J.Proença. Memory Hierarchy. (most slides are borrowed) AJProença, Advanced Architectures, MiEI, UMinho, 2017/18 1 Advaced Architectures Master Iformatics Eg. 2017/18 A.J.Proeça Memory Hierarchy (most slides are borrowed) AJProeça, Advaced Architectures, MiEI, UMiho, 2017/18 1 Itroductio Programmers wat ulimited amouts

More information

CSC 220: Computer Organization Unit 11 Basic Computer Organization and Design

CSC 220: Computer Organization Unit 11 Basic Computer Organization and Design College of Computer ad Iformatio Scieces Departmet of Computer Sciece CSC 220: Computer Orgaizatio Uit 11 Basic Computer Orgaizatio ad Desig 1 For the rest of the semester, we ll focus o computer architecture:

More information

CMSC Computer Architecture Lecture 11: More Caches. Prof. Yanjing Li University of Chicago

CMSC Computer Architecture Lecture 11: More Caches. Prof. Yanjing Li University of Chicago CMSC 22200 Computer Architecture Lecture 11: More Caches Prof. Yajig Li Uiversity of Chicago Lecture Outlie Caches 2 Review Memory hierarchy Cache basics Locality priciples Spatial ad temporal How to access

More information

CMSC Computer Architecture Lecture 5: Pipelining. Prof. Yanjing Li University of Chicago

CMSC Computer Architecture Lecture 5: Pipelining. Prof. Yanjing Li University of Chicago CMSC 22200 Computer Architecture Lecture 5: Pipeliig Prof. Yajig Li Uiversity of Chicago Admiistrative Stuff Lab1 Due toight Lab2: out later today; due 2 weeks from ow Review sessio this Friday Turig award

More information

CMSC Computer Architecture Lecture 10: Caches. Prof. Yanjing Li University of Chicago

CMSC Computer Architecture Lecture 10: Caches. Prof. Yanjing Li University of Chicago CMSC 22200 Computer Architecture Lecture 10: Caches Prof. Yajig Li Uiversity of Chicago Midterm Recap Overview ad fudametal cocepts ISA Uarch Datapath, cotrol Sigle cycle, multi cycle Pipeliig Basic idea,

More information

Appendix D. Controller Implementation

Appendix D. Controller Implementation COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Appedix D Cotroller Implemetatio Cotroller Implemetatios Combiatioal logic (sigle-cycle); Fiite state machie (multi-cycle, pipelied);

More information

ΕΠΛ 605 Εργαστήριο 5. Παναγιώτα Νικολάου 11/10/18. Slides from: Rajagopalan Desikan, Doug Burger, Stephen Keckler, Todd Austin

ΕΠΛ 605 Εργαστήριο 5. Παναγιώτα Νικολάου 11/10/18. Slides from: Rajagopalan Desikan, Doug Burger, Stephen Keckler, Todd Austin ΕΠΛ 605 Εργαστήριο 5 Παναγιώτα Νικολάου 11/10/18 Slides from: Rajagopala Desika, Doug Burger, Stephe Keckler, Todd Austi Simulators Simulatio is the process of desigig a model of a real system ad coductig

More information

Course Site: Copyright 2012, Elsevier Inc. All rights reserved.

Course Site:   Copyright 2012, Elsevier Inc. All rights reserved. Course Site: http://cc.sjtu.edu.c/g2s/site/aca.html 1 Computer Architecture A Quatitative Approach, Fifth Editio Chapter 2 Memory Hierarchy Desig 2 Outlie Memory Hierarchy Cache Desig Basic Cache Optimizatios

More information

CMSC Computer Architecture Lecture 15: Multi-Core. Prof. Yanjing Li University of Chicago

CMSC Computer Architecture Lecture 15: Multi-Core. Prof. Yanjing Li University of Chicago CMSC 22200 Computer Architecture Lecture 15: Multi-Core Prof. Yajig Li Uiversity of Chicago Course Evaluatio Very importat Please fill out! 2 Lab3 Brach Predictio Competitio 8 teams etered the competitio,

More information

Morgan Kaufmann Publishers 26 February, COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5

Morgan Kaufmann Publishers 26 February, COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5 Morga Kaufma Publishers 26 February, 28 COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 5 Set-Associative Cache Architecture Performace Summary Whe CPU performace icreases:

More information

Instruction and Data Streams

Instruction and Data Streams Advaced Architectures Master Iformatics Eg. 2017/18 A.J.Proeça Data Parallelism 1 (vector & SIMD extesios) (most slides are borrowed) AJProeça, Advaced Architectures, MiEI, UMiho, 2017/18 1 Istructio ad

More information

Uniprocessors. HPC Prof. Robert van Engelen

Uniprocessors. HPC Prof. Robert van Engelen Uiprocessors HPC Prof. Robert va Egele Overview PART I: Uiprocessors PART II: Multiprocessors ad ad Compiler Optimizatios Parallel Programmig Models Uiprocessors Multiprocessors Processor architectures

More information

CMSC Computer Architecture Lecture 2: ISA. Prof. Yanjing Li Department of Computer Science University of Chicago

CMSC Computer Architecture Lecture 2: ISA. Prof. Yanjing Li Department of Computer Science University of Chicago CMSC 22200 Computer Architecture Lecture 2: ISA Prof. Yajig Li Departmet of Computer Sciece Uiversity of Chicago Admiistrative Stuff Lab1 out toight Due Thursday (10/18) Lab1 review sessio Tomorrow, 10/05,

More information

Morgan Kaufmann Publishers 26 February, COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5.

Morgan Kaufmann Publishers 26 February, COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5. Morga Kaufma Publishers 26 February, 208 COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 5 Virtual Memory Review: The Memory Hierarchy Take advatage of the priciple

More information

Design of Digital Circuits Lecture 21: SIMD Processors II and Graphics Processing Units

Design of Digital Circuits Lecture 21: SIMD Processors II and Graphics Processing Units Desig of Digital Circuits Lecture 21: SIMD Processors II ad Graphics Processig Uits Dr. Jua Gómez Lua Prof. Our Mutlu ETH Zurich Sprig 2018 17 May 2018 New Course: Bachelor s Semiar i Comp Arch Fall 2018

More information

Chapter 4 Threads. Operating Systems: Internals and Design Principles. Ninth Edition By William Stallings

Chapter 4 Threads. Operating Systems: Internals and Design Principles. Ninth Edition By William Stallings Operatig Systems: Iterals ad Desig Priciples Chapter 4 Threads Nith Editio By William Stalligs Processes ad Threads Resource Owership Process icludes a virtual address space to hold the process image The

More information

Design of Digital Circuits Lecture 14: Pipelining. Prof. Onur Mutlu ETH Zurich Spring April 2018

Design of Digital Circuits Lecture 14: Pipelining. Prof. Onur Mutlu ETH Zurich Spring April 2018 Desig of Digital Circuits Lecture 4: Pipeliig Prof. Our Mutlu ETH Zurich Sprig 28 9 April 28 Ageda for Today & Next Few Lectures Previous lectures Sigle-cycle Microarchitectures Multi-cycle ad Microprogrammed

More information

The University of Adelaide, School of Computer Science 22 November Computer Architecture. A Quantitative Approach, Sixth Edition.

The University of Adelaide, School of Computer Science 22 November Computer Architecture. A Quantitative Approach, Sixth Edition. Computer Architecture A Quatitative Approach, Sixth Editio Chapter 2 Memory Hierarchy Desig 1 Itroductio Programmers wat ulimited amouts of memory with low latecy Fast memory techology is more expesive

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Determined by ISA and compiler. Determined by CPU hardware

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Determined by ISA and compiler. Determined by CPU hardware COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface ARM Editio Chapter 4 The Processor Itroductio CPU performace factors Istructio cout Determied by ISA ad compiler CPI ad Cycle time Determied

More information

Python Programming: An Introduction to Computer Science

Python Programming: An Introduction to Computer Science Pytho Programmig: A Itroductio to Computer Sciece Chapter 1 Computers ad Programs 1 Objectives To uderstad the respective roles of hardware ad software i a computig system. To lear what computer scietists

More information

Data diverse software fault tolerance techniques

Data diverse software fault tolerance techniques Data diverse software fault tolerace techiques Complemets desig diversity by compesatig for desig diversity s s limitatios Ivolves obtaiig a related set of poits i the program data space, executig the

More information

Isn t It Time You Got Faster, Quicker?

Isn t It Time You Got Faster, Quicker? Is t It Time You Got Faster, Quicker? AltiVec Techology At-a-Glace OVERVIEW Motorola s advaced AltiVec techology is desiged to eable host processors compatible with the PowerPC istructio-set architecture

More information

Computer Architecture Lecture 8: SIMD Processors and GPUs. Prof. Onur Mutlu ETH Zürich Fall October 2017

Computer Architecture Lecture 8: SIMD Processors and GPUs. Prof. Onur Mutlu ETH Zürich Fall October 2017 Computer Architecture Lecture 8: SIMD Processors ad GPUs Prof. Our Mutlu ETH Zürich Fall 2017 18 October 2017 Ageda for Today & Next Few Lectures SIMD Processors GPUs Itroductio to GPU Programmig Digitaltechik

More information

Elementary Educational Computer

Elementary Educational Computer Chapter 5 Elemetary Educatioal Computer. Geeral structure of the Elemetary Educatioal Computer (EEC) The EEC coforms to the 5 uits structure defied by vo Neuma's model (.) All uits are preseted i a simplified

More information

COMPUTER ORGANIZATION AND DESIGN

COMPUTER ORGANIZATION AND DESIGN COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Review Istructio Set Architecture Istructio Set The repertoire of istructios of a computer Differet computers have differet istructio

More information

CMSC Computer Architecture Lecture 3: ISA and Introduction to Microarchitecture. Prof. Yanjing Li University of Chicago

CMSC Computer Architecture Lecture 3: ISA and Introduction to Microarchitecture. Prof. Yanjing Li University of Chicago CMSC 22200 Computer Architecture Lecture 3: ISA ad Itroductio to Microarchitecture Prof. Yajig Li Uiversity of Chicago Lecture Outlie ISA uarch (hardware implemetatio of a ISA) Logic desig basics Sigle-cycle

More information

Lecture 28: Data Link Layer

Lecture 28: Data Link Layer Automatic Repeat Request (ARQ) 2. Go ack N ARQ Although the Stop ad Wait ARQ is very simple, you ca easily show that it has very the low efficiecy. The low efficiecy comes from the fact that the trasmittig

More information

A collection of open-sourced RISC-V processors

A collection of open-sourced RISC-V processors Riscy Processors A collectio of ope-sourced RISC-V processors Ady Wright, Sizhuo Zhag, Thomas Bourgeat, Murali Vijayaraghava, Jamey Hicks, Arvid Computatio Structures Group, CSAIL, MIT 4 th RISC-V Workshop

More information

Computer Architecture. Microcomputer Architecture and Interfacing Colorado School of Mines Professor William Hoff

Computer Architecture. Microcomputer Architecture and Interfacing Colorado School of Mines Professor William Hoff Computer rchitecture Microcomputer rchitecture ad Iterfacig Colorado School of Mies Professor William Hoff Computer Hardware Orgaizatio Processor Performs all computatios; coordiates data trasfer Iput

More information

End Semester Examination CSE, III Yr. (I Sem), 30002: Computer Organization

End Semester Examination CSE, III Yr. (I Sem), 30002: Computer Organization Ed Semester Examiatio 2013-14 CSE, III Yr. (I Sem), 30002: Computer Orgaizatio Istructios: GROUP -A 1. Write the questio paper group (A, B, C, D), o frot page top of aswer book, as per what is metioed

More information

EE University of Minnesota. Midterm Exam #1. Prof. Matthew O'Keefe TA: Eric Seppanen. Department of Electrical and Computer Engineering

EE University of Minnesota. Midterm Exam #1. Prof. Matthew O'Keefe TA: Eric Seppanen. Department of Electrical and Computer Engineering EE 4363 1 Uiversity of Miesota Midterm Exam #1 Prof. Matthew O'Keefe TA: Eric Seppae Departmet of Electrical ad Computer Egieerig Uiversity of Miesota Twi Cities Campus EE 4363 Itroductio to Microprocessors

More information

15-740/ Computer Architecture Lecture 23: Superscalar Processing (III) Prof. Onur Mutlu Carnegie Mellon University

15-740/ Computer Architecture Lecture 23: Superscalar Processing (III) Prof. Onur Mutlu Carnegie Mellon University 15-740/18-740 Computer Architecture Lecture 23: Superscalar Processing (III) Prof. Onur Mutlu Carnegie Mellon University Announcements Homework 4 Out today Due November 15 Midterm II November 22 Project

More information

Portland State University ECE 587/687. Superscalar Issue Logic

Portland State University ECE 587/687. Superscalar Issue Logic Portland State University ECE 587/687 Superscalar Issue Logic Copyright by Alaa Alameldeen, Zeshan Chishti and Haitham Akkary 2017 Instruction Issue Logic (Sohi & Vajapeyam, 1987) After instructions are

More information

Computer Graphics Hardware An Overview

Computer Graphics Hardware An Overview Computer Graphics Hardware A Overview Graphics System Moitor Iput devices CPU/Memory GPU Raster Graphics System Raster: A array of picture elemets Based o raster-sca TV techology The scree (ad a picture)

More information

Computer Architecture Lecture 17: Latency Tolerance and Prefetching. Prof. Onur Mutlu ETH Zürich Fall November 2017

Computer Architecture Lecture 17: Latency Tolerance and Prefetching. Prof. Onur Mutlu ETH Zürich Fall November 2017 Computer Architecture Lecture 17: Latecy Tolerace ad Prefetchig Prof. Our Mutlu ETH Zürich Fall 2017 22 November 2017 Summary of Last Week s Lectures Shared Cache Maagemet Makig Cachig More Effective Heterogeeous

More information

CS61C : Machine Structures

CS61C : Machine Structures CS 61C L24 VM II (1) ist.eecs.berkele.edu/~cs61c/su5 CS61C : Machie Structures Lecture #24: VM II Address Mappig: Virtual Address: VPN offset 25-8-2 Ad Carle idex ito page table located i phsical memor

More information

UNIVERSITY OF MORATUWA

UNIVERSITY OF MORATUWA UNIVERSITY OF MORATUWA FACULTY OF ENGINEERING DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING B.Sc. Egieerig 2014 Itake Semester 2 Examiatio CS2052 COMPUTER ARCHITECTURE Time allowed: 2 Hours Jauary 2016

More information

ECE5917 SoC Architecture: MP SoC Part 1. Tae Hee Han: Semiconductor Systems Engineering Sungkyunkwan University

ECE5917 SoC Architecture: MP SoC Part 1. Tae Hee Han: Semiconductor Systems Engineering Sungkyunkwan University ECE5917 SoC Architecture: MP SoC Part 1 Tae Hee Ha: tha@skku.edu Semicoductor Systems Egieerig Sugkyukwa Uiversity Outlie Overview Parallelism Data-Level Parallelism Istructio-Level Parallelism Thread-Level

More information

15-740/ Computer Architecture Lecture 14: Runahead Execution. Prof. Onur Mutlu Carnegie Mellon University Fall 2011, 10/12/2011

15-740/ Computer Architecture Lecture 14: Runahead Execution. Prof. Onur Mutlu Carnegie Mellon University Fall 2011, 10/12/2011 15-740/18-740 Computer Architecture Lecture 14: Runahead Execution Prof. Onur Mutlu Carnegie Mellon University Fall 2011, 10/12/2011 Reviews Due Today Chrysos and Emer, Memory Dependence Prediction Using

More information

EE 459/500 HDL Based Digital Design with Programmable Logic. Lecture 13 Control and Sequencing: Hardwired and Microprogrammed Control

EE 459/500 HDL Based Digital Design with Programmable Logic. Lecture 13 Control and Sequencing: Hardwired and Microprogrammed Control EE 459/500 HDL Based Digital Desig with Programmable Logic Lecture 13 Cotrol ad Sequecig: Hardwired ad Microprogrammed Cotrol Refereces: Chapter s 4,5 from textbook Chapter 7 of M.M. Mao ad C.R. Kime,

More information

Design of Digital Circuits Lecture 20: SIMD Processors. Prof. Onur Mutlu ETH Zurich Spring May 2018

Design of Digital Circuits Lecture 20: SIMD Processors. Prof. Onur Mutlu ETH Zurich Spring May 2018 Desig of Digital Circuits Lecture 20: SIMD Processors Prof. Our Mutlu ETH Zurich Sprig 2018 11 May 2018 New Course: Bachelor s Semiar i Comp Arch Fall 2018 2 credit uits Rigorous semiar o fudametal ad

More information

Arquitectura de Computadores

Arquitectura de Computadores Arquitectura de Computadores Capítulo 2. Procesadores segmetados Based o the origial material of the book: D.A. Patterso y J.L. Heessy Computer Orgaizatio ad Desig: The Hardware/Software Iterface 4 th

More information

Computer Architecture ELEC3441

Computer Architecture ELEC3441 CPU-Memory Bottleeck Computer Architecture ELEC44 CPU Memory Lecture 8 Cache Dr. Hayde Kwok-Hay So Departmet of Electrical ad Electroic Egieerig Performace of high-speed computers is usually limited by

More information

Fundamentals of. Chapter 1. Microprocessor and Microcontroller. Dr. Farid Farahmand. Updated: Tuesday, January 16, 2018

Fundamentals of. Chapter 1. Microprocessor and Microcontroller. Dr. Farid Farahmand. Updated: Tuesday, January 16, 2018 Fudametals of Chapter 1 Microprocessor ad Microcotroller Dr. Farid Farahmad Updated: Tuesday, Jauary 16, 2018 Evolutio First came trasistors Itegrated circuits SSI (Small-Scale Itegratio) to ULSI Very

More information

CMSC Computer Architecture Lecture 1: Introduction. Prof. Yanjing Li Department of Computer Science University of Chicago

CMSC Computer Architecture Lecture 1: Introduction. Prof. Yanjing Li Department of Computer Science University of Chicago CMSC 22200 Computer Architecture Lecture 1: Itroductio Prof. Yajig Li Departmet of Computer Sciece Uiversity of Chicago Lecture Outlie Meet ad greet Computer architecture: overview ad perspectives Course

More information

Basic allocator mechanisms The course that gives CMU its Zip! Memory Management II: Dynamic Storage Allocation Mar 6, 2000.

Basic allocator mechanisms The course that gives CMU its Zip! Memory Management II: Dynamic Storage Allocation Mar 6, 2000. 5-23 The course that gives CM its Zip Memory Maagemet II: Dyamic Storage Allocatio Mar 6, 2000 Topics Segregated lists Buddy system Garbage collectio Mark ad Sweep Copyig eferece coutig Basic allocator

More information

Switching Hardware. Spring 2018 CS 438 Staff, University of Illinois 1

Switching Hardware. Spring 2018 CS 438 Staff, University of Illinois 1 Switchig Hardware Sprig 208 CS 438 Staff, Uiversity of Illiois Where are we? Uderstad Differet ways to move through a etwork (forwardig) Read sigs at each switch (datagram) Follow a kow path (virtual circuit)

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe CHAPTER 20 Itroductio to Trasactio Processig Cocepts ad Theory Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe Itroductio Trasactio Describes local

More information

Review: The ACID properties

Review: The ACID properties Recovery Review: The ACID properties A tomicity: All actios i the Xactio happe, or oe happe. C osistecy: If each Xactio is cosistet, ad the DB starts cosistet, it eds up cosistet. I solatio: Executio of

More information

Page 1. Why Care About the Memory Hierarchy? Memory. DRAMs over Time. Virtual Memory!

Page 1. Why Care About the Memory Hierarchy? Memory. DRAMs over Time. Virtual Memory! Why Care About the Memory Hierarchy? Memory Virtual Memory -DRAM Memory Gap (latecy) Reasos: Multi process systems (abstractio & memory protectio) Solutio: Tables (holdig per process traslatios) Fast traslatio

More information

Reliable Transmission. Spring 2018 CS 438 Staff - University of Illinois 1

Reliable Transmission. Spring 2018 CS 438 Staff - University of Illinois 1 Reliable Trasmissio Sprig 2018 CS 438 Staff - Uiversity of Illiois 1 Reliable Trasmissio Hello! My computer s ame is Alice. Alice Bob Hello! Alice. Sprig 2018 CS 438 Staff - Uiversity of Illiois 2 Reliable

More information

Lecture 1: Introduction and Fundamental Concepts 1

Lecture 1: Introduction and Fundamental Concepts 1 Uderstadig Performace Lecture : Fudametal Cocepts ad Performace Aalysis CENG 332 Algorithm Determies umber of operatios executed Programmig laguage, compiler, architecture Determie umber of machie istructios

More information

System and Software Architecture Description (SSAD)

System and Software Architecture Description (SSAD) System ad Software Architecture Descriptio (SSAD) Diabetes Health Platform Team #6 Jasmie Berry (Cliet) Veerav Naidu (Project Maager) Mukai Nog (Architect) Steve South (IV&V) Vijaya Prabhakara (Quality

More information

Announcements. Reading. Project #4 is on the web. Homework #1. Midterm #2. Chapter 4 ( ) Note policy about project #3 missing components

Announcements. Reading. Project #4 is on the web. Homework #1. Midterm #2. Chapter 4 ( ) Note policy about project #3 missing components Aoucemets Readig Chapter 4 (4.1-4.2) Project #4 is o the web ote policy about project #3 missig compoets Homework #1 Due 11/6/01 Chapter 6: 4, 12, 24, 37 Midterm #2 11/8/01 i class 1 Project #4 otes IPv6Iit,

More information

Structuring Redundancy for Fault Tolerance. CSE 598D: Fault Tolerant Software

Structuring Redundancy for Fault Tolerance. CSE 598D: Fault Tolerant Software Structurig Redudacy for Fault Tolerace CSE 598D: Fault Tolerat Software What do we wat to achieve? Versios Damage Assessmet Versio 1 Error Detectio Iputs Versio 2 Voter Outputs State Restoratio Cotiued

More information

n Explore virtualization concepts n Become familiar with cloud concepts

n Explore virtualization concepts n Become familiar with cloud concepts Chapter Objectives Explore virtualizatio cocepts Become familiar with cloud cocepts Chapter #15: Architecture ad Desig 2 Hypervisor Virtualizatio ad cloud services are becomig commo eterprise tools to

More information

Programming with Shared Memory PART II. HPC Spring 2017 Prof. Robert van Engelen

Programming with Shared Memory PART II. HPC Spring 2017 Prof. Robert van Engelen Programmig with Shared Memory PART II HPC Sprig 2017 Prof. Robert va Egele Overview Sequetial cosistecy Parallel programmig costructs Depedece aalysis OpeMP Autoparallelizatio Further readig HPC Sprig

More information

Handout 2 ILP: Part B

Handout 2 ILP: Part B Handout 2 ILP: Part B Review from Last Time #1 Leverage Implicit Parallelism for Performance: Instruction Level Parallelism Loop unrolling by compiler to increase ILP Branch prediction to increase ILP

More information

COSC 1P03. Ch 7 Recursion. Introduction to Data Structures 8.1

COSC 1P03. Ch 7 Recursion. Introduction to Data Structures 8.1 COSC 1P03 Ch 7 Recursio Itroductio to Data Structures 8.1 COSC 1P03 Recursio Recursio I Mathematics factorial Fiboacci umbers defie ifiite set with fiite defiitio I Computer Sciece sytax rules fiite defiitio,

More information

Multiprocessors. HPC Prof. Robert van Engelen

Multiprocessors. HPC Prof. Robert van Engelen Multiprocessors Prof. Robert va Egele Overview The PMS model Shared memory multiprocessors Basic shared memory systems SMP, Multicore, ad COMA Distributed memory multicomputers MPP systems Network topologies

More information

Chapter 4 The Datapath

Chapter 4 The Datapath The Ageda Chapter 4 The Datapath Based o slides McGraw-Hill Additioal material 24/25/26 Lewis/Marti Additioal material 28 Roth Additioal material 2 Taylor Additioal material 2 Farmer Tae the elemets that

More information

15-740/ Computer Architecture Lecture 22: Superscalar Processing (II) Prof. Onur Mutlu Carnegie Mellon University

15-740/ Computer Architecture Lecture 22: Superscalar Processing (II) Prof. Onur Mutlu Carnegie Mellon University 15-740/18-740 Computer Architecture Lecture 22: Superscalar Processing (II) Prof. Onur Mutlu Carnegie Mellon University Announcements Project Milestone 2 Due Today Homework 4 Out today Due November 15

More information

Designing a learning system

Designing a learning system CS 75 Machie Learig Lecture Desigig a learig system Milos Hauskrecht milos@cs.pitt.edu 539 Seott Square, x-5 people.cs.pitt.edu/~milos/courses/cs75/ Admiistrivia No homework assigmet this week Please try

More information

Threads and Concurrency in Java: Part 1

Threads and Concurrency in Java: Part 1 Threads ad Cocurrecy i Java: Part 1 1 Cocurrecy What every computer egieer eeds to kow about cocurrecy: Cocurrecy is to utraied programmers as matches are to small childre. It is all too easy to get bured.

More information

Introduction to Computing Systems: From Bits and Gates to C and Beyond 2 nd Edition

Introduction to Computing Systems: From Bits and Gates to C and Beyond 2 nd Edition Lecture Goals Itroductio to Computig Systems: From Bits ad Gates to C ad Beyod 2 d Editio Yale N. Patt Sajay J. Patel Origial slides from Gregory Byrd, North Carolia State Uiversity Modified slides by

More information

Computer Systems Architecture I. CSE 560M Lecture 10 Prof. Patrick Crowley

Computer Systems Architecture I. CSE 560M Lecture 10 Prof. Patrick Crowley Computer Systems Architecture I CSE 560M Lecture 10 Prof. Patrick Crowley Plan for Today Questions Dynamic Execution III discussion Multiple Issue Static multiple issue (+ examples) Dynamic multiple issue

More information

Outline. CSCI 4730 Operating Systems. Questions. What is an Operating System? Computer System Layers. Computer System Layers

Outline. CSCI 4730 Operating Systems. Questions. What is an Operating System? Computer System Layers. Computer System Layers Outlie CSCI 4730 s! What is a s?!! System Compoet Architecture s Overview Questios What is a?! What are the major operatig system compoets?! What are basic computer system orgaizatios?! How do you commuicate

More information

10/23/18. File class in Java. Scanner reminder. Files. Opening a file for reading. Scanner reminder. File Input and Output

10/23/18. File class in Java. Scanner reminder. Files. Opening a file for reading. Scanner reminder. File Input and Output File class i Java File Iput ad Output TOPICS File Iput Exceptio Hadlig File Output Programmers refer to iput/output as "I/O". The File class represets files as objects. The class is defied i the java.io

More information

15-740/ Computer Architecture Lecture 21: Superscalar Processing. Prof. Onur Mutlu Carnegie Mellon University

15-740/ Computer Architecture Lecture 21: Superscalar Processing. Prof. Onur Mutlu Carnegie Mellon University 15-740/18-740 Computer Architecture Lecture 21: Superscalar Processing Prof. Onur Mutlu Carnegie Mellon University Announcements Project Milestone 2 Due November 10 Homework 4 Out today Due November 15

More information

Threads and Concurrency in Java: Part 1

Threads and Concurrency in Java: Part 1 Cocurrecy Threads ad Cocurrecy i Java: Part 1 What every computer egieer eeds to kow about cocurrecy: Cocurrecy is to utraied programmers as matches are to small childre. It is all too easy to get bured.

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe CHAPTER 22 Database Recovery Techiques Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe Itroductio Recovery algorithms Recovery cocepts Write-ahead

More information

CS 111: Program Design I Lecture #26: Heat maps, Nothing, Predictive Policing

CS 111: Program Design I Lecture #26: Heat maps, Nothing, Predictive Policing CS 111: Program Desig I Lecture #26: Heat maps, Nothig, Predictive Policig Robert H. Sloa & Richard Warer Uiversity of Illiois at Chicago November 29, 2018 Some Logistics Extra credit: Sample Fial Exam

More information

Recursion. Computer Science S-111 Harvard University David G. Sullivan, Ph.D. Review: Method Frames

Recursion. Computer Science S-111 Harvard University David G. Sullivan, Ph.D. Review: Method Frames Uit 4, Part 3 Recursio Computer Sciece S-111 Harvard Uiversity David G. Sulliva, Ph.D. Review: Method Frames Whe you make a method call, the Java rutime sets aside a block of memory kow as the frame of

More information

Designing a learning system

Designing a learning system CS 75 Itro to Machie Learig Lecture Desigig a learig system Milos Hauskrecht milos@pitt.edu 539 Seott Square, -5 people.cs.pitt.edu/~milos/courses/cs75/ Admiistrivia No homework assigmet this week Please

More information

Computer Architecture

Computer Architecture Computer Architecture Overview Prof. Tie-Fu Che Dept. of Computer Sciece Natioal Chug Cheg Uiv Sprig 2002 Overview- Computer Architecture Course Focus Uderstadig the desig techiques, machie structures,

More information

COP4020 Programming Languages. Compilers and Interpreters Prof. Robert van Engelen

COP4020 Programming Languages. Compilers and Interpreters Prof. Robert van Engelen COP4020 mig Laguages Compilers ad Iterpreters Prof. Robert va Egele Overview Commo compiler ad iterpreter cofiguratios Virtual machies Itegrated developmet eviromets Compiler phases Lexical aalysis Sytax

More information

Chapter 5: Processor Design Advanced Topics. Microprogramming: Basic Idea

Chapter 5: Processor Design Advanced Topics. Microprogramming: Basic Idea 5-1 Chapter 5 Processor Desig Advaced Topics Chapter 5: Processor Desig Advaced Topics Topics 5.3 Microprogrammig Cotrol store ad microbrachig Horizotal ad vertical microprogrammig 5- Chapter 5 Processor

More information

Using VTR Emulation on Avid Systems

Using VTR Emulation on Avid Systems Usig VTR Emulatio o Avid Systems VTR emulatio allows you to cotrol a sequece loaded i the Record moitor from a edit cotroller for playback i the edit room alog with other sources. I this sceario the edit

More information

Computer Architecture Lecture 15: Dataflow and SIMD. Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 2/20/2013

Computer Architecture Lecture 15: Dataflow and SIMD. Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 2/20/2013 18-447 Computer Architecture Lecture 15: Dataflow and SIMD Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 2/20/2013 Reminder: Homework 3 Homework 3 Due Feb 25 REP MOVS in Microprogrammed LC-3b,

More information

Service Oriented Enterprise Architecture and Service Oriented Enterprise

Service Oriented Enterprise Architecture and Service Oriented Enterprise Approved for Public Release Distributio Ulimited Case Number: 09-2786 The 23 rd Ope Group Eterprise Practitioers Coferece Service Orieted Eterprise ad Service Orieted Eterprise Ya Zhao, PhD Pricipal, MITRE

More information

Arithmetic Sequences

Arithmetic Sequences . Arithmetic Sequeces COMMON CORE Learig Stadards HSF-IF.A. HSF-BF.A.1a HSF-BF.A. HSF-LE.A. Essetial Questio How ca you use a arithmetic sequece to describe a patter? A arithmetic sequece is a ordered

More information