Lecture #22 Pipelining II, Cache I

inst.eecs.berkeley.edu/~cs61c
CS61C: Machine Structures
Lecture #22 Pipelining II, Cache I
Wireworld circuits
Albert Chae, Instructor
CS61C L22 Pipelining II, Cache I (1)

Review: Processor Pipelining (1/2)
Pipeline registers are added to the datapath/controller to neatly divide the single-cycle processor into pipeline stages.
Optimal pipeline:
- Each stage is executing part of an instruction each clock cycle.
- One instruction finishes during each clock cycle.
- On average, execute far more quickly.
What makes this work well?
- Similarities between instructions allow us to use the same stages for all instructions (generally).
- Each stage takes about the same amount of time as all others: little wasted time.

A pipelined datapath
[Figure: pipelined datapath, from P&H]

Review: Pipeline (2/2)
Pipelining is a BIG IDEA: a widely used concept.
What makes it less than perfect?
- Structural hazards: conflicts for resources. Suppose we had only one cache? Need more HW resources.
- Control hazards: branch instructions affect which instructions come next. Delayed branch.
- Data hazards: data flow between instructions. Forwarding.

Review
Some fixes to hazards:
- Illusion of two memories
- Register file convention
- Forwarding
- Load delay slot
- All else fails, bubble/stall
Latency vs. throughput
What prevents us from getting n-times speedup, where n is the number of pipeline stages?

Graphical Pipeline Representation
(In Reg, right half highlight read, left half write)
[Figure: instruction order (Load, Add, Store, Sub, Or) against time in clock cycles; stage labels shown: I$, Reg, D$, Reg]

Control Hazard: Branching (1/8)
[Figure: instruction order (beq, Inst 1, Inst 2, Inst 3, Inst 4) against time in clock cycles; stage labels shown: I$, Reg, D$, Reg]
Where do we do the compare for the branch?

Control Hazard: Branching (2/8)
We had put branch decision-making hardware in the ALU stage; therefore two more instructions after the branch will always be fetched, whether or not the branch is taken.
Desired functionality of a branch:
- if we do not take the branch, don't waste any time and continue executing normally
- if we take the branch, don't execute any instructions after the branch, just go to the desired label

Control Hazard: Branching (3/8)
Initial solution: stall until the decision is made
- insert "no-op" instructions (those that accomplish nothing, just take time) or hold up the fetch of the next instruction (for 2 cycles)
- Drawback: branches take 3 clock cycles each (assuming the comparator is put in the ALU stage)

Control Hazard: Branching (4/8)
Optimization #1: insert a special branch comparator in Stage 2
- as soon as the instruction is decoded (opcode identifies it as a branch), immediately make a decision and set the new value of the PC
- Benefit: since the branch is complete in Stage 2, only one unnecessary instruction is fetched, so only one no-op is needed
- Side note: this means that branches are idle in Stages 3, 4 and 5

Control Hazard: Branching (5/8)
[Figure: instruction order (beq, Inst 1, Inst 2, Inst 3, Inst 4) against time in clock cycles; stage labels shown: I$, Reg, D$, Reg]
Branch comparator moved to Decode stage.

Control Hazard: Branching (6a/8)
User inserting no-op instruction
[Figure: instruction order (add, beq, nop, lw) against time in clock cycles; the nop passes through the pipeline as bubbles]
Impact: 2 clock cycles per branch instruction: slow

Control Hazard: Branching (6b/8)
Controller inserting a single bubble
[Figure: instruction order (add, beq, lw) against time in clock cycles; one bubble follows the beq]
Impact: 2 clock cycles per branch instruction: slow

Control Hazard: Branching (7/8)
Optimization #2: redefine branches
- Old definition: if we take the branch, none of the instructions after the branch get executed by accident
- New definition: whether or not we take the branch, the single instruction immediately following the branch gets executed (called the branch-delay slot)
The term "Delayed Branch" means we always execute the instruction after the branch.
This optimization is used on the MIPS.

Control Hazard: Branching (8/8)
Notes on Branch-Delay Slot
- Worst-case scenario: can always put a no-op in the branch-delay slot
- Better case: can find an instruction preceding the branch which can be placed in the branch-delay slot without affecting the flow of the program
  - re-ordering instructions is a common method of speeding up programs
  - compiler must be very smart in order to find instructions to do this
  - usually can find such an instruction at least 50% of the time
  - jumps also have a delay slot

Example: Nondelayed vs. Delayed Branch

  Nondelayed Branch        Delayed Branch
  or  $8, $9, $10          add $1, $2, $3
  add $1, $2, $3           sub $4, $5, $6
  sub $4, $5, $6           beq $1, $4, Exit
  beq $1, $4, Exit         or  $8, $9, $10
  xor $10, $1, $11         xor $10, $1, $11
  Exit:                    Exit:
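The branch-delay-slot notes above can be turned into a rough cost estimate. The sketch below is an illustration only: the function name `avg_branch_cycles` and the idea of charging an unfilled slot as one wasted cycle are assumptions made for this example, not something the slides define.

```python
def avg_branch_cycles(fill_rate, delay_slots=1):
    """Average cycles charged to a delayed branch.

    fill_rate:   fraction of branches whose delay slot holds useful work
                 (the slides quote "at least 50% of the time").
    delay_slots: delay slots per branch (1 for the MIPS scheme in the slides).
    An unfilled slot holds a no-op, which is counted as a wasted cycle.
    """
    if not 0.0 <= fill_rate <= 1.0:
        raise ValueError("fill_rate must be between 0 and 1")
    return 1 + delay_slots * (1 - fill_rate)


print(avg_branch_cycles(0.0))  # every slot holds a no-op: 2.0
print(avg_branch_cycles(0.5))  # half of the slots filled: 1.5
print(avg_branch_cycles(1.0))  # every slot filled: 1.0
```

Under this simple model, a 50% fill rate recovers half of the one-cycle penalty that the no-op approach pays on every branch.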

Out-of-Order Laundry: Don't Wait
[Figure: tasks A to F in task order against time, 6 PM to 2 AM, with a bubble in the schedule]
A depends on D; rest continue; need more resources to allow out-of-order

Superscalar Laundry: Parallel per stage
[Figure: tasks A to F in task order against time, 6 PM to 2 AM; loads labeled (light clothing), (dark clothing), (very dirty clothing), (light clothing), (dark clothing), (very dirty clothing)]
More resources, HW to match mix of parallel tasks?

Superscalar Laundry: Mismatch Mix
[Figure: tasks A to D in task order against time, 6 PM to 2 AM; loads labeled (light clothing), (light clothing), (dark clothing), (light clothing)]
Task mix underutilizes extra resources

Real-world pipelining problem
You're the manager of a HUGE assembly plant to build computers.
Main pipeline: 10 minutes/pipeline stage, 60 stages. Latency: 10h
Problem: need to run a 2 h test before done... help!

Real-world pipelining problem solution 1
You remember: a pipeline's frequency is limited by its slowest stage, so
Main pipeline: 2 hours/pipeline stage (was 10 minutes), 60 stages. Latency: 120h (was 10h)
Problem: need to run a 2 h test before done... help!

Real-world pipelining problem solution 2
Create a sub-pipeline!
Main pipeline: 10 minutes/pipeline stage, 60 stages
2h test (12 CPUs in this pipeline)
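The numbers in the assembly-plant problem can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: the helper `latency_hours` and the constant names are invented for illustration, and the 12-hour total latency for solution 2 is derived here (main pipeline plus test) rather than quoted from the slides.

```python
STAGE_MINUTES = 10   # main pipeline: 10 minutes per stage
STAGES = 60          # main pipeline: 60 stages
TEST_MINUTES = 120   # the 2 h test every computer must pass


def latency_hours(stage_minutes, stages):
    """Time for one unit to traverse the whole pipeline, in hours."""
    return stage_minutes * stages / 60


# Baseline: 60 stages at 10 minutes each.
baseline = latency_hours(STAGE_MINUTES, STAGES)

# Solution 1: slow every stage down to the 2 h test time.
solution1 = latency_hours(TEST_MINUTES, STAGES)

# Solution 2: keep the 10-minute clock and pipeline the test itself,
# so this many computers are under test at any moment.
units_in_test = TEST_MINUTES // STAGE_MINUTES
solution2 = baseline + TEST_MINUTES / 60

print(baseline)       # 10.0
print(solution1)      # 120.0
print(units_in_test)  # 12
print(solution2)      # 12.0
```

Solution 2 still finishes one computer every 10 minutes, whereas solution 1 finishes one every 2 hours.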

Peer Instruction (1/2)
Assume 1 inst/clock, delayed branch, 5-stage pipeline, forwarding, interlock on unresolved load hazards (after 10^3 loops, so pipeline full)

  Loop: lw    $t0, 0($s1)
        addu  $t0, $t0, $s2
        sw    $t0, 0($s1)
        addiu $s1, $s1, -4
        bne   $s1, $zero, Loop
        nop

How many pipeline stages (clock cycles) per loop iteration to execute this code?

Peer Instruction Answer (1/2)
Assume 1 inst/clock, delayed branch, 5-stage pipeline, forwarding, interlock on unresolved load hazards (after 10^3 iterations, so pipeline full).

  Loop: 1. lw    $t0, 0($s1)
        2. (data hazard so stall)
        3. addu  $t0, $t0, $s2
        4. sw    $t0, 0($s1)
        5. addiu $s1, $s1, -4
        6. (!= in DCD)
        7. bne   $s1, $zero, Loop
        8. nop   (delayed branch so exec. nop)

How many pipeline stages (clock cycles) per loop iteration to execute this code?

Peer Instruction (2/2)
Assume 1 inst/clock, delayed branch, 5-stage pipeline, forwarding, interlock on unresolved load hazards (after 10^3 loops, so pipeline full). Rewrite this code to reduce pipeline stages (clock cycles) per loop to as few as possible.

  Loop: lw    $t0, 0($s1)
        addu  $t0, $t0, $s2
        sw    $t0, 0($s1)
        addiu $s1, $s1, -4
        bne   $s1, $zero, Loop
        nop

How many pipeline stages (clock cycles) per loop iteration to execute this code?

Peer Instruction (2/2): How long to execute?
Rewrite this code to reduce clock cycles per loop to as few as possible:

  Loop: 1. lw    $t0, 0($s1)
        2. addiu $s1, $s1, -4
        3. addu  $t0, $t0, $s2      (no hazard since extra cycle)
        4. bne   $s1, $zero, Loop
        5. sw    $t0, +4($s1)       (modified sw to put past addiu)

How many pipeline stages (clock cycles) per loop iteration to execute your revised code? (assume pipeline is full)
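The cycle counts for the two versions of the loop can be reproduced with a very small stall model. This is a sketch, not a pipeline simulator: the tuple encoding of instructions and the function `cycles_per_iteration` are made up for this example, and the model only knows the rules stated in the slides (forwarding, a one-cycle load-use interlock, the branch compared in the decode stage, and a delay slot that always executes).

```python
BRANCHES = {"beq", "bne"}


def cycles_per_iteration(loop):
    """Count issue cycles for one steady-state iteration of a loop.

    loop: list of (opcode, destination register or None, source registers).
    Assumed rules: one instruction per clock with forwarding, so the only
    stalls come from an instruction that reads the result of the
    instruction directly before it when
      - that producer is a load (1 stall; 2 if the reader is a branch), or
      - the reader is a branch compared in decode (1 stall).
    """
    cycles = 0
    prev_op, prev_dest, _ = loop[-1]  # pipeline is full: previous iteration
    for op, dest, srcs in loop:
        stall = 0
        if prev_dest is not None and prev_dest in srcs:
            if prev_op == "lw":
                stall = 2 if op in BRANCHES else 1
            elif op in BRANCHES:
                stall = 1
        cycles += 1 + stall
        prev_op, prev_dest = op, dest
    return cycles


original = [
    ("lw", "$t0", ["$s1"]),
    ("addu", "$t0", ["$t0", "$s2"]),
    ("sw", None, ["$t0", "$s1"]),
    ("addiu", "$s1", ["$s1"]),
    ("bne", None, ["$s1", "$zero"]),
    ("nop", None, []),
]

rewritten = [
    ("lw", "$t0", ["$s1"]),
    ("addiu", "$s1", ["$s1"]),
    ("addu", "$t0", ["$t0", "$s2"]),
    ("bne", None, ["$s1", "$zero"]),
    ("sw", None, ["$t0", "$s1"]),  # fills the branch-delay slot
]

print(cycles_per_iteration(original))   # 8
print(cycles_per_iteration(rewritten))  # 5
```

The reordering removes both stalls and replaces the nop with the store, which is why the store's offset changes to +4 once it sits after the addiu.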

Administrivia
- HW5 due TODAY 7/29
- Quiz9 due Wednesday 7/30
- HW6 due Friday 8/1
- Proj3 out soon, due next Tuesday 8/5
  - Will be hand graded in person, signups will be posted soon
- Midterm regrades due TODAY 7/29
- Proj1 grades out, proj2 hopefully soon; appeals due 7/31

Administrivia
- Lab on polling/interrupts is cancelled
  - We will give everyone 4 pts on that lab
- Drop or grading option deadline August 1
  - summer.berkeley.edu for more details

The Big Picture
Computer:
- Processor (active): Control ("brain"), Datapath ("brawn")
- Memory (passive) (where programs, data live when running)
- Devices: Input, Output (Keyboard, Mouse; Disk, Network; Display, Printer)

Memory Hierarchy
Storage in computer systems:
- Processor
  - holds data in register file (~100 Bytes)
  - registers accessed on nanosecond timescale
- Memory (we'll call "main memory")
  - more capacity than registers (~Gbytes)
  - access time ~ ns
  - hundreds of clock cycles per memory access?!
- Disk
  - HUGE capacity (virtually limitless)
  - VERY slow: runs ~milliseconds

Motivation: Why We Use Caches (written $)
[Figure: performance over time for CPU and DRAM. µProc: 60%/yr. DRAM: 7%/yr. Processor-Memory Performance Gap: grows 50%/year]
- first Intel CPU with cache on chip
- 1998 Pentium III has two levels of cache on chip

Memory Caching
- Mismatch between processor and memory speeds leads us to add a new level: a memory cache
- Implemented with the same IC processing technology as the CPU (usually integrated on the same chip): faster but more expensive than DRAM memory.
- Cache is a copy of a subset of main memory.
- Most processors have separate caches for instructions and data.

Memory Hierarchy
[Figure: Processor at the top, then Level 1, Level 2, Level 3, ..., Level n, running from higher to lower levels in the memory hierarchy; increasing distance from processor, decreasing speed; width shows size of memory at each level]
As we move to deeper levels the latency goes up and price per bit goes down.

Memory Hierarchy
If a level is closer to the processor, it is:
- smaller
- faster
- a subset of lower levels (contains most recently used data)
Lowest level (usually disk) contains all available data (or does it go beyond the disk?)
Memory hierarchy presents the processor with the illusion of a very large, very fast memory.

Memory Hierarchy Analogy: Library (1/2)
You're writing a term paper (Processor) at a table in Doe
Doe Library is equivalent to disk
- essentially limitless capacity
- very slow to retrieve a book
Table is main memory
- smaller capacity: means you must return a book when the table fills up
- easier and faster to find a book there once you've already retrieved it

Memory Hierarchy Analogy: Library (2/2)
Open books on table are cache
- smaller capacity: can have very few open books fit on table; again, when table fills up, you must close a book
- much, much faster to retrieve data
Illusion created: whole library open on the tabletop
- Keep as many recently used books open on table as possible since likely to use again
- Also keep as many books on table as possible, since faster than going to library

Memory Hierarchy Basis
Cache contains copies of data in memory that are being used.
Memory contains copies of data on disk that are being used.
Caches work on the principles of temporal and spatial locality.
- Temporal locality: if we use it now, chances are we'll want to use it again soon.
- Spatial locality: if we use a piece of memory, chances are we'll use the neighboring pieces soon.

Cache Design
How do we organize cache?
Where does each memory address map to? (Remember that cache is a subset of memory, so multiple memory addresses map to the same cache location.)
How do we know which elements are in cache?
How do we quickly locate them?

Direct-Mapped Cache (1/4)
In a direct-mapped cache, each memory address is associated with one possible block within the cache
- Therefore, we only need to look in a single location in the cache for the data if it exists in the cache
- Block is the unit of transfer between cache and memory

Direct-Mapped Cache (2/4)
[Figure: memory addresses up to F mapped onto the cache indexes of a 4 Byte Direct Mapped Cache]
4 Byte Direct Mapped Cache, block size = 1 byte
Cache Location 0 can be occupied by data from:
- Memory location 0, 4, 8, ...
- 4 blocks => any memory location that is a multiple of 4
What if we wanted a block to be bigger than one byte?

Direct-Mapped Cache (3/4)
[Figure: memory addresses at two-byte intervals up to 1E, etc., mapped onto the cache indexes of an 8 Byte Direct Mapped Cache]
8 Byte Direct Mapped Cache, block size = 2 bytes
When we ask for a byte, the system finds out the right block, and loads it all!
- How does it know the right block?
- How do we select the byte?
E.g., mem address 11101?
How does it know WHICH colored block it originated from?
- What do you do at baggage claim?

Direct-Mapped Cache (4/4)
[Figure: memory (addresses shown, at two-byte intervals up to 1E, etc.) next to an 8 Byte Direct Mapped Cache w/Tag!, with Tag and Data columns for each cache index; block size = 2 bytes]
What should go in the tag?
- Do we need the entire address?
  - What do all these tags have in common?
- What did we do with the immediate when we were branch addressing, always count by bytes?
- Why not count by cache #?
It's useful to draw memory with the same width as the block size
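For the "mem address 11101" question above, the lookup can be worked out with two divisions. The sketch below assumes the 8-byte cache with 2-byte blocks from the slide (so 4 blocks); the function name `locate` and its parameter names are invented for this example.

```python
def locate(addr, block_bytes, num_blocks):
    """Split a byte address for a direct-mapped cache using plain division.

    Returns (tag, index, offset):
      offset: which byte inside the block
      index:  which cache row the block must live in
      tag:    which of the many blocks sharing that row this one is
    """
    block_number, offset = divmod(addr, block_bytes)
    tag, index = divmod(block_number, num_blocks)
    return tag, index, offset


# 8-byte cache, 2-byte blocks => 4 blocks; address 11101 in binary is 29.
tag, index, offset = locate(0b11101, block_bytes=2, num_blocks=4)
print(tag, index, offset)  # 3 2 1
```

Read against the bits of 11101, the low bit (1) is the byte offset, the next two bits (10) are the cache index, and the remaining bits (11) are what would have to be stored as the tag.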

Issues with Direct-Mapped
Since multiple memory addresses map to the same cache index, how do we tell which one is in there?
What if we have a block size > 1 byte?
Answer: divide memory address into three fields

  ttttttttttttttttt iiiiiiiiii oooo
  tag:         to check if we have the correct block
  index:       to select the block
  byte offset: within the block

Direct-Mapped Cache Terminology
All fields are read as unsigned integers.
- Index: specifies the cache index (which "row"/block of the cache we should look in)
- Offset: once we've found the correct block, specifies which byte within the block we want
- Tag: the remaining bits after offset and index are determined; these are used to distinguish between all the memory addresses that map to the same location
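Because all three fields are read as unsigned integers, they can be pulled out of an address with shifts and masks. The following is a minimal sketch; `split_address` is a hypothetical helper, and the 10-bit index and 4-bit offset are taken from the field diagram above.

```python
def split_address(addr, index_bits, offset_bits):
    """Split an address into (tag, index, offset) bit fields.

    The offset is the lowest offset_bits bits, the index is the next
    index_bits bits, and the tag is everything that remains above them.
    """
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


tag, index, offset = split_address(0x12345678, index_bits=10, offset_bits=4)
print(hex(tag), hex(index), hex(offset))  # 0x48d1 0x167 0x8

# Putting the fields back together recovers the original address.
assert (tag << 14) | (index << 4) | offset == 0x12345678
```

Two addresses with the same index but different tags compete for the same cache row, which is exactly the case the tag comparison is there to detect.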

TIO: Dan's great cache mnemonic
Address fields: Tag | Index | Offset
AREA (cache size, B) = HEIGHT (# of blocks) * WIDTH (size of one block, B/block)
2^(H+W) = 2^H * 2^W
[Figure: the Index field (H bits) gives the height of the cache and the Offset field (W bits) gives its width]

Direct-Mapped Cache Example (1/3)
Suppose we have 16 KB of data in a direct-mapped cache with 4-word blocks
Determine the size of the tag, index and offset fields if we're using a 32-bit architecture
Offset
- need to specify correct byte within a block
- block contains 4 words = 16 bytes = 2^4 bytes
- need 4 bits to specify correct byte

Direct-Mapped Cache Example (2/3)
Index: (~index into an "array of blocks")
- need to specify correct block in cache
- cache contains 16 KB = 2^14 bytes
- block contains 2^4 bytes (4 words)
- # blocks/cache = (bytes/cache) / (bytes/block) = (2^14 bytes/cache) / (2^4 bytes/block) = 2^10 blocks/cache
- need 10 bits to specify this many blocks

Direct-Mapped Cache Example (3/3)
Tag: use remaining bits as tag
- tag length = addr length - offset - index = 32 - 4 - 10 bits = 18 bits
- so tag is leftmost 18 bits of memory address
Why not the full 32-bit address as tag?
- All bytes within a block need the same address (4b)
- Index must be the same for every address within a block, so it's redundant in the tag check, thus can leave off to save memory (here 10 bits)
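The three-step calculation in this example generalizes to any power-of-two cache and block size. The sketch below is illustrative: `field_sizes` is a made-up helper name, and it assumes both sizes are powers of two, as they are in the example.

```python
def field_sizes(cache_bytes, block_bytes, addr_bits=32):
    """Return (tag_bits, index_bits, offset_bits) for a direct-mapped cache.

    Assumes cache_bytes and block_bytes are powers of two and that
    cache_bytes counts data only (no tag or valid-bit storage).
    """
    for size in (cache_bytes, block_bytes):
        if size <= 0 or size & (size - 1):
            raise ValueError("sizes must be positive powers of two")
    offset_bits = block_bytes.bit_length() - 1   # log2(bytes per block)
    num_blocks = cache_bytes // block_bytes
    index_bits = num_blocks.bit_length() - 1     # log2(blocks per cache)
    tag_bits = addr_bits - index_bits - offset_bits
    return tag_bits, index_bits, offset_bits


# 16 KB of data, 4-word (16-byte) blocks, 32-bit addresses.
print(field_sizes(16 * 1024, 16))  # (18, 10, 4)
```

The result matches the slides: 4 offset bits, 10 index bits, and the remaining 18 bits as the tag.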

Caching Terminology
When we try to read memory, 3 things can happen:
1. cache hit: cache block is valid and contains proper address, so read desired word
2. cache miss: nothing in cache in appropriate block, so fetch from memory
3. cache miss, block replacement: wrong data is in cache at appropriate block, so discard it and fetch desired data from memory (cache always copy)

Peer Instruction
Consider an address split into fields for cache access as follows:

  ttttttttttttttttttttttt iiiiii oooo

How big are the cache blocks in words?
How many entries does the cache have?
How big is a cache entry?
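The three outcomes listed under Caching Terminology can be seen in a toy model that stores only tags. This is a sketch under assumptions: the class `DirectMappedCache` is invented for illustration, it tracks no data and no separate valid bit (an empty row is represented by `None`), and the field widths follow the peer-instruction layout above (6 index bits, 4 offset bits, so 64 entries of 16 bytes each).

```python
class DirectMappedCache:
    """Toy direct-mapped cache that tracks only the tag stored in each row."""

    def __init__(self, index_bits, offset_bits):
        self.index_bits = index_bits
        self.offset_bits = offset_bits
        self.tags = [None] * (1 << index_bits)  # None means "row is invalid"

    def access(self, addr):
        """Classify a read of addr, then load its block into the cache."""
        index = (addr >> self.offset_bits) & ((1 << self.index_bits) - 1)
        tag = addr >> (self.offset_bits + self.index_bits)
        stored = self.tags[index]
        if stored == tag:
            return "hit"
        self.tags[index] = tag  # fetch the block from memory
        return "miss" if stored is None else "miss, block replacement"


cache = DirectMappedCache(index_bits=6, offset_bits=4)
for addr in (0x000, 0x004, 0x400, 0x000):
    print(hex(addr), cache.access(addr))
# 0x0 miss
# 0x4 hit
# 0x400 miss, block replacement
# 0x0 miss, block replacement
```

Addresses 0x000 and 0x400 share index 0 but differ in tag, so they keep evicting each other even though the other 63 rows are empty.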

In Conclusion
Pipeline challenge is hazards
- Forwarding helps w/many data hazards
- Delayed branch helps with control hazard in 5-stage pipeline
- Load delay slot / interlock necessary
More aggressive performance:
- Superscalar
- Out-of-order execution
Use caches to simulate fast, large memory


More information

CS 61C: Great Ideas in Computer Architecture. Direct Mapped Caches

CS 61C: Great Ideas in Computer Architecture. Direct Mapped Caches CS 61C: Great Ideas in Computer Architecture Direct Mapped Caches Instructor: Justin Hsia 7/05/2012 Summer 2012 Lecture #11 1 Review of Last Lecture Floating point (single and double precision) approximates

More information

Journal of World s Electrical Engineering and Technology J. World. Elect. Eng. Tech. 1(1): 12-16, 2012

Journal of World s Electrical Engineering and Technology J. World. Elect. Eng. Tech. 1(1): 12-16, 2012 2011, Scienceline Publication www.science-line.com Jounal of Wold s Electical Engineeing and Technology J. Wold. Elect. Eng. Tech. 1(1): 12-16, 2012 JWEET An Efficient Algoithm fo Lip Segmentation in Colo

More information

DYNAMIC STORAGE ALLOCATION. Hanan Samet

DYNAMIC STORAGE ALLOCATION. Hanan Samet ds0 DYNAMIC STORAGE ALLOCATION Hanan Samet Compute Science Depatment and Cente fo Automation Reseach and Institute fo Advanced Compute Studies Univesity of Mayland College Pak, Mayland 074 e-mail: hjs@umiacs.umd.edu

More information

EE 6900: Interconnection Networks for HPC Systems Fall 2016

EE 6900: Interconnection Networks for HPC Systems Fall 2016 EE 6900: Inteconnection Netwoks fo HPC Systems Fall 2016 Avinash Kaanth Kodi School of Electical Engineeing and Compute Science Ohio Univesity Athens, OH 45701 Email: kodi@ohio.edu 1 Acknowledgement: Inteconnection

More information

Accelerating Storage with RDMA Max Gurtovoy Mellanox Technologies

Accelerating Storage with RDMA Max Gurtovoy Mellanox Technologies Acceleating Stoage with RDMA Max Gutovoy Mellanox Technologies 2018 Stoage Develope Confeence EMEA. Mellanox Technologies. All Rights Reseved. 1 What is RDMA? Remote Diect Memoy Access - povides the ability

More information

CS61C : Machine Structures

CS61C : Machine Structures inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture #24 Cache II 27-8-6 Scott Beamer, Instructor New Flow Based Routers CS61C L24 Cache II (1) www.anagran.com Caching Terminology When we try

More information

MIPS Pipelining. Computer Organization Architectures for Embedded Computing. Wednesday 8 October 14

MIPS Pipelining. Computer Organization Architectures for Embedded Computing. Wednesday 8 October 14 MIPS Pipelining Computer Organization Architectures for Embedded Computing Wednesday 8 October 14 Many slides adapted from: Computer Organization and Design, Patterson & Hennessy 4th Edition, 2011, MK

More information

Cache Memory - II. Some of the slides are adopted from David Patterson (UCB)

Cache Memory - II. Some of the slides are adopted from David Patterson (UCB) Cache Memory - II Some of the slides are adopted from David Patterson (UCB) Outline Direct-Mapped Cache Types of Cache Misses A (long) detailed example Peer - to - peer education example Block Size Tradeoff

More information

3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle?

3/12/2014. Single Cycle (Review) CSE 2021: Computer Organization. Single Cycle with Jump. Multi-Cycle Implementation. Why Multi-Cycle? CSE 2021: Computer Organization Single Cycle (Review) Lecture-10b CPU Design : Pipelining-1 Overview, Datapath and control Shakil M. Khan 2 Single Cycle with Jump Multi-Cycle Implementation Instruction:

More information

The Java Virtual Machine. Compiler construction The structure of a frame. JVM stacks. Lecture 2

The Java Virtual Machine. Compiler construction The structure of a frame. JVM stacks. Lecture 2 Compile constuction 2009 Lectue 2 Code geneation 1: Geneating code The Java Vitual Machine Data types Pimitive types, including intege and floating-point types of vaious sizes and the boolean type. The

More information

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Lecture 32: Pipeline Parallelism 3

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Lecture 32: Pipeline Parallelism 3 CS 61C: Great Ideas in Computer Architecture (Machine Structures) Lecture 32: Pipeline Parallelism 3 Instructor: Dan Garcia inst.eecs.berkeley.edu/~cs61c! Compu@ng in the News At a laboratory in São Paulo,

More information

A Memory Efficient Array Architecture for Real-Time Motion Estimation

A Memory Efficient Array Architecture for Real-Time Motion Estimation A Memoy Efficient Aay Achitectue fo Real-Time Motion Estimation Vasily G. Moshnyaga and Keikichi Tamau Depatment of Electonics & Communication, Kyoto Univesity Sakyo-ku, Yoshida-Honmachi, Kyoto 66-1, JAPAN

More information

All lengths in meters. E = = 7800 kg/m 3

All lengths in meters. E = = 7800 kg/m 3 Poblem desciption In this poblem, we apply the component mode synthesis (CMS) technique to a simple beam model. 2 0.02 0.02 All lengths in metes. E = 2.07 10 11 N/m 2 = 7800 kg/m 3 The beam is a fee-fee

More information

MapReduce Optimizations and Algorithms 2015 Professor Sasu Tarkoma

MapReduce Optimizations and Algorithms 2015 Professor Sasu Tarkoma apreduce Optimizations and Algoithms 2015 Pofesso Sasu Takoma www.cs.helsinki.fi Optimizations Reduce tasks cannot stat befoe the whole map phase is complete Thus single slow machine can slow down the

More information

IP Network Design by Modified Branch Exchange Method

IP Network Design by Modified Branch Exchange Method Received: June 7, 207 98 IP Netwok Design by Modified Banch Method Kaiat Jaoenat Natchamol Sichumoenattana 2* Faculty of Engineeing at Kamphaeng Saen, Kasetsat Univesity, Thailand 2 Faculty of Management

More information

Memory Hierarchy. Mehran Rezaei

Memory Hierarchy. Mehran Rezaei Memory Hierarchy Mehran Rezaei What types of memory do we have? Registers Cache (Static RAM) Main Memory (Dynamic RAM) Disk (Magnetic Disk) Option : Build It Out of Fast SRAM About 5- ns access Decoders

More information

a Not yet implemented in current version SPARK: Research Kit Pointer Analysis Parameters Soot Pointer analysis. Objectives

a Not yet implemented in current version SPARK: Research Kit Pointer Analysis Parameters Soot Pointer analysis. Objectives SPARK: Soot Reseach Kit Ondřej Lhoták Objectives Spak is a modula toolkit fo flow-insensitive may points-to analyses fo Java, which enables expeimentation with: vaious paametes of pointe analyses which

More information

CS61C : Machine Structures

CS61C : Machine Structures CS61C L2 Caches II (1) inst.eecs.berkeley.edu/~cs61c/su5 CS61C : Machine Structures Lecture #2: Caches 2 25-7-26 Andy Carle Review: Direct-Mapped Cache Cache Memory Index 1 2 Memory Address 12 4 5 6 7

More information

CS61C : Machine Structures

CS61C : Machine Structures inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures CS 61C L21 Caches II (1) Lecture 21 Caches II 24-3-1 Lecturer PSOE Dan Garcia www.cs.berkeley.edu/~ddgarcia US Buys world s biggest RAM disk. 2.5TB!

More information

CS61C : Machine Structures

CS61C : Machine Structures inst.eecs.berkeley.edu/~cs61c/su5 CS61C : Machine Structures Lecture #2: Caches 2 25-7-26 CS61C L32 Caches II (1) Andy Carle A Carle, Summer 25 UCB Memory Address 12 Review: Direct-Mapped Cache 3 4 5 6

More information

Computer Architecture CS372 Exam 3

Computer Architecture CS372 Exam 3 Name: Computer Architecture CS372 Exam 3 This exam has 7 pages. Please make sure you have all of them. Write your name on this page and initials on every other page now. You may only use the green card

More information

Spiral Recognition Methodology and Its Application for Recognition of Chinese Bank Checks

Spiral Recognition Methodology and Its Application for Recognition of Chinese Bank Checks Spial Recognition Methodology and Its Application fo Recognition of Chinese Bank Checks Hanshen Tang 1, Emmanuel Augustin 2, Ching Y. Suen 1, Olivie Baet 2, Mohamed Cheiet 3 1 Cente fo Patten Recognition

More information

GARBAGE COLLECTION METHODS. Hanan Samet

GARBAGE COLLECTION METHODS. Hanan Samet gc0 GARBAGE COLLECTION METHODS Hanan Samet Compute Science Depatment and Cente fo Automation Reseach and Institute fo Advanced Compute Studies Univesity of Mayland College Pak, Mayland 07 e-mail: hjs@umiacs.umd.edu

More information

arxiv: v1 [cs.lo] 3 Dec 2018

arxiv: v1 [cs.lo] 3 Dec 2018 A high-level opeational semantics fo hadwae weak memoy models axiv:1812.00996v1 [cs.lo] 3 Dec 2018 Abstact Robet J. Colvin School of Electical Engineeing and Infomation Technology The Univesity of Queensland

More information

Modeling a shared medium access node with QoS distinction

Modeling a shared medium access node with QoS distinction Modeling a shaed medium access node with QoS distinction Matthias Gies, Jonas Geutet Compute Engineeing and Netwoks Laboatoy (TIK) Swiss Fedeal Institute of Technology Züich CH-8092 Züich, Switzeland email:

More information

Overview of Control. CS 152 Computer Architecture and Engineering Lecture 11. Multicycle Controller Design

Overview of Control. CS 152 Computer Architecture and Engineering Lecture 11. Multicycle Controller Design S 152 ompute chitectue and Engineeing Lectue 11 Multicycle ontolle Design Oveview of ontol ontol may be designed using one of seveal initial epesentations. The choice of sequence contol, and how logic

More information

High performance CUDA based CNN image processor

High performance CUDA based CNN image processor High pefomance UDA based NN image pocesso GEORGE VALENTIN STOIA, RADU DOGARU, ELENA RISTINA STOIA Depatment of Applied Electonics and Infomation Engineeing Univesity Politehnica of Buchaest -3, Iuliu Maniu

More information

A modal estimation based multitype sensor placement method

A modal estimation based multitype sensor placement method A modal estimation based multitype senso placement method *Xue-Yang Pei 1), Ting-Hua Yi 2) and Hong-Nan Li 3) 1),)2),3) School of Civil Engineeing, Dalian Univesity of Technology, Dalian 116023, China;

More information

Image Enhancement in the Spatial Domain. Spatial Domain

Image Enhancement in the Spatial Domain. Spatial Domain 8-- Spatial Domain Image Enhancement in the Spatial Domain What is spatial domain The space whee all pixels fom an image In spatial domain we can epesent an image by f( whee x and y ae coodinates along

More information

On the Conversion between Binary Code and Binary-Reflected Gray Code on Boolean Cubes

On the Conversion between Binary Code and Binary-Reflected Gray Code on Boolean Cubes On the Convesion between Binay Code and BinayReflected Gay Code on Boolean Cubes The Havad community has made this aticle openly available. Please shae how this access benefits you. You stoy mattes Citation

More information

ASSIGN 01: Due Monday Feb 04 PART 1 Get a Sketchbook: 8.5 x 11 (Minimum size 5 x7 ) fo keeping a design jounal and a place to keep poject eseach & ideas. Make sue you have you Dopbox account and/o Flash

More information

Direct-Mapped Cache Terminology. Caching Terminology. TIO Dan s great cache mnemonic. Accessing data in a direct mapped cache

Direct-Mapped Cache Terminology. Caching Terminology. TIO Dan s great cache mnemonic. Accessing data in a direct mapped cache Lecturer SOE Dan Garcia inst.eecs.berkeley.edu/~cs61c UCB CS61C : Machine Structures Lecture 31 Caches II 2008-04-14 Hi to Yi Luo from Seattle, WA! In this week s Science, IBM researchers describe a new

More information

Processor design - MIPS

Processor design - MIPS EASY Processor design - MIPS Q.1 What happens when a register is loaded? 1. The bits of the register are set to all ones. 2. The bit pattern in the register is copied to a location in memory. 3. A bit

More information

Module 6 STILL IMAGE COMPRESSION STANDARDS

Module 6 STILL IMAGE COMPRESSION STANDARDS Module 6 STILL IMAE COMPRESSION STANDARDS Lesson 17 JPE-2000 Achitectue and Featues Instuctional Objectives At the end of this lesson, the students should be able to: 1. State the shotcomings of JPE standad.

More information

Memory. Lecture 22 CS301

Memory. Lecture 22 CS301 Memory Lecture 22 CS301 Administrative Daily Review of today s lecture w Due tomorrow (11/13) at 8am HW #8 due today at 5pm Program #2 due Friday, 11/16 at 11:59pm Test #2 Wednesday Pipelined Machine Fetch

More information

A Novel Parallel Deadlock Detection Algorithm and Architecture

A Novel Parallel Deadlock Detection Algorithm and Architecture A Novel Paallel Deadlock Detection Aloithm and Achitectue Pun H. Shiu 2, Yudon Tan 2, Vincent J. Mooney III {ship, ydtan, mooney}@ece.atech.ed }@ece.atech.edu http://codesin codesin.ece.atech.eduedu,2

More information

Computer Architecture. Lecture 6.1: Fundamentals of

Computer Architecture. Lecture 6.1: Fundamentals of CS3350B Computer Architecture Winter 2015 Lecture 6.1: Fundamentals of Instructional Level Parallelism Marc Moreno Maza www.csd.uwo.ca/courses/cs3350b [Adapted from lectures on Computer Organization and

More information

c. What are the machine cycle times (in nanoseconds) of the non-pipelined and the pipelined implementations?

c. What are the machine cycle times (in nanoseconds) of the non-pipelined and the pipelined implementations? Brown University School of Engineering ENGN 164 Design of Computing Systems Professor Sherief Reda Homework 07. 140 points. Due Date: Monday May 12th in B&H 349 1. [30 points] Consider the non-pipelined

More information

CS 230 Practice Final Exam & Actual Take-home Question. Part I: Assembly and Machine Languages (22 pts)

CS 230 Practice Final Exam & Actual Take-home Question. Part I: Assembly and Machine Languages (22 pts) Part I: Assembly and Machine Languages (22 pts) 1. Assume that assembly code for the following variable definitions has already been generated (and initialization of A and length). int powerof2; /* powerof2

More information

Multidimensional Testing

Multidimensional Testing Multidimensional Testing QA appoach fo Stoage netwoking Yohay Lasi Visuality Systems 1 Intoduction Who I am Yohay Lasi, QA Manage at Visuality Systems Visuality Systems the leading commecial povide of

More information

Communication vs Distributed Computation: an alternative trade-off curve

Communication vs Distributed Computation: an alternative trade-off curve Communication vs Distibuted Computation: an altenative tade-off cuve Yahya H. Ezzeldin, Mohammed amoose, Chistina Fagouli Univesity of Califonia, Los Angeles, CA 90095, USA, Email: {yahya.ezzeldin, mkamoose,

More information

CMPT 300 Introduction to Operating Systems

CMPT 300 Introduction to Operating Systems CMPT 300 Introduction to Operating Systems Cache 0 Acknowledgement: some slides are taken from CS61C course material at UC Berkeley Agenda Memory Hierarchy Direct Mapped Caches Cache Performance Set Associative

More information

lecture 18 cache 2 TLB miss TLB - TLB (hit and miss) - instruction or data cache - cache (hit and miss)

lecture 18 cache 2 TLB miss TLB - TLB (hit and miss) - instruction or data cache - cache (hit and miss) lecture 18 2 virtual physical virtual physical - TLB ( and ) - instruction or data - ( and ) Wed. March 16, 2016 Last lecture I discussed the TLB and how virtual es are translated to physical es. I only

More information

Memory Hierarchy, Fully Associative Caches. Instructor: Nick Riasanovsky

Memory Hierarchy, Fully Associative Caches. Instructor: Nick Riasanovsky Memory Hierarchy, Fully Associative Caches Instructor: Nick Riasanovsky Review Hazards reduce effectiveness of pipelining Cause stalls/bubbles Structural Hazards Conflict in use of datapath component Data

More information

Realistic Memories and. 2-level Data Cache Interface (0,n)

Realistic Memories and. 2-level Data Cache Interface (0,n) Realistic Meoies and Caches Pat III Li-Shiuan Peh Copute Science & Atificial Intelligence Lab. Massachusetts Institute of Technology Apil 4, 2012 http://csg.csail.it.edu/6.s078 L15-1 2-level Data Cache

More information

EN2910A: Advanced Computer Architecture Topic 02: Review of classical concepts

EN2910A: Advanced Computer Architecture Topic 02: Review of classical concepts EN2910A: Advanced Computer Architecture Topic 02: Review of classical concepts Prof. Sherief Reda School of Engineering Brown University S. Reda EN2910A FALL'15 1 Classical concepts (prerequisite) 1. Instruction

More information

Physical simulation for animation

Physical simulation for animation Physical simulation fo animation Case study: The jello cube The Jello Cube Mass-Sping System Collision Detection Integatos Septembe 17 2002 1 Announcements Pogamming assignment 3 is out. It is due Tuesday,

More information

Query Language #1/3: Relational Algebra Pure, Procedural, and Set-oriented

Query Language #1/3: Relational Algebra Pure, Procedural, and Set-oriented Quey Language #1/3: Relational Algeba Pue, Pocedual, and Set-oiented To expess a quey, we use a set of opeations. Each opeation takes one o moe elations as input paamete (set-oiented). Since each opeation

More information

How many times is the loop executed? middle = (left+right)/2; if (value == arr[middle]) return true;

How many times is the loop executed? middle = (left+right)/2; if (value == arr[middle]) return true; This lectue Complexity o binay seach Answes to inomal execise Abstact data types Stacks ueues ADTs, Stacks, ueues 1 binayseach(int[] a, int value) { while (ight >= let) { { i (value < a[middle]) ight =

More information

CSF Cache Introduction. [Adapted from Computer Organization and Design, Patterson & Hennessy, 2005]

CSF Cache Introduction. [Adapted from Computer Organization and Design, Patterson & Hennessy, 2005] CSF Cache Introduction [Adapted from Computer Organization and Design, Patterson & Hennessy, 2005] Review: The Memory Hierarchy Take advantage of the principle of locality to present the user with as much

More information

DEADLOCK AVOIDANCE IN BATCH PROCESSES. M. Tittus K. Åkesson

DEADLOCK AVOIDANCE IN BATCH PROCESSES. M. Tittus K. Åkesson DEADLOCK AVOIDANCE IN BATCH PROCESSES M. Tittus K. Åkesson Univesity College Boås, Sweden, e-mail: Michael.Tittus@hb.se Chalmes Univesity of Technology, Gothenbug, Sweden, e-mail: ka@s2.chalmes.se Abstact:

More information

Lecture 9. Pipeline Hazards. Christos Kozyrakis Stanford University

Lecture 9. Pipeline Hazards. Christos Kozyrakis Stanford University Lecture 9 Pipeline Hazards Christos Kozyrakis Stanford University http://eeclass.stanford.edu/ee18b 1 Announcements PA-1 is due today Electronic submission Lab2 is due on Tuesday 2/13 th Quiz1 grades will

More information

Final Exam Fall 2008

Final Exam Fall 2008 COE 308 Computer Architecture Final Exam Fall 2008 page 1 of 8 Saturday, February 7, 2009 7:30 10:00 AM Computer Engineering Department College of Computer Sciences & Engineering King Fahd University of

More information