COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5. Large and Fast: Exploiting Memory Hierarchy

COMPUTER ORGANIZATION AND DESIGN: The Hardware/Software Interface, ARM Edition
Chapter 5. Large and Fast: Exploiting Memory Hierarchy

5.1 Introduction

Principle of Locality
- Programs access a small proportion of their address space at any time
- Temporal locality
  - Items accessed recently are likely to be accessed again soon
  - E.g., instructions in a loop, induction variables
- Spatial locality
  - Items near those accessed recently are likely to be accessed soon
  - E.g., sequential instruction access, array data

Taking Advantage of Locality
- Memory hierarchy
- Store everything on disk
- Copy recently accessed (and nearby) items from disk to smaller DRAM memory
  - Main memory
- Copy more recently accessed (and nearby) items from DRAM to smaller SRAM memory
  - Cache memory attached to CPU

Memory Hierarchy Levels
- Block (aka line): unit of copying
  - May be multiple words
- If accessed data is present in upper level
  - Hit: access satisfied by upper level
  - Hit ratio: hits/accesses
- If accessed data is absent
  - Miss: block copied from lower level
  - Time taken: miss penalty
  - Miss ratio: misses/accesses = 1 - hit ratio
  - Then accessed data supplied from upper level

5.2 Memory Technologies

Memory Technology
- Static RAM (SRAM): 0.5ns to 2.5ns, $2000 to $5000 per GB
- Dynamic RAM (DRAM): 50ns to 70ns, $20 to $75 per GB
- Magnetic disk: 5ms to 20ms, $0.20 to $2 per GB
- Ideal memory
  - Access time of SRAM
  - Capacity and cost/GB of disk

DRAM Technology
- Data stored as a charge in a capacitor
  - Single transistor used to access the charge
  - Must periodically be refreshed
    - Read contents and write back
    - Performed on a DRAM "row"

Advanced DRAM Organization
- Bits in a DRAM are organized as a rectangular array
  - DRAM accesses an entire row
  - Burst mode: supply successive words from a row with reduced latency
- Double data rate (DDR) DRAM
  - Transfer on rising and falling clock edges
- Quad data rate (QDR) DRAM
  - Separate DDR inputs and outputs

DRAM Generations
[Table: year, capacity, and $/GB for successive DRAM generations, from Kbit parts through Mbit parts to a Gbit part at $50/GB. Chart: Trac and Tcac by year, '80 through '07.]

DRAM Performance Factors
- Row buffer
  - Allows several words to be read and refreshed in parallel
- Synchronous DRAM
  - Allows for consecutive accesses in bursts without needing to send each address
  - Improves bandwidth
- DRAM banking
  - Allows simultaneous access to multiple DRAMs
  - Improves bandwidth

Increasing Memory Bandwidth
- 4-word wide memory
  - Miss penalty = 1 + 15 + 1 = 17 bus cycles
  - Bandwidth = 16 bytes / 17 cycles = 0.94 B/cycle
- 4-bank interleaved memory
  - Miss penalty = 1 + 15 + 4×1 = 20 bus cycles
  - Bandwidth = 16 bytes / 20 cycles = 0.8 B/cycle
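The miss-penalty arithmetic for the different memory organizations can be sketched as a few small functions. This is a minimal sketch, assuming the bus timing the deck uses for its baseline example (1 bus cycle to send the address, 15 per DRAM access, 1 per data transfer); the function names are illustrative and not from the slides, and the interleaved case assumes at least as many banks as words per block.

```c
/* Bus timing assumed from the deck's example; all values in bus cycles. */
enum { ADDR_CYCLES = 1, DRAM_CYCLES = 15, XFER_CYCLES = 1 };

/* 1-word-wide DRAM: every word needs its own DRAM access and its own transfer. */
int penalty_narrow(int words_per_block) {
    return ADDR_CYCLES + words_per_block * DRAM_CYCLES + words_per_block * XFER_CYCLES;
}

/* Memory and bus as wide as the block: one DRAM access, one transfer. */
int penalty_wide(void) {
    return ADDR_CYCLES + DRAM_CYCLES + XFER_CYCLES;
}

/* Interleaved banks: the DRAM accesses overlap, the transfers stay serial. */
int penalty_interleaved(int words_per_block) {
    return ADDR_CYCLES + DRAM_CYCLES + words_per_block * XFER_CYCLES;
}

/* Bandwidth in bytes per bus cycle for one block fill. */
double bandwidth(int block_bytes, int penalty_cycles) {
    return (double)block_bytes / penalty_cycles;
}
```

For a 4-word (16-byte) block these give 65, 17, and 20 bus cycles for the narrow, wide, and interleaved organizations respectively.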

6.4 Flash Storage (from Chapter 6, Storage and Other I/O Topics)

Flash Storage
- Nonvolatile semiconductor storage
  - 100× to 1000× faster than disk
  - Smaller, lower power, more robust
  - But more $/GB (between disk and DRAM)

Flash Types
- NOR flash: bit cell like a NOR gate
  - Random read/write access
  - Used for instruction memory in embedded systems
- NAND flash: bit cell like a NAND gate
  - Denser (bits/area), but block-at-a-time access
  - Cheaper per GB
  - Used for USB keys, media storage, ...
- Flash bits wear out after 1000s of accesses
  - Not suitable for direct RAM or disk replacement
  - Wear leveling: remap data to less used blocks

6.3 Disk Storage

Disk Storage
- Nonvolatile, rotating magnetic storage

Disk Sectors and Access
- Each sector records
  - Sector ID
  - Data (512 bytes, 4096 bytes proposed)
  - Error correcting code (ECC)
    - Used to hide defects and recording errors
  - Synchronization fields and gaps
- Access to a sector involves
  - Queuing delay if other accesses are pending
  - Seek: move the heads
  - Rotational latency
  - Data transfer
  - Controller overhead

Disk Access Example
- Given
  - 512B sector, 15,000rpm, 4ms average seek time, 100MB/s transfer rate, 0.2ms controller overhead, idle disk
- Average read time
  - 4ms seek time
  - + ½ / (15,000/60) = 2ms rotational latency
  - + 512 / 100MB/s = 0.005ms transfer time
  - + 0.2ms controller delay
  - = 6.2ms
- If actual average seek time is 1ms
  - Average read time = 3.2ms

Disk Performance Issues
- Manufacturers quote average seek time
  - Based on all possible seeks
  - Locality and OS scheduling lead to smaller actual average seek times
- Smart disk controllers allocate physical sectors on disk
  - Present logical sector interface to host
  - SCSI, ATA, SATA
- Disk drives include caches
  - Prefetch sectors in anticipation of access
  - Avoid seek and rotational delay
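The disk access example above adds four components: seek, half a rotation, sector transfer, and controller overhead. A minimal sketch of that sum follows; the function name and parameter order are illustrative, and it assumes an idle disk (no queuing delay) and MB = 10^6 bytes.

```c
/* Average time to read one sector from an idle disk, in milliseconds. */
double avg_read_ms(double seek_ms, double rpm, double sector_bytes,
                   double transfer_mb_per_s, double controller_ms) {
    /* On average the head waits half a revolution for the sector. */
    double rotation_ms = 0.5 / (rpm / 60.0) * 1000.0;
    /* Time to stream the sector at the sustained transfer rate. */
    double transfer_ms = sector_bytes / (transfer_mb_per_s * 1e6) * 1000.0;
    return seek_ms + rotation_ms + transfer_ms + controller_ms;
}
```

Calling `avg_read_ms(4.0, 15000.0, 512.0, 100.0, 0.2)` reproduces the roughly 6.2ms figure; lowering the seek time to 1ms gives roughly 3.2ms, which shows how strongly the seek term dominates.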

5.3 The Basics of Caches

Cache Memory
- Cache memory
  - The level of the memory hierarchy closest to the CPU
- Given accesses X1, ..., Xn-1, Xn
  - How do we know if the data is present?
  - Where do we look?

Direct Mapped Cache
- Location determined by address
- Direct mapped: only one choice
  - (Block address) modulo (#Blocks in cache)
- #Blocks is a power of 2
- Use low-order address bits

Tags and Valid Bits
- How do we know which particular block is stored in a cache location?
  - Store block address as well as the data
  - Actually, only need the high-order bits
  - Called the tag
- What if there is no data in a location?
  - Valid bit: 1 = present, 0 = not present
  - Initially 0

Cache Example
- 8 blocks, 1 word/block, direct mapped
- Initial state

  Index  V  Tag  Data
  000    N
  001    N
  010    N
  011    N
  100    N
  101    N
  110    N
  111    N

Cache Example

  Word addr  Binary addr  Hit/miss  Cache block
  22         10 110       Miss      110

  Index  V  Tag  Data
  000    N
  001    N
  010    N
  011    N
  100    N
  101    N
  110    Y  10   Mem[10110]
  111    N

Cache Example

  Word addr  Binary addr  Hit/miss  Cache block
  26         11 010       Miss      010

  Index  V  Tag  Data
  000    N
  001    N
  010    Y  11   Mem[11010]
  011    N
  100    N
  101    N
  110    Y  10   Mem[10110]
  111    N

Cache Example

  Word addr  Binary addr  Hit/miss  Cache block
  22         10 110       Hit       110
  26         11 010       Hit       010

  Index  V  Tag  Data
  000    N
  001    N
  010    Y  11   Mem[11010]
  011    N
  100    N
  101    N
  110    Y  10   Mem[10110]
  111    N

Cache Example

  Word addr  Binary addr  Hit/miss  Cache block
  16         10 000       Miss      000
  3          00 011       Miss      011
  16         10 000       Hit       000

  Index  V  Tag  Data
  000    Y  10   Mem[10000]
  001    N
  010    Y  11   Mem[11010]
  011    Y  00   Mem[00011]
  100    N
  101    N
  110    Y  10   Mem[10110]
  111    N

Cache Example

  Word addr  Binary addr  Hit/miss  Cache block
  18         10 010       Miss      010

  Index  V  Tag  Data
  000    Y  10   Mem[10000]
  001    N
  010    Y  10   Mem[10010]
  011    Y  00   Mem[00011]
  100    N
  101    N
  110    Y  10   Mem[10110]
  111    N

Address Subdivision
[Figure: address subdivided into tag, index, and offset fields]
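The cache example above can be replayed with a small direct-mapped cache model. This is a minimal sketch, assuming the access sequence implied by the tables (word addresses 22, 26, 22, 26, 16, 3, 16, 18); the type and function names are illustrative, and the model tracks only valid bits and tags, not data.

```c
#include <stdbool.h>

#define NUM_BLOCKS 8   /* 8 one-word blocks, as in the example */

struct line {
    bool     valid;
    unsigned tag;
};

/* Look up a word address. Returns true on a hit; on a miss the block
   is brought in, replacing whatever occupied that index. */
bool dm_access(struct line cache[NUM_BLOCKS], unsigned word_addr) {
    unsigned index = word_addr % NUM_BLOCKS;   /* low-order 3 bits */
    unsigned tag   = word_addr / NUM_BLOCKS;   /* remaining high-order bits */
    if (cache[index].valid && cache[index].tag == tag)
        return true;
    cache[index].valid = true;
    cache[index].tag   = tag;
    return false;
}

/* Replay a sequence of word addresses on an initially empty cache
   and count the misses. */
int dm_count_misses(const unsigned *addrs, int n) {
    struct line cache[NUM_BLOCKS] = {{false, 0}};
    int misses = 0;
    for (int i = 0; i < n; i++)
        if (!dm_access(cache, addrs[i]))
            misses++;
    return misses;
}
```

Replaying the eight accesses yields five misses; the last one (address 18) evicts address 26 because both map to index 010.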

Example: Larger Block Size
- 64 blocks, 16 bytes/block
  - To what block number does address 1200 map?
- Block address = ⌊1200/16⌋ = 75
- Block number = 75 modulo 64 = 11

  Bits   31 to 10  9 to 4  3 to 0
  Field  Tag       Index   Offset
  Width  22 bits   6 bits  4 bits

Block Size Considerations
- Larger blocks should reduce miss rate
  - Due to spatial locality
- But in a fixed-sized cache
  - Larger blocks ⇒ fewer of them
    - More competition ⇒ increased miss rate
  - Larger blocks ⇒ pollution
- Larger miss penalty
  - Can override benefit of reduced miss rate
  - Early restart and critical-word-first can help
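Returning to the address 1200 example, the tag/index/offset split is plain integer division and remainder. The sketch below is illustrative only: `struct fields` and `split_address` are hypothetical names, and the code assumes a direct-mapped cache whose block count and block size are powers of two.

```c
#include <stdint.h>

struct fields {
    uint32_t tag;
    uint32_t index;
    uint32_t offset;
};

/* Split a byte address into tag, index, and offset for a direct-mapped
   cache with num_blocks blocks of block_bytes bytes each. */
struct fields split_address(uint32_t addr, uint32_t num_blocks, uint32_t block_bytes) {
    uint32_t block_addr = addr / block_bytes;   /* floor(addr / block size) */
    struct fields f;
    f.offset = addr % block_bytes;              /* byte within the block */
    f.index  = block_addr % num_blocks;         /* block number in the cache */
    f.tag    = block_addr / num_blocks;         /* remaining high-order bits */
    return f;
}
```

With 64 blocks of 16 bytes, `split_address(1200, 64, 16)` gives block address 75 internally and an index of 11, matching the slide.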

Cache Misses
- On cache hit, CPU proceeds normally
- On cache miss
  - Stall the CPU pipeline
  - Fetch block from next level of hierarchy
  - Instruction cache miss
    - Restart instruction fetch
  - Data cache miss
    - Complete data access

Write-Through
- On data-write hit, could just update the block in cache
  - But then cache and memory would be inconsistent
- Write through: also update memory
- But makes writes take longer
  - E.g., if base CPI = 1, 10% of instructions are stores, write to memory takes 100 cycles
    - Effective CPI = 1 + 0.1×100 = 11
- Solution: write buffer
  - Holds data waiting to be written to memory
  - CPU continues immediately
    - Only stalls on write if write buffer is already full

Write-Back
- Alternative: on data-write hit, just update the block in cache
  - Keep track of whether each block is dirty
- When a dirty block is replaced
  - Write it back to memory
  - Can use a write buffer to allow replacing block to be read first

Write Allocation
- What should happen on a write miss?
- Alternatives for write-through
  - Allocate on miss: fetch the block
  - Write around: don't fetch the block
    - Since programs often write a whole block before reading it (e.g., initialization)
- For write-back
  - Usually fetch the block

Example: Intrinsity FastMATH
- Embedded MIPS processor
  - 12-stage pipeline
  - Instruction and data access on each cycle
- Split cache: separate I-cache and D-cache
  - Each 16KB: 256 blocks × 16 words/block
  - D-cache: write-through or write-back
- SPEC2000 miss rates
  - I-cache: 0.4%
  - D-cache: 11.4%
  - Weighted average: 3.2%

Example: Intrinsity FastMATH
[Figure: the Intrinsity FastMATH cache]

Main Memory Supporting Caches
- Use DRAMs for main memory
  - Fixed width (e.g., 1 word)
  - Connected by fixed-width clocked bus
    - Bus clock is typically slower than CPU clock
- Example cache block read
  - 1 bus cycle for address transfer
  - 15 bus cycles per DRAM access
  - 1 bus cycle per data transfer
- For 4-word block, 1-word-wide DRAM
  - Miss penalty = 1 + 4×15 + 4×1 = 65 bus cycles
  - Bandwidth = 16 bytes / 65 cycles = 0.25 B/cycle

5.4 Measuring and Improving Cache Performance

Measuring Cache Performance
- Components of CPU time
  - Program execution cycles
    - Includes cache hit time
  - Memory stall cycles
    - Mainly from cache misses
- With simplifying assumptions:

  Memory stall cycles
    = (Memory accesses / Program) × Miss rate × Miss penalty
    = (Instructions / Program) × (Misses / Instruction) × Miss penalty

Cache Performance Example
- Given
  - I-cache miss rate = 2%
  - D-cache miss rate = 4%
  - Miss penalty = 100 cycles
  - Base CPI (ideal cache) = 2
  - Loads and stores are 36% of instructions
- Miss cycles per instruction
  - I-cache: 0.02 × 100 = 2
  - D-cache: 0.36 × 0.04 × 100 = 1.44
- Actual CPI = 2 + 2 + 1.44 = 5.44
  - Ideal CPU is 5.44/2 = 2.72 times faster

Average Access Time
- Hit time is also important for performance
- Average memory access time (AMAT)
  - AMAT = Hit time + Miss rate × Miss penalty
- Example
  - CPU with 1ns clock, hit time = 1 cycle, miss penalty = 20 cycles, I-cache miss rate = 5%
  - AMAT = 1 + 0.05 × 20 = 2ns
    - 2 cycles per instruction
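Both calculations above are one-line formulas, and writing them as functions makes it easy to vary one parameter at a time. This is a minimal sketch; the function names are illustrative, and `effective_cpi` assumes that every instruction makes one instruction fetch and that only loads and stores access the D-cache.

```c
/* Effective CPI with separate I-cache and D-cache miss rates.
   mem_ops_per_instr is the fraction of instructions that are loads or stores. */
double effective_cpi(double base_cpi, double icache_miss_rate,
                     double dcache_miss_rate, double mem_ops_per_instr,
                     double miss_penalty) {
    double i_stalls = icache_miss_rate * miss_penalty;
    double d_stalls = mem_ops_per_instr * dcache_miss_rate * miss_penalty;
    return base_cpi + i_stalls + d_stalls;
}

/* Average memory access time, in the same unit as hit_time and miss_penalty. */
double amat(double hit_time, double miss_rate, double miss_penalty) {
    return hit_time + miss_rate * miss_penalty;
}
```

With the slide's numbers, `effective_cpi(2.0, 0.02, 0.04, 0.36, 100.0)` comes to about 5.44 and `amat(1.0, 0.05, 20.0)` to about 2 cycles.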

Performance Summary
- When CPU performance increases
  - Miss penalty becomes more significant
- Decreasing base CPI
  - Greater proportion of time spent on memory stalls
- Increasing clock rate
  - Memory stalls account for more CPU cycles
- Can't neglect cache behavior when evaluating system performance

Associative Caches
- Fully associative
  - Allow a given block to go in any cache entry
  - Requires all entries to be searched at once
  - Comparator per entry (expensive)
- n-way set associative
  - Each set contains n entries
  - Block number determines which set
    - (Block number) modulo (#Sets in cache)
  - Search all entries in a given set at once
  - n comparators (less expensive)

Associative Cache Example
[Figure: associative cache example]

Spectrum of Associativity
- For a cache with 8 entries
[Figure: spectrum of associativity]

Associativity Example
- Compare 4-block caches
  - Direct mapped, 2-way set associative, fully associative
  - Block access sequence: 0, 8, 0, 6, 8
- Direct mapped

  Block    Cache  Hit/miss  Cache content after access
  address  index            0       1       2       3
  0        0      miss      Mem[0]
  8        0      miss      Mem[8]
  0        0      miss      Mem[0]
  6        2      miss      Mem[0]          Mem[6]
  8        0      miss      Mem[8]          Mem[6]

Associativity Example
- 2-way set associative

  Block    Cache  Hit/miss  Cache content after access
  address  index            Set 0           Set 1
  0        0      miss      Mem[0]
  8        0      miss      Mem[0]  Mem[8]
  0        0      hit       Mem[0]  Mem[8]
  6        0      miss      Mem[0]  Mem[6]
  8        0      miss      Mem[8]  Mem[6]

- Fully associative

  Block    Hit/miss  Cache content after access
  address
  0        miss      Mem[0]
  8        miss      Mem[0]  Mem[8]
  0        hit       Mem[0]  Mem[8]
  6        miss      Mem[0]  Mem[8]  Mem[6]
  8        hit       Mem[0]  Mem[8]  Mem[6]
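The three tables above can be reproduced with one small simulator that takes the associativity as a parameter. This is a minimal sketch, assuming LRU replacement with a preference for invalid entries; `count_misses` and `MAX_BLOCKS` are illustrative names, the code requires `num_blocks <= MAX_BLOCKS` and `ways` dividing `num_blocks`, and for simplicity it stores the whole block address as the tag.

```c
#include <stdbool.h>

#define MAX_BLOCKS 16   /* illustrative upper bound on cache size */

/* Count misses for a cache of num_blocks blocks organized as sets of
   `ways` entries (ways = 1: direct mapped; ways = num_blocks: fully
   associative). Replacement within a set is LRU. */
int count_misses(const int *block_addrs, int n, int num_blocks, int ways) {
    int  tag[MAX_BLOCKS];
    int  last_used[MAX_BLOCKS];
    bool valid[MAX_BLOCKS] = {false};
    int  num_sets = num_blocks / ways;
    int  misses = 0;

    for (int t = 0; t < n; t++) {
        int  base   = (block_addrs[t] % num_sets) * ways;  /* first entry of the set */
        int  victim = base;
        bool hit    = false;

        for (int e = base; e < base + ways; e++) {
            if (valid[e] && tag[e] == block_addrs[t]) {
                hit = true;
                victim = e;          /* reuse `victim` as the entry that was hit */
                break;
            }
            /* Prefer the first invalid entry; otherwise the least recently used. */
            if (!valid[e]) {
                if (valid[victim])
                    victim = e;
            } else if (valid[victim] && last_used[e] < last_used[victim]) {
                victim = e;
            }
        }
        if (!hit) {
            misses++;
            valid[victim] = true;
            tag[victim]   = block_addrs[t];
        }
        last_used[victim] = t;       /* record the time of this access */
    }
    return misses;
}
```

For the sequence 0, 8, 0, 6, 8 on a 4-block cache, the function reports 5 misses with `ways = 1`, 4 with `ways = 2`, and 3 with `ways = 4`, the same counts as the tables.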

How Much Associativity
- Increased associativity decreases miss rate
  - But with diminishing returns
- Simulation of a system with 64KB D-cache, 16-word blocks, SPEC2000
  - 1-way: 10.3%
  - 2-way: 8.6%
  - 4-way: 8.3%
  - 8-way: 8.1%

Set Associative Cache Organization
[Figure: set associative cache organization]

Replacement Policy
- Direct mapped: no choice
- Set associative
  - Prefer non-valid entry, if there is one
  - Otherwise, choose among entries in the set
- Least-recently used (LRU)
  - Choose the one unused for the longest time
    - Simple for 2-way, manageable for 4-way, too hard beyond that
- Random
  - Gives approximately the same performance as LRU for high associativity

Multilevel Caches
- Primary cache attached to CPU
  - Small, but fast
- Level-2 cache services misses from primary cache
  - Larger, slower, but still faster than main memory
- Main memory services L-2 cache misses
- Some high-end systems include L-3 cache

Multilevel Cache Example
- Given
  - CPU base CPI = 1, clock rate = 4GHz
  - Miss rate/instruction = 2%
  - Main memory access time = 100ns
- With just primary cache
  - Miss penalty = 100ns/0.25ns = 400 cycles
  - Effective CPI = 1 + 0.02 × 400 = 9

Example (cont.)
- Now add L-2 cache
  - Access time = 5ns
  - Global miss rate to main memory = 0.5%
- Primary miss with L-2 hit
  - Penalty = 5ns/0.25ns = 20 cycles
- Primary miss with L-2 miss
  - Extra penalty = 400 cycles
- CPI = 1 + 0.02 × 20 + 0.005 × 400 = 3.4
- Performance ratio = 9/3.4 = 2.6
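The two CPI figures in this example follow the same pattern: base CPI plus, for each level that can miss, the miss rate per instruction times the penalty in cycles. A minimal sketch with illustrative function names, assuming miss rates are expressed per instruction as on the slide:

```c
/* Convert an access time to clock cycles: ns × (cycles per ns). */
double penalty_cycles(double access_ns, double clock_ghz) {
    return access_ns * clock_ghz;
}

/* CPI when every primary-cache miss goes to main memory. */
double cpi_l1_only(double base_cpi, double l1_miss_rate, double mem_cycles) {
    return base_cpi + l1_miss_rate * mem_cycles;
}

/* CPI with an L-2 cache: every primary miss pays the L-2 access,
   and the misses that also miss in L-2 pay the main memory penalty on top. */
double cpi_with_l2(double base_cpi, double l1_miss_rate, double l2_cycles,
                   double global_miss_rate, double mem_cycles) {
    return base_cpi + l1_miss_rate * l2_cycles + global_miss_rate * mem_cycles;
}
```

At 4GHz, `penalty_cycles(100.0, 4.0)` is 400 and `penalty_cycles(5.0, 4.0)` is 20; plugging those in gives CPIs of about 9 and 3.4.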

Multilevel Cache Considerations
- Primary cache
  - Focus on minimal hit time
- L-2 cache
  - Focus on low miss rate to avoid main memory access
  - Hit time has less overall impact
- Results
  - L-1 cache usually smaller than a single cache
  - L-1 block size smaller than L-2 block size

Interactions with Advanced CPUs
- Out-of-order CPUs can execute instructions during cache miss
  - Pending store stays in load/store unit
  - Dependent instructions wait in reservation stations
    - Independent instructions continue
- Effect of miss depends on program data flow
  - Much harder to analyse
  - Use system simulation

Interactions with Software
- Misses depend on memory access patterns
  - Algorithm behavior
  - Compiler optimization for memory access

Software Optimization via Blocking
- Goal: maximize accesses to data before it is replaced
- Consider inner loops of DGEMM:

    for (int j = 0; j < n; ++j)
    {
        double cij = C[i+j*n];
        for (int k = 0; k < n; k++)
            cij += A[i+k*n] * B[k+j*n];
        C[i+j*n] = cij;
    }

DGEMM Access Pattern
- C, A, and B arrays
[Figure: access pattern for the C, A, and B arrays; legend: older accesses, new accesses]

Cache Blocked DGEMM

    #define BLOCKSIZE 32

    /* Multiply one BLOCKSIZE x BLOCKSIZE sub-block; matrices are n x n, column-major. */
    void do_block(int n, int si, int sj, int sk,
                  double *A, double *B, double *C)
    {
        for (int i = si; i < si + BLOCKSIZE; ++i)
            for (int j = sj; j < sj + BLOCKSIZE; ++j)
            {
                double cij = C[i+j*n];              /* cij = C[i][j] */
                for (int k = sk; k < sk + BLOCKSIZE; k++)
                    cij += A[i+k*n] * B[k+j*n];     /* cij += A[i][k]*B[k][j] */
                C[i+j*n] = cij;                     /* C[i][j] = cij */
            }
    }

    /* n is assumed to be a multiple of BLOCKSIZE. */
    void dgemm(int n, double *A, double *B, double *C)
    {
        for (int sj = 0; sj < n; sj += BLOCKSIZE)
            for (int si = 0; si < n; si += BLOCKSIZE)
                for (int sk = 0; sk < n; sk += BLOCKSIZE)
                    do_block(n, si, sj, sk, A, B, C);
    }

Blocked DGEMM Access Pattern
[Figure: access patterns, unoptimized vs. blocked]

5.5 Dependable Memory Hierarchy

Dependability
- Service accomplishment: service delivered as specified
- Service interruption: deviation from specified service
- A failure moves the system from service accomplishment to service interruption; restoration moves it back
- Fault: failure of a component
  - May or may not lead to system failure

Dependability Measures
- Reliability: mean time to failure (MTTF)
- Service interruption: mean time to repair (MTTR)
- Mean time between failures
  - MTBF = MTTF + MTTR
- Availability = MTTF / (MTTF + MTTR)
- Improving availability
  - Increase MTTF: fault avoidance, fault tolerance, fault forecasting
  - Reduce MTTR: improved tools and processes for diagnosis and repair

The Hamming SEC Code
- Hamming distance
  - Number of bits that are different between two bit patterns
- Minimum distance = 2 provides single bit error detection
  - E.g., parity code
- Minimum distance = 3 provides single error correction, 2 bit error detection

Encoding SEC
- To calculate Hamming code:
  - Number bits from 1 on the left
  - All bit positions that are a power of 2 are parity bits
  - Each parity bit checks certain data bits:
[Table: bit positions covered by each parity bit]

Decoding SEC
- Value of parity bits indicates which bits are in error
  - Use numbering from encoding procedure
  - E.g.
    - Parity bits = 0000 indicates no error
    - Parity bits = 1010 indicates bit 10 was flipped
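The encoding and decoding steps above can be sketched for an 8-bit data word, which yields a 12-bit code word with parity bits at positions 1, 2, 4, and 8. This is a minimal sketch under several assumptions not stated on the slides: even parity, the parity bit at position p checking every position whose number has bit p set, and the code word held in an `unsigned` with position 1 as the most significant of the 12 bits. All function names are illustrative.

```c
#include <stdbool.h>

#define CODE_BITS 12   /* 8 data bits + 4 parity bits */

/* Bit at position pos (1 = leftmost) of a 12-bit code word. */
static unsigned get_bit(unsigned word, int pos) {
    return (word >> (CODE_BITS - pos)) & 1u;
}

static bool is_power_of_two(int x) {
    return (x & (x - 1)) == 0;
}

/* XOR of the positions that hold a 1. This is 0 for a valid code word
   and equals the position of the flipped bit after a single-bit error. */
unsigned syndrome(unsigned word) {
    unsigned s = 0;
    for (int pos = 1; pos <= CODE_BITS; pos++)
        if (get_bit(word, pos))
            s ^= (unsigned)pos;
    return s;
}

/* Encode 8 data bits (leftmost data bit first) into a 12-bit code word. */
unsigned hamming_encode(unsigned data) {
    unsigned word = 0;
    int next = 7;                          /* next data bit to place, MSB first */
    for (int pos = 1; pos <= CODE_BITS; pos++) {
        if (is_power_of_two(pos))
            continue;                      /* positions 1, 2, 4, 8 hold parity */
        if ((data >> next--) & 1u)
            word |= 1u << (CODE_BITS - pos);
    }
    /* The syndrome of the data-only word names the parity bits that must be 1. */
    unsigned s = syndrome(word);
    for (int p = 1; p <= 8; p <<= 1)
        if (s & (unsigned)p)
            word |= 1u << (CODE_BITS - p);
    return word;
}

/* Return the code word with a single-bit error, if any, corrected. */
unsigned hamming_correct(unsigned word) {
    unsigned s = syndrome(word);
    if (s >= 1 && s <= CODE_BITS)
        word ^= 1u << (CODE_BITS - s);
    return word;
}
```

As a usage example, `hamming_encode(0x9A)` (data 1001 1010) produces 0x72A (0111 0010 1010). Flipping bit 10 of that word gives a syndrome of 10, binary 1010, which is the "bit 10 was flipped" case on the decoding slide, and `hamming_correct` restores 0x72A.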

SEC/DED Code
- Add an additional parity bit for the whole word (pn)
- Make Hamming distance = 4
- Decoding:
  - Let H = SEC parity bits
    - H even, pn even: no error
    - H odd, pn odd: correctable single bit error
    - H even, pn odd: error in pn bit
    - H odd, pn even: double error occurred
- Note: ECC DRAM uses SEC/DED with 8 bits protecting each 64 bits

5.6 Virtual Machines

Virtual Machines
- Host computer emulates guest operating system and machine resources
  - Improved isolation of multiple guests
  - Avoids security and reliability problems
  - Aids sharing of resources
- Virtualization has some performance impact
  - Feasible with modern high-performance computers
- Examples
  - IBM VM/370 (1970s technology!)
  - VMWare
  - Microsoft Virtual PC

Virtual Machine Monitor
- Maps virtual resources to physical resources
  - Memory, I/O devices, CPUs
- Guest code runs on native machine in user mode
  - Traps to VMM on privileged instructions and access to protected resources
- Guest OS may be different from host OS
- VMM handles real I/O devices
  - Emulates generic virtual I/O devices for guest

Example: Timer Virtualization
- In native machine, on timer interrupt
  - OS suspends current process, handles interrupt, selects and resumes next process
- With Virtual Machine Monitor
  - VMM suspends current VM, handles interrupt, selects and resumes next VM
- If a VM requires timer interrupts
  - VMM emulates a virtual timer
  - Emulates interrupt for VM when physical timer interrupt occurs

Instruction Set Support
- User and System modes
- Privileged instructions only available in system mode
  - Trap to system if executed in user mode
- All physical resources only accessible using privileged instructions
  - Including page tables, interrupt controls, I/O registers
- Renaissance of virtualization support
  - Current ISAs (e.g., x86) adapting

5.7 Virtual Memory

Virtual Memory
- Use main memory as a "cache" for secondary (disk) storage
  - Managed jointly by CPU hardware and the operating system (OS)
- Programs share main memory
  - Each gets a private virtual address space holding its frequently used code and data
  - Protected from other programs
- CPU and OS translate virtual addresses to physical addresses
  - VM "block" is called a page
  - VM translation "miss" is called a page fault

Address Translation
- Fixed-size pages (e.g., 4K)
[Figure: mapping of virtual addresses to physical addresses]

Page Fault Penalty
- On page fault, the page must be fetched from disk
  - Takes millions of clock cycles
  - Handled by OS code
- Try to minimize page fault rate
  - Fully associative placement
  - Smart replacement algorithms

Page Tables
- Stores placement information
  - Array of page table entries, indexed by virtual page number
  - Page table register in CPU points to page table in physical memory
- If page is present in memory
  - PTE stores the physical page number
  - Plus other status bits (referenced, dirty, ...)
- If page is not present
  - PTE can refer to location in swap space on disk

Translation Using a Page Table
[Figure: translation using a page table]
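The page table lookup described above amounts to splitting the virtual address, indexing an array, and checking a valid bit. The sketch below is illustrative only: it assumes 32-bit addresses, 4 KiB pages, and a single-level table, and `struct pte` and `translate` are hypothetical names rather than any real OS or hardware interface.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_BITS 12                    /* 4 KiB pages */
#define PAGE_SIZE (1u << PAGE_BITS)

struct pte {
    bool     valid;        /* page is present in physical memory */
    bool     referenced;   /* status bits kept alongside the mapping */
    bool     dirty;
    uint32_t ppn;          /* physical page number */
};

/* Translate a virtual address through a single-level page table.
   Returns true and stores the physical address on success;
   returns false to signal a page fault. */
bool translate(const struct pte *page_table, uint32_t num_pages,
               uint32_t vaddr, uint32_t *paddr) {
    uint32_t vpn    = vaddr >> PAGE_BITS;          /* virtual page number */
    uint32_t offset = vaddr & (PAGE_SIZE - 1);     /* unchanged by translation */
    if (vpn >= num_pages || !page_table[vpn].valid)
        return false;
    *paddr = (page_table[vpn].ppn << PAGE_BITS) | offset;
    return true;
}
```

If virtual page 2 maps to physical page 7, virtual address 0x2ABC translates to 0x7ABC: only the page number changes, and the 12-bit offset passes through. An access to a page whose valid bit is clear returns false, which is where an OS page fault handler would take over.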

Mapping Pages to Storage
[Figure: mapping pages to storage]

Replacement and Writes
- To reduce page fault rate, prefer least-recently used (LRU) replacement
  - Reference bit (aka use bit) in PTE set to 1 on access to page
  - Periodically cleared to 0 by OS
  - A page with reference bit = 0 has not been used recently
- Disk writes take millions of cycles
  - Block at once, not individual locations
  - Write through is impractical
  - Use write-back
  - Dirty bit in PTE set when page is written

Fast Translation Using a TLB
- Address translation would appear to require extra memory references
  - One to access the PTE
  - Then the actual memory access
- But access to page tables has good locality
  - So use a fast cache of PTEs within the CPU
  - Called a Translation Look-aside Buffer (TLB)
  - Typical: 16 to 512 PTEs, 0.5 to 1 cycle for hit, 10 to 100 cycles for miss, 0.01% to 1% miss rate
  - Misses could be handled by hardware or software

Fast Translation Using a TLB
[Figure: translation using a TLB]

TLB Misses
- If page is in memory
  - Load the PTE from memory and retry
  - Could be handled in hardware
    - Can get complex for more complicated page table structures
  - Or in software
    - Raise a special exception, with optimized handler
- If page is not in memory (page fault)
  - OS handles fetching the page and updating the page table
  - Then restart the faulting instruction

TLB Miss Handler
- TLB miss indicates
  - Page present, but PTE not in TLB
  - Page not present
- Must recognize TLB miss before destination register overwritten
  - Raise exception
- Handler copies PTE from memory to TLB
  - Then restarts instruction
  - If page not present, page fault will occur

Page Fault Handler
- Use faulting virtual address to find PTE
- Locate page on disk
- Choose page to replace
  - If dirty, write to disk first
- Read page into memory and update page table
- Make process runnable again
  - Restart from faulting instruction

TLB and Cache Interaction
- If cache tag uses physical address
  - Need to translate before cache lookup
- Alternative: use virtual address tag
  - Complications due to aliasing
    - Different virtual addresses for shared physical address

Memory Protection
- Different tasks can share parts of their virtual address spaces
  - But need to protect against errant access
  - Requires OS assistance
- Hardware support for OS protection
  - Privileged supervisor mode (aka kernel mode)
  - Privileged instructions
  - Page tables and other state information only accessible in supervisor mode
  - System call exception (e.g., syscall in MIPS)

5.8 A Common Framework for Memory Hierarchies

The Memory Hierarchy (The BIG Picture)
- Common principles apply at all levels of the memory hierarchy
  - Based on notions of caching
- At each level in the hierarchy
  - Block placement
  - Finding a block
  - Replacement on a miss
  - Write policy

Block Placement
- Determined by associativity
  - Direct mapped (1-way associative)
    - One choice for placement
  - n-way set associative
    - n choices within a set
  - Fully associative
    - Any location
- Higher associativity reduces miss rate
  - Increases complexity, cost, and access time

Finding a Block

  Associativity          Location method                                Tag comparisons
  Direct mapped          Index                                          1
  n-way set associative  Set index, then search entries within the set  n
  Fully associative      Search all entries                             #entries
                         Full lookup table                              0

- Hardware caches
  - Reduce comparisons to reduce cost
- Virtual memory
  - Full table lookup makes full associativity feasible
  - Benefit in reduced miss rate

Replacement
- Choice of entry to replace on a miss
  - Least recently used (LRU)
    - Complex and costly hardware for high associativity
  - Random
    - Close to LRU, easier to implement
- Virtual memory
  - LRU approximation with hardware support

Write Policy
- Write-through
  - Update both upper and lower levels
  - Simplifies replacement, but may require write buffer
- Write-back
  - Update upper level only
  - Update lower level when block is replaced
  - Need to keep more state
- Virtual memory
  - Only write-back is feasible, given disk write latency

Sources of Misses
- Compulsory misses (aka cold start misses)
  - First access to a block
- Capacity misses
  - Due to finite cache size
  - A replaced block is later accessed again
- Conflict misses (aka collision misses)
  - In a non-fully associative cache
  - Due to competition for entries in a set
  - Would not occur in a fully associative cache of the same total size

Cache Design Trade-offs

  Design change           Effect on miss rate         Negative performance effect
  Increase cache size     Decrease capacity misses    May increase access time
  Increase associativity  Decrease conflict misses    May increase access time
  Increase block size     Decrease compulsory misses  Increases miss penalty. For very large block size, may increase miss rate due to pollution.

5.9 Using a Finite State Machine to Control a Simple Cache

Cache Control
- Example cache characteristics
  - Direct-mapped, write-back, write allocate
  - Block size: 4 words (16 bytes)
  - Cache size: 16 KB (1024 blocks)
  - 32-bit byte addresses
  - Valid bit and dirty bit per block
  - Blocking cache
    - CPU waits until access is complete

  Bits   31 to 14  13 to 4  3 to 0
  Field  Tag       Index    Offset
  Width  18 bits   10 bits  4 bits

Interface Signals
- Between CPU and cache: Read/Write, Valid, Address (32 bits), Write Data (32 bits), Read Data (32 bits), Ready
- Between cache and memory: Read/Write, Valid, Address (32 bits), Write Data (128 bits), Read Data (128 bits), Ready
  - Multiple cycles per access

Finite State Machines
- Use an FSM to sequence control steps
- Set of states, transition on each clock edge
  - State values are binary encoded
  - Current state stored in a register
  - Next state = fn(current state, current inputs)
- Control output signals = fo(current state)

Cache Controller FSM
[Figure: cache controller state diagram]
- Could partition into separate states to reduce clock cycle time

5.10 Parallelism and Memory Hierarchies: Cache Coherence

Cache Coherence Problem
- Suppose two CPU cores share a physical address space
  - Write-through caches

  Time step  Event                CPU A's cache  CPU B's cache  Memory
  0                                                             0
  1          CPU A reads X        0                             0
  2          CPU B reads X        0              0              0
  3          CPU A writes 1 to X  1              0              1

Coherence Defined
- Informally: reads return most recently written value
- Formally:
  - P writes X; P reads X (no intervening writes) ⇒ read returns written value
  - P1 writes X; P2 reads X (sufficiently later) ⇒ read returns written value
    - c.f. CPU B reading X after step 3 in example
  - P1 writes X, P2 writes X ⇒ all processors see writes in the same order
    - End up with the same final value for X

Cache Coherence Protocols
- Operations performed by caches in multiprocessors to ensure coherence
  - Migration of data to local caches
    - Reduces bandwidth for shared memory
  - Replication of read-shared data
    - Reduces contention for access
- Snooping protocols
  - Each cache monitors bus reads/writes
- Directory-based protocols
  - Caches and memory record sharing status of blocks in a directory

Invalidating Snooping Protocols
- Cache gets exclusive access to a block when it is to be written
  - Broadcasts an invalidate message on the bus
  - Subsequent read in another cache misses
    - Owning cache supplies updated value

  CPU activity         Bus activity      CPU A's cache  CPU B's cache  Memory
                                                                       0
  CPU A reads X        Cache miss for X  0                             0
  CPU B reads X        Cache miss for X  0              0              0
  CPU A writes 1 to X  Invalidate for X  1                             0
  CPU B reads X        Cache miss for X  1              1              1

Memory Consistency
- When are writes seen by other processors?
  - "Seen" means a read returns the written value
  - Can't be instantaneously
- Assumptions
  - A write completes only when all processors have seen it
  - A processor does not reorder writes with other accesses
- Consequence
  - P writes X then writes Y ⇒ all processors that see new Y also see new X
  - Processors can reorder reads, but not writes

5.13 The ARM Cortex-A8 and Intel Core i7 Memory Hierarchies

Multilevel On-Chip Caches
[Figure: multilevel on-chip caches]

2-Level TLB Organization
[Table: 2-level TLB organization]

Supporting Multiple Issue
- Both have multi-banked caches that allow multiple accesses per cycle assuming no bank conflicts
- Core i7 cache optimizations
  - Return requested word first
  - Non-blocking cache
    - Hit under miss
    - Miss under miss
  - Data prefetching

5.14 Going Faster: Cache Blocking and Matrix Multiply

DGEMM
- Combine cache blocking and subword parallelism
[Figure: DGEMM performance]

5.15 Fallacies and Pitfalls

Pitfalls
- Byte vs. word addressing
  - Example: 32-byte direct-mapped cache, 4-byte blocks
    - Byte 36 maps to block 1
    - Word 36 maps to block 4
- Ignoring memory system effects when writing or generating code
  - Example: iterating over rows vs. columns of arrays
  - Large strides result in poor locality
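The byte vs. word addressing pitfall comes down to whether the address is divided by the block size before taking the modulo. A minimal sketch with illustrative function names, assuming one word per block for the word-addressed case as in the slide's example:

```c
/* Cache block that a byte address maps to: convert to a block address first. */
unsigned block_for_byte_addr(unsigned byte_addr, unsigned block_bytes,
                             unsigned num_blocks) {
    return (byte_addr / block_bytes) % num_blocks;
}

/* Cache block that a word address maps to, with one word per block:
   the word address already is the block address. */
unsigned block_for_word_addr(unsigned word_addr, unsigned num_blocks) {
    return word_addr % num_blocks;
}
```

A 32-byte cache with 4-byte blocks has 8 blocks, so `block_for_byte_addr(36, 4, 8)` is 1 (block address 9 modulo 8) while `block_for_word_addr(36, 8)` is 4.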

Pitfalls
- In multiprocessor with shared L2 or L3 cache
  - Less associativity than cores results in conflict misses
  - More cores ⇒ need to increase associativity
- Using AMAT to evaluate performance of out-of-order processors
  - Ignores effect of non-blocked accesses
  - Instead, evaluate performance by simulation

Pitfalls
- Extending address range using segments
  - E.g., Intel 80286
  - But a segment is not always big enough
  - Makes address arithmetic complicated
- Implementing a VMM on an ISA not designed for virtualization
  - E.g., non-privileged instructions accessing hardware resources
  - Either extend ISA, or require guest OS not to use problematic instructions

5.16 Concluding Remarks

Concluding Remarks
- Fast memories are small, large memories are slow
  - We really want fast, large memories
  - Caching gives this illusion
- Principle of locality
  - Programs use a small part of their memory space frequently
- Memory hierarchy
  - L1 cache ↔ L2 cache ↔ ... ↔ DRAM memory ↔ disk
- Memory system design is critical for multiprocessors


More information

LECTURE 4: LARGE AND FAST: EXPLOITING MEMORY HIERARCHY

LECTURE 4: LARGE AND FAST: EXPLOITING MEMORY HIERARCHY LECTURE 4: LARGE AND FAST: EXPLOITING MEMORY HIERARCHY Abridged version of Patterson & Hennessy (2013):Ch.5 Principle of Locality Programs access a small proportion of their address space at any time Temporal

More information

Course Site: Copyright 2012, Elsevier Inc. All rights reserved.

Course Site:   Copyright 2012, Elsevier Inc. All rights reserved. Course Site: http://cc.sjtu.edu.c/g2s/site/aca.html 1 Computer Architecture A Quatitative Approach, Fifth Editio Chapter 2 Memory Hierarchy Desig 2 Outlie Memory Hierarchy Cache Desig Basic Cache Optimizatios

More information

CMSC Computer Architecture Lecture 12: Virtual Memory. Prof. Yanjing Li University of Chicago

CMSC Computer Architecture Lecture 12: Virtual Memory. Prof. Yanjing Li University of Chicago CMSC 22200 Computer Architecture Lecture 12: Virtual Memory Prof. Yajig Li Uiversity of Chicago A System with Physical Memory Oly Examples: most Cray machies early PCs Memory early all embedded systems

More information

Chapter 5. Large and Fast: Exploiting Memory Hierarchy

Chapter 5. Large and Fast: Exploiting Memory Hierarchy Chapter 5 Large and Fast: Exploiting Memory Hierarchy Principle of Locality Programs access a small proportion of their address space at any time Temporal locality Items accessed recently are likely to

More information

Chapter 5. Large and Fast: Exploiting Memory Hierarchy

Chapter 5. Large and Fast: Exploiting Memory Hierarchy Chapter 5 Large and Fast: Exploiting Memory Hierarchy Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic disk 5ms 20ms, $0.20 $2 per

More information

Chapter 5. Large and Fast: Exploiting Memory Hierarchy

Chapter 5. Large and Fast: Exploiting Memory Hierarchy Chapter 5 Large and Fast: Exploiting Memory Hierarchy Review: Major Components of a Computer Processor Devices Control Memory Input Datapath Output Secondary Memory (Disk) Main Memory Cache Performance

More information

Computer Architecture Computer Science & Engineering. Chapter 5. Memory Hierachy BK TP.HCM

Computer Architecture Computer Science & Engineering. Chapter 5. Memory Hierachy BK TP.HCM Computer Architecture Computer Science & Engineering Chapter 5 Memory Hierachy Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic

More information

The University of Adelaide, School of Computer Science 22 November Computer Architecture. A Quantitative Approach, Sixth Edition.

The University of Adelaide, School of Computer Science 22 November Computer Architecture. A Quantitative Approach, Sixth Edition. Computer Architecture A Quatitative Approach, Sixth Editio Chapter 2 Memory Hierarchy Desig 1 Itroductio Programmers wat ulimited amouts of memory with low latecy Fast memory techology is more expesive

More information

COMPUTER ORGANIZATION AND DESIGN ARM

COMPUTER ORGANIZATION AND DESIGN ARM COMPUTER ORGANIZATION AND DESIGN ARM The Hardware/Software Interface Edition Chapter 5 Large and Fast: Exploiting Memory Hierarchy Διδάσκων Καθηγητής :Παρασκευάς Ευριπίδου Γραφείο :ΘΕΕ01 115 Τηλέφωνο :22892996

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition. Chapter 5. Large and Fast: Exploiting Memory Hierarchy

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition. Chapter 5. Large and Fast: Exploiting Memory Hierarchy COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 5 Large and Fast: Exploiting Memory Hierarchy Different Storage Memories Chapter 5 Large and Fast: Exploiting Memory

More information

Chapter 5 (Part II) Large and Fast: Exploiting Memory Hierarchy. Baback Izadi Division of Engineering Programs

Chapter 5 (Part II) Large and Fast: Exploiting Memory Hierarchy. Baback Izadi Division of Engineering Programs Chapter 5 (Part II) Baback Izadi Division of Engineering Programs bai@engr.newpaltz.edu Virtual Machines Host computer emulates guest operating system and machine resources Improved isolation of multiple

More information

CMSC Computer Architecture Lecture 10: Caches. Prof. Yanjing Li University of Chicago

CMSC Computer Architecture Lecture 10: Caches. Prof. Yanjing Li University of Chicago CMSC 22200 Computer Architecture Lecture 10: Caches Prof. Yajig Li Uiversity of Chicago Midterm Recap Overview ad fudametal cocepts ISA Uarch Datapath, cotrol Sigle cycle, multi cycle Pipeliig Basic idea,

More information

Chapter 5. Large and Fast: Exploiting Memory Hierarchy

Chapter 5. Large and Fast: Exploiting Memory Hierarchy Chapter 5 Large and Fast: Exploiting Memory Hierarchy Static RAM (SRAM) Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic disk 0.5ns 2.5ns, $2000 $5000 per GB 5.1 Introduction Memory Technology 5ms

More information

COMPUTER ORGANIZATION AND DESIGN

COMPUTER ORGANIZATION AND DESIGN COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 5 Large and Fast: Exploiting Memory Hierarchy Impact of Memory Access 5.1 Introduction Program accesses 7 GB of RAM

More information

Chapter 5. Large and Fast: Exploiting Memory Hierarchy

Chapter 5. Large and Fast: Exploiting Memory Hierarchy Chapter 5 Large and Fast: Exploiting Memory Hierarchy Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic disk 5ms 20ms, $0.20 $2 per

More information

5. Memory Hierarchy Computer Architecture COMP SCI 2GA3 / SFWR ENG 2GA3. Emil Sekerinski, McMaster University, Fall Term 2015/16

5. Memory Hierarchy Computer Architecture COMP SCI 2GA3 / SFWR ENG 2GA3. Emil Sekerinski, McMaster University, Fall Term 2015/16 5. Memory Hierarchy Computer Architecture COMP SCI 2GA3 / SFWR ENG 2GA3 Emil Sekerinski, McMaster University, Fall Term 2015/16 Movie Rental Store You have a huge warehouse with every movie ever made.

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface COEN-4710 Computer Hardware Lecture 7 Large and Fast: Exploiting Memory Hierarchy (Chapter 5) Cristinel Ababei Marquette University Department

More information

CMSC Computer Architecture Lecture 11: More Caches. Prof. Yanjing Li University of Chicago

CMSC Computer Architecture Lecture 11: More Caches. Prof. Yanjing Li University of Chicago CMSC 22200 Computer Architecture Lecture 11: More Caches Prof. Yajig Li Uiversity of Chicago Lecture Outlie Caches 2 Review Memory hierarchy Cache basics Locality priciples Spatial ad temporal How to access

More information

Chapter 5 B. Large and Fast: Exploiting Memory Hierarchy

Chapter 5 B. Large and Fast: Exploiting Memory Hierarchy Chapter 5 B Large and Fast: Exploiting Memory Hierarchy Dependability 5.5 Dependable Memory Hierarchy Chapter 6 Storage and Other I/O Topics 2 Dependability Service accomplishment Service delivered as

More information

Computer Organization and Structure. Bing-Yu Chen National Taiwan University

Computer Organization and Structure. Bing-Yu Chen National Taiwan University Computer Organization and Structure Bing-Yu Chen National Taiwan University Large and Fast: Exploiting Memory Hierarchy The Basic of Caches Measuring & Improving Cache Performance Virtual Memory A Common

More information

CS/ECE 3330 Computer Architecture. Chapter 5 Memory

CS/ECE 3330 Computer Architecture. Chapter 5 Memory CS/ECE 3330 Computer Architecture Chapter 5 Memory Last Chapter n Focused exclusively on processor itself n Made a lot of simplifying assumptions IF ID EX MEM WB n Reality: The Memory Wall 10 6 Relative

More information

Computer Architecture ELEC3441

Computer Architecture ELEC3441 CPU-Memory Bottleeck Computer Architecture ELEC44 CPU Memory Lecture 8 Cache Dr. Hayde Kwok-Hay So Departmet of Electrical ad Electroic Egieerig Performace of high-speed computers is usually limited by

More information

Chapter 5A. Large and Fast: Exploiting Memory Hierarchy

Chapter 5A. Large and Fast: Exploiting Memory Hierarchy Chapter 5A Large and Fast: Exploiting Memory Hierarchy Memory Technology Static RAM (SRAM) Fast, expensive Dynamic RAM (DRAM) In between Magnetic disk Slow, inexpensive Ideal memory Access time of SRAM

More information

Chapter 5. Large and Fast: Exploiting Memory Hierarchy. Jiang Jiang

Chapter 5. Large and Fast: Exploiting Memory Hierarchy. Jiang Jiang Chapter 5 Large and Fast: Exploiting Memory Hierarchy Jiang Jiang jiangjiang@ic.sjtu.edu.cn [Adapted from Computer Organization and Design, 4 th Edition, Patterson & Hennessy, 2008, MK] Chapter 5 Large

More information

Page 1. Why Care About the Memory Hierarchy? Memory. DRAMs over Time. Virtual Memory!

Page 1. Why Care About the Memory Hierarchy? Memory. DRAMs over Time. Virtual Memory! Why Care About the Memory Hierarchy? Memory Virtual Memory -DRAM Memory Gap (latecy) Reasos: Multi process systems (abstractio & memory protectio) Solutio: Tables (holdig per process traslatios) Fast traslatio

More information

Appendix D. Controller Implementation

Appendix D. Controller Implementation COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Appedix D Cotroller Implemetatio Cotroller Implemetatios Combiatioal logic (sigle-cycle); Fiite state machie (multi-cycle, pipelied);

More information

Chapter 5. Large and Fast: Exploiting Memory Hierarchy

Chapter 5. Large and Fast: Exploiting Memory Hierarchy Chapter 5 Large and Fast: Exploiting Memory Hierarchy Processor-Memory Performance Gap 10000 µproc 55%/year (2X/1.5yr) Performance 1000 100 10 1 1980 1983 1986 1989 Moore s Law Processor-Memory Performance

More information

Operating System Concepts. Operating System Concepts

Operating System Concepts. Operating System Concepts Chapter 4: Mass-Storage Systems Logical Disk Structure Logical Disk Structure Disk Schedulig Disk Maagemet RAID Structure Disk drives are addressed as large -dimesioal arrays of logical blocks, where the

More information

Memory Hierarchy Y. K. Malaiya

Memory Hierarchy Y. K. Malaiya Memory Hierarchy Y. K. Malaiya Acknowledgements Computer Architecture, Quantitative Approach - Hennessy, Patterson Vishwani D. Agrawal Review: Major Components of a Computer Processor Control Datapath

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor Advanced Issues

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor Advanced Issues COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 4 The Processor Advaced Issues Review: Pipelie Hazards Structural hazards Desig pipelie to elimiate structural hazards.

More information

Memory Technology. Caches 1. Static RAM (SRAM) Dynamic RAM (DRAM) Magnetic disk. Ideal memory. 0.5ns 2.5ns, $2000 $5000 per GB

Memory Technology. Caches 1. Static RAM (SRAM) Dynamic RAM (DRAM) Magnetic disk. Ideal memory. 0.5ns 2.5ns, $2000 $5000 per GB Memory Technology Caches 1 Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic disk 5ms 20ms, $0.20 $2 per GB Ideal memory Average access time similar

More information

Chapter 5. Large and Fast: Exploiting Memory Hierarchy

Chapter 5. Large and Fast: Exploiting Memory Hierarchy Chapter 5 Large and Fast: Exploiting Memory Hierarchy Processor-Memory Performance Gap 10000 µproc 55%/year (2X/1.5yr) Performance 1000 100 10 1 1980 1983 1986 1989 Moore s Law Processor-Memory Performance

More information

Virtual Memory. Reading. Sections 5.4, 5.5, 5.6, 5.8, 5.10 (2) Lecture notes from MKP and S. Yalamanchili

Virtual Memory. Reading. Sections 5.4, 5.5, 5.6, 5.8, 5.10 (2) Lecture notes from MKP and S. Yalamanchili Virtual Memory Lecture notes from MKP and S. Yalamanchili Sections 5.4, 5.5, 5.6, 5.8, 5.10 Reading (2) 1 The Memory Hierarchy ALU registers Cache Memory Memory Memory Managed by the compiler Memory Managed

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Part A Datapath Design

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Part A Datapath Design COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter The Processor Part A path Desig Itroductio CPU performace factors Istructio cout Determied by ISA ad compiler. CPI ad

More information

CS61C : Machine Structures

CS61C : Machine Structures CS 61C L24 VM II (1) ist.eecs.berkele.edu/~cs61c/su5 CS61C : Machie Structures Lecture #24: VM II Address Mappig: Virtual Address: VPN offset 25-8-2 Ad Carle idex ito page table located i phsical memor

More information

Multi-Threading. Hyper-, Multi-, and Simultaneous Thread Execution

Multi-Threading. Hyper-, Multi-, and Simultaneous Thread Execution Multi-Threadig Hyper-, Multi-, ad Simultaeous Thread Executio 1 Performace To Date Icreasig processor performace Pipeliig. Brach predictio. Super-scalar executio. Out-of-order executio. Caches. Hyper-Threadig

More information

CSC 220: Computer Organization Unit 11 Basic Computer Organization and Design

CSC 220: Computer Organization Unit 11 Basic Computer Organization and Design College of Computer ad Iformatio Scieces Departmet of Computer Sciece CSC 220: Computer Orgaizatio Uit 11 Basic Computer Orgaizatio ad Desig 1 For the rest of the semester, we ll focus o computer architecture:

More information

EN1640: Design of Computing Systems Topic 06: Memory System

EN1640: Design of Computing Systems Topic 06: Memory System EN164: Design of Computing Systems Topic 6: Memory System Professor Sherief Reda http://scale.engin.brown.edu Electrical Sciences and Computer Engineering School of Engineering Brown University Spring

More information

The Memory Hierarchy. Cache, Main Memory, and Virtual Memory (Part 2)

The Memory Hierarchy. Cache, Main Memory, and Virtual Memory (Part 2) The Memory Hierarchy Cache, Main Memory, and Virtual Memory (Part 2) Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University Cache Line Replacement The cache

More information

CS2410 Computer Architecture. Flynn s Taxonomy

CS2410 Computer Architecture. Flynn s Taxonomy CS2410 Computer Architecture Dept. of Computer Sciece Uiversity of Pittsburgh http://www.cs.pitt.edu/~melhem/courses/2410p/idex.html 1 Fly s Taxoomy SISD Sigle istructio stream Sigle data stream (SIMD)

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Single-Cycle Disadvantages & Advantages

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Single-Cycle Disadvantages & Advantages COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 4 The Processor Pipeliig Sigle-Cycle Disadvatages & Advatages Clk Uses the clock cycle iefficietly the clock cycle must

More information

Arquitectura de Computadores

Arquitectura de Computadores Arquitectura de Computadores Capítulo 5. Almaceamieto y otros aspectos de la E/S Based o the origial material of the book: D.A. Patterso y J.L. Heessy Computer Orgaizatio ad Desig: The Hardware/Software

More information

Virtual Memory. Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University

Virtual Memory. Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University Virtual Memory Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University Precise Definition of Virtual Memory Virtual memory is a mechanism for translating logical

More information

V. Primary & Secondary Memory!

V. Primary & Secondary Memory! V. Primary & Secondary Memory! Computer Architecture and Operating Systems & Operating Systems: 725G84 Ahmed Rezine 1 Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM)

More information

Caches. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Caches. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University Caches Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns

More information

EN1640: Design of Computing Systems Topic 06: Memory System

EN1640: Design of Computing Systems Topic 06: Memory System EN164: Design of Computing Systems Topic 6: Memory System Professor Sherief Reda http://scale.engin.brown.edu Electrical Sciences and Computer Engineering School of Engineering Brown University Spring

More information

Elementary Educational Computer

Elementary Educational Computer Chapter 5 Elemetary Educatioal Computer. Geeral structure of the Elemetary Educatioal Computer (EEC) The EEC coforms to the 5 uits structure defied by vo Neuma's model (.) All uits are preseted i a simplified

More information

CSE 2021: Computer Organization

CSE 2021: Computer Organization CSE 2021: Computer Organization Lecture-12a Caches-1 The basics of caches Shakil M. Khan Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB

More information

Computer Systems Laboratory Sungkyunkwan University

Computer Systems Laboratory Sungkyunkwan University Caches Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns

More information

CSE 2021: Computer Organization

CSE 2021: Computer Organization CSE 2021: Computer Organization Lecture-12 Caches-1 The basics of caches Shakil M. Khan Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB

More information

Multiprocessors. HPC Prof. Robert van Engelen

Multiprocessors. HPC Prof. Robert van Engelen Multiprocessors Prof. Robert va Egele Overview The PMS model Shared memory multiprocessors Basic shared memory systems SMP, Multicore, ad COMA Distributed memory multicomputers MPP systems Network topologies

More information

Basic allocator mechanisms The course that gives CMU its Zip! Memory Management II: Dynamic Storage Allocation Mar 6, 2000.

Basic allocator mechanisms The course that gives CMU its Zip! Memory Management II: Dynamic Storage Allocation Mar 6, 2000. 5-23 The course that gives CM its Zip Memory Maagemet II: Dyamic Storage Allocatio Mar 6, 2000 Topics Segregated lists Buddy system Garbage collectio Mark ad Sweep Copyig eferece coutig Basic allocator

More information

Lecture 1: Introduction and Fundamental Concepts 1

Lecture 1: Introduction and Fundamental Concepts 1 Uderstadig Performace Lecture : Fudametal Cocepts ad Performace Aalysis CENG 332 Algorithm Determies umber of operatios executed Programmig laguage, compiler, architecture Determie umber of machie istructios

More information

Uniprocessors. HPC Prof. Robert van Engelen

Uniprocessors. HPC Prof. Robert van Engelen Uiprocessors HPC Prof. Robert va Egele Overview PART I: Uiprocessors PART II: Multiprocessors ad ad Compiler Optimizatios Parallel Programmig Models Uiprocessors Multiprocessors Processor architectures

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe CHAPTER 18 Strategies for Query Processig Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe Itroductio DBMS techiques to process a query Scaer idetifies

More information

Data diverse software fault tolerance techniques

Data diverse software fault tolerance techniques Data diverse software fault tolerace techiques Complemets desig diversity by compesatig for desig diversity s s limitatios Ivolves obtaiig a related set of poits i the program data space, executig the

More information

EE 459/500 HDL Based Digital Design with Programmable Logic. Lecture 13 Control and Sequencing: Hardwired and Microprogrammed Control

EE 459/500 HDL Based Digital Design with Programmable Logic. Lecture 13 Control and Sequencing: Hardwired and Microprogrammed Control EE 459/500 HDL Based Digital Desig with Programmable Logic Lecture 13 Cotrol ad Sequecig: Hardwired ad Microprogrammed Cotrol Refereces: Chapter s 4,5 from textbook Chapter 7 of M.M. Mao ad C.R. Kime,

More information

Chapter 4 Threads. Operating Systems: Internals and Design Principles. Ninth Edition By William Stallings

Chapter 4 Threads. Operating Systems: Internals and Design Principles. Ninth Edition By William Stallings Operatig Systems: Iterals ad Desig Priciples Chapter 4 Threads Nith Editio By William Stalligs Processes ad Threads Resource Owership Process icludes a virtual address space to hold the process image The

More information

Memory. Principle of Locality. It is impossible to have memory that is both. We create an illusion for the programmer. Employ memory hierarchy

Memory. Principle of Locality. It is impossible to have memory that is both. We create an illusion for the programmer. Employ memory hierarchy Datorarkitektur och operativsystem Lecture 7 Memory It is impossible to have memory that is both Unlimited (large in capacity) And fast 5.1 Intr roduction We create an illusion for the programmer Before

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe CHAPTER 22 Database Recovery Techiques Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe Itroductio Recovery algorithms Recovery cocepts Write-ahead

More information

Outline. CSCI 4730 Operating Systems. Questions. What is an Operating System? Computer System Layers. Computer System Layers

Outline. CSCI 4730 Operating Systems. Questions. What is an Operating System? Computer System Layers. Computer System Layers Outlie CSCI 4730 s! What is a s?!! System Compoet Architecture s Overview Questios What is a?! What are the major operatig system compoets?! What are basic computer system orgaizatios?! How do you commuicate

More information

EE260: Digital Design, Spring /16/18. n Example: m 0 (=x 1 x 2 ) is adjacent to m 1 (=x 1 x 2 ) and m 2 (=x 1 x 2 ) but NOT m 3 (=x 1 x 2 )

EE260: Digital Design, Spring /16/18. n Example: m 0 (=x 1 x 2 ) is adjacent to m 1 (=x 1 x 2 ) and m 2 (=x 1 x 2 ) but NOT m 3 (=x 1 x 2 ) EE26: Digital Desig, Sprig 28 3/6/8 EE 26: Itroductio to Digital Desig Combiatioal Datapath Yao Zheg Departmet of Electrical Egieerig Uiversity of Hawaiʻi at Māoa Combiatioal Logic Blocks Multiplexer Ecoders/Decoders

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe CHAPTER 20 Itroductio to Trasactio Processig Cocepts ad Theory Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe Itroductio Trasactio Describes local

More information

Cache Optimization. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Cache Optimization. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University Cache Optimization Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Cache Misses On cache hit CPU proceeds normally On cache miss Stall the CPU pipeline

More information

Chapter 4 The Datapath

Chapter 4 The Datapath The Ageda Chapter 4 The Datapath Based o slides McGraw-Hill Additioal material 24/25/26 Lewis/Marti Additioal material 28 Roth Additioal material 2 Taylor Additioal material 2 Farmer Tae the elemets that

More information

Chapter 5. Large and Fast: Exploiting Memory Hierarchy. Part II Virtual Memory

Chapter 5. Large and Fast: Exploiting Memory Hierarchy. Part II Virtual Memory Chapter 5 Large and Fast: Exploiting Memory Hierarchy Part II Virtual Memory Virtual Memory Use main memory as a cache for secondary (disk) storage Managed jointly by CPU hardware and the operating system

More information

Review: The ACID properties

Review: The ACID properties Recovery Review: The ACID properties A tomicity: All actios i the Xactio happe, or oe happe. C osistecy: If each Xactio is cosistet, ad the DB starts cosistet, it eds up cosistet. I solatio: Executio of

More information

Memory Hierarchy. Reading. Sections 5.1, 5.2, 5.3, 5.4, 5.8 (some elements), 5.9 (2) Lecture notes from MKP, H. H. Lee and S.

Memory Hierarchy. Reading. Sections 5.1, 5.2, 5.3, 5.4, 5.8 (some elements), 5.9 (2) Lecture notes from MKP, H. H. Lee and S. Memory Hierarchy Lecture notes from MKP, H. H. Lee and S. Yalamanchili Sections 5.1, 5.2, 5.3, 5.4, 5.8 (some elements), 5.9 Reading (2) 1 SRAM: Value is stored on a pair of inerting gates Very fast but

More information

Chapter 5 Large and Fast: Exploiting Memory Hierarchy (Part 1)

Chapter 5 Large and Fast: Exploiting Memory Hierarchy (Part 1) Department of Electr rical Eng ineering, Chapter 5 Large and Fast: Exploiting Memory Hierarchy (Part 1) 王振傑 (Chen-Chieh Wang) ccwang@mail.ee.ncku.edu.tw ncku edu Depar rtment of Electr rical Engineering,

More information

Chapter 5. Memory Technology

Chapter 5. Memory Technology Chapter 5 Large and Fast: Exploiting Memory Hierarchy Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic disk 5ms 20ms, $0.20 $2 per

More information

Announcements. Reading. Project #4 is on the web. Homework #1. Midterm #2. Chapter 4 ( ) Note policy about project #3 missing components

Announcements. Reading. Project #4 is on the web. Homework #1. Midterm #2. Chapter 4 ( ) Note policy about project #3 missing components Aoucemets Readig Chapter 4 (4.1-4.2) Project #4 is o the web ote policy about project #3 missig compoets Homework #1 Due 11/6/01 Chapter 6: 4, 12, 24, 37 Midterm #2 11/8/01 i class 1 Project #4 otes IPv6Iit,

More information

FAST BIT-REVERSALS ON UNIPROCESSORS AND SHARED-MEMORY MULTIPROCESSORS

FAST BIT-REVERSALS ON UNIPROCESSORS AND SHARED-MEMORY MULTIPROCESSORS SIAM J. SCI. COMPUT. Vol. 22, No. 6, pp. 2113 2134 c 21 Society for Idustrial ad Applied Mathematics FAST BIT-REVERSALS ON UNIPROCESSORS AND SHARED-MEMORY MULTIPROCESSORS ZHAO ZHANG AND XIAODONG ZHANG

More information

. Written in factored form it is easy to see that the roots are 2, 2, i,

. Written in factored form it is easy to see that the roots are 2, 2, i, CMPS A Itroductio to Programmig Programmig Assigmet 4 I this assigmet you will write a java program that determies the real roots of a polyomial that lie withi a specified rage. Recall that the roots (or

More information

Chapter 5 Large and Fast: Exploiting Memory Hierarchy (Part 2)

Chapter 5 Large and Fast: Exploiting Memory Hierarchy (Part 2) Department of Electr rical Eng ineering, Chapter 5 Large and Fast: Exploiting Memory Hierarchy (Part 2) 王振傑 (Chen-Chieh Wang) ccwang@mail.ee.ncku.edu.tw ncku edu Feng-Chia Unive ersity Outline 5.4 Virtual

More information

Lecture Notes 6 Introduction to algorithm analysis CSS 501 Data Structures and Object-Oriented Programming

Lecture Notes 6 Introduction to algorithm analysis CSS 501 Data Structures and Object-Oriented Programming Lecture Notes 6 Itroductio to algorithm aalysis CSS 501 Data Structures ad Object-Orieted Programmig Readig for this lecture: Carrao, Chapter 10 To be covered i this lecture: Itroductio to algorithm aalysis

More information

Python Programming: An Introduction to Computer Science

Python Programming: An Introduction to Computer Science Pytho Programmig: A Itroductio to Computer Sciece Chapter 1 Computers ad Programs 1 Objectives To uderstad the respective roles of hardware ad software i a computig system. To lear what computer scietists

More information

CSCI-UA.0201 Computer Systems Organization Memory Hierarchy

CSCI-UA.0201 Computer Systems Organization Memory Hierarchy CSCI-UA.0201 Computer Systems Organization Memory Hierarchy Mohamed Zahran (aka Z) mzahran@cs.nyu.edu http://www.mzahran.com Programmer s Wish List Memory Private Infinitely large Infinitely fast Non-volatile

More information

UNIVERSITY OF MORATUWA

UNIVERSITY OF MORATUWA UNIVERSITY OF MORATUWA FACULTY OF ENGINEERING DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING B.Sc. Egieerig 2014 Itake Semester 2 Examiatio CS2052 COMPUTER ARCHITECTURE Time allowed: 2 Hours Jauary 2016

More information

CMSC22200 Computer Architecture Lecture 9: Out-of-Order, SIMD, VLIW. Prof. Yanjing Li University of Chicago

CMSC22200 Computer Architecture Lecture 9: Out-of-Order, SIMD, VLIW. Prof. Yanjing Li University of Chicago CMSC22200 Computer Architecture Lecture 9: Out-of-Order, SIMD, VLIW Prof. Yajig Li Uiversity of Chicago Admiistrative Stuff Lab2 due toight Exam I: covers lectures 1-9 Ope book, ope otes, close device

More information

Lecture 5. Counting Sort / Radix Sort

Lecture 5. Counting Sort / Radix Sort Lecture 5. Coutig Sort / Radix Sort T. H. Corme, C. E. Leiserso ad R. L. Rivest Itroductio to Algorithms, 3rd Editio, MIT Press, 2009 Sugkyukwa Uiversity Hyuseug Choo choo@skku.edu Copyright 2000-2018

More information
