Opportunities and Challenges of Emerging Memory Technologies
|
|
- Adrian McDonald
- 5 years ago
- Views:
Transcription
1 Opportuities ad Challeges of Emergig Memory Techologies Our Mutlu September 11, 2017 ARM Research Summit
2 The Mai Memory System Processors ad caches Mai Memory Storage (SSD/HDD) Mai memory is a critical compoet of all computig systems: server, mobile, embedded, desktop, sesor Mai memory system must scale (i size, techology, efficiecy, cost, ad maagemet algorithms) to maitai performace growth ad techology scalig beefits 2
3 The Mai Memory System FPGAs Mai Memory Storage (SSD/HDD) Mai memory is a critical compoet of all computig systems: server, mobile, embedded, desktop, sesor Mai memory system must scale (i size, techology, efficiecy, cost, ad maagemet algorithms) to maitai performace growth ad techology scalig beefits 3
4 The Mai Memory System GPUs Mai Memory Storage (SSD/HDD) Mai memory is a critical compoet of all computig systems: server, mobile, embedded, desktop, sesor Mai memory system must scale (i size, techology, efficiecy, cost, ad maagemet algorithms) to maitai performace growth ad techology scalig beefits 4
5 Memory System: A Shared Resource View Storage Most of the system is dedicated to storig ad movig data 5
6 State of the Mai Memory System Recet techology, architecture, ad applicatio treds lead to ew reuiremets exacerbate old reuiremets DRAM ad memory cotrollers, as we kow them today, are (will be) ulikely to satisfy all reuiremets Some emergig o-volatile memory techologies (e.g., PCM) eable ew opportuities: memory+storage mergig We eed to rethik the mai memory system to fix DRAM issues ad eable emergig techologies to satisfy all reuiremets 6
7 Major Treds Affectig Mai Memory (I) Need for mai memory capacity, badwidth, QoS icreasig Mai memory eergy/power is a key system desig cocer DRAM techology scalig is edig 7
8 Major Treds Affectig Mai Memory (II) Need for mai memory capacity, badwidth, QoS icreasig Multi-core: icreasig umber of cores/agets Data-itesive applicatios: icreasig demad/huger for data Cosolidatio: cloud computig, GPUs, mobile, heterogeeity Mai memory eergy/power is a key system desig cocer DRAM techology scalig is edig 8
9 Example: The Memory Capacity Gap Core cout doublig ~ every 2 years DRAM DIMM capacity doublig ~ every 3 years Lim et al., ISCA 2009 Memory capacity per core expected to drop by 30% every two years Treds worse for memory badwidth per core! 9
10 Major Treds Affectig Mai Memory (III) Need for mai memory capacity, badwidth, QoS icreasig Mai memory eergy/power is a key system desig cocer ~40-50% eergy spet i off-chip memory hierarchy [Lefurgy, IEEE Computer 03] >40% power i DRAM [Ware, HPCA 10][Paul,ISCA 15] DRAM cosumes power eve whe ot used (periodic refresh) DRAM techology scalig is edig 10
11 Major Treds Affectig Mai Memory (IV) Need for mai memory capacity, badwidth, QoS icreasig Mai memory eergy/power is a key system desig cocer DRAM techology scalig is edig ITRS projects DRAM will ot scale easily below X m Scalig has provided may beefits: higher capacity (desity), lower cost, lower eergy 11
12 Major Treds Affectig Mai Memory (V) DRAM scalig has already become icreasigly difficult Icreasig cell leakage curret, reduced cell reliability, icreasig maufacturig difficulties [Kim+ ISCA 2014], [Liu+ ISCA 2013], [Mutlu IMW 2013], [Mutlu DATE 2017] Difficult to sigificatly improve capacity, eergy Emergig memory techologies are promisig 3D-Stacked DRAM higher badwidth smaller capacity Reduced-Latecy DRAM (e.g., RLDRAM, TL-DRAM) Low-Power DRAM (e.g., LPDDR3, LPDDR4) No-Volatile Memory (NVM) (e.g., PCM, STTRAM, ReRAM, 3D Xpoit) lower latecy lower power larger capacity higher cost higher latecy higher cost higher latecy higher dyamic power lower edurace 12
13 Major Treds Affectig Mai Memory (V) DRAM scalig has already become icreasigly difficult Icreasig cell leakage curret, reduced cell reliability, icreasig maufacturig difficulties [Kim+ ISCA 2014], [Liu+ ISCA 2013], [Mutlu IMW 2013], [Mutlu DATE 2017] Difficult to sigificatly improve capacity, eergy Emergig memory techologies are promisig 3D-Stacked DRAM higher badwidth smaller capacity Reduced-Latecy DRAM (e.g., RLDRAM, TL-DRAM) Low-Power DRAM (e.g., LPDDR3, LPDDR4) No-Volatile Memory (NVM) (e.g., PCM, STTRAM, ReRAM, 3D Xpoit) lower latecy lower power larger capacity higher cost higher latecy higher cost higher latecy higher dyamic power lower edurace 13
14 Ageda Major Treds Affectig Mai Memory The Memory Scalig Problem Two Complemetary Solutio Directios New Memory (DRAM) Architectures Eablig Emergig (NVM) Techologies Coclusio 14
15 Limits of Charge Memory Difficult charge placemet ad cotrol Flash: floatig gate charge DRAM: capacitor charge, trasistor leakage Reliable sesig becomes difficult as charge storage uit size reduces 15
16 The DRAM Scalig Problem DRAM stores charge i a capacitor (charge-based memory) Capacitor must be large eough for reliable sesig Access trasistor should be large eough for low leakage ad high retetio time Scalig beyod 40-35m (2013) is challegig [ITRS, 2009] DRAM capacity, cost, ad eergy/power hard to scale 16
17 As Memory Scales, It Becomes Ureliable Data from all of Facebook s servers worldwide Meza+, Revisitig Memory Errors i Large-Scale Productio Data Ceters, DSN
18 Large-Scale Failure Aalysis of DRAM Chips Aalysis ad modelig of memory errors foud i all of Facebook s server fleet Justi Meza, Qiag Wu, Sajeev Kumar, ad Our Mutlu, "Revisitig Memory Errors i Large-Scale Productio Data Ceters: Aalysis ad Modelig of New Treds from the Field" Proceedigs of the 45th Aual IEEE/IFIP Iteratioal Coferece o Depedable Systems ad Networks (DSN), Rio de Jaeiro, Brazil, Jue [Slides (pptx) (pdf)] [DRAM Error Model] 18
19 Ifrastructures to Uderstad Such Issues Temperature Cotroller FPGAs Heater FPGAs PC Kim+, Flippig Bits i Memory Without Accessig Them: A Experimetal Study of DRAM Disturbace Errors, ISCA
20 SoftMC: Ope Source DRAM Ifrastructure Hasa Hassa et al., SoftMC: A Flexible ad Practical OpeSource Ifrastructure for Eablig Experimetal DRAM Studies, HPCA Flexible Easy to Use (C++ API) Ope-source github.com/cmu-safari/softmc 20
21 SoftMC 21
22 A Curious Discovery [Kim et al., ISCA 2014] Oe ca predictably iduce errors i most DRAM memory chips 22
23 DRAM RowHammer A simple hardware failure mechaism ca create a widespread system security vulerability 23
24 Moder DRAM is Proe to Disturbace Errors Row of Cells Row Victim Row Row Hammered Row Row Victim Row Row Opeed Closed Wordlie V LOW HIGH Repeatedly readig a row eough times (before memory gets refreshed) iduces disturbace errors i adjacet rows i most real DRAM chips you ca buy today Flippig Bits i Memory Without Accessig Them: A Experimetal Study of DRAM Disturbace Errors, (Kim et al., ISCA 2014) 24
25 More o RowHammer Aalysis Yoogu Kim, Ross Daly, Jeremie Kim, Chris Falli, Ji Hye Lee, Doghyuk Lee, Chris Wilkerso, Korad Lai, ad Our Mutlu, "Flippig Bits i Memory Without Accessig Them: A Experimetal Study of DRAM Disturbace Errors" Proceedigs of the 41st Iteratioal Symposium o Computer Architecture (ISCA), Mieapolis, MN, Jue [Slides (pptx) (pdf)] [Lightig Sessio Slides (pptx) (pdf)] [Source Code ad Data] 25
26 Idustry Is Writig Papers About It, Too 26
27 Idustry Is Writig Papers About It, Too 27
28 How Do We Solve The Memory Problem? Fix it: Make memory ad cotrollers more itelliget New iterfaces, fuctios, architectures: system-mem codesig Elimiate or miimize it: Replace or (more likely) augmet DRAM with a differet techology New techologies ad (VM, system-wide OS, MM) rethikig of memory & storage Embrace it: Desig heterogeeous Logic memories (oe of which are perfect) ad map data itelligetly across them Problems Algorithms Programs Rutime System ISA Microarchitecture Devices User New models for data maagemet ad maybe usage Solutios (to memory scalig) reuire software/hardware/device cooperatio 28
29 Ageda Major Treds Affectig Mai Memory The Memory Scalig Problem Two Complemetary Solutio Directios New Memory (DRAM) Architectures Eablig Emergig (NVM) Techologies Coclusio 29
30 Solutio 1: New Memory Architectures Overcome memory shortcomigs with Memory-cetric system desig Novel memory architectures, iterfaces, fuctios Better waste maagemet (efficiet utilizatio) Key issues to tackle Eable reliability at low cost à high capacity Reduce eergy Improve latecy ad badwidth Reduce waste (capacity, badwidth, latecy) Eable computatio close to data 30
31 Solutio 1: New Memory Architectures Liu+, RAIDR: Retetio-Aware Itelliget DRAM Refresh, ISCA Kim+, A Case for Exploitig Subarray-Level Parallelism i DRAM, ISCA Lee+, Tiered-Latecy DRAM: A Low Latecy ad Low Cost DRAM Architecture, HPCA Liu+, A Experimetal Study of Data Retetio Behavior i Moder DRAM Devices, ISCA Seshadri+, RowCloe: Fast ad Efficiet I-DRAM Copy ad Iitializatio of Bulk Data, MICRO Pekhimeko+, Liearly Compressed Pages: A Mai Memory Compressio Framework, MICRO Chag+, Improvig DRAM Performace by Parallelizig Refreshes with Accesses, HPCA Kha+, The Efficacy of Error Mitigatio Techiues for DRAM Retetio Failures: A Comparative Experimetal Study, SIGMETRICS Luo+, Characterizig Applicatio Memory Error Vulerability to Optimize Data Ceter Cost, DSN Kim+, Flippig Bits i Memory Without Accessig Them: A Experimetal Study of DRAM Disturbace Errors, ISCA Lee+, Adaptive-Latecy DRAM: Optimizig DRAM Timig for the Commo-Case, HPCA Qureshi+, AVATAR: A Variable-Retetio-Time (VRT) Aware Refresh for DRAM Systems, DSN Meza+, Revisitig Memory Errors i Large-Scale Productio Data Ceters: Aalysis ad Modelig of New Treds from the Field, DSN Kim+, Ramulator: A Fast ad Extesible DRAM Simulator, IEEE CAL Seshadri+, Fast Bulk Bitwise AND ad OR i DRAM, IEEE CAL Ah+, A Scalable Processig-i-Memory Accelerator for Parallel Graph Processig, ISCA Ah+, PIM-Eabled Istructios: A Low-Overhead, Locality-Aware Processig-i-Memory Architecture, ISCA Lee+, Decoupled Direct Memory Access: Isolatig CPU ad IO Traffic by Leveragig a Dual-Data-Port DRAM, PACT Seshadri+, Gather-Scatter DRAM: I-DRAM Address Traslatio to Improve the Spatial Locality of No-uit Strided Accesses, MICRO Lee+, Simultaeous Multi-Layer Access: Improvig 3D-Stacked Memory Badwidth at Low Cost, TACO Hassa+, ChargeCache: Reducig DRAM Latecy by Exploitig Row Access Locality, HPCA Chag+, Low-Cost Iter-Liked Subarrays (LISA): Eablig Fast Iter-Subarray Data Migratio i DRAM, HPCA Chag+, Uderstadig Latecy Variatio i Moder DRAM Chips Experimetal Characterizatio, Aalysis, ad Optimizatio, SIGMETRICS Kha+, PARBOR: A Efficiet System-Level Techiue to Detect Data Depedet Failures i DRAM, DSN Hsieh+, Trasparet Offloadig ad Mappig (TOM): Eablig Programmer-Trasparet Near-Data Processig i GPU Systems, ISCA Hashemi+, Acceleratig Depedet Cache Misses with a Ehaced Memory Cotroller, ISCA Boroumad+, LazyPIM: A Efficiet Cache Coherece Mechaism for Processig-i-Memory, IEEE CAL Pattaik+, Schedulig Techiues for GPU Architectures with Processig-I-Memory Capabilities, PACT Hsieh+, Acceleratig Poiter Chasig i 3D-Stacked Memory: Challeges, Mechaisms, Evaluatio, ICCD Hashemi+, Cotiuous Ruahead: Trasparet Hardware Acceleratio for Memory Itesive Workloads, MICRO Kha+, A Case for Memory Cotet-Based Detectio ad Mitigatio of Data-Depedet Failures i DRAM", IEEE CAL Hassa+, SoftMC: A Flexible ad Practical Ope-Source Ifrastructure for Eablig Experimetal DRAM Studies, HPCA Mutlu, The RowHammer Problem ad Other Issues We May Face as Memory Becomes Deser, DATE Lee+, Desig-Iduced Latecy Variatio i Moder DRAM Chips: Characterizatio, Aalysis, ad Latecy Reductio Mechaisms, SIGMETRICS Chag+, Uderstadig Reduced-Voltage Operatio i Moder DRAM Devices: Experimetal Characterizatio, Aalysis, ad Mechaisms, SIGMETRICS Patel+, The Reach Profiler (REAPER): Eablig the Mitigatio of DRAM Retetio Failures via Profilig at Aggressive Coditios, ISCA Seshadri ad Mutlu, Simple Operatios i Memory to Reduce Data Movemet, ADCOM Liu+, Cocurret Data Structures for Near-Memory Computig, SPAA Kha+, Detectig ad Mitigatig Data-Depedet DRAM Failures by Exploitig Curret Memory Cotet, MICRO Seshadri+, Ambit: I-Memory Accelerator for Bulk Bitwise Operatios Usig Commodity DRAM Techology, MICRO Avoid DRAM: Seshadri+, The Evicted-Address Filter: A Uified Mechaism to Address Both Cache Pollutio ad Thrashig, PACT Pekhimeko+, Base-Delta-Immediate Compressio: Practical Data Compressio for O-Chip Caches, PACT Seshadri+, The Dirty-Block Idex, ISCA Pekhimeko+, Exploitig Compressed Block Size as a Idicator of Future Reuse, HPCA Vijaykumar+, A Case for Core-Assisted Bottleeck Acceleratio i GPUs: Eablig Flexible Data Compressio with Assist Warps, ISCA Pekhimeko+, Toggle-Aware Badwidth Compressio for GPUs, HPCA
32 Example: (Truly) I-Memory Computatio We ca support i-dram COPY, ZERO, AND, OR, NOT, MAJ At low cost Usig aalog computatio capability of DRAM Idea: activatig multiple rows performs computatio 30-60X performace ad eergy improvemet Seshadri+, Ambit: I-Memory Accelerator for Bulk Bitwise Operatios Usig Commodity DRAM Techology, MICRO New memory techologies eable eve more opportuities Memristors, resistive RAM, phase chage mem, STT-MRAM, Ca operate o data with miimal movemet 32
33 Performace: I-DRAM Bitwise Operatios 33
34 Eergy of I-DRAM Bitwise Operatios Seshadri+, Ambit: I-Memory Accelerator for Bulk Bitwise Operatios usig Commodity DRAM Techology, MICRO
35 More o Ambit Vivek Seshadri et al., Ambit: I-Memory Accelerator for Bulk Bitwise Operatios Usig Commodity DRAM Techology, MICRO
36 Tesseract System for Graph Processig Itercoected set of 3D-stacked memory+logic chips with simple cores Host Processor Memory-Mapped Accelerator Iterface Nocacheable, Physically Addressed) Memory Logic Crossbar Network I-Order Core LP PF Buffer MTP Message Queue DRAM Cotroller NI Ah+, A Scalable Processig-i-Memory Accelerator for Parallel Graph Processig ISCA 2015.
37 Tesseract Graph Processig Performace Speedup >13X Performace Improvemet O five graph processig algorithms 11.6x 9.0x +56% +25% 13.8x 0 DDR3-OoO HMC-OoO HMC-MC Tesseract Tesseract- LP Tesseract- LP-MTP Ah+, A Scalable Processig-i-Memory Accelerator for Parallel Graph Processig ISCA 2015.
38 Tesseract Graph Processig System Eergy 1.2 Memory Layers Logic Layers Cores HMC-OoO > 8X Eergy Reductio Tesseract with Prefetchig Ah+, A Scalable Processig-i-Memory Accelerator for Parallel Graph Processig ISCA 2015.
39 More o Tesseract Juwha Ah, Sugpack Hog, Sugjoo Yoo, Our Mutlu, ad Kiyoug Choi, "A Scalable Processig-i-Memory Accelerator for Parallel Graph Processig" Proceedigs of the 42d Iteratioal Symposium o Computer Architecture (ISCA), Portlad, OR, Jue [Slides (pdf)] [Lightig Sessio Slides (pdf)] 39
40 Acceleratig Poiter Chasig with PIM Kevi Hsieh, Samira Kha, Nadita Vijaykumar, Kevi K. Chag, Amirali Boroumad, Saugata Ghose, ad Our Mutlu, "Acceleratig Poiter Chasig i 3D-Stacked Memory: Challeges, Mechaisms, Evaluatio" Proceedigs of the 34th IEEE Iteratioal Coferece o Computer Desig (ICCD), Phoeix, AZ, USA, October
41 Ageda Major Treds Affectig Mai Memory The Memory Scalig Problem Two Complemetary Solutio Directios New Memory (DRAM) Architectures Eablig Emergig (NVM) Techologies Coclusio 41
42 Limits of Charge Memory Difficult charge placemet ad cotrol Flash: floatig gate charge DRAM: capacitor charge, trasistor leakage Reliable sesig becomes difficult as charge storage uit size reduces 42
43 Solutio 2: Emergig Memory Techologies Some emergig resistive memory techologies seem more scalable tha DRAM (ad they are o-volatile) Example: Phase Chage Memory Data stored by chagig phase of material Data read by detectig material s resistace Expected to scale to 9m (2022 [ITRS 2009]) Prototyped at 20m (Raoux+, IBM JRD 2008) Expected to be deser tha DRAM: ca store multiple bits/cell But, emergig techologies have (may) shortcomigs Ca they be eabled to replace/augmet/surpass DRAM? 43
44 Solutio 2: Emergig Memory Techologies Lee+, Architectig Phase Chage Memory as a Scalable DRAM Alterative, ISCA 09, CACM 10, IEEE Micro 10. Meza+, Eablig Efficiet ad Scalable Hybrid Memories, IEEE Comp. Arch. Letters Yoo, Meza+, Row Buffer Locality Aware Cachig Policies for Hybrid Memories, ICCD Kultursay+, Evaluatig STT-RAM as a Eergy-Efficiet Mai Memory Alterative, ISPASS Meza+, A Case for Efficiet Hardware-Software Cooperative Maagemet of Storage ad Memory, WEED Lu+, Loose Orderig Cosistecy for Persistet Memory, ICCD Zhao+, FIRM: Fair ad High-Performace Memory Cotrol for Persistet Memory Systems, MICRO Yoo, Meza+, Efficiet Data Mappig ad Bufferig Techiues for Multi-Level Cell Phase-Chage Memories, TACO Re+, ThyNVM: Eablig Software-Trasparet Crash Cosistecy i Persistet Memory Systems, MICRO Chauha+, NVMove: Helpig Programmers Move to Byte-Based Persistece, INFLOW Li+, Utility-Based Hybrid Memory Maagemet, CLUSTER Yu+, Bashee: Badwidth-Efficiet DRAM Cachig via Software/Hardware Cooperatio, MICRO
45 Promisig Resistive Memory Techologies PCM Iject curret to chage material phase Resistace determied by phase STT-MRAM Iject curret to chage maget polarity Resistace determied by polarity Memristors/RRAM/ReRAM Iject curret to chage atomic structure Resistace determied by atom distace 45
46 What is Phase Chage Memory? Phase chage material (chalcogeide glass) exists i two states: Amorphous: Low optical reflexivity ad high electrical resistivity Crystallie: High optical reflexivity ad low electrical resistivity PCM is resistive memory: High resistace (0), Low resistace (1) PCM cell ca be switched betwee states reliably ad uickly 46
47 How Does PCM Work? Write: chage phase via curret ijectio SET: sustaied curret to heat cell above Tcryst RESET: cell heated above Tmelt ad ueched Read: detect phase via material resistace amorphous/crystallie Large Curret Small Curret Memory Elemet SET (cryst) Low resistace Access Device RESET (amorph) High resistace W Photo Courtesy: Bipi Rajedra, IBM Slide Courtesy: Moiuddi Qureshi, IBM W 47
48 Opportuity: PCM Advatages Scales better tha DRAM, Flash Reuires curret pulses, which scale liearly with feature size Expected to scale to 9m (2022 [ITRS]) Prototyped at 20m (Raoux+, IBM JRD 2008) Ca be deser tha DRAM Ca store multiple bits per cell due to large resistace rage Prototypes with 2 bits/cell i ISSCC 08, 4 bits/cell by 2012 No-volatile Retai data for >10 years at 85C No refresh eeded, low idle power 48
49 Phase Chage Memory Properties Surveyed prototypes from (ITRS, IEDM, VLSI, ISSCC) Derived PCM parameters for F=90m Lee, Ipek, Mutlu, Burger, Architectig Phase Chage Memory as a Scalable DRAM Alterative, ISCA Lee et al., Phase Chage Techology ad the Future of Mai Memory, IEEE Micro Top Picks
50 50
51 Phase Chage Memory: Pros ad Cos Pros over DRAM Better techology scalig (capacity ad cost) No volatile à Persistet Low idle power (o refresh) Cos Higher latecies: ~4-15x DRAM (especially write) Higher active eergy: ~2-50x DRAM (especially write) Lower edurace (a cell dies after ~10 8 writes) Reliability issues (resistace drift) Challeges i eablig PCM as DRAM replacemet/helper: Mitigate PCM shortcomigs Fid the right way to place PCM i the system 51
52 PCM-based Mai Memory (I) How should PCM-based (mai) memory be orgaized? Hybrid PCM+DRAM [Qureshi+ ISCA 09, Dhima+ DAC 09]: How to partitio/migrate data betwee PCM ad DRAM 52
53 PCM-based Mai Memory (II) How should PCM-based (mai) memory be orgaized? Pure PCM mai memory [Lee et al., ISCA 09, Top Picks 10]: How to redesig etire hierarchy (ad cores) to overcome PCM shortcomigs 53
54 A Iitial Study: Replace DRAM with PCM Lee, Ipek, Mutlu, Burger, Architectig Phase Chage Memory as a Scalable DRAM Alterative, ISCA Surveyed prototypes from (e.g. IEDM, VLSI, ISSCC) Derived average PCM parameters for F=90m 54
55 Results: Naïve Replacemet of DRAM with PCM Replace DRAM with PCM i a 4-core, 4MB L2 system PCM orgaized the same as DRAM: row buffers, baks, peripherals 1.6x delay, 2.2x eergy, 500-hour average lifetime Lee, Ipek, Mutlu, Burger, Architectig Phase Chage Memory as a Scalable DRAM Alterative, ISCA
56 Architectig PCM to Mitigate Shortcomigs Idea 1: Use multiple arrow row buffers i each PCM chip à Reduces array reads/writes à better edurace, latecy, eergy Idea 2: Write ito array at cache block or word graularity à Reduces uecessary wear DRAM PCM 56
57 Results: Architected PCM as Mai Memory 1.2x delay, 1.0x eergy, 5.6-year average lifetime Scalig improves eergy, edurace, desity Caveat 1: Worst-case lifetime is much shorter (o guaratees) Caveat 2: Itesive applicatios see large performace ad eergy hits Caveat 3: Optimistic PCM parameters? 57
58 More o PCM As Mai Memory Bejami C. Lee, Egi Ipek, Our Mutlu, ad Doug Burger, "Architectig Phase Chage Memory as a Scalable DRAM Alterative" Proceedigs of the 36th Iteratioal Symposium o Computer Architecture (ISCA), pages 2-13, Austi, TX, Jue Slides (pdf) 58
59 More o PCM As Mai Memory (II) Bejami C. Lee, Pig Zhou, Ju Yag, Youtao Zhag, Bo Zhao, Egi Ipek, Our Mutlu, ad Doug Burger, "Phase Chage Techology ad the Future of Mai Memory" IEEE Micro, Special Issue: Micro's Top Picks from 2009 Computer Architecture Cofereces (MICRO TOP PICKS), Vol. 30, No. 1, pages 60-70, Jauary/February
60 STT-MRAM as Mai Memory Magetic Tuel Juctio (MTJ) device Referece layer: Fixed magetic orietatio Free layer: Parallel or ati-parallel Magetic orietatio of the free layer determies logical state of device High vs. low resistace Write: Push large curret through MTJ to chage orietatio of free layer Read: Sese curret flow Logical 0 Referece Layer Barrier Free Layer Logical 1 Referece Layer Barrier Free Layer Word Lie MTJ Kultursay et al., Evaluatig STT-RAM as a Eergy- Efficiet Mai Memory Alterative, ISPASS Access Trasistor Bit Lie Sese Lie
61 STT-MRAM: Pros ad Cos Pros over DRAM Better techology scalig (capacity ad cost) No volatile à Persistet Low idle power (o refresh) Cos Higher write latecy Higher write eergy Poor desity (curretly) Reliability? Aother level of freedom Ca trade off o-volatility for lower write latecy/eergy (by reducig the size of the MTJ) 61
62 Architected STT-MRAM as Mai Memory 4-core, 4GB mai memory, multiprogrammed workloads ~6% performace loss, ~60% eergy savigs vs. DRAM Performace vs. DRAM 98% 96% 94% 92% 90% 88% STT-RAM (base) STT-RAM (opt) ACT+PRE WB RB Eergy vs. DRAM 100% 80% 60% 40% 20% 0% Kultursay+, Evaluatig STT-RAM as a Eergy-Efficiet Mai Memory Alterative, ISPASS
63 More o STT-MRAM as Mai Memory Emre Kultursay, Mahmut Kademir, Aad Sivasubramaiam, ad Our Mutlu, "Evaluatig STT-RAM as a Eergy-Efficiet Mai Memory Alterative" Proceedigs of the 2013 IEEE Iteratioal Symposium o Performace Aalysis of Systems ad Software (ISPASS), Austi, TX, April Slides (pptx) (pdf) 63
64 A More Viable Approach: Hybrid Memory Systems CPU DRAM PCM DRAM Ctrl Ctrl Phase Chage Memory (or Tech. X) Fast, durable Small, leaky, volatile, high-cost Large, o-volatile, low-cost Slow, wears out, high active eergy Hardware/software maage data allocatio ad movemet to achieve the best of multiple techologies Meza+, Eablig Efficiet ad Scalable Hybrid Memories, IEEE Comp. Arch. Letters, Yoo+, Row Buffer Locality Aware Cachig Policies for Hybrid Memories, ICCD 2012 Best Paper Award.
65 Challege ad Opportuity Providig the Best of Multiple Metrics with Multiple Memory Techologies 65
66 Challege ad Opportuity Heterogeeous, Cofigurable, Programmable Memory Systems 66
67 Hybrid Memory Systems: Issues Cache vs. Mai Memory Graularity of Data Move/Maage-met: Fie or Coarse Hardware vs. Software vs. HW/SW Cooperative Whe to migrate data? How to desig a scalable ad efficiet large cache? 67
68 Data Placemet i Hybrid Memory Cores/Caches Chael A Memory Cotrollers IDLE Chael B Memory A (Fast, Small) Memory A is fast, but small Load should be balaced o both chaels Memory B (Large, Slow) Page 1 Page 2 Which memory do we place each page i, to maximize system performace? Page migratios have performace ad eergy overhead 68
69 Data Placemet Betwee DRAM ad PCM Idea: Characterize data access patters ad guide data placemet i hybrid memory Streamig accesses: As fast i PCM as i DRAM Radom accesses: Much faster i DRAM Idea: Place radom access data with some reuse i DRAM; streamig data i PCM Yoo+, Row Buffer Locality-Aware Data Placemet i Hybrid Memories, ICCD 2012 Best Paper Award. 69
70 Hybrid vs. All-PCM/DRAM [ICCD 12] Normalized Weighted Speedup GB PCM RBLA-Dy 16GB DRAM 31% 29% Normalized Max. Slowdow % better performace tha all PCM, withi 29% of all DRAM performace Weighted Speedup Max. Slowdow Perf. per Watt 0 Normalized Metric 0 Yoo+, Row Buffer Locality-Aware Data Placemet i Hybrid Memories, ICCD 2012 Best Paper Award. 1
71 More o Hybrid Memory Data Placemet HaBi Yoo, Justi Meza, Rachata Ausavarugiru, Rachael Hardig, ad Our Mutlu, "Row Buffer Locality Aware Cachig Policies for Hybrid Memories" Proceedigs of the 30th IEEE Iteratioal Coferece o Computer Desig (ICCD), Motreal, Quebec, Caada, September Slides (pptx) (pdf) 71
72 Weakesses of Existig Solutios They are all heuristics that cosider oly a limited part of memory access behavior Do ot directly capture the overall system performace impact of data placemet decisios Example: Noe capture memory-level parallelism (MLP) Number of cocurret memory reuests from the same applicatio whe a page is accessed Affects how much page migratio helps performace 72
73 Importace of Memory-Level Parallelism Before migratio: Before migratio: reuests to Page 1 Mem. B reuests to Page 2 Mem. B reuests to Page 3 Mem. B After migratio: After migratio: reuests to Page 1 Mem. A reuests to Page 2 Mem. A T reuests to Page 3 Mem. Mem. A BT time Migratig oe page reduces stall time by T time Must migrate two pages to reduce stall time by T: migratig oe page aloe does ot help Page migratio decisios eed to cosider MLP 73
74 Our Goal [CLUSTER 2017] A geeralized mechaism that 1. Directly estimates the performace beefit of migratig a page betwee ay two types of memory 2. Places oly the performace-critical data i the fast memory 74
75 Utility-Based Hybrid Memory Maagemet A memory maager that works for ay hybrid memory e.g., DRAM-NVM, DRAM-RLDRAM Key Idea For each page, use comprehesive characteristics to calculate estimated utility (i.e., performace impact) of migratig page from oe memory to the other i the system Migrate oly pages with the highest utility (i.e., pages that improve system performace the most whe migrated) Li+, Utility-Based Hybrid Memory Maagemet, CLUSTER
76 Key Mechaisms of UH-MEM For each page, estimate utility usig a performace model Applicatio stall time reductio How much would migratig a page beefit the performace of the applicatio that the page belogs to? Applicatio performace sesitivity How much does the improvemet of a sigle applicatio s performace icrease the overall system performace? Utility = StallTime 0 Sesitivity 0 Migrate oly pages whose utility exceed the migratio threshold from slow memory to fast memory Periodically adjust migratio threshold 76
77 Results: System Performace ALL FREQ RBLA UH-MEM Normalized Weighted Speedup % 9% 3% 5% 0% 25% 50% 75% 100% Workload Memory Itesity Category UH-MEM improves system performace over the best state-of-the-art hybrid memory maager 77
78 Results: Sesitivity to Slow Memory Latecy We vary t 567 ad t 85 of the slow memory Weighted Speedup t 567 : t 85 : ALL FREQ RBLA UH-MEM 8% 6% 14% 13% 13% x3.0 x3.0 x4.0 x4.0 x4.5 x12 x6.0 x16 x7.5 x20 Slow Memory Latecy Multiplier UH-MEM improves system performace for a wide variety of hybrid memory systems 78
79 More o UH-MEM Yag Li, Saugata Ghose, Jogmoo Choi, Ji Su, Hui Wag, ad Our Mutlu, "Utility-Based Hybrid Memory Maagemet" Proceedigs of the 19th IEEE Cluster Coferece (CLUSTER), Hoolulu, Hawaii, USA, September [Slides (pptx) (pdf)] 79
80 Challege ad Opportuity Eablig a Emergig Techology to Augmet DRAM Maagig Hybrid Memories 80
81 Aother Challege Desigig Effective Large (DRAM) Caches 81
82 Oe Problem with Large DRAM Caches A large DRAM cache reuires a large metadata (tag + block-based iformatio) store How do we desig a efficiet DRAM cache? Metadata: X à DRAM Mem Ctlr CPU LOAD X Mem Ctlr DRAM X (small, fast cache) Access X PCM (high capacity) 82
83 Idea 1: Tags i Memory Store tags i the same row as data i DRAM Store metadata i same row as their data Data ad metadata ca be accessed together DRAM row Cache block 0 Cache block 1 Cache block 2 Tag 0 Tag 1 Tag 2 Beefit: No o-chip tag storage overhead Dowsides: Cache hit determied oly after a DRAM access Cache hit reuires two DRAM accesses 83
84 Idea 2: Cache Tags i SRAM Recall Idea 1: Store all metadata i DRAM To reduce metadata storage overhead Idea 2: Cache i o-chip SRAM freuetly-accessed metadata Cache oly a small amout to keep SRAM size small 84
85 O Large DRAM Cache Desig Justi Meza, Jichua Chag, HaBi Yoo, Our Mutlu, ad Parthasarathy Ragaatha, "Eablig Efficiet ad Scalable Hybrid Memories Usig Fie-Graularity DRAM Cache Maagemet" IEEE Computer Architecture Letters (CAL), February
86 DRAM Caches: May Recet Optios Yu+, Bashee: Badwidth-Efficiet DRAM Cachig via Software/Hardware Cooperatio, MICRO
87 Bashee [MICRO 2017] Tracks presece i cache usig TLB ad Page Table No tag store eeded for DRAM cache Eabled by a ew lightweight lazy TLB coherece protocol New badwidth-aware freuecy-based replacemet policy 87
88 More o Bashee Xiagyao Yu, Christopher J. Hughes, Nadathur Satish, Our Mutlu, ad Sriivas Devadas, "Bashee: Badwidth-Efficiet DRAM Cachig via Software/Hardware Cooperatio" Proceedigs of the 50th Iteratioal Symposium o Microarchitecture (MICRO), Bosto, MA, USA, October
89 Other Opportuities with Emergig Techologies Mergig of memory ad storage e.g., a sigle iterface to maage all data New applicatios e.g., ultra-fast checkpoit ad restore More robust system desig e.g., reducig data loss Processig tightly-coupled with memory e.g., eablig efficiet search ad filterig 89
90 TWO-LEVEL STORAGE MODEL MEMORY CPU STORAGE Ld/St DRAM FILE I/O VOLATILE FAST BYTE ADDR NONVOLATILE SLOW BLOCK ADDR 90
91 TWO-LEVEL STORAGE MODEL MEMORY STORAGE CPU Ld/St DRAM FILE I/O NVM PCM, STT-RAM VOLATILE FAST BYTE ADDR NONVOLATILE SLOW BLOCK ADDR No-volatile memories combie characteristics of memory ad storage 91
92 Two-Level Memory/Storage Model The traditioal two-level storage model is a bottleeck with NVM Volatile data i memory à a load/store iterface Persistet data i storage à a file system iterface Problem: Operatig system (OS) ad file system (FS) code to locate, traslate, buffer data become performace ad eergy bottleecks with fast NVM stores Two-Level Store Virtual memory Load/Store fope, fread, fwrite, Operatig system ad file system Address traslatio Processor ad caches Mai Memory Persistet (e.g., Phase-Chage) Storage Memory (SSD/HDD) 92
93 Uified Memory ad Storage with NVM Goal: Uify memory ad storage maagemet i a sigle uit to elimiate wasted work to locate, trasfer, ad traslate data Improves both eergy ad performace Simplifies programmig model as well Uified Memory/Storage Persistet Memory Maager Processor ad caches Load/Store Feedback Persistet (e.g., Phase-Chage) Memory Meza+, A Case for Efficiet Hardware-Software Cooperative Maagemet of Storage ad Memory, WEED
94 The Persistet Memory Maager (PMM) 1 it mai(void) { 2 // data i file.dat is persistet 3 FILE mydata = "file.dat"; Persistet objects 4 mydata = ew it[64]; 5 } 6 void updatevalue(it, it value) { 7 FILE mydata = "file.dat"; 8 mydata[] = value; // value is persistet 9 } Load Figure 2: Sample program with access to fi Store Hits from SW/OS/rutime Software Hardware Persistet Memory Maager Data Layout, Persistece, Metadata, Security,... DRAM Flash NVM HDD PMM uses access ad hit iformatio to allocate, locate, migrate ad access data i the heterogeeous array of devices 94
95 Performace Beefits of a Sigle-Level Store User CPU User Memory Syscall CPU Syscall I/O 1.0 ~24X Normalized Executio Time ~5X HDD 2-level NVM 2-level Persistet Memory Meza+, A Case for Efficiet Hardware-Software Cooperative Maagemet of Storage ad Memory, WEED
96 Eergy Beefits of a Sigle-Level Store User CPU Syscall CPU DRAM NVM HDD 1.0 ~16X Fractio of Total Eergy ~5X HDD 2-level NVM 2-level Persistet Memory Meza+, A Case for Efficiet Hardware-Software Cooperative Maagemet of Storage ad Memory, WEED
97 O Persistet Memory Beefits & Challeges Justi Meza, Yixi Luo, Samira Kha, Jishe Zhao, Yua Xie, ad Our Mutlu, "A Case for Efficiet Hardware-Software Cooperative Maagemet of Storage ad Memory" Proceedigs of the 5th Workshop o Eergy-Efficiet Desig (WEED), Tel-Aviv, Israel, Jue Slides (pptx) Slides (pdf) 97
98 Challege ad Opportuity Combied Memory & Storage 98
99 Challege ad Opportuity A Uified Iterface to All Data 99
100 Oe Key Challege i Persistet Memory How to esure cosistecy of system/data if all memory is persistet? Two extremes Programmer trasparet: Let the system hadle it Programmer oly: Let the programmer hadle it May alteratives i-betwee 100
101 CHALLENGE: CRASH CONSISTENCY Persistet Memory System System crash ca result i permaet data corruptio i NVM 101
102 CRASH CONSISTENCY PROBLEM Example: Add a ode to a liked list 2. Lik to prev 1. Lik to ext System crash ca result i icosistet memory state 102
103 CURRENT SOLUTIONS Explicit iterfaces to maage cosistecy NV-Heaps [ASPLOS 11], BPFS [SOSP 09], Memosye [ASPLOS 11] AtomicBegi { Isert a ew ode; } AtomicEd; Limits adoptio of NVM Have to rewrite code with clear partitio betwee volatile ad o-volatile data Burde o the programmers 103
104 OUR APPROACH: ThyNVM Goal: Software trasparet cosistecy i persistet memory systems 104
105 ThyNVM: Summary A ew hardware-based checkpoitig mechaism Checkpoits at multiple graularities to reduce both checkpoitig latecy ad metadata overhead Overlaps checkpoitig ad executio to reduce checkpoitig latecy Adapts to DRAM ad NVM characteristics Performs withi 4.9% of a idealized DRAM with zero cost cosistecy 105
106 More About ThyNVM Jiglei Re, Jishe Zhao, Samira Kha, Jogmoo Choi, Yogwei Wu, ad Our Mutlu, "ThyNVM: Eablig Software-Trasparet Crash Cosistecy i Persistet Memory Systems" Proceedigs of the 48th Iteratioal Symposium o Microarchitecture (MICRO), Waikiki, Hawaii, USA, December [Slides (pptx) (pdf)] [Lightig Sessio Slides (pptx) (pdf)] [Poster (pptx) (pdf)] [Source Code] 106
107 Aother Key Challege i Persistet Memory Programmig Ease to Exploit Persistece 107
108 Tools/Libraries to Help Programmers Himashu Chauha, Iria Calciu, Vijay Chidambaram, Eric Schkufza, Our Mutlu, ad Pratap Subrahmayam, "NVMove: Helpig Programmers Move to Byte-Based Persistece" Proceedigs of the 4th Workshop o Iteractios of NVM/Flash with Operatig Systems ad Workloads (INFLOW), Savaah, GA, USA, November [Slides (pptx) (pdf)] 108
109 Ageda Major Treds Affectig Mai Memory The Memory Scalig Problem Two Complemetary Solutio Directios New Memory (DRAM) Architectures Eablig Emergig (NVM) Techologies Coclusio 109
110 The Future of Emergig Techologies is Bright Regardless of challeges i uderlyig techology ad overlyig problems/reuiremets Ca eable: - Orders of magitude improvemets - New applicatios ad computig systems Problem Algorithm Program/Laguage System Software SW/HW Iterface Micro-architecture Logic Devices Electros Yet, we have to - Thik across the stack - Desig eablig systems 110
111 If I Doubt, Refer to Flash Memory A very doubtful emergig techology for at least two decades Proceedigs of the IEEE, Sept
112 Opportuities ad Challeges of Emergig Memory Techologies Our Mutlu September 11, 2017 ARM Research Summit
113 Overview Paper o Flash Reliability Yu Cai, Saugata Ghose, Erich F. Haratsch, Yixi Luo, ad Our Mutlu, "Error Characterizatio, Mitigatio, ad Recovery i Flash Memory Based Solid State Drives" to appear i Proceedigs of the IEEE,
114 Solutio 2: Emergig Memory Techologies Some emergig resistive memory techologies seem more scalable tha DRAM (ad they are o-volatile) Example: Phase Chage Memory Expected to scale to 9m (2022 [ITRS]) Expected to be deser tha DRAM: ca store multiple bits/cell But, emergig techologies have shortcomigs as well Ca they be eabled to replace/augmet/surpass DRAM? Lee+, Architectig Phase Chage Memory as a Scalable DRAM Alterative, ISCA 09, CACM 10, IEEE Micro 10. Meza+, Eablig Efficiet ad Scalable Hybrid Memories, IEEE Comp. Arch. Letters Yoo, Meza+, Row Buffer Locality Aware Cachig Policies for Hybrid Memories, ICCD Kultursay+, Evaluatig STT-RAM as a Eergy-Efficiet Mai Memory Alterative, ISPASS Meza+, A Case for Efficiet Hardware-Software Cooperative Maagemet of Storage ad Memory, WEED Lu+, Loose Orderig Cosistecy for Persistet Memory, ICCD Zhao+, FIRM: Fair ad High-Performace Memory Cotrol for Persistet Memory Systems, MICRO Yoo, Meza+, Efficiet Data Mappig ad Bufferig Techiues for Multi-Level Cell Phase-Chage Memories, TACO Re+, ThyNVM: Eablig Software-Trasparet Crash Cosistecy i Persistet Memory Systems, MICRO Chauha+, NVMove: Helpig Programmers Move to Byte-Based Persistece, INFLOW Li+, Utility-Based Hybrid Memory Maagemet, CLUSTER Yu+, Bashee: Badwidth-Efficiet DRAM Cachig via Software/Hardware Cooperatio, MICRO
115 Combiatio: Hybrid Memory Systems DRAM Fast, durable Small, leaky, volatile, high-cost DRAM Ctrl CPU PCM Ctrl Techology X (e.g., PCM) Large, o-volatile, low-cost Slow, wears out, high active eergy Hardware/software maage data allocatio ad movemet to achieve the best of multiple techologies Meza+, Eablig Efficiet ad Scalable Hybrid Memories, IEEE Comp. Arch. Letters, Yoo, Meza et al., Row Buffer Locality Aware Cachig Policies for Hybrid Memories, ICCD 2012 Best Paper Award.
116 Exploitig Memory Error Tolerace with Hybrid Memory Systems Memory error vulerability Vulerable Tolerat data data Reliable memory Low-cost memory ECC Vulerable O protected Microsoft s Web Search NoECCworkload Tolerat Parity Reduces Well-tested data server hardware chips cost Less-tested by 4.7 % data chips Achieves sigle server availability target of % App/Data A App/Data B App/Data C Heterogeeous-Reliability Memory [DSN 2014] 116
117 More o Heterogeeous Reliability Memory Yixi Luo, Sriram Govida, Bikash Sharma, Mark Sataiello, Justi Meza, Ama Kasal, Jie Liu, Badriddie Khessib, Kushagra Vaid, ad Our Mutlu, "Characterizig Applicatio Memory Error Vulerability to Optimize Data Ceter Cost via Heterogeeous-Reliability Memory" Proceedigs of the 44th Aual IEEE/IFIP Iteratioal Coferece o Depedable Systems ad Networks (DSN), Atlata, GA, Jue [Summary] [Slides (pptx) (pdf)] [Coverage o ZDNet] 117
118 Challege ad Opportuity Providig the Best of Multiple Metrics with Multiple Memory Techologies 118
119 Challege ad Opportuity Heterogeeous, Cofigurable, Programmable Memory Systems 119
UH-MEM: Utility-Based Hybrid Memory Management. Yang Li, Saugata Ghose, Jongmoo Choi, Jin Sun, Hui Wang, Onur Mutlu
UH-MEM: Utility-Based Hybrid Memory Maagemet Yag Li, Saugata Ghose, Jogmoo Choi, Ji Su, Hui Wag, Our Mutlu 1 Executive Summary DRAM faces sigificat techology scalig difficulties Emergig memory techologies
More informationScalable Many-Core Memory Systems Lecture 3, Topic 2: Emerging Technologies and Hybrid Memories
Scalable Many-Core Memory Systems Lecture 3, Topic 2: Emerging Technologies and Hybrid Memories Prof. Onur Mutlu http://www.ece.cmu.edu/~omutlu onur@cmu.edu HiPEAC ACACES Summer School 2013 July 17, 2013
More information18-740: Computer Architecture Recitation 5: Main Memory Scaling Wrap-Up. Prof. Onur Mutlu Carnegie Mellon University Fall 2015 September 29, 2015
18-740: Computer Architecture Recitation 5: Main Memory Scaling Wrap-Up Prof. Onur Mutlu Carnegie Mellon University Fall 2015 September 29, 2015 Review Assignments for Next Week Required Reviews Due Tuesday
More informationMorgan Kaufmann Publishers 26 February, COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5
Morga Kaufma Publishers 26 February, 28 COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 5 Set-Associative Cache Architecture Performace Summary Whe CPU performace icreases:
More informationCMSC Computer Architecture Lecture 12: Virtual Memory. Prof. Yanjing Li University of Chicago
CMSC 22200 Computer Architecture Lecture 12: Virtual Memory Prof. Yajig Li Uiversity of Chicago A System with Physical Memory Oly Examples: most Cray machies early PCs Memory early all embedded systems
More informationCMSC Computer Architecture Lecture 11: More Caches. Prof. Yanjing Li University of Chicago
CMSC 22200 Computer Architecture Lecture 11: More Caches Prof. Yajig Li Uiversity of Chicago Lecture Outlie Caches 2 Review Memory hierarchy Cache basics Locality priciples Spatial ad temporal How to access
More informationCourse Site: Copyright 2012, Elsevier Inc. All rights reserved.
Course Site: http://cc.sjtu.edu.c/g2s/site/aca.html 1 Computer Architecture A Quatitative Approach, Fifth Editio Chapter 2 Memory Hierarchy Desig 2 Outlie Memory Hierarchy Cache Desig Basic Cache Optimizatios
More informationMorgan Kaufmann Publishers 26 February, COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5.
Morga Kaufma Publishers 26 February, 208 COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 5 Virtual Memory Review: The Memory Hierarchy Take advatage of the priciple
More informationMaster Informatics Eng. 2017/18. A.J.Proença. Memory Hierarchy. (most slides are borrowed) AJProença, Advanced Architectures, MiEI, UMinho, 2017/18 1
Advaced Architectures Master Iformatics Eg. 2017/18 A.J.Proeça Memory Hierarchy (most slides are borrowed) AJProeça, Advaced Architectures, MiEI, UMiho, 2017/18 1 Itroductio Programmers wat ulimited amouts
More informationPhase Change Memory An Architecture and Systems Perspective
Phase Change Memory An Architecture and Systems Perspective Benjamin C. Lee Stanford University bcclee@stanford.edu Fall 2010, Assistant Professor @ Duke University Benjamin C. Lee 1 Memory Scaling density,
More informationCMSC Computer Architecture Lecture 10: Caches. Prof. Yanjing Li University of Chicago
CMSC 22200 Computer Architecture Lecture 10: Caches Prof. Yajig Li Uiversity of Chicago Midterm Recap Overview ad fudametal cocepts ISA Uarch Datapath, cotrol Sigle cycle, multi cycle Pipeliig Basic idea,
More informationThe University of Adelaide, School of Computer Science 22 November Computer Architecture. A Quantitative Approach, Sixth Edition.
Computer Architecture A Quatitative Approach, Sixth Editio Chapter 2 Memory Hierarchy Desig 1 Itroductio Programmers wat ulimited amouts of memory with low latecy Fast memory techology is more expesive
More informationMulti-Threading. Hyper-, Multi-, and Simultaneous Thread Execution
Multi-Threadig Hyper-, Multi-, ad Simultaeous Thread Executio 1 Performace To Date Icreasig processor performace Pipeliig. Brach predictio. Super-scalar executio. Out-of-order executio. Caches. Hyper-Threadig
More informationMultiprocessors. HPC Prof. Robert van Engelen
Multiprocessors Prof. Robert va Egele Overview The PMS model Shared memory multiprocessors Basic shared memory systems SMP, Multicore, ad COMA Distributed memory multicomputers MPP systems Network topologies
More informationSOLVING THE DRAM SCALING CHALLENGE: RETHINKING THE INTERFACE BETWEEN CIRCUITS, ARCHITECTURE, AND SYSTEMS
SOLVING THE DRAM SCALING CHALLENGE: RETHINKING THE INTERFACE BETWEEN CIRCUITS, ARCHITECTURE, AND SYSTEMS Samira Khan MEMORY IN TODAY S SYSTEM Processor DRAM Memory Storage DRAM is critical for performance
More informationChapter 4 Threads. Operating Systems: Internals and Design Principles. Ninth Edition By William Stallings
Operatig Systems: Iterals ad Desig Priciples Chapter 4 Threads Nith Editio By William Stalligs Processes ad Threads Resource Owership Process icludes a virtual address space to hold the process image The
More informationCS2410 Computer Architecture. Flynn s Taxonomy
CS2410 Computer Architecture Dept. of Computer Sciece Uiversity of Pittsburgh http://www.cs.pitt.edu/~melhem/courses/2410p/idex.html 1 Fly s Taxoomy SISD Sigle istructio stream Sigle data stream (SIMD)
More information18-740: Computer Architecture Recitation 4: Rethinking Memory System Design. Prof. Onur Mutlu Carnegie Mellon University Fall 2015 September 22, 2015
18-740: Computer Architecture Recitation 4: Rethinking Memory System Design Prof. Onur Mutlu Carnegie Mellon University Fall 2015 September 22, 2015 Agenda Review Assignments for Next Week Rethinking Memory
More informationTransforming Irregular Algorithms for Heterogeneous Computing - Case Studies in Bioinformatics
Trasformig Irregular lgorithms for Heterogeeous omputig - ase Studies i ioiformatics Jig Zhag dvisor: Dr. Wu Feg ollaborator: Hao Wag syergy.cs.vt.edu Irregular lgorithms haracterized by Operate o irregular
More informationPhase Change Memory An Architecture and Systems Perspective
Phase Change Memory An Architecture and Systems Perspective Benjamin Lee Electrical Engineering Stanford University Stanford EE382 2 December 2009 Benjamin Lee 1 :: PCM :: 2 Dec 09 Memory Scaling density,
More informationService Oriented Enterprise Architecture and Service Oriented Enterprise
Approved for Public Release Distributio Ulimited Case Number: 09-2786 The 23 rd Ope Group Eterprise Practitioers Coferece Service Orieted Eterprise ad Service Orieted Eterprise Ya Zhao, PhD Pricipal, MITRE
More informationCSC 220: Computer Organization Unit 11 Basic Computer Organization and Design
College of Computer ad Iformatio Scieces Departmet of Computer Sciece CSC 220: Computer Orgaizatio Uit 11 Basic Computer Orgaizatio ad Desig 1 For the rest of the semester, we ll focus o computer architecture:
More informationAppendix D. Controller Implementation
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Appedix D Cotroller Implemetatio Cotroller Implemetatios Combiatioal logic (sigle-cycle); Fiite state machie (multi-cycle, pipelied);
More informationCMSC22200 Computer Architecture Lecture 9: Out-of-Order, SIMD, VLIW. Prof. Yanjing Li University of Chicago
CMSC22200 Computer Architecture Lecture 9: Out-of-Order, SIMD, VLIW Prof. Yajig Li Uiversity of Chicago Admiistrative Stuff Lab2 due toight Exam I: covers lectures 1-9 Ope book, ope otes, close device
More informationn Explore virtualization concepts n Become familiar with cloud concepts
Chapter Objectives Explore virtualizatio cocepts Become familiar with cloud cocepts Chapter #15: Architecture ad Desig 2 Hypervisor Virtualizatio ad cloud services are becomig commo eterprise tools to
More informationComputer Graphics Hardware An Overview
Computer Graphics Hardware A Overview Graphics System Moitor Iput devices CPU/Memory GPU Raster Graphics System Raster: A array of picture elemets Based o raster-sca TV techology The scree (ad a picture)
More informationOperating System Concepts. Operating System Concepts
Chapter 4: Mass-Storage Systems Logical Disk Structure Logical Disk Structure Disk Schedulig Disk Maagemet RAID Structure Disk drives are addressed as large -dimesioal arrays of logical blocks, where the
More informationComputer Architecture ELEC3441
CPU-Memory Bottleeck Computer Architecture ELEC44 CPU Memory Lecture 8 Cache Dr. Hayde Kwok-Hay So Departmet of Electrical ad Electroic Egieerig Performace of high-speed computers is usually limited by
More informationCopyright 2016 Ramez Elmasri and Shamkant B. Navathe
Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe CHAPTER 20 Itroductio to Trasactio Processig Cocepts ad Theory Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe Itroductio Trasactio Describes local
More informationBasic allocator mechanisms The course that gives CMU its Zip! Memory Management II: Dynamic Storage Allocation Mar 6, 2000.
5-23 The course that gives CM its Zip Memory Maagemet II: Dyamic Storage Allocatio Mar 6, 2000 Topics Segregated lists Buddy system Garbage collectio Mark ad Sweep Copyig eferece coutig Basic allocator
More informationSCI Reflective Memory
Embedded SCI Solutios SCI Reflective Memory (Experimetal) Atle Vesterkjær Dolphi Itercoect Solutios AS Olaf Helsets vei 6, N-0621 Oslo, Norway Phoe: (47) 23 16 71 42 Fax: (47) 23 16 71 80 Mail: atleve@dolphiics.o
More informationBig Data Capacity Planning: Achieving Right Sized Hadoop Clusters and Optimized Operations
Big Data Capacity Plaig: Achievig Right Sized Hadoop Clusters ad Optimized Operatios Abstract Busiesses are cosiderig more opportuities to leverage data for differet purposes, impactig resources ad resultig
More informationSession Initiated Protocol (SIP) and Message-based Load Balancing (MBLB)
F5 White Paper Sessio Iitiated Protocol (SIP) ad Message-based Load Balacig (MBLB) The ability to provide ew ad creative methods of commuicatios has esured a SIP presece i almost every orgaizatio. The
More informationOutline. CSCI 4730 Operating Systems. Questions. What is an Operating System? Computer System Layers. Computer System Layers
Outlie CSCI 4730 s! What is a s?!! System Compoet Architecture s Overview Questios What is a?! What are the major operatig system compoets?! What are basic computer system orgaizatios?! How do you commuicate
More informationReducing SDRAM Energy Consumption in Embedded Systems Λ
Reducig SDRAM Eergy Cosumptio i Embedded Systems Λ Jelea Trajkovic, Alexader Veidebaum Uiversity of Califoria, Irvie fjeleat,alexvg@ics.uci.edu Abstract DRAM eergy cosumptio i embedded systems ca be very
More informationReview: The ACID properties
Recovery Review: The ACID properties A tomicity: All actios i the Xactio happe, or oe happe. C osistecy: If each Xactio is cosistet, ad the DB starts cosistet, it eds up cosistet. I solatio: Executio of
More informationCOMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Single-Cycle Disadvantages & Advantages
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 4 The Processor Pipeliig Sigle-Cycle Disadvatages & Advatages Clk Uses the clock cycle iefficietly the clock cycle must
More informationIsn t It Time You Got Faster, Quicker?
Is t It Time You Got Faster, Quicker? AltiVec Techology At-a-Glace OVERVIEW Motorola s advaced AltiVec techology is desiged to eable host processors compatible with the PowerPC istructio-set architecture
More informationCOMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Part A Datapath Design
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter The Processor Part A path Desig Itroductio CPU performace factors Istructio cout Determied by ISA ad compiler. CPI ad
More informationAvid Interplay Bundle
Avid Iterplay Budle Versio 2.5 Cofigurator ReadMe Overview This documet provides a overview of Iterplay Budle v2.5 ad describes how to ru the Iterplay Budle cofiguratio tool. Iterplay Budle v2.5 refers
More informationRethinking Memory System Design Business as Usual in the Next Decade?
Rethinking Memory System Design Business as Usual in the Next Decade? Onur Mutlu onur.mutlu@inf.ethz.ch http://users.ece.cmu.edu/~omutlu/ September 23, 2016 MST Workshop Keynote (Milan) The Main Memory
More informationComputer Architecture Lecture 12: Memory Interference and Quality of Service. Prof. Onur Mutlu ETH Zürich Fall November 2017
Computer Architecture Lecture 12: Memory Iterferece ad Quality of Service Prof. Our Mutlu ETH Zürich Fall 2017 1 November 2017 Summary of Last Week s Lectures Cotrol Depedece Hadlig Problem Six solutios
More informationCopyright 2016 Ramez Elmasri and Shamkant B. Navathe
Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe CHAPTER 22 Database Recovery Techiques Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe Itroductio Recovery algorithms Recovery cocepts Write-ahead
More informationn Learn how resiliency strategies reduce risk n Discover automation strategies to reduce risk
Chapter Objectives Lear how resiliecy strategies reduce risk Discover automatio strategies to reduce risk Chapter #16: Architecture ad Desig Resiliecy ad Automatio Strategies 2 Automatio/Scriptig Resiliet
More informationAdministrative UNSUPERVISED LEARNING. Unsupervised learning. Supervised learning 11/25/13. Final project. No office hours today
Admiistrative Fial project No office hours today UNSUPERVISED LEARNING David Kauchak CS 451 Fall 2013 Supervised learig Usupervised learig label label 1 label 3 model/ predictor label 4 label 5 Supervised
More informationPage 1. Why Care About the Memory Hierarchy? Memory. DRAMs over Time. Virtual Memory!
Why Care About the Memory Hierarchy? Memory Virtual Memory -DRAM Memory Gap (latecy) Reasos: Multi process systems (abstractio & memory protectio) Solutio: Tables (holdig per process traslatios) Fast traslatio
More informationData Protection: Your Choice Is Simple PARTNER LOGO
Data Protectio: Your Choice Is Simple PARTNER LOGO Is Your Data Truly Protected? The growth, value ad mobility of data are placig icreasig pressure o orgaizatios. IT must esure assets are properly protected
More informationInstruction and Data Streams
Advaced Architectures Master Iformatics Eg. 2017/18 A.J.Proeça Data Parallelism 1 (vector & SIMD extesios) (most slides are borrowed) AJProeça, Advaced Architectures, MiEI, UMiho, 2017/18 1 Istructio ad
More informationCopyright 2016 Ramez Elmasri and Shamkant B. Navathe
Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe CHAPTER 18 Strategies for Query Processig Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe Itroductio DBMS techiques to process a query Scaer idetifies
More informationCMSC Computer Architecture Lecture 5: Pipelining. Prof. Yanjing Li University of Chicago
CMSC 22200 Computer Architecture Lecture 5: Pipeliig Prof. Yajig Li Uiversity of Chicago Admiistrative Stuff Lab1 Due toight Lab2: out later today; due 2 weeks from ow Review sessio this Friday Turig award
More informationA Row Buffer Locality-Aware Caching Policy for Hybrid Memories. HanBin Yoon Justin Meza Rachata Ausavarungnirun Rachael Harding Onur Mutlu
A Row Buffer Locality-Aware Caching Policy for Hybrid Memories HanBin Yoon Justin Meza Rachata Ausavarungnirun Rachael Harding Onur Mutlu Overview Emerging memories such as PCM offer higher density than
More informationSwitching Hardware. Spring 2018 CS 438 Staff, University of Illinois 1
Switchig Hardware Sprig 208 CS 438 Staff, Uiversity of Illiois Where are we? Uderstad Differet ways to move through a etwork (forwardig) Read sigs at each switch (datagram) Follow a kow path (virtual circuit)
More informationRethinking Memory System Design for Data-Intensive Computing. Onur Mutlu October 11-16, 2013
Rethinking Memory System Design for Data-Intensive Computing Onur Mutlu onur@cmu.edu October 11-16, 2013 Three Key Problems in Systems The memory system Data storage and movement limit performance & efficiency
More informationCopyright 2016 Ramez Elmasri and Shamkant B. Navathe
Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe CHAPTER 19 Query Optimizatio Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe Itroductio Query optimizatio Coducted by a query optimizer i a DBMS Goal:
More informationFAST BIT-REVERSALS ON UNIPROCESSORS AND SHARED-MEMORY MULTIPROCESSORS
SIAM J. SCI. COMPUT. Vol. 22, No. 6, pp. 2113 2134 c 21 Society for Idustrial ad Applied Mathematics FAST BIT-REVERSALS ON UNIPROCESSORS AND SHARED-MEMORY MULTIPROCESSORS ZHAO ZHANG AND XIAODONG ZHANG
More informationCS61C : Machine Structures
CS 61C L24 VM II (1) ist.eecs.berkele.edu/~cs61c/su5 CS61C : Machie Structures Lecture #24: VM II Address Mappig: Virtual Address: VPN offset 25-8-2 Ad Carle idex ito page table located i phsical memor
More informationDCMIX: Generating Mixed Workloads for the Cloud Data Center
DCMIX: Geeratig Mixed Workloads for the Cloud Data Ceter XigWag Xiog, Lei Wag, WaLig Gao, Rui Re, Ke Liu, Che Zheg, Yu We, YiLiag Istitute of Computig Techology, Chiese Academy of Scieces Bech 2018, Seattle,
More informationComputer Architecture
Computer Architecture Overview Prof. Tie-Fu Che Dept. of Computer Sciece Natioal Chug Cheg Uiv Sprig 2002 Overview- Computer Architecture Course Focus Uderstadig the desig techiques, machie structures,
More informationUniprocessors. HPC Prof. Robert van Engelen
Uiprocessors HPC Prof. Robert va Egele Overview PART I: Uiprocessors PART II: Multiprocessors ad ad Compiler Optimizatios Parallel Programmig Models Uiprocessors Multiprocessors Processor architectures
More informationU8 Flash Memory Controller
U8 Flash Memory Cotroller U8 U8 Flash Memory Cotroller The Hyperstoe U8 family of Flash Memory Cotrollers together with provided applicatio ad Flash specific firmware offers a easy-to-use turkey platform
More informationAnalysis Metrics. Intro to Algorithm Analysis. Slides. 12. Alg Analysis. 12. Alg Analysis
Itro to Algorithm Aalysis Aalysis Metrics Slides. Table of Cotets. Aalysis Metrics 3. Exact Aalysis Rules 4. Simple Summatio 5. Summatio Formulas 6. Order of Magitude 7. Big-O otatio 8. Big-O Theorems
More informationScalable Many-Core Memory Systems Topic 2: Emerging Technologies and Hybrid Memories
Scalable Many-Core Memory Systems Topic 2: Emerging Technologies and Hybrid Memories Prof. Onur Mutlu http://www.ece.cmu.edu/~omutlu onur@cmu.edu HiPEAC ACACES Summer School 2013 July 15-19, 2013 What
More informationChapter 1. Introduction to Computers and C++ Programming. Copyright 2015 Pearson Education, Ltd.. All rights reserved.
Chapter 1 Itroductio to Computers ad C++ Programmig Copyright 2015 Pearso Educatio, Ltd.. All rights reserved. Overview 1.1 Computer Systems 1.2 Programmig ad Problem Solvig 1.3 Itroductio to C++ 1.4 Testig
More informationPhase-change RAM (PRAM)- based Main Memory
Phase-change RAM (PRAM)- based Main Memory Sungjoo Yoo April 19, 2011 Embedded System Architecture Lab. POSTECH sungjoo.yoo@gmail.com Agenda Introduction Current status Hybrid PRAM/DRAM main memory Next
More informationData diverse software fault tolerance techniques
Data diverse software fault tolerace techiques Complemets desig diversity by compesatig for desig diversity s s limitatios Ivolves obtaiig a related set of poits i the program data space, executig the
More informationGoals of the Lecture UML Implementation Diagrams
Goals of the Lecture UML Implemetatio Diagrams Object-Orieted Aalysis ad Desig - Fall 1998 Preset UML Diagrams useful for implemetatio Provide examples Next Lecture Ð A variety of topics o mappig from
More informationEvaluating STT-RAM as an Energy-Efficient Main Memory Alternative
Evaluating STT-RAM as an Energy-Efficient Main Memory Alternative Emre Kültürsay *, Mahmut Kandemir *, Anand Sivasubramaniam *, and Onur Mutlu * Pennsylvania State University Carnegie Mellon University
More informationStructuring Redundancy for Fault Tolerance. CSE 598D: Fault Tolerant Software
Structurig Redudacy for Fault Tolerace CSE 598D: Fault Tolerat Software What do we wat to achieve? Versios Damage Assessmet Versio 1 Error Detectio Iputs Versio 2 Voter Outputs State Restoratio Cotiued
More informationCOMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor Advanced Issues
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 4 The Processor Advaced Issues Review: Pipelie Hazards Structural hazards Desig pipelie to elimiate structural hazards.
More informationCache-Optimal Methods for Bit-Reversals
Proceedigs of the ACM/IEEE Supercomputig Coferece, November 1999, Portlad, Orego, U.S.A. Cache-Optimal Methods for Bit-Reversals Zhao Zhag ad Xiaodog Zhag Departmet of Computer Sciece College of William
More informationTrusted Design in FPGAs
Trusted Desig i FPGAs Mark Tehraipoor Itroductio to Hardware Security & Trust Uiversity of Florida 1 Outlie Itro to FPGA Architecture FPGA Overview Maufacturig Flow FPGA Security Attacks Defeses Curret
More informationDesign of Digital Circuits Lecture 22: GPU Programming. Dr. Juan Gómez Luna Prof. Onur Mutlu ETH Zurich Spring May 2018
Desig of Digital Circuits Lecture 22: GPU Programmig Dr. Jua Gómez Lua Prof. Our Mutlu ETH Zurich Sprig 2018 18 May 2018 Ageda for Today GPU as a accelerator Program structure Bulk sychroous programmig
More informationIT administrators face a variety of challenges
Flexible Computig Optimized Desktop Ifrastructure Usig Dell Flexible Computig Solutios By Joh Schoute Ramesh Radhakrisha, Ph.D. The Dell Flexible Computig Solutios suite of products ad services is desiged
More informationComputer Architecture: Main Memory (Part II) Prof. Onur Mutlu Carnegie Mellon University
Computer Architecture: Main Memory (Part II) Prof. Onur Mutlu Carnegie Mellon University Main Memory Lectures These slides are from the Scalable Memory Systems course taught at ACACES 2013 (July 15-19,
More informationECE4050 Data Structures and Algorithms. Lecture 6: Searching
ECE4050 Data Structures ad Algorithms Lecture 6: Searchig 1 Search Give: Distict keys k 1, k 2,, k ad collectio L of records of the form (k 1, I 1 ), (k 2, I 2 ),, (k, I ) where I j is the iformatio associated
More informationCOSC 1P03. Ch 7 Recursion. Introduction to Data Structures 8.1
COSC 1P03 Ch 7 Recursio Itroductio to Data Structures 8.1 COSC 1P03 Recursio Recursio I Mathematics factorial Fiboacci umbers defie ifiite set with fiite defiitio I Computer Sciece sytax rules fiite defiitio,
More informationAdaptive Resource Allocation for Electric Environmental Pollution through the Control Network
Available olie at www.sciecedirect.com Eergy Procedia 6 (202) 60 64 202 Iteratioal Coferece o Future Eergy, Eviromet, ad Materials Adaptive Resource Allocatio for Electric Evirometal Pollutio through the
More informationChapter 4 The Datapath
The Ageda Chapter 4 The Datapath Based o slides McGraw-Hill Additioal material 24/25/26 Lewis/Marti Additioal material 28 Roth Additioal material 2 Taylor Additioal material 2 Farmer Tae the elemets that
More information1. SWITCHING FUNDAMENTALS
. SWITCING FUNDMENTLS Switchig is the provisio of a o-demad coectio betwee two ed poits. Two distict switchig techiques are employed i commuicatio etwors-- circuit switchig ad pacet switchig. Circuit switchig
More informationStone Images Retrieval Based on Color Histogram
Stoe Images Retrieval Based o Color Histogram Qiag Zhao, Jie Yag, Jigyi Yag, Hogxig Liu School of Iformatio Egieerig, Wuha Uiversity of Techology Wuha, Chia Abstract Stoe images color features are chose
More informationA Parallel Reconfigurable Architecture for Real-Time Stereo Vision
2009 Iteratioal Cofereces o Embedded Software ad Systems A Parallel Recofigurable Architecture for Real-Time Stereo Visio Lei Che Yude Jia Beijig Laboratory of Itelliget Iformatio Techology, School of
More informationEfficient Hardware Design for Implementation of Matrix Multiplication by using PPI-SO
Efficiet Hardware Desig for Implemetatio of Matrix Multiplicatio by usig PPI-SO Shivagi Tiwari, Niti Meea Dept. of EC, IES College of Techology, Bhopal, Idia Assistat Professor, Dept. of EC, IES College
More information1 Enterprise Modeler
1 Eterprise Modeler Itroductio I BaaERP, a Busiess Cotrol Model ad a Eterprise Structure Model for multi-site cofiguratios are itroduced. Eterprise Structure Model Busiess Cotrol Models Busiess Fuctio
More informationSoftware development of components for complex signal analysis on the example of adaptive recursive estimation methods.
Software developmet of compoets for complex sigal aalysis o the example of adaptive recursive estimatio methods. SIMON BOYMANN, RALPH MASCHOTTA, SILKE LEHMANN, DUNJA STEUER Istitute of Biomedical Egieerig
More information3D Model Retrieval Method Based on Sample Prediction
20 Iteratioal Coferece o Computer Commuicatio ad Maagemet Proc.of CSIT vol.5 (20) (20) IACSIT Press, Sigapore 3D Model Retrieval Method Based o Sample Predictio Qigche Zhag, Ya Tag* School of Computer
More informationWhat are Information Systems?
Iformatio Systems Cocepts What are Iformatio Systems? Roma Kotchakov Birkbeck, Uiversity of Lodo Based o Chapter 1 of Beett, McRobb ad Farmer: Object Orieted Systems Aalysis ad Desig Usig UML, (4th Editio),
More informationScalable High Performance Main Memory System Using PCM Technology
Scalable High Performance Main Memory System Using PCM Technology Moinuddin K. Qureshi Viji Srinivasan and Jude Rivers IBM T. J. Watson Research Center, Yorktown Heights, NY International Symposium on
More informationArchitectural styles for software systems The client-server style
Architectural styles for software systems The cliet-server style Prof. Paolo Ciacarii Software Architecture CdL M Iformatica Uiversità di Bologa Ageda Cliet server style CS two tiers CS three tiers CS
More informationAn Improved Shuffled Frog-Leaping Algorithm for Knapsack Problem
A Improved Shuffled Frog-Leapig Algorithm for Kapsack Problem Zhoufag Li, Ya Zhou, ad Peg Cheg School of Iformatio Sciece ad Egieerig Hea Uiversity of Techology ZhegZhou, Chia lzhf1978@126.com Abstract.
More informationCMSC Computer Architecture Lecture 2: ISA. Prof. Yanjing Li Department of Computer Science University of Chicago
CMSC 22200 Computer Architecture Lecture 2: ISA Prof. Yajig Li Departmet of Computer Sciece Uiversity of Chicago Admiistrative Stuff Lab1 out toight Due Thursday (10/18) Lab1 review sessio Tomorrow, 10/05,
More informationCOMPUTER ORGANIZATION AND DESIGN
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 5 Large ad Fast: Exploitig Memory Hierarchy Priciple of Locality Programs access a small proportio of their address space
More informationAccuracy Improvement in Camera Calibration
Accuracy Improvemet i Camera Calibratio FaJie L Qi Zag ad Reihard Klette CITR, Computer Sciece Departmet The Uiversity of Aucklad Tamaki Campus, Aucklad, New Zealad fli006, qza001@ec.aucklad.ac.z r.klette@aucklad.ac.z
More informationCOMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5. Large and Fast: Exploiting Memory Hierarchy
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface ARM Editio Chapter 5 Large ad Fast: Exploitig Memory Hierarchy Priciple of Locality Programs access a small proportio of their address space
More informationComputers and Scientific Thinking
Computers ad Scietific Thikig David Reed, Creighto Uiversity Chapter 15 JavaScript Strigs 1 Strigs as Objects so far, your iteractive Web pages have maipulated strigs i simple ways use text box to iput
More informationThe Value of Peering
The Value of Peerig ISP/IXP Workshops These materials are licesed uder the Creative Commos Attributio-NoCommercial 4.0 Iteratioal licese (http://creativecommos.org/liceses/by-c/4.0/) Last updated 25 th
More information1&1 Next Level Hosting
1&1 Next Level Hostig Performace Level: Performace that grows with your requiremets Copyright 1&1 Iteret SE 2017 1ad1.com 2 1&1 NEXT LEVEL HOSTING 3 Fast page loadig ad short respose times play importat
More informationCSE 305. Computer Architecture
CSE 305 Computer Architecture Computer Architecture Course Teachers Rifat Shahriyar (rifat1816@gmail.com) Johra Muhammad Moosa Textbook Computer Orgaizatio ad Desig (The Hardware/Software Iterface) David
More informationBanshee: Bandwidth-Efficient DRAM Caching via Software/Hardware Cooperation!
Banshee: Bandwidth-Efficient DRAM Caching via Software/Hardware Cooperation! Xiangyao Yu 1, Christopher Hughes 2, Nadathur Satish 2, Onur Mutlu 3, Srinivas Devadas 1 1 MIT 2 Intel Labs 3 ETH Zürich 1 High-Bandwidth
More informationRow Buffer Locality Aware Caching Policies for Hybrid Memories. HanBin Yoon Justin Meza Rachata Ausavarungnirun Rachael Harding Onur Mutlu
Row Buffer Locality Aware Caching Policies for Hybrid Memories HanBin Yoon Justin Meza Rachata Ausavarungnirun Rachael Harding Onur Mutlu Executive Summary Different memory technologies have different
More informationCMSC Computer Architecture Lecture 15: Multi-Core. Prof. Yanjing Li University of Chicago
CMSC 22200 Computer Architecture Lecture 15: Multi-Core Prof. Yajig Li Uiversity of Chicago Course Evaluatio Very importat Please fill out! 2 Lab3 Brach Predictio Competitio 8 teams etered the competitio,
More information