Revolutionizing Technological Devices such as STT-RAM and their Multiple Implementation in the Cache Level Hierarchy
Michael Mosquera
Department of Electrical and Computer Engineering
University of Central Florida
Orlando, FL

Abstract—Many devices are currently being tested to replace the antiquated SRAM (static random access memory), which has served as the cache technology of choice for decades. New technological devices such as STT-RAM, eDRAM, and even PRAM are being introduced and are presently being tested to replace SRAM. Not only may these devices eventually replace SRAM, but testing is also being conducted to determine which cache level they should be placed in for maximum efficiency and data retrieval performance: Level 1, Level 2, or Level 3. Although most such devices, STT-RAM among them, have traditionally been placed in the Level 3 cache, new designs now incorporate them at Level 1 or Level 2.

Keywords—STT-RAM, SRAM, eDRAM, PRAM, Volatile, Non-Volatile, Cache, Level 1, Level 2, Level 3, LLC, Associativity, Protocol, write instruction, read instruction

I. INTRODUCTION

As with many technological advancements of recent years, developments in computer system processors are being realized. Within these processors exists a memory called cache, a section of memory located inside the CPU that stores data to be retrieved on request by the CPU. Compared to main memory located on the motherboard, cache is incredibly fast, and its placement admits a variety of multi-level configurations: Level 1, Level 2, and Level 3. When optimizing a processor's cache, either an individual cache level's capacity can be increased to store additional data, or the cache hierarchy itself can be extended from a two-level to a three-level structure. Multi-level caches are important and vital for data retrieval performance.
When a cache reaches its maximum capacity, rather than storing data only in main memory, data can also be placed in a Level 2 cache. The miss penalty also decreases with multi-level caches [13]. As well as having multiple cache level organizations, there exist several cache associativity approaches for placing data blocks within the cache for retrieval. These methods include full associativity, set associativity, and direct mapping, each having its own unique advantages and disadvantages in implementation [13]. Memory is another very important aspect of computer systems. Many technological devices exist, such as Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), Spin Transfer Torque RAM (STT-RAM), and embedded Dynamic Random Access Memory (eDRAM) [13]. While all of the listed devices are used for storing data, they differ in that some are considered volatile while others are non-volatile. Memory components such as SRAM, DRAM, and eDRAM are volatile: these devices lose data without a consistent supply of voltage. Other devices such as STT-RAM are non-volatile, with no data leakage or loss when the voltage source is removed. Depending on the cache associativity, the manner in which data is stored and accessed differs from method to method. In a direct-mapped cache, each block from memory maps to exactly one line in the cache, whereas in set associativity each block maps to a set containing a specific number of cache lines. Set-associative mapping is thus more flexible, while direct mapping is fixed: one block for one cache line. Direct mapping works by transferring the data at a specific memory address and using a tag to determine where the desired block is positioned in the cache [13]. Set-associative mapping works by placing memory blocks into a limited number of cache lines, where, depending on the cache's associativity, a block can be placed in any one of a small set of candidate lines (e.g., 2, 4, or 8 ways).
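The direct-mapped and set-associative lookups described above can be sketched in a few lines of code. The 64-byte line size and the cache capacities below are illustrative assumptions, not parameters taken from any of the cited papers.

```python
# Sketch: where a memory address lands in a direct-mapped vs. a
# set-associative cache. Line size (64 B) and capacities are assumed.

LINE_SIZE = 64  # bytes per cache line (assumption)

def map_address(addr, capacity_bytes, ways):
    """Return (tag, set_index, offset) for an address.

    ways=1 gives direct mapping: each block has exactly one candidate
    line. ways>1 gives set associativity: the block may occupy any of
    `ways` lines within its set.
    """
    num_lines = capacity_bytes // LINE_SIZE
    num_sets = num_lines // ways
    offset = addr % LINE_SIZE          # byte position inside the line
    block = addr // LINE_SIZE          # memory block number
    set_index = block % num_sets       # which set (or line, if ways=1)
    tag = block // num_sets            # identifies the block's origin
    return tag, set_index, offset

# Direct-mapped 32 KB cache: one candidate line per block.
print(map_address(0x1234_5678, 32 * 1024, ways=1))
# 8-way set-associative 32 KB cache: 8 candidate lines per block.
print(map_address(0x1234_5678, 32 * 1024, ways=8))
```

Note how moving from direct-mapped to 8-way shrinks the index field and grows the tag: fewer sets means more blocks compete for each set, but each set now holds eight lines.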
Although many computer systems differ in cache level organization, the cache structure remains consistently the same. Just as memory cells contain data, within the cache exist cache lines that store the data from memory. Cache lines are segmented to contain the specific information needed by the processor: each line contains both a tag and the data. The tags are essential for determining the location in main memory from which the block of data was retrieved [13]. Cache operates and functions much like main memory; the difference is that main memory is located significantly lower in the memory hierarchy, whereas cache is located within the CPU chip. The hardware placement of cache in the chip yields significant speed for data retrieval but limited capacity, whereas main memory offers abundant capacity but much slower access [13].
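The speed-versus-capacity tradeoff just described is commonly quantified as average memory access time (AMAT): hit time plus miss ratio times miss penalty. A minimal sketch follows; the latency figures are illustrative assumptions, not measurements from the cited papers.

```python
# Sketch: average memory access time (AMAT) for a single cache level.
# AMAT = hit_time + miss_ratio * miss_penalty. All latencies below
# are assumed illustrative values.

def amat(hit_time_ns, miss_ratio, miss_penalty_ns):
    """Average time per access, in nanoseconds."""
    return hit_time_ns + miss_ratio * miss_penalty_ns

# Mostly hits: the fast on-chip cache dominates.
fast = amat(hit_time_ns=1.0, miss_ratio=0.05, miss_penalty_ns=100.0)
# Mostly misses: the slow trip to main memory dominates.
slow = amat(hit_time_ns=1.0, miss_ratio=0.50, miss_penalty_ns=100.0)
print(fast, slow)
```

Even with these rough numbers, raising the miss ratio from 5% to 50% multiplies the average access time several-fold, which is why multi-level caches that lower the effective miss ratio matter so much.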
In the figure, cache placement is shown to be within the CPU chip, and contained within the cache are segmented sections, or cells, known as cache lines. The data placed into cache lines is retrieved from main memory, as displayed in the figure below. Whenever data retrieval is in process, each time a segment of data from memory is located in the cache, a hit occurs. Whenever a block of memory cannot be located within any of the cache levels, whether Level 1, Level 2, or Level 3, a miss occurs. With a high hit ratio, data retrieval is significantly faster, since the data already exists within the cache, yielding considerably low retrieval times. With a high miss ratio, data that cannot be located within the cache must be retrieved from main memory and placed into cache lines for the CPU to access; this entire process of locating and relocating data delays data retrieval and decreases overall speed. In the upcoming sections of the paper, new advancements will be discussed that have taken place over the past decade with certain technological devices, such as Spin Transfer Torque RAM and embedded DRAM, as well as other devices. Along with these devices, certain cache level configurations will be discussed, as well as optimizations that have taken place within each cache level, such as Level 2 and, most importantly, Level 3.

II. LITERATURE REVIEW

While adding cache memory to a computer system can be an excellent way to increase data access speed while decreasing retrieval time, certain issues have arisen, such as cache coherence [2]. The issue arises when multiple levels of cache contain data that is altered in one level; precautions must be taken to ensure that the data is modified throughout all levels of cache to maintain consistency [2].
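The tag-based hit and miss behavior described in the introduction can be made concrete with a toy direct-mapped cache. The sizes below are deliberately tiny, illustrative assumptions so that conflicts are visible; this is a sketch, not any cited design.

```python
# Sketch: a toy direct-mapped cache whose lines hold (tag, data),
# showing how the stored tag identifies which memory block a line
# came from. All sizes are assumptions chosen for illustration.

LINE_SIZE = 64
NUM_LINES = 8  # tiny on purpose, so conflict misses are easy to see

class ToyCache:
    def __init__(self):
        self.lines = [None] * NUM_LINES  # each entry: (tag, data)

    def access(self, addr, memory):
        block = addr // LINE_SIZE
        index = block % NUM_LINES
        tag = block // NUM_LINES
        line = self.lines[index]
        if line is not None and line[0] == tag:
            return "hit", line[1]          # tag matches: data is here
        # Miss: fetch the block from "main memory" and install it.
        data = memory[block]
        self.lines[index] = (tag, data)
        return "miss", data

memory = {b: f"block-{b}" for b in range(1024)}
cache = ToyCache()
print(cache.access(0x100, memory))                   # first touch: miss
print(cache.access(0x100, memory))                   # same block: hit
print(cache.access(0x100 + 8 * LINE_SIZE, memory))   # same index, new tag: conflict miss
```

The third access maps to the same line as the first but carries a different tag, so the old block is evicted; this is exactly the relocation cost that drags down performance when the miss ratio is high.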
With the immense enhancements in cache-level structures, new hybrid technological devices are being designed that can sustain heavily memory-oriented tasks. One of these hybrids, known as ASTRO, focuses on retrieving instructions stored in the main memory of the system [3]. Not only does ASTRO retrieve instructions, but the hybrid also reduces energy dissipation when compared to other devices [3]. Beyond cache level structures, certain advancements are also being made in energy preservation through minimal dissipation [4]. With issues surfacing in the use of technological devices such as SRAM, new substitutions are being made, such as the use of STT-RAM. Spin Transfer Torque RAM, the new device now being implemented, unlike SRAM does not suffer heavy leakage power loss but rather preserves energy while also being a non-volatile device. Last Level Caches, also known as LLCs, are placed lower in the memory hierarchy while still maintaining remarkable speeds, yet problems arise when the CPU must wait for data retrieval from the cache lines [5]. Solutions for these processor-related issues involve replacing the SRAM with STT-RAM, Spin Transfer Torque Random Access Memory [5]. STT-RAM is specifically used for data storage in the CPU's last level cache, which can also lead to a decrease in miss ratio, reducing the time to return data from the cache directly to the processor [5]. As specified earlier, STT-RAM has shown remarkable improvements in cache Level 3 when compared to its use in cache Level 1 and cache Level 2. Not only is the overall area reduced, but this improvement leads to fewer misses while also retaining information without any loss of data, albeit at the cost of increased write delay [7]. Although STT-RAM has shown significant results when placed in cache Level 3, certain factors inhibit its placement in other cache levels [7].
One of these factors is the excessive number of read and write instructions handled by the device, which can cause overheating and inaccurate placement of blocks within the pertaining cache lines. As shown in the configuration, when STT-RAM was placed in the Level 1 cache, its write instructions proved considerably slower than those of SRAM in cache Level 1, along with an increase in read and write energy consumption [7]. With new discoveries being made in cache level organization, new technological devices are coming to light for the purpose of replacing SRAM in the cache levels [8]. These include STT-RAM, as mentioned consistently throughout the paper, but also eDRAM [8]. Many cache levels are now being tested with these various device types to optimize them for maximum efficiency. Although some prove faster for data and instruction retrieval, these devices show downsides as well. Downsides for STT-RAM include increased energy use, while eDRAM requires certain refresh processes to retain the correct data blocks retrieved from memory and prevent data corruption [8]. Beyond the options of completely replacing devices such as SRAM with STT-RAM or eDRAM, alternatives exist that instead mix two types of devices; these are known as hybrids [10]. These hybrids incorporate a new classification of device known as non-volatile RAM. Not only do the hybrids offer greater storage than single-type devices, but they also do not require as much energy as conventional RAM alone [10].
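One generic way such hybrid caches are organized (a sketch of the general idea only, not the specific scheme of [10] or any other cited work; the threshold and region sizes are invented for illustration) is to steer write-heavy blocks toward the small, write-friendly SRAM region and read-mostly blocks toward the dense non-volatile region:

```python
# Sketch: a toy placement policy for a hybrid SRAM / STT-RAM cache.
# Purely illustrative assumptions; not the design of any cited paper.

SRAM_WAYS, STTRAM_WAYS = 2, 6      # few fast ways, many dense ways
WRITE_HEAVY_THRESHOLD = 4          # writes observed before a block
                                   # is considered write-heavy

def choose_region(write_count):
    """Write-heavy blocks go to SRAM (cheap, fast writes); the rest
    go to STT-RAM (dense and non-volatile, but costly to write)."""
    return "SRAM" if write_count >= WRITE_HEAVY_THRESHOLD else "STT-RAM"

print(choose_region(write_count=10))  # frequently written block
print(choose_region(write_count=1))   # read-mostly block
```

The design intuition is the one the paragraph above describes: the hybrid keeps the capacity and energy benefits of the non-volatile array while shielding it from the writes that are its weak point.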
As discussed throughout the paper, STT-RAM is currently being tested to replace the antiquated SRAM, which, although fast, is outpaced by other devices on certain instructions; those devices also offer a tremendous increase in storage capacity and happen to be non-volatile [11]. One issue that arises with volatile STT-RAM is the quantity of refresh instructions executed in the cache, which can cause excessive energy loss [11]. A solution that can reduce the quantity of refresh instructions is cache-coherence-enabled adaptive refresh, which is capable of minimizing refresh instructions, leading to a decrease in energy loss [11].

III. DATA ANALYSIS

[Figure: Read Cache Latency (nsec) by device and cache configuration.] The figure depicts the read cache latency in nanoseconds for each technological device shown on the x-axis. Read latency is shown for multiple cache level configurations, with the device name and capacity listed.

[Figure: Read Energy Consumption (nJ) by device.] The graph displays the read energy consumption, measured in nanojoules, for various devices such as STT-RAM, eDRAM, SRAM, and others.

IV. CONCLUSION

Throughout the paper, many technological devices were mentioned, many of which are currently undergoing examination to replace the long-standing static RAM, or SRAM. Some of these new devices include STT-RAM, which offers increased storage capacity at the cost of increased energy use. Other devices such as eDRAM, or embedded DRAM, are also being examined, and require certain refresh mechanisms to retain data reliably without corruption. Not only are some of the new devices replacing SRAM, but they are also being tested for placement in cache levels other than Level 3, such as Level 2 or Level 1. As more testing is done, solutions that decrease energy loss and minimize excessive write instructions will continue to surface, as current testing is already proving.
REFERENCES

[1] N. Khoshavi, X. Chen, J. Wang and R. F. DeMara, "Bit-Upset Vulnerability Factor for eDRAM Last Level Cache Immunity Analysis," in Proceedings of the 17th International Symposium on Quality Electronic Design (ISQED 2016), Santa Clara, CA, USA, March 15-16, 2016.
[2] S. E. Crawford and R. F. DeMara, "Cache coherence in a multiport memory environment," in Proceedings of the Second International Conference on Massively Parallel Computing Systems (MPCS-95), Ischia, Italy, May 2-6.
[3] M. Lin, et al., "ASTRO: Synthesizing application-specific reconfigurable hardware traces to exploit memory-level parallelism," Microprocessors and Microsystems 39.7 (2015).
[4] X. Chen, N. Khoshavi, J. Zhou, D. Huang, R. F. DeMara, J. Wang, W. Wen and Y. Chen, "AOS: Adaptive Overwrite Scheme for Energy-Efficient MLC STT-RAM Cache," in Proceedings of the 53rd Design Automation Conference (DAC), Austin, TX, USA, 2016.
[5] N. Khoshavi, X. Chen, J. Wang and R. F. DeMara, "Read-Tuned STT-RAM and eDRAM Cache Hierarchies for Throughput and Energy Enhancement," arXiv preprint.
[6] A. Jog, A. K. Mishra, C. Xu, Y. Xie, V. Narayanan, R. Iyer, and C. R. Das, "Cache Revive: Architecting Volatile STT-RAM Caches for Enhanced Performance in CMPs," in Proceedings of the 49th Annual Design Automation Conference (DAC), 2012.
[7] Z. Sun, X. Bi, H. H. Li, W.-F. Wong, Z.-L. Ong, X. Zhu, and W. Wu, "Multi Retention Level STT-RAM Cache Designs with a Dynamic Refresh Scheme," in Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), 2011.
[8] M.-T. Chang, P. Rosenfeld, S.-L. Lu, and B. Jacob, "Technology Comparison for Large Last-Level Caches (L3Cs): Low-Leakage SRAM, Low Write-Energy STT-RAM, and Refresh-Optimized eDRAM," in Proceedings of the 19th International Symposium on High Performance Computer Architecture (HPCA), 2013.
[9] M. R. Jokar, M. Arjomand, and H. Sarbazi-Azad, "Sequoia: High-Endurance NVM-Based Cache Architecture," IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[10] Y. Joo and S. Park, "A Hybrid PRAM and DRAM Cache Architecture for Extending the Lifetime of PRAM Caches," IEEE Computer Architecture Letters 12.2 (2013).
[11] J. Li, et al., "Low-Energy Volatile STT-RAM Cache Design Using Cache-Coherence-Enabled Adaptive Refresh," ACM Transactions on Design Automation of Electronic Systems (TODAES) 19.1 (2013): 5.
[12] Y. Zhang, et al., "Read Performance: The Newest Barrier in Scaled STT-RAM," IEEE Transactions on Very Large Scale Integration (VLSI) Systems 23.6 (2015).
[13] R. DeMara, Module 11: Memory Hierarchy.
TABLE I. Processor parameters for the surveyed techniques

Khoshavi [1]: 8 cores, 3 GHz
  L1: 32 KB, 8-way, SRAM, 512 CL, MESI
  L2: 512 KB, 8-way, SRAM, 8192 CL, MESI
  L3 (LLC): 96 MB, 16-way, eDRAM, ~1.5M CL, WB

Sun [7]: 4 cores, 2 GHz
  L1: 32 KB, 4-way, SRAM, 512 CL, N/A
  L2: 256 KB, 8-way, SRAM, 4096 CL, N/A
  L3 (LLC): 4 MB, 16-way, STT-RAM, 65536 CL, N/A

Jokar [9]: 4 cores, 3 GHz
  L1: 32 KB, 8-way, SRAM (D), 512 CL, MOESI
  L2: 2 MB, 8-way, N/A, 32768 CL, MOESI
  L3 (LLC): 8 MB, 8-way, ReRAM, 131072 CL, MOESI

Zhang [12]: N/A cores, N/A GHz
  L1: 32 KB, 4-way, SRAM, 512 CL, MESI
  L2: 256 KB, 8-way, SRAM, 4096 CL, N/A
  L3 (LLC): 16 MB, 16-way, SRAM, 262144 CL, N/A

Chang [8]: 8 cores, 2 GHz
  L1: 32 KB, 8-way, N/A, 512 CL, MESI
  L2: 256 KB, 8-way, N/A, 4096 CL, MESI
  L3 (LLC): 32 MB, 16-way, N/A, 524288 CL, WB

Chen [4]: 4 cores, 3.3 GHz
  L1: 32 KB, 8-way, SRAM, 512 CL, WB
  L2: 4 MB, 8-way, STT-RAM, 65536 CL, WB
  L3 (LLC): N/A

Khoshavi [5]: N/A cores, 3 GHz
  L1: 32 KB, 8-way, SRAM, 512 CL, WB
  L2: N/A, 8-way, N/A, N/A, WB
  L3 (LLC): 96 MB, 16-way, eDRAM, ~1.5M CL, WB

Jog [6]: N/A cores, 2 GHz
  L1: 32 KB, 4-way, SRAM, 512 CL, WB
  L2: 1 MB, 16-way, SRAM, 16384 CL, N/A
  L3 (LLC): N/A

Li [11]: 16 cores, 2 GHz
  L1: 32 KB, 2-way, STT-RAM, 512 CL, WB
  L2: N/A
  L3 (LLC): 8 MB, 16-way, STT-RAM, 131072 CL, WB

Joo [10]: 1 core, 2 GHz
  L1: 32 KB, N/A, SRAM, 512 CL, WB
  L2: 8 MB, 16-way, Hybrid (PRAM/DRAM), 131072 CL, WB
  L3 (LLC): N/A

CL = cache line. The # of CL values are computed from the listed capacity assuming a 64-byte cache line.
Protocol values: Write Back (WB), Write Through (WT), MESI, MOESI, Not Available (N/A).
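The cache-line counts in Table I follow directly from the rule stated beneath it. A quick sketch of that computation, assuming the 64-byte line size given in the table note:

```python
# Sketch: number of cache lines = capacity / line size, using the
# 64-byte cache line assumed in the note beneath Table I.

LINE_SIZE = 64  # bytes

def num_cache_lines(capacity_bytes):
    return capacity_bytes // LINE_SIZE

KB, MB = 1024, 1024 * 1024
print(num_cache_lines(32 * KB))   # typical L1 in the table: 512 lines
print(num_cache_lines(512 * KB))  # e.g., an L2 entry: 8192 lines
print(num_cache_lines(96 * MB))   # e.g., a 96 MB LLC: 1,572,864 lines
```

The same arithmetic reproduces every populated "# of CL" entry in the table, e.g., 4 MB / 64 B = 65536 lines.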
More informationUNIT:4 MEMORY ORGANIZATION
1 UNIT:4 MEMORY ORGANIZATION TOPICS TO BE COVERED. 4.1 Memory Hierarchy 4.2 Memory Classification 4.3 RAM,ROM,PROM,EPROM 4.4 Main Memory 4.5Auxiliary Memory 4.6 Associative Memory 4.7 Cache Memory 4.8
More informationCS 33. Architecture and Optimization (3) CS33 Intro to Computer Systems XVI 1 Copyright 2018 Thomas W. Doeppner. All rights reserved.
CS 33 Architecture and Optimization (3) CS33 Intro to Computer Systems XVI 1 Copyright 2018 Thomas W. Doeppner. All rights reserved. Hyper Threading Instruction Control Instruction Control Retirement Unit
More informationMemory Hierarchy. Slides contents from:
Memory Hierarchy Slides contents from: Hennessy & Patterson, 5ed Appendix B and Chapter 2 David Wentzlaff, ELE 475 Computer Architecture MJT, High Performance Computing, NPTEL Memory Performance Gap Memory
More informationEmbedded Systems Design: A Unified Hardware/Software Introduction. Outline. Chapter 5 Memory. Introduction. Memory: basic concepts
Hardware/Software Introduction Chapter 5 Memory Outline Memory Write Ability and Storage Permanence Common Memory Types Composing Memory Memory Hierarchy and Cache Advanced RAM 1 2 Introduction Memory:
More informationEmbedded Systems Design: A Unified Hardware/Software Introduction. Chapter 5 Memory. Outline. Introduction
Hardware/Software Introduction Chapter 5 Memory 1 Outline Memory Write Ability and Storage Permanence Common Memory Types Composing Memory Memory Hierarchy and Cache Advanced RAM 2 Introduction Embedded
More informationMemory Hierarchies. Instructor: Dmitri A. Gusev. Fall Lecture 10, October 8, CS 502: Computers and Communications Technology
Memory Hierarchies Instructor: Dmitri A. Gusev Fall 2007 CS 502: Computers and Communications Technology Lecture 10, October 8, 2007 Memories SRAM: value is stored on a pair of inverting gates very fast
More informationTowards Performance Modeling of 3D Memory Integrated FPGA Architectures
Towards Performance Modeling of 3D Memory Integrated FPGA Architectures Shreyas G. Singapura, Anand Panangadan and Viktor K. Prasanna University of Southern California, Los Angeles CA 90089, USA, {singapur,
More informationLecture 14: Cache Innovations and DRAM. Today: cache access basics and innovations, DRAM (Sections )
Lecture 14: Cache Innovations and DRAM Today: cache access basics and innovations, DRAM (Sections 5.1-5.3) 1 Reducing Miss Rate Large block size reduces compulsory misses, reduces miss penalty in case
More informationSF-LRU Cache Replacement Algorithm
SF-LRU Cache Replacement Algorithm Jaafar Alghazo, Adil Akaaboune, Nazeih Botros Southern Illinois University at Carbondale Department of Electrical and Computer Engineering Carbondale, IL 6291 alghazo@siu.edu,
More informationPipeline Optimizations of Architecting STT-RAM as Registers in Rad-Hard Environment
2017 IEEE Trustcom/BigDataSE/ICESS Pipeline Optimizations of Architecting STT-RAM as Registers in Rad-Hard Environment Zhiyao Gong, Keni Qiu, eiwen Chen Yuanhui Ni and Yuanchao Xu Beijing Advanced Innovation
More informationFigure 5.2: (a) Floor plan examples for varying the number of memory controllers and ranks. (b) Example configuration.
Figure 5.2: (a) Floor plan examples for varying the number of memory controllers and ranks. (b) Example configuration. The study found that a 16 rank 4 memory controller system obtained a speedup of 1.338
More informationUnleashing the Power of Embedded DRAM
Copyright 2005 Design And Reuse S.A. All rights reserved. Unleashing the Power of Embedded DRAM by Peter Gillingham, MOSAID Technologies Incorporated Ottawa, Canada Abstract Embedded DRAM technology offers
More informationOverview. Memory Classification Read-Only Memory (ROM) Random Access Memory (RAM) Functional Behavior of RAM. Implementing Static RAM
Memories Overview Memory Classification Read-Only Memory (ROM) Types of ROM PROM, EPROM, E 2 PROM Flash ROMs (Compact Flash, Secure Digital, Memory Stick) Random Access Memory (RAM) Types of RAM Static
More informationCHAPTER 6 Memory. CMPS375 Class Notes (Chap06) Page 1 / 20 Dr. Kuo-pao Yang
CHAPTER 6 Memory 6.1 Memory 341 6.2 Types of Memory 341 6.3 The Memory Hierarchy 343 6.3.1 Locality of Reference 346 6.4 Cache Memory 347 6.4.1 Cache Mapping Schemes 349 6.4.2 Replacement Policies 365
More informationDYNORA: A New Caching Technique
DYNORA: A New Caching Technique Srivatsan P. Sudarshan P.B. Bhaskaran P.P. Department of Electrical and Electronics Engineering Sri Venkateswara College of Engineering, Chennai, India e-mail: srivatsan00@yahoo.com
More informationCS311 Lecture 21: SRAM/DRAM/FLASH
S 14 L21-1 2014 CS311 Lecture 21: SRAM/DRAM/FLASH DARM part based on ISCA 2002 tutorial DRAM: Architectures, Interfaces, and Systems by Bruce Jacob and David Wang Jangwoo Kim (POSTECH) Thomas Wenisch (University
More informationNew Memory Organizations For 3D DRAM and PCMs
New Memory Organizations For 3D DRAM and PCMs Ademola Fawibe 1, Jared Sherman 1, Krishna Kavi 1 Mike Ignatowski 2, and David Mayhew 2 1 University of North Texas, AdemolaFawibe@my.unt.edu, JaredSherman@my.unt.edu,
More informationComputer Systems Architecture I. CSE 560M Lecture 18 Guest Lecturer: Shakir James
Computer Systems Architecture I CSE 560M Lecture 18 Guest Lecturer: Shakir James Plan for Today Announcements No class meeting on Monday, meet in project groups Project demos < 2 weeks, Nov 23 rd Questions
More informationEvaluating STT-RAM as an Energy-Efficient Main Memory Alternative
Evaluating STT-RAM as an Energy-Efficient Main Memory Alternative Emre Kültürsay *, Mahmut Kandemir *, Anand Sivasubramaniam *, and Onur Mutlu * Pennsylvania State University Carnegie Mellon University
More informationMemory Hierarchy. Memory Flavors Principle of Locality Program Traces Memory Hierarchies Associativity. (Study Chapter 5)
Memory Hierarchy It makes me look faster, don t you think? Are you dressed like the Easter Bunny? Memory Flavors Principle of Locality Program Traces Memory Hierarchies Associativity (Study Chapter 5)
More informationChapter Seven. SRAM: value is stored on a pair of inverting gates very fast but takes up more space than DRAM (4 to 6 transistors)
Chapter Seven emories: Review SRA: value is stored on a pair of inverting gates very fast but takes up more space than DRA (4 to transistors) DRA: value is stored as a charge on capacitor (must be refreshed)
More informationA Review on Cache Memory with Multiprocessor System
A Review on Cache Memory with Multiprocessor System Chirag R. Patel 1, Rajesh H. Davda 2 1,2 Computer Engineering Department, C. U. Shah College of Engineering & Technology, Wadhwan (Gujarat) Abstract
More informationMEMORY BHARAT SCHOOL OF BANKING- VELLORE
A memory is just like a human brain. It is used to store data and instructions. Computer memory is the storage space in computer where data is to be processed and instructions required for processing are
More informationLECTURE 11. Memory Hierarchy
LECTURE 11 Memory Hierarchy MEMORY HIERARCHY When it comes to memory, there are two universally desirable properties: Large Size: ideally, we want to never have to worry about running out of memory. Speed
More informationA Memory Management Scheme for Hybrid Memory Architecture in Mission Critical Computers
A Memory Management Scheme for Hybrid Memory Architecture in Mission Critical Computers Soohyun Yang and Yeonseung Ryu Department of Computer Engineering, Myongji University Yongin, Gyeonggi-do, Korea
More informationReducing Solid-State Storage Device Write Stress Through Opportunistic In-Place Delta Compression
Reducing Solid-State Storage Device Write Stress Through Opportunistic In-Place Delta Compression Xuebin Zhang, Jiangpeng Li, Hao Wang, Kai Zhao and Tong Zhang xuebinzhang.rpi@gmail.com ECSE Department,
More informationCS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2014 Lecture 14
CS24: INTRODUCTION TO COMPUTING SYSTEMS Spring 2014 Lecture 14 LAST TIME! Examined several memory technologies: SRAM volatile memory cells built from transistors! Fast to use, larger memory cells (6+ transistors
More informationCache/Memory Optimization. - Krishna Parthaje
Cache/Memory Optimization - Krishna Parthaje Hybrid Cache Architecture Replacing SRAM Cache with Future Memory Technology Suji Lee, Jongpil Jung, and Chong-Min Kyung Department of Electrical Engineering,KAIST
More informationDesign and Implementation of a Random Access File System for NVRAM
This article has been accepted and published on J-STAGE in advance of copyediting. Content is final as presented. IEICE Electronics Express, Vol.* No.*,*-* Design and Implementation of a Random Access
More informationCS 261 Fall Mike Lam, Professor. Memory
CS 261 Fall 2016 Mike Lam, Professor Memory Topics Memory hierarchy overview Storage technologies SRAM DRAM PROM / flash Disk storage Tape and network storage I/O architecture Storage trends Latency comparisons
More informationA Cache Hierarchy in a Computer System
A Cache Hierarchy in a Computer System Ideally one would desire an indefinitely large memory capacity such that any particular... word would be immediately available... We are... forced to recognize the
More informationRethinking On-chip DRAM Cache for Simultaneous Performance and Energy Optimization
Rethinking On-chip DRAM Cache for Simultaneous Performance and Energy Optimization Fazal Hameed and Jeronimo Castrillon Center for Advancing Electronics Dresden (cfaed), Technische Universität Dresden,
More informationNAND Flash Memory. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University
NAND Flash Memory Jinkyu Jeong (Jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu ICE3028: Embedded Systems Design, Fall 2018, Jinkyu Jeong (jinkyu@skku.edu) Flash
More informationThis material is based upon work supported in part by Intel Corporation /DATE13/ c 2013 EDAA
DWM-TAPESTRI - An Energy Efficient All-Spin Cache using Domain wall Shift based Writes Rangharajan Venkatesan, Mrigank Sharad, Kaushik Roy, and Anand Raghunathan School of Electrical and Computer Engineering,
More informationCaches. Cache Memory. memory hierarchy. CPU memory request presented to first-level cache first
Cache Memory memory hierarchy CPU memory request presented to first-level cache first if data NOT in cache, request sent to next level in hierarchy and so on CS3021/3421 2017 jones@tcd.ie School of Computer
More informationComputer Architecture A Quantitative Approach, Fifth Edition. Chapter 2. Memory Hierarchy Design. Copyright 2012, Elsevier Inc. All rights reserved.
Computer Architecture A Quantitative Approach, Fifth Edition Chapter 2 Memory Hierarchy Design 1 Introduction Programmers want unlimited amounts of memory with low latency Fast memory technology is more
More informationEEC 483 Computer Organization. Chapter 5.3 Measuring and Improving Cache Performance. Chansu Yu
EEC 483 Computer Organization Chapter 5.3 Measuring and Improving Cache Performance Chansu Yu Cache Performance Performance equation execution time = (execution cycles + stall cycles) x cycle time stall
More informationCAUSE: Critical Application Usage-Aware Memory System using Non-volatile Memory for Mobile Devices
CAUSE: Critical Application Usage-Aware Memory System using Non-volatile Memory for Mobile Devices Yeseong Kim, Mohsen Imani, Shruti Patil and Tajana S. Rosing Computer Science and Engineering University
More informationCSE 431 Computer Architecture Fall Chapter 5A: Exploiting the Memory Hierarchy, Part 1
CSE 431 Computer Architecture Fall 2008 Chapter 5A: Exploiting the Memory Hierarchy, Part 1 Mary Jane Irwin ( www.cse.psu.edu/~mji ) [Adapted from Computer Organization and Design, 4 th Edition, Patterson
More information