Efficient caching in distributed persistent stores
EARLY DRAFT
Bengt Johansson *

Abstract: This article shows how techniques from hardware systems, such as multiprocessor computers, can be used to improve the efficiency of distributed software, such as distributed persistent stores. It also describes the practical implementation of a particular caching protocol for a distributed store.

1. Introduction

A persistent store is an abstraction of a persistent memory, i.e. a memory in which data remains after the process using it has terminated. A distributed persistent store can therefore be seen as a distributed persistent memory: a memory that is distributed over several computers and whose contents remain in storage when no processes are active. Often this memory is thought of as a shared memory, since the user is presented with a view of one, often monolithic, block of memory.

The system we describe uses a client/server model to implement the store. One or more clients communicate with servers on which the persistent data are stored (see fig. 1). The servers may reside on the same machine as the clients or on another machine on the network, possibly far away from the clients. Stores maintain a set of uniquely identified objects, all of which may contain references to other objects in the same store as well as in remote stores. Clients perform operations on the stores using remote procedure calls, implemented on top of TCP/IP. Clients and servers may therefore also reside on machines anywhere on the Internet.

Since the clients use a possibly very slow network to communicate with the servers, it is important to decrease the size of the transmitted data. Also, due to the high latency of the network and the relatively small message sizes, the number of individual messages sent must be kept low. Two solutions to this problem are caching of objects and pre-fetching of objects adjacent to the ones requested.
* Department of Computing Science, Chalmers University of Technology and Göteborg University bengtj@cs.chalmers.se, WWW:

Figure 1. An example of a distributed persistent store. (Labels: Client, Client, Client.)

2. Cache consistency

Introducing caches to a system inevitably increases its complexity, since the protocol used must ensure that the clients have a consistent view of the system. In other words, they must see updates performed by other clients in some well-defined order. This property is called consistency.

Ideally, a system using caches should maintain the same semantics as a system without caches: accesses to the system are serialised and updates are seen immediately by the other clients. A system satisfying this property is said to be sequentially consistent [1].

However, for a system with caches to be sequentially consistent, the clients must immediately be informed of any updates to the stores, and an update is complete only when all other clients have been informed of it. When the store is updated, the server must immediately send notifications to any other clients that cache the updated object (see fig. 2). Each update therefore gives rise to 2*#clients messages (the write request and its acknowledgement, plus one update message to and one acknowledgement from each of the other #clients-1 clients), which in most cases is unacceptable.

Figure 2. The messages necessary to perform an update. (Labels: Client1, Write(5), Upd(5), Client2.)

In distributed multiprocessor systems this problem is, to some degree, solved by relaxing the requirements to keep the cache consistent. One such relaxed consistency model is processor consistency [2]. In this model, client x sees the updates performed by client y in the same order as client y performs them. However, the combined updates of clients x and y may not be seen by other clients, or by x and y themselves, in the same order as they are performed.

Using processor consistency it is not necessary to acknowledge the update messages sent to clients (see fig. 3). This only reduces the number of messages to 2+#clients-1, but the server may return the acknowledgement to the updating client without waiting for the other clients to update their caches. However, clients may now have different views of the system for a short but undefined amount of time.

Figure 3. The messages in a system satisfying processor consistency. (Labels: Client1, Write(5), Upd(5), Client2.)

If a system allows for some kind of synchronisation, for instance monitors, transactions or object locking, it is not necessary to maintain cache consistency for objects that only one user is able to modify. The cache consistency is then restored when the user leaves the critical region or unlocks the objects.

The weak consistency model [3] is based on the idea that it is possible to identify the points where the system needs to be consistent. For ordinary modifications (read and write), the caches are allowed to become inconsistent, and the system is brought back to a consistent state only at so-called synchronisation points (see fig. 4).

Figure 4. Messages in a system satisfying weak consistency. (Labels: Client 1, R, W, R, W, Synch + updated objects, Client 2.)

3. Implementation

The distributed persistent store described in this article implements a cache protocol satisfying weak consistency. The distributed store is implemented as a client/server application on top of an existing local persistent store. The clients basically provide the same functionality as a local store, extended with functions to manage global references, etc.

A client consists of an interface to the user of the store and a transport layer that converts the calls made by the user into messages sent over the network. The transport layer also receives responses from the server and returns them to the user.

The server waits for messages from the client and performs the corresponding operations. It then returns a response to the client. One such loop exists for each client that is connected to the server.

The client also maintains a cache. When an object is fetched from the server it is stored in the cache. It stays there until the cache fills up or the server tells the client to remove or update it. An object is removed from the cache when it is locked by another client.

The server maintains weak cache consistency by buffering update messages until a synchronisation point (a lock or unlock operation) is reached or the buffer is full. When a client reads an un-cached object, the buffer is filled, breadth-first, with objects in the transitive closure of references starting at the requested object. The buffer is then sent to the client. Fig. 5 shows an overview of the system.

Figure 5. Overview of the system. (Labels: Client, Interface, Transport layer, Buffer, Transport layer, Local store.)

When the user makes a call to the store, it goes through the following steps:

1. The client first checks if there are any messages from the server waiting to be processed. Update or remove messages for the cache may have arrived between two calls made by the user. If so, the requested operations are performed on the cache.
2. If the user made a read request, the client checks the cache. If the object is found in the cache, it is returned to the user.
3. If the object is not in the cache, or if the operation is not a read, a message is composed and sent to the server. The client then waits for the response.
4. The server receives the message, decodes it and performs the operation on the local store.
5. If the store is updated, an update message is put in the buffers of the other clients.
6. If a synchronisation point is reached, the buffers are sent to the clients. The server then composes a response and sends it back to the client.
7. The client receives the response and returns it to the user.
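The breadth-first pre-fetching described above can be illustrated with a minimal Python sketch. It is not the implementation from this article: the store is modelled as a hypothetical dictionary mapping an object id to a (value, references) pair, and the buffer limit is counted in objects (`max_objects`, an assumed parameter) rather than in bytes.

```python
from collections import deque


def prefetch_breadth_first(store, root_id, max_objects):
    """Collect up to max_objects objects reachable from root_id, nearest first.

    store is a hypothetical stand-in for the local store:
    object id -> (value, list of referenced object ids).
    """
    buffer = []
    seen = {root_id}
    queue = deque([root_id])
    while queue and len(buffer) < max_objects:
        oid = queue.popleft()
        value, refs = store[oid]
        buffer.append((oid, value))
        for ref in refs:
            # Skip objects already queued and references into other stores.
            if ref not in seen and ref in store:
                seen.add(ref)
                queue.append(ref)
    return buffer


# A small object graph: a -> b, c; b -> d; c -> d; d -> a (a cycle).
example_store = {
    "a": ("A", ["b", "c"]),
    "b": ("B", ["d"]),
    "c": ("C", ["d"]),
    "d": ("D", ["a"]),
}
print(prefetch_breadth_first(example_store, "a", 3))
```

With a limit of three objects the requested object and its two direct neighbours are sent, so a client that next follows a reference from "a" finds the target already in its cache.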
(a)
    Operation(args) {
        if(incommingmessgs())
            ProcessMessgs();
        if(isincache(args))
            return cache(args)
        else {
            msg=createmsg(op,args);
            SendMsgTo(msg);
            rsp=waitforresp();
            ProcessResponse(rsp);
            return Data(rsp);
        }
    }

(b)
    Loop() {
        while(connectionisopen()) {
            msg=waitformessage();
            rsp=decodeandperformop(msg);
            PutRespInBuffer(rsp);
            if(isupdated()) {
                LeaveUpdMsgInOtherBuffers();
            }
            if(isreadop())
                FillMyBufferWithData();
            SendBuffer();
        }
    }

Figure 6. Pseudo-code for the clients and servers
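The call path above can also be written as a small runnable Python sketch. This is an illustration under stated assumptions, not the article's code: client and server live in one process, the network is replaced by direct method calls and an `inbox` list, all class and method names are hypothetical, every unlock flushes all buffers, and removal of objects locked by another client is not modelled.

```python
class Server:
    def __init__(self, store):
        self.store = dict(store)   # stands in for the local persistent store
        self.buffers = {}          # client -> buffered update messages

    def connect(self, client):
        self.buffers[client] = []

    def read(self, oid):
        return self.store[oid]

    def write(self, writer, oid, value):
        self.store[oid] = value
        # Step 5: leave an update message in the buffers of the other clients.
        for client, pending in self.buffers.items():
            if client is not writer:
                pending.append((oid, value))

    def unlock(self, oid):
        # Step 6: a synchronisation point is reached, so the buffers are sent.
        for client, pending in self.buffers.items():
            client.inbox.extend(pending)
            pending.clear()


class Client:
    def __init__(self, server):
        self.server = server
        self.cache = {}
        self.inbox = []            # messages that arrived between two calls
        server.connect(self)

    def _process_messages(self):
        # Step 1: apply waiting update messages to the cache.
        for oid, value in self.inbox:
            self.cache[oid] = value
        self.inbox.clear()

    def read(self, oid):
        self._process_messages()
        if oid not in self.cache:  # steps 2 and 3: ask the server on a miss
            self.cache[oid] = self.server.read(oid)
        return self.cache[oid]

    def write(self, oid, value):
        self._process_messages()
        self.cache[oid] = value    # assumption: the writer updates its own copy
        self.server.write(self, oid, value)

    def unlock(self, oid):
        self._process_messages()
        self.server.unlock(oid)


server = Server({"x": 4, "y": 8})
c1, c2 = Client(server), Client(server)
c2.read("x"), c2.read("y")              # client 2 caches x=4, y=8
c1.write("x", 5)
c1.write("y", 9)
stale = (c2.read("x"), c2.read("y"))    # updates are still buffered
c1.unlock("x")
fresh = (c2.read("x"), c2.read("y"))    # buffer delivered at the unlock
print(stale, fresh)
```

Between the writes and the unlock, client 2 keeps reading its old copies; this is the window of inconsistency that weak consistency permits.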

Fig. 6 shows the pseudo-code for (a) clients and (b) servers. An example interaction between clients and a store is shown in fig. 7. Suppose client 1 first writes 5 into object x and then 9 into object y. Thereafter object x is unlocked. When the unlocking operation is performed, the buffer is sent to client 2.

Figure 7. An example execution. (Labels: Client 1 issues Write(x:=5), Write(y:=9), Unlock(x); the buffer holds x=5, then x=5 and y=9; Client 2 holds x=4,y=8 and, after receiving x=5 and y=9, holds x=5,y=9.)

4. Preliminary results

The results presented here are preliminary, but should give some indication of the benefits of caching in a system like the distributed persistent store. The test is a simple producer-consumer system, where the producer generates a list of values and the consumer reads the values as soon as they become available to it.

Table 1 shows the speed-up resulting from using caches in the system. The times presented are the averaged execution times on a Sun Ultra-1 140MHz with 64 MB RAM.

    Without cache    With cache    Speed-up
    8.18s            3.81s         2.14

Table 1. The speed-up achieved using caches.

Table 2 shows the execution time depending on the buffer size. The speed-up achieved here is mainly due to the fact that pre-fetching increases the number of cache hits. The table shows the execution times with cache and different buffer sizes. All processes are run on the same machine as for Table 1. Note that with a 0K buffer the system satisfies processor consistency, since such a buffer immediately fills up and is sent to the clients.

    Buffer size    Execution time    Processor usage    Cache hits
    no cache       70.3s             44.9%              -
    0K             75.7s             49.7%              0%
    1K             295.9s            0.2%               88.2%
    4K             27.1s             62.1%              98.4%
    16K            26.7s             65.8%              99.6%
    64K            26.1s             63.7%              99.9%
    256K           25.4s             64.6%              99.96%

Table 2. Clients and server running on the same machine

Table 3 shows the same algorithm, but with the server running on another machine. The bad times for 1K buffers, in both cases, are due to congestion in the network.

    Buffer size    Execution time    Processor usage    Cache hits
    1K             297.1s            0.5%               88.2%
    4K             95.5s             2.4%               96.9%
    16K            44.1s             36.3%              99.2%
    64K            26.6s             65.2%              99.8%
    256K           25.8s             65.5%              99.95%

Table 3. Server running on another machine

These results do not show the benefits of using relaxed cache consistency in the system. Therefore, further testing is necessary to get definitive results.

5. Related work

Multiprocessor computers, especially those with distributed memory, often take advantage of relaxed cache consistency models. Not only do they benefit from the decreased data flow over their interconnections; relaxed models also allow compiler writers to perform code optimizations that respect the semantics of the source program on a single-processor machine but would break down in a multiprocessor with a sequentially consistent cache.

The processor consistency model [2] allows writes from two or more processors to be observed in different orders on different processors. This model is implemented in the VAX multiprocessor.

Weak consistency [3] takes advantage of the fact that many memory updates are performed in critical sections, where only one processor may access the data. It is therefore unnecessary to enforce a strict consistency model except when entering or leaving critical sections. The weak consistency model distinguishes between ordinary accesses and synchronising accesses, at which cache consistency is ensured. For an overview of cache consistency models see [4].

The work on distributed persistent stores has so far not concentrated on improving efficiency, but rather on showing that persistent stores and the persistent programming model have advantages compared to relational databases or remote object invocation in CORBA. Examples of distributed persistent stores and operating systems are PerDiS [5] and Grasshopper [6].

6. Further work

The implementation presented in this article can be further improved.
At the moment all writes are performed immediately, even if they are to a locked object. This situation can be improved by not having a write-through cache in the clients. Updates can instead be performed globally at synchronisation points.

Update messages are sent to all clients connected to a particular server, even if a client is not using the object. It is possible to avoid sending these unnecessary messages if the server keeps a list of the objects cached on each client. The server then only propagates messages to clients that have a copy of the object.

We intend to explore alternative implementations of the protocol. For instance, it is possible to have replicated caches in the server instead of buffers.
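The proposal that the server keep a list of the objects cached on each client could look roughly like the following Python sketch. The class `Directory` and its method names are hypothetical and serve only to illustrate the bookkeeping; the article does not prescribe a data structure.

```python
from collections import defaultdict


class Directory:
    """Server-side record of which clients hold a cached copy of each object."""

    def __init__(self):
        self.holders = defaultdict(set)   # object id -> ids of caching clients

    def record_fetch(self, client_id, oid):
        # Called when an object is sent to a client (on request or pre-fetched).
        self.holders[oid].add(client_id)

    def record_eviction(self, client_id, oid):
        # Called when the client drops the object from its cache.
        self.holders[oid].discard(client_id)

    def recipients(self, writer_id, oid):
        # Only clients with a copy, other than the writer, need the update.
        return sorted(self.holders.get(oid, set()) - {writer_id})


directory = Directory()
directory.record_fetch("client1", "x")
directory.record_fetch("client2", "x")
directory.record_fetch("client3", "y")
print(directory.recipients("client1", "x"))
```

In this example an update of x by client1 is propagated to client2 only; client3, which caches only y, receives nothing. The cost is that clients must report evictions, or the server must tolerate sending an occasional update for an object that is no longer cached.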

7. Conclusions

Relaxed consistency models are much used in multiprocessor implementations. They lead to decreased memory latency and make it possible to take advantage of program optimization in compilers and in hardware. So far distributed persistent stores have not made use of such models, but we have shown that weak consistency gives an increase in performance when employed in a software system such as a persistent store.

8. References

[1] Leslie Lamport, How to make a multiprocessor computer that correctly executes multiprocess programs, IEEE Transactions on Computers, C-28(9): September
[2] James R. Goodman, Cache consistency and sequential consistency, Technical Report no. 61, SCI Committee, March
[3] Michel Dubois, Christoph Scheurich and Fayé Briggs, Memory access buffering in multiprocessors, In Proceedings of the 13th Annual International Symposium on Computer Architecture, pp , June
[4] Kourosh Gharachorloo, Daniel Lenoski, James Laudon, Phillip B. Gibbons, Anoop Gupta and John L. Hennessy, Memory Consistency and Event Ordering in Scalable Shared-Memory Multiprocessors, ISCA 1990:
[5] Paulo Ferreira, Marc Shapiro, Xavier Blondel, Olivier Fambon, João Garcia, Sytse Kloosterman, Nicolas Richer, Marcus Roberts, Fadi Sandakly, George Coulouris, Jean Dollimore, Paulo Guedes, Daniel Hagimont, and Sacha Krakowiak, PerDiS: design, implementation, and use of a PERsistent DIstributed Store, Tech. Report: QMW TR752, CSTB ILC/ , INRIA RR 3525, INESC RT/5/98, URL: www-sor.inria.fr/publi/pdiupds_rr3525.html, October 1998
[6] Alan Dearle, Francis Vaughan, Rex di Bona, James Farrow, Frans Henskens, Anders Lindström and John Rosenberg, Grasshopper: An orthogonally persistent operating system, Tech. Report GH10, Dept. of Computer Science, University of Adelaide, URL: 1994


More information

Lecture 6 Consistency and Replication

Lecture 6 Consistency and Replication Lecture 6 Consistency and Replication Prof. Wilson Rivera University of Puerto Rico at Mayaguez Electrical and Computer Engineering Department Outline Data-centric consistency Client-centric consistency

More information

740: Computer Architecture Memory Consistency. Prof. Onur Mutlu Carnegie Mellon University

740: Computer Architecture Memory Consistency. Prof. Onur Mutlu Carnegie Mellon University 740: Computer Architecture Memory Consistency Prof. Onur Mutlu Carnegie Mellon University Readings: Memory Consistency Required Lamport, How to Make a Multiprocessor Computer That Correctly Executes Multiprocess

More information

Shared memory multiprocessors

Shared memory multiprocessors Shared memory multiprocessors Leonid Ryzhyk April 21, 2006 1 Introduction The hardware evolution has reached the point where it becomes extremely difficult to further improve

More information

Beyond Sequential Consistency: Relaxed Memory Models

Beyond Sequential Consistency: Relaxed Memory Models 1 Beyond Sequential Consistency: Relaxed Memory Models Computer Science and Artificial Intelligence Lab M.I.T. Based on the material prepared by and Krste Asanovic 2 Beyond Sequential Consistency: Relaxed

More information

CMSC Computer Architecture Lecture 15: Memory Consistency and Synchronization. Prof. Yanjing Li University of Chicago

CMSC Computer Architecture Lecture 15: Memory Consistency and Synchronization. Prof. Yanjing Li University of Chicago CMSC 22200 Computer Architecture Lecture 15: Memory Consistency and Synchronization Prof. Yanjing Li University of Chicago Administrative Stuff! Lab 5 (multi-core) " Basic requirements: out later today

More information

Chapter 5. Multiprocessors and Thread-Level Parallelism

Chapter 5. Multiprocessors and Thread-Level Parallelism Computer Architecture A Quantitative Approach, Fifth Edition Chapter 5 Multiprocessors and Thread-Level Parallelism 1 Introduction Thread-Level parallelism Have multiple program counters Uses MIMD model

More information

Module 14: "Directory-based Cache Coherence" Lecture 31: "Managing Directory Overhead" Directory-based Cache Coherence: Replacement of S blocks

Module 14: Directory-based Cache Coherence Lecture 31: Managing Directory Overhead Directory-based Cache Coherence: Replacement of S blocks Directory-based Cache Coherence: Replacement of S blocks Serialization VN deadlock Starvation Overflow schemes Sparse directory Remote access cache COMA Latency tolerance Page migration Queue lock in hardware

More information

Portland State University ECE 588/688. Memory Consistency Models

Portland State University ECE 588/688. Memory Consistency Models Portland State University ECE 588/688 Memory Consistency Models Copyright by Alaa Alameldeen 2018 Memory Consistency Models Formal specification of how the memory system will appear to the programmer Places

More information

Consistency in Distributed Systems

Consistency in Distributed Systems Consistency in Distributed Systems Recall the fundamental DS properties DS may be large in scale and widely distributed 1. concurrent execution of components 2. independent failure modes 3. transmission

More information

Integrating Fragmented Objects into a CORBA Environment

Integrating Fragmented Objects into a CORBA Environment Integrating ed Objects into a CORBA Environment Hans P. Reiser 1, Franz J. Hauck 2, Rüdiger Kapitza 1, and Andreas I. Schmied 2 1 Dept. of Distributed Systems and Operating System, University of Erlangen-

More information

Java RMI Middleware Project

Java RMI Middleware Project Java RMI Middleware Project Nathan Balon CIS 578 Advanced Operating Systems December 7, 2004 Introduction The semester project was to implement a middleware similar to Java RMI or CORBA. The purpose of

More information

Distributed Systems COMP 212. Lecture 1 Othon Michail

Distributed Systems COMP 212. Lecture 1 Othon Michail Distributed Systems COMP 212 Lecture 1 Othon Michail Course Information Lecturer: Othon Michail Office 2.14 Holt Building Module Website: http://csc.liv.ac.uk/~michailo/teaching/comp212 VITAL may be used

More information

Page 1. Outline. Coherence vs. Consistency. Why Consistency is Important

Page 1. Outline. Coherence vs. Consistency. Why Consistency is Important Outline ECE 259 / CPS 221 Advanced Computer Architecture II (Parallel Computer Architecture) Memory Consistency Models Copyright 2006 Daniel J. Sorin Duke University Slides are derived from work by Sarita

More information

Module 7: Synchronization Lecture 13: Introduction to Atomic Primitives. The Lecture Contains: Synchronization. Waiting Algorithms.

Module 7: Synchronization Lecture 13: Introduction to Atomic Primitives. The Lecture Contains: Synchronization. Waiting Algorithms. The Lecture Contains: Synchronization Waiting Algorithms Implementation Hardwired Locks Software Locks Hardware Support Atomic Exchange Test & Set Fetch & op Compare & Swap Traffic of Test & Set Backoff

More information

A Hybrid Shared Memory/Message Passing Parallel Machine

A Hybrid Shared Memory/Message Passing Parallel Machine A Hybrid Shared Memory/Message Passing Parallel Machine Matthew I. Frank and Mary K. Vernon Computer Sciences Department University of Wisconsin Madison Madison, WI 53706 {mfrank, vernon}@cs.wisc.edu Abstract

More information

殷亚凤. Consistency and Replication. Distributed Systems [7]

殷亚凤. Consistency and Replication. Distributed Systems [7] Consistency and Replication Distributed Systems [7] 殷亚凤 Email: yafeng@nju.edu.cn Homepage: http://cs.nju.edu.cn/yafeng/ Room 301, Building of Computer Science and Technology Review Clock synchronization

More information

Motivations. Shared Memory Consistency Models. Optimizations for Performance. Memory Consistency

Motivations. Shared Memory Consistency Models. Optimizations for Performance. Memory Consistency Shared Memory Consistency Models Authors : Sarita.V.Adve and Kourosh Gharachorloo Presented by Arrvindh Shriraman Motivations Programmer is required to reason about consistency to ensure data race conditions

More information

Lect. 6: Directory Coherence Protocol

Lect. 6: Directory Coherence Protocol Lect. 6: Directory Coherence Protocol Snooping coherence Global state of a memory line is the collection of its state in all caches, and there is no summary state anywhere All cache controllers monitor

More information

Multiprocessor Cache Coherence. Chapter 5. Memory System is Coherent If... From ILP to TLP. Enforcing Cache Coherence. Multiprocessor Types

Multiprocessor Cache Coherence. Chapter 5. Memory System is Coherent If... From ILP to TLP. Enforcing Cache Coherence. Multiprocessor Types Chapter 5 Multiprocessor Cache Coherence Thread-Level Parallelism 1: read 2: read 3: write??? 1 4 From ILP to TLP Memory System is Coherent If... ILP became inefficient in terms of Power consumption Silicon

More information

CSE 5306 Distributed Systems

CSE 5306 Distributed Systems CSE 5306 Distributed Systems Consistency and Replication Jia Rao http://ranger.uta.edu/~jrao/ 1 Reasons for Replication Data is replicated for the reliability of the system Servers are replicated for performance

More information

The Cache Write Problem

The Cache Write Problem Cache Coherency A multiprocessor and a multicomputer each comprise a number of independent processors connected by a communications medium, either a bus or more advanced switching system, such as a crossbar

More information

Applying Sequential Consistency to Web Caching

Applying Sequential Consistency to Web Caching Applying Sequential Consistency to Web Caching Francisco J. Torres-Rojas and Esteban Meneses Abstract Web caches have several advantages for reducing the server load, minimizing the network traffic and

More information

Multiprocessor Support

Multiprocessor Support CSC 256/456: Operating Systems Multiprocessor Support John Criswell University of Rochester 1 Outline Multiprocessor hardware Types of multi-processor workloads Operating system issues Where to run the

More information

Lamport Clocks: Verifying A Directory Cache-Coherence Protocol. Computer Sciences Department

Lamport Clocks: Verifying A Directory Cache-Coherence Protocol. Computer Sciences Department Lamport Clocks: Verifying A Directory Cache-Coherence Protocol * Manoj Plakal, Daniel J. Sorin, Anne E. Condon, Mark D. Hill Computer Sciences Department University of Wisconsin-Madison {plakal,sorin,condon,markhill}@cs.wisc.edu

More information

Symmetric Multiprocessors: Synchronization and Sequential Consistency

Symmetric Multiprocessors: Synchronization and Sequential Consistency Constructive Computer Architecture Symmetric Multiprocessors: Synchronization and Sequential Consistency Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology November

More information

Chapter 5. Multiprocessors and Thread-Level Parallelism

Chapter 5. Multiprocessors and Thread-Level Parallelism Computer Architecture A Quantitative Approach, Fifth Edition Chapter 5 Multiprocessors and Thread-Level Parallelism 1 Introduction Thread-Level parallelism Have multiple program counters Uses MIMD model

More information

CS 162 Operating Systems and Systems Programming Professor: Anthony D. Joseph Spring Lecture 22: Remote Procedure Call (RPC)

CS 162 Operating Systems and Systems Programming Professor: Anthony D. Joseph Spring Lecture 22: Remote Procedure Call (RPC) CS 162 Operating Systems and Systems Programming Professor: Anthony D. Joseph Spring 2002 Lecture 22: Remote Procedure Call (RPC) 22.0 Main Point Send/receive One vs. two-way communication Remote Procedure

More information

Parallel Computer Architecture Lecture 5: Cache Coherence. Chris Craik (TA) Carnegie Mellon University

Parallel Computer Architecture Lecture 5: Cache Coherence. Chris Craik (TA) Carnegie Mellon University 18-742 Parallel Computer Architecture Lecture 5: Cache Coherence Chris Craik (TA) Carnegie Mellon University Readings: Coherence Required for Review Papamarcos and Patel, A low-overhead coherence solution

More information

Hardware Memory Models: x86-tso

Hardware Memory Models: x86-tso Hardware Memory Models: x86-tso John Mellor-Crummey Department of Computer Science Rice University johnmc@rice.edu COMP 522 Lecture 9 20 September 2016 Agenda So far hardware organization multithreading

More information

CS Computer Architecture

CS Computer Architecture CS 35101 Computer Architecture Section 600 Dr. Angela Guercio Fall 2010 Computer Systems Organization The CPU (Central Processing Unit) is the brain of the computer. Fetches instructions from main memory.

More information

Page 1. Cache Coherence

Page 1. Cache Coherence Page 1 Cache Coherence 1 Page 2 Memory Consistency in SMPs CPU-1 CPU-2 A 100 cache-1 A 100 cache-2 CPU-Memory bus A 100 memory Suppose CPU-1 updates A to 200. write-back: memory and cache-2 have stale

More information

CSE 5306 Distributed Systems. Consistency and Replication

CSE 5306 Distributed Systems. Consistency and Replication CSE 5306 Distributed Systems Consistency and Replication 1 Reasons for Replication Data are replicated for the reliability of the system Servers are replicated for performance Scaling in numbers Scaling

More information

Consistency and Replication

Consistency and Replication Consistency and Replication Introduction Data-centric consistency Client-centric consistency Distribution protocols Consistency protocols 1 Goal: Reliability Performance Problem: Consistency Replication

More information

Consistency and Replication. Some slides are from Prof. Jalal Y. Kawash at Univ. of Calgary

Consistency and Replication. Some slides are from Prof. Jalal Y. Kawash at Univ. of Calgary Consistency and Replication Some slides are from Prof. Jalal Y. Kawash at Univ. of Calgary Reasons for Replication Reliability/Availability : Mask failures Mask corrupted data Performance: Scalability

More information

Comprehensive Review of Data Prefetching Mechanisms

Comprehensive Review of Data Prefetching Mechanisms 86 Sneha Chhabra, Raman Maini Comprehensive Review of Data Prefetching Mechanisms 1 Sneha Chhabra, 2 Raman Maini 1 University College of Engineering, Punjabi University, Patiala 2 Associate Professor,

More information

The Google File System

The Google File System The Google File System Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung SOSP 2003 presented by Kun Suo Outline GFS Background, Concepts and Key words Example of GFS Operations Some optimizations in

More information

4 Chip Multiprocessors (I) Chip Multiprocessors (ACS MPhil) Robert Mullins

4 Chip Multiprocessors (I) Chip Multiprocessors (ACS MPhil) Robert Mullins 4 Chip Multiprocessors (I) Robert Mullins Overview Coherent memory systems Introduction to cache coherency protocols Advanced cache coherency protocols, memory systems and synchronization covered in the

More information

A Dynamic NOC Arbitration Technique using Combination of VCT and XY Routing

A Dynamic NOC Arbitration Technique using Combination of VCT and XY Routing 727 A Dynamic NOC Arbitration Technique using Combination of VCT and XY Routing 1 Bharati B. Sayankar, 2 Pankaj Agrawal 1 Electronics Department, Rashtrasant Tukdoji Maharaj Nagpur University, G.H. Raisoni

More information

Lecture 11: Relaxed Consistency Models. Topics: sequential consistency recap, relaxing various SC constraints, performance comparison

Lecture 11: Relaxed Consistency Models. Topics: sequential consistency recap, relaxing various SC constraints, performance comparison Lecture 11: Relaxed Consistency Models Topics: sequential consistency recap, relaxing various SC constraints, performance comparison 1 Relaxed Memory Models Recall that sequential consistency has two requirements:

More information

WORLD WIDE NEWS GATHERING AUTOMATIC MANAGEMENT

WORLD WIDE NEWS GATHERING AUTOMATIC MANAGEMENT WORLD WIDE NEWS GATHERING AUTOMATIC MANAGEMENT Luís Veiga and Paulo Ferreira {luis.veiga, paulo.ferreira } @ inesc.pt INESC, Rua Alves Redol, 9 - Lisboa -1000 Lisboa - Portugal Abstract The world-wide-web

More information

Lecture 12: Relaxed Consistency Models. Topics: sequential consistency recap, relaxing various SC constraints, performance comparison

Lecture 12: Relaxed Consistency Models. Topics: sequential consistency recap, relaxing various SC constraints, performance comparison Lecture 12: Relaxed Consistency Models Topics: sequential consistency recap, relaxing various SC constraints, performance comparison 1 Relaxed Memory Models Recall that sequential consistency has two requirements:

More information

Chapter 9: Concurrency Control

Chapter 9: Concurrency Control Chapter 9: Concurrency Control Concurrency, Conflicts, and Schedules Locking Based Algorithms Timestamp Ordering Algorithms Deadlock Management Acknowledgements: I am indebted to Arturas Mazeika for providing

More information

Transparent Orthogonal Checkpointing Through User-Level Pagers

Transparent Orthogonal Checkpointing Through User-Level Pagers Transparent Orthogonal Checkpointing Through User-Level Pagers Espen Skoglund, Christian Ceelen, and Jochen Liedtke System Architecture Group University of Karlsruhe {skoglund,ceelen,liedtke}@ira.uka.de

More information

Introduction to Computing and Systems Architecture

Introduction to Computing and Systems Architecture Introduction to Computing and Systems Architecture 1. Computability A task is computable if a sequence of instructions can be described which, when followed, will complete such a task. This says little

More information