NoSQL DBMS: Concepts, Techniques and Patterns. Iztok Savnik, FAMNIT, 2017

1 NoSQL DBMS: Concepts, Techniques and Patterns. Iztok Savnik, FAMNIT, 2017

2 Contents: Requirements, Consistency model, Partitioning, Storage layout, Query models

3 Requirements: Requirements of the new web applications, ACID vs. BASE, CAP theorem

4 ACID vs. BASE
ACID: Atomicity, Consistency, Isolation, Durability. Characteristics: strong consistency, focus on commit, nested transactions, availability?, conservative (pessimistic), difficult evolution.
BASE: Basically Available, Soft-state, Eventual consistency. Characteristics: weak consistency, availability first, approximate answers OK, aggressive (optimistic), simpler! faster, easier evolution.

5 CAP theorem Eric Brewer, "Towards Robust Distributed Systems", keynote at ACM's PODC 2000. Consistency: the system is in a consistent state after the execution of an operation; after an update by some writer, all readers see that update in the shared data source. Availability: the system is designed and implemented in a way that allows it to continue operation if, e.g., nodes in a cluster crash or some hardware or software parts are down due to upgrades. Partition tolerance: the ability of the system to continue operation in the presence of network partitions. Theorem: you can have at most two of these properties for any shared-data system. Widely adopted today by large web companies.

6 CAP theorem

7 Forfeit Partitions Examples: single-site databases, cluster databases, LDAP, xFS file system. Traits: 2-phase commit, cache validation protocols.

8 Forfeit Availability Examples: distributed databases, distributed locking, majority protocols. Traits: pessimistic locking, make minority partitions unavailable.

9 Forfeit Consistency Examples: Coda, Web caching, DNS. Traits: expirations/leases, conflict resolution, optimistic.

10 Consistency Models A consistency model determines rules for visibility and apparent order of updates: when is the system in a consistent state after the execution of an operation? Example: row X is replicated on nodes M and N; client A writes row X to node N; some period of time t elapses; client B reads row X from node M. Does client B see the write from client A? Consistency is a continuum with tradeoffs. Consistency models: timestamps, optimistic locking, vector clocks, multiversion storage.

11 Strict Consistency All read operations must return the data from the latest completed write operation, regardless of which replica the operations went to. This implies either that all operations for a given row go to the same node (with replication for availability), or that nodes employ some kind of distributed transaction protocol (e.g. 2-Phase Commit or Paxos). CAP theorem: strict consistency can't be achieved at the same time as availability and partition tolerance.

12 Eventual Consistency As t → ∞, readers will see writes. In a steady state, the system is guaranteed to eventually return the last written value. Examples: DNS, or MySQL slave replication (log shipping). Special cases of eventual consistency: read-your-own-writes consistency (the "Sent Mail" box), and causal consistency (if you write Y after reading X, anyone who reads Y sees X). Gmail has RYOW but not causal consistency!

13 Timestamps and Vector Clocks Determining the history of a row: eventual consistency relies on deciding what value a row will eventually converge to. In the case of two writers writing at the same time, this is difficult. Timestamps are one solution, but they rely on synchronized clocks and don't capture causality. Vector clocks are an alternative method of capturing order in a distributed system.

14 Vector Clocks A vector clock is a tuple (t_1, t_2, ..., t_n) of clock values, one from each node. V1 < V2 if: for all i, V1[i] <= V2[i], and for at least one i, V1[i] < V2[i]. V1 < V2 implies a global time ordering of events. When data is written from node i, it sets t_i to its clock value. This allows eventual consistency to resolve the order of writes on multiple replicas.
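To make the comparison rule concrete, a minimal sketch in Python (our illustration, not from any particular system; vector clocks are plain tuples):

```python
def vc_less(v1, v2):
    """True if v1 happened-before v2: every component of v1 is <= the
    corresponding component of v2, and at least one is strictly smaller."""
    assert len(v1) == len(v2)
    return (all(a <= b for a, b in zip(v1, v2))
            and any(a < b for a, b in zip(v1, v2)))

# (1, 2, 1) < (2, 2, 1): node 0 has seen one more event
print(vc_less((1, 2, 1), (2, 2, 1)))                 # True
# (2, 0) and (0, 2) are incomparable: concurrent updates, i.e. a conflict
print(vc_less((2, 0), (0, 2)), vc_less((0, 2), (2, 0)))  # False False
```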

15 Optimistic Concurrency Control Locking is a conservative approach that avoids conflicts, but it has costs: additional code for locking, deadlock checking, and lock contention on frequently used objects. If conflicts are rare, we can run transactions concurrently without locking; instead we check for conflicts before committing a transaction. Optimistic control is faster than locking if there are few conflicts (worse when there are many conflicts).

16 Kung-Robinson model A transaction has three phases: READ: the transaction reads from the database but makes changes to a local copy. VALIDATE: validate the changes. WRITE: make the local changes public. (Diagram: old and new versions of the changed objects, reachable from ROOT.)

17 Validation Check conditions that are sufficient for serializability. Every transaction gets a numerical timestamp; transaction IDs are assigned at the end of the READ phase, just before validation. ReadSet(Ti): the set of objects read by transaction Ti. WriteSet(Ti): the set of objects changed by Ti.
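As a sketch of how the validation phase can use these sets, a simplified backward validation in Python (our illustration of the Kung-Robinson idea; names hypothetical):

```python
def validate(ti_readset, overlapping_committed_writesets):
    """Backward validation for transaction Ti (simplified):
    Ti passes if no transaction Tj that committed while Ti was in its
    READ phase wrote an object that Ti read; otherwise Ti must restart."""
    for tj_writeset in overlapping_committed_writesets:
        if ti_readset & tj_writeset:   # ReadSet(Ti) intersects WriteSet(Tj)
            return False               # conflict: abort and restart Ti
    return True                        # safe to enter the WRITE phase

print(validate({"x", "y"}, [{"z"}]))        # True: disjoint sets
print(validate({"x", "y"}, [{"y", "w"}]))   # False: Ti read a stale 'y'
```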

18 Optimistic Locking A unique counter or clock value is saved for each piece of data. On update, the client provides the counter/clock value of the revision it wants to update. An optimistic locking scheme needs a total order of version numbers: to allow causality reasoning on versions (e.g. which revision is considered the most recent by the node on which an update was issued), a lot of history has to be saved. Downside of this procedure, as the Project Voldemort development team argues: it does not work well in a distributed and dynamic scenario where servers show up and go away often and without prior notice, since such a total order easily gets disrupted.
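At its core this is a compare-and-swap on a per-item revision counter. A minimal sketch of the idea (a hypothetical toy store, not any particular product's API):

```python
class VersionedStore:
    """Toy key-value store with per-item version counters for optimistic locking."""
    def __init__(self):
        self.data = {}  # key -> (version, value)

    def get(self, key):
        return self.data.get(key, (0, None))

    def put(self, key, expected_version, value):
        """The write succeeds only if the caller saw the current revision."""
        current_version, _ = self.data.get(key, (0, None))
        if expected_version != current_version:
            return False  # someone updated in between: re-read and retry
        self.data[key] = (current_version + 1, value)
        return True

store = VersionedStore()
v, _ = store.get("article")
store.put("article", v, "draft 1")          # ok: version 0 -> 1
print(store.put("article", v, "draft 2"))   # False: stale version, conflict detected
```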

19 Optimistic Locking Couchbase optimistic locking: we optimistically speculate that there will be no conflict; before commit we check whether there are conflicts; commit is issued if there are none. Handling a single read/write operation is easier than a complex transaction. Example: let's assume we are building an online Wikipedia-like application using Couchbase Server, where users can update articles and add new ones. Alice uses this application to edit an article on bicycles to correct some information. Alice opens the article and makes her changes, but before she hits save, she gets distracted and walks away from her desk. In the meantime, Joe notices the same error in the bicycle article and wants to correct it. If optimistic locking is used in the application, Joe can edit the article and save his changes. When Alice returns and wants to save her changes, either Alice or the application will have to handle the latest updates before allowing Alice's action to change the document. Optimistic locking takes the optimistic view that data conflicts due to concurrent edits occur rarely, so it's more important to allow concurrent edits.

20 Pessimistic Locking Couchbase pessimistic locking: the application simply locks the object and unlocks it after commit. Example: now let's assume that your business process requires exclusive access to one or more documents or a graph of documents. Referring to the previous example, when Alice is editing the document she does not want any other user to edit the same document. If Joe tries to open the page, he has to wait until Alice has released the lock. With pessimistic locking, the application needs to explicitly get a lock on the document to guarantee exclusive user access. When the user is done accessing the document, the locks can be removed either manually or using a timeout.

21 Multiversion Concurrency Control MVCC is a concurrency control method commonly used by database management systems to provide concurrent access to the database. Isolation is the property that provides guarantees for concurrent accesses to data, and isolation is implemented by means of a concurrency control protocol. MVCC solves the problem by keeping multiple copies of each data item: on update, MVCC creates a newer version of the data item. The isolation level commonly implemented with MVCC is snapshot isolation: all reads made in a transaction see a consistent snapshot of the database, i.e. the last committed values that existed at the time the transaction started. How to remove versions that become obsolete? PostgreSQL adopts this approach with its VACUUM process.
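A toy sketch of version chains, snapshot reads, and obsolete-version cleanup (a hypothetical structure, much simplified compared to what PostgreSQL actually does):

```python
class MVCCStore:
    """Each item keeps a list of (commit_timestamp, value) versions."""
    def __init__(self):
        self.versions = {}  # key -> [(ts, value), ...] in commit order
        self.clock = 0

    def write(self, key, value):
        self.clock += 1  # commit timestamp of this update
        self.versions.setdefault(key, []).append((self.clock, value))

    def snapshot_read(self, key, snapshot_ts):
        """Return the last value committed at or before snapshot_ts."""
        for ts, value in reversed(self.versions.get(key, [])):
            if ts <= snapshot_ts:
                return value
        return None

    def vacuum(self, horizon_ts):
        """Drop versions no running transaction can still see (cf. VACUUM),
        keeping the newest version visible at the horizon."""
        for key, vs in self.versions.items():
            older = [v for v in vs if v[0] < horizon_ts]
            newer = [v for v in vs if v[0] >= horizon_ts]
            self.versions[key] = ([older[-1]] if older else []) + newer

s = MVCCStore()
s.write("x", 1); snap = s.clock; s.write("x", 2)
print(s.snapshot_read("x", snap))  # 1: the snapshot ignores the later write
```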

22 Multiversion Concurrency Control Each table cell has a timestamp; timestamps don't necessarily need to correspond to real life. Multiple versions can exist concurrently for a given row. Reads may return the most recent version, the most recent version before time T, etc. (free snapshots). The system may provide optimistic concurrency control with compare-and-swap on timestamps. Used by SQL Anywhere, InterBase, Firebird, Oracle, PostgreSQL, MongoDB and Microsoft SQL Server (2005 and later). Not the same as serializability.

23 Vector Clock Details A vector clock is defined as a tuple (V[0], V[1], ..., V[n]) of clock values from each node. In a distributed scenario, node i maintains such a tuple V_i of clock values, which represents its own state and the state of the other (replica) nodes as it is aware of them at a given time: V_i[0] is the clock value of the first node, V_i[1] the clock value of the second node, ..., V_i[i] its own, ..., V_i[n] the clock value of the last node. Clock values may be real timestamps derived from a node's local clock, version/revision numbers, or some other ordinal values.

24 Vector Clocks Vector clocks are updated according to the following rules. If an internal operation happens at node i, this node increments its clock V_i[i]; internal updates are thus seen immediately by the executing node. If node i sends a message to node k, it first advances its own clock value V_i[i] and attaches the vector clock V_i to the message; thereby it tells the receiving node about its internal state and its view of the other nodes at the time the message is sent. If node i receives a message from node j, it first advances its own clock value V_i[i] and then merges its vector clock with the vector clock V_message attached to the message, component-wise: V_i = max(V_i, V_message). To compare two vector clocks V_i and V_j and derive a partial ordering, the following rule is applied: V_i > V_j if for all k, V_i[k] >= V_j[k], and for at least one l, V_i[l] > V_j[l]. If neither V_i > V_j nor V_i < V_j applies, a conflict caused by concurrent updates has occurred and needs to be resolved, e.g. by a client application.
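The three update rules translate directly into code. A sketch (a hypothetical Node class, our illustration):

```python
class Node:
    def __init__(self, node_id, n_nodes):
        self.i = node_id
        self.vc = [0] * n_nodes            # V_i, this node's vector clock

    def internal_event(self):
        self.vc[self.i] += 1               # rule 1: increment own component

    def send(self):
        self.vc[self.i] += 1               # rule 2: tick, then attach the clock
        return list(self.vc)               # V_message travels with the message

    def receive(self, v_message):
        self.vc[self.i] += 1               # rule 3: tick, then merge component-wise
        self.vc = [max(a, b) for a, b in zip(self.vc, v_message)]

a, b = Node(0, 2), Node(1, 2)
a.internal_event()
b.receive(a.send())
print(a.vc, b.vc)  # [2, 0] [2, 1]: b now knows a's state at send time
```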

25 Vector Clocks

26 Vector Clocks Causal reasoning between updates: clients participate in the vector clock scenario. They keep the vector clock of the last replica node they have talked to and use this vector clock according to the client consistency model that is required. For monotonic read consistency, a client attaches the last vector clock it received to its requests, and the contacted replica node makes sure that the vector clock of its response is greater than the vector clock the client submitted; clients can thus be sure to see only newer versions of a piece of data. Advantages of vector clocks: no dependence on synchronized clocks; no total ordering of revision numbers required for causal reasoning; no need to store and maintain multiple revisions of a piece of data on all nodes.

27 Gossip Gossip is one method to propagate a view of cluster status. Every t seconds, on each node: the node selects some other node to chat with, and reconciles its view of the cluster with its gossip buddy. Each node maintains a timestamp for itself and for the most recent information it has from every other node. Information about cluster state spreads in O(log n) rounds. Scalable and no SPOF, but state is only eventually consistent.
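A minimal sketch of one gossip round (a hypothetical view structure; real systems layer failure detection and anti-entropy on top of this):

```python
import random
from types import SimpleNamespace

def gossip_round(nodes):
    """One round: every node reconciles its cluster view with a random buddy.
    A view maps node_id -> (heartbeat, state); the newer heartbeat wins."""
    for node in nodes:
        buddy = random.choice([n for n in nodes if n is not node])
        for nid in node.view.keys() | buddy.view.keys():
            a = node.view.get(nid, (0, None))
            b = buddy.view.get(nid, (0, None))
            best = a if a[0] >= b[0] else b     # keep the fresher information
            node.view[nid] = best
            buddy.view[nid] = best

nodes = [SimpleNamespace(view={i: (1, "up")}) for i in range(8)]
for _ in range(4):                 # O(log n) rounds spread state with high probability
    gossip_round(nodes)
print(all(len(n.view) == 8 for n in nodes))  # very likely True after a few rounds
```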

28 Gossip (Diagram: 4 rounds of the protocol.)

29 Propagate State via Gossip A gossip protocol operates in either a state transfer or an operation transfer model to handle read and update operations as well as replication among database nodes. State transfer model: data, or deltas of data, are exchanged between clients and servers as well as among servers. Database server nodes maintain vector clocks for their data and also state version trees for conflicting versions, i.e. versions whose vector clocks cannot be brought into a V_A < V_B or V_A > V_B relation. Clients maintain vector clocks for pieces of data they have already requested or updated. Vector clocks are exchanged and processed in the following manner. Query processing: when a client queries for data, it sends its vector clock of the requested data along with the request; the server node responds with the part of its state tree for that piece of data that precedes the vector clock attached to the client request, together with the server's vector clock; the client has to resolve potential version conflicts. Update processing and internode gossiping follow the same vector clock exchange scheme.

30 Propagate State via Gossip Operation transfer model: operations applicable to locally maintained data are communicated among nodes; less bandwidth is consumed by interchanging operations than actual data or deltas of data. It is important to apply the operations in the correct order on each node: a replica node first has to determine the causal relationship between operations (by vector clock comparison) before applying them to its data. In this model a replica node maintains the following vector clocks: V_state, the vector clock corresponding to the last updated state of its data; V_i, a vector clock for itself, in which, compared to V_state, merges with received vector clocks may already have happened; and V_j, the vector clock received in the last gossip message of replica node j (one for each of those replica nodes). Vector clocks are exchanged and processed in read, update, and internode messages. Complex protocols handle the queue of (deferred) operations in the correct (causal) order, reasoned from the vector clocks attached to these operations.

31 Partitioning Data in large-scale systems exceeds the capacity of a single machine and should also be replicated to ensure reliability and to allow scaling measures such as load balancing. Ways of partitioning the data of such a system therefore have to be considered, depending on the size of the system and other factors such as dynamism (e.g. how often and how dynamically storage nodes may join and leave). There are different approaches to this issue: memory caches, clustering, the master(s)/slaves model, sharding, and consistent hashing.

32 Memory Caches Memory caches like memcached (memcached.org) are partitioned, though transient, in-memory databases: they replicate the most frequently requested parts of a database to main memory, rapidly deliver this data to clients, and disburden database servers significantly. A memcached deployment consists of an array of processes with an assigned amount of memory, launched on several machines in a network and made known to an application via configuration. The memcached protocol is implemented in different programming languages and provides client applications with a simple key-/value-store API. It stores objects placed under a key into the cache by hashing that key against the configured memcached instances. If a memcached process does not respond, most API implementations ignore the non-answering node and use the responding nodes instead, which causes an implicit rehashing of cache objects, as part of them gets hashed to a different memcached server after a cache miss.

33 Memory Caches If a formerly non-answering node joins the memcached server array again, keys for part of the data get hashed to it again after a cache miss, and objects now dedicated to that node implicitly leave the memory of the other memcached servers they were hashed to while the node was down. memcached applies an LRU strategy for cache cleaning and allows specifying timeouts for cache objects. Logic half in client, half in server: clients understand how to choose which server to read or write to for an item, and what to do when they cannot contact a server; the servers understand how to store and fetch items, and also manage when to evict or reuse memory. O(1): all commands are implemented to be as fast and lock-friendly as possible, which allows near-deterministic query speeds for all use cases. Queries on slow machines should run in well under 1 ms; high-end servers can serve millions of keys per second in throughput. Other implementations of memory caches are available, e.g. for application servers like JBoss.

34 Database clusters High availability: all components of a system, end to end, provide an uninterrupted service. High availability can be achieved through redundancy or very fast recovery. Clustering of database servers is another approach to partitioning data; it strives for transparency towards clients, who should not notice talking to a cluster of database servers instead of a single server. Clustering scales the persistence layer of a system to a certain degree, though many criticize that clustering features have only been added on top of DBMSs that were not originally designed for distribution. MySQL Cluster: shared-nothing clustering and auto-sharding, designed to provide high availability and high throughput with low latency, while allowing for near-linear scalability.

35 Database clusters Clustering uses multiple servers to work with a shared database system; each server within a cluster is called a node. If the primary node fails, the virtual database service becomes active on one of the secondary nodes. Low disk and network latency is essential. Some clustering systems operate at the operating system level, not the database level, e.g. MS-SQL. MySQL InnoDB Cluster: InnoDB Cluster is a collection of products that work together. A group of MySQL servers can be configured as a cluster using MySQL Shell. The cluster has a single master (the primary), which is the read-write master; multiple secondary servers are replicas of the master. A minimum of three servers is required to create a high-availability cluster. A client application is connected to the primary via MySQL Router.

36 Database clusters MySQL InnoDB: InnoDB is a general-purpose storage engine that balances high reliability and high performance. Key advantages of InnoDB include: its DML operations follow the ACID model, with transactions featuring commit, rollback, and crash-recovery capabilities to protect user data; row-level locking and Oracle-style consistent reads increase multi-user concurrency and performance; data is arranged on disk so as to optimize queries based on primary keys; and, to maintain data integrity, InnoDB supports FOREIGN KEY constraints.

37 Master(s)/Slaves models Each data partition has a single master and multiple slaves. Write operations for all or parts of the data are routed to the master(s), while a number of replica servers (slaves) satisfy read requests. Read/write protocol: if the master replicates to its slaves asynchronously, there are no write lags, but if the master crashes before completing replication to at least one slave, the write operation is lost; alternatively, the master replicates writes synchronously to at least one slave. Read requests can go to any replica if the client can tolerate some degree of data staleness. "Mastership" happens at the virtual node level: there is no single physical node that plays the role of the master, so the write workload is also distributed across different physical nodes. When a physical node crashes, the masters of certain partitions are lost, and the most up-to-date slave is nominated to become the new master.

38 Master(s)/Slaves models The master/slave model works very well in general when the application has a high read/write ratio. It also works very well when updates happen evenly across the key range, so it is the predominant model of data replication. There are two ways the master can propagate updates to the slaves: state transfer and operation transfer. The state transfer model is more robust against message loss; deltas are often used (hash trees of the objects). The multi-master model allows updates to happen at any replica; it is used when a single master/slaves model is unable to spread the workload evenly, but it brings problems in retaining consistency.

39 Master(s)/Slaves models Consistency in master(s) models: strict consistency can be achieved by 2PC, which brings all N replicas to the same state at every update. In the "prepare" phase the coordinator asks every replica to confirm whether it is ready to perform the update; after gathering positive responses from all N replicas, the "commit" phase asks every replica to commit. If any one of the replicas crashes, the update is unsuccessful. Quorum-based 2PC: the coordinator only needs W < N replicas to acknowledge; it writes to all N replicas but only waits for positive acknowledgments from any W. On read we access R replicas (with W + R > N) and return the value with the latest timestamp. If the client can tolerate a more relaxed consistency model, we don't need to use the 2PC commit or a quorum-based protocol.
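A sketch of the quorum read/write rule (toy replicas, no timeouts or failures; W + R > N guarantees the read set intersects the last successful write set, so the newest timestamp is the latest value):

```python
class Replica:
    """Toy replica holding a (timestamp, value) pair per key."""
    def __init__(self):
        self.kv = {}
    def store(self, key, tv):
        self.kv[key] = tv
        return True                     # acknowledgment
    def load(self, key):
        return self.kv.get(key)

def quorum_write(replicas, key, value, ts, W):
    """Send the write to all N replicas; succeed once W acknowledge."""
    acks = sum(1 for r in replicas if r.store(key, (ts, value)))
    return acks >= W

def quorum_read(replicas, key, R):
    """Read from R replicas and return the answer with the latest timestamp."""
    answers = [r.load(key) for r in replicas[:R]]
    return max((a for a in answers if a is not None), default=None)

N, W, R = 3, 2, 2                       # W + R > N
replicas = [Replica() for _ in range(N)]
quorum_write(replicas, "x", "v1", ts=1, W=W)
print(quorum_read(replicas, "x", R))    # (1, 'v1')
```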

40 Sharding Partition the data in such a way that data typically requested and updated together resides on the same node. There are different interpretations of sharding: horizontal partitioning of the data store (not of a single table!), where each shard may have the same schema. Usually some subset of attributes is used for the definition of the partitioning; these attributes form the shard key (sometimes referred to as the partition key). Sharding physically organizes the data so that load and storage volume are evenly distributed among the servers. Data shards may also be replicated for reasons of reliability and load balancing; it may be allowed to write either to a dedicated replica only, or to all replicas maintaining a partition of the data.

41 Sharding When an application stores and retrieves data, sharding logic directs the application to the appropriate shard. This sharding logic can be implemented as part of the data access code in the application, or it can be implemented by the data storage system. Benefits of sharding: you can scale the system out by adding further shards running on additional storage nodes; a system can use off-the-shelf hardware rather than specialized and expensive computers for each storage node; you can reduce contention and improve performance by balancing the workload across shards; and, in the cloud, shards can be located physically close to the users that will access the data. Sharding strategies (Microsoft Azure): three strategies are commonly used when selecting the shard key and deciding how to distribute data across shards.

42 Sharding Lookup strategy: the sharding logic implements a map that routes a request for data to the shard that contains the data, using the shard key; the map records the mapping between virtual shards and physical partitions.

43 Sharding Range strategy: this strategy groups related items together in the same shard and orders them by shard key; the shard keys are sequential. Data is usually held in row key order in the shard. Items that are subject to range queries and need to be grouped together can use a shard key that has the same value for the partition key but a unique value for the row key.

44 Sharding Hash strategy: the sharding logic computes the shard to store an item in based on a hash of one or more attributes of the data. The purpose of this strategy is to reduce the chance of hotspots; it achieves a balance between the size of each shard and the average load that each shard will encounter.
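A sketch of the hash strategy (our illustration; note the stable hash from hashlib rather than Python's built-in hash(), which varies across processes):

```python
import hashlib

def shard_for(shard_key, n_shards):
    """Route an item to a shard from a stable hash of its shard key."""
    digest = hashlib.md5(shard_key.encode()).hexdigest()
    return int(digest, 16) % n_shards

print(shard_for("tenant-42:order-7", 8))   # the same key always lands on the same shard
```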

45 Sharding Advantages and considerations. Lookup: offers more control over the way that shards are configured and used; using virtual shards reduces the impact when rebalancing data, since new physical partitions can be added to even out the workload, and the mapping between a virtual shard and the physical partitions can be modified without affecting application code. Range: easy to implement and works well with range queries, which can often fetch multiple data items from a single shard in a single operation; offers easier data management (for example, users in the same region are in the same shard); but it doesn't provide optimal balancing between shards. Hash: offers a better chance of even data and load distribution; request routing can be accomplished directly by using the hash function, so there's no need to maintain a map; but rebalancing shards is difficult and might not resolve the problem of uneven load.

46 Consistent Hashing The topic is modern: it is motivated by issues in present-day systems (web caching, peer-to-peer systems, database partitioning), yet it's general and flexible enough to be useful for other problems. Web caching was the original motivation for consistent hashing in 1997: store requested pages in a local cache so that next time there is no need to retrieve the page from the remote source. Obvious benefits: less Internet traffic, faster access. The amount of data used to store pages cached for 100 machines can be substantial. Where to store them? One machine? A cluster? Use a hash function to map URLs to caches: the first intuition of a computer scientist. Nice properties of hash functions: a good one behaves like a random function, spreading data out evenly and without noticeable correlation across the possible buckets. Designing good hash functions is not easy.

47 Consistent Hashing Say there are n caches, named {0, 1, 2, ..., n-1}. We store the Web page with URL x at the cache server h(x) mod n, where h(x) is way bigger than n. This solution works well as long as you do not change n: all elements would have to be re-hashed if n changes, and n changes whenever a cache is added or a cache fails. Consistent hashing: what we want is hash-table-type functionality where changing n does not disrupt the hashing, i.e. almost all objects stay assigned to the same cache when n changes. Leading idea: hash server names and URLs with the same hash function; this determines which objects are assigned to which caches.
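The re-hashing cost of the mod-n scheme is easy to demonstrate (hypothetical URLs):

```python
import hashlib

def bucket(url, n):
    """The naive scheme: cache index is h(url) mod n."""
    return int(hashlib.md5(url.encode()).hexdigest(), 16) % n

urls = [f"example.com/page{i}" for i in range(10000)]
moved = sum(1 for u in urls if bucket(u, 100) != bucket(u, 101))
print(moved / len(urls))   # ~0.99: adding one cache relocates almost everything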

48 Consistent Hashing Inserting a new object x into the database: given an object x that hashes to the bucket h(x), and imagining we have already hashed all the cache server names, we scan the buckets to the right of h(x) until we find a bucket h(s) to which the name of some cache s hashes. The expected load on each of the n cache servers is exactly a 1/n fraction of the objects. Suppose we add a new cache server s: which objects have to move? Only the objects now assigned to s. Adding the n-th cache causes only a 1/n fraction of the objects to relocate. This approach to consistent hashing can also be visualized on a circle.

49 Consistent Hashing Standard hash table operations, Lookup and Insert: we need an efficient implementation of the rightward/clockwise scan for the cache server s that minimizes h(s) subject to h(s) >= h(x). We want a data structure for storing the cache names, with the corresponding hash values as keys, that supports a fast Successor operation: given a key, we need the next key and its value. Not a hash table; a heap? How about a binary search tree? Complexity is O(log n). Reducing the variance: the expected load of each cache server is a 1/n fraction of the objects, but the realized load of each cache will vary. An easy way to decrease this variance is to make k virtual copies of each cache s, hashing with k different hash functions to get h_1(s), ..., h_k(s). Each server now has k copies, all of them labeled s. Objects are assigned as before: from h(x), we scan clockwise until we encounter one of the hash values of some cache s. Choosing k ≈ log_2 n is large enough to obtain reasonably balanced loads.
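Putting the pieces together, a sketch of a ring with k virtual copies per server and a binary-search Successor lookup (our illustration; names hypothetical):

```python
import bisect
import hashlib

def h(s):
    """Hash a string to a point on the ring (a large integer)."""
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class ConsistentHashRing:
    def __init__(self, servers, k=16):
        self.k = k
        self.ring = []                        # sorted list of (hash, server)
        for s in servers:
            self.add(s)

    def add(self, server):
        for j in range(self.k):               # k virtual copies, all labeled `server`
            bisect.insort(self.ring, (h(f"{server}#{j}"), server))

    def remove(self, server):
        self.ring = [(hv, s) for hv, s in self.ring if s != server]

    def lookup(self, key):
        """Successor search: the smallest ring point >= h(key), wrapping around."""
        i = bisect.bisect_left(self.ring, (h(key),))
        return self.ring[i % len(self.ring)][1]

ring = ConsistentHashRing(["cache0", "cache1", "cache2"])
print(ring.lookup("example.com/index.html"))  # the cache responsible for this URL
ring.add("cache3")                            # only ~1/4 of the keys move
```

The sorted list plus bisect stands in for the binary search tree mentioned above; both give the O(log n) Successor operation.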

50 Consistent Hashing Virtual copies are also useful for dealing with heterogeneous caches that have different capacities: use a number of virtual copies of a cache server proportional to the server's capacity.

51 Literature
Christof Strauch: NoSQL Databases.
Ricky Ho: NoSQL Patterns, November 2009. horicky.blogspot.com/2009/11/nosql-patterns.html
Todd Lipcon: Design Patterns for Distributed Non-Relational Databases, presentation. static.last.fm/johan/nosql /intro_nosql.pdf
David Karger et al.: Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web, ACM Symposium on Theory of Computing, 1997.
Microsoft Azure: Sharding pattern. docs.microsoft.com/en-us/azure/architecture/patterns/sharding
Tim Roughgarden, Gregory Valiant: CS168: The Modern Algorithmic Toolbox, Lecture #1: Introduction and Consistent Hashing, Stanford, 2017.

52 Literature Couchbase: Optimistic or pessimistic locking, which one should you pick?


More information

Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2017)

Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2017) Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2017) Week 10: Mutable State (1/2) March 14, 2017 Jimmy Lin David R. Cheriton School of Computer Science University of Waterloo These

More information

Example File Systems Using Replication CS 188 Distributed Systems February 10, 2015

Example File Systems Using Replication CS 188 Distributed Systems February 10, 2015 Example File Systems Using Replication CS 188 Distributed Systems February 10, 2015 Page 1 Example Replicated File Systems NFS Coda Ficus Page 2 NFS Originally NFS did not have any replication capability

More information

Distributed Systems. Day 13: Distributed Transaction. To Be or Not to Be Distributed.. Transactions

Distributed Systems. Day 13: Distributed Transaction. To Be or Not to Be Distributed.. Transactions Distributed Systems Day 13: Distributed Transaction To Be or Not to Be Distributed.. Transactions Summary Background on Transactions ACID Semantics Distribute Transactions Terminology: Transaction manager,,

More information

Exam 2 Review. October 29, Paul Krzyzanowski 1

Exam 2 Review. October 29, Paul Krzyzanowski 1 Exam 2 Review October 29, 2015 2013 Paul Krzyzanowski 1 Question 1 Why did Dropbox add notification servers to their architecture? To avoid the overhead of clients polling the servers periodically to check

More information

Concurrency Control. [R&G] Chapter 17 CS432 1

Concurrency Control. [R&G] Chapter 17 CS432 1 Concurrency Control [R&G] Chapter 17 CS432 1 Conflict Serializable Schedules Two schedules are conflict equivalent if: Involve the same actions of the same transactions Every pair of conflicting actions

More information

A can be implemented as a separate process to which transactions send lock and unlock requests The lock manager replies to a lock request by sending a lock grant messages (or a message asking the transaction

More information

HOW CLUSTRIXDB RDBMS SCALES WRITES & READS

HOW CLUSTRIXDB RDBMS SCALES WRITES & READS Scaling out a SQL RDBMS while maintaining ACID guarantees in realtime is a very large challenge. Most scaling DBMS solutions relinquish one or many realtime transactionality requirements. ClustrixDB achieves

More information

Distributed Data Analytics Partitioning

Distributed Data Analytics Partitioning G-3.1.09, Campus III Hasso Plattner Institut Different mechanisms but usually used together Distributing Data Replication vs. Replication Store copies of the same data on several nodes Introduces redundancy

More information

References. NoSQL Motivation. Why NoSQL as the Solution? NoSQL Key Feature Decisions. CSE 444: Database Internals

References. NoSQL Motivation. Why NoSQL as the Solution? NoSQL Key Feature Decisions. CSE 444: Database Internals References SE 444: atabase Internals Scalable SQL and NoSQL ata Stores, Rick attell, SIGMO Record, ecember 2010 (Vol. 39, No. 4) ynamo: mazon s Highly vailable Key-value Store. y Giuseppe eandia et. al.

More information