1 No compromises: distributed transactions with consistency, availability, and performance. Aleksandar Dragojević, Dushyanth Narayanan, Edmund B. Nightingale, Matthew Renzelmann, Alex Shamis, Anirudh Badam, Miguel Castro. Microsoft Research
2 Distributed transactions. Advantage: a very strong primitive that abstracts away concurrency and failures, simplifying the construction of distributed systems. Disadvantage: not widely used, because of the poor performance and weak consistency of existing implementations.
3 Solution. FaRM, a main-memory distributed computing platform, provides distributed transactions with strict serializability, high performance, durability, and high availability. FaRM achieves a peak throughput of 140 million TATP transactions per second on 90 machines with a 4.9 TB database, and it recovers from a failure in less than 50 ms.
4 Hardware trends. Non-volatile memory: achieved cheaply by attaching batteries to the power supply unit and writing the contents of DRAM to SSD when the power fails. Fast networks with RDMA (remote direct memory access), which lets a machine in a network read and write another machine's main memory without involving the remote processor, cache, or OS. CPU overheads are optimized by reducing message counts, using one-sided RDMA reads and writes, and exploiting parallelism.
5 Programming model and architecture
7 Overview. FaRM transactions use optimistic concurrency control. FaRM provides applications with the abstraction of a global address space spanning the machines in the cluster. A FaRM instance moves through a sequence of configurations over time as machines fail or new machines are added.
8 Configuration. A configuration is a tuple <i, S, F, CM> where i is a unique id, S is the set of machines in the configuration, F is a mapping from machines to failure domains, and CM is the configuration manager. All machines in a FaRM instance agree on the current configuration and store it.
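The configuration tuple above can be pictured as a small immutable record. This is an illustrative sketch, not FaRM's actual code; all names are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet

@dataclass(frozen=True)
class Configuration:
    """Sketch of the configuration tuple <i, S, F, CM>."""
    config_id: int                   # i: unique identifier
    machines: FrozenSet[str]         # S: set of machines in the configuration
    failure_domains: Dict[str, int]  # F: machine -> failure domain
    cm: str                          # CM: the configuration manager

    def __post_init__(self):
        # The CM must itself be a member of the configuration.
        assert self.cm in self.machines

    def is_member(self, machine: str) -> bool:
        return machine in self.machines

cfg = Configuration(7, frozenset({"m1", "m2", "m3"}),
                    {"m1": 0, "m2": 1, "m3": 1}, "m1")
```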
9 Global address space and queues. The global address space consists of 2 GB regions, each replicated on one primary and f backups. Objects are always read from the primary copy: with local memory accesses if the primary is local, and with one-sided RDMA reads if it is remote. Each object has a 64-bit version that is used for concurrency control and replication. FIFO queues are used for transaction logs and message queues.
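The read path above can be sketched as: map a global address to a region, then read locally or via one-sided RDMA depending on where the primary lives. The address layout and function names here are assumptions for illustration; the real FaRM encoding may differ.

```python
REGION_SIZE = 2 * 1024**3  # 2 GB regions, as in the text

def split_address(addr: int):
    """Split a global address into (region_id, offset within the region)."""
    return addr // REGION_SIZE, addr % REGION_SIZE

def read_object(addr, region_map, local_machine):
    """Read from the primary copy: locally if we host it, else RDMA.

    `region_map` maps region id -> primary machine (illustrative)."""
    region, offset = split_address(addr)
    primary = region_map[region]
    if primary == local_machine:
        return ("local_read", region, offset)      # plain memory access
    return ("rdma_read", primary, region, offset)  # one-sided RDMA read
```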
10 Distributed transactions and replication
11 Brief overview. 1. Transaction and replication protocols are integrated to improve performance. 2. FaRM uses fewer messages than traditional protocols, and exploits one-sided RDMA reads and writes for CPU efficiency and low latency. 3. Primary-backup replicas of both data and transaction logs are stored in non-volatile DRAM.
12 Transactions protocol
13 Overview. FaRM guarantees that individual object reads are atomic, that they read only committed data, and that reads of objects written by the transaction return the latest value written. Lock-free reads are also provided: optimized single-object read-only transactions. During the execution phase, transactions use one-sided RDMA to read objects and buffer writes locally.
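A lock-free single-object read can be pictured as an optimistic version double-check: read the version, copy the data, and re-check the version. This is a simplified sketch; FaRM actually keeps versions at cache-line granularity, and the object layout here is an assumption.

```python
def lockfree_read(obj):
    """Optimistic read: the copy is consistent if the object is unlocked
    and its version is unchanged before and after copying the data."""
    while True:
        v1 = obj["version"]
        if obj.get("locked"):
            continue              # a writer is in progress; retry
        data = obj["data"]
        if obj["version"] == v1:  # no concurrent write observed
            return data, v1
```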
15 Lock and validation. Lock: 1. The coordinator writes a LOCK record to the log on each machine that is a primary for any written object. 2. Primaries attempt to lock these objects at the specified versions. 3. The coordinator aborts the transaction and sends an abort record to all primaries if locking fails. Validation: 1. The coordinator performs read validation with one-sided RDMA reads from the primaries, and aborts the transaction if any read object has changed. 2. It uses an RPC instead if a primary holds more than t_r of the objects read.
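The read-validation step amounts to re-reading each object's version and aborting on any mismatch. A minimal sketch, where `current_version` stands in for the one-sided RDMA re-read (the interface is an assumption):

```python
def validate_reads(read_set, current_version):
    """Abort if any object in the read set changed since it was read.

    read_set: {object_id: version_observed_at_read_time}
    current_version: callable(object_id) -> version now at the primary
    """
    for oid, version_at_read in read_set.items():
        if current_version(oid) != version_at_read:
            return "abort"  # a concurrent write invalidated this read
    return "ok"
```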
16 Commit and truncate. Commit: 1. The coordinator writes a COMMIT-BACKUP record to the non-volatile log at each backup and waits for an ack from the NIC. 2. Once all backups have acked, the coordinator writes a COMMIT-PRIMARY record to the log at each primary, and reports completion to the application after at least one primary acks. 3. Primaries update the objects and their versions and unlock them, which exposes the writes committed by the transaction. Truncate: 1. The coordinator truncates logs at primaries and backups after receiving acks from all primaries. 2. Backups apply the updates to their copies of the objects at truncation time.
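The ordering constraints in the commit phase (all backup acks before any COMMIT-PRIMARY; report to the application after the first primary ack) can be sketched as follows. The log transport is simulated; `LogReplica` and its method are assumptions, not FaRM's real NIC-acked log writes.

```python
class LogReplica:
    """Stub standing in for a machine's non-volatile log (illustrative)."""
    def __init__(self):
        self.log = []

    def append_log(self, record):
        self.log.append(record)
        return True  # models the NIC ack for the log write

def commit(backups, primaries):
    # 1. COMMIT-BACKUP to every backup's log; wait for all NIC acks.
    if not all(b.append_log("COMMIT-BACKUP") for b in backups):
        return "unknown"
    # 2. Only then write COMMIT-PRIMARY to each primary's log.
    primary_acks = [p.append_log("COMMIT-PRIMARY") for p in primaries]
    # 3. Report success after at least one primary ack.
    return "committed" if any(primary_acks) else "unknown"
```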
17 Correctness. Serializability: 1. Committed read-write transactions serialize at the point where all write locks were acquired. 2. Committed read-only transactions serialize at the point of their last read. Serializability across failures: 1. The coordinator waits for hardware acks from all backups before writing COMMIT-PRIMARY. 2. It waits for at least one successful COMMIT-PRIMARY ack among the primaries before reporting success.
18 Failure recovery
19 Overview. 1. FaRM provides durability and high availability using replication: all committed state can be recovered from regions and logs stored in non-volatile DRAM. 2. Durability is ensured even if at most f replicas per object lose the contents of their non-volatile DRAM. 3. Availability: FaRM remains available through failures and network partitions provided there is a partition that contains a majority of the machines, those machines remain connected to each other and to a majority of the replicas in the ZooKeeper service, and the partition contains at least one replica of each object.
20 Failure detection. FaRM uses leases to detect failures. 1. Every machine holds a lease at the CM and the CM holds a lease at every other machine. 2. Expiry of any lease triggers failure recovery. 3. Each machine sends a lease request to the CM, and the CM responds with a message that acts both as a lease grant to the machine and as a lease request from the CM.
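The lease mechanism above can be sketched as a timestamped grant that each side refreshes; recovery is triggered the moment a grant goes stale. Class and method names are illustrative assumptions, and the timing here is simulated rather than driven by a real network.

```python
import time

class LeaseEndpoint:
    """One side of the bidirectional CM<->machine lease exchange."""
    def __init__(self, lease_ms=10):  # 10 ms lease, matching the setup slide
        self.lease_ms = lease_ms
        self.last_grant = time.monotonic()

    def on_grant(self):
        # A lease response doubles as a grant and a new request,
        # so each received message refreshes the lease.
        self.last_grant = time.monotonic()

    def expired(self, now=None):
        """Expiry of this lease would trigger failure recovery."""
        now = time.monotonic() if now is None else now
        return (now - self.last_grant) * 1000 > self.lease_ms
```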
21 Reconfiguration. 1. The reconfiguration protocol moves a FaRM instance from one configuration to the next. 2. One-sided RDMA operations require a new protocol because the remote CPU is not involved in serving reads. 3. Precise membership: after a failure, all machines in a new configuration must agree on its membership before allowing object mutation. (7 steps)
23 Timeline. Suspect: 1. When a lease for a machine expires at the CM, the CM suspects that machine of failure, initiates reconfiguration (blocking all external requests), and tries to become the new CM. Probe: 1. The new CM issues an RDMA read to all machines in the configuration; any machine for which the read fails is also suspected (handling correlated failures). 2. The new CM proceeds with the reconfiguration only after it obtains responses from a majority.
24 Timeline. Update configuration: 1. The new CM attempts to update the configuration data to <c+1, S, F, CM_id>. Remap regions: 1. The new CM reassigns regions mapped to failed machines to restore the full number of replicas (one primary and f backups). For failed primaries, a surviving backup is promoted to be the new primary for efficiency. Send new configuration: 1. The new CM sends a NEW-CONFIG message (configuration id, its own id, and the ids of the other machines in the configuration) to all machines in the configuration, and resets the lease protocol if the CM has changed.
25 Timeline. Apply new configuration: 1. Each machine updates its current configuration id when it receives a NEW-CONFIG message with a greater configuration identifier. 2. Each machine allocates space to hold any new region replicas assigned to it, blocks all external requests, and replies to the CM with a NEW-CONFIG-ACK message. Commit new configuration: 1. After receiving all the acks, the new CM waits to ensure that any leases granted in the old configuration have expired, then sends NEW-CONFIG-COMMIT to all configuration members. 2. All configuration members can then unblock previously blocked external requests.
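The apply/commit steps above can be sketched as a small handler pair: adopt only strictly newer configurations, block external requests on apply, and unblock on commit. The machine state is modeled as a plain dict; all names are illustrative assumptions.

```python
def apply_new_config(machine, msg):
    """Handle NEW-CONFIG: adopt only strictly newer configurations."""
    if msg["config_id"] <= machine["config_id"]:
        return None                  # stale or duplicate message; ignore
    machine["config_id"] = msg["config_id"]
    machine["blocked"] = True        # block external requests until commit
    return "NEW-CONFIG-ACK"

def commit_new_config(machine):
    """Handle NEW-CONFIG-COMMIT: unblock previously blocked requests."""
    machine["blocked"] = False
```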
26 Transaction state recovery. FaRM recovers transactions after a configuration change using the logs distributed across the replicas of the objects modified by each transaction. The timeline of transaction state recovery: 1. Block access to recovering regions. 2. Drain logs. 3. Find recovering transactions. 4. Lock recovery. 5. Replicate log records. 6. Vote. 7. Decide.
28 Steps 1-2. Block access to recovering regions: 1. Requests for local pointers and RDMA references to a recovering region are blocked until lock recovery finishes. Drain logs: 1. All machines process all the records in their logs when they receive a NEW-CONFIG-COMMIT. 2. They record the configuration identifier in a variable LastDrained when they are done.
29 Steps 3-4. Find recovering transactions: 1. A recovering transaction is one whose commit phase spans a configuration change and for which some replica of a written object, some primary of a read object, or the coordinator has changed due to the reconfiguration. 2. The metadata needed to determine this is added during the reconfiguration phase. 3. During log draining, the transaction id and the list of updated region identifiers in each log record are examined to build the set of recovering transactions. 4. Each backup of a region sends a NEED-RECOVERY message to the primary if needed. Lock recovery: 1. The primary shards recovering transactions by id across its threads, which lock any objects modified by those transactions in parallel.
30 Steps 5-6. Replicate log records: 1. The threads in the primary replicate log records by sending backups a REPLICATE-TX-STATE message for any transactions the backups are missing. Vote: 1. The coordinator for a recovering transaction decides whether to commit or abort the transaction based on votes from each region updated by the transaction. 2. The primary of each region sends a RECOVERY-VOTE message to the coordinator's peer threads for each recovering transaction that modified the region. 3. The vote reflects the log records the primary has seen (e.g., commit-primary, commit-backup, lock, or truncated).
31 Step 7. Decide: 1. The coordinator decides to commit if it receives a commit-primary vote from any region. 2. Otherwise, it waits until at least one region has voted commit-backup and all other regions modified by the transaction have voted lock, commit-backup, or truncated, and then commits. 3. Otherwise, it aborts.
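The decide rule above is a small pure function over the collected votes. A minimal sketch, assuming votes arrive as a list of strings:

```python
def decide(votes):
    """Coordinator's decide rule over one vote per modified region."""
    # Any commit-primary vote means the transaction had already committed.
    if "commit-primary" in votes:
        return "commit"
    # Otherwise commit only if some region has the full commit record and
    # every other region is at least consistent with a commit.
    if "commit-backup" in votes and all(
            v in ("lock", "commit-backup", "truncated") for v in votes):
        return "commit"
    return "abort"
```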
32 Correctness. Intuition: recovery preserves the outcome for transactions that were previously committed or aborted. 1. Step 2 ensures that a log record for a recovering transaction that committed is processed and accepted before or during log draining. 2. Steps 3-4 ensure that the primaries of the regions modified by the transaction see these records. 3. Records are replicated to the backups to guarantee that voting produces the same result even after further failures. 4. Primaries send votes to the coordinator based on the records they have seen. 5. The decide step guarantees that the coordinator commits any transaction that was committed before the failure.
33 Recovering data. 1. FaRM must recover data at new backups for a region to ensure that it can tolerate f replica failures in the future. 2. Each machine sends a REGIONS-ACTIVE message to the CM when all regions for which it is primary become active. After receiving all of those messages, the CM sends an ALL-REGIONS-ACTIVE message to all machines in the configuration, and FaRM begins data recovery for the new backups. 3. Each recovered object is examined before being copied to the backup, using a version comparison against the local copy.
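The per-object version comparison during data recovery can be sketched as: keep whichever copy has the newer version. The object layout and the rule that the incoming copy wins only on a strictly greater version are illustrative assumptions.

```python
def recover_object(local_obj, recovered_obj):
    """During data recovery at a new backup, an incoming object
    overwrites the local copy only if its version is newer."""
    if recovered_obj["version"] > local_obj["version"]:
        return recovered_obj  # recovered copy is more recent
    return local_obj          # local copy already up to date
```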
34 Recovering allocator state. 1. The FaRM allocator splits each region into 1 MB blocks and keeps metadata called block headers. 2. A block header contains the object size for the block's slab and is replicated to the backups when a new block is allocated, ensuring the allocator state survives a failure. 3. After a failure, the slab free lists are recovered on the new primary by scanning the objects in the region.
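The free-list recovery step above can be sketched as a scan over the region: the replicated block headers give each block's object size, and unallocated slots are collected back into per-size free lists. The data layout here is a simplified assumption, not FaRM's actual allocator structures.

```python
BLOCK_SIZE = 1 * 1024 * 1024  # 1 MB blocks, as in the text

def rebuild_free_lists(region):
    """Rebuild slab free lists on a new primary by scanning the region.

    region: list of blocks; each block carries its replicated header
    (object_size) plus a per-slot allocation bitmap (illustrative)."""
    free_lists = {}
    for block in region:
        size = block["object_size"]  # from the replicated block header
        for slot, allocated in enumerate(block["slots"]):
            if not allocated:        # free slot goes back on the slab list
                free_lists.setdefault(size, []).append((block["id"], slot))
    return free_lists
```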
36 Setup. Testbed: 90 machines for FaRM and 5 machines for a replicated ZooKeeper. Machine spec: 256 GB of DRAM and two 8-core Intel E CPUs running Windows Server 2012 R2. Hyper-threading is enabled; the first 30 hardware threads are used for foreground work and the remaining 2 threads for the lease manager. FaRM configuration: 1 primary and 2 backups per region, with a lease time of 10 ms.
37 Benchmarks. TATP (Telecommunication Application Transaction Processing): 1. 70% of the operations are single-row lookups that use FaRM's lock-free reads (no commit phase). 2. 10% of the operations read 2-4 rows and require validation during the commit phase. 3. 20% of the operations are updates and require the full commit protocol. TPC-C: a well-known benchmark with complex transactions that access hundreds of rows.
38 Normal-case performance (failure-free). TATP: 140 million TATP transactions per second with 58 µs median latency and 645 µs 99th-percentile latency, outperforming published TATP results for Hekaton. TPC-C: 4.5 million new-order transactions per second with a median latency of 808 µs and a 99th-percentile latency of 1.9 ms; throughput is 17x higher than Silo without logging.
39 Failures. TATP: Throughput drops sharply at the failure but recovers rapidly (in about 40 ms), with some later dips caused by the skewed access pattern of the benchmark. TPC-C: Most of the throughput is regained in less than 50 ms, and all regions become active shortly after that.
41 Failing the CM. Recovery is slower than when a non-CM machine fails because reconfiguration takes longer: 97 ms instead of 20 ms.
42 Distribution of recovery times. Throughput is regained in a similar time when a smaller data set is used. Median recovery time: around 50 ms. More than 70% of the executions have a recovery time under 100 ms.
43 Related work. 1. RAMCloud: does not support multi-object transactions, which FaRM does, and takes much longer to recover, whereas FaRM recovers from a failure in tens of milliseconds. 2. Spanner: uses 2f+1 replicas compared to FaRM's f+1, and sends more messages to commit than FaRM. 3. Silo: storage is local, so availability is lost when a machine fails; FaRM also achieves higher throughput and lower latency than Silo.
44 Conclusion. 1. Transactions make it easier to program distributed systems, but many systems avoid them or weaken their consistency to improve availability and performance. 2. FaRM is a distributed main-memory computing platform for modern data centers that provides strictly serializable transactions with high throughput, low latency, and high availability. 3. FaRM recovers from a machine failure back to peak throughput in less than 50 ms, making failures transparent to applications.