Using Erlang, Riak and the ORSWOT CRDT at bet365 for Scalability and Performance. Michael Owen Research and Development Engineer

Size: px
Start display at page:

Download "Using Erlang, Riak and the ORSWOT CRDT at bet365 for Scalability and Performance. Michael Owen Research and Development Engineer"

Transcription

1 1

2 Using Erlang, Riak and the ORSWOT CRDT at bet365 for Scalability and Performance Michael Owen Research and Development Engineer 2

3 Background 3

4 bet365 in stats Founded in 2000 Located in Stoke-on-Trent The largest online sports betting company Over 19 million customers One of the largest private companies in the UK Employs more than 2,000 people : Over 26 billion was staked Last year is likely to be around 25% up Business growing very rapidly! Very technology focused company 4

5 bet365 technology stats Over 500 employees within technology 60 million per year IT budget Fifteen datacentres in seven countries worldwide 100Gb capable private dark fibre network 9 upstream ISPs 150 Gigabits of aggregated edge bandwidth 25 Gbps and 6M HTTP requests/sec at peak Around 1 to 1.5 million markets on site at any time 18 languages supported Push systems burst to 100,000 changes per second Almost all this change generated via automated models Database systems running at > 500K TPS at peak Over 2.5 million concurrent users of our push systems We stream more live sport than anyone else in Europe 5

6 Production systems using Erlang and Riak Cash-out A system used by customers to close out bets early. Stronger An online transaction processing (OLTP) data layer. 6

7 Why Erlang and Riak? 7

8 Our historical technology stack Very pragmatic What would deliver a quality product to market in record time Mostly.NET with some Java middleware Lot and lots of SQL Server 8

9 But we needed to change Complexity of code and systems Needed to make better use of multi-core CPUs Needed to scale out Could no longer scale our SQL infrastructure Had scaled up and out as far as we could Lack of scalability caused undue stress on the infrastructure Lead to loss of availability 9

10 Erlang Adoption 10

11 Erlang Key learnings You can get a lot done in a short space of time. A plus and a minus! Tooling is limited Hot code upgrades with state can be hard Dependency management could be better Get as much visibility into the system as possible e.g. stats, data etc Use OTP / reuse proven code Keep to standards (e.g. code layout etc) 11

12 Erlang Key learnings Check your supervision tree Message passing is a double edged sword Keep state small (e.g. gen_server state) Binaries and GC: Heap vs reference-counted Explore whether you need the Transparent Huge Pages (THP) feature in the Linux kernel Don t use error_logger as your main logger Validate all data coming into the Erlang system at the edge 12

13 Riak Adoption 13

14 Riak Brief overview Key value store Inspired by the Dynamo paper Traditionally, an eventually consistent system (AP from CAP) Riak 2.0+: Introduction of a strongly consistent option (CP from CAP) A Riak cluster is a 160-bit integer space the Riak Ring Split into partitions a virtual node (vnode) is responsible for each one A vnode lives on one of the Riak nodes Data is stored in a number of partitions (n_val setting default 3) Consistent hashing technique helps with identifying the partitions for putting and getting the data 14

15 Riak Why? Open source aspect Uses Erlang Based on solid ideas Horizontally scalable Highly available Masterless: No global locks performance is predictable Designed to be operationally simple Support and community exists 15

16 Riak Key learnings Eventually consistent: Eventually the data will converge to the consistent value Keep in mind: A get/read may return an old value A put/write may be accepted for a key at the same time as another concurrent put for the same key in the cluster (i.e. no global locking) Bend your problem! E.g. Look at it from another side Data model for your use case i.e. normalisation isn t key: Trade off puts vs gets for your use case No bigger object sizes than 1MB Store data in a structure which helps with version upgrades Riak Enterprise Multi-Datacenter Replication + Support 16

17 Riak Key learnings Consult Riak s System Performance Tuning documentation Different internode network vs inbound Monitor network and disk usage Use the riak-admin diag command Use Basho Bench to load test your cluster For bitcask backend: load test merging and tune for your use case Setting log_needs_merge to true will help with this tuning Allow siblings (i.e. allow_mult = true) With resolving siblings asap 17

18 Riak What are siblings? A sibling happens when Riak does not know which value is the causally recent (E.g. because of concurrent puts/writes) Uses version vectors to know this Version vector A is Concurrent to version vector B (as opposed to Descends or Dominates) Explained later Referenced as vector clocks in the Riak documentation should have been named version vectors Talk: A Brief History of Time in Riak. Sean Cribbs (Basho). RICON Similar logic, however: Vector clocks is about tracking events to a computation Version vectors is about tracking updates to data replicas Riak 2.0 introduced the option of dotted version vectors instead Similar idea to the ORSWOT CRDT dot functionality (explained later) Reduces potential number of siblings (i.e. causality tracking is more accurate) -> limits sibling explosion All sibling values are returned (i.e. more than one) Big difference to the normal experience with SQL type data stores 18

19 Riak allow_mult=false You can set allow_mult to false (i.e. no siblings to the client) with: last_write_wins set to false Uses version vectors. In conflict, the sibling with the highest timestamp wins. last_write_wins set to true Doesn't use version vectors new value overwrites current value However, not recommended* because of potential data loss Network problems Reading: Fallacies of Distributed Computing Explained, Arnon Rotem-Gal-Oz Complexity of time synchronisation across servers (speed of light, machines fail etc) Reading: There is No Now - Problems with simultaneity in distributed systems, Justin Sheehy * Perhaps for immutable data with separate unique keys 19

20 Riak Dealing with siblings Sibling values are returned on a get request Need to have a merge function to produce the correct value The merge function should be deterministic by having the following properties: Associativity Order of the merge function being applied to the data doesn t matter as long as the sequence of the data items is not changed Commutativity Order into the merge function does not matter Idempotence Merge function applied twice to the same value results in the same value Can be hard to get right Can lead to possible data loss and incorrect data 20

21 CRDTs 21

22 CRDTs What are they? Conflict-Free Replicated Data Types Can be: Operation based: Commutative Replicated Data Types State based: Convergent Replicated Data Types Reduces complexity by having no client side siblings But still having no data loss Readings: A comprehensive study of Convergent and Commutative Replicated Data Types. Marc Shapiro, Nuno Preguiça, Carlos Baquero, Marek Zawirski Conflict-free replicated data types. Marc Shapiro, Nuno Preguiça, Carlos Baquero, Marek Zawirski CRDTs: Consistency without concurrency control. Mihai Letia, Nuno Preguiça, Marc Shapiro

23 CRDTs Operation based Commutative Replicated Data Types All replicas of the data are sent operational updates Relies more on a good network and reliably delivering updates Knowing the current true membership is more important 23

24 CRDTs State based Convergent Replicated Data Types Data is locally updated, sent to replicas and merged Update function must be monotonically increasing Generally easier to understand than operation based Easier to have an elastic membership of replicas However, more data is sent around the network 24

25 CRDTs Types Different data type implementations exist for: Counters Sets Maps For our core use case the Sets data type made sense However, we were and are using Riak 1.4+ and a Sets CRDT isn t available Introduced in Riak 2.0+ At the time, Riak 2.0+ wasn t even a release candidate 25

26 CRDTs riak_dt We decided to use the riak_dt dependency and integrate it ourselves into our system using Riak Apache License Version 2.0 ( Different set based implementations exist (all state based CRDT s): G-Set: Grow only set i.e. no remove OR-Set: Observe Remove Set Able to add and remove However, when an element is removed a tombstone still exists i.e. Size problem ORSWOT: Observe Remove Set Without Tombstones Able to add and remove, but doesn t have tombstones In a concurrent add and remove of the same element > add-wins Reading: An optimized conflict-free replicated set. Annette Bieniusa, Marek Zawirski, Nuno Preguiça, Marc Shapiro, Carlos Baquero, Valter Balegas, Sérgio Duarte For our core use case we chose to use the ORSWOT 26

27 CRDTs ORSWOT overview Example: ORSWOT entry ORSWOT element (i.e. data) << KEY_NAME >> = { [ {x,1}, {y,2} ], [ {<< Data1 >>, [ {x,1} ] }, {<< Data2 >>, [ {y,1} ] }, {<< Data3 >>, [ {y,2} ] } ] } ORSWOT entries (a list for the purpose of example) Value i.e. the ORSWOT Version vector exists as part of the ORSWOT: List of tuples Each tuple being {UniqueActorName, Counter} ORSWOT operations happen through an unique actor Similar but a different instance of version vector to what will also exist for the overall key/value Dots exist for each element in the ORSWOT set: Just a minimal version vector Each dot pair represents a tuple in the ORSWOT version vector at a particular point Usually a list of one Can be a list of more than one when, for example, two ORSWOTs for two concurrent adds of the same element (i.e. using two different unique actors) are merged 27

28 CRDTs ORSWOT overview Adding an element: Version vector as part of the ORSWOT incremented Counter for unique actor being used is incremented (or set as 1 if not currently in the version vector) The updated {UniqueActorName, Counter} pair is stored with the element as its dots If an entry already exists in the ORSWOT for the element, this is replaced Example: Adding << Data2 >> using unique actor y to the existing ORSWOT: { [ {x,1} ], [ {<< Data1 >>, [ {x,1} ] } ] } Results in the new ORSWOT: { [ {x,1}, {y,1} ], [ {<< Data1 >>, [ {x,1} ] }, {<< Data2 >>, [ {y,1} ] } ] } and ORSWOT value (i.e. ignoring metadata / what a client would be interested in) of: [ <<"Data1">>, <<"Data2">> ] 28

29 CRDTs ORSWOT overview Removing an element: Version vector as part of the ORSWOT does not change Of course when the put happens to store the ORSWOT s updated value, the version vector for the overall key/value is incremented The elements entry is simply removed from the ORSWOT i.e. No tombstones Example: Removing << Data1 >> from the existing ORSWOT: { [ {x,1}, {y,1} ], [ {<< Data1 >>, [ {x,1} ] }, {<< Data2 >>, [ {y,1} ] } ] } Results in the new ORSWOT: { [ {x,1}, {y,1} ], [ {<< Data2 >>, [ {y,1} ] } ] } * Further options do exist, such as being able to delay removes E.g. the ORSWOT object doing a remove might have not seen the original add for the element yet (e.g. because of being a replica not merged with the add yet) Would also add further logic to the merge operation 29

30 CRDTs ORSWOT overview Merging ORSWOT s: ORSWOT A and ORSWOT B E.g. because of siblings detected by version vectors for the overall key/value Version vectors for ORSWOT A and ORSWOT B merged i.e. the least possible common descendant of both (*) Elements merged: Common elements only kept if there exists a non-empty dots for them from merging (*): Common dot pairs for the element in ORSWOT A and ORSWOT B Dot pairs for the element only in ORSWOT A where the dot pair count is greater than any count* for the same actor in ORSWOT B s version vector Dot pairs for the element only in ORSWOT B where the dot pair count is greater than any count* for the same actor in ORSWOT A s version vector Elements only in ORSWOT A only kept if there exists a non-empty dots for them after: Keeping only dot pairs for the element where the dot pair count is greater than any count* for the same actor in ORSWOT B s version vector Elements only in ORSWOT B only kept if there exists a non-empty dots for them after: Keeping only dot pairs for the element where the dot pair count is greater than any count* for the same actor in ORSWOT A s version vector * As you might think, if there doesn t exist an appropriate actor pair/count in the version vector then the dot pair is merged/kept 30

31 CRDTs ORSWOT overview Merging example: ORSWOT A: { [ {x,1}, {y,2} ], [ {<< Data1 >>, [ {x,1} ] }, {<< Data2 >>, [ {y,1} ] }, {<< Data3 >>, [ {y,2} ] } ] } Seen: 1. Adding element << Data1 >> via actor x 2. Adding element << Data2 >> via actor y 3. Adding element << Data3 >> via actor y ORSWOT B: { [ {x,1}, {y,1}, {z,2} ], [ {<< Data2 >>, [ {y,1} ] }, {<< Data3 >>, [ {z,1} ] }, {<< Data4 >>, [ {z,2} ] } ] } Seen: 1. Adding element << Data1 >> via actor x 2. Adding element << Data2 >> via actor y 3. Adding element << Data3 >> via actor z 4. Adding element << Data4 >> via actor z 5. Removing element << Data1 >> via actor z Merged ORSWOT: { [ {x,1}, {y,2}, {z,2} ], [ {<< Data2 >>, [ {y,1} ] }, {<< Data3 >>, [ {y,2}, {z,1} ] }, {<< Data4 >>, [ {z,2} ] } ] } 31

32 CRDTs ORSWOT overview Merging example: ORSWOT A: { [ {x,1}, {y,2} ], [ {<< Data1 >>, [ {x,1} ] }, {<< Data2 >>, [ {y,1} ] }, {<< Data3 >>, [ {y,2} ] } ] } ORSWOT B: { [ {x,1}, {y,1}, {z,2} ], [ {<< Data2 >>, [ {y,1} ] }, {<< Data3 >>, [ {z,1} ] }, {<< Data4 >>, [ {z,2} ] } ] } Merged = { [ {x,1}, {y,2}, {z,2} ], [ {<< Data2 >>, [ {y,1} ] }, {<< Data3 >>, [ {y,2}, {z,1} ] }, {<< Data4 >>, [ {z,2} ] } ] } [ {x,1}, {y,2}, {z,2} ] is the least possible common descendant for the version vectors of ORSWOT A and ORSWOT B Common elements: << Data2 >> : ORSWOT s have common dot pair {y,1} no other dot pairs to consider/merge with it => element in << Data3 >> : ORSWOT A has {y,2} which has a greater count than {y,1} in ORSWOT B s version vector [ {x,1}, {y,1}, {z,2} ] ORSWOT B has {z,1} which is included because ORSWOT A s version vector [ {x,1}, {y,2} ] doesn t have a count for z. Therefore {y,2} and {z,1} are merged to give [ {y,2}, {z,1} ] => element in Elements only in ORSWOT A: << Data1 >> : From ORSWOT A, no dots exist from [ {x,1} ] which are greater than ORSWOT B s version vector [ {x,1}, {y,1}, {z,2} ] => element not in Elements only in ORSWOT B: << Data4 >> : From ORSWOT B, dot pair {z,2} for the element is kept because ORSWOT A s version vector [ {x,1}, {y,2} ] doesn t have a count for z => element in 32

33 CRDTs riak_dt integration Erlang middle layer between clients and Riak Middle layer needed to have unique actors (gen_server s) as part of using the ORSWOT implementation With getting a balance on the number of them e.g. due to impacting ORSWOT version vector size Unique actor names pre-defined as part of server setup configuration With making them small e.g. due to impacting ORSWOT version vector size Clients don t have to deal with siblings Middle layer (i.e. client to Riak) does need to bring back siblings Not as good as a Riak server side CRDT (i.e. Riak 2.0+) Still using version vectors on the overall key/value s i.e. to know to do the ORSWOT merge operation 33

34 Getting It Live 34

35 Tooling From day 1 we ate our own dog food Monitoring Performance counters Error reporting (including correlation between systems) Built custom adhoc query tool On replicated Riak cluster Custom reconciliation between new system and old system Created build and release scripts for automation 35

36 Released in phases Able to build on stable ground Able to get data to impact future decisions Built confidence Move functionality to using Erlang and Riak Not all phases were immediately business impacting However, overall able to get business impacting functionality out sooner 36

37 Replay testing Captured logs asynchronously in one phase Common interface between old and new systems Able to do reconciliation between the systems Big range of test data / realistic load profile Used for functional and performance testing Different logs for long weekend run vs quick run Complemented other testing such as specific unit/integration testing, fuzz testing and formal UAT 37

38 Performance testing Done early and repeated Built custom client using replay logs Able to increase load profile easily Used our custom tooling / monitoring Identified bottlenecks / tested horizontal scalability of system Adhoc changes and fed back into development Had a specific profile which could give a fair test between changes Basho Bench used to sanity test Riak cluster setup 38

39 Failure testing Did failure testing Built confidence and understanding Trying to make sure a failure seen in production isn t the first time it s experienced Tested common procedures e.g. taking nodes in and out of service Failures will happen embrace them 39

40 Today with Stronger The project was a success! We have a system which is: Performant and able to deal with many times our peak load Reliable and deals with failure using minimal human intervention We have introduced the business to a number of new technologies We have grown the capabilities of the business and our people 40

41 Questions? 41

42 42

Advanced Databases ( CIS 6930) Fall Instructor: Dr. Markus Schneider. Group 17 Anirudh Sarma Bhaskara Sreeharsha Poluru Ameya Devbhankar

Advanced Databases ( CIS 6930) Fall Instructor: Dr. Markus Schneider. Group 17 Anirudh Sarma Bhaskara Sreeharsha Poluru Ameya Devbhankar Advanced Databases ( CIS 6930) Fall 2016 Instructor: Dr. Markus Schneider Group 17 Anirudh Sarma Bhaskara Sreeharsha Poluru Ameya Devbhankar BEFORE WE BEGIN NOSQL : It is mechanism for storage & retrieval

More information

Strong Eventual Consistency and CRDTs

Strong Eventual Consistency and CRDTs Strong Eventual Consistency and CRDTs Marc Shapiro, INRIA & LIP6 Nuno Preguiça, U. Nova de Lisboa Carlos Baquero, U. Minho Marek Zawirski, INRIA & UPMC Large-scale replicated data structures Large, dynamic

More information

SWIFTCLOUD: GEO-REPLICATION RIGHT TO THE EDGE

SWIFTCLOUD: GEO-REPLICATION RIGHT TO THE EDGE SWIFTCLOUD: GEO-REPLICATION RIGHT TO THE EDGE Annette Bieniusa T.U. Kaiserslautern Carlos Baquero HSALab, U. Minho Marc Shapiro, Marek Zawirski INRIA, LIP6 Nuno Preguiça, Sérgio Duarte, Valter Balegas

More information

Genie. Distributed Systems Synthesis and Verification. Marc Rosen. EN : Advanced Distributed Systems and Networks May 1, 2017

Genie. Distributed Systems Synthesis and Verification. Marc Rosen. EN : Advanced Distributed Systems and Networks May 1, 2017 Genie Distributed Systems Synthesis and Verification Marc Rosen EN.600.667: Advanced Distributed Systems and Networks May 1, 2017 1 / 35 Outline Introduction Problem Statement Prior Art Demo How does it

More information

What Came First? The Ordering of Events in

What Came First? The Ordering of Events in What Came First? The Ordering of Events in Systems @kavya719 kavya the design of concurrent systems Slack architecture on AWS systems with multiple independent actors. threads in a multithreaded program.

More information

relational Relational to Riak Why Move From Relational to Riak? Introduction High Availability Riak At-a-Glance

relational Relational to Riak Why Move From Relational to Riak? Introduction High Availability Riak At-a-Glance WHITEPAPER Relational to Riak relational Introduction This whitepaper looks at why companies choose Riak over a relational database. We focus specifically on availability, scalability, and the / data model.

More information

Strong Eventual Consistency and CRDTs

Strong Eventual Consistency and CRDTs Strong Eventual Consistency and CRDTs Marc Shapiro, INRIA & LIP6 Nuno Preguiça, U. Nova de Lisboa Carlos Baquero, U. Minho Marek Zawirski, INRIA & UPMC Large-scale replicated data structures Large, dynamic

More information

Conflict-free Replicated Data Types (CRDTs)

Conflict-free Replicated Data Types (CRDTs) Conflict-free Replicated Data Types (CRDTs) for collaborative environments Marc Shapiro, INRIA & LIP6 Nuno Preguiça, U. Nova de Lisboa Carlos Baquero, U. Minho Marek Zawirski, INRIA & UPMC Conflict-free

More information

Jargons, Concepts, Scope and Systems. Key Value Stores, Document Stores, Extensible Record Stores. Overview of different scalable relational systems

Jargons, Concepts, Scope and Systems. Key Value Stores, Document Stores, Extensible Record Stores. Overview of different scalable relational systems Jargons, Concepts, Scope and Systems Key Value Stores, Document Stores, Extensible Record Stores Overview of different scalable relational systems Examples of different Data stores Predictions, Comparisons

More information

Riak. Distributed, replicated, highly available

Riak. Distributed, replicated, highly available INTRO TO RIAK Riak Overview Riak Distributed Riak Distributed, replicated, highly available Riak Distributed, highly available, eventually consistent Riak Distributed, highly available, eventually consistent,

More information

Eventually Consistent HTTP with Statebox and Riak

Eventually Consistent HTTP with Statebox and Riak Eventually Consistent HTTP with Statebox and Riak Author: Bob Ippolito (@etrepum) Date: November 2011 Venue: QCon San Francisco 2011 1/62 Introduction This talk isn't really about web. It's about how we

More information

DEMYSTIFYING BIG DATA WITH RIAK USE CASES. Martin Schneider Basho Technologies!

DEMYSTIFYING BIG DATA WITH RIAK USE CASES. Martin Schneider Basho Technologies! DEMYSTIFYING BIG DATA WITH RIAK USE CASES Martin Schneider Basho Technologies! Agenda Defining Big Data in Regards to Riak A Series of Trade-Offs Use Cases Q & A About Basho & Riak Basho Technologies is

More information

Eventual Consistency Today: Limitations, Extensions and Beyond

Eventual Consistency Today: Limitations, Extensions and Beyond Eventual Consistency Today: Limitations, Extensions and Beyond Peter Bailis and Ali Ghodsi, UC Berkeley - Nomchin Banga Outline Eventual Consistency: History and Concepts How eventual is eventual consistency?

More information

CS 655 Advanced Topics in Distributed Systems

CS 655 Advanced Topics in Distributed Systems Presented by : Walid Budgaga CS 655 Advanced Topics in Distributed Systems Computer Science Department Colorado State University 1 Outline Problem Solution Approaches Comparison Conclusion 2 Problem 3

More information

Lessons learnt re-writing a PubSub system. Chandru Mullaparthi - Principal Software Architect at bet365

Lessons learnt re-writing a PubSub system. Chandru Mullaparthi - Principal Software Architect at bet365 1 Lessons learnt re-writing a PubSub system Chandru Mullaparthi - Principal Software Architect at bet365 2 About Founded in 2000 Located in Stoke-on-Trent The largest online sports betting company Over

More information

Spotify. Scaling storage to million of users world wide. Jimmy Mårdell October 14, 2014

Spotify. Scaling storage to million of users world wide. Jimmy Mårdell October 14, 2014 Cassandra @ Spotify Scaling storage to million of users world wide! Jimmy Mårdell October 14, 2014 2 About me Jimmy Mårdell Tech Product Owner in the Cassandra team 4 years at Spotify

More information

Conflict-Free Replicated Data Types (basic entry)

Conflict-Free Replicated Data Types (basic entry) Conflict-Free Replicated Data Types (basic entry) Marc Shapiro Sorbonne-Universités-UPMC-LIP6 & Inria Paris http://lip6.fr/marc.shapiro/ 16 May 2016 1 Synonyms Conflict-Free Replicated Data Types (CRDTs).

More information

Putting together the platform: Riak, Redis, Solr and Spark. Bryan Hunt

Putting together the platform: Riak, Redis, Solr and Spark. Bryan Hunt Putting together the platform: Riak, Redis, Solr and Spark Bryan Hunt 1 $ whoami Bryan Hunt Client Services Engineer @binarytemple 2 Minimum viable product - the ideologically correct doctrine 1. Start

More information

CS 138: Dynamo. CS 138 XXIV 1 Copyright 2017 Thomas W. Doeppner. All rights reserved.

CS 138: Dynamo. CS 138 XXIV 1 Copyright 2017 Thomas W. Doeppner. All rights reserved. CS 138: Dynamo CS 138 XXIV 1 Copyright 2017 Thomas W. Doeppner. All rights reserved. Dynamo Highly available and scalable distributed data store Manages state of services that have high reliability and

More information

Building High Performance Apps using NoSQL. Swami Sivasubramanian General Manager, AWS NoSQL

Building High Performance Apps using NoSQL. Swami Sivasubramanian General Manager, AWS NoSQL Building High Performance Apps using NoSQL Swami Sivasubramanian General Manager, AWS NoSQL Building high performance apps There is a lot to building high performance apps Scalability Performance at high

More information

What am I? Bryan Hunt Basho Client Services Engineer Erlang neophyte JVM refugee Be gentle

What am I? Bryan Hunt Basho Client Services Engineer Erlang neophyte JVM refugee Be gentle What am I? Bryan Hunt Basho Client Services Engineer Erlang neophyte JVM refugee Be gentle What are you? Developer Operations Other Structure of this talk Introduction to Riak Introduction to Riak 2.0

More information

Database Architectures

Database Architectures Database Architectures CPS352: Database Systems Simon Miner Gordon College Last Revised: 4/15/15 Agenda Check-in Parallelism and Distributed Databases Technology Research Project Introduction to NoSQL

More information

Self-healing Data Step by Step

Self-healing Data Step by Step Self-healing Data Step by Step Uwe Friedrichsen (codecentric AG) NoSQL matters Cologne, 29. April 2014 @ufried Uwe Friedrichsen uwe.friedrichsen@codecentric.de http://slideshare.net/ufried http://ufried.tumblr.com

More information

Why distributed databases suck, and what to do about it. Do you want a database that goes down or one that serves wrong data?"

Why distributed databases suck, and what to do about it. Do you want a database that goes down or one that serves wrong data? Why distributed databases suck, and what to do about it - Regaining consistency Do you want a database that goes down or one that serves wrong data?" 1 About the speaker NoSQL team lead at Trifork, Aarhus,

More information

Massive Scalability With InterSystems IRIS Data Platform

Massive Scalability With InterSystems IRIS Data Platform Massive Scalability With InterSystems IRIS Data Platform Introduction Faced with the enormous and ever-growing amounts of data being generated in the world today, software architects need to pay special

More information

Improving Logical Clocks in Riak with Dotted Version Vectors: A Case Study

Improving Logical Clocks in Riak with Dotted Version Vectors: A Case Study Improving Logical Clocks in Riak with Dotted Version Vectors: A Case Study Ricardo Gonçalves Universidade do Minho, Braga, Portugal, tome@di.uminho.pt Abstract. Major web applications need the partition-tolerance

More information

Scaling Out Key-Value Storage

Scaling Out Key-Value Storage Scaling Out Key-Value Storage COS 418: Distributed Systems Logan Stafman [Adapted from K. Jamieson, M. Freedman, B. Karp] Horizontal or vertical scalability? Vertical Scaling Horizontal Scaling 2 Horizontal

More information

(It s the invariants, stupid)

(It s the invariants, stupid) Consistency in three dimensions (It s the invariants, stupid) Marc Shapiro Cloud to the edge Social, web, e-commerce: shared mutable data Scalability replication consistency issues Inria & Sorbonne-Universités

More information

Coordination-Free Computations. Christopher Meiklejohn

Coordination-Free Computations. Christopher Meiklejohn Coordination-Free Computations Christopher Meiklejohn LASP DISTRIBUTED, EVENTUALLY CONSISTENT COMPUTATIONS CHRISTOPHER MEIKLEJOHN (BASHO TECHNOLOGIES, INC.) PETER VAN ROY (UNIVERSITÉ CATHOLIQUE DE LOUVAIN)

More information

CMU SCS CMU SCS Who: What: When: Where: Why: CMU SCS

CMU SCS CMU SCS Who: What: When: Where: Why: CMU SCS Carnegie Mellon Univ. Dept. of Computer Science 15-415/615 - DB s C. Faloutsos A. Pavlo Lecture#23: Distributed Database Systems (R&G ch. 22) Administrivia Final Exam Who: You What: R&G Chapters 15-22

More information

#IoT #BigData. 10/31/14

#IoT #BigData.  10/31/14 #IoT #BigData Seema Jethani @seemaj @basho 1 10/31/14 Why should we care? 2 11/2/14 Source: http://en.wikipedia.org/wiki/internet_of_things Motivation for Specialized Big Data Systems Rate of data capture

More information

Modern Database Concepts

Modern Database Concepts Modern Database Concepts Basic Principles Doc. RNDr. Irena Holubova, Ph.D. holubova@ksi.mff.cuni.cz NoSQL Overview Main objective: to implement a distributed state Different objects stored on different

More information

Eventual Consistency Today: Limitations, Extensions and Beyond

Eventual Consistency Today: Limitations, Extensions and Beyond Eventual Consistency Today: Limitations, Extensions and Beyond Peter Bailis and Ali Ghodsi, UC Berkeley Presenter: Yifei Teng Part of slides are cited from Nomchin Banga Road Map Eventual Consistency:

More information

INF-5360 Presentation

INF-5360 Presentation INF-5360 Presentation Optimistic Replication Ali Ahmad April 29, 2013 Structure of presentation Pessimistic and optimistic replication Elements of Optimistic replication Eventual consistency Scheduling

More information

Selective Hearing: An Approach to Distributed, Eventually Consistent Edge Computation

Selective Hearing: An Approach to Distributed, Eventually Consistent Edge Computation Selective Hearing: An Approach to Distributed, Eventually Consistent Edge Computation Christopher Meiklejohn Basho Technologies, Inc. Bellevue, WA Email: cmeiklejohn@basho.com Peter Van Roy Université

More information

Presented By: Devarsh Patel

Presented By: Devarsh Patel : Amazon s Highly Available Key-value Store Presented By: Devarsh Patel CS5204 Operating Systems 1 Introduction Amazon s e-commerce platform Requires performance, reliability and efficiency To support

More information

Dynamo: Amazon s Highly Available Key-Value Store

Dynamo: Amazon s Highly Available Key-Value Store Dynamo: Amazon s Highly Available Key-Value Store DeCandia et al. Amazon.com Presented by Sushil CS 5204 1 Motivation A storage system that attains high availability, performance and durability Decentralized

More information

Database Architectures

Database Architectures Database Architectures CPS352: Database Systems Simon Miner Gordon College Last Revised: 11/15/12 Agenda Check-in Centralized and Client-Server Models Parallelism Distributed Databases Homework 6 Check-in

More information

Bloom: Big Systems, Small Programs. Neil Conway UC Berkeley

Bloom: Big Systems, Small Programs. Neil Conway UC Berkeley Bloom: Big Systems, Small Programs Neil Conway UC Berkeley Distributed Computing Programming Languages Data prefetching Register allocation Loop unrolling Function inlining Optimization Global coordination,

More information

Horizontal or vertical scalability? Horizontal scaling is challenging. Today. Scaling Out Key-Value Storage

Horizontal or vertical scalability? Horizontal scaling is challenging. Today. Scaling Out Key-Value Storage Horizontal or vertical scalability? Scaling Out Key-Value Storage COS 418: Distributed Systems Lecture 8 Kyle Jamieson Vertical Scaling Horizontal Scaling [Selected content adapted from M. Freedman, B.

More information

10. Replication. CSEP 545 Transaction Processing Philip A. Bernstein Sameh Elnikety. Copyright 2012 Philip A. Bernstein

10. Replication. CSEP 545 Transaction Processing Philip A. Bernstein Sameh Elnikety. Copyright 2012 Philip A. Bernstein 10. Replication CSEP 545 Transaction Processing Philip A. Bernstein Sameh Elnikety Copyright 2012 Philip A. Bernstein 1 Outline 1. Introduction 2. Primary-Copy Replication 3. Multi-Master Replication 4.

More information

The Google File System

The Google File System The Google File System Sanjay Ghemawat, Howard Gobioff and Shun Tak Leung Google* Shivesh Kumar Sharma fl4164@wayne.edu Fall 2015 004395771 Overview Google file system is a scalable distributed file system

More information

10. Replication. Motivation

10. Replication. Motivation 10. Replication Page 1 10. Replication Motivation Reliable and high-performance computation on a single instance of a data object is prone to failure. Replicate data to overcome single points of failure

More information

SCALABLE CONSISTENCY AND TRANSACTION MODELS

SCALABLE CONSISTENCY AND TRANSACTION MODELS Data Management in the Cloud SCALABLE CONSISTENCY AND TRANSACTION MODELS 69 Brewer s Conjecture Three properties that are desirable and expected from realworld shared-data systems C: data consistency A:

More information

Evaluating Cloud Databases for ecommerce Applications. What you need to grow your ecommerce business

Evaluating Cloud Databases for ecommerce Applications. What you need to grow your ecommerce business Evaluating Cloud Databases for ecommerce Applications What you need to grow your ecommerce business EXECUTIVE SUMMARY ecommerce is the future of not just retail but myriad industries from telecommunications

More information

Just-Right Consistency. Centralised data store. trois bases. As available as possible As consistent as necessary Correct by design

Just-Right Consistency. Centralised data store. trois bases. As available as possible As consistent as necessary Correct by design Just-Right Consistency As available as possible As consistent as necessary Correct by design Marc Shapiro, UMC-LI6 & Inria Annette Bieniusa, U. Kaiserslautern Nuno reguiça, U. Nova Lisboa Christopher Meiklejohn,

More information

Map-Reduce. Marco Mura 2010 March, 31th

Map-Reduce. Marco Mura 2010 March, 31th Map-Reduce Marco Mura (mura@di.unipi.it) 2010 March, 31th This paper is a note from the 2009-2010 course Strumenti di programmazione per sistemi paralleli e distribuiti and it s based by the lessons of

More information

CSE-E5430 Scalable Cloud Computing Lecture 10

CSE-E5430 Scalable Cloud Computing Lecture 10 CSE-E5430 Scalable Cloud Computing Lecture 10 Keijo Heljanko Department of Computer Science School of Science Aalto University keijo.heljanko@aalto.fi 23.11-2015 1/29 Exam Registering for the exam is obligatory,

More information

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Winter 2015 Lecture 14 NoSQL

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Winter 2015 Lecture 14 NoSQL CSE 544 Principles of Database Management Systems Magdalena Balazinska Winter 2015 Lecture 14 NoSQL References Scalable SQL and NoSQL Data Stores, Rick Cattell, SIGMOD Record, December 2010 (Vol. 39, No.

More information

Consistency in Distributed Storage Systems. Mihir Nanavati March 4 th, 2016

Consistency in Distributed Storage Systems. Mihir Nanavati March 4 th, 2016 Consistency in Distributed Storage Systems Mihir Nanavati March 4 th, 2016 Today Overview of distributed storage systems CAP Theorem About Me Virtualization/Containers, CPU microarchitectures/caches, Network

More information

Distributed Key Value Store Utilizing CRDT to Guarantee Eventual Consistency

Distributed Key Value Store Utilizing CRDT to Guarantee Eventual Consistency Distributed Key Value Store Utilizing CRDT to Guarantee Eventual Consistency CPSC 416 Project Proposal n6n8: Trevor Jackson, u2c9: Hayden Nhan, v0r5: Yongnan (Devin) Li, x5m8: Li Jye Tong Introduction

More information

From Relational to Riak

From Relational to Riak www.basho.com From Relational to Riak December 2012 Table of Contents Table of Contents... 1 Introduction... 1 Why Migrate to Riak?... 1 The Requirement of High Availability...1 Minimizing the Cost of

More information

Scaling for Humongous amounts of data with MongoDB

Scaling for Humongous amounts of data with MongoDB Scaling for Humongous amounts of data with MongoDB Alvin Richards Technical Director, EMEA alvin@10gen.com @jonnyeight alvinonmongodb.com From here... http://bit.ly/ot71m4 ...to here... http://bit.ly/oxcsis

More information

Distributed Data Management Replication

Distributed Data Management Replication Felix Naumann F-2.03/F-2.04, Campus II Hasso Plattner Institut Distributing Data Motivation Scalability (Elasticity) If data volume, processing, or access exhausts one machine, you might want to spread

More information

Storage Optimization with Oracle Database 11g

Storage Optimization with Oracle Database 11g Storage Optimization with Oracle Database 11g Terabytes of Data Reduce Storage Costs by Factor of 10x Data Growth Continues to Outpace Budget Growth Rate of Database Growth 1000 800 600 400 200 1998 2000

More information

Key-Value Stores: RiakKV

Key-Value Stores: RiakKV B4M36DS2, BE4M36DS2: Database Systems 2 h p://www.ksi.m.cuni.cz/~svoboda/courses/181-b4m36ds2/ Lecture 7 Key-Value Stores: RiakKV Mar n Svoboda mar n.svoboda@fel.cvut.cz 12. 11. 2018 Charles University,

More information

DISTRIBUTED EVENTUALLY CONSISTENT COMPUTATIONS

DISTRIBUTED EVENTUALLY CONSISTENT COMPUTATIONS LASP 2 DISTRIBUTED EVENTUALLY CONSISTENT COMPUTATIONS 3 EN TAL AV CHRISTOPHER MEIKLEJOHN 4 RESEARCH WITH: PETER VAN ROY (UCL) 5 MOTIVATION 6 SYNCHRONIZATION IS EXPENSIVE 7 SYNCHRONIZATION IS SOMETIMES

More information

CISC 7610 Lecture 2b The beginnings of NoSQL

CISC 7610 Lecture 2b The beginnings of NoSQL CISC 7610 Lecture 2b The beginnings of NoSQL Topics: Big Data Google s infrastructure Hadoop: open google infrastructure Scaling through sharding CAP theorem Amazon s Dynamo 5 V s of big data Everyone

More information

Eventual Consistency 1

Eventual Consistency 1 Eventual Consistency 1 Readings Werner Vogels ACM Queue paper http://queue.acm.org/detail.cfm?id=1466448 Dynamo paper http://www.allthingsdistributed.com/files/ amazon-dynamo-sosp2007.pdf Apache Cassandra

More information

Consistency Without Transactions Global Family Tree

Consistency Without Transactions Global Family Tree Consistency Without Transactions Global Family Tree NoSQL Matters Cologne Spring 2014 2014 by Intellectual Reserve, Inc. All rights reserved. 1 Contents Introduction to FamilySearch Family Tree Motivation

More information

DYNAMO: AMAZON S HIGHLY AVAILABLE KEY-VALUE STORE. Presented by Byungjin Jun

DYNAMO: AMAZON S HIGHLY AVAILABLE KEY-VALUE STORE. Presented by Byungjin Jun DYNAMO: AMAZON S HIGHLY AVAILABLE KEY-VALUE STORE Presented by Byungjin Jun 1 What is Dynamo for? Highly available key-value storages system Simple primary-key only interface Scalable and Reliable Tradeoff:

More information

The What, Why and How of the Pure Storage Enterprise Flash Array. Ethan L. Miller (and a cast of dozens at Pure Storage)

The What, Why and How of the Pure Storage Enterprise Flash Array. Ethan L. Miller (and a cast of dozens at Pure Storage) The What, Why and How of the Pure Storage Enterprise Flash Array Ethan L. Miller (and a cast of dozens at Pure Storage) Enterprise storage: $30B market built on disk Key players: EMC, NetApp, HP, etc.

More information

SQL, NoSQL, MongoDB. CSE-291 (Cloud Computing) Fall 2016 Gregory Kesden

SQL, NoSQL, MongoDB. CSE-291 (Cloud Computing) Fall 2016 Gregory Kesden SQL, NoSQL, MongoDB CSE-291 (Cloud Computing) Fall 2016 Gregory Kesden SQL Databases Really better called Relational Databases Key construct is the Relation, a.k.a. the table Rows represent records Columns

More information

Key-Value Stores: RiakKV

Key-Value Stores: RiakKV NDBI040: Big Data Management and NoSQL Databases h p://www.ksi.mff.cuni.cz/ svoboda/courses/2016-1-ndbi040/ Lecture 4 Key-Value Stores: RiakKV Mar n Svoboda svoboda@ksi.mff.cuni.cz 25. 10. 2016 Charles

More information

CLUSTERING HIVEMQ. Building highly available, horizontally scalable MQTT Broker Clusters

CLUSTERING HIVEMQ. Building highly available, horizontally scalable MQTT Broker Clusters CLUSTERING HIVEMQ Building highly available, horizontally scalable MQTT Broker Clusters 12/2016 About this document MQTT is based on a publish/subscribe architecture that decouples MQTT clients and uses

More information

ScaleArc for SQL Server

ScaleArc for SQL Server Solution Brief ScaleArc for SQL Server Overview Organizations around the world depend on SQL Server for their revenuegenerating, customer-facing applications, running their most business-critical operations

More information

Cloud-backed applications. Write Fast, Read in the Past: Causal Consistency for Client- Side Applications. Requirements & challenges

Cloud-backed applications. Write Fast, Read in the Past: Causal Consistency for Client- Side Applications. Requirements & challenges Write Fast, Read in the Past: ausal onsistenc for lient- Side Applications loud-backed applications 1. Request 3. Repl 2. Process request & store update 4. Transmit update 50~200ms databas e app server

More information

Oracle Data Warehousing Pushing the Limits. Introduction. Case Study. Jason Laws. Principal Consultant WhereScape Consulting

Oracle Data Warehousing Pushing the Limits. Introduction. Case Study. Jason Laws. Principal Consultant WhereScape Consulting Oracle Data Warehousing Pushing the Limits Jason Laws Principal Consultant WhereScape Consulting Introduction Oracle is the leading database for data warehousing. This paper covers some of the reasons

More information

Introduction Storage Processing Monitoring Review. Scaling at Showyou. Operations. September 26, 2011

Introduction Storage Processing Monitoring Review. Scaling at Showyou. Operations. September 26, 2011 Scaling at Showyou Operations September 26, 2011 I m Kyle Kingsbury Handle aphyr Code http://github.com/aphyr Email kyle@remixation.com Focus Backend, API, ops What the hell is Showyou? Nontrivial complexity

More information

Datacenter replication solution with quasardb

Datacenter replication solution with quasardb Datacenter replication solution with quasardb Technical positioning paper April 2017 Release v1.3 www.quasardb.net Contact: sales@quasardb.net Quasardb A datacenter survival guide quasardb INTRODUCTION

More information

Conflict-free Replicated Data Types in Practice

Conflict-free Replicated Data Types in Practice Conflict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11 th January, 2017 HASLab/INESC TEC & University of Minho InfoBlender Motivation Background Background: CAP Theorem

More information

CS November 2017

CS November 2017 Bigtable Highly available distributed storage Distributed Systems 18. Bigtable Built with semi-structured data in mind URLs: content, metadata, links, anchors, page rank User data: preferences, account

More information

Distributed File Systems

Distributed File Systems Distributed File Systems Today l Basic distributed file systems l Two classical examples Next time l Naming things xkdc Distributed File Systems " A DFS supports network-wide sharing of files and devices

More information

CS Amazon Dynamo

CS Amazon Dynamo CS 5450 Amazon Dynamo Amazon s Architecture Dynamo The platform for Amazon's e-commerce services: shopping chart, best seller list, produce catalog, promotional items etc. A highly available, distributed

More information

Important Lessons. Today's Lecture. Two Views of Distributed Systems

Important Lessons. Today's Lecture. Two Views of Distributed Systems Important Lessons Replication good for performance/ reliability Key challenge keeping replicas up-to-date Wide range of consistency models Will see more next lecture Range of correctness properties L-10

More information

CS November 2018

CS November 2018 Bigtable Highly available distributed storage Distributed Systems 19. Bigtable Built with semi-structured data in mind URLs: content, metadata, links, anchors, page rank User data: preferences, account

More information

TITLE: PRE-REQUISITE THEORY. 1. Introduction to Hadoop. 2. Cluster. Implement sort algorithm and run it using HADOOP

TITLE: PRE-REQUISITE THEORY. 1. Introduction to Hadoop. 2. Cluster. Implement sort algorithm and run it using HADOOP TITLE: Implement sort algorithm and run it using HADOOP PRE-REQUISITE Preliminary knowledge of clusters and overview of Hadoop and its basic functionality. THEORY 1. Introduction to Hadoop The Apache Hadoop

More information

BUILDING A SCALABLE MOBILE GAME BACKEND IN ELIXIR. Petri Kero CTO / Ministry of Games

BUILDING A SCALABLE MOBILE GAME BACKEND IN ELIXIR. Petri Kero CTO / Ministry of Games BUILDING A SCALABLE MOBILE GAME BACKEND IN ELIXIR Petri Kero CTO / Ministry of Games MOBILE GAME BACKEND CHALLENGES Lots of concurrent users Complex interactions between players Persistent world with frequent

More information

Data Replication CS 188 Distributed Systems February 3, 2015

Data Replication CS 188 Distributed Systems February 3, 2015 Data Replication CS 188 Distributed Systems February 3, 2015 Page 1 Some Other Possibilities What if the machines sharing files are portable and not always connected? What if the machines communicate across

More information

Next-Generation Cloud Platform

Next-Generation Cloud Platform Next-Generation Cloud Platform Jangwoo Kim Jun 24, 2013 E-mail: jangwoo@postech.ac.kr High Performance Computing Lab Department of Computer Science & Engineering Pohang University of Science and Technology

More information

Key-Value Stores: RiakKV

Key-Value Stores: RiakKV B4M36DS2: Database Systems 2 h p://www.ksi.mff.cuni.cz/ svoboda/courses/2016-1-b4m36ds2/ Lecture 4 Key-Value Stores: RiakKV Mar n Svoboda svoboda@ksi.mff.cuni.cz 24. 10. 2016 Charles University in Prague,

More information

Outline. INF3190:Distributed Systems - Examples. Last week: Definitions Transparencies Challenges&pitfalls Architecturalstyles

Outline. INF3190:Distributed Systems - Examples. Last week: Definitions Transparencies Challenges&pitfalls Architecturalstyles INF3190:Distributed Systems - Examples Thomas Plagemann & Roman Vitenberg Outline Last week: Definitions Transparencies Challenges&pitfalls Architecturalstyles Today: Examples Googel File System (Thomas)

More information

CS555: Distributed Systems [Fall 2017] Dept. Of Computer Science, Colorado State University

CS555: Distributed Systems [Fall 2017] Dept. Of Computer Science, Colorado State University CS 555: DISTRIBUTED SYSTEMS [DYNAMO & GOOGLE FILE SYSTEM] Frequently asked questions from the previous class survey What s the typical size of an inconsistency window in most production settings? Dynamo?

More information

ECE Engineering Robust Server Software. Spring 2018

ECE Engineering Robust Server Software. Spring 2018 ECE590-02 Engineering Robust Server Software Spring 2018 Business Continuity: Disaster Recovery Tyler Bletsch Duke University Includes material adapted from the course Information Storage and Management

More information

MY CONVERSATION HAS RUN DRY

MY CONVERSATION HAS RUN DRY PARTITION TOLERANCE MY CONVERSATION HAS RUN DRY Many systems degrade, or otherwise change state, under partition BRING THE PIECES BACK TOGETHER REDISCOVER COMMUNICATION A EXAMPLE ANPLICATION 5 clients

More information

Traditional RDBMS Wisdom is All Wrong -- In Three Acts. Michael Stonebraker

Traditional RDBMS Wisdom is All Wrong -- In Three Acts. Michael Stonebraker Traditional RDBMS Wisdom is All Wrong -- In Three Acts Michael Stonebraker The Stonebraker Says Webinar Series The first three acts: 1. Why main memory is the answer for OLTP Recording available at VoltDB.com

More information

Introduction to Distributed Data Systems

Introduction to Distributed Data Systems Introduction to Distributed Data Systems Serge Abiteboul Ioana Manolescu Philippe Rigaux Marie-Christine Rousset Pierre Senellart Web Data Management and Distribution http://webdam.inria.fr/textbook January

More information

Large-Scale Key-Value Stores Eventual Consistency Marco Serafini

Large-Scale Key-Value Stores Eventual Consistency Marco Serafini Large-Scale Key-Value Stores Eventual Consistency Marco Serafini COMPSCI 590S Lecture 13 Goals of Key-Value Stores Export simple API put(key, value) get(key) Simpler and faster than a DBMS Less complexity,

More information

Designing for Scalability. Patrick Linskey EJB Team Lead BEA Systems

Designing for Scalability. Patrick Linskey EJB Team Lead BEA Systems Designing for Scalability Patrick Linskey EJB Team Lead BEA Systems plinskey@bea.com 1 Patrick Linskey EJB Team Lead at BEA OpenJPA Committer JPA 1, 2 EG Member 2 Agenda Define and discuss scalability

More information

SolidFire and Ceph Architectural Comparison

SolidFire and Ceph Architectural Comparison The All-Flash Array Built for the Next Generation Data Center SolidFire and Ceph Architectural Comparison July 2014 Overview When comparing the architecture for Ceph and SolidFire, it is clear that both

More information

Big Data Analytics. Rasoul Karimi

Big Data Analytics. Rasoul Karimi Big Data Analytics Rasoul Karimi Information Systems and Machine Learning Lab (ISMLL) Institute of Computer Science University of Hildesheim, Germany Big Data Analytics Big Data Analytics 1 / 1 Outline

More information

Two Success Stories - Optimised Real-Time Reporting with BI Apps

Two Success Stories - Optimised Real-Time Reporting with BI Apps Oracle Business Intelligence 11g Two Success Stories - Optimised Real-Time Reporting with BI Apps Antony Heljula October 2013 Peak Indicators Limited 2 Two Success Stories - Optimised Real-Time Reporting

More information

FLAT DATACENTER STORAGE CHANDNI MODI (FN8692)

FLAT DATACENTER STORAGE CHANDNI MODI (FN8692) FLAT DATACENTER STORAGE CHANDNI MODI (FN8692) OUTLINE Flat datacenter storage Deterministic data placement in fds Metadata properties of fds Per-blob metadata in fds Dynamic Work Allocation in fds Replication

More information

A Guide to Architecting the Active/Active Data Center

A Guide to Architecting the Active/Active Data Center White Paper A Guide to Architecting the Active/Active Data Center 2015 ScaleArc. All Rights Reserved. White Paper The New Imperative: Architecting the Active/Active Data Center Introduction With the average

More information

10. Replication. CSEP 545 Transaction Processing Philip A. Bernstein. Copyright 2003 Philip A. Bernstein. Outline

10. Replication. CSEP 545 Transaction Processing Philip A. Bernstein. Copyright 2003 Philip A. Bernstein. Outline 10. Replication CSEP 545 Transaction Processing Philip A. Bernstein Copyright 2003 Philip A. Bernstein 1 Outline 1. Introduction 2. Primary-Copy Replication 3. Multi-Master Replication 4. Other Approaches

More information

Example File Systems Using Replication CS 188 Distributed Systems February 10, 2015

Example File Systems Using Replication CS 188 Distributed Systems February 10, 2015 Example File Systems Using Replication CS 188 Distributed Systems February 10, 2015 Page 1 Example Replicated File Systems NFS Coda Ficus Page 2 NFS Originally NFS did not have any replication capability

More information

Heckaton. SQL Server's Memory Optimized OLTP Engine

Heckaton. SQL Server's Memory Optimized OLTP Engine Heckaton SQL Server's Memory Optimized OLTP Engine Agenda Introduction to Hekaton Design Consideration High Level Architecture Storage and Indexing Query Processing Transaction Management Transaction Durability

More information

Configuration changes such as conversion from a single instance to RAC, ASM, etc.

Configuration changes such as conversion from a single instance to RAC, ASM, etc. Today, enterprises have to make sizeable investments in hardware and software to roll out infrastructure changes. For example, a data center may have an initiative to move databases to a low cost computing

More information

Inside the PostgreSQL Shared Buffer Cache

Inside the PostgreSQL Shared Buffer Cache Truviso 07/07/2008 About this presentation The master source for these slides is http://www.westnet.com/ gsmith/content/postgresql You can also find a machine-usable version of the source code to the later

More information

Extending Eventually Consistent Cloud Databases for Enforcing Numeric Invariants

Extending Eventually Consistent Cloud Databases for Enforcing Numeric Invariants Extending Eventually Consistent Cloud Databases for Enforcing Numeric Invariants Valter Balegas, Diogo Serra, Sérgio Duarte Carla Ferreira, Rodrigo Rodrigues, Nuno Preguiça NOVA LINCS/FCT/Universidade

More information