Ken Birman. Cornell Dept of Computer Science Colloquium
|
|
- Hubert Gallagher
- 5 years ago
- Views:
Transcription
1 Ken Birman Sept 24, 2009 Cornell Dept of Computer Science Colloquium 1
2 My career has been focused on multicast Mostly, in support of service replication and this was important in the old world Client server computing, dominated by databases ACID consistency guarantees were the key The virtual synchrony model is a form of ACID property But a new world has displaced the old one Sept 24, 2009 Cornell Dept of Computer Science Colloquium 2
3 Sept 24, 2009 Cornell Dept of Computer Science Colloquium 3
4 Web 2.0 Mashups Simple ways to create and share collaboration and social network applications [Try it! Examples: Live Objects, Google Wave, Javascript/AJAX, Silverlight, Java Fx, Adobe FLEX and AIR, etc. Sept 24, 2009 Cornell Dept of Computer Science Colloquium 4
5 Cloud computing platforms have a characteristic style Client system is a browser or an application pretending to be a browser (web services). If a request times out the client will retry it Cloud platform has a massive number of servers facing the clients. These are stateless and handle client requests with a focus on speed. If one fails, there is no warning. The cluster manager just restarts it, perhaps elsewhere Behind the servers is a row of stateful (usually database) servers. These operate off some form of work queues, usually implemented as enterprise service buses (ESBs). Think publish subscribe with logging. They push updates to the client facing systems. Finally, there is a big pool of machines to run backend applications. This looks like a loosely coupled collection of Linux machines sharing a file system and a lock manager, and often programmed mostly with Hadoop (MapReduce). Sept 24, 2009 Cornell Dept of Computer Science Colloquium 5
6 Client systems use web technologies , file storage, IM, search Web services Databases, files with data used by servers ESB glues it together Google/IBM/Amazon/Yahoo! host the services Embarrassingly parallel backend tasks prepare the databases and files
7 Range in size from edge facilities to megascale. Incredible economies of scale Approximate costs for a small size center (1K servers) and a larger, 50K server center. Technology Cost in smallsized Data Center Cost in Large Data Center Cloud Advantage Network $95 per Mbps/ month $13 per Mbps/ month 7.1 Storage Administratio n $2.20 per GB/ month ~140 servers/ Administrator $0.40 per GB/ month >1000 Servers/ Administrator Each data center is 11.5 times the size of a football field Slides 4 17 provided by Roger Barga, Head of Cloud Computing, Microsoft
8 Old world: we replicated servers for speed and availability, but maintained consistency New world: scalability matters most of all Lots of replication, but weak consistency Sept 24, 2009 Cornell Dept of Computer Science Colloquium 8
9 Why can t we have both? Scalability Consistency too Sept 24, 2009 Cornell Dept of Computer Science Colloquium 9
10 Sept 24, 2009 Cornell Dept of Computer Science Colloquium 10
11 Why do reliability/consistency mechanisms scale poorly? Focus on replication supported by multicast Can we do something about it? For example, can we fix the whole mess? How would a truly scalable platform look? Sept 24, 2009 Cornell Dept of Computer Science Colloquium 11
12 Sept 24, 2009 Cornell Dept of Computer Science Colloquium 12
13 As described by Randy Shoup at LADIS 2008 Thou shalt 1. Partition Everything 2. Use Asynchrony Everywhere 3. Automate Everything 4. Remember: Everything Fails 5. Embrace Inconsistency Sept 24, 2009 Cornell Dept of Computer Science Colloquium 13
14 Werner Vogels is CTO at Amazon.com His first act? He banned multicast! Amazon was troubled by platform instability Vogels decreed: all communication via SOAP/TCP This was slower but Stability matters more than speed Sept 24, 2009 Cornell Dept of Computer Science Colloquium 14
15 Key to scalability is decoupling, loosest possible synchronization Any synchronized mechanism is a risk His approach: create a committee Anyone who wants to deploy a highly consistent mechanism needs committee approval. They don t meet very often Sept 24, 2009 Cornell Dept of Computer Science Colloquium 15
16 Consistency technologies just don t scale! Sept 11, 24, 2009 P2P Cornell 2009 Dept Seattle, of Computer Washington Science Colloquium 16
17 This is the common thread All three guys Really build massive data centers, that work And are opposed to consistency mechanisms Sept 24, 2009 Cornell Dept of Computer Science Colloquium 17
18 A consistent distributed system will often have many components, but users observe behavior indistinguishable from that of a single component reference system Reference Model Implementation Sept 24, 2009 Cornell Dept of Computer Science Colloquium 18
19 Transactions that update replicated data Atomic broadcast or other forms of reliable multicast protocols Distributed 2 phase locking mechanisms Sept 24, 2009 Cornell Dept of Computer Science Colloquium 19
20 A=3 B=7 B = B A A=A+1 Non replicated reference execution Synchronous execution Virtually synchronous execution Virtual synchrony is a consistency model: Synchronous runs: indistinguishable from non replicated object that saw the same updates (like Paxos) Virtually synchronous runs are indistinguishable from synchronous runs Sept 24, 2009 Cornell Dept of Computer Science Colloquium 20
21 Inconsistency causes bugs Clients would never be able to trust servers a free for all My rent check bounced? That can t be right! Weak or best effort consistency? Jason Fane Properties Sept 2009 Tommy Tenant Strong security guarantees demand consistency Would you trust a medical electronic health records system or a bank that used weak consistency for better scalability? Sept 24, 2009 Cornell Dept of Computer Science Colloquium 21
22 They think consistency mechanisms are capable of destabilizing the whole data center They typically run over UDP and IP multicast As a result, are fast in the laboratory, but lack flow control. If a node gets overwhelmed they just keep on sending, and things get worse and worse They also believe that these mechanisms tend to cause synchronization over sets of nodes, which can cause load oscillations Sept 24, 2009 Cornell Dept of Computer Science Colloquium 22
23 A blend of stories (ebay, Amazon, Yahoo): Pub sub message bus very popular. System scaled up. Rolled out a faster ethernet. Product uses IPMC to accelerate sending All goes well until one day, under heavy load, loss rates spike, triggering collapse Oscillation observed Sept 24, 2009 Cornell Dept of Computer Science Colloquium 23
24 Without IP multicast (IPMC)? Pub/sub struggles to disseminate data This inevitably limits speed, scalability But IPMC itself scales poorly! Routers and NICs become promiscuous if we use large numbers of multicast addresses [Tock] Then everyone gets every IPMC packet Sept 24, 2009 Cornell Dept of Computer Science Colloquium 24
25 Routers and NICs use Bloom Filters for quick forwarding/receive decisions Should I forward packet P on link L? Is this packet one I should accept? Such a filter is just an array of k bit vectors Hash message k times; test/set bit b k in bvec[k] Too many addresses cause the filters to fill up and high false positive rate ensues Sept 24, 2009 Cornell Dept of Computer Science Colloquium 25
26 If IPMC becomes promiscuous, receivers get lots of junk It shows up fast, mixed with valid data This swamps the receivers, which drop packets including good packets So loss rates skyrocket, TCP shuts down. Woe ensues! Sept 24, 2009 Cornell Dept of Computer Science Colloquium 26
27 1. System gets larger and more loaded 2. Generalized loss starts to occur 3. Causes a massive spike of NAK messages. Retransmissions just make things worse meltdown! 4. After 90 seconds, the pub sub product resets Sept 24, 2009 Cornell Dept of Computer Science Colloquium 27
28 Oscillations can also arise in other ways For example, a very slow server that grabs a lock Some systems even self synchronize But those problems can usually be solved Distinction: Those are software issues; better designs can fix IPMC is hardware that breaks if we just scale the size of our overall system up! Sept 24, 2009 Cornell Dept of Computer Science Colloquium 28
29 A consistent system fights to maintain some guarantee: a feedback loop Thus when the network collapses, the consistency property is often seen to be at fault Had we not cared about consistency, feedback instability would not have occurred So consistency is dangerous! Sept 24, 2009 Cornell Dept of Computer Science Colloquium 29
30 Replicate with TCP overlay? Data often stale Use IPMC? It melts down catastrophically My code was bug free. It worked in testing and ran for years in our data center Then they upgraded the data center and my old application explodes. Was that my fault??! Sept 24, 2009 Cornell Dept of Computer Science Colloquium 30
31 My group has been working on this topic for about two years now Quick summary: Fix IP multicast to make multicast a friendly and safe citizen again [Dr Multicast: HotNets 08, LADIS 08, submission to Eurosys 10] Gossip tunnels for UDP and other point to point traffic: provably safe and scalable [Gossip Objects: ICP2P 09; submitted to ACM TOCS] Sept 24, 2009 Cornell Dept of Computer Science Colloquium 31
32 Underway now: Collaboration with Cisco to put these and related tools onto Internet backbone routers Goal: transparent software fault tolerance, with a reliable multicast that runs at 100Gbit data rates Also will be able to host virtual services on the routers themselves Translated back to a data center: safe and guaranteed stable reliability mechanisms Sept 24, 2009 Cornell Dept of Computer Science Colloquium 32
33 CTOs fear disruptive network meltdowns Fix the network, then reimplement consistency protocols as scalable solutions within NTI Well I ll give you another chance making a story that even a skeptical CTO might consider Sept 24, 2009 Cornell Dept of Computer Science Colloquium 33
34 c c Today s embrace of inconsistency has given us scalable services we just can t trust. As cloud takes on medical care, banking inconsistency just isn t good enough NTI: cloud scale consistency without fear has papers (NTI is still work in progress) Sept 24, 2009 Cornell Dept of Computer Science Colloquium 34
CS5412: ANATOMY OF A CLOUD
1 CS5412: ANATOMY OF A CLOUD Lecture VIII Ken Birman How are cloud structured? 2 Clients talk to clouds using web browsers or the web services standards But this only gets us to the outer skin of the cloud
More informationCS5412: HOW MUCH ORDERING?
1 CS5412: HOW MUCH ORDERING? Lecture XVI Ken Birman Ordering 2 The key to consistency turns has turned out to be delivery ordering (durability is a separate thing) Given replicas that are initially in
More informationCLOUD HOSTED COMPUTING FOR DEMANDING REAL-TIME SETTINGS KEN BIRMAN. Rao Professor of Computer Science Cornell University
CLOUD HOSTED COMPUTING FOR DEMANDING REAL-TIME SETTINGS KEN BIRMAN Rao Professor of Computer Science Cornell University CAN THE CLOUD DO REAL-TIME? Internet of Things, Smart Grid / Buildings / Cars,...
More informationCS603: Distributed Systems
CS603: Distributed Systems Lecture 1: Basic Communication Services Cristina Nita-Rotaru Lecture 1/ Spring 2006 1 Reference Material Textbooks Ken Birman: Reliable Distributed Systems Recommended reading
More informationDIVING IN: INSIDE THE DATA CENTER
1 DIVING IN: INSIDE THE DATA CENTER Anwar Alhenshiri Data centers 2 Once traffic reaches a data center it tunnels in First passes through a filter that blocks attacks Next, a router that directs it to
More informationIntroduction to Windows Azure Cloud Computing Futures Group, Microsoft Research Roger Barga, Jared Jackson, Nelson Araujo, Dennis Gannon, Wei Lu, and
Introduction to Windows Azure Cloud Computing Futures Group, Microsoft Research Roger Barga, Jared Jackson, Nelson Araujo, Dennis Gannon, Wei Lu, and Jaliya Ekanayake Range in size from edge facilities
More informationCS5412: THE REALTIME CLOUD
1 CS5412: THE REALTIME CLOUD Lecture XXIV Ken Birman Can the Cloud Support Real-Time? 2 More and more real time applications are migrating into cloud environments Monitoring of traffic in various situations,
More informationCS5412: DIVING IN: INSIDE THE DATA CENTER
1 CS5412: DIVING IN: INSIDE THE DATA CENTER Lecture V Ken Birman We ve seen one cloud service 2 Inside a cloud, Dynamo is an example of a service used to make sure that cloud-hosted applications can scale
More informationCS5412: DIVING IN: INSIDE THE DATA CENTER
1 CS5412: DIVING IN: INSIDE THE DATA CENTER Lecture V Ken Birman Data centers 2 Once traffic reaches a data center it tunnels in First passes through a filter that blocks attacks Next, a router that directs
More informationReliable Distributed System Approaches
Reliable Distributed System Approaches Manuel Graber Seminar of Distributed Computing WS 03/04 The Papers The Process Group Approach to Reliable Distributed Computing K. Birman; Communications of the ACM,
More informationSolace JMS Broker Delivers Highest Throughput for Persistent and Non-Persistent Delivery
Solace JMS Broker Delivers Highest Throughput for Persistent and Non-Persistent Delivery Java Message Service (JMS) is a standardized messaging interface that has become a pervasive part of the IT landscape
More informationCS October 2017
Atomic Transactions Transaction An operation composed of a number of discrete steps. Distributed Systems 11. Distributed Commit Protocols All the steps must be completed for the transaction to be committed.
More informationCS5412: OTHER DATA CENTER SERVICES
1 CS5412: OTHER DATA CENTER SERVICES Lecture V Ken Birman Tier two and Inner Tiers 2 If tier one faces the user and constructs responses, what lives in tier two? Caching services are very common (many
More informationCLOUD COMPUTING Lecture 27: CS 2110 Spring 2013
CLOUD COMPUTING Lecture 27: CS 2110 Spring 2013 Computing has evolved... 2 Fifteen years ago: desktop/laptop + clusters Then Web sites Social networking sites with photos, etc Cloud computing systems Cloud
More informationDISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN. Chapter 1. Introduction
DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN Chapter 1 Introduction Modified by: Dr. Ramzi Saifan Definition of a Distributed System (1) A distributed
More informationCISC 7610 Lecture 2b The beginnings of NoSQL
CISC 7610 Lecture 2b The beginnings of NoSQL Topics: Big Data Google s infrastructure Hadoop: open google infrastructure Scaling through sharding CAP theorem Amazon s Dynamo 5 V s of big data Everyone
More informationCornell University, Hebrew University * (*) Others are also involved in some aspects of this project I ll mention them when their work arises
Live Objects Krzys Ostrowski, Ken Birman, Danny Dolev Cornell University, Hebrew University * (*) Others are also involved in some aspects of this project I ll mention them when their work arises Live
More informationRELIABILITY & AVAILABILITY IN THE CLOUD
RELIABILITY & AVAILABILITY IN THE CLOUD A TWILIO PERSPECTIVE twilio.com To the leaders and engineers at Twilio, the cloud represents the promise of reliable, scalable infrastructure at a price that directly
More informationCS5412: BIMODAL MULTICAST ASTROLABE
1 CS5412: BIMODAL MULTICAST ASTROLABE Lecture XIX Ken Birman Gossip 201 2 Recall from early in the semester that gossip spreads in log(system size) time But is this actually fast? 1.0 % infected 0.0 Time
More informationDistributed Data Store
Distributed Data Store Large-Scale Distributed le system Q: What if we have too much data to store in a single machine? Q: How can we create one big filesystem over a cluster of machines, whose data is
More informationIndirect Communication
Indirect Communication To do q Today q q Space and time (un)coupling Common techniques q Next time: Overlay networks xkdc Direct coupling communication With R-R, RPC, RMI Space coupled Sender knows the
More informationInfrastructures for Cloud Computing and Big Data M
University of Bologna Dipartimento di Informatica Scienza e Ingegneria (DISI) Engineering Bologna Campus Class of Infrastructures for Cloud Computing and Big Data M Cloud support and Global strategies
More informationCS555: Distributed Systems [Fall 2017] Dept. Of Computer Science, Colorado State University
CS 555: DISTRIBUTED SYSTEMS [REPLICATION & CONSISTENCY] Frequently asked questions from the previous class survey Shrideep Pallickara Computer Science Colorado State University L25.1 L25.2 Topics covered
More informationData Intensive Scalable Computing
Data Intensive Scalable Computing Randal E. Bryant Carnegie Mellon University http://www.cs.cmu.edu/~bryant Examples of Big Data Sources Wal-Mart 267 million items/day, sold at 6,000 stores HP built them
More informationDistributed Systems CS6421
Distributed Systems CS6421 Intro to Distributed Systems and the Cloud Prof. Tim Wood v I teach: Software Engineering, Operating Systems, Sr. Design I like: distributed systems, networks, building cool
More information10. Replication. Motivation
10. Replication Page 1 10. Replication Motivation Reliable and high-performance computation on a single instance of a data object is prone to failure. Replicate data to overcome single points of failure
More informationebay Marketplace Architecture
ebay Marketplace Architecture Architectural Strategies, Patterns, and Forces Randy Shoup, ebay Distinguished Architect QCon SF 2007 November 9, 2007 What we re up against ebay manages Over 248,000,000
More informationOur topic. Concurrency Support. Why is this a problem? Why threads? Classic solution? Resulting architecture. Ken Birman
Our topic Concurrency Support Ken Birman To get high performance, distributed systems need to achieve a high level of concurrency As a practical matter: interleaving tasks, so that while one waits for
More informationReliable Distributed Systems. Scalability
Reliable Distributed Systems Scalability Scalability Today we ll focus on how things scale Basically: look at a property that matters Make something bigger Like the network, the number of groups, the number
More informationebay s Architectural Principles
ebay s Architectural Principles Architectural Strategies, Patterns, and Forces for Scaling a Large ecommerce Site Randy Shoup ebay Distinguished Architect QCon London 2008 March 14, 2008 What we re up
More informationCS505: Distributed Systems
Cristina Nita-Rotaru CS505: Distributed Systems Protocols. Slides prepared based on material by Prof. Ken Birman at Cornell University, available at http://www.cs.cornell.edu/ken/book/ Required reading
More informationIntroduction to the Active Everywhere Database
Introduction to the Active Everywhere Database INTRODUCTION For almost half a century, the relational database management system (RDBMS) has been the dominant model for database management. This more than
More informationGFS Overview. Design goals/priorities Design for big-data workloads Huge files, mostly appends, concurrency, huge bandwidth Design for failures
GFS Overview Design goals/priorities Design for big-data workloads Huge files, mostly appends, concurrency, huge bandwidth Design for failures Interface: non-posix New op: record appends (atomicity matters,
More informationMySQL Group Replication. Bogdan Kecman MySQL Principal Technical Engineer
MySQL Group Replication Bogdan Kecman MySQL Principal Technical Engineer Bogdan.Kecman@oracle.com 1 Safe Harbor Statement The following is intended to outline our general product direction. It is intended
More informationQOS Quality Of Service
QOS Quality Of Service Michael Schär Seminar in Distributed Computing Outline Definition QOS Attempts and problems in the past (2 Papers) A possible solution for the future: Overlay networks (2 Papers)
More informationAdvanced option settings on the command line. Set the interface and ports for the OpenVPN daemons
Advanced option settings on the command line docs.openvpn.net/command-line/advanced-option-settings-on-the-command-line Set the interface and ports for the OpenVPN daemons In the Admin UI under Server
More informationData Consistency Now and Then
Data Consistency Now and Then Todd Schmitter JPMorgan Chase June 27, 2017 Room #208 Data consistency in real life Social media Facebook post: January 22, 2017, at a political rally Comments displayed are
More informationCassandra, MongoDB, and HBase. Cassandra, MongoDB, and HBase. I have chosen these three due to their recent
Tanton Jeppson CS 401R Lab 3 Cassandra, MongoDB, and HBase Introduction For my report I have chosen to take a deeper look at 3 NoSQL database systems: Cassandra, MongoDB, and HBase. I have chosen these
More informationBest Practices for Scaling Websites Lessons from ebay
Best Practices for Scaling Websites Lessons from ebay Randy Shoup ebay Distinguished Architect QCon Asia 2009 Challenges at Internet Scale ebay manages 86.3 million active users worldwide 120 million items
More informationTCP Strategies. Keepalive Timer. implementations do not have it as it is occasionally regarded as controversial. between source and destination
Keepalive Timer! Yet another timer in TCP is the keepalive! This one is not required, and some implementations do not have it as it is occasionally regarded as controversial! When a TCP connection is idle
More informationCS /15/16. Paul Krzyzanowski 1. Question 1. Distributed Systems 2016 Exam 2 Review. Question 3. Question 2. Question 5.
Question 1 What makes a message unstable? How does an unstable message become stable? Distributed Systems 2016 Exam 2 Review Paul Krzyzanowski Rutgers University Fall 2016 In virtual sychrony, a message
More informationData Centers. Tom Anderson
Data Centers Tom Anderson Transport Clarification RPC messages can be arbitrary size Ex: ok to send a tree or a hash table Can require more than one packet sent/received We assume messages can be dropped,
More informationCS5412: THE BASE METHODOLOGY VERSUS THE ACID MODEL
1 CS5412: THE BASE METHODOLOGY VERSUS THE ACID MODEL Lecture VIII Ken Birman Today s lecture will be a bit short 2 We have a guest with us today: Kate Jenkins from Akamai The world s top content hosting
More informationCloud Computing. DB Special Topics Lecture (10/5/2012) Kyle Hale Maciej Swiech
Cloud Computing DB Special Topics Lecture (10/5/2012) Kyle Hale Maciej Swiech Managing servers isn t for everyone What are some prohibitive issues? (we touched on these last time) Cost (initial/operational)
More informationScaling Without Sharding. Baron Schwartz Percona Inc Surge 2010
Scaling Without Sharding Baron Schwartz Percona Inc Surge 2010 Web Scale!!!! http://www.xtranormal.com/watch/6995033/ A Sharding Thought Experiment 64 shards per proxy [1] 1 TB of data storage per node
More informationThe Internet. CS514: Intermediate Course in Operating Systems. Massive scale. Constant flux. Typical examples. A tale of woe. Sample tale of woe
The Internet CS54: Intermediate Course in Operating Systems Massive scale. Constant flux Professor Ken Birman Vivek Vishnumurthy: TA Source: Burch and Cheswick Demand for more autonomic, intelligent behavior
More informationCPSC 426/526. Cloud Computing. Ennan Zhai. Computer Science Department Yale University
CPSC 426/526 Cloud Computing Ennan Zhai Computer Science Department Yale University Recall: Lec-7 In the lec-7, I talked about: - P2P vs Enterprise control - Firewall - NATs - Software defined network
More informationOptimization of Windows 8, 8.1 and 10 for better performance
Windows 10 Optimization of Windows 8, 8.1 and 10 for better performance By default, Windows 10 turns your PC into a server for distributing updates to other machines. Here's how to make it stop. One of
More information17655: Discussion: The New z/os Interface for the Touch Generation
17655: Discussion: The New z/os Interface for the Touch Generation Thursday, August 13, 2015: 12:30 PM-1:30 PM Europe 2 (Walt Disney World Dolphin ) Speaker: Geoff Smith(IBM Corporation) 1 Trademarks The
More informationDistributed Systems Intro and Course Overview
Distributed Systems Intro and Course Overview COS 418: Distributed Systems Lecture 1 Wyatt Lloyd Distributed Systems, What? 1) Multiple computers 2) Connected by a network 3) Doing something together Distributed
More informationPerformance Benchmarking an Enterprise Message Bus. Anurag Sharma Pramod Sharma Sumant Vashisth
Performance Benchmarking an Enterprise Message Bus Anurag Sharma Pramod Sharma Sumant Vashisth About the Authors Sumant Vashisth is Director of Engineering, Security Management Business Unit at McAfee.
More informationSolace Message Routers and Cisco Ethernet Switches: Unified Infrastructure for Financial Services Middleware
Solace Message Routers and Cisco Ethernet Switches: Unified Infrastructure for Financial Services Middleware What You Will Learn The goal of zero latency in financial services has caused the creation of
More informationConsensus a classic problem. Consensus, impossibility results and Paxos. Distributed Consensus. Asynchronous networks.
Consensus, impossibility results and Paxos Ken Birman Consensus a classic problem Consensus abstraction underlies many distributed systems and protocols N processes They start execution with inputs {0,1}
More informationMap-Reduce. Marco Mura 2010 March, 31th
Map-Reduce Marco Mura (mura@di.unipi.it) 2010 March, 31th This paper is a note from the 2009-2010 course Strumenti di programmazione per sistemi paralleli e distribuiti and it s based by the lessons of
More informationOverview. TCP & router queuing Computer Networking. TCP details. Workloads. TCP Performance. TCP Performance. Lecture 10 TCP & Routers
Overview 15-441 Computer Networking TCP & router queuing Lecture 10 TCP & Routers TCP details Workloads Lecture 10: 09-30-2002 2 TCP Performance TCP Performance Can TCP saturate a link? Congestion control
More informationCS514: Intermediate Course in Computer Systems
: Intermediate Course in Computer Systems Lecture 15: February 24, 2003 Overcoming network outages in applications that need high availability Network outages We ve seen a number of security mechanisms
More informationCS5412 CLOUD COMPUTING: PRELIM EXAM Open book, open notes. 90 minutes plus 45 minutes grace period, hence 2h 15m maximum working time.
CS5412 CLOUD COMPUTING: PRELIM EXAM Open book, open notes. 90 minutes plus 45 minutes grace period, hence 2h 15m maximum working time. SOLUTION SET In class we often used smart highway (SH) systems as
More informationAnanta: Cloud Scale Load Balancing. Nitish Paradkar, Zaina Hamid. EECS 589 Paper Review
Ananta: Cloud Scale Load Balancing Nitish Paradkar, Zaina Hamid EECS 589 Paper Review 1 Full Reference Patel, P. et al., " Ananta: Cloud Scale Load Balancing," Proc. of ACM SIGCOMM '13, 43(4):207-218,
More informationKeeping Your Network at Peak Performance as You Virtualize the Data Center
Keeping Your Network at Peak Performance as You Virtualize the Data Center Laura Knapp WW Business Consultant Laurak@aesclever.com Applied Expert Systems, Inc. 2011 1 Background The Physical Network Inside
More informationChapter 8 Fault Tolerance
DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN Chapter 8 Fault Tolerance 1 Fault Tolerance Basic Concepts Being fault tolerant is strongly related to
More informationCOMMUNICATION IN DISTRIBUTED SYSTEMS
Distributed Systems Fö 3-1 Distributed Systems Fö 3-2 COMMUNICATION IN DISTRIBUTED SYSTEMS Communication Models and their Layered Implementation 1. Communication System: Layered Implementation 2. Network
More informationCS514: Intermediate Course in Computer Systems. Lecture 38: April 23, 2003 Nested transactions and other weird implementation issues
: Intermediate Course in Computer Systems Lecture 38: April 23, 2003 Nested transactions and other weird implementation issues Transactions Continued We saw Transactions on a single server 2 phase locking
More informationNFON Whitepaper: Integrating Microsoft Lync (Skype for Business) with Telephony
. Myths and facts - for enterprise owners, managers and buyers. Document Version: V1.0 Date: November 2015 NFON UK Ltd, 140 Wales Farm Road, London, W3 6UG, UK Page 1 of 7 1. INTRODUCTION While the business
More informationGFS: The Google File System
GFS: The Google File System Brad Karp UCL Computer Science CS GZ03 / M030 24 th October 2014 Motivating Application: Google Crawl the whole web Store it all on one big disk Process users searches on one
More informationDistributed Systems. Lec 10: Distributed File Systems GFS. Slide acks: Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung
Distributed Systems Lec 10: Distributed File Systems GFS Slide acks: Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung 1 Distributed File Systems NFS AFS GFS Some themes in these classes: Workload-oriented
More informationExample File Systems Using Replication CS 188 Distributed Systems February 10, 2015
Example File Systems Using Replication CS 188 Distributed Systems February 10, 2015 Page 1 Example Replicated File Systems NFS Coda Ficus Page 2 NFS Originally NFS did not have any replication capability
More informationThird Wave. How to Incorporate Legacy Devices to the. of Internet Evolution
How to Incorporate Legacy Devices to the Third Wave of Internet Evolution John Rinaldi Business Development Manager Real Time Automation N26 W23315 Paul Rd Pewaukee, WI 53072 (p) 262-436-9299 414-460-6556
More informationCMSC 433 Programming Language Technologies and Paradigms. Spring 2013
1 CMSC 433 Programming Language Technologies and Paradigms Spring 2013 Distributed Computing Concurrency and the Shared State This semester we have been looking at concurrent programming and how it is
More informationFighting Phishing I: Get phish or die tryin.
Fighting Phishing I: Get phish or die tryin. Micah Nelson and Max Hyppolite bit.ly/nercomp_sap918 Please, don t forget to submit your feedback for today s session at the above URL. If you use social media
More informationCS5412: NETWORKS AND THE CLOUD
1 CS5412: NETWORKS AND THE CLOUD Lecture III Ken Birman The Internet and the Cloud 2 Cloud computing is transforming the Internet! Mix of traffic has changed dramatically Demand for networking of all kinds
More informationIntroduction to Distributed * Systems
Introduction to Distributed * Systems Outline about the course relationship to other courses the challenges of distributed systems distributed services *ility for distributed services about the course
More informationAWS Solution Architecture Patterns
AWS Solution Architecture Patterns Objectives Key objectives of this chapter AWS reference architecture catalog Overview of some AWS solution architecture patterns 1.1 AWS Architecture Center The AWS Architecture
More informationEnterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions
Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions Chapter 1: Solving Integration Problems Using Patterns 2 Introduction The Need for Integration Integration Challenges
More informationThe Google File System
The Google File System Sanjay Ghemawat, Howard Gobioff and Shun Tak Leung Google* Shivesh Kumar Sharma fl4164@wayne.edu Fall 2015 004395771 Overview Google file system is a scalable distributed file system
More informationCLUSTERING HIVEMQ. Building highly available, horizontally scalable MQTT Broker Clusters
CLUSTERING HIVEMQ Building highly available, horizontally scalable MQTT Broker Clusters 12/2016 About this document MQTT is based on a publish/subscribe architecture that decouples MQTT clients and uses
More informationTIME FOR A NEW NETWORK REINVENTING THE NETWORK. Abner Germanow March 15, 2011
TIME FOR A NEW NETWORK REINVENTING THE NETWORK Abner Germanow March 15, 2011 IT SPENDING The Network 3 Copyright 2010 Juniper Networks, Inc. www.juniper.net IMPACT OF THE NETWORK The Network Not Connected
More informationGOSSIP ARCHITECTURE. Gary Berg css434
GOSSIP ARCHITECTURE Gary Berg css434 WE WILL SEE Architecture overview Consistency models How it works Availability and Recovery Performance and Scalability PRELIMINARIES Why replication? Fault tolerance
More informationApsaraDB for Redis. Product Introduction
ApsaraDB for Redis is compatible with open-source Redis protocol standards and provides persistent memory database services. Based on its high-reliability dual-machine hot standby architecture and seamlessly
More informationLike It Or Not Web Applications and Mashups Will Be Hot
Like It Or Not Web Applications and Mashups Will Be Hot Tommi Mikkonen Tampere University of Technology tommi.mikkonen@tut.fi Antero Taivalsaari Sun Microsystems Laboratories antero.taivalsaari@sun.com
More informationDistributed Systems. Characteristics of Distributed Systems. Lecture Notes 1 Basic Concepts. Operating Systems. Anand Tripathi
1 Lecture Notes 1 Basic Concepts Anand Tripathi CSci 8980 Operating Systems Anand Tripathi CSci 8980 1 Distributed Systems A set of computers (hosts or nodes) connected through a communication network.
More informationDistributed Systems. Characteristics of Distributed Systems. Characteristics of Distributed Systems. Goals in Distributed System Designs
1 Anand Tripathi CSci 8980 Operating Systems Lecture Notes 1 Basic Concepts Distributed Systems A set of computers (hosts or nodes) connected through a communication network. Nodes may have different speeds
More informationPNUTS and Weighted Voting. Vijay Chidambaram CS 380 D (Feb 8)
PNUTS and Weighted Voting Vijay Chidambaram CS 380 D (Feb 8) PNUTS Distributed database built by Yahoo Paper describes a production system Goals: Scalability Low latency, predictable latency Must handle
More informationIndirect Communication
Indirect Communication Vladimir Vlassov and Johan Montelius KTH ROYAL INSTITUTE OF TECHNOLOGY Time and Space In direct communication sender and receivers exist in the same time and know of each other.
More informationICANN and Technical Work: Really? Yes! Steve Crocker DNS Symposium, Madrid, 13 May 2017
ICANN and Technical Work: Really? Yes! Steve Crocker DNS Symposium, Madrid, 13 May 2017 Welcome, everyone. I appreciate the invitation to say a few words here. This is an important meeting and I think
More informationSEARCH ENGINE OPTIMIZATION Noun The process of maximizing the number of visitors to a particular website by ensuring that the site appears high on the list of results returned by a search engine such as
More informationProgramming Models MapReduce
Programming Models MapReduce Majd Sakr, Garth Gibson, Greg Ganger, Raja Sambasivan 15-719/18-847b Advanced Cloud Computing Fall 2013 Sep 23, 2013 1 MapReduce In a Nutshell MapReduce incorporates two phases
More informationEventual Consistency 1
Eventual Consistency 1 Readings Werner Vogels ACM Queue paper http://queue.acm.org/detail.cfm?id=1466448 Dynamo paper http://www.allthingsdistributed.com/files/ amazon-dynamo-sosp2007.pdf Apache Cassandra
More informationCurrent Topics in OS Research. So, what s hot?
Current Topics in OS Research COMP7840 OSDI Current OS Research 0 So, what s hot? Operating systems have been around for a long time in many forms for different types of devices It is normally general
More informationMore on LANS. LAN Wiring, Interface
More on LANS Chapters 10-11 LAN Wiring, Interface Mostly covered this material already NIC = Network Interface Card Separate processor, buffers incoming/outgoing data CPU might not be able to keep up network
More informationMODELS OF DISTRIBUTED SYSTEMS
Distributed Systems Fö 2/3-1 Distributed Systems Fö 2/3-2 MODELS OF DISTRIBUTED SYSTEMS Basic Elements 1. Architectural Models 2. Interaction Models Resources in a distributed system are shared between
More informationYour Data Demands More NETAPP ENABLES YOU TO LEVERAGE YOUR DATA & COMPUTE FROM ANYWHERE
Your Data Demands More NETAPP ENABLES YOU TO LEVERAGE YOUR DATA & COMPUTE FROM ANYWHERE IN ITS EARLY DAYS, NetApp s (www.netapp.com) primary goal was to build a market for network-attached storage and
More informationCloud Computing CS
Cloud Computing CS 15-319 Programming Models- Part III Lecture 6, Feb 1, 2012 Majd F. Sakr and Mohammad Hammoud 1 Today Last session Programming Models- Part II Today s session Programming Models Part
More informationICANN Start, Episode 1: Redirection and Wildcarding. Welcome to ICANN Start. This is the show about one issue, five questions:
Recorded in October, 2009 [Music Intro] ICANN Start, Episode 1: Redirection and Wildcarding Welcome to ICANN Start. This is the show about one issue, five questions: What is it? Why does it matter? Who
More informationPrimary-Backup Replication
Primary-Backup Replication CS 240: Computing Systems and Concurrency Lecture 7 Marco Canini Credits: Michael Freedman and Kyle Jamieson developed much of the original material. Simplified Fault Tolerance
More informationFailure Tolerance. Distributed Systems Santa Clara University
Failure Tolerance Distributed Systems Santa Clara University Distributed Checkpointing Distributed Checkpointing Capture the global state of a distributed system Chandy and Lamport: Distributed snapshot
More informationSCALABLE CONSISTENCY AND TRANSACTION MODELS
Data Management in the Cloud SCALABLE CONSISTENCY AND TRANSACTION MODELS 69 Brewer s Conjecture Three properties that are desirable and expected from realworld shared-data systems C: data consistency A:
More informationUpgrade Your MuleESB with Solace s Messaging Infrastructure
The era of ubiquitous connectivity is upon us. The amount of data most modern enterprises must collect, process and distribute is exploding as a result of real-time process flows, big data, ubiquitous
More informationAzure Cloud Architecture
Azure Cloud Architecture Training Schedule 2015 May 18-20 Belgium (TBD) Overview This course is a deep dive in every architecture aspect of the Azure Platform-as-a-Service components. It delivers the needed
More informationManaging Data at Scale: Microservices and Events. Randy linkedin.com/in/randyshoup
Managing Data at Scale: Microservices and Events Randy Shoup @randyshoup linkedin.com/in/randyshoup Background VP Engineering at Stitch Fix o Combining Art and Science to revolutionize apparel retail Consulting
More information