Failures in Distributed Systems
|
|
- Daniela Bates
- 5 years ago
- Views:
Transcription
1 CPSC 426/526 Failures in Distributed Systems Ennan Zhai Computer Science Department Yale University
2 Recall: Lec-14 In lec-14, we learned: - Difference between privacy and anonymity - Anonymous communications - Case study: Dissent [OSDI 12]
3 Lecture Roadmap Failures in Distributed Systems Root Causes How to handle failures Case Study: Catastrophic Failures
4 Failures in Distributed Systems What is the meaning of failure: - Your system does not function correctly - Your system implementation does not meet your spec - What is your intuition when you say some code has bugs?
5 Why we care about failures? Data Center Outages Generate Big Losses Downtime in a data center can cost an average of $740,357 per incident -- cloud outage cost report in 2016.
6 Failures in Distributed Systems Different aspects of system failures: - Reliability: Operation & job failures, data loss & corruption - Performance: Traffic overloaded, connection delay - Availability: Node and cluster downtime - Consistency: Permanent inconsistent replicas - Scalability: Function incorrectly in large-scale clusters
7 Failures in Distributed Systems Different aspects of system failures: - Reliability: Operation & job failures, data loss & corruption - Performance: Traffic overloaded, connection delay - Availability: Node and cluster downtime - Consistency: Permanent inconsistent replicas - Scalability: Function incorrectly in large-scale clusters
8 Reliability & Availability Correlated failures resulting from EBS due to bugs in one EBS server Summary of the October 22, 2012 AWS Service Event in the US-East Region We d like to share more about the service event that occurred on Monday, October 22nd in the US-East Region. We have now completed the analysis of the events that affected AWS customers, and we want to describe what happened, our understanding of how customers were affected, and what we are doing to prevent a similar issue from occurring in the future. The Primary Event and the Impact to Amazon Elastic Block Store (EBS) and Amazon Elastic Compute Cloud (EC2)
9 Reliability & Availability Elastic Compute Cloud (EC2) Elastic Block Store (EBS)
10 Reliability & Availability Elastic Block Store (EBS)
11 Reliability & Availability VM1 VM2 VM1 VM3 VM1 VM2 VM3 VM Elastic Block Store (EBS)
12 Netflix VM1 VM2 VM1 VM3 VM1 VM2 VM3 VM EBS Cluster1 EBS Cluster2
13 Netflix VM1 VM2 VM1 VM3 VM1 VM2 VM3 VM EBS Cluster1 EBS Cluster2
14 Netflix VM1 VM2 VM1 VM3 VM1 VM2 VM3 VM EBS Cluster1 EBS Cluster2
15 Netflix VM1 VM2 VM1 VM3 VM1 VM2 VM3 VM EBS Cluster1 EBS Cluster2
16 Global Outages
17 A more tricky example
18
19 Video App Cloud Provider A Cloud Provider B
20 Video App Cloud Provider A Cloud Provider B Third-party infrastructure components
21 Video App Cloud Provider A Cloud Provider B Third-party infrastructure components ISP Router A ISP Router B ISP Router C
22 Video App Cloud Provider A Cloud Provider B Third-party infrastructure components ISP Router A ISP Router B ISP Router C Power Source
23 Video App Cloud Provider A Cloud Provider B Third-party infrastructure components ISP Router A ISP Router B ISP Router C Power Source
24 Video App Cloud Provider A Cloud Provider B Third-party infrastructure components ISP Router A ISP Router B ISP Router C Power Source
25 Video App Cloud Provider A Cloud Provider B Cloud providers do not usually share Third-party infrastructure components information about their dependencies ISP Router A ISP Router B ISP Router C Power Source
26 Failures in Distributed Systems Different aspects of system failures: - Reliability: Operation & job failures, data loss & corruption - Performance: Traffic overloaded, connection delay - Availability: Node and cluster downtime - Consistency: Permanent inconsistent replicas - Scalability: Function incorrectly in large-scale clusters
27 Failures in Distributed Systems Different aspects of system failures: - Reliability: Operation & job failures, data loss & corruption - Performance: Traffic overloaded, connection delay - Availability: Node and cluster downtime - Consistency: Permanent inconsistent replicas - Scalability: Function incorrectly in large-scale clusters
28 Consistency Failures ZooKeeper (synchronization service) Issue # Nodes A, B, C start (w/ latex txid: 10) 2. B becomes leader 3. B crashes 4. C becomes leader 5. C commits new txid-value pair (11, X) 6. A crashes, before committing the new txid C loses quorum and C crashes 8. A and B are back online after C crashes 9. A becomes leader 10. A's commits new txid-value pair (11, Y) 11. C is back online after A's new tx commit 12. C announce to B (11, X) 13. B replies diff starting with tx Inconsistency: A has (11, Y), C has (11, X) PERMANENT INCONSISTENT REPLICA
29 Failures in Distributed Systems Different aspects of system failures: - Reliability: Operation & job failures, data loss & corruption - Performance: Traffic overloaded, connection delay - Availability: Node and cluster downtime - Consistency: Permanent inconsistent replicas - Scalability: Function incorrectly in large-scale clusters
30 Scalability Failures Large data In HBase Tens of minutes Insufficient lookup opera;on R1 R R2 R100K R3
31 Lecture Roadmap Failures in Distributed Systems Root Causes How to handle failures Case Study: Catastrophic Failures
32 Root Causes Customer Environment 25% 9% 31% Misconfiguration Hardware Bugs 15% 20% 5 Storage Systems, 2011
33 Root Causes Customer Environment 25% 9% 31% Misconfiguration Hardware Bugs 15% 20% More Cloud Systems 5 Storage Systems, 2011
34 Root-cause types: Logic Logic (29%) Many domain-specific issues 48
35 Root-cause types: Error handling Logic Error handling (18%) 49
36 Example
37 Root-cause types: Op/miza/on Logic Error handling Op#miza#on (15%) 55
38 Root-cause types: Configura3on Logic Error handling Op3miza3on Configura)on (14%) 56
39 Example Configuration File Specify parameters MySQL
40 Example
41 Example
42 Root-cause types: Race Race (12%) < 50% local concurrency bugs Buggy thread interleaving Tons of work > 50% distributed concurrency bugs Reordering of messages, crashes, Cmeouts More work is needed SAMC [OSDI 14] 69
43 Root-cause types: Hang Hang (4%) Classical deadlock Un-served jobs, stalled opera8ons, Root causes? How to detect them? 70
44 Root-cause types: Space Space (4%) Big data + leak = Big leak Clean-up opera:ons must be flawless. 72
45 Root-case types: Load Load (4%) Happen when systems face high request load Relates to QoS and admission control 73
46 Lecture Roadmap Failures in Distributed Systems Root Causes How to handle failures Case Study: Catastrophic Failures
47 Failure Handling in Distributed Systems
48 Failure Handling in Distributed Systems Service initialization Changing network paths Upgrading software components Outage Service Runtime
49 Failure Handling in Distributed Systems Post-Failure Troubleshooting Failure Prevention 1. Diagnosis tools 2. Accountability 3. Provenance Service initialization Changing network paths Upgrading software components Outage Service Runtime
50 Failure Handling in Distributed Systems Post-Failure Troubleshooting Failure Prevention Failure Tolerance 1. Diagnosis tools 2. Accountability 3. Provenance Service initialization Changing network paths Upgrading software components Outage Service Runtime
51 Lecture Roadmap Failures in Distributed Systems Root Causes How to handle failures Case Study: Catastrophic Failures
52 Next Lecture In the lec-16, I will learn: - Misconfigurations in distributed systems - How important they are - How to handle misconfiguration
Heading Off Correlated Failures through Independence-as-a-Service
Heading Off Correlated Failures through Independence-as-a-Service Ennan Zhai 1 Ruichuan Chen 2, David Isaac Wolinsky 1, Bryan Ford 1 1 Yale University 2 Bell Labs/Alcatel-Lucent Background Cloud services
More informationCloud Computing. Ennan Zhai. Computer Science at Yale University
Cloud Computing Ennan Zhai Computer Science at Yale University ennan.zhai@yale.edu About Final Project About Final Project Important dates before demo session: - Oct 31: Proposal v1.0 - Nov 7: Source code
More informationCPSC 426/526. Cloud Computing. Ennan Zhai. Computer Science Department Yale University
CPSC 426/526 Cloud Computing Ennan Zhai Computer Science Department Yale University Recall: Lec-7 In the lec-7, I talked about: - P2P vs Enterprise control - Firewall - NATs - Software defined network
More informationReplications and Consensus
CPSC 426/526 Replications and Consensus Ennan Zhai Computer Science Department Yale University Recall: Lec-8 and 9 In the lec-8 and 9, we learned: - Cloud storage and data processing - File system: Google
More informationSAMC: Sema+c- Aware Model Checking for Fast Discovery of Deep Bugs in Cloud Systems
SAMC: Sema+c- Aware Model Checking for Fast Discovery of Deep Bugs in Cloud Systems Tanakorn Leesatapornwongsa, Mingzhe Hao, Pallavi Joshi *, Jeffrey F. Lukman, and Haryadi S. Gunawi * 1 Internet Services
More informationCPSC 426/526. P2P Lookup Service. Ennan Zhai. Computer Science Department Yale University
CPSC / PP Lookup Service Ennan Zhai Computer Science Department Yale University Recall: Lec- Network basics: - OSI model and how Internet works - Socket APIs red PP network (Gnutella, KaZaA, etc.) UseNet
More informationHOW TO PLAN & EXECUTE A SUCCESSFUL CLOUD MIGRATION
HOW TO PLAN & EXECUTE A SUCCESSFUL CLOUD MIGRATION Steve Bertoldi, Solutions Director, MarkLogic Agenda Cloud computing and on premise issues Comparison of traditional vs cloud architecture Review of use
More informationTROPIC: Transactional Resource Orchestration Platform In the Cloud
TROPIC: Transactional Resource Orchestration Platform In the Cloud Changbin Liu, Yun Mao*, Xu Chen*, Mary Fernandez*, Boon Thau Loo, Jacobus Van der Merwe* * netdb.cis.upenn.edu/dmf 1 Motivation Infrastructure
More informationPrincipal Solutions Architect. Architecting in the Cloud
Matt Tavis Principal Solutions Architect Architecting in the Cloud Cloud Best Practices Whitepaper Prescriptive guidance to Cloud Architects Just Search for Cloud Best Practices to find the link ttp://media.amazonwebservices.co
More informationDISTRIBUTED SYSTEMS [COMP9243] Lecture 8a: Cloud Computing WHAT IS CLOUD COMPUTING? 2. Slide 3. Slide 1. Why is it called Cloud?
DISTRIBUTED SYSTEMS [COMP9243] Lecture 8a: Cloud Computing Slide 1 Slide 3 ➀ What is Cloud Computing? ➁ X as a Service ➂ Key Challenges ➃ Developing for the Cloud Why is it called Cloud? services provided
More informationCPSC 426/526. P2P Lookup Service. Ennan Zhai. Computer Science Department Yale University
CPSC 4/5 PP Lookup Service Ennan Zhai Computer Science Department Yale University Recall: Lec- Network basics: - OSI model and how Internet works - Socket APIs red PP network (Gnutella, KaZaA, etc.) UseNet
More informationWelcome to the New Era of Cloud Computing
Welcome to the New Era of Cloud Computing Aaron Kimball The web is replacing the desktop 1 SDKs & toolkits are there What about the backend? Image: Wikipedia user Calyponte 2 Two key concepts Processing
More informationServers fail, who cares? (Answer: I do, sort of) Gregg Ulrich, #netflixcloud #cassandra12
Servers fail, who cares? (Answer: I do, sort of) Gregg Ulrich, Netflix @eatupmartha #netflixcloud #cassandra12 1 June 29, 2012 2 3 4 [1] 5 From the Netflix tech blog: Cassandra, our distributed cloud persistence
More informationDocument Sub Title. Yotpo. Technical Overview 07/18/ Yotpo
Document Sub Title Yotpo Technical Overview 07/18/2016 2015 Yotpo Contents Introduction... 3 Yotpo Architecture... 4 Yotpo Back Office (or B2B)... 4 Yotpo On-Site Presence... 4 Technologies... 5 Real-Time
More informationAWS_SOA-C00 Exam. Volume: 758 Questions
Volume: 758 Questions Question: 1 A user has created photo editing software and hosted it on EC2. The software accepts requests from the user about the photo format and resolution and sends a message to
More informationDistributed Systems Homework 1 (6 problems)
15-440 Distributed Systems Homework 1 (6 problems) Due: November 30, 11:59 PM via electronic handin Hand in to Autolab in PDF format November 29, 2011 1. You have set up a fault-tolerant banking service.
More informationWhite Paper Amazon Aurora A Fast, Affordable and Powerful RDBMS
White Paper Amazon Aurora A Fast, Affordable and Powerful RDBMS TABLE OF CONTENTS Introduction 3 Multi-Tenant Logging and Storage Layer with Service-Oriented Architecture 3 High Availability with Self-Healing
More informationProgramming model and implementation for processing and. Programs can be automatically parallelized and executed on a large cluster of machines
A programming model in Cloud: MapReduce Programming model and implementation for processing and generating large data sets Users specify a map function to generate a set of intermediate key/value pairs
More informationCryptographic Systems
CPSC 426/526 Cryptographic Systems Ennan Zhai Computer Science Department Yale University Recall: Lec-10 In lec-10, we learned: - Consistency models - Two-phase commit - Consensus - Paxos Lecture Roadmap
More informationPracticeDump. Free Practice Dumps - Unlimited Free Access of practice exam
PracticeDump http://www.practicedump.com Free Practice Dumps - Unlimited Free Access of practice exam Exam : AWS-Developer Title : AWS Certified Developer - Associate Vendor : Amazon Version : DEMO Get
More informationBERLIN. 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved
BERLIN 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved Amazon Aurora: Amazon s New Relational Database Engine Carlos Conde Technology Evangelist @caarlco 2015, Amazon Web Services,
More informationIntroduction to data centers
Introduction to data centers Paolo Giaccone Notes for the class on Switching technologies for data centers Politecnico di Torino December 2017 Cloud computing Section 1 Cloud computing Giaccone (Politecnico
More informationThere Is More Consensus in Egalitarian Parliaments
There Is More Consensus in Egalitarian Parliaments Iulian Moraru, David Andersen, Michael Kaminsky Carnegie Mellon University Intel Labs Fault tolerance Redundancy State Machine Replication 3 State Machine
More informationMoving to the Cloud: Making It Happen With MarkLogic
Moving to the Cloud: Making It Happen With MarkLogic Alex Bleasdale, Manager, Support, MarkLogic COPYRIGHT 20 April 2017 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED. Agenda 1. MarkLogic in the cloud 2.
More informationAmazon Aurora Deep Dive
Amazon Aurora Deep Dive Anurag Gupta VP, Big Data Amazon Web Services April, 2016 Up Buffer Quorum 100K to Less Proactive 1/10 15 caches Custom, Shared 6-way Peer than read writes/second Automated Pay
More informationAWS: Basic Architecture Session SUNEY SHARMA Solutions Architect: AWS
AWS: Basic Architecture Session SUNEY SHARMA Solutions Architect: AWS suneys@amazon.com AWS Core Infrastructure and Services Traditional Infrastructure Amazon Web Services Security Security Firewalls ACLs
More informationHow we built a highly scalable Machine Learning platform using Apache Mesos
How we built a highly scalable Machine Learning platform using Apache Mesos Daniel Sârbe Development Manager, BigData and Cloud Machine Translation @ SDL Co-founder of BigData/DataScience Meetup Cluj,
More informationCISC 7610 Lecture 2b The beginnings of NoSQL
CISC 7610 Lecture 2b The beginnings of NoSQL Topics: Big Data Google s infrastructure Hadoop: open google infrastructure Scaling through sharding CAP theorem Amazon s Dynamo 5 V s of big data Everyone
More informationLessons learned while automating MySQL in the AWS cloud. Stephane Combaudon DB Engineer - Slice
Lessons learned while automating MySQL in the AWS cloud Stephane Combaudon DB Engineer - Slice Our environment 5 DB stacks Data volume ranging from 30GB to 2TB+. Master + N slaves for each stack. Master
More informationMechanisms and Architectures for Tail-Tolerant System Operations in Cloud
Mechanisms and Architectures for Tail-Tolerant System Operations in Cloud Qinghua Lu 1,2, Liming Zhu 2, Xiwei Xu 2, Len Bass 2, Shanshan Li 1, Weishan Zhang 1, Ning Wang 1 1 College of Computer and Communication
More informationChoosing a MySQL HA Solution Today. Choosing the best solution among a myriad of options
Choosing a MySQL HA Solution Today Choosing the best solution among a myriad of options Questions...Questions...Questions??? How to zero in on the right solution You can t hit a target if you don t have
More informationDesigning Fault-Tolerant Applications
Designing Fault-Tolerant Applications Miles Ward Enterprise Solutions Architect Building Fault-Tolerant Applications on AWS White paper published last year Sharing best practices We d like to hear your
More informationSpecPaxos. James Connolly && Harrison Davis
SpecPaxos James Connolly && Harrison Davis Overview Background Fast Paxos Traditional Paxos Implementations Data Centers Mostly-Ordered-Multicast Network layer Speculative Paxos Protocol Application layer
More informationCS 425 / ECE 428 Distributed Systems Fall 2017
CS 425 / ECE 428 Distributed Systems Fall 2017 Indranil Gupta (Indy) Nov 7, 2017 Lecture 21: Replication Control All slides IG Server-side Focus Concurrency Control = how to coordinate multiple concurrent
More informationDesigning Distributed Systems using Approximate Synchrony in Data Center Networks
Designing Distributed Systems using Approximate Synchrony in Data Center Networks Dan R. K. Ports Jialin Li Naveen Kr. Sharma Vincent Liu Arvind Krishnamurthy University of Washington CSE Today s most
More informationKillTest *KIJGT 3WCNKV[ $GVVGT 5GTXKEG Q&A NZZV ]]] QORRZKYZ IUS =K ULLKX LXKK [VJGZK YKX\OIK LUX UTK _KGX
KillTest Q&A Exam : AWS-SysOps Title : AWS Certified SysOps Administrator Associate Version : Demo 1 / 4 1.A user has created photo editing software and hosted it on EC2. The software accepts requests
More informationAbout Intellipaat. About the Course. Why Take This Course?
About Intellipaat Intellipaat is a fast growing professional training provider that is offering training in over 150 most sought-after tools and technologies. We have a learner base of 600,000 in over
More informationIntuitive distributed algorithms. with F#
Intuitive distributed algorithms with F# Natallia Dzenisenka Alena Hall @nata_dzen @lenadroid A tour of a variety of intuitivedistributed algorithms used in practical distributed systems. and how to prototype
More informationZooKeeper Atomic Broadcast
ZooKeeper Atomic Broadcast The heart of the ZooKeeper coordination service Benjamin Reed, Flavio Junqueira Yahoo! Research ZooKeeper Service Transforms a request into an idempotent transaction Request
More informationAurora, RDS, or On-Prem, Which is right for you
Aurora, RDS, or On-Prem, Which is right for you Kathy Gibbs Database Specialist TAM Katgibbs@amazon.com Santa Clara, California April 23th 25th, 2018 Agenda RDS Aurora EC2 On-Premise Wrap-up/Recommendation
More informationSimple testing can prevent most critical failures -- An analysis of production failures in distributed data-intensive systems
Simple testing can prevent most critical failures -- An analysis of production failures in distributed data-intensive systems Ding Yuan, Yu Luo, Xin Zhuang, Guilherme Rodrigues, Xu Zhao, Yongle Zhang,
More informationMIGRATING SAP WORKLOADS TO AWS CLOUD
MIGRATING SAP WORKLOADS TO AWS CLOUD Cloud is an increasingly credible and powerful infrastructure alternative for critical business applications. It s a great way to avoid capital expenses and maintenance
More informationAmazon Aurora Deep Dive
Amazon Aurora Deep Dive Enterprise-class database for the cloud Damián Arregui, Solutions Architect, AWS October 27 th, 2016 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Enterprise
More informationConsistency in Distributed Systems
Consistency in Distributed Systems Recall the fundamental DS properties DS may be large in scale and widely distributed 1. concurrent execution of components 2. independent failure modes 3. transmission
More informationHighly Available Database Architectures in AWS. Santa Clara, California April 23th 25th, 2018 Mike Benshoof, Technical Account Manager, Percona
Highly Available Database Architectures in AWS Santa Clara, California April 23th 25th, 2018 Mike Benshoof, Technical Account Manager, Percona Hello, Percona Live Attendees! What this talk is meant to
More informationGetting Started with Amazon EC2 and Amazon SQS
Getting Started with Amazon EC2 and Amazon SQS Building Scalable, Reliable Amazon EC2 Applications with Amazon SQS Overview Amazon Elastic Compute Cloud (EC2) is a web service that provides resizable compute
More informationDistributed Systems COMP 212. Lecture 18 Othon Michail
Distributed Systems COMP 212 Lecture 18 Othon Michail Virtualisation & Cloud Computing 2/27 Protection rings It s all about protection rings in modern processors Hardware mechanism to protect data and
More informationProject Presentation
Project Presentation Saad Arif Dept. of Electrical Engineering and Computer Science University of Central Florida - Orlando, FL November 7, 2013 1 Introduction 1 Introduction 2 Gallery 1 Introduction 2
More informationUseNet and Gossip Protocol
CPSC 426/526 UseNet and Gossip Protocol Ennan Zhai Computer Science Department Yale University Recall: Lec-1 Understanding: - Distributed systems vs. decentralized systems - Why we need both? red P2P network
More informationTroubleshooting VoIP in Converged Networks
Troubleshooting VoIP in Converged Networks Terry Slattery Principal Consultant CCIE #1026 1 Objective Provide examples of common problems Troubleshooting tips What to monitor Remediation Tips you can use
More informationAmazon Web Services. Block 402, 4 th Floor, Saptagiri Towers, Above Pantaloons, Begumpet Main Road, Hyderabad Telangana India
(AWS) Overview: AWS is a cloud service from Amazon, which provides services in the form of building blocks, these building blocks can be used to create and deploy various types of application in the cloud.
More informationDistributed Systems CS6421
Distributed Systems CS6421 Intro to Distributed Systems and the Cloud Prof. Tim Wood v I teach: Software Engineering, Operating Systems, Sr. Design I like: distributed systems, networks, building cool
More informationMapReduce & BigTable
CPSC 426/526 MapReduce & BigTable Ennan Zhai Computer Science Department Yale University Lecture Roadmap Cloud Computing Overview Challenges in the Clouds Distributed File Systems: GFS Data Process & Analysis:
More informationMicroservices at Netflix Scale. First Principles, Tradeoffs, Lessons Learned Ruslan
Microservices at Netflix Scale First Principles, Tradeoffs, Lessons Learned Ruslan Meshenberg @rusmeshenberg Microservices: all benefits, no costs? Netflix is the world s leading Internet television network
More informationCIT 668: System Architecture. Amazon Web Services
CIT 668: System Architecture Amazon Web Services Topics 1. AWS Global Infrastructure 2. Foundation Services 1. Compute 2. Storage 3. Database 4. Network 3. AWS Economics Amazon Services Architecture Regions
More informationDeploy. A step-by-step guide to successfully deploying your new app with the FileMaker Platform
Deploy A step-by-step guide to successfully deploying your new app with the FileMaker Platform Share your custom app with your team! Now that you ve used the Plan Guide to define your custom app requirements,
More informationExam Distributed Systems
Exam Distributed Systems 5 February 2010, 9:00am 12:00pm Part 2 Prof. R. Wattenhofer Family Name, First Name:..................................................... ETH Student ID Number:.....................................................
More informationHosting DesktopNow in Amazon Web Services. Ivanti DesktopNow powered by AppSense
Hosting DesktopNow in Amazon Web Services Ivanti DesktopNow powered by AppSense Contents Purpose of this Document... 3 Overview... 3 1 Non load balanced Amazon Web Services Environment... 4 Amazon Web
More informationApplications of Paxos Algorithm
Applications of Paxos Algorithm Gurkan Solmaz COP 6938 - Cloud Computing - Fall 2012 Department of Electrical Engineering and Computer Science University of Central Florida - Orlando, FL Oct 15, 2012 1
More informationAlwaysOn Availability Groups: Backups, Restores, and CHECKDB
AlwaysOn Availability Groups: Backups, Restores, and CHECKDB www.brentozar.com sp_blitz sp_blitzfirst email newsletter videos SQL Critical Care 2016 Brent Ozar Unlimited. All rights reserved. 1 What I
More informationWHITEPAPER AMAZON ELB: Your Master Key to a Secure, Cost-Efficient and Scalable Cloud.
WHITEPAPER AMAZON ELB: Your Master Key to a Secure, Cost-Efficient and Scalable Cloud www.cloudcheckr.com TABLE OF CONTENTS Overview 3 What Is ELB? 3 How ELB Works 4 Classic Load Balancer 5 Application
More informationWrite On Aws. Aws Tools For Windows Powershell User Guide using the aws tools for windows powershell (p. 19) this section includes information about
We have made it easy for you to find a PDF Ebooks without any digging. And by having access to our ebooks online or by storing it on your computer, you have convenient answers with write on aws. To get
More informationSecurity and Compliance at Mavenlink
Security and Compliance at Mavenlink Table of Contents Introduction....3 Application Security....4....4....5 Infrastructure Security....8....8....8....9 Data Security.... 10....10....10 Infrastructure
More informationRA-GRS, 130 replication support, ZRS, 130
Index A, B Agile approach advantages, 168 continuous software delivery, 167 definition, 167 disadvantages, 169 sprints, 167 168 Amazon Web Services (AWS) failure, 88 CloudTrail Service, 21 CloudWatch Service,
More informationCS / Cloud Computing. Recitation 9 October 22 nd and 25 th, 2013
CS15-319 / 15-619 Cloud Computing Recitation 9 October 22 nd and 25 th, 2013 Announcements Encounter a general bug: Post on Piazza Encounter a grading bug: Post Privately on Piazza Don t ask if my answer
More informationCLUSTERING HIVEMQ. Building highly available, horizontally scalable MQTT Broker Clusters
CLUSTERING HIVEMQ Building highly available, horizontally scalable MQTT Broker Clusters 12/2016 About this document MQTT is based on a publish/subscribe architecture that decouples MQTT clients and uses
More informationAmazon Aurora Deep Dive
Amazon Aurora Deep Dive Kevin Jernigan, Sr. Product Manager Amazon Aurora PostgreSQL Amazon RDS for PostgreSQL May 18, 2017 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Agenda
More informationHadoop/MapReduce Computing Paradigm
Hadoop/Reduce Computing Paradigm 1 Large-Scale Data Analytics Reduce computing paradigm (E.g., Hadoop) vs. Traditional database systems vs. Database Many enterprises are turning to Hadoop Especially applications
More informationCourse Information. Course Basics. Course Format. Arvind Krishnamurthy. Instructor: Arvind Krishnamurthy. TA: Naveen Sharma
Course Information CSE 550: Introduction to Computer Systems Research Arvind Krishnamurthy Instructor: Arvind Krishnamurthy Interests: distributed systems, networks, operating systems, security Email,
More informationPROTECT YOUR DATA FROM MALWARE AND ENSURE BUSINESS CONTINUITY ON THE CLOUD WITH NAVLINK MANAGED AMAZON WEB SERVICES MANAGED AWS
PROTECT YOUR DATA FROM MALWARE AND ENSURE BUSINESS CONTINUITY ON THE CLOUD WITH NAVLINK MANAGED AMAZON WEB SERVICES MANAGED AWS Improved performance Faster go-to-market Better security In today s disruptive
More informationPega 7 Webinar Series
Pega 7 Webinar Series Achieve the Highest Levels of Availability, Scalability and Performance Mark Replogle Ken Schwarz October 16, 2013 Pega 7 Webinar Series 01 The Power to Simplify Pega 7 Overview Thursday,
More informationCloud & container monitoring , Lars Michelsen Check_MK Conference #4
Cloud & container monitoring 04.05.2018, Lars Michelsen Some cloud definitions Applications Data Runtime Middleware O/S Virtualization Servers Storage Networking Software-as-a-Service (SaaS) Applications
More information2013 AWS Worldwide Public Sector Summit Washington, D.C.
2013 AWS Worldwide Public Sector Summit Washington, D.C. EMR for Fun and for Profit Ben Butler Sr. Manager, Big Data butlerb@amazon.com @bensbutler Overview 1. What is big data? 2. What is AWS Elastic
More informationIntroduction to Database Services
Introduction to Database Services Shaun Pearce AWS Solutions Architect 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved Today s agenda Why managed database services? A non-relational
More informationCase Study Virtual Expo. Virtual Expo, a leader in online B2B exhibitions, used Varnish Plus Cloud to build and enhance their own in-house CDN
Case Study Virtual Expo Virtual Expo, a leader in online B2B exhibitions, used Varnish Plus Cloud to build and enhance their own in-house CDN Varnish Plus Cloud on AWS allows flexibility and permits getting
More informationPersistent Storage with Docker in production - Which solution and why?
Persistent Storage with Docker in production - Which solution and why? Cheryl Hung 2013-2017 StorageOS Ltd. All rights reserved. Cheryl 2013-2017 StorageOS Ltd. All rights reserved. 2 Why do I need storage?
More informationAbstractions for Model Checking SDN Controllers. Divjyot Sethi, Srinivas Narayana, Prof. Sharad Malik Princeton University
Abstractions for Model Checking SDN s Divjyot Sethi, Srinivas Narayana, Prof. Sharad Malik Princeton University Traditional Networking Swt 1 Swt 2 Talk OSPF, RIP, BGP, etc. Swt 3 Challenges: - Difficult
More informationSecurity of End User based Cloud Services Sang Young
Security of End User based Cloud Services Sang Young Chairman, Mobile SIG Professional Information Security Association sang.young@pisa.org.hk Cloud Services you can choose Social Media Business Applications
More informationIntroduction to Cloud Computing
You will learn how to: Build and deploy cloud applications and develop an effective implementation strategy Leverage cloud vendors Amazon EC2 and Amazon S3 Exploit Software as a Service (SaaS) to optimize
More informationData Protection Done Right with Dell EMC and AWS
Data Protection Done Right with Dell EMC and AWS There is a growing need for superior data protection Introduction If you re like most businesses, your data is continually growing in speed and scale, and
More informationMapR Enterprise Hadoop
2014 MapR Technologies 2014 MapR Technologies 1 MapR Enterprise Hadoop Top Ranked Cloud Leaders 500+ Customers 2014 MapR Technologies 2 Key MapR Advantage Partners Business Services APPLICATIONS & OS ANALYTICS
More informationHyperconverged Cloud Architecture with OpenNebula and StorPool
Hyperconverged Cloud Architecture with OpenNebula and StorPool Version 1.0, January 2018 Abstract The Hyperconverged Cloud Architecture with OpenNebula and StorPool is a blueprint to aid IT architects,
More informationTraining on Amazon AWS Cloud Computing. Course Content
Training on Amazon AWS Cloud Computing Course Content 15 Amazon Web Services (AWS) Cloud Computing 1) Introduction to cloud computing Introduction to Cloud Computing Why Cloud Computing? Benefits of Cloud
More informationHigh Availability and Disaster Recovery. Cherry Lin, Jonathan Quinn
High Availability and Disaster Recovery Cherry Lin, Jonathan Quinn Managing the Twin Risks to your Operations Data Loss Down Time The Three Approaches Backups High Availability Disaster Recovery Geographic
More informationHow to backup Ceph at scale. FOSDEM, Brussels,
How to backup Ceph at scale FOSDEM, Brussels, 2018.02.04 About me Bartłomiej Święcki OVH Wrocław, PL Current job: More Ceph awesomeness Speedlight Ceph intro Open-source Network storage Scalable Reliable
More informationvsphere Replication for Disaster Recovery to Cloud
vsphere Replication for Disaster Recovery to Cloud vsphere Replication 5.6 This document supports the version of each product listed and supports all subsequent versions until the document is replaced
More informationAmazon Web Services (AWS) Solutions Architect Intermediate Level Course Content
Amazon Web Services (AWS) Solutions Architect Intermediate Level Course Content Introduction to Cloud Computing A Short history Client Server Computing Concepts Challenges with Distributed Computing Introduction
More informationCLOUD COMPUTING. A public cloud sells services to anyone on the Internet. The cloud infrastructure is made available to
CLOUD COMPUTING In the simplest terms, cloud computing means storing and accessing data and programs over the Internet instead of your computer's hard drive. The cloud is just a metaphor for the Internet.
More informationEmbracing Failure. Fault Injec,on and Service Resilience at Ne6lix. Josh Evans Director of Opera,ons Engineering, Ne6lix
Embracing Failure Fault Injec,on and Service Resilience at Ne6lix Josh Evans Director of Opera,ons Engineering, Ne6lix Josh Evans 24 years in technology Tech support, Tools, Test Automa,on, IT & QA Management
More informationAmazon Web Services Training. Training Topics:
Amazon Web Services Training Training Topics: SECTION1: INTRODUCTION TO CLOUD COMPUTING A Short history Client Server Computing Concepts Challenges with Distributed Computing Introduction to Cloud Computing
More informationEverything You Need to Know About MySQL Group Replication
Everything You Need to Know About MySQL Group Replication Luís Soares (luis.soares@oracle.com) Principal Software Engineer, MySQL Replication Lead Copyright 2017, Oracle and/or its affiliates. All rights
More informationDistributed systems. Lecture 6: distributed transactions, elections, consensus and replication. Malte Schwarzkopf
Distributed systems Lecture 6: distributed transactions, elections, consensus and replication Malte Schwarzkopf Last time Saw how we can build ordered multicast Messages between processes in a group Need
More informationDistributed Systems. 10. Consensus: Paxos. Paul Krzyzanowski. Rutgers University. Fall 2017
Distributed Systems 10. Consensus: Paxos Paul Krzyzanowski Rutgers University Fall 2017 1 Consensus Goal Allow a group of processes to agree on a result All processes must agree on the same value The value
More informationUnlimited Scalability in the Cloud A Case Study of Migration to Amazon DynamoDB
Unlimited Scalability in the Cloud A Case Study of Migration to Amazon DynamoDB Steve Saporta CTO, SpinCar Mar 19, 2016 SpinCar When a web-based business grows... More customers = more transactions More
More informationMySQL Enterprise High Availability
MySQL Enterprise High Availability A Reference Guide A MySQL White Paper 2018, Oracle Corporation and/or its affiliates Table of Contents MySQL Enterprise High Availability... 1 A Reference Guide... 1
More informationContainer Orchestration on Amazon Web Services. Arun
Container Orchestration on Amazon Web Services Arun Gupta, @arungupta Docker Workflow Development using Docker Docker Community Edition Docker for Mac/Windows/Linux Monthly edge and quarterly stable
More informationProtecting remote site data SvSAN clustering - failure scenarios
White paper Protecting remote site data SvSN clustering - failure scenarios Service availability and data integrity are key metrics for enterprises that run business critical applications at multiple remote
More informationAWS Solutions Architect Associate (SAA-C01) Sample Exam Questions
1) A company is storing an access key (access key ID and secret access key) in a text file on a custom AMI. The company uses the access key to access DynamoDB tables from instances created from the AMI.
More informationAutomated Upgrades of PostgreSQL Clusters in Cloud
Gülçin Yıldırım Automated Upgrades of PostgreSQL Clusters in Cloud 1 select * from me; Site Reliability Engineer @ 2ndQuadrant Board Member @ PostgreSQL Europe MSc Comp. & Systems Eng. @ Tallinn University
More informationFault-Tolerant Computer System Design ECE 695/CS 590. Putting it All Together
Fault-Tolerant Computer System Design ECE 695/CS 590 Putting it All Together Saurabh Bagchi ECE/CS Purdue University ECE 695/CS 590 1 Outline Looking at some practical systems that integrate multiple techniques
More information