CSE 124: Networked Services Lecture-15

1 Fall 2010 CSE 124: Networked Services Lecture-15 Instructor: B. S. Manoj, Ph.D 11/18/2010 CSE 124 Networked Services Fall

2 Updates
- Signup sheet for PlanetLab experiment
  - Signup deadline is today
- Project-2 idea finalization
  - Sample idea: SuperProxy
    - A web proxy service that lets you download files at higher speeds
    - Client->Proxy: parallelizes the connection over multiple sockets
    - Proxy->Server: caches data, or downloads
    - Free service
    - Ad insertion capability for revenue generation
      - Ads must be inserted in a non-invasive manner
      - Ads may be from third parties
    - Users will visit the service URL and use it for all their Internet needs
  - Presentation/Demo deadline: last lecture class (December 2nd, 2010)

3 Giant-Scale Services
- From "Lessons from Giant-Scale Services" by Eric A. Brewer, UC Berkeley and formerly with Inktomi Corporation

4 Why Giant Scale Services?
- Access anywhere, any time
  - Home, office, coffee shop, airport, etc.
- Available via multiple devices
  - Computers, smartphones, set-top boxes, etc.
- Groupware support
  - Centralization of services helps
  - Calendars, e-vite, etc.
- Lower overall cost
  - End-user device utilization: 4%
  - Infrastructure resources: 80%
  - Fundamental cost advantage over stand-alone applications
- Simplified service update
  - Most powerful long-term advantage
  - No physical distribution of software or hardware is necessary

5 Key assumptions
- Service provider has limited control over
  - Clients
  - Network (except the intranet)
- Queries drive the traffic
  - Web or database queries
  - E.g., HTTP, FTP, or RPC
- Read-only queries greatly outnumber data updates
  - Reads dominate writes
  - Product evaluations vs. purchases
  - Stock quotes vs. trading

6 Basic Model of Giant Scale Services
- Clients
  - Web browsers, clients, XML programs
- The best-effort IP network
  - Access to the service
- The load manager
  - Level of indirection to balance the load
  - Shields clients from server faults
- Servers
  - System workers
  - Combine CPU, memory, and disks
- Persistent data store
  - Replicated or partitioned database
  - Spread across the servers' disks
  - May include network-attached storage, DBMSs, or RAIDs
- Services backplane
  - Optional system-area network
  - Handles inter-server traffic

7 Clusters
- Clusters in Giant-Scale Servers
  - Collections of commodity servers
- Main benefits
  - Absolute scalability
    - Many new services must serve a fraction of the world's population
  - Cost and performance
    - Compelling reason for clusters
    - Bandwidth and operational costs dwarf hardware cost
  - Independent components
    - Help in handling faults
  - Incremental scalability
    - Helps handle the uncertainty and expense of growing the service
    - Typically a 3-year depreciation lifetime
    - A unit of rack space quadruples in computing power over that period

8 Load Management
- Simple load management strategy: round-robin DNS
  - DNS hands out the servers' IP addresses in rotation
  - It does not hide down/inactive servers
  - Short time-to-live is an option
    - Many browsers mishandle expired DNS info
- Layer-4/Layer-7 switches
  - Transport/application-layer switches
  - Process higher-layer info at wire speed
  - Help fault tolerance
  - High throughput: 20 Gbps
  - Can detect down nodes by connection status

9 Load Management (contd.)
- Layer-4 switches
  - Understand TCP and port numbers and use them to route
- Service-specific layer-7 switches
  - A user: Walmart
  - Can track session information
- Smart client
  - End-to-end approach
  - Uses alternative server info (DNS)
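As a rough illustration of why a load manager that tracks node status helps where plain round-robin DNS does not, the sketch below rotates over a server list and skips any server marked down. The class `RoundRobinBalancer`, its methods, and the server names are hypothetical and not from the lecture; a real layer-4 switch would infer node status from connection state rather than from explicit calls.

```python
class RoundRobinBalancer:
    """Hypothetical load manager: round-robin over servers, skipping down ones."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._down = set()
        self._next = 0  # index of the next server to try

    def mark_down(self, server):
        """Record that a server stopped responding."""
        self._down.add(server)

    def mark_up(self, server):
        """Record that a server is healthy again."""
        self._down.discard(server)

    def pick(self):
        """Return the next active server in rotation."""
        for _ in range(len(self._servers)):
            server = self._servers[self._next]
            self._next = (self._next + 1) % len(self._servers)
            if server not in self._down:
                return server
        raise RuntimeError("no active servers")


balancer = RoundRobinBalancer(["web1", "web2", "web3"])
balancer.mark_down("web2")
# Only active servers are handed out while web2 is down.
print([balancer.pick() for _ in range(4)])
```

Plain round-robin DNS corresponds to the same rotation without the `_down` check, which is why clients keep being sent to failed servers until the records change.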

10 Key Requirements of Giant Scale Services
- High availability
  - Much like other communication services
    - Always available: water, electricity, telephone, etc.
  - To handle
    - Component failures
    - Natural disasters
    - Growth and evolution
- Design points (reduce failures)
  - Symmetry
  - Internal disks
  - No people, wires, monitors
  - Offsite clusters
    - Contracts limit temperature and power variations

11 Availability Metrics
- Uptime: fraction of time the site is available (e.g., 0.9999 = 99.99%, i.e., 8.64 seconds of downtime per day)
- MTBF: mean time between failures
- MTTR: mean time to repair
- Two ways to improve uptime: increase MTBF or reduce MTTR
  - MTBF is hard to improve and to verify
  - MTTR reduction is preferred
    - Total time required to verify an improvement is less for MTTR
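The arithmetic behind these metrics can be sketched as follows, assuming the common definition uptime = (MTBF - MTTR) / MTBF, which the slide does not state explicitly. The function names and the sample MTBF/MTTR values are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def uptime(mtbf_hours, mttr_hours):
    """Fraction of time the site is up, assuming (MTBF - MTTR) / MTBF."""
    if mtbf_hours <= 0 or not 0 <= mttr_hours <= mtbf_hours:
        raise ValueError("need MTBF > 0 and 0 <= MTTR <= MTBF")
    return (mtbf_hours - mttr_hours) / mtbf_hours


def downtime_seconds_per_day(uptime_fraction):
    """Average downtime per day implied by an uptime fraction."""
    return (1.0 - uptime_fraction) * SECONDS_PER_DAY


# 99.99% uptime leaves roughly 8.64 seconds of downtime per day.
print(round(downtime_seconds_per_day(0.9999), 2))

# Halving MTTR and doubling MTBF yield the same uptime under this definition.
print(uptime(1000, 0.5), uptime(2000, 1))
```

Under this definition the two improvements are equivalent on paper; the slide's argument for preferring MTTR is that a shorter repair time can be verified far sooner than a longer time between failures.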

12 Availability Metrics
- Yield: used as an availability metric, not as a throughput metric
  - Similar to uptime, but translates to user experience
  - Not all seconds have equal value
    - A second lost when there are no queries has little impact
    - A second lost during peak hour is really an issue
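To make the point that not all seconds have equal value concrete, here is a minimal sketch assuming yield is measured as queries completed divided by queries offered (the definition used in Brewer's paper; the slide does not spell it out). The hourly load profile is made up for illustration.

```python
def yield_fraction(offered, completed):
    """Yield = queries completed / queries offered (assumed definition)."""
    total_offered = sum(offered)
    if total_offered == 0:
        raise ValueError("no queries were offered")
    return sum(completed) / total_offered


def yield_with_outage(hourly_offered, outage_hour):
    """Yield for one day when every query in a single hour is lost."""
    completed = [0 if hour == outage_hour else queries
                 for hour, queries in enumerate(hourly_offered)]
    return yield_fraction(hourly_offered, completed)


# Hypothetical day: quiet night, busy daytime, quiet evening.
HOURLY_LOAD = [100] * 8 + [1000] * 8 + [100] * 8

off_peak = yield_with_outage(HOURLY_LOAD, outage_hour=2)
peak = yield_with_outage(HOURLY_LOAD, outage_hour=12)
# Both outages cost the same uptime (23/24) but very different yield.
print(f"off-peak outage: {off_peak:.4f}, peak outage: {peak:.4f}")
```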

13 Availability Metrics
- Harvest: a query may be answered partially or fully
  - Harvest measures how much of the information is returned
  - Can help ensure user satisfaction while handling faults
    - Inbox loads, but task list or contacts do not
    - eBay auction info loads, but not the user profile
- Key point: we can control how faults affect yield, harvest, or both
  - Total capacity remains the same
- A fault
  - In a replicated system: reduced yield
  - In a partitioned system: reduced harvest

14 DQ Principle
- Data per query x Queries per second -> constant
- A useful metric for giant-scale systems
  - Represents a physical capacity bottleneck
    - Max I/O bandwidth or total disk seeks/second
  - At high utilization a giant-scale system approaches the constant
- Includes all overhead
  - Data copying, presentation layers, and network-bound issues
- Each node has a different DQ value
- The relative impact of faults on the DQ value is easy to measure
  - Relative rather than absolute, because different systems have different DQ values

15 DQ Value (contd.)
- Overall DQ value scales linearly with the number of nodes
  - Helps in sampling the behavior of the entire system
  - A small test cluster can predict the behavior of the entire system
  - Inktomi: 4-node clusters are used to predict the impact of software updates on 100-node clusters
- DQ impact must be evaluated before any proposed HW/SW change
- DQ reduction is linear in the number of faults
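The linear-scaling claim can be sketched as simple proportional arithmetic. The function names and DQ figures below are illustrative assumptions; real DQ values would come from measurement on the test cluster.

```python
def predict_cluster_dq(test_cluster_dq, test_nodes, target_nodes):
    """Extrapolate a measured DQ value linearly to a larger cluster."""
    if test_nodes <= 0 or target_nodes < 0:
        raise ValueError("node counts must be positive")
    return test_cluster_dq / test_nodes * target_nodes


def dq_after_faults(total_dq, nodes, failed_nodes):
    """DQ remaining when failed_nodes of nodes are down (linear reduction)."""
    if not 0 <= failed_nodes <= nodes:
        raise ValueError("failed_nodes must be between 0 and nodes")
    return total_dq * (nodes - failed_nodes) / nodes


# Hypothetical measurement: a 4-node test cluster sustains 400 DQ units.
full_dq = predict_cluster_dq(400.0, test_nodes=4, target_nodes=100)
print(full_dq)
# Ten failed nodes out of 100 remove 10% of the capacity.
print(dq_after_faults(full_dq, nodes=100, failed_nodes=10))
```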

16 DQ Value (contd.)
- Future demand
  - The DQ addition required must be estimated
- Fault impact and DQ
  - DQ degrades linearly with the number of faults (failed nodes)
- DQ may be handled differently
  - Data-intensive services
    - DQ applies mostly to this category
    - Database access
    - Majority of the top 100 sites are data intensive
  - Computation-intensive services
    - Simulation or supercomputing
  - Communication-intensive services
    - Chat, news, or VoIP
- How DQ impacts
  - Uptime
  - Yield
  - Harvest

17 Replication vs. Partitioning
- Replication
  - Traditional method for improving availability
  - How a fault affects DQ under replication
    - E.g., two-node cluster, one fault
    - Harvest: 100%; yield: 50%
    - Maintains D and reduces Q
- Partitioning
  - How a fault affects DQ under partitioning
    - E.g., two-node cluster, one fault
    - Harvest: 50%; yield: 100%
    - Reduces D and maintains Q
- DQ drops to 50% in both cases
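The two-node example generalizes to n nodes with k faults. The sketch below assumes a system running at full utilization with no spare capacity and identical nodes; the `Degradation` type and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Degradation:
    """Harvest and yield as fractions of their fault-free values."""
    harvest: float
    yield_: float

    @property
    def dq(self):
        """Remaining DQ capacity as a fraction of the fault-free value."""
        return self.harvest * self.yield_


def replicated(nodes, failed):
    """Replication keeps all data (D) but loses query capacity (Q)."""
    return Degradation(harvest=1.0, yield_=(nodes - failed) / nodes)


def partitioned(nodes, failed):
    """Partitioning keeps query capacity (Q) but loses part of the data (D)."""
    return Degradation(harvest=(nodes - failed) / nodes, yield_=1.0)


# The slide's example: two nodes, one fault.
print(replicated(2, 1), replicated(2, 1).dq)
print(partitioned(2, 1), partitioned(2, 1).dq)
```

Either way the remaining DQ capacity is the same; the design choice is only whether the loss shows up as rejected queries or as incomplete answers.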

18 Load Redirection and Replication
- Traditional replication provisions excess capacity
- Load redirection is required on faults
  - Replicas take over the load handled by the failed nodes
  - Hard to achieve under high utilization
- k failures out of n nodes redirect an extra k/(n-k) load onto each of the remaining n-k nodes
  - Losing 2 out of 5 nodes implies a redirected load of 2/3 and an overload factor of 5/3 (about 167%)
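The redirection arithmetic can be checked with exact fractions. This is a small sketch with illustrative function names, assuming the failed nodes' load is spread evenly over the survivors.

```python
from fractions import Fraction


def redirected_load(n, k):
    """Extra load per surviving node when k of n nodes fail: k / (n - k)."""
    if not 0 <= k < n:
        raise ValueError("need 0 <= k < n")
    return Fraction(k, n - k)


def overload_factor(n, k):
    """Total load per surviving node relative to normal: n / (n - k)."""
    return 1 + redirected_load(n, k)


# The slide's example: 2 of 5 nodes fail.
print(redirected_load(5, 2), overload_factor(5, 2))
```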

19 Replication and DQ
- Replicating disks is cheap
  - Storage is cheap, but processing is not
  - But DQ points are required to access the data
- Partitioning has no real savings over replication (in terms of DQ points)
  - The same DQ points are needed
  - In some rare cases, replication can demand more DQ points

20 Replication and DQ (contd.)
- Replication and partitioning can be combined for better control over availability
  - Partition the data first to a suitable size
  - Replicate based on the importance of the data
  - Easier to grow the system via replication than via partitioning
- Replication can be based on the data's importance
  - Determines which data is lost in the event of a fault
  - Replication of key data
    - At the cost of some extra disks
  - A fault can still result in 1/n data loss, but of lesser importance
- Replication can be made random
  - The lost harvest is a random subset of the data
  - Avoids hotspots in the partitions
- Search (Inktomi): partial replication
- system: full replication
- User content: full replication
- Clustered Web: no replication

21 Graceful Degradation
- Degradation under faults must be trouble free
- The need for graceful degradation is driven by
  - High peak-to-average ratio
    - 1.6:1 to 6:1, and even 10:1
  - Single-event bursts (flash crowds)
    - Movie ticket sales, football matches, breaking sensational news
  - Natural disasters and power failures
    - DQ can drop sharply
    - Can happen independently

22 Graceful degradation under faults
- The DQ principle gives new opportunities
  - Either maintain D and limit Q, or reduce D and maintain Q
- Admission control (AC)
  - Maintains D, reduces Q
  - Maintains harvest
- Dynamic database reduction (e.g., cut the data size by half)
  - Reduces D, maintains Q
  - Maintains yield
- Graceful degradation can be achieved to various degrees by combining the two
- Key question: how should saturation affect uptime, yield, harvest, and quality of service?

23 Admission control strategies
- Cost-based AC
  - Perform AC based on estimated query cost (in DQ)
  - Reduces the average D per query
  - Denying one expensive query can retain many inexpensive queries
  - Net gain in queries and harvest
  - Another method is probabilistic AC
    - Helps ensure that retried queries eventually succeed
    - Reduced yield, increased harvest
- Priority- or value-based AC
  - Datek handles stock queries differently from other queries
    - Queries will be executed within 60 seconds or they charge no commission
  - Drop low-valued queries and thus save DQ points
  - Reduced yield, increased harvest
- Reduced data freshness
  - When saturated, a financial site can make stock quotes expire less frequently
  - Reduces not only freshness but also the DQ requirement
  - Increased yield, reduced harvest
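One possible reading of cost-based admission control is a greedy policy that admits the cheapest queries first until a DQ budget is used up. The sketch below is only an illustration under that assumption; the function name, the cost estimates, and the budget are hypothetical, and the lecture does not prescribe a particular algorithm.

```python
def admit_by_cost(query_costs, dq_budget):
    """Return indices of admitted queries, cheapest first, within dq_budget.

    query_costs: estimated DQ cost of each pending query (hypothetical values).
    """
    admitted = []
    remaining = dq_budget
    # Visit queries in order of increasing estimated cost.
    for index, cost in sorted(enumerate(query_costs), key=lambda pair: pair[1]):
        if cost > remaining:
            break  # every later query costs at least as much
        admitted.append(index)
        remaining -= cost
    return sorted(admitted)


# One expensive query (cost 50) competes with four cheap ones (cost 5 each).
pending = [50, 5, 5, 5, 5]
# With a budget of 50, denying the expensive query admits all four cheap ones.
print(admit_by_cost(pending, dq_budget=50))
```

Admitting the expensive query alone would consume the whole budget for a single answer, which is the trade-off behind "denying one expensive query can retain many inexpensive queries".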

24 Disaster tolerance
- Disaster: complete loss of one or more replicas
  - Natural disasters can affect all the replicas in a geographical location
  - Fire or other local disasters affect only one replica
- Disaster tolerance deals with managing replica groups and graceful degradation for handling disasters
- Key questions: how many locations and how many replicas?
  - 2 replicas in each of 3 locations: 2/6 of the replicas are lost in a natural disaster
  - Each remaining location must handle 50% more traffic (6/4 = 1.5)
- Inktomi
  - Current approach: reduce D by 50% at the remaining locations
  - Best approach: reduce D to 2/3 and thereby increase Q by 3/2
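The disaster example follows from the DQ principle: if D x Q stays constant per replica, absorbing 3/2 of the normal query load requires D to shrink to 2/3 of its normal value. The function names below are illustrative, and the sketch assumes identical replicas with load spread evenly across them.

```python
from fractions import Fraction


def surviving_load_factor(locations, replicas_per_location, lost_locations=1):
    """Load multiplier on each surviving replica after losing whole locations."""
    total = locations * replicas_per_location
    remaining = (locations - lost_locations) * replicas_per_location
    if remaining <= 0:
        raise ValueError("no replicas survive")
    return Fraction(total, remaining)


def data_fraction_to_keep(load_factor):
    """Fraction of D per query that keeps D x Q constant at the higher Q."""
    return 1 / Fraction(load_factor)


factor = surviving_load_factor(locations=3, replicas_per_location=2)
# 6 replicas drop to 4, so each survivor sees 3/2 of its normal load
# and can sustain it by serving 2/3 of the usual data per query.
print(factor, data_fraction_to_keep(factor))
```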

25 Disaster Tolerance (contd.)
- Load management is another issue in disaster tolerance
  - When whole clusters fail, layer-4 switches do not help
  - DNS
    - Long failover response time (several hours)
  - Smart clients
    - Better suited to quick failovers (seconds to minutes)

26 Evolution and Growth
- Giant-scale services need to be frequently updated
  - Product revisions, software bug fixes, security updates, or addition of new services
- Some problems are hard to detect
  - Slow memory leaks
  - Non-deterministic bugs
- A continued growth plan is essential
- Online evolution process
  - Evolution with minimal downtime
- Because giant-scale services are frequently updated, aim for acceptable-quality software
  - Target MTBF
  - Minimal MTTR
  - No cascading failures

27 Online Evolution Process
- Each online evolution phase costs a certain amount of DQ points
- Total DQ loss for n nodes
  - n: number of nodes; u: upgrade time required per node
  - Total DQ loss = n x u x (average DQ per node) = DQ x u
- Upgrades
  - Software upgrades
    - Quick; new and old systems can coexist
    - Can be done by a controlled reboot within the MTTR
  - Hardware upgrades are harder

28 Upgrade approaches
- Fast reboot
  - Quickly reboot with the upgrades
  - Downtime cannot be avoided
  - Effect on yield can be contained by scheduling the reboot at off-peak hours
  - Staging area and automation are essential
  - Upgrades happen simultaneously

29 Upgrade approaches
- Rolling upgrades
  - Upgrade nodes one at a time, in a wave
  - One node is down at a time
  - Old and new systems may coexist
    - Compatibility between old and new systems is a must
  - Partitioned system
    - Harvest will be affected; yield is unaffected
  - Replicated system
    - Harvest and yield are unaffected
    - Upgrade happens one replica at a time
  - Still conducted at off-peak hours to avoid affecting the yield due to faults

30 Upgrade approaches
- Big flip
  - Most complicated among the three
  - Upgrade the cluster one half at a time
    - Switch off all traffic to one half, take it down, and upgrade it
    - Turn the upgraded half on, direct new traffic to it, and wait for old traffic in the to-be-upgraded half to complete
    - The upgraded half runs while the old half is taken down
  - One version (half) runs at a time
  - 50% DQ loss
    - Replicas: 50% loss of Q (yield)
    - Partitions: 50% loss of D (harvest)
  - Big flip is powerful
    - Hardware, OS, schema, networking, and physical relocation can all be done
    - Inktomi did it twice
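The three upgrade approaches can be compared by the DQ capacity they take offline over time. The sketch below assumes identical nodes, an even node count for the big flip, and an upgrade time of u per node; these simplifications and the function names are assumptions for illustration. Under them, all three approaches lose the same total DQ (n x u x DQ per node) and differ only in how the loss is spread over time.

```python
def dq_loss(nodes_down, duration, dq_per_node):
    """DQ lost while nodes_down nodes are offline for duration time units."""
    return nodes_down * duration * dq_per_node


def fast_reboot_loss(n, u, dq_per_node):
    """All n nodes are down at once for one upgrade period u."""
    return dq_loss(n, u, dq_per_node)


def rolling_upgrade_loss(n, u, dq_per_node):
    """One node is down at a time, for n consecutive upgrade periods."""
    return dq_loss(1, n * u, dq_per_node)


def big_flip_loss(n, u, dq_per_node):
    """Half the cluster is down at a time, for two upgrade periods."""
    if n % 2 != 0:
        raise ValueError("this sketch assumes an even number of nodes")
    return dq_loss(n // 2, 2 * u, dq_per_node)


# Hypothetical cluster: 100 nodes, 2 time units per upgrade, 10 DQ per node.
for approach in (fast_reboot_loss, rolling_upgrade_loss, big_flip_loss):
    print(approach.__name__, approach(100, 2, 10))
```

The choice between approaches therefore rests on how much capacity is missing at any one moment and on whether old and new versions can coexist, not on total DQ cost.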

31 Basics of Giant scale services: Summary
- Get the basics right
  - Use symmetry to simplify analysis and management
- Decide on availability metrics
  - Yield and harvest are more important than uptime
- Focus on MTTR at least as much as MTBF
  - Repair time is easier to affect (and to control) for an evolving system
- Understand load redirection during faults
  - Replication alone is insufficient; the higher DQ demand must be considered
- Graceful degradation
  - Intelligent admission control and dynamic database reduction can help
- Use DQ analysis for all upgrades
  - Evaluate all upgrade options and their DQ demand in advance, and do capacity planning
- Automate upgrades as much as possible
  - Develop automatic upgrade options such as rolling upgrades, and ensure a simple way to revert to the old version


More information

Ensuring the Success of E-Business Sites. January 2000

Ensuring the Success of E-Business Sites. January 2000 Ensuring the Success of E-Business Sites January 2000 Executive Summary Critical to your success in the e-business market is a high-capacity, high-availability and secure web site. And to ensure long-term

More information

1 of 6 4/8/2011 4:08 PM Electronic Hardware Information, Guides and Tools search newsletter subscribe Home Utilities Downloads Links Info Ads by Google Raid Hard Drives Raid Raid Data Recovery SSD in Raid

More information

PRO, PRO+, and SERVER

PRO, PRO+, and SERVER AliveChat Overview PRO, PRO+, and SERVER Version 1.2 Page 1 Introducing AliveChat 4. Our Latest Release for 2008! Mac, Windows, or Linux - All You Need is a Web Browser Access AliveChat anywhere you have

More information

CS 347 Parallel and Distributed Data Processing

CS 347 Parallel and Distributed Data Processing CS 347 Parallel and Distributed Data Processing Spring 2016 Notes 12: Distributed Information Retrieval CS 347 Notes 12 2 CS 347 Notes 12 3 CS 347 Notes 12 4 Web Search Engine Crawling Indexing Computing

More information

CS 347 Parallel and Distributed Data Processing

CS 347 Parallel and Distributed Data Processing CS 347 Parallel and Distributed Data Processing Spring 2016 Notes 12: Distributed Information Retrieval CS 347 Notes 12 2 CS 347 Notes 12 3 CS 347 Notes 12 4 CS 347 Notes 12 5 Web Search Engine Crawling

More information

WHITE PAPER: BEST PRACTICES. Sizing and Scalability Recommendations for Symantec Endpoint Protection. Symantec Enterprise Security Solutions Group

WHITE PAPER: BEST PRACTICES. Sizing and Scalability Recommendations for Symantec Endpoint Protection. Symantec Enterprise Security Solutions Group WHITE PAPER: BEST PRACTICES Sizing and Scalability Recommendations for Symantec Rev 2.2 Symantec Enterprise Security Solutions Group White Paper: Symantec Best Practices Contents Introduction... 4 The

More information

More on Testing and Large Scale Web Apps

More on Testing and Large Scale Web Apps More on Testing and Large Scale Web Apps Testing Functionality Tests - Unit tests: E.g. Mocha - Integration tests - End-to-end - E.g. Selenium - HTML CSS validation - forms and form validation - cookies

More information

The Google File System

The Google File System October 13, 2010 Based on: S. Ghemawat, H. Gobioff, and S.-T. Leung: The Google file system, in Proceedings ACM SOSP 2003, Lake George, NY, USA, October 2003. 1 Assumptions Interface Architecture Single

More information

SEDA: An Architecture for Well-Conditioned, Scalable Internet Services

SEDA: An Architecture for Well-Conditioned, Scalable Internet Services SEDA: An Architecture for Well-Conditioned, Scalable Internet Services Matt Welsh, David Culler, and Eric Brewer Computer Science Division University of California, Berkeley Operating Systems Principles

More information

Assignment 5. Georgia Koloniari

Assignment 5. Georgia Koloniari Assignment 5 Georgia Koloniari 2. "Peer-to-Peer Computing" 1. What is the definition of a p2p system given by the authors in sec 1? Compare it with at least one of the definitions surveyed in the last

More information

ECE 486/586. Computer Architecture. Lecture # 2

ECE 486/586. Computer Architecture. Lecture # 2 ECE 486/586 Computer Architecture Lecture # 2 Spring 2015 Portland State University Recap of Last Lecture Old view of computer architecture: Instruction Set Architecture (ISA) design Real computer architecture:

More information

<Insert Picture Here> MySQL Web Reference Architectures Building Massively Scalable Web Infrastructure

<Insert Picture Here> MySQL Web Reference Architectures Building Massively Scalable Web Infrastructure MySQL Web Reference Architectures Building Massively Scalable Web Infrastructure Mario Beck (mario.beck@oracle.com) Principal Sales Consultant MySQL Session Agenda Requirements for

More information

Lecture 9: MIMD Architecture

Lecture 9: MIMD Architecture Lecture 9: MIMD Architecture Introduction and classification Symmetric multiprocessors NUMA architecture Cluster machines Zebo Peng, IDA, LiTH 1 Introduction MIMD: a set of general purpose processors is

More information

Oracle Rdb Hot Standby Performance Test Results

Oracle Rdb Hot Standby Performance Test Results Oracle Rdb Hot Performance Test Results Bill Gettys (bill.gettys@oracle.com), Principal Engineer, Oracle Corporation August 15, 1999 Introduction With the release of Rdb version 7.0, Oracle offered a powerful

More information

Performance of relational database management

Performance of relational database management Building a 3-D DRAM Architecture for Optimum Cost/Performance By Gene Bowles and Duke Lambert As systems increase in performance and power, magnetic disk storage speeds have lagged behind. But using solidstate

More information

Configuring Network Load Balancing

Configuring Network Load Balancing Configuring Network Load Balancing LESSON 1 70-412 EXAM OBJECTIVE Objective 1.1 Configure Network Load Balancing (NLB). This objective may include but is not limited to: Install NLB nodes; configure NLB

More information

Performance Evaluation of Virtualization Technologies

Performance Evaluation of Virtualization Technologies Performance Evaluation of Virtualization Technologies Saad Arif Dept. of Electrical Engineering and Computer Science University of Central Florida - Orlando, FL September 19, 2013 1 Introduction 1 Introduction

More information

The UnAppliance provides Higher Performance, Lower Cost File Serving

The UnAppliance provides Higher Performance, Lower Cost File Serving The UnAppliance provides Higher Performance, Lower Cost File Serving The UnAppliance is an alternative to traditional NAS solutions using industry standard servers and storage for a more efficient and

More information

Today: Coda, xfs. Case Study: Coda File System. Brief overview of other file systems. xfs Log structured file systems HDFS Object Storage Systems

Today: Coda, xfs. Case Study: Coda File System. Brief overview of other file systems. xfs Log structured file systems HDFS Object Storage Systems Today: Coda, xfs Case Study: Coda File System Brief overview of other file systems xfs Log structured file systems HDFS Object Storage Systems Lecture 20, page 1 Coda Overview DFS designed for mobile clients

More information

Performance Innovations with Oracle Database In-Memory

Performance Innovations with Oracle Database In-Memory Performance Innovations with Oracle Database In-Memory Eric Cohen Solution Architect Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information

More information

Memory-Based Cloud Architectures

Memory-Based Cloud Architectures Memory-Based Cloud Architectures ( Or: Technical Challenges for OnDemand Business Software) Jan Schaffner Enterprise Platform and Integration Concepts Group Example: Enterprise Benchmarking -) *%'+,#$)

More information

PARALLEL & DISTRIBUTED DATABASES CS561-SPRING 2012 WPI, MOHAMED ELTABAKH

PARALLEL & DISTRIBUTED DATABASES CS561-SPRING 2012 WPI, MOHAMED ELTABAKH PARALLEL & DISTRIBUTED DATABASES CS561-SPRING 2012 WPI, MOHAMED ELTABAKH 1 INTRODUCTION In centralized database: Data is located in one place (one server) All DBMS functionalities are done by that server

More information

Lecture 23 Database System Architectures

Lecture 23 Database System Architectures CMSC 461, Database Management Systems Spring 2018 Lecture 23 Database System Architectures These slides are based on Database System Concepts 6 th edition book (whereas some quotes and figures are used

More information

Client Server & Distributed System. A Basic Introduction

Client Server & Distributed System. A Basic Introduction Client Server & Distributed System A Basic Introduction 1 Client Server Architecture A network architecture in which each computer or process on the network is either a client or a server. Source: http://webopedia.lycos.com

More information

Chapter 20: Database System Architectures

Chapter 20: Database System Architectures Chapter 20: Database System Architectures Chapter 20: Database System Architectures Centralized and Client-Server Systems Server System Architectures Parallel Systems Distributed Systems Network Types

More information

Distributed Systems. Lecture 4 Othon Michail COMP 212 1/27

Distributed Systems. Lecture 4 Othon Michail COMP 212 1/27 Distributed Systems COMP 212 Lecture 4 Othon Michail 1/27 What is a Distributed System? A distributed system is: A collection of independent computers that appears to its users as a single coherent system

More information

Chapter 6. Storage and Other I/O Topics

Chapter 6. Storage and Other I/O Topics Chapter 6 Storage and Other I/O Topics Introduction I/O devices can be characterized by Behaviour: input, output, storage Partner: human or machine Data rate: bytes/sec, transfers/sec I/O bus connections

More information

The Microsoft Large Mailbox Vision

The Microsoft Large Mailbox Vision WHITE PAPER The Microsoft Large Mailbox Vision Giving users large mailboxes without breaking your budget Introduction Giving your users the ability to store more email has many advantages. Large mailboxes

More information

Data Centers. Tom Anderson

Data Centers. Tom Anderson Data Centers Tom Anderson Transport Clarification RPC messages can be arbitrary size Ex: ok to send a tree or a hash table Can require more than one packet sent/received We assume messages can be dropped,

More information

CSE 123b Communications Software

CSE 123b Communications Software CSE 123b Communications Software Spring 2002 Lecture 13: Content Distribution Networks (plus some other applications) Stefan Savage Some slides courtesy Srini Seshan Today s class Quick examples of other

More information

CSE 451: Operating Systems Winter Redundant Arrays of Inexpensive Disks (RAID) and OS structure. Gary Kimura

CSE 451: Operating Systems Winter Redundant Arrays of Inexpensive Disks (RAID) and OS structure. Gary Kimura CSE 451: Operating Systems Winter 2013 Redundant Arrays of Inexpensive Disks (RAID) and OS structure Gary Kimura The challenge Disk transfer rates are improving, but much less fast than CPU performance

More information

Hadoop File System S L I D E S M O D I F I E D F R O M P R E S E N T A T I O N B Y B. R A M A M U R T H Y 11/15/2017

Hadoop File System S L I D E S M O D I F I E D F R O M P R E S E N T A T I O N B Y B. R A M A M U R T H Y 11/15/2017 Hadoop File System 1 S L I D E S M O D I F I E D F R O M P R E S E N T A T I O N B Y B. R A M A M U R T H Y Moving Computation is Cheaper than Moving Data Motivation: Big Data! What is BigData? - Google

More information

WHITE PAPER Software-Defined Storage IzumoFS with Cisco UCS and Cisco UCS Director Solutions

WHITE PAPER Software-Defined Storage IzumoFS with Cisco UCS and Cisco UCS Director Solutions WHITE PAPER Software-Defined Storage IzumoFS with Cisco UCS and Cisco UCS Director Solutions Introduction While the data handled by companies has an average growth rate of over 50% per annum, growth of

More information

Introduction to Distributed * Systems

Introduction to Distributed * Systems Introduction to Distributed * Systems Outline about the course relationship to other courses the challenges of distributed systems distributed services *ility for distributed services about the course

More information

Today s class. CSE 123b Communications Software. Telnet. Network File System (NFS) Quick descriptions of some other sample applications

Today s class. CSE 123b Communications Software. Telnet. Network File System (NFS) Quick descriptions of some other sample applications CSE 123b Communications Software Spring 2004 Today s class Quick examples of other application protocols Mail, telnet, NFS Content Distribution Networks (CDN) Lecture 12: Content Distribution Networks

More information

Lessons Learned Operating Active/Active Data Centers Ethan Banks, CCIE

Lessons Learned Operating Active/Active Data Centers Ethan Banks, CCIE Lessons Learned Operating Active/Active Data Centers Ethan Banks, CCIE #20655 @ecbanks Senior Network Architect, Carenection Co-founder, Packet Pushers Interactive http://ethancbanks.com http://packetpushers.net

More information

The Path to Lower-Cost, Scalable, Highly Available Windows File Serving

The Path to Lower-Cost, Scalable, Highly Available Windows File Serving The Path to Lower-Cost, Scalable, Highly Available Windows File Serving Higher Performance, Modular Expansion, Fault Tolerance at a Lower Cost The Challenges of Cost Effective, Scalable File Services for

More information

INFRASTRUCTURE BEST PRACTICES FOR PERFORMANCE

INFRASTRUCTURE BEST PRACTICES FOR PERFORMANCE INFRASTRUCTURE BEST PRACTICES FOR PERFORMANCE Michael Poulson and Devin Jansen EMS Software Software Support Engineer October 16-18, 2017 Performance Improvements and Best Practices Medium-Volume Traffic

More information

Architekturen für die Cloud

Architekturen für die Cloud Architekturen für die Cloud Eberhard Wolff Architecture & Technology Manager adesso AG 08.06.11 What is Cloud? National Institute for Standards and Technology (NIST) Definition On-demand self-service >

More information

BUILDING A SCALABLE MOBILE GAME BACKEND IN ELIXIR. Petri Kero CTO / Ministry of Games

BUILDING A SCALABLE MOBILE GAME BACKEND IN ELIXIR. Petri Kero CTO / Ministry of Games BUILDING A SCALABLE MOBILE GAME BACKEND IN ELIXIR Petri Kero CTO / Ministry of Games MOBILE GAME BACKEND CHALLENGES Lots of concurrent users Complex interactions between players Persistent world with frequent

More information

Modern Database Concepts

Modern Database Concepts Modern Database Concepts Basic Principles Doc. RNDr. Irena Holubova, Ph.D. holubova@ksi.mff.cuni.cz NoSQL Overview Main objective: to implement a distributed state Different objects stored on different

More information

SYSTEM UPGRADE, INC Making Good Computers Better. System Upgrade Teaches RAID

SYSTEM UPGRADE, INC Making Good Computers Better. System Upgrade Teaches RAID System Upgrade Teaches RAID In the growing computer industry we often find it difficult to keep track of the everyday changes in technology. At System Upgrade, Inc it is our goal and mission to provide

More information