Cloud Computing. DB Special Topics Lecture (10/5/2012) Kyle Hale Maciej Swiech

Similar documents
Spanner: Google's Globally-Distributed Database. Presented by Maciej Swiech

Distributed Systems. 19. Spanner. Paul Krzyzanowski. Rutgers University. Fall 2017

Spanner: Google's Globally-Distributed Database* Huu-Phuc Vo August 03, 2013

NewSQL Databases. The reference Big Data stack

Architekturen für die Cloud

How do we build TiDB. a Distributed, Consistent, Scalable, SQL Database

Introduction To Cloud Computing

1/10/2011. Topics. What is the Cloud? Cloud Computing

Cloud Computing Technologies and Types

Megastore: Providing Scalable, Highly Available Storage for Interactive Services & Spanner: Google s Globally- Distributed Database.

Google Spanner - A Globally Distributed,

In this unit we are going to look at cloud computing. Cloud computing, also known as 'on-demand computing', is a kind of Internet-based computing,

Spanner : Google's Globally-Distributed Database. James Sedgwick and Kayhan Dursun

Lecture 7: Data Center Networks

Spanner: Google s Globally- Distributed Database

DISTRIBUTED SYSTEMS [COMP9243] Lecture 8a: Cloud Computing WHAT IS CLOUD COMPUTING? 2. Slide 3. Slide 1. Why is it called Cloud?

Cloud Computing. Technologies and Types

Migrating Oracle Databases To Cassandra

Data Centers and Cloud Computing

Introduction to Cloud Computing. [thoughtsoncloud.com] 1

Fundamentals Large-Scale Distributed System Design. (a.k.a. Distributed Systems 1)

Lecture 09: VMs and VCS head in the clouds

App Engine: Datastore Introduction

Large-Scale Web Applications

CHEM-E Process Automation and Information Systems: Applications

Don t Give Up on Serializability Just Yet. Neha Narula

Data Centers and Cloud Computing. Slides courtesy of Tim Wood

Cloud platforms. T Mobile Systems Programming

CPSC 426/526. Cloud Computing. Ennan Zhai. Computer Science Department Yale University

Spanner: Google s Globally-Distributed Database. Wilson Hsieh representing a host of authors OSDI 2012

Data Centers and Cloud Computing. Data Centers

Building High Performance Apps using NoSQL. Swami Sivasubramanian General Manager, AWS NoSQL

Cloud platforms T Mobile Systems Programming

Module Day Topic. 1 Definition of Cloud Computing and its Basics

Lecture 1: January 22

Building Consistent Transactions with Inconsistent Replication

Lecture 1: January 23

COMPTIA CLO-001 EXAM QUESTIONS & ANSWERS

Beyond TrueTime: Using AugmentedTime for Improving Spanner

A Global In-memory Data System for MySQL Daniel Austin, PayPal Technical Staff

ARCHITECTING WEB APPLICATIONS FOR THE CLOUD: DESIGN PRINCIPLES AND PRACTICAL GUIDANCE FOR AWS

Distributed Algorithms Models

INFS 214: Introduction to Computing

ECE Enterprise Storage Architecture. Fall ~* CLOUD *~. Tyler Bletsch Duke University

Jargons, Concepts, Scope and Systems. Key Value Stores, Document Stores, Extensible Record Stores. Overview of different scalable relational systems

Corbett et al., Spanner: Google s Globally-Distributed Database

Performance and Forgiveness. June 23, 2008 Margo Seltzer Harvard University School of Engineering and Applied Sciences

CISC 7610 Lecture 5 Distributed multimedia databases. Topics: Scaling up vs out Replication Partitioning CAP Theorem NoSQL NewSQL

Introduction to NoSQL Databases

Database Architectures

Cloud & AWS Essentials Agenda. Introduction What is the cloud? DevOps approach Basic AWS overview. VPC EC2 and EBS S3 RDS.

Agenda. AWS Database Services Traditional vs AWS Data services model Amazon RDS Redshift DynamoDB ElastiCache

SEEM3450 Engineering Innovation and Entrepreneurship

Everything you need to know about cloud. For companies with people in them

Integrity in Distributed Databases

DEEP DIVE INTO CLOUD COMPUTING

Cloud Computing and Service-Oriented Architectures

10/18/2017. Announcements. NoSQL Motivation. NoSQL. Serverless Architecture. What is the Problem? Database Systems CSE 414

Next-Generation Cloud Platform

SQL Server 2014 Upgrade

DATABASE SCALE WITHOUT LIMITS ON AWS

Introduction to data centers

Introduction to NoSQL

CIT 668: System Architecture. Amazon Web Services

Data Consistency Now and Then

CMU SCS CMU SCS Who: What: When: Where: Why: CMU SCS

Data Center Fundamentals: The Datacenter as a Computer

CISC 7610 Lecture 2b The beginnings of NoSQL

ERP Solution to the Cloud

Introduction to Database Services

Cluster-Level Google How we use Colossus to improve storage efficiency

Cloud Infrastructure and Operations Chapter 2B/8 Page Main concept from which Cloud Computing developed

Introduction Configuration for the Digital Experience Cloud

To Cloud or Not To. An exploration of the economics of clouds and cyber-security.

A Survey Paper on NoSQL Databases: Key-Value Data Stores and Document Stores

How we build TiDB. Max Liu PingCAP Amsterdam, Netherlands October 5, 2016

DIVING IN: INSIDE THE DATA CENTER

MySQL Group Replication. Bogdan Kecman MySQL Principal Technical Engineer

Software and Migration Services FAQ for more information (available from Electronic Data Solutions ). Some implementation will be required, including

10. Replication. Motivation

CLOUD COMPUTING It's about the data. Dr. Jim Baty Distinguished Engineer Chief Architect, VP / CTO Global Sales & Services, Sun Microsystems

Transactions and ACID

Distributed Data Management. Christoph Lofi Institut für Informationssysteme Technische Universität Braunschweig

Why the cloud matters?

Distributed Systems. 27. Engineering Distributed Systems. Paul Krzyzanowski. Rutgers University. Fall 2018

Migrating Enterprise Applications to the Cloud Session 672. Leighton L. Nelson

Hybrid Infrastructure Hosting Clouds + Dedicated + Colocated GoGrid / ServePath September 09

Distributed Systems Intro and Course Overview

The Evolution of a Data Project

Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2017)

A Comparative Study of Amazon Web Service and Windows Azure

Cluster-Based Computing

Course Overview. ECE 1779 Introduction to Cloud Computing. Marking. Class Mechanics. Eyal de Lara

Iggy Fernandez Infrastructure as Code demo by Artem Danielov

Workshop Report: ElaStraS - An Elastic Transactional Datastore in the Cloud

Distributed Systems. GFS / HDFS / Spanner

Principal Solutions Architect. Architecting in the Cloud

Memory-Based Cloud Architectures

Cloud Computing. And What It Means for Your Agency. Phoebe Schimpf Director of IT NYSCAA

CLOUD COMPUTING PRIMER FOR EXECUTIVES

Transcription:

Cloud Computing DB Special Topics Lecture (10/5/2012) Kyle Hale Maciej Swiech

Managing servers isn t for everyone What are some prohibitive issues? (we touched on these last time)

Cost (initial/operational) Setup/Software installation Manageability Space Development

So what is cloud computing? A shift in responsibility Let someone else manage hardware infrastructure/ software environment/applications But why cloud?

Cloud Service Models

The Usual Case You buy/manage/build everything

Infrastructure as a Service (IaaS) What are we buying here? A remote machine (not necessarily a physical one!) E.g. I don t want to manage my own cluster!

Why Virtualization? (Hardware virtualization)

Why Virtualization? Consolidation Flexibility for user (Pick your favorite OS) Flexibility for provider (live migration for load balancing, repairs, etc.) Performance (e.g. load user s OS image on close-by physical machine)

Platform as a Service (PaaS) What are we buying here? A software/hardware framework to build applications on E.g. I don t want to setup MySQL/ Apache/Oracle, I just want to write my web app! Bonus points: how is this different from a regular hosted environment?

Software as a Service (SaaS) What are we buying here? Functionality (business/personal) We don t have to build anything E.g. I don t want to buy hardware or install software or write code, I just want to use it! Think, renting an application Bonus points: how is this any different from a webapp?

Some Common Properties of SaaS Applications Scales up/down based on usage Subscription-based Pay-per-use Multi-tenancy Customizable (e.g. for look-and-feel) Collaboration/sharing

Benefits of SaaS

Benefits of SaaS Updating applications is easier Environment is (mostly) uniform -> portability Less worry about having an adequate machine Lower cost (for everyone) Simplified deployment

Cloud Issues/Problems? Weather is the least of them

Trust Users must shift more trust to the provider Is my stuff going to disappear? Can someone else see my stuff? (privacy)

Security Providers must protect their infrastructure and users data More software layers (e.g. with virtualization)! More security concerns to manage Are cloud administrators honest/vulnerable to social engineering? (also a question of trust) Can a provider segregate my data from other users?

Thin Clients As we move computation to cloud, need less on client-side Modest hardware Cheap In the Extreme: ultra-thin/zero client. Only enough system software (BIOS/kernel) to boot OS from the network Require network connectivity

Amazon EC2 demo

Google Spanner A globally distributed, temporally versioned database

Key features of Spanner " Externally consistent global write-transactions with synchronous replication " Non-blocking reads in the past " Schematized, semi-relational data model " SQL-like query interface " Temporal versioning

Why make this? " Traditional RDBMS Normalized data Transactions Don't scale well to 'web size' " NoSQL Scale to size No transactions 'Eventually consistent' data

Why make this? (cont'd) " People want Scalability Synchronously available data Transaction support

Why make this? (cont'd) " People want Scalability Synchronously available data Transaction support > Google Spanner

Design of Spanner " "We believe it is better to have application programmers deal with performance problems due to overuse of transactions as bottlenecks arise, rather than always coding around the lack of transactions." Google

Spanner Design: zones " Spanner stores data in 'zones' in various 'universes' " Zones provide Physical isolation Data locality

Spanner Design: spanserver " Transaction manager and lock table ensure concurrency " Writes go through Paxos layer, non-blocking reads can go directly to data " If only one Paxos group is involved, transaction manager is bypassed (most transactions) " Data can be 'sharded' as necessary

Spanner Design: Data Model " Schematized semi-relational tables " SQL-like language " General-purpose transactions " Synchronous replication " An application can contain 1+ databases Each db can contain unlimited number of schematized tables

Spanner Design: SQL

Spanner Design: TrueTime " Synchronicity is hard, especially across distributed data centers " How do we solve this?

Spanner Design: TrueTime " Synchronicity is hard, especially across distributed data centers " How do we solve this? " Atomic clocks and GPS!

Spanner Design: TrueTime " Using the GPS and atomic clocks, Spanner can figure out serialization of transactions " If the time uncertainty grows too large, Spanner slows down

What does this give us? " Transactions! " Consistent data! " Global Scalability! " Failure tolerance!

Drawbacks " No offline access " Average latency of ~10ms, but 100ms latencies should be expected (especially on multi-site writes) " TrueTime requires special hardware (GPS + Atomic clock)