Big Table. Dennis Kafura CS5204 Operating Systems

Introduction to Bigtable
Paper summary with this lecture.
Bigtable is a Google product.
Google = Clever
"We settled on this data model after examining a variety of potential uses of a Bigtable-like system."
"The implementation described in the previous section required a number of refinements to achieve the high performance, availability, and reliability required by our users."

Focus Today
Structure
Recovery System
Table Distribution
The API

Structure: Goals for this section
Understand the relation to GFS.
Know what the parts of the system are.
Know how they work together.

[Figure: two groups of Data blocks stored in GFS, one of them labeled as the backup.]

Characters (just a whimsical introduction)
Chubby: A file system in which every file and directory has its own lock. These locks are used to coordinate the rest of the system.
SSTable: A slim map sorted by key. It is the most basic primitive in the structure.
Deletion: Since SSTables are immutable, any deletion takes the form of another record, which is interpreted as a deletion.
Master: The server which does no client-oriented work, but directs the efforts of all tablet servers.
Tablet Server: Contains the data and handles client read/write interactions.

Characters (just a whimsical introduction)
Table: Tables exist only as a high-level construct. At the low level the table is still an SSTable.
Tablet: One part of the Table. Each Tablet holds only 100MB-200MB of the whole. Tablets are constantly splitting and merging.
Metatable: Is just kind of special. Its whole purpose is to refer to the main table.
Root Tablet: If there is a king of the special, this is it. It is the only tablet which refers to the rest of the metatable.
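The pointer chain described above (root tablet, then metatable, then the tablets of the main table) can be pictured as a two-level lookup. The following is a minimal Python sketch under stated assumptions, not Bigtable's implementation: `RangeIndex`, `locate`, and the idea of indexing each level by the end key of a range are illustrative choices, and the key ranges and server names (S1 to S4) are made up for the example.

```python
import bisect

class RangeIndex:
    """Maps a key to the entry whose end key is the first one >= key."""

    def __init__(self, entries):
        # entries: list of (end_key, target) pairs, one per key range
        self.entries = sorted(entries, key=lambda e: e[0])
        self.end_keys = [e[0] for e in self.entries]

    def find(self, key):
        i = bisect.bisect_left(self.end_keys, key)
        if i == len(self.end_keys):
            raise KeyError(key)
        return self.entries[i][1]

# Hypothetical layout: two metatable tablets point at four table tablets.
meta_1 = RangeIndex([("f", "tablet A-F on S1"), ("k", "tablet G-K on S2")])
meta_2 = RangeIndex([("p", "tablet L-P on S3"), ("z", "tablet Q-Z on S4")])
# The root tablet is the only tablet that points at the metatable tablets.
root = RangeIndex([("k", meta_1), ("z", meta_2)])

def locate(row_key):
    """Follow root tablet -> metatable tablet -> table tablet."""
    meta_tablet = root.find(row_key)
    return meta_tablet.find(row_key)
```

With this layout, `locate("harrisburg")` walks through `meta_1` and lands on the G-K tablet, so a client needs only the location of the root tablet to find any row.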

Relationships among the entities
[Figure: entity diagram. Edge legend: Is a pointer to; Owns the lock to; Controls the contents of; Is broken into; Creates and manages; Is Live On.]

Let's Look Deeper
A table is really only the exposed interface. The real data is stored in an SSTable.
Bigtable inherits certain attributes from the underlying SSTable structure:
  Key and data types are raw character strings.
  Records are ordered by key.
  Records are immutable.
Bigtable adds to this structure by adding dimensionality:
  The row key determines the horizontal slice.
  The column family:name determines the vertical slice.
  The version number determines the final dimension.
A tablet is really just a range of horizontal slices.
The combination of these features allows Bigtable to work with ranges and filters in any of the three dimensions.
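The three dimensions can be illustrated with a toy in-memory map. This is a rough Python sketch, assuming nested dictionaries stand in for the sorted storage; `SparseTable` and its methods are hypothetical names, not part of any Bigtable API, and the sample rows are invented.

```python
from collections import defaultdict

class SparseTable:
    """Toy map: (row key, "family:name", version) -> value, all strings."""

    def __init__(self):
        # row key -> column -> {version: value}; absent cells cost nothing
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, version, value):
        self.rows[row][column][version] = value

    def get(self, row, column, version=None):
        """Return the requested version, or the newest if version is None."""
        versions = self.rows.get(row, {}).get(column, {})
        if not versions:
            return None
        if version is None:
            version = max(versions)
        return versions.get(version)

    def row_range(self, start, end):
        """Row keys with start <= key < end, in order: a horizontal slice."""
        return [r for r in sorted(self.rows) if start <= r < end]

table = SparseTable()
table.put("pa_harrisburg", "hotels:Grand", 1, "3 stars")
table.put("pa_harrisburg", "hotels:Grand", 2, "4 stars")
table.put("ny_albany", "hotels:Plaza", 1, "5 stars")
```

Here `table.get("pa_harrisburg", "hotels:Grand")` picks the newest version, while passing `version=1` reaches back along the third dimension; `row_range` is the kind of contiguous slice that a tablet holds.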

Recovery System: Goals for this section
Understand how to recover from a hardware failure.
Understand the impact of loss of connectivity.
Understand the impact of lost messages.

What if things go wrong?
Scenario 1: Tablet Server Loses Connectivity
[Figure: the entity relationship diagram, annotated with numbered steps 1 to 6.]

What if things go wrong?
Scenario 2: Master Server Loses Connectivity, Part 1
[Figure: the entity relationship diagram, annotated with numbered steps 1 to 6.]

What if things go wrong?
Scenario 2: Master Server Loses Connectivity, Part 2
[Figure: the entity relationship diagram, annotated with steps 6 to 8, key ranges A-Z, A-F, G-K, L-P, and Q-Z, and servers S1 to S4.]

What if things go wrong?
Scenario 2: Master Server Loses Connectivity, Part 3
[Figure: the entity relationship diagram, annotated with steps 9 to 12 and key ranges A-F and L-P.]

What if things go wrong?
Scenario 4: Metadata is lost and new Master
[Figure: the entity relationship diagram, annotated with numbered steps 1 to 7.]

Table Distribution System: Goals for this section
Understand the process for adding/removing a server.
Understand how to handle an overwhelmed server.
Understand how to handle deletions/changes to the database.

Server Join/Leave Responsibilities
[Figure: the entity relationship diagram, with + markers on some of the elements.]

Tablet Growth/Shrinkage
Undersized (<100MB): merger
Ideal: 100MB-200MB
Oversized (>200MB): split
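The size rule above can be written down as a small decision function. This is only a sketch: the thresholds come from the slide, but `tablet_action`, `split_rows`, and the choice to split at the middle row key are assumptions made for illustration.

```python
MB = 1024 * 1024
MIN_SIZE = 100 * MB  # below this a tablet is undersized
MAX_SIZE = 200 * MB  # above this a tablet is oversized

def tablet_action(size_bytes):
    """Decide what to do with a tablet based on its size alone."""
    if size_bytes < MIN_SIZE:
        return "merge"
    if size_bytes > MAX_SIZE:
        return "split"
    return "keep"

def split_rows(sorted_row_keys):
    """Split a tablet's sorted row keys into two contiguous ranges.

    Splitting at the middle key is an assumption; it keeps each half
    a contiguous range of rows, which is what a tablet must be.
    """
    mid = len(sorted_row_keys) // 2
    return sorted_row_keys[:mid], sorted_row_keys[mid:]
```

For example, `tablet_action(250 * MB)` returns `"split"`, and `split_rows` shows why a split is cheap to describe: each new tablet is still just a range of row keys.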

If You Can't Handle the Heat
User interactions may cause hot spots where requests are more frequent than the baseline!
[Figure: the entity relationship diagram with server loads of 115%, 115%, 115%, 100%, and 160%.]

Move the Kitchen
After redistributing the workload, hot spots are easier to deal with and the labor is more evenly divided.
[Figure: the entity relationship diagram with server loads of 100%, 100%, 113%, 100%, and 113%.]
Note: the granularity of this image does not show the updated pointers from the metatable or the locks on Chubby files.
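One way to picture the redistribution is a greedy loop that moves tablets from the busiest server to the idlest one. This is a hypothetical sketch, not the policy the Bigtable master uses: `rebalance`, the load units, and the "closest to half the gap" heuristic are all assumptions, and the example loads are invented rather than taken from the figures.

```python
def rebalance(servers, max_moves=10):
    """Greedy sketch of tablet redistribution.

    servers: dict of server name -> dict of tablet name -> load.
    Moves one tablet at a time from the busiest to the idlest server,
    as long as the move narrows the gap between those two servers.
    Returns the list of (tablet, from_server, to_server) moves.
    """
    moves = []
    for _ in range(max_moves):
        load = {s: sum(t.values()) for s, t in servers.items()}
        hot = max(load, key=load.get)
        cold = min(load, key=load.get)
        gap = load[hot] - load[cold]
        # Only tablets lighter than the gap improve the balance when moved.
        candidates = {t: l for t, l in servers[hot].items() if 0 < l < gap}
        if not candidates:
            break
        # Prefer the tablet whose load is closest to half the gap.
        tablet = min(candidates, key=lambda t: abs(candidates[t] - gap / 2))
        servers[cold][tablet] = servers[hot].pop(tablet)
        moves.append((tablet, hot, cold))
    return moves

cluster = {
    "S1": {"a": 20, "b": 20, "c": 20, "d": 100},  # the hot spot
    "S2": {"e": 100},
    "S3": {"f": 100},
}
moves = rebalance(cluster)
```

Because tablets are the unit of movement, the result is only as even as the tablet sizes allow, which is one reason small tablets make hot spots easier to deal with.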

What if I Want to Delete Something?
[Figure labels: Memtable (Tablet in RAM); Changes & Deletions; New SSTable; Existing SSTables; GFS.]
The process of merging an SSTable with the Memtable is known as a compaction.
Minor Compactions
  Involve at least one SSTable
  Grow the set of SSTables
  May contain deletions
Major Compactions
  Include all SSTables
  Reduce the set of SSTables
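The interplay of deletion records and compactions can be sketched in a few lines of Python. This follows the Bigtable paper's description, in which a minor compaction freezes the memtable into a new SSTable and a major compaction rewrites all SSTables into one without deletion records; the `Tablet` class, the `TOMBSTONE` marker, and the use of plain dictionaries for SSTables are illustrative assumptions.

```python
TOMBSTONE = object()  # deletion marker, stored like any other value

class Tablet:
    def __init__(self):
        self.memtable = {}  # mutable, lives in RAM
        self.sstables = []  # treated as immutable once written, oldest first

    def write(self, key, value):
        self.memtable[key] = value

    def delete(self, key):
        # SSTables are immutable, so a delete is just another record.
        self.memtable[key] = TOMBSTONE

    def read(self, key):
        # Newest data wins: memtable first, then SSTables newest to oldest.
        for layer in [self.memtable] + self.sstables[::-1]:
            if key in layer:
                value = layer[key]
                return None if value is TOMBSTONE else value
        return None

    def minor_compaction(self):
        # Freeze the memtable into a new SSTable; deletion records are kept.
        items = sorted(self.memtable.items(), key=lambda kv: kv[0])
        self.sstables.append(dict(items))
        self.memtable = {}

    def major_compaction(self):
        # Merge every SSTable into one and drop deleted entries for good.
        merged = {}
        for table in self.sstables:  # oldest first, so newer overwrite older
            merged.update(table)
        items = sorted(merged.items(), key=lambda kv: kv[0])
        self.sstables = [{k: v for k, v in items if v is not TOMBSTONE}]
```

A deleted key keeps occupying space as a deletion record through any number of minor compactions; only the major compaction reclaims it, which is why the set of SSTables grows in one case and shrinks in the other.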

The API: Goals for this section
Explain how Bigtable differs from SQL.
How to create your own table.
Using Bigtable as a hash table/vector.

If You Had to Perform a Project
Projects are notoriously inefficient.
Checking an extensive table is ALWAYS to be avoided.
With a truly ENORMOUS table it is a very bad idea.

Lon    Lat    City
123    87     New Oslo
78     23     New Canada
-100   67     New Bermuda
45     59     New England
171    -45    Old Hampshire
-165   21     Old Mexico
0      66     Old England
78     -51    New Ireland
41     0      New Equador
100    12     Old Zealand

If You Had to Perform a Join
Bigtable is quite sparse. Imagine this was your table and only the red spots had data (everything else is null).
Joining with nulls creates semantic nonsense.
Joining on a null creates more nulls.

Completely Configurable Structure
Excellent Business Ownership Records
  Records will be state_city for alphabetical ordering.
  Column families will be Better Business Bureau ratings.
  Columns will be business names.
  Version will be ownership purchase date.
  Data will be owner name, address, phone and email.
Ranked X type businesses
  Records will be region_city for geographical ordering.
  Column families will designate types of services.
  Columns will be specific business names.
  Version will be automated.
  Data will be popularity by customer vote with address.

Multiple Tools for Fine Control
MapReduce: MapReduce is closed on Bigtable (i.e. MR(Bt) -> Bt). Use it to determine the most successful owner (based on average BBB rank).
Sawzall: A scripting language which can execute actions using tablet server clock cycles. Use it to determine the vote history of a set of businesses for graphing purposes.
Regular Expressions: Can be used for any combination of record, column and data recognition schemes. Use it to determine all the best voted hotels in a region.

Order Large Groups of Data
I'd like to have all the demographic statistics for the states A-L.
I'd like to have the hotel listings for cities in Pennsylvania.
I'd like to have hockey scores for all pro, semi-pro and college teams in the last three years.
I want to see all the Google searches in the last 24 hours.

Only Take What You Want
I'd like to have all the demographic statistics for the states A-L. But I'll only look at ethnic percentages.
I'd like to have the hotel listings for cities in Pennsylvania. But I only want the ones in Harrisburg.
I'd like to have hockey scores for all pro, semi-pro and college teams in the last three years. But I just want to see the Black Hawks.
I want to see all the Google searches in the last 24 hours. But only the ones for www.disney.com
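Requests like these combine a row range with a column filter. The sketch below is a hypothetical Python illustration, assuming the table is a plain dictionary of row key to columns; `scan` and the sample listings are invented, and a real tablet server would seek to the start of the range in sorted storage rather than walk every row.

```python
import re

def scan(table, start_row, end_row, column_pattern=".*"):
    """Yield (row, column, value) for start_row <= row < end_row,
    keeping only columns whose "family:name" matches column_pattern."""
    wanted = re.compile(column_pattern)
    for row in sorted(table):
        if row < start_row:
            continue
        if row >= end_row:
            break  # rows are visited in key order, so nothing later matches
        for column in sorted(table[row]):
            if wanted.fullmatch(column):
                yield row, column, table[row][column]

# Hypothetical data, keyed state_city as in the business records example.
listings = {
    "ny_albany": {"hotels:Empire": "4 stars"},
    "pa_harrisburg": {"diners:Joes": "3 stars", "hotels:Grand": "4 stars"},
    "pa_pittsburgh": {"hotels:Plaza": "5 stars"},
}

# All hotel listings for cities in Pennsylvania.
pa_hotels = list(scan(listings, "pa_", "pb", r"hotels:.*"))
# The same request, narrowed to Harrisburg only.
harrisburg_hotels = list(scan(listings, "pa_h", "pa_i", r"hotels:.*"))
```

Because rows are ordered by key, choosing a key such as state_city turns "cities in Pennsylvania" into one contiguous range, and the regular expression trims the columns within it.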

Summary
Structure of the system
Methods for recovery
Data management
Characteristics of the API