Bigtable: A Distributed Storage System for Structured Data by Google SUNNIE CHUNG CIS 612

Similar documents
Bigtable. Presenter: Yijun Hou, Yixiao Peng

CS November 2017

CSE 444: Database Internals. Lectures 26 NoSQL: Extensible Record Stores

Bigtable: A Distributed Storage System for Structured Data. Andrew Hon, Phyllis Lau, Justin Ng

Bigtable: A Distributed Storage System for Structured Data By Fay Chang, et al. OSDI Presented by Xiang Gao

CS November 2018

References. What is Bigtable? Bigtable Data Model. Outline. Key Features. CSE 444: Database Internals

Big Table. Google s Storage Choice for Structured Data. Presented by Group E - Dawei Yang - Grace Ramamoorthy - Patrick O Sullivan - Rohan Singla

BigTable. CSE-291 (Cloud Computing) Fall 2016

Introduction Data Model API Building Blocks SSTable Implementation Tablet Location Tablet Assingment Tablet Serving Compactions Refinements

BigTable. Chubby. BigTable. Chubby. Why Chubby? How to do consensus as a service

Bigtable. A Distributed Storage System for Structured Data. Presenter: Yunming Zhang Conglong Li. Saturday, September 21, 13

Bigtable: A Distributed Storage System for Structured Data

BigTable: A Distributed Storage System for Structured Data (2006) Slides adapted by Tyler Davis

Distributed File Systems II

BigTable: A Distributed Storage System for Structured Data

Distributed Systems [Fall 2012]

big picture parallel db (one data center) mix of OLTP and batch analysis lots of data, high r/w rates, 1000s of cheap boxes thus many failures

Lecture: The Google Bigtable

7680: Distributed Systems

ΕΠΛ 602:Foundations of Internet Technologies. Cloud Computing

Bigtable: A Distributed Storage System for Structured Data

Structured Big Data 1: Google Bigtable & HBase Shiow-yang Wu ( 吳秀陽 ) CSIE, NDHU, Taiwan, ROC

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Winter 2009 Lecture 12 Google Bigtable

Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2016)

CSE-E5430 Scalable Cloud Computing Lecture 9

MapReduce & BigTable

Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2017)

Distributed Database Case Study on Google s Big Tables

Bigtable: A Distributed Storage System for Structured Data

Extreme Computing. NoSQL.

CA485 Ray Walshe NoSQL

Google File System and BigTable. and tiny bits of HDFS (Hadoop File System) and Chubby. Not in textbook; additional information

Outline. Spanner Mo/va/on. Tom Anderson

BigTable A System for Distributed Structured Storage

CS5412: OTHER DATA CENTER SERVICES

BigTable: A System for Distributed Structured Storage

CS5412: DIVING IN: INSIDE THE DATA CENTER

Distributed Data Management. Christoph Lofi Institut für Informationssysteme Technische Universität Braunschweig

Design & Implementation of Cloud Big table

11 Storage at Google Google Google Google Google 7/2/2010. Distributed Data Management

Infrastructure system services

Recap. CSE 486/586 Distributed Systems Google Chubby Lock Service. Paxos Phase 2. Paxos Phase 1. Google Chubby. Paxos Phase 3 C 1

Scaling Up HBase. Duen Horng (Polo) Chau Assistant Professor Associate Director, MS Analytics Georgia Tech. CSE6242 / CX4242: Data & Visual Analytics

DIVING IN: INSIDE THE DATA CENTER

Data Storage in the Cloud

CS5412: DIVING IN: INSIDE THE DATA CENTER

NPTEL Course Jan K. Gopinath Indian Institute of Science

18-hdfs-gfs.txt Thu Oct 27 10:05: Notes on Parallel File Systems: HDFS & GFS , Fall 2011 Carnegie Mellon University Randal E.

Big Table. Dennis Kafura CS5204 Operating Systems

Lessons Learned While Building Infrastructure Software at Google

HBase: Overview. HBase is a distributed column-oriented data store built on top of HDFS

W b b 2.0. = = Data Ex E pl p o l s o io i n

CA485 Ray Walshe Google File System

Recap. CSE 486/586 Distributed Systems Google Chubby Lock Service. Recap: First Requirement. Recap: Second Requirement. Recap: Strengthening P2

Programming model and implementation for processing and. Programs can be automatically parallelized and executed on a large cluster of machines

ECE 7650 Scalable and Secure Internet Services and Architecture ---- A Systems Perspective

18-hdfs-gfs.txt Thu Nov 01 09:53: Notes on Parallel File Systems: HDFS & GFS , Fall 2012 Carnegie Mellon University Randal E.

Google big data techniques (2)

Google Data Management

Data Informatics. Seon Ho Kim, Ph.D.

The Google File System (GFS)

The Google File System

Distributed Systems. 05r. Case study: Google Cluster Architecture. Paul Krzyzanowski. Rutgers University. Fall 2016

Distributed Systems. Fall 2017 Exam 3 Review. Paul Krzyzanowski. Rutgers University. Fall 2017

Ghislain Fourny. Big Data 5. Wide column stores

The Google File System

CISC 7610 Lecture 5 Distributed multimedia databases. Topics: Scaling up vs out Replication Partitioning CAP Theorem NoSQL NewSQL

The Google File System

Distributed Systems 16. Distributed File Systems II

CLOUD-SCALE FILE SYSTEMS

CS /29/18. Paul Krzyzanowski 1. Question 1 (Bigtable) Distributed Systems 2018 Pre-exam 3 review Selected questions from past exams

YCSB++ benchmarking tool Performance debugging advanced features of scalable table stores

Applications of Paxos Algorithm

Distributed Systems Pre-exam 3 review Selected questions from past exams. David Domingo Paul Krzyzanowski Rutgers University Fall 2018

Goal of the presentation is to give an introduction of NoSQL databases, why they are there.

COSC 6339 Big Data Analytics. NoSQL (II) HBase. Edgar Gabriel Fall HBase. Column-Oriented data store Distributed designed to serve large tables

The Google File System

Distributed computing: index building and use

The Google File System

! Design constraints. " Component failures are the norm. " Files are huge by traditional standards. ! POSIX-like

HBASE INTERVIEW QUESTIONS

CSE 124: Networked Services Lecture-16

The Google File System

MapReduce. U of Toronto, 2014

CSE 124: Networked Services Fall 2009 Lecture-19

Distributed Filesystem

Comparing SQL and NOSQL databases

Google File System 2

YCSB++ Benchmarking Tool Performance Debugging Advanced Features of Scalable Table Stores

Exam 2 Review. October 29, Paul Krzyzanowski 1

NoSQL Database Comparison: Bigtable, Cassandra and MongoDB CJ Campbell Brigham Young University October 16, 2015

GFS Overview. Design goals/priorities Design for big-data workloads Huge files, mostly appends, concurrency, huge bandwidth Design for failures

Google File System. Arun Sundaram Operating Systems

CS /15/16. Paul Krzyzanowski 1. Question 1. Distributed Systems 2016 Exam 2 Review. Question 3. Question 2. Question 5.

From Google File System to Omega: a Decade of Advancement in Big Data Management at Google

DISTRIBUTED SYSTEMS [COMP9243] Lecture 9b: Distributed File Systems INTRODUCTION. Transparency: Flexibility: Slide 1. Slide 3.

Megastore: Providing Scalable, Highly Available Storage for Interactive Services & Spanner: Google s Globally- Distributed Database.

Tools for Social Networking Infrastructures

The Google File System

Transcription:

Bigtable: A Distributed Storage System for Structured Data by Google SUNNIE CHUNG CIS 612

Google Bigtable 2 A distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance. These applications place very different demands on Bigtable, both in terms of : Data size (from URLs to web pages to satellite imagery) a Latency requirements (from backend bulk processing to real-time data serving).

Google Bigtable 3 Wide Applicability Scalability High Performance High Availability Bigtable is used by: Google Analytics Google Finance Google Earth Google Search / Personalized Search Orkut etc.

Google Bigtable is 4 a sparse, distributed, persistent, multi dimensional sorted map which is indexed by a Row Key, a Column Key and a Timestamp, where each value in the map is an un-interpreted array of bytes.

Data Model ( row key : string, column : string, timestamp : int64) string anchor:cnnsi.com; Contents; anchor:my.look.ca com.cnn.www <html> <html> <html> T3 T5 T6 CNN T9 CNN.COM T8 Fig: A slice of an eg table (consider name Webtable) storing web pages 5

6

Bigtable Rows and Column Families 7 Rows: Arbitrary strings; currently up to 64KB in size allowed; typical size for string data 10-100 bytes Every read or write is atomic Lexicographic ordering of keys Row range definition for a table is dynamic called a Tablet Column Families: Column keys are grouped into sets called column families Syntax: family:qualifier Eg: anchor:my.look.ca : in the figure from previous slide language:languageid: This stores the language for webpage

Timestamps 8 Timestamps: Multiple versions of the data are allowed in a single cell; indexed by timestamps Timestamps are 64 bit integer values Client or Bigtable can generate timestamps for the data Stored in decreasing order keeping newest data on top of storage Bigtable column family can have their own garbage collection mechanism defined to help keep only recent n no. of data values

Bigtable API 9 Bigtable API Provides functions for Creating and Deleting tables Creating and Deleting column families Changing cluster, table and column family metadata Access to data rows for write and delete operations Scan through data for particular column families, rows and data values with filters applied Batch and atomic writes

Bigtable Building Blocks 10 Google File System Google SSTable Chubby

Bigtable Building Blocks 11 Google File System: To Store log and data files Google SSTable: Used internally to store data in Bigtable Chubby: Paxos based system for consensus in network of unreliable processors Provides a namespace to store Directories and small files. Each Dir or file can be used as a lock and every access is atomic Chubby Unavailable Bigtable Unavailable

Implementation 12 Three Major components: A Library that is linked to every client One master server: Responsible for assigning tablets to Tablet Servers Many Tablet servers: Dynamically added and removed from a cluster to accommodate changes in workloads Recall: Tablet: Every table in Bigtable is dynamically partitioned on the basis of row keys / range. Each row range is called a tablet.

Implementation Cntd. 13 One master server: Responsible for assigning tablets to Tablet Servers Detecting the addition and expiration of Tablet Servers Balancing Tablet Server Load Garbage collection of files in GFS Note: No client practically communicates with Master Server Many Tablet servers: Dynamically added and removed from a cluster to accommodate changes in workloads Each Tablet Server manages a set of Tablet Handles Read and Write requests to the Tablets Split tablets when they have grown too large

Tablets 14 User Table 1 Chubby File Root Tablet (1 st metadata Tablet) Other Metadata Tablets Fig: Tablet Location Hierarchy User Table N

Tablet Assignment 15 Master Server and Tablet Assignment: Assign a tablet to only one Tablet Server at a given time Keep track of all the Tablet servers including tablets assignments and un-assignments Master is responsible for: Detecting when a tablet server is no longer serving its tablets Keep track of lock / Session with Chubby Steps Taken by Master at Startup: Grab a unique Master lock in Chubby Scan the servers directory to find live servers Communicate with every live server to check assignments Scan the Metadata table to learn the set of Tablets

Tablet Representation 16 memtable Read Op Memory GFS tablet log Write Op SSTable Files

Tablet Serving 17 Write Operation on Tablet When write operation arrives at Tablet: Server checks for well formedness and proper authorization Valid mutation is written to the commit log Group commit is used to improve the throughput After write is committed, contents are inserted to memtable Read Operation on Tablet Checked for Well formedness and proper authorization Valid read operation is executed on a merged view of sequence of SSTables and memtable

Tablet Compactions 18 Minor Compactions: Shrinks memory usage of Tablet server Reduces amount of data that has to be recreated from log in case of failure Merging Compactions: Performed in background on SSTables created by minor compactions Major Compactions: All SSTables are merged and rewritten to one SSTable. Bigtable performs this operation regularly. Everything that happens with data here is atomic!!!

Refinements 19 Locality Groups Compression Caching for Read performance Bloom Filters Commit Log Implementation Speeding up Tablet recovery Exploiting immutability

Summary 20 Bigtable is a distributed storage system for storing structured data at Google In operation since 2005, by August 2006 more than 60 projects are using Bigtable Effective performance, High availability and Scalability are the key features for most of the clients Control over architecture allows Google to customize the product as needed. Use by old and new clients demonstrates that Bigtable architecture works.

This Lecture Notes from 21 the paper published by Google in OSDI 2006 http://grail.csuohio.edu/~sschung/cis612/googlebigt able-osdi06.pdf