Bigtable: A Distributed Storage System for Structured Data. Presenters: Yunming Zhang, Conglong Li. Saturday, September 21, 2013


References: Jeff Dean, SOCC 2010 keynote slides; Introduction to Distributed Computing, Winter 2008, University of Washington.

Motivation. Google has lots of (semi-)structured data: URLs (contents, crawl metadata, links) and per-user data (preference settings, search results). The scale is large: billions of URLs, hundreds of millions of users. Existing commercial databases don't meet the requirements.

Goals. Store and manage all the state reliably and efficiently. Allow asynchronous processes to update different pieces of data continuously. Support very high read/write rates and efficient scans over all or interesting subsets of the data. Often we want to examine data changes over time.

BigTable vs. GFS. GFS provides raw data storage. We need more sophisticated storage: a key-value mapping, flexible enough to be useful, able to store semi-structured data, and reliable, scalable, etc.

BigTable. Bigtable is a distributed storage system for managing large-scale structured data, offering wide applicability, scalability, high performance, and high availability.

Overview: Data Model, API, Implementation Structures, Optimizations, Performance Evaluation, Applications, Conclusions.

Data Model. A sparse, sorted, multidimensional map.

Cell. A cell contains multiple versions of the data; a datum is located by row key, column key, and timestamp. Bigtable treats data as uninterpreted arrays of bytes, which lets clients serialize various forms of structured and semi-structured data. It supports automatic garbage collection per column family for management of versioned data.

Goals. Store and manage all the state reliably and efficiently. Allow asynchronous processes to update different pieces of data continuously. Support very high read/write rates and efficient scans over all or interesting subsets of the data. Often we want to examine data changes over time.

Row. The row key is an arbitrary string. Access to column data in a row is atomic. Row creation is implicit upon storing data. Rows are ordered lexicographically, so rows that are close together lexicographically usually reside on one or a small number of machines.

Columns. Columns are grouped into column families, named as family:optional_qualifier. A column family has associated type information, and its columns are usually of the same type.
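The data model above can be sketched as a toy in-memory map from (row key, "family:qualifier", timestamp) to an uninterpreted byte string. All names here are illustrative, not Bigtable's real API:

```python
class MiniTable:
    """Toy model of Bigtable's map: (row, column, timestamp) -> bytes."""

    def __init__(self):
        # Sparse: only cells that were actually written exist.
        self._cells = {}  # {row: {"family:qualifier": {timestamp: bytes}}}

    def set(self, row, column, timestamp, value):
        # Row creation is implicit; values are uninterpreted bytes.
        self._cells.setdefault(row, {}).setdefault(column, {})[timestamp] = value

    def get(self, row, column, timestamp=None):
        versions = self._cells.get(row, {}).get(column, {})
        if not versions:
            return None
        # Default: return the latest version (highest timestamp).
        ts = timestamp if timestamp is not None else max(versions)
        return versions.get(ts)


table = MiniTable()
table.set("com.cnn.www", "contents:", 1, b"<html>v1</html>")
table.set("com.cnn.www", "contents:", 2, b"<html>v2</html>")
print(table.get("com.cnn.www", "contents:"))  # b'<html>v2</html>'
```

The reversed-hostname row key ("com.cnn.www") follows the paper's convention: it makes pages of the same domain lexicographically adjacent, so they tend to land on the same tablet.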

Overview: Data Model, API, Implementation Structures, Optimizations, Performance Evaluation, Applications, Conclusions.

API. Metadata operations: create/delete tables and column families, change metadata, modify access control lists. Writes (atomic per row): Set(), DeleteCells(), DeleteRow(). Reads: a Scanner can read arbitrary cells in a Bigtable.
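A hedged sketch of the write path just described, loosely modeled on the paper's RowMutation idiom (class and method names here are assumptions, not the real client library): mutations on a single row are batched and applied together, giving the per-row atomicity the slide mentions.

```python
class RowMutation:
    """Collects Set/DeleteCells ops on one row; applied atomically."""

    def __init__(self, row):
        self.row = row
        self._ops = []

    def set(self, column, value):
        self._ops.append(("set", column, value))

    def delete_cells(self, column):
        self._ops.append(("delete", column, None))

    def apply_to(self, store):
        # All ops for one row commit together: single-row atomicity.
        cols = store.setdefault(self.row, {})
        for op, column, value in self._ops:
            if op == "set":
                cols[column] = value
            else:
                cols.pop(column, None)


store = {}
m = RowMutation("com.cnn.www")
m.set("anchor:www.c-span.org", "CNN")
m.delete_cells("anchor:stale.example")
m.apply_to(store)
print(store)
```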

Overview: Data Model, API, Implementation Structures, Optimizations, Performance Evaluation, Applications, Conclusions.

Tablets. Large tables are broken into tablets at row boundaries; a tablet holds a contiguous range of rows, and clients can often choose row keys for locality. Aim for ~100-200 MB of data per tablet, with each serving machine responsible for ~100 tablets. This enables fast recovery (100 machines each pick up 1 tablet from a failed machine) and fine-grained load balancing (migrate tablets away from an overloaded machine).
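Splitting at row boundaries can be illustrated with a toy function (the sizes and keys below are made up; real splits happen incrementally as tablets grow):

```python
def split_into_tablets(rows, max_bytes):
    """rows: list of (row_key, size_bytes) in lexicographic order.
    Returns (first_row, last_row) ranges, each holding contiguous rows."""
    tablets, current, size = [], [], 0
    for key, nbytes in rows:
        current.append(key)
        size += nbytes
        if size >= max_bytes:
            # Close this tablet at a row boundary; rows stay contiguous.
            tablets.append((current[0], current[-1]))
            current, size = [], 0
    if current:
        tablets.append((current[0], current[-1]))
    return tablets


rows = [("a", 60), ("b", 60), ("c", 60), ("d", 30)]
print(split_into_tablets(rows, 100))  # [('a', 'b'), ('c', 'd')]
```

Because each tablet is an independent contiguous range, any tablet can be reassigned to any server, which is what makes the fast-recovery and load-balancing schemes above possible.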

Tablets and Splitting

System Structure. Master: metadata operations, load balancing, keeping track of live tablet servers, and handling of master failure. Tablet servers: accept reads and writes to the data.

System Structure

System Structure (read/write path)

System Structure (metadata operations)

Locating Tablets. A 3-level hierarchical lookup scheme for tablets; a location is the IP and port of the server recorded in the METADATA tables.
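A minimal sketch of the three-level lookup (all server names and key ranges below are invented): a Chubby file points at the root METADATA tablet, which indexes the other METADATA tablets, which in turn index the user tablets. Each level maps a key range to a server location.

```python
import bisect


def lookup(level, key):
    """level: sorted list of (end_row, location); find the tablet covering key."""
    ends = [end for end, _ in level]
    i = bisect.bisect_left(ends, key)
    return level[i][1]


chubby_root = [("\xff", "root-metadata@srv1:9000")]        # level 1: Chubby file
root_meta = [("m", "meta-tablet-A@srv2:9000"),             # level 2: root METADATA
             ("\xff", "meta-tablet-B@srv3:9000")]
meta_tablet_A = [("f", "user-tablet-1@srv4:9000"),         # level 3: METADATA tablet
                 ("m", "user-tablet-2@srv5:9000")]

# Resolve the tablet serving row key "banana", level by level:
print(lookup(chubby_root, "banana"))    # root-metadata@srv1:9000
print(lookup(root_meta, "banana"))      # meta-tablet-A@srv2:9000
print(lookup(meta_tablet_A, "banana"))  # user-tablet-1@srv4:9000
```

Clients cache these locations, so in the common case they skip the hierarchy entirely and go straight to the tablet server.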

Tablet Representation and Serving. An append-only tablet log plus SSTables on GFS. An SSTable is a sorted map from string to string, so if you want to find a row's data, all of it is contiguous. A memtable serves as the write buffer. When a read comes in, you have to merge the SSTable data with the recent writes buffered in the memtable.
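The merged read path can be sketched as a simple layered lookup (a simplification with assumed names; real SSTables are sorted files with block indexes, not dicts): the memtable shadows the SSTables, and newer SSTables shadow older ones.

```python
def merged_read(key, memtable, sstables):
    """sstables: list of dicts, newest first. Memtable wins, then newest SSTable."""
    if key in memtable:
        return memtable[key]
    for sst in sstables:  # newest first
        if key in sst:
            return sst[key]
    return None


sstables = [{"row1": "v2"}, {"row1": "v1", "row2": "old"}]  # newest first
memtable = {"row2": "new"}
print(merged_read("row1", memtable, sstables))  # v2 (newest SSTable wins)
print(merged_read("row2", memtable, sstables))  # new (memtable shadows SSTables)
```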

Tablet Representation and Serving

Tablet Representation and Serving

Compaction. Tablet state is represented as a set of immutable compacted SSTable files plus the tail of the log. Minor compaction: when the in-memory buffer fills up, freeze it and create a new SSTable. Major compaction: periodically compact all SSTables for a tablet into one new base SSTable on GFS; storage from deletions is reclaimed at this point.
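The two compactions can be sketched in a few lines (a toy model with assumed structures; here `None` stands in for a deletion entry, and SSTables are dicts kept newest first):

```python
def minor_compaction(memtable, sstables):
    """Freeze the memtable into a new immutable SSTable; return a fresh buffer."""
    sstables.insert(0, dict(memtable))  # newest first
    return {}


def major_compaction(sstables):
    """Merge all SSTables into one base SSTable, dropping deletion entries."""
    base = {}
    for sst in reversed(sstables):  # apply oldest first so newer values win
        base.update(sst)
    # Deleted data can finally be reclaimed: no older SSTable remains
    # that the deletion entry would need to suppress.
    return [{k: v for k, v in base.items() if v is not None}]


sstables = [{"a": "1"}]
memtable = {"b": "2", "a": None}  # "a" was deleted, "b" was written
memtable = minor_compaction(memtable, sstables)
sstables = major_compaction(sstables)
print(sstables)  # [{'b': '2'}]
```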

Overview: Data Model, API, Implementation Structures, Optimizations, Performance Evaluation, Applications, Conclusions.

Goals. A reliable system for storing and managing all the state. Allow asynchronous processes to update different pieces of data continuously. Support very high read/write rates and efficient scans over all or interesting subsets of the data. Often we want to examine data changes over time.

Locality Groups. Clients can group multiple column families together into a locality group. A separate SSTable is generated for each locality group, enabling more efficient reads. A locality group can also be declared to be in-memory.

Compression. There are many opportunities for compression, since nearby columns and cells often hold similar values. Within each SSTable for a locality group, blocks are compressed; blocks are kept small for random access. This exploits the fact that many values are very similar.
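The per-block scheme can be sketched with a generic compressor (the block size and data are made up, and zlib stands in for Bigtable's actual codecs, which the paper describes as Bentley-McIlroy plus a fast repetition scheme): similar values in adjacent cells compress well, and small blocks mean a random read only decompresses a little data.

```python
import zlib


def compress_blocks(values, block_size=4):
    """Pack values into small blocks and compress each block independently."""
    blocks = []
    for i in range(0, len(values), block_size):
        raw = "\n".join(values[i:i + block_size]).encode()
        blocks.append(zlib.compress(raw))
    return blocks


# Highly similar values, as in a column of crawled pages:
pages = ["<html><body>page version %d</body></html>" % i for i in range(8)]
blocks = compress_blocks(pages)
raw_len = sum(len(p) for p in pages)
packed = sum(len(b) for b in blocks)
print(packed, "compressed bytes vs", raw_len, "raw bytes")
```

Reading one row only requires decompressing the one small block that contains it, not the whole SSTable.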

Goals. A reliable system for storing and managing all the state. Allow asynchronous processes to update different pieces of data continuously. Support very high read/write rates and efficient scans over all or interesting subsets of the data. Often we want to examine data changes over time.

Commit Log and Recovery. A single commit log file per tablet server reduces the number of concurrent file writes to GFS. Tablet recovery: starting from the redo point in the log, re-perform the same set of operations on top of the last persistent state.
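Recovery from the redo point can be sketched as follows (the log format and sequence numbers are assumptions for illustration): only entries at or after the redo point are re-applied, since everything earlier is already captured in the persisted SSTables.

```python
def recover(log, redo_point):
    """Rebuild the memtable by replaying log entries from the redo point on.
    log: list of (seq, row, column, value), in commit order."""
    memtable = {}
    for seq, row, column, value in log:
        if seq >= redo_point:  # earlier entries are already in SSTables
            memtable.setdefault(row, {})[column] = value
    return memtable


log = [(1, "r1", "c", "old"),   # persisted before the last minor compaction
       (2, "r1", "c", "new"),
       (3, "r2", "c", "x")]
print(recover(log, redo_point=2))  # {'r1': {'c': 'new'}, 'r2': {'c': 'x'}}
```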

Overview: Data Model, API, Implementation Structures, Optimizations, Performance Evaluation, Applications, Conclusions.

Performance Evaluation. Test environment: a GFS cell of 1786 machines, each with 400 GB IDE hard drives, connected by a two-level tree-shaped switched network. Performance tests: random read/write and sequential read/write.

Single Tablet-Server Performance. Random reads are the slowest: a 64 KB SSTable block is transferred over GFS just to read 1000 bytes. Random and sequential writes perform better, since the server appends writes to a single commit log and uses group commit.

Performance Scaling. Performance didn't scale linearly, due to load imbalance in multiple-server configurations and larger data-transfer overhead.

Overview: Data Model, API, Implementation Structures, Optimizations, Performance Evaluation, Applications, Conclusions.

Google Analytics. A service that analyzes traffic patterns at web sites. Raw click table: a row for each end-user session, with row key (website name, time). Summary table: extracts recent session data using MapReduce jobs.

Google Earth. Uses one table for preprocessing and one for serving, which have different latency requirements (disk vs. memory). Each row in the imagery table represents a single geographic segment; a column family stores the data sources, with one column per raw image. Very sparse.

Personalized Search. The row key is a unique user id, with a column family for each type of user action. Data is replicated across Bigtable clusters to increase availability and reduce latency.

Conclusions. Bigtable provides highly scalable, high-performance, highly available, and flexible storage for structured data. It provides a low-level read/write interface for other frameworks to build on top of. It has enabled Google to deal with large-scale data efficiently.