CSE 544 Principles of Database Management Systems. Magdalena Balazinska Winter 2009 Lecture 12 Google Bigtable

Similar documents
CSE 444: Database Internals. Lectures 26 NoSQL: Extensible Record Stores

References. What is Bigtable? Bigtable Data Model. Outline. Key Features. CSE 444: Database Internals

BigTable. CSE-291 (Cloud Computing) Fall 2016

Bigtable: A Distributed Storage System for Structured Data By Fay Chang, et al. OSDI Presented by Xiang Gao

Big Table. Google s Storage Choice for Structured Data. Presented by Group E - Dawei Yang - Grace Ramamoorthy - Patrick O Sullivan - Rohan Singla

Bigtable: A Distributed Storage System for Structured Data. Andrew Hon, Phyllis Lau, Justin Ng

Bigtable. A Distributed Storage System for Structured Data. Presenter: Yunming Zhang Conglong Li. Saturday, September 21, 13

Bigtable. Presenter: Yijun Hou, Yixiao Peng

Bigtable: A Distributed Storage System for Structured Data by Google SUNNIE CHUNG CIS 612

CS November 2017

BigTable: A Distributed Storage System for Structured Data (2006) Slides adapted by Tyler Davis

CS November 2018

Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2017)

Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2016)

Distributed File Systems II

BigTable. Chubby. BigTable. Chubby. Why Chubby? How to do consensus as a service

Distributed Systems [Fall 2012]

ΕΠΛ 602:Foundations of Internet Technologies. Cloud Computing

BigTable: A Distributed Storage System for Structured Data

Extreme Computing. NoSQL.

CSE-E5430 Scalable Cloud Computing Lecture 9

7680: Distributed Systems

CS5412: OTHER DATA CENTER SERVICES

MapReduce & BigTable

big picture parallel db (one data center) mix of OLTP and batch analysis lots of data, high r/w rates, 1000s of cheap boxes thus many failures

Structured Big Data 1: Google Bigtable & HBase Shiow-yang Wu ( 吳秀陽 ) CSIE, NDHU, Taiwan, ROC

Introduction Data Model API Building Blocks SSTable Implementation Tablet Location Tablet Assingment Tablet Serving Compactions Refinements

BigTable A System for Distributed Structured Storage

CS5412: DIVING IN: INSIDE THE DATA CENTER

18-hdfs-gfs.txt Thu Oct 27 10:05: Notes on Parallel File Systems: HDFS & GFS , Fall 2011 Carnegie Mellon University Randal E.

Bigtable: A Distributed Storage System for Structured Data

Bigtable: A Distributed Storage System for Structured Data

BigTable: A System for Distributed Structured Storage

Bigtable: A Distributed Storage System for Structured Data

Google File System and BigTable. and tiny bits of HDFS (Hadoop File System) and Chubby. Not in textbook; additional information

Programming model and implementation for processing and. Programs can be automatically parallelized and executed on a large cluster of machines

Outline. Spanner Mo/va/on. Tom Anderson

18-hdfs-gfs.txt Thu Nov 01 09:53: Notes on Parallel File Systems: HDFS & GFS , Fall 2012 Carnegie Mellon University Randal E.

Distributed computing: index building and use

Big Table. Dennis Kafura CS5204 Operating Systems

CS /29/18. Paul Krzyzanowski 1. Question 1 (Bigtable) Distributed Systems 2018 Pre-exam 3 review Selected questions from past exams

CA485 Ray Walshe NoSQL

Distributed Systems Pre-exam 3 review Selected questions from past exams. David Domingo Paul Krzyzanowski Rutgers University Fall 2018

DIVING IN: INSIDE THE DATA CENTER

Distributed Data Management. Christoph Lofi Institut für Informationssysteme Technische Universität Braunschweig

11 Storage at Google Google Google Google Google 7/2/2010. Distributed Data Management

Infrastructure system services

Google big data techniques (2)

From Google File System to Omega: a Decade of Advancement in Big Data Management at Google

Distributed computing: index building and use

Distributed Systems. 05r. Case study: Google Cluster Architecture. Paul Krzyzanowski. Rutgers University. Fall 2016

CS 138: Google. CS 138 XVI 1 Copyright 2017 Thomas W. Doeppner. All rights reserved.

Cassandra Design Patterns

CS5412: DIVING IN: INSIDE THE DATA CENTER

CS 138: Google. CS 138 XVII 1 Copyright 2016 Thomas W. Doeppner. All rights reserved.

Data Storage in the Cloud

Recap. CSE 486/586 Distributed Systems Google Chubby Lock Service. Recap: First Requirement. Recap: Second Requirement. Recap: Strengthening P2

Distributed Systems 16. Distributed File Systems II

Recap. CSE 486/586 Distributed Systems Google Chubby Lock Service. Paxos Phase 2. Paxos Phase 1. Google Chubby. Paxos Phase 3 C 1

NPTEL Course Jan K. Gopinath Indian Institute of Science

CSE 124: Networked Services Lecture-16

Information Retrieval (IR) Introduction to Information Retrieval. Lecture Overview. Why do we need IR? Basics of an IR system.

Map-Reduce. Marco Mura 2010 March, 31th

Lecture: The Google Bigtable

Google Data Management

Map Reduce. Yerevan.

Staggeringly Large Filesystems

Distributed Systems. Fall 2017 Exam 3 Review. Paul Krzyzanowski. Rutgers University. Fall 2017

CC PROCESAMIENTO MASIVO DE DATOS OTOÑO Lecture 12: Conclusion. Aidan Hogan

Index Construction. Dictionary, postings, scalable indexing, dynamic indexing. Web Search

Lessons Learned While Building Infrastructure Software at Google

modern database systems lecture 4 : information retrieval

The Google File System

CA485 Ray Walshe Google File System

Engineering Goals. Scalability Availability. Transactional behavior Security EAI... CS530 S05

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Winter 2015 Lecture 14 NoSQL

Data Informatics. Seon Ho Kim, Ph.D.

goals monitoring, fault tolerance, auto-recovery (thousands of low-cost machines) handle appends efficiently (no random writes & sequential reads)

Distributed Database Case Study on Google s Big Tables

Google Disk Farm. Early days

GFS: The Google File System. Dr. Yingwu Zhu

CS555: Distributed Systems [Fall 2017] Dept. Of Computer Science, Colorado State University

Google is Really Different.

CS 655 Advanced Topics in Distributed Systems

CLOUD-SCALE FILE SYSTEMS

ECE 7650 Scalable and Secure Internet Services and Architecture ---- A Systems Perspective

MapReduce. U of Toronto, 2014

Text Analytics. Index-Structures for Information Retrieval. Ulf Leser

Efficiency. Efficiency: Indexing. Indexing. Efficiency Techniques. Inverted Index. Inverted Index (COSC 488)

Georgia Institute of Technology ECE6102 4/20/2009 David Colvin, Jimmy Vuong

The Google File System

Introduction to Information Retrieval (Manning, Raghavan, Schutze)

Google File System 2

Design & Implementation of Cloud Big table

CSE 124: Networked Services Fall 2009 Lecture-19

Distributed Filesystem

68A8 Multimedia DataBases Information Retrieval - Exercises

CSE Lecture 11: Map/Reduce 7 October Nate Nystrom UTA

Data-Intensive Distributed Computing

Cassandra- A Distributed Database

Transcription:

CSE 544 Principles of Database Management Systems Magdalena Balazinska Winter 2009 Lecture 12 Google Bigtable

References Bigtable: A Distributed Storage System for Structured Data. Fay Chang et. al. OSDI 2006. CSE 544 - Winter 2009 2

Outline Motivation Brief overview of information retrieval Key features of Bigtable Bigtable API Bigtable architecture Bigtable performance and discussion CSE 544 - Winter 2009 3

Different Types of Data Structured data All data conforms to a schema. Ex: business data Semistructured data Some structure in the data but implicit and irregular Ex: resume, ads Unstructured data No structure in data. Ex: text, sound, video, images CSE 544 - Winter 2009 4

Information Retrieval (IR) Goal: search collection of text documents Field exists since the 1950 s Field evolved independently of databases Renewed interest thanks to the Web Traditional IR techniques geared toward relatively small data collections and expert users Web has millions of documents and non-expert users CSE 544 - Winter 2009 5

Document Search Two types of queries, called searches Boolean query: recipe AND (beef OR stew) Result: all docs with word recipe and word beef or word stew Ranked query: recipe beef stew Result: list of docs ranked by relevance to query Different from precise RDBMS queries CSE 544 - Winter 2009 6

Success Criteria Precision Percentage of retrieved docs that are relevant to query Recall Percentage of relevant docs in the database that are returned in response to a query Goal: high precision and high recall CSE 544 - Winter 2009 7

IR Framework Need a way to represent documents Need a way to compare documents CSE 544 - Winter 2009 8

Vector Space Model Document vector doc_id word 1 word 2... word n Nb occurrences of word n in doc 2 1 2 3 Term frequencies (TF) term j document i: t ij Number of occurrences of term j in document i Intuition: frequent terms are more important CSE 544 - Winter 2009 9

TF/IDF Weighting of Terms Term frequencies (TF) term j document i: t ij But some terms are frequent in general (e.g., the ) Some terms frequent is some collections Example: object and class in Java tutorial docs Inverse doc. frequency (IDF): log(n/n j ) N = nb documents; n j = nb docs that contain word j w ij = TF * IDF Length normalization: w ij * = w ij / ( w ik2 ) Because some documents longer than others CSE 544 - Winter 2009 10

Document Similarity and Ranking Assume a t-dimensional space (t is nb terms) Term A Document vector for doc 1 θ Document vector for query Document vector for doc 2 Term B Similarity Dot product of document vectors sim(q, D i ) = terms q j * w ij * CSE 544 - Winter 2009 11

Cosine Similarity We defined w * ij = w ij / ( w ik2 ) sim(q, D i ) = terms q j * w ij * Hence, similarity is equal to sim(q,d i ) = ( terms q j w ij ) / (( q ik2 ) ( w ik2 )) sim(q,d i ) = Q D i / Q D i cos(θ) = Q D i / Q D i This metric is called cosine similarity CSE 544 - Winter 2009 12

Indexing Text Documents Goal: efficient eval. of boolean and ranked queries Before indexing documents Eliminate stop words Stem all the words Goal: reduce related terms to a canonical form Example: run, running, runner all stem to run CSE 544 - Winter 2009 13

Inverted Index Maps each word to list of docs (inverted list) that contain it Enables fast retrieval of all docs that contain given term Hash value Lexicon (in memory) 0001 Flower 2 0001 Foot 3 0011 Boat 1 Term # docs Postings pointer Postings file (on disk) doc_id: 8 doc_id: 3 doc_id: 1 doc_id: 3 doc_id: 5 Hash index (in this example) doc_id: 2 CSE 544 - Winter 2009 14

Inverted Index Can also add additional info with each document in the inverted list Flower Type Position... DocID title 5... 8 header 10... 8 text 23... 3 CSE 544 - Winter 2009 15

Using Index to Answer Queries Boolean query: intersect/merge doc lists Ranked query: Merge doc lists For each document Compute relevance with respect to query Fetch and return docs in decreasing rank order CSE 544 - Winter 2009 16

Web Search Goal: high quality results Challenges Large number of documents Large number of terms Malicious content publishers Users are only willing to look at a few results Observations Additional info in hyperlink graph Associate anchor text with destination page CSE 544 - Winter 2009 17

Outline Motivation Brief overview of information retrieval Key features of Bigtable Bigtable API Bigtable architecture Bigtable performance and discussion CSE 544 - Winter 2009 18

What is Bigtable? Distributed storage system Designed for structured data Designed to scale to thousands of servers Designed to store up to several hundred terabytes To scale, Bigtable has a limited set of features CSE 544 - Winter 2009 19

Bigtable Data Model Sparse, multidimensional sorted map (row:string, column:string, time:int64) string Example from Fig 1: Columns are grouped into families How could we use Bigtable for forward/inverted indexes? CSE 544 - Winter 2009 20

Key Features Read/writes of data under single row key is atomic Only single-row transactions! Data is stored in lexicographical order Improves data access locality Column families are unit of access control Data is versioned (old versions are garbage collected) Example: most recent three crawls of each page, with times CSE 544 - Winter 2009 21

Outline Motivation Brief overview of information retrieval Key features of Bigtable Bigtable API Bigtable architecture Bigtable performance and discussion CSE 544 - Winter 2009 22

API Data definition Creating/deleting tables or column families Changing access control rights Data manipulation Writing or deleting values Looking up values from individual rows Iterate over subset of data in the table Bigtable can serve as input to or output from MapReduce CSE 544 - Winter 2009 23

Outline Motivation Brief overview of information retrieval Key features of Bigtable Bigtable API Bigtable architecture Bigtable performance and discussion CSE 544 - Winter 2009 24

Chubby Lock Service In a distributed system, agreement is a problem Different failure scenarios are possible Nodes can have inconsistent views of who is up and who is down Messages can arrive out-of-order at different nodes But need agreement to make decisions Chubby Provides black-box agreement service through lock abstraction Uses the well-known Paxos algorithm CSE 544 - Winter 2009 25

Google File System A file = A series of chunks Size of a chunk 64MB Append & read only Fault-tolerance Chunks are distributed Chunks are replicated Master node Decides chunk placement Decides replica placement Tells clients where to find data Master File 26/22

Bigtable Building Block: SSTable Persistent map from keys to values Ordered Immutable Keys and values are strings API Look up value associated with a key Iterate over all key/value pairs in given range Implementation Sequence of blocks + one block index to locate other blocks CSE 544 - Winter 2009 27

A Table in Bigtable: Basics A table consists of a set of tablets: Section 5.3 Each tablet comprises one or more SSTables SSTable SSTable SSTable Tablet 1 Tablets are stored in GFS Tablet servers load data into memory Table (example with two tablets) SSTable Tablet 2 Reads are served from memory SSTable Tables are range partitioned 28

Writing to Tablets Remember: SSTables are immutable When a write operation arrives at a tablet server, the latter Writes the mutation to a commit log stored in GFS Waits until done Inserts the mutation into an in-memory buffer, the memtable The memtable is sorted lexicographically To serve reads, the tablet server Merges the SSTables and the memtable into a single view CSE 544 - Winter 2009 29

Loading Tablets To load a tablet, a tablet server does the following Finds location of tablet through its METADATA (Fig. 4) Read SSTables index blocks into memory Recall an SSTable consists of a set of blocks + 1 index block Read the commit log since redo point and reconstructs the memtable (the METADATA includes the redo point) CSE 544 - Winter 2009 30

Compaction To keep memtables below a threshold Minor compaction: convert memtable into an SSTable Merging compaction: Read a few SSTables and the memtable Write out a new SSTable Major compaction: Replace all SSTables and memtable with a new SSTable CSE 544 - Winter 2009 31

Assigning Tablets to Tablet Servers Master Assigns tablets to tablet servers Manages tablet server churn and load imbalances Processes schema changes Tablet server Handles read/write to tablets that it has loaded Splits large tablets Clients cache tablets locations CSE 544 - Winter 2009 32

Optimizations Vertical partitioning: locality groups Compression Caching Additional indexing: bloom filters Commit log optimizations Tablet migration optimization CSE 544 - Winter 2009 33

Outline Motivation Brief overview of information retrieval Key features of Bigtable Bigtable API Bigtable architecture Bigtable performance and discussion CSE 544 - Winter 2009 34

Performance CSE 544 - Winter 2009 35

Summary Bigtable is a distributed system for storing structured data Provides high performance and high availability Scales incrementally Restricted functionality Widely used by many applications at Google CSE 544 - Winter 2009 36