Big Table Dennis Kafura CS5204 Operating Systems 1
Introduction to Bigtable
- Paper summary with this lecture.
- Bigtable is a Google product.
- Google = Clever
  - "We settled on this data model after examining a variety of potential uses of a Bigtable-like system."
  - "The implementation described in the previous section required a number of refinements to achieve the high performance, availability, and reliability required by our users."

Focus Today
- Structure
- Recovery System
- Table Distribution
- The API

Structure
Goals for this section
- Understand the relation to GFS
- Know what the parts of the system are
- Know how they work together

[Figure: two groups of GFS "Data" blocks. Labels: Backups, GFS, Data]

Characters
Just a whimsical introduction
- Chubby: A file system in which every file and directory carries its own lock. These locks are used to coordinate the rest of the system.
- SSTable: A slim map sorted by key. It is the most basic primitive in the structure.
- Deletion: Since SSTables are immutable, any deletion takes the form of another record which is interpreted as a deletion.
- Master: The server which does no client-oriented work, but directs the efforts of all tablet servers.
- Tablet Server: Contains the data and handles client read/write interactions.

Characters
Just a whimsical introduction
- Table: Tables exist only as a high-level construct. At the low level the table is still an SSTable.
- Tablet: One part of the Table. Each Tablet holds only 100MB-200MB of the whole. They are constantly splitting and merging.
- Metatable: Is just kind of special. Its whole purpose is to refer to the main table.
- Root Tablet: If there is a king of the special, this is it. It is the only tablet which refers to the rest of the metatable.

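The root tablet, the metatable, and the table's own tablets form a lookup chain: the root tablet points at metatable tablets, and those point at the tablets that hold the data. Below is a minimal sketch of that chain, assuming each index entry is keyed by the last row key it covers. The names `TabletIndex` and `locate`, the key ranges, and the server labels are hypothetical and chosen only for illustration; this is not Bigtable's actual interface.

```python
from bisect import bisect_left


class TabletIndex:
    """Maps a row key to the entry whose range covers it.

    Each entry is (end_key, target). An entry covers every key that is
    <= end_key and not covered by an earlier entry. Entries must be
    sorted by end_key.
    """

    def __init__(self, entries):
        self.end_keys = [end for end, _ in entries]
        self.targets = [target for _, target in entries]

    def find(self, row_key):
        # First entry whose end_key is >= row_key.
        i = bisect_left(self.end_keys, row_key)
        if i == len(self.end_keys):
            raise KeyError(row_key)
        return self.targets[i]


# Hypothetical layout: one root tablet, two metatable tablets, four data tablets.
# "\uffff" is used here as a stand-in for "end of the key space".
meta_a = TabletIndex([("f", "tablet-1 on S1"), ("m", "tablet-2 on S2")])
meta_b = TabletIndex([("s", "tablet-3 on S3"), ("\uffff", "tablet-4 on S4")])
root = TabletIndex([("m", meta_a), ("\uffff", meta_b)])


def locate(row_key):
    """Follow root tablet -> metatable tablet -> data tablet."""
    metatable_tablet = root.find(row_key)
    return metatable_tablet.find(row_key)


print(locate("harrisburg"))  # tablet-2 on S2
```

Because every level is sorted by key, each hop is a binary search, and a lookup never touches more than one tablet per level.
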
Relationships among the entities
[Figure: entity relationship diagram. Edge legend: Is a pointer to; Owns the lock to; Controls the contents of; Is broken into; Creates and manages; Is live on]

Let's Look Deeper
- A table is really only the exposed interface.
  - The real data is stored in an SSTable.
- Bigtable inherits certain attributes from the underlying SSTable structure:
  - Key and data types are raw character strings.
  - Records are ordered by key.
  - Records are immutable.
- Bigtable adds to this structure by adding dimensionality:
  - The row key determines the horizontal slice.
  - The column family:name determines the vertical slice.
  - The version number determines the final dimension.
- A tablet is really just a range of horizontal slices.
- The combination of these features allows Bigtable to work with ranges and filters in any of the three dimensions.

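The three dimensions (row key, column family:name, version) can be pictured as a map from a triple to a string value. The sketch below is an in-memory toy under that assumption; the class `ToyTable`, its methods, and the sample keys are hypothetical names for illustration, not the Bigtable client API.

```python
class ToyTable:
    """Toy three-dimensional map: (row, "family:name", version) -> value."""

    def __init__(self):
        # (row_key, column) -> list of (version, value), newest version first
        self._cells = {}

    def put(self, row, column, version, value):
        versions = self._cells.setdefault((row, column), [])
        versions.append((version, value))
        versions.sort(key=lambda pair: pair[0], reverse=True)

    def get(self, row, column, at=None):
        """Return the newest value, or the newest one with version <= `at`."""
        for version, value in self._cells.get((row, column), []):
            if at is None or version <= at:
                return value
        return None


# Hypothetical data: row = state_city, column = rating:business, version = a date.
table = ToyTable()
table.put("pa_harrisburg", "bbb_a:example_bakery", 20010315, "owner one")
table.put("pa_harrisburg", "bbb_a:example_bakery", 20080701, "owner two")

print(table.get("pa_harrisburg", "bbb_a:example_bakery"))               # owner two
print(table.get("pa_harrisburg", "bbb_a:example_bakery", at=20050101))  # owner one
```

Old versions are never overwritten here; a read simply picks which version to return, which mirrors the immutability of the underlying records.
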
Recovery System
Goals for this section
- Understand how to recover from a hardware failure
- Understand the impact of loss of connectivity
- Understand the impact of lost messages

What if things go wrong?
Scenario 1: Tablet Server Loses Connectivity
[Figure: entity relationship diagram annotated with numbered steps 1-6]

What if things go wrong?
Scenario 2: Master Server Loses Connectivity, Part 1
[Figure: entity relationship diagram annotated with numbered steps 1-6]

What if things go wrong?
Scenario 2: Master Server Loses Connectivity, Part 2
[Figure: entity relationship diagram annotated with numbered steps 6-8. Labels: servers S1, S2, S3, S4; key ranges A-Z, A-F, G-K, L-P, Q-Z]

What if things go wrong?
Scenario 2: Master Server Loses Connectivity, Part 3
[Figure: entity relationship diagram annotated with numbered steps 9-12. Labels: key ranges "A-F, L-P" and "A-F"]

What if things go wrong?
Scenario 4: Metadata is lost and new Master
[Figure: entity relationship diagram annotated with numbered steps 1-7]

Table Distribution System
Goals for this section
- Understand the process for adding/removing a server
- Understand how to handle an overwhelmed server
- Understand how to handle deletions/changes to the database

Server Join/Leave Responsibilities
[Figure: entity relationship diagram with "+" marks on the relationships affected when a server joins or leaves]

Tablet Growth/Shrinkage
- Undersized: <100MB (merger)
- Ideal: 100MB-200MB
- Oversized: >200MB (split)

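The size thresholds above suggest a simple planning rule. The sketch below is one possible reading of it, assuming that only neighbouring tablets (adjacent row ranges) may merge and that a merge is skipped if the result would itself be oversized. The function `plan_resizing` and the tablet names are hypothetical.

```python
MB = 1024 * 1024
MIN_SIZE = 100 * MB  # below this a tablet is undersized
MAX_SIZE = 200 * MB  # above this a tablet is oversized


def plan_resizing(tablets):
    """Return split/merge actions for tablets given in row-key order.

    tablets: list of (name, size_in_bytes).
    """
    actions = []
    i = 0
    while i < len(tablets):
        name, size = tablets[i]
        if size > MAX_SIZE:
            actions.append(("split", name))
        elif size < MIN_SIZE and i + 1 < len(tablets):
            next_name, next_size = tablets[i + 1]
            # Assumption: merge only if the combined tablet stays within bounds.
            if size + next_size <= MAX_SIZE:
                actions.append(("merge", name, next_name))
                i += 1  # the neighbour is consumed by the merge
        i += 1
    return actions


tablets = [("a", 50 * MB), ("b", 120 * MB), ("c", 250 * MB), ("d", 150 * MB)]
print(plan_resizing(tablets))  # [('merge', 'a', 'b'), ('split', 'c')]
```
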
If You Can't Handle the Heat
User interactions may cause hot spots where requests are more frequent than the baseline!
[Figure: entity relationship diagram with load labels 115%, 115%, 115%, 100%, 160%]

Move the Kitchen
After redistributing the workload, hot spots are easier to deal with and the labor is more evenly divided.
[Figure: entity relationship diagram with load labels 100%, 100%, 113%, 100%, 113%]
Note: at this granularity, the image does not show the updated pointers from the metatable or the locks on Chubby files.

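The slides do not say how the work is redistributed, so the following is only a hypothetical greedy policy meant to make the idea concrete: repeatedly move one tablet from the most loaded server to the least loaded one, choosing the tablet that brings the two closest together. The function `rebalance`, the `tolerance` parameter, and the sample loads are all assumptions.

```python
def rebalance(servers, tolerance=10):
    """Greedy rebalancing sketch (hypothetical policy).

    servers: dict of server name -> dict of tablet name -> load.
    Mutates `servers` and returns a list of (tablet, source, destination).
    """
    moves = []
    while True:
        load = {name: sum(tablets.values()) for name, tablets in servers.items()}
        hot = max(load, key=load.get)
        cool = min(load, key=load.get)
        gap = load[hot] - load[cool]
        if gap <= tolerance:
            break
        # A move helps only if the tablet's load is strictly between 0 and the gap.
        # After moving load l, the gap between the two servers becomes |gap - 2*l|.
        candidates = [
            (abs(gap - 2 * l), tablet)
            for tablet, l in servers[hot].items()
            if 0 < l < gap
        ]
        if not candidates:
            break
        _, tablet = min(candidates)
        servers[cool][tablet] = servers[hot].pop(tablet)
        moves.append((tablet, hot, cool))
    return moves


servers = {
    "S1": {"a": 100, "b": 20, "c": 20, "d": 20},  # hot spot: total 160
    "S2": {"e": 100},
    "S3": {"f": 100},
}
moves = rebalance(servers)
print(moves)  # [('b', 'S1', 'S2'), ('c', 'S1', 'S3')]
```

In a real deployment each move would also require updating the metatable pointer and the Chubby lock for the tablet, which this sketch leaves out.
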
What if I Want to Delete Something?
[Figure labels: Memtable; Tablet in RAM; Changes & Deletions; New SSTable; Existing SSTables; GFS]
The process of merging an SSTable with the Memtable is known as a compaction.
- Minor Compactions
  - Involve at least one SSTable
  - Grow the set of SSTables
  - May contain deletions
- Major Compactions
  - Include all SSTables
  - Reduce the set of SSTables

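The interplay of the memtable, deletion records, and the two kinds of compaction can be sketched in a few lines. This is a toy model, assuming (as the Bigtable paper describes) that a minor compaction writes the memtable out as a new SSTable and that a major compaction rewrites all SSTables into one and drops deletion records. The class `ToyTablet`, the `TOMBSTONE` marker, and the use of plain dicts in place of files on GFS are illustrative assumptions.

```python
TOMBSTONE = object()  # marker record meaning "this key was deleted"


class ToyTablet:
    def __init__(self):
        self.memtable = {}  # recent changes and deletions, held in RAM
        self.sstables = []  # immutable snapshots, oldest first (stand-in for GFS files)

    def write(self, key, value):
        self.memtable[key] = value

    def delete(self, key):
        # SSTables are immutable, so a deletion is just another record.
        self.memtable[key] = TOMBSTONE

    def read(self, key):
        # Check the newest layer first: memtable, then SSTables newest to oldest.
        for layer in [self.memtable] + self.sstables[::-1]:
            if key in layer:
                value = layer[key]
                return None if value is TOMBSTONE else value
        return None

    def minor_compaction(self):
        """Freeze the memtable into a new SSTable; the set of SSTables grows."""
        if self.memtable:
            frozen = dict(sorted(self.memtable.items(), key=lambda kv: kv[0]))
            self.sstables.append(frozen)
            self.memtable = {}

    def major_compaction(self):
        """Merge everything into one SSTable and drop deletion records."""
        self.minor_compaction()
        merged = {}
        for table in self.sstables:  # oldest first, so newer records win
            merged.update(table)
        live = {k: merged[k] for k in sorted(merged) if merged[k] is not TOMBSTONE}
        self.sstables = [live] if live else []


tablet = ToyTablet()
tablet.write("a", "1")
tablet.write("b", "2")
tablet.minor_compaction()
tablet.delete("a")
tablet.minor_compaction()
print(len(tablet.sstables), tablet.read("a"), tablet.read("b"))  # 2 None 2
tablet.major_compaction()
print(tablet.sstables)  # [{'b': '2'}]
```

Until the major compaction runs, the deleted key still occupies space in the older SSTable; only the tombstone in the newer one hides it from readers.
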
The API
Goals for this section
- Explain how this differs from SQL.
- How to create your own table.
- Using Bigtable as a hash table/vector.

If You Had to Perform a Project
- Projects are notoriously inefficient
- Checking an extensive table is ALWAYS to be avoided
- With a truly ENORMOUS table, it is a very bad idea

Lon    Lat   City
123    87    New Oslo
78     23    New Canada
-100   67    New Bermuda
45     59    New England
171    -45   Old Hampshire
-165   21    Old Mexico
0      66    Old England
78     -51   New Ireland
41     0     New Equador
100    12    Old Zealand

If You Had to Perform a Join
- Bigtable is quite sparse.
- Imagine this was your table and only the red spots had data (everything else is null).
- Joining with nulls creates semantic nonsense.
- Joining on a null creates more nulls.

Completely Configurable Structure
Excellent Business Ownership Records
- Records will be keyed by state_city for alphabetical ordering
- Column families will be Better Business Bureau ratings
- Columns will be business names
- Version will be ownership purchase date
- Data will be owner name, address, phone and email
Ranked X type businesses
- Records will be keyed by region_city for geographical ordering
- Column families will designate types of services
- Columns will be specific business names
- Version will be automated
- Data will be popularity by customer vote with address

Multiple Tools for Fine Control
- MapReduce
  - MapReduce is closed on Bigtable (i.e. MR(Bt) -> Bt).
  - Use it to determine the most successful owner (based on average BBB rank).
- Sawzall
  - A scripting language which can execute actions using tablet server clock cycles.
  - Use it to determine the vote history of a set of businesses for graphing purposes.
- Regular Expressions
  - Can be used for any combination of record, column and data recognition schemes.
  - Use it to determine all the best voted hotels in a region.

Order Large Groups of Data
- I'd like to have all the demographic statistics for the states A-L.
- I'd like to have the hotel listings for cities in Pennsylvania.
- I'd like to have hockey scores for all pro, semi-pro and college teams in the last three years.
- I want to see all the Google searches in the last 24 hours.

Only Take What You Want
- I'd like to have all the demographic statistics for the states A-L.
  - But I'll only look at ethnic percentages
- I'd like to have the hotel listings for cities in Pennsylvania.
  - But I only want the ones in Harrisburg
- I'd like to have hockey scores for all pro, semi-pro and college teams in the last three years.
  - But I just want to see the Black Hawks
- I want to see all the Google searches in the last 24 hours.
  - But only the ones for www.disney.com

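Ordering a large group and then taking only what you want corresponds to a row-range scan narrowed by a column filter. The sketch below assumes the table is a plain dict of rows and that columns are matched with a regular expression; the function `scan`, the `listings` data, and the business names are hypothetical and not part of any real Bigtable client.

```python
import re


def scan(rows, start_row, end_row, column_pattern=None):
    """Yield (row, column, value) for start_row <= row < end_row.

    rows: dict of row key -> dict of "family:name" -> value.
    column_pattern: optional regular expression the whole column name must match.
    """
    matcher = re.compile(column_pattern) if column_pattern else None
    for row_key in sorted(rows):
        if row_key < start_row:
            continue
        if row_key >= end_row:
            break  # rows are visited in key order, so nothing later can match
        for column, value in sorted(rows[row_key].items()):
            if matcher is None or matcher.fullmatch(column):
                yield row_key, column, value


# Hypothetical listings keyed by state_city.
listings = {
    "ny_albany": {"hotels:demo_hotel": "4.0"},
    "pa_harrisburg": {"hotels:example_inn": "4.5", "dining:example_cafe": "3.9"},
    "pa_pittsburgh": {"hotels:sample_lodge": "4.1"},
}

# Order the large group: everything for Pennsylvania.
everything_in_pa = list(scan(listings, "pa_", "pa_\uffff"))

# Only take what you want: hotels in Harrisburg.
harrisburg_hotels = list(
    scan(listings, "pa_harrisburg", "pa_harrisburg\x00", r"hotels:.*")
)
print(harrisburg_hotels)  # [('pa_harrisburg', 'hotels:example_inn', '4.5')]
```

Because rows are ordered by key, a well-chosen key such as state_city turns "all cities in Pennsylvania" into one contiguous range rather than a full-table check.
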
Summary
- Structure of the system
- Methods for recovery
- Data management
- Characteristics of the API