Big Data Processing Technologies Chentao Wu Associate Professor Dept. of Computer Science and Engineering wuct@cs.sjtu.edu.cn

Schedule (1)
Storage system part (first eight weeks)
lec1: Introduction on big data and cloud computing
lec2: Introduction on data storage
lec3: Data reliability (Replication/Archive/EC)
lec4: Data consistency problem
lec5: Block level storage and file storage
lec6: Object-based storage
lec7: Distributed file system
lec8: Metadata management

Schedule (2)
Reading & Project part (middle two/three weeks)
Database part (last five weeks)
lec9: Introduction on database
lec10: Relational database (SQL)
lec11: Non-relational database (NoSQL)
lec12: Distributed database
lec13: Main memory database

Collaborators

Distributed vs. Parallel?
Parallel DBMSs: shared-memory, shared-disk, shared-nothing
Distributed is basically shared-nothing parallel, perhaps with a slower network

What's Special About Distributed Computing?
Parallel computation, but no shared memory/disk
Unreliable networks: delay, reordering, loss of packets
Unsynchronized clocks: impossible to have perfect synchrony
Partial failure: can't know what's up and what's down
"A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable." (Leslie Lamport, Turing Award 2013)

Distributed Database Systems
The DBMS is an influential special case of distributed computing: the trickiest part of distributed computing is state, i.e., data
Transactions provide an influential model for concurrency/parallelism, and DBMSs worried about fault handling early on
A special case because not all programs are written transactionally; if not, database techniques may not apply
Many of today's most complex distributed systems are databases:
Cloud SQL databases like Spanner, Aurora, Azure SQL
NoSQL databases like DynamoDB, Cassandra, MongoDB, Couchbase
We'll focus on concurrency control and recovery; you already know many lessons of distributed query processing

Distributed Concurrency Control
Consider a shared-nothing or distributed DBMS; for today, assume partitioning but no replication of data
Each transaction arrives at some node, which becomes the coordinator for that transaction (e.g., transaction T1)

Where does the Lock Table go?
Typical design: locks are partitioned with the data
Independent: each node manages its own lock table
Works for objects that fit on one node (pages, tuples)
For coarser-grained locks, assign a home node
The object being locked (table, DB) exists across nodes
The home node can be hash-partitioned, or centralized at a master node
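The home-node scheme above can be sketched as follows. This is a minimal illustration assuming a fixed node list and exclusive locks only; the names (lock_home, LockTable, NODES) are hypothetical, not from any real system:

```python
import hashlib

NODES = ["node0", "node1", "node2"]  # hypothetical cluster members

def lock_home(object_name: str) -> str:
    """Hash-partition coarse-grained locks: every node computes the
    same home node for a table/DB lock without any coordination."""
    h = int(hashlib.md5(object_name.encode()).hexdigest(), 16)
    return NODES[h % len(NODES)]

class LockTable:
    """Each node manages its own lock table independently."""
    def __init__(self):
        self.holders = {}  # object name -> transaction id

    def acquire(self, obj: str, txn: str) -> bool:
        # Exclusive locks only; no queueing or deadlock handling here.
        if obj in self.holders and self.holders[obj] != txn:
            return False
        self.holders[obj] = txn
        return True

    def release(self, obj: str, txn: str):
        if self.holders.get(obj) == txn:
            del self.holders[obj]

# One lock table per node; a table-level lock request goes to its home node.
tables = {n: LockTable() for n in NODES}
home = lock_home("Sailors")
print(tables[home].acquire("Sailors", "T1"))  # True
print(tables[home].acquire("Sailors", "T2"))  # False: T2 must wait
```

Fine-grained locks (tuples, pages) need no home node at all: they simply live wherever their data partition lives.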

Distributed Voting for Commitment
How many votes does a commit need to win? ALL of them (unanimous!)
How do we implement distributed voting, in the face of message/node failures and delays?
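The unanimous-voting rule can be sketched in a few lines; this deliberately ignores the failure handling (timeouts, coordinator crashes, logging) that makes real two-phase commit hard, and the participant names are made up:

```python
def decide_commit(votes: dict[str, str]) -> str:
    """A transaction commits only if every participant votes YES.
    Any NO vote (or a missing vote, treated as NO after a timeout)
    forces a global ABORT."""
    if votes and all(v == "YES" for v in votes.values()):
        return "COMMIT"
    return "ABORT"

# All participants vote YES: unanimous, so the coordinator commits.
print(decide_commit({"node0": "YES", "node1": "YES"}))  # COMMIT
# One dissenter (or one timeout) is enough to abort globally.
print(decide_commit({"node0": "YES", "node1": "NO"}))   # ABORT
```

The hard part is everything around this function: persisting votes and decisions so they survive crashes, which is what the two-phase commit protocol adds.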

Distributed Database: Hadoop HBase
Google's BigTable was the first such storage system; Yahoo! open-sourced a similar design as HBase, a major Apache project today
Facebook uses HBase internally
API:
Get/Put(row)
Scan(row range, filter) for range queries
MultiPut
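The shape of this API can be mimicked with a tiny in-memory table whose rows are kept sorted by key, as in HBase/BigTable, so range scans fall out naturally. This is only a sketch of the call shapes (MiniTable and its method names are hypothetical), not the real HBase client:

```python
class MiniTable:
    """Toy HBase-like table: rows sorted by key, columns per row."""
    def __init__(self):
        self.rows = {}  # row key -> dict of column -> value

    def put(self, row, cols):
        self.rows.setdefault(row, {}).update(cols)

    def multi_put(self, batch):
        # MultiPut: several row puts in one call.
        for row, cols in batch.items():
            self.put(row, cols)

    def get(self, row):
        return self.rows.get(row, {})

    def scan(self, start, stop, col_filter=None):
        # Range query over the sorted row-key space, optional column filter.
        for key in sorted(self.rows):
            if start <= key < stop:
                cols = self.rows[key]
                if col_filter is None or col_filter in cols:
                    yield key, cols

t = MiniTable()
t.multi_put({"row1": {"cf:a": "1"}, "row2": {"cf:b": "2"}, "row9": {"cf:a": "9"}})
print(list(t.scan("row1", "row3")))  # row1 and row2 only; row9 is out of range
```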

HBase Architecture
A small group of servers runs Zab, a Paxos-like protocol, for coordination
Data is stored in HDFS

HBase Storage Hierarchy
HBase table: split into multiple regions, replicated across servers
One Store per ColumnFamily (a subset of columns with similar query patterns) per region
MemStore for each Store: in-memory updates to the Store, flushed to disk when full
StoreFiles for each Store for each region: where the data lives, as blocks in an HFile (HBase's version of the SSTable from Google's BigTable)

HFile layout (census-table example: row key SSN:000-00-0000, column family Demographic, qualifier Ethnicity)

Strong Consistency: HBase Write-Ahead Log
Write to the HLog before writing to the MemStore, so the server can recover from failure

Log Replay
After recovery from a failure, or upon bootup (HRegionServer/HMaster):
Replay any stale logs, using timestamps to find out where the database is with respect to the logs
Replay means adding the logged edits back into the MemStore
Why one HLog per HRegionServer rather than one per region? It avoids many concurrent writes, which on the local file system could mean many disk seeks
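The write-ahead discipline and the replay step can be sketched together. This is a toy single-node model, assuming sequence-numbered log records and an in-memory MemStore; all names (RegionServer, flushed_seq, etc.) are hypothetical:

```python
class RegionServer:
    """Toy write-ahead logging: append to the log before applying in memory."""
    def __init__(self):
        self.hlog = []        # durable log of (seq, key, value) edits
        self.memstore = {}    # volatile in-memory store
        self.flushed_seq = 0  # highest seq already persisted in StoreFiles

    def put(self, seq, key, value):
        self.hlog.append((seq, key, value))  # 1. log first (durable)
        self.memstore[key] = value           # 2. then apply in memory

    def crash(self):
        self.memstore = {}  # memory is lost; the log survives

    def replay(self):
        # Re-apply only edits newer than what was already flushed to disk.
        for seq, key, value in self.hlog:
            if seq > self.flushed_seq:
                self.memstore[key] = value

rs = RegionServer()
rs.put(1, "row1", "a")
rs.put(2, "row2", "b")
rs.crash()
rs.replay()
print(rs.memstore)  # both edits recovered from the log
```

The flushed_seq check is why timestamps/sequence numbers matter: edits already captured in StoreFiles must not be blindly re-applied.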

Cross-Data Center Replication
Zookeeper (in effect, a small file system for control information) tracks replication state in znodes:
1. /hbase/replication/state
2. /hbase/replication/peers/<peer cluster number>
3. /hbase/replication/rs/<hlog>

Amazon DynamoDB
Scalable: Dynamo architecture
Reliable: replicas over multiple data centers
Speed: fast, single-digit milliseconds
Secure
Weak (flexible) schema


Data Model
Table: a container, similar to a worksheet in Excel; cannot query across domains
Item: a named set of (attribute, value) pairs
An item is stored in a domain (like a row in a worksheet; attributes are the column names)
Example domain: cars
Item 1: car1: { "make": "BMW", "year": 2009 }

table

Partition keys

Primary Key of a Table
Single key (hash)
Hash-range key: a pair of attributes, where the first is the hash key and the second is the range key
Example: Reply(Id, datetime, ...)
Data types
Simple: string and number
Multi-valued: string set and number set
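The hash-range key scheme can be illustrated with a toy partitioner: the hash key alone decides the partition, and items within a partition stay sorted by the range key, which is what makes range queries cheap. Partition count and all names here are made up for the example:

```python
import hashlib

N_PARTITIONS = 4  # hypothetical partition count

def partition_of(hash_key: str) -> int:
    """The hash key alone decides which partition an item lives on."""
    return int(hashlib.md5(hash_key.encode()).hexdigest(), 16) % N_PARTITIONS

class HashRangeTable:
    def __init__(self):
        self.partitions = {p: [] for p in range(N_PARTITIONS)}

    def put(self, hash_key, range_key, item):
        p = self.partitions[partition_of(hash_key)]
        p.append((hash_key, range_key, item))
        p.sort(key=lambda t: (t[0], t[1]))  # keep range keys ordered

    def query(self, hash_key, lo, hi):
        """Range query touches exactly one partition and scans a
        sorted range-key interval within it."""
        p = self.partitions[partition_of(hash_key)]
        return [it for hk, rk, it in p if hk == hash_key and lo <= rk <= hi]

t = HashRangeTable()
t.put("thread42", "2024-01-01T09:00", "first reply")
t.put("thread42", "2024-01-02T10:00", "second reply")
t.put("thread7", "2024-01-01T12:00", "other thread")
print(t.query("thread42", "2024-01-01", "2024-01-31"))
```

This mirrors the Reply(Id, datetime) example: Id is the hash key, datetime the range key, so "all replies in a thread for January" is one partition scan.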

Access Methods
Amazon DynamoDB is a web service that uses HTTP/HTTPS as the transport and JSON (JavaScript Object Notation) as the message serialization format
API bindings: Java, PHP, .NET, Boto (Python)

CloudFront
For content delivery: distributes content to end users through a global network of edge locations
Edges: servers close to the user's geographical location
Objects are organized into distributions; each distribution has a domain name
Distribution content is stored in an S3 bucket

Use Cases
Hosting your most frequently accessed website components: small pieces of your website are cached in the edge locations, making them ideal for Amazon CloudFront
Distributing software: deliver applications, updates, or other downloadable software to end users
Publishing popular media files: useful if your application involves rich media (audio or video) that is frequently accessed

Simple Queue Service
Stores messages traveling between computers, making it easy to build automated workflows
Implemented as a web service: read/add messages easily
Scalable to millions of messages a day

Some Features
Message body: < 8 KB, in any format
Messages are retained in queues for up to 4 days
Messages can be sent and read simultaneously
A message being processed can be locked, keeping it from simultaneous processing by other readers
Accessible with SOAP/REST
Simple: only a few methods
Secure sharing
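The per-message lock above (a visibility timeout, in SQS terms) can be sketched as a toy single-process queue; the class, its names, and the timeout value are all made up for illustration:

```python
import time

class ToyQueue:
    """Messages stay in the queue when read; they are merely hidden
    (locked) for a while so two consumers don't process the same one."""
    def __init__(self, lock_seconds=30):
        self.lock_seconds = lock_seconds
        self.messages = []  # [body, invisible_until] pairs

    def send(self, body):
        self.messages.append([body, 0.0])

    def receive(self):
        now = time.time()
        for msg in self.messages:
            if msg[1] <= now:                     # not locked right now
                msg[1] = now + self.lock_seconds  # lock it for this consumer
                return msg[0]
        return None

    def delete(self, body):
        # The consumer deletes the message once processing succeeds;
        # otherwise the lock expires and the message becomes visible again.
        self.messages = [m for m in self.messages if m[0] != body]

q = ToyQueue(lock_seconds=30)
q.send("job-1")
m = q.receive()
print(m)            # job-1
print(q.receive())  # None: the message is locked, not gone
q.delete(m)
```

If the consumer crashes before delete(), the lock simply expires and another consumer picks the message up, which is why this pattern survives failures.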

A typical workflow

Workflow with AWS

Conclusion
Dynamo is a horizontally scalable distributed key-value store built to meet stringent performance requirements
Compared to a relational database, Dynamo is more transparent about how the system works, rather than being a black box
Application developers get more flexibility and control: they can tune parameters to best suit the needs of their application
It emphasizes the increasing importance of availability and performance over strong consistency

Google BigTable: Introduction
Development began in 2004 at Google (published 2006)
A need to store and handle large amounts of (semi-)structured data
Many Google projects store data in BigTable

Goals of BigTable
Asynchronous processing across continuously evolving data, petabytes in size
High volume of concurrent reading/writing spanning many CPUs
Ability to conduct analysis across many subsets of the data
Temporal analysis (e.g., how do anchors or content change over time?)
Can work well with many

BigTable in a Nutshell
Distributed multi-level map
Fault-tolerant
Scalable: thousands of servers, terabytes of memory-based data, petabytes of disk-based data, millions of reads/writes per second
Self-managing: dynamic server management

Building Blocks
Google File System (GFS) is used for BigTable's storage
A scheduler assigns jobs across many CPUs and watches for failures
A lock service acts as the distributed lock manager
MapReduce is often used to read/write data to BigTable; BigTable can be an input or an output

Data Model: Example of Web Indexing
A semi-three-dimensional data cube
Input: (row, column, timestamp)
Output: cell contents

Data model components: rows, columns, cells, timestamps, and column families (named as family:qualifier)

Data Model - Timestamps
Used to store different versions of data in a cell
New writes default to the current time
Lookup options:
Return the most recent K values
Return all values in a timestamp range
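Both lookup options fall out naturally if each cell keeps its versions sorted by timestamp; a minimal sketch with hypothetical names:

```python
import bisect

class VersionedCell:
    """A cell holds multiple timestamped versions, as in BigTable."""
    def __init__(self):
        self.versions = []  # (timestamp, value), kept sorted by timestamp

    def write(self, value, ts):
        bisect.insort(self.versions, (ts, value))

    def latest_k(self, k):
        """Most recent K values, newest first."""
        return [v for _, v in self.versions[-k:]][::-1]

    def in_range(self, lo, hi):
        """All values with lo <= timestamp <= hi."""
        return [v for ts, v in self.versions if lo <= ts <= hi]

cell = VersionedCell()
cell.write("v1", ts=100)
cell.write("v2", ts=200)
cell.write("v3", ts=300)
print(cell.latest_k(2))        # ['v3', 'v2']
print(cell.in_range(150, 250)) # ['v2']
```

A real system would also garbage-collect old versions (keep-last-K or keep-recent policies), which the sorted layout makes cheap.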

System Structure
Bigtable cell: one Bigtable master plus many Bigtable tablet servers
Bigtable master: performs metadata operations and load balancing
Bigtable tablet servers: each serves data for a set of tablets
Bigtable client + client library: Open() consults metadata; reads/writes then go directly to the tablet servers
Underneath:
Cluster scheduling system: handles failover and monitoring
GFS: holds tablet data and logs
Lock service: holds metadata, handles master election

Locating Tablets
Metadata for tablet locations and start/end rows is stored in a special Bigtable cell

Reading/Writing to Tablets
Write commands:
The write command is first put into a queue/log of commands for that tablet
Data is written to GFS, and when the write command commits, the queue is updated
The write is mirrored in the tablet's in-memory buffer
Read commands:
Must combine the buffered commands not yet committed with the data in GFS
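The read path, overlaying a tablet's in-memory buffer on the base data in GFS, can be sketched as a simple shadowing lookup. This is a toy model with hypothetical names; the real system merges a MemTable with multiple SSTables:

```python
class Tablet:
    """Reads overlay recent buffered writes on top of the persisted base data."""
    def __init__(self, base):
        self.base = base      # data already in GFS (SSTables)
        self.log = []         # commit log of (key, value) writes
        self.buffer = {}      # in-memory mirror of logged writes

    def write(self, key, value):
        self.log.append((key, value))  # logged first
        self.buffer[key] = value       # then mirrored in memory

    def read(self, key):
        # Buffered (newer) writes win over the base data.
        if key in self.buffer:
            return self.buffer[key]
        return self.base.get(key)

t = Tablet(base={"row1": "old", "row2": "b"})
t.write("row1", "new")
print(t.read("row1"))  # new: the buffered write shadows the base
print(t.read("row2"))  # b: served straight from the base data
```

Periodic compactions fold the buffer back into the base files, bounding how much a read has to merge.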

API
Metadata operations: create and delete tables and column families; change metadata
Writes (atomic per row):
Set(): write cells in a row
DeleteCells(): delete cells in a row
DeleteRow(): delete all cells in a row
Reads:
Scanner: read arbitrary cells in BigTable; each row read is atomic
Can restrict returned rows to a particular range
Can ask for data from one row, all rows, or a subset of rows
Can ask for all columns, just certain column families, or specific columns

Compression
Low-CPU-cost compression techniques are adopted, applied across each SSTable for a locality group
Two building blocks: BMDiff and Zippy
Keys: sorted strings of (row, column, timestamp)
Values: grouped by type/column-family name
BMDiff runs across all values in one family; Zippy is the final pass over a whole block
Catches more localized repetitions, and also cross-column-family repetition
Empirical results show compression by a factor of about 10
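The effect behind these numbers (sorted, repetitive keys compress extremely well) can be demonstrated with zlib as a stand-in, since BMDiff and Zippy themselves are not publicly available; the key pattern here is invented for the example:

```python
import zlib

# Sorted row keys share long prefixes, exactly the kind of repetition
# that BMDiff-style long-range compression exploits in BigTable.
keys = "".join(f"com.example.www/page{i:06d}\n" for i in range(5000))
raw = keys.encode()
compressed = zlib.compress(raw, level=6)

ratio = len(raw) / len(compressed)
print(f"{len(raw)} -> {len(compressed)} bytes, ratio {ratio:.1f}x")
assert ratio > 5  # highly repetitive sorted keys compress far better than random text
```

Storing keys in sorted order is thus not only a query-performance decision; it is what makes cheap compression pay off so well.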

References
https://aws.amazon.com/cn/documentation/dynamodb/
https://aws.amazon.com/cn/dynamodb/
https://en.wikipedia.org/wiki/amazon_dynamodb
Chang, F., Dean, J., Ghemawat, S., et al. "Bigtable: A Distributed Storage System for Structured Data." Symposium on Operating Systems Design and Implementation (OSDI), USENIX Association, 2006: 205-218.
MapReduce/Bigtable for Distributed Optimization
https://pt.wikipedia.org/wiki/drbd
https://hbase.apache.org/apache_hbase_reference_guide.pdf
https://phoenix.apache.org/presentations/oc-hug-2014-10-4x3.pdf
https://www.cloudera.com/documentation/enterprise/5-9-x/pdf/clouderahbase.pdf
www.cs.utexas.edu/~dsb/cs386d/projects14/hbase.pdf
https://openproceedings.org/2016/conf/edbt/paper-298.pdf
https://d0.awsstatic.com/whitepapers/cassandra_on_aws.pdf
https://www.tutorialspoint.com/cassandra/cassandra_tutorial.pdf

Thank you!