Using space-filling curves for multidimensional

Similar documents
CISC 7610 Lecture 2b The beginnings of NoSQL

10 Million Smart Meter Data with Apache HBase

CS 655 Advanced Topics in Distributed Systems

Chapter 11: Indexing and Hashing

Comparing SQL and NOSQL databases

Topics. Big Data Analytics What is and Why Hadoop? Comparison to other technologies Hadoop architecture Hadoop ecosystem Hadoop usage examples

Chapter 11: Indexing and Hashing" Chapter 11: Indexing and Hashing"

Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2017)

NoSQL Databases. Amir H. Payberah. Swedish Institute of Computer Science. April 10, 2014

NOSQL EGCO321 DATABASE SYSTEMS KANAT POOLSAWASD DEPARTMENT OF COMPUTER ENGINEERING MAHIDOL UNIVERSITY

Non-Relational Databases. Pelle Jakovits

Introduction to NoSQL Databases

Introduction to Big Data. NoSQL Databases. Instituto Politécnico de Tomar. Ricardo Campos

Apache Hadoop Goes Realtime at Facebook. Himanshu Sharma

Ghislain Fourny. Big Data 5. Column stores

BigTable: A Distributed Storage System for Structured Data (2006) Slides adapted by Tyler Davis

Database System Concepts, 6 th Ed. Silberschatz, Korth and Sudarshan See for conditions on re-use

Big Data Technology Ecosystem. Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara

Bigtable. A Distributed Storage System for Structured Data. Presenter: Yunming Zhang Conglong Li. Saturday, September 21, 13

Stages of Data Processing

Data Informatics. Seon Ho Kim, Ph.D.

Strategies for Incremental Updates on Hive

MapReduce and Friends

CS34800 Information Systems

CS November 2018

Extreme Computing. NoSQL.

CS November 2017

B-Tree. CS127 TAs. ** the best data structure ever

BigTable: A Distributed Storage System for Structured Data

NoSQL systems. Lecture 21 (optional) Instructor: Sudeepa Roy. CompSci 516 Data Intensive Computing Systems

Study of NoSQL Database Along With Security Comparison

Challenges for Data Driven Systems

A Fast and High Throughput SQL Query System for Big Data

Chapter 12: Indexing and Hashing. Basic Concepts

Flash Storage Complementing a Data Lake for Real-Time Insight

Typical size of data you deal with on a daily basis

File Structures and Indexing

Big Data Hadoop Course Content

Cassandra- A Distributed Database

THE ATLAS DISTRIBUTED DATA MANAGEMENT SYSTEM & DATABASES

Chapter 12: Indexing and Hashing

Database index structures

Advanced Data Management Technologies Written Exam

A Glimpse of the Hadoop Echosystem

Indexing. Week 14, Spring Edited by M. Naci Akkøk, , Contains slides from 8-9. April 2002 by Hector Garcia-Molina, Vera Goebel

Jargons, Concepts, Scope and Systems. Key Value Stores, Document Stores, Extensible Record Stores. Overview of different scalable relational systems

CA485 Ray Walshe NoSQL

Scaling Up HBase. Duen Horng (Polo) Chau Assistant Professor Associate Director, MS Analytics Georgia Tech. CSE6242 / CX4242: Data & Visual Analytics

Distributed File Systems II

Chapter 2: Intro to Relational Model

Big Data Analytics using Apache Hadoop and Spark with Scala

Chapter 11: Indexing and Hashing

Chapter 11: Indexing and Hashing

Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2016)

Goal of the presentation is to give an introduction of NoSQL databases, why they are there.

Distributed computing: index building and use

The State of Apache HBase. Michael Stack

Chapter 12: Indexing and Hashing

Chap4: Spatial Storage and Indexing. 4.1 Storage:Disk and Files 4.2 Spatial Indexing 4.3 Trends 4.4 Summary

Column-Family Databases Cassandra and HBase

Course: Database Management Systems. Lê Thị Bảo Thu

ADVANCED HBASE. Architecture and Schema Design GeeCON, May Lars George Director EMEA Services

Introduction to Column Oriented Databases in PHP

How do we build TiDB. a Distributed, Consistent, Scalable, SQL Database

Bigtable: A Distributed Storage System for Structured Data. Andrew Hon, Phyllis Lau, Justin Ng

COSC 6339 Big Data Analytics. NoSQL (II) HBase. Edgar Gabriel Fall HBase. Column-Oriented data store Distributed designed to serve large tables

CompSci 516 Database Systems

CISC 7610 Lecture 5 Distributed multimedia databases. Topics: Scaling up vs out Replication Partitioning CAP Theorem NoSQL NewSQL

DEMYSTIFYING BIG DATA WITH RIAK USE CASES. Martin Schneider Basho Technologies!

CMSC 461 Final Exam Study Guide

Cassandra, MongoDB, and HBase. Cassandra, MongoDB, and HBase. I have chosen these three due to their recent

Overview. * Some History. * What is NoSQL? * Why NoSQL? * RDBMS vs NoSQL. * NoSQL Taxonomy. *TowardsNewSQL

Chapter 11: Indexing and Hashing

HyperDex. A Distributed, Searchable Key-Value Store. Robert Escriva. Department of Computer Science Cornell University

Database Evolution. DB NoSQL Linked Open Data. L. Vigliano

Distributed computing: index building and use

COSC 416 NoSQL Databases. NoSQL Databases Overview. Dr. Ramon Lawrence University of British Columbia Okanagan

Map-Reduce. Marco Mura 2010 March, 31th

Distributed Data Analytics Partitioning

CIB Session 12th NoSQL Databases Structures

HBASE INTERVIEW QUESTIONS

Presented by Sunnie S Chung CIS 612

We are ready to serve Latest Testing Trends, Are you ready to learn?? New Batches Info

SEARCHING BILLIONS OF PRODUCT LOGS IN REAL TIME. Ryan Tabora - Think Big Analytics NoSQL Search Roadshow - June 6, 2013

Ghislain Fourny. Big Data 5. Wide column stores

Introduction Aggregate data model Distribution Models Consistency Map-Reduce Types of NoSQL Databases

8/24/2017 Week 1-B Instructor: Sangmi Lee Pallickara

/ Cloud Computing. Recitation 10 March 22nd, 2016

COMP 430 Intro. to Database Systems. Indexing

Topics to Learn. Important concepts. Tree-based index. Hash-based index

CMU SCS CMU SCS Who: What: When: Where: Why: CMU SCS

CSE-E5430 Scalable Cloud Computing Lecture 9

Introduction to BigData, Hadoop:-

Distributed Data Store

10. Replication. Motivation

CSE 124: Networked Services Lecture-16

Cloudera Kudu Introduction

BigTable. CSE-291 (Cloud Computing) Fall 2016

Relational databases

Distributed Non-Relational Databases. Pelle Jakovits

Transcription:

Using space-filling curves for multidimensional indexing Dr. Bisztray Dénes Senior Research Engineer 1 Nokia Solutions and Networks 2014

In medias res Performance problems with RDBMS Switch to NoSQL store (HBase) Query is based on multiple properties Multidimensional indexing Space-filling curves 2 Nokia Solutions and Networks 2014

Table of Contents Brief introduction to indexes and bases The main topic, i.e. problem statement and ways to solve the problem Solution and Results 3 Nokia Solutions and Networks 2014

Databases and Indexes Introduction 4 Nokia Solutions and Networks 2014

Example of a Relation attributes (or columns) tuples (or rows) 5 Nokia Solutions and Networks 2014 Slide content Silberschatz, Korth. Sudarshan, 2010

Relation schema A 1, A 2,, A n are attributes R = (A 1, A 2,, A n ) is a relation schema Example: instructor = (ID, name, dept_name, salary) Formally, given sets D 1, D 2,. D n a relation r is a subset of D 1 x D 2 x x D n Thus, a relation is a set of n-tuples (a 1, a 2,, a n ) where each a i D i The current values (relation instance) of a relation are specified by a table An element t of r is a tuple, represented by a row in a table 6 Nokia Solutions and Networks 2014 Slide content Silberschatz, Korth. Sudarshan, 2010

Keys Let K R K is a superkey of R if values for K are sufficient to identify a unique tuple of each possible relation r(r) - Example: {ID} and {ID,name} are both superkeys of instructor. Superkey K is a candidate key if K is minimal Example: {ID} is a candidate key for Instructor One of the candidate keys is selected to be the primary key. - which one? 7 Nokia Solutions and Networks 2014 Slide content Silberschatz, Korth. Sudarshan, 2010

Indexing Basic Concepts Indexing mechanisms used to speed up access to desired. - E.g., author catalog in library Search Key - attribute to set of attributes used to look up records in a file. An index file consists of records (called index entries) of the form search key pointer Index files are typically much smaller than the original file Two basic kinds of indices: - Ordered indices: search keys are stored in sorted order - Hash indices: search keys are distributed uniformly across buckets using a hash function. 8 Nokia Solutions and Networks 2014 Slide content Silberschatz, Korth. Sudarshan, 2010

Ordered Indices In an ordered index, index entries are stored sorted on the search key value. E.g., author catalog in library. Primary index: in a sequentially ordered file, the index whose search key specifies the sequential order of the file. - Also called clustering index - The search key of a primary index is usually but not necessarily the primary key. Secondary index: an index whose search key specifies an order different from the sequential order of the file. Also called non-clustering index. Index-sequential file: ordered sequential file with a primary index. 9 Nokia Solutions and Networks 2014 Slide content Silberschatz, Korth. Sudarshan, 2010

Dense Index Files Dense index Index record appears for every search-key value in the file. E.g. index on ID attribute of instructor relation 10 Nokia Solutions and Networks 2014 Slide content Silberschatz, Korth. Sudarshan, 2010

Sparse Index Files Sparse Index: contains index records for only some search-key values. - Applicable when records are sequentially ordered on search-key To locate a record with search-key value K we: - Find index record with largest search-key value < K - Search file sequentially starting at the record to which the index record points 11 Nokia Solutions and Networks 2014 Slide content Silberschatz, Korth. Sudarshan, 2010

Secondary Indices Frequently, one wants to find all the records whose values in a certain field (which is not the search-key of the primary index) satisfy some condition. - Example 1: In the instructor relation stored sequentially by ID, we may want to find all instructors in a particular department - Example 2: as above, but where we want to find all instructors with a specified salary or with salary in a specified range of values We can have a secondary index with an index record for each searchkey value 12 Nokia Solutions and Networks 2014 Slide content Silberschatz, Korth. Sudarshan, 2010

Secondary Indices Example Secondary index on salary field of instructor Index record points to a bucket that contains pointers to all the actual records with that particular search-key value. Secondary indices have to be dense 13 Nokia Solutions and Networks 2014 Slide content Silberschatz, Korth. Sudarshan, 2010

Multi dimensional indexing and the space filling curves 14 Nokia Solutions and Networks 2014

Application Characteristics Statistic analysis of massive CDR to generate Business Reports: - On-demand report with URL and Time Range - On-demand report with MSISDN and Time Range Data-intensive and performance-critical application: - Cut down the reporting response time from hours to minutes - Extremely high load to Disk I/O due to high throughput requirement 15 Nokia Solutions and Networks 2014

Technical Requirements for Storage Items Product 1 Product 2 Record Size 450Byte/record Multimedia Message (MM): 150KB - 10MB Short Message (SMS): 160Byte Data Retention time 3 months 6 months Concurrency 20,000 per second MM 200 requests/second for writing 1000 requests/second for reading SMS 800 requests/second for writing 1000 requests/second for reading Total size 70 TB 30 TB Latency Write delay < 10ms Reporting < 5 minutes MM reading delay < 0.2s, writing delay <0.4s SMS reading delay < 0.1s, writing delay < 0.2s Throughputs 72 Mbps for writing 584 Mbps for reading (1hour) 14.4 Gbps for reading (1day) 432 Gbps for reading (1month) MM SMS 1.228 Gbps for reading 246 Mbps for writing 1 Mbps for reading 1.28 Mbps for writing 16 Nokia Solutions and Networks 2014

Existing Solution and Disadvantage Oracle base & EMC disk array Traditional RDBMS became bottleneck for Big Data storage and processing Low performance in Data Intensive application, e.g. Generating business reports through big analysis Incapable Scale-In/Out for ever-increasing volume 17 Nokia Solutions and Networks 2014

RDBMS Scaling RDBMS Almost raw with optimised machinery Add more application servers Add slave base servers for parallel read Add cache Improve Main Server Hardware Denormalise bse because of join speed bottlenecks Prematerialise costly queries Drop secondary indexes because of maintenance slowdown Only the master is taking the writes Loose consistency guarantees Increases costs and not scalable to infinity Breakes RDMBS paradigm 18 Nokia Solutions and Networks 2014

CAP Theorem a guarantee that every request receives a response about whether it was successful or failed Availability MySQL, Postgres, etc Pick Two Cassandra, SimpleDB, Riak, Dynamo, etc Consistency all nodes see the same at the same time MongoDB, Hbase, BigTable, Redis, etc Partition Tolerance the system continues to operate despite arbitrary message loss or failure of part of the system 19 Nokia Solutions and Networks 2014

Hbase Open source, distributed sorted map store Open source: Apache 2.0 license Distributed - Store and access on 1-700 commodity servers - Linear scaling of capacity and IOPS by adding servers Sorted Map Datastore - Not a relational base - Tables consists of rows, each of which has a primary key (row key) - Each row may have any number of columns like a Map<byte[], byte[]> - Rows are stored in sorted order 20 Nokia Solutions and Networks 2014

Sorted Map Datastore Implicit PRIMARY KEY in RDBMS terms Row key Data Data is all byte[] in HBase Different types of separated into different column families cutting tlipcon info: { height : 274cm, state : CA } roles: { ASF : Director, Hadoop : Founder } info: { height : 170cm, state : CA } roles: { Hadoop : Committer @ts=2010, Hadoop : PMC @ts=2011, Hive : Contributor } Different rows may have different sets of columns (table is sparse) A single cell might have different values at different timestamps 21 Nokia Solutions and Networks 2014 Slide content Cloudera Inc.

Scalability with Regions Table contains sorted rows and dynamically splitted into regions - Rows stored in byte-lexicographic order based on rowkey - Somewhat similar to Relational DB Primary Index (always unique) Region is a continuous set of sorted rows Hbase ensures that all cells of the same rowkey are all on the same server 22 Nokia Solutions and Networks 2014

Table and Regions Rows Region Server 1 Region Server 2 Region Server 3 A... H.... Q.. Z Keys: [T-Z) Keys: [A-C) Keys: [I-M) Keys: [M-T) Keys: [F-I) Keys: [C-F) 23 Nokia Solutions and Networks 2014 Slide content Cloudera Inc.

Hbase Query Methods Get - Retrieves a single row using a rowkey Scan - Scans the full table Range - Retrieves a range of rows between a given startrow and stoprow 24 Nokia Solutions and Networks 2014

Challenges Handling multi-dimensional distribution - Transformation of in multi-dimension space to single dimension (NoSQL/HBase is single rowkey design) - Adopt a linearization technique Data Partition - Data must be well organized and distributed over nodes to ensure query efficiency and to avoid hotspot in accessing - Pre-defined region instead of automatic region split in Hbase - Keep frequently retrieved stored together to utilize Range Query as much as possible inside of region 25 Nokia Solutions and Networks 2014

Rowkey Design Rowkey algorithm by a space-filling curve Good extendibility in space size (recursion level and region placeholder) Spatial indexing is natively supported (while subdividing, prefix also fit to curve) Easy to build up additional indexing on top of it (Prefix Hash Tree, B Tree, KD-Tree etc.) Different granularity in sub-space size 26 Nokia Solutions and Networks 2014

Rowkey algorithm by Z-ordering It loosely preserves the locality of -points in the multi-dimensional space Easy to implement. - Binary Z-ordering space, bitwise interleaving. - Using the Z-order value as rowkey. 11 10 01 00 11 10 01 00 00 01 10 11 Z-order encoding (2-D) 00 01 10 11 Z-order curve (2-D) multiple-dimensional Z-order curve 27 Nokia Solutions and Networks 2014

Rowkey algorithm by Hilbert Curve - little complex compared to the Z-order curve - better one-dimensional continuity 11 11 10 01 00 10 01 00 00 01 10 11 Hilbert encoding (2-D) 00 01 10 11 Hilbert curve (2-D) multiple-dimensional Hilbert curve 28 Nokia Solutions and Networks 2014

Hilbert Curve Generation first order second order third order fourth order 29 Nokia Solutions and Networks 2014

Z-order Curve and Hilbert Curve compared Space aggregation Hilbert curve keeps better space aggregation than Z-order curve, which can be seen from left figure. For the same region of space, the Hilbert curve has less falsepositives than Z-order. Calculation complexity Hilbert curve is more complicated in calculation, which aims to keep the space aggregation of points. Z-order curve Hilbert curve 30 Nokia Solutions and Networks 2014

Data partitioning based on multi-dimension It can be achieved by pre-defined regions (bucket) in Hbase - Automatic region split can t be used in consideration of performance Region split solutions - Trie-based space split (by the mid-point of dimension) - Point-based space split (by the median of points) 31 Nokia Solutions and Networks 2014

Data Partitioning Two dimensional distribution by time and MSISDN MSISDN dimension: -users will be evenly distributed to different regions - avoids the Writing Hotspot in case of high concurrent insert (20,000 per second) Time dimension -Inside a region, one users s will be written in a time continuous way -increases query efficiency 32 Nokia Solutions and Networks 2014

Binary Tree for Hilbert curve 33 Nokia Solutions and Networks 2014

Data Partition: Trie-based region split splits the space at the midpoint of all dimensions - equal size splits in spacefilling curve - each subspace uniquely corresponds to a segment of the curve. 34 Nokia Solutions and Networks 2014

KD-Tree Splits the space by the median of points Results in subspaces with equal number of points Data point distribution must be known beforehand to select median 35 Nokia Solutions and Networks 2014

Solution and Results 36 Nokia Solutions and Networks 2014

Rowkey algorithm based on Hilbert Curve The rowkey algorithm uses multiple attributes to generate Hilbert curve. Each value in Hilbert curve represents the rowkey in HBase. Rowkey algorithm... Rowkey Rowkey_1 Rowkey_2 Rowkey_3...... Rowkey_n 37 Nokia Solutions and Networks 2014

Trie-based region split Selecting the middle-point of each dimension to make customized region presplit Each segment represents a range Hilbert value and corresponds to a region.... Rowkey algorithm Hilbert curve Value_ Range_1 Value_ Range_2 Value_ Range_3... Value_ Range_k Value_ Range_n HBase Region_1 Region_2 Region_3... Region_k Region_n 38 Nokia Solutions and Networks 2014

Data Distribution HBase table is divided into multiple regions, and each region contains a part of (multiple rows), which have continuous rowkey in value. The region is represented by startkey and endkey, and it takes one row in.meta. table of HBase (the row also contains other information related to this region). Our approach uses the.meta. table as mapping structure to distribute user across all regions. Rowkey algorithm... 39 Nokia Solutions and Networks 2014 Mapping structure of to region Start Key 0 3 4 7 8 11 12 15 End Key Rowkey_0 RowKey_1 Rowkey_2 Rowkey_3 Rowkey_4 RowKey_5 Rowkey_6 Rowkey_7... Rowkey_12 RowKey_13 Rowkey_14 Rowkey_15 value value value value value value value value value value value value Region 1 Region 2 Region 4

Results URL Query Performance Response time is decreased from hours to minutes (comparing to original Oracle solution) ~3x performance speedup after optimizations Experiment with 1 master + 8 node Better performance after the extension of nodes and memory 40 Nokia Solutions and Networks 2014

Results MSISDN Query Performance 1day 1 hour 1 month 41 Nokia Solutions and Networks 2014 3 month 2 month One hour 13s (1 Map) 13s (1 Map) Speedup = 1 24 hours 18s (24 Map) 18s (1 Map) Speedup = 1 720 hours (1 month) 33s (8 Maps) 113s (720 Maps) Speedup = 3.42x 1440 hours (2 months) 2160 hours (3 months) 49s (8 Maps) 216s (1440 Maps) Speedup = 3.54x 60s (8 Maps) 327s (2160 Maps) Speedup = 5.45x 1 minute to scan all of one subscriber within 3 month, it needs more than 5 minutes before optimization 3~5x performance speedup after optimization Experiment with 1 master + 8 node Better performance after the extension of nodes and memory

Results Data Writing Performance Data Writing latency is less than 0.05ms on average (requirement is <10ms) Before Optimization, region flush, split and compaction will affect writing latency badly, even cause service break Region split and compaction storm is avoided by scheduled region compaction, flush and advanced region split, performance curve of high concurrent writing become stable. 42 Nokia Solutions and Networks 2014

Achievement Response time of reporting is decreased from hours to minutes comparing to the original Oracle solution 4 minutes to scan 2 hours (144m records) in high concurrency (20,000tps), 3x performance speedup by Hadoop & Hbase optimization 1 minute to scan all of one subscriber within 3 month, 3~5x performance speedup by further Hadoop & HBase optimization Writing latency is less than 0.05ms, much less than requirement (<10ms) 43 Nokia Solutions and Networks 2014