Access Path Selection in Main-Memory Optimized Data Systems
1 Access Path Selection in Main-Memory Optimized Data Systems: Should I Scan or Should I Probe? Manos Athanassoulis, Harvard University. Talk at CS265, February 16th
2 Access Path Selection: SELECT x FROM table_a WHERE y < 10; ... the system chooses how to retrieve it.
3 Access Path Choices: Full Base Data Scan; Secondary Index Scan (auxiliary copy of the data + structure)
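The two access paths above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy contents of `table_a`, the helper names, and the choice of a sorted list as the "auxiliary copy of the data + structure" are all hypothetical and not taken from the talk.

```python
from bisect import bisect_left

# Hypothetical toy table stored as parallel columns x and y.
table_a = {"x": [10, 11, 12, 13, 14, 15], "y": [42, 3, 17, 8, 25, 9]}

def full_scan(table, bound):
    """Full base data scan: test the predicate y < bound on every row."""
    return [x for x, y in zip(table["x"], table["y"]) if y < bound]

def build_secondary_index(table):
    """Auxiliary sorted copy of y, each value paired with its row position."""
    return sorted((y, pos) for pos, y in enumerate(table["y"]))

def index_scan(table, index, bound):
    """Secondary index scan: find the qualifying prefix, then fetch x by position."""
    end = bisect_left(index, (bound,))          # first entry with y >= bound
    positions = sorted(pos for _, pos in index[:end])  # restore row order
    return [table["x"][pos] for pos in positions]

index_on_y = build_secondary_index(table_a)
scan_result = full_scan(table_a, 10)
probe_result = index_scan(table_a, index_on_y, 10)
```

Both paths return the same rows for `WHERE y < 10`; they differ only in how much data they touch, which is what the access path decision is about.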
5 [Figure: selectivity axis starting at 0%; "Index is best" near 0%, "Scan is best" at higher selectivity] Why ask this question anew?
6 New workloads, new architectures, new hardware: more and more concurrent similar read queries; columnar data organization; main-memory optimized data systems
7 Modern data systems for analytics
8 Column(-group) Orientation; Compression
9 SIMD; Multi-Core; Vectorized Processing [Figure: cores C1, C2, C3, C4]
10 1:1 Scan to Result vs. Shared Scans [Figure: scans S1, S2, S3 each producing one result r1, r2, r3, versus a single shared scan S1 producing r1, r2, r3]
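A shared scan can be sketched as one pass over the column that serves several queries at once. The function name and the restriction to predicates of the form `value < bound` are assumptions made for this illustration, not details from the slides.

```python
def shared_scan(column, bounds):
    """Answer q predicates (value < bound) with a single pass over the column.

    Returns one list of qualifying row positions per query.
    """
    results = [[] for _ in bounds]
    for pos, value in enumerate(column):       # the data is read only once
        for qi, bound in enumerate(bounds):    # predicate work grows with q
            if value < bound:
                results[qi].append(pos)
    return results

positions_per_query = shared_scan([5, 1, 9, 3], [4, 10])
```

The data movement is paid once, but the predicate evaluation in the inner loop is still paid per query.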
11 Q0, Q1, Q2, ... Qn
12 Are indexes ever needed? Q0, Q1, Q2, ... Qn
13 Are indexes ever needed?
14 If so, how should the optimizer choose an access path? Q0, Q1, Q2, ... Qn
15 APS ratio
$$\text{APS ratio} = \frac{\text{Index Cost}}{\text{Scan Cost}}$$
ratio > 1: scan; ratio < 1: index. Let us model.
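16 Access path selection modeling: scan
Parameters: $N$ rows, $ts$ tuple size, $BW$ memory bandwidth, $sel$ query selectivity.
$$\text{ScanData} = \frac{N \cdot ts}{BW}$$
$$\text{ResultWrite} = sel \cdot \frac{N \cdot ts}{BW}$$
What if we have q queries?
17 Access path selection modeling: q concurrent queries
Parameters: $N$ rows, $ts$ tuple size, $BW$ memory bandwidth, $sel_i$ selectivity of query $i$, $q$ queries, $p$ predicate evaluation cost (CPU).
$$\text{ScanData} = \max\left(\frac{N \cdot ts}{BW},\; q \cdot p \cdot N\right)$$
The second argument is the CPU cost per query, summed over the q queries (may be too high!).
$$\text{ResultWrite} = \sum_{i=1}^{q} sel_i \cdot \frac{N \cdot ts}{BW} = S_q \cdot \frac{N \cdot ts}{BW}$$
$$S_q = \sum_{i=1}^{q} sel_i \quad \text{(total query selectivity, sum)}$$
18 Scan cost
$$\text{ScanCost} = \max\Big(\underbrace{\frac{N \cdot ts}{BW}}_{\text{ScanData}},\ \underbrace{q \cdot p \cdot N}_{\text{PredicateEval}}\Big) + \underbrace{S_q \cdot \frac{N \cdot ts}{BW}}_{\text{ResultWrite}}$$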
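The shared-scan cost above can be evaluated with a short Python sketch. It assumes the form max(ScanData, PredicateEval) + ResultWrite with PredicateEval = q · p · N; the function name, argument order, and the example numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def scan_cost(N, ts, BW, p, sels):
    """Cost of one shared scan serving len(sels) concurrent queries.

    N: rows, ts: tuple size, BW: memory bandwidth,
    p: predicate evaluation cost per tuple (CPU),
    sels: per-query selectivities (fractions in [0, 1]).
    """
    q = len(sels)
    S_q = sum(sels)                      # total query selectivity
    scan_data = N * ts / BW              # move the column through memory once
    predicate_eval = q * p * N           # CPU work grows with the number of queries
    result_write = S_q * N * ts / BW     # write out all qualifying tuples
    return max(scan_data, predicate_eval) + result_write

# Two queries: the scan is bandwidth-bound (1.0 > 0.2).
few_queries = scan_cost(1000, 4, 4000, 0.0001, [0.1, 0.2])
# Twenty queries: predicate evaluation dominates (2.0 > 1.0).
many_queries = scan_cost(1000, 4, 4000, 0.0001, [0.01] * 20)
```

With few queries the `max` picks the data movement term; with many queries it picks the CPU term, which is the point at which sharing a scan stops being free.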
19 Access path selection modeling: index
Parameters: $N$ rows, $ts$ tuple size, $BW$ memory bandwidth, $sel$ query selectivity, $q$ queries, $p$ predicate evaluation cost (CPU), $b$ branching factor (fanout), $C_M$ cache miss latency, $its$ index tuple size.
Traverse tree:
$$\text{TreeTraversal} = \log_b(N) \cdot C_M$$
Traverse leaves + data:
$$\text{LeavesTraversal} = sel \cdot \frac{N}{b} \cdot C_M$$
$$\text{DataTraversal} = sel \cdot \frac{N \cdot its}{BW}$$
Result write:
$$\text{ResultWrite} = sel \cdot \frac{N \cdot ts}{BW}$$
How to make the index a drop-in replacement?
20 Access path selection modeling: index with sort
TreeTraversal, LeavesTraversal, DataTraversal and ResultWrite as on slide 19, plus a sort step so that the index can be a drop-in replacement for the scan:
$$\text{Sort} = sel \cdot N \cdot \log_2(sel \cdot N) \cdot \frac{its}{BW}$$
21 Index cost
$$\text{IndexCost} = \underbrace{q \cdot \log_b(N) \cdot C_M}_{\text{TreeTraversal}} + \underbrace{S_q \cdot \frac{N}{b} \cdot C_M}_{\text{LeavesTraversal}} + \underbrace{S_q \cdot \frac{N \cdot its}{BW}}_{\text{DataTraversal}} + \underbrace{S_q \cdot \frac{N \cdot ts}{BW}}_{\text{ResultWrite}}$$
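A minimal sketch of the index cost, assuming it is the sum of the four labeled terms (tree traversal once per query, then leaf traversal, data traversal and result writing proportional to the total selectivity). The sort step is left out here, and the function name and example numbers are hypothetical.

```python
import math

def index_cost(N, ts, its, BW, b, C_M, sels):
    """Cost of answering len(sels) concurrent queries through a secondary index.

    N: rows, ts: tuple size, its: index tuple size, BW: memory bandwidth,
    b: branching factor (fanout), C_M: cache miss latency,
    sels: per-query selectivities (fractions in [0, 1]).
    """
    q = len(sels)
    S_q = sum(sels)                              # total query selectivity
    tree_traversal = q * math.log(N, b) * C_M    # one root-to-leaf descent per query
    leaves_traversal = S_q * N / b * C_M         # one cache miss per visited leaf
    data_traversal = S_q * N * its / BW          # read qualifying index entries
    result_write = S_q * N * ts / BW             # write out qualifying tuples
    return tree_traversal + leaves_traversal + data_traversal + result_write

# Example with round numbers: log_16(4096) = 3 levels, two queries at 25% each.
example_index_cost = index_cost(N=4096, ts=4, its=2, BW=4096, b=16,
                                C_M=1.0, sels=[0.25, 0.25])
```

In this example the terms evaluate to 6 + 128 + 1 + 2, so the leaf traversal dominates; every term except the tree descent scales with the total selectivity.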
22 New access path selection
$$\text{APS ratio} = \frac{q \cdot \text{TreeTraverse} + S_q \cdot \text{LeavesTraverse} + S_q \cdot \text{ResultWrite}}{\max(\text{ScanData},\ \text{CPU cost of } q \text{ Predicates}) + S_q \cdot \text{ResultWrite}}$$
Numerator: index (depends on index design). Denominator: scan (depends on data size and hardware). ResultWrite depends on the data size of the result.
Dynamic parameter: the number of concurrent read queries $q$.
$S_q$: sum of the query selectivities of the q queries, $S_q = \sum_{i=1}^{q} sel_i$.
23 New access path selection
$$\text{APS ratio} = \frac{q \cdot \log_b(N) \cdot C_M + S_q \cdot \left(\frac{N}{b} \cdot C_M + \frac{N \cdot its}{BW}\right) + S_q \cdot \frac{N \cdot ts}{BW}}{\max\left(\frac{N \cdot ts}{BW},\ q \cdot p \cdot N\right) + S_q \cdot \frac{N \cdot ts}{BW}}$$
Parameters: $N$ rows, $ts$ tuple size, $BW$ memory bandwidth, $sel_i$ query selectivity, $q$ queries, $p$ predicate evaluation cost (CPU), $b$ branching factor (fanout), $C_M$ cache miss latency, $its$ index tuple size.
Dynamic parameter: the number of concurrent read queries $q$, with $S_q = \sum_{i=1}^{q} sel_i$.
24 New access path selection
The APS ratio of slide 23, with its parameters annotated by what they capture: data size, index design, hardware characteristics, data layout.
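The decision rule (ratio above 1: scan, below 1: index) can be sketched end to end in Python. This is a sketch under assumptions: the ratio is taken as the four-term index cost over the shared-scan cost, a ratio of exactly 1 is arbitrarily resolved to "scan", and all names and numbers are hypothetical.

```python
import math

def aps_ratio(N, ts, its, BW, b, C_M, p, sels):
    """Index cost divided by shared-scan cost for a batch of concurrent queries."""
    q = len(sels)
    S_q = sum(sels)                                   # total query selectivity
    result_write = S_q * N * ts / BW                  # paid by both access paths
    index = (q * math.log(N, b) * C_M                 # tree traversal per query
             + S_q * (N / b * C_M + N * its / BW)     # leaves + data traversal
             + result_write)
    scan = max(N * ts / BW, q * p * N) + result_write
    return index / scan

def choose_access_path(ratio):
    """ratio > 1 favors the scan, ratio < 1 the index; the tie goes to the scan (assumption)."""
    return "scan" if ratio >= 1 else "index"

# Two queries selecting 25% each, expensive cache misses: the scan wins.
high_sel = aps_ratio(N=4096, ts=4, its=2, BW=4096, b=16, C_M=1.0, p=0.0,
                     sels=[0.25, 0.25])
# One query selecting 0.1%, cheap cache misses: the index wins.
low_sel = aps_ratio(N=4096, ts=4, its=2, BW=4096, b=16, C_M=1e-6, p=1e-9,
                    sels=[0.001])
choices = (choose_access_path(high_sel), choose_access_path(low_sel))
```

Because `q` and `S_q` are only known at run time, the ratio has to be re-evaluated per batch of queries rather than fixed once at design time.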
25 New access path selection: modeling [Figure: regions where the index and where the scan is preferable] Data size has a complex effect; concurrency matters!
26 New access path selection: experiments [Figure, two panels. Left: 8 queries, 8 cores; selectivity crossover between scan and index vs. relation size. Right: 100M tuples, 8 cores; selectivity crossover between scan and index vs. concurrent read queries.] Shared scans help up to a point.
27 New access path selection: experiments [Same two panels as slide 26.] Shared scans help up to a point.
28 New access path selection: experiments [Same two panels as slide 26, with the annotation "Two x 256".]
29 New access path selection [Figure, two panels: selectivity (from 0%) vs. concurrent queries; "Index is best" at low selectivity, "Scan is best" above it]
30 [Figure: timeline starting at "Dawn of time"; vertical axis marked 0%, 1%, 10%] Hardware improvements
31 [Same timeline] Hardware improvements; column stores
32 [Same timeline] Hardware improvements; column stores; main memory
33 [Same timeline, extended to "Future"] Hardware improvements; column stores; main memory; what-if questions
Access Path Selection in Main-Memory Optimized Data Systems: Should I Scan or Should I Probe? Michael S. Kester Manos Athanassoulis Stratos Idreos Harvard University {kester, manos, stratos}@seas.harvard.edu