Jignesh M. Patel. Blog:
|
|
- Cody Wade
- 5 years ago
- Views:
Transcription
1 Jignesh M. Patel Blog:
2 Go back to the design Query Cache from Processing for Conscious 98s Modern (at Algorithms Hardware least for Hash Joins) Processor Processor Caches+TLB Processor Caches+TLB +Main Caches Memory +Main Memory +Multicore 23, Jignesh M. Patel, University of Wisconsin 2
3 Cycles / output tuple Probe Build ParBBon 5 4 Instruction path length is 58% of Radix 3.3x 3 more cache misses 7x more TLB misses on load 47% more cache 2 than Radix misses than Radix 24% faster 5% faster than WiscJ than Radix Radix WiscJ Radix WiscJ Intel Xeon Sun T2 22, Jignesh M. Patel, University of Wisconsin 3
4 Skew in partitioning- based hash join algorithms causes partition size skew à work imbalance Cycles / output tuple Probe Build Partition 2x faster than Radix Non- partitioned (Wisconsin) hash join improves with higher skew! 3.75x faster than Radix 6 cycles total Radix WiscJ Radix WiscJ Intel Xeon Sun T2 22, Jignesh M. Patel, University of Wisconsin 4
5 Hash join algorithm started simple, and with each architectural turn, it adapted. We have come full circle: The simple hash join is now very competitive. And, in many cases more efficient than the more complex methods! 22, Jignesh M. Patel, University of Wisconsin 5
6 π (project) σ (select) γ (aggregate) Fast & Data Scalable Relational- like τ U (sort) (join) (union) Data Processing Processing Primitives Kernels δ (anti- join) (bag- >set) (minus) 23, Jignesh M. Patel, University of Wisconsin 6
7 Want CPU caches Multicores, multi- socket, heterogeneous cores Constraint DRAM High Performance NVRAM (e.g. SSDs) Lower- powered, lower latency, higher bandwidth, persistent stores Low Cost Power Magnetic Hard Disk Drives 23, Jignesh M. Patel, University of Wisconsin 7
8 Goal Run data hardware speeds Short- term the speed of hardware today Long- term Hardware- software co- design for data kernels 24, Jignesh M. Patel, University of Wisconsin 9
9 Li and Patel, SIGMOD 3 What? Scan a column of a table applying some predicate Why? A key primitive in database The critical kernel in main memory analytic systems How? Conserve memory bandwidth: BitWeaving the data Use every bit of data that is brought to the processer efficiently using intra- cycle parallelism 24, Jignesh M. Patel, University of Wisconsin
10 Li and Patel, SIGMOD 3 Traditional Row Store shipdate discount quantity Mar % 5 Jan % 4 Apr % 3 May % 6 Feb % One big file shipdate Mar Jan Apr May Feb File: Column Store discount 5% 2% % % 5% File: n- quantity File: n 6 bits Order- preserving compression Column Codes: 3 bits , Jignesh M. Patel, University of Wisconsin
11 BitWeaving/H BitWeaving/V Code First batch of Processor Words (batch size = code size in bits) Code Code 5 Code 9 Code 3 Code 7 Code 2 Code 25 Code 29 Code 2 Code 6 Code Code 4 Code 8 Code 22 Code 26 Code 3 Code 3 Code 7 Code Code 5 Code 9 Code 23 Code 27 Code 3 Code 4 Code 8 Code 2 Code 6 Code 2 Code 24 Code 28 Code 32 Next batch of processor words First batch of Processor Words (batch size = code size in bits) 24, Jignesh M. Patel, University of Wisconsin Next batch of processor words 2
12 SELECT SUM(l_discount * l_price) FROM lineitem WHERE l_shipdate BETWEEN Date AND Date + year AND l_discount BETWEEN Discount. AND Discount +. AND l_quantity < Quantity BitWeaving/H l_price l_discount Aggregation RID List: 9, 5 Result bit vector Result bit vector AND Result bit vector BitWeaving/H Result bit vector AND Result bit vector l_quantity l_shipdate l_discount BitWeaving/V BitWeaving/V BitWeaving/H 24, Jignesh M. Patel, University of Wisconsin 3
13 Word Column Codes: The first (most significant) bits of 8 consecutive codes Word 2 The second bits of 8 consecutive codes Word 3 The third bits of 8 consecutive codes Word 4 The last (least significant) bits of 8 consecutive codes 24, Jignesh M. Patel, University of Wisconsin 4
14 Column Codes: Constant Predicate a < 5??????? Result Bit Vector Early Pruning: terminate the predicate evaluation on a segment, when all results have been determined. 24, Jignesh M. Patel, University of Wisconsin 5
15 Early pruning probability: 96% at bit position 4 Early Pruning Probability P(b) % 8% 6% 4% 2% % Segment size: 64 codes, code size: 32 bits Early pruning probability: 98% at bit position 8 Fill factor:% Fill factor: % Fill factor: % This cut- off mechanism allows for efficient evaluation of conjunction/disjunction Bit Posi8on b predicates 24, Jignesh M. Patel, University of Wisconsin 6
16 Segment Code size (3 + bits) c c2 c3 c4 c5 c6 c7 c8 c9 c c c2 c3 c4 c5 c6 c7 c8 c9 Segment word Predicate evaluation is done on the 4 codes in parallel c c5 c9 c3 < 5? Memory space word 2 word 3 word 4 c2 c6 c c4 c3 c7 c c5 c4 c8 c2 c6 Word size (6 bits) 24, Jignesh M. Patel, University of Wisconsin 7
17 Uses only 3 instructions! Without the delimiter, we would need ~2 instructions c= c5=7 c9=6 c3=2 X = ( c c 5 c 9 c ) 3 Y = 5555 ( ) ( Y + (X M) ) M 2 M = M2 = Works for arbitrary code sizes & word sizes! 24, Jignesh M. Patel, University of Wisconsin 8
18 Segment Code size (3 + bits) c c2 c3 c4 c5 c6 c7 c8 c9 c c c2 c3 c4 c5 c6 c7 c8 c9 Memory space Segment word word 2 word 3 word 4 c c5 c9 c3 c2 c6 c c4 c3 c7 c c5 c4 c8 c2 c6 < 5? < 5? < 5? < 5? >> >> >>2 >>3 c c2 c3 c4 Word size (6 bits) Result bit vector computed efficiently with this layout! 24, Jignesh M. Patel, University of Wisconsin 9
19 SYSTEM Intel Xeon X bits ALU 28 bits SIMD 2MB L3 Cache 24GB memory Single threaded execution WORKLOAD. Synthetic SELECT COUNT(*) FROM R WHERE R.a < C billion tuples Uniform distribution Selectivity: % 2. TPC- SF= scan only with materialized join results 24, Jignesh M. Patel, University of Wisconsin 2
20 8 Cycles / code 6 4 Naive Size of code (# bits) 24, Jignesh M. Patel, University of Wisconsin 2
21 SIMD Paper: T. Willhalm, N. Popovici, Y. Boshmaf, H. Plattner, A. Zeier, and J. Schaffner. SIMD- Scan: Ultra fast in- memory table scan using on- chip vector processing units. PVLDB 9 8 2X: SIMD parallelism Cycles / code Naive SIMD Size of code (# bits) 24, Jignesh M. Patel, University of Wisconsin 22
22 Blink Paper: R. Johnson, V. Raman, R. Sidle, and G. Swart. Row- wise parallel predicate evaluation. VLDB 8 8 Cycles / code Naive SIMD BL Size of code (# bits) 24, Jignesh M. Patel, University of Wisconsin 23
23 Cycles / code X- 4X speedup over BL: ) Use the extra (delimiter) bit 2) Easy to produce the result bit vector with the HBP layout Naive SIMD BL 2 BitWeaving/H Size of code (# bits) 24, Jignesh M. Patel, University of Wisconsin 24
24 8 Cycles / code Naive SIMD BL BitWeaving/H BitWeaving/V Size of code (# bits) Many more experiments in the paper 2X speedup: Early pruning 24, Jignesh M. Patel, University of Wisconsin 25
25 Customer Product Buy cid cname gender address Andy M Main st. 2 Kate F 2 th blvd. 3 Bob M 3 5 th ave. pid pname Milk 2 Coffee 3 Tea cid pid status 2 S 2 2 F 3 3 S 2 S cid cname gender address pid pname status Andy M Main st. 2 Coffee S 2 Kate F 2 th blvd. 2 Coffee F 3 Bob M 3 5 th ave. 3 Tea S Andy M Main st. 2 Coffee S NULL NULL NULL NULL Milk NULL WideTable 24, Jignesh M. Patel, University of Wisconsin
26 WideTable Denormalization Column-store Packed Scan Dictionary encoding Now we can run analytical workloads (e.g. TPC- H) using simple BitWeaved scans 24, Jignesh M. Patel, University of Wisconsin
27 Data Schema Graph Query Quickstep/WT Query Transformer Denormalizer Schema transformer Flatten using Bit- Weaved Scans {WideTables} Results Non- transformable query sent to the source system 24, Jignesh M. Patel, University of Wisconsin 28
28 Customer Customer Nation Region Buy Nation Region Buy Product Schema Graph Product Nation Schema Tree Region WideTable = (Region Nation Customer) (Region Nation Product Buy) SMW = {WideTables } e.g. for TPC- H, SMW={lineItemWT, orderswt, partsuppwt, customerwt} 24, Jignesh M. Patel, University of Wisconsin 29
29 TPC- H Queries Joins Nested Queries Non- FK joins WideTable Q, Q6 LineitemWT Q3, Q5, Q7- Q, Q2, Q4, Q9 LineitemWT Q4, Q5, Q7, Q8, Q2 LineitemWT Q Q2, Q, Q6 PartsuppWT Q3 OrdersWT Q22 OrdersWT 24, Jignesh M. Patel, University of Wisconsin 3
30 SYSTEM Intel Xeon E GHz 2 cores / 24 threads 5MB L3 Cache 32G, 6MHz DDR3 BENCHMARK SF: (~GB) SMW = lineitemwt orderswt partsuppwt customerwt dictionaries filter columns TOTAL 5.4 GB.7 GB.2 GB.5 GB.8 GB.3 GB 8.5GB 24, Jignesh M. Patel, University of Wisconsin 3
31 WideTable over X faster than MonetDB for about half of the 2 queries Speedup over MonetDB to Q Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q Q Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q2 Q22 24, Jignesh M. Patel, University of Wisconsin 32
32 Speedup over MonetDB WideTable scales better to Q Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q Q Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q2 Q22 24, Jignesh M. Patel, University of Wisconsin 33
33 CPU caches Multicores, multi- socket, heterogeneous cores DRAM NVRAM (e.g. SSDs) Lower- powered, lower latency, higher bandwidth, persistent stores Magnetic Hard Disk Drives 23, Jignesh M. Patel, University of Wisconsin 34
34 22, Jignesh M. Patel, University of Wisconsin 35
35 Long Term: Raw computing + Server Box Processor and storage costs Memory Bus tends to zero! Memory The cost is in I/O Bus moving data and Spinning Disk powering the Modern Server Box circuits/devices For many decades 23, Jignesh M. Patel, University of Wisconsin Today Processor Memory Bus Memory I/O Bus Spinning Disk + 36
36 Do et al., SIGMOD 3 Total Energy (KJ), log scale Inside TPC- H the Flash Q6, SSD SF SRAM Host Interface Controller Embedded Processor Smart SSD DRAM Controller SAS SSD Bus Controller Bus Controller Bus Controller.5X Data storage Elapsed Time (seconds), log scale DRAM F F F HDD F F F.8X Bandwidth Relative to the I/O Interface Speed 27 Internal SSD I/O Interface 29 2 Year There Besides are similar data storage ways of using it has hardware creatively, Can we gain efficiency (performance and e.g. High IDISKs, I/O bandwidth ASICs, CGRA, inside FPGAs, the flash or chips GPUs. energy) Embedded by pushing computing big power data processing Basically, General- purpose primitives need hardware into memory the and Smart software SSD? synergy! 23, Jignesh M. Patel, University of Wisconsin 37
37 X, GPU X, CPU X, CPU X, CPU X, CPU X, CPU X, CPU X, GPU Scan Scattered Read/Write Sequential read kernel Index access kernel 23, Jignesh M. Patel, University of Wisconsin 38
38 Transformative architectural changes at all levels (CPU, memory subsystem, I/O subsystem) is underway Need to rethink data processing kernels current bare metal speed Need to think of hardware software co- design Big Data Hardware Big Data Software 23, Jignesh M. Patel, University of Wisconsin 39
BitWeaving: Fast Scans for Main Memory Data Processing
BitWeaving: Fast Scans for Main Memory Data Processing Yinan Li University of Wisconsin Madison yinan@cswiscedu Jignesh M Patel University of Wisconsin Madison jignesh@cswiscedu ABSTRACT This paper focuses
More informationBDCC: Exploiting Fine-Grained Persistent Memories for OLAP. Peter Boncz
BDCC: Exploiting Fine-Grained Persistent Memories for OLAP Peter Boncz NVRAM System integration: NVMe: block devices on the PCIe bus NVDIMM: persistent RAM, byte-level access Low latency Lower than Flash,
More informationIn-Memory Data Management
In-Memory Data Management Martin Faust Research Assistant Research Group of Prof. Hasso Plattner Hasso Plattner Institute for Software Engineering University of Potsdam Agenda 2 1. Changed Hardware 2.
More informationPerformance in the Multicore Era
Performance in the Multicore Era Gustavo Alonso Systems Group -- ETH Zurich, Switzerland Systems Group Enterprise Computing Center Performance in the multicore era 2 BACKGROUND - SWISSBOX SwissBox: An
More informationJoin Processing for Flash SSDs: Remembering Past Lessons
Join Processing for Flash SSDs: Remembering Past Lessons Jaeyoung Do, Jignesh M. Patel Department of Computer Sciences University of Wisconsin-Madison $/MB GB Flash Solid State Drives (SSDs) Benefits of
More informationHash Joins for Multi-core CPUs. Benjamin Wagner
Hash Joins for Multi-core CPUs Benjamin Wagner Joins fundamental operator in query processing variety of different algorithms many papers publishing different results main question: is tuning to modern
More informationcomplex plans and hybrid layouts
class 7 complex plans and hybrid layouts prof. Stratos Idreos HTTP://DASLAB.SEAS.HARVARD.EDU/CLASSES/CS165/ essential column-stores features virtual ids late tuple reconstruction (if ever) vectorized execution
More informationDBMS Data Loading: An Analysis on Modern Hardware. Adam Dziedzic, Manos Karpathiotakis*, Ioannis Alagiannis, Raja Appuswamy, Anastasia Ailamaki
DBMS Data Loading: An Analysis on Modern Hardware Adam Dziedzic, Manos Karpathiotakis*, Ioannis Alagiannis, Raja Appuswamy, Anastasia Ailamaki Data loading: A necessary evil Volume => Expensive 4 zettabytes
More informationMulti-threaded Queries. Intra-Query Parallelism in LLVM
Multi-threaded Queries Intra-Query Parallelism in LLVM Multithreaded Queries Intra-Query Parallelism in LLVM Yang Liu Tianqi Wu Hao Li Interpreted vs Compiled (LLVM) Interpreted vs Compiled (LLVM) Interpreted
More informationData Modeling and Databases Ch 9: Query Processing - Algorithms. Gustavo Alonso Systems Group Department of Computer Science ETH Zürich
Data Modeling and Databases Ch 9: Query Processing - Algorithms Gustavo Alonso Systems Group Department of Computer Science ETH Zürich Transactions (Locking, Logging) Metadata Mgmt (Schema, Stats) Application
More informationMidterm Review. March 27, 2017
Midterm Review March 27, 2017 1 Overview Relational Algebra & Query Evaluation Relational Algebra Rewrites Index Design / Selection Physical Layouts 2 Relational Algebra & Query Evaluation 3 Relational
More informationData Modeling and Databases Ch 10: Query Processing - Algorithms. Gustavo Alonso Systems Group Department of Computer Science ETH Zürich
Data Modeling and Databases Ch 10: Query Processing - Algorithms Gustavo Alonso Systems Group Department of Computer Science ETH Zürich Transactions (Locking, Logging) Metadata Mgmt (Schema, Stats) Application
More informationToward GPUs being mainstream in analytic processing
Toward GPUs being mainstream in analytic processing An initial argument using simple scan-aggregate queries Mark D. Hill markhill@cs.wisc.edu Jason Power powerjg@cs.wisc.edu Jignesh M. Patel jignesh@cs.wisc.edu
More informationCitation for published version (APA): Ydraios, E. (2010). Database cracking: towards auto-tunning database kernels
UvA-DARE (Digital Academic Repository) Database cracking: towards auto-tunning database kernels Ydraios, E. Link to publication Citation for published version (APA): Ydraios, E. (2010). Database cracking:
More informationCrescando: Predictable Performance for Unpredictable Workloads
Crescando: Predictable Performance for Unpredictable Workloads G. Alonso, D. Fauser, G. Giannikis, D. Kossmann, J. Meyer, P. Unterbrunner Amadeus S.A. ETH Zurich, Systems Group (Funded by Enterprise Computing
More informationWhen MPPDB Meets GPU:
When MPPDB Meets GPU: An Extendible Framework for Acceleration Laura Chen, Le Cai, Yongyan Wang Background: Heterogeneous Computing Hardware Trend stops growing with Moore s Law Fast development of GPU
More informationImplications of Emerging 3D GPU Architecture on the Scan Primitive
Implications of Emerging 3D GPU Architecture on the Scan Primitive Mark D. Hill markhill@cs.wisc.edu Jason Power powerjg@cs.wisc.edu Jignesh M. Patel jignesh@cs.wisc.edu Yinan Li yinan@cs.wisc.edu Department
More informationAnastasia Ailamaki. Performance and energy analysis using transactional workloads
Performance and energy analysis using transactional workloads Anastasia Ailamaki EPFL and RAW Labs SA students: Danica Porobic, Utku Sirin, and Pinar Tozun Online Transaction Processing $2B+ industry Characteristics:
More informationColumn Stores vs. Row Stores How Different Are They Really?
Column Stores vs. Row Stores How Different Are They Really? Daniel J. Abadi (Yale) Samuel R. Madden (MIT) Nabil Hachem (AvantGarde) Presented By : Kanika Nagpal OUTLINE Introduction Motivation Background
More informationHYRISE In-Memory Storage Engine
HYRISE In-Memory Storage Engine Martin Grund 1, Jens Krueger 1, Philippe Cudre-Mauroux 3, Samuel Madden 2 Alexander Zeier 1, Hasso Plattner 1 1 Hasso-Plattner-Institute, Germany 2 MIT CSAIL, USA 3 University
More informationA Comparison of Memory Usage and CPU Utilization in Column-Based Database Architecture vs. Row-Based Database Architecture
A Comparison of Memory Usage and CPU Utilization in Column-Based Database Architecture vs. Row-Based Database Architecture By Gaurav Sheoran 9-Dec-08 Abstract Most of the current enterprise data-warehouses
More informationIn-Memory Data Structures and Databases Jens Krueger
In-Memory Data Structures and Databases Jens Krueger Enterprise Platform and Integration Concepts Hasso Plattner Intitute What to take home from this talk? 2 Answer to the following questions: What makes
More informationNEC Express5800 A2040b 22TB Data Warehouse Fast Track. Reference Architecture with SW mirrored HGST FlashMAX III
NEC Express5800 A2040b 22TB Data Warehouse Fast Track Reference Architecture with SW mirrored HGST FlashMAX III Based on Microsoft SQL Server 2014 Data Warehouse Fast Track (DWFT) Reference Architecture
More informationMorsel- Drive Parallelism: A NUMA- Aware Query Evaluation Framework for the Many- Core Age. Presented by Dennis Grishin
Morsel- Drive Parallelism: A NUMA- Aware Query Evaluation Framework for the Many- Core Age Presented by Dennis Grishin What is the problem? Efficient computation requires distribution of processing between
More informationMain-Memory Databases 1 / 25
1 / 25 Motivation Hardware trends Huge main memory capacity with complex access characteristics (Caches, NUMA) Many-core CPUs SIMD support in CPUs New CPU features (HTM) Also: Graphic cards, FPGAs, low
More informationADMS/VLDB, August 27 th 2018, Rio de Janeiro, Brazil OPTIMIZING GROUP-BY AND AGGREGATION USING GPU-CPU CO-PROCESSING
ADMS/VLDB, August 27 th 2018, Rio de Janeiro, Brazil 1 OPTIMIZING GROUP-BY AND AGGREGATION USING GPU-CPU CO-PROCESSING OPTIMIZING GROUP-BY AND AGGREGATION USING GPU-CPU CO-PROCESSING MOTIVATION OPTIMIZING
More informationAccess Path Selection in Main-Memory Optimized Data Systems
Access Path Selection in Main-Memory Optimized Data Systems Should I Scan or Should I Probe? Manos Athanassoulis Harvard University Talk at CS265, February 16 th, 2018 1 Access Path Selection SELECT x
More informationOn-Disk Bitmap Index Performance in Bizgres 0.9
On-Disk Bitmap Index Performance in Bizgres 0.9 A Greenplum Whitepaper April 2, 2006 Author: Ayush Parashar Performance Engineering Lab Table of Contents 1.0 Summary...1 2.0 Introduction...1 3.0 Performance
More informationIngo Brenckmann Jochen Kirsten Storage Technology Strategists SAS EMEA Copyright 2003, SAS Institute Inc. All rights reserved.
Intelligent Storage Results from real life testing Ingo Brenckmann Jochen Kirsten Storage Technology Strategists SAS EMEA SAS Intelligent Storage components! OLAP Server! Scalable Performance Data Server!
More informationLenovo Database Configuration for Microsoft SQL Server TB
Database Lenovo Database Configuration for Microsoft SQL Server 2016 22TB Data Warehouse Fast Track Solution Data Warehouse problem and a solution The rapid growth of technology means that the amount of
More informationLenovo Database Configuration
Lenovo Database Configuration for Microsoft SQL Server Standard Edition DWFT 9TB Reduce time to value with pretested hardware configurations Data Warehouse problem and a solution The rapid growth of technology
More informationColumn-Stores vs. Row-Stores: How Different Are They Really?
Column-Stores vs. Row-Stores: How Different Are They Really? Daniel J. Abadi, Samuel Madden and Nabil Hachem SIGMOD 2008 Presented by: Souvik Pal Subhro Bhattacharyya Department of Computer Science Indian
More informationEvaluation of Relational Operations. SS Chung
Evaluation of Relational Operations SS Chung Cost Metric Query Processing Cost = Disk I/O Cost + CPU Computation Cost Disk I/O Cost = Disk Access Time + Data Transfer Time Disk Acess Time = Seek Time +
More informationSTORING DATA: DISK AND FILES
STORING DATA: DISK AND FILES CS 564- Spring 2018 ACKs: Dan Suciu, Jignesh Patel, AnHai Doan WHAT IS THIS LECTURE ABOUT? How does a DBMS store data? disk, SSD, main memory The Buffer manager controls how
More informationRed Fox: An Execution Environment for Relational Query Processing on GPUs
Red Fox: An Execution Environment for Relational Query Processing on GPUs Haicheng Wu 1, Gregory Diamos 2, Tim Sheard 3, Molham Aref 4, Sean Baxter 2, Michael Garland 2, Sudhakar Yalamanchili 1 1. Georgia
More informationHardware Acceleration for Database Systems using Content Addressable Memories
Hardware Acceleration for Database Systems using Content Addressable Memories Nagender Bandi, Sam Schneider, Divyakant Agrawal, Amr El Abbadi University of California, Santa Barbara Overview The Memory
More informationKartik Lakhotia, Rajgopal Kannan, Viktor Prasanna USENIX ATC 18
Accelerating PageRank using Partition-Centric Processing Kartik Lakhotia, Rajgopal Kannan, Viktor Prasanna USENIX ATC 18 Outline Introduction Partition-centric Processing Methodology Analytical Evaluation
More informationDATABASES AND THE CLOUD. Gustavo Alonso Systems Group / ECC Dept. of Computer Science ETH Zürich, Switzerland
DATABASES AND THE CLOUD Gustavo Alonso Systems Group / ECC Dept. of Computer Science ETH Zürich, Switzerland AVALOQ Conference Zürich June 2011 Systems Group www.systems.ethz.ch Enterprise Computing Center
More informationToward timely, predictable and cost-effective data analytics. Renata Borovica-Gajić DIAS, EPFL
Toward timely, predictable and cost-effective data analytics Renata Borovica-Gajić DIAS, EPFL Big data proliferation Big data is when the current technology does not enable users to obtain timely, cost-effective,
More informationSingle-pass restore after a media failure. Caetano Sauer, Goetz Graefe, Theo Härder
Single-pass restore after a media failure Caetano Sauer, Goetz Graefe, Theo Härder 20% of drives fail after 4 years High failure rate on first year (factory defects) Expectation of 50% for 6 years https://www.backblaze.com/blog/how-long-do-disk-drives-last/
More informationAdaptive Parallel Compressed Event Matching
Adaptive Parallel Compressed Event Matching Mohammad Sadoghi 1,2 Hans-Arno Jacobsen 2 1 IBM T.J. Watson Research Center 2 Middleware Systems Research Group, University of Toronto April 2014 Mohammad Sadoghi
More informationBridging the Processor/Memory Performance Gap in Database Applications
Bridging the Processor/Memory Performance Gap in Database Applications Anastassia Ailamaki Carnegie Mellon http://www.cs.cmu.edu/~natassa Memory Hierarchies PROCESSOR EXECUTION PIPELINE L1 I-CACHE L1 D-CACHE
More informationColumn-Stores vs. Row-Stores. How Different are they Really? Arul Bharathi
Column-Stores vs. Row-Stores How Different are they Really? Arul Bharathi Authors Daniel J.Abadi Samuel R. Madden Nabil Hachem 2 Contents Introduction Row Oriented Execution Column Oriented Execution Column-Store
More informationBig Data Systems on Future Hardware. Bingsheng He NUS Computing
Big Data Systems on Future Hardware Bingsheng He NUS Computing http://www.comp.nus.edu.sg/~hebs/ 1 Outline Challenges for Big Data Systems Why Hardware Matters? Open Challenges Summary 2 3 ANYs in Big
More informationIntel Enterprise Processors Technology
Enterprise Processors Technology Kosuke Hirano Enterprise Platforms Group March 20, 2002 1 Agenda Architecture in Enterprise Xeon Processor MP Next Generation Itanium Processor Interconnect Technology
More informationPathfinder/MonetDB: A High-Performance Relational Runtime for XQuery
Introduction Problems & Solutions Join Recognition Experimental Results Introduction GK Spring Workshop Waldau: Pathfinder/MonetDB: A High-Performance Relational Runtime for XQuery Database & Information
More informationCS 564 Final Exam Fall 2015 Answers
CS 564 Final Exam Fall 015 Answers A: STORAGE AND INDEXING [0pts] I. [10pts] For the following questions, clearly circle True or False. 1. The cost of a file scan is essentially the same for a heap file
More informationArchitecture-Conscious Database Systems
Architecture-Conscious Database Systems 2009 VLDB Summer School Shanghai Peter Boncz (CWI) Sources Thank You! l l l l Database Architectures for New Hardware VLDB 2004 tutorial, Anastassia Ailamaki Query
More informationAccelerating Analytical Workloads
Accelerating Analytical Workloads Thomas Neumann Technische Universität München April 15, 2014 Scale Out in Big Data Analytics Big Data usually means data is distributed Scale out to process very large
More informationWeaving Relations for Cache Performance
Weaving Relations for Cache Performance Anastassia Ailamaki Carnegie Mellon Computer Platforms in 198 Execution PROCESSOR 1 cycles/instruction Data and Instructions cycles
More informationHadoop Beyond Batch: Real-time Workloads, SQL-on- Hadoop, and thevirtual EDW Headline Goes Here
Hadoop Beyond Batch: Real-time Workloads, SQL-on- Hadoop, and thevirtual EDW Headline Goes Here Marcel Kornacker marcel@cloudera.com Speaker Name or Subhead Goes Here 2013-11-12 Copyright 2013 Cloudera
More informationUsing Transparent Compression to Improve SSD-based I/O Caches
Using Transparent Compression to Improve SSD-based I/O Caches Thanos Makatos, Yannis Klonatos, Manolis Marazakis, Michail D. Flouris, and Angelos Bilas {mcatos,klonatos,maraz,flouris,bilas}@ics.forth.gr
More informationTrack Join. Distributed Joins with Minimal Network Traffic. Orestis Polychroniou! Rajkumar Sen! Kenneth A. Ross
Track Join Distributed Joins with Minimal Network Traffic Orestis Polychroniou Rajkumar Sen Kenneth A. Ross Local Joins Algorithms Hash Join Sort Merge Join Index Join Nested Loop Join Spilling to disk
More informationAlbis: High-Performance File Format for Big Data Systems
Albis: High-Performance File Format for Big Data Systems Animesh Trivedi, Patrick Stuedi, Jonas Pfefferle, Adrian Schuepbach, Bernard Metzler, IBM Research, Zurich 2018 USENIX Annual Technical Conference
More informationCIS 601 Graduate Seminar. Dr. Sunnie S. Chung Dhruv Patel ( ) Kalpesh Sharma ( )
Guide: CIS 601 Graduate Seminar Presented By: Dr. Sunnie S. Chung Dhruv Patel (2652790) Kalpesh Sharma (2660576) Introduction Background Parallel Data Warehouse (PDW) Hive MongoDB Client-side Shared SQL
More informationAnnouncements. Database Systems CSE 414. Why compute in parallel? Big Data 10/11/2017. Two Kinds of Parallel Data Processing
Announcements Database Systems CSE 414 HW4 is due tomorrow 11pm Lectures 18: Parallel Databases (Ch. 20.1) 1 2 Why compute in parallel? Multi-cores: Most processors have multiple cores This trend will
More informationAdaptive Concurrent Query Execution Framework for an Analytical In-Memory Database System - Supplemental Material
Adaptive Concurrent Query Execution Framework for an Analytical In-Memory Database System - Supplemental Material Harshad Deshmukh Hakan Memisoglu University of Wisconsin - Madison {harshad, memisoglu,
More informationHewlett Packard Enterprise HPE GEN10 PERSISTENT MEMORY PERFORMANCE THROUGH PERSISTENCE
Hewlett Packard Enterprise HPE GEN10 PERSISTENT MEMORY PERFORMANCE THROUGH PERSISTENCE Digital transformation is taking place in businesses of all sizes Big Data and Analytics Mobility Internet of Things
More informationRELATIONAL OPERATORS #1
RELATIONAL OPERATORS #1 CS 564- Spring 2018 ACKs: Jeff Naughton, Jignesh Patel, AnHai Doan WHAT IS THIS LECTURE ABOUT? Algorithms for relational operators: select project 2 ARCHITECTURE OF A DBMS query
More informationAdaptive Query Processing on Prefix Trees Wolfgang Lehner
Adaptive Query Processing on Prefix Trees Wolfgang Lehner Fachgruppentreffen, 22.11.2012 TU München Prof. Dr.-Ing. Wolfgang Lehner > Challenges for Database Systems Three things are important in the database
More informationWorkload Optimized Systems: The Wheel of Reincarnation. Michael Sporer, Netezza Appliance Hardware Architect 21 April 2013
Workload Optimized Systems: The Wheel of Reincarnation Michael Sporer, Netezza Appliance Hardware Architect 21 April 2013 Outline Definition Technology Minicomputers Prime Workstations Apollo Graphics
More informationAccelerating multi-column selection predicates in main-memory the Elf approach
2017 IEEE 33rd International Conference on Data Engineering Accelerating multi-column selection predicates in main-memory the Elf approach David Broneske University of Magdeburg david.broneske@ovgu.de
More informationCloud Computing with FPGA-based NVMe SSDs
Cloud Computing with FPGA-based NVMe SSDs Bharadwaj Pudipeddi, CTO NVXL Santa Clara, CA 1 Choice of NVMe Controllers ASIC NVMe: Fully off-loaded, consistent performance, M.2 or U.2 form factor ASIC OpenChannel:
More informationRed Fox: An Execution Environment for Relational Query Processing on GPUs
Red Fox: An Execution Environment for Relational Query Processing on GPUs Georgia Institute of Technology: Haicheng Wu, Ifrah Saeed, Sudhakar Yalamanchili LogicBlox Inc.: Daniel Zinn, Martin Bravenboer,
More informationColumnstore and B+ tree. Are Hybrid Physical. Designs Important?
Columnstore and B+ tree Are Hybrid Physical Designs Important? 1 B+ tree 2 C O L B+ tree 3 B+ tree & Columnstore on same table = Hybrid design 4? C O L C O L B+ tree B+ tree ? C O L C O L B+ tree B+ tree
More informationEvaluation of Intel Memory Drive Technology Performance for Scientific Applications
Evaluation of Intel Memory Drive Technology Performance for Scientific Applications Vladimir Mironov, Andrey Kudryavtsev, Yuri Alexeev, Alexander Moskovsky, Igor Kulikov, and Igor Chernykh Introducing
More informationand data combined) is equal to 7% of the number of instructions. Miss Rate with Second- Level Cache, Direct- Mapped Speed
5.3 By convention, a cache is named according to the amount of data it contains (i.e., a 4 KiB cache can hold 4 KiB of data); however, caches also require SRAM to store metadata such as tags and valid
More informationThis Unit: Putting It All Together. CIS 371 Computer Organization and Design. What is Computer Architecture? Sources
This Unit: Putting It All Together CIS 371 Computer Organization and Design Unit 15: Putting It All Together: Anatomy of the XBox 360 Game Console Application OS Compiler Firmware CPU I/O Memory Digital
More informationSmooth Scan: Statistics-Oblivious Access Paths. Renata Borovica-Gajic Stratos Idreos Anastasia Ailamaki Marcin Zukowski Campbell Fraser
Smooth Scan: Statistics-Oblivious Access Paths Renata Borovica-Gajic Stratos Idreos Anastasia Ailamaki Marcin Zukowski Campbell Fraser Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10 Q11 Q12 Q13 Q14 Q16 Q18 Q19 Q21 Q22
More informationHammer Slide: Work- and CPU-efficient Streaming Window Aggregation
Large-Scale Data & Systems Group Hammer Slide: Work- and CPU-efficient Streaming Window Aggregation Georgios Theodorakis, Alexandros Koliousis, Peter Pietzuch, Holger Pirk Large-Scale Data & Systems (LSDS)
More informationBeyond EXPLAIN. Query Optimization From Theory To Code. Yuto Hayamizu Ryoji Kawamichi. 2016/5/20 PGCon Ottawa
Beyond EXPLAIN Query Optimization From Theory To Code Yuto Hayamizu Ryoji Kawamichi 2016/5/20 PGCon 2016 @ Ottawa Historically Before Relational Querying was physical Need to understand physical organization
More informationActive Disks - Remote Execution
- Remote Execution for Network-Attached Storage Erik Riedel, University www.pdl.cs.cmu.edu/active UC-Berkeley Systems Seminar 8 October 1998 Outline Network-Attached Storage Opportunity Applications Performance
More informationColumn Scan Acceleration in Hybrid CPU-FPGA Systems
Column Scan Acceleration in Hybrid CPU-FPGA Systems ABSTRACT Nusrat Jahan Lisa, Annett Ungethüm, Dirk Habich, Wolfgang Lehner Technische Universität Dresden Database Systems Group Dresden, Germany {firstname.lastname}@tu-dresden.de
More informationApril Copyright 2013 Cloudera Inc. All rights reserved.
Hadoop Beyond Batch: Real-time Workloads, SQL-on- Hadoop, and the Virtual EDW Headline Goes Here Marcel Kornacker marcel@cloudera.com Speaker Name or Subhead Goes Here April 2014 Analytic Workloads on
More informationInstant Recovery for Main-Memory Databases
Instant Recovery for Main-Memory Databases Ismail Oukid*, Wolfgang Lehner*, Thomas Kissinger*, Peter Bumbulis, and Thomas Willhalm + *TU Dresden SAP SE + Intel GmbH CIDR 2015, Asilomar, California, USA,
More informationData Processing on Emerging Hardware
Data Processing on Emerging Hardware Gustavo Alonso Systems Group Department of Computer Science ETH Zurich, Switzerland 3 rd International Summer School on Big Data, Munich, Germany, 2017 www.systems.ethz.ch
More informationIBM s Data Warehouse Appliance Offerings
IBM s Data Warehouse Appliance Offerings RChaitanya IBM India Software Labs Agenda 1 IBM Smart Analytics System (D5600) System Overview Technical Architecture Software / Hardware stack details 2 Netezza
More informationIn-Memory Data Management Jens Krueger
In-Memory Data Management Jens Krueger Enterprise Platform and Integration Concepts Hasso Plattner Intitute OLTP vs. OLAP 2 Online Transaction Processing (OLTP) Organized in rows Online Analytical Processing
More informationOverview of Data Exploration Techniques. Stratos Idreos, Olga Papaemmanouil, Surajit Chaudhuri
Overview of Data Exploration Techniques Stratos Idreos, Olga Papaemmanouil, Surajit Chaudhuri data exploration not always sure what we are looking for (until we find it) data has always been big volume
More informationDatenbanksysteme II: Modern Hardware. Stefan Sprenger November 23, 2016
Datenbanksysteme II: Modern Hardware Stefan Sprenger November 23, 2016 Content of this Lecture Introduction to Modern Hardware CPUs, Cache Hierarchy Branch Prediction SIMD NUMA Cache-Sensitive Skip List
More informationAdvanced Database Systems
Lecture IV Query Processing Kyumars Sheykh Esmaili Basic Steps in Query Processing 2 Query Optimization Many equivalent execution plans Choosing the best one Based on Heuristics, Cost Will be discussed
More informationQuery Processing. Introduction to Databases CompSci 316 Fall 2017
Query Processing Introduction to Databases CompSci 316 Fall 2017 2 Announcements (Tue., Nov. 14) Homework #3 sample solution posted in Sakai Homework #4 assigned today; due on 12/05 Project milestone #2
More informationCascade Mapping: Optimizing Memory Efficiency for Flash-based Key-value Caching
Cascade Mapping: Optimizing Memory Efficiency for Flash-based Key-value Caching Kefei Wang and Feng Chen Louisiana State University SoCC '18 Carlsbad, CA Key-value Systems in Internet Services Key-value
More informationColumn-Oriented Database Systems. Liliya Rudko University of Helsinki
Column-Oriented Database Systems Liliya Rudko University of Helsinki 2 Contents 1. Introduction 2. Storage engines 2.1 Evolutionary Column-Oriented Storage (ECOS) 2.2 HYRISE 3. Database management systems
More informationIntroduction to Database Systems CSE 414. Lecture 16: Query Evaluation
Introduction to Database Systems CSE 414 Lecture 16: Query Evaluation CSE 414 - Spring 2018 1 Announcements HW5 + WQ5 due tomorrow Midterm this Friday in class! Review session this Wednesday evening See
More informationCSE 562 Final Exam Solutions
CSE 562 Final Exam Solutions May 12, 2014 Question Points Possible Points Earned A.1 7 A.2 7 A.3 6 B.1 10 B.2 10 C.1 10 C.2 10 D.1 10 D.2 10 E 20 Bonus 5 Total 105 CSE 562 Final Exam 2014 Relational Algebra
More informationWeaving Relations for Cache Performance
Weaving Relations for Cache Performance Anastassia Ailamaki Carnegie Mellon David DeWitt, Mark Hill, and Marios Skounakis University of Wisconsin-Madison Memory Hierarchies PROCESSOR EXECUTION PIPELINE
More informationSort vs. Hash Join Revisited for Near-Memory Execution. Nooshin Mirzadeh, Onur Kocberber, Babak Falsafi, Boris Grot
Sort vs. Hash Join Revisited for Near-Memory Execution Nooshin Mirzadeh, Onur Kocberber, Babak Falsafi, Boris Grot 1 Near-Memory Processing (NMP) Emerging technology Stacked memory: A logic die w/ a stack
More informationWeaving Relations for Cache Performance
VLDB 2001, Rome, Italy Best Paper Award Weaving Relations for Cache Performance Anastassia Ailamaki David J. DeWitt Mark D. Hill Marios Skounakis Presented by: Ippokratis Pandis Bottleneck in DBMSs Processor
More informationCrazy little thing called hardware GUSTAVO ALONSO SYSTEMS GROUP DEPT. OF COMPUTER SCIENCE ETH ZURICH
Crazy little thing called hardware GUSTAVO ALONSO SYSTEMS GROUP DEPT. OF COMPUTER SCIENCE ETH ZURICH HTDC 2014 Systems Group = www.systems.ethz.ch Enterprise Computing Center = www.ecc.ethz.ch Hardware
More informationCSE 344 MAY 2 ND MAP/REDUCE
CSE 344 MAY 2 ND MAP/REDUCE ADMINISTRIVIA HW5 Due Tonight Practice midterm Section tomorrow Exam review PERFORMANCE METRICS FOR PARALLEL DBMSS Nodes = processors, computers Speedup: More nodes, same data
More informationSDA: Software-Defined Accelerator for general-purpose big data analysis system
SDA: Software-Defined Accelerator for general-purpose big data analysis system Jian Ouyang(ouyangjian@baidu.com), Wei Qi, Yong Wang, Yichen Tu, Jing Wang, Bowen Jia Baidu is beyond a search engine Search
More informationMemTest: A Novel Benchmark for In-memory Database
MemTest: A Novel Benchmark for In-memory Database Qiangqiang Kang, Cheqing Jin, Zhao Zhang, Aoying Zhou Institute for Data Science and Engineering, East China Normal University, Shanghai, China 1 Outline
More informationA Study of Data Partitioning on OpenCL-based FPGAs. Zeke Wang (NTU Singapore), Bingsheng He (NTU Singapore), Wei Zhang (HKUST)
A Study of Data Partitioning on OpenC-based FPGAs Zeke Wang (NTU Singapore), Bingsheng He (NTU Singapore), Wei Zhang (HKUST) 1 Outline Background and Motivations Data Partitioning on FPGA OpenC on FPGA
More informationCSE 190D Spring 2017 Final Exam Answers
CSE 190D Spring 2017 Final Exam Answers Q 1. [20pts] For the following questions, clearly circle True or False. 1. The hash join algorithm always has fewer page I/Os compared to the block nested loop join
More informationOncilla - a Managed GAS Runtime for Accelerating Data Warehousing Queries
Oncilla - a Managed GAS Runtime for Accelerating Data Warehousing Queries Jeffrey Young, Alex Merritt, Se Hoon Shon Advisor: Sudhakar Yalamanchili 4/16/13 Sponsors: Intel, NVIDIA, NSF 2 The Problem Big
More informationSoftFlash: Programmable Storage in Future Data Centers Jae Do Researcher, Microsoft Research
SoftFlash: Programmable Storage in Future Data Centers Jae Do Researcher, Microsoft Research 1 The world s most valuable resource Data is everywhere! May. 2017 Values from Data! Need infrastructures for
More informationComposite Group-Keys
Composite Group-Keys Space-efficient Indexing of Multiple Columns for Compressed In-Memory Column Stores Martin Faust, David Schwalb, and Hasso Plattner Hasso Plattner Institute for IT Systems Engineering
More informationRealizing the Next Generation of Exabyte-scale Persistent Memory-Centric Architectures and Memory Fabrics
Realizing the Next Generation of Exabyte-scale Persistent Memory-Centric Architectures and Memory Fabrics Zvonimir Z. Bandic, Sr. Director, Next Generation Platform Technologies Western Digital Corporation
More informationHPE ProLiant DL580 Gen10 and Ultrastar SS300 SSD 195TB Microsoft SQL Server Data Warehouse Fast Track Reference Architecture
WHITE PAPER HPE ProLiant DL580 Gen10 and Ultrastar SS300 SSD 195TB Microsoft SQL Server Data Warehouse Fast Track Reference Architecture BASED ON MICROSOFT SQL SERVER 2017 DATA WAREHOUSE FAST TRACK REFERENCE
More information