DATABASES AND THE CLOUD. Gustavo Alonso Systems Group / ECC Dept. of Computer Science ETH Zürich, Switzerland
Transcription
1 DATABASES AND THE CLOUD Gustavo Alonso Systems Group / ECC Dept. of Computer Science ETH Zürich, Switzerland AVALOQ Conference Zürich June 2011
2 Systems Group Enterprise Computing Center
3 Some results of ECC
- Direct impact on products and releases: TA Models, Privacy (CreditSuisse); Stream Federator (SAP)
- Exploratory work and alternative designs: Global Dictionary, Global TAs (SAP)
- Design of new products: Crescando (Amadeus); Hadoop (Amadeus, CreditSuisse); Semantic Search (CreditSuisse)
- All projects based on: real products; customer requirements; real data, real loads
4 DATABASES: The cornerstone of enterprise architectures
5 Relational Databases
- No real alternative to relational database engines: transactional guarantees, recovery guarantees, existing investments
- Constant need for larger and larger deployments to cope with modern loads
6 Relational Databases today
- Peak load provisioning = license a huge database engine to cope with peak loads
- Scalability = relational engines scale, but only at an exponential cost in licenses, hardware, and administrative overhead
- Complexity = huge costs for tuning, maintenance and administration
- Price = exorbitant costs for the performance
7 CLOUD COMPUTING: Change in model
8 The key to cloud computing
- TODAY = secondary sector economy (manufacturing); focus is on production processes
- CLOUD = tertiary sector economy (services); focus is on business processes
- Subject to completely different forms of regulations and behavior expectations
9 AMADEUS: reaching for the clouds
10 Amadeus Workload
- Passenger Booking Database: ~600 GB of raw data (two years of bookings); single table, denormalized; ~50 attributes: flight no, name, date, ..., many flags
- Query workload: up to 4000 queries / second; latency guarantees: 2 seconds; today: only pre-canned queries allowed
- Update workload: avg. 600 updates per second (1 update per GB per sec); peak of updates per second; data freshness guarantee: 2 seconds
- Problems with state of the art: simple queries work only because of materialized views; multi-month project to implement a new query / process; complex queries do not work at all
11 Traditional engines break down
[Figure: MySQL query latency in msec (50th, 90th and 99th percentile) plotted against update load in updates/sec and against a synthetic workload parameter s]
- Performance depends on workload parameters
- Changes in load (updates, columns accessed) > huge variance
- Unpredictable performance, impossible to tune correctly
12 Many problems
Exhaustive benchmarking and analysis shows:
- Lack of scalability with number of cores
- Problems with I/O with number of cores
- Load interaction problems
- Unpredictable performance
- Increasingly expensive tuning
13 Amadeus requirements
- Predictable (= constant) performance: meet SLAs on latency and data freshness
- Affordable cost: compared to mainframe / current license
- Maintain consistency: monotonic reads (ACID not needed)
- Suitable for modern hardware: main memory, NUMA, large data centers
14 CRESCANDO: rethinking data processing
15 What is Crescando?
- A distributed (relational) table: main memory on NUMA; horizontally partitioned; distributed within and across machines
- Query / update interface:
  SELECT * FROM table WHERE <any predicate>
  UPDATE table SET <anything> WHERE <any predicate>
- Monotonic reads / writes (SI within a single partition)
16 Work unit = Clock Scan
[Diagram labels: queries and updates; build query index for next scan; read cursor; write cursor; data in circular buffer (wide table)]
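The Clock Scan idea on this slide can be illustrated with a small sketch. It is not the Crescando implementation: the names `Query`, `Update` and `clock_scan_cycle` are hypothetical, the circular buffer is reduced to one pass over a Python list, and the separate write and read cursors are collapsed into "apply updates first, then evaluate queries" on each record. It only shows how a whole batch of queries and updates can share a single scan over the data.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, object]


@dataclass
class Query:
    # Hypothetical stand-in for "SELECT * FROM table WHERE <any predicate>"
    predicate: Callable[[Record], bool]
    results: List[Record] = field(default_factory=list)


@dataclass
class Update:
    # Hypothetical stand-in for "UPDATE table SET <anything> WHERE <any predicate>"
    predicate: Callable[[Record], bool]
    changes: Record


def clock_scan_cycle(table: List[Record], queries: List[Query], updates: List[Update]) -> None:
    """One full rotation over the table, shared by a whole batch of operations."""
    for record in table:
        # "Write cursor": apply the batch of updates in arrival order.
        for upd in updates:
            if upd.predicate(record):
                record.update(upd.changes)
        # "Read cursor": every query sees the record after the updates.
        for qry in queries:
            if qry.predicate(record):
                qry.results.append(dict(record))


# Made-up sample data for illustration only.
bookings = [
    {"flight": "LX318", "name": "Meyer", "cancelled": False},
    {"flight": "LX318", "name": "Keller", "cancelled": False},
    {"flight": "IB3001", "name": "Garcia", "cancelled": False},
]
cancel_keller = Update(lambda r: r["name"] == "Keller", {"cancelled": True})
active_on_lx318 = Query(lambda r: r["flight"] == "LX318" and not r["cancelled"])

clock_scan_cycle(bookings, [active_on_lx318], [cancel_keller])
print([r["name"] for r in active_on_lx318.results])  # ['Meyer']
```

Because every operation in the batch rides on the same pass, the cost of a cycle depends mainly on the table size, which is one way to read the "constant performance" claim made later in the talk.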
17 Crescando on 1 Machine (N Cores)
[Diagram: input queue (operations) > split > scan threads > merge > output queue (result tuples)]
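A rough sketch of the split / scan / merge pipeline on one machine follows. It makes several assumptions: the function names are hypothetical, records are assigned to partitions round-robin, and a Python thread pool stands in for one scan thread per core (Python threads do not give real per-core parallelism, so this shows the data flow only, not the performance behavior).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Record = Dict[str, object]
Predicate = Callable[[Record], bool]


def partition_round_robin(records: List[Record], n: int) -> List[List[Record]]:
    """Horizontally partition the table into n disjoint parts."""
    parts: List[List[Record]] = [[] for _ in range(n)]
    for i, rec in enumerate(records):
        parts[i % n].append(rec)
    return parts


def scan_partition(part: List[Record], queries: Dict[int, Predicate]) -> Dict[int, List[Record]]:
    """One scan worker: evaluate the whole query batch on its own partition."""
    out: Dict[int, List[Record]] = {qid: [] for qid in queries}
    for rec in part:
        for qid, pred in queries.items():
            if pred(rec):
                out[qid].append(rec)
    return out


def run_batch(parts: List[List[Record]], queries: Dict[int, Predicate]) -> Dict[int, List[Record]]:
    # Split: every worker receives the full batch of operations.
    with ThreadPoolExecutor(max_workers=len(parts)) as pool:
        partials = list(pool.map(lambda p: scan_partition(p, queries), parts))
    # Merge: concatenate the per-partition result tuples for each query.
    merged: Dict[int, List[Record]] = {qid: [] for qid in queries}
    for partial in partials:
        for qid, rows in partial.items():
            merged[qid].extend(rows)
    return merged


# Made-up sample data for illustration only.
records = [{"id": i, "dest": "ZRH" if i % 2 == 0 else "MAD"} for i in range(10)]
parts = partition_round_robin(records, 4)
merged = run_batch(parts, {1: lambda r: r["dest"] == "ZRH"})
print(sorted(r["id"] for r in merged[1]))  # [0, 2, 4, 6, 8]
```

The workers never touch each other's partitions, so no locking is needed between them; only the final merge combines their outputs.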
18 Crescando in a Data Center (N Machines)
19 Implementation Details
- Optimization: decide for batch of queries which indexes to build; runs once every second (must be fast)
- Query + update indexes: different indexes for different kinds of predicates, e.g., hash tables, R-trees, tries, ...; must fit in L2 cache (better L1 cache)
- Probe indexes: updates in right order, queries in any order
- Persistence & recovery: log updates / inserts to disk (not a bottleneck)
20 What is different?
- No threads (work unit = core)
- No synchronization across work units
- Data partitioned across work units
- Work units across cores and machines
- No indexes on data, no materialized views
- Constant performance
- Performance determined by design
- Dynamic scalability / elasticity
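The "query index" idea, indexing the pending queries rather than the data, can be sketched for the simplest case of equality predicates on one attribute. This is an illustrative assumption, not the actual Crescando code: `build_query_index` and `scan_with_query_index` are hypothetical names, and only a hash table is shown, not R-trees or tries.

```python
from collections import defaultdict
from typing import Dict, Hashable, List

Record = Dict[str, Hashable]


def build_query_index(eq_queries: Dict[int, Hashable]) -> Dict[Hashable, List[int]]:
    """Map each constant in 'attribute = constant' to the ids of the queries asking for it."""
    index: Dict[Hashable, List[int]] = defaultdict(list)
    for qid, value in eq_queries.items():
        index[value].append(qid)
    return index


def scan_with_query_index(
    table: List[Record], attribute: str, index: Dict[Hashable, List[int]]
) -> Dict[int, List[Record]]:
    """Probe the query index once per record instead of testing every query."""
    results: Dict[int, List[Record]] = defaultdict(list)
    for record in table:
        # .get() avoids creating empty entries for values nobody asked about.
        for qid in index.get(record[attribute], ()):
            results[qid].append(record)
    return results


# Made-up sample data for illustration only.
table = [
    {"flight": "LX318", "name": "Meyer"},
    {"flight": "IB3001", "name": "Garcia"},
    {"flight": "LX318", "name": "Keller"},
]
# Queries 1 and 3 both ask for flight = 'LX318'; query 2 asks for flight = 'IB3001'.
index = build_query_index({1: "LX318", 2: "IB3001", 3: "LX318"})
results = scan_with_query_index(table, "flight", index)
print([r["name"] for r in results[1]])  # ['Meyer', 'Keller']
```

With this layout the work per record is one hash probe plus the matching queries, rather than one predicate test per query in the batch. The index covers only the current batch, which is why it can be small enough to aim for the cache sizes mentioned on the slide.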
21 Linear scalability on modern hardware
22 Status Going live summer 2011
23 The road ahead
- Many exciting projects: SharedDB, SwissBox, Cloud computing, Barrelfish, FPGA data processing
- Solid systems research with real impact
24 Systems Group Enterprise Computing Center