Time-Series Data in MongoDB on a Budget. Peter Schwaller, Senior Director Server Engineering, Percona. Santa Clara, California, April 23rd-25th, 2018
Transcription
1 Time-Series Data in MongoDB on a Budget. Peter Schwaller, Senior Director Server Engineering, Percona. Santa Clara, California, April 23rd-25th, 2018
2 TIME SERIES DATA in MongoDB on a Budget
3 What is Time-Series Data?
Characteristics:
- Arriving data is stored as a new value, as opposed to overwriting existing values
- Usually arrives in time order
- Accumulated data size grows over time
- Time is the primary means of organizing/accessing the data
4 Time Series Data in MONGODB on a Budget
5 Why MongoDB?
- General-purpose database
- Specialized time-series DBs do exist
- Do not use the mmap (MMAPv1) storage engine
6 Data Retention Options
- Purge old entries
  - Set up a MongoDB index with the TTL option (be careful if this index is your shard key)
- Aggregate data and store summaries
  - Create a summary document, delete the original raw data
  - Huge compression possible (seconds -> minutes -> hours -> days -> months -> years)
- Measurement buckets
  - Store all entries for a time window in a single document
  - Avoids storing duplicate metadata
- Individual documents for each measurement
  - Useful when data is sparse or intermittent (e.g., events rather than sensors)
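The measurement-bucket and summary options above can be sketched in plain Python. This is a minimal illustration, not the speaker's schema: the field names (`sensor`, `time`, `value`, `window_start`, `samples`) and the one-minute window are assumptions chosen for the example.

```python
from collections import defaultdict
from datetime import datetime, timezone


def bucket_by_minute(measurements):
    """Group raw measurements into one document per (sensor, minute)."""
    buckets = defaultdict(list)
    for m in measurements:
        # Truncate the timestamp to the start of its minute.
        window_start = m["time"].replace(second=0, microsecond=0)
        buckets[(m["sensor"], window_start)].append(m)

    docs = []
    for (sensor, window_start), items in sorted(buckets.items()):
        values = [m["value"] for m in items]
        docs.append({
            "sensor": sensor,              # metadata stored once per bucket
            "window_start": window_start,
            "samples": [
                {"offset_s": m["time"].second, "value": m["value"]}
                for m in sorted(items, key=lambda m: m["time"])
            ],
            # Summary fields survive even if the raw samples are purged later.
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "avg": sum(values) / len(values),
        })
    return docs
```

Each bucket stores the sensor name once instead of once per reading, and the summary fields are what would remain if the `samples` array were later dropped to save space.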
7 Potential Problems with Data Collection
- Duplicate entries
  - Use a unique index in MongoDB to reject duplicate entries
- Delayed
- Out of order
8 Problems with Delayed and Out-of-Order Entries
- Alert/event generation
- Incremental backup
9 Enable Streaming of Data
- Add a recordedtime field (in addition to the existing field with the timestamp)
- Utilize the $currentDate feature of db.collection.update()
    $currentDate: { recordedtime: true }
- You cannot use this field as a shard key!
- Requires use of update instead of insert
  - Which in turn requires specification of the _id field
  - Consider constructing your _id to solve the duplicate entries issue at the same time
- Allows applications to reliably process each document once and only once.
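As a hedged sketch of the idea on this slide, the helper below builds the filter and update documents for such an upsert as plain dictionaries. The `_id` layout (sensor name plus ISO timestamp) and the `sensor`/`time` field names are assumptions made for illustration; the slide only says to construct `_id` so that duplicates collide.

```python
from datetime import datetime, timezone


def make_upsert(doc):
    """Return (filter, update) for an upsert that stamps recordedtime.

    The _id combines source and measurement time (an assumed layout), so a
    re-sent measurement targets the same document instead of creating a
    duplicate.
    """
    doc_id = "{}:{}".format(doc["sensor"], doc["time"].isoformat())
    fields = {k: v for k, v in doc.items() if k != "_id"}
    filter_spec = {"_id": doc_id}
    update_spec = {
        "$set": fields,
        # The server fills in recordedtime when it applies the update.
        "$currentDate": {"recordedtime": True},
    }
    return filter_spec, update_spec
```

With PyMongo, the pair would typically be passed as `collection.update_one(filter_spec, update_spec, upsert=True)`. A consumer can then stream documents in `recordedtime` order regardless of how late or out of order the measurements arrived.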
10 Accessing Your Data
It's only *mostly* write-only.
11 Create Appropriate Indexes
- Avoid collection scans!
  - Consider using: db.adminCommand( { setParameter: 1, notablescan: 1 } )
  - Avoid queries that might as well be collection scans
- Create the indexes you need (but no more)
  - Don't depend on index intersection
  - Don't over-index
  - Each index can take up a lot of disk/memory
- Consider using partial indexes
    { partialFilterExpression: { speed: { $gt: 75.0 } } }
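To illustrate why a partial index stays small, the sketch below reuses the slide's filter (`speed` greater than 75.0) and evaluates it in plain Python. The helper names are hypothetical, and the evaluator handles only the single `$gt` form shown on the slide; it is not a general query matcher.

```python
# Filter taken from the slide; everything else here is illustrative.
PARTIAL_FILTER = {"speed": {"$gt": 75.0}}


def index_options(name):
    """Keyword options for creating a partial index with the slide's filter."""
    return {"name": name, "partialFilterExpression": PARTIAL_FILTER}


def in_partial_index(doc, partial_filter=PARTIAL_FILTER):
    """Evaluate a filter limited to {field: {"$gt": bound}} clauses."""
    for field, cond in partial_filter.items():
        if field not in doc or not doc[field] > cond["$gt"]:
            return False
    return True
```

Only documents for which `in_partial_index` is true would get an index entry. With PyMongo the options could be passed as `collection.create_index([("speed", 1)], **index_options("speed_fast"))`, where the index name is an arbitrary choice.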
12 Check Your Indexes
- Use .explain() liberally
- Check which indexes are actually used:
    db.collection.aggregate( [ { $indexStats: {} } ] )
13 Adding Data: Getting the Speed You Need
14 API Methods
Insert array
    database[collection].insert(doc_array)
Insert unordered bulk
    bulk = database[collection].initialize_unordered_bulk_op()
    bulk.insert(doc)  # loop here
    bulk.execute()
Upsert unordered bulk
    bulk = database[collection].initialize_unordered_bulk_op()
    bulk.find({"_id": doc["_id"]}).upsert().update_one({"$set": doc})  # loop here
    bulk.execute()
Insert single
    database[collection].insert(doc)
Upsert single
    database[collection].update_one({"_id": doc["_id"]}, {"$set": doc}, upsert=True)
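The "loop here" step in the bulk variants amounts to cutting the incoming stream into fixed-size batches. A minimal sketch of that batching, independent of any driver, might look like the following; the function names are hypothetical, and the default size of 1000 simply mirrors the batch size quoted for the benchmark later in the deck.

```python
from itertools import islice


def batches(docs, size=1000):
    """Yield lists of at most `size` documents from any iterable."""
    it = iter(docs)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def upsert_specs(batch):
    """Build one (filter, update) pair per document, keyed on _id,
    mirroring the slide's {"$set": doc} upsert."""
    return [({"_id": d["_id"]}, {"$set": d}) for d in batch]
```

Each batch corresponds to one `bulk.execute()` in the API shown on the slide. In PyMongo 3 and later, the same pairs could instead be wrapped as `UpdateOne(f, u, upsert=True)` and sent with `collection.bulk_write(ops, ordered=False)`.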
15 Relative Performance Comparison of API Methods
[Chart: docs/sec for Insert Array, Insert Unordered Bulk, Update Unordered Bulk, Insert Single, Update Single]
16 Benchmarks and other lies.
Answering, "Why can't I just use a gigantic HDD RAID array?"
17 Benchmark Environment
VMs
- 4 core Intel(R) Xeon(R) CPU E GHz
- 8 GB RAM
- Sandisk Ultra II 960GB SSD
- WD 5TB 7200rpm HDD
MongoDB
- WiredTiger
- 4GB cache
- Snappy collection compression
- Standalone server (no replica set, no mongos)
Data
- 178 bytes per document in 6 fields
- 3 indexes (2 compound)
- Disk usage: 40% storage, 60% indexes
- Using update unordered bulk method, 1000 docs per bulk.execute()
18 Benchmark: SSD vs. HDD [Chart: inserts/sec, SSD vs. HDD]
19 SSD Benchmark: 60 Minutes [Chart]
20 SSD Benchmark: 0:30-1:00 [Chart]
21 HDD Benchmark: 0:30-1:30 [Chart]
22 HDD Benchmark: 0:30-8:45 (42M documents) [Chart]
23 HDD Benchmark: Last Hour [Chart]
24 SSD Benchmark: 0:30-2:10 (42M documents) [Chart]
25 Benchmark: SSD vs. HDD, Last Hour [Chart: inserts/sec, SSD vs. HDD]
26 96 Hour Test [Chart]
27 TL;DR
- Don't trust someone else's benchmarks (especially mine!)
- Benchmark using your own schema and indexes
- Artificially accelerate the point at which index size exceeds available memory
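A benchmark of this kind needs throughput per time interval rather than one overall average, because the interesting event is the drop once indexes no longer fit in memory. The following is a small harness sketch under that assumption; `write_batch` is a hypothetical callable standing in for whatever performs the actual writes, and the injectable `clock` exists only to make the function easy to test.

```python
import time


def measure_throughput(write_batch, batch_iter, interval_s=60.0,
                       clock=time.monotonic):
    """Call write_batch(batch) for each batch and return a list of
    docs/sec figures, one per elapsed interval of at least interval_s."""
    rates = []
    window_start = clock()
    written = 0
    for batch in batch_iter:
        write_batch(batch)
        written += len(batch)
        now = clock()
        if now - window_start >= interval_s:
            rates.append(written / (now - window_start))
            window_start, written = now, 0
    return rates
```

Plotting the returned rates over a long run shows whether throughput stays flat or collapses partway through, which a single average would hide.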
28 Time Series Data in MongoDB on a BUDGET
29 Replica Set Rollout Options
Follow standard advice
- 3-server replica sets (Primary, Secondary, Secondary)
- Every replica set server on its own hardware
- Disk mirroring
Cost-cutting options
- Primary, Secondary, Arbiter
- Locate multiple replica set servers on the same hardware (but NOT from the SAME replica set)
- No disk mirroring (how many copies do you really need?)
"I love downtime and don't care about my data"
- Single-instance servers instead of replica sets
- RAID0 ("no wasted disk space!")
- No backups
30 Storing Lots of Data
Sharding is a method for distributing data across multiple machines. MongoDB uses sharding to support deployments with very large data sets and high throughput operations.
31 Conventional Sharding
- Non-sharded data kept in default replica set
- Shard key hashed on timestamp to evenly distribute data
Pros:
- Increases insert rate
- Arbitrarily large data storage
Cons:
- All shard replica sets should have comparable hardware
- All shards start thrashing at the same time
- Expanding means a LOT of rebalancing
32 Data Access Patterns
- New writes are always very recent
- Reads are almost always of recent data
- Reads of old data are intuitively slower; let's take advantage of that.
33 Sharding by Zone
- Non-sharded data kept in default replica set
- Most recent time-series data stored in fast replica set
- Older time-series data stored in slow replica sets
Pros:
- Pay for speed where we need it
- Swap fast to slow before thrashing kills performance
- Infinite data size
Cons:
- Ceiling on insert speed
34 Prerequisites for Zone Sharding
- Sharded cluster configured (config replica set, mongos, etc.)
- Existing replica set rsmain (primary shard) contains your normal (not time-series) data
- TimeSeries collection with an index on time
- New replica set for time-series data (e.g., rs001) added as a shard
35 Initial Zone Ranges
Run on mongos:
    use admin
    sh.enableSharding('DBName')
    sh.shardCollection('DBName.TimeSeries', { time: 1 })
    sh.addShardTag('rsmain', 'future')
    sh.addShardTag('rs001', 'ts001')
    sh.addTagRange('DBName.TimeSeries', {time: new Date(" ")}, {time: MaxKey}, 'future')
    sh.addTagRange('DBName.TimeSeries', {time: MinKey}, {time: new Date(" ")}, 'ts001')
    # sh.splitAt('DBName.TimeSeries', {"time": new Date(" ")})
36 Adding a New Time-Series Replica Set, Step 1: Create New Replica Set
When?
- Well before you run out of available fast storage
- Before your input capacity is lowered too close to your needs
Where?
- On the same server with fast storage as the current time-series replica set
Run on mongos:
    use admin
    db.runCommand({addShard: "rs002/hostname:port", name: "rs002"})
    sh.addShardTag('rs002', 'ts002')
    var configdb = db.getSiblingDB("config");
    configdb.tags.update({tag: "ts001"}, {$set: {'max.time': new ISODate( )}})
    sh.addTagRange('DBName.TimeSeries', {time: new Date(" ")}, {time: new Date(" ")}, 'ts002')
    # sh.splitAt('DBName.TimeSeries', {"time": new ISODate(" ")})
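The dates in the shell commands did not survive in the transcript, so the bookkeeping is easier to follow in a small model. The sketch below is one reading of the sequence on the last two slides (shrink the newest time-series zone, then hand the freed span to the new tag); it does not talk to MongoDB. The tag names come from the slides, while the function names, the use of `datetime.min`/`datetime.max` as stand-ins for `MinKey`/`MaxKey`, and any dates passed in are assumptions.

```python
from datetime import datetime

MIN_KEY = datetime.min   # stand-in for MongoDB's MinKey
MAX_KEY = datetime.max   # stand-in for MongoDB's MaxKey


def initial_ranges(horizon):
    """ts001 takes everything before `horizon`; 'future' takes the rest."""
    return [
        {"tag": "ts001", "min": MIN_KEY, "max": horizon},
        {"tag": "future", "min": horizon, "max": MAX_KEY},
    ]


def add_zone(ranges, new_tag, cutover):
    """Shrink the newest time-series zone to end at `cutover` and give the
    span from `cutover` up to its old upper bound to `new_tag`."""
    newest = max((r for r in ranges if r["tag"] != "future"),
                 key=lambda r: r["min"])
    if not newest["min"] < cutover < newest["max"]:
        raise ValueError("cutover must fall inside the newest zone")
    updated = []
    for r in ranges:
        if r is newest:
            updated.append({**r, "max": cutover})
            updated.append({"tag": new_tag, "min": cutover, "max": r["max"]})
        else:
            updated.append(r)
    return updated


def is_contiguous(ranges):
    """True when the ranges cover MinKey..MaxKey with no gap or overlap."""
    ordered = sorted(ranges, key=lambda r: r["min"])
    return (ordered[0]["min"] == MIN_KEY
            and ordered[-1]["max"] == MAX_KEY
            and all(a["max"] == b["min"]
                    for a, b in zip(ordered, ordered[1:])))
```

Checking `is_contiguous` after each change is a cheap guard against leaving a gap or overlap between tag ranges before the corresponding commands are run on mongos.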
37 Adding a New Time-Series Replica Set, Step 2: Wait before Relocation
- Initially nothing changes: all data is added into the previous replica set
- Eventually, new entries match the min.time of the new replica set and will be stored there
- How long to wait before relocation?
  - Make sure you don't fill up your fast storage
  - How far back in time do normal queries go? Queries to the previous replica set will get slower after relocation
38 Adding a New Time-Series Replica Set, Step 3: Relocate to Slow Storage
- Follow the standard procedure for moving a replica set
- Multiple server instances can share the same server/storage
  - Use unique ports
  - Set wiredTigerCacheSizeGB appropriately
39 Pause for Questions
40 Wrap Up
1. Determine your anticipated time-series data rate
2. Mock up a benchmark app matching your use case
   - Focus on indexed fields and their cardinality
3. Benchmark on a single server
   - Fast storage
   - Limited memory to accelerate index thrashing
   - Ensure benchmarks run long enough
4. Iterate, adjusting the following tradeoffs:
   - single vs. bulk/array
   - upsert vs. insert
   - size of bulk/array insert/upsert
   - if using measurement buckets, adjust size of bucket
5. If you achieve your needed data rate, use shard tags to push old data to slower (cheaper) servers
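For step 1, a back-of-the-envelope calculation is often enough to start. The sketch below is plain arithmetic with hypothetical function names; the only figure taken from the deck is the benchmark's 178 bytes per document, and the rate of 1,000 documents per second used afterwards is an invented example, not a measured one.

```python
SECONDS_PER_DAY = 86_400


def daily_bytes(docs_per_sec, bytes_per_doc):
    """Raw bytes added per day, before compression and index overhead."""
    return docs_per_sec * bytes_per_doc * SECONDS_PER_DAY


def days_until_full(capacity_bytes, docs_per_sec, bytes_on_disk_per_doc):
    """Days of data a volume can hold at a steady rate, given the
    per-document footprint on disk (data plus indexes)."""
    return capacity_bytes / daily_bytes(docs_per_sec, bytes_on_disk_per_doc)
```

At an assumed 1,000 documents per second and 178 bytes per document, `daily_bytes(1000, 178)` gives 15,379,200,000 bytes, roughly 15.4 GB of raw data per day. The real on-disk footprint depends on compression and index size, which is why the deck recommends measuring it with your own schema.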
41 Rate My Session
42 Thank You Sponsors!!
43 Thank You!
POSTGRESQL ON AWS: TIPS & TRICKS (AND HORROR STORIES) ALEXANDER KUKUSHKIN PostgresConf US 2018 2018-04-20 ABOUT ME Alexander Kukushkin Database Engineer @ZalandoTech Email: alexander.kukushkin@zalando.de
More informationChapter 12: File System Implementation
Chapter 12: File System Implementation Silberschatz, Galvin and Gagne 2013 Chapter 12: File System Implementation File-System Structure File-System Implementation Allocation Methods Free-Space Management
More informationMongoDB Backup & Recovery Field Guide
MongoDB Backup & Recovery Field Guide Tim Vaillancourt Percona Speaker Name `whoami` { name: tim, lastname: vaillancourt, employer: percona, techs: [ mongodb, mysql, cassandra, redis, rabbitmq, solr, mesos
More informationKathleen Durant PhD Northeastern University CS Indexes
Kathleen Durant PhD Northeastern University CS 3200 Indexes Outline for the day Index definition Types of indexes B+ trees ISAM Hash index Choosing indexed fields Indexes in InnoDB 2 Indexes A typical
More informationGLUSTER CAN DO THAT! Architecting and Performance Tuning Efficient Gluster Storage Pools
GLUSTER CAN DO THAT! Architecting and Performance Tuning Efficient Gluster Storage Pools Dustin Black Senior Architect, Software-Defined Storage @dustinlblack 2017-05-02 Ben Turner Principal Quality Engineer
More informationAccelerate Database Performance and Reduce Response Times in MongoDB Humongous Environments with the LSI Nytro MegaRAID Flash Accelerator Card
Accelerate Database Performance and Reduce Response Times in MongoDB Humongous Environments with the LSI Nytro MegaRAID Flash Accelerator Card The Rise of MongoDB Summary One of today s growing database
More informationDeduplication Storage System
Deduplication Storage System Kai Li Charles Fitzmorris Professor, Princeton University & Chief Scientist and Co-Founder, Data Domain, Inc. 03/11/09 The World Is Becoming Data-Centric CERN Tier 0 Business
More informationMongoDB An Overview. 21-Oct Socrates
MongoDB An Overview 21-Oct-2016 Socrates Agenda What is NoSQL DB? Types of NoSQL DBs DBMS and MongoDB Comparison Why MongoDB? MongoDB Architecture Storage Engines Data Model Query Language Security Data
More informationIntroduction to Hadoop. Owen O Malley Yahoo!, Grid Team
Introduction to Hadoop Owen O Malley Yahoo!, Grid Team owen@yahoo-inc.com Who Am I? Yahoo! Architect on Hadoop Map/Reduce Design, review, and implement features in Hadoop Working on Hadoop full time since
More informationMySQL Performance Optimization and Troubleshooting with PMM. Peter Zaitsev, CEO, Percona
MySQL Performance Optimization and Troubleshooting with PMM Peter Zaitsev, CEO, Percona In the Presentation Practical approach to deal with some of the common MySQL Issues 2 Assumptions You re looking
More information