Breaking Barriers: MongoDB Design Patterns. Nikolaos Vyzas & Christos Soulios
1 Breaking Barriers: MongoDB Design Patterns Nikolaos Vyzas & Christos Soulios
2 A bit about us and this talk: who we are and what we do
3 Christos Soulios
- Christos is a principal architect at Pythian
- Delivers Big Data platforms for some of the world's top tech organizations
- Expert in Big Data, Hadoop, NoSQL etc.
- Working with MongoDB since v1.7 (back in 2011)
4 Nik Vyzas
- Nik is a Sr. TechOps Architect at Percona
- 10+ years' experience in production support and enterprise software development for large-scale distributed environments
- Expert in a variety of open-source technologies, especially RHEL, Debian, Percona Server, XtraDB Cluster, MongoDB, Ansible, Java and Python
- Over the years he has also mastered the dark art of turning caffeine into new software and bug fixes
5 What is this talk about?
- Proven MongoDB design patterns
- Data modelling and indexing principles
- Common MongoDB pitfalls and how to avoid them
- Balancing performance and data consistency
- Best practices for scaling out / sharding
- How to generally press the go-faster button
6 Session Overview
- Indexing Strategy
- Data Modeling: To reference or to embed?
- Ranking / Fast Accounting in MongoDB
- Atomic Updates and the Optimistic Locking Pattern
- Keyword Search Pattern
- Defensive Programming
- Read / Write Concern
- Sharding Considerations
7 Indexing Strategy Optimizing Query Performance
8 Sometimes queries are a bit slow
9 Think about your indexing
10 Index Types
Basic index types:
- Single field indexes
- Compound indexes
Other noteworthy indexes:
- Text indexes
- Geospatial indexes
- Hashed indexes (* mainly for sharding)
11 Index Properties
- Unique - acts as index and constraint
- Sparse - only indexes documents where the field exists
- Partial - only indexes documents that match specified criteria
- TTL - expires documents after a set time (* keep in mind the TTL background thread runs every 60 seconds)
12 Indexing Tips
- Ensure your indexes fit in memory
- Try to be minimal
- Don't index everything
- Indexes are costly
- When indexing timestamps, always index coarsely. NEVER index milliseconds
- Don't index fields with low cardinality
- Be careful with text indexes
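The tip about indexing timestamps coarsely could be sketched as below. This is only an illustration and not code from the talk: the helper name coarsen, the field names posted_at / posted_hour, and the idea of storing a truncated copy next to the precise value are all assumptions.

```python
from datetime import datetime

def coarsen(ts, granularity="minute"):
    """Truncate a datetime so an index on it holds far fewer distinct keys."""
    if granularity == "minute":
        return ts.replace(second=0, microsecond=0)
    if granularity == "hour":
        return ts.replace(minute=0, second=0, microsecond=0)
    if granularity == "day":
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    raise ValueError("unsupported granularity: %r" % granularity)

# Hypothetical review document: keep the precise value for display,
# and index only the coarse field.
posted = datetime(2016, 4, 20, 13, 37, 42, 123456)
review = {"posted_at": posted, "posted_hour": coarsen(posted, "hour")}
```

Under these assumptions the index would be created on posted_hour rather than on posted_at.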
13 More Indexing Tips
- Prefer compound indexes that will improve multiple queries
- Create indexes that cover the queries, so that all data is retrieved from the index
- When developing code you can start mongod with the --notablescan option
- Over time schemas and query patterns evolve, so always review your indexes
14 The explain() plan
15 The explain() plan
- Returns the query execution plan for a specific query
- Provides execution statistics e.g. documents scanned, indexes used etc.
- For sharded collections, information regarding the shards accessed is included
- Use the explain plan to identify required indexes for filtering and sorting documents
16 The explain() plan - No Index
    db.movies.find({'year': '2001'}).explain(true)
    "queryPlanner" : {
        "winningPlan" : {
            "stage" : "COLLSCAN",
            "filter" : { "year" : { "$eq" : "2001" } },
            "direction" : "forward"
        }
    }
17 The explain() plan - No Index
    "executionStats" : {
        "totalDocsExamined" : 250,
        "executionStages" : {
            "nReturned" : 7,
            "advanced" : 7,
            "direction" : "forward",
            "docsExamined" : 250
        }
    }
18 The explain() plan - Indexed
    "queryPlanner" : {
        "winningPlan" : {
            "stage" : "IXSCAN",
            "keyPattern" : { "year" : 1 },
            "indexName" : "year_1",
            "isMultiKey" : false,
            "isUnique" : false,
            "isSparse" : false,
            "isPartial" : false,
            "indexVersion" : 1,
            "direction" : "forward",
            "indexBounds" : { "year" : [ "[\"2001\", \"2001\"]" ] }
        }
    }
19 The explain() plan - Indexed
    "executionStats" : {
        "nReturned" : 7,
        "executionTimeMillis" : 0,
        "totalKeysExamined" : 7,
        "totalDocsExamined" : 7
    }
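To compare the two plans above at a glance, the explain output can be reduced to a few numbers. The sketch below is an assumption-laden helper (the names plan_stages and summarize_explain are made up here); it works on plain dicts shaped like the explain output and only follows single "inputStage" links, so plans with several child stages are not covered.

```python
def plan_stages(plan):
    """Collect stage names from a winningPlan, outermost first."""
    stages = []
    while plan:
        stages.append(plan.get("stage"))
        plan = plan.get("inputStage")  # None ends the walk
    return stages

def summarize_explain(explain):
    """Reduce an explain document to the figures worth comparing."""
    stats = explain.get("executionStats", {})
    returned = stats.get("nReturned", 0)
    examined = stats.get("totalDocsExamined", 0)
    winning = explain.get("queryPlanner", {}).get("winningPlan", {})
    return {
        "stages": plan_stages(winning),
        "keys_examined": stats.get("totalKeysExamined", 0),
        "docs_examined": examined,
        "returned": returned,
        # Documents read per document returned; close to 1 is what we want.
        "docs_per_result": examined / returned if returned else None,
    }

# Figures taken from the two slides above.
no_index = {
    "queryPlanner": {"winningPlan": {"stage": "COLLSCAN"}},
    "executionStats": {"nReturned": 7, "totalDocsExamined": 250},
}
indexed = {
    "queryPlanner": {"winningPlan": {"stage": "IXSCAN"}},
    "executionStats": {"nReturned": 7, "totalKeysExamined": 7,
                       "totalDocsExamined": 7},
}
```

With the slide figures, the collection scan reads 250 documents to return 7, while the indexed plan reads exactly the 7 it returns.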
20 The Database Profiler
- Collects data about operations, cursors and database commands
- Configurable per database or per instance
- Allows setting slowOpThresholdMs to capture only slow queries, or capturing all operations
- Crucial for identifying bottlenecks and understanding workload
21 Data Modeling: To reference or to embed? Embedded vs. referenced pattern implementations
22 Data modeling
- We define data relationships between collections
- How do I join data?
  - Effective data modeling
  - Application-side joins
- Two basic models: Embedded or Referenced
- ALWAYS ask yourself: to reference or to embed?
23 Pattern: Embedded One-to-One Relationship
    {'_id' : ObjectId( ),
     'title' : 'Shawshank Redemption',
     'director': {
         'name' : 'Frank Darabont',
     },
    }
24 Pattern: Embedded One-to-Many Relationship
    {'_id' : ObjectId( ),
     'title' : 'Shawshank Redemption',
     'writers' : [{'name': 'Stephen King', },
                  {'name': 'Frank Darabont', }],
    }
25 The Embedded Model
- Faster reads / writes: the whole BSON document is retrieved in one database call
- Updates at the document level enforce atomicity
- Duplication can lead to data inconsistencies
- Avoid embedding data with unbounded growth
- Never embed documents that grow after creation (MMAPv1 storage engine)
26 Pattern: Referenced One-to-Many Relationship
Reference on movie_id = 'xyz'
movies:
    { '_id' : 'xyz', 'title' : 'The Wall',... }
reviews:
    { '_id' : ObjectId( ), 'movie_id' : 'xyz', 'rating' : 8,... }
    { '_id' : ObjectId( ), 'movie_id' : 'xyz', 'rating' : 2,... }
27 The Referenced Model
- Enforces data consistency
- Allows for Parent & Child Tree References
- Each relationship requires an additional call
- Ensure that your referenced fields are indexed
- This becomes costly:
  - Makes reading slower
  - Makes writing slower
  - Requires more indexes
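The "additional call" per relationship amounts to an application-side join. A minimal sketch, assuming movies and reviews are pymongo-style collections exposing find_one() and find(); the function name movie_with_reviews is made up for illustration:

```python
def movie_with_reviews(movies, reviews, movie_id):
    """Application-side join: one call per collection, merged in the client."""
    movie = movies.find_one({"_id": movie_id})       # first round trip
    if movie is None:
        return None
    # Second round trip; an index on reviews.movie_id keeps this cheap.
    movie["reviews"] = list(reviews.find({"movie_id": movie_id}))
    return movie
```

With a real connection the call might look like movie_with_reviews(db.movies, db.reviews, 'xyz').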
28 Atomic Updates & the Optimistic Lock Pattern Atomic Updates & Collection Versioning
30 Atomic Update Operations
    >>> db.movies.update_one(
    ...     {'rating': {'$gt': 9}},
    ...     {'$set': {'favorite': True}})
    >>> old_doc = db.movies.find_one_and_update(
    ...     {'rating': {'$gt': 9}},
    ...     {'$set': {'favorite': True}})
With find_one_and_update, the update and the return of the document occur within a single atomic operation
31 Update Operators
The list of valid update operators:
- $inc : Increment counter
- $set : Set a new value
- $unset : Remove the field
- $addToSet : Add value into array (duplicates not inserted)
- $push / $pushAll : Add value(s) into array ($pushAll is deprecated in favour of $push with $each)
- $pop : Remove first / last value of array
- $pull / $pullAll : Remove instance(s) of value from array
- $rename : Update key name(s)
32 Pattern: Optimistic Locking
For complex changes use the Optimistic Locking Design Pattern:
- Include a version field in all documents: {'_id': ObjectId( ), 'title': 'Zootopia', 'v': 1 }
- Retrieve a document and remember its version
- Make a series of complex transformations to the document or create a new one
- Do not forget to increment the version of the new document
- Update the document only if the version has not changed
33 Pattern: Optimistic Locking
Update only if the document version has not changed:
    m = db.movies.find_one({'title': 'Zootopia'})
    v = m['v']                      # Remember the old version
    m = complex_transformations(m)
    m['v'] = v + 1                  # Increment the version
    r = db.movies.replace_one({'_id': m['_id'], 'v': v}, m)
    if r.modified_count == 0:       # Someone else changed the document first
        compensate()
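One possible shape for compensate() is simply to re-read and retry. The sketch below is not from the talk: the function name update_with_version_check, the max_attempts parameter and the retry policy are assumptions, and collection is expected to behave like a pymongo collection (find_one() and replace_one() returning an object with modified_count).

```python
def update_with_version_check(collection, flt, transform, max_attempts=3):
    """Apply transform() to a document, retrying if its version moved."""
    for _ in range(max_attempts):
        doc = collection.find_one(flt)
        if doc is None:
            return None
        seen = doc["v"]                    # remember the version we read
        new_doc = transform(doc)
        new_doc["v"] = seen + 1            # increment the version
        result = collection.replace_one(
            {"_id": doc["_id"], "v": seen},  # matches only if nobody got in first
            new_doc)
        if result.modified_count == 1:
            return new_doc
        # Version changed underneath us: loop, re-read, try again.
    raise RuntimeError("document kept changing; gave up after %d attempts"
                       % max_attempts)
```

Retrying is only one choice; depending on the change, reporting the conflict back to the user may be the better compensation.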
34 Ranking / Fast Accounting in MongoDB High performance accounting to avoid aggregation
35 Pattern: Fast Accounting
Use case: Count daily and monthly reviews posted for each movie. Display a histogram on the movie page
Naive solution: Run counts on the reviews collection when histograms must be rendered
- Slow and resource consuming to aggregate millions of documents
- Calculating on every page view is too often
- Indexing may help but it will not solve the problem
- Fetching old data destroys the page cache
36 Pattern: Fast Accounting
Solution: Fast Accounting Design Pattern
- Create a separate collection to store aggregate counters
- Update counters when a new review is submitted
- If there is more than one counter, multiple updates will be performed
- This is a pattern taken from Complex Event Processing (CEP)
37 Pattern: Fast Accounting - Schema
Create a separate collection named 'review_counts':
    { '_id': {'movie_id': ObjectId( ), 'day' : ' '}, 'count' : },
    { '_id': {'movie_id': ObjectId( ), 'month' : ' '}, 'count' : }
The fields inside '_id' are the query dimensions
38 Pattern: Fast Accounting - Increment counts
Update daily counts:
    >>> db.review_counts.update_one(
    ...     {'_id': {'movie_id': ObjectId( ), 'day' : ' '}},
    ...     {'$inc' : {'count' : 1}}, upsert=True)
Update monthly counts:
    >>> db.review_counts.update_one(
    ...     {'_id': {'movie_id': ObjectId( ), 'month' : ' '}},
    ...     {'$inc' : {'count' : 1}}, upsert=True)
39 Pattern: Fast Accounting - Retrieve counts
Retrieve daily count for a single day:
    >>> db.review_counts.find_one(
    ...     {'_id': {'movie_id': ObjectId( ), 'day' : ' '}})['count']
40 Pattern: Fast Accounting
- Documents for the latest dates and months are in memory
- Retrievals are very fast because they search indexed data
- Updates are very fast because they happen in memory
- Use the _id index to ensure uniqueness and save space
41 Pattern: Fast Accounting
- Updates are atomic
- They can scale to thousands of concurrent updates
- Always use upsert=True to create new counters
- More dimensions can be added to the counter, but don't overdo it
- This pattern can be adopted for aggregating any time-series data
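Putting the increment steps together, a write path for a new review might look like the sketch below. The function names and the 'YYYY-MM-DD' / 'YYYY-MM' string formats for the day and month dimensions are assumptions made for this example; review_counts is expected to be a pymongo-style collection.

```python
from datetime import date

def counter_updates(movie_id, day):
    """Return (filter, update) pairs for the daily and monthly counters."""
    inc = {"$inc": {"count": 1}}
    return [
        # Assumed key formats; any stable, sortable format would do.
        ({"_id": {"movie_id": movie_id, "day": day.strftime("%Y-%m-%d")}}, inc),
        ({"_id": {"movie_id": movie_id, "month": day.strftime("%Y-%m")}}, inc),
    ]

def record_review(review_counts, movie_id, day):
    """Bump every counter touched by one new review."""
    for flt, update in counter_updates(movie_id, day):
        # upsert=True creates the counter the first time it is needed.
        review_counts.update_one(flt, update, upsert=True)
```

Because '_id' is an embedded document, the key order inside it has to stay the same on every write and read (movie_id first here), since an exact match on an embedded document is sensitive to field order.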
42 Keyword Search Pattern Modelling data for retrieval based on specific keyword or tag
43 Living in the #hashtag world
Use case: Retrieve a document based on a specific hashtag or keyword
Naive solution: Add all tags, delimited, to a tags field and create a text index, e.g. db.movies.createIndex({"tags":"text"})
- Text indexes require more space
- Take very long to build
- Significantly slow down insertion
- More intensive retrieval processing
44 Keyword Search Pattern
Solution: Keyword Search Pattern
- Create separate index entries per tag
- Groups documents based on tags
- Leverages multi-key indexes using an array (automatically created)
- Results in smaller and faster indexes compared to text
45 Keyword Search Pattern: Schema
A movies collection with search keywords / tags:
    { '_id': {'movie_id': ObjectId( ), 'title' : 'World War Z'},
      'tags' : ['thriller','2016','zombies'] },
    db.movies.createIndex({tags: 1})
    ### Separate index entries have now been created for:
    ### - thriller
    ### - 2016
    ### - zombies
46 Gotchas and Pitfalls
Be careful:
- Insertion degrades on high cardinality (i.e. thousands)
- If indexes get too large, asynchronous indexing may be required
- Allowing for free-text entry can lead to high cardinality, so try to maintain a list if possible
- Be wary of case-sensitivity: consider forcing UPPER / LOWER case
- Do not use this pattern for full-text search, rather prefer text indexing
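The case-sensitivity and free-text gotchas can be handled before a document is written. A minimal sketch, assuming tags arrive as user-entered strings; the name normalize_tags and the optional allowed list are illustrative only:

```python
def normalize_tags(raw_tags, allowed=None):
    """Lower-case, trim and de-duplicate tags, optionally against a fixed list."""
    seen = set()
    tags = []
    for raw in raw_tags:
        tag = raw.strip().lstrip("#").lower()   # force one case, drop the '#'
        if not tag or tag in seen:
            continue
        if allowed is not None and tag not in allowed:
            continue                            # keep cardinality under control
        seen.add(tag)
        tags.append(tag)
    return tags
```

The same normalization would have to be applied to the search term at query time, otherwise lookups for 'Zombies' would miss documents tagged 'zombies'.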
47 Defensive programming Best practices for reading and writing data with a schemaless database
48 Structure in a schemaless world
MongoDB does not enforce schema
Key considerations for coding:
- Is the data I'm writing valid?
- Is the data I'm reading valid?
49 Structure in a schemaless world
Methods for ensuring data is valid:
- Using BSON document types
- Document validation capability (3.2+)
50 BSON Document Types
BSON provides support for common variable types, most importantly: bool, int, long, double, string, array, timestamp, date, objectid, object
51 BSON Document Types
- Python types are supported by PyMongo
- PyMongo converts Python types in a JSON document to BSON types
- BSON types are also supported by the Java driver
- Generally, language-specific drivers support BSON
- Custom types can also be defined using a class
- Document types can also be defined using an ORM such as MongoEngine
52 BSON Document Types
For example, insert a document in PyMongo enforcing datetime:
    >>> from datetime import datetime
    >>> doc = {
    ...     "date": datetime(2003, 11, 26),
    ...     "title_id": "tt ",
    ...     "user_location": "Texas",
    ...     "title_name": "The Shawshank Redemption",
    ...     "summary": "Best movie ever!!"
    ... }
    >>> db.bsontest.insert_one(doc)
53 BSON Document Types
Then retrieve the value in the mongo shell:
    > db.bsontest.find()
    { "_id" : ObjectId("570aae6d0059a38a781fed60"),
      "title_id" : "tt ",
      "user_location" : "Texas",
      "summary" : "Tied for the best movie I have ever seen",
      "date" : ISODate(" T00:00:00Z"),
      "title_name" : "The Shawshank Redemption" }
54 Document Validation
- Document validation is supported in MongoDB 3.2+
- Validation can be set during collection creation or on an existing collection
- Two modes of operation (validationLevel):
  - Strict - applied to all document inserts / updates
  - Moderate - applied to inserts, and to updates on existing documents that already conform
- Setting validationAction:
  - warn for testing (logs errors)
  - error for enforcing (throws an error)
55 New Document Validation
Create a validation on the dvtest collection:
    db.createCollection("dvtest", {
        validator : { $and: [
            {"title_id" : { $type: "string" }},
            {"user_location" : { $exists: true }},
            {"title_name" : { $type: "string" }}
        ] }
    })
56 New Document Validation
Insert an invalid document into the dvtest collection:
    db.dvtest.insert({"foo": "bar"})
    WriteResult({
        "nInserted" : 0,
        "writeError" : {
            "code" : 121,
            "errmsg" : "Document failed validation"
        }
    })
57 Existing Document Validation
Add a new validation to an existing dvtest collection with moderate validation:
    db.runCommand({
        collMod: "dvtest",
        validator: { $and: [ {title_id: {$exists: true}} ] },
        validationLevel: "moderate"
    })
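The same three rules can also be checked in the application before a write, or after a read from a collection that has no validator. The following is a sketch only: validation_errors is a made-up helper that mirrors the dvtest validator above, not a PyMongo or MongoDB feature.

```python
def validation_errors(doc):
    """Client-side mirror of the dvtest rules; returns a list of problems."""
    errors = []
    for field in ("title_id", "title_name"):
        if not isinstance(doc.get(field), str):      # $type: "string"
            errors.append("%s must be a string" % field)
    if "user_location" not in doc:                   # $exists: true
        errors.append("user_location is required")
    return errors
```

An empty list means the document would pass; the server-side validator remains the authority, and this check only gives earlier and more specific feedback than error code 121.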
58 Read / Write Concern Read / Write Concern levels for CRUD operations
59 What is Write Concern
Write concern determines the level of acknowledgement for data written by mongod processes:
- w = 0: No acknowledgement at all. It fails only if connectivity errors occur at the client application
- w = 1 (default): Require acknowledgement by the Primary replica
- w > 1: Acknowledgement by the number of replicas equal to w
- w = majority: Acknowledgement by a majority of the replica set members
60 Write Concern
- wtimeout: Time limit (ms) for an acknowledgement to return
- The j=true option requires an acknowledgement that data was written to the database journal
61 Write Concern Tips
- Never do unsafe writes (w=0), except if you don't care about your data
- w=1 is not safe at all. A write can be overwritten by an outdated replica after a failover
- w='majority' is safe, but it's slow
- w>1 is your best bet
- Always use wtimeout when w>1. If the write concern cannot be achieved, the write will block forever
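One way to make the wtimeout tip hard to forget is to build the write concern settings in a single place. The helper below is a sketch with made-up names; it produces the w, wtimeoutMS and journal options of a MongoDB connection string, and treating 'majority' like w>1 for the timeout check is an assumption that goes slightly beyond the slide.

```python
from urllib.parse import urlencode

def write_concern_options(w, wtimeout_ms=None, journal=None):
    """Build connection-string options, refusing w>1 without a timeout."""
    waits_on_replicas = w == "majority" or (isinstance(w, int) and w > 1)
    if waits_on_replicas and wtimeout_ms is None:
        raise ValueError("set wtimeout_ms when waiting on more than one replica")
    options = {"w": w}
    if wtimeout_ms is not None:
        options["wtimeoutMS"] = wtimeout_ms
    if journal is not None:
        options["journal"] = "true" if journal else "false"
    return urlencode(options)

# Hypothetical host name, for illustration only.
uri = "mongodb://db.example.net/?" + write_concern_options(2, wtimeout_ms=5000, journal=True)
```

The resulting string can be passed to a driver's client constructor; the driver then applies that write concern to every write unless it is overridden per operation.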
62 Read Preference
read_preference specifies the replica instance that read operations are directed at. Possible values:
- PRIMARY [default]
- PRIMARY_PREFERRED
- SECONDARY
- SECONDARY_PREFERRED
- NEAREST
63 Read Concern
Read concern specifies the isolation level for read operations
- ReadConcern('local') returns local data stored on the replica queried [default]
- ReadConcern('majority') returns data that has already been replicated to the majority of replicas
- 'majority' is only supported by the WiredTiger storage engine, not by MMAPv1
64 Read Concern Tips
- Read from secondaries when possible to scale reads
- All read preference modes except PRIMARY may return stale data because of replication lag
- 'majority' read concern is slow
- 'majority' read concern does not guarantee the latest data, only the latest data replicated to the majority of replicas
65 Sharding Considerations Hash vs. Timestamp Distribution
66 Sharding in MongoDB
- Sharding: Horizontal partitioning of data across multiple nodes / replica sets
- Sharded replica sets are recommended for HA
- Collections are sharded across replica sets based on a shard key
- High cardinality of the shard key ensures even distribution across replica sets
- Collections which are not sharded remain on the primary shard
67 Sharding in MongoDB
What are my sharding options?
- Hash based
- Range based
- Tag based
68 Hash Based Sharding
- Use hashed indexes for the ranges
- Evenly distributed reads / writes
- Random operations due to random sharding algorithm
- Retrieving multiple documents can lead to scatter-gather
Key use cases:
- Scaling: Load balancing reads & writes (example to follow)
- Disaster recovery: Parallel shard recovery
69 Hash Based Sharding Example: Shard key hash(datetime) - good write distribution
[Diagram: WRITES for three datetime values spread over Shard1 (Primary), Shard2 and Shard3]
70 Hash Based Sharding Example: Shard key hash(datetime) - scatter-gather reads
Find datetime values between 17th & 21st of April
[Diagram: READS for three datetime values hitting Shard1 (Primary), Shard2 and Shard3]
71 Hash Based Sharding Example: Shard key hash(userid) - good read distribution
Find user with id = ed4f7269
[Diagram: READS for ed4f7269 against Shard1 (Primary), Shard2 and Shard3]
72 Range Based Sharding
- Ranges are defined on the shard key values, e.g. number / date-time
- Data is divided across ranges of documents, e.g. 4x shards with int >> Shard1 with values etc.
- Can lead to hotspot shards on date-based ranges
- As ranges change, chunk migration may cause overhead
Key use cases:
- Scaling: Load balancing reads & writes
- Disaster recovery: Parallel shard recovery
73 Range Based Sharding Example: Shard key datetime value - bad write distribution
On 20th April all writes go to Shard3
[Diagram: WRITES for three datetime values against Shard1 (Primary), Shard2 and Shard3]
74 Range Based Sharding Example: Shard key datetime value - bad write distribution
Similar scenario with reads
[Diagram: READS for three datetime values against Shard1 (Primary), Shard2 and Shard3]
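The difference between the hash and range examples can be reproduced with a small simulation. Everything in it is an assumption made for illustration: the range boundaries, the year, and the use of SHA-256 modulo the shard count as a stand-in for MongoDB's own hashed shard key and chunk assignment.

```python
import hashlib
from datetime import datetime, timedelta

SHARDS = ["Shard1", "Shard2", "Shard3"]

# Hypothetical upper bounds (exclusive) of the ranges held by Shard1 and Shard2;
# Shard3 holds everything from the last bound onwards.
RANGE_BOUNDS = [datetime(2016, 4, 10), datetime(2016, 4, 20)]

def range_shard(ts):
    """Route by the raw datetime value."""
    for shard, upper in zip(SHARDS, RANGE_BOUNDS):
        if ts < upper:
            return shard
    return SHARDS[-1]

def hash_shard(ts):
    """Route by a hash of the datetime (stand-in for a hashed shard key)."""
    digest = hashlib.sha256(ts.isoformat().encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

def distribution(timestamps, route):
    counts = {shard: 0 for shard in SHARDS}
    for ts in timestamps:
        counts[route(ts)] += 1
    return counts

# One write per minute during 20th April.
writes = [datetime(2016, 4, 20) + timedelta(minutes=i) for i in range(1440)]
by_range = distribution(writes, range_shard)   # every write lands on Shard3
by_hash = distribution(writes, hash_shard)     # writes are spread over all shards
```

The flip side shows up on reads: a query for a date interval maps to one or two shards under range routing, but has to visit every shard under hash routing.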
75 Tag Based Sharding
- Allows for custom data distribution
- Data is divided across predefined tags, e.g. Americas on Shard1, EU on Shard2, APAC on Shard3
- Can lead to hotspots depending on use case
Key use cases:
- Geo-locality: Force data into suitable geographically dispersed shards
- HW optimization: Force hot data onto faster hardware
76 Tag Based Sharding Example: Shard tags on location - faster response times
Writes occur in a local DC
[Diagram: WRITES from USA, GREECE and AUSTRALIA against Shard1 (Primary) Tag: AM, Shard2 Tag: EU and Shard3 Tag: APAC]
77 Tag Based Sharding Example: Shard tags on location - faster response times
Reads occur in a local DC
[Diagram: READS and WRITES from USA, GREECE and AUSTRALIA against Shard1 (Primary) Tag: AM, Shard2 Tag: EU and Shard3 Tag: APAC]
78 Tag Based Sharding Example: Shard tags on year ranges - automatic archiving
New data is written to high speed node
[Diagram: WRITES]
    Shard1 (Primary)  Tag: 2016  <NEW DATA>           x Cores - 128GB RAM - SSD
    Shard2            Tag:       <FEWER WRITES>       16x Cores - 64GB RAM - SSD
    Shard3            Tag: <     <NO WRITE ACTIVITY>  4x Cores - 32GB RAM - Rotational
79 Tag Based Sharding Example: Shard tags on year ranges - automatic archiving
New data is written to high speed node
[Diagram: READS and WRITES]
    Shard1 (Primary)  Tag: 2016  <NEW DATA>        x Cores - 256GB RAM - SSD
    Shard2            Tag:       <FEWER READS>     16x Cores - 64GB RAM - SSD
    Shard3            Tag: <     <ONLY REPORTING>  4x Cores - 32GB RAM - Rotational
80 The end: Q&A
IBM Db2 Event Store Simplifying and Accelerating Storage and Analysis of Fast Data IBM Db2 Event Store Disclaimer The information contained in this presentation is provided for informational purposes only.
More information1 Big Data Hadoop. 1. Introduction About this Course About Big Data Course Logistics Introductions
Big Data Hadoop Architect Online Training (Big Data Hadoop + Apache Spark & Scala+ MongoDB Developer And Administrator + Apache Cassandra + Impala Training + Apache Kafka + Apache Storm) 1 Big Data Hadoop
More informationNoSQL systems. Lecture 21 (optional) Instructor: Sudeepa Roy. CompSci 516 Data Intensive Computing Systems
CompSci 516 Data Intensive Computing Systems Lecture 21 (optional) NoSQL systems Instructor: Sudeepa Roy Duke CS, Spring 2016 CompSci 516: Data Intensive Computing Systems 1 Key- Value Stores Duke CS,
More informationOracle NoSQL Database Enterprise Edition, Version 18.1
Oracle NoSQL Database Enterprise Edition, Version 18.1 Oracle NoSQL Database is a scalable, distributed NoSQL database, designed to provide highly reliable, flexible and available data management across
More informationDatabase Solution in Cloud Computing
Database Solution in Cloud Computing CERC liji@cnic.cn Outline Cloud Computing Database Solution Our Experiences in Database Cloud Computing SaaS Software as a Service PaaS Platform as a Service IaaS Infrastructure
More informationHow to upgrade MongoDB without downtime
How to upgrade MongoDB without downtime me - @adamotonete Adamo Tonete, Senior Technical Engineer Brazil Agenda Versioning Upgrades Operations that always require downtime Upgrading a replica-set Upgrading
More informationBecome a MongoDB Replica Set Expert in Under 5 Minutes:
Become a MongoDB Replica Set Expert in Under 5 Minutes: USING PERCONA SERVER FOR MONGODB IN A FAILOVER ARCHITECTURE This solution brief outlines a way to run a MongoDB replica set for read scaling in production.
More informationMigrating Oracle Databases To Cassandra
BY UMAIR MANSOOB Why Cassandra Lower Cost of ownership makes it #1 choice for Big Data OLTP Applications. Unlike Oracle, Cassandra can store structured, semi-structured, and unstructured data. Cassandra
More informationHigh Performance NoSQL with MongoDB
High Performance NoSQL with MongoDB History of NoSQL June 11th, 2009, San Francisco, USA Johan Oskarsson (from http://last.fm/) organized a meetup to discuss advances in data storage which were all using
More informationMongoDB. Nicolas Travers Conservatoire National des Arts et Métiers. MongoDB
Nicolas Travers Conservatoire National des Arts et Métiers 1 Introduction Humongous (monstrous / enormous) NoSQL: Documents Oriented JSon Serialized format: BSon objects Implemented in C++ Keys indexing
More informationFREE AND OPEN SOURCE SOFTWARE CONFERENCE (FOSSC-17) MUSCAT, FEBRUARY 14-15, 2017
From Relational Model to Rich Document Data Models - Best Practices Using MongoDB Vinu Sherimon 1, Sherimon P.C. 2 Abstract Open Source Software steps up the development of today s diverse applications.
More informationMongoDB Architecture
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui MongoDB Architecture Lecturer : Dr. Pavle Mogin SWEN 432 Advanced Database Design and Implementation Advanced Database Design
More informationAurora, RDS, or On-Prem, Which is right for you
Aurora, RDS, or On-Prem, Which is right for you Kathy Gibbs Database Specialist TAM Katgibbs@amazon.com Santa Clara, California April 23th 25th, 2018 Agenda RDS Aurora EC2 On-Premise Wrap-up/Recommendation
More informationThe Google File System
The Google File System Sanjay Ghemawat, Howard Gobioff and Shun Tak Leung Google* Shivesh Kumar Sharma fl4164@wayne.edu Fall 2015 004395771 Overview Google file system is a scalable distributed file system
More informationElasticSearch in Production
ElasticSearch in Production lessons learned Anne Veling, ApacheCon EU, November 6, 2012 agenda! Introduction! ElasticSearch! Udini! Upcoming Tool! Lessons Learned introduction! Anne Veling, @anneveling!
More informationArchitecture of a Real-Time Operational DBMS
Architecture of a Real-Time Operational DBMS Srini V. Srinivasan Founder, Chief Development Officer Aerospike CMG India Keynote Thane December 3, 2016 [ CMGI Keynote, Thane, India. 2016 Aerospike Inc.
More informationRun your own Open source. (MMS) to avoid vendor lock-in. David Murphy MongoDB Practice Manager, Percona
Run your own Open source Click alternative to edit to Master Ops-Manager title style (MMS) to avoid vendor lock-in David Murphy MongoDB Practice Manager, Percona Who is this Person and What Does He Know?
More informationDatacenter replication solution with quasardb
Datacenter replication solution with quasardb Technical positioning paper April 2017 Release v1.3 www.quasardb.net Contact: sales@quasardb.net Quasardb A datacenter survival guide quasardb INTRODUCTION
More informationMongoDB CRUD Operations
MongoDB CRUD Operations Release 3.2.4 MongoDB, Inc. March 11, 2016 2 MongoDB, Inc. 2008-2016 This work is licensed under a Creative Commons Attribution-NonCommercial- ShareAlike 3.0 United States License
More informationScaling. Marty Weiner Grayskull, Eternia. Yashh Nelapati Gotham City
Scaling Marty Weiner Grayskull, Eternia Yashh Nelapati Gotham City Pinterest is... An online pinboard to organize and share what inspires you. Relationships Marty Weiner Grayskull, Eternia Yashh Nelapati
More informationMongoDB CRUD Operations
MongoDB CRUD Operations Release 3.2.3 MongoDB, Inc. February 17, 2016 2 MongoDB, Inc. 2008-2016 This work is licensed under a Creative Commons Attribution-NonCommercial- ShareAlike 3.0 United States License
More informationMongoDB. History. mongodb = Humongous DB. Open-source Document-based High performance, high availability Automatic scaling C-P on CAP.
#mongodb MongoDB Modified from slides provided by S. Parikh, A. Im, G. Cai, H. Tunc, J. Stevens, Y. Barve, S. Hei History mongodb = Humongous DB Open-source Document-based High performance, high availability
More informationTopics. History. Architecture. MongoDB, Mongoose - RDBMS - SQL. - NoSQL
Databases Topics History - RDBMS - SQL Architecture - SQL - NoSQL MongoDB, Mongoose Persistent Data Storage What features do we want in a persistent data storage system? We have been using text files to
More informationCSE 124: Networked Services Fall 2009 Lecture-19
CSE 124: Networked Services Fall 2009 Lecture-19 Instructor: B. S. Manoj, Ph.D http://cseweb.ucsd.edu/classes/fa09/cse124 Some of these slides are adapted from various sources/individuals including but
More informationRavenDB & document stores
université libre de bruxelles INFO-H415 - Advanced Databases RavenDB & document stores Authors: Yasin Arslan Jacky Trinh Professor: Esteban Zimányi Contents 1 Introduction 3 1.1 Présentation...................................
More information5 Fundamental Strategies for Building a Data-centered Data Center
5 Fundamental Strategies for Building a Data-centered Data Center June 3, 2014 Ken Krupa, Chief Field Architect Gary Vidal, Solutions Specialist Last generation Reference Data Unstructured OLTP Warehouse
More informationrelational Relational to Riak Why Move From Relational to Riak? Introduction High Availability Riak At-a-Glance
WHITEPAPER Relational to Riak relational Introduction This whitepaper looks at why companies choose Riak over a relational database. We focus specifically on availability, scalability, and the / data model.
More informationTime Series Live 2017
1 Time Series Schemas @Percona Live 2017 Who Am I? Chris Larsen Maintainer and author for OpenTSDB since 2013 Software Engineer @ Yahoo Central Monitoring Team Who I m not: A marketer A sales person 2
More informationMySQL Replication Options. Peter Zaitsev, CEO, Percona Moscow MySQL User Meetup Moscow,Russia
MySQL Replication Options Peter Zaitsev, CEO, Percona Moscow MySQL User Meetup Moscow,Russia Few Words About Percona 2 Your Partner in MySQL and MongoDB Success 100% Open Source Software We work with MySQL,
More informationOracle NoSQL Database Enterprise Edition, Version 18.1
Oracle NoSQL Database Enterprise Edition, Version 18.1 Oracle NoSQL Database is a scalable, distributed NoSQL database, designed to provide highly reliable, flexible and available data management across
More informationApplied NoSQL in.net
Applied NoSQL in.net Topics Survey data management world Install mongodb Map C# classes to mongo collections Explore schema design Write queries against mongodb Update records Breaking with tradition is
More informationA Journey to DynamoDB
A Journey to DynamoDB and maybe away from DynamoDB Adam Dockter VP of Engineering ServiceTarget Who are we? Small Company 4 Developers AWS Infrastructure NO QA!! About our product Self service web application
More informationMongoDB for a High Volume Logistics Application. Santa Clara, California April 23th 25th, 2018
MongoDB for a High Volume Logistics Application Santa Clara, California April 23th 25th, 2018 about me... Eric Potvin Software Engineer in the performance team at Shipwire, an Ingram Micro company, in
More informationMongoDB Chunks Distribution, Splitting, and Merging. Jason Terpko
Percona Live 2016 MongoDB Chunks Distribution, Splitting, and Merging Jason Terpko NoSQL DBA, Rackspace/ObjectRocket www.linkedin.com/in/jterpko, jason.terpko@rackspace.com My Story Started out in relational
More informationCustom Reference Data REST API
Introduction, page 1 Limitations, page 1 Setup Requirements, page 2 Architecture, page 7 API Endpoints and Examples, page 8 Introduction The Custom Reference Data (CRD) APIs exists to allow query, creation,
More informationMongoDB Web Architecture
MongoDB Web Architecture MongoDB MongoDB is an open-source, NoSQL database that uses a JSON-like (BSON) document-oriented model. Data is stored in collections (rather than tables). - Uses dynamic schemas
More informationMongoDB Shootout: MongoDB Atlas, Azure Cosmos DB and Doing It Yourself
MongoDB Shootout: MongoDB Atlas, Azure Cosmos DB and Doing It Yourself Agenda and Intro Click for subtitle or brief description Agenda Intro Goal for this talk Who is this David Murphy person? The technologies
More informationAzure-persistence MARTIN MUDRA
Azure-persistence MARTIN MUDRA Storage service access Blobs Queues Tables Storage service Horizontally scalable Zone Redundancy Accounts Based on Uri Pricing Calculator Azure table storage Storage Account
More informationNoSQL BENCHMARKING AND TUNING. Nachiket Kate Santosh Kangane Ankit Lakhotia Persistent Systems Ltd. Pune, India
NoSQL BENCHMARKING AND TUNING Nachiket Kate Santosh Kangane Ankit Lakhotia Persistent Systems Ltd. Pune, India Today large variety of available NoSQL options has made it difficult for developers to choose
More informationMySQL High Availability
MySQL High Availability And other stuff worth talking about Peter Zaitsev CEO Moscow MySQL Users Group Meetup July 11 th, 2017 1 Few Words about Percona 2 Percona s Purpose To Champion Unbiased Open Source
More informationCISC 7610 Lecture 5 Distributed multimedia databases. Topics: Scaling up vs out Replication Partitioning CAP Theorem NoSQL NewSQL
CISC 7610 Lecture 5 Distributed multimedia databases Topics: Scaling up vs out Replication Partitioning CAP Theorem NoSQL NewSQL Motivation YouTube receives 400 hours of video per minute That is 200M hours
More informationGFS: The Google File System. Dr. Yingwu Zhu
GFS: The Google File System Dr. Yingwu Zhu Motivating Application: Google Crawl the whole web Store it all on one big disk Process users searches on one big CPU More storage, CPU required than one PC can
More informationEverything You Need to Know About MySQL Group Replication
Everything You Need to Know About MySQL Group Replication Luís Soares (luis.soares@oracle.com) Principal Software Engineer, MySQL Replication Lead Copyright 2017, Oracle and/or its affiliates. All rights
More informationDatenbanksysteme II: Caching and File Structures. Ulf Leser
Datenbanksysteme II: Caching and File Structures Ulf Leser Content of this Lecture Caching Overview Accessing data Cache replacement strategies Prefetching File structure Index Files Ulf Leser: Implementation
More informationA Fast and High Throughput SQL Query System for Big Data
A Fast and High Throughput SQL Query System for Big Data Feng Zhu, Jie Liu, and Lijie Xu Technology Center of Software Engineering, Institute of Software, Chinese Academy of Sciences, Beijing, China 100190
More information