Buffering to Redis for Efficient Real-Time Processing. Percona Live, April 24, 2018
Transcription
1 Buffering to Redis for Efficient Real-Time Processing Percona Live, April 24, 2018
2 Presenting Today Jon Hyman CTO & Co-Founder Braze (Formerly
3 Mobile is at the vanguard of a new wave of borderless engagement.
- "Digital is the main reason just over half of Fortune 500 companies have disappeared since the year 2000." Pierre Nanterme, CEO, Accenture
- "[...] the roller coaster will be accelerating faster than ever, only this time it'll be about actual experiences, with much less emphasis on the way those experiences get made." Walt Mossberg, American journalist & former Recode Editor at Large
- Sources: Digital Disruption Has Only Just Begun (Davos World Economic Forum); The Disappearing Computer (Recode)
4 Braze empowers you to humanize your brand-customer relationships at scale.
- More than 1 billion MAU
- Tens of billions of messages sent monthly
- Global customer presence on six continents
5 Today (TOC)
- Quick Intro to Redis
- Coordinating Customer Journeys with Redis
- Buffering Analytics to Redis
6 Quick Intro to Redis
7 What is Redis?
Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache and message broker. It supports data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs and geospatial indexes with radius queries. Redis has built-in replication, Lua scripting, LRU eviction, transactions and different levels of on-disk persistence, and provides high availability via Redis Sentinel and automatic partitioning with Redis Cluster.
- Braze uses all the data types from Redis
- In today's talk we'll look at sorted sets, sets, hashes, and strings
8 Redis data types
Strings: key-value storage.
- Redis has atomic operations to set a key if it doesn't exist and to set an expiry
- You can use this to create a basic locking mechanism
    SET key value NX EX 10
- Sets key to value if it does not exist, and expires the key in 10 seconds
- Redis returns whether or not the set succeeded
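The locking idea on this slide can be sketched without a Redis server. The class below is a hypothetical in-memory stand-in, not Redis itself: `TinyStringStore`, `set_nx_ex` and `try_lock` are names invented for illustration, and the only behaviour modelled is "set if absent, with an expiry". With the redis-py client the equivalent call would be `r.set(key, value, nx=True, ex=10)`.

```python
import time


class TinyStringStore:
    """In-memory stand-in that mimics `SET key value NX EX seconds`."""

    def __init__(self, clock=time.monotonic):
        self._data = {}       # key -> (value, expires_at)
        self._clock = clock   # injectable so tests can control time

    def set_nx_ex(self, key, value, ttl_seconds):
        """Set key only if it is absent or expired; return True on success."""
        now = self._clock()
        entry = self._data.get(key)
        if entry is not None and entry[1] > now:
            return False      # someone else still holds the key
        self._data[key] = (value, now + ttl_seconds)
        return True


def try_lock(store, name, owner, ttl_seconds=10):
    """Try to take a lock; the expiry frees it if the owner never releases it."""
    return store.set_nx_ex(f"lock:{name}", owner, ttl_seconds)
```

The expiry is what makes this usable as a basic lock: a crashed holder cannot block others for longer than the TTL.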
9 Redis data types
Sets: collections of string values that do not contain any duplicates. Sets do not have an ordering.
    SADD key a
    SADD key b
    SADD key a
    SMEMBERS key
    [ a, b ]
10 Redis data types
Hashes: a data structure that can store string keys and string values.
    HSET key foo bar
    HSET key bar bang
    HGETALL key
    { foo: bar, bar: bang }
Hash fields can also be incremented:
    HINCRBY key baz 1
    HINCRBY key baz 3
    HGET key baz
    4
11 Redis data types
Sorted Sets: like sets, but each element also has a numerical score associated with it. Sorted sets are ordered by that score.
    ZADD scores 100 alice
    ZADD scores 80 bob
    ZADD scores 110 carol
    ZRANGEBYSCORE scores -inf +inf WITHSCORES
    [ [bob, 80], [alice, 100], [carol, 110] ]
    ZREVRANGEBYSCORE scores +inf -inf WITHSCORES
    [ [carol, 110], [alice, 100], [bob, 80] ]
12 Coordinating Customer Journeys with Redis
13 Canvas
Allows customers to create multi-step, multi-message, multi-day customer journeys
14 Canvas
- Canvas is distributed and event driven
- When messages are sent, we fire a "received campaign" event
- Processes listen for the "received campaign" event and determine whether it should schedule a new message
- If a new message should be scheduled, we enqueue a new job to send the message
15 Using Redis as a Job Queue
- Jobs are added to a Redis sorted set with the Unix timestamp as the score and the job data as the value
- One new job is added per message
- Worker processes on servers poll the scheduled set with ZRANGEBYSCORE -INF <now> LIMIT 0 1, then one worker process ZREMs the job
- ZRANGEBYSCORE -INF <now> LIMIT 0 1 has O(1) runtime due to Redis' implementation of sorted sets (this holds for reading from the head of the set; the documented general bound is O(log(N)+M))
- ZREM has O(log N) runtime
- For Canvas, we enqueue one job per branch. When the job runs, the process determines whether the branch path is valid and grabs a lock to prevent other branches from processing
- The lock takes the form of a SET NX EX operation
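A minimal sketch of the polling pattern described on this slide, using a hypothetical in-memory stand-in for the sorted set (`TinySortedSet`, `first_due` and `claim_due_job` are illustrative names, not Redis or Braze APIs). It models semantics only, not Redis' performance: every worker can see the same due job, but only the worker whose removal succeeds gets to run it.

```python
import bisect


class TinySortedSet:
    """In-memory stand-in for a Redis sorted set of scheduled jobs."""

    def __init__(self):
        self._scores = {}   # member -> score
        self._order = []    # sorted list of (score, member)

    def zadd(self, score, member):
        """Add or re-score a member (like ZADD key score member)."""
        if member in self._scores:
            self._order.remove((self._scores[member], member))
        self._scores[member] = score
        bisect.insort(self._order, (score, member))

    def first_due(self, now):
        """Like ZRANGEBYSCORE key -inf now LIMIT 0 1."""
        if self._order and self._order[0][0] <= now:
            return self._order[0][1]
        return None

    def zrem(self, member):
        """Return 1 if the member was removed, 0 if it was already gone."""
        score = self._scores.pop(member, None)
        if score is None:
            return 0
        self._order.remove((score, member))
        return 1


def claim_due_job(schedule, now):
    """Return a job only if this caller won the removal race."""
    member = schedule.first_due(now)
    if member is not None and schedule.zrem(member) == 1:
        return member
    return None
```

When many workers poll at once, all but one of the removals return 0, which is the wasted work behind the thundering herd described later in the talk.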
16 Canvas
- This architecture worked great in staging, in beta, and for the first few months of the general release, and all was good
- Processing runtime depends on the number of branches a canvas has and the number of users entering the canvas
- In January 2017, one customer created a canvas with 11 branches targeting more than 10 million users to run at 10am the next day
- The Canvas architecture design meant we had to process 110 million jobs right at 10am
17 What happened?
18 Thundering Herd: Enqueuing Jobs
- This particular canvas created 110 million jobs to all run at 10am the next morning at the same timestamp
- These jobs are stored in a sorted set, where workers are polling to move jobs from the sorted set to queues
- ZRANGEBYSCORE -INF <now> LIMIT 0 1 has O(1) runtime due to Redis' implementation of sorted sets
- ZREM has O(log N) runtime
- Every worker server's ZRANGEBYSCORE would return something, but only one process would successfully ZREM the job
- Excessive ZREM operations slowed down Redis
- It took more than 40 minutes just to enqueue the jobs, meaning that at 10:35am we hadn't finished enqueuing the 10am jobs yet. This was now a customer-facing incident.
19 One user per job inefficiencies
- Each job was one {user, branch} pair
- Determining if the user should go down that path involves querying database state and making Redis locks
- 110 million roundtrips to each database to determine if processing should continue
- It took more than 90 minutes to process the next steps
20 What did we do?
21 Fixing Canvas architectural issues
- Initial code design was inefficient: one job per {user, branch} pair. Each job needs access to database state, so we made a lot of extra database calls.
- Because messages tend to go to multiple users around the same time, we figured we could buffer them and have a single job process multiple users at once.
22 Use Redis sets as a buffer
23 Fixing one user per job inefficiencies
- When a "received campaign" event is fired, instead of enqueueing a new job to send a message, create a new set with key buffer:step_id:timestamp and add the user to this set. This lets users buffer up for the same timestamp.
- Periodically flush this set in batches of 100 users:
    - When doing an SADD, also do a SET NX EX to a key to determine whether we should enqueue a job, to run in 3 seconds, which will flush the set
    - The job does an SPOP 100 to get 100 elements, and re-enqueues further jobs to continue flushing the set if it is non-empty
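The buffering steps above can be sketched as follows. This is an in-memory illustration under stated assumptions, not Braze's code: `SetBuffer`, `add` and `flush` are hypothetical names, a Python set stands in for the Redis set, a dictionary of expiry times stands in for the SET NX EX flag, and the flag's TTL is assumed to equal the 3 second flush delay (the slide does not give the TTL).

```python
import time


class SetBuffer:
    """Buffers users per {step, timestamp} and flushes them in batches."""

    def __init__(self, batch_size=100, flush_delay=3, clock=time.monotonic):
        self.batch_size = batch_size
        self.flush_delay = flush_delay
        self._clock = clock
        self._sets = {}          # buffer key -> set of user ids (SADD target)
        self._flush_flags = {}   # buffer key -> flag expiry (SET NX EX stand-in)

    @staticmethod
    def _key(step_id, timestamp):
        return f"buffer:{step_id}:{timestamp}"

    def add(self, step_id, timestamp, user_id):
        """Buffer a user; return True if the caller should enqueue a flush job."""
        key = self._key(step_id, timestamp)
        self._sets.setdefault(key, set()).add(user_id)
        now = self._clock()
        expires = self._flush_flags.get(key)
        if expires is not None and expires > now:
            return False         # a flush job is already scheduled
        self._flush_flags[key] = now + self.flush_delay
        return True

    def flush(self, step_id, timestamp):
        """Pop up to batch_size users (like SPOP key 100).

        Returns (batch, more_left); more_left tells the job to re-enqueue itself.
        """
        members = self._sets.get(self._key(step_id, timestamp), set())
        count = min(self.batch_size, len(members))
        batch = [members.pop() for _ in range(count)]
        return batch, bool(members)
```

With this shape, 250 users buffered for the same step and timestamp produce one scheduled flush and three batch jobs rather than 250 individual jobs.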
24 Fixing the thundering herd
- Added random microsecond jitter to all jobs in the sorted set to split up one second into a million pieces
- Existing code used ZRANGEBYSCORE -INF <now> LIMIT 0 1 to consume from the left side of the sorted set
- Consume from the right side with ZREVRANGEBYSCORE
- Consume from the middle
    - Keep track of how far backlogged we are in the set
    - Randomly add jitter or whole seconds to move along the set to start consuming the middle
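As an illustration of the jitter idea, the sketch below spreads jobs scheduled for the same second across a million distinct scores. It assumes scores are Unix timestamps in seconds stored as floats; `jittered_score` is a hypothetical helper, and the talk does not say how the jitter was encoded.

```python
import random

MICROSECONDS_PER_SECOND = 1_000_000


def jittered_score(run_at_seconds, rng=random):
    """Return the run time plus a random number of microseconds below one second."""
    jitter = rng.randrange(MICROSECONDS_PER_SECOND)
    return run_at_seconds + jitter / MICROSECONDS_PER_SECOND
```

Every jittered score still falls inside the original second, so jobs are not delayed by more than a second, but workers scanning by score no longer all land on one identical timestamp.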
25 Results of architectural changes
- Saved more than 50 gigabytes of RAM for the original canvas
- Instead of 110 million jobs, we enqueued only about 1.4 million jobs
- Instead of 40 minutes to enqueue from the sorted set, all jobs enqueued in a few seconds
- Next steps of the canvas processed in about 14 minutes, down from 90 minutes
26 We adapted buffering in other places, such as our REST API
28 REST API Buffering
- Braze has REST APIs to ingest user attribute data, event data and purchases
- Application servers query user state when processing, so it is more efficient to make batch roundtrips to databases
- We encourage customers to batch data, but some integrations make 1 API call per data point
Less efficient, 2 round trips to query state:
    POST /users/track
    { "attributes": [{ "user_id": 123, "first_name": "Alice" }] }
    POST /users/track
    { "attributes": [{ "user_id": 456, "first_name": "Bob" }] }
More efficient, 1 round trip to query state:
    POST /users/track
    { "attributes": [ { "user_id": 123, "first_name": "Alice" }, { "user_id": 456, "first_name": "Bob" } ] }
- We use the same pattern: SADD data to a Redis set and flush it every second
- This lets us buffer multiple API calls and process them together
29 Improving Writes for Time Series Analytics
30 We collect a lot of time series analytics
31 Time series analytics are stored in MongoDB
- Non-hashed MongoDB sharding divides data into ranges and puts them on different nodes
32 Time series data is easy to pre-aggregate
    { app_id: , date: , name: website_visits, 6: 120, 7: 541, 8: 1200, 9: 800 }
33 Shard on {app_id: 1, name: 1, date: 1}
    { app_id: , date: , name: website_visits, 6: 120, 7: 541, 8: 1200, 9: 800 }
34 {app_id: 1, name: 1, date: 1} One document per app, per event name, per day
35 {app_id: 1, name: 1, date: 1} What happens when more events come in at once than one shard can handle?
37 Treat Redis hashes as if they were MongoDB sub-documents
38 MongoDB and Redis
MongoDB: { app_id: , date: , name: website_visits, 6: 120, 7: 541, 8: 1200, 9: 800 }
Redis: use a hash based on the shard key, where keys are hours and values are the amount to increment by
39
    HINCRBY website_visits 8 1
    SADD "buffered" website_visits
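The two commands on this slide pair a counter increment with a marker that the key needs flushing. A minimal in-memory sketch of that pairing follows; `HashBuffer` and `record` are hypothetical names, and nested dictionaries plus a Python set stand in for the Redis hash and the "buffered" set.

```python
from collections import defaultdict


class HashBuffer:
    """In-memory stand-in for one Redis server holding hashes plus a 'buffered' set."""

    def __init__(self):
        self.hashes = defaultdict(lambda: defaultdict(int))   # key -> hour -> count
        self.buffered = set()                                 # keys awaiting a flush

    def record(self, key, hour, amount=1):
        """Equivalent of HINCRBY key hour amount followed by SADD buffered key."""
        field = str(hour)   # Redis hash fields are strings
        self.hashes[key][field] += amount
        self.buffered.add(key)
        return self.hashes[key][field]
```

Because increments for the same hour collapse into one hash field, a later flush can send a single increment per field to the database no matter how many events arrived.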
40 Periodically flush from Redis to MongoDB just like we do with Canvas sets
41 Flush buffer from Redis to MongoDB
    keys = SMEMBERS("buffered")
    # Commands inside MULTI run atomically; results come back in order,
    # so the first keys.length results are the HGETALL hashes.
    increment_hashes = REDIS MULTI
      keys.each { |key| HGETALL(key) }
      SREM("buffered", *keys)
      keys.each { |key| DEL(key) }
    END MULTI
    # Apply each buffered hash to its MongoDB document as one $inc.
    keys.each_with_index do |key, i|
      app_id, name, date = deserialize(key)
      db.my_timeseries.find(
        {app_id: app_id, name: name, date: date}
      ).update_one($inc: increment_hashes[i])
    end
* This example algorithm is vulnerable to data loss; do not use it directly
42 We do this with 12 Redis servers to shard out writes to a single MongoDB document
- Can buffer the same hash key to each Redis and flush independently
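To see why independent flushes are safe here, the sketch below spreads increments for one key across several buffers and drains each one separately. It is an in-memory illustration: `record`, `flush` and `simulate` are hypothetical names, plain dictionaries stand in for the Redis servers and the MongoDB collection, and choosing a shard at random is an assumption, since the talk does not say how writes are routed.

```python
import random
from collections import Counter


def record(shards, key, hour, rng=random):
    """Send one increment to a randomly chosen buffer shard."""
    shard = rng.choice(shards)
    shard.setdefault(key, Counter())[str(hour)] += 1


def flush(shard, documents):
    """Drain one shard into the document store, adding counts like $inc."""
    for key in list(shard):
        increments = shard.pop(key)
        documents.setdefault(key, Counter()).update(increments)


def simulate(n_events, n_shards=12, seed=0):
    """Buffer n_events increments across n_shards, then flush each shard."""
    rng = random.Random(seed)
    shards = [dict() for _ in range(n_shards)]
    documents = {}
    for _ in range(n_events):
        record(shards, "website_visits", 8, rng)
    for shard in shards:
        flush(shard, documents)
    return documents, shards
```

Increments are additive, so the order in which shards flush does not change the final count in the document.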
43 Scale
- We're doing over 1 million ops per second to Redis
- That's 1 million writes to Mongo deferred per second
- Mongo flush rate is approximately 7k writes per second
- Redis is handling 142x more writes per second than Mongo for analytics
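The 142x figure follows directly from the two rates quoted on this slide, taking "over 1 million" as exactly 1,000,000 and "7k" as exactly 7,000 for the sake of the arithmetic:

```python
redis_ops_per_second = 1_000_000   # "over 1 million ops per second"
mongo_writes_per_second = 7_000    # "approximately 7k writes per second"

ratio = redis_ops_per_second / mongo_writes_per_second
print(f"{ratio:.1f}x")             # prints "142.9x"
```

Rounded down, that is the 142x quoted in the talk.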
44 Summary
- When processing a flurry of events, holding and batching can be an efficient way to improve throughput
- Redis' multiple data types can be used for buffering
- Braze uses sets to buffer streams of data to process in bulk
    - Add with SADD, remove with SPOP
    - Reduces database roundtrips and storage costs
- Braze uses hashes to buffer time series analytics using HINCRBY
45 Thank you! We are hiring! braze.com/careers
More informationWhat is Apache Kafka?
What is Apache Kafka? How it s similar to the databases you know and love, and how it s not. Kenny Gorman Founder and CEO www.eventador.io www.kennygorman.com @kennygorman I am a database nerd I have done
More informationScaling Instagram. AirBnB Tech Talk 2012 Mike Krieger Instagram
Scaling Instagram AirBnB Tech Talk 2012 Mike Krieger Instagram me - Co-founder, Instagram - Previously: UX & Front-end @ Meebo - Stanford HCI BS/MS - @mikeyk on everything communicating and sharing
More informationGoals. Facebook s Scaling Problem. Scaling Strategy. Facebook Three Layer Architecture. Workload. Memcache as a Service.
Goals Memcache as a Service Tom Anderson Rapid application development - Speed of adding new features is paramount Scale Billions of users Every user on FB all the time Performance Low latency for every
More informationMarathon Documentation
Marathon Documentation Release 3.0.0 Top Free Games Feb 07, 2018 Contents 1 Overview 3 1.1 Features.................................................. 3 1.2 Architecture...............................................
More informationHadoop 2.x Core: YARN, Tez, and Spark. Hortonworks Inc All Rights Reserved
Hadoop 2.x Core: YARN, Tez, and Spark YARN Hadoop Machine Types top-of-rack switches core switch client machines have client-side software used to access a cluster to process data master nodes run Hadoop
More informationINFO-H-415 Advanced Databases Key-value stores and Redis. Fatemeh Shafiee Raisa Uku
INFO-H-415 Advanced Databases Key-value stores and Redis Fatemeh Shafiee 000454718 Raisa Uku 000456485 December 2017 Contents 1 Introduction 5 2 NoSQL Databases 5 2.1 Introduction to NoSQL Databases...............................
More informationExtreme Computing. NoSQL.
Extreme Computing NoSQL PREVIOUSLY: BATCH Query most/all data Results Eventually NOW: ON DEMAND Single Data Points Latency Matters One problem, three ideas We want to keep track of mutable state in a scalable
More informationI Want To Go Faster! A Beginner s Guide to Indexing
I Want To Go Faster! A Beginner s Guide to Indexing Bert Wagner Slides available here! @bertwagner bertwagner.com youtube.com/c/bertwagner bert@bertwagner.com Why Indexes? Biggest bang for the buck Can
More informationCS222P Fall 2017, Final Exam
STUDENT NAME: STUDENT ID: CS222P Fall 2017, Final Exam Principles of Data Management Department of Computer Science, UC Irvine Prof. Chen Li (Max. Points: 100 + 15) Instructions: This exam has seven (7)
More informationBuilding High Performance Apps using NoSQL. Swami Sivasubramanian General Manager, AWS NoSQL
Building High Performance Apps using NoSQL Swami Sivasubramanian General Manager, AWS NoSQL Building high performance apps There is a lot to building high performance apps Scalability Performance at high
More informationRedis for Real-Time Personalization A PSEUDO CODE APPROACH TO IMPLEMENTING PERSONALIZATION WITH REDIS
Redis for Real-Time Personalization A PSEUDO CODE APPROACH TO IMPLEMENTING PERSONALIZATION WITH REDIS Contents Redis and the Enterprise 3 The Personalization App Implemented in Redis 3 Capabilities That
More informationHow To Rock with MyRocks. Vadim Tkachenko CTO, Percona Webinar, Jan
How To Rock with MyRocks Vadim Tkachenko CTO, Percona Webinar, Jan-16 2019 Agenda MyRocks intro and internals MyRocks limitations Benchmarks: When to choose MyRocks over InnoDB Tuning for the best results
More informationBipul Sinha, Amit Ganesh, Lilian Hobbs, Oracle Corp. Dingbo Zhou, Basavaraj Hubli, Manohar Malayanur, Fannie Mae
ONE MILLION FINANCIAL TRANSACTIONS PER HOUR USING ORACLE DATABASE 10G AND XA Bipul Sinha, Amit Ganesh, Lilian Hobbs, Oracle Corp. Dingbo Zhou, Basavaraj Hubli, Manohar Malayanur, Fannie Mae INTRODUCTION
More informationPrototyping Data Intensive Apps: TrendingTopics.org
Prototyping Data Intensive Apps: TrendingTopics.org Pete Skomoroch Research Scientist at LinkedIn Consultant at Data Wrangling @peteskomoroch 09/29/09 1 Talk Outline TrendingTopics Overview Wikipedia Page
More informationDr. Chuck Cartledge. 19 Nov. 2015
CS-695 NoSQL Database Redis (part 1 of 2) Dr. Chuck Cartledge 19 Nov. 2015 1/21 Table of contents I 1 Miscellanea 2 DB comparisons 3 Assgn. #7 4 Historical origins 5 Data model 6 CRUDy stuff 7 Other operations
More informationMySQL Performance Improvements
Taking Advantage of MySQL Performance Improvements Baron Schwartz, Percona Inc. Introduction About Me (Baron Schwartz) Author of High Performance MySQL 2 nd Edition Creator of Maatkit, innotop, and so
More informationEvent Sourcing. Intro & Challenges
Event Sourcing Intro & Challenges Michael Plöd innoq Principal Consultant @bitboss Most current systems only store the current state Classical Architecture IncidentRestController IncidentBusinessService
More informationMongoDB and Mysql: Which one is a better fit for me? Room 204-2:20PM-3:10PM
MongoDB and Mysql: Which one is a better fit for me? Room 204-2:20PM-3:10PM About us Adamo Tonete MongoDB Support Engineer Agustín Gallego MySQL Support Engineer Agenda What are MongoDB and MySQL; NoSQL
More information