EVCache: Lowering Costs for a Low Latency Cache with RocksDB. Scott Mansfield, Vu Nguyen (EVCache)
13 90 seconds
14 What do caches touch? Signing up*; logging in; choosing a profile; picking liked videos; personalization*; loading home page*; scrolling home page*; A/B tests; video image selection; searching*; viewing title details; playing a title*; subtitle / language prefs; rating a title; My List; video history*; UI strings; video production*. (* multiple caches involved)
15 Home Page Request
17 Ephemeral Volatile Cache: key-value store optimized for AWS and tuned for Netflix use cases
18 What is EVCache? Distributed, sharded, replicated key-value store; tunable in-region and global replication; based on Memcached; resilient to failure; topology aware; linearly scalable; seamless deployments
19 Why Optimize for AWS: instances disappear; zones fail; regions become unstable; network is lossy; customer requests bounce between regions; failures happen and we test all the time
20 EVCache at Netflix: hundreds of terabytes of data; trillions of ops / day; tens of billions of items stored; tens of millions of ops / sec; millions of replications / sec; thousands of servers; hundreds of instances per cluster; hundreds of microservice clients; tens of distinct clusters; 3 regions; 4 engineers
21 Architecture [diagram labels: Application, Client Library, Client; Server, Memcached, EVCar; Eureka (Service Discovery)]
22 Architecture [diagram: a Client in each of us-west-2a, us-west-2b, us-west-2c]
23 Reading (get) [diagram: zones us-west-2a, us-west-2b, us-west-2c; a Client reads from a Primary replica, with a Secondary]
24 Writing (set, delete, add, etc.) [diagram: a Client in each of us-west-2a, us-west-2b, us-west-2c]
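The read and write slides above can be sketched as a toy zone-aware client. This is a minimal illustration, not the EVCache client: the class name `ZoneAwareClient`, the per-zone dicts standing in for Memcached replicas, and the "local zone first, then the others" read order are all assumptions made for the example.

```python
class ZoneAwareClient:
    """Toy client: one dict per availability zone stands in for a replica."""

    def __init__(self, local_zone, zones):
        self.local_zone = local_zone
        self.replicas = {zone: {} for zone in zones}

    def set(self, key, value):
        # Mutations go to the replica in every zone.
        for store in self.replicas.values():
            store[key] = value

    def delete(self, key):
        for store in self.replicas.values():
            store.pop(key, None)

    def get(self, key):
        # Read the local-zone replica first (primary), then fall back
        # to replicas in the other zones (secondary).
        order = [self.local_zone]
        order += [z for z in self.replicas if z != self.local_zone]
        for zone in order:
            value = self.replicas[zone].get(key)
            if value is not None:
                return value
        return None


client = ZoneAwareClient("us-west-2a", ["us-west-2a", "us-west-2b", "us-west-2c"])
client.set("profile:42", b"prefs")
client.replicas["us-west-2a"].clear()  # simulate losing the local replica
recovered = client.get("profile:42")   # served by another zone
client.delete("profile:42")
after_delete = client.get("profile:42")
```

Writing to every zone is what lets a read survive the loss of the local replica, at the cost of one copy of the data per zone.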
25 Use Case: Lookaside Cache [diagram labels: Application, Client Library, Client, Ribbon Client; Data Flow; nodes labeled S and C]
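A lookaside cache can be sketched in a few lines. This is a generic illustration of the pattern, assuming a plain dict as the cache and a hypothetical `load_from_service` callable as the backing service; neither name comes from the slides.

```python
def lookaside_get(cache, key, load_from_service):
    """Return the cached value, loading and caching it on a miss."""
    value = cache.get(key)
    if value is None:
        # Miss: ask the backing service, then populate the cache
        # so the next read is served from the cache.
        value = load_from_service(key)
        cache[key] = value
    return value


calls = []

def load_from_service(key):
    # Hypothetical slow service call; records each invocation.
    calls.append(key)
    return f"value-for-{key}"


cache = {}
first = lookaside_get(cache, "title:1", load_from_service)
second = lookaside_get(cache, "title:1", load_from_service)
```

Only the first read reaches the service; the second is answered from the cache.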
26 Use Case: Transient Data Store [diagram: over Time, several Application instances, each with Client Library and Client]
27 Use Case: Primary Store [diagram labels: Online Services: Online Application, Client Library, Client; Offline Services: Offline / Nearline Precomputes for Recommendations; Data Flow]
28 Use Case: Versioned Primary Store [diagram labels: Online Services: Online Application, Archaius (Dynamic Properties), Client Library, Client; Offline Services: Offline Compute; Data Flow; Control System (Valhalla)]
29 Use Case: High Volume && High Availability [diagram labels: Application, Client Library, Optional In-memory, Remote, Ribbon Client; Compute & Publish on schedule; Data Flow; nodes labeled S and C]
30 Pipeline of Personalization [diagram labels: Online Services: Online 1, Online 2; Offline Services: Compute A, Compute B, Compute C, Compute D, Compute E; Data Flow]
31 Additional Features: Kafka (global data replication; consistency metrics); Key Iteration (cache warming; lost instance recovery; backup and restore)
32 Additional Features (Kafka): global data replication; consistency metrics
33 Cross-Region Replication [diagram: Region A and Region B, each with APP, Repl Proxy, Kafka, and Repl Relay. Steps: 1 mutate; 2 send metadata; 3 poll msg; 4 get data for set; 5 https send msg; 6 mutate; 7 read]
34 Additional Features (Key Iteration): cache warming; lost instance recovery; backup (and restore)
35 Cache Warming [diagram labels: Application, Client Library, Client; Control; S3; Cache Warmer (Spark); Data Flow, Control Flow, Metadata Flow]
36 Moneta: next-generation EVCache server
37 Moneta. Moneta: the goddess of memory; Juno Moneta: the protectress of funds for Juno. Evolution of the EVCache server; cost optimization; EVCache on SSD; ongoing lower EVCache cost per stream; takes advantage of global request patterns
38 Old Server: stock Memcached and EVCar (sidecar); all data stored in RAM in Memcached; expensive with global expansion / N+1 architecture [diagram labels: Memcached, EVCar, external]
39 Optimization: global data means many copies; access patterns are heavily region-oriented. In one region, hot data is used often and cold data is almost never touched. Keep hot data in RAM, cold data on SSD; size RAM for the working set, SSD for the overall dataset
40 New Server: adds Rend and Mnemonic; still looks like Memcached; unlocks cost-efficient storage & server-side intelligence
41 Rend: go get github.com/netflix/rend
42 Rend: high-performance Memcached proxy & server; written in Go (powerful concurrency primitives; productive and fast); manages the L1/L2 relationship; tens of thousands of connections
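The L1/L2 relationship (hot data in RAM, the full dataset on SSD) can be sketched as a two-tier cache. This is a minimal sketch, not Rend's implementation: the class `L1L2Cache`, the LRU policy for L1, and promotion into L1 on an L2 hit are assumptions chosen to illustrate the idea.

```python
from collections import OrderedDict


class L1L2Cache:
    def __init__(self, l1_capacity):
        self.l1 = OrderedDict()  # stands in for Memcached (RAM), LRU-evicted
        self.l2 = {}             # stands in for Mnemonic / RocksDB (SSD)
        self.l1_capacity = l1_capacity

    def _put_l1(self, key, value):
        self.l1[key] = value
        self.l1.move_to_end(key)
        while len(self.l1) > self.l1_capacity:
            self.l1.popitem(last=False)  # evict the least recently used item

    def set(self, key, value):
        # L2 holds the overall dataset; L1 holds only the working set.
        self.l2[key] = value
        self._put_l1(key, value)

    def get(self, key):
        """Return (value, tier) where tier is 'L1', 'L2', or 'miss'."""
        if key in self.l1:
            self.l1.move_to_end(key)
            return self.l1[key], "L1"
        if key in self.l2:
            value = self.l2[key]
            self._put_l1(key, value)  # promote data that just became hot
            return value, "L2"
        return None, "miss"


cache = L1L2Cache(l1_capacity=2)
for name in ("a", "b", "c"):
    cache.set(name, f"v{name}")   # "a" is evicted from L1 but stays in L2
cold_read = cache.get("a")        # served from L2 and promoted
warm_read = cache.get("a")        # now served from L1
```

Sizing `l1_capacity` for the working set keeps most reads in RAM while the larger, cheaper tier absorbs the rest.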
43 Rend: modular set of libraries and an example main(); manages connections, request orchestration, and communication; low-overhead metrics library; multiple orchestrators; parallel locking for data integrity; efficient connection pool [diagram labels: Connection Management, Server Loop, Protocol, Request Orchestration, Backend Handlers, Metrics]
44 Mnemonic
45 Mnemonic: manages data storage on SSD; uses Rend server libraries (handles Memcached protocol); maps Memcached ops to RocksDB ops [diagram, top to bottom: Rend Server Core Lib (Go), Mnemonic Op Handler (Go), Mnemonic Core (C++), RocksDB (C++)]
46 Why RocksDB? Fast at medium to high write load; disk write load higher than read load (because of Memcached); predictable RAM usage [diagram: Record A and Record B written to memtables, which are flushed to SSTs] SST: Static Sorted Table...
47 How we use RocksDB: no Level Compaction (generated too much traffic to SSD; high and unpredictable read latencies); no Block Cache (rely on local Memcached); no compression
48 How we use RocksDB: FIFO Compaction; SSTs ordered by time; oldest SST deleted when full; reads access every SST until record found
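The FIFO behavior described above can be simulated with a toy store. This is a sketch under stated assumptions, not RocksDB code: `FifoStore` is a hypothetical class, each SST is modeled as an immutable dict, and "full" is modeled as a maximum file count rather than a total size.

```python
class FifoStore:
    def __init__(self, records_per_sst, max_ssts):
        self.records_per_sst = records_per_sst
        self.max_ssts = max_ssts
        self.memtable = {}
        self.ssts = []  # ordered by time, oldest first

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.records_per_sst:
            self._flush()

    def _flush(self):
        # A full memtable becomes a new SST at the young end of the list.
        self.ssts.append(dict(self.memtable))
        self.memtable = {}
        while len(self.ssts) > self.max_ssts:
            self.ssts.pop(0)  # FIFO: delete the oldest SST whole

    def get(self, key):
        """Return (value, number_of_ssts_checked)."""
        if key in self.memtable:
            return self.memtable[key], 0
        checked = 0
        for sst in reversed(self.ssts):  # newest first, until found
            checked += 1
            if key in sst:
                return sst[key], checked
        return None, checked


store = FifoStore(records_per_sst=2, max_ssts=2)
for i in range(1, 7):
    store.put(f"k{i}", f"v{i}")   # third flush drops the SST holding k1, k2
dropped = store.get("k1")         # gone, and every SST was checked
older = store.get("k3")           # found in the second SST checked
newest = store.get("k6")          # found in the first SST checked
```

A miss is the expensive case here: every remaining SST is consulted before the store can answer.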
49 How we use RocksDB: Full File Bloom Filters; full filter reduces unnecessary SSD reads; Bloom filters and indices pinned in memory; minimize SSD access per request
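How a per-file Bloom filter avoids SSD reads can be sketched as follows. This is a generic Bloom filter, not RocksDB's filter format: the class `BloomFilter`, the SHA-256 based hashing, and the sizes used are assumptions for illustration.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent": the file read can be skipped.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


bloom = BloomFilter(num_bits=1024, num_hashes=3)
stored = [f"key-{i}" for i in range(50)]
for key in stored:
    bloom.add(key)

# Keys that were added are always reported as possibly present.
no_false_negatives = all(bloom.might_contain(k) for k in stored)
# Absent keys are rejected in memory, apart from a few false positives.
false_positives = sum(bloom.might_contain(f"absent-{i}") for i in range(1000))
```

Keeping the filter in memory means most lookups for keys a file does not hold never touch the SSD; a false positive only costs one wasted read.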
50 How we use RocksDB: records sharded across multiple RocksDB instances per node; reduces number of files checked to decrease latency [diagram: Mnemonic Core routes Key: ABC and Key: XYZ to separate RocksDB instances (R)]
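Sharding keys across several stores on one node can be sketched with a stable hash. This is an illustration only: the class `ShardedStore`, the use of CRC32, and the shard count are assumptions, and the slides do not say how Mnemonic picks a shard.

```python
import zlib


class ShardedStore:
    def __init__(self, num_shards):
        # Each dict stands in for one RocksDB instance on the node.
        self.shards = [{} for _ in range(num_shards)]

    def _shard(self, key):
        # A stable hash sends a given key to the same shard every time.
        index = zlib.crc32(key.encode("utf-8")) % len(self.shards)
        return self.shards[index]

    def put(self, key, value):
        self._shard(key)[key] = value

    def get(self, key):
        # Only one shard, and so only that shard's files, is consulted.
        return self._shard(key).get(key)


store = ShardedStore(num_shards=4)
for i in range(1000):
    store.put(f"key-{i}", i)
sizes = [len(shard) for shard in store.shards]
```

Because a read touches a single shard, the number of files it may have to check shrinks roughly in proportion to the shard count.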
51 Region-Locality Optimizations: replication and batch updates go only to RocksDB*; keeps region-local and hot data in memory; separate network port for off-line requests; Memcached data replaced
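One way to read the slide above is that batch and replication writes land in the SSD tier and only refresh RAM entries that are already hot. The sketch below illustrates that reading; the class `TwoPathServer` and its method names are hypothetical, and the exact semantics of the real server may differ.

```python
class TwoPathServer:
    def __init__(self):
        self.l1 = {}  # stands in for Memcached (RAM)
        self.l2 = {}  # stands in for RocksDB (SSD)

    def standard_set(self, key, value):
        # Online path: the data is in active use, so keep it in both tiers.
        self.l2[key] = value
        self.l1[key] = value

    def batch_set(self, key, value):
        # Batch / replication path: always write to L2, but only
        # replace the L1 copy if the key is already hot there.
        self.l2[key] = value
        if key in self.l1:
            self.l1[key] = value

    def get(self, key):
        if key in self.l1:
            return self.l1[key]
        return self.l2.get(key)


server = TwoPathServer()
server.standard_set("hot", 1)
server.batch_set("hot", 2)    # refreshes the existing L1 entry
server.batch_set("cold", 3)   # stored on SSD only, RAM is not polluted
```

Under this reading, a large precompute load cannot push the region's hot working set out of RAM, yet hot entries never serve stale values.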
52 FIFO Limitations: FIFO compaction not suitable for all use cases; very frequently updated records may push out valid records; expired records still exist; requires larger Bloom filters [diagram: three SSTs along a time axis; two hold Records A1, A2, B1, A3, B2, B3 and one holds Records C, D, E, F, G, H]
53 AWS Instance Type: i2.xlarge; 4 vCPU; 30 GB RAM; 800 GB SSD; 32K IOPS (4 KB pages); ~130 MB/sec
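The IOPS and throughput figures on the instance slide are consistent with each other, as a quick calculation shows. The only assumption is that "4KB" means 4096 bytes per page.

```python
iops = 32_000            # from the slide: 32K IOPS
page_bytes = 4 * 1024    # assumed: a 4 KB page is 4096 bytes

bytes_per_sec = iops * page_bytes
throughput_mb = bytes_per_sec / 1_000_000     # decimal megabytes
throughput_mib = bytes_per_sec / (1024 ** 2)  # binary mebibytes

print(f"{throughput_mb:.1f} MB/s, {throughput_mib:.1f} MiB/s")
```

At full IOPS with 4 KB pages the SSD moves about 131 MB/s, which matches the slide's ~130 MB/sec.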
54 Moneta Perf Benchmark (High-Vol Online Requests)
55 Moneta Perf Benchmark (cont)
56 Moneta Perf Benchmark (cont)
57 Moneta Perf Benchmark (cont)
58 Moneta Performance in Production (Batch Systems): Request get 95%, 99% = 729 μs, 938 μs; L1 get 95%, 99% = 153 μs, 191 μs; L2 get 95%, 99% = 1005 μs, 1713 μs; ~20 KB records; ~99% overall hit rate; ~90% L1 hit rate
59 Moneta Performance in Prod (High Vol-Online Req): Request get 95%, 99% = 174 μs, 588 μs; L1 get 95%, 99% = 145 μs, 190 μs; L2 get 95%, 99% = 770 μs, 1330 μs; ~1 KB records; ~98% overall hit rate; ~97% L1 hit rate
60 Moneta Performance in Prod (High Vol-Online Req). Latencies: peak (trough)
Percentile   Get                  Set
50th         102 μs (101 μs)      97.2 μs (87.2 μs)
75th         120 μs (115 μs)      107 μs (101 μs)
90th         146 μs (137 μs)      125 μs (115 μs)
95th         174 μs (166 μs)      138 μs (126 μs)
99th         588 μs (427 μs)      177 μs (152 μs)
99.5th       733 μs (568 μs)      208 μs (169 μs)
99.9th       1.39 ms (979 μs)     1.19 ms (318 μs)
61 70% Reduction in cost*
62 Challenges/Concerns: less visibility (overall data size is unclear because of duplicates and expired records; restrict unique data set to ½ of max for precompute batch data); lower max throughput than Memcached-based server (higher CPU usage; planning must be better so we can handle unusually high request spikes)
63 Current/Future Work: investigate Blob Storage feature (less data read/written from SSD during Level Compaction; lower latency, higher throughput; better view of total data size); purging expired SSTs earlier (useful in short-TTL use cases; may purge 60%+ of SSTs earlier than FIFO Compaction; reduce worst-case latency; better visibility of overall data size); inexpensive deduping for batch data
64 Open Source
65 Thank You techblog.netflix.com
67 Failure Resilience in Client: operation fast failure; tunable retries; operation queues; tunable latch for mutations; async replication through Kafka
68 Consistency Metrics [diagram: Region A with APP, Client, Kafka, Consistency Checker, Atlas (Metrics Backend), and Dashboards. Steps: 1 mutate; 2 send metadata; 3 poll msg; 4 pull data; 5 report]
69 Lost Instance Recovery [diagram labels: Application, Client Library, Client; Zone A, Zone B; Control; S3; Cache Warmer (Spark); Data Flow, Partial Data Flow, Control Flow, Metadata Flow]
70 Backup (and Restore) [diagram labels: Application, Client Library, Client; Control; S3; Cache Warmer (Spark); Data Flow, Control Flow]
71 Moneta in Production: serving all of our personalization data. Rend runs with two ports: one for standard users (read heavy or active data management), another for async and batch users (replication and precompute). Maintains working set in RAM; optimized for precomputes; smartly replaces data in L1 [diagram labels: Std, Batch; Memcached (RAM), Mnemonic (SSD), EVCar; external, internal]
72 Rend batching backend
Understanding Primary Storage Optimization Options Jered Floyd Permabit Technology Corp. Primary Storage Optimization Technologies that let you store more data on the same storage Thin provisioning Copy-on-write
More informationPocket: Elastic Ephemeral Storage for Serverless Analytics
Pocket: Elastic Ephemeral Storage for Serverless Analytics Ana Klimovic*, Yawen Wang*, Patrick Stuedi +, Animesh Trivedi +, Jonas Pfefferle +, Christos Kozyrakis* *Stanford University, + IBM Research 1
More informationCloud Analytics and Business Intelligence on AWS
Cloud Analytics and Business Intelligence on AWS Enterprise Applications Virtual Desktops Sharing & Collaboration Platform Services Analytics Hadoop Real-time Streaming Data Machine Learning Data Warehouse
More informationScaling App Engine Applications. Justin Haugh, Guido van Rossum May 10, 2011
Scaling App Engine Applications Justin Haugh, Guido van Rossum May 10, 2011 First things first Justin Haugh Software Engineer Systems Infrastructure jhaugh@google.com Guido Van Rossum Software Engineer
More informationStorage Performance Validation for Panzura
Storage Performance Validation for Panzura Ensuring seamless cloud storage performance for Panzura s Quicksilver Product Suite WHITEPAPER Table of Contents Background on Panzura...3 Storage Performance
More informationCSE 124: Networked Services Lecture-16
Fall 2010 CSE 124: Networked Services Lecture-16 Instructor: B. S. Manoj, Ph.D http://cseweb.ucsd.edu/classes/fa10/cse124 11/23/2010 CSE 124 Networked Services Fall 2010 1 Updates PlanetLab experiments
More informationIntroducing Tegile. Company Overview. Product Overview. Solutions & Use Cases. Partnering with Tegile
Tegile Systems 1 Introducing Tegile Company Overview Product Overview Solutions & Use Cases Partnering with Tegile 2 Company Overview Company Overview Te gile - [tey-jile] Tegile = technology + agile Founded
More informationApache BookKeeper. A High Performance and Low Latency Storage Service
Apache BookKeeper A High Performance and Low Latency Storage Service Hello! I am Sijie Guo - PMC Chair of Apache BookKeeper Co-creator of Apache DistributedLog Twitter Messaging/Pub-Sub Team Yahoo! R&D
More informationDeduplication Storage System
Deduplication Storage System Kai Li Charles Fitzmorris Professor, Princeton University & Chief Scientist and Co-Founder, Data Domain, Inc. 03/11/09 The World Is Becoming Data-Centric CERN Tier 0 Business
More informationNimble Storage Adaptive Flash
Nimble Storage Adaptive Flash Read more Nimble solutions Contact Us 800-544-8877 solutions@microage.com MicroAge.com TECHNOLOGY OVERVIEW Nimble Storage Adaptive Flash Nimble Storage s Adaptive Flash platform
More informationGFS Overview. Design goals/priorities Design for big-data workloads Huge files, mostly appends, concurrency, huge bandwidth Design for failures
GFS Overview Design goals/priorities Design for big-data workloads Huge files, mostly appends, concurrency, huge bandwidth Design for failures Interface: non-posix New op: record appends (atomicity matters,
More informationRemoving the I/O Bottleneck in Enterprise Storage
Removing the I/O Bottleneck in Enterprise Storage WALTER AMSLER, SENIOR DIRECTOR HITACHI DATA SYSTEMS AUGUST 2013 Enterprise Storage Requirements and Characteristics Reengineering for Flash removing I/O
More information