DeclStore: Layering is for the Faint of Heart
1 DeclStore: Layering is for the Faint of Heart [HotStorage, July 2017]
Noah Watkins, Michael Sevilla, Ivo Jimenez, Kathryn Dahlgren, Peter Alvaro, Shel Finkelstein, Carlos Maltzahn
2 Layers on layers on layers
[Diagram labels: Application, Special App, Middleware, POSIX, Fancy Storage, Unified Storage; Storage Interfaces: Object, Block, File, Logging, Key-value, Special App]
3 Application-specific storage systems
[Diagram: same labels as slide 2]
4 Unified storage systems
- Examples: OnTap, Swift, Ceph, GPFS, EMC
[Diagram: same labels as slide 2]
5 Eliminating layers: big challenges, big rewards
Big question: If unified storage becomes the new normal, what new challenges will need to be addressed in system design?
Talk outline
- Motivating unified storage
- Storage interface design space
- Declarative storage interfaces
[Diagram labels: Storage Interfaces: Object, Block, File, Logging, Key-value, Special App; Unified Storage]
6 Motivation: systems trend towards generality
Technology trend | Example
Specialized unstructured data management systems | JSON data type in relational database management systems
Specialized real-time and embedded systems | Embedded Linux and RT_PREEMPT
Specialized storage systems | Embedded Ceph and unified system
7 Motivation: breaking the narrow waist model
- User-defined classes
- Examples: locking, logging, concurrency control, metadata management, remote compute
- Developers willing to break layers and use non-standard APIs
8 Why in practice do people break down layers?
- Code path and API inefficiencies
- Duplication of services, cost, over-provisioning, risks
[Diagram labels: Application, Middleware, POSIX, Special App, Fancy Storage, Unified Storage; Storage Interfaces: Object, Block, File, Logging, Key-value, Special App]
9 Unified storage has its own set of challenges
- Portability, QoS, safety (plus code path inefficiencies)
- Code path and API inefficiencies
- Duplication of services, cost, over-provisioning, risks
[Diagram: same labels as slide 8]
10 Exploring design challenges in unified storage
- Code path and API inefficiencies
- Duplication of services, cost, over-provisioning, risks
[Diagram: same labels as slide 8]
11 Malacology: programmable interface research platform
- Target storage interface (goal)
- Composed, generic service glue layer
- Internal system services (building blocks)
References
- Malacology: A Programmable Storage System, M. Sevilla, N. Watkins, I. Jimenez, P. Alvaro, S. Finkelstein, J. LeFevre, C. Maltzahn, EuroSys '17
- Mantle: A Programmable Metadata Load Balancer for the Ceph File System, M. Sevilla, N. Watkins, C. Maltzahn, I. Nassi, S. Brandt, et al., SC '15
- CORFU: A Shared Log Design for Flash Clusters, M. Balakrishnan, D. Malkhi, V. Prabhakaran, T. Wobber, M. Wei, et al., NSDI '12
12 Keeping pace with evolving storage systems
- Hard-wired prototype implementations
- How to take advantage of...
  - Software upgrades
  - New hardware
  - Additional services
  - Performance features
13 Example: building the CORFU shared-log interface
CORFU [1] is a distributed shared log
- High-performance design
  - Append and random reads
  - I/O parallelism (striping)
  - Soft-state network counter
- Challenges
  - Find a good mapping
  - Performance optimizations
  - Minimal maintenance
[1] CORFU: A Shared Log Design for Flash Clusters, M. Balakrishnan, D. Malkhi, V. Prabhakaran, T. Wobber, M. Wei, et al., NSDI '12
14 Example: building CORFU distributed shared-log
Implementation options
- Partitioning (log partitioning?): 1-1, striping, etc.
15 Example: building CORFU distributed shared-log
Implementation options
- Partitioning (log partitioning?): 1-1, striping, etc.
- Metadata
16 Example: building CORFU distributed shared-log
Implementation options
- Partitioning (log partitioning?): 1-1, striping, etc.
- Metadata
- I/O interfaces: ByteStream, K/V; FS (xfs, btrfs); BlueStore; K/V (Rocks, LMDB)
17 Example: building CORFU distributed shared-log
Implementation options
- Partitioning (log partitioning?): 1-1, striping, etc. (2x)
- Metadata
- I/O interfaces: ByteStream, K/V (2x); FS (xfs, btrfs); BlueStore; K/V (Rocks, LMDB)
Constructed 4 versions: 2 partitioning * 2 I/O interfaces
- Partitioning (1-1, striping)
- I/O interface (bytestream, K/V)
18 Example: building CORFU distributed shared-log
Implementation options
- Partitioning (log partitioning?): 1-1, striping, etc. (2x)
- Metadata
- I/O interfaces: ByteStream, K/V (2x); FS (xfs, btrfs); BlueStore; K/V (Rocks, LMDB)
Constructed 4 versions: 2 partitioning * 2 I/O interfaces
- Partitioning (1-1, striping)
- I/O interface (bytestream, K/V)
Hard-wired implementations
- A few 1000s of LOC
- Manually optimize and switch between approaches
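The two partitioning options named on these slides (1-1 and striping) can be illustrated with a small sketch. The object naming scheme (`log.<n>`), the stripe width, and the function names are hypothetical and do not come from the talk; the sketch only shows how a log position might map to a backing object under each strategy.

```ruby
# Hypothetical sketch: mapping a log position to a backing object.
# The "log.<n>" naming scheme and STRIPE_WIDTH are illustrative assumptions.

STRIPE_WIDTH = 4

# 1-1 partitioning: every log position gets its own object.
def one_to_one(pos)
  { object: "log.#{pos}", slot: 0 }
end

# Striping: positions are spread round-robin over a fixed set of objects,
# and each object stores many entries, addressed by a slot number.
def striped(pos, width = STRIPE_WIDTH)
  { object: "log.#{pos % width}", slot: pos / width }
end

(0..5).each do |pos|
  puts "pos=#{pos} 1-1=#{one_to_one(pos)} striped=#{striped(pos)}"
end
```

Under these assumptions, 1-1 creates one object per entry, while striping bounds the number of objects and lets consecutive appends land on different objects.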
19 Example: shared-log performance toss-up in 2014
- Four implementations
- Ceph version circa 2014
- Graph takeaways
  - Similar top performers
  - Clear performance losers
- Our claim
  - Select simpler implementation
  - Added complexity for no benefit
- What is complexity?
  - Lines of code
  - Conceptual
[Figure: Performance Comparison of 4 Designs; axis: Appends / Sec; Ceph 2014]
20 Example: clear design choice in 2016
Takeaway: A reasonable choice in 2014 would be a poor choice in 2016 after a simple upgrade
- Same implementations
- Same hardware / benchmark
- Newer version of Ceph
- Clear performance winner in 2016
[Figure: Performance Comparison of 4 Designs; axis: Appends / Sec; panels labeled Ceph 2014 and Ceph]
21 Problem: navigation of large design space
- interfaces / data models
- heterogeneous hardware / memories
- features & tunables (>1000 in Ceph)
- time / versions
[Diagram marker: you are here]
22 DeclStore: express storage interfaces declaratively
- Freedom from storage system and domain expertise
- Abstractions over storage services and interfaces
- The point is we need a high-level language
  - Many possible mappings
  - Many degrees of freedom
- Generate storage interface implementations
- Exploit relational database optimization research
23 DeclStore: building on a strong foundation
- Based on Bloom programming language
  - High-level declarative language (Alvaro, CIDR '11)
  - Many degrees of freedom on reordering
- Abstract relational data model
  - Programs are queries
  - Subject to optimization
- Evidence this is possible...
  - LogicBlox (Aref, SIGMOD 2015)
  - Dedalus (Alvaro et al., Datalog 2010)
  - Declarative Networking (Loo, SIGMOD 2006)
24 Example: a declarative specification for CORFU
- System state modeled as a set of relations
  - Persistent state
  - Input/Output
- Interfaces are queries over a request stream
  - Operate on persistent state
  - Collection operations
- Full specification
  - Just another slide's worth
  - Compare 1Ks LOC C++

    # persistent relations
    table :epoch, [:epoch]
    table :log, [:pos] => [:state, :data]

    # interfaces are also like tables
    interface input, :op, [:type, :pos, :epoch] => [:data]
    interface output, :ret, [:type, :pos, :epoch] => [:retval]

    bloom :write do
      temp :valid_write <= write_op.notin(found_op)
      log <+ valid_write { |o| [o.pos, 'valid', o.data] }
      ret <= valid_write { |o| [o.type, o.pos, o.epoch, 'ok'] }
      ret <= write_op.notin(valid_write) { |o| [o.type, o.pos, o.epoch, 'read-only'] }
    end

Brados: Declarative, Programmable Object Storage, Noah Watkins, Michael Sevilla, Ivo Jimenez, Neha Ojha, Peter Alvaro, Carlos Maltzahn, UCSC-SOE
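To make the behavior of the `:write` rule concrete, here is a plain-Ruby model of it. This is a sketch and not output of Bloom or DeclStore. It assumes that `found_op` (not defined on the slide) means "write operations whose position already exists in the log"; the `Op` struct and the `WriteOnceLog` class are hypothetical names introduced only for illustration.

```ruby
# Hypothetical plain-Ruby model of the :write rule on the slide.
# Assumption: found_op = write ops whose position is already in the log.
Op = Struct.new(:type, :pos, :epoch, :data)

class WriteOnceLog
  def initialize
    @log = {} # pos => [state, data], mirroring `table :log`
  end

  # Processes one batch (one "timestep") of write ops and returns
  # [type, pos, epoch, retval] tuples, mirroring the :ret interface.
  # Two writes to the same new position within one batch are not handled.
  def write(ops)
    valid, rejected = ops.partition { |o| !@log.key?(o.pos) }
    rets = valid.map { |o| [o.type, o.pos, o.epoch, 'ok'] } +
           rejected.map { |o| [o.type, o.pos, o.epoch, 'read-only'] }
    # Deferred insert, like `<+`: the log changes only after the whole
    # batch has been evaluated against the old state.
    valid.each { |o| @log[o.pos] = ['valid', o.data] }
    rets
  end

  def read(pos)
    @log[pos]
  end
end

log = WriteOnceLog.new
p log.write([Op.new('write', 0, 1, 'a')])
p log.write([Op.new('write', 0, 1, 'b')])
```

The second write to position 0 is rejected with `'read-only'`, which is the write-once behavior the rule expresses.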
25 What's the catch?
- This is a storage system, not a database...
- Generality introduces overhead
- Additional complexity and layers
  - We don't want layers!
- The cost of online optimization
26 Taking advantage of time scales: offline optimization
- We are generating storage service implementations
- Automate integration of new features and hardware
- Planned upgrade points vs. per-request optimization
[Diagram: DeclStore interface definition; optimization (Opt.) runs at each system upgrade point, on a timescale of months to years, rather than per request (< 1 ms)]
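The split between upgrade-time optimization and the request path might look like the following sketch. Everything in it is an assumption made for illustration: the candidate names, the stand-in lambdas, and the wall-clock cost measure are hypothetical and are not the DeclStore optimizer.

```ruby
# Hypothetical sketch: implementation selection happens at upgrade points,
# not on the request path. All names here are illustrative assumptions.

# Returns the name of the candidate with the lowest measured cost.
# The block receives (name, implementation) and returns a cost.
def choose_implementation(candidates)
  candidates.keys.min_by { |name| yield(name, candidates[name]) }
end

# Wall-clock cost of running an implementation over a sample workload.
def measure(impl, sample)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  sample.each { |entry| impl.call(entry) }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Stand-ins for generated implementations of the same interface.
candidates = {
  'one_to_one/bytestream' => ->(entry) { entry.bytesize },
  'striped/kv'            => ->(entry) { entry.upcase }
}

# Upgrade point (months or years apart): re-run the selection once.
sample = Array.new(1_000) { |i| "entry-#{i}" }
chosen = choose_implementation(candidates) { |_, impl| measure(impl, sample) }

# Request path (sub-millisecond): call the chosen implementation directly.
append = candidates.fetch(chosen)
append.call('new entry')
puts "selected: #{chosen}"
```

Passing the cost function as a block keeps the selection logic independent of how cost is obtained, so a benchmark could later be swapped for a cost model.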
27 Example: a declarative specification for CORFU
- No hard-wired choice of storage interface!
  - Bytestream vs. K/V
  - Partitioning strategy
- Take advantage of existing analysis tools and techniques!
- Convincingly correct!

    # persistent relations
    table :epoch, [:epoch]
    table :log, [:pos] => [:state, :data]

    # interfaces are also like tables
    interface input, :op, [:type, :pos, :epoch] => [:data]
    interface output, :ret, [:type, :pos, :epoch] => [:retval]

    bloom :write do
      temp :valid_write <= write_op.notin(found_op)
      log <+ valid_write { |o| [o.pos, 'valid', o.data] }
      ret <= valid_write { |o| [o.type, o.pos, o.epoch, 'ok'] }
      ret <= write_op.notin(valid_write) { |o| [o.type, o.pos, o.epoch, 'read-only'] }
    end
28 Example: performance benefit of group commit
- Ceph I/O handling is conservative
- Log operations are independent
- Lots of internal code traversal
- Using efficient interfaces
- Determined through analysis
- Amortize across I/O and code path
- Queuing, locking, ordering
- Vector or range-based interfaces
- See paper for more details: cost models and outliers
[Figure labels: Basic Operation, Batching]
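The amortization argument can be shown with a toy model that counts how often the fixed per-request path is traversed. The `CountingStore` class and both helper functions are hypothetical; they stand in for a vector-style append interface and do not reflect the Ceph API.

```ruby
# Hypothetical store that counts how many times the fixed per-request
# path (queuing, locking, ordering) is traversed.
class CountingStore
  attr_reader :requests, :entries

  def initialize
    @requests = 0
    @entries = []
  end

  # One request appends a vector of entries (a single entry is a vector of 1).
  def append_all(batch)
    @requests += 1
    @entries.concat(batch)
  end
end

# Basic operation: one request per log entry.
def append_individually(store, entries)
  entries.each { |e| store.append_all([e]) }
end

# Batching: independent log entries are grouped into one request each.
def append_batched(store, entries, batch_size)
  entries.each_slice(batch_size) { |batch| store.append_all(batch) }
end

basic = CountingStore.new
append_individually(basic, (1..10).to_a)
batched = CountingStore.new
append_batched(batched, (1..10).to_a, 4)
puts "basic requests: #{basic.requests}, batched requests: #{batched.requests}"
```

Both stores end up with the same entries; only the number of trips through the per-request path differs, which is safe here because the log operations are independent.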
29 Conclusion: expanding the set of storage interfaces...
- is becoming increasingly common
  - new systems being built
  - old systems being adapted
- will create an entirely new set of challenges; we showed
  - hard-wired solutions are difficult to maintain
  - even a basic software upgrade can have a big impact
- through declarative specifications can address many concerns
  - lower maintenance costs
  - automate system evolution
30 Thank you
Questions?
DeclStore: Layering is for the Faint of Heart
Noah Watkins, Michael Sevilla, Ivo Jimenez, Kathryn Dahlgren, Peter Alvaro, Shel Finkelstein, Carlos Maltzahn
31 What else we are shooting for...
- Volatile views of persistent data
  - Example: soft-state network counter in CORFU
- Physical design migration
  - Example: indexes, row vs. column vs. hybrid
- Program analysis and optimization
  - Example: exploit non-interference
32 Cost models are an important component
- Batching with range operations
  - Sensitive to outliers
  - Wasted I/O
- Simple heuristics for identifying outliers outperform Basic-Batch
  - Order of magnitude > IOPS
  - Minimal reduction over Opt-Batch
- Cost models needed for automatic strategy selection
  - orthogonal to program analysis
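Why range-based batching is sensitive to outliers can be seen in a small sketch. The gap-threshold heuristic below is an assumption chosen for illustration; it is not claimed to be the heuristic behind Basic-Batch or Opt-Batch, and the function names are hypothetical.

```ruby
# Hypothetical sketch of outlier handling for range-based batching.

# Split sorted positions into ranges wherever the gap to the previous
# position exceeds max_gap, so an outlier gets its own small range.
def split_ranges(positions, max_gap)
  positions.sort.slice_when { |a, b| b - a > max_gap }
           .map { |run| (run.first..run.last) }
end

# Wasted I/O: slots covered by the ranges that no request asked for.
def wasted_slots(ranges, positions)
  ranges.sum(&:size) - positions.uniq.size
end

positions = [1, 2, 4, 100]
naive = [(positions.min..positions.max)]
split = split_ranges(positions, 8)
puts "naive waste: #{wasted_slots(naive, positions)}"
puts "split waste: #{wasted_slots(split, positions)}"
```

A single range that covers the outlier at position 100 touches many slots nobody asked for, while splitting at the large gap keeps the waste small at the price of one extra request. A cost model would decide where that trade-off tips.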
More informationCluster-Level Google How we use Colossus to improve storage efficiency
Cluster-Level Storage @ Google How we use Colossus to improve storage efficiency Denis Serenyi Senior Staff Software Engineer dserenyi@google.com November 13, 2017 Keynote at the 2nd Joint International
More informationData Movement & Tiering with DMF 7
Data Movement & Tiering with DMF 7 Kirill Malkin Director of Engineering April 2019 Why Move or Tier Data? We wish we could keep everything in DRAM, but It s volatile It s expensive Data in Memory 2 Why
More informationSwift 5, ABI Stability and
Swift 5, ABI Stability and Concurrency @phillfarrugia Important Documents Concurrency Manifesto by Chris Lattner https: /gist.github.com/lattner/ 31ed37682ef1576b16bca1432ea9f782 Kicking off Concurrency
More informationAlgorithms and Data Structures for Efficient Free Space Reclamation in WAFL
Algorithms and Data Structures for Efficient Free Space Reclamation in WAFL Ram Kesavan Technical Director, WAFL NetApp, Inc. SDC 2017 1 Outline Garbage collection in WAFL Usenix FAST 2017 ACM Transactions
More informationA Comparison of Performance and Accuracy of Measurement Algorithms in Software
A Comparison of Performance and Accuracy of Measurement Algorithms in Software Omid Alipourfard, Masoud Moshref 1, Yang Zhou 2, Tong Yang 2, Minlan Yu 3 Yale University, Barefoot Networks 1, Peking University
More informationNVMFS: A New File System Designed Specifically to Take Advantage of Nonvolatile Memory
NVMFS: A New File System Designed Specifically to Take Advantage of Nonvolatile Memory Dhananjoy Das, Sr. Systems Architect SanDisk Corp. 1 Agenda: Applications are KING! Storage landscape (Flash / NVM)
More informationPerformance improvements in MySQL 5.5
Performance improvements in MySQL 5.5 Percona Live Feb 16, 2011 San Francisco, CA By Peter Zaitsev Percona Inc -2- Performance and Scalability Talk about Performance, Scalability, Diagnostics in MySQL
More information2011 IBM Research Strategic Initiative: Workload Optimized Systems
PIs: Michael Hind, Yuqing Gao Execs: Brent Hailpern, Toshio Nakatani, Kevin Nowka 2011 IBM Research Strategic Initiative: Workload Optimized Systems Yuqing Gao IBM Research 2011 IBM Corporation Motivation
More informationAdvanced Format in Legacy Infrastructures More Transparent than Disruptive
Advanced Format in Legacy Infrastructures More Transparent than Disruptive Sponsored by IDEMA Presented by Curtis E. Stevens Agenda AF History Enterprise AF Futures SMR & LBA Indirection Hybrids & SSDs
More informationCeph Intro & Architectural Overview. Abbas Bangash Intercloud Systems
Ceph Intro & Architectural Overview Abbas Bangash Intercloud Systems About Me Abbas Bangash Systems Team Lead, Intercloud Systems abangash@intercloudsys.com intercloudsys.com 2 CLOUD SERVICES COMPUTE NETWORK
More informationSandor Heman, Niels Nes, Peter Boncz. Dynamic Bandwidth Sharing. Cooperative Scans: Marcin Zukowski. CWI, Amsterdam VLDB 2007.
Cooperative Scans: Dynamic Bandwidth Sharing in a DBMS Marcin Zukowski Sandor Heman, Niels Nes, Peter Boncz CWI, Amsterdam VLDB 2007 Outline Scans in a DBMS Cooperative Scans Benchmarks DSM version VLDB,
More informationUK LUG 10 th July Lustre at Exascale. Eric Barton. CTO Whamcloud, Inc Whamcloud, Inc.
UK LUG 10 th July 2012 Lustre at Exascale Eric Barton CTO Whamcloud, Inc. eeb@whamcloud.com Agenda Exascale I/O requirements Exascale I/O model 3 Lustre at Exascale - UK LUG 10th July 2012 Exascale I/O
More informationSpeeding Up Cloud/Server Applications Using Flash Memory
Speeding Up Cloud/Server Applications Using Flash Memory Sudipta Sengupta and Jin Li Microsoft Research, Redmond, WA, USA Contains work that is joint with Biplob Debnath (Univ. of Minnesota) Flash Memory
More informationOPERATING SYSTEM TRANSACTIONS
OPERATING SYSTEM TRANSACTIONS Donald E. Porter, Owen S. Hofmann, Christopher J. Rossbach, Alexander Benn, and Emmett Witchel The University of Texas at Austin OS APIs don t handle concurrency 2 OS is weak
More informationToward Energy-efficient and Fault-tolerant Consistent Hashing based Data Store. Wei Xie TTU CS Department Seminar, 3/7/2017
Toward Energy-efficient and Fault-tolerant Consistent Hashing based Data Store Wei Xie TTU CS Department Seminar, 3/7/2017 1 Outline General introduction Study 1: Elastic Consistent Hashing based Store
More informationStorage Virtualization. Eric Yen Academia Sinica Grid Computing Centre (ASGC) Taiwan
Storage Virtualization Eric Yen Academia Sinica Grid Computing Centre (ASGC) Taiwan Storage Virtualization In computer science, storage virtualization uses virtualization to enable better functionality
More informationBigtable. Presenter: Yijun Hou, Yixiao Peng
Bigtable Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google, Inc. OSDI 06 Presenter: Yijun Hou, Yixiao Peng
More informationSummary: Issues / Open Questions:
Summary: The paper introduces Transitional Locking II (TL2), a Software Transactional Memory (STM) algorithm, which tries to overcomes most of the safety and performance issues of former STM implementations.
More information