1 DIMM: A Distributed Metadata Management for Data-Intensive HPC
Brandon Szeliga, John Cavicchio and Weisong Shi
Wayne State University
bszeliga@wayne.edu

2 Roadmap
- Motivation
- DIMM
- DHT
- Bloomfilter
- System Walk Through
- Evaluation of DIMM
- Related Work
- Conclusion

3 Motivation
- The amount of data being stored is continually growing
- Soon to reach levels never seen before
  - Petabyte levels
  - E.g., physics, bioinformatics, etc.
- Data needs to be migrated from storage to computational nodes
- We envision this

4 Motivation
- Along with the increase in data stores comes an increase in the metadata associated with migrated files
- Two challenges:
  - Maintaining the migrated file information
  - Frequent updating
- Our solution: DiSK, A Distributed Shared Disk Cache System
  - DIMM is the key component of DiSK

5 DIMM
- DIstributed Metadata Management
- Goal: reduce the number of migrations from archival storage and minimize metadata for a centralized scheduler
- DIMM uses two key concepts:
  - Distributed Hash Table (DHT)
  - Bloomfilter
- DIMM is used for read-only data and does not guarantee that data is persistent within it.

6 Distributed Hash Table Overview
- A distributed hash table:
  - Organizes nodes into a ring
  - Inserts/retrieves items based on a key
  - E.g., Chord [Stoica et al. 2001]

7 Distributed Hash Table
- By using a key related to the name of a file, a home location for each file can be determined
- Every node can determine where a file's home is, but not whether the file is actually there
- This allows every node to retrieve a file if it is stored on its home node
- To differentiate between data stored on its home node and data stored elsewhere, storage is divided into a home cache (H) and a local cache (C)
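The home-location idea on this slide can be sketched in a few lines. This is a minimal illustration, not DIMM's implementation: the 32-bit identifier space, the SHA-1 hash, and the names `ring_id`, `Node`, `Ring`, `home_of`, `store`, and `lookup` are all assumptions made for the example, and the successor search is a simple sorted-list lookup rather than Chord's finger-table routing.

```python
import hashlib
from bisect import bisect_left

RING_BITS = 32  # assumed identifier-space size, chosen only for illustration


def ring_id(name: str) -> int:
    """Map a file or node name to a position on the ring."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest[: RING_BITS // 8], "big")


class Node:
    def __init__(self, name: str):
        self.name = name
        self.id = ring_id(name)
        self.home_cache = set()   # H: files for which this node is the home
        self.local_cache = set()  # C: files held here but homed elsewhere


class Ring:
    def __init__(self, node_names):
        self.nodes = sorted((Node(n) for n in node_names), key=lambda n: n.id)
        self.ids = [n.id for n in self.nodes]

    def home_of(self, filename: str) -> Node:
        """Any node can compute this: first node at or after the file's key."""
        i = bisect_left(self.ids, ring_id(filename))
        return self.nodes[i % len(self.nodes)]  # wrap around the ring

    def store(self, node: Node, filename: str) -> None:
        """Place the file in H if node is its home, otherwise in C."""
        if node is self.home_of(filename):
            node.home_cache.add(filename)
        else:
            node.local_cache.add(filename)

    def lookup(self, filename: str):
        """Return the home node if it holds the file, else None."""
        home = self.home_of(filename)
        return home if filename in home.home_cache else None
```

Note that `home_of` always succeeds, while `lookup` can still return `None`: knowing where the home is does not tell a node whether the file is there.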

8 Bloomfilter
- Why Bloomfilter?
  - Quick checking
  - Easy insertion
  - Small storage requirement
- Why not Bloomfilter?
  - Needs to be larger than the set
  - Tendency to contain false positives
  - Cannot delete

9 Bloomfilter Overview
- A Bloomfilter is an array of bits that is k times larger than a set of n items

10 Counter Based Bloomfilter
- Uses an array of integers instead of bits
- By using a counter based Bloomfilter, a centralized manager can monitor the data available in the DHT
- This allows data to be removed from the Bloomfilter without introducing false negatives
- However, false positives are still a problem with this Bloomfilter
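A counter based Bloomfilter can be sketched as follows. This is an illustrative version only: the class name `CountingBloomFilter`, the SHA-256 double-hashing scheme, and the default of 4 hash functions (borrowed from the evaluation setup) are assumptions, not details taken from DIMM.

```python
import hashlib


class CountingBloomFilter:
    """Bloomfilter variant that keeps an integer counter per slot."""

    def __init__(self, size: int, num_hashes: int = 4):
        self.size = size
        self.num_hashes = num_hashes
        self.counters = [0] * size  # integers instead of bits

    def _positions(self, item: str):
        # Derive num_hashes slot indices from one digest (double hashing).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.size for i in range(self.num_hashes)]

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.counters[p] += 1

    def remove(self, item: str) -> None:
        # Only items that were added may be removed; removing anything else
        # would corrupt the counters and could cause false negatives.
        if item not in self:
            raise KeyError(item)
        for p in self._positions(item):
            self.counters[p] -= 1

    def __contains__(self, item: str) -> bool:
        # May report a false positive, never a false negative.
        return all(self.counters[p] > 0 for p in self._positions(item))
```

Because `remove` decrements exactly the counters that `add` incremented, other items sharing a slot stay visible, which is what a plain bit array cannot offer.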

11 Locality-Check Bloomfilter
- In order to reduce false positives, a locality check is introduced into the Bloomfilter
- For every file, a set of its neighboring files is checked as well
- Neighboring files can be determined alphanumerically or chronologically
- Using the existence of these neighboring files, a probability of existence of the original file is given by:
  [equation not included in transcription]
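The probability formula from this slide is not preserved in the transcription, so the sketch below substitutes a simple assumed heuristic: the fraction of neighbors within a distance D that the filter also reports as present, compared against a threshold T. The function names, the alphanumeric ordering via a sorted list, and the exact roles given to `distance` and `threshold` are hypothetical, chosen only to show the shape of a locality check.

```python
def neighbor_score(index, names, bloom, distance):
    """Fraction of the file's neighbors (within `distance`) found in `bloom`.

    `names` is the ordered list of file names (for example, sorted
    alphanumerically) and `bloom` is any container supporting `in`.
    """
    lo = max(0, index - distance)
    hi = min(len(names), index + distance + 1)
    neighbors = [names[j] for j in range(lo, hi) if j != index]
    if not neighbors:
        return 0.0
    present = sum(1 for n in neighbors if n in bloom)
    return present / len(neighbors)


def probably_present(index, names, bloom, distance, threshold):
    """Accept a positive answer only if enough neighbors are also present."""
    if names[index] not in bloom:
        return False  # the filter itself says "absent"
    return neighbor_score(index, names, bloom, distance) >= threshold
```

A check of this kind rejects some true positives whose neighbors happen to be absent, which matches the trade-off reported in the evaluation: fewer false positives in exchange for some false negatives.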

12 Standard System vs. DIMM System

13 System Walk Through

14 DIMM Evaluation
- Simulation used for monitoring the impact of DHT and Bloomfilter
- Evaluate:
  - Impact in job scheduling
    - Local hits and migrations from archive
  - Database size vs. Bloomfilter size
  - Impact of the locality check in the Bloomfilter
    - False negatives and false positives

15 Simulation Setup
- 400 nodes, each capable of holding 2,500 files
  - 250 GB nodes with 100 MB files
- Trace file generates the number of input files from a normal distribution (mean 500, standard deviation 22)
- Actual files are drawn from a uniform distribution over 100,000 files
- Jobs (collections of input files) are scheduled based on SWAP (Storage-aware App. Scheduling)
  - This attempts to maximize the number of file hits
  - This scheduling policy is a separate topic of ours
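The setup parameters can be turned into a small trace generator. This is a sketch under assumptions, not the authors' simulator: the names below are hypothetical, decimal units are assumed for the capacity arithmetic, and files within a job are sampled with replacement because the slide does not say whether a job's input files are distinct.

```python
import random

NUM_NODES = 400
NODE_SIZE_MB = 250_000      # 250 GB per node, assuming decimal units
FILE_SIZE_MB = 100
NUM_FILES = 100_000         # size of the file population

# 250 GB / 100 MB gives the per-node capacity quoted on the slide.
NODE_CAPACITY_FILES = NODE_SIZE_MB // FILE_SIZE_MB


def generate_job(rng: random.Random):
    """Return one job as a list of file ids.

    The number of input files follows a normal distribution
    (mean 500, standard deviation 22); each file id is drawn
    uniformly from the population.
    """
    count = max(1, round(rng.gauss(500, 22)))
    return [rng.randrange(NUM_FILES) for _ in range(count)]


def generate_trace(num_jobs: int, seed: int = 0):
    """Build a reproducible trace of jobs."""
    rng = random.Random(seed)
    return [generate_job(rng) for _ in range(num_jobs)]
```

With these numbers the cluster can hold 400 x 2,500 = 1,000,000 file copies in total, ten times the file population, so how that space is split between caches is what the compared schemes vary.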

16 Scheme Comparisons
- SWAP: all storage is considered cache space
- DIMM_h: DIMM where only the home location is used to hold files
- DIMM_hr: DIMM with both the home and the local cache used to hold files
- JobMig: DIMM, but with the ability to migrate jobs to the location of the data

17 Impact in Job Scheduling Local
- DIMM_hr performs similarly to SWAP until limiting size
- DIMM_h underperforms SWAP due to restrictions on size
- DIMM_hr compares to SWAP with a large cache, but suffers when caches are larger

18 Impact in Job Scheduling
- SWAP's cache scheme suffers from often needing to go to the archive for data
- Also, as the home cache increases we can match the migrations required of
- DIMM has fewer migrations than SWAP with a large cache, and is capable of matching JobMig

19 Database Size vs. Bloomfilter Size
- The problem with a Bloomfilter is that the array needs to be larger than the number of items
- The problem with a database is that the per-item entry is large
- On the next slide we compare Bloomfilter and database:
  - Bloomfilter where each element is a byte
  - Database where each item is 4 bytes for location information and lg(n) bytes for file differentiation
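The two size models on this slide can be written out directly. The sketch below is one reading of them, with assumptions stated plainly: the Bloomfilter is taken to hold a fixed multiple of the total number of files at one byte per element, and n in lg(n) is taken to be the total number of files. The function names are hypothetical.

```python
import math


def bloom_bytes(total_files: int, multiplier: int) -> int:
    """Bloomfilter size: `multiplier` one-byte elements per file in the population."""
    return multiplier * total_files


def database_bytes(items: int, total_files: int) -> float:
    """Database size: 4 bytes of location plus lg(n) bytes per stored item."""
    return items * (4 + math.log2(total_files))


def break_even_items(total_files: int, multiplier: int) -> int:
    """Smallest item count at which the database is at least as large as the filter."""
    per_item = 4 + math.log2(total_files)
    return math.ceil(bloom_bytes(total_files, multiplier) / per_item)
```

Under this model the Bloomfilter has a fixed cost while the database grows linearly with the number of stored items, so the filter wins once enough items are tracked; where that crossover falls depends on the assumptions above.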

20 Database Size vs. Bloomfilter Size
- Bloomfilters with 5x and 10x the total number of files have a savings on space before 500 files in database
- A counter-based Bloomfilter is a space-efficient alternative to a database

21 Impact of Locality Check in
- Created Bloomfilters of various sizes (10^7, 10^6, 500x10^3, 250x10^3) with 4 hash functions
- 100x10^3 files selected from a normal distribution with various variances (250, 10^3, 5x10^3, 10x10^3, 25x10^3, 50x10^3, 75x10^3, 100x10^3, 125x10^3, 250x10^3, 375x10^3, 500x10^3)
- Changing this parameter changes the number of different files inserted

22 Number of False Positives
- Increasing the variance increases the number of files inserted and increases the number of false positives
- A smaller Bloomfilter has more false positives, and increasing variance leads to more files, which increases false positives as well

23 False Positives Identified
- D is the distance in the locality check
- T is the threshold to identify false positives
- Bloomfilter size of 250x10^3
- Identifies at least 25% of false positives, more if D/T increases

24 False Negatives
- Bloomfilter size of 250x10^3
- Similar results for other sizes
- The increase in false negatives is comparable to the decrease in false positives, but false negatives are less costly

25 Related Work
- Giggle: manages replicas in a user-given configuration [A. Chervenak et al. 2002]
  - Requires user to define system type
  - Achieves distributed nature by high redundancy of data
- L-Store: manages files on block level in a file system [A. Tackett et al. 2006]
  - Doesn't get benefit of local file hits, but may have faster transfers
  - Interesting comparison to DIMM
- Zhang et al.: job recovery in the event of node failure [Zhang et al. 2007]

26 Conclusions
- We present a method for distributing the metadata management in HPC environments
  - capable of reducing the number of migrations from archive while keeping a high number of local hits
  - capable of reducing the size of the centralized management scheme
- With the introduction of locality checks in a Bloomfilter, we are able to reduce the number of false positives in exchange for increasing the less costly false negatives

27 Current and Future Work
- Currently implementing a version of DIMM into our DiSK project (Distributed Shared Disk Cache)
- DiSK is the major project that is a culmination of DIMM's management and Differentiable Replication (DiR)
- Based on MIT's Chord/DHash
- A prototype is running on a 20-node cluster

28 Questions and More Information
Brandon Szeliga
Weisong Shi

ECE 7650 Scalable and Secure Internet Services and Architecture ---- A Systems Perspective

ECE 7650 Scalable and Secure Internet Services and Architecture ---- A Systems Perspective ECE 7650 Scalable and Secure Internet Services and Architecture ---- A Systems Perspective Part II: Data Center Software Architecture: Topic 1: Distributed File Systems GFS (The Google File System) 1 Filesystems

More information

Staggeringly Large Filesystems

Staggeringly Large Filesystems Staggeringly Large Filesystems Evan Danaher CS 6410 - October 27, 2009 Outline 1 Large Filesystems 2 GFS 3 Pond Outline 1 Large Filesystems 2 GFS 3 Pond Internet Scale Web 2.0 GFS Thousands of machines

More information

The Google File System

The Google File System October 13, 2010 Based on: S. Ghemawat, H. Gobioff, and S.-T. Leung: The Google file system, in Proceedings ACM SOSP 2003, Lake George, NY, USA, October 2003. 1 Assumptions Interface Architecture Single

More information

Amazon ElastiCache 8/1/17. Why Amazon ElastiCache is important? Introduction:

Amazon ElastiCache 8/1/17. Why Amazon ElastiCache is important? Introduction: Amazon ElastiCache Introduction: How to improve application performance using caching. What are the ElastiCache engines, and the difference between them. How to scale your cluster vertically. How to scale

More information

Bigtable. A Distributed Storage System for Structured Data. Presenter: Yunming Zhang Conglong Li. Saturday, September 21, 13

Bigtable. A Distributed Storage System for Structured Data. Presenter: Yunming Zhang Conglong Li. Saturday, September 21, 13 Bigtable A Distributed Storage System for Structured Data Presenter: Yunming Zhang Conglong Li References SOCC 2010 Key Note Slides Jeff Dean Google Introduction to Distributed Computing, Winter 2008 University

More information

Fixing the Embarrassing Slowness of OpenDHT on PlanetLab

Fixing the Embarrassing Slowness of OpenDHT on PlanetLab Fixing the Embarrassing Slowness of OpenDHT on PlanetLab Sean Rhea, Byung-Gon Chun, John Kubiatowicz, and Scott Shenker UC Berkeley (and now MIT) December 13, 2005 Distributed Hash Tables (DHTs) Same interface

More information

Scaling Indexer Clustering

Scaling Indexer Clustering Scaling Indexer Clustering 5 Million Unique Buckets and Beyond Cher-Hung Chang Principal Software Engineer Tameem Anwar Software Engineer 09/26/2017 Washington, DC Forward-Looking Statements During the

More information

Two-Choice Randomized Dynamic I/O Scheduler for Object Storage Systems. Dong Dai, Yong Chen, Dries Kimpe, and Robert Ross

Two-Choice Randomized Dynamic I/O Scheduler for Object Storage Systems. Dong Dai, Yong Chen, Dries Kimpe, and Robert Ross Two-Choice Randomized Dynamic I/O Scheduler for Object Storage Systems Dong Dai, Yong Chen, Dries Kimpe, and Robert Ross Parallel Object Storage Many HPC systems utilize object storage: PVFS, Lustre, PanFS,

More information

GFS Overview. Design goals/priorities Design for big-data workloads Huge files, mostly appends, concurrency, huge bandwidth Design for failures

GFS Overview. Design goals/priorities Design for big-data workloads Huge files, mostly appends, concurrency, huge bandwidth Design for failures GFS Overview Design goals/priorities Design for big-data workloads Huge files, mostly appends, concurrency, huge bandwidth Design for failures Interface: non-posix New op: record appends (atomicity matters,

More information

The Google File System

The Google File System The Google File System Sanjay Ghemawat, Howard Gobioff and Shun Tak Leung Google* Shivesh Kumar Sharma fl4164@wayne.edu Fall 2015 004395771 Overview Google file system is a scalable distributed file system

More information

goals monitoring, fault tolerance, auto-recovery (thousands of low-cost machines) handle appends efficiently (no random writes & sequential reads)

goals monitoring, fault tolerance, auto-recovery (thousands of low-cost machines) handle appends efficiently (no random writes & sequential reads) Google File System goals monitoring, fault tolerance, auto-recovery (thousands of low-cost machines) focus on multi-gb files handle appends efficiently (no random writes & sequential reads) co-design GFS

More information

CSE 124: Networked Services Fall 2009 Lecture-19

CSE 124: Networked Services Fall 2009 Lecture-19 CSE 124: Networked Services Fall 2009 Lecture-19 Instructor: B. S. Manoj, Ph.D http://cseweb.ucsd.edu/classes/fa09/cse124 Some of these slides are adapted from various sources/individuals including but

More information

Structuring PLFS for Extensibility

Structuring PLFS for Extensibility Structuring PLFS for Extensibility Chuck Cranor, Milo Polte, Garth Gibson PARALLEL DATA LABORATORY Carnegie Mellon University What is PLFS? Parallel Log Structured File System Interposed filesystem b/w

More information

Distributed File Systems II

Distributed File Systems II Distributed File Systems II To do q Very-large scale: Google FS, Hadoop FS, BigTable q Next time: Naming things GFS A radically new environment NFS, etc. Independence Small Scale Variety of workloads Cooperation

More information

Staggeringly Large File Systems. Presented by Haoyan Geng

Staggeringly Large File Systems. Presented by Haoyan Geng Staggeringly Large File Systems Presented by Haoyan Geng Large-scale File Systems How Large? Google s file system in 2009 (Jeff Dean, LADIS 09) - 200+ clusters - Thousands of machines per cluster - Pools

More information

RAMCloud: A Low-Latency Datacenter Storage System Ankita Kejriwal Stanford University

RAMCloud: A Low-Latency Datacenter Storage System Ankita Kejriwal Stanford University RAMCloud: A Low-Latency Datacenter Storage System Ankita Kejriwal Stanford University (Joint work with Diego Ongaro, Ryan Stutsman, Steve Rumble, Mendel Rosenblum and John Ousterhout) a Storage System

More information

Flat Datacenter Storage. Edmund B. Nightingale, Jeremy Elson, et al. 6.S897

Flat Datacenter Storage. Edmund B. Nightingale, Jeremy Elson, et al. 6.S897 Flat Datacenter Storage Edmund B. Nightingale, Jeremy Elson, et al. 6.S897 Motivation Imagine a world with flat data storage Simple, Centralized, and easy to program Unfortunately, datacenter networks

More information

Cache Policies. Philipp Koehn. 6 April 2018

Cache Policies. Philipp Koehn. 6 April 2018 Cache Policies Philipp Koehn 6 April 2018 Memory Tradeoff 1 Fastest memory is on same chip as CPU... but it is not very big (say, 32 KB in L1 cache) Slowest memory is DRAM on different chips... but can

More information

Jinho Hwang and Timothy Wood George Washington University

Jinho Hwang and Timothy Wood George Washington University Jinho Hwang and Timothy Wood George Washington University Background: Memory Caching Two orders of magnitude more reads than writes Solution: Deploy memcached hosts to handle the read capacity 6. HTTP

More information

HOW DATA DEDUPLICATION WORKS A WHITE PAPER

HOW DATA DEDUPLICATION WORKS A WHITE PAPER HOW DATA DEDUPLICATION WORKS A WHITE PAPER HOW DATA DEDUPLICATION WORKS ABSTRACT IT departments face explosive data growth, driving up costs of storage for backup and disaster recovery (DR). For this reason,

More information

File System Internals. Jo, Heeseung

File System Internals. Jo, Heeseung File System Internals Jo, Heeseung Today's Topics File system implementation File descriptor table, File table Virtual file system File system design issues Directory implementation: filename -> metadata

More information

Cascade Mapping: Optimizing Memory Efficiency for Flash-based Key-value Caching

Cascade Mapping: Optimizing Memory Efficiency for Flash-based Key-value Caching Cascade Mapping: Optimizing Memory Efficiency for Flash-based Key-value Caching Kefei Wang and Feng Chen Louisiana State University SoCC '18 Carlsbad, CA Key-value Systems in Internet Services Key-value

More information

Presented by: Alvaro Llanos E

Presented by: Alvaro Llanos E Presented by: Alvaro Llanos E Motivation and Overview Frangipani Architecture overview Similar DFS PETAL: Distributed virtual disks Overview Design Virtual Physical mapping Failure tolerance Frangipani

More information

CS 347 Parallel and Distributed Data Processing

CS 347 Parallel and Distributed Data Processing CS 347 Parallel and Distributed Data Processing Spring 2016 Notes 12: Distributed Information Retrieval CS 347 Notes 12 2 CS 347 Notes 12 3 CS 347 Notes 12 4 CS 347 Notes 12 5 Web Search Engine Crawling

More information

CS 347 Parallel and Distributed Data Processing

CS 347 Parallel and Distributed Data Processing CS 347 Parallel and Distributed Data Processing Spring 2016 Notes 12: Distributed Information Retrieval CS 347 Notes 12 2 CS 347 Notes 12 3 CS 347 Notes 12 4 Web Search Engine Crawling Indexing Computing

More information

Parallel File Systems. John White Lawrence Berkeley National Lab

Parallel File Systems. John White Lawrence Berkeley National Lab Parallel File Systems John White Lawrence Berkeley National Lab Topics Defining a File System Our Specific Case for File Systems Parallel File Systems A Survey of Current Parallel File Systems Implementation

More information

Asynchronous Logging and Fast Recovery for a Large-Scale Distributed In-Memory Storage

Asynchronous Logging and Fast Recovery for a Large-Scale Distributed In-Memory Storage Asynchronous Logging and Fast Recovery for a Large-Scale Distributed In-Memory Storage Kevin Beineke, Florian Klein, Michael Schöttner Institut für Informatik, Heinrich-Heine-Universität Düsseldorf Outline

More information

Write a technical report Present your results Write a workshop/conference paper (optional) Could be a real system, simulation and/or theoretical

Write a technical report Present your results Write a workshop/conference paper (optional) Could be a real system, simulation and/or theoretical Identify a problem Review approaches to the problem Propose a novel approach to the problem Define, design, prototype an implementation to evaluate your approach Could be a real system, simulation and/or

More information

MDHIM: A Parallel Key/Value Store Framework for HPC

MDHIM: A Parallel Key/Value Store Framework for HPC MDHIM: A Parallel Key/Value Store Framework for HPC Hugh Greenberg 7/6/2015 LA-UR-15-25039 HPC Clusters Managed by a job scheduler (e.g., Slurm, Moab) Designed for running user jobs Difficult to run system

More information

File System Case Studies. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

File System Case Studies. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University File System Case Studies Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Today s Topics The Original UNIX File System FFS Ext2 FAT 2 UNIX FS (1)

More information

File System Case Studies. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

File System Case Studies. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University File System Case Studies Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Today s Topics The Original UNIX File System FFS Ext2 FAT 2 UNIX FS (1)

More information

Filesystem. Disclaimer: some slides are adopted from book authors slides with permission

Filesystem. Disclaimer: some slides are adopted from book authors slides with permission Filesystem Disclaimer: some slides are adopted from book authors slides with permission 1 Recap Directory A special file contains (inode, filename) mappings Caching Directory cache Accelerate to find inode

More information

I/O Challenges: Todays I/O Challenges for Big Data Analysis. Henry Newman CEO/CTO Instrumental, Inc. April 30, 2013

I/O Challenges: Todays I/O Challenges for Big Data Analysis. Henry Newman CEO/CTO Instrumental, Inc. April 30, 2013 I/O Challenges: Todays I/O Challenges for Big Data Analysis Henry Newman CEO/CTO Instrumental, Inc. April 30, 2013 The Challenge is Archives Big data in HPC means archive and archive translates to a tape

More information

Distributed Systems. Lec 10: Distributed File Systems GFS. Slide acks: Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung

Distributed Systems. Lec 10: Distributed File Systems GFS. Slide acks: Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung Distributed Systems Lec 10: Distributed File Systems GFS Slide acks: Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung 1 Distributed File Systems NFS AFS GFS Some themes in these classes: Workload-oriented

More information

Deep Storage for Exponential Data. Nathan Thompson CEO, Spectra Logic

Deep Storage for Exponential Data. Nathan Thompson CEO, Spectra Logic Deep Storage for Exponential Data Nathan Thompson CEO, Spectra Logic HISTORY Partnered with Fujifilm on a variety of projects HQ in Boulder, 35 years of business Customers in 54 countries Spectra builds

More information

TIBCO StreamBase 10 Distributed Computing and High Availability. November 2017

TIBCO StreamBase 10 Distributed Computing and High Availability. November 2017 TIBCO StreamBase 10 Distributed Computing and High Availability November 2017 Distributed Computing Distributed Computing location transparent objects and method invocation allowing transparent horizontal

More information

The Google File System (GFS)

The Google File System (GFS) 1 The Google File System (GFS) CS60002: Distributed Systems Antonio Bruto da Costa Ph.D. Student, Formal Methods Lab, Dept. of Computer Sc. & Engg., Indian Institute of Technology Kharagpur 2 Design constraints

More information

The Fusion Distributed File System

The Fusion Distributed File System Slide 1 / 44 The Fusion Distributed File System Dongfang Zhao February 2015 Slide 2 / 44 Outline Introduction FusionFS System Architecture Metadata Management Data Movement Implementation Details Unique

More information

FLAT DATACENTER STORAGE. Paper-3 Presenter-Pratik Bhatt fx6568

FLAT DATACENTER STORAGE. Paper-3 Presenter-Pratik Bhatt fx6568 FLAT DATACENTER STORAGE Paper-3 Presenter-Pratik Bhatt fx6568 FDS Main discussion points A cluster storage system Stores giant "blobs" - 128-bit ID, multi-megabyte content Clients and servers connected

More information

Differentiated Replication Strategy in Data Centers

Differentiated Replication Strategy in Data Centers Differentiated Replication Strategy in Data Centers Tung Nguyen, Anthony Cutway, and Weisong Shi Wayne State University {nttung,acutway,weisong}@wayne.edu Abstract. Cloud computing has attracted a great

More information

FLAT DATACENTER STORAGE CHANDNI MODI (FN8692)

FLAT DATACENTER STORAGE CHANDNI MODI (FN8692) FLAT DATACENTER STORAGE CHANDNI MODI (FN8692) OUTLINE Flat datacenter storage Deterministic data placement in fds Metadata properties of fds Per-blob metadata in fds Dynamic Work Allocation in fds Replication

More information

CGAR: Strong Consistency without Synchronous Replication. Seo Jin Park Advised by: John Ousterhout

CGAR: Strong Consistency without Synchronous Replication. Seo Jin Park Advised by: John Ousterhout CGAR: Strong Consistency without Synchronous Replication Seo Jin Park Advised by: John Ousterhout Improved update performance of storage systems with master-back replication Fast: updates complete before

More information

RAMCloud: Scalable High-Performance Storage Entirely in DRAM John Ousterhout Stanford University

RAMCloud: Scalable High-Performance Storage Entirely in DRAM John Ousterhout Stanford University RAMCloud: Scalable High-Performance Storage Entirely in DRAM John Ousterhout Stanford University (with Nandu Jayakumar, Diego Ongaro, Mendel Rosenblum, Stephen Rumble, and Ryan Stutsman) DRAM in Storage

More information

Today CSCI Coda. Naming: Volumes. Coda GFS PAST. Instructor: Abhishek Chandra. Main Goals: Volume is a subtree in the naming space

Today CSCI Coda. Naming: Volumes. Coda GFS PAST. Instructor: Abhishek Chandra. Main Goals: Volume is a subtree in the naming space Today CSCI 5105 Coda GFS PAST Instructor: Abhishek Chandra 2 Coda Main Goals: Availability: Work in the presence of disconnection Scalability: Support large number of users Successor of Andrew File System

More information

NPTEL Course Jan K. Gopinath Indian Institute of Science

NPTEL Course Jan K. Gopinath Indian Institute of Science Storage Systems NPTEL Course Jan 2012 (Lecture 39) K. Gopinath Indian Institute of Science Google File System Non-Posix scalable distr file system for large distr dataintensive applications performance,

More information

RAMCube: Exploiting Network Proximity for RAM-Based Key-Value Store

RAMCube: Exploiting Network Proximity for RAM-Based Key-Value Store RAMCube: Exploiting Network Proximity for RAM-Based Key-Value Store Yiming Zhang, Rui Chu @ NUDT Chuanxiong Guo, Guohan Lu, Yongqiang Xiong, Haitao Wu @ MSRA June, 2012 1 Background Disk-based storage

More information

Cloud Computing and Hadoop Distributed File System. UCSB CS170, Spring 2018

Cloud Computing and Hadoop Distributed File System. UCSB CS170, Spring 2018 Cloud Computing and Hadoop Distributed File System UCSB CS70, Spring 08 Cluster Computing Motivations Large-scale data processing on clusters Scan 000 TB on node @ 00 MB/s = days Scan on 000-node cluster

More information

Distributed System. Gang Wu. Spring,2018

Distributed System. Gang Wu. Spring,2018 Distributed System Gang Wu Spring,2018 Lecture7:DFS What is DFS? A method of storing and accessing files base in a client/server architecture. A distributed file system is a client/server-based application

More information

Distributed Hash Table

Distributed Hash Table Distributed Hash Table P2P Routing and Searching Algorithms Ruixuan Li College of Computer Science, HUST rxli@public.wh.hb.cn http://idc.hust.edu.cn/~rxli/ In Courtesy of Xiaodong Zhang, Ohio State Univ

More information

The Google File System

The Google File System The Google File System Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung SOSP 2003 presented by Kun Suo Outline GFS Background, Concepts and Key words Example of GFS Operations Some optimizations in

More information

Google File System. By Dinesh Amatya

Google File System. By Dinesh Amatya Google File System By Dinesh Amatya Google File System (GFS) Sanjay Ghemawat, Howard Gobioff, Shun-Tak Leung designed and implemented to meet rapidly growing demand of Google's data processing need a scalable

More information

CS November 2017

CS November 2017 Bigtable Highly available distributed storage Distributed Systems 18. Bigtable Built with semi-structured data in mind URLs: content, metadata, links, anchors, page rank User data: preferences, account

More information

FILE SYSTEMS. CS124 Operating Systems Winter , Lecture 23

FILE SYSTEMS. CS124 Operating Systems Winter , Lecture 23 FILE SYSTEMS CS124 Operating Systems Winter 2015-2016, Lecture 23 2 Persistent Storage All programs require some form of persistent storage that lasts beyond the lifetime of an individual process Most

More information

5 Fundamental Strategies for Building a Data-centered Data Center

5 Fundamental Strategies for Building a Data-centered Data Center 5 Fundamental Strategies for Building a Data-centered Data Center June 3, 2014 Ken Krupa, Chief Field Architect Gary Vidal, Solutions Specialist Last generation Reference Data Unstructured OLTP Warehouse

More information

Tools for Social Networking Infrastructures

Tools for Social Networking Infrastructures Tools for Social Networking Infrastructures 1 Cassandra - a decentralised structured storage system Problem : Facebook Inbox Search hundreds of millions of users distributed infrastructure inbox changes

More information

Finding Data in the Cloud using Distributed Hash Tables (Chord) IBM Haifa Research Storage Systems

Finding Data in the Cloud using Distributed Hash Tables (Chord) IBM Haifa Research Storage Systems Finding Data in the Cloud using Distributed Hash Tables (Chord) IBM Haifa Research Storage Systems 1 Motivation from the File Systems World The App needs to know the path /home/user/my pictures/ The Filesystem

More information

Deduplication Storage System

Deduplication Storage System Deduplication Storage System Kai Li Charles Fitzmorris Professor, Princeton University & Chief Scientist and Co-Founder, Data Domain, Inc. 03/11/09 The World Is Becoming Data-Centric CERN Tier 0 Business

More information

What is a file system

What is a file system COSC 6397 Big Data Analytics Distributed File Systems Edgar Gabriel Spring 2017 What is a file system A clearly defined method that the OS uses to store, catalog and retrieve files Manage the bits that

More information

Jai Menon and the rest of the team IBM Research Autonomic Storage Systems April 11, 2002

Jai Menon and the rest of the team IBM Research Autonomic Storage Systems April 11, 2002 Jai Menon and the rest of the team IBM Research Autonomic Storage Systems April 11, 2002 Copyright IBM Corporation 2000. All rights reserved. Presentation File Name Why do we need Autonomic Storage? Storage

More information

EaSync: A Transparent File Synchronization Service across Multiple Machines

EaSync: A Transparent File Synchronization Service across Multiple Machines EaSync: A Transparent File Synchronization Service across Multiple Machines Huajian Mao 1,2, Hang Zhang 1,2, Xianqiang Bao 1,2, Nong Xiao 1,2, Weisong Shi 3, and Yutong Lu 1,2 1 State Key Laboratory of

More information

Overcoming Obstacles to Petabyte Archives

Overcoming Obstacles to Petabyte Archives Overcoming Obstacles to Petabyte Archives Mike Holland Grau Data Storage, Inc. 609 S. Taylor Ave., Unit E, Louisville CO 80027-3091 Phone: +1-303-664-0060 FAX: +1-303-664-1680 E-mail: Mike@GrauData.com

More information

Disk Scheduling COMPSCI 386

Disk Scheduling COMPSCI 386 Disk Scheduling COMPSCI 386 Topics Disk Structure (9.1 9.2) Disk Scheduling (9.4) Allocation Methods (11.4) Free Space Management (11.5) Hard Disk Platter diameter ranges from 1.8 to 3.5 inches. Both sides

More information

<Insert Picture Here> MySQL Web Reference Architectures Building Massively Scalable Web Infrastructure

<Insert Picture Here> MySQL Web Reference Architectures Building Massively Scalable Web Infrastructure MySQL Web Reference Architectures Building Massively Scalable Web Infrastructure Mario Beck (mario.beck@oracle.com) Principal Sales Consultant MySQL Session Agenda Requirements for

More information




