Chelonia: a lightweight self-healing distributed storage


Chelonia: a lightweight self-healing distributed storage

Zsombor Nagy (zsombor@niif.hu)
Salman Toor (salman.toor@it.uu.se)
Jon Kerr Nilsen (j.k.nilsen@fys.uio.no)

Motivation

How to easily...
- Create a storage resource from available disk space on any computer
- Connect storage resources through the Internet
- Provide an easy-to-understand way for users to upload and share their files
- Use these files in a Grid environment

Design

The system should be...
- self-healing
- easy to deploy
- easy to maintain
- lightweight
- free of single points of failure

Decisions

- File replication
- Berkeley DB HA (high availability)
- HTTP(S) file transfer
- SOAP messaging
- X.509 certificates

NORDUGRID: Grid Solution for Wide Area Computing and Data Handling

- Chelonia was developed by the KnowARC project
- It will be maintained by the NorduGrid Collaboration
- Runs in NorduGrid ARC's web service container
- Written in Python
- Runs on Linux and Mac OS X (and Windows soon)

The Chelonia Cloud

A storage cloud for anyone:
- Can be created by anyone
- Can be shared with anyone
- Can be used by anyone

Global namespace

- Global hierarchical namespace
- Files are organized in collections
- All users see the same tree
- Users can use logical names
- Paths in the Chelonia namespace are similar to paths in a regular local filesystem

Replication

- Files are replicated
- The user specifies how many replicas are needed
- Chelonia itself takes care of the replication
- It is easy to create a new storage node
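The self-healing replication described above can be sketched as a small planning step: compare the replicas that are alive against the requested count, then create or delete replicas to close the gap. This is an illustrative sketch only; the function and parameter names are hypothetical and not Chelonia's actual API.

```python
def plan_replication(alive, needed, shepherds):
    """Return (create_on, delete_from) so that a file ends up with
    exactly `needed` replicas.

    alive     -- set of storage nodes currently holding a live replica
    needed    -- replica count requested by the user
    shepherds -- all known storage nodes, in a deterministic order
    """
    have = [s for s in shepherds if s in alive]
    if len(have) >= needed:
        # Too many replicas: mark the surplus ones for deletion.
        return [], have[needed:]
    # Too few replicas: create new ones on nodes without a copy.
    spare = [s for s in shepherds if s not in alive]
    return spare[:needed - len(have)], []
```

A periodic run of a check like this is what lets the system recover automatically when a storage node disappears: the missing replicas simply show up in `create_on` on the next pass.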

Access control

- Access policies on files and collections
- Access can be granted to individual users
- In a Grid environment, access can be granted to entire virtual organizations
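The two grant paths above (individual users, whole virtual organizations) can be sketched as a single policy check. The policy structure and function name here are illustrative assumptions, not Chelonia's real access-control code.

```python
def is_allowed(policy, user, user_vos=frozenset()):
    """Return True if `user` may access the file or collection.

    policy   -- {'users': set of user identities (e.g. X.509 DNs),
                 'vos':   set of virtual organization names}
    user     -- identity of the requesting user
    user_vos -- VOs the user belongs to (relevant in a Grid environment)
    """
    # Grant if the user is listed directly, or via any of their VOs.
    return user in policy["users"] or bool(set(user_vos) & policy["vos"])
```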

Client tools

Several ways to access Chelonia:
- Command line interface
- FUSE module
- In an ARC-enabled Grid environment:
  - through the job specification file
  - with the ARC client tools

FUSE module

- We have a FUSE module to mount Chelonia as a local filesystem
- You can upload and download files and create collections (directories) with your OS's CLI and GUI tools (e.g. drag and drop)
- Works on Linux and Mac OS X

Components of Chelonia

Chelonia consists of four services:
- the Bartender
- the Shepherd
- the Librarian
- the A-Hash

There can be multiple instances of all services.

The Shepherd

- Manages a storage node with the actual file data
- Replication requires more storage nodes, each managed by a Shepherd
- When a Shepherd is added, replicas are automatically created on it (if needed)

The Librarian and the A-Hash

- The Librarian manages the metadata
- It uses the A-Hash to store the data
- Both the metadata and the Librarians can be replicated, giving:
  - better fault tolerance
  - better load balancing

The Bartender

- Serves the users by negotiating with:
  - the Librarian, to query files and collections
  - the Shepherd, to upload and download files
- Running more than one Bartender eliminates the single point of failure
- Users can contact any Bartender and get the same result

Downloading

(Sequence diagram: USER, Bartender (B), Librarian (L), A-Hash (A-H), Shepherds (S))
1. getfile(ln): the user asks a Bartender for the file by its logical name
2. traverse(ln): the Bartender asks the Librarian to resolve the logical name
3. The Librarian returns the file's metadata
4. get(referenceID): the Bartender asks a Shepherd holding a replica
5. The Shepherd returns a transfer URL (TURL)
6. The Bartender passes the TURL to the user
7. The user downloads the file directly from the storage node
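The download negotiation above can be modeled as three tiny cooperating services. This is a toy model for illustration; the class names follow the slide, but the method signatures and metadata layout are assumptions, not Chelonia's actual interfaces.

```python
class Librarian:
    """Resolves logical names to file metadata (steps 2-3)."""
    def __init__(self, metadata):
        self.metadata = metadata          # logical name -> metadata dict
    def traverse(self, ln):
        return self.metadata[ln]

class Shepherd:
    """Storage node front-end that hands out transfer URLs (steps 4-5)."""
    def __init__(self, node_url):
        self.node_url = node_url
    def get(self, reference_id):
        return f"{self.node_url}/{reference_id}"   # the TURL

class Bartender:
    """User-facing service that ties the negotiation together."""
    def __init__(self, librarian, shepherds):
        self.librarian = librarian
        self.shepherds = shepherds        # name -> Shepherd
    def getfile(self, ln):                # step 1 in, step 6 out
        meta = self.librarian.traverse(ln)
        shepherd = self.shepherds[meta["shepherd"]]
        return shepherd.get(meta["reference_id"])
```

With this model, the user only ever talks to a Bartender, and the TURL it returns points straight at the storage node, so the actual file bytes (step 7) never pass through the Bartender.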

Uploading

(Sequence diagram: USER, Bartender (B), Librarian (L), A-Hash (A-H), Shepherds (S))
1. putfile(ln): the user asks a Bartender to upload a file
2. traverse(ln): the Bartender asks the Librarian to resolve the parent collection
3. The Librarian returns the parent's data
4. new(metadata): the Bartender creates a new metadata entry
5. The Librarian returns the new file's GUID
6. The Bartender adds an entry to the parent collection
7. The Bartender asks the Librarian for storage nodes
8. The Librarian returns the list of Shepherds
9. put(...): the Bartender asks a Shepherd to accept the file
10. The Shepherd registers the new replica
11. The Shepherd returns a transfer URL (TURL)
12. The Bartender passes the TURL to the user
13. The user uploads the file directly to the storage node
14. The Shepherd reports the state change
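The metadata side of the upload can be sketched the same way: a new GUID is minted, linked into the namespace, and a replica is registered before the user ever sends a byte. As before, this is an illustrative sketch; the names and structures are hypothetical, not Chelonia's real code.

```python
import uuid

class Librarian:
    """Minimal stand-in for the metadata service (illustrative only)."""
    def __init__(self):
        self.entries = {}   # GUID -> metadata
        self.tree = {}      # logical name -> GUID

    def new(self, metadata):
        guid = str(uuid.uuid4())          # step 5: GUID for the new file
        self.entries[guid] = metadata
        return guid

    def add_entry(self, ln, guid):        # step 6: link into the namespace
        self.tree[ln] = guid

    def register_replica(self, guid, node):   # step 10
        self.entries[guid].setdefault("replicas", []).append(node)

def putfile(librarian, shepherd_urls, ln):
    """Bartender-side view of upload steps 1-12, collapsed into one call."""
    guid = librarian.new({})              # step 4
    librarian.add_entry(ln, guid)         # step 6
    node = shepherd_urls[0]               # step 8: pick one Shepherd
    librarian.register_replica(guid, node)
    return f"{node}/{guid}"               # steps 11-12: TURL for the user
```

Note the ordering: the replica is registered while still being created, which is why the user's final upload (step 13) must be followed by a state-change report (step 14) before the replica counts as alive.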

Depth test

Width test

Multiple users

Number of replicas

Five Shepherds were used for this test.

User side: 10 files of size 100 MB start with 3 replicas each; the user then increases the number of replicas to 5, and later decreases it to 2. (Plot: number of files in the system, by replica state ALIVE / OFFLINE / THIRDWHEEL / CREATING, versus the time in seconds taken by the system to reach the requested replica count.)

System side: 10 files with 4 replicas each are in the system; one Shepherd goes offline for some time and then comes back online. (Plot: number of files in the system versus the time in seconds taken by the system to restore the needed replicas.)

Memory usage

8 days of running with lots of file uploads and downloads.

Future?

- Better clients?
- SRM? WebDAV?
- Encryption?
- AAI?
- Transfer protocols? Transfer security?
- Versioning? Resuming uploads?
- SOAP over HTTPS?
- Space reservation?

ClusterGrid

- Maintained by NIIF, Hungary
- Since 2002, connecting a few hundred PCs located at several universities in Hungary
- Previously used NIIF's own grid middleware
- Will soon use ARC and Chelonia

Chelonia and ClusterGrid

- Storage solutions will soon be purchased and deployed at chosen Hungarian universities
- We will deploy Chelonia storage nodes on these
- ClusterGrid users will be able to upload and download (input/output) files to the Chelonia cloud and refer to these files in their job descriptions

Questions / Demo

Thank You!

web: http://www.knowarc.eu/chelonia/
paper: http://arxiv.org/abs/1002.0712
video: http://www.youtube.com/watch?v=neuwzghhghc

Zsombor Nagy, NIIF (Hungary), zsombor@niif.hu