Academic Workflow for Research Repositories Using irods and Object Storage

Similar documents
LustreFS and its ongoing Evolution for High Performance Computing and Data Analysis Solutions

Simplifying Collaboration in the Cloud

Storage for HPC, HPDA and Machine Learning (ML)

The File Systems Evolution. Christian Bandulet, Sun Microsystems

REFERENCE ARCHITECTURE Quantum StorNext and Cloudian HyperStore

HPSS Treefrog Summary MARCH 1, 2018

Content Addressed Storage (CAS)

IBM Tivoli Storage Manager Version Introduction to Data Protection Solutions IBM

Leveraging Software-Defined Storage to Meet Today and Tomorrow s Infrastructure Demands

Modernize Your Backup and DR Using Actifio in AWS

HPSS Treefrog Introduction.

Web Object Scaler. WOS and IRODS Data Grid Dave Fellinger

IBM Spectrum Protect Version Introduction to Data Protection Solutions IBM

Introduction to Digital Archiving and IBM archive storage options

IME Infinite Memory Engine Technical Overview

DDN. DDN Updates. Data DirectNeworks Japan, Inc Shuichi Ihara. DDN Storage 2017 DDN Storage

Hybrid Cloud NAS for On-Premise and In-Cloud File Services with Panzura and Google Cloud Storage

Data Movement & Tiering with DMF 7

IBM Spectrum NAS, IBM Spectrum Scale and IBM Cloud Object Storage

Dell EMC Isilon with Cohesity DataProtect

Simple Data Protection for the Cloud Era

IBM řešení pro větší efektivitu ve správě dat - Store more with less

CDMI Support to Object Storage in Cloud K.M. Padmavathy Wipro Technologies

Copyright 2010 EMC Corporation. Do not Copy - All Rights Reserved.

Eight Tips for Better Archives. Eight Ways Cloudian Object Storage Benefits Archiving with Veritas Enterprise Vault

Executive Summary SOLE SOURCE JUSTIFICATION. Microsoft Integration

DSpace Fedora. Eprints Greenstone. Handle System

irods for Data Management and Archiving UGM 2018 Masilamani Subramanyam

Object storage platform How it can help? Martin Lenk, Specialist Senior Systems Engineer Unstructured Data Solution, Dell EMC

WHITE PAPER. DATA DEDUPLICATION BACKGROUND: A Technical White Paper

REFERENCE ARCHITECTURE. Rubrik and Nutanix

SwiftStack and python-swiftclient

GlusterFS and RHS for SysAdmins

IRODS USER GROUP 2014 CAMBRIDGE,MA John Burns. 6/25/14 Archive Analy3cs Solu3ons 1

HP D2D & STOREONCE OVERVIEW

CONFIGURATION GUIDE WHITE PAPER JULY ActiveScale. Family Configuration Guide

White paper ETERNUS CS800 Data Deduplication Background

TechTour. Winning Technology Roadmaps. Sig Knapstad. Cutting Edge

THE EMC ISILON STORY. Big Data In The Enterprise. Deya Bassiouni Isilon Regional Sales Manager Emerging Africa, Egypt & Lebanon.

StrongLink: Data and Storage Management Simplified

WOS COMPREHENSIVE OBJECT STORAGE EXECUTIVE SUMMARY WHITEPAPER. Collaborate Distribute Archive

Data Protection Modernization: Meeting the Challenges of a Changing IT Landscape

Elastic Cloud Storage (ECS)

KEYCLOUD BACKUP AND RECOVERY AS-A-SERVICE (BRAAS): A fully-managed backup and recovery solution for your mission critical data

Shared Object-Based Storage and the HPC Data Center

MQ High Availability and Disaster Recovery Implementation scenarios

Using Cohesity with Amazon Web Services (AWS)

Cohesity Flash Protect for Pure FlashBlade: Simple, Scalable Data Protection

The Evolution of File Systems

Rio-2 Hybrid Backup Server

HYCU and ExaGrid Hyper-converged Backup for Nutanix

How To Guide: Long Term Archive for Rubrik. Using SwiftStack Storage as a Long Term Archive for Rubrik

Effizientes Speichern von Cold-Data

The Future of Archiving

Solution Brief: Commvault HyperScale Software

Hedvig as backup target for Veeam

Data Movement & Storage Using the Data Capacitor Filesystem

Metadaten Workshop 26./27. März 2007 Göttingen. Chimera. a new grid enabled name-space service. Martin Radicke. Tigran Mkrtchyan

Information Infrastructure Forum

Whitepaper: Back Up SAP HANA and SUSE Linux Enterprise Server with SEP sesam. Copyright 2014 SEP

Cat Herding. Why It s Time for a Millennial Approach to Storage. Cloud Expo East Western Digital Corporation All rights reserved 01/25/2016

Omneon MediaGrid Technical Overview

IT Certification Exams Provider! Weofferfreeupdateserviceforoneyear! h ps://

General Purpose Storage Servers

XenData Product Brief: SX-550 Series Servers for LTO Archives

DDN About Us Solving Large Enterprise and Web Scale Challenges

Scale-out Data Deduplication Architecture

Policy Based Distributed Data Management Systems

Nasuni UniFS a True Global File System

EMC Forum 2014 EMC ViPR and ECS: A Lap Around Software-Defined Services. Magnus Nilsson Blog: purevirtual.

EMC DATA DOMAIN PRODUCT OvERvIEW

Exam : Title : Storage Sales V2. Version : Demo

Hybrid Backup & Disaster Recovery. Back Up SAP HANA and SUSE Linux Enterprise Server with SEP sesam

EMC Data Domain for Archiving Are You Kidding?

Best Practices in Designing Cloud Storage based Archival solution Sreenidhi Iyangar & Jim Rice EMC Corporation

Configuring IBM Spectrum Protect for IBM Spectrum Scale Active File Management

IBM Active Cloud Engine/Active File Management. Kalyan Gunda

50 TB. Traditional Storage + Data Protection Architecture. StorSimple Cloud-integrated Storage. Traditional CapEx: $375K Support: $75K per Year

XtreemStore A SCALABLE STORAGE MANAGEMENT SOFTWARE WITHOUT LIMITS YOUR DATA. YOUR CONTROL

Securing Data-at-Rest

Backup and archiving need not to create headaches new pain relievers are around

Backup & Recovery on AWS

Campaign Storage. Peter Braam Co-founder & CEO Campaign Storage

Scale-Out Architectures for Secondary Storage

Optimizing and Managing File Storage in Windows Environments

IBM TS4300 with IBM Spectrum Storage - The Perfect Match -

IBM Storwize V7000 Unified

A national approach for storage scale-out scenarios based on irods

The Future of Storage

Hyper-converged Secondary Storage for Backup with Deduplication Q & A. The impact of data deduplication on the backup process

Data Management: the What, When and How

ORACLE RMAN DESIGN BEST PRACTICES WITH PANZURA QUICKSILVER CLOUD STORAGE CONTROLLERS

Enterprise Vault Whitepaper Enterprise Vault Integration with Veritas Products

Virtualizing Oracle on VMware

Real-time Protection for Microsoft Hyper-V

XenData Product Brief: XenData6 Server Software

SCS Distributed File System Service Proposal

A Close-up Look at Potential Future Enhancements in Tivoli Storage Manager

Technical Overview. Access control lists define the users, groups, and roles that can access content as well as the operations that can be performed.

Effective Planning and Use of IBM Spectrum Protect Container Storage Pools and Data Deduplication

Transcription:

1 Academic Workflow for Research Repositories Using irods and Object Storage 2016 irods User s Group Meeting 9 June 2016 Randall Splinter, Ph.D. HPC Research Computing Solutions Architect RSplinter@ 770.633.2994

2 Agenda Introduction to the Problem HPC Workflows The Problem of Long Term Archiving Object Store to the Rescue How irods Enables Object Storage Why irods with DDN WOS is a Superior Solution for Research Repositories A Case Study

3 Introduction to the Problem The Problem of Collaboration NAS technologies (NFS, CIFS) are local o They don t tend to scale well over WAN distances But collaborators are frequently widely separated in distance o How to enable researchers to share data without administrative overhead securely FTP Typically, only system administrators can control the ACLs on NFS/CIFS mounts o Security Tape o Yech

4 HPC Workflows Ingest data from a source (Analysis of data) Pre-analysis on low-end storage Move cleaned up data to a PFS and compute After full analysis data is moved to long term storage o Can this be automated? Yes, with irods. No ingest (Pure simulation) Compute models are run on a compute cluster with PFS After full analysis data is moved to long term storage o Again automation is key. Data sets are exploding!

5 The Problem of Long Term Archiving Data must be secure From deletion (accident or deliberate) Loss from theft o Security hacks o Faculty or students leave and take IP with them Must satisfy regulatory restrictions HIPAA, for instance Changing hardware standards In particular tape standards Hardware availability Spinning media is mechanical and will not last forever

6 Object Store to the Rescue All Object stores provide a way to replicate data over large distances Some more effectively than others o Provides a way to effectively share data over WAN scales Most object stores were designed for cloud storage o Security has always been important o Ease of data sharing has been important This now enables more effective data sharing and data security than with traditional storage solutions at price points that traditional NAS systems cannot approach.

7 How irods Enables Object Storage irods is a very effective middleware layer for accessing multiple storage resources irods handles the security and database management of the ingested data Provides a powerful metadata search capability for ingested data Provides a rules engine for the processing of incoming data and the manipulation of data on the back-end For instance, o Data can be moved to slower storage resources as they age or another criteria is met (Essentially HSM) o Data can be secured from removal, editing or modification based upon criteria using the null chmod Retention policies! o Anything else?

8 DDN WOS: Key Features Exabyte Scalability Create virtually limitless data repositories, non-disruptively seamlessly scale to over an exabyte of capacity Latency-Aware Access Manager WOS intelligently makes decisions on the best geographies to get from based upon location access load and latency Federated, Global Namespace Locally or across multiple geographies with smart policies for performance and/or disaster recovery on a per-object basis Flexible Data Protection Select multiple policy-driven data protection schemes to meet application, workflow and disaster recovery requirements Self-healing Architecture No hard tie between physical disks and data. Failed drives are recovered through dispersed data placement rebuilds happen at read, not write, speed rebuilding only data. Fully-Integrated Object Storage Appliance WOS7000, 60 Drives in 4U, with 1 or 2 object storage servers per appliance Pure Object Storage Formats drives with custom WOS disk file system, no Linux file I/Os, no fragmentation, fully contiguous object read and write operations for maximum disk efficiency User Defined Metadata and Metadata Search Applications can assign their own metadata via object storage API, WOS now also supports parallel search of WOS user metadata

9 Why irods with DDN WOS is a Superior Solution for Repositories Ease of Scalability with WOS Ease of administration Once rules are tested and in place the system can be managed with a minimum of administrative overhead Automating workflows to guarantee consistency and reproducibility in the science that is produced

10 Why irods with DDN WOS is a Superior Solution for Repositories Ease of auditing for both usage and back charging and for maintaining adequate data security compliance DDN WOS makes remote replication simple and provides a straightforward way to manage DR systems

11 Why irods with WOS is a Superior Solution for Repositories Central to any repository is the ability to add metadata tags and search metadata. irods has extensive abilities to do that Significantly better than any competing options # imeta add d filename Date 2 Feb 2016 # imeta ls d filename AVUs defined for dataobj filename: attribute: Meta1 value: hello units: --- attribute: Date value: 2 Feb 2016 units: # imeta rm d filename Meta1 hello

12 A Case Study Hrothgar Compute Cluster Lustre Filesystem Any statements or representatio ns around future events are subject to change.

13 Thank You! Keep in touch with us sales@ 9351 Deering Avenue Chatsworth, CA 91311 Questions? @ddn_limitless 1.800.837.2298 1.818.700.4000 company/datadirect-networks.