Remote Direct Storage Management for Exa-Scale Storage

Dong-Oh Kim, Myung-Hoon Cha, Hong-Yeon Kim

Storage System Research Team, High Performance Computing Research Department,
Electronics and Telecommunications Research Institute,
218 Gajeong-ro, Yuseong-gu, Daejeon, 34129, Korea
{dokim, mhcha, kimhy}@etri.re.kr

Abstract. Recently, storage systems have grown steadily larger in order to store large amounts of data, and most storage research focuses on raising capacity and bandwidth. However, efficient file management is becoming just as important in operating Exa-Scale Storage. In this paper, we present Remote Direct Storage Management (RDSM) for Exa-Scale Storage. RDSM allows users to easily manage server-side files and to easily use storage-specific functions. Using RDSM, file copy is on average 30% faster than cp and file movement is on average 240 times faster than mv on Linux.

Keywords: Exa-Scale Storage, file management, distributed file system, FUSE-based file system, client utility

1 Introduction

Recently, the need for Exa-Scale Storage has grown along with the demand for high-capacity storage. Building Exa-Scale Storage, however, raises many problems, including file system, network, and power issues [1,2]. Most storage research focuses on raising capacity and bandwidth, but if file management is handled inefficiently, it wastes most of the resources of an Exa-Scale Storage system. Efficient file management is therefore becoming even more important for Exa-Scale Storage operations.

Exa-Scale Storage will have far more volumes than Peta-Scale Storage, and those volumes may be used simultaneously by a variety of applications and many users. Exa-Scale Storage will also contain diverse storage devices and networks as technology advances [2,3]. In such a complex environment, management costs vary significantly with how file management is performed, and the processing time and cost of a user application vary with the processing method [4-7].

In this paper, we present Remote Direct Storage Management (RDSM) for Exa-Scale Storage. RDSM allows a client application to manage files directly in the storage system: it converts external instructions into internal storage instructions so that they can be processed efficiently. In addition, RDSM allows users to easily use storage-specific functions, supporting more efficient storage utilization.

The remainder of this paper is organized as follows. Section 2 describes the concept of RDSM. Section 3 explains the implementation of RDSM in MAHA-FS. Section 4 examines the performance evaluation results of RDSM. Lastly, the conclusion is presented in Section 5.

2 RDSM

In this section, we describe the concept of RDSM. RDSM provides a way for the client to have files processed effectively inside the storage system itself. Figure 1 shows the process of moving files between volumes on Linux, and Figure 2 shows the process of moving files between volumes using RDSM.

Fig. 1. Basic process of moving files

Fig. 2. Process of moving files using RDSM

As shown in Figure 1, because the file movement is performed at the client level, processing is slow and consumes a significant amount of resources (a minimal sketch of this baseline appears at the end of this section). RDSM instead provides a way to manage files in the storage directly and remotely: as shown in Figure 2, the RDSM engine receives a command and performs the file movement directly between the volumes.

The RDSM engine is composed of the RDSM command and the RDSM manager. An RDSM command is a user-defined command used to call an internal management API of the Exa-Scale Storage. RDSM commands are transmitted to the RDSM manager of the Exa-Scale Storage server through the FUSE architecture. The RDSM manager interprets each received request (an RDSM command), verifies that the command and its parameters are valid, and asks the RDSM worker to perform the command.

RDSM enables this remote control through the standard POSIX API rather than through a separate interface, so an RDSM application can be created without kernel recompilation.
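To make the baseline of Figure 1 concrete, the following minimal C sketch is our illustration, not part of RDSM or MAHA-FS. It moves a file across volumes entirely at the client, roughly the way mv falls back to copy-and-delete when rename(2) cannot cross volume boundaries, so every byte travels from the source volume through client memory to the destination volume.

    #include <fcntl.h>
    #include <unistd.h>

    /* Illustrative baseline only: move a file across volumes at the
     * client level by pulling all data through client memory, then
     * unlinking the source. This is the data path RDSM avoids by
     * moving the file inside the storage system. */
    static int client_level_move(const char *src, const char *dst)
    {
        char buf[1 << 16];              /* 64 KiB copy buffer (arbitrary) */
        ssize_t n;
        int in = open(src, O_RDONLY);
        if (in < 0)
            return -1;
        int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (out < 0) {
            close(in);
            return -1;
        }
        while ((n = read(in, buf, sizeof buf)) > 0) {
            /* data: source volume -> client -> destination volume */
            if (write(out, buf, (size_t)n) != n) {
                n = -1;
                break;
            }
        }
        close(in);
        close(out);
        if (n < 0)
            return -1;
        return unlink(src);             /* delete source after full copy */
    }

Every read and write here crosses the network between the client and the storage, which is exactly the overhead that performing the move inside the storage eliminates.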

3 Development of RDSM

In this section, we describe the implementation of RDSM in MAHA-FS. MAHA-FS, which is similar to HDFS [8] and GFS [9], is a FUSE-based large-scale distributed file system built from thousands of commodity servers for HPC (High Performance Computing) environments [10]. MAHA-FS is the HPC version of GLORY-FS [11] and was developed by ETRI; GLORY-FS is a FUSE-based large-scale distributed file system used in cloud computing. MAHA-FS is composed of an MDS (Metadata Server), multiple DSs (Data Servers), multiple FUSE clients, and multiple utilities. In particular, MAHA-FS can combine different types of disks, such as SSDs (Solid-State Drives), HDDs (Hard Disk Drives), and MAIDs (Massive Arrays of Idle Disks).

MAHA-FS performs file management according to the requested RDSM command. Figure 3 shows the system architecture of RDSM in MAHA-FS.

Fig. 3. Architecture of RDSM in MAHA-FS

As shown in Figure 3, user applications and the RDSM utility call RDSM commands through the POSIX API. Each request is forwarded to the RDSM manager of the MDS in MAHA-FS via the FUSE clients. An RDSM command is a user-defined command used to call an internal management API of the Exa-Scale Storage. Table 1 shows examples of RDSM commands in MAHA-FS.

Table 1. RDSM commands in MAHA-FS

  RDSM command   parameter 1            parameter 2
  maha_cp        <source file info.>    <destination file info.>
  maha_mv        <source file info.>    <destination file info.>
  set_disk       <source file info.>    ssd | hdd | maid

maha_cp copies a file directly from <source file info.> to <destination file info.>. maha_mv moves a file directly from <source file info.> to <destination file info.>. set_disk is a MAHA-FS-specific command that migrates a file from <source file info.> to the specified disk type; MAHA-FS supports three disk types.

The RDSM manager analyzes each received request against the pre-defined RDSM commands: it interprets the request, verifies that the command and its parameters are valid, and asks the RDSM worker to perform the file management (a dispatch sketch follows below).
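The paper describes the manager's interpret-verify-dispatch behavior but not its internals. The sketch below is a hypothetical rendering of that step: the struct rdsm_cmd type, the worker_* handler names, and the stub bodies are our assumptions, with the command table taken from Table 1.

    #include <stddef.h>
    #include <string.h>

    /* Hypothetical worker entry points; stubs stand in for the real
     * MAHA-FS internals, which the paper does not show. */
    static int worker_cp(const char *s, const char *d)       { (void)s; (void)d; return 0; }
    static int worker_mv(const char *s, const char *d)       { (void)s; (void)d; return 0; }
    static int worker_set_disk(const char *s, const char *t) { (void)s; (void)t; return 0; }

    /* One entry per pre-defined RDSM command (cf. Table 1). */
    struct rdsm_cmd {
        const char *name;                       /* e.g. "maha_cp"           */
        int         nargs;                      /* expected parameter count */
        int       (*handler)(const char *, const char *);
    };

    static const struct rdsm_cmd cmd_table[] = {
        { "maha_cp",  2, worker_cp       },
        { "maha_mv",  2, worker_mv       },
        { "set_disk", 2, worker_set_disk },
    };

    /* Interpret a request, verify the command and its parameters, and
     * hand valid work to the RDSM worker; reject everything else. */
    static int rdsm_manager_dispatch(const char *cmd, int nargs,
                                     const char *arg1, const char *arg2)
    {
        for (size_t i = 0; i < sizeof cmd_table / sizeof cmd_table[0]; i++) {
            if (strcmp(cmd, cmd_table[i].name) == 0) {
                if (nargs != cmd_table[i].nargs || arg1 == NULL || arg2 == NULL)
                    return -1;                  /* invalid parameters */
                return cmd_table[i].handler(arg1, arg2);
            }
        }
        return -1;                              /* not a pre-defined command */
    }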

The RDSM worker processes each command by calling a function of the RDSM command library or a utility of MAHA-FS. The RDSM command library consists of functions that call internal MAHA-FS functions or process a given command. In the RDSM worker, maha_cp and maha_mv are processed by calling the appropriate functions of the RDSM command library, while set_disk is processed by calling the migration utility.

For example, to move the file "test.dat" onto an SSD, an application simply issues the POSIX call setxattr("test.dat", "set_disk", "ssd", 3, 3). When the last parameter (flags) of setxattr is 3, the FUSE client treats the call as an RDSM command (see the sketch at the end of this section). Figures 4 and 5 show the information of the file before and after running the set_disk command with the MAHA-FS utility.

Fig. 4. File information before set_disk

Fig. 5. File information after set_disk

The bottom of Figures 4 and 5 shows the location information of the file's chunks. As shown in Figure 4, a chunk of the file is stored on the HDD with id 7a7dd101; as shown in Figure 5, the chunk is stored on the SSD with id 6a7defa2 after running set_disk.
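As a complete example of the call above, the sketch below wraps the exact setxattr invocation quoted in the text with basic error handling; per the paper, the flags value 3 is what makes the FUSE client treat the call as an RDSM command. The surrounding program is our addition.

    #include <stdio.h>
    #include <sys/xattr.h>

    /* Ask MAHA-FS to migrate "test.dat" onto an SSD via the RDSM
     * set_disk command. name = RDSM command, value = disk type,
     * size = strlen("ssd") = 3, and flags = 3 marks an RDSM command
     * (per the paper) rather than a normal xattr update. */
    int main(void)
    {
        if (setxattr("test.dat", "set_disk", "ssd", 3, 3) != 0) {
            perror("set_disk via setxattr failed");
            return 1;
        }
        printf("migration of test.dat to SSD requested\n");
        return 0;
    }

The paper does not spell out the corresponding calls for maha_cp and maha_mv; the RDSM_cp and RDSM_mv utilities evaluated in Section 4 wrap those commands.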

4 Performance Evaluation

In this section, we verify the performance of RDSM through experiments. The evaluation used 1 MDS, 5 DS, and 1 client node. Each node has two Intel Xeon E5-2609 2.4 GHz CPUs and 32 GB of memory, and each DS node has 8 HDDs. Every node runs Red Hat Enterprise Linux 6.2 (Linux 2.6.32-220.el6.x86_64) with FUSE 2.8.3-4.el6, and the file system is MAHA-FS.

We compare cp with RDSM_cp and mv with RDSM_mv. The cp and mv applications are those provided by Linux; RDSM_cp and RDSM_mv are simple utilities that call maha_cp and maha_mv from Table 1. <source file info.> and <destination file info.> each specify a file on a different volume.

Figure 6 shows the execution time of cp, mv, RDSM_cp, and RDSM_mv as the file size changes, with 3 DS. Figure 7 shows the execution time of cp, mv, RDSM_cp, and RDSM_mv when processing a 4 GB file as the number of DS changes.

Fig. 6. Execution time as the file size changes

Fig. 7. Execution time as the number of DS changes

As shown in Figure 6, RDSM_cp is on average 30% faster than cp, and RDSM_mv is on average 240 times faster than mv. RDSM_cp eliminates cp's client network overhead, and RDSM_mv eliminates mv's data movement between volumes through the client. As shown in Figure 7, the execution time decreases as the number of DS increases; RDSM_cp is up to 47% faster than cp, and RDSM_mv is up to 370 times faster than mv.

5 Conclusion

Efficient file processing has become more important in Exa-Scale environments, so we presented RDSM, a method for directly and remotely managing files in Exa-Scale Storage, and implemented the RDSM manager in MAHA-FS. Using RDSM, file copy is up to 47% faster than cp and file movement is up to 370 times faster than mv on Linux. The biggest advantage of RDSM is that it lets a client easily call the administrative functions of the server; in this way, an RDSM user can manage files efficiently and easily use the various storage-specific functions of the storage system.

In future work, an effective file transfer method between the client and Exa-Scale Storage that utilizes RDSM should be studied, along with a way to minimize unwanted data movement during I/O processing in client applications.

Acknowledgments. This work was supported by the Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIP) (No. R0126-15-1082, Management of Developing ICBMS (IoT, Cloud, Bigdata, Mobile, Security) Core Technologies and Development of Exascale Cloud Storage Technology).

References

1. Kunkel, J. M., Kuhn, M., Ludwig, T.: Exascale Storage Systems - An Analytical Study of Expenses. Supercomputing Frontiers and Innovations, vol. 1, issue 1, pp. 116--134 (2014)
2. Nadkarni, A.: EMC Elastic Cloud Storage - Blueprint for Exascale Storage. White paper, EMC (2016)
3. Aloisio, G., Fiore, S.: Towards Exascale Distributed Data Management. International Journal of High Performance Computing Applications, vol. 23, issue 4, pp. 398--400 (2009)
4. Dreyfus, E.: FUSE and beyond: bridging file systems. In: Proceedings of EuroBSDCon, pp. 1--14. Sofia (2014)
5. FUSE: Filesystem in Userspace, http://fuse.sourceforge.net
6. Ishiguro, S., Murakami, J., Oyama, Y., Tatebe, O.: Optimizing Local File Accesses for FUSE-Based Distributed Storage. In: Proceedings of the International Workshop on Data-Intensive Scalable Computing Systems (DISCS 12), pp. 760--765. IEEE (2012)
7. Rajgarhia, A., Gehani, A.: Performance and Extension of User Space File Systems. In: Proceedings of the ACM Symposium on Applied Computing (SAC 10), pp. 206--213. ACM Press, New York (2010)
8. HDFS: Hadoop Distributed File System, http://hadoop.apache.org/
9. Ghemawat, S., Gobioff, H., Leung, S.: The Google File System. In: 19th ACM Symposium on Operating Systems Principles (SOSP 03), pp. 29--43. ACM Press, New York (2003)
10. Kim, Y. C., Kim, D. O., Kim, H. Y., Kim, Y. K., Choi, W.: MAHA-FS: A Distributed File System for High Performance Metadata Processing and Random IO. KIPS Transactions on Software and Data Engineering, vol. 2, issue 2, pp. 91--96 (2013)
11. Min, Y. S., Jin, K. S., Kim, H. Y., Kim, Y. K.: A Trend to Distributed File Systems for Cloud Computing. Electronics and Telecommunications Trends, vol. 24, issue 4, pp. 55--68 (2009)