Remote Direct Storage Management for Exa-Scale Storage
Dong-Oh Kim, Myung-Hoon Cha, Hong-Yeon Kim
Storage System Research Team, High Performance Computing Research Department, Electronics and Telecommunications Research Institute, 218 Gajeong-ro, Yuseong-gu, Daejeon, 34129, Korea
{dokim, mhcha,

Abstract. Recently, storage systems have grown in size in order to hold large amounts of data. Most storage research focuses on raising capacity and bandwidth, but the efficiency of file management is becoming even more important in Exa-Scale Storage operations. In this paper, we present a method of Remote Direct Storage Management (RDSM) for Exa-Scale Storage. RDSM allows users to easily manage server-side files and to easily use storage-specific functions. With RDSM, file copy is 30% faster on average than cp, and file movement is 240 times faster on average than mv, in Linux.

Keywords: Exa-Scale Storage, file management, distributed file system, FUSE-based file system, client utility

1 Introduction

Recently, the need for Exa-Scale Storage has increased with the demand for high-capacity storage. However, building Exa-Scale Storage raises many problems, including file system, network, and power problems [1,2]. Most storage research focuses on raising capacity and bandwidth. However, if file management is processed inefficiently, most of the resources in Exa-Scale Storage are wasted on file management. The efficiency of file management is therefore becoming even more important in Exa-Scale Storage operations.

Exa-Scale Storage will have a larger number of volumes than Peta-Scale Storage, and these volumes can be used at the same time by a variety of applications and multiple users. Exa-Scale Storage also contains various storage devices and networks due to advances in technology [2,3].
In this complex environment, management costs vary significantly depending on how file management is performed: when a user runs an application, the processing time and processing cost depend on the processing method [4-7].

In this paper, we present a method of Remote Direct Storage Management (RDSM) for Exa-Scale Storage. RDSM allows a client application to manage storage for file management. That is, RDSM converts an external instruction into an internal storage instruction for efficient processing. In addition, RDSM allows users to easily use storage-specific functions, supporting more efficient storage utilization.

The remainder of this paper is organized as follows. Section 2 describes the concept of RDSM. Section 3 explains the implementation of RDSM in MAHA-FS. Section 4 examines the performance evaluation results of RDSM. Lastly, the conclusion is presented in Section 5.

ISSN: ASTL Copyright 2016 SERSC

2 RDSM

In this section, we describe the concept of RDSM. RDSM provides a method for the client to process files effectively by letting file processing be performed inside the storage. Figure 1 shows the process of moving files between volumes on Linux. Figure 2 shows the process of moving files between volumes using RDSM.

Fig. 1. Basic process of moving files

Fig. 2. Process of moving files using RDSM

As shown in Figure 1, because file movement is processed at the client level, it is slow and consumes a significant amount of resources. RDSM, in contrast, provides a method for directly managing files in the storage remotely. As shown in Figure 2, when RDSM is used, the RDSM engine receives the command and moves the file directly between the volumes.

The RDSM engine is composed of the RDSM command and the RDSM manager. RDSM commands are user-defined commands used to call the internal management API of Exa-Scale Storage. The RDSM commands are transmitted to the RDSM manager of the Exa-Scale Storage server according to the FUSE architecture. The RDSM manager interprets the received requests (RDSM commands), verifies that they contain appropriate commands and parameters, and asks the RDSM worker to perform the command. RDSM provides remote control through the POSIX API rather than through a separate interface, so an application can be built on RDSM without recompiling the kernel.

3 Development of RDSM

In this section, we describe the implementation of RDSM in MAHA-FS.
MAHA-FS, which is similar to HDFS [8] and GFS [9], is a FUSE-based large-scale distributed file system that uses thousands of commodity servers in HPC (High Performance Computing) environments [10]. MAHA-FS is the HPC version of GLORY-FS [11] and was developed by ETRI. GLORY-FS is a FUSE-based large-scale distributed file system used in cloud computing. MAHA-FS is composed of an MDS (Metadata Server), multiple DSs (Data Servers), multiple FUSE clients, and multiple utilities. In particular, MAHA-FS supports the combined use of different types of disks, such as SSDs (Solid-State Drives), HDDs (Hard Disk Drives), and MAIDs (Massive Arrays of Idle Disks).

MAHA-FS performs file management according to the requested RDSM command. Figure 3 shows the system architecture of RDSM in MAHA-FS.

Fig. 3. Architecture of RDSM in MAHA-FS

As shown in Figure 3, the user application and the RDSM utility call the RDSM command through the POSIX API. The request is forwarded to the RDSM manager of the MDS in MAHA-FS via the FUSE clients.

RDSM commands are user-defined commands used to call the internal management API of Exa-Scale Storage. Table 1 shows examples of RDSM commands in MAHA-FS.

Table 1. RDSM commands in MAHA-FS

RDSM command    parameter 1            parameter 2
maha_cp         <source file info.>    <destination file info.>
maha_mv         <source file info.>    <destination file info.>
set_disk        <source file info.>    ssd, hdd, or maid

maha_cp copies a file directly from <source file info.> to <destination file info.>. maha_mv moves a file directly from <source file info.> to <destination file info.>. set_disk is a command specific to MAHA-FS that migrates a file from <source file info.> to the specified disk type. MAHA-FS supports three disk types.

The RDSM manager analyzes the received requests against the pre-defined RDSM commands. It interprets each request, verifies that it contains an appropriate command and parameters, and asks the RDSM worker to perform the file management.
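The manager's check against Table 1 can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `validate_rdsm_command`, its return convention, and the rule that parameters must be non-empty are hypothetical, since the paper does not describe the manager's checks in detail. Only the three command names and the three disk types come from Table 1.

```python
from typing import Optional

# Command names and disk types as listed in Table 1.
RDSM_COMMANDS = ("maha_cp", "maha_mv", "set_disk")
DISK_TYPES = ("ssd", "hdd", "maid")


def validate_rdsm_command(command: str, param1: str, param2: str) -> Optional[str]:
    """Return None if the request looks valid, otherwise an error message.

    Hypothetical helper: it only mirrors the command/parameter pairs of
    Table 1 and is not the actual RDSM manager logic.
    """
    if command not in RDSM_COMMANDS:
        return f"unknown RDSM command: {command}"
    if not param1:
        return "missing source file info"
    if command == "set_disk":
        # For set_disk, parameter 2 must be one of the supported disk types.
        if param2 not in DISK_TYPES:
            return f"unknown disk type: {param2}"
    elif not param2:
        # maha_cp and maha_mv need a destination as parameter 2.
        return "missing destination file info"
    return None
```

In this sketch a request would be handed to the worker only when the function returns None; any other return value describes why the request was rejected.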
The RDSM worker processes the command by calling a function of the RDSM command library or a utility of MAHA-FS. The RDSM command library consists of a number of functions that either call an internal function of MAHA-FS or process a given command themselves. In the RDSM worker, maha_cp and maha_mv are processed by calling the appropriate functions in the RDSM command library, and set_disk is processed by calling the migration utility.

For example, to move the file "test.dat" onto the SSD, you can simply call the POSIX API: setxattr("test.dat", "set_disk", "ssd", 3, 3). If the last parameter (flags) of the setxattr function is 3, the FUSE client treats the call as an RDSM command. Figures 4 and 5 show the file information, as reported by the MAHA-FS utility, before and after running the set_disk command.

Fig. 4. File information before set_disk

Fig. 5. File information after set_disk

The bottom of Figure 4 and Figure 5 shows the location information of the chunk. As shown in Figure 4, the chunk of the file is stored on the HDD with id 7a7dd101. As shown in Figure 5, after running set_disk the chunk of the file is stored on the SSD with id 6a7defa2.

4 Performance Evaluation

In this section, we verify the performance of RDSM through experiments. The performance evaluation was conducted using 1 MDS, 5 DS, and 1 client node. Each node has two Intel Xeon E GHz CPUs and 32 GB of memory. Each DS node has 8 HDDs. On each node, the OS is "Red Hat Enterprise Linux 6.2, Linux el6.x86_64", FUSE is " el6", and the file system is MAHA-FS.

This paper compares cp with RDSM_cp, and mv with RDSM_mv. The cp and mv applications are provided by Linux. The RDSM_cp and RDSM_mv applications are simple utilities that call maha_cp and maha_mv from Table 1. <source file info.> and <destination file info.> each specify a file on a different volume. Figure 6 shows the execution time of cp, mv, RDSM_cp, and RDSM_mv according to file size with 3 DS. Figure 7 shows the execution time of cp, mv, RDSM_cp, and RDSM_mv when processing a 4 GB file as the number of DS changes.
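A utility such as RDSM_mv could be little more than a wrapper around the setxattr call shown in Section 3. The Python sketch below rests on assumptions: that maha_cp and maha_mv are passed the same way as set_disk (command as the attribute name, second parameter as the attribute value), and the names `RdsmRequest`, `build_request`, and `send_request` are hypothetical. Only the flags value 3 and the shape of the set_disk call come from the paper.

```python
import os
from typing import Callable, NamedTuple, Optional

# Flags value that, per Section 3, makes the FUSE client treat the
# setxattr call as an RDSM command.
RDSM_FLAG = 3


class RdsmRequest(NamedTuple):
    path: str     # <source file info.>
    name: str     # RDSM command, sent as the attribute name (assumption)
    value: bytes  # parameter 2, sent as the attribute value (assumption)
    flags: int


def build_request(command: str, source: str, param2: str) -> RdsmRequest:
    """Pack an RDSM command into setxattr-style arguments."""
    return RdsmRequest(source, command, param2.encode(), RDSM_FLAG)


def send_request(request: RdsmRequest,
                 setxattr: Optional[Callable[..., None]] = None) -> None:
    """Issue the request; os.setxattr (Linux only) is used by default."""
    if setxattr is None:
        setxattr = os.setxattr
    # Python derives the size argument of the C call from len(value).
    setxattr(request.path, request.name, request.value, request.flags)
```

Under these assumptions, `send_request(build_request("set_disk", "test.dat", "ssd"))` corresponds to the setxattr call from Section 3, and an RDSM_mv-style utility would pass "maha_mv" with a source and a destination instead. On a file system other than MAHA-FS the call is expected to fail, because the flags value 3 has no RDSM meaning there.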
Fig. 6. Execution time as the file size changes

Fig. 7. Execution time as the number of DS changes

As shown in Figure 6, RDSM_cp is 30% faster than cp on average, and RDSM_mv is 240 times faster than mv on average. By using RDSM, RDSM_cp eliminates the client network overhead of cp, and RDSM_mv eliminates the data movement between volumes that mv requires. As shown in Figure 7, the execution time is reduced by increasing the number of DS. RDSM_cp is up to 47% faster than cp, and RDSM_mv is up to 370 times faster than mv.

5 Conclusion

The efficient processing of files has become more important in Exa-Scale environments. We therefore presented RDSM as a method of directly managing files in Exa-Scale Storage remotely. The RDSM manager was implemented in MAHA-FS. With RDSM, file copy is up to 47% faster than cp, and file movement is up to 370 times faster than mv, in Linux. The biggest advantage of RDSM is that it lets the client easily call the administrative functions of the server. In this way, RDSM users can manage files efficiently and easily use the various storage-specific functions of the storage.

In the future, an effective file transfer method between the client and Exa-Scale Storage that utilizes RDSM needs to be studied. A way to minimize unwanted data movement during I/O processing in the client application is also required.

Acknowledgments. This work was supported by the Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIP) (No. R , Management of Developing ICBMS (IoT, Cloud, Bigdata, Mobile, Security) Core Technologies and Development of Exascale Cloud Storage Technology).
References

1. Kunkel, J. M., Kuhn, M., and Ludwig, T.: Exascale Storage Systems - An Analytical Study of Expenses 2. Characteristics of Future Systems. pp (2014)
2. Nadkarni, A.: EMC Elastic Cloud Storage - Blueprint for Exascale Storage. White paper, EMC (2016)
3. Aloisio, G., Fiore, S.: Towards Exascale Distributed Data Management. In: International Journal of High Performance Computing Applications archive, vol. 23, issue 4, pp (2009)
4. Dreyfus, E.: FUSE and beyond: bridging file systems. In: Proceeding of the EuroBSDcon, pp The Sofia (2014)
5. FUSE: Filesystem in Userspace,
6. Ishiguro, S., Murakami, J., Oyama, Y., Tatebe, O.: Optimizing Local File Accesses for FUSE-Based Distributed Storage. In: Proceedings of the International Workshop on Data-Intensive Scalable Computing Systems (DISCS 12), pp IEEE, (2012)
7. Rajgarhia, A., Gehani, A.: Performance and Extension of User Space File Systems. In: the ACM Symposium on Applied Computing (SAC 00), pp ACM Press, New York, (2010)
8. HDFS: Hadoop Distributed File System,
9. Ghemawat, S., Gobioff, H., and Leung, S.: The Google File System. In: 9th ACM Symposium on Operating Systems Principles (SOSP 03), pp ACM Press, New York, (2003)
10. Kim, Y. C., Kim, D. O., Kim, H. Y., Kim, Y. K., Choi, W.: MAHA-FS: A Distributed File System for High Performance Metadata Processing and Random IO. KIPS Transactions on Software and Data Engineering, vol. 2, issue 2, pp (2013)
11. Min, Y. S., Jin, K. S., Kim, H. Y., Kim, Y. K.: A Trend to Distributed File Systems for Cloud Computing. Electronics and Telecommunications Trends, vol. 24, issue 4, pp (2009)
More informationTowards High-Performance and Cost-Effective Distributed Storage Systems with Information Dispersal Algorithms
Towards High-Performance and Cost-Effective Distributed Storage Systems with Information Dispersal Algorithms Dongfang Zhao 1, Kent Burlingame 1,2, Corentin Debains 1, Pedro Alvarez-Tabio 1, Ioan Raicu
More informationA Case Study: Performance Evaluation of a DRAM-Based Solid State Disk
A Case Study: Performance Evaluation of a DRAM-Based Solid State Disk Hitoshi Oi The University of Aizu November 2, 2007 Japan-China Joint Workshop on Frontier of Computer Science and Technology (FCST)
More information18-hdfs-gfs.txt Thu Oct 27 10:05: Notes on Parallel File Systems: HDFS & GFS , Fall 2011 Carnegie Mellon University Randal E.
18-hdfs-gfs.txt Thu Oct 27 10:05:07 2011 1 Notes on Parallel File Systems: HDFS & GFS 15-440, Fall 2011 Carnegie Mellon University Randal E. Bryant References: Ghemawat, Gobioff, Leung, "The Google File
More informationGFS: The Google File System
GFS: The Google File System Brad Karp UCL Computer Science CS GZ03 / M030 24 th October 2014 Motivating Application: Google Crawl the whole web Store it all on one big disk Process users searches on one
More informationHDFS: Hadoop Distributed File System. CIS 612 Sunnie Chung
HDFS: Hadoop Distributed File System CIS 612 Sunnie Chung What is Big Data?? Bulk Amount Unstructured Introduction Lots of Applications which need to handle huge amount of data (in terms of 500+ TB per
More informationData Movement & Tiering with DMF 7
Data Movement & Tiering with DMF 7 Kirill Malkin Director of Engineering April 2019 Why Move or Tier Data? We wish we could keep everything in DRAM, but It s volatile It s expensive Data in Memory 2 Why
More informationDynamic processing slots scheduling for I/O intensive jobs of Hadoop MapReduce
Dynamic processing slots scheduling for I/O intensive jobs of Hadoop MapReduce Shiori KURAZUMI, Tomoaki TSUMURA, Shoichi SAITO and Hiroshi MATSUO Nagoya Institute of Technology Gokiso, Showa, Nagoya, Aichi,
More informationMAHA. - Supercomputing System for Bioinformatics
MAHA - Supercomputing System for Bioinformatics - 2013.01.29 Outline 1. MAHA HW 2. MAHA SW 3. MAHA Storage System 2 ETRI HPC R&D Area - Overview Research area Computing HW MAHA System HW - Rpeak : 0.3
More informationEXTRACT DATA IN LARGE DATABASE WITH HADOOP
International Journal of Advances in Engineering & Scientific Research (IJAESR) ISSN: 2349 3607 (Online), ISSN: 2349 4824 (Print) Download Full paper from : http://www.arseam.com/content/volume-1-issue-7-nov-2014-0
More informationThe Google File System
The Google File System Sanjay Ghemawat, Howard Gobioff and Shun Tak Leung Google* Shivesh Kumar Sharma fl4164@wayne.edu Fall 2015 004395771 Overview Google file system is a scalable distributed file system
More informationCampaign Storage. Peter Braam Co-founder & CEO Campaign Storage
Campaign Storage Peter Braam 2017-04 Co-founder & CEO Campaign Storage Contents Memory class storage & Campaign storage Object Storage Campaign Storage Search and Policy Management Data Movers & Servers
More informationDongjun Shin Samsung Electronics
2014.10.31. Dongjun Shin Samsung Electronics Contents 2 Background Understanding CPU behavior Experiments Improvement idea Revisiting Linux I/O stack Conclusion Background Definition 3 CPU bound A computer
More informationLevelDB-Raw: Eliminating File System Overhead for Optimizing Performance of LevelDB Engine
777 LevelDB-Raw: Eliminating File System Overhead for Optimizing Performance of LevelDB Engine Hak-Su Lim and Jin-Soo Kim *College of Info. & Comm. Engineering, Sungkyunkwan University, Korea {haksu.lim,
More informationBringing HyperScale Computing to the Enterprise. The need for Enterprises to overhaul their IT systems
Bringing HyperScale Computing to the Enterprise The need for Enterprises to overhaul their IT systems MSys: Corporate Overview Established In: 2007 Self-funded, profitable Over 350 employees Global Presence
More informationDesign and Implementation of HTML5 based SVM for Integrating Runtime of Smart Devices and Web Environments
Vol.8, No.3 (2014), pp.223-234 http://dx.doi.org/10.14257/ijsh.2014.8.3.21 Design and Implementation of HTML5 based SVM for Integrating Runtime of Smart Devices and Web Environments Yunsik Son 1, Seman
More informationInterrupt response times on Arduino and Raspberry Pi. Tomaž Šolc
Interrupt response times on Arduino and Raspberry Pi Tomaž Šolc tomaz.solc@ijs.si Introduction Full-featured Linux-based systems are replacing microcontrollers in some embedded applications for low volumes,
More informationThe Fusion Distributed File System
Slide 1 / 44 The Fusion Distributed File System Dongfang Zhao February 2015 Slide 2 / 44 Outline Introduction FusionFS System Architecture Metadata Management Data Movement Implementation Details Unique
More informationRed Hat Enterprise 7 Beta File Systems
Red Hat Enterprise 7 Beta File Systems New Scale, Speed & Features Ric Wheeler Director Red Hat Kernel File & Storage Team Red Hat Storage Engineering Agenda Red Hat Enterprise Linux 7 Storage Features
More informationJumbo: Beyond MapReduce for Workload Balancing
Jumbo: Beyond Reduce for Workload Balancing Sven Groot Supervised by Masaru Kitsuregawa Institute of Industrial Science, The University of Tokyo 4-6-1 Komaba Meguro-ku, Tokyo 153-8505, Japan sgroot@tkl.iis.u-tokyo.ac.jp
More informationDynamic Translator-Based Virtualization
Dynamic Translator-Based Virtualization Yuki Kinebuchi 1,HidenariKoshimae 1,ShuichiOikawa 2, and Tatsuo Nakajima 1 1 Department of Computer Science, Waseda University {yukikine, hide, tatsuo}@dcl.info.waseda.ac.jp
More informationChapter 14 HARD: Host-Level Address Remapping Driver for Solid-State Disk
Chapter 14 HARD: Host-Level Address Remapping Driver for Solid-State Disk Young-Joon Jang and Dongkun Shin Abstract Recent SSDs use parallel architectures with multi-channel and multiway, and manages multiple
More informationHadoop File System S L I D E S M O D I F I E D F R O M P R E S E N T A T I O N B Y B. R A M A M U R T H Y 11/15/2017
Hadoop File System 1 S L I D E S M O D I F I E D F R O M P R E S E N T A T I O N B Y B. R A M A M U R T H Y Moving Computation is Cheaper than Moving Data Motivation: Big Data! What is BigData? - Google
More informationCLIENT DATA NODE NAME NODE
Volume 6, Issue 12, December 2016 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Efficiency
More informationImplementation of Smart Car Infotainment System including Black Box and Self-diagnosis Function
, pp.267-274 http://dx.doi.org/10.14257/ijseia.2014.8.1.23 Implementation of Smart Car Infotainment System including Black Box and Self-diagnosis Function Minyoung Kim 1, Jae-Hyun Nam 2 and Jong-Wook Jang
More information