CASTORFS - A filesystem to access CASTOR


Alexander Mazurov and Niko Neufeld 2010 J. Phys.: Conf. Ser. 219 052023

Alexander Mazurov and Niko Neufeld
LHCb Online Team, CERN, Geneva 23, Switzerland
E-mail: alexander.mazurov@cern.ch

Abstract. CASTOR provides a powerful and rich interface for managing files and pools of files backed by tape storage. The API is modelled very closely on that of a POSIX filesystem, with part of the actual I/O handled by the rfio library. Although the API is very close to POSIX, it remains a separate interface, which unfortunately makes it impossible to use standard tools and scripts straight away. This is particularly inconvenient for applications written in languages other than C/C++, as is frequently the case for web applications. Until now the only recourse was to invoke command-line utilities and parse their output, which is clearly a kludge. We have implemented a complete POSIX filesystem to access CASTOR using FUSE (Filesystem in Userspace) and have successfully tested and used it on SLC4 and SLC5 (both 32 and 64 bit). We call it CASTORFS. In this paper we present its architecture and implementation, with emphasis on performance and caching aspects.

1. Introduction
In this article we describe the architecture, implementation and deployment of CASTORFS [1], a filesystem to access CASTOR.

2. CASTOR
CASTOR [2] stands for the CERN Advanced STORage manager. It is a hierarchical storage management (HSM) system developed at CERN, used to store physics production files and user files; it manages disk caches and the data on tertiary storage (tapes). Files can be stored, listed, retrieved and accessed in CASTOR using command-line tools or applications built on top of various data transfer protocols. Example of using the command-line tools:

rfdir /castor/cern.ch - list directory contents
rfcp /castor/path/to/file /tmp - copy a file from CASTOR to the local filesystem
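For applications written in C/C++, the same files can be read programmatically through the RFIO client library on which these command-line tools are built. The following is a minimal sketch, not taken from the paper, assuming the conventional RFIO entry points (rfio_open, rfio_read, rfio_close, rfio_serror) declared in the CASTOR shift.h header and linked with the shift library; the path is illustrative only.

/* Minimal sketch: read a CASTOR file via the RFIO client library.
 * Assumes the classic CASTOR RFIO API from shift.h; build against the
 * CASTOR client development package (e.g. cc rfio_read.c -lshift). */
#include <stdio.h>
#include <fcntl.h>
#include <shift.h>                     /* RFIO prototypes (assumption) */

int main(void)
{
    char buf[64 * 1024];
    int n;
    /* Illustrative path, not a real file */
    int fd = rfio_open("/castor/cern.ch/user/x/xyz/data.raw", O_RDONLY, 0);
    if (fd < 0) {
        fprintf(stderr, "rfio_open failed: %s\n", rfio_serror());
        return 1;
    }
    while ((n = rfio_read(fd, buf, sizeof(buf))) > 0)
        fwrite(buf, 1, (size_t)n, stdout);   /* copy to stdout, like rfcp to a pipe */
    rfio_close(fd);
    return 0;
}

Languages without RFIO bindings have no equivalent of this path, which is exactly the gap CASTORFS closes.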

3. CASTOR filesystem
CASTOR logically presents files in a UNIX (POSIX) like directory hierarchy of file names. This suggests implementing a new filesystem capable of operating on files stored in CASTOR using standard Unix operating system calls and commands such as open, read, cp, rm, mkdir, ls, cat and find. We have implemented a complete POSIX filesystem to access CASTOR using FUSE (Filesystem in Userspace) [3] and have successfully tested and used it on SLC4 and SLC5 (both 32 and 64 bit). We call it CASTORFS.

4. Implementation
The FUSE library is the main component of CASTORFS. Using it together with the CASTOR RFIO (Remote File I/O) and CASTOR NS (Name Server) libraries allows implementing a fully functional filesystem in a userspace program.

Figure 1. Implementation.

The CASTOR RFIO library provides the main input/output operations for files in CASTOR: rfio_readdir, rfio_open, rfio_read, rfio_mkdir, rfio_stat, etc. Each operation has a corresponding POSIX operation: readdir, open, read, mkdir, stat, etc. The CASTOR NS library provides information about files and directories in CASTOR. It not only allows retrieving standard metadata such as the creation and modification time, owner and size of a file or directory, but also additional attributes, such as the checksum of a file and the status of a file with respect to migration from disk cache to tape. A sketch of how the FUSE callbacks map onto these library calls is shown below.
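The sketch below is illustrative only, not the actual CASTORFS code: it assumes the libfuse 2.x callback signatures and the conventional RFIO calls (rfio_stat, rfio_open, rfio_read, rfio_lseek, rfio_close); the prefixing of the mount-relative path with /castor and the collapsing of all RFIO errors to EIO are simplifying assumptions.

#define FUSE_USE_VERSION 26
#include <fuse.h>
#include <shift.h>                    /* RFIO prototypes (assumption) */
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/stat.h>

/* Illustrative: translate the path seen by FUSE (e.g. /cern.ch/...)
 * into a CASTOR namespace path (/castor/cern.ch/...). */
static void castor_path(char *dst, size_t len, const char *path)
{
    snprintf(dst, len, "/castor%s", path);
}

static int castorfs_getattr(const char *path, struct stat *st)
{
    char cpath[4096];
    castor_path(cpath, sizeof(cpath), path);
    if (rfio_stat(cpath, st) < 0)
        return -EIO;          /* real code would map the RFIO error to a POSIX errno */
    return 0;
}

static int castorfs_open(const char *path, struct fuse_file_info *fi)
{
    char cpath[4096];
    castor_path(cpath, sizeof(cpath), path);
    int fd = rfio_open(cpath, fi->flags, 0);
    if (fd < 0)
        return -EIO;
    fi->fh = (uint64_t)fd;    /* keep the RFIO descriptor for later read calls */
    return 0;
}

static int castorfs_read(const char *path, char *buf, size_t size,
                         off_t offset, struct fuse_file_info *fi)
{
    (void)path;
    int fd = (int)fi->fh;
    if (rfio_lseek(fd, offset, SEEK_SET) < 0)
        return -EIO;
    int n = rfio_read(fd, buf, (int)size);
    return (n < 0) ? -EIO : n;
}

static int castorfs_release(const char *path, struct fuse_file_info *fi)
{
    (void)path;
    rfio_close((int)fi->fh);
    return 0;
}

static struct fuse_operations castorfs_ops = {
    .getattr = castorfs_getattr,
    .open    = castorfs_open,
    .read    = castorfs_read,
    .release = castorfs_release,
};

int main(int argc, char *argv[])
{
    return fuse_main(argc, argv, &castorfs_ops, NULL);
}

Writes, directory listing and metadata updates follow the same pattern, with each FUSE callback forwarding to the corresponding rfio_* or name-server call.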

5. Usage
An example of using CASTORFS, which mounts a new filesystem on the local directory /castorfs:

$> /opt/castorfs/bin/castorfs /castorfs -o allow_other,castor_uid=10446,castor_gid=1470,castor_user=lbtbsupp,castor_stage_host=castorlhcb,castor_stage_svcclass=lhcbraw

From the command line one can configure the connection parameters for CASTOR, information about the CASTOR user, and the mode of the connection (read-only or full access). After mounting a CASTORFS filesystem one can run any of the usual tools that work on files and directories. For example:

$> ls -la /castorfs/path/to/file
$> mkdir -p /castorfs/path/to/dir
$> cat /castorfs/path/to/file
$> rm /castorfs/path/to/file
$> find /castorfs | grep strangefilename.raw

6. Extended attributes
The CASTOR name server stores some interesting file metadata, for example a file checksum and the status of the migration to tape (migrated or not). In CASTORFS we implemented a mapping of this additional file information to filesystem extended attributes. These can be accessed in the following way:

$> getfattr -d /castorfs/cern.ch/lhcb/online/data.tar
# file: castorfs/cern.ch/lhcb/online/data.tar
user.checksum="569145875"
user.checksum_name="adler32"
user.nbseg="1"
user.status="migrated"
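As an illustration of how such a mapping can be exposed through FUSE, the sketch below implements the getxattr callback (registered via the .getxattr field of struct fuse_operations) for the user.status attribute only. The helper ns_file_status() is hypothetical and merely stands in for the corresponding CASTOR name-server query, which is not reproduced here.

#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: fills 'out' with the migration status of the file at
 * 'path' ("migrated" or "not migrated") and returns 0 on success. The real
 * code would query the CASTOR NS library. */
static int ns_file_status(const char *path, char *out, size_t outlen)
{
    (void)path;
    snprintf(out, outlen, "migrated");   /* placeholder value */
    return 0;
}

/* FUSE 2.x getxattr callback: return the value length when size == 0,
 * -ERANGE when the caller's buffer is too small, -ENODATA for unknown names. */
static int castorfs_getxattr(const char *path, const char *name,
                             char *value, size_t size)
{
    char buf[256];

    if (strcmp(name, "user.status") != 0)
        return -ENODATA;                 /* only user.status handled in this sketch */
    if (ns_file_status(path, buf, sizeof(buf)) != 0)
        return -EIO;

    size_t len = strlen(buf);
    if (size == 0)
        return (int)len;                 /* caller is probing for the needed size */
    if (size < len)
        return -ERANGE;
    memcpy(value, buf, len);
    return (int)len;
}

user.checksum, user.checksum_name and user.nbseg are served in the same way, each from the corresponding name-server field.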

7. Packaging
CASTORFS is available as an RPM package for Scientific Linux CERN (SLC4 and SLC5, both 32- and 64-bit architectures) and as such runs unchanged on CentOS and RHEL distributions. In the LHCb Online cluster at CERN the CASTORFS RPM has been distributed to the compute farm using Quattor/LinuxFC [4], a system administration toolkit for the automated installation, configuration and management of Linux clusters. If the user (UID, GID) on the client host is the same as the one registered with CASTOR, the FUSE and CASTOR packages provided by their vendors can be used. To map a local user to a different UID/GID on CASTOR, our patches to the FUSE and CASTOR libraries must be applied. These patches will become obsolete once CASTOR can use certificates.

Figure 2. Using the CASTORFS client at the LHCb experiment at CERN.

8. Performance
Due to a limitation in Linux kernels prior to 2.6.27, writing and reading files to/from CASTOR through CASTORFS is slower than with native RFIO: about 14 times slower for writing and 3 times slower for reading (1) (see Table 1). When testing CASTORFS with the latest kernels we observe significantly better performance.

(1) This is for the first read; subsequent reads are much faster, as they go through the buffer cache.

Table 1. Read and write performance.

Tool                                               Read (Mb/s)   Write (Mb/s)
POSIX cp command on CASTORFS                       31            5
rfcp command (based on the CASTOR RFIO library)    100           70

9. Other implementations
XCFS. The Data Management group of the CERN IT department has implemented XCFS [5], a CASTOR filesystem based on FUSE and the xroot I/O protocol. XCFS shows better performance than CASTORFS because it does not rely on RFIO, but at the time of writing it is still at an early stage and requires some extra work to adapt it for use in a network other than the CERN network, such as the LHCb Online network (see Figure 3).

Figure 3. XCFS has so far been tested only within the CERN network.

10. Future plans
The main item of the future plan is an improvement of read and write performance. Further items are the implementation of a caching mechanism for frequently requested CASTOR file metadata, and the creation of a CASTORFS package for the newest Linux kernels.

References
[1] CASTORFS web page, https://lbtwiki.cern.ch/bin/view/online/castorfs
[2] CASTOR web site, http://cern.ch/castor
[3] FUSE web site, http://fuse.sourceforge.net
[4] Quattor web site, http://apps.sourceforge.net/mediawiki/quattor/index.php?title=Main_Page
[5] XCFS presentation at ACAT 2008, http://indico.cern.ch/materialdisplay.py?contribid=171&sessionid=29&materialid=slides&confid=34666