Optimization of non-contiguous MPI-I/O operations
|
|
- Rolf Newman
- 5 years ago
- Views:
Transcription
1 Optimization of non-contiguous MPI-I/O operations Enno Zickler Arbeitsbereich Wissenschaftliches Rechnen Fachbereich Informatik Fakultät für Mathematik, Informatik und Naturwissenschaften Universität Hamburg / 30
2 Outline 1 Non-Contiguous Data Access 2 Data Sieving 3 NCT Library 4 Evaluation 5 Summary & Conclusions 2 / 30
3 Non-Contiguous Data Access 3 / 30
4 What is Non-Contiguous Data Access? access on data which is scattered over the storage common data layouts often lead to non-contiguous access e.g. accessing only one element of a triples in an array non collective parallel access may also induce such access Figure 1: Non-Contiguous Data 4 / 30
5 Why is Non-Contiguous Data Access a problem? generally poor performance on storage systems overhead for many individual I/O calls on the file system long seek time on HDDs as I/O is a major bottleneck in HPC optimization of non-contiguous access is important 5 / 30
6 Data Sieving 6 / 30
7 What is Data Sieving? method to increase performance on non-contiguous data areas of unwanted (holes) data are also accessed the desired data is sieved out additional memory for buffers is needed (typically 2MB-4MB) much better performance for small holes and small data sizes 7 / 30
8 Figure 2: Data Sieving 8 / 30
9 Available implementations ROMIO[1] + good performance for small data and hole size + sustain detailed information about data layout but even harmful for big data and hole size requires detailed knowledge about the system MPI-I/O only hard to customize Performance model-directed data sieving [2] + automatic decision based on system characteristics + bandwidth, latency and number of I/O nodes properly MPI-I/O only (not released yet) 9 / 30
10 NCT Library 10 / 30
11 Goals 1 Flexibility allow algorithms to adopt to the underlying system easy to modify behavior of data sieving 2 Automation possibility to use sophisticated algorithms which do not need further user adjustments 3 Universality MPI Independence for wider use of data sieving POSIX level API 11 / 30
12 Design & Implementation C library Features view based file access functions POSIX compatible modularity through optimization plugins provides helper functions for plugins Processing of data sieving decision making: when and how to use data sieving helper functions access data and copy data between buffers 3 differn algorithms were implemented naive ROMIO simple performance model 12 / 30
13 API 1 t y p e d e f s t r u c t n c t t u p l e t 2 { 3 u i n t 3 2 t s i z e ; 4 u i n t 3 2 t d e l t a O f f s e t ; 5 } n c t t u p l e ; 6 7 t y p e d e f s t r u c t n c t i n f o t 8 { 9 double latencyfs ; 10 i n t bandwidth ; 11 i n t numnodes ; 12 double latencynetwork ; 13 u i n t 6 4 t s t r i p e S i z e ; 14 u i n t 6 4 t b l o c k s i z e ; 15 i n t dsmethod ; 16 } n c t i n f o ; n c t v i e w n c t c r e a t e v i e w ( u i n t 3 2 t i n i t a l O f f s e t, s i z e t d s b u f s i z e, v o i d d s b u f, i n t tuplecount, n c t t u p l e l i s t, n c t i n f o i n f o ) ; v o i d n c t d e s t r o y v i e w ( n c t v i e w view ) ; s i z e t n c t r e a d ( i n t fd, v o i d b u f f, u i n t 6 4 t o f f s e t, s i z e t b y t e c o u n t, n c t v i e w view ) ; s i z e t n c t w r i t e ( i n t fd, v o i d b u f f, u i n t 6 4 t o f f s e t, s i z e t b y t e c o u n t, nct view view ) ; Listing 1: nct.h 13 / 30
14 Evaluation 14 / 30
15 Methodology custom benchmark tool to use different data pattern vary hole and data size read and write of 10GB or at least 100sec WR- Cluster 1-10 lustre nodes 128KiB and 2MiB Stripe size different pattern min 3 iterations per setting all 3 runs are in a 20% range of the mean value model created for expected results 15 / 30
16 Reading model Figure 3: Model WR reading default pattern 16 / 30
17 Reading naive Figure 4: Naive 1 node WR 128 KiByte stripe reading default pattern 17 / 30
18 Reading ROMIO Figure 5: ROMIO WR cluster read 2 nodes 2 MiByte stripes 18 / 30
19 Reading ROMIO delta to naive Figure 6: ROMIO WR cluster write 2 nodes 2 MiByte stripes difference to naive 19 / 30
20 Reading simple performance model Figure 7: Simple performance Model WR cluster reading 2 nodes 2 MiByte stripes 20 / 30
21 Reading simple performance model delta to naive Figure 8: Simple performance Model WR cluster reading 2 nodes 2 MiByte stripes difference to naive 21 / 30
22 Writing ROMIO delta to naive Figure 9: ROMIO WR cluster write 2 nodes 2 MiByte stripes difference to naive 22 / 30
23 Writing simple performance model delta to naive Figure 10: Simple performance Model WR cluster writing 2 nodes 2 MiByte stripes difference to naive 23 / 30
24 Reading naive aligned Figure 11: naive reading 2 nodes 128 KiByte stripes aligned 24 / 30
25 Writing naive aligned Figure 12: naive writing 2 nodes 128 KiByte stripes aligned 25 / 30
26 Summary & Conclusions 26 / 30
27 POSIX leveled, flexible library makes research much easier potential in further optimization based on system characteristics implementation of the MPI-I/O interface is planed 27 / 30
28 Questions? 28 / 30
29 Literature Literature 29 / 30
30 Literature Literatur I Rajeev Thakur, William Gropp, and Ewing Lusk. Data sieving and collective i/o in ROMIO. In Frontiers of Massively Parallel Computation, Frontiers 99. The Seventh Symposium on the, pages IEEE, Yong Chen, Yin Lu, Prathamesh Amritkar, Rajeev Thakur, and Yu Zhuang. Performance model-directed data sieving for high-performance i/o. The Journal of Supercomputing, pages 1 25, September / 30
31 Literature Writing naive Figure 13: naive writing 10 nodes 2 MiByte stripes 30 / 30
Optimization of non-contiguous MPI-I/O Operations
Optimization of non-contiguous MPI-I/O Operations Bachelor Thesis Arbeitsbereich Wissenschaftliches Rechnen Fachbereich Informatik Fakultät für Mathematik, Informatik und Naturwissenschaften Universität
More informationWorkflows and Scheduling
Workflows and Scheduling Frank Röder Arbeitsbereich Wissenschaftliches Rechnen Fachbereich Informatik Fakultät für Mathematik, Informatik und Naturwissenschaften Universität Hamburg 14-12-2015 Frank Röder
More informationRevealing Applications Access Pattern in Collective I/O for Cache Management
Revealing Applications Access Pattern in for Yin Lu 1, Yong Chen 1, Rob Latham 2 and Yu Zhuang 1 Presented by Philip Roth 3 1 Department of Computer Science Texas Tech University 2 Mathematics and Computer
More informationIntroduction to Parallel I/O
Introduction to Parallel I/O Bilel Hadri bhadri@utk.edu NICS Scientific Computing Group OLCF/NICS Fall Training October 19 th, 2011 Outline Introduction to I/O Path from Application to File System Common
More informationPattern-Aware File Reorganization in MPI-IO
Pattern-Aware File Reorganization in MPI-IO Jun He, Huaiming Song, Xian-He Sun, Yanlong Yin Computer Science Department Illinois Institute of Technology Chicago, Illinois 60616 {jhe24, huaiming.song, sun,
More informationImproved Solutions for I/O Provisioning and Application Acceleration
1 Improved Solutions for I/O Provisioning and Application Acceleration August 11, 2015 Jeff Sisilli Sr. Director Product Marketing jsisilli@ddn.com 2 Why Burst Buffer? The Supercomputing Tug-of-War A supercomputer
More informationIteration Based Collective I/O Strategy for Parallel I/O Systems
Iteration Based Collective I/O Strategy for Parallel I/O Systems Zhixiang Wang, Xuanhua Shi, Hai Jin, Song Wu Services Computing Technology and System Lab Cluster and Grid Computing Lab Huazhong University
More informationHint Controlled Distribution with Parallel File Systems
Hint Controlled Distribution with Parallel File Systems Hipolito Vasquez Lucas and Thomas Ludwig Parallele und Verteilte Systeme, Institut für Informatik, Ruprecht-Karls-Universität Heidelberg, 6912 Heidelberg,
More informationA High Performance Implementation of MPI-IO for a Lustre File System Environment
A High Performance Implementation of MPI-IO for a Lustre File System Environment Phillip M. Dickens and Jeremy Logan Department of Computer Science, University of Maine Orono, Maine, USA dickens@umcs.maine.edu
More informationIME (Infinite Memory Engine) Extreme Application Acceleration & Highly Efficient I/O Provisioning
IME (Infinite Memory Engine) Extreme Application Acceleration & Highly Efficient I/O Provisioning September 22 nd 2015 Tommaso Cecchi 2 What is IME? This breakthrough, software defined storage application
More informationDesign, Implementation, and Evaluation of a Low-Level Extent-Based Object Store
Design, Implementation, and Evaluation of a Low-Level Extent-Based Object Store Masterarbeit Arbeitsbereich Wissenschaftliches Rechnen Fachbereich Informatik Fakultät für Mathematik, Informatik und Naturwissenschaften
More informationImproving MPI Independent Write Performance Using A Two-Stage Write-Behind Buffering Method
Improving MPI Independent Write Performance Using A Two-Stage Write-Behind Buffering Method Wei-keng Liao 1, Avery Ching 1, Kenin Coloma 1, Alok Choudhary 1, and Mahmut Kandemir 2 1 Northwestern University
More informationData Sieving and Collective I/O in ROMIO
Appeared in Proc. of the 7th Symposium on the Frontiers of Massively Parallel Computation, February 1999, pp. 182 189. c 1999 IEEE. Data Sieving and Collective I/O in ROMIO Rajeev Thakur William Gropp
More informationAnalyzing the High Performance Parallel I/O on LRZ HPC systems. Sandra Méndez. HPC Group, LRZ. June 23, 2016
Analyzing the High Performance Parallel I/O on LRZ HPC systems Sandra Méndez. HPC Group, LRZ. June 23, 2016 Outline SuperMUC supercomputer User Projects Monitoring Tool I/O Software Stack I/O Analysis
More informationExploiting Shared Memory to Improve Parallel I/O Performance
Exploiting Shared Memory to Improve Parallel I/O Performance Andrew B. Hastings 1 and Alok Choudhary 2 1 Sun Microsystems, Inc. andrew.hastings@sun.com 2 Northwestern University choudhar@ece.northwestern.edu
More informationEvaluating Algorithms for Shared File Pointer Operations in MPI I/O
Evaluating Algorithms for Shared File Pointer Operations in MPI I/O Ketan Kulkarni and Edgar Gabriel Parallel Software Technologies Laboratory, Department of Computer Science, University of Houston {knkulkarni,gabriel}@cs.uh.edu
More informationLeveraging Burst Buffer Coordination to Prevent I/O Interference
Leveraging Burst Buffer Coordination to Prevent I/O Interference Anthony Kougkas akougkas@hawk.iit.edu Matthieu Dorier, Rob Latham, Rob Ross, Xian-He Sun Wednesday, October 26th Baltimore, USA Outline
More informationImproving I/O Performance in POP (Parallel Ocean Program)
Improving I/O Performance in POP (Parallel Ocean Program) Wang Di 2 Galen M. Shipman 1 Sarp Oral 1 Shane Canon 1 1 National Center for Computational Sciences, Oak Ridge National Laboratory Oak Ridge, TN
More informationExtreme I/O Scaling with HDF5
Extreme I/O Scaling with HDF5 Quincey Koziol Director of Core Software Development and HPC The HDF Group koziol@hdfgroup.org July 15, 2012 XSEDE 12 - Extreme Scaling Workshop 1 Outline Brief overview of
More informationHDF5 I/O Performance. HDF and HDF-EOS Workshop VI December 5, 2002
HDF5 I/O Performance HDF and HDF-EOS Workshop VI December 5, 2002 1 Goal of this talk Give an overview of the HDF5 Library tuning knobs for sequential and parallel performance 2 Challenging task HDF5 Library
More informationh5perf_serial, a Serial File System Benchmarking Tool
h5perf_serial, a Serial File System Benchmarking Tool The HDF Group April, 2009 HDF5 users have reported the need to perform serial benchmarking on systems without an MPI environment. The parallel benchmarking
More informationDo You Know What Your I/O Is Doing? (and how to fix it?) William Gropp
Do You Know What Your I/O Is Doing? (and how to fix it?) William Gropp www.cs.illinois.edu/~wgropp Messages Current I/O performance is often appallingly poor Even relative to what current systems can achieve
More informationMaking Resonance a Common Case: A High-Performance Implementation of Collective I/O on Parallel File Systems
Making Resonance a Common Case: A High-Performance Implementation of Collective on Parallel File Systems Xuechen Zhang 1, Song Jiang 1, and Kei Davis 2 1 ECE Department 2 Computer and Computational Sciences
More informationHigh Performance Computing using a Parallella Board Cluster PROJECT PROPOSAL. March 24, 2015
High Performance Computing using a Parallella Board Cluster PROJECT PROPOSAL March 24, Michael Johan Kruger Rhodes University Computer Science Department g12k5549@campus.ru.ac.za Principle Investigator
More informationChallenges in HPC I/O
Challenges in HPC I/O Universität Basel Julian M. Kunkel German Climate Computing Center / Universität Hamburg 10. October 2014 Outline 1 High-Performance Computing 2 Parallel File Systems and Challenges
More informationImplementing Byte-Range Locks Using MPI One-Sided Communication
Implementing Byte-Range Locks Using MPI One-Sided Communication Rajeev Thakur, Robert Ross, and Robert Latham Mathematics and Computer Science Division Argonne National Laboratory Argonne, IL 60439, USA
More informationExploiting Lustre File Joining for Effective Collective IO
Exploiting Lustre File Joining for Effective Collective IO Weikuan Yu, Jeffrey Vetter Oak Ridge National Laboratory Computer Science & Mathematics Oak Ridge, TN, USA 37831 {wyu,vetter}@ornl.gov R. Shane
More informationEnergy Efficient Programming
2steuer@informatik.uni-hamburg.de Seminar Effiziente Programmierung Arbeitsbereich Wissenschaftliches Rechnen Fachbereich Informatik Fakultät für Mathematik, Informatik und Naturwissenschaften Universität
More informationExecution-driven Simulation of Network Storage Systems
Execution-driven ulation of Network Storage Systems Yijian Wang and David Kaeli Department of Electrical and Computer Engineering Northeastern University Boston, MA 2115 yiwang, kaeli@ece.neu.edu Abstract
More informationStorage Challenges at Los Alamos National Lab
John Bent john.bent@emc.com Storage Challenges at Los Alamos National Lab Meghan McClelland meghan@lanl.gov Gary Grider ggrider@lanl.gov Aaron Torres agtorre@lanl.gov Brett Kettering brettk@lanl.gov Adam
More informationNew programming languages and paradigms
New programming languages and paradigms Seminar: Neueste Trends im Hochleistungsrechnen Arbeitsbereich Wissenschaftliches Rechnen Fachbereich Informatik Fakultät für Mathematik, Informatik und Naturwissenschaften
More informationSeitenbasierte Komprimierung im Linux-Kernel - Page-Based Compression in the Linux Kernel
Bachelorarbeit Seitenbasierte Komprimierung im Linux-Kernel - Page-Based Compression in the Linux Kernel vorgelegt von Benjamin Warnke Fakultät für Mathematik, Informatik und Naturwissenschaften Fachbereich
More informationAn Overview of Fujitsu s Lustre Based File System
An Overview of Fujitsu s Lustre Based File System Shinji Sumimoto Fujitsu Limited Apr.12 2011 For Maximizing CPU Utilization by Minimizing File IO Overhead Outline Target System Overview Goals of Fujitsu
More informationLustre Lockahead: Early Experience and Performance using Optimized Locking
This CUG paper is a preprint of the final paper published in the CCPE Special Online Issue of CUG 2017 at http://onlinelibrary.wiley.com/doi/10.1002/cpe.v30.1/issuetoc Lustre Lockahead: Early Experience
More informationFile Open, Close, and Flush Performance Issues in HDF5 Scot Breitenfeld John Mainzer Richard Warren 02/19/18
File Open, Close, and Flush Performance Issues in HDF5 Scot Breitenfeld John Mainzer Richard Warren 02/19/18 1 Introduction Historically, the parallel version of the HDF5 library has suffered from performance
More informationOrthrus: A Framework for Implementing Efficient Collective I/O in Multi-core Clusters
Orthrus: A Framework for Implementing Efficient Collective I/O in Multi-core Clusters Xuechen Zhang 1 Jianqiang Ou 2 Kei Davis 3 Song Jiang 2 1 Georgia Institute of Technology, 2 Wayne State University,
More informationGuidelines for Efficient Parallel I/O on the Cray XT3/XT4
Guidelines for Efficient Parallel I/O on the Cray XT3/XT4 Jeff Larkin, Cray Inc. and Mark Fahey, Oak Ridge National Laboratory ABSTRACT: This paper will present an overview of I/O methods on Cray XT3/XT4
More informationHigh Performance Supercomputing using Infiniband based Clustered Servers
High Performance Supercomputing using Infiniband based Clustered Servers M.J. Johnson A.L.C. Barczak C.H. Messom Institute of Information and Mathematical Sciences Massey University Auckland, New Zealand.
More informationQuiz for Chapter 6 Storage and Other I/O Topics 3.10
Date: 3.10 Not all questions are of equal difficulty. Please review the entire quiz first and then budget your time carefully. Name: Course: 1. [6 points] Give a concise answer to each of the following
More informationEvaluating I/O Characteristics and Methods for Storing Structured Scientific Data
Evaluating I/O Characteristics and Methods for Storing Structured Scientific Data Avery Ching 1, Alok Choudhary 1, Wei-keng Liao 1,LeeWard, and Neil Pundit 1 Northwestern University Sandia National Laboratories
More informationECE7995 (7) Parallel I/O
ECE7995 (7) Parallel I/O 1 Parallel I/O From user s perspective: Multiple processes or threads of a parallel program accessing data concurrently from a common file From system perspective: - Files striped
More informationDATA access is one of the critical performance bottlenecks
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 1.119/TC.216.2637353,
More informationBoosting Application-specific Parallel I/O Optimization using IOSIG
22 2th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing Boosting Application-specific Parallel I/O Optimization using IOSIG Yanlong Yin yyin2@iit.edu Surendra Byna 2 sbyna@lbl.gov
More informationParallel I/O Techniques and Performance Optimization
Parallel I/O Techniques and Performance Optimization Lonnie Crosby lcrosby1@utk.edu NICS Scientific Computing Group NICS/RDAV Spring Training 2012 March 22, 2012 2 Outline Introduction to I/O Path from
More informationChapter 11. Parallel I/O. Rajeev Thakur and William Gropp
Chapter 11 Parallel I/O Rajeev Thakur and William Gropp Many parallel applications need to access large amounts of data. In such applications, the I/O performance can play a significant role in the overall
More informationNoncontiguous I/O Accesses Through MPI-IO
Noncontiguous I/O Accesses Through MPI-IO Avery Ching Alok Choudhary Kenin Coloma Wei-keng Liao Center for Parallel and Distributed Computing Northwestern University Evanston, IL 60208 aching, choudhar,
More informationDynamic Active Storage for High Performance I/O
Dynamic Active Storage for High Performance I/O Chao Chen(chao.chen@ttu.edu) 4.02.2012 UREaSON Outline Ø Background Ø Active Storage Ø Issues/challenges Ø Dynamic Active Storage Ø Prototyping and Evaluation
More informationStore Process Analyze Collaborate Archive Cloud The HPC Storage Leader Invent Discover Compete
Store Process Analyze Collaborate Archive Cloud The HPC Storage Leader Invent Discover Compete 1 DDN Who We Are 2 We Design, Deploy and Optimize Storage Systems Which Solve HPC, Big Data and Cloud Business
More informationComposite Metrics for System Throughput in HPC
Composite Metrics for System Throughput in HPC John D. McCalpin, Ph.D. IBM Corporation Austin, TX SuperComputing 2003 Phoenix, AZ November 18, 2003 Overview The HPC Challenge Benchmark was announced last
More informationLustre * Features In Development Fan Yong High Performance Data Division, Intel CLUG
Lustre * Features In Development Fan Yong High Performance Data Division, Intel CLUG 2017 @Beijing Outline LNet reliability DNE improvements Small file performance File Level Redundancy Miscellaneous improvements
More informationThe Fusion Distributed File System
Slide 1 / 44 The Fusion Distributed File System Dongfang Zhao February 2015 Slide 2 / 44 Outline Introduction FusionFS System Architecture Metadata Management Data Movement Implementation Details Unique
More informationHow to Apply the Geospatial Data Abstraction Library (GDAL) Properly to Parallel Geospatial Raster I/O?
bs_bs_banner Short Technical Note Transactions in GIS, 2014, 18(6): 950 957 How to Apply the Geospatial Data Abstraction Library (GDAL) Properly to Parallel Geospatial Raster I/O? Cheng-Zhi Qin,* Li-Jun
More informationParallel I/O Libraries and Techniques
Parallel I/O Libraries and Techniques Mark Howison User Services & Support I/O for scientifc data I/O is commonly used by scientific applications to: Store numerical output from simulations Load initial
More informationStructuring PLFS for Extensibility
Structuring PLFS for Extensibility Chuck Cranor, Milo Polte, Garth Gibson PARALLEL DATA LABORATORY Carnegie Mellon University What is PLFS? Parallel Log Structured File System Interposed filesystem b/w
More informationBlue Waters I/O Performance
Blue Waters I/O Performance Mark Swan Performance Group Cray Inc. Saint Paul, Minnesota, USA mswan@cray.com Doug Petesch Performance Group Cray Inc. Saint Paul, Minnesota, USA dpetesch@cray.com Abstract
More informationEnabling Active Storage on Parallel I/O Software Stacks. Seung Woo Son Mathematics and Computer Science Division
Enabling Active Storage on Parallel I/O Software Stacks Seung Woo Son sson@mcs.anl.gov Mathematics and Computer Science Division MSST 2010, Incline Village, NV May 7, 2010 Performing analysis on large
More informationA Classification of Parallel I/O Toward Demystifying HPC I/O Best Practices
A Classification of Parallel I/O Toward Demystifying HPC I/O Best Practices Robert Sisneros National Center for Supercomputing Applications Uinversity of Illinois at Urbana-Champaign Urbana, Illinois,
More informationA Parallel API for Creating and Reading NetCDF Files
A Parallel API for Creating and Reading NetCDF Files January 4, 2015 Abstract Scientists recognize the importance of portable and efficient mechanisms for storing datasets created and used by their applications.
More information朱义普. Resolving High Performance Computing and Big Data Application Bottlenecks with Application-Defined Flash Acceleration. Director, North Asia, HPC
October 28, 2013 Resolving High Performance Computing and Big Data Application Bottlenecks with Application-Defined Flash Acceleration 朱义普 Director, North Asia, HPC DDN Storage Vendor for HPC & Big Data
More informationPHX: Memory Speed HPC I/O with NVM. Pradeep Fernando Sudarsun Kannan, Ada Gavrilovska, Karsten Schwan
PHX: Memory Speed HPC I/O with NVM Pradeep Fernando Sudarsun Kannan, Ada Gavrilovska, Karsten Schwan Node Local Persistent I/O? Node local checkpoint/ restart - Recover from transient failures ( node restart)
More informationLecture 33: More on MPI I/O. William Gropp
Lecture 33: More on MPI I/O William Gropp www.cs.illinois.edu/~wgropp Today s Topics High level parallel I/O libraries Options for efficient I/O Example of I/O for a distributed array Understanding why
More informationAccelerating MPI Message Matching and Reduction Collectives For Multi-/Many-core Architectures
Accelerating MPI Message Matching and Reduction Collectives For Multi-/Many-core Architectures M. Bayatpour, S. Chakraborty, H. Subramoni, X. Lu, and D. K. Panda Department of Computer Science and Engineering
More informationPerformance Characterization and Optimization of Parallel I/O on the Cray XT
Performance Characterization and Optimization of Parallel I/O on the Cray XT Weikuan Yu, Jeffrey S. Vetter, H. Sarp Oral Oak Ridge National Laboratory Oak Ridge, TN 37831 {wyu,vetter,oralhs}@ornl.gov Abstract
More informationFastForward I/O and Storage: ACG 8.6 Demonstration
FastForward I/O and Storage: ACG 8.6 Demonstration Kyle Ambert, Jaewook Yu, Arnab Paul Intel Labs June, 2014 NOTICE: THIS MANUSCRIPT HAS BEEN AUTHORED BY INTEL UNDER ITS SUBCONTRACT WITH LAWRENCE LIVERMORE
More informationIntel Enterprise Edition Lustre (IEEL-2.3) [DNE-1 enabled] on Dell MD Storage
Intel Enterprise Edition Lustre (IEEL-2.3) [DNE-1 enabled] on Dell MD Storage Evaluation of Lustre File System software enhancements for improved Metadata performance Wojciech Turek, Paul Calleja,John
More informationEnergy-Eciency of Long-term Storage
Irina Tolokonnikova Seminar "Energy-Ecient Programming" Arbeitsbereich Wissenschaftliches Rechnen Fachbereich Informatik Fakultät für Mathematik, Informatik und Naturwissenschaften Universität Hamburg
More informationCoordinating Parallel HSM in Object-based Cluster Filesystems
Coordinating Parallel HSM in Object-based Cluster Filesystems Dingshan He, Xianbo Zhang, David Du University of Minnesota Gary Grider Los Alamos National Lab Agenda Motivations Parallel archiving/retrieving
More informationI/O Analysis and Optimization for an AMR Cosmology Application
I/O Analysis and Optimization for an AMR Cosmology Application Jianwei Li Wei-keng Liao Alok Choudhary Valerie Taylor ECE Department, Northwestern University {jianwei, wkliao, choudhar, taylor}@ece.northwestern.edu
More informationA First Implementation of Parallel IO in Chapel for Block Data Distribution 1
A First Implementation of Parallel IO in Chapel for Block Data Distribution 1 Rafael LARROSA a, Rafael ASENJO a Angeles NAVARRO a and Bradford L. CHAMBERLAIN b a Dept. of Compt. Architect. Univ. of Malaga,
More informationMaster s Thesis: Real-Time Object Shape Perception via Force/Torque Sensor
S. Rau TAMS-Oberseminar 1 / 25 MIN-Fakultät Fachbereich Informatik Master s Thesis: Real-Time Object Shape Perception via Force/Torque Sensor Current Status Stephan Rau Universität Hamburg Fakultät für
More informationAn Evolutionary Path to Object Storage Access
An Evolutionary Path to Object Storage Access David Goodell +, Seong Jo (Shawn) Kim*, Robert Latham +, Mahmut Kandemir*, and Robert Ross + *Pennsylvania State University + Argonne National Laboratory Outline
More informationPerformance Evaluation of InfiniBand with PCI Express
Performance Evaluation of InfiniBand with PCI Express Jiuxing Liu Server Technology Group IBM T. J. Watson Research Center Yorktown Heights, NY 1598 jl@us.ibm.com Amith Mamidala, Abhinav Vishnu, and Dhabaleswar
More informationHiding I/O Latency with Pre-execution Prefetching for Parallel Applications
Hiding I/O Latency with Pre-execution Prefetching for Parallel Applications Yong Chen, Surendra Byna, Xian-He Sun, Rajeev Thakur and William Gropp Department of Computer Science, Illinois Institute of
More informationFeasibility Study of Effective Remote I/O Using a Parallel NetCDF Interface in a Long-Latency Network
Feasibility Study of Effective Remote I/O Using a Parallel NetCDF Interface in a Long-Latency Network Yuichi Tsujita Abstract NetCDF provides portable and selfdescribing I/O data format for array-oriented
More informationA Plugin-based Approach to Exploit RDMA Benefits for Apache and Enterprise HDFS
A Plugin-based Approach to Exploit RDMA Benefits for Apache and Enterprise HDFS Adithya Bhat, Nusrat Islam, Xiaoyi Lu, Md. Wasi- ur- Rahman, Dip: Shankar, and Dhabaleswar K. (DK) Panda Network- Based Compu2ng
More informationWhat is a file system
COSC 6397 Big Data Analytics Distributed File Systems Edgar Gabriel Spring 2017 What is a file system A clearly defined method that the OS uses to store, catalog and retrieve files Manage the bits that
More informationFeedback on BeeGFS. A Parallel File System for High Performance Computing
Feedback on BeeGFS A Parallel File System for High Performance Computing Philippe Dos Santos et Georges Raseev FR 2764 Fédération de Recherche LUmière MATière December 13 2016 LOGO CNRS LOGO IO December
More informationS4D-Cache: Smart Selective SSD Cache for Parallel I/O Systems
: Smart Selective SSD Cache for Parallel I/O Systems Shuibing He, Xian-He Sun, Bo Feng Department of Computer Science Illinois Institute of Technology Chicago, IL 6616 {she11, sun, bfeng5}@iit.edu Abstract
More informationOptimizations Based on Hints in a Parallel File System
Optimizations Based on Hints in a Parallel File System María S. Pérez, Alberto Sánchez, Víctor Robles, JoséM.Peña, and Fernando Pérez DATSI. FI. Universidad Politécnica de Madrid. Spain {mperez,ascampos,vrobles,jmpena,fperez}@fi.upm.es
More informationEmpirical Analysis of a Large-Scale Hierarchical Storage System
Empirical Analysis of a Large-Scale Hierarchical Storage System Weikuan Yu, H. Sarp Oral, R. Shane Canon, Jeffrey S. Vetter, and Ramanan Sankaran Oak Ridge National Laboratory Oak Ridge, TN 37831 {wyu,oralhs,canonrs,vetter,sankaranr}@ornl.gov
More informationHigh Performance I/O for Seismic Wave Propagation Simulations
2017 25th Euromicro International Conference on Parallel, Distributed and Network-Based Processing High Performance I/O for Seismic Wave Propagation Simulations Francieli Zanon Boito 1, Jean Luca Bez 2,
More informationLustre Parallel Filesystem Best Practices
Lustre Parallel Filesystem Best Practices George Markomanolis Computational Scientist KAUST Supercomputing Laboratory georgios.markomanolis@kaust.edu.sa 7 November 2017 Outline Introduction to Parallel
More informationTracing Internal Communication in MPI and MPI-I/O
Tracing Internal Communication in MPI and MPI-I/O Julian M. Kunkel, Yuichi Tsujita, Olga Mordvinova, Thomas Ludwig Abstract MPI implementations can realize MPI operations with any algorithm that fulfills
More informationHigh Performance MPI-2 One-Sided Communication over InfiniBand
High Performance MPI-2 One-Sided Communication over InfiniBand Weihang Jiang Jiuxing Liu Hyun-Wook Jin Dhabaleswar K. Panda William Gropp Rajeev Thakur Computer and Information Science The Ohio State University
More informationOptimizing Local File Accesses for FUSE-Based Distributed Storage
Optimizing Local File Accesses for FUSE-Based Distributed Storage Shun Ishiguro 1, Jun Murakami 1, Yoshihiro Oyama 1,3, Osamu Tatebe 2,3 1. The University of Electro-Communications, Japan 2. University
More informationHigh Performance MPI-2 One-Sided Communication over InfiniBand
High Performance MPI-2 One-Sided Communication over InfiniBand Weihang Jiang Jiuxing Liu Hyun-Wook Jin Dhabaleswar K. Panda William Gropp Rajeev Thakur Computer and Information Science The Ohio State University
More informationIntra-MIC MPI Communication using MVAPICH2: Early Experience
Intra-MIC MPI Communication using MVAPICH: Early Experience Sreeram Potluri, Karen Tomko, Devendar Bureddy, and Dhabaleswar K. Panda Department of Computer Science and Engineering Ohio State University
More informationAPI and Usage of libhio on XC-40 Systems
API and Usage of libhio on XC-40 Systems May 24, 2018 Nathan Hjelm Cray Users Group May 24, 2018 Los Alamos National Laboratory LA-UR-18-24513 5/24/2018 1 Outline Background HIO Design HIO API HIO Configuration
More informationMobile robots control architectures
1 Mobile robots control architectures Dimitri Popov Universität Hamburg Fakultät für Mathematik, Informatik und Naturwissenschaften Department Informatik Integriertes Seminar Intelligent Robotics 10 1.
More informationOutline 1 Motivation 2 Theory of a non-blocking benchmark 3 The benchmark and results 4 Future work
Using Non-blocking Operations in HPC to Reduce Execution Times David Buettner, Julian Kunkel, Thomas Ludwig Euro PVM/MPI September 8th, 2009 Outline 1 Motivation 2 Theory of a non-blocking benchmark 3
More informationOn the Performance of the POSIX I/O Interface to PVFS
On the Performance of the POSIX I/O Interface to PVFS Murali Vilayannur y Robert B. Ross z Philip H. Carns Λ Rajeev Thakur z Anand Sivasubramaniam y Mahmut Kandemir y y Department of Computer Science and
More informationCharacterizing Parallel I/O Behaviour Based on Server-Side I/O Counters
Characterizing Parallel I/O Behaviour Based on Server-Side I/O Counters SC16 - BoF Analyzing Parallel I/O SC16 BoF - Analyzing Parallel I/O, November 15, 2016 S. El Sayed JSC M. Bolten Kas D. Pleiter JSC
More informationIntroduction The Project Lustre Architecture Performance Conclusion References. Lustre. Paul Bienkowski
Lustre Paul Bienkowski 2bienkow@informatik.uni-hamburg.de Proseminar Ein-/Ausgabe - Stand der Wissenschaft 2013-06-03 1 / 34 Outline 1 Introduction 2 The Project Goals and Priorities History Who is involved?
More informationArchitecting a High Performance Storage System
WHITE PAPER Intel Enterprise Edition for Lustre* Software High Performance Data Division Architecting a High Performance Storage System January 2014 Contents Introduction... 1 A Systematic Approach to
More informationKartik Lakhotia, Rajgopal Kannan, Viktor Prasanna USENIX ATC 18
Accelerating PageRank using Partition-Centric Processing Kartik Lakhotia, Rajgopal Kannan, Viktor Prasanna USENIX ATC 18 Outline Introduction Partition-centric Processing Methodology Analytical Evaluation
More informationPerformance Evaluation and Modeling of HPC I/O on Non-Volatile Memory
Performance Evaluation and Modeling of HPC I/O on Non-Volatile Memory Wei Liu wliu34@ucmerced.edu Kai Wu kwu42@ucmerced.edu Jialin Liu jalnliu@lbl.gov Feng Chen fchen@csc.lsu.edu Dong Li dli35@ucmerced.edu
More informationSHOC: The Scalable HeterOgeneous Computing Benchmark Suite
SHOC: The Scalable HeterOgeneous Computing Benchmark Suite Dakar Team Future Technologies Group Oak Ridge National Laboratory Version 1.1.2, November 2011 1 Introduction The Scalable HeterOgeneous Computing
More informationLustre File System. Proseminar 2013 Ein-/Ausgabe - Stand der Wissenschaft Universität Hamburg. Paul Bienkowski Author. Michael Kuhn Supervisor
Proseminar 2013 Ein-/Ausgabe - Stand der Wissenschaft Universität Hamburg September 30, 2013 Paul Bienkowski Author 2bienkow@informatik.uni-hamburg.de Michael Kuhn Supervisor michael.kuhn@informatik.uni-hamburg.de
More informationParallel I/O Scheduling in Multiprogrammed Cluster Computing Systems
Parallel I/O Scheduling in Multiprogrammed Cluster Computing Systems J.H. Abawajy School of Computer Science, Carleton University, Ottawa, Canada. abawjem@scs.carleton.ca Abstract. In this paper, we address
More informationNon-Blocking Collectives for MPI
Non-Blocking Collectives for MPI overlap at the highest level Torsten Höfler Open Systems Lab Indiana University Bloomington, IN, USA Institut für Wissenschaftliches Rechnen Technische Universität Dresden
More information