
Identify a problem
Review approaches to the problem
Propose a novel approach to the problem
Define, design, and prototype an implementation to evaluate your approach
  Could be a real system, a simulation, and/or theoretical
Write a technical report
Present your results
Write a workshop/conference paper (optional)

Distributed Operating Systems
http://www.scalemp.com/
Achieve a unified OS across machine boundaries
  The opposite of virtualization, which creates multiple virtual OS instances on one machine
Choose an OS to modify
  CPU scheduler: load balancing
    Modify the OS scheduler to be aware of threads and cache locality
  Memory manager: shared memory
  File system: leverage shared/parallel file systems
Choose a virtual machine to modify (e.g. Java)
Evaluate workloads for performance and scalability

Virtualization has overheads
Quantify these overheads for a variety of workloads
  Computationally intensive
  Memory intensive
  Storage intensive
  Network intensive
Across different virtualization technologies
Across different hardware
Survey the latest research in addressing shortcomings of virtualization
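One way to start quantifying these overheads is to run the same small kernels on bare metal and inside a virtual machine, then compare timings. The sketch below is only an illustration: the kernel functions, their sizes, and the `relative_overhead` helper are hypothetical names chosen for this example, and a network-intensive kernel is left out because it needs a second endpoint.

```python
import os
import tempfile
import time


def timed(fn, repeats=3):
    """Return the best wall-clock time (seconds) over several runs of fn."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def compute_kernel(n=200_000):
    """Computationally intensive: integer arithmetic in a tight loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def memory_kernel(size=1_000_000):
    """Memory intensive: allocate a buffer and copy it."""
    buf = bytearray(size)
    return bytes(buf)


def storage_kernel(size=1_000_000):
    """Storage intensive: write, sync, and read back a temporary file."""
    with tempfile.NamedTemporaryFile() as f:
        f.write(os.urandom(size))
        f.flush()
        os.fsync(f.fileno())
        f.seek(0)
        return len(f.read())


def relative_overhead(native_seconds, virtualized_seconds):
    """Fractional slowdown of the virtualized run relative to the native run."""
    return (virtualized_seconds - native_seconds) / native_seconds


if __name__ == "__main__":
    # Run this script once natively and once in a VM, then feed the two
    # timings for each kernel into relative_overhead().
    for name, kernel in [("compute", compute_kernel),
                         ("memory", memory_kernel),
                         ("storage", storage_kernel)]:
        print(f"{name}: {timed(kernel):.6f} s")
```

Taking the best of several runs reduces noise from background activity; a real study would also report variance and use workloads representative of the target applications.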

Distributed file systems use replication to ensure reliability of data
Replication
  Pros: easy to implement, increases data locality and performance
  Cons: expensive and inefficient in terms of network bandwidth and disk space
Erasure codes
  Pros: efficient in disk space usage
  Cons: harder to implement, computationally expensive, decreases locality
Investigate replacing replication with erasure codes
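The space trade-off can be made concrete with the simplest possible erasure code, a single XOR parity block over k data blocks. This is a minimal sketch for illustration only: the function names are hypothetical, and a single parity block tolerates one lost block, whereas 3-way replication tolerates two, so the two schemes below are not equally reliable. Production systems typically use stronger codes such as Reed-Solomon.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks):
    """Return the k data blocks followed by one XOR parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover(stored_blocks, lost_index):
    """Rebuild the block at lost_index by XOR-ing every surviving block."""
    survivors = [b for i, b in enumerate(stored_blocks) if i != lost_index]
    return xor_blocks(survivors)


def storage_overhead(data_units, redundant_units):
    """Total stored bytes divided by useful bytes."""
    return (data_units + redundant_units) / data_units


if __name__ == "__main__":
    data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
    stored = encode(data)
    print(recover(stored, 2))      # rebuilds the third data block
    print(storage_overhead(1, 2))  # 3-way replication: 1 copy + 2 replicas
    print(storage_overhead(4, 1))  # 4 data blocks + 1 parity block
```

Note that `recover` has to read every surviving block, which illustrates the locality and bandwidth cost listed among the cons: a replica can be read from one place, while a reconstruction touches k other blocks.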

Explore decentralization of job managers
Potential challenge: load balancing
Potential solutions:
  Work stealing
  Hierarchical architecture
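To illustrate why work stealing helps with load balancing, here is a small discrete-time simulation. It is a sketch under stated assumptions, not a scheduler design from the slides: each worker has its own queue, task costs are positive integers measured in time steps, stealing is instantaneous, and an idle worker steals one task from the tail of a randomly chosen non-empty queue. The function name `makespan` and all parameters are hypothetical.

```python
import random
from collections import deque


def makespan(tasks_per_worker, steal=True, seed=0):
    """Return the number of time steps needed to finish every task.

    tasks_per_worker: one list of positive integer task costs per worker.
    """
    rng = random.Random(seed)
    queues = [deque(tasks) for tasks in tasks_per_worker]
    remaining = [0] * len(queues)  # steps left on each worker's current task
    steps = 0
    while any(queues) or any(remaining):
        for w, queue in enumerate(queues):
            if remaining[w] == 0:
                if not queue and steal:
                    # Idle worker with an empty queue: pick a victim and
                    # take one task from the tail of its queue.
                    victims = [v for v, q in enumerate(queues) if v != w and q]
                    if victims:
                        queue.append(queues[rng.choice(victims)].pop())
                if queue:
                    remaining[w] = queue.popleft()
        remaining = [max(0, r - 1) for r in remaining]
        steps += 1
    return steps


if __name__ == "__main__":
    # All eight unit-cost tasks start on worker 0; three workers are idle.
    skewed = [[1] * 8, [], [], []]
    print(makespan(skewed, steal=False))  # 8
    print(makespan(skewed, steal=True))   # 2
```

Owners take work from the head of their queue while thieves take from the tail, a common convention that reduces contention between the two in real implementations.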

Most code is inherently sequential in nature
  This was OK while we doubled processor speeds according to Moore's Law
Multi-core and manycore architectures are making sequential codes inefficient
How can existing codes be parallelized without burdening the programmer?

100~1000 cores per GPU
Do cluster computing programming approaches apply to GPUs?
How can GPUs be generalized for HPC use?
Does MapReduce map well to GPUs?
What architecture support is needed?
  Cores should have L1/L2 caches, and GPU memory should be an L3 cache for the host memory
    Nvidia Fermi might be a step in the right direction
  Allow cores to execute independent kernels
  No enforcement of coherency across cores
  Allow core-to-core communication

Workflows on GPUs
  Make GPUs look like clusters of computers
  Enable scheduling of independent tasks with support for file I/O
  Intel MICA architecture might be useful
Virtualize GPUs
  Allow GPUs to be partitioned over multiple virtual machines
  Work by Peter Dinda might be relevant and useful
Applications on GPUs
  Medical imaging, astronomy

Implement a distributed file system
Use FUSE for a general POSIX interface
Use structured distributed hash tables for distributed metadata management
  Can scale logarithmically with system size
  Can create network-topology-aware overlays
Relax data access semantics to increase scalability
  Eventual consistency on data modifications
  Write-once read-many data access patterns
Evaluate scalability and performance
  Compare to NFS, GPFS, PVFS, Lustre, HDFS
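As a rough illustration of hash-based metadata placement, the sketch below maps file paths to metadata servers with a consistent-hashing ring. It is an assumption-laden example rather than the design proposed in the slides: the `Ring` class, the server names, and the use of SHA-1 with virtual nodes are all hypothetical choices, and every client here holds the full ring. A structured DHT such as Chord instead keeps partial routing state per node, which is where the logarithmic scaling in hops comes from.

```python
import bisect
import hashlib


def ring_hash(key):
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")


class Ring:
    """Consistent-hashing ring that assigns each path to one server."""

    def __init__(self, servers, vnodes=64):
        # Each server appears at several ring positions (virtual nodes)
        # so that keys spread more evenly across servers.
        self._points = sorted(
            (ring_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._positions = [position for position, _ in self._points]

    def owner(self, path):
        """Return the server at the first ring position after hash(path)."""
        index = bisect.bisect_right(self._positions, ring_hash(path))
        return self._points[index % len(self._points)][1]


if __name__ == "__main__":
    old = Ring(["meta0", "meta1", "meta2"])
    new = Ring(["meta0", "meta1", "meta2", "meta3"])
    paths = [f"/data/run42/file{i}" for i in range(1000)]
    moved = sum(old.owner(p) != new.owner(p) for p in paths)
    print(f"{moved} of {len(paths)} paths changed owner after adding a server")
```

The property worth noticing is that adding a server only moves entries onto the new server; no path migrates between two existing servers, which keeps rebalancing traffic bounded as the system grows.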

Understand file access patterns in HPC
  How large are files and directories? How often is data accessed? What data is write-once read-many, or even write-once read-never? How much data is modified concurrently?
Explore the use of FUSE to implement file system functionality not provided by existing file systems
  Adding data provenance support to distributed/parallel file systems
  Adding novel features to the file system namespace

Modify the open source PVFS to achieve improvements in various areas:
  Fault tolerance
  High availability
  Metadata performance
  Scalability
Compare PVFS to GPFS and Lustre for various workloads

Compare cloud performance with grids and clusters
Explore variable pricing schemes, utilization models, etc.
Reduce the cost of cloud storage through novel architectures

Simulations at exascale and reliability
[Figure: application uptime (%) versus scale (number of nodes) for three cases: no checkpointing, checkpointing to a parallel file system, and checkpointing to a distributed file system. Labeled systems: 2000 BG/L (1,024 nodes), 2007 BG/L (106,496 nodes), 2009 BG/P (73,728 nodes), 2019 (~1,000,000 nodes).]

Reducing checkpoint overhead for MPI applications
Exploring redundancy to optimize MTTF and system throughput

High failure rate in modern HPC systems
  Large number of components
  Use of off-the-shelf unreliable components
Failure rates vary dynamically based on
  System architecture
  Workload
Replication for fault detection (and possibly tolerance)
  Independent virtual machines as replicas instead of stand-alone nodes
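A back-of-the-envelope model shows why component count matters so much. The sketch below is not taken from the slides; it assumes independent, exponentially distributed node failures (so system MTTF is node MTTF divided by node count) and uses Young's first-order approximation for the checkpoint interval, sqrt(2 x checkpoint cost x MTTF). The per-node MTTF and the checkpoint cost are hypothetical values chosen only to make the trend visible; the node counts are the ones quoted earlier in the deck.

```python
import math


def system_mttf(node_mttf_hours, nodes):
    """System MTTF assuming independent, exponentially distributed node failures."""
    return node_mttf_hours / nodes


def young_interval(checkpoint_cost_hours, mttf_hours):
    """Young's first-order approximation of the optimal checkpoint interval."""
    return math.sqrt(2 * checkpoint_cost_hours * mttf_hours)


if __name__ == "__main__":
    NODE_MTTF_HOURS = 1_000_000.0  # hypothetical per-node MTTF
    CHECKPOINT_COST_HOURS = 0.1    # hypothetical time to write one checkpoint
    for nodes in (1_024, 73_728, 106_496, 1_000_000):
        mttf = system_mttf(NODE_MTTF_HOURS, nodes)
        interval = young_interval(CHECKPOINT_COST_HOURS, mttf)
        print(f"{nodes:>9} nodes: system MTTF {mttf:8.2f} h, "
              f"checkpoint every {interval:5.2f} h")
```

Under these assumptions, once the suggested interval approaches the checkpoint cost itself, the application spends most of its time saving state, which is the motivation for cheaper checkpoint targets and for redundancy.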

Distributed file systems benchmarking
  Compare PVFS, GPFS, FusionFS, HDFS, and others
Distributed hash tables benchmarking
  Compare Chord, Tapestry, Kademlia, C-MPI, memcache, and some database-centric systems
