NFS, GPFS, PVFS, Lustre Batch-scheduled systems: Clusters, Grids, and Supercomputers Programming paradigm: HPC, MTC, and HTC
2 Segregated storage and compute
- NFS, GPFS, PVFS, Lustre
- Batch-scheduled systems: Clusters, Grids, and Supercomputers
- Programming paradigm: HPC, MTC, and HTC
Co-located storage and compute
- HDFS, GFS
- Data centers at Google, Yahoo, and others
- Programming paradigm: MapReduce
- Others from academia: Sector, MosaStore, Chirp
6 Local Disk
- 22-24: ANL/UC TG Site (7GB SCSI)
- Today: PADS (RAID-, 6 drives, 75GB SATA)
Cluster
- 22-24: ANL/UC TG Site (GPFS, 8 servers, 1Gb/s each)
- Today: PADS (GPFS, SAN)
Supercomputer (MB/s per processor core)
- IBM Blue Gene/L (GPFS)
- Today: IBM Blue Gene/P (GPFS)
Relative to local disk, throughput per core: cluster -2.2X and supercomputer -99X then; cluster -15X and supercomputer -438X today.
7 What if we could combine the scientific community's existing programming paradigms, yet still exploit the data locality that naturally occurs in scientific workloads?
9 [Figure: taxonomy of paradigms plotted by input data size (low to high) vs. number of tasks (1 to 1M): HPC (heroic MPI tasks), HTC/MTC (many loosely coupled tasks), MTC (big data and many tasks), MapReduce/MTC (data analysis, mining)]
[MTAGS8] Many-Task Computing for Grids and Supercomputers
10 Hypothesis: significant performance improvements can be obtained in the analysis of large datasets by leveraging information about data analysis workloads rather than individual data analysis tasks.
Important concepts related to the hypothesis:
- Workload: a complex query (or set of queries) decomposable into simpler tasks to answer broader analysis questions
- Data locality is crucial to the efficient use of large-scale distributed systems for scientific and data-intensive applications
- Allocate computational and caching storage resources, co-scheduled to optimize workload performance
11 Data diffusion:
- Resources acquired in response to demand
- Data diffuses from archival storage to newly acquired transient resources
- Resource caching allows faster responses to subsequent requests
- Resources are released when demand drops
- Optimizes performance by co-scheduling data and computation
- Decreases dependency on shared/parallel file systems
- Critical to support data-intensive MTC
[Architecture diagram: data-aware task dispatcher and data-aware scheduler moving tasks between idle resources, provisioned resources, persistent storage, and a shared file system]
[DADC8] Accelerating Large-scale Data Exploration through Data Diffusion
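The caching that lets data diffuse onto transient resources can be sketched as a per-node file cache. This is a minimal illustration assuming LRU eviction; the slides do not specify the actual eviction policy, and the class and field names here are hypothetical:

```python
from collections import OrderedDict

class NodeCache:
    """Per-node file cache with LRU eviction (illustrative sketch only;
    the real data diffusion cache policy may differ)."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.files = OrderedDict()  # filename -> size, ordered by recency

    def get(self, name):
        """Return True on a cache hit; a miss means the caller must fetch
        the file from persistent storage and then put() it."""
        if name not in self.files:
            return False
        self.files.move_to_end(name)  # mark as most recently used
        return True

    def put(self, name, size):
        # Evict least-recently-used files until the new file fits.
        while self.used + size > self.capacity and self.files:
            _, evicted_size = self.files.popitem(last=False)
            self.used -= evicted_size
        if size <= self.capacity:
            self.files[name] = size
            self.used += size
```

Subsequent requests for a cached file are then served locally rather than from the shared file system, which is what makes the released-on-idle resources cheap to re-acquire.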
12 What would data diffusion look like in practice? Extend the Falkon framework.
[SC7] Falkon: a Fast and Light-weight task execution framework
13 Scheduling policies:
- FA (first-available): simple load balancing
- MCH (max-cache-hit): maximize cache hits
- MCU (max-compute-util): maximize processor utilization
- GCC (good-cache-compute): maximize both cache hit rate and processor utilization at the same time
[DADC8] Accelerating Large-scale Data Exploration through Data Diffusion
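The four policies above can be sketched as one dispatch function. This is an illustrative sketch only: the worker/task fields and the idle-fraction threshold used for GCC's fallback are assumptions, not Falkon's actual implementation.

```python
def dispatch(task, workers, policy="GCC", idle_threshold=0.5):
    """Pick a worker for `task`. Workers have .cached (set of file names)
    and .busy (bool); tasks have .input (file name). Returns None if the
    policy prefers to wait rather than dispatch now."""
    idle = [w for w in workers if not w.busy]
    idle_holders = [w for w in idle if task.input in w.cached]

    if policy == "FA":   # first-available: ignore data locality entirely
        return idle[0] if idle else None
    if policy == "MCH":  # max-cache-hit: only dispatch to a cache holder
        return idle_holders[0] if idle_holders else None
    if policy == "MCU":  # max-compute-util: never leave a CPU idle
        return idle_holders[0] if idle_holders else (idle[0] if idle else None)
    if policy == "GCC":  # good-cache-compute: prefer holders, but fall back
        if idle_holders:
            return idle_holders[0]
        # ...to any idle worker once too large a fraction of CPUs sits idle
        if idle and len(idle) / len(workers) > idle_threshold:
            return idle[0]
        return None
    raise ValueError(f"unknown policy: {policy}")
```

The interesting trade-off is visible in the fallbacks: MCH sacrifices utilization for locality, MCU does the reverse, and GCC interpolates between them via the threshold.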
14 Testbed: ANL/UC TG, 3GHz dual CPUs, 128 processors; scheduling window of 25 tasks; dataset of 1K files, 1 byte each; each task reads 1 file and writes 1 file.
[Figure: CPU time per task (ms) and throughput (tasks/sec), broken down into task submit, notification for task availability, task dispatch (data-aware scheduler), task results, notification for task results, and WS communication, for first-available (without I/O), first-available (with I/O), max-compute-util, max-cache-hit, and good-cache-compute]
[DIDC9] Towards Data Intensive Many-Task Computing, under review
15 Workloads evaluated:
- Monotonically increasing workload: emphasizes increasing loads
- Sine-wave workload: emphasizes varying loads
- All-Pairs workload: compare to the best-case model of active storage
- Image stacking workload (astronomy): evaluate data diffusion on a real large-scale data-intensive application from the astronomy domain
[DADC8] Accelerating Large-scale Data Exploration through Data Diffusion
16 Monotonically increasing workload: 25K tasks, 1MB reads, 1ms compute; arrival rate varies from a minimum of 1 task/sec via the increment function CEILING(*1.3) up to a maximum of 1 tasks/sec; 128 processors; ideal case: 1415 sec at 8Gb/s peak throughput.
[Figure: arrival rate (per second) and tasks completed vs. time (sec)]
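Assuming the partially transcribed increment function means "multiply the current rate by 1.3 and take the ceiling," and assuming the truncated maximum is 1000 tasks/sec (the slide's figure is cut off), the arrival schedule could be generated as:

```python
import math

def arrival_schedule(start=1, factor=1.3, cap=1000):
    """Yield per-second arrival rates: start at `start` task/sec, grow by
    CEILING(rate * factor) each step, and finish at the cap.
    The cap value is an assumption; the transcription truncates it."""
    rate = start
    while rate < cap:
        yield rate
        rate = math.ceil(rate * factor)
    yield cap

rates = list(arrival_schedule())
# begins 1, 2, 3, 4, 6, 8, 11, ... and ends at the cap
```

The geometric growth is what makes this workload a stress test: the load roughly triples every few steps, so the provisioner must keep acquiring nodes to track demand.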
17 GPFS vs. ideal: 511 sec vs. 1415 sec.
[Figure: nodes allocated, throughput (Gb/s), wait queue length (x1k), and demand (Gb/s) vs. time (sec)]
18 [Figures: max-compute-util and max-cache-hit runs, each showing nodes allocated, throughput (Gb/s), queue length (x1k), cache hit/miss % (local hit, global hit, miss), CPU utilization %, demand (Gb/s), and wait queue length vs. time (sec)]
19 [Figures: good-cache-compute runs with per-node cache sizes of 1GB, 1.5GB, 2GB, and 4GB, each showing nodes allocated, throughput (Gb/s), queue length (x1k), cache hit/miss % (local hit, global hit, miss), demand (Gb/s), and wait queue length vs. time (sec)]
20 Data diffusion vs. ideal: 1436 sec vs. 1415 sec.
[Figure: nodes allocated, throughput (Gb/s), queue length (x1k), cache hit/miss % (local hit, global hit, miss), demand (Gb/s), and wait queue length vs. time (sec)]
21 Throughput: average 14Gb/s vs. 4Gb/s; peak 81Gb/s vs. 6Gb/s. Response time: 3 sec vs. 1569 sec (56X faster).
[Figures: throughput (Gb/s) from local worker caches, remote worker caches, and GPFS, and average response time (sec), for Ideal, FA, GCC 1GB, GCC 1.5GB, GCC 2GB, GCC 4GB, MCH 4GB, and MCU 4GB]
22 Performance index: 34X higher. Speedup: 3.5X faster than GPFS.
[Figure: performance index (compared to first-available) and speedup (compared to LAN GPFS) for FA, GCC 1GB, GCC 1.5GB, GCC 2GB, GCC 4GB, GCC 4GB SRP, MCH 4GB, and MCU 4GB]
23 Sine-wave workload: 2M tasks, 1MB reads, 1ms compute; arrival rate varies per a sinusoidal function of time based on sin(sqrt(time+.11)*...), from a minimum of 1 task/sec to a maximum of 1 tasks/sec; 2 processors; ideal case: 655 sec at 8Gb/s peak throughput.
[Figure: arrival rate (per sec) and number of tasks completed vs. time (sec)]
24 GPFS: 5.7 hrs, ~8Gb/s, 1138 CPU hrs.
[Figure: nodes allocated, throughput (Gb/s), queue length (x1k), demand (Gb/s), and wait queue length vs. time (sec)]
25 GPFS: 5.7 hrs, ~8Gb/s, 1138 CPU hrs. GCC+SRP: 1.8 hrs, ~25Gb/s, 361 CPU hrs.
[Figure: nodes allocated, throughput (Gb/s), queue length (x1k), cache hit/miss % (local hit, global hit, miss), demand (Gb/s), and wait queue length vs. time (sec)]
26 GPFS: 5.7 hrs, ~8Gb/s, 1138 CPU hrs. GCC+SRP: 1.8 hrs, ~25Gb/s, 361 CPU hrs. GCC+DRP: 1.86 hrs, ~24Gb/s, 253 CPU hrs.
[Figure: nodes allocated, throughput (Gb/s), queue length (x1k), cache hit/miss % (local hit, global hit, miss), demand (Gb/s), and wait queue length vs. time (sec)]
27 All-Pairs workload:
- 5x5: 25K tasks, 24MB reads, 1ms compute, 2 CPUs
- 1x1: 1M tasks, 24MB reads, 4 sec compute, 496 CPUs
- Ideal case: 655 sec at 8Gb/s peak throughput
All-Pairs(set A, set B, function F) returns matrix M: compare all elements of set A to all elements of set B via function F, yielding matrix M such that M[i,j] = F(A[i], B[j]).

foreach $i in A
  foreach $j in B
    submit_job F $i $j
  end
end

[DIDC9] Towards Data Intensive Many-Task Computing, under review
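The abstraction above can be sketched serially in Python; in the real systems each F(A[i], B[j]) invocation is submitted as an independent task rather than run in a loop:

```python
def all_pairs(A, B, F):
    """All-Pairs(A, B, F) -> matrix M with M[i][j] = F(A[i], B[j]).
    Serial sketch of the abstraction; real implementations dispatch
    each F(A[i], B[j]) as a separate job."""
    return [[F(a, b) for b in B] for a in A]

# Example: pairwise absolute differences
M = all_pairs([1, 2, 3], [10, 20], lambda a, b: abs(a - b))
# M == [[9, 19], [8, 18], [7, 17]]
```

Because every element of B is read by every task in a row, the workload working set is the whole of A and B, which is exactly what makes cache placement (push vs. pull) the central design question.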
28 Efficiency: 75%.
[Figure: throughput (Gb/s) for data diffusion vs. max throughput on GPFS and on local disk; cache hit local %, cache hit global %, and cache miss % vs. time (sec)]
29 Efficiency: 86%.
[Figure: throughput (Gb/s) for data diffusion vs. max throughput on GPFS and in local memory; cache hit local %, cache hit global %, and cache miss % vs. time (sec)]
[DIDC9] Towards Data Intensive Many-Task Computing, under review
30 Pull vs. Push:
- Data diffusion: pulls the task working set; incremental spanning forest
- Active storage: pushes the workload working set to all nodes; static spanning tree (Christopher Moretti, Douglas Thain, University of Notre Dame)
[Table: efficiency and data moved (local disk/memory, node-to-node network, and shared file system, in GB) for best case (active storage), Falkon (data diffusion), and best case (parallel file system), across experiments: 5x5 on 2 CPUs at 1 sec and .1 sec compute, and 1x1 on 496 CPUs and more CPUs at 4 sec compute]
[DIDC9] Towards Data Intensive Many-Task Computing, under review
31 Best to use active storage if:
- Slow data source
- Workload working set fits on local node storage
Best to use data diffusion if:
- Medium to fast data source
- Task working set << workload working set
- Task working set fits on local node storage
If the task working set does not fit on local node storage:
- Use a parallel file system (e.g. GPFS, Lustre, PVFS)
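The rules of thumb above could be encoded as a small decision helper. The function name, the discrete speed categories, and the working-set comparisons are illustrative assumptions, not part of the slides:

```python
def choose_strategy(source_speed, task_ws, workload_ws, local_storage):
    """Pick a data-placement strategy from the rules of thumb.
    source_speed: "slow", "medium", or "fast" (assumed categories);
    task_ws, workload_ws, local_storage: sizes in the same unit (e.g. GB)."""
    if task_ws > local_storage:
        # Task working set exceeds local node storage: only a shared
        # parallel file system (GPFS, Lustre, PVFS) can serve it.
        return "parallel file system"
    if source_speed == "slow" and workload_ws <= local_storage:
        # Push the whole workload working set out once, up front.
        return "active storage"
    if source_speed in ("medium", "fast") and task_ws < workload_ws:
        # Pull per-task working sets on demand and cache them.
        return "data diffusion"
    return "parallel file system"
```

For example, a slow archival source with a 50GB workload working set on 100GB local disks would select active storage, while a fast source with tasks touching only a small slice of a 1TB dataset would select data diffusion.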
32 Image stacking (astronomy):
- Purpose: on-demand stacks of random locations within a ~1TB dataset
- Challenge: processing costs of O(1ms) per object; data intensive (4MB:1sec); rapid access to 1-1K random files; time-varying load
[Figure: data locality in the AstroPortal (AP) Sloan workload, by number of objects and number of files]
[DADC8] Accelerating Large-scale Data Exploration through Data Diffusion
[TG6] AstroPortal: A Science Gateway for Large-scale Astronomy Data Analysis
33 [Figure: time (ms) per stage (open, radec2xy, readHDU+getTile+curl+convertArray, calibration+interpolation+doStacking, writeStacking) for each filesystem and image format combination: GPFS GZ, LOCAL GZ, GPFS FIT, LOCAL FIT]
[DADC8] Accelerating Large-scale Data Exploration through Data Diffusion
34 Low data locality: similar (but better) performance compared to GPFS. High data locality: near-perfect scalability.
[Figures: time (ms) per stack per CPU vs. number of CPUs for Data Diffusion (GZ), Data Diffusion (FIT), GPFS (GZ), and GPFS (FIT)]
[DADC8] Accelerating Large-scale Data Exploration through Data Diffusion
35 Aggregate throughput: 39Gb/s, 1X higher than GPFS. Reduced load on GPFS: .49Gb/s, 1/1 of the original load. Big performance gains as locality increases.
[Figures: aggregate throughput (Gb/s) vs. locality for Data Diffusion (GZ), Data Diffusion (FIT), GPFS (GZ), GPFS (FIT), and Ideal; time (ms) per stack per CPU vs. locality, split into data diffusion throughput (local, cache-to-cache) and GPFS throughput (FIT, GZ)]
[DADC8] Accelerating Large-scale Data Exploration through Data Diffusion
36
- Data access patterns: write once, read many
- Task definition must include input/output file metadata
- Per-task working set must fit in local storage
- Needs IP connectivity between hosts
- Needs local storage (disk, memory, etc.)
- Needs Java
37
- [Ghemawat3, Dean4]: MapReduce+GFS
- [Bialecki5]: Hadoop+HDFS
- [Gu6]: Sphere+Sector
- [Tatebe4]: Gfarm
- [Chervenak4]: RLS, DRS
- [Kosar6]: Stork
Conclusions: none focused on the co-location of storage and generic black-box computations with data-aware scheduling while operating in a dynamic elastic environment. Swift + Falkon + Data Diffusion is arguably a more generic and powerful solution than MapReduce.
38 Identified that data locality is crucial to the efficient use of large-scale distributed systems for data-intensive applications.
Data diffusion:
- Integrated streamlined task dispatching with data-aware scheduling policies
- Heuristics to maximize real-world performance
- Suitable for varying, data-intensive workloads
- Proof of O(NM) competitive caching
39 Falkon is a real system:
- Late 25: initial prototype, AstroPortal
- January 27: Falkon v.
- November 27: Globus incubator project v.1
- February 29: Globus incubator project v.9
- Implemented in Java (~2K lines of code) and C (~1K lines of code)
- Open source: svn co
- Source code contributors (besides myself): Yong Zhao, Zhao Zhang, Ben Clifford, Mihael Hategan
[Globus7] Falkon: A Proposal for Project Globus Incubation
40 Workload: 16K CPUs, 1M tasks, 6 sec per task; 2 CPU years completed in 453 sec; throughput: 2312 tasks/sec; 85% efficiency.
[TPDS9] Middleware Support for Many-Task Computing, under preparation
41 [TPDS9] Middleware Support for Many-Task Computing, under preparation
42 ACM MTAGS9 at SC9; due date: August 1st, 29.
43 IEEE TPDS Journal, Special Issue on MTC; due date: December 1st, 29.
44 More information:
Other publications: Falkon: Swift:
Funding:
- NASA: Ames Research Center, GSRP
- DOE: Office of Advanced Scientific Computing Research, Office of Science, U.S. Dept. of Energy
- NSF: TeraGrid
Relevant activities:
- ACM MTAGS9 Workshop at Supercomputing 29
- Special Issue on MTC in IEEE TPDS Journal
1 Improved Solutions for I/O Provisioning and Application Acceleration August 11, 2015 Jeff Sisilli Sr. Director Product Marketing jsisilli@ddn.com 2 Why Burst Buffer? The Supercomputing Tug-of-War A supercomputer
More informationSAS Enterprise Miner Performance on IBM System p 570. Jan, Hsian-Fen Tsao Brian Porter Harry Seifert. IBM Corporation
SAS Enterprise Miner Performance on IBM System p 570 Jan, 2008 Hsian-Fen Tsao Brian Porter Harry Seifert IBM Corporation Copyright IBM Corporation, 2008. All Rights Reserved. TABLE OF CONTENTS ABSTRACT...3
More informationFalkon: a Fast and Light-weight task execution framework
: a Fast and Light-weight task execution framework Ioan Raicu *, Yong Zhao *, Catalin Dumitrescu *, Ian Foster #*+, Mike Wilde #+ * Department of Computer Science, University of Chicago, IL, USA + Computation
More informationBeyond Petascale. Roger Haskin Manager, Parallel File Systems IBM Almaden Research Center
Beyond Petascale Roger Haskin Manager, Parallel File Systems IBM Almaden Research Center GPFS Research and Development! GPFS product originated at IBM Almaden Research Laboratory! Research continues to
More informationLecture 11 Hadoop & Spark
Lecture 11 Hadoop & Spark Dr. Wilson Rivera ICOM 6025: High Performance Computing Electrical and Computer Engineering Department University of Puerto Rico Outline Distributed File Systems Hadoop Ecosystem
More informationCharacterizing Storage Resources Performance in Accessing the SDSS Dataset Ioan Raicu Date:
Characterizing Storage Resources Performance in Accessing the SDSS Dataset Ioan Raicu Date: 8-17-5 Table of Contents Table of Contents...1 Table of Figures...1 1 Overview...4 2 Experiment Description...4
More informationPLATFORM AND SOFTWARE AS A SERVICE THE MAPREDUCE PROGRAMMING MODEL AND IMPLEMENTATIONS
PLATFORM AND SOFTWARE AS A SERVICE THE MAPREDUCE PROGRAMMING MODEL AND IMPLEMENTATIONS By HAI JIN, SHADI IBRAHIM, LI QI, HAIJUN CAO, SONG WU and XUANHUA SHI Prepared by: Dr. Faramarz Safi Islamic Azad
More informationStore Process Analyze Collaborate Archive Cloud The HPC Storage Leader Invent Discover Compete
Store Process Analyze Collaborate Archive Cloud The HPC Storage Leader Invent Discover Compete 1 DDN Who We Are 2 We Design, Deploy and Optimize Storage Systems Which Solve HPC, Big Data and Cloud Business
More informationChapter 4:- Introduction to Grid and its Evolution. Prepared By:- NITIN PANDYA Assistant Professor SVBIT.
Chapter 4:- Introduction to Grid and its Evolution Prepared By:- Assistant Professor SVBIT. Overview Background: What is the Grid? Related technologies Grid applications Communities Grid Tools Case Studies
More informationFast Forward I/O & Storage
Fast Forward I/O & Storage Eric Barton Lead Architect 1 Department of Energy - Fast Forward Challenge FastForward RFP provided US Government funding for exascale research and development Sponsored by 7
More informationDistributed NoSQL Storage for Extreme-scale System Services
Distributed NoSQL Storage for Extreme-scale System Services Short Bio 6 th year PhD student from DataSys Lab, CS Department, Illinois Institute oftechnology,chicago Academic advisor: Dr. Ioan Raicu Research
More informationAnalytics in the cloud
Analytics in the cloud Dow we really need to reinvent the storage stack? R. Ananthanarayanan, Karan Gupta, Prashant Pandey, Himabindu Pucha, Prasenjit Sarkar, Mansi Shah, Renu Tewari Image courtesy NASA
More informationEmerging Technologies for HPC Storage
Emerging Technologies for HPC Storage Dr. Wolfgang Mertz CTO EMEA Unstructured Data Solutions June 2018 The very definition of HPC is expanding Blazing Fast Speed Accessibility and flexibility 2 Traditional
More informationTopics. Big Data Analytics What is and Why Hadoop? Comparison to other technologies Hadoop architecture Hadoop ecosystem Hadoop usage examples
Hadoop Introduction 1 Topics Big Data Analytics What is and Why Hadoop? Comparison to other technologies Hadoop architecture Hadoop ecosystem Hadoop usage examples 2 Big Data Analytics What is Big Data?
More informationParallel File Systems. John White Lawrence Berkeley National Lab
Parallel File Systems John White Lawrence Berkeley National Lab Topics Defining a File System Our Specific Case for File Systems Parallel File Systems A Survey of Current Parallel File Systems Implementation
More informationReduction of Workflow Resource Consumption Using a Density-based Clustering Model
Reduction of Workflow Resource Consumption Using a Density-based Clustering Model Qimin Zhang, Nathaniel Kremer-Herman, Benjamin Tovar, and Douglas Thain WORKS Workshop at Supercomputing 2018 The Cooperative
More informationDELL EMC ISILON F800 AND H600 I/O PERFORMANCE
DELL EMC ISILON F800 AND H600 I/O PERFORMANCE ABSTRACT This white paper provides F800 and H600 performance data. It is intended for performance-minded administrators of large compute clusters that access
More informationBIG DATA AND HADOOP ON THE ZFS STORAGE APPLIANCE
BIG DATA AND HADOOP ON THE ZFS STORAGE APPLIANCE BRETT WENINGER, MANAGING DIRECTOR 10/21/2014 ADURANT APPROACH TO BIG DATA Align to Un/Semi-structured Data Instead of Big Scale out will become Big Greatest
More informationJob sample: SCOPE (VLDBJ, 2012)
Apollo High level SQL-Like language The job query plan is represented as a DAG Tasks are the basic unit of computation Tasks are grouped in Stages Execution is driven by a scheduler Job sample: SCOPE (VLDBJ,
More informationShared Parallel Filesystems in Heterogeneous Linux Multi-Cluster Environments
LCI HPC Revolution 2005 26 April 2005 Shared Parallel Filesystems in Heterogeneous Linux Multi-Cluster Environments Matthew Woitaszek matthew.woitaszek@colorado.edu Collaborators Organizations National
More informationHadoop File System S L I D E S M O D I F I E D F R O M P R E S E N T A T I O N B Y B. R A M A M U R T H Y 11/15/2017
Hadoop File System 1 S L I D E S M O D I F I E D F R O M P R E S E N T A T I O N B Y B. R A M A M U R T H Y Moving Computation is Cheaper than Moving Data Motivation: Big Data! What is BigData? - Google
More informationPERFORMANCE ANALYSIS AND OPTIMIZATION OF MULTI-CLOUD COMPUITNG FOR LOOSLY COUPLED MTC APPLICATIONS
PERFORMANCE ANALYSIS AND OPTIMIZATION OF MULTI-CLOUD COMPUITNG FOR LOOSLY COUPLED MTC APPLICATIONS V. Prasathkumar, P. Jeevitha Assiatant Professor, Department of Information Technology Sri Shakthi Institute
More informationThe Red Storm System: Architecture, System Update and Performance Analysis
The Red Storm System: Architecture, System Update and Performance Analysis Douglas Doerfler, Jim Tomkins Sandia National Laboratories Center for Computation, Computers, Information and Mathematics LACSI
More informationNew User Seminar: Part 2 (best practices)
New User Seminar: Part 2 (best practices) General Interest Seminar January 2015 Hugh Merz merz@sharcnet.ca Session Outline Submitting Jobs Minimizing queue waits Investigating jobs Checkpointing Efficiency
More informationNext-Generation NVMe-Native Parallel Filesystem for Accelerating HPC Workloads
Next-Generation NVMe-Native Parallel Filesystem for Accelerating HPC Workloads Liran Zvibel CEO, Co-founder WekaIO @liranzvibel 1 WekaIO Matrix: Full-featured and Flexible Public or Private S3 Compatible
More informationOptimizing Load Balancing and Data-Locality with Data-aware Scheduling
Optimizing Load Balancing and Data-Locality with Data-aware Scheduling Ke Wang *, Xiaobing Zhou, Tonglin Li *, Dongfang Zhao *, Michael Lang, Ioan Raicu * * Illinois Institute of Technology, Hortonworks
More informationPARDA: Proportional Allocation of Resources for Distributed Storage Access
PARDA: Proportional Allocation of Resources for Distributed Storage Access Ajay Gulati, Irfan Ahmad, Carl Waldspurger Resource Management Team VMware Inc. USENIX FAST 09 Conference February 26, 2009 The
More informationEFFICIENT ALLOCATION OF DYNAMIC RESOURCES IN A CLOUD
EFFICIENT ALLOCATION OF DYNAMIC RESOURCES IN A CLOUD S.THIRUNAVUKKARASU 1, DR.K.P.KALIYAMURTHIE 2 Assistant Professor, Dept of IT, Bharath University, Chennai-73 1 Professor& Head, Dept of IT, Bharath
More informationMap Reduce. Yerevan.
Map Reduce Erasmus+ @ Yerevan dacosta@irit.fr Divide and conquer at PaaS 100 % // Typical problem Iterate over a large number of records Extract something of interest from each Shuffle and sort intermediate
More informationMapReduce and Friends
MapReduce and Friends Craig C. Douglas University of Wyoming with thanks to Mookwon Seo Why was it invented? MapReduce is a mergesort for large distributed memory computers. It was the basis for a web
More informationApril Copyright 2013 Cloudera Inc. All rights reserved.
Hadoop Beyond Batch: Real-time Workloads, SQL-on- Hadoop, and the Virtual EDW Headline Goes Here Marcel Kornacker marcel@cloudera.com Speaker Name or Subhead Goes Here April 2014 Analytic Workloads on
More informationMap-Reduce. Marco Mura 2010 March, 31th
Map-Reduce Marco Mura (mura@di.unipi.it) 2010 March, 31th This paper is a note from the 2009-2010 course Strumenti di programmazione per sistemi paralleli e distribuiti and it s based by the lessons of
More informationFuxiSort. Jiamang Wang, Yongjun Wu, Hua Cai, Zhipeng Tang, Zhiqiang Lv, Bin Lu, Yangyu Tao, Chao Li, Jingren Zhou, Hong Tang Alibaba Group Inc
Fuxi Jiamang Wang, Yongjun Wu, Hua Cai, Zhipeng Tang, Zhiqiang Lv, Bin Lu, Yangyu Tao, Chao Li, Jingren Zhou, Hong Tang Alibaba Group Inc {jiamang.wang, yongjun.wyj, hua.caihua, zhipeng.tzp, zhiqiang.lv,
More informationLustre A Platform for Intelligent Scale-Out Storage
Lustre A Platform for Intelligent Scale-Out Storage Rumi Zahir, rumi. May 2003 rumi.zahir@intel.com Agenda Problem Statement Trends & Current Data Center Storage Architectures The Lustre File System Project
More informationdavidklee.net gplus.to/kleegeek linked.com/a/davidaklee
@kleegeek davidklee.net gplus.to/kleegeek linked.com/a/davidaklee Specialties / Focus Areas / Passions: Performance Tuning & Troubleshooting Virtualization Cloud Enablement Infrastructure Architecture
More informationIntroduction to FREE National Resources for Scientific Computing. Dana Brunson. Jeff Pummill
Introduction to FREE National Resources for Scientific Computing Dana Brunson Oklahoma State University High Performance Computing Center Jeff Pummill University of Arkansas High Peformance Computing Center
More informationWarehouse- Scale Computing and the BDAS Stack
Warehouse- Scale Computing and the BDAS Stack Ion Stoica UC Berkeley UC BERKELEY Overview Workloads Hardware trends and implications in modern datacenters BDAS stack What is Big Data used For? Reports,
More informationAnalyzing the High Performance Parallel I/O on LRZ HPC systems. Sandra Méndez. HPC Group, LRZ. June 23, 2016
Analyzing the High Performance Parallel I/O on LRZ HPC systems Sandra Méndez. HPC Group, LRZ. June 23, 2016 Outline SuperMUC supercomputer User Projects Monitoring Tool I/O Software Stack I/O Analysis
More informationClouds: An Opportunity for Scientific Applications?
Clouds: An Opportunity for Scientific Applications? Ewa Deelman USC Information Sciences Institute Acknowledgements Yang-Suk Ki (former PostDoc, USC) Gurmeet Singh (former Ph.D. student, USC) Gideon Juve
More information