A Data Diffusion Approach to Large Scale Scientific Exploration

Size: px
Start display at page:

Download "A Data Diffusion Approach to Large Scale Scientific Exploration"

Transcription

1 A Data Diffusion Approach to Large Scale Scientific Exploration Ioan Raicu Distributed Systems Laboratory Computer Science Department University of Chicago Joint work with: Yong Zhao: Microsoft Ian Foster: Univ. of Chicago, CS & CI, Argonne National Laboratory, MCS Alex Szalay: The Johns Hopkins Univ., Dept. of Physics and Astronomy Funded by: NSF: TeraGrid DOE: Advanced Scientific Computing Research NASA: Ames Research Center 27 Microsoft escience Workshop at RENCI October 21 st, 27

2 Data Diffusion Example: AstroPortal Stacking Service Purpose On-demand stacks of random locations within ~1TB dataset Challenge Rapid access to 1-1K random files Time-varying load Solution Dynamic acquisition of compute, storage 1/22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 2 S 4 Web page or Web Service = Sloan Data

3 Falkon: a Fast and Light-weight task execution framework Goal: enable the rapid and efficient execution of many independent jobs on large compute clusters Combines three techniques: multi-level scheduling techniques to enable separate treatments of resource provisioning a streamlined task dispatcher able to achieve order-of-magnitude higher task dispatch rates than conventional schedulers performs data diffusion and uses a data-aware scheduler to leverage the co-located computational and storage resources 1/22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 3

4 Execution Model Dispatch Policy first-available, max-compute-util, max-cache-hit* Caching Policy random, FIFO, LRU, LFU Replay policy Resource Acquisition Policy one-at-a-time, additive, exponential, all-at-once, optimal* Resource Release Policy distributed, centralized* 1/22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 4

5 Falkon 2-Tier Architecture Tier 1: Dispatcher GT4 Web Service accepting task submissions from clients and sending them to available executors Tier 2: Executor Run tasks on local resources Provisioner Static and dynamic resource provisioning WS Dispatcher Provisioner Compute Resource 1 WS Clients Compute Resource m Compute Resources Executor 1 Executor n 1/22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 5

6 Dispatch Throughput Throughput (tasks/sec) WS Calls (no security) Falkon (no security) Falkon (GSISecureConversation) Number of Executors 1/22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 7

7 Execution Efficiency Efficiency Sleep 64 Sleep 32 Sleep 16 Sleep 8 Sleep 4 Sleep 2 Sleep Number of Executors 1/22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 8

8 Resource Provisioning. provisioner registration 1. task(s) submit 2. resource allocation to GRAM 3. resource allocation to LRM 4. executor registration 5. notification for work 6. pick up task(s) 7. deliver task(s) results 8. notification for task result 9. pick up task(s) results 1/22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 1

9 Dynamic Resource Provisioning End-to-end execution time: 126 sec in ideal case 494 sec 1276 sec Average task queue time: 42.2 sec in ideal case 611 sec 43.5 sec Trade-off: Resource Utilization for Execution Efficiency GRAM +PBS Ideal (32 nodes) Falkon-15 Falkon-6 Falkon-12 Falkon-18 Falkon- Queue Time (sec) Execution Time (sec) Execution Time % 8.5% 17.% 17.6% 19.3% 28.7% 29.2% 29.7% GRAM +PBS Ideal (32 nodes) Falkon-15 Falkon-6 Falkon-12 Falkon-18 Falkon- Time to complete (sec) Resouce Utilization 3% 89% 75% 65% 59% 44% 1% Execution Efficiency 26% 72% 75% 84% 85% 99% 1% Resource Allocations /22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 13

10 Data Diffusion Data caching at executor local disk RANDOM FIFO LRU LFU Avoid the need for a shared file system Theoretical linear Throughput (Mb/s) scalability with compute resources Local Read Local Read+Write GPFS Read GPFS Read+Write File Size (MB) /22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 14

11 Shared File System Performance Read Throughput (Mb/s) Nodes 32 Nodes 16 Nodes 8 Nodes 4 Nodes 2 Nodes 1 Node Read+Write Throughput (Mb/s) Nodes 32 Nodes 16 Nodes 8 Nodes 4 Nodes 2 Nodes 1 Node 1E File Size (MB) 1 1 1E File Size (MB) 1 1 1/22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 15

12 Local Disk Theoretical Performance Read Throughput (Mb/s) Nodes 32 Nodes 16 Nodes 8 Nodes 4 Nodes 2 Nodes 1 Node Read+Write Throughput (Mb/s) Nodes 32 Nodes 16 Nodes 8 Nodes 4 Nodes 2 Nodes 1 Node 1E File Size (MB) 1 1 1E File Size (MB) 1 1 1/22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 16

13 Data Diffusion: Micro-Benchmarks 1. Model (local disk): local disk performance model 2. Model (shared file system): shared file system (GPFS) performance model 3. Falkon (next-available policy): load balancing across available executors, operates on shared file system 4. Falkon (next-available policy) + Wrapper: same as (3), but tasks execute through a wrapper that creates a temp scratch directory on the shared file system, makes symbolic links, executes task, and removes the temp scratch directory and symbolic links 5. Falkon (first-available policy % locality): load balancing across available executors with data caching; % data locality with data read from the shared file system 6. Falkon (first-available policy 1% locality): same as (5), but with the workload from (5) repeating four times; the caches are first populated (not as part of the timed experiment) 7. Falkon (max-compute-util policy % locality): same as (5), but uses the dataaware scheduler to send tasks to the executors that have the most data cached; the workload has % data locality 8. Falkon (max-compute-util policy 1% locality): same as (7), but with the workload repeated four times 1/22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 19

14 Micro-Benchmarks Results: 1-64 Nodes Read Throughput (Mb/s) Model (local disk) 2. Model (shared file system) 3. Falkon (next-available policy) 4. Falkon (next-available policy) + Wrapper 5. Falkon (first-available policy % locality) 6. Falkon (first-available policy 1% locality) 7. Falkon (max-compute-util policy % locality) 8. Falkon (max-compute-util policy 1% locality) Read+Write Throughput (Mb/s) Model (local disk) 2. Model (shared file system) 3. Falkon (next-available policy) 4. Falkon (next-available policy) + Wrapper 5. Falkon (first-available policy % locality) 6. Falkon (first-available policy 1% locality) 7. Falkon (max-compute-util policy % locality) 8. Falkon (max-compute-util policy 1% locality) Number of Nodes Number of Nodes 1/22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 2

15 Micro-Benchmarks Results: 1 Node Read Throughput (Mb/s) Model (local disk) 2. Model (shared file system) 3. Falkon (next-available policy) 4. Falkon (next-available policy) + Wrapper 5. Falkon (first-available policy % locality) 6. Falkon (first-available policy 1% locality) 7. Falkon (max-compute-util policy % locality) 8. Falkon (max-compute-util policy 1% locality) Read+Write Throughput (Mb/s) Model (local disk) 2. Model (shared file system) 3. Falkon (next-available policy) 4. Falkon (next-available policy) + Wrapper 5. Falkon (first-available policy % locality) 6. Falkon (first-available policy 1% locality) 7. Falkon (max-compute-util policy % locality) 8. Falkon (max-compute-util policy 1% locality) 1B 1KB 1KB 1KB 1MB 1MB 1MB 1GB File Size 1B 1KB 1KB 1KB 1MB 1MB 1MB 1GB File Size 1/22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 21

16 Micro-Benchmarks Results: 8 Nodes Read Throughput (Mb/s) Model (local disk) 2. Model (shared file system) 3. Falkon (next-available policy) 4. Falkon (next-available policy) + Wrapper 5. Falkon (first-available policy % locality) 6. Falkon (first-available policy 1% locality) 7. Falkon (max-compute-util policy % locality) 8. Falkon (max-compute-util policy 1% locality) Read+Write Throughput (Mb/s) Model (local disk) 2. Model (shared file system) 3. Falkon (next-available policy) 4. Falkon (next-available policy) + Wrapper 5. Falkon (first-available policy % locality) 6. Falkon (first-available policy 1% locality) 7. Falkon (max-compute-util policy % locality) 8. Falkon (max-compute-util policy 1% locality) 1B 1KB 1KB 1KB 1MB 1MB 1MB 1GB File Size 1B 1KB 1KB 1KB 1MB 1MB 1MB 1GB File Size 1/22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 22

17 Micro-Benchmarks Results: 64 Nodes Read Throughput (Mb/s) Model (local disk) 2. Model (shared file system) 3. Falkon (next-available policy) 4. Falkon (next-available policy) + Wrapper 5. Falkon (first-available policy % locality) 6. Falkon (first-available policy 1% locality) 7. Falkon (max-compute-util policy % locality) 8. Falkon (max-compute-util policy 1% locality) Read+Write Throughput (Mb/s) Model (local disk) 2. Model (shared file system) 3. Falkon (next-available policy) 4. Falkon (next-available policy) + Wrapper 5. Falkon (first-available policy % locality) 6. Falkon (first-available policy 1% locality) 7. Falkon (max-compute-util policy % locality) 8. Falkon (max-compute-util policy 1% locality) 1B 1KB 1KB 1KB 1MB 1MB 1MB 1GB File Size 1B 1KB 1KB 1KB 1MB 1MB 1MB 1GB File Size 1/22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 23

18 Task Dispatch Throughput: 64 Nodes, 1 Byte files on GPFS Read Read+Write Throughput (tasks/sec) No Data Access 3. Falkon (nextavailable policy) 4. Falkon (nextavailable policy) + Wrapper 5. Falkon (firstavailable policy % locality) 6. Falkon (firstavailable policy 1% locality) 7. Falkon (maxcompute-util policy compute-util policy 8. Falkon (max- % locality) 1% locality) 1/22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 24

19 Dispatch Policies 1% 1%1%1% 1%1%1% 99% 1% 99% 1% 99% 1% 1% 97% 1% 97% Cache Hit Rate (local disk) 9% 8% 7% 6% 5% 4% 3% 63% 31% first-available max-compute-util max-cache-hit 2% 16% 1% % 8% 4% Number of Nodes 1/22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 25

20 Applications via Swift Application #Tasks/workflow #Stages ATLAS: High Energy Physics Event Simulation 5K 1 fmri DBIC: AIRSN Image Processing 1s 12 FOAM: Ocean/Atmosphere Model 2 3 GADU: Genomics 4K 4 HNL: fmri Aphasia Study 5 4 NVO/NASA: Photorealistic Montage/Morphology 1s 16 QuarkNet/I2U2: Physics Science Education 1s 3 ~ 6 RadCAD: Radiology Specification 1s 5 Classifier Training Abstract SIDGrid: EEG Wavelet 1s 2 computation Processing, Gaze Analysis SDSS: Coadd, Cluster Search 4K, 5K 2, 8 SDSS: Stacking, AstroPortal 1Ks ~ 1Ks 2 ~ 4 SwiftScript MolDyn: Molecular Dynamics 1Ks ~ 2Ks 8 Compiler Virtual Data Catalog SwiftScript Swift Architecture Scheduling Execution Engine (Karajan w/ Swift Runtime) C C C C Swift runtime callouts Status reporting Provenance collector Execution Virtual Node(s) Provenance data launcher Provenance data launcher file1 App F1 file2 App F2 file3 Provisioning Falkon Resource Provisioner Amazon EC2 1/22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 29 13

21 Applications Medicine: fmri (Falkon vs. GRAM) Astronomy: Montage (Falkon vs. GRAM vs. MPI) Stacking (Falkon with Data Diffusion vs. Falkon using shared file system) 1/22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 3

22 Applications: fmri GRAM vs. Falkon: 85%~9% lower run time GRAM/Clustering vs. Falkon: 4%~74% lower run time GRAM GRAM/Clustering Falkon Time (s) /22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 31 Input Data Size (Volumes)

23 Applications: Montage GRAM/Clustering vs. Falkon: 57% lower application run time MPI* vs. Falkon: 4% higher application run time * MPI should be lower bound GRAM/Clustering MPI Falkon Time (s) mproject mdiff/fit mbackground madd(sub) 1/22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 32 Components madd total

24 Applications: Stacking Object Count per File Files 32M+ objects 1.5M+ files 3.3 TB compressed / 9TB uncompressed 1E+6 1E+6 1E+6 SDSS DR5 Dataset 1/22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 33

25 Stacking Workloads Uniform sample from 32M objects ZIPF distribution of object popularity alpha =.1, 1., 2. Stacking Size (# of objects) Object Distribution Working Set Size (# of files) Working Set Size (# of objects) Locality (accesses per file) 1 ZIPF - alpha ZIPF - alpha ZIPF - alpha UNIFORM /22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 34

26 Stacking Performance Evaluation Time (sec) ZIPF - alpha.1 ZIPF - alpha 1. ZIPF - alpha 2. UNIFORM Web Server Shared File System (WAN) Shared File System (LAN) Data Diffusion Original Dataset Location 1/22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 35

27 Object Size Effect 35 Shared File System (LAN) Data Diffusion 3 25 Time (sec) x1 1x1 5x5 Object Size 1/22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 36

28 Data Diffusion Effect Shared File System (LAN - ZIPF 2.) Shared File System (LAN - ZIPF 1.) Shared File System (LAN - ZIPF.1) Shared File System (LAN - UNIFORM) Data Diffusion (ZIPF 2.) Data Diffusion (ZIPF 1.) 1st Time 2nd Time Data Diffusion (ZIPF.1) Data Diffusion (UNIFORM) Time (sec) 1/22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 37

29 1K*2 Stacking Workload 1 9 1K Objects - UNIFORM 1st Time: Instantenous Throughput 1K Objects - UNIFORM 1st Time: Average Throughput 1K Objects - UNIFORM 2nd Time: Instantenous Throughput 1K Objects - UNIFORM 2nd Time: Average Throughput Throughput (Cutouts/sec) Time (sec) 1/22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 38

30 1K Stacking Workload 25 1K Objects - UNIFORM: Instantenous Throughput 1K Objects - UNIFORM: Average Throughput Throughput (Cutouts/sec) Time (sec) 1/22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 39

31 Future Work Explore data pre-fetching policies Update Swift to use data diffusion 3-Tier Architecture (essential on BG/P) Extend provisioner to support the Virtual Workspace Service (opens door to EC2) Globus Incubator Project Alternate languages and technologies 1/22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 4

32 More Information Web: Related Projects: Swift: AstroPortal: Data Diffusion Collaborators: Ioan Raicu, Computer Science Dept., The University of Chicago Yong Zhao, Microsoft Ian Foster, Math and Computer Science Div., Argonne National Laboratory & Computer Science Dept., The University of Chicago Alex Szalay, Department of Physics and Astronomy, The Johns Hopkins University Other Falkon Collaborators: Catalin Dumitrescu, Computer Science Dept., The University of Chicago Mike Wilde, Computation Institute, University of Chicago & Argonne National Laboratory Funding: NASA: Ames Research Center, Graduate Student Research Program (GSRP) DOE: Mathematical, Information, and Computational Sciences Division subprogram of the Office of Advanced Scientific Computing Research, Office of Science, U.S. Dept. of Energy NSF: TeraGrid 1/22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 44

33 More Information: Related Documents Ioan Raicu, Yong Zhao, Catalin Dumitrescu, Ian Foster, Mike Wilde. Falkon: a Fast and Light-weight task execution framework, to appear at IEEE/ACM SuperComputing 27. Ioan Raicu, Catalin Dumitrescu, Ian Foster. Dynamic Resource Provisioning in Grid Environments, to appear TeraGrid Conference 27. Yong Zhao, Mihael Hategan, Ben Clifford, Ian Foster, Gregor von Laszewski, Ioan Raicu, Tiberiu Stef-Praun, Mike Wilde. Swift: Fast, Reliable, Loosely Coupled Parallel Computation, to appear at IEEE Workshop on Scientific Workflows 27. Yong Zhao, Mihael Hategan, Ioan Raicu, Mike Wilde, Ian Foster. Swift: a Parallel Programming Tool for Large Scale Scientific Computations, under review at Scientific Programming Journal, Special Issue on Dynamic Computational Workflows: Discovery, Optimization, and Scheduling. Ioan Raicu, Ian Foster, Alex Szalay. Harnessing Grid Resources to Enable the Dynamic Analysis of Large Astronomy Datasets, poster presentation, IEEE/ACM SuperComputing 26. Ioan Raicu, Ian Foster, Alex Szalay, Gabriela Turcu. AstroPortal: A Science Gateway for Large-scale Astronomy Data Analysis, TeraGrid Conference 26, June 26. Alex Szalay, Julian Bunn, Jim Gray, Ian Foster, Ioan Raicu. The Importance of Data Locality in Distributed Computing Applications, NSF Workflow Workshop 26. 1/22/27 A Data Diffusion Approach to Large Scale Scientific Exploration 45

Managing and Executing Loosely-Coupled Large-Scale Applications on Clusters, Grids, and Supercomputers

Managing and Executing Loosely-Coupled Large-Scale Applications on Clusters, Grids, and Supercomputers Managing and Executing Loosely-Coupled Large-Scale Applications on Clusters, Grids, and Supercomputers Ioan Raicu Distributed Systems Laboratory Computer Science Department University of Chicago Collaborators:

More information

Ioan Raicu Distributed Systems Laboratory Computer Science Department University of Chicago

Ioan Raicu Distributed Systems Laboratory Computer Science Department University of Chicago Running 1 Million Jobs in 10 Minutes via the Falkon Fast and Light-weight Ioan Raicu Distributed Systems Laboratory Computer Science Department University of Chicago In Collaboration with: Ian Foster,

More information

Storage and Compute Resource Management via DYRE, 3DcacheGrid, and CompuStore

Storage and Compute Resource Management via DYRE, 3DcacheGrid, and CompuStore Storage and Compute Resource Management via DYRE, 3DcacheGrid, and CompuStore Ioan Raicu Distributed Systems Laboratory Computer Science Department University of Chicago DSL Seminar November st, 006 Analysis

More information

Ioan Raicu Distributed Systems Laboratory Computer Science Department University of Chicago

Ioan Raicu Distributed Systems Laboratory Computer Science Department University of Chicago Falkon, a Fast and Light-weight task execution framework for Clusters, Grids, and Supercomputers Ioan Raicu Distributed Systems Laboratory Computer Science Department University of Chicago In Collaboration

More information

Falkon: a Fast and Light-weight task execution framework

Falkon: a Fast and Light-weight task execution framework : a Fast and Light-weight task execution framework Ioan Raicu *, Yong Zhao *, Catalin Dumitrescu *, Ian Foster #*+, Mike Wilde #+ * Department of Computer Science, University of Chicago, IL, USA + Computation

More information

Falkon: a Fast and Light-weight task execution framework

Falkon: a Fast and Light-weight task execution framework Falkon: a Fast and Light-weight task execution framework Ioan Raicu *, Yong Zhao *, Catalin Dumitrescu *, Ian Foster #*+, Mike Wilde #+ {iraicu,yongzh,catalind}@cs.uchicago.edu, {foster,wilde}@mcs.anl.gov

More information

NFS, GPFS, PVFS, Lustre Batch-scheduled systems: Clusters, Grids, and Supercomputers Programming paradigm: HPC, MTC, and HTC

NFS, GPFS, PVFS, Lustre Batch-scheduled systems: Clusters, Grids, and Supercomputers Programming paradigm: HPC, MTC, and HTC Segregated storage and compute NFS, GPFS, PVFS, Lustre Batch-scheduled systems: Clusters, Grids, and Supercomputers Programming paradigm: HPC, MTC, and HTC Co-located storage and compute HDFS, GFS Data

More information

Synonymous with supercomputing Tightly-coupled applications Implemented using Message Passing Interface (MPI) Large of amounts of computing for short

Synonymous with supercomputing Tightly-coupled applications Implemented using Message Passing Interface (MPI) Large of amounts of computing for short Synonymous with supercomputing Tightly-coupled applications Implemented using Message Passing Interface (MPI) Large of amounts of computing for short periods of time Usually requires low latency interconnects

More information

Typically applied in clusters and grids Loosely-coupled applications with sequential jobs Large amounts of computing for long periods of times

Typically applied in clusters and grids Loosely-coupled applications with sequential jobs Large amounts of computing for long periods of times Typically applied in clusters and grids Loosely-coupled applications with sequential jobs Large amounts of computing for long periods of times Measured in operations per month or years 2 Bridge the gap

More information

Harnessing Grid Resources to Enable the Dynamic Analysis of Large Astronomy Datasets

Harnessing Grid Resources to Enable the Dynamic Analysis of Large Astronomy Datasets Page 1 of 5 1 Year 1 Proposal Harnessing Grid Resources to Enable the Dynamic Analysis of Large Astronomy Datasets Year 1 Progress Report & Year 2 Proposal In order to setup the context for this progress

More information

Accelerating Large Scale Scientific Exploration through Data Diffusion

Accelerating Large Scale Scientific Exploration through Data Diffusion Accelerating Large Scale Scientific Exploration through Data Diffusion Ioan Raicu *, Yong Zhao *, Ian Foster #*+, Alex Szalay - {iraicu,yongzh }@cs.uchicago.edu, foster@mcs.anl.gov, szalay@jhu.edu * Department

More information

Ioan Raicu. Distributed Systems Laboratory Computer Science Department University of Chicago

Ioan Raicu. Distributed Systems Laboratory Computer Science Department University of Chicago The Quest for Scalable Support of Data Intensive Applications in Distributed Systems Ioan Raicu Distributed Systems Laboratory Computer Science Department University of Chicago In Collaboration with: Ian

More information

NFS, GPFS, PVFS, Lustre Batch-scheduled systems: Clusters, Grids, and Supercomputers Programming paradigm: HPC, MTC, and HTC

NFS, GPFS, PVFS, Lustre Batch-scheduled systems: Clusters, Grids, and Supercomputers Programming paradigm: HPC, MTC, and HTC Segregated storage and compute NFS, GPFS, PVFS, Lustre Batch-scheduled systems: Clusters, Grids, and Supercomputers Programming paradigm: HPC, MTC, and HTC Co-located storage and compute HDFS, GFS Data

More information

Storage and Compute Resource Management via DYRE, 3DcacheGrid, and CompuStore Ioan Raicu, Ian Foster

Storage and Compute Resource Management via DYRE, 3DcacheGrid, and CompuStore Ioan Raicu, Ian Foster Storage and Compute Resource Management via DYRE, 3DcacheGrid, and CompuStore Ioan Raicu, Ian Foster. Overview Both the industry and academia have an increase demand for good policies and mechanisms to

More information

Extreme-scale scripting: Opportunities for large taskparallel applications on petascale computers

Extreme-scale scripting: Opportunities for large taskparallel applications on petascale computers Extreme-scale scripting: Opportunities for large taskparallel applications on petascale computers Michael Wilde, Ioan Raicu, Allan Espinosa, Zhao Zhang, Ben Clifford, Mihael Hategan, Kamil Iskra, Pete

More information

Overview Past Work Future Work. Motivation Proposal. Work-in-Progress

Overview Past Work Future Work. Motivation Proposal. Work-in-Progress Overview Past Work Future Work Motivation Proposal Work-in-Progress 2 HPC: High-Performance Computing Synonymous with supercomputing Tightly-coupled applications Implemented using Message Passing Interface

More information

Case Studies in Storage Access by Loosely Coupled Petascale Applications

Case Studies in Storage Access by Loosely Coupled Petascale Applications Case Studies in Storage Access by Loosely Coupled Petascale Applications Justin M Wozniak and Michael Wilde Petascale Data Storage Workshop at SC 09 Portland, Oregon November 15, 2009 Outline Scripted

More information

Ioan Raicu. Everyone else. More information at: Background? What do you want to get out of this course?

Ioan Raicu. Everyone else. More information at: Background? What do you want to get out of this course? Ioan Raicu More information at: http://www.cs.iit.edu/~iraicu/ Everyone else Background? What do you want to get out of this course? 2 Data Intensive Computing is critical to advancing modern science Applies

More information

Introduction & Motivation Problem Statement Proposed Work Evaluation Conclusions Future Work

Introduction & Motivation Problem Statement Proposed Work Evaluation Conclusions Future Work Introduction & Motivation Problem Statement Proposed Work Evaluation Conclusions Future Work Introduction & Motivation Problem Statement Proposed Work Evaluation Conclusions Future Work Today (2014):

More information

Keywords: many-task computing; MTC; high-throughput computing; resource management; Falkon; Swift

Keywords: many-task computing; MTC; high-throughput computing; resource management; Falkon; Swift Editorial Manager(tm) for Cluster Computing Manuscript Draft Manuscript Number: Title: Middleware Support for Many-Task Computing Article Type: HPDC Special Issue Section/Category: Keywords: many-task

More information

MOHA: Many-Task Computing Framework on Hadoop

MOHA: Many-Task Computing Framework on Hadoop Apache: Big Data North America 2017 @ Miami MOHA: Many-Task Computing Framework on Hadoop Soonwook Hwang Korea Institute of Science and Technology Information May 18, 2017 Table of Contents Introduction

More information

DiPerF: automated DIstributed PERformance testing Framework

DiPerF: automated DIstributed PERformance testing Framework DiPerF: automated DIstributed PERformance testing Framework Catalin Dumitrescu, Ioan Raicu, Matei Ripeanu, Ian Foster Distributed Systems Laboratory Computer Science Department University of Chicago Introduction

More information

Data Diffusion: Dynamic Resource Provision and Data-Aware Scheduling for Data-Intensive Applications

Data Diffusion: Dynamic Resource Provision and Data-Aware Scheduling for Data-Intensive Applications Data Diffusion: Dynamic Resource Provision and Data-Aware Scheduling for Data-Intensive Applications Ioan Raicu, Yong Zhao 2, Ian Foster,3,4, Alex Szalay 5 iraicu@cs.uchicago.edu, yozha@microsoft.com,

More information

Going Places with Distributed Computing

Going Places with Distributed Computing Going Places with Distributed Computing ESC Grid Workshop 2007 Kate Keahey University of Chicago Argonne National Laboratory keahey@mcs.anl.gov There is more to Grid applications than running simple batch

More information

Workflow languages and systems

Workflow languages and systems Swift is a system for the rapid and reliable specification, execution, and management of large-scale science and engineering workflows. It supports applications that execute many tasks coupled by disk-resident

More information

Disk Cache-Aware Task Scheduling

Disk Cache-Aware Task Scheduling Disk ache-aware Task Scheduling For Data-Intensive and Many-Task Workflow Masahiro Tanaka and Osamu Tatebe University of Tsukuba, JST/REST Japan Science and Technology Agency IEEE luster 2014 2014-09-24

More information

Arguably one of the most fundamental discipline that touches all other disciplines and people

Arguably one of the most fundamental discipline that touches all other disciplines and people The scientific and mathematical approach in information technology and computing Started in the 1960s from Mathematics or Electrical Engineering Today: Arguably one of the most fundamental discipline that

More information

The Fusion Distributed File System

The Fusion Distributed File System Slide 1 / 44 The Fusion Distributed File System Dongfang Zhao February 2015 Slide 2 / 44 Outline Introduction FusionFS System Architecture Metadata Management Data Movement Implementation Details Unique

More information

Problems for Resource Brokering in Large and Dynamic Grid Environments

Problems for Resource Brokering in Large and Dynamic Grid Environments Problems for Resource Brokering in Large and Dynamic Grid Environments Cătălin L. Dumitrescu Computer Science Department The University of Chicago cldumitr@cs.uchicago.edu (currently at TU Delft) Kindly

More information

Extreme scripting and other adventures in data-intensive computing

Extreme scripting and other adventures in data-intensive computing Extreme scripting and other adventures in data-intensive computing Ian Foster Allan Espinosa, Ioan Raicu, Mike Wilde, Zhao Zhang Computation Institute Argonne National Lab & University of Chicago How data

More information

Towards a provenance-aware distributed filesystem

Towards a provenance-aware distributed filesystem Towards a provenance-aware distributed filesystem Chen Shou, Dongfang Zhao, Tanu Malik, Ioan Raicu {cshou, dzhao8}@hawk.iit.edu, tanum@ci.uchicago.edu, iraicu@cs.iit.edu Department of Computer Science,

More information

Revealing Applications Access Pattern in Collective I/O for Cache Management

Revealing Applications Access Pattern in Collective I/O for Cache Management Revealing Applications Access Pattern in for Yin Lu 1, Yong Chen 1, Rob Latham 2 and Yu Zhuang 1 Presented by Philip Roth 3 1 Department of Computer Science Texas Tech University 2 Mathematics and Computer

More information

High Performance Computing Course Notes Grid Computing I

High Performance Computing Course Notes Grid Computing I High Performance Computing Course Notes 2008-2009 2009 Grid Computing I Resource Demands Even as computer power, data storage, and communication continue to improve exponentially, resource capacities are

More information

DiPerF: automated DIstributed PERformance testing Framework

DiPerF: automated DIstributed PERformance testing Framework DiPerF: automated DIstributed PERformance testing Framework Ioan Raicu, Catalin Dumitrescu, Matei Ripeanu, Ian Foster Distributed Systems Laboratory Computer Science Department University of Chicago Introduction

More information

Towards Data Intensive Many-Task Computing

Towards Data Intensive Many-Task Computing Towards Data Intensive Many-Task Computing Ioan Raicu 1, Ian Foster 1,2,3, Yong Zhao 4, Alex Szalay 5 Philip Little 6, Christopher M. Moretti 6, Amitabh Chaudhary 6, Douglas Thain 6 iraicu@cs.uchicago.edu,

More information

Data Management in Parallel Scripting

Data Management in Parallel Scripting Data Management in Parallel Scripting Zhao Zhang 11/11/2012 Problem Statement Definition: MTC applications are those applications in which existing sequential or parallel programs are linked by files output

More information

The Quest for Scalable Support of Data-Intensive Workloads in Distributed Systems

The Quest for Scalable Support of Data-Intensive Workloads in Distributed Systems The Quest for Scalable Support of Data-Intensive Workloads in Distributed Systems Ioan Raicu, 1 Ian T. Foster, 1,2,3 Yong Zhao 4 Philip Little, 5 Christopher M. Moretti, 5 Amitabh Chaudhary, 5 Douglas

More information

INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET)

INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET) INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET) International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-6367(Print), ISSN 0976 6367(Print) ISSN 0976 6375(Online)

More information

Grid Scheduling Architectures with Globus

Grid Scheduling Architectures with Globus Grid Scheduling Architectures with Workshop on Scheduling WS 07 Cetraro, Italy July 28, 2007 Ignacio Martin Llorente Distributed Systems Architecture Group Universidad Complutense de Madrid 1/38 Contents

More information

Ian Foster, An Overview of Distributed Systems

Ian Foster, An Overview of Distributed Systems The advent of computation can be compared, in terms of the breadth and depth of its impact on research and scholarship, to the invention of writing and the development of modern mathematics. Ian Foster,

More information

EFFICIENT ALLOCATION OF DYNAMIC RESOURCES IN A CLOUD

EFFICIENT ALLOCATION OF DYNAMIC RESOURCES IN A CLOUD EFFICIENT ALLOCATION OF DYNAMIC RESOURCES IN A CLOUD S.THIRUNAVUKKARASU 1, DR.K.P.KALIYAMURTHIE 2 Assistant Professor, Dept of IT, Bharath University, Chennai-73 1 Professor& Head, Dept of IT, Bharath

More information

Clouds: An Opportunity for Scientific Applications?

Clouds: An Opportunity for Scientific Applications? Clouds: An Opportunity for Scientific Applications? Ewa Deelman USC Information Sciences Institute Acknowledgements Yang-Suk Ki (former PostDoc, USC) Gurmeet Singh (former Ph.D. student, USC) Gideon Juve

More information

Introduction to FREE National Resources for Scientific Computing. Dana Brunson. Jeff Pummill

Introduction to FREE National Resources for Scientific Computing. Dana Brunson. Jeff Pummill Introduction to FREE National Resources for Scientific Computing Dana Brunson Oklahoma State University High Performance Computing Center Jeff Pummill University of Arkansas High Peformance Computing Center

More information

IME (Infinite Memory Engine) Extreme Application Acceleration & Highly Efficient I/O Provisioning

IME (Infinite Memory Engine) Extreme Application Acceleration & Highly Efficient I/O Provisioning IME (Infinite Memory Engine) Extreme Application Acceleration & Highly Efficient I/O Provisioning September 22 nd 2015 Tommaso Cecchi 2 What is IME? This breakthrough, software defined storage application

More information

By Ian Foster. Zhifeng Yun

By Ian Foster. Zhifeng Yun By Ian Foster Zhifeng Yun Outline Introduction Globus Architecture Globus Software Details Dev.Globus Community Summary Future Readings Introduction Globus Toolkit v4 is the work of many Globus Alliance

More information

PERFORMANCE ANALYSIS AND OPTIMIZATION OF MULTI-CLOUD COMPUITNG FOR LOOSLY COUPLED MTC APPLICATIONS

PERFORMANCE ANALYSIS AND OPTIMIZATION OF MULTI-CLOUD COMPUITNG FOR LOOSLY COUPLED MTC APPLICATIONS PERFORMANCE ANALYSIS AND OPTIMIZATION OF MULTI-CLOUD COMPUITNG FOR LOOSLY COUPLED MTC APPLICATIONS V. Prasathkumar, P. Jeevitha Assiatant Professor, Department of Information Technology Sri Shakthi Institute

More information

Magellan Project. Jeff Broughton NERSC Systems Department Head October 7, 2009

Magellan Project. Jeff Broughton NERSC Systems Department Head October 7, 2009 Magellan Project Jeff Broughton NERSC Systems Department Head October 7, 2009 1 Magellan Background National Energy Research Scientific Computing Center (NERSC) Argonne Leadership Computing Facility (ALCF)

More information

Distributed NoSQL Storage for Extreme-scale System Services

Distributed NoSQL Storage for Extreme-scale System Services Distributed NoSQL Storage for Extreme-scale System Services Short Bio 6 th year PhD student from DataSys Lab, CS Department, Illinois Institute oftechnology,chicago Academic advisor: Dr. Ioan Raicu Research

More information

SDSS Dataset and SkyServer Workloads

SDSS Dataset and SkyServer Workloads SDSS Dataset and SkyServer Workloads Overview Understanding the SDSS dataset composition and typical usage patterns is important for identifying strategies to optimize the performance of the AstroPortal

More information

Assistant Professor at Illinois Institute of Technology (CS) Director of the Data-Intensive Distributed Systems Laboratory (DataSys)

Assistant Professor at Illinois Institute of Technology (CS) Director of the Data-Intensive Distributed Systems Laboratory (DataSys) Current position: Assistant Professor at Illinois Institute of Technology (CS) Director of the Data-Intensive Distributed Systems Laboratory (DataSys) Guest Research Faculty, Argonne National Laboratory

More information

Experiences in Optimizing a $250K Cluster for High- Performance Computing Applications

Experiences in Optimizing a $250K Cluster for High- Performance Computing Applications Experiences in Optimizing a $250K Cluster for High- Performance Computing Applications Kevin Brandstatter Dan Gordon Jason DiBabbo Ben Walters Alex Ballmer Lauren Ribordy Ioan Raicu Illinois Institute

More information

Assistant Professor at Illinois Institute of Technology (CS) Director of the Data-Intensive Distributed Systems Laboratory (DataSys)

Assistant Professor at Illinois Institute of Technology (CS) Director of the Data-Intensive Distributed Systems Laboratory (DataSys) Current position: Assistant Professor at Illinois Institute of Technology (CS) Director of the Data-Intensive Distributed Systems Laboratory (DataSys) Guest Research Faculty, Argonne National Laboratory

More information

Dynamic Resource Allocation Using Nephele Framework in Cloud

Dynamic Resource Allocation Using Nephele Framework in Cloud Dynamic Resource Allocation Using Nephele Framework in Cloud Anand Prabu P 1, Dhanasekar P 2, Sairamprabhu S G 3 Department of Computer Science and Engineering, RVS College of Engineering and Technology,

More information

Gateways to Discovery: Cyberinfrastructure for the Long Tail of Science

Gateways to Discovery: Cyberinfrastructure for the Long Tail of Science Gateways to Discovery: Cyberinfrastructure for the Long Tail of Science ECSS Symposium, 12/16/14 M. L. Norman, R. L. Moore, D. Baxter, G. Fox (Indiana U), A Majumdar, P Papadopoulos, W Pfeiffer, R. S.

More information

Introduction to Grid Computing

Introduction to Grid Computing Milestone 2 Include the names of the papers You only have a page be selective about what you include Be specific; summarize the authors contributions, not just what the paper is about. You might be able

More information

Storage Virtualization. Eric Yen Academia Sinica Grid Computing Centre (ASGC) Taiwan

Storage Virtualization. Eric Yen Academia Sinica Grid Computing Centre (ASGC) Taiwan Storage Virtualization Eric Yen Academia Sinica Grid Computing Centre (ASGC) Taiwan Storage Virtualization In computer science, storage virtualization uses virtualization to enable better functionality

More information

Accelerated solution of a moral hazard problem with Swift

Accelerated solution of a moral hazard problem with Swift Accelerated solution of a moral hazard problem with Swift Tiberiu Stef-Praun 1, Gabriel A. Madeira 2, Ian Foster 1, Robert Townsend 2 1 Computation Institute, University of Chicago, USA 2 Economics Department,

More information

Leveraging Software-Defined Storage to Meet Today and Tomorrow s Infrastructure Demands

Leveraging Software-Defined Storage to Meet Today and Tomorrow s Infrastructure Demands Leveraging Software-Defined Storage to Meet Today and Tomorrow s Infrastructure Demands Unleash Your Data Center s Hidden Power September 16, 2014 Molly Rector CMO, EVP Product Management & WW Marketing

More information

Energy Prediction for I/O Intensive Workflow Applications

Energy Prediction for I/O Intensive Workflow Applications Energy Prediction for I/O Intensive Workflow Applications ABSTRACT As workflow-based data-intensive applications have become increasingly popular, the lack of support tools to aid resource provisioning

More information

Introduction & Motivation Problem Statement Proposed Work Evaluation Conclusions Future Work

Introduction & Motivation Problem Statement Proposed Work Evaluation Conclusions Future Work Introduction & Motivation Problem Statement Proposed Work Evaluation Conclusions Future Work Introduction & Motivation Problem Statement Proposed Work Evaluation Conclusions Future Work Today (2014):

More information

Toward Scalable Monitoring on Large-Scale Storage for Software Defined Cyberinfrastructure

Toward Scalable Monitoring on Large-Scale Storage for Software Defined Cyberinfrastructure Toward Scalable Monitoring on Large-Scale Storage for Software Defined Cyberinfrastructure Arnab K. Paul, Ryan Chard, Kyle Chard, Steven Tuecke, Ali R. Butt, Ian Foster Virginia Tech, Argonne National

More information

Grid Programming: Concepts and Challenges. Michael Rokitka CSE510B 10/2007

Grid Programming: Concepts and Challenges. Michael Rokitka CSE510B 10/2007 Grid Programming: Concepts and Challenges Michael Rokitka SUNY@Buffalo CSE510B 10/2007 Issues Due to Heterogeneous Hardware level Environment Different architectures, chipsets, execution speeds Software

More information

Chapter 4:- Introduction to Grid and its Evolution. Prepared By:- NITIN PANDYA Assistant Professor SVBIT.

Chapter 4:- Introduction to Grid and its Evolution. Prepared By:- NITIN PANDYA Assistant Professor SVBIT. Chapter 4:- Introduction to Grid and its Evolution Prepared By:- Assistant Professor SVBIT. Overview Background: What is the Grid? Related technologies Grid applications Communities Grid Tools Case Studies

More information

MATE-EC2: A Middleware for Processing Data with Amazon Web Services

MATE-EC2: A Middleware for Processing Data with Amazon Web Services MATE-EC2: A Middleware for Processing Data with Amazon Web Services Tekin Bicer David Chiu* and Gagan Agrawal Department of Compute Science and Engineering Ohio State University * School of Engineering

More information

Corral: A Glide-in Based Service for Resource Provisioning

Corral: A Glide-in Based Service for Resource Provisioning : A Glide-in Based Service for Resource Provisioning Gideon Juve USC Information Sciences Institute juve@usc.edu Outline Throughput Applications Grid Computing Multi-level scheduling and Glideins Example:

More information

Scientific data processing at global scale The LHC Computing Grid. fabio hernandez

Scientific data processing at global scale The LHC Computing Grid. fabio hernandez Scientific data processing at global scale The LHC Computing Grid Chengdu (China), July 5th 2011 Who I am 2 Computing science background Working in the field of computing for high-energy physics since

More information

Technology Insight Series

Technology Insight Series IBM ProtecTIER Deduplication for z/os John Webster March 04, 2010 Technology Insight Series Evaluator Group Copyright 2010 Evaluator Group, Inc. All rights reserved. Announcement Summary The many data

More information

Introduction to GT3. Introduction to GT3. What is a Grid? A Story of Evolution. The Globus Project

Introduction to GT3. Introduction to GT3. What is a Grid? A Story of Evolution. The Globus Project Introduction to GT3 The Globus Project Argonne National Laboratory USC Information Sciences Institute Copyright (C) 2003 University of Chicago and The University of Southern California. All Rights Reserved.

More information

A GridFTP Transport Driver for Globus XIO

A GridFTP Transport Driver for Globus XIO A GridFTP Transport Driver for Globus XIO Rajkumar Kettimuthu 1,2, Liu Wantao 3,4, Joseph Link 5, and John Bresnahan 1,2,3 1 Mathematics and Computer Science Division, Argonne National Laboratory, Argonne,

More information

Design and Evaluation of a Collective IO Model for Loosely Coupled Petascale Programming

Design and Evaluation of a Collective IO Model for Loosely Coupled Petascale Programming Design and Evaluation of a Collective IO Model for Loosely Coupled Petascale Programming Zhao Zhang +, Allan Espinosa *, Kamil Iskra #, Ioan Raicu *, Ian Foster #*+, Michael Wilde #+ + Computation Institute,

More information

ParaMEDIC: Parallel Metadata Environment for Distributed I/O and Computing

ParaMEDIC: Parallel Metadata Environment for Distributed I/O and Computing ParaMEDIC: Parallel Metadata Environment for Distributed I/O and Computing Prof. Wu FENG Department of Computer Science Virginia Tech Work smarter not harder Overview Grand Challenge A large-scale biological

More information

ALICE Grid Activities in US

ALICE Grid Activities in US ALICE Grid Activities in US 1 ALICE-USA Computing Project ALICE-USA Collaboration formed to focus on the ALICE EMCal project Construction, installation, testing and integration participating institutions

More information

IBM Spectrum Scale IO performance

IBM Spectrum Scale IO performance IBM Spectrum Scale 5.0.0 IO performance Silverton Consulting, Inc. StorInt Briefing 2 Introduction High-performance computing (HPC) and scientific computing are in a constant state of transition. Artificial

More information

TERAGRID 2007 CONFERENCE, MADISON, WI 1. GridFTP Pipelining

TERAGRID 2007 CONFERENCE, MADISON, WI 1. GridFTP Pipelining TERAGRID 2007 CONFERENCE, MADISON, WI 1 GridFTP Pipelining John Bresnahan, 1,2,3 Michael Link, 1,2 Rajkumar Kettimuthu, 1,2 Dan Fraser, 1,2 Ian Foster 1,2,3 1 Mathematics and Computer Science Division

More information

Globus Toolkit 4 Execution Management. Alexandra Jimborean International School of Informatics Hagenberg, 2009

Globus Toolkit 4 Execution Management. Alexandra Jimborean International School of Informatics Hagenberg, 2009 Globus Toolkit 4 Execution Management Alexandra Jimborean International School of Informatics Hagenberg, 2009 2 Agenda of the day Introduction to Globus Toolkit and GRAM Zoom In WS GRAM Usage Guide Architecture

More information

XSEDE s Campus Bridging Project Jim Ferguson National Institute for Computational Sciences

XSEDE s Campus Bridging Project Jim Ferguson National Institute for Computational Sciences January 3, 2016 XSEDE s Campus Bridging Project Jim Ferguson National Institute for Computational Sciences jwf@utk.edu What is XSEDE? extreme Science and Engineering Discovery Environment $121M project

More information

Stork: State of the Art

Stork: State of the Art Stork: State of the Art Tevfik Kosar Computer Sciences Department University of Wisconsin-Madison kosart@cs.wisc.edu http://www.cs.wisc.edu/condor/stork 1 The Imminent Data deluge Exponential growth of

More information

Professor: Ioan Raicu. TA: Wei Tang. Everyone else

Professor: Ioan Raicu. TA: Wei Tang. Everyone else Professor: Ioan Raicu http://www.cs.iit.edu/~iraicu/ http://datasys.cs.iit.edu/ TA: Wei Tang http://mypages.iit.edu/~wtang6/ Everyone else Background? What do you want to get out of this course? 2 General

More information

The Swi( parallel scrip/ng language for Science Clouds and other parallel resources

The Swi( parallel scrip/ng language for Science Clouds and other parallel resources The Swi( parallel scrip/ng language for Science Clouds and other parallel resources Michael Wilde Computa1on Ins1tute, University of Chicago and Argonne Na1onal Laboratory wilde@mcs.anl.gov Revised 2012.0229

More information

Advances of parallel computing. Kirill Bogachev May 2016

Advances of parallel computing. Kirill Bogachev May 2016 Advances of parallel computing Kirill Bogachev May 2016 Demands in Simulations Field development relies more and more on static and dynamic modeling of the reservoirs that has come a long way from being

More information

Data publication and discovery with Globus

Data publication and discovery with Globus Data publication and discovery with Globus Questions and comments to outreach@globus.org The Globus data publication and discovery services make it easy for institutions and projects to establish collections,

More information

Today (2010): Multicore Computing 80. Near future (~2018): Manycore Computing Number of Cores Processing

Today (2010): Multicore Computing 80. Near future (~2018): Manycore Computing Number of Cores Processing Number of Cores Manufacturing Process 300 250 200 150 100 50 0 2004 2006 2008 2010 2012 2014 2016 2018 100 Today (2010): Multicore Computing 80 1~12 cores commodity architectures 70 60 80 cores proprietary

More information

Isilon: Raising The Bar On Performance & Archive Use Cases. John Har Solutions Product Manager Unstructured Data Storage Team

Isilon: Raising The Bar On Performance & Archive Use Cases. John Har Solutions Product Manager Unstructured Data Storage Team Isilon: Raising The Bar On Performance & Archive Use Cases John Har Solutions Product Manager Unstructured Data Storage Team What we ll cover in this session Isilon Overview Streaming workflows High ops/s

More information

Tuning I/O Performance for Data Intensive Computing. Nicholas J. Wright. lbl.gov

Tuning I/O Performance for Data Intensive Computing. Nicholas J. Wright. lbl.gov Tuning I/O Performance for Data Intensive Computing. Nicholas J. Wright njwright @ lbl.gov NERSC- National Energy Research Scientific Computing Center Mission: Accelerate the pace of scientific discovery

More information

Task-based distributed processing for radio-interferometric imaging with CASA

Task-based distributed processing for radio-interferometric imaging with CASA H2020-Astronomy ESFRI and Research Infrastructure Cluster (Grant Agreement number: 653477). Task-based distributed processing for radio-interferometric imaging with CASA BOJAN NIKOLIC 2 nd ASTERICS-OBELICS

More information

Usage of LDAP in Globus

Usage of LDAP in Globus Usage of LDAP in Globus Gregor von Laszewski and Ian Foster Mathematics and Computer Science Division Argonne National Laboratory, Argonne, IL 60439 gregor@mcs.anl.gov Abstract: This short note describes

More information

A Survey Paper on Grid Information Systems

A Survey Paper on Grid Information Systems B 534 DISTRIBUTED SYSTEMS A Survey Paper on Grid Information Systems Anand Hegde 800 North Smith Road Bloomington Indiana 47408 aghegde@indiana.edu ABSTRACT Grid computing combines computers from various

More information

Data Movement & Storage Using the Data Capacitor Filesystem

Data Movement & Storage Using the Data Capacitor Filesystem Data Movement & Storage Using the Data Capacitor Filesystem Justin Miller jupmille@indiana.edu http://pti.iu.edu/dc Big Data for Science Workshop July 2010 Challenges for DISC Keynote by Alex Szalay identified

More information

GPFS for Life Sciences at NERSC

GPFS for Life Sciences at NERSC GPFS for Life Sciences at NERSC A NERSC & JGI collaborative effort Jason Hick, Rei Lee, Ravi Cheema, and Kjiersten Fagnan GPFS User Group meeting May 20, 2015-1 - Overview of Bioinformatics - 2 - A High-level

More information

Databases in the Cloud

Databases in the Cloud Databases in the Cloud Ani Thakar Alex Szalay Nolan Li Center for Astrophysical Sciences and Institute for Data Intensive Engineering and Science (IDIES) The Johns Hopkins University Cloudy with a chance

More information

Managing large-scale workflows with Pegasus

Managing large-scale workflows with Pegasus Funded by the National Science Foundation under the OCI SDCI program, grant #0722019 Managing large-scale workflows with Pegasus Karan Vahi ( vahi@isi.edu) Collaborative Computing Group USC Information

More information

pre-ws MDS and WS-MDS Performance Study Ioan Raicu 01/05/06 Page 1 of 17 pre-ws MDS and WS-MDS Performance Study Ioan Raicu 01/05/06

pre-ws MDS and WS-MDS Performance Study Ioan Raicu 01/05/06 Page 1 of 17 pre-ws MDS and WS-MDS Performance Study Ioan Raicu 01/05/06 Ioan Raicu 0/05/06 Page of 7 pre-ws MDS and WS-MDS Performance Study Ioan Raicu 0/05/06 Table of Contents Table of Contents... Table of Figures... Abstract... 3 Contributing People... 3 2 Testbeds and

More information

Pegasus Workflow Management System. Gideon Juve. USC Informa3on Sciences Ins3tute

Pegasus Workflow Management System. Gideon Juve. USC Informa3on Sciences Ins3tute Pegasus Workflow Management System Gideon Juve USC Informa3on Sciences Ins3tute Scientific Workflows Orchestrate complex, multi-stage scientific computations Often expressed as directed acyclic graphs

More information

Executing dynamic heterogeneous workloads on Blue Waters with RADICAL-Pilot

Executing dynamic heterogeneous workloads on Blue Waters with RADICAL-Pilot Executing dynamic heterogeneous workloads on Blue Waters with RADICAL-Pilot Research in Advanced DIstributed Cyberinfrastructure & Applications Laboratory (RADICAL) Rutgers University http://radical.rutgers.edu

More information

Write a technical report Present your results Write a workshop/conference paper (optional) Could be a real system, simulation and/or theoretical

Write a technical report Present your results Write a workshop/conference paper (optional) Could be a real system, simulation and/or theoretical Identify a problem Review approaches to the problem Propose a novel approach to the problem Define, design, prototype an implementation to evaluate your approach Could be a real system, simulation and/or

More information

Problems for Resource Brokering in Large and Dynamic Grid Environments*

Problems for Resource Brokering in Large and Dynamic Grid Environments* Problems for Resource Brokering in Large and Dynamic Grid Environments* Catalin L. Dumitrescu CoreGRID Institute on Resource Management and Scheduling Electrical Eng., Math. and Computer Science, Delft

More information

A Performance Evaluation of WS-MDS in the Globus Toolkit

A Performance Evaluation of WS-MDS in the Globus Toolkit A Performance Evaluation of WS-MDS in the Globus Toolkit Ioan Raicu * Catalin Dumitrescu * Ian Foster +* * Computer Science Department The University of Chicago {iraicu,cldumitr}@cs.uchicago.edu Abstract

More information

Outline. March 5, 2012 CIRMMT - McGill University 2

Outline. March 5, 2012 CIRMMT - McGill University 2 Outline CLUMEQ, Calcul Quebec and Compute Canada Research Support Objectives and Focal Points CLUMEQ Site at McGill ETS Key Specifications and Status CLUMEQ HPC Support Staff at McGill Getting Started

More information

Transparent Throughput Elas0city for IaaS Cloud Storage Using Guest- Side Block- Level Caching

Transparent Throughput Elas0city for IaaS Cloud Storage Using Guest- Side Block- Level Caching Transparent Throughput Elas0city for IaaS Cloud Storage Using Guest- Side Block- Level Caching Bogdan Nicolae (IBM Research, Ireland) Pierre Riteau (University of Chicago, USA) Kate Keahey (Argonne National

More information

Optimizing Load Balancing and Data-Locality with Data-aware Scheduling

Optimizing Load Balancing and Data-Locality with Data-aware Scheduling Optimizing Load Balancing and Data-Locality with Data-aware Scheduling Ke Wang *, Xiaobing Zhou, Tonglin Li *, Dongfang Zhao *, Michael Lang, Ioan Raicu * * Illinois Institute of Technology, Hortonworks

More information