Ioan Raicu, Distributed Systems Laboratory, Computer Science Department, University of Chicago


1 The Quest for Scalable Support of Data Intensive Applications in Distributed Systems. Ioan Raicu, Distributed Systems Laboratory, Computer Science Department, University of Chicago. In collaboration with: Ian Foster, University of Chicago and Argonne National Laboratory; Alex Szalay, The Johns Hopkins University; Yong Zhao, Microsoft Corporation; Philip Little, Christopher Moretti, Amitabh Chaudhary, Douglas Thain, University of Notre Dame. IEEE/ACM Supercomputing 2008, Argonne National Laboratory Booth, November 20th, 2008

2 Many-Core Growth Rates
[Figure: projected cores per chip vs. process size, from 90nm down through 65nm, 45nm, 32nm, 22nm, 16nm, 11nm, and 8nm]
Increasing attention to parallel chips
Many plans for cores with in-order execution
On-chip shared memory; far faster to access on-chip memory than DRAM
Interesting challenges in synchronization (e.g. locking)
Inexpensive, low-power cores; amazing amounts of computing very cheap
Slower (or same) sequential speed!
Pat Helland, Microsoft, The Irresistible Forces Meet the Movable Objects, November 9th, 2007, Slide 2

3 What will we do with 1+ Exaflops and 1M+ cores?

4 Storage Resource Scalability
IBM BlueGene/P, GPFS vs. LOCAL:
Read throughput: 1 node: 0.48Gb/s vs. 1.03Gb/s (2.15x); 160 nodes: 3.4Gb/s vs. 165Gb/s (48x); per CPU: 10Mb/s vs. 550Mb/s
Read+write throughput: 1 node: 0.2Gb/s vs. 0.39Gb/s (1.95x); 160 nodes: 1.12Gb/s vs. 62Gb/s (55x)
Metadata (mkdir / rm -rf): 1 node: 151/sec vs. 199/sec (1.3x); 160 nodes: 2/sec vs. 384/sec (56x)
At 160K CPU cores, GPFS sustains 8GB/s I/O rates (16 servers)
Experiments on 16K CPU BG/P achieved 0.3MB/s per CPU core; experiments on 5.7K CPU SiCortex achieved 0.6MB/s per CPU core
[Figure: GPFS vs. LOCAL read and read+write throughput (Mb/s) vs. number of nodes]

5 Programming Model Issues
Multicore processors
Massive task parallelism
Massive data parallelism
Integrating black box applications
Complex task dependencies (task graphs)
Failure, and other execution management issues
Dynamic task graphs
Documenting provenance of data products
Data management: input, intermediate, output
Dynamic data access involving large amounts of data


7 Problem Types
[Figure: application classes mapped by input data size (Low, Med, Hi) vs. number of tasks (1 to 1K to 1M): Heroic MPI Tasks (few tasks); Data Analysis, Mining (larger data); Many Loosely Coupled Apps (many tasks, low data); Big Data and Many Tasks (large data, many tasks)]

8 An Incomplete and Simplistic View of Programming Models and Tools

9 MTC: Many Task Computing
Loosely coupled applications: high-performance computations comprising multiple distinct activities, coupled via file system operations or message passing
Emphasis on using many resources over short time periods
Tasks can be: small or large, independent or dependent, uniprocessor or multiprocessor, compute-intensive or data-intensive, static or dynamic, homogeneous or heterogeneous, loosely or tightly coupled
Involves a large number of tasks, a large quantity of computing, and large volumes of data

10 Motivating Example: AstroPortal Stacking Service
Purpose: on-demand stacks of random locations within a ~1TB dataset
Challenge: processing costs of O(100ms) per object; data intensive (40MB : 1 sec); rapid access to 10-10K random files; time-varying load
[Figure: AstroPortal architecture, operating over Sloan (SDSS) data]

11 Hypothesis
Significant performance improvements can be obtained in the analysis of large datasets by leveraging information about data analysis workloads rather than individual data analysis tasks.
Important concepts related to the hypothesis:
Workload: a complex query (or set of queries) decomposable into simpler tasks to answer broader analysis questions
Data locality is crucial to the efficient use of large-scale distributed systems for scientific and data-intensive applications
Allocate computational and caching storage resources, co-scheduled to optimize workload performance

12 Abstract Model
AMDASK: An Abstract Model for DAta-centric task farms
Task farm: a common parallel pattern that drives independent computational tasks
Models the efficiency of data analysis workloads for the MTC class of applications
Captures the following data diffusion properties:
Resources are acquired in response to demand
Data and applications diffuse from archival storage to new resources
Resource caching allows faster responses to subsequent requests
Resources are released when demand drops
Considers both data and computations to optimize performance

13 AMDASK: Base Definitions
Data stores: persistent & transient; store capacity, load, ideal bandwidth, available bandwidth
Data objects: data object size, data object's storage location(s), copy time
Transient resources: compute speed, resource state
Task: application, input/output data

14 AMDASK: Execution Model Concepts
Dispatch policy: next-available, first-available, max-compute-util, max-cache-hit
Caching policy: random, FIFO, LRU, LFU
Replay policy
Data fetch policy: just-in-time, spatial locality
Resource acquisition policy: one-at-a-time, additive, exponential, all-at-once, optimal
Resource release policy: distributed, centralized

15 AMDASK: Performance Efficiency Model
Notation: $K$: stream of tasks; $\mu(\kappa)$: execution time of task $\kappa$; $o(\kappa)$: dispatch overhead of task $\kappa$; $\zeta(\delta,\tau)$: time to get data object $\delta$ to transient resource $\tau$; $\phi(\tau)$: contents of $\tau$'s cache; $\Omega$: persistent store; $A$: arrival rate of tasks; $T$: set of transient resources.
B, average task execution time: $B = \frac{1}{|K|} \sum_{\kappa \in K} \mu(\kappa)$
Y, average task execution time with overheads: $Y = \frac{1}{|K|} \sum_{\kappa \in K} [\mu(\kappa) + o(\kappa)]$ if $\delta \in \phi(\tau)$, and $Y = \frac{1}{|K|} \sum_{\kappa \in K} [\mu(\kappa) + o(\kappa) + \zeta(\delta,\tau)]$ if $\delta \in \Omega$
V, workload execution time: $V = \max\left(\frac{B \cdot |K|}{|T|}, \frac{|K|}{A}\right)$
W, workload execution time with overheads: $W = \max\left(\frac{Y \cdot |K|}{|T|}, \frac{|K|}{A}\right)$

16 AMDASK: Performance Efficiency Model
Efficiency: $E = \frac{V}{W}$, which expands to
$E = \begin{cases} B/Y, & |T| \le A \cdot B \\ \frac{|T|}{A \cdot Y}, & A \cdot B < |T| \le A \cdot Y \\ 1, & |T| > A \cdot Y \end{cases}$
Speedup: $S = E \cdot |T|$
Optimizing efficiency: it is easy to maximize either efficiency or speedup independently, but harder to maximize both at the same time; the goal is to find the smallest number of transient resources $|T|$ while maximizing speedup*efficiency.
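As a quick illustration of how the piecewise efficiency formula behaves, the sketch below evaluates it for a few resource counts. The numeric values are invented for illustration only; they are not from the experiments in this talk.

```python
# Illustrative evaluation of the AMDASK efficiency/speedup model above.
def efficiency(B, Y, T, A):
    # B: avg task time, Y: avg task time with overheads (Y >= B)
    # T: number of transient resources, A: task arrival rate
    if T <= A * B:
        return B / Y          # resource-bound: overheads dominate
    if T <= A * Y:
        return T / (A * Y)    # transition region
    return 1.0                # arrival-limited: overheads fully hidden

B, Y, A = 1.0, 1.5, 100.0     # hypothetical: 1s tasks, 0.5s overhead, 100 tasks/s
for T in (50, 120, 200):
    E = efficiency(B, Y, T, A)
    print(f"T={T:3d}  E={E:.2f}  speedup={E * T:.1f}")
```

With these values, adding resources past the arrival limit (T > 150) brings efficiency to 1, matching the slide's point that speedup and efficiency must be balanced rather than maximized independently.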

17 Model Validation
Stacking service (large-scale astronomy application)
92 experiments, 558K files
Compressed: 2MB each, 1.1TB total; uncompressed: 6MB each, 3.3TB total
[Figure: model error (0-30%) vs. number of CPUs, and model error vs. data locality, for GPFS (GZ, FIT) and Data Diffusion (GZ, FIT) at localities 1.38 and up]

18 Falkon: a Fast and Light-weight task execution framework
Goal: enable the rapid and efficient execution of many independent jobs on large compute clusters
Combines three components:
a streamlined task dispatcher
resource provisioning through multi-level scheduling techniques
data diffusion and data-aware scheduling to leverage the co-located computational and storage resources
Integration into Swift to leverage many applications
Applications cover many domains: astronomy, astrophysics, medicine, chemistry, economics, climate modeling, etc.


20 Falkon Overview
[Figure: Falkon architecture overview]

21 Data Diffusion
Resources acquired in response to demand
Data and applications diffuse from archival storage to newly acquired resources
Resource caching allows faster responses to subsequent requests; cache eviction strategies: RANDOM, FIFO, LRU, LFU (see the sketch below)
Resources are released when demand drops
[Figure: data diffusion architecture: a task dispatcher with a data-aware scheduler managing provisioned resources, drawing on idle resources and persistent storage on a shared file system]
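Of the four eviction strategies named above, LRU is the simplest to sketch. The following is a minimal illustration of the idea, not Falkon's actual cache code; the class and its fields are hypothetical.

```python
# Minimal LRU cache sketch (one of the four strategies named above:
# RANDOM, FIFO, LRU, LFU). Illustrative only, not Falkon's implementation.
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.objects = OrderedDict()  # object id -> size, oldest first

    def access(self, obj_id, size):
        """Return True on a cache hit; on a miss, evict LRU objects to fit."""
        if obj_id in self.objects:
            self.objects.move_to_end(obj_id)  # refresh recency on a hit
            return True
        while self.used + size > self.capacity and self.objects:
            _, evicted_size = self.objects.popitem(last=False)  # evict LRU victim
            self.used -= evicted_size
        self.objects[obj_id] = size  # stage the object into the local cache
        self.used += size
        return False
```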

22 Data Diffusion
Considers both data and computations to optimize performance
Supports data-aware scheduling: can optimize compute utilization, cache hit performance, or a mixture of the two
Decreases dependency on a shared file system: theoretical linear scalability with compute resources; significantly increases metadata creation and/or modification performance
Central to the data-centric task farm realization

23 Scheduling Policies
first-available: simple load balancing
max-cache-hit: maximize cache hits
max-compute-util: maximize processor utilization
good-cache-compute: maximize both cache hit rate and processor utilization at the same time (see the sketch below)
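A minimal sketch of how such a hybrid decision might look. This illustrates the idea only and is not Falkon's data-aware scheduler; all names here (dispatch, max_queue_depth, the worker attributes) are hypothetical.

```python
# Sketch of a data-aware dispatch decision in the spirit of good-cache-compute:
# prefer a worker that already caches the task's input, but fall back to the
# least-loaded worker once cache-local queues grow too deep.
def dispatch(task, workers, max_queue_depth=4):
    # workers: objects with .cache (set of file ids) and .queue (list of tasks)
    cached = [w for w in workers
              if task.input_file in w.cache and len(w.queue) < max_queue_depth]
    if cached:
        best = min(cached, key=lambda w: len(w.queue))  # max-cache-hit side
    else:
        best = min(workers, key=lambda w: len(w.queue))  # max-compute-util side
    best.queue.append(task)
    return best
```

The max_queue_depth knob trades cache hits against processor utilization: a depth of 0 degenerates to pure load balancing, while an unbounded depth degenerates to max-cache-hit.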

24 Data-Aware Scheduler Profiling
[Figure: CPU time per task (ms) broken down into submit, notification for task availability, task dispatch (data-aware scheduler), task results (data-aware scheduler), notification for task results, and WS communication; and throughput (tasks/sec) for first-available (without and with I/O), max-compute-util, max-cache-hit, and good-cache-compute]

25 AstroPortal Stacking Service
Purpose: on-demand stacks of random locations within a ~1TB dataset
Challenge: rapid access to 10-10K random files; time-varying load
Sample workloads: [Table: data locality vs. number of objects and number of files; values not preserved in transcription]
[Figure: web page or web service front-end over Sloan data]

26 AstroPortal Stacking Service
Purpose: on-demand stacks of random locations within a ~1TB dataset
Challenge: rapid access to 10-10K random files; time-varying load
Sample workloads: [Table: data locality vs. number of objects and number of files]
[Figure: per-stack time (ms) broken down into open, radec2xy, readhdu+gettile+curl+convertarray, calibration+interpolation+dostacking, and writestacking, for GPFS GZ, LOCAL GZ, GPFS FIT, and LOCAL FIT filesystem and image format combinations]

27 AstroPortal Stacking Service with Data Diffusion
Low data locality: similar (but better) performance than GPFS
High data locality: near-perfect scalability
[Figure: time (ms) per stack per CPU vs. number of CPUs, for Data Diffusion (GZ, FIT) and GPFS (GZ, FIT), at low and high data locality]

28 AstroPortal Stacking Service with Data Diffusion
Aggregate throughput: 39Gb/s, 10X higher than GPFS
Reduced load on GPFS: 1/10 of the original load
Big performance gains as locality increases
[Figure: time (ms) per stack per CPU and aggregate throughput (Gb/s) vs. locality, for data diffusion (GZ, FIT), GPFS (GZ, FIT), and ideal; data diffusion throughput split into local, cache-to-cache, and GPFS contributions]

29 Monotonically Increasing Workload
250K tasks, 10MB reads, 10ms compute
Vary arrival rate: min 1 task/sec; increment function CEILING(previous*1.3); max 1000 tasks/sec
128 processors
Ideal case: 1415 sec, 80Gb/s peak throughput
[Figure: arrival rate per second, tasks completed, and ideal throughput (Mb/s) over time]
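The arrival schedule can be reconstructed directly from the slide: start at 1 task/sec and apply CEILING(rate*1.3) until the 1000 tasks/sec cap. The 60-second interval per rate step below is an assumption, but it reproduces the 1415 sec ideal makespan for 250K tasks, so it is likely what the workload used.

```python
# Reconstruction sketch of the monotonically increasing arrival-rate schedule.
import math

def arrival_schedule(total_tasks=250_000, max_rate=1000, interval_sec=60):
    rate, issued, schedule = 1, 0, []
    while issued < total_tasks:
        n = min(rate * interval_sec, total_tasks - issued)
        schedule.append((rate, n))   # (tasks/sec, tasks issued at that rate)
        issued += n
        rate = min(math.ceil(rate * 1.3), max_rate)
    return schedule

sched = arrival_schedule()
makespan = sum(n / rate for rate, n in sched)
print(f"rate steps: {[r for r, _ in sched]}")
print(f"ideal makespan: {makespan:.0f} sec")   # ~1415 sec
```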

30 Data Diffusion: First-Available (GPFS)
GPFS vs. ideal: 5011 sec vs. 1415 sec
[Figure: number of nodes, throughput (Gb/s), and wait queue length over time, against the ideal throughput]

31 Data Diffusion: Max-Compute-Util & Max-Cache-Hit
[Figure: two panels, max-compute-util and max-cache-hit, each showing cache hit local %, cache hit global %, cache miss %, ideal and achieved throughput (Gb/s), wait queue length, number of nodes, and CPU utilization over time]

32 Data Diffusion: Good-Cache-Compute
[Figure: four panels for per-node cache sizes of 1GB, 1.5GB, 2GB, and 4GB, each showing cache hit local %, cache hit global %, cache miss %, ideal and achieved throughput (Gb/s), wait queue length, and number of nodes over time]

33 Data Diffusion: Good-Cache-Compute
Data Diffusion vs. ideal: 1436 sec vs. 1415 sec
[Figure: number of nodes, aggregate throughput (Gb/s), throughput per node (MB/s), wait queue length, and cache hit/miss % over time]

34 Data Diffusion: Throughput and Response Time
Throughput: average 14Gb/s vs. 4Gb/s; peak 100Gb/s vs. 6Gb/s
Response time: 3.1 sec vs. 1569 sec, a 506X improvement
[Figure: throughput split into local worker caches, remote worker caches, and GPFS (Gb/s), and average response time (sec), for first-available, good-cache-compute (1GB, 1.5GB, 2GB, 4GB), max-cache-hit (4GB), and max-compute-util (4GB)]

35 Data Diffusion: Performance Index, Slowdown, and Speedup
Performance index: 34X higher
Speedup: 3.5X faster than GPFS
Slowdown: 18X slowdown for GPFS; near-ideal 1X slowdown for large enough caches
[Figure: performance index and speedup (compared to first-available) for first-available, good-cache-compute (1GB, 1.5GB, 2GB, 4GB, and 4GB with SRP), max-cache-hit (4GB), and max-compute-util (4GB); slowdown vs. arrival rate per second]

36 Sin-Wave Workload
2M tasks, 10MB reads, 10ms compute
Vary arrival rate: min 1 task/sec, max 1000 tasks/sec
Arrival rate function: A = (sin(sqrt(time+0.11)*c) + 1)*(time+0.11)*5.75, where c is an inner frequency constant not preserved in the transcription
200 processors
Ideal case: 6505 sec, 80Gb/s peak throughput
[Figure: arrival rate (per sec) and number of tasks completed over time]
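Reconstructed as above, the arrival-rate function can be sampled directly. The inner frequency constant c was lost in transcription, so the value below is an assumed placeholder; the 1000 tasks/sec cap follows the reconstructed maximum rate.

```python
# Sketch of the sine-wave arrival-rate function as reconstructed above.
# c = 2.86 is an assumed illustrative value for the lost frequency constant.
import math

def arrival_rate(t, c=2.86, cap=1000):
    a = (math.sin(math.sqrt(t + 0.11) * c) + 1) * (t + 0.11) * 5.75
    return min(a, cap)

for t in (0, 60, 300, 600):
    print(f"t={t:4d}s  rate={arrival_rate(t):7.1f} tasks/sec")
```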

37 Sin-Wave Workload
GPFS: 5.7 hrs, ~8Gb/s, 1140 CPU hrs
DF+SRP: 1.8 hrs, ~25Gb/s, 360 CPU hrs
DF+DRP: 1.86 hrs, ~24Gb/s, 253 CPU hrs
[Figure: two panels showing number of nodes, throughput (Gb/s), wait queue length, demand (Gb/s), and cache hit/miss % over time, for static (SRP) and dynamic (DRP) resource provisioning]

38 Sin-Wave Workload
[Figure: number of nodes, throughput (Gb/s), wait queue length, demand (Gb/s), and cache hit local/global % and cache miss % over time]

39 All-Pairs Workload
500x500: 250K tasks, 24MB reads, 100ms compute, 200 CPUs
1000x1000: 1M tasks, 24MB reads, 4 sec compute, 4096 CPUs
Ideal case: 655 sec, 80Gb/s peak throughput
All-Pairs( set A, set B, function F ) returns matrix M: compare all elements of set A to all elements of set B via function F, yielding matrix M, such that M[i,j] = F(A[i],B[j])
foreach $i in A
    foreach $j in B
        submit_job F $i $j
    end
end
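The pseudocode maps naturally onto any task pool. Below is a runnable toy version with a local process pool standing in for the cluster's task dispatcher; the comparison function F here is an invented stand-in.

```python
# Runnable toy version of the All-Pairs pattern above.
from concurrent.futures import ProcessPoolExecutor
from itertools import product

def F(a, b):
    return abs(a - b)          # stand-in comparison function

def all_pairs(A, B):
    with ProcessPoolExecutor() as pool:
        # evaluate F over the full cross product of A and B
        flat = list(pool.map(F, *zip(*product(A, B))))
    # reshape flat results into the |A| x |B| matrix M[i][j] = F(A[i], B[j])
    return [flat[i * len(B):(i + 1) * len(B)] for i in range(len(A))]

if __name__ == "__main__":
    print(all_pairs([1, 2, 3], [4, 5]))
```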

40 All-Pairs Workload: 500x500 on 200 CPUs
Efficiency: 75%
[Figure: throughput of data diffusion (Gb/s) against the maximum GPFS and local disk throughputs, with cache hit local/global % and cache miss % over time]

41 All-Pairs Workload: 1000x1000 on 4K emulated CPUs
Efficiency: 86%
[Figure: throughput of data diffusion (Gb/s) against the maximum GPFS and local memory throughputs, with cache hit local/global % and cache miss % over time]

42 All-Pairs Workload: Data Diffusion vs. Active Storage
Push vs. Pull
Active Storage: pushes the workload working set to all nodes; static spanning tree
Data Diffusion: pulls the task working set; incremental spanning forest
[Table: efficiency and data moved over local disk/memory (GB), shared file system (GB), and network node-to-node (GB), for best case (active storage) vs. Falkon (data diffusion), across experiments: 500x500 on 200 CPUs (1 sec compute), 500x500 on 200 CPUs (0.1 sec compute), 1000x1000 on 4096 CPUs (4 sec compute), and Falkon (GPFS)]

43 All-Pairs Workload: Data Diffusion vs. Active Storage
Best to use active storage if: slow data source; workload working set fits on local node storage; good aggregate network bandwidth
Best to use data diffusion if: medium to fast data source; task working set << workload working set; task working set fits on local node storage; good aggregate network bandwidth
If the task working set does not fit on local node storage: use a parallel file system (e.g. GPFS, Lustre, PVFS)
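These rules of thumb read as a small decision procedure; a hypothetical encoding (all parameter names invented) follows.

```python
# Hypothetical encoding of the slide's rules for choosing a data strategy.
def choose_strategy(source_fast, task_ws_fits_node, workload_ws_fits_node,
                    good_network):
    if not task_ws_fits_node:
        return "parallel file system"      # e.g. GPFS, Lustre, PVFS
    if workload_ws_fits_node and not source_fast and good_network:
        return "active storage"            # push whole working set everywhere
    if source_fast and good_network:
        return "data diffusion"            # pull per-task working sets on demand
    return "parallel file system"
```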

44 Limitations of Data Diffusion
Needs Java 1.4+
Needs IP connectivity between hosts
Needs local storage (disk, memory, etc.)
Per-task working set must fit in local storage
Task definition must include input/output files metadata
Data access patterns: write once, read many

45 Related Work: Data Management
[Beynon01]: DataCutter
[Ranganathan03]: simulations
[Ghemawat03, Dean04, Chang06]: GFS, MapReduce, BigTable
[Liu04]: GridDB
[Chervenak04, Chervenak06]: RLS (Replica Location Service), DRS (Data Replication Service)
[Tatebe04, Xiaohui05]: GFarm
[Branco04, Adams06]: DIAL/ATLAS
[Kosar06]: Stork
[Thain08]: Chirp/Parrot
Conclusion: none focused on the co-location of storage and generic black box computations with data-aware scheduling while operating in a dynamic environment

46 Scaling from 1K to 100K CPUs without Data Diffusion
At 1K CPUs:
1 server to manage all 1K CPUs
Use the shared file system extensively: invoke the application from the shared file system; read/write data from/to the shared file system
At 100K CPUs:
N servers to manage 100K CPUs (1:256 ratio)
Don't trust the application I/O access patterns to behave optimally
Copy applications and input data to RAM; read input data from RAM, compute, and write results to RAM
Archive all results in a single file in RAM; copy the result file from RAM back to GPFS
Great potential for improvements: could leverage the Torus network for high aggregate bandwidth; Collective I/O (CIO) primitives
Roadblocks: machine global IP connectivity, Java support, and time
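A sketch of the RAM-staging pattern the slide describes, with hypothetical paths (a RAM-backed scratch directory and a GPFS location); this illustrates the shape of the approach, not the actual BG/P scripts.

```python
# RAM-staging sketch: stage app and input into a RAM filesystem, run there,
# archive the output as a single file, and copy one archive (not thousands
# of small files) back to the shared file system. Paths are hypothetical.
import shutil
import subprocess
import tarfile

RAM = "/dev/shm/job"           # assumed RAM-backed scratch location
GPFS = "/gpfs/scratch/job"     # assumed shared-filesystem location

shutil.copytree(f"{GPFS}/app", f"{RAM}/app")       # stage application to RAM
shutil.copytree(f"{GPFS}/input", f"{RAM}/input")   # stage input data to RAM
subprocess.run([f"{RAM}/app/run", f"{RAM}/input", f"{RAM}/output"], check=True)
with tarfile.open(f"{RAM}/results.tar", "w") as tar:
    tar.add(f"{RAM}/output", arcname="output")     # one archive in RAM
shutil.copy(f"{RAM}/results.tar", f"{GPFS}/results.tar")  # single write to GPFS
```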

47 Mythbusting
Embarrassingly ("happily") parallel apps are trivial to run: logistical problems can be tremendous
Loosely coupled apps do not require supercomputers: total computational requirements can be enormous; individual tasks may be tightly coupled; workloads frequently involve large amounts of I/O; idle supercomputer resources can be used via backfilling; cost per FLOP is among the best: BG/P 0.35 gigaflops/watt (higher is better), SiCortex 0.32 gigaflops/watt, BG/L 0.23 gigaflops/watt, x86-based HPC systems an order of magnitude lower
Loosely coupled apps do not require specialized system software
Shared file systems are good for all applications: they don't scale proportionally with the compute resources; data intensive applications don't perform and scale well

48 Conclusions & Contributions
Defined an abstract model for performance efficiency of data analysis workloads using data-centric task farms
Provided a reference implementation (Falkon):
uses a streamlined dispatcher to increase task throughput by several orders of magnitude over traditional LRMs
uses multi-level scheduling to reduce perceived wait queue time for tasks to execute on remote resources
addresses data diffusion through co-scheduling of storage and computational resources to improve performance and scalability
provides the benefits of dedicated hardware without the associated high cost
Showed the effectiveness of data diffusion: a real large-scale astronomy application and a variety of synthetic workloads

49 More Information
Related projects: Falkon, AstroPortal, Swift
Funding: NASA (Ames Research Center, Graduate Student Research Program (GSRP)); DOE (Mathematical, Information, and Computational Sciences Division subprogram of the Office of Advanced Scientific Computing Research, Office of Science, U.S. Dept. of Energy); NSF (TeraGrid)

50 In conjunction with IEEE/ACM SuperComputing 2008. Location: Austin, Texas. Date: November 17th, 2008.

51 In conjunction with IEEE/ACM SuperComputing 2008. Location: Austin, Texas. Date: November 18th, 2008.
