Towards Jungle Computing with Ibis/Constellation


Towards Jungle Computing with Ibis/Constellation Jason Maassen, Niels Drost, Henri Bal, Frank Seinstra Department of Computer Science, VU University, Amsterdam, The Netherlands

Introduction HPC is entering many domains Not just: physics / chemistry / climate modelling Also: semantic web / medical / multimedia analysis / neuroinformatics / remote sensing / astronomy /... HPC is becoming more complex Not just large SMPs or clusters, instead: clusters of SMPs / grids / clouds / supercomputers /... Heterogeneous machines using GPU / Cell / FPGA It's a jungle out there 3DAPAS Workshop 2011

Example Domain Computational Astrophysics (amusecode.org)

Jungle Computing Worst-case computing... as required by users Arbitrary combinations of distributed, hierarchical, and heterogeneous computing

Many Task Computing According to Raicu, Foster, et al [SC 08] High-performance computations comprising multiple distinct activities, coupled via file system operations or message passing. Tasks may be small or large, uni-processor or multi-processor, compute-intensive or data-intensive. The set of tasks may be static or dynamic, homogeneous or heterogeneous, loosely coupled or tightly coupled. The aggregate number of tasks, quantity of computing, and volumes of data may be extremely large. Applications are dynamic and heterogeneous workflows / DAGs of activities

MTC in the Jungle MTC has advantages for Jungle Computing Many distinct activities Can be implemented independently, using the tools and the HPC architecture that best suit them Reduced programming complexity Complete applications are constructed using sequences and combinations of activities

Constellation MTC system for Jungle Computing Model based on: activities (tasks) executors (resources) contexts (matchmaking) events (communication)

Constellation Model Application Application: set of activities Distinct tasks Size and complexity may vary Targeted at specific HPC platform (Loosely) coupled using events Often a wrapper around existing code Similar to a workflow or DAG of tasks Dynamic and unlimited in size

Constellation Model Hardware Hardware: set of executors Capable of running activities May represent anything from a single core to an entire cluster, a GPU, etc. May be application specific Provides an application-specific heterogeneous resource pool

Constellation Model Context Both activities and executors are tagged with a context Application-defined label (+ rank) Used to define relationships between activities and executors, e.g.: data dependencies, hardware requirements,... May combine contexts Executors may have a preference for label or rank
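As a rough illustration of the label-plus-rank tagging just described: an executor may run an activity when their labels agree, while the rank only influences ordering. The real Constellation API is Java; the Python names below are invented for exposition.

```python
# Minimal sketch of Constellation-style context matching.
# Invented names; the real Constellation API is Java and differs in detail.

class Context:
    """An application-defined label plus an optional rank."""
    def __init__(self, label, rank=None):
        self.label = label
        self.rank = rank

def matches(activity_ctx, executor_ctx):
    """An executor may run an activity when their labels agree.

    Here "any" acts as a wildcard on the executor side; the rank is
    used only to order matching activities, not to reject them.
    """
    return executor_ctx.label in ("any", activity_ctx.label)

# Example: an activity whose input data lives on cluster VU3, ranked by file size.
act = Context("VU3", rank=42)
assert matches(act, Context("VU3"))      # same label: match
assert matches(act, Context("any"))      # wildcard executor: match
assert not matches(act, Context("VU4"))  # different cluster: no match
```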

Constellation Model Matchmaking RTS performs load balancing and matchmaking Ensures activities are forwarded to a suitable executor Tries to keep all executors busy Uses context-aware work-stealing RTS also performs event routing Based on unique activity identifiers ComplexHPC Spring School 2011
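A toy sketch of context-aware work-stealing as described above, under the assumption that an idle executor only steals activities whose labels it accepts. This is illustrative Python, not the actual Constellation runtime.

```python
import random
from collections import deque

# Toy model of context-aware work-stealing. Illustrative only; the real
# Constellation runtime is Java and uses a more elaborate protocol.

class Executor:
    def __init__(self, name, labels):
        self.name = name
        self.labels = set(labels)  # context labels this executor accepts
        self.queue = deque()       # activities queued locally
        self.done = []

    def accepts(self, activity_label):
        return "any" in self.labels or activity_label in self.labels

def steal(thief, executors):
    """Steal one activity whose context matches the thief's labels."""
    victims = [e for e in executors if e is not thief and e.queue]
    random.shuffle(victims)  # avoid always raiding the same victim
    for victim in victims:
        for act in list(victim.queue):
            if thief.accepts(act):
                victim.queue.remove(act)
                return act
    return None

def run(executors):
    # Each executor works from its own queue, or steals when idle.
    while any(e.queue for e in executors):
        for e in executors:
            if e.queue:
                e.done.append(e.queue.popleft())
            else:
                act = steal(e, executors)
                if act is not None:
                    e.done.append(act)

# Two executors: one GPU-only, one general-purpose. All work starts on cpu0.
gpu = Executor("gpu0", {"gpu"})
cpu = Executor("cpu0", {"any"})
cpu.queue.extend(["gpu", "cpu", "gpu", "cpu"])
run([gpu, cpu])
assert all(label == "gpu" for label in gpu.done)  # gpu0 only stole gpu work
assert len(gpu.done) + len(cpu.done) == 4         # all activities finished
```

The context check during stealing is what keeps GPU tasks off CPU-only executors without any central scheduler.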

Constellation API (code examples shown on slides; not captured in this transcription)
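The API code on these slides is not in the transcript. The sketch below only mimics the overall shape of the model described earlier: an activity is submitted with a context, picked up for execution, and reachable via events through a unique identifier. All names are invented and do not correspond to the real Java API.

```python
# Rough shape of a Constellation-style program (invented names; the real
# API is Java). An activity is submitted with a context and communicates
# via events routed by its unique activity identifier.

class SumActivity:
    def __init__(self, context):
        self.context = context  # label used for matchmaking
        self.total = 0

    def initialize(self):
        pass  # e.g. spawn sub-activities here

    def process(self, event):
        self.total += event  # events carry application data

class MiniConstellation:
    def __init__(self):
        self.next_id = 0
        self.activities = {}

    def submit(self, activity):
        aid = self.next_id  # unique identifier, used for event routing
        self.next_id += 1
        self.activities[aid] = activity
        activity.initialize()
        return aid

    def send(self, aid, event):
        self.activities[aid].process(event)

c = MiniConstellation()
aid = c.submit(SumActivity(context="VU3"))
c.send(aid, 20)
c.send(aid, 22)
assert c.activities[aid].total == 42
```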

DACH 2008 Data Challenge in conjunction with IEEE Cluster/Grid 2008 Supernova detection Analyse 1052 image pairs on 11 clusters (InTrigger) Sequential executable provided

DACH 2008 Problem Main problems: Data distribution Heterogeneity of work and hardware Load balancing

DACH 2008 Workflow Winning approach in 2008: Parallelize workflow to improve hardware utilization Create hierarchical master-worker framework Scheduling heuristics using data location and size

Constellation Version Option 1: Monolithic Wrap entire application in a single activity One activity per image pair Wrap each machine in one executor Multiple cores per executor Use context to influence order and placement of activities

Evaluation InTrigger not available Instead we use DAS3+DAS4 5+6 clusters in the Netherlands Mix of 2/4/8/12/48-core machines Various types of GPUs Three scenarios: data locality, executor granularity, heterogeneous processing

Scenario 1 Data Locality Data distributed over 4 clusters of DAS3 + DAS4 Use context to express data locality and preferred processing order Adapt context to tune application No change in application

Scenario 1 Results

Activity            Executor            Effect
any                 any                 Random order
any, 50             VU3, VU4, 50        Sorted by size
any, biggest        VU3, biggest        Local only, sorted by size
VU3, VU4, any, 50   VU3, any, biggest   Preference for local, fallback to any, sorted by size
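One way to read these effects: an executor's ordered context preferences decide where it looks for work first, and the rank decides ordering within a match. A toy version of that selection policy (invented names, not the real scheduler):

```python
# Toy illustration of how executor context preferences can yield effects
# like "local only" or "preference for local, biggest first".
# Invented names; not the real Constellation scheduler.

def select(activities, executor_prefs, by_rank):
    """Pick the next activity for an executor.

    executor_prefs: ordered labels, e.g. ["VU3", "any"] means
    "prefer local VU3 work, otherwise take anything".
    by_rank: if True, prefer the highest-ranked (e.g. biggest) match.
    """
    for label in executor_prefs:
        pool = [a for a in activities
                if label == "any" or a["label"] == label]
        if pool:
            if by_rank:
                pool.sort(key=lambda a: a["rank"], reverse=True)
            chosen = pool[0]
            activities.remove(chosen)
            return chosen
    return None

# Activities tagged with the cluster holding their data, ranked by file size.
acts = [{"label": "VU3", "rank": 10},
        {"label": "VU4", "rank": 99},
        {"label": "VU3", "rank": 50}]

# "VU3, any, biggest": local VU3 data first, biggest first, then fall back.
order = []
while acts:
    order.append(select(acts, ["VU3", "any"], by_rank=True))
assert [a["rank"] for a in order] == [50, 10, 99]
```

Changing only the preference list or the rank flag reorders the work, which matches the slide's point that the application itself never changes.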

Constellation Version Option 2: Workflow Wrap each stage in an activity Wrap each core in an executor Use context to influence order and placement of the jobs

Scenario 3: Heterogeneous System 18-node GPU cluster 8 cores + 1 GPU per node Activity: single task Executor: 1 core (top) 1 core or GPU (bottom) Replaced activity 7.2 with GPU version Label activities and executors accordingly Significant performance gain.

Conclusions We think Jungle Computing is a necessity for some application areas. Constellation offers a suitable model (MTC) to create such applications. Initial experiments show that Constellation works well for a wide range of hardware configurations Easy to reconfigure applications to match resources Allows integration of specialized accelerator codes Suitable basis for a Jungle Computing model

Future Work Application development AMUSE Remote sensing Climate modelling Platform improvements Easier integration of existing codes Smart/automatic deployment and tuning of executors Improved data handling Better monitoring

Questions? jason@cs.vu.nl www.cs.vu.nl/ibis

Scenario 2 Executor Granularity 30 largest images only Single 48-core machine Activity: entire application (a-c) single task (d) Executor: [n] cores No change in application for experiments (a-c) Only the executor config changes Completely ported application in (d) Significant performance gain!