Adaptive Cluster Computing using JavaSpaces

1 Adaptive Cluster Computing using JavaSpaces
  Jyoti Batheja and Manish Parashar
  The Applied Software Systems Lab., ECE Department, Rutgers University

2 Outline
  Background
    Introduction
    Related Work
    Summary of Presented Framework
  Framework
    Framework Architecture
    Framework Operation
  Experimental Evaluation
  Conclusions & Future Work

3 Background

4 Introduction
  Motivation
    Traditional approach adopted for HPC
    Potential for leveraging existing networked resources for HPC
  Objective: design and implement a framework that
    Aggregates networked computing resources
    Non-intrusively exploits idle resources for parallel/distributed computing
  Challenges that need to be addressed by the framework
    1. Adaptability to cluster dynamics
    2. System configuration and management overhead
    3. Heterogeneity
    4. Security and privacy concerns
    5. Non-intrusiveness

5 Related Work
  Parallelism models: Adaptive Parallelism, Job Level Parallelism
  Cluster based: Condor, Piranha, ATLAS, ObjectSpace
  Web based: Charlotte, Javelin, ParaWeb

6 Summary of the Presented Work
  Features
    Implementation using Java as the core programming language to leverage portability across heterogeneous platforms
    Offloading user responsibilities to the framework for resource availability event triggers
    Minimized configuration overhead by employing remote node configuration techniques
  Targeted Applications
    High computational complexity
    Massively partitioned problems
    Small input and output sizes

7 Framework Architecture

8 Architectural Overview
  Master Module: Master Process
  JavaSpaces / Jini Infrastructure: Space
  Worker Module: Remote Node Configuration Engine, Worker Process, Rule Based Protocol, Java Sockets
  Network Management Module: Inference Engine, Rule Based Protocol, Java Native Interface, Monitoring Agent
  Common Services
  Platform Infrastructure (Operating System)

9 Architectural Components: Jini Infrastructure
  Jini Based Infrastructure (participants: Jini Server, Jini Lookup Server, Jini Client)
  1. Service discovers the lookup server through multicast discovery.
  2. Service registers with the lookup server using join.
  3. Client discovers the lookup server through multicast discovery.
  4. Client performs a lookup directly on the known lookup server and receives the server location.
  5. Client directly accesses the server and executes code on the server.
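The discover, join, and lookup sequence on this slide can be sketched in plain Java. This is a minimal in-memory stand-in and not the Jini API: `LookupServer`, `ComputeService`, and `discover()` are hypothetical names introduced only for illustration, and multicast discovery is reduced to returning a shared object.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** In-memory stand-in for the discovery, join, and lookup steps; not the Jini API. */
class LookupSketch {

    /** Hypothetical service interface that a client wants to find. */
    interface ComputeService {
        int square(int x);
    }

    /** Hypothetical lookup server: maps a service interface to a registered provider. */
    static final class LookupServer {
        private final Map<Class<?>, Object> registry = new ConcurrentHashMap<>();

        // Step 2: a service registers ("joins") under the interface it implements.
        <T> void join(Class<T> type, T provider) {
            registry.put(type, provider);
        }

        // Step 4: a client looks up a provider by interface; empty if none is registered.
        <T> Optional<T> lookup(Class<T> type) {
            return Optional.ofNullable(type.cast(registry.get(type)));
        }
    }

    // Steps 1 and 3: discovery is reduced to handing out one shared instance.
    // Real Jini discovery happens over the network via multicast.
    private static final LookupServer SHARED = new LookupServer();

    static LookupServer discover() {
        return SHARED;
    }

    public static void main(String[] args) {
        // Service side: discover the lookup server, then join.
        discover().join(ComputeService.class, x -> x * x);

        // Client side: discover, look up, then call the service directly (step 5).
        ComputeService service = discover().lookup(ComputeService.class)
                .orElseThrow(() -> new IllegalStateException("service not registered"));
        System.out.println(service.square(7));
    }
}
```

The point of the indirection is that client and service only need to agree on an interface; neither needs the other's address in advance.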

10 Architectural Components: Master-Worker Paradigm
  Master-Worker Interaction Diagram
  Master: creates tasks, loads them into the space, awaits collection of results
  Space: pool of tasks
  Workers 1 to n: each collects tasks, executes them, and returns results into the space
  Legend: Writing Input Parameters, Writing Task, Writing Results, Collect Result Parameters
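The interaction above can be illustrated with a small, self-contained sketch. It assumes two in-memory blocking queues in place of the space (one pool of tasks, one pool of results) and a toy task that squares an integer; `MasterWorkerSketch`, `Task`, `Result`, and `runJob` are hypothetical names, and none of this is the framework's actual code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Master-worker flow over two in-memory queues that stand in for the space. */
class MasterWorkerSketch {

    static final class Task {
        final int id;
        final int input;
        Task(int id, int input) { this.id = id; this.input = input; }
    }

    static final class Result {
        final int taskId;
        final long value;
        Result(int taskId, long value) { this.taskId = taskId; this.value = value; }
    }

    // Sentinel telling a worker to exit (compared by identity).
    private static final Task SHUTDOWN = new Task(-1, 0);

    /** Runs taskCount toy tasks on workerCount threads and returns the sum of the results. */
    static long runJob(int taskCount, int workerCount) {
        if (taskCount < 0 || workerCount < 1) {
            throw new IllegalArgumentException("need taskCount >= 0 and workerCount >= 1");
        }
        BlockingQueue<Task> taskPool = new LinkedBlockingQueue<>();
        BlockingQueue<Result> resultPool = new LinkedBlockingQueue<>();
        List<Thread> workers = new ArrayList<>();
        try {
            for (int w = 0; w < workerCount; w++) {
                Thread worker = new Thread(() -> workerLoop(taskPool, resultPool));
                worker.start();
                workers.add(worker);
            }
            // Master: create tasks and load them into the task pool.
            for (int i = 0; i < taskCount; i++) {
                taskPool.put(new Task(i, i));
            }
            // Master: await and aggregate exactly one result per task.
            long sum = 0;
            for (int i = 0; i < taskCount; i++) {
                sum += resultPool.take().value;
            }
            // Ask every worker to stop, then wait for them to finish.
            for (int w = 0; w < workerCount; w++) {
                taskPool.put(SHUTDOWN);
            }
            for (Thread worker : workers) {
                worker.join();
            }
            return sum;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("master interrupted", e);
        }
    }

    // Worker: collect a task, execute it, return the result, repeat.
    private static void workerLoop(BlockingQueue<Task> taskPool, BlockingQueue<Result> resultPool) {
        try {
            while (true) {
                Task task = taskPool.take();
                if (task == SHUTDOWN) {
                    return;
                }
                resultPool.put(new Result(task.id, (long) task.input * task.input));
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(runJob(50, 3));
    }
}
```

Because workers pull tasks rather than being assigned them, a faster or less loaded worker naturally ends up executing more tasks, which is the property the load experiments later in the deck examine.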

11 Architectural Components: Rule-Base Protocol
  UML Interaction Diagram for Rule-Base Protocol
  Participants: Inference Engine, SNMP Server, SNMP Client, Worker Application
  1. Server listens for client connections.
  2. Client connects and sends its IP address to the server.
  3. Server assigns a client ID.
  4. Server invokes the SNMP service for the client.
  5. Invokes the inference engine.
  6. Decides the signal for the client.
  7. Sends the signal to the client through the server.
  8. Client sends the signal to the application.
  9. Computations are performed.
  10. Goto Step
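The decision step of the protocol (the inference engine choosing a signal for a client) can be sketched as a simple threshold rule. The slides do not give the actual rules, so the CPU-load input and the 50% and 80% thresholds below are assumptions made purely for illustration; only the signal names (Start, Pause, Stop) come from the deck.

```java
/** Threshold rule mapping a monitored CPU load to a worker signal; thresholds are assumed. */
class InferenceSketch {

    enum Signal { START, PAUSE, STOP }

    // Hypothetical thresholds, not taken from the slides.
    static final double PAUSE_ABOVE_PERCENT = 50.0;
    static final double STOP_ABOVE_PERCENT = 80.0;

    /** Decides the signal for a client from the CPU usage (0 to 100) reported for its node. */
    static Signal decide(double cpuPercent) {
        if (cpuPercent < 0.0 || cpuPercent > 100.0) {
            throw new IllegalArgumentException("cpuPercent must be within 0..100");
        }
        if (cpuPercent > STOP_ABOVE_PERCENT) {
            return Signal.STOP;   // node is busy with other work: withdraw the worker
        }
        if (cpuPercent > PAUSE_ABOVE_PERCENT) {
            return Signal.PAUSE;  // moderate load: suspend the worker temporarily
        }
        return Signal.START;      // node is mostly idle: the worker may compute
    }

    public static void main(String[] args) {
        for (double load : new double[] {10.0, 60.0, 95.0}) {
            System.out.println(load + "% -> " + decide(load));
        }
    }
}
```

A real rule base would likely consider more than one metric and smooth over short load spikes; a single threshold per signal is the simplest form that still shows the monitor-decide-signal loop.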

12 Architectural Components: Worker Execution States
  Worker State Transition Diagram
  States: Stop, Running, Pause
  Transition signals: Start, Stop, Pause, Resume
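The worker states and signals can be written down as a small state machine. The exact arrows of the diagram are not recoverable from the slide text, so the transition table below is an assumption: Start moves a stopped worker to running, Pause and Resume toggle between running and paused, Stop applies from either active state, and any other signal leaves the state unchanged.

```java
/** Worker execution states as a state machine; the transition table is an assumption. */
class WorkerStateSketch {

    enum State { STOPPED, RUNNING, PAUSED }

    enum Signal { START, STOP, PAUSE, RESUME }

    /** Returns the state after applying a signal; signals that do not apply are ignored. */
    static State next(State current, Signal signal) {
        switch (current) {
            case STOPPED:
                return signal == Signal.START ? State.RUNNING : current;
            case RUNNING:
                if (signal == Signal.PAUSE) {
                    return State.PAUSED;
                }
                if (signal == Signal.STOP) {
                    return State.STOPPED;
                }
                return current;
            case PAUSED:
                if (signal == Signal.RESUME) {
                    return State.RUNNING;
                }
                if (signal == Signal.STOP) {
                    return State.STOPPED;
                }
                return current;
            default:
                throw new IllegalStateException("unknown state: " + current);
        }
    }

    public static void main(String[] args) {
        State state = State.STOPPED;
        Signal[] signals = {Signal.START, Signal.PAUSE, Signal.RESUME, Signal.STOP};
        for (Signal signal : signals) {
            state = next(state, signal);
            System.out.println(signal + " -> " + state);
        }
    }
}
```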

13 Experimental Evaluation

14 Experimental Evaluation
  Applications
    Parallel Monte Carlo Simulation for Stock Option Pricing: the problem domain is divided into 50 subtasks, each comprising 100 simulations.
    Parallel Ray Tracing: a 600 x 600 image plane is divided into rectangular slices of 25 x 600, creating 24 independent tasks.
  Application Metrics
    Metric                    Option Pricing Scheme                          Ray Tracing Scheme
    Scalability               Medium                                         High
    CPU Memory Requirements   Adaptable depending on number of simulations   High
    Dependency                No                                             No
  Experimental Setup (machines connected over a LAN)
    Jini Server
    Master Workstation: P III, 256 MB RAM, 800 MHz
    SNMP Server
    Worker 1 to Worker 12: P III, 64 MB RAM, 300 MHz each
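The ray tracing partitioning is simple arithmetic: 600 / 25 = 24 slices. A minimal sketch of such a partitioner follows, assuming each slice is 25 pixels wide and spans the full 600-pixel height (the slide does not state the orientation); `Slice` and `partition` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Splits an image plane into equal vertical slices, one per independent task. */
class SlicePartitionSketch {

    static final class Slice {
        final int xOffset;
        final int width;
        final int height;
        Slice(int xOffset, int width, int height) {
            this.xOffset = xOffset;
            this.width = width;
            this.height = height;
        }
    }

    /** Returns the slices covering the plane from left to right. */
    static List<Slice> partition(int planeWidth, int planeHeight, int sliceWidth) {
        if (planeWidth <= 0 || planeHeight <= 0 || sliceWidth <= 0) {
            throw new IllegalArgumentException("dimensions must be positive");
        }
        if (planeWidth % sliceWidth != 0) {
            throw new IllegalArgumentException("plane width must be a multiple of slice width");
        }
        List<Slice> slices = new ArrayList<>();
        for (int x = 0; x < planeWidth; x += sliceWidth) {
            slices.add(new Slice(x, sliceWidth, planeHeight));
        }
        return slices;
    }

    public static void main(String[] args) {
        List<Slice> slices = partition(600, 600, 25);
        System.out.println(slices.size() + " tasks");
    }
}
```

The same counting applies to the option pricing case: 50 subtasks of 100 simulations each give 5,000 simulations in total.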

15 Experiment 1: Scalability Analysis (Option Pricing Scheme)
  Figure 1: Scalability Analysis for Option Pricing Scheme
  [Plot of Time (ms) against Number of Workers. Series: Max Worker Time, Parallel Time, Task Planning, Task Aggregation.]

16 Experiment 2: Adaptation Protocol Analysis (Option Pricing Scheme)
  Load Simulator 1: Stop Signal
  Load Simulator 2: Pause Signal
  Figure 1(a): Graph of CPU Usage on Worker Nodes for Option Pricing Scheme
  [Plot of CPU Usage (%). Annotated phases: Class Loading, Simulation Run (STOP), Simulation Run (PAUSE). Marked signals: Start, Stop, Pause, Restart.]
  Figure 1(b): Graph of Worker Reaction Times for Option Pricing Scheme
  [Plot of Time (ms) per signal. Series: Client Signal, Worker Signal.]

17 Experiment 3: Dynamic Behavior Patterns under varying load conditions (Option Pricing Scheme)
  Figure 1(a): Graph of dynamics under load showing time measurements for Option Pricing Scheme
  [Plot of Time (ms) by Measurement Category: Total Parallel Time, Task Planning and Aggregation, Maximum Master Overhead, Maximum Worker Time. Series: Without Loading, Loading 3 Workers, Loading 6 Workers.]
  Figure 1(b): Graph of dynamics under load showing task distribution for Option Pricing Scheme
  [Plot of Number of Tasks Executed by Worker Node (Worker 1 to Worker 12). Series: Without Loading, Loading 3 Workers, Loading 6 Workers.]

18 Conclusions and Future Work

19 Conclusions
  Summary
    Good scalability for coarse grained applications
    Idle workstations can be effectively used to compute large parallel jobs
    Framework provides support for global code deployment and configuration management
    Monitoring and reacting to system state enables us to minimize intrusiveness to machines within the cluster
  Future Directions
    Optimizations at the application level
    Use of transaction management services to address partial failures
    Incorporating a distributed JavaSpace model to avoid a single point of resource contention or failure

20 Additional Slides

21 Summary of Related Cluster Based Parallel Computing Architectures
  Condor
    Approach: Cluster based, job level parallelism, asynchronous offloading of work to idle resources on a request basis
    Key Features: Queuing mechanism, checkpointing and process migration, remote procedure call and matchmaking
    Strengths: Survives partial failure, preserved local execution environment
    Weaknesses: Incompleteness of checkpointing files. Difficulty in linking checkpointing code with user code for third party software.
  Piranha
    Approach: Cluster based, adaptive parallelism, centralized work distribution using the master worker paradigm
    Key Features: Implementation of the Linda model using tuple spaces. Recycles idle resources based on user defined criteria.
    Strengths: Survives partial failure, efficient cycle stealing
    Weaknesses: Requires modifications to the Linda systems. Supports only Linda programs.

22 Summary of Related Cluster Based Parallel Computing Architectures (cont'd)
  ATLAS
    Approach: Cluster based, adaptive parallelism, hierarchical work stealing
    Key Features: Implementation based on Cilk. Emphasis on thread scheduling algorithm based on work stealing techniques. Runtime library for work stealing, thread management, marshalling of objects.
    Strengths: Fault tolerance. Localized computations: more bandwidth, reduced latencies, and natural mapping to administrative domains.
    Weaknesses: Global file sharing leads to conflicts in access control. Heterogeneity concerns: conversion mechanisms for native libraries, platform specific Java interpreter.
  ObjectSpace/Anaconda
    Approach: Cluster based, job level parallelism at brokers
    Key Features: Eager scheduling, automated/manual job replication, job level parallelism at broker level
    Strengths: Improved scalability. Periodic polling of the health of participating hosts.
    Weaknesses: Fault tolerance is not supported among brokers.

23 Summary of Related Web Based Parallel Computing Architectures
  Charlotte
    Approach: Web based, adaptive parallelism, centralized work distribution using downloadable applets
    Key Features: Abstraction of distributed shared memory, data type mapping and use of atomic update protocol. Directory lookup service for idle resource identification, embedded lightweight class server on local hosts.
    Strengths: Direct inter-applet communication, fault tolerance, eager scheduling and two-phase idempotent execution
    Weaknesses: Centralized broker can be a potential bottleneck
  Javelin
    Approach: Web based, adaptive parallelism, centralized work distribution using downloadable applets
    Key Features: Heterogeneous, secure execution environment, portability to include Linda and SPMD programming models
    Strengths: Offline worker computation, applet balancing scheme via broker registration
    Weaknesses: Inter-applet communication routed via initiating broker: potential bottleneck

24 Summary of Related Web Based Parallel Computing Architectures (cont'd)
  ParaWeb
    Approach: Web based, adaptive parallelism, centralized work distribution using scheduling servers
    Key Features: Implementation of the Java Parallel Runtime System (JPRS) to provide global shared memory and transparent thread management, and of the Java Parallel Class Library to facilitate upload and download of execution code across the web
    Strengths: Clients can act as workers
    Weaknesses: JPRS requires modifying the Java interpreter
  Open Issues
    Heterogeneity concerns
    Manual management

25 Architectural Components: JavaSpace Service
  Features
    Coordination tool for gluing processes together into a distributed application
    Supports heterogeneous computing
    Supports a dynamic client pool
    Provides a storage repository for Java objects
    Provides a simplistic API for access to the Java objects in the space: write, read, take
  Figure: JavaSpaces(TM) interaction (source: Java web site)
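The difference between the three operations is easiest to see in code: write adds an object, read returns a matching object and leaves it in the space, and take returns a matching object and removes it. The sketch below is a hypothetical in-memory `MiniSpace`, not the JavaSpaces interface. It simplifies in two ways that should be treated as assumptions: templates are plain predicates rather than entry objects, and read and take return immediately instead of blocking, with no transactions or leases.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Objects;
import java.util.Optional;
import java.util.function.Predicate;

/** Minimal in-memory space showing write, read, and take semantics; not the JavaSpaces API. */
class MiniSpaceSketch {

    static final class MiniSpace<T> {
        private final List<T> entries = new ArrayList<>();

        /** Adds an entry to the space. */
        synchronized void write(T entry) {
            entries.add(Objects.requireNonNull(entry, "entry"));
        }

        /** Returns the first matching entry and leaves it in the space. */
        synchronized Optional<T> read(Predicate<T> template) {
            return entries.stream().filter(template).findFirst();
        }

        /** Returns the first matching entry and removes it from the space. */
        synchronized Optional<T> take(Predicate<T> template) {
            Iterator<T> it = entries.iterator();
            while (it.hasNext()) {
                T entry = it.next();
                if (template.test(entry)) {
                    it.remove();
                    return Optional.of(entry);
                }
            }
            return Optional.empty();
        }

        synchronized int size() {
            return entries.size();
        }
    }

    public static void main(String[] args) {
        MiniSpace<String> space = new MiniSpace<>();
        space.write("task:1");
        space.write("result:1");

        // read leaves the entry in place; take removes it.
        System.out.println(space.read(e -> e.startsWith("task:")).isPresent());
        System.out.println(space.take(e -> e.startsWith("task:")).isPresent());
        System.out.println(space.size());
    }
}
```

Because take removes the entry atomically, two workers can never collect the same task, which is what makes the space usable as a shared task pool.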

26 Experiment 1: Scalability Analysis (Ray Tracing Scheme)
  Figure 2: Scalability Analysis for Ray Tracing Scheme
  [Plot of Time (ms) against Number of Workers. Series: Max Worker Time, Parallel Time, Task Planning, Task Aggregation.]

27 Experiment 1: Scalability Analysis (Pre-fetching Scheme)
  Figure 3: Scalability Analysis for Pre-fetching Scheme based on Page Rank
  [Plot of Time (ms) against Number of Workers. Series: Max Worker Time, Parallel Time, Task Planning, Task Aggregation.]

28 Experiment 2: Adaptation Protocol Analysis (Ray Tracing Scheme)
  Figure 2(a): Graph of CPU Usage on Worker Nodes for Ray Tracing Scheme
  [Plot of CPU Usage (%). Annotated phases: Simulation Run (PAUSE), Simulation Run (STOP). Marked signals: Start, Stop, Pause, Restart.]
  Figure 2(b): Graph of Worker Reaction Times for Ray Tracing Scheme
  [Plot of Time (ms) per signal. Series: Client Signal, Worker Signal.]

29 Experiment 2: Adaptation Protocol Analysis (Pre-fetching Scheme)
  Figure 3(a): Graph of CPU Usage on Worker Nodes for Pre-fetching Scheme
  [Plot of CPU Usage (%). Annotated phases: Class Loading, Simulation Run (STOP), Simulation Run (PAUSE). Marked signals: Start, Stop, Pause, Restart.]
  Figure 3(b): Graph of Worker Reaction Times for Pre-fetching Scheme
  [Plot of Time (ms) per signal. Series: Client Signal, Worker Signal.]

30 Experiment 3: Dynamic Behavior Patterns under varying load conditions (Ray Tracing Scheme)
  Figure 2(a): Graph of dynamics under load showing time measurements for Ray Tracing Scheme
  [Plot of Time (ms) by Measurement Category: Total Parallel Time, Task Planning and Aggregation, Maximum Master Overhead, Max Worker Time. Series: Without Loading, Loading 1 Worker, Loading 2 Workers.]
  Figure 2(b): Graph of dynamics under load showing task distribution for Ray Tracing Scheme
  [Plot of Number of Tasks Executed by Worker Node (Worker 1 to Worker 4). Series: Without Loading, Loading 1 Worker, Loading 2 Workers.]

31 Experiment 3: Dynamic Behavior Patterns under varying load conditions (Pre-fetching Scheme)
  Figure 3(a): Graph of dynamics under load showing time measurements for Pre-fetching Scheme
  [Plot of Time (ms) by Measurement Category: Parallel Time, Task Planning and Aggregation, Maximum Master Overhead, Max Worker Time. Series: Without Loading, Loading 1 Worker, Loading 2 Workers.]
  Figure 3(b): Graph of dynamics under load showing task distribution for Pre-fetching Scheme
  [Plot of Number of Tasks Executed by Worker Node (Worker 1 to Worker 4). Series: Without Loading, Loading 1 Worker, Loading 2 Workers.]

A Framework for Adaptive Cluster Computing using JavaSpaces 1 Jyoti Batheja and Manish Parashar The Applied Software Systems Laboratory Department of Electrical and Computer Engineering, Rutgers, The State