LLGrid: On-Demand Grid Computing with gridmatlab and pmatlab


Albert Reuther, 29 September 2004

This work is sponsored by the Department of the Air Force under Air Force contract F19628-00-C-0002. Opinions, interpretations, conclusions, and recommendations are those of the author and are not necessarily endorsed by the United States Government.

LLGrid On-Demand Grid Computing System: Agenda

- Introduction: example application, LLGrid vision, user survey, system requirements
- LLGrid System
- Performance Results
- LLGrid Productivity Analysis
- Summary

Example App: Prototype GMTI & SAR Signal Processing

[Block diagram: streaming sensor data from the airborne research sensor (on-board A/D, 180 MHz BW) is captured to a RAID disk recorder; real-time front-end processing (subband filter bank, delay & equalize, adaptive beamform, pulse compress) and non-real-time GMTI processing (Doppler process, STAP (clutter), subband combine, target detection) require per-stage processing loads from a few GOPS up to 430 GOPS; SAR and the new GMTI products feed an analyst workstation running MATLAB.]

- Airborne research sensor data collected
- Research analysts develop signal processing algorithms in MATLAB using collected sensor data
- Individual runs can last hours or days on a single workstation

LLGrid Goal: to develop a grid computing capability that makes it as easy to run parallel MATLAB programs on the grid as it is to run MATLAB on one's own workstation.

[Diagram: users on the Lab LAN (LLAN) connect through a LAN switch to gridsan network storage and to the LLGrid alpha cluster (cluster switch, cluster resource manager, cluster topology).]

Lab grid computing components:
- Enterprise access to high-throughput grid computing
- Enterprise distributed storage

MATLAB Users Survey

Conducted a survey of Lab staff:
- Do you run long MATLAB jobs?
- How long do those jobs run (minutes, hours, or days)?
- Are these jobs unclassified, classified, or both?

Survey results: 464 respondents; 177 answered yes to the question on whether they run long jobs.

[Bar charts of number of respondents: long jobs by duration (< 1 hr: 93, 1-24 hrs: 51, > 24 hrs: 33), and job run times in minutes, hours, or days, broken out by unclassified (UC) vs. classified (C).]

Lincoln MATLAB users are engineers and scientists, generally not computer scientists, with little experience with batch queues, clusters, or mainframes; the solution must be easy to use.

LLGrid User Requirements

- Easy to set up: first-time user setup should be automated and take less than 10 minutes
- Easy to use: using LLGrid should be the same as running a MATLAB job on the user's computer
- Compatible: Windows, Linux, Solaris, and MacOS X
- High availability
- High throughput for medium and large jobs

LLGrid Usage

[Scatter plot of jobs: job duration in seconds (1 to 1M) vs. processors used per job (1 to 100); jobs needing more than 8 CPU hours are infeasible on a desktop, and jobs needing more than 8 CPUs require on-demand parallel computing. App1, App2, and App3 fall in the region enabled by LLGrid.]

- Allowing Lincoln staff to effectively use parallel computing daily from their desktops: interactive parallel computing on 160 CPUs, with 25 users in 11 groups
- Extending the current space of data analysis and simulations that Lincoln staff can perform
  - Jobs requiring rapid turnaround. App1: weather radar signal processing algorithms
  - Jobs requiring many CPU hours. App2: hyperspectral image analysis; App3: laser propagation simulation
- 3,500 jobs and 3,600 CPU days from December 2003 to June 2004

LLGrid System
Overview, hardware, management scripts, MatlabMPI, pmatlab, gridmatlab

LLGrid Alpha Cluster

[Diagram: as above, users on the Lab LAN connect through a LAN switch to gridsan network storage and to the LLGrid alpha cluster (cluster switch, cluster resource manager, cluster topology).]

Key innovations:
- pmatlab: global array semantics for parallel MATLAB
- gridmatlab: the user's computer is transparently included in LLGrid; the user never logs into LLGrid (only mounts the file system)

Alpha Grid Hardware

80 nodes + head node: 160+2 processors, 320 GB total RAM.

Node configuration (2650 & 1750 nodes):
- Dual 2.8 & 3.06 GHz Xeon (P4), 400 & 533 MHz front-side bus
- 4 GB RAM
- Two 36 GB SCSI hard drives
- 10/100 management Ethernet interface
- Two Gig-E Intel interfaces
- Running Red Hat Linux

Commodity hardware, commodity OS, high availability.

pmatlab Software Layers

[Layer diagram: the application (input, analysis, output) is written against the library layer (pmatlab: vector/matrix, computation, task, conduit), which sits on the kernel layer (MATLAB math, MatlabMPI messaging, user interface, hardware interface) running on parallel hardware.]

A parallel library can be built with a few messaging primitives. MatlabMPI provides this messaging capability:

MPI_Send(dest, tag, comm, X);
X = MPI_Recv(source, tag, comm);

An application can be built with a few parallel structures and functions. pmatlab provides global array semantics via parallel arrays and functions:

X = ones(N, mapX);
Y = zeros(N, mapY);
Y(:,:) = fft(X);
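Putting these pieces together, the following is a minimal sketch of what a complete pmatlab program might look like, assuming the map, local/agg, and pMATLAB_Init/pMATLAB_Finalize interfaces of the pmatlab toolbox; exact names and signatures may differ from the release described here.

% Minimal pmatlab-style parallel FFT sketch (interface names assumed, not verbatim).
pMATLAB_Init;                          % sets globals Np (number of processors) and Pid

N = 1024;
mapX = map([1 Np], {}, 0:Np-1);        % distribute columns across Np processors
X = ones(N, N, mapX);                  % distributed input array
Y = zeros(N, N, mapX);                 % distributed output array with the same map

Y(:,:) = fft(X);                       % overloaded fft operates on each local part

Yagg = agg(Y);                         % gather the full result on the leader process
if Pid == 0
    disp(size(Yagg));                  % leader inspects the aggregated result
end

pMATLAB_Finalize;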

MatlabMPI: Point-to-Point Communication

- Any messaging system can be implemented using file I/O
- File I/O is provided by MATLAB via the load and save functions, which take care of the complicated buffer packing/unpacking problem
- This allows the basic functions to be implemented in ~250 lines of MATLAB code

MPI_Send(dest, tag, comm, variable);
variable = MPI_Recv(source, tag, comm);

[Diagram: the sender saves the variable to a data file on the shared file system and then creates a lock file; the receiver detects the lock file and then loads the data file.]
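As an illustration of how these two calls are used in practice, here is a small two-process example modeled on the MatlabMPI calls above (MPI_Init, MPI_Comm_rank, MPI_Comm_size, MPI_Finalize); launch details are omitted, so treat it as a sketch rather than a verbatim MatlabMPI example.

% Two-process MatlabMPI sketch: rank 0 sends an array to rank 1 via the shared file system.
MPI_Init;                              % initialize MatlabMPI
comm = MPI_COMM_WORLD;                 % default communicator
my_rank = MPI_Comm_rank(comm);         % this process's rank
nprocs  = MPI_Comm_size(comm);         % total number of processes

tag = 1;                               % message tag
if my_rank == 0
    data = rand(1000);                 % data to ship
    MPI_Send(1, tag, comm, data);      % saves the data file, then creates the lock file
elseif my_rank == 1
    data = MPI_Recv(0, tag, comm);     % waits for the lock file, then loads the data file
    disp(sum(data(:)));
end

MPI_Finalize;                          % clean up message files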

gridmatlab: Enable Grid Computing

[Diagram: users' computers, each running gridmatlab, connect over the Lab LAN (LLAN) switch and the cluster switch to gridsan network storage and to the clusters managed by the Condor resource manager.]

- Transparent interface between pmatlab (MatlabMPI) and the resource manager
- The user's computer is included in LLGrid for that user's own job only
- Amasses the requested resources for on-demand, interactive job computation
- Handles all communication with the Condor resource manager (including the submission file, launch scripts, and job aborts)
- The user never interacts with the queue system directly
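From the user's point of view, launching a job therefore looks the same as launching MatlabMPI locally. The sketch below uses MatlabMPI's MPI_Run launcher; the idea that an empty machine list lets gridmatlab obtain nodes from Condor is an assumption for illustration, and the actual gridmatlab entry point may differ.

% Launching a parallel job from the user's own MATLAB session (illustrative only).
nProcs   = 16;                          % processors requested for this job
machines = {};                          % assumed: gridmatlab/Condor selects the nodes,
                                        % including the user's own computer
eval(MPI_Run('my_sim', nProcs, machines));   % run my_sim.m across the requested nodes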

LLGrid Account Creation

LLGrid account setup:
1. Go to the Account Request web page; type badge number, click Create Account
2. Account is created and mounted on the user's computer
3. Get the User Setup Script
4. Run the User Setup Script
5. User runs a sample job

Account Creation Script (run on LLGrid):
- Creates the account on gridsan
- Creates NFS & SaMBa mount points
- Creates cross-mount communication directories

User Setup Script (run on the user's computer):
- Mounts gridsan
- Creates SSH keys for grid resource access
- Links to the MatlabMPI, pmatlab, & gridmatlab source toolboxes
- Links to the MatlabMPI, pmatlab, & gridmatlab example scripts

Account Setup Steps

Typical supercomputing site setup:
- Account application/renewal [months]
- Resource discovery [hours]
- Resource allocation application/renewal [months]
- Explicit file upload/download, usually ftp [minutes]
- Batch queue configuration [hours]
- Batch queue scripting [hours]
- Differences between control vs. compute nodes [hours]
- Secondary storage configuration [minutes]
- Secondary storage scripting [minutes]
- Interactive requesting mechanism [days]
- Debugging of example programs [days]
- Documentation system [hours]
- Machine node names [hours]
- GUI launch mechanism [minutes]
- Avoiding user contention [years]

LLGrid account setup [minutes]:
- Go to the Account Request web page; type badge number, click Create Account
- Account is created and mounted on the user's computer
- Get the User Setup Script
- Run the User Setup Script
- User runs a sample job

MathWorks Distributed MATLAB (DML)

From the DML beta email: "Dear MATLAB user, This is an invitation to participate in an upcoming Beta Test for the Distributed MATLAB product. This will be available on the following platforms: Win 2000, Win NT, Win XP, and Linux. The goal of this first release of Distributed MATLAB is to address the requirements of coarse-grain applications, in which the same MATLAB algorithm is executed in remote MATLAB sessions on different data sets without communication or data exchange between sessions."

Lincoln has installed DML and is testing it to determine how it integrates with LLGrid technologies.

Performance Results

Performance: Time to Parallelize (Important Considerations)

Description | Serial Code Time | Time to Parallelize | Applications that Parallelization Enables
Missile & Sensor BMD Sim. (BMD) | 2000 hours | 8 hours | Discrimination simulations; higher-fidelity radar simulations
First-principles LADAR Sim. (Ladar) | 1300 hours | 1 hour | Speckle image simulations; aimpoint and discrimination studies
Analytic TOM Leakage Calc. (Leak) | 40 hours | 0.4 hours | More complete parameter-space simulations
Hercules Metric TOM Code (Herc) | 900 hours | 0.75 hours | Monte Carlo simulations
Coherent Laser Propagation Sim. (Laser) | 40 hours | 1 hour | Reduced simulation run time
Polynomial Coefficient Approx. (Coeff) | 700 hours | 8 hours | Reduced run time of algorithm training
Ground Motion Tracker Indicator Computation Simulator (GMTI) | 600 hours | 3 hours | Reduced evaluation time of larger data sets
Automatic Target Recognition (ATR) | 650 hours | 40 hours | Ability to consider more target classes and generate more scenarios
Normal Compositional Model for Hyperspectral Image Analysis (HSI) | 960 hours | 6 hours | Larger datasets of images

pmatlab Application to 3D Spatial Normalization

- A Lincoln group is developing normalization algorithms for 3D matched-field (MFP) beamformers, sponsored by DARPA-ATO under the Robust Passive Sonar program
- The large search space (O(1e7) cells) makes normalizer evaluation on processed data difficult
- pmatlab code enabled rapid algorithm development and parameter selection
- More than 20x speedup by exploiting parallelism across frequency on the nodes of a Linux cluster
- Development time was about one day

[Figures: matched-field output on simulated data at the target depth and bearing (normalized power in dB vs. range in km and time in minutes) for normalization options 1 and 2, showing the target track and ambiguities.]
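The frequency-parallel structure described above maps naturally onto a pmatlab map. The sketch below is hypothetical, not the group's actual code: process_frequency_bin and the array sizes are placeholders, and the pmatlab helper names (local, put_local, global_ind, agg) are assumed from the toolbox interfaces of the time.

% Hypothetical sketch of frequency-parallel normalizer evaluation with pmatlab.
pMATLAB_Init;                              % sets globals Np and Pid

nFreq = 256; nCells = 1e5;                 % example problem dimensions (placeholders)
mapF  = map([1 Np], {}, 0:Np-1);           % split the frequency dimension over Np CPUs
Out   = zeros(nCells, nFreq, mapF);        % distributed output: each CPU owns some bins

myOut  = local(Out);                       % this CPU's slice of the output
myBins = global_ind(Out, 2);               % which frequency bins this CPU owns
for k = 1:length(myBins)
    myOut(:, k) = process_frequency_bin(myBins(k), nCells);   % placeholder kernel
end
Out = put_local(Out, myOut);               % write the local results back

Result = agg(Out);                         % gather the full result on Pid == 0
pMATLAB_Finalize;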

LLGrid Productivity Analysis

LLGrid Productivity Analysis for ROI*

productivity (ROI) = Utility / (Software Cost + Maintenance Cost + System Cost)

*In development in the DARPA HPCS program

LLGrid Productivity Analysis for ROI*

productivity (ROI) = time saved by users on the system / (time to parallelize + time to train + time to launch + time to admin + system cost)

*In development in the DARPA HPCS program

LLGrid Productivity Analysis for ROI

productivity (ROI) = time saved by users on the system / (time to parallelize + time to train + time to launch + time to admin + system cost)

Production LLGrid model assumptions:
- 200 users Lab-wide
- 20 simultaneous jobs, averaging 10 CPUs per job
- 2 SLOCs per hour; 1000 SLOCs per simulation x Lab-wide users
- 1.0% time-to-parallelize overhead
- Training time: 4 hours x Lab-wide users
- 10,000 parallel job launches at 10 seconds per launch
- One sys-admin: 2000 hours
- 200 CPUs @ $5k per node: 5000 hours

ROI_expected = 36,000 / (1000 + 27.8 + 2000 + 5000), so ROI_expected* is approximately 4.5 (steady state with full LLGrid).

*Mileage may vary
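As a check on the arithmetic, the expected ROI can be reproduced directly from the figures above; note that the mapping of individual assumptions onto the cost terms is an inference, not spelled out on the slide.

% Reproduce the expected-ROI arithmetic from the model assumptions (term mapping assumed).
time_saved  = 36000;               % hours saved by users on the system
parallelize = 1000;                % e.g. 200 users * (1000 SLOC / 2 SLOC per hour) * 1% overhead
launch      = 10000 * 10 / 3600;   % 10,000 launches at 10 s each, in hours (about 27.8)
admin       = 2000;                % one sys-admin
system_cost = 5000;                % 200 CPUs at $5k per node, expressed in hours

roi = time_saved / (parallelize + launch + admin + system_cost)   % about 4.5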

LLGrid Productivity Analysis for ROI (measured)

Same model, with measured values substituted for two assumptions:
- 2-4 SLOCs per hour
- Measured time-to-parallelize overhead
All other assumptions are as above (200 users Lab-wide; 20 simultaneous jobs averaging 10 CPUs; 1000 SLOCs per simulation x Lab-wide users; 4 hours of training x Lab-wide users; 10,000 launches at 10 seconds each; one sys-admin at 2000 hours; 200 CPUs @ $5k per node, 5000 hours).

ROI* is approximately 2.6-4.6.

*Mileage varies


Summary

- Easy to set up and easy to use: the user's computer transparently becomes part of LLGrid
- High-throughput computation system: 25 alpha users, expecting 200 users Lab-wide
- Users are computing jobs they could not do before: 3600 CPU days of computer time in 8 months
- LLGrid productivity analysis: ROI of approximately 4.5