Grid Computing in Numerical Relativity and Astrophysics

Grid Computing in Numerical Relativity and Astrophysics
Gabrielle Allen: gallen@cct.lsu.edu
Departments of Computer Science & Physics, Center for Computation & Technology (CCT), Louisiana State University

Challenge Problems
- Cosmology
- Black hole and neutron star models
- Supernovae
- Astronomical databases
- Gravitational wave data analysis
These problems drive high-end computing (HEC) and Grids.

Gravitational Wave Physics
Observations, models, and complex simulations combine to give analysis & insight.

Computational Science Needs
Requires an incredible mix of technologies and expertise!
- Many scientific/engineering components: physics, astrophysics, CFD, engineering, ...
- Many numerical algorithm components: finite difference? finite volume? finite elements? Elliptic equations: multigrid, Krylov subspace, ... Mesh refinement.
- Many different computational components: parallelism (HPF, MPI, PVM, ...?), multipatch, architecture (MPP, DSM, vector, PC clusters, FPGA, ...?), I/O (TBs generated per simulation, checkpointing), and visualization of all that comes out!
- New technologies: Grid computing, steering, data archives
Such work cuts across many disciplines and many areas of CS.

Cactus Code
- Freely available, modular, portable and manageable environment for collaboratively developing parallel, high-performance, multidimensional simulations
- Developed for numerical relativity, but now a general framework for parallel computing (CFD, astrophysics, climate modeling, chemical engineering, quantum gravity, ...)
- Finite difference, adaptive mesh refinement (Carpet, SAMRAI, GrACE); adding FE/FV and multipatch
- Active user and developer communities; main development now at LSU and the AEI
- Open source, with documentation, etc.
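To give a feel for the modular structure described above, here is a minimal sketch of a Cactus thorn routine in C. It assumes the grid function phi and the thorn/routine names are declared in the thorn's interface.ccl and schedule.ccl; those names are illustrative, not from the talk.

```c
/* Minimal sketch of a Cactus thorn routine.  The thorn name, routine
 * name and grid function "phi" are assumptions for illustration; a
 * real thorn declares them in interface.ccl and schedule.ccl. */
#include "cctk.h"
#include "cctk_Arguments.h"
#include "cctk_Parameters.h"

void WaveDemo_InitialData(CCTK_ARGUMENTS)
{
  DECLARE_CCTK_ARGUMENTS;   /* grid functions, cctk_lsh, cctkGH, ... */
  DECLARE_CCTK_PARAMETERS;  /* parameters declared in param.ccl */

  /* Loop over the local (per-process) part of the grid; the driver
   * thorn handles the parallel decomposition and ghost-zone exchange. */
  for (int k = 0; k < cctk_lsh[2]; k++)
    for (int j = 0; j < cctk_lsh[1]; j++)
      for (int i = 0; i < cctk_lsh[0]; i++)
      {
        const int idx = CCTK_GFINDEX3D(cctkGH, i, j, k);
        phi[idx] = 0.0;  /* fill in analytic initial data here */
      }
}
```

Because the flesh schedules such routines and the driver handles parallelism and I/O, the same thorn runs unchanged on a laptop, a cluster, or a Grid resource.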

Cactus Einstein
- Cactus modules (thorns) for numerical relativity; many additional thorns are available from other groups (AEI, CCT, ...)
- Thorns agree on some basic principles (e.g. the names of variables) and can then share evolution, analysis, etc. (a small illustration follows after this slide)
- Each can choose whether or not to use e.g. a gauge choice, macros, masks, matter coupling, or a conformal factor
- Over 100 relativity papers and 30 student theses: a production research code
- Example thorns: Evolve (ADM, EvolSimple); InitialData (IDAnalyticBH, IDAxiBrillBH, IDBrillData, IDLinearWaves, IDSimple); Analysis (ADMAnalysis, ADMConstraints, AHFinder, Extract, PsiKadelia, TimeGeodesic); Gauge Conditions (CoordGauge, Maximal); infrastructure (SpaceMask, ADMMacros, ADMBase, ADMCoupling, StaticConformal)

Grand Challenge Collaborations
- NASA Neutron Star Grand Challenge: 5 US sites, 3 years, colliding neutron star problem
- EU Astrophysics Network: 10 EU sites, 3 years, continuing these problems
- NSF Black Hole Grand Challenge: 8 US institutions, 5 years, attacking the colliding black hole problem
These are examples of the future of science & engineering:
- They require large-scale simulations, beyond the reach of any single machine
- They require large, geo-distributed, cross-disciplinary collaborations
- They require Grid technologies, but are not yet using them!
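The shared-variable convention mentioned under Cactus Einstein can be made concrete with a small sketch: an analysis thorn that only relies on the metric components ADMBase provides (gxx, gyy, gzz, ...) works with any evolution thorn. The routine name and the output grid function trace_g are assumptions, taken to be declared in the analysis thorn's own interface.ccl.

```c
/* Sketch of an analysis thorn that couples to any evolution thorn,
 * because it only uses the agreed ADMBase variable names.  The routine
 * name and the output grid function "trace_g" are illustrative
 * assumptions. */
#include "cctk.h"
#include "cctk_Arguments.h"

void DemoAnalysis_MetricTrace(CCTK_ARGUMENTS)
{
  DECLARE_CCTK_ARGUMENTS;

  for (int k = 0; k < cctk_lsh[2]; k++)
    for (int j = 0; j < cctk_lsh[1]; j++)
      for (int i = 0; i < cctk_lsh[0]; i++)
      {
        const int idx = CCTK_GFINDEX3D(cctkGH, i, j, k);
        /* Trivial diagnostic (sum of the diagonal metric components);
         * real analysis thorns compute constraints, horizons,
         * waveforms, etc. in the same pattern. */
        trace_g[idx] = gxx[idx] + gyy[idx] + gzz[idx];
      }
}
```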

New Paradigm: Grid Computing
- Computational resources across the world: compute servers (capability doubles roughly every 18 months), file servers, networks (capacity doubles roughly every 9 months), PlayStations, cell phones, etc.
- Grid computing integrates communities and resources
- How to take advantage of this for scientific simulations? Harness multiple sites and devices; run models with a new level of complexity and scale, interacting with data; open new possibilities for collaboration and advanced scenarios

NLR (National LambdaRail) and the Louisiana Optical Network Initiative (LONI)
- State initiative ($40M) to support research: a 40 Gbps optical network connecting 7 sites
- Grid resources (IBM Power5) at the sites, plus LIGO/CAMD
- New possibilities: dynamic provisioning and scheduling of network bandwidth, network-dependent scenarios
- EnLIGHTened Computing (NSF)

Current Grid Application Types
- Community driven: distributed communities share resources; video conferencing, virtual collaborative environments
- Data driven: remote access to huge data sets, data mining; e.g. gravitational wave analysis, particle physics, astronomy
- Process/simulation driven: demanding science and engineering simulations; task farming, resource brokering, distributed computations, workflow; remote visualization, steering and interaction, etc.
- Typical scenario: find remote resources (task farm, distribute), launch jobs (static), visualize and collect the results
- Prototypes and demos need to move to fault tolerance, robustness, scaling, ease of use, and complete solutions

New Paradigms for Dynamic Grids
Addressing large, complex, multidisciplinary problems with collaborative teams of varied researchers...
- Code, user, and infrastructure should be aware of the environment: discover and monitor the resources available now; what is my allocation on these resources? what are the bandwidth and latency?
- Code, user, and infrastructure should make decisions: a slow part of the simulation can run independently, so spawn it off! New powerful resources just became available, so migrate there! A machine went down, so reconfigure and recover! Need more (or less!) memory, so add (or drop) machines!
- Dynamically provision and use new high-end resources and networks

Future Dynamic Grid Computing
"We see something, but it is too weak. Please simulate to enhance the signal!"
[Scenario diagram: a simulation spreads across sites as the run unfolds -- free CPUs are found at RZG; queue time is over, so a new machine is found and more resources are added at SDSC and LRZ; data are archived; the job is cloned with a steered parameter; invariants are calculated and output at SDSC for further calculations; a black hole is found, so a new component is loaded, the best resources are found, and the horizon is searched for; gravitational waves are calculated and output at the AEI; results are archived to the LIGO experiment at NCSA.]

New Grid Scenarios
- Intelligent parameter surveys, speculative computing, Monte Carlo
- Dynamic staging: move to a faster/cheaper/bigger machine
- Multiple universe: create a clone to investigate a steered parameter
- Automatic component loading: as the needs of the process change, discover, load and execute a new calculation component on an appropriate machine
- Automatic convergence testing
- Look ahead: spawn off a coarser-resolution run to predict the likely future
- Spawn independent/asynchronous tasks: send them to a cheaper machine while the main simulation carries on
- Routine profiling: pick the best machine/queue; choose resolution parameters based on the queue
- Dynamic load balancing: inhomogeneous loads, multiple grids
- Inject dynamically acquired data

But We Need Grid Applications and Programming Tools
- Application programming tools for Grid environments: frameworks for developing Grid applications, toolkits providing Grid functionality, Grid debuggers and profilers; robust, dependable, flexible Grid tools
- Challenging CS problems: missing or immature Grid services, a changing environment, different and evolving interfaces to the Grid
- Interfaces are not simple for scientific application developers; application developers need easy, robust and dependable tools

GridLab Project
- EU 5th Framework project ($7M) with partners in Europe and the US: PSNC (Poland), AEI & ZIB (Germany), VU (Netherlands), Masaryk (Czech Republic), SZTAKI (Hungary), ISUFI (Italy), Cardiff (UK), NTUA (Greece), Chicago, ISI & Wisconsin (US), Sun, Compaq/HP, LSU
- Application and testbed oriented (Cactus + Triana): numerical relativity, dynamic use of Grids
- Main goal: develop an application programming environment for the Grid (www.gridlab.org)

Grid Application Toolkit (GAT)
- Main result from the GridLab project (www.gridlab.org/gat)
- Abstract programming interface between applications and Grid services
- Designed around application needs: move a file, run a remote task, migrate, write to a remote file
- Led to the GGF Simple API for Grid Applications (SAGA)
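The flavor of such an abstraction layer can be sketched as follows. The function names and URLs below are placeholders invented for illustration, not the real GAT or SAGA signatures; the point is only that the application asks for abstract operations and the toolkit maps them onto whatever middleware is actually deployed.

```c
/* Flavor-only sketch of a GAT-like abstraction layer.  The types and
 * function names are invented placeholders, not real GAT/SAGA calls. */
#include <stdio.h>

/* Stand-in for "move a file": a real toolkit would pick gsiftp, http,
 * or a local copy depending on the URLs and the services available. */
static int grid_file_copy(const char *src, const char *dst)
{
  printf("copy %s -> %s (transport chosen by the toolkit)\n", src, dst);
  return 0;
}

/* Stand-in for "run a remote task": resource brokering and job
 * submission happen behind this call in the GAT/SAGA model. */
static int grid_job_submit(const char *exe, const char *where)
{
  printf("submit %s on %s\n", exe, where);
  return 0;
}

int main(void)
{
  grid_file_copy("file:///home/user/bh.par",
                 "gsiftp://remote.example.org/scratch/bh.par");
  grid_job_submit("cactus_bh", "any suitable resource");
  return 0;
}
```

The application never names a particular middleware, which is what lets the same code run on a changing Grid environment.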

Distributed Computation: Harnessing Multiple Computers
- Why do this? Capacity: single computers can't keep up with needs; throughput: combine resources
- Issues: bandwidth (increasing faster than CPU), latency, communication needs and topology, the communication/computation balance
- Techniques to be developed: overlapping communication with computation, extra ghost zones to reduce latency, compression, and algorithms that do this for the scientist (a sketch of the overlap technique follows below)

Dynamic Adaptive Distributed Computation
- SDSC IBM SP (1024 processors; 5x12x17 = 1020 used) coupled to the NCSA Origin array (256+128+128 processors; 5x12x(4+2+2) = 480 used) over an OC-12 line (but only 2.5 MB/sec achieved); GigE within a site: 100 MB/sec
- Cactus + MPICH-G2: communications dynamically adapt to the application and environment; works with any Cactus application
- Scaling improved from 15% to 85%
- Gordon Bell Prize (with U. Chicago / Northern Illinois, Supercomputing 2001, Denver)
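The overlap technique referenced above can be sketched with plain MPI: post the ghost-zone exchange, update the points that do not depend on the ghosts while the messages are in flight, and finish the remaining points after the communication completes. The array size, ghost width and stencil below are illustrative, not the values used in the distributed runs.

```c
/* Minimal sketch of overlapping communication with computation in a
 * 1-D domain decomposition.  Sizes and the stencil are illustrative. */
#include <mpi.h>
#include <stdlib.h>

#define NLOCAL 1024   /* interior points per process (assumed)        */
#define NG     2      /* ghost width; wider ghosts mean less frequent
                         (wide-area) communication                    */

int main(int argc, char **argv)
{
  int rank, size;
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  double *u = calloc(NLOCAL + 2 * NG, sizeof *u);
  int left  = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
  int right = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;

  MPI_Request req[4];
  /* Post the ghost-zone exchange with both neighbours ...            */
  MPI_Irecv(u,               NG, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &req[0]);
  MPI_Irecv(u + NG + NLOCAL, NG, MPI_DOUBLE, right, 1, MPI_COMM_WORLD, &req[1]);
  MPI_Isend(u + NG,          NG, MPI_DOUBLE, left,  1, MPI_COMM_WORLD, &req[2]);
  MPI_Isend(u + NLOCAL,      NG, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &req[3]);

  /* ... and update the deep interior while the messages are in flight. */
  for (int i = 2 * NG; i < NLOCAL; i++)
    u[i] = 0.5 * (u[i - 1] + u[i + 1]);   /* placeholder stencil */

  /* Only the points next to the ghost zones have to wait for the data. */
  MPI_Waitall(4, req, MPI_STATUSES_IGNORE);
  for (int i = NG; i < 2 * NG; i++)
    u[i] = 0.5 * (u[i - 1] + u[i + 1]);
  for (int i = NLOCAL; i < NG + NLOCAL; i++)
    u[i] = 0.5 * (u[i - 1] + u[i + 1]);

  free(u);
  MPI_Finalize();
  return 0;
}
```

With a wide-area link, the ghost width and the amount of interior work per exchange are exactly the knobs that MPICH-G2-style adaptation tunes.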

Remote Viz & Steering HTTP Any Viz Client: LCA Vision, OpenDX Streaming HDF5 Autodownsample Changing steerable parameters Parameters Physics, algorithms Performance Cactus Worm (SC2000) Cactus simulation starts, launched from portal Migrates itself to another site Grid technologies Registers new location User tracks/steers, using HTTP, streaming data, etc Continues around Europe 11

Task Spawning (SC2001)
- The Cactus Spawner thorn automatically prepares analysis tasks for spawning; Grid technologies find resources, manage the tasks, and collect the data; intelligence decides when to spawn (a sketch of this decision logic follows after this slide)
- SC2001: using the resources of the GGTC testbed, the main Cactus black hole simulation starts at one site and appropriate analysis tasks are spawned automatically to free resources worldwide; the user only has to invoke the Cactus Spawner thorn

Global Grid Testbed Collaboration (Supercomputing 2001)
- Cactus black hole simulations spawned apparent-horizon-finding tasks across the Grid
- Prizes for the most heterogeneous and the most distributed testbed
- 5 continents and over 14 countries; around 70 machines, 7500+ processors
- Many hardware types, including PS2, IA32, IA64, MIPS; many OSs, including Linux, Irix, AIX, OSF, Tru64, Solaris, Hitachi
- Many organizations: DOE, NSF, MPG, universities, vendors
- All ran the same Grid infrastructure, used for different applications
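The spawning decision logic itself is small: at chosen intervals the main loop packages the data an analysis task needs and hands it to the Grid layer, then carries on without waiting. The checkpoint file name, interval, and the grid_submit command below are invented for illustration; in the demo this role was played by the Cactus Spawner thorn plus Grid services.

```c
/* Sketch of a spawner's decision loop.  File names, the interval and
 * the "grid_submit" command are placeholders, not real tools. */
#include <stdio.h>
#include <stdlib.h>

/* Stand-in: package the data the analysis task needs. */
static void write_analysis_checkpoint(int iteration, const char *file)
{
  printf("iteration %d: wrote %s for horizon finding\n", iteration, file);
}

int main(void)
{
  const int spawn_every = 128;   /* assumed spawning interval */

  for (int it = 0; it < 1024; it++)
  {
    /* ... main evolution step runs here ... */

    if (it > 0 && it % spawn_every == 0)
    {
      char file[64], cmd[128];
      snprintf(file, sizeof file, "horizon_%06d.h5", it);
      write_analysis_checkpoint(it, file);

      /* Hand the expensive, independent analysis to whatever free
       * resource the Grid layer finds; "grid_submit" is a placeholder
       * for that service, not a real command. */
      snprintf(cmd, sizeof cmd, "echo grid_submit find_horizon %s", file);
      (void)system(cmd);
      /* The main simulation carries on without waiting for the result. */
    }
  }
  return 0;
}
```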

Black Hole Task Farming (SC2002)
- A black hole server controls the tasks and steers the main job
- The main Cactus black hole simulation is started in California; dozens of low-resolution jobs test the corotation parameter and return an error measure used to steer it (a sketch of this task-farming loop follows after this slide)
- The huge job generates remote data, visualized in Baltimore

Job Migration
- GridLab demonstration, SC2003
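The task-farming part of that demo amounts to scanning a parameter with cheap, independent trials and feeding the best value back to the expensive run. In the sketch below the error measure is a dummy function and the parameter range is made up; in the real setup each trial was a separate low-resolution Cactus run whose error measure came back over the Grid.

```c
/* Sketch of the task-farming idea: scan a corotation parameter with
 * cheap trials and steer the main run with the best value.  The error
 * measure and parameter range are illustrative dummies. */
#include <math.h>
#include <stdio.h>

/* Stand-in for "run a low-resolution job and return its error measure". */
static double trial_error(double omega)
{
  return fabs(omega - 0.27);   /* dummy: pretend 0.27 is the sweet spot */
}

int main(void)
{
  double best_omega = 0.0, best_err = 1e30;

  /* Dozens of independent trials: ideal for farming out to free
   * resources, since they never talk to each other. */
  for (double omega = 0.0; omega <= 0.5 + 1e-9; omega += 0.02)
  {
    double err = trial_error(omega);
    printf("omega = %.2f  error = %.3f\n", omega, err);
    if (err < best_err) { best_err = err; best_omega = omega; }
  }

  /* Steer the main (high-resolution) job with the winner. */
  printf("steering main run with corotation parameter %.2f\n", best_omega);
  return 0;
}
```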

Notification and Information
[Architecture: a GridSphere portal, holding user details, notification preferences and simulation information, connects the Grid with a replica catalog and with SMS, instant messaging and mail servers.]

Grid-enabled Gravitational Physics
- Adaptive, intelligent simulation codes able to adapt to their environment
- Simulation data stored across geographically distributed spaces: organization, access, and mining issues
- Analysis of federated data sets by virtual organizations
- Data analysis of LIGO, GEO, LISA signals: interacting with simulation data, managing parameter-space/signal analysis
- Now working on domain-specific information and knowledge-based services:
  - a gravitational physics description language: a schema for describing, searching, and encoding simulation results
  - automated logging of simulations for reproducibility
  - notification and data-sharing services to enable collaboration
  - relativity services: remote servers running e.g. waveform extraction, horizon finding, etc.
  - connection to publications and information; automated analysis

Credits
This talk describes work carried out over a number of years by physicists, computer scientists, mathematicians, and others in the joint AEI-LSU numerical relativity groups and by their colleagues.