Grid'5000: A nationwide Experimental Grid. Franck Cappello, INRIA. With all participants.

1 Grid'5000: A nationwide Experimental Grid. Franck Cappello, INRIA, fci@lri.fr. With all participants.

2 Agenda: Motivation, the Grid'5000 project, design, developments, Conclusion.

3 Grids raise research issues but also methodological challenges. Grids are complex systems: large scale, with a deep stack of complicated software. Grids raise a lot of research issues: security, performance, fault tolerance, scalability, load balancing, coordination, message passing, data storage, programming, algorithms, communication protocols and architecture, deployment, etc. How to test and compare fault tolerance protocols, security mechanisms, networking protocols, etc.?

4 Tools for Distributed System Studies. To investigate distributed system issues, we need: 1) tools (models, simulators, emulators, experimental platforms), and 2) strong interaction between these research tools. [Figure: a spectrum from abstraction to realism, with validation flowing back. Math: models of systems, applications, platforms and conditions. Simulation: key system mechanisms, algorithm and application kernels, virtual platforms, synthetic conditions. Emulation: real systems and real applications, in-lab platforms, synthetic conditions. Live systems: real systems, real applications, real platforms, real conditions.]

5 Existing Grid Research Tools. SimGrid and SimGrid2 (France/USA): discrete event simulation with trace injection, originally dedicated to scheduling studies. GridSim (Australia): Australian competitor of SimGrid, dedicated to scheduling (with deadlines). Bricks (Titech, Japan): discrete event simulation for scheduling and replication studies. MicroGrid (USA): emulator with MPI communications, not dynamic. No emulator or real-life experimental platform. These tools do not scale (limited to ~100 grid nodes) and they (almost) do not consider network issues.

6 We need Grid experimental tools. In the first half of 2003, the design and development of two Grid experimental platforms was decided: Grid'5000 as a real-life system (this talk) and Data Grid Explorer as a large-scale emulator. [Figure: platforms placed on axes of log(realism) (math, simulation, emulation, live systems) versus log(cost & complexity) (reasonable, challenging, major challenge): models, NS, protocol proofs; SimGrid, MicroGrid, Bricks; Data Grid Explorer, WANinLab, Emulab, AIST SuperCluster; TERAGrid, PlanetLab, NAREGI testbed, Grid'5000 (this talk).]

7 (Ken Miura) NAREGI Middleware Development Infrastructure: installation in Dec. 2003; 3 SMPs (128 procs total); 6 x 128-proc clusters with different interconnects; 1 file server; multi-gigabit networking to simulate a Grid environment; ~5 Teraflops. NOT a production system (cf. TeraGrid): mainly geared towards R&D, but could be used partially for experimental production. To form a Grid with the IMS NAREGI application testbed infrastructure (~10 Teraflops, March 2004) and other national centers (voluntary basis) via SuperSINET.

8 (Henri Bal) Some Grid testbeds in Europe: DAS2 (2002), GridLab, DAS3 (2005).

9 Agenda: Rationale, the Grid'5000 project, design, developments, Conclusion.

10 The Grid'5000 Project. 1) Building a nationwide experimental platform for Grid research (like a particle accelerator for computer scientists): 8 geographically distributed sites; every site hosts a cluster (from 256 CPUs to 1K CPUs); all sites are connected by RENATER (the French research and education network); RENATER hosts probes to trace network load conditions; design and develop a system/middleware environment to safely test and repeat experiments. 2) Use the platform for Grid experiments in real-life conditions: address critical issues of Grid system/middleware (programming, scalability, fault tolerance, scheduling); address critical issues of Grid networking (high-performance transport protocols, QoS); port and test applications; investigate original mechanisms (P2P resource discovery, desktop Grids).

11 Funding & Participants.
Funding: 1) ACI GRID (hardware); 2) INRIA (hardware, engineers); 3) CNRS (AS, engineers, etc.); 4) regional councils (hardware).
Steering Committee (11): Franck Cappello (coordinator), Thierry Priol (director, ACI Grid), Brigitte Plateau (director, ACI Grid CS), Dany Vandrome (Renater), Frédéric Desprez (Lyon), Michel Daydé (Toulouse), Yvon Jégou (Rennes), Stéphane Lantéri (Sophia), Raymond Namyst (Bordeaux), Pascale Primet (Lyon), Olivier Richard (Grenoble).
Technical Committee (28): Jean-Luc Anthoine, Jean-Claude Barbet, Pierrette Barbaresco, Nicolas Capit, Eddy Caron, Christophe Cérin, Olivier Coulaud, Georges Da-Costa, Yves Denneulin, Benjamin Dexheimer, Aurélien Dumez, Gilles Gallot, David Geldreich, Sébastien Georget, Olivier Gluck, Claude Inglebert, Julien Leduc, Cyrille Martin, Jean-Francois Méhaut, Jean-Christophe Mignot, Thierry Monteil, Guillaume Mornet, Alain Naud, Vincent Néri, Gaetan Peaquin, Franck Simon, Sebastien Varrette, Jean-Marc Vincent.

12 Total funding: ~6 M€. Site selection by the ACI Grid SC: several sites submitted research proposals; selection considered scientific aspects AND the capacity to manage a Grid'5000 node. [Map of France showing the Grid'5000 nodes and the Data Grid Explorer site, interconnected by 1 Gb/s links that may evolve to 10 Gb/s.]

13 Schedule. [Timeline from Sept. 03 to Nov. 04, with a "today" marker near the end. Hardware track: call for proposals, selection of 7 sites, ACI GRID funding, call for expressions of interest, vendor selection, installation, first tests, final review, first demo (SC04). System/middleware track: security forum, prototypes, switch from prototypes to control prototypes, RENATER connection, programming forum, demo preparation, experiments. Milestone dates: Sept. 03, Nov. 03, Jan. 04, March 04, June/July 04, Sept. 04, Oct. 04, Nov. 04.]

14 Agenda: Rationale, the Grid'5000 project, design, technical developments, Conclusion.

15 Grid'5000 foundations: collection of experiments to be done. Networking: end-host communication layer (interference with local communications), high-performance long-distance protocols (improved TCP), high-speed network emulation. Middleware/OS: scheduling and data distribution in Grids, fault tolerance in Grids, resource management, Grid SSI OS and Grid I/O, desktop Grid/P2P systems. Programming: component programming for the Grid (Java, CORBA), GRID-RPC, GRID-MPI, code coupling. Applications: multi-parametric applications (climate modeling, functional genomics), large-scale experimentation of distributed applications (electromagnetism, multi-material fluid mechanics, parallel optimization algorithms, CFD, astrophysics, medical images, collaborative tools in virtual 3D environments).

16 Grid'5000 foundations: collection of properties to evaluate. Quantitative metrics: performance (execution time, throughput, overhead); scalability (resource occupation: CPU, memory, disk, network; application algorithms; number of users); fault tolerance (tolerance to very frequent failures, i.e. volatility; tolerance to massive failures where a large fraction of the system disconnects; fault-tolerance consistency across the software stack).
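
To make the quantitative metrics above concrete, here is a minimal sketch, assuming hypothetical timing values, of how raw execution-time measurements could be turned into overhead, speedup and efficiency figures; none of these numbers or function names come from the slides.

```python
# Illustrative sketch only: deriving the metrics named on the slide
# (execution time, overhead, scalability) from measured timings.

def overhead(measured_time: float, baseline_time: float) -> float:
    """Relative overhead of running under the tested mechanism
    (e.g. a fault-tolerance protocol) versus a baseline run."""
    return (measured_time - baseline_time) / baseline_time

def speedup(serial_time: float, parallel_time: float) -> float:
    """Classic scalability metric: T(1) / T(p)."""
    return serial_time / parallel_time

def efficiency(serial_time: float, parallel_time: float, nodes: int) -> float:
    """Speedup normalised by the number of nodes used."""
    return speedup(serial_time, parallel_time) / nodes

if __name__ == "__main__":
    # Hypothetical timings for one application kernel on 64 nodes.
    t_serial, t_parallel, t_with_ft = 3600.0, 75.0, 82.5
    print(f"speedup    : {speedup(t_serial, t_parallel):.1f}")
    print(f"efficiency : {efficiency(t_serial, t_parallel, 64):.2f}")
    print(f"FT overhead: {overhead(t_with_ft, t_parallel):.1%}")
```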

17 Design goal: experimenting with all layers of the Grid software stack: application, programming environments, application runtime, enabling middleware, operating system, networking.

18 Vision: Grid'5000 is NOT a production Grid! Grid'5000 should be an instrument to experiment with and observe phenomena at all levels of the software stack involved in Grids. Grid'5000 will be a low-level testbed harnessing clusters (a nationwide cluster of clusters), allowing users to fully configure the cluster nodes (including the OS) for their experiments (deep control).

19 Grid'5000 as an Instrument. Technical issues: 1) remotely controllable Grid nodes (installed in geographically distributed laboratories); 2) a «controllable» and «monitorable» network between the Grid nodes (may be unrealistic in some cases); 3) a middleware infrastructure allowing users to access, reserve and share the Grid nodes; 4) a user toolkit to deploy, run, monitor and control experiments and collect results. Scientific issues: 1) monitorable experimental conditions; 2) control of a running experiment (suspend/restart).
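
The middleware and user-toolkit items above (access, reserve, deploy, run, monitor, collect) can be pictured as a small workflow. The sketch below uses an entirely hypothetical `ExperimentToolkit` API, not the real Grid'5000 services (which were later provided by tools such as OAR and Kadeploy); it only illustrates the reserve/deploy/run/collect cycle the slide describes.

```python
# Hypothetical user-side workflow: reserve nodes, deploy a system image,
# run the experiment, collect results. Names and behaviour are illustrative.

from dataclasses import dataclass, field

@dataclass
class Reservation:
    site: str
    nodes: int
    walltime_h: int
    node_names: list[str] = field(default_factory=list)

class ExperimentToolkit:
    """Hypothetical front-end to the access/reserve/deploy services."""

    def reserve(self, site: str, nodes: int, walltime_h: int) -> Reservation:
        # In reality this would talk to the site's resource manager.
        names = [f"{site}-node-{i}" for i in range(nodes)]
        return Reservation(site, nodes, walltime_h, names)

    def deploy(self, resa: Reservation, system_image: str) -> None:
        # Reboot every reserved node on the user-supplied OS image.
        for n in resa.node_names:
            print(f"deploying {system_image} on {n}")

    def run(self, resa: Reservation, command: str) -> dict[str, int]:
        # Launch the experiment command everywhere, return exit codes.
        print(f"running '{command}' on {resa.nodes} nodes")
        return {n: 0 for n in resa.node_names}

    def collect(self, resa: Reservation, remote_path: str, local_dir: str) -> None:
        print(f"fetching {remote_path} from {resa.nodes} nodes into {local_dir}")

if __name__ == "__main__":
    tk = ExperimentToolkit()
    resa = tk.reserve(site="rennes", nodes=32, walltime_h=4)
    tk.deploy(resa, system_image="my-kernel-2.6-with-ft-protocol")
    tk.run(resa, command="./run_experiment.sh")
    tk.collect(resa, remote_path="/tmp/results", local_dir="./results")
```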

20 Agenda: Rationale, the Grid'5000 project, design, developments, Conclusion.

21 Security design. Grid'5000 nodes will be rebooted and configured at the kernel level by users (very high privileges for every user); users may configure the cluster nodes incorrectly, opening security holes. How to secure the local site and the Internet? A confined system (no way to get out; access only through strong authentication and via a dedicated gateway). Some sites want private addresses, others want public addresses; some sites want to connect satellite machines. Access is granted only from Grid'5000 sites, and every site is responsible for following the confinement rules.
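
As a rough illustration of the confinement rule just stated (strong authentication plus access only from Grid'5000 sites), the following sketch expresses the policy as a predicate; the site networks and helper names are invented for the example and are not the actual access-control configuration.

```python
# Illustrative confinement check: access is granted only when the request is
# strongly authenticated AND comes from a Grid'5000 site network.

from ipaddress import ip_address, ip_network

# Hypothetical per-site address plans (some sites private, some public).
SITE_NETWORKS = {
    "rennes": ip_network("10.1.0.0/16"),    # private addressing (example)
    "orsay": ip_network("192.54.0.0/16"),   # public addressing (made-up prefix)
}

def access_allowed(source_ip: str, strongly_authenticated: bool) -> bool:
    """Grant access only through strong authentication at the dedicated
    gateway, and only from a Grid'5000 site network."""
    if not strongly_authenticated:
        return False
    addr = ip_address(source_ip)
    return any(addr in net for net in SITE_NETWORKS.values())

if __name__ == "__main__":
    print(access_allowed("10.1.3.7", strongly_authenticated=True))   # inside a site
    print(access_allowed("8.8.8.8", strongly_authenticated=True))    # outside: denied
```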

22 Security architecture: a confined system. [Diagram: each Grid'5000 site connects to RENATER over 2 fibers, one dedicated to Grid'5000; sites are interconnected by MPLS tunnels, with 7 VLANs per site (8 x 7 VLANs in Grid'5000, 1 VLAN per tunnel). At each site, the lab keeps its normal connection to RENATER through its own switch/router and firewall/NAT, while a dedicated switch/router serves the Grid'5000 cluster and its reconfigurable nodes. Users log in by ssh to a local front-end; a controller provides DNS, LDAP, NFS /home, reboot, DHCP and boot-server services. The figure shows the configuration for private addresses.]
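
The "8 x 7 VLANs, one VLAN per tunnel" figure can be made concrete with a small enumeration: each site holds one dedicated VLAN towards every other site, forming a full mesh of tunnels inside the confined network. The site names and VLAN numbering below are assumptions for illustration only.

```python
# Illustrative sketch of the "8 x 7 VLANs, one per tunnel" plan: every site
# gets one dedicated VLAN per remote site, i.e. a full mesh of inter-site
# tunnels. Site names and VLAN ids are assumptions, not the real configuration.

SITES = ["Bordeaux", "Grenoble", "Lille", "Lyon",
         "Orsay", "Rennes", "Sophia", "Toulouse"]  # 8 sites; exact list assumed

def vlan_plan(sites):
    """For every site, list its local VLANs and the remote site each reaches."""
    plan = {}
    for src in sites:
        others = [s for s in sites if s != src]
        # VLAN ids 101..107 at each site, one per tunnel to a remote site.
        plan[src] = [(100 + i, dst) for i, dst in enumerate(others, start=1)]
    return plan

if __name__ == "__main__":
    plan = vlan_plan(SITES)
    total = sum(len(v) for v in plan.values())
    print(f"{len(SITES)} x {len(SITES) - 1} = {total} VLANs in the confined network")
    for vlan_id, dst in plan["Rennes"]:
        print(f"Rennes VLAN {vlan_id} -> tunnel to {dst}")
```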

23 Control design. Users want to be able to install a specific software stack on all nodes, from network protocols to applications (possibly including the kernel). Administrators want to be able to reset/reboot distant nodes in case of trouble. Developers want to build control mechanisms to help debugging, such as step-by-step execution (relying on checkpoint/restart mechanisms). The result is a control architecture that broadcasts orders from one site to the others, with local relays converting the orders into actions.
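
A minimal sketch of that control scheme, assuming invented class names rather than the actual Grid'5000 control software: a master broadcasts an order (deploy, reboot) to one relay per site, and each relay turns it into per-node actions.

```python
# Illustrative broadcast-with-local-relays control scheme.

class SiteRelay:
    """Local relay: turns a site-wide order into per-node actions."""

    def __init__(self, site: str, nodes: list[str]):
        self.site = site
        self.nodes = nodes

    def apply(self, order: str, image: str | None = None) -> None:
        for node in self.nodes:
            if order == "reboot":
                print(f"[{self.site}] power-cycling {node}")
            elif order == "deploy":
                print(f"[{self.site}] writing image '{image}' to a boot partition of {node}")
            else:
                raise ValueError(f"unknown order: {order}")

class ControlMaster:
    """Broadcasts orders from one site to the relays of all the others."""

    def __init__(self, relays: list[SiteRelay]):
        self.relays = relays

    def broadcast(self, order: str, **kwargs) -> None:
        for relay in self.relays:
            relay.apply(order, **kwargs)

if __name__ == "__main__":
    relays = [SiteRelay("rennes", ["r-1", "r-2"]), SiteRelay("orsay", ["o-1"])]
    master = ControlMaster(relays)
    master.broadcast("deploy", image="user-kernel-2.6")
    master.broadcast("reboot")
```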

24 Usage modes. Shared (preparing experiments, size S): no dedicated resources (users log into nodes and use default settings, etc.). Reserved (similar to PlanetLab, size M): reserved but uncoordinated resources (users may change the node's OS on reserved nodes). Batch (automatic, size L or XL): reserved and coordinated resources (experiments run under batch/automatic mode). All these modes come with calendar scheduling, plus compliance with local usages (almost every cluster receives funds from different institutions and several projects).
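
To make the three usage modes and their calendar scheduling concrete, here is a small, hypothetical data model: shared slots need no reservation, while reserved and batch slots pin resources to one owner, so two slots conflict when they overlap in time unless both are shared. Nothing here reflects the real scheduling policy.

```python
# Illustrative calendar model for the shared / reserved / batch usage modes.

from dataclasses import dataclass
from datetime import datetime
from enum import Enum

class Mode(Enum):
    SHARED = "shared"      # size S: default settings, no dedicated resources
    RESERVED = "reserved"  # size M: dedicated but uncoordinated, OS may be changed
    BATCH = "batch"        # size L/XL: dedicated and coordinated, run by the batch system

@dataclass
class Slot:
    mode: Mode
    nodes: int
    start: datetime
    end: datetime
    owner: str | None = None  # None for shared slots

def conflicts(a: Slot, b: Slot) -> bool:
    """Two slots conflict if they overlap in time and at least one is not shared."""
    overlap = a.start < b.end and b.start < a.end
    return overlap and not (a.mode is Mode.SHARED and b.mode is Mode.SHARED)

if __name__ == "__main__":
    s1 = Slot(Mode.RESERVED, 64, datetime(2004, 11, 2, 8), datetime(2004, 11, 2, 18), "alice")
    s2 = Slot(Mode.BATCH, 512, datetime(2004, 11, 2, 12), datetime(2004, 11, 3, 12), "bob")
    print("calendar conflict:", conflicts(s1, s2))
```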

25 Control architecture. In reserved and batch modes, admins and users can control their resources. [Diagram: users and admins log in by ssh (login + password) through the lab's network and firewall/router. A Control Master at one Grid'5000 site sends control commands (rsync of kernels and distributions, boot/reset orders) to Control Slaves at the other sites; at each site a controller (boot server + DHCP) sits behind the firewall/NAT and drives the local cluster. Each node keeps 10 boot partitions as a cache; system kernels and distributions are downloaded from the boot server, and are uploaded by the users as system images.]
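
The "10 boot partitions per node (cache)" detail suggests a simple cache discipline: reuse a partition that already holds the requested image, otherwise fetch the image from the site's boot server into a free or least-recently-used partition. The sketch below is purely illustrative of that idea, not the real deployment tool.

```python
# Illustrative per-node boot-partition cache with LRU eviction.

from collections import OrderedDict

class BootPartitionCache:
    def __init__(self, partitions: int = 10):
        self.partitions = partitions
        # maps image name -> partition index, ordered by last use
        self.cache: OrderedDict[str, int] = OrderedDict()
        self.free = list(range(partitions))

    def deploy(self, image: str) -> int:
        """Return the partition to boot for `image`, fetching it if needed."""
        if image in self.cache:                  # cache hit: just reboot on it
            self.cache.move_to_end(image)
            return self.cache[image]
        if self.free:                            # empty partition available
            part = self.free.pop()
        else:                                    # evict the least recently used image
            _, part = self.cache.popitem(last=False)
        print(f"fetching {image} from the boot server into partition {part}")
        self.cache[image] = part
        return part

if __name__ == "__main__":
    node = BootPartitionCache()
    node.deploy("debian-default")
    node.deploy("user-kernel-with-new-tcp")
    node.deploy("debian-default")   # second deployment hits the cache
```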

26 Grid'5000 prototype.

27 Grid'5000 prototype.

28 Installing the real stuff: Rennes, Orsay.

29 Installing the real stuff: Rennes (Dell), Grenoble, Orsay.

30 Agenda: Rationale, the Grid'5000 project, design, developments, Conclusion.

31 Summary. Grid'5000 will offer in 2005: 8 clusters distributed over 8 sites in France, about 2500 CPUs, about 2.5 TB of memory, about 100 TB of disk, about 8 Gigabit/s (directional) of bandwidth, about 5 to 10 Tera operations/sec, and the capability for all users to reconfigure the platform [protocols/OS/middleware/runtime/application]. Grid'5000 will be opened to Grid researchers in early 2005, and could be opened to other researchers (ACI «Data Masse», CoreGrid European project members, etc.). Beyond an instrument, Grid'5000 has federated a strong community: this is a human adventure!

32 Q&A