Advanced School in High Performance and GRID Computing November Introduction to Grid computing.

1967-14 Advanced School in High Performance and GRID Computing 3-14 November 2008 Introduction to Grid computing. TAFFONI Giuliano Osservatorio Astronomico di Trieste/INAF Via G.B. Tiepolo 11 34131 Trieste ITALY

Our heads in the Grid A brief introduction to Grid Computing Dr. Giuliano Taffoni INAF - Information Technology Division

New challenges in science: going further in scientific knowledge; new high-sensitivity sensors and instruments; globally distributed collaborations. Delocalized knowledge: scientific and technical knowledge is distributed; laboratories are distributed; scientific data are distributed.

Data, resources and services avalanche

e-science

"e-Science is about global collaboration in key areas of science and the next generation of infrastructure that will enable it." Dr. John Taylor, Director General of the Research Councils, 1998-2003

"The large scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet." From: http://www.nesc.ac.uk/nesc/define.html
e-science is a new way of using the Internet and its services to do science (my definition).

Simulations Experimental data Distributed collaboration e-science Digital science Geographically distributed resources

Using the Internet to do science
Online publication of papers and pre-prints (e.g. babbage.sissa.it)
CPU cycle scavenging (e.g. SETI@home, Condor)
Sloan Digital Sky Survey: online database of astronomical data, http://www.sdss.org/
Google Sky

Science and Web 2.0: collaboration tools; social networking (Second Life, Facebook, Ning, etc.).

A new paradigm
WWW: share documents in a transparent way, accessible through a browser.
Grid: share resources in a transparent way, accessible through middleware.

Resource sharing
Applications: web services technology.
CPU and storage: Grid computing, Cloud computing, etc.
Data: data Grid, Virtual Observatory, Google File System, etc.
Instruments: e-labs, collaboration tools, etc.

What is your paradigm?
Parallel computing: single systems with many processors working on the same problem.
Distributed computing: many systems loosely coupled by a scheduler to work on related problems.
Grid computing: many systems tightly coupled by software, perhaps geographically distributed, to work together on single problems or on related problems.

What is Grid Computing?

Some definitions
"a single seamless computational environment in which cycles, communication, and data are shared, and in which the workstation across the continent is no less than one down the hall"
"wide-area environment that transparently consists of workstations, personal computers, graphic rendering engines, supercomputers and non-traditional devices: e.g., TVs, toasters, etc."
"[framework for] flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources"
"collection of geographically separated resources (people, computers, instruments, databases) connected by a high speed network [...distinguished by...] a software layer, often called middleware, which transforms a collection of independent resources into a single, coherent, virtual machine"
CPU and data sharing; Virtual Organization; Virtual Machine (SSI)

CPU vs collaboration: the VO concept
The size and/or complexity of the problem requires that people in several organizations collaborate and share computing resources, data and instruments: VIRTUAL ORGANIZATIONS.

VO: resource sharing and common goals; fault tolerant; linked by networks across administrative domains; dynamic; geographically distributed resources and people.

Grid concepts: secure, scalable, on-demand, efficient, fault-tolerant, reliable, dynamic.

The Grid middleware
It is the software layer that glues all the resources together: everything that lies between the OS and the application (from Ian Foster's talk).

Grid resources: storage systems, computer clusters, HPC clusters, supercomputers (IBM SP, Blue Gene, etc.), databases. Keyword: heterogeneous as regards hardware and software.

Local vs remote
Local: resources are locally managed (policies, accountability, OS, storage systems, batch systems).
Remote: global policies, global accessibility, dynamic resource identification, remote resource utilization.

Generic middleware (MW) services: security, job management, data management, monitoring and discovery.

Grid middleware
The Grid is like an operating system: different middleware = different Grid.
Globus Alliance (Globus Toolkit); gLite (EGEE middleware); UNICORE (DE); Gridbus; GRIA.

Explore the middleware bottom-up: from low-level services to global services, from fabric to GRID, from Unix user to GRID user.

The resources
A group of sites glued together by the middleware.
Sites are homogeneous as regards OS and software: Scientific Linux CERN 4.
Sites are heterogeneous as regards hardware: x86/x86_64 architectures.
Some collective services: WMS, DMS, etc.

A Grid site
Computing Element: master node.
Storage Element: storage system.
Worker nodes: computing nodes.
Scheduler + queue system (Torque+Maui, LSF, etc.).

The Low level services

Security
The Grid is a highly complex system.
Authentication: establishing identity.
Authorization: establishing rights.
Message protection.
Passwords are neither scalable nor secure!

What do we require of security?
User's point of view: easy to use, transparent, single sign-on, no password sharing.
Administrator's point of view: define local access control, define local policies.
The Grid Security Infrastructure: X.509 digital certificates.
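The split between authentication and authorization can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it presumes that authentication has already happened and produced a certificate subject string, and it then applies a site-local access list written in the style of a Globus grid-mapfile (a quoted subject followed by a local account). The subjects and account names are hypothetical.

```python
import shlex

# Hypothetical site-local policy in grid-mapfile style:
# a quoted certificate subject followed by the local account it maps to.
GRID_MAPFILE = '''
# local policy for this site
"/C=IT/O=ExampleOrg/CN=Alice Rossi" astro001
"/C=IT/O=ExampleOrg/CN=Bob Bianchi" astro002
'''

def parse_mapfile(text):
    """Return {certificate subject: local account} from mapfile-style text."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        subject, account = shlex.split(line)  # honours the quoted subject
        mapping[subject] = account
    return mapping

def authorize(subject, mapping):
    """Authorization step: map an already authenticated subject to a local
    account, or return None when local policy grants no access."""
    return mapping.get(subject)

mapping = parse_mapfile(GRID_MAPFILE)
```

The user presents one identity everywhere (single sign-on), while each site keeps its own mapping, so access control stays a local decision.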

Job management
The challenge: enabling access to heterogeneous resources and managing remote computation.
Create the job environment.
Stage files in and out of the environment.
Submit a job to the local scheduler.
Monitor the job state.
Job description language.
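The job steps listed above can be sketched on a single machine with the Python standard library. This is a toy under stated assumptions, not Grid middleware: a scratch directory stands in for the job environment and a local subprocess stands in for the batch scheduler. The function names `run_job` and `demo` are hypothetical.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path

def run_job(input_files, command, output_names, dest_dir):
    """Create the environment, stage in, run and monitor, stage out.
    Returns the exit code of the job."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Create an isolated job environment (a scratch directory).
    with tempfile.TemporaryDirectory() as workdir:
        work = Path(workdir)
        # Stage input files into the environment.
        for src in input_files:
            shutil.copy(src, work / Path(src).name)
        # "Submit" the job; a real site would hand it to a batch system.
        proc = subprocess.Popen(command, cwd=work)
        # Monitor the job state until it finishes.
        exit_code = proc.wait()
        # Stage the requested output files out of the environment.
        for name in output_names:
            if (work / name).exists():
                shutil.copy(work / name, dest_dir / name)
    return exit_code

def demo():
    """Stage in a text file, upper-case it in the job, stage the result out."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "in.txt"
        src.write_text("hello")
        script = "open('out.txt', 'w').write(open('in.txt').read().upper())"
        results = Path(tmp) / "results"
        code = run_job([src], [sys.executable, "-c", script], ["out.txt"], results)
        return code, (results / "out.txt").read_text()
```

Calling `demo()` runs one complete cycle and returns the exit code together with the staged-out content.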

Monitoring and discovery service
What is the status of a resource?
What are the available resources?

Data management requirements
Fast: as fast as networks and protocols allow.
Secure: the server must only share files with strongly authenticated clients, with no passwords sent in the clear or similar.
Robust: a fault-tolerant, time-tested protocol.
And the winner is GridFTP.
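The "robust" requirement can be illustrated with a generic retry-and-verify loop. The sketch below is not GridFTP and uses none of its API; it assumes a plain local file copy as the transfer and shows only the idea of checking integrity with a checksum and retrying after a transient failure. The names `robust_copy` and `demo` are hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path

def sha256sum(path, chunk_size=1 << 20):
    """Checksum a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def robust_copy(src, dst, max_attempts=3, copy=shutil.copyfile):
    """Copy src to dst, verify the checksum, and retry on failure.
    Returns the number of attempts used; raises IOError if all fail."""
    expected = sha256sum(src)
    for attempt in range(1, max_attempts + 1):
        try:
            copy(src, dst)
            if sha256sum(dst) == expected:
                return attempt
        except OSError:
            pass  # transient failure: fall through and retry
    raise IOError(f"transfer of {src} failed after {max_attempts} attempts")

def demo():
    """Simulate one dropped transfer followed by a successful one."""
    calls = []
    def flaky_copy(src, dst):
        calls.append(src)
        if len(calls) == 1:
            raise OSError("simulated network drop")
        shutil.copyfile(src, dst)
    with tempfile.TemporaryDirectory() as tmp:
        src, dst = Path(tmp) / "a.bin", Path(tmp) / "b.bin"
        src.write_bytes(b"grid" * 1000)
        attempts = robust_copy(src, dst, copy=flaky_copy)
        return attempts, sha256sum(src) == sha256sum(dst)
```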

High Level Services

Information system
Which resources are available? Where are they? What is their status? How can I optimize their use?
We need a general information infrastructure: the Information System.
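The questions above amount to queries over a catalogue of resource descriptions. A minimal sketch, assuming a hypothetical in-memory snapshot with made-up resource names, sites and attributes (a real information system publishes far richer data):

```python
from dataclasses import dataclass

@dataclass
class Resource:
    name: str
    site: str        # where the resource is
    status: str      # "up" or "down" in this toy model
    free_cpus: int

# Hypothetical snapshot of what an information system might publish.
RESOURCES = [
    Resource("ce01", "site-a", "up", 12),
    Resource("ce02", "site-b", "down", 64),
    Resource("ce03", "site-c", "up", 40),
]

def available(resources, min_cpus=1):
    """Return usable resources with enough free CPUs, best candidates first."""
    usable = [r for r in resources if r.status == "up" and r.free_cpus >= min_cpus]
    return sorted(usable, key=lambda r: r.free_cpus, reverse=True)
```

Sorting by free CPUs is one simple way to answer "how can I optimize their use?"; other ranking criteria are equally possible.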

Data management
Where are the data/files? Which data/files exist? How can I reach them? Are they accessible by others?
E.g. the LFC file catalogue; a distributed storage space filesystem.
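A file catalogue answers these questions by mapping a logical file name to the physical copies of the file. The class below is a toy sketch of that idea only; it does not reproduce the LFC interface, and the logical names and storage URLs are hypothetical.

```python
class ReplicaCatalogue:
    """Toy catalogue mapping a logical file name to its physical replicas."""

    def __init__(self):
        self._replicas = {}

    def register(self, lfn, url):
        """Record that a copy of the logical file lives at the given URL."""
        self._replicas.setdefault(lfn, []).append(url)

    def exists(self, lfn):
        """Which data exist? True if the logical name is known."""
        return lfn in self._replicas

    def locate(self, lfn):
        """Where are the data? Return a copy of the replica list."""
        return list(self._replicas.get(lfn, []))

# Hypothetical logical name and storage locations.
catalogue = ReplicaCatalogue()
catalogue.register("lfn:/grid/example/run42.fits", "srm://se.site-a.example/data/run42.fits")
catalogue.register("lfn:/grid/example/run42.fits", "srm://se.site-b.example/data/run42.fits")
```

Users work with the logical name only; the catalogue decides which physical replica to hand back.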

Job management
Cooperation infrastructure for WAN-distributed resources: a chaotic system to direct.
Locate, book and use the right resource.
Scheduling service.
Job description language.
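A job description language states what to run and which files travel with the job, leaving the choice of resource to the scheduling service. As a hedged illustration, the helper below renders a Python dictionary in a bracketed `Name = value;` layout modelled on the JDL used by gLite. The attribute names follow common JDL usage, but the helper `to_jdl` is hypothetical, does no escaping or validation, and is not part of any middleware.

```python
def to_jdl(attributes):
    """Render a dict as a bracketed list of `Name = value;` statements."""
    def render(value):
        if isinstance(value, (list, tuple)):
            return "{" + ", ".join(render(v) for v in value) + "}"
        if isinstance(value, (int, float)):
            return str(value)
        return '"' + str(value) + '"'  # no escaping: a sketch only
    lines = [f"  {name} = {render(value)};" for name, value in attributes.items()]
    return "[\n" + "\n".join(lines) + "\n]"

# A minimal job: run a command and bring back its output streams.
job = {
    "Executable": "/bin/hostname",
    "StdOutput": "std.out",
    "StdError": "std.err",
    "OutputSandbox": ["std.out", "std.err"],
}

description = to_jdl(job)
```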

Taxonomy of schedulers: centralized systems, distributed systems, hierarchical (hybrid) systems.

Centralized: single point of knowledge; optimum scheduling; single point of failure. Example: Condor-G.

Distributed: application delegation method; optimum scaling and fault tolerance; sub-optimal resource allocation; each application has to develop its own scheduler. Example: NetSolve.

Hybrid: distributed systems are scheduled by a centralized one. Examples: Darwin, Nimrod-G, Gridbus.
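The trade-off between a single point of knowledge and per-application scheduling can be seen in a toy model. The sketch below assumes identical resources and jobs described only by their length, and it uses a greedy longest-job-first heuristic rather than an optimal solver. In the distributed case each application balances its own jobs without seeing the work placed by the others.

```python
import heapq

def centralized_schedule(job_lengths, n_resources):
    """Greedy scheduler with a global view: each job goes to the resource
    that currently has the least work queued. Returns per-resource load."""
    loads = [(0, index) for index in range(n_resources)]
    heapq.heapify(loads)
    for length in sorted(job_lengths, reverse=True):
        load, index = heapq.heappop(loads)
        heapq.heappush(loads, (load + length, index))
    return [load for load, _ in sorted(loads, key=lambda item: item[1])]

def distributed_schedule(applications, n_resources):
    """Each application schedules only its own jobs, unaware of the others;
    the actual load on a resource is the sum of all their decisions."""
    total = [0] * n_resources
    for job_lengths in applications:
        own = centralized_schedule(job_lengths, n_resources)
        total = [t + o for t, o in zip(total, own)]
    return total

# Two applications, each with one long and one short job, on two resources.
apps = [[4, 1], [4, 1]]
central = centralized_schedule([job for app in apps for job in app], 2)
distributed = distributed_schedule(apps, 2)
```

With a global view the two resources end up evenly loaded, while the independent schedulers both send their long job to the same resource, which is the sub-optimal allocation named above.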

Applications for Grid computing
Computation intensive: interactive simulation (climate modeling); large-scale simulation and analysis (atomistic simulations); engineering (parameter studies, optimization models).
Data intensive: experimental data analysis (e.g., H.E.P.); image and sensor analysis (climate).
Distributed collaboration: online instrumentation (microscopes, x-ray); remote visualization (climate studies, biology).

Grid Projects

Summing up
Modern science requires a large amount of computing resources.
GRID computing and HPC are now fundamental tools for scientific research.
The challenge is now to build and use the infrastructure that best fits your computational requirements.
HPC and GRID computing are not mutually exclusive: both can be used to access computational resources in a transparent way.