Introduction to Grid Computing


Milestone 2
Include the names of the papers
You only have a page, so be selective about what you include
Be specific: summarize the authors' contributions, not just what the paper is about
You might be able to reuse this text in the final paper if you're specific and thorough


Overview
Background
  What is the Grid?
  Related technologies
Grid applications
Communities
Grid tools
Case studies

What is a Grid?
Many definitions exist in the literature
Early definitions:
  Foster and Kesselman, 1998: A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational facilities
  Kleinrock, 1969: We will probably see the spread of computer utilities, which, like present electric and telephone utilities, will service individual homes and offices across the country

3-point checklist (Foster 2002)
A Grid:
1. Coordinates resources that are not subject to centralized control
2. Uses standard, open, general-purpose protocols and interfaces
3. Delivers nontrivial qualities of service, e.g., response time, throughput, availability, security

Grid Architecture
Autonomous, globally distributed computers/clusters

Why do we need Grids?
Many large-scale problems cannot be solved by a single computer
Globally distributed data and resources

Background: Related technologies
Cluster computing
Peer-to-peer computing
Internet computing

Cluster computing
Idea: put some PCs together and get them to communicate
Cheaper to build than a mainframe supercomputer
Different sizes of clusters
Scalable: can grow a cluster by adding more PCs

[Figure: Cluster architecture]

Peer-to-peer computing
Connect to other computers
Can access files from any computer on the network
Allows data sharing without going through a central server
Decentralized approach is also useful for the Grid

[Figure: Peer-to-peer architecture]

Internet computing
Idea: many idle PCs on the Internet
  Can perform other computations while not being used
Cycle scavenging: rely on getting free time on other people's computers
Example: SETI@home
What are the advantages and disadvantages of cycle scavenging?

Some Grid Applications
Distributed supercomputing
High-throughput computing
On-demand computing
Data-intensive computing
Collaborative computing

Distributed supercomputing
Idea: aggregate computational resources to tackle problems that cannot be solved by a single system
Examples: climate modeling, computational chemistry
Challenges include:
  Scheduling scarce and expensive resources
  Scalability of protocols and algorithms
  Maintaining high levels of performance across heterogeneous systems

High-throughput computing
Schedule large numbers of independent tasks
Goal: exploit unused CPU cycles (e.g., from idle workstations)
Unlike distributed supercomputing, tasks are loosely coupled
Examples: parameter studies, cryptographic problems
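
Because the tasks are independent, a parameter study can be farmed out to whatever workers happen to be free. The following is a minimal sketch using a local thread pool as a stand-in for idle workstations; `simulate` and `run_parameter_study` are hypothetical names introduced only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate(param: int) -> int:
    """Hypothetical stand-in for one independent task of a parameter study."""
    return param * param


def run_parameter_study(params, workers=4):
    # The tasks share no state, so they can be executed in any order
    # on any available worker.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, whatever the completion order.
        return list(pool.map(simulate, params))


results = run_parameter_study(range(5))
```

A real high-throughput system would dispatch each task to a remote machine rather than a local thread, but the loose coupling between tasks is the same.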

On-demand computing
Use Grid capabilities to meet short-term requirements for resources that cannot conveniently be located locally
Unlike distributed supercomputing, driven by cost-performance concerns rather than absolute performance
Dispatch expensive or specialized computations to remote servers

Data-intensive computing
Synthesize data in geographically distributed repositories
Synthesis may be computationally and communication intensive
Examples:
  High-energy physics experiments generate terabytes of distributed data and need complex queries to detect interesting events
  Distributed analysis of Sloan Digital Sky Survey data

Collaborative computing
Enable shared use of data archives and simulations
Examples:
  Collaborative exploration of large geophysical data sets
Challenges:
  Real-time demands of interactive applications
  Rich variety of interactions

Grid Communities
Who will use Grids?
Broad view
  Benefits of sharing outweigh costs
  Universal, like a power grid
Narrow view
  Cost of sharing across institutional boundaries is too high
  Resources are shared only when there is an incentive to do so
  Grids will be specialized to support specific communities with specific goals

Government
Small number of users
Couple small numbers of high-end resources
Goals:
  Provide a strategic computing reserve for crisis management
  Support collaborative investigations of scientific and engineering problems
Need to integrate diverse resources and balance the diversity of competing interests

Health Maintenance Organization
Share high-end computers, workstations, administrative databases, medical image archives, instruments, etc. across hospitals in a metropolitan area
Enable new computationally enhanced applications
Private grid
  Small scale, central management, common purpose
  Diversity of applications and complexity of integration

Materials Science Collaboratory
Scientists operating a variety of instruments (electron microscopes, particle accelerators, X-ray sources) for characterization of materials
Highly distributed and fluid community
Sharing of instruments, archives, software, computers
Virtual Grid: strong focus and narrow goals
  Dynamic membership, decentralized, sharing resources

Computational Market Economy
Combine:
  Consumers with diverse needs and interests
  Providers of specialized services
  Providers of compute resources and network providers
Public Grid
  Need applications that can exploit loosely coupled resources
  Need contributors of resources

Grid Users
Many levels of users:
  Grid developers
  Tool developers
  Application developers
  End users
  System administrators

Some Grid challenges
Data movement
Data replication
Resource management
Job submission

Some Grid-Related Projects
Globus
Condor
Nimrod-G

Globus Grid Toolkit
Open source toolkit for building Grid systems and applications
Enabling technology for the Grid
Share computing power, databases, and other tools securely online
Facilities for:
  Resource monitoring
  Resource discovery
  Resource management
  Security
  File management

Data Management in Globus Toolkit
Data movement
  GridFTP
  Reliable File Transfer (RFT)
Data replication
  Replica Location Service (RLS)
  Data Replication Service (DRS)

GridFTP
High-performance, secure, reliable data transfer protocol
Optimized for wide area networks
Superset of the Internet FTP protocol
Features:
  Multiple data channels for parallel transfers
  Partial file transfers
  Third-party transfers
  Reusable data channels
  Command pipelining

More GridFTP features
Auto-tuning of parameters
Striping
  Transfer data in parallel among multiple senders and receivers instead of just one
Extended block mode
  Send data in blocks
  Know block size and offset
  Data can arrive out of order
  Allows multiple streams
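
The idea behind extended block mode can be sketched as follows: because every block carries its own offset, the receiver can place it correctly no matter which stream delivered it or in what order. This is an illustration of the principle only, not the GridFTP wire format; `reassemble` and the 8-byte block size are assumptions made for the example.

```python
def reassemble(blocks, total_size):
    """Rebuild a file from (offset, payload) blocks arriving in any order."""
    buf = bytearray(total_size)
    received = 0
    for offset, payload in blocks:
        # The offset tells us where the block belongs, so arrival order
        # does not matter.
        buf[offset:offset + len(payload)] = payload
        received += len(payload)
    if received != total_size:
        raise ValueError("received byte count does not match the file size")
    return bytes(buf)


data = b"grid data moves in blocks"
# Split into 8-byte blocks and deliver them in reverse order, as if they
# had travelled over several parallel streams.
blocks = [(i, data[i:i + 8]) for i in range(0, len(data), 8)]
restored = reassemble(blocks[::-1], len(data))
```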

Striping Architecture
Use striped servers

Limitations of GridFTP
Not a web service protocol (does not employ SOAP, WSDL, etc.)
Requires the client to maintain an open socket connection throughout the transfer
  Inconvenient for long transfers
Cannot recover from client failures

[Figure: GridFTP]

Reliable File Transfer (RFT)
Web service with job-scheduler functionality for data movement
User provides source and destination URLs
Service writes the job description to a database and moves the files
Service methods for querying transfer status
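
The submit, move, and query pattern described above can be sketched with a small job table. This is a toy model and not the Globus RFT interface: `TransferService`, its methods, and the example.org URLs are hypothetical, and an in-memory SQLite database stands in for the durable database a real service would use.

```python
import sqlite3


class TransferService:
    """Toy stand-in for an RFT-like service (not the Globus API)."""

    def __init__(self, mover):
        # A real service would use a durable database; ":memory:" keeps
        # the sketch self-contained.
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE jobs "
            "(id INTEGER PRIMARY KEY, src TEXT, dst TEXT, status TEXT)"
        )
        self.mover = mover  # callable(src, dst) that performs the copy

    def submit(self, src, dst):
        # Record the job description before any data moves, so the request
        # does not depend on the client staying connected.
        cur = self.db.execute(
            "INSERT INTO jobs (src, dst, status) VALUES (?, ?, 'pending')",
            (src, dst),
        )
        self.db.commit()
        return cur.lastrowid

    def run_pending(self):
        rows = self.db.execute(
            "SELECT id, src, dst FROM jobs WHERE status = 'pending'"
        ).fetchall()
        for job_id, src, dst in rows:
            try:
                self.mover(src, dst)
                status = "done"
            except OSError:
                status = "failed"
            self.db.execute(
                "UPDATE jobs SET status = ? WHERE id = ?", (status, job_id)
            )
        self.db.commit()

    def status(self, job_id):
        row = self.db.execute(
            "SELECT status FROM jobs WHERE id = ?", (job_id,)
        ).fetchone()
        return row[0] if row else None


moved = []
service = TransferService(mover=lambda src, dst: moved.append((src, dst)))
job = service.submit(
    "gsiftp://site-a.example.org/data/f1",
    "gsiftp://site-b.example.org/data/f1",
)
before = service.status(job)
service.run_pending()
after = service.status(job)
```

Keeping the job state on the service side is what lets a client disconnect and ask about the transfer later.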

[Figure: RFT]

Replica Location Service (RLS)
Registry that keeps track of where replicas exist on physical storage systems
Users or services register files in RLS when the files are created
Distributed registry
  May consist of multiple servers at different sites
  Increases scale
  Provides fault tolerance

Replica Location Service (RLS)
Logical file name: unique identifier for the contents of a file
Physical file name: location of a copy of the file on a storage system
User can provide a logical name and ask for replicas
Or query to find the logical name associated with a physical file location
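
The two lookups described above amount to a mapping that can be queried in both directions. A minimal single-process sketch, assuming a hypothetical `ReplicaCatalog` class (the real RLS is a distributed service with its own interface):

```python
from collections import defaultdict


class ReplicaCatalog:
    """Toy logical-to-physical file name registry (not the RLS API)."""

    def __init__(self):
        self._by_logical = defaultdict(set)  # logical name -> physical names
        self._by_physical = {}               # physical name -> logical name

    def register(self, logical_name, physical_name):
        self._by_logical[logical_name].add(physical_name)
        self._by_physical[physical_name] = logical_name

    def replicas(self, logical_name):
        # Sorted only to make the result deterministic.
        return sorted(self._by_logical.get(logical_name, ()))

    def logical_name(self, physical_name):
        return self._by_physical.get(physical_name)


catalog = ReplicaCatalog()
catalog.register("run42.dat", "gsiftp://site-a.example.org/store/run42.dat")
catalog.register("run42.dat", "gsiftp://site-b.example.org/cache/run42.dat")
found = catalog.replicas("run42.dat")
owner = catalog.logical_name("gsiftp://site-b.example.org/cache/run42.dat")
```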

Data Replication Service (DRS)
Pull-based replication capability
Implemented as a web service
Higher-level data management service built on top of RFT and RLS
Goal: ensure that a specified set of files exists on a storage site
  First, query RLS to locate the desired files
  Next, create a transfer request using RFT
  Finally, register the new replicas with RLS
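
The three steps can be sketched as one loop. Here a plain dict stands in for RLS and a callable stands in for an RFT request; `replicate`, the URL layout `site/logical-name`, and the example.org hosts are assumptions made for illustration.

```python
def replicate(wanted, catalog, site, transfer):
    """Ensure every logical file in `wanted` has a replica at `site`.

    catalog:  dict of logical name -> set of physical URLs (stand-in for RLS)
    transfer: callable(src_url, dst_url) (stand-in for an RFT request)
    """
    created = []
    for lfn in wanted:
        sources = catalog.get(lfn, set())
        dst = f"{site}/{lfn}"
        if dst in sources:
            continue  # a replica already exists at the target site
        if not sources:
            raise LookupError(f"no known replica of {lfn}")
        src = sorted(sources)[0]   # step 1: locate a source in the catalog
        transfer(src, dst)         # step 2: request the transfer
        catalog[lfn].add(dst)      # step 3: register the new replica
        created.append(dst)
    return created


catalog = {
    "run7.dat": {"gsiftp://a.example.org/store/run7.dat"},
    "run8.dat": {"gsiftp://b.example.org/store/run8.dat"},
}
log = []
created = replicate(
    ["run7.dat", "run8.dat"],
    catalog,
    "gsiftp://b.example.org/store",
    lambda src, dst: log.append((src, dst)),
)
```

Only `run7.dat` is copied, since `run8.dat` is already registered at the target site.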

Condor
Original goal: high-throughput computing
Harvest wasted CPU power from other machines
Can also be used on a dedicated cluster
Condor-G: Condor interface to Globus resources

Condor
Provides many features of batch systems:
  Job queueing
  Scheduling policy
  Priority scheme
  Resource monitoring
  Resource management
Users submit their serial or parallel jobs
Condor places them into a queue
Scheduling and monitoring
Informs the user upon completion
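
The submit, queue, run, and notify cycle can be illustrated with a small priority queue. This is a sketch of the general batch-queue idea only; it does not model Condor's actual scheduling or matchmaking, and `JobQueue` is a hypothetical name.

```python
import heapq
import itertools


class JobQueue:
    """Toy batch queue: highest priority first, FIFO among equal priorities."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()
        self.completed = []

    def submit(self, name, command, priority=0):
        # heapq is a min-heap, so the priority is negated to pop the highest
        # first; the counter preserves submission order among ties.
        heapq.heappush(self._heap, (-priority, next(self._order), name, command))

    def run_all(self):
        while self._heap:
            _, _, name, command = heapq.heappop(self._heap)
            result = command()
            # Recording the result plays the role of informing the user.
            self.completed.append((name, result))
        return self.completed


queue = JobQueue()
queue.submit("low", lambda: 1, priority=1)
queue.submit("high", lambda: 2, priority=5)
queue.submit("low2", lambda: 3, priority=1)
finished = queue.run_all()
```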

Nimrod-G
Tool to manage execution of parametric studies across distributed computers
Manages the experiment:
  Distributing files to remote systems
  Performing the remote computation
  Gathering results
User submits a declarative plan file
  Parameters, default values, and commands necessary for performing the work
Nimrod-G takes advantage of Globus toolkit features
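
A declarative plan of this kind expands into one job per combination of parameter values. The sketch below uses a Python dict as a stand-in for the plan; it is not Nimrod-G's plan file syntax, and the `simulate` command and parameter names are hypothetical.

```python
import itertools

# Hypothetical plan: parameter values plus a command template.
plan = {
    "parameters": {"temperature": [280, 300], "pressure": [1, 2, 5]},
    "command": "simulate --temp {temperature} --pressure {pressure}",
}


def expand(plan):
    """Return one command line per point in the parameter space."""
    names = list(plan["parameters"])
    value_lists = (plan["parameters"][name] for name in names)
    jobs = []
    for values in itertools.product(*value_lists):
        binding = dict(zip(names, values))
        jobs.append(plan["command"].format(**binding))
    return jobs


jobs = expand(plan)
```

Each resulting command is an independent job that could then be dispatched to a remote system.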

[Figure: Nimrod-G architecture]

Grid Case Studies
Earth System Grid
LIGO
TeraGrid

Earth System Grid
Provide climate scientists with access to large datasets
Data is generated by computational models that require massive computational power
Most scientists work with subsets of the data
Requires access to local copies of the data

ESG Infrastructure
Archival storage systems and disk storage systems at several sites
Storage resource managers and GridFTP servers to provide access to storage systems
Metadata catalog services
Replica location services
Web portal user interface

[Figure: Earth System Grid]

[Figure: Earth System Grid interface]

Laser Interferometer Gravitational-Wave Observatory (LIGO)
Instruments at two sites to detect gravitational waves
Each experiment run produces millions of files
Scientists at other sites want these datasets on local storage
LIGO deploys RLS servers at each site to register local mappings and collect information about mappings at other sites

Large-Scale Data Replication for LIGO
Goal: detection of gravitational waves
Three interferometers at two sites
Generate 1 TB of data daily
Need to replicate this data across 9 sites to make it available to scientists
Scientists need to learn where data items are and how to access them
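
To get a feel for what 1 TB per day means for the network, the figure can be converted into a sustained transfer rate. The calculation below assumes a decimal terabyte (10^12 bytes) and perfectly even traffic over 24 hours; the nine-copy case assumes, as a worst case, that every site receives its own full copy from a single source, which the slides do not state.

```python
BYTES_PER_TB = 10**12            # assumption: decimal terabyte
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_mbit_per_s(tb_per_day, copies=1):
    """Average rate in Mbit/s needed to move `copies` copies of the data."""
    bits = tb_per_day * BYTES_PER_TB * 8 * copies
    return bits / SECONDS_PER_DAY / 1e6


one_copy = sustained_mbit_per_s(1)               # roughly 92.6 Mbit/s
nine_copies = sustained_mbit_per_s(1, copies=9)  # roughly 833 Mbit/s
```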

[Figure: LIGO]

LIGO Solution
Lightweight Data Replicator (LDR)
Uses parallel data streams, tunable TCP windows, and tunable write/read buffers
Tracks where copies of specific files can be found
Stores descriptive information (metadata) in a database
  Can select files based on description rather than filename

TeraGrid
NSF high-performance computing facility
Nine distributed sites, each with different capabilities, e.g., computation power, archiving facilities, visualization software
Applications may require more than one site
Data sizes on the order of gigabytes or terabytes

[Figure: TeraGrid]

TeraGrid
Solution: use GridFTP and RFT with a front-end command-line tool (tgcp)
Benefits of the system:
  Simple user interface
  High-performance data transfer capability
  Ability to recover from both client and server software failures
  Extensible configuration

TGCP Details
Idea: hide low-level GridFTP commands from users
Copy the file smallfile.dat in a working directory to another system:
    tgcp smallfile.dat tg-login.sdsc.teragrid.org:/users/ux454332
GridFTP command:
    globus-url-copy -p 8 -tcp-bs 1198372 \
        gsiftp://tg-gridftprr.uc.teragrid.org:2811/home/navarro/smallfile.dat \
        gsiftp://tg-login.sdsc.teragrid.org:2811/users/ux454332/smallfile.dat
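
The kind of translation a front end like this performs can be sketched as a function that builds the long command line from a few arguments. How tgcp actually chooses hosts and tuning values is not described here, so `build_gridftp_command` is a hypothetical helper whose defaults are simply copied from the example above.

```python
def build_gridftp_command(src_host, src_path, dst_host, dst_path,
                          streams=8, tcp_buffer=1198372, port=2811):
    """Build a globus-url-copy argument list like the one shown above.

    The defaults mirror the example; a real front end would presumably
    pick them per site and per network path.
    """
    def url(host, path):
        return f"gsiftp://{host}:{port}{path}"

    return [
        "globus-url-copy",
        "-p", str(streams),          # number of parallel streams
        "-tcp-bs", str(tcp_buffer),  # TCP buffer size in bytes
        url(src_host, src_path),
        url(dst_host, dst_path),
    ]


cmd = build_gridftp_command(
    "tg-gridftprr.uc.teragrid.org", "/home/navarro/smallfile.dat",
    "tg-login.sdsc.teragrid.org", "/users/ux454332/smallfile.dat",
)
command_line = " ".join(cmd)
```

Returning a list rather than a single string keeps each argument separate, which is the safer form if the command is later passed to a process launcher.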

The reality
We have spent a lot of time talking about The Grid
There is the Web and the Internet
Is there a single Grid?

The reality
Many types of Grids exist
Private vs. public
Regional vs. global
All-purpose vs. particular scientific problem