Grid Interoperation and Regional Collaboration
Eric Yen, ASGC, Academia Sinica, Taiwan
23 Jan. 2006
Dreams of Grid Computing
  Global collaboration across administrative domains by sharing people, resources, data, applications, instruments, etc.
  Production efficiency enables the e-infrastructure -- a global knowledge infrastructure
  Scalability of the number, robustness & performance of services
  On-demand provisioning
  Global science needs a global grid
  Bridge the digital divide
Goals for Grid Interoperation
  All for one and one for all!
  Abstraction of system functions, commands, and user actions to achieve system and installation independence
  Retain full administrative autonomy at all participants
  Natural extension of each grid's efforts to include more resources in its own infrastructure
  Efficiency through collaboration
  Facilitating Grid standardization
  Other functionality concerns
    Discovery - metadata standards for grid resources
    Site verification - ongoing / one time / scheduled
    Job monitoring
  Define policies and rules of engagement
    Security procedures
    Problem resolution
Service-Oriented Architecture I
  A service is the logical manifestation of some physical or logical resources (like DBs, programs, devices, humans, etc.) and/or some application logic that is exposed to the network.
  Service interaction is facilitated by message exchanges.
  From Atkinson, DeRoure, Dunlop, Fox et al., Web Service Grids: An Evolutionary Approach
Service-Oriented Architecture II
  Success of the Grid infrastructure needs to be application-driven, and underpinned by a set of core middleware services.
  "Decompose over the Network" -- Ian Foster
  Clients can integrate dynamically
    select & compose services
    select best-of-breed providers
    publish the result as a new service
  Problems
    Wide diversity of software platforms used by Grid projects/systems
    Re-invention of similar services in multiple projects
    Limited sharing of software prototypes between projects
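The client-side steps on this slide (select and compose services, pick best-of-breed providers, publish the result as a new service) can be illustrated with a minimal sketch. Everything here is hypothetical: the `Provider` and `Registry` types, the `score` field, and the `compose` helper are illustrative names, not part of any Grid middleware named in the talk.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Provider:
    name: str
    kind: str                      # hypothetical service category, e.g. "fetch"
    score: float                   # hypothetical quality metric; higher is better
    call: Callable[[str], str]     # the service operation, modelled as a function


class Registry:
    """Toy in-memory service registry (an assumption, not a real discovery service)."""

    def __init__(self) -> None:
        self._providers: List[Provider] = []

    def publish(self, provider: Provider) -> None:
        self._providers.append(provider)

    def best(self, kind: str) -> Provider:
        # "Best of breed": the highest-scoring provider of the requested kind.
        candidates = [p for p in self._providers if p.kind == kind]
        if not candidates:
            raise LookupError(f"no provider for kind {kind!r}")
        return max(candidates, key=lambda p: p.score)


def compose(registry: Registry, kinds: List[str], name: str, new_kind: str) -> Provider:
    """Chain the best provider of each kind and publish the chain as a new service."""
    if not kinds:
        raise ValueError("at least one service kind is required")
    chosen = [registry.best(kind) for kind in kinds]

    def pipeline(payload: str) -> str:
        for provider in chosen:
            payload = provider.call(payload)
        return payload

    # The composite is rated by its weakest member (a design assumption).
    composite = Provider(name, new_kind, min(p.score for p in chosen), pipeline)
    registry.publish(composite)
    return composite
```

A client would publish a few providers, call `compose(registry, ["fetch", "analyse"], "fetch+analyse", "workflow")`, and then look the composite up like any other service with `registry.best("workflow")`.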
Web Services-based Grid Infrastructure
  Model the world as a collection of services
  Resource descriptions and aggregation
  Discovery
  Composition
  Adaptation & evolution
  Quality of service: security, performance, reliability, ...
  Workflow (lifecycle management)
  Open source implementation
Enabling Grids for E-sciencE
Components
  Monitoring/IS
    LCG/EGEE: MDS, LDAP, GLUE, BDII, R-GMA
    OSG: MDS, GLUE, GridCat, MonALISA, LDAP
    ARC: LDAP, ARC schema, MDS
  Security
    LCG/EGEE: GSI
    OSG: GSI
    ARC: GSI
  Software Installation: Privileged users, All VO members, Publish utility Flatfile
  Job submission & description
    LCG/EGEE: GRAM / JDL
    OSG: GRAM / RSL, Condor-G
    ARC: GridFTP / RSL
  VO support and management
    LCG/EGEE: VOMS, LDAP
    OSG: VOMS, GUMS
    ARC: VOMS
  Data Management
    LCG/EGEE: GridFTP, LFC, RLS
    OSG: GridFTP, SRM v1.1
    ARC: GridFTP, RC, RLS
  SRM SRM v1.1 client
  Avoid Divergence!
  Source: Oliver Keeble, LCG Interoperation, INFSO-RI-508833
LCG
OSG in the GOCdb
  Source: Oliver Keeble, LCG Interoperation
LCG/OSG Timeline
  Week 2
    LCG client tools installed as experiment software
    Site passes GStat grid status tests
    LCG to OSG job submissions working via RB
  Week 3
    SFTs running at the OSG site. All green except for accounting and CAs
    Needed modification to find middleware MON box at CERN
    OSG to LCG job submissions working
  Source: Oliver Keeble, LCG Interoperation
LCG/OSG: Status
  Interoperation is a mutual process
  LCG to OSG
    OSG appears as a single site
    3 OSG sites moving to full interoperability
      IS, monitoring, job matches, data transfer
    Modified SFTs pass with the exception of accounting and CAs
    Generic Info Provider installed
    Dteam supported
  OSG to LCG
    Job submission works
    Further work required on OSG monitoring of LCG sites
    No standard monitoring interface
  Source: Oliver Keeble, LCG Interoperation
LCG/OSG: Outstanding issues
  LCG jobs need to source the environment
    Requires RB fix
  Investigate the experiment software installation and advertisement
    Harmonisation?
  VOs and their management
    Common monitoring VO?
    GUMS / VOMS versioning
  CAs
  Accounting
    Adequate logging for audit
  Operations
    What happens when sites have problems?
    EGEE has a very proactive operations policy
  Monitoring
    Suffers from lack of common interfaces, N² problem
    MonALISA/R-GMA interoperation
    MIS-CI
  Data Management
    FTS
  Add more OSG sites!
  Source: Oliver Keeble, LCG Interoperation
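The "N² problem" under Monitoring can be made concrete with a little arithmetic: without a common interface, each monitoring system needs a translator for every other system, whereas a shared interface needs only one adapter in each direction per system. The sketch below assumes directed, one-way translators; the function names are illustrative, not taken from the talk.

```python
def pairwise_translators(n: int) -> int:
    """Directed translators needed if each of n systems reads every other system's format."""
    return n * (n - 1)


def common_interface_adapters(n: int) -> int:
    """Adapters needed if every system converts to and from one shared format."""
    return 2 * n


if __name__ == "__main__":
    # The system counts are illustrative, not a statement about any deployment.
    for n in (2, 3, 5, 10):
        print(f"{n:>2} systems: {pairwise_translators(n):>3} pairwise, "
              f"{common_interface_adapters(n):>3} via a common interface")
```

Under these assumptions the two approaches cost the same at three systems, and the pairwise approach grows quadratically from there.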
ARC
  From the meeting between LCG and NorduGrid at CERN on 31st August 2005, three options were presented.
  LONG TERM - Agree on the interfaces at the site level and work towards producing code that works with these interfaces.
    Deploy the LCG CE and SE information provider in parallel with the ARC
  MEDIUM TERM - Present these interfaces at the Grid boundary and create a portal that does forwarding and translation.
  SHORT TERM - Deploy the LCG and ARC CE in parallel at large sites.
  Outstanding tasks
    Document the LCG CE
    LCG to ARC job submission
    ARC to LCG job submission
    Service discovery / GLUE 2
  Source: Oliver Keeble, LCG Interoperation
ARC/NorduGrid
Interoperability background
  Proposed work plan from the CERN meeting:
    Short term: multiple middlewares at large sites
    Medium term: gateways between grids
    Long term: common interfaces
  Short term is being addressed already
  Work plan made for medium-term tasks
  Long term: CRM Initiative, GLUE2, GGF, GT4
  Source: Michael Gronager, EGEE/ARC Interoperability Status, Joint OSG and EGEE Operations Workshop, Culham, September 2005
Common Interfaces
  Diverges mainly on:
    Job submission and description
      However, work on JSDL is in progress
    Information system
  Service/component
    Basis
      LCG-2, gLite: GT2 from VDT
      ARC: GT2 own patch, GT3 pre-ws
    Data transfer
      LCG-2, gLite: GridFTP, SRM (DPM)
      ARC: GridFTP, SRM v1.1 client
    Data management
      LCG-2, gLite: EDG RLS, Fireman & Co, LFC
      ARC: RC, RLS, Fireman
    Information
      LCG-2, gLite: LDAP, GLUE1.1, MDS+BDII, R-GMA
      ARC: LDAP, ARC schema, MDS-GIIS
    Job description
      LCG-2, gLite: JDL (based on ClassAds)
      ARC: RSL
    Job submission
      LCG-2, gLite: Condor-G to GRAM
      ARC: GridFTP
    VO management
      LCG-2, gLite: VOMS, gLite VOMS, CAS (?)
      ARC: VOMS
  Source: Michael Gronager, EGEE/ARC Interoperability Status
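To illustrate the divergence in job description (JDL on the LCG-2/gLite side, RSL on the ARC side), here is a minimal sketch that renders one abstract job in both styles. It covers only three attributes, does no quoting or escaping, and is a simplified subset of each syntax rather than a full implementation; the `Job` type and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Job:
    """Hypothetical middleware-neutral job description (illustration only)."""
    executable: str
    arguments: List[str] = field(default_factory=list)
    stdout: str = ""


def to_jdl(job: Job) -> str:
    """Render a simplified JDL-style description: 'Attribute = "value";' lines."""
    lines = [f'Executable = "{job.executable}";']
    if job.arguments:
        joined = " ".join(job.arguments)
        lines.append(f'Arguments = "{joined}";')
    if job.stdout:
        lines.append(f'StdOutput = "{job.stdout}";')
    return "\n".join(lines)


def to_rsl(job: Job) -> str:
    """Render a simplified RSL-style description: '&(attribute="value")...'."""
    parts = [f'(executable="{job.executable}")']
    if job.arguments:
        quoted = " ".join(f'"{arg}"' for arg in job.arguments)
        parts.append(f"(arguments={quoted})")
    if job.stdout:
        parts.append(f'(stdout="{job.stdout}")')
    return "&" + "".join(parts)


if __name__ == "__main__":
    job = Job("/bin/hostname", ["-f"], "out.txt")
    print(to_jdl(job))
    print(to_rsl(job))
```

Keeping one neutral description and generating each dialect from it is the same idea a common job description language such as JSDL aims at; a real gateway would also have to map attributes that exist in only one of the two languages.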
Common Interfaces: higher level
  Condor / Condor-G
    LCG supports submission via Condor-G natively
    LCG supports Condor as a queuing system
    ARC supports Condor as a queuing system
    Cooperation between ARC and Condor led in October 2004 to a Condor-G version that can submit jobs to ARC GridFTP (translation from the ARC infosystem schema to GLUE was developed by Rod Walker).
      It was meant to be used by LCG, but nobody has configured an RB this way yet
  Perhaps the most important common interface?
  Source: Michael Gronager, EGEE/ARC Interoperability Status
Gateways
  Possible submission scheme from LCG to ARC
    Set up an LCG CE using Condor as the LRMS
    Set up a Condor-G queue to submit to ARC
  Possible submission scheme from ARC to LCG
    Set up an ARC CE using Condor as the LRMS
    Set up a Condor-G queue to submit to LCG
  Source: Michael Gronager, EGEE/ARC Interoperability Status
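The two submission schemes are symmetric: a CE on one grid accepts jobs into what looks like a local queue, and that queue forwards them to the other grid. The sketch below models only this forwarding shape. The `Gateway` class and the `submit_to_arc` / `submit_to_lcg` functions are hypothetical stand-ins; they do not call Condor-G or any real middleware.

```python
from collections import deque
from typing import Callable, Deque, List


class Gateway:
    """Accepts jobs like a local batch queue, then forwards them to another grid."""

    def __init__(self, name: str, forward: Callable[[str], str]) -> None:
        self.name = name
        self._forward = forward            # stand-in for the remote submission step
        self._queue: Deque[str] = deque()

    def submit(self, job: str) -> None:
        # From the submitting grid's point of view this is an ordinary local queue.
        self._queue.append(job)

    def drain(self) -> List[str]:
        """Forward all queued jobs in arrival order and return the remote receipts."""
        receipts: List[str] = []
        while self._queue:
            receipts.append(self._forward(self._queue.popleft()))
        return receipts


def submit_to_arc(job: str) -> str:
    return f"arc:{job}"     # placeholder receipt, not a real ARC job ID


def submit_to_lcg(job: str) -> str:
    return f"lcg:{job}"     # placeholder receipt, not a real LCG job ID


# One gateway per direction, mirroring the two schemes on the slide.
lcg_to_arc = Gateway("LCG CE with Condor LRMS, forwarding to ARC", submit_to_arc)
arc_to_lcg = Gateway("ARC CE with Condor LRMS, forwarding to LCG", submit_to_lcg)
```

The point of the design is that neither grid's users change how they submit; only the gateway site knows that the queue is backed by a different middleware.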
Major Architectural Differences
  Globus GRAM submission does not work for many jobs:
    Each new queued job spawns a process on the gatekeeper, which regularly executes a Perl script
    Does not perform for more than 400 jobs
  Globus MDS badly implemented:
    Problems with caching
    Schema not complete
  Globus has no broker
  Source: Michael Gronager, EGEE/ARC Interoperability Status
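The GRAM scaling concern comes down to simple arithmetic: one gatekeeper process per queued job, each periodically running a script, means load grows linearly with the queue. The sketch below assumes a polling interval, which the slide does not give; the 30-second figure in the usage note is purely illustrative.

```python
from typing import Tuple


def gatekeeper_load(queued_jobs: int, poll_interval_s: float) -> Tuple[int, float]:
    """Return (resident processes, script runs per minute) for a per-job polling model.

    Assumes one process per queued job and one script run per process per interval.
    """
    if poll_interval_s <= 0:
        raise ValueError("poll_interval_s must be positive")
    script_runs_per_minute = queued_jobs * 60.0 / poll_interval_s
    return queued_jobs, script_runs_per_minute


if __name__ == "__main__":
    # 30 s is an assumed interval, not a measured GRAM setting.
    for jobs in (50, 400, 1000):
        processes, runs = gatekeeper_load(jobs, 30.0)
        print(f"{jobs} jobs: {processes} processes, {runs:.0f} script runs/minute")
```

With the assumed 30-second interval, 400 queued jobs already imply 400 resident processes and 800 script executions per minute on a single gatekeeper host.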
LCG and EGEE Grid Sites in the Asia-Pacific Region
  Sites shown on the map
    PAEC, NCP (Islamabad)
    Tata Inst. (Mumbai)
    VECC (Kolkata)
    GOG (Singapore)
    IHEP (Beijing)
    KNU (Daegu)
    KEK (Tsukuba)
    ICEPP (Tokyo)
    ASGC, IPAS, NTU, NCU (Taipei)
    Univ. Melbourne
  4 LCG sites in Taiwan
  12 LCG sites in Asia/Pacific
  Academia Sinica Grid Centre
    Tier-1 Centre for the LHC Computing Grid (LCG)
    Asian Operations Centre for LCG and EGEE
    Coordinator of the Asia/Pacific Federation
  Map labels: LCG site, other site, in EGEE
  Last update: 01/11/06 04:29 AM
  AP Federation now shares the e-infrastructure with WLCG
Asia Pacific Federation
  Aim of EGEE: to establish a seamless European Grid infrastructure for the support of the European Research Area (ERA)
  Achievements of EGEE:
    Exceeding almost all final goals
    Scope expanded beyond Europe
  Originally, 4 Federations of CERN, Italy, UK, France
    Russia joined 25 March 2005
    Taiwan joined 17 October 2005
  AP Federation was late in terms of EGEE operations
    However, LCG has been operational simultaneously with the rest of the world since 19 March 2003 (LCG-0)
  2005/12/16 Simon C. Lin / ASGC
Plan of AP Federation
  VO Services: deployed from April 2005 in Taiwan (APROC)
    LCG: ATLAS, CMS
    BioInformatics, BioMed
    Geant4
    APeSci: for collaboration on general e-science services in the Asia-Pacific area
    APDG: for testing and testbed only
    TWGRID: established for local services in Taiwan
  Potential Applications
    LCG, Belle, nano, biomed, digital archive, earthquake, GeoGrid, astronomy, atmospheric science
  2005/12/16 Simon C. Lin / ASGC
EGEE Asia Pacific Services by Taiwan
  Production CA Services
  AP CIC/ROC
  VO Support
  Pre-production site
  User Support
  MW and technology development
  Application Development
  Education and Training
  Promotion and Outreach
  Scientific Linux Mirroring and Services
  2005/12/16 Simon C. Lin / ASGC
Interoperation and Collaboration
  Objectives
    Build Grid infrastructure through application-led projects
    Ensure scalability in the number, robustness & performance of services and sites
    Protect the regional investments in Grid MW components and evolve continuously
  Approaches
    Application-driven
    Converge to a simple, well-defined service-oriented architecture
    Capture generic middleware infrastructure components
    Construct a middleware repository for re-engineering, integration testing and interoperability assurance
    Adaptation to Web Services architecture
    Close collaboration with other major Grid projects and standardization organizations in the world
Summary
  Capturing generic middleware services from application requirements --> close interaction with application communities to construct effective science services
  Simplify construction by decomposing content, function and resource
  Construct a middleware repository for re-engineering, integration testing and interoperability assurance
  The open source model provides a new kind of knowledge- and community-building infrastructure
  Diversity is the norm and healthy, but collaboration is essential on a worldwide scale