The DESY Grid Testbed


Andreas Gellrich, DESY IT Group
IT-Seminar, 27.01.2004
e-mail: Andreas.Gellrich@desy.de

Overview
- The Physics Case: LHC, DESY
- The Grid Idea: Concepts, Implementations
- Grid @ DESY: The Grid Testbed, ILDG
- The Future: EGEE

The Physics Case

The HEP Data Chain
[Diagram: Monte Carlo production (event generation, detector simulation, digitization) and online data taking (trigger & DAQ) both feed data processing (event reconstruction), which feeds data analysis and, ultimately, the Nobel Prize.]

Typical HEP Jobs

Monte Carlo jobs:
- Event generation: no input; small output; little CPU
- Detector simulation: small input; large output; vast CPU
  (digitization is usually part of the simulation)

Event reconstruction:
- First processing: online or semi-online
- Reprocessing: full input; full output; large CPU
- Selections: large input; large output; large CPU
- Analysis: large input; small output; little CPU

In general: performed by many users, many times!

Some Numbers

DESY experiment (e.g. HERA-B):
- Event size: 200 kB
- Event rate: 50 Hz
=> Event number: 500 M/y
=> Data rate: 10 MB/s
=> Data volume: 100 TB/y (1 y = 10^7 s)

Online processing:
- Reconstruction time: 4 s/event
- Event rate: 50 events/s
=> Bandwidth: 10 MB/s
=> Farm nodes: 200 (200 = 50 * 4)
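The back-of-envelope numbers above follow directly from the stated inputs. A minimal shell sketch of the arithmetic, assuming (as the slide does) 1 accelerator year = 10^7 seconds:

```shell
# Recompute the HERA-B estimates from the stated inputs.
# Assumption (from the slide): 1 accelerator year = 10^7 seconds.
EVENT_SIZE_KB=200        # event size in kB
EVENT_RATE_HZ=50         # trigger rate in events/s
RECO_TIME_S=4            # reconstruction time per event in s
SECONDS_PER_YEAR=10000000

DATA_RATE_MB=$(( EVENT_SIZE_KB * EVENT_RATE_HZ / 1000 ))            # kB/s -> MB/s
EVENTS_PER_YEAR_M=$(( EVENT_RATE_HZ * SECONDS_PER_YEAR / 1000000 )) # events/y in millions
DATA_VOLUME_TB=$(( DATA_RATE_MB * SECONDS_PER_YEAR / 1000000 ))     # MB/y -> TB/y
FARM_NODES=$(( EVENT_RATE_HZ * RECO_TIME_S ))   # nodes needed to keep up with the rate

echo "${DATA_RATE_MB} MB/s, ${EVENTS_PER_YEAR_M} M events/y, ${DATA_VOLUME_TB} TB/y, ${FARM_NODES} farm nodes"
```

Running it reproduces the slide's figures: 10 MB/s, 500 M events/y, 100 TB/y, 200 farm nodes.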

Some Numbers cont'd

Offline computing:
- Users: ~300 (power users: ~30)
- Active data volume: ~10 TB
- File servers: ~10
- Offline nodes: ~100

Typical analysis job:
- Event selection: ~1 M events
- Data selection: ~10 kB/event
- Processing time: ~100 ms/event
=> Processing rate: ~10 Hz
=> Bandwidth: ~100 kB/s

The LHC Computing Model
[Diagram: the tiered LHC computing model. The online system delivers ~100 MB/s to the Tier 0 offline farm (~20 TIPS) at the CERN computer centre (>20 TIPS); one bunch crossing per 25 ns yields 100 triggers/s at ~1 MB/event (~PB/s off the detector). Tier 1 regional centres (US, Italian, French, RAL) connect at ~Gbit/s or by air freight; Tier 2 centres (~1 TIPS each, e.g. ScotGRID++) at ~Gbit/s; institutes at 100-1000 Mbit/s; Tier 3 workstations (~0.25 TIPS) with a physics data cache. 1 TIPS = 25,000 SpecInt95; a 1999 PC = ~15 SpecInt95. Each institute has ~10 physicists working on one or more analysis channels; data for these channels should be cached on the institute server.]

The Grid Idea

Introduction

"We will probably see the spread of computer utilities, which, like present electric and telephone utilities, will service individual homes and offices across the country." -- Len Kleinrock (1969)

"A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities." -- I. Foster, C. Kesselman (1998)

"The sharing that we are concerned with is not primarily file exchange but rather direct access to computers, software, data, and other resources, as is required by a range of collaborative problem-solving and resource-brokering strategies emerging in industry, science, and engineering. The sharing is, necessarily, highly controlled, with resource providers and consumers defining clearly and carefully just what is shared, who is allowed to share, and the conditions under which sharing occurs. A set of individuals and/or institutions defined by such sharing rules forms what we call a virtual organization." -- I. Foster, C. Kesselman, S. Tuecke (2000)

The Grid Dream
[Diagram: mobile access, desktop, and visualization clients connect through GRID MIDDLEWARE to supercomputers and PC clusters, data storage, sensors and experiments, via the Internet and networks.]

Grid Types
- Data Grids: provide transparent access to data which can be physically distributed within Virtual Organizations (VOs)
- Computational Grids: allow large-scale compute resource sharing within Virtual Organizations (VOs)
- Information Grids: provide information and data exchange, using well-defined standards and web services

The Grid Idea cont'd

I. Foster: "What is the Grid? A Three Point Checklist" (2002). A Grid is a system that:
- coordinates resources which are not subject to centralized control (integration and coordination of resources and users across different domains, vs. local management systems such as batch systems)
- uses standard, open, general-purpose protocols and interfaces (standard and open multi-purpose protocols, vs. application-specific systems)
- delivers nontrivial qualities of service (coordinated use of resources, vs. the uncoordinated approach of the World Wide Web)

Grid Middleware
- Globus: toolkit; Argonne, U Chicago
- EDG (EU DataGrid): project to develop Grid middleware; uses parts of Globus; funded for 3 years (1.4.2001 - 31.3.2004)
- LCG (LHC Computing Grid): Grid infrastructure for LHC production; based on stable EDG versions plus VDT etc.; LCG-2 for the Data Challenges in 2004 (EDG 2.0)
- EGEE (Enabling Grids for E-Science in Europe): starts 01.04.2004 for 2 + 2 years

Grid Ingredients
- Authentication: provide a method to guarantee authenticated access only => let users log in and create a token-like proxy
- Information Service: provide a system which keeps track of the available resources => monitoring system
- Resource Management: manage and exploit the available computing resources => distribute jobs
- Storage Management: manage and exploit data => replicate data

Certification
- Authorization and authentication are essential parts of Grids
- A certificate is an encrypted electronic document, digitally signed by a Certification Authority (CA)
- A Certificate Revocation List (CRL) is published by the CA
- For Germany, GridKa issues certificates on request (needs a copy of an ID); contacts at DESY: R. Mankel, A. Gellrich
- Users, hosts, and services must be certified
- The Grid Security Infrastructure (GSI) is part of the Globus Toolkit
- GSI is based on the openssl Public Key Infrastructure (PKI); it uses X.509 certificates
- The certificate is used via a token-like proxy
- In my case: /O=GermanGrid/OU=DESY/CN=Andreas Gellrich
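For illustration, a certificate subject like the one above can be inspected with plain openssl. The sketch below creates a throwaway self-signed certificate just to have something to look at; the DN "Jane Doe" is invented, and a real user certificate would live in ~/.globus/usercert.pem and be signed by the CA rather than self-signed:

```shell
# Create a throwaway self-signed certificate to inspect; on a real Grid
# the certificate is issued by the CA (for Germany: GridKa), not by the user.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
    -subj "/O=GermanGrid/OU=DESY/CN=Jane Doe" \
    -keyout /tmp/userkey.pem -out /tmp/usercert.pem 2>/dev/null

# Show the subject DN and validity dates, as one would for usercert.pem.
openssl x509 -in /tmp/usercert.pem -noout -subject -dates
```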

Virtual Organization
- A Virtual Organization (VO) is a dynamic collection of individuals, institutions, and resources which is defined by certain sharing rules
- Technically, a user is represented by his/her certificate
- The collection of authorized users is defined on every machine in /etc/grid-security/grid-mapfile
- This file is regularly updated from a central server
- The server holds a list of all users belonging to a collection; it is this collection we call a VO!
- The VO a user belongs to is not part of the certificate
- A VO is defined in a central list, e.g. an LDAP tree
- In our case there is a VO=Testbed on DESY's central LDAP server
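A sketch of what such a grid-mapfile looks like. The DNs and local account names below are invented for the example; only the location /etc/grid-security/grid-mapfile and the one-DN-per-line mapping are as described above:

```shell
# Write an example grid-mapfile to /tmp; each line maps a certificate
# subject (DN) to a local Unix account. A DN that does not appear here
# is not authorized on this machine.
cat > /tmp/grid-mapfile <<'EOF'
"/O=GermanGrid/OU=DESY/CN=Jane Doe" testbed001
"/O=GermanGrid/OU=DESY/CN=John Smith" testbed002
EOF

# Look up the local account for a given DN, as the gatekeeper effectively does.
grep '^"/O=GermanGrid/OU=DESY/CN=Jane Doe"' /tmp/grid-mapfile
```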

Grid @ DESY

In remembrance of Ralf Gerhards (1959 - 2003)

Grid activities at DESY:
- The DESY Grid Testbed
- International Lattice Data Grid (ILDG)
- dcache
- D-GRID Initiative
- Enabling Grids for E-Science in Europe (EGEE)

The DESY Grid Testbed
- Initiated by IT and H1 (R. Gerhards, A. Gellrich)
- Open initiative to study Grids at DESY (http://www-it.desy.de/physics/grid/testbed, mailto:grid-dev@desy.de)
- Main developers: M. Vorobiev (H1), J. Nowak (H1), D. Kant (QMUL), A. Gellrich (IT)
- Further members: A. Campbell (H1), U. Ensslin (IT), B. Lewendel (HERA-B), C. Wissing (H1 Do), S. Padhi (ZEUS), and others
- Tests basic functionality of the Grid middleware
- Implemented in collaboration with Queen Mary University London, which also takes part in the UK GridPP and LCG projects
- An example HEP application could be Monte Carlo simulation

Grid Testbed Technicalities
- Exploits 10 Linux PCs donated by H1 and HERA-B
- grid001 - grid010 run DESY Linux 4/5 (S.u.S.E.-based)
- Grid middleware of EDG 1.4 (EU DataGrid, 2001-2004; http://www.eu-datagrid.org/)
- EDG 1.4 is based on Globus 2.4 (http://www.globus.org/)
- LCG-2 will use EDG 2 (http://cern.ch/lcg/)
- EDG and Globus are built on/for RedHat Linux 6.2
- Normally, Grid nodes are set up from an installation server (lcfg)
- The EDG distribution allows one to download binaries
- Those binaries were modified and installed on the DESY machines

Grid Testbed Set-up
- Authentication: Grid Security Infrastructure (GSI) based on PKI (openssl); Globus gatekeeper; proxy renewal service
- Grid Information Service (GIS): Grid Resource Information Service (GRIS); Grid Information Index Service (GIIS)
- Resource Management: Resource Broker, Job Manager, Job Submission, batch system (PBS), Logging and Bookkeeping
- Storage Management: Replica Catalogue, GSI-enabled FTP, GDMP, Replica Location Service (RLS)

Grid Testbed Set-up cont'd

Testbed hardware => mapping of services to logical and physical nodes. The basic node types are:
- User Interface (UI)
- Computing Element (CE)
- Worker Node (WN)
- Resource Broker (RB)
- Storage Element (SE)
- Replica Catalogue (RC)
- Information Service (IS)

Grid Testbed Minimal Set-up

Per site:
- User Interface (UI) to submit jobs
- Computing Element (CE) to run jobs
- Worker Node (WN) to do the work
- Storage Element (SE) to provide data files
- [Grid Information Index Service (GIIS) as site MDS]

Per Grid:
- Resource Broker (RB)
- Replica Catalogue (RC)
- Information Service (IS)
- VO server (VO)

Grid Testbed Schema
[Diagram: the user logs in via ssh to the UI (grid003), with certificates in $HOME/.globus/; JDL is submitted to the RB node (grid006, also hosting RLS, BDII, and GIIS) and output is retrieved from it. The CE (grid007) runs PBS and a GRIS and authorizes users via /etc/grid-security/grid-mapfile; the worker nodes (grid008) share an NFS disk; the SE (grid002) runs a GRIS and GridFTP. A Replica Catalogue node (grid009) and the VO server on ldap.desy.de serve the RB, CE, and SE; other sites connect in the same way.]

Ganglia
[Screenshots: Ganglia monitoring pages for the testbed nodes.]

Job Example

Job submission: the job delivers the hostname and date of the worker node. The steps are:
- Set up the Grid environment
- Write the job script/binary which will be executed
- Describe the job in JDL
- Submit the job
- Request the job status
- Retrieve the job output

Job Environment

Start:

grid003> export GLOBUS_LOCATION=/opt/globus
grid003> . $GLOBUS_LOCATION/etc/globus-user-env.sh
grid003> export EDG_LOCATION=/opt/edg
grid003> . $EDG_LOCATION/etc/edg-user-env.sh
grid003> export EDG_WL_LOCATION=$EDG_LOCATION
grid003> export RC_CONFIG_FILE=$EDG_LOCATION/etc/rc.conf
grid003> export GDMP_CONFIG_FILE=$EDG_LOCATION/etc/gdmp.conf

Job Authentication

grid003> grid-proxy-init
Your identity: /O=GermanGrid/OU=DESY/CN=Andreas Gellrich
Enter GRID pass phrase for this identity:
Creating proxy ... Done
Your proxy is valid until Thu Oct 23 01:52:00 2003

grid003> grid-proxy-info -all
subject  : /O=GermanGrid/OU=DESY/CN=Andreas Gellrich/CN=proxy
issuer   : /O=GermanGrid/OU=DESY/CN=Andreas Gellrich
type     : full
strength : 512 bits
timeleft : 11:59:48

Job Description

grid003> less script.sh
#! /usr/bin/zsh
host=`/bin/hostname`
date=`/bin/date`
echo "$host: $date"

grid003> less hostname.jdl
Executable    = "script.sh";
Arguments     = " ";
StdOutput     = "hostname.out";
StdError      = "hostname.err";
InputSandbox  = {"script.sh"};
OutputSandbox = {"hostname.out","hostname.err"};
Rank          = other.MaxCpuTime;

Job Matching

grid003> dg-job-list-match hostname.jdl
Connecting to host grid006.desy.de, port 7771

***************************************************************************
                     COMPUTING ELEMENT IDs LIST
 The following CE(s) matching your job requirements have been found:
 - grid007.desy.de:2119/jobmanager-pbs-short
 - grid007.desy.de:2119/jobmanager-pbs-long
 - grid007.desy.de:2119/jobmanager-pbs-medium
 - grid007.desy.de:2119/jobmanager-pbs-infinite
***************************************************************************

Job Submission

grid003> dg-job-submit hostname.jdl
Connecting to host grid006.desy.de, port 7771
Logging to host grid006.desy.de, port 15830

***************************************************************************
                        JOB SUBMIT OUTCOME
 The job has been successfully submitted to the Resource Broker.
 Use dg-job-status command to check job current status.
 Your job identifier (dg_jobid) is:
 https://grid006.desy.de:7846/131.169.223.35/134721208077529?grid006.desy.de:7771
***************************************************************************

Job Status

grid003> dg-job-status 'https://grid006.desy.de:7846/131.169.223.35/134721208077529?grid006.desy.de:7771'

Retrieving Information from LB server https://grid006.desy.de:7846
Please wait: this operation could take some seconds.

*************************************************************
BOOKKEEPING INFORMATION:
Printing status info for the Job:
https://grid006.desy.de:7846/131.169.223.35/134721208077529?grid006.desy.de:7771

Job Status cont'd

The status evolves through:
Status = Initial
Status = Scheduled
Status = Done
Status = OutputReady

Job History

grid003> dg-job-get-logging-info 'https://grid006.desy.de:7846/131.169.223.35/093011136900851?grid006.desy.de:7771'

Retrieving Information from LB server https://grid006.desy.de:7846
Please wait: this operation could take some seconds.

**********************************************************************
LOGGING INFORMATION:
Printing info for the Job:
https://grid006.desy.de:7846/131.169.223.35/093011136900851?grid006.desy.de:7771

Job History cont'd

The logged events:
Event Type = JobAccept
Event Type = JobTransfer
Event Type = JobMatch
Event Type = JobScheduled
Event Type = JobRun
Event Type = JobDone

Job Output

grid003> dg-job-get-output 'https://grid006.desy.de:7846/131.169.223.35/134721208077529?grid006.desy.de:7771'

***************************************************************************
                      JOB GET OUTPUT OUTCOME
 Output sandbox files for the job:
 https://grid006.desy.de:7846/131.169.223.35/134721208077529?grid006.desy.de:7771
 have been successfully retrieved and stored in the directory:
 /tmp/134721208077529
***************************************************************************

Job Output cont'd

grid003> ls -l /tmp/134721208077529
total 4
-rw-r--r--  1 gellrich  it   0 Oct 23 15:49 hostname.err
-rw-r--r--  1 gellrich  it  39 Oct 23 15:49 hostname.out

grid003> less /tmp/134721208077529/hostname.out
grid008: Thu Oct 23 15:47:41 MEST 2003

Done!

Data Management
- So far we have simply computed something: the RB has picked a CE with computing resources
- A typical job, however, reads and writes data
- Data files shall not be transferred with the job

Possible access scenarios:
- The data file is directly accessible by the WN from the SE
- The data file needs to be replicated to the closest SE
- The WN copies the data file to its local disk from an SE

Data Management cont'd

Replica Catalogue (RC):
- knows all files and their replicas
- maps a logical file name (LFN) to physical locations
- knows the access protocol (file, gridftp)

grid003> edg-replica-manager-listReplicas -l file.txt
configuration file: /opt/edg/etc/rc.conf
logical file name: file.txt
protocol: gsiftp
SASL/GSI-GSSAPI authentication started
SASL SSF: 56
SASL installing layers
For LFN file.txt the following replicas have been found:
    location 1: grid002.desy.de/flatfiles/grid002/desy/file.txt

Data Management cont'd

grid002> less file.jdl
InputData          = {"LF:file.txt"};
DataAccessProtocol = {"file", "gridftp"};
ReplicaCatalog     = "ldap://grid001.desy.de:9011/lc=desycollection,rc=desyrc,dc=grid001,dc=desy,dc=de";

- The actual access to the file must be handled by the application or by a wrapper script
- In the case of an NFS-mounted file system that is trivial
- In the case of a remote SE, GridFTP must be called explicitly
- All infos are provided by the RB in .BrokerInfo

Data Management cont'd

dcache:
- The dcache project is a joint effort between the Fermi National Accelerator Laboratory and DESY, providing a fast and scalable disk cache system with an optional Hierarchical Storage Manager (HSM) connection
- Besides various other access methods, dcache implements a Grid mass storage fabric
- It supports gsiftp and http as transport protocols, and the Storage Resource Manager (SRM) protocol as the replica-management negotiation vehicle
- The system is in production for all DESY experiments, the CDF experiment at FNAL, and the CMS-US division at various locations, including CERN
- dcache is an R&D project for the Grid!
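The copy-to-local-disk access scenario described earlier can be sketched as a small wrapper script. All paths below are invented, and the "SE" is faked with a local directory so the sketch runs anywhere; on the real testbed the copy would be a GridFTP transfer (e.g. with globus-url-copy from a gsiftp:// URL taken from the .BrokerInfo file):

```shell
# Fake an SE with a local directory so the sketch is self-contained;
# on the real testbed the source would be gsiftp://grid002.desy.de/...
SE_DIR=/tmp/fake-se
SCRATCH=/tmp/wn-scratch
mkdir -p "$SE_DIR" "$SCRATCH"
echo "event data" > "$SE_DIR/file.txt"      # the "replica" held on the SE

# Wrapper step: stage the input onto the WN's local disk, then let the job run.
cp "$SE_DIR/file.txt" "$SCRATCH/file.txt"   # real job: globus-url-copy from the SE instead of cp
wc -c < "$SCRATCH/file.txt"                 # the application now reads a local file
```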

Grid Testbed Observations

System aspects:
- DESY Linux 4/5 installation is possible but clumsy
- Grid middleware deeply penetrates the system software

Information Service:
- Weak points in Globus (LDAP does not scale with writing)
- Rebuilt in EDG and LCG (R-GMA: RDBMS rather than LDAP)

Resource Broker:
- Plays the central role
- Needs to store all data coming with the job (resource intensive)

Computing / Storage Element:
- Probably scalable with more worker nodes (farms)
- Only a simple data management scheme implemented

Grid Testbed Plans

Testbed:
- Connect a second CE
- Get more sites involved: DESY Zeuthen, U Dortmund, ...
- The Grid Testbed is not application- or experiment-specific!
- The Grid Testbed is not meant for production!

Prototyping:
- Not studied in detail yet
- Open for various HEP applications
- Monte Carlo simulation is a good candidate

Production:
- Would need new or appropriate hardware
- The goal must be a DESY-wide Grid infrastructure

The Future: EGEE

Enabling Grids for E-Science in Europe (EGEE)
- EU 6th Framework Programme (FP6), IST
- http://www.eu-egee.org/
- 70 partners in 27 countries, organized in 10 federations
- Total volume: 32 M EUR; DESY part: 148 k EUR
- Headquarters: CERN
- Germany: DESY, DKRZ, FhG, FZK, GSI

EGEE cont'd
- Create a Europe-wide production-quality Grid infrastructure on top of the present and future EU research network infrastructure (RN Geant and GN2)
- Provide distributed European research communities with round-the-clock access to major computing resources, independent of their geographical location
- Support many application domains with one large-scale infrastructure that will attract new resources over time
- Provide outreach, training, and support for new user communities
- Leverage national resources in a more effective way for broader European benefit
- 70 leading institutions in 27 countries, organized in 10 federations

EGEE cont'd
[Diagram: timeline EDG/LCG-1 -> LCG-2 -> EGEE-1, with LCG and "DESY?" marked.]

EGEE and LCG middleware R&D:
- LCG-1/2 were based on EDG plus VDT
- EGEE will start with LCG-2
- EGEE will be the basis for LCG-3 and beyond
- Lineage: Globus 2 based (EDG, VDT, ... -> LCG-1, LCG-2) towards OGSI based (EGEE-1, EGEE-2, ...)

EGEE cont'd

Networking (Network Activities, NA):
- NA1: Management of the I3
- NA2: Dissemination and Outreach
- NA3: Training and Induction
- NA4: Application Identification and Support
- NA5: Policy and International Cooperation

Services (Specific Service Activities, SA):
- SA1: Grid Operations, Support and Management
- SA2: Network Resource Provision

Middleware (Joint Research Activities, JRA):
- JRA1: Middleware Engineering and Integration Research
- JRA2: Quality Assurance
- JRA3: Security
- JRA4: Network Service Development

Timeline (FP6-2002-Infrastructure2, 508833-EGEE):
- 6 May 2003: proposal submitted
- 6 Aug 2003: proposal accepted
- 10 Oct 2003: Technical Annex (TA) submitted
- 17 Oct 2003: final negotiation with the EU
- 20 Oct 2003: DESY signs contract (CPF)
- 26 Jan 2004: completion of negotiation with the EU
- Jan 2004: Consortium Agreement (CA) signing
- 1 Apr 2004: project start
- 18-22 Apr 2004: 1st EGEE meeting, Cork, Ireland
- 21-26 Nov 2004: 2nd EGEE meeting, The Netherlands

EGEE cont'd
- DESY is in SA1 (Specific Service Activities) => European Grid Support, Operation and Management
- Partner of the German ROC; Support Centre and Resource Centre
- User and administrator support; CPU and storage provision
- DESY will receive funding of ~148 k EUR for 2 years
- DESY is committed to provide infrastructure
- Volker Gülzow (representative), Andreas Gellrich (deputy)
- This is the entrance into the Grid world!

Grid Impact on the CC

System support:
- Installing, testing, and configuring middleware
- System administration

Grid operations:
- Running the local Grid components

User support:
- Consultancy on making applications Grid-aware
- Help desk and documentation
- Training

Resource planning and scheduling

Web Links

Grid:
- http://www.gridcafe.org/
- http://www.globus.org/
- http://www.eu-datagrid.org/
- http://www.eu-egee.org/
- http://www.gridforum.org/

LCG:
- http://cern.ch/lcg/

DESY:
- http://www-it.desy.de/physics/grid/
- http://www.dcache.org/

Literature

Books:
- I. Foster, C. Kesselman (eds.): The Grid: Blueprint for a New Computing Infrastructure, Morgan Kaufmann Publishers Inc. (1999)
- F. Berman, G. Fox, T. Hey (eds.): Grid Computing: Making the Global Infrastructure a Reality, John Wiley & Sons (2003)

Articles:
- I. Foster, C. Kesselman, S. Tuecke: The Anatomy of the Grid (2000)
- I. Foster, C. Kesselman, J.M. Nick, S. Tuecke: The Physiology of the Grid (2002)
- I. Foster: What is the Grid? A Three Point Checklist (2002)

Summary
- The Grid has developed from a smart idea into reality
- Grid infrastructure will be standard in (e-)science in the future
- LHC cannot live without Grids
- DESY could live without Grids now, but would miss the future
- There are projects under way at DESY; see http://www-it.desy.de/physics/grid/
- The EGEE project opens the Grid world for DESY!