RUSSIAN DATA INTENSIVE GRID (RDIG): CURRENT STATUS AND PERSPECTIVES TOWARD NATIONAL GRID INITIATIVE
Viacheslav Ilyin (SINP, MSU), Alexander Kryukov (SINP, MSU), Vladimir Korenkov (JINR, Dubna), Yuri Ryabov (PNPI, Gatchina), Aleksey Soldatov (NRC Kurchatov Institute)
GRID 2008, Dubna, 4 July 2008
EGEE (and thus RDIG) was born as a grid driven by High-Energy Physics (namely by the LHC, the Large Hadron Collider at CERN)
EGEE (gLite MW, production infrastructure) was developed under strong requirements from LHC applications
EGEE development is now moving toward a multi-discipline, sustainable e-infrastructure, e-servicing
Major impact of the EGEE/LCG production grid experience on: EGI (European Grid Initiative), NGIs (National Grid Initiatives)
[Figure: gLite services outline (EGEE, Enabling Grids for E-sciencE)]
Layers: Applications; Higher-Level Grid Services (Workload Management, Replica Management, Visualization, Workflows, Grid economies, etc.); Foundation Grid Middleware (Security infrastructure, Computing & Storage Elements, Accounting, Information providers & monitoring)
gLite service groups:
- Access: CLI, API
- Security: Authorization, Authentication, Auditing
- Information & Monitoring: Information & Monitoring, Application Monitoring
- Data Management: Metadata Catalog, File & Replica Catalog, Storage Element, Data Movement
- Job Management: Accounting, Job Provenance, Package Manager, Computing Element, Workload Management
LHC: Large Hadron Collider at CERN
LHC detectors, where 10s of Petabytes of data will be produced each year: ATLAS (photo), ALICE, CMS, LHCb
Large Hadron Collider at CERN: to start in August 2008!
Data coming to physicists: 6000++ physicists, 250++ institutes, 60++ countries
Challenge: analyze Petabytes of complex data cooperatively
- preparing data for analysis
- providing physicists with analysis object data and facilities to make the analysis
- result: new basic knowledge
GRID: T0 (CERN) + T1s (it, fr, de, uk, nd, sp, us, jp, tw, ca, ...) + T2s: 100++ (..., RuTier2, ...)
Analysis is impossible without (Monte-Carlo) simulation of real data
LHC start in 2008: a change of epoch, from construction of the grid computing infrastructure to real data analysis
Challenge (brought by the arrival of real data):
- 1000s (!?) of real users will appear in the grid
- 10s of Petabytes of new real data will be added each year
- each year physicists will analyse 10s PB × N years of data
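The "10s PB × N years" estimate can be made concrete with a small calculation. This is only a sketch under stated assumptions: the yearly volume is taken as constant, the value of 10 PB per year is an illustrative choice at the low end of "10s of Petabytes" (the slide gives no exact figure), and the function name is hypothetical.

```python
def accumulated_petabytes(pb_per_year: float, years: int) -> float:
    """Total data volume to analyse after `years` of data taking,
    assuming (hypothetically) a constant yearly volume."""
    if years < 0:
        raise ValueError("years must be non-negative")
    return pb_per_year * years


# Assumed rate, not a figure from the slides: 10 PB of new real data per year.
ASSUMED_PB_PER_YEAR = 10.0

for n in range(1, 6):
    total = accumulated_petabytes(ASSUMED_PB_PER_YEAR, n)
    print(f"after {n} year(s): {total:.0f} PB to analyse")
```

The point of the estimate is that the volume to be analysed grows with every year of running, because earlier data stays in use.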
2000-2008: deployment from testbeds to the production grid, with LHC applications from the beginning
2008-2009-...: physics analysis of the real data (plus MC production) using grid technology; stabilization of the LHC computing grid infrastructure
240 sites, 45 countries, 41,000 CPUs, 5 PetaBytes, >5000 users, >100 VOs, >100,000 jobs/day
Application domains: Archeology, Astronomy, Astrophysics, Civil Protection, Comp. Chemistry, Earth Sciences, Finance, Fusion, Geophysics, High Energy Physics, Life Sciences, Multimedia, Material Sciences
32 %
The RDIG infrastructure: regional EGEE segment
RDIG (Russian Data Intensive Grid) now comprises 15 Resource Centers with more than 1500 CPUs and more than 650 TB of disk storage.
RDIG Resource Centres: ITEP, JINR-LCG2, Kharkov-KIPT, RRC-KI, RU-Moscow-KIAM, RU-Phys-SPbSU, RU-Protvino-IHEP, RU-SPbSU, Ru-Troitsk-INR, ru-impb-lcg2, ru-moscow-fian, ru-moscow-gcras, ru-moscow-mephi, ru-pnpi-lcg2, ru-moscow-sinp
RuTier2 in the World-Wide Grid
Russian LHC Tier2 computing facilities are operated by the Russian Data-Intensive Grid (RDIG)
RDIG is the Russian segment of the European grid infrastructure EGEE: http://www.egee-rdig.ru
- Basic grid services (including VO management, RB/WLM, etc.) are provided by SINP MSU, RRC KI and JINR
- Operational functions are provided by JINR, IHEP, PNPI
- Regional Certificate Authority and security are supported by RRC KI
- User support (Call Center, link to GGUS at FZK): ITEP
[Figure: Production, normalised CPU time per region (June 2008)]
RuTier2 for ALICE: results from June 2005 till June 2008
Number of running jobs at RDIG sites: 1 143 675
CMS: participation in CCRC08 (May 2008)
- CERN-PROD T1 SINP: up to 50 Mbytes/sec!!!
- CERN-PROD T1 JINR: up to 43 Mbytes/sec!!
- CERN-PROD T1 ITEP: up to 34 Mbytes/sec!
CMS CCRC08 PhEDEx data transfers to the RDMS CMS sites at Production status(!) in May 2008: transfer rates up to 50 Mbytes/sec
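For scale, the quoted transfer rates can be converted into a daily data volume. The sketch below assumes that the peak rate is sustained for a full 24 hours, which the slide does not claim (it quotes "up to" values), so the results are upper bounds. It also assumes decimal units (1 TB = 1,000,000 Mbytes). The function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_volume_tb(rate_mbytes_per_sec: float) -> float:
    """Upper bound on the data moved in one day, in TB, if the given
    rate were sustained for 24 hours (decimal units: 1 TB = 10**6 MB)."""
    return rate_mbytes_per_sec * SECONDS_PER_DAY / 1_000_000


# Peak rates quoted on the slide, in Mbytes/sec.
peak_rates = {"SINP": 50, "JINR": 43, "ITEP": 34}

for site, rate in peak_rates.items():
    print(f"{site}: {rate} Mbytes/sec -> at most {daily_volume_tb(rate):.2f} TB/day")
```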
RDIG VOs
Most RCs support the WLCG/EGEE VOs: Alice, Atlas, CMS, LHCb
Supported by some RCs: gear, Biomed, Fusion
Infrastructure VOs (all RCs): dteam, ops
Regional VOs: ru-fusion, ams, eearth, photon, rdteam, rgstest
Today: mostly LHC applications.
Today's challenge: grid infrastructure for nanoindustry, ITER (fusion), ++
Development and maintenance of RDIG e-infrastructure
The main directions in development and maintenance of the RDIG e-infrastructure are as follows:
- support of basic grid services;
- support of the Regional Operations Center (ROC);
- support of Resource Centers (RC) in Russia;
- RDIG Certification Authority;
- RDIG monitoring and accounting;
- participation in integration, testing and certification of grid software;
- support of users, Virtual Organizations (VO) and applications;
- user and administrator training and education;
- dissemination, outreach and communication activities for the grid.
Support of basic grid services
Grid operator on duty: 6 teams working in weekly rotation (CERN, France, Italy, UK, Russia, Taipei)
Geographically distributed responsibility for operations: there is no central operation
ROCs (Regional Operations Centers) in each federation
Russia: SINP MSU, RRC KI, JINR
RDIG: training, induction courses
PNPI, RRC KI, JINR, IHEP, ITEP, SINP MSU:
- Induction courses: ~600
- Courses for application developers: ~60
- Site administrator training: ~100
2007: a series of training sessions for physicists on how to use EGEE/RDIG in their daily activity (LHC coming...)
Today more than 100 physicists from different LHC centers in Russia have received this training
Russia in World-wide LHC Computing Grid
RDIG resource planning for LHC: Russian Federation, RDIG (Note 1)

                          2007   2008   2009   2010   2011   2012
CPU (ksi2k)               3000   3000   3800   6500  12000  17000
Disk (Tbytes)              600    600   1700   3000   4000   5700
Nominal WAN (Mbits/sec)   2500   2500   2500   5000  10000  10000

Tape (Tbytes): 1500, 2000, 3000

Split 2008                ALICE  ATLAS    CMS
CPU (ksi2k), offered       1050    600    600
CPU, % of total              8%     3%     4%
Disk (Tbytes), offered      210    180    150
Disk, % of total            12%     2%     3%

Note 1: The Russian capacities are available in January of the indicated year.
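The planned growth in this resource plan can be summarised as growth factors between two years. The sketch below copies the CPU, disk and WAN rows from the slide. The helper name `growth_factor` and the choice of 2008 and 2012 as the comparison years are illustrative assumptions, not part of the original plan.

```python
# Planned RDIG capacities for 2007..2012, copied from the resource planning slide.
YEARS = [2007, 2008, 2009, 2010, 2011, 2012]
PLAN = {
    "CPU (ksi2k)": [3000, 3000, 3800, 6500, 12000, 17000],
    "Disk (Tbytes)": [600, 600, 1700, 3000, 4000, 5700],
    "Nominal WAN (Mbits/sec)": [2500, 2500, 2500, 5000, 10000, 10000],
}


def growth_factor(values, start_year, end_year):
    """Ratio of planned capacity in end_year to that in start_year."""
    return values[YEARS.index(end_year)] / values[YEARS.index(start_year)]


# Hypothetical comparison: planned capacity in 2012 relative to 2008.
for resource, values in PLAN.items():
    factor = growth_factor(values, 2008, 2012)
    print(f"{resource}: x{factor:.1f} from 2008 to 2012")
```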
RDIG at EGEE 2002-2008: what we got and what we realized
Through participation in the world-wide projects EGEE/WLCG we got:
- Creation of the first grid infrastructure in Russia at the national level
- Involvement in the development of new grid technologies
- Experience in management of large distributed computing systems
- Experience in production grid computing for science
Problems we realized:
- Coordination of grid initiatives in Russia at the national level (today RDIG represents Russia in the European Grid Initiative)
- There is no Russian NREN (National Research and Education Network)
- No funding for developing grid middleware/infrastructure at the national level, compared with EU, US, China,...
- No involvement in international standardization bodies
European Grid Initiative
Need to prepare a permanent, common Grid infrastructure:
- Ensure the long-term sustainability of the European e-infrastructure, independent of short project funding cycles
- Coordinate the integration and interaction between National Grid Infrastructures (NGIs)
- Operate the production Grid infrastructure on a European level for a wide range of scientific disciplines
- There must be no gap in the support of the production grid
Toward a European multi-discipline sustainable production grid infrastructure
- up to now, successful grid infrastructures are/were strongly motivated/born/required by concrete applications
- now we have to find the universal set of MW services optimal for creating multi-discipline grids (not only in science...)
- this issue is not yet well understood (a much more complex problem than networking...)
Personal view: the Grid idea as an infrastructure innovation is 10 years old; new innovative idea(s) should be born on the shoulders of existing grid projects +++