The LHC Computing Grid
  Gergely Debreczeni (CERN IT/Grid Deployment Group)
The data factory of the LHC
  40 million collisions every second
  After on-line triggers and selections, only 100 events per second remain
  3-4 MB/event, requiring a recording speed greater than 1 GB/s
  More than 10 billion collisions a year, yielding a data flow of 10 PB/year
  Additional Monte Carlo simulations
  To compare:
    o 1 TB corresponds approximately to all the books produced around the world in one year
    o 1 EB is the amount of information generated around the world in one year
  Requires ~100,000 of today's fastest PCs
Why do we need it?
  The new collider and its detectors will generate an enormous amount of data.
  No single supercomputer will be able to handle the data!
  A reliable, permanent, fault-tolerant, flexible and distributed computing environment is needed to meet the requirements of the new experiments and of their geographically highly distributed collaborations.
  A computing system free of any Single Point of Failure (SPF)!
  The LHC Computing Grid is meant to be the solution!
EGEE - the framework
  EGEE: Enabling Grids for E-science in Europe
  The aim of the EGEE and EGEE-2 projects: to develop a service Grid infrastructure that is available to scientists 24 hours a day
  http://www.eu-egee.org
  The project concentrates on:
    o building a consistent, robust and secure Grid network that will attract additional computing resources
    o continuously maintaining and improving the middleware in order to deliver a reliable service to users
    o attracting new users from industry as well as science, and ensuring they receive the high standard of training and support they need
  EGEE facts:
    o largest Grid infrastructure project in Europe
    o 27 participating countries
    o ~70 leading institutions
    o ~30 additional contributors
    o over 180 sites
    o over 30 M Euros of funding per 2 years
Basic elements of LCG
  Virtual Organisations
    o A grouping of individuals, often not bound to a single institution or enterprise, who, by reason of their common membership of the VO and in sharing a common goal, are granted rights to use a set of resources on the Grid.
  Certificates
    o Authentication and authorisation are based on X.509 digital certificates: digital identity cards with extensions containing information about the user's VO membership. Issued by the Certificate Authorities.
  Worker Nodes
    o Jobs run here.
  Computing elements
    o Master/head node of a local batch system. Interface to the Grid. Publish resource availability and job status to the Grid's Information Index.
  BDII Information Index
    o Collects information from the CEs and publishes it using a special schema (GLUE).
  Resource Brokers
    o Match job requirements with available resources, based on the BDII information.
  Proxy servers
    o Extend certificate lifetimes for long-running jobs.
  Catalogs
    o Different file location catalogs; logical file pointers independent of physical media and location.
  Storage elements
    o Disk and/or tape behind a common interface.
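
The matchmaking step performed by a Resource Broker can be illustrated with a toy example. This is only a sketch under stated assumptions, not the LCG middleware's own algorithm: the record fields, the host names and the "most free CPUs wins" ranking rule are all hypothetical, and a plain Python list stands in for the information published through the BDII.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ComputingElement:
    """One CE record as an information index might list it (hypothetical fields)."""
    name: str
    supported_vos: frozenset
    free_cpus: int
    max_wall_hours: int


@dataclass
class Job:
    """Requirements a user attaches to a job (hypothetical fields)."""
    vo: str
    cpus: int
    wall_hours: int


def match(job: Job, index: list) -> Optional[ComputingElement]:
    """Return the eligible CE with the most free CPUs, or None if none fits."""
    eligible = [
        ce for ce in index
        if job.vo in ce.supported_vos          # the user's VO must be supported
        and ce.free_cpus >= job.cpus           # enough free slots
        and ce.max_wall_hours >= job.wall_hours  # queue allows the run time
    ]
    # Ranking by free CPUs is an assumption of this sketch.
    return max(eligible, key=lambda ce: ce.free_cpus, default=None)


# Made-up index contents standing in for BDII data.
index = [
    ComputingElement("ce-a.example.org", frozenset({"cms", "hungrid"}), 12, 48),
    ComputingElement("ce-b.example.org", frozenset({"atlas"}), 200, 72),
    ComputingElement("ce-c.example.org", frozenset({"hungrid"}), 40, 24),
]

chosen = match(Job(vo="hungrid", cpus=8, wall_hours=36), index)
print(chosen.name if chosen else "no match")  # ce-a.example.org
```

Here ce-b is rejected because it does not support the job's VO and ce-c because its wall-time limit is too short, so the job goes to ce-a even though ce-c has more free CPUs.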
Map of LCG sites
  EGEE, OSG, NorduGrid
  ~32,000 processors
  ~10 PB of storage
  ~20,000 running jobs at any time
  ~185 sites
Grid Monitoring - SAM
  SAM: Service Availability Monitor
  Test jobs are submitted every 3 hours to each site in production.
  SAM examines the state of the site, publishes the results to a central page and sends notifications to site admins if necessary.
  http://lcg-sam.cern.ch:8443/sam/sam.py
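
The monitoring cycle described above can be sketched in a few lines. This is an illustration only, not SAM's actual code or data model: the (site, passed) result format, the site name SITE-X and the rule "notify when the most recent test failed" are assumptions made for the example.

```python
from collections import defaultdict

TEST_INTERVAL_HOURS = 3  # from the slide: one test round every 3 hours


def rounds_per_day(interval_hours: int = TEST_INTERVAL_HOURS) -> int:
    """Number of test rounds each site sees in 24 hours."""
    return 24 // interval_hours


def summarise(results):
    """Map each site to the fraction of its tests that passed.

    `results` is a list of (site, passed) pairs (hypothetical format).
    """
    passed = defaultdict(int)
    total = defaultdict(int)
    for site, ok in results:
        total[site] += 1
        passed[site] += 1 if ok else 0
    return {site: passed[site] / total[site] for site in total}


def sites_to_notify(results):
    """Sites whose most recent test failed (assumed notification rule)."""
    latest = {}
    for site, ok in results:  # results are assumed to be in time order
        latest[site] = ok
    return sorted(site for site, ok in latest.items() if not ok)


# Two made-up test rounds for two sites.
results = [
    ("BUDAPEST", True), ("SITE-X", True),
    ("BUDAPEST", True), ("SITE-X", False),
]

print(rounds_per_day())          # 8
print(summarise(results))        # {'BUDAPEST': 1.0, 'SITE-X': 0.5}
print(sites_to_notify(results))  # ['SITE-X']
```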
Joining the LCG I
  The BUDAPEST site of the Central Research Institute for Particle and Nuclear Physics (KFKI) was the 6th to join LCG, in June 2003.
  Building on our previous Condor cluster experience, at that time we had 25 processors and 1.5 TB of disk storage, and used the Condor batch system.
  Now KFKI has ~110 processors and 6.5 TB of storage, and supports the following Virtual Organisations: Alice, Atlas, LHCb, CMS, dteam, ops, HunGrid, Voce, BioMed
Joining the LCG/EGEE II
  Past and current activities:
    o gLite certification testbed: installing and certifying new versions of the EGEE middleware before they are released
    o LHCb data challenge: participation in LHCb's data challenge (DC04)
    o CMS service challenge: BUDAPEST is now recognized as a Tier-2 CMS center
    o Alice AliEn grid: dedicated gateway node (VO-box) to run AliEn jobs on the LCG cluster
    o BioMed service challenge
    o GSVG activities: participation in the Grid Security Working Group; vulnerability testing, risk estimation
    o User support: providing technical support, mainly for HunGrid users
    o Joint EGEE SEEGRID2 summer school organized by SZTAKI
    o Demo cluster and courses at BME
    o Presentations, demos and tutorials organised by ELTE
    o The EGEE 07 conference will be held in BUDAPEST
The HunGrid Virtual Organisation
  The HunGrid Virtual Organisation was created to serve as a general-purpose scientific and educational national VO by:
    o KFKI RMKI, Central Research Institute for Particle and Nuclear Physics
    o ELTE, Eötvös Loránd University, Faculty of Sciences
  http://www.grid.kfki.hu/
The HunGrid Virtual Organisation
  Additional partners:
    o BME, Budapest University of Technology and Economics
    o NIIF, National Information Infrastructure Development Program
    o VEIN, University of Pannonia, Faculty of Information Technology
The HunGrid Virtual Organisation
  KFKI RMKI set up the first EUGridPMA-recognized Certification Authority in Hungary (http://pki.kfki.hu).
  EUGridPMA is the European Policy Management Authority for Grid Authentication.
  The RMKI CA now operates as an RA (Registration Authority) and issues certificates for members of the Institute, while the tasks of the top-level Certificate Authority have been delegated to NIIF (http://www.ca.niif.hu/).
The HunGrid P-Grade portal
  The P-Grade portal was developed at SZTAKI and serves as a graphical user interface to the Grid.
    o Built-in graphical workflow editor
    o Multi-Grid management
    o Resource management
    o Quota management
    o Workflow-level fault tolerance
    o Certificate management
    o On-line workflow and parallel job monitoring
    o Built-in MDS and BDII based information system management
    o Local and remote file handling
    o Personalisation
  A convenient tool to access and work on the Grid!
  http://n42.hpcc.sztaki.hu
ClusterGrid and the LCG
  The ClusterGrid project is a general-purpose Grid project that targets users from the academic and educational communities.
  http://www.clustergrid.iif.hu/
  Put simply, it is in practice a huge collection of Condor pools operating in night-only mode.
  ~1000 computers and several tens of TB of storage
  Setting up an LCG ClusterGrid gateway is under consideration.
  Several difficulties remain to be solved, in the hope of a significant improvement in resources and services!
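
One practical detail of a night-only operating mode is that the availability window wraps past midnight. The sketch below shows one way such a check could be written; the 20:00 to 07:00 window and the function name are assumptions for illustration, since the slide does not state the actual hours or how ClusterGrid switches modes.

```python
from datetime import time

# Assumed window; the slide only says "night-only".
NIGHT_START = time(20, 0)
NIGHT_END = time(7, 0)


def in_grid_mode(now: time, start: time = NIGHT_START, end: time = NIGHT_END) -> bool:
    """True if `now` falls inside [start, end), where the window may wrap past midnight."""
    if start <= end:
        # Ordinary same-day window, e.g. 01:00 to 05:00.
        return start <= now < end
    # Wrapping window, e.g. 20:00 to 07:00 the next morning.
    return now >= start or now < end


print(in_grid_mode(time(23, 0)))  # True
print(in_grid_mode(time(12, 0)))  # False
```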
Grid Competence Center
  http://www.mgkk.hu
  Members of the GCC play an outstanding and determining role in Hungarian Grid R&D projects: they are leaders or participants in the vast majority of such projects, including:
    o VISSZKI
    o DemoGrid
    o SuperGrid
    o JGrid
    o Chemistry Grid
    o Super-Cluster Grid
    o HunGrid
    o NKFP Grid
  Together it is easier to submit successful applications and get more funding.
  The formal framework has been created and the first joint applications have been submitted, but a much closer collaboration is needed to reach our aims.
HunGrid todos and problems
  To do:
    o A significant extension of both the participating institutes and the available resources is necessary. A critical Grid mass must be reached for the machinery to work as planned/expected.
    o Attract research groups and, in the longer term, maybe industrial applications
    o Demonstrate its advantages and usability
  Problems:
    o Fundamental financial problems (5 applications out of 6 fail)
    o It is hard to convince people to change and to use the Grid
    o With no users, site admins have no motivation to maintain their sites
  How to join HunGrid, contact info:
    o HunGrid is OPEN to everybody belonging to the academic community
    o Contact e-mail: gridadm@rmki.kfki.hu, hungridadm@lists.kfki.hu
    o Web site: http://www.grid.kfki.hu/