GRID COMPUTING ACTIVITIES AT BARC ALHAD G. APTE, BARC 2nd GARUDA PARTNERS MEET ON 15th & 16th SEPT. 2006
Computing Grid at BARC: A computing grid has been set up as a test-bed using existing grid technology components from LCG. [Diagram: grid-enabled HPC clusters, a visual data server and web services connected over a 100 Mbps fibre network, with AFS, PBS and Globus forming the fabric layer.]
Anupam-Ameya: 512 processors (1.73 teraflops)
DAE Grid: BHABHA ATOMIC RESEARCH CENTRE, MUMBAI; INDIRA GANDHI CENTRE FOR ATOMIC RESEARCH, KALPAKKAM; RAJA RAMANNA CENTRE FOR ADVANCED TECHNOLOGY, INDORE; VARIABLE ENERGY CYCLOTRON CENTRE, KOLKATA
Stimuli for entering grid technology: still-evolving grid technology; recent availability of high bandwidth at affordable cost; mature web technologies; wide-scale global grid initiatives; expertise developed through the DAE-CERN collaboration.
LHC Computing: The LHC (Large Hadron Collider) will begin taking data in 2006-2007 at CERN, with data rates per experiment of >100 MBytes/sec and >1 PByte/year of storage for raw data per experiment. Computationally, the problem is too large to be solved by a single computing centre; analysis is carried out by world-wide collaborations, so it is desirable to share computing and analysis throughout the world.
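To put the quoted figures in perspective, here is a rough back-of-the-envelope check; the ~1e7 seconds of effective data taking per year is an assumption for illustration, not a figure from the slides.

```python
# Rough check of the LHC storage figure quoted above.
DATA_RATE_BYTES_PER_SEC = 100e6   # >100 MBytes/sec per experiment (from the slide)
SECONDS_PER_YEAR = 1e7            # assumed effective data-taking time per year

raw_data_per_year = DATA_RATE_BYTES_PER_SEC * SECONDS_PER_YEAR
print(f"raw data per experiment per year ~ {raw_data_per_year / 1e15:.1f} PB")
# -> ~1.0 PB, consistent with the ">1 PByte/year" figure on the slide
```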
Data Grids for HEP (image courtesy Harvey Newman, Caltech). There is a bunch crossing every 25 ns and about 100 triggers per second; each triggered event is ~1 MByte, giving ~100 MBytes/sec from the online system into the Tier 0 offline processor farm at the CERN computer centre (~20 TIPS, where 1 TIPS is approximately 25,000 SpecInt95 equivalents); the raw detector output is of order PBytes/sec. Tier 1 regional centres (France, Germany, Italy, FermiLab at ~4 TIPS) are fed at ~622 Mbits/sec or by air freight (deprecated). Tier 2 centres (e.g. Caltech, ~1 TIPS each) connect at ~622 Mbits/sec, Tier 3 institute servers (~0.25 TIPS) hold physics data caches for physicist workstations, and Tier 4 workstations connect at ~1 MBytes/sec. Physicists work on analysis channels; each institute will have ~10 physicists working on one or more channels, and data for these channels should be cached by the institute server.
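The ~100 MBytes/sec figure follows directly from the trigger numbers in the diagram; a minimal check:

```python
# Event rate and bandwidth implied by the trigger figures in the tier diagram.
BUNCH_CROSSING_PERIOD_S = 25e-9   # one bunch crossing every 25 ns
TRIGGER_RATE_HZ = 100             # ~100 triggered events per second
EVENT_SIZE_BYTES = 1e6            # ~1 MByte per triggered event

crossings_per_second = 1 / BUNCH_CROSSING_PERIOD_S
bandwidth = TRIGGER_RATE_HZ * EVENT_SIZE_BYTES

print(f"bunch crossings per second ~ {crossings_per_second:.0e}")    # ~4e7
print(f"data rate after trigger    ~ {bandwidth / 1e6:.0f} MB/s")    # ~100 MB/s
```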
SHIVA: a Problem Tracking System. Features: a fully web-based system providing Tracking (tracking reported bugs, defects, feature requests, etc.), Assignment (automatic routing and notification to support staff to get issues resolved), Communication (capturing discussion and sharing knowledge), Enforcement (automatic reminders according to the severity of the issues), and Accountability (history and logs).
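A minimal sketch of the kind of issue record and severity-driven reminder rule such a tracker maintains; the field names and reminder intervals below are hypothetical illustrations, not SHIVA's actual schema or policy.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Hypothetical reminder intervals per severity (illustration only).
REMINDER_INTERVAL = {
    "critical": timedelta(hours=4),
    "major":    timedelta(days=1),
    "minor":    timedelta(days=7),
}

@dataclass
class Issue:
    title: str
    severity: str                                   # "critical" | "major" | "minor"
    assignee: str                                   # support staff the issue is routed to
    history: list = field(default_factory=list)     # accountability: history and logs
    last_notified: datetime = field(default_factory=datetime.now)

    def needs_reminder(self, now: datetime) -> bool:
        """Enforcement: remind according to the severity of the issue."""
        return now - self.last_notified > REMINDER_INTERVAL[self.severity]
```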
SHIVA Screenshots: User Home Page
LEMON: Lemon is a system designed to monitor performance metrics, exceptions and status information of extremely large clusters. At CERN it monitors ~2000 nodes in ~70 clusters, with ~150 metrics per host, producing ~1 GB of data, and is estimated to scale to 10,000 nodes. It offers a variety of web-based views of the monitored data for sysadmins, managers and users, and its highly modular architecture allows the integration of user-developed sensors for monitoring site-specific metrics.
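The point of the modular architecture is that a site can plug in its own metric collectors. The sketch below is a generic illustration of that idea in Python, not Lemon's actual sensor API; the metric name and dictionary format are assumptions.

```python
import os
import time

# Illustration of a pluggable, site-specific metric sensor.
# This mimics the idea behind Lemon's modular sensors; it is NOT Lemon's real API.
def load_average_sensor():
    """Sample a site-specific metric: the 1-minute load average of this host."""
    load1, _, _ = os.getloadavg()
    return {"metric": "load_avg_1min", "value": load1, "timestamp": time.time()}

if __name__ == "__main__":
    sample = load_average_sensor()
    print(sample)   # a monitoring agent would forward this to the central server
```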
LEMON architecture
QUATTOR: Quattor is a tool suite providing automated installation, configuration and management of clusters and farms, and is highly suitable for installing, configuring and managing grid computing clusters correctly and automatically. At CERN it is currently used to automatically manage more than 2000 nodes with heterogeneous hardware and software applications. It provides centrally configurable and reproducible installations, plus run-time management for functional and security updates, to maximize availability.
QUATTOR architecture [diagram]: the Configuration Database (CDB) has SQL and XML backends and is accessed through CLI, GUI, scripts and SOAP; XML configuration profiles are served to nodes over HTTP, and software (RPMs) is served over HTTP from SW repository server(s). On each managed node, the Node Configuration Manager (NCM) runs components (CompA, CompB, CompC) that configure local services, while the SW Package Manager (SPMA) installs RPMs/PKGs on top of the base OS; an Install Manager on the install server drives the system installer via HTTP/PXE.
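The core idea is that each node's desired state is described centrally and the node converges to it. Below is a minimal sketch of the "desired vs installed" comparison step under that assumption; the profile format, package names and versions are hypothetical, and real Quattor profiles are Pan-compiled XML fetched over HTTP by NCM/SPMA.

```python
# Sketch of the desired-state vs installed-state comparison that a package
# manager like SPMA performs.  All package names/versions are hypothetical.
desired_packages = {"openssh": "4.3", "globus-gatekeeper": "2.4", "pbs-mom": "2.1"}
installed_packages = {"openssh": "4.1", "pbs-mom": "2.1", "telnet-server": "1.0"}

to_install = [p for p in desired_packages if p not in installed_packages]
to_upgrade = {p: v for p, v in desired_packages.items()
              if p in installed_packages and installed_packages[p] != v}
to_remove = [p for p in installed_packages if p not in desired_packages]

print("install:", to_install)   # packages missing on the node
print("upgrade:", to_upgrade)   # wrong version installed
print("remove :", to_remove)    # not in the centrally configured profile
```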
DAE Grid: resource sharing and coordinated problem solving across dynamic, multiple R&D units, connected by 4 Mbps links. CAT: archival storage; VECC: real-time data collection; BARC: computing with shared controls; IGCAR: wide-area data dissemination.
ANUNET in BARC [diagram]: BARC, CAT, VECC and IGCAR routers interconnected over ANUNET. At BARC, a DMZ behind a NAT/firewall hosts the grid services: CA, VOMS, File Catalog, UI (Grid Portal), BDII (resource directory), MyProxy, Resource Broker and MON (e.g. R-GMA). Behind it, the cluster network holds the Gatekeeper (CE), the SE (Storage Element) and the Worker Nodes, alongside the unit intranet.
Information flow [diagram]: each site's Gatekeeper runs an information service that publishes the state of its worker nodes (number of CPUs, memory, jobs running, jobs pending, etc.) to the Information Provider (BDII) over the network; the Resource Broker consults the BDII when placing jobs on the worker nodes (job execution) at each site.
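A minimal sketch of the matchmaking a Resource Broker performs against the information published in the BDII (CPUs, memory, running and pending jobs); the attribute names and ranking rule are simplified illustrations, not the actual information schema or broker algorithm.

```python
# Simplified matchmaking: pick the computing element (CE) that satisfies the
# job's requirements and has the most free CPUs.  Attribute names and the
# ranking rule are illustrative only.
published_ces = [
    {"ce": "site1-ce", "total_cpus": 32, "jobs_running": 30, "jobs_pending": 5, "memory_mb": 2048},
    {"ce": "site2-ce", "total_cpus": 10, "jobs_running": 2,  "jobs_pending": 0, "memory_mb": 1024},
]

def match(ces, min_memory_mb):
    """Filter on job requirements, then rank by free CPUs (descending)."""
    candidates = [c for c in ces if c["memory_mb"] >= min_memory_mb]
    return sorted(candidates,
                  key=lambda c: c["total_cpus"] - c["jobs_running"],
                  reverse=True)

best = match(published_ces, min_memory_mb=1024)[0]
print("job dispatched to:", best["ce"])   # -> site2-ce (8 free CPUs vs 2)
```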
File storage [diagram]: the same layout, with the Resource Broker, network, Gatekeepers and Information Service; jobs executing on the worker nodes at each site access the sites' file storage, while resource status (number of CPUs, memory, jobs running, jobs pending, etc.) continues to be published to the Information Service.
Grid Setup / Services: UI (User Interface), the interface for using the grid; BDII, the information system; RB (Resource Broker); MyProxy server for proxy renewal; Certifying Authority issuing certificates; VOMS (Virtual Organization Membership Server); LFC (file catalog). Each site (Site 1, Site 2) runs a CE (Computing Element), an SE (Storage Element) and multiple WNs (Worker Nodes).
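Tying these services together, the sketch below models the user-side submission sequence (proxy with VO membership, submission through the RB, execution on a WN behind a CE, output retrieval). It is a conceptual Python model with hypothetical function names and a hypothetical VO name, not the LCG/gLite command-line tools themselves.

```python
# Conceptual model of the submission flow through the services listed above.
# All function names and the VO name are hypothetical stand-ins; the real
# workflow is driven from the UI with the LCG/gLite command-line tools.

def create_proxy(user_cert: str, vo: str) -> str:
    """UI + VOMS: obtain a short-lived proxy tagged with the VO membership."""
    return f"proxy({user_cert},{vo})"

def submit_job(proxy: str, executable: str) -> str:
    """UI -> RB: the broker matches the job to a CE using the BDII."""
    return f"job-id-for-{executable}"

def job_status(job_id: str) -> str:
    """RB/CE report the state as the job moves to a WN and runs."""
    return "Done"

def retrieve_output(job_id: str) -> str:
    """UI pulls the output back once the job has finished."""
    return f"output-of-{job_id}"

proxy = create_proxy("usercert.pem", vo="daegrid")   # VO name is hypothetical
job_id = submit_job(proxy, executable="simulate.sh")
if job_status(job_id) == "Done":
    print(retrieve_output(job_id))
```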
Deployment [diagram]: central services comprise the Certifying Authority, VOMS, top-level BDII, LFC file catalogue, Resource Broker (matchmaking, job submission), MyProxy server and a GridICE server. Each of the two sites runs a User Interface (command-line interface with certificates), a Gatekeeper with site BDII/GRIS information providers, a PBS-based Computing Element with an FMON agent and FMON server, and a Storage Element (GridFTP and RFIO) with an FMON agent; the worker nodes run the PBS client (32 worker nodes at Site 1, 10 at Site 2).
GRID APPLICATIONS: HIGH PERFORMANCE COMPUTING, ONLINE STORAGE, DATA SEARCH, DATABASES, APPLICATION-BASED VIRTUAL ORGANISATIONS, DATA ACQUISITION, SIMULATION, VISUALISATION
THANK YOU