Computing grids: a tool for international collaboration and against the digital divide
Guy Wormser, Director of the CNRS Institut des Grilles (CNRS, France)
www.eu-egee.org
EGEE and gLite are registered trademarks

The rationale
- Fight the brain drain by promoting international collaboration in the centers of excellence of emerging countries
- Tool: a worldwide distributed computing infrastructure such as the EGEE Grid
- Great opportunity for North-South collaboration

The grid virtuous circle
- Universal observation: strong correlation between grid usage and the presence of local grid nodes
- Two lines of action:
  - Promote grid usage in emerging countries
  - Install as many nodes as possible in emerging countries
- Only possible through a vigorous training program

Electricity Grid
Analogy with the electricity power grid:
- Power stations
- Distribution infrastructure
- 'Standard interface'

Computing Grid
- Computing and data centres
- Fibre optics of the Internet

Grid: Resource Sharing
Enabling Grids for E-sciencE
- Share more than information: data, computing power, applications
- Middleware handles everything

[Figure: single computer versus the Grid]
- Single computer: PROGRAMS (Word/Excel, Games, Email/Web, Your Program); OPERATING SYSTEM; Disks, CPU etc.
- The Grid: Your Program; MIDDLEWARE (User Interface Machine, Resource Broker); Disk Server, CPU Cluster, CPU Cluster

How e-infrastructures help e-science
- e-infrastructures provide easier access for:
  - Small research groups
  - Scientists from many different fields
  - Remote and still developing countries
- To new technologies:
  - Produce and store massive amounts of data
  - Transparent access to millions of files across different administrative domains
- Low-cost access to resources:
  - Mobilise large amounts of CPU and storage on short notice (PC clusters)
  - High-end facilities (supercomputers)
- And help to find new ways to collaborate:
  - Develop applications using distributed complex workflows
  - Ease distributed collaborations
  - Provide new ways of community building
  - Give easier access to higher education

[Figure labels: Knowledge infrastructure, Grid infrastructure, Network infrastructure]

- 240 sites
- 45 countries
- 81,000 CPUs
- 15 petabytes
- >5,000 users
- >100 VOs
- >100,000 jobs/day

Disciplines: Archeology, Astronomy, Astrophysics, Civil Protection, Comp. Chemistry, Earth Sciences, Finance, Fusion, Geophysics, High Energy Physics, Life Sciences, Multimedia, Material Sciences
32 %

Grids' key competitive advantages
- Transparent access to distributed data
  - Examples: Earth sciences, life sciences
- Handling of huge datasets
  - Particle physics, astrophysics, human sciences
- Large flexibility in computing resources
  - Disaster management
  - Avian flu, malaria challenges
- Synergy between the grid network and the human network

The virtuous circle
- The 3 pillars of e-science:
  - A high-speed network
  - A scientific community eager to participate in an international venture
  - An enabling tool, the grid
- Constructive interference between the three poles in many countries: the first one to progress pulls the other two along
  - Example of Morocco
- Participation of local «Southern» scientists in large «Northern» international projects; best example: the LHC program (Latin America, Central and South East Asia, Northern Africa)
- Deployment of «Southern» applications on the international Grid
- Key event in October 2008: agreement between the EU and the AU to boost research networks in Africa

Collaborating infrastructures
Nothing there yet!

SEISMOLOGY[1]
Fast determination of the mechanisms of important earthquakes (IPGP: E. Clévédé, G. Patau)
- Challenge: provide results 24-48 h after the earthquake occurs
- 5 earthquakes already ported: Peru, Guadeloupe, Indonesia (Dec.), Japan, Indonesia (Feb.)
- Application to run on alert:
  - Collect data from 30 seismic stations of the GEOSCOPE worldwide network
  - Select stations and data
  - Define a spatial 3D grid + time
- Peru earthquake, 23/6/2001, Mw = 8.3; data used: 15 GEOSCOPE stations
- Run, for example, 50-100 jobs

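The slide does not say how the 3D grid + time is divided among the 50-100 jobs. As a minimal sketch of one possible approach, assuming each grid point can be evaluated independently, the candidate points could be dealt out round-robin. The grid dimensions and the helper `split_into_jobs` are hypothetical and are not taken from the IPGP application.

```python
from itertools import product

def split_into_jobs(points, n_jobs):
    """Deal the points out round-robin into n_jobs lists of near-equal size."""
    return [points[i::n_jobs] for i in range(n_jobs)]

# Hypothetical search grid: latitude, longitude and depth indices plus a time index.
lat_steps = range(10)
lon_steps = range(10)
depth_steps = range(5)
time_steps = range(4)

# Every (lat, lon, depth, time) combination is one candidate source to test.
points = list(product(lat_steps, lon_steps, depth_steps, time_steps))

# 80 jobs is an arbitrary choice inside the 50-100 range quoted on the slide.
jobs = split_into_jobs(points, 80)

print(len(points), "grid points split into", len(jobs), "jobs")
```

Each list in `jobs` could then be handed to one grid job. A round-robin split keeps the job sizes within one point of each other, which helps when all results are needed before a deadline.
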
Management of water resources in the Mediterranean area (SWIMED)
G. Lecca (CRS4, Italy), P. Renard (Unine, CH), J. Kerrou (INAT, Tunisia), R. Ababou (IMFT, Fr)
- Korba coastal aquifer, Tunisia
- 45 km
- Cape Bon Peninsula, 70 km south-east of Tunis

WISDOM Drug Discovery
WISDOM focuses on in silico drug discovery for neglected and emerging diseases.
- Malaria
  - 46 million ligands docked
  - 1 million selected
  - 1 TB of data produced; 80 CPU-years used in 6 weeks
- Avian flu
  - H5N1 neuraminidase
  - Impact of selected point mutations on eff. of existing drugs
  - Identification of new potential drugs acting on mutated N1
- Fall 2006
- Extension to other neglected diseases

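To put the malaria figures in perspective, the short calculation below converts "80 CPU-years used in 6 weeks" into an average number of CPUs busy at the same time, and into a rough CPU time per docked ligand. It is only a back-of-the-envelope sketch: it assumes a 365-day CPU-year and assumes that all 80 CPU-years went into the 46 million dockings, neither of which the slide states.

```python
# Figures quoted on the slide.
CPU_YEARS = 80
WALL_CLOCK_WEEKS = 6
LIGANDS_DOCKED = 46_000_000

# Assumption: one CPU-year is one CPU kept busy for 365 days.
DAYS_PER_YEAR = 365
SECONDS_PER_DAY = 24 * 3600

cpu_days = CPU_YEARS * DAYS_PER_YEAR
wall_clock_days = WALL_CLOCK_WEEKS * 7

# Average number of CPUs that had to be busy simultaneously.
average_cpus = cpu_days / wall_clock_days

# Rough CPU time per docking, assuming all the CPU time was spent on docking.
seconds_per_docking = cpu_days * SECONDS_PER_DAY / LIGANDS_DOCKED

print(f"Average concurrent CPUs: {average_cpus:.0f}")
print(f"CPU seconds per docking: {seconds_per_docking:.1f}")
```

Under these assumptions the campaign kept roughly 700 CPUs busy around the clock for six weeks, the kind of short-notice capacity that the slides present as a grid strength.
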
The Montpellier workshop
- Held in France on December 10-12, 2007
- Grid workshop to develop France-Africa collaboration
- Sponsored by CNRS and the foundation «Share the knowledge»
- Focus on African development via science and the excellence of African scientists
- Promote Internet connectivity and grid nodes in Africa
- First actions selected: install two grid nodes in Africa
  - South Africa and Senegal selected as the best places to start
- Prepare the launch of a «EuroAfrica» FP7 program

The first EGEE grid node in sub-Saharan Africa
Grid node in Dakar established in July 2008 with the help of HP/UNESCO and CNRS!

Action Plan with South Africa
May 2008 meeting in Pretoria: 3 main objectives
- Provide training in France to South African system administrators: done! (in conjunction with Senegal)
  - Similar action done in parallel in Catania
- Organise a user-oriented grid workshop in South Africa open to other African scientists: done!
- Install a powerful South African Grid. Integrate South African sites into the European Grid EGEE: soon to be done
In the meantime, successful effort to install a grid node in Senegal (July 2008)

Plan for 2009
- Recent European Union-African Union agreement to promote high-speed networks in Africa: FIST project
  - Follow/collaborate/consolidate
- Integrate South African nodes into the EGEE grid
  - CHPC node as the main focus
  - Some smaller nodes as prototypes
- Deploy a South African VO in EGEE with several scientific applications
  - HEP, life sciences, Earth sciences
- Launch a dedicated EU «Support Action» program to promote grids in Africa
- Reinforce the Senegal node
- Install new nodes (Congo, Ivory Coast, Eastern Africa, ...)
- Organize a series of grid workshops and tutorials in Africa in 2009

Contents of the EU-Africa project
- Grid user training in rotating locations throughout Africa
- System administrator training in Europe and in Africa
- Introduction of grid concepts in academic programs in Europe and in Africa
- Deployment of African-based applications on the Grid

Conclusion
- E-science is a good tool to fight brain drain by boosting the attractiveness of the centers of excellence of emerging countries
- Grid technology is very well adapted to this task
- Vigorous ongoing program around the world for North-South partnership, with an added focus on sub-Saharan Africa:
  - Install new grid nodes
  - Deploy «Southern» applications
  - Training, training and training
- Support and feed the virtuous circle: grids, networks, science

Types of applications
- Simulation
  - Examples: LHC Monte Carlo simulations; fusion; WISDOM
  - Characteristics: jobs needing significant processing power; large number of independent jobs; limited input data; significant output data
- Bulk processing
  - Examples: HEP; processing of satellite data
  - Characteristics: distributed input data; large amount of input and output data; job management (WMS); metadata services; complex data structures
- Parallel jobs
  - Examples: climate models; computational chemistry
  - Characteristics: large number of independent but communicating jobs; need for simultaneous access to a large number of CPUs; MPI libraries
- Short-response delays
  - Examples: prototyping new applications; grid monitoring; interactivity
  - Characteristics: limited input and output data and processing needs, but need for fast response and quality of service
- Workflow
  - Examples: medical imaging; flood analysis
  - Characteristics: complex analysis algorithms; complex dependencies between jobs
- Commercial applications
  - Examples: non-open-source software; Geocluster (seismic platform); FlexX (molecular docking); Matlab, Mathematica, IDL
  - Characteristics: license server associated with an application deployment model