
e-neighbourhood Virtual Organisation and PNC
Simon C. Lin, Academia Sinica, Taipei, Taiwan 11529
sclin@gate.sinica.edu.tw
1 November 2005

e-neighbourhood was suggested by C. C. Hsieh, who does not like the name PNC Virtual Organisation (VO). A VO is not necessarily a single application domain, as it is in some e-science collaborations. I am compiling others' slides, à la Confucius. 2

Ultimately, the Globus Toolkit is designed to enable the creation and maintenance of Virtual Organizations. 3

Virtual Organizations
- Distributed resources and people
- Linked by networks, crossing admin domains
- Sharing resources, common goals
- Dynamic
- Fault tolerant
[Diagram: two overlapping virtual organisations, VO-A and VO-B] 4, 5

It's all about Virtual Organizations!
Different views:
- Dynamic enterprises
- Coalitions
- eScience collaboration
- On-demand computing
- Utility computing
Same problem: support work across dynamic communities with vested self-interest. 6

"A new age has dawned in scientific and engineering research, pushed by continuing progress in computing, information, and communication technology, and pulled by the expanding complexity, scope and scale of today's challenges. The capacity of this technology has crossed thresholds that now make possible a comprehensive cyberinfrastructure on which to build new types of scientific and engineering knowledge environments and organizations, and to pursue research in new ways and with increased efficacy." Report of the National Science Foundation Blue Ribbon Advisory Panel, 2003 7

"Water, water, everywhere, nor any drop to drink." S. T. Coleridge, 1797

The Data Deluge
- A large novel: 1 MByte; the Bible: 5 MBytes
- A Mozart symphony (compressed): 10 MBytes
- A digital mammogram: 100 MBytes
- OED on CD: 500 MBytes
- Digital movie (compressed): 10 GBytes
- Annual production of refereed journal literature (~20 k journals; ~2 M articles): 1 TByte
- Library of Congress: 20 TBytes
- The Internet Archive (10 B pages, from 1996 to 2002): 100 TBytes
- Annual production of information (print, film, optical & magnetic media): 1,500 to 3,000 PBytes
- All worldwide telephone communication in 2002: 19.3 ExaBytes
Moore's Law enables instruments and detectors to generate unprecedented amounts of data in all scientific disciplines. 9
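To put the slide's extremes on one scale (arithmetic on the slide's own numbers), one year of recorded information is about five orders of magnitude larger than the Library of Congress:

\[
\frac{3000\ \mathrm{PB}}{20\ \mathrm{TB}} = \frac{3\times10^{18}\ \mathrm{B}}{2\times10^{13}\ \mathrm{B}} = 1.5\times10^{5},
\]

i.e. roughly 150,000 Libraries of Congress per year at the upper estimate.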

LHC/Atlas in Action

The LHC Data Challenge. Starting from this event. Selectivity: 1 in 10^13. Like looking for 1 person in a thousand world populations! Or for a needle in 20 million haystacks! You are looking for this signature. Src: CERN 11
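A quick check of the slide's analogy, taking the 2005 world population to be roughly 6.4 billion (an assumed figure, not stated on the slide):

\[
\frac{10^{13}}{6.4\times10^{9}} \approx 1.6\times10^{3},
\]

so selecting one event in 10^13 is indeed like finding one person among on the order of a thousand world populations.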

The Computing Needs Will Grow. They may reach the scale of ExaBytes of data and PetaFlops of computing by 2015; in particular, the luminosity will be increased even at an early stage. The largest commercial database today can handle only tens of TeraBytes, and the fastest stand-alone computer can deliver only 70 TeraFlops peak.
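In round numbers, the gap the slide points at (taking "tens of TeraBytes" as 50 TB, an assumption):

\[
\frac{1\ \mathrm{PFlops}}{70\ \mathrm{TFlops}} \approx 14, \qquad
\frac{1\ \mathrm{EB}}{50\ \mathrm{TB}} = 2\times10^{4},
\]

a roughly 14-fold shortfall in compute and four to five orders of magnitude in manageable data volume; hence the case for aggregating resources rather than waiting for any single machine or database to catch up.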

Enabling Grids for E-sciencE: a TeraByte vs a PetaByte

                     1 TeraByte                 1 PetaByte
RAM time to move     15 minutes                 2 months
1 Gb WAN move time   10 hours ($1,000)          14 months ($1 million)
Disk cost            7 disks = $5,000 (SCSI)    6,800 disks + 490 units + 32 racks = $7 million
Disk power           100 Watts                  100 Kilowatts
Disk weight          5.6 kg                     33 tonnes
Disk footprint       inside machine             60 m^2

"Approximately correct" as of May 2003. Source: Distributed Computing Economics, Jim Gray, Microsoft Research, MSR-TR-2003-24. INFSO-RI-508833 Academia Sinica, Taiwan 13
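A minimal sketch reproducing the slide's move-time arithmetic (Python; the 1 Gb/s link and the 25% effective-throughput factor are assumptions chosen to match the slide's figures, not measurements):

    # Terabyte-vs-petabyte contrast after Jim Gray's "Distributed
    # Computing Economics" (MSR-TR-2003-24). Constants are the 2003-era
    # slide figures plus an assumed 25% effective WAN throughput.

    def wan_hours(n_bytes: float, link_bps: float = 1e9,
                  efficiency: float = 0.25) -> float:
        """Hours to move n_bytes over a WAN link at the given efficiency."""
        return n_bytes * 8 / (link_bps * efficiency) / 3600

    TB, PB = 1e12, 1e15
    print(f"1 TB: {wan_hours(TB):.1f} h")                   # ~8.9 h, near the slide's 10 hours
    print(f"1 PB: {wan_hours(PB) / (24 * 30):.1f} months")  # ~12.3 months, near the slide's 14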

Enabling Grids for E-sciencE: Mohammed & Mountains
- Petabytes of data cannot be moved: it stays where it is produced or curated (hospitals, observatories, the European Bioinformatics Institute, ...); a few caches, and only a small proportion cached
- Distributed collaborating communities: expertise in curation, simulation & analysis; distributed & diverse data collections
- Discovery depends on insights: unpredictable, sophisticated application code; tested by combining data from many sources; using novel, sophisticated models & algorithms
What can you do? If the mountain won't come to Mohammed, Mohammed must go to the mountain: Move Computation to the Data. INFSO-RI-508833 Academia Sinica, Taiwan 14
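Gray's paper distills this into a rule of thumb: moving data over the WAN only pays off when a job performs on the order of 10,000 or more CPU instructions per byte transferred; below that, ship the job to the data. A toy sketch of the decision (the breakeven constant is Gray's 2003 estimate, not a site measurement):

    # "Move computation to the data" decision after Gray's
    # Distributed Computing Economics rule of thumb.
    GRAY_BREAKEVEN_INSTR_PER_BYTE = 10_000

    def ship_job_to_data(cpu_instructions: float, data_bytes: float) -> bool:
        """True if the job should run where the data lives."""
        return cpu_instructions / data_bytes < GRAY_BREAKEVEN_INSTR_PER_BYTE

    # A selection pass doing ~100 instructions per byte over a petabyte
    # clearly belongs next to the data:
    print(ship_job_to_data(cpu_instructions=1e17, data_bytes=1e15))  # True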

From Optimizing Architecture to Optimizing Organisation
"High-performance computing has moved from being a problem of optimizing the architecture of an individual supercomputer to one of optimizing the organization of large numbers of ordinary computers operating in parallel." Scott Kirkpatrick, Science, Vol. 299, 2003, p. 668 15

It's about Collaboration

Open Source Model
- A world-wide community of people cooperatively developing software
- A software-development analogue of open scientific inquiry
- Users have much greater control over their computing environment
- An attempt to account for the costs of software development honestly
- A new kind of knowledge- and community-building infrastructure. Note: it potentially allows academic specialities, educators, civic organizations, business enterprises and others to develop their own innovative vehicles for sharing and elaborating a common body of knowledge.
- A force reshaping the software industry 17

Some Quotations
"If I have been able to see further, it was only because I stood on the shoulders of giants." Isaac Newton, letter to Robert Hooke
"The real problem has always been, in my opinion, getting people to collaborate on a solution [for a common problem]." David Williams, from 50 Years of Computing at CERN 18

Enabling Grids for E-sciencE: Why Work Together
- Wonderful opportunity: can do things that can't be done alone, and it's more fun!
- Recognising and establishing e-dreams: combine our creativity, ingenuity and resources
- Challenge so hard we can't go it alone: building a broad user community is hard; building e-infrastructure is hard; deploying e-infrastructure is hard; sustaining e-infrastructure is very hard
- Realising our potential: multipurpose & multidiscipline infrastructure; amortise costs; international competition and collaboration
Source: Malcolm Atkinson, INFSO-RI-508833, 2nd EGEE Conference, Den Haag, 23rd November 2004 19

Enabling Grids for E-sciencE: Rules of Engagement
- A foundation of honesty: admit what we don't understand; admit the limits of our software; admit the limits of our support
- Realism: working together is worthwhile, and so is competition: choose! But they both take time to show a profit: persist
- Openness and mutual respect: stand shoulder-to-shoulder to meet challenges; be prepared to adapt your ideas and plans
- Commitment: make strong / explicit commitments or say no; honour the commitments you make
Source: Malcolm Atkinson, INFSO-RI-508833, 2nd EGEE Conference, Den Haag, 23rd November 2004 20

Grid: it's really about collaboration! It's about sharing and building a vision for the future. And it's about getting connected. It's about the democratization of science. It takes advantage of Open Source! Source: Vicky White

Success on a Worldwide Scale. If we can bring together people from all over the world (whether they be physicists, biologists, computer scientists, climate researchers or ...) and they want to be part of building the cyberinfrastructure, or Grid environments, or e-science environments for the future; actively participate; and get benefit from the collaboration; then we will be succeeding. Source: Vicky White

IT Holy Grail

IT Historical Perspective: 1960, 1970, 1980, 1990, 2000
[Timeline figure] The result of 40 years of technology evolution: complex, multiple systems and processes; 200 billion lines of legacy code on 30,000 mainframes worldwide; 40-60 billion lines of code needing modernization over the next five years.
Source: The CIO of the Future, Changing the Dialogue, 25 Oct 2005

Notes from EDS slide. The convergence of business & IT agendas is under way, but there's no common language, and their infrastructures can't support the business. Nearly everyone has accumulated a legacy environment because of mergers and acquisitions, reorganizations, and decisions to centralize then decentralize and vice versa. The result is IT sprawl: the unplanned, uncoordinated legacy systems and processes which create rigid environments incapable of supporting the needs of today and tomorrow. Globalization will continue... changing competitive landscapes... Where, how, when and by whom work is performed will change... No single company can do it alone: business ecosystems will dominate. 25

The Fourth Wave of IT Evolution
[Chart: grid computing as the 4th wave of IT evolution] Source: Insight Reports, Global Information Inc. 26

e-business, e-science and the Grid. e-business captures an emerging view of corporations as dynamic virtual organizations linking employees, customers and stakeholders across the world; the growing use of outsourcing is one example. e-science is the analogous vision for scientific research, with international participation in large accelerators, satellites or distributed gene analyses. The Grid integrates the best of the Web, traditional enterprise software, high-performance computing and peer-to-peer systems to provide the information technology infrastructure for e-moreorlessanything. A deluge of data of unprecedented and inevitable size must be managed and understood; people, computers, data and instruments must be linked; on-demand assignment of experts, computers, networks and storage resources must be supported. Source: G. Fox

Is the Timing Right? Does PNC have the capacity to do it?

EGEE/LCG-2 Grid Sites: September 2005
[Map legend: country providing resources; country anticipating joining]
EGEE/LCG-2 grid: 160 sites, 36 countries, >15,000 processors, ~5 PB storage
Other national & regional grids: ~60 sites, ~6,000 processors

Open Science Grid: OSG Production. 46 CEs, 15,459 CPUs, 6 SEs. http://osg-cat.grid.iu.edu/ October 25, 2005, 4th EGEE Conference: Dane Skow 30

Enabling Grids for E-sciencE: gLite Services for Release 1 (clusters: JRA3, CERN, UK, IT/CZ)
- Access Services: Grid Access Service, API
- Security Services: Authorization, Authentication, Auditing
- Information & Monitoring Services: Information & Monitoring, Application Monitoring
- Data Services: Metadata Catalog, File & Replica Catalog, Storage Element, Data Management
- Job Management Services: Accounting, Job Provenance, Package Manager, Computing Element, Workload Management
- Site Proxy
INFSO-RI-508833 ISGC 2005 31

Open Science Grid: OSG Services, Release 0.2-0.4
- Helper Services: Configuration & Installation, Agreement Service, Bandwidth Allocation & Reservation Service, Dynamic Connectivity
- Security Services: Authorization, Authentication, Auditing
- Information & Monitoring Services: Information & Monitoring, Network Monitoring, Job Monitoring, Service Discovery
- Data Services: Metadata Catalog, File & Replica Catalog, Data Movement, Storage Element
- Job Management Services: Accounting, Job Provenance, Package Manager, Computing Element, Workload Management
- VO Service (TBD)
(The diagram key distinguished OSG-specific from EGEE-compatible components.) October 25, 2005, 4th EGEE Conference: Dane Skow 32

Enabling Grids for E-sciencE: EGEE-II Mission
- Manage and operate a production Grid infrastructure for the European Research Area
- Interoperate with e-infrastructure projects around the globe
- Contribute to Grid standardisation efforts
- Support applications deployed from diverse scientific communities: High-Energy Physics, Biomedicine, Earth Sciences, Astrophysics, Computational Chemistry, Fusion, Geophysics (supporting the industrial application EGEODE), Finance, Multimedia, ...
- Reinforce links with the full spectrum of interested industrial partners
- Disseminate knowledge about the Grid through training
- Prepare for a permanent/sustainable European Grid Infrastructure (in a GÉANT2-like manner)
INFSO-RI-508833 Bob Jones, 4th EGEE conference, Pisa, 24th October 2005 33

Open Science Grid: Who is using OSG? The Virtual Organizations
- High Energy and Nuclear Physics: CMS, ATLAS, STAR, DZero, CDF, Fermilab
- Physics and Astronomy: LIGO, SDSS, Auger, DES
- Biology: fMRI, GADU, GRASE, GLOW
- Engineering: GRASE, GLOW
- Computer Science: iVDGL, GLOW
User support is provided entirely by the VOs. October 25, 2005, 4th EGEE Conference: Dane Skow 34

Asia Pacific Resource Centers: BEIJING-LCG2, LCG_KNU, TOKYO-LCG2, PAKGRID-LCG2, TIFR-LCG2, Taiwan-LCG2, Taiwan-IPAS-LCG2, TW-NCUHEP, GOG-Singapore 35

Resources from Regional Centers
[Table flattened in extraction; exact per-site alignment is not fully recoverable. Sites: Taiwan-LCG2, Taiwan-IPAS-LCG2, Taiwan-NCUHEP, Taiwan-NTU_HEP, GOG-Singapore, LCG-KNU, PAKGRID-LCG2, NCP-LCG2, TIFR-LCG2, BEIJING-LCG2, Tokyo-LCG2, Australia, New Zealand. Columns: #CPU (from 2 up to 400; Taiwan-LCG2, the Tier-1 Center, also dedicates 32+ CPUs to OSG), Disk (from 0.05 up to 50 TB), and supported VOs (combinations of Dteam, Alice, Atlas, CMS and BioMed). Noted roles include an ATLAS Federated Tier-2, CMS Tier-2 sites, an OSG/Grid3 regional center, Pakistan and India Tier-2 sites, a NorduGrid site, and ATLAS Tier-2 capacity at Melbourne, U. of Auckland and U. of Canterbury.] 36

APROC Website (www.twgrid.org/aproc) 37

Taiwan Tier-1 in CERN Courier: Academia Sinica drives e-science in Asia-Pacific
The Academia Sinica Grid Computing Centre (ASGC) in Taipei is currently the only LCG Tier-1 Centre in the Asia-Pacific area, with 400 KSI2K computing capacity, 50 TB disk space and a 35 TB tape library dedicated to the LCG. Since 2004, Academia Sinica has provided the services of a regional operation centre (ROC): site monitoring, virtual-organization (VO) support, middleware deployment, certificate authority (CA) and global Grid user support (GGUS, mainly first-line support and FAQs). The centre supports not only Tier-2 sites in Taiwan, but also Grid operations in South Korea, Singapore and other Asia-Pacific countries that are not supported by other Tier-1 sites.
To support service and data challenges, a maximum 1.6 Gbit/s transmission rate was achieved on the 2 Gbit network bandwidth between CERN and Taiwan in June 2005. During the CMS service challenge, ASGC received 20 TB of data from CERN at an average rate of 56 Mbit/s from 14 July to 14 August. The ASGC Tier-1 Centre provided 12% of the LCG-2 computing jobs in the ATLAS data challenge in 2004, second only to CERN's 14%. Academia Sinica will work closely with Tokyo University and other Tier-2 sites in this region for the ATLAS and CMS service challenges in the near future.
ASGC is engaging in collaboration and sharing of information by taking advantage of e-science applications in the Asia-Pacific area. ASGC is also working with different partners to help form and support application-driven e-science communities in the Asia-Pacific region, to improve the next-generation research infrastructure and build up e-science applications. It has hosted the International Symposium on Grid Computing (ISGC) since ...
[Photo caption: Grid tutorial at the Academia Sinica Grid Computing Centre.]
CERN Computer Newsletter, September-October 2005, p. 6 38
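As a consistency check on the quoted figures (taking 14 July to 14 August as 31 days and 20 TB as 20x10^12 bytes):

\[
\frac{20\times10^{12}\ \mathrm{B}\times 8\ \mathrm{bit/B}}{31\times 86400\ \mathrm{s}} \approx 6.0\times10^{7}\ \mathrm{bit/s} \approx 60\ \mathrm{Mbit/s},
\]

consistent with the reported 56 Mbit/s average once protocol overhead and downtime are allowed for.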

Conclusion
- Critical mass decides which Grid technology/system will prevail; Collaboration, Data and Complexity Reduction are the main themes
- We are about to witness a Data Deluge in all disciplines of e-science
- Unprecedented ways to collaborate on a day-to-day basis will change the sociology of academic life, the ecosystem of the business world, and eventually everyone in society
- We have digital scholars, librarians, content owners...
- Some capacity from the Asia Tier-1 Centre in Taipei...
- John Taylor now talks about e-Research: Collaboratories and Curation, at the APAC meeting
- Could we build the PNC e-neighbourhood together? 39