Connecting the e-infrastructure chain
Internet2 Spring Meeting, Arlington, April 23rd, 2012
Peter Hinrich & Migiel de Vos

Topics
- About SURFnet
- Motivation: Big data & collaboration
- Collaboration Environment (OpenConext)
- Bandwidth on Demand (lightpaths)

About SURFnet
Development and operation of:
- the Dutch National Network for Higher Education and Research
- innovative ICT platforms & services
By and for the Dutch Higher Education and Research community:
- 160 connected organisations, 1 million users
- 100% owned by the SURF foundation (stichting SURF)
- Combines the demand of connected institutions
- Not for profit, 85 employees

Developments in research
More and more data: Digital Deluge
- More data
- Large scale research
- Collaboration in VOs
- Shared resources
- Dependence on ICT

Connecting the e-Infrastructure Chain

Researchers are knocking on our door
- Radio Astronomy Pulsar Research
- Climate Modeling for Scientists and Decision Makers
- CineGRID 4K+ Video Distribution Testbed
- Centralized Imaging for Large Scale Population Imaging Studies
- Jungle Computing and Multi-Model Multi-Kernel Simulations
- Next Generation DNA Sequencing

Focus Areas
- Hybrid end-to-end network: the basis for all collaboration, providing efficient, unlimited data transport.
- Pioneering collaboration environment that reaches beyond existing boundaries and seamlessly integrates the services and tools provided by a large number of suppliers.

Use Case 1: Radio Astronomy

Use Case 2: Life Sciences
Genome of the Netherlands
- DNA reads of 750 individuals, 300 TB of data
- Creating a reference genome
- Looking at variations between individuals to find causes of disease
- Hospitals generate the data and do the first analysis

NGS: The Challenge

Collaboration across boundaries
- Data storage, computing, analysis, visualization: sharing
- Collaboration in virtual teams

Virtual (research) Organizations
- Aimed at organizing community, providing funding, infrastructure etc.
  - Not virtual, long term, ICT awareness
  - Netherlands Bioinformatics Centre
- Aimed at actually doing research
  - Ad hoc collaboration, little or no ICT awareness, across institutes
  - Genome of The Netherlands
  - Metabolomics
  - Proteomics...

Infrastructure Orchestration by Virtual Organization
[Diagram; labels: Research Data Storage, Collaboration Portal, DNA Sequencer, Bandwidth on Demand, Virtual Collaboration]

OpenConext
- Federated identity management: SAML
- Group management: Grouper
- Social Network Portal technology: OpenSocial
- Collaboration tools & services
OpenConext components available as open source: http://www.openconext.org

Functional Components
[Diagram; labels: Supporting Services (SURFfederation, SURFteams, OpenSocial), License and contract management, Campus Services, External Services; annotation: warning, challenges ahead]

Traditional Organisations
[Diagram; labels: Supporting Services (SURFfederatie, SURFteams, OpenSocial), Apps.Erasmus, Apps.Groningen, Apps.Leiden]

Virtual Organisations
[Diagram: Netherlands BioInformatics Centre (NBIC); labels: My Experiment, PubMed, Grid res., Publisher, groups of N=6, N=10, N=30, Guests N=20, NBIC Group N=66, Supporting Services (SURFfederatie, SURFteams, OpenSocial), Virtual IdP, Apps.NBIC.nl, BoD Network service]

BoD via OpenConext
Fusion of Bandwidth on Demand and Virtual Organizations - ESCC/Internet2 JT 2012, Baton Rouge, LA

BoD authorization
- Development of new GUI
- In collaboration with BoD user team (scientific users)
- Based on OpenConext
[Diagram; team labels: NOC, ICT, GoNL, Climate]

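To illustrate what authorization based on OpenConext team membership could look like, here is a minimal sketch. It is not the SURFnet implementation and does not use any OpenConext API. The team names echo the labels on the slide; the users, port names, and the rule itself (both endpoints must belong to one of the user's teams, with NOC members allowed on any known port) are assumptions made for illustration only.

```python
from dataclasses import dataclass

# Hypothetical data: team names come from the slide, everything else is
# invented for this example.
TEAM_MEMBERS = {
    "GoNL": {"alice", "bob"},
    "Climate": {"carol"},
    "NOC": {"dave"},
}
PORT_OWNER = {
    "port-a": "GoNL",
    "port-b": "GoNL",
    "port-c": "Climate",
}


@dataclass(frozen=True)
class Reservation:
    user: str
    source_port: str
    destination_port: str


def teams_of(user):
    """Return the set of teams the user is a member of."""
    return {team for team, members in TEAM_MEMBERS.items() if user in members}


def is_authorized(request):
    """Assumed rule: both ports must be known, and the user must be in the
    NOC team or in the team(s) that own both ports."""
    owners = {
        PORT_OWNER.get(request.source_port),
        PORT_OWNER.get(request.destination_port),
    }
    if None in owners:  # at least one port is unknown
        return False
    teams = teams_of(request.user)
    return "NOC" in teams or owners <= teams


print(is_authorized(Reservation("alice", "port-a", "port-b")))
print(is_authorized(Reservation("alice", "port-a", "port-c")))
```

In a real deployment the membership data would come from the group management service rather than from a hard-coded dictionary; the point here is only that the BoD request is checked against team membership instead of per-user access lists.
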
Connecting the e-infrastructure chain: Part 2

Overview
Migiel de Vos
Network services @ SURFnet
- Network & Lightpaths
- e-infrastructure & NetherLight
- Four e-infrastructure projects

Infrastructure Orchestration
[Diagram; labels: Services, Collaboration Portal, Virtual Organisation, Resources, Bandwidth on Demand, Virtual Collaboration]

SURFnet network

SURFLightpaths
Lightpath: dedicated, protected capacity over provider backbone and beyond
- Path between A and B
- Fixed or OnDemand
Currently
- SDH-based
- 150 Mbps up to 100 Gbps
Near future
- Carrier Ethernet
- Lightpath: E-LINE
- New possibilities: E-TREE and E-LAN
- 1 Mbps up to 100 Gbps (and more in the future)
- Quick setup

NetherLight: Open Lightpath Exchange
- Lightpath Exchange operated by SURFnet
- Located at SARA, Amsterdam Science Park
- http://www.netherlight.net

Open Exchanges
Policy is Open
- Allows innovative and independent architectures
- Anyone can become a connector
- Any cross-connect can be made
Transparently interconnect
- SDH
- Other Open Exchanges
- NRENs
- Cross Border Fibers
- Trans-Atlantic links
- Commercial Service Providers
- Aggregation Networks
- Ethernet

NetherLight connectivity

GLIF

e-infrastructure & NetherLight - Figure (Gerben)

E-Infrastructure Projects
SIP Pilot
- Windesheim University of Applied Sciences
  - Request for a reliable SIP trunk
- OneXS
  - Commercial SIP provider with presence at two locations
  - Able to provide a reliable SIP trunk
- NetherLight
  - Ethernet access point for Service Customer and Service Provider

SIP Pilot Network

Future climate change
- Several institutions with limited model output
  - Different numerical climate models
  - Looking for coordinated simulations
- In the Netherlands
  - The Royal Netherlands Meteorological Institute
  - Wageningen University
- Standardised database (CMIP5 format)
  - Located at the British Atmospheric Data Centre, UK
  - Will become the main climate model output database
  - Volumes are currently too large to obtain via regular connections

Climate Change Network

Next Generation Sequencing
- Research
  - Several research projects in the Netherlands
  - E.g. Genome of the Netherlands (GoNL)
- Ten Dutch participating organisations
  - Nine research institutions with High Performance Computing clusters (provided by SARA/Big Grid)
  - SARA (non-profit) supporting researchers with a large HPC cluster and petabytes of storage
- DNA sequence data providers
  - Seven institutions with small DNA sequencers
  - Two large data providers

Next Generation Sequencing
- Interconnect Big Grid infrastructure
- Collaboration of 10 institutes
- Using the Life Science Grid
[Diagram; labels: RUG, AMC, SARA, TU Delft, WUR, LUMC, ErasmusMC, VU, Hubrecht, UMCN]

Next Generation Sequencing
Gather data from large NGS facilities
- Transport data by hard disk
  - On average 100 hard disks of 1 TB per three months
  - High latency
  - High throughput
  - Low sustainability
- Transport data by lightpaths
  - Continuing flow
  - 1GbE (future 10GbE?) -> more and more bandwidth
  - OnDemand Lightpath enabled when required

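The trade-off between shipping hard disks and using a lightpath can be made concrete with some back-of-the-envelope arithmetic. The sketch below is not from the original talk. It assumes decimal units (1 TB = 10^12 bytes), a 90-day quarter, and a lightpath that is fully used with no protocol overhead; the function names are illustrative.

```python
# Rough comparison of disk shipments and lightpath transfers.
# Assumptions (not from the slides): 1 TB = 10**12 bytes, a quarter is
# 90 days, and the link is fully used without protocol overhead.
TB_BITS = 8 * 10**12
QUARTER_SECONDS = 90 * 24 * 3600


def average_rate_mbps(terabytes, seconds):
    """Average rate in Mbit/s needed to move `terabytes` in `seconds`."""
    return terabytes * TB_BITS / seconds / 1e6


def transfer_days(terabytes, link_gbps):
    """Days needed to move `terabytes` over a link of `link_gbps` Gbit/s."""
    return terabytes * TB_BITS / (link_gbps * 1e9) / 86400


volume_tb = 100  # 100 hard disks of 1 TB each, per three months
print(f"average rate: {average_rate_mbps(volume_tb, QUARTER_SECONDS):.0f} Mbit/s")
for gbps in (1, 10):
    print(f"{gbps}GbE: {transfer_days(volume_tb, gbps):.1f} days")
```

Under these assumptions, 100 TB per quarter averages out to roughly 103 Mbit/s, and the same volume would take about 9.3 days over a dedicated 1GbE lightpath or under a day over 10GbE. This is consistent with the slide's point that a 1GbE lightpath can replace the disk shipments, with OnDemand setup covering the periods when data actually needs to move.
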
Next Generation Sequencing solution
- Interconnect locations via OnDemand Lightpaths
  - Currently 8 of 10 locations in the Netherlands
- Via NetherLight connectivity to big NGS facilities
  - Complete Genomics (CA, USA)
  - Investigating BGI (Hong Kong, China)
- SARA/BigGrid responsible for Layer 3 setup

Genomics Network

Jungle Computing
- Many diverse resources available
  - Clusters
  - Clouds
  - Grids
- Concurrent use of multiple resources unavoidable
  - Scalability
  - Data distribution
  - Software heterogeneity
  - Ad-hoc hardware availability
- Ibis to simplify programming and deployment
  - Generalized programming framework for all Jungle Computing applications
  - Automatically maps any application activity (task) onto any appropriate executor (hardware)

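The idea of mapping any task onto any appropriate executor can be sketched in a few lines. The example below is a toy illustration only: it does not use or reproduce the Ibis API, and the class names, the capability sets, and the least-loaded placement rule are all assumptions chosen to keep the sketch small.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Executor:
    name: str
    kind: str  # "cluster", "cloud", or "grid"
    capabilities: frozenset


@dataclass(frozen=True)
class Task:
    name: str
    requirements: frozenset


def map_tasks(tasks, executors):
    """Place each task on the least-loaded executor that offers all of its
    requirements; ties go to the executor listed first. Tasks that no
    executor can run are mapped to None."""
    load = {e.name: 0 for e in executors}
    placement = {}
    for task in tasks:
        candidates = [e for e in executors if task.requirements <= e.capabilities]
        if not candidates:
            placement[task.name] = None
            continue
        chosen = min(candidates, key=lambda e: load[e.name])
        load[chosen.name] += 1
        placement[task.name] = chosen.name
    return placement


# Hypothetical resources and tasks, invented for this example.
executors = [
    Executor("cluster-1", "cluster", frozenset({"cpu", "gpu"})),
    Executor("cloud-1", "cloud", frozenset({"cpu"})),
    Executor("grid-1", "grid", frozenset({"cpu", "storage"})),
]
tasks = [
    Task("render", frozenset({"gpu"})),
    Task("parse", frozenset({"cpu"})),
    Task("archive", frozenset({"cpu", "storage"})),
    Task("simulate", frozenset({"fpga"})),
]
placement = map_tasks(tasks, executors)
print(placement)
```

A real framework also has to deal with the issues listed on the slide, such as data distribution and hardware that appears and disappears ad hoc; this sketch covers only the matching step.
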
Jungle Computing & e-infrastructure
[Diagram; labels: Collaboration Portal, Services, Virtual Organisation, Jungle Computing, Resources, Bandwidth on Demand, Virtual Collaboration]

Thank you!
Peter Hinrich, peter.hinrich@surfnet.nl
Migiel de Vos, migiel.devos@surfnet.nl