The Canadian CyberSKA Project

Similar documents
CyberSKA: Project Update. Cameron Kiddle, CyberSKA Technical Coordinator

Case Study: CyberSKA - A Collaborative Platform for Data Intensive Radio Astronomy

Designing the Future Data Management Environment for [Radio] Astronomy. JJ Kavelaars Canadian Astronomy Data Centre

SKA Regional Centre Activities in Australasia

The Computation and Data Needs of Canadian Astronomy

Data Centres in the Virtual Observatory Age

Some Reflections on Advanced Geocomputations and the Data Deluge

EarthCube and Cyberinfrastructure for the Earth Sciences: Lessons and Perspective from OpenTopography

Security analytics: From data to action Visual and analytical approaches to detecting modern adversaries

Conference The Data Challenges of the LHC. Reda Tafirout, TRIUMF

SPARC 2 Consultations January-February 2016

Clouds: An Opportunity for Scientific Applications?

CANARIE: Providing Essential Digital Infrastructure for Canada

Deliverable D3.5 Harmonised e-authentication architecture in collaboration with STORK platform (M40) ATTPS. Achieving The Trust Paradigm Shift

Sri Lanka THE JOURNEY OF TOWARDS A CREATIVE KNOWLEDGE BASED ECONOMY

Regardless of the size of your organisation, effective collaboration and communication via a well executed intranet will:

Microsoft SharePoint 2010 The business collaboration platform for the Enterprise and the Web. We have a new pie!

Harnessing Grid Resources to Enable the Dynamic Analysis of Large Astronomy Datasets

A VO-friendly, Community-based Authorization Framework

Mining Massive Data Sets With CANFAR and Skytree. Nicholas M. Ball Canadian Astronomy Data Centre National Research Council Victoria, BC, Canada

UBIQUITIOUS, RESILIENT, SECURE CONNECTIVITY IN THE NEAR-PEER THREAT ENVIRONMENT

Sensor Web when sensor networks meet the World-Wide Web

Computational issues for HI

Connecting the e-infrastructure chain

The Square Kilometre Array. Miles Deegan Project Manager, Science Data Processor & Telescope Manager

The What, Why, Who and How of Where: Building a Portal for Geospatial Data. Alan Darnell Director, Scholars Portal

The DataBridge: A Social Network for Long Tail Science Data!

GN3plus External Advisory Committee. White Paper on the Structure of GÉANT Research & Development

Orange Smart Cities. the ICT partner for innovators in the urban space

Response to Wood Buffalo Wildfire KPMG Report. Alberta Municipal Affairs

Microsoft SharePoint Server 2013 Plan, Configure & Manage

General Data Protection Regulation (GDPR) and the Implications for IT Service Management

Giovanni Lamanna LAPP - Laboratoire d'annecy-le-vieux de Physique des Particules, Université de Savoie, CNRS/IN2P3, Annecy-le-Vieux, France

The Regional Internet Registries

Position Description. Engagement Manager UNCLASSIFIED. Outreach & Engagement Information Assurance and Cyber Security Directorate.

Mail. Having your mail stored in the cloud means you can access it just about anywhere on just about any device with an internet connection.

EGI: Linking digital resources across Eastern Europe for European science and innovation

IOT DEVICE MANAGEMENT: SECURE AND SCALABLE DEPLOYMENTS WITH DIGI REMOTE MANAGER

The Materials Data Facility

BEYOND AUTHENTICATION IDENTITY AND ACCESS MANAGEMENT FOR THE MODERN ENTERPRISE

The Cambridge Bio-Medical-Cloud An OpenStack platform for medical analytics and biomedical research

Transforming IT: From Silos To Services

New research on Key Technologies of unstructured data cloud storage

The Virtual Observatory and the IVOA

The New Collaboration Experience. Yves Torjman BDM

einfrastructures Concertation Event

Update from the RIPE NCC

A VOSpace deployment: interoperability and integration in big infrastructure

CSD3 The Cambridge Service for Data Driven Discovery. A New National HPC Service for Data Intensive science

Developing Enterprise Cloud Solutions with Azure

A global open source ecosystem for power systems

VOCAL (Virtual Online Collaboration at Liverpool) is the University of Liverpool's collaborative working environment.

Familiar Simple Easy Safe. New (Delve, Sway) Different Rich Engaging Potential for innovation. Late Majority 34% 2.5% Innovators. Early Majority 34%

Overview. A fact sheet from Feb 2015

Best of SharePoint Sites and Communities

Data Center Management and Automation Strategic Briefing

Science-as-a-Service

Introduction to Grid Computing

2017 Resource Allocations Competition Results

Smart Cities & The 4th Industrial Revolution

ASKAP Data Flow ASKAP & MWA Archives Meeting

the steps that IS Services should take to ensure that this document is aligned with the SNH s KIMS and SNH s Change Requirement;

EGEE and Interoperation

Disclaimer This presentation may contain product features that are currently under development. This overview of new technology represents no commitme

Acurian on. The Role of Technology in Patient Recruitment

Adaptive selfcalibration for Allen Telescope Array imaging

Bridging The Gap Between Industry And Academia

Technical Brief SUPPORTPOINT TECHNICAL BRIEF MARCH

The iplant Data Commons

Advanced Solutions of Microsoft SharePoint Server 2013

TRUSTED IT: REDEFINE SOCIAL, MOBILE & CLOUD INFRASTRUCTURE. John McDonald

Library 2.0 for Librarians

The Portal Aspect of the LSST Science Platform. Gregory Dubois-Felsmann Caltech/IPAC. LSST2017 August 16, 2017

Cisco Unified Computing System Delivering on Cisco's Unified Computing Vision

Data Intensive processing with irods and the middleware CiGri for the Whisper project Xavier Briand

Abstract. The Challenges. ESG Lab Review InterSystems IRIS Data Platform: A Unified, Efficient Data Platform for Fast Business Insight

National Cybersecurity Center of Excellence

e-infrastructures in FP7 INFO DAY - Paris

Powering Knowledge Discovery. Insights from big data with Linguamatics I2E

Presence and IM: Reduce Distractions and Increase Productivity

Commissioner Ian Dyson SRO, National Enabling Programmes IMORCC

Accelerate your Software Delivery Lifecycle with IBM Development and Test Environment Services

EMC ACADEMIC ALLIANCE

Global Response Centre (GRC) & CIRT Lite. Regional Cyber security Forum 2009, Hyderabad, India 23 rd to 25 th September 2009

The Evolution of Public-Private Partnerships in Canada

Beyond the Border: A Shared Vision for Perimeter Security and Economic Competitiveness

Designing High-Performance Data Structures for MongoDB

by Cisco Intercloud Fabric and the Cisco

SharePoint SP380: SharePoint Training for Power Users (Site Owners and Site Collection Administrators)

DataONE: Open Persistent Access to Earth Observational Data

Research Data Management & Preservation: A Library Perspective

Microsoft SharePoint End User level 1 course content (3-day)

THE NATIONAL DATA SERVICE(S) & NDS CONSORTIUM A Call to Action for Accelerating Discovery Through Data Services we can Build Ed Seidel

SIEM Solutions from McAfee

Government of Canada IPv6 Adoption Strategy. IEEE International Conference on Communications (ICC 12) June 14 th, 2012

Outline. Infrastructure and operations architecture. Operations. Services Monitoring and management tools

McClelland Report. John McClelland CBE, June 2011

Video Conferencing & Skype for Business: Your Need-to-Know Guide

Cisco Prime Central for HCS Assurance

Data Virtualization Implementation Methodology and Best Practices

Transcription:

The Canadian CyberSKA Project A. G. Willis (on behalf of the CyberSKA Project Team) National Research Council of Canada Herzberg Institute of Astrophysics Dominion Radio Astrophysical Observatory May 24, 2011 Aveiro Portugal 2011 1 / 37

The CyberSKA Project Team Aveiro Portugal 2011 2 / 37

Outline of Talk SKA Overview GALFACTS - example of new-style survey CyberSKA CANARIE CyberSKA Requirements CyberSKA Solutions Social Networking Visualization Data Management 3rd Party Applications Next Steps Aveiro Portugal 2011 3 / 37

SKA Science Goals Aveiro Portugal 2011 4 / 37

SKA Technical Requirements Aveiro Portugal 2011 5 / 37

Obligatory SKA Site Simulation Who s driving that vehicle? RTS? PED? Aveiro Portugal 2011 6 / 37

Motivation for CyberSKA Most SKA key science goals will be achieved via large-scale survey type observing programs Very high data rates and volumes Complex multi-purpose processing and analysis Executed by globally distributed teams of researchers Drives the need for cyber-infrastructure solutions for Collaboration tools Data storage, management and distribution methods Distributed data processing, analysis and visualization Aveiro Portugal 2011 7 / 37

Radio Imaging Survey Data Rates Aveiro Portugal 2011 8 / 37

GALFACTS: The G-ALFA Continuum Transit Survey GALFACTS - example of a new-style survey with data that you cannot reduce on your laptop GALFACTS Data Rate 7 beams, 2 bands, 4 Stokes, 4098 channels per band gives 460 MB / sec 6.5 hrs per night gives 10.5 TB Near real-time processing at Arecibo high time resolution, low spectral resolution (HTLS) 1.5 TB / day low time resolution, high spectral resolution (LTHS) 53 GB / day these data sets transferred to University of Calgary For 28 night observing session HTLS 40 TB LTHS 1.5 TB Total observing time for project - 1800 hours correlator produces 2.9 PB 250 TB transferred to Calgary Aveiro Portugal 2011 9 / 37

GALFACTS Beam Pattern Aveiro Portugal 2011 10 / 37

GALFACTS Scan Pattern Aveiro Portugal 2011 11 / 37

GALFACTS Data processing Pipeline Aveiro Portugal 2011 12 / 37

CyberSKA Overview An initiative to develop a scalable and distributed infrastructure platform to meet evolving science needs of the SKA Led by the University of Calgary (Russ Taylor - project lead) with several partner institutions (currently) from North America Canadian funding for CyberSka provided by CANARIE as part of their Network Enabled Platforms (NEP) program, and Cybera NEP funding two Canadian Astronomy-related programs CyberSKA - led by University of Calgary CANFAR (Canadian Advanced Network for Astronomical Research) - led by University of Victoria Cybera - Alberta Cyberinfrastructure for Innovation Start by establishing cyberinfrastructure to support current large-scale astrophysical data needs generated by GALFACTS, PALFA and other high data volume SKA Pathfinder projects. Aveiro Portugal 2011 13 / 37

CANARIE CANARIE - Canada s Advanced Research and Innovation Network 98% of CANARIE s funding goes toward improving the effectiveness of research in Canada Network capacity improvements and new services Programs to simplify researcher access Support for provincial partner networks major funding of its programs and activities provided by the Government of Canada Annual cost about 25 million dollars Underpins $3.5 billion spent per year on research in Canadian universities and government labs 10 billion bits per second across the core network 100 billion bits per second in key corridors Aveiro Portugal 2011 14 / 37

Network-enabled Platforms (NEP) This program provides funding for the ICT infrastructure needs of each research community and provides for the development of such things as: Web portals aggregating large data sets Sophisticated software tools for modelling and visualization Sophisticated software tools enabling collaboration Goals Accelerate development and implementation of research platforms Facilitate collaboration Increase International Connectedness 20 NEP research domains including Transportation, High Energy Physics, Ocean Science, Space Science, Health Science Aveiro Portugal 2011 15 / 37

CANARIE and CyberSKA Sites Aveiro Portugal 2011 16 / 37

CyberSKA Experience/Background Leverage knowledge and experience of the Grid Research Centre at the University of Calgary, IBM, and a large technical team Adapt, customize and extend technologies used by GeoChronos (http://geochromos.org) - another CANARIE NEP funded project Platform developed by the Grid Research Centre Enables Earth observation scientists to access and share data and applications and collaborate more effectively. Employs social networking, cloud computing and data management technologies Make use of other existing tools and technologies where possible Aveiro Portugal 2011 17 / 37

Requirements for CyberSKA Platform Distributed and transparent Scalable Provide transparent access to distributed data, computing resources and services Must scale to support increasing data and processing needs Deployable Different sites should be able to deploy developed tools and participate in CyberSKA relatively easily. Heterogeneous Automated Provide a framework to enable interaction with different types of data, computing resources and services and to add/execute different processing algorithms and workflows. Automation and dynamic reconfiguration of services and data workflows in response to user demand, changing user objectives, available data and resource availability Aveiro Portugal 2011 18 / 37

Requirements II Web-enabled Web-based platform that users can access from anywhere with Internet access Collaborative Enable international/distributed teams to collaborate and communicate effectively Interactive Enable on-line interactive visualization of data Auditable Be able to track where data has come from and processes applied to it (data provenance) Interoperable Compliant with existing standards such as the Virtual Observatory (VOE) Aveiro Portugal 2011 19 / 37

System Context Model Aveiro Portugal 2011 20 / 37

System Context Model II Radio Telescopes (Arecibo, EVLA, ASKAP, SKA) Raw telescope data, monitoring data, control messages and commands Owner - Telescope providers Remote CyberSKA Sites Raw and processed data transferred between sites, user access, virtual machines, system services, collaboration services Owner - Cyber SKA community Other Data Providers Content not defined yet. CyberSKA will provide a series of APIs and utilities to allow for integration of other data providers Owner - various sources Web Services Method calls to execute defined services Owner - CyberSKA community Aveiro Portugal 2011 21 / 37

System Context Model III technical and administrative staff Applications, services, documents, Web pages, profiles, discussions, messages, publications, events and many other resources Owner - Cyber SKA community Domain scientists (astronomers, physicists) Raw and processed data, documents, Web pages, profiles, discussions, messages, publications, events, and many other resources Owner - individual researchers and teams Third Party Applications / Services Links and interfaces to tools and applications provided outside of the standard CyberSKA site. Applications may be hosted outside of CyberSKA site or may be hosted on CyberSKA resources. These applications are maintained and managed separately from CyberSKA regardless of where they are stored. Owner - various sources Educators, students and general public Information, crowd sourcing (identification of pulsars and extragalactic radio sources) Owner - individuals and schools Aveiro Portugal 2011 22 / 37

High Level Architecture Aveiro Portugal 2011 23 / 37

High Level Architecture II The core of CyberSKA is cloud based. Virtual machines are created and removed based on user and application needs A site may also have high performance computing or other specialized services that are not as well suited to vitalization. Collaboration and social networking are deployed outside of the core CyberSKA sites. This allows greater flexibility and ease in adding new sites while providing a single portal to access all of CyberSKA Access to the CyberSKA data and functionality is primarily through the web services layer. A common services definition allows new sites to join CyberSKA relatively easily while providing a common experience to all users Aveiro Portugal 2011 24 / 37

Solution - Use Social Networking Can enhance collaboration capabilities around data and applications Facebook for Scientists Facebook analogy Platform dealing with large scale in terms of users, data and applications more than 500 millions users, of whom about 50% log on to Facebook on any given day more than 30 billion pieces of content shared each month more than 550 thousand applications on Facebook platform Aveiro Portugal 2011 25 / 37

Solutions - Collaboration Portal built on top of the Elgg open source social networking platform Provides many facebook-like features including tags, bookmarks, profiles, blogs, wikis, contacts, groups, document sharing, discussions, messaging, calendars, status, activity feeds Aveiro Portugal 2011 26 / 37

Collaboration Aveiro Portugal 2011 27 / 37

Solutions - Visualization On-line visualization of multi-dimensional FITS files Supports interactive panning and zooming, histogram correction, colour map adjustments, display of pixel data value, region statistics, multiple coordinate systems, grids, selection of frame for multi-dimensional images, 2D Gaussian fitting, permalink, screenshots Aveiro Portugal 2011 28 / 37

Visualization Aveiro Portugal 2011 29 / 37

Visualization Aveiro Portugal 2011 30 / 37

Solutions - Data Access/download data for selected parameters and region of interest Requested data generated in virtualized Condor pool on server side Aveiro Portugal 2011 31 / 37

Solutions - Data II Distributed data management service Built on irods (Integrated Rule-Oriented Data System) Used PostgreSQL database for image metadata (spatial, temporal, and spectral queries supported) Supports mosaicing, plane extraction, compression and staging of images returned by query Details in talk by Venkat Mahadevan Aveiro Portugal 2011 32 / 37

Solutions - Applications API for integrating third party / remotely hosted applications Single sign-on to applications enabled using OAuth Aveiro Portugal 2011 33 / 37

CyberSKA Portal Usage 140+ members from around the world 20+ groups - GALFACTS, PALFA, EVLA, GMRT, CASA Users, etc Aveiro Portugal 2011 34 / 37

Next Steps Infrastructure Set up cloud computing environments and key services at each site Collaboration Refinement and development of collaboration features based on user feedback Data Management Expansion of distributed data management system to other sites Better integration of data management system with other CyberSka tools and services Visualization Provide server side support and improve scalability Aveiro Portugal 2011 35 / 37

Next Steps II Data Processing Establish dynamic batch-based processing and interactive service environments on cloud platform Establish framework for adding and integrating different processing algorithms and workflows Applications Extension of third-party application API to enable two-way interaction between portal and applications (i.e. pull data/information from portal, push news feeds to portal based on application activities) Aveiro Portugal 2011 36 / 37

Contact Information and Acknowledgements Portal: http://www.cyberska.org e-mail: info@cyberska.org Acknowledgements Russ Taylor - project Principal Investigator Cameron Kiddle - technical coordinator Olivier Eymere - IT Architect (IBM) CyberSKA project team Aveiro Portugal 2011 37 / 37