Designing the Future Data Management Environment for [Radio] Astronomy. JJ Kavelaars Canadian Astronomy Data Centre



Started working in radio data archiving as a graduate student at Queen's in 1993.

Canadian Astronomy Data Centre (CADC)
- Data archive housing collections from multiple (10+) telescopes; currently houses about 2 PB of telescope archive data.
- Development group working on standards for data archive systems via the International Virtual Observatory Alliance (IVOA), including:
  - Common Archive Observation Model (CAOM) and ObsCore
  - VOSpace: open storage system protocol
  - SIAP: Simple Image Access Protocol
  - Database access: Table Access Protocol (TAP) + Astronomical Data Query Language (ADQL)
  - Group Management Service
  - Authentication and access
- Leads development and support for the Canadian Advanced Network For Astronomical Research (CANFAR): cloud computing in astronomy.
- Research astronomers investigating dark energy, quasars, galaxy evolution, stellar atmospheres, the trans-Neptunian region, and machine learning in image and spectroscopy classification.
- No one doing radio astronomy... but we are branching out.
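To make the TAP + ADQL item concrete, here is a minimal sketch of how a client might build a synchronous TAP query against an ObsCore table. The base URL is a placeholder (an assumption, not a CADC endpoint), the helper name `tap_sync_url` is hypothetical, and the cone-search position is illustrative. The sketch only builds the request URL and does not contact any service.

```python
from urllib.parse import urlencode

# Hypothetical TAP base URL (assumption); substitute a real service endpoint.
TAP_BASE = "https://example.org/tap"


def tap_sync_url(base, adql):
    """Build a synchronous TAP query URL carrying an ADQL query string."""
    params = {"REQUEST": "doQuery", "LANG": "ADQL", "QUERY": adql}
    return base.rstrip("/") + "/sync?" + urlencode(params)


# ADQL cone search on the standard ObsCore table: observations whose
# position lies within 0.1 deg of (RA, Dec) = (180.0, 2.0) degrees.
ADQL = (
    "SELECT TOP 10 obs_id, s_ra, s_dec FROM ivoa.ObsCore "
    "WHERE CONTAINS(POINT('ICRS', s_ra, s_dec), "
    "CIRCLE('ICRS', 180.0, 2.0, 0.1)) = 1"
)

url = tap_sync_url(TAP_BASE, ADQL)
print(url)
```

Because the query is plain ADQL over a shared data model, the same string should in principle work against any archive that exposes ObsCore through TAP, which is the point of adopting the common standards.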

Example IVOA Standard used at MWA
siap.xml: a standard SIAP interface, as defined by the IVOA, to access collections of celestial images. SIAP clients use http://gleam-vo.icrar.org/gleam_postage/q/siap.xml? to access the service.
[Figure: VOTable Output Fields]
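A SIAP client call can be sketched as follows, assuming the service accepts the SIAP version 1 query parameters `POS` (RA,Dec in decimal degrees), `SIZE` (degrees) and `FORMAT`. The service URL is the one quoted on the slide; the helper name `siap_query_url` and the sky position are illustrative assumptions. The sketch only constructs the request URL, and the service's reply would be the VOTable described on the slide.

```python
from urllib.parse import urlencode

# Service URL as quoted on the slide (the trailing "?" is part of it).
SIAP_BASE = "http://gleam-vo.icrar.org/gleam_postage/q/siap.xml?"


def siap_query_url(base, ra_deg, dec_deg, size_deg, fmt="image/fits"):
    """Build a SIAP v1 query URL for images overlapping a sky region."""
    params = {
        "POS": f"{ra_deg},{dec_deg}",   # region centre, decimal degrees
        "SIZE": f"{size_deg}",          # region size, degrees
        "FORMAT": fmt,                  # requested image format
    }
    return base + urlencode(params)


# Illustrative position only (assumption): RA 201.365 deg, Dec -43.019 deg.
url = siap_query_url(SIAP_BASE, 201.365, -43.019, 1.0)
print(url)
```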

JCMT Data Delivery Location since 2006

The current data problem for our users
- Multiple PBs of data stored in archives and research repositories.
- Many research projects are growing in the volume and complexity of the data analysis required.
- Entire CADC archive delivered to users annually, over the internet.
- Research partnerships are growing in size and becoming increasingly geographically distributed.
- Visualization of complex datasets is not obvious or straightforward.
- Required resources are not always available near the expert.
- Data transfer overheads often exceed computation duration.
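A back-of-the-envelope sketch of the transfer problem follows. It assumes the "entire archive" is the roughly 2 PB quoted earlier (decimal petabytes), which implies a sustained average of about 0.5 Gbit/s if delivered over one year. The job size, link speed and compute time in the second part are hypothetical numbers chosen only to show how transfer can dominate.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def mean_rate_bits_per_s(total_bytes, seconds):
    """Average bit rate needed to move total_bytes in the given time."""
    return total_bytes * 8 / seconds


def transfer_seconds(total_bytes, link_bits_per_s):
    """Ideal transfer time over a link, ignoring protocol overheads."""
    return total_bytes * 8 / link_bits_per_s


# Assumption: archive volume of 2 PB (2e15 bytes) delivered once per year.
ARCHIVE_BYTES = 2e15
archive_rate = mean_rate_bits_per_s(ARCHIVE_BYTES, SECONDS_PER_YEAR)

# Hypothetical job: 10 TB of input over a 1 Gbit/s link, 6 hours of compute.
transfer_h = transfer_seconds(10e12, 1e9) / 3600
compute_h = 6.0

print(f"mean archive delivery rate: {archive_rate / 1e6:.0f} Mbit/s")
print(f"transfer {transfer_h:.1f} h vs compute {compute_h:.1f} h")
```

Under these assumed numbers the data movement takes several times longer than the computation itself, which is the case for moving the computation to the data instead.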


The data centre landscape
- CADC operates an archive platform.
- Platform is used by multiple facilities.
- System architecture is facility (and energy) agnostic.
- Hardware costs shared: NRC, Compute Canada and CANARIE.
- CANFAR: science collaboration and data processing/storage platform operated by CADC.
- Platform attempts to make generic access to storage and computing straightforward.
- Acts as a focus point for observational astronomy resource allocation on Compute Canada.
- Similar platforms exist / are emerging in other disciplines.
- CFI / CANARIE / NRC fund(ed) the development.

CANFAR: Cloud storage and processing for astronomy
- The computing and data are in the same computing centre: keep the data local to the compute.
- The cloud abstracts your access to the data centre.
- When you run your computing on the cloud, either the jobs are moved to the computing centre that holds the data, or the computing requests the data from the local storage. This should be transparent to users.
- CADC/CANFAR is currently distributed across three sites in British Columbia.
- About 1000 cores in a cloud/VM system with about 4 PB of user and archive storage.
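The "move the job to the data" idea can be sketched as a small placement rule. This is an illustration only, not CANFAR's scheduler: the function `choose_site`, the site names, the dataset names and the core counts are all hypothetical.

```python
def choose_site(dataset, holdings, free_cores, cores_needed):
    """Pick a site for a job, preferring sites that already hold the data.

    Returns (site, needs_transfer). If no site has enough free cores,
    returns (None, False).
    """
    # First choice: a site that holds the dataset and has capacity.
    local = [
        site for site, datasets in holdings.items()
        if dataset in datasets and free_cores.get(site, 0) >= cores_needed
    ]
    if local:
        return max(local, key=lambda s: free_cores[s]), False

    # Fallback: any site with capacity; the data must then be fetched.
    remote = [s for s, n in free_cores.items() if n >= cores_needed]
    if remote:
        return max(remote, key=lambda s: free_cores[s]), True

    return None, False


# Hypothetical three-site deployment (names and numbers are made up).
holdings = {"site_a": {"jcmt"}, "site_b": {"survey_x"}, "site_c": set()}
free_cores = {"site_a": 100, "site_b": 0, "site_c": 400}

local_choice = choose_site("jcmt", holdings, free_cores, 50)
remote_choice = choose_site("survey_x", holdings, free_cores, 50)
no_choice = choose_site("jcmt", holdings, free_cores, 1000)
```

The user submits the same job in each case. Only the placement decision, and whether a transfer is triggered, differs, which is what "transparent to users" means here.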

www.canfar.net

[Figure: CADC - CANFAR; diagram labels: NRC, CC, CC]

[Figure: Federation Layer; diagram labels: DC1, DC2, SC3]

CFI Innovation Fund
What CADC/CANFAR will provide (straight from proposal).
Long-term archiving and service: We require enhanced access to and long-term archiving of Science Ready Data Products (SRDPs). This item, to be provided by CADC, includes:
1. Indefinite access to the SRDPs,
2. Customizations to the CANFAR cloud computing platform to meet project requirements for running the Processing Software Stack,
3. Custom design for a query interface for SRDPs in the CADC data portal, and
4. Long-term maintenance of interfaces and data formats to preserve the results of the project indefinitely.

CFI Cyber Infrastructure Fund - PROPOSAL
Improved systems for CANFAR users, including:
- New interactive data processing environments
- Database systems that support large astronomical catalogs provided by users / science teams
- Data processing workflows for astronomy
- Improved batch processing and data system
- Data exploration environment
- Cross-correlation
Developed and designed in coordination with specific projects but made general enough for wide use in astronomy.

Thank you
JJ Kavelaars
Canadian Astronomy Data Centre
Tel: 250-363-8694
JJ.Kavelaars@nrc-cnrc.gc.ca
www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca