SKA Computing and Software

SKA Computing and Software. Nick Rees, 18 May 2016.

Summary: Introduction; System overview; Computing elements of the SKA (Telescope Manager, Low Frequency Aperture Array, Central Signal Processor, Science Data Processor); Conclusions.

Introduction: The SKA is a software telescope, very flexible and potentially easy to reconfigure, which makes it a major software and computing challenge. The computing challenges are huge: the Science Data Processor needs ~100 PetaFLOPS of delivered processing, while the current top of the Top 500, Tianhe-2, offers ~50 PetaFLOPS peak. The software challenges are also huge: SDP development is baselined at ~100 FTE for 6 years. This talk is based on the current status after the recent PDRs; designs will evolve between now and CDR.

SKA System Overview: [system diagram] Data rates of 8.8 Tbit/s, 7.2 Tbit/s and ~5 Tbit/s between the telescope elements and the processors; processing loads of ~50 PFLOPS in the Central Signal Processor and ~100 PFLOPS in the Science Data Processor.

Regional Centre Network: [world map] Guesstimate of Regional Centre locations, with 10-year IRU costs per 100 Gbit/s circuit (2020-2030) ranging from ~US$0.1M/year to US$1-3M/year per link.

Telescope Manager

Telescope Manager Overview

Telescope Management

Low Frequency Aperture Array

Low Frequency Aperture Array Overview: [block diagram] SKALA antenna arrays send RF over fibre to Tile Processing Modules (TPMs) housed in screened Central and Remote Processing Facilities; each TPM performs optical/electrical conversion, ADC sampling, spectral filtering and beamforming. MCCS (Monitor, Control and Calibration Servers) connect to the Telescope Manager, and a COTS data network carries the beamformed data to the Correlator.

Tile Processing Modules: [board diagram] WDM optical inputs, analogue conditioning, and FPGA/digital processing sections.

Rack arrangement, PSU and cooling: Double-sided racks, water cooled using the building circulation. 64 TPMs per rack, 128 racks in the system. Power, clocks and timing are circulated within the rack. [rack layout: power distribution, clock distribution, cooling pipes, fans, TPMs and a 40 Gb switch]

Central Signal Processor (CSP)

Central Signal Processor Overview: Control and Monitoring; Correlator/Beamformer; Pulsar Search; Pulsar Timing.

Correlator Beam Former (CBF). Correlator: channelises the signal from every dish/aperture-array station into fine frequency channels (65k), cross-correlates all channels for every pair of dishes/stations, and passes the cross-correlations ("visibilities") to the SDP for imaging. Beamformer: forms multiple beams within the dish/station beam (1500 beams for Mid and 750 for Low) and passes data to the Pulsar Search/Timing engines and the VLBI interface. Very large amounts of real-time processing: N_corr ~ B (N_dish log2(N_ch) + N_dish^2) ~ PetaMAC/s, and N_BF ~ B N_dish N_beam ~ a few PetaMAC/s. Based on custom FPGA processing platforms.
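
As a sanity check on these scalings, the sketch below plugs illustrative values into the two formulas; B, N_dish, N_ch and N_beam are my own assumed parameters (roughly SKA1-Mid scale), not figures taken from the slide.

```python
# Order-of-magnitude check of the CBF processing-load formulas.
# The parameter values are illustrative assumptions, not slide figures.
from math import log2

B      = 5e9      # sampled bandwidth per input [samples/s] (assumed)
N_dish = 197      # dishes/stations (assumed: SKA1-Mid incl. MeerKAT)
N_ch   = 65536    # fine frequency channels ("65k" on the slide)
N_beam = 1500     # beams formed (Mid figure from the slide)

# Correlator: channelisation (~log2(N_ch) ops/sample) plus cross-multiplication
# of every pair of inputs (~N_dish^2), per polarisation product.
N_corr = B * (N_dish * log2(N_ch) + N_dish**2)

# Beamformer: each beam is a weighted sum over all inputs at full bandwidth.
N_BF = B * N_dish * N_beam

print(f"Correlator : {N_corr/1e15:.2f} PetaMAC/s (x4 more for full polarisation)")
print(f"Beamformer : {N_BF/1e15:.2f} PetaMAC/s")
```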

Pulsar Search: The general processing pipeline requires ~50 PFLOPS. A heterogeneous design is baselined to achieve the best combination of hardware, software and firmware, with two beams per compute node in the current design: 250 server nodes in Australia and 750 in South Africa, on dual-redundant 10 and 1 Gigabit networks. Each node (1000 in total) has low-power CPUs, GPUs, FPGA boards, 10 Gigabit inputs, and >1 TByte of RAM and/or SSDs.
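
For a sense of scale, the snippet below divides the quoted pipeline load evenly over the quoted node count; the even split is my assumption, since the slide does not give a per-node budget.

```python
# Rough per-node budget implied by the slide's figures (simple division,
# ignoring any split between the Low and Mid pulsar-search engines).
total_pflops = 50          # ~50 PFLOPS for the pulsar-search pipeline
nodes        = 250 + 750   # Australia + South Africa server nodes

per_node_tflops = total_pflops * 1e3 / nodes
print(f"~{per_node_tflops:.0f} TFLOPS per node")   # ~50 TFLOPS
```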

Science Data Processor

Science Data Processor Overview

Imaging Processing Model: [data-flow diagram] Visibilities from the correlator pass through UV processors (RFI excision and phase rotation) into a UV data store. In each major cycle the current sky model is subtracted from the visibilities using the current calibration model, the UV data are gridded (e.g. with W-projection), the gridded data are imaged, and the image is deconvolved (the minor cycle) to update the current sky model. The imaging processors also solve for the telescope and image-plane calibration model and update it, producing astronomical-quality data.
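
The loop below is a schematic sketch of that major/minor-cycle flow. All of the step functions are trivial placeholders so that the control flow runs; none of them are SDP APIs or real implementations.

```python
# Schematic sketch of the major/minor-cycle imaging loop described above.
# Every step function is a placeholder that just passes data through.
def subtract_model(vis, sky, cal):   return vis        # placeholder
def excise_rfi_and_rotate(vis):      return vis        # placeholder
def grid_visibilities(vis):          return vis        # placeholder (e.g. W-projection)
def image(uv_grid):                  return uv_grid    # placeholder (FFT + corrections)
def deconvolve(img):                 return img        # placeholder minor cycle (e.g. CLEAN)
def update_sky_model(sky, comps):    return sky + [comps]
def solve_calibration(vis, sky):     return {"gains": None}   # placeholder

def imaging_pipeline(visibilities, sky_model, cal_model, n_major=3):
    """Each major cycle re-subtracts the current sky model, re-grids, re-images,
    deconvolves (minor cycle) and updates the sky and calibration models."""
    for _ in range(n_major):
        vis       = subtract_model(visibilities, sky_model, cal_model)
        vis       = excise_rfi_and_rotate(vis)
        uv_grid   = grid_visibilities(vis)
        dirty     = image(uv_grid)
        comps     = deconvolve(dirty)
        sky_model = update_sky_model(sky_model, comps)
        cal_model = solve_calibration(vis, sky_model)
    return sky_model, cal_model

sky, cal = imaging_pipeline(visibilities=[], sky_model=[], cal_model={})
```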

SDP Data Flow Approach: Next Generation Data Science Engine?

Computing Limitations: Arithmetic intensity ρ = total floating-point operations / total DRAM bytes moved. The principal algorithms required by the SDP (gridding and FFT) typically have ρ ≈ 0.5, whereas typical accelerators need ρ ≈ 5-7 to run at full speed. For example, the NVIDIA Pascal GPU architecture has a memory bandwidth of 720 GB/s and a floating-point throughput of ~5,000 GFLOPS, so ρ = 5000/720 ≈ 7. Hence the computational efficiency is ≈ 0.5/7, of order 10%. So, because of the bandwidth requirements, we have to buy ~10x more computing than a pure HPC system would require, unless the vendors improve memory bandwidth.
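
The same argument as a short roofline-style calculation, using only the numbers quoted on the slide; the exact ratio comes out at ~7% efficiency (a ~14x factor), which the slide rounds to ~10% and ~10x.

```python
# Roofline-style estimate behind the "buy ~10x more computing" argument.
mem_bw_GBs  = 720.0    # GB/s   (NVIDIA Pascal, from the slide)
peak_gflops = 5000.0   # GFLOPS (NVIDIA Pascal, from the slide)
rho_hw  = peak_gflops / mem_bw_GBs   # FLOP/byte needed to saturate the GPU (~7)
rho_sdp = 0.5                        # FLOP/byte of gridding/FFT (from the slide)

# Bandwidth-bound regime: achievable FLOPS = memory bandwidth x arithmetic intensity
achievable_gflops = mem_bw_GBs * rho_sdp
efficiency = achievable_gflops / peak_gflops

print(f"hardware balance rho ~ {rho_hw:.1f} FLOP/byte")
print(f"achievable: {achievable_gflops:.0f} GFLOPS -> efficiency ~ {efficiency:.0%}")
print(f"over-provisioning factor ~ {1/efficiency:.0f}x")
```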

Computing Requirements: ~100 PetaFLOPS total sustained; ~200 PetaByte/s aggregate bandwidth to fast working memory; ~50 PetaByte of fast working storage; ~1 TeraByte/s sustained write to storage; ~10 TeraByte/s sustained read from storage; ~10,000 FLOPs per byte read from storage. The current proposed power cap is ~5 MW per site.
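
The ~10,000 FLOPs-per-byte figure appears to be the ratio of the sustained compute to the sustained read rate; the cross-check below is my reading, as the slide does not spell out the derivation.

```python
# Cross-check of the requirements list: sustained compute divided by
# sustained read bandwidth from storage.
flops_sustained = 100e15   # ~100 PetaFLOPS
read_bw         = 10e12    # ~10 TeraByte/s sustained read from storage

print(f"{flops_sustained / read_bw:,.0f} FLOPs per byte read")   # 10,000
```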

Data Management Challenges: All Top500 HPC systems have been designed for High Performance Computing (by definition). IDC has proposed a new term, HPDA (High Performance Data Analytics), to reflect systems like the SKA. We must ensure the data is available when and where it is needed: CPUs must not sit idle waiting for data to arrive, and data must be in fast cache when it is needed. We need a framework that supports this, and are looking at a variety of possible prototypes.

Addressing Power: We need to achieve a FLOPS/Watt figure an order of magnitude better than the current greenest computer. This needs a three-pronged approach: algorithms (pursue innovative approaches to cut processing times), hardware (look at accelerators, hosts, networks and storage) and testing (using real algorithms and fully instrumented systems).
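
For scale, the back-of-envelope below combines the ~100 PFLOPS / ~5 MW figures with the ~10% efficiency estimate from the earlier slide; the comparison and the efficiency carry-over are my own, not statements from this slide.

```python
# Rough power-efficiency target implied by earlier slides.
delivered_flops = 100e15   # ~100 PFLOPS sustained (SDP)
power_w         = 5e6      # ~5 MW site power cap
efficiency      = 0.10     # ~10% of peak, from the bandwidth-bound estimate

delivered_per_watt = delivered_flops / power_w
peak_per_watt      = delivered_per_watt / efficiency

print(f"delivered: {delivered_per_watt/1e9:.0f} GFLOPS/W")            # ~20
print(f"implied peak capability: {peak_per_watt/1e9:.0f} GFLOPS/W")   # ~200
# For comparison, the 2016 Green500 leaders were around 6-7 GFLOPS/W (Linpack).
```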

Software: A budget of >100M of manpower for software development across the whole telescope. This is not a task for academic programmers: we need professional practices for development, testing, integration and deployment, and we need to unify the processes across the world-wide team of developers. We need world-leading expertise in a number of areas. The delivered system will not be static: SDP hardware and software will be updated periodically. A key input for development will be the scientific and software community, through the regional centres.

Conclusions: The SKA is a huge computational challenge: CSP ~50 PFLOPS at 5 MW and SDP ~100 PFLOPS at 5 MW, compared with Tianhe-2 at ~30 PFLOPS and 40 MW. Traditional HPC is not a good match because the problem is bandwidth dominated. The SKA is seen as a key programme in global IT development, showcasing the major development area of High Performance Data Analytics (HPDA); power is also a major driver. The software complexity is also beyond what has previously been achieved in astronomy.