OSKAR: Simulating data from the SKA

1 OSKAR: Simulating data from the SKA. Oxford e-Research Centre, 4 June 2014. Fred Dulwich, Ben Mort, Stef Salvini.

2 Overview. Simulating interferometer data for the SKA: radio interferometry basics; measurement equation basics; structure of OSKAR; experiences moving from Fermi to Kepler GPU architecture; some recent simulation results.

3 Radio interferometry. VLA ( ). One-Mile Telescope (1964): first to use Earth-rotation aperture synthesis.

4 Comparison with optical system. A traditional optical telescope records an image of the sky formed by a lens (or mirror). [Diagram: sky; EM radiation from the sky; lens; image plane of lens.]

5 Comparison with optical system. A radio interferometer samples the wavefront in the Fourier domain; image formation is done electronically. [Diagram: sky; EM radiation from the sky; array of detectors; processing; image formed by Fourier transform.]

6 Aperture arrays as stations. Omni-directional antennas measure voltage signals from the whole sky. Spatial filtering (electronic beamforming) isolates a direction of interest. Advantages: cost effective at low frequency; no moving parts; fast scanning; multi-beaming capability. Disadvantages: sparse at high frequency; relatively high sidelobe levels; continually variable beam shape; continually variable beam polarisation.
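
Electronic beamforming can be illustrated with a small sketch. It assumes a one-dimensional row of isotropic antennas and simple phase-only weights; the function name `station_beam`, the antenna spacing and the angles are hypothetical choices for illustration, not OSKAR's implementation.

```python
import cmath
import math

SPEED_OF_LIGHT = 299792458.0  # metres per second


def station_beam(positions_m, freq_hz, steer_rad, look_rad):
    """Normalised beam response of a 1D array of isotropic antennas.

    Each antenna signal is multiplied by a phase weight that cancels the
    geometric delay for the steering direction, then all signals are summed
    (a direct Fourier sum over antennas).
    """
    wavenumber = 2.0 * math.pi * freq_hz / SPEED_OF_LIGHT
    total = 0j
    for x in positions_m:
        weight = cmath.exp(-1j * wavenumber * x * math.sin(steer_rad))
        signal = cmath.exp(1j * wavenumber * x * math.sin(look_rad))
        total += weight * signal
    return total / len(positions_m)


# Hypothetical station: 8 antennas spaced 1.5 m apart, observed at 100 MHz.
antennas = [1.5 * n for n in range(8)]
on_axis = abs(station_beam(antennas, 100e6, 0.3, 0.3))   # looking where we steer
off_axis = abs(station_beam(antennas, 100e6, 0.3, 0.0))  # looking elsewhere
```

The response is unity in the steered direction and smaller elsewhere; the non-zero off-axis value is a sidelobe of the kind listed among the disadvantages.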

7 Modelling Challenges (1). Aperture arrays (AAs) have complex beam patterns that have to be modelled across the whole sky.

8 Modelling Challenges (2). Science goals demand very high sensitivity. This requires a good understanding of instrumental characteristics and comprehensive models of the sky and telescope. Very large instruments and sky models require HPC. The design of the SKA is not yet finalised, so the simulator has to be flexible.

9 Why simulate the SKA? Imaging performance depends strongly on how the detector elements are arranged. Aperture arrays have unique problems. Assess performance of the evolving system design. Simulations can produce data challenges for pipeline developers. Ideas for the SKA design have changed in recent years: few large stations (11200 elements per station); many small stations (256 elements per station).

10 Measurement Equation formalism. A radio interferometer makes measurements of radiation in the Fourier domain (visibilities) for the true sky after various corruption effects, for example: sky rotation (parallactic angle); ionosphere; antenna pattern and shape of station beam. The Hamaker-Bregman-Sault Measurement Equation of a radio interferometer can be used to simulate measured visibility data. It relies on the concepts of the source coherency matrix and the Jones matrix.

11 Source coherency (brightness) matrix. The source coherency matrix encapsulates source properties. Stokes parameters I, Q, U and V completely describe the average polarisation of radiation from a source. The coherency matrix is defined as a 2x2 complex quantity for each source, s. Using a linear polarisation basis:
$$B_s = \begin{pmatrix} I + Q & U + iV \\ U - iV & I - Q \end{pmatrix}$$
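
As a minimal sketch of this definition, the coherency matrix can be built from the four Stokes parameters. The function name `coherency_matrix` and the nested-tuple layout are illustrative choices, not OSKAR's API, and the sign convention for V (U + iV in the upper-right element) is an assumption.

```python
def coherency_matrix(I, Q, U, V):
    """Return the 2x2 source coherency matrix in a linear polarisation basis.

    Assumed convention: B = [[I + Q, U + iV], [U - iV, I - Q]].
    """
    return (
        (complex(I + Q, 0.0), complex(U, V)),
        (complex(U, -V), complex(I - Q, 0.0)),
    )


# Hypothetical source: 2 Jy total intensity with some polarisation.
B = coherency_matrix(2.0, 0.5, 0.25, 0.125)
# The matrix is Hermitian: B[0][1] is the complex conjugate of B[1][0].
is_hermitian = B[0][1] == B[1][0].conjugate()
# The trace recovers twice the total intensity.
trace = (B[0][0] + B[1][1]).real
```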

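12 Jones matrix. Describes some physical effect on the radiation, for a single source, s, at a single receiving station, i. The Jones matrix is another 2x2 complex quantity. It allows intermixing of polarisations, and modification of the amplitude and phase of the received electromagnetic wave.
$$J_{s,i} = \begin{pmatrix} a_{\mathrm{re}} + i a_{\mathrm{im}} & b_{\mathrm{re}} + i b_{\mathrm{im}} \\ c_{\mathrm{re}} + i c_{\mathrm{im}} & d_{\mathrm{re}} + i d_{\mathrm{im}} \end{pmatrix}$$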

13 Jones matrix and Measurement Equation. Gives modularity and makes complex simulations tractable: Jones matrices can be chained together, which allows us to separate different physical effects. Multiply matrices in the order in which things actually happen:
$$J_{s,i} = X_{s,i} \, Y_{s,i} \, Z_{s,i}$$
The visibility on the baseline between stations i and j for all visible sources (s) is then:
$$V_{i,j} = \sum_s J_{s,i} \, B_s \, J_{s,j}^H$$
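
The chaining of Jones matrices and the sum over sources can be sketched in a few lines. This is an illustration only: the helper names (`matmul`, `hermitian`, `chain`, `visibility`) and the example gain and phase terms are hypothetical, and matrices are stored as nested tuples rather than in OSKAR's own data structures.

```python
def matmul(a, b):
    """Multiply two 2x2 complex matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )


def hermitian(a):
    """Conjugate transpose of a 2x2 matrix."""
    return tuple(tuple(a[c][r].conjugate() for c in range(2)) for r in range(2))


def chain(*jones):
    """Multiply Jones matrices left to right: chain(X, Y, Z) = X Y Z."""
    result = ((1 + 0j, 0j), (0j, 1 + 0j))
    for j in jones:
        result = matmul(result, j)
    return result


def visibility(jones_i, jones_j, brightness):
    """Sum J_{s,i} B_s J_{s,j}^H over all sources s."""
    total = [[0j, 0j], [0j, 0j]]
    for J_i, J_j, B in zip(jones_i, jones_j, brightness):
        term = matmul(matmul(J_i, B), hermitian(J_j))
        for r in range(2):
            for c in range(2):
                total[r][c] += term[r][c]
    return tuple(tuple(row) for row in total)


gain = ((2 + 0j, 0j), (0j, 2 + 0j))   # hypothetical amplitude term
phase = ((1j, 0j), (0j, 1j))          # hypothetical 90-degree phase term
J = chain(gain, phase)                # equals 2i times the identity
B = ((1.5 + 0j, 0j), (0j, 0.5 + 0j))  # hypothetical source with I = 1, Q = 0.5
V = visibility([J], [J], [B])         # same Jones term at both stations
```

With the same Jones term at both ends of the baseline the phase cancels and only the squared gain remains, so `V` is four times `B`.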

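14 A pictorial Measurement Equation. [Diagram: signal path from source to visibility through the terms B, R, Z, E, K.]
$$V_{i,j} = \sum_s K_{s,i} \, E_{s,i} \, Z_{s,i} \, R_{s,i} \, B_s \, R_{s,j}^H \, Z_{s,j}^H \, E_{s,j}^H \, K_{s,j}^H$$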

15 OSKAR overview (1). GPU-enabled software to produce simulated visibilities by direct evaluation of a measurement equation. Currently ~ lines of code, mostly C (some C++). Currently ~40 CUDA kernels/functions. Single or double precision computation available. Balance between highest performance and highest flexibility: problem sizes vary hugely, and simulations need to run on many different systems. Minimize PCIe traffic: copy input sky and telescope models to GPU memory; intermediate data are generated on the GPU and used without transfer to host; the host keeps track of pointers to GPU memory; GPU memory is used effectively as a giant cache.

16 OSKAR overview (2). Each source is independent with respect to all other sources. There are many sources in the sky, so we can trivially parallelise over sources. In general, each GPU thread works on one source. Easily guaranteed threads for any given kernel launch. Most expensive steps: station beam evaluation, for all stations, which is compute limited (DFT); and the cross-correlation step (visibility evaluation per baseline), which is bandwidth limited (Kepler) or register limited (double precision, Fermi).

17 Jones matrix data structure. [Diagram: 2D array of Jones matrices; station i is the slowest-varying index, source s the fastest-varying.]

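18 Jones matrix data structure. [Diagram: 2D array of Jones matrices; station i is the slowest-varying index, source s the fastest-varying.]
$$J_{s,i} = \begin{pmatrix} a_{\mathrm{re}} + i a_{\mathrm{im}} & b_{\mathrm{re}} + i b_{\mathrm{im}} \\ c_{\mathrm{re}} + i c_{\mathrm{im}} & d_{\mathrm{re}} + i d_{\mathrm{im}} \end{pmatrix}$$
OSKAR functions calculate each Jones matrix for each source at each station in GPU memory (used as a scratchpad).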
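
The station-slowest, source-fastest ordering corresponds to a simple flat index. The sketch below assumes a single contiguous array and a hypothetical helper `jones_index`; the (station, source) labels stand in for the actual 2x2 matrices.

```python
def jones_index(station, source, num_sources):
    """Flat index of the Jones matrix for (station, source).

    Source is the fastest-varying index, station the slowest.
    """
    return station * num_sources + source


num_stations, num_sources = 3, 4
# Flat scratchpad: one entry per (station, source) pair, labelled for illustration.
scratch = [None] * (num_stations * num_sources)
for i in range(num_stations):
    for s in range(num_sources):
        scratch[jones_index(i, s, num_sources)] = (i, s)

# All sources for station 1 occupy one contiguous run of the array.
station_1 = scratch[jones_index(1, 0, num_sources):jones_index(2, 0, num_sources)]
```

With one thread per source, neighbouring threads then touch neighbouring entries for a given station, which is the access pattern GPUs generally handle best.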

19 Joining Jones matrices. $J_{s,i} = X_{s,i} \, Y_{s,i}$. [Diagram: two arrays of Jones matrices (station i slowest varying, source s fastest varying) multiplied cell by cell to give a third.] Trivially parallel: each thread does one colour.

20 Forming visibilities (correlator). $V_{i,j} = \sum_s J_{s,i} \, B_s \, J_{s,j}^H$. [Diagram: source s fastest varying; cells from station i (1), B (2) and station j (3).] Exploits the fact that $(XY)^H = Y^H X^H$. Each thread block computes the result for one baseline, or one correlation between two stations, for all sources. Each thread does a subset of sources and accumulates a partial sum into shared memory. The result of the final accumulation goes into global memory.
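
The accumulation scheme for one baseline can be mimicked in serial code. This is a sketch under stated assumptions: the strided assignment of sources to threads, the helper names and the `num_threads` parameter are illustrative, and the loop over `t` stands in for threads that would run concurrently within a thread block.

```python
def matmul(a, b):
    """Multiply two 2x2 complex matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )


def hermitian(a):
    """Conjugate transpose of a 2x2 matrix."""
    return tuple(tuple(a[c][r].conjugate() for c in range(2)) for r in range(2))


def add(a, b):
    """Element-wise sum of two 2x2 matrices."""
    return tuple(tuple(a[r][c] + b[r][c] for c in range(2)) for r in range(2))


def correlate_baseline(jones_i, jones_j, brightness, num_threads=4):
    """Visibility for one baseline, accumulated as per-thread partial sums."""
    zero = ((0j, 0j), (0j, 0j))
    shared = [zero] * num_threads  # stands in for shared memory
    for t in range(num_threads):
        # Thread t handles sources t, t + num_threads, t + 2 * num_threads, ...
        for s in range(t, len(brightness), num_threads):
            term = matmul(matmul(jones_i[s], brightness[s]), hermitian(jones_j[s]))
            shared[t] = add(shared[t], term)
    result = zero
    for partial in shared:  # final accumulation, written to global memory
        result = add(result, partial)
    return result


identity = ((1 + 0j, 0j), (0j, 1 + 0j))
jones = [identity] * 10   # hypothetical: no corruption at either station
sky = [identity] * 10     # hypothetical: ten unpolarised sources with I = 1
V_parallel = correlate_baseline(jones, jones, sky, num_threads=4)
V_serial = correlate_baseline(jones, jones, sky, num_threads=1)
```

Splitting the sources across threads does not change the answer here; in floating point with real data the summation order can alter the last few bits.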

21 Forming visibilities (correlator). $V_{i,j} = \sum_s J_{s,i} \, B_s \, J_{s,j}^H$. [Diagram: source s fastest varying; cells from station i (1), B (2) and station j (3).] Multiply together the numbered cells and accumulate the results. One shared memory location per colour/thread (partial sum). The final step adds the different colours, putting the result into global memory.

22 Forming visibilities (correlator). $V_{i,j} = \sum_s J_{s,i} \, B_s \, J_{s,j}^H$. [Diagram: source s fastest varying; cells from station i (1), B (2) and station j (3).] The next thread block does the same again for another station pair. Why not just use some matrix math library?

23 Forming visibilities (correlator). $V_{i,j} = \sum_s J_{s,i} \, B_s \, J_{s,j}^H$. [Diagram: source s fastest varying; cells from station i (1), B (2) and station j (3).] Not quite the whole story... Non-separable baseline-dependent effects, $f(s, i, j)$, must be modelled here too: smearing terms; extended sources.

24 Fermi to Kepler. Correlate kernel (on compute 3.5 architecture, using CUDA 5.5): 43 registers (single precision); 68 registers (double precision). Must load from global memory: Stokes parameters (4 values per source); direction cosines (3 values per source); extended source parameters (3 values per source); station coordinates (8 values per thread block); Jones matrices (2 x 8 values per source). Computes a rotation matrix, two sinc functions, one exponential, three vector products, and two Jones complex matrix products. Not very operationally dense, but lots of data to store in registers. Global memory load is bandwidth heavy! ($N^2$ reads for $N$ stations.) (The current baseline design makes this worse: 1024 stations!)
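
The load counts above can be turned into a rough estimate of global memory traffic. The sketch assumes one thread block per station pair with autocorrelations excluded, and no reuse of loaded values between thread blocks; the function names are hypothetical.

```python
def baselines(num_stations):
    """Number of distinct station pairs, excluding autocorrelations."""
    return num_stations * (num_stations - 1) // 2


# Values read per source: Stokes parameters, direction cosines,
# extended source parameters, and one Jones matrix for each of two stations.
VALUES_PER_SOURCE = 4 + 3 + 3 + 2 * 8
# Values read once per thread block: station coordinates.
VALUES_PER_BLOCK = 8


def values_loaded(num_stations, num_sources):
    """Total values read from global memory if nothing is cached."""
    per_block = num_sources * VALUES_PER_SOURCE + VALUES_PER_BLOCK
    return baselines(num_stations) * per_block


pairs_1024 = baselines(1024)      # station pairs for 1024 stations
small_case = values_loaded(2, 10)  # one baseline, ten sources
```

The pair count grows as roughly half of $N^2$, so doubling the number of stations roughly quadruples the traffic for a fixed sky model.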

25 Fermi to Kepler. Expecting big performance gains from reduced register pressure.
GPU | Kernel time (double) | Simulation time (double) | Simulation time (single)
M2090 (Emerald) | 9.44 s (ECC off) | 1125 s (ECC on) | 197 s (ECC on)
K20c (Ruby) | 5.02 s | 516 s | 231 s
Speedup | 1.9x | 2.2x | 0.85x (?)

26 Inside Kepler K20 family (slide from NVIDIA GTC 2012).

27 Inside Kepler K20 family (slide from NVIDIA GTC 2012). L1 cache in Kepler is no longer used for global memory loads! The profiler showed that performance was limited by bandwidth to the L2 cache.

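28 Jones matrix data structure. [Diagram: array of Jones matrices with station i slowest varying and source s fastest varying; each complex matrix element is a float2.]
$$J_{s,i} = \begin{pmatrix} a_{\mathrm{re}} + i a_{\mathrm{im}} & b_{\mathrm{re}} + i b_{\mathrm{im}} \\ c_{\mathrm{re}} + i c_{\mathrm{im}} & d_{\mathrm{re}} + i d_{\mathrm{im}} \end{pmatrix}$$
Using const __restrict__ is not enough! The data structure is too complex for the compiler to optimize the load from global memory. Needed four explicit __ldg(float2) or __ldg(double2) instructions to make use of Kepler's read-only data cache.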

29 Fermi to Kepler. Expecting big performance gains from reduced register pressure. Profiler showing >150 GB/s global memory bandwidth on K20c (theoretical max 208 GB/s).
GPU | Kernel time (double) | Simulation time (double) | Simulation time (single) | Simulation time (double) | Simulation time (single)
M2090 (Emerald) | 9.44 s (ECC off) | 1125 s (ECC on) | 197 s (ECC on) | 1125 s (ECC on) | 197 s (ECC on)
K20c (Ruby) | 5.02 s | 516 s | 231 s | 292 s | 124 s
Speedup | 1.9x | 2.2x | 0.85x (?) | 3.9x | 1.6x
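
The quoted speedups follow directly from the timings, as the short check below shows. The dictionary keys and the names `k20c_first` and `k20c_second` are labels chosen here for the two sets of K20c timings in the table; reading the second set as the result of the read-only cache loads described on the previous slide is an assumption.

```python
def speedup(reference_s, new_s):
    """Ratio of the reference time to the new time (larger is faster)."""
    return reference_s / new_s


# Timings in seconds, copied from the table.
m2090 = {"kernel_double": 9.44, "sim_double": 1125.0, "sim_single": 197.0}
k20c_first = {"kernel_double": 5.02, "sim_double": 516.0, "sim_single": 231.0}
k20c_second = {"sim_double": 292.0, "sim_single": 124.0}

first = {key: round(speedup(m2090[key], k20c_first[key]), 2) for key in k20c_first}
second = {key: round(speedup(m2090[key], k20c_second[key]), 2) for key in k20c_second}

# Fraction of the theoretical peak bandwidth reported by the profiler (lower bound).
bandwidth_fraction = 150.0 / 208.0
```

The single-precision run moves from slower than the M2090 to about 1.6 times faster, and the measured bandwidth is more than 72 % of the theoretical peak.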

30 Example study: Modelling the impact of distant interfering sources. AAs have considerable sensitivity to sources outside the primary beam. This is a strong function of frequency: can we image at 600 MHz? The aim is to understand the impact of interfering sources on an AA snapshot observation, using a metric called (far) side-lobe confusion noise. With AA beams the signal from sources outside the field of interest is non-zero. The power from these sources is spread into the main field through their PSF side-lobes. Both the PSF and the beam are functions of frequency and time. This is known as confusion noise: millions of point sources which cannot be individually corrected for. It is an important limit to the imaging performance of AAs. [Diagram: region of interest; side lobes; interfering sources.]

31 AA telescope configuration. [Figure: two layout plots with axes x (East) [metres] and y (North) [metres]: antennas within a station (courtesy N. Razavi); 693 stations (courtesy K. Grainge).]

32 AA station beams. [Figure: station beam patterns at 100 MHz and 600 MHz.]

33 Sky model. The SKA will be more sensitive than any current telescope, so no all-sky models exist with enough sources. Generate a 2M source sky model with the correct statistics extrapolated from the VLSS catalogue (~68k sources). [Figure: Log10 cumulative number count against Log10 flux bin [Jy].]

34 Image of sidelobe confusion noise. [Figure: images of sidelobe confusion noise, panels labelled by frequency (MHz), on sky-coordinate axes.]

35 Interfering (FSC) snapshot noise as a function of frequency. [Figure: FSCN RMS [Jy/beam] against Frequency [MHz].]

36 Summary. Large scale SKA simulations are challenging. GPUs make them possible. Simulations are vital to assess the evolving system design, and to generate semi-realistic data products for tool-chain developers and for data flow testing.


Empirical Parameterization of the Antenna Aperture Illumination Pattern Empirical Parameterization of the Antenna Aperture Illumination Pattern Preshanth Jagannathan UCT/NRAO Collaborators : Sanjay Bhatnagar, Walter Brisken Measurement Equation The measurement equation in

More information

Using a multipoint interferometer to measure the orbital angular momentum of light

Using a multipoint interferometer to measure the orbital angular momentum of light CHAPTER 3 Using a multipoint interferometer to measure the orbital angular momentum of light Recently it was shown that the orbital angular momentum of light can be measured using a multipoint interferometer,

More information

The Future of High Performance Computing

The Future of High Performance Computing The Future of High Performance Computing Randal E. Bryant Carnegie Mellon University http://www.cs.cmu.edu/~bryant Comparing Two Large-Scale Systems Oakridge Titan Google Data Center 2 Monolithic supercomputer

More information

XIV International PhD Workshop OWD 2012, October Optimal structure of face detection algorithm using GPU architecture

XIV International PhD Workshop OWD 2012, October Optimal structure of face detection algorithm using GPU architecture XIV International PhD Workshop OWD 2012, 20 23 October 2012 Optimal structure of face detection algorithm using GPU architecture Dmitry Pertsau, Belarusian State University of Informatics and Radioelectronics

More information

Workhorse ADCP Multi- Directional Wave Gauge Primer

Workhorse ADCP Multi- Directional Wave Gauge Primer Acoustic Doppler Solutions Workhorse ADCP Multi- Directional Wave Gauge Primer Brandon Strong October, 2000 Principles of ADCP Wave Measurement The basic principle behind wave the measurement, is that

More information

FFT Analysis. Document No :SKA-TEL-SDP Context PIP.CAS. Revision: 02. Author.. Stefano Salvini. Release Date:

FFT Analysis. Document No :SKA-TEL-SDP Context PIP.CAS. Revision: 02. Author.. Stefano Salvini. Release Date: FFT Analysis Document number. SKA-TEL-SDP-0000058 Context PIP.CAS Revision.02 Author.. Stefano Salvini Release Date.2015-02-09 Document Classification. Status. Draft Release Date: 2015-02-09 Page 1 of

More information

PARALLEL PROGRAMMING MANY-CORE COMPUTING: THE LOFAR SOFTWARE TELESCOPE (5/5)

PARALLEL PROGRAMMING MANY-CORE COMPUTING: THE LOFAR SOFTWARE TELESCOPE (5/5) PARALLEL PROGRAMMING MANY-CORE COMPUTING: THE LOFAR SOFTWARE TELESCOPE (5/5) Rob van Nieuwpoort Vrije Universiteit Amsterdam & Astron, the Netherlands Institute for Radio Astronomy Why Radio? Credit: NASA/IPAC

More information

Towards a Performance- Portable FFT Library for Heterogeneous Computing

Towards a Performance- Portable FFT Library for Heterogeneous Computing Towards a Performance- Portable FFT Library for Heterogeneous Computing Carlo C. del Mundo*, Wu- chun Feng* *Dept. of ECE, Dept. of CS Virginia Tech Slides Updated: 5/19/2014 Forecast (Problem) AMD Radeon

More information

Visualization & the CASA Viewer

Visualization & the CASA Viewer Visualization & the Viewer Juergen Ott & the team Atacama Large Millimeter/submillimeter Array Expanded Very Large Array Robert C. Byrd Green Bank Telescope Very Long Baseline Array Visualization Goals:

More information

Very fast simulation of nonlinear water waves in very large numerical wave tanks on affordable graphics cards

Very fast simulation of nonlinear water waves in very large numerical wave tanks on affordable graphics cards Very fast simulation of nonlinear water waves in very large numerical wave tanks on affordable graphics cards By Allan P. Engsig-Karup, Morten Gorm Madsen and Stefan L. Glimberg DTU Informatics Workshop

More information

Exploiting GPU Caches in Sparse Matrix Vector Multiplication. Yusuke Nagasaka Tokyo Institute of Technology

Exploiting GPU Caches in Sparse Matrix Vector Multiplication. Yusuke Nagasaka Tokyo Institute of Technology Exploiting GPU Caches in Sparse Matrix Vector Multiplication Yusuke Nagasaka Tokyo Institute of Technology Sparse Matrix Generated by FEM, being as the graph data Often require solving sparse linear equation

More information

Kernel optimizations Launch configuration Global memory throughput Shared memory access Instruction throughput / control flow

Kernel optimizations Launch configuration Global memory throughput Shared memory access Instruction throughput / control flow Fundamental Optimizations (GTC 2010) Paulius Micikevicius NVIDIA Outline Kernel optimizations Launch configuration Global memory throughput Shared memory access Instruction throughput / control flow Optimization

More information

GPU Acceleration of the Longwave Rapid Radiative Transfer Model in WRF using CUDA Fortran. G. Ruetsch, M. Fatica, E. Phillips, N.

GPU Acceleration of the Longwave Rapid Radiative Transfer Model in WRF using CUDA Fortran. G. Ruetsch, M. Fatica, E. Phillips, N. GPU Acceleration of the Longwave Rapid Radiative Transfer Model in WRF using CUDA Fortran G. Ruetsch, M. Fatica, E. Phillips, N. Juffa Outline WRF and RRTM Previous Work CUDA Fortran Features RRTM in CUDA

More information

Embedded Systems. Octav Chipara. Thursday, September 13, 12

Embedded Systems. Octav Chipara. Thursday, September 13, 12 Embedded Systems Octav Chipara Caught between two worlds Embedded systems PC world 2 What are embedded systems? Any device that includes a computer (but you don t think of it as a computer) iphone digital

More information

GTC 2013: DEVELOPMENTS IN GPU-ACCELERATED SPARSE LINEAR ALGEBRA ALGORITHMS. Kyle Spagnoli. Research EM Photonics 3/20/2013

GTC 2013: DEVELOPMENTS IN GPU-ACCELERATED SPARSE LINEAR ALGEBRA ALGORITHMS. Kyle Spagnoli. Research EM Photonics 3/20/2013 GTC 2013: DEVELOPMENTS IN GPU-ACCELERATED SPARSE LINEAR ALGEBRA ALGORITHMS Kyle Spagnoli Research Engineer @ EM Photonics 3/20/2013 INTRODUCTION» Sparse systems» Iterative solvers» High level benchmarks»

More information

Designing for Performance. Patrick Happ Raul Feitosa

Designing for Performance. Patrick Happ Raul Feitosa Designing for Performance Patrick Happ Raul Feitosa Objective In this section we examine the most common approach to assessing processor and computer system performance W. Stallings Designing for Performance

More information

Complexity and Advanced Algorithms. Introduction to Parallel Algorithms

Complexity and Advanced Algorithms. Introduction to Parallel Algorithms Complexity and Advanced Algorithms Introduction to Parallel Algorithms Why Parallel Computing? Save time, resources, memory,... Who is using it? Academia Industry Government Individuals? Two practical

More information

A Modified Algorithm for CLEANing Wide-Field Maps with Extended Structures

A Modified Algorithm for CLEANing Wide-Field Maps with Extended Structures J. Astrophys. Astr. (1990) 11, 311 322 A Modified Algorithm for CLEANing Wide-Field Maps with Extended Structures Κ. S. Dwarakanath, A. A. Deshpande & Ν. Udaya Shankar Raman Research institute, Bangalore

More information

High performance Computing and O&G Challenges

High performance Computing and O&G Challenges High performance Computing and O&G Challenges 2 Seismic exploration challenges High Performance Computing and O&G challenges Worldwide Context Seismic,sub-surface imaging Computing Power needs Accelerating

More information

NRAO VLA Archive Survey

NRAO VLA Archive Survey NRAO VLA Archive Survey Jared H. Crossley, Loránt O. Sjouwerman, Edward B. Fomalont, and Nicole M. Radziwill National Radio Astronomy Observatory, 520 Edgemont Road, Charlottesville, Virginia, USA ABSTRACT

More information

Sky-domain algorithms to reconstruct spatial, spectral and time-variable structure of the sky-brightness distribution

Sky-domain algorithms to reconstruct spatial, spectral and time-variable structure of the sky-brightness distribution Sky-domain algorithms to reconstruct spatial, spectral and time-variable structure of the sky-brightness distribution Urvashi Rau National Radio Astronomy Observatory Socorro, NM, USA Outline : - Overview

More information

SIMULATION AND VISUALIZATION IN THE EDUCATION OF COHERENT OPTICS

SIMULATION AND VISUALIZATION IN THE EDUCATION OF COHERENT OPTICS SIMULATION AND VISUALIZATION IN THE EDUCATION OF COHERENT OPTICS J. KORNIS, P. PACHER Department of Physics Technical University of Budapest H-1111 Budafoki út 8., Hungary e-mail: kornis@phy.bme.hu, pacher@phy.bme.hu

More information

CUDA Optimization: Memory Bandwidth Limited Kernels CUDA Webinar Tim C. Schroeder, HPC Developer Technology Engineer

CUDA Optimization: Memory Bandwidth Limited Kernels CUDA Webinar Tim C. Schroeder, HPC Developer Technology Engineer CUDA Optimization: Memory Bandwidth Limited Kernels CUDA Webinar Tim C. Schroeder, HPC Developer Technology Engineer Outline We ll be focussing on optimizing global memory throughput on Fermi-class GPUs

More information

CUDA. Matthew Joyner, Jeremy Williams

CUDA. Matthew Joyner, Jeremy Williams CUDA Matthew Joyner, Jeremy Williams Agenda What is CUDA? CUDA GPU Architecture CPU/GPU Communication Coding in CUDA Use cases of CUDA Comparison to OpenCL What is CUDA? What is CUDA? CUDA is a parallel

More information

Introduction to Parallel and Distributed Computing. Linh B. Ngo CPSC 3620

Introduction to Parallel and Distributed Computing. Linh B. Ngo CPSC 3620 Introduction to Parallel and Distributed Computing Linh B. Ngo CPSC 3620 Overview: What is Parallel Computing To be run using multiple processors A problem is broken into discrete parts that can be solved

More information

Polarization of Light and Polarizers ABSTRACT

Polarization of Light and Polarizers ABSTRACT Polarization of Light and Polarizers Douglas A. Kerr, P.E. Issue 4 September 23, 2004 ABSTRACT Light is a form of electromagnetic radiation, and has the property of direction of polarization. This article

More information

ASKAP Pipeline processing and simulations. Dr Matthew Whiting ASKAP Computing, CSIRO May 5th, 2010

ASKAP Pipeline processing and simulations. Dr Matthew Whiting ASKAP Computing, CSIRO May 5th, 2010 ASKAP Pipeline processing and simulations Dr Matthew Whiting ASKAP Computing, CSIRO May 5th, 2010 ASKAP Computing Team Members Team members Marsfield: Tim Cornwell, Ben Humphreys, Juan Carlos Guzman, Malte

More information

Applications of Piezo Actuators for Space Instrument Optical Alignment

Applications of Piezo Actuators for Space Instrument Optical Alignment Year 4 University of Birmingham Presentation Applications of Piezo Actuators for Space Instrument Optical Alignment Michelle Louise Antonik 520689 Supervisor: Prof. B. Swinyard Outline of Presentation

More information

Reflection and Refraction of Light

Reflection and Refraction of Light PC1222 Fundamentals of Physics II Reflection and Refraction of Light 1 Objectives Investigate for reflection of rays from a plane surface, the dependence of the angle of reflection on the angle of incidence.

More information

ALMA Antenna responses in CASA imaging

ALMA Antenna responses in CASA imaging ALMA Antenna responses in CASA imaging Dirk Petry (ESO), December 2012 Outline Motivation ALBiUS/ESO work on CASA responses infrastructure and ALMA beam library First test results 1 Motivation ALMA covers

More information

High Performance Computing on GPUs using NVIDIA CUDA

High Performance Computing on GPUs using NVIDIA CUDA High Performance Computing on GPUs using NVIDIA CUDA Slides include some material from GPGPU tutorial at SIGGRAPH2007: http://www.gpgpu.org/s2007 1 Outline Motivation Stream programming Simplified HW and

More information

IRAM mm-interferometry School UV Plane Analysis. IRAM Grenoble

IRAM mm-interferometry School UV Plane Analysis. IRAM Grenoble IRAM mm-interferometry School 2004 1 UV Plane Analysis Frédéric Gueth IRAM Grenoble UV Plane analysis 2 UV Plane analysis The data are now calibrated as best as we can Caution: data are calibrated, but

More information

Exploiting Multiple GPUs in Sparse QR: Regular Numerics with Irregular Data Movement

Exploiting Multiple GPUs in Sparse QR: Regular Numerics with Irregular Data Movement Exploiting Multiple GPUs in Sparse QR: Regular Numerics with Irregular Data Movement Tim Davis (Texas A&M University) with Sanjay Ranka, Mohamed Gadou (University of Florida) Nuri Yeralan (Microsoft) NVIDIA

More information

GPU > CPU. FOR HIGH PERFORMANCE COMPUTING PRESENTATION BY - SADIQ PASHA CHETHANA DILIP

GPU > CPU. FOR HIGH PERFORMANCE COMPUTING PRESENTATION BY - SADIQ PASHA CHETHANA DILIP GPU > CPU. FOR HIGH PERFORMANCE COMPUTING PRESENTATION BY - SADIQ PASHA CHETHANA DILIP INTRODUCTION or With the exponential increase in computational power of todays hardware, the complexity of the problem

More information

PhD Student. Associate Professor, Co-Director, Center for Computational Earth and Environmental Science. Abdulrahman Manea.

PhD Student. Associate Professor, Co-Director, Center for Computational Earth and Environmental Science. Abdulrahman Manea. Abdulrahman Manea PhD Student Hamdi Tchelepi Associate Professor, Co-Director, Center for Computational Earth and Environmental Science Energy Resources Engineering Department School of Earth Sciences

More information