Optimization of HOM Couplers using Time Domain Schemes


Optimization of HOM Couplers using Time Domain Schemes
Workshop on HOM Damping in Superconducting RF Cavities
Carsten Potratz, Universität Rostock (Faculty of Computer Science and Electrical Engineering), October 11, 2010

Overview

- Introduction
- Comparison of selected numerical time domain schemes
- Application example: optimization of the filter characteristics of a preliminary HOM coupler design with SPL dimensions
- Conclusions

Introduction - Numerical Optimization

The optimization loop: a model defined by geometrical parameters is simulated (scattering properties, thermal load, ...); the results are assessed, and an optimization algorithm proposes a new parameter set; the loop repeats until the criteria are met and the design is optimized. Simulation is (in general) the most time-consuming part, so use as few simulations as possible and a suited numerical scheme!
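The loop above can be sketched in a few lines. This is a minimal illustration, not the optimizer used in the talk; `simulate`, `assess`, and `propose` are hypothetical placeholders for the field solver, the criterion check, and the optimization algorithm.

```python
def optimize(params, simulate, assess, propose, max_iter=100):
    """Iterate simulate -> assess -> new parameter set until the criteria are met."""
    for _ in range(max_iter):
        result = simulate(params)        # the (expensive) numerical simulation
        ok, score = assess(result)       # criteria met?
        if ok:
            return params                # optimized design
        params = propose(params, score)  # optimizer suggests a new geometry
    return params
```

Because `simulate` dominates the cost, any speedup of the solver multiplies through the whole loop, which is the motivation for the scheme comparison that follows.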

Time Domain Computation of S-Parameters #1

[Figure: a broadband excitation a(t) at the input port and the responses b(t) at the output ports, plotted over time t.]

Time Domain Computation of S-Parameters #2

[Figure: FFT and normalization of the recorded signals a(t) and b(t) yield the magnitude |S21| in dB (roughly -140 to -20 dB) and the phase arg(S21), plotted over frequency from 1.0 to 3.0 GHz.]
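The FFT-and-normalize step on this slide can be sketched as follows. This is an illustrative sketch, not the code used in the talk; the signal names follow the slide.

```python
import numpy as np

def s_parameter(a_t, b_t, dt):
    """Return frequencies and S21(f) = B(f)/A(f) from sampled a(t), b(t)."""
    A = np.fft.rfft(a_t)               # spectrum of the excitation a(t)
    B = np.fft.rfft(b_t)               # spectrum of the response b(t)
    f = np.fft.rfftfreq(len(a_t), dt)  # frequency axis for the sampling step dt
    return f, B / A                    # normalization by the excitation spectrum

# magnitude in dB and phase, as plotted on the slide:
# s21_db = 20 * np.log10(np.abs(s21)); phase = np.angle(s21)
```

One broadband time-domain run therefore yields the S-parameters over the whole frequency band of the excitation at once.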


Numerical Schemes - Stencil Approach (FDTD/FIT)

Commonly used on regular grids (orthogonality!). Fields/fluxes at a location are computed from the surrounding fields/fluxes via linear operators L1, L2, which gives an explicit update equation for the discrete field vectors (e, h):

\frac{d}{dt} \begin{pmatrix} e \\ h \end{pmatrix} = \begin{pmatrix} L_1(e, h) \\ L_2(e, h) \end{pmatrix}

[Figure: HOM section on a Cartesian grid.]

Explicit, but limited geometric flexibility.
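The explicit stencil update can be illustrated in one spatial dimension. This is a toy leapfrog sketch in normalized units (c = 1, unit cell size), not the scheme or code from the talk; grid size, step count, and the soft source are illustrative.

```python
import numpy as np

def fdtd_1d(n_cells=200, n_steps=300, courant=0.5):
    """1D FDTD leapfrog: each field value is updated from its neighbours only."""
    e = np.zeros(n_cells)      # electric field at integer grid points
    h = np.zeros(n_cells - 1)  # magnetic field at half grid points
    for step in range(n_steps):
        # explicit update: new h from the surrounding e, then new e from h
        h += courant * (e[1:] - e[:-1])
        e[1:-1] += courant * (h[1:] - h[:-1])
        # soft Gaussian source injected at the centre of the grid
        e[n_cells // 2] += np.exp(-((step - 30) / 10.0) ** 2)
    return e, h
```

The update needs no matrix solve, which is exactly the "explicit" property named on the slide; the price is the regular grid, which represents curved HOM coupler geometry only by staircasing.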

FEM Approach

Allows for unstructured grids, hence suited for complex curved structures with a reasonable number of elements/DoFs. The fields are projected onto a finite function space and inner products with test functions are formed (commonly used approach: Galerkin). Problem: this leads to an implicit semi-discrete formulation (unless the problem is sufficiently small and the mass matrices can be inverted):

\frac{d}{dt} \begin{pmatrix} M_1 e \\ M_2 h \end{pmatrix} = S \begin{pmatrix} h \\ e \end{pmatrix}

with global matrices M_1, M_2, S.

[Figure: HOM section discretized with a coarse unstructured tetrahedral grid.]

Geometrical flexibility, but implicit.

Discontinuous Galerkin (DG) FEM Approach #1

Allows for unstructured grids, hence suited for complex curved structures. The support of the basis and test functions is limited to the individual elements; adjacent elements are connected only by boundary fluxes. All matrices are defined element-wise and are therefore small, so matrix inversion is feasible and the scheme is explicit. Parallel by design:

\frac{d}{dt} \begin{pmatrix} e \\ h \end{pmatrix} = M^{-1} S \begin{pmatrix} \varepsilon^{-1} h \\ \mu^{-1} e \end{pmatrix} + M^{-1} F \begin{pmatrix} \varepsilon^{-1}\, \hat{n}_f \times H \\ \mu^{-1}\, \hat{n}_f \times E \end{pmatrix}

[Figure: HOM section discretized (12k elements, second order); element-wise matrices, coupling of adjacent elements.]

Geometrical flexibility and explicit.
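The structure of the explicit DG update can be shown in the simplest possible setting: 1D linear advection with a piecewise-constant basis and an upwind flux. This is a structural sketch only (not Maxwell's equations and not the NUDG code); with constant basis functions the element mass matrix degenerates to a scalar, but the key DG property survives: each element is updated explicitly and is coupled to its neighbours only through boundary fluxes.

```python
import numpy as np

def dg_step(u, dt, dx, speed=1.0):
    """One explicit DG step: per-element volume term plus boundary-flux coupling."""
    flux_in = speed * np.roll(u, 1)   # upwind flux through each left face (periodic)
    m_inv = 1.0 / dx                  # inverse of the (scalar) element mass matrix
    return u + dt * m_inv * (flux_in - speed * u)   # flux in minus flux out

# advect a Gaussian pulse around a periodic domain
n, dx = 100, 1.0 / 100
u = np.exp(-((np.linspace(0, 1, n, endpoint=False) - 0.5) / 0.1) ** 2)
for _ in range(100):
    u = dg_step(u, dt=0.5 * dx, dx=dx)
```

Because each element only reads its own values and its face neighbours, all elements can be updated concurrently, which is why DG maps so well onto GPUs.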

Discontinuous Galerkin (DG) FEM Approach #2

Our S-parameter code is based on the NUDG framework (open source, www.nudg.org), which implements the basic operators, the time integrator, etc. We added: boundary conditions (broadband waveguide excitation and absorption), an improved PML, modal analysis, ..., plus pre- and post-processing. Everything runs on the graphics card (GPU, NVIDIA CUDA based).

[Figure: HOM section discretized (12k elements, second order).]

Why graphics cards?

Why GPUs?

Modern GPUs heavily outperform modern CPUs: up to 1.5 TFlops per GPU (single precision), cheap (about 350 per unit), and highly scalable (multiple GPUs per workstation, or GPU clusters). They are well suited for highly parallel algorithms like discontinuous Galerkin FEM.

[Figure: comparison of theoretical GFLOP/s, GPU vs. CPU (source: NVIDIA).]


Application Example: HOM Coupler with SPL Geometric Properties

[Figure: (preliminary) model of a HOM coupler/beam pipe section with SPL specs: coax (without connector), Port 1, beam pipe, Ports 2 and 3.]

Goal: tuning of the filter characteristics, with a notch effect at 704.4 MHz (the SPL fundamental mode). Which scheme should be used for the optimization?

Filter Characteristics

[Figure: |S21| in dB over frequency (1.0 to 3.0 GHz) for different geometric parameter combinations; selected transmissions from TM01 to TEM, computed with DG-FEM. The region of interest lies below the TM01 cutoff frequency: the interesting things happen below cutoff.]

First Simulation - CST MW Studio

[Figure: |S21| and |S31| in dB over frequency (0.8 to 1.4 GHz), Ports 1-3.]

Transmission of TM01 to TEM; HOM section discretized with 528k mesh cells total (first order). Computational time: 1700 s on 4 cores @ 3.2 GHz.

Then - GPU Accelerated DG-FEM (NUDG, www.nudg.org)

[Figure: |S21| and |S31| in dB over frequency (0.8 to 1.4 GHz), Ports 1-3.]

Transmission of TM01 to TEM; HOM section discretized with 12k elements. Computational time: 400 s on one GPU (GTX 470).

Comparison of Both Schemes (accuracy vs. time)

[Figure: comparison of the computed TM01-to-TEM transmissions over 0.8 to 1.4 GHz; four graphs that match very well.]

The results of the well-established commercial code (MWS) match the DG-FEM computed S-parameters very well, at 400 s (GPU) versus 1700 s (4 CPU cores). That is a speedup by a factor of 4, so DG-FEM is used for the further optimization.

Interlude - Further Speedup?

MWS with more cores? 1700 s (4 CPU cores) only drops to 1610 s (8 CPU cores); the performance is limited by memory bandwidth, not by the arithmetic operation count.

DG-FEM:
- Local timestepping: implementation in development (expected gain 2 to 3x)
- Multiple GPUs: coming soon on a GPU cluster

Together this reduces the optimization time by at least one order of magnitude.

Tuning Example #1: Lowering the Hook in the Beam Pipe

[Figure: |S21| and |S31| in dB over frequency (0.8 to 1.4 GHz) for different hook positions.]

Improved coupling, but detuning!

Tuning Example #2: Rotation Around the Axis

[Figure: |S21| and |S31| in dB over frequency (0.8 to 1.4 GHz) for rotation angles of 0, 45, 90, 135, and 180 degrees.]

Tuning Example #3: Scaling of the Hook's Radius r

[Figure: |S21| and |S31| in dB over frequency (0.8 to 1.4 GHz) for different hook radii r.]

Conclusion

Systematic optimization (i.e., extensive parameter sweeps) is required for HOM coupler design, so a suited numerical scheme is essential. Based on a (preliminary) application geometry: the S-parameters computed with DG-FEM are in very good agreement with a well-established code (MWS), and the computation time (and thus the optimization time) can be reduced significantly by GPU-accelerated DG-FEM.