Breaking Through the Barriers to GPU Accelerated Monte Carlo Particle Transport

GTC 2018
Jeremy Sweezy, Scientist, Monte Carlo Methods, Codes and Applications Group
3/28/2018
Operated by Los Alamos National Security, LLC for the U.S. Department of Energy's NNSA
LA-UR-18-XXXX

What is Monte Carlo Particle Transport?
Follows the path of individual particles through a system.
Uses pseudo-random numbers to randomly sample physical and non-physical processes.
Attributed to Stanislaw Ulam and Enrico Fermi.
Named because Ulam had an uncle who would borrow money from relatives because he just had to go to Monte Carlo.
Figure: FERMIAC.

Porting to Specialized Hardware is Prohibitively Expensive
The world's production Monte Carlo codes have decades of development behind them.
LANL's MCNP code has been in development since 1977.
An equally extensive amount of V&V effort has gone into these codes.
Codes have to run on desktop machines and supercomputers.
DOE HPC platforms have been in a state of flux for the last 10 years: Cell Broadband Engine, Intel Xeon Phi (MIC), GPUs, ARM???
Barrier #1: Limited Resources (Money, People, Time)

Monte Carlo Random Walk on GPU Hardware has Reached a Performance Wall
At least 6 different research groups have ported the Monte Carlo random walk to GPU hardware for neutron transport.
All report results against different numbers of CPUs.
All get the same results! (Figure labels: 4.5x, 3.0x.)
Almost all are extremely simplified. Production codes will likely have worse performance.
What are the limitations? Conditional branching, random data access, and no small, computationally intensive kernel to accelerate.
Barrier #2: Performance of random walk on GPUs

How do You Define Performance?
A computer scientist might measure performance as an increase in speed:
$P = \frac{T_{\mathrm{CPU}}}{T_{\mathrm{GPU}}}$
A Monte Carlo specialist would measure performance as a balance between speed and statistical variance, using a Figure-of-Merit:
$\mathrm{FOM} = \frac{\sigma_{\mathrm{CPU}}^2 \, T_{\mathrm{CPU}}}{\sigma_{\mathrm{GPU}}^2 \, T_{\mathrm{GPU}}}$
Example: $\mathrm{FOM} = \frac{0.1^2 \times 1\ \mathrm{min}}{0.05^2 \times 2\ \mathrm{min}} = 2$
To date, almost all GPU implementations of Monte Carlo particle transport have focused on increasing speed.
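The two measures on this slide can be compared with a few lines of Python. This is a minimal sketch that reuses the slide's example numbers; the function names `speedup` and `fom_ratio` are illustrative and do not come from the talk.

```python
def speedup(t_cpu, t_gpu):
    """Pure speed ratio P = T_CPU / T_GPU."""
    return t_cpu / t_gpu


def fom_ratio(sigma_cpu, t_cpu, sigma_gpu, t_gpu):
    """Figure-of-Merit ratio (sigma_CPU^2 * T_CPU) / (sigma_GPU^2 * T_GPU)."""
    return (sigma_cpu ** 2 * t_cpu) / (sigma_gpu ** 2 * t_gpu)


# Slide example: the CPU run reaches sigma = 0.1 in 1 min,
# the GPU run reaches sigma = 0.05 in 2 min.
p = speedup(1.0, 2.0)
fom = fom_ratio(0.1, 1.0, 0.05, 2.0)
print(f"speed ratio = {p:.2f}, FOM ratio = {fom:.2f}")
```

With these numbers the GPU run takes twice as long, yet it comes out ahead by a factor of two on the Figure-of-Merit because it halves the uncertainty.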

Next-Event Estimator
The next-event estimator calculates the probability that a particle from a source or collision event reaches a point without interaction.
Typically used for image tallies.
$S(R, E') = \sum_{i=1}^{N} \frac{\sigma_i(R, E)}{\sigma_T(R, E)} \, \frac{w}{2\pi R^2} \, p_i(\mu, E \to E') \, \exp\left(-\int_0^R \Sigma_T(s, E')\, ds\right)$
Figure labels: ray-cast from A to B, Cell 1, Cell 2, Image Plane, μ.
One to two orders of magnitude faster on GPU hardware.
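The ray-cast at the heart of the next-event estimator can be sketched in Python. This is an illustration only, not MonteRay code. It assumes the ray has already been split into per-cell segments, and it folds the sum over nuclides into a single direction density `p_mu`. The names `optical_depth` and `next_event_score` are hypothetical.

```python
import math


def optical_depth(segments):
    """Sum Sigma_T * length over (length_cm, sigma_t_per_cm) ray segments."""
    return sum(length * sigma_t for length, sigma_t in segments)


def next_event_score(weight, p_mu, segments):
    """Uncollided contribution of one event at the detector point.

    weight   : particle weight at the source or collision event
    p_mu     : probability density (per unit mu) of heading toward the point
    segments : cells crossed by the ray, as (length_cm, sigma_t_per_cm) pairs
    """
    distance = sum(length for length, _ in segments)
    attenuation = math.exp(-optical_depth(segments))
    return weight * p_mu * attenuation / (2.0 * math.pi * distance ** 2)


# Hypothetical ray: 1 cm through Cell 1 (0.5 /cm), then 2 cm through Cell 2 (0.25 /cm).
ray = [(1.0, 0.5), (2.0, 0.25)]
score = next_event_score(weight=1.0, p_mu=0.5, segments=ray)
print(score)
```

Every source or collision event casts one such ray per detector point, and each ray is independent of the others. That independence is what makes the work a natural fit for a GPU.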

Traditional Track-Length Estimator
The standard Monte Carlo fluence estimator.
Uses the sampled distance in each cell as the fluence estimator.
Only contributes to cells through which the particle passes.
Easy to compute; nothing to accelerate on GPU.
Figure labels: B, Cell 1, Cell 2, Cell 3.
Computing has changed; we need to change our algorithms too!
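For comparison, a track-length tally is only a few arithmetic operations per flight. The sketch below assumes a 1-D mesh and omits the division by cell volume; the function name `score_track_length` is illustrative.

```python
def score_track_length(tally, boundaries, x_start, x_end, weight):
    """Add weight * (path length inside each cell) for one particle flight.

    boundaries holds the cell edges of a 1-D mesh in ascending order;
    cell i spans boundaries[i] to boundaries[i + 1].
    """
    lo, hi = sorted((x_start, x_end))
    for i in range(len(boundaries) - 1):
        overlap = min(hi, boundaries[i + 1]) - max(lo, boundaries[i])
        if overlap > 0.0:
            tally[i] += weight * overlap


boundaries = [0.0, 1.0, 2.0, 3.0]   # three cells
tally = [0.0, 0.0, 0.0]
score_track_length(tally, boundaries, 0.5, 1.75, 1.0)
print(tally)
```

Only the cells the flight actually crosses receive a score, and the work per flight is tiny. This is consistent with the slide's point that there is nothing here for a GPU to accelerate.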

Volumetric-Ray-Casting Estimator
For use in place of the traditional track-length estimator on GPU.
Multiple pseudo-rays are generated at each source and collision event.
Computationally intensive estimator with lower variance.
$F(i, E') = \frac{w \left(1 - \exp\left(-\Sigma_{T,i}(E')\, l_i\right)\right)}{N \, \Sigma_{T,i}(E')} \, \exp\left(-\int_0^{|r' - r|} \Sigma_T(r + \Omega s, E')\, ds\right)$
Figure labels: ray-cast, B, Cell 1, Cell 2, Cell 3.
A neutron dance for a neutron fan. P.M. Dawn
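A single pseudo-ray of the volumetric-ray-casting estimator can be sketched as follows. It assumes that each cell crossed by the ray is scored with the expected uncollided track length in that cell: the attenuation up to the cell entry, times (1 - exp(-Σ l)) / Σ, divided by the number of rays N. The function name and the segment representation are hypothetical, not taken from MonteRay.

```python
import math


def expected_track_lengths(segments, weight, n_rays):
    """Score one pseudo-ray; segments are (length_cm, sigma_t_per_cm) pairs
    listed in the order the ray crosses the cells."""
    scores = []
    tau = 0.0  # optical depth from the event to the current cell entry
    for length, sigma_t in segments:
        survive = math.exp(-tau)
        if sigma_t > 0.0:
            expected = (1.0 - math.exp(-sigma_t * length)) / sigma_t
        else:
            expected = length  # void cell: the full chord is uncollided
        scores.append(weight / n_rays * survive * expected)
        tau += sigma_t * length
    return scores


# Hypothetical ray crossing two cells with the same cross section.
scores = expected_track_lengths([(1.0, 0.5), (2.0, 0.5)], weight=1.0, n_rays=1)
print(scores, sum(scores))
```

Unlike the track-length estimator, every cell along the ray receives a score from every ray. This costs more arithmetic per event, which is the kind of work a GPU handles well.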

MonteRay - Accelerating Monte Carlo Transport with GPU Ray Tracing
MonteRay: a library for accelerating Monte Carlo tallies with GPUs.
Random walk is maintained on the CPU.
Ray-casting-based tallies are calculated on the GPU: the next-event estimator and the volumetric-ray-casting estimator, a new estimator designed for GPUs.
Supports neutron and photon tallies.
Can be incorporated into new and legacy Monte Carlo codes.
Uses continuous-energy cross-section data.
Single-precision ray casting, single-precision attenuation cross-sections, double-precision tallies.
Reduces the cost of accelerating an existing Monte Carlo code with GPUs.
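The slide does not say why tallies are kept in double precision while ray casting uses single precision. One plausible reason is that a tally accumulates a very large number of small scores, and a single-precision accumulator loses accuracy as it grows. The sketch below illustrates that effect in pure Python by emulating float32 rounding with `struct`; it is an assumption-laden illustration, not MonteRay code.

```python
import struct


def to_float32(x):
    """Round a Python float (a double) to the nearest IEEE-754 single."""
    return struct.unpack("f", struct.pack("f", x))[0]


n_scores = 100_000
score = to_float32(0.1)  # one single-precision ray-cast result

tally_single = 0.0
tally_double = 0.0
for _ in range(n_scores):
    tally_single = to_float32(tally_single + score)  # float32 accumulator
    tally_double += score                            # float64 accumulator

exact = n_scores * score
err_single = abs(tally_single - exact)
err_double = abs(tally_double - exact)
print(f"float32 tally error: {err_single:.3g}")
print(f"float64 tally error: {err_double:.3g}")
```

The individual scores are identical in both cases; only the accumulator precision differs.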

MonteRay - Testing
Tests use:
GeForce GTX TitanX GPU with NVIDIA Maxwell architecture
2 CPUs (Intel Haswell E5-2660 v3 at 2.60 GHz), with 10 cores each
MonteRay linked with LANL's C++ Monte Carlo code MCATK
MCATK uses MPI parallelism, building shared ray buffers using MPI-3 shared memory
3-D Cartesian structured mesh geometry
2 tests measured the performance of the next-event estimator.
4 tests measured the performance of the volumetric-ray-casting estimator.
Volumetric-ray-casting estimator performance on the GPU is compared to track-length estimator performance on the CPU.
Base performance is measured relative to 8 CPU cores.

Testing the Next-Event Estimator on GPU Hardware: Two Radiography Tests

MonteRay Medical X-Ray Imaging Simulation
50-keV X-ray beam, 0.12 mm spot size.
Radiograph used the Next-Event Estimator.
Simulation useful for designing a collimator to minimize the scattered contribution.

MonteRay Medical X-Ray Imaging Simulation
Source and collided contributions calculated separately.
Source contribution relatively easy to calculate.
Collided contribution important for collimator design.
Collided performance 15-18x. (Figure labels: 14.5x, 15.3x.)

MonteRay Industrial Radiography
Simulated a physical test object used at the Los Alamos Dual Axis Radiographic Hydrodynamic Test Facility.
Used a 4-MeV mono-energetic X-ray beam.
100 x 100 image grid (10,000 estimators) to simulate the image detector.
Calculation of the scatter component is needed to design collimators and experiments, but is too computationally expensive.
I'm a peeping-tom techie with x-ray eyes. Patrick Lee MacDonald

MonteRay Industrial Radiography
Figure: GPU Performance vs Number of CPU Cores. Relative performance against number of CPU cores per GPU, with source and collided series (labeled values: 28.5x, 24.2x).
Collided calculation performance 15-32x!

Volumetric-Ray-Casting Estimator on GPU Hardware vs Track-Length Estimator on CPU Hardware

Cancer Treatment Simulation
2-MeV photon beam (peak of 6 MV medical accelerator photon spectrum), 1-cm beam radius.
What is the dose to healthy tissue?
Figure: GPU Performance vs 8 CPU Cores (labels: Tumor, 2-MeV Photon Beam).
14x performance improvement in healthy tissue.

Cancer Treatment Simulation
Figure: GPU Performance vs Number of CPU Cores in Healthy Tissue (labeled values: 14.3x, 10.2x).
Performance is 14x vs 8 CPU cores, or 10x vs 12 CPU cores.

Pressurized Water Reactor Assembly Simulation
16x16 fuel assembly.
Performance 7.5x in the control rods, 5x in the fuel, and 4.5x in the coolant.
Figure: GPU Performance vs 8 CPU Cores (labels: Fuel Pin, Control Rod).

Pressurized Water Reactor Assembly Simulation
Figure: GPU Performance vs Number of CPU Cores (labeled values: 7.2x, 6.0x, 5.4x, 4.4x).
Compared to 8 CPU cores, performance is 7.2x in the control rod and 6.0x in the fuel.

Criticality Accident Simulation
Critical uranium sphere in the corner of a concrete room.
Concrete floor, walls, ceiling, and 4 concrete pillars.
Figure: GPU Performance vs 8 CPU Cores (label: Uranium Sphere).
Performance increase of 14-16x in the center of the room.

Criticality Accident Simulation
Smoother fluence estimate.
Figure panels: Track-Length Estimator, Volumetric-Ray-Casting Estimator.

Criticality Accident Simulation
Figure: GPU Performance vs Number of CPU Cores (labeled values: 15x, 10.5x).
Things are going great, and they're only getting better. Patrick Lee MacDonald

Reflected Godiva Criticality Experiment Simulation
U-235 sphere reflected by water.
Performance improvement: 2.5x in the core, 1.0x in the water.
Figure: GPU Performance vs 8 CPU Cores.

Reflected Godiva Criticality Experiment Simulation
Variance of the Volumetric-Ray-Casting estimator approaches that of the Track-Length estimator in strongly scattering material.
Figure: GPU Performance vs. Num. CPU Cores (labeled value: 2.2x).
Figure: Variance Ratio vs Num. Collisions. Variance ratio ($\sigma_{TL}^2 / \sigma_{VRC}^2$) against number of samples per collision (N) (labeled value: 2.2x).
Performance is limited by the estimator variance, not the GPU speed.

Conclusions
MonteRay provides a low-cost way to add GPU acceleration to Monte Carlo particle transport.
Can be incorporated into legacy codes at low cost.
Works with standard variance reduction methods.
Performance improvements of MonteRay are significant:
Up to 32 times for the Next-Event estimator as compared to 8 CPU cores.
Up to 14 times for the Volumetric-Ray-Casting estimator as compared to the Track-Length estimator on 8 CPU cores.
MonteRay provides a method of breaking through the barriers of limited resources and limited performance.

Questions?
Jeremy Sweezy, jsweezy@lanl.gov

Extra

Uncertainty - Pressurized Water Reactor Assembly Simulation
Track-Length Estimator: 600 sec., 8 CPU cores, 124 cycles, 40000 particles/cycle.
Volumetric-Ray-Casting Estimator: 600 sec., 8 CPU cores and 1 GPU, 93 cycles, 40000 particles/cycle, 8 rays/collision.
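The cycle counts on this slide imply a break-even condition for the two estimators. Both runs take the same 600 seconds, so the ray-casting run only wins on Figure-of-Merit if its lower per-history variance outweighs its smaller number of histories. The sketch below works through that arithmetic. It assumes variance scales as one over the number of histories; that assumption and the variable names are illustrative and are not stated in the talk.

```python
particles_per_cycle = 40_000
cycles_track_length = 124  # 600 s on 8 CPU cores
cycles_vrc = 93            # 600 s on 8 CPU cores and 1 GPU, 8 rays/collision

histories_tl = cycles_track_length * particles_per_cycle
histories_vrc = cycles_vrc * particles_per_cycle

# Fraction of the track-length run's histories completed by the ray-casting run.
throughput_ratio = histories_vrc / histories_tl

# Assumption: sigma^2 = (per-history variance) / histories. At equal wall time
# the ray-casting estimator breaks even when its per-history variance equals
# throughput_ratio times that of the track-length estimator.
print(f"{histories_tl} vs {histories_vrc} histories, ratio {throughput_ratio:.2f}")
```

Under that assumption, the ray-casting estimator needs a per-history variance below 75% of the track-length estimator's to come out ahead in this test.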