Massively Parallel GPU-friendly Algorithms for PET. Szirmay-Kalos László, Budapest, University of Technology and Economics

Size: px
Start display at page:

Download "Massively Parallel GPU-friendly Algorithms for PET. Szirmay-Kalos László, Budapest, University of Technology and Economics"

Transcription

1 Massively Parallel GPU-friendly Algorithms for PET Szirmay-Kalos László, Budapest, University of Technology and Economics

2 (GP)GPU: CUDA (OpenCL) Multiprocessor N Multiprocessor 2 Multiprocessor 1 Scalar Processor 1 Shared memory Scalar Processor M Cache Device memory Instruction Execution Unit Massively parallel: #threads > 10 4 Independence: synchronization and write collisions should be avoided SIMD: conditional statements are not welcome Coalesced memory access

3 PET physics P e + P N e - Line Of Response (LOR)

4 Mediso nanoscan PET AnyScan PET

5 Maximum-likelihood reconstruction A 13 A 11 x 1 x 2 A 12 x 3 x 4 A 14 Expected number of hits: ~ y A x Maximize the probability of the measurement data y: ~ y 1 A 13 x 3 A 11 x 1 A 14 x A 4 12 x 2 x arg max Pr{ y ~ y ( x)}

6 Iterative solution Activity in voxels Physics Simulation Measured values Expected values Correction

7 Computational challenges Numbers of LORs and voxels: hundred millions! System matrix A: elements (PetaBytes) Probability that a positron of a voxel is detected by a LOR Patient dependent Not sparse if accurate simulation is needed Do not store, estimate on-the-fly Matrix elements are high dimensional integrals Monte Carlo quadrature Reuse of computation High performance (parallel) computational platform Minimize the effect of estimation error

8 Numerical integration f: integrand g: target density 0 1 samples

9 Quadrature error f/g g f importance sampling stratification

10 Effect of stratification undersampling oversampling

11 Direct physical simulation Input driven, scattering type algorithm Thread = photon Photons scatter different number of times The same detector is hit: write collision Random memory access Cannot mimic the detectors source detectors

12 GPU friendly approach Output driven, gathering type Thread = importon SIMD: grouping importons No write collision: LOR-driven Cannot mimic the source source detectors

13 Multiple Importance Sampling f(x) g b (x) g a (x) f(x i ) g a (x i ) f(x i ) g b (x i ) f(x i ) g a (x i )+g b (x i )

14 Direct gamma photon contribution Detector module 1 Detector module 2 LOR 5D Integration Accuracy for given sample number Cost of a sample l voxel x( v )ddv

15 Output- or LOR-driven sampling w Detector module 1 l l Detector module 2 LOR u Pros: Gathering Thread coherence Texture coherence Uniform on detectors Low-cost samples due to reuse Cons: Cannot mimick activity g Nray ( l, ) 2 D G( u, w) l 1

16 Input or voxel-driven sampling Detector module 1 Detector module 2 Pros: Can mimick activity l LOR Cons: Write collisions Less coherence u g 2 ( l, ) N voxel x( l ) l D cos( ) u xdv 2

17 Multiple Importance Sampling voxel voxel + l w u G D N l g ), ( ), ( 2 ray 1 ), ( ), ( ), ( 2 1 l g l g l g v x D u l l x N l g d ) cos( ) ( ), ( 2 voxel 2

18 Multiple Importance Sampling

19 Input-driven scattered photon transport Monte Carlo simulation: Free path Absorption? Scattering direction source voxel fetches absorption scattering

20 Free Path Sampling u s For special cross section functions (t), it can be solved analytically.

21 Ray marching u 1 exp( s) i i Complexity grows with the resolution Slow in high resolution low density media

22 Mix virtual particles to obtain a density that can be solved analytically photon Virtual collision Real collision Real material particle Virtual particle effective resolution 64 billion sample points

23 Output-driven single-scattered photon transport with reuse s 2 s 2 s 2 z 2 s 1 z 1 1. Scattering points 2. Ray marching between scattering points and detectors s 1 z 1 s 1 3. Combination

24 Multiple Importance Sampling 0 scattering 0 scattering 0, 1, 2, scattering 1 scattering LOR-driven unscattered contribution Voxel-driven unscattered contribution Photon transport LOR-driven single scatter MIS

25 Detector response photon absorption scattering Electronics hits Problem: The domain is 4 dimensional.

26 Quasi-Monte Carlo filtering L L w L X x) w( x) dx = ( L( X x( t)) dt 1 0

27 without with Detector Scattering Compensation

28 Back projection Measured values voxels Physics Simulation y Correction Expected values y~ Monte Carlo simulation or simplification ~ y ( n) A x ( n) Random or deterministic estimation x T ( n 1) ~ x ( n) y A ( y T A 1 n) Ratio of measured and computed values

29 2D study Measured sinogram

30 Backprojection with unbiased forward projection Contribution to x V outlier y / ~ y L L x T ( n 1) ~ x ( n) y / yˆ L L y A ( y T A 1 n) Ratio of measured and computed values ŷ L y~l Samples ŷ L

31 Reduce bias and outliers Averaging iteration: ~ ( n) n yl (1 n) ~ ( 1) yl Metropolis iteration: Ignore outliers randomly Acceptance with probability yˆ n L a min{ˆ y L L / ~ y n / (n) L,1} n

32 Recons res 1.3 mm voxels res 0.1 mm voxels

33 Conclusions GPU is an effective tool for computing tens of thousands of parallel threads having no conditionals and collisions. The problem must be interpreted and solved to keep this requirement in mind. Randomization (Monte Carlo) can help structure the problem in this way.

In the real world, light sources emit light particles, which travel in space, reflect at objects or scatter in volumetric media (potentially multiple

In the real world, light sources emit light particles, which travel in space, reflect at objects or scatter in volumetric media (potentially multiple 1 In the real world, light sources emit light particles, which travel in space, reflect at objects or scatter in volumetric media (potentially multiple times) until they are absorbed. On their way, they

More information

A Fast GPU-Based Approach to Branchless Distance-Driven Projection and Back-Projection in Cone Beam CT

A Fast GPU-Based Approach to Branchless Distance-Driven Projection and Back-Projection in Cone Beam CT A Fast GPU-Based Approach to Branchless Distance-Driven Projection and Back-Projection in Cone Beam CT Daniel Schlifske ab and Henry Medeiros a a Marquette University, 1250 W Wisconsin Ave, Milwaukee,

More information

GPU-Based Acceleration for CT Image Reconstruction

GPU-Based Acceleration for CT Image Reconstruction GPU-Based Acceleration for CT Image Reconstruction Xiaodong Yu Advisor: Wu-chun Feng Collaborators: Guohua Cao, Hao Gong Outline Introduction and Motivation Background Knowledge Challenges and Proposed

More information

Mixing Monte Carlo and Progressive Rendering for Improved Global Illumination

Mixing Monte Carlo and Progressive Rendering for Improved Global Illumination Mixing Monte Carlo and Progressive Rendering for Improved Global Illumination Ian C. Doidge Mark W. Jones Benjamin Mora Swansea University, Wales Thursday 14 th June Computer Graphics International 2012

More information

Advanced Graphics. Path Tracing and Photon Mapping Part 2. Path Tracing and Photon Mapping

Advanced Graphics. Path Tracing and Photon Mapping Part 2. Path Tracing and Photon Mapping Advanced Graphics Path Tracing and Photon Mapping Part 2 Path Tracing and Photon Mapping Importance Sampling Combine importance sampling techniques Reflectance function (diffuse + specular) Light source

More information

Choosing the Right Algorithm & Guiding

Choosing the Right Algorithm & Guiding Choosing the Right Algorithm & Guiding PHILIPP SLUSALLEK & PASCAL GRITTMANN Topics for Today What does an implementation of a high-performance renderer look like? Review of algorithms which to choose for

More information

Introduction to CUDA Algoritmi e Calcolo Parallelo. Daniele Loiacono

Introduction to CUDA Algoritmi e Calcolo Parallelo. Daniele Loiacono Introduction to CUDA Algoritmi e Calcolo Parallelo References q This set of slides is mainly based on: " CUDA Technical Training, Dr. Antonino Tumeo, Pacific Northwest National Laboratory " Slide of Applied

More information

Basics of treatment planning II

Basics of treatment planning II Basics of treatment planning II Sastry Vedam PhD DABR Introduction to Medical Physics III: Therapy Spring 2015 Monte Carlo Methods 1 Monte Carlo! Most accurate at predicting dose distributions! Based on

More information

Photon Maps. The photon map stores the lighting information on points or photons in 3D space ( on /near 2D surfaces)

Photon Maps. The photon map stores the lighting information on points or photons in 3D space ( on /near 2D surfaces) Photon Mapping 1/36 Photon Maps The photon map stores the lighting information on points or photons in 3D space ( on /near 2D surfaces) As opposed to the radiosity method that stores information on surface

More information

Philipp Slusallek Karol Myszkowski. Realistic Image Synthesis SS18 Instant Global Illumination

Philipp Slusallek Karol Myszkowski. Realistic Image Synthesis SS18 Instant Global Illumination Realistic Image Synthesis - Instant Global Illumination - Karol Myszkowski Overview of MC GI methods General idea Generate samples from lights and camera Connect them and transport illumination along paths

More information

Participating Media. Part I: participating media in general. Oskar Elek MFF UK Prague

Participating Media. Part I: participating media in general. Oskar Elek MFF UK Prague Participating Media Part I: participating media in general Oskar Elek MFF UK Prague Outline Motivation Introduction Properties of participating media Rendering equation Storage strategies Non-interactive

More information

GAMES Webinar: Rendering Tutorial 2. Monte Carlo Methods. Shuang Zhao

GAMES Webinar: Rendering Tutorial 2. Monte Carlo Methods. Shuang Zhao GAMES Webinar: Rendering Tutorial 2 Monte Carlo Methods Shuang Zhao Assistant Professor Computer Science Department University of California, Irvine GAMES Webinar Shuang Zhao 1 Outline 1. Monte Carlo integration

More information

GPU implementation for rapid iterative image reconstruction algorithm

GPU implementation for rapid iterative image reconstruction algorithm GPU implementation for rapid iterative image reconstruction algorithm and its applications in nuclear medicine Jakub Pietrzak Krzysztof Kacperski Department of Medical Physics, Maria Skłodowska-Curie Memorial

More information

METROPOLIS MONTE CARLO SIMULATION

METROPOLIS MONTE CARLO SIMULATION POLYTECHNIC UNIVERSITY Department of Computer and Information Science METROPOLIS MONTE CARLO SIMULATION K. Ming Leung Abstract: The Metropolis method is another method of generating random deviates that

More information

Introduction to CUDA Algoritmi e Calcolo Parallelo. Daniele Loiacono

Introduction to CUDA Algoritmi e Calcolo Parallelo. Daniele Loiacono Introduction to CUDA Algoritmi e Calcolo Parallelo References This set of slides is mainly based on: CUDA Technical Training, Dr. Antonino Tumeo, Pacific Northwest National Laboratory Slide of Applied

More information

Introduction to Positron Emission Tomography

Introduction to Positron Emission Tomography Planar and SPECT Cameras Summary Introduction to Positron Emission Tomography, Ph.D. Nuclear Medicine Basic Science Lectures srbowen@uw.edu System components: Collimator Detector Electronics Collimator

More information

NVIDIA GTX200: TeraFLOPS Visual Computing. August 26, 2008 John Tynefield

NVIDIA GTX200: TeraFLOPS Visual Computing. August 26, 2008 John Tynefield NVIDIA GTX200: TeraFLOPS Visual Computing August 26, 2008 John Tynefield 2 Outline Execution Model Architecture Demo 3 Execution Model 4 Software Architecture Applications DX10 OpenGL OpenCL CUDA C Host

More information

Implementation and evaluation of a fully 3D OS-MLEM reconstruction algorithm accounting for the PSF of the PET imaging system

Implementation and evaluation of a fully 3D OS-MLEM reconstruction algorithm accounting for the PSF of the PET imaging system Implementation and evaluation of a fully 3D OS-MLEM reconstruction algorithm accounting for the PSF of the PET imaging system 3 rd October 2008 11 th Topical Seminar on Innovative Particle and Radiation

More information

GPU-accelerated data expansion for the Marching Cubes algorithm

GPU-accelerated data expansion for the Marching Cubes algorithm GPU-accelerated data expansion for the Marching Cubes algorithm San Jose (CA) September 23rd, 2010 Christopher Dyken, SINTEF Norway Gernot Ziegler, NVIDIA UK Agenda Motivation & Background Data Compaction

More information

Monte Carlo Method for Solving Inverse Problems of Radiation Transfer

Monte Carlo Method for Solving Inverse Problems of Radiation Transfer INVERSE AND ILL-POSED PROBLEMS SERIES Monte Carlo Method for Solving Inverse Problems of Radiation Transfer V.S.Antyufeev. ///VSP/// UTRECHT BOSTON KÖLN TOKYO 2000 Contents Chapter 1. Monte Carlo modifications

More information

An approach to calculate and visualize intraoperative scattered radiation exposure

An approach to calculate and visualize intraoperative scattered radiation exposure Peter L. Reicertz Institut für Medizinische Informatik An approach to calculate and visualize intraoperative scattered radiation exposure Markus Wagner University of Braunschweig Institute of Technology

More information

GPU applications in Cancer Radiation Therapy at UCSD. Steve Jiang, UCSD Radiation Oncology Amit Majumdar, SDSC Dongju (DJ) Choi, SDSC

GPU applications in Cancer Radiation Therapy at UCSD. Steve Jiang, UCSD Radiation Oncology Amit Majumdar, SDSC Dongju (DJ) Choi, SDSC GPU applications in Cancer Radiation Therapy at UCSD Steve Jiang, UCSD Radiation Oncology Amit Majumdar, SDSC Dongju (DJ) Choi, SDSC Conventional Radiotherapy SIMULATION: Construciton, Dij Days PLANNING:

More information

INTERACTIVE GLOBAL ILLUMINATION WITH VIRTUAL LIGHT SOURCES

INTERACTIVE GLOBAL ILLUMINATION WITH VIRTUAL LIGHT SOURCES INTERACTIVE GLOBAL ILLUMINATION WITH VIRTUAL LIGHT SOURCES A dissertation submitted to the Budapest University of Technology in fulfillment of the requirements for the degree of Doctor of Philosophy (Ph.D.)

More information

Efficient Computation of Radial Distribution Function on GPUs

Efficient Computation of Radial Distribution Function on GPUs Efficient Computation of Radial Distribution Function on GPUs Yi-Cheng Tu * and Anand Kumar Department of Computer Science and Engineering University of South Florida, Tampa, Florida 2 Overview Introduction

More information

MCMC Methods for data modeling

MCMC Methods for data modeling MCMC Methods for data modeling Kenneth Scerri Department of Automatic Control and Systems Engineering Introduction 1. Symposium on Data Modelling 2. Outline: a. Definition and uses of MCMC b. MCMC algorithms

More information

ad-heap: an Efficient Heap Data Structure for Asymmetric Multicore Processors

ad-heap: an Efficient Heap Data Structure for Asymmetric Multicore Processors ad-heap: an Efficient Heap Data Structure for Asymmetric Multicore Processors Weifeng Liu and Brian Vinter Niels Bohr Institute University of Copenhagen Denmark {weifeng, vinter}@nbi.dk March 1, 2014 Weifeng

More information

Interactive Methods in Scientific Visualization

Interactive Methods in Scientific Visualization Interactive Methods in Scientific Visualization GPU Volume Raycasting Christof Rezk-Salama University of Siegen, Germany Volume Rendering in a Nutshell Image Plane Eye Data Set Back-to-front iteration

More information

Optimization solutions for the segmented sum algorithmic function

Optimization solutions for the segmented sum algorithmic function Optimization solutions for the segmented sum algorithmic function ALEXANDRU PÎRJAN Department of Informatics, Statistics and Mathematics Romanian-American University 1B, Expozitiei Blvd., district 1, code

More information

Parallel Computing: Parallel Architectures Jin, Hai

Parallel Computing: Parallel Architectures Jin, Hai Parallel Computing: Parallel Architectures Jin, Hai School of Computer Science and Technology Huazhong University of Science and Technology Peripherals Computer Central Processing Unit Main Memory Computer

More information

Dosimetry Simulations with the UF-B Series Phantoms using the PENTRAN-MP Code System

Dosimetry Simulations with the UF-B Series Phantoms using the PENTRAN-MP Code System Dosimetry Simulations with the UF-B Series Phantoms using the PENTRAN-MP Code System A. Al-Basheer, M. Ghita, G. Sjoden, W. Bolch, C. Lee, and the ALRADS Group Computational Medical Physics Team Nuclear

More information

Computer Graphics. - Volume Rendering - Philipp Slusallek

Computer Graphics. - Volume Rendering - Philipp Slusallek Computer Graphics - Volume Rendering - Philipp Slusallek Overview Motivation Volume Representation Indirect Volume Rendering Volume Classification Direct Volume Rendering Applications: Bioinformatics Image

More information

Rendering Algorithms: Real-time indirect illumination. Spring 2010 Matthias Zwicker

Rendering Algorithms: Real-time indirect illumination. Spring 2010 Matthias Zwicker Rendering Algorithms: Real-time indirect illumination Spring 2010 Matthias Zwicker Today Real-time indirect illumination Ray tracing vs. Rasterization Screen space techniques Visibility & shadows Instant

More information

CUDA and OpenCL Implementations of 3D CT Reconstruction for Biomedical Imaging

CUDA and OpenCL Implementations of 3D CT Reconstruction for Biomedical Imaging CUDA and OpenCL Implementations of 3D CT Reconstruction for Biomedical Imaging Saoni Mukherjee, Nicholas Moore, James Brock and Miriam Leeser September 12, 2012 1 Outline Introduction to CT Scan, 3D reconstruction

More information

Introduction to Multicore architecture. Tao Zhang Oct. 21, 2010

Introduction to Multicore architecture. Tao Zhang Oct. 21, 2010 Introduction to Multicore architecture Tao Zhang Oct. 21, 2010 Overview Part1: General multicore architecture Part2: GPU architecture Part1: General Multicore architecture Uniprocessor Performance (ECint)

More information

Accelerating koblinger's method of compton scattering on GPU

Accelerating koblinger's method of compton scattering on GPU Available online at www.sciencedirect.com Procedia Engineering 24 (211) 242 246 211 International Conference on Advances in Engineering Accelerating koblingers method of compton scattering on GPU Jing

More information

Introduc)on to PET Image Reconstruc)on. Tomographic Imaging. Projec)on Imaging. Types of imaging systems

Introduc)on to PET Image Reconstruc)on. Tomographic Imaging. Projec)on Imaging. Types of imaging systems Introduc)on to PET Image Reconstruc)on Adam Alessio http://faculty.washington.edu/aalessio/ Nuclear Medicine Lectures Imaging Research Laboratory Division of Nuclear Medicine University of Washington Fall

More information

high performance medical reconstruction using stream programming paradigms

high performance medical reconstruction using stream programming paradigms high performance medical reconstruction using stream programming paradigms This Paper describes the implementation and results of CT reconstruction using Filtered Back Projection on various stream programming

More information

Recent Advances in Monte Carlo Offline Rendering

Recent Advances in Monte Carlo Offline Rendering CS294-13: Special Topics Lecture #6 Advanced Computer Graphics University of California, Berkeley Monday, 21 September 2009 Recent Advances in Monte Carlo Offline Rendering Lecture #6: Monday, 21 September

More information

Computing on GPUs. Prof. Dr. Uli Göhner. DYNAmore GmbH. Stuttgart, Germany

Computing on GPUs. Prof. Dr. Uli Göhner. DYNAmore GmbH. Stuttgart, Germany Computing on GPUs Prof. Dr. Uli Göhner DYNAmore GmbH Stuttgart, Germany Summary: The increasing power of GPUs has led to the intent to transfer computing load from CPUs to GPUs. A first example has been

More information

Virtual Spherical Lights for Many-Light Rendering of Glossy Scenes

Virtual Spherical Lights for Many-Light Rendering of Glossy Scenes Virtual Spherical Lights for Many-Light Rendering of Glossy Scenes Miloš Hašan Jaroslav Křivánek * Bruce Walter Kavita Bala Cornell University * Charles University in Prague Global Illumination Effects

More information

MULTI-DIMENSIONAL MONTE CARLO INTEGRATION

MULTI-DIMENSIONAL MONTE CARLO INTEGRATION CS580: Computer Graphics KAIST School of Computing Chapter 3 MULTI-DIMENSIONAL MONTE CARLO INTEGRATION 2 1 Monte Carlo Integration This describes a simple technique for the numerical evaluation of integrals

More information

Chapter 1. Introduction

Chapter 1. Introduction Chapter 1 Introduction A Monte Carlo method is a compuational method that uses random numbers to compute (estimate) some quantity of interest. Very often the quantity we want to compute is the mean of

More information

Free Path Sampling in High Resolution Inhomogeneous Participating Media

Free Path Sampling in High Resolution Inhomogeneous Participating Media Volume (1981), Number pp. 1 12 COMPUTER GRAPHICS forum Free Path Sampling in High Resolution Inhomogeneous Participating Media László Szirmay-Kalos, Balázs Tóth, and Milán Magdics Budapest University of

More information

Sequential Monte Carlo Adaptation in Low-Anisotropy Participating Media. Vincent Pegoraro Ingo Wald Steven G. Parker

Sequential Monte Carlo Adaptation in Low-Anisotropy Participating Media. Vincent Pegoraro Ingo Wald Steven G. Parker Sequential Monte Carlo Adaptation in Low-Anisotropy Participating Media Vincent Pegoraro Ingo Wald Steven G. Parker Outline Introduction Related Work Monte Carlo Integration Radiative Energy Transfer SMC

More information

VARIANCE REDUCTION TECHNIQUES IN MONTE CARLO SIMULATIONS K. Ming Leung

VARIANCE REDUCTION TECHNIQUES IN MONTE CARLO SIMULATIONS K. Ming Leung POLYTECHNIC UNIVERSITY Department of Computer and Information Science VARIANCE REDUCTION TECHNIQUES IN MONTE CARLO SIMULATIONS K. Ming Leung Abstract: Techniques for reducing the variance in Monte Carlo

More information

GPU-based Distributed Behavior Models with CUDA

GPU-based Distributed Behavior Models with CUDA GPU-based Distributed Behavior Models with CUDA Courtesy: YouTube, ISIS Lab, Universita degli Studi di Salerno Bradly Alicea Introduction Flocking: Reynolds boids algorithm. * models simple local behaviors

More information

Participating Media. Part II: interactive methods, atmosphere and clouds. Oskar Elek MFF UK Prague

Participating Media. Part II: interactive methods, atmosphere and clouds. Oskar Elek MFF UK Prague Participating Media Part II: interactive methods, atmosphere and clouds Oskar Elek MFF UK Prague Outline Motivation Introduction Properties of participating media Rendering equation Storage strategies

More information

Scalable many-light methods

Scalable many-light methods Scalable many-light methods Jaroslav Křivánek Charles University in Prague Instant radiosity Approximate indirect illumination by 1. Generate VPLs 2. Render with VPLs 2 Instant radiosity with glossy surfaces

More information

2017 Summer Course on Optical Oceanography and Ocean Color Remote Sensing. Monte Carlo Simulation

2017 Summer Course on Optical Oceanography and Ocean Color Remote Sensing. Monte Carlo Simulation 2017 Summer Course on Optical Oceanography and Ocean Color Remote Sensing Curtis Mobley Monte Carlo Simulation Delivered at the Darling Marine Center, University of Maine July 2017 Copyright 2017 by Curtis

More information

Advanced CUDA Optimization 1. Introduction

Advanced CUDA Optimization 1. Introduction Advanced CUDA Optimization 1. Introduction Thomas Bradley Agenda CUDA Review Review of CUDA Architecture Programming & Memory Models Programming Environment Execution Performance Optimization Guidelines

More information

Electron Dose Kernels (EDK) for Secondary Particle Transport in Deterministic Simulations

Electron Dose Kernels (EDK) for Secondary Particle Transport in Deterministic Simulations Electron Dose Kernels (EDK) for Secondary Particle Transport in Deterministic Simulations A. Al-Basheer, G. Sjoden, M. Ghita Computational Medical Physics Team Nuclear & Radiological Engineering University

More information

Compressed Sensing for Electron Tomography

Compressed Sensing for Electron Tomography University of Maryland, College Park Department of Mathematics February 10, 2015 1/33 Outline I Introduction 1 Introduction 2 3 4 2/33 1 Introduction 2 3 4 3/33 Tomography Introduction Tomography - Producing

More information

EXPOSING PARTICLE PARALLELISM IN THE XGC PIC CODE BY EXPLOITING GPU MEMORY HIERARCHY. Stephen Abbott, March

EXPOSING PARTICLE PARALLELISM IN THE XGC PIC CODE BY EXPLOITING GPU MEMORY HIERARCHY. Stephen Abbott, March EXPOSING PARTICLE PARALLELISM IN THE XGC PIC CODE BY EXPLOITING GPU MEMORY HIERARCHY Stephen Abbott, March 26 2018 ACKNOWLEDGEMENTS Collaborators: Oak Ridge Nation Laboratory- Ed D Azevedo NVIDIA - Peng

More information

Implementation of Bidirectional Ray Tracing Algorithm

Implementation of Bidirectional Ray Tracing Algorithm Implementation of Bidirectional Ray Tracing Algorithm PÉTER DORNBACH jet@inf.bme.hu Technical University of Budapest, Department of Control Engineering and Information Technology, Mûegyetem rkp. 9, 1111

More information

Supercomputing the Cascade Processes of Radiation Transport

Supercomputing the Cascade Processes of Radiation Transport 19 th World Conference on Non-Destructive Testing 2016 Supercomputing the Cascade Processes of Radiation Transport Mikhail ZHUKOVSKIY 1, Mikhail MARKOV 1, Sergey PODOLYAKO 1, Roman USKOV 1, Carsten BELLON

More information

Anatomy-based Regularization Methods for Dynamic PET Reconstruction

Anatomy-based Regularization Methods for Dynamic PET Reconstruction Eighth Hungarian Conference on Computer Graphics and Geometry, Budapest, 1 Anatomy-based Regularization Methods for Dynamic PET Reconstruction Ágota Kacsó and László Szirmay-Kalos Budapest University of

More information

Improved Modeling of System Response in List Mode EM Reconstruction of Compton Scatter Camera Images'

Improved Modeling of System Response in List Mode EM Reconstruction of Compton Scatter Camera Images' IEEE TRANSACTIONS ON NUCLEAR SCIENCE, VOL. 48, NO. 1, FEBRUARY 2001 111 Improved Modeling of System Response in List Mode EM Reconstruction of Compton Scatter Camera Images' S.J. Wilderman2, J.A. Fe~sle?t~>~,

More information

Simulation. Monte Carlo

Simulation. Monte Carlo Simulation Monte Carlo Monte Carlo simulation Outcome of a single stochastic simulation run is always random A single instance of a random variable Goal of a simulation experiment is to get knowledge about

More information

Discussion. Smoothness of Indirect Lighting. History and Outline. Irradiance Calculation. Irradiance Caching. Advanced Computer Graphics (Spring 2013)

Discussion. Smoothness of Indirect Lighting. History and Outline. Irradiance Calculation. Irradiance Caching. Advanced Computer Graphics (Spring 2013) Advanced Computer Graphics (Spring 2013 CS 283, Lecture 12: Recent Advances in Monte Carlo Offline Rendering Ravi Ramamoorthi http://inst.eecs.berkeley.edu/~cs283/sp13 Some slides/ideas courtesy Pat Hanrahan,

More information

B. Tech. Project Second Stage Report on

B. Tech. Project Second Stage Report on B. Tech. Project Second Stage Report on GPU Based Active Contours Submitted by Sumit Shekhar (05007028) Under the guidance of Prof Subhasis Chaudhuri Table of Contents 1. Introduction... 1 1.1 Graphic

More information

Samuel Coolidge, Dan Simon, Dennis Shasha, Technical Report NYU/CIMS/TR

Samuel Coolidge, Dan Simon, Dennis Shasha, Technical Report NYU/CIMS/TR Detecting Missing and Spurious Edges in Large, Dense Networks Using Parallel Computing Samuel Coolidge, sam.r.coolidge@gmail.com Dan Simon, des480@nyu.edu Dennis Shasha, shasha@cims.nyu.edu Technical Report

More information

GPU Based Convolution/Superposition Dose Calculation

GPU Based Convolution/Superposition Dose Calculation GPU Based Convolution/Superposition Dose Calculation Todd McNutt 1, Robert Jacques 1,2, Stefan Pencea 2, Sharon Dye 2, Michel Moreau 2 1) Johns Hopkins University 2) Elekta Research Funded by and Licensed

More information

A Parallel Access Method for Spatial Data Using GPU

A Parallel Access Method for Spatial Data Using GPU A Parallel Access Method for Spatial Data Using GPU Byoung-Woo Oh Department of Computer Engineering Kumoh National Institute of Technology Gumi, Korea bwoh@kumoh.ac.kr Abstract Spatial access methods

More information

CS 563 Advanced Topics in Computer Graphics Irradiance Caching and Particle Tracing. by Stephen Kazmierczak

CS 563 Advanced Topics in Computer Graphics Irradiance Caching and Particle Tracing. by Stephen Kazmierczak CS 563 Advanced Topics in Computer Graphics Irradiance Caching and Particle Tracing by Stephen Kazmierczak Introduction Unbiased light transport algorithms can sometimes take a large number of rays to

More information

Generic Polyphase Filterbanks with CUDA

Generic Polyphase Filterbanks with CUDA Generic Polyphase Filterbanks with CUDA Jan Krämer German Aerospace Center Communication and Navigation Satellite Networks Weßling 04.02.2017 Knowledge for Tomorrow www.dlr.de Slide 1 of 27 > Generic Polyphase

More information

EE382N (20): Computer Architecture - Parallelism and Locality Spring 2015 Lecture 09 GPUs (II) Mattan Erez. The University of Texas at Austin

EE382N (20): Computer Architecture - Parallelism and Locality Spring 2015 Lecture 09 GPUs (II) Mattan Erez. The University of Texas at Austin EE382 (20): Computer Architecture - ism and Locality Spring 2015 Lecture 09 GPUs (II) Mattan Erez The University of Texas at Austin 1 Recap 2 Streaming model 1. Use many slimmed down cores to run in parallel

More information

Sampling Using GPU Accelerated Sparse Hierarchical Models

Sampling Using GPU Accelerated Sparse Hierarchical Models Sampling Using GPU Accelerated Sparse Hierarchical Models Miroslav Stoyanov Oak Ridge National Laboratory supported by Exascale Computing Project (ECP) exascaleproject.org April 9, 28 Miroslav Stoyanov

More information

Mali GPU acceleration of HEVC and VP9 Decoder

Mali GPU acceleration of HEVC and VP9 Decoder Mali GPU acceleration of HEVC and VP9 Decoder 2 Web Video continues to grow!!! Video accounted for 50% of the mobile traffic in 2012 - Citrix ByteMobile's 4Q 2012 Analytics Report. Globally, IP video traffic

More information

Parallel Direct Simulation Monte Carlo Computation Using CUDA on GPUs

Parallel Direct Simulation Monte Carlo Computation Using CUDA on GPUs Parallel Direct Simulation Monte Carlo Computation Using CUDA on GPUs C.-C. Su a, C.-W. Hsieh b, M. R. Smith b, M. C. Jermy c and J.-S. Wu a a Department of Mechanical Engineering, National Chiao Tung

More information

error

error PARALLEL IMPLEMENTATION OF STOCHASTIC ITERATION ALGORITHMS Roel Mart nez, László Szirmay-Kalos, Mateu Sbert, Ali Mohamed Abbas Department of Informatics and Applied Mathematics, University of Girona Department

More information

Méthodes d imagerie pour les écoulements et le CND

Méthodes d imagerie pour les écoulements et le CND Méthodes d imagerie pour les écoulements et le CND Journée scientifique FED3G CEA LIST/Lab Imagerie Tomographie et Traitement Samuel Legoupil 15 juin 2012 2D/3D imaging tomography Example Petrochemical

More information

Outline. Monte Carlo Radiation Transport Modeling Overview (MCNP5/6) Monte Carlo technique: Example. Monte Carlo technique: Introduction

Outline. Monte Carlo Radiation Transport Modeling Overview (MCNP5/6) Monte Carlo technique: Example. Monte Carlo technique: Introduction Monte Carlo Radiation Transport Modeling Overview () Lecture 7 Special Topics: Device Modeling Outline Principles of Monte Carlo modeling Radiation transport modeling with Utilizing Visual Editor (VisEd)

More information

3D-OSEM Transition Matrix for High Resolution PET Imaging with Modeling of the Gamma-Event Detection

3D-OSEM Transition Matrix for High Resolution PET Imaging with Modeling of the Gamma-Event Detection 3D-OSEM Transition Matrix for High Resolution PET Imaging with Modeling of the Gamma-Event Detection Juan E. Ortuño, George Kontaxakis, Member, IEEE, Pedro Guerra, Student Member, IEEE, and Andrés Santos,

More information

Introduction to Emission Tomography

Introduction to Emission Tomography Introduction to Emission Tomography Gamma Camera Planar Imaging Robert Miyaoka, PhD University of Washington Department of Radiology rmiyaoka@u.washington.edu Gamma Camera: - collimator - detector (crystal

More information

CSCI 402: Computer Architectures. Parallel Processors (2) Fengguang Song Department of Computer & Information Science IUPUI.

CSCI 402: Computer Architectures. Parallel Processors (2) Fengguang Song Department of Computer & Information Science IUPUI. CSCI 402: Computer Architectures Parallel Processors (2) Fengguang Song Department of Computer & Information Science IUPUI 6.6 - End Today s Contents GPU Cluster and its network topology The Roofline performance

More information

Constructing System Matrices for SPECT Simulations and Reconstructions

Constructing System Matrices for SPECT Simulations and Reconstructions Constructing System Matrices for SPECT Simulations and Reconstructions Nirantha Balagopal April 28th, 2017 M.S. Report The University of Arizona College of Optical Sciences 1 Acknowledgement I would like

More information

Introduction to SparseGrid

Introduction to SparseGrid Introduction to SparseGrid Jelmer Ypma November 27, 2011 Abstract This vignette describes how to use SparseGrid, which is an R translation 1 of the Matlab code on http://www.sparse-grids.de (Heiss & Winschel,

More information

These are the annotated slides of the real time part of the Many Lights Rendering

These are the annotated slides of the real time part of the Many Lights Rendering These are the annotated slides of the real time part of the Many Lights Rendering course at SIGGRAPH 12. This part covers techniques for making many lights methods suitable for (high quality) real time

More information

Importance Sampling of Area Lights in Participating Media

Importance Sampling of Area Lights in Participating Media Importance Sampling of Area Lights in Participating Media Christopher Kulla Marcos Fajardo Outline Previous Work Single Scattering Equation Importance Sampling for Point Lights Importance Sampling for

More information

Workshop on Quantitative SPECT and PET Brain Studies January, 2013 PUCRS, Porto Alegre, Brasil Corrections in SPECT and PET

Workshop on Quantitative SPECT and PET Brain Studies January, 2013 PUCRS, Porto Alegre, Brasil Corrections in SPECT and PET Workshop on Quantitative SPECT and PET Brain Studies 14-16 January, 2013 PUCRS, Porto Alegre, Brasil Corrections in SPECT and PET Físico João Alfredo Borges, Me. Corrections in SPECT and PET SPECT and

More information

Report on Dagstuhl-Seminar 01242: Stochastic Methods in Rendering. June 11-15, 2001

Report on Dagstuhl-Seminar 01242: Stochastic Methods in Rendering. June 11-15, 2001 Report on Dagstuhl-Seminar 01242: Stochastic Methods in Rendering June 11-15, 2001 Stochastic methods have become indispensable tools in computer graphics, and more specifically in rendering, since the

More information

General Purpose GPU Computing in Partial Wave Analysis

General Purpose GPU Computing in Partial Wave Analysis JLAB at 12 GeV - INT General Purpose GPU Computing in Partial Wave Analysis Hrayr Matevosyan - NTC, Indiana University November 18/2009 COmputationAL Challenges IN PWA Rapid Increase in Available Data

More information

Multi-Processors and GPU

Multi-Processors and GPU Multi-Processors and GPU Philipp Koehn 7 December 2016 Predicted CPU Clock Speed 1 Clock speed 1971: 740 khz, 2016: 28.7 GHz Source: Horowitz "The Singularity is Near" (2005) Actual CPU Clock Speed 2 Clock

More information

CIVA Computed Tomography Modeling

CIVA Computed Tomography Modeling CIVA Computed Tomography Modeling R. FERNANDEZ, EXTENDE, France S. LEGOUPIL, M. COSTIN, D. TISSEUR, A. LEVEQUE, CEA-LIST, France page 1 Summary Context From CIVA RT to CIVA CT Reconstruction Methods Applications

More information

1. In this problem we use Monte Carlo integration to approximate the area of a circle of radius, R = 1.

1. In this problem we use Monte Carlo integration to approximate the area of a circle of radius, R = 1. 1. In this problem we use Monte Carlo integration to approximate the area of a circle of radius, R = 1. A. For method 1, the idea is to conduct a binomial experiment using the random number generator:

More information

Evaluation of Penalty Design in Penalized Maximum- likelihood Image Reconstruction for Lesion Detection

Evaluation of Penalty Design in Penalized Maximum- likelihood Image Reconstruction for Lesion Detection Evaluation of Penalty Design in Penalized Maximum- likelihood Image Reconstruction for Lesion Detection Li Yang, Andrea Ferrero, Rosalie J. Hagge, Ramsey D. Badawi, and Jinyi Qi Supported by NIBIB under

More information

CS8803SC Software and Hardware Cooperative Computing GPGPU. Prof. Hyesoon Kim School of Computer Science Georgia Institute of Technology

CS8803SC Software and Hardware Cooperative Computing GPGPU. Prof. Hyesoon Kim School of Computer Science Georgia Institute of Technology CS8803SC Software and Hardware Cooperative Computing GPGPU Prof. Hyesoon Kim School of Computer Science Georgia Institute of Technology Why GPU? A quiet revolution and potential build-up Calculation: 367

More information

NVIDIA s Compute Unified Device Architecture (CUDA)

NVIDIA s Compute Unified Device Architecture (CUDA) NVIDIA s Compute Unified Device Architecture (CUDA) Mike Bailey mjb@cs.oregonstate.edu Reaching the Promised Land NVIDIA GPUs CUDA Knights Corner Speed Intel CPUs General Programmability 1 History of GPU

More information

NVIDIA s Compute Unified Device Architecture (CUDA)

NVIDIA s Compute Unified Device Architecture (CUDA) NVIDIA s Compute Unified Device Architecture (CUDA) Mike Bailey mjb@cs.oregonstate.edu Reaching the Promised Land NVIDIA GPUs CUDA Knights Corner Speed Intel CPUs General Programmability History of GPU

More information

Architecture, Programming and Performance of MIC Phi Coprocessor

Architecture, Programming and Performance of MIC Phi Coprocessor Architecture, Programming and Performance of MIC Phi Coprocessor JanuszKowalik, Piotr Arłukowicz Professor (ret), The Boeing Company, Washington, USA Assistant professor, Faculty of Mathematics, Physics

More information

REMOVAL OF THE EFFECT OF COMPTON SCATTERING IN 3-D WHOLE BODY POSITRON EMISSION TOMOGRAPHY BY MONTE CARLO

REMOVAL OF THE EFFECT OF COMPTON SCATTERING IN 3-D WHOLE BODY POSITRON EMISSION TOMOGRAPHY BY MONTE CARLO REMOVAL OF THE EFFECT OF COMPTON SCATTERING IN 3-D WHOLE BODY POSITRON EMISSION TOMOGRAPHY BY MONTE CARLO Abstract C.S. Levin, Y-C Tai, E.J. Hoffman, M. Dahlbom, T.H. Farquhar UCLA School of Medicine Division

More information

Spiral ASSR Std p = 1.0. Spiral EPBP Std. 256 slices (0/300) Kachelrieß et al., Med. Phys. 31(6): , 2004

Spiral ASSR Std p = 1.0. Spiral EPBP Std. 256 slices (0/300) Kachelrieß et al., Med. Phys. 31(6): , 2004 Spiral ASSR Std p = 1.0 Spiral EPBP Std p = 1.0 Kachelrieß et al., Med. Phys. 31(6): 1623-1641, 2004 256 slices (0/300) Advantages of Cone-Beam Spiral CT Image quality nearly independent of pitch Increase

More information

COMPARISON OF PYTHON 3 SINGLE-GPU PARALLELIZATION TECHNOLOGIES ON THE EXAMPLE OF A CHARGED PARTICLES DYNAMICS SIMULATION PROBLEM

COMPARISON OF PYTHON 3 SINGLE-GPU PARALLELIZATION TECHNOLOGIES ON THE EXAMPLE OF A CHARGED PARTICLES DYNAMICS SIMULATION PROBLEM COMPARISON OF PYTHON 3 SINGLE-GPU PARALLELIZATION TECHNOLOGIES ON THE EXAMPLE OF A CHARGED PARTICLES DYNAMICS SIMULATION PROBLEM A. Boytsov 1,a, I. Kadochnikov 1,4,b, M. Zuev 1,c, A. Bulychev 2,d, Ya.

More information

Me Again! Peter Chapman. if it s important / time-sensitive

Me Again! Peter Chapman.  if it s important / time-sensitive Me Again! Peter Chapman P.Chapman1@bradford.ac.uk pchapman86@gmail.com if it s important / time-sensitive Issues? Working on something specific? Need some direction? Don t hesitate to get in touch http://peter-chapman.co.uk/teaching

More information

LaPerm: Locality Aware Scheduler for Dynamic Parallelism on GPUs

LaPerm: Locality Aware Scheduler for Dynamic Parallelism on GPUs LaPerm: Locality Aware Scheduler for Dynamic Parallelism on GPUs Jin Wang*, Norm Rubin, Albert Sidelnik, Sudhakar Yalamanchili* *Computer Architecture and System Lab, Georgia Institute of Technology NVIDIA

More information

Medical Image Reconstruction Term II 2012 Topic 6: Tomography

Medical Image Reconstruction Term II 2012 Topic 6: Tomography Medical Image Reconstruction Term II 2012 Topic 6: Tomography Professor Yasser Mostafa Kadah Tomography The Greek word tomos means a section, a slice, or a cut. Tomography is the process of imaging a cross

More information

Op#miza#on of CUDA- based Monte Carlo Simula#on for Radia#on Therapy. GTC 2014 N. Henderson & K. Murakami

Op#miza#on of CUDA- based Monte Carlo Simula#on for Radia#on Therapy. GTC 2014 N. Henderson & K. Murakami Op#miza#on of CUDA- based Monte Carlo Simula#on for Radia#on Therapy GTC 2014 N. Henderson & K. Murakami The collabora#on Geant4 @ Special thanks to the CUDA Center of Excellence Program Makoto Asai, SLAC

More information

An Introduction to GPGPU Pro g ra m m ing - CUDA Arc hitec ture

An Introduction to GPGPU Pro g ra m m ing - CUDA Arc hitec ture An Introduction to GPGPU Pro g ra m m ing - CUDA Arc hitec ture Rafia Inam Mälardalen Real-Time Research Centre Mälardalen University, Västerås, Sweden http://www.mrtc.mdh.se rafia.inam@mdh.se CONTENTS

More information

GPU ACCELERATED DATABASE MANAGEMENT SYSTEMS

GPU ACCELERATED DATABASE MANAGEMENT SYSTEMS CIS 601 - Graduate Seminar Presentation 1 GPU ACCELERATED DATABASE MANAGEMENT SYSTEMS PRESENTED BY HARINATH AMASA CSU ID: 2697292 What we will talk about.. Current problems GPU What are GPU Databases GPU

More information

GPU 101. Mike Bailey. Oregon State University. Oregon State University. Computer Graphics gpu101.pptx. mjb April 23, 2017

GPU 101. Mike Bailey. Oregon State University. Oregon State University. Computer Graphics gpu101.pptx. mjb April 23, 2017 1 GPU 101 Mike Bailey mjb@cs.oregonstate.edu gpu101.pptx Why do we care about GPU Programming? A History of GPU Performance vs. CPU Performance 2 Source: NVIDIA How Can You Gain Access to GPU Power? 3

More information