Massively Parallel GPU-friendly Algorithms for PET. Szirmay-Kalos László, Budapest, University of Technology and Economics
|
|
- Roger McCoy
- 5 years ago
- Views:
Transcription
1 Massively Parallel GPU-friendly Algorithms for PET Szirmay-Kalos László, Budapest, University of Technology and Economics
2 (GP)GPU: CUDA (OpenCL) Multiprocessor N Multiprocessor 2 Multiprocessor 1 Scalar Processor 1 Shared memory Scalar Processor M Cache Device memory Instruction Execution Unit Massively parallel: #threads > 10 4 Independence: synchronization and write collisions should be avoided SIMD: conditional statements are not welcome Coalesced memory access
3 PET physics P e + P N e - Line Of Response (LOR)
4 Mediso nanoscan PET AnyScan PET
5 Maximum-likelihood reconstruction A 13 A 11 x 1 x 2 A 12 x 3 x 4 A 14 Expected number of hits: ~ y A x Maximize the probability of the measurement data y: ~ y 1 A 13 x 3 A 11 x 1 A 14 x A 4 12 x 2 x arg max Pr{ y ~ y ( x)}
6 Iterative solution Activity in voxels Physics Simulation Measured values Expected values Correction
7 Computational challenges Numbers of LORs and voxels: hundred millions! System matrix A: elements (PetaBytes) Probability that a positron of a voxel is detected by a LOR Patient dependent Not sparse if accurate simulation is needed Do not store, estimate on-the-fly Matrix elements are high dimensional integrals Monte Carlo quadrature Reuse of computation High performance (parallel) computational platform Minimize the effect of estimation error
8 Numerical integration f: integrand g: target density 0 1 samples
9 Quadrature error f/g g f importance sampling stratification
10 Effect of stratification undersampling oversampling
11 Direct physical simulation Input driven, scattering type algorithm Thread = photon Photons scatter different number of times The same detector is hit: write collision Random memory access Cannot mimic the detectors source detectors
12 GPU friendly approach Output driven, gathering type Thread = importon SIMD: grouping importons No write collision: LOR-driven Cannot mimic the source source detectors
13 Multiple Importance Sampling f(x) g b (x) g a (x) f(x i ) g a (x i ) f(x i ) g b (x i ) f(x i ) g a (x i )+g b (x i )
14 Direct gamma photon contribution Detector module 1 Detector module 2 LOR 5D Integration Accuracy for given sample number Cost of a sample l voxel x( v )ddv
15 Output- or LOR-driven sampling w Detector module 1 l l Detector module 2 LOR u Pros: Gathering Thread coherence Texture coherence Uniform on detectors Low-cost samples due to reuse Cons: Cannot mimick activity g Nray ( l, ) 2 D G( u, w) l 1
16 Input or voxel-driven sampling Detector module 1 Detector module 2 Pros: Can mimick activity l LOR Cons: Write collisions Less coherence u g 2 ( l, ) N voxel x( l ) l D cos( ) u xdv 2
17 Multiple Importance Sampling voxel voxel + l w u G D N l g ), ( ), ( 2 ray 1 ), ( ), ( ), ( 2 1 l g l g l g v x D u l l x N l g d ) cos( ) ( ), ( 2 voxel 2
18 Multiple Importance Sampling
19 Input-driven scattered photon transport Monte Carlo simulation: Free path Absorption? Scattering direction source voxel fetches absorption scattering
20 Free Path Sampling u s For special cross section functions (t), it can be solved analytically.
21 Ray marching u 1 exp( s) i i Complexity grows with the resolution Slow in high resolution low density media
22 Mix virtual particles to obtain a density that can be solved analytically photon Virtual collision Real collision Real material particle Virtual particle effective resolution 64 billion sample points
23 Output-driven single-scattered photon transport with reuse s 2 s 2 s 2 z 2 s 1 z 1 1. Scattering points 2. Ray marching between scattering points and detectors s 1 z 1 s 1 3. Combination
24 Multiple Importance Sampling 0 scattering 0 scattering 0, 1, 2, scattering 1 scattering LOR-driven unscattered contribution Voxel-driven unscattered contribution Photon transport LOR-driven single scatter MIS
25 Detector response photon absorption scattering Electronics hits Problem: The domain is 4 dimensional.
26 Quasi-Monte Carlo filtering L L w L X x) w( x) dx = ( L( X x( t)) dt 1 0
27 without with Detector Scattering Compensation
28 Back projection Measured values voxels Physics Simulation y Correction Expected values y~ Monte Carlo simulation or simplification ~ y ( n) A x ( n) Random or deterministic estimation x T ( n 1) ~ x ( n) y A ( y T A 1 n) Ratio of measured and computed values
29 2D study Measured sinogram
30 Backprojection with unbiased forward projection Contribution to x V outlier y / ~ y L L x T ( n 1) ~ x ( n) y / yˆ L L y A ( y T A 1 n) Ratio of measured and computed values ŷ L y~l Samples ŷ L
31 Reduce bias and outliers Averaging iteration: ~ ( n) n yl (1 n) ~ ( 1) yl Metropolis iteration: Ignore outliers randomly Acceptance with probability yˆ n L a min{ˆ y L L / ~ y n / (n) L,1} n
32 Recons res 1.3 mm voxels res 0.1 mm voxels
33 Conclusions GPU is an effective tool for computing tens of thousands of parallel threads having no conditionals and collisions. The problem must be interpreted and solved to keep this requirement in mind. Randomization (Monte Carlo) can help structure the problem in this way.
In the real world, light sources emit light particles, which travel in space, reflect at objects or scatter in volumetric media (potentially multiple
1 In the real world, light sources emit light particles, which travel in space, reflect at objects or scatter in volumetric media (potentially multiple times) until they are absorbed. On their way, they
More informationA Fast GPU-Based Approach to Branchless Distance-Driven Projection and Back-Projection in Cone Beam CT
A Fast GPU-Based Approach to Branchless Distance-Driven Projection and Back-Projection in Cone Beam CT Daniel Schlifske ab and Henry Medeiros a a Marquette University, 1250 W Wisconsin Ave, Milwaukee,
More informationGPU-Based Acceleration for CT Image Reconstruction
GPU-Based Acceleration for CT Image Reconstruction Xiaodong Yu Advisor: Wu-chun Feng Collaborators: Guohua Cao, Hao Gong Outline Introduction and Motivation Background Knowledge Challenges and Proposed
More informationMixing Monte Carlo and Progressive Rendering for Improved Global Illumination
Mixing Monte Carlo and Progressive Rendering for Improved Global Illumination Ian C. Doidge Mark W. Jones Benjamin Mora Swansea University, Wales Thursday 14 th June Computer Graphics International 2012
More informationAdvanced Graphics. Path Tracing and Photon Mapping Part 2. Path Tracing and Photon Mapping
Advanced Graphics Path Tracing and Photon Mapping Part 2 Path Tracing and Photon Mapping Importance Sampling Combine importance sampling techniques Reflectance function (diffuse + specular) Light source
More informationChoosing the Right Algorithm & Guiding
Choosing the Right Algorithm & Guiding PHILIPP SLUSALLEK & PASCAL GRITTMANN Topics for Today What does an implementation of a high-performance renderer look like? Review of algorithms which to choose for
More informationIntroduction to CUDA Algoritmi e Calcolo Parallelo. Daniele Loiacono
Introduction to CUDA Algoritmi e Calcolo Parallelo References q This set of slides is mainly based on: " CUDA Technical Training, Dr. Antonino Tumeo, Pacific Northwest National Laboratory " Slide of Applied
More informationBasics of treatment planning II
Basics of treatment planning II Sastry Vedam PhD DABR Introduction to Medical Physics III: Therapy Spring 2015 Monte Carlo Methods 1 Monte Carlo! Most accurate at predicting dose distributions! Based on
More informationPhoton Maps. The photon map stores the lighting information on points or photons in 3D space ( on /near 2D surfaces)
Photon Mapping 1/36 Photon Maps The photon map stores the lighting information on points or photons in 3D space ( on /near 2D surfaces) As opposed to the radiosity method that stores information on surface
More informationPhilipp Slusallek Karol Myszkowski. Realistic Image Synthesis SS18 Instant Global Illumination
Realistic Image Synthesis - Instant Global Illumination - Karol Myszkowski Overview of MC GI methods General idea Generate samples from lights and camera Connect them and transport illumination along paths
More informationParticipating Media. Part I: participating media in general. Oskar Elek MFF UK Prague
Participating Media Part I: participating media in general Oskar Elek MFF UK Prague Outline Motivation Introduction Properties of participating media Rendering equation Storage strategies Non-interactive
More informationGAMES Webinar: Rendering Tutorial 2. Monte Carlo Methods. Shuang Zhao
GAMES Webinar: Rendering Tutorial 2 Monte Carlo Methods Shuang Zhao Assistant Professor Computer Science Department University of California, Irvine GAMES Webinar Shuang Zhao 1 Outline 1. Monte Carlo integration
More informationGPU implementation for rapid iterative image reconstruction algorithm
GPU implementation for rapid iterative image reconstruction algorithm and its applications in nuclear medicine Jakub Pietrzak Krzysztof Kacperski Department of Medical Physics, Maria Skłodowska-Curie Memorial
More informationMETROPOLIS MONTE CARLO SIMULATION
POLYTECHNIC UNIVERSITY Department of Computer and Information Science METROPOLIS MONTE CARLO SIMULATION K. Ming Leung Abstract: The Metropolis method is another method of generating random deviates that
More informationIntroduction to CUDA Algoritmi e Calcolo Parallelo. Daniele Loiacono
Introduction to CUDA Algoritmi e Calcolo Parallelo References This set of slides is mainly based on: CUDA Technical Training, Dr. Antonino Tumeo, Pacific Northwest National Laboratory Slide of Applied
More informationIntroduction to Positron Emission Tomography
Planar and SPECT Cameras Summary Introduction to Positron Emission Tomography, Ph.D. Nuclear Medicine Basic Science Lectures srbowen@uw.edu System components: Collimator Detector Electronics Collimator
More informationNVIDIA GTX200: TeraFLOPS Visual Computing. August 26, 2008 John Tynefield
NVIDIA GTX200: TeraFLOPS Visual Computing August 26, 2008 John Tynefield 2 Outline Execution Model Architecture Demo 3 Execution Model 4 Software Architecture Applications DX10 OpenGL OpenCL CUDA C Host
More informationImplementation and evaluation of a fully 3D OS-MLEM reconstruction algorithm accounting for the PSF of the PET imaging system
Implementation and evaluation of a fully 3D OS-MLEM reconstruction algorithm accounting for the PSF of the PET imaging system 3 rd October 2008 11 th Topical Seminar on Innovative Particle and Radiation
More informationGPU-accelerated data expansion for the Marching Cubes algorithm
GPU-accelerated data expansion for the Marching Cubes algorithm San Jose (CA) September 23rd, 2010 Christopher Dyken, SINTEF Norway Gernot Ziegler, NVIDIA UK Agenda Motivation & Background Data Compaction
More informationMonte Carlo Method for Solving Inverse Problems of Radiation Transfer
INVERSE AND ILL-POSED PROBLEMS SERIES Monte Carlo Method for Solving Inverse Problems of Radiation Transfer V.S.Antyufeev. ///VSP/// UTRECHT BOSTON KÖLN TOKYO 2000 Contents Chapter 1. Monte Carlo modifications
More informationAn approach to calculate and visualize intraoperative scattered radiation exposure
Peter L. Reicertz Institut für Medizinische Informatik An approach to calculate and visualize intraoperative scattered radiation exposure Markus Wagner University of Braunschweig Institute of Technology
More informationGPU applications in Cancer Radiation Therapy at UCSD. Steve Jiang, UCSD Radiation Oncology Amit Majumdar, SDSC Dongju (DJ) Choi, SDSC
GPU applications in Cancer Radiation Therapy at UCSD Steve Jiang, UCSD Radiation Oncology Amit Majumdar, SDSC Dongju (DJ) Choi, SDSC Conventional Radiotherapy SIMULATION: Construciton, Dij Days PLANNING:
More informationINTERACTIVE GLOBAL ILLUMINATION WITH VIRTUAL LIGHT SOURCES
INTERACTIVE GLOBAL ILLUMINATION WITH VIRTUAL LIGHT SOURCES A dissertation submitted to the Budapest University of Technology in fulfillment of the requirements for the degree of Doctor of Philosophy (Ph.D.)
More informationEfficient Computation of Radial Distribution Function on GPUs
Efficient Computation of Radial Distribution Function on GPUs Yi-Cheng Tu * and Anand Kumar Department of Computer Science and Engineering University of South Florida, Tampa, Florida 2 Overview Introduction
More informationMCMC Methods for data modeling
MCMC Methods for data modeling Kenneth Scerri Department of Automatic Control and Systems Engineering Introduction 1. Symposium on Data Modelling 2. Outline: a. Definition and uses of MCMC b. MCMC algorithms
More informationad-heap: an Efficient Heap Data Structure for Asymmetric Multicore Processors
ad-heap: an Efficient Heap Data Structure for Asymmetric Multicore Processors Weifeng Liu and Brian Vinter Niels Bohr Institute University of Copenhagen Denmark {weifeng, vinter}@nbi.dk March 1, 2014 Weifeng
More informationInteractive Methods in Scientific Visualization
Interactive Methods in Scientific Visualization GPU Volume Raycasting Christof Rezk-Salama University of Siegen, Germany Volume Rendering in a Nutshell Image Plane Eye Data Set Back-to-front iteration
More informationOptimization solutions for the segmented sum algorithmic function
Optimization solutions for the segmented sum algorithmic function ALEXANDRU PÎRJAN Department of Informatics, Statistics and Mathematics Romanian-American University 1B, Expozitiei Blvd., district 1, code
More informationParallel Computing: Parallel Architectures Jin, Hai
Parallel Computing: Parallel Architectures Jin, Hai School of Computer Science and Technology Huazhong University of Science and Technology Peripherals Computer Central Processing Unit Main Memory Computer
More informationDosimetry Simulations with the UF-B Series Phantoms using the PENTRAN-MP Code System
Dosimetry Simulations with the UF-B Series Phantoms using the PENTRAN-MP Code System A. Al-Basheer, M. Ghita, G. Sjoden, W. Bolch, C. Lee, and the ALRADS Group Computational Medical Physics Team Nuclear
More informationComputer Graphics. - Volume Rendering - Philipp Slusallek
Computer Graphics - Volume Rendering - Philipp Slusallek Overview Motivation Volume Representation Indirect Volume Rendering Volume Classification Direct Volume Rendering Applications: Bioinformatics Image
More informationRendering Algorithms: Real-time indirect illumination. Spring 2010 Matthias Zwicker
Rendering Algorithms: Real-time indirect illumination Spring 2010 Matthias Zwicker Today Real-time indirect illumination Ray tracing vs. Rasterization Screen space techniques Visibility & shadows Instant
More informationCUDA and OpenCL Implementations of 3D CT Reconstruction for Biomedical Imaging
CUDA and OpenCL Implementations of 3D CT Reconstruction for Biomedical Imaging Saoni Mukherjee, Nicholas Moore, James Brock and Miriam Leeser September 12, 2012 1 Outline Introduction to CT Scan, 3D reconstruction
More informationIntroduction to Multicore architecture. Tao Zhang Oct. 21, 2010
Introduction to Multicore architecture Tao Zhang Oct. 21, 2010 Overview Part1: General multicore architecture Part2: GPU architecture Part1: General Multicore architecture Uniprocessor Performance (ECint)
More informationAccelerating koblinger's method of compton scattering on GPU
Available online at www.sciencedirect.com Procedia Engineering 24 (211) 242 246 211 International Conference on Advances in Engineering Accelerating koblingers method of compton scattering on GPU Jing
More informationIntroduc)on to PET Image Reconstruc)on. Tomographic Imaging. Projec)on Imaging. Types of imaging systems
Introduc)on to PET Image Reconstruc)on Adam Alessio http://faculty.washington.edu/aalessio/ Nuclear Medicine Lectures Imaging Research Laboratory Division of Nuclear Medicine University of Washington Fall
More informationhigh performance medical reconstruction using stream programming paradigms
high performance medical reconstruction using stream programming paradigms This Paper describes the implementation and results of CT reconstruction using Filtered Back Projection on various stream programming
More informationRecent Advances in Monte Carlo Offline Rendering
CS294-13: Special Topics Lecture #6 Advanced Computer Graphics University of California, Berkeley Monday, 21 September 2009 Recent Advances in Monte Carlo Offline Rendering Lecture #6: Monday, 21 September
More informationComputing on GPUs. Prof. Dr. Uli Göhner. DYNAmore GmbH. Stuttgart, Germany
Computing on GPUs Prof. Dr. Uli Göhner DYNAmore GmbH Stuttgart, Germany Summary: The increasing power of GPUs has led to the intent to transfer computing load from CPUs to GPUs. A first example has been
More informationVirtual Spherical Lights for Many-Light Rendering of Glossy Scenes
Virtual Spherical Lights for Many-Light Rendering of Glossy Scenes Miloš Hašan Jaroslav Křivánek * Bruce Walter Kavita Bala Cornell University * Charles University in Prague Global Illumination Effects
More informationMULTI-DIMENSIONAL MONTE CARLO INTEGRATION
CS580: Computer Graphics KAIST School of Computing Chapter 3 MULTI-DIMENSIONAL MONTE CARLO INTEGRATION 2 1 Monte Carlo Integration This describes a simple technique for the numerical evaluation of integrals
More informationChapter 1. Introduction
Chapter 1 Introduction A Monte Carlo method is a compuational method that uses random numbers to compute (estimate) some quantity of interest. Very often the quantity we want to compute is the mean of
More informationFree Path Sampling in High Resolution Inhomogeneous Participating Media
Volume (1981), Number pp. 1 12 COMPUTER GRAPHICS forum Free Path Sampling in High Resolution Inhomogeneous Participating Media László Szirmay-Kalos, Balázs Tóth, and Milán Magdics Budapest University of
More informationSequential Monte Carlo Adaptation in Low-Anisotropy Participating Media. Vincent Pegoraro Ingo Wald Steven G. Parker
Sequential Monte Carlo Adaptation in Low-Anisotropy Participating Media Vincent Pegoraro Ingo Wald Steven G. Parker Outline Introduction Related Work Monte Carlo Integration Radiative Energy Transfer SMC
More informationVARIANCE REDUCTION TECHNIQUES IN MONTE CARLO SIMULATIONS K. Ming Leung
POLYTECHNIC UNIVERSITY Department of Computer and Information Science VARIANCE REDUCTION TECHNIQUES IN MONTE CARLO SIMULATIONS K. Ming Leung Abstract: Techniques for reducing the variance in Monte Carlo
More informationGPU-based Distributed Behavior Models with CUDA
GPU-based Distributed Behavior Models with CUDA Courtesy: YouTube, ISIS Lab, Universita degli Studi di Salerno Bradly Alicea Introduction Flocking: Reynolds boids algorithm. * models simple local behaviors
More informationParticipating Media. Part II: interactive methods, atmosphere and clouds. Oskar Elek MFF UK Prague
Participating Media Part II: interactive methods, atmosphere and clouds Oskar Elek MFF UK Prague Outline Motivation Introduction Properties of participating media Rendering equation Storage strategies
More informationScalable many-light methods
Scalable many-light methods Jaroslav Křivánek Charles University in Prague Instant radiosity Approximate indirect illumination by 1. Generate VPLs 2. Render with VPLs 2 Instant radiosity with glossy surfaces
More information2017 Summer Course on Optical Oceanography and Ocean Color Remote Sensing. Monte Carlo Simulation
2017 Summer Course on Optical Oceanography and Ocean Color Remote Sensing Curtis Mobley Monte Carlo Simulation Delivered at the Darling Marine Center, University of Maine July 2017 Copyright 2017 by Curtis
More informationAdvanced CUDA Optimization 1. Introduction
Advanced CUDA Optimization 1. Introduction Thomas Bradley Agenda CUDA Review Review of CUDA Architecture Programming & Memory Models Programming Environment Execution Performance Optimization Guidelines
More informationElectron Dose Kernels (EDK) for Secondary Particle Transport in Deterministic Simulations
Electron Dose Kernels (EDK) for Secondary Particle Transport in Deterministic Simulations A. Al-Basheer, G. Sjoden, M. Ghita Computational Medical Physics Team Nuclear & Radiological Engineering University
More informationCompressed Sensing for Electron Tomography
University of Maryland, College Park Department of Mathematics February 10, 2015 1/33 Outline I Introduction 1 Introduction 2 3 4 2/33 1 Introduction 2 3 4 3/33 Tomography Introduction Tomography - Producing
More informationEXPOSING PARTICLE PARALLELISM IN THE XGC PIC CODE BY EXPLOITING GPU MEMORY HIERARCHY. Stephen Abbott, March
EXPOSING PARTICLE PARALLELISM IN THE XGC PIC CODE BY EXPLOITING GPU MEMORY HIERARCHY Stephen Abbott, March 26 2018 ACKNOWLEDGEMENTS Collaborators: Oak Ridge Nation Laboratory- Ed D Azevedo NVIDIA - Peng
More informationImplementation of Bidirectional Ray Tracing Algorithm
Implementation of Bidirectional Ray Tracing Algorithm PÉTER DORNBACH jet@inf.bme.hu Technical University of Budapest, Department of Control Engineering and Information Technology, Mûegyetem rkp. 9, 1111
More informationSupercomputing the Cascade Processes of Radiation Transport
19 th World Conference on Non-Destructive Testing 2016 Supercomputing the Cascade Processes of Radiation Transport Mikhail ZHUKOVSKIY 1, Mikhail MARKOV 1, Sergey PODOLYAKO 1, Roman USKOV 1, Carsten BELLON
More informationAnatomy-based Regularization Methods for Dynamic PET Reconstruction
Eighth Hungarian Conference on Computer Graphics and Geometry, Budapest, 1 Anatomy-based Regularization Methods for Dynamic PET Reconstruction Ágota Kacsó and László Szirmay-Kalos Budapest University of
More informationImproved Modeling of System Response in List Mode EM Reconstruction of Compton Scatter Camera Images'
IEEE TRANSACTIONS ON NUCLEAR SCIENCE, VOL. 48, NO. 1, FEBRUARY 2001 111 Improved Modeling of System Response in List Mode EM Reconstruction of Compton Scatter Camera Images' S.J. Wilderman2, J.A. Fe~sle?t~>~,
More informationSimulation. Monte Carlo
Simulation Monte Carlo Monte Carlo simulation Outcome of a single stochastic simulation run is always random A single instance of a random variable Goal of a simulation experiment is to get knowledge about
More informationDiscussion. Smoothness of Indirect Lighting. History and Outline. Irradiance Calculation. Irradiance Caching. Advanced Computer Graphics (Spring 2013)
Advanced Computer Graphics (Spring 2013 CS 283, Lecture 12: Recent Advances in Monte Carlo Offline Rendering Ravi Ramamoorthi http://inst.eecs.berkeley.edu/~cs283/sp13 Some slides/ideas courtesy Pat Hanrahan,
More informationB. Tech. Project Second Stage Report on
B. Tech. Project Second Stage Report on GPU Based Active Contours Submitted by Sumit Shekhar (05007028) Under the guidance of Prof Subhasis Chaudhuri Table of Contents 1. Introduction... 1 1.1 Graphic
More informationSamuel Coolidge, Dan Simon, Dennis Shasha, Technical Report NYU/CIMS/TR
Detecting Missing and Spurious Edges in Large, Dense Networks Using Parallel Computing Samuel Coolidge, sam.r.coolidge@gmail.com Dan Simon, des480@nyu.edu Dennis Shasha, shasha@cims.nyu.edu Technical Report
More informationGPU Based Convolution/Superposition Dose Calculation
GPU Based Convolution/Superposition Dose Calculation Todd McNutt 1, Robert Jacques 1,2, Stefan Pencea 2, Sharon Dye 2, Michel Moreau 2 1) Johns Hopkins University 2) Elekta Research Funded by and Licensed
More informationA Parallel Access Method for Spatial Data Using GPU
A Parallel Access Method for Spatial Data Using GPU Byoung-Woo Oh Department of Computer Engineering Kumoh National Institute of Technology Gumi, Korea bwoh@kumoh.ac.kr Abstract Spatial access methods
More informationCS 563 Advanced Topics in Computer Graphics Irradiance Caching and Particle Tracing. by Stephen Kazmierczak
CS 563 Advanced Topics in Computer Graphics Irradiance Caching and Particle Tracing by Stephen Kazmierczak Introduction Unbiased light transport algorithms can sometimes take a large number of rays to
More informationGeneric Polyphase Filterbanks with CUDA
Generic Polyphase Filterbanks with CUDA Jan Krämer German Aerospace Center Communication and Navigation Satellite Networks Weßling 04.02.2017 Knowledge for Tomorrow www.dlr.de Slide 1 of 27 > Generic Polyphase
More informationEE382N (20): Computer Architecture - Parallelism and Locality Spring 2015 Lecture 09 GPUs (II) Mattan Erez. The University of Texas at Austin
EE382 (20): Computer Architecture - ism and Locality Spring 2015 Lecture 09 GPUs (II) Mattan Erez The University of Texas at Austin 1 Recap 2 Streaming model 1. Use many slimmed down cores to run in parallel
More informationSampling Using GPU Accelerated Sparse Hierarchical Models
Sampling Using GPU Accelerated Sparse Hierarchical Models Miroslav Stoyanov Oak Ridge National Laboratory supported by Exascale Computing Project (ECP) exascaleproject.org April 9, 28 Miroslav Stoyanov
More informationMali GPU acceleration of HEVC and VP9 Decoder
Mali GPU acceleration of HEVC and VP9 Decoder 2 Web Video continues to grow!!! Video accounted for 50% of the mobile traffic in 2012 - Citrix ByteMobile's 4Q 2012 Analytics Report. Globally, IP video traffic
More informationParallel Direct Simulation Monte Carlo Computation Using CUDA on GPUs
Parallel Direct Simulation Monte Carlo Computation Using CUDA on GPUs C.-C. Su a, C.-W. Hsieh b, M. R. Smith b, M. C. Jermy c and J.-S. Wu a a Department of Mechanical Engineering, National Chiao Tung
More informationerror
PARALLEL IMPLEMENTATION OF STOCHASTIC ITERATION ALGORITHMS Roel Mart nez, László Szirmay-Kalos, Mateu Sbert, Ali Mohamed Abbas Department of Informatics and Applied Mathematics, University of Girona Department
More informationMéthodes d imagerie pour les écoulements et le CND
Méthodes d imagerie pour les écoulements et le CND Journée scientifique FED3G CEA LIST/Lab Imagerie Tomographie et Traitement Samuel Legoupil 15 juin 2012 2D/3D imaging tomography Example Petrochemical
More informationOutline. Monte Carlo Radiation Transport Modeling Overview (MCNP5/6) Monte Carlo technique: Example. Monte Carlo technique: Introduction
Monte Carlo Radiation Transport Modeling Overview () Lecture 7 Special Topics: Device Modeling Outline Principles of Monte Carlo modeling Radiation transport modeling with Utilizing Visual Editor (VisEd)
More information3D-OSEM Transition Matrix for High Resolution PET Imaging with Modeling of the Gamma-Event Detection
3D-OSEM Transition Matrix for High Resolution PET Imaging with Modeling of the Gamma-Event Detection Juan E. Ortuño, George Kontaxakis, Member, IEEE, Pedro Guerra, Student Member, IEEE, and Andrés Santos,
More informationIntroduction to Emission Tomography
Introduction to Emission Tomography Gamma Camera Planar Imaging Robert Miyaoka, PhD University of Washington Department of Radiology rmiyaoka@u.washington.edu Gamma Camera: - collimator - detector (crystal
More informationCSCI 402: Computer Architectures. Parallel Processors (2) Fengguang Song Department of Computer & Information Science IUPUI.
CSCI 402: Computer Architectures Parallel Processors (2) Fengguang Song Department of Computer & Information Science IUPUI 6.6 - End Today s Contents GPU Cluster and its network topology The Roofline performance
More informationConstructing System Matrices for SPECT Simulations and Reconstructions
Constructing System Matrices for SPECT Simulations and Reconstructions Nirantha Balagopal April 28th, 2017 M.S. Report The University of Arizona College of Optical Sciences 1 Acknowledgement I would like
More informationIntroduction to SparseGrid
Introduction to SparseGrid Jelmer Ypma November 27, 2011 Abstract This vignette describes how to use SparseGrid, which is an R translation 1 of the Matlab code on http://www.sparse-grids.de (Heiss & Winschel,
More informationThese are the annotated slides of the real time part of the Many Lights Rendering
These are the annotated slides of the real time part of the Many Lights Rendering course at SIGGRAPH 12. This part covers techniques for making many lights methods suitable for (high quality) real time
More informationImportance Sampling of Area Lights in Participating Media
Importance Sampling of Area Lights in Participating Media Christopher Kulla Marcos Fajardo Outline Previous Work Single Scattering Equation Importance Sampling for Point Lights Importance Sampling for
More informationWorkshop on Quantitative SPECT and PET Brain Studies January, 2013 PUCRS, Porto Alegre, Brasil Corrections in SPECT and PET
Workshop on Quantitative SPECT and PET Brain Studies 14-16 January, 2013 PUCRS, Porto Alegre, Brasil Corrections in SPECT and PET Físico João Alfredo Borges, Me. Corrections in SPECT and PET SPECT and
More informationReport on Dagstuhl-Seminar 01242: Stochastic Methods in Rendering. June 11-15, 2001
Report on Dagstuhl-Seminar 01242: Stochastic Methods in Rendering June 11-15, 2001 Stochastic methods have become indispensable tools in computer graphics, and more specifically in rendering, since the
More informationGeneral Purpose GPU Computing in Partial Wave Analysis
JLAB at 12 GeV - INT General Purpose GPU Computing in Partial Wave Analysis Hrayr Matevosyan - NTC, Indiana University November 18/2009 COmputationAL Challenges IN PWA Rapid Increase in Available Data
More informationMulti-Processors and GPU
Multi-Processors and GPU Philipp Koehn 7 December 2016 Predicted CPU Clock Speed 1 Clock speed 1971: 740 khz, 2016: 28.7 GHz Source: Horowitz "The Singularity is Near" (2005) Actual CPU Clock Speed 2 Clock
More informationCIVA Computed Tomography Modeling
CIVA Computed Tomography Modeling R. FERNANDEZ, EXTENDE, France S. LEGOUPIL, M. COSTIN, D. TISSEUR, A. LEVEQUE, CEA-LIST, France page 1 Summary Context From CIVA RT to CIVA CT Reconstruction Methods Applications
More information1. In this problem we use Monte Carlo integration to approximate the area of a circle of radius, R = 1.
1. In this problem we use Monte Carlo integration to approximate the area of a circle of radius, R = 1. A. For method 1, the idea is to conduct a binomial experiment using the random number generator:
More informationEvaluation of Penalty Design in Penalized Maximum- likelihood Image Reconstruction for Lesion Detection
Evaluation of Penalty Design in Penalized Maximum- likelihood Image Reconstruction for Lesion Detection Li Yang, Andrea Ferrero, Rosalie J. Hagge, Ramsey D. Badawi, and Jinyi Qi Supported by NIBIB under
More informationCS8803SC Software and Hardware Cooperative Computing GPGPU. Prof. Hyesoon Kim School of Computer Science Georgia Institute of Technology
CS8803SC Software and Hardware Cooperative Computing GPGPU Prof. Hyesoon Kim School of Computer Science Georgia Institute of Technology Why GPU? A quiet revolution and potential build-up Calculation: 367
More informationNVIDIA s Compute Unified Device Architecture (CUDA)
NVIDIA s Compute Unified Device Architecture (CUDA) Mike Bailey mjb@cs.oregonstate.edu Reaching the Promised Land NVIDIA GPUs CUDA Knights Corner Speed Intel CPUs General Programmability 1 History of GPU
More informationNVIDIA s Compute Unified Device Architecture (CUDA)
NVIDIA s Compute Unified Device Architecture (CUDA) Mike Bailey mjb@cs.oregonstate.edu Reaching the Promised Land NVIDIA GPUs CUDA Knights Corner Speed Intel CPUs General Programmability History of GPU
More informationArchitecture, Programming and Performance of MIC Phi Coprocessor
Architecture, Programming and Performance of MIC Phi Coprocessor JanuszKowalik, Piotr Arłukowicz Professor (ret), The Boeing Company, Washington, USA Assistant professor, Faculty of Mathematics, Physics
More informationREMOVAL OF THE EFFECT OF COMPTON SCATTERING IN 3-D WHOLE BODY POSITRON EMISSION TOMOGRAPHY BY MONTE CARLO
REMOVAL OF THE EFFECT OF COMPTON SCATTERING IN 3-D WHOLE BODY POSITRON EMISSION TOMOGRAPHY BY MONTE CARLO Abstract C.S. Levin, Y-C Tai, E.J. Hoffman, M. Dahlbom, T.H. Farquhar UCLA School of Medicine Division
More informationSpiral ASSR Std p = 1.0. Spiral EPBP Std. 256 slices (0/300) Kachelrieß et al., Med. Phys. 31(6): , 2004
Spiral ASSR Std p = 1.0 Spiral EPBP Std p = 1.0 Kachelrieß et al., Med. Phys. 31(6): 1623-1641, 2004 256 slices (0/300) Advantages of Cone-Beam Spiral CT Image quality nearly independent of pitch Increase
More informationCOMPARISON OF PYTHON 3 SINGLE-GPU PARALLELIZATION TECHNOLOGIES ON THE EXAMPLE OF A CHARGED PARTICLES DYNAMICS SIMULATION PROBLEM
COMPARISON OF PYTHON 3 SINGLE-GPU PARALLELIZATION TECHNOLOGIES ON THE EXAMPLE OF A CHARGED PARTICLES DYNAMICS SIMULATION PROBLEM A. Boytsov 1,a, I. Kadochnikov 1,4,b, M. Zuev 1,c, A. Bulychev 2,d, Ya.
More informationMe Again! Peter Chapman. if it s important / time-sensitive
Me Again! Peter Chapman P.Chapman1@bradford.ac.uk pchapman86@gmail.com if it s important / time-sensitive Issues? Working on something specific? Need some direction? Don t hesitate to get in touch http://peter-chapman.co.uk/teaching
More informationLaPerm: Locality Aware Scheduler for Dynamic Parallelism on GPUs
LaPerm: Locality Aware Scheduler for Dynamic Parallelism on GPUs Jin Wang*, Norm Rubin, Albert Sidelnik, Sudhakar Yalamanchili* *Computer Architecture and System Lab, Georgia Institute of Technology NVIDIA
More informationMedical Image Reconstruction Term II 2012 Topic 6: Tomography
Medical Image Reconstruction Term II 2012 Topic 6: Tomography Professor Yasser Mostafa Kadah Tomography The Greek word tomos means a section, a slice, or a cut. Tomography is the process of imaging a cross
More informationOp#miza#on of CUDA- based Monte Carlo Simula#on for Radia#on Therapy. GTC 2014 N. Henderson & K. Murakami
Op#miza#on of CUDA- based Monte Carlo Simula#on for Radia#on Therapy GTC 2014 N. Henderson & K. Murakami The collabora#on Geant4 @ Special thanks to the CUDA Center of Excellence Program Makoto Asai, SLAC
More informationAn Introduction to GPGPU Pro g ra m m ing - CUDA Arc hitec ture
An Introduction to GPGPU Pro g ra m m ing - CUDA Arc hitec ture Rafia Inam Mälardalen Real-Time Research Centre Mälardalen University, Västerås, Sweden http://www.mrtc.mdh.se rafia.inam@mdh.se CONTENTS
More informationGPU ACCELERATED DATABASE MANAGEMENT SYSTEMS
CIS 601 - Graduate Seminar Presentation 1 GPU ACCELERATED DATABASE MANAGEMENT SYSTEMS PRESENTED BY HARINATH AMASA CSU ID: 2697292 What we will talk about.. Current problems GPU What are GPU Databases GPU
More informationGPU 101. Mike Bailey. Oregon State University. Oregon State University. Computer Graphics gpu101.pptx. mjb April 23, 2017
1 GPU 101 Mike Bailey mjb@cs.oregonstate.edu gpu101.pptx Why do we care about GPU Programming? A History of GPU Performance vs. CPU Performance 2 Source: NVIDIA How Can You Gain Access to GPU Power? 3
More information