"Fast, High-Fidelity, Multi-Spacecraft Trajectory Simulation for Space Catalogue Applications"

1 "Fast, High-Fidelity, Multi-Spacecraft Trajectory Simulation for Space Catalogue Applications"
Ryan P. Russell, Assistant Professor
Nitin Arora, Ph.D. Candidate
School of Aerospace Engineering, Georgia Institute of Technology
US-China Space Surveillance Technical Interchange, Beijing, China, Oct 2011

2 Motivations
- Space debris: 2009 Iridium/Cosmos collision
- Currently track ~15K objects
- Next-generation sensors will track ~100K objects
- Need faster/better state and uncertainty prediction
- Covariance realism
[Figure: number of objects tracked]
Russell R.P.

3 Motivation
- As the near-space environment becomes more crowded, accurately tracking and cataloguing a growing number of objects is becoming more demanding and requires high-fidelity spacecraft trajectory simulations.
- High-fidelity trajectory computation is slow. The problem is compounded when tracking a large number of objects (~20K).
- Classic tradeoff between speed and accuracy: fast semi-analytic techniques (SGP4) versus high-fidelity Special Perturbations (SP).
- To achieve both in the context of real-time tracking of on the order of 100K or more resident space objects, a paradigm shift is necessary to make the problem tractable.

4 Aim
- Bring together innovations in fast force model computation AND single-computer parallel programming (i.e., GPU) to achieve BOTH speed and accuracy.
- Multiple orders of magnitude in speedup are sought.
- Maintain approximately the accuracy of SP.
[Diagram labels: fast ephemeris, fast gravity model, parallel integration, future models]

5 Approach
We propose a high-fidelity spacecraft integration tool that takes advantage of:
- New (CPU-based) fast and accurate perturbation models for high-fidelity gravity accelerations and ephemeris
- A Graphics Processing Unit (GPU) based Runge-Kutta solver that exploits the massive parallelism across multiple spacecraft
The expected speedups are multiplicative (100 x 100 = 10,000).
The advantage of GPU-based parallelism lies in its single-user capability, without the need for expensive computer clusters or semi-analytic models (which lose accuracy).

6 Fast Force Models
Fast and accurate luni-solar ephemeris and Earth orientation: FIRE
- Reduction in computational time (multiple orders of magnitude improvement)
- Provides continuous and analytic first and second derivatives of states and orientation matrices
- User friendly (requires no more expertise than using JPL's SPICE)
- High flexibility and portability
Fast and efficient geopotential computation: FETCH
- 3D interpolation-based global gravity model; trades memory for speed
- Non-singular and continuous to any order
- Accurate to the error in the SH series
- Scalable to any degree/order
- Extremely user friendly

7 FIRE for Fast Luni-Solar and Earth Orientation Ephemerides
Reference: Arora, N., Russell, R. P., "A Fast, Accurate, and Smooth Planetary Ephemeris Retrieval System," Celestial Mechanics and Dynamical Astronomy, Vol. 108, No. 2, 2010, pp , DOI /s
- NEED: position and velocity of the Moon and Sun, and Earth orientation
- Motivation: the JPL SPICE ephemeris is slow (software can spend more than half its time getting ephemeris data)
- 2 orders of magnitude reduction in call time for body and orientation ephemeris calls, relative to calls to SPICE
- Custom built for problems that favor higher speeds and smooth derivatives (continuous and analytic first and second derivatives)
- Improvement in trajectory propagation speeds (good for Monte Carlo runs, etc.)
[Figure: CURRENT vs. PREFERRED breakdown of run time into ephemeris load and other load]
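
The link between a faster ephemeris call and a faster propagation can be sketched with Amdahl's law. This is only an illustration, not a result from the slides: the function name is hypothetical, and the inputs (an ephemeris share of 0.5 of the run time, a 100x faster call) are picked to mirror the "more than half its time" and "2 orders of magnitude" statements above.

```python
def overall_speedup(ephemeris_fraction, ephemeris_speedup):
    """Amdahl's law: whole-run speedup when only the ephemeris part gets faster.

    ephemeris_fraction: share of the original run time spent in ephemeris calls (0..1)
    ephemeris_speedup:  factor by which each ephemeris call is accelerated
    """
    if not 0.0 <= ephemeris_fraction <= 1.0:
        raise ValueError("ephemeris_fraction must lie in [0, 1]")
    if ephemeris_speedup <= 0.0:
        raise ValueError("ephemeris_speedup must be positive")
    # Time left after the change, as a fraction of the original run time.
    remaining = (1.0 - ephemeris_fraction) + ephemeris_fraction / ephemeris_speedup
    return 1.0 / remaining


# Illustrative inputs only (assumptions, not measurements from the talk).
half_of_runtime = overall_speedup(0.5, 100.0)   # ephemeris is half the run time
most_of_runtime = overall_speedup(0.9, 100.0)   # ephemeris dominates the run time
```

The larger the ephemeris share of the original run time, the more of the 100x call speedup carries over to the whole propagation; the remaining "other load" sets the ceiling.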

8 Fast Geopotential Computations
High-fidelity geopotentials are expensive to compute
- Spherical harmonics (SH) is the conventional approach
- A 200x200 SH field has ~40,000 terms and a spatial resolution of ~(100 km x 100 km)
- Recursive formulation: fast on a single processor, not amenable to parallel computing
- SLOW: bottleneck in many applications (orbit estimation, trajectory optimization, mission design)
- PRACTICE: fields are truncated; engineers live with the errors (blissfully unaware in most cases)
Want:
- Fast
- Continuous and smooth across the global domain
- Derivatives continuous to at least 3 orders across the global domain ($\partial^3 U / \partial r^3$ = Hessian of dynamics)
- Singularity free
- Low memory footprint
- Easy to implement
Recursive: term i depends on term i-1
    subroutine get_sphericalharmonics()
      do i = 1, n
        SH(i) = f(SH(i-1))
      end do
    end subroutine
[Image credit: GRACE]
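
The term count and the serial dependency noted above can be sketched as follows. This is a simplified illustration: it uses the plain Legendre polynomial recursion (zonal terms only) as a stand-in for the full normalized associated Legendre recursions that a real SH evaluation needs, and both function names are hypothetical.

```python
def sh_coefficient_count(degree):
    """Number of C_nm and S_nm coefficients in a field complete to `degree`.

    C_nm exists for 0 <= m <= n, S_nm for 1 <= m <= n, which sums to (degree + 1)^2.
    """
    return (degree + 1) ** 2


def legendre_values(max_degree, x):
    """Legendre polynomials P_0..P_max_degree at x via Bonnet's recursion.

    (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)
    Each new value needs the two previous ones, so the loop is inherently serial.
    """
    values = [1.0]
    if max_degree >= 1:
        values.append(x)
    for n in range(1, max_degree):
        nxt = ((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1)
        values.append(nxt)
    return values


terms_200 = sh_coefficient_count(200)   # 40401, i.e. the ~40,000 terms quoted above
p = legendre_values(3, 0.5)             # P_2(0.5) = -0.125, P_3(0.5) = -0.4375
```

Because each loop iteration reads the result of the previous one, the work for a single evaluation point cannot simply be spread over many threads, which is the motivation for the alternative models on the next slide.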

9 Fast Geopotential Computations: Two Solution Methods
Point MasCon (PMC) model
- Model the Earth potential (minus J2) as mass concentrations (mascons) [diagram labels: r_j, r_cm, M_j]
- Simple two-body acceleration calculations
- Compute in parallel with Graphics Processing Units (GPUs)
- Strategy: solve the inverse gravity problem; reduce to linear least squares; orthogonal solution method; optimize location/number of mascons
- Reference: Russell, R.P., Arora, N., "Global Point Mascon Models for Simple, Accurate, and Parallel Geopotential Computation," Paper AAS , AAS/AIAA Space Flight Mechanics Meeting, New Orleans, LA, Feb 2011
Local weighted interpolation: FETCH model
- Precompute the potential (minus J2) in a 3D mesh around the Earth
- Weighted interpolation between nodes
- Trade memory for speed
- J. Junkins' implementation worked well 30 years ago
- Strategy: modernize and improve the Junkins method; adaptive error control
- Local interpolants: each cell has an optimized interpolation polynomial
- Reference: Arora, N., Russell, R.P., "Fast, Efficient and Adaptive Interpolation of the Geopotential," Paper AAS , AAS/AIAA Astrodynamics Specialist Conference, Girdwood, AK, Aug
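
The "simple two-body acceleration calculations" of the point mascon idea can be sketched as a plain sum of point-mass terms. This is a minimal illustration, not the authors' fitted model: the function name, the data layout (a list of gravitational parameters and positions), and the numbers in the usage lines are all assumptions.

```python
def mascon_acceleration(position, mascons):
    """Acceleration at `position` from a set of point masses.

    position: (x, y, z) in km
    mascons:  iterable of (mu_j, (x_j, y_j, z_j)) with mu_j in km^3/s^2
    Returns a = -sum_j mu_j * (r - r_j) / |r - r_j|^3 in km/s^2.
    Each term is independent of the others, so the sum parallelizes easily.
    """
    ax = ay = az = 0.0
    for mu_j, (xj, yj, zj) in mascons:
        dx = position[0] - xj
        dy = position[1] - yj
        dz = position[2] - zj
        dist_cubed = (dx * dx + dy * dy + dz * dz) ** 1.5
        ax -= mu_j * dx / dist_cubed
        ay -= mu_j * dy / dist_cubed
        az -= mu_j * dz / dist_cubed
    return (ax, ay, az)


# Hypothetical usage: one central mass, then two small symmetric mascons.
central = mascon_acceleration((7000.0, 0.0, 0.0), [(398600.4418, (0.0, 0.0, 0.0))])
pair = mascon_acceleration(
    (7000.0, 0.0, 0.0),
    [(1.0, (0.0, 0.0, 100.0)), (1.0, (0.0, 0.0, -100.0))],
)
```

Unlike the recursive SH evaluation, every term in this sum can be computed by a separate thread and then reduced, which is why the slide pairs the mascon model with GPUs.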

10 Results (Interpolation Model)
Example: 200x200 SH field
- Domain is valid from the surface to the Moon
- Speedups up to ~300x compared to 200x200 spherical harmonics
- Requires ~1.8 GB of memory
- Breakeven point is a ~7x7 field
- Continuous to 3 orders (derivatives easy to compute)
Outperforms the new Cubed-Sphere model (Colorado, Beylkin)
- Exact: our model computes acceleration as the direct gradient of our interpolating function (we do not fit acceleration)
- Faster: ~4-fold, we think; hard to tell, but we can calibrate against their breakeven point of 20x20
- ~Same memory, ~same accuracy
- Continuous across all boundaries
Eliminates non-spherical gravity as a speed bottleneck in optimization, targeting, estimation, etc.

11 GPU Computing
- GPUs are multi-threaded computational engines
- They can execute hundreds of thousands of threads simultaneously
- CUDA (Compute Unified Device Architecture) is a GPU-based parallel programming model and software environment
Programming architecture
- Computational grid divided into blocks and sub-blocks
- Each sub-block contains a certain number of threads
- Inter-thread communication is allowed within a sub-block
- Inter-block communication is not allowed
===> PERSONAL SUPERCOMPUTER

12 GPU-Based RK Integrator
Runge-Kutta integrator (preliminary study)
- Explicit fixed-step RK45
- Step size determined by the highest-eccentricity case, evaluated first on the CPU (future work includes variable step on the GPU)
- Capable of using any force model implemented in C
- Single-precision version gives up to 600x speedups and double precision 150x to 200x (compared to a similar algorithm on the CPU, for thousands of objects in parallel)
Parallelization structure
- Each thread is responsible for integrating its own trajectory
- Leads to an embarrassingly parallel implementation with very little inter-thread communication across the GPU threads
- Shared memory used to store ephemeris data (computed once on the CPU)
- Constant memory arrays used for storing global grid data (needed for the FETCH model)
- Gravity model coefficients stored in global memory
- Positions for all bodies after each time step are stored and sent back to the CPU (e.g., for later use in solving the conjunction or other similar problems)
- Provides multiplicative speedups when combined with FAST perturbation force models
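
The one-trajectory-per-thread structure can be sketched on the CPU as a loop over objects that all share the same fixed step. This is a simplified stand-in, not the authors' code: it uses the classical fourth-order Runge-Kutta scheme rather than the RK45 coefficients (which the slides do not list), a two-body force model only, and hypothetical function names.

```python
import math

MU_EARTH = 398600.4418  # km^3/s^2, Earth gravitational parameter


def two_body(state):
    """Time derivative of (x, y, z, vx, vy, vz) under point-mass gravity."""
    x, y, z, vx, vy, vz = state
    r_cubed = (x * x + y * y + z * z) ** 1.5
    return (vx, vy, vz,
            -MU_EARTH * x / r_cubed,
            -MU_EARTH * y / r_cubed,
            -MU_EARTH * z / r_cubed)


def rk4_step(f, state, h):
    """One classical RK4 step of size h (stand-in for the fixed-step RK45)."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def propagate_all(states, h, n_steps, f=two_body):
    """Propagate every object with the same fixed step and keep all states.

    The outer loop body touches only its own object, so on a GPU each
    iteration could be handed to a separate thread.
    """
    histories = []
    for state in states:
        history = [state]
        for _ in range(n_steps):
            state = rk4_step(f, state, h)
            history.append(state)
        histories.append(history)
    return histories
```

A shared fixed step keeps every thread on the same instruction sequence, which is why the slide sizes the step for the most demanding (highest-eccentricity) object before launching the batch.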

13 Overall Algorithmic Details
[Flowchart, reconstructed from the slide labels]
CPU:
- ~20K objects to be integrated
- Initialize FETCH, FIRE and RK-GPU
- Call the GPU-RK
- Call the GPU again for second batch runs
- Solve problems: conjunction analysis, etc.
GPU:
- Multiple threads: one thread per body
- Common GPU-based FETCH model + ephemeris perturbation model + simple drag model
- RK45 fixed-step integration
Transfers:
- Transfer one-time common data
- Transfer one-time solution data to the CPU

14 Current Tool Configuration
High-fidelity force model:
- Ephemeris-based other-body (Sun and Moon) perturbation model, implemented via FIRE
- Two-body + higher-order gravity field acceleration obtained via the FETCH gravity model (implemented on the GPU) at 156x156 resolution (~200x speedups, 1.6 GB memory)
- Drag force implemented via a simple exponential model
- The integration step size is determined on the CPU and copied over to the GPU
GPU execution configuration:
- Fixed number of threads per block: currently set to 64
- Number of blocks determined dynamically at runtime
- ~3 KB of shared memory used in double precision
- Fortran-to-CUDA wrapper file developed for fast data transfer from Fortran to CUDA
The CPU implementation used for comparison employs a highly tuned non-singular SH-based gravity model with a variable-step RK45 integrator set to a unitless tolerance of 1E-12.
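
How the number of blocks might follow from the object count at runtime can be sketched as below. The 64 threads per block comes from the slide; the ceiling-division rule, the flat index formula, and the function names are assumptions based on the usual CUDA launch pattern, written here in Python for illustration.

```python
def launch_configuration(n_objects, threads_per_block=64):
    """Return (blocks, threads_per_block) so that every object gets one thread.

    Uses ceiling division, so the last block may contain idle threads.
    """
    if n_objects < 0 or threads_per_block <= 0:
        raise ValueError("invalid launch parameters")
    blocks = (n_objects + threads_per_block - 1) // threads_per_block
    return blocks, threads_per_block


def object_index(block_idx, thread_idx, threads_per_block=64):
    """Flat object index handled by a given thread (blockIdx * blockDim + threadIdx)."""
    return block_idx * threads_per_block + thread_idx


blocks, tpb = launch_configuration(20000)   # ~20K objects, as on the slides
idle_threads = blocks * tpb - 20000         # threads with no object to integrate
```

A kernel following this pattern would typically return immediately for any thread whose flat index is at or beyond the object count.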

15 Performance Evaluation Cases
Case 1: cluster of objects
- Objects clustered in a normal 3D distribution
- Average orbital elements for reference: a = 6700 km, ecc = 0.20, inclination = 35 deg, true anomaly = 0.0
Case 2: random distribution of objects
- Objects are close to uniformly distributed
- Perigee varies from 6478 to km
- Eccentricity varies between 0.01 and 0.9
- Other elements span the full range

16 Test Configuration
CPU:
- Intel Xeon, 2.27 GHz
- 8 GB of memory
- Compiled with Intel Fortran Compiler 11.0 with O2 optimizations
GPU:
- Tesla C2050 (Fermi architecture)
- 448 CUDA cores + 3 GB of onboard memory
- Compiled with NVCC compiler 4.0

17 Case 1 Example Run

18 Absolute Performance: Case 1
- TOF = 10 min to 2 days
- 10,000 objects simulated for ~2 days take ~30 seconds

19 Speedup: Case 1
[Plot: speedup; legend entries 20,000 and 10,000]
- We achieve in excess of four orders of magnitude in speedup
- The high performance reflects the algorithm's good L1 cache utilization

20 Absolute Performance: Case 2
- TOF = 10 min to 2 hrs

21 Speedup: Case 2
- For the random distribution case the speedup is ~half that of Case 1, due to L1 cache behavior (memory access)
- Still achieve 5000x over a tuned CPU implementation
- In essence, this case represents a lower bound on the performance of our tool

22 Conclusions
Preliminary study/efforts:
- Designed and implemented a high-fidelity spacecraft trajectory integration tool
- A fixed-step Runge-Kutta integrator, along with the high-fidelity FETCH model, has been implemented on the GPU
- 3 to 4 orders of magnitude in speedup are reported
- The biggest current limitation of the tool is an upper bound on either the number of bodies or the number of integration steps
- The tool has immediate potential for a variety of space surveillance applications, including the conjunction problem, covariance realism, particle filters, and general Monte Carlo analyses

23 Future Work
- Shift to a variable-step integrator (must implement the FIRE ephemeris on the GPU)
- Fast density model
- Apply to actual catalogue TLEs
- Propagate covariances as well as states
- Use results for conjunction analyses: offline on the CPU, or directly on the GPU by including algorithms (Chan's, for example)
- Other applications: covariance realism, particle filters, and general Monte Carlo analyses

28 Defining Speedup
- All perturbing functions used as the comparison basis are the same for the CPU and GPU codes except the gravity field accelerations, which are calculated by a non-singular SH-based algorithm on the CPU and by the FETCH gravity model on the GPU.
- The CPU time consists only of the time taken by a representative set of trajectory propagations, which is then extrapolated to the given number of objects.
- When timing the GPU calls, the memory transfer calls are not timed, as they are typically three orders of magnitude shorter than the absolute running times, especially for cases with a large number of integration steps. This has been verified by timing representative GPU memory transfer calls.
- The single trajectory integration time needed to obtain the fixed GPU step size is not included in the absolute GPU-RK running times.
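
The speedup definition above can be sketched as a short calculation. The function names are hypothetical, and the per-trajectory CPU sample times in the usage lines are invented for illustration; only the 10,000 objects and the ~30 s GPU time echo figures quoted on the Case 1 slide.

```python
def extrapolated_cpu_time(sample_times_s, n_objects):
    """Estimate total CPU time from a representative set of timed propagations.

    The mean time per sampled trajectory is scaled to the full object count.
    """
    if not sample_times_s:
        raise ValueError("need at least one timed CPU propagation")
    mean_time = sum(sample_times_s) / len(sample_times_s)
    return mean_time * n_objects


def speedup(sample_times_s, n_objects, gpu_time_s):
    """Ratio of extrapolated CPU time to measured GPU kernel time.

    GPU memory transfers and the single CPU run that fixes the step size
    are excluded, following the definition on the slide.
    """
    return extrapolated_cpu_time(sample_times_s, n_objects) / gpu_time_s


# Hypothetical CPU samples (seconds per trajectory); not measured values.
cpu_samples = [29.0, 30.0, 31.0]
example = speedup(cpu_samples, n_objects=10000, gpu_time_s=30.0)
```

With these made-up samples the ratio comes out at four orders of magnitude, which shows the kind of per-trajectory CPU cost that a speedup of that size implies.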

29 Truth Spherical Harmonics Model
- GRACE GGM02C field, published and available online to degree and order 200x
- x140 comes from GRACE data; higher-order terms come from EGM96
- Gives us a moving target for residuals depending on the degree of the SH field: ~8 digits for a 150x150 field, ~10 digits for a 10x10 field
- Target for RMS(ε) of new models: 1 order of magnitude smaller
[Figure: accumulated errors by degree (from covariance of GGM02C solution)]

30 Performance of a high-fidelity solution fitting a 156x156 truncation of the GRACE field using mascons
[Figure: surface potential]

31 Local Interpolation Model
Discretization:
- Regular grid in lat/lon
- Adaptive shell thickness in the radial direction
- Each local shell has a 3D interpolating function
- Use weighting functions to ensure continuity across shells
- Allow for different interpolating functions in neighboring cells

32 Weight Functions
- Each local cell (four squares) is centered at a node of the grid and has its own polynomial interpolant U_A(x,y)
- Any given square is overlapped by four cells (A, B, C, D)
- Compute U in the overlap region using U_A, U_B, U_C, U_D and the weighting functions w_A, w_B, w_C, w_D
- Continuity (to any order) across boundaries is preserved
- Local interpolation functions are decoupled: fit each cell independently
[Figure: a 2D example in the x-y plane showing the overlapping cells A, B, C, D]
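
The blending of four overlapping interpolants can be sketched in 2D as below. This is an assumption-laden illustration: the slides do not give the weight functions, so a cubic "smoothstep" is used as a stand-in (it gives first-derivative continuity only; higher-order weights would be needed for continuity to higher order), and the corner assignment of A, B, C, D and all function names are hypothetical.

```python
def smoothstep(s):
    """Cubic weight rising from 0 at s=0 to 1 at s=1 with zero slope at both ends."""
    return s * s * (3.0 - 2.0 * s)


def blend_weights(x, y):
    """Weights (w_A, w_B, w_C, w_D) at a point (x, y) in the unit overlap square.

    Assumed layout: A owns corner (0,0), B (1,0), C (0,1), D (1,1).
    The four weights always sum to 1.
    """
    wx, wy = smoothstep(x), smoothstep(y)
    return ((1.0 - wx) * (1.0 - wy), wx * (1.0 - wy), (1.0 - wx) * wy, wx * wy)


def blended_potential(x, y, interpolants):
    """U(x, y) = w_A U_A + w_B U_B + w_C U_C + w_D U_D."""
    return sum(w * u(x, y) for w, u in zip(blend_weights(x, y), interpolants))


# Hypothetical local interpolants: four slightly different fits of one field.
u_a = lambda x, y: 1.0 + 0.10 * x + 0.20 * y
u_b = lambda x, y: 1.0 + 0.11 * x + 0.20 * y
u_c = lambda x, y: 1.0 + 0.10 * x + 0.21 * y
u_d = lambda x, y: 1.0 + 0.11 * x + 0.21 * y
u_mid = blended_potential(0.5, 0.5, (u_a, u_b, u_c, u_d))
```

At each corner only one weight is nonzero, so the blended value matches that cell's own interpolant there; in between, the weights hand over smoothly, which is what lets each cell be fitted independently.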

33 How to Choose Interpolating Functions
- Depart from the Junkins method (to avoid 3D quadratures)
- Use analytic solutions to large least-squares problems, obtained with the algebraic manipulator MAPLE
- Consider a fifth-order polynomial in each direction, leading to a total of 5x5x5 = 125 coefficients
- Evaluate the truth model at, say, $10^3$ equally spaced locations
- Leads to a simple least-squares problem: $(H^T W H)\,x = H^T W y$
- Use MAPLE to get the analytic inverse $(H^T W H)^{-1}$, so we can solve for the coefficients with a simple matrix multiply
- Get analytic inverses for ~400 different interpolating functions
- Then we can optimize coefficient generation at each cell by checking all options
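
The step from the least-squares problem to a "simple matrix multiply" can be written out as follows. This is a standard weighted least-squares derivation using the slide's symbols; the cost function $J$, the matrix name $P$, and the stated dimensions (taken from the 125 coefficients and $10^3$ sample points quoted above) are added here for illustration.

```latex
% y : truth-model values at the sample points (10^3 x 1)
% H : basis functions evaluated at the sample points (10^3 x 125)
% W : weight matrix (10^3 x 10^3), x : polynomial coefficients (125 x 1)
\begin{align}
  J(x) &= (y - Hx)^T W \,(y - Hx) \\
  \frac{\partial J}{\partial x} = 0
    \;&\Rightarrow\; (H^T W H)\,x = H^T W y \\
  x &= \underbrace{(H^T W H)^{-1} H^T W}_{P}\; y = P\,y
\end{align}
```

Since $H$ and $W$ depend only on the sample locations and the chosen basis, not on the field values $y$, the matrix $P$ can be formed once per interpolating function and reused for every cell, leaving one matrix-vector product per fit.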

34 Adaptive Error
- Choose the target residual error using altitude and the SH error profile
- For each cell, evaluate ~400 interpolating functions and choose the one that meets the error goal and has the lowest memory footprint
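
The per-cell selection rule can be sketched as a filter followed by a minimum. The function name, the tuple layout, and the candidate values below are hypothetical; memory footprint is approximated here by the number of stored coefficients, which is an assumption.

```python
def choose_interpolant(candidates, target_error):
    """Pick the cheapest candidate that meets the residual target for one cell.

    candidates:   iterable of (name, n_coefficients, max_residual) tuples
    target_error: largest acceptable residual for this cell
    Returns the chosen tuple, or None if no candidate meets the target.
    """
    feasible = [c for c in candidates if c[2] <= target_error]
    if not feasible:
        return None
    # Fewest stored coefficients stands in for the lowest memory footprint.
    return min(feasible, key=lambda c: c[1])


# Hypothetical candidates for a single cell (values are illustrative only).
cell_candidates = [
    ("3x3x3", 27, 4.0e-7),
    ("4x4x4", 64, 6.0e-9),
    ("5x5x5", 125, 2.0e-10),
]
picked = choose_interpolant(cell_candidates, target_error=1.0e-8)
```

Running this selection cell by cell lets cells at higher altitude, where the target residual is easier to meet, fall back to smaller interpolants and so save memory.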


More information

The jello cube. Undeformed cube. Deformed cube

The jello cube. Undeformed cube. Deformed cube The Jello Cube Assignment 1, CSCI 520 Jernej Barbic, USC Undeformed cube The jello cube Deformed cube The jello cube is elastic, Can be bent, stretched, squeezed,, Without external forces, it eventually

More information

High Performance Computing on GPUs using NVIDIA CUDA

High Performance Computing on GPUs using NVIDIA CUDA High Performance Computing on GPUs using NVIDIA CUDA Slides include some material from GPGPU tutorial at SIGGRAPH2007: http://www.gpgpu.org/s2007 1 Outline Motivation Stream programming Simplified HW and

More information

Superdiffusion and Lévy Flights. A Particle Transport Monte Carlo Simulation Code

Superdiffusion and Lévy Flights. A Particle Transport Monte Carlo Simulation Code Superdiffusion and Lévy Flights A Particle Transport Monte Carlo Simulation Code Eduardo J. Nunes-Pereira Centro de Física Escola de Ciências Universidade do Minho Page 1 of 49 ANOMALOUS TRANSPORT Definitions

More information

Software and Performance Engineering for numerical codes on GPU clusters

Software and Performance Engineering for numerical codes on GPU clusters Software and Performance Engineering for numerical codes on GPU clusters H. Köstler International Workshop of GPU Solutions to Multiscale Problems in Science and Engineering Harbin, China 28.7.2010 2 3

More information

Sampling Using GPU Accelerated Sparse Hierarchical Models

Sampling Using GPU Accelerated Sparse Hierarchical Models Sampling Using GPU Accelerated Sparse Hierarchical Models Miroslav Stoyanov Oak Ridge National Laboratory supported by Exascale Computing Project (ECP) exascaleproject.org April 9, 28 Miroslav Stoyanov

More information

Multi-Mesh CFD. Chris Roy Chip Jackson (1 st year PhD student) Aerospace and Ocean Engineering Department Virginia Tech

Multi-Mesh CFD. Chris Roy Chip Jackson (1 st year PhD student) Aerospace and Ocean Engineering Department Virginia Tech Multi-Mesh CFD Chris Roy Chip Jackson (1 st year PhD student) Aerospace and Ocean Engineering Department Virginia Tech cjroy@vt.edu May 21, 2014 CCAS Program Review, Columbus, OH 1 Motivation Automated

More information

Shape of Things to Come: Next-Gen Physics Deep Dive

Shape of Things to Come: Next-Gen Physics Deep Dive Shape of Things to Come: Next-Gen Physics Deep Dive Jean Pierre Bordes NVIDIA Corporation Free PhysX on CUDA PhysX by NVIDIA since March 2008 PhysX on CUDA available: August 2008 GPU PhysX in Games Physical

More information

High performance Computing and O&G Challenges

High performance Computing and O&G Challenges High performance Computing and O&G Challenges 2 Seismic exploration challenges High Performance Computing and O&G challenges Worldwide Context Seismic,sub-surface imaging Computing Power needs Accelerating

More information

How to Optimize Geometric Multigrid Methods on GPUs

How to Optimize Geometric Multigrid Methods on GPUs How to Optimize Geometric Multigrid Methods on GPUs Markus Stürmer, Harald Köstler, Ulrich Rüde System Simulation Group University Erlangen March 31st 2011 at Copper Schedule motivation imaging in gradient

More information

Simulation in Computer Graphics. Particles. Matthias Teschner. Computer Science Department University of Freiburg

Simulation in Computer Graphics. Particles. Matthias Teschner. Computer Science Department University of Freiburg Simulation in Computer Graphics Particles Matthias Teschner Computer Science Department University of Freiburg Outline introduction particle motion finite differences system of first order ODEs second

More information

Flux Vector Splitting Methods for the Euler Equations on 3D Unstructured Meshes for CPU/GPU Clusters

Flux Vector Splitting Methods for the Euler Equations on 3D Unstructured Meshes for CPU/GPU Clusters Flux Vector Splitting Methods for the Euler Equations on 3D Unstructured Meshes for CPU/GPU Clusters Manfred Liebmann Technische Universität München Chair of Optimal Control Center for Mathematical Sciences,

More information

GPGPUs in HPC. VILLE TIMONEN Åbo Akademi University CSC

GPGPUs in HPC. VILLE TIMONEN Åbo Akademi University CSC GPGPUs in HPC VILLE TIMONEN Åbo Akademi University 2.11.2010 @ CSC Content Background How do GPUs pull off higher throughput Typical architecture Current situation & the future GPGPU languages A tale of

More information

GPU acceleration of 3D forward and backward projection using separable footprints for X-ray CT image reconstruction

GPU acceleration of 3D forward and backward projection using separable footprints for X-ray CT image reconstruction GPU acceleration of 3D forward and backward projection using separable footprints for X-ray CT image reconstruction Meng Wu and Jeffrey A. Fessler EECS Department University of Michigan Fully 3D Image

More information

Comprehensive Matlab GUI for Determining Barycentric Orbital Trajectories

Comprehensive Matlab GUI for Determining Barycentric Orbital Trajectories Comprehensive Matlab GUI for Determining Barycentric Orbital Trajectories Steve Katzman 1 California Polytechnic State University, San Luis Obispo, CA 93405 When a 3-body gravitational system is modeled

More information

GPU COMPUTING WITH MSC NASTRAN 2013

GPU COMPUTING WITH MSC NASTRAN 2013 SESSION TITLE WILL BE COMPLETED BY MSC SOFTWARE GPU COMPUTING WITH MSC NASTRAN 2013 Srinivas Kodiyalam, NVIDIA, Santa Clara, USA THEME Accelerated computing with GPUs SUMMARY Current trends in HPC (High

More information

Development of a Maxwell Equation Solver for Application to Two Fluid Plasma Models. C. Aberle, A. Hakim, and U. Shumlak

Development of a Maxwell Equation Solver for Application to Two Fluid Plasma Models. C. Aberle, A. Hakim, and U. Shumlak Development of a Maxwell Equation Solver for Application to Two Fluid Plasma Models C. Aberle, A. Hakim, and U. Shumlak Aerospace and Astronautics University of Washington, Seattle American Physical Society

More information

NVIDIA s Compute Unified Device Architecture (CUDA)

NVIDIA s Compute Unified Device Architecture (CUDA) NVIDIA s Compute Unified Device Architecture (CUDA) Mike Bailey mjb@cs.oregonstate.edu Reaching the Promised Land NVIDIA GPUs CUDA Knights Corner Speed Intel CPUs General Programmability 1 History of GPU

More information

NVIDIA s Compute Unified Device Architecture (CUDA)

NVIDIA s Compute Unified Device Architecture (CUDA) NVIDIA s Compute Unified Device Architecture (CUDA) Mike Bailey mjb@cs.oregonstate.edu Reaching the Promised Land NVIDIA GPUs CUDA Knights Corner Speed Intel CPUs General Programmability History of GPU

More information

Computation of the gravity gradient tensor due to topographic masses using tesseroids

Computation of the gravity gradient tensor due to topographic masses using tesseroids Computation of the gravity gradient tensor due to topographic masses using tesseroids Leonardo Uieda 1 Naomi Ussami 2 Carla F Braitenberg 3 1. Observatorio Nacional, Rio de Janeiro, Brazil 2. Universidade

More information

Long time integrations of a convective PDE on the sphere by RBF collocation

Long time integrations of a convective PDE on the sphere by RBF collocation Long time integrations of a convective PDE on the sphere by RBF collocation Bengt Fornberg and Natasha Flyer University of Colorado NCAR Department of Applied Mathematics Institute for Mathematics Applied

More information

On the Comparative Performance of Parallel Algorithms on Small GPU/CUDA Clusters

On the Comparative Performance of Parallel Algorithms on Small GPU/CUDA Clusters 1 On the Comparative Performance of Parallel Algorithms on Small GPU/CUDA Clusters N. P. Karunadasa & D. N. Ranasinghe University of Colombo School of Computing, Sri Lanka nishantha@opensource.lk, dnr@ucsc.cmb.ac.lk

More information

The Jello Cube Assignment 1, CSCI 520. Jernej Barbic, USC

The Jello Cube Assignment 1, CSCI 520. Jernej Barbic, USC The Jello Cube Assignment 1, CSCI 520 Jernej Barbic, USC 1 The jello cube Undeformed cube Deformed cube The jello cube is elastic, Can be bent, stretched, squeezed,, Without external forces, it eventually

More information

Block Lanczos-Montgomery method over large prime fields with GPU accelerated dense operations

Block Lanczos-Montgomery method over large prime fields with GPU accelerated dense operations Block Lanczos-Montgomery method over large prime fields with GPU accelerated dense operations Nikolai Zamarashkin and Dmitry Zheltkov INM RAS, Gubkina 8, Moscow, Russia {nikolai.zamarashkin,dmitry.zheltkov}@gmail.com

More information

The Fast Multipole Method (FMM)

The Fast Multipole Method (FMM) The Fast Multipole Method (FMM) Motivation for FMM Computational Physics Problems involving mutual interactions of N particles Gravitational or Electrostatic forces Collective (but weak) long-range forces

More information

GTC 2013: DEVELOPMENTS IN GPU-ACCELERATED SPARSE LINEAR ALGEBRA ALGORITHMS. Kyle Spagnoli. Research EM Photonics 3/20/2013

GTC 2013: DEVELOPMENTS IN GPU-ACCELERATED SPARSE LINEAR ALGEBRA ALGORITHMS. Kyle Spagnoli. Research EM Photonics 3/20/2013 GTC 2013: DEVELOPMENTS IN GPU-ACCELERATED SPARSE LINEAR ALGEBRA ALGORITHMS Kyle Spagnoli Research Engineer @ EM Photonics 3/20/2013 INTRODUCTION» Sparse systems» Iterative solvers» High level benchmarks»

More information

Performance Metrics of a Parallel Three Dimensional Two-Phase DSMC Method for Particle-Laden Flows

Performance Metrics of a Parallel Three Dimensional Two-Phase DSMC Method for Particle-Laden Flows Performance Metrics of a Parallel Three Dimensional Two-Phase DSMC Method for Particle-Laden Flows Benzi John* and M. Damodaran** Division of Thermal and Fluids Engineering, School of Mechanical and Aerospace

More information

HARNESSING IRREGULAR PARALLELISM: A CASE STUDY ON UNSTRUCTURED MESHES. Cliff Woolley, NVIDIA

HARNESSING IRREGULAR PARALLELISM: A CASE STUDY ON UNSTRUCTURED MESHES. Cliff Woolley, NVIDIA HARNESSING IRREGULAR PARALLELISM: A CASE STUDY ON UNSTRUCTURED MESHES Cliff Woolley, NVIDIA PREFACE This talk presents a case study of extracting parallelism in the UMT2013 benchmark for 3D unstructured-mesh

More information

NVIDIA. Interacting with Particle Simulation in Maya using CUDA & Maximus. Wil Braithwaite NVIDIA Applied Engineering Digital Film

NVIDIA. Interacting with Particle Simulation in Maya using CUDA & Maximus. Wil Braithwaite NVIDIA Applied Engineering Digital Film NVIDIA Interacting with Particle Simulation in Maya using CUDA & Maximus Wil Braithwaite NVIDIA Applied Engineering Digital Film Some particle milestones FX Rendering Physics 1982 - First CG particle FX

More information

MAGMA. Matrix Algebra on GPU and Multicore Architectures

MAGMA. Matrix Algebra on GPU and Multicore Architectures MAGMA Matrix Algebra on GPU and Multicore Architectures Innovative Computing Laboratory Electrical Engineering and Computer Science University of Tennessee Piotr Luszczek (presenter) web.eecs.utk.edu/~luszczek/conf/

More information

Accelerating the Implicit Integration of Stiff Chemical Systems with Emerging Multi-core Technologies

Accelerating the Implicit Integration of Stiff Chemical Systems with Emerging Multi-core Technologies Accelerating the Implicit Integration of Stiff Chemical Systems with Emerging Multi-core Technologies John C. Linford John Michalakes Manish Vachharajani Adrian Sandu IMAGe TOY 2009 Workshop 2 Virginia

More information

The Immersed Interface Method

The Immersed Interface Method The Immersed Interface Method Numerical Solutions of PDEs Involving Interfaces and Irregular Domains Zhiiin Li Kazufumi Ito North Carolina State University Raleigh, North Carolina Society for Industrial

More information

NVIDIA DGX SYSTEMS PURPOSE-BUILT FOR AI

NVIDIA DGX SYSTEMS PURPOSE-BUILT FOR AI NVIDIA DGX SYSTEMS PURPOSE-BUILT FOR AI Overview Unparalleled Value Product Portfolio Software Platform From Desk to Data Center to Cloud Summary AI researchers depend on computing performance to gain

More information

SPECIAL TECHNIQUES-II

SPECIAL TECHNIQUES-II SPECIAL TECHNIQUES-II Lecture 19: Electromagnetic Theory Professor D. K. Ghosh, Physics Department, I.I.T., Bombay Method of Images for a spherical conductor Example :A dipole near aconducting sphere The

More information

Study and implementation of computational methods for Differential Equations in heterogeneous systems. Asimina Vouronikoy - Eleni Zisiou

Study and implementation of computational methods for Differential Equations in heterogeneous systems. Asimina Vouronikoy - Eleni Zisiou Study and implementation of computational methods for Differential Equations in heterogeneous systems Asimina Vouronikoy - Eleni Zisiou Outline Introduction Review of related work Cyclic Reduction Algorithm

More information

Porting a parallel rotor wake simulation to GPGPU accelerators using OpenACC

Porting a parallel rotor wake simulation to GPGPU accelerators using OpenACC DLR.de Chart 1 Porting a parallel rotor wake simulation to GPGPU accelerators using OpenACC Melven Röhrig-Zöllner DLR, Simulations- und Softwaretechnik DLR.de Chart 2 Outline Hardware-Architecture (CPU+GPU)

More information

Benchmark 1.a Investigate and Understand Designated Lab Techniques The student will investigate and understand designated lab techniques.

Benchmark 1.a Investigate and Understand Designated Lab Techniques The student will investigate and understand designated lab techniques. I. Course Title Parallel Computing 2 II. Course Description Students study parallel programming and visualization in a variety of contexts with an emphasis on underlying and experimental technologies.

More information

General Purpose GPU Computing in Partial Wave Analysis

General Purpose GPU Computing in Partial Wave Analysis JLAB at 12 GeV - INT General Purpose GPU Computing in Partial Wave Analysis Hrayr Matevosyan - NTC, Indiana University November 18/2009 COmputationAL Challenges IN PWA Rapid Increase in Available Data

More information

Using CUDA to Accelerate Radar Image Processing

Using CUDA to Accelerate Radar Image Processing Using CUDA to Accelerate Radar Image Processing Aaron Rogan Richard Carande 9/23/2010 Approved for Public Release by the Air Force on 14 Sep 2010, Document Number 88 ABW-10-5006 Company Overview Neva Ridge

More information

High Performance Orbital Propagation Using a Generic Software Architecture

High Performance Orbital Propagation Using a Generic Software Architecture High Performance Orbital Propagation Using a Generic Software Architecture M. Möckel SERC Limited, Mount Stromlo Observatory, Cotter Road, Weston Creek, ACT 2611, Australia J. Bennett SERC Limited, Mount

More information

CONTINGENCY PLANNING AND MISSION PERFORMANCE IMPACTS: A NOVEL APPROACH TO LAUNCH ERROR SIMULATION, CHARACTERISATION AND CORRECTION

CONTINGENCY PLANNING AND MISSION PERFORMANCE IMPACTS: A NOVEL APPROACH TO LAUNCH ERROR SIMULATION, CHARACTERISATION AND CORRECTION Emmet FLETCHER 1 1 Analytical Graphics, Inc. Plaza de la Encina 5, local 5 Tres Cantos 2876 Madrid Spain efletcher@stk.com CONTINGENCY PLANNING AND MISSION PERFORMANCE IMPACTS: A NOVEL APPROACH TO LAUNCH

More information

From Theory to Application (Optimization and Optimal Control in Space Applications)

From Theory to Application (Optimization and Optimal Control in Space Applications) From Theory to Application (Optimization and Optimal Control in Space Applications) Christof Büskens Optimierung & Optimale Steuerung 02.02.2012 The paradox of mathematics If mathematics refer to reality,

More information

My 2 hours today: 1. Efficient arithmetic in finite fields minute break 3. Elliptic curves. My 2 hours tomorrow:

My 2 hours today: 1. Efficient arithmetic in finite fields minute break 3. Elliptic curves. My 2 hours tomorrow: My 2 hours today: 1. Efficient arithmetic in finite fields 2. 10-minute break 3. Elliptic curves My 2 hours tomorrow: 4. Efficient arithmetic on elliptic curves 5. 10-minute break 6. Choosing curves Efficient

More information

Krishnan Suresh Associate Professor Mechanical Engineering

Krishnan Suresh Associate Professor Mechanical Engineering Large Scale FEA on the GPU Krishnan Suresh Associate Professor Mechanical Engineering High-Performance Trick Computations (i.e., 3.4*1.22): essentially free Memory access determines speed of code Pick

More information

Accelerating Mean Shift Segmentation Algorithm on Hybrid CPU/GPU Platforms

Accelerating Mean Shift Segmentation Algorithm on Hybrid CPU/GPU Platforms Accelerating Mean Shift Segmentation Algorithm on Hybrid CPU/GPU Platforms Liang Men, Miaoqing Huang, John Gauch Department of Computer Science and Computer Engineering University of Arkansas {mliang,mqhuang,jgauch}@uark.edu

More information

RT 3D FDTD Simulation of LF and MF Room Acoustics

RT 3D FDTD Simulation of LF and MF Room Acoustics RT 3D FDTD Simulation of LF and MF Room Acoustics ANDREA EMANUELE GRECO Id. 749612 andreaemanuele.greco@mail.polimi.it ADVANCED COMPUTER ARCHITECTURES (A.A. 2010/11) Prof.Ing. Cristina Silvano Dr.Ing.

More information

A Parallel Access Method for Spatial Data Using GPU

A Parallel Access Method for Spatial Data Using GPU A Parallel Access Method for Spatial Data Using GPU Byoung-Woo Oh Department of Computer Engineering Kumoh National Institute of Technology Gumi, Korea bwoh@kumoh.ac.kr Abstract Spatial access methods

More information

CUDA Experiences: Over-Optimization and Future HPC

CUDA Experiences: Over-Optimization and Future HPC CUDA Experiences: Over-Optimization and Future HPC Carl Pearson 1, Simon Garcia De Gonzalo 2 Ph.D. candidates, Electrical and Computer Engineering 1 / Computer Science 2, University of Illinois Urbana-Champaign

More information

Splotch: High Performance Visualization using MPI, OpenMP and CUDA

Splotch: High Performance Visualization using MPI, OpenMP and CUDA Splotch: High Performance Visualization using MPI, OpenMP and CUDA Klaus Dolag (Munich University Observatory) Martin Reinecke (MPA, Garching) Claudio Gheller (CSCS, Switzerland), Marzia Rivi (CINECA,

More information