Digital Earth Routine on Tegra K1
- Lewis Horton
1 Digital Earth Routine on Tegra K1: Aerosol Optical Depth Retrieval, Performance Comparison and Energy Efficiency
2 Energy matters!
A topic that affects us all
Reasons: ecological, economical, practical, curiosity
My background:
- Many years of research in High Performance Computing at Fraunhofer SCAI, Germany
- Compiler development
- Remote sensing together with the Academy of Science, China
3 AOD Retrieval Method
Research cooperation: Fraunhofer SCAI (Germany) and the Academy of Science (China)
- Aerosol Optical Depth (AOD) is a significant optical property of aerosols
- AOD is applied to the atmospheric correction of remotely sensed surface features
- It is used for monitoring volcanic eruptions, forest fires and air quality in general, as well as for climate predictions from satellites
- Measurements at different wavelengths for each pixel on Earth (with a spatial resolution of e.g. 1 km) are stacked into a data cube and form the input for remote sensing algorithms
Image sources: sites.ieee.org, mx.nthu.edu.tw
4 AOD Retrieval Method: Input Data Collection
- Daily observations of the Moderate Resolution Imaging Spectroradiometer (MODIS) on the NASA satellites Terra and Aqua (i = 1, 2)
- Three different wavelengths from the visible spectrum (470, 550, 660 nm) are considered (j = 1, 2, 3)
- The satellites were placed into a near-polar, sun-synchronous orbit at an altitude of 705 km
- Both complement each other as they observe the same regions of the Earth at different times of the day
Image source: sites.ieee.org
5 AOD Retrieval Method: Background
Consider the atmosphere as a turbid medium following the Lambert-Beer law.
The total optical depth (thickness) consists of three parts: $\tau = \tau_R + \tau_G + \tau_A$
- $\tau_R$, Rayleigh scattering: $\tau_R = 0.\ldots\,\lambda_j^{-4.085}$ ($\lambda_j$: wavelength of channel $j$)
- $\tau_G$, gas absorption: quasi constant (ozone + water + oxygen + others » tables)
  [Table: channel, wavelength (nm), transmissivity, gas optical thickness; values not preserved]
- $\tau_A$, aerosols (Mie scattering): absorption or (mainly) scattering through aerosols (AOD)
Ångström's turbidity formula: $\tau_A(\lambda_j) = \beta_i\,\lambda_j^{-\alpha}$ ($\beta_i$: AOD for $\lambda_j = 1\,\mu\mathrm{m}$)
Example: cloud droplets. Particles are relatively large » small $\alpha$ » scattering nearly constant over $\lambda_j$
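Ångström's turbidity formula from this slide can be evaluated directly. The following is a minimal sketch, not the authors' code: the function names and the values of beta and alpha are made up for illustration, and wavelengths are assumed to be given in micrometres so that beta is the AOD at 1 μm.

```python
def angstrom_aod(beta, alpha, wavelength_um):
    """Aerosol optical depth tau_A = beta * wavelength^(-alpha).

    wavelength_um is the wavelength in micrometres, so beta is the AOD at 1 um.
    """
    if wavelength_um <= 0:
        raise ValueError("wavelength must be positive")
    return beta * wavelength_um ** (-alpha)


# The three MODIS wavelengths named in the talk (470, 550, 660 nm) in micrometres.
MODIS_WAVELENGTHS_UM = (0.470, 0.550, 0.660)


def aod_spectrum(beta, alpha):
    """AOD at the three wavelengths for one pixel's (beta, alpha) pair."""
    return [angstrom_aod(beta, alpha, w) for w in MODIS_WAVELENGTHS_UM]


if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from the talk.
    for alpha in (0.0, 1.3):
        print(alpha, [round(t, 4) for t in aod_spectrum(0.2, alpha)])
```

With alpha = 0 the optical depth is the same at all three wavelengths, which matches the cloud-droplet example (large particles, small α); a larger alpha makes the optical depth fall off with increasing wavelength.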
6 AOD Retrieval Method: SRAP-MODIS Algorithm (Synergetic Retrieval of Aerosol Properties)
- Difference between top-of-atmosphere (TOA) and surface reflectance (atmospheric distortion)
  [Figure: atmospheric distortion, labelled with $\tau_R$ and $\tau_A$]
- Ratio of two observations is constant for all wavelengths
- Estimate the parameters $\alpha, \beta_1, \beta_2$ by a Quasi-Newton method (approximation of the Jacobi matrix)
- Influence of the atmosphere decreases rapidly with increasing wavelength » approximation by TOA values of large wavelengths with minor influences
- Derive AOD for different wavelengths with Ångström's turbidity formula
7 AOD Retrieval Method
IMPORTANT for the parallelization of the retrieval method:
- The AOD calculation is independent for each pixel and can be performed solely based on the respective wavelength vector in the data cube
- Quasi-Newton method for each pixel to determine $\alpha, \beta_1, \beta_2$
- The rate of convergence may vary considerably between pixels, e.g. between OL pixels (over land), OS pixels (over sea) and masked pixels (more about that later)
- Additionally, the control flow may follow different paths in the AOD kernel » diverging branches
8 AOD Retrieval on multi-core processors
Shared memory parallelization with OpenMP, static OpenMP scheduling
Problem: imbalance on the cores
- Reason on the one hand: Quasi-Newton (convergence) » load imbalance
- Reason on the other hand: varying pixel data may lead to different branches in the AOD kernel (e.g. cloud masking) » branch divergence
[Plot: power intake (watt) over time (second), static scheduling; further label: Threads]
9 AOD Retrieval on multi-core processors
Shared memory parallelization with OpenMP
Solution: adapted scheduling of the pixels » AOD threads
- Similar pixels » similar convergence
- Similar pixels are near each other
- Instead of blocking the iterations/pixels statically in large chunks: small blocks, e.g. of size 1, OR dynamic scheduling
- As each kernel run is relatively work-intensive, the overhead introduced thereby is insignificant
- Cloud masking: dependencies
[Plot: power intake (watt) over time (second), static vs. dynamic scheduling]
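The effect of the three scheduling variants named on this slide can be illustrated with a small simulation. This is only a sketch under stated assumptions: the per-pixel costs are hypothetical iteration counts, scheduling overhead is ignored, and the three functions merely mimic what OpenMP's `schedule(static)`, `schedule(static, 1)` and `schedule(dynamic, 1)` clauses do; no OpenMP runtime is involved.

```python
import heapq


def static_blocks(costs, n_threads):
    """One contiguous chunk of (almost) equal length per thread, like schedule(static)."""
    n = len(costs)
    return [sum(costs[t * n // n_threads:(t + 1) * n // n_threads])
            for t in range(n_threads)]


def static_cyclic(costs, n_threads):
    """Round-robin assignment with chunk size 1, like schedule(static, 1)."""
    return [sum(costs[t::n_threads]) for t in range(n_threads)]


def dynamic(costs, n_threads):
    """Each pixel goes to the thread that becomes idle first, like schedule(dynamic, 1)."""
    heap = [(0, t) for t in range(n_threads)]  # (accumulated load, thread id)
    heapq.heapify(heap)
    for cost in costs:
        load, t = heapq.heappop(heap)
        heapq.heappush(heap, (load + cost, t))
    loads = [0] * n_threads
    for load, t in heap:
        loads[t] = load
    return loads


# Hypothetical per-pixel costs (e.g. Quasi-Newton iterations). Similar pixels are
# neighbours: cheap masked pixels, then slowly converging ones, then moderate ones.
COSTS = [1] * 300 + [20] * 100 + [5] * 400

if __name__ == "__main__":
    for name, schedule in (("static blocks", static_blocks),
                           ("static, chunk 1", static_cyclic),
                           ("dynamic, chunk 1", dynamic)):
        loads = schedule(COSTS, 4)
        print(f"{name:17s} loads={loads} makespan={max(loads)}")
```

Because expensive pixels cluster spatially, large static chunks give one thread most of the expensive work, while chunk size 1 or dynamic assignment spreads each cluster over all threads.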
10 AOD Retrieval on GPUs
Similar to multi-core. Solution again: adapted scheduling of the pixels » AOD threads
- Similar pixels » similar convergence; similar pixels are near each other
- Thread blocks: only little branch divergence by construction
- Not too many pixels per thread block: nearby » similar pixels » similar convergence
- Thread-block tuning: *NOT* "more is better"; restrictions on registers per block
- The programmer can (and has to) optimize the parameters
- GPUs are very well suited for the retrieval kernel, but not necessarily for other parts of the workflow
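Why grouping nearby (similar) pixels and keeping groups small helps can be sketched with a toy lock-step model. The assumptions are mine, not the talk's: a group of threads is taken to stay busy until its slowest pixel has converged, the iteration counts are hypothetical, and real GPUs are more subtle (divergence is resolved per warp, and a thread block holds its resources until all of its warps finish).

```python
def lockstep_steps(iterations, group_size):
    """Total steps when every group runs until its slowest pixel has converged."""
    return sum(max(iterations[i:i + group_size])
               for i in range(0, len(iterations), group_size))


def utilisation(iterations, group_size):
    """Share of thread slots doing useful work (len(iterations) must be a multiple of group_size)."""
    slots = lockstep_steps(iterations, group_size) * group_size
    return sum(iterations) / slots


# Hypothetical iteration counts: 64 fast pixels followed by 64 slow pixels.
CLUSTERED = [2] * 64 + [10] * 64
# Same pixels, but fast and slow ones alternate, so every group mixes both kinds.
INTERLEAVED = [2, 10] * 64

if __name__ == "__main__":
    print("clustered,   groups of 32: ", utilisation(CLUSTERED, 32))
    print("interleaved, groups of 32: ", utilisation(INTERLEAVED, 32))
    print("clustered,   groups of 128:", utilisation(CLUSTERED, 128))
```

In this model, groups of 32 neighbouring pixels waste no slots, whereas mixing fast and slow pixels within a group, or making one group of 128 that spans both clusters, leaves fast threads idle while the slow ones finish.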
11 AOD Retrieval on GPUs
[Plot: speedup with increasing input size; series: data transfer, GPU overall, MC overall, GPU calc, MC calc]
12 AOD Retrieval on GPUs: Comparison of CPU and GPU architecture
- CPU vs. GPU: problem-dependent (part)
- Low latency vs. high throughput
- Lots of automation vs. (still) lots of manual tuning
- Optimization of thread blocks, register assignment, occupancy (e.g. registers vs. threads), memory accesses (shared memory bank conflicts, global memory coalescing), ...
[Diagram: CPU (control unit, ALUs, cache levels, DRAM) vs. GPU (L2, DRAM)]
13 AOD Workflow on HYBRID systems
More than the retrieval is needed.
[Workflow diagram; stage labels: Multi-Core; Multi-Core or GPU; Multi-Core or GPU or HYBRID]
14 Why EMBEDDED?
- EMBEDDED architectures are interesting in various fields of research
- Energy plays a major role today
- Satellite on-board observations
- Automotive sector, e.g. high performance embedded systems for in-vehicle applications
- "The convergence of HPC and embedded systems in our heterogeneous computing future" (Kaeli et al. 2011)
- The Exascale Challenge (Moore's Law) and future HPC systems
- Relatively cheap combination of multi-cores and GPUs today
15 AOD on EMBEDDED architectures
NVIDIA Jetson TK1: energy-efficient SoC for high performance under strong energy constraints
16 AOD Retrieval on MIXED EMBEDDED Jetson TK1
17 AOD Retrieval Method: Jetson TK1 Runtime
[Chart: runtime; 1xSoC: CPU 1HPCore, CPU 4HPCores, GPU 1HPCore; 4xSoC: GPU 1HPCore; XeonWS: CPU 1T, CPU 4T, GPU]
18 AOD Retrieval Method: Jetson TK1 Runtime (Scaling)
[Chart: runtime scaling over 1xSoC, 2xSoC, 3xSoC and 4xSoC]
19 AOD Retrieval Method: Jetson TK1 Energy
[Chart: energy; 1xSoC: CPU 1HPCore, CPU 4HPCores, GPU 1HPCore; 4xSoC: GPU 1HPCore; XeonWS: CPU 1T, CPU 4T, GPU]
20 Publications
- 2015: Multi-Core Processors and Graphics Processing Unit Accelerators for Parallel Retrieval of Aerosol Optical Depth from Satellite Data: Implementation, Performance and Energy Efficiency. J. Liu, D. Feld, Y. Xue, J. Garcke and T. Soddemann. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
- 2016: Design of a Hybrid Parallel Workflow for Efficient Aerosol Optical Depth Retrieval from MODIS Satellite Data for Computers with Multi-core Processors and GPUs. J. Liu, D. Feld, Y. Xue, J. Garcke, T. Soddemann and P. Pan. International Journal of Digital Earth
- [t.b.a.]: Energy-Efficiency and Performance Comparison of Aerosol Optical Depth (AOD) retrieval on distributed Embedded SoC architectures with Nvidia GPUs. D. Feld, E. Schricker, J. Liu, Y. Xue, J. Garcke and T. Soddemann. SCAI Book (Springer)
[Charts: systems compared are Workstation1 (Core i7, 8T (HT), GTX460), 1xSoC (ARM Cortex A, 4T, Kepler "192") and a Xeon workstation (8T (HT), GTX680); configurations: CPU, MC, GPU, HYBRID, DYNAMIC]
21 Energy matters!
Workflow as a tool
- Restrictions: power intake, energy consumption, runtime restriction (real-time)
- Minimize runtime (post-processing)
- e.g. on-board missions
- Goals influence each other
- Extension of the methods, e.g. pixel sorting to reduce divergence (respecting dependencies)
- Further methods, other input data, other goals
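Power intake, energy consumption and runtime are linked: energy is power integrated over the runtime. A minimal sketch of that relation follows, assuming power is sampled at known points in time; the sample values are invented for illustration and are not measurements from the talk.

```python
def energy_joules(times_s, power_w):
    """Trapezoidal integral of power samples (watt) over time (second), in joule."""
    if len(times_s) != len(power_w) or len(times_s) < 2:
        raise ValueError("need at least two samples and equally long lists")
    return sum(0.5 * (power_w[i] + power_w[i + 1]) * (times_s[i + 1] - times_s[i])
               for i in range(len(times_s) - 1))


if __name__ == "__main__":
    # Invented sample traces: a slow low-power run and a fast high-power run.
    slow_low_power = energy_joules([0, 100], [10, 10])
    fast_high_power = energy_joules([0, 10, 20], [100, 100, 100])
    print("slow, low power: ", slow_low_power, "J")
    print("fast, high power:", fast_high_power, "J")
```

In this made-up case the faster run finishes five times sooner but consumes twice the energy, which is the sense in which the goals influence each other.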
22 Thanks for your attention! Questions?
Image: NASA Earth Observatory
More informationComputer Vision on Tegra K1. Chen Sagiv SagivTech Ltd.
Computer Vision on Tegra K1 Chen Sagiv SagivTech Ltd. Established in 2009 and headquartered in Israel Core domain expertise: GPU Computing and Computer Vision What we do: - Technology - Solutions - Projects
More informationAdvances of parallel computing. Kirill Bogachev May 2016
Advances of parallel computing Kirill Bogachev May 2016 Demands in Simulations Field development relies more and more on static and dynamic modeling of the reservoirs that has come a long way from being
More informationTR An Overview of NVIDIA Tegra K1 Architecture. Ang Li, Radu Serban, Dan Negrut
TR-2014-17 An Overview of NVIDIA Tegra K1 Architecture Ang Li, Radu Serban, Dan Negrut November 20, 2014 Abstract This paperwork gives an overview of NVIDIA s Jetson TK1 Development Kit and its Tegra K1
More informationMODULE 3. FACTORS AFFECTING 3D LASER SCANNING
MODULE 3. FACTORS AFFECTING 3D LASER SCANNING Learning Outcomes: This module discusses factors affecting 3D laser scanner performance. Students should be able to explain the impact of various factors on
More informationSteve Scott, Tesla CTO SC 11 November 15, 2011
Steve Scott, Tesla CTO SC 11 November 15, 2011 What goal do these products have in common? Performance / W Exaflop Expectations First Exaflop Computer K Computer ~10 MW CM5 ~200 KW Not constant size, cost
More informationParallel Computing. November 20, W.Homberg
Mitglied der Helmholtz-Gemeinschaft Parallel Computing November 20, 2017 W.Homberg Why go parallel? Problem too large for single node Job requires more memory Shorter time to solution essential Better
More informationGeneral Purpose GPU Computing in Partial Wave Analysis
JLAB at 12 GeV - INT General Purpose GPU Computing in Partial Wave Analysis Hrayr Matevosyan - NTC, Indiana University November 18/2009 COmputationAL Challenges IN PWA Rapid Increase in Available Data
More informationTUNING CUDA APPLICATIONS FOR MAXWELL
TUNING CUDA APPLICATIONS FOR MAXWELL DA-07173-001_v6.5 August 2014 Application Note TABLE OF CONTENTS Chapter 1. Maxwell Tuning Guide... 1 1.1. NVIDIA Maxwell Compute Architecture... 1 1.2. CUDA Best Practices...2
More informationNVIDIA GTX200: TeraFLOPS Visual Computing. August 26, 2008 John Tynefield
NVIDIA GTX200: TeraFLOPS Visual Computing August 26, 2008 John Tynefield 2 Outline Execution Model Architecture Demo 3 Execution Model 4 Software Architecture Applications DX10 OpenGL OpenCL CUDA C Host
More informationHardware Acceleration of Feature Detection and Description Algorithms on Low Power Embedded Platforms
Hardware Acceleration of Feature Detection and Description Algorithms on LowPower Embedded Platforms Onur Ulusel, Christopher Picardo, Christopher Harris, Sherief Reda, R. Iris Bahar, School of Engineering,
More information