A Classifica*on of Scien*fic Visualiza*on Algorithms for Massive Threading Kenneth Moreland Berk Geveci Kwan- Liu Ma Robert Maynard
|
|
- Marybeth Booker
- 5 years ago
- Views:
Transcription
1 A Classifica*on of Scien*fic Visualiza*on Algorithms for Massive Threading Kenneth Moreland Berk Geveci Kwan- Liu Ma Robert Maynard Sandia Na*onal Laboratories Kitware, Inc. University of California at Davis Kitware, Inc. November 17, 2013 Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy s National Nuclear Security Administration under contract DE-AC04-94AL SAND NO P
2 Cielo AMD x86 Full x86 Core + Associated Cache 8 cores per die MPI- Only feasible 1mm 1 x86 core 1 Kepler core Trinity (op*on 1) NVIDIA GPU 2,880 cores collected in 15 SMX Shared PC, Cache, Mem Fetches Reduced control logic MPI- Only not feasible
3 Extreme Scale is Threads, Threads, Threads! Jaguar XT5 Titan XK7 Exascale* Cores 224, ,008 and 18,688 gpu To succeed at extreme scale, you need to consider the finest possible level of concurrency Expect each thread to process exactly one element Disallow communica*on among threads 1 billion Concurrency 224,256 way million way billion way Memory 300 Terabytes 700 Terabytes 128 Petabytes *Source: Scien*fic Discovery at the Exascale, Ahern, Shoshani, Ma, et al.
4 Data Parallel Visualiza*on Duplicate processes run independently on different par**ons of data.
5 Data Parallel Visualiza*on Duplicate processes run independently on different par**ons of data.
6 Data Parallel Visualiza*on Duplicate processes run independently on different par**ons of data.
7 Data Parallel Visualiza*on Duplicate processes run independently on different par**ons of data.
8 Data Parallel Visualiza*on Duplicate processes run independently on different par**ons of data.
9 Key Principles [Law et al., 1999] Data Separability Mappable Input Filter Result Invariant Filter Filter Filter
10 Mappable Input Can Break Down Coarse Parallelism Result reasonably divided among par**ons. Massive Parallelism Most par**ons are empty.
11 Result Invariant Can Break Down
12 Result Invariant Can Break Down Repeated Ver*ces
13 Result Invariant Can Break Down
14 New Key Principles Coarse Parallel Massive Parallel Data Separability Data Separability Mappable Input Discoverable Input Mapping Result Invariant Collec*ve Work
15 Characteriza*on and Classifica*on Key Principles Parameters Data Separability Separable Element Discoverable Input Mapping Point Mapping Cell Mapping Field Mapping Collec*ve Work Collec*ve Work
16 Basic Mapping Input Output
17 Basic Mapping f( ) x f(x) Input Output
18 Basic Mapping Separable Element Any Point Mapping Iden*ty Cell Mapping Iden*ty Field Mapping 1 to 1 Collec@ve Work None
19 Basic Mapping functor()
20 Map by Cell Input Output
21 Map by Cell f(,,, ) x 4 x 3 f(x 1, x 2, x 3, x 4 ) x 1 x 2 Input Output
22 Map by Cell Separable Element Cell Point Mapping Iden*ty Cell Mapping Iden*ty Field Mapping Points on cell to cell Work None
23 Map by Cell functor()
24 Reconnect Cell Input Output
25 Reconnect Cell Input Output
26 Reconnect Cell Input Output
27 Reconnect Cell Separable Element Cell Point Mapping 1 to 0 or 1 Cell Mapping 1 to 0 or more Field Mapping Iden*ty Collec@ve Work None (or find unused)
28 Reconnect Cell Keep these Remove these Output
29 Build Independent Topology Input Output
30 Build Independent Topology Input Output
31 Build Independent Topology Separable Element Any Point Mapping 1 element to many points Cell Mapping 1 element to many cells (constant number) Field Mapping Iden*ty Collec@ve Work None
32 Build Connected Topology Input Output
33 Build Connected Topology Input Output
34 Build Connected Topology Input Output
35 Build ConnectedTopology Separable Element Cell Point Mapping 1 cell to 0 or more points Cell Mapping 1 to 0 or more Field Mapping Interpolated points Collec@ve Work Resolve duplicate points
36 Build Connected Topology Merge points Output
37 Capture Cell Adjacencies Input Output
38 Capture Cell Adjacencies f(,,, ) x 4 x 3 f(x 1, x 2, x 3, x 4 ) x 1 x 2 Input Output
39 Capture Cell Adjacencies Separable Element Point, edge, or face Point Mapping Iden*ty Cell Mapping Iden*ty Field Mapping Interpolated incident fields Work Find incidence rela*onships
40 Capture Cell Adjacencies x 4 x 3 x 1 x 2 Incidence rela*onships may not be explicitly captured by topology structure. Input
41 Globally Reduce f/(x 1,x 2,x 3, ) Input
42 Globally Reduce f/(x 1,x 2,x 3, ) Input
43 Globally Reduce f/(x 1,x 2,x 3, ) Input
44 Globally Reduce Separable Element Any Point Mapping Cell Mapping Field Mapping All to 1 Collec@ve Work Algorithms None in output None in output Global Reduc*on Histogram Integrate Outline (find bounds) Sta*s*cs
45 Query Data x f(x) Search Structure Output
46 Query Data Separable Element Point, key, or query Point Mapping Iden*ty Cell Mapping Iden*ty Field Mapping 1 query to 1 output Collec@ve Work Building query structure
47 Summary
48 Conclusion Accelerators and other emerging architectures use massive threading that takes parallelism to a whole different scale Visualiza*on algorithms can be characterized by: How data is separated How input elements are mapped to output elements What collec*ve work is performed Algorithms that share all three characteriza*ons can be implemented with similar mechanisms We can exploit this observa*on to implement new algorithms
Dax: A Massively Threaded Visualiza5on and Analysis Toolkit for Extreme Scale
Dax: A Massively Threaded Visualiza5on and Analysis Toolkit for Extreme Scale GPU Technology Conference March 26, 2014 Kenneth Moreland Sandia Na5onal Laboratories Robert Maynard Kitware, Inc. Sandia National
More informationOh, Exascale! The effect of emerging architectures on scien1fic discovery. Kenneth Moreland, Sandia Na1onal Laboratories
Photos placed in horizontal posi1on with even amount of white space between photos and header Oh, $#*@! Exascale! The effect of emerging architectures on scien1fic discovery Ultrascale Visualiza1on Workshop,
More informationVisual Analysis of Lagrangian Particle Data from Combustion Simulations
Visual Analysis of Lagrangian Particle Data from Combustion Simulations Hongfeng Yu Sandia National Laboratories, CA Ultrascale Visualization Workshop, SC11 Nov 13 2011, Seattle, WA Joint work with Jishang
More informationVTK-m: Uniting GPU Acceleration Successes. Robert Maynard Kitware Inc.
VTK-m: Uniting GPU Acceleration Successes Robert Maynard Kitware Inc. VTK-m Project Supercomputer Hardware Advances Everyday More and more parallelism High-Level Parallelism The Free Lunch Is Over (Herb
More informationResponsive Large Data Analysis and Visualization with the ParaView Ecosystem. Patrick O Leary, Kitware Inc
Responsive Large Data Analysis and Visualization with the ParaView Ecosystem Patrick O Leary, Kitware Inc Hybrid Computing Attribute Titan Summit - 2018 Compute Nodes 18,688 ~3,400 Processor (1) 16-core
More informationIntegrating Analysis and Computation with Trios Services
October 31, 2012 Integrating Analysis and Computation with Trios Services Approved for Public Release: SAND2012-9323P Ron A. Oldfield Scalable System Software Sandia National Laboratories Albuquerque,
More informationEureka! Task Teams! Kyle Wheeler SC 12 Chapel Lightning Talk SAND: P
GO 08012011 Eureka! Task Teams! Kyle Wheeler SC 12 Chapel Lightning Talk Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary
More informationLarge Scale Visualization on the Cray XT3 Using ParaView
Large Scale Visualization on the Cray XT3 Using ParaView Cray User s Group 2008 May 8, 2008 Kenneth Moreland David Rogers John Greenfield Sandia National Laboratories Alexander Neundorf Technical University
More informationIt s a Multicore World. John Urbanic Pittsburgh Supercomputing Center
It s a Multicore World John Urbanic Pittsburgh Supercomputing Center Waiting for Moore s Law to save your serial code start getting bleak in 2004 Source: published SPECInt data Moore s Law is not at all
More informationRobert Maynard Kitware, Inc. Kenneth Moreland Sandia National Laboratories
Robert Maynard Kitware, Inc. Kenneth Moreland Sandia National Laboratories Data Analysis at Extreme Data Analysis and Visualization library Multi- and Many-core processor support Allows users to create
More informationPerspectives form US Department of Energy work on parallel programming models for performance portability
Perspectives form US Department of Energy work on parallel programming models for performance portability Jeremiah J. Wilke Sandia National Labs Livermore, CA IMPACT workshop at HiPEAC Sandia National
More informationOutline. In Situ Data Triage and Visualiza8on
In Situ Data Triage and Visualiza8on Kwan- Liu Ma University of California at Davis Outline In situ data triage and visualiza8on: Issues and strategies Case study: An earthquake simula8on Case study: A
More informationImplementing Many-Body Potentials for Molecular Dynamics Simulations
Official Use Only Implementing Many-Body Potentials for Molecular Dynamics Simulations Using large scale clusters for higher accuracy simulations. Christian Trott, Aidan Thompson Unclassified, Unlimited
More informationPreparing GPU-Accelerated Applications for the Summit Supercomputer
Preparing GPU-Accelerated Applications for the Summit Supercomputer Fernanda Foertter HPC User Assistance Group Training Lead foertterfs@ornl.gov This research used resources of the Oak Ridge Leadership
More informationDeveloping Integrated Data Services for Cray Systems with a Gemini Interconnect
Developing Integrated Data Services for Cray Systems with a Gemini Interconnect Ron A. Oldfield Scalable System So4ware Sandia Na9onal Laboratories Albuquerque, NM, USA raoldfi@sandia.gov Cray User Group
More informationCUDA. Matthew Joyner, Jeremy Williams
CUDA Matthew Joyner, Jeremy Williams Agenda What is CUDA? CUDA GPU Architecture CPU/GPU Communication Coding in CUDA Use cases of CUDA Comparison to OpenCL What is CUDA? What is CUDA? CUDA is a parallel
More informationExtreme-scale Graph Analysis on Blue Waters
Extreme-scale Graph Analysis on Blue Waters 2016 Blue Waters Symposium George M. Slota 1,2, Siva Rajamanickam 1, Kamesh Madduri 2, Karen Devine 1 1 Sandia National Laboratories a 2 The Pennsylvania State
More informationTitan - Early Experience with the Titan System at Oak Ridge National Laboratory
Office of Science Titan - Early Experience with the Titan System at Oak Ridge National Laboratory Buddy Bland Project Director Oak Ridge Leadership Computing Facility November 13, 2012 ORNL s Titan Hybrid
More informationPULP: Fast and Simple Complex Network Partitioning
PULP: Fast and Simple Complex Network Partitioning George Slota #,* Kamesh Madduri # Siva Rajamanickam * # The Pennsylvania State University *Sandia National Laboratories Dagstuhl Seminar 14461 November
More informationUsing a Robust Metadata Management System to Accelerate Scientific Discovery at Extreme Scales
Using a Robust Metadata Management System to Accelerate Scientific Discovery at Extreme Scales Margaret Lawson, Jay Lofstead Sandia National Laboratories is a multimission laboratory managed and operated
More informationWhat is DARMA? DARMA is a C++ abstraction layer for asynchronous many-task (AMT) runtimes.
DARMA Janine C. Bennett, Jonathan Lifflander, David S. Hollman, Jeremiah Wilke, Hemanth Kolla, Aram Markosyan, Nicole Slattengren, Robert L. Clay (PM) PSAAP-WEST February 22, 2017 Sandia National Laboratories
More informationExtreme-scale Graph Analysis on Blue Waters
Extreme-scale Graph Analysis on Blue Waters 2016 Blue Waters Symposium George M. Slota 1,2, Siva Rajamanickam 1, Kamesh Madduri 2, Karen Devine 1 1 Sandia National Laboratories a 2 The Pennsylvania State
More informationSimulation-time data analysis and I/O acceleration at extreme scale with GLEAN
Simulation-time data analysis and I/O acceleration at extreme scale with GLEAN Venkatram Vishwanath, Mark Hereld and Michael E. Papka Argonne Na
More informationLarge Scale Visualization on the Cray XT3 Using ParaView
Large Scale Visualization on the Cray XT3 Using ParaView Kenneth Moreland, Sandia National Laboratories David Rogers, Sandia National Laboratories John Greenfield, Sandia National Laboratories Berk Geveci,
More informationThe Uintah Framework: A Unified Heterogeneous Task Scheduling and Runtime System
The Uintah Framework: A Unified Heterogeneous Task Scheduling and Runtime System Alan Humphrey, Qingyu Meng, Martin Berzins Scientific Computing and Imaging Institute & University of Utah I. Uintah Overview
More informationSimple Parallel Biconnectivity Algorithms for Multicore Platforms
Simple Parallel Biconnectivity Algorithms for Multicore Platforms George M. Slota Kamesh Madduri The Pennsylvania State University HiPC 2014 December 17-20, 2014 Code, presentation available at graphanalysis.info
More informationHarnessing GPU speed to accelerate LAMMPS particle simulations
Harnessing GPU speed to accelerate LAMMPS particle simulations Paul S. Crozier, W. Michael Brown, Peng Wang pscrozi@sandia.gov, wmbrown@sandia.gov, penwang@nvidia.com SC09, Portland, Oregon November 18,
More informationImproving Uintah s Scalability Through the Use of Portable
Improving Uintah s Scalability Through the Use of Portable Kokkos-Based Data Parallel Tasks John Holmen1, Alan Humphrey1, Daniel Sunderland2, Martin Berzins1 University of Utah1 Sandia National Laboratories2
More informationIn situ visualization is the coupling of visualization
Visualization Viewpoints Editor: Theresa-Marie Rhyne The Tensions of In Situ Visualization Kenneth Moreland Sandia National Laboratories In situ is the coupling of software with a simulation or other data
More informationSecurity Metrics. Mark Torgerson Sandia National Laboratories 6/20/2007. Entry I-108
Security Metrics Mark Torgerson Sandia National Laboratories 6/20/2007 Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of
More informationHobbes: Composi,on and Virtualiza,on as the Founda,ons of an Extreme- Scale OS/R
Hobbes: Composi,on and Virtualiza,on as the Founda,ons of an Extreme- Scale OS/R Ron Brightwell, Ron Oldfield Sandia Na,onal Laboratories Arthur B. Maccabe, David E. Bernholdt Oak Ridge Na,onal Laboratory
More informationECP Alpine: Algorithms and Infrastructure for In Situ Visualization and Analysis
ECP Alpine: Algorithms and Infrastructure for In Situ Visualization and Analysis Presented By: Matt Larsen LLNL-PRES-731545 This work was performed under the auspices of the U.S. Department of Energy by
More informationMRPB: Memory Request Priori1za1on for Massively Parallel Processors
MRPB: Memory Request Priori1za1on for Massively Parallel Processors Wenhao Jia, Princeton University Kelly A. Shaw, University of Richmond Margaret Martonosi, Princeton University Benefits of GPU Caches
More informationEarly Experiences with Trinity - The First Advanced Technology Platform for the ASC Program
Early Experiences with Trinity - The First Advanced Technology Platform for the ASC Program C.T. Vaughan, D.C. Dinge, P.T. Lin, S.D. Hammond, J. Cook, C. R. Trott, A.M. Agelastos, D.M. Pase, R.E. Benner,
More informationWar Stories : Graph Algorithms in GPUs
SAND2014-18323PE War Stories : Graph Algorithms in GPUs Siva Rajamanickam(SNL) George Slota, Kamesh Madduri (PSU) FASTMath Meeting Exceptional service in the national interest is a multi-program laboratory
More informationQualita've DNS Measurement Perspec'ves
Qualita've DNS Measurement Perspec'ves Casey Deccio Sandia Na/onal Laboratories ISC/CAIDA Data Collabora/on Workshop Oct 22, 2012 Sandia National Laboratories is a multi-program laboratory managed and
More informationECP VTK-m: Updating HPC Visualization Software for Exascale-Era Processors
ECP VTK-m: Updating HPC Visualization Software for Exascale-Era Processors Kenneth Moreland Sandia National Laboratories kmorel@sandia.gov David Pugmire Oak Ridge National Laboratory pugmire@ornl.gov Christopher
More informationGPU ACCELERATED SELF-JOIN FOR THE DISTANCE SIMILARITY METRIC
GPU ACCELERATED SELF-JOIN FOR THE DISTANCE SIMILARITY METRIC MIKE GOWANLOCK NORTHERN ARIZONA UNIVERSITY SCHOOL OF INFORMATICS, COMPUTING & CYBER SYSTEMS BEN KARSIN UNIVERSITY OF HAWAII AT MANOA DEPARTMENT
More informationUsing the Cray Gemini Performance Counters
Photos placed in horizontal position with even amount of white space between photos and header Using the Cray Gemini Performance Counters 0 1 2 3 4 5 6 7 Backplane Backplane 8 9 10 11 12 13 14 15 Backplane
More informationIt s a Multicore World. John Urbanic Pittsburgh Supercomputing Center Parallel Computing Scientist
It s a Multicore World John Urbanic Pittsburgh Supercomputing Center Parallel Computing Scientist Waiting for Moore s Law to save your serial code started getting bleak in 2004 Source: published SPECInt
More informationParallel Architectures
Parallel Architectures Part 1: The rise of parallel machines Intel Core i7 4 CPU cores 2 hardware thread per core (8 cores ) Lab Cluster Intel Xeon 4/10/16/18 CPU cores 2 hardware thread per core (8/20/32/36
More informationShallow Water Simulations on Graphics Hardware
Shallow Water Simulations on Graphics Hardware Ph.D. Thesis Presentation 2014-06-27 Martin Lilleeng Sætra Outline Introduction Parallel Computing and the GPU Simulating Shallow Water Flow Topics of Thesis
More informationMassively Parallel Graph Analytics
Massively Parallel Graph Analytics Manycore graph processing, distributed graph layout, and supercomputing for graph analytics George M. Slota 1,2,3 Kamesh Madduri 2 Sivasankaran Rajamanickam 1 1 Sandia
More informationGPU Debugging Made Easy. David Lecomber CTO, Allinea Software
GPU Debugging Made Easy David Lecomber CTO, Allinea Software david@allinea.com Allinea Software HPC development tools company Leading in HPC software tools market Wide customer base Blue-chip engineering,
More informationHeterogeneous CPU+GPU Molecular Dynamics Engine in CHARMM
Heterogeneous CPU+GPU Molecular Dynamics Engine in CHARMM 25th March, GTC 2014, San Jose CA AnE- Pekka Hynninen ane.pekka.hynninen@nrel.gov NREL is a na*onal laboratory of the U.S. Department of Energy,
More informationCtrl-C: Instruction-Aware Control Loop Based Adaptive Cache Bypassing for GPUs
The 34 th IEEE International Conference on Computer Design Ctrl-C: Instruction-Aware Control Loop Based Adaptive Cache Bypassing for GPUs Shin-Ying Lee and Carole-Jean Wu Arizona State University October
More informationPortability and Scalability of Sparse Tensor Decompositions on CPU/MIC/GPU Architectures
Photos placed in horizontal position with even amount of white space between photos and header Portability and Scalability of Sparse Tensor Decompositions on CPU/MIC/GPU Architectures Christopher Forster,
More informationManaging HPC Active Archive Storage with HPSS RAIT at Oak Ridge National Laboratory
Managing HPC Active Archive Storage with HPSS RAIT at Oak Ridge National Laboratory Quinn Mitchell HPC UNIX/LINUX Storage Systems ORNL is managed by UT-Battelle for the US Department of Energy U.S. Department
More informationChallenges in HPC I/O
Challenges in HPC I/O Universität Basel Julian M. Kunkel German Climate Computing Center / Universität Hamburg 10. October 2014 Outline 1 High-Performance Computing 2 Parallel File Systems and Challenges
More informationHypergraph Exploitation for Data Sciences
Photos placed in horizontal position with even amount of white space between photos and header Hypergraph Exploitation for Data Sciences Photos placed in horizontal position with even amount of white space
More informationAras Innovator Active Directory Sync
Photos placed in horizontal position with even amount of white space between photos and header Aras Innovator Active Directory Sync Sandia National Laboratories is a multi-program laboratory managed and
More informationIntroduction II. Overview
Introduction II Overview Today we will introduce multicore hardware (we will introduce many-core hardware prior to learning OpenCL) We will also consider the relationship between computer hardware and
More informationParallel Programming Futures: What We Have and Will Have Will Not Be Enough
Parallel Programming Futures: What We Have and Will Have Will Not Be Enough Michael Heroux SOS 22 March 27, 2018 Sandia National Laboratories is a multimission laboratory managed and operated by National
More informationEMPRESS Extensible Metadata PRovider for Extreme-scale Scientific Simulations
EMPRESS Extensible Metadata PRovider for Extreme-scale Scientific Simulations Photos placed in horizontal position with even amount of white space between photos and header Margaret Lawson, Jay Lofstead,
More informationAsynchronous Termination Detection Module User s Guide
Asynchronous Termination Detection Module User s Guide William McLon III Sandia National Laboratories wcmclen@sandia.gov 1 Introduction Interprocessor communications are performed through point-to-point
More informationOak Ridge National Laboratory Computing and Computational Sciences
Oak Ridge National Laboratory Computing and Computational Sciences OFA Update by ORNL Presented by: Pavel Shamis (Pasha) OFA Workshop Mar 17, 2015 Acknowledgments Bernholdt David E. Hill Jason J. Leverman
More informationTrends and Challenges in Multicore Programming
Trends and Challenges in Multicore Programming Eva Burrows Bergen Language Design Laboratory (BLDL) Department of Informatics, University of Bergen Bergen, March 17, 2010 Outline The Roadmap of Multicores
More informationAn Integrated Synchronization and Consistency Protocol for the Implementation of a High-Level Parallel Programming Language
An Integrated Synchronization and Consistency Protocol for the Implementation of a High-Level Parallel Programming Language Martin C. Rinard (martin@cs.ucsb.edu) Department of Computer Science University
More informationIt s a Multicore World. John Urbanic Pittsburgh Supercomputing Center Parallel Computing Scientist
It s a Multicore World John Urbanic Pittsburgh Supercomputing Center Parallel Computing Scientist Waiting for Moore s Law to save your serial code started getting bleak in 2004 Source: published SPECInt
More informationPuLP: Scalable Multi-Objective Multi-Constraint Partitioning for Small-World Networks
PuLP: Scalable Multi-Objective Multi-Constraint Partitioning for Small-World Networks George M. Slota 1,2 Kamesh Madduri 2 Sivasankaran Rajamanickam 1 1 Sandia National Laboratories, 2 The Pennsylvania
More informationMicrogrid Design Toolkit (MDT)
SAND2016-8151 C Microgrid Design Toolkit (MDT) 24 October 2016 John Eddy, Ph.D. Microgrid Design Toolkit (MDT) Principal Investigator System Sustainment & Readiness Technologies Department Sandia National
More informationHPCGraph: Benchmarking Massive Graph Analytics on Supercomputers
HPCGraph: Benchmarking Massive Graph Analytics on Supercomputers George M. Slota 1, Siva Rajamanickam 2, Kamesh Madduri 3 1 Rensselaer Polytechnic Institute 2 Sandia National Laboratories a 3 The Pennsylvania
More informationParallel Graph Coloring For Many- core Architectures
Parallel Graph Coloring For Many- core Architectures Mehmet Deveci, Erik Boman, Siva Rajamanickam Sandia Na;onal Laboratories Sandia National Laboratories is a multi-program laboratory managed and operated
More informationPortable and Productive Performance with OpenACC Compilers and Tools. Luiz DeRose Sr. Principal Engineer Programming Environments Director Cray Inc.
Portable and Productive Performance with OpenACC Compilers and Tools Luiz DeRose Sr. Principal Engineer Programming Environments Director Cray Inc. 1 Cray: Leadership in Computational Research Earth Sciences
More informationParallel and Distributed Systems. Hardware Trends. Why Parallel or Distributed Computing? What is a parallel computer?
Parallel and Distributed Systems Instructor: Sandhya Dwarkadas Department of Computer Science University of Rochester What is a parallel computer? A collection of processing elements that communicate and
More informationIrregular Graph Algorithms on Parallel Processing Systems
Irregular Graph Algorithms on Parallel Processing Systems George M. Slota 1,2 Kamesh Madduri 1 (advisor) Sivasankaran Rajamanickam 2 (Sandia mentor) 1 Penn State University, 2 Sandia National Laboratories
More informationChristopher Sewell Katrin Heitmann Li-ta Lo Salman Habib James Ahrens
LA-UR- 14-25437 Approved for public release; distribution is unlimited. Title: Portable Parallel Halo and Center Finders for HACC Author(s): Christopher Sewell Katrin Heitmann Li-ta Lo Salman Habib James
More informationSST + MacSim. Case Studies Using SST MacSim. Genie Hsieh Sandia National Labs
Photos placed in horizontal position with even amount of white space between photos and header SST + MacSim Case Studies Using SST MacSim Genie Hsieh Sandia National Labs Sandia National Laboratories is
More informationCRAY XK6 REDEFINING SUPERCOMPUTING. - Sanjana Rakhecha - Nishad Nerurkar
CRAY XK6 REDEFINING SUPERCOMPUTING - Sanjana Rakhecha - Nishad Nerurkar CONTENTS Introduction History Specifications Cray XK6 Architecture Performance Industry acceptance and applications Summary INTRODUCTION
More informationIntroduction: Modern computer architecture. The stored program computer and its inherent bottlenecks Multi- and manycore chips and nodes
Introduction: Modern computer architecture The stored program computer and its inherent bottlenecks Multi- and manycore chips and nodes Motivation: Multi-Cores where and why Introduction: Moore s law Intel
More informationHashing Strategies for the Cray XMT MTAAP 2010
Hashing Strategies for the Cray XMT MTAAP 2010 Eric Goodman (SNL) David Haglin (PNNL) Chad Scherrer (PNNL) Daniel Chavarría-Miranda (PNNL) Jace Mogill (PNNL) John Feo (PNNL) Sandia is a multiprogram laboratory
More informationNVIDIA s Compute Unified Device Architecture (CUDA)
NVIDIA s Compute Unified Device Architecture (CUDA) Mike Bailey mjb@cs.oregonstate.edu Reaching the Promised Land NVIDIA GPUs CUDA Knights Corner Speed Intel CPUs General Programmability 1 History of GPU
More informationNVIDIA s Compute Unified Device Architecture (CUDA)
NVIDIA s Compute Unified Device Architecture (CUDA) Mike Bailey mjb@cs.oregonstate.edu Reaching the Promised Land NVIDIA GPUs CUDA Knights Corner Speed Intel CPUs General Programmability History of GPU
More informationSpatio-temporal Feature Exploration in Combined Particle/Volume Reference Frames
Spatio-temporal Feature Eploration in Combined Particle/Volume Reference Frames Franz Sauer and Kwan-Liu Ma VIDI Labs University of California, Davis 2 Introduction: Problem Statement More scientific simulations
More informationMPICH: A High-Performance Open-Source MPI Implementation. SC11 Birds of a Feather Session
MPICH: A High-Performance Open-Source MPI Implementation SC11 Birds of a Feather Session Schedule MPICH2 status and plans Presenta
More informationChallenges and Opportunities for HPC Interconnects and MPI
Challenges and Opportunities for HPC Interconnects and MPI Ron Brightwell, R&D Manager Scalable System Software Department Sandia National Laboratories is a multi-mission laboratory managed and operated
More informationExtracting Hidden Messages in Steganographic Images
DIGITAL FORENSIC RESEARCH CONFERENCE Extracting Hidden Messages in Steganographic Images By Tu-Thach Quach Presented At The Digital Forensic Research Conference DFRWS 2014 USA Denver, CO (Aug 3 rd - 6
More informationCUDA GPGPU Workshop 2012
CUDA GPGPU Workshop 2012 Parallel Programming: C thread, Open MP, and Open MPI Presenter: Nasrin Sultana Wichita State University 07/10/2012 Parallel Programming: Open MP, MPI, Open MPI & CUDA Outline
More information8. Solving Stochastic Programs
8. Solving Stochastic Programs Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S.
More informationCUDA (Compute Unified Device Architecture)
CUDA (Compute Unified Device Architecture) Mike Bailey History of GPU Performance vs. CPU Performance GFLOPS Source: NVIDIA G80 = GeForce 8800 GTX G71 = GeForce 7900 GTX G70 = GeForce 7800 GTX NV40 = GeForce
More informationTimothy Lanfear, NVIDIA HPC
GPU COMPUTING AND THE Timothy Lanfear, NVIDIA FUTURE OF HPC Exascale Computing will Enable Transformational Science Results First-principles simulation of combustion for new high-efficiency, lowemision
More informationComputational and Statistical Tradeoffs in VoI-Driven Learning
ARO ARO MURI MURI on on Value-centered Theory for for Adaptive Learning, Inference, Tracking, and and Exploitation Computational and Statistical Tradefs in VoI-Driven Learning Co-PI Michael Jordan University
More informationPreconditioning Linear Systems Arising from Graph Laplacians of Complex Networks
Preconditioning Linear Systems Arising from Graph Laplacians of Complex Networks Kevin Deweese 1 Erik Boman 2 1 Department of Computer Science University of California, Santa Barbara 2 Scalable Algorithms
More informationad-heap: an Efficient Heap Data Structure for Asymmetric Multicore Processors
ad-heap: an Efficient Heap Data Structure for Asymmetric Multicore Processors Weifeng Liu and Brian Vinter Niels Bohr Institute University of Copenhagen Denmark {weifeng, vinter}@nbi.dk March 1, 2014 Weifeng
More informationGPU Performance Nuggets
GPU Performance Nuggets Simon Garcia de Gonzalo & Carl Pearson PhD Students, IMPACT Research Group Advised by Professor Wen-mei Hwu Jun. 15, 2016 grcdgnz2@illinois.edu pearson@illinois.edu GPU Performance
More informationRadiation Modeling Using the Uintah Heterogeneous CPU/GPU Runtime System
Radiation Modeling Using the Uintah Heterogeneous CPU/GPU Runtime System Alan Humphrey, Qingyu Meng, Martin Berzins, Todd Harman Scientific Computing and Imaging Institute & University of Utah I. Uintah
More informationThe Stampede is Coming Welcome to Stampede Introductory Training. Dan Stanzione Texas Advanced Computing Center
The Stampede is Coming Welcome to Stampede Introductory Training Dan Stanzione Texas Advanced Computing Center dan@tacc.utexas.edu Thanks for Coming! Stampede is an exciting new system of incredible power.
More informationExploiting the OpenPOWER Platform for Big Data Analytics and Cognitive. Rajesh Bordawekar and Ruchir Puri IBM T. J. Watson Research Center
Exploiting the OpenPOWER Platform for Big Data Analytics and Cognitive Rajesh Bordawekar and Ruchir Puri IBM T. J. Watson Research Center 3/17/2015 2014 IBM Corporation Outline IBM OpenPower Platform Accelerating
More informationSandia National Laboratories Solutions for Aras Innovator
Photos placed in horizontal position with even amount of white space between photos and header Sandia National Laboratories Solutions for Aras Innovator Sandia National Laboratories is a multi-program
More informationGPUs and Emerging Architectures
GPUs and Emerging Architectures Mike Giles mike.giles@maths.ox.ac.uk Mathematical Institute, Oxford University e-infrastructure South Consortium Oxford e-research Centre Emerging Architectures p. 1 CPUs
More informationSimulation of Workflow and Threat Characteristics for Cyber Security Incident Response Teams
Simulation of Workflow and Threat Characteristics for Cyber Security Incident Response Teams Theodore Reed, Robert G. Abbott, Benjamin Anderson, Kevin Nauer & Chris Forsythe Sandia National Laboratories
More informationRisk Informed Cyber Security for Nuclear Power Plants
Risk Informed Cyber Security for Nuclear Power Plants Phillip L. Turner, Timothy A. Wheeler, Matt Gibson Sandia National Laboratories Electric Power Research Institute Albuquerque, NM USA Charlotte, NC
More informationEfficiency and Programmability: Enablers for ExaScale. Bill Dally Chief Scientist and SVP, Research NVIDIA Professor (Research), EE&CS, Stanford
Efficiency and Programmability: Enablers for ExaScale Bill Dally Chief Scientist and SVP, Research NVIDIA Professor (Research), EE&CS, Stanford Scientific Discovery and Business Analytics Driving an Insatiable
More informationSupercomputer Field Data. DRAM, SRAM, and Projections for Future Systems
Supercomputer Field Data DRAM, SRAM, and Projections for Future Systems Nathan DeBardeleben, Ph.D. (LANL) Ultrascale Systems Research Center (USRC) 6 th Soft Error Rate (SER) Workshop Santa Clara, October
More informationAutomatic Identification of Application I/O Signatures from Noisy Server-Side Traces. Yang Liu Raghul Gunasekaran Xiaosong Ma Sudharshan S.
Automatic Identification of Application I/O Signatures from Noisy Server-Side Traces Yang Liu Raghul Gunasekaran Xiaosong Ma Sudharshan S. Vazhkudai Instance of Large-Scale HPC Systems ORNL s TITAN (World
More informationTurbostream: A CFD solver for manycore
Turbostream: A CFD solver for manycore processors Tobias Brandvik Whittle Laboratory University of Cambridge Aim To produce an order of magnitude reduction in the run-time of CFD solvers for the same hardware
More informationAn Introduction to OpenACC
An Introduction to OpenACC Alistair Hart Cray Exascale Research Initiative Europe 3 Timetable Day 1: Wednesday 29th August 2012 13:00 Welcome and overview 13:15 Session 1: An Introduction to OpenACC 13:15
More informationParallelism. Parallel Hardware. Introduction to Computer Systems
Parallelism We have been discussing the abstractions and implementations that make up an individual computer system in considerable detail up to this point. Our model has been a largely sequential one,
More informationSerial. Parallel. CIT 668: System Architecture 2/14/2011. Topics. Serial and Parallel Computation. Parallel Computing
CIT 668: System Architecture Parallel Computing Topics 1. What is Parallel Computing? 2. Why use Parallel Computing? 3. Types of Parallelism 4. Amdahl s Law 5. Flynn s Taxonomy of Parallel Computers 6.
More informationTrends in HPC (hardware complexity and software challenges)
Trends in HPC (hardware complexity and software challenges) Mike Giles Oxford e-research Centre Mathematical Institute MIT seminar March 13th, 2013 Mike Giles (Oxford) HPC Trends March 13th, 2013 1 / 18
More information