Heuristic-based techniques for mapping irregular communication graphs to mesh topologies
|
|
- Henry Lynch
- 6 years ago
- Views:
Transcription
1 Heuristic-based techniques for mapping irregular communication graphs to mesh topologies Abhinav Bhatele and Laxmikant V. Kale University of Illinois at Urbana-Champaign LLNL-PRES , P. O. Box 808, Livermore, CA 94551
2 Introduction Various kinds of interconnect networks deployed today for supercomputers mesh, fat-tree, Kautz graph, dragonfly Link sharing leads to contention and performance slowdowns Communication optimization becoming increasingly important 2
3 Topology aware mapping Mapping the communication graph of an application to the physical topology can optimize communication Diverse set of parameters from the application graph and the processor topology one solution may not do well in all cases Specific heuristics for mesh topologies and regular/ irregular communication graphs 3
4 Automatic Mapping Framework Application communication graph Pattern Matching Framework Regular Graphs Irregular Graphs 2D Object Graph 3D Object Graph W/o coordinate information W/ coordinate information MXOVLP, MXOV_AL, EXC, COCE, AFFN EXC, COCE, AFFN BFT, MHT, Infer structure AFFN, COCE, COCE+MHT Choose best heuristic depending on hop-bytes Processor topology information Output: Mapping file used for the next run 4
5 Automatic Mapping Framework Application communication graph Pattern Matching Framework Regular Graphs Irregular Graphs 2D Object Graph 3D Object Graph W/o coordinate information W/ coordinate information MXOVLP, MXOV_AL, EXC, COCE, AFFN EXC, COCE, AFFN BFT, MHT, Infer structure AFFN, COCE, COCE+MHT Choose best heuristic depending on hop-bytes Processor topology information Output: Mapping file used for the next run 4
6 Automatic Mapping Framework Application communication graph Pattern Matching Framework Regular Graphs Irregular Graphs 2D Object Graph 3D Object Graph W/o coordinate information W/ coordinate information MXOVLP, MXOV_AL, EXC, COCE, AFFN EXC, COCE, AFFN BFT, MHT, Infer structure AFFN, COCE, COCE+MHT Choose best heuristic depending on hop-bytes Processor topology information Output: Mapping file used for the next run 4
7 Automatic Mapping Framework Application communication graph Pattern Matching Framework Regular Graphs Irregular Graphs 2D Object Graph 3D Object Graph W/o coordinate information W/ coordinate information MXOVLP, MXOV_AL, EXC, COCE, AFFN EXC, COCE, AFFN BFT, MHT, Infer structure AFFN, COCE, COCE+MHT Relieve the application developer of the mapping burden Choose best heuristic depending on hop-bytes Output: Mapping file used for the next run Processor topology information 4
8 Automatic Mapping Framework Application communication graph Pattern Matching Framework Regular Graphs Irregular Graphs 2D Object Graph 3D Object Graph W/o coordinate information W/ coordinate information MXOVLP, MXOV_AL, EXC, COCE, AFFN EXC, COCE, AFFN BFT, MHT, Infer structure AFFN, COCE, COCE+MHT Relieve the application developer of the mapping burden Choose best heuristic depending on hop-bytes Output: Mapping file used for the next run Processor topology information No change to the application code 4
9 Two different scenarios There is no spatial information associated with the node Option 1: Work without it Option 2: If we know that the simulation has a geometric configuration, try to infer the structure of the graph We have geometric coordinate information for each node Use coordinate information to avoid crossing of edges and for other optimizations 5
10 Finding the nearest available Algorithms in this paper: processor pick a vertex in the graph to map find a desirable processor to place it on Spiraling Quadtree 6
11 Finding the nearest available Algorithms in this paper: processor pick a vertex in the graph to map find a desirable processor to place it on Spiraling Quadtree 6
12 Finding the nearest available Algorithms in this paper: processor pick a vertex in the graph to map find a desirable processor to place it on Spiraling Quadtree 6
13 Finding the nearest available Algorithms in this paper: processor pick a vertex in the graph to map find a desirable processor to place it on Spiraling Quadtree 6
14 Finding the nearest available processor Algorithms in this paper: pick a vertex in the graph to map find a desirable processor to place it on Spiraling Quadtree 6
15 Finding the nearest available processor 7
16 Finding the nearest available processor consecutive calls to the spiraling and quadtree algorithms for findnearest from the AFFN mapping regular communi- (2D) mesh. We then discuss mapping heuristics for different 7
17 Spiraling vs Quadtree Execution Time (ms) Spiral (corner) Quadtree (corner) Spiral (center) Quadtree (center) Number of nodes in the communication graph 8
18 Spiraling vs Quadtree Performance when called from AFFN: a mapping algorithm Execution Time (ms) Spiraling Quadtree Number of nodes in the communication graph 9
19 Mapping Irregular Graphs Object graph: 90 nodes Processor Mesh: 10 x 9 10
20 No coordinate information 11
21 No coordinate information Breadth first traversal (BFT) Start with a random node and one end of the processor mesh Map nodes as you encounter them close to their parent 11
22 No coordinate information Breadth first traversal (BFT) Start with a random node and one end of the processor mesh Map nodes as you encounter them close to their parent Max heap traversal (MHT) Start with a random node and one end/center of the mesh Put neighbors of a mapped node into the heap (node at the top is the one with maximum number of mapped neighbors) Map elements in the heap one by one around the centroid of their mapped neighbors 11
23 Mapping visualization BFT: 2.89 MHT:
24 Inferring the spatial placement 13
25 Inferring the spatial placement Graph layout algorithms Force-based layout to reduce the total energy in the system Use the graphviz library to obtain coordinates of the nodes 13
26 Inferring the spatial placement Graph layout algorithms Force-based layout to reduce the total energy in the system Use the graphviz library to obtain coordinates of the nodes 13
27 With coordinate information Affine Mapping (AFFN) Stretch/shrink the object graph (based on coordinates of nodes) to map it on to the processor grid In case of conflicts for the same processor, find the nearest available processor Corners to Center (COCE) Use four corners of the object graph based on coordinates Start mapping simultaneously from all sides Place nodes encountered during a BFT close to their parents 14
28 Mapping visualization AFFN: 3.17 COCE:
29 COCE+MHT Hybrid: We fix four nodes at geometric corners of the mesh to four processors in 2D Put neighbors of these nodes into a max heap Map from all sides inwards Starting from centroid of mapped neighbors COCE:
30 Time Complexity 17
31 Time Complexity All algorithms discussed above choose a desired processor and spiral around it to find the nearest available processor Heuristics generally applicable to any topology 17
32 Time Complexity All algorithms discussed above choose a desired processor and spiral around it to find the nearest available processor Heuristics generally applicable to any topology Depending on the running time of findnearest: BFT COCE AFFN MHT COCE+MHT O(n) O(n) O(n) O(n logn) O(n logn) O(n (logn) 2 ) O(n (logn) 2 ) O(n (logn) 2 ) O(n (logn) 2 ) O(n (logn) 2 ) 17
33 Running Time Time (ms) COCE+MHT MHT AFFN COCE BFT Number of nodes 18
34 Evaluation Metric for comparison: hop-bytes ere Indicates amount of traffic and hence contention on the network average hops per byte =! nx d i b i Previously used metric: maximum dilation i=1! nx b i i=1 is the number of hops traversed by a message of 19
35 Mapping Simple2D to a 2D mesh Average hops per byte Default (METIS) BFT MHT COCE COCE+MHT AFFN PAIRS Number of nodes in the communication graph 20
36 Mapping Simple2D to a 2D mesh Average hops per byte Default (METIS) BFT MHT COCE COCE+MHT AFFN 512 x X X 8 64 X X 32 Different aspect ratios of a 2D mesh 21
37 Mapping Simple2D to a 3D mesh Average hops per byte Random Default (METIS) BFT MHT COCE+MHT Number of nodes in the communication graph 22
38 Summary Heuristics for mapping irregular graphs to mesh topologies best heuristic chosen at runtime (based on hop-bytes) Extensible to other topologies also Mapping library to help the application developer 23
39 Questions? Abhinav Bhatele, Automating Topology Aware Mapping for Supercomputers, PhD Thesis, Department of Computer Science, University of Illinois. LLNL-PRES , P. O. Box 808, Livermore, CA 94551
Automated Mapping of Regular Communication Graphs on Mesh Interconnects
Automated Mapping of Regular Communication Graphs on Mesh Interconnects Abhinav Bhatele, Gagan Gupta, Laxmikant V. Kale and I-Hsin Chung Motivation Running a parallel application on a linear array of processors:
More informationAbhinav Bhatele, Laxmikant V. Kale University of Illinois at Urbana Champaign Sameer Kumar IBM T. J. Watson Research Center
Abhinav Bhatele, Laxmikant V. Kale University of Illinois at Urbana Champaign Sameer Kumar IBM T. J. Watson Research Center Motivation: Contention Experiments Bhatele, A., Kale, L. V. 2008 An Evaluation
More informationPREDICTING COMMUNICATION PERFORMANCE
PREDICTING COMMUNICATION PERFORMANCE Nikhil Jain CASC Seminar, LLNL This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract
More informationTraceR: A Parallel Trace Replay Tool for Studying Interconnection Networks
TraceR: A Parallel Trace Replay Tool for Studying Interconnection Networks Bilge Acun, PhD Candidate Department of Computer Science University of Illinois at Urbana- Champaign *Contributors: Nikhil Jain,
More informationScalable Topology Aware Object Mapping for Large Supercomputers
Scalable Topology Aware Object Mapping for Large Supercomputers Abhinav Bhatele (bhatele@illinois.edu) Department of Computer Science, University of Illinois at Urbana-Champaign Advisor: Laxmikant V. Kale
More informationA CASE STUDY OF COMMUNICATION OPTIMIZATIONS ON 3D MESH INTERCONNECTS
A CASE STUDY OF COMMUNICATION OPTIMIZATIONS ON 3D MESH INTERCONNECTS Abhinav Bhatele, Eric Bohm, Laxmikant V. Kale Parallel Programming Laboratory Euro-Par 2009 University of Illinois at Urbana-Champaign
More informationAutomated Mapping of Regular Communication Graphs on Mesh Interconnects
Automated Mapping of Regular Communication Graphs on Mesh Interconnects Abhinav Bhatelé, Gagan Raj Gupta, Laxmikant V. Kalé Department of Computer Science University of Illinois at Urbana-Champaign Urbana,
More informationOptimization of the Hop-Byte Metric for Effective Topology Aware Mapping
Optimization of the Hop-Byte Metric for Effective Topology Aware Mapping C. D. Sudheer Department of Mathematics and Computer Science Sri Sathya Sai Institute of Higher Learning, India Email: cdsudheerkumar@sssihl.edu.in
More informationTutorial: Application MPI Task Placement
Tutorial: Application MPI Task Placement Juan Galvez Nikhil Jain Palash Sharma PPL, University of Illinois at Urbana-Champaign Tutorial Outline Why Task Mapping on Blue Waters? When to do mapping? How
More informationIS TOPOLOGY IMPORTANT AGAIN? Effects of Contention on Message Latencies in Large Supercomputers
IS TOPOLOGY IMPORTANT AGAIN? Effects of Contention on Message Latencies in Large Supercomputers Abhinav S Bhatele and Laxmikant V Kale ACM Research Competition, SC 08 Outline Why should we consider topology
More informationToward Runtime Power Management of Exascale Networks by On/Off Control of Links
Toward Runtime Power Management of Exascale Networks by On/Off Control of Links, Nikhil Jain, Laxmikant Kale University of Illinois at Urbana-Champaign HPPAC May 20, 2013 1 Power challenge Power is a major
More informationMaximizing Throughput on a Dragonfly Network
Maximizing Throughput on a Dragonfly Network Nikhil Jain, Abhinav Bhatele, Xiang Ni, Nicholas J. Wright, Laxmikant V. Kale Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana,
More informationOrder or Shuffle: Empirically Evaluating Vertex Order Impact on Parallel Graph Computations
Order or Shuffle: Empirically Evaluating Vertex Order Impact on Parallel Graph Computations George M. Slota 1 Sivasankaran Rajamanickam 2 Kamesh Madduri 3 1 Rensselaer Polytechnic Institute, 2 Sandia National
More informationCharm++ for Productivity and Performance
Charm++ for Productivity and Performance A Submission to the 2011 HPC Class II Challenge Laxmikant V. Kale Anshu Arya Abhinav Bhatele Abhishek Gupta Nikhil Jain Pritish Jetley Jonathan Lifflander Phil
More informationReducing Communication Costs Associated with Parallel Algebraic Multigrid
Reducing Communication Costs Associated with Parallel Algebraic Multigrid Amanda Bienz, Luke Olson (Advisor) University of Illinois at Urbana-Champaign Urbana, IL 11 I. PROBLEM AND MOTIVATION Algebraic
More informationA Parallel-Object Programming Model for PetaFLOPS Machines and BlueGene/Cyclops Gengbin Zheng, Arun Singla, Joshua Unger, Laxmikant Kalé
A Parallel-Object Programming Model for PetaFLOPS Machines and BlueGene/Cyclops Gengbin Zheng, Arun Singla, Joshua Unger, Laxmikant Kalé Parallel Programming Laboratory Department of Computer Science University
More informationNetSpeed ORION: A New Approach to Design On-chip Interconnects. August 26 th, 2013
NetSpeed ORION: A New Approach to Design On-chip Interconnects August 26 th, 2013 INTERCONNECTS BECOMING INCREASINGLY IMPORTANT Growing number of IP cores Average SoCs today have 100+ IPs Mixing and matching
More informationMemory Hierarchy Management for Iterative Graph Structures
Memory Hierarchy Management for Iterative Graph Structures Ibraheem Al-Furaih y Syracuse University Sanjay Ranka University of Florida Abstract The increasing gap in processor and memory speeds has forced
More informationParallel Computing Platforms
Parallel Computing Platforms Routing, Network Embedding John Mellor-Crummey Department of Computer Science Rice University johnmc@rice.edu COMP 422/534 Lecture 14-15 4,11 October 2018 Topics for Today
More informationTopology-aware task mapping for reducing communication contention on large parallel machines
Topology-aware task mapping for reducing communication contention on large parallel machines Tarun Agarwal, Amit Sharma, Laxmikant V. Kalé Dept. of Computer Science University of Illinois at Urbana-Champaign
More informationGIAN Course on Distributed Network Algorithms. Network Topologies and Local Routing
GIAN Course on Distributed Network Algorithms Network Topologies and Local Routing Stefan Schmid @ T-Labs, 2011 GIAN Course on Distributed Network Algorithms Network Topologies and Local Routing If you
More informationRouting Algorithms. Review
Routing Algorithms Today s topics: Deterministic, Oblivious Adaptive, & Adaptive models Problems: efficiency livelock deadlock 1 CS6810 Review Network properties are a combination topology topology dependent
More informationCS 498 Hot Topics in High Performance Computing. Networks and Fault Tolerance. 9. Routing and Flow Control
CS 498 Hot Topics in High Performance Computing Networks and Fault Tolerance 9. Routing and Flow Control Intro What did we learn in the last lecture Topology metrics Including minimum diameter of directed
More informationShort Talk: System abstractions to facilitate data movement in supercomputers with deep memory and interconnect hierarchy
Short Talk: System abstractions to facilitate data movement in supercomputers with deep memory and interconnect hierarchy François Tessier, Venkatram Vishwanath Argonne National Laboratory, USA July 19,
More informationPTRAM: A Parallel Topology- and Routing-Aware Mapping Framework for Large-Scale HPC Systems
PTRAM: A Parallel Topology- and Routing-Aware Mapping Framework for Large-Scale HPC Systems Seyed H. Mirsadeghi and Ahmad Afsahi ECE Department, Queen s University, Kingston, ON, Canada, K7L 3N6 Email:
More informationTOPOLOGY CONTROL IN MOBILE AD HOC NETWORKS WITH COOPERATIVE COMMUNICATIONS
IEEE 2012 Transactions on Wireless Communications, Volume: 9, Issue: 2 TOPOLOGY CONTROL IN MOBILE AD HOC NETWORKS WITH COOPERATIVE COMMUNICATIONS Abstract Cooperative communication has received tremendous
More informationGeneric Topology Mapping Strategies for Large-scale Parallel Architectures
Generic Topology Mapping Strategies for Large-scale Parallel Architectures Torsten Hoefler and Marc Snir Scientific talk at ICS 11, Tucson, AZ, USA, June 1 st 2011, Hierarchical Sparse Networks are Ubiquitous
More informationExtreme-scale Graph Analysis on Blue Waters
Extreme-scale Graph Analysis on Blue Waters 2016 Blue Waters Symposium George M. Slota 1,2, Siva Rajamanickam 1, Kamesh Madduri 2, Karen Devine 1 1 Sandia National Laboratories a 2 The Pennsylvania State
More informationLecture 2: Topology - I
ECE 8823 A / CS 8803 - ICN Interconnection Networks Spring 2017 http://tusharkrishna.ece.gatech.edu/teaching/icn_s17/ Lecture 2: Topology - I Tushar Krishna Assistant Professor School of Electrical and
More informationCS 258, Spring 99 David E. Culler Computer Science Division U.C. Berkeley Wide links, smaller routing delay Tremendous variation 3/19/99 CS258 S99 2
Real Machines Interconnection Network Topology Design Trade-offs CS 258, Spring 99 David E. Culler Computer Science Division U.C. Berkeley Wide links, smaller routing delay Tremendous variation 3/19/99
More informationApplying Graph Partitioning Methods in Measurement-based Dynamic Load Balancing
Applying Graph Partitioning Methods in Measurement-based Dynamic Load Balancing HARSHITHA MENON, University of Illinois at Urbana-Champaign ABHINAV BHATELE, Lawrence Livermore National Laboratory SÉBASTIEN
More informationNUMA-Aware Data-Transfer Measurements for Power/NVLink Multi-GPU Systems
NUMA-Aware Data-Transfer Measurements for Power/NVLink Multi-GPU Systems Carl Pearson 1, I-Hsin Chung 2, Zehra Sura 2, Wen-Mei Hwu 1, and Jinjun Xiong 2 1 University of Illinois Urbana-Champaign, Urbana
More informationPartitioning Low-diameter Networks to Eliminate Inter-job Interference
Partitioning Low-diameter Networks to Eliminate Inter-job Interference Nikhil Jain, Abhinav Bhatele, Xiang Ni, Todd Gamblin, Laxmikant V. Kale Center for Applied Scientific Computing, Lawrence Livermore
More informationLecture 26: Interconnects. James C. Hoe Department of ECE Carnegie Mellon University
18 447 Lecture 26: Interconnects James C. Hoe Department of ECE Carnegie Mellon University 18 447 S18 L26 S1, James C. Hoe, CMU/ECE/CALCM, 2018 Housekeeping Your goal today get an overview of parallel
More informationHierarchical task mapping for parallel applications on supercomputers
J Supercomput DOI 10.1007/s11227-014-1324-5 Hierarchical task mapping for parallel applications on supercomputers Jingjin Wu Xuanxing Xiong Zhiling Lan Springer Science+Business Media New York 2014 Abstract
More informationGPU Sparse Graph Traversal. Duane Merrill
GPU Sparse Graph Traversal Duane Merrill Breadth-first search of graphs (BFS) 1. Pick a source node 2. Rank every vertex by the length of shortest path from source Or label every vertex by its predecessor
More informationReview: Creating a Parallel Program. Programming for Performance
Review: Creating a Parallel Program Can be done by programmer, compiler, run-time system or OS Steps for creating parallel program Decomposition Assignment of tasks to processes Orchestration Mapping (C)
More informationSlim Fly: A Cost Effective Low-Diameter Network Topology
TORSTEN HOEFLER, MACIEJ BESTA Slim Fly: A Cost Effective Low-Diameter Network Topology Images belong to their creator! NETWORKS, LIMITS, AND DESIGN SPACE Networks cost 25-30% of a large supercomputer Hard
More informationLoad Balancing Techniques for Asynchronous Spacetime Discontinuous Galerkin Methods
Load Balancing Techniques for Asynchronous Spacetime Discontinuous Galerkin Methods Aaron K. Becker (abecker3@illinois.edu) Robert B. Haber Laxmikant V. Kalé University of Illinois, Urbana-Champaign Parallel
More informationScalable In-memory Checkpoint with Automatic Restart on Failures
Scalable In-memory Checkpoint with Automatic Restart on Failures Xiang Ni, Esteban Meneses, Laxmikant V. Kalé Parallel Programming Laboratory University of Illinois at Urbana-Champaign November, 2012 8th
More informationDetecting and Using Critical Paths at Runtime in Message Driven Parallel Programs
Detecting and Using Critical Paths at Runtime in Message Driven Parallel Programs Isaac Dooley, Laxmikant V. Kale Department of Computer Science University of Illinois idooley2@illinois.edu kale@illinois.edu
More informationPick a time window size w. In time span w, are there, Multiple References, to nearby addresses: Spatial Locality
Pick a time window size w. In time span w, are there, Multiple References, to nearby addresses: Spatial Locality Repeated References, to a set of locations: Temporal Locality Take advantage of behavior
More informationc Copyright by Tarun Agarwal, 2005
c Copyright by Tarun Agarwal, 2005 STRATEGIES FOR TOPOLOGY-AWARE TASK MAPPING AND FOR REBALANCING WITH BOUNDED MIGRATIONS BY TARUN AGARWAL B.Tech., Indian Institute of Technology, Delhi, 2003 THESIS Submitted
More informationHPC Algorithms and Applications
HPC Algorithms and Applications Dwarf #5 Structured Grids Michael Bader Winter 2012/2013 Dwarf #5 Structured Grids, Winter 2012/2013 1 Dwarf #5 Structured Grids 1. dense linear algebra 2. sparse linear
More informationAutomated Mapping of Structured Communication Graphs onto Mesh Interconnects
Automated Mapping of Structured Communication Graphs onto Mesh Interconnects Abhinav Bhatelé, I-Hsin Chung and Laxmikant V. Kalé Department of Computer Science University of Illinois at Urbana-Champaign,
More informationPrinciples of Wireless Sensor Networks. Routing, Zigbee, and RPL
http://www.ee.kth.se/~carlofi/teaching/pwsn-2011/wsn_course.shtml Lecture 8 Stockholm, November 11, 2011 Routing, Zigbee, and RPL Royal Institute of Technology - KTH Stockholm, Sweden e-mail: carlofi@kth.se
More informationApplying Graph Partitioning Methods in Measurement-based Dynamic Load Balancing
Applying Graph Partitioning Methods in Measurement-based Dynamic Load Balancing HARSHITHA MENON, University of Illinois at Urbana-Champaign ABHINAV BHATELE, Lawrence Livermore National Laboratory SÉBASTIEN
More informationDynamic Load Distributions for Adaptive Computations on MIMD Machines using Hybrid Genetic Algorithms (a subset)
Dynamic Load Distributions for Adaptive Computations on MIMD Machines using Hybrid Genetic Algorithms (a subset) TIMOTHY H. KAISER PhD, Computer Science - UNM MS, Electrical Engineering - (Applied Physics)
More informationImage-Space-Parallel Direct Volume Rendering on a Cluster of PCs
Image-Space-Parallel Direct Volume Rendering on a Cluster of PCs B. Barla Cambazoglu and Cevdet Aykanat Bilkent University, Department of Computer Engineering, 06800, Ankara, Turkey {berkant,aykanat}@cs.bilkent.edu.tr
More informationExtreme-scale Graph Analysis on Blue Waters
Extreme-scale Graph Analysis on Blue Waters 2016 Blue Waters Symposium George M. Slota 1,2, Siva Rajamanickam 1, Kamesh Madduri 2, Karen Devine 1 1 Sandia National Laboratories a 2 The Pennsylvania State
More informationToward portable I/O performance by leveraging system abstractions of deep memory and interconnect hierarchies
Toward portable I/O performance by leveraging system abstractions of deep memory and interconnect hierarchies François Tessier, Venkatram Vishwanath, Paul Gressier Argonne National Laboratory, USA Wednesday
More informationInterconnection Networks: Topology. Prof. Natalie Enright Jerger
Interconnection Networks: Topology Prof. Natalie Enright Jerger Topology Overview Definition: determines arrangement of channels and nodes in network Analogous to road map Often first step in network design
More informationCS 491 CAP Intermediate Dynamic Programming
CS 491 CAP Intermediate Dynamic Programming Victor Gao University of Illinois at Urbana-Champaign Oct 28, 2016 Linear DP Knapsack DP DP on a Grid Interval DP Division/Grouping DP Tree DP Set DP Outline
More informationGPU Sparse Graph Traversal
GPU Sparse Graph Traversal Duane Merrill (NVIDIA) Michael Garland (NVIDIA) Andrew Grimshaw (Univ. of Virginia) UNIVERSITY of VIRGINIA Breadth-first search (BFS) 1. Pick a source node 2. Rank every vertex
More informationGIAN Course on Distributed Network Algorithms. Network Topologies and Interconnects
GIAN Course on Distributed Network Algorithms Network Topologies and Interconnects Stefan Schmid @ T-Labs, 2011 The Many Faces and Flavors of Network Topologies Gnutella P2P. Social Networks. Internet.
More informationConstruction of All Rectilinear Steiner Minimum Trees on the Hanan Grid
Construction of All Rectilinear Steiner Minimum Trees on the Hanan Grid Sheng-En David Lin and Dae Hyun Kim Presenter: Dae Hyun Kim (Assistant Professor) daehyun@eecs.wsu.edu School of Electrical Engineering
More informationMultiprocessor Interconnection Networks- Part Three
Babylon University College of Information Technology Software Department Multiprocessor Interconnection Networks- Part Three By The k-ary n-cube Networks The k-ary n-cube network is a radix k cube with
More informationMapping MPI+X Applications to Multi-GPU Architectures
Mapping MPI+X Applications to Multi-GPU Architectures A Performance-Portable Approach Edgar A. León Computer Scientist San Jose, CA March 28, 2018 GPU Technology Conference This work was performed under
More informationGPU-Accelerated Topology Optimization on Unstructured Meshes
GPU-Accelerated Topology Optimization on Unstructured Meshes Tomás Zegard, Glaucio H. Paulino University of Illinois at Urbana-Champaign July 25, 2011 Tomás Zegard, Glaucio H. Paulino (UIUC) GPU TOP on
More informationMeshing of flow and heat transfer problems
Meshing of flow and heat transfer problems Luyao Zou a, Zhe Li b, Qiqi Fu c and Lujie Sun d School of, Shandong University of science and technology, Shandong 266590, China. a zouluyaoxf@163.com, b 1214164853@qq.com,
More informationApplications. Oversampled 3D scan data. ~150k triangles ~80k triangles
Mesh Simplification Applications Oversampled 3D scan data ~150k triangles ~80k triangles 2 Applications Overtessellation: E.g. iso-surface extraction 3 Applications Multi-resolution hierarchies for efficient
More informationGraph/Network Visualization
Graph/Network Visualization Data model: graph structures (relations, knowledge) and networks. Applications: Telecommunication systems, Internet and WWW, Retailers distribution networks knowledge representation
More informationCommunication-constrained p-center Problem for Event Coverage in Theme Parks
Communication-constrained p-center Problem for Event Coverage in Theme Parks Gürkan Solmaz*, Kemal Akkaya, Damla Turgut* *Department of Electrical Engineering and Computer Science University of Central
More informationParallel Computing Platforms
Parallel Computing Platforms Network Topologies John Mellor-Crummey Department of Computer Science Rice University johnmc@rice.edu COMP 422/534 Lecture 14 28 February 2017 Topics for Today Taxonomy Metrics
More informationChapter 7 CONCLUSION
97 Chapter 7 CONCLUSION 7.1. Introduction A Mobile Ad-hoc Network (MANET) could be considered as network of mobile nodes which communicate with each other without any fixed infrastructure. The nodes in
More informationLecture 20: Distributed Memory Parallelism. William Gropp
Lecture 20: Distributed Parallelism William Gropp www.cs.illinois.edu/~wgropp A Very Short, Very Introductory Introduction We start with a short introduction to parallel computing from scratch in order
More informationConstraints: Nodes are interdependent. Memory Energy Processing 4/1/2011. Dulanjalie Dhanapala March 30 th, 2011
PhD. Dissertation Proposal Network Awareness in Wireless Sensor Networks - for Intelligent Organization and Efficient Routing Dulanjalie Dhanapala March 30 th, 2011 ommittee Prof. Anura P. Jayasumana(Advisor)
More informationPICS - a Performance-analysis-based Introspective Control System to Steer Parallel Applications
PICS - a Performance-analysis-based Introspective Control System to Steer Parallel Applications Yanhua Sun, Laxmikant V. Kalé University of Illinois at Urbana-Champaign sun51@illinois.edu November 26,
More informationc Copyright by Sameer Kumar, 2005
c Copyright by Sameer Kumar, 2005 OPTIMIZING COMMUNICATION FOR MASSIVELY PARALLEL PROCESSING BY SAMEER KUMAR B. Tech., Indian Institute Of Technology Madras, 1999 M.S., University of Illinois at Urbana-Champaign,
More informationPreliminary Evaluation of a Parallel Trace Replay Tool for HPC Network Simulations
Preliminary Evaluation of a Parallel Trace Replay Tool for HPC Network Simulations Bilge Acun 1, Nikhil Jain 1, Abhinav Bhatele 2, Misbah Mubarak 3, Christopher D. Carothers 4, and Laxmikant V. Kale 1
More informationTopology and affinity aware hierarchical and distributed load-balancing in Charm++
Topology and affinity aware hierarchical and distributed load-balancing in Charm++ Emmanuel Jeannot, Guillaume Mercier, François Tessier Inria - IPB - LaBRI - University of Bordeaux - Argonne National
More information6. Parallel Volume Rendering Algorithms
6. Parallel Volume Algorithms This chapter introduces a taxonomy of parallel volume rendering algorithms. In the thesis statement we claim that parallel algorithms may be described by "... how the tasks
More informationNavigation and Metric Path Planning
Navigation and Metric Path Planning October 4, 2011 Minerva tour guide robot (CMU): Gave tours in Smithsonian s National Museum of History Example of Minerva s occupancy map used for navigation Objectives
More informationSubdivision Surfaces. Course Syllabus. Course Syllabus. Modeling. Equivalence of Representations. 3D Object Representations
Subdivision Surfaces Adam Finkelstein Princeton University COS 426, Spring 2003 Course Syllabus I. Image processing II. Rendering III. Modeling IV. Animation Image Processing (Rusty Coleman, CS426, Fall99)
More informationGPGPU LAB. Case study: Finite-Difference Time- Domain Method on CUDA
GPGPU LAB Case study: Finite-Difference Time- Domain Method on CUDA Ana Balevic IPVS 1 Finite-Difference Time-Domain Method Numerical computation of solutions to partial differential equations Explicit
More informationGeometric Modeling. Mesh Decimation. Mesh Decimation. Applications. Copyright 2010 Gotsman, Pauly Page 1. Oversampled 3D scan data
Applications Oversampled 3D scan data ~150k triangles ~80k triangles 2 Copyright 2010 Gotsman, Pauly Page 1 Applications Overtessellation: E.g. iso-surface extraction 3 Applications Multi-resolution hierarchies
More informationFault Tolerant and Secure Architectures for On Chip Networks With Emerging Interconnect Technologies. Mohsin Y Ahmed Conlan Wesson
Fault Tolerant and Secure Architectures for On Chip Networks With Emerging Interconnect Technologies Mohsin Y Ahmed Conlan Wesson Overview NoC: Future generation of many core processor on a single chip
More informationVisIt Overview. VACET: Chief SW Engineer ASC: V&V Shape Char. Lead. Hank Childs. Supercomputing 2006 Tampa, Florida November 13, 2006
VisIt Overview Hank Childs VACET: Chief SW Engineer ASC: V&V Shape Char. Lead Supercomputing 2006 Tampa, Florida November 13, 2006 27B element Rayleigh-Taylor Instability (MIRANDA, BG/L) This is UCRL-PRES-226373
More informationLPT Codes used with ROAM Splitting Algorithm CMSC451 Honors Project Joseph Brosnihan Mentor: Dave Mount Fall 2015
LPT Codes used with ROAM Splitting Algorithm CMSC451 Honors Project Joseph Brosnihan Mentor: Dave Mount Fall 2015 Abstract Simplicial meshes are commonly used to represent regular triangle grids. Of many
More informationDistributed State Es.ma.on Algorithms for Electric Power Systems
Distributed State Es.ma.on Algorithms for Electric Power Systems Ariana Minot, Blue Waters Graduate Fellow Professor Na Li, Professor Yue M. Lu Harvard University, School of Engineering and Applied Sciences
More informationLecture 28: Networks & Interconnect Architectural Issues Professor Randy H. Katz Computer Science 252 Spring 1996
Lecture 28: Networks & Interconnect Architectural Issues Professor Randy H. Katz Computer Science 252 Spring 1996 RHK.S96 1 Review: ABCs of Networks Starting Point: Send bits between 2 computers Queue
More information1.2 Graph Drawing Techniques
1.2 Graph Drawing Techniques Graph drawing is the automated layout of graphs We shall overview a number of graph drawing techniques For general graphs: Force Directed Spring Embedder Barycentre based Multicriteria
More informationEE 4683/5683: COMPUTER ARCHITECTURE
3/3/205 EE 4683/5683: COMPUTER ARCHITECTURE Lecture 8: Interconnection Networks Avinash Kodi, kodi@ohio.edu Agenda 2 Interconnection Networks Performance Metrics Topology 3/3/205 IN Performance Metrics
More information6. Node Disjoint Split Multipath Protocol for Unified. Multicasting through Announcements (NDSM-PUMA)
103 6. Node Disjoint Split Multipath Protocol for Unified Multicasting through Announcements (NDSM-PUMA) 6.1 Introduction It has been demonstrated in chapter 3 that the performance evaluation of the PUMA
More informationFoundations of Multidimensional and Metric Data Structures
Foundations of Multidimensional and Metric Data Structures Hanan Samet University of Maryland, College Park ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE
More informationOn the Scalability of Hierarchical Ad Hoc Wireless Networks
On the Scalability of Hierarchical Ad Hoc Wireless Networks Suli Zhao and Dipankar Raychaudhuri Fall 2006 IAB 11/15/2006 Outline Motivation Ad hoc wireless network architecture Three-tier hierarchical
More informationQoS-Aware Hierarchical Multicast Routing on Next Generation Internetworks
QoS-Aware Hierarchical Multicast Routing on Next Generation Internetworks Satyabrata Pradhan, Yi Li, and Muthucumaru Maheswaran Advanced Networking Research Laboratory Department of Computer Science University
More informationLecture 6: Overlay Networks. CS 598: Advanced Internetworking Matthew Caesar February 15, 2011
Lecture 6: Overlay Networks CS 598: Advanced Internetworking Matthew Caesar February 15, 2011 1 Overlay networks: Motivations Protocol changes in the network happen very slowly Why? Internet is shared
More informationEnabling Contiguous Node Scheduling on the Cray XT3
Enabling Contiguous Node Scheduling on the Cray XT3 Chad Vizino Pittsburgh Supercomputing Center Cray User Group Helsinki, Finland May 2008 With acknowledgements to Deborah Weisser, Jared Yanovich, Derek
More informationNotes. Video Game AI: Lecture 5 Planning for Pathfinding. Lecture Overview. Knowledge vs Search. Jonathan Schaeffer this Friday
Notes Video Game AI: Lecture 5 Planning for Pathfinding Nathan Sturtevant COMP 3705 Jonathan Schaeffer this Friday Planning vs localization We cover planning today Localization is just mapping a real-valued
More informationNon-Uniform Memory Access (NUMA) Architecture and Multicomputers
Non-Uniform Memory Access (NUMA) Architecture and Multicomputers Parallel and Distributed Computing Department of Computer Science and Engineering (DEI) Instituto Superior Técnico February 29, 2016 CPD
More informationScaling to Petaflop. Ola Torudbakken Distinguished Engineer. Sun Microsystems, Inc
Scaling to Petaflop Ola Torudbakken Distinguished Engineer Sun Microsystems, Inc HPC Market growth is strong CAGR increased from 9.2% (2006) to 15.5% (2007) Market in 2007 doubled from 2003 (Source: IDC
More informationIrregular Graph Algorithms on Parallel Processing Systems
Irregular Graph Algorithms on Parallel Processing Systems George M. Slota 1,2 Kamesh Madduri 1 (advisor) Sivasankaran Rajamanickam 2 (Sandia mentor) 1 Penn State University, 2 Sandia National Laboratories
More informationBalanced Box-Decomposition trees for Approximate nearest-neighbor. Manos Thanos (MPLA) Ioannis Emiris (Dept Informatics) Computational Geometry
Balanced Box-Decomposition trees for Approximate nearest-neighbor 11 Manos Thanos (MPLA) Ioannis Emiris (Dept Informatics) Computational Geometry Nearest Neighbor A set S of n points is given in some metric
More informationUlrich Fiedler, Eduard Glatz, 9. March Executive Summary
Topology Construction in the TOWN Project Technical Report 7 Ulrich Fiedler, ulrich.fiedler@bfh.ch Eduard Glatz, eglatz@hsr.ch 9. March 7 Executive Summary In this report, we propose and evaluate an algorithm
More information03 - Reconstruction. Acknowledgements: Olga Sorkine-Hornung. CSCI-GA Geometric Modeling - Spring 17 - Daniele Panozzo
3 - Reconstruction Acknowledgements: Olga Sorkine-Hornung Geometry Acquisition Pipeline Scanning: results in range images Registration: bring all range images to one coordinate system Stitching/ reconstruction:
More informationSensor Network Architectures. Objectives
Sensor Network Architectures muse Objectives Be familiar with how application needs impact deployment strategies t Understand key benefits/costs associated with different topologies. Understand key benefits/costs
More informationLow-coordination Topologies For Redundancy In Sensor Networks
Low-coordination Topologies For Redundancy In Sensor Networks Rajagopal Iyengar ECSE Department Rensselaer Polytechnic Institute Troy, NY 118 iyengr@rpi.edu Koushik Kar ECSE Department Rensselaer Polytechnic
More informationSHORTEST PATHS. Weighted Digraphs. Shortest paths. Shortest Paths BOS ORD JFK SFO LAX DFW MIA
SHORTEST PATHS Weighted Digraphs Shortest paths 6 867 1 Weighted Graphs weights on the edges of a graph represent distances, costs, etc. An example of an undirected weighted graph: 6 867 2 Shortest Path
More informationINDIAN STATISTICAL INSTITUTE
INDIAN STATISTICAL INSTITUTE Mid Semestral Examination M. Tech (CS) - I Year, 2016-2017 (Semester - II) Design and Analysis of Algorithms Date : 21.02.2017 Maximum Marks : 60 Duration : 3.0 Hours Note:
More information