An Efficient Parallel Load-balancing Framework for Orthogonal Decomposition of Geometrical Data

1 ISC 2016 OVERVIEW SORT-BALANCE-SPLIT RESULTS
An Efficient Parallel Load-balancing Framework for Orthogonal Decomposition of Geometrical Data
Bruno R. C. Magalhães, Farhan Tauheed, Thomas Heinis, Anastasia Ailamaki, Felix Schürmann
Blue Brain Project, École Polytechnique Fédérale de Lausanne (EPFL), CH
Data-Intensive Applications and Systems laboratory, EPFL, CH
ISC 2016, Frankfurt, June 2016

2 AGENDA
1. We present our problem;
2. We detail an algorithm for the efficient orthogonal decomposition of spatial data;
3. We compare it to commonly used methods, showing better accuracy and a lower time to solution.

3 INTRO
Spatial data decomposition is an important problem for High Performance Computing, applied in several fields:
- astrophysics, e.g. N-body simulations, Gordon Bell Prize (GBP) winners of 009, 00, 0;
- cardiac model simulations, GBP 05 finalist;
- fluid dynamics, e.g. cloud cavitation, GBP 0 winner;
- materials engineering, e.g. materials crystallization, GBP 0 winner;
- weather forecasting;
- direct volume rendering.

4 DATA OVERVIEW
(For simplicity, only 5% of neurons are presented)
- Neurons are spatially discretized as a tree of compartments (cylinders);
- Extremely dense data structures;
- Approximately 0K cylinders per neuron;
- Mouse brain: approx. 80M neurons; human brain: 00B neurons.

5 STATE OF THE ART - APPROXIMATION METHODS
Methods: Single-Axis (SA), Non-Uniform Grid (NUG), Sort Tile Recursive (STR), Orthogonal Recursive Bisection (ORB).
Approximation approaches: histogram-based, sampling-based, threshold-based.
Implementations: SA / NUG, serial; STR, serial; ORB, parallel (e.g. Zoltan).
[Figure labels: # X, # Y, # Z]

6 PROBLEM STATEMENT
[Figure: min, max and mean compartment count per compute node, post-ghosting, for alternative datasets of 0K neurons (00M compartments).]
Dense data structures are highly penalized by approximation methods:
- High time to solution;
- Barrier on maximum input circuit size.

7 SORT BALANCE SPLIT
- A framework underlying a balanced orthogonal decomposition of spatial data.
- Three steps per dimension:
  1. Distributed sorting;
  2. Distributed load balancing;
  3. Network split.

8 SORT (MPI_Gatherv, MPI_Bcast, MPI_Alltoallv)
Step 1: Local data sorting.
Step 2: Collection of local samples, gathered by the root node.
Step 3: Broadcast of the sample of samples, representing the new data distribution intervals.
Step 4: Data is redistributed based on the distribution intervals. The final data is already sorted.
[Figure: element counts per rank before and after the sort.]
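The four sort steps can be illustrated with a small single-process simulation. This is a minimal sketch, not the authors' implementation: ranks are modelled as plain Python lists, no MPI calls are made, and the function name `sample_sort`, the parameter `samples_per_rank` and the way splitters are picked are assumptions made for illustration.

```python
import bisect
import heapq


def sample_sort(local_data, samples_per_rank=2):
    """Simulate a sample sort; local_data[r] is the list held by rank r.

    Hypothetical helper: it mimics the data flow of the slide's steps
    without using MPI.
    """
    n_ranks = len(local_data)

    # Step 1: every rank sorts its own data (the only sort operation).
    sorted_local = [sorted(chunk) for chunk in local_data]

    # Step 2: every rank picks evenly spaced samples; the root gathers them
    # (the role MPI_Gatherv plays on the slide).
    gathered = []
    for chunk in sorted_local:
        if chunk:
            step = max(1, len(chunk) // samples_per_rank)
            gathered.extend(chunk[::step])
    gathered.sort()

    # Step 3: the root keeps n_ranks - 1 splitters (the "sample of samples")
    # and broadcasts them (MPI_Bcast). Splitter choice is an assumption.
    splitters = []
    if gathered:
        splitters = [gathered[(i * len(gathered)) // n_ranks]
                     for i in range(1, n_ranks)]

    # Step 4: every rank sends each value to the rank that owns its interval
    # (MPI_Alltoallv). Each outgoing run is already sorted.
    received = [[] for _ in range(n_ranks)]
    for chunk in sorted_local:
        outgoing = [[] for _ in range(n_ranks)]
        for value in chunk:
            outgoing[bisect.bisect_right(splitters, value)].append(value)
        for dest, run in enumerate(outgoing):
            received[dest].append(run)

    # Receivers merge the pre-sorted runs, so no second sort is needed.
    return [list(heapq.merge(*runs)) for runs in received]


if __name__ == "__main__":
    print(sample_sort([[9, 1, 5, 7], [3, 8, 2], [6, 4, 0]]))
```

After this step the data is globally ordered across ranks, but the per-rank counts depend on how well the samples represent the data, which is why a separate balance step follows.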

9 BALANCE (MPI_Alltoall, MPI_Alltoallv)
Step 1: Broadcast of element counts.
Step 2: Redistribution of data, based on the mean count per node.
[Figure: element counts per rank before and after balancing.]
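The balance step can be sketched the same way. The following is an illustrative simulation under stated assumptions, not the framework's code: ranks are Python lists, the name `balance` is hypothetical, and any remainder of the division is given to the lowest ranks (the slide only says that redistribution is based on the mean count per node).

```python
import bisect
from itertools import accumulate


def balance(sorted_chunks):
    """Even out element counts across ranks while keeping global order.

    sorted_chunks[r] is the sorted list held by rank r; the chunks are
    assumed to be globally ordered, as they are after the sort step.
    """
    n_ranks = len(sorted_chunks)

    # Step 1: every rank learns every other rank's count
    # (the exchange done with MPI_Alltoall on the slide).
    counts = [len(chunk) for chunk in sorted_chunks]

    # Target count per rank: the mean, with the remainder spread over the
    # first ranks (an assumption; the slide does not say how ties are handled).
    base, extra = divmod(sum(counts), n_ranks)
    targets = [base + (1 if r < extra else 0) for r in range(n_ranks)]
    ends = list(accumulate(targets))  # exclusive global end index per rank

    # Step 2: each element moves to the rank that owns its global position
    # (the data movement done with MPI_Alltoallv).
    balanced = [[] for _ in range(n_ranks)]
    offset = 0
    for chunk in sorted_chunks:
        for i, value in enumerate(chunk):
            dest = bisect.bisect_right(ends, offset + i)
            balanced[dest].append(value)
        offset += len(chunk)
    return balanced


if __name__ == "__main__":
    print(balance([[0, 1, 2, 3, 4, 5, 6, 7], [8], [9, 10]]))
```

Because each rank only needs the counts to compute where its elements go, the step needs no further sorting and preserves the order produced by the sort.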

10 SPLIT (MPI_Comm_split)
Step 1: Each node calculates its new rank and sub-group id.
Step 2: Based on the previous step, the network is split into K independent networks.
[Figure: example split of the ranks of comm 0 into K sub-communicators.]
- The comm split is the base of the new recursive step;
- New sub-comms process the new data set on the next dimension.
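As a rough illustration of the split step, the sketch below computes a sub-group id and a new rank for every rank, which are the two values `MPI_Comm_split` takes as `color` and `key`. No MPI is used here, the function name `comm_split` is hypothetical, and the contiguous-block rule `rank * k // n_ranks` is an assumption rather than the formula used by the authors.

```python
def comm_split(n_ranks, k):
    """Assign each rank of one communicator to one of k sub-groups.

    Returns (assignment, groups): assignment[rank] is the pair
    (sub_group_id, new_rank), and groups maps each sub_group_id to the
    list of original ranks it contains.
    """
    groups = {}
    assignment = []
    for rank in range(n_ranks):
        # Step 1: sub-group id from contiguous blocks of ranks (assumed rule),
        # and new rank as the position inside that sub-group.
        sub_group = rank * k // n_ranks
        members = groups.setdefault(sub_group, [])
        new_rank = len(members)
        # Step 2: ranks sharing a sub-group id end up in the same
        # sub-communicator, ordered by their new rank.
        members.append(rank)
        assignment.append((sub_group, new_rank))
    return assignment, groups


if __name__ == "__main__":
    assignment, groups = comm_split(7, 4)
    for rank, (sub_group, new_rank) in enumerate(assignment):
        print(f"rank {rank} -> comm {sub_group}, new rank {new_rank}")
```

Keeping blocks contiguous matters here: after the sort and balance steps, neighbouring ranks hold neighbouring slabs of the sorted axis, so each sub-communicator owns one slab and can decompose it along the next dimension.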

11 EXAMPLE: X DECOMPOSITION OF 6 ELEMENTS
[Figure panels, with a key of four ranks and x / y axes: initial data; sorting and load balancing on X axis; network split on X axis; sorting and load balancing on Y axis; network split on Y axis.]
In brief:
- 6 collective communication calls per dimension, independently of network size;
- One local sort operation, independently of data size or initial placement;
- Accurate final spatial decomposition;
- Recursive algorithm: we split the main problem into smaller sub-problems, to be executed in parallel.
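The recursion over dimensions can be summarized by its net effect on the data. This is a serial sketch under assumptions, not the distributed algorithm itself: all points live in one list, the name `sbs_decompose` and the parameter `parts_per_dim` are hypothetical, and the 16-point grid in the usage example is made-up input.

```python
def sbs_decompose(points, parts_per_dim, dim=0):
    """Return the regions of an orthogonal, count-balanced decomposition.

    points is a list of coordinate tuples; parts_per_dim[d] is the number
    of slabs to cut along dimension d. Serial stand-in for the result of
    the distributed sort, balance and split steps.
    """
    if dim == len(parts_per_dim):
        return [points]

    k = parts_per_dim[dim]
    # Sort: order the points along the current axis.
    ordered = sorted(points, key=lambda p: p[dim])
    # Balance: equal counts per slab, remainder given to the first slabs.
    base, extra = divmod(len(ordered), k)

    regions = []
    start = 0
    for slab in range(k):
        size = base + (1 if slab < extra else 0)
        # Split: each slab becomes an independent sub-problem that is
        # decomposed along the next dimension.
        regions.extend(
            sbs_decompose(ordered[start:start + size], parts_per_dim, dim + 1)
        )
        start += size
    return regions


if __name__ == "__main__":
    grid = [(x, y) for x in range(4) for y in range(4)]
    for region in sbs_decompose(grid, (2, 2)):
        print(region)
```

In the distributed version the sub-problems produced by the split run in parallel on separate sub-communicators, whereas this sketch visits them one after another.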

12 TIME TO SOLUTION - BLUEGENE/Q

13 WEAK AND STRONG SCALING

14 CLOSING REMARKS
- We presented Sort-Balance-Split (SBS), a framework for the accurate orthogonal decomposition of spatial data.
- More accurate and lower time to solution on very dense datasets;
  - Compared with standard configurations of existing methods;
- Methods tested on a BlueGene/Q supercomputer.
- We plan to open-source the SBS in the near future.
Thank you for your attention.
Acknowledgments: Research supported by funding from the ETH Domain for the Blue Brain Project (BBP); BlueBrain IV BGQ system financed by ETH Board Funding to BBP and hosted at the Swiss National Supercomputing Center (CSCS). We thank James King, Stuart Yates and Fabien Delalondre for technical discussions.


More information

HPC and Big Data: Updates about China. Haohuan FU August 29 th, 2017

HPC and Big Data: Updates about China. Haohuan FU August 29 th, 2017 HPC and Big Data: Updates about China Haohuan FU August 29 th, 2017 1 Outline HPC and Big Data Projects in China Recent Efforts on Tianhe-2 Recent Efforts on Sunway TaihuLight 2 MOST HPC Projects 2016

More information

The Potential of Diffusive Load Balancing at Large Scale

The Potential of Diffusive Load Balancing at Large Scale Center for Information Services and High Performance Computing The Potential of Diffusive Load Balancing at Large Scale EuroMPI 2016, Edinburgh, 27 September 2016 Matthias Lieber, Kerstin Gößner, Wolfgang

More information

Quantifying the Dynamic Ocean Surface Using Underwater Radiometric Measurement

Quantifying the Dynamic Ocean Surface Using Underwater Radiometric Measurement DISTRIBUTION STATEMENT A. Approved for public release; distribution is unlimited. Quantifying the Dynamic Ocean Surface Using Underwater Radiometric Measurement Lian Shen Department of Mechanical Engineering

More information

Lesson 05. Mid Phase. Collision Detection

Lesson 05. Mid Phase. Collision Detection Lesson 05 Mid Phase Collision Detection Lecture 05 Outline Problem definition and motivations Generic Bounding Volume Hierarchy (BVH) BVH construction, fitting, overlapping Metrics and Tandem traversal

More information

Access Methods. Basic Concepts. Index Evaluation Metrics. search key pointer. record. value. Value

Access Methods. Basic Concepts. Index Evaluation Metrics. search key pointer. record. value. Value Access Methods This is a modified version of Prof. Hector Garcia Molina s slides. All copy rights belong to the original author. Basic Concepts search key pointer Value record? value Search Key - set of

More information

Data Representation in Visualisation

Data Representation in Visualisation Data Representation in Visualisation Visualisation Lecture 4 Taku Komura Institute for Perception, Action & Behaviour School of Informatics Taku Komura Data Representation 1 Data Representation We have

More information

CSE4334/5334 DATA MINING

CSE4334/5334 DATA MINING CSE4334/5334 DATA MINING Lecture 4: Classification (1) CSE4334/5334 Data Mining, Fall 2014 Department of Computer Science and Engineering, University of Texas at Arlington Chengkai Li (Slides courtesy

More information

Clustering Part 4 DBSCAN

Clustering Part 4 DBSCAN Clustering Part 4 Dr. Sanjay Ranka Professor Computer and Information Science and Engineering University of Florida, Gainesville DBSCAN DBSCAN is a density based clustering algorithm Density = number of

More information

Generic Topology Mapping Strategies for Large-scale Parallel Architectures

Generic Topology Mapping Strategies for Large-scale Parallel Architectures Generic Topology Mapping Strategies for Large-scale Parallel Architectures Torsten Hoefler and Marc Snir Scientific talk at ICS 11, Tucson, AZ, USA, June 1 st 2011, Hierarchical Sparse Networks are Ubiquitous

More information

CUDA Kernel based Collective Reduction Operations on Large-scale GPU Clusters

CUDA Kernel based Collective Reduction Operations on Large-scale GPU Clusters CUDA Kernel based Collective Reduction Operations on Large-scale GPU Clusters Ching-Hsiang Chu, Khaled Hamidouche, Akshay Venkatesh, Ammar Ahmad Awan and Dhabaleswar K. (DK) Panda Speaker: Sourav Chakraborty

More information

A4. Intro to Parallel Computing

A4. Intro to Parallel Computing Self-Consistent Simulations of Beam and Plasma Systems Steven M. Lund, Jean-Luc Vay, Rémi Lehe and Daniel Winklehner Colorado State U., Ft. Collins, CO, 13-17 June, 2016 A4. Intro to Parallel Computing

More information

Toward portable I/O performance by leveraging system abstractions of deep memory and interconnect hierarchies

Toward portable I/O performance by leveraging system abstractions of deep memory and interconnect hierarchies Toward portable I/O performance by leveraging system abstractions of deep memory and interconnect hierarchies François Tessier, Venkatram Vishwanath, Paul Gressier Argonne National Laboratory, USA Wednesday

More information

ANALYSIS COMPUTER SCIENCE Discovery Science, Volume 9, Number 20, April 3, Comparative Study of Classification Algorithms Using Data Mining

ANALYSIS COMPUTER SCIENCE Discovery Science, Volume 9, Number 20, April 3, Comparative Study of Classification Algorithms Using Data Mining ANALYSIS COMPUTER SCIENCE Discovery Science, Volume 9, Number 20, April 3, 2014 ISSN 2278 5485 EISSN 2278 5477 discovery Science Comparative Study of Classification Algorithms Using Data Mining Akhila

More information

Annales UMCS Informatica AI 1 (2003) UMCS. Registration of CT and MRI brain images. Karol Kuczyński, Paweł Mikołajczak

Annales UMCS Informatica AI 1 (2003) UMCS. Registration of CT and MRI brain images. Karol Kuczyński, Paweł Mikołajczak Annales Informatica AI 1 (2003) 149-156 Registration of CT and MRI brain images Karol Kuczyński, Paweł Mikołajczak Annales Informatica Lublin-Polonia Sectio AI http://www.annales.umcs.lublin.pl/ Laboratory

More information

Applying Supervised Learning

Applying Supervised Learning Applying Supervised Learning When to Consider Supervised Learning A supervised learning algorithm takes a known set of input data (the training set) and known responses to the data (output), and trains

More information

Graph Partitioning for High-Performance Scientific Simulations. Advanced Topics Spring 2008 Prof. Robert van Engelen

Graph Partitioning for High-Performance Scientific Simulations. Advanced Topics Spring 2008 Prof. Robert van Engelen Graph Partitioning for High-Performance Scientific Simulations Advanced Topics Spring 2008 Prof. Robert van Engelen Overview Challenges for irregular meshes Modeling mesh-based computations as graphs Static

More information

Computational Geometry. Lecture 17

Computational Geometry. Lecture 17 Computational Geometry Lecture 17 Computational geometry Algorithms for solving geometric problems in 2D and higher. Fundamental objects: Basic structures: point line segment line point set polygon L17.2

More information