Data Reduction and Partitioning in an Extreme Scale GPU-Based Clustering Algorithm
1 Data Reduction and Partitioning in an Extreme Scale GPU-Based Clustering Algorithm
Benjamin Welton and Barton Miller, Paradyn Project, University of Wisconsin-Madison
DRBSD-2 Workshop, November 17th, 2017, Denver, CO
2 Our History with Scalable Reductions
Multicast Reduction Network (MRNet) [Roth 04]
o MRNet was built as a scalable reduction framework for debuggers, performance tools, and applications.
o Scalability is achieved through a software-defined tree-based overlay network (TBON).
3 Our History with Scalable Reductions: Debuggers
STAT, the Stack Trace Analysis Tool [Arnold 2007]
o Live debugging of millions of processes (largest run to date: 6+ million processes at LLNL).
o MRNet provides the scalable reduction of stack traces collected from processes.
4 Our History with Scalable Reductions: Debuggers
o Since STAT's release, MRNet has been used as the reduction framework by a number of commercial debuggers (Cray CCDB, TotalView, Cray ATP).
5 Our History with Scalable Reductions: Performance Tools
TAU [Nataraj 2008]
o Performance monitoring framework.
o MRNet is used to scalably collect performance data from thousands of nodes.
6 Our History with Scalable Reductions: Performance Tools
o Commercial and open source tools (TAU, CBTF, Open Speed Shop, CEPBA Toolkit) have continued to use MRNet for scalable communication.
7 Our History with Scalable Reductions: Applications
Mean Shift [Arnold 2006]
o Scalable method of identifying local maxima in a data set (mean shift algorithm).
8 Our History with Scalable Reductions: Applications
Mr. Scan [Welton 2014]
o Highly scalable density-based clustering algorithm.
o First known implementation to scale to thousands of nodes and to process billions of points.
9 MRNet: Multicast / Reduction Network
General-purpose Tree-Based Overlay Network (TBON)
o Network: user-defined topology
o Stream: logical data channel to a set of back-ends; supports multicast, gather, and custom reduction
o Packet: collection of data
o Filter: stream data operator
o User-definable aggregation and reduction operations.
o Fully asynchronous across multiple streams.
[Diagram: a tree with the front-end (FE) at the root, CP nodes in the interior applying a filter F(x1, ..., xn), and back-ends (BE) running the application at the leaves.]
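The filter concept above can be illustrated with a small tree reduction in plain Python. This is only a sketch of the idea and does not use the MRNet API; the names `Node` and `reduce_tree`, the integer payloads, and the two-level topology are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """One process in a hypothetical tree-based overlay network."""
    value: int = 0                                  # payload produced by a back-end (leaf)
    children: List["Node"] = field(default_factory=list)


def reduce_tree(node: Node, filt: Callable[[List[int]], int]) -> int:
    """Apply `filt` at every internal node to the packets sent by its children."""
    if not node.children:                           # back-end: emit its own packet
        return node.value
    packets = [reduce_tree(child, filt) for child in node.children]
    return filt(packets)                            # internal node: aggregate, forward upstream


# Four back-ends under two internal nodes under one front-end.
backends = [Node(value=v) for v in (3, 9, 4, 7)]
front_end = Node(children=[Node(children=backends[:2]),
                           Node(children=backends[2:])])

total = reduce_tree(front_end, sum)     # sum filter
largest = reduce_tree(front_end, max)   # max filter
```

Each internal node forwards a single value regardless of how many children it has, so the data volume does not grow as it moves toward the front-end.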
10 Introducing Mr. Scan
o Mr. Scan is a scalable density-based clustering algorithm.
o Designed to cluster billions of data points.
11 Clustering Example (DBSCAN [1])
Goal: Find regions that meet minimum density and spatial distance characteristics.
[1] M. Ester et al., "A density-based algorithm for discovering clusters in large spatial databases with noise," 1996.
14 Clustering Example (DBSCAN [1])
The two parameters that determine whether a point is in a cluster are Epsilon (Eps) and MinPts.
17 Clustering Example (DBSCAN [1])
If the number of points within Eps of a point is > MinPts, the point is a core point. (The example uses MinPts: 3.)
19 Clustering Example (DBSCAN [1])
For every discovered point, this same calculation is performed until the cluster is fully expanded.
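The expansion procedure described on these slides can be sketched as a small single-node DBSCAN in Python. This is a brute-force illustration, not the GPU implementation used by Mr. Scan. It assumes 2D points and Euclidean distance, and it uses the convention from Ester et al. in which a point is a core point when its Eps-neighborhood (including itself) holds at least MinPts points; the comparison can be changed if the strict "> MinPts" reading is intended. The names `region_query` and `dbscan` are hypothetical.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]
NOISE = -1


def region_query(points: List[Point], i: int, eps: float) -> List[int]:
    """Indices of all points within eps of points[i] (includes i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points: List[Point], eps: float, min_pts: int) -> List[int]:
    labels: List[Optional[int]] = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:          # not a core point (assumed >= convention)
            labels[i] = NOISE
            continue
        cluster += 1                          # start a new cluster at this core point
        labels[i] = cluster
        frontier = list(neighbors)
        while frontier:                       # repeat the same test for every discovered point
            j = frontier.pop()
            if labels[j] == NOISE:
                labels[j] = cluster           # previously noise: becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:   # j is also a core point: keep expanding
                frontier.extend(j_neighbors)
    return [NOISE if label is None else label for label in labels]


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

In this example the four points of the unit square form one cluster and the distant point is labeled as noise. The `region_query` call is the neighbor lookup whose cost grows with point density.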
24 Intro to Mr. Scan
Mr. Scan Phases
o Partition: Distributed
o DBSCAN: GPU (on BE)
o Merge: CPU (x #levels)
o Sweep: CPU (x #levels)
[Diagram: a tree with the front-end (FE) at the root, CP nodes in the interior, and back-ends (BE) at the leaves. The builds highlight, in order, DBSCAN at the BEs, Merge at the CPs and then at the FE, Sweep at the CPs and then at the BEs, and finally an FS element.]
32 Properties of a Scalable TBON App
Applications must have the following properties to be scalable in a TBON:
o Same amount of work across all nodes in the tree.
o Size of reduction output must be less than or equal to the input size.
Mr. Scan must have these properties to scale.
36 Challenges To Scaling DBSCAN
Workload balance:
o DBSCAN requires that points near one another be processed on the same node.
Size of data needed for merging:
o Simple merging methods require sending all points in all clusters to merge accurately.
o Sending billions of points in the merge phase is infeasible.
37 Our Solutions
Solve the balance and data size issues jointly with the following techniques:
o Smart Partitioning
  o Selectively perform redundant computation to reduce merge data size (using data duplicated at the edge of each partition).
  o Use an estimate of partition density to identify hard-to-compute partitions.
o Dense Box Algorithm
  o Reduces the computational complexity of extremely dense partitions.
o Representative Points
  o Leverage the redundant computation to create a merge method that requires only a small fixed subset of points.
38 Partition Phase
Algorithm:
o Form initial partitions
o Add shadow regions
o Rebalance
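The first two steps can be sketched in Python as follows. The slides do not specify how partitions are formed or how wide a shadow region is, so this sketch assumes a uniform grid of square cells and a shadow width of Eps on every side. The function name `partition_with_shadows` and both parameters are hypothetical, and the rebalance step is omitted because its details are not given here.

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple

Point = Tuple[float, float]
Cell = Tuple[int, int]


def partition_with_shadows(points: List[Point], cell: float, eps: float):
    """Assign each point to its home grid cell, and copy it into the shadow
    region of every other cell whose boundary lies within eps of the point."""
    owned: DefaultDict[Cell, List[Point]] = defaultdict(list)
    shadow: DefaultDict[Cell, List[Point]] = defaultdict(list)
    for p in points:
        home = (int(p[0] // cell), int(p[1] // cell))
        owned[home].append(p)
        # Range of cells whose eps-expanded bounding box contains p.
        x_lo, x_hi = int((p[0] - eps) // cell), int((p[0] + eps) // cell)
        y_lo, y_hi = int((p[1] - eps) // cell), int((p[1] + eps) // cell)
        for cx in range(x_lo, x_hi + 1):
            for cy in range(y_lo, y_hi + 1):
                if (cx, cy) != home:
                    shadow[(cx, cy)].append(p)    # duplicated edge data
    return owned, shadow


points = [(9.5, 5.0), (5.0, 5.0), (10.2, 5.0)]
owned, shadow = partition_with_shadows(points, cell=10.0, eps=1.0)
```

Here the point at x = 9.5 is owned by cell (0, 0) and duplicated into the shadow of cell (1, 0), while the point at x = 10.2 is owned by cell (1, 0) and duplicated into the shadow of cell (0, 0). The duplicated points are what allow neighboring partitions to cluster their border areas redundantly.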
49 The DBSCAN Density Problem
o Imbalances in point density can cause huge differences in runtime between thread groups inside a GPU (10-15x variance in time).
o The issue is caused by the lookup operation for a point's neighbors in the DBSCAN point expansion phase.
o Higher density results in a higher neighbor count, which increases the number of comparison operations.
50 Dense Box
o Dense Box eliminates the need to perform neighbor lookups on points in dense regions by labeling points as members of a cluster before DBSCAN is run.
1. Start with an ε region.
2. Divide the region of data into areas of size ε/(2√2) for dense area detection*.
For each ε/(2√2) area that has a point count >= MinPts: mark its points as members of a cluster. Do not expand these points.
* ε/(2√2) is chosen because it guarantees that all points inside are within ε distance of each other.
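The dense area detection can be sketched in Python as follows. The sketch assumes square cells with side ε/(2√2), which is how the size on the slide is read here, and a simple grid in place of whatever spatial structure the GPU implementation uses. A square with that side has a diagonal of ε/(2√2) · √2 = ε/2, so any two points in the same cell are within ε of each other. The function name `dense_box_members` is hypothetical.

```python
import math
from collections import defaultdict
from typing import DefaultDict, List, Set, Tuple

Point = Tuple[float, float]


def dense_box_members(points: List[Point], eps: float, min_pts: int) -> Set[int]:
    """Return indices of points that can be labeled as cluster members
    before DBSCAN runs, because their grid cell alone holds >= min_pts points."""
    side = eps / (2 * math.sqrt(2))             # assumed cell side length
    cells: DefaultDict[Tuple[int, int], List[int]] = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        cells[(math.floor(x / side), math.floor(y / side))].append(idx)
    dense: Set[int] = set()
    for members in cells.values():
        if len(members) >= min_pts:             # dense cell: no neighbor lookup needed
            dense.update(members)
    return dense


points = [(0.01, 0.01), (0.1, 0.1), (0.2, 0.3), (5.0, 5.0)]
dense = dense_box_members(points, eps=1.0, min_pts=3)
cell_diagonal = (1.0 / (2 * math.sqrt(2))) * math.sqrt(2)
```

The first three points share one cell and are pre-labeled; the isolated point is left for the regular DBSCAN expansion. Deciding which cluster the pre-labeled points join is not covered by this sketch.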
52 Merge Algorithm
o Two phases in the merge operation:
1. Select representative points (Leaf Node)
2. Merge operation (Internal Node)
53 Representative Points (Leaf Nodes)
o Large clusters are too expensive to move up the tree.
o In border regions, we select eight representative points to represent the cluster.
o These points guarantee that any overlap of clusters detected on adjacent nodes
Mr. Scan: Efficient Clustering with MRNet and GPUs
57 Merge Operation (Internal Nodes)
o Merge overlapping clusters found on different DBSCAN leaf nodes.
o Merge needs to have low overhead and must operate without the entire dataset (Representative Points + Non-Core points).
[Diagram: clusters on Node 1 and Node 2 with their shadow regions, MinPts = 3.]
58 Merge Operation (Internal Nodes)
Compare representative points sent by leaf nodes:
o If an overlap in representative points exists between two nodes, those clusters merge.
o Otherwise, the clusters are complete and no further propagation of representative points occurs.
Example: the clusters on Node 1 and Node 2 have overlapping representative points and are merged.
59 Merge Operation (Internal Nodes)
Compare representative points sent by leaf nodes:
o If no overlap exists between clusters, finalize the clusters and do not propagate representative points.
Example: there is no overlap in regions between Node 1 and Node 2. The clusters on both nodes are now final and no further propagation is needed.
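The comparison at an internal node can be sketched with a union-find structure in Python. The slides do not define precisely what counts as an overlap, so this sketch assumes that two clusters overlap when they share an identical representative point, which can happen because shadow-region points are duplicated on adjacent nodes. The function name `merge_clusters` and the `(node, cluster id)` keys are hypothetical.

```python
from typing import Dict, Hashable, Set, Tuple

Point = Tuple[float, float]


def merge_clusters(reps: Dict[Hashable, Set[Point]]) -> Dict[Hashable, Hashable]:
    """Map every cluster key to the key of the merged cluster it belongs to."""
    parent = {cid: cid for cid in reps}

    def find(c: Hashable) -> Hashable:
        while parent[c] != c:
            parent[c] = parent[parent[c]]     # path halving
            c = parent[c]
        return c

    owner: Dict[Point, Hashable] = {}         # first cluster seen for each point
    for cid, pts in reps.items():
        for p in pts:
            if p in owner:                    # shared representative point: merge
                parent[find(cid)] = find(owner[p])
            else:
                owner[p] = cid
    return {cid: find(cid) for cid in reps}


reps = {
    ("node1", 0): {(9.5, 5.0), (9.8, 5.2)},
    ("node2", 0): {(9.8, 5.2), (10.4, 5.1)},
    ("node2", 1): {(30.0, 30.0)},
}
merged = merge_clusters(reps)
```

The two clusters that share the point (9.8, 5.2) end up with the same root, and the third cluster stays on its own and would be finalized. Only the small sets of representative points are inspected, so the cost does not depend on the full cluster sizes.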
60 Mr. Scan Results
[Figure: elapsed time (seconds) by phase (Startup, Partition, DBSCAN, Merge, Sweep) versus point count (102M, 409M, 1.6B, 3.2B, 6.5B; 800K points per node).]
61 Takeaway Lessons for Scalable Data Reduction
o By selectively duplicating processing, we could use a less data-intensive merging algorithm.
o We were able to equalize the workload between nodes by using the Dense Box algorithm.
o Using these approaches, we were able to scale Mr. Scan up to 8192 nodes.
o MRNet is a general scalability framework for data reductions.
62 Paper available at:
Questions?
More informationEvolving To The Big Data Warehouse
Evolving To The Big Data Warehouse Kevin Lancaster 1 Copyright Director, 2012, Oracle and/or its Engineered affiliates. All rights Insert Systems, Information Protection Policy Oracle Classification from
More informationClustering Billions of Images with Large Scale Nearest Neighbor Search
Clustering Billions of Images with Large Scale Nearest Neighbor Search Ting Liu, Charles Rosenberg, Henry A. Rowley IEEE Workshop on Applications of Computer Vision February 2007 Presented by Dafna Bitton
More informationDistribution of Periscope Analysis Agents on ALTIX 4700
John von Neumann Institute for Computing Distribution of Periscope Analysis Agents on ALTIX 4700 Michael Gerndt, Sebastian Strohhäcker published in Parallel Computing: Architectures, Algorithms and Applications,
More informationCisco Express Forwarding Overview
Cisco Express Forwarding () is advanced, Layer 3 IP switching technology. optimizes network performance and scalability for networks with large and dynamic traffic patterns, such as the Internet, on networks
More informationDyninst and MRNet: Foundational Infrastructure for Parallel Tools
Dyninst and MRNet: Foundational Infrastructure for Parallel Tools William R. Williams, Xiaozhu Meng, Benjamin Welton, and Barton P. Miller Abstract Parallel tools require common pieces of infrastructure:
More informationResource allocation and utilization in the Blue Gene/L supercomputer
Resource allocation and utilization in the Blue Gene/L supercomputer Tamar Domany, Y Aridor, O Goldshmidt, Y Kliteynik, EShmueli, U Silbershtein IBM Labs in Haifa Agenda Blue Gene/L Background Blue Gene/L
More informationChallenges in large-scale graph processing on HPC platforms and the Graph500 benchmark. by Nkemdirim Dockery
Challenges in large-scale graph processing on HPC platforms and the Graph500 benchmark by Nkemdirim Dockery High Performance Computing Workloads Core-memory sized Floating point intensive Well-structured
More informationSlack: A New Performance Metric for Parallel Programs
Slack: A New Performance Metric for Parallel Programs Jeffrey K. Hollingsworth Barton P. Miller hollings@cs.umd.edu bart@cs.wisc.edu Computer Science Department University of Maryland College Park, MD
More informationA Survey on DBSCAN Algorithm To Detect Cluster With Varied Density.
A Survey on DBSCAN Algorithm To Detect Cluster With Varied Density. Amey K. Redkar, Prof. S.R. Todmal Abstract Density -based clustering methods are one of the important category of clustering methods
More informationAutomatic Scaling Iterative Computations. Aug. 7 th, 2012
Automatic Scaling Iterative Computations Guozhang Wang Cornell University Aug. 7 th, 2012 1 What are Non-Iterative Computations? Non-iterative computation flow Directed Acyclic Examples Batch style analytics
More informationISSN: (Online) Volume 2, Issue 2, February 2014 International Journal of Advance Research in Computer Science and Management Studies
ISSN: 2321-7782 (Online) Volume 2, Issue 2, February 2014 International Journal of Advance Research in Computer Science and Management Studies Research Article / Paper / Case Study Available online at:
More informationHW4 VINH NGUYEN. Q1 (6 points). Chapter 8 Exercise 20
HW4 VINH NGUYEN Q1 (6 points). Chapter 8 Exercise 20 a. For each figure, could you use single link to find the patterns represented by the nose, eyes and mouth? Explain? First, a single link is a MIN version
More information*University of Illinois at Urbana Champaign/NCSA Bell Labs
Analysis of Gemini Interconnect Recovery Mechanisms: Methods and Observations Saurabh Jha*, Valerio Formicola*, Catello Di Martino, William Kramer*, Zbigniew Kalbarczyk*, Ravishankar K. Iyer* *University
More informationChapter ML:XI (continued)
Chapter ML:XI (continued) XI. Cluster Analysis Data Mining Overview Cluster Analysis Basics Hierarchical Cluster Analysis Iterative Cluster Analysis Density-Based Cluster Analysis Cluster Evaluation Constrained
More informationDebugging CUDA Applications with Allinea DDT. Ian Lumb Sr. Systems Engineer, Allinea Software Inc.
Debugging CUDA Applications with Allinea DDT Ian Lumb Sr. Systems Engineer, Allinea Software Inc. ilumb@allinea.com GTC 2013, San Jose, March 20, 2013 Embracing GPUs GPUs a rival to traditional processors
More informationOverview. Overview. OTV Fundamentals. OTV Terms. This chapter provides an overview for Overlay Transport Virtualization (OTV) on Cisco NX-OS devices.
This chapter provides an overview for Overlay Transport Virtualization (OTV) on Cisco NX-OS devices., page 1 Sample Topologies, page 6 OTV is a MAC-in-IP method that extends Layer 2 connectivity across
More informationVenugopal Ramasubramanian Emin Gün Sirer SIGCOMM 04
The Design and Implementation of a Next Generation Name Service for the Internet Venugopal Ramasubramanian Emin Gün Sirer SIGCOMM 04 Presenter: Saurabh Kadekodi Agenda DNS overview Current DNS Problems
More informationAn actor-critic reinforcement learning controller for a 2-DOF ball-balancer
An actor-critic reinforcement learning controller for a 2-DOF ball-balancer Andreas Stückl Michael Meyer Sebastian Schelle Projektpraktikum: Computational Neuro Engineering 2 Empty side for praktikums
More informationTechniques to improve the scalability of Checkpoint-Restart
Techniques to improve the scalability of Checkpoint-Restart Bogdan Nicolae Exascale Systems Group IBM Research Ireland 1 Outline A few words about the lab and team Challenges of Exascale A case for Checkpoint-Restart
More informationThe Cray Programming Environment. An Introduction
The Cray Programming Environment An Introduction Vision Cray systems are designed to be High Productivity as well as High Performance Computers The Cray Programming Environment (PE) provides a simple consistent
More informationFaster Clustering with DBSCAN
Faster Clustering with DBSCAN Marzena Kryszkiewicz and Lukasz Skonieczny Institute of Computer Science, Warsaw University of Technology, Nowowiejska 15/19, 00-665 Warsaw, Poland Abstract. Grouping data
More informationGoogle File System. Arun Sundaram Operating Systems
Arun Sundaram Operating Systems 1 Assumptions GFS built with commodity hardware GFS stores a modest number of large files A few million files, each typically 100MB or larger (Multi-GB files are common)
More informationChapter 8: GPS Clustering and Analytics
Chapter 8: GPS Clustering and Analytics Location information is crucial for analyzing sensor data and health inferences from mobile and wearable devices. For example, let us say you monitored your stress
More informationClustering in Data Mining
Clustering in Data Mining Classification Vs Clustering When the distribution is based on a single parameter and that parameter is known for each object, it is called classification. E.g. Children, young,
More informationClustering part II 1
Clustering part II 1 Clustering What is Cluster Analysis? Types of Data in Cluster Analysis A Categorization of Major Clustering Methods Partitioning Methods Hierarchical Methods 2 Partitioning Algorithms:
More informationScalable Access to SAS Data Billy Clifford, SAS Institute Inc., Austin, TX
Scalable Access to SAS Data Billy Clifford, SAS Institute Inc., Austin, TX ABSTRACT Symmetric multiprocessor (SMP) computers can increase performance by reducing the time required to analyze large volumes
More informationExtending the Eclipse Parallel Tools Platform debugger with Scalable Parallel Debugging Library
Available online at www.sciencedirect.com Procedia Computer Science 18 (2013 ) 1774 1783 Abstract 2013 International Conference on Computational Science Extending the Eclipse Parallel Tools Platform debugger
More informationArchitecture of a Real-Time Operational DBMS
Architecture of a Real-Time Operational DBMS Srini V. Srinivasan Founder, Chief Development Officer Aerospike CMG India Keynote Thane December 3, 2016 [ CMGI Keynote, Thane, India. 2016 Aerospike Inc.
More informationThe MapReduce Abstraction
The MapReduce Abstraction Parallel Computing at Google Leverages multiple technologies to simplify large-scale parallel computations Proprietary computing clusters Map/Reduce software library Lots of other
More informationHeckaton. SQL Server's Memory Optimized OLTP Engine
Heckaton SQL Server's Memory Optimized OLTP Engine Agenda Introduction to Hekaton Design Consideration High Level Architecture Storage and Indexing Query Processing Transaction Management Transaction Durability
More informationDistance-based Methods: Drawbacks
Distance-based Methods: Drawbacks Hard to find clusters with irregular shapes Hard to specify the number of clusters Heuristic: a cluster must be dense Jian Pei: CMPT 459/741 Clustering (3) 1 How to Find
More informationOracle Database 11g Direct NFS Client Oracle Open World - November 2007
Oracle Database 11g Client Oracle Open World - November 2007 Bill Hodak Sr. Product Manager Oracle Corporation Kevin Closson Performance Architect Oracle Corporation Introduction
More informationHigh-Performance Data Loading and Augmentation for Deep Neural Network Training
High-Performance Data Loading and Augmentation for Deep Neural Network Training Trevor Gale tgale@ece.neu.edu Steven Eliuk steven.eliuk@gmail.com Cameron Upright c.upright@samsung.com Roadmap 1. The General-Purpose
More informationDebugging for the hybrid-multicore age (A HPC Perspective) David Lecomber CTO, Allinea Software
Debugging for the hybrid-multicore age (A HPC Perspective) David Lecomber CTO, Allinea Software david@allinea.com Agenda What is HPC? How is scale affecting HPC? Achieving tool scalability Scale in practice
More informationAddressing Performance and Programmability Challenges in Current and Future Supercomputers
Addressing Performance and Programmability Challenges in Current and Future Supercomputers Luiz DeRose Sr. Principal Engineer Programming Environments Director Cray Inc. VI-HPS - SC'13 Luiz DeRose 2013
More information