Hypergraph Sparsification and Its Application to Partitioning

1 Hypergraph Sparsification and Its Application to Partitioning
Mehmet Deveci (1,3), Kamer Kaya (1), Ümit V. Çatalyürek (1,2)
(1) Dept. of Biomedical Informatics, The Ohio State University
(2) Dept. of Electrical & Computer Engineering, The Ohio State University
(3) Dept. of Computer Science & Engineering, The Ohio State University

2 Motivation
Problem: Sparsification of large-scale data modeled as a hypergraph, for scalable computation and analysis.
Today data is big, and its utilization and analysis require complex algorithms and an immense amount of computing power.
Techniques that make the data smaller are therefore very important.
We should avoid any redundancy in the data, and we can even sacrifice some part of it to reduce the size.
Application (in this work): Hypergraph partitioning.
Used in many problems in parallel scientific computing, such as sparse matrix reordering, static and dynamic load balancing, clustering, and recommendation.
Deveci et al., "Hypergraph Sparsification and its Application to Partitioning"

3 Contribution
Proposed hypergraph sparsification techniques:
  Identical net removal (already exists in some partitioning tools, but our implementation is faster)
  Identical vertex removal
  Similar net removal
To the best of our knowledge, there is no prior work that analyzes the effectiveness of sparsification on hypergraphs.
Implemented in UMPa [Catalyurek12], a multi-objective hypergraph partitioner.

4 Hypergraph Partitioning
Hypergraph $H = (V, N)$, where $V$ is the vertex set and $N$ is the net set.
$c(n)$: cost of a net; $w(v)$: weight of a vertex.
Objective: partition the hypergraph into $K$ parts with
  Balanced load distribution: $W_k < W_{avg}(1 + \varepsilon)$ for $1 \le k \le K$
  Minimized communication between parts: $CV = \sum_{n \in N} c(n)(\lambda_n - 1)$, where $\lambda_n$ is the number of parts connected by net $n$
In the example hypergraph, $n_1, n_2, n_3$ are identical nets, as are $n_5, n_6$; $v_2, v_4$ are identical vertices.

5 Partitioning Example
Two parts, $P_1$ and $P_2$, with $c(n_i) = 1$ and $w(v_i) = 1$.
$\lambda_{n_1} = \lambda_{n_2} = \lambda_{n_3} = \lambda_{n_5} = \lambda_{n_6} = 1$ and $\lambda_{n_4} = 2$, so $CV = 1$.
$W_1 = 2$, $W_2 = 3$, so $imbal = 0.2$.
Partitioning criteria: communication volume and partitioning time.
Better volume reduces the parallel execution time.
However, partitioning time can dominate application time.
We want to reduce the partitioning time by sparsification.
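
The numbers in this example can be checked with a short Python sketch. The figure is not part of the transcript, so the pin sets and the part assignment below are assumptions chosen to be consistent with the values on the slides; the function names are illustrative, and the imbalance is assumed to be defined as the maximum part weight divided by the average, minus one.

```python
def cut_volume(pins, cost, part):
    """Connectivity-1 metric: sum over nets of c(n) * (lambda_n - 1)."""
    total = 0
    for net, vertices in pins.items():
        lam = len({part[v] for v in vertices})  # number of parts the net touches
        total += cost[net] * (lam - 1)
    return total


def part_loads_and_imbalance(weight, part, k):
    """Return the part weights W_k and max(W_k) / W_avg - 1 (assumed definition)."""
    loads = [0] * k
    for v, w in weight.items():
        loads[part[v]] += w
    avg = sum(loads) / k
    return loads, max(loads) / avg - 1


# Hypothetical pin sets (assumption: the original figure is not available).
pins = {
    "n1": [1, 3], "n2": [1, 3], "n3": [1, 3],
    "n4": [2, 3, 4],
    "n5": [2, 4, 5], "n6": [2, 4, 5],
}
cost = {n: 1 for n in pins}              # c(n_i) = 1
weight = {v: 1 for v in range(1, 6)}     # w(v_i) = 1
part = {1: 0, 3: 0, 2: 1, 4: 1, 5: 1}    # P1 = {v1, v3}, P2 = {v2, v4, v5}

cv = cut_volume(pins, cost, part)
loads, imbal = part_loads_and_imbalance(weight, part, 2)
print(cv, loads, round(imbal, 3))
```

Only $n_4$ spans both parts under this assignment, which is why it is the only net that contributes to the volume.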

6 Multi-level Approach
Three phases:
  Coarsening: obtain a sequence of smaller hypergraphs that are similar to the original.
  Initial partitioning: find a solution for the smallest hypergraph.
  Uncoarsening: project the initial solution onto the finer hypergraphs and refine it iteratively until a solution for the original hypergraph is obtained.

7 Identical Net Removal (INR)
Two nets are identical if their pin sets are the same.
Pairwise comparison of all nets is very expensive. Instead, we use hashing:
  If two nets are identical, the sums of their pin IDs must be identical.
  Calculate a hash value for each net, and compare only the nets with the same hash value.
  Choose one representative net for each set of identical nets.
Coarsening sparsifies the vertices, so INR is done after each coarsening level.
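
The hashing idea above can be sketched as follows. This is a minimal Python illustration, not UMPa's implementation: the function name `identical_net_removal`, the dictionary-based layout, and the example pin sets are assumptions made for the sketch. It uses the plain sum of pin IDs as the checksum and performs an exact pin-set comparison only against representatives in the same checksum bucket.

```python
from collections import defaultdict


def identical_net_removal(pins, cost):
    """Return ({representative: merged cost}, number of pairwise comparisons)."""
    pin_sets = {net: frozenset(vs) for net, vs in pins.items()}
    buckets = defaultdict(list)  # checksum -> representatives seen so far
    merged_cost = {}
    comparisons = 0
    for net, pin_set in pin_sets.items():
        checksum = sum(pin_set)
        for rep in buckets[checksum]:
            comparisons += 1
            if pin_sets[rep] == pin_set:
                # Identical net: fold its cost into the representative.
                merged_cost[rep] += cost[net]
                break
        else:
            # No identical representative found: this net becomes one.
            buckets[checksum].append(net)
            merged_cost[net] = cost[net]
    return merged_cost, comparisons


# Hypothetical input (assumption): six unit-cost nets, three distinct pin sets.
pins = {
    "n1": [1, 3], "n2": [1, 3], "n3": [1, 3],
    "n4": [2, 3, 4],
    "n5": [2, 4, 5], "n6": [2, 4, 5],
}
cost = {n: 1 for n in pins}
merged, comparisons = identical_net_removal(pins, cost)
print(merged, comparisons)
```

Accumulating the cost on the representative keeps the communication volume of any partition unchanged, which is why this step is lossless.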

8 INR Hash Function
Hash functions: the checksums $CS_j(n) = \sum_{i \in pins[n]} i^j$, and MurmurHash [Appleby12].
The quality of a hash function depends on the number of collisions, e.g., $CS(n_1) = CS(n_2)$ for two nets $n_1$ and $n_2$ that are not identical.
False-positive cost: the number of pairwise comparisons made for non-identical nets.
Checksum occupancy: the average number of representatives having the same checksum value.

9 INR Variants
INR-SRT: calculates a hash value for each net, then sorts the nets by hash value. This reduces the false-positive cost and the occupancy; however, sorting can be expensive.
INR-MEM: uses two arrays, first and next, to store the hash values in a linked-list structure.
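
One way to read the INR-MEM description is as a chained hash table held in two flat arrays. The sketch below is an interpretation under that assumption, not the authors' code: `first[h]` is taken to be the head of the chain of representatives with hash value `h`, `nxt[n]` the next representative in the chain of net `n` (named `nxt` to avoid shadowing Python's built-in `next`), and the table size and CS1 checksum are illustrative choices. Nets are identified by their index, and each pin list is assumed to contain no duplicates.

```python
def inr_mem(pin_lists, table_size):
    """Return rep_of, where rep_of[n] is the representative of net n."""
    first = [-1] * table_size      # first[h]: head of the chain for hash value h
    nxt = [-1] * len(pin_lists)    # nxt[n]: next representative in n's chain
    rep_of = list(range(len(pin_lists)))
    sorted_pins = [sorted(p) for p in pin_lists]
    for n, p in enumerate(sorted_pins):
        h = sum(p) % table_size    # CS1 checksum reduced to the table size
        r = first[h]
        # Walk the chain until an identical representative is found.
        while r != -1 and sorted_pins[r] != p:
            r = nxt[r]
        if r != -1:
            rep_of[n] = r          # identical net found
        else:
            nxt[n] = first[h]      # push n onto the front of the chain
            first[h] = n
    return rep_of


# Hypothetical pin lists (assumption), nets numbered 0..5.
pin_lists = [[1, 3], [1, 3], [1, 3], [2, 3, 4], [2, 4, 5], [2, 4, 5]]
rep_of = inr_mem(pin_lists, table_size=7)
print(rep_of)
```

Compared with sorting, this layout needs only a single pass over the nets, at the price of a fixed table size that bounds the range of hash values.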

10 Hashing Example
CS1(n1) = 1 + 3 = 4
CS1(n2) = 1 + 3 = 4
CS1(n3) = 1 + 3 = 4
CS1(n4) = ... = 9, 9 mod 7 = 2
CS1(n5) = ... mod 7 = 4
CS1(n6) = ... mod 7 = 4
The costs of identical nets are accumulated on their representative: c(n1) = c(n1) + c(n2) + c(n3), c(n5) = c(n5) + c(n6)
Occupancy = (2 + 1) / 2 = 1.5
Occupancy = (...) / 3 = 1
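
The first occupancy value in this example can be reproduced with a short sketch. The pin sets of n4 and n5 below are assumptions (the figure is not part of the transcript); they were chosen so that CS1 modulo 7 gives the checksums 4, 2, and 4 for the three representatives, matching the slides. Occupancy is computed as the number of representatives divided by the number of distinct checksum values.

```python
from collections import Counter


def checksum(pins_of_net, j, modulus=None):
    """CS_j: sum of the j-th powers of the pin IDs, optionally reduced mod modulus."""
    value = sum(i ** j for i in pins_of_net)
    return value % modulus if modulus else value


def occupancy(representatives, j, modulus=None):
    """Average number of representatives that share a checksum value."""
    counts = Counter(checksum(p, j, modulus) for p in representatives.values())
    return sum(counts.values()) / len(counts)


# Representatives left after identical net removal (pin sets partly assumed).
reps = {"n1": [1, 3], "n4": [2, 3, 4], "n5": [2, 4, 5]}

occ_mod7 = occupancy(reps, j=1, modulus=7)   # checksums 4, 2, 4
occ_plain = occupancy(reps, j=1)             # checksums 4, 9, 11
print(occ_mod7, occ_plain)
```

With the modulus, n1 and n5 collide on checksum 4, so they must be compared pin by pin even though they are not identical; that extra comparison is what the false-positive cost counts.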

17 Identical Vertex Removal (IVR)
Two vertices are identical if they are connected to the same nets.
The same methods as for INR are applied.
Although INR does not affect the partitioning result, IVR affects the quality of the partitioning by taking early decisions on the part assignments.
Coarsening already merges identical vertices, so IVR is not strictly needed. However, IVR performed at the beginning of coarsening can reduce its execution time.

18 Similar Net Removal (SNR)
INR aims to remove redundancy from the hypergraph; it is only effective when identical nets, i.e., redundancy, exist.
SNR removes similar nets even when there is no redundancy.
It is a lossy compression technique: it usually worsens the quality, but makes the partitioning faster.
When the performance of the application is not very sensitive to small changes in partitioning quality, this trade-off can be useful.

19 Similar Net Removal (SNR)
The similarity between two nets $n_i$ and $n_j$ is defined by the Jaccard coefficient:
$J(n_i, n_j) = \frac{|pins[n_i] \cap pins[n_j]|}{|pins[n_i] \cup pins[n_j]|}$
Since the number of nets is large, it is infeasible to compute the similarity for each pair of nets.
Instead, we compute a footprint of each net using minhash.
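
For reference, the Jaccard coefficient itself is a one-liner on Python sets. The sketch below is illustrative only; the function name and the convention of returning 1.0 for two empty pin sets are assumptions, and the example pin sets are hypothetical.

```python
def jaccard(pins_a, pins_b):
    """|A intersect B| / |A union B| for two pin collections."""
    a, b = set(pins_a), set(pins_b)
    if not a and not b:
        return 1.0  # assumed convention: two empty nets count as identical
    return len(a & b) / len(a | b)


j_similar = jaccard([2, 3, 4], [2, 4, 5])   # share pins 2 and 4 out of 4 distinct pins
j_identical = jaccard([1, 3], [3, 1])
j_disjoint = jaccard([1, 3], [4, 5])
print(j_similar, j_identical, j_disjoint)
```

Evaluating this for every pair of nets is quadratic in the number of nets, which is the cost the minhash footprint avoids.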

20 Similar Net Removal (SNR)
$\sigma$ is a random permutation of the integers from 1 to $|V|$, and $min_\sigma(n)$ is the first vertex ID of a net $n \in N$ under the permutation $\sigma$.
We use $t$ permutations $\sigma_1$ to $\sigma_t$ to obtain a minwise footprint of each net.
Two nets $n_i$ and $n_j$ are considered similar iff their minwise footprints are identical, where $mf(n) = (min_{\sigma_1}(n), \ldots, min_{\sigma_t}(n))$.
We do the hashing and pairwise comparison only on these minwise footprints, and choose one net as the representative of each set of similar nets:
  Large (LRG): the representative is the net with the largest number of pins.
  Important (IMP): as LRG, but when calculating the pin count, the pins connected to heavy nets are prioritized.
  Union (UNI): the representative is a virtual net connected to all pins of the nets in the set.
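
The footprint construction can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the UMPa code: the function names, the seeded `random.Random` generator, and the dictionary layout are all hypothetical, nets are assumed to be non-empty, and only the LRG and UNI selection rules are shown. For a single random permutation, the probability that two nets share the same minimum equals their Jaccard coefficient, so larger `t` makes the grouping stricter.

```python
import random
from collections import defaultdict


def minwise_footprints(pins, num_vertices, t, seed=0):
    """Map each net to (min under sigma_1, ..., min under sigma_t)."""
    rng = random.Random(seed)
    ranks = []
    for _ in range(t):
        order = list(range(1, num_vertices + 1))
        rng.shuffle(order)  # one random permutation of the vertex IDs
        ranks.append({v: pos for pos, v in enumerate(order)})
    return {
        net: tuple(min(vs, key=rank.__getitem__) for rank in ranks)
        for net, vs in pins.items()
    }


def group_similar(pins, num_vertices, t, seed=0):
    """Group nets whose minwise footprints are identical."""
    groups = defaultdict(list)
    for net, fp in minwise_footprints(pins, num_vertices, t, seed).items():
        groups[fp].append(net)
    return list(groups.values())


def pick_largest(group, pins):
    """LRG rule: the net with the most pins represents the group."""
    return max(group, key=lambda n: len(pins[n]))


def union_pins(group, pins):
    """UNI rule: a virtual net connected to all pins of the group."""
    return sorted(set().union(*(pins[n] for n in group)))


# Hypothetical nets (assumption): a and b are identical, c is disjoint from both.
pins = {"a": [1, 2, 3], "b": [3, 2, 1], "c": [4, 5]}
groups = group_similar(pins, num_vertices=5, t=4)
print(groups)
```

Identical nets always end up in the same group, and nets with no common pin never do, whatever permutations are drawn; only partially overlapping nets depend on the random choice.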

21 Experiments
All the algorithms are implemented in UMPa.
g++ version and O3 flag.
Intel Xeon E5520 (quad-core, clocked at 2.27 GHz), 48 GB of memory.
28 matrices from different matrix classes.
K = 2, 8, 32, 128, 512, 1024

22 Hash Function Comparison
[Figure: elimination time, false-positive cost, and checksum occupancy, normalized w.r.t. CS1 with INR-SRT, for InrSrt+Cs2, InrSrt+Cs3, InrSrt+Murmur, InrMem+Cs1, InrMem+Cs2, InrMem+Cs3, InrMem+Murmur]
Quality is better with INR-SRT, as there is no limit on the hash size.
Except for CS1, all hash functions have an occupancy value close to 1 (the optimal occupancy).
INR-MEM equipped with CS2 has the best performance.
This checksum function is as good as CS3 and MurmurHash, and it is computationally cheaper.

23 Improvement on Time and CV
[Table: for each K, Time and CV of base UMPa, and Time-Speedup and CV-improve for INR and for INR+IVR]
Speedups of up to 3.30 for INR+IVR.
0.3% to 2.4% quality improvement on average.
The speedup increases with K. This is promising, as the overhead of partitioning is usually an issue for large K values.
Most of the speedup is obtained with INR, as not all hypergraphs contain identical vertices: 14 of the 28 matrices in the test set have fewer than 103 identical vertices.

24 SNR improvement w.r.t. INR+IVR
[Figure: total volume and total time of SNR-LRG, SNR-P4-LRG, SNR-IMP, SNR-P4-IMP, SNR-UNI, and SNR-P4-UNI, normalized w.r.t. INR+IVR, for K = 128, 512, 1024]
Four permutation arrays are used (t = 4).
Variants are named SNR-X and SNR-P4-X, where X is a representative selection method.
SNR-P4-X restricts the removal process to the nets with 4 or more pins.
On 1024 processors:
  SNR-LRG: 22% improvement on time, 5% harm on CV; 4.2 speedup w.r.t. Base, 4% reduction on CV
  SNR-P4-LRG: 15% improvement on time, 2% harm on CV; 3.9 speedup w.r.t. Base, 2% reduction on CV

25 Conclusion
We proposed heuristics for lossless and lossy hypergraph sparsification.
We showed that the effectiveness of the heuristics increases with the number of parts.
This is promising, as partitioning overhead is an issue for today's architectures with a large number of processors.

26 References
[Catalyurek12] Catalyurek et al., "UMPa: A multi-objective, multi-level partitioner for communication minimization," Graph Partitioning and Graph Clustering (2012).
[Appleby12] A. Appleby, SMHasher & MurmurHash, 2012, http://code.google.com/p/smhasher/.

27 Thanks
For more information, visit http://bmi.osu.edu/~umit or http://bmi.osu.edu/hpc
Acknowledgement of Support

Par$$oning Sparse Matrices

Par$$oning Sparse Matrices SIAM CSE 09 Minisymposium on Parallel Sparse Matrix Computa:ons and Enabling Algorithms March 2, 2009, Miami, FL Par$$oning Sparse Matrices Ümit V. Çatalyürek Associate Professor Biomedical Informa5cs

More information

Efficient Memory and Bandwidth Management for Industrial Strength Kirchhoff Migra<on

Efficient Memory and Bandwidth Management for Industrial Strength Kirchhoff Migra<on Efficient Memory and Bandwidth Management for Industrial Strength Kirchhoff Migra

More information

Ar#ficial Intelligence

Ar#ficial Intelligence Ar#ficial Intelligence Advanced Searching Prof Alexiei Dingli Gene#c Algorithms Charles Darwin Genetic Algorithms are good at taking large, potentially huge search spaces and navigating them, looking for

More information

Parallel Graph Coloring For Many- core Architectures

Parallel Graph Coloring For Many- core Architectures Parallel Graph Coloring For Many- core Architectures Mehmet Deveci, Erik Boman, Siva Rajamanickam Sandia Na;onal Laboratories Sandia National Laboratories is a multi-program laboratory managed and operated

More information

UMPa: A multi-objective, multi-level partitioner for communication minimization

UMPa: A multi-objective, multi-level partitioner for communication minimization Contemporary Mathematics Volume 588, 2013 http://dx.doi.org/10.1090/conm/588/11704 UMPa: A multi-objective, multi-level partitioner for communication minimization Ümit V. Çatalyürek, Mehmet Deveci, Kamer

More information

k-way Hypergraph Partitioning via n-level Recursive Bisection

k-way Hypergraph Partitioning via n-level Recursive Bisection k-way Hypergraph Partitioning via n-level Recursive Bisection Sebastian Schlag, Vitali Henne, Tobias Heuer, Henning Meyerhenke Peter Sanders, Christian Schulz January 10th, 2016 @ ALENEX 16 INSTITUTE OF

More information

CS 6140: Machine Learning Spring 2017

CS 6140: Machine Learning Spring 2017 CS 6140: Machine Learning Spring 2017 Instructor: Lu Wang College of Computer and Informa@on Science Northeastern University Webpage: www.ccs.neu.edu/home/luwang Email: luwang@ccs.neu.edu Logis@cs Grades

More information

Clustering. Barna Saha

Clustering. Barna Saha Clustering Barna Saha The Problem of Clustering Given a set of points, with a no;on of distance between points, group the points into some number of clusters, so that members of a cluster are close to

More information

Opera&ng Systems ECE344

Opera&ng Systems ECE344 Opera&ng Systems ECE344 Lecture 10: Scheduling Ding Yuan Scheduling Overview In discussing process management and synchroniza&on, we talked about context switching among processes/threads on the ready

More information

A Push- Relabel- Based Maximum Cardinality Bipar9te Matching Algorithm on GPUs

A Push- Relabel- Based Maximum Cardinality Bipar9te Matching Algorithm on GPUs A Push- Relabel- Based Maximum Cardinality Biparte Matching Algorithm on GPUs Mehmet Deveci,, Kamer Kaya, Bora Uçar, and Ümit V. Çatalyürek, Dept. of Biomedical InformaDcs, The Ohio State University Dept.

More information

A Distributed Data- Parallel Execu3on Framework in the Kepler Scien3fic Workflow System

A Distributed Data- Parallel Execu3on Framework in the Kepler Scien3fic Workflow System A Distributed Data- Parallel Execu3on Framework in the Kepler Scien3fic Workflow System Ilkay Al(ntas and Daniel Crawl San Diego Supercomputer Center UC San Diego Jianwu Wang UMBC WorDS.sdsc.edu Computa3onal

More information

STREAMER: a Distributed Framework for Incremental Closeness Centrality

STREAMER: a Distributed Framework for Incremental Closeness Centrality STREAMER: a Distributed Framework for Incremental Closeness Centrality Computa@on A. Erdem Sarıyüce 1,2, Erik Saule 4, Kamer Kaya 1, Ümit V. Çatalyürek 1,3 1 Department of Biomedical InformaBcs 2 Department

More information

BigDataBench- S: An Open- source Scien6fic Big Data Benchmark Suite

BigDataBench- S: An Open- source Scien6fic Big Data Benchmark Suite BigDataBench- S: An Open- source Scien6fic Big Data Benchmark Suite Xinhui Tian, Shaopeng Dai, Zhihui Du, Wanling Gao, Rui Ren, Yaodong Cheng, Zhifei Zhang, Zhen Jia, Peijian Wang and Jianfeng Zhan INSTITUTE

More information

Modular arithme.c and cryptography

Modular arithme.c and cryptography Modular arithme.c and cryptography CSC 1300 Discrete Structures Villanova University Public Key Cryptography (Slides 11-32) by Dr. Lillian Cassel, Villanova University Villanova CSC 1300 - Dr Papalaskari

More information

Combinatorial Mathema/cs and Algorithms at Exascale: Challenges and Promising Direc/ons

Combinatorial Mathema/cs and Algorithms at Exascale: Challenges and Promising Direc/ons Combinatorial Mathema/cs and Algorithms at Exascale: Challenges and Promising Direc/ons Assefaw Gebremedhin Purdue University (Star/ng August 2014, Washington State University School of Electrical Engineering

More information

Re- op&mizing Data Parallel Compu&ng

Re- op&mizing Data Parallel Compu&ng Re- op&mizing Data Parallel Compu&ng Sameer Agarwal Srikanth Kandula, Nicolas Bruno, Ming- Chuan Wu, Ion Stoica, Jingren Zhou UC Berkeley A Data Parallel Job can be a collec/on of maps, A Data Parallel

More information

Design and Implementa/on of a Consolidated Middlebox Architecture. Vyas Sekar Sylvia Ratnasamy Michael Reiter Norbert Egi Guangyu Shi

Design and Implementa/on of a Consolidated Middlebox Architecture. Vyas Sekar Sylvia Ratnasamy Michael Reiter Norbert Egi Guangyu Shi Design and Implementa/on of a Consolidated Middlebox Architecture Vyas Sekar Sylvia Ratnasamy Michael Reiter Norbert Egi Guangyu Shi 1 Need for Network Evolu/on New applica/ons Evolving threats Performance,

More information

Graph and Hypergraph Partitioning for Parallel Computing

Graph and Hypergraph Partitioning for Parallel Computing Graph and Hypergraph Partitioning for Parallel Computing Edmond Chow School of Computational Science and Engineering Georgia Institute of Technology June 29, 2016 Graph and hypergraph partitioning References:

More information

Parallel Implementation of Task Scheduling using Ant Colony Optimization

Parallel Implementation of Task Scheduling using Ant Colony Optimization Parallel Implementaon of Task Scheduling using Ant Colony Opmizaon T. Vetri Selvan 1, Mrs. P. Chitra 2, Dr. P. Venkatesh 3 1 Thiagaraar College of Engineering /Department of Computer Science, Madurai,

More information

Performance Evaluation of a MongoDB and Hadoop Platform for Scientific Data Analysis

Performance Evaluation of a MongoDB and Hadoop Platform for Scientific Data Analysis Performance Evaluation of a MongoDB and Hadoop Platform for Scientific Data Analysis Elif Dede, Madhusudhan Govindaraju Lavanya Ramakrishnan, Dan Gunter, Shane Canon Department of Computer Science, Binghamton

More information

Distributed State Es.ma.on Algorithms for Electric Power Systems

Distributed State Es.ma.on Algorithms for Electric Power Systems Distributed State Es.ma.on Algorithms for Electric Power Systems Ariana Minot, Blue Waters Graduate Fellow Professor Na Li, Professor Yue M. Lu Harvard University, School of Engineering and Applied Sciences

More information

M 2 R: Enabling Stronger Privacy in MapReduce Computa;on

M 2 R: Enabling Stronger Privacy in MapReduce Computa;on M 2 R: Enabling Stronger Privacy in MapReduce Computa;on Anh Dinh, Prateek Saxena, Ee- Chien Chang, Beng Chin Ooi, Chunwang Zhang School of Compu,ng Na,onal University of Singapore 1. Mo;va;on Distributed

More information

Lecture 2 Data Cube Basics

Lecture 2 Data Cube Basics CompSci 590.6 Understanding Data: Theory and Applica>ons Lecture 2 Data Cube Basics Instructor: Sudeepa Roy Email: sudeepa@cs.duke.edu 1 Today s Papers 1. Gray- Chaudhuri- Bosworth- Layman- Reichart- Venkatrao-

More information

BoW model. Textual data: Bag of Words model

BoW model. Textual data: Bag of Words model BoW model Textual data: Bag of Words model With text, categoriza9on is the task of assigning a document to one or more categories based on its content. It is appropriate for: Detec9ng and indexing similar

More information

Overcoming the Barriers of Graphs on GPUs: Delivering Graph Analy;cs 100X Faster and 40X Cheaper

Overcoming the Barriers of Graphs on GPUs: Delivering Graph Analy;cs 100X Faster and 40X Cheaper Overcoming the Barriers of Graphs on GPUs: Delivering Graph Analy;cs 100X Faster and 40X Cheaper November 18, 2015 Super Compu3ng 2015 The Amount of Graph Data is Exploding! Billion+ Edges! 2 Graph Applications

More information

Energy- Aware Time Change Detec4on Using Synthe4c Aperture Radar On High- Performance Heterogeneous Architectures: A DDDAS Approach

Energy- Aware Time Change Detec4on Using Synthe4c Aperture Radar On High- Performance Heterogeneous Architectures: A DDDAS Approach Energy- Aware Time Change Detec4on Using Synthe4c Aperture Radar On High- Performance Heterogeneous Architectures: A DDDAS Approach Sanjay Ranka (PI) Sartaj Sahni (Co- PI) Mark Schmalz (Co- PI) University

More information

Informa)on Retrieval and Map- Reduce Implementa)ons. Mohammad Amir Sharif PhD Student Center for Advanced Computer Studies

Informa)on Retrieval and Map- Reduce Implementa)ons. Mohammad Amir Sharif PhD Student Center for Advanced Computer Studies Informa)on Retrieval and Map- Reduce Implementa)ons Mohammad Amir Sharif PhD Student Center for Advanced Computer Studies mas4108@louisiana.edu Map-Reduce: Why? Need to process 100TB datasets On 1 node:

More information

MapReduce, Apache Hadoop

MapReduce, Apache Hadoop NDBI040: Big Data Management and NoSQL Databases hp://www.ksi.mff.cuni.cz/ svoboda/courses/2016-1-ndbi040/ Lecture 2 MapReduce, Apache Hadoop Marn Svoboda svoboda@ksi.mff.cuni.cz 11. 10. 2016 Charles University

More information

MPI & OpenMP Mixed Hybrid Programming

MPI & OpenMP Mixed Hybrid Programming MPI & OpenMP Mixed Hybrid Programming Berk ONAT İTÜ Bilişim Enstitüsü 22 Haziran 2012 Outline Introduc/on Share & Distributed Memory Programming MPI & OpenMP Advantages/Disadvantages MPI vs. OpenMP Why

More information

(Sequen)al) Sor)ng. Bubble Sort, Inser)on Sort. Merge Sort, Heap Sort, QuickSort. Op)mal Parallel Time complexity. O ( n 2 )

(Sequen)al) Sor)ng. Bubble Sort, Inser)on Sort. Merge Sort, Heap Sort, QuickSort. Op)mal Parallel Time complexity. O ( n 2 ) Parallel Sor)ng A jungle (Sequen)al) Sor)ng Bubble Sort, Inser)on Sort O ( n 2 ) Merge Sort, Heap Sort, QuickSort O ( n log n ) QuickSort best on average Op)mal Parallel Time complexity O ( n log n ) /

More information

Informa(on Retrieval

Informa(on Retrieval Introduc*on to Informa(on Retrieval Clustering Chris Manning, Pandu Nayak, and Prabhakar Raghavan Today s Topic: Clustering Document clustering Mo*va*ons Document representa*ons Success criteria Clustering

More information

Informa(on Retrieval

Informa(on Retrieval Introduc*on to Informa(on Retrieval CS276: Informa*on Retrieval and Web Search Pandu Nayak and Prabhakar Raghavan Lecture 12: Clustering Today s Topic: Clustering Document clustering Mo*va*ons Document

More information

Masher: Mapping Long(er) Reads with Hash-based Genome Indexing on GPUs

Masher: Mapping Long(er) Reads with Hash-based Genome Indexing on GPUs Masher: Mapping Long(er) Reads with Hash-based Genome Indexing on GPUs Anas Abu-Doleh 1,2, Erik Saule 1, Kamer Kaya 1 and Ümit V. Çatalyürek 1,2 1 Department of Biomedical Informatics 2 Department of Electrical

More information

Finding Similar Sets. Applications Shingling Minhashing Locality-Sensitive Hashing

Finding Similar Sets. Applications Shingling Minhashing Locality-Sensitive Hashing Finding Similar Sets Applications Shingling Minhashing Locality-Sensitive Hashing Goals Many Web-mining problems can be expressed as finding similar sets:. Pages with similar words, e.g., for classification

More information

Op#mizing MapReduce for Highly- Distributed Environments

Op#mizing MapReduce for Highly- Distributed Environments Op#mizing MapReduce for Highly- Distributed Environments Abhishek Chandra Associate Professor Department of Computer Science and Engineering University of Minnesota hep://www.cs.umn.edu/~chandra 1 Big

More information

MapReduce. Tom Anderson

MapReduce. Tom Anderson MapReduce Tom Anderson Last Time Difference between local state and knowledge about other node s local state Failures are endemic Communica?on costs ma@er Why Is DS So Hard? System design Par??oning of

More information

MapReduce, Apache Hadoop

MapReduce, Apache Hadoop Czech Technical University in Prague, Faculty of Informaon Technology MIE-PDB: Advanced Database Systems hp://www.ksi.mff.cuni.cz/~svoboda/courses/2016-2-mie-pdb/ Lecture 12 MapReduce, Apache Hadoop Marn

More information

Virtual Synchrony. Jared Cantwell

Virtual Synchrony. Jared Cantwell Virtual Synchrony Jared Cantwell Review Mul7cast Causal and total ordering Consistent Cuts Synchronized clocks Impossibility of consensus Distributed file systems Goal Distributed programming is hard What

More information

hashfs Applying Hashing to Op2mize File Systems for Small File Reads

hashfs Applying Hashing to Op2mize File Systems for Small File Reads hashfs Applying Hashing to Op2mize File Systems for Small File Reads Paul Lensing, Dirk Meister, André Brinkmann Paderborn Center for Parallel Compu2ng University of Paderborn Mo2va2on and Problem Design

More information

Applica'on Aware Deadlock Free Oblivious Rou'ng

Applica'on Aware Deadlock Free Oblivious Rou'ng Applica'on Aware Deadlock Free Oblivious Rou'ng Michel Kinsy, Myong Hyo Cho, Tina Wen, Edward Suh (Cornell University), Marten van Dijk and Srinivas Devadas Massachuse(s Ins,tute of Technology Outline

More information

Amol Deshpande, University of Maryland Lisa Hellerstein, Polytechnic University, Brooklyn

Amol Deshpande, University of Maryland Lisa Hellerstein, Polytechnic University, Brooklyn Amol Deshpande, University of Maryland Lisa Hellerstein, Polytechnic University, Brooklyn Mo>va>on: Parallel Query Processing Increasing parallelism in compu>ng Shared nothing clusters, mul> core technology,

More information

Op#mizing PGAS overhead in a mul#-locale Chapel implementa#on of CoMD

Op#mizing PGAS overhead in a mul#-locale Chapel implementa#on of CoMD Op#mizing PGAS overhead in a mul#-locale Chapel implementa#on of CoMD Riyaz Haque and David F. Richards This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore

More information

Asynchronous and Fault-Tolerant Recursive Datalog Evalua9on in Shared-Nothing Engines

Asynchronous and Fault-Tolerant Recursive Datalog Evalua9on in Shared-Nothing Engines Asynchronous and Fault-Tolerant Recursive Datalog Evalua9on in Shared-Nothing Engines Jingjing Wang, Magdalena Balazinska, Daniel Halperin University of Washington Modern Analy>cs Requires Itera>on Graph

More information

Decision making for autonomous naviga2on. Anoop Aroor Advisor: Susan Epstein CUNY Graduate Center, Computer science

Decision making for autonomous naviga2on. Anoop Aroor Advisor: Susan Epstein CUNY Graduate Center, Computer science Decision making for autonomous naviga2on Anoop Aroor Advisor: Susan Epstein CUNY Graduate Center, Computer science Overview Naviga2on and Mobile robots Decision- making techniques for naviga2on Building

More information

Using Sequen+al Run+me Distribu+ons for the Parallel Speedup Predic+on of SAT Local Search

Using Sequen+al Run+me Distribu+ons for the Parallel Speedup Predic+on of SAT Local Search Using Sequen+al Run+me Distribu+ons for the Parallel Speedup Predic+on of SAT Local Search Alejandro Arbelaez - CharloBe Truchet - Philippe Codognet JFLI University of Tokyo LINA, UMR 6241 University of

More information

Today s Objec4ves. Data Center. Virtualiza4on Cloud Compu4ng Amazon Web Services. What did you think? 10/23/17. Oct 23, 2017 Sprenkle - CSCI325

Today s Objec4ves. Data Center. Virtualiza4on Cloud Compu4ng Amazon Web Services. What did you think? 10/23/17. Oct 23, 2017 Sprenkle - CSCI325 Today s Objec4ves Virtualiza4on Cloud Compu4ng Amazon Web Services Oct 23, 2017 Sprenkle - CSCI325 1 Data Center What did you think? Oct 23, 2017 Sprenkle - CSCI325 2 1 10/23/17 Oct 23, 2017 Sprenkle -

More information

Tools zur Op+mierung eingebe2eter Mul+core- Systeme. Bernhard Bauer

Tools zur Op+mierung eingebe2eter Mul+core- Systeme. Bernhard Bauer Tools zur Op+mierung eingebe2eter Mul+core- Systeme Bernhard Bauer Agenda Mo+va+on So.ware Engineering & Mul5core Think Parallel Models Added Value Tooling Quo Vadis? The Mul5core Era Moore s Law: The

More information

Fixed- Parameter Evolu2onary Algorithms

Fixed- Parameter Evolu2onary Algorithms Fixed- Parameter Evolu2onary Algorithms Frank Neumann School of Computer Science University of Adelaide Joint work with Stefan Kratsch (U Utrecht), Per Kris2an Lehre (DTU Informa2cs), Pietro S. Oliveto

More information

A Script- Based Autotuning Compiler System to Generate High- Performance CUDA code

A Script- Based Autotuning Compiler System to Generate High- Performance CUDA code A Script- Based Autotuning Compiler System to Generate High- Performance CUDA code Malik Khan, Protonu Basu, Gabe Rudy, Mary Hall, Chun Chen, Jacqueline Chame Mo:va:on Challenges to programming the GPU

More information

Deformable Part Models

Deformable Part Models Deformable Part Models References: Felzenszwalb, Girshick, McAllester and Ramanan, Object Detec@on with Discrimina@vely Trained Part Based Models, PAMI 2010 Code available at hkp://www.cs.berkeley.edu/~rbg/latent/

More information

CSE 473: Ar+ficial Intelligence

CSE 473: Ar+ficial Intelligence CSE 473: Ar+ficial Intelligence Search Instructor: Luke Ze=lemoyer University of Washington [These slides were adapted from Dan Klein and Pieter Abbeel for CS188 Intro to AI at UC Berkeley. All CS188 materials

More information

Parallel Algorithm for Multilevel Graph Partitioning and Sparse Matrix Ordering

Parallel Algorithm for Multilevel Graph Partitioning and Sparse Matrix Ordering Parallel Algorithm for Multilevel Graph Partitioning and Sparse Matrix Ordering George Karypis and Vipin Kumar Brian Shi CSci 8314 03/09/2017 Outline Introduction Graph Partitioning Problem Multilevel

More information

Extending Heuris.c Search

Extending Heuris.c Search Extending Heuris.c Search Talk at Hebrew University, Cri.cal MAS group Roni Stern Department of Informa.on System Engineering, Ben Gurion University, Israel 1 Heuris.c search 2 Outline Combining lookahead

More information

Informa(on Retrieval

Informa(on Retrieval Introduc)on to Informa)on Retrieval CS3245 Informa(on Retrieval Lecture 7: Scoring, Term Weigh9ng and the Vector Space Model 7 Last Time: Index Construc9on Sort- based indexing Blocked Sort- Based Indexing

More information

ECSE 425 Lecture 25: Mul1- threading

ECSE 425 Lecture 25: Mul1- threading ECSE 425 Lecture 25: Mul1- threading H&P Chapter 3 Last Time Theore1cal and prac1cal limits of ILP Instruc1on window Branch predic1on Register renaming 2 Today Mul1- threading Chapter 3.5 Summary of ILP:

More information

Multilevel Acyclic Partitioning of Directed Acyclic Graphs for Enhancing Data Locality

Multilevel Acyclic Partitioning of Directed Acyclic Graphs for Enhancing Data Locality Multilevel Acyclic Partitioning of Directed Acyclic Graphs for Enhancing Data Locality Julien Herrmann 1, Bora Uçar 2, Kamer Kaya 3, Aravind Sukumaran Rajam 4, Fabrice Rastello 5, P. Sadayappan 4, Ümit

More information

Fast Recommendation on Bibliographic Networks with Sparse-Matrix Ordering and Partitioning

Fast Recommendation on Bibliographic Networks with Sparse-Matrix Ordering and Partitioning Social Network Analysis and Mining manuscript No. (will be inserted by the editor) Fast Recommendation on Bibliographic Networks with Sparse-Matrix Ordering and Partitioning Onur Küçüktunç Kamer Kaya Erik

More information

CS6200 Informa.on Retrieval. David Smith College of Computer and Informa.on Science Northeastern University

CS6200 Informa.on Retrieval. David Smith College of Computer and Informa.on Science Northeastern University CS6200 Informa.on Retrieval David Smith College of Computer and Informa.on Science Northeastern University Indexing Process Indexes Indexes are data structures designed to make search faster Text search

More information

Network Coding: Theory and Applica7ons

Network Coding: Theory and Applica7ons Network Coding: Theory and Applica7ons PhD Course Part IV Tuesday 9.15-12.15 18.6.213 Muriel Médard (MIT), Frank H. P. Fitzek (AAU), Daniel E. Lucani (AAU), Morten V. Pedersen (AAU) Plan Hello World! Intra

More information

Super Instruction Architecture for Heterogeneous Systems. Victor Lotric, Nakul Jindal, Erik Deumens, Rod Bartlett, Beverly Sanders

Super Instruction Architecture for Heterogeneous Systems. Victor Lotric, Nakul Jindal, Erik Deumens, Rod Bartlett, Beverly Sanders Super Instruction Architecture for Heterogeneous Systems Victor Lotric, Nakul Jindal, Erik Deumens, Rod Bartlett, Beverly Sanders Super Instruc,on Architecture Mo,vated by Computa,onal Chemistry Coupled

More information

CS261 Data Structures. Maps (or Dic4onaries)

CS261 Data Structures. Maps (or Dic4onaries) CS261 Data Structures Maps (or Dic4onaries) Goals Introduce the Map(or Dic4onary) ADT Introduce an implementa4on of the map with a Dynamic Array So Far. Emphasis on values themselves e.g. store names in

More information
