Jure Leskovec, Cornell/Stanford University. Joint work with Kevin Lang, Anirban Dasgupta and Michael Mahoney, Yahoo! Research
1 Jure Leskovec, Cornell/Stanford University. Joint work with Kevin Lang, Anirban Dasgupta and Michael Mahoney, Yahoo! Research
2 Network: an interaction graph. Nodes represent entities; edges represent interactions between pairs of entities.
3 Are there natural clusters, communities, partitions, etc.? Concept-based clusters, link-based clusters, density-based clusters, ...
4 Bid, click and impression information for each keyword x advertiser pair. Mine this information at query time to provide new ads. Maximize CTR, RPS and advertiser ROI.
5 Find micro-markets by partitioning the query x advertiser graph. [Figure: query x advertiser graph; axes: query, advertiser]
6 Linear (low-rank) methods: if the data are Gaussian, then a low-rank space is good. Kernel (non-linear) methods: if the data lie on a low-dimensional manifold, then kernels are good. Hierarchical methods: top-down and bottom-up, common in the social sciences. Graph partitioning methods: define an edge-counting metric (conductance, expansion, modularity, etc.) and optimize! "It is a matter of common experience that communities exist in networks... Although not precisely defined, communities are usually thought of as sets of nodes with better connections amongst its members than with the rest of the world."
7 Communities: sets of nodes with lots of connections inside and few to outside (the rest of the network). Assumption: networks are (hierarchically) composed of communities. Communities, clusters, groups, modules.
8 Communities: sets of nodes with lots of connections inside and few to outside (the rest of the network). Assumption: networks are (hierarchically) composed of communities. Hierarchical community structure. Question: are large networks really like this?
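9 How community-like is a set of nodes? Let $A$ be the adjacency matrix of $G=(V,E)$. The conductance of a set $S$ of nodes is $\Phi(S) = \frac{\sum_{i \in S,\, j \notin S} A_{ij}}{\min\{A(S),\, A(\bar{S})\}}$, where $A(S) = \sum_{i \in S} \sum_{j \in V} A_{ij}$ and $\bar{S} = V \setminus S$. The Network Community Profile (NCP) plot of the graph is $\Phi(k) = \min_{S \subset V,\, |S| = k} \Phi(S)$.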
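To make the definition concrete, here is a minimal sketch of the conductance computation for an unweighted, undirected graph. The adjacency-dictionary representation, the function name `conductance` and the toy graph (two triangles joined by one edge) are illustrative assumptions, not part of the slides.

```python
def conductance(adj, S):
    """Conductance of node set S in an undirected, unweighted graph.

    adj maps each node to the set of its neighbours (no self-loops).
    This is cut(S) / min(vol(S), vol(V - S)), where vol is the sum of degrees.
    """
    S = set(S)
    # Edges with exactly one endpoint in S.
    cut = sum(1 for u in S for v in adj[u] if v not in S)
    vol_S = sum(len(adj[u]) for u in S)
    vol_rest = sum(len(adj[u]) for u in adj if u not in S)
    denom = min(vol_S, vol_rest)
    if denom == 0:
        raise ValueError("conductance is undefined when one side has zero volume")
    return cut / denom


# Hypothetical toy graph: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

phi_triangle = conductance(adj, {0, 1, 2})   # one cut edge, volume 7 on each side
phi_single = conductance(adj, {0})           # every edge of node 0 is cut
```

One triangle is a good community here because only one of its seven edge endpoints leads outside, while a single node is the worst case because all of its edges are cut.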
10 What is the best community of 5 nodes? Score: Φ(S) = (# edges cut) / (# edges inside).
11 What is the best community of 5 nodes? Bad community: Φ = 5/6 = 0.83. Score: Φ(S) = (# edges cut) / (# edges inside).
12 What is the best community of 5 nodes? Bad community: Φ = 5/7 = 0.7. Better community: Φ = 2/5 = 0.4. Score: Φ(S) = (# edges cut) / (# edges inside).
13 What is the best community of 5 nodes? Bad community: Φ = 5/7 = 0.7. Better community: Φ = 2/5 = 0.4. Best community: Φ = 2/8 = 0.25. Score: Φ(S) = (# edges cut) / (# edges inside).
14 Network community profile (NCP) plot: plot the score of the best community of size k. [Figure: NCP plot; x-axis: community size, log k; y-axis: log Φ(k); marked points: k=5 with Φ(5)=0.25 and k=7 with Φ(7)=0.18]
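The NCP idea can be sketched by brute force on a tiny graph: for each size k, score every node set of that size and keep the minimum. This is only feasible for toy inputs (the number of subsets grows combinatorially), which is why the talk turns to approximation algorithms. The function names, the degree-volume form of the score and the toy graph are assumptions made for illustration.

```python
from itertools import combinations


def conductance(adj, S):
    """cut(S) / min(vol(S), vol(V - S)) for an undirected, unweighted graph."""
    S = set(S)
    cut = sum(1 for u in S for v in adj[u] if v not in S)
    vol_S = sum(len(adj[u]) for u in S)
    vol_rest = sum(len(adj[u]) for u in adj if u not in S)
    return cut / min(vol_S, vol_rest)


def ncp(adj, max_k):
    """Brute-force NCP: best (lowest) conductance over all node sets of size k."""
    nodes = sorted(adj)
    return {
        k: min(conductance(adj, S) for S in combinations(nodes, k))
        for k in range(1, max_k + 1)
    }


# Hypothetical toy graph: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

profile = ncp(adj, max_k=3)
for k, phi in profile.items():
    print(k, round(phi, 3))
```

On this toy graph the profile drops as k grows towards 3, because a full triangle is a much better community than any single node or pair.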
15 Idea: use approximation algorithms for NP-hard graph partitioning problems as experimental probes of network structure. Spectral (quadratic approximation): confuses long paths with deep cuts. Multi-commodity flow (log(n) approximation): difficulty with expanders. SDP (sqrt(log(n)) approximation): best in theory. Metis (multi-resolution heuristic): common in practice. X+MQI: post-processing step on, e.g., MQI of Metis. Local Spectral: connected and tighter sets (empirically). Metis+MQI: best conductance (empirically). We are not interested in partitions per se, but in probing network structure.
16 [Figure panels: d-dimensional meshes; California road network]
17 Zachary's university karate club social network. During the study the club split into 2. The split (squares vs. circles) corresponds to cut B.
18 Collaborations between scientists in Networks [Newman, 2005]
19 [Ravasz & Barabasi, 2003] [Clauset, Moore & Newman, 2008]
20 Previously, researchers examined the community structure of small networks (~100 nodes). We examined more than 100 different large networks. Large networks look very different!
21 Typical example: General relativity collaboration network (4,158 nodes, 13,422 edges)
22 [Figure: NCP plot; x-axis: k (community size); y-axis: Φ(k) (conductance)] Up to ~100 nodes, communities get better and better; beyond that, communities get worse and worse. The best community has ~100 nodes.
23 Definition: a whisker is a maximal set of nodes connected to the network by a single edge. [Figure: NCP plot with the largest whisker marked] Whiskers are responsible for the downward slope of the NCP plot.
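One way to extract whiskers under this definition is sketched below: find all bridges, treat the largest piece that remains after deleting them as the core, and report each connected group of non-core nodes as a whisker. Taking the core to be the largest 2-edge-connected piece, assuming the input graph is connected, and the helper names are assumptions made for this example, not the authors' stated procedure.

```python
def find_bridges(adj):
    """Bridge edges (as frozensets) of an undirected simple graph, via DFS low-links."""
    disc, low, bridges = {}, {}, set()
    counter = [0]

    def dfs(u, parent):
        disc[u] = low[u] = counter[0]
        counter[0] += 1
        for v in adj[u]:
            if v == parent:
                continue
            if v in disc:
                low[u] = min(low[u], disc[v])
            else:
                dfs(v, u)
                low[u] = min(low[u], low[v])
                if low[v] > disc[u]:      # no back edge bypasses (u, v)
                    bridges.add(frozenset((u, v)))

    for s in adj:
        if s not in disc:
            dfs(s, None)
    return bridges


def components(nodes, adj, blocked=frozenset()):
    """Connected components of the subgraph induced on nodes, ignoring blocked edges."""
    nodes = set(nodes)
    seen, comps = set(), []
    for s in nodes:
        if s in seen:
            continue
        comp, stack = {s}, [s]
        seen.add(s)
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v in nodes and v not in seen and frozenset((u, v)) not in blocked:
                    seen.add(v)
                    comp.add(v)
                    stack.append(v)
        comps.append(comp)
    return comps


def whiskers(adj):
    """Maximal node sets hanging off the core by a single edge (graph assumed connected)."""
    bridges = find_bridges(adj)
    core = max(components(adj, adj, blocked=bridges), key=len)
    return components(set(adj) - core, adj)


# Hypothetical example: a 4-clique core with a tree whisker and a triangle whisker.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (3, 4), (4, 5), (4, 6),
         (0, 7), (7, 8), (7, 9), (8, 9)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

found = sorted(sorted(w) for w in whiskers(adj))
print(found)
```

The recursive DFS is fine for small examples; a large network would need an iterative version to avoid Python's recursion limit.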
24 Denser and denser core of the network. The core contains ~60% of the nodes and ~80% of the edges. Network structure: core-periphery (jellyfish, octopus). Whiskers are responsible for good communities.
25 Each new edge inside the community costs more. [Figure: NCP plot] Φ = 1/3 = 0.33, Φ = 2/4 = 0.5, Φ = 8/6 = 1.3, Φ = 64/14 = 4.6. Each node has twice as many children.
26 Whiskers in real networks are non-trivial (richer than trees). [Figure: whiskers, with the edge to cut marked]
27 Whiskers in real networks are larger than the whiskers expected based on density and degree sequence.
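A comparison "based on density and degree sequence" needs a null model with the same degrees. One common choice, sketched here, is degree-preserving rewiring by random double edge swaps; the deck's later plots mention a "rewired network", but whether the authors used exactly this procedure is an assumption, as are the function name and parameters.

```python
import random


def rewire(edges, n_swaps, seed=0):
    """Degree-preserving randomization via double edge swaps.

    Repeatedly picks two edges (a, b) and (c, d) and replaces them with
    (a, d) and (c, b), skipping swaps that would create a self-loop or a
    duplicate edge. Every node keeps its original degree.
    """
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done, attempts = 0, 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        a, b = edges[i]
        c, d = edges[j]
        if rng.random() < 0.5:            # randomize the orientation of the second edge
            c, d = d, c
        if len({a, b, c, d}) < 4:         # would create a self-loop
            continue
        if frozenset((a, d)) in present or frozenset((c, b)) in present:
            continue                      # would create a duplicate edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {frozenset((a, d)), frozenset((c, b))}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges


def degrees(edge_list):
    deg = {}
    for u, v in edge_list:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


# Hypothetical input: a ring of 10 nodes plus two chords.
original = [(i, (i + 1) % 10) for i in range(10)] + [(0, 5), (2, 7)]
rewired = rewire(original, n_swaps=20, seed=1)
```

Whisker sizes measured on `rewired` could then be compared with those of the original graph, since both share the same degree sequence.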
29 Nothing happens! Now we have 2-edge-connected whiskers to deal with. This indicates the recursiveness of our core-periphery structure: as we remove the periphery, the core itself breaks into a core and a periphery.
30 What if we allow cuts that give disconnected communities? Cut all whiskers. Compose communities out of whiskers. How good a community do we get?
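The "compose communities out of whiskers" step can be sketched as follows. Each whisker hangs off the network by one edge, so a union of j whiskers has exactly j cut edges, and for a fixed j the conductance is lowest when the j highest-volume whiskers are chosen. The input format, the function name and the assumption that the union holds less than half of the graph's total volume are illustrative choices, not taken from the slides.

```python
def bag_of_whiskers(whiskers):
    """Score unions of whiskers.

    whiskers: list of (num_nodes, num_internal_edges) pairs, one per whisker.
    Returns [(total_nodes, conductance), ...] for the union of the j
    highest-volume whiskers, j = 1, 2, ...

    Assumes the union holds less than half of the total graph volume, so the
    denominator of the conductance is the volume of the union itself.
    """
    def volume(w):
        # Sum of degrees inside a whisker: 2 per internal edge, plus 1 for the
        # single edge that attaches the whisker to the rest of the network.
        return 2 * w[1] + 1

    ranked = sorted(whiskers, key=volume, reverse=True)
    out, nodes, vol = [], 0, 0
    for j, w in enumerate(ranked, start=1):
        nodes += w[0]
        vol += volume(w)
        out.append((nodes, j / vol))      # j whiskers means j cut edges
    return out


# Hypothetical whiskers: a 3-node path, a triangle, and a single dangling node.
scores = bag_of_whiskers([(3, 2), (3, 3), (1, 0)])
for size, phi in scores:
    print(size, round(phi, 3))
```

This indexes the result by the number of whiskers rather than by an exact node count; hitting a prescribed size k exactly is a knapsack-style problem that this sketch does not attempt.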
31 [Figure: LiveJournal; curves: rewired network, Local Spectral, bag-of-whiskers, Metis+MQI]
32 Regularization properties: spectral embeddings stretch along directions in which the random walk mixes slowly. The resulting hyperplane cuts have "good" conductance, but may not yield the optimal cuts. [Figure panels: spectral embedding; flow-based embedding]
33 [Figure: ext/int; dots are connected clusters] Metis+MQI (red) gives sets with better conductance. Local Spectral (blue) gives tighter and more well-rounded sets.
34 [Figure panels: two ca. 500-node communities from Local Spectral; two ca. 500-node communities from Metis+MQI]
35 ... can be computed from: spectral embedding (independent of balance); SDP-based methods (for volume-balanced partitions).
36 What is a good model that explains such network structure? None of the existing models works. [Figure: NCP plots for preferential attachment, small world and geometric preferential attachment models; shape labels: flat, down and flat, flat and down]
37 Note: sparsity is the issue, not heavy tails per se. (Power laws with an exponent between 2 and 3 give us the appropriate sparsity.)
38 Forest Fire [LKF05]: connections spread like a fire. A new node joins the network, selects a seed node, connects to some of its neighbors, and continues recursively. Notes: Preferential attachment flavor: the second neighbor is not uniform at random. Copying flavor: since we burn the seed's neighbors. Hierarchical flavor: the seed is the parent. Local flavor: we burn near (in a diffusion sense) the seed vertex. As a community grows it blends into the core of the network.
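The growth process described on this slide can be sketched as a simplified, undirected simulation. The published Forest Fire model is directed and has separate forward and backward burning probabilities; the single burning probability `p`, the undirected edges and the function name used here are simplifying assumptions for illustration only.

```python
import random


def forest_fire(n, p=0.35, seed=0):
    """Grow an undirected graph on n nodes with a simplified forest-fire rule.

    Each new node picks a uniformly random seed node, then the fire spreads:
    from every burned node it burns a geometrically distributed number of
    not-yet-burned neighbours (mean p / (1 - p)), recursively. The new node
    links to every burned node, including the seed.
    """
    rng = random.Random(seed)
    adj = {0: set()}
    for new in range(1, n):
        start = rng.randrange(new)
        burned = {start}
        frontier = [start]
        while frontier:
            u = frontier.pop()
            candidates = [v for v in adj[u] if v not in burned]
            rng.shuffle(candidates)
            k = 0
            while rng.random() < p:       # number of successes before the first failure
                k += 1
            for v in candidates[:k]:
                burned.add(v)
                frontier.append(v)
        adj[new] = set()
        for v in burned:
            adj[new].add(v)
            adj[v].add(new)
    return adj


graph = forest_fire(50, p=0.35, seed=42)
num_edges = sum(len(nbrs) for nbrs in graph.values()) // 2
print(len(graph), "nodes,", num_edges, "edges")
```

Because the fire only spreads along existing edges near the seed, new links stay local, which matches the "local flavor" note on the slide.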
39 [Figure: curves labeled rewired network and bag of whiskers]
40 Whiskers: the largest whisker has ~100 nodes; whisker size is independent of network size. Core: 60% of the nodes, 80% of the edges; the core has little structure (hard to cut), but still more structure than the random network.
41 The Dunbar number: 150 individuals is the maximum community size. On-line communities have 60 members and break down at around 80; military, churches, divisions, etc. are all close to Dunbar's 150. Common bond vs. common identity theory: common-bond communities (people are attached to individual community members) are smaller and more cohesive; common-identity communities (people are attached to the group as a whole) are focused around a common interest and tend to be larger and more diverse. What edges mean and community identification: in social networks, the reasons an individual adds a link to a friend can vary enormously; in citation networks or web graphs, links are more expensive and are more semantically uniform.
42 Networks with ground-truth communities. LiveJournal12: users create and explicitly join on-line groups. DBLP co-authorships: publication venues can be viewed as communities. Amazon product co-purchasing: each item belongs to one or more hierarchically organized categories, as defined by Amazon. IMDB collaboration: countries of production and languages may be viewed as communities.
43 [Figure panels: LiveJournal, DBLP, Amazon, IMDB; curves: rewired network, ground truth]
44 The NCP plot is a way to analyze network community structure. Our results agree with previous work on small networks (people did not hit Dunbar's limit). But large networks are different: whiskers + core (core-periphery) structure. Small, well-isolated communities blend into the core of the network as they grow.
46 Assume a recursive Kronecker model. Fit it to G. We get K = [matrix lost in transcription]. What does this tell us about the network structure? [Figure labels: core, core-periphery, periphery; 0.9 edges; 0.1 edges] No communities. No good cuts. As opposed to: [matrix lost in transcription], which gives a hierarchy.
47 Assume a recursive Kronecker model. Fit it to G. We get K = [matrix lost in transcription]. What does this tell us about the network structure? [Figure labels: core, 0.9 edges; 0.5 edges; periphery, 0.1 edges] As opposed to: [matrix lost in transcription], which gives a hierarchy.
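As a rough illustration of what a recursive Kronecker model produces, the sketch below takes a 2 x 2 initiator matrix and forms its Kronecker power, whose entries are read as edge probabilities. The fitted K from the slide did not survive transcription, so the initiator values below are hypothetical, chosen only to echo the 0.9 / 0.5 / 0.1 labels; the function names are assumptions as well.

```python
def kron(A, B):
    """Kronecker product of two square matrices given as lists of lists."""
    n, m = len(A), len(B)
    return [
        [A[i // m][j // m] * B[i % m][j % m] for j in range(n * m)]
        for i in range(n * m)
    ]


def kron_power(K, k):
    """k-th Kronecker power of K (k >= 1)."""
    P = K
    for _ in range(k - 1):
        P = kron(P, K)
    return P


# Hypothetical initiator: dense core block (0.9), sparse periphery block (0.1),
# and moderate core-periphery linking (0.5). Not the fitted values from the talk.
K = [[0.9, 0.5],
     [0.5, 0.1]]

P = kron_power(K, 2)                      # 4 x 4 matrix of edge probabilities
expected_edges = sum(sum(row) for row in P)
for row in P:
    print([round(x, 2) for x in row])
```

With this kind of initiator the densest block of every power sits in the top-left corner and density falls off towards the bottom-right, a nested core-periphery pattern rather than a set of separate, equally dense communities.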
More informationClassification. 1 o Semestre 2007/2008
Classification Departamento de Engenharia Informática Instituto Superior Técnico 1 o Semestre 2007/2008 Slides baseados nos slides oficiais do livro Mining the Web c Soumen Chakrabarti. Outline 1 2 3 Single-Class
More informationClassification. Vladimir Curic. Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University
Classification Vladimir Curic Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University Outline An overview on classification Basics of classification How to choose appropriate
More informationCommunity Detection Using Random Walk Label Propagation Algorithm and PageRank Algorithm over Social Network
Community Detection Using Random Walk Label Propagation Algorithm and PageRank Algorithm over Social Network 1 Monika Kasondra, 2 Prof. Kamal Sutaria, 1 M.E. Student, 2 Assistent Professor, 1 Computer
More informationPart I: Data Mining Foundations
Table of Contents 1. Introduction 1 1.1. What is the World Wide Web? 1 1.2. A Brief History of the Web and the Internet 2 1.3. Web Data Mining 4 1.3.1. What is Data Mining? 6 1.3.2. What is Web Mining?
More informationClustering Part 4 DBSCAN
Clustering Part 4 Dr. Sanjay Ranka Professor Computer and Information Science and Engineering University of Florida, Gainesville DBSCAN DBSCAN is a density based clustering algorithm Density = number of
More informationAn Exploratory Journey Into Network Analysis A Gentle Introduction to Network Science and Graph Visualization
An Exploratory Journey Into Network Analysis A Gentle Introduction to Network Science and Graph Visualization Pedro Ribeiro (DCC/FCUP & CRACS/INESC-TEC) Part 1 Motivation and emergence of Network Science
More informationCluster Analysis. Ying Shen, SSE, Tongji University
Cluster Analysis Ying Shen, SSE, Tongji University Cluster analysis Cluster analysis groups data objects based only on the attributes in the data. The main objective is that The objects within a group
More informationNearest Neighbor with KD Trees
Case Study 2: Document Retrieval Finding Similar Documents Using Nearest Neighbors Machine Learning/Statistics for Big Data CSE599C1/STAT592, University of Washington Emily Fox January 22 nd, 2013 1 Nearest
More informationCommunity Analysis. Chapter 6
This chapter is from Social Media Mining: An Introduction. By Reza Zafarani, Mohammad Ali Abbasi, and Huan Liu. Cambridge University Press, 2014. Draft version: April 20, 2014. Complete Draft and Slides
More informationImage Segmentation. Shengnan Wang
Image Segmentation Shengnan Wang shengnan@cs.wisc.edu Contents I. Introduction to Segmentation II. Mean Shift Theory 1. What is Mean Shift? 2. Density Estimation Methods 3. Deriving the Mean Shift 4. Mean
More information