Scalable Network Analysis
|
|
- Gervais Mervin Manning
- 6 years ago
- Views:
Transcription
1 Inderjit S. Dhillon University of Texas at Austin COMAD, Ahmedabad, India Dec 20, 2013
2 Outline Unstructured Data - Scale & Diversity Evolving Networks Machine Learning Problems arising in Networks Recommender Systems Link Prediction Sign Prediction Formulation as Missing Value Estimation Scalable Algorithms NOMAD: Distributed matrix completion algorithm Results on Applications Conclusions 2
3 Structured Data Data organized into fields: Relational databases, spreadsheets, XML Highly optimized for storage & retrieval (e.g. using SQL) 3
4 Structured Data Focus is on data format, efficient storage & search Less or no uncertainty in semantics: e.g. businesses know the fields of the data 4
5 Unstructured Data Modern data is unstructured and diverse Networks, Graphs Text Images, Videos 5
6 Unstructured Data Much greater growth rate 6
7 Unstructured Data Dynamic aspects of unstructured data: Constantly evolving Uncertainties abound: What should I ask of the data? Seek insights Heterogeneity renders traditional database models inadequate 7
8 Unstructured Data Buzzwords - Big Data & Data Science Machine Learning: Predictive models for data Engineering perspective: Scale matters 8
9 Network Graphs Social networks (Friendship) APQ6 AQP1 AQP5 Bipartite networks (Membership, Ratings, etc.) MIP AQP2 GK Gene networks (Functional interaction) 9
10 Graph Evolution Social networks are highly dynamic Constantly grow, change quickly over time Users arrive/leave, relationships form/dissolve Understanding graph evolution is important 10
11 Graphs meet Machine Learning Network analysis: Understanding structure & evolution of networks Formulate predictive problems on the adjacency matrix of the graph Confluence of graph theory & machine learning 11
12 Users Recommender Systems Movies Link Prediction Netflix problem: 100M ratings, 0.5M users, 20K movies A Toy Problem In Comparison! Facebook growth 12
13 Users Recommender Systems Movies 13
14 Link Prediction in Social Networks?? Network at time T Network at time T + 1 Problem: Infer missing relationships from a given snapshot of the network 14
15 Predicting gene-disease links? 15
16 Signed Social Networks Like Dislike Like / Dislike? Sign Prediction Problem: Given a snapshot of the signed social network, predict the signs of missing edges 16
17 Formulation as Missing Value Estimation
18 Users Low-rank Matrix Completion Movies 18
19 Users Low-rank Matrix Completion Movies? 19
20 Low-rank Matrix Completion 20
21 Low-rank Matrix Completion min W 2R m k,h2r n k X (i,j)2 (A ij W i H T j ) 2 + (kw k 2 F + khk 2 F ) 21
22 Low-rank Matrix Completion? min W 2R m k,h2r n k X (i,j)2 (A ij W i H T j ) 2 + (kw k 2 F + khk 2 F ) 22
23 Low-rank Matrix Completion 3.44 min W 2R m k,h2r n k X (i,j)2 (A ij W i H T j ) 2 + (kw k 2 F + khk 2 F ) 23
24 Link Prediction Can be posed as matrix completion problem A (t) = Issue: Only positive relationships are observed Test Link Score (4,5) 1 (1,4) 1 (3,4) 1 (1,5) 1 24
25 Link Prediction Formulate Biased Matrix Completion Problem: min W 2R n k,h2r n k X (i,j)2 (A ij W i H T j ) 2 + X (i,j) /2 (A ij W i H T j ) 2 + (kw k 2 F + khk 2 F ) A (t) = Test Link Score (4,5) 0.59 (1,4) 0.59 (3,4) 0.67 (1,5)
26 Signed Social Networks Balanced Not balanced Friend of a friend is a friend Enemy of an enemy is a friend Social Balance [Harary,1953]: In real-world signed networks, triangles tend to be balanced 26
27 Signed Social Networks Theorem: All triangles in a network are balanced if and only if there exist two antagonistic groups. 27
28 Signed Social Networks Weakly Balanced Not balanced Relaxation: Weak balance Allow triangles with all negative edges 28
29 Signed Social Networks Theorem: All triangles in a network are weakly balanced if and only if there exist k antagonistic groups. 29
30 Sign Prediction Theorem: A k-weakly balanced signed network has rank at most k. Sign inference can be posed as low-rank matrix completion Theorem: If there are no small groups, the underlying network can be exactly recovered, under certain conditions. K. Chiang et al. Prediction and Clustering in Signed Networks: A Local to Global Perspective. To appear in JMLR. 30
31 Scalable Algorithms
32 Stochastic Gradient Sample random index (i,j) and update corresponding factors: w i w i ((A ij w T i h j )h j + w i ) h j h j ((A ij w T i h j )w i + h j ) Time per update O(k) Effective for very large-scale problems 32
33 Distributed Stochastic Gradient Descent (DSGD) [Gemulla et al. KDD 2011] Decoupled updates Easy to parallelize But communication & computation are interleaved 33
34 DSGD 34
35 DSGD Curse of the last reducer 35
36 DSGD 36
37 NOMAD Goal: Keep CPU & network simultaneously busy. Non-locking stochastic Multi-machine algorithm for Asynchronous & Decentralized matrix factorization Asynchronous distributed solution. 37
38 NOMAD Nomadic variables queue h 1 h 5 h 4 h 3 Native variables w 1 w 2 w 3 Workers w 4 h 2 38
39 NOMAD Nomadic variables queue h 1 h 5 h 3 Native variables Push h 4 w 1 w 2 w 3 Workers w 4 h 2 h 4 39
40 NOMAD 40
41 NOMAD 41
42 NOMAD 42
43 NOMAD 43
44 NOMAD 44
45 NOMAD 45
46 NOMAD 46
47 NOMAD 47
48 NOMAD 48
49 NOMAD 49
50 NOMAD 50
51 NOMAD Algorithm 1. Initialize: Randomly assign h j columns to worker queues 2. Parallel Foreach q in {1,2,...,p} 3. If queue[q] not empty then 4. (j, h j ) queue[q].pop() 5. for ( i, j) in (q) do j 6. Do SGD updates 7. end for 8. Sample q uniformly from {1,2,...,p} 9. queue[q ].push( (j, h j ) 10. end if 11. Parallel End 51
52 NOMAD Algorithm 1. Initialize: Randomly assign h j columns to worker queues 2. Parallel Foreach q in {1,2,...,p} 3. If queue[q] not empty then 4. (j, h j ) queue[q].pop() 5. for ( i, j) in (q) do j 6. Do SGD updates Distributed setting: 7. end for Write over the network 8. Sample q uniformly from {1,2,...,p} 9. queue[q ].push( (j, h j ) 10. end if 11. Parallel End Concurrent object 52
53 Algorithm Complexity Average space required per worker: O(mk/p + nk/p + /p) Average time for one sweep: O( k/p) 53
54 Results on Applications
55 Recommender Systems Netflix dataset: 2,649,429 users, 17,770 movies, ~100M ratings Multicore Distributed 55
56 Recommender Systems Synthetic dataset: ~85M users, 17,770 items, ~8.5B observations 56
57 Link Prediction Flickr dataset: 1.9M users & 42M links. Test set: sampled 5K users. 57
58 Predicting Gene-Disease Links CATAPULT(Blom et al.,2013) Katz Biased MF Test pair rank CATAPULT(Blom et al.,2013) Katz Biased MF Singleton gene rank Pr(Rank <= x) Pr(Rank <= x) Rank Rank 58
59 Sign Prediction Epinions dataset (+ve & -ve reviews): 131K nodes, 840K edges, 15% edges negative. MF-ALS is faster and achieves higher accuracy. MF-ALS takes 455 secs on network with 1.1M nodes & 120M edges 59
60 Conclusions Rapid growth of unstructured data demands scalable machine learning solutions for analysis Machine learning problems arising in network analysis can be cast in the matrix completion framework Our proposed asynchronous distributed algorithm NOMAD outperforms state-of-the-art matrix completion solvers Beyond Matrix Completion: Asynchronous distributed framework for solving machine learning problems 60
Scalable Clustering of Signed Networks Using Balance Normalized Cut
Scalable Clustering of Signed Networks Using Balance Normalized Cut Kai-Yang Chiang,, Inderjit S. Dhillon The 21st ACM International Conference on Information and Knowledge Management (CIKM 2012) Oct.
More informationSocial and Information Network Analysis Positive and Negative Relationships
Social and Information Network Analysis Positive and Negative Relationships Cornelia Caragea Department of Computer Science and Engineering University of North Texas June 14, 2016 To Recap Triadic Closure
More informationPositive and Negative Relations
Positive and Negative Relations CMSC 498J: Social Media Computing Department of Computer Science University of Maryland Spring 2016 Hadi Amiri hadi@umd.edu Lecture Topics Balanced and Unbalanced Networks
More informationLink Prediction for Social Network
Link Prediction for Social Network Ning Lin Computer Science and Engineering University of California, San Diego Email: nil016@eng.ucsd.edu Abstract Friendship recommendation has become an important issue
More informationLink Sign Prediction and Ranking in Signed Directed Social Networks
Noname manuscript No. (will be inserted by the editor) Link Sign Prediction and Ranking in Signed Directed Social Networks Dongjin Song David A. Meyer Received: date / Accepted: date Abstract Signed directed
More informationData Mining Techniques
Data Mining Techniques CS 6 - Section - Spring 7 Lecture Jan-Willem van de Meent (credit: Andrew Ng, Alex Smola, Yehuda Koren, Stanford CS6) Project Project Deadlines Feb: Form teams of - people 7 Feb:
More informationPositive and Negative Links
Positive and Negative Links Web Science (VU) (707.000) Elisabeth Lex KTI, TU Graz May 4, 2015 Elisabeth Lex (KTI, TU Graz) Networks May 4, 2015 1 / 66 Outline 1 Repetition 2 Motivation 3 Structural Balance
More informationCS224W Final Report Emergence of Global Status Hierarchy in Social Networks
CS224W Final Report Emergence of Global Status Hierarchy in Social Networks Group 0: Yue Chen, Jia Ji, Yizheng Liao December 0, 202 Introduction Social network analysis provides insights into a wide range
More informationECS289: Scalable Machine Learning
ECS289: Scalable Machine Learning Cho-Jui Hsieh UC Davis Sept 22, 2016 Course Information Website: http://www.stat.ucdavis.edu/~chohsieh/teaching/ ECS289G_Fall2016/main.html My office: Mathematical Sciences
More informationData Mining Techniques
Data Mining Techniques CS 60 - Section - Fall 06 Lecture Jan-Willem van de Meent (credit: Andrew Ng, Alex Smola, Yehuda Koren, Stanford CS6) Recommender Systems The Long Tail (from: https://www.wired.com/00/0/tail/)
More informationDistributed stochastic gradient descent for link prediction in signed social networks
Zhang et al. EURASIP Journal on Advances in Signal Processing (2019) 2019:3 https://doi.org/10.1186/s13634-019-0601-0 EURASIP Journal on Advances in Signal Processing RESEARCH Open Access Distributed stochastic
More informationCS224W: Social and Information Network Analysis Jure Leskovec, Stanford University
CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu 10/10/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu
More informationScalable Clustering of Signed Networks Using Balance Normalized Cut
Scalable Clustering of Signed Networks Using Balance Normalized Cut Kai-Yang Chiang Dept. of Computer Science University of Texas at Austin kychiang@cs.utexas.edu Joyce Jiyoung Whang Dept. of Computer
More informationGraph Mining: Overview of different graph models
Graph Mining: Overview of different graph models Davide Mottin, Konstantina Lazaridou Hasso Plattner Institute Graph Mining course Winter Semester 2016 Lecture road Anomaly detection (previous lecture)
More informationCS224W: Social and Information Network Analysis Project Report: Edge Detection in Review Networks
CS224W: Social and Information Network Analysis Project Report: Edge Detection in Review Networks Archana Sulebele, Usha Prabhu, William Yang (Group 29) Keywords: Link Prediction, Review Networks, Adamic/Adar,
More informationCS249: ADVANCED DATA MINING
CS249: ADVANCED DATA MINING Recommender Systems II Instructor: Yizhou Sun yzsun@cs.ucla.edu May 31, 2017 Recommender Systems Recommendation via Information Network Analysis Hybrid Collaborative Filtering
More informationIMP: A Message-Passing Algorithm for Matrix Completion
IMP: A Message-Passing Algorithm for Matrix Completion Henry D. Pfister (joint with Byung-Hak Kim and Arvind Yedla) Electrical and Computer Engineering Texas A&M University ISTC 2010 Brest, France September
More informationV2: Measures and Metrics (II)
- Betweenness Centrality V2: Measures and Metrics (II) - Groups of Vertices - Transitivity - Reciprocity - Signed Edges and Structural Balance - Similarity - Homophily and Assortative Mixing 1 Betweenness
More informationSupervised Random Walks
Supervised Random Walks Pawan Goyal CSE, IITKGP September 8, 2014 Pawan Goyal (IIT Kharagpur) Supervised Random Walks September 8, 2014 1 / 17 Correlation Discovery by random walk Problem definition Estimate
More informationAnalyzing Stochastic Gradient Descent for Some Non- Convex Problems
Analyzing Stochastic Gradient Descent for Some Non- Convex Problems Christopher De Sa Soon at Cornell University cdesa@stanford.edu stanford.edu/~cdesa Kunle Olukotun Christopher Ré Stanford University
More informationAspEm: Embedding Learning by Aspects in Heterogeneous Information Networks
AspEm: Embedding Learning by Aspects in Heterogeneous Information Networks Yu Shi, Huan Gui, Qi Zhu, Lance Kaplan, Jiawei Han University of Illinois at Urbana-Champaign (UIUC) Facebook Inc. U.S. Army Research
More informationLearning to Match. Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li
Learning to Match Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li 1. Introduction The main tasks in many applications can be formalized as matching between heterogeneous objects, including search, recommendation,
More informationPractice Questions for Midterm
Practice Questions for Midterm - 10-605 Oct 14, 2015 (version 1) 10-605 Name: Fall 2015 Sample Questions Andrew ID: Time Limit: n/a Grade Table (for teacher use only) Question Points Score 1 6 2 6 3 15
More informationQUINT: On Query-Specific Optimal Networks
QUINT: On Query-Specific Optimal Networks Presenter: Liangyue Li Joint work with Yuan Yao (NJU) -1- Jie Tang (Tsinghua) Wei Fan (Baidu) Hanghang Tong (ASU) Node Proximity: What? Node proximity: the closeness
More informationGraph Data Management
Graph Data Management Analysis and Optimization of Graph Data Frameworks presented by Fynn Leitow Overview 1) Introduction a) Motivation b) Application for big data 2) Choice of algorithms 3) Choice of
More informationAlgorithms and Applications in Social Networks. 2017/2018, Semester B Slava Novgorodov
Algorithms and Applications in Social Networks 2017/2018, Semester B Slava Novgorodov 1 Lesson #1 Administrative questions Course overview Introduction to Social Networks Basic definitions Network properties
More informationDatabases and Information Retrieval Integration TIETS42. Kostas Stefanidis Autumn 2016
+ Databases and Information Retrieval Integration TIETS42 Autumn 2016 Kostas Stefanidis kostas.stefanidis@uta.fi http://www.uta.fi/sis/tie/dbir/index.html http://people.uta.fi/~kostas.stefanidis/dbir16/dbir16-main.html
More informationGraph Data Management Systems in New Applications Domains. Mikko Halin
Graph Data Management Systems in New Applications Domains Mikko Halin Introduction Presentation is based on two papers Graph Data Management Systems for New Application Domains - Philippe Cudré-Mauroux,
More informationConflict Graphs for Parallel Stochastic Gradient Descent
Conflict Graphs for Parallel Stochastic Gradient Descent Darshan Thaker*, Guneet Singh Dhillon* Abstract We present various methods for inducing a conflict graph in order to effectively parallelize Pegasos.
More informationNavigating the Maze of Graph Analytics Frameworks using Massive Graph Datasets
Navigating the Maze of Graph Analytics Frameworks using Massive Graph Datasets Nadathur Satish, Narayanan Sundaram, Mostofa Ali Patwary, Jiwon Seo, Jongsoo Park, M. Amber Hassaan, Shubho Sengupta, Zhaoming
More informationMachine Learning Basics: Stochastic Gradient Descent. Sargur N. Srihari
Machine Learning Basics: Stochastic Gradient Descent Sargur N. srihari@cedar.buffalo.edu 1 Topics 1. Learning Algorithms 2. Capacity, Overfitting and Underfitting 3. Hyperparameters and Validation Sets
More informationCSE 258. Web Mining and Recommender Systems. Advanced Recommender Systems
CSE 258 Web Mining and Recommender Systems Advanced Recommender Systems This week Methodological papers Bayesian Personalized Ranking Factorizing Personalized Markov Chains Personalized Ranking Metric
More informationMassive Data Analysis
Professor, Department of Electrical and Computer Engineering Tennessee Technological University February 25, 2015 Big Data This talk is based on the report [1]. The growth of big data is changing that
More informationWHITE PAPER NGINX An Open Source Platform of Choice for Enterprise Website Architectures
ASHNIK PTE LTD. White Paper WHITE PAPER NGINX An Open Source Platform of Choice for Enterprise Website Architectures Date: 10/12/2014 Company Name: Ashnik Pte Ltd. Singapore By: Sandeep Khuperkar, Director
More informationDeepWalk: Online Learning of Social Representations
DeepWalk: Online Learning of Social Representations ACM SIG-KDD August 26, 2014, Rami Al-Rfou, Steven Skiena Stony Brook University Outline Introduction: Graphs as Features Language Modeling DeepWalk Evaluation:
More informationResource and Performance Distribution Prediction for Large Scale Analytics Queries
Resource and Performance Distribution Prediction for Large Scale Analytics Queries Prof. Rajiv Ranjan, SMIEEE School of Computing Science, Newcastle University, UK Visiting Scientist, Data61, CSIRO, Australia
More informationSpectral Clustering and Community Detection in Labeled Graphs
Spectral Clustering and Community Detection in Labeled Graphs Brandon Fain, Stavros Sintos, Nisarg Raval Machine Learning (CompSci 571D / STA 561D) December 7, 2015 {btfain, nisarg, ssintos} at cs.duke.edu
More informationCSE 158 Lecture 8. Web Mining and Recommender Systems. Extensions of latent-factor models, (and more on the Netflix prize)
CSE 158 Lecture 8 Web Mining and Recommender Systems Extensions of latent-factor models, (and more on the Netflix prize) Summary so far Recap 1. Measuring similarity between users/items for binary prediction
More informationLearning Low-rank Transformations: Algorithms and Applications. Qiang Qiu Guillermo Sapiro
Learning Low-rank Transformations: Algorithms and Applications Qiang Qiu Guillermo Sapiro Motivation Outline Low-rank transform - algorithms and theories Applications Subspace clustering Classification
More informationBatch-Incremental vs. Instance-Incremental Learning in Dynamic and Evolving Data
Batch-Incremental vs. Instance-Incremental Learning in Dynamic and Evolving Data Jesse Read 1, Albert Bifet 2, Bernhard Pfahringer 2, Geoff Holmes 2 1 Department of Signal Theory and Communications Universidad
More informationRecommender System. What is it? How to build it? Challenges. R package: recommenderlab
Recommender System What is it? How to build it? Challenges R package: recommenderlab 1 What is a recommender system Wiki definition: A recommender system or a recommendation system (sometimes replacing
More informationHarp-DAAL for High Performance Big Data Computing
Harp-DAAL for High Performance Big Data Computing Large-scale data analytics is revolutionizing many business and scientific domains. Easy-touse scalable parallel techniques are necessary to process big
More informationGraph-Parallel Problems. ML in the Context of Parallel Architectures
Case Study 4: Collaborative Filtering Graph-Parallel Problems Synchronous v. Asynchronous Computation Machine Learning for Big Data CSE547/STAT548, University of Washington Emily Fox February 20 th, 2014
More informationCS 5220: Parallel Graph Algorithms. David Bindel
CS 5220: Parallel Graph Algorithms David Bindel 2017-11-14 1 Graphs Mathematically: G = (V, E) where E V V Convention: V = n and E = m May be directed or undirected May have weights w V : V R or w E :
More informationParallelism. CS6787 Lecture 8 Fall 2017
Parallelism CS6787 Lecture 8 Fall 2017 So far We ve been talking about algorithms We ve been talking about ways to optimize their parameters But we haven t talked about the underlying hardware How does
More informationECS289: Scalable Machine Learning
ECS289: Scalable Machine Learning Cho-Jui Hsieh UC Davis Oct 4, 2016 Outline Multi-core v.s. multi-processor Parallel Gradient Descent Parallel Stochastic Gradient Parallel Coordinate Descent Parallel
More informationPerformance and Scalability
Measuring Critical Manufacturing cmnavigo: Affordable MES Performance and Scalability for Time-Critical Industry Environments Critical Manufacturing, 2015 Summary Proving Performance and Scalability Time-Critical
More informationNon-exhaustive, Overlapping k-means
Non-exhaustive, Overlapping k-means J. J. Whang, I. S. Dhilon, and D. F. Gleich Teresa Lebair University of Maryland, Baltimore County October 29th, 2015 Teresa Lebair UMBC 1/38 Outline Introduction NEO-K-Means
More informationDeepFace: Closing the Gap to Human-Level Performance in Face Verification
DeepFace: Closing the Gap to Human-Level Performance in Face Verification Report on the paper Artem Komarichev February 7, 2016 Outline New alignment technique New DNN architecture New large dataset with
More informationMachine Learning Basics. Sargur N. Srihari
Machine Learning Basics Sargur N. srihari@cedar.buffalo.edu 1 Overview Deep learning is a specific type of ML Necessary to have a solid understanding of the basic principles of ML 2 Topics Stochastic Gradient
More informationCPSC 340: Machine Learning and Data Mining. Recommender Systems Fall 2017
CPSC 340: Machine Learning and Data Mining Recommender Systems Fall 2017 Assignment 4: Admin Due tonight, 1 late day for Monday, 2 late days for Wednesday. Assignment 5: Posted, due Monday of last week
More informationmodern database systems lecture 10 : large-scale graph processing
modern database systems lecture 1 : large-scale graph processing Aristides Gionis spring 18 timeline today : homework is due march 6 : homework out april 5, 9-1 : final exam april : homework due graphs
More informationConstrained Convolutional Neural Networks for Weakly Supervised Segmentation. Deepak Pathak, Philipp Krähenbühl and Trevor Darrell
Constrained Convolutional Neural Networks for Weakly Supervised Segmentation Deepak Pathak, Philipp Krähenbühl and Trevor Darrell 1 Multi-class Image Segmentation Assign a class label to each pixel in
More informationMining Web Data. Lijun Zhang
Mining Web Data Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction Web Crawling and Resource Discovery Search Engine Indexing and Query Processing Ranking Algorithms Recommender Systems
More informationModularity CMSC 858L
Modularity CMSC 858L Module-detection for Function Prediction Biological networks generally modular (Hartwell+, 1999) We can try to find the modules within a network. Once we find modules, we can look
More informationMIND: Machine Learning based Network Dynamics. Dr. Yanhui Geng Huawei Noah s Ark Lab, Hong Kong
MIND: Machine Learning based Network Dynamics Dr. Yanhui Geng Huawei Noah s Ark Lab, Hong Kong Outline Challenges with traditional SDN MIND architecture Experiment results Conclusion Challenges with Traditional
More informationUncovering the Formation of Triadic Closure in Social Networks. Zhanpeng Fang and Jie Tang Tsinghua University
Uncovering the Formation of Triadic Closure in Social Networks Zhanpeng Fang and Jie Tang Tsinghua University 1 Triangle Laws Triangle is one of most basic human groups in social networks Friends of friends
More informationStructure of Social Networks
Structure of Social Networks Outline Structure of social networks Applications of structural analysis Social *networks* Twitter Facebook Linked-in IMs Email Real life Address books... Who Twitter #numbers
More informationChallenges in Multiresolution Methods for Graph-based Learning
Challenges in Multiresolution Methods for Graph-based Learning Michael W. Mahoney ICSI and Dept of Statistics, UC Berkeley ( For more info, see: http: // www. stat. berkeley. edu/ ~ mmahoney or Google
More informationDecentralized and Distributed Machine Learning Model Training with Actors
Decentralized and Distributed Machine Learning Model Training with Actors Travis Addair Stanford University taddair@stanford.edu Abstract Training a machine learning model with terabytes to petabytes of
More informationWhy do we need graph processing?
Why do we need graph processing? Community detection: suggest followers? Determine what products people will like Count how many people are in different communities (polling?) Graphs are Everywhere Group
More informationParallel Stochastic Gradient Descent: The case for native GPU-side GPI
Parallel Stochastic Gradient Descent: The case for native GPU-side GPI J. Keuper Competence Center High Performance Computing Fraunhofer ITWM, Kaiserslautern, Germany Mark Silberstein Accelerated Computer
More informationTGNet: Learning to Rank Nodes in Temporal Graphs. Qi Song 1 Bo Zong 2 Yinghui Wu 1,3 Lu-An Tang 2 Hui Zhang 2 Guofei Jiang 2 Haifeng Chen 2
TGNet: Learning to Rank Nodes in Temporal Graphs Qi Song 1 Bo Zong 2 Yinghui Wu 1,3 Lu-An Tang 2 Hui Zhang 2 Guofei Jiang 2 Haifeng Chen 2 1 2 3 Node Ranking in temporal graphs Temporal graphs have been
More informationComplex-Network Modelling and Inference
Complex-Network Modelling and Inference Lecture 8: Graph features (2) Matthew Roughan http://www.maths.adelaide.edu.au/matthew.roughan/notes/ Network_Modelling/ School
More informationParadigm Shift of Database
Paradigm Shift of Database Prof. A. A. Govande, Assistant Professor, Computer Science and Applications, V. P. Institute of Management Studies and Research, Sangli Abstract Now a day s most of the organizations
More informationConvex Optimization MLSS 2015
Convex Optimization MLSS 2015 Constantine Caramanis The University of Texas at Austin The Optimization Problem minimize : f (x) subject to : x X. The Optimization Problem minimize : f (x) subject to :
More informationBalanced and partitionable signed graphs
A. Mrvar: Balanced and partitionable signed graphs 1 Balanced and partitionable signed graphs Signed graphs Signed graph is an ordered pair (G,σ), where: G = (V,L) is a graph with a set of vertices V and
More informationCOMP6237 Data Mining Data Mining & Machine Learning with Big Data. Jonathon Hare
COMP6237 Data Mining Data Mining & Machine Learning with Big Data Jonathon Hare jsh2@ecs.soton.ac.uk Contents Going to look at two case-studies looking at how we can make machine-learning algorithms work
More informationGaia: Geo-Distributed Machine Learning Approaching LAN Speeds
Gaia: Geo-Distributed Machine Learning Approaching LAN Speeds Kevin Hsieh Aaron Harlap, Nandita Vijaykumar, Dimitris Konomis, Gregory R. Ganger, Phillip B. Gibbons, Onur Mutlu Machine Learning and Big
More informationAn Exploratory Journey Into Network Analysis A Gentle Introduction to Network Science and Graph Visualization
An Exploratory Journey Into Network Analysis A Gentle Introduction to Network Science and Graph Visualization Pedro Ribeiro (DCC/FCUP & CRACS/INESC-TEC) Part 1 Motivation and emergence of Network Science
More informationCS224W Project: Recommendation System Models in Product Rating Predictions
CS224W Project: Recommendation System Models in Product Rating Predictions Xiaoye Liu xiaoye@stanford.edu Abstract A product recommender system based on product-review information and metadata history
More informationExtracting Communities from Networks
Extracting Communities from Networks Ji Zhu Department of Statistics, University of Michigan Joint work with Yunpeng Zhao and Elizaveta Levina Outline Review of community detection Community extraction
More informationMetric Learning for Large-Scale Image Classification:
Metric Learning for Large-Scale Image Classification: Generalizing to New Classes at Near-Zero Cost Florent Perronnin 1 work published at ECCV 2012 with: Thomas Mensink 1,2 Jakob Verbeek 2 Gabriela Csurka
More informationA Brief Look at Optimization
A Brief Look at Optimization CSC 412/2506 Tutorial David Madras January 18, 2018 Slides adapted from last year s version Overview Introduction Classes of optimization problems Linear programming Steepest
More informationActiveClean: Interactive Data Cleaning For Statistical Modeling. Safkat Islam Carolyn Zhang CS 590
ActiveClean: Interactive Data Cleaning For Statistical Modeling Safkat Islam Carolyn Zhang CS 590 Outline Biggest Takeaways, Strengths, and Weaknesses Background System Architecture Updating the Model
More informationFastText. Jon Koss, Abhishek Jindal
FastText Jon Koss, Abhishek Jindal FastText FastText is on par with state-of-the-art deep learning classifiers in terms of accuracy But it is way faster: FastText can train on more than one billion words
More informationApache Giraph. for applications in Machine Learning & Recommendation Systems. Maria Novartis
Apache Giraph for applications in Machine Learning & Recommendation Systems Maria Stylianou @marsty5 Novartis Züri Machine Learning Meetup #5 June 16, 2014 Apache Giraph for applications in Machine Learning
More informationNetworks in economics and finance. Lecture 1 - Measuring networks
Networks in economics and finance Lecture 1 - Measuring networks What are networks and why study them? A network is a set of items (nodes) connected by edges or links. Units (nodes) Individuals Firms Banks
More informationCase Study 4: Collaborative Filtering. GraphLab
Case Study 4: Collaborative Filtering GraphLab Machine Learning/Statistics for Big Data CSE599C1/STAT592, University of Washington Carlos Guestrin March 14 th, 2013 Carlos Guestrin 2013 1 Social Media
More informationLocal Community Detection in Dynamic Graphs Using Personalized Centrality
algorithms Article Local Community Detection in Dynamic Graphs Using Personalized Centrality Eisha Nathan, Anita Zakrzewska, Jason Riedy and David A. Bader * School of Computational Science and Engineering,
More informationEpilog: Further Topics
Ludwig-Maximilians-Universität München Institut für Informatik Lehr- und Forschungseinheit für Datenbanksysteme Knowledge Discovery in Databases SS 2016 Epilog: Further Topics Lecture: Prof. Dr. Thomas
More informationWeb Personalization & Recommender Systems
Web Personalization & Recommender Systems COSC 488 Slides are based on: - Bamshad Mobasher, Depaul University - Recent publications: see the last page (Reference section) Web Personalization & Recommender
More informationIntroduction to Data Management. Lecture #1 (Course Trailer )
Introduction to Data Management Lecture #1 (Course Trailer ) Instructor: Mike Carey mjcarey@ics.uci.edu Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Today s Topics v Welcome to one
More informationLatent Space Model for Road Networks to Predict Time-Varying Traffic. Presented by: Rob Fitzgerald Spring 2017
Latent Space Model for Road Networks to Predict Time-Varying Traffic Presented by: Rob Fitzgerald Spring 2017 Definition of Latent https://en.oxforddictionaries.com/definition/latent Latent Space Model?
More informationCSE 190 Lecture 16. Data Mining and Predictive Analytics. Small-world phenomena
CSE 190 Lecture 16 Data Mining and Predictive Analytics Small-world phenomena Another famous study Stanley Milgram wanted to test the (already popular) hypothesis that people in social networks are separated
More informationover Multi Label Images
IBM Research Compact Hashing for Mixed Image Keyword Query over Multi Label Images Xianglong Liu 1, Yadong Mu 2, Bo Lang 1 and Shih Fu Chang 2 1 Beihang University, Beijing, China 2 Columbia University,
More informationPhylogenetics on CUDA (Parallel) Architectures Bradly Alicea
Descent w/modification Descent w/modification Descent w/modification Descent w/modification CPU Descent w/modification Descent w/modification Phylogenetics on CUDA (Parallel) Architectures Bradly Alicea
More informationAlgorithmic and Economic Aspects of Networks. Nicole Immorlica
Algorithmic and Economic Aspects of Networks Nicole Immorlica Syllabus 1. Jan. 8 th (today): Graph theory, network structure 2. Jan. 15 th : Random graphs, probabilistic network formation 3. Jan. 20 th
More informationDynamic and Historical Shortest-Path Distance Queries on Large Evolving Networks by Pruned Landmark Labeling
2014/04/09 @ WWW 14 Dynamic and Historical Shortest-Path Distance Queries on Large Evolving Networks by Pruned Landmark Labeling Takuya Akiba (U Tokyo) Yoichi Iwata (U Tokyo) Yuichi Yoshida (NII & PFI)
More informationMining Data that Changes. 17 July 2015
Mining Data that Changes 17 July 2015 Data is Not Static Data is not static New transactions, new friends, stop following somebody in Twitter, But most data mining algorithms assume static data Even a
More informationGraph based machine learning with applications to media analytics
Graph based machine learning with applications to media analytics Lei Ding, PhD 9-1-2011 with collaborators at Outline Graph based machine learning Basic structures Algorithms Examples Applications in
More informationJAVASCRIPT CHARTING. Scaling for the Enterprise with Metric Insights Copyright Metric insights, Inc.
JAVASCRIPT CHARTING Scaling for the Enterprise with Metric Insights 2013 Copyright Metric insights, Inc. A REVOLUTION IS HAPPENING... 3! Challenges... 3! Borrowing From The Enterprise BI Stack... 4! Visualization
More informationProgress Report: Collaborative Filtering Using Bregman Co-clustering
Progress Report: Collaborative Filtering Using Bregman Co-clustering Wei Tang, Srivatsan Ramanujam, and Andrew Dreher April 4, 2008 1 Introduction Analytics are becoming increasingly important for business
More informationLDA for Big Data - Outline
LDA FOR BIG DATA 1 LDA for Big Data - Outline Quick review of LDA model clustering words-in-context Parallel LDA ~= IPM Fast sampling tricks for LDA Sparsified sampler Alias table Fenwick trees LDA for
More informationKnow your neighbours: Machine Learning on Graphs
Know your neighbours: Machine Learning on Graphs Andrew Docherty Senior Research Engineer andrew.docherty@data61.csiro.au www.data61.csiro.au 2 Graphs are Everywhere Online Social Networks Transportation
More informationLarge-Scale Lasso and Elastic-Net Regularized Generalized Linear Models
Large-Scale Lasso and Elastic-Net Regularized Generalized Linear Models DB Tsai Steven Hillion Outline Introduction Linear / Nonlinear Classification Feature Engineering - Polynomial Expansion Big-data
More informationarxiv: v1 [cs.si] 8 Jun 2012
Multi-Scale Link Prediction Donghyuk Shin Si Si Inderjit S. Dhillon arxiv:1206.1891v1 [cs.si] 8 Jun 2012 Abstract The automated analysis of social networks has become an important problem due to the proliferation
More informationSTREAMING RANKING BASED RECOMMENDER SYSTEMS
STREAMING RANKING BASED RECOMMENDER SYSTEMS Weiqing Wang, Hongzhi Yin, Zi Huang, Qinyong Wang, Xingzhong Du, Quoc Viet Hung Nguyen University of Queensland, Australia & Griffith University, Australia July
More informationSurvey on Process in Scalable Big Data Management Using Data Driven Model Frame Work
Survey on Process in Scalable Big Data Management Using Data Driven Model Frame Work Dr. G.V. Sam Kumar Faculty of Computer Studies, Ministry of Education, Republic of Maldives ABSTRACT: Data in rapid
More informationDatabases 2 (VU) ( / )
Databases 2 (VU) (706.711 / 707.030) MapReduce (Part 3) Mark Kröll ISDS, TU Graz Nov. 27, 2017 Mark Kröll (ISDS, TU Graz) MapReduce Nov. 27, 2017 1 / 42 Outline 1 Problems Suited for Map-Reduce 2 MapReduce:
More information