SOFIA: Social Filtering for Niche Markets

1 SOFIA: Social Filtering for Niche Markets
Matteo Dell'Amico, Licia Capra, University College London
UCL MobiSys Seminar, 9 October 2007

2 Outline
1 Social Filtering: Competence: Taste Similarity; Intent: Trust Transitivity; The Social Filtering Pattern
2 From HITS to SOFIA
3 Datasets; Hidden Judgements; Sybil Attacks

3 The Long Tail (Chris Anderson, 2006)
Digital distribution: millions of different products are available to consumers.
An enormous market for niche content is appearing.
Users need help to find interesting content.
Filters are essential to connect supply and demand.
Our problem: creating an efficient and robust filter.

6 Collaborative Filtering
Which items might I like? Let's look at what similar users did.
Similarity in reviews, behaviour...
Similar users are competent: they express (subjective!) judgements we agree with.

7 Propagating Trust: Competence
Alice expressed judgement X ("I like eating at SOAS").
Bob agrees with Alice on X, therefore Alice ranks Bob as a competent evaluator.
Bob also expressed judgement Y ("They make good burgers at ULU").
Alice decides to trust Bob's advice and tries ULU.

8 Sybil Attack
How to trick Alice? Create lots of false users (Sybils) that copy Alice's judgements. All Sybils vote for a malicious judgement whose ranking they want to increase. Since the Sybils look competent to Alice, she will trust them.
Also known as: profile injection, shilling (collaborative filtering), web spam (webpage ranking).
In Social Filtering, Alice leverages her social ties to isolate Sybils.

11 Propagating Trust: Intent
Web of Trust: a social network where A links to B if A trusts B to behave honestly. It is created explicitly by users (e.g., Facebook) or automatically (e.g., from logs).
Trust Transitivity: I trust the friends of my friends.
Alice thinks Bob is honest. Bob recommends Charlie to Alice. Since Alice trusts Bob, she decides to trust Charlie as well. Iteratively, Alice derives trust for Dave.

12 Isolating Sybils
There is no way to recognize legitimate users only by looking at their judgements.
It is costly for the attacker to convince honest users to trust it.
A small number of honest users are connected to the Sybil network via attack edges (Yu et al., ACM SIGCOMM '06).
We can isolate Sybils if we limit the amount of trust propagated through the attack edges.

13 Discussing Trust Transitivity
Pro: Users get trusted if they behave honestly. If reciprocative behaviour is adopted, the rational choice for selfish users is to behave honestly (Feldman et al., ACM EC '04). Sybils can get isolated.
Con: Trust transitivity does not take into account the tastes of the users. This is a big problem in niches, where subjectivity is extreme.

16 Propagating Trust: Social Filtering
We trust users who are both willing and able to give good judgements.
Alice trusts Dave's intent because a path in the web of trust connects her to him.
She trusts his competence because they agree on X.
Since Dave is honest and competent, Alice trusts his judgement Y.

18 PageRank
Google's algorithm to rank the importance of web pages.
Intuitive consideration: an authoritative page is linked to by many authoritative pages.
A random surfer following links at random is more likely to stumble upon the more important pages.
With probability 1 − α of stopping at each step, PageRank computes the probability that the random surfer stops at any given page.
In Webs of Trust: the same principle applies, since reputable users are recommended by other reputable users. We swap the WWW graph with the social network.
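
The random-surfer description above can be sketched as a short power iteration. This is a minimal illustration, not the speakers' implementation: the function name `pagerank`, the default `alpha=0.85`, the uniform restart, and the handling of nodes without outgoing links are all assumptions made for the example.

```python
def pagerank(graph, alpha=0.85, iterations=100):
    """Approximate PageRank by power iteration.

    graph maps every node to the list of nodes it links to; every linked
    node must also appear as a key (assumption of this sketch).
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # With probability 1 - alpha the surfer stops and restarts
        # at a node chosen uniformly at random.
        new_rank = {node: (1.0 - alpha) / n for node in nodes}
        for node in nodes:
            links = graph[node]
            if links:
                share = alpha * rank[node] / len(links)
                for target in links:
                    new_rank[target] += share
            else:
                # Node without outgoing links: spread its weight uniformly
                # (a common convention, assumed here).
                for target in nodes:
                    new_rank[target] += alpha * rank[node] / n
        rank = new_rank
    return rank


if __name__ == "__main__":
    toy_web = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}
    for node, score in sorted(pagerank(toy_web).items()):
        print(node, round(score, 3))
```

In the toy graph, node `d` has no incoming links, so it only receives restart weight and ends up with the lowest score.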

20 Personalized PageRank
PageRank does not take into account subjectivity, which is essential to isolate Sybil nodes.
We force the random walk to start at the evaluating node: this ensures that the walk starts at an honest node.
The trust obtained by Sybil nodes is limited by the probability of following an attack edge.
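
A small sketch of the personalized variant, under stated assumptions: the walk restarts at the evaluating node, dead ends send the walk back to the evaluator, and the toy web of trust (`toy_network`, with three honest users, a ring of Sybils, and one attack edge from `carol`) is invented for illustration. In this toy the Sybils never link back to honest users.

```python
def personalized_pagerank(web_of_trust, evaluator, alpha=0.85, iterations=100):
    """Trust of each node as seen by `evaluator`.

    web_of_trust maps every node to the list of nodes it trusts.
    """
    rank = {node: 0.0 for node in web_of_trust}
    rank[evaluator] = 1.0
    for _ in range(iterations):
        new_rank = {node: 0.0 for node in web_of_trust}
        # With probability 1 - alpha the walk stops and restarts at the evaluator.
        new_rank[evaluator] = 1.0 - alpha
        for node, trusted in web_of_trust.items():
            if not trusted:
                # Dead end: assumed here to send the walk back to the evaluator.
                new_rank[evaluator] += alpha * rank[node]
                continue
            share = alpha * rank[node] / len(trusted)
            for other in trusted:
                new_rank[other] += share
        rank = new_rank
    return rank


def toy_network(num_sybils):
    """Three honest users plus a ring of Sybils behind one attack edge."""
    sybils = [f"sybil{i}" for i in range(num_sybils)]
    graph = {
        "alice": ["bob", "carol"],
        "bob": ["alice", "carol"],
        "carol": ["alice", "bob", sybils[0]],  # the single attack edge
    }
    for i, sybil in enumerate(sybils):
        graph[sybil] = [sybils[(i + 1) % num_sybils]]
    return graph, sybils


def sybil_trust(num_sybils, alpha):
    """Total trust that alice assigns to the whole Sybil region."""
    graph, sybils = toy_network(num_sybils)
    rank = personalized_pagerank(graph, "alice", alpha)
    return sum(rank[sybil] for sybil in sybils)
```

In this toy, `sybil_trust(2, 0.85)` and `sybil_trust(50, 0.85)` coincide up to rounding: creating more Sybils behind the same attack edge does not increase the trust the Sybil region receives, and a lower `alpha` reduces it.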

21 Personalized PageRank - The α Parameter (1)
α: probability that our random walk continues at each step.
Low α implies shorter paths.
Pro: Fast convergence. Close social ties may have related tastes (e.g., my friends listen to similar music).
Con: We don't trust honest users because they're socially far away.

22 Personalized PageRank - The α Parameter (2)
High α implies longer paths.
Pro: We have more information about nodes.
Con: Attack edges are more likely to be traversed: lower attack resilience.
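
The link between α and path length can be made explicit with a short derivation. It assumes the walk never hits a dead end, so the number of steps L taken before stopping is geometrically distributed:

```latex
\Pr[L = k] = (1-\alpha)\,\alpha^{k}, \qquad
\mathbb{E}[L] = \sum_{k \ge 0} k\,(1-\alpha)\,\alpha^{k} = \frac{\alpha}{1-\alpha}
```

For example, α = 0.5 gives an expected path length of 1 step, while α = 0.9 gives 9 steps.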

24 HITS: the Idea (Jon Kleinberg, JACM 1999)
Web pages are seen as hubs and authorities: authorities are the authoritative pages; hubs are pages that link to authorities.
Good hubs point to good authorities; good authorities are pointed to by good hubs.
In our case: users instead of hubs; judgements instead of authorities.

25 HITS: the Algorithm
We have a bipartite graph with hubs/users (circles) and authorities/judgements (squares).
All hubs start with the same weight.
Iteratively, until convergence:
Weights on authorities become the sum of the weights on all hubs that link to them;
Weights on hubs become the sum of the weights on the authorities they link to;
Weights on hubs get renormalized.
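
The three steps above can be sketched on the user/judgement bipartite graph as follows. This is an illustrative sketch: the function name `hits`, the fixed iteration count in place of a convergence test, and normalising hub weights so that they sum to 1 are assumptions of the example (the slides only say that weights get renormalized).

```python
def hits(user_judgements, iterations=50):
    """HITS with users as hubs and judgements as authorities.

    user_judgements maps each user to the list of judgements they expressed.
    Returns (hub_weight, authority_weight).
    """
    hub = {user: 1.0 for user in user_judgements}  # all hubs start equal
    authority = {}
    for _ in range(iterations):
        # Forward step: an authority collects the weight of every hub linking to it.
        authority = {}
        for user, judgements in user_judgements.items():
            for judgement in judgements:
                authority[judgement] = authority.get(judgement, 0.0) + hub[user]
        # Backward step: a hub collects the weight of every authority it links to.
        hub = {
            user: sum(authority[judgement] for judgement in judgements)
            for user, judgements in user_judgements.items()
        }
        # Normalization: rescale hub weights so that they sum to 1.
        total = sum(hub.values())
        if total > 0:
            hub = {user: weight / total for user, weight in hub.items()}
    return hub, authority


if __name__ == "__main__":
    toy = {"alice": ["x", "y"], "bob": ["y"], "carol": ["y", "z"]}
    hub, authority = hits(toy)
    print(max(authority, key=authority.get))  # the judgement shared by all users
```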

26 HITS: Example (1) Initialization
Weights on hubs get initialized.

27 HITS: Example (2) Forward step
Weights on authorities are the sum of the weights of the hubs that link to them.

28 HITS: Example (3) Backward step
Weights on hubs are the sum of the weights of the linked authorities.

29 HITS: Example (4) Normalization
Weights on hubs get renormalized. Back to the forward step.

31 Change 1: Tightly Knit Communities (SALSA: Lempel and Moran, 2001)
HITS disproportionately rewards communities where users and judgements are highly correlated.
In the graph on the left, the ranking of nodes in the less dense blue community goes to 0.
Fix: perform a random walk on the judgement graph and compute the equilibrium distribution.
Side effect: niche judgements are rewarded, since their weight is redistributed to fewer nodes.
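
A minimal sketch of the random-walk fix, in the spirit of SALSA: instead of summing weights, each node splits its weight evenly among its neighbours, so total weight is conserved. The function name `salsa`, the uniform starting weights, and the fixed number of iterations are assumptions of this example, and every user is assumed to have at least one judgement.

```python
def salsa(user_judgements, iterations=20):
    """Random walk alternating between users and judgements.

    user_judgements maps each user to a non-empty list of judgements.
    Returns (user_weight, judgement_weight).
    """
    # Invert the graph: which users expressed each judgement.
    voters = {}
    for user, judgements in user_judgements.items():
        for judgement in judgements:
            voters.setdefault(judgement, []).append(user)

    n = len(user_judgements)
    user_weight = {user: 1.0 / n for user in user_judgements}
    judgement_weight = {}
    for _ in range(iterations):
        # Forward step: each user splits its weight evenly among its judgements.
        judgement_weight = {judgement: 0.0 for judgement in voters}
        for user, judgements in user_judgements.items():
            for judgement in judgements:
                judgement_weight[judgement] += user_weight[user] / len(judgements)
        # Backward step: each judgement splits its weight evenly among its voters.
        user_weight = {user: 0.0 for user in user_judgements}
        for judgement, users in voters.items():
            for user in users:
                user_weight[user] += judgement_weight[judgement] / len(users)
    return user_weight, judgement_weight


if __name__ == "__main__":
    toy = {
        "u1": ["j1", "j2", "j3"], "u2": ["j1", "j2", "j3"], "u3": ["j1", "j2", "j3"],
        "u4": ["j4"], "u5": ["j4"],  # a sparser, niche community
    }
    users, judgements = salsa(toy)
    print(judgements)
```

In the toy graph the sparse community keeps the 0.4 of weight it started with, and its single niche judgement `j4` ends up with more weight than any judgement of the dense community, because that weight is shared among fewer judgements.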

32 Change 2: Subjective Ranking
Problem: The results of HITS are independent of the tastes of the evaluating node. It is essential to have personalized results.
Fix: Same approach as in PageRank: we start the random walk from the evaluating node. To reward shorter paths, we stop at each iteration with probability 1 − β.
Low β implies higher subjectivity and faster convergence.
High β favours longer paths of trust propagation.

34 Change 3: Take Intent into Account
Problem: As said before, we don't want to trust dishonest nodes. Culprit for HITS: the backward step. The fact that a user expressed a judgement does not ensure they are well intentioned.
Fix:
1 Compute intent ranking using Personalized PageRank.
2 Redistribute trust to users proportionally to their intent ranking.

36 SOFIA in Synthesis
SOFIA: SOcial FIltering Algorithm. A HITS-like trust propagation algorithm with 3 key modifications:
1 Random walk trust propagation, as proposed in SALSA.
2 The starting point is the evaluating node; the random walk continues at each step with probability β.
3 In the backward step, trust is redistributed from judgements to users according to their intent ranking, computed using Personalized PageRank on the web of trust.
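
One way the three modifications could fit together is sketched below. The slides do not give exact update formulas, so this is a reading of them under stated assumptions rather than the authors' implementation: the name `sofia_sketch` is hypothetical, the intent ranking is passed in as a precomputed dictionary `intent` (for instance from Personalized PageRank on the web of trust), and weight that cannot be propagated is assumed to return to the evaluator.

```python
def sofia_sketch(user_judgements, intent, evaluator, beta=0.85, iterations=50):
    """Rank judgements for `evaluator` (illustrative sketch only).

    user_judgements: user -> list of judgements expressed by that user.
    intent: user -> intent ranking of that user as seen by the evaluator.
    """
    voters = {}
    for user, judgements in user_judgements.items():
        for judgement in judgements:
            voters.setdefault(judgement, []).append(user)

    # Modification 2: the walk starts at the evaluating node.
    user_weight = {user: 0.0 for user in user_judgements}
    user_weight[evaluator] = 1.0
    judgement_weight = {}
    for _ in range(iterations):
        lost = 0.0  # weight that could not be propagated in this round
        # Forward step (modification 1): split weight evenly, random-walk style.
        judgement_weight = {judgement: 0.0 for judgement in voters}
        for user, judgements in user_judgements.items():
            if not judgements:
                lost += user_weight[user]
                continue
            share = user_weight[user] / len(judgements)
            for judgement in judgements:
                judgement_weight[judgement] += share
        # Backward step (modification 3): redistribute proportionally to intent.
        back = {user: 0.0 for user in user_judgements}
        for judgement, weight in judgement_weight.items():
            total_intent = sum(intent.get(user, 0.0) for user in voters[judgement])
            if total_intent == 0.0:
                lost += weight
                continue
            for user in voters[judgement]:
                back[user] += weight * intent.get(user, 0.0) / total_intent
        # Continue with probability beta, otherwise restart at the evaluator.
        user_weight = {user: beta * weight for user, weight in back.items()}
        user_weight[evaluator] += (1.0 - beta) + beta * lost
    return judgement_weight


if __name__ == "__main__":
    judgements = {"alice": ["x"], "dave": ["x", "y"], "sybil": ["x", "z"]}
    intent = {"alice": 0.5, "dave": 0.5, "sybil": 0.0}
    print(sofia_sketch(judgements, intent, "alice"))
```

In the usage example, `dave` and `sybil` both agree with `alice` on `x`, but only `dave` has a non-zero intent ranking, so his judgement `y` receives weight while the Sybil's judgement `z` receives none.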

38 CiteSeer
Large dataset of scientific collaborations.
Social network: co-authorship data. Authors A and B are connected if they wrote papers together.
Judgements: citations. If X cites Y, the implicit judgement is "Y is relevant to X's topic".
Graph data: a highly clustered subset of the whole graph. 10,000 authors. 182,675 papers.

39 Last.fm
Social networking website devoted to music.
Social network: friend lists. Same as Facebook, MySpace,...
Judgements: most-listened artists chart for each user. Implicit judgement: "I like to listen to songs by X".
Graph data: a BFS crawl of 10,000 users. 51,654 different artists.

41 Hidden Judgements
How to evaluate the accuracy of SOFIA's ranking on judgements?
We want to rank highly the judgements that a user would approve.
We hide a random judgement and execute SOFIA. If the algorithm performs well, the hidden judgement will have a high ranking.
In Citeseer, we try to guess a missing citation from a paper.
In Last.fm, we try to find the missing artist in a chart.
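
The hide-one-judgement protocol can be sketched as below. The names `hidden_judgement_rank` and `median_hidden_rank` are hypothetical, and `popularity_ranker` is only a stand-in so that the example runs on its own: any function that takes the visible judgements and a user and returns a score per judgement could be plugged in instead. Restricting the ranking to judgements the user has not visibly expressed is an assumption of this sketch.

```python
import random
import statistics


def popularity_ranker(visible, user):
    """Stand-in ranker: score each judgement by how many users expressed it."""
    scores = {}
    for judgements in visible.values():
        for judgement in judgements:
            scores[judgement] = scores.get(judgement, 0) + 1
    return scores


def hidden_judgement_rank(user_judgements, user, hidden, rank_judgements):
    """Hide one judgement of `user` and return the rank the ranker gives it.

    Rank 1 is best. Returns None if the ranker never scores the hidden judgement.
    """
    visible = {
        other: [j for j in judgements if not (other == user and j == hidden)]
        for other, judgements in user_judgements.items()
    }
    scores = rank_judgements(visible, user)
    known = set(visible[user])
    candidates = [j for j in scores if j not in known]
    ordered = sorted(candidates, key=lambda j: scores[j], reverse=True)
    return ordered.index(hidden) + 1 if hidden in ordered else None


def median_hidden_rank(user_judgements, rank_judgements, seed=0):
    """Median rank of one randomly hidden judgement per user."""
    rng = random.Random(seed)
    ranks = []
    for user, judgements in user_judgements.items():
        if not judgements:
            continue
        hidden = rng.choice(judgements)
        rank = hidden_judgement_rank(user_judgements, user, hidden, rank_judgements)
        if rank is not None:
            ranks.append(rank)
    return statistics.median(ranks) if ranks else None
```

A lower median rank means the algorithm places the hidden judgement nearer the top of its recommendations.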

42 Hidden Judgements - Citeseer
[Plot: ratio versus rank for SOFIA, SOFIA - no intent ranking, and Personalized PageRank.]
Rank medians: 4 (SOFIA), 12 (SOFIA - no intent ranking), 30 (Personalized PageRank).

43 Hidden Judgements - Last.fm
[Plot: ratio versus rank for SOFIA, SOFIA - no intent ranking, and Personalized PageRank.]
Rank medians: 174 (SOFIA), 157 (SOFIA - no intent ranking), 344 (Personalized PageRank).

45 Sybil Attack
We simulated an attack trying to inflate the rating of a malicious judgement X on a victim node A.
A coalition of 100 Sybil nodes is created.
All Sybils copy A's judgements, then add a link to X.
We study how the ranking of X changes before and after the attack, on the victim node A and on other nodes.
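
The attack setup can be sketched as a function that injects the Sybil coalition into the two graphs. The slides only state that 100 Sybils copy the victim's judgements and add X; how Sybils link to each other and where attack edges land are assumptions of this example (here Sybils all trust each other, and each listed honest user gets one edge to the first Sybil). The name `add_sybil_attack` is hypothetical.

```python
def add_sybil_attack(user_judgements, web_of_trust, victim, malicious,
                     num_sybils=100, attack_edges=()):
    """Return copies of both graphs with a Sybil coalition added.

    attack_edges lists the honest users tricked into trusting a Sybil;
    it must be empty if num_sybils is 0.
    """
    judgements = {user: list(js) for user, js in user_judgements.items()}
    trust = {user: list(trusted) for user, trusted in web_of_trust.items()}
    sybils = [f"sybil{i}" for i in range(num_sybils)]
    for sybil in sybils:
        # Each Sybil copies the victim's judgements and adds the malicious one.
        judgements[sybil] = list(user_judgements[victim]) + [malicious]
        # Assumption: Sybils all trust each other.
        trust[sybil] = [other for other in sybils if other != sybil]
    for honest in attack_edges:
        # Assumption: every attack edge points at the first Sybil.
        trust[honest].append(sybils[0])
    return judgements, trust, sybils


if __name__ == "__main__":
    judgements = {"a": ["x"], "b": ["y"]}
    trust = {"a": ["b"], "b": ["a"]}
    attacked_j, attacked_t, sybils = add_sybil_attack(
        judgements, trust, victim="a", malicious="X", num_sybils=3, attack_edges=["b"]
    )
    print(attacked_j["sybil0"], attacked_t["b"])
```

Comparing the ranking of X on the original and on the attacked graphs, for the victim and for other users, then mirrors the before/after comparison described above.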

46 Sybil Attack - Last.fm (1)
Columns: Algorithm, Attack edges, Role, Percentiles.
Any no attack 12,914 25,827 38,741 - no intent victim other 348 1,185 3, ,730 20,493 33,322 Pers. PageRank 10 4,759 8,757 13, ,092 2,012 3,101 1 victim 3,406 11,182 31,765 other 9,599 19,186 33, victim 469 1,311 2,815 other 4,612 8,779 14, victim other 1,040 2,649 5,571

47 Sybil Attack - Last.fm (2)
Columns: Algorithm, Attack edges, Role, Percentiles.
(α = 0.9) 100 (α = 0.5) 100 victim other 1,040 2,649 5,571 victim other 1,578 3,106 5,128
Trade-off between accuracy and attack resilience.

48 Conclusions
Social Filtering: integrating information about social networks and subjective preferences, we obtain recommendations that are accurate (due mainly to preferences) and attack resilient (thanks to social networks). Incorporating the social network may increase accuracy.
SOFIA: a particular implementation of Social Filtering.
Future work: P2P/mobile decentralised implementation; other social filtering algorithms?

COMP5331: Knowledge Discovery and Data Mining

COMP5331: Knowledge Discovery and Data Mining COMP5331: Knowledge Discovery and Data Mining Acknowledgement: Slides modified based on the slides provided by Lawrence Page, Sergey Brin, Rajeev Motwani and Terry Winograd, Jon M. Kleinberg 1 1 PageRank

More information

Link Analysis and Web Search

Link Analysis and Web Search Link Analysis and Web Search Moreno Marzolla Dip. di Informatica Scienza e Ingegneria (DISI) Università di Bologna http://www.moreno.marzolla.name/ based on material by prof. Bing Liu http://www.cs.uic.edu/~liub/webminingbook.html

More information

Big Data Analytics CSCI 4030

Big Data Analytics CSCI 4030 High dim. data Graph data Infinite data Machine learning Apps Locality sensitive hashing PageRank, SimRank Filtering data streams SVM Recommen der systems Clustering Community Detection Web advertising

More information

Information Retrieval. Lecture 11 - Link analysis

Information Retrieval. Lecture 11 - Link analysis Information Retrieval Lecture 11 - Link analysis Seminar für Sprachwissenschaft International Studies in Computational Linguistics Wintersemester 2007 1/ 35 Introduction Link analysis: using hyperlinks

More information

COMP 4601 Hubs and Authorities

COMP 4601 Hubs and Authorities COMP 4601 Hubs and Authorities 1 Motivation PageRank gives a way to compute the value of a page given its position and connectivity w.r.t. the rest of the Web. Is it the only algorithm: No! It s just one

More information

Social Network Analysis

Social Network Analysis Social Network Analysis Giri Iyengar Cornell University gi43@cornell.edu March 14, 2018 Giri Iyengar (Cornell Tech) Social Network Analysis March 14, 2018 1 / 24 Overview 1 Social Networks 2 HITS 3 Page

More information

Degree Distribution: The case of Citation Networks

Degree Distribution: The case of Citation Networks Network Analysis Degree Distribution: The case of Citation Networks Papers (in almost all fields) refer to works done earlier on same/related topics Citations A network can be defined as Each node is a

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu SPAM FARMING 2/11/2013 Jure Leskovec, Stanford C246: Mining Massive Datasets 2 2/11/2013 Jure Leskovec, Stanford

More information

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu How to organize the Web? First try: Human curated Web directories Yahoo, DMOZ, LookSmart Second

More information

Lecture Notes: Social Networks: Models, Algorithms, and Applications Lecture 28: Apr 26, 2012 Scribes: Mauricio Monsalve and Yamini Mule

Lecture Notes: Social Networks: Models, Algorithms, and Applications Lecture 28: Apr 26, 2012 Scribes: Mauricio Monsalve and Yamini Mule Lecture Notes: Social Networks: Models, Algorithms, and Applications Lecture 28: Apr 26, 2012 Scribes: Mauricio Monsalve and Yamini Mule 1 How big is the Web How big is the Web? In the past, this question

More information

Information Retrieval Lecture 4: Web Search. Challenges of Web Search 2. Natural Language and Information Processing (NLIP) Group

Information Retrieval Lecture 4: Web Search. Challenges of Web Search 2. Natural Language and Information Processing (NLIP) Group Information Retrieval Lecture 4: Web Search Computer Science Tripos Part II Simone Teufel Natural Language and Information Processing (NLIP) Group sht25@cl.cam.ac.uk (Lecture Notes after Stephen Clark)

More information

University of Maryland. Tuesday, March 2, 2010

University of Maryland. Tuesday, March 2, 2010 Data-Intensive Information Processing Applications Session #5 Graph Algorithms Jimmy Lin University of Maryland Tuesday, March 2, 2010 This work is licensed under a Creative Commons Attribution-Noncommercial-Share

More information

Part 1: Link Analysis & Page Rank

Part 1: Link Analysis & Page Rank Chapter 8: Graph Data Part 1: Link Analysis & Page Rank Based on Leskovec, Rajaraman, Ullman 214: Mining of Massive Datasets 1 Graph Data: Social Networks [Source: 4-degrees of separation, Backstrom-Boldi-Rosa-Ugander-Vigna,

More information

Link Analysis in the Cloud

Link Analysis in the Cloud Cloud Computing Link Analysis in the Cloud Dell Zhang Birkbeck, University of London 2017/18 Graph Problems & Representations What is a Graph? G = (V,E), where V represents the set of vertices (nodes)

More information

Lecture 9: I: Web Retrieval II: Webology. Johan Bollen Old Dominion University Department of Computer Science

Lecture 9: I: Web Retrieval II: Webology. Johan Bollen Old Dominion University Department of Computer Science Lecture 9: I: Web Retrieval II: Webology Johan Bollen Old Dominion University Department of Computer Science jbollen@cs.odu.edu http://www.cs.odu.edu/ jbollen April 10, 2003 Page 1 WWW retrieval Two approaches

More information

Mining Web Data. Lijun Zhang

Mining Web Data. Lijun Zhang Mining Web Data Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction Web Crawling and Resource Discovery Search Engine Indexing and Query Processing Ranking Algorithms Recommender Systems

More information

Centralities (4) By: Ralucca Gera, NPS. Excellence Through Knowledge

Centralities (4) By: Ralucca Gera, NPS. Excellence Through Knowledge Centralities (4) By: Ralucca Gera, NPS Excellence Through Knowledge Some slide from last week that we didn t talk about in class: 2 PageRank algorithm Eigenvector centrality: i s Rank score is the sum

More information

Social Networks 2015 Lecture 10: The structure of the web and link analysis

Social Networks 2015 Lecture 10: The structure of the web and link analysis 04198250 Social Networks 2015 Lecture 10: The structure of the web and link analysis The structure of the web Information networks Nodes: pieces of information Links: different relations between information

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu HITS (Hypertext Induced Topic Selection) Is a measure of importance of pages or documents, similar to PageRank

More information

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu How to organize the Web? First try: Human curated Web directories Yahoo, DMOZ, LookSmart Second

More information

Slides based on those in:

Slides based on those in: Spyros Kontogiannis & Christos Zaroliagis Slides based on those in: http://www.mmds.org A 3.3 B 38.4 C 34.3 D 3.9 E 8.1 F 3.9 1.6 1.6 1.6 1.6 1.6 2 y 0.8 ½+0.2 ⅓ M 1/2 1/2 0 0.8 1/2 0 0 + 0.2 0 1/2 1 [1/N]

More information

1 Starting around 1996, researchers began to work on. 2 In Feb, 1997, Yanhong Li (Scotch Plains, NJ) filed a

1 Starting around 1996, researchers began to work on. 2 In Feb, 1997, Yanhong Li (Scotch Plains, NJ) filed a !"#$ %#& ' Introduction ' Social network analysis ' Co-citation and bibliographic coupling ' PageRank ' HIS ' Summary ()*+,-/*,) Early search engines mainly compare content similarity of the query and

More information

Big Data Analytics CSCI 4030

Big Data Analytics CSCI 4030 High dim. data Graph data Infinite data Machine learning Apps Locality sensitive hashing PageRank, SimRank Filtering data streams SVM Recommen der systems Clustering Community Detection Web advertising

More information

Learning to Rank Networked Entities

Learning to Rank Networked Entities Learning to Rank Networked Entities Alekh Agarwal Soumen Chakrabarti Sunny Aggarwal Presented by Dong Wang 11/29/2006 We've all heard that a million monkeys banging on a million typewriters will eventually

More information

How to organize the Web?

How to organize the Web? How to organize the Web? First try: Human curated Web directories Yahoo, DMOZ, LookSmart Second try: Web Search Information Retrieval attempts to find relevant docs in a small and trusted set Newspaper

More information

ROBERTO BATTITI, MAURO BRUNATO. The LION Way: Machine Learning plus Intelligent Optimization. LIONlab, University of Trento, Italy, Apr 2015

ROBERTO BATTITI, MAURO BRUNATO. The LION Way: Machine Learning plus Intelligent Optimization. LIONlab, University of Trento, Italy, Apr 2015 ROBERTO BATTITI, MAURO BRUNATO. The LION Way: Machine Learning plus Intelligent Optimization. LIONlab, University of Trento, Italy, Apr 2015 http://intelligentoptimization.org/lionbook Roberto Battiti

More information

Graph Algorithms. Revised based on the slides by Ruoming Kent State

Graph Algorithms. Revised based on the slides by Ruoming Kent State Graph Algorithms Adapted from UMD Jimmy Lin s slides, which is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States. See http://creativecommons.org/licenses/by-nc-sa/3.0/us/

More information

Mining Web Data. Lijun Zhang

Mining Web Data. Lijun Zhang Mining Web Data Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction Web Crawling and Resource Discovery Search Engine Indexing and Query Processing Ranking Algorithms Recommender Systems

More information

CS6200 Information Retreival. The WebGraph. July 13, 2015

CS6200 Information Retreival. The WebGraph. July 13, 2015 CS6200 Information Retreival The WebGraph The WebGraph July 13, 2015 1 Web Graph: pages and links The WebGraph describes the directed links between pages of the World Wide Web. A directed edge connects

More information

Social and Technological Network Data Analytics. Lecture 5: Structure of the Web, Search and Power Laws. Prof Cecilia Mascolo

Social and Technological Network Data Analytics. Lecture 5: Structure of the Web, Search and Power Laws. Prof Cecilia Mascolo Social and Technological Network Data Analytics Lecture 5: Structure of the Web, Search and Power Laws Prof Cecilia Mascolo In This Lecture We describe power law networks and their properties and show

More information

Link Analysis from Bing Liu. Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data, Springer and other material.

Link Analysis from Bing Liu. Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data, Springer and other material. Link Analysis from Bing Liu. Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data, Springer and other material. 1 Contents Introduction Network properties Social network analysis Co-citation

More information

Einführung in Web und Data Science Community Analysis. Prof. Dr. Ralf Möller Universität zu Lübeck Institut für Informationssysteme

Einführung in Web und Data Science Community Analysis. Prof. Dr. Ralf Möller Universität zu Lübeck Institut für Informationssysteme Einführung in Web und Data Science Community Analysis Prof. Dr. Ralf Möller Universität zu Lübeck Institut für Informationssysteme Today s lecture Anchor text Link analysis for ranking Pagerank and variants

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining Lecture #11: Link Analysis 3 Seoul National University 1 In This Lecture WebSpam: definition and method of attacks TrustRank: how to combat WebSpam HITS algorithm: another algorithm

More information

Bitcoin, Security for Cloud & Big Data

Bitcoin, Security for Cloud & Big Data Bitcoin, Security for Cloud & Big Data CS 161: Computer Security Prof. David Wagner April 18, 2013 Bitcoin Public, distributed, peer-to-peer, hash-chained audit log of all transactions ( block chain ).

More information

CS425: Algorithms for Web Scale Data

CS425: Algorithms for Web Scale Data CS425: Algorithms for Web Scale Data Most of the slides are from the Mining of Massive Datasets book. These slides have been modified for CS425. The original slides can be accessed at: www.mmds.org J.

More information

Information Retrieval and Web Search

Information Retrieval and Web Search Information Retrieval and Web Search Link analysis Instructor: Rada Mihalcea (Note: This slide set was adapted from an IR course taught by Prof. Chris Manning at Stanford U.) The Web as a Directed Graph

More information

CPSC 340: Machine Learning and Data Mining. Ranking Fall 2016
