Collective Spammer Detection in Evolving Multi-Relational Social Networks
Authors:
- Shobeir Fakhraei (University of Maryland)
- James Foulds (University of California, Santa Cruz)
- Madhusudana Shashanka (if(we) Inc., currently Niara Inc.)
- Lise Getoor (University of California, Santa Cruz)
Spam in Social Networks
- Recent study by Nexgate in 2013:
  - Spam grew by more than 300% in half a year
  - 1 in 200 social messages are spam
  - 5% of all social apps are spammy
Spam in Social Networks
- What's different about social networks?
  - Spammers have more ways to interact with users: messages, comments on photos, winks, ...
  - They can split spam across multiple messages
  - More information about users is available on their profiles!
Spammers are getting smarter!
- Traditional spam:
  - George (to Shobeir): Want some replica luxury watches? Click here:
  - Shobeir: [Report Spam]
- (Intelligent) social spam, a realistic-looking conversation:
  - Mary: Hey Shobeir! Nice profile photo. I live in Bay Area too. Wanna chat?!
  - Shobeir: Sure! :)
  - Mary: I'm logging off here, too many people pinging me! I really like you, let's chat more here:
Tagged.com
- Founded in 2004, Tagged.com is a social networking site that connects people through social interactions and games
- Over 300 million registered members
- Data sample for experiments (on a laptop):
  - 5.6 million users (3.9% labeled spammers)
  - 912 million links
Social Networks: Multi-relational and Time-Evolving
[Figure: a network of legitimate users and spammers; each link carries a timestamp, t(1) through t(11), and an action type such as profile view, message, poke, or report spammer]
- Link = action at time t
- Actions = profile view, message, poke, report abuse, etc.
Our Approach
Predict spammers based on:
- Graph structure
- Action sequences
- Reporting behavior
Graph Structure Feature Extraction
- Build one graph for each relation: Are you interested?, Meet Me, Play Pets, Friend Request, Message, Wink, Report Abuse
- Extract features from each graph: PageRank, k-core, graph coloring, triangle count, connected components, in/out degree
Graph Structure Features
- Extract features for each relation graph:
  - PageRank
  - Degree statistics: total degree, in-degree, out-degree
  - k-core
  - Graph coloring
  - Connected components
  - Triangle count
- Relations:
  - Viewing profile
  - Friend requests
  - Message
  - Luv
  - Wink
  - Pets game: buying, wishing
  - MeetMe game: yes, no
  - Reporting abuse
- 8 features for each of 10 relations
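As a rough illustration of per-relation feature extraction, the sketch below computes the three degree statistics and a triangle count for every user in every relation graph, using only the Python standard library. The function name `relation_features`, the `(source, target, relation)` edge format, and the choice to count distinct neighbors (rather than repeated actions) are assumptions made for this example; the slides report using GraphLab Create for these computations, and PageRank, k-core, graph coloring, and connected components are omitted here.

```python
from collections import defaultdict


def relation_features(edges):
    """Degree statistics and triangle counts, computed separately per relation.

    `edges` is an iterable of (source, target, relation) triples.
    Returns {relation: {user: {"in", "out", "total", "triangles"}}}.
    Degrees count distinct neighbors, not repeated actions (an assumption).
    """
    out_nbrs = defaultdict(lambda: defaultdict(set))
    in_nbrs = defaultdict(lambda: defaultdict(set))
    for src, dst, rel in edges:
        if src == dst:
            continue  # ignore self-loops
        out_nbrs[rel][src].add(dst)
        in_nbrs[rel][dst].add(src)

    features = {}
    for rel in out_nbrs:
        users = set(out_nbrs[rel]) | set(in_nbrs[rel])
        # Triangles are counted on the undirected version of the relation graph.
        undirected = {
            u: out_nbrs[rel].get(u, set()) | in_nbrs[rel].get(u, set())
            for u in users
        }
        features[rel] = {}
        for u in users:
            n_in = len(in_nbrs[rel].get(u, ()))
            n_out = len(out_nbrs[rel].get(u, ()))
            nbrs = undirected[u]
            # Each triangle through u is seen once from each of its two other corners.
            triangles = sum(len(nbrs & undirected[v]) for v in nbrs) // 2
            features[rel][u] = {
                "in": n_in,
                "out": n_out,
                "total": n_in + n_out,
                "triangles": triangles,
            }
    return features


# Hypothetical toy data: four users, two relations.
toy_edges = [
    ("a", "b", "message"),
    ("b", "c", "message"),
    ("c", "a", "message"),
    ("a", "d", "message"),
    ("a", "b", "wink"),
]
toy_features = relation_features(toy_edges)
```

Concatenating the per-relation dictionaries for one user gives a single feature vector, which is the shape of input a tree-based classifier would expect.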
Graph Structure Features
[Figure: each relation graph, from viewing profile to reporting abuse, yields PageRank, triangle count, out-degree, in-degree, k-core, and graph coloring features]
- Classification method: Gradient Boosted Trees
Graph Structure Features
Experiments, evaluated by AU-PR and AU-ROC:
- 1 relation, 8 feature types
- 10 relations, 1 feature type
- 10 relations, 8 feature types
Multiple relations/features give better performance!
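For readers unfamiliar with the two metrics, here is a minimal sketch of how they can be computed from labels and classifier scores. It is not the evaluation code behind the slides: `au_roc` uses the pairwise-ranking definition of the area under the ROC curve, and `average_precision` is one common estimate of the area under the precision-recall curve. Both function names are hypothetical, labels are assumed to be 1 for spammers and 0 otherwise, and the pairwise loop is quadratic, so it only suits small examples.

```python
def au_roc(labels, scores):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each positive example.

    Tied scores are ranked in input order, which is a simplification.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Hypothetical scores for five users, two of them spammers.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.1]
```

With a small spammer fraction such as the 3.9% in this data sample, the precision-recall view is usually the more demanding of the two, because it is not inflated by the large number of easy negatives.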
Sequence of Actions
- Sequential bigram features: short sequence segments of 2 consecutive actions, to capture sequential information
- Example, User1 actions: Message, Profile_view, Message, Friend_Request, ...
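A minimal sketch of bigram extraction, assuming each user's actions arrive as a time-ordered list of action names. The helper names `bigram_features` and `to_vector` are hypothetical, and the fixed-length vector layout is one possible choice rather than the one used in the experiments.

```python
from collections import Counter
from itertools import product


def bigram_features(actions):
    """Count each pair of consecutive actions in one user's time-ordered list."""
    return Counter(zip(actions, actions[1:]))


def to_vector(counts, action_types):
    """Lay the counts out in a fixed order: one slot per ordered pair of action types."""
    return [counts[pair] for pair in product(action_types, repeat=2)]


user1 = ["message", "profile_view", "message", "friend_request"]
counts = bigram_features(user1)
# counts holds three bigrams, each seen once:
# (message, profile_view), (profile_view, message), (message, friend_request)
vector = to_vector(counts, ["message", "profile_view", "friend_request"])
```

With k action types the vector has k * k slots, so the representation stays small even for long action histories.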
Sequence of Actions
- Mixture of Markov Models (MMM), a.k.a. chain-augmented or tree-augmented naive Bayes, with class label y and action sequence x_1, ..., x_n:
$$P(y, \mathbf{x}) = P(y)\, P(x_1 \mid y) \prod_{i=2}^{n} P(x_i \mid x_{i-1}, y)$$
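The factorization above can be sketched as a small classifier that keeps, for each class, a distribution over the first action and a table of transitions between consecutive actions. The class name `MarkovMixture`, the add-alpha smoothing, and the toy training data are assumptions for illustration; the slides do not say how the probabilities were estimated. Sequences are assumed to be non-empty.

```python
import math
from collections import Counter, defaultdict


class MarkovMixture:
    """One first-order Markov chain per class, combined with a class prior."""

    def __init__(self, action_types, alpha=1.0):
        self.action_types = list(action_types)
        self.alpha = alpha  # add-alpha smoothing constant (an assumption)

    def fit(self, sequences, labels):
        self.class_counts = Counter(labels)
        self.first = defaultdict(Counter)   # y -> counts of x_1
        self.trans = defaultdict(Counter)   # (y, x_{i-1}) -> counts of x_i
        for seq, y in zip(sequences, labels):
            self.first[y][seq[0]] += 1
            for prev, cur in zip(seq, seq[1:]):
                self.trans[(y, prev)][cur] += 1
        return self

    def _prob(self, counter, key):
        total = sum(counter.values())
        k = len(self.action_types)
        return (counter[key] + self.alpha) / (total + self.alpha * k)

    def log_joint(self, seq, y):
        """log P(y) + log P(x_1 | y) + sum over i >= 2 of log P(x_i | x_{i-1}, y)."""
        n = sum(self.class_counts.values())
        lp = math.log(self.class_counts[y] / n)
        lp += math.log(self._prob(self.first[y], seq[0]))
        for prev, cur in zip(seq, seq[1:]):
            lp += math.log(self._prob(self.trans[(y, prev)], cur))
        return lp

    def predict(self, seq):
        return max(self.class_counts, key=lambda y: self.log_joint(seq, y))


# Hypothetical training data: two spammer and two legitimate sequences.
train_sequences = [
    ["message", "message", "message"],
    ["message", "message", "friend_request"],
    ["profile_view", "message", "profile_view"],
    ["profile_view", "profile_view", "wink"],
]
train_labels = ["spammer", "spammer", "legit", "legit"]
model = MarkovMixture(["message", "profile_view", "friend_request", "wink"])
model.fit(train_sequences, train_labels)
```

The per-class log joint can also be passed to another classifier as an extra feature alongside the bigram counts, which is one way to read the "Bigram + MMM" configuration.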
Sequence of Actions
[Figure: each user's action sequence feeds bigram features and chain-augmented naive Bayes]
Sequence of Actions
Experiments, evaluated by AU-PR and AU-ROC:
- Bigram features
- MMM
- Bigram + MMM
Little benefit from MMM (although little overhead)
Results
[Figures: precision-recall curve and ROC curve]
We can classify 70% of the spammers that need manual labeling with about 90% accuracy
Deployment and Example Runtimes
- We can:
  - Run the model on short intervals, with new snapshots of the network
  - Update the features as events occur
- Example runtimes with GraphLab Create™ on a MacBook Pro, for 5.6 million vertices and 350 million edges:
  - PageRank: 6.25 minutes
  - Triangle counting: minutes
  - k-core: 14.3 minutes
Refining the abuse reporting systems
- Abuse report systems are very noisy:
  - People have different standards
  - Spammers report random people to increase noise
  - Personal gain in social games
- Goal is to clean up the system using:
  - Reporters' previous history
  - Collective reasoning over reports
Collective Classification with Reports
[Figure: the report subgraph is modeled with Probabilistic Soft Logic]
HL-MRFs & Probabilistic Soft Logic (PSL)
- Probabilistic Soft Logic (PSL) is a declarative modeling language based on first-order logic
- Weighted logical rules define a probabilistic graphical model:
  ω : P(A, B) ∧ Q(B, C) → R(A, C)
- Instantiated rules reduce the probability of any state that does not satisfy the rule, as measured by its distance to satisfaction
Collective Classification with Reports
- Model using only reports:
  REPORTED(v1, v2) → SPAMMER(v2)
  ¬SPAMMER(v)
Collective Classification with Reports
- Model using reports and credibility of the reporter:
  CREDIBLE(v1) ∧ REPORTED(v1, v2) → SPAMMER(v2)
  PRIOR-CREDIBLE(v) → CREDIBLE(v)
  ¬PRIOR-CREDIBLE(v) → ¬CREDIBLE(v)
  ¬SPAMMER(v)
Collective Classification with Reports
- Model using reports, credibility of the reporter, and collective reasoning:
  CREDIBLE(v1) ∧ REPORTED(v1, v2) → SPAMMER(v2)
  SPAMMER(v2) ∧ REPORTED(v1, v2) → CREDIBLE(v1)
  ¬SPAMMER(v2) ∧ REPORTED(v1, v2) → ¬CREDIBLE(v1)
  PRIOR-CREDIBLE(v) → CREDIBLE(v)
  ¬PRIOR-CREDIBLE(v) → ¬CREDIBLE(v)
  ¬SPAMMER(v)
Results of Classification Using Reports
Experiments, evaluated by AU-PR and AU-ROC:
- Reports only
- Reports & credibility
- Reports & credibility & collective reasoning (AU-ROC deviation ± 0.004)
Conclusion
- Graph structure (PageRank, triangle count, out-degree, in-degree, k-core, and graph coloring for each relation): multiple relations are more predictive than multiple features, as measured by AUPR
- Action sequence (bigram features + chain-augmented NB): even simple bigrams are highly predictive, as measured by AUPR
- Can classify 70% of the spammers that needed manual labeling with 90% accuracy
- Report subgraph (Probabilistic Soft Logic): jointly refining the credibility of the source is highly effective, as measured by AUPR
- Code and part of the data will be released soon
Acknowledgements
- Collaborators: Lise Getoor (Univ. of California, Santa Cruz), Shobeir Fakhraei (Univ. of Maryland), Madhusudana Shashanka (if(we) Inc., currently Niara Inc.)
- if(we) Inc. (formerly Tagged Inc.): Johann Schleier-Smith, Karl Dawson, Dai Li, Stuart Robinson, Vinit Garg, and Simon Hill
- Dato (formerly GraphLab): Danny Bickson, Brian Kent, Srikrishna Sridhar, Rajat Arya, Shawn Scully, and Alice Zheng
.io How to use WeAlert.io in your neighbourhood QUICK GUIDE WEALERT-APP Register to WeAlert.io Within 20 seconds you are in direct contact with your neighbours and together we will keep our neighbourhood
More informationData mining overview. Data Mining. Data mining overview. Data mining overview. Data mining overview. Data mining overview 3/24/2014
Data Mining Data mining processes What technological infrastructure is required? Data mining is a system of searching through large amounts of data for patterns. It is a relatively new concept which is
More informationPOMAC: Properly Offloading Mobile Applications to Clouds
POMAC: Properly Offloading Mobile Applications to Clouds Mohammed Anowarul Hassan George Mason University Kshitiz Bhattarai SAP Lab Palo Alto Qi Wei and Songqing Chen George Mason University 1 Outline
More informationHow to Make Student Communications Stick. #LetsDoThis
How to Make Student Communications Stick #LetsDoThis 1 Today s Agenda The Problem Noise, Competition The Rule of 7 Communication Channels Top 5 Channels To Optimize Group Exercise! 2 Your Presenters Chris
More informationText Classification. Dr. Johan Hagelbäck.
Text Classification Dr. Johan Hagelbäck johan.hagelback@lnu.se http://aiguy.org Document Classification A very common machine learning problem is to classify a document based on its text contents We use
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 12 Combining
More informationNaïve Bayes Classification. Material borrowed from Jonathan Huang and I. H. Witten s and E. Frank s Data Mining and Jeremy Wyatt and others
Naïve Bayes Classification Material borrowed from Jonathan Huang and I. H. Witten s and E. Frank s Data Mining and Jeremy Wyatt and others Things We d Like to Do Spam Classification Given an email, predict
More informationHow to use MySpace and comment on a photo OR send me a message updating me on what s happening over in Perth!
How to use MySpace and comment on a photo OR send me a message updating me on what s happening over in Perth! Signing up to MySpace: 1. Firstly, open your internet homepage and type MySpaces s URL (www.myspace.com)
More informationCompSci 1: Overview CS0. Collaborative Filtering: A Tutorial. Collaborative Filtering. Everyday Examples of Collaborative Filtering...
CompSci 1: Overview CS0 Collaborative Filtering: A Tutorial! Audioscrobbler and last.fm! Collaborative filtering! What is a neighbor?! What is the network? Drawn from tutorial by William W. Cohen Center
More information1 Machine Learning System Design
Machine Learning System Design Prioritizing what to work on: Spam classification example Say you want to build a spam classifier Spam messages often have misspelled words We ll have a labeled training
More informationRita McCue University of California, Santa Cruz 12/7/09
Rita McCue University of California, Santa Cruz 12/7/09 1 Introduction 2 Naïve Bayes Algorithms 3 Support Vector Machines and SVMLib 4 Comparative Results 5 Conclusions 6 Further References Support Vector
More informationTODAY S LECTURE HYPERTEXT AND
LINK ANALYSIS TODAY S LECTURE HYPERTEXT AND LINKS We look beyond the content of documents We begin to look at the hyperlinks between them Address questions like Do the links represent a conferral of authority
More informationSpam Filtering Using Visual Features
Spam Filtering Using Visual Features Sirnam Swetha Computer Science Engineering sirnam.swetha@research.iiit.ac.in Sharvani Chandu Electronics and Communication Engineering sharvani.chandu@students.iiit.ac.in
More informationDiagnosis of Spams Some Statistical Considerations
International Journal of Engineering Research and Development e-issn: 2278-067X, p-issn: 2278-800X, www.ijerd.com Volume 3, Issue 4 (August 2012), PP. 05-09 Diagnosis of Email Spams Some Statistical Considerations
More informationMARKETING VOL. 3
TITLE: Proven Tips For Being Successful With Network Marketing Author: Iris Carter-Collins Table Of Contents Proven Tips For Being Successful With Network Marketing 1 Are You Here To Learn About E-mail
More informationA Comparison of Text-Categorization Methods applied to N-Gram Frequency Statistics
A Comparison of Text-Categorization Methods applied to N-Gram Frequency Statistics Helmut Berger and Dieter Merkl 2 Faculty of Information Technology, University of Technology, Sydney, NSW, Australia hberger@it.uts.edu.au
More informationMPEG Frame Types intrapicture predicted picture bidirectional predicted picture. I frames reference frames
MPEG o We now turn our attention to the MPEG format, named after the Moving Picture Experts Group that defined it. To a first approximation, a moving picture (i.e., video) is simply a succession of still
More informationA Flexible Approach to Relational Modeling of Social Network Spam
A Flexible Approach to Relational Modeling of Social Network Spam Jonathan Brophy and Daniel Lowd Department of Computer Science University of Oregon {jbrophy,lowd}@cs.uoregon.edu Abstract Social media
More informationOnline Communication. Chat Rooms Instant Messaging Blogging Social Media
Online Communication E-mail Chat Rooms Instant Messaging Blogging Social Media Advantages: Reduces cost of postage Fast and convenient Eliminates phone charges Disadvantages: May be difficult to understand
More informationSCALABLE KNOWLEDGE BASED AGGREGATION OF COLLECTIVE BEHAVIOR
SCALABLE KNOWLEDGE BASED AGGREGATION OF COLLECTIVE BEHAVIOR P.SHENBAGAVALLI M.E., Research Scholar, Assistant professor/cse MPNMJ Engineering college Sspshenba2@gmail.com J.SARAVANAKUMAR B.Tech(IT)., PG
More informationBayesian Spam Detection System Using Hybrid Feature Selection Method
2016 International Conference on Manufacturing Science and Information Engineering (ICMSIE 2016) ISBN: 978-1-60595-325-0 Bayesian Spam Detection System Using Hybrid Feature Selection Method JUNYING CHEN,
More informationBuilding Dynamic Knowledge Graphs
Building Dynamic Knowledge Graphs Jay Pujara Department of Computer Science University of Maryland College Park, MD 20742 jay@cs.umd.edu Lise Getoor Department of Computer Science University of California
More informationSpam 2.0 Workshop on Digital Social Networks. Alexandru Cosoi
Spam 2.0 Workshop on Digital Social Networks George Petre glpetre@bitdefender.com Alexandru Cosoi acosoi@bitdefender.com Social Networks A social network is a social structure made of nodes (which are
More informationVodafone One Net app Quick Start Guide For Android tablet
Vodafone One Net app Quick Start Guide For Android tablet Power to you Contents What is the One Net app? 1 Installing the One Net app 2 Logging in and out 2 Logging in for the first time 2 Logging out
More informationTyping Software For Macbook Pro Facebook
Typing Software For Macbook Pro Facebook Depending on the program you type in, the Emoji might only appear as a Right click any highlighted text to quickly share it via Twitter, Facebook, email. Over the
More informationCS6375: Machine Learning Gautam Kunapuli. Mid-Term Review
Gautam Kunapuli Machine Learning Data is identically and independently distributed Goal is to learn a function that maps to Data is generated using an unknown function Learn a hypothesis that minimizes
More informationSurvey of Semantic Search technologies for Information Retrieval. Eric Abecassis Houston Technology Center Manager
Survey of Semantic Search technologies for Information Retrieval Eric Abecassis Houston Technology Center Manager 2009 Schlumberger. All rights reserved. An asterisk is used throughout this presentation
More informationBYOD Programme Handbook
BYOD Programme Handbook October 2018 Student & Parent Guide IT Helpdesk The IT Helpdesk is the initial point of contact for the IT Department. The IT Helpdesk is located in Mercy 2 and is open Monday
More informationKnowledge Graph Completion. Mayank Kejriwal (USC/ISI)
Knowledge Graph Completion Mayank Kejriwal (USC/ISI) What is knowledge graph completion? An intelligent way of doing data cleaning Deduplicating entity nodes (entity resolution) Collective reasoning (probabilistic
More informationWhere Next? Data Mining Techniques and Challenges for Trajectory Prediction. Slides credit: Layla Pournajaf
Where Next? Data Mining Techniques and Challenges for Trajectory Prediction Slides credit: Layla Pournajaf o Navigational services. o Traffic management. o Location-based advertising. Source: A. Monreale,
More informationADVANCED ANALYTICS USING SAS ENTERPRISE MINER RENS FEENSTRA
INSIGHTS@SAS: ADVANCED ANALYTICS USING SAS ENTERPRISE MINER RENS FEENSTRA AGENDA 09.00 09.15 Intro 09.15 10.30 Analytics using SAS Enterprise Guide Ellen Lokollo 10.45 12.00 Advanced Analytics using SAS
More informationLive me app computer app live
Live me app computer 153k. Rate this App. Live.me screenshot 1. Live.me screenshot 2. Live.me screenshot 3. Live.me screenshot 4. Live.me screenshot 5. Live.me screenshot 6. Download this app from Microsoft
More informationLive me app for computer
Live me app for computer HEY YOU! It's time to join the largest broadcasting community in the world LiveMe! In more than 85 countries, you can chat with people nearby and far,. 3 days ago. 1. Install Live.me
More information