Quantifying Search Bias: Investigating Sources of Bias for Political Searches in Social Media

Size: px
Start display at page:

Download "Quantifying Search Bias: Investigating Sources of Bias for Political Searches in Social Media"

Transcription

1 Quantifying Search Bias: Investigating Sources of Bias for Political Searches in Social Media Juhi Kulshrestha joint work with Motahhare Eslami, Johnnatan Messias, Muhammad Bilal Zafar, Saptarshi Ghosh, Krishna P. Gummadi and Karrie Karahalios

2 Social media as "search" platform Rich source of news & information Search is how users follow news about events & people Hashtags - recommended queries

3 Social media as "search" platform

4 Social media as "search" platform Ranked list (according to importance)

5 Potential bias in search results

6 Potential bias in search results

7 Search can shape user opinion Users place greater trust in higher ranked items [Pan et al., 2007] Biased search results can influence voting patterns [Epstein & Robertson, 2015]

8 Search bias in the headlines

9 Search bias in the headlines What does bias of a search system mean? How can we quantify it?

10 Identify sources of search bias Quantify bias of each source Study bias of political searches in Quantifying Search Bias CSCW 2017

11 Identify sources of search bias Quantify bias of each source Study bias of political searches in Quantifying Search Bias CSCW 2017

12 Social media search engine design Data corpus

13 Social media search engine design Data corpus q i1 i2 i3... (set) input (relevant items)

14 Social media search engine design Data corpus q i1 i2 i3... Ranking system (set) input (relevant items) ranking

15 Social media search engine design Data corpus q i1 i2 i3... (set) Ranking system i5 i1 i3... input (relevant items) ranking output (ranked list)

16 Social media search engine design Data corpus q i1 i2 i3... (set) Ranking system i5 i1 i3... input (relevant items) ranking output (ranked list)

17 Social media search engine design Data corpus q i1 i2 i3... (set) Ranking system i5 i1 i3... input (relevant items) ranking output (ranked list) Output bias may stem from Bias introduced by the ranking system

18 Social media search engine design Data corpus q i1 i2 i3... (set) Ranking system i5 i1 i3... input (relevant items) ranking output (ranked list) Output bias may stem from Bias introduced by the ranking system Bias in the input relevant item set

19 Identify sources of search bias Quantify bias of each source Study bias of political searches in Quantifying Search Bias CSCW 2017

20 Quantifying bias of each source Data corpus q i1 i2 i3... (set) Ranking system i5 i1 i3... input (relevant items) ranking output (ranked list) Step 1: Quantify bias of an individual item

21 Step 1: Quantifying bias of a single items

22 Step 1: Quantifying bias of a single items We use source bias as a proxy Infer bias of each individual item from the bias of the author High scalability

23 Step 1: Quantifying bias of a single items We use source bias as a proxy Infer bias of each individual item from the bias of the author High scalability Prior work on inferring content bias Could be plugged into our bias quantification framework

24 Step 1: Quantifying bias of a single items Evaluation: High agreement between source and content bias (75% or more)

25 Step 1: Quantifying bias of a single items Evaluation: High agreement between source and content bias (75% or more)

26 Step 1: Quantifying bias of a single items Evaluation: High agreement between source and content bias (75% or more) "R-NC 11th District, Republican party"

27 Step 1: Quantifying bias of a single items Evaluation: High agreement between source and content bias (75% or more)

28 Step 1: Quantifying bias of a single items Evaluation: High agreement between source and content bias (75% or more) "Republican party nominee for Presidential elections"

29 Quantifying bias of each source Data corpus q i1 i2 i3... (set) Ranking system i5 i1 i3... input (relevant items) ranking output (ranked list) Step 2: Quantify bias of a set of items

30 Step 2: Quantifying bias of set of items i1 i2 i3... (set) Compute bias score (s) for each item Take the average over the whole set input (relevant items)

31 Quantifying bias of each source Data corpus q i1 i2 i3... (set) Ranking system i5 i1 i3... input (relevant items) ranking output (ranked list) Step 3: Quantify bias of a ranked list of items

32 Step 3: Quantifying bias of ranked list of items MAP-style measure i5 i1 i3... output (ranked list) Bias till rank r Output bias

33 Quantifying bias of each source Data corpus q i1 i2 i3... (set) Ranking system i5 i1 i3... input (relevant items) ranking output (ranked list) Step 4: Quantify bias introduced by ranking system

34 Step 4: Quantifying bias introduced by ranking Ranking system Ranking bias = Output bias - Input bias ranking

35 Identify sources of search bias Quantify bias of each source Study bias of political searches in Quantifying Search Bias CSCW 2017

36 Studying bias for political searches in Twitter Search queries for 2016 Democratic and Republican presidential primary debates (#demdebate, rep debate,...) Presidential candidates (Hillary Clinton, Donald Trump,...)

37 Studying bias for political searches in Twitter Search queries for 2016 Democratic and Republican presidential primary debates (#demdebate, rep debate,...) Presidential candidates (Hillary Clinton, Donald Trump,...) Data collected Output: Twitter top search snapshots every 10 mins Input: All tweets containing the query, using streaming api

38 Studying bias for political searches in Twitter Search queries for 2016 Democratic and Republican presidential primary debates (#demdebate, rep debate,...) Presidential candidates (Hillary Clinton, Donald Trump,...) Data collected Output: Twitter top search snapshots every 10 mins Input: All tweets containing the query, using streaming api Non-personalized search data

39 Studying bias for political searches in Twitter Search queries for 2016 Democratic and Republican presidential primary debates (#demdebate, rep debate,...) Presidential candidates (Hillary Clinton, Donald Trump,...) Data collected Output: Twitter top search snapshots every 10 mins Input: All tweets containing the query, using streaming api Computed input, ranking, and output bias for each of the 25 queries

40 Bias in Twitter search: Input bias vs. Ranking Quantifying Search Bias CSCW 2017

41 Bias in Twitter search: Input bias vs. Ranking bias Does input bias Quantifying Search Bias CSCW 2017

42 Impact of input bias Query Output Bias Input Bias Bernie Sanders Martin O'Malley Rand Paul John Kasich dem debate #demdebate republican debate rep debate

43 Impact of input bias Query Output Bias Input Bias Bernie Sanders Martin O'Malley Rand Paul John Kasich dem debate #demdebate republican debate rep debate Input bias matters!

44 Impact of input bias Query Output Bias Input Bias Bernie Sanders Martin O'Malley Rand Paul John Kasich dem debate #demdebate republican debate rep debate Input bias varies across queries

45 Effect of query phrasing Query Input Bias dem debate 0.29 #demdebate 0.56 republican debate 0.27 rep debate 0.40

46 Effect of query phrasing Query Input Bias dem debate 0.29 #demdebate 0.56 republican debate 0.27 rep debate 0.40 Even for the same event, query phrasing can greatly effect the bias

47 Bias in Twitter search: Input bias vs. Ranking bias Does input bias matter? Input bias does matter Can vary significantly based on the query Even for the same event, different phrasings of queries have widely differing Quantifying Search Bias CSCW 2017

48 Bias in Twitter search: Input bias vs. Ranking bias Does input bias matter? Input bias does matter Can vary significantly based on the query Even for the same event, different phrasings of queries have widely differing biases Does the ranking bias Quantifying Search Bias CSCW 2017

49 Examining ranking bias Query Ranking Bias Hillary Clinton 0.18 Bernie Sanders 0.16 Martin O'Malley 0.07 Donald Trump 0.10 Ted Cruz Marco Rubio Ben Carson 0.26

50 Examining ranking bias Query Ranking Bias Hillary Clinton 0.18 Bernie Sanders 0.16 Martin O'Malley 0.07 Donald Trump 0.10 Ted Cruz Marco Rubio Ben Carson 0.26 Ranking bias does exist...

51 Examining ranking bias Query Ranking Bias Hillary Clinton 0.18 Bernie Sanders 0.16 Martin O'Malley 0.07 Donald Trump 0.10 Ted Cruz Marco Rubio Ben Carson but no evidence of systemic bias

52 Bias in Twitter search: Input bias vs. Ranking bias Does input bias matter? Input bias does matter Can vary significantly based on the query Even for the same event, different phrasings of queries have widely differing biases Does the ranking bias exist? Yes and varies across queries No evidence of systemic Quantifying Search Bias CSCW 2017

53 Open challenge: How to address search Quantifying Search Bias CSCW 2017

54 Open challenge: How to address search bias? Modify ranking system to account for bias Might lead to reduction in quality of Quantifying Search Bias CSCW 2017

55 Open challenge: How to address search bias? Modify ranking system to account for bias Might lead to reduction in quality of results Make the bias transparent Keep the current ranking Inform the users about the bias they are seeing Make biases related to query phrasing Quantifying Search Bias CSCW 2017

56 Quantifying Search Bias CSCW 2017

57 Quantifying Search Bias CSCW 2017

58 Quantifying Search Bias CSCW 2017

Quantifying Search Bias: Investigating Sources of Bias for Political Searches in Social Media

Quantifying Search Bias: Investigating Sources of Bias for Political Searches in Social Media Quantifying Search Bias: Investigating Sources of Bias for Political Searches in Social Media Muhammad Bilal Zafar with Juhi Kulshrestha, Motahhare Eslami, Johnnatan Messias, Saptarshi Ghosh, Krishna Gummadi

More information

Spring (percentages may not add to 100% due to rounding)

Spring (percentages may not add to 100% due to rounding) Spring 2016 Survey Information: Registered Voters, Random Selection, Landline and Cell Telephone Survey Number of Adult Wisconsin Registered Voters: 616 Interview Period: 4/12-4/15, 2016 Margin of Error

More information

Election Summary Report April 26, 2016 General Primary Election Armstrong County, PA Official Results

Election Summary Report April 26, 2016 General Primary Election Armstrong County, PA Official Results Registered Voters - 40,873 Votes Cast - 18,853 (46.13%) DEM President DEM Auditor General Hillary Clinton 2992 44.62% Eugene A Depasquale 5458 99.33% Bernie Sanders 3073 45.82% John Brown (Write-In) 10

More information

THE MISSING AGENDA THE IMPORTANCE OF CYBER SECURITY TO U.S. VOTERS

THE MISSING AGENDA THE IMPORTANCE OF CYBER SECURITY TO U.S. VOTERS THE MISSING AGENDA THE IMPORTANCE OF CYBER SECURITY TO U.S. VOTERS This election season voters have heard promises to make the U.S. great again and how we re stronger together. But they have yet to hear

More information

Election 2016 Twitter Sentiment Map

Election 2016 Twitter Sentiment Map Election 2016 Twitter Sentiment Map Alex Engel Stanford University ABSTRACT Many political polls conducted today are very prone to selection bias and other factors that negatively impact the validity of

More information

Sifaka: Text Mining Above a Search API

Sifaka: Text Mining Above a Search API Sifaka: Text Mining Above a Search API ABSTRACT Cameron VandenBerg Language Technologies Institute Carnegie Mellon University Pittsburgh, PA 15213, USA cmw2@cs.cmu.edu Text mining and analytics software

More information

Marquette Law School Poll September 24-28, 2015 Results for Registered Voters

Marquette Law School Poll September 24-28, 2015 Results for Registered Voters Marquette Law School Poll September 24-28, 2015 Results for Registered Voters (ages are rounded to whole numbers for reporting of results. Values ending in.5 here may round up or down if they are slightly

More information

FIELD RESEARCH CORPORATION

FIELD RESEARCH CORPORATION FIELD RESEARCH CORPORATION FOUNDED IN 1945 BY MERVIN FIELD 601 California Street San Francisco, California 94108 415-392-5763 Tabulations From a Survey of Californians Likely to Vote in the June 2016 Presidential

More information

I Vote For How Search Informs Our Choice of Candidate

I Vote For How Search Informs Our Choice of Candidate CHAPTER 1 I Vote For How Search Informs Our Choice of Candidate NICHOLAS DIAKOPOULOS, DANIEL TRIELLI, JENNIFER STARK, AND SEAN MUSSENDEN S earch engines such as Google mediate a substantial amount of human

More information

Cognos: Crowdsourcing Search for Topic Experts in Microblogs

Cognos: Crowdsourcing Search for Topic Experts in Microblogs Cognos: Crowdsourcing Search for Topic Experts in Microblogs Saptarshi Ghosh, Naveen Sharma, Fabricio Benevenuto, Niloy Ganguly, Krishna Gummadi IIT Kharagpur, India; UFOP, Brazil; MPI-SWS, Germany Topic

More information

Q1. Do you have a cellphone, or not? 1. Yes 2. No [SHOW IF Q1=1] Q2. Do you have a cellphone that connects to the Internet and can have apps, or does your phone only receive calls and text messages? 1.

More information

Analysis of Tweets: Donald Trump and sexual harassment allegations

Analysis of Tweets: Donald Trump and sexual harassment allegations Analysis of Tweets: Donald Trump and sexual harassment allegations MAIN CONCEPT Twitter historical database was used to search for suitable tweets. Search term of Trump and harassment was consequently

More information

SPUTNIK EXCLUSIVE: Research Proves Google...

SPUTNIK EXCLUSIVE: Research Proves Google... Sputnik International Photo: Youtube/SourceFed SPUTNIK EXCLUSIVE: Research Proves Google Manipulates Millions to Favor Clinton US 14:00 12.09.2016 (updated 20:59 14.09.2016) 1 of 29 09/18/2016 10:22 AM

More information

The Growing Gap between Landline and Dual Frame Election Polls

The Growing Gap between Landline and Dual Frame Election Polls MONDAY, NOVEMBER 22, 2010 Republican Share Bigger in -Only Surveys The Growing Gap between and Dual Frame Election Polls FOR FURTHER INFORMATION CONTACT: Scott Keeter Director of Survey Research Michael

More information

Kings County June 7, 2016 Statewide Presidential Primary Election Certified Official Final Results

Kings County June 7, 2016 Statewide Presidential Primary Election Certified Official Final Results Kings County June 7, 2016 Statewide Presidential Primary Election Certified Official Final Results Last Updated: June 17, 2016 9:43 AM Registration & Turnout 48,523 Voters Election Day Turnout 3,687 7.60%

More information

FIELD RESEARCH CORPORATION

FIELD RESEARCH CORPORATION FIELD RESEARCH CORPORATION FOUNDED IN 1945 BY MERVIN FIELD 601 California Street San Francisco, California 94108 415-392-5763 Tabulations From a Survey of California Likely Voters about the 2016 Presidential

More information

BOTS, TROLLS AND SOCIAL BOTS: THE GOOD, THE BAD AND THE UGLY

BOTS, TROLLS AND SOCIAL BOTS: THE GOOD, THE BAD AND THE UGLY BOTS, TROLLS AND SOCIAL BOTS: THE GOOD, THE BAD AND THE UGLY Juan Carlos Medina Serrano Data Scientist/PhD Candidate Political Data Science Technische Universität München Contents Bots The Bad and the

More information

FEAT: A FACEBOOK EXTRACTION AND ANALYSIS TOOLKIT. A Thesis by HAIHOUA YANG

FEAT: A FACEBOOK EXTRACTION AND ANALYSIS TOOLKIT. A Thesis by HAIHOUA YANG FEAT: A FACEBOOK EXTRACTION AND ANALYSIS TOOLKIT A Thesis by HAIHOUA YANG Submitted to the Graduate School at Appalachian State University in partial fulfillment of the requirements for the degree of MASTER

More information

TWITTER USE IN ELECTION CAMPAIGNS: TECHNICAL APPENDIX. Jungherr, Andreas. (2016). Twitter Use in Election Campaigns: A Systematic Literature

TWITTER USE IN ELECTION CAMPAIGNS: TECHNICAL APPENDIX. Jungherr, Andreas. (2016). Twitter Use in Election Campaigns: A Systematic Literature 1 Jungherr, Andreas. (2016). Twitter Use in Election Campaigns: A Systematic Literature Review. Journal of Information Technology & Politics. (Forthcoming). Technical Appendix Coding Process 1. Keyword

More information

Link Farming in Twitter

Link Farming in Twitter Link Farming in Twitter Pawan Goyal CSE, IITKGP Nov 11, 2016 Pawan Goyal (IIT Kharagpur) Link Farming in Twitter Nov 11, 2016 1 / 1 Reference Saptarshi Ghosh, Bimal Viswanath, Farshad Kooti, Naveen Kumar

More information

DOWNS MODEL OF ELECTIONS

DOWNS MODEL OF ELECTIONS DOWNS MODEL OF ELECTIONS 1 A. Assumptions 1. A single dimension of alternatives. 2. Voters. a. prefer the candidate that is closer to their ideal point more than one farther away (as before). A. Assumptions

More information

Election Summary Report MARION COUNTY PRIMARY ELECTION March 15, 2016 Summary For Jurisdiction Wide, All Counters, All Races Official Results

Election Summary Report MARION COUNTY PRIMARY ELECTION March 15, 2016 Summary For Jurisdiction Wide, All Counters, All Races Official Results Election Summary ort Time:17:28:21 Page:1 of 9 Registered Voters 38747 - Cards Cast 16377 42.27% Num. ort Precinct 65 - Num. orting 65 100.00% Delegates/Alternates-at-Large Votes 4685 Hillary Clinton 2644

More information

Mining Social Media Users Interest

Mining Social Media Users Interest Mining Social Media Users Interest Presenters: Heng Wang,Man Yuan April, 4 th, 2016 Agenda Introduction to Text Mining Tool & Dataset Data Pre-processing Text Mining on Twitter Summary & Future Improvement

More information

Predicting primary election results through network analysis of donor relationships

Predicting primary election results through network analysis of donor relationships Predicting primary election results through network analysis of donor relationships 1 Introduction Michael Weingert, Ellen Sebastian December 9, 2015 In U.S. Presidential elections, campaign funding is

More information

Social Search Introduction to Information Retrieval INF 141/ CS 121 Donald J. Patterson

Social Search Introduction to Information Retrieval INF 141/ CS 121 Donald J. Patterson Social Search Introduction to Information Retrieval INF 141/ CS 121 Donald J. Patterson The Anatomy of a Large-Scale Social Search Engine by Horowitz, Kamvar WWW2010 Web IR Input is a query of keywords

More information

.. Spring 2016 DATA 301: Introduction to Data Science Alex Dekhtyar.. Lab 5, HTML, Mashups, Madlibs and Mockery

.. Spring 2016 DATA 301: Introduction to Data Science Alex Dekhtyar.. Lab 5, HTML, Mashups, Madlibs and Mockery .. Spring 2016 DATA 301: Introduction to Data Science Alex Dekhtyar.. Due date: Monday, April 25, 11:59am Lab 5, HTML, Mashups, Madlibs and Mockery Lab Assignment Assignment Preparation This is a team

More information

Purpose. Symptom. Cause. March 4, 2008

Purpose. Symptom. Cause. March 4, 2008 March 4, 2008 To: From: AVC Advantage Customers using series 9.0 firmware Joe McIntyre, Senior Project/Account Manager Re: WinEDS Technical Product Bulletin - AVC Advantage Party Turnout Issue / Operator

More information

Using US Facebook Data on Server

Using US Facebook Data on Server Using US Facebook Data on Server Keng-Chi Chang 2017-10-18 Contents 1 Quick Start & Basic Structure 2 2 Directories of Cleaned Data 3 3 Directories of Raw Data 4 4 Introduction to Facebook Data 5 4.1 Page

More information

Network Centrality. Saptarshi Ghosh Department of CSE, IIT Kharagpur Social Computing course, CS60017

Network Centrality. Saptarshi Ghosh Department of CSE, IIT Kharagpur Social Computing course, CS60017 Network Centrality Saptarshi Ghosh Department of CSE, IIT Kharagpur Social Computing course, CS60017 Node centrality n Relative importance of a node in a network n How influential a person is within a

More information

NUSIS at TREC 2011 Microblog Track: Refining Query Results with Hashtags

NUSIS at TREC 2011 Microblog Track: Refining Query Results with Hashtags NUSIS at TREC 2011 Microblog Track: Refining Query Results with Hashtags Hadi Amiri 1,, Yang Bao 2,, Anqi Cui 3,,*, Anindya Datta 2,, Fang Fang 2,, Xiaoying Xu 2, 1 Department of Computer Science, School

More information

COMPARING RESULTS OF SENTIMENT ANALYSIS USING NAIVE BAYES AND SUPPORT VECTOR MACHINE IN DISTRIBUTED APACHE SPARK ENVIRONMENT

COMPARING RESULTS OF SENTIMENT ANALYSIS USING NAIVE BAYES AND SUPPORT VECTOR MACHINE IN DISTRIBUTED APACHE SPARK ENVIRONMENT COMPARING RESULTS OF SENTIMENT ANALYSIS USING NAIVE BAYES AND SUPPORT VECTOR MACHINE IN DISTRIBUTED APACHE SPARK ENVIRONMENT Tomasz Szandala Department of Computer Engineering Wrocław University of Technology,

More information

Supporting Research Data Collection from YouTube with TubeKit

Supporting Research Data Collection from YouTube with TubeKit Supporting Research Data Collection from YouTube with TubeKit Chirag Shah School of Information & Library Science University of North Carolina Chapel Hill NC 27599, USA chirag@unc.edu Abstract We present

More information

Twentieth Annual University of Oregon Eugene Luks Programming Competition

Twentieth Annual University of Oregon Eugene Luks Programming Competition Twentieth Annual University of Oregon Eugene Luks Programming Competition 206 April 0, Saturday am0:00 pm2:00 Problem Contributors James Allen, Eugene Luks, Chris Wilson, and the ACM Technical Assistance

More information

Mapping Twitter Conversation Landscapes

Mapping Twitter Conversation Landscapes Mapping Twitter Conversation Landscapes The MIT Faculty has made this article openly available. Please share how this access benefits you. Your story matters. Citation As Published Publisher Vosoughi,

More information

Cumulative Report - Unofficial MARTIN COUNTY, TEXAS - GENERAL ELECTION - November 06, 2018

Cumulative Report - Unofficial MARTIN COUNTY, TEXAS - GENERAL ELECTION - November 06, 2018 Page 1of6 Total Number of Voters: 2, 177 of 2,970 = 73.30% Precincts Reporting 8 of 8 = 100.00% I Party Candidate Absentee Early Election Total.i Straight Party, Vote For 1 Republican Party 34 79.07% 455

More information

LCC 6313 Principles of Interactive Design Professor: Ian Bogost Date: Mar 4, 2005 Name: Jaemin Lee. Three leaf clover Website Design Documentation

LCC 6313 Principles of Interactive Design Professor: Ian Bogost Date: Mar 4, 2005 Name: Jaemin Lee. Three leaf clover Website Design Documentation Three leaf clover Website Design Documentation 1 1. Strategy 1-1. Site Objectives This site visualizes the social connection and the political inclination of the most powerful U.S. companies. This site

More information

HOW TO TEXT OUT THE VOTE (TOTV)

HOW TO TEXT OUT THE VOTE (TOTV) HOW TO TEXT OUT THE VOTE (TOTV) Introductions AGENDA HUSTLE Overview of Hustle HOW TO How to start texting voters! We ll be learning about best practices and replying to messages PRACTICE Try it yourself

More information

Using a Probabilistic Model to Assist Merging of Large-scale Administrative Records

Using a Probabilistic Model to Assist Merging of Large-scale Administrative Records Using a Probabilistic Model to Assist Merging of Large-scale Administrative Records Ted Enamorado Benjamin Fifield Kosuke Imai Princeton University Talk at Seoul National University Fifth Asian Political

More information

LIS 665 Lesson Plan Kelly Denzer April 2, 2016

LIS 665 Lesson Plan Kelly Denzer April 2, 2016 LIS 665 Lesson Plan Kelly Denzer April 2, 2016 Introduction: This lesson plan, bibliography, and reflection details the steps I took when researching a brokered convention, a task given to me by my friend

More information

1 of 20 05/13/ :37 AM

1 of 20 05/13/ :37 AM 2017 CONGRESSIONAL REPORT PRESENTED TO: Vice President Mike Pence Kansas Secretary of State Kris Kobach The American Public and the U.S. Commission on Voter Fraud (Sections from the 620 Page Report) 1

More information

Supporting Research Data Collection from YouTube with TubeKit

Supporting Research Data Collection from YouTube with TubeKit Supporting Research Data Collection from YouTube with TubeKit Chirag Shah School of Information & Library Science University of North Carolina chirag@unc.edu Abstract We present TubeKit, a query-based

More information

Effective Tweet Contextualization with Hashtags Performance Prediction and Multi-Document Summarization

Effective Tweet Contextualization with Hashtags Performance Prediction and Multi-Document Summarization Effective Tweet Contextualization with Hashtags Performance Prediction and Multi-Document Summarization Romain Deveaud 1 and Florian Boudin 2 1 LIA - University of Avignon romain.deveaud@univ-avignon.fr

More information

Scytl Election Results User Guide

Scytl Election Results User Guide Scytl Election Results User Guide Allegheny County has contracted with Scytl to display County election results in a manner that is transparent and simple for both the general public and users who are

More information

arxiv: v1 [cs.ir] 20 Dec 2018

arxiv: v1 [cs.ir] 20 Dec 2018 Intertemporal Connections Between Query Suggestions and Search Engine Results for Politics Related Queries Malte Bonart 1 and Philipp Schaer 1 arxiv:1812.08585v1 [cs.ir] 20 Dec 2018 1 Introduction Technische

More information

Orange3 Text Mining Documentation

Orange3 Text Mining Documentation Orange3 Text Mining Documentation Release Biolab January 26, 2017 Widgets 1 Corpus 1 2 NY Times 5 3 Twitter 9 4 Wikipedia 13 5 Pubmed 17 6 Corpus Viewer 21 7 Preprocess Text 25 8 Bag of Words 31 9 Topic

More information

Mining the Evolution of Political Opinion

Mining the Evolution of Political Opinion Utrecht University Mining the Evolution of Political Opinion Master Thesis First Supervisor prof. dr. A.P.J.M. Siebes Algorithmic Data Analysis Group Department of Information and Computing Science Second

More information

CSEP 573: Artificial Intelligence

CSEP 573: Artificial Intelligence CSEP 573: Artificial Intelligence Machine Learning: Perceptron Ali Farhadi Many slides over the course adapted from Luke Zettlemoyer and Dan Klein. 1 Generative vs. Discriminative Generative classifiers:

More information

Finding Influencers within Fuzzy Topics on Twitter

Finding Influencers within Fuzzy Topics on Twitter 1 1. Introduction Finding Influencers within Fuzzy Topics on Twitter Tal Stramer, Stanford University In recent years, there has been an explosion in usage of online social networks such as Twitter. This

More information

Retrieval Evaluation. Hongning Wang

Retrieval Evaluation. Hongning Wang Retrieval Evaluation Hongning Wang CS@UVa What we have learned so far Indexed corpus Crawler Ranking procedure Research attention Doc Analyzer Doc Rep (Index) Query Rep Feedback (Query) Evaluation User

More information

UPV-INAOE-Autoritas - Check That: An Approach based on External Sources to Detect Claims Credibility

UPV-INAOE-Autoritas - Check That: An Approach based on External Sources to Detect Claims Credibility UPV-INAOE-Autoritas - Check That: An Approach based on External Sources to Detect Claims Credibility Bilal Ghanem 1, Manuel Montes-y-Gómez 3, Francisco Rangel 1,2, and Paolo Rosso 1 1 PRHLT Research Center,

More information

Comparative Analysis of Clicks and Judgments for IR Evaluation

Comparative Analysis of Clicks and Judgments for IR Evaluation Comparative Analysis of Clicks and Judgments for IR Evaluation Jaap Kamps 1,3 Marijn Koolen 1 Andrew Trotman 2,3 1 University of Amsterdam, The Netherlands 2 University of Otago, New Zealand 3 INitiative

More information

www.newsflashenglish.com ESL ENGLISH LESSON (60-120 mins) 15 th November 2010 Social networking in today s world Hands up those of you who like to chat online? How many of you use Facebook? Probably quite

More information

Recitation 2: Strings and RegExp Statistical Computing, Monday September 28, 2015

Recitation 2: Strings and RegExp Statistical Computing, Monday September 28, 2015 Recitation 2: Strings and RegExp Statistical Computing, 36-350 Monday September 28, 2015 Strings: A Brief Review Intituively, a string is a seris of letters, numbers, and other symbols. R stores these

More information

Order or Shuffle: Empirically Evaluating Vertex Order Impact on Parallel Graph Computations

Order or Shuffle: Empirically Evaluating Vertex Order Impact on Parallel Graph Computations Order or Shuffle: Empirically Evaluating Vertex Order Impact on Parallel Graph Computations George M. Slota 1 Sivasankaran Rajamanickam 2 Kamesh Madduri 3 1 Rensselaer Polytechnic Institute, 2 Sandia National

More information

Using NLP and context for improved search result in specialized search engines

Using NLP and context for improved search result in specialized search engines Mälardalen University School of Innovation Design and Engineering Västerås, Sweden Thesis for the Degree of Bachelor of Science in Computer Science DVA331 Using NLP and context for improved search result

More information

RMIT University at TREC 2006: Terabyte Track

RMIT University at TREC 2006: Terabyte Track RMIT University at TREC 2006: Terabyte Track Steven Garcia Falk Scholer Nicholas Lester Milad Shokouhi School of Computer Science and IT RMIT University, GPO Box 2476V Melbourne 3001, Australia 1 Introduction

More information

FIELD RESEARCH CORPORATION

FIELD RESEARCH CORPORATION FIELD RESEARCH CORPORATION FOUNDED IN 1945 BY MERVIN FIELD 601 California Street San Francisco, California 94108 415-392-5763 Tabulations From a Survey of California Likely Voters about the 2016 Presidential

More information

Clinton Leads Trump in Michigan by 10% (Clinton 49% - Trump 39%)

Clinton Leads Trump in Michigan by 10% (Clinton 49% - Trump 39%) P R E S S R E L E A S E FOR RELEASE: August 16, 2016 Contact: Steve Mitchell 248-891-2414 Clinton Leads Trump in Michigan by 10% (Clinton 49% - Trump 39%) EAST LANSING, Michigan --- The latest Fox 2 Detroit/Mitchell

More information

ALPACA: A Lightweight Platform for Analyzing Claim Acceptability

ALPACA: A Lightweight Platform for Analyzing Claim Acceptability ALPACA: A Lightweight Platform for Analyzing Claim Acceptability ABSTRACT Jeff King peff@cc.gatech.edu Michael T. Hunter mhunter@cc.gatech.edu Internet users face challenges in evaluating the validity

More information

MATH 1340 Mathematics & Politics

MATH 1340 Mathematics & Politics MTH 1340 Mathematics & Politics Lecture 4 June 25, 2015 Slides prepared by Iian Smythe for MTH 1340, Summer 2015, at Cornell University 1 Profiles and social choice functions Recall from last time: in

More information

By Garrett Chang and Steven Fiacco (and Darren Lin and Anders Maraviglia [SD&D])

By Garrett Chang and Steven Fiacco (and Darren Lin and Anders Maraviglia [SD&D]) Final Project Report By Garrett Chang and Steven Fiacco (and Darren Lin and Anders Maraviglia [SD&D]) Motivation Our motivation for this visualization resulted from a simple problem when viewing election

More information

PageRank for Product Image Search. Research Paper By: Shumeet Baluja, Yushi Jing

PageRank for Product Image Search. Research Paper By: Shumeet Baluja, Yushi Jing PageRank for Product Image Search Research Paper By: Shumeet Baluja, Yushi Jing Topics Motivation What is PageRank? ImageRank Algorithm Features generation & Similarity measure Concept of Centrality PageRank

More information

Colorado Results. For 10/3/ /4/2012. Contact: Doug Kaplan,

Colorado Results. For 10/3/ /4/2012. Contact: Doug Kaplan, Colorado Results For 10/3/2012 10/4/2012 Contact: Doug Kaplan, 407-242-1870 Executive Summary Following the debates, Gravis Marketing, a non-partisan research firm conducted a survey of 1,438 likely voters

More information

VOTES PERCENT US SENATOR VOTE FOR 1 MARLIN A. STUTZMAN... 4, TODD YOUNG... 12,

VOTES PERCENT US SENATOR VOTE FOR 1 MARLIN A. STUTZMAN... 4, TODD YOUNG... 12, Page 1 of 8 SUMMARY REPORT HANCOCK COUNTY, INDIANA RUN DATE:05/04/16 PRIMARY ELECTION RUN TIME:07:52 PM MAY 3, 2016 STATISTICS VOTES PERCENT PRECINCTS COUNTED (OF 47)..... 47 100.00 REGISTERED VOTERS -

More information

Scraping and Preprocessing of Social Media Data

Scraping and Preprocessing of Social Media Data Preconference on Computational tools for text mining, processing and analysis. May 25th 2017, 9:00-17:00 (ICA San Diego) Scraping and Preprocessing of Social Media Data H A I LIANG, A SSISTANT PROFESSOR

More information

Towards Summarizing the Web of Entities

Towards Summarizing the Web of Entities Towards Summarizing the Web of Entities contributors: August 15, 2012 Thomas Hofmann Director of Engineering Search Ads Quality Zurich, Google Switzerland thofmann@google.com Enrique Alfonseca Yasemin

More information

6/29/ :38 AM 1

6/29/ :38 AM 1 6/29/2017 11:38 AM 1 Creating an Event Hub In this lab, you will create an Event Hub. What you need for this lab An Azure Subscription Create an event hub Take the following steps to create an event hub

More information

HTML 5 and CSS 3, Illustrated Complete. Unit M: Integrating Social Media Tools

HTML 5 and CSS 3, Illustrated Complete. Unit M: Integrating Social Media Tools HTML 5 and CSS 3, Illustrated Complete Unit M: Integrating Social Media Tools Objectives Understand social networking Integrate a Facebook account with a Web site Integrate a Twitter account feed Add a

More information

GraphCEP Real-Time Data Analytics Using Parallel Complex Event and Graph Processing

GraphCEP Real-Time Data Analytics Using Parallel Complex Event and Graph Processing Institute of Parallel and Distributed Systems () Universitätsstraße 38 D-70569 Stuttgart GraphCEP Real-Time Data Analytics Using Parallel Complex Event and Graph Processing Ruben Mayer, Christian Mayer,

More information

Eavesdropping on Fine-Grained User Activities Within Smartphone Apps Over Encrypted Network Traffic

Eavesdropping on Fine-Grained User Activities Within Smartphone Apps Over Encrypted Network Traffic Eavesdropping on Fine-Grained User Activities Within Smartphone Apps Over Encrypted Network Traffic Brendan Saltaformaggio, Hongjun Choi, Kristen Johnson, Yonghwi Kwon, Qi Zhang, Xiangyu Zhang, Dongyan

More information

60-538: Information Retrieval

60-538: Information Retrieval 60-538: Information Retrieval September 7, 2017 1 / 48 Outline 1 what is IR 2 3 2 / 48 Outline 1 what is IR 2 3 3 / 48 IR not long time ago 4 / 48 5 / 48 now IR is mostly about search engines there are

More information

Refresher: ER-modeling, logical relational model, dependencies. Toon Calders

Refresher: ER-modeling, logical relational model, dependencies. Toon Calders Refresher: ER-modeling, logical relational model, dependencies Toon Calders toon.calders@ulb.ac.be Different Levels Conceptual level: ER-diagrams Logical level: Relations, attributes, schemas, primary

More information

Predicting Bias in Machine Learned Classifiers Using Clustering

Predicting Bias in Machine Learned Classifiers Using Clustering Predicting Bias in Machine Learned Classifiers Using Clustering Robert Thomson 1, Elie Alhajjar 1, Joshua Irwin 2, and Travis Russell 1 1 United States Military Academy, West Point NY 10996, USA {Robert.Thomson,Elie.Alhajjar,Travis.Russell}@usma.edu

More information

Advanced Computer Graphics CS 525M: Crowds replace Experts: Building Better Location-based Services using Mobile Social Network Interactions

Advanced Computer Graphics CS 525M: Crowds replace Experts: Building Better Location-based Services using Mobile Social Network Interactions Advanced Computer Graphics CS 525M: Crowds replace Experts: Building Better Location-based Services using Mobile Social Network Interactions XIAOCHEN HUANG Computer Science Dept. Worcester Polytechnic

More information

CS 115 Exam 3, Fall 2016, Sections 1-4

CS 115 Exam 3, Fall 2016, Sections 1-4 , Sections 1-4 Your name: Rules You may use one handwritten 8.5 x 11 cheat sheet (front and back). This is the only resource you may consult during this exam. Explain/show work if you want to receive partial

More information

Windows media player free download for iphone

Windows media player free download for iphone Windows media player free download for iphone 01/08/2018 Donald trump budget cuts proposal 01/10/2018 Relay for life themes 2017 01/10/2018 -Ralph lauren shirt dress women -Is valium used for nausea 01/11/2018

More information

CS473: Course Review CS-473. Luo Si Department of Computer Science Purdue University

CS473: Course Review CS-473. Luo Si Department of Computer Science Purdue University CS473: CS-473 Course Review Luo Si Department of Computer Science Purdue University Basic Concepts of IR: Outline Basic Concepts of Information Retrieval: Task definition of Ad-hoc IR Terminologies and

More information

Web Search. Lecture Objectives. Text Technologies for Data Science INFR Learn about: 11/14/2017. Instructor: Walid Magdy

Web Search. Lecture Objectives. Text Technologies for Data Science INFR Learn about: 11/14/2017. Instructor: Walid Magdy Text Technologies for Data Science INFR11145 Web Search Instructor: Walid Magdy 14-Nov-2017 Lecture Objectives Learn about: Working with Massive data Link analysis (PageRank) Anchor text 2 1 The Web Document

More information

SOCIAL MEDIA. Charles Murphy

SOCIAL MEDIA. Charles Murphy SOCIAL MEDIA Charles Murphy Social Media Overview 1. Introduction 2. Social Media Areas Blogging Bookmarking Deals Location-based Music Photo sharing Video 3. The Fab Four FaceBook Google+ Linked In Twitter

More information

Analyzing a Large Corpus of Crowdsourced Plagiarism

Analyzing a Large Corpus of Crowdsourced Plagiarism Analyzing a Large Corpus of Crowdsourced Plagiarism Master s Thesis Michael Völske Fakultät Medien Bauhaus-Universität Weimar 06.06.2013 Supervised by: Prof. Dr. Benno Stein Prof. Dr. Volker Rodehorst

More information

Processing Twitter Data with MongoDB. Xiaoxiao Liu

Processing Twitter Data with MongoDB. Xiaoxiao Liu Processing Twitter Data with MongoDB Xiaoxiao Liu Issue with Facebook Data Original, I planned to do this project with Facebook Data. - Facebook Graph API - Third-Party Java Library: restfb I was interested

More information

/ Cloud Computing. Recitation 7 October 10, 2017

/ Cloud Computing. Recitation 7 October 10, 2017 15-319 / 15-619 Cloud Computing Recitation 7 October 10, 2017 Overview Last week s reflection Project 3.1 OLI Unit 3 - Module 10, 11, 12 Quiz 5 This week s schedule OLI Unit 3 - Module 13 Quiz 6 Project

More information

Graph Data Modeling for Political Communication on Twitter

Graph Data Modeling for Political Communication on Twitter Graduate Theses and Dissertations Iowa State University Capstones, Theses and Dissertations 2016 Graph Data Modeling for Political Communication on Twitter Prashant Kumar Iowa State University Follow this

More information

Evaluating the Accuracy of. Implicit feedback. from Clicks and Query Reformulations in Web Search. Learning with Humans in the Loop

Evaluating the Accuracy of. Implicit feedback. from Clicks and Query Reformulations in Web Search. Learning with Humans in the Loop Evaluating the Accuracy of Implicit Feedback from Clicks and Query Reformulations in Web Search Thorsten Joachims, Filip Radlinski, Geri Gay, Laura Granka, Helene Hembrooke, Bing Pang Department of Computer

More information

CIRGDISCO at RepLab2012 Filtering Task: A Two-Pass Approach for Company Name Disambiguation in Tweets

CIRGDISCO at RepLab2012 Filtering Task: A Two-Pass Approach for Company Name Disambiguation in Tweets CIRGDISCO at RepLab2012 Filtering Task: A Two-Pass Approach for Company Name Disambiguation in Tweets Arjumand Younus 1,2, Colm O Riordan 1, and Gabriella Pasi 2 1 Computational Intelligence Research Group,

More information

STAT10010 Introductory Statistics Lab 2

STAT10010 Introductory Statistics Lab 2 STAT10010 Introductory Statistics Lab 2 1. Aims of Lab 2 By the end of this lab you will be able to: i. Recognize the type of recorded data. ii. iii. iv. Construct summaries of recorded variables. Calculate

More information

CS54701: Information Retrieval

CS54701: Information Retrieval CS54701: Information Retrieval Basic Concepts 19 January 2016 Prof. Chris Clifton 1 Text Representation: Process of Indexing Remove Stopword, Stemming, Phrase Extraction etc Document Parser Extract useful

More information

custom fused glass tile Important Copy: custom glass tile and fused glass tile Custom Glass Tile 2016 dhhs federal employee performance awards

custom fused glass tile Important Copy: custom glass tile and fused glass tile Custom Glass Tile 2016 dhhs federal employee performance awards custom fused glass tile Custom Glass Tile Important Copy: custom glass tile and fused glass tile 2016 dhhs federal employee performance awards Maybe he wasnt calling rigidly still body language countries

More information

Using a Probabilistic Model to Assist Merging of Large-scale Administrative Records

Using a Probabilistic Model to Assist Merging of Large-scale Administrative Records Using a Probabilistic Model to Assist Merging of Large-scale Administrative Records Kosuke Imai Princeton University Talk at SOSC Seminar Hong Kong University of Science and Technology June 14, 2017 Joint

More information

Multimedia Information Systems

Multimedia Information Systems Multimedia Information Systems Samson Cheung EE 639, Fall 2004 Lecture 6: Text Information Retrieval 1 Digital Video Library Meta-Data Meta-Data Similarity Similarity Search Search Analog Video Archive

More information

Unstructured Data. CS102 Winter 2019

Unstructured Data. CS102 Winter 2019 Winter 2019 Big Data Tools and Techniques Basic Data Manipulation and Analysis Performing well-defined computations or asking well-defined questions ( queries ) Data Mining Looking for patterns in data

More information

Voters and Mail. 5 Insights to Boost Campaign Impact. A United States Postal Service and American Association of Political Consultants (AAPC) study

Voters and Mail. 5 Insights to Boost Campaign Impact. A United States Postal Service and American Association of Political Consultants (AAPC) study Voters and Mail 5 Insights to Boost Campaign Impact A United States Postal Service and American Association of Political Consultants (AAPC) study Voters are waiting for you at the mailbox. The American

More information

Managing your online reputation

Managing your online reputation Managing your online reputation In this internet age where every thought, feeling and opinion is tweeted, posted or blogged about for the world to see, reputation management has never been so important

More information

From Obscurity to Prominence in Minutes: Political Speech and Real-Time Search

From Obscurity to Prominence in Minutes: Political Speech and Real-Time Search Wellesley College Wellesley College Digital Scholarship and Archive Computer Science Faculty Scholarship Computer Science 4-26-2010 From Obscurity to Prominence in Minutes: Political Speech and Real-Time

More information

Aheadsup.com channel - Automatic Twitter Feed

Aheadsup.com channel - Automatic Twitter Feed Aheadsup.com channel - Automatic Twitter Feed Here is how it works: Post an entry in your social media sentiment channel. Example: alert for SGU The alert will automatically appear on the web site of your

More information

CS130 Software Tools. Fall 2010 Intro to SPSS and Data Handling

CS130 Software Tools. Fall 2010 Intro to SPSS and Data Handling Software Tools Intro to SPSS and Data Handling 1 Types of Analyses When doing data analysis, we are interested in two types of summaries: Statistical Summaries (e.g. descriptive, hypothesis testing) Visual

More information

Advisor/Committee Members Dr. Chris Pollett Dr. Mark Stamp Dr. Soon Tee Teoh. By Vijeth Patil

Advisor/Committee Members Dr. Chris Pollett Dr. Mark Stamp Dr. Soon Tee Teoh. By Vijeth Patil Advisor/Committee Members Dr. Chris Pollett Dr. Mark Stamp Dr. Soon Tee Teoh By Vijeth Patil Motivation Project goal Background Yioop! Twitter RSS Modifications to Yioop! Test and Results Demo Conclusion

More information

The Plurality-with-Elimination Method

The Plurality-with-Elimination Method The Plurality-with-Elimination Method Lecture 9 Section 1.4 Robb T. Koether Hampden-Sydney College Fri, Sep 8, 2017 Robb T. Koether (Hampden-Sydney College) The Plurality-with-Elimination Method Fri, Sep

More information

Taming Text. How to Find, Organize, and Manipulate It MANNING GRANT S. INGERSOLL THOMAS S. MORTON ANDREW L. KARRIS. Shelter Island

Taming Text. How to Find, Organize, and Manipulate It MANNING GRANT S. INGERSOLL THOMAS S. MORTON ANDREW L. KARRIS. Shelter Island Taming Text How to Find, Organize, and Manipulate It GRANT S. INGERSOLL THOMAS S. MORTON ANDREW L. KARRIS 11 MANNING Shelter Island contents foreword xiii preface xiv acknowledgments xvii about this book

More information

Visualization of a hashtag's activity in real time. BS.C Thesis Davvetas Athanasios Technological Educational Institute of Athens

Visualization of a hashtag's activity in real time. BS.C Thesis Davvetas Athanasios Technological Educational Institute of Athens Visualization of a hashtag's activity in real time BS.C Thesis Davvetas Athanasios Technological Educational Institute of Athens cs121018@teiath.gr Tools used During this project we used open source tools

More information