Modeling Information Seeking Behavior in Social Media Eugene Agichtein
1 Modeling Information Seeking Behavior in Social Media Eugene Agichtein, Intelligent Information Access Lab (IRLab), Emory University
2 Intelligent Information Access Lab (IRLab) Modeling information seeking behavior; searching the Web and social media; text and data mining for medical informatics and public health. Ablimit Aji, Qi Guo, Julia Kiseleva, Dmitry Lagun, Qiaoling Liu, Yu Wang. In collaboration with: Beth Buffalo (Neurology), Charlie Clarke (Waterloo), Ernie Garcia (Radiology), Phil Wolff (Psychology), Hongyuan Zha (GaTech)
3 Our Approach to Intelligent Information Access Search logs: queries, clicks. Data-driven model discovery (machine learning/data mining). Intelligent search; information sharing; health informatics; cognitive diagnostics
4 Intelligent search: Contextualized Intent Inference
5 Intelligent search: Web-scale Text Mining. Extract entities, relationships, events from text; estimate accuracy of web content. Example: DiseaseOutbreaks, The New York Times. Some applications: incorporating extracted information into (web) search; finding implicit connections between events, entities; visualizing and exploring large text collections. [DL 00, ICDE 2003 best student paper, SIGMOD 2006 best paper, ] 18 November 2009, Eugene Agichtein, Emory University, IR Lab
6 Health Informatics: Information Extraction for Decision Support, with E. V. Garcia (Radiology) and A. Ram (Georgia Tech). Rule Discovery from Medical Literature (MERLIN project): identify articles containing useful clinical knowledge; extract new expert system rules for the Emory Cardiac Toolbox, e.g. IF LV_stress_perfusion_is_abnormal THEN Diseased_coronary_is(LAD). Personalized diagnosis and care (PRETEX project): extract clinical variables from text in patient records; personalize expert system rules for a given patient or population. New: unexpected findings
7 Talk Outline: IR Lab research overview; mining interactions in social media; content quality; information seeker satisfaction; question intent; if time: inferring web search intent
8 Finding Information Online
9 From Searching to Finding
10 Social (Information) Sharing
12 (Text) Social Media Today Published: 4Gb/day; Social Media: 10Gb/day. Yahoo Answers: 120M users, 40M questions, 1B answers. Yes, we could read your blog. Or, you could tell us about your day
13 Finding Information Online (Revisited) Claim: next generation of search will provide support for real-time, mediated info exchange. First step: web-scale collaborative question answering (CQA) sites: realistic information exchange; 100M+ community; many immediate challenges
16 (Some) Related Work Adamic et al., WWW 2007, WWW 2008: expertise sharing, network structure. Elsas et al., SIGIR 2008: blog search. Glance et al.: Blog Pulse, popularity, information sharing. Harper et al., CHI 2008, 2009: answer quality across multiple CQA sites. Kraut et al.: community participation. Kumar et al., WWW 2004, KDD 2008: information diffusion in blogspace, network evolution. Third Workshop on Searching Social Media (SSM 2010) at WSDM: edu/ssm2010/
17 Finding High Quality Content in SM E. Agichtein, C. Castillo, D. Donato, A. Gionis, and G. Mishne, Finding High Quality Content in Social Media, in WSDM 2008. Well-written; interesting; relevant (answer); factually correct. Popular? Provocative? Useful? As judged by professional editors
19 How do Question and Answer Quality relate?
24 Community
25 Link Analysis for Authority Estimation [Figure: graph of Users 1-6, Questions 1-3, and Answers 1-6, reduced to a graph over users in which askers link to their answerers] Authority (answerer): $A(j) = \sum_{i=0}^{M} H(i)$. Hub (asker): $H(i) = \sum_{j=0}^{K} A(j)$
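The hub and authority updates on this slide can be sketched as a short iteration. This is a minimal illustration, not the implementation used in the talk: the function name `hits`, the `(asker, answerer)` edge-list input, the fixed iteration count, and the L2 normalization after each step are all assumptions made for the example.

```python
import math
from collections import defaultdict

def hits(edges, iterations=50):
    """Hub/authority scores for a list of (asker, answerer) edges.

    An edge (u, v) means user v answered a question asked by user u.
    """
    users = {u for edge in edges for u in edge}
    hub = {u: 1.0 for u in users}
    auth = {u: 1.0 for u in users}
    answered_by = defaultdict(list)  # asker -> users who answered them
    asked_by = defaultdict(list)     # answerer -> users they answered
    for asker, answerer in edges:
        answered_by[asker].append(answerer)
        asked_by[answerer].append(asker)

    for _ in range(iterations):
        # A(j): sum of hub scores of the askers that j answered
        auth = {u: sum(hub[i] for i in asked_by[u]) for u in users}
        # H(i): sum of authority scores of the users who answered i
        hub = {u: sum(auth[j] for j in answered_by[u]) for u in users}
        # Normalize so the scores stay bounded (assumed: L2 norm)
        for scores in (auth, hub):
            norm = math.sqrt(sum(s * s for s in scores.values())) or 1.0
            for u in scores:
                scores[u] /= norm
    return hub, auth

# Hypothetical toy graph: u1 and u2 ask, u3-u5 answer.
edges = [("u1", "u3"), ("u1", "u4"), ("u2", "u4"), ("u2", "u5")]
hub, auth = hits(edges)
top_answerer = max(auth, key=auth.get)
```

In this toy graph `u4` answers both askers, so it ends up with the highest authority score, while pure askers receive an authority of zero.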
26 Qualitative Observations: HITS effective vs. HITS ineffective
27 Random forest classifier
28 Result 1: Identifying High Quality Questions
29 Top Features for Question Classification: asker popularity ("stars"); punctuation density; topical category; page views; KL divergence from reference corpus LM
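Two of these features are simple to compute from the question text. The sketch below is only illustrative: the exact definitions used in the paper are not given on the slide, so the character-level punctuation density, the unigram language model, the add-alpha smoothing, and both function names are assumptions.

```python
import math
import string
from collections import Counter

def punctuation_density(text):
    """Fraction of non-space characters that are punctuation."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(c in string.punctuation for c in chars) / len(chars)

def kl_divergence(text, reference_counts, alpha=1.0):
    """KL(P_text || P_ref) over unigrams, with add-alpha smoothing.

    reference_counts maps each word to its count in the reference corpus.
    """
    text_counts = Counter(text.lower().split())
    vocab = set(text_counts) | set(reference_counts)
    n_text = sum(text_counts.values()) + alpha * len(vocab)
    n_ref = sum(reference_counts.values()) + alpha * len(vocab)
    kl = 0.0
    for w in vocab:
        p = (text_counts[w] + alpha) / n_text
        q = (reference_counts.get(w, 0) + alpha) / n_ref
        kl += p * math.log(p / q)
    return kl

# Hypothetical reference corpus and questions.
reference = Counter("the cat sat on the mat".split())
density = punctuation_density("Hi!!")
kl_same = kl_divergence("the cat sat on the mat", reference)
kl_other = kl_divergence("dog dog dog", reference)
```

A question whose word distribution matches the reference corpus gets a divergence of zero; text that departs from it, as `kl_other` does, gets a larger value.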
30 Identifying High Quality Answers
31 Top Features for Answer Classification: answer length; community ratings; answerer reputation; word overlap; Kincaid readability score
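As a rough illustration of two of the answer features, the sketch below computes a readability grade and a question-answer word overlap. It assumes that "Kincaid readability score" refers to the standard Flesch-Kincaid grade level, uses a crude vowel-group heuristic for syllables, and measures overlap as Jaccard similarity; the slide does not specify any of these choices, and the function names are hypothetical.

```python
import re

def count_syllables(word):
    """Rough syllable count: number of vowel groups, at least one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def kincaid_grade(text):
    """Flesch-Kincaid grade level with a heuristic syllable counter."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)

def word_overlap(question, answer):
    """Jaccard overlap between question and answer vocabularies."""
    q = set(re.findall(r"\w+", question.lower()))
    a = set(re.findall(r"\w+", answer.lower()))
    return len(q & a) / len(q | a) if q | a else 0.0

grade = kincaid_grade("The cat sat.")
overlap = word_overlap("blood pressure monitor", "blood pressure")
```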
32 User and Content Quality: Coupled Mutual Reinforcement
33 Can Improve Performance OR Reduce Training
34 Finding Information Online (Revisited) Next generation of search: human-machine-human. CQA: a case study in complex IR: content quality; asker satisfaction; understanding the interactions
35 Dimensions of Quality: well-written; interesting; relevant (answer); factually correct. Popular? Timely? Provocative? Useful? As judged by the asker (or community)
36 Are Editor Labels Meaningful for CGC? Information seeking process: want to find useful information about a topic with incomplete knowledge (N. Belkin: Anomalous states of knowledge). Want to model directly whether the user found satisfactory information. Specific (amenable) case: CQA
37 Yahoo! Answers: The Good News Active community of millions of users in many countries and languages. Effective for subjective information needs. Great forum for socialization/chat. Can be invaluable for hard-to-find information not available on the web
39 Yahoo! Answers: The Bad News May have to wait a long time to get a satisfactory answer [Figure: time to close a question (hours) by category, including FIFA World Cup, Optical, Poetry, Football (American), Soccer, Medicine, Winter Sports, Special Education, General Health Care, Outdoor Recreation]. May never obtain a satisfying answer
40 Predicting Asker Satisfaction Y. Liu, J. Bian, and E. Agichtein, in SIGIR 2008 (Yandong Liu, Jiang Bian). Given a question submitted by an asker in CQA, predict whether the user will be satisfied with the answers contributed by the community. Satisfied: the asker has closed the question AND selected the best answer AND rated the best answer >= 3 stars (# not important). Otherwise: unsatisfied
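The satisfaction label defined on this slide is a conjunction of three observable conditions, which can be written down directly. The `Question` record and its field names below are hypothetical, chosen only to make the rule concrete.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Question:
    closed_by_asker: bool               # the asker closed the question
    best_answer_selected: bool          # the asker picked a best answer
    best_answer_rating: Optional[int]   # stars given by the asker, None if unrated

def is_satisfied(q: Question, min_stars: int = 3) -> bool:
    """Label from the slide: closed AND best answer selected AND rating >= 3 stars."""
    return (
        q.closed_by_asker
        and q.best_answer_selected
        and q.best_answer_rating is not None
        and q.best_answer_rating >= min_stars
    )

satisfied = is_satisfied(Question(True, True, 4))
low_rating = is_satisfied(Question(True, True, 2))
never_closed = is_satisfied(Question(False, False, None))
```

Every question that fails any one of the conditions, including those the asker never closed, falls into the unsatisfied class.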
41 ASP: Asker Satisfaction Prediction [Figure: features from the question, answer, asker history, answerer history, category, and text (Wikipedia, News) feed a classifier that outputs "asker is satisfied" or "asker is not satisfied"]
42 Experimental Setup: Data Crawled from Yahoo! Answers in early 2008 Questions Answers Askers Categories % Satisfied 216,170 1,963, , % Anonymized dataset available at: 1/2009: Yahoo! Webscope : Comprehensive Answers dataset: ~5M questions & answers
43 Satisfaction by Topic [Table: columns Topic, Questions, Answers, A per Q, Satisfied (%), Asker rating, Time to close by asker. Rows: 2006 FIFA World Cup (minutes), Mental Health (days), Mathematics (minutes), Diet & Fitness (days)]
44 Satisfaction Prediction: Human Judges Truth: asker's rating. A random sample of 130 questions. Researchers: agreement 0.82; F1: 2*P*R/(P+R). Amazon Mechanical Turk: five workers per question; agreement 0.9; F1: 0.61; best when at least 4 out of 5 raters agree
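The F1 measure and the "at least 4 out of 5 raters agree" rule can be made concrete with a small sketch. The helper names `f1_score` and `vote`, and the choice to return `None` when too few raters agree, are assumptions for illustration; the slide does not say how low-agreement questions were handled.

```python
from collections import Counter

def f1_score(truth, predicted, positive=True):
    """F1 = 2*P*R/(P+R) for the positive ("satisfied") class."""
    pairs = list(zip(truth, predicted))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def vote(labels, min_agree=4):
    """Majority label if at least min_agree raters agree, else None."""
    label, count = Counter(labels).most_common(1)[0]
    return label if count >= min_agree else None

# Hypothetical judgments for four questions.
truth = [True, True, False, False]
predicted = [True, False, True, False]
score = f1_score(truth, predicted)
strong = vote([True, True, True, True, False])
weak = vote([True, True, True, False, False])
```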
45 Performance: ASP vs. Humans (F1, Satisfied) [Table: classifiers ASP_SVM, ASP_C, ASP_RandomForest, ASP_Boosting, ASP_NB; columns With Text, Without Text, Selected Features] Best human performance: 0.61. Baseline (random): 0.66. Human F1 is lower than the random baseline! ASP is significantly more effective than humans
46 Top Features by Information Gain Q: Asker's previous rating; Q: Average past rating by asker; UH: Member since (interval); UH: Average # answers for past Q; UH: Previous Q resolved for the asker; CA: Average asker rating for category; UH: Total number of answers received
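Information gain ranks a feature by how much knowing its value reduces the entropy of the satisfaction label. The sketch below handles discrete feature values only; the binned feature names in the example are hypothetical, and continuous features such as average past rating would need to be discretized first.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """H(labels) - H(labels | feature) for a discrete feature."""
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    n = len(labels)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

# Hypothetical data: 1 = satisfied, 0 = unsatisfied.
labels = [1, 1, 0, 0]
past_rating_bin = ["high", "high", "low", "low"]  # separates the classes
category = ["sports", "health", "sports", "health"]  # carries no signal
ig_rating = information_gain(past_rating_bin, labels)
ig_category = information_gain(category, labels)
```

Here the past-rating bin separates the two classes completely and gains one full bit, while the category feature gains nothing.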
47 Offline vs. Online Prediction Offline prediction (AFTER answers arrive): all features (question, answer, asker & category); F1: 0.77. Online prediction (BEFORE question posted): NO answer features; only asker history and question features (stars, #comments, sum of votes ); F1:
48 Personalized Prediction of Satisfaction Y. Liu and E. Agichtein, You've Got Answers: Personalized Models for Predicting Success in Community Question Answering, ACL 2008. Same information != same usefulness for different searchers! Personalization vs. Groupization?
49 Example Personalized Models
50 Outline Next generation of search: algorithmically mediated information exchange. CQA: a case study in complex IR: content quality; asker satisfaction; understanding the interactions
51 Social Media Language Analysis Social Media != WSJ Text. Subjectivity, sentiment, temporal sensitivity
52 Subjectivity in CQA B. Li, Y. Liu, and E. Agichtein, CoCQA: Co-Training Over Questions and Answers with an Application to Predicting Question Subjectivity Orientation, EMNLP 2008. How can we exploit the structure of CQA for categorization of social media content? Case study: text subjectivity. Subjective: Has anyone got one of those home blood pressure monitors? and if so what make is it and do you think they are worth getting? Objective: What is the difference between chemotherapy and radiation treatments?
53 Objective vs. Subjective Content in CQA [Figure: pie charts of objective vs. subjective share by category (Education, Arts, Science, Health, Sports); percentage pairs shown: 30%/70%, 36%/64%, 48%/52%, 34%/66%, 21%/79%, 36%/64%]
55 Questions and Answers: Two Views Example: Q: Has anyone got one of those home blood pressure monitors? and if so what make is it and do you think they are worth getting? A: My mom has one as she is diabetic so its important for her to monitor it she finds it useful. Answer orientation usually matches question. Idea: Co-Training (Blum & Mitchell, COLT 1998)
56 CoCQA: A Co-Training Framework over Questions and Answers [Figure: labeled data (Q, A) trains classifiers C_Q and C_A; each classifies the unlabeled data; the loop stops based on validation with holdout training data]
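A co-training loop over the question and answer views can be sketched in a few dozen lines. This is a simplified illustration in the spirit of Blum & Mitchell, not the CoCQA algorithm itself: the `TinyNB` classifier, the confidence threshold, the fixed number of rounds (the slide instead stops based on holdout validation), and the toy data are all assumptions.

```python
import math
from collections import Counter

class TinyNB:
    """Multinomial Naive Bayes over whitespace tokens (add-one smoothing)."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)
        self.counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.counts[label].update(text.lower().split())
        self.vocab = {w for c in self.classes for w in self.counts[c]}
        return self

    def predict_proba(self, text):
        scores = {}
        for c in self.classes:
            total = sum(self.counts[c].values()) + len(self.vocab)
            score = math.log(self.prior[c])
            for w in text.lower().split():
                if w in self.vocab:  # ignore words never seen in training
                    score += math.log((self.counts[c][w] + 1) / total)
            scores[c] = score
        top = max(scores.values())
        exp = {c: math.exp(s - top) for c, s in scores.items()}
        z = sum(exp.values())
        return {c: v / z for c, v in exp.items()}

def fit_views(labeled):
    """Train one classifier per view: questions (C_Q) and answers (C_A)."""
    labels = [y for _, _, y in labeled]
    c_q = TinyNB().fit([q for q, _, _ in labeled], labels)
    c_a = TinyNB().fit([a for _, a, _ in labeled], labels)
    return c_q, c_a

def co_train(labeled, unlabeled, rounds=5, per_round=2, threshold=0.6):
    """labeled: (question, answer, label) triples; unlabeled: (question, answer) pairs."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        picked = {}
        # Each view labels its most confident pool items for the shared training set.
        for view, clf in enumerate(fit_views(labeled)):
            scored = []
            for idx, pair in enumerate(pool):
                probs = clf.predict_proba(pair[view])
                label = max(probs, key=probs.get)
                scored.append((probs[label], idx, label))
            for conf, idx, label in sorted(scored, reverse=True)[:per_round]:
                if conf >= threshold:
                    picked.setdefault(idx, label)
        if not picked:
            break
        labeled += [(pool[i][0], pool[i][1], y) for i, y in picked.items()]
        pool = [p for i, p in enumerate(pool) if i not in picked]
    c_q, c_a = fit_views(labeled)
    return c_q, c_a, labeled

# Hypothetical toy data.
seed = [
    ("do you think this is worth getting", "i think it is great", "subjective"),
    ("what is the difference between these treatments",
     "the difference is the dose", "objective"),
]
pool = [
    ("do you think they are worth it", "i think they are great"),
    ("what is the difference between a and b", "the difference is the size"),
]
c_q, c_a, grown = co_train(seed, pool)
```

The point of the two views is that a question the question-side classifier is unsure about may still be labeled confidently from its answer, and vice versa, so each classifier supplies training data the other could not produce alone.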
58 Example result: CoCQA Outperforms State-of-the-Art Partially Supervised ML [Table: macro avg F1 by method (rows: Supervised, GE, CoCQA) and features (columns: Question, Question + Best Answer). GE: (-0.7%), (+3.2%); CoCQA: (+1.9%), (+7.2%)] Implications: can reduce the amount of required manual labels; can improve accuracy with more unlabeled data
59 Another Example: Question Urgency [Liu et al., SIGIR 2009 poster] Problem: a growing volume of questions competing for visibility. Urgent questions pushed out. Delayed responses useless
60 Outline Next generation of search: algorithmically mediated information exchange. CQA: a case study in complex IR: content quality; asker satisfaction; understanding interactions
61 Current Work (in Progress) Partially supervised models of expertise (Bian et al., WWW 2009). Sentiment, temporal sensitivity analysis. Influence of text on interactions. Towards real-time hybrid social/web search
62 Intelligent search Goal: Hybrid Web/Social Search
63 Takeaways Robust machine learning over interaction data: system improvements, insights into behavior. Contextualized models for NLP and text mining: system improvements, insights into interactions. Mining social media: potential for transformative impact for IR, sociology, psychology, medical informatics, public health,
64 More information, datasets, papers, slides: References Modeling search intent [SIGIR 06, 07, ECIR 09, WI 09]; Estimating content quality [WSDM 2008]; Estimating contributor authority [CIKM 2007]; Searching CQA archives [WWW 2008, WWW 2009]; Inferring asker intent [EMNLP 2008, SIGIR 09 poster]; Predicting satisfaction [SIGIR 2008, ACL 2008, TKDE 09]; Coping with spam in CQA [AIRWeb 2008]
65 Thank you! Diane Kelly and UNC for hosting my visit. Supported by:
Towards Breaking the Quality Curse. AWebQuerying Web-Querying Approach to Web People Search. Dmitri V. Kalashnikov Rabia Nuray-Turan Sharad Mehrotra Dept of Computer Science University of California, Irvine
More informationDATA MINING II - 1DL460. Spring 2014"
DATA MINING II - 1DL460 Spring 2014" A second course in data mining http://www.it.uu.se/edu/course/homepage/infoutv2/vt14 Kjell Orsborn Uppsala Database Laboratory Department of Information Technology,
More informationThe Web: Concepts and Technology. January 15: Course Overview
The Web: Concepts and Technology January 15: Course Overview 1 Today s Plan Who am I? What is this course about? Logistics Who are you? 2 Meet Your Instructor Instructor: Eugene Agichtein Web: http://www.mathcs.emory.edu/~eugene
More informationFinding Nutrition Information on the Web: Coverage vs. Authority
Finding Nutrition Information on the Web: Coverage vs. Authority Susan G. Doran Department of Computer Science and Engineering, University of South Carolina, Columbia, SC 29208.Sue_doran@yahoo.com Samuel
More informationInformation Retrieval
Information Retrieval Learning to Rank Ilya Markov i.markov@uva.nl University of Amsterdam Ilya Markov i.markov@uva.nl Information Retrieval 1 Course overview Offline Data Acquisition Data Processing Data
More informationUNDERSTANDING AND IMPROVING WEB SEARCH USING LARGE-SCALE BEHAVIORAL LOGS. Susan Dumais, Microsoft Research
UNDERSTANDING AND IMPROVING WEB SEARCH USING LARGE-SCALE BEHAVIORAL LOGS Susan Dumais, Microsoft Research Overview The big data revolution examples from Web search Large-scale behavioral logs Observations:
More informationDepartment of Computer Science & Engineering The Graduate School, Chung-Ang University. CAU Artificial Intelligence LAB
Department of Computer Science & Engineering The Graduate School, Chung-Ang University CAU Artificial Intelligence LAB 1 / 17 Text data is exploding on internet because of the appearance of SNS, such as
More informationSlides based on those in:
Spyros Kontogiannis & Christos Zaroliagis Slides based on those in: http://www.mmds.org A 3.3 B 38.4 C 34.3 D 3.9 E 8.1 F 3.9 1.6 1.6 1.6 1.6 1.6 2 y 0.8 ½+0.2 ⅓ M 1/2 1/2 0 0.8 1/2 0 0 + 0.2 0 1/2 1 [1/N]
More informationHow to do an On-Page SEO Analysis Table of Contents
How to do an On-Page SEO Analysis Table of Contents Step 1: Keyword Research/Identification Step 2: Quality of Content Step 3: Title Tags Step 4: H1 Headings Step 5: Meta Descriptions Step 6: Site Performance
More informationInterpreting Document Collections with Topic Models. Nikolaos Aletras University College London
Interpreting Document Collections with Topic Models Nikolaos Aletras University College London Acknowledgements Mark Stevenson, Sheffield Tim Baldwin, Melbourne Jey Han Lau, IBM Research Talk Outline Introduction
More informationPersonalized Information Retrieval. Elena Holobiuc Iulia Pasov Alexandru Agape Octavian Sima Bogdan Cap-Bun
Personalized Information Retrieval Elena Holobiuc Iulia Pasov Alexandru Agape Octavian Sima Bogdan Cap-Bun Content Overview Enhancing Personalized Web Search Intent and interest in personalized search
More informationSearching for Information
Searching for Information INFO/CSE100, Spring 2006 Fluency in Information Technology http://www.cs.washington.edu/100 Apr-10-06 searching @ university of washington 1 Readings and References Reading Fluency
More informationUsability Testing. November 14, 2016
Usability Testing November 14, 2016 Announcements Wednesday: HCI in industry VW: December 1 (no matter what) 2 Questions? 3 Today Usability testing Data collection and analysis 4 Usability test A usability
More informationInferring User Search for Feedback Sessions
Inferring User Search for Feedback Sessions Sharayu Kakade 1, Prof. Ranjana Barde 2 PG Student, Department of Computer Science, MIT Academy of Engineering, Pune, MH, India 1 Assistant Professor, Department
More informationFederated Search. Jaime Arguello INLS 509: Information Retrieval November 21, Thursday, November 17, 16
Federated Search Jaime Arguello INLS 509: Information Retrieval jarguell@email.unc.edu November 21, 2016 Up to this point... Classic information retrieval search from a single centralized index all ueries
More informationUnsupervised Rank Aggregation with Distance-Based Models
Unsupervised Rank Aggregation with Distance-Based Models Alexandre Klementiev, Dan Roth, and Kevin Small University of Illinois at Urbana-Champaign Motivation Consider a panel of judges Each (independently)
More informationQuery Reformulation for Clinical Decision Support Search
Query Reformulation for Clinical Decision Support Search Luca Soldaini, Arman Cohan, Andrew Yates, Nazli Goharian, Ophir Frieder Information Retrieval Lab Computer Science Department Georgetown University
More informationStudent Guide to Neehr Perfect Go!
Student Guide to Neehr Perfect Go! I. Introduction... 1 II. Quick Facts... 1 III. Creating your Account... 1 IV. Applying Your Subscription... 4 V. Logging in to Neehr Perfect... 6 VI. Activities... 6
More informationMatt Quinn.
Matt Quinn matt.quinn@nist.gov Roles of AHRQ and NIST What s at Stake Current State of Usability in Certified EHRs Projects to Support Improved Usability Moving Forward June 7 NIST Workshop Questions NIST's
More informationDetecting Good Abandonment in Mobile Search
Detecting Good Abandonment in Mobile Search Kyle Williams, Julia Kiseleva, Aidan C. Crook Γ, Imed Zitouni Γ, Ahmed Hassan Awadallah Γ, Madian Khabsa Γ The Pennsylvania State University, University Park,
More information3 Data, Data Mining. Chengkai Li
CSE4334/5334 Data Mining 3 Data, Data Mining Chengkai Li Department of Computer Science and Engineering University of Texas at Arlington Fall 2018 (Slides partly courtesy of Pang-Ning Tan, Michael Steinbach
More informationWhere the Social Web Meets the Semantic Web. Tom Gruber RealTravel.com tomgruber.org
Where the Social Web Meets the Semantic Web Tom Gruber RealTravel.com tomgruber.org Doug Engelbart, 1968 "The grand challenge is to boost the collective IQ of organizations and of society. " Tim Berners-Lee,
More information