Effective Searching of RDF Knowledge Bases
1 Effective Searching of RDF Knowledge Bases Shady Elbassuoni Joint work with: Maya Ramanath and Gerhard Weikum
2 RDF Knowledge Bases
Annie Hall is a 1977 American romantic comedy directed by Woody Allen and co-starring Diane Keaton.
[Figure: RDF graph of these facts. Nodes: Annie_Hall, Woody_Allen, Diane_Keaton, USA, 1977, Comedy, Romance. Edge labels: directed, actedin, producedin, hasproductionyear, hasgenre.]
3 Linking RDF Knowledge Bases
~256 knowledge bases
~30 billion triples
~400 million links
4 RDF Triples
Annie Hall is a 1977 American romantic comedy directed by Woody Allen and co-starring Diane Keaton.
subject        predicate          object
Annie_Hall     hasproductionyear  1977
Annie_Hall     producedin         USA
Annie_Hall     hasgenre           Romance
Annie_Hall     hasgenre           Comedy
Woody_Allen    directed           Annie_Hall
Woody_Allen    actedin            Annie_Hall
Diane_Keaton   actedin            Annie_Hall
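
As a minimal sketch of the subject-predicate-object layout above, the triples can be held as plain tuples in Python. The `Triple` type and the helpers `objects_of` and `subjects_of` are hypothetical names chosen for illustration; they are not part of any RDF library or of the system described in the talk.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# The seven triples from the slide, stored as an in-memory list.
KB = [
    Triple("Annie_Hall", "hasproductionyear", "1977"),
    Triple("Annie_Hall", "producedin", "USA"),
    Triple("Annie_Hall", "hasgenre", "Romance"),
    Triple("Annie_Hall", "hasgenre", "Comedy"),
    Triple("Woody_Allen", "directed", "Annie_Hall"),
    Triple("Woody_Allen", "actedin", "Annie_Hall"),
    Triple("Diane_Keaton", "actedin", "Annie_Hall"),
]


def objects_of(kb, subject, predicate):
    """Sorted objects of all triples with the given subject and predicate."""
    return sorted(t.obj for t in kb if t.subject == subject and t.predicate == predicate)


def subjects_of(kb, predicate, obj):
    """Sorted subjects of all triples with the given predicate and object."""
    return sorted(t.subject for t in kb if t.predicate == predicate and t.obj == obj)


print(objects_of(KB, "Annie_Hall", "hasgenre"))
print(subjects_of(KB, "actedin", "Annie_Hall"))
```

A real knowledge base would use an indexed triple store rather than a linear scan; the list is only meant to make the data model concrete.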
5 Utilizing RDF Knowledge Bases
Address advanced information needs that go beyond standard keyword-based search:
People born in the same city as Albert Einstein?
Fiction books written by a Nobel prize winner?
Movies directed and acted in by the same person?
6 Searching RDF Knowledge Bases
Use triple-pattern queries.
Movies directed and acted in by the same person?
Query: ?d directed ?m . ?d actedin ?m
Results:
Woody_Allen directed Annie_Hall . Woody_Allen actedin Annie_Hall
Tim_Robbins directed Bob_Roberts . Tim_Robbins actedin Bob_Roberts
...
Mel_Gibson directed Braveheart . Mel_Gibson actedin Braveheart
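
The triple-pattern query above can be evaluated with a simple nested-loop join. The sketch below is an illustration only, not the engine used in the talk: `is_var`, `match_pattern` and `evaluate` are hypothetical helper names, terms starting with `?` are treated as variables, and the tiny knowledge base is made up from the triples shown on the slides.

```python
def is_var(term):
    """A term is a variable if it starts with '?'."""
    return term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`; return None on conflict."""
    new_binding = dict(binding)
    for p_term, value in zip(pattern, triple):
        if is_var(p_term):
            # setdefault binds an unbound variable, or returns the existing value.
            if new_binding.setdefault(p_term, value) != value:
                return None
        elif p_term != value:
            return None
    return new_binding


def evaluate(patterns, kb):
    """Return all variable bindings that satisfy every pattern (conjunctive query)."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in kb:
                candidate = match_pattern(pattern, triple, binding)
                if candidate is not None:
                    extended.append(candidate)
        bindings = extended
    return bindings


KB = [
    ("Woody_Allen", "directed", "Annie_Hall"),
    ("Woody_Allen", "actedin", "Annie_Hall"),
    ("Diane_Keaton", "actedin", "Annie_Hall"),
    ("Tim_Robbins", "directed", "Bob_Roberts"),
    ("Tim_Robbins", "actedin", "Bob_Roberts"),
    ("Mel_Gibson", "directed", "Braveheart"),
    ("Mel_Gibson", "actedin", "Braveheart"),
]

QUERY = [("?d", "directed", "?m"), ("?d", "actedin", "?m")]

for row in evaluate(QUERY, KB):
    print(row["?d"], row["?m"])
```

Because `?d` is shared by both patterns, Diane_Keaton (who acted in but did not direct Annie_Hall) is filtered out by the join.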
7 Outline
Result Ranking
Augmenting RDF knowledge bases with text
Automatic Query Relaxation
Conclusion
9 Motivation
Query: ?d directed ?m . ?d actedin ?m
Results:
Woody_Allen directed Annie_Hall . Woody_Allen actedin Annie_Hall
Tim_Robbins directed Bob_Roberts . Tim_Robbins actedin Bob_Roberts
Mel_Gibson directed Braveheart . Mel_Gibson actedin Braveheart
... results over 600,000 triples
Result ranking is crucial.
10 Ranking Criteria
Query: ?d directed ?m . ?d actedin ?m
Results:
Woody_Allen directed Annie_Hall . Woody_Allen actedin Annie_Hall
Tim_Robbins directed Bob_Roberts . Tim_Robbins actedin Bob_Roberts
Mel_Gibson directed Braveheart . Mel_Gibson actedin Braveheart
... results over 600,000 triples
Rank results based on informativeness.
11 Challenges
How to measure the informativeness of a result?
How to use informativeness for ranking in a principled way?
12 Measuring Informativeness
Associate each triple t with a witness count c(t): the number of sources from which the triple was extracted.
Example triples (the c(t) column values are missing from this transcription):
Woody_Allen directed Annie_Hall
Woody_Allen directed Manhattan
Tim_Robbins directed Bob_Roberts
Steven_Spielberg directed Munich
James_Cameron directed Titanic
Mel_Gibson directed Braveheart
13 How to use witness counts for ranking?
Query: ?d directed ?m . ?d actedin ?m
Results:
Woody_Allen directed Annie_Hall . Woody_Allen actedin Annie_Hall
Tim_Robbins directed Bob_Roberts . Tim_Robbins actedin Bob_Roberts
Mel_Gibson directed Braveheart . Mel_Gibson actedin Braveheart
... results over 600,000 triples
Language-models-based ranking.
14 Language-models-based Ranking (Zhai and Lafferty, CIKM 2001)
Query Q (example keywords: director, actor) has language model P(w|Q).
Document D (example: "Annie Hall is a drama romance movie directed and acted in by Woody Allen which also...") has language model P(w|D).
Kullback-Leibler divergence: $KL(Q \| D) = \sum_{w} P(w \mid Q) \log \frac{P(w \mid Q)}{P(w \mid D)}$
Maximum-likelihood estimator: $P(w \mid Q) = \frac{c(w, Q)}{|Q|}$
Maximum-likelihood estimator plus smoothing component: $P(w \mid D) = \alpha \frac{c(w, D)}{|D|} + (1 - \alpha) \frac{c(w, Col)}{|Col|}$
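
The document-retrieval formulas on this slide can be sketched in a few lines of Python. This is an illustrative reading of the equations, not code from Zhai and Lafferty or from the talk: the function names, the toy documents and the value `alpha=0.8` are all assumptions, and every query word is assumed to occur somewhere in the collection so that the smoothed probability is never zero.

```python
import math
from collections import Counter


def ml_model(tokens):
    """Maximum-likelihood estimate: c(w, X) / |X|."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def smoothed_prob(word, doc_model, col_model, alpha):
    """alpha * c(w, D)/|D| + (1 - alpha) * c(w, Col)/|Col|."""
    return alpha * doc_model.get(word, 0.0) + (1 - alpha) * col_model.get(word, 0.0)


def kl_divergence(query_tokens, doc_tokens, collection_tokens, alpha=0.8):
    """KL(Q || D) over the query vocabulary; lower means a better match."""
    q_model = ml_model(query_tokens)
    d_model = ml_model(doc_tokens)
    c_model = ml_model(collection_tokens)
    return sum(
        p_q * math.log(p_q / smoothed_prob(w, d_model, c_model, alpha))
        for w, p_q in q_model.items()
    )


# Toy data (made up for illustration).
query = ["director", "actor"]
doc_a = ["director", "actor", "movie"]
doc_b = ["movie", "romance", "actor"]
collection = doc_a + doc_b

print(kl_divergence(query, doc_a, collection))
print(kl_divergence(query, doc_b, collection))
```

Documents are ranked by ascending divergence, so `doc_a`, which contains both query words, is ranked ahead of `doc_b`.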
15 Language-models-based Ranking in RDF (Elbassuoni et al., CIKM 2009)
Query Q: ?d directed ?m . ?d actedin ?m
Result R: Woody_Allen directed Annie_Hall . Woody_Allen actedin Annie_Hall
What are the language models of Q and R?
16 Language-models-based Ranking in RDF (Elbassuoni et al., CIKM 2009)
Query Q: ?d directed ?m . ?d actedin ?m
Result R: Woody_Allen directed Annie_Hall . Woody_Allen actedin Annie_Hall
P(T|Q) and P(T|R) are probability distributions over tuples T.
Kullback-Leibler divergence: $KL(Q \| R) = \sum_{T} P(T \mid Q) \log \frac{P(T \mid Q)}{P(T \mid R)}$
How to estimate P(T|Q) and P(T|R)?
17 Query Language-Model Estimation
Query Q: ?d directed ?m . ?d actedin ?m, with triple patterns q_1 and q_2.
P(T|Q) assumes independence between triple patterns:
$P(\text{Woody\_Allen directed Annie\_Hall . Woody\_Allen actedin Annie\_Hall} \mid Q) = P(\text{Woody\_Allen directed Annie\_Hall} \mid q_1) \cdot P(\text{Woody\_Allen actedin Annie\_Hall} \mid q_2)$
Each factor is a normalized witness count:
$P(\text{Woody\_Allen directed Annie\_Hall} \mid q_1) = \frac{c(\text{Woody\_Allen directed Annie\_Hall})}{\sum_{t \in matches(q_1)} c(t)}$
Triples matching q_1 (the c(t) column values are missing from this transcription):
Woody_Allen directed Annie_Hall
Woody_Allen directed Manhattan
Tim_Robbins directed Bob_Roberts
Steven_Spielberg directed Munich
James_Cameron directed Titanic
Mel_Gibson directed Braveheart
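
A small sketch of the query language-model estimate: each triple's probability under a pattern is its witness count divided by the total witness count of all triples matching that pattern, and a result tuple's probability is the product over patterns. The witness counts below are invented for illustration (the real values did not survive in this transcription), and `pattern_model` and `query_prob` are hypothetical names.

```python
import math


def pattern_model(witness_counts, matches):
    """P(t | q) = c(t) / sum of c(t') over all triples t' matching pattern q."""
    total = sum(witness_counts[t] for t in matches)
    return {t: witness_counts[t] / total for t in matches}


def query_prob(result, pattern_models):
    """P(T | Q) as a product over patterns, assuming independence between patterns."""
    prob = 1.0
    for triple, model in zip(result, pattern_models):
        prob *= model.get(triple, 0.0)
    return prob


# Hypothetical witness counts c(t); not taken from the talk.
C = {
    ("Woody_Allen", "directed", "Annie_Hall"): 40,
    ("Tim_Robbins", "directed", "Bob_Roberts"): 10,
    ("Woody_Allen", "actedin", "Annie_Hall"): 30,
    ("Tim_Robbins", "actedin", "Bob_Roberts"): 10,
}

q1_model = pattern_model(C, [t for t in C if t[1] == "directed"])
q2_model = pattern_model(C, [t for t in C if t[1] == "actedin"])

woody = (("Woody_Allen", "directed", "Annie_Hall"), ("Woody_Allen", "actedin", "Annie_Hall"))
tim = (("Tim_Robbins", "directed", "Bob_Roberts"), ("Tim_Robbins", "actedin", "Bob_Roberts"))

print(query_prob(woody, [q1_model, q2_model]))  # 0.8 * 0.75
print(query_prob(tim, [q1_model, q2_model]))    # 0.2 * 0.25
```

With these made-up counts the frequently witnessed Woody_Allen result receives most of the probability mass, which is the intended notion of informativeness.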
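18 Result Language-Model Estimation
Result R: Woody_Allen directed Annie_Hall . Woody_Allen actedin Annie_Hall
Smoothed result model: $P(T \mid R) = \alpha P_{ML}(T \mid R) + (1 - \alpha) P(T \mid Col)$
The second term is the smoothing component. The unsmoothed estimate $P_{ML}(T \mid R)$ is 1 if R contains T and 0 otherwise:
$P_{ML}(\text{Woody\_Allen directed Annie\_Hall . Woody\_Allen actedin Annie\_Hall} \mid R) = 1$
$P_{ML}(\text{Tim\_Robbins directed Bob\_Roberts . Tim\_Robbins actedin Bob\_Roberts} \mid R) = 0$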
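
To see how the smoothed result model turns into a ranking, the sketch below scores each candidate result R by KL(Q || R) and sorts in ascending order. Several things here are assumptions rather than facts from the talk: the collection model P(T|Col) is taken to be uniform over the candidate tuples, `alpha=0.5` is arbitrary, the query probabilities are invented, and `kl_to_result` and `rank_results` are hypothetical names.

```python
import math


def kl_to_result(query_probs, result, alpha=0.5):
    """KL(Q || R) with P(T|R) = alpha * [T == R] + (1 - alpha) * P(T|Col).

    Assumption: P(T|Col) is uniform over the candidate tuples.
    """
    uniform = 1.0 / len(query_probs)
    divergence = 0.0
    for tup, p_q in query_probs.items():
        if p_q == 0.0:
            continue  # terms with P(T|Q) = 0 contribute nothing
        p_ml = 1.0 if tup == result else 0.0
        p_r = alpha * p_ml + (1 - alpha) * uniform
        divergence += p_q * math.log(p_q / p_r)
    return divergence


def rank_results(query_probs, alpha=0.5):
    """Order candidate results by ascending divergence from the query model."""
    return sorted(query_probs, key=lambda r: kl_to_result(query_probs, r, alpha))


# Hypothetical P(T|Q) values for three result tuples.
P_T_GIVEN_Q = {
    "Tim_Robbins/Bob_Roberts": 0.1,
    "Woody_Allen/Annie_Hall": 0.6,
    "Mel_Gibson/Braveheart": 0.3,
}

print(rank_results(P_T_GIVEN_Q))
```

Under the uniform-collection assumption the ordering coincides with descending P(R|Q), so results built from frequently witnessed triples come first.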
19 Experimental Evaluation
Two real-world RDF datasets:
IMDB: Internet Movie Database linked with YAGO
LibraryThing: an online book catalogue
20 Experimental Evaluation
Query benchmark: 24 triple-pattern queries
21 Experimental Evaluation
Competitors:
WOR [Nie et al., WWW 2007]. Queries: keywords. Results: entities. Result ranking: language-models-based.
BANKS [Kacholia et al., VLDB 2005]. Queries: keywords. Results: tuples of triples. Result ranking: based on entity weights and triple weights.
NAGA [Kasneci et al., ICDE 2008]. Queries: triple patterns. Results: tuples of triples. Result ranking: language-models-based (query likelihood).
22 Experimental Evaluation
Methodology and results:
Pool the top-10 results from all approaches and assess result relevance on a 4-point scale.
7 human judges on Amazon Mechanical Turk.
Measure Normalized Discounted Cumulative Gain (NDCG).
Results table (values missing from this transcription): columns Dataset, KL-Div, WOR, BANKS, NAGA; rows IMDB, LT.
One-tailed t-test with P-value <
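
For readers unfamiliar with the evaluation measure, here is a minimal NDCG sketch. The talk does not say which gain and discount functions were used; the version below assumes the common choice of gain 2^rel - 1 and discount log2(rank + 1), and it computes the ideal ordering from the same list of judgments (a pooled evaluation would build the ideal ranking from all judged results). The relevance grades in the example are made up.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain with gain 2^rel - 1 and log2(rank + 1) discount."""
    return sum(
        (2 ** rel - 1) / math.log2(rank + 1)
        for rank, rel in enumerate(relevances, start=1)
    )


def ndcg(relevances, k=10):
    """DCG of the top-k list divided by the DCG of the ideal reordering."""
    ideal = dcg(sorted(relevances, reverse=True)[:k])
    if ideal == 0:
        return 0.0
    return dcg(relevances[:k]) / ideal


# Graded judgments on a 4-point scale (0..3), listed in ranked order.
print(ndcg([3, 2, 1, 0]))  # already ideally ordered
print(ndcg([0, 1, 2, 3]))  # worst ordering of the same judgments
```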
23 Result Ranking Summary
Estimate the query language model and the result language model.
Rank a result based on the Kullback-Leibler divergence between the query and the result language models.
Example query: ?d directed ?m . ?d actedin ?m
25 Motivation
Movies directed and acted in by the same person about an election campaign?
Query: ?d directed ?m . ?d actedin ?m
How to express "about an election campaign" in RDF?
Combining RDF data with text is needed.
26 Challenges
How do we combine RDF data with text?
How do we search combined RDF data and text?
27 Combining RDF Data with Text
Extract keywords from the triples' witnesses.
Example: Tim_Robbins directed Bob_Roberts
Keywords: corrupt rightwing folksinger crooked election campaign independent muckraking reporter
Each triple-keyword pair has a count c(t, w).
28 Searching Text-Augmented RDF Data
Use text-augmented triple-pattern queries.
Movies directed and acted in by the same person about an election campaign?
Query: ?d directed ?m {election campaign} . ?d actedin ?m
Results:
Woody_Allen directed Annie_Hall . Woody_Allen actedin Annie_Hall
Mel_Gibson directed Braveheart . Mel_Gibson actedin Braveheart
Tim_Robbins directed Bob_Roberts . Tim_Robbins actedin Bob_Roberts
Consider keywords only for ranking.
29 Result Ranking
Query Q: ?d directed ?m {election campaign} . ?d actedin ?m
P(T|Q) assumes independence between triple patterns:
$P(\text{Woody\_Allen directed Annie\_Hall . Woody\_Allen actedin Annie\_Hall} \mid Q) = P(\text{Woody\_Allen directed Annie\_Hall} \mid q_1) \cdot P(\text{Woody\_Allen actedin Annie\_Hall} \mid q_2)$
and independence between keywords:
$P(\text{Woody\_Allen directed Annie\_Hall} \mid q_1) = P(\text{Woody\_Allen directed Annie\_Hall} \mid q_1, \text{election}) \cdot P(\text{Woody\_Allen directed Annie\_Hall} \mid q_1, \text{campaign})$
Each keyword factor is a normalized triple-keyword count:
$P(\text{Woody\_Allen directed Annie\_Hall} \mid q_1, \text{campaign}) = \frac{c(\text{Woody\_Allen directed Annie\_Hall}, \text{campaign})}{\sum_{t \in matches(q_1)} c(t, \text{campaign})}$
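
The keyword factorization can be sketched as follows. The co-occurrence counts c(t, w) are invented for illustration, `keyword_pattern_prob` is a hypothetical name, and the sketch is deliberately unsmoothed: a triple that never co-occurs with a keyword gets probability zero, whereas a full implementation would presumably smooth these estimates.

```python
import math


def keyword_pattern_prob(triple, keywords, matches, cooc):
    """Product over keywords w of c(triple, w) / sum of c(t, w) for t in matches."""
    prob = 1.0
    for w in keywords:
        total = sum(cooc.get((t, w), 0) for t in matches)
        if total == 0:
            return 0.0  # keyword never seen with any matching triple
        prob *= cooc.get((triple, w), 0) / total
    return prob


TIM = ("Tim_Robbins", "directed", "Bob_Roberts")
WOODY = ("Woody_Allen", "directed", "Annie_Hall")
MEL = ("Mel_Gibson", "directed", "Braveheart")
MATCHES = [TIM, WOODY, MEL]

# Hypothetical keyword counts c(t, w) gathered from the triples' witnesses.
COOC = {
    (TIM, "election"): 8,
    (TIM, "campaign"): 6,
    (WOODY, "election"): 2,
    (WOODY, "campaign"): 2,
}

for t in MATCHES:
    print(t[0], keyword_pattern_prob(t, ["election", "campaign"], MATCHES, COOC))
```

With these counts the Tim_Robbins triple dominates for the keywords "election campaign", which matches the slide's point that keywords influence only the ranking, not which results match.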
30 Experimental Evaluation
Two real-world RDF datasets:
IMDB: Internet Movie Database linked with YAGO
LibraryThing: an online book catalogue
31 Experimental Evaluation
Query benchmark: 24 keyword-augmented triple-pattern queries
32 Experimental Evaluation
Competitors:
WOR [Nie et al., WWW 2007]. Queries: keywords. Results: entities. Result ranking: language-models-based.
BANKS [Kacholia et al., VLDB 2005]. Queries: keywords. Results: tuples of triples. Result ranking: function of entity weights and triple weights.
NAGA [Kasneci et al., ICDE 2008]. Queries: triple patterns. Results: tuples of triples. Result ranking: language-models-based (query likelihood).
33 Experimental Evaluation
Methodology and results:
Pool the top-10 results from all approaches and assess result relevance on a 4-point scale.
7 human judges on Amazon Mechanical Turk.
Measure Normalized Discounted Cumulative Gain (NDCG).
Results table (values missing from this transcription): columns Dataset, KL-Div, WOR, BANKS, NAGA; rows IMDB, LT.
One-tailed t-test with P-value <
34 Text Augmentation Summary
Associate triples with keywords from their witnesses.
Extend triple-pattern search to allow keyword conditions.
Take keywords into consideration while ranking results.
Example query: ?d directed ?m {election campaign} . ?d actedin ?m
36 Motivation
Adventure movies directed and acted in by the same person?
Query: ?d directed ?m . ?d actedin ?m . ?m hasgenre Adventure
3 results over 600,000 triples
14 Action movies directed and acted in by the same person
9 Adventure movies produced and acted in by the same person
Improve recall by retrieving results close to the query intention.
37 Challenges
How to identify results close to the query intention?
How to merge these results with the exact results?
38 Retrieving Results Close to Query Intention
Perform automatic query relaxation.
Original query: ?d directed ?m . ?d actedin ?m . ?m hasgenre Adventure
Relaxed: ?d directed ?m . ?d actedin ?m . ?m hasgenre Action
Relaxed: ?d produced ?m . ?d actedin ?m . ?m hasgenre Adventure
These might still return an insufficient number of results.
Relaxed: ?d directed ?m . ?d actedin ?m . ?m hasgenre ?x
This gives poor precision.
Combine all types of relaxations in one framework.
39 Query Relaxation Framework (Elbassuoni et al., ESWC 2011)
Replace resources (entities/relations) in the query with similar ones.
Replace resources in the query with variables.
Remove one or more triple patterns.
How to measure similarity between resources?
40 Measuring Similarity between Resources
How similar are Adventure and Action?
Options: use dictionaries, use text descriptions, or use the knowledge base.
Triples containing Adventure:
Adventure type Film_Genres
Star_Wars hasgenre Adventure
Rat_Race hasgenre Adventure
Superman hasgenre Adventure
Triples containing Action:
Action type Film_Genres
Star_Wars hasgenre Action
Superman hasgenre Action
Die_Hard hasgenre Action
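42 Similarity Metric
Adventure (X) is described by the resources it occurs with: Film_Genres, Star_Wars, Rat_Race, Superman. These give the distribution P(w|X).
Action (Y) is described by: Film_Genres, Star_Wars, Superman, Die_Hard. These give the distribution P(w|Y).
Jensen-Shannon divergence: $JS(X \| Y) = KL(X \| M) + KL(Y \| M)$, with $M = \frac{X + Y}{2}$
Its square root is a metric between 0 and 1.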
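
The Adventure/Action comparison can be worked through numerically. The sketch below uses the standard normalized form of the Jensen-Shannon divergence, with a factor 1/2 on each KL term and base-2 logarithms, which is what bounds the value (and its square root) between 0 and 1; this normalization differs by a constant factor from the formula as transcribed and is an assumption. The uniform distributions over each resource's context are also an assumption made for illustration.

```python
import math


def kl(p, q):
    """KL(p || q) in bits; q must be positive wherever p is positive."""
    return sum(pv * math.log2(pv / q[w]) for w, pv in p.items() if pv > 0)


def js_divergence(x, y):
    """Normalized Jensen-Shannon divergence: 0.5*KL(x||m) + 0.5*KL(y||m), m = (x+y)/2."""
    support = set(x) | set(y)
    m = {w: 0.5 * (x.get(w, 0.0) + y.get(w, 0.0)) for w in support}
    return 0.5 * kl(x, m) + 0.5 * kl(y, m)


def uniform(items):
    """Uniform distribution over a list of context resources (illustrative assumption)."""
    return {w: 1.0 / len(items) for w in items}


adventure = uniform(["Film_Genres", "Star_Wars", "Rat_Race", "Superman"])
action = uniform(["Film_Genres", "Star_Wars", "Superman", "Die_Hard"])

js = js_divergence(adventure, action)
print(js, math.sqrt(js))
```

The two genres share three of their four context resources, so the divergence is small; two resources with disjoint contexts would reach the maximum of 1.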
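43 Deciding if two resources are similar
Adventure (X): Film_Genres, Star_Wars, Rat_Race, Superman, with distribution P(w|X).
Action (Y): Film_Genres, Star_Wars, Superman, Die_Hard, with distribution P(w|Y).
Other candidates (Thriller, Action, Drama, ..., Comedy) each have a distribution P(w|other).
X and Y are similar if $JS(X \| Y) < \delta$, for a threshold δ.
Candidate table for Adventure (some scores are missing from this transcription): Action, Thriller 0.472, ?
44 Generating Relaxed Queries
Original query: ?d directed ?m . ?d actedin ?m . ?m hasgenre Adventure
Replacement tables with scores are kept for directed, actedin, Adventure and hasgenre. Surviving entries: produced 0.245, Action 0.221, Thriller (score missing), and variable replacements (?).
Relaxed queries (leading number is the score where it survives):
?d directed ?m . ?d actedin ?m . ?m hasgenre Action
?d produced ?m . ?d actedin ?m . ?m hasgenre Adventure
0.342  ?d directed ?m . ?d ?x ?m . ?m hasgenre Adventure
0.446  ?d produced ?m . ?d actedin ?m . ?m hasgenre Action
0.980  ?d directed ?m . ?d actedin ?m . ?m ?x ?y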
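
A minimal sketch of the generation step, restricted to single substitutions: every term that has entries in a replacement table is swapped for each candidate in turn, and the relaxed queries are sorted by the candidate's score (lower meaning closer). How the framework scores a query with several substitutions is not recoverable from this transcription, so that case is left out. `relax_once` and the table layout are hypothetical; the two scores are the ones that survive on the slide.

```python
def relax_once(query, replacements):
    """Return (score, relaxed_query) pairs, one per single-term substitution.

    `query` is a list of (subject, predicate, object) patterns.
    `replacements` maps a term to a list of (candidate, score) pairs.
    """
    relaxed = []
    for i, pattern in enumerate(query):
        for j, term in enumerate(pattern):
            for candidate, score in replacements.get(term, []):
                new_pattern = pattern[:j] + (candidate,) + pattern[j + 1:]
                relaxed.append((score, query[:i] + [new_pattern] + query[i + 1:]))
    # Closest relaxations first.
    return sorted(relaxed, key=lambda pair: pair[0])


QUERY = [
    ("?d", "directed", "?m"),
    ("?d", "actedin", "?m"),
    ("?m", "hasgenre", "Adventure"),
]

# Scores taken from the surviving slide entries.
REPLACEMENTS = {
    "directed": [("produced", 0.245)],
    "Adventure": [("Action", 0.221)],
}

for score, relaxed_query in relax_once(QUERY, REPLACEMENTS):
    print(score, " . ".join(" ".join(p) for p in relaxed_query))
```

Replacing a term with a fresh variable fits the same scheme by adding an entry such as `("?x", score)` to the term's replacement list.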
45 Executing Queries and Merging Results
?d directed ?m . ?d actedin ?m . ?m hasgenre Adventure: 3 results
Relaxed queries (leading number is the score where it survives):
?d directed ?m . ?d actedin ?m . ?m hasgenre Action
?d produced ?m . ?d actedin ?m . ?m hasgenre Adventure
?d directed ?m . ?d ?x ?m . ?m hasgenre Adventure
?d produced ?m . ?d actedin ?m . ?m hasgenre Action
0.980  ?d directed ?m . ?d actedin ?m . ?m ?x ?y
46 Executing Queries and Merging Results
?d directed ?m . ?d actedin ?m . ?m hasgenre Action: 14 results
The remaining relaxed queries are as on the previous slide (scores 0.342, 0.446, 0.980).
47 Executing Queries and Merging Results
?d produced ?m . ?d actedin ?m . ?m hasgenre Adventure: 9 results
The remaining relaxed queries are as on the previous slide (scores 0.342, 0.446, 0.980).
The merged result list is dependent on the order in which queries are executed.
48 Executing Queries and Merging Results
The queries are labelled $Q_1, Q_2, \ldots, Q_m$.
The query model is a weighted mixture over them: $P(T \mid Q) = \sum_{i=1}^{m} \lambda_i P(T \mid Q_i)$
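
The mixture on this slide can be sketched directly. The per-query models and the weights below are invented for illustration, and the assumption that the weights sum to 1 and favour the original query is mine; the transcription does not say how the weights are set. `mixture_prob` and `rank_merged` are hypothetical names.

```python
import math


def mixture_prob(result, query_models, weights):
    """P(T|Q) = sum over i of lambda_i * P(T|Q_i)."""
    return sum(lam * model.get(result, 0.0) for lam, model in zip(weights, query_models))


def rank_merged(query_models, weights):
    """Merge the results of all queries and sort by descending mixture probability."""
    candidates = set()
    for model in query_models:
        candidates.update(model)
    return sorted(candidates, key=lambda r: mixture_prob(r, query_models, weights), reverse=True)


# Hypothetical models: Q_1 is the original query, Q_2 a relaxed one.
P_Q1 = {"exact_result": 1.0}
P_Q2 = {"exact_result": 0.5, "relaxed_result": 0.5}
WEIGHTS = [0.7, 0.3]  # assumed to sum to 1, favouring the original query

print(rank_merged([P_Q1, P_Q2], WEIGHTS))
print(mixture_prob("exact_result", [P_Q1, P_Q2], WEIGHTS))
```

Because every result is scored against all queries at once, the merged ranking no longer depends on the order in which the relaxed queries are executed.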
49 Experimental Evaluation
Two real-world RDF datasets:
IMDB: Internet Movie Database linked with YAGO
LibraryThing: an online book catalogue
50 Experimental Evaluation
Query benchmark: 80 triple-pattern queries and 30 keyword-augmented ones, each with few or no results.
51 Experimental Evaluation
Evaluating the similarity metric and pruning strategy:
For each resource (entity/relation) in the evaluation queries, retrieve the top-5 most similar resources and assess how close they are to the resource on a 3-point scale.
6 human judges on Amazon Mechanical Turk.
Results table (values missing from this transcription): columns Entities, Relations; rows # of items, Avg. rating, Correlation between Avg. Rating & JS-Div, Avg. Rating of Most Similar Resource, Avg. Rating Below Threshold, Avg. Rating Above Threshold.
52 Experimental Evaluation
Evaluating closeness of relaxed queries:
For each evaluation query, retrieve the top-5 closest relaxed queries and assess how close they are to the evaluation query on a 4-point scale.
6 human judges on Amazon Mechanical Turk.
# of items: 110
Avg. rating: 1.89
Correlation between Avg. Rating & Score and Avg. Rating of closest relaxation: values missing from this transcription.
53 Experimental Evaluation
Evaluating quality of results:
Pool the top-10 results from 3 approaches:
Our framework with Incremental Query Execution
Our framework with Batch Query Execution
Baseline approach: resources replaced by variables
Results table (values missing from this transcription): columns Dataset, Incremental, Batch, Baseline; rows IMDB and LT for triple-pattern queries, IMDB and LT for keyword-augmented triple-pattern queries.
One-tailed t-test with P-value <
54 Query Relaxation Summary
Generate relaxed queries and execute them.
Merge and rank results, taking into consideration closeness to the query intention.
Example query: ?d directed ?m . ?d actedin ?m . ?m hasgenre Adventure
55 Conclusion
RDF is the way to represent and link heterogeneous structured data on the Web.
IR-style searching and ranking models: Result Ranking; Combining RDF with text; Automatic Query Relaxation.
Other contributions: Plain Keyword Search; Result Diversity; Top-k Query Processing; Witness Retrieval Model.
Current endeavors: Timeline Entity Summarization; Natural language Question-Answering using RDF data; Search Personalization in the context of RDF.
MemTest: A Novel Benchmark for In-memory Database Qiangqiang Kang, Cheqing Jin, Zhao Zhang, Aoying Zhou Institute for Data Science and Engineering, East China Normal University, Shanghai, China 1 Outline
More informationBi-directional Linkability From Wikipedia to Documents and Back Again: UMass at TREC 2012 Knowledge Base Acceleration Track
Bi-directional Linkability From Wikipedia to Documents and Back Again: UMass at TREC 2012 Knowledge Base Acceleration Track Jeffrey Dalton University of Massachusetts, Amherst jdalton@cs.umass.edu Laura
More informationAUTOMATICALLY GENERATING DATA LINKAGES USING A DOMAIN-INDEPENDENT CANDIDATE SELECTION APPROACH
AUTOMATICALLY GENERATING DATA LINKAGES USING A DOMAIN-INDEPENDENT CANDIDATE SELECTION APPROACH Dezhao Song and Jeff Heflin SWAT Lab Department of Computer Science and Engineering Lehigh University 11/10/2011
More informationQuery-Independent Learning to Rank for RDF Entity Search
Query-Independent Learning to Rank for RDF Entity Search Lorand Dali 1, Blaž Fortuna 1, Thanh Tran 2, Dunja Mladenić 1,3 1 Jožef Stefan Institute, 1000 Ljubljana, Slovenia {Lorand.Dali, Blaz.Fortuna, dunja.mladenic}@ijs.si
More informationOn Duplicate Results in a Search Session
On Duplicate Results in a Search Session Jiepu Jiang Daqing He Shuguang Han School of Information Sciences University of Pittsburgh jiepu.jiang@gmail.com dah44@pitt.edu shh69@pitt.edu ABSTRACT In this
More informationHow to predict IMDb score
How to predict IMDb score Jiawei Li A53226117 Computational Science, Mathematics and Engineering University of California San Diego jil206@ucsd.edu Abstract This report is based on the dataset provided
More informationWord Embeddings in Search Engines, Quality Evaluation. Eneko Pinzolas
Word Embeddings in Search Engines, Quality Evaluation Eneko Pinzolas Neural Networks are widely used with high rate of success. But can we reproduce those results in IR? Motivation State of the art for
More informationCS246: Mining Massive Datasets Jure Leskovec, Stanford University
CS6: Mining Massive Datasets Jure Leskovec, Stanford University http://cs6.stanford.edu Customer X Buys Metalica CD Buys Megadeth CD Customer Y Does search on Metalica Recommender system suggests Megadeth
More informationData Retrieval in Intermittedly Connected Networks
Humboldt University Computer Science Department Data Retrieval in Intermittedly Connected Networks Dirk Neukirchen Interplanetary Internet Seminar, 2006 June 22, 2006 Overview Traditional Approaches vs.
More informationPersonalized Recommendations using Knowledge Graphs. Rose Catherine Kanjirathinkal & Prof. William Cohen Carnegie Mellon University
+ Personalized Recommendations using Knowledge Graphs Rose Catherine Kanjirathinkal & Prof. William Cohen Carnegie Mellon University + The Problem 2 n Generate content-based recommendations on sparse real
More informationShortest paths on large graphs: Systems, Algorithms, Applications
Shortest paths on large graphs: Systems, Algorithms, Applications Andrey Gubichev TU München January 2012 Andrey Gubichev Shortest paths on large graphs 1 / 53 Outline Introduction Systems Algorithms Applications
More informationData Mining Lecture 2: Recommender Systems
Data Mining Lecture 2: Recommender Systems Jo Houghton ECS Southampton February 19, 2019 1 / 32 Recommender Systems - Introduction Making recommendations: Big Money 35% of Amazons income from recommendations
More informationINEX REPORT. Report on INEX 2012
INEX REPORT Report on INEX 2012 P. Bellot T. Chappell A. Doucet S. Geva S. Gurajada J. Kamps G. Kazai M. Koolen M. Landoni M. Marx A. Mishra V. Moriceau J. Mothe M. Preminger G. Ramírez M. Sanderson E.
More informationKNOW At The Social Book Search Lab 2016 Suggestion Track
KNOW At The Social Book Search Lab 2016 Suggestion Track Hermann Ziak and Roman Kern Know-Center GmbH Inffeldgasse 13 8010 Graz, Austria hziak, rkern@know-center.at Abstract. Within this work represents
More informationOverview of the NTCIR-13 OpenLiveQ Task
Overview of the NTCIR-13 OpenLiveQ Task ABSTRACT Makoto P. Kato Kyoto University mpkato@acm.org Akiomi Nishida Yahoo Japan Corporation anishida@yahoo-corp.jp This is an overview of the NTCIR-13 OpenLiveQ
More informationSPARQL Extensions with Preferences: a Survey Olivier Pivert, Olfa Slama, Virginie Thion
SPARQL Extensions with Preferences: a Survey Olivier Pivert, Olfa Slama, Virginie Thion 31 st ACM Symposium on Applied Computing Pisa, Italy April 4-8, 2016 Outline 1 Introduction 2 3 4 Outline Introduction
More informationPresented by: Dimitri Galmanovich. Petros Venetis, Alon Halevy, Jayant Madhavan, Marius Paşca, Warren Shen, Gengxin Miao, Chung Wu
Presented by: Dimitri Galmanovich Petros Venetis, Alon Halevy, Jayant Madhavan, Marius Paşca, Warren Shen, Gengxin Miao, Chung Wu 1 When looking for Unstructured data 2 Millions of such queries every day
More informationWordNet-based User Profiles for Semantic Personalization
PIA 2005 Workshop on New Technologies for Personalized Information Access WordNet-based User Profiles for Semantic Personalization Giovanni Semeraro, Marco Degemmis, Pasquale Lops, Ignazio Palmisano LACAM
More informationFractional Similarity : Cross-lingual Feature Selection for Search
: Cross-lingual Feature Selection for Search Jagadeesh Jagarlamudi University of Maryland, College Park, USA Joint work with Paul N. Bennett Microsoft Research, Redmond, USA Using All the Data Existing
More informationWEB SPAM IDENTIFICATION THROUGH LANGUAGE MODEL ANALYSIS
WEB SPAM IDENTIFICATION THROUGH LANGUAGE MODEL ANALYSIS Juan Martinez-Romo and Lourdes Araujo Natural Language Processing and Information Retrieval Group at UNED * nlp.uned.es Fifth International Workshop
More informationProbabilistic Graphical Models Part III: Example Applications
Probabilistic Graphical Models Part III: Example Applications Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Fall 2014 CS 551, Fall 2014 c 2014, Selim
More informationLearning Dense Models of Query Similarity from User Click Logs
Learning Dense Models of Query Similarity from User Click Logs Fabio De Bona, Stefan Riezler*, Keith Hall, Massi Ciaramita, Amac Herdagdelen, Maria Holmqvist Google Research, Zürich *Dept. of Computational
More informationHybrid Acquisition of Temporal Scopes for RDF Data
Hybrid Acquisition of Temporal Scopes for RDF Data Anisa Rula 1, Matteo Palmonari 1, Axel-Cyrille Ngonga Ngomo 2, Daniel Gerber 2, Jens Lehmann 2, and Lorenz Bühmann 2 1. University of Milano-Bicocca,
More informationInformation Retrieval
Introduction to Information Retrieval CS276 Information Retrieval and Web Search Chris Manning, Pandu Nayak and Prabhakar Raghavan Evaluation 1 Situation Thanks to your stellar performance in CS276, you
More informationKeyword Search in RDF Databases
Keyword Search in RDF Databases Charalampos S. Nikolaou charnik@di.uoa.gr Department of Informatics & Telecommunications University of Athens MSc Dissertation Presentation April 15, 2011 Outline Background
More informationEfficient Top-k Shortest-Path Distance Queries on Large Networks by Pruned Landmark Labeling with Application to Network Structure Prediction
Efficient Top-k Shortest-Path Distance Queries on Large Networks by Pruned Landmark Labeling with Application to Network Structure Prediction Takuya Akiba U Tokyo Takanori Hayashi U Tokyo Nozomi Nori Kyoto
More informationRecommender Systems - Introduction. Data Mining Lecture 2: Recommender Systems
Recommender Systems - Introduction Making recommendations: Big Money 35% of amazons income from recommendations Netflix recommendation engine worth $ Billion per year And yet, Amazon seems to be able to
More informationClassification and retrieval of biomedical literatures: SNUMedinfo at CLEF QA track BioASQ 2014
Classification and retrieval of biomedical literatures: SNUMedinfo at CLEF QA track BioASQ 2014 Sungbin Choi, Jinwook Choi Medical Informatics Laboratory, Seoul National University, Seoul, Republic of
More informationJianyong Wang Department of Computer Science and Technology Tsinghua University
Jianyong Wang Department of Computer Science and Technology Tsinghua University jianyong@tsinghua.edu.cn Joint work with Wei Shen (Tsinghua), Ping Luo (HP), and Min Wang (HP) Outline Introduction to entity
More informationMining Data Streams. Outline [Garofalakis, Gehrke & Rastogi 2002] Introduction. Summarization Methods. Clustering Data Streams
Mining Data Streams Outline [Garofalakis, Gehrke & Rastogi 2002] Introduction Summarization Methods Clustering Data Streams Data Stream Classification Temporal Models CMPT 843, SFU, Martin Ester, 1-06
More informationMultimodal Social Book Search
Multimodal Social Book Search Mélanie Imhof, Ismail Badache, Mohand Boughanem To cite this version: Mélanie Imhof, Ismail Badache, Mohand Boughanem. Multimodal Social Book Search. 6th Conference on Multilingual
More informationarxiv: v1 [cs.ir] 13 Oct 2009
arxiv:0910.2405v1 [cs.ir] 13 Oct 2009 Generating Concise and Readable Summaries of XML Documents Maya Ramanath, Kondreddi Sarath Kumar, Georgiana Ifrim MPI I 2009 5-002 May 2009 Authors Addresses Maya
More informationRishiraj Saha Roy and Niloy Ganguly IIT Kharagpur India. Monojit Choudhury and Srivatsan Laxman Microsoft Research India India
Rishiraj Saha Roy and Niloy Ganguly IIT Kharagpur India Monojit Choudhury and Srivatsan Laxman Microsoft Research India India ACM SIGIR 2012, Portland August 15, 2012 Dividing a query into individual semantic
More information