Semantic and Distributed Entity Search in the Web of Data

1 Semantic and Distributed Entity Search in the Web of Data
Robert Neumayer
Norwegian University of Science and Technology
Trondheim, Norway
March 6

2 Outline
1. Entity Search and the Web of Data
   - The Web of Data
   - What are Entities?
2. Centralised Entity Search
   - Entity Modelling
   - Experiments
3. Federated Entity Search
   - Introduction
   - Experimental Results
4. P2P Entity Search
   - Introduction and Approach
   - Experiments
5. Conclusions and Future Work
   - Future Work

3 Overview
- Describe the main components of the last four years of my research
- Try to give a good motivation and show the whole picture
- Show real-world examples
- Pointers on future work
- Do it in an accessible way

4 What?
- Semantic and Distributed Entity Search in the Web of Data
- Definitions (in reverse order)
  - Web of Data
  - Entities
  - Entity Search
  - Centralised or distributed?

5 The Web of Data
- Blog post by Tom Heath... slight disagreement
- Terms: Linked Data, Web of Data
- "... Linked Data is just an attempt to rebrand the Semantic Web..."
- "... Personally I use the term Web of data largely interchangeably with the term Semantic Web..."
- "... The precise term I use depends on the audience. With Semantic Web geeks I say Semantic Web, with others I tend to say Web of data..."

9 How We Use the Terms
- Linked Data
  - Technical foundation
  - Means of publishing/exchanging interconnected data
- Web of Data / Semantic Web
  - Largely interchangeable
  - An interconnected Web of Data available for search and research
  - Example: Wikipedia connecting to other resources

10 Linked Open Data 2007
[Figure: Linked Open Data cloud diagram, 2007. Datasets shown: BBC Later + TOTP, Musicbrainz, Magnatune, SemWebCentral, FOAF, SIOC, Ontoworld, SW Conference Corpus, ECS Southampton, Freshmeat, OpenGuides, Jamendo, US Census Data, GovTrack, Project Gutenberg, Geonames, World Factbook, Wikicompany, Eurostat, W3C WordNet, DBpedia, OpenCyc, flickr wrappr, Revyu, lingvoj, DBLP Berlin, DBLP Hannover, RDF Book Mashup]

12 Entities 1/4
- Knowledge bases are growing, so what?
- Something's interesting when Google does it
- Google Knowledge Graph (2012)

15 Entities 2/4
- What is an entity?
- (Typed) object

16 Entities 3/4
- Once identified, the entity has attributes and relations

17 Entities 4/4
- Free text
- Date
- Director
- Relations (links): outgoing, ingoing

18 The Entity Search Task
- Ad-hoc entity retrieval [1]: "answering arbitrary information needs related to particular aspects of objects [entities], expressed in unconstrained natural language and resolved using a collection of structured data"
- Our main focus
- Realistic and frequent type of search
[1] J. Pound, P. Mika, and H. Zaragoza. Ad-hoc object retrieval in the web of data. In: Proc. of the 19th Int. Conference on World Wide Web (WWW '10).

19 Top Google Searches
- People do search for entities
  - Persons
  - Products
  - Events
- BBB12 is Big Brother Brazil

20 Overview
- (Centralised) entity search
- Federated entity search
- Peer-to-peer (P2P) networks

23 Overview of Publications 1/3
(Centralised) entity search
- Semantic Search Challenge [3]
- Hierarchical Entity Model [4]
- Strong Baselines [5]
[3] K. Balog, M. Ciglan, R. Neumayer, W. Wei, and K. Nørvåg. NTNU at SemSearch. In: Proc. of the 4th Int. Semantic Search Workshop of the 20th Int. World Wide Web Conference (WWW 2011).
[4] R. Neumayer, K. Balog, and K. Nørvåg. On the Modeling of Entities for Ad-hoc Entity Search in the Web of Data. In: Proc. of the 34th European Conference on Information Retrieval (ECIR '12).
[5] R. Neumayer, K. Balog, and K. Nørvåg. When Simple is (more than) Good Enough: Effective Semantic Search with (almost) no Semantics. In: Proc. of the 34th European Conference on Information Retrieval (ECIR '12).

24 Overview of Publications 2/3
Federated entity search
- Collection ranking and selection [6]
- Ranking Distributed Knowledge Repositories [7]
[6] K. Balog, R. Neumayer, and K. Nørvåg. Collection Ranking and Selection for Federated Entity Search. In: Proc. of 18th Int. Symposium of String Processing and Information Retrieval (SPIRE '12). Lecture Notes in Computer Science.
[7] R. Neumayer, K. Balog, and K. Nørvåg. Ranking Distributed Knowledge Repositories. In: Proc. of the Int. Conference on Theory and Practice of Digital Libraries Research and Advanced Technology for Digital Libraries (TPDL '12). Lecture Notes in Computer Science.

25 Overview of Publications 3/3
Peer-to-peer (P2P) networks
- Aggregation of Document Frequencies [8]
- Hybrid Aggregation in P2P Networks [9]
[8] R. Neumayer, C. Doulkeridis, and K. Nørvåg. Aggregation of Document Frequencies in Unstructured P2P Networks. In: Proc. of 10th Int. Conference on Web Information Systems Engineering (WISE '09). Lecture Notes in Computer Science.
[9] R. Neumayer, C. Doulkeridis, and K. Nørvåg. A Hybrid Approach for Estimating Document Frequencies in Unstructured P2P Networks. In: Information Systems 36.3 (2011).

27 Centralised Entity Search
Research questions
- How can traditional ad-hoc document retrieval techniques be applied in the context of the Web of Data?
- How can the structure of entities be exploited for the purpose of ad-hoc retrieval?
- How does field weighting affect search quality?

28 From Predicates to Fields: Structured Retrieval
How to represent entity data in terms of structured fields?

(a) Unstructured entity model
Text          Serenity; Serenity is a...; Joss Whedon; United States; Films based on tv series; Space Westerns; Film; Adam Baldwin; Summer Glau; Jewel Staite

(b) Structured entity model
Pred. type    Value
Name          Serenity
Attributes    Serenity is a...
OutRelations  Joss Whedon; United States; Films based on tv series; Space Westerns; Film; Adam Baldwin; Summer Glau
InRelations   Best 2005 sci-fi film; Favourite film

29 Entity Modelling Approaches
- Fields and predicates: somewhere in between one field and one field per predicate
- We consider:
  - Unstructured entity model: collapse all predicates
  - Structured entity model with predicate folding: collapse within predicate types
  - Hierarchical entity model: use individual fields, predicate type weighting
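The difference between the unstructured and the structured model can be sketched in a few lines of Python. This is only an illustration of predicate folding, not the implementation used in the work: the tuple layout, the `NAME_PREDICATES` set, and the `ex:` identifiers are all hypothetical.

```python
from collections import defaultdict

# Hypothetical set of predicates treated as names; a real mapping
# would be derived from the data collection.
NAME_PREDICATES = {"rdfs:label", "foaf:name", "dc:title"}

def fold_predicates(entity_uri, triples):
    """Fold (subject, predicate, object, object_is_uri) tuples into the
    four predicate types Name, Attributes, OutRelations, InRelations."""
    fields = defaultdict(list)
    for s, p, o, o_is_uri in triples:
        if s == entity_uri:
            if p in NAME_PREDICATES:
                fields["Name"].append(o)
            elif o_is_uri:
                fields["OutRelations"].append(o)   # link to another entity
            else:
                fields["Attributes"].append(o)     # literal value
        elif o_is_uri and o == entity_uri:
            fields["InRelations"].append(s)        # link from another entity
    return dict(fields)

# Toy data loosely following the Serenity example (identifiers are made up).
triples = [
    ("ex:Serenity", "rdfs:label", "Serenity", False),
    ("ex:Serenity", "ex:abstract", "Serenity is a ...", False),
    ("ex:Serenity", "ex:director", "ex:Joss_Whedon", True),
    ("ex:Best_2005_sci-fi_film", "ex:winner", "ex:Serenity", True),
]

structured = fold_predicates("ex:Serenity", triples)
# The unstructured model collapses everything into one text field.
unstructured = " ".join(v for values in structured.values() for v in values)
```

Keeping one field per individual predicate instead of one per predicate type would give the starting point for the hierarchical model.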

30 Structured Entity Model
- Collapsing all fields per type: Name, Attribute, InRelation, OutRelation
- Smoothing on type level
- Linear mixture of types (mixture of LMs)
[Diagram: entity e -> predicate types p_t ... p_t -> terms t ... t]
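A linear mixture of per-type language models can be sketched as follows. The slide does not say which smoothing method is used, so Jelinek-Mercer smoothing against the collection statistics of the same predicate type is an assumption here, as are the weights and the toy entities.

```python
import math
from collections import Counter

def field_lm(term, field_text, coll_text, lam=0.1):
    """Jelinek-Mercer smoothed P(term | field); the background model is the
    collection-wide text of the same predicate type (assumed smoothing)."""
    f, c = Counter(field_text.split()), Counter(coll_text.split())
    p_field = f[term] / max(sum(f.values()), 1)
    p_coll = c[term] / max(sum(c.values()), 1)
    return (1 - lam) * p_field + lam * p_coll

def score(query, entity, collection, weights):
    """log P(q | e) under a linear mixture of per-type language models."""
    total = 0.0
    for term in query.lower().split():
        p = sum(w * field_lm(term, entity.get(pt, ""), collection[pt])
                for pt, w in weights.items())
        total += math.log(p) if p > 0 else float("-inf")
    return total

# Hypothetical type weights (they sum to 1) and two toy entities.
weights = {"Name": 0.4, "Attributes": 0.3, "OutRelations": 0.2, "InRelations": 0.1}
entity_a = {"Name": "serenity", "Attributes": "serenity is a space western film",
            "OutRelations": "joss whedon", "InRelations": ""}
entity_b = {"Name": "firefly", "Attributes": "firefly is a tv series",
            "OutRelations": "joss whedon", "InRelations": "serenity"}
collection = {pt: " ".join(e.get(pt, "") for e in (entity_a, entity_b))
              for pt in weights}

score_a = score("serenity", entity_a, collection, weights)
score_b = score("serenity", entity_b, collection, weights)
```

With these weights the entity whose name matches the query scores higher than the one that only matches through an incoming relation.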

31 Hierarchical Entity Model
- Type folding viable alternative
- Preserve info about individual predicates
- Use individual fields
- Three model components
  - Term generation
  - Predicate generation
  - Predicate type generation
[Diagram: entity e -> predicate types p_t -> predicates p -> terms t]
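One plausible way to write down the three components is given below. The formula is a reconstruction from the component names on the slide, not a quotation from the underlying paper; the symbols q (query), n(t, q) (count of term t in q) and the conditioning structure are assumptions.

```latex
% Query likelihood under the entity model \theta_e
P(q \mid \theta_e) = \prod_{t \in q} P(t \mid \theta_e)^{n(t,q)}

% Hierarchical term probability:
%   P(p_t \mid e)        predicate type generation
%   P(p \mid p_t, e)     predicate generation
%   P(t \mid p, p_t, e)  term generation
P(t \mid \theta_e) = \sum_{p_t} P(p_t \mid e)
                     \sum_{p \in p_t} P(p \mid p_t, e)\, P(t \mid p, p_t, e)
```

Under this reading, the structured model with predicate folding is the special case in which the inner sum is replaced by a single type-level term model P(t | p_t, e).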

32 2010/2011 Semantic Search Challenge
- Task: given a keyword query targeting a particular entity, provide a ranked list of relevant entities (i.e., URIs)
- Queries
  - Sampled from web search engine logs (142 in total)
- Data collection
  - Billion Triple Challenge 2009 (BTC) dataset
  - About 70 million entities
  - From sources like dbpedia.org or livejournal.com
- Relevance judgments
  - On a 3-point scale, collected using crowdsourcing

33 Experimental Results
- Ingoing relations have only a marginal effect
- Structured entity model improves on the unstructured model
- Hierarchical model improves, but only for individual predicate types
- Overall, our results are competitive with those achieved at evaluation initiatives
- Preprocessing, preprocessing
- Collection quality

34 When Simple is Good Enough
- Rather straightforward approach
- Three components
  - Extended preprocessing: process entity names
  - Fielded representation: title and content fields
  - Domain boosting: boost DBpedia
- Compare state-of-the-art fielded retrieval models
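The combination of a two-field representation with domain boosting can be illustrated with a deliberately simple scorer. The slide compares state-of-the-art fielded retrieval models without naming them here, so the weighted term-count function below is a stand-in; the field weights, the boost factor, and the example documents are hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical boost factor for entities that come from DBpedia.
BOOSTED_DOMAINS = {"dbpedia.org": 2.0}

def fielded_score(query, doc, w_title=3.0, w_content=1.0):
    """Weighted term-count score over title and content fields,
    multiplied by a per-domain boost (a stand-in for a real fielded model)."""
    terms = query.lower().split()
    title = doc["title"].lower().split()
    content = doc["content"].lower().split()
    base = sum(w_title * title.count(t) + w_content * content.count(t)
               for t in terms)
    host = urlparse(doc["uri"]).hostname or ""
    boost = max((b for d, b in BOOSTED_DOMAINS.items()
                 if host == d or host.endswith("." + d)), default=1.0)
    return base * boost

dbpedia_doc = {"uri": "http://dbpedia.org/resource/Serenity_(film)",
               "title": "Serenity film",
               "content": "Serenity is a space western"}
other_doc = {"uri": "http://example.livejournal.com/serenity",
             "title": "Serenity film",
             "content": "Serenity is a space western"}

boosted = fielded_score("serenity", dbpedia_doc)
plain = fielded_score("serenity", other_doc)
```

Two documents with identical text differ only by the domain boost, which is the effect the third component is meant to capture.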

35 Results
- Outperform all results from Semantic Search Challenge
- Still not outperformed by others
- Entity titles answer entity queries very well
- Extent of improvements surprising

37 Federated Search 1/2
- Moving from centralised retrieval to a distributed setting
- Starting from a broker, the query is routed to the right collection
- Main research question: can federated entity search benefit from entity modelling?

38 Federated Search
1. Collection representation
2. Collection selection
3. Result merging
[Diagram: central broker with Summary A, Summary B, Summary C; query Q; Collection A, Collection B, Collection C; steps 1-3 marked]

39 Collection Representation 1/2
- Collection-centric (CC) model
- Treat each collection as one large document
- Low cost, less accurate results expected

40 Collection Representation 2/2
- Entity-centric (EC) model
- Consider each collection in terms of its entities
- High cost, more accurate results expected
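The two collection representations can be contrasted in a short sketch. Laplace-smoothed query likelihood and the uniform averaging over entities are assumptions chosen for brevity, not the estimators used in the experiments; the collections are toy data.

```python
from collections import Counter

def query_likelihood(query, text, vocab_size):
    """Laplace-smoothed P(q | text); illustrative smoothing only."""
    counts = Counter(text.lower().split())
    n = sum(counts.values())
    p = 1.0
    for t in query.lower().split():
        p *= (counts[t] + 1) / (n + vocab_size)
    return p

def collection_centric(query, collections, vocab_size):
    """Treat each collection as one large document."""
    return {name: query_likelihood(query, " ".join(entities), vocab_size)
            for name, entities in collections.items()}

def entity_centric(query, collections, vocab_size):
    """Score every entity, then average (assumes a uniform P(e | c))."""
    return {name: sum(query_likelihood(query, e, vocab_size)
                      for e in entities) / len(entities)
            for name, entities in collections.items()}

collections = {
    "A": ["serenity film joss whedon", "firefly tv series"],
    "B": ["big brother brazil", "top of the pops"],
}
vocab_size = len({t for ents in collections.values()
                  for e in ents for t in e.lower().split()})

cc = collection_centric("serenity film", collections, vocab_size)
ec = entity_centric("serenity film", collections, vocab_size)
```

The entity-centric variant has to touch every entity representation, which is where the higher cost on the slide comes from.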

41 Collection Selection
- Predefined threshold
- Top-k collection selection, typically 5-20
- AENN: All an Entity Needs is a Name
  - Central repository of entity names
  - AENN collection selection
  - Trade-off between entity-centric (EC) and collection-centric (CC) approaches
  - Precision-oriented, recall-oriented, balanced
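Both selection strategies, a cut-off on ranked collections and a lookup in a central repository of entity names, can be sketched as below. The slide does not describe how AENN scores collections, so the term-voting scheme in `name_based_votes` is purely hypothetical, as are all function names and the example data.

```python
def select_top_k(scores, k=5):
    """Keep the k highest-scoring collections."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [coll for coll, _ in ranked[:k]]

def select_by_threshold(scores, threshold):
    """Keep every collection whose score reaches a predefined threshold."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [coll for coll, s in ranked if s >= threshold]

def build_name_index(names_by_collection):
    """Central repository: name term -> collections holding such a name."""
    index = {}
    for coll, names in names_by_collection.items():
        for name in names:
            for term in name.lower().split():
                index.setdefault(term, set()).add(coll)
    return index

def name_based_votes(query, index):
    """Hypothetical scoring: one vote per query term found in a name."""
    votes = {}
    for term in query.lower().split():
        for coll in index.get(term, ()):
            votes[coll] = votes.get(coll, 0) + 1
    return votes

names_by_collection = {
    "dbpedia.org": ["Serenity", "Joss Whedon"],
    "livejournal.com": ["Serenity fan blog"],
    "geonames.org": ["Trondheim"],
}
index = build_name_index(names_by_collection)
votes = name_based_votes("serenity whedon", index)
```

A small k or a high threshold gives a precision-oriented selection, a large k or a low threshold a recall-oriented one.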

42 Result merging
- Once we have multiple collections selected...
- These collections rank their respective entities...
- ...and the resultant rankings have to be merged into one final list
[Diagram: central broker with Summary A, Summary B, Summary C; query Q; Collection A, Collection B, Collection C; steps 1-3 marked]
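Scores returned by different collections are generally not comparable, so some normalisation is needed before merging. The sketch below uses min-max normalisation with an optional per-collection weight; this is one common choice and an assumption here, since the slide does not name the merging method.

```python
def min_max(results):
    """Rescale one collection's (entity, score) list to the range [0, 1]."""
    scores = [s for _, s in results]
    lo, hi = min(scores), max(scores)
    span = hi - lo
    return [(e, (s - lo) / span if span else 1.0) for e, s in results]

def merge(rankings, collection_weights=None):
    """Merge per-collection rankings into one final list, best first."""
    weights = collection_weights or {}
    merged = []
    for coll, results in rankings.items():
        if not results:
            continue
        w = weights.get(coll, 1.0)
        merged.extend((e, w * s) for e, s in min_max(results))
    return sorted(merged, key=lambda es: es[1], reverse=True)

# Toy rankings with very different score ranges per collection.
rankings = {
    "A": [("a1", 12.0), ("a2", 6.0)],
    "B": [("b1", 0.9), ("b2", 0.3)],
}
final = merge(rankings, collection_weights={"A": 1.0, "B": 0.5})
```

The collection weights could come from the collection ranking step, so that results from highly ranked collections are preferred.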

43 Experimental Setup
- Distributed environment
  - Top 100 largest second-level domains from BTC
  - Three sets with different handling of DBpedia
- Relevance
  - Considered the number of relevant entities from each collection
- Metrics
  - Collection ranking and result merging: standard IR metrics (MAP, MRR, NDCG)
  - Collection selection: analogues of precision and recall, plus the average number of collections selected
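For reference, the per-query quantities behind the standard metrics named above can be computed as follows; MAP and MRR are the means of average precision and reciprocal rank over all queries. The NDCG variant shown (linear gain, log2 discount) is one common definition and may differ from the one used in the experiments.

```python
import math

def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0 if none is retrieved."""
    for i, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / i
    return 0.0

def average_precision(ranked, relevant):
    """Mean of the precision values at the ranks of the relevant items."""
    hits, total = 0, 0.0
    for i, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / i
    return total / len(relevant) if relevant else 0.0

def ndcg(ranked, gains, k=10):
    """NDCG@k with graded gains (linear gain, log2 discount)."""
    dcg = sum(gains.get(item, 0) / math.log2(i + 1)
              for i, item in enumerate(ranked[:k], start=1))
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(i + 1) for i, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg else 0.0

ranked = ["x", "a", "y", "b"]
rr = reciprocal_rank(ranked, {"a", "b"})
ap = average_precision(ranked, {"a", "b"})
perfect = ndcg(["a", "b"], {"a": 2, "b": 1})
```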

44 Experimental Results
- CC and EC methods are competitive
- Content-based methods stronger
- Small difference for the DBpedia-only collection
- AENN outperforms other title-only methods
- AENN has positive effects on collection selection

46 P2P Search
- A query can originate from every peer and has to be routed via possibly many others
- Research questions:
  - Is P2P search a viable alternative to broker-based (i.e., federated search) architectures for entity retrieval?
  - How can the proposed frequency estimation technique be further improved?

47 Text documents, terms, and distribution
- Many problems are caused by distributed collections
- What is distributed and how? Random is easy
- Local / global document frequencies
- Different numbers of documents per node
- Local importance and influence of collections
- Global information improves search results
- How frequent is a term on the global level?
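The gap between local and global document frequencies is easy to see on toy data. The sketch below computes per-peer document frequencies and their exact global sum; the peer names and documents are made up, and exact summation is only possible here because everything lives in one process.

```python
from collections import Counter

def local_df(documents):
    """Document frequency of each term within one peer's documents."""
    df = Counter()
    for doc in documents:
        df.update(set(doc.lower().split()))
    return df

def global_df(peer_dfs):
    """Exact global document frequencies: the sum over all peers."""
    total = Counter()
    for df in peer_dfs:
        total.update(df)
    return total

peers = {
    "p1": ["web of data", "linked data"],
    "p2": ["entity search", "web search", "search engines"],
}
local = {name: local_df(docs) for name, docs in peers.items()}
total = global_df(local.values())
```

Peer p1 has never seen the term "search", although it occurs in three of the five documents globally; a ranking based on p1's local statistics alone would misjudge its importance.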

48 DESENT
- We employ DESENT for P2P network creation
- Completely distributed and decentralised
- Hierarchical overlay generation
  - Individual peers
  - Zones formed by neighbouring peers
  - Super zones based on previous level

49 Local Term Selection Process
- Based on local peer's knowledge only
- Considers local terms and their frequencies
- Problems
  - Number of documents per peer
  - Document frequencies are unstable
  - Local / global importance issues
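The effect of selecting terms locally before aggregating them at zone level can be sketched as below. Choosing the top-k terms by local document frequency is an assumption made for illustration; the slide does not state the selection criterion, and the frequencies are invented.

```python
from collections import Counter

def select_local_terms(local_df, k):
    """Each peer forwards only its k most frequent terms (assumed criterion)."""
    return dict(Counter(local_df).most_common(k))

def aggregate_zone(selected_per_peer):
    """A zone sums the frequencies its peers chose to forward."""
    zone = Counter()
    for selected in selected_per_peer:
        zone.update(selected)
    return zone

peer1 = {"data": 5, "web": 3, "entity": 1}
peer2 = {"entity": 4, "search": 4, "data": 1}

zone = aggregate_zone([select_local_terms(p, k=2) for p in (peer1, peer2)])
exact = Counter(peer1) + Counter(peer2)
```

The zone estimate for "data" is 5 while the exact value is 6, because peer2 did not forward its single occurrence. Aggregated values are therefore estimates that can underestimate the true global frequencies.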

50
- Compare to central case (full info)
- Central case without term info
- Lucene scoring
- Aggregated values score in between
- Portable to LM and entity use-case

52 Summary of Contributions
- Analysis of retrieval models with respect to their applicability to entity search
- Hierarchical models
- Structured retrieval for entity search
- Formalisation of the federated search task in a language model framework
- AENN method
- Benchmark data sets for federated entity search
- Entity search in P2P contexts

53 Query Target Type Identification
- Queries often target specific types (e.g. cars, actors, ...)
- Sub-problem: DBpedia ontology target type identification
- What is a query's type?
- Ontology linking
- How to exploit this info?
- See CIKM '12 poster, partly INEX '12 submission

54 Query to Field/Predicate Mapping
- Which field/predicate best answers a query?
- Simple example: IMDB
  - field:actor
  - field:director
  - field:trivia
- Example query: Clint Eastwood
  - What is the best field to answer the query?
  - What is the best field to answer the individual query terms?
  - What results are we looking for (actor/director)?
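One simple way to approach the per-term question is to estimate how a term's occurrences are distributed over fields. The sketch below does this with raw counts over a few made-up movie records; it is an illustration of the idea for future work, not a method proposed in the slides, and the records are hypothetical rather than real IMDB data.

```python
from collections import Counter, defaultdict

def build_field_stats(records):
    """Collect term counts per field over all records."""
    stats = defaultdict(Counter)
    for record in records:
        for field, text in record.items():
            stats[field].update(text.lower().split())
    return stats

def field_distribution(term, stats):
    """Share of a term's occurrences per field (a rough P(field | term))."""
    counts = {f: c[term.lower()] for f, c in stats.items()}
    total = sum(counts.values())
    return {f: (n / total if total else 0.0) for f, n in counts.items()}

# Hypothetical records with the three fields from the example.
records = [
    {"actor": "Clint Eastwood", "director": "Sergio Leone",
     "trivia": "Shot in Spain"},
    {"actor": "Clint Eastwood", "director": "Clint Eastwood",
     "trivia": "Eastwood composed the score"},
]
stats = build_field_stats(records)
dist = field_distribution("Eastwood", stats)
best_field = max(dist, key=dist.get)
```

On this toy data the term "Eastwood" points mostly to the actor field, but the director field also receives probability mass, which mirrors the actor/director ambiguity raised on the slide.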

55 Last Slide
- Three basic purposes of oral presentations (in the spirit of trusting Wikipedia)
  - Inform
  - Persuade
  - Good will
- I tried to do all of these things!
- Thanks for help and support

Semantic and Distributed Entity Search in the Web of Data

Semantic and Distributed Entity Search in the Web of Data Robert Neumayer Semantic and Distributed Entity Search in the Web of Data Thesis for the degree of Philosophiae Doctor Trondheim, February 2013 Norwegian University of Science and Technology Faculty of

More information

DBpedia Extracting structured data from Wikipedia

DBpedia Extracting structured data from Wikipedia DBpedia Extracting structured data from Wikipedia Anja Jentzsch, Freie Universität Berlin Köln. 24. November 2009 DBpedia DBpedia is a community effort to extract structured information from Wikipedia

More information

Linking Spatial Data from the Web

Linking Spatial Data from the Web European Geodemographics Conference London, April 1, 2009 Linking Spatial Data from the Web Christian Becker, Freie Universität Berlin Hello Name Job Christian Becker Partner, MES (consulting) PhD Student

More information

The Data Web and Linked Data.

The Data Web and Linked Data. Mustafa Jarrar Lecture Notes, Knowledge Engineering (SCOM7348) University of Birzeit 1 st Semester, 2011 Knowledge Engineering (SCOM7348) The Data Web and Linked Data. Dr. Mustafa Jarrar University of

More information

Creating Large-scale Training and Test Corpora for Extracting Structured Data from the Web

Creating Large-scale Training and Test Corpora for Extracting Structured Data from the Web Creating Large-scale Training and Test Corpora for Extracting Structured Data from the Web Robert Meusel and Heiko Paulheim University of Mannheim, Germany Data and Web Science Group {robert,heiko}@informatik.uni-mannheim.de

More information

The Linking Open Data Project Bootstrapping the Web of Data

The Linking Open Data Project Bootstrapping the Web of Data The Linking Open Data Project Bootstrapping the Web of Data Tom Heath Talis Information Ltd, UK CATCH Programme and E-Culture Project Meeting on Metadata Interoperability Amsterdam, 29 February 2008 My

More information

OKKAM-based instance level integration

OKKAM-based instance level integration OKKAM-based instance level integration Paolo Bouquet W3C RDF2RDB This work is co-funded by the European Commission in the context of the Large-scale Integrated project OKKAM (GA 215032) RoadMap Using the

More information

LinkedMDB. The first linked data source dedicated to movies

LinkedMDB. The first linked data source dedicated to movies Oktie Hassanzadeh Mariano Consens University of Toronto April 20th, 2009 Madrid, Spain Presentation at the Linked Data On the Web (LDOW) 2009 Workshop LinkedMDB 2 The first linked data source dedicated

More information

Query Expansion using Wikipedia and DBpedia

Query Expansion using Wikipedia and DBpedia Query Expansion using Wikipedia and DBpedia Nitish Aggarwal and Paul Buitelaar Unit for Natural Language Processing, Digital Enterprise Research Institute, National University of Ireland, Galway firstname.lastname@deri.org

More information

Semantically Driven Web Content Management Systems Adaptation. Lai Wei

Semantically Driven Web Content Management Systems Adaptation. Lai Wei Semantically Driven Web Content Management Systems Adaptation Lai Wei A dissertation submitted to the University of Dublin, in partial fulfillment of the requirements for the degree of Master of Science

More information

ERCIM Alain Bensoussan Fellowship Scientific Report

ERCIM Alain Bensoussan Fellowship Scientific Report ERCIM Alain Bensoussan Fellowship Scientific Report Fellow: Christos Doulkeridis Visited Location : NTNU Duration of Visit: March 01, 2009 February 28, 2010 I - Scientific activity The research conducted

More information

Semantic Web Systems Introduction Jacques Fleuriot School of Informatics

Semantic Web Systems Introduction Jacques Fleuriot School of Informatics Semantic Web Systems Introduction Jacques Fleuriot School of Informatics 11 th January 2015 Semantic Web Systems: Introduction The World Wide Web 2 Requirements of the WWW l The internet already there

More information

Entity and Knowledge Base-oriented Information Retrieval

Entity and Knowledge Base-oriented Information Retrieval Entity and Knowledge Base-oriented Information Retrieval Presenter: Liuqing Li liuqing@vt.edu Digital Library Research Laboratory Virginia Polytechnic Institute and State University Blacksburg, VA 24061

More information

Google indexed 3,3 billion of pages. Google s index contains 8,1 billion of websites

Google indexed 3,3 billion of pages. Google s index contains 8,1 billion of websites Access IT Training 2003 Google indexed 3,3 billion of pages http://searchenginewatch.com/3071371 2005 Google s index contains 8,1 billion of websites http://blog.searchenginewatch.com/050517-075657 Estimated

More information

Semantic Entity Retrieval using Web Queries over Structured RDF Data

Semantic Entity Retrieval using Web Queries over Structured RDF Data Semantic Entity Retrieval using Web Queries over Structured RDF Data Jeff Dalton and Sam Huston CS645 Project Final Report May 11, 2010 Abstract We investigate the problem of performing entity retrieval

More information

Improving Difficult Queries by Leveraging Clusters in Term Graph

Improving Difficult Queries by Leveraging Clusters in Term Graph Improving Difficult Queries by Leveraging Clusters in Term Graph Rajul Anand and Alexander Kotov Department of Computer Science, Wayne State University, Detroit MI 48226, USA {rajulanand,kotov}@wayne.edu

More information

Efficient, Scalable, and Provenance-Aware Management of Linked Data

Efficient, Scalable, and Provenance-Aware Management of Linked Data Efficient, Scalable, and Provenance-Aware Management of Linked Data Marcin Wylot 1 Motivation and objectives of the research The proliferation of heterogeneous Linked Data on the Web requires data management

More information

The Emerging Web of Linked Data

The Emerging Web of Linked Data The Emerging Web of Linked Data Christian Bizer, Freie Universität Berlin The classic World Wide Web is built upon the idea to set hyperlinks between Web documents. Hyperlinks are the basis for navigating

More information

Tabularized Search Results

Tabularized Search Results University of Stavanger Tabularized Search Results Johan Le Gall supervisor: Krisztian Balog June 15, 2016 Abstract This thesis focuses on the problem of generating search engine results as a table. These

More information

ANNUAL REPORT Visit us at project.eu Supported by. Mission

ANNUAL REPORT Visit us at   project.eu Supported by. Mission Mission ANNUAL REPORT 2011 The Web has proved to be an unprecedented success for facilitating the publication, use and exchange of information, at planetary scale, on virtually every topic, and representing

More information

Information Retrieval

Information Retrieval Multimedia Computing: Algorithms, Systems, and Applications: Information Retrieval and Search Engine By Dr. Yu Cao Department of Computer Science The University of Massachusetts Lowell Lowell, MA 01854,

More information

Keyword Search over RDF Graphs. Elisa Menendez

Keyword Search over RDF Graphs. Elisa Menendez Elisa Menendez emenendez@inf.puc-rio.br Summary Motivation Keyword Search over RDF Process Challenges Example QUIOW System Next Steps Motivation Motivation Keyword search is an easy way to retrieve information

More information

Linking Distributed Data across the Web

Linking Distributed Data across the Web Linking Distributed Data across the Web Dr Tom Heath Researcher, Platform Division Talis Information Ltd tom.heath@talis.com http://tomheath.com/ Overview Background From a Web of Documents to a Web of

More information

University of Amsterdam at INEX 2010: Ad hoc and Book Tracks

University of Amsterdam at INEX 2010: Ad hoc and Book Tracks University of Amsterdam at INEX 2010: Ad hoc and Book Tracks Jaap Kamps 1,2 and Marijn Koolen 1 1 Archives and Information Studies, Faculty of Humanities, University of Amsterdam 2 ISLA, Faculty of Science,

More information

Linked Open Europeana: Semantic Leveraging of European Cultural Heritage

Linked Open Europeana: Semantic Leveraging of European Cultural Heritage Linked Open Europeana: Semantic Leveraging of European Cultural Heritage http://www.slideshare.net/gradmans/ Prof. Dr. Stefan Gradmann Humboldt-Universität zu Berlin / School of Library and Information

More information

Linking FRBR Entities to LOD through Semantic Matching

Linking FRBR Entities to LOD through Semantic Matching Linking FRBR Entities to through Semantic Matching Naimdjon Takhirov, Fabien Duchateau, Trond Aalberg Department of Computer and Information Science Norwegian University of Science and Technology Theory

More information

Information Retrieval

Information Retrieval Information Retrieval WS 2016 / 2017 Lecture 2, Tuesday October 25 th, 2016 (Ranking, Evaluation) Prof. Dr. Hannah Bast Chair of Algorithms and Data Structures Department of Computer Science University

More information

Hierarchical Link Analysis for Ranking Web Data

Hierarchical Link Analysis for Ranking Web Data Hierarchical Link Analysis for Ranking Web Data Renaud Delbru, Nickolai Toupikov, Michele Catasta, Giovanni Tummarello, and Stefan Decker Digital Enterprise Research Institute, Galway June 1, 2010 Introduction

More information

The Semantic Institution: An Agenda for Publishing Authoritative Scholarly Facts. Leslie Carr

The Semantic Institution: An Agenda for Publishing Authoritative Scholarly Facts. Leslie Carr The Semantic Institution: An Agenda for Publishing Authoritative Scholarly Facts Leslie Carr http://id.ecs.soton.ac.uk/people/60 What s the Web For? To share information 1. Ad hoc home pages 2. Structured

More information

a paradigm for the Introduction to Semantic Web Semantic Web Angelica Lo Duca IIT-CNR Linked Open Data:

a paradigm for the Introduction to Semantic Web Semantic Web Angelica Lo Duca IIT-CNR Linked Open Data: Introduction to Semantic Web Angelica Lo Duca IIT-CNR angelica.loduca@iit.cnr.it Linked Open Data: a paradigm for the Semantic Web Course Outline Introduction to SW Give a structure to data (RDF Data Model)

More information

Linked Data Evolving the Web into a Global Data Space

Linked Data Evolving the Web into a Global Data Space Linked Data Evolving the Web into a Global Data Space Anja Jentzsch, Freie Universität Berlin 05 October 2011 EuropeanaTech 2011, Vienna 1 Architecture of the classic Web Single global document space Web

More information

Effective Searching of RDF Knowledge Bases

Effective Searching of RDF Knowledge Bases Effective Searching of RDF Knowledge Bases Shady Elbassuoni Joint work with: Maya Ramanath and Gerhard Weikum RDF Knowledge Bases Annie Hall is a 1977 American romantic comedy directed by Woody Allen and

More information

Prof. Dr. Christian Bizer

Prof. Dr. Christian Bizer STI Summit July 6 th, 2011, Riga, Latvia Global Data Integration and Global Data Mining Prof. Dr. Christian Bizer Freie Universität ität Berlin Germany Outline 1. Topology of the Web of Data What data

More information

The Emerging Web of Linked Data

The Emerging Web of Linked Data 4th Berlin Semantic Web Meetup 26. February 2010 The Emerging Web of Linked Data Prof. Dr. Christian Bizer Freie Universität Berlin Outline 1. From a Web of Documents to a Web of Data Web APIs and Linked

More information

Effective Latent Space Graph-based Re-ranking Model with Global Consistency

Effective Latent Space Graph-based Re-ranking Model with Global Consistency Effective Latent Space Graph-based Re-ranking Model with Global Consistency Feb. 12, 2009 1 Outline Introduction Related work Methodology Graph-based re-ranking model Learning a latent space graph A case

More information

DBpedia-An Advancement Towards Content Extraction From Wikipedia

DBpedia-An Advancement Towards Content Extraction From Wikipedia DBpedia-An Advancement Towards Content Extraction From Wikipedia Neha Jain Government Degree College R.S Pura, Jammu, J&K Abstract: DBpedia is the research product of the efforts made towards extracting

More information

SRI International, Artificial Intelligence Center Menlo Park, USA, 24 July 2009

SRI International, Artificial Intelligence Center Menlo Park, USA, 24 July 2009 SRI International, Artificial Intelligence Center Menlo Park, USA, 24 July 2009 The Emerging Web of Linked Data Chris Bizer, Freie Universität Berlin Outline 1. From a Web of Documents to a Web of Data

More information

Semantic Web and Natural Language Processing

Semantic Web and Natural Language Processing Semantic Web and Natural Language Processing Wiltrud Kessler Institut für Maschinelle Sprachverarbeitung Universität Stuttgart Semantic Web Winter 2014/2015 This work is licensed under a Creative Commons

More information

Jianyong Wang Department of Computer Science and Technology Tsinghua University

Jianyong Wang Department of Computer Science and Technology Tsinghua University Jianyong Wang Department of Computer Science and Technology Tsinghua University jianyong@tsinghua.edu.cn Joint work with Wei Shen (Tsinghua), Ping Luo (HP), and Min Wang (HP) Outline Introduction to entity

More information

Master Project. Various Aspects of Recommender Systems. Prof. Dr. Georg Lausen Dr. Michael Färber Anas Alzoghbi Victor Anthony Arrascue Ayala

Master Project. Various Aspects of Recommender Systems. Prof. Dr. Georg Lausen Dr. Michael Färber Anas Alzoghbi Victor Anthony Arrascue Ayala Master Project Various Aspects of Recommender Systems May 2nd, 2017 Master project SS17 Albert-Ludwigs-Universität Freiburg Prof. Dr. Georg Lausen Dr. Michael Färber Anas Alzoghbi Victor Anthony Arrascue

More information

Re-contextualization and contextual Entity exploration. Sebastian Holzki

Re-contextualization and contextual Entity exploration. Sebastian Holzki Re-contextualization and contextual Entity exploration Sebastian Holzki Sebastian Holzki June 7, 2016 1 Authors: Joonseok Lee, Ariel Fuxman, Bo Zhao, and Yuanhua Lv - PAPER PRESENTATION - LEVERAGING KNOWLEDGE

More information

Cluster-based Instance Consolidation For Subsequent Matching

Cluster-based Instance Consolidation For Subsequent Matching Jennifer Sleeman and Tim Finin, Cluster-based Instance Consolidation For Subsequent Matching, First International Workshop on Knowledge Extraction and Consolidation from Social Media, November 2012, Boston.

More information

Linked Data. Department of Software Enginnering Faculty of Information Technology Czech Technical University in Prague Ivo Lašek, 2011

Linked Data. Department of Software Enginnering Faculty of Information Technology Czech Technical University in Prague Ivo Lašek, 2011 Linked Data Department of Software Enginnering Faculty of Information Technology Czech Technical University in Prague Ivo Lašek, 2011 Semantic Web, MI-SWE, 11/2011, Lecture 9 Evropský sociální fond Praha

More information

Learning to Reweight Terms with Distributed Representations

Learning to Reweight Terms with Distributed Representations Learning to Reweight Terms with Distributed Representations School of Computer Science Carnegie Mellon University August 12, 215 Outline Goal: Assign weights to query terms for better retrieval results

More information

Example Based Entity Search in the Web of Data

Example Based Entity Search in the Web of Data Example Based Entity Search in the Web of Data Marc Bron 1, Krisztian Balog 2, and Maarten de Rijke 1 1 ISLA, University of Amsterdam, Scienc Park 904, 1098 XH Amsterdam 2 University of Stavanger, NO-4036

More information

Corso di Biblioteche Digitali

Corso di Biblioteche Digitali Corso di Biblioteche Digitali Vittore Casarosa casarosa@isti.cnr.it tel. 050-315 3115 cell. 348-397 2168 Ricevimento dopo la lezione o per appuntamento Valutazione finale 70-75% esame orale 25-30% progetto

More information

Exploiting Index Pruning Methods for Clustering XML Collections
