What you have learned so far Interoperability Introduction to the Semantic Web Tutorial at ISWC 2010 Jérôme Euzenat Data can be expressed in RDF Linked through URIs Modelled with OWL ontologies & Retrieved through SPARQL queries Montbonnot, France Jerome.Euzenat@inrialpes.fr Thanks to Natasha Noy for our collaboration on a former version of these slides Jérôme Euzenat Interoperability 2 / 24 Being serious about the semantic web Ontology heterogeneity It is not one person s It is not several people common It is many people s many ontologies So it is a mess, but a meaningful mess. integer string uri Jérôme Euzenat Interoperability 3 / 24 Jérôme Euzenat Interoperability 4 / 24
Heterogeneity problem How can we address the problem? Resources being expressed in different ways must be reconciled before being used. Mismatch between formalized knowledge can occur when: different languages are used (OWL vs. Topic maps); different terminologies are used: parameters English vs. Chinese; vs.. different models are used: different classes: vs. ; classes vs. property: vs. literarygenre; classes vs. instances: One physical book as an instance vs. one work as an instance. different scopes and granularity are used. Only books vs. cultural items vs. any product; s detailed to the print and translation level vs. books as works. Initial alignment matching resources Resulting alignment Jérôme Euzenat Interoperability 5 / 24 Jérôme Euzenat Interoperability 6 / 24 Transformation and mediation integer string uri SELECT x. WHERE x : AND x. = Bertrand Russell AND x.topic = Bertrand Russell mediator x.=http://dx..org/10.1080/041522862x SELECT x. WHERE x : AND x. = Bertrand Russell x.=041522862x Jérôme Euzenat Interoperability 7 / 24 Jérôme Euzenat Interoperability 8 / 24
Why should we deal with this? Alication: Catalog integration Alications of semantic integration Catalogue integration Matcher Schema and data integration Query answering Peer-to-peer information sharing Web service composition Generator Agent communication Data transformation Ontology evolution Catalog Transformation Integrated portal Jérôme Euzenat Interoperability 9 / 24 Jérôme Euzenat Interoperability 10 / 24 Alications: Query answering Alications: Agent communication Matcher Matcher peer query reformulated answer Generator mediator reformulated query answer peer agent message Generator Transformed Translator Transformed message agent Jérôme Euzenat Interoperability 11 / 24 Jérôme Euzenat Interoperability 12 / 24
Ontology matching in three steps On what basis can we match? Reconciliation can be performed in 3 steps o o Match, Matcher thereby determines the alignment A Generate Generator a processor (for merging, transforming, etc.) Transformation Aly Content: relying on what is inside the Name, comments, alternate names, names of related entities: NLP, IR, etc. Internal structure: constraints on relations, typing External structure: relations between entities: Data mining, Discrete mathematics Extension: Statistics, data analysis, data mining, machine learning Semantics (models): Reasoning techniques Context: the relations of the with the outside Annotated resources: The web External ontologies: dbpedia, etc. External resources: wordnet, etc. Jérôme Euzenat Interoperability 13 / 24 Jérôme Euzenat Interoperability 14 / 24 Name similarity Structure similarity integer string uri Jérôme Euzenat Interoperability 15 / 24 Jérôme Euzenat Interoperability 16 / 24
Instance similarity Combining different techniques Basic matchers provide candidate correspondences, most of the systems use several such matchers and further combine and filter their results. o A M A M A M A A A Bertrand Russell: My life Albert Camus: La chute o Matcher composition Iteration Aggregation Filtering Jérôme Euzenat Interoperability 17 / 24 Jérôme Euzenat Interoperability 18 / 24 How well do these aroaches work? Ontology Alignment Evaluation Initiative (OAEI) Formal comparative evaluation of different -matching tools; Run every year since 2004; Variety of test cases (in size, in formalism, in content); Results consistent across test cases; Results very dependent on the tasks and the data (from under 50% of precision and recall to well over 80% if ontologies are relatively similar) Progress every year! http://oaei.matching.org Now involved in the SEALS (Semantics Evaluation At Large Scale) project. Benchmark results (precision and recall curves) 1. precision 0. 0. recall 1. 2010 ASMOV 2009 Lily 2008 Lily 2007 ASMOV 2006 RiMOM 2005 Falcon edna Jérôme Euzenat Interoperability 19 / 24 Jérôme Euzenat Interoperability 20 / 24
Tools you should be aware of Frameworks PROMPT (a Protégé plug-in): includes a user interface and a plug-in architecture. Alignment API: used by many tools; provides an exchange format and evaluation tools for OAEI. COMA++: oriented toward database integration (many basic algorithms implemented). Matching systems OAEI best performers (Falcon, RiMOM, ASMOV, etc.) Available systems (FOAM, Falcon, COMA++, Aroma, etc.) Jérôme Euzenat Interoperability 21 / 24 Selected challenges Scalability and efficiency Current matchers can be fast, scale and accurate, but not all at once. New sources of matching Context-based matching, General purpose matching (vs. special purpose matching) Matcher combination, Matcher selection and self-configuration, User involvement, Matching (serendipitously) while working, How to explain alignments? Social and collaborative matching, Alignment management: infrastructure and suort, How do we maintain alignments when ontologies evolve? Reasoning with alignments, Being robust to incorrect alignments. and, of course, many others, Jérôme Euzenat Interoperability 22 / 24 Further reading Ontology Matching by Euzenat and Shvaiko Proceedings of ISWC, ASWC, ESWC, WWW conferences, etc. Journal of web semantics, Journal on data semantics, etc. http://www.matching.org Jerome.Euzenat@inria.fr http://exmo.inrialpes.fr Jérôme Euzenat Interoperability 23 / 24