Unsupervised Learning of Link Discovery Configuration
Transcription
1 Unsupervised Learning of Link Discovery Configuration. Andriy Nikolov, Mathieu d'Aquin, and Enrico Motta. Knowledge Media Institute, The Open University, UK. {a.nikolov, m.daquin, e.motta}@open.ac.uk
2 Link discovery problem. Goal: which individuals refer to the same real-world entity? Challenges: schema heterogeneity; different value representation formats; incorrect/missing data values; homonymy/synonymy. Example URIs: dbpedia:heraklion, fbase:heraklion, dbpedia:s.s._heraklion, geonames:261745/
3 Instance matching. A decision rule combines an aggregation function (avg, max), similarity measures (Jaro, Levenshtein, Monge-Elkan, Jaccard) and pairs of attributes to compare, e.g. (rdfs:label; dc:title), (foaf:name; dc:title), (dbprop:residence; bnb:birthyear). Many parameters to set. Expert knowledge is often required, e.g. Jaro-Winkler for personal names, Monge-Elkan for document titles. No straightforward guidance.
4 How to avoid manual configuration? Utilising the Web of data: learn from existing linked data repositories available on the Web [Hu et al., 2011]; works only for some datasets/ontologies. Machine learning: adapts to the domain by picking appropriate parameters for particular datasets; training data required. Active learning: reduces the amount of training data [Sarawagi et al., 2002; Ngonga Ngomo et al., 2011]. Can we do it without labelled training data?
5 What is a good decision rule? One that minimises defects (no spurious mappings) and maximises output (discovers the actual mappings).
6 Evaluation measures
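The slide text for the evaluation measures did not survive in the transcription. Since later slides report the F1-measure, the sketch below assumes the standard precision, recall and F1 computed against a gold-standard set of mappings. The function name and the pair representation are illustrative, not taken from the slides.

```python
def precision_recall_f1(found, gold):
    """Compare discovered mappings with a gold standard.

    Both arguments are iterables of (instance_a, instance_b) pairs.
    """
    found, gold = set(found), set(gold)
    true_positives = len(found & gold)
    # Precision: share of discovered mappings that are correct.
    precision = true_positives / len(found) if found else 0.0
    # Recall: share of correct mappings that were discovered.
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

Computing these measures requires labelled data, which is exactly what the unsupervised setting lacks.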
7 Reference datasets. Some datasets are considered reference ones: reliable and comprehensive. [Figure: Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch.]
8 Assumptions
9 Fitness criterion [Figure: three panels showing alternative mappings between two lists of the same people. List A: George W. Bush (b. 1946), George H. W. Bush (b. 1924), Bill Clinton (b. 1946). List B: Bush, G. W. (b. 1946), Bush, G. H. W. (b. 1924), Clinton, W. J. (b. 1946).]
10 Fitness function: unsupervised case
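The formula on this slide is not preserved. The sketch below is one plausible formulation of a pseudo-F-measure, under the assumption that each instance has at most one counterpart in the other dataset: pseudo-precision penalises source instances mapped to several targets, and pseudo-recall rewards covering the smaller dataset. The function name, the cap on pseudo-recall and the default `beta` are assumptions made for illustration.

```python
def pseudo_f_measure(mappings, size_a, size_b, beta=0.1):
    """Estimate rule quality without labelled data (illustrative sketch).

    mappings: iterable of (instance_a, instance_b) pairs produced by a rule.
    size_a, size_b: number of instances in each dataset.
    """
    mappings = set(mappings)
    smaller = min(size_a, size_b)
    if not mappings or smaller == 0:
        return 0.0
    # Pseudo-precision: 1.0 when every source instance is mapped only once;
    # it drops as source instances collect multiple targets.
    mapped_sources = {a for a, _ in mappings}
    pseudo_precision = len(mapped_sources) / len(mappings)
    # Pseudo-recall: how much of the smaller dataset could be covered.
    # Capping at 1.0 is an assumption of this sketch.
    pseudo_recall = min(1.0, len(mappings) / smaller)
    # Weighted harmonic mean; beta < 1 favours pseudo-precision.
    b2 = beta ** 2
    return ((1 + b2) * pseudo_precision * pseudo_recall
            / (b2 * pseudo_precision + pseudo_recall))
```

A rule that maps one person to two different records scores lower than a rule producing a clean 1-to-1 mapping, which is the behaviour the fitness criterion slide illustrates.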
11 Approach: genetic algorithm. (1) Initialise a pool of decision rules with random parameters. (2) Apply these rules to the datasets. (3) Calculate the fitness of each decision rule by evaluating its results. (4) Use a genetic algorithm to improve the pool of decision rules: elitist selection, crossover, mutation. [Diagram: candidate solution pool, fitness calculation, decision rule]
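The loop described on this slide can be sketched as follows. This is a minimal generic genetic algorithm, not the KnoFuss implementation: the function names, the parent-selection scheme (top half of the pool), the default rates and the toy two-gene rule are all assumptions. The toy fitness stands in for a real fitness that would apply the rule to the datasets and score the resulting mappings.

```python
import random

def evolve(fitness, random_rule, crossover, mutate, pool_size=100,
           generations=20, elite=2, mutation_rate=0.3, seed=None):
    """Improve a pool of candidate rules and return the fittest one."""
    rng = random.Random(seed)
    # Step 1: pool of rules with random parameters.
    pool = [random_rule(rng) for _ in range(pool_size)]
    for _ in range(generations):
        # Steps 2-3: evaluate and rank every rule by fitness.
        ranked = sorted(pool, key=fitness, reverse=True)
        # Elitist selection: the best rules survive unchanged.
        next_pool = ranked[:elite]
        parents = ranked[:max(2, pool_size // 2)]
        while len(next_pool) < pool_size:
            mother, father = rng.sample(parents, 2)
            child = crossover(mother, father, rng)
            if rng.random() < mutation_rate:
                child = mutate(child, rng)
            next_pool.append(child)
        pool = next_pool
    return max(pool, key=fitness)

# Toy setup: a rule is a (weight, threshold) tuple and the "fitness"
# is closeness to a known target, purely for demonstration.
TARGET = (0.4, 0.85)

def toy_fitness(rule):
    return -sum(abs(gene - goal) for gene, goal in zip(rule, TARGET))

def toy_random_rule(rng):
    return (rng.random(), rng.random())

def toy_crossover(a, b, rng):
    # Each gene is inherited from one of the two parents.
    return tuple(rng.choice(pair) for pair in zip(a, b))

def toy_mutate(rule, rng):
    # Perturb one randomly chosen gene and clamp it to [0, 1].
    genes = list(rule)
    i = rng.randrange(len(genes))
    genes[i] = min(1.0, max(0.0, genes[i] + rng.gauss(0.0, 0.1)))
    return tuple(genes)
```

Calling `evolve(toy_fitness, toy_random_rule, toy_crossover, toy_mutate, seed=1)` returns the best rule found. Because of elitism, the best fitness in the pool never decreases from one generation to the next.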
12 Encoding decision rules [Diagram labels: gene, genotype]
13 Mutation
14 Crossover
15 Implementation. Stop-word filtering and indexing; determining and applying the optimal decision rule; filtering the resulting mappings (enforcing the 1-to-1 rule).
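The slide does not say how the 1-to-1 rule is enforced. One simple possibility, shown here as an assumption, is a greedy filter that keeps the highest-scoring mappings and discards any mapping that reuses an already matched instance. The function name and the triple format are illustrative.

```python
def enforce_one_to_one(scored_pairs):
    """Greedy 1-to-1 filter over (instance_a, instance_b, score) triples."""
    used_a, used_b, kept = set(), set(), []
    # Visit mappings from the most to the least confident.
    for a, b, score in sorted(scored_pairs, key=lambda p: p[2], reverse=True):
        if a not in used_a and b not in used_b:
            kept.append((a, b, score))
            used_a.add(a)
            used_b.add(b)
    return kept
```

A greedy pass is cheap but not guaranteed to maximise the total score; an optimal assignment would need a matching algorithm instead.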
16 Fitness function behaviour [Plot series: avg. pseudo-F, avg. F1, best pseudo-F, best F1] In general, N=20 was sufficient for convergence.
17 Genetic algorithm parameters. Varying population size; comparing with the ideal case (actual F1-measure). [Plot series: F1-fitness (MC), Pseudo-F-fitness (MC), F1-fitness (BA), Pseudo-F-fitness (BA)] Varying crossover and mutation rate didn't result in significant differences in performance.
18 OAEI 2010: Person/Restaurant. Benchmark adapted from database record linkage. Comparing actual F1-measure (average after 5 runs). [Bar chart: datasets Person1, Person2, Restaurant (OAEI), Restaurant (fixed); systems KnoFuss+GA, ObjectCoref, ASMOV, CODI, LN2R, RiMOM, FBEM]
19 OAEI 2011: New York Times data. Benchmark taken from LOD: links between NYT data, DBpedia, Freebase, and Geonames. [Bar chart: systems KnoFuss+GA, AgreementMaker, SERIMI, Zhishi.links]
20 Reducing computations. Link discovery has to be performed many times: a population size of 100 and 20 generations means 2000 decision rule applications over both datasets. Solution: take a random sample of candidate instances (after blocking), learn a decision rule based on the sample, then apply the rule once to the complete datasets.
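The cost estimate and the sampling step can be sketched as below. The representation of blocking output as a dictionary from a source instance to its candidate targets, and both function names, are assumptions made for illustration.

```python
import random

def rule_applications(pool_size, generations):
    """Number of times a decision rule is applied during learning."""
    return pool_size * generations

def sample_candidates(candidates, sample_size, seed=0):
    """Keep a random subset of source instances with their candidates.

    candidates: dict mapping a source instance to the list of target
    instances that blocking selected for it.
    """
    rng = random.Random(seed)
    keys = sorted(candidates)  # fixed order keeps the sample reproducible
    if len(keys) <= sample_size:
        return dict(candidates)
    chosen = rng.sample(keys, sample_size)
    return {key: candidates[key] for key in chosen}
```

The genetic algorithm then runs its 2000 rule applications on the sample only, and the learned rule is applied a single time to the full candidate set.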
21 Reducing computations: sampling [Plot series: F1-fitness, Unsupervised (sample), Unsupervised (complete)]
22 Alternative indirect evidence
23 Fitness function comparison [Bar chart: Pseudo-F-fitness vs NG-fitness on Musicians, Book authors, OAEI (Geonames), OAEI (Freebase)] Neighbourhood growth can also perform well but is less robust.
24 Summary. Can we do without labelled training data? Yes, we can (well, in certain conditions). Different types of indirect evidence can be exploited. Genetic algorithms provide a suitable tool to learn a decision rule.
25 On-going & future work. Improving the performance: is it possible to weaken the assumptions? More complex forms of decision rules; combining different types of evidence. Scalability: efficient blocking mechanisms; parallelisation of computations.
26 Questions? Thanks for your attention! More info & download link:
27 Experiments: default parameters
28 Issue: what does pseudo-recall mean?
29 Instance matching [Figure: two RDF descriptions of Carl Jung. BNB side: bnb:jungcg(carlgustav) with rdfs:label "Jung, C. G. (Carl Gustav)", two bio:event nodes (birth, death) each with a bio:date, and a dc:creator link to a work with dc:title "Visions". DBpedia side: dbpedia:carl_jung with foaf:name "Carl Jung", dbprop:knownfor "Analytical Psychology", dbprop:birthdate, dbprop:residence "Switzerland", dbprop:placeofbirth "Kesswil", and a dbprop:influenced link with dbpedia:sigmund_freud.]
30 Experiments: OAEI datasets. OAEI 2010 (Person/Restaurant): datasets Person1, Person2, Restaurant (OAEI), Restaurant (fixed); systems KnoFuss+GA, ObjectCoref, ASMOV, CODI, LN2R, RiMOM, FBEM. OAEI 2011 (New York Times data): datasets DBpedia (locations), DBpedia (organisations), DBpedia (people), Freebase (locations), Freebase (organisations), Freebase (people), Geonames (locations), Average; systems KnoFuss+GA, AgreementMaker, SERIMI, Zhishi.links. [Table: cell values not preserved, apart from several N/A entries and one value, 0.96.]
31 Parameters: population size [Plot series: F1-fitness (MC), Pseudo-F-fitness (MC), F1-fitness (BA), Pseudo-F-fitness (BA)] Varying crossover and mutation rate didn't result in significant differences in performance.
32 Instance matching: comparable pairs of attributes. [Figure: the same two RDF descriptions of Carl Jung, bnb:jungcg(carlgustav) and dbpedia:carl_jung, with the comparable attribute pairs highlighted.]
33 Instance matching: appropriate similarity measures. [Figure: for bnb:jungcg(carlgustav) and dbpedia:carl_jung, rdfs:label "Jung, C. G. (Carl Gustav)" is compared with foaf:name "Carl Jung" by token set similarity; bio:event/bio:date "1875" is compared with dbprop:birthdate by date inclusion.]
34 Instance matching: aggregation function, weights, threshold. [Figure: the token set similarity of rdfs:label and foaf:name and the date inclusion similarity of bio:event/bio:date and dbprop:birthdate (weight 0.6) are combined by average() and compared with the threshold 0.85.]
35 Instance matching: decision rule. [Figure: as on the previous slide, with substring similarity in place of date inclusion.] Decision rule: average( 0.4*token-set-sim(rdfs:label : foaf:name); 0.6*substr-sim(bio:event/bio:date : dbprop:birthdate) ) ≥ 0.85
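A decision rule of this shape can be sketched in a few lines. The slides do not define the similarity measures, so the sketch assumes a Jaccard overlap of word tokens for token set similarity and a 0/1 containment test for substring similarity, and it reads average() with weights 0.4 and 0.6 as a weighted sum. The function names and the birth date value are illustrative.

```python
import re

def token_set_similarity(a, b):
    """Jaccard overlap of lowercase word tokens (assumed measure)."""
    tokens_a = set(re.findall(r"\w+", a.lower()))
    tokens_b = set(re.findall(r"\w+", b.lower()))
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0

def substring_similarity(a, b):
    """1.0 if one value contains the other, else 0.0 (assumed measure)."""
    a, b = a.strip(), b.strip()
    return 1.0 if a and b and (a in b or b in a) else 0.0

def decision_rule(left, right, w_label=0.4, w_date=0.6, threshold=0.85):
    """Return (score, is_match) for two instance descriptions."""
    score = (
        w_label * token_set_similarity(left["rdfs:label"], right["foaf:name"])
        + w_date * substring_similarity(left["bio:event/bio:date"],
                                        right["dbprop:birthdate"])
    )
    return score, score >= threshold

# Example values; the birth date string is a hypothetical DBpedia value.
bnb = {"rdfs:label": "Jung, C. G. (Carl Gustav)", "bio:event/bio:date": "1875"}
dbp = {"foaf:name": "Carl Jung", "dbprop:birthdate": "1875-07-26"}
score, is_match = decision_rule(bnb, dbp)
```

With these stand-in measures the pair scores about 0.76, below the 0.85 threshold, because the initials and the middle name lower the token overlap. This shows how strongly the outcome depends on the choice of similarity measure, weights and threshold, which is what the genetic algorithm is meant to tune.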
More informationA Survey on Removal of Duplicate Records in Database
Indian Journal of Science and Technology A Survey on Removal of Duplicate Records in Database M. Karthigha 1* and S. Krishna Anand 2 1 PG Student, School of Computing (CSE), SASTRA University, 613401,
More informationLeveraging Data and Structure in Ontology Integration
Leveraging Data and Structure in Ontology Integration O. Udrea L. Getoor R.J. Miller Group 15 Enrico Savioli Andrea Reale Andrea Sorbini DEIS University of Bologna Searching Information in Large Spaces
More informationCOALA - Correlation-Aware Active Learning of Link Specifications
COALA - Correlation-Aware Active Learning of Link Specifications Axel-Cyrille Ngonga Ngomo, Klaus Lyko, and Victor Christen Department of Computer Science AKSW Research Group University of Leipzig, Germany
More informationAlignment and Dataset Identification of Linked Data in Semantic Web
Wright State University CORE Scholar Kno.e.sis Publications The Ohio Center of Excellence in Knowledge- Enabled Computing (Kno.e.sis) 2014 Alignment and Dataset Identification of Linked Data in Semantic
More informationInformation Integration of Partially Labeled Data
Information Integration of Partially Labeled Data Steffen Rendle and Lars Schmidt-Thieme Information Systems and Machine Learning Lab, University of Hildesheim srendle@ismll.uni-hildesheim.de, schmidt-thieme@ismll.uni-hildesheim.de
More informationFrom Ontologies to Information Extraction and Back
From Ontologies to Information Extraction and Back LAMA SAEEDA D E PA R T M E N T O F C Y B E R N E T I C S A N D A R T I F I C I A L I N T E L L I G E N C E FA C U LT Y O F E L E C T R I C A L E N G I
More informationA Hybrid Model Words-Driven Approach for Web Product Duplicate Detection
A Hybrid Model Words-Driven Approach for Web Product Duplicate Detection Marnix de Bakker, Flavius Frasincar, and Damir Vandic Erasmus University Rotterdam PO Box 1738, NL-3000 DR Rotterdam, The Netherlands
More information