Alignment Results of SOBOM for OAEI 2010

Similar documents
Alignment Results of SOBOM for OAEI 2009

Ontology Mapping: As a Binary Classification Problem

Ontology Generator from Relational Database Based on Jena

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

UB at GeoCLEF Department of Geography Abstract

A Binarization Algorithm specialized on Document Images and Photos

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration

Available online at Available online at Advanced in Control Engineering and Information Science

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

Cluster Analysis of Electrical Behavior

TN348: Openlab Module - Colocalization

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization

Scheduling Remote Access to Scientific Instruments in Cyberinfrastructure for Education and Research

Query Clustering Using a Hybrid Query Similarity Measure

An Optimal Algorithm for Prufer Codes *

Constructing Minimum Connected Dominating Set: Algorithmic approach

TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z.

Module Management Tool in Software Development Organizations

Personalized Concept-Based Clustering of Search Engine Queries

Corner-Based Image Alignment using Pyramid Structure with Gradient Vector Similarity

The Codesign Challenge

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1)

Wishing you all a Total Quality New Year!

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching

Related-Mode Attacks on CTR Encryption Mode

Deep Classification in Large-scale Text Hierarchies

KIDS Lab at ImageCLEF 2012 Personal Photo Retrieval

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points;

An Intelligent Context Interpreter based on XML Schema Mapping

A Resources Virtualization Approach Supporting Uniform Access to Heterogeneous Grid Resources 1

Type-2 Fuzzy Non-uniform Rational B-spline Model with Type-2 Fuzzy Data

Virtual Machine Migration based on Trust Measurement of Computer Node

A Clustering Algorithm for Chinese Adjectives and Nouns 1

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices

Modeling Hierarchical User Interests Based on HowNet and Concept Mapping

Enhancement of Infrequent Purchased Product Recommendation Using Data Mining Techniques

Classifier Selection Based on Data Complexity Measures *

Object-Based Techniques for Image Retrieval

LinkSelector: A Web Mining Approach to. Hyperlink Selection for Web Portals

Description of NTU Approach to NTCIR3 Multilingual Information Retrieval

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Learning the Kernel Parameters in Kernel Minimum Distance Classifier

An Image Fusion Approach Based on Segmentation Region

Concurrent Apriori Data Mining Algorithms

Assignment # 2. Farrukh Jabeen Algorithms 510 Assignment #2 Due Date: June 15, 2009.

User Authentication Based On Behavioral Mouse Dynamics Biometrics

BRDPHHC: A Balance RDF Data Partitioning Algorithm based on Hybrid Hierarchical Clustering

Relevance Feedback for Image Retrieval

LRD: Latent Relation Discovery for Vector Space Expansion and Information Retrieval

Research and Application of Fingerprint Recognition Based on MATLAB

Simulation Based Analysis of FAST TCP using OMNET++

Syntactic Tree-based Relation Extraction Using a Generalization of Collins and Duffy Convolution Tree Kernel

BioTechnology. An Indian Journal FULL PAPER. Trade Science Inc.

Manifold-Ranking Based Keyword Propagation for Image Retrieval *

A Method of Hot Topic Detection in Blogs Using N-gram Model

A CALCULATION METHOD OF DEEP WEB ENTITIES RECOGNITION

Cross-Language Information Retrieval

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur

Solving two-person zero-sum game by Matlab

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr)

A Novel Video Retrieval Method Based on Web Community Extraction Using Features of Video Materials

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach

Kent State University CS 4/ Design and Analysis of Algorithms. Dept. of Math & Computer Science LECT-16. Dynamic Programming

Impact of a New Attribute Extraction Algorithm on Web Page Classification

A new selection strategy for selective cluster ensemble based on Diversity and Independency

Unsupervised Content Discovery in Composite Audio Rui Cai Department of Computer Science and Technology, Tsinghua Univ. Beijing, , China

ON SOME ENTERTAINING APPLICATIONS OF THE CONCEPT OF SET IN COMPUTER SCIENCE COURSE

Document Representation and Clustering with WordNet Based Similarity Rough Set Model

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION

Smoothing Spline ANOVA for variable screening

Keywords - Wep page classification; bag of words model; topic model; hierarchical classification; Support Vector Machines

DLK Pro the all-rounder for mobile data downloading. Tailor-made for various requirements.

On Some Entertaining Applications of the Concept of Set in Computer Science Course

Combining Multiple Resources, Evidence and Criteria for Genomic Information Retrieval

High-Boost Mesh Filtering for 3-D Shape Enhancement

Meta-heuristics for Multidimensional Knapsack Problems

From Comparing Clusterings to Combining Clusterings

Exploring Image, Text and Geographic Evidences in ImageCLEF 2007

Optimizing Document Scoring for Query Retrieval

Support Vector Machines

The Shortest Path of Touring Lines given in the Plane

A Computer Vision System for Automated Container Code Recognition

Query classification using topic models and support vector machine

A Web Site Classification Approach Based On Its Topological Structure

Complex Numbers. Now we also saw that if a and b were both positive then ab = a b. For a second let s forget that restriction and do the following.

2x x l. Module 3: Element Properties Lecture 4: Lagrange and Serendipity Elements

Sum of Linear and Fractional Multiobjective Programming Problem under Fuzzy Rules Constraints

Keyword-based Document Clustering

FAHP and Modified GRA Based Network Selection in Heterogeneous Wireless Networks

Hybrid Non-Blind Color Image Watermarking

An Integrated Information Platform for Intelligent Transportation Systems Based on Ontology

Robust Classification of ph Levels on a Camera Phone

Efficient Segmentation and Classification of Remote Sensing Image Using Local Self Similarity

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide

Hermite Splines in Lie Groups as Products of Geodesics

Transcription:

Algnment Results of SOBOM for OAEI 2010 Pegang Xu, Yadong Wang, Lang Cheng, Tany Zang School of Computer Scence and Technology Harbn Insttute of Technology, Harbn, Chna pegang.xu@gmal.com, ydwang@ht.edu.cn, chl198478@126.com,tany.zang@gmal.com Abstract. In ths paper we gve a bref explanaton of how Sub-Ontology based Ontology Matchng (SOBOM) method gets the algnment results at OAEI2010. SOBOM deal wth an ontology from two dfferent vews: an ontology wth s-a herarchcal structure O and an ontology wth other relatonshps O. Frstly, from the O vew, SOBOM starts wth a set of anchors provded by a lngustc matcher. And then t extracts sub-ontologes based on the anchors and ranks these sub-ontologes accordng to ther depths. Secondly, SOBOM utlzes Semantc Inductve Smlarty Floodng algorthm to compute the smlarty of concepts between dfferent sub-ontologes derved from the two ontologes accordng the depth of sub-ontologes to get concept algnments. Fnally, from the O vew, SOBOM gets relatonshp algnments by usng the concept algnment results n O. The experment results show SOBOM can fnd more algnment results than other compared relevant methods. 1 System presentaton Currently more and more ontologes are dstrbutedly bult and used by dfferent organzatons. And these ontologes are usually lght-weghted [1] contanng lots of concepts especally n bomedcne, such as anatomy taxonomy NCI Thesaurus. The Sub-ontology based Ontology Matchng (SOBOM) s desgned for matchng lghtweght ontologes that has s-a herarchy as ther backbones. It matches an ontology

from two vews: O and O that are depcted n Fg. 1. The unque feature of our method s combnng sub-ontology extracton wth ontology matchng. 1.1 State, purpose, general statement SOBOM s developed to match ontology automatcally for general purpose. Based on two dfferent vews, we desgn three elementary matchers n current verson. The frst one s a anchor generator whch s used to fnd anchors; the second one s a structure matcher SISF (Semantc Inductve Smlarty Floodng) whch s nspred by Anchor- Prompt [3] and SF [4] algorthms and s exploted to flood smlarty among concepts. The last one s a relatonshp matcher whch utlzes the results of SISF to get relatonshp algnments. In addton, a Sub-ontology Extractor (SoE) s ntegrated nto SOBOM to extract sub-ontologes accordng to the anchors got by lngustc matcher and rank them by ther depths descendngly. Overall SOBOM s a sequental method, so t does not care how to combne the results of dfferent matchers. The overvew of the method s llustrated n Fg. 2. O O ' O '' Fg. 1. Two vews of an ontology n SOBOM O ' O ' O' ' Sub_O 1 Sub_O 2 Sub_O 1 Sub_O 2 Fg. 2. The processng overvew of SOBOM For smplcty, we defne some notatons used n the report.

Ontology: An ontology O conssts of a set of concepts C, propertes (relatons) R, nstances I, and axoms A. We use entty e to denote ether c C or r R. Each relaton r has a doman and range defned as followng: Doman ( r ) = { c c C and havng the relatonsh p r} Range ( r ) = { c c C and can be value of r} Anchor: An anchor s defned as a par of assumed equvalent non-leaf concepts across ontologes. Gven two ontologes O 1, O 2, c1 O1, c2 O2, f c1 c2 (means that c1 s dentcal wth c2),and c1, c2 are both not leaf nodes n the herarches of O1 and O2 respectvely, then an anchor, X s defned as a par of concepts < c,c 2 > 1. Sub-Ontology: Let ontology O = (C,R, H C,H R,). A sub-ontology O x s a subset of Owhose elements all come from O, called O x = (C 1,H C C C 1 ), where C1 C, H1 H, x s the root of H C. Indeed, a sub-ontology n our method s a herarchcal taxonomy, and ts root s an anchor concept. Sub-ontology Depth. The depth of sub-ontology O x s the maxmal length of path from the anchor x to the root r of the taxonomy H C whch contans t n the orgnal ontology O. C C ( x..., r ), r H H O x Depth( O ) = Max, 1.2 Specfc technques used SOBOM ams to provde hgh qualty of 1:1 algnments between concept and property pars. We have mplemented SOBOM algorthm n java and ntegrated three dstngushng consttutonal matchers. They are ndependent components n core matcher lbrary of SOBOM. Due to the space lmtaton, we only descrbe the key features of them. The detals can be found n the related paper [8]. Our anchor generator s based on the local context of an entty n ontology. In detals, the local context of an entty ncludng the followng aspects: the textual nformaton (label, d, comments and so on), the structure nformaton (the number of super or sub concepts, the number of constrants) and the ndvdual nformaton (the number of ndvdual f exstng). We consder

that the local context of an entty can express the meanng of t. Consequently, we get three smlarty matrxes respectvely, and we choose the maxmal of them as the fnal results. SISF uses the RDF statement to represent the ontology and utlzes the anchors to nductng the constructon of smlarty propagaton graph for the sub-ontologes. SISF handles the ontology from the vew O and only generate concept-concept algnment. R-matcher s a relatonshp matcher base on the defnton of the ontology. It combnes the lngustc and semantc nformaton of a relaton. From the O vew, t utlzes the s-a herarchy to extend the doman and range of a relatonshp and uses the result of SISF to generate the algnment between relatonshps. More mportantly, SoE s ntegrated nto SOBOM and extracts sub-ontologes accordng to the anchors [5,6]. SoE ranks extracted sub-ontologes accordng to ther depths. As we extract sub-ontologes for ontology matchng, the rules of extractng sub-ontology n SoE are as followng: only sub-concepts of anchor are ncluded n the sub-ontology. In other words, a sub-ontology s a taxonomy whch has anchor as root. If one of the two concepts n an anchor s a leaf node n the orgnal ontology, we do not use SISF to deal wth t actually. Because ths phenomenon just represents a one-to-many mappng. After extractng sub-ontologes, SOBOM wll match these sub-ontologes accordng to ther depth n orgnal ontology. We frst match the subontologes wth larger depth value. By usng SoE, SOBOM can reduce the scale of ontology and make t easy to operate sub-ontologes n SISF. 1.3 Adaptatons made for the evaluaton We don t make any specfc adaptaton for the tests n the OAEI 2010 campagn. All the algnments outputted by SOBOM are based on the same set of parameters.

1.4 Lnk to the system and parameters fle The current verson of SOBOM s avalable at: http://mlg.ht.edu.cn:8080/ontology/download.jsp, and the parameters settng s llustrated n the readng me fle. 1.5 Lnk to the set of provded algnments (n algn format) We deploy our matcher as a web servce, our web servce name s: eu.sealsproject.omt.ws.matcher.algnmentwsimpl. The endpont of our web servce can be found at: http://mlg.ht.edu.cn:8080/sobomservce/sobommatcher?wsdl. 2 Results In ths secton, we descrbe the results of SOBOM algorthm aganst the benchmark, drectory and anatomy ontologes provded by the OAEI 2010 campagn. We use Jena-API to parse the RDF and OWL fles. The experments were carred out on a PC runnng Wndows vsta ultmate wth Core 2 Duo processors and 4-ggabyte memory. 2.1 Benchmark On the bass of the nature, we can dvde the benchmark dataset nto fve groups: #101-104, #201-210, #221-247, #248-266 and #301-304. SOBOM s a sequental matcher. If the lngustc matcher gets no results, SOBOM wll produce no result. We descrbed the performance of our SOBOM algorthm over each group and overall performance on the benchmark test set n Table 1. #101-104 SOBOM plays well for these test cases. #201-210 In ths group, some lngustc features of canddate ontologes are dscarded or modfed, ther structures are qute smlar. SOBOM s a sequental matcher, our anchor generator matches concepts based on ther local context not only

the lngustc nformaton. So, although wthout lngustc nformaton SOBOM also gets relatvely hgh precson and recall. #221-247 The structures of the canddate ontologes are altered n these tests. However, SOBOM dscovers most of the algnments from the lngustc perspectve va our anchor generator, and both the precson and recall are pretty good. #248-266 Both the lngustc and structural characterstcs of the canddate ontologes are changed heavly, so the tests n ths group mght be the most dffcult ones n all the benchmark tests. So, SOBOM does not very well. #301-304 Ths test group are four real-lfe ontologes of bblographc references. SOBOM can only fnd equvalence algnment relatons. Table 1. The performance on the benchmark 101-104 201-210 221-247 248-266 301-304 Precson 1.0 0.99 0.99 0.87 0.77 Recall 1.0 0.85 0.99 0.57 0.70 Compared to our prevous results (OAEI2009), the recall of every group s hghly mproved. Ths s enhanced by our redesgned anchor generator. 2.2 Anatomy The anatomy real world test bed covers the doman of body anatomy and consst of two ontologes, Adult Mouse Anatomy (2247 classes) and NCI Thesaurus (3304 classes). These are relatvely large compared to benchmark ontologes. Ths type ontologes s what SOBOM sutable for handlng, t generated 268 sub-ontologes and 1249 algnments between MA and NCI, consumed 19mn3s to complete the matchng task. It s obvous that most of the algnment appears n the leaf nodes n ontologes (834 leaf node algnments). The experment result shows n Table 2. Table 2. The performance of SOBOM on the anatomy test Leaf node Subontologes Algnments Total Tme algnments consumng

NCI 834 268 1249 19mn3s MA 2.3 Conference There are 120 pars of ontologes n ths track. Most of them are blnd tests (.e. there no reference algnment avalable). The whole results are avalable at: http://seals.nralpes.fr/platform/;jsessond=fd1e3a5ce8da43c1d52db21079ea ECF3?wcket:bookmarkablePage=:eu.sealsproject.omt.u.Results&endpont=http://21 9.217.238.162:8080/SOBOMServce/SOBOMMatcher?wsdl&evaluatonID=http://21 9.217.238.162:8080/SOBOMServce/SOBOMMatcher?wsdl2010/10/03+02:09:03&tr ack=conference+testsute. 3 General comments In ths secton, we want to ntroduce comments on the results of SOBOM algorthm and the way to mprove t. 3.1 Comments on the results Strengths SOBOM deals wth ontology from two dfferent vews and combnes results of every step n a sequental way. If the ontologes have regular lterals and herarchcal structures, SOBOM can acheve satsfactory algnments. And t can avod mssng algnment n many parttonng matchng methods as llustrated n [7]. Weaknesses SOBOM needs anchors to extract sub-ontologes. So t depends on the precson of anchors. In current verson, we use a lngustc matcher to get anchor concept, f the lterals of concept mssed, SOBOM wll get bad results.

3.2 Dscussons on the way to mprove the proposed system SOBOM can be vewed as a frame of ontology matchng. So many ndependent matchers can be ntegrated nto t. Now, we have enhanced the anchor generator by not consderng the textual nformaton but also the structure nformaton. Our next plan s to ntegrate a more powerful matcher to produce anchor concepts or develop a new method to get anchor concepts. Meanwhle, we plan to develop a mappng debuggng method to refne the results of SOBOM. 4 Concluson Ontology matchng s very mportant part of establshng nteroperablty among semantc applcatons. Ths paper reports our partcpaton n OAEI2010 campagn. We present the algnment process of SOBOM and descrbe the specfc technques for ontology matchng. We also show the performance n dfferent algnment tasks. The strengths and the weaknesses of our proposed approach are summarzed and the possble mprovement wll be made for the system n the future. We propose a brand new algorthm to match ontologes. The unque feature of our method s combnng sub-ontology extracton wth ontology matchng based on two dfferent vews of an ontology. References 1. Fausto Gunchgla and Ilya Ahrayeu : Lghtweght Ontologes. Techncal Report. (2007) DIT-07-071. 2. G.Stolos, G. Stamou, S. Kollas: A strng metrc for ontology algnment, In Proc. Of the 4 th Internatonal Semantc Web Conference(ISWC 05). (2005) 623-637. 3. N.F. Noy, M.A. Musen: Anchor-PROMPT: usng non-local context for semantc matchng, In Proc. Of IJCAI2001 Workshop on Ontology and Informaton Sharng, ( 2001) 63-70.

4. S. Melnk, H.G. Molna and E. Rahm: Smlarty Floodng: A Versatle Graph Matchng Algorthm, In Proc 18 th Int l Conf. Data Eng. (ECDE 02) (2002) 117-128. 5. Julan Sedenberg and Alan Rector: Web Ontology Segmentaton: Analyss, Classfcaton and Use, WWW2006, (2006). 6. H. Stuckenschmdt and M. Klen: Structure-Based Parttonng of Large Class Herarches. In Proc of the 3 rd Internatonal Semantc Web Conference (2004). 7. W. Hu, Y. Qu: Block matchng for ontologes, In Proc of the 5 th Internatonal Semantc Web Conference, LNCS, vol. 4273, Sprnger (2006) 300-313. 8. P.G. Xu, H.J. Tao: SOBOM: Sub-ontology based Ontology Matchng. To be appear.