Algnment Results of SOBOM for OAEI 2010 Pegang Xu, Yadong Wang, Lang Cheng, Tany Zang School of Computer Scence and Technology Harbn Insttute of Technology, Harbn, Chna pegang.xu@gmal.com, ydwang@ht.edu.cn, chl198478@126.com,tany.zang@gmal.com Abstract. In ths paper we gve a bref explanaton of how Sub-Ontology based Ontology Matchng (SOBOM) method gets the algnment results at OAEI2010. SOBOM deal wth an ontology from two dfferent vews: an ontology wth s-a herarchcal structure O and an ontology wth other relatonshps O. Frstly, from the O vew, SOBOM starts wth a set of anchors provded by a lngustc matcher. And then t extracts sub-ontologes based on the anchors and ranks these sub-ontologes accordng to ther depths. Secondly, SOBOM utlzes Semantc Inductve Smlarty Floodng algorthm to compute the smlarty of concepts between dfferent sub-ontologes derved from the two ontologes accordng the depth of sub-ontologes to get concept algnments. Fnally, from the O vew, SOBOM gets relatonshp algnments by usng the concept algnment results n O. The experment results show SOBOM can fnd more algnment results than other compared relevant methods. 1 System presentaton Currently more and more ontologes are dstrbutedly bult and used by dfferent organzatons. And these ontologes are usually lght-weghted [1] contanng lots of concepts especally n bomedcne, such as anatomy taxonomy NCI Thesaurus. The Sub-ontology based Ontology Matchng (SOBOM) s desgned for matchng lghtweght ontologes that has s-a herarchy as ther backbones. It matches an ontology
from two vews: O and O that are depcted n Fg. 1. The unque feature of our method s combnng sub-ontology extracton wth ontology matchng. 1.1 State, purpose, general statement SOBOM s developed to match ontology automatcally for general purpose. Based on two dfferent vews, we desgn three elementary matchers n current verson. The frst one s a anchor generator whch s used to fnd anchors; the second one s a structure matcher SISF (Semantc Inductve Smlarty Floodng) whch s nspred by Anchor- Prompt [3] and SF [4] algorthms and s exploted to flood smlarty among concepts. The last one s a relatonshp matcher whch utlzes the results of SISF to get relatonshp algnments. In addton, a Sub-ontology Extractor (SoE) s ntegrated nto SOBOM to extract sub-ontologes accordng to the anchors got by lngustc matcher and rank them by ther depths descendngly. Overall SOBOM s a sequental method, so t does not care how to combne the results of dfferent matchers. The overvew of the method s llustrated n Fg. 2. O O ' O '' Fg. 1. Two vews of an ontology n SOBOM O ' O ' O' ' Sub_O 1 Sub_O 2 Sub_O 1 Sub_O 2 Fg. 2. The processng overvew of SOBOM For smplcty, we defne some notatons used n the report.
Ontology: An ontology O conssts of a set of concepts C, propertes (relatons) R, nstances I, and axoms A. We use entty e to denote ether c C or r R. Each relaton r has a doman and range defned as followng: Doman ( r ) = { c c C and havng the relatonsh p r} Range ( r ) = { c c C and can be value of r} Anchor: An anchor s defned as a par of assumed equvalent non-leaf concepts across ontologes. Gven two ontologes O 1, O 2, c1 O1, c2 O2, f c1 c2 (means that c1 s dentcal wth c2),and c1, c2 are both not leaf nodes n the herarches of O1 and O2 respectvely, then an anchor, X s defned as a par of concepts < c,c 2 > 1. Sub-Ontology: Let ontology O = (C,R, H C,H R,). A sub-ontology O x s a subset of Owhose elements all come from O, called O x = (C 1,H C C C 1 ), where C1 C, H1 H, x s the root of H C. Indeed, a sub-ontology n our method s a herarchcal taxonomy, and ts root s an anchor concept. Sub-ontology Depth. The depth of sub-ontology O x s the maxmal length of path from the anchor x to the root r of the taxonomy H C whch contans t n the orgnal ontology O. C C ( x..., r ), r H H O x Depth( O ) = Max, 1.2 Specfc technques used SOBOM ams to provde hgh qualty of 1:1 algnments between concept and property pars. We have mplemented SOBOM algorthm n java and ntegrated three dstngushng consttutonal matchers. They are ndependent components n core matcher lbrary of SOBOM. Due to the space lmtaton, we only descrbe the key features of them. The detals can be found n the related paper [8]. Our anchor generator s based on the local context of an entty n ontology. In detals, the local context of an entty ncludng the followng aspects: the textual nformaton (label, d, comments and so on), the structure nformaton (the number of super or sub concepts, the number of constrants) and the ndvdual nformaton (the number of ndvdual f exstng). We consder
that the local context of an entty can express the meanng of t. Consequently, we get three smlarty matrxes respectvely, and we choose the maxmal of them as the fnal results. SISF uses the RDF statement to represent the ontology and utlzes the anchors to nductng the constructon of smlarty propagaton graph for the sub-ontologes. SISF handles the ontology from the vew O and only generate concept-concept algnment. R-matcher s a relatonshp matcher base on the defnton of the ontology. It combnes the lngustc and semantc nformaton of a relaton. From the O vew, t utlzes the s-a herarchy to extend the doman and range of a relatonshp and uses the result of SISF to generate the algnment between relatonshps. More mportantly, SoE s ntegrated nto SOBOM and extracts sub-ontologes accordng to the anchors [5,6]. SoE ranks extracted sub-ontologes accordng to ther depths. As we extract sub-ontologes for ontology matchng, the rules of extractng sub-ontology n SoE are as followng: only sub-concepts of anchor are ncluded n the sub-ontology. In other words, a sub-ontology s a taxonomy whch has anchor as root. If one of the two concepts n an anchor s a leaf node n the orgnal ontology, we do not use SISF to deal wth t actually. Because ths phenomenon just represents a one-to-many mappng. After extractng sub-ontologes, SOBOM wll match these sub-ontologes accordng to ther depth n orgnal ontology. We frst match the subontologes wth larger depth value. By usng SoE, SOBOM can reduce the scale of ontology and make t easy to operate sub-ontologes n SISF. 1.3 Adaptatons made for the evaluaton We don t make any specfc adaptaton for the tests n the OAEI 2010 campagn. All the algnments outputted by SOBOM are based on the same set of parameters.
1.4 Lnk to the system and parameters fle The current verson of SOBOM s avalable at: http://mlg.ht.edu.cn:8080/ontology/download.jsp, and the parameters settng s llustrated n the readng me fle. 1.5 Lnk to the set of provded algnments (n algn format) We deploy our matcher as a web servce, our web servce name s: eu.sealsproject.omt.ws.matcher.algnmentwsimpl. The endpont of our web servce can be found at: http://mlg.ht.edu.cn:8080/sobomservce/sobommatcher?wsdl. 2 Results In ths secton, we descrbe the results of SOBOM algorthm aganst the benchmark, drectory and anatomy ontologes provded by the OAEI 2010 campagn. We use Jena-API to parse the RDF and OWL fles. The experments were carred out on a PC runnng Wndows vsta ultmate wth Core 2 Duo processors and 4-ggabyte memory. 2.1 Benchmark On the bass of the nature, we can dvde the benchmark dataset nto fve groups: #101-104, #201-210, #221-247, #248-266 and #301-304. SOBOM s a sequental matcher. If the lngustc matcher gets no results, SOBOM wll produce no result. We descrbed the performance of our SOBOM algorthm over each group and overall performance on the benchmark test set n Table 1. #101-104 SOBOM plays well for these test cases. #201-210 In ths group, some lngustc features of canddate ontologes are dscarded or modfed, ther structures are qute smlar. SOBOM s a sequental matcher, our anchor generator matches concepts based on ther local context not only
the lngustc nformaton. So, although wthout lngustc nformaton SOBOM also gets relatvely hgh precson and recall. #221-247 The structures of the canddate ontologes are altered n these tests. However, SOBOM dscovers most of the algnments from the lngustc perspectve va our anchor generator, and both the precson and recall are pretty good. #248-266 Both the lngustc and structural characterstcs of the canddate ontologes are changed heavly, so the tests n ths group mght be the most dffcult ones n all the benchmark tests. So, SOBOM does not very well. #301-304 Ths test group are four real-lfe ontologes of bblographc references. SOBOM can only fnd equvalence algnment relatons. Table 1. The performance on the benchmark 101-104 201-210 221-247 248-266 301-304 Precson 1.0 0.99 0.99 0.87 0.77 Recall 1.0 0.85 0.99 0.57 0.70 Compared to our prevous results (OAEI2009), the recall of every group s hghly mproved. Ths s enhanced by our redesgned anchor generator. 2.2 Anatomy The anatomy real world test bed covers the doman of body anatomy and consst of two ontologes, Adult Mouse Anatomy (2247 classes) and NCI Thesaurus (3304 classes). These are relatvely large compared to benchmark ontologes. Ths type ontologes s what SOBOM sutable for handlng, t generated 268 sub-ontologes and 1249 algnments between MA and NCI, consumed 19mn3s to complete the matchng task. It s obvous that most of the algnment appears n the leaf nodes n ontologes (834 leaf node algnments). The experment result shows n Table 2. Table 2. The performance of SOBOM on the anatomy test Leaf node Subontologes Algnments Total Tme algnments consumng
NCI 834 268 1249 19mn3s MA 2.3 Conference There are 120 pars of ontologes n ths track. Most of them are blnd tests (.e. there no reference algnment avalable). The whole results are avalable at: http://seals.nralpes.fr/platform/;jsessond=fd1e3a5ce8da43c1d52db21079ea ECF3?wcket:bookmarkablePage=:eu.sealsproject.omt.u.Results&endpont=http://21 9.217.238.162:8080/SOBOMServce/SOBOMMatcher?wsdl&evaluatonID=http://21 9.217.238.162:8080/SOBOMServce/SOBOMMatcher?wsdl2010/10/03+02:09:03&tr ack=conference+testsute. 3 General comments In ths secton, we want to ntroduce comments on the results of SOBOM algorthm and the way to mprove t. 3.1 Comments on the results Strengths SOBOM deals wth ontology from two dfferent vews and combnes results of every step n a sequental way. If the ontologes have regular lterals and herarchcal structures, SOBOM can acheve satsfactory algnments. And t can avod mssng algnment n many parttonng matchng methods as llustrated n [7]. Weaknesses SOBOM needs anchors to extract sub-ontologes. So t depends on the precson of anchors. In current verson, we use a lngustc matcher to get anchor concept, f the lterals of concept mssed, SOBOM wll get bad results.
3.2 Dscussons on the way to mprove the proposed system SOBOM can be vewed as a frame of ontology matchng. So many ndependent matchers can be ntegrated nto t. Now, we have enhanced the anchor generator by not consderng the textual nformaton but also the structure nformaton. Our next plan s to ntegrate a more powerful matcher to produce anchor concepts or develop a new method to get anchor concepts. Meanwhle, we plan to develop a mappng debuggng method to refne the results of SOBOM. 4 Concluson Ontology matchng s very mportant part of establshng nteroperablty among semantc applcatons. Ths paper reports our partcpaton n OAEI2010 campagn. We present the algnment process of SOBOM and descrbe the specfc technques for ontology matchng. We also show the performance n dfferent algnment tasks. The strengths and the weaknesses of our proposed approach are summarzed and the possble mprovement wll be made for the system n the future. We propose a brand new algorthm to match ontologes. The unque feature of our method s combnng sub-ontology extracton wth ontology matchng based on two dfferent vews of an ontology. References 1. Fausto Gunchgla and Ilya Ahrayeu : Lghtweght Ontologes. Techncal Report. (2007) DIT-07-071. 2. G.Stolos, G. Stamou, S. Kollas: A strng metrc for ontology algnment, In Proc. Of the 4 th Internatonal Semantc Web Conference(ISWC 05). (2005) 623-637. 3. N.F. Noy, M.A. Musen: Anchor-PROMPT: usng non-local context for semantc matchng, In Proc. Of IJCAI2001 Workshop on Ontology and Informaton Sharng, ( 2001) 63-70.
4. S. Melnk, H.G. Molna and E. Rahm: Smlarty Floodng: A Versatle Graph Matchng Algorthm, In Proc 18 th Int l Conf. Data Eng. (ECDE 02) (2002) 117-128. 5. Julan Sedenberg and Alan Rector: Web Ontology Segmentaton: Analyss, Classfcaton and Use, WWW2006, (2006). 6. H. Stuckenschmdt and M. Klen: Structure-Based Parttonng of Large Class Herarches. In Proc of the 3 rd Internatonal Semantc Web Conference (2004). 7. W. Hu, Y. Qu: Block matchng for ontologes, In Proc of the 5 th Internatonal Semantc Web Conference, LNCS, vol. 4273, Sprnger (2006) 300-313. 8. P.G. Xu, H.J. Tao: SOBOM: Sub-ontology based Ontology Matchng. To be appear.