A Large Scale Taxonomy Mapping Evaluation

Size: px
Start display at page:

Download "A Large Scale Taxonomy Mapping Evaluation"

Transcription

1 A Large Scale Taxonomy Mapping Evaluation Paolo Avesani 1, Fausto Giunchiglia 2, and Mikalai Yatskevich 2 1 ITC-IRST, Povo, Trento, Italy avesani@itc.it 2 Dept. of Information and Communication Technology, University of Trento, Povo, Trento, Italy {fausto,yatskevi}@dit.unitn.it Abstract. Matching hierarchical structures, like taxonomies or web directories, is the premise for enabling interoperability among heterogenous data organizations. While the number of new matching solutions is increasing the evaluation issue is still open. This work addresses the problem of comparison for pairwise matching solutions. A methodology is proposed to overcome the issue of scalability. A large scale dataset is developed based on real world case study namely, the web directories of Google, Looksmart and Yahoo!. Finally, an empirical evaluation is performed which compares the most representative solutions for taxonomy matching. We argue that the proposed dataset can play a key role in supporting the empirical analysis for the research effort in the area of taxonomy matching. 1 Introduction Taxonomic structures are commonly used in file systems, market place catalogs, and the directories of Web portals. They are now widespread as knowledge repositories (in this case they can be viewed as shallow ontologies [23]) and the problem of their integration and interoperability is acquiring a high relevance from a scientific and commercial perspective. A typical application of hierarchical classification interoperability occurs when a set of companies wants to exchange products without sharing a common product catalog. The typical solution to the interoperability problem amounts to performing matching between taxonomies. The Match operator takes two graph-like structures as input and produces a mapping between the nodes of the graphs that correspond semantically to each other. Many diverse solutions to the matching problem have been proposed so far, see for example surveys in [20, 21] and concrete solutions [13, 15, 6, 18, 24, 3, 19, 17, 8], etc. Unfortunately nearly all of them suffer from the lack of evaluation. Until very recently there were no comparative evaluations and it was quite difficult to find two systems which were evaluated on the same dataset. At the same time the evaluation efforts were mostly concentrated either on datasets artificially synthesized under questionable assumptions or on the toy examples.

2 In this paper we introduce a large scale dataset for evaluating matching solutions. The dataset is constructed from the mappings extracted from real web directories and contains thousands of mappings. We have evaluated the dataset using the most representative state of the art solutions to the matching problem. The evaluation highlighted that the dataset has four key properties namely Complexity, Discrimination capability, Incrementality and Correctness. The first means that the dataset is hard for state of the art matching systems, the second that it discriminates among the various matching solutions, the third that it is effective in recognizing weaknesses in the state of the art matching systems and the fourth that it can be considered as a correct tool to support the improvement and research on the matching solutions. At the same time the current version of dataset contains only true positive mappings. This fact limits the evaluations on the dataset to measuring only Recall. This is a weakness of the dataset that we plan to improve. However, as highlighted in [16], the biggest problem in nowadays matching systems is recall, while completeness is much less of an issue. The rest of the paper is organized as follows. Section 2 summarizes the definition of the matching problem and recalls the state of the art. Section 3 expands more on the notion of mapping evaluation problem. Section 4 illustrates how the large scale dataset has been arranged. Section 5 is devoted to a large scale empirical evaluation on two leading matching systems. Section 6 presents the results of our experiments and argues why the proposed dataset is of interest. Section 7 concludes the paper. 2 The Matching Problem In order to motivate the matching problem and illustrate one of the possible situations which can arise in the data integration task let us use the two taxonomies A and B depicted on Figure 1. They are taken from Yahoo! and Standard business catalogues. Suppose that the task is to integrate these two taxonomies. We assume that all the data and conceptual models (e.g., classifications, database schemas, taxonomies and ontologies) can be represented as graphs (see [9] for a detailed discussion). Therefore, the matching problem can be represented as extraction of graph-like structures from the data or conceptual models and matching the obtained graphs. This allows for the statement and solution of a generic matching problem, very much along the lines of what done in [15, 13]. The first step in the integration process is to identify candidates to be merged or to have relationships under an integrated taxonomy. For example, Computer Hardware A can be assumed equivalent to Computer Hardware B and more general than Personal Computers B. Hereafter the subscripts designate the schema (either A or B) from which the node is derived. We think of a mapping element as a 4-tuple ID ij, n1 i, n2 j, R, i = 1,...,N 1 ; j = 1,...,N 2 ; where ID ij is a unique identifier of the given mapping element; n1 i is the i-th node of the first graph, N 1 is the number of nodes in the first graph; n2 j is the j-th node of the second graph, N 2 is the number of nodes in

3 Fig. 1. Parts of Yahoo and Standard taxonomies the second graph; and R specifies a similarity relation of the given nodes. A mapping is a set of mapping elements. We think of matching as the process of discovering mappings between two graph-like structures through the application of a matching algorithm. Matching approaches can be classified into syntactic and semantic depending on how mapping elements are computed and on the kind of similarity relation R used (see [10] for in depth discussion): In syntactic matching the key intuition is to find the syntactic (very often string based) similarity between the labels of nodes. Similarity relation R in this case is typically represented as a [0, 1] coefficient, which is often considered as equivalence relation with certain level of plausibility or confidence (see [13, 7] for example). Similarity coefficients usually measure the closeness between two elements linguistically and structurally. For example, the similarity between Computer Storage Devices A and Data Storage Devices B based on linguistical and structural analysis could be 0,63. Semantic matching is an approach where semantic relations are computed between concepts (not between labels) at nodes. The possible semantic relations (R) are: equivalence (=); more general or generalization ( ); less general or specification ( ); mismatch ( ); overlapping ( ). They are ordered according to decreasing binding strength, i.e., from the strongest (=) to the weakest ( ). For example, as from Figure 1 Computer Hardware A is more general than Large Scale Com-puters B In this paper we are focused on taxonomy matching. We think about taxonomy as a N,A,F l, where N is a set of nodes, A is a set of arcs, such that N,A is a rooted tree. F l is a function from N to set of labels L (i.e., words in natural language). An example of taxonomy is presented on Figure 1. Notice that the distinguishing feature of taxonomies is the lack of formal encoding semantics.

4 3 The Evaluation Problem Nearly all state of the art matching systems suffer from the lack of evaluation. Till very recently there was no comparative evaluation and it was quite difficult to find two systems evaluated on the same dataset. Often authors artificially synthesize datasets for empirical evaluation but rarely they explain their premises and assumptions. The last efforts [22] on matching evaluation concentrate rather on artificially produced and quite simple examples than real world matching tasks. Most of the current evaluation efforts were devoted to the schemas with tenth of nodes and only some recent works (see [6] for example) present the evaluation results for the graphs with hundreds of nodes. At the same time industrial size schemas contain up to tenth thousands of nodes. The evaluation problem can be summarized as the problem of acquiring the reference relationship that holds between two nodes. Given such a reference relationship it would be straightforward to evaluate the result of a matching solution. Up to now the acquisition of the reference mappings that hold among the nodes of two taxonomies is performed manually. Similarly to the annotated corpora for information retrieval or information extraction, we need to annotate a corpus of pairwise relationships. Of course such an approach prevents the opportunity of having large corpora. The number of mappings between two taxonomies are quadratic with respect to taxonomy size, what makes hardly possible the manual mapping of real world size taxonomies. It is worthwhile to remember that web directories, for example, have tens thousands of nodes. Certain heuristics can help in reducing the search space but the human effort is still too demanding. Our proposal is to build a reference interpretation for a node looking at its use. We argue that the semantics of nodes can be derived by their pragmatics, i.e., how they are used. In our context, the nodes of a taxonomy are used to classify documents. The set of documents classified under a given node implicitly defines its meaning. This approach has been followed by other researchers. For example in [5, 14] the interpretation of a node is approximated by a model computed through statistical learning. Of course the accuracy of the interpretation is affected by the error of the learning model. We follow a similar approach but without the statistical approximation. The working hypothesis is that the meaning of two nodes is equivalent if the sets of documents classified under those nodes have a meaningful overlap. The basic idea is to compute the relationship hypotheses based on the cooccurence of documents. This document-driven interpretation can be used as a reference value for the evaluation of competing matching solutions. A simple definition of equivalence relationship based on documents can be derived by the F1 measure of information retrieval. Figure 2 shows a simple example. In the graphical representation we have two taxonomies, for each of them we focus our attention on a reference node. Let be S and P two sets of documents classified under the reference nodes of the first and second taxonomies respectively. We will refer to A S and A P as the set of documents classified under the ancestor nodes of S and P. Conversely, we will refer to T S and T P as the set of documents classified under the subtrees of S

5 and P. The goal is to define a relationship hypothesis based on the overlapping of the set of documents, i.e. the pragmatic use of the nodes. The first step, the equivalence relationship, can be easily formulated as the F1 measure of information retrieval [2]. The similarity of two sets of documents is defined as the ratio between the marginal sets and the shared documents: Equivalence = MS P + MP S O S P where the set of shared documents is defined as OP S = P S and MS P = S \OS P is the marginal set of documents classified by S and not classified by P (similarly MS P = P \ OS P ). The following equivalence applies OS P = OP S. Notice that O stands for overlapping and M stands for Marginal set. Fig. 2. The pairwise relationships between two taxonomies. We do a step forward because we do not only compute the equivalence hypothesis based on the notion of F1 measure of information retrieval, but we extend such equation to define the formulation of generalization and specialization hypotheses. Generalization and specialization hypotheses can be formulated taking advantage of the contextual encoding of knowledge in terms of hierarchies of categories. The challenge is to formulate a generalization hypothesis (and conversely a specialization hypothesis) between two nodes looking at the overlapping of set of documents classified in the ancestor or subtree of the reference nodes [1].

6 The generalization relationship holds when the first node has to be considered more general of the second node. Intuitively, it happens when the documents classified under the first nodes occur in the ancestor of the second node, or the documents classified under the second node occur in the subtree of the first node. Following this intuition we can formalize the generalization hypothesis as Generalization = M S P + MP S O S P + OP A S + O S T P where OA P S represents the set of documents resulting from the intersection between MS P and the set of documents classified under the concepts in the hierarchy above S (i.e. the ancestors); similarly OT S P represents the set of documents resulting from the intersection between MP S and the set of documents classified under the concepts in the hierarchy below P (i.e. the children). In a similar way we can conceive the specialization relationship. The first node is more specific than second node when the meaning associated to the first node can be subsumed by the meaning of the second node. Intuitively, it happens when the documents classified under the first nodes occur in the subtree of the second node, or the documents classified under the second node occur in the ancestor of the first node. Specialization = M S P + MP S O S P + OP T S + O S A P where OT P S represents the set of documents resulting from the intersection between MS P and the set of documents classified under the concepts in the hierarchy below S (i.e. the children); similarly OA S P represents the set of documents resulting from the intersection between MP S and the set of documents classified under the concepts in the hierarchy above P (i.e. the ancestors). The three definitions above allow us to compute a relationship hypothesis between two nodes of two different taxonomies. Such an hypothesis relies on the assumption that if two nodes classify the same set of documents, the meaning associated to the nodes is reasonably the same. Of course this assumption is true for a virtually infinite set of documents. In a real world case study we face with finite set of documents, and therefore, this way of proceeding is prone to error. Nevertheless, our claim is that the approximation introduced by our assumption is balanced by the benefit of scaling with the annotation of large taxonomies. 4 Building a Large Scale Mapping Dataset Let us try to apply the notion of document-driven interpretation to a real world case study. We focus our attention to web directories for many reasons. Web directories are widely used and known; moreover they are homogeneous, that is they cover general topics. The meaning of a node in a web directory is not defined with formal semantics but by pragmatics. Furthermore the web directories

7 address the same space of documents, therefore the working hypothesis of cooccurence of documents can be sustainable. Of course different web directories don t cover the same portion of the web but the overlapping is meaningful. The case study of web directories meets two requirements of the matching problem: to have heterogeneous representations of the same topics and to have taxonomies of large dimensions. We address three main web directories: Google, Yahoo! and Looksmart. Nodes have been considered as categories denoted by the lexical labels, the tree structures have been considered as hierarchical relations, and the URL classified under a given node as documents. The following table summarizes the total amount of processed data. Web Directories Google Looksmart Yahoo! number of nodes number of urls Let us briefly describe the process by which we have arranged an annotated corpus of pairwise relations between web directories. Step 1. We crawled all three web directories, both the hierarchical structure and the web contents, then we computed the subset of URLs classified by all of them. Step 2. We pruned the downloaded web directories by removing all the URLs that were not referred by all the three web directories. Step 3. We performed an additional pruning by removing all the nodes with a number of URLs under a given threshold. In our case study we fixed such a threshold at 10. Step 4. We manually recognized potential overlapping between two branches of two different web directories like Google:/Top/Science/Biology Looksmart:/Top/Science-and-Health/Biology Yahoo:/Top/Computers-and-Internet/Internet Looksmart:/Top/Computing/Internet Google:/Top/Reference/Education Yahoo:/Top/Education We recognized 50 potential overlapping and for each of them we run an exhaustive assessment on all the possible pairs between the two related subtrees. Such an heuristic allowed us to reduce the quadratic explosion of cartesian product of two web directories. We focussed the analysis on smaller subtrees where the overlaps were more likely. Step 5. We computed the three document-driven hypothesis for equivalence, generalization and specialization relationships as described above. Hypotheses of equivalence, generalization and specialization are normalized and estimated by a number in the range [0,1]. Since the cumulative hypothesis of

8 all three relationships for the same pair of nodes can not be higher than 1, we introduce a threshold to select the winning hypothesis. We fixed such a threshold to 0.5. We discarded all the pairs where none of the three relationship hypotheses was detected. This process allowed us to obtain 2265 pairwise relationships defined using the document-driven interpretation. Half are equivalence relationships and half are generalization relationships (notice that by definition generalization and specialization hypothesis are symmetric). In the following we will refer to this dataset as TaxME, TAXonomy Mapping Evaluation. 5 The Empirical Evaluation The evaluation was designed in order to assess the major dataset properties namely: Complexity, namely the fact that the dataset is hard for state of the art matching systems. Discrimination ability, namely the fact that the dataset can discriminate among various matching approaches. Incrementality, namely the fact that the dataset allows to incrementally discover the weaknesses of the tested systems. Correctness, namely the fact that the dataset can be a source of correct results. We have evaluated two state of the art matching systems COMA 3 and S M atch and compared their results with baseline solution. Let us describe the matching systems in more detail. The COMA system [13] is a generic syntactic schema matching tool. It exploits both element and structure level techniques and combines the results of their independent execution using several aggregation strategies. COMA provides an extensible library of matching algorithms and a framework for combining obtained results. Matching library contains 6 individual matchers, 5 hybrid matchers and 1 reuse-oriented matcher. One of the distinct features of the COMA tool is the possibility of performing iterations in the matching process. In the evaluation we used default combination of matchers and aggregation strategy (NamePath+Leaves and Average respectively). S-Match is a generic semantic matching tool. It takes two tree-like structures and produces a set of mappings between their nodes. S-Match implements semantic matching algorithm in 4 steps. On the first step the labels of nodes are linguistically preprocessed and their meanings are obtained from the Oracle (in the current version WordNet 2.0 is used as an Oracle). On the second step the 3 In the evaluation we use the version of COMA described in [13]. A newer version of the system COMA++ exists but we do not have it. However as from the evaluation results presented in [10, 11], COMA is still best among the other syntactic matchers.

9 meaning of the nodes is refined with respect to the tree structure. On the third step the semantic relations between the labels at nodes and their meanings are computed by the library of element level semantic matchers. On the fourth step the matching results are produced by reduction of the node matching problem into propositional validity problem, which is efficiently solved by SAT solver or ad hoc algorithm (see [10, 11] for more details). We have compared the performance of these two systems with baseline solution. The pseudo code of baseline node matching algorithm is given in Algorithm 1. It is executed for each pair of nodes in two trees. The algorithm considers a simple string comparison among the labels placed on the path spanning from a node to the root of the tree. Equivalence, more general and less general relations are computed as the corresponding logical operations on the sets of the labels. Algorithm 1 Baseline node matching algorithm 1: String nodematch(node sourcenode, Node targetnode) 2: Set sourcesetoflabels=getlabelsinpathtoroot(sourcenode) 3: Set targetsetoflabels=getlabelsinpathtoroot(targetnode) 4: if sourcesetoflabels targetsetoflabels then 5: result= 6: else if sourcesetoflabels targetsetoflabels then 7: result= 8: else if sourcesetoflabels targetsetoflabels then 9: result= 10: else 11: result= Idk 12: end if 13: return result The systems have been evaluated on the dataset described in Section 4. We computed the number of matching tasks solved by each matching system. Notice that the matching task was considered to be solved in the case when the matching system produce specification, generalization or equivalence semantic relation for it. For example, TaxME suggests that specification relation holds in the following example: Google:/Top/Sports/Basketball/Professional/NBDL Looksmart:/Top/Sports/Basketball COMA produced for this matching task 0.58 similarity coefficient, which can be considered as equivalence relation with probability In the evaluation we consider this case as true positive for COMA (i.e., the mapping was considered as found by the system). Notice that at present TaxME contains only true positive mappings. This fact allows to obtain the correct results for Recall measure, which is defined as a ratio of reference mappings found by the system to the number of reference mappings. At the same time Precision, which is defined as ratio of reference

10 mappings found by the system to the number of mappings in the result, can not be correctly estimated by the dataset since, as from Section 4, TaxME guarantee only the correctness but not completeness of the mappings it contains. 6 Discussion of Results Evaluation results are presented on Table 1. It contains the total number of mappings found by the systems and the partitioning of the mappings on semantic relations. Let us discuss the results through the major dataset properties perspective. Table 1. Evaluation Results Google vs. Looksmart Google vs. Yahoo Looksmart vs.yahoo Total COMA (38,68%) = not applicable not applicable not applicable not applicable not applicable not applicable not applicable not applicable S-Match (29,54%) = Baseline (5,39%) = Complexity As from Table 1, the results of baseline are surprisingly low. It produced slightly more than 5% of mappings. This result is interesting since on the previously evaluated datasets (see [4] for example) the similar baseline algorithm performed quite well and found up to 70% of mappings. This lead us to conclusion that the dataset is not trivial (i.e., it is essentially hard for simple matching techniques). As from Figure 3, S-Match found about 30% of the mappings in the biggest (Google-Yahoo) matching task. At the same time it produced slightly less than 30% of mappings in all the tasks. COMA found about 35% of mappings on Google-Looksmart and Yahoo-Looksmart matching tasks. At the same time it produced the best result on Google-Yahoo. COMA found slightly less than 40% of all the mappings. These results are interesting since, as from [13, 10], previously reported recall values for both systems were in 70-80% range. This fact turn us to conclusion that the dataset is hard for state of the art syntactic and semantic matching systems.

11 Fig. 3. Percentage of correctly determined mappings(recall) 6.2 Discrimination ability Consider Figure 4. It presents the partitioning of the mappings found by S- Match and COMA. As from the figure the sets of mappings produced by COMA and S-Match intersects only on 15% of the mappings. This fact turns us to an important conclusion: the dataset is discriminating (i.e., it contains a number of features which are essentially hard for various classes of matching systems and allow to discriminate between the major qualities of the systems). Fig. 4. Partitioning of the mappings found by COMA and S-Match 6.3 Incrementality In order to evaluate incrementality we have chosen S-Match as a test system. In order to identify the shortcomings of S-Match we manually analyzed the mappings missed by S-Match. This analysis allowed us to clasterize the mismatches into several categories. In this paper we describe in detail one of the most important categories of mismatches namely Meaningless labels. Consider the following example:

12 Google:/Top/Science/Social_Sciences/Archaeology/Alternative/ South_America/Nazca_Lines Looksmart:/Top/Science_&_Health/Social_Science/Archaeology/ By_Region/Andes_South_America/Nazca In this matching task some labels are meaningful in the sense they define the context of the concept. In our example these are Social Sciences, Archaeology, South America, Nazca. The other labels do not have a great influence on the meaning of concept. At the same time they can prevent S-Match from producing the correct semantic relation. In our example S-Match can not find any semantic relation connecting Nazca Lines and Nazca. The reason for this is By Region label, which is meaningless in the sense it is defined only for readability and taxonomy partitioning purposes. An other example of this kind is Google:/Top/Arts/Celebrities/A/Affleck,_Ben Looksmart:/Top/Entertainment/Celebrities/Actors/Actors_A/ Actors_Aa-Af/Affleck,_Ben/Fan_Dedications Here, A and Actors A/Actors Aa-Af do not influence on the meaning of the concept. At the same time they prevent S-Match to produce the correct semantic relation holding between the concepts. An optimized version of S-Match (S-Match++) has a list of meaningless labels. At the moment the list contains only about 30 words but it is automatically enriched in preprocessing phase. A general rule for considering natural language label as meaningless is to check whether it is used for taxonomy partitioning purposes. For example, S-Match++ consider as meaningless the labels with the following structure by word, where word stands for any word in natural language. However, this method is not effective in the case of labels composed from alphabet letters (such as Actors Aa-Af from previous example). S-Match++ deals with the latter case in the following way: the combination of letters are considered as meaningless if it is not recognized by WordNet, not in abbreviation or proper name list, and at the same time its length is less or equal to 3. The addition of these techniques allowed to improve significantly the S-Match matching capability. The number of mappings found by the system on TaxME dataset increased by 15%. This result gives us an evidence to incrementality of the dataset (i.e., the dataset allows to discover the weaknesses of the systems and gives the clues to the systems evolution). Analysis of S-Match results on TaxME allowed to identify 10 major bottlenecks in the system implementation. At the moment we are developing ad hoc techniques allowing to improve S-Match results in this cases. The current version of S-Match (S-Match++) contains the techniques allowing to solve 5 out of 10 major categories of mismatches. Consider Figure 5. It contains the results of comparative evaluation S-Match++ against the other systems. As from the figure S-Match++ significantly outperforms all the other systems. It found about 60% of mappings in all the matching tasks, what is twice better than S-Match result. This significant improvement would hardly be possible without comprehensive evaluation on TaxME dataset.

13 Fig. 5. Percentage of correctly determined mappings(recall) 6.4 Correctness We manually analyzed correctness of the mappings provided by TaxME. At the moment 60% of mappings are processed and only 2-3% of them are not correct. Taking into account the notion of idiosyncratic classification [12] (or the fact that human annotators on the sufficiently big and complex dataset tend to have resemblance up to 20% in comparison with their own results), such a mismatch can be considered as marginal. 7 Conclusions In this paper we have presented a mapping dataset which carries the key important properties of Complexity, Incrementality, Discrementality and Correctness. We have evaluated the dataset on two state of the art matching systems representing different matching approaches. As from the evaluation, the dataset can be considered as a powerful tool to support the evaluation and research on the matching solutions. The ultimate step which needs to be performed is to acquire the user mappings for TaxME dataset. We have already arranged such a kind of test and the results though preliminary are promising. Unfortunately at the moment more significant statistics needs to be collected in order to further improve TaxME. Acknowledgment We would like to thank Claudio Fontana and Christian Girardi for their helpful contribution in crawling and processing the web directories. We also would like to thank Pavel Shvaiko and Ilya Zaihrayev for their work on S-Match. This work has been partially supported by the European Knowledge Web network of excellence (IST ).

14 References 1. P. Avesani. Evaluation framework for local ontologies interoperability. In AAAI Workshop on Meaning Negotiation, R. Baeza-Yates and B. Ribeiro-Neto. Modern Information Retrieval. Addison Wesley, S. Bergamaschi, S. Castano, and M. Vincini. Semantic integration of semistructured and structured data sources. SIGMOD Record, (28(1)):54 59, P. Bouquet, L. Serafini, and S. Zanobini. Semantic coordination: a new approach and an application. In Fensel D., Sycara K. P., and Mylopoulos J., editors, The Semantic Web, volume 2870 of LNCS, Sanibel Island, Fla., October H. Doan, P. Domingos, and A. Halevy. Learning to match the schemas of data sources: A multistrategy approach. Machine Learning, 50: , M. Ehrig and Y. Sure. Ontology mapping - an integrated approach. In Christoph Bussler, John Davis, Dieter Fensel, and Rudi Studer, editors, Proceedings of the First European Semantic Web Symposium, volume 3053 of Lecture Notes in Computer Science, pages 76 91, Heraklion, Greece, MAY Springer Verlag. 7. J. Euzenat and P. Valtchev. An integrative proximity measure for ontology alignment. In Proceedings of Semantic Integration workshop at International Semantic Web Conference (ISWC), J. Euzenat and P. Valtchev. Similarity-based ontology alignment in OWL-lite. In Proceedings of European Conference on Artificial Intelligence (ECAI), pages , F. Giunchiglia and P. Shvaiko. Semantic matching. The Knowledge Engineering Review Journal, (18(3)): , F. Giunchiglia, P. Shvaiko, and M. Yatskevich. S-match: an algorithm and an implementation of semantic matching. In Bussler C., Davies J., Fensel D., and Studer R., editors, The semantic web: research and applications, volume 3053 of LNCS, Heraklion, May F. Giunchiglia, M. Yatskevich, and E. Giunchiglia. Efficient semantic matching. In Proceedings of the 2nd european semantic web conference (ESWC 05), Heraklion, 29 May-1 June D. Goren-Bar and T.Kuflik. Supporting user-subjective categorization with selforganizing maps and learning vector quantization. Journal of the American Society for Information Science and Technology JASIST, 56(4): , H.H.Do and E. Rahm. COMA - a system for flexible combination of schema matching approaches. In Proceedings of Very Large Data Bases Conference (VLDB), pages , R. Ichise, H. Takeda, and S. Honiden. Integrating multiple internet directories by instance-based learning. In IJCAI, pages 22 30, J. Madhavan, P. A. Bernstein, and E. Rahm. Generic schema matching with cupid. The Very Large Databases (VLDB) Journal, pages 49 58, B. Magnini, M. Speranza, and C. Girardi. A semantic-based approach to interoperability of classification hierarchies: Evaluation of linguistic techniques. In Proceedings of COLING-2004, August 23-27, D. L. McGuinness, R. Fikes, J. Rice, and S. Wilder. An environment for merging and testing large ontologies. In Proceedings of International Conference on the Principles of Knowledge Representation and Reasoning (KR), pages , S. Melnik, H. Garcia-Molina, and E. Rahm. Similarity flooding: A versatile graph matching algorithm. In Proceedings of International Conference on Data Engineering (ICDE), pages , 2002.

15 19. Noy N. and Musen M. A. Anchor-prompt: Using non-local context for semantic matching. In Proceedings of workshop on Ontologies and Information Sharing at International Joint Conference on Artificial Intelligence (IJCAI), pages 63 70, E. Rahm and P. Bernstein. A survey of approaches to automatic schema matching. VLDB Journal, (10(4)): , P. Shvaiko and J. Euzenat. A survey of schema-based matching approaches. Journal on Data Semantics, IV, Y. Sure, O. Corcho, J. Euzenat, and T. Hughes. Evaluation of Ontology-based Tools. Proceedings of the 3rd International Workshop on Evaluation of Ontologybased Tools (EON), C. Welty and N. Guarino. Supporting ontological analysis of taxonomic relationships. Data and Knowledge Engineering, (39(1)):51 74, L. Xu and D.W. Embley. Using domain ontologies to discover direct and indirect matches for schema elements. In Proceedings of Semantic Integration workshop at International Semantic Web Conference (ISWC), 2003.

Schema-based Semantic Matching: Algorithms, a System and a Testing Methodology

Schema-based Semantic Matching: Algorithms, a System and a Testing Methodology Schema-based Semantic Matching: Algorithms, a System and a Testing Methodology Abstract. Schema/ontology/classification matching is a critical problem in many application domains, such as, schema/ontology/classification

More information

38050 Povo Trento (Italy), Via Sommarive 14 DISCOVERING MISSING BACKGROUND KNOWLEDGE IN ONTOLOGY MATCHING

38050 Povo Trento (Italy), Via Sommarive 14  DISCOVERING MISSING BACKGROUND KNOWLEDGE IN ONTOLOGY MATCHING UNIVERSITY OF TRENTO DEPARTMENT OF INFORMATION AND COMMUNICATION TECHNOLOGY 38050 Povo Trento (Italy), Via Sommarive 14 http://www.dit.unitn.it DISCOVERING MISSING BACKGROUND KNOWLEDGE IN ONTOLOGY MATCHING

More information

38050 Povo Trento (Italy), Via Sommarive 14 Fausto Giunchiglia, Pavel Shvaiko and Mikalai Yatskevich

38050 Povo Trento (Italy), Via Sommarive 14   Fausto Giunchiglia, Pavel Shvaiko and Mikalai Yatskevich UNIVERSITY OF TRENTO DEPARTMENT OF INFORMATION AND COMMUNICATION TECHNOLOGY 38050 Povo Trento (Italy), Via Sommarive 14 http://www.dit.unitn.it SEMANTIC MATCHING Fausto Giunchiglia, Pavel Shvaiko and Mikalai

More information

Ontology matching using vector space

Ontology matching using vector space University of Wollongong Research Online University of Wollongong in Dubai - Papers University of Wollongong in Dubai 2008 Ontology matching using vector space Zahra Eidoon University of Tehran, Iran Nasser

More information

DIPARTIMENTO DI INGEGNERIA E SCIENZA DELL INFORMAZIONE Povo Trento (Italy), Via Sommarive 14

DIPARTIMENTO DI INGEGNERIA E SCIENZA DELL INFORMAZIONE Povo Trento (Italy), Via Sommarive 14 UNIVERSITY OF TRENTO DIPARTIMENTO DI INGEGNERIA E SCIENZA DELL INFORMAZIONE 38050 Povo Trento (Italy), Via Sommarive 14 http://www.disi.unitn.it A LARGE SCALE DATASET FOR THE EVALUATION OF ONTOLOGY MATCHING

More information

Semantic Schema Matching

Semantic Schema Matching Semantic Schema Matching Fausto Giunchiglia, Pavel Shvaiko, and Mikalai Yatskevich University of Trento, Povo, Trento, Italy {fausto pavel yatskevi}@dit.unitn.it Abstract. We view match as an operator

More information

Ontology Matching Techniques: a 3-Tier Classification Framework

Ontology Matching Techniques: a 3-Tier Classification Framework Ontology Matching Techniques: a 3-Tier Classification Framework Nelson K. Y. Leung RMIT International Universtiy, Ho Chi Minh City, Vietnam nelson.leung@rmit.edu.vn Seung Hwan Kang Payap University, Chiang

More information

38050 Povo (Trento), Italy Tel.: Fax: e mail: url:

38050 Povo (Trento), Italy Tel.: Fax: e mail: url: CENTRO PER LA RICERCA SCIENTIFICA E TECNOLOGICA 38050 Povo (Trento), Italy Tel.: +39 0461 314312 Fax: +39 0461 302040 e mail: prdoc@itc.it url: http://www.itc.it AN ALGORITHM FOR MATCHING CONTEXTUALIZED

More information

Towards Rule Learning Approaches to Instance-based Ontology Matching

Towards Rule Learning Approaches to Instance-based Ontology Matching Towards Rule Learning Approaches to Instance-based Ontology Matching Frederik Janssen 1, Faraz Fallahi 2 Jan Noessner 3, and Heiko Paulheim 1 1 Knowledge Engineering Group, TU Darmstadt, Hochschulstrasse

More information

Ontology matching techniques: a 3-tier classification framework

Ontology matching techniques: a 3-tier classification framework University of Wollongong Research Online Faculty of Informatics - Papers (Archive) Faculty of Engineering and Information Sciences 2009 Ontology matching techniques: a 3-tier classification framework Nelson

More information

3 Classifications of ontology matching techniques

3 Classifications of ontology matching techniques 3 Classifications of ontology matching techniques Having defined what the matching problem is, we attempt at classifying the techniques that can be used for solving this problem. The major contributions

More information

arxiv: v1 [cs.db] 23 Feb 2016

arxiv: v1 [cs.db] 23 Feb 2016 SIFT: An Algorithm for Extracting Structural Information From Taxonomies Jorge Martinez-Gil, Software Competence Center Hagenberg (Austria), jorgemar@acm.org Keywords: Algorithms; Knowledge Engineering;

More information

Enabling Product Comparisons on Unstructured Information Using Ontology Matching

Enabling Product Comparisons on Unstructured Information Using Ontology Matching Enabling Product Comparisons on Unstructured Information Using Ontology Matching Maximilian Walther, Niels Jäckel, Daniel Schuster, and Alexander Schill Technische Universität Dresden, Faculty of Computer

More information

Poster Session: An Indexing Structure for Automatic Schema Matching

Poster Session: An Indexing Structure for Automatic Schema Matching Poster Session: An Indexing Structure for Automatic Schema Matching Fabien Duchateau LIRMM - UMR 5506 Université Montpellier 2 34392 Montpellier Cedex 5 - France duchatea@lirmm.fr Mark Roantree Interoperable

More information

The HMatch 2.0 Suite for Ontology Matchmaking

The HMatch 2.0 Suite for Ontology Matchmaking The HMatch 2.0 Suite for Ontology Matchmaking S. Castano, A. Ferrara, D. Lorusso, and S. Montanelli Università degli Studi di Milano DICo - Via Comelico, 39, 20135 Milano - Italy {castano,ferrara,lorusso,montanelli}@dico.unimi.it

More information

Matching and Alignment: What is the Cost of User Post-match Effort?

Matching and Alignment: What is the Cost of User Post-match Effort? Matching and Alignment: What is the Cost of User Post-match Effort? (Short paper) Fabien Duchateau 1 and Zohra Bellahsene 2 and Remi Coletta 2 1 Norwegian University of Science and Technology NO-7491 Trondheim,

More information

Web Explanations for Semantic Heterogeneity Discovery

Web Explanations for Semantic Heterogeneity Discovery Web Explanations for Semantic Heterogeneity Discovery Pavel Shvaiko 1, Fausto Giunchiglia 1, Paulo Pinheiro da Silva 2, and Deborah L. McGuinness 2 1 University of Trento, Povo, Trento, Italy {pavel, fausto}@dit.unitn.it

More information

Semantic Interoperability. Being serious about the Semantic Web

Semantic Interoperability. Being serious about the Semantic Web Semantic Interoperability Jérôme Euzenat INRIA & LIG France Natasha Noy Stanford University USA 1 Being serious about the Semantic Web It is not one person s ontology It is not several people s common

More information

Alignment Results of SOBOM for OAEI 2009

Alignment Results of SOBOM for OAEI 2009 Alignment Results of SBM for AEI 2009 Peigang Xu, Haijun Tao, Tianyi Zang, Yadong, Wang School of Computer Science and Technology Harbin Institute of Technology, Harbin, China xpg0312@hotmail.com, hjtao.hit@gmail.com,

More information

DIPARTIMENTO DI INGEGNERIA E SCIENZA DELL INFORMAZIONE Povo Trento (Italy), Via Sommarive 14

DIPARTIMENTO DI INGEGNERIA E SCIENZA DELL INFORMAZIONE Povo Trento (Italy), Via Sommarive 14 UNIVERSITY OF TRENTO DIPARTIMENTO DI INGEGNERIA E SCIENZA DELL INFORMAZIONE 38050 Povo Trento (Italy), Via Sommarive 14 http://www.disi.unitn.it SEMANTIC MATCHING WITH S-MATCH Pavel Shvaiko, Fausto Giunchiglia,

More information

FOAM Framework for Ontology Alignment and Mapping Results of the Ontology Alignment Evaluation Initiative

FOAM Framework for Ontology Alignment and Mapping Results of the Ontology Alignment Evaluation Initiative FOAM Framework for Ontology Alignment and Mapping Results of the Ontology Alignment Evaluation Initiative Marc Ehrig Institute AIFB University of Karlsruhe 76128 Karlsruhe, Germany ehrig@aifb.uni-karlsruhe.de

More information

Ontology Construction -An Iterative and Dynamic Task

Ontology Construction -An Iterative and Dynamic Task From: FLAIRS-02 Proceedings. Copyright 2002, AAAI (www.aaai.org). All rights reserved. Ontology Construction -An Iterative and Dynamic Task Holger Wache, Ubbo Visser & Thorsten Scholz Center for Computing

More information

Contexts and ontologies in schema matching

Contexts and ontologies in schema matching Contexts and ontologies in schema matching Paolo Bouquet Department of Information and Communication Technology, University of Trento Via Sommarive, 14 38050 Trento (Italy) Abstract. In this paper, we

More information

PRIOR System: Results for OAEI 2006

PRIOR System: Results for OAEI 2006 PRIOR System: Results for OAEI 2006 Ming Mao, Yefei Peng University of Pittsburgh, Pittsburgh, PA, USA {mingmao,ypeng}@mail.sis.pitt.edu Abstract. This paper summarizes the results of PRIOR system, which

More information

Soundness of Schema Matching Methods

Soundness of Schema Matching Methods Soundness of Schema Matching Methods M. Benerecetti 1, P. Bouquet 2, and S. Zanobini 2 1 Department of Physical Science University of Naples Federico II Via Cintia, Complesso Monte S. Angelo, I-80126 Napoli

More information

A GML SCHEMA MAPPING APPROACH TO OVERCOME SEMANTIC HETEROGENEITY IN GIS

A GML SCHEMA MAPPING APPROACH TO OVERCOME SEMANTIC HETEROGENEITY IN GIS A GML SCHEMA MAPPING APPROACH TO OVERCOME SEMANTIC HETEROGENEITY IN GIS Manoj Paul, S. K. Ghosh School of Information Technology, Indian Institute of Technology, Kharagpur 721302, India - (mpaul, skg)@sit.iitkgp.ernet.in

More information

What you have learned so far. Interoperability. Ontology heterogeneity. Being serious about the semantic web

What you have learned so far. Interoperability. Ontology heterogeneity. Being serious about the semantic web What you have learned so far Interoperability Introduction to the Semantic Web Tutorial at ISWC 2010 Jérôme Euzenat Data can be expressed in RDF Linked through URIs Modelled with OWL ontologies & Retrieved

More information

Ontology Matching Using an Artificial Neural Network to Learn Weights

Ontology Matching Using an Artificial Neural Network to Learn Weights Ontology Matching Using an Artificial Neural Network to Learn Weights Jingshan Huang 1, Jiangbo Dang 2, José M. Vidal 1, and Michael N. Huhns 1 {huang27@sc.edu, jiangbo.dang@siemens.com, vidal@sc.edu,

More information

Using Data-Extraction Ontologies to Foster Automating Semantic Annotation

Using Data-Extraction Ontologies to Foster Automating Semantic Annotation Using Data-Extraction Ontologies to Foster Automating Semantic Annotation Yihong Ding Department of Computer Science Brigham Young University Provo, Utah 84602 ding@cs.byu.edu David W. Embley Department

More information

RiMOM Results for OAEI 2008

RiMOM Results for OAEI 2008 RiMOM Results for OAEI 2008 Xiao Zhang 1, Qian Zhong 1, Juanzi Li 1, Jie Tang 1, Guotong Xie 2 and Hanyu Li 2 1 Department of Computer Science and Technology, Tsinghua University, China {zhangxiao,zhongqian,ljz,tangjie}@keg.cs.tsinghua.edu.cn

More information

XBenchMatch: a Benchmark for XML Schema Matching Tools

XBenchMatch: a Benchmark for XML Schema Matching Tools XBenchMatch: a Benchmark for XML Schema Matching Tools Fabien Duchateau, Zohra Bellahsene, Ela Hunt To cite this version: Fabien Duchateau, Zohra Bellahsene, Ela Hunt. XBenchMatch: a Benchmark for XML

More information

Fausto Giunchiglia and Mattia Fumagalli

Fausto Giunchiglia and Mattia Fumagalli DISI - Via Sommarive 5-38123 Povo - Trento (Italy) http://disi.unitn.it FROM ER MODELS TO THE ENTITY MODEL Fausto Giunchiglia and Mattia Fumagalli Date (2014-October) Technical Report # DISI-14-014 From

More information

A Method for Semi-Automatic Ontology Acquisition from a Corporate Intranet

A Method for Semi-Automatic Ontology Acquisition from a Corporate Intranet A Method for Semi-Automatic Ontology Acquisition from a Corporate Intranet Joerg-Uwe Kietz, Alexander Maedche, Raphael Volz Swisslife Information Systems Research Lab, Zuerich, Switzerland fkietz, volzg@swisslife.ch

More information

Measuring the quality of ontology mappings: A multifaceted approach.

Measuring the quality of ontology mappings: A multifaceted approach. Measuring the quality of ontology mappings: A multifaceted approach. Jennifer Sampson Department of Computer and Information Science Norwegian University of Science and Technology Trondheim, Norway Jennifer.Sampson@idi.ntnu.no

More information

RiMOM Results for OAEI 2009

RiMOM Results for OAEI 2009 RiMOM Results for OAEI 2009 Xiao Zhang, Qian Zhong, Feng Shi, Juanzi Li and Jie Tang Department of Computer Science and Technology, Tsinghua University, Beijing, China zhangxiao,zhongqian,shifeng,ljz,tangjie@keg.cs.tsinghua.edu.cn

More information

Combining Ontology Mapping Methods Using Bayesian Networks

Combining Ontology Mapping Methods Using Bayesian Networks Combining Ontology Mapping Methods Using Bayesian Networks Ondřej Šváb, Vojtěch Svátek University of Economics, Prague, Dep. Information and Knowledge Engineering, Winston Churchill Sq. 4, 130 67 Praha

More information

Leveraging Set Relations in Exact Set Similarity Join

Leveraging Set Relations in Exact Set Similarity Join Leveraging Set Relations in Exact Set Similarity Join Xubo Wang, Lu Qin, Xuemin Lin, Ying Zhang, and Lijun Chang University of New South Wales, Australia University of Technology Sydney, Australia {xwang,lxue,ljchang}@cse.unsw.edu.au,

More information

InsMT / InsMTL Results for OAEI 2014 Instance Matching

InsMT / InsMTL Results for OAEI 2014 Instance Matching InsMT / InsMTL Results for OAEI 2014 Instance Matching Abderrahmane Khiat 1, Moussa Benaissa 1 1 LITIO Lab, University of Oran, BP 1524 El-Mnaouar Oran, Algeria abderrahmane_khiat@yahoo.com moussabenaissa@yahoo.fr

More information

Towards a Theory of Formal Classification

Towards a Theory of Formal Classification Towards a Theory of Formal Classification Fausto Giunchiglia and Maurizio Marchese and Ilya Zaihrayeu {fausto, marchese, ilya}@dit.unitn.it Department of Information and Communication Technology University

More information

ONTOLOGY MATCHING: A STATE-OF-THE-ART SURVEY

ONTOLOGY MATCHING: A STATE-OF-THE-ART SURVEY ONTOLOGY MATCHING: A STATE-OF-THE-ART SURVEY December 10, 2010 Serge Tymaniuk - Emanuel Scheiber Applied Ontology Engineering WS 2010/11 OUTLINE Introduction Matching Problem Techniques Systems and Tools

More information

! " # Formal Classification. Logics for Data and Knowledge Representation. Classification Hierarchies (1) Classification Hierarchies (2)

!  # Formal Classification. Logics for Data and Knowledge Representation. Classification Hierarchies (1) Classification Hierarchies (2) ,!((,.+#$),%$(-&.& *,(-$)%&.'&%!&, Logics for Data and Knowledge Representation Alessandro Agostini agostini@dit.unitn.it University of Trento Fausto Giunchiglia fausto@dit.unitn.it Formal Classification!$%&'()*$#)

More information

Ontology Matching with CIDER: Evaluation Report for the OAEI 2008

Ontology Matching with CIDER: Evaluation Report for the OAEI 2008 Ontology Matching with CIDER: Evaluation Report for the OAEI 2008 Jorge Gracia, Eduardo Mena IIS Department, University of Zaragoza, Spain {jogracia,emena}@unizar.es Abstract. Ontology matching, the task

More information

XML Schema Matching Using Structural Information

XML Schema Matching Using Structural Information XML Schema Matching Using Structural Information A.Rajesh Research Scholar Dr.MGR University, Maduravoyil, Chennai S.K.Srivatsa Sr.Professor St.Joseph s Engineering College, Chennai ABSTRACT Schema matching

More information

OMEN: A Probabilistic Ontology Mapping Tool

OMEN: A Probabilistic Ontology Mapping Tool OMEN: A Probabilistic Ontology Mapping Tool Prasenjit Mitra 1, Natasha F. Noy 2, and Anuj Rattan Jaiswal 1 1 The Pennsylvania State University, University Park, PA 16802, U.S.A. {pmitra, ajaiswal}@ist.psu.edu

More information

Reconciling Ontologies for Coordination among E-business Agents

Reconciling Ontologies for Coordination among E-business Agents Reconciling Ontologies for Coordination among E-business Agents Jingshan Huang Computer Science and Engineering Department University of South Carolina Columbia, SC 29208, USA +1-803-777-3768 huang27@sc.edu

More information

Cluster-based Similarity Aggregation for Ontology Matching

Cluster-based Similarity Aggregation for Ontology Matching Cluster-based Similarity Aggregation for Ontology Matching Quang-Vinh Tran 1, Ryutaro Ichise 2, and Bao-Quoc Ho 1 1 Faculty of Information Technology, Ho Chi Minh University of Science, Vietnam {tqvinh,hbquoc}@fit.hcmus.edu.vn

More information

Gap analysis of ontology mapping tools and techniques

Gap analysis of ontology mapping tools and techniques Loughborough University Institutional Repository Gap analysis of ontology mapping tools and techniques This item was submitted to Loughborough University's Institutional Repository by the/an author. Citation:

More information

An interactive, asymmetric and extensional method for matching conceptual hierarchies

An interactive, asymmetric and extensional method for matching conceptual hierarchies An interactive, asymmetric and extensional method for matching conceptual hierarchies Jérôme David, Fabrice Guillet, Régis Gras, and Henri Briand LINA CNRS FRE 2729, Polytechnic School of Nantes University,

More information

Ontology Reconciliation for Service-Oriented Computing

Ontology Reconciliation for Service-Oriented Computing Ontology Reconciliation for Service-Oriented Computing Jingshan Huang, Jiangbo Dang, and Michael N. Huhns Computer Science & Engineering Dept., University of South Carolina, Columbia, SC 29208, USA {huang27,

More information

OWL-CM : OWL Combining Matcher based on Belief Functions Theory

OWL-CM : OWL Combining Matcher based on Belief Functions Theory OWL-CM : OWL Combining Matcher based on Belief Functions Theory Boutheina Ben Yaghlane 1 and Najoua Laamari 2 1 LARODEC, Université de Tunis, IHEC Carthage Présidence 2016 Tunisia boutheina.yaghlane@ihec.rnu.tn

More information

WEIGHTING QUERY TERMS USING WORDNET ONTOLOGY

WEIGHTING QUERY TERMS USING WORDNET ONTOLOGY IJCSNS International Journal of Computer Science and Network Security, VOL.9 No.4, April 2009 349 WEIGHTING QUERY TERMS USING WORDNET ONTOLOGY Mohammed M. Sakre Mohammed M. Kouta Ali M. N. Allam Al Shorouk

More information

Ontology mediation, merging and aligning

Ontology mediation, merging and aligning Ontology mediation, merging and aligning Jos de Bruijn Marc Ehrig Cristina Feier Francisco Martín-Recuerda François Scharffe Moritz Weiten May 20, 2006 Abstract Ontology mediation is a broad field of research

More information

Anchor-Profiles for Ontology Mapping with Partial Alignments

Anchor-Profiles for Ontology Mapping with Partial Alignments Anchor-Profiles for Ontology Mapping with Partial Alignments Frederik C. Schadd Nico Roos Department of Knowledge Engineering, Maastricht University, Maastricht, The Netherlands Abstract. Ontology mapping

More information

Natasha Noy Stanford University USA

Natasha Noy Stanford University USA Semantic Interoperability Jérôme Euzenat INRIA & LIG France Natasha Noy Stanford University USA Semantic Interoperability Jérôme Euzenat INRIA & LIG France Natasha Noy Stanford University US Being serious

More information

Semantic Web. Ontology Alignment. Morteza Amini. Sharif University of Technology Fall 95-96

Semantic Web. Ontology Alignment. Morteza Amini. Sharif University of Technology Fall 95-96 ه عا ی Semantic Web Ontology Alignment Morteza Amini Sharif University of Technology Fall 95-96 Outline The Problem of Ontologies Ontology Heterogeneity Ontology Alignment Overall Process Similarity (Matching)

More information

Development of an Ontology-Based Portal for Digital Archive Services

Development of an Ontology-Based Portal for Digital Archive Services Development of an Ontology-Based Portal for Digital Archive Services Ching-Long Yeh Department of Computer Science and Engineering Tatung University 40 Chungshan N. Rd. 3rd Sec. Taipei, 104, Taiwan chingyeh@cse.ttu.edu.tw

More information

Semantic Matching: Algorithms and Implementation *

Semantic Matching: Algorithms and Implementation * Semantic Matching: Algorithms and Implementation * Fausto Giunchiglia, Mikalai Yatskevich, and Pavel Shvaiko Department of Information and Communication Technology, University of Trento, 38050, Povo, Trento,

More information

The Results of Falcon-AO in the OAEI 2006 Campaign

The Results of Falcon-AO in the OAEI 2006 Campaign The Results of Falcon-AO in the OAEI 2006 Campaign Wei Hu, Gong Cheng, Dongdong Zheng, Xinyu Zhong, and Yuzhong Qu School of Computer Science and Engineering, Southeast University, Nanjing 210096, P. R.

More information

A Survey on Ontology Mapping

A Survey on Ontology Mapping A Survey on Ontology Mapping Namyoun Choi, Il-Yeol Song, and Hyoil Han College of Information Science and Technology Drexel University, Philadelphia, PA 19014 Abstract Ontology is increasingly seen as

More information

Lily: Ontology Alignment Results for OAEI 2009

Lily: Ontology Alignment Results for OAEI 2009 Lily: Ontology Alignment Results for OAEI 2009 Peng Wang 1, Baowen Xu 2,3 1 College of Software Engineering, Southeast University, China 2 State Key Laboratory for Novel Software Technology, Nanjing University,

More information

Matching Schemas for Geographical Information Systems Using Semantic Information

Matching Schemas for Geographical Information Systems Using Semantic Information Matching Schemas for Geographical Information Systems Using Semantic Information Christoph Quix, Lemonia Ragia, Linlin Cai, and Tian Gan Informatik V, RWTH Aachen University, Germany {quix, ragia, cai,

More information

An Approach to Evaluate and Enhance the Retrieval of Web Services Based on Semantic Information

An Approach to Evaluate and Enhance the Retrieval of Web Services Based on Semantic Information An Approach to Evaluate and Enhance the Retrieval of Web Services Based on Semantic Information Stefan Schulte Multimedia Communications Lab (KOM) Technische Universität Darmstadt, Germany schulte@kom.tu-darmstadt.de

More information

38050 Povo Trento (Italy), Via Sommarive 14 TOWARDS A THEORY OF FORMAL CLASSIFICATION

38050 Povo Trento (Italy), Via Sommarive 14   TOWARDS A THEORY OF FORMAL CLASSIFICATION UNIVERSITY OF TRENTO DEPARTMENT OF INFORMATION AND COMMUNICATION TECHNOLOGY 38050 Povo Trento (Italy), Via Sommarive 14 http://www.dit.unitn.it TOWARDS A THEORY OF FORMAL CLASSIFICATION Fausto Giunchiglia,

More information

MapPSO Results for OAEI 2010

MapPSO Results for OAEI 2010 MapPSO Results for OAEI 2010 Jürgen Bock 1 FZI Forschungszentrum Informatik, Karlsruhe, Germany bock@fzi.de Abstract. This paper presents and discusses the results produced by the MapPSO system for the

More information

Community-Driven Ontology Matching

Community-Driven Ontology Matching Community-Driven Ontology Matching Anna V. Zhdanova 1,2 and Pavel Shvaiko 3 1 CCSR, University of Surrey, Guildford, UK, 2 DERI, University of Innsbruck, Innsbruck, Austria, a.zhdanova@surrey.ac.uk 3 DIT,

More information

Ontology Extraction from Heterogeneous Documents

Ontology Extraction from Heterogeneous Documents Vol.3, Issue.2, March-April. 2013 pp-985-989 ISSN: 2249-6645 Ontology Extraction from Heterogeneous Documents Kirankumar Kataraki, 1 Sumana M 2 1 IV sem M.Tech/ Department of Information Science & Engg

More information

38050 Povo Trento (Italy), Via Sommarive 14 COMMUNITY-DRIVEN ONTOLOGY MATCHING. Anna V. Zhdanova and Pavel Shvaiko

38050 Povo Trento (Italy), Via Sommarive 14  COMMUNITY-DRIVEN ONTOLOGY MATCHING. Anna V. Zhdanova and Pavel Shvaiko UNIVERSITY OF TRENTO DEPARTMENT OF INFORMATION AND COMMUNICATION TECHNOLOGY 38050 Povo Trento (Italy), Via Sommarive 14 http://www.dit.unitn.it COMMUNITY-DRIVEN ONTOLOGY MATCHING Anna V. Zhdanova and Pavel

More information

A Tool for Semi-Automated Semantic Schema Mapping: Design and Implementation

A Tool for Semi-Automated Semantic Schema Mapping: Design and Implementation A Tool for Semi-Automated Semantic Schema Mapping: Design and Implementation Dimitris Manakanatas, Dimitris Plexousakis Institute of Computer Science, FO.R.T.H. P.O. Box 1385, GR 71110, Heraklion, Greece

More information

Semantic text features from small world graphs

Semantic text features from small world graphs Semantic text features from small world graphs Jurij Leskovec 1 and John Shawe-Taylor 2 1 Carnegie Mellon University, USA. Jozef Stefan Institute, Slovenia. jure@cs.cmu.edu 2 University of Southampton,UK

More information

Ontology Refinement and Evaluation based on is-a Hierarchy Similarity

Ontology Refinement and Evaluation based on is-a Hierarchy Similarity Ontology Refinement and Evaluation based on is-a Hierarchy Similarity Takeshi Masuda The Institute of Scientific and Industrial Research, Osaka University Abstract. Ontologies are constructed in fields

More information

A Tagging Approach to Ontology Mapping

A Tagging Approach to Ontology Mapping A Tagging Approach to Ontology Mapping Colm Conroy 1, Declan O'Sullivan 1, Dave Lewis 1 1 Knowledge and Data Engineering Group, Trinity College Dublin {coconroy,declan.osullivan,dave.lewis}@cs.tcd.ie Abstract.

More information

Optimised Classification for Taxonomic Knowledge Bases

Optimised Classification for Taxonomic Knowledge Bases Optimised Classification for Taxonomic Knowledge Bases Dmitry Tsarkov and Ian Horrocks University of Manchester, Manchester, UK {tsarkov horrocks}@cs.man.ac.uk Abstract Many legacy ontologies are now being

More information

Reconciling Agent Ontologies for Web Service Applications

Reconciling Agent Ontologies for Web Service Applications Reconciling Agent Ontologies for Web Service Applications Jingshan Huang, Rosa Laura Zavala Gutiérrez, Benito Mendoza García, and Michael N. Huhns Computer Science and Engineering Department University

More information

Leveraging Data and Structure in Ontology Integration

Leveraging Data and Structure in Ontology Integration Leveraging Data and Structure in Ontology Integration O. Udrea L. Getoor R.J. Miller Group 15 Enrico Savioli Andrea Reale Andrea Sorbini DEIS University of Bologna Searching Information in Large Spaces

More information

An Improving for Ranking Ontologies Based on the Structure and Semantics

An Improving for Ranking Ontologies Based on the Structure and Semantics An Improving for Ranking Ontologies Based on the Structure and Semantics S.Anusuya, K.Muthukumaran K.S.R College of Engineering Abstract Ontology specifies the concepts of a domain and their semantic relationships.

More information

Towards a Theory of Formal Classification

Towards a Theory of Formal Classification Towards a Theory of Formal Classification Fausto Giunchiglia, Maurizio Marchese, Ilya Zaihrayeu {fausto, marchese, ilya}@dit.unitn.it Department of Information and Communication Technology University of

More information

Extension and integration of i* models with ontologies

Extension and integration of i* models with ontologies Extension and integration of i* models with ontologies Blanca Vazquez 1,2, Hugo Estrada 1, Alicia Martinez 2, Mirko Morandini 3, and Anna Perini 3 1 Fund Information and Documentation for the industry

More information

Motivating Ontology-Driven Information Extraction

Motivating Ontology-Driven Information Extraction Motivating Ontology-Driven Information Extraction Burcu Yildiz 1 and Silvia Miksch 1, 2 1 Institute for Software Engineering and Interactive Systems, Vienna University of Technology, Vienna, Austria {yildiz,silvia}@

More information

Semantic Web. Ontology Engineering and Evaluation. Morteza Amini. Sharif University of Technology Fall 93-94

Semantic Web. Ontology Engineering and Evaluation. Morteza Amini. Sharif University of Technology Fall 93-94 ه عا ی Semantic Web Ontology Engineering and Evaluation Morteza Amini Sharif University of Technology Fall 93-94 Outline Ontology Engineering Class and Class Hierarchy Ontology Evaluation 2 Outline Ontology

More information

SEMANTIC WEB POWERED PORTAL INFRASTRUCTURE

SEMANTIC WEB POWERED PORTAL INFRASTRUCTURE SEMANTIC WEB POWERED PORTAL INFRASTRUCTURE YING DING 1 Digital Enterprise Research Institute Leopold-Franzens Universität Innsbruck Austria DIETER FENSEL Digital Enterprise Research Institute National

More information

A Generic Algorithm for Heterogeneous Schema Matching

A Generic Algorithm for Heterogeneous Schema Matching You Li, Dongbo Liu, and Weiming Zhang A Generic Algorithm for Heterogeneous Schema Matching You Li1, Dongbo Liu,3, and Weiming Zhang1 1 Department of Management Science, National University of Defense

More information

Dynamic Ontology Evolution

Dynamic Ontology Evolution Dynamic Evolution Fouad Zablith Knowledge Media Institute (KMi), The Open University. Walton Hall, Milton Keynes, MK7 6AA, United Kingdom. f.zablith@open.ac.uk Abstract. Ontologies form the core of Semantic

More information

A framework for the classification and the reclassification of electronic catalogs

A framework for the classification and the reclassification of electronic catalogs 2004 ACM Symposium on Applied Computing A framework for the classification and the reclassification of electronic catalogs Domenico Beneventano Dipartimento di Ingegneria dell Informazione Università di

More information

Information Discovery, Extraction and Integration for the Hidden Web

Information Discovery, Extraction and Integration for the Hidden Web Information Discovery, Extraction and Integration for the Hidden Web Jiying Wang Department of Computer Science University of Science and Technology Clear Water Bay, Kowloon Hong Kong cswangjy@cs.ust.hk

More information

Semantic Integration: A Survey Of Ontology-Based Approaches

Semantic Integration: A Survey Of Ontology-Based Approaches Semantic Integration: A Survey Of Ontology-Based Approaches Natalya F. Noy Stanford Medical Informatics Stanford University 251 Campus Drive, Stanford, CA 94305 noy@smi.stanford.edu ABSTRACT Semantic integration

More information

Integration of Product Ontologies for B2B Marketplaces: A Preview

Integration of Product Ontologies for B2B Marketplaces: A Preview Integration of Product Ontologies for B2B Marketplaces: A Preview Borys Omelayenko * B2B electronic marketplaces bring together many online suppliers and buyers. Each individual participant potentially

More information

SurveyonTechniquesforOntologyInteroperabilityinSemanticWeb. Survey on Techniques for Ontology Interoperability in Semantic Web

SurveyonTechniquesforOntologyInteroperabilityinSemanticWeb. Survey on Techniques for Ontology Interoperability in Semantic Web Global Journal of Computer Science and Technology: E Network, Web & Security Volume 14 Issue 2 Version 1.0 Year 2014 Type: Double Blind Peer Reviewed International Research Journal Publisher: Global Journals

More information

A Survey of Schema-based Matching Approaches

A Survey of Schema-based Matching Approaches A Survey of Schema-based Matching Approaches Pavel Shvaiko 1 and Jérôme Euzenat 2 1 University of Trento, Povo, Trento, Italy, pavel@dit.unitn.it 2 INRIA, Rhône-Alpes, France, Jerome.Euzenat@inrialpes.fr

More information

Computational Cost of Querying for Related Entities in Different Ontologies

Computational Cost of Querying for Related Entities in Different Ontologies Computational Cost of Querying for Related Entities in Different Ontologies Chung Ming Cheung Yinuo Zhang Anand Panangadan Viktor K. Prasanna University of Southern California Los Angeles, CA 90089, USA

More information

Semantic Web. Ontology Alignment. Morteza Amini. Sharif University of Technology Fall 94-95

Semantic Web. Ontology Alignment. Morteza Amini. Sharif University of Technology Fall 94-95 ه عا ی Semantic Web Ontology Alignment Morteza Amini Sharif University of Technology Fall 94-95 Outline The Problem of Ontologies Ontology Heterogeneity Ontology Alignment Overall Process Similarity Methods

More information

An Annotation Tool for Semantic Documents

An Annotation Tool for Semantic Documents An Annotation Tool for Semantic Documents (System Description) Henrik Eriksson Dept. of Computer and Information Science Linköping University SE-581 83 Linköping, Sweden her@ida.liu.se Abstract. Document

More information

Interoperability Issues, Ontology Matching and MOMA

Interoperability Issues, Ontology Matching and MOMA Interoperability Issues, Ontology Matching and MOMA Malgorzata Mochol (Free University of Berlin, Königin-Luise-Str. 24-26, 14195 Berlin, Germany mochol@inf.fu-berlin.de) Abstract: Thought interoperability

More information

Evaluation of ontology matching

Evaluation of ontology matching Evaluation of ontology matching Jérôme Euzenat (INRIA Rhône-Alpes & LIG) + work within Knowledge web 2.2 and esp. Malgorzata Mochol (FU Berlin) April 19, 2007 Evaluation of ontology matching 1 / 44 Outline

More information

2 Experimental Methodology and Results

2 Experimental Methodology and Results Developing Consensus Ontologies for the Semantic Web Larry M. Stephens, Aurovinda K. Gangam, and Michael N. Huhns Department of Computer Science and Engineering University of South Carolina, Columbia,

More information

38050 Povo Trento (Italy), Via Sommarive 14 A SURVEY OF SCHEMA-BASED MATCHING APPROACHES. Pavel Shvaiko and Jerome Euzenat

38050 Povo Trento (Italy), Via Sommarive 14  A SURVEY OF SCHEMA-BASED MATCHING APPROACHES. Pavel Shvaiko and Jerome Euzenat UNIVERSITY OF TRENTO DEPARTMENT OF INFORMATION AND COMMUNICATION TECHNOLOGY 38050 Povo Trento (Italy), Via Sommarive 14 http://www.dit.unitn.it A SURVEY OF SCHEMA-BASED MATCHING APPROACHES Pavel Shvaiko

More information

Hierarchical Clustering of Process Schemas

Hierarchical Clustering of Process Schemas Hierarchical Clustering of Process Schemas Claudia Diamantini, Domenico Potena Dipartimento di Ingegneria Informatica, Gestionale e dell'automazione M. Panti, Università Politecnica delle Marche - via

More information

Structure Matching for Enhancing UDDI Queries Results

Structure Matching for Enhancing UDDI Queries Results Structure Matching for Enhancing UDDI Queries Results Giancarlo Tretola University of Sannio Department of Engineering Benevento,82100 Italy tretola@unisannio.it Eugenio Zimeo University of Sannio Research

More information

Emerging Technologies in Knowledge Management By Ramana Rao, CTO of Inxight Software, Inc.

Emerging Technologies in Knowledge Management By Ramana Rao, CTO of Inxight Software, Inc. Emerging Technologies in Knowledge Management By Ramana Rao, CTO of Inxight Software, Inc. This paper provides an overview of a presentation at the Internet Librarian International conference in London

More information

A Model of Machine Learning Based on User Preference of Attributes

A Model of Machine Learning Based on User Preference of Attributes 1 A Model of Machine Learning Based on User Preference of Attributes Yiyu Yao 1, Yan Zhao 1, Jue Wang 2 and Suqing Han 2 1 Department of Computer Science, University of Regina, Regina, Saskatchewan, Canada

More information

HotMatch Results for OEAI 2012

HotMatch Results for OEAI 2012 HotMatch Results for OEAI 2012 Thanh Tung Dang, Alexander Gabriel, Sven Hertling, Philipp Roskosch, Marcel Wlotzka, Jan Ruben Zilke, Frederik Janssen, and Heiko Paulheim Technische Universität Darmstadt

More information