A Sequence-based Ontology Matching Approach


Alsayed Algergawy, Eike Schallehn and Gunter Saake¹

Abstract. The recent growth of the Semantic Web makes it necessary to cope with substantial semantic heterogeneity among the available ontologies. Ontology matching techniques aim to tackle this problem by establishing correspondences between ontology elements. A major obstacle for ontology matching is scalability, with respect to both a large number of ontologies and large-scale ontologies. To tackle these challenges, we propose in this paper a new matching framework based on Prüfer sequences. The proposed approach is applicable to matching a database of XML trees. It represents XML ontologies as sequences of labels and numbers using Prüfer's method, which constructs a one-to-one correspondence between schema ontologies and sequences. We capture ontology tree semantic information in Label Prüfer Sequences (LPS) and ontology tree structural information in Number Prüfer Sequences (NPS). Then, we develop a new structural matching algorithm exploiting both LPS and NPS. Our experimental results demonstrate the performance benefits of the proposed approach.

1 Introduction

The Semantic Web (SW) is evolving towards an open, dynamic, distributed, and heterogeneous environment. The core of the SW is the ontology, which is used to represent our conceptualizations. The Semantic Web puts the onus of ontology creation on the user by providing common ontology languages such as XML, RDF(S) and OWL. However, ontologies defined by different applications usually describe their domains in different terminologies, even when they cover the same domain. In order to support ontology-based information integration, tools and mechanisms are needed to resolve the semantic heterogeneity problem and align terms in different ontologies. Ontology matching plays the central role in these approaches.
Ontology matching is the task of identifying correspondences among elements of two ontologies [5, 2, 15]. Due to the complexity of ontology/schema matching, it has mostly been performed manually by a human expert. However, manual reconciliation tends to be slow and inefficient, especially in large-scale and dynamic environments such as the Semantic Web. Therefore, automatic semantic schema matching has become essential. Consequently, many ontology/schema matching systems have been developed to automate the matching process, such as Cupid [17], COMA [6], Similarity Flooding [18], LSD [7], BTreeMatch [12], Spicy [2], GLUE [8, 9], OntoBuilder [21], QOM [13], and S-Match [14]. However, most of these approaches have been developed and tested using small-scale schemas, and their primary focus was on matching effectiveness. Unfortunately, the effectiveness of automatic match techniques typically decreases for larger schemas. In particular, matching the complete input schemas may lead not only to long execution times but also to poor quality due to the large search space. Therefore, the need for efficient and effective algorithms has arisen.

Recently, matching algorithms have been introduced that focus on matching large-scale schemas and ontologies and large numbers of them, i.e. that consider the efficiency aspect of matching, such as COMA++ [1, 11], QOM [13], Bellflower [23], and PORSCHE [22]. Most of these systems rely heavily on either rule-based or learner-based approaches. In rule-based systems, the schemas to be matched are represented as schema trees or schema graphs, which in turn requires traversing these trees (or graphs) many times. Learner-based systems, on the other hand, need considerable up-front effort to train their learners. As a consequence, matching performance declines radically, especially for large-scale schemas and in dynamic environments.

¹ Magdeburg University, Germany, {alshahat, eike, saake}@ovgu.de
Motivated by the above challenges and by the fact that the most prominent feature of an XML schema is its hierarchical structure, we propose in this paper a novel approach for matching XML schemas. In particular, we develop and implement the XPrüM system, which consists mainly of two parts: schema preparation and schema matching. Schemas to be matched are first parsed and represented internally as rooted, ordered, labeled trees, called schema trees. Then, we construct a Prüfer sequence for each schema tree. Prüfer sequences establish a one-to-one correspondence between schema trees and sequences. We capture schema tree semantic information in Label Prüfer Sequences (LPS) and schema tree structural information in Number Prüfer Sequences (NPS).

LPS is exploited by a linguistic matcher to compute terminological similarities between schema elements. Linguistic matching techniques may produce false positive matches. Structural matching is used to correct such matches based on the structural contexts of schema elements. Structural matching relies on the notion of node context². In this paper, we distinguish between three types of node context depending on the node's location in the schema tree: child context, leaf context, and ancestor context. We exploit the number sequence representation of the schema tree to extract the contexts of each tree node efficiently. Then, for each node context, we apply its associated algorithm. For example, the leaf context similarity between two nodes is measured by extracting the leaf context of each node as a gap vector and then applying the cosine measure to the two gap vectors. The other context similarity measures are determined similarly. By representing schema trees as Prüfer sequences, we need to traverse these trees only once to construct the sequences.
Then, we develop a novel structural matching algorithm which captures the semantic information in label sequences and the structural information embedded in number sequences.

The paper is organized as follows: Section 2 introduces basic concepts and definitions. Section 3 describes our proposed approach. Section 4 presents experimental results. Section 5 gives concluding remarks and our proposed future work.

² In this paper, the terms node and element are used interchangeably.

[Figure 1: Computer science department ontologies. (a) Ontology Tree OT1; (b) Ontology Tree OT2]

2 Preliminaries

The semantics conveyed by ontologies can be as simple as a database schema or as complex as the background knowledge in a knowledge base. Our proposed approach is concerned only with the first case, so we assume that the ontologies in this paper are represented as XML schemas. XML schemas can be modeled as rooted, ordered, labeled trees T = (N, E, Lab), called ontology trees. N is the set of tree nodes, representing the different XML schema components. E is the set of edges, representing the relationships between schema components. Lab is the set of node labels, representing the properties of each node. We categorize nodes into atomic nodes, which have no outgoing edges and represent leaf nodes, and complex nodes, which are the internal nodes of the ontology tree.

2.1 Motivations

To the best of our knowledge, no existing work on matching XML schemas is based on Prüfer sequences. Most existing matching systems rely heavily on matching schema trees/graphs directly. In addition, these systems perform poorly when dealing with large-scale ontologies. These shortcomings have motivated us to develop an XML schema matching system based on Prüfer sequences, called XPrüM. Our proposed system is also a schema-based approach. However, each schema tree is represented using two sequences which capture both the tree's semantic information and its structure. This leads to a space-efficient representation and a shorter response time when compared to state-of-the-art systems.

2.2 A Matching Example

To describe the operation of our approach, we use the example found in [9]. It describes two XML ontologies, shown in Figure 1 (a and b), that represent the organization of universities in different countries and have been widely used in the literature. The task is to discover semantic correspondences between the elements of the two schemas.
3 The Proposed Approach

In this section, we describe the core parts of the XPrüM system. As shown in Figure 2, it has two main parts: ontology preparation and ontology matching. First, ontologies are parsed using a SAX parser and represented internally as ontology trees. Then, using the Prüfer sequence method, we extract both label sequences and number sequences. The ontology matching part discovers the set of matches between two ontologies employing both sequences.

[Figure 2: Matching Process Phases]

3.1 Prüfer Sequences Construction

We now describe the tree sequence representation method, which provides a bijection between ordered, labeled trees and sequences. This representation is inspired by classical Prüfer sequences [19] and in particular by the Consolidated Prüfer Sequence (CPS) proposed in [24]. The CPS of an ontology tree OT consists of two sequences, the Number Prüfer Sequence (NPS) and the Label Prüfer Sequence (LPS). They are constructed by a post-order traversal that tags each node in the ontology tree with a unique traversal number. NPS is then constructed iteratively by removing the node with the smallest traversal number and appending its parent's number to the partial sequence built so far. LPS is constructed similarly, but by taking the labels of the deleted nodes instead of their parent node numbers. NPS and LPS convey different but complementary information: NPS, which is constructed from unique post-order traversal numbers, carries rich tree structure information, while LPS gives the labels of the tree nodes. The CPS representation thus provides a bijection between ordered, labeled trees and sequences. Therefore, CPS = (NPS, LPS) uniquely represents a rooted, ordered, labeled tree, where each entry in the CPS corresponds to an edge in the schema tree. For more details see [24].

Example 1. Consider the ontology trees OT1 and OT2 shown in Figure 3, where each node is associated with its OID and its post-order number.
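The construction described in Section 3.1 can be sketched in a few lines of Python. This is only a minimal illustration, not the XPrüM implementation: the `Node` class and the `build_cps` function are hypothetical names, the shape of OT1 is read off the OIDs and parent numbers listed in Table 1, and `None` stands for the root's missing parent (the "-" entry).

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    oid: str                                   # object identifier, e.g. "n4"
    label: str                                 # element name
    children: list[Node] = field(default_factory=list)


def build_cps(root: Node):
    """Return (NPS, LPS) for the tree rooted at `root`.

    Nodes are visited in post-order, which is the order in which the
    node with the smallest traversal number is removed. For each node
    we record its parent's post-order number (NPS) and its own
    OID and label (LPS).
    """
    order = []        # nodes in post-order
    parent_of = {}    # oid -> parent oid (None for the root)

    def visit(node, parent):
        for child in node.children:
            visit(child, node)
        order.append(node)
        parent_of[node.oid] = parent.oid if parent else None

    visit(root, None)
    number = {node.oid: i for i, node in enumerate(order, start=1)}
    nps = [number[parent_of[n.oid]] if parent_of[n.oid] else None
           for n in order]
    lps = [(n.oid, n.label) for n in order]
    return nps, lps


# OT1 of Example 1 (tree shape assumed from Table 1).
ot1 = Node("n1", "CS Dept US", [
    Node("n2", "UnderGrad Courses"),
    Node("n3", "Grad Courses"),
    Node("n4", "People", [
        Node("n5", "Faculty", [
            Node("n6", "Assistant Professor", [
                Node("n7", "Name"),
                Node("n8", "Degree"),
            ]),
            Node("n9", "Associate Professor"),
            Node("n10", "Professor"),
        ]),
        Node("n11", "Staff"),
    ]),
])

nps, lps = build_cps(ot1)
print(nps)  # [11, 11, 5, 5, 8, 8, 8, 10, 10, 11, None]
print([oid for oid, _ in lps])
```

Because nodes are removed in increasing traversal number, a single post-order pass is enough to emit both sequences.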
Table 1 illustrates the CPS for OT1 and OT2. For example, the CPS of OT1 can be written as NPS(OT1) = (11, 11, 5, 5, 8, 8, 8, 10, 10, 11, -) and LPS(OT1).name = (UnderGrad Courses, Grad Courses, Name, Degree, Assistant Professor, Associate Professor, Professor, Faculty, Staff, People, CS Dept US).

[Figure 3: Nodes OIDs and corresponding post-order numbers]

Table 1: CPS of Ontology Trees

Schema Tree ST1
NPS   LPS.OID   LPS.name              LPS.type/data type
11    n2        UnderGrad Courses     element/string
11    n3        Grad Courses          element/string
5     n7        Name                  element/string
5     n8        Degree                element/string
8     n6        Assistant Professor   element/-
8     n9        Associate Professor   element/string
8     n10       Professor             element/string
10    n5        Faculty               element/-
10    n11       Staff                 element/string
11    n4        People                element/-
-     n1        CS Dept US            element/-

Schema Tree ST2
NPS   LPS.OID   LPS.name              LPS.type/data type
11    n2        Courses               element/string
5     n6        FirstName             element/string
5     n7        LastName              element/string
5     n8        Education             element/string
8     n5        Lecturer              element/-
8     n9        SeniorLecturer        element/string
8     n10       Professor             element/string
10    n4        AcademicStaff         element/-
10    n11       TechnicalStaff        element/string
11    n3        Staff                 element/-
-     n1        CS Dept Aust          element/-

CPS Properties

In the following, we list the structural properties behind the CPS representation of ontology trees. If we construct CPS = (NPS, LPS) from an ontology tree OT, these properties can be classified into:

Unary properties: for every node $n_i$ with post-order number $k$,
1. atomic node: $n_i$ is an atomic node iff $k \notin NPS$;
2. complex node: $n_i$ is a complex node iff $k \in NPS$;
3. root node: $n_i$ is the root node ($n_{root}$) iff $k = \max(NPS)$, where max is a function which returns the maximum number in NPS.

Binary properties:
1. edge relationship: each entry $CPS_i = (k_i, LPS_i)$ represents an edge from the node whose post-order number is $k_i$ to the node $n_i = LPS_i.OID$. This property captures both child and parent relationships: the node $n_i = LPS_i.OID$ is an immediate child of the node whose post-order number is $k_i$.
2. sibling relationship: for two entries $CPS_i = (k_i, LPS_i)$ and $CPS_j = (k_j, LPS_j)$, the nodes $n_i = LPS_i.OID$ and $n_j = LPS_j.OID$ are siblings iff $k_i = k_j$.

3.2 Matching Algorithms

Ontology matching algorithms operate on the sequential representation of two ontology trees ST1 and ST2 and discover semantic correspondences between them. Generally speaking, the process of ontology matching is performed, as shown in Figure 2, in two phases: element matchers and combiner & selector. Empirical analysis shows that there is no single dominant element matcher that performs best regardless of the data model and application domain [1]. As a result, we should exploit different kinds of element matchers. In our approach, we make use of a name matcher, to exploit element names; a datatype matcher, to exploit element types/datatypes; and a structural matcher, to exploit the structural contexts of elements. After a degree of similarity is computed, the questions of how to combine the similarities from the different element matchers and how to select the top-k mappings are addressed in the second phase.

Name Matcher

The aim of this phase is to obtain an initial matching between the elements of two ontology trees based on the similarity of their names. To compute the linguistic similarity between two element names $s_1$ and $s_2$, we use the following three similarity measures. The first one is

$$sim_{edit}(s_1, s_2) = \frac{\max(|s_1|, |s_2|) - editDistance(s_1, s_2)}{\max(|s_1|, |s_2|)}$$

where $editDistance(s_1, s_2)$ is the minimum number of character insertion and deletion operations needed to transform one string into the other. The second similarity measure is based on the trigrams of the two strings:

$$sim_{tri}(s_1, s_2) = \frac{2\,|tri(s_1) \cap tri(s_2)|}{|tri(s_1)| + |tri(s_2)|}$$

where $tri(s_1)$ is the set of trigrams in $s_1$. The third similarity measure is based on the Jaro-Winkler distance, which is given by

$$sim_{jaro}(s_1, s_2) = \frac{1}{3}\left(\frac{m}{|s_1|} + \frac{m}{|s_2|} + \frac{m - t}{m}\right)$$

where $m$ is the number of matching characters and $t$ is the number of transpositions. The linguistic matching between two ontology tree nodes is computed as the combination (weighted sum) of the above three similarity values.
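The three name similarity measures and their weighted combination can be sketched as follows. The sketch is illustrative only and rests on several assumptions: all function names are hypothetical, the edit distance used here is the standard Levenshtein distance (which also allows substitutions, so the score stays between 0 and 1), the Jaro matching window follows the usual textbook definition, and the equal weights are a placeholder because no weights are stated in the text.

```python
def edit_distance(s1, s2):
    """Levenshtein distance (insert, delete, substitute); an assumption."""
    prev = list(range(len(s2) + 1))
    for i, a in enumerate(s1, start=1):
        cur = [i]
        for j, b in enumerate(s2, start=1):
            cur.append(min(prev[j] + 1,              # delete a
                           cur[j - 1] + 1,           # insert b
                           prev[j - 1] + (a != b)))  # substitute
        prev = cur
    return prev[-1]


def sim_edit(s1, s2):
    longest = max(len(s1), len(s2))
    if longest == 0:
        return 1.0
    return (longest - edit_distance(s1, s2)) / longest


def trigrams(s):
    """Set of all substrings of length three."""
    return {s[i:i + 3] for i in range(len(s) - 2)}


def sim_tri(s1, s2):
    t1, t2 = trigrams(s1), trigrams(s2)
    if not t1 and not t2:
        return 0.0
    return 2 * len(t1 & t2) / (len(t1) + len(t2))


def sim_jaro(s1, s2):
    if not s1 or not s2:
        return 0.0
    window = max(max(len(s1), len(s2)) // 2 - 1, 0)
    used1 = [False] * len(s1)
    used2 = [False] * len(s2)
    m = 0
    for i, ch in enumerate(s1):
        lo, hi = max(0, i - window), min(len(s2), i + window + 1)
        for j in range(lo, hi):
            if not used2[j] and s2[j] == ch:
                used1[i] = used2[j] = True
                m += 1
                break
    if m == 0:
        return 0.0
    a = [c for c, u in zip(s1, used1) if u]
    b = [c for c, u in zip(s2, used2) if u]
    t = sum(x != y for x, y in zip(a, b)) / 2   # transpositions
    return (m / len(s1) + m / len(s2) + (m - t) / m) / 3


def name_similarity(s1, s2, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of the three measures; equal weights are assumed."""
    scores = (sim_edit(s1, s2), sim_tri(s1, s2), sim_jaro(s1, s2))
    return sum(w * s for w, s in zip(weights, scores))


print(name_similarity("Staff", "TechnicalStaff"))
print(name_similarity("Degree", "Education"))
```

Keeping each measure as a separate function makes it straightforward to change the weights or to swap a measure without touching the combination step.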
In the element matcher phase, a degree of similarity is computed automatically for all element pairs.

Datatype Compatibility

We propose the use of datatype compatibility to improve the initial linguistic similarity. To this end, we make use of the built-in XML datatype hierarchy in order to compute datatype compatibility coefficients. Based on the XML schema datatype hierarchy, we build a datatype compatibility table like the one used in [17]. After computing the datatype compatibility coefficients, we adjust the linguistic similarity values. The result of this process is an adjusted linguistic similarity matrix.

Structural Matcher

Our structural matching algorithm is motivated by the fact that the most prominent feature of an XML schema is its hierarchical structure. This matcher is based on the node context, which is reflected by the node's ancestors and descendants. The descendants of an element include both its immediate children and the leaves of the subtrees rooted at the element. The immediate children reflect its basic structure, while the leaves reflect the element's content. In this paper, as in [16, 3], we consider three kinds of node context depending on the node's position in the ontology tree:

The child context of a node $n_i$ is defined as the set of its immediate children, including attributes and subelements. The child context of an atomic node is the empty set. Using the edge relationship property, we can identify the immediate children of a complex node and their count. The number of immediate children of a non-leaf node is obtained by counting the occurrences of its post-order number in the NPS sequence, and the entries at those positions identify the children. For example, in OT1 of Example 1, the post-order number of node n4 is 10. This number occurs twice in the NPS, which means that n4 has two immediate children, {n5, n11}.

The leaf context of a node $n_i$ is defined as the set of leaf nodes of the subtrees rooted at $n_i$. We notice that nodes whose post-order numbers do not appear in the NPS sequence are atomic nodes (the CPS atomic node property). From this observation and from the child context, we can recursively obtain the leaf context of a given node. For example, in OT1 of Example 1, the set of leaf nodes is {n2, n3, n7, n8, n9, n10, n11}. Node n6 has two children, n7 and n8, which are leaves; they form the leaf context of node n6.

The ancestor context of a node $n_i$ is defined as the path extending from the root node to $n_i$. The ancestor context of the root node is the empty path. For a non-atomic node, we obtain the ancestor context by scanning the NPS sequence from left to right and identifying the numbers which are greater than the post-order number of the node, until the first occurrence of the root node. While scanning from left to right, we ignore nodes whose post-order numbers are less than the post-order numbers of already scanned nodes. For a leaf node, the ancestor context is the ancestor context of its parent extended by the parent itself. For example, in OT1 of Example 1, the ancestor context of node n5 (a non-atomic node) is the path n1/n4/n5, while the ancestor context of node n9 (an atomic node) is the path n1/n4/n5/n9.

Structural Context Similarity Algorithm

The structural node contexts defined above rely on the notions of path and set. In order to compare two ancestor contexts, we essentially compare their corresponding paths. In order to compare two child contexts and/or leaf contexts, on the other hand, we need to compare their corresponding sets. In the following, we describe how to compute the three structural context measures:

1. Child Context Algorithm: To obtain the child context similarity between two nodes, we compare their two child context sets. To this end, we first extract the child context set of each node. Second, we compute the linguistic similarity between each pair of children in the two sets. Third, we select the matching pairs with maximum similarity values. Finally, we take the average of the best similarity values.

2. Leaf Context Algorithm: Before we delve into the details of computing leaf context similarity, we first introduce the notion of the gap between two tree nodes.

Definition. The gap between two nodes $n_i$ and $n_j$ in an ontology tree OT is defined as the difference between their post-order numbers.

To compute the leaf context similarity between two nodes, we compare their leaf context sets. To this end, we first extract the leaf context set of each node. Second, we determine the gap between each node and the members of its leaf context set; we call the resulting vector the gap vector. Third, we apply the cosine measure to the two gap vectors.

Example 2. Consider the two ontology trees of Example 1. The leaf context set of OT1.n1 is leaf_set(n1) = {n2, n3, n7, n8, n9, n10, n11}, and the leaf context set of OT2.n1 is leaf_set(n1) = {n2, n6, n7, n8, n9, n10, n11}. The gap vector of OT1.n1 is v1 = gapvec(OT1.n1) = {10, 9, 8, 7, 5, 4, 2}, and the gap vector of OT2.n1 is v2 = gapvec(OT2.n1) = {10, 9, 8, 7, 5, 4, 2}. The cosine measure CM of the two vectors gives CM(v1, v2) = 1.0. Hence the leaf context similarity between nodes OT1.n1 and OT2.n1 is 1.0.

3. Ancestor Context Similarity: The ancestor context similarity captures the similarity between two nodes based on their ancestor contexts. To compute the ancestor similarity between two nodes $n_i$ and $n_j$, we first extract each ancestor context from the CPS sequence, say path $P_i$ for $n_i$ and path $P_j$ for $n_j$. Second, we compare the two paths. To compare two paths, we use three of the four scores established in [4] and reused in [3].
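The context extraction from the NPS and the leaf context measure of Example 2 can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, nodes are identified by their post-order numbers, `None` stands for the root's "-" entry, `ancestor_context` simply follows parent entries instead of performing the left-to-right scan described above, and zero-padding vectors of unequal length in `cosine` is an assumption because the text does not say how such vectors are compared.

```python
import math

# NPS of OT1 and OT2 as listed in Table 1; entry i-1 holds the parent's
# post-order number of the node with post-order number i.
NPS_OT1 = [11, 11, 5, 5, 8, 8, 8, 10, 10, 11, None]
NPS_OT2 = [11, 5, 5, 5, 8, 8, 8, 10, 10, 11, None]


def child_context(nps, k):
    """Post-order numbers of the immediate children of node k."""
    return [i for i, parent in enumerate(nps, start=1) if parent == k]


def leaf_context(nps, k):
    """Post-order numbers of the leaves below node k."""
    leaves = []
    for child in child_context(nps, k):
        if child in nps:      # complex node: its number occurs in NPS
            leaves.extend(leaf_context(nps, child))
        else:                 # atomic node: its number never occurs
            leaves.append(child)
    return leaves


def ancestor_context(nps, k):
    """Path of post-order numbers from the root down to node k."""
    path = [k]
    while nps[path[-1] - 1] is not None:
        path.append(nps[path[-1] - 1])
    return path[::-1]


def gap_vector(nps, k):
    """Gaps between node k and every leaf in its leaf context."""
    return [k - leaf for leaf in leaf_context(nps, k)]


def cosine(u, v):
    """Cosine measure; the shorter vector is zero-padded (assumption)."""
    n = max(len(u), len(v))
    u = u + [0] * (n - len(u))
    v = v + [0] * (n - len(v))
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Node n4 of OT1 has post-order number 10; its children are 8 (n5) and 9 (n11).
print(child_context(NPS_OT1, 10))        # [8, 9]
# Node n5 of OT1 has post-order number 8; the path is n1/n4/n5.
print(ancestor_context(NPS_OT1, 8))      # [11, 10, 8]
# Roots of OT1 and OT2 both have post-order number 11.
v1 = gap_vector(NPS_OT1, 11)
v2 = gap_vector(NPS_OT2, 11)
print(v1)                                # [10, 9, 8, 7, 5, 4, 2]
print(math.isclose(cosine(v1, v2), 1.0)) # True
```

All three contexts are read from the number sequence alone, so the tree itself is never traversed again once the CPS has been built.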
These scores are combined to compute the similarity psim between two paths $P_i$ and $P_j$ as follows:

$$psim(P_i, P_j) = LCS_n(P_i, P_j) - \gamma\, GAPS(P_i, P_j) - \delta\, LD(P_i, P_j) \qquad (1)$$

where $\gamma$ and $\delta$ are positive parameters ranging from 0 to 1 that represent the comparative importance of each factor. The three scores are: (1) $LCS_n(P_i, P_j)$, which measures the longest common subsequence of the two paths normalized by the length of the first path; (2) $GAPS(P_i, P_j)$, which ensures that the occurrences of the nodes of the two paths are close to each other; and (3) $LD(P_i, P_j)$, which gives higher values to source paths whose length is similar to that of the target paths.

Putting it all together: Our complete structural matching algorithm is as follows. The algorithm accepts CPS(NPS, LPS) for each schema tree and the linguistic similarity matrix as inputs and produces a structural similarity matrix. For each node pair, the context similarity is computed using the child context, the ancestor context, and the leaf context. The three context values are then combined.

4 Experimental Evaluation

We implemented the above algorithms in Java. We ran all our experiments on a 2.4GHz Intel Core2 processor with 2GB RAM running Windows XP. In the following, we describe the data sets used in the evaluation and our experimental results.

4.1 Data Sets

We experimented with the data sets shown in Table 2. These data sets were obtained from … We chose these data sets because they differ in the number of nodes (schema size) and in depth (the level of node nesting), and because they represent different application domains; see Table 2.

Table 2: Data set details

Domain       No. of ontologies/nodes   Ontology size
University   44/55                     < 1KB
XCBL         57/35                     < 1 KB
OAGIS        4/36                      <1 KB
OAGIS        1/65                      >1 KB

4.2 Measures for Match Performance

The XPrüM system considers both performance aspects: matching effectiveness and matching efficiency.
However, due to space limitations and in line with the paper's outline, we consider matching efficiency in more detail. To this end, we use the response time as a function of the number of schemas and the number of nodes to measure matching efficiency.

[Figure 4: Performance analysis of the XPrüM system with real-world ontologies. (a) Response time of University ontologies; (b) Response time of XCBL ontologies; (c) Response time of OAGIS ontology nodes 1; (d) Response time of OAGIS ontology nodes 2. Panels (a) and (b) plot response time in ms against the number of schemas; panels (c) and (d) plot response time in seconds against the number of nodes.]

4.3 Experimental Results

XPrüM Quality: To show the matching effectiveness of our system, the ontologies shown in Figure 1 have been used. XPrüM discovered 9 true positive matches out of 11, along with 2 false positive matches, for an F-measure of 81%. Compared to COMA++ [1], which discovered only 5 true positive matches and missed 6 matches (false negatives), for an F-measure of 62%, our system demonstrates better quality.

XPrüM Efficiency: Figure 4 shows that XPrüM scales well across all three domains. The system could identify and discover correspondences across 44 schemas including 55 nodes from the university domain in a time of .4 second, while the approach needs 1.8 seconds to match 57 schemas including approximately 35 nodes from the XCBL domain. This demonstrates that XPrüM scales with a large number of schemas. To demonstrate the system's scalability with large-scale schemas, we performed two sets of experiments. The first one was carried out on the OAGIS domain, whose schema sizes range between 1KB and 1KB. Figure 4(c) shows that the system needs 26 seconds to match 4 schemas containing 36 nodes. The second set was also performed on the OAGIS domain, containing 1 schemas whose sizes are greater than 1KB. XPrüM needs more than 1 seconds to match 65 nodes, as shown in Figure 4(d).

5 Conclusion

With the proliferation of Semantic Web ontologies, the development of automatic techniques for ontology matching will be crucial to their success. In this paper, we have addressed an intricate problem associated with ontology matching: matching scalability. To tackle this, we have proposed and implemented the XPrüM system, a hybrid matching algorithm that automatically discovers semantic correspondences between XML schemas. The system starts by transforming schemas into ontology trees and then constructs a Consolidated Prüfer Sequence (CPS) for each schema tree, which establishes a one-to-one correspondence between schema trees and sequences. We capture schema tree semantic information in Label Prüfer Sequences and schema tree structural information in Number Prüfer Sequences. Experimental results have shown that XPrüM scales well with respect to both a large number of schemas and large-scale schemas. Moreover, it preserves matching quality alongside its scalability. XPrüM has other features as well: it is almost automatic; it does not make use of any external dictionary; and it is independent of the data model and application domain of the matched schemas. In our ongoing work, we will consider the second aspect of the semantics conveyed by ontologies, i.e. modeling ontologies using RDF(S) or OWL, in order to deal with background knowledge. This will help us to apply the sequence-based approach to other applications and domains such as image matching and Web service discovery.

REFERENCES

[1] D. Aumueller, H.H. Do, S. Massmann, and E. Rahm, Schema and ontology matching with COMA++, in SIGMOD Conference, (2005).
[2] A. Bonifati, G. Mecca, A. Pappalardo, and S. Raunich, The Spicy project: A new approach to data matching, in SEBD, Turkey, (2006).
[3] A. Boukottaya and C. Vanoirbeek, Schema matching for transforming structured documents, in DocEng 05, (2005).
[4] D. Carmel, N. Efraty, G. Landau, Yo. Maarek, and Y. Mass, An extension of the vector space model for querying XML documents via XML fragments, SIGIR Forum, 36(2), (2002).
[5] L. Ding, P. Kolari, Z. Ding, S. Avancha, T. Finin, and A. Joshi, Using ontologies in the semantic web: A survey, in TR-CS-05-07, (2005).
[6] H. H. Do and E. Rahm, COMA: A system for flexible combination of schema matching approaches, in VLDB 2002, (2002).
[7] A. Doan, Learning to map between structured representations of data, Ph.D. thesis, Washington University, (2002).
[8] A. Doan, J. Madhavan, R. Dhamankar, P. Domingos, and A. Halevy, Learning to match ontologies on the semantic web, Knowledge and Information Systems, 12(4), (2003).
[9] A. Doan, J. Madhavan, P. Domingos, and A. Halevy, Ontology matching: A machine learning approach, Handbook on Ontologies, International Handbooks on Information Systems, (2004).
[10] C. Domshlak, A. Gal, and H. Roitman, Rank aggregation for automatic schema matching, IEEE on KDE, 19(4), (April 2007).
[11] C. Drumm, M. Schmitt, H.-H. Do, and E. Rahm, Quickmig: automatic schema matching for data migration projects, in Proc. ACM CIKM07, Portugal, (2007).
[12] F. Duchateau, Z. Bellahsene, and M. Roche, An indexing structure for automatic schema matching, in SMDB Workshop, Turkey, (2007).
[13] M. Ehrig and S. Staab, QOM: quick ontology mapping, in International Semantic Web Conference 2004, (2004).
[14] F. Giunchiglia, M. Yatskevich, and P. Shvaiko, Semantic matching: Algorithms and implementation, Journal on Data Semantics, 9, 1-38, (2007).
[15] Y. Kalfoglou and M. Schorlemmer, Ontology mapping: the state of the art, The Knowledge Engineering Review, 18(1), 1-31, (2003).
[16] M. L. Lee, L. Yang, W. Hsu, and X. Yang, XClust: Clustering XML schemas for effective integration, in CIKM 02, (2002).
[17] J. Madhavan, P. Bernstein, and E. Rahm, Generic schema matching with Cupid, in VLDB 2001, Roma, Italy, (2001).
[18] S. Melnik, H. Garcia-Molina, and E. Rahm, Similarity flooding: A versatile graph matching algorithm and its application to schema matching, in Proceedings of the 18th International Conference on Data Engineering (ICDE 02), (2002).
[19] H. Prüfer, Neuer Beweis eines Satzes über Permutationen, Archiv für Mathematik und Physik, 27, (1918).
[20] E. Rahm and P. Bernstein, A survey of approaches to automatic schema matching, VLDB Journal, 10(4), (2001).
[21] H. Roitman and A. Gal, OntoBuilder: Fully automatic extraction and consolidation of ontologies from Web sources using sequence semantics, in EDBT 2006 Workshops, (2006).
[22] K. Saleem, Z. Bellahsene, and E. Hunt, PORSCHE: Performance oriented schema mediation, accepted for publication in Information Systems Journal, (2008).
[23] M. Smiljanic, XML Schema Matching: Balancing Efficiency and Effectiveness by means of Clustering, Ph.D. dissertation, Twente University, (2006).
[24] S. Tatikonda, S. Parthasarathy, and M. Goyder, LCS-trim: Dynamic programming meets XML indexing and querying, in VLDB 07, (2007).


More information

Using Element Clustering to Increase the Efficiency of XML Schema Matching

Using Element Clustering to Increase the Efficiency of XML Schema Matching Using Element Clustering to Increase the Efficiency of XML Schema Matching Marko Smiljanić University of Twente The Netherlands m.smiljanic@utwente.nl Maurice van Keulen University of Twente The Netherlands

More information

An approach to the model-based fragmentation and relational storage of XML-documents

An approach to the model-based fragmentation and relational storage of XML-documents An approach to the model-based fragmentation and relational storage of XML-documents Christian Süß Fakultät für Mathematik und Informatik, Universität Passau, D-94030 Passau, Germany Abstract A flexible

More information

A Tool for Semi-Automated Semantic Schema Mapping: Design and Implementation

A Tool for Semi-Automated Semantic Schema Mapping: Design and Implementation A Tool for Semi-Automated Semantic Schema Mapping: Design and Implementation Dimitris Manakanatas, Dimitris Plexousakis Institute of Computer Science, FO.R.T.H. P.O. Box 1385, GR 71110, Heraklion, Greece

More information

Learning mappings and queries

Learning mappings and queries Learning mappings and queries Marie Jacob University Of Pennsylvania DEIS 2010 1 Schema mappings Denote relationships between schemas Relates source schema S and target schema T Defined in a query language

More information

INTEROPERABILITY IN COLLABORATIVE NETWORK OF BIODIVERSITY ORGANIZATIONS

INTEROPERABILITY IN COLLABORATIVE NETWORK OF BIODIVERSITY ORGANIZATIONS 54 INTEROPERABILITY IN COLLABORATIVE NETWORK OF BIODIVERSITY ORGANIZATIONS Ozgul Unal and Hamideh Afsarmanesh University of Amsterdam, THE NETHERLANDS {ozgul, hamideh}@science.uva.nl Schematic and semantic

More information

Designing a Benchmark for the Assessment of XML Schema Matching Tools

Designing a Benchmark for the Assessment of XML Schema Matching Tools Designing a Benchmark for the Assessment of XML Schema Matching Tools Fabien Duchateau, Zohra Bellahsene To cite this version: Fabien Duchateau, Zohra Bellahsene. Designing a Benchmark for the Assessment

More information

Generic Schema Matching with Cupid

Generic Schema Matching with Cupid Generic Schema Matching with Cupid J. Madhavan, P. Bernstein, E. Rahm Overview Schema matching problem Cupid approach Experimental results Conclusion presentation by Viorica Botea The Schema Matching Problem

More information

CHAPTER 3 LITERATURE REVIEW

CHAPTER 3 LITERATURE REVIEW 20 CHAPTER 3 LITERATURE REVIEW This chapter presents query processing with XML documents, indexing techniques and current algorithms for generating labels. Here, each labeling algorithm and its limitations

More information

Ontology Matching Techniques: a 3-Tier Classification Framework

Ontology Matching Techniques: a 3-Tier Classification Framework Ontology Matching Techniques: a 3-Tier Classification Framework Nelson K. Y. Leung RMIT International Universtiy, Ho Chi Minh City, Vietnam nelson.leung@rmit.edu.vn Seung Hwan Kang Payap University, Chiang

More information

Ontology matching techniques: a 3-tier classification framework

Ontology matching techniques: a 3-tier classification framework University of Wollongong Research Online Faculty of Informatics - Papers (Archive) Faculty of Engineering and Information Sciences 2009 Ontology matching techniques: a 3-tier classification framework Nelson

More information

3 Classifications of ontology matching techniques

3 Classifications of ontology matching techniques 3 Classifications of ontology matching techniques Having defined what the matching problem is, we attempt at classifying the techniques that can be used for solving this problem. The major contributions

More information

A System for Storing, Retrieving, Organizing and Managing Web Services Metadata Using Relational Database *

A System for Storing, Retrieving, Organizing and Managing Web Services Metadata Using Relational Database * BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 6, No 1 Sofia 2006 A System for Storing, Retrieving, Organizing and Managing Web Services Metadata Using Relational Database

More information

Leveraging Data and Structure in Ontology Integration

Leveraging Data and Structure in Ontology Integration Leveraging Data and Structure in Ontology Integration O. Udrea L. Getoor R.J. Miller Group 15 Enrico Savioli Andrea Reale Andrea Sorbini DEIS University of Bologna Searching Information in Large Spaces

More information

Yet Another Matcher. Fabien Duchateau, Remi Coletta, Zohra Bellahsene, Renée Miller

Yet Another Matcher. Fabien Duchateau, Remi Coletta, Zohra Bellahsene, Renée Miller Yet Another Matcher Fabien Duchateau, Remi Coletta, Zohra Bellahsene, Renée Miller To cite this version: Fabien Duchateau, Remi Coletta, Zohra Bellahsene, Renée Miller. Yet Another Matcher. RR-916, 29.

More information

Trees. Q: Why study trees? A: Many advance ADTs are implemented using tree-based data structures.

Trees. Q: Why study trees? A: Many advance ADTs are implemented using tree-based data structures. Trees Q: Why study trees? : Many advance DTs are implemented using tree-based data structures. Recursive Definition of (Rooted) Tree: Let T be a set with n 0 elements. (i) If n = 0, T is an empty tree,

More information

Trees : Part 1. Section 4.1. Theory and Terminology. A Tree? A Tree? Theory and Terminology. Theory and Terminology

Trees : Part 1. Section 4.1. Theory and Terminology. A Tree? A Tree? Theory and Terminology. Theory and Terminology Trees : Part Section. () (2) Preorder, Postorder and Levelorder Traversals Definition: A tree is a connected graph with no cycles Consequences: Between any two vertices, there is exactly one unique path

More information

RiMOM Results for OAEI 2008

RiMOM Results for OAEI 2008 RiMOM Results for OAEI 2008 Xiao Zhang 1, Qian Zhong 1, Juanzi Li 1, Jie Tang 1, Guotong Xie 2 and Hanyu Li 2 1 Department of Computer Science and Technology, Tsinghua University, China {zhangxiao,zhongqian,ljz,tangjie}@keg.cs.tsinghua.edu.cn

More information

Schema-based Semantic Matching: Algorithms, a System and a Testing Methodology

Schema-based Semantic Matching: Algorithms, a System and a Testing Methodology Schema-based Semantic Matching: Algorithms, a System and a Testing Methodology Abstract. Schema/ontology/classification matching is a critical problem in many application domains, such as, schema/ontology/classification

More information

An interactive, asymmetric and extensional method for matching conceptual hierarchies

An interactive, asymmetric and extensional method for matching conceptual hierarchies An interactive, asymmetric and extensional method for matching conceptual hierarchies Jérôme David, Fabrice Guillet, Régis Gras, and Henri Briand LINA CNRS FRE 2729, Polytechnic School of Nantes University,

More information

Binary Trees, Binary Search Trees

Binary Trees, Binary Search Trees Binary Trees, Binary Search Trees Trees Linear access time of linked lists is prohibitive Does there exist any simple data structure for which the running time of most operations (search, insert, delete)

More information

Quality of Matching in large scale scenarios

Quality of Matching in large scale scenarios Quality of Matching in large scale scenarios Sana Sellami : Université de Lyon, CNRS INSA-Lyon, LIRIS, UMR5205, F-69621, France. sana.sellami@insa-lyon.fr Nabila Benharkat : Université de Lyon, CNRS INSA-Lyon,

More information

Improving Quality and Performance of Schema Matching in Large Scale

Improving Quality and Performance of Schema Matching in Large Scale Improving Quality and Performance of Schema Matching in Large Scale Fabien Duchateau, Mathieu Roche, Zohra Bellahsene To cite this version: Fabien Duchateau, Mathieu Roche, Zohra Bellahsene. Improving

More information

Similarity Flooding: A versatile Graph Matching Algorithm and its Application to Schema Matching

Similarity Flooding: A versatile Graph Matching Algorithm and its Application to Schema Matching Similarity Flooding: A versatile Graph Matching Algorithm and its Application to Schema Matching Sergey Melnik, Hector Garcia-Molina (Stanford University), and Erhard Rahm (University of Leipzig), ICDE

More information

QuickMig - Automatic Schema Matching for Data Migration Projects

QuickMig - Automatic Schema Matching for Data Migration Projects QuickMig - Automatic Schema Matching for Data Migration Projects Christian Drumm SAP Research, Karlsruhe, Germany christian.drumm@sap.com Matthias Schmitt SAP AG, St.Leon-Rot, Germany ma.schmitt@sap.com

More information

Clustering using Fast Structural Hierarchy Extraction from User-Defined Tags for Semi-Structural Languages

Clustering using Fast Structural Hierarchy Extraction from User-Defined Tags for Semi-Structural Languages Clustering using Fast Structural Hierarchy Extraction from User-Defined Tags for Semi-Structural Languages Hyun-Joo Moon, Haeng-kon Kim, Eun- Ser Lee, Member, IAENG Abstract Needs for semi-structured Languages

More information

Matching Large XML Schemas

Matching Large XML Schemas Matching Large XML Schemas Erhard Rahm, Hong-Hai Do, Sabine Maßmann University of Leipzig, Germany rahm@informatik.uni-leipzig.de Abstract Current schema matching approaches still have to improve for very

More information

PORSCHE: Performance ORiented SCHEma Mediation

PORSCHE: Performance ORiented SCHEma Mediation PORSCHE: Performance ORiented SCHEma Mediation Khalid Saleem, Zohra Bellahsene, Ela Hunt To cite this version: Khalid Saleem, Zohra Bellahsene, Ela Hunt. PORSCHE: Performance ORiented SCHEma Mediation.

More information

Lecture 32. No computer use today. Reminders: Homework 11 is due today. Project 6 is due next Friday. Questions?

Lecture 32. No computer use today. Reminders: Homework 11 is due today. Project 6 is due next Friday. Questions? Lecture 32 No computer use today. Reminders: Homework 11 is due today. Project 6 is due next Friday. Questions? Friday, April 1 CS 215 Fundamentals of Programming II - Lecture 32 1 Outline Introduction

More information

arxiv: v1 [cs.db] 23 Feb 2016

arxiv: v1 [cs.db] 23 Feb 2016 SIFT: An Algorithm for Extracting Structural Information From Taxonomies Jorge Martinez-Gil, Software Competence Center Hagenberg (Austria), jorgemar@acm.org Keywords: Algorithms; Knowledge Engineering;

More information

A Comparative Analysis of Ontology and Schema Matching Systems

A Comparative Analysis of Ontology and Schema Matching Systems A Comparative Analysis of Ontology and Schema Matching Systems K. Saruladha Computer Science Department, Pondicherry Engineering College, Puducherry, India Dr. G. Aghila Computer Science Department, Pondicherry

More information

Lily: Ontology Alignment Results for OAEI 2009

Lily: Ontology Alignment Results for OAEI 2009 Lily: Ontology Alignment Results for OAEI 2009 Peng Wang 1, Baowen Xu 2,3 1 College of Software Engineering, Southeast University, China 2 State Key Laboratory for Novel Software Technology, Nanjing University,

More information

PORSCHE: Performance ORiented SCHEma Mediation

PORSCHE: Performance ORiented SCHEma Mediation PORSCHE: Performance ORiented SCHEma Mediation Khalid Saleem Zohra Bellahsene LIRMM - UMR 5506, Université Montpellier 2, 34392 Montpellier, France Ela Hunt GlobIS, Department of Computer Science, ETH

More information

A Framework for Domain-Specific Interface Mapper (DSIM)

A Framework for Domain-Specific Interface Mapper (DSIM) 56 IJCSNS International Journal of Computer Science and Network Security, VOL.8 No.12, December 2008 A Framework for Domain-Specific Interface Mapper (DSIM) Komal Kumar Bhatia1 and A. K. Sharma2, YMCA

More information

Binary Trees

Binary Trees Binary Trees 4-7-2005 Opening Discussion What did we talk about last class? Do you have any code to show? Do you have any questions about the assignment? What is a Tree? You are all familiar with what

More information

InsMT / InsMTL Results for OAEI 2014 Instance Matching

InsMT / InsMTL Results for OAEI 2014 Instance Matching InsMT / InsMTL Results for OAEI 2014 Instance Matching Abderrahmane Khiat 1, Moussa Benaissa 1 1 LITIO Lab, University of Oran, BP 1524 El-Mnaouar Oran, Algeria abderrahmane_khiat@yahoo.com moussabenaissa@yahoo.fr

More information

Ontology Matching Using an Artificial Neural Network to Learn Weights

Ontology Matching Using an Artificial Neural Network to Learn Weights Ontology Matching Using an Artificial Neural Network to Learn Weights Jingshan Huang 1, Jiangbo Dang 2, José M. Vidal 1, and Michael N. Huhns 1 {huang27@sc.edu, jiangbo.dang@siemens.com, vidal@sc.edu,

More information

Instance-based matching of hierarchical ontologies

Instance-based matching of hierarchical ontologies Instance-based matching of hierarchical ontologies Andreas Thor, Toralf Kirsten, Erhard Rahm University of Leipzig {thor,tkirsten,rahm}@informatik.uni-leipzig.de Abstract: We study an instance-based approach

More information

Trees. (Trees) Data Structures and Programming Spring / 28

Trees. (Trees) Data Structures and Programming Spring / 28 Trees (Trees) Data Structures and Programming Spring 2018 1 / 28 Trees A tree is a collection of nodes, which can be empty (recursive definition) If not empty, a tree consists of a distinguished node r

More information

RA: An XML Schema Reduction Algorithm

RA: An XML Schema Reduction Algorithm RA: An XML Schema Reduction Algorithm Angela C. Duta 1, Ken Barker 1, and Reda Alhajj 12 1 ADSA Laboratory, Department of Computer Science, University of Calgary 2500 University Drive NW, Calgary, Canada

More information

XML Clustering by Bit Vector

XML Clustering by Bit Vector XML Clustering by Bit Vector WOOSAENG KIM Department of Computer Science Kwangwoon University 26 Kwangwoon St. Nowongu, Seoul KOREA kwsrain@kw.ac.kr Abstract: - XML is increasingly important in data exchange

More information

Evaluating XPath Queries

Evaluating XPath Queries Chapter 8 Evaluating XPath Queries Peter Wood (BBK) XML Data Management 201 / 353 Introduction When XML documents are small and can fit in memory, evaluating XPath expressions can be done efficiently But

More information

An Extended Byte Carry Labeling Scheme for Dynamic XML Data

An Extended Byte Carry Labeling Scheme for Dynamic XML Data Available online at www.sciencedirect.com Procedia Engineering 15 (2011) 5488 5492 An Extended Byte Carry Labeling Scheme for Dynamic XML Data YU Sheng a,b WU Minghui a,b, * LIU Lin a,b a School of Computer

More information

Trees. CSE 373 Data Structures

Trees. CSE 373 Data Structures Trees CSE 373 Data Structures Readings Reading Chapter 7 Trees 2 Why Do We Need Trees? Lists, Stacks, and Queues are linear relationships Information often contains hierarchical relationships File directories

More information

Industrial-Strength Schema Matching

Industrial-Strength Schema Matching Industrial-Strength Schema Matching Philip A. Bernstein, Sergey Melnik, Michalis Petropoulos, Christoph Quix Microsoft Research Redmond, WA, U.S.A. {philbe, melnik}@microsoft.com, mpetropo@cs.ucsd.edu,

More information

PathStack : A Holistic Path Join Algorithm for Path Query with Not-predicates on XML Data

PathStack : A Holistic Path Join Algorithm for Path Query with Not-predicates on XML Data PathStack : A Holistic Path Join Algorithm for Path Query with Not-predicates on XML Data Enhua Jiao, Tok Wang Ling, Chee-Yong Chan School of Computing, National University of Singapore {jiaoenhu,lingtw,chancy}@comp.nus.edu.sg

More information

Heading-Based Sectional Hierarchy Identification for HTML Documents

Heading-Based Sectional Hierarchy Identification for HTML Documents Heading-Based Sectional Hierarchy Identification for HTML Documents 1 Dept. of Computer Engineering, Boğaziçi University, Bebek, İstanbul, 34342, Turkey F. Canan Pembe 1,2 and Tunga Güngör 1 2 Dept. of

More information

Evolution of the COMA Match System

Evolution of the COMA Match System Evolution of the COMA Match System Sabine Massmann, Salvatore Raunich, David Aumüller, Patrick Arnold, Erhard Rahm WDI Lab, Institute of Computer Science University of Leipzig, Leipzig, Germany {lastname}@informatik.uni-leipzig.de

More information

Alignment Results of SOBOM for OAEI 2009

Alignment Results of SOBOM for OAEI 2009 Alignment Results of SBM for AEI 2009 Peigang Xu, Haijun Tao, Tianyi Zang, Yadong, Wang School of Computer Science and Technology Harbin Institute of Technology, Harbin, China xpg0312@hotmail.com, hjtao.hit@gmail.com,

More information

Learning to match ontologies on the Semantic Web

Learning to match ontologies on the Semantic Web The VLDB Journal (2003) 12: 303 319 / Digital Object Identifier (DOI) 10.1007/s00778-003-0104-2 Learning to match ontologies on the Semantic Web AnHai Doan 1, Jayant Madhavan 2, Robin Dhamankar 1, Pedro

More information

YAM++ Results for OAEI 2013

YAM++ Results for OAEI 2013 YAM++ Results for OAEI 2013 DuyHoa Ngo, Zohra Bellahsene University Montpellier 2, LIRMM {duyhoa.ngo, bella}@lirmm.fr Abstract. In this paper, we briefly present the new YAM++ 2013 version and its results

More information

Sangam: A Framework for Modeling Heterogeneous Database Transformations

Sangam: A Framework for Modeling Heterogeneous Database Transformations Sangam: A Framework for Modeling Heterogeneous Database Transformations Kajal T. Claypool University of Massachusetts-Lowell Lowell, MA Email: kajal@cs.uml.edu Elke A. Rundensteiner Worcester Polytechnic

More information

SeMap: A Generic Schema Matching System

SeMap: A Generic Schema Matching System SeMap: A Generic Schema Matching System by Ting Wang B.Sc., Zhejiang University, 2004 A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of Science in The Faculty of

More information

ISSN (Online) ISSN (Print)

ISSN (Online) ISSN (Print) Accurate Alignment of Search Result Records from Web Data Base 1Soumya Snigdha Mohapatra, 2 M.Kalyan Ram 1,2 Dept. of CSE, Aditya Engineering College, Surampalem, East Godavari, AP, India Abstract: Most

More information

Generic Schema Matching, Ten Years Later

Generic Schema Matching, Ten Years Later Generic Schema Matching, Ten Years Later Philip A. Bernstein Microsoft Corporation philbe@microsoft.com Jayant Madhavan Google Inc. jayant@google.com Erhard Rahm University of Leipzig rahm@informatik.uni-leipzig.de

More information

A Schema Extraction Algorithm for External Memory Graphs Based on Novel Utility Function

A Schema Extraction Algorithm for External Memory Graphs Based on Novel Utility Function DEIM Forum 2018 I5-5 Abstract A Schema Extraction Algorithm for External Memory Graphs Based on Novel Utility Function Yoshiki SEKINE and Nobutaka SUZUKI Graduate School of Library, Information and Media

More information

A Comparative Study Weighting Schemes for Double Scoring Technique

A Comparative Study Weighting Schemes for Double Scoring Technique , October 19-21, 2011, San Francisco, USA A Comparative Study Weighting Schemes for Double Scoring Technique Tanakorn Wichaiwong Member, IAENG and Chuleerat Jaruskulchai Abstract In XML-IR systems, the

More information

Accelerating XML Structural Matching Using Suffix Bitmaps

Accelerating XML Structural Matching Using Suffix Bitmaps Accelerating XML Structural Matching Using Suffix Bitmaps Feng Shao, Gang Chen, and Jinxiang Dong Dept. of Computer Science, Zhejiang University, Hangzhou, P.R. China microf_shao@msn.com, cg@zju.edu.cn,

More information

Learning to Match Ontologies on the Semantic Web

Learning to Match Ontologies on the Semantic Web The VLDB Journal manuscript No. (will be inserted by the editor) Learning to Match Ontologies on the Semantic Web AnHai Doan, Jayant Madhavan, Robin Dhamankar, Pedro Domingos, Alon Halevy Department of

More information

A Large Scale Taxonomy Mapping Evaluation

A Large Scale Taxonomy Mapping Evaluation A Large Scale Taxonomy Mapping Evaluation Paolo Avesani 1, Fausto Giunchiglia 2, and Mikalai Yatskevich 2 1 ITC-IRST, 38050 Povo, Trento, Italy avesani@itc.it 2 Dept. of Information and Communication Technology,

More information

The HMatch 2.0 Suite for Ontology Matchmaking

The HMatch 2.0 Suite for Ontology Matchmaking The HMatch 2.0 Suite for Ontology Matchmaking S. Castano, A. Ferrara, D. Lorusso, and S. Montanelli Università degli Studi di Milano DICo - Via Comelico, 39, 20135 Milano - Italy {castano,ferrara,lorusso,montanelli}@dico.unimi.it

More information

OWL-CM : OWL Combining Matcher based on Belief Functions Theory

OWL-CM : OWL Combining Matcher based on Belief Functions Theory OWL-CM : OWL Combining Matcher based on Belief Functions Theory Boutheina Ben Yaghlane 1 and Najoua Laamari 2 1 LARODEC, Université de Tunis, IHEC Carthage Présidence 2016 Tunisia boutheina.yaghlane@ihec.rnu.tn

More information

Ontology Reconciliation for Service-Oriented Computing

Ontology Reconciliation for Service-Oriented Computing Ontology Reconciliation for Service-Oriented Computing Jingshan Huang, Jiangbo Dang, and Michael N. Huhns Computer Science & Engineering Dept., University of South Carolina, Columbia, SC 29208, USA {huang27,

More information

Two-Phase Schema Matching in Real World Relational Databases

Two-Phase Schema Matching in Real World Relational Databases Two-Phase Matching in Real World Relational Databases Nikolaos Bozovic #1, Vasilis Vassalos #2, # Department of informatics, Athens University of Economics and Business, Greece 1 nbosovic@gmail.com 2 vassalos@aueb.gr

More information

SurveyonTechniquesforOntologyInteroperabilityinSemanticWeb. Survey on Techniques for Ontology Interoperability in Semantic Web

SurveyonTechniquesforOntologyInteroperabilityinSemanticWeb. Survey on Techniques for Ontology Interoperability in Semantic Web Global Journal of Computer Science and Technology: E Network, Web & Security Volume 14 Issue 2 Version 1.0 Year 2014 Type: Double Blind Peer Reviewed International Research Journal Publisher: Global Journals

More information

Using Data-Extraction Ontologies to Foster Automating Semantic Annotation

Using Data-Extraction Ontologies to Foster Automating Semantic Annotation Using Data-Extraction Ontologies to Foster Automating Semantic Annotation Yihong Ding Department of Computer Science Brigham Young University Provo, Utah 84602 ding@cs.byu.edu David W. Embley Department

More information

Restricting the Overlap of Top-N Sets in Schema Matching

Restricting the Overlap of Top-N Sets in Schema Matching Restricting the Overlap of Top-N Sets in Schema Matching Eric Peukert SAP Research Dresden 01187 Dresden, Germany eric.peukert@sap.com Erhard Rahm University of Leipzig Leipzig, Germany rahm@informatik.uni-leipzig.de

More information

AROMA results for OAEI 2009

AROMA results for OAEI 2009 AROMA results for OAEI 2009 Jérôme David 1 Université Pierre-Mendès-France, Grenoble Laboratoire d Informatique de Grenoble INRIA Rhône-Alpes, Montbonnot Saint-Martin, France Jerome.David-at-inrialpes.fr

More information

A FRAMEWORK FOR EFFICIENT DATA SEARCH THROUGH XML TREE PATTERNS

A FRAMEWORK FOR EFFICIENT DATA SEARCH THROUGH XML TREE PATTERNS A FRAMEWORK FOR EFFICIENT DATA SEARCH THROUGH XML TREE PATTERNS SRIVANI SARIKONDA 1 PG Scholar Department of CSE P.SANDEEP REDDY 2 Associate professor Department of CSE DR.M.V.SIVA PRASAD 3 Principal Abstract:

More information

Interactive Schema Matching With Semantic Functions

Interactive Schema Matching With Semantic Functions Interactive Schema Matching With Semantic Functions Guilian Wang 1 Joseph Goguen 1 Young-Kwang Nam 2 Kai Lin 3 1 Department of Computer Science and Engineering, U.C. San Diego. {guilian, goguen}@cs.ucsd.edu

More information

Semistructured Data Store Mapping with XML and Its Reconstruction

Semistructured Data Store Mapping with XML and Its Reconstruction Semistructured Data Store Mapping with XML and Its Reconstruction Enhong CHEN 1 Gongqing WU 1 Gabriela Lindemann 2 Mirjam Minor 2 1 Department of Computer Science University of Science and Technology of

More information

Data Structures and Algorithms

Data Structures and Algorithms Data Structures and Algorithms CS245-2008S-19 B-Trees David Galles Department of Computer Science University of San Francisco 19-0: Indexing Operations: Add an element Remove an element Find an element,

More information

38050 Povo Trento (Italy), Via Sommarive 14 DISCOVERING MISSING BACKGROUND KNOWLEDGE IN ONTOLOGY MATCHING

38050 Povo Trento (Italy), Via Sommarive 14  DISCOVERING MISSING BACKGROUND KNOWLEDGE IN ONTOLOGY MATCHING UNIVERSITY OF TRENTO DEPARTMENT OF INFORMATION AND COMMUNICATION TECHNOLOGY 38050 Povo Trento (Italy), Via Sommarive 14 http://www.dit.unitn.it DISCOVERING MISSING BACKGROUND KNOWLEDGE IN ONTOLOGY MATCHING

More information

Extending E-R for Modelling XML Keys

Extending E-R for Modelling XML Keys Extending E-R for Modelling XML Keys Martin Necasky Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic martin.necasky@mff.cuni.cz Jaroslav Pokorny Faculty of Mathematics and

More information

COMPUSOFT, An international journal of advanced computer technology, 4 (6), June-2015 (Volume-IV, Issue-VI)

COMPUSOFT, An international journal of advanced computer technology, 4 (6), June-2015 (Volume-IV, Issue-VI) ISSN:2320-0790 Duplicate Detection in Hierarchical XML Multimedia Data Using Improved Multidup Method Prof. Pramod. B. Gosavi 1, Mrs. M.A.Patel 2 Associate Professor & HOD- IT Dept., Godavari College of

More information

Lecture 26. Introduction to Trees. Trees

Lecture 26. Introduction to Trees. Trees Lecture 26 Introduction to Trees Trees Trees are the name given to a versatile group of data structures. They can be used to implement a number of abstract interfaces including the List, but those applications

More information

Trees. Tree Structure Binary Tree Tree Traversals

Trees. Tree Structure Binary Tree Tree Traversals Trees Tree Structure Binary Tree Tree Traversals The Tree Structure Consists of nodes and edges that organize data in a hierarchical fashion. nodes store the data elements. edges connect the nodes. The

More information

LiSTOMS: a Light-weighted Self-tuning Ontology Mapping System

LiSTOMS: a Light-weighted Self-tuning Ontology Mapping System 2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology LiSTOMS: a Light-weighted Self-tuning Ontology Mapping System Zhen Zhen Junyi Shen Institute of Computer

More information

Learning to Discover Complex Mappings from Web Forms to Ontologies

Learning to Discover Complex Mappings from Web Forms to Ontologies Learning to Discover Complex Mappings from Web Forms to Ontologies Yuan An College of Information Science and Technology Drexel University Philadelphia, USA yan@ischool.drexel.edu Xiaohua Hu College of

More information

A Bottom-up Strategy for Query Decomposition

A Bottom-up Strategy for Query Decomposition A Bottom-up Strategy for Query Decomposition Le Thi Thu Thuy, Doan Dai Duong, Virendrakumar C. Bhavsar and Harold Boley Faculty of Computer Science, University of New Brunswick Fredericton, New Brunswick,

More information

Semantic-based Optimal XML Schema Matching:

Semantic-based Optimal XML Schema Matching: 2010 International Conference on E-business, Management and Economics IPEDR vol.3 (2011) (2011) IACSIT Press, Hong Kong Semantic-based Optimal XML Schema Matching: A Mathematical Programming Approach Jaewook

More information

Compression of the Stream Array Data Structure

Compression of the Stream Array Data Structure Compression of the Stream Array Data Structure Radim Bača and Martin Pawlas Department of Computer Science, Technical University of Ostrava Czech Republic {radim.baca,martin.pawlas}@vsb.cz Abstract. In

More information

Managing Uncertainty in Schema Matcher Ensembles

Managing Uncertainty in Schema Matcher Ensembles Managing Uncertainty in Schema Matcher Ensembles Anan Marie and Avigdor Gal Technion Israel Institute of Technology {sananm@cs,avigal@ie}.technion.ac.il Abstract. Schema matching is the task of matching

More information

Handling instance coreferencing in the KnoFuss architecture

Handling instance coreferencing in the KnoFuss architecture Handling instance coreferencing in the KnoFuss architecture Andriy Nikolov, Victoria Uren, Enrico Motta and Anne de Roeck Knowledge Media Institute, The Open University, Milton Keynes, UK {a.nikolov, v.s.uren,

More information

CSE 214 Computer Science II Introduction to Tree

CSE 214 Computer Science II Introduction to Tree CSE 214 Computer Science II Introduction to Tree Fall 2017 Stony Brook University Instructor: Shebuti Rayana shebuti.rayana@stonybrook.edu http://www3.cs.stonybrook.edu/~cse214/sec02/ Tree Tree is a non-linear

More information