Improved Processing of Path Query on RDF Data Using Suffix Array


Journal of Convergence Information Technology, Volume 4, Number 3, September 2009

Sung Wan Kim (corresponding author)
Division of Computer, Sahmyook University, Seoul, Korea
swkim@syu.ac.kr
doi: /jcit.vol4.issue3.6

Abstract

RDF is a recommended standard for attaching additional semantic information to resources on the Semantic Web. Matono et al. proposed an indexing and query processing scheme for path-based RDF queries using a suffix array. In this paper, we point out some limitations of that approach and propose an improved indexing and query processing scheme that reduces the binary search space and the overhead caused by repeated direct pattern matching. Experimental performance evaluations demonstrate that our approach outperforms the previous one.

Keywords: RDF, Indexing, Query Processing, Suffix Array

1. Introduction

In the Semantic Web, we can associate resources on the Web with metadata describing additional semantic information. RDF was recommended as a standard format for this metadata [1]. RDF data is a set of triples of the form <subject, property, object>. The subject and the property are each identified by a resource URI. The object is either a resource URI or a literal value. RDF data can be represented as a directed graph in which subjects and objects are nodes and properties are arcs (Figure 1). Ovals indicate resources and rectangles indicate literals. An arc describes the relationship between a subject and an object.

Several indexing and query processing approaches have been proposed for RDF data. Matono et al. [4] introduced an indexing scheme based on a suffix array structure to process path-based RDF queries. Their experimental evaluations showed a performance gain for simple path-based RDF queries. The same approach was described in [5] with the same experimental results.
Matono's approach is significant since it was the first to apply a suffix array to the processing of path-based RDF queries.

Figure 1. An example RDF graph

In this paper, we first point out some issues with Matono's scheme. We then describe approaches to improve query processing performance. Section 2 describes the research background and related work. The improved approaches and the experimental evaluations are explained in Section 3 and Section 4, respectively. We conclude in Section 5.

2. Research Background

2.1. Suffix Array

A suffix array is a widely used data structure for retrieving a specific string pattern from large textual data [6]. The suffix at position i of a given text is the subsequence from the i-th position to the end of the text. For the sample text 'abracbra', the suffix at position 5 is 'cbra'. A suffix array is a list of all suffixes of the input text in lexicographical order. Thus, if a pattern occurs repeatedly in the given text, the suffixes that begin with it appear consecutively in the suffix array. In practice, the suffix array stores only the beginning positions of the suffixes.

The following steps construct the suffix array for the sample text 'abracbra'. We first assign index points to the given text, as shown in Figure 2. Here we assign an index point to each character.
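As a rough illustration of the definition above, the following Python sketch builds the suffix array of the sample text by plain sorting and looks up a pattern with two binary searches. The function names `suffix_array` and `find_occurrences` are illustrative only, and the naive sort is chosen for readability rather than efficiency.

```python
def suffix_array(text):
    """Return the 1-based start positions of all suffixes, sorted lexicographically."""
    return sorted(range(1, len(text) + 1), key=lambda i: text[i - 1:])


def find_occurrences(text, sa, pattern):
    """Return the 1-based positions where pattern occurs, using two binary searches."""
    m = len(pattern)

    def head(k):
        # First m characters of the k-th suffix in sorted order.
        start = sa[k] - 1
        return text[start:start + m]

    # Lower bound: first suffix whose head is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if head(mid) < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Upper bound: first suffix whose head is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if head(mid) <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[first:lo])


text = "abracbra"
sa = suffix_array(text)
print(sa)                                # [8, 1, 4, 6, 2, 5, 7, 3]
print(find_occurrences(text, sa, "ra"))  # [3, 7]
```

Because matching suffixes are adjacent in sorted order, the two bounds delimit every occurrence of the pattern.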

Text                a  b  r  a  c  b  r  a
Index points (idx)  1  2  3  4  5  6  7  8

Figure 2. Assigning index points

We then extract all suffixes from the given text. The left-hand side of Figure 3 shows the extracted suffixes with their index points. Next, we order the suffixes lexicographically, as shown in the right-hand side of Figure 3. We finally enter the sorted index points in the suffix array.

Suffixes   idx    Suffixes (sorted)   idx   lcp
abracbra   1      a                   8     0
bracbra    2      abracbra            1     1
racbra     3      acbra               4     1
acbra      4      bra                 6     0
cbra       5      bracbra             2     3
bra        6      cbra                5     0
ra         7      ra                  7     0
a          8      racbra              3     2

Figure 3. Extracted suffixes and sorted suffixes

Figure 4 shows the suffix array SA of the given text. We can retrieve a specific string pattern by binary search on the suffix array. For example, we can find the string pattern 'ra' in the given text by performing the binary search twice.

SA  8  1  4  6  2  5  7  3

Figure 4. Constructed suffix array (SA)

The LCP (longest common prefix) array is an auxiliary data structure. It stores, for each suffix in the suffix array, the length of the longest common prefix between that suffix and its preceding suffix. For example, the LCP value of the suffix 'bracbra' is 3 since its preceding suffix is 'bra'.

2.2. Indexing RDF Data Using Suffix Array

Several RDF query languages have been proposed. A typical RDF query retrieves the resources that are reachable from a given resource through a specific relationship R. The query therefore has to describe the relationship R, which can be represented with path expressions. For example, the relationship between resources r_0 and r_n is defined by the path r_0 p_1 r_1 p_2 r_2 ... p_n r_n (n > 0), where each (r_i p_{i+1} r_{i+1}) indicates a single triple.

pid \ idx   1    2   3    4   5    6   7    8   9
1           r1   p   r2   p   r3   p   r4   n   kr
2           r1   p   r5   p   r4   n   kr
3           r1   p   r5   n   jp
4           r1   p   r5   p   r6   n   cn

Figure 5. Path information table (ptab)

Matono et al. introduced an indexing scheme that uses a suffix array to efficiently process path-based queries over RDF data [4]. This scheme treats RDF data as a DAG and extracts all paths from root nodes (nodes with indegree zero) to terminal nodes (nodes with outdegree zero), each in the form of an alternation of node labels and arc labels. For example, we can extract the path pattern 'r1.p.r5.n.jp' from Figure 1. Since suffixes are extracted from different paths, we assign an integer pair (pid, idx) as the index point of each suffix. In this paper, we call this pair a suffix label. The pid is the path identifier and idx is the index point within the path. Figure 5 shows the assignment table of suffix labels for the suffixes extracted from the four paths of the RDF graph in Figure 1.

Figure 6. Suffix array index construction

The steps to generate the suffix array are shown in Figure 6. We sort all suffixes in lexicographical order and eliminate duplicate suffixes. We finally obtain the suffix array [ (4,7) (3,5) (1,9) (4,6) (3,4) (1,8) (1,2) (1,4) (1,6) (3,2) (2,2) (4,2) (4,4) (1,1) (3,1) (2,1) (4,1) (1,3) (1,5) (1,7) (3,3) (2,3) (4,3) (4,5) ]. Query processing uses this suffix array and the path information table (ptab) shown in Figure 5.

In [4], only a simple path query type was considered. The following query is an example in RDQL, one of the RDF query languages.

Ex. 1) select ?x where (r1 p r5) (r5 p ?x)

This query retrieves all resources reachable through a given path pattern. We call this form of query a forward simple path query. The condition in the where clause of the above query can be represented as the path pattern 'r1.p.r5.p'. The following steps are used in [4] to retrieve the suffix labels of all suffixes that begin with the path pattern 'r1.p.r5.p'.

1) Find a position p in the suffix array SA whose suffix matches the given path pattern, by performing binary searches over the suffix array with the help of ptab (here, p is 16 if array indexes begin from 1).

2) Repeatedly perform pattern matching on the suffixes adjacent to position p, on both its left and right sides, to find further matching positions in SA (position 17 is additionally acquired here).

3) Extract the contents of SA at the matched positions, namely the suffix labels. For this example, (2, 1) and (4, 1) are extracted from SA[16] and SA[17]. Then add the length of the given query pattern (here, 4) to the idx value of each suffix label. Using the modified suffix labels, obtain the final answer {r4, r6} from ptab.

The following example explains backward query processing, which retrieves all resources that precede the path pattern of a given query. All processing steps are the same as for the forward query type, except for step 3, in which we decrement the idx value of each suffix label by one.

Ex. 2) select ?x where (?x p r4) (r4 n kr)

For the above backward path query we find the suffix label (1, 6) and finally obtain {r3} as the return value. However, one result is missing (namely 'r5' on the path with pid 2), because the suffix array was generated after eliminating the duplicate suffixes.

We can observe two features of this query processing. First, the whole suffix array is the retrieval space for every query. As the size of the RDF data grows, the size of the suffix array also grows, and so do the number of retrievals and the processing time of the binary searches. Second, there is overhead from repeatedly performing time-consuming path pattern matching on the left and right sides of the first matched position in the suffix array. In the next section, we describe new approaches that handle the backward query type without omitting results and that address these two drawbacks of Matono's approach.

3. Proposed Approach

In this section, we describe an improved index organization and query processing approach that improves on the performance of Matono's approach. We assume RDF data to be a DAG. We define a path as an alternation of node labels and arc labels as follows.

Path ::= (rsclbl '.' proplbl '.')* (rsclbl | literalval | (rsclbl '.' proplbl))
rsclbl ::= URI Reference (refer to [1])
proplbl ::= propname (refer to [1])
literalval ::= Constant Values (refer to [1])
propname ::= URI Reference (refer to [1])

The length d of a path is the number of components that make up the path pattern. For the path pattern 'r1.p1.r2.p1', the length d of the path is 4.

3.1. Index Organization

Figure 7 shows the stages of index organization. We first extract all paths from the RDF data and construct the path information table (ptab). For each extracted path, we enumerate its suffixes and assign each suffix a suffix label that consists of a pair of path id (pid) and index point (idx).
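The path table, the suffix labels, and the forward and backward lookups of Section 2.2 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the `PTAB` contents are transcribed from Figure 5, all function names are hypothetical, and the matching step uses a linear scan for brevity where the scheme in [4] uses binary search plus neighbor matching.

```python
# Path information table (ptab) from Figure 5: pid -> path components.
PTAB = {
    1: ["r1", "p", "r2", "p", "r3", "p", "r4", "n", "kr"],
    2: ["r1", "p", "r5", "p", "r4", "n", "kr"],
    3: ["r1", "p", "r5", "n", "jp"],
    4: ["r1", "p", "r5", "p", "r6", "n", "cn"],
}


def suffix_of(label):
    """Return the suffix (as a tuple of components) named by a (pid, idx) label."""
    pid, idx = label
    return tuple(PTAB[pid][idx - 1:])


def build_suffix_array(dedup):
    """Sort all suffix labels by their suffix; optionally drop duplicate suffixes."""
    labels = [(pid, idx) for pid, path in PTAB.items()
              for idx in range(1, len(path) + 1)]
    labels.sort(key=lambda lab: (suffix_of(lab), lab))
    if not dedup:
        return labels
    unique, seen = [], set()
    for lab in labels:
        if suffix_of(lab) not in seen:  # keep only the first label per suffix
            seen.add(suffix_of(lab))
            unique.append(lab)
    return unique


def matching_labels(sa, pattern):
    """Labels whose suffix begins with pattern (linear scan, for clarity only)."""
    pattern = tuple(pattern)
    return [lab for lab in sa if suffix_of(lab)[:len(pattern)] == pattern]


def forward(sa, pattern):
    """Resources that follow the pattern: look up ptab[pid][idx + d]."""
    d = len(pattern)
    return {PTAB[pid][idx + d - 1] for pid, idx in matching_labels(sa, pattern)
            if idx + d <= len(PTAB[pid])}


def backward(sa, pattern):
    """Resources that precede the pattern: look up ptab[pid][idx - 1]."""
    return {PTAB[pid][idx - 2] for pid, idx in matching_labels(sa, pattern)
            if idx > 1}


deduped = build_suffix_array(dedup=True)
full = build_suffix_array(dedup=False)
print(sorted(forward(deduped, ["r1", "p", "r5", "p"])))    # ['r4', 'r6']
print(sorted(backward(deduped, ["p", "r4", "n", "kr"])))   # ['r3']
print(sorted(backward(full, ["p", "r4", "n", "kr"])))      # ['r3', 'r5']
```

The last two lines reproduce the missing-result effect: with duplicate suffixes removed, the backward query loses 'r5', while keeping the duplicates recovers it.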
The distinction from Matono's approach is that we do not eliminate duplicate suffixes, so that backward path queries can be handled without missing results. We next compute the LCP value for each suffix after sorting the extracted suffixes in lexicographical order. The LCP array is used to reduce the overhead of repeated pattern matching during query processing. The next section explains its usage.

Figure 7. Index generation steps

A characteristic of the suffix array is that suffixes with the same path pattern as a prefix appear consecutively. In Matono's approach, the entire suffix array is always included in the search space. However, for a query pattern that begins with 'r1', it is enough to use only the suffixes beginning with 'r1' as the binary search space. Suffixes that do not have 'r1' in the starting position need not be included in the search space. If we perform pattern matching only against the suffixes that begin with the same component as the first component of the given query pattern, we can reduce the number of binary search steps and pattern matches. Finally, we generate the index using the sorted suffix labels and the LCP values.

In this paper, we maintain several suffix arrays, instead of a single one, to reduce the binary search space. The proposed index structure is shown in Figure 8. It consists of two parts. Each suffix array SA_k in the right side of the index includes only the suffix labels of the suffixes that have 'k' as their first component. For instance, SA_r1 holds the suffix labels [(1, 1)(3, 1)(2, 1)(4, 1)], which are assigned only to the suffixes beginning with 'r1'. We also maintain several LCP arrays. Each LCP array LCP_k maintains the LCP values for the suffix patterns with 'k' as their first component. The keyword group in the left side of the index includes only the first components of the suffixes. Each keyword connects to its SA_k and LCP_k arrays. The keyword group can be implemented by a list or a B-tree.

Figure 8. Proposed index structure

When the path pattern 'r1.p.r5.p' of example query 1 is processed using the proposed index, only the set of suffix labels stored in SA_r1 is used for the binary search. Since the binary search space is reduced, we can reduce both the number of search steps needed to find the first suffix that matches the given query pattern and the number of pattern matches.

3.2. Query Processing

In this section we describe our approach to handling forward and backward path queries. Repeatedly performing pattern matching between the path pattern of a given query and the suffix patterns adjacent to the first matched position, on both its left and right sides, is time consuming. In the previous approach, each such match first extracts a suffix label from the suffix array, then obtains the corresponding suffix pattern as a string from the path information table (ptab) using the extracted suffix label, and finally compares that string with the path pattern of the query. Instead, we handle this step by comparing the entries of the LCP array, which are integer values.

Figure 9 shows the processing algorithm for the forward path query. The function GetSuffixLabel extracts the suffix labels of all suffixes that match the path pattern of a given query. Its initial step finds a position p in SA_i whose suffix matches the path pattern of the query. This initial step is performed in the same manner as in the previous approach. However, in the proposed approach the binary search space is limited to a single SA_i instead of the entire suffix array. For example query 1 we obtain p = 3 on SA_r1 and include the suffix label (2, 1) in a temporary set.

The second step finds the other matching suffix patterns placed on the left and right sides of the first matched position p in SA_i. Instead of direct pattern matching, we use the LCP_i array. The value of LCP_i[p] is the length of the longest common prefix of the suffix patterns of SA_i[p] and SA_i[p-1]. For instance, the suffix patterns for SA_r1[4] and SA_r1[3] are 'r1.p.r5.p.r6.n.cn' and 'r1.p.r5.p.r4.n.kr' respectively. Thus, the value of LCP_r1[4] is 4.

Consider finding the matching suffix patterns to the left of the first matched position p in SA_i. Let d be the length of the path pattern of the given query, and let j start at p. If LCP_i[j] is greater than or equal to d, then the suffix pattern at SA_i[j-1] shares its first d components with the suffix at SA_i[j] and therefore also matches the path pattern of the query. We thus include the suffix label at SA_i[j-1] in the result set. By repeating this procedure with j decremented, we find all matching suffix labels to the left of position p. Similarly, we can find the matching suffix labels to the right: we repeatedly include the suffix label at SA_i[j+1] as long as the value of LCP_i[j+1] is greater than or equal to d. In this way we replace the time-consuming pattern matching with integer comparisons and reduce query processing time. For example query 1 we obtain the additional suffix label (4, 1) at position 4 of SA_r1.

The function ForwardQueryProcessing obtains the final answer from ptab, using the set of suffix labels returned by the function GetSuffixLabel.
We first modify each returned suffix label by adding d to its idx value and then include the content of ptab[pid][idx] in the final answer. For example query 1, the function GetSuffixLabel returns the set of suffix labels {(2, 1) (4, 1)} and we obtain {r4, r6} as the final answer.

We omit the processing algorithm for the backward path query since it is identical to that of the forward path query except for one step. We first obtain a set of suffix labels by executing the function GetSuffixLabel. We then modify each returned suffix label by decrementing its idx value by 1, and include the content of ptab[pid][idx] in the final answer.

Function ForwardQueryProcessing(usrQueryPattern)
  // Assume d is the length of the query pattern usrQueryPattern
  // Output: the final result set finalSet
  tempSet ← GetSuffixLabel(usrQueryPattern)
  For each suffix label (pid, idx) in tempSet Do
    Add the content of ptab[pid][idx + d] to finalSet
  End For
End Function

Function GetSuffixLabel(usrQueryPattern)
  // Output: the set tempSet of suffix labels matched with the query pattern usrQueryPattern
  Step 1) Find a position p in SA_i whose suffix matches usrQueryPattern
          Add the suffix label stored in SA_i[p] to tempSet
  Step 2) Find additional suffix labels adjacent to position p in SA_i using the LCP array
    2-1) Perform left side scan
         tp ← p
         While (tp > 1 and d <= LCP_i[tp])
           tp ← tp - 1   // move the scan position to the left
           Add the content of SA_i[tp] to tempSet
         End While
    2-2) Perform right side scan
         tp ← p
         While (tp + 1 <= |SA_i| and d <= LCP_i[tp + 1])
           tp ← tp + 1   // move the scan position to the right
           Add the content of SA_i[tp] to tempSet
         End While
End Function

Figure 9. Query Processing Algorithm for Forward Path Query
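The index of Section 3.1 and the algorithm of Figure 9 can be sketched together in Python as below. This is an illustrative sketch, not the paper's implementation (which used database tables): the `PTAB` data is transcribed from Figure 5, the keyword group is assumed to be a plain dictionary, arrays are 0-based, and all function names are hypothetical. A non-empty query pattern is assumed.

```python
from collections import defaultdict

# Path information table (ptab) from Figure 5: pid -> path components.
PTAB = {
    1: ["r1", "p", "r2", "p", "r3", "p", "r4", "n", "kr"],
    2: ["r1", "p", "r5", "p", "r4", "n", "kr"],
    3: ["r1", "p", "r5", "n", "jp"],
    4: ["r1", "p", "r5", "p", "r6", "n", "cn"],
}


def suffix_of(label):
    pid, idx = label
    return tuple(PTAB[pid][idx - 1:])


def common_prefix_len(a, b):
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return n


def build_index():
    """Map each first component k to (SA_k, LCP_k); duplicate suffixes are kept."""
    groups = defaultdict(list)
    for pid, path in PTAB.items():
        for idx in range(1, len(path) + 1):
            groups[path[idx - 1]].append((pid, idx))
    index = {}
    for key, labels in groups.items():
        labels.sort(key=lambda lab: (suffix_of(lab), lab))
        lcp = [0] + [common_prefix_len(suffix_of(a), suffix_of(b))
                     for a, b in zip(labels, labels[1:])]
        index[key] = (labels, lcp)
    return index


def get_suffix_labels(index, pattern):
    """Step 1: binary search inside one SA_k. Step 2: extend using LCP_k only."""
    pattern = tuple(pattern)
    d = len(pattern)
    sa, lcp = index.get(pattern[0], ([], []))
    lo, hi, p = 0, len(sa) - 1, None
    while lo <= hi:
        mid = (lo + hi) // 2
        head = suffix_of(sa[mid])[:d]
        if head == pattern:
            p = mid
            break
        if head < pattern:
            lo = mid + 1
        else:
            hi = mid - 1
    if p is None:
        return []
    result = [sa[p]]
    tp = p
    while tp > 0 and lcp[tp] >= d:                 # left side scan
        tp -= 1
        result.append(sa[tp])
    tp = p
    while tp + 1 < len(sa) and lcp[tp + 1] >= d:   # right side scan
        tp += 1
        result.append(sa[tp])
    return result


def forward_query(index, pattern):
    d = len(pattern)
    return {PTAB[pid][idx + d - 1] for pid, idx in get_suffix_labels(index, pattern)
            if idx + d <= len(PTAB[pid])}


def backward_query(index, pattern):
    return {PTAB[pid][idx - 2] for pid, idx in get_suffix_labels(index, pattern)
            if idx > 1}


index = build_index()
print(index["r1"])  # ([(1, 1), (3, 1), (2, 1), (4, 1)], [0, 2, 3, 4])
print(sorted(forward_query(index, ["r1", "p", "r5", "p"])))   # ['r4', 'r6']
print(sorted(backward_query(index, ["p", "r4", "n", "kr"])))  # ['r3', 'r5']
```

Only the first matched position requires a comparison against the actual suffix; every neighbor is accepted or rejected by one integer comparison against d.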

4. Experimental Evaluation

We evaluate performance in this section. We used a modified FOAF ontology-based RDF data set provided by the FOAF project [7]. After transforming the data to a DAG, we generated two data sets, as shown in Table 1. Eliminating duplicate suffixes reduced the number of suffixes by roughly 40% (about 38% for Data1 and 40% for Data2).

Table 1. Experimental Data Set

                                                          Data1     Data2
Data size (KB)                                            2,000     10,000
# of extracted paths                                      1,314     6,570
# of extracted suffixes                                   14,874    74,370
# of extracted suffixes (duplicate suffixes eliminated)   9,293     44,539

Tests were performed on a machine with an Intel Core2 Duo 2.20GHz CPU, 1GB of memory, and a 300 GB HDD, running Windows XP Professional. We used MS Visual C and MySQL 5.0 for the implementation.

We implemented three approaches to evaluate the performance of path-based RDF query processing. First, we implemented Matono's approach, described in Section 2. For this, we eliminated the duplicate suffixes and generated a table for the path information and a table for the suffix array. The second approach was similar to the first, but the duplicate suffixes were not removed, so that backward path queries could be handled with no missing results. The last approach is our proposed approach. Here, too, we did not eliminate the duplicate suffixes. We applied the proposed index structure and query processing algorithms. For this, we generated an additional table to store keywords and a table to store LCP values. For all three approaches, we performed the experiments after loading the suffix array into main memory.

The query types used for the performance evaluation are shown in Table 2. We measured the average execution times for the forward and backward path queries, not taking database caching into account. The path length in the table denotes the number of components in the path pattern of a given query. The retrieval target indicates the kind of item returned in the final set. The position is where the given query is matched in the RDF data graph. If the position is root, for example, the given query is matched on paths that begin from root nodes (indegree of zero) of the RDF graph.

Table 2. Query Types for Test

Query   Direction   Path length   Target     Position
Q1      forward     6             resource   upper
Q2      forward     10            resource   root
Q3      forward     4             resource   mid
Q4      forward     5             property   mid
Q5      forward     5             resource   mid
Q6      forward     5             resource   upper
Q7      backward    4             resource   lower

Table 3 shows the query processing results for the query types in Table 2. The number of binary searches in the table is the number of binary search steps executed until the given query is first matched. The number of L/R accessing field denotes, for the two former approaches, the number of pattern matches executed on the left and right sides of the first matched position and, for the proposed approach, the number of comparisons of LCP values. The number of returns field is the number of results obtained by query evaluation, including replicated results. The number of results is the number of returns with the duplicated results excluded. We omitted this field from the table for the forward queries, since the number of results is the same in all three approaches.

For all query types, the number of binary searches is reduced remarkably when we apply the proposed indexing approach. The number of accesses to the adjacent components on the left and right sides of the first matched position in the suffix array differs between the first approach and the other two, depending on the query type. Most of the duplicate suffixes appear among the suffixes extracted from nodes positioned below the mid parts of the RDF graph. The query types Q3, Q4, Q5, and Q7 are matched at the mid and/or lower parts of the RDF graph. Thus, the first approach, which removes the duplicate suffixes, accesses fewer adjacent components for pattern matching than the other two. The number of returns after query processing was also smaller in the first approach. Both the second and the proposed approaches, which do not remove the duplicate suffixes, returned the same number of results.

Conversely, for the query types Q1, Q2, and Q6, which are matched in the upper parts of the RDF graph, the number of accesses to the adjacent components on the left and right sides was measured to be equal across the approaches, as was the number of returns. Thus, we determined that eliminating duplicate suffixes does not influence the performance of query processing for a query that is matched on paths beginning from nodes positioned at the root or upper parts of the RDF graph. Accordingly, the query processing times for Q1, Q2, and Q6 are similar in the first and second approaches. The proposed approach is 50% faster than the other two approaches. This is due to reducing the binary search space using the proposed index structure and replacing the path pattern matching with integer comparisons based on LCP values.

Table 3. Query Processing Results

For the query types Q3, Q4, and Q5, the first approach is 50% faster than the second one. Hence, we know that eliminating duplicate suffixes directly influences query processing performance. Compared to the proposed approach, however, the first approach showed only slightly faster or similar performance. One of the reasons for this is that the number of pattern matches is reduced by removing the duplicate suffixes in the first approach. Thus, both the number of returns and the time needed to exclude the repeated returns from the final result were reduced. However, for query processing with duplicate suffixes, the performance of the proposed approach more than doubled the performance of the second one.

Finally, for the backward path query Q7, the first approach performs better than the others, for the same reason as for the query types Q3, Q4, and Q5. The number of results in the first approach is one, whilst both the second and the proposed approaches return three results. We thus know that these two approaches are more accurate than the first. The proposed approach obtained a higher performance gain than the second one.

5. Conclusion

In this paper, we first introduced the characteristics of the previous indexing and query processing scheme, which uses a suffix array to handle path-based RDF queries. We then proposed two schemes to improve query processing performance: an index structure that reduces the binary search space, and a query evaluation approach that reduces the overhead caused by repeated direct pattern matching. Finally, experimental evaluations demonstrated that the proposed approach improves performance over the previous approach for path-based RDF queries.

6. Acknowledgement

Part of this work was done while the author was a visiting researcher in the Information Systems and Database Group at the University of Waikato, New Zealand.

7. References

[1] W3C, RDF Primer,
[2] W3C, SPARQL Query Language for RDF,
[3] P. Haase, et al., "A Comparison of RDF Query Languages", In the Proc. of the Third International Semantic Web Conference, 2004, pp
[4] A. Matono, et al., "An Indexing Scheme for RDF and RDF Schema based on Suffix Arrays", In the Proc. of the First International Workshop on Semantic Web and Databases (SWDB), Sept. 2003, pp
[5] Baolin Liu and Bo Hu, "Path Queries Based RDF Index", In the Proc. of the First International Conference on Semantics, Knowledge, and Grid (SKG), 2006, pp
[6] William B. Frakes and Ricardo Baeza-Yates, Information Retrieval: Data Structures and Algorithms, Sigma Press,
[7] The Friend of a Friend (FOAF) project,


More information

Towards an Integrated Information Framework for Service Technicians

Towards an Integrated Information Framework for Service Technicians Towards an Integrated Information Framework for Service Technicians Sebastian Bader, Jan Oevermann KIT The Research University in the Helmholtz Association www.kit.edu How it should be: I need to do maintenance

More information

Computing the Longest Common Substring with One Mismatch 1

Computing the Longest Common Substring with One Mismatch 1 ISSN 0032-9460, Problems of Information Transmission, 2011, Vol. 47, No. 1, pp. 1??. c Pleiades Publishing, Inc., 2011. Original Russian Text c M.A. Babenko, T.A. Starikovskaya, 2011, published in Problemy

More information

A Fast and High Throughput SQL Query System for Big Data

A Fast and High Throughput SQL Query System for Big Data A Fast and High Throughput SQL Query System for Big Data Feng Zhu, Jie Liu, and Lijie Xu Technology Center of Software Engineering, Institute of Software, Chinese Academy of Sciences, Beijing, China 100190

More information

Advanced Database Systems

Advanced Database Systems Lecture IV Query Processing Kyumars Sheykh Esmaili Basic Steps in Query Processing 2 Query Optimization Many equivalent execution plans Choosing the best one Based on Heuristics, Cost Will be discussed

More information
