Count Distinct Semantic Queries over Multiple Linked Datasets
|
|
- Homer West
- 6 years ago
- Views:
Transcription
1 c 2018 by the authors; licensee RonPub, Lübeck, Germany. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license ( Open Access Open Journal of Semantic Web (OJSW) Volume 5, Issue 1, ISSN X Count Distinct Semantic Queries over Multiple Linked Datasets Bogdan Kostov, Petr Křemen Department of Cybernetics, Czech Technical University in Prague, Technicka 2, Prague, Czech Republic, {bogdan.kostov, petr.kremen ABSTRACT In this paper, we revise count distinct queries and their semantics over datasets with incomplete knowledge, which is a typical case for the linked data integration scenario where datasets are viewed as ontologies. We focus on counting individuals present in the signature of the ontology. Specifically, we investigate the Certain Epistemic Count (CEC) and the Possible Epistemic Count (PEC) interval based semantics. In the case of CEC semantics, we propose an algorithm for its evaluation and we prove its correctness under a practical constraint of the queried ontology. We conduct and report experiments with the implementation of the proposed algorithm. We also prove decidability of the PEC semantics. TYPE OF PAPER AND KEYWORDS Regular research paper: semantic queries, aggregate queries, conjunctive queries, distinct count, linked data, ontology, incomplete knowledge, certain epistemic count, possible epistemic count 1 INTRODUCTION Reflecting knowledge incompleteness in aggregate queries with their different interpretations (different strategies employed when counting incomplete data) over linked data sets is in its early stages of research, lacking appropriate algorithms, tools as well as benchmark support. We believe that such queries are vital in business intelligence and data quality assessment use-cases where integrated incomplete data is analyzed. We attempt to bring some practical results to this interesting research area. In this paper, we extend the work [12]. We investigate the possibility to implement different interpretations of count distinct queries over ontologies where the Unique Name Assumption 1 (UNA) does not hold, which is a 1 The unique Name Assumption (UNA) is a convenient assumption in which differently named individuals are interpreted as different domain elements. typical case for the linked data integration scenario. We focus on counting individuals present in the signature of the ontology. Specific contributions of the paper include: a novel interpretation labeled possible epistemic count (PEC) for count distinct queries and we prove its decidability, see Section 4, an algorithm implementing the certain epistemic count (CEC) interpretation and proof of its correctness under a practical constraint on the queried ontology, see Section 5, experimental evaluation of the algorithm over artificial data, see Section 6. This paper is organized as follows. In Section 2 we present a motivational use-case for semantic count distinct queries over multiple Linked Data (LD) datasets in the domain of aviation safety. We also discuss the 1
2 Open Journal of Semantic Web (OJSW), Volume 5, Issue 1, 2018 Table 1: Safety event report dataset D 1 from Prague Airport ID Location Incident Time Report Time FN 1 Prague 12h 12h A 2 Prague 11h 11h B 3 - Not-sure 15h20 C 4 - Not-sure 10h D types of incompleteness that we found in the data and conclude by identifying two suitable interpretations of count distinct queries. In Section 3 we formally define the syntax and semantics of OWL 2 - DL ontologies, semantic conjunctive queries and count distinct queries with basic semantics. In Section 4 we define the PEC semantics of count distinct queries and prove its decidability. In Section 5 we propose an algorithm for the CEC semantics of count distinct queries and we prove its soundness and completeness under a practical constraint on the queried ontology. In Section 6 we report the results of experiments done with the proposed algorithm. We experiment with artificial data samples. In Section 7 we present the state of art. Section 8 summarizes our results and discusses future work. 2 MOTIVATIONAL USE-CASES We start this section with a discussion about a simple use-case scenario for semantic counting in the domain of aviation safety. We also show how two types of data incompleteness affect the evaluation of the interpretation of the count distinct aggregate function in different scenarios. One of the strategies in managing aviation safety is through monitoring of reactive indicators. An indicator is typically defined as a statistic in a time frame (a month) over a data set of reported safety events and investigation results. The simplest and also one of the most common indicators are based on count distinct statistics, e.g., the number of bird strike events (e.g., a collision of a bird with an airplane) per month. The precision of this statistics is essential for the efficient and effective performance of safety departments. We identify errors when evaluating these statistics over data sets caused by incompleteness of the collected data. Consider the following example data. Table 1 and 2 show reports of bird strike events collected respectively at two fictive organizations Prague Airport and Brno Airline. Table 3 shows information about individual flights. The column with name FN refers to the flight number. Each table represents an autonomous dataset. Note that the datasets do not contain events but their reports. Based on our experience with the industry we assume Table 2: Safety event report dataset D 2 from Brno Airline ID Location Incident Time Report Time FN 1 Prague 12h 12h20 A 4 - Not-sure 9h10 D that in an autonomous dataset of event reports the identity of reports is used to represent the identity of events and that each event is reported only once. These conditions are achieved by data collection procedures and data management processes. We can see the first incompleteness in the second and third columns, ( Location and Incident Time ) in Table 1 where the values are not always specified. This incompleteness of attribute values occurs naturally. A bird strike event is typically reported by the pilot. The pilot may not perceive the strike at the moment of the occurrence. Hence the time and location of the strike might not be known. As a result, the data collected in the dataset contains incomplete information about the time and location of the incidents. Note also that in case of a bird strike event the time and location of the event are bound and given the time we know the location and vice versa. Consider the indicator defined in Example 1. Example 1. Bird strike indicator Number of bird strikes over Prague airport for a month. Commonly, this indicator is evaluated by counting the number of bird strike reports in the dataset. Evaluating the indicator in this way over the dataset from Table 1 gives a result of 4 and over the dataset obtained by merging the data from Tables 1 and 2 gives 5. Let s investigate and identify the errors which occur using this evaluation strategy. To compute this indicator correctly, the queried dataset needs to comply with the following conditions: unique identification for events and a definite classification of events. Let s first consider evaluation over Table 1. The first condition is satisfied because the identity of events is complete, as discussed earlier. The second condition, however, definite classification, is not met. In this case classifying a bird strike as over Prague airport or not is based on the attribute Location. Assuming that the time and location are dependent attributes and the location of the incident is not known, then given that the incident occurred within a certain interval before the landing, the event can satisfy the condition to be 2
3 B. Kostov, P. Křemen: Count Distinct Semantic Queries over Multiple Linked Datasets Table 3: Dataset describing flights FN Departure Arrival DepTime ArrTime Airline A Prague Brno 10h 10h30 Brno Airlines C Paris Prague 13h 15h D Brno Prague 8h 8h30 Brno Airlines classified as over Prague airport. The problem is, as mentioned earlier, that neither the time nor the location of some reports are known. For example, the incident reported at row 3 in Table 1 may have occurred in Paris and incident reported at row 4 may have occurred in Brno. In this case, the indicator should be evaluated as a range 2..4 which represents the incompleteness present in the data as opposed to the conventional evaluation which would result in a definite answer 4. Let s investigate the evaluation over the merge of datasets in Tables 1 and 2. The second condition is violated because the incompleteness of the attributes Location and Incident Time is inherited from the merged data. Furthermore, the first condition is also violated because as opposed to the autonomous datasets the merged dataset is not managed in the same manner and thus may contain more than one report for a given event. For example, the event reported at first and forth rows in Table 1 are the same events as the ones reported at first and second rows in Table 2. Note that the merged datasets does not contain this information. In this case, a correct evaluation should return the range 2..6 as opposed to the definite value 3 returned by the common evaluation strategy. Instance reconciliation may be used to improve data quality. This process produces a new dataset containing information about which records report the same event. This mapping is implemented by a discriminative attribute which enables grouping of reports of the same event in one group or using relational statements (table1/report#4 owl: sameas table2/report#1). In the first approach, we can use distinct count with UNA interpretation to evaluate correctly the indicator from Example 1. The second case is quite typical for the semantic web. However, the correct implementation of the indicator using distinct count with UNA interpretation needs to be extended to calculate the partitions defined by the owl: sameas statements. In both cases, SPARQL 1.1 can be used to evaluate this indicator correctly. However, in the second case, the query will be more complicated because it will need to calculate the equality partitions induced by the owl: sameas axioms. Because of the added information and the use of a suitable evaluation, this method will return the value 2. 3 PRELIMINARIES In this section, we will define basic terms and notions used in the rest of the paper. We will start with the definition of an ontology followed by the definition of conjunctive queries and aggregate queries with distinct count function. 3.1 Ontology In this paper we consider the description logic language SROIQ(D) for ontology representation. We will use the term ontology as a shorthand of a SROIQ(D) ontology. Next, we will introduce part of the syntax and semantics of the language, relevant to our work. We will also introduce some syntactic sugar used in OWL 2 Web Ontology Language [17, 19], a syntactic variant of SROIQ(D). For a full description of the syntax and semantics of different description logic formalisms see [1]. Also, note that we assume that the SROIQ(D) ontologies should comply with all the necessary syntactic restrictions for which inference problems remain decidable. Definition 1 (Ontology, Selected Syntax and Semantics). An ontology O is a pair S, A, where S is a signature and A is a set of axioms. The semantics of ontologies uses a first order interpretation I = ( I, I), where I is an interpretation domain and I : S is an interpretation function mapping elements from the ontology signature S to elements from the interpretation domain I. An ontology O is satisfied by an interpretation I, denoted by I = O, if all of its axioms are satisfied by the interpretation I, such interpretation I is called a model of O. We say that a set of axioms A is entailed by the ontology O, denoted by O = A, if every model I of the ontology O, is also a model of A, I = A. Let I = ( I, I) be an interpretation. Let R, R 1, R 2,, R n be roles, C, D be concepts, a, b, a 1, a 2,, a n be individuals (all previous part of S) and k is a natural number. Selected role axioms and their interpretations are: I I = Func(R) if x I, y I R I and x I, z I R I implies y I = z I. 3
4 Open Journal of Semantic Web (OJSW), Volume 5, Issue 1, 2018 I = InvFunc(R) if y I, x I R I and z I, x I R I implies y I = z I. Note that these axioms are only syntactic sugar and in SROIQ(D) they are represented using TBox axioms. Selected TBox and ABox axioms and their interpretations are: I = C D if C I D I I = HasKey(C, R 1, R 2,, R n ) if x I, y I C I and w1 I, w2 I,, wn I if x I, w I i R I i and y I, w I i R I i then xi = y I. I = C(a) if a I C I I = R(a, b) if a I, b I R I We consider the following concept expressions and their interpretations. Top concept I = I, bottom concept I =, intersection (C D) I = C I D I, union (C D) I = C I D I, one of ({a 1, a 2,, a n }) I = {a I 1, a I 2,, a I n}, minimum number restrictions ( k RC) I = {x {y x, y R I and y C I } k}, maximum number restrictions ( k RC) I = {x {y x, y R I and y C I } k}. Where S stands for the size of S. Examples 2, 3 and 4 show how we can use ABox axioms to construct an ontology for the datasets from Tables 1, 2 and 3. The ontologies capture the type of the incident (e.g., BirdStrike), its location and during which flight it happened. Other data from the tables is ignored. Furthermore, for the sake of space we abbreviate some of the resources: Location loc; Departure dep; Arrival arr; Prague Prg. In examples 2 and 3, reports are represented as individuals i k and r k and the k corresponds to the identifier from column ID. Flights are represented as individuals using their flight number (i.e., column FN) and are associated with reports using the duringflight property. Note, how the lack of location information in tables 2 and 3 is captured by the ontologies in examples 2 and 3 by omitting assertion axioms of the loc property. Example 2. Ontology (ABox) of Prague Airport Report dataset from Table 1 BirdStrike(i1), loc(i1, P rg), duringflight(i1, A) BirdStrike(i2), loc(i2, P rg), duringflight(i2, B) BirdStrike(i3), duringflight(i3, C) BirdStrike(i4), duringflight(i4, D) Example 3. Ontology (ABox) of Brno Airlines Report dataset from Table 2 BirdStrike(r4), duringflight(r4, D) Example 4. Ontology (ABox) of the Flight dataset from Table 3 Flight(C), dep(c, P aris), arr(c, P rg), Flight(D), dep(d, Brno), arr(d, P rg), Definition 2 (Inference Problems). Let O be an ontology, A be a set of axioms and C, D be concepts. Consistency checking (CC). O is consistent, CC(O) = true, if there is a model I of O, Subsumption. D subsumes C, denotes as C D, w.r.t. O, if O = C D. 3.2 Conjunctive Queries To define distinct count queries we first need to define conjunctive queries. Definition 3 (Conjunctive Query). We will denote conjunctive queries using the following rule like notation Q( x) φ( x, z). (1) The head of the query Q( x) denotes the name of the query and the result variables R var (Q) = x. The body of the query φ( x, z) is a comma-separated list of query atoms interpreted as a conjunctive query. The allowed atoms are SROIQ(D) atoms where we allow for distinguished and non-distinguished variables in places of individuals. The variables z are called non-result variables. Result variables must be distinguished. By V var (Q) we denote the list of all variables in the query. By V d (Q) we denote all distinguished variables. A binding µ : V var (Q) S is a mapping of the variables of the query to elements in the ontology s signature and Q µ is the substitution of the variables in Q by the binding µ. Binding of a tuple of variables is denoted by µ( v) = (µ(v 1 ),..., µ(v k )). M Q,O is the set of all possible bindings of Q w.r.t. O, M Q,O = n k where n is the number of individuals in the signature S of O and k is the number of distinguished variables of Q. We call Q a semi-ground query if there are no distinguished variables in the query. A solution to the query Q w.r.t. the ontology O is a binding µ for which the substitution Q µ is a semi-ground query, the body of which is entailed by the ontology (denoted by O = Q µ ). The set of all solution bindings of query Q w.r.t. O is denoted by Sat O Q = {µ O = Q µ, µ M Q,O }. The result of query Q w.r.t. ontology O denoted by Q O, is a set of bindings of the result variables R var (Q), formally Q O = {R var (µ) µ Sat O Q }. The set of all solution bindings of a query Q w.r.t. an interpretation I is denoted by Sat I Q = {µ I = Q µ, µ M Q,O }. Finally, the result of query Q 4
5 B. Kostov, P. Křemen: Count Distinct Semantic Queries over Multiple Linked Datasets w.r.t. an interpretation I denoted by Q I, is a set of bindings of the result variables R var (Q), formally Q I = {R var (µ) µ Sat I Q }. Note that there are certain syntactic limitations of the query body concerning undistinguished variables. For more detail see [11]. Examples 5 shows a conjunctive query using the notation defined above. which will retrieve all bird strike incidents which occurred over the Prague airport. Example 5. Conjunctive Query to Select Bird Strike Incidents Over Prague Airport, Q 1 (?inc) BirdStrike(?inc), loc(?inc, P rg). 3.3 Count Distinct Queries and Semantics In this section, we define the notion and semantics of the count distinct queries. Particularly, we focus on count distinct queries which retrieve the number of distinct tuples of the result of a conjunctive query Q. We also define some additional terminology and symbols used later in the paper. First, we define the notation of count distinct queries and their conventional semantics. Definition 4 (Count Distinct Query). Let O be an ontology, Q be an conjunctive query with no negation as failure atoms. We denote a distinct count query over O as CD(Q( x), O), where CD is the count distinct function. Q is the conjunctive query the results of which are counted. The result variables x of Q also specifies the counting variables. Conventionally, count distinct queries are evaluated relying on the UNA principle: basic semantics CD(Q( x), O) = (Q O ) Example 6 shows a representation of the indicator introduced in Example 1 using the notion defined above. Example 6. Count Distinct Query with CEC Semantics CD(Q 1 (?inc) BirdStrike(?inc), loc(?inc, P rg), O). Note that in the Definition 6 we count the distinct result bindings Q O. Counting requires comparing bindings. We can view a binding as a set of variablesymbol mapping pairs which can be compared. For example, let binding µ 1 = {(x, a)} is different from binding µ 2 = {(x, b)} and equal to binding µ 3 = {(x, a)}. We denote interpretation of bindings as µ I 1 = {(x, a)} I = {(x I, a I )}. When interpreted bindings may compare differently. For example, if a I = b I then µ I 1 = µ I 2. Next we define certain epistemic count (CEC) semantics of count distinct queries. Definition 5 (Certain Epistemic Count). A count distinct query with CEC semantics is represented as CD C (Q( x), O). The C in the superscript denotes the semantics mode, i.e., CEC. We define the CEC semantics as follows: CD C (Q, O) = min(n C Q,O ), max(nc Q,O ) where N C Q,O is the set of different sizes of certain answers of Q over O in different models I of O, N C Q,O = { ((Q O ) I ) I = O}. Example 7 shows the representation and the result of the indicator introduced in Example 1 using the notion and semantics defined above. Example 7. Count Distinct Queries CD C (Q 1(?inc) BirdStrike(?inc), loc(?inc, P rg), O). Evaluating the query with the CEC semantics over the merged datasets from Tables 1 and 2 will result in CD C (Q 1, O) = 2, 3. This is because there are at most three incidents reported over Prague but two of them might be the same. 4 POSSIBLE EPISTEMIC COUNT SEMANTICS In this section, we will define the Possible Epistemic Count (PEC) semantics for distinct count queries. Definition 6 (Possible Epistemic Count). A count distinct query with PEC semantics is represented as CD P (Q( x), O). The P in the superscript denotes the semantics mode, i.e., PEC. We define the PEC semantics as follows: CD P (Q, O) = min(n P Q,O ), max(np Q,O ), where N P Q,O is the set of different sizes of certain answers of Q over O in different models I of O, N P Q,O = { ((Q O ) I ) I = O}. Example 8 shows the representation and the result of the indicator introduced in Example 1 using the notion and semantics defined above. Example 8. Count Distinct Query with PEC Semantics CD P (Q 1(?inc) BirdStrike(?inc), loc(?inc, P rg), O). Evaluating the query with the PEC semantics over the merged datasets from Tables 1 and 2 will result in CD C (Q 1, O) = 2, 6. This is because there are potentially at most 6 incidents reported over Prague. Three of those reports are certainly over Prague. However, two of them might be the same, therefore the minimum is again 2. 5
6 Open Journal of Semantic Web (OJSW), Volume 5, Issue 1, 2018 Certain and possible answers correspond respectively to the Semantic Tuple Count and Semantic Count introduced in [12], where we also prove decidability of the former. Here we prove that CD P (Q, O) is decidable. To prove this we define an alternative to N P Q,O which we denote as N and then we prove that it is decidable and equivalent to N P Q,O. The definition of the PEC interpretation requires to evaluate the number of distinct bindings in each model of the ontology. This is a problem because the ontology may have infinitely many models. Each model I contributes to the set N P Q,O with a number ((R var (Sat I Q ))I ). The evaluation of that number relies solely on the results of the query Q over the model I, and the equivalence relation of the symbols in the query result encoded in the model. Note that although there are infinite number models of the ontology the results of the query Q in any of those models belongs to the powerset of the set of all bindings M Q,O, that is there finitely many result sets being counted. The equality relation can also be thought of as a partitioning of the elements in the signature S. Note that the set E O of all different partitionings encoded in all the models of the ontology is also finite. Before we define the set N, we introduce some auxiliary notions. Given M, a potential set of result bindings, we can encode them as a set of axioms as follows A M = O {Q µ µ M}. Given E, a potential partitioning of elements in S we can encode them as a set of axioms as follows A E = e E {a =. b a, b e} e {a b a e 1,e 2 E 1, b e 2 }. Given M and E, we define the ontology extension O(M, E) as follows O(M, E) = O A M A E. Let L = {(M, E) M M Q,O, EinE O } be the set of all pairs of M (potential result bindings composed of elements from the signature S) and E (potential partitioning of those elements). We define the alternative definition of N P Q,O as follows. Definition 7 (Alternative Definition of N P Q,O ). N = { (R var (M) E ) O(M, E) is consitent Sat O(M,E) Q = M, (M, E) L}. Lemma 1. Calculating N is decidable. Proof. To calculate N we need to iterate over the finite set L. For each element, (M, E) L, we construct O(M, E), an extension of the original ontology. O(M, E) is checked for consistency and whether it satisfies the condition Sat O(M,E) Q = M. If it does, we count the number of interpreted bindings. Each of these steps is decidable. Therefore, calculating the set N is decidable. Lemma 2. The set N is equivalent to the set N P Q,O. Proof. First, we show that N N P Q,O. Let n be a number in N, then n = ((R var (M)) E ). We know that the ontology O(M, E) is consistent and that it satisfies the condition Sat O(M,E) Q = M. We can deduce that there exists a model I = O(M, E), such that M E = (Sat I Q )I. Therefore, n N P Q,O. Second, we show that N P Q,O N. Each number n in N P Q,O is computed from the interpreted binding set (Sat I Q )I w.r.t a model I of the ontology. Note that M = Sat I Q belongs to the set M Q,O. Also, note that the partitioning E of elements in the signature of ontology O induced by I is an element from E O. Then, the extended ontology O(M, E ) is consistent and it satisfies the condition Sat O(M,E) Q that n N. Therefore, N = N P Q,O = M which implies Theorem 1. Let O be an ontology, Q be a conjunctive query. Then calculating the result of CD P (Q, O) is decidable. Proof. The decidability of CD P (Q, O) is a direct consequence of Lemmas 1, 2. 5 IMPLEMENTATION OF CEC SEMANTICS FOR COUNT DISTINCT QUERIES In this section, we present the implementation of the CEC semantics of count distinct queries described in Section 3.3. The implementation takes in an ontology O and the underlying conjunctive query Q with result variables as counting variables. The implementation relies on a reasoner which supports conjunctive query answering and consistency checking. Counting over multiple datasets is possible if the datasets are merged, and the schema (ontology) is explicitly included. The resulting dataset, i.e., ontology with TBox, RBox, and ABox, can be used as an input for the counting procedure. The idea behind the implementations is to look for numbers which belong to the result by restricting the possible models of the input ontology and checking whether the restriction is satisfiable. The algorithm is split into two procedures. First, in Algorithm 1 we define a procedure CDCSingleColumn(T, O) which calculates the count distinct function with CEC semantics over a set of single variable (single column) bindings. The input T is a list of elements from the signature of O. The procedure looks for the minimum and maximum by iteratively adding axioms to the ontology. We construct the ontology O i to narrow down its models to those containing at most (respectively at least) i many partitions induced by the equality relation of the 6
7 B. Kostov, P. Křemen: Count Distinct Semantic Queries over Multiple Linked Datasets Algorithm 1: Evaluation of count distinct with single counting variable 1 CDCSingleColumn(T, O = S, A ) 2 / / p, x, C / S 3 O := O {p(x, t) t T } 4 a, b := 1 5 / / find minimum 6 f o r i:=n t o 1 do 7 O i = O {( i p)(x)} 8 i f CC(O i) = false 9 a := i b r e a k 11 / / find maximum 12 O := O {C {t 1} {t 2}... {t n}}} 13 f o r i:=1 t o n do 14 O i = O {( i p C)(x)} 15 i f CC(O i) = f a l s e 16 b := i 1 17 b r e a k 18 r e t u r n (a, b) counted elements encoded in those models. In this implementation, we use number restrictions to achieve this effect. A similar algorithm can be designed using different SROIQ(D) constructs. Algorithm 2 shows the implementation of the CEC semantics of count distinct queries CD C (Q, O). The algorithm makes use of the conjunctive query answering service of the used reasoner. In the case where the query Q has more than one result variables, the algorithm transforms the problem to a single column problem. The single column problem is evaluated in both cases by calling the procedure CDCSingleColumn(T, O) from Algorithm 1. The transformation of multi to single column counting is necessary because in SROIQ(D) there are no constructs to apply the same restrictions used in the single column case. The idea of the transformation is to create a one-to-one mapping between fresh individuals, one for each result row, with the elements of that row. This is achieved by modeling columns as bijective relations, using HasKey and Func axioms and associating through the column relations, each fresh individual corresponding to a result binding with the values of the bound variables. 5.1 Practical Correctness of CDC In this section, we prove the correctness of Algorithm 2. First, we define the required practical restrictions of the queried ontology. Definition 8 (Ontology Restrictions). Let S be the set of elements composing the counted bindings. An ontology Algorithm 2: Evaluation of count distinct with multiple counting variables 1 CDC(Q, O) 2 / / T is a multiset of n k-sized tuples 3 / / of individuals from S 4 T := evalquery(q, O) 5 i f R var(q) > 1 / / k > 1, tuple counting 6 O := O 7 {HasKey(A, c 1, c 2,, c k )} 8 {Func(c 1), Func(c 2),, Func(c k )} 9 {A(r i), c 1(r i, t i,1),, c k (r i, t i,k ) i 1, n } 10 T := {r 1, r 2,, r n} 11 r e t u r n CDCSingleColumn(T, O ) 12 e l s e / / k = 1, element counting 13 r e t u r n CDCSingleColumn(T, O) can be queried using Algorithm 2 if any of the following axioms are not entailed by the queried ontology. (s), s S The first restriction allows us to add fresh properties and associate fresh individuals with them. The second restriction allows us to associate the counted elements with other individuals using fresh properties. Note that these restrictions will be satisfied in most practical ontologies. For example, if an ontology entails the first axiom in Definition 8, it will have only models in which no individual is related to any other individual. In the case of our running example of incident reports, using such an ontology will not allow to associate incidents with flights. The second axiom in Definition 8 has a similar impact as the first one. Specifically, it restricts adding new relations to the counted elements. Next, we prove the correctness of Algorithm 2. The proof is split into two parts. Lemma 3 shows the correctness of the procedure shown in Algorithm 1. Lemma 4 shows the correctness of the transformation of counting multiple columns to counting a single column. Note that both proofs consider that the input ontology O fulfills the restrictions in Definition 8. Lemma 3. Algorithm 1 is a correct procedure of CD C (Q, O) considering the input ontology O fulfills the restrictions in Definition 8. Proof. Let a, b = CD C (Q, O) and let a, b = CDSingleColumn(, O) where T = Q O. We show that a = a and b = b. Here we will show the proof for a = a. The proof for b = b is similar. First, we prove that a a. The set axioms in O is a subset of the axioms in O i ontology. By definition, there exists a 7
8 Open Journal of Semantic Web (OJSW), Volume 5, Issue 1, 2018 model I of O a, I = O a. Because of the monotonicity property of SROIQ(D) ontologies, I is also a model of O. The number n = ((Q O ) I ) is greater or equal to a and it is equal to the number a. Therefore a a. Now we prove that a a. This is equivalent to the statement that for any I model of O, i = ((Q O ) I ), there exists a model I of O i, where (T I ) = (T I ). We construct I from I as follows: z I := z I, z S, x I := a, where a is an arbitrary element from I ; p I = {(a, t 1 ), (a, t 2 ),, (a, t n )}. It is easy to show that I is a model of O as it has the same domain and mappings as I. The following implies by the construction of I : (T I ) = (T I ); I = {p(x, t) t T }. Also I = ( i p)(x) which is implied by the interpretation conditions of the class expression i p, see Definition 1, and the equality {y (x, y) p I and y I } = {t I i (a, ti i ) pi and t i T } = (T I ) = (T I ) = i. Therefore I = O i and T I = T I. Lemma 4. The transformation from multi to single column counting in Algorithm 2 is a practically correct procedure for computing CD C (Q, O). Proof. Counting the set of interpreted row individuals is equivalent to counting the set of interpreted result bindings if there is a one-to-one correspondence between the interpreted rows and interpreted result bindings for each model of the ontology. Let I be a model of the ontology O. To prove the correspondence, we first show that there is a unique interpreted result binding t I i, t i T for each interpreted row r i T. Because of the functional restrictions on the column properties c 1, c 2,, c k, the interpreted row ri I is restricted to have at most one value per column property. Therefore, there is exactly one interpreted unique tuple t I i associated with ri i. Second, we prove that each interpreted tuple t I i is associated with exaclty one interpreted row ri I. In this case, the HasKey axiom restricts each unique t I i to be associated with exactly one interpreted row ri I. This proves the one-to-one correspondence. Because there is a one-to-one correspondence between the interpreted rows and interpreted result bindings for each model of the ontology, the proposed transformation from multi to single column counting is correct. Theorem 2. The CDC(Q, O) Algorithm 2 is a correct procedure of CD C (Q, O). Proof. The proof of this theorem is a consequence of Lemmas 3 and 4. 6 EXPERIMENTS In this section, we report the results of our experiments with Algorithm 1 which is the core procedure of the implementation of the CEC semantics. In the experiment, we are counting a set of individuals T described in an OWL 2 ontology. The goal of the experiments is to form an initial assessment of the practicality of the algorithm. In our first tests we experimented with small sizes of T, e.g., 1 to 10. We found that, as expected the bottleneck of the procedure is the consistency check operation used on lines 8 and 15 in Algorithm 1. Note that the complexity of the algorithm depends on the complexity of the consistency check procedure. In the case of SROIQ(D) reasoner the complexity is known to be NEXPTIME. We tried to optimize this operation. In general, using an arbitrary reasoner for consistency checking has the limitation that in each iteration the consistency checking must start from scratch. Note that both loops on lines 6 and 13 in Algorithm 1 iterate over the restriction of the count from the general to the more specific. Theoretically, because of the monotonicity property of SROIQ(D), using incremental reasoner may improve performance. In practice, the improvement gained from switching to an incremental reasoner was negligible compared to another factor. The last consistency check in the iteration process (the one that should fail) may take a very long time even for a small number of elements, e.g., 10 is that. This is because the reasoner must check all possible combinations of partitionings of the counted elements. The number of possible partitionings with k non-empty partitions of an n element set is given by the Stirling number s(n, k). The following examples demonstrate the problem evaluating an inconsistent O i ontology, e.g. s(10, 8) = 750, s(20, 16) = This observation led us to another optimization idea, where we attempt to recognize the last consistency check based on how long it takes to terminate. We implemented a consistency check timeout strategy in which we terminate the cycles at lines 6 and 13 if the consistency check takes longer than the total time spent on that operation during the execution. Note that this strategy has an impact on correctness. Because the last consistency check may be falsely identified, the results might be incomplete. Soundness is preserved in the sense that the calculated interval will be contained in CD C (Q, O). For example, if the correct result of a count distinct query is CD C (Q, O) = 100, 150 and the timeout strategy fails it will always return an interval contained within the correct one, e.g., CDCSingleColumn(T, O) = 125,
9 B. Kostov, P. Křemen: Count Distinct Semantic Queries over Multiple Linked Datasets Figure 1: Execution time dependence on size of T Figure 1 shows the results of Algorithm 1 using an incremental reasoner and the consistency timeout strategy described above. We evaluated 4 queries (i.e., 4 different sizes of T {20, 40, 100, 180}) over randomly generated artificial datasets. Each query was executed 100 times. Figure 1 shows the median, the 90- and 100- percentiles of the times that was needed to compute the query in each of the executions. We observed the expected increase of execution time with the size of T. We also note that most of the executions were under a minute long. There were some cases for which the execution took nearly two and a half minute. The timeout strategy improved performance dramatically. Nevertheless, more experiments are needed to improve the performance of the algorithm and to check how the correctness of the results was compromised. The queries were executed against randomly generated artificial datasets. The structure of each dataset D i consists of individuals whose identity is complete in D i. Each dataset partitions the set of individuals to n disjoint classes from the global schema. The merged dataset contains incomplete identity of all the individuals from all the datasets. The datasets were composed of up to 200 individuals and a random number (from 4 up to 150) of pairwise disjoint classes. The synthetic data over which we conduct our experiments try to capture the motivational example from Section 2. The common ontology is simulated by the disjointwith axiom between classes used to annotate the individuals. The level of matching between datasets of event reports D 1 and D 2 is captured by differentfrom axioms between the individuals. The experiments were conducted with the Pellet reasoner [23] on a Dell laptop with Intel Core i7 processor and 8 GB memory. 7 STATE OF ART In recent years, expressive query languages, like SPARQL-DL [24], OWL-SAIQL [13], SQWRL [18] for OWL 2 [25] or SeRQL [2], SPARQL [20] for RDF, have been introduced and implemented in the field of semantic web. There have been deep studies [7, 22, 14, 5, 10] evaluating conjunctive queries in RDF and OWL, but few efforts have been spent on an algebra, as well as aggregation functions. Recently the RDF query language SPARQL [20] has been extended towards the new SPARQL [6] including many new constructs, e.g. aggregation functions or negation as failure. There are already publicly available implementations, e.g. ARQ [8], KGRAM [4], RDF::Query [26] or RDF4J [15] (previously known as Sesame [9]). Independently on SPARQL 1.1 the authors of [21] discuss the topic of aggregation over data structured as RDF graphs rather than on the relational data returned by the query (the result set table). The authors stress the need of different modes of aggregation. A few years ago the SQWRL query language for OWL [18] was proposed. All these efforts implement or interpret aggregation using the basic relational semantics, i.e. the result of aggregation functions is computed over the results of the nonaggregate variant of the query and neglect the impact of the interplay between aggregation functions and the inferred domain elements. More closely related to our study is the work [3]. It shows that exact semantics, of aggregate queries over a DL-Lite A ontology returns results only if the Abox is not empty and if axioms in the Tbox resolve as constraints over cardinalities of the groups in the aggregate query. The authors propose epistemic semantics of aggregate 2 SPARQL 1.1 is a W3C recommendation since March
10 Open Journal of Semantic Web (OJSW), Volume 5, Issue 1, 2018 queries in the settings of data integration use-cases returning the aggregate function s results known from (inferred by) the ontology. The authors propose an evaluation algorithm for a sub set of aggregate queries, i.e. restricted epistemic aggregate queries, defined using functional dependency of query variables w.r.t. the Tbox of the queried ontology. Compared to their work we investigate different epistemic semantics of count distinct queries in case the Unique Name Assumption does not hold, which is a typical case for the linked data integration scenario. Although not exactly in the same framework of query answering like in this paper, the authors in [16] propose an approach to model/represent quantification over types without actually using explicit numerical information using an ontology design pattern. In comparison, we focus on answering count distinct queries over datasets containing incomplete knowledge. In [12], a novel interpretation of count distinct queries, called Semantic Tuple Count (STC), was introduced and it was shown that the evaluation of the STC interpretation of count distinct queries is decidable in SROIQ(D). Also, STC is compared to three other known interpretations Basic Count (conventional interpretation based on UNA), Semantic Count (a form of Certain Answers for count distinct queries defined on an interval of possible counts), and Epistemic Count (defined as the minimum of the Semantic Count interpretation) 3. The term STC is equivalent to our term Certain Epistemic Count (CEC). 8 CONCLUSION AND FUTURE WORK This work presened in the paper extends our previous work [12] with a new semantics for count distinct queries possible epistemic count (PEC) and a proof of its decidability. In this paper, we also propose an implementation of the certain epistemic count (CEC) semantics of count distinct queries over OWL 2 DL ontologies. We design and implement a reasoner-backed meta-algorithm and proved its correctness assuming some practical constraints on the queries ontology. We also conducted some experiments over artificial, randomly generated data. During the experiments we verified that evaluating the first inconsistency using Algorithm 1 is not tractable. This was expected but poses challenges for future research. We report performance improvements when applying two optimization techniques, i.e., using an incremental reasoner and a time-out strategy, see Section 6. We plan to extend experiments so that we can gain 3 Note, that comparing to [12] our definition of Epistemic Count is different. a better understanding of its performance characteristics. We plan to design and investigate possible optimizations. The results of the work on count distinct queries can be used for the validation of cardinality constraints. We plan to spend time for investigating possibilities of their validation. ACKNOWLEDGEMENTS This work was supported by grant No. GA S Efficient Exploration of Linked Data Cloud of the Grant Agency of the Czech Republic and by grant No.SGS16/229/OHK3/3T/13 Supporting ontological data quality in information systems of the Czech Technical University in Prague. REFERENCES [1] F. Baader, D. Calvanese, D. L. McGuinness, D. Nardi, and P. F. Patel-Schneider, The description logic handbook: theory, implementation, and applications, F. Baader, D. Calvanese, D. L. McGuinness, D. Nardi, and P. F. Patel-Schneider, Eds. Cambridge University Press, [2] J. Broekstra and A. Kampman, An RDF query and transformation language, in Semantic Web and Peer-to-Peer, S. Staab and H. Stuckenschmidt, Eds., 2006, pp [3] D. Calvanese, E. Kharlamov, W. Nutt, and C. Thorne, Aggregate queries over ontologies, in ONISW, 2008, pp [4] O. Corby, Kgram: a knowledge graph abstract machine, Mar. 2012, [5] B. Glimm, I. Horrocks, C. Lutz, and U. Sattler, Conjunctive query answering in the description logic SHIQ, in Proceedings of the 20th International Joint Conference on Artificial Intelligence, [6] S. Harris and A. Seaborne, SPARQL 1.1 query language, W3C, W3C Recommendation, Mar. 2013, query /. [7] I. Horrocks, O. Kutz, and U. Sattler, The even more irresistible sroiq, in KR, P. Doherty, J. Mylopoulos, and C. A. Welty, Eds. AAAI Press, 2006, pp [8] A. Jena, ARQ - A SPARQL Processor for Jena, Apr. 2011, 10
11 B. Kostov, P. Křemen: Count Distinct Semantic Queries over Multiple Linked Datasets [9] A. Kampman, C. Fluit, and J. Broekstra, Sesame, Jan. 2013, [10] I. Kollia, B. Glimm, and I. Horrocks, Query answering over SROIQ knowledge bases with SPARQL, in Proceedings of the International Workshop on Description Logic, [11] B. Kostov and P. Kremen, Count aggregation in semantic queries - technical report, Czech Technical University in Prague, Dept. of Cybernetics, Tech. Rep., [12] B. Kostov and P. Křemen, Count aggregation in semantic queries, International Workshop on Scalable Semantic Web Knowledge Base Systems co-located the International Semantic Web Conference (SSWS@ISWC), [13] E. Kubias, S. Schenk, S. Staab, and J. Z. Pan, OWL SAIQL - an OWL DL query language for ontology extraction, in W3C web ontology language (OWL) - experiences and directions workshop, [14] P. Křemen and Z. Kouba, Conjunctive query optimization in OWL2-DL, in Proceedings of the 22th International Conference on Database and Expert System Applications, ser. LNCS, vol. 6861, [15] J. Leigh, Rdf4j released, June 2017, [16] D. C. Martínez, K. Janowicz, and P. Hitzler, A logical geo-ontology design pattern for quantifying over types, in SIGSPATIAL/GIS, 2012, pp [17] B. Motik, B. C. Grau, and P. Patel-Schneider, OWL 2 web ontology language direct semantics (second edition), W3C, W3C Recommendation, Dec. 2012, [18] M. J. O Connor and A. K. Das, SQWRL: A query language for OWL, in OWLED, [19] B. Parsia, P. Patel-Schneider, and B. Motik, OWL 2 web ontology language structural specification and functional-style syntax (second edition), W3C, W3C Recommendation, Dec. 2012, syntax /. [20] E. Prud hommeaux and A. Seaborne, SPARQL query language for RDF, W3C, W3C Recommendation, Jan. 2008, cit [21] D. Y. Seid and S. Mehrotra, Grouping and aggregate queries over semantic web databases, in ICSC, 2007, pp [22] E. Sirin and B. Parsia, Optimizations for Answering Conjunctive ABox Queries, in Description Logics, ser. CEUR, vol. 189, [23] E. Sirin, B. Parsia, B. C. Grau, A. Kalyanpur, and Y. Katz, Pellet: a practical OWL-DL reasoner, J. Web Sem., vol. 5, no. 2, pp , [24] E. Sirin and B. Parsia, SPARQL-DL: SPARQL query for OWL-DL, in 3rd OWL Experiences and Directions Workshop (OWLED-2007), [25] W3C, OWL 2 web ontology language document overview (second edition), W3C, W3C Recommendation, Dec. 2012, /. [26] G. T. Williams, RDF::Query - a complete SPARQL 1.1 query and update implementation for use with RDF::Trine, Nov. 2012, AUTHOR BIOGRAPHIES Bogdan Kostov is a Ph.D. student in the field of artifical intelligence and biocybernetics at the Czech Technical University in Prague, Czech Republic. His research topics are ontology development, semantic query answering that he applied in the domains of cultural heritage and aviation safety. Dr. Petr Křemen received his Ph.D. degree in artifical intelligence and biocybernetics from the Czech Technical University in Prague, Czech Republic. He leads a research team at the Department of Cybernetics, Faculty of Electrical Engineering, Czech Technical University, Prague in the field of ontology-based information systems, ontology development, ontology comparison, error explanation and query answering. He is an author of more than 50 peer-reviewed articles, mainly on international fora. 11
9th International Workshop on Scalable Semantic Web Knowledge Base Systems (SSWS 2013)
9th International Workshop on Scalable Semantic Web Knowledge Base Systems (SSWS 2013) At the 12th International Semantic Web Conference (ISWC2013), Sydney, Australia, October, 2013 SSWS 2013 PC Co-chairs
More informationRELATIONAL REPRESENTATION OF ALN KNOWLEDGE BASES
RELATIONAL REPRESENTATION OF ALN KNOWLEDGE BASES Thomas Studer ABSTRACT The retrieval problem for a knowledge base O and a concept C is to find all individuals a such that O entails C(a). We describe a
More informationRepresenting Product Designs Using a Description Graph Extension to OWL 2
Representing Product Designs Using a Description Graph Extension to OWL 2 Henson Graves Lockheed Martin Aeronautics Company Fort Worth Texas, USA henson.graves@lmco.com Abstract. Product development requires
More informationExtracting Finite Sets of Entailments from OWL Ontologies
Extracting Finite Sets of Entailments from OWL Ontologies Samantha Bail, Bijan Parsia, Ulrike Sattler The University of Manchester Oxford Road, Manchester, M13 9PL {bails,bparsia,sattler@cs.man.ac.uk}
More informationOn the Reduction of Dublin Core Metadata Application Profiles to Description Logics and OWL
On the Reduction of Dublin Core Metadata Application Profiles to Description Logics and OWL Dimitrios A. Koutsomitropoulos High Performance Information Systems Lab, Computer Engineering and Informatics
More informationAppendix 1. Description Logic Terminology
Appendix 1 Description Logic Terminology Franz Baader Abstract The purpose of this appendix is to introduce (in a compact manner) the syntax and semantics of the most prominent DLs occurring in this handbook.
More informationAppendix 1. Description Logic Terminology
Appendix 1 Description Logic Terminology Franz Baader Abstract The purpose of this appendix is to introduce (in a compact manner) the syntax and semantics of the most prominent DLs occurring in this handbook.
More informationA platform for distributing and reasoning with OWL-EL knowledge bases in a Peer-to-Peer environment
A platform for distributing and reasoning with OWL-EL knowledge bases in a Peer-to-Peer environment Alexander De Leon 1, Michel Dumontier 1,2,3 1 School of Computer Science 2 Department of Biology 3 Instititute
More informationOWL extended with Meta-modelling
OWL extended with Meta-modelling Regina Motz 1, Edelweis Rohrer 1, Paula Severi 2 and Ignacio Vidal 1 1 Instituto de Computación, Facultad de Ingeniería, Universidad de la República, Uruguay 2 Department
More informationEvaluating OWL 2 Reasoners in the Context Of Checking Entity-Relationship Diagrams During Software Development
Evaluating OWL 2 Reasoners in the Context Of Checking Entity-Relationship Diagrams During Software Development Alexander A. Kropotin Department of Economic Informatics, Leuphana University of Lüneburg,
More informationPresented By Aditya R Joshi Neha Purohit
Presented By Aditya R Joshi Neha Purohit Pellet What is Pellet? Pellet is an OWL- DL reasoner Supports nearly all of OWL 1 and OWL 2 Sound and complete reasoner Written in Java and available from http://
More informationIntegrating SysML and OWL
Integrating SysML and OWL Henson Graves Lockheed Martin Aeronautics Company Fort Worth Texas, USA henson.graves@lmco.com Abstract. To use OWL2 for modeling a system design one must be able to construct
More informationLogical reconstruction of RDF and ontology languages
Logical reconstruction of RDF and ontology languages Jos de Bruijn 1, Enrico Franconi 2, and Sergio Tessaris 2 1 Digital Enterprise Research Institute, University of Innsbruck, Austria jos.debruijn@deri.org
More informationA Knowledge Compilation Technique for ALC Tboxes
A Knowledge Compilation Technique for ALC Tboxes Ulrich Furbach and Heiko Günther and Claudia Obermaier University of Koblenz Abstract Knowledge compilation is a common technique for propositional logic
More informationTractable Extensions of the Description Logic EL with Numerical Datatypes
Proc. 23rd Int. Workshop on Description Logics (DL2010), CEUR-WS 573, Waterloo, Canada, 2010. Tractable Extensions of the Description Logic EL with Numerical Datatypes Despoina Magka, Yevgeny Kazakov,
More informationTractable Extensions of the Description Logic EL with Numerical Datatypes
Tractable Extensions of the Description Logic EL with Numerical Datatypes Despoina Magka, Yevgeny Kazakov, and Ian Horrocks Oxford University Computing Laboratory Wolfson Building, Parks Road, OXFORD,
More informationInformative Armstrong RDF Datasets for n-ary Relations
H. Harmse et al. / Informative Armstrong RDF Datasets for n-ary Relations Henriette HARMSE a, Katarina BRITZ b and Aurona GERBER a a CSIR CAIR, Department of Informatics, University of Pretoria, Pretoria,
More informationTrOWL: Tractable OWL 2 Reasoning Infrastructure
TrOWL: Tractable OWL 2 Reasoning Infrastructure Edward Thomas, Jeff Z. Pan, and Yuan Ren Department of Computing Science, University of Aberdeen, Aberdeen AB24 3UE, UK Abstract. The Semantic Web movement
More informationDRAOn: A Distributed Reasoner for Aligned Ontologies
DRAOn: A Distributed Reasoner for Aligned Ontologies Chan Le Duc 1, Myriam Lamolle 1, Antoine Zimmermann 2, and Olivier Curé 3 1 LIASD Université Paris 8 - IUT de Montreuil, France {chan.leduc, myriam.lamolle}@iut.univ-paris8.fr
More informationLearning Probabilistic Ontologies with Distributed Parameter Learning
Learning Probabilistic Ontologies with Distributed Parameter Learning Giuseppe Cota 1, Riccardo Zese 1, Elena Bellodi 1, Fabrizio Riguzzi 2, and Evelina Lamma 1 1 Dipartimento di Ingegneria University
More informationA Tool for Storing OWL Using Database Technology
A Tool for Storing OWL Using Database Technology Maria del Mar Roldan-Garcia and Jose F. Aldana-Montes University of Malaga, Computer Languages and Computing Science Department Malaga 29071, Spain, (mmar,jfam)@lcc.uma.es,
More informationPrinciples of Knowledge Representation and Reasoning
Principles of Knowledge Representation and Semantic Networks and Description Logics II: Description Logics Terminology and Notation Albert-Ludwigs-Universität Freiburg Bernhard Nebel, Stefan Wölfl, and
More informationUpdating data and knowledge bases
Updating data and knowledge bases Inconsistency management in data and knowledge bases (2013) Antonella Poggi Sapienza Università di Roma Inconsistency management in data and knowledge bases (2013) Rome,
More informationLocal Closed World Reasoning with OWL 2
Local Closed World Reasoning with OWL 2 JIST 2011 Tutorial Jeff Z. Pan Department of Computing Science University of Aberdeen, UK Agenda 1. Brief introduction to Ontology and OWL 2 (10m) 2. Open vs. Closed
More informationOptimised Classification for Taxonomic Knowledge Bases
Optimised Classification for Taxonomic Knowledge Bases Dmitry Tsarkov and Ian Horrocks University of Manchester, Manchester, UK {tsarkov horrocks}@cs.man.ac.uk Abstract Many legacy ontologies are now being
More informationExtending the SHOIQ(D) Tableaux with DL-safe Rules: First Results
Extending the SHOIQ(D) Tableaux with DL-safe Rules: First Results Vladimir Kolovski and Bijan Parsia and Evren Sirin University of Maryland College Park, MD 20740 kolovski@cs.umd.edu bparsia@isr.umd.edu
More informationSPARQL-DL. Czech Technical University in Prague (CZ) April 2008
SPARQL-DL Petr Křemen Czech Technical University in Prague (CZ) April 2008 What is SPARQL-DL Different Perspectives SPARQL-DL constructs SoA Pellet Query Engine SPARQL-DL Evaluation Preprocessing Evaluation
More informationMastro Studio: a system for Ontology-Based Data Management
Mastro Studio: a system for Ontology-Based Data Management Cristina Civili, Marco Console, Domenico Lembo, Lorenzo Lepore, Riccardo Mancini, Antonella Poggi, Marco Ruzzi, Valerio Santarelli, and Domenico
More informationProbabilistic Information Integration and Retrieval in the Semantic Web
Probabilistic Information Integration and Retrieval in the Semantic Web Livia Predoiu Institute of Computer Science, University of Mannheim, A5,6, 68159 Mannheim, Germany livia@informatik.uni-mannheim.de
More informationOWLS-SLR An OWL-S Service Profile Matchmaker
OWLS-SLR An OWL-S Service Profile Matchmaker Quick Use Guide (v0.1) Intelligent Systems and Knowledge Processing Group Aristotle University of Thessaloniki, Greece Author: Georgios Meditskos, PhD Student
More informationDRAOn: A Distributed Reasoner for Aligned Ontologies
DRAOn: A Distributed Reasoner for Aligned Ontologies Chan Le Duc 1, Myriam Lamolle 1, Antoine Zimmermann 2, and Olivier Curé 3 1 LIASD Université Paris 8 - IUT de Montreuil, France {chan.leduc, myriam.lamolle}@iut.univ-paris8.fr
More informationPropositional Logic. Part I
Part I Propositional Logic 1 Classical Logic and the Material Conditional 1.1 Introduction 1.1.1 The first purpose of this chapter is to review classical propositional logic, including semantic tableaux.
More informationAn Architecture for Semantic Enterprise Application Integration Standards
An Architecture for Semantic Enterprise Application Integration Standards Nenad Anicic 1, 2, Nenad Ivezic 1, Albert Jones 1 1 National Institute of Standards and Technology, 100 Bureau Drive Gaithersburg,
More informationJoint Entity Resolution
Joint Entity Resolution Steven Euijong Whang, Hector Garcia-Molina Computer Science Department, Stanford University 353 Serra Mall, Stanford, CA 94305, USA {swhang, hector}@cs.stanford.edu No Institute
More informationA Tableau-based Federated Reasoning Algorithm for Modular Ontologies
A Tableau-based Federated Reasoning Algorithm for Modular Ontologies Jie Bao 1, Doina Caragea 2, Vasant Honavar 1 1 Artificial Intelligence Research Laboratory, Department of Computer Science, Iowa State
More informationTHE DESCRIPTION LOGIC HANDBOOK: Theory, implementation, and applications
THE DESCRIPTION LOGIC HANDBOOK: Theory, implementation, and applications Edited by Franz Baader Deborah L. McGuinness Daniele Nardi Peter F. Patel-Schneider Contents List of contributors page 1 1 An Introduction
More informationOpening, Closing Worlds On Integrity Constraints
Opening, Closing Worlds On Integrity Constraints Evren Sirin 1, Michael Smith 1, Evan Wallace 2 1 Clark & Parsia LLC, Washington, DC, USA {evren,msmith}@clarkparsia.com 2 National Institute of Standards
More informationConsistency and Set Intersection
Consistency and Set Intersection Yuanlin Zhang and Roland H.C. Yap National University of Singapore 3 Science Drive 2, Singapore {zhangyl,ryap}@comp.nus.edu.sg Abstract We propose a new framework to study
More informationScalability via Parallelization of OWL Reasoning
Scalability via Parallelization of OWL Reasoning Thorsten Liebig, Andreas Steigmiller, and Olaf Noppens Institute for Artificial Intelligence, Ulm University 89069 Ulm, Germany firstname.lastname@uni-ulm.de
More informationSnorocket 2.0: Concrete Domains and Concurrent Classification
Snorocket 2.0: Concrete Domains and Concurrent Classification Alejandro Metke-Jimenez and Michael Lawley The Australian e-health Research Centre ICT Centre, CSIRO Brisbane, Queensland, Australia {alejandro.metke,michael.lawley}@csiro.au
More informationA Unified Logical Framework for Rules (and Queries) with Ontologies - position paper -
A Unified Logical Framework for Rules (and Queries) with Ontologies - position paper - Enrico Franconi Sergio Tessaris Faculty of Computer Science, Free University of Bozen-Bolzano, Italy lastname@inf.unibz.it
More informationJOURNAL OF OBJECT TECHNOLOGY
JOURNAL OF OBJECT TECHNOLOGY Online at www.jot.fm. Published by ETH Zurich, Chair of Software Engineering JOT, 2002 Vol. 1, No. 2, July-August 2002 The Theory of Classification Part 2: The Scratch-Built
More informationOntological Modeling: Part 11
Ontological Modeling: Part 11 Terry Halpin LogicBlox and INTI International University This is the eleventh in a series of articles on ontology-based approaches to modeling. The main focus is on popular
More informationParallel and Distributed Reasoning for RDF and OWL 2
Parallel and Distributed Reasoning for RDF and OWL 2 Nanjing University, 6 th July, 2013 Department of Computing Science University of Aberdeen, UK Ontology Landscape Related DL-based standards (OWL, OWL2)
More informationExplaining Subsumption in ALEHF R + TBoxes
Explaining Subsumption in ALEHF R + TBoxes Thorsten Liebig and Michael Halfmann University of Ulm, D-89069 Ulm, Germany liebig@informatik.uni-ulm.de michael.halfmann@informatik.uni-ulm.de Abstract This
More informationEfficient Two-Phase Data Reasoning for Description Logics
Efficient Two-Phase Data Reasoning for Description Logics Abstract Description Logics are used more and more frequently for knowledge representation, creating an increasing demand for efficient automated
More informationLecture 1: Conjunctive Queries
CS 784: Foundations of Data Management Spring 2017 Instructor: Paris Koutris Lecture 1: Conjunctive Queries A database schema R is a set of relations: we will typically use the symbols R, S, T,... to denote
More informationFoundations of AI. 9. Predicate Logic. Syntax and Semantics, Normal Forms, Herbrand Expansion, Resolution
Foundations of AI 9. Predicate Logic Syntax and Semantics, Normal Forms, Herbrand Expansion, Resolution Wolfram Burgard, Andreas Karwath, Bernhard Nebel, and Martin Riedmiller 09/1 Contents Motivation
More informationScalable Semantic Retrieval Through Summarization and Refinement
Scalable Semantic Retrieval Through Summarization and Refinement Julian Dolby, Achille Fokoue, Aditya Kalyanpur, Aaron Kershenbaum, Edith Schonberg, Kavitha Srinivas IBM Watson Research Center P.O.Box
More informationTowards SPARQL instance-level Update in the Presence of OWL-DL TBoxes
Towards SPARQL instance-level Update in the Presence of OWL-DL TBoxes Claudia Schon a,1, Steffen Staab a,b a Institute for Web Science and Technologies, University of Koblenz-Landau, Koblenz, Germany b
More informationOn the Hardness of Counting the Solutions of SPARQL Queries
On the Hardness of Counting the Solutions of SPARQL Queries Reinhard Pichler and Sebastian Skritek Vienna University of Technology, Faculty of Informatics {pichler,skritek}@dbai.tuwien.ac.at 1 Introduction
More informationLeveraging Set Relations in Exact Set Similarity Join
Leveraging Set Relations in Exact Set Similarity Join Xubo Wang, Lu Qin, Xuemin Lin, Ying Zhang, and Lijun Chang University of New South Wales, Australia University of Technology Sydney, Australia {xwang,lxue,ljchang}@cse.unsw.edu.au,
More informationModule 6. Knowledge Representation and Logic (First Order Logic) Version 2 CSE IIT, Kharagpur
Module 6 Knowledge Representation and Logic (First Order Logic) 6.1 Instructional Objective Students should understand the advantages of first order logic as a knowledge representation language Students
More informationSPARQL Back-end for Contextual Logic Agents
SPARQL Back-end for Contextual Logic Agents Cláudio Fernandes and Salvador Abreu Universidade de Évora Abstract. XPTO is a contextual logic system that can represent and query OWL ontologies from a contextual
More informationCOMP718: Ontologies and Knowledge Bases
1/38 COMP718: Ontologies and Knowledge Bases Lecture 4: OWL 2 and Reasoning Maria Keet email: keet@ukzn.ac.za home: http://www.meteck.org School of Mathematics, Statistics, and Computer Science University
More informationChapter 3: Propositional Languages
Chapter 3: Propositional Languages We define here a general notion of a propositional language. We show how to obtain, as specific cases, various languages for propositional classical logic and some non-classical
More informationCOMP718: Ontologies and Knowledge Bases
1/35 COMP718: Ontologies and Knowledge Bases Lecture 9: Ontology/Conceptual Model based Data Access Maria Keet email: keet@ukzn.ac.za home: http://www.meteck.org School of Mathematics, Statistics, and
More informationTowards Efficient Reasoning for Description Logics with Inverse Roles
Towards Efficient Reasoning for Description Logics with Inverse Roles Yu Ding and Volker Haarslev Concordia University, Montreal, Quebec, Canada {ding yu haarslev}@cse.concordia.ca Abstract This paper
More informationFast ABox Consistency Checking using Incomplete Reasoning and Caching
Fast ABox Consistency Checking using Incomplete Reasoning and Caching Christian Meilicke 1, Daniel Ruffinelli 1, Andreas Nolle 2, Heiko Paulheim 1, and Heiner Stuckenschmidt 1 1 Research Group Data and
More informationConsequence based Procedure for Description Logics with Self Restriction
Consequence based Procedure for Description Logics with Self Restriction Cong Wang and Pascal Hitzler Kno.e.sis Center, Wright State University, Dayton, OH, U.S.A. {wang.156, pascal.hizler}@wright.edu
More informationComputing least common subsumers for FLE +
Computing least common subsumers for FLE + Sebastian Brandt and Anni-Yasmin Turhan Theoretical Computer Science, TU Dresden, Germany Email: {brandt, turhan}@tcs.inf.tu-dresden.de Abstract Transitive roles
More informationCost based Query Ordering over OWL Ontologies
Cost based Query Ordering over OWL Ontologies Ilianna Kollia 1,2 and Birte Glimm 1 1 University of Ulm, Germany, birte.glimm@uni-ulm.de 2 National Technical University of Athens, Greece, ilianna2@mail.ntua.gr
More informationDescription logic reasoning using the PTTP approach
Description logic reasoning using the PTTP approach Zsolt Nagy, Gergely Lukácsy, Péter Szeredi Budapest University of Technology and Economics Department of Computer Science and Information Theory {zsnagy,
More informationAutomatic Ontology Extension: Resolving Inconsistencies
Ekaterina Ovchinnikova, Kai-Uwe Kühnberger Automatic Ontology Extension: Resolving Inconsistencies Ontologies are widely used in text technology and artificial intelligence. The need to develop large ontologies
More informationDescription Logics as Ontology Languages for Semantic Webs
Description Logics as Ontology Languages for Semantic Webs Franz Baader, Ian Horrocks, and Ulrike Sattler Presented by:- Somya Gupta(10305011) Akshat Malu (10305012) Swapnil Ghuge (10305907) Presentation
More informationDescription Logics and OWL
Description Logics and OWL Based on slides from Ian Horrocks University of Manchester (now in Oxford) Where are we? OWL Reasoning DL Extensions Scalability OWL OWL in practice PL/FOL XML RDF(S)/SPARQL
More informationKnowledge Engineering with Semantic Web Technologies
This file is licensed under the Creative Commons Attribution-NonCommercial 3.0 (CC BY-NC 3.0) Knowledge Engineering with Semantic Web Technologies Lecture 3 Ontologies and Logic 3.7 Description Logics
More informationDescription Logics Reasoning Algorithms Concluding Remarks References. DL Reasoning. Stasinos Konstantopoulos. IIT, NCSR Demokritos
Stasinos Konstantopoulos 10-3-2006 Overview Description Logics Definitions Some Family Members Reasoning Algorithms Introduction Resolution Calculus Tableau Calculus Concluding Remarks Definitions The
More information3 No-Wait Job Shops with Variable Processing Times
3 No-Wait Job Shops with Variable Processing Times In this chapter we assume that, on top of the classical no-wait job shop setting, we are given a set of processing times for each operation. We may select
More informationSEMANTIC WEB AND COMPARATIVE ANALYSIS OF INFERENCE ENGINES
SEMANTIC WEB AND COMPARATIVE ANALYSIS OF INFERENCE ENGINES Ms. Neha Dalwadi 1, Prof. Bhaumik Nagar 2, Prof. Ashwin Makwana 1 1 Computer Engineering, Chandubhai S Patel Institute of Technology Changa, Dist.
More informationOn the Scalability of Description Logic Instance Retrieval
On the Scalability of Description Logic Instance Retrieval V. Haarslev 1, R. Moeller 2, M. Wessel 2 1 Concordia University, Montreal 2 Hamburg University of Technology (TUHH) 1 Supported by EU Project
More informationX-KIF New Knowledge Modeling Language
Proceedings of I-MEDIA 07 and I-SEMANTICS 07 Graz, Austria, September 5-7, 2007 X-KIF New Knowledge Modeling Language Michal Ševčenko (Czech Technical University in Prague sevcenko@vc.cvut.cz) Abstract:
More informationOWL Rules, OK? Ian Horrocks Network Inference Carlsbad, CA, USA
OWL Rules, OK? Ian Horrocks Network Inference Carlsbad, CA, USA ian.horrocks@networkinference.com Abstract Although the OWL Web Ontology Language adds considerable expressive power to the Semantic Web
More informationBenchmarking DL Reasoners Using Realistic Ontologies
Benchmarking DL Reasoners Using Realistic Ontologies Zhengxiang Pan Bell Labs Research and Lehigh University zhp2@lehigh.edu Abstract. We did a preliminary benchmark on DL reasoners using real world OWL
More informationjcel: A Modular Rule-based Reasoner
jcel: A Modular Rule-based Reasoner Julian Mendez Theoretical Computer Science, TU Dresden, Germany mendez@tcs.inf.tu-dresden.de Abstract. jcel is a reasoner for the description logic EL + that uses a
More informationLightweight Semantic Web Motivated Reasoning in Prolog
Lightweight Semantic Web Motivated Reasoning in Prolog Salman Elahi, s0459408@sms.ed.ac.uk Supervisor: Dr. Dave Robertson Introduction: As the Semantic Web is, currently, in its developmental phase, different
More informationOWL 2 Profiles. An Introduction to Lightweight Ontology Languages. Markus Krötzsch University of Oxford. Reasoning Web 2012
University of Oxford Department of Computer Science OWL 2 Profiles An Introduction to Lightweight Ontology Languages Markus Krötzsch University of Oxford Reasoning Web 2012 Remark for the Online Version
More informationWhich Role for an Ontology of Uncertainty?
Which Role for an Ontology of Uncertainty? Paolo Ceravolo, Ernesto Damiani, Marcello Leida Dipartimento di Tecnologie dell Informazione - Università degli studi di Milano via Bramante, 65-26013 Crema (CR),
More informationSOFTWARE ENGINEERING DESIGN I
2 SOFTWARE ENGINEERING DESIGN I 3. Schemas and Theories The aim of this course is to learn how to write formal specifications of computer systems, using classical logic. The key descriptional technique
More informationA proof-producing CSP solver: A proof supplement
A proof-producing CSP solver: A proof supplement Report IE/IS-2010-02 Michael Veksler Ofer Strichman mveksler@tx.technion.ac.il ofers@ie.technion.ac.il Technion Institute of Technology April 12, 2010 Abstract
More informationTowards Implementing Finite Model Reasoning in Description Logics
In Proc. of the 2004 Int. Workshop on Description Logics (DL 2004) Towards Implementing Finite Model Reasoning in Description Logics Marco Cadoli 1, Diego Calvanese 2, Giuseppe De Giacomo 1 1 Dipartimento
More informationLeveraging Data and Structure in Ontology Integration
Leveraging Data and Structure in Ontology Integration O. Udrea L. Getoor R.J. Miller Group 15 Enrico Savioli Andrea Reale Andrea Sorbini DEIS University of Bologna Searching Information in Large Spaces
More informationRacer: An OWL Reasoning Agent for the Semantic Web
Racer: An OWL Reasoning Agent for the Semantic Web Volker Haarslev and Ralf Möller Concordia University, Montreal, Canada (haarslev@cs.concordia.ca) University of Applied Sciences, Wedel, Germany (rmoeller@fh-wedel.de)
More informationPart I Logic programming paradigm
Part I Logic programming paradigm 1 Logic programming and pure Prolog 1.1 Introduction 3 1.2 Syntax 4 1.3 The meaning of a program 7 1.4 Computing with equations 9 1.5 Prolog: the first steps 15 1.6 Two
More informationTemporal Reasoning for Supporting Temporal Queries in OWL 2.0
Temporal Reasoning for Supporting Temporal Queries in OWL 2.0 Sotiris Batsakis, Kostas Stravoskoufos, and Euripides G.M. Petrakis Department of Electronic and Computer Engineering Technical University
More informationTowards a Logical Reconstruction of Relational Database Theory
Towards a Logical Reconstruction of Relational Database Theory On Conceptual Modelling, Lecture Notes in Computer Science. 1984 Raymond Reiter Summary by C. Rey November 27, 2008-1 / 63 Foreword DB: 2
More informationarxiv: v1 [cs.lo] 23 Apr 2012
The Distributed Ontology Language (DOL): Ontology Integration and Interoperability Applied to Mathematical Formalization Christoph Lange 1,2, Oliver Kutz 1, Till Mossakowski 1,3, and Michael Grüninger
More information3.4 Deduction and Evaluation: Tools Conditional-Equational Logic
3.4 Deduction and Evaluation: Tools 3.4.1 Conditional-Equational Logic The general definition of a formal specification from above was based on the existence of a precisely defined semantics for the syntax
More informationResearch Article A Method of Extracting Ontology Module Using Concept Relations for Sharing Knowledge in Mobile Cloud Computing Environment
e Scientific World Journal, Article ID 382797, 5 pages http://dx.doi.org/10.1155/2014/382797 Research Article A Method of Extracting Ontology Module Using Concept Relations for Sharing Knowledge in Mobile
More informationAn Ontology Based Question Answering System on Software Test Document Domain
An Ontology Based Question Answering System on Software Test Document Domain Meltem Serhatli, Ferda N. Alpaslan Abstract Processing the data by computers and performing reasoning tasks is an important
More informationThe Inverse of a Schema Mapping
The Inverse of a Schema Mapping Jorge Pérez Department of Computer Science, Universidad de Chile Blanco Encalada 2120, Santiago, Chile jperez@dcc.uchile.cl Abstract The inversion of schema mappings has
More information2. Sets. 2.1&2.2: Sets and Subsets. Combining Sets. c Dr Oksana Shatalov, Fall
c Dr Oksana Shatalov, Fall 2014 1 2. Sets 2.1&2.2: Sets and Subsets. Combining Sets. Set Terminology and Notation DEFINITIONS: Set is well-defined collection of objects. Elements are objects or members
More informationRaDON Repair and Diagnosis in Ontology Networks
RaDON Repair and Diagnosis in Ontology Networks Qiu Ji, Peter Haase, Guilin Qi, Pascal Hitzler, and Steffen Stadtmüller Institute AIFB Universität Karlsruhe (TH), Germany {qiji,pha,gqi,phi}@aifb.uni-karlsruhe.de,
More informationQuerying Description Logics
Querying Description Logics Petr Křemen 1 SPARQL and Ontology Querying 1.1 SPARQL Query Structure SPARQL Language [SS13] is aimed at querying RDF(S) [GB04] documents. As OWL 2 [MPSP09] is an extension
More informationRelational Representation of ALN Knowledge Bases
Relational Representation of ALN Knowledge Bases Thomas Studer Abstract The retrieval problem for a knowledge base O and a concept C is to find all individuals a such that O entails C(a). We describe a
More informationModularity in Ontologies: Introduction (Part A)
Modularity in Ontologies: Introduction (Part A) Thomas Schneider 1 Dirk Walther 2 1 Department of Computer Science, University of Bremen, Germany 2 Faculty of Informatics, Technical University of Madrid,
More informationCSC Discrete Math I, Spring Sets
CSC 125 - Discrete Math I, Spring 2017 Sets Sets A set is well-defined, unordered collection of objects The objects in a set are called the elements, or members, of the set A set is said to contain its
More informationBenchmarking Reasoners for Multi-Ontology Applications
Benchmarking Reasoners for Multi-Ontology Applications Ameet N Chitnis, Abir Qasem and Jeff Heflin Lehigh University, 19 Memorial Drive West, Bethlehem, PA 18015 {anc306, abq2, heflin}@cse.lehigh.edu Abstract.
More informationFalcon-AO: Aligning Ontologies with Falcon
Falcon-AO: Aligning Ontologies with Falcon Ningsheng Jian, Wei Hu, Gong Cheng, Yuzhong Qu Department of Computer Science and Engineering Southeast University Nanjing 210096, P. R. China {nsjian, whu, gcheng,
More informationDescription Logic Systems with Concrete Domains: Applications for the Semantic Web
Description Logic Systems with Concrete Domains: Applications for the Semantic Web Volker Haarslev and Ralf Möller Concordia University, Montreal University of Applied Sciences, Wedel Abstract The Semantic
More information