Efficient semi-automated assessment of annotations trustworthiness


Ceolin et al. Journal of Trust Management 2014, 1:3
RESEARCH, Open Access

Davide Ceolin*, Archana Nottamkandath* and Wan Fokkink
*Correspondence: d.ceolin@vu.nl; a.nottamkandath@vu.nl
The Network Institute, VU University Amsterdam, de Boelelaan 1081a, 1081HV Amsterdam, The Netherlands

Abstract

Crowdsourcing provides a valuable means for accomplishing large amounts of work which may require a high level of expertise. We present an algorithm for computing the trustworthiness of user-contributed tags of artifacts, based on the reputation of the user, represented as a probability distribution, and on the provenance of the tag. The algorithm requires only a small number of manually assessed tags, and computes two trust values for each tag, based on reputation and provenance. We moreover present a computationally cheaper adaptation of the algorithm, which clusters semantically similar tags in the training set and builds an opinion on a new tag based on its semantic relatedness to the medoids of the clusters. Also, we introduce an adaptation of the algorithm based on the use of provenance stereotypes as an alternative basis for the estimation. Two case studies from the cultural heritage domain show that the algorithms produce satisfactory results.

Keywords: Trust; Annotations; Semantic similarity; Subjective logic; Clustering; Cultural heritage; Tagging; Crowdsourcing; Provenance

Introduction

Through the Web, cultural heritage institutions can reach large masses of people, with intentions varying from increasing visibility (and hence visitors) to acquiring user-generated content. Crowdsourcing is an effective way to handle tasks which are highly demanding in terms of the amount of work needed to complete them and the required level of expertise [1], such as annotating artifacts in large cultural heritage collections. For this reason, many cultural heritage institutions have opened up their archives to ask the masses to help them in tagging or annotating their artifacts. In earlier years it was feasible for employees at the cultural heritage institutions to manually assess the quality of the tags entered by external users, since there were relatively few contributions from Web users. However, with the growth of the Web, the amount of data has become too large to be accurately dealt with by the experts at the disposal of these institutions within a reasonable time. Nevertheless, a high quality of annotations is vital for their business: the cultural heritage institutions need the annotations to be trustworthy in order to maintain their authoritative reputation. This calls for mechanisms to automate the annotation evaluation process in order to assist the cultural heritage institutions in obtaining quality content from the Web. Annotations from external users can be either in the form of tags or free text, describing entities in the crowdsourced systems.

Here, we focus on tags in the cultural heritage domain, which mainly describe the content, context and facts about an artifact by associating words with it.

The goal of the work described in this paper is to automate the process of evaluation of tags obtained through crowdsourcing in an effective way, by means of an algorithm. Crowdsourcing provides massive amounts of annotations, but these are not always trustworthy enough, so we aim at automating the process of deciding whether they are satisfactorily correct and of high quality, i.e. at evaluating them. This is done by first collecting manual evaluations of the quality of a small part of the tags contributed by a user, and then learning a statistical model from them. Throughout the paper we refer to this set as the training set. On the basis of such a model, which assumes the existence of a relation between a user's reputation and his overall performance, the system automatically evaluates the tags further added by the same user or user stereotype (i.e. a set of users behaving similarly, e.g., users that always provide their tags on Sunday morning). We refer to this set of new tags as the test set. Suppose that a user, Alex, provides annotations to the Fictitious National Museum. We propose a method that automatically evaluates these annotations, based on a small set of annotations that the museum previously evaluated, from which we derive Alex's reputation, or the reputation of the users whose annotation behaviour is similar to Alex's. We will return to this example in more detail in the following sections.

We employ Semantic Web technologies to represent and store the annotations and the corresponding reviews. We use subjective logic to build a reputation for users that contribute to the system, and moreover semantic similarity measures to generate assessments on the tags entered by the same users at a later point in time. By reputation, we mean a value indicating the estimated probability that the annotations provided by a given author (or, later in the paper, by a given user stereotype) are positively evaluated. In subjective logic, we use the expected value of an opinion about an author as the value of the reputation. The opinion is based on a set of previously evaluated tags.
In order to reduce the computation time, we cluster evaluated tags to reduce the number of comparisons. Our experiments show that this preprocessing does not seriously affect the accuracy of the predictions, while significantly reducing the computation time. The proposed algorithms are evaluated on two datasets from the cultural heritage domain.

These case studies show that it is possible to semi-automatically evaluate the tags entered by users in crowdsourcing systems into binomial categories (good, bad) with an accuracy above 80%.

Apart from using subjective logic and semantic similarity, we also use provenance mechanisms to evaluate the quality of user-contributed tags. Provenance is represented by means of a data record per tag, containing information about its creation, such as the time of day, the day of the week and the typing speed, obtained by tracking user behavior. We use provenance information to group annotations according to the stereotype or behavior (or "provenance group") that produced them. In other words, we group them depending on whether they are produced by, for instance, early-morning or late-night users. Once the annotations have been grouped per stereotype, we compute a reputation for each stereotype, based on a sample of evaluations provided by an authority: we learn the policy adopted by the authority in evaluating the annotations and we apply the learnt model to further annotations. The fact that we take provenance into account and do not rely solely on user identities makes our approach suitable for situations where users are anonymous and only their behavior is tracked. By policy we mean a set of rules that the institution adopts to evaluate tags. The fact that we do not have an explicit definition of such rules (nor could we ask for it) determines the need to learn a probabilistic model that aims at mimicking the institution's evaluation strategy. For instance, we do not know a priori whether one of two conflicting tags about the same image is wrong, both because we do not know the image (the tags could refer to two distinct parts of it) and because we do not know the museum policies (which may prohibit their coexistence, in principle). So we rely only on the museum evaluations and do not consider the possible impact of conflicting tags.

We propose three algorithms for estimating the trustworthiness of tags. The first learns a reputation for each user based on a set of evaluated tags. Then it predicts the evaluations of the rest of the tags provided by the same user, by ranking them according to the user's performance in each specific domain (using semantic similarity measures) and then accepting a number of tags proportional to the user's reputation. The second algorithm reduces the computational complexity by clustering the tags in the training set, and the third algorithm computes the same prediction on the basis of the user stereotype rather than on the basis of each single user identity. The novelty of this research lies in the automation of tag evaluations in crowdsourcing systems by coupling subjective logic opinions with measures of semantic similarity along with provenance metrics. The only variable parameter that we require is the size of the set of manual evaluations that are needed to build a useful and reliable reputation. Moreover, in the experiments that we performed, varying this parameter did not substantially affect the performance (resulting in about 1% precision variation per five new observations considered in a user reputation); we will further investigate the impact of this parameter in the future. Using our algorithms, we show how it is possible to avoid asking the curators responsible for the quality of the annotations to set a threshold in order to make assessments about a tag's trustworthiness (e.g., accept only tags which have a trust value above a given threshold).
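To make the notion of provenance stereotype concrete, the following is a minimal sketch (in Python, which we also use for similarity computations later in the paper) of how annotations could be bucketed by creation time. The bucket boundaries are illustrative assumptions; the actual provenance records described above carry richer features, such as typing speed.

from datetime import datetime

def provenance_stereotype(created: datetime) -> str:
    """Map an annotation's creation time to a coarse behavioral bucket.

    The day-part boundaries below are illustrative assumptions, not the
    paper's exact grouping; richer features (e.g., typing speed) can be
    appended to the key in the same way.
    """
    hour = created.hour
    if 5 <= hour < 12:
        part = "morning"
    elif 12 <= hour < 18:
        part = "afternoon"
    else:
        part = "evening-night"
    return f"{created.strftime('%A')}-{part}"

# e.g., provenance_stereotype(datetime(2014, 3, 2, 9, 30)) -> 'Sunday-morning'

All annotations that map to the same key share one reputation, so evidence can be pooled even when the individual users are anonymous.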
Background and literature review

Trust has been studied extensively in Computer Science. We refer the reader to Sabater and Sierra [2], Gil and Artz [3] and Golbeck [4] for comprehensive reviews of trust in computer science, the Semantic Web, and the Web, respectively.

The work presented in this paper focuses on trust in crowdsourced information from the Web, using an adapted version of the definition of Castelfranchi and Falcone [5], as reported by Sabater and Sierra [2]: we decide to trust or not to trust tags based on a set of beliefs and assumptions about who produced the tags and how they were produced. We quantify these beliefs, for instance, through reputations.

Crowdsourcing techniques are widely used by cultural heritage and multimedia institutions for enhancing the available information about their collections. Examples include the Tag Your Paintings project [6], the Steve.Museum project [7] and the Waisda? video tagging platform [8]. The Socially Enriched Access to Linked Cultural (SEALINC) Media project also investigates in this direction. In this project, the Rijksmuseum [9] in Amsterdam is using crowdsourcing on a Web platform, selecting experts of various domains to enrich information about its collection. One of the case studies analyzed in this paper is provided by the SEALINC Media project.

Trust management in crowdsourced systems often employs wisdom-of-crowds approaches [10]. In our scenarios we cannot make use of those approaches, because the level of expertise needed to annotate cultural heritage artifacts restricts the potential set of users, thus making this kind of approach inapplicable or less effective. Gamification, which consists of using game mechanisms for involving users in non-game tasks, is another approach that leads to an improvement of the quality of tags gathered from crowds, as shown, for instance, in von Ahn et al. [1]. The work presented here is orthogonal to a gamified environment, as it allows us to semi-automatically evaluate the user-contributed annotations and hence to semi-automatically incentivize them. By combining the two, museums could increase user incentivization (showing his reputation may be enough to incentivize a user) while curating the quality of annotations. Users that participated in the experiments that provided the datasets for our analyses did not receive monetary incentives, so leveraging incentives related to gamification and personal satisfaction (by means of reputation tracking) may turn out to be an important factor in increasing the accuracy of the tags collected.

In folksonomy systems such as the Steve.Museum project, tag evaluation techniques such as checking the presence of the tags in standard vocabularies and thesauri, and determining their frequency and their popularity or agreement with other tags (see, for instance, Van Damme et al. [11]), have been employed to determine the quality of tags entered by users. Such mechanisms focus mainly on the contributed content, with little or no reference to the user who authored it. Also, in folksonomy systems the crowd often manages the tags, while in our scenarios the crowd only provides the tags, which are managed by museums or other institutions according to specific policies. Medelyan et al. [12] present algorithms to determine the quality of tags entered by users in a collaboratively created folksonomy, and apply them to the CiteULike dataset [13], which consists of text documents. They evaluate the relevance of user-provided tags by means of text document-based metrics. In our work, since we evaluate tags, we cannot apply document-based metrics, and since we do not have large amounts of tags per subject at our disposal, we cannot check for consistency among users tagging the same image.
Similarly, we cannot compute semantic similarity based on the available annotations (as in Cattuto et al. [14]). In fact, since we have at our disposal neither image analysis software nor explicit museum policies, we cannot know whether possible conflicts between tags regarding the same image are due to the fact that some are correct and some are not, or to the fact that they refer to different aspects (or parts) of a complex picture.

Therefore, instead of assuming one of the two cases a priori, we determine the trustworthiness of the tags on the basis of the reputation of their user or provenance stereotype. As a future direction, we plan to also consider input from image recognition software, which will help us deal with conflicting or dubious tags.

In open collaborative sites such as Wikipedia [15], where information is contributed by Web users, automated quality evaluation mechanisms have been investigated (see, for instance, De La Calzada et al. [16]). Most of these mechanisms involve computing trust from article revision histories and user groups (see Zeng et al. [17] and Wang et al. [18]). These algorithms track the changes that a particular article or piece of text has undergone over time, along with details of the users performing the changes. In our case study, tags do not have a revision history. Another approach to obtaining trustworthy data is to find experts amongst Web users with good intentions (see Demartini et al. [19]). This mechanism assumes that users who are experts tend to provide more trustworthy annotations, and it aims at identifying such experts by analyzing profiles built by tracking user performance. In our model, we build profiles based on user performance in the system. So the profile is only behavior-based, and rather than looking for expert and trustworthy users, we build a model which helps in evaluating tag quality based on the estimated reputation of the tag author. There is, however, a clear relation between highly reputed users and experts, although these two classes do not always overlap.

Modeling of reputation and user behavior on the Web is a widely studied domain. Javanmardi et al. [20] propose three computational models for user reputation, extracting detailed user edit patterns and statistics which are particularly tailored to wikis, while we focus on the annotations domain. Ceolin et al. [21] build a reputation- and provenance-based model for predicting the trustworthiness of Web users in Waisda? over time. We optimize the reputation management and the decision strategies described in that paper. We use subjective logic to represent user reputations, in combination with semantic relatedness measures. This work extends Ceolin et al. [21,22]. Similarity measures have been combined with subjective logic by Tavakolifard et al. [23], who infer new trust connections between entities (e.g., users) given a set of trust connections known a priori. In our paper, we also start from a graphical representation of relations between the various participating entities (annotators, tags, reviewers, etc.), but: (1) trust relationships are learnt from a sample of museum evaluations, and (2) new trust connections are inferred based on the relative position of the tags in another graph, WordNet.

We also use semantic similarity measures to cluster related tags in order to optimize the computations. In Cilibrasi et al. [24], hierarchical clustering is used for grouping related topics, while Ushioda et al. [25] experiment with clustering words in a hierarchical manner. Begelman et al. [26] present an algorithm for the automated clustering of tags on the basis of tag co-occurrences in order to facilitate more effective retrieval. A similar approach is used by Hassan-Montero and Herrero-Solana [27]: they compute tag similarities using the Jaccard similarity coefficient and then cluster the tags hierarchically using the k-means algorithm.
In our work, to build user reputations, we cluster the tags along with their respective evaluations (e.g., accept or reject). Each cluster is represented by a medoid (that is, the element of the cluster which is the closest to its center), and in order to evaluate a newly entered tag by the same user, we consider clusters which are most semantically relevant to the new tag. This helps in selectively weighing only the relevant evidence about a user for evaluating a new tag.
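As an illustration, a medoid can be computed in a few lines of Python. Here sim is assumed to be any pairwise similarity function on tags, such as the WordNet-based measures discussed later in the paper:

def medoid(cluster, sim):
    """Return the element of the cluster with the highest total similarity
    to the other elements, i.e. the element closest to the cluster center."""
    return max(
        cluster,
        key=lambda tag: sum(sim(tag, other) for other in cluster if other != tag),
    )

Comparing a new tag against one medoid per cluster, instead of against every evaluated tag, is what reduces the number of similarity computations.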

In general, different cultural heritage institutions employ different values and metrics, on varying scales, to represent the trustworthiness of user-contributed information. The accuracy of various scales has been studied earlier. Certain cases use a binary (boolean) scale for trust values, as in Golbeck et al. [28], while binomial values (i.e., the probabilities of two mutually exclusive values, zero and one, which we use in our work) are used in Guha et al. [29] and Kamvar et al. [30].

Relevant for the work presented in this paper is the link between provenance and trust. Bizer and Cyganiak [31], Hartig and Zhao [32] and Zaihrayeu et al. [33] use provenance and background information expressed as annotated or named graphs to produce trust values. We use the same class of information to make our estimates, but we do not use named graphs to represent provenance information; we represent provenance by means of the W3C recommendation PROV-O, the PROV Ontology [34,35]. Provenance is employed for determining the trustworthiness of user-contributed information in crowdsourced environments in Ceolin et al. [21], where provenance information is used in combination with user reputation to make binomial assessments of annotations. They employ support vector machines for making the provenance-based estimates, while we employ a subjective logic-based approach. Provenance is used for data verification in crowdsourced environments by Ebden et al. [36]. They introduced provenance tracking into their online CollabMap application (used to crowdsource evacuation maps), and in this way collected approximately 5,000 provenance graphs, generated using the Open Provenance Model [37] (which has since been superseded by the PROV Data Model and Ontology). They have large provenance graphs at their disposal and can learn useful features about artifact trustworthiness from the graph topologies. Here, the graphs at our disposal are much more limited, so we cannot rely on the graph topology, but we can easily group graphs into stereotypes. Provenance mechanisms have also been used to understand and study workflows in collaborative environments, as discussed in Altintas et al. [38]. We share the same context with that work, but we do not focus on the workflow of artifact creation.

The rest of the paper is structured as follows: first we describe the research questions tackled and connect them with the datasets used. Then we explain our research work in detail and present and discuss the obtained results. Lastly, we provide some final conclusions.

Research design and methodology

The goal of this paper is to propose an algorithm for computing the trustworthiness of annotations in a fast and reliable manner. We focus on three main evaluation aspects for our algorithm. First, the trust values produced by our algorithm are meant as indicators of the trustworthiness of annotations, and therefore they must be accurate enough to warrant their usefulness. Accuracy of trust values is achieved by carefully handling the information at our disposal and by utilizing the existence of a relationship between the features considered (e.g., the annotation creator) and the trust values themselves. If the information is handled correctly and the relationship holds, then the trust values are accurate enough and form a basis to automatically decide whether or not to use the annotations.
We evaluate this first research question by applying our algorithm to two different datasets, one from a SEALINC Media project experiment and the other from the Steve.Museum project. In both cases we divide the dataset into two parts, a training set and a test set, so as to build a model based on subjective logic and semantic similarity on the training set, and then evaluate the accuracy of such a model on the test set.

The second evaluation we make concerns the possibility of performing trust estimations in a relatively fast manner, by properly clustering the training set on a semantic similarity basis. Here the goal is to reduce the computational overhead due to avoidable comparisons between evaluated annotations and new annotations. We evaluate this contribution by applying clustering mechanisms to the training set data of the aforementioned datasets and by running our algorithm for computing trust values on the clustered training sets. The evaluation checks whether clustering reduces the computation time (and if it does, by what magnitude) and whether it affects the accuracy of the predictions.

Finally, we show that the algorithm we propose is dependable and not solely dependent on the availability of information about the author of an annotation. Our assumption is that when the identity of the author is not known, or when a reliable reputation about the author is not available, we can base our estimates on provenance information, that is, on a range of information about how the tag was created (e.g., the timestamp of the annotation). By properly gathering and grouping such information, we can use it as a stereotypical description of a user's behavior. Users are often constrained in their behavior by the environment and other factors; for instance, they produce tags within certain periodic intervals, such as particular times of the day or days of the week. By recognizing such stereotypes, we can compute a reputation per stereotype rather than per user. While on the one hand this approach guarantees the availability of evidence, as typically multiple users belong to the same stereotype, on the other hand it compensates for the lack of evidence about specific users. We evaluate our hypothesis over the two datasets mentioned before by splitting them into two parts, one to build a provenance-based model and the other to evaluate it.

Methods

Here we describe the methods adopted and implemented in our algorithm. We start by describing the tools that were already available and that we incorporated into our framework, and then we continue with the framework description.

Preliminaries

The system that we propose aims at estimating the trustworthiness of new annotations based on a set of evaluated ones (per user or per provenance stereotype, as we will see). In order to make the estimates, we need a probabilistic logic that allows us to model and reason about the evidence at our disposal while accounting for the uncertainty due to the limited information available. For this reason we employ subjective logic, which fits our needs. Moreover, since the evidence at our disposal consists of textual annotations, we use semantic similarity measures to understand the relevance of each piece of evidence when analyzing each different annotation.

Subjective logic

Subjective logic is a probabilistic logic that we use extensively in our system in order to reason about the trustworthiness of the annotations and the reputations of their authors, based on limited samples.

[Figure 1: Beta distribution of the user trustworthiness. Axes: trust value (x) versus probability density (y). We use a Beta distribution to describe the probability for each value in the [0...1] interval to be the right reputation for a given user or the right trust value for a given tag.]

In subjective logic, so-called subjective opinions (represented as ω) express the belief that a source x owns with respect to the value of an assertion y (for instance, a user's reputation). When y can assume only two values (e.g., true or false), the opinion is called binomial; when y ranges over more than two values, the opinion is called multinomial. Opinions are computed as follows, where the positive and negative evidence are represented as p and n respectively, and b, d, u and a represent the belief, disbelief, uncertainty and prior probability, respectively. A binomial opinion is represented as:

\omega_y^x = (b, d, u)    (1)

where:

b = \frac{p}{p+n+2}, \qquad d = \frac{n}{p+n+2}, \qquad u = \frac{2}{p+n+2}, \qquad a = \frac{1}{2}    (2)

Such an opinion is equivalent to a Beta probability distribution (see Figure 1), which describes the likelihood for each possible trust value to be the right trust value for a given subject. The expected probability for a possible value of an opinion is computed as:

E = b + a \cdot u    (3)

The belief b of the opinion represents how much we believe that the statement y is true (where y is, for instance, the fact that an annotation is correct), based on the evidence at our disposal. However, the evidence at our disposal is limited (and since the evidence is obtained by asking an authority, such as a museum, to evaluate an annotation, we aim at keeping it limited in order to avoid overloading the museum with such requests), so we must take into account the fact that other observations about the same fact might, in principle, disagree with those currently at our disposal. This leads to uncertainty, which is represented by the corresponding element of the opinion, u.
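The following is a minimal Python sketch of equations (1)-(3), assuming the positive and negative evidence counts p and n have already been obtained from the museum's evaluations:

def binomial_opinion(p: float, n: float, a: float = 0.5):
    """Build a binomial subjective opinion (b, d, u, a) from
    p positive and n negative pieces of evidence (equations 1-2)."""
    denom = p + n + 2.0
    return p / denom, n / denom, 2.0 / denom, a

def expected_value(b: float, u: float, a: float = 0.5) -> float:
    """Expected probability E = b + a*u (equation 3)."""
    return b + a * u

# Example: 8 accepted and 2 rejected tags give a reputation of
# 8/12 + 0.5 * 2/12 = 0.75
b, d, u, a = binomial_opinion(8, 2)
reputation = expected_value(b, u, a)

Note how, for the same ratio of positive to negative evidence, more observations shrink u and pull E away from the prior a and towards the observed frequency.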

The disbelief d represents, instead, the disbelief that we have about the statement y, based on the actual negative evidence at our disposal. Disbelief is the counterpart of belief, and the sum of belief, disbelief and uncertainty is always one (b + d + u = 1). Whereas belief, disbelief and uncertainty are based on the actual evidence observed, the base rate a encodes the prior knowledge about the truth of y according to x (before observing any evidence). The final opinion about the correctness of y is then obtained by aggregating a and b in a weighted manner, so as to make b count more if it is based on more evidence than a (and thus its uncertainty is lower). This explains the meaning of E, which represents exactly such an aggregation. One last consideration regards the fact that E represents the expected value both of the opinion and of the Beta distribution equivalent to it; u is tightly connected to the variance of the Beta distribution, as they both represent the uncertainty about the belief being significant. These elements make subjective opinions ideal tools for representing the fact that the probabilities we estimate about the trustworthiness of annotations, authors and provenance stereotypes (defined later in this section) are based on evidence provided by a museum (consisting of evaluations based on the museum's policy), and that the estimates we make carry some uncertainty due to the fact that they are based on a limited set of observations.

Semantic similarity

The targets of our trust assessments are annotations (that is, words associated with images in order to describe them), and our evidence consists of evaluated annotations. In our system we collect evidence about an author or a provenance stereotype (which is defined later in this section), consisting of annotations evaluated by the museum, and we compare each new annotation that needs to be evaluated against the evidence at our disposal. If, for instance, we based our estimates only on the ratio between positive and negative evidence for a given author, then all his annotations would be rated equally. If, instead, we compared every new annotation only against the evidence of exactly matching words, we would have a more tailored but more limited estimate, as we cannot expect that the same author always uses the same words (and so that there is evidence for every word he uses in an annotation). In order to increase the availability of evidence for our estimates, and to let the more relevant evidence have a higher impact on the calculations, we employ semantic relatedness measures as a weighing factor. These measures quantify the likeness between the meanings of two given terms. Whenever we evaluate a tag, we take the evidence at our disposal, and tags that are more semantically similar to the one we focus on are weighed more heavily. There exist many techniques for measuring semantic relatedness, which can be divided into two groups. First, there are so-called topological semantic similarity measures, which are deterministic measures based on the graph distance between the two words examined, using a word graph (e.g., WordNet [39]). Second, there is the family of statistical semantic similarity measures, which includes, for instance, the Normalized Google Distance [40], which statistically measures the similarity between two words on the basis of the number of times that they occur and co-occur in documents indexed by Google.
These measures are characterized by the fact that the similarity of two words is estimated on a statistical basis from their occurrence and co-occurrence in large sets of documents.
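As a sketch of the weighting idea described above, the evidence counts fed into a subjective opinion can be made similarity-weighted sums rather than plain counts. Here evaluated_tags is assumed to be a list of (tag, accepted) pairs for one user or stereotype, sim any relatedness function with values in [0, 1], and binomial_opinion the function sketched earlier; this illustrates the weighting principle, not the paper's full algorithm:

def weighted_opinion(new_tag, evaluated_tags, sim):
    """Weigh each piece of evidence by its semantic relatedness to the
    new tag, so that more relevant evidence counts more heavily."""
    p = sum(sim(new_tag, tag) for tag, accepted in evaluated_tags if accepted)
    n = sum(sim(new_tag, tag) for tag, accepted in evaluated_tags if not accepted)
    return binomial_opinion(p, n)

A tag identical to previously accepted tags contributes full positive weight, while an unrelated one contributes almost nothing in either direction, leaving the opinion more uncertain.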

We focus on deterministic semantic relatedness measures based on WordNet or its Dutch counterpart Cornetto [41]. In particular, we use the Wu and Palmer [42] measure and the Lin [43] measure for computing semantic relatedness between tags, because both provide values in the range [0, 1], but other measures are possible as well. WordNet is a directed acyclic graph where each vertex represents a synset (a set of word synonyms), and each directed edge from v to w implies that w is a hypernym of v, that is, w shares a type-of relation with v. For instance, if v is the word winter (hyponym), w can be the word season (hypernym).

The Wu and Palmer measure calculates the semantic relatedness between two words by considering the depths of the two synsets in WordNet, along with the depth of their Least Common Subsumer, as follows:

score(s_1, s_2) = \frac{2 \cdot depth(lcs(s_1, s_2))}{depth(s_1) + depth(s_2)}

where s_1 is a synset of the first word and s_2 a synset of the second. If a synset is a generalization of another one, we can measure the depth, that is, the distance between the two; the first ancestor shared by two nodes is their Least Common Subsumer. We compute the similarity of all synset combinations and pick the maximum value, as we adopt the upper bound of the similarity between the two words.

The Lin measure considers the information content of the Least Common Subsumer and of the two compared synsets, as follows:

sim(s_1, s_2) = \frac{2 \cdot IC(lcs(s_1, s_2))}{IC(s_1) + IC(s_2)}

where IC is the information content, that is, the probability of finding the concept in a given corpus, defined as:

IC(s) = \log \left( \frac{freq(s)}{freq(root)} \right)

where freq is the frequency of the synset. So the Wu and Palmer measure derives the similarity of two concepts from their distance to a common ancestor, while the Lin similarity derives it from the information content of the two concepts and of their lowest common ancestor. The Wu and Palmer similarity measure is more recent and has been shown to combine effectively with subjective logic in Ceolin et al. [44], so when we deal with datasets of tags in English (the Steve.Museum dataset), we use its implementation provided by the Python nltk library [45]. Instead, when we work on datasets composed of Dutch tags (e.g., the dataset from the SEALINC Media project experiment), we rely on pycornetto [46], an interface to Cornetto, the Dutch WordNet. pycornetto does not provide a means to compute the Wu and Palmer similarity measure, but it provides the Lin similarity measure, and given the relatedness of the two measures, in this case we adopt the Lin measure. For more details about how to combine semantic relatedness measures and subjective logic, see the work of Ceolin et al. [44].

By choosing these measures we limit ourselves to evaluating only single-word tags and only common words, because these are the kinds of words that are present in WordNet. However, we choose these measures because almost all the tags we evaluate fall into the mentioned categories, and because the use of these similarity measures together with subjective logic has already been theoretically validated. Moreover, almost all the words used in the annotations that form the datasets used in our evaluations are single-word tags and common words, hence this limitation does not affect our evaluation significantly.
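For English tags, the maximum Wu and Palmer similarity over all synset combinations can be computed with nltk along these lines (a sketch; it assumes the WordNet corpus has been downloaded, e.g. via nltk.download('wordnet')):

from nltk.corpus import wordnet as wn

def wup(word1: str, word2: str) -> float:
    """Maximum Wu-Palmer similarity over all synset pairs of two words."""
    best = 0.0
    for s1 in wn.synsets(word1):
        for s2 in wn.synsets(word2):
            score = s1.wup_similarity(s2)  # None for incomparable synsets
            if score is not None and score > best:
                best = score
    return best

# e.g., wup('winter', 'season') is high; wup('winter', 'vase') is much lower

Taking the maximum over synset pairs implements the upper-bound choice discussed above, which compensates for not knowing a word's intended sense.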

The proposed algorithm is designed so that any other relatedness measure can be used in place of the chosen ones, without the need for any additional intervention.

The choice of the semantic similarity measure, and how the semantic similarity is used, both affect the uncertainty of the expected results of the algorithms that we propose. In fact, these algorithms use semantic similarity to weigh the importance of evidence when evaluating tags, i.e. words associated with cultural heritage artifacts. We use a deterministic semantic similarity measure which, although it constitutes a heuristic, is based on a trustworthy data source (WordNet), which implies that the measure is less uncertain than a probabilistic measure based on a limited document corpus. Still, the semantic similarity measure represents an approximation, and part of the uncertainty it implies is due to how it is used: semantic similarity measures represent the similarity between synsets, but we have at our disposal only words, without an indication of their intended synset (a word may have several meanings, represented by synsets). Since we are situated in a well-defined domain (cultural heritage), and since all words are used to tag cultural heritage artifacts, we assume that the words are semantically related, and hence, when computing the semantic similarity between two words, we use the maximum of the similarities between all their synsets. Despite the fact that this introduces an approximation, we will show in the Results section that the method is effective.

Datasets adopted

We validate the proposed algorithms over two datasets of image annotations. The annotations contained in these datasets consist of content descriptions, and the datasets also contain the evaluations from the institutions that collected them. For each annotation, the datasets provide information about its author and a timestamp. Since each institution adopts a different policy for evaluating annotations, we try to learn such a policy from a sample of annotations per dataset, and to find a relationship between the identity of the author (or other information about the annotation) and its evaluation.

SEALINC Media project experiment

As part of the SEALINC Media project, the Rijksmuseum in Amsterdam [9] is crowdsourcing annotations of artifacts in its collection using Web users. An initial experiment was conducted to study the effect of presenting pre-set tags on the quality of annotations of crowdsourced data [47]. In the experiment, external annotators were presented with pictures from the Web and prints from the Rijksmuseum collection, along with pre-set annotations about the picture or print, and they were asked to insert new annotations or remove the pre-set ones with which they did not agree (the pre-set tags are either correct or not). A total of 2,650 annotations resulted from the experiment, and these were manually evaluated by trusted personnel for their quality and relevance using the following scale:

1: Irrelevant
2: Incorrect
3: Subjective
4: Correct and possibly relevant
5: Correct and highly relevant
typo: Spelling mistake
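The text above does not spell out which levels count as positive evidence for the binomial opinions; assuming, purely for illustration, that levels 4 and 5 count as positive and levels 1 to 3 as negative (with 'typo' discarded, as explained below), the mapping could look as follows:

def sealinc_evidence(evaluation: str):
    """Map a SEALINC evaluation to binomial evidence.

    Assumption for illustration only: 4 and 5 are positive, 1-3 negative;
    'typo' yields None and is excluded from the evidence entirely.
    """
    if evaluation == "typo":
        return None
    return evaluation in ("4", "5")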

These tags, along with their evaluations, were used to validate our model. For each tag, the SEALINC Media dataset presents the following elements: author identifier, artifact identifier, timestamp, evaluation. We do not focus on the goals of the experiment from which this dataset was obtained; that is, we do not analyze the relation between the kind of tag that was proposed to the user and the tag that the user provided. We focus on the tag that the user actually proposes and its evaluation, and we try to predict the evaluation of the tags provided by each user, given a small training set of sample evaluations for each of them. We neglect the tags evaluated as 'typo' because our focus is on the semantic correctness of the tags, so we assume that this category of mistakes would be properly avoided or treated (e.g., by using autocompletion and checking the presence of the tags in dictionaries) before the tags reach our evaluation framework.

We build our training set using a fixed number of evaluated annotations for each of the users, and form the test set using the remaining annotations. The number of annotations used to build the reputation and the percentage of the dataset covered are presented in Table 1: the first column, '# tags per reputation', reports the number of evaluated annotations used to build each reputation, while the second column, '% training set covered', reports the percentage of annotations used as the training set compared to the whole dataset.

Table 1 Results of the evaluation of Algorithm 1 over the SEALINC Media dataset

# Tags per reputation | % Training set covered | Accuracy | Precision | Recall | F-measure | Time (sec.)
5                     | 8%                     |          |           |        |           |
10                    |                        |          |           |        |           |
15                    |                        |          |           |        |           |
20                    |                        |          |           |        |           |

Results of the evaluation of Algorithm 1 over the SEALINC Media dataset for training sets formed by aggregating 5, 10, 15 and 20 evaluated tags per user reputation. We report the percentage of the dataset actually covered by the training set, and the accuracy, precision, recall and F-measure of our prediction.

Steve.Museum project dataset

Steve.Museum [7] is a project involving several museum professionals in the cultural heritage domain. Part of the project focuses on understanding the various effects of crowdsourcing cultural heritage artifact annotations. Its experiments involved external annotators annotating museum collections, and a subset of the data collected from the crowd was evaluated for trustworthiness. In total, 4,588 users tagged 89,671 artifacts from 21 participating museums, using 480,617 tags. A part of these annotations, consisting of 45,860 tags, was manually evaluated by professionals at these museums and used as the basis for our second case study. In this project, the annotations were classified in a more refined way than in the previous case study, namely as: {Todo, Judgement-negative, Judgement-positive, Problematic-foreign, Problematic-huh, Problematic-misperception, Problematic-misspelling, Problematic-no_consensus, Problematic-personal, Usefulness-not_useful, Usefulness-useful}. There are three main categories: judgement (a personal judgement by the annotator about the picture), problematic (for several different reasons) and usefulness (stating whether the annotation is useful or not). We consider only usefulness-useful as a positive judgement; all the others are considered negative evaluations. The tags classified as todo are discarded, since their evaluation has not yet been performed.
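In both case studies the training set is built from a fixed number of evaluated annotations per user, with the remainder forming the test set. A sketch of this split, assuming each annotation is a (user, tag, timestamp, evaluation) tuple; taking the first k annotations chronologically is our own assumption, as the extract does not state the selection order:

from collections import defaultdict

def split_per_user(annotations, k):
    """First k annotations of each user form the training set;
    the rest form the test set."""
    by_user = defaultdict(list)
    for user, tag, timestamp, evaluation in annotations:
        by_user[user].append((user, tag, timestamp, evaluation))
    train, test = [], []
    for anns in by_user.values():
        anns.sort(key=lambda a: a[2])  # chronological order (assumed)
        train.extend(anns[:k])
        test.extend(anns[k:])
    return train, test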

The Steve.Museum dataset is provided as a MySQL database and consists of several tables. The most important ones for us are: steve_term, which contains information such as the identifiers of the annotated artifacts and the words associated with them (tags); steve_session, which reports information about when the tags were provided and by whom; and steve_term_review, which contains information about the tag evaluations. We join these tables and select the information that is relevant for us: the tags, their authors, their timestamps (i.e. date and time of creation) and their evaluations. We partition the dataset thus obtained into a training set and a test set, as shown in Table 2, along with their percentage coverage of the whole dataset (in the second column, '% training set covered') and the obtained results. We use the training set to learn a model for evaluating user-provided tags, and we consider the tags in the test set as tags newly introduced by the authors for which we build a model.

Table 2 Results of the evaluation of Algorithm 1 over the Steve.Museum dataset

# Tags per reputation | % Training set covered | Accuracy | Precision | Recall | F-measure | Time (sec.)
5                     | 18%                    |          |           |        |           |
10                    |                        |          |           |        |           |
15                    |                        |          |           |        |           |
20                    |                        |          |           |        |           |
25                    |                        |          |           |        |           |
30                    |                        |          |           |        |           |

Results of the evaluation of Algorithm 1 over the Steve.Museum dataset for training sets formed by aggregating 5, 10, 15, 20, 25 and 30 evaluated tags per user reputation. We report the percentage of the dataset actually covered by the training set, and the accuracy, precision, recall and F-measure of our prediction.

System description

Having introduced the methods that we make use of and the datasets that we analyze, we provide here a description of the system that we propose.

High-level system overview

The system that we propose aims at relieving the institution's personnel (reviewers in particular) from the burden of controlling and evaluating all the annotations inserted by users. The system asks for some interaction with the reviewers, but tries to minimize it. Figure 2 shows a high-level view of the model. For each user, the system asks the reviewers to review a fixed number of annotations, and on the basis of these reviews it builds user reputations. A reputation is meant to express a global measure of trustworthiness and accountability of the corresponding user. The reviews are also used to assess the trustworthiness of each tag inserted afterwards by a user: given a tag, the system evaluates it by looking at the evaluations already available, where the evaluations of the tags semantically closer to the one being evaluated have a higher impact. So we have two distinct phases: a first training step where we collect samples of manual reviews, and a second step where we make automatic assessments of tag trustworthiness (possibly after having clustered the evaluated tags, to improve the computation time). The more reviews there are, the more reliable the reputation is, but this number also depends on the workforce at the disposal of the institution. On the other hand, as we will see in the following section, this parameter does not significantly affect the accuracy obtained.

Moreover, we do not need to set an acceptance threshold (e.g., accept only annotations with a trust value of, say, at least 0.9, for trust values ranging from zero to one), in contrast to the work of Ceolin et al. [21]. This is important since such a threshold is arbitrary, and it is not trivial to find a balance between the risk of accepting wrong annotations and that of rejecting good ones.

[Figure 2: High-level overview of the system. We evaluate a subset of user-contributed tags and use it to build the annotator reputation model. We evaluate the remaining set of the tags based on semantic similarity measures and subjective logic. The unevaluated tags are then ranked based on the annotator reputation model, and the model serves as a basis to accept or reject the unevaluated tags from the users.]

We introduce here a running example that accompanies the description of the system in the rest of this section. Suppose that a user, Alex (whose profile already contains three tags which were evaluated by the museum), newly contributes to the collection of the Fictitious National Museum by tagging five artifacts. Alex tags one artifact with Chinese. If the museum immediately uses the tag for classifying the artifact, this might be risky because the tag might be wrong (maliciously or not). On the other hand, had the museum enough internal employees to check the externally contributed tag, it would not have needed to crowdsource it. The system that we propose relies on a few evaluations of Alex's tags by the museum. Based on these evaluations, the system: (1) computes Alex's reputation; (2) computes a trust value for the new tag; and (3) decides whether to accept it or not. We describe the system implementation in the following sections.

Annotation representation

We adopt the Open Annotation model [48] as a standard model for describing annotations, together with the most relevant related metadata (such as the author and the time of creation). The Open Annotation model allows us to reify the annotation itself, and by treating it as an object, we can easily link to it properties like the annotator URI or the time of creation. Moreover, the review of an annotation can itself be represented as an annotation whose target is the original annotation and whose body contains the value of the review. To continue with our example, Figure 3 and Listing 1 show an example of an annotation and a corresponding review, both represented as annotations from the Open Annotation model.

Listing 1 Example of an annotation and the corresponding evaluation. The annotation is represented using the Annotation class from the Open Annotation model. The evaluation is represented as an annotation of the annotation. (The prefix declarations were garbled in the source; the rdf, oac and foaf namespaces below are the standard ones, while ex and tag are example namespaces.)

@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix oac:  <http://www.openannotation.org/ns/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ex:   <http://example.org/> .      # example namespace
@prefix tag:  <http://example.org/tag/> .  # example namespace

ex:user_1 rdf:type oac:Annotator ;
    foaf:givenName "Alex" .

ex:annotation_1 rdf:type oac:Annotation ;
    oac:hasBody tag:Chinese ;
    oac:annotator ex:user_1 ;
    oac:hasTarget ex:img_231 .

ex:review_1 rdf:type oac:Annotation ;
    oac:hasBody ex:ann_accepted ;
    oac:annotator ex:reviewer_1 ;
    oac:hasTarget ex:annotation_1 .

ex:ann_accepted oac:annotates ex:annotation_1 .

Trust management

We employ subjective logic [49] for representing, computing and reasoning on trust assessments. There are several reasons why we use this logic. First, it allows us to quantify the truth of statements regarding different subjects (e.g. user reputation and tag trust value) by aggregating the evidence at our disposal in a simple and clear way that accounts both for the distribution of the observed evidence and for its size, hence quantifying the uncertainty of our assessment. Second, each statement in subjective logic is equivalent to a Beta or Dirichlet probability distribution, and hence we can tackle the problem from a statistical point of view without the need to change our data representation. Third, the logic offers several operators to combine the assessments made over the statements of our interest. We have made limited use of operators so far, but we aim at expanding this in the near future. Lastly, we use subjective logic because it allows us to represent formally the fact that the evidence we collect is linked to a given subject (user, tag) and is based on a specific point of view (the reviewers of a museum) that is the source of the evaluations. Trust is context-dependent, since different users or tags (or, more generally, agents and artifacts) might receive different trust evaluations depending on the context in which they are situated and on the reviewer. In our scenarios we do not have at our disposal an explicit description of the museums' trust policies. Also, we do not aim at determining a generic tag (or user) trust level. Our goal is to learn a model that evaluates tags as closely as possible to the way the museum's own reviewers would evaluate them.

[Figure 3: Representation of the annotations and their reviews. We represent annotations and their reviews as annotations from the Open Annotation model.]
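Putting the pieces together for the running example, the threshold-free acceptance step described earlier (rank a user's new tags by trust value, then accept a share proportional to the reputation) might be sketched as follows. Here trust_value stands for a per-tag scoring function such as the similarity-weighted expected value sketched earlier, and the use of rounding is our own assumption:

def accept_tags(new_tags, reputation, trust_value):
    """Rank new tags by their trust value and accept a number of them
    proportional to the user's (or stereotype's) reputation,
    avoiding any fixed acceptance threshold."""
    ranked = sorted(new_tags, key=trust_value, reverse=True)
    n_accept = round(reputation * len(ranked))
    return ranked[:n_accept], ranked[n_accept:]

# e.g., with a reputation of 0.75 and five new tags from Alex, the four
# highest-ranked tags are accepted and the remaining one is rejected.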


More information

Retrieval Evaluation. Hongning Wang

Retrieval Evaluation. Hongning Wang Retrieval Evaluation Hongning Wang CS@UVa What we have learned so far Indexed corpus Crawler Ranking procedure Research attention Doc Analyzer Doc Rep (Index) Query Rep Feedback (Query) Evaluation User

More information

Enhanced retrieval using semantic technologies:

Enhanced retrieval using semantic technologies: Enhanced retrieval using semantic technologies: Ontology based retrieval as a new search paradigm? - Considerations based on new projects at the Bavarian State Library Dr. Berthold Gillitzer 28. Mai 2008

More information

CS224W: Social and Information Network Analysis Project Report: Edge Detection in Review Networks

CS224W: Social and Information Network Analysis Project Report: Edge Detection in Review Networks CS224W: Social and Information Network Analysis Project Report: Edge Detection in Review Networks Archana Sulebele, Usha Prabhu, William Yang (Group 29) Keywords: Link Prediction, Review Networks, Adamic/Adar,

More information

CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS

CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS CHAPTER 4 CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS 4.1 Introduction Optical character recognition is one of

More information

InfoQ: Computational Assessment of Information Quality on the Web. Centrum voor Wiskunde en Informatica (CWI) Vrije Universiteit Amsterdam

InfoQ: Computational Assessment of Information Quality on the Web. Centrum voor Wiskunde en Informatica (CWI) Vrije Universiteit Amsterdam InfoQ: Computational Assessment of Information Quality on the Web Davide Ceolin 1 *, Chantal van Son 2, Lora Aroyo 2, Julia Noordegraaf 3, Ozkan Sener 2, Robin Sharma 2, Piek Vossen 2, Serge ter Braake

More information

CHAPTER 5 ANT-FUZZY META HEURISTIC GENETIC SENSOR NETWORK SYSTEM FOR MULTI - SINK AGGREGATED DATA TRANSMISSION

CHAPTER 5 ANT-FUZZY META HEURISTIC GENETIC SENSOR NETWORK SYSTEM FOR MULTI - SINK AGGREGATED DATA TRANSMISSION CHAPTER 5 ANT-FUZZY META HEURISTIC GENETIC SENSOR NETWORK SYSTEM FOR MULTI - SINK AGGREGATED DATA TRANSMISSION 5.1 INTRODUCTION Generally, deployment of Wireless Sensor Network (WSN) is based on a many

More information

Detecting Controversial Articles in Wikipedia

Detecting Controversial Articles in Wikipedia Detecting Controversial Articles in Wikipedia Joy Lind Department of Mathematics University of Sioux Falls Sioux Falls, SD 57105 Darren A. Narayan School of Mathematical Sciences Rochester Institute of

More information

Presented by: Dimitri Galmanovich. Petros Venetis, Alon Halevy, Jayant Madhavan, Marius Paşca, Warren Shen, Gengxin Miao, Chung Wu

Presented by: Dimitri Galmanovich. Petros Venetis, Alon Halevy, Jayant Madhavan, Marius Paşca, Warren Shen, Gengxin Miao, Chung Wu Presented by: Dimitri Galmanovich Petros Venetis, Alon Halevy, Jayant Madhavan, Marius Paşca, Warren Shen, Gengxin Miao, Chung Wu 1 When looking for Unstructured data 2 Millions of such queries every day

More information

3.4 Data-Centric workflow

3.4 Data-Centric workflow 3.4 Data-Centric workflow One of the most important activities in a S-DWH environment is represented by data integration of different and heterogeneous sources. The process of extract, transform, and load

More information

Edge Classification in Networks

Edge Classification in Networks Charu C. Aggarwal, Peixiang Zhao, and Gewen He Florida State University IBM T J Watson Research Center Edge Classification in Networks ICDE Conference, 2016 Introduction We consider in this paper the edge

More information

ResPubliQA 2010

ResPubliQA 2010 SZTAKI @ ResPubliQA 2010 David Mark Nemeskey Computer and Automation Research Institute, Hungarian Academy of Sciences, Budapest, Hungary (SZTAKI) Abstract. This paper summarizes the results of our first

More information

COMP90042 LECTURE 3 LEXICAL SEMANTICS COPYRIGHT 2018, THE UNIVERSITY OF MELBOURNE

COMP90042 LECTURE 3 LEXICAL SEMANTICS COPYRIGHT 2018, THE UNIVERSITY OF MELBOURNE COMP90042 LECTURE 3 LEXICAL SEMANTICS SENTIMENT ANALYSIS REVISITED 2 Bag of words, knn classifier. Training data: This is a good movie.! This is a great movie.! This is a terrible film. " This is a wonderful

More information

Improving Suffix Tree Clustering Algorithm for Web Documents

Improving Suffix Tree Clustering Algorithm for Web Documents International Conference on Logistics Engineering, Management and Computer Science (LEMCS 2015) Improving Suffix Tree Clustering Algorithm for Web Documents Yan Zhuang Computer Center East China Normal

More information

Ontology Matching with CIDER: Evaluation Report for the OAEI 2008

Ontology Matching with CIDER: Evaluation Report for the OAEI 2008 Ontology Matching with CIDER: Evaluation Report for the OAEI 2008 Jorge Gracia, Eduardo Mena IIS Department, University of Zaragoza, Spain {jogracia,emena}@unizar.es Abstract. Ontology matching, the task

More information

MIA - Master on Artificial Intelligence

MIA - Master on Artificial Intelligence MIA - Master on Artificial Intelligence 1 Hierarchical Non-hierarchical Evaluation 1 Hierarchical Non-hierarchical Evaluation The Concept of, proximity, affinity, distance, difference, divergence We use

More information

The Trustworthiness of Digital Records

The Trustworthiness of Digital Records The Trustworthiness of Digital Records International Congress on Digital Records Preservation Beijing, China 16 April 2010 1 The Concept of Record Record: any document made or received by a physical or

More information

Social Navigation Support through Annotation-Based Group Modeling

Social Navigation Support through Annotation-Based Group Modeling Social Navigation Support through Annotation-Based Group Modeling Rosta Farzan 2 and Peter Brusilovsky 1,2 1 School of Information Sciences and 2 Intelligent Systems Program University of Pittsburgh, Pittsburgh

More information

A Semantic Role Repository Linking FrameNet and WordNet

A Semantic Role Repository Linking FrameNet and WordNet A Semantic Role Repository Linking FrameNet and WordNet Volha Bryl, Irina Sergienya, Sara Tonelli, Claudio Giuliano {bryl,sergienya,satonelli,giuliano}@fbk.eu Fondazione Bruno Kessler, Trento, Italy Abstract

More information

Graphical Models. David M. Blei Columbia University. September 17, 2014

Graphical Models. David M. Blei Columbia University. September 17, 2014 Graphical Models David M. Blei Columbia University September 17, 2014 These lecture notes follow the ideas in Chapter 2 of An Introduction to Probabilistic Graphical Models by Michael Jordan. In addition,

More information

Topic 7 Machine learning

Topic 7 Machine learning CSE 103: Probability and statistics Winter 2010 Topic 7 Machine learning 7.1 Nearest neighbor classification 7.1.1 Digit recognition Countless pieces of mail pass through the postal service daily. A key

More information

EXAM PREPARATION GUIDE

EXAM PREPARATION GUIDE When Recognition Matters EXAM PREPARATION GUIDE PECB Certified ISO 9001 Lead Auditor www.pecb.com The objective of the PECB Certified ISO 9001 Lead Auditor examination is to ensure that the candidate possesses

More information

Adaptable and Adaptive Web Information Systems. Lecture 1: Introduction

Adaptable and Adaptive Web Information Systems. Lecture 1: Introduction Adaptable and Adaptive Web Information Systems School of Computer Science and Information Systems Birkbeck College University of London Lecture 1: Introduction George Magoulas gmagoulas@dcs.bbk.ac.uk October

More information

Leveraging Set Relations in Exact Set Similarity Join

Leveraging Set Relations in Exact Set Similarity Join Leveraging Set Relations in Exact Set Similarity Join Xubo Wang, Lu Qin, Xuemin Lin, Ying Zhang, and Lijun Chang University of New South Wales, Australia University of Technology Sydney, Australia {xwang,lxue,ljchang}@cse.unsw.edu.au,

More information

The Basics of Graphical Models

The Basics of Graphical Models The Basics of Graphical Models David M. Blei Columbia University September 30, 2016 1 Introduction (These notes follow Chapter 2 of An Introduction to Probabilistic Graphical Models by Michael Jordan.

More information

NATURAL LANGUAGE PROCESSING

NATURAL LANGUAGE PROCESSING NATURAL LANGUAGE PROCESSING LESSON 9 : SEMANTIC SIMILARITY OUTLINE Semantic Relations Semantic Similarity Levels Sense Level Word Level Text Level WordNet-based Similarity Methods Hybrid Methods Similarity

More information

Automatic Linguistic Indexing of Pictures by a Statistical Modeling Approach

Automatic Linguistic Indexing of Pictures by a Statistical Modeling Approach Automatic Linguistic Indexing of Pictures by a Statistical Modeling Approach Abstract Automatic linguistic indexing of pictures is an important but highly challenging problem for researchers in content-based

More information

INFORMATION RETRIEVAL SYSTEM USING FUZZY SET THEORY - THE BASIC CONCEPT

INFORMATION RETRIEVAL SYSTEM USING FUZZY SET THEORY - THE BASIC CONCEPT ABSTRACT INFORMATION RETRIEVAL SYSTEM USING FUZZY SET THEORY - THE BASIC CONCEPT BHASKAR KARN Assistant Professor Department of MIS Birla Institute of Technology Mesra, Ranchi The paper presents the basic

More information

Unsupervised Learning and Clustering

Unsupervised Learning and Clustering Unsupervised Learning and Clustering Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Spring 2009 CS 551, Spring 2009 c 2009, Selim Aksoy (Bilkent University)

More information

A Recommendation Model Based on Site Semantics and Usage Mining

A Recommendation Model Based on Site Semantics and Usage Mining A Recommendation Model Based on Site Semantics and Usage Mining Sofia Stamou Lefteris Kozanidis Paraskevi Tzekou Nikos Zotos Computer Engineering and Informatics Department, Patras University 26500 GREECE

More information

Application of Individualized Service System for Scientific and Technical Literature In Colleges and Universities

Application of Individualized Service System for Scientific and Technical Literature In Colleges and Universities Journal of Applied Science and Engineering Innovation, Vol.6, No.1, 2019, pp.26-30 ISSN (Print): 2331-9062 ISSN (Online): 2331-9070 Application of Individualized Service System for Scientific and Technical

More information

Trustworthiness of Data on the Web

Trustworthiness of Data on the Web Trustworthiness of Data on the Web Olaf Hartig Humboldt-Universität zu Berlin Department of Computer Science hartig@informatik.hu-berlin.de Abstract: We aim for an evolution of the Web of data to a Web

More information

Document Retrieval using Predication Similarity

Document Retrieval using Predication Similarity Document Retrieval using Predication Similarity Kalpa Gunaratna 1 Kno.e.sis Center, Wright State University, Dayton, OH 45435 USA kalpa@knoesis.org Abstract. Document retrieval has been an important research

More information

EXAM PREPARATION GUIDE

EXAM PREPARATION GUIDE When Recognition Matters EXAM PREPARATION GUIDE PECB Certified ISO/IEC 20000 Lead Auditor www.pecb.com The objective of the Certified ISO/IEC 20000 Lead Auditor examination is to ensure that the candidate

More information

Crowdsourcing Annotations for Visual Object Detection

Crowdsourcing Annotations for Visual Object Detection Crowdsourcing Annotations for Visual Object Detection Hao Su, Jia Deng, Li Fei-Fei Computer Science Department, Stanford University Abstract A large number of images with ground truth object bounding boxes

More information

CHAPTER 6 PROPOSED HYBRID MEDICAL IMAGE RETRIEVAL SYSTEM USING SEMANTIC AND VISUAL FEATURES

CHAPTER 6 PROPOSED HYBRID MEDICAL IMAGE RETRIEVAL SYSTEM USING SEMANTIC AND VISUAL FEATURES 188 CHAPTER 6 PROPOSED HYBRID MEDICAL IMAGE RETRIEVAL SYSTEM USING SEMANTIC AND VISUAL FEATURES 6.1 INTRODUCTION Image representation schemes designed for image retrieval systems are categorized into two

More information

Context Change and Versatile Models in Machine Learning

Context Change and Versatile Models in Machine Learning Context Change and Versatile s in Machine Learning José Hernández-Orallo Universitat Politècnica de València jorallo@dsic.upv.es ECML Workshop on Learning over Multiple Contexts Nancy, 19 September 2014

More information

Semantic Web. Ontology Engineering and Evaluation. Morteza Amini. Sharif University of Technology Fall 93-94

Semantic Web. Ontology Engineering and Evaluation. Morteza Amini. Sharif University of Technology Fall 93-94 ه عا ی Semantic Web Ontology Engineering and Evaluation Morteza Amini Sharif University of Technology Fall 93-94 Outline Ontology Engineering Class and Class Hierarchy Ontology Evaluation 2 Outline Ontology

More information

Text Document Clustering Using DPM with Concept and Feature Analysis

Text Document Clustering Using DPM with Concept and Feature Analysis Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 2, Issue. 10, October 2013,

More information

Semantically Driven Snippet Selection for Supporting Focused Web Searches

Semantically Driven Snippet Selection for Supporting Focused Web Searches Semantically Driven Snippet Selection for Supporting Focused Web Searches IRAKLIS VARLAMIS Harokopio University of Athens Department of Informatics and Telematics, 89, Harokopou Street, 176 71, Athens,

More information

REENGINEERING SYSTEM

REENGINEERING SYSTEM GENERAL OBJECTIVES OF THE SUBJECT At the end of the course, Individuals will analyze the characteristics of the materials in the industries, taking into account its advantages and functionality, for their

More information

INFORMATION RETRIEVAL SYSTEM: CONCEPT AND SCOPE

INFORMATION RETRIEVAL SYSTEM: CONCEPT AND SCOPE 15 : CONCEPT AND SCOPE 15.1 INTRODUCTION Information is communicated or received knowledge concerning a particular fact or circumstance. Retrieval refers to searching through stored information to find

More information

Chapter 3: Google Penguin, Panda, & Hummingbird

Chapter 3: Google Penguin, Panda, & Hummingbird Chapter 3: Google Penguin, Panda, & Hummingbird Search engine algorithms are based on a simple premise: searchers want an answer to their queries. For any search, there are hundreds or thousands of sites

More information

Annotation Component in KiWi

Annotation Component in KiWi Annotation Component in KiWi Marek Schmidt and Pavel Smrž Faculty of Information Technology Brno University of Technology Božetěchova 2, 612 66 Brno, Czech Republic E-mail: {ischmidt,smrz}@fit.vutbr.cz

More information

A Session-based Ontology Alignment Approach for Aligning Large Ontologies

A Session-based Ontology Alignment Approach for Aligning Large Ontologies Undefined 1 (2009) 1 5 1 IOS Press A Session-based Ontology Alignment Approach for Aligning Large Ontologies Editor(s): Name Surname, University, Country Solicited review(s): Name Surname, University,

More information

Unsupervised Learning and Clustering

Unsupervised Learning and Clustering Unsupervised Learning and Clustering Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Spring 2008 CS 551, Spring 2008 c 2008, Selim Aksoy (Bilkent University)

More information

Estimating the Quality of Databases

Estimating the Quality of Databases Estimating the Quality of Databases Ami Motro Igor Rakov George Mason University May 1998 1 Outline: 1. Introduction 2. Simple quality estimation 3. Refined quality estimation 4. Computing the quality

More information

ANNUAL REPORT Visit us at project.eu Supported by. Mission

ANNUAL REPORT Visit us at   project.eu Supported by. Mission Mission ANNUAL REPORT 2011 The Web has proved to be an unprecedented success for facilitating the publication, use and exchange of information, at planetary scale, on virtually every topic, and representing

More information

CLUSTER ANALYSIS APPLIED TO EUROPEANA DATA

CLUSTER ANALYSIS APPLIED TO EUROPEANA DATA CLUSTER ANALYSIS APPLIED TO EUROPEANA DATA by Esra Atescelik In partial fulfillment of the requirements for the degree of Master of Computer Science Department of Computer Science VU University Amsterdam

More information

Wireless Network Security : Spring Arjun Athreya March 3, 2011 Survey: Trust Evaluation

Wireless Network Security : Spring Arjun Athreya March 3, 2011 Survey: Trust Evaluation Wireless Network Security 18-639: Spring 2011 Arjun Athreya March 3, 2011 Survey: Trust Evaluation A scenario LOBOS Management Co A CMU grad student new to Pittsburgh is looking for housing options in

More information

Character Recognition

Character Recognition Character Recognition 5.1 INTRODUCTION Recognition is one of the important steps in image processing. There are different methods such as Histogram method, Hough transformation, Neural computing approaches

More information

Keywords: clustering algorithms, unsupervised learning, cluster validity

Keywords: clustering algorithms, unsupervised learning, cluster validity Volume 6, Issue 1, January 2016 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Clustering Based

More information

Information Retrieval

Information Retrieval Multimedia Computing: Algorithms, Systems, and Applications: Information Retrieval and Search Engine By Dr. Yu Cao Department of Computer Science The University of Massachusetts Lowell Lowell, MA 01854,

More information

Challenges of Analyzing Parametric CFD Results. White Paper Published: January

Challenges of Analyzing Parametric CFD Results. White Paper Published: January Challenges of Analyzing Parametric CFD Results White Paper Published: January 2011 www.tecplot.com Contents Introduction... 3 Parametric CFD Analysis: A Methodology Poised for Growth... 4 Challenges of

More information

Semantic Web. Ontology Engineering and Evaluation. Morteza Amini. Sharif University of Technology Fall 95-96

Semantic Web. Ontology Engineering and Evaluation. Morteza Amini. Sharif University of Technology Fall 95-96 ه عا ی Semantic Web Ontology Engineering and Evaluation Morteza Amini Sharif University of Technology Fall 95-96 Outline Ontology Engineering Class and Class Hierarchy Ontology Evaluation 2 Outline Ontology

More information

Pilot: Collaborative Glossary Platform

Pilot: Collaborative Glossary Platform Pilot: Collaborative Glossary Platform CI Context Viewpoint 1. As-Is Workflow Model Workflow Name: Term / Definition Collection Domain Context: Typically each group / community has its own list of terms

More information

TERM BASED WEIGHT MEASURE FOR INFORMATION FILTERING IN SEARCH ENGINES

TERM BASED WEIGHT MEASURE FOR INFORMATION FILTERING IN SEARCH ENGINES TERM BASED WEIGHT MEASURE FOR INFORMATION FILTERING IN SEARCH ENGINES Mu. Annalakshmi Research Scholar, Department of Computer Science, Alagappa University, Karaikudi. annalakshmi_mu@yahoo.co.in Dr. A.

More information

Gene Clustering & Classification

Gene Clustering & Classification BINF, Introduction to Computational Biology Gene Clustering & Classification Young-Rae Cho Associate Professor Department of Computer Science Baylor University Overview Introduction to Gene Clustering

More information

A Generalized Method to Solve Text-Based CAPTCHAs

A Generalized Method to Solve Text-Based CAPTCHAs A Generalized Method to Solve Text-Based CAPTCHAs Jason Ma, Bilal Badaoui, Emile Chamoun December 11, 2009 1 Abstract We present work in progress on the automated solving of text-based CAPTCHAs. Our method

More information

Bayesian analysis of genetic population structure using BAPS: Exercises

Bayesian analysis of genetic population structure using BAPS: Exercises Bayesian analysis of genetic population structure using BAPS: Exercises p S u k S u p u,s S, Jukka Corander Department of Mathematics, Åbo Akademi University, Finland Exercise 1: Clustering of groups of

More information

Chapter 2 Overview of the Design Methodology

Chapter 2 Overview of the Design Methodology Chapter 2 Overview of the Design Methodology This chapter presents an overview of the design methodology which is developed in this thesis, by identifying global abstraction levels at which a distributed

More information

RETRACTED ARTICLE. Web-Based Data Mining in System Design and Implementation. Open Access. Jianhu Gong 1* and Jianzhi Gong 2

RETRACTED ARTICLE. Web-Based Data Mining in System Design and Implementation. Open Access. Jianhu Gong 1* and Jianzhi Gong 2 Send Orders for Reprints to reprints@benthamscience.ae The Open Automation and Control Systems Journal, 2014, 6, 1907-1911 1907 Web-Based Data Mining in System Design and Implementation Open Access Jianhu

More information

ALIN Results for OAEI 2016

ALIN Results for OAEI 2016 ALIN Results for OAEI 2016 Jomar da Silva, Fernanda Araujo Baião and Kate Revoredo Department of Applied Informatics Federal University of the State of Rio de Janeiro (UNIRIO), Rio de Janeiro, Brazil {jomar.silva,fernanda.baiao,katerevoredo}@uniriotec.br

More information

Chapter S:II. II. Search Space Representation

Chapter S:II. II. Search Space Representation Chapter S:II II. Search Space Representation Systematic Search Encoding of Problems State-Space Representation Problem-Reduction Representation Choosing a Representation S:II-1 Search Space Representation

More information

Overview of Akamai s Personal Data Processing Activities and Role

Overview of Akamai s Personal Data Processing Activities and Role Overview of Akamai s Personal Data Processing Activities and Role Last Updated: April 2018 This document is maintained by the Akamai Global Data Protection Office 1 Introduction Akamai is a global leader

More information

Learning to Match. Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li

Learning to Match. Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li Learning to Match Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li 1. Introduction The main tasks in many applications can be formalized as matching between heterogeneous objects, including search, recommendation,

More information

What is this Song About?: Identification of Keywords in Bollywood Lyrics

What is this Song About?: Identification of Keywords in Bollywood Lyrics What is this Song About?: Identification of Keywords in Bollywood Lyrics by Drushti Apoorva G, Kritik Mathur, Priyansh Agrawal, Radhika Mamidi in 19th International Conference on Computational Linguistics

More information

Linking Entities in Chinese Queries to Knowledge Graph

Linking Entities in Chinese Queries to Knowledge Graph Linking Entities in Chinese Queries to Knowledge Graph Jun Li 1, Jinxian Pan 2, Chen Ye 1, Yong Huang 1, Danlu Wen 1, and Zhichun Wang 1(B) 1 Beijing Normal University, Beijing, China zcwang@bnu.edu.cn

More information

EXAM PREPARATION GUIDE

EXAM PREPARATION GUIDE When Recognition Matters EXAM PREPARATION GUIDE PECB Certified Management System Auditor www.pecb.com The objective of the PECB Certified Management System Auditor examination is to ensure that the candidates

More information

EXAM PREPARATION GUIDE

EXAM PREPARATION GUIDE EXAM PREPARATION GUIDE PECB Certified ISO 50001 Lead Auditor The objective of the PECB Certified ISO 50001 Lead Auditor examination is to ensure that the candidate has the knowledge and skills to plan

More information

CS 224W Final Report Group 37

CS 224W Final Report Group 37 1 Introduction CS 224W Final Report Group 37 Aaron B. Adcock Milinda Lakkam Justin Meyer Much of the current research is being done on social networks, where the cost of an edge is almost nothing; the

More information

A NEURAL NETWORK BASED TRAFFIC-FLOW PREDICTION MODEL. Bosnia Herzegovina. Denizli 20070, Turkey. Buyukcekmece, Istanbul, Turkey

A NEURAL NETWORK BASED TRAFFIC-FLOW PREDICTION MODEL. Bosnia Herzegovina. Denizli 20070, Turkey. Buyukcekmece, Istanbul, Turkey Mathematical and Computational Applications, Vol. 15, No. 2, pp. 269-278, 2010. Association for Scientific Research A NEURAL NETWORK BASED TRAFFIC-FLOW PREDICTION MODEL B. Gültekin Çetiner 1, Murat Sari

More information

Problems in Reputation based Methods in P2P Networks

Problems in Reputation based Methods in P2P Networks WDS'08 Proceedings of Contributed Papers, Part I, 235 239, 2008. ISBN 978-80-7378-065-4 MATFYZPRESS Problems in Reputation based Methods in P2P Networks M. Novotný Charles University, Faculty of Mathematics

More information

(Milano Lugano Evaluation Method) A systematic approach to usability evaluation. Part I LAB HCI prof. Garzotto - HOC-POLITECNICO DI MILANO a.a.

(Milano Lugano Evaluation Method) A systematic approach to usability evaluation. Part I LAB HCI prof. Garzotto - HOC-POLITECNICO DI MILANO a.a. (Milano Lugano Evaluation Method) A systematic approach to usability evaluation Part I LAB HCI prof. Garzotto - HOC-POLITECNICO DI MILANO a.a. 04-05 1 ABOUT USABILITY 1. WHAT IS USABILITY? Ë Usability

More information