Gold-standard evaluation of a folksonomy-based ontology learning model


Journal of Physics: Conference Series, PAPER, OPEN ACCESS
To cite this article: E Djuana 2018 J. Phys.: Conf. Ser

E Djuana
Computer System Laboratory, Electrical Engineering Department, Faculty of Industrial Technology, Trisakti University, Grogol, Jakarta 11440, Indonesia
edjuana@trisakti.ac.id

Abstract. Folksonomy, as one result of the collaborative tagging process, has been acknowledged for its potential to improve the categorization and searching of web resources. However, folksonomy contains ambiguities such as synonymy and polysemy, as well as the generality problem (tags at different levels of abstraction). To maximize its potential, methods for associating the tags of a folksonomy with semantics and structural relationships have been proposed, for example ontology learning. This paper evaluates our previous work in ontology learning using the gold-standard evaluation approach, in comparison to a notable state-of-the-art work and several baselines. The results show that our method is comparable to the state-of-the-art work, which further validates our approach, previously validated using a task-based evaluation.

1. Introduction
Collaborative tagging is a collective process whereby web users annotate web resources with their own keywords (tags) as user-defined metadata for those resources (Golder and Huberman, 2006; Marlow et al., 2006). Out of this process emerges an informal categorization system for searching and browsing web resources, defined by the users themselves. This categorization system is known as folksonomy, or folk-generated taxonomy (Peter, 2009). Folksonomy has been acknowledged for its potential to improve the categorization and searching of web resources (Peter, 2009; Bischoff et al., 2008; Robu et al., 2008). There are also studies that explore the potential of folksonomies for building semantic resources such as lightweight ontologies or taxonomies (Heymann and Garcia-Molina, 2006; Schmitz, 2006; Mika, 2007; Garcia-Silva, 2012).
However, a folksonomy may contain inherent semantic ambiguities, e.g. synonymy, polysemy and the generality problem (Golder and Huberman, 2006). In addition, it has no explicit structural or semantic relationships among tags (Djuana et al, 2012). Moreover, since tags are contributed by users, many are personal tags that may be meaningful only to their authors, such as "chapter 1" or "101" (Bischoff et al., 2008). All these challenges may hinder the potential of folksonomy for improving search, browsing and other applications such as recommendation.

There have been many attempts to overcome these challenges. One is to associate tags with semantic entities such as lexical resources, dictionaries, taxonomies or ontologies to make the meaning of tags explicit (Mika, 2007; Garcia-Silva, 2012). Another stream of approaches (Garcia-Silva, 2012) uses clustering, based on a similarity-based approach (Heymann and Garcia-Molina, 2006) or a set-theoretical approach (Schmitz, 2006), to consolidate tags into one structure such as a taxonomy or ontology. In this context, how to evaluate the accuracy and effectiveness of the built structures and relationships becomes a crucial issue for these entities to be useful.

Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI. Published under licence by Ltd

This paper evaluates the method for ontology learning from folksonomy that we presented in previous papers (Djuana et al, 2012, Djuana et al, 2013, Djuana et al, 2014). Previously, the result of this method was evaluated using a task-based approach (evaluation by application), namely in a tag recommendation scenario. In this paper, we focus on formal evaluation using the gold-standard evaluation approach (Dellschaft and Staab, 2008). One motivation is to evaluate the coverage of the built ontology, which indicates its usefulness for wider application, as well as its intrinsic quality. The other motivation is to compare our method with other ontology learning methods, including a state-of-the-art method: we ask how much our method improves over the baseline (classic) methods, compared with the state-of-the-art method's improvement over the same baselines, against a gold-standard universal ontology. We compare the built ontology to one notable state-of-the-art work by Liu et al. (2010), which has itself been compared to two notable baseline methods: Heymann and Garcia-Molina (2006), representing the similarity-based approach, and Schmitz (2006), representing the set-theoretical approach.

This paper is structured as follows. In Section 2 we discuss the key concepts in ontology learning, with specific emphasis on ontology learning from folksonomy, and ontology evaluation. Section 3 discusses related work in ontology learning and ontology evaluation, including the two baselines and the state-of-the-art work used as the benchmark. Section 4 summarizes our proposed approach from our previous papers.
In Section 5 we discuss the evaluation settings and in Section 6 we discuss the evaluation results. Finally, Section 7 concludes this paper and gives directions for future work.

2. Key Concepts
In this section, we provide a brief introduction to ontology and ontology learning, the specific class of ontology learning from folksonomy, and ontology learning evaluation approaches.

2.1. Ontology and Ontology Learning
Gruber defined an ontology as a formal description and explicit specification of a shared conceptualization (Gruber, 1992). Depending on the type of stored knowledge, ontologies can be divided into two types: domain ontologies and general ontologies (Navigli et al., 2003). A general ontology defines concepts that are general to all domains, while a domain ontology defines the specific concepts that form the core knowledge of one specific domain. Construction of ontologies from Web contents is one important task in Web intelligence according to Zhong and Hayazaki (2002). This work points out that ontologies serve the Semantic Web by providing a controlled vocabulary of concepts, each with explicitly defined and machine-processable semantics. However, manual ontology construction is time-consuming and very costly. Therefore, automatic and semi-automatic ontology construction has been eagerly studied over the last decade (Maedche and Staab, 2001). One stream of approaches relies on machine learning and automated language-processing techniques to extract concepts and ontological relations from structured or unstructured data such as databases and text (Navigli et al., 2003).

2.2. Ontology Learning from Folksonomy
Mika stated that folksonomy, which emerges from collaborative tagging, has been acknowledged as a potential source for constructing ontologies. As it captures the vocabulary of users, which may be aggregated to produce emergent semantics, people may develop lightweight ontologies from it (Mika, 2007).
In this context, folksonomies can be regarded as a new data source for ontology learning that can be analyzed using techniques already used in this area, such as clustering, natural language processing, and formal concept analysis (Garcia-Silva et al, 2012).

In our previous work we adopted the view of Garcia-Silva et al (2012), who survey the most relevant approaches in the literature whose main objective is either to extract ontologies from tags in folksonomies or to associate tags with external semantic entities to make the meaning of those tags explicit. They identified three groups of approaches, based on 1) clustering techniques, i.e. clustering tags according to some relations among them (statistical techniques); 2) ontologies, i.e. associating semantic entities, e.g. from WordNet or Wikipedia, with tags to formally define their meaning; and 3) a hybrid approach, i.e. mixing clustering techniques and ontologies (Djuana et al, 2012).

2.3. Ontology Learning Evaluation
In this section, we summarize the strategies for evaluating ontology learning approaches according to the comprehensive evaluation strategies proposed by Dellschaft and Staab (2008).

Preliminaries. Dellschaft and Staab (2008) introduced two scenarios in ontology learning evaluation: 1) evaluation of the learning algorithm in the broader context of an automatic or semi-automatic approach to ontology engineering, whereby the results are influenced not only by the learning algorithm but also by the choice of the correct corpus, which must contain information relevant to the task; and 2) evaluation of the quality of the learning algorithm itself. There are two dimensions of an ontology that need to be evaluated: functional and structural.
The functional dimension of an ontology is related to its conceptualization, while the structural dimension is related to the representation of the ontology as a graph. While structural evaluation may pinpoint areas where problems could exist, it is functional evaluation that is more crucial for assessing the usefulness or effectiveness of the learned ontology. Dellschaft and Staab (2008) argued that in the first scenario the functional dimension of an ontology should be evaluated by means of an extrinsic, task-based evaluation, i.e. in the running application for which the ontology is engineered, while in the second scenario an intrinsic or task-neutral evaluation by means of a gold standard is usually the better choice.

Task-based approaches (among other approaches such as corpus-based and criteria-based ones) try to measure how far an ontology helps to improve the results of a certain task. A task-based evaluation is influenced by many aspects, which must be kept constant during all evaluations so that changes in the results can be attributed to changes in the ontologies used (Dellschaft and Staab, 2008). Gold-standard based approaches (as opposed to manual evaluation by human experts) compare the learned ontology with a previously created gold standard that represents an idealized outcome of the learning algorithm. A learning algorithm is better when the learned ontology has a high similarity with the gold standard (Dellschaft and Staab, 2008).

Gold-standard evaluation approach. The gold-standard based approach for evaluating ontologies may involve several measures. These can be divided into measures that evaluate only the lexical layer of an ontology, measures that also take the concept hierarchy (taxonomic layer) into account, and measures that evaluate the non-taxonomic relations contained in an ontology (Dellschaft and Staab, 2008).
In this paper, we will concentrate on the measures for evaluating the lexical and the taxonomic layer.

The lexical layer measures, also known as coverage measures, compare the terms from the reference and the learned ontology based on an exact match of strings. Examples are Term Precision and Term Recall, also known as Lexical Precision and Lexical Recall. The taxonomic layer measures, also known as relationship measures, compare the positions of two concepts in the learned and the reference hierarchy as a local measure. The global measure is then computed by averaging the results of the local measure over concept pairs from the reference and the learned ontology. The local measure is usually calculated using the local taxonomic overlap, which compares two concepts based on the sets of all their super- and sub-concepts. These two measures are described in more detail in Section 5.

3. Related Works
In this section, we describe the related body of work in ontology learning from folksonomy data that we use for evaluating our proposed method (Djuana et al, 2012, Djuana et al, 2013, Djuana et al, 2014) with the gold-standard approach, in addition to the previous task-based evaluation (tag recommendation). First, we describe two classic baseline approaches by 1) Heymann and Garcia-Molina (2006) and 2) Schmitz (2006), which are well known for their effectiveness and ease of implementation, in Section 3.1. Then, we describe related state-of-the-art approaches, in particular the method by Liu et al. (2010), which we use as the benchmark for our ontology validation.

Similarity-based approach
The algorithm published by Heymann and Garcia-Molina (2006) is classified as a similarity-based approach according to the classification by Liu et al (2010). The algorithm works on tag vectors, in which each entry is the number of times that the tag annotates a given object.
It then calculates the cosine similarity between tag vectors to build a tag similarity graph, in which each tag is represented by a vertex and two vertices are connected by an edge if the similarity of the tags they represent is above some threshold. As explained by Liu et al (2010), to build the hierarchy of tags (the taxonomy), the algorithm starts with a single-node tree whose only node is the "root" node representing the top of the tree. It then adds each tag in the tagging system to the tree in decreasing order of how central the tag is in the similarity graph described above. It decides where to put each candidate tag by computing its similarity to every node currently present in the tree, keeping track of the most similar node. The candidate tag is then either added as a child of the most similar node, if its similarity to that node is greater than some threshold, or added as a child of the root node if no good parent for it currently exists (Liu et al, 2010).

Set-theoretical approach
The algorithm published by Schmitz (2006) is classified as a set-theoretical approach according to the classification by Liu et al (2010). It extends the algorithm published by Sanderson and Croft (1999). The algorithm is based on the subsumption model, which partially orders concepts by pairwise subsumption relations. The subsumption relation between two concepts is usually derived by a set-theoretical method from the inclusion relation between their attribute sets, such as the occurrences of terms in documents. It was originally defined as follows: for two terms $x$ and $y$, $x$ is said to subsume $y$ if the following two conditions hold:

$P(x \mid y) = 1$, $P(y \mid x) < 1$

In other words, $x$ subsumes $y$ if the documents in which $y$ occurs are a subset of the documents in which $x$ occurs. Because $x$ subsumes $y$ and is more frequent, $x$ is the parent of $y$ in the hierarchy.
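The subsumption conditions just stated can be checked directly on sets of documents. The following is a minimal sketch, not the implementation used in the experiments: the toy `TERM_DOCS` data and the function name are hypothetical, conditional probabilities are estimated from document co-occurrence counts, and the `threshold` argument is an illustrative addition whose default of 1.0 reproduces the strict condition.

```python
def subsumption_pairs(term_docs, threshold=1.0):
    """Return sorted (parent, child) pairs where parent subsumes child.

    term_docs maps each term to the set of documents it occurs in.
    x subsumes y when P(x|y) >= threshold and P(y|x) < 1, where
    P(x|y) is estimated as |docs(x) & docs(y)| / |docs(y)|.
    """
    pairs = []
    for x, docs_x in term_docs.items():
        for y, docs_y in term_docs.items():
            if x == y or not docs_x or not docs_y:
                continue
            both = len(docs_x & docs_y)
            p_x_given_y = both / len(docs_y)
            p_y_given_x = both / len(docs_x)
            if p_x_given_y >= threshold and p_y_given_x < 1.0:
                pairs.append((x, y))
    return sorted(pairs)


# Toy data (hypothetical): the documents in which each term occurs.
TERM_DOCS = {
    "programming": {"d1", "d2", "d3", "d4"},
    "python": {"d1", "d2"},
    "java": {"d3"},
    "cooking": {"d5"},
}

print(subsumption_pairs(TERM_DOCS))
```

In this toy data every document tagged "python" or "java" is also tagged "programming", so "programming" becomes the parent of both, while "cooking" stays unrelated.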
Although a respectable number of term pairs were found that adhered to the two subsumption conditions, it was noticed that many just failed to be included because a few occurrences of the subsumed term, $y$, did not co-occur with $x$. Subsequently, the first condition was relaxed and subsumption was redefined as:

$P(x \mid y) \geq 0.8$, $P(y \mid x) < 1$

as described by Liu et al (2010). Schmitz (2006) adjusted the statistical thresholds of Sanderson and Croft (1999) to reflect ad hoc usage and added filters to control for highly idiosyncratic vocabulary, as follows:

$P(x \mid y) \geq t$, $D_x \geq D_{min}$, $U_x \geq U_{min}$, $P(y \mid x) < t$, $D_y \geq D_{min}$, $U_y \geq U_{min}$

where $t$ is the co-occurrence threshold, $D_x$ is the number of documents in which term $x$ occurs, which must be at least a minimum value $D_{min}$, and $U_x$ is the number of users that use $x$ in at least one annotation, which must be at least a minimum value $U_{min}$.

State of the Art Approaches
According to the framework proposed by Garcia-Silva et al (2012), our proposed work falls into the second group, which is based on ontologies. It is discussed in detail in Section 4. Beyond the approaches listed in the survey by Garcia-Silva et al (2012), we have previously described in Djuana et al (2012) several recent works that try to extract ontological structures from user tagging systems. Lin, Davis and Zhou (2009) extracted ontological structures by exploiting low-support association rule mining supplemented by WordNet. Trabelsi, Jrad and Yahia (2010) focused more on extracting non-taxonomic relationships from folksonomies using triadic concepts with external resources: WordNet, Wikipedia and Google. Tang et al. (2009) and Liu et al. (2010) represent state-of-the-art work for generating ontologies from folksonomies, based respectively on generative probabilistic models (a tag-topic model) and on a set-theoretical approach (producing a tag subsumption graph). We chose the state-of-the-art work by Liu et al. (2010) as the benchmark for our proposed approach because it is a major improvement on the subsumption models based on the set-theoretical approach: it proposes to reduce noisy subsumptions, including irrelevant and inconsistent paths.
It was also chosen for its comprehensive evaluation, conducted against the classic baselines (Heymann and Garcia-Molina, 2006; Schmitz, 2006) according to the evaluation framework proposed by Dellschaft and Staab (2008).

The approach by Liu et al (2010) consists of three steps. In the first step, it identifies subsumption relations between tags with a set-theoretical method. Since the subsumption relations discovered in this step may be inconsistent, it resorts to ranking the tags by generality to settle this problem. Therefore, in the second step, it constructs a tag subsumption graph and computes the generality scores of tags with a random-walk-based procedure. In the last step, it uses an agglomerative clustering approach, which leverages the result of the tag generality ranking, to generate the concept hierarchy.

4. Proposed Model
To describe our proposed approach, we present several definitions below. The approach is summarized here as background for the evaluation; the details have been published in previous papers (Djuana et al, 2012, Djuana et al, 2013, Djuana et al, 2014).

4.1. Definitions
Collaborative Tagging System. A collaborative tagging system contains three entities: users, tags, and items, described below:
Users $U = \{u_1, u_2, \ldots, u_{|U|}\}$ contains all users in an online community who have used tags to annotate their items;

Tags $T = \{t_1, t_2, \ldots, t_{|T|}\}$ contains all tags used by users in $U$. Tags are typically arbitrary strings, each of which could be a single word or a short phrase. In this respect, a tag is defined as a sequence of terms. For $t \in T$, $t = \langle term_1, term_2, \ldots, term_m \rangle$, a function $tagset(t) = \{term_1, term_2, \ldots, term_m\}$ is defined to return the terms in a tag;
Items $I = \{i_1, i_2, \ldots, i_{|I|}\}$ contains all domain-relevant items or resources. What is considered an item depends on the type of collaborative tagging system; for instance, in Delicious the items are mainly bookmarks.

Based on these three entities, a collaborative tagging system is formulated as a folksonomy, which is a 4-tuple $F = (U, T, I, Y)$, where $U$, $T$, $I$ are finite sets whose elements are the users, tags and items, respectively. $Y$ is a ternary relation between those elements, i.e. $Y \subseteq U \times T \times I$, whose elements are called tag assignments; an element $(u, t, i) \in Y$ represents that user $u$ annotated item $i$ using tag $t$.

General Ontology. The general ontology is defined as a 2-tuple $GeneralONTO = (C, R)$, where $C = \{c_1, c_2, \ldots, c_{|C|}\}$ is a set of concepts and $R = \{r_1, r_2, \ldots, r_{|R|}\}$ is a set of relations representing the relationships between concepts. A concept $c$ in $C$ is a 3-tuple $c = (id, synset, category)$, where $id$ is a unique identification of concept $c$; $synset$ is a synonym set containing synonymic terms that represent the meaning of the concept $c$; and $category$ is a taxonomic category used to classify the concept $c$. A relation $r$ in the relation set $R$ is a 3-tuple $r = (type, x, y)$, where $type \in \{is\_a, \ldots\}$ and $x, y \in C$ are the concepts that hold the relation $r$. Specifically, the set of synonyms representing $c$ is denoted $synset(c)$ and the category of $c$ is denoted $category(c)$.
Each term $w$ in $synset(c)$ is represented as a 2-tuple $(w, freq_c(w))$, where $w$ is a synonym term of the concept $c$ and $freq_c(w)$ is a frequency indicating how often this term has been used to represent the meaning of the concept $c$ in the accompanying corpus. For a term $w$, the set of concepts for which $w$ is a synonymic term is defined as $con(w) = \{c \mid (w, f) \in synset(c)\}$.

Domain Ontology. The domain ontology is defined as a 2-tuple $DomainOnto = (TC, TR)$, where $TC = \{tc_1, tc_2, \ldots, tc_{|TC|}\}$ is a set of tag-concepts, i.e., $TC \subseteq C \times 2^T$, and $TR = \{tr_1, tr_2, \ldots, tr_{|TR|}\}$ is a set of tag relations. Each element in $TC$ is a pair of a concept $c$ and a set of tags $\{t_1, t_2, \ldots, t_n\}$, i.e., $tc = (c, \{t_1, t_2, \ldots, t_n\}) \in TC$, which represents that each tag in $\{t_1, t_2, \ldots, t_n\}$ can be mapped to concept $c$. $TR$ is defined as:

$TR = \{r = (type, c_1, c_2) \in R \mid Concept\_Tag(c_1) \neq \emptyset, Concept\_Tag(c_2) \neq \emptyset\}$

4.2. Ontology Learning Process
Starting from the backbone ontology, the ontology learning process is expected to generate a domain ontology that represents a tag collection. This domain ontology is expected to contain a sub-ontology of the backbone ontology that is contextualized to the tag vocabulary and possibly personalized to the users' tag usage in that collection. The sub-ontology is expected to contain the tag-to-concept mappings and the taxonomic relationships between tags, which are extracted from the concept-to-concept relationships in the backbone ontology. The lexical knowledge base WordNet (Fellbaum, 1998) was chosen as the backbone ontology because of its wide coverage of concepts (over 200,000), its richness of relationships (semantic relationships such as is-a and part-of, and lexical relationships such as synonymy and antonymy), and the availability of an accompanying corpus and other facilities for the disambiguation process. The domain ontology generation process has three stages: mapping tags to concepts, mapping disambiguation and relationships extraction.
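As a concrete reading of the $DomainOnto = (TC, TR)$ definition above, the sketch below builds the tag-concepts and keeps only those backbone relations whose two concepts both have at least one tag mapped to them. It is an illustration under stated assumptions: the tag-to-concept mapping is taken as already given, the concept ids and tags are hypothetical toy values, and treating `x` as the more specific and `y` as the more general concept of an is_a relation is our own convention, not something the definitions fix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    type: str  # relation type, e.g. "is_a"
    x: str     # id of the more specific concept (assumed direction)
    y: str     # id of the more general concept (assumed direction)


def concept_tags(tag_to_concept):
    """Invert a tag -> concept mapping into concept -> set of tags."""
    result = {}
    for tag, concept in tag_to_concept.items():
        result.setdefault(concept, set()).add(tag)
    return result


def build_domain_ontology(relations, tag_to_concept):
    """Return (TC, TR): tag-concepts and the backbone relations among them."""
    tc = concept_tags(tag_to_concept)
    # Keep a relation only if both of its concepts have mapped tags.
    tr = [r for r in relations if r.x in tc and r.y in tc]
    return tc, tr


# Toy backbone relations and a toy tag-to-concept mapping (hypothetical).
GENERAL_RELATIONS = [
    Relation("is_a", "dog", "animal"),
    Relation("is_a", "cat", "animal"),
    Relation("is_a", "animal", "organism"),
]
TAG_TO_CONCEPT = {"dogs": "dog", "puppy": "dog", "animals": "animal"}

tc, tr = build_domain_ontology(GENERAL_RELATIONS, TAG_TO_CONCEPT)
print(tc)
print(tr)
```

Here the relations involving "cat" and "organism" are dropped because no tag in the collection maps to those concepts, which is how the domain ontology stays restricted to the tag vocabulary.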

Mapping Tags to Concepts. One tag may contain one or more terms. A tag may map directly to one of the synonym terms of a concept in the backbone ontology. In other cases, only part of a tag can be mapped to a synonym term. These cases were handled by three mapping approaches: (1) whole mapping, whereby the whole tag string is mapped to a synonym term of a concept; (2) partial mapping, whereby part of the tag string, after a phrase identification stage, is mapped to a synonym term of a concept; and (3) term mapping, whereby each individual term in the tag string is mapped to a synonym term of a concept. The result is a tag-to-concept mapping in which one tag may map to more than one concept. Overall, for $t \in T$, the tag-to-concept mapping is defined as follows:

$Tag\_Concept(t) = \begin{cases} Tag\_Concept_{whole}(t), & \text{directly mapped} \\ Tag\_Concept_{partial}(t), & \text{partially mapped} \\ Tag\_Concept_{term}(t), & \text{term mapped} \end{cases}$

Mapping Disambiguation. After all the possible mappings are found, the next stage is mapping disambiguation, which chooses the most appropriate of the mapped concepts to represent the meaning of a tag for this tag collection. Two disambiguation strategies were used: (1) disambiguation by frequency, which reflects an expert point of view on the general meaning of tags. Here the mapping strength comes from the frequency, in a representative corpus of documents, with which a synonym term is used to represent the meaning of the concept that contains it; (2) disambiguation by tag relevance, which reflects the users' point of view on the personal meaning of tags in the tag collection. Here the mapping strength comes from the tag relevance in relation to similar users' understanding and usage of tags. Given a related tag that has been used for an item, the mapping is chosen according to its relevance to the other tags. After mapping disambiguation, each tag $t$ is mapped to one and only one concept.
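The three mapping approaches and the frequency-based disambiguation described above can be sketched on a miniature backbone. Everything named here is hypothetical: the `SYNSETS` table stands in for WordNet, the frequencies are invented, tags are assumed to be whitespace-separated terms, and "phrase identification" is approximated by searching for the longest contiguous multi-term phrase. Disambiguation by tag relevance is not shown.

```python
# Hypothetical miniature backbone: concept id -> {synonym term: corpus frequency}.
SYNSETS = {
    "java.island": {"java": 2},
    "java.language": {"java": 7},
    "programming.activity": {"programming": 9, "computer programming": 4},
    "tutorial.lesson": {"tutorial": 5},
}


def matches(term):
    """Concepts that list `term` as a synonym, with the term's frequency."""
    return {c: syn[term] for c, syn in SYNSETS.items() if term in syn}


def candidate_concepts(tag):
    """Try whole, then partial (contiguous phrase), then per-term mapping."""
    terms = tag.lower().split()
    whole = matches(" ".join(terms))
    if whole:
        return "whole", whole
    # Partial mapping: longest proper contiguous phrase of two or more terms.
    for size in range(len(terms) - 1, 1, -1):
        for start in range(len(terms) - size + 1):
            found = matches(" ".join(terms[start:start + size]))
            if found:
                return "partial", found
    # Term mapping: union of the concepts matched by each individual term.
    found = {}
    for term in terms:
        for concept, freq in matches(term).items():
            found[concept] = max(freq, found.get(concept, 0))
    return "term", found


def disambiguate_by_frequency(tag):
    """Pick the candidate concept with the highest synonym frequency."""
    _, candidates = candidate_concepts(tag)
    if not candidates:
        return None
    # Sorting first makes ties resolve deterministically by concept id.
    return max(sorted(candidates), key=candidates.get)


for tag in ["java", "computer programming tutorial", "java tutorial"]:
    print(tag, candidate_concepts(tag)[0], disambiguate_by_frequency(tag))
```

The ambiguous tag "java" has two candidate concepts, and the frequency strategy resolves it to the one whose synonym is used more often in the (invented) corpus counts.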
The disambiguation is defined by a mapping $M_\gamma : T \to C$ that assigns each tag exactly one concept:

$M_\gamma(t) = \arg\max_{c \in Tag\_Concept(t)} T\_C_\gamma[t, c]$ (1)

where the matrix $T\_C_\gamma[t_i, c_j]_{m \times n}$ represents the strength of the mapping between tags and concepts, with $m = |T|$ and $n = |C|$, and $\gamma$ is a mapping disambiguation strategy. On the other hand, multiple tags may be mapped to one concept. The following function defines the mapping from a concept to tags:

$Concept\_Tag : C \to 2^T$, $Concept\_Tag(c) = \{t \mid t \in T, M_\gamma(t) = c\}$

In the end, the confirmed mappings according to the two disambiguation strategies are $M_{frequency}(t)$ and $M_{relevance}(t)$.

Relationships Extraction. Once the tag-to-concept mapping and mapping disambiguation processes are completed, each tag maps to a concept in the backbone ontology. Based on the tag-to-concept mappings, the available relationships (the is-a relation) among concepts in the general ontology are extracted to form the domain ontology.

5. Evaluation
5.1. Evaluation Method
To assess the intrinsic quality of our proposed method, we use the gold-standard evaluation approach discussed in Section 2. For task-based evaluation results, please refer to our previous papers (Djuana et al, 2012, Djuana et al, 2013, Djuana et al, 2014).

Following the gold-standard ontology chosen by Liu et al. (2010), we use the concept hierarchy from the Open Directory Project (ODP). ODP is a free, user-maintained hierarchical web directory generated by collaborating users. Each node in the ODP hierarchy has a topic label (e.g. Sports or Arts) and a set of associated URLs. In the following, a simplified definition of a core ontology is used, which contains only the lexical layer and the concept hierarchy. In Dellschaft and Staab (2008) a core ontology is defined as follows: the structure $O := (C, root, \leq_C)$ is called a core ontology, where $C$ is a set of concept identifiers and $root$ is a designated root concept for the partial order $\leq_C$ on $C$. This partial order is called the concept hierarchy or taxonomy. The relation $\forall c \in C: c \leq_C root$ holds for this concept hierarchy. Given a computed core ontology $O_C$ and a reference ontology $O_R$, with concept sets $C_C$ and $C_R$ respectively, the lexical precision (LP) and lexical recall (LR) are defined as follows:

$LP(O_C, O_R) = \frac{|C_C \cap C_R|}{|C_C|}$ (2)

$LR(O_C, O_R) = \frac{|C_C \cap C_R|}{|C_R|}$ (3)

To compare the learned ontology with the gold standard, each sub tree that starts with a different ODP topic label is compared to its corresponding ODP sub tree using lexical precision and lexical recall. The higher the values of precision and recall, the closer the learned ontology is to the gold-standard ontology.

5.2. Experiment Setup
Dataset and Experiment Run. One public folksonomy dataset was used for the experiment. We use a subset of the Delicious dataset provided by Wetzker et al (2008), which contains all public bookmarks of users posted on delicious.com between September 2003 and December […]. In our experiment, we use the data between September 2003 and July […]. We also filter for the dense part of the folksonomy using the p-core calculation according to Batagelj and Zaversnik (2002). This p-core calculation with p = 15 reduces the folksonomy to the tags, users, and items that each appear in at least 15 posts.
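The p-core filtering step can be pictured as pruning repeated until nothing changes. The sketch below is a naive fixed-point illustration rather than the generalized-cores algorithm of Batagelj and Zaversnik: the representation of a post as a `(user, item, tags)` triple, the function name and the toy data are all assumptions made for this example.

```python
from collections import Counter


def p_core(posts, p):
    """Prune until every remaining user, item and tag occurs in >= p posts.

    posts is a list of (user, item, tags) triples, with tags a set of strings.
    """
    posts = [(u, i, frozenset(ts)) for u, i, ts in posts]
    while True:
        users = Counter(u for u, _, _ in posts)
        items = Counter(i for _, i, _ in posts)
        tags = Counter(t for _, _, ts in posts for t in ts)
        kept = []
        for u, i, ts in posts:
            if users[u] < p or items[i] < p:
                continue  # drop posts of rare users or rare items
            ts = frozenset(t for t in ts if tags[t] >= p)
            if ts:  # drop posts left without any tag
                kept.append((u, i, ts))
        if kept == posts:
            return kept
        posts = kept


# Toy folksonomy (hypothetical); the experiments in this paper use p = 15.
POSTS = [
    ("u1", "i1", {"a", "b"}),
    ("u1", "i2", {"a"}),
    ("u2", "i1", {"a", "c"}),
    ("u2", "i2", {"a"}),
    ("u3", "i3", {"a"}),
]

for post in p_core(POSTS, 2):
    print(post)
```

Removing one element can push another below the threshold, which is why the pruning is iterated rather than applied once.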
The overall statistics are presented in Table 1.

Table 1. Delicious dataset statistics
          All          Filtered by p-core=15
#users    75,245       24,562
#items    3,158,4      45,793
#tags     456,         ,718
#posts    7,698,653    1,436,527

We implemented the two baselines by Heymann and Garcia-Molina (2006) and Schmitz (2006), ran their algorithms on our dataset and produced two versions of learned ontology structures. We ran our proposed algorithm on the same dataset and produced one version of a learned ontology structure. For each of these learned ontologies, we calculate the lexical precision and recall against the gold standard. We did not implement the method of Liu et al (2010); instead, we use the results reported in their paper directly, since they also compare against the same two baselines. To compare with their method, we set the percentage of improvement of our results over the two baselines on our dataset against the percentage of improvement reported in the paper of Liu et al (2010). By comparing the percentages of improvement, we indirectly compare the accuracy of the two algorithms.
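The lexical measures of Equations (2) and (3) and the improvement comparison can be sketched as follows. This is an illustration only: concept labels are compared by exact string match, the toy label sets are hypothetical, and the paper does not spell out how the percentage of improvement is computed, so the relative-difference formula used here is an assumption.

```python
def lexical_precision(computed, reference):
    """LP = |C_C & C_R| / |C_C|, using exact string matching of labels."""
    computed, reference = set(computed), set(reference)
    return len(computed & reference) / len(computed) if computed else 0.0


def lexical_recall(computed, reference):
    """LR = |C_C & C_R| / |C_R|, using exact string matching of labels."""
    computed, reference = set(computed), set(reference)
    return len(computed & reference) / len(reference) if reference else 0.0


def improvement_pct(ours, baseline):
    """Relative improvement of `ours` over `baseline`, in percent (assumed formula)."""
    return 100.0 * (ours - baseline) / baseline


# Toy concept labels for one sub tree (hypothetical values).
REFERENCE = {"sport", "football", "tennis", "golf"}
OURS = {"sport", "football", "tennis", "soccer"}
BASELINE = {"sport", "football", "cricket", "rugby", "darts"}

lp_ours = lexical_precision(OURS, REFERENCE)
lp_base = lexical_precision(BASELINE, REFERENCE)
lr_ours = lexical_recall(OURS, REFERENCE)
lr_base = lexical_recall(BASELINE, REFERENCE)

print("LP:", lp_ours, lp_base, improvement_pct(lp_ours, lp_base))
print("LR:", lr_ours, lr_base, improvement_pct(lr_ours, lr_base))
```

Comparing improvement percentages rather than raw scores is what allows results obtained on different datasets to be set side by side, at the cost of depending on how strong the baselines happen to be on each dataset.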

Table 2. Improvement Results of the Proposed Method over the Baselines
Columns, for each of Lexical Precision and Lexical Recall: Ours, Heymann, Schmitz, Ours vs Heymann, Ours vs Schmitz
Rows (sub trees): sport, science, news, program, history, book, culture, computers, game, education, resources, media, shop, graphic, health, all, average

Table 3. Improvement Results of the State-of-the-Art Method over the Baselines
Columns, for each of Lexical Precision and Lexical Recall: Liu, Heymann, Schmitz, Liu vs Heymann, Liu vs Schmitz
Rows (sub trees): sport, science, news, program, history, book, culture, computers, game, education, resources, media, shop, graphic, health, all, average

6. Results and Discussion
The results for our proposed method are presented in Table 2, while the results from the paper of Liu et al (2010) are copied in Table 3 for easy comparison. The average value is the average over all the sub trees, not including the "all" root. The "all" root is the case in which all sub trees are connected, so that for overlapping sub trees the overlapped part is merged.

Overall, the proposed method performs better than the two baselines, and its improvement over the Heymann and Garcia-Molina algorithm is higher than its improvement over the Schmitz algorithm. In this dense part of the folksonomy, the proposed method also follows the trend shown by the method of Liu et al, because idiosyncratic and personal tags are absent. The overall results also show that, both on average and in the "all" situation, the proposed method is better in terms of coverage, as shown by higher values of lexical precision and recall. The improvement in recall is higher than the improvement in precision, which shows that the proposed method is better in coverage than the method of Liu et al.

However, when we compare the percentage of improvement over the Schmitz algorithm, several sub trees, such as program (programming), computers and game, show a lower improvement in precision than the method of Liu et al., although the improvement over the Heymann algorithm is still better. We suspect that, since the proposed method is partly based on WordNet, which contains more general vocabulary than ODP, these more technical sub trees did not perform as well.

In this evaluation, we have not included the taxonomic evaluation in terms of taxonomic precision and recall, which evaluates how close the relationship structure of the learned ontology is to the gold standard. At this stage, we can conclude that, based on coverage alone, the proposed method is better, and that for the overall evaluation the proposed method may be comparable.
This conclusion is subject to the taxonomic evaluation, which is to be conducted as future work.

7. Conclusions
We have presented an evaluation of the proposed ontology learning approach using the gold-standard evaluation approach; the approach was previously evaluated using a task-based evaluation approach. The lexical evaluation (coverage) gives a positive indication that the proposed method is better than a notable state-of-the-art method. This is subject to a taxonomic (relationships) evaluation, which is to be conducted as near-future work.

References
[1] Batagelj, V. and Zaversnik, M. Generalized cores. Arxiv preprint cs/
[2] Bischoff, K., Firan, C.S., Nejdl, W. and Paiu, R. Can all tags be used for search? In Proceedings of the 17th ACM Conference on Information and Knowledge Management, ACM, New York, NY.
[3] Dellschaft, K. and Staab, S. Strategies for the evaluation of ontology learning. In Proceedings of the 2008 conference on Ontology Learning and Population: Bridging the Gap between Text and Knowledge, IOS Press, Amsterdam, The Netherlands.
[4] Djuana, E., Xu, Y. and Li, Y. Learning personalized tag ontology from user tagging information. In Proceedings of the Tenth Australasian Data Mining Conference (Sydney, Australia, December 05-07, 2012). AusDM ACS, Sydney, Aus.
[5] Djuana, E., Xu, Y., Li, Y. and Cox, C. Personalization in tag ontology learning for recommendation making. In Proceedings of the 14th International Conference on Information Integration and Web-based Applications & Services (Denpasar, Bali, December 03-05, 2012). iiwas '12. ACM, New York, NY.
[6] Djuana, E., Xu, Y., Li, Y., Josang, A. and Cox, C. An ontology based method for sparsity problem in tag recommendation. In Proceedings of the 15th International Conference on Enterprise Information Systems (Angers, France, July 04-07, 2013). ICEIS SCITEPRESS, INSTICC, Portugal.
[7] Djuana, E., Xu, Y., Li, Y. and Josang, A. A Combined Method for Mitigating Sparsity Problem in Tag Recommendation. In Proceedings of the 47th Hawaii International Conference on System Sciences (Waikoloa, HI, USA, January 6-9, 2014). IEEE Computer Society 2014.
[8] Fellbaum, C. (ed.). WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
[9] García-Silva, A., Corcho, O., Alani, H. and Gómez-Pérez, A. Review of the state of the art: discovering and associating semantics to tags in folksonomies. The Knowledge Engineering Review. 27, 1 (Feb. 2012), Cambridge University Press.
[10] Golder, S.A. and Huberman, B.A. Usage patterns of collaborative tagging systems. J. Inf. Sci. 32, 2 (Apr. 2006).
[11] Gruber, T.R. A translation approach to portable ontology specifications. Knowledge Acquisition, 5, 2.
[12] Heymann, P. and Garcia-Molina, H. Collaborative creation of communal hierarchical taxonomies in social tagging systems. Technical Report, Stanford University.
[13] Lin, H., Davis, J. and Zhou, Y. An integrated approach to extracting ontological structures from folksonomies. The Semantic Web: Research and Applications, Springer.
[14] Liu, K., Fang, B. and Zhang, W. Ontology emergence from folksonomies. In Proceedings of the 19th ACM International Conference on Information and Knowledge Management, ACM, New York, NY.
[15] Maedche, A. and Staab, S. Ontology learning for the semantic web. IEEE Intelligent Systems, 16, 2 (March 2001), IEEE, NJ, US.
[16] Marlow, C., Naaman, M., Boyd, D. and Davis, M. HT06, tagging paper, taxonomy, Flickr, academic article, to read. In Proceedings of the Seventeenth Conference on Hypertext and Hypermedia, ACM, New York, NY.
[17] Mika, P. Ontologies are us: A unified model of social networks and semantics. Web Semantics: Science, Services and Agents on the World Wide Web. 5, 1 (March 2007).
[18] Navigli, R., Velardi, P. and Gangemi, A. Ontology learning and its application to automated terminology translation. IEEE Intelligent Systems, 18, 1 (Jan 2003), IEEE, NJ, US.
[19] Peters, I. Folksonomies. Indexing and Retrieval in Web 2.0. De Gruyter Saur, Berlin, Germany.
[20] Robu, V., Halpin, H. and Shepherd, H. Emergence of consensus and shared vocabularies in collaborative tagging systems. ACM Trans. Web. 3, 4 (Sep. 2009), 14:1-34.
[21] Sanderson, M. and Croft, B. Deriving concept hierarchies from text. In Proceedings of the 22nd Annual International ACM SIGIR conference on Research and Development in Information Retrieval. SIGIR'99.
[22] Schmitz, P. Inducing ontology from flickr tags. In Proceedings of the Collaborative Web Tagging Workshop at WWW'06 (Edinburgh, Scotland, May 23-26, 2006).
[23] Tang, J., Leung, H., Luo, Q., Chen, D. and Gong, J. Towards ontology learning from folksonomies. In Proceedings of the 21st International Joint Conference on Artificial Intelligence.
[24] Trabelsi, C., Jrad, A.B. and Yahia, S.B. Bridging folksonomies and domain ontologies: Getting out non-taxonomic relations. In Proceedings of the IEEE International Conference on Data Mining Workshops, IEEE.
[25] Wetzker, R., Zimmermann, C. and Bauckhage, C. Analyzing social bookmarking systems: A del.icio.us cookbook. In Proceedings of European Conference on Artificial Intelligence (ECAI).
[26] Zhong, N. and Hayazaki, N. Roles of ontologies for web intelligence. In M.-S. Hacid, Z. Ras, D. Zighed & Y. Kodratoff (Eds.), Foundations of Intelligent Systems, 2366, Springer Berlin, Heidelberg.


More information

A Content-Based Method to Enhance Tag Recommendation

A Content-Based Method to Enhance Tag Recommendation Proceedings of the Twenty-First International Joint Conference on Artificial Intelligence (IJCAI-09) A Content-Based Method to Enhance Tag Recommendation Yu-Ta Lu, Shoou-I Yu, Tsung-Chieh Chang, Jane Yung-jen

More information

Exploiting routing information encoded into backlinks to improve topical crawling

Exploiting routing information encoded into backlinks to improve topical crawling 2009 International Conference of Soft Computing and Pattern Recognition Exploiting routing information encoded into backlinks to improve topical crawling Alban Mouton Valoria European University of Brittany

More information

TSS: A Hybrid Web Searches

TSS: A Hybrid Web Searches 410 TSS: A Hybrid Web Searches Li-Xin Han 1,2,3, Gui-Hai Chen 3, and Li Xie 3 1 Department of Mathematics, Nanjing University, Nanjing 210093, P.R. China 2 Department of Computer Science and Engineering,

More information

Available online at ScienceDirect. Procedia Computer Science 89 (2016 )

Available online at   ScienceDirect. Procedia Computer Science 89 (2016 ) Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 89 (2016 ) 562 567 Twelfth International Multi-Conference on Information Processing-2016 (IMCIP-2016) Image Recommendation

More information

Conceptual document indexing using a large scale semantic dictionary providing a concept hierarchy

Conceptual document indexing using a large scale semantic dictionary providing a concept hierarchy Conceptual document indexing using a large scale semantic dictionary providing a concept hierarchy Martin Rajman, Pierre Andrews, María del Mar Pérez Almenta, and Florian Seydoux Artificial Intelligence

More information

VISO: A Shared, Formal Knowledge Base as a Foundation for Semi-automatic InfoVis Systems

VISO: A Shared, Formal Knowledge Base as a Foundation for Semi-automatic InfoVis Systems VISO: A Shared, Formal Knowledge Base as a Foundation for Semi-automatic InfoVis Systems Jan Polowinski Martin Voigt Technische Universität DresdenTechnische Universität Dresden 01062 Dresden, Germany

More information

Tag Based Image Search by Social Re-ranking

Tag Based Image Search by Social Re-ranking Tag Based Image Search by Social Re-ranking Vilas Dilip Mane, Prof.Nilesh P. Sable Student, Department of Computer Engineering, Imperial College of Engineering & Research, Wagholi, Pune, Savitribai Phule

More information

Use of Content Tags in Managing Advertisements for Online Videos

Use of Content Tags in Managing Advertisements for Online Videos Use of Content Tags in Managing Advertisements for Online Videos Chia-Hsin Huang, H. T. Kung, Chia-Yung Su Harvard School of Engineering and Applied Sciences, Cambridge, MA 02138, USA {jashing, htk, cysu}@eecs.harvard.edu

More information

Making Sense Out of the Web

Making Sense Out of the Web Making Sense Out of the Web Rada Mihalcea University of North Texas Department of Computer Science rada@cs.unt.edu Abstract. In the past few years, we have witnessed a tremendous growth of the World Wide

More information

Ranking Web Pages by Associating Keywords with Locations

Ranking Web Pages by Associating Keywords with Locations Ranking Web Pages by Associating Keywords with Locations Peiquan Jin, Xiaoxiang Zhang, Qingqing Zhang, Sheng Lin, and Lihua Yue University of Science and Technology of China, 230027, Hefei, China jpq@ustc.edu.cn

More information

<is web> Information Systems & Semantic Web University of Koblenz Landau, Germany

<is web> Information Systems & Semantic Web University of Koblenz Landau, Germany Information Systems University of Koblenz Landau, Germany On Understanding the Sharing of Conceptualizations, Klaas Dellschaft What is an ontology? Gruber 93 (slightly adapted by Borst): An Ontology is

More information

Semantic Annotation for Semantic Social Networks. Using Community Resources

Semantic Annotation for Semantic Social Networks. Using Community Resources Semantic Annotation for Semantic Social Networks Using Community Resources Lawrence Reeve and Hyoil Han College of Information Science and Technology Drexel University, Philadelphia, PA 19108 lhr24@drexel.edu

More information

Exploring Social Annotations for Web Document Classification

Exploring Social Annotations for Web Document Classification Exploring Social Annotations for Web Document Classification ABSTRACT Michael G. Noll Hasso-Plattner-Institut, University of Potsdam 14440 Potsdam, Germany michael.noll@hpi.uni-potsdam.de Social annotation

More information

WordNet-based User Profiles for Semantic Personalization

WordNet-based User Profiles for Semantic Personalization PIA 2005 Workshop on New Technologies for Personalized Information Access WordNet-based User Profiles for Semantic Personalization Giovanni Semeraro, Marco Degemmis, Pasquale Lops, Ignazio Palmisano LACAM

More information

Using DDC to create a visual knowledge map as an aid to online information retrieval

Using DDC to create a visual knowledge map as an aid to online information retrieval Sudatta Chowdhury and G.G. Chowdhury Department of Computer and Information Sciences University of Strathclyde, Glasgow G1 1XH Using DDC to create a visual knowledge map as an aid to online information

More information

Using Linked Data to Reduce Learning Latency for e-book Readers

Using Linked Data to Reduce Learning Latency for e-book Readers Using Linked Data to Reduce Learning Latency for e-book Readers Julien Robinson, Johann Stan, and Myriam Ribière Alcatel-Lucent Bell Labs France, 91620 Nozay, France, Julien.Robinson@alcatel-lucent.com

More information

A Study of Pattern-based Subtopic Discovery and Integration in the Web Track

A Study of Pattern-based Subtopic Discovery and Integration in the Web Track A Study of Pattern-based Subtopic Discovery and Integration in the Web Track Wei Zheng and Hui Fang Department of ECE, University of Delaware Abstract We report our systems and experiments in the diversity

More information

Information Retrieval

Information Retrieval Information Retrieval CSC 375, Fall 2016 An information retrieval system will tend not to be used whenever it is more painful and troublesome for a customer to have information than for him not to have

More information

Context Sensitive Search Engine

Context Sensitive Search Engine Context Sensitive Search Engine Remzi Düzağaç and Olcay Taner Yıldız Abstract In this paper, we use context information extracted from the documents in the collection to improve the performance of the

More information

Query Difficulty Prediction for Contextual Image Retrieval

Query Difficulty Prediction for Contextual Image Retrieval Query Difficulty Prediction for Contextual Image Retrieval Xing Xing 1, Yi Zhang 1, and Mei Han 2 1 School of Engineering, UC Santa Cruz, Santa Cruz, CA 95064 2 Google Inc., Mountain View, CA 94043 Abstract.

More information

SCUBA DIVER: SUBSPACE CLUSTERING OF WEB SEARCH RESULTS

SCUBA DIVER: SUBSPACE CLUSTERING OF WEB SEARCH RESULTS SCUBA DIVER: SUBSPACE CLUSTERING OF WEB SEARCH RESULTS Fatih Gelgi, Srinivas Vadrevu, Hasan Davulcu Department of Computer Science and Engineering, Arizona State University, Tempe, AZ fagelgi@asu.edu,

More information