Towards the Automatic Creation of a Wordnet from a Term-based Lexical Network


Hugo Gonçalo Oliveira, Paulo Gomes
(hroliv,pgomes)@dei.uc.pt
Cognitive & Media Systems Group, CISUC, University of Coimbra
TextGraphs-5, Uppsala, July 15, 2010

Outline
1. Introduction: Lexical ontologies; Information extraction; Issues; Research goals
2. Approach: Clustering for synsets; Merging thesauri; Assigning terms to synsets
3. Experimentation: Preparation; Wordnet establishment
4. Concluding remarks

Introduction: Lexical ontologies
- Such as Princeton WordNet [Fellbaum 1998]
- Ontology + lexicon [Hirst 2004]
- Knowledge structured on words and their meanings
- Cover the whole language
- Not based on a specific domain
- Construction and maintenance involve time-consuming human effort!

Introduction: Information extraction from text
From dictionaries:
1. basketball, noun: a game, also known as hoops, played indoors...
   game HYPERNYM OF basketball
   basketball SYNONYM OF hoops
2. basketball, noun: the ball used in playing basketball.
   ball HYPERNYM OF basketball
From textual corpora:
... team sports, such as basketball, rugby...
   team sport HYPERNYM OF basketball
   team sport HYPERNYM OF rugby
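The corpus example above can be approximated with a single lexical pattern. The following is only a minimal sketch, assuming lower-cased English text, a naive plural rule (dropping a final "s") and a hypothetical helper name `extract_hypernymy`; it is not the extraction procedure used by the authors.

```python
import re

# "<hypernym>s, such as <hyponym>, <hyponym>, ..." (illustrative pattern only)
PATTERN = re.compile(r"(?P<hyper>[a-z ]+?)s?, such as (?P<hypos>[a-z]+(?:, [a-z]+)*)")

def extract_hypernymy(text):
    """Return (hypernym, 'HYPERNYM_OF', hyponym) triples found in the text."""
    triples = []
    for match in PATTERN.finditer(text.lower()):
        hypernym = match.group("hyper").strip()
        for hyponym in match.group("hypos").split(", "):
            triples.append((hypernym, "HYPERNYM_OF", hyponym))
    return triples

triples = extract_hypernymy("team sports, such as basketball, rugby")
```

The output is a list of term-based triples: the arguments are plain words, not word senses, which is exactly the limitation discussed next.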

Introduction: Issues
- Natural language is ambiguous
- Term-based networks are impractical for many applications
- In the previous example: is hoops a team sport?
- An example extracted from a Portuguese dictionary (ruína: ruin; habilidade: skill):
   ruína SYNONYM OF queda
   queda SYNONYM OF habilidade
   habilidade SYNONYM OF ruína??
- queda can mean either aptitude or downfall!
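To see why term-based synonymy cannot simply be chained, the sketch below follows synonymy links transitively over the two dictionary pairs from the example. The function name `connected_terms` is hypothetical and the traversal is a plain depth-first search, shown only as an illustration of the problem.

```python
from collections import defaultdict

def connected_terms(pairs, start):
    """Collect every term reachable from `start` by following synonymy pairs."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    seen, stack = {start}, [start]
    while stack:
        for neighbour in graph[stack.pop()]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen

pairs = [("ruína", "queda"), ("queda", "habilidade")]
group = connected_terms(pairs, "ruína")
```

Here `group` contains both ruína and habilidade, although they are only linked through the two different senses of queda. Grouping by sense (synsets) rather than by term avoids this.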

Introduction: Research goals (Onto.PT)
- Automatic construction of a lexical ontology for Portuguese
- Extracted from different sources:
   - Manually created thesauri
   - Language dictionaries/encyclopedias
   - Corpora
- Modelled after Princeton WordNet:
   - Synsets: groups of synonymous words
   - Synset-based relational triples
- WSD based on the knowledge already extracted, not on the context.

Approach: Information flow
[Figure: information flow]

Approach: Clustering for synsets
Synonymy networks tend to have a clustered structure.
[Figure]

Approach: Clustering for synsets
Synset discovery (inspired by [Gfeller et al. 2005]):
1. Split the original network into sub-networks and calculate the frequency-weighted adjacency matrix $F$ of each sub-network;
2. Add noise to each entry: $F_{ij} = F_{ij} + F_{ij}\,\delta$, $-0.5 < \delta < 0.5$;
3. Run MCL [van Dongen 2000], with $\gamma = 1.6$, over $F$ 30 times;
4. Use the (hard) clustering from each run to create $P$, a matrix with the probabilities of each pair of words in $F$ belonging to the same cluster;
5. Remove: (a) big clusters, $B$, if there is a group of clusters $C = \{C_1, C_2, \ldots, C_n\}$ such that $B = C_1 \cup C_2 \cup \ldots \cup C_n$; (b) clusters completely included in other clusters.
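Steps 2 to 4 can be sketched in plain Python as follows. This is an illustration under several assumptions that are not in the slides: self-loops are added before clustering (a common MCL convention), the noise is kept symmetric, each node is labelled by the row holding its largest value after a fixed number of iterations, and the function names are hypothetical. Steps 1 and 5 (splitting and cluster clean-up) are not covered.

```python
import random

def normalize_columns(M):
    """Rescale every column so that it sums to 1."""
    n = len(M)
    sums = [sum(M[i][j] for i in range(n)) for j in range(n)]
    return [[M[i][j] / sums[j] for j in range(n)] for i in range(n)]

def mcl_labels(F, gamma=1.6, iterations=50):
    """Basic Markov clustering; returns one cluster label per node."""
    n = len(F)
    # Self-loops are an assumption of this sketch, not stated in the slides.
    M = normalize_columns([[F[i][j] + (i == j) for j in range(n)] for i in range(n)])
    for _ in range(iterations):
        # Expansion: matrix squaring.
        M = [[sum(M[i][k] * M[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        # Inflation: element-wise power, then renormalization.
        M = normalize_columns([[x ** gamma for x in row] for row in M])
    return [max(range(n), key=lambda i: M[i][j]) for j in range(n)]

def coclustering_probabilities(F, runs=30, seed=0):
    """P[i][j] = fraction of noisy MCL runs in which i and j share a cluster."""
    rng = random.Random(seed)
    n = len(F)
    counts = [[0] * n for _ in range(n)]
    for _ in range(runs):
        noisy = [row[:] for row in F]
        for i in range(n):
            for j in range(i + 1, n):
                delta = rng.uniform(-0.5, 0.5)
                noisy[i][j] = noisy[j][i] = F[i][j] + F[i][j] * delta
        labels = mcl_labels(noisy)
        for i in range(n):
            for j in range(n):
                counts[i][j] += labels[i] == labels[j]
    return [[c / runs for c in row] for row in counts]

# Toy network: two disconnected triangles (nodes 0-2 and 3-5).
F = [[0.0] * 6 for _ in range(6)]
for group in ([0, 1, 2], [3, 4, 5]):
    for i in group:
        for j in group:
            if i != j:
                F[i][j] = 1.0
P = coclustering_probabilities(F)
```

Pairs of words from different triangles never share a cluster, so their entry in `P` is 0. Pairs inside a triangle are expected to score close to 1, and entries between those extremes would mark unstable assignments.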

Approach: Merging thesauri
Merging synsets from different thesauri
For each synset $T_i \in T$, select the $B_j \in B$ with the highest Jaccard coefficient $c = |T_i \cap B_j| / |T_i \cup B_j|$, and merge the two into a new synset $N = B_j \cup T_i$.
Example:
$B_1$ = (diva, beldade, beleza, deidade, deusa, divindade)
$B_2$ = (divindade, deidade, deus, nume)
$T_1$ = (divindade, diva, deusa)
$c(T_1, B_1) = 3/6 = 1/2$
$c(T_1, B_2) = 1/6$
$N = B_1 \cup T_1$ = (diva, beldade, beleza, deidade, deusa, divindade)
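The merging rule can be checked on the example synsets with a few lines of Python. This is a minimal sketch; the function names `jaccard` and `merge_synset` are illustrative and synsets are modelled as plain sets.

```python
def jaccard(a, b):
    """Jaccard coefficient: size of the intersection over size of the union."""
    return len(a & b) / len(a | b)

def merge_synset(t, base_synsets):
    """Merge synset t into the base synset it overlaps most with."""
    best = max(base_synsets, key=lambda b: jaccard(t, b))
    return best | t

B1 = {"diva", "beldade", "beleza", "deidade", "deusa", "divindade"}
B2 = {"divindade", "deidade", "deus", "nume"}
T1 = {"divindade", "diva", "deusa"}

N = merge_synset(T1, [B1, B2])
```

With these sets, T1 overlaps B1 more than B2, so `N` is the union of B1 and T1. A real merge would probably also need a minimum-overlap threshold for synsets with no good match; that detail is not given in the slides.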

Approach: Assigning terms to synsets
Mapping methods
Input:
- Thesaurus $T$, containing synsets
- Term-based semantic network, $N$, where each edge has a type $R$
Goal: map $a\ R\ b \in N$ to $A\ R\ B$, $(A, B) \in T$
Output: semantic network $W$, whose nodes are synsets, which relate to other synsets by means of semantic relations (a wordnet)

Approach: Assigning terms to synsets
Procedure 1: assignment of $a$ (in $a\ R\ b$) to a synset $A$:
1. Fix $b$.
2. Collect $S_a \subset T : \forall S_{ai} \in S_a,\ a \in S_{ai}$. If $a$ is not in $T$, create the synset $A = (a)$ and assign $a$ to $A$.
3. For each $S_{ai} \in S_a$, compute $p_{ai} = n_{ai} / |S_{ai}|$, where $n_{ai}$ = number of terms $t_j \in S_{ai} : (t_j\ R\ b)$.
Example:
$S_{a1} = (a, c, d, e)$, $p_{a1} = 3/4$
$S_{a2} = (a, f, g)$, $p_{a2} = 2/3$
$S_{a3} = (a, h, i, j)$, $p_{a3} = 1/4$
$a$ is assigned to $S_{a1}$, the synset with the highest $p_{ai}$.
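Procedure 1 can be sketched as below. The function name `assign_term` and the data layout (synsets as frozensets, term-based triples as a set of tuples) are assumptions made for illustration. The toy triples are chosen so that the proportions match the example: 3/4, 2/3 and 1/4.

```python
def assign_term(a, relation, b, thesaurus, triples):
    """Choose the synset of `a` for the term-based triple (a, relation, b)."""
    candidates = [s for s in thesaurus if a in s]
    if not candidates:
        return frozenset([a])  # a is not in T: create the synset A = (a)

    def proportion(synset):
        # Share of the synset's terms t for which (t, relation, b) was extracted.
        n = sum((t, relation, b) in triples for t in synset)
        return n / len(synset)

    return max(candidates, key=proportion)

thesaurus = [frozenset("acde"), frozenset("afg"), frozenset("ahij")]
triples = {("a", "R", "b"), ("c", "R", "b"), ("d", "R", "b"), ("f", "R", "b")}

chosen = assign_term("a", "R", "b", thesaurus, triples)
new_synset = assign_term("z", "R", "b", thesaurus, triples)
```

The same function can be reused for `b` by fixing `a` instead and counting triples of the form `(a, relation, t)`.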

Approach: Assigning terms to synsets
Procedure 1 (stage 2): examples and additional cleaning
- Only for semi-mapped triples $a\ R\ B$ and $A\ R\ b$
- Take advantage of established hypernymy links.
Assigning $b$ in $A\ R\ b$:
- If there is $C_i \in C$ with... $C_i$ HYPER OF $H \wedge A\ R\ H$, then $b$ is assigned to $C_i$
- If, for all $I_i$ such that $C_i$ HYPER OF $I_i$, $A\ R\ I_i$ holds, the triples $A\ R\ I_i$ can be inferred!
Example: if $H$ = (dog), $I_1$ = (cat), $I_2$ = (mouse) and $C_i$ = (mammal):
- $A$ = (hair) and $R$ = (PART OF)
- $A$ = (animal) and $R$ = (HYPER OF)
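The first rule of stage 2 can be read as follows: among the synsets that contain b, prefer one that is a hypernym of some synset H for which A R H is already established. The sketch below implements only this reading, which is an interpretation of the slide rather than a confirmed specification. The function name `assign_b`, the relation labels and the competing synset `other` are hypothetical.

```python
def assign_b(A, relation, b, thesaurus, synset_triples):
    """Choose a synset for term b in the semi-mapped triple (A, relation, b)."""
    for candidate in (s for s in thesaurus if b in s):
        hyponyms = [h for (c, r, h) in synset_triples
                    if c == candidate and r == "HYPER_OF"]
        # Accept the candidate if A already holds the relation with a hyponym.
        if any((A, relation, h) in synset_triples for h in hyponyms):
            return candidate
    return None

hair, dog = frozenset({"hair"}), frozenset({"dog"})
mammal = frozenset({"mammal"})
other = frozenset({"mammal", "x"})  # hypothetical competing sense of "mammal"
thesaurus = [other, mammal, hair, dog]
synset_triples = {(mammal, "HYPER_OF", dog), (hair, "PART_OF", dog)}

chosen = assign_b(hair, "PART_OF", "mammal", thesaurus, synset_triples)
```

In the toy data, `other` has no established hyponyms, so the term "mammal" is assigned to the synset that is a hypernym of (dog).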

Approach: Assigning terms to synsets
Alternative mapping procedure
1. $M$ = term-term matrix based on the adjacencies of the lexical network.
2. Collect all the synsets with $a$, $S_a \subset T$, and all synsets with $b$, $S_b \subset T$.
3. For each $A \in S_a$ and $B \in S_b$, with terms $A_i \in A$ and $B_j \in B$:
   $sim(A, B) = \frac{\sum_{i=1}^{|A|} \sum_{j=1}^{|B|} \cos(A_i, B_j)}{|A|\,|B|}$
4. Select the pair of synsets with the highest similarity.
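The similarity in step 3 is the average pairwise cosine between the term vectors of the two synsets. A minimal sketch, assuming each term's vector is its row in $M$ and using made-up three-dimensional vectors; the function names are illustrative.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def synset_similarity(A, B, vectors):
    """Average cosine over all term pairs (A_i, B_j)."""
    total = sum(cosine(vectors[a], vectors[b]) for a in A for b in B)
    return total / (len(A) * len(B))

def best_pair(S_a, S_b, vectors):
    """Pair of candidate synsets with the highest similarity."""
    pairs = ((A, B) for A in S_a for B in S_b)
    return max(pairs, key=lambda p: synset_similarity(p[0], p[1], vectors))

# Hypothetical rows of the term-term matrix M.
vectors = {"a": [1, 1, 0], "c": [1, 0, 0], "d": [0, 0, 1],
           "b": [1, 1, 0], "e": [1, 0, 0]}
A1, A2, B1 = ("a", "c"), ("a", "d"), ("b", "e")

pair = best_pair([A1, A2], [B1], vectors)
```

Unlike Procedure 1, this variant chooses the synsets for `a` and `b` jointly, which is why every candidate pair is scored.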

Experimentation: Preparation
Resources used (only nouns)
- PAPEL lexical network
   - Hypernymy, part-of and member-of triples
   - Synonymy instances
   - Huge synonymy sub-network with 16k nodes!!!
- TeP thesaurus
- OpenThesaurus.PT (OT)
- CLIP = clustered PAPEL
- TOP = TeP merged with OT, merged with CLIP

Experimentation: Preparation
Resulting thesaurus
 | TeP | OT | CLIP | TOP
Words: Quantity | 17,158 | 5,819 | 23,741 | 30,554
Words: Ambiguous | 5,[...] | [...] | [...],196 | 13,294
Words: Most ambiguous | [...] | [...] | [...] | [...]
Synsets: Quantity | 8,254 | 1,872 | 7,468 | 9,960
Synsets: Avg. size | [...] | [...] | [...] | [...]
Synsets: Biggest | [...] | [...] | [...] | [...]
Table: (Noun) thesauri in numbers.

Experimentation: Preparation
[Figure: clustered sub-network of PAPEL, example]

Experimentation: Preparation
Manual validation
Sample | Correct | Incorrect | N/A | Agreement
CLIP, 519 sets | 65.8% | 31.7% | 2.5% | 76.1%
CLIP', 310 sets | 81.1% | 16.9% | 2.0% | 84.2%
TOP, 480 sets | 83.2% | 15.8% | 1.0% | 82.3%
TOP', 448 sets | 86.8% | 12.3% | 0.9% | 83.0%
Table: Results of manual synset validation. CLIP' and TOP' only consider synsets with 10 or fewer words.
The quality is higher for smaller synsets.

Experimentation: Wordnet establishment
Resulting WordNet
 | Hypernym of | Part of | Member of
Term-based triples | 62,591 | 2,805 | 5,929
1st: Mapped | 27,750 | 1,460 | 3,962
1st: Same synset | [...] | [...] | [...]
1st: Already present | 3,[...] | [...] | [...]
Semi-mapped triples | 7,[...] | [...] | [...]
2nd: Mapped | [...] | [...] | [...]
2nd: Could be inferred | [...] | [...] | [...]
2nd: Already present | [...] | [...] | [...]
Synset-based triples | 23,572 | 1,416 | 3,783
Table: Results of triples mapping.

Experimentation: Wordnet establishment
Automatic validation
For each triple, $A\ R\ B$:
1. Compile a set of textual patterns denoting $R$, e.g.:
   (hypo) é um|uma (tipo|forma|variedade|...)* de (hyper)
   (whole/group) é um (grupo|conjunto|...) de (part/member)
   In English, roughly: "(hypo) is a (type|form|variety|...) of (hyper)" and "(whole/group) is a (group|set|...) of (part/member)".
2. Score the triple with the help of Google:
   $score = \frac{\sum_{i=1}^{|A|} \sum_{j=1}^{|B|} found(A_i, B_j, R)}{|A|\,|B|}$
Relation | Sample size | Validation
Hypernym of | 419 synsets | 44.1%
Member of | 379 synsets | 24.3%
Part of | 290 synsets | 24.8%
Table: Automatic validation of triples.
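The score is the fraction of term pairs for which at least one pattern was found. The sketch below keeps the web search behind a `found` callback, here replaced by a hypothetical in-memory lookup, since the slides do not describe how Google was queried. The terms and the attested pairs are toy data, not results from the experiment.

```python
def triple_score(A, B, relation, found):
    """Fraction of term pairs (A_i, B_j) for which a pattern of `relation` was found."""
    hits = sum(found(a, b, relation) for a in A for b in B)
    return hits / (len(A) * len(B))

# Stand-in for the web search: pairs assumed to be attested by some pattern.
ATTESTED = {("animal", "cão", "HYPERNYM_OF"), ("animal", "cachorro", "HYPERNYM_OF")}

def found(a, b, relation):
    """Return 1 if the (hypothetical) search finds a pattern instance, else 0."""
    return int((a, b, relation) in ATTESTED)

score = triple_score(("animal", "bicho"), ("cão", "cachorro"), "HYPERNYM_OF", found)
```

Keeping `found` as a parameter lets the same scoring be applied to a search engine, a local corpus or cached results.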

Concluding remarks
- Our work towards WSD without a context continues...
- Clustering is a suitable alternative for establishing synsets
   - What about networks not extracted from dictionaries?
- Rules can be defined to map terms in triples to synsets
   - Though some triples remain unmapped...
- Future:
   - Evaluate the alternative mapping method
   - Exploit other resources, e.g. Wiktionary and Wikipedia

References
Christiane Fellbaum, editor (1998). WordNet: An Electronic Lexical Database (Language, Speech, and Communication). The MIT Press.
Graeme Hirst (2004). Ontology and the lexicon. In Steffen Staab and Rudi Studer, editors, Handbook on Ontologies, International Handbooks on Information Systems. Springer.
S. M. van Dongen (2000). Graph Clustering by Flow Simulation. Ph.D. thesis, University of Utrecht.
David Gfeller, Jean-Cédric Chappelier and Paolo De Los Rios (2005). Synonym Dictionary Improvement through Markov Clustering and Clustering Stability. In Proc. of International Symposium on Applied Stochastic Models and Data Analysis (ASMDA).

Thank you!


WEIGHTING QUERY TERMS USING WORDNET ONTOLOGY IJCSNS International Journal of Computer Science and Network Security, VOL.9 No.4, April 2009 349 WEIGHTING QUERY TERMS USING WORDNET ONTOLOGY Mohammed M. Sakre Mohammed M. Kouta Ali M. N. Allam Al Shorouk

More information

Random Walks for Knowledge-Based Word Sense Disambiguation. Qiuyu Li

Random Walks for Knowledge-Based Word Sense Disambiguation. Qiuyu Li Random Walks for Knowledge-Based Word Sense Disambiguation Qiuyu Li Word Sense Disambiguation 1 Supervised - using labeled training sets (features and proper sense label) 2 Unsupervised - only use unlabeled

More information

TERM BASED WEIGHT MEASURE FOR INFORMATION FILTERING IN SEARCH ENGINES

TERM BASED WEIGHT MEASURE FOR INFORMATION FILTERING IN SEARCH ENGINES TERM BASED WEIGHT MEASURE FOR INFORMATION FILTERING IN SEARCH ENGINES Mu. Annalakshmi Research Scholar, Department of Computer Science, Alagappa University, Karaikudi. annalakshmi_mu@yahoo.co.in Dr. A.

More information

Introduction to Text Mining. Hongning Wang

Introduction to Text Mining. Hongning Wang Introduction to Text Mining Hongning Wang CS@UVa Who Am I? Hongning Wang Assistant professor in CS@UVa since August 2014 Research areas Information retrieval Data mining Machine learning CS@UVa CS6501:

More information

BabelNet: The automatic construction, evaluation and application of a wide-coverage multilingual semantic network

BabelNet: The automatic construction, evaluation and application of a wide-coverage multilingual semantic network BabelNet: The automatic construction, evaluation and application of a wide-coverage multilingual semantic network Roberto Navigli, Simone Paolo Ponzetto What is BabelNet a very large, wide-coverage multilingual

More information

A graph-based method to improve WordNet Domains

A graph-based method to improve WordNet Domains A graph-based method to improve WordNet Domains Aitor González, German Rigau IXA group UPV/EHU, Donostia, Spain agonzalez278@ikasle.ehu.com german.rigau@ehu.com Mauro Castillo UTEM, Santiago de Chile,

More information

0.1 Knowledge Organization Systems for Semantic Web

0.1 Knowledge Organization Systems for Semantic Web 0.1 Knowledge Organization Systems for Semantic Web 0.1 Knowledge Organization Systems for Semantic Web 0.1.1 Knowledge Organization Systems Why do we need to organize knowledge? Indexing Retrieval Organization

More information

Multi-Modal Word Synset Induction. Jesse Thomason and Raymond Mooney University of Texas at Austin

Multi-Modal Word Synset Induction. Jesse Thomason and Raymond Mooney University of Texas at Austin Multi-Modal Word Synset Induction Jesse Thomason and Raymond Mooney University of Texas at Austin Word Synset Induction kiwi Word Synset Induction chinese grapefruit kiwi kiwi vine Word Synset Induction

More information

The Dictionary Parsing Project: Steps Toward a Lexicographer s Workstation

The Dictionary Parsing Project: Steps Toward a Lexicographer s Workstation The Dictionary Parsing Project: Steps Toward a Lexicographer s Workstation Ken Litkowski ken@clres.com http://www.clres.com http://www.clres.com/dppdemo/index.html Dictionary Parsing Project Purpose: to

More information

Ontology Based Search Engine

Ontology Based Search Engine Ontology Based Search Engine K.Suriya Prakash / P.Saravana kumar Lecturer / HOD / Assistant Professor Hindustan Institute of Engineering Technology Polytechnic College, Padappai, Chennai, TamilNadu, India

More information

CHAPTER 5 SEARCH ENGINE USING SEMANTIC CONCEPTS

CHAPTER 5 SEARCH ENGINE USING SEMANTIC CONCEPTS 82 CHAPTER 5 SEARCH ENGINE USING SEMANTIC CONCEPTS In recent years, everybody is in thirst of getting information from the internet. Search engines are used to fulfill the need of them. Even though the

More information

Text Mining. Munawar, PhD. Text Mining - Munawar, PhD

Text Mining. Munawar, PhD. Text Mining - Munawar, PhD 10 Text Mining Munawar, PhD Definition Text mining also is known as Text Data Mining (TDM) and Knowledge Discovery in Textual Database (KDT).[1] A process of identifying novel information from a collection

More information

Package wordnet. February 15, 2013

Package wordnet. February 15, 2013 Package wordnet February 15, 2013 Title WordNet Interface Version 0.1-8 An interface to WordNet using the Jawbone Java API to WordNet. WordNet is an on-line lexical reference system developed by the Cognitive

More information

The Semantic Annotated Documents - From HTML to the Semantic Web

The Semantic Annotated Documents - From HTML to the Semantic Web Proceedings of the 2007 WSEAS International Conference on Computer Engineering and Applications, Gold Coast, Australia, January 17-19, 2007 413 The Semantic Annotated Documents - From HTML to the Semantic

More information

Using DEB Services for Knowledge Representation within the KYOTO Project

Using DEB Services for Knowledge Representation within the KYOTO Project Using DEB Services for Knowledge Representation within the KYOTO Project Aleš Horák and Adam Rambousek Faculty of Informatics, Masaryk University Botanická 68a, 602 00 Brno, Czech Republic {hales,xrambous}@fi.muni.cz

More information

Enriching Ontology Concepts Based on Texts from WWW and Corpus

Enriching Ontology Concepts Based on Texts from WWW and Corpus Journal of Universal Computer Science, vol. 18, no. 16 (2012), 2234-2251 submitted: 18/2/11, accepted: 26/8/12, appeared: 28/8/12 J.UCS Enriching Ontology Concepts Based on Texts from WWW and Corpus Tarek

More information

Making Sense Out of the Web

Making Sense Out of the Web Making Sense Out of the Web Rada Mihalcea University of North Texas Department of Computer Science rada@cs.unt.edu Abstract. In the past few years, we have witnessed a tremendous growth of the World Wide

More information

Conceptual document indexing using a large scale semantic dictionary providing a concept hierarchy

Conceptual document indexing using a large scale semantic dictionary providing a concept hierarchy Conceptual document indexing using a large scale semantic dictionary providing a concept hierarchy Martin Rajman, Pierre Andrews, María del Mar Pérez Almenta, and Florian Seydoux Artificial Intelligence

More information

Identifying Poorly-Defined Concepts in WordNet with Graph Metrics

Identifying Poorly-Defined Concepts in WordNet with Graph Metrics Identifying Poorly-Defined Concepts in WordNet with Graph Metrics John P. McCrae and Narumol Prangnawarat Insight Centre for Data Analytics, National University of Ireland, Galway john@mccr.ae, narumol.prangnawarat@insight-centre.org

More information

Evaluation of Synset Assignment to Bi-lingual Dictionary

Evaluation of Synset Assignment to Bi-lingual Dictionary Evaluation of Synset Assignment to Bi-lingual Dictionary Thatsanee Charoenporn 1, Virach Sornlertlamvanich 1, Chumpol Mokarat 1, Hitoshi Isahara 2, Hammam Riza 3, and Purev Jaimai 4 1 Thai Computational

More information

From legal texts to legal ontologies and question-answering systems

From legal texts to legal ontologies and question-answering systems From legal texts to legal ontologies and question-answering systems Paulo Quaresma pq@di.uevora.pt Spoken Language Systems Lab / Dept. of Informatics INESC-ID, Lisbon / University of Évora Portugal 1 Some

More information

NUS-I2R: Learning a Combined System for Entity Linking

NUS-I2R: Learning a Combined System for Entity Linking NUS-I2R: Learning a Combined System for Entity Linking Wei Zhang Yan Chuan Sim Jian Su Chew Lim Tan School of Computing National University of Singapore {z-wei, tancl} @comp.nus.edu.sg Institute for Infocomm

More information

An Unsupervised Word Sense Disambiguation System for Under-Resourced Languages

An Unsupervised Word Sense Disambiguation System for Under-Resourced Languages An Unsupervised Word Sense Disambiguation System for Under-Resourced Languages Dmitry Ustalov, Denis Teslenko, Alexander Panchenko, Mikhail Chernoskutov, Chris Biemann, Simone Paolo Ponzetto Data and Web

More information

Boolean Queries. Keywords combined with Boolean operators:

Boolean Queries. Keywords combined with Boolean operators: Query Languages 1 Boolean Queries Keywords combined with Boolean operators: OR: (e 1 OR e 2 ) AND: (e 1 AND e 2 ) BUT: (e 1 BUT e 2 ) Satisfy e 1 but not e 2 Negation only allowed using BUT to allow efficient

More information

CHAPTER 3 INFORMATION RETRIEVAL BASED ON QUERY EXPANSION AND LATENT SEMANTIC INDEXING

CHAPTER 3 INFORMATION RETRIEVAL BASED ON QUERY EXPANSION AND LATENT SEMANTIC INDEXING 43 CHAPTER 3 INFORMATION RETRIEVAL BASED ON QUERY EXPANSION AND LATENT SEMANTIC INDEXING 3.1 INTRODUCTION This chapter emphasizes the Information Retrieval based on Query Expansion (QE) and Latent Semantic

More information

ScienceDirect. Enhanced Associative Classification of XML Documents Supported by Semantic Concepts

ScienceDirect. Enhanced Associative Classification of XML Documents Supported by Semantic Concepts Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 46 (2015 ) 194 201 International Conference on Information and Communication Technologies (ICICT 2014) Enhanced Associative

More information

IN4325 Query refinement. Claudia Hauff (WIS, TU Delft)

IN4325 Query refinement. Claudia Hauff (WIS, TU Delft) IN4325 Query refinement Claudia Hauff (WIS, TU Delft) The big picture Information need Topic the user wants to know more about The essence of IR Query Translation of need into an input for the search engine

More information

WikiOnto: A System For Semi-automatic Extraction And Modeling Of Ontologies Using Wikipedia XML Corpus

WikiOnto: A System For Semi-automatic Extraction And Modeling Of Ontologies Using Wikipedia XML Corpus 2009 IEEE International Conference on Semantic Computing WikiOnto: A System For Semi-automatic Extraction And Modeling Of Ontologies Using Wikipedia XML Corpus Lalindra De Silva University of Colombo School

More information

2 Experimental Methodology and Results

2 Experimental Methodology and Results Developing Consensus Ontologies for the Semantic Web Larry M. Stephens, Aurovinda K. Gangam, and Michael N. Huhns Department of Computer Science and Engineering University of South Carolina, Columbia,

More information

A Method for Semi-Automatic Ontology Acquisition from a Corporate Intranet

A Method for Semi-Automatic Ontology Acquisition from a Corporate Intranet A Method for Semi-Automatic Ontology Acquisition from a Corporate Intranet Joerg-Uwe Kietz, Alexander Maedche, Raphael Volz Swisslife Information Systems Research Lab, Zuerich, Switzerland fkietz, volzg@swisslife.ch

More information

Package wordnet. November 26, 2017

Package wordnet. November 26, 2017 Title WordNet Interface Version 0.1-14 Package wordnet November 26, 2017 An interface to WordNet using the Jawbone Java API to WordNet. WordNet () is a large lexical database

More information

Semantic-Based Information Retrieval for Java Learning Management System

Semantic-Based Information Retrieval for Java Learning Management System AENSI Journals Australian Journal of Basic and Applied Sciences Journal home page: www.ajbasweb.com Semantic-Based Information Retrieval for Java Learning Management System Nurul Shahida Tukiman and Amirah

More information

Linking Lexicons and Ontologies: Mapping WordNet to the Suggested Upper Merged Ontology

Linking Lexicons and Ontologies: Mapping WordNet to the Suggested Upper Merged Ontology Linking Lexicons and Ontologies: Mapping WordNet to the Suggested Upper Merged Ontology Ian Niles and Adam Pease (presenter) Teknowledge 1800 Embarcadero Rd Palo Alto CA 94303 650 424 0500 650 493 2645

More information

EFFICIENT INTEGRATION OF SEMANTIC TECHNOLOGIES FOR PROFESSIONAL IMAGE ANNOTATION AND SEARCH

EFFICIENT INTEGRATION OF SEMANTIC TECHNOLOGIES FOR PROFESSIONAL IMAGE ANNOTATION AND SEARCH EFFICIENT INTEGRATION OF SEMANTIC TECHNOLOGIES FOR PROFESSIONAL IMAGE ANNOTATION AND SEARCH Andreas Walter FZI Forschungszentrum Informatik, Haid-und-Neu-Straße 10-14, 76131 Karlsruhe, Germany, awalter@fzi.de

More information

Improving Retrieval Experience Exploiting Semantic Representation of Documents

Improving Retrieval Experience Exploiting Semantic Representation of Documents Improving Retrieval Experience Exploiting Semantic Representation of Documents Pierpaolo Basile 1 and Annalina Caputo 1 and Anna Lisa Gentile 1 and Marco de Gemmis 1 and Pasquale Lops 1 and Giovanni Semeraro

More information

Cross-Lingual Word Sense Disambiguation

Cross-Lingual Word Sense Disambiguation Cross-Lingual Word Sense Disambiguation Priyank Jaini Ankit Agrawal pjaini@iitk.ac.in ankitag@iitk.ac.in Department of Mathematics and Statistics Department of Mathematics and Statistics.. Mentor: Prof.

More information

Assigning Polarity Automatically to the Synsets of a Wordnet-like Resource

Assigning Polarity Automatically to the Synsets of a Wordnet-like Resource Assigning Polarity Automatically to the Synsets of a Wordnet-like Resource Hugo Gonçalo Oliveira 1, António Paulo Santos 2, and Paulo Gomes 3 1 CISUC, Department of Informatics Engineering University of

More information

Context Sensitive Search Engine

Context Sensitive Search Engine Context Sensitive Search Engine Remzi Düzağaç and Olcay Taner Yıldız Abstract In this paper, we use context information extracted from the documents in the collection to improve the performance of the

More information

MEASUREMENT OF SEMANTIC SIMILARITY BETWEEN WORDS: A SURVEY

MEASUREMENT OF SEMANTIC SIMILARITY BETWEEN WORDS: A SURVEY MEASUREMENT OF SEMANTIC SIMILARITY BETWEEN WORDS: A SURVEY Ankush Maind 1, Prof. Anil Deorankar 2 and Dr. Prashant Chatur 3 1 M.Tech. Scholar, Department of Computer Science and Engineering, Government

More information

7 Analysis of experiments

7 Analysis of experiments Natural Language Addressing 191 7 Analysis of experiments Abstract In this research we have provided series of experiments to identify any trends, relationships and patterns in connection to NL-addressing

More information

Evaluating a Conceptual Indexing Method by Utilizing WordNet

Evaluating a Conceptual Indexing Method by Utilizing WordNet Evaluating a Conceptual Indexing Method by Utilizing WordNet Mustapha Baziz, Mohand Boughanem, Nathalie Aussenac-Gilles IRIT/SIG Campus Univ. Toulouse III 118 Route de Narbonne F-31062 Toulouse Cedex 4

More information

Web. The Discovery Method of Multiple Web Communities with Markov Cluster Algorithm

Web. The Discovery Method of Multiple Web Communities with Markov Cluster Algorithm Markov Cluster Algorithm Web Web Web Kleinberg HITS Web Web HITS Web Markov Cluster Algorithm ( ) Web The Discovery Method of Multiple Web Communities with Markov Cluster Algorithm Kazutami KATO and Hiroshi

More information

Soft Word Sense Disambiguation

Soft Word Sense Disambiguation Soft Word Sense Disambiguation Abstract: Word sense disambiguation is a core problem in many tasks related to language processing. In this paper, we introduce the notion of soft word sense disambiguation

More information

An ontology-based approach for semantics ranking of the web search engines results

An ontology-based approach for semantics ranking of the web search engines results An ontology-based approach for semantics ranking of the web search engines results Abdelkrim Bouramoul*, Mohamed-Khireddine Kholladi Computer Science Department, Misc Laboratory, University of Mentouri

More information

Software Design using Analogy and WordNet

Software Design using Analogy and WordNet Software Design using Analogy and WordNet Paulo Gomes, Francisco C. Pereira, Nuno Seco, Paulo Paiva, Paulo Carreiro, José L. Ferreira, Carlos Bento CISUC Centro de Informática e Sistemas da Universidade

More information

Automatic Construction of WordNets by Using Machine Translation and Language Modeling

Automatic Construction of WordNets by Using Machine Translation and Language Modeling Automatic Construction of WordNets by Using Machine Translation and Language Modeling Martin Saveski, Igor Trajkovski Information Society Language Technologies Ljubljana 2010 1 Outline WordNet Motivation

More information

Wordnet Based Document Clustering

Wordnet Based Document Clustering Wordnet Based Document Clustering Madhavi Katamaneni 1, Ashok Cheerala 2 1 Assistant Professor VR Siddhartha Engineering College, Kanuru, Vijayawada, A.P., India 2 M.Tech, VR Siddhartha Engineering College,

More information

INTERCONNECTING AND MANAGING MULTILINGUAL LEXICAL LINKED DATA. Ernesto William De Luca

INTERCONNECTING AND MANAGING MULTILINGUAL LEXICAL LINKED DATA. Ernesto William De Luca INTERCONNECTING AND MANAGING MULTILINGUAL LEXICAL LINKED DATA Ernesto William De Luca Overview 2 Motivation EuroWordNet RDF/OWL EuroWordNet RDF/OWL LexiRes Tool Conclusions Overview 3 Motivation EuroWordNet

More information

INFUSING SEMANTICS IN WSDL WEB SERVICE DESCRIPTIONS TO ENHANCE SERVICE COMPOSITION AND DISCOVERY

INFUSING SEMANTICS IN WSDL WEB SERVICE DESCRIPTIONS TO ENHANCE SERVICE COMPOSITION AND DISCOVERY INFUSING SEMANTICS IN WSDL WEB SERVICE DESCRIPTIONS TO ENHANCE SERVICE COMPOSITION AND DISCOVERY Ourania Hatzi, Mara Nikolaidou and Dimosthenis Anagnostopoulos Department of Informatics and Telematics,

More information

A Combined Method of Text Summarization via Sentence Extraction

A Combined Method of Text Summarization via Sentence Extraction Proceedings of the 2007 WSEAS International Conference on Computer Engineering and Applications, Gold Coast, Australia, January 17-19, 2007 434 A Combined Method of Text Summarization via Sentence Extraction

More information

<is web> Information Systems & Semantic Web University of Koblenz Landau, Germany

<is web> Information Systems & Semantic Web University of Koblenz Landau, Germany Information Systems & University of Koblenz Landau, Germany Semantic Search examples: Swoogle and Watson Steffen Staad credit: Tim Finin (swoogle), Mathieu d Aquin (watson) and their groups 2009-07-17

More information

Lexical ambiguity in cross-language image retrieval: a preliminary analysis.

Lexical ambiguity in cross-language image retrieval: a preliminary analysis. Lexical ambiguity in cross-language image retrieval: a preliminary analysis. Borja Navarro-Colorado, Marcel Puchol-Blasco, Rafael M. Terol, Sonia Vázquez and Elena Lloret. Natural Language Processing Research

More information

A Visual Annotation Framework Using Common- Sensical and Linguistic Relationships for Semantic Media Retrieval

A Visual Annotation Framework Using Common- Sensical and Linguistic Relationships for Semantic Media Retrieval A Visual Annotation Framework Using Common- Sensical and Linguistic Relationships for Semantic Media Retrieval Bageshree Shevade, Hari Sundaram Arizona State University {Bageshree.Shevade, Hari.Sundaram}@asu.edu

More information

THE METHOD OF AUTOMATED FORMATION OF THE SEMANTIC DATABASE MODEL OF THE DIALOG SYSTEM

THE METHOD OF AUTOMATED FORMATION OF THE SEMANTIC DATABASE MODEL OF THE DIALOG SYSTEM International Journal of Civil Engineering and Technology (IJCIET) Volume 9, Issue 7, July 2018, pp. 1117 1122, Article ID: IJCIET_09_07_117 Available online at http://www.iaeme.com/ijciet/issues.asp?jtype=ijciet&vtype=9&itype=7

More information

Jianyong Wang Department of Computer Science and Technology Tsinghua University

Jianyong Wang Department of Computer Science and Technology Tsinghua University Jianyong Wang Department of Computer Science and Technology Tsinghua University jianyong@tsinghua.edu.cn Joint work with Wei Shen (Tsinghua), Ping Luo (HP), and Min Wang (HP) Outline Introduction to entity

More information

A New Approach to Design Graph Based Search Engine for Multiple Domains Using Different Ontologies

A New Approach to Design Graph Based Search Engine for Multiple Domains Using Different Ontologies International Conference on Information Technology A New Approach to Design Graph Based Search Engine for Multiple Domains Using Different Ontologies Debajyoti Mukhopadhyay 1,3, Sukanta Sinha 2,3 1 Calcutta

More information

Building a Tokenizer for Indonesian

Building a Tokenizer for Indonesian Building a Tokenizer for Indonesian David Moeljadi and Hannah Choi Division of Linguistics and Multilingual Studies, Nanyang Technological University, Singapore The 21st International Symposium on Malay/Indonesian

More information

Extracting knowledge from Ontology using Jena for Semantic Web

Extracting knowledge from Ontology using Jena for Semantic Web Extracting knowledge from Ontology using Jena for Semantic Web Ayesha Ameen I.T Department Deccan College of Engineering and Technology Hyderabad A.P, India ameenayesha@gmail.com Khaleel Ur Rahman Khan

More information

A Thesaurus Construction Method from Large Scale Web Dictionaries

A Thesaurus Construction Method from Large Scale Web Dictionaries A Thesaurus Construction Method from Large Scale Web Dictionaries Kotaro NAKAYAMA Takahiro HARA Shojiro NISHIO Dept. of Multimedia Eng., Graduate School of Information Science and Technology, Osaka University,

More information

Mapping WordNet to the SUMO Ontology

Mapping WordNet to the SUMO Ontology Mapping WordNet to the SUMO Ontology Ian Niles Teknowledge Corporation 1810 Embarcadero Road, Palo Alto, CA 94303 iniles@teknowledge.com Introduction Ontologies are becoming extremely useful tools for

More information

Ontology Matching with CIDER: Evaluation Report for the OAEI 2008

Ontology Matching with CIDER: Evaluation Report for the OAEI 2008 Ontology Matching with CIDER: Evaluation Report for the OAEI 2008 Jorge Gracia, Eduardo Mena IIS Department, University of Zaragoza, Spain {jogracia,emena}@unizar.es Abstract. Ontology matching, the task

More information

Contextualized Question Answering

Contextualized Question Answering Journal of Computing and Information Technology - CIT 18, 2010, 4, 325 332 doi:10.2498/cit.1001912 325 Contextualized Question Answering Luka Bradeško, Lorand Dali, Blaž Fortuna, Marko Grobelnik, Dunja

More information

Using AgreementMaker to Align Ontologies for OAEI 2010

Using AgreementMaker to Align Ontologies for OAEI 2010 Using AgreementMaker to Align Ontologies for OAEI 2010 Isabel F. Cruz, Cosmin Stroe, Michele Caci, Federico Caimi, Matteo Palmonari, Flavio Palandri Antonelli, Ulas C. Keles ADVIS Lab, Department of Computer

More information

A Machine Learning Approach for Displaying Query Results in Search Engines

A Machine Learning Approach for Displaying Query Results in Search Engines A Machine Learning Approach for Displaying Query Results in Search Engines Tunga Güngör 1,2 1 Boğaziçi University, Computer Engineering Department, Bebek, 34342 İstanbul, Turkey 2 Visiting Professor at

More information

Matching Web Tables To DBpedia - A Feature Utility Study

Matching Web Tables To DBpedia - A Feature Utility Study Matching Web Tables To DBpedia - A Feature Utility Study Dominique Ritze, Christian Bizer Data and Web Science Group, University of Mannheim, B6, 26 68159 Mannheim, Germany {dominique,chris}@informatik.uni-mannheim.de

More information

Language Resources and Linked Data (EKAW 2014, Linköping, Sweden)

Language Resources and Linked Data (EKAW 2014, Linköping, Sweden) Language Resources and Linked Data (EKAW 2014, Linköping, Sweden) Multilingual Word Sense Disambiguation and Entity Linking on the Web based on BabelNet Roberto Navigli, Tiziano Flati Sapienza 18/11/2014

More information

Automatic Extraction of Synonymy Information: - Extended Abstract -

Automatic Extraction of Synonymy Information: - Extended Abstract - Automatic Extraction of Synonymy Information: - Extended Abstract - A Kumaran, Ranbeer Makin, Vijay Pattisapu and Shaik Emran Sharif Multilingual Systems Research, Microsoft Research India Bangalore, India

More information

Domain-Specific Semantic Relatedness From Wikipedia: Can A Course Be Transferred?

Domain-Specific Semantic Relatedness From Wikipedia: Can A Course Be Transferred? Domain-Specific Semantic Relatedness From Wikipedia: Can A Course Be Transferred? Beibei Yang University of Massachusetts Lowell Lowell, MA 01854 byang1@cs.uml.edu Jesse M. Heines University of Massachusetts

More information