Data-Mining Algorithms with Semantic Knowledge

1 Data-Mining Algorithms with Semantic Knowledge
Ontology-based information extraction
Carlos Vicient Monllaó, Universitat Rovira i Virgili
December 14th, Poznan
A project funded by the Ministerio de Ciencia e Innovación and Universitat Rovira i Virgili
DAMASK, 2010

2 Contents
1. DAMASK
   1. Introduction
   2. Goals
   3. Working plan
2. Ontology-based information extraction (Task 1)
   1. State of the art
      1. IR vs IE
      2. Ontology-based IE
   2. Main methodology
   3. Step-by-step methodology
      1. Named entity detection
      2. Discovering entity-subsumer concept (candidate extraction)
      3. Semantic annotation

3 1.- DAMASK: DATA MINING ALGORITHMS WITH SEMANTIC KNOWLEDGE

4 INTRODUCTION
Data-Mining Algorithms with Semantic Knowledge (DAMASK)
Funded by the Ministerio de Ciencia e Innovación and Universitat Rovira i Virgili
Main motivations:
- Explosive growth in the amount of information available on networked computers around the world, much of it in the form of natural language documents
- Increasing interest in Semantic Web content => semantic knowledge
- Traditional data mining methods make little use of domain knowledge

5 GOALS
- Processing and extraction of Web resources based on ontologies.
- Extraction of relevant data from a domain of structured, semi-structured and unstructured Web resources.
- Semantic integration of information in an attribute-value matrix that can be used by subsequent clustering methods.
- Automatic classification of data (clustering method based on ontologies).
- Adaptation of traditional clustering methods to create classifications (trees and partitions) using semantic information.
- Definition of methods to automatically analyse the clusters obtained in the previous step.
- Testing the practical applicability of the developed methods in the strategic area of Tourism.

6 WORK PLAN I
The project is divided into three main tasks:
- Task 1: Ontology-based information extraction and integration from heterogeneous Web resources
- Task 2: Automatic clustering of entities based on the semantics of the concepts and attributes obtained from the Web resources
- Task 3: Application of the developed methods to a Tourism test case

8 WORK PLAN (Task 1) II
The key point of this task is to complement syntactic parsing and natural language processing techniques with the knowledge contained in one or several input ontologies, in order to:
- Identify relevant features describing a particular entity from textual data
- Associate, where applicable, the extracted features with concepts contained in the input ontologies

9 2.- ONTOLOGY-BASED INFORMATION EXTRACTION

10 STATE OF THE ART (IR vs IE) I
- Information Retrieval (IR) simply finds texts and presents them to the user (as classic search engines do)
- Information Extraction (IE) is the task of locating specific pieces of data within a natural language document
- IE analyses texts and presents only the specific information extracted from the text that is of interest to a user
- Wrapper: a set of extraction rules suitable for extracting information from a Web site
Two main approaches:
- Knowledge engineering: supervised, traditional IE
- Automatic training: unsupervised, open IE

12 STATE OF THE ART (IR vs IE) II
Comparison of traditional IE and Open IE [comparison table not preserved in the transcription]

13 STATE OF THE ART (Ontology-Based IE) III
Ontology-Based IE (motivations):
- Growing interest in the research community in developing data mining techniques
- Textual documents describing a particular entity are difficult to process in order to extract relevant features that could be exploited by semantically focused data mining algorithms
- Many conceptual approaches in the field of the Semantic Web assume that resources have been semantically annotated; however, in the short term we cannot expect a massive amount of annotated Web resources to be available
- Ontology-based information extraction relies on ontologies to interpret the textual content of a resource regardless of its format

14 STATE OF THE ART (Ontology-Based IE) IV
Ontologies have emerged as a new paradigm to model and formalize domain knowledge in a machine-readable way.
IE and ontologies are involved in two main, related tasks:
- Information extraction: IE needs ontologies as part of the understanding process for extracting the relevant information
- Populating and enhancing the ontology: texts are useful sources of knowledge to design and enrich ontologies
These two tasks can be combined in a cyclic process: ontologies are used for interpreting the text at the right level for IE, and IE extracts new knowledge from text to be integrated into the ontology.

15 Cyclic process
[Figure: cycle between IE ALGORITHMS and ONTOLOGY, with the labels "Relevant extracted information" and "Populating and enhancing"]

16 METHODOLOGY I
The Task 1 methodology is comparable to the automatic semantic annotation of documents.
1. Named Entity detection (instances of things)
2. Discovering entity-subsumer concept (candidates from Named Entity)
3. Semantic annotation of Named Entities (pairs of NE and candidate)

18 METHODOLOGY (Named Entity detection) II
Named entities in the example: Madrid, Paris, Llobregat, Catalan, Antoni Gaudí, Sagrada Familia

20 METHODOLOGY (Named Entity detection) III
Extracted NE: Madrid, Paris, Llobregat, Catalan, Antoni Gaudí, Sagrada Familia
Representative NE: Catalan, Llobregat, Antoni Gaudí, Sagrada Familia

22 METHODOLOGY (discovering Entity-subset) IV
Representative NE | Subset
Sagrada Familia | {Cathedral, church}
Catalan | {Language}
Llobregat | {River, town}
Antoni Gaudí | {Architect, person}

23 METHODOLOGY (Ontology Matching) V
NE-Subset pairs passed to semantic annotation:
- Sagrada Familia; {Cathedral, church}
- Catalan; {Language}
- Llobregat; {River, town}
- Antoni Gaudí; {Architect, person}

25 STEP by STEP (Named Entity detection) I
"Named entities are phrases that contain the names of persons, organizations, locations, times, and quantities." (CoNLL 2002)
Problems in detecting NE:
- Unstructured and unlimited by nature
- Relationships remain hidden in the text from which the extraction has been performed
Approaches:
- Using rules learned from pre-tagged examples => recall problems
- Using a thesaurus to detect NE (a word not found in the dictionary is assumed to be a NE) => NE composed of common words are discarded
- Exploiting the way in which NE are expressed in languages such as English, using heuristics => inaccurate results
- Using linguistic analyses, heuristics and Web statistics

26 STEP by STEP (Named Entity detection) II
Linguistic analyses are applied to detect NE.
Tool: OpenNLP => natural language parser
Four steps: SD, TOK, TAG, CHUNK. CHUNK is able to detect NE using a database => lower recall, limited NE
[NP The/VB gothic/jj cathedral/nn] [VP of/vb] [NP Barcelona/NNP] [NP Tarragona/EX] [VP is/nns] [NP a/jjs city/nn]
Proposal:
- Filter noise: remove stop words, misspellings, etc.
- Heuristics: select all Noun Phrases (NP) where [NP.+ Regex2: s[a-z]
Problem: not all potential NE are representative of the analysed instance, e.g. neither Paris nor Madrid is representative of Barcelona.
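The noun-phrase heuristic above can be sketched in a few lines. The regular expressions on the slide did not survive the transcription, so the rule used here (keep the proper-noun tokens of every NP chunk that contains at least one NNP/NNPS tag) is an assumption, not the author's exact filter. The function name `candidate_entities` and the sample sentence, which uses conventional Penn Treebank tags, are also illustrative.

```python
import re

# Matches one bracketed chunk such as "[NP Barcelona/NNP]".
CHUNK_RE = re.compile(r"\[(\w+)\s+([^\]]*)\]")


def candidate_entities(chunked: str) -> list[str]:
    """Return the proper-noun part of every NP chunk that has an NNP/NNPS token.

    Assumed heuristic: the original regexes were lost in transcription.
    """
    candidates = []
    for label, body in CHUNK_RE.findall(chunked):
        if label != "NP":
            continue
        # Each token looks like "word/TAG"; skip anything malformed.
        pairs = [tok.rsplit("/", 1) for tok in body.split()]
        proper = [p[0] for p in pairs if len(p) == 2 and p[1].upper().startswith("NNP")]
        if proper:
            candidates.append(" ".join(proper))
    return candidates


# Illustrative chunker output (not taken from the slides).
sample = (
    "[NP The/DT gothic/JJ cathedral/NN] [PP of/IN] [NP Barcelona/NNP] "
    "[VP is/VBZ] [NP a/DT city/NN]"
)
print(candidate_entities(sample))
```

Every candidate produced this way still has to pass the representativeness check, since a page about Barcelona may also mention Paris or Madrid.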

27 STEP by STEP (Named Entity detection) III
To improve the precision of NE extraction, it is complemented with a Web-based reliability analysis.
- Wider context, i.e. several observations in heterogeneous contexts
- The Web-based analysis uses Web statistics to rank all NE, combining semantic relatedness measures and hit counts
Relatedness measures:
- PMI (Pointwise Mutual Information)
- SCP (Symmetric Conditional Probability)
- NGD (Normalized Google Distance)

28 STEP by STEP (Named Entity detection) IV
PMI (Pointwise Mutual Information), estimated using hits [formula not preserved in the transcription]

29 STEP by STEP (Named Entity detection) V
SCP (Symmetric Conditional Probability)
NGD (Normalized Google Distance)
[formulas not preserved in the transcription]
where M is the total number of Internet web pages.
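The formulas for these three measures were images and are missing from the transcription. For reference, their usual textbook forms are given below, with probabilities estimated from hit counts as p(x) ≈ hits(x)/M. Whether the slides used exactly these forms is an assumption; in particular, the scores reported later (around 1E-09) look more like a plain ratio of hit counts without the logarithm.

```latex
\begin{align*}
\mathrm{PMI}(a,b) &= \log_2 \frac{p(a,b)}{p(a)\,p(b)},
  \qquad p(x) \approx \frac{\mathrm{hits}(x)}{M} \\[4pt]
\mathrm{SCP}(a,b) &= \frac{p(a,b)^2}{p(a)\,p(b)} \\[4pt]
\mathrm{NGD}(a,b) &= \frac{\max\{\log \mathrm{hits}(a),\, \log \mathrm{hits}(b)\} - \log \mathrm{hits}(a,b)}
                          {\log M - \min\{\log \mathrm{hits}(a),\, \log \mathrm{hits}(b)\}}
\end{align*}
```

PMI and SCP grow with relatedness, whereas NGD is a distance: smaller values mean the two terms co-occur more often than chance would predict.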

30 STEP by STEP (Named Entity detection) VI
Hits for Barcelona and Sagrada Familia [hit counts not preserved in the transcription]
Similarity: Sim(Barcelona, Sagrada Familia) = [operands not preserved] = 3.42803E-09

31 STEP by STEP (Named Entity detection) VII
Columns: Named Entity | Hits(Bcn) | Hits(NE) | Hits(Bcn^NE) | PMI
[hit counts and the leading digits of the PMI values were not preserved in the transcription]
Sagrada Familia | ,580461E-09
Llobregat | ,336235E-09
Antoni Gaudí | ,605033E-09
Madrid | ,592376E-09
Paris | ,789873E-10
Catalan | ,772626E-10
(*) Queries were performed using the Yahoo search engine
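As a rough sketch of how such scores can be derived from search-engine hit counts, the functions below implement the standard definitions of PMI, SCP and NGD. The hit counts in the usage lines are invented round numbers, chosen only so that the arithmetic is easy to follow; they are not the values behind the table, and the total page count `total_pages` is likewise hypothetical.

```python
import math


def pmi(hits_a: int, hits_b: int, hits_ab: int, total_pages: int) -> float:
    """Pointwise mutual information, with p(x) estimated as hits(x) / total_pages."""
    if hits_ab == 0:
        return float("-inf")  # the two terms never co-occur
    p_a, p_b, p_ab = hits_a / total_pages, hits_b / total_pages, hits_ab / total_pages
    return math.log2(p_ab / (p_a * p_b))


def scp(hits_a: int, hits_b: int, hits_ab: int, total_pages: int) -> float:
    """Symmetric conditional probability: p(a,b)^2 / (p(a) * p(b))."""
    p_a, p_b, p_ab = hits_a / total_pages, hits_b / total_pages, hits_ab / total_pages
    return p_ab ** 2 / (p_a * p_b)


def ngd(hits_a: int, hits_b: int, hits_ab: int, total_pages: int) -> float:
    """Normalized Google Distance (a distance: lower means more related)."""
    if hits_ab == 0:
        return float("inf")
    log_a, log_b, log_ab = math.log(hits_a), math.log(hits_b), math.log(hits_ab)
    return (max(log_a, log_b) - log_ab) / (math.log(total_pages) - min(log_a, log_b))


# Hypothetical counts: 1,000 hits per term, 100 joint hits, 1,000,000 pages in total.
print(pmi(1000, 1000, 100, 1_000_000))
print(scp(1000, 1000, 100, 1_000_000))
print(ngd(1000, 1000, 100, 1_000_000))
```

In practice each score needs three queries per named entity (the instance, the entity, and both together), which is why the number of candidates is worth reducing before this step.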

33 STEP by STEP (Named Entity detection) VIII
Select representative NE using different thresholds
Extracted NE sorted by PMI: Sagrada Familia, Llobregat, Antoni Gaudí, Madrid | Paris, Catalan
Representative NE: Sagrada Familia, Llobregat, Antoni Gaudí, Madrid | Paris, Catalan
[the slide separates the last two entries in both columns]
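The selection step amounts to ranking the extracted entities by their relatedness score and keeping those above a cut-off. A minimal sketch follows; the function name `select_representative`, the scores and the threshold are all made up for illustration, since the slides do not state the thresholds that were tried.

```python
def select_representative(scores: dict[str, float], threshold: float) -> list[str]:
    """Rank entities by score (highest first) and keep those at or above the threshold."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [name for name, score in ranked if score >= threshold]


# Hypothetical relatedness scores with respect to the analysed instance.
example_scores = {
    "Paris": 0.1,
    "Sagrada Familia": 0.9,
    "Madrid": 0.4,
    "Llobregat": 0.7,
}
print(select_representative(example_scores, threshold=0.5))
```

Lowering the threshold trades precision for recall: with 0.3 in the example, Madrid would also be kept.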

37 STEP by STEP (Discovering entity-subsumer concept) I
A way is needed to go from the instance level to the conceptual level in an unsupervised, domain-independent manner.
NE and subsumer concepts are related by means of taxonomic relationships.
Approaches:
- Document-based notion of term subsumption
- Semantic similarity according to the shared context
  => Both require a considerable number of documents and linguistic parsing
- Linguistic patterns => offer relatively high precision but suffer from low recall, because explicit linguistic patterns are rare in corpora

38 STEP by STEP (Discovering entity-subsumer concept) II
Solution: exploit the Web in order to enlarge the corpus.
Hearst patterns: used to acquire hyponym/hypernym relations from unrestricted text.
- NN such as NP (cities such as Tortosa)
- such NN as NP (such cities as Tarragona)
- NP or other NN (London or other cities)
- NP and other NN (Barcelona and other cities)
- NN including NP (locations including Reus)
- NN especially NP (gothic cathedrals especially Sagrada Familia)

39 STEP by STEP (Discovering entity-subsumer concept) III
- Six queries are constructed for each NE
- The snippets returned for each query are analysed using a natural language parser in order to extract the taxonomic relationships
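One plausible way to build the six queries is to instantiate each Hearst pattern with the named entity and leave the unknown concept (NN) open. The exact query strings used in the project are not given on the slides, so the templates below, including the `*` wildcard in the second one, are assumptions.

```python
# One template per Hearst pattern; {ne} is replaced by the named entity.
HEARST_TEMPLATES = [
    "such as {ne}",      # NN such as NP
    "such * as {ne}",    # such NN as NP (wildcard support is assumed)
    "{ne} or other",     # NP or other NN
    "{ne} and other",    # NP and other NN
    "including {ne}",    # NN including NP
    "especially {ne}",   # NN especially NP
]


def hearst_queries(entity: str) -> list[str]:
    """Return six quoted phrase queries, one per Hearst pattern."""
    return ['"' + template.format(ne=entity) + '"' for template in HEARST_TEMPLATES]


for query in hearst_queries("London"):
    print(query)
```

The quotes ask the search engine for an exact phrase, so that the returned snippets contain the pattern next to the entity.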

40 STEP by STEP (Discovering entity-subsumer concept) IV
Interpretation of snippets (query: "such as London")
- a big city such as London
  [NP a/vbz big/jj city/nn] [PP such/pdt] [NP as/nns London/NNP]
- travel topics such as London sightseeing
  [NP travel/nns topics/nns] [NP such/pdt] [NP as/nns London/NNP museum/nn]
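The slides rely on a natural language parser to find the noun phrase in front of the pattern. As a much cruder stand-in, the sketch below uses a regular expression to grab up to three words before "such as <entity>" and drops leading determiners. This is an assumed simplification for illustration only: it has no notion of phrase boundaries, and the `DETERMINERS` list and function name are hypothetical.

```python
import re
from typing import Optional

# Words dropped from the extracted phrase; an illustrative, incomplete list.
DETERMINERS = {"a", "an", "the", "some", "many", "other"}


def subsumer_candidate(snippet: str, entity: str) -> Optional[str]:
    """Return the words preceding 'such as <entity>' as a candidate concept."""
    pattern = re.compile(
        r"\b((?:[A-Za-z-]+\s+){1,3})such as\s+" + re.escape(entity) + r"\b",
        re.IGNORECASE,
    )
    match = pattern.search(snippet)
    if match is None:
        return None
    words = [w for w in match.group(1).lower().split() if w not in DETERMINERS]
    return " ".join(words) or None


print(subsumer_candidate("a big city such as London", "London"))
print(subsumer_candidate("travel topics such as London sightseeing", "London"))
```

The second call shows why snippets need further analysis: the extracted phrase describes "London sightseeing" rather than London itself, so a candidate like this should not be trusted without additional checks.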

41 STEP by STEP (Semantic annotation) I
Consists of annotating Named Entities with ontological classes.
Approaches:
- Web-based statistical evaluation => requires a considerable number of queries and yields semantically unstructured annotations covering heterogeneous domains, which are hard to exploit (Barcelona is a Metropolis, Madrid is a city)
- Direct matching between subsumer concept and ontology class => if the subsumer concept does not appear in the ontology, the NE is not annotated even when the concept's meaning is similar to that of one of the classes
- Direct matching + Web statistics + WordNet

42 STEP by STEP (Semantic annotation) II
Direct matching
- A stemming algorithm is applied to subsumer concepts and ontology classes in order to discover morphologically equivalent terms, e.g. city and cities
- Subsumer concepts are looked up in the ontology. If there is no result and the concept can be reduced, it is reduced and the process is repeated, e.g. big city -> city
- Choose the best proposed annotation
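The direct-matching loop can be sketched as follows. The slides do not say which stemmer was used, so `naive_stem` is a deliberately simple stand-in that only handles a few English plural endings; a real system would more likely use an established stemmer. Reducing a concept by dropping its leftmost modifier is also an assumption about how "big city -> city" is obtained.

```python
from typing import Optional


def naive_stem(word: str) -> str:
    """Very rough plural stripping; a placeholder for a real stemming algorithm."""
    word = word.lower()
    if word.endswith("ies") and len(word) > 4:
        return word[:-3] + "y"
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word


def normalise(term: str) -> str:
    """Stem every word of a (possibly multi-word) term."""
    return " ".join(naive_stem(w) for w in term.split())


def direct_match(concept: str, ontology_classes: list[str]) -> Optional[str]:
    """Look the concept up in the ontology, reducing it word by word on failure."""
    index = {normalise(cls): cls for cls in ontology_classes}
    words = concept.split()
    while words:
        key = normalise(" ".join(words))
        if key in index:
            return index[key]
        words = words[1:]  # drop the leftmost modifier: "big city" -> "city"
    return None


classes = ["City", "Religious building", "Monument"]  # hypothetical ontology classes
print(direct_match("big cities", classes))
print(direct_match("religious buildings", classes))
print(direct_match("river", classes))
```

Because both sides go through the same normalisation, the stemmer does not need to produce real words, only the same string for morphological variants.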

43 STEP by STEP (Semantic annotation) III
Semantic matching
- For each candidate, get synonyms, hyponyms and hypernyms using WordNet (if there is more than one synset, the context is used for semantic disambiguation)
  Candidate: church => {abbey, basilica, cathedral, kirk, place of worship, house of prayer}
- Perform direct matching using the newly extracted subsumer candidates
- Choose the best annotation

44 STEP by STEP (Semantic annotation) IV
Semantic disambiguation
- In most cases a word can have several meanings (polysemy), e.g. {head -> part of the body, head -> geographic feature, etc.}
- When WordNet is used to get the synonyms of a concept, it is necessary to know which WordNet synset is the most appropriate.

45 STEP by STEP (Semantic annotation) V
To resolve this problem, the context from which the candidate was extracted is used, e.g.:
- Document:
- Named Entity: Darren Aronofsky
- Subsumer candidate: producer
- Context: Filmmaker Darren Aronofsky commented, "I walked out of The Matrix [...] and I was thinking, 'What kind of science fiction movie can people make now?'"
- WordNet synsets:
  o S: (n) manufacturer, producer (someone who manufactures something)
  o S: (n) producer (someone who finds financing for and supervises the making and presentation of a show (play or film or program or similar work))
  o S: (n) producer (something that produces) "Maine is a leading producer of potatoes"; "this microorganism is a producer of disease"

46 STEP by STEP (Semantic annotation) VI
The context is then compared with each synset using the cosine similarity measure.
The context alone is not enough to decide which synset is better, so it is enlarged with Web snippets.
Query: The Matrix Aron Aronofsky
Snippets:
[0]: Weeks before shooting his second movie, Requiem for a Dream, Darren Aronofsky took the film's star, Jared Leto, to see The Matrix at a Brooklyn mall.
[1]: UPCOMING FILM PROJECTS! There's been rumors for a while now that "The Matrix" trilogy filmmakers Andy and Lana Wachowski have been developing a secret
[N]
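A minimal sketch of this comparison is shown below: the context and each synset gloss are turned into bag-of-words vectors and compared with cosine similarity. The texts are the ones quoted on the slides; the tokenisation, the absence of stop-word removal and of any term weighting, and the sense labels used as dictionary keys are assumptions, so the scores will not reproduce the figure reported in the presentation.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lower-case the text and count alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def best_sense(context: str, glosses: dict[str, str]) -> str:
    """Return the label of the gloss most similar to the context."""
    ctx = bag_of_words(context)
    return max(glosses, key=lambda label: cosine(ctx, bag_of_words(glosses[label])))


context = (
    "Filmmaker Darren Aronofsky commented, I walked out of The Matrix and I was "
    "thinking, What kind of science fiction movie can people make now? "
    "Weeks before shooting his second movie, Requiem for a Dream, Darren Aronofsky "
    "took the film's star, Jared Leto, to see The Matrix at a Brooklyn mall."
)
glosses = {
    "manufacturer": "someone who manufactures something",
    "show producer": (
        "someone who finds financing for and supervises the making and presentation "
        "of a show (play or film or program or similar work)"
    ),
    "thing that produces": "something that produces",
}
print(best_sense(context, glosses))
```

With this naive tokenisation most of the overlap comes from function words, which is one reason to remove stop words and to enlarge the context with snippets before comparing.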

47 STEP by STEP (Semantic annotation) VII
Finally, the synset with the highest score is selected (0.084):
S: (n) producer (someone who finds financing for and supervises the making and presentation of a show (play or film or program or similar work))
Its synonyms, hypernyms and hyponyms are then extracted:
Producer => {film maker, filmmaker, film producer, movie maker, theatrical producer}

48 STEP by STEP (Semantic annotation) VII
Choose the best annotation:
- Compare the proposed annotations in pairs
- In a parent-child relationship, the child is chosen
- Other relationships are resolved using Web statistics (PMI)

49 STEP by STEP (Semantic annotation) VIII
Pairwise comparison of candidate annotations (criterion -> winner):
                   | monument        | cathedral               | church               | religious building
monument           | X               | PMI -> monument         | PMI -> monument      | PMI -> monument
cathedral          | PMI -> monument | X                       | PMI -> cathedral     | Superclass -> cathedral
church             | PMI -> monument | PMI -> cathedral        | X                    | Superclass -> church
religious building | PMI -> monument | Superclass -> cathedral | Superclass -> church | X

Columns: Subsumer concept | Hits(Sagrada Familia) | Hits(Cand) | Hits(SgF^Cand) | PMI
[hit counts and the leading digits of the PMI values were not preserved in the transcription]
monument | ,50E-08
cathedral | ,43E-08
church | ,57E-08
religious building | ,17E-12
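The pairwise rule in the matrix (prefer the subclass when two candidates are related taxonomically, otherwise prefer the higher PMI) can be sketched as a small tournament. The taxonomy fragment follows the matrix above, where cathedral and church both win against religious building by the superclass rule. The PMI numbers are invented placeholders that merely respect the ordering implied by the matrix, because the real values are truncated in the transcription.

```python
def is_ancestor(ancestor: str, node: str, parent_of: dict[str, str]) -> bool:
    """True if `ancestor` is reachable from `node` by following parent links."""
    current = parent_of.get(node)
    while current is not None:
        if current == ancestor:
            return True
        current = parent_of.get(current)
    return False


def choose(a: str, b: str, parent_of: dict[str, str], pmi: dict[str, float]) -> str:
    """Pick the more specific class if the two are related, else the higher PMI."""
    if is_ancestor(a, b, parent_of):
        return b
    if is_ancestor(b, a, parent_of):
        return a
    return a if pmi[a] >= pmi[b] else b


def best_annotation(candidates: list[str], parent_of: dict[str, str], pmi: dict[str, float]) -> str:
    """Run the pairwise comparison across all candidates."""
    best = candidates[0]
    for candidate in candidates[1:]:
        best = choose(best, candidate, parent_of, pmi)
    return best


# Taxonomy fragment implied by the matrix; PMI values are hypothetical.
PARENT_OF = {"cathedral": "religious building", "church": "religious building"}
PMI = {"monument": 4.0, "cathedral": 3.0, "church": 2.0, "religious building": 1.0}

print(choose("cathedral", "religious building", PARENT_OF, PMI))
print(best_annotation(["monument", "cathedral", "church", "religious building"], PARENT_OF, PMI))
```

A single pass like this assumes the pairwise preferences are consistent; if they were not, the result could depend on the order of the candidates.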

50 50

Engineering Applications of Artificial Intelligence

Engineering Applications of Artificial Intelligence Engineering Applications of Artificial Intelligence 26 (2013) 1092 1106 Contents lists available at SciVerse ScienceDirect Engineering Applications of Artificial Intelligence journal homepage: www.elsevier.com/locate/engappai

More information

Internal project report T3.1 Damask Ontology

Internal project report T3.1 Damask Ontology TIN2009-11005 DAMASK Data-Mining Algorithms with Semantic Knowledge PROYECTO DE INVESTIGACIÓN PROGRAMA NACIONAL DE INVESTIGACIÓN FUNDAMENTAL, PLAN NACIONAL DE I+D+i 2008-2011 ÁREA TEMÁTICA DE GESTIÓN:

More information

A Method for Semi-Automatic Ontology Acquisition from a Corporate Intranet

A Method for Semi-Automatic Ontology Acquisition from a Corporate Intranet A Method for Semi-Automatic Ontology Acquisition from a Corporate Intranet Joerg-Uwe Kietz, Alexander Maedche, Raphael Volz Swisslife Information Systems Research Lab, Zuerich, Switzerland fkietz, volzg@swisslife.ch

More information

Text Mining for Software Engineering

Text Mining for Software Engineering Text Mining for Software Engineering Faculty of Informatics Institute for Program Structures and Data Organization (IPD) Universität Karlsruhe (TH), Germany Department of Computer Science and Software

More information

Text Mining. Munawar, PhD. Text Mining - Munawar, PhD

Text Mining. Munawar, PhD. Text Mining - Munawar, PhD 10 Text Mining Munawar, PhD Definition Text mining also is known as Text Data Mining (TDM) and Knowledge Discovery in Textual Database (KDT).[1] A process of identifying novel information from a collection

More information

Tourism applications of Artificial Intelligence techniques. Dr. Antonio Moreno, ITAKA research group, URV

Tourism applications of Artificial Intelligence techniques. Dr. Antonio Moreno, ITAKA research group, URV Tourism applications of Artificial Intelligence techniques Dr. Antonio Moreno, ITAKA research group, URV ITAKA Basic research lines Multi-agent systems Ontology Learning Information Extraction Automated

More information

Shrey Patel B.E. Computer Engineering, Gujarat Technological University, Ahmedabad, Gujarat, India

Shrey Patel B.E. Computer Engineering, Gujarat Technological University, Ahmedabad, Gujarat, India International Journal of Scientific Research in Computer Science, Engineering and Information Technology 2018 IJSRCSEIT Volume 3 Issue 3 ISSN : 2456-3307 Some Issues in Application of NLP to Intelligent

More information

Making Sense Out of the Web

Making Sense Out of the Web Making Sense Out of the Web Rada Mihalcea University of North Texas Department of Computer Science rada@cs.unt.edu Abstract. In the past few years, we have witnessed a tremendous growth of the World Wide

More information

Knowledge Engineering with Semantic Web Technologies

Knowledge Engineering with Semantic Web Technologies This file is licensed under the Creative Commons Attribution-NonCommercial 3.0 (CC BY-NC 3.0) Knowledge Engineering with Semantic Web Technologies Lecture 5: Ontological Engineering 5.3 Ontology Learning

More information

Text Mining: A Burgeoning technology for knowledge extraction

Text Mining: A Burgeoning technology for knowledge extraction Text Mining: A Burgeoning technology for knowledge extraction 1 Anshika Singh, 2 Dr. Udayan Ghosh 1 HCL Technologies Ltd., Noida, 2 University School of Information &Communication Technology, Dwarka, Delhi.

More information

MEASURING SEMANTIC SIMILARITY BETWEEN WORDS AND IMPROVING WORD SIMILARITY BY AUGUMENTING PMI

MEASURING SEMANTIC SIMILARITY BETWEEN WORDS AND IMPROVING WORD SIMILARITY BY AUGUMENTING PMI MEASURING SEMANTIC SIMILARITY BETWEEN WORDS AND IMPROVING WORD SIMILARITY BY AUGUMENTING PMI 1 KAMATCHI.M, 2 SUNDARAM.N 1 M.E, CSE, MahaBarathi Engineering College Chinnasalem-606201, 2 Assistant Professor,

More information

Ngram Search Engine with Patterns Combining Token, POS, Chunk and NE Information

Ngram Search Engine with Patterns Combining Token, POS, Chunk and NE Information Ngram Search Engine with Patterns Combining Token, POS, Chunk and NE Information Satoshi Sekine Computer Science Department New York University sekine@cs.nyu.edu Kapil Dalwani Computer Science Department

More information

A Survey Of Different Text Mining Techniques Varsha C. Pande 1 and Dr. A.S. Khandelwal 2

A Survey Of Different Text Mining Techniques Varsha C. Pande 1 and Dr. A.S. Khandelwal 2 A Survey Of Different Text Mining Techniques Varsha C. Pande 1 and Dr. A.S. Khandelwal 2 1 Department of Electronics & Comp. Sc, RTMNU, Nagpur, India 2 Department of Computer Science, Hislop College, Nagpur,

More information

Natural Language Processing with PoolParty

Natural Language Processing with PoolParty Natural Language Processing with PoolParty Table of Content Introduction to PoolParty 2 Resolving Language Problems 4 Key Features 5 Entity Extraction and Term Extraction 5 Shadow Concepts 6 Word Sense

More information

MIA - Master on Artificial Intelligence

MIA - Master on Artificial Intelligence MIA - Master on Artificial Intelligence 1 Hierarchical Non-hierarchical Evaluation 1 Hierarchical Non-hierarchical Evaluation The Concept of, proximity, affinity, distance, difference, divergence We use

More information

Information Retrieval CS Lecture 01. Razvan C. Bunescu School of Electrical Engineering and Computer Science

Information Retrieval CS Lecture 01. Razvan C. Bunescu School of Electrical Engineering and Computer Science Information Retrieval CS 6900 Razvan C. Bunescu School of Electrical Engineering and Computer Science bunescu@ohio.edu Information Retrieval Information Retrieval (IR) is finding material of an unstructured

More information

Random Walks for Knowledge-Based Word Sense Disambiguation. Qiuyu Li

Random Walks for Knowledge-Based Word Sense Disambiguation. Qiuyu Li Random Walks for Knowledge-Based Word Sense Disambiguation Qiuyu Li Word Sense Disambiguation 1 Supervised - using labeled training sets (features and proper sense label) 2 Unsupervised - only use unlabeled

More information

MIRACLE at ImageCLEFmed 2008: Evaluating Strategies for Automatic Topic Expansion

MIRACLE at ImageCLEFmed 2008: Evaluating Strategies for Automatic Topic Expansion MIRACLE at ImageCLEFmed 2008: Evaluating Strategies for Automatic Topic Expansion Sara Lana-Serrano 1,3, Julio Villena-Román 2,3, José C. González-Cristóbal 1,3 1 Universidad Politécnica de Madrid 2 Universidad

More information

INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET) CONTEXT SENSITIVE TEXT SUMMARIZATION USING HIERARCHICAL CLUSTERING ALGORITHM

INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET) CONTEXT SENSITIVE TEXT SUMMARIZATION USING HIERARCHICAL CLUSTERING ALGORITHM INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & 6367(Print), ISSN 0976 6375(Online) Volume 3, Issue 1, January- June (2012), TECHNOLOGY (IJCET) IAEME ISSN 0976 6367(Print) ISSN 0976 6375(Online) Volume

More information

Using the Web as a Corpus. in Natural Language Processing

Using the Web as a Corpus. in Natural Language Processing Using the Web as a Corpus in Natural Language Processing Malvina Nissim Laboratory for Applied Ontology ISTC-CNR, Roma nissim@loa-cnr.it Johan Bos Dipartimento di Informatica Università La Sapienza, Roma

More information

Introduction to Text Mining. Hongning Wang

Introduction to Text Mining. Hongning Wang Introduction to Text Mining Hongning Wang CS@UVa Who Am I? Hongning Wang Assistant professor in CS@UVa since August 2014 Research areas Information retrieval Data mining Machine learning CS@UVa CS6501:

More information

ScienceDirect. Enhanced Associative Classification of XML Documents Supported by Semantic Concepts

ScienceDirect. Enhanced Associative Classification of XML Documents Supported by Semantic Concepts Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 46 (2015 ) 194 201 International Conference on Information and Communication Technologies (ICICT 2014) Enhanced Associative

More information

NATURAL LANGUAGE PROCESSING

NATURAL LANGUAGE PROCESSING NATURAL LANGUAGE PROCESSING LESSON 9 : SEMANTIC SIMILARITY OUTLINE Semantic Relations Semantic Similarity Levels Sense Level Word Level Text Level WordNet-based Similarity Methods Hybrid Methods Similarity

More information

Limitations of XPath & XQuery in an Environment with Diverse Schemes

Limitations of XPath & XQuery in an Environment with Diverse Schemes Exploiting Structure, Annotation, and Ontological Knowledge for Automatic Classification of XML-Data Martin Theobald, Ralf Schenkel, and Gerhard Weikum Saarland University Saarbrücken, Germany 23.06.2003

More information

It s time for a semantic engine!

It s time for a semantic engine! It s time for a semantic engine! Ido Dagan Bar-Ilan University, Israel 1 Semantic Knowledge is not the goal it s a primary mean to achieve semantic inference! Knowledge design should be derived from its

More information

International ejournals

International ejournals Available online at www.internationalejournals.com International ejournals ISSN 0976 1411 International ejournal of Mathematics and Engineering 112 (2011) 1023-1029 ANALYZING THE REQUIREMENTS FOR TEXT

More information

Information Extraction Techniques in Terrorism Surveillance

Information Extraction Techniques in Terrorism Surveillance Information Extraction Techniques in Terrorism Surveillance Roman Tekhov Abstract. The article gives a brief overview of what information extraction is and how it might be used for the purposes of counter-terrorism

More information

Knowledge-based Word Sense Disambiguation using Topic Models Devendra Singh Chaplot

Knowledge-based Word Sense Disambiguation using Topic Models Devendra Singh Chaplot Knowledge-based Word Sense Disambiguation using Topic Models Devendra Singh Chaplot Ruslan Salakhutdinov Word Sense Disambiguation Word sense disambiguation (WSD) is defined as the problem of computationally

More information

Automatic Construction of WordNets by Using Machine Translation and Language Modeling

Automatic Construction of WordNets by Using Machine Translation and Language Modeling Automatic Construction of WordNets by Using Machine Translation and Language Modeling Martin Saveski, Igor Trajkovski Information Society Language Technologies Ljubljana 2010 1 Outline WordNet Motivation

More information

Error annotation in adjective noun (AN) combinations

Error annotation in adjective noun (AN) combinations Error annotation in adjective noun (AN) combinations This document describes the annotation scheme devised for annotating errors in AN combinations and explains how the inter-annotator agreement has been

More information

Named Entity Detection and Entity Linking in the Context of Semantic Web

Named Entity Detection and Entity Linking in the Context of Semantic Web [1/52] Concordia Seminar - December 2012 Named Entity Detection and in the Context of Semantic Web Exploring the ambiguity question. Eric Charton, Ph.D. [2/52] Concordia Seminar - December 2012 Challenge

More information

Web-scale taxonomy learning

Web-scale taxonomy learning Web-scale taxonomy learning David Sánchez DAVID.SANCHEZ@URV.NET Antonio Moreno AMORENO.MORENO@URV.NET Dep. Computer Science and Mathematics, University Rovira i Virgili, Av. Països Catalans, 24. 43007

More information

Question Answering Systems

Question Answering Systems Question Answering Systems An Introduction Potsdam, Germany, 14 July 2011 Saeedeh Momtazi Information Systems Group Outline 2 1 Introduction Outline 2 1 Introduction 2 History Outline 2 1 Introduction

More information

A DOMAIN INDEPENDENT APPROACH FOR ONTOLOGY SEMANTIC ENRICHMENT

A DOMAIN INDEPENDENT APPROACH FOR ONTOLOGY SEMANTIC ENRICHMENT A DOMAIN INDEPENDENT APPROACH FOR ONTOLOGY SEMANTIC ENRICHMENT ABSTRACT Tahar Guerram and Nacima Mellal Departement of Mathematics and Computer Science, University Larbi Ben M hidi of Oum El Bouaghi -

More information

Semantic Web Company. PoolParty - Server. PoolParty - Technical White Paper.

Semantic Web Company. PoolParty - Server. PoolParty - Technical White Paper. Semantic Web Company PoolParty - Server PoolParty - Technical White Paper http://www.poolparty.biz Table of Contents Introduction... 3 PoolParty Technical Overview... 3 PoolParty Components Overview...

More information

Text Mining. Representation of Text Documents

Text Mining. Representation of Text Documents Data Mining is typically concerned with the detection of patterns in numeric data, but very often important (e.g., critical to business) information is stored in the form of text. Unlike numeric data,

More information

Chapter 27 Introduction to Information Retrieval and Web Search

Chapter 27 Introduction to Information Retrieval and Web Search Chapter 27 Introduction to Information Retrieval and Web Search Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 27 Outline Information Retrieval (IR) Concepts Retrieval

More information
