Knowledge Engineering with Semantic Web Technologies

Lecture 5: Ontological Engineering
5.3 Ontology Learning

Dr. Harald Sack
Hasso Plattner Institute for IT Systems Engineering, University of Potsdam
Autumn 2015

This file is licensed under the Creative Commons Attribution-NonCommercial 3.0 (CC BY-NC 3.0)

Ontology Learning

Ontology design is very expensive with respect to time and resources. Can we automate the process, or at least some parts of it?

Ontologies can be learned automatically. Ontology Learning defines a set of methods and techniques for
- the fundamental development of new ontologies
- the extension or adaptation of already existing ontologies
in a (partly) automated way from various resources.

Ontology Learning is also referred to as Ontology Generation, Ontology Mining, or Ontology Extraction.

Fundamental types of Ontology Learning

- Ontology Learning from Text: automatic or semi-automatic generation of lightweight ontologies by means of text mining and information extraction
- Linked Data Mining: detecting meaningful patterns in RDF graphs via statistical schema induction or statistical relational learning
- Concept Learning in Description Logics and OWL: learning schema axioms from existing ontologies and instance data, mostly based on Inductive Logic Programming
- Crowdsourcing Ontologies: combines the speed of computers with the accuracy of humans, e.g. taxonomy construction via Amazon Mechanical Turk or games with a purpose

Ontology Learning from Text

Ontology Learning from text is the process of identifying terms, concepts, relations, and optionally axioms from textual information and using them to construct and maintain an ontology.

Automation requires help from
- Natural Language Processing (NLP)
- Data Mining
- Machine Learning (ML) techniques
- Information Retrieval (IR)

Ontology Learning from Text - Basic Approach

document corpus -> (1) term extraction -> terminology -> (2) conceptualisation -> ontology -> (3) evaluation & adaptation

Example terminology: <dog>, <dogs>, <cat>, <siamese cat>
Example ontology concepts: pet, dog, cat, siamese cat

- Term extraction requires linguistic processing (NLP) to identify important noun phrases and their internal semantic structure.
- Terms: linguistic realisations of domain-specific concepts
- Concepts: clusters of semantically related terms
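The step from terminology to concepts can be illustrated with a minimal Python sketch. It is not the method used in the lecture: the plural-stripping rule in `normalise` and the head-noun heuristic in `head_noun_hierarchy` are simplifying assumptions chosen only to reproduce the dog / cat / siamese cat example, and all function names are hypothetical.

```python
from collections import defaultdict


def normalise(term):
    """Crude singularisation (assumption): strip one trailing 's' from the last word."""
    words = term.lower().split()
    last = words[-1]
    if last.endswith("s") and not last.endswith("ss") and len(last) > 3:
        words[-1] = last[:-1]
    return " ".join(words)


def conceptualise(terms):
    """Cluster term variants into concepts keyed by their normalised form."""
    concepts = defaultdict(set)
    for term in terms:
        concepts[normalise(term)].add(term)
    return dict(concepts)


def head_noun_hierarchy(concepts):
    """Propose (child, parent) pairs when a multi-word concept ends in another concept."""
    pairs = []
    for concept in concepts:
        words = concept.split()
        if len(words) > 1 and words[-1] in concepts:
            pairs.append((concept, words[-1]))
    return sorted(pairs)


terms = ["dog", "dogs", "cat", "siamese cat"]
concepts = conceptualise(terms)
hierarchy = head_noun_hierarchy(concepts)

print(concepts)   # term clusters, one per concept
print(hierarchy)  # proposed is-a pairs
```

A superconcept such as pet cannot be derived from these four terms alone; it would have to come from background knowledge or from further evidence in the corpus, which is one reason step (3), evaluation and adaptation, involves human review.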

Natural Language Processing for OL

Example text:

"On November 3, 1954, the very first of a series of 28 Godzilla films premiered. The film focuses on Godzilla, a prehistoric monster resurrected by repeated nuclear tests in the Pacific, who ravages Japan and reignites the horrors of nuclear devastation to the very nation that experienced it firsthand. Since his debut, Godzilla has morphed into a worldwide cultural icon."
(Source: http://blog.yovisto.com/godzilla/)

1. Tokenization: breaking a stream of text up into words, phrases, symbols, or other meaningful elements called tokens

Demo (tokenization): http://text-processing.com/demo/tokenize/
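As a rough illustration of tokenization, the following Python sketch splits the first sentence of the example text with a single regular expression. The pattern is an assumption made for this example (words, optionally joined by hyphens or apostrophes, plus individual punctuation marks); real tokenizers handle many more cases such as abbreviations, URLs, and numbers with separators.

```python
import re

# Assumed pattern: a word (optionally with internal hyphens/apostrophes)
# or any single character that is neither a word character nor whitespace.
TOKEN_PATTERN = re.compile(r"\w+(?:[-']\w+)*|[^\w\s]")


def tokenize(text):
    """Return the list of tokens found in text, in order of appearance."""
    return TOKEN_PATTERN.findall(text)


sentence = ("On November 3, 1954, the very first of a series of "
            "28 Godzilla films premiered.")
tokens = tokenize(sentence)
print(tokens)
```

Note that punctuation marks become tokens of their own, so "3," yields the two tokens "3" and ",".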

2. Morphological Analysis: analysis and description of the structure of a given language's morphemes, e.g. stemming and lemmatization

Demo (stemming): http://text-processing.com/demo/stem/
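To make the idea of stemming concrete, here is a deliberately crude suffix-stripping sketch in Python. It is not the Porter stemmer or any other published algorithm; the suffix list and the minimum stem length of three characters are assumptions chosen for this example.

```python
# Assumed suffix list, checked in this order; longer suffixes come first.
SUFFIXES = ("ing", "ed", "es", "s")


def crude_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


words = ["films", "premiered", "focuses", "tests", "ravages", "Godzilla"]
stems = [crude_stem(w) for w in words]
print(list(zip(words, stems)))
```

The stem of "ravages" comes out as "ravag", which is not a dictionary word. This is typical of stemming, which only truncates word forms; lemmatization would instead map the word to its dictionary form "ravage", at the cost of needing a lexicon or morphological rules.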

3. Part-of-Speech Tagging: marking up a word in a text (corpus) as corresponding to a particular part of speech

Demo (part-of-speech tagging): http://cogcomp.cs.illinois.edu/page/demo_view/pos
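The input and output of part-of-speech tagging can be shown with a toy tagger in Python. This is only a sketch: the tiny lexicon, the suffix rules, and the coarse tag names (DET, ADP, NOUN, VERB, ADJ, PROPN) are assumptions made for the example, whereas practical taggers are trained statistically on annotated corpora.

```python
# Hypothetical mini-lexicon covering only the words of the example.
LEXICON = {
    "the": "DET", "a": "DET",
    "of": "ADP", "on": "ADP", "in": "ADP",
    "film": "NOUN", "monster": "NOUN",
    "focuses": "VERB",
}


def tag(tokens):
    """Assign a coarse part-of-speech tag to each token using lexicon and fallback rules."""
    tagged = []
    for token in tokens:
        lower = token.lower()
        if lower in LEXICON:
            pos = LEXICON[lower]
        elif token[0].isupper():
            pos = "PROPN"   # assumption: unknown capitalised word is a proper noun
        elif lower.endswith(("ed", "ing")):
            pos = "VERB"
        elif lower.endswith(("ic", "al")):
            pos = "ADJ"
        else:
            pos = "NOUN"    # default for unknown words
        tagged.append((token, pos))
    return tagged


tokens = ["The", "film", "focuses", "on", "Godzilla", "a", "prehistoric", "monster"]
tagged = tag(tokens)
print(tagged)
```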

4. Regular Expression Matching: define regular expressions and match them in text
5. Chunker / n-gram analysis: detect larger coherent structures
6. Syntax Tree Parsing: determine the full syntactic structure of a sentence
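Regular expression matching and chunking can be combined in a small Python sketch that finds candidate noun phrases in an already tagged token sequence. The tag names, the one-letter encoding, and the pattern "any number of adjectives followed by one or more nouns" are assumptions made for this example; the tags are supplied by hand rather than produced by a real tagger.

```python
import re


def ngrams(tokens, n):
    """Return all contiguous n-token windows as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def noun_phrase_chunks(tagged):
    """Match ADJ* NOUN+ over the tag sequence with a regular expression."""
    # Encode each tag as one character so regex offsets equal token indexes.
    codes = {"ADJ": "A", "NOUN": "N"}
    tag_string = "".join(codes.get(pos, "x") for _, pos in tagged)
    chunks = []
    for match in re.finditer(r"A*N+", tag_string):
        words = [tok for tok, _ in tagged[match.start():match.end()]]
        chunks.append(" ".join(words))
    return chunks


# Hand-tagged fragment of the example text (tags are assumptions).
tagged = [
    ("a", "DET"), ("prehistoric", "ADJ"), ("monster", "NOUN"),
    ("resurrected", "VERB"), ("by", "ADP"),
    ("repeated", "ADJ"), ("nuclear", "ADJ"), ("tests", "NOUN"),
]
bigrams = ngrams([tok for tok, _ in tagged], 2)
chunks = noun_phrase_chunks(tagged)
print(bigrams[:3])
print(chunks)
```

Chunks such as "prehistoric monster" are the kind of noun phrase that term extraction would consider as a candidate term.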

Demo (syntax tree parsing): http://nlpviz.bpodgursky.com/

Machine Learning for OL

1. Association Rule Discovery: discovery of interesting associations between terms
2. (Hierarchical) Clustering: unsupervised learning, in particular clustering of terms
3. Classification: classification of new concepts into an existing hierarchy, e.g. with Support Vector Machines (SVM), Naive Bayes, kNN
4. Inductive Logic Programming: induction of rules from data, i.e. discovery of new concepts from extensional data
5. Conceptual Clustering: Formal Concept Analysis, learning concepts and concept hierarchies
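Association rule discovery between terms can be sketched in a few lines of Python. The sketch below only considers rules between pairs of terms and uses the standard support and confidence measures; the toy documents and the thresholds are invented for illustration, and a real system would typically use an algorithm such as Apriori to handle larger itemsets.

```python
from collections import Counter
from itertools import combinations


def association_rules(transactions, min_support, min_confidence):
    """Return (lhs, rhs, support, confidence) for term pairs passing both thresholds."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for items in transactions:
        unique = sorted(set(items))
        item_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n                 # share of documents containing both terms
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]   # estimate of P(rhs | lhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Toy "documents", each reduced to the terms it mentions (invented data).
docs = [
    ["country", "capital", "city"],
    ["country", "capital"],
    ["river", "country"],
    ["river", "mountain"],
]
rules = association_rules(docs, min_support=0.5, min_confidence=0.9)
print(rules)
```

With these thresholds the only surviving rule is capital -> country: every document that mentions "capital" also mentions "country", while the reverse does not hold often enough. Such asymmetric associations are hints at possible relations between concepts, which still need to be named and checked.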

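The Ontology Learning Layer Cake

Layers listed from the top layer down to the bottom layer, each with an example:

- General Axioms: Country has exactly 1 hascapital (cardinality restriction)
- Axiomatic Schemata: River and Mountain (e.g. disjointness)
- Relation Hierarchies: capitalof is a sub-relation of locatedin
- Relations: flowthrough(dom:river, range:geoentity)
- Concept Hierarchies: Capital is a City, City is an InhabitedGeoEntity
- Concept Description: c := country := <description(c), uri(c)>
- Multilingual Synonyms: {country, nation, land}
- Terms: river, country, nation, city, capital, ...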

Ontology Learning Tasks

1. Ontology Creation
   - Design of an ontology from scratch by a team of experts
   - Machine Learning (ML) supports the experts during the design phase by
     - suggesting well-suited relations among concepts
     - integrity / consistency checking of the designed ontology
2. Ontology Schema Extraction
   - Extraction of ontology schemata from heterogeneous documents
   - Machine Learning uses input data and a meta ontology to create full-fledged domain ontologies (with the help of human experts)
3. Extraction of Ontology Instances (Populate Ontologies)
   - Extraction of ontology instances from semi-structured / unstructured data to populate already existing ontology schemata with individuals
   - Applies technologies from Information Retrieval and Data Mining
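The third task, populating an existing schema with individuals, can be illustrated with a small Python sketch. It assumes the instances have already been extracted into simple records; the schema classes, the record fields, and the triple layout are all hypothetical and serve only to show the principle that extracted instances are checked against the existing schema before they are added.

```python
# Hypothetical existing schema: the classes that may receive instances.
SCHEMA_CLASSES = {"Country", "Capital", "River"}


def populate(records, known_classes):
    """Turn records into (subject, predicate, object) triples for known classes only."""
    triples = []
    rejected = []
    for record in records:
        if record["type"] not in known_classes:
            rejected.append(record)   # class not in schema: leave for human review
            continue
        subject = record["name"]
        triples.append((subject, "rdf:type", record["type"]))
        for key, value in record.items():
            if key not in ("name", "type"):
                triples.append((subject, key, value))
    return triples, rejected


# Invented semi-structured input, e.g. rows scraped from a table.
records = [
    {"name": "Berlin", "type": "Capital", "capitalof": "Germany"},
    {"name": "Germany", "type": "Country"},
    {"name": "Godzilla", "type": "Monster"},
]
triples, rejected = populate(records, SCHEMA_CLASSES)
print(triples)
print(rejected)
```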

Ontology Learning Tasks

4. Ontology Integration and Navigation
   - Reconstruction of existing knowledge bases and navigation in existing knowledge bases, e.g. translation of an existing knowledge base from FOL to OWL 2 DL
5. Ontology Update
   - Extension, reconstruction and adaptation of already existing ontologies, e.g. adaptation of an ontology to a changed domain
   - Relates to those parts of an ontology that were designed so that they can be changed
6. Ontology Enrichment (also Ontology Tuning)
   - Relates to the automated update of smaller parts of existing ontologies
   - Does not change concepts and relations, but refines them (makes them more precise)
   - In contrast to ontology update, only those parts of the ontology are considered that usually should not be changed

Challenges in Ontology Learning

- Heterogeneity: data on the web differs largely, e.g. with respect to formats, languages, domains and quality
- Uncertainty: low-quality data, which is hard to interpret by computational means, as well as inherently imperfect methods for learning ontologies
- Reasoning: applications relying on reasoning need consistent ontologies, so consistency must be explicitly supported by ontology learning
- Scalability: knowledge extraction from growing amounts of web data requires scalable ontology learning
- Quality: ontology evaluation to ensure formal correctness, completeness and consistency
- Interactivity: human involvement increases the quality of learned ontologies

04: Ontology Alignment
OpenHPI course Knowledge Engineering with Semantic Web Technologies, Lecture 5: Ontological Engineering