THE TECHNIQUES FOR THE ONTOLOGY-BASED INFORMATION RETRIEVAL

Myunggwon Hwang 1, Hyunjang Kong 1, Sunkyoung Baek 1, Kwangsu Hwang 1, Pankoo Kim 2
1 Dept. of Computer Science, Chosun University, Gwangju, Korea
Tel: +82-62-230-7799 E-mail: {mghwang, kisofire, zamilla100, hwangs00ks}@chosun.ac.kr
2 Dept. of CSE, Chosun University, Gwangju, Korea
Tel: +82-62-230-7636 E-mail: pkkim@chosun.ac.kr

Abstract
The use of ontologies to address the problems of existing keyword-based search has been widely studied. Efficient ontology-based information retrieval requires several factors to be considered. In this paper, we describe the techniques required for ontology-based information retrieval.

Keywords: Ontology, Query-Engine, two-column format, IEEE format.

1. Introduction
Nowadays, people search for information on the web, and existing search engines help them do so efficiently. However, information retrieval in the current web environment is still unsatisfactory. The basic reason is that computers cannot understand web content the way humans do. To overcome this limitation, Tim Berners-Lee proposed the semantic web in the late 1990s. The core of the semantic web approach is the ontology: the improvement in semantic retrieval, and the success of the semantic web itself, depend on the completeness and quality of the ontologies.

In this paper, we propose several techniques for a semantic information retrieval system; the ontology-related techniques carry the most weight in our system. Section 2 introduces related work. Section 3 describes the techniques for ontology-based information retrieval in detail. Section 4 evaluates our study. Finally, we conclude and suggest future work.

2. Related Works
Developing an ontology-based information retrieval (OBIR) system requires considering several factors, listed in Table 1.

Table 1. Factors to consider when developing the OBIR system
1. Deciding which features of the application use the ontology
2. Identifying the factors related to the user
3. Analyzing the information and knowledge of the specific organization
4. The ownership problem
5. Examining the system within the specific organization
6. Deciding on the inference processing
7. Deciding on the evaluation and measurement standards
8. Deciding on the applicable range
9. Considering data noise
10. Analyzing the ontology management tool and processing steps

As Table 1 shows, the developer of an OBIR system must consider many factors. To learn how ontologies are applied in information retrieval, we analyzed several OBIR systems developed so far.

First, an OBIR system was developed in the OntoWeb project [1,2]. In this project, the ontology serves as a guide for finding more information related to the user's query, and the system attempts to process the meaning of the context.

Second, OntoBroker, an ontology-based system, was developed for analyzing web documents and processing user queries [3]. OntoBroker proposed a methodology for converting HTML documents into an ontology structure, and users can search for information and understand the content of documents through the OntoBroker interface. In OntoBroker, the ontology is the common language of the information provider and the searcher, and it consists of concepts, relationships and specific rules.

Third, MELISA (Medical Literature Search Agent) is a document retrieval system for the medical domain and a prototype system that uses an ontology. MELISA uses a medical ontology to address user query problems and to improve the retrieval accuracy of medical documents [6].

Figure 1 illustrates the structure of an information retrieval system in the semantic web [4].
The system consists of a search engine and an ontology. In this structure, the upper part is in charge of the search engine, and the lower part consists of the ontology-related techniques built on the ontology repository.

93560-1365 - Feb. 12-14, 2007 ICACT2007

Figure 1. The structure of the information retrieval system in the semantic web

Most of the information retrieval systems mentioned above attempt semantic information search using an ontology. However, developing an OBIR system requires considering many ontology techniques, such as ontology creation, management, inference and query processing. When these techniques work well together, an ideal OBIR system can be developed.

3. The Techniques for Ontology-based Information Retrieval
In our approach, the ontology plays the most important role, and we focus on how to manage ontologies efficiently and how to apply them to existing search engines. In this section, we describe the techniques for ontology-based information retrieval and then propose an efficient ontology-based information retrieval model composed of the core ontology-related techniques. Successful ontology-based information retrieval requires the following techniques, which we designed:

- Ontology Management Tool (OntoMan)
- Ontology Repository
- Web Ontology Crawler
- Ontology Query Engine (CQEFT)

3.1. Ontology Management Tool (OntoMan)
First, we consider how to manage ontologies efficiently. In our study, we developed an ontology management tool named OntoMan. OntoMan supports all steps of ontology building: creation, deletion, editing, modification and storage. Users can build an ontology and then store it in the ontology repository using OntoMan. OntoMan provides a GUI environment and has several features. Its significant features are described in detail below.

3.1.1. Support for the Automatic Ontology Building Methodology
In OntoMan, the system first constructs a frame ontology for the specific domain automatically, based on WordNet. The system then adds more information to the frame ontology based on a specific input document prepared by domain experts. Table 2 explains the methodology for automatic ontology building.

Table 2. The steps of the automatic ontology building methodology
1. The user accesses OntoMan.
2. The user selects the specific domain (based on WordNet).
3. The system constructs the frame ontology (based on WordNet).
4. The user inputs the specific document (made by a domain expert).
5. The system adds more information to the frame ontology.
6. The user modifies the ontology (add, delete, edit using the OntoMan interface).
7. The system converts the constructed ontology to OWL format.

3.1.2. Support for the GUI Environment
Users can build ontologies easily using OntoMan. OntoMan is GUI-based and, in particular, represents the ontology as a tree structure. Figure 2 shows a screenshot of the OntoMan system.

Figure 2. The interface of the OntoMan system

3.1.3. Support for the Writing Guide about the Ontology Language
OntoMan provides a writing guide for RDF and OWL. RDF and OWL are widely used for building ontologies nowadays. However, the specifications of RDF and OWL are very complex and hard to understand, so OntoMan provides a writing guide for both.

As mentioned before, OntoMan is a very important technique for managing ontologies efficiently. Until now, most ontology-based information retrieval models have ignored the steps from ontology creation to ontology management. These systems simply used pre-built ontologies or built new ones, so interoperability among the systems is very low. OntoMan was therefore designed to support all the steps of ontology building.

3.2. Ontology Repository
Here we consider how to provide large ontologies to users efficiently. The ontology repository collects and stores ontologies in a specific space and is connected to OntoMan. Every user can access the ontology repository to use the existing ontologies. The repository contains the ontology files (RDF and OWL) and the fact triple files derived by the inference engine. In our approach, when an ontology is newly created or collected, the system creates the fact triple files based on pre-defined inference rules and stores them in the ontology repository together with the original ontology files.

3.3. Web Ontology Crawler
This technique addresses the reusability of ontologies. The web ontology crawler finds ontologies on the web and stores them in the ontology repository. To reuse existing ontologies efficiently, it consists of a domain classifying module, a ranking module and a retrieval module.

3.3.1. Ontology Classifying
The classifying module analyzes an ontology and decides its domain concept. To analyze the concepts, the domain classifying module first matches the concepts in the ontology to WordNet's concepts. The formula used to define the domain concept of the ontology follows the Resnik method. Using the formula, we can find the minimum highest concept in WordNet. Figure 3 explains how the domain concept of an ontology is decided using the formula.

Figure 3. The decision of the domain concept based on WordNet

In Figure 3, the minimum highest concept becomes the domain concept of the ontology. After deciding the domain concept, this module creates an index ontology of all collected ontologies based on their domain concepts.

3.3.2. Ontology Ranking
Even when ontologies are assigned the same domain concept, their degree of integrity differs. When two or more ontologies are assigned to the same domain, we should rank them in order to provide information efficiently. In this module, we measure the integrity of each ontology using the Jaccard formula (for two sets A and B, the Jaccard coefficient is |A ∩ B| / |A ∪ B|) and rank the ontologies accordingly.

3.3.3. Ontology Retrieval
The retrieval module supports efficient ontology search among the many ontologies stored in the ontology repository. Table 3 explains the processing steps of the web ontology crawler.

Table 3. Processing steps of the web ontology crawler
1. Analyze the HTML document.
2. Store the linked addresses in the queue.
3. Transfer the RDF ontology to the classifying module.
4. Match the concepts of the RDF ontology to the concepts of WordNet.
5. Decide the domain of the ontology using the Resnik formula.
6. Create the index ontology.
7. Pass the results to the ranking module.
8. Evaluate the completeness of each ontology using the Jaccard formula and rank the ontologies.
9. Show the retrieval results in order, based on the index ontology.

Figure 4. Processing steps of the web ontology crawler (components shown include the web page queue, crawler, HTML parser, RDF/OWL parser, WordNet matching, classifying, ranking, retrieval user interface and index repository)

3.4. Ontology Query Engine (CQEFT)
Here we consider how to evaluate the ontologies efficiently. In this study, we designed a new ontology query engine, the CQEFT (Controlled Query Engine For Triple).
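A query engine for triples works on facts of the form (subject, predicate, object). As a minimal sketch of that idea, and not the CQEFT implementation itself, the following Python code matches a pattern against a small list of triples, where `None` acts as a wildcard. The function name `match`, the predicate names and the example facts are assumptions made for illustration only.

```python
from typing import Iterable, Iterator, Optional, Tuple

Triple = Tuple[str, str, str]
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]


def match(triples: Iterable[Triple], pattern: Pattern) -> Iterator[Triple]:
    """Yield every triple that agrees with the pattern; None is a wildcard."""
    for triple in triples:
        if all(want is None or want == got for want, got in zip(pattern, triple)):
            yield triple


# Hypothetical facts, not taken from the ontologies used in the paper.
FACTS = [
    ("WhiteZinfandel", "rdfs:subClassOf", "WhiteWine"),
    ("WhiteWine", "rdfs:subClassOf", "Wine"),
    ("WhiteZinfandel", "hasSugar", "Sweet"),
]

# Which concepts are declared as direct subclasses of WhiteWine?
subclasses = [s for s, _, _ in match(FACTS, (None, "rdfs:subClassOf", "WhiteWine"))]
print(subclasses)
```

Keeping the facts as plain tuples makes the lookup independent of any particular RDF library; a real system would more likely read the triples from the files held in the ontology repository.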
The CQEFT consists of a reasoning part and a query processing part. The reasoning part contains 55 inference rules in total: rules for the basic graph model (30), inference rules supported by the web ontology language vocabularies (20) and consistency check rules (5). When a new ontology is created or collected, the reasoning part generates triple files based on the 55 inference rules. The query processing part then extracts information from the triple files made by the reasoning part. The query processing part also supports a text-based query interface for ordinary users, so users can search for information easily even if they do not know the complex query syntax. Figures 5 and 6 show the reasoning part and the query processing part of the CQEFT.

Figure 5. The reasoning part of the CQEFT

Figure 6. The query processing part of the CQEFT

3.5. The Structure of the Ontology-based Information Retrieval Model
In Sections 3.1, 3.2, 3.3 and 3.4, we explained the four techniques required for ideal ontology-based information retrieval. In this paper, we combine these techniques; Figure 7 illustrates the structure of the resulting ontology-based information retrieval model.

Figure 7. The structure of the ontology-based information retrieval model

Our approach was designed to retrieve information semantically. Table 4 shows the flow of our approach.

Table 4. The flow of our system
1. The user accesses the ontology-based information retrieval model.
2. The user inputs the query through the text-based query processing part (CQEFT).
3. The system finds the information about the query based on the pre-reasoned triple files.
4. The system gives the results to the user.
5. OntoMan creates or manages the ontologies.
6. The web ontology crawler collects the ontologies from the web.

In our study, we found that semantic information retrieval is possible with our approach, based on the above processing steps.

4. Evaluations
To evaluate our system, we prepared a scenario. The scenario is as follows. A man invites his girlfriend for dinner, and his girlfriend is a vegetarian. He therefore decides to prepare TOFU Steak for dinner and wants to buy a bottle of wine that goes well with TOFU Steak. From a web search engine, he finds the information that the wine that goes well with TOFU Steak is a strong sweet white Zinfandel. He then tries to find a bottle of strong sweet white Zinfandel on a web site.
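To make the scenario concrete, the sketch below shows how a lookup over pre-reasoned triples could answer the wine query. It is an illustration under stated assumptions, not the CQEFT itself: the facts, the `matchesWine` property, the `bottle42` identifier and the function names are hypothetical, and only two RDFS-style rules (subclass transitivity and type inheritance) stand in for the 55 inference rules.

```python
def forward_chain(facts):
    """Apply two RDFS-style rules repeatedly until no new triple appears."""
    closed = set(facts)
    while True:
        new = set()
        for s, p, o in closed:
            if p != "rdfs:subClassOf":
                continue
            for s2, p2, o2 in closed:
                # Transitivity: s subClassOf o, o subClassOf o2 => s subClassOf o2
                if p2 == "rdfs:subClassOf" and s2 == o:
                    new.add((s, "rdfs:subClassOf", o2))
                # Type inheritance: s2 type s, s subClassOf o => s2 type o
                if p2 == "rdf:type" and o2 == s:
                    new.add((s2, "rdf:type", o))
        if new <= closed:
            return closed
        closed |= new


def bottles_for(dish, triples):
    """Return instances whose type is a wine class matched with the dish."""
    wine_classes = {o for s, p, o in triples if s == dish and p == "matchesWine"}
    return sorted(s for s, p, o in triples if p == "rdf:type" and o in wine_classes)


# Hypothetical facts for the dinner scenario.
FACTS = {
    ("TofuSteak", "matchesWine", "SweetWhiteWine"),
    ("WhiteZinfandel", "rdfs:subClassOf", "SweetWhiteWine"),
    ("SweetWhiteWine", "rdfs:subClassOf", "Wine"),
    ("bottle42", "rdf:type", "WhiteZinfandel"),
}

print(bottles_for("TofuSteak", FACTS))                 # -> []
print(bottles_for("TofuSteak", forward_chain(FACTS)))  # -> ['bottle42']
```

Without the reasoning step the lookup finds nothing, because no bottle is stated to be a sweet white wine directly. Once the inferred triples are materialized ahead of time, the same simple lookup succeeds, which is the motivation for storing pre-reasoned triple files.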
In the evaluation, we compare our system with two other web search engines, Google and Yahoo, which are standard web search engines. In addition to the query from our scenario, we prepared three more queries. We then measured the accuracy rate of the three systems on the four queries. The accuracy rate is defined as follows.

Accuracy rate = correct results / total searched results
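The accuracy rate can be computed directly from the counts reported in Table 5. The following sketch only restates that arithmetic; the function and variable names are illustrative and not part of the original system.

```python
def accuracy_rate(correct: int, total: int) -> float:
    """Accuracy rate = correct results / total searched results."""
    if total <= 0:
        raise ValueError("total searched results must be positive")
    if not 0 <= correct <= total:
        raise ValueError("correct results must lie between 0 and total")
    return correct / total


# (correct, total) pairs as reported in Table 5.
TABLE_5 = {
    "Luncheon (light dry wine)": {
        "Our system": (45, 52), "Google.com": (41, 100), "Yahoo.com": (35, 100)},
    "Shellfish food (dry white wine)": {
        "Our system": (62, 87), "Google.com": (53, 100), "Yahoo.com": (47, 100)},
    "TOFU Steak (sweet strong white Zinfandel)": {
        "Our system": (11, 19), "Google.com": (8, 100), "Yahoo.com": (11, 100)},
    "Spicy food (sweet light white wine)": {
        "Our system": (23, 34), "Google.com": (16, 100), "Yahoo.com": (17, 100)},
}

RATES = {
    query: {system: accuracy_rate(c, t) for system, (c, t) in row.items()}
    for query, row in TABLE_5.items()
}

for query, row in RATES.items():
    summary = ", ".join(f"{system}: {rate:.1%}" for system, rate in row.items())
    print(f"{query}: {summary}")
```

Note that the denominators differ: the two web search engines are scored on their first one hundred results, while our system is scored on all the results it returned.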

Table 5. Accuracy rates
Query                                        Our system   Google.com   Yahoo.com
Luncheon (light dry wine)                    45/52        41/100       35/100
Shellfish food (dry white wine)              62/87        53/100       47/100
TOFU Steak (sweet strong white Zinfandel)    11/19        8/100        11/100
Spicy food (sweet light white wine)          23/34        16/100       17/100

Google and Yahoo returned a large number of results for the queries, so we limited their results to the first one hundred items in order and then counted the correct results among those one hundred.

Figure 8. Accuracy rates as a graph

Table 5 and Figure 8 show the accuracy rate of the search results for each system. Our system achieved the highest accuracy rate for every query. From these results, we found that semantic information retrieval is possible using our system.

5. Conclusion
In this paper, we proposed a semantic information retrieval system based on ontologies. In our study, we tried to address the limitations of existing ontology-based information retrieval systems. To address these problems, we proposed and developed all the techniques related to ontology theory, and then designed an ontology-based information retrieval model by combining them. Through the evaluation, we found that our approach can retrieve information semantically.

Acknowledgement
This research was supported by the MIC (Ministry of Information and Communication), Korea, under the ITRC (Information Technology Research Center) support program supervised by the IITA (Institute of Information Technology Advancement) (IITA-2006-C1090-0603-0040).

References
[1] Uschold, M., King, M., Moralee, S., Zorgios, Y., "The Enterprise Ontology", AIAI-TR-195, Artificial Intelligence Applications Institute (AIAI), the University of Edinburgh, 1997.
[2] http://www.cs.umd.edu/projects/plus/shoe/.
[3] http://ontoweb.aifb.uni-karlsruhe.de.
[4] Aitken, S., Reid, S., "Evaluation of an Ontology-Based Information Retrieval Tool", 12th European Conference on Artificial Intelligence (ECAI'00) Workshop on Applications of Ontologies and Problem-Solving Methods, 2000.
[5] Gruber, T., "Toward Principles for the design of ontologies used for knowledge sharing", International Journal of Human-Computer Studies, vol.43, no.5/6, pp. 907-928, 1995.
[6] Abasolo, J.M., Gómez, M., "MELISA. An Ontology-based agent for information retrieval in medicine", ECDL 2000 Workshop on the Semantic Web (SemWeb2000), pp. 73-82, 2000.