Available online at ScienceDirect. Procedia Computer Science 52 (2015)

Available online at www.sciencedirect.com

ScienceDirect

Procedia Computer Science 52 (2015) 1071-1076

The 5th International Symposium on Frontiers in Ambient and Mobile Systems (FAMS-2015)

Health, Food and User's Profile Ontologies for Personalized Information Retrieval

Tarek Helmy*, Ahmed Al-Nazer, Saeed Al-Bukhitan, Ali Iqbal

Information and Computer Science Department, College of Computer Science & Engineering, King Fahd University of Petroleum & Minerals, Dhahran 31216, Mail Box 413, Saudi Arabia

*On leave from College of Engineering, Department of Computers & Automatic Control, Tanta University, Egypt

[helmy, alnazer, bukhitan, aliqbal]@kfupm.edu.sa

Abstract

We aim to retrieve personalized information in response to user queries in the food, health and nutrition domains, such as "Is apple good for people with heart diseases?", "How much honey can be taken by a diabetic patient?", "What are the health benefits of eating pineapple?" and "What are the fruits that contain the daily need quantity of calcium?". To answer such queries, the information retrieval system needs to integrate ontologies from different domains such as food, nutrition, health (diseases, body parts, body functions) and recipes. In addition, to support multilingual queries, the system and ontologies require aggregation of information from multilevel ontologies. To achieve high relevancy and coverage, we also need ontologies with comprehensive and rich vocabularies. Moreover, for the ontologies to be used effectively for annotation, their concept names should be unique and self-contained. The main focus of this paper is to integrate ontologies from the food, health and nutrition domains to help personalized information systems retrieve food and health recommendations based on the user's health conditions and food preferences. No existing ontologies explicitly satisfy these requirements. We therefore developed them by creating new ontologies and by integrating and reusing some of the existing ontologies to meet our requirements.

© 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of the Conference Program Chairs.

Keywords: Ontology Integration; Semantic Web; Personalized Retrieval.

* Corresponding author. Tel.: +966138601967; fax: +966138601967. E-mail address: helmy@kfupm.edu.sa

1877-0509. doi:10.1016/j.procs.2015.05.114

1. Introduction

The Semantic Web moves the Internet from a Web of documents to a Web of data, where linked data gives computers the ability to provide better services such as reasoning and inference [2,3,4,5,6,7,8,9]. Semantic Web technologies help in building data stores on the Web, creating vocabularies and providing rules for handling data. Technologies used by linked data include the Resource Description Framework (RDF) [8], the SPARQL Protocol and RDF Query Language (SPARQL), and the Web Ontology Language (OWL) [1]. An ontology is a formal representation of knowledge as a network of concepts within a certain domain, using a shared terminology for the types, properties and relationships of the domain's concepts. The main components of ontologies are: Concepts, which are similar to classes in Object Oriented Programming (OOP); Instances, which are similar to objects in OOP; Attributes, which are part of a concept; Attribute values, which are the values of the attributes and part of an instance; Subject, which can be a concept, instance, attribute or attribute value; Object, which can be a concept, instance, attribute or attribute value; Predicate, which is a relation between a subject and an object; and Triple, which is a subject-predicate-object statement. This paper introduces the food, health and nutrition domain ontologies and the user's profile ontology to be used by our Semantic Web-based personalized retrieval system [16]. The rest of this paper is organized as follows. Section 2 presents existing related ontologies and their limitations with respect to our system requirements. Section 3 explains the development cycle for the ontologies and their integration. Section 4 presents the performance evaluation, and Section 5 concludes the paper and presents future work.

2. Related Ontologies

2.1. Semantic Diet Ontologies

Evan Patton developed a project called Semantic Diet (SD) [12,13] to help people eat healthier. SD has a main ontology with one concept related to nutrition and two concepts related to food. The food concepts are based on two USDA food tables: food-item and food-groups. In addition, SD has other ontologies: recipes, units of measurement, food serving size, and nutritional guidelines. One advantage of the SD ontologies is that they are built on the USDA database, which is used in many semantic applications. Another advantage is that SD integrates food concepts with nutrition concepts through a single property. One disadvantage of the SD ontologies is that they are flat and shallow, with only one to two levels. Another limitation is that they are available in English only. Moreover, many food items have similar names, which makes it difficult to use them as is for annotation. Finally, the SD ontologies lack synonyms, which leads to limited coverage when annotating Web resources and user queries. We have resolved all of these limitations in the developed ontologies.

2.2. International Classification of Diseases (ICD-10) Ontology

ICD10 is a large ontology of 14502 concepts covering diseases and health care procedures, and its large vocabulary makes it useful [17]. Although the ontology itself is available in English, translations of the ICD10 vocabulary are available in other languages such as Arabic. The ICD10 ontology is designed to categorize diseases and health issues based on various types of health and important records.

2.3. Human Disease Ontology

The Disease Ontology (DO) [13] is an open source ontology for the integration of biomedical data associated with human disease. Terms in DO are well defined, using standard references. These terms are linked to well-established and adopted terminologies that contain disease and disease-related concepts. DO represents a comprehensive knowledge base of 8043 human diseases. Each concept has references to the most common health-related ontologies, together with synonyms or alternative names for the same concept. DO is very useful for semantic annotation for two reasons: each concept has a self-contained name and a rich set of synonyms. We have selected this ontology for semantic annotation of disease concepts and tuned it for multilingual support.
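The two properties credited to DO above, self-contained concept names and a rich set of synonyms, can be illustrated with a small label-lookup annotator. This is only a sketch under stated assumptions and not the annotation method used in the paper: the `Concept` class, the `EX:` identifiers and the sample labels and synonyms are all hypothetical and are not taken from DO.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One ontology concept with a preferred label and alternative names."""
    concept_id: str  # hypothetical identifier, not a real DO id
    label: str
    synonyms: list[str] = field(default_factory=list)


def build_index(concepts):
    """Map every lower-cased label or synonym to its concept id."""
    index = {}
    for concept in concepts:
        for name in [concept.label, *concept.synonyms]:
            index[name.lower()] = concept.concept_id
    return index


def annotate(text, index):
    """Return the sorted ids of all concepts whose names occur in the text."""
    lowered = text.lower()
    found = set()
    for name, concept_id in index.items():
        # Word boundaries stop a name from matching inside a longer word.
        if re.search(r"\b" + re.escape(name) + r"\b", lowered):
            found.add(concept_id)
    return sorted(found)


# Illustrative sample data (assumed, not from the paper).
CONCEPTS = [
    Concept("EX:0001", "diabetes mellitus", ["diabetes"]),
    Concept("EX:0002", "heart disease", ["cardiac disease", "heart diseases"]),
]
INDEX = build_index(CONCEPTS)

print(annotate("Is apple good for people with heart diseases?", INDEX))
```

In this sketch the query is annotated only because the plural form is listed as a synonym; without it, the preferred label alone would miss the match, which is the coverage problem attributed to ontologies that lack synonyms.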

2.4. AGROVOC Ontologies

AGROVOC [15] provides ontologies with a rich vocabulary covering the areas of interest of the Food and Agriculture Organization (FAO) of the United Nations (UN), such as food and nutrition. AGROVOC uses the standard RDF format to represent its linked dataset. Its main advantage is multilingual support: 22 languages are included and four more are under development.

3. Ontology Development Process

There are different methodologies for developing ontologies [11]. We used four processes, as shown in Tables 1 to 4.

Table 1: Domain Ontology Development Process
Process No: 1
Output: Domain ontology
Methodologies:
- Reuse a single existing domain ontology or multiple heterogeneous domain ontologies as they are.
- Extend an existing domain ontology or build a domain ontology from scratch.

Table 2: Cross Domain Ontologies Development Process
Process No: 2
Output: Integrated cross-domain ontologies
Methodologies:
- Reuse or extend an existing integration between different domain ontologies as is.
- Build an integration between different domain ontologies from scratch (merge the ontologies into one ontology, create an integration ontology, and link the ontologies with relationships).

Table 3: Multilingual Ontologies Development Process from Multiple Mono-lingual Ontologies
Process No: 3
Output: Integrated multilingual domain ontologies using either one-to-one mapping or an agnostic ontology acting as a bridge between the existing ontologies
Methodologies:
- Automatic alignment between the monolingual ontologies (e.g. using a translation service or a mediator like Wikipedia).
- Manual alignment between the monolingual ontologies.
- Semi-automatic alignment between the monolingual ontologies, i.e. partially automatic and partially manual.
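The two outputs named in Table 3, a one-to-one mapping and a language-agnostic ontology acting as a bridge, can be contrasted in a short sketch. It is illustrative only: the `BRIDGE:` identifiers, the label tables and the function name are assumptions introduced here, not structures described in the paper.

```python
# One-to-one mapping: each English label points directly at its Arabic
# counterpart. A further language would need a new table per language pair.
EN_TO_AR = {
    "apple": "تفاح",
    "honey": "عسل",
}

# Bridge approach: every monolingual label points at a language-agnostic
# concept id (hypothetical ids), so a new language only adds one label table.
LABELS = {
    "en": {"apple": "BRIDGE:apple", "honey": "BRIDGE:honey"},
    "ar": {"تفاح": "BRIDGE:apple", "عسل": "BRIDGE:honey"},
}


def translate_via_bridge(term, source_lang, target_lang):
    """Translate a label by passing through the shared concept id."""
    concept_id = LABELS[source_lang].get(term)
    if concept_id is None:
        return None  # term is not covered by the source-language labels
    for label, target_id in LABELS[target_lang].items():
        if target_id == concept_id:
            return label
    return None  # concept has no label in the target language


print(translate_via_bridge("apple", "en", "ar"))
```

Both structures give the same answer for a covered term; the difference is in maintenance cost as languages are added, which is one plausible reason to prefer a bridge once more than two languages are involved.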
Table 4: Multilingual Ontologies Development Process from Single Mono-lingual Ontology
Process No: 4
Output: Integrated multilingual domain ontologies using either one-to-one mapping or an agnostic ontology acting as a bridge between the existing ontologies
Methodologies:
- Option 1 (create a different ontology for each culture): Create a monolingual domain ontology or use multilingual ontologies to align the two domain ontologies.
- Option 2 (enrich the existing ontology or replicate it): Automatic translation of the input monolingual domain ontology into a new language; manual translation of the input monolingual domain ontology; semi-automatic translation of the input monolingual domain ontology.

3.1. Disease Ontology

We have adapted the human Disease Ontology (DO) to produce a multilingual ontology that covers English and Arabic at this stage. We defined different interactions with food and nutrition concepts. We chose this ontology because its concepts are self-contained, unlike those of ICD10 [17].

3.2. Food Ontology

We have selected the Semantic Diet (SD) ontology, which is aligned with the USDA food database and is useful for annotation [10]. Its limitations are its shallow hierarchy and its lack of multilingual support. For the hierarchy, we have extended the ontology with 4 to 5 levels in addition to the two levels provided by the initial SD ontology. Multilingual support was added to cover English and Arabic at this stage. We maintained the same integration with the nutrition, religion, culture and recipe ontologies.

3.3. Nutrition Ontology

As with food, we have selected the nutrition ontology provided by SD as the starting ontology. The SD nutrition ontology contains only one concept with 146 distinct nutrition elements, with instances for all food instances. We have extended the SD ontology to multiple levels in order to capture the aggregation of nutrients in the same group.

3.4. Body Function and Body Part Ontologies

Since we did not find suitable ontologies covering concepts related to the human body (functions, systems and parts), we built a primitive ontology as a proof of concept, as shown in Figure 1.

3.5. Religion Ontology

We need a religion ontology to map the profile, health and food ontologies to the related religion properties. The religion ontology depends on the other developed domain ontologies and contains properties and relations with them. Hence, we created the religion ontology as a new ontology to answer questions about food preferences with regard to the user's religion, as shown in Figure 1.

3.6. Culture Ontology

The culture ontology depends on the other developed domain ontologies and contains properties and relations with them. Hence, we created the culture ontology as a new ontology to answer questions about food preferences with regard to the user's culture, as shown in Figure 1.

3.7. Recipe Ontology

As with food and nutrition, we have selected the recipe ontology provided by SD as the starting ontology. The SD recipe ontology contains only one concept without any instances. We have extended the ontology to multiple levels in order to capture the aggregation of recipes in the same group, as shown in Figure 1.

3.8. User's Profile Ontology

The user's profile ontology [16] is based on the user's preferences and is integrated with the domain ontologies for semantic manipulation of the user's queries and personalization of results. The mix of personal information with specialized food and health information motivates a specific profile ontology that can help in personalizing food and health information.
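How the religion, culture and profile ontologies of Sections 3.5 to 3.8 might combine to personalize a list of foods is sketched below. This is a minimal illustration and not the system's implementation: the `Food` and `Profile` classes, their fields, the tags and the sample data are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Food:
    name: str
    tags: frozenset = frozenset()  # hypothetical food properties


@dataclass
class Profile:
    """Hypothetical user profile holding preference-related restrictions."""
    avoided_by_religion: set = field(default_factory=set)
    avoided_by_culture: set = field(default_factory=set)
    disliked_foods: set = field(default_factory=set)


def personalize(foods, profile):
    """Keep only the foods that violate none of the profile restrictions."""
    avoided_tags = profile.avoided_by_religion | profile.avoided_by_culture
    return [
        food.name
        for food in foods
        if not (food.tags & avoided_tags)
        and food.name not in profile.disliked_foods
    ]


# Illustrative sample data (assumed, not from the paper).
FOODS = [
    Food("apple"),
    Food("pork sausage", frozenset({"pork"})),
    Food("honey"),
]
USER = Profile(avoided_by_religion={"pork"}, disliked_foods={"honey"})

print(personalize(FOODS, USER))
```

In an ontology-backed system the tags and restrictions would come from properties and relations between the ontologies rather than from hard-coded sets; the sketch only shows the filtering step.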
The profile ontology is linked with the disease, body part, body function, food, nutrition and recipe ontologies. More details on the different approaches can be found in [5,6,11,14].

3.9. Integration Ontology

The integration ontology, shown in Figure 1, is the upper-layer ontology that integrates the health ontologies (Disease, Body Parts, Body Functions) with the food-related ontologies (Food Item and Nutrient). The integration uses the commonly known relations among the domains, which allows us to capture information and reason over it by following these relations.

4. Performance Evaluations

We have conducted several experiments to validate the system performance. We have implemented all the management tasks needed to support the knowledge integration of the food, health and nutrition domains. To assess the effectiveness of the developed integrated ontology, we used an existing reference set of documents related to the three domains: Food, Nutrition and Health. The aim is to evaluate the comprehensiveness and completeness of the developed integrated ontology with respect to handling all types of queries. We collected 453 queries from different sources. Table 5 shows the sources and the distribution of the collected queries. We categorized the 453 queries based on the concepts of the health and food domains. Table 6 shows the distribution of the queries over the categories. We tested whether the system can answer the given queries using the developed integrated ontology, based on the type of question asked by the user. Some of the answers depended on the data available in the knowledge base and not only on the ontologies. To evaluate comprehensiveness, we studied each query and checked whether the developed ontologies could answer it. The result of this evaluation is shown in Table 7.

Figure 1: Integration Ontology

Table 5: Query Source Distribution
Source | Number of Collected Queries
Survey sent to users interested in the system | 98
Domain experts | 53
Yahoo! Answers | 103
Google Answers | 86
Various Health Consumer Websites | 113

Table 6: Query Categories
Query Category | Query Type (Yes/No, List, Quantities) | Total Queries per Category
Food-centric questions | 32 | 57
Nutrition-centric questions | 29 | 27
Disease-centric questions | 24 | 34
Body Part-centric questions | 18 | 12
Body Function-centric questions | 23 | 16
Profile-centric questions | 11 | 8
Culture-centric questions | 14 | 10
Recipe-centric questions | 21 | 24
Total | 172 | 188
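The Performance column of Table 7 appears to be the share of fully answered queries in each category. The paper does not state the formula, so this reading is an assumption; the sketch below checks it against the counts reported in Table 7. The dictionary layout and function name are introduced here for illustration.

```python
# Counts per category as reported in Table 7:
# (queries, fully answered, partially answered, no answer)
TABLE_7 = {
    "Food": (105, 81, 8, 16),
    "Nutrition": (75, 53, 4, 18),
    "Disease": (74, 64, 2, 8),
    "Body Part": (37, 29, 0, 8),
    "Body Function": (45, 40, 1, 4),
    "Profile": (26, 20, 2, 4),
    "Culture": (33, 27, 3, 3),
    "Recipe": (58, 50, 1, 7),
}


def performance(queries, fully_answered):
    """Assumed metric: percentage of queries that were fully answered."""
    return round(100 * fully_answered / queries, 1)


# Per-category performance under the assumed formula.
for category, (queries, full, partial, none) in TABLE_7.items():
    print(f"{category}: {performance(queries, full)}%")

# Overall performance from the column totals.
TOTAL_QUERIES = sum(row[0] for row in TABLE_7.values())
TOTAL_FULL = sum(row[1] for row in TABLE_7.values())
print(f"Total: {performance(TOTAL_QUERIES, TOTAL_FULL)}%")
```

Under this reading, partially answered queries do not count toward performance, so the reported figures are a conservative measure of coverage.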

Table 7: Performance
Query Category | Queries | Fully Answered | Partially Answered | No-Answer | Performance
Food-centric questions | 105 | 81 | 8 | 16 | 77.1%
Nutrition-centric questions | 75 | 53 | 4 | 18 | 70.7%
Disease-centric questions | 74 | 64 | 2 | 8 | 86.5%
Body Part-centric questions | 37 | 29 | 0 | 8 | 78.4%
Body Function-centric questions | 45 | 40 | 1 | 4 | 88.9%
Profile-centric questions | 26 | 20 | 2 | 4 | 76.9%
Culture-centric questions | 33 | 27 | 3 | 3 | 81.8%
Recipe-centric questions | 58 | 50 | 1 | 7 | 86.2%
Total | 453 | 364 | 21 | 68 | 80.4%

5. Conclusion and Future Work

This paper presents food, health, nutrition and user's profile ontologies for semantic query manipulation and result personalization. We investigated the existing ontologies and summarized their limitations with respect to the system requirements. We developed the user's profile ontology, enhanced the existing related ontologies and integrated them with multilingual support. Where multilingual domain ontologies already exist and encapsulate cultural as well as linguistic differences in the domain, they will not be similar in structure, and aligning them is required before they can work with our system. If the implementation supports the agnostic ontology approach for multilingual ontologies, this limitation can be handled as future work.

Acknowledgements

We acknowledge the support from King Abdulaziz City for Science and Technology (KACST) via the Science & Technology Unit at King Fahd University of Petroleum & Minerals (project No. 10-INF1381-04), which is part of the National Science, Technology and Innovation Plan.

References

1. Semantic Web. [Online]. Available: http://www.w3.org/standards/semanticweb/. [Accessed: 05-May-2014].
2. Y. Wang and Z. Liu. Personalized health information retrieval system. AMIA Annual Symposium. Washington DC; 2005, p. 1149.
3. X. Jiang and A.-H. Tan, Learning and inferencing in user ontology for personalized Semantic Web search, Inf. Sci. (Ny)., vol. 179, no. 16, pp. 2794-2808, Jul. 2009.
4. Y. Li, J. Mostafa and X. Wang. A privacy enhancing infomediary for retrieving personalized health information from the Web. Personal Information Management - A SIGIR 2006 Workshop; 2006, pp. 82-85.
5. S. Sahay and A. Ram. Socio-semantic health information access. AAAI 2011 Spring Symposium; 2011.
6. D. Matsumoto and L. Juang. Culture and Psychology. Cengage Learning, Inc. 5th edition, United States; 2012.
7. X. Tao, Y. Li and N. Zhong. A personalized ontology model for web information gathering. IEEE Transactions on Knowledge and Data Engineering, IEEE Computer Society Digital Library; 2011; 23 (4), pp. 496-511.
8. RDF: Resource Description Framework, http://www.w3.org/rdf/, last visited on: 21-01-2013.
9. The Protégé Ontology Editor and Knowledge Acquisition System. Available: http://protege.stanford.edu. [Accessed: 20-Jan-2013].
10. C. Snae and M. Brückner, FOODS: A Food-Oriented Ontology-Driven System, 2nd IEEE Int. Conf. Digit. Ecosyst. Technol. (DEST 2008), pp. 168-176, 2008.
11. D. Wimalasuriya and D. Dou, Ontology-based information extraction: An introduction and a survey of current approaches, J. Inf. Sci., no. December 2009, pp. 1-20, 2010.
12. USDA National Nutrient Database for Standard Reference. [Online]. Available: http://www.ars.usda.gov/services/docs.htm?docid=8964.
13. Semantic Diet - Intelligent dieting on the intelligent web. [Online]. Available: http://www.semanticdiet.com/. [Accessed: 01-Jan-2013].
14. Human Disease Ontology. [Online]. Available: https://bioportal.bioontology.org/ontologies/doid. [Accessed: 05-Jan-2014].
15. FAO. AGROVOC. [Online]. Available: http://www.fao.org/agrovoc/. [Accessed: 10-Jan-2014].
16. A. Al-Nazer and T. Helmy. Semantic Query-Manipulation and Personalized Retrieval of Health, Food and Nutrition Information. Elsevier open-access Procedia Computer Science Journal, vol. 19, June 2013, pp. 163-170.
17. International Classification of Diseases (ICD). Available: http://www.who.int/classifications/icd/en/. [Accessed: 20-Jan-2013].