warwick.ac.uk/lib-publications

Similar documents
Semantic Annotation and Linking of Medical Educational Resources

A Semantic Web-Based Approach for Harvesting Multilingual Textual. definitions from Wikipedia to support ICD-11 revision

Languages and tools for building and using ontologies. Simon Jupp, James Malone

Supporting Patient Screening to Identify Suitable Clinical Trials

Linked Data: Fast, low cost semantic interoperability for health care?

Core Technology Development Team Meeting

LexGrid Philosophy, Model and Interfaces Harold R Solbrig Division of Biomedical Statistics and Informatics Mayo Clinic

Disease Information and Semantic Web

Representing Multiple Standards in a Single DAM: Use of Atomic Classes

Terminology as a Service HL7 AU

Prototyping a Biomedical Ontology Recommender Service

For each use case, the business need, usage scenario and derived requirements are stated. 1.1 USE CASE 1: EXPLORE AND SEARCH FOR SEMANTIC ASSESTS

warwick.ac.uk/lib-publications

ISO CTS2 and Value Set Binding. Harold Solbrig Mayo Clinic

Korea Institute of Oriental Medicine, South Korea 2 Biomedical Knowledge Engineering Laboratory,

Utilizing NCBO Tools to Develop & Use an ECG Ontology

Acquiring Experience with Ontology and Vocabularies

Collaborative & WebProtégé

WHO ICD11 Wiki LexWiki, Semantic MediaWiki and the International Classification of Diseases

SNOMED CT Implementation Approaches. National Resource Centre for EHR Standards (NRCeS) C-DAC, Pune

Using Ontologies for Data and Semantic Integration

Knowledge Representations. How else can we represent knowledge in addition to formal logic?

WebProtégé. Protégé going Web. Tania Tudorache, Jennifer Vendetti, Natasha Noy. Stanford Center for Biomedical Informatics

NCI Thesaurus, managing towards an ontology

A Vision for Bigger Biomedical Data: Integration of REDCap with Other Data Sources

Building Standardized Semantic Web RESTful Services to Support ICD-11 Revision

Developing A Semantic Web-based Framework for Executing the Clinical Quality Language Using FHIR

Development of an Environment for Data Annotation in Bioengineering

HL7 s Common Terminology Services Standard (CTS)

Interoperability and Semantics in Use- Application of UML, XMI and MDA to Precision Medicine and Cancer Research

Powering Knowledge Discovery. Insights from big data with Linguamatics I2E

Developing A Semantic Web-based Framework for Executing the Clinical Quality Language Using FHIR

LexBIG/EVS API Overview

SEMANTIC SUPPORT FOR MEDICAL IMAGE SEARCH AND RETRIEVAL

A Linked Data Translation Approach to Semantic Interoperability

A Knowledge Model Driven Solution for Web-Based Telemedicine Applications

Terminology Harmonization

Toward a Knowledge-Based Solution for Information Discovery in Complex and Dynamic Domains

NCBO Technology: Powering semantically aware applications

Semantic MediaWiki A Tool for Collaborative Vocabulary Development Harold Solbrig Division of Biomedical Informatics Mayo Clinic

NLM HL7 RFQ CHI mapping contract August/September, 2006

Workshop 2. > Interoperability <

Utilizing Semantic Web Technologies in Healthcare

Discovery services: next generation of searching scholarly information

The National Cancer Institute's Thésaurus and Ontology

Integrating TCGA Clinical Data Using Metadata-driven Tools and NLP

Ontology Servers and Metadata Vocabulary Repositories

Advances In Data Integration: The No ETL Approach. Marcos A. Campos, Principle Consultant, The Cognatic Group. capsenta.com. Sponsored by Capsenta

Taxonomy Tools: Collaboration, Creation & Integration. Dow Jones & Company

From Integration to Interoperability: The Role of Public Health Systems in the Emerging World of Health Information Exchange

The MEG Metadata Schemas Registry Schemas and Ontologies: building a Semantic Infrastructure for GRIDs and digital libraries Edinburgh, 16 May 2003

Linked Open Data in Aggregation Scenarios: The Case of The European Library Nuno Freire The European Library

A Justification-based Semantic Framework for Representing, Evaluating and Utilizing Terminology Mappings

ORES-2010 Ontology Repositories and Editors for the Semantic Web

Tania Tudorache Stanford University. - Ontolog forum invited talk04. October 2007

New Approach to Graph Databases

Apelon DTS on FHIR. John Gresh, Director of Product Development

Quick Reference Guide. Biomedical Answers

SEXTANT 1. Purpose of the Application

Community-based ontology development, alignment, and evaluation. Natasha Noy Stanford Center for Biomedical Informatics Research Stanford University

REDCap under FHIR Enhancing electronic data capture with FHIR capability

> Semantic Web Use Cases and Case Studies

Opus: University of Bath Online Publication Store

Medical Imaging Information I

Industry Adoption of Semantic Web Technology

A scalable AI Knowledge Graph Solution for Healthcare (and many other industries) Dr. Jans Aasman

University of Bath. Publication date: Document Version Publisher's PDF, also known as Version of record. Link to publication

A Semantic Web Approach to Integrative Biosurveillance. Narendra Kunapareddy, UTHSC Zhe Wu, Ph.D., Oracle

FHA Federal Health Information Model (FHIM) Terminology Modeling Process

Integrated Access to Biological Data. A use case

The Semantic Web DEFINITIONS & APPLICATIONS

Vocabulary Sharing and Maintenance using HL7Master File / Registry Infrastructure

Taking a view on bio-ontologies. Simon Jupp Functional Genomics Production Team ICBO, 2012 Graz, Austria

Health Information Exchange Content Model Architecture Building Block HISO

The Emerging Data Lake IT Strategy

A method for recommending ontology alignment strategies

Ontology-based Architecture Documentation Approach

Controlled Medical Vocabulary in the CPR Generations

Agricultural bibliographic data sharing & interoperability in China

UMLS-Query: A Perl Module for Querying the UMLS

SELF-SERVICE SEMANTIC DATA FEDERATION

Chinese Agricultural Thesaurus and its application on data sharing & interoperability

Beyond PACS. From the Radiological Imaging Archive to the Medical Imaging Global Repository. Patricio Ledesma Project Manager

OMV / CTS2 Crosswalk

Aggregating Educational Data for Patient Empowerment

Chapter 27 Introduction to Information Retrieval and Web Search

Geosemantically-enhanced PubMed Queries Using the Geonames Ontology and Web Services

From Open Data to Data- Intensive Science through CERIF

Healthcare Information and Literature Searching

CMS and ehealth. Robert Tagalicod Director, Office of ehealth Standards and Services (OESS)

Harmonizing biocaddie Metadata Schemas for Indexing Clinical Research Datasets Using Semantic Web Technologies

Smart Open Services for European Patients. Work Package 3.5 Semantic Services Definition Appendix E - Ontology Specifications

Towards the Semantic Desktop. Dr. Øyvind Hanssen University Library of Tromsø

University of Wollongong. Research Online

Utilizing, creating and publishing Linked Open Data with the Thesaurus Management Tool PoolParty

THE GETTY VOCABULARIES TECHNICAL UPDATE

Unlocking the full potential of location-based services: Linked Data driven Web APIs

Web Ontology for Software Package Management

Submission to: NLM Show Off Your Apps

Jena based Implementation of a ISO Metadata Registry. Gokce B. Laleci & A. Anil Sinaci

Transcription:

Original citation: Zhao, Lei, Lim Choi Keung, Sarah Niukyun and Arvanitis, Theodoros N. (2016) A BioPortalbased terminology service for health data interoperability. In: Unifying the Applications and Foundations of Biomedical and Health Informatics. Studies in Health Technology and Informatics, 226. IOS Press, pp. 143-146. Permanent WRAP URL: http://wrap.warwick.ac.uk/79131 Copyright and reuse: The Warwick Research Archive Portal (WRAP) makes this work by researchers of the University of Warwick available open access under the following conditions. Copyright and all moral rights to the version of the paper presented here belong to the individual author(s) and/or other copyright owners. To the extent reasonable and practicable the material made available in WRAP has been checked for eligibility before being made available. Copies of full items can be used for personal research or study, educational, or not-for-profit purposes without prior permission or charge. Provided that the authors, title and full bibliographic details are credited, a hyperlink and/or URL is given for the original metadata page and the content is not changed in any way. Publisher s statement: The final publication is available at IOS Press through http://dx.doi.org/10.3233/978-1-61499-664-4-143 A note on versions: The version presented here may differ from the published version or, version of record, if you wish to cite this item you are advised to consult the publisher s version. Please see the permanent WRAP URL above for details on accessing the published version and note that access may require a subscription. For more information, please contact the WRAP Team at: wrap@warwick.ac.uk warwick.ac.uk/lib-publications

A BioPortal-based Terminology Service for Health Data Interoperability Lei ZHAO a,1, Sarah N. LIM CHOI KEUNG a, and Theodoros N. ARVANITIS a a Institute of Digital Healthcare, WMG, University of Warwick, Coventry, UK Abstract. A terminology service makes diverse terminologies/ontologies accessible under a uniform interface. The EU TRANSFoRm project built an online terminology service for European primary care research. The service experienced performance limitations during its operation. Based on community feedback, we evaluated alternative solutions and developed a new version of the service. Based on BioPortal s scalable infrastructure, the new service delivers more features with improved performance and reduced maintenance cost. We plan to extend the service to meet Fast Healthcare Interoperability Resources specifications. Keywords. Terminology Service, BioPortal, LexEVS, FHIR 1. Introduction The need for a centralised terminology service becomes critical to querying and analyzing distributed healthcare and biomedical databases in a heterogeneous environment, where a diverse set of code systems and ontologies have been used to encode health data. Healthcare code systems and biomedical ontologies are managed and developed by different organisations and have their own distribution formats. A centralised service provides a uniform interface to browse, search and manage terminologies and concept maps, facilitating terminology interoperability. 1.1. Related Work Several US organisations and projects have developed software solutions and online services to serve terminology resources on a large scale, most notably Mayo Clinic s LexEVS 0, National Cancer Institute (NCI) Thesaurus and Metathesaurus [1], National Library of Medicine (NLM) Unified Medical Language System (UMLS) [2], and National Center for Biomedical Ontology (NCBO) BioPortal [3]. LexEVS is an open source software package which provides a common terminology model and application programming interface (API) to access a wide range of terminology formats, value sets, and cross-terminology mappings. LexEVS is used by NCI as the software infrastructure to implement its Thesaurus and Metathesaurus services. The NCI Metathesaurus supplements NLM UMLS Metathesaurus with cancer-centric terminologies. UMLS Metathesaurus is a comprehensive multi-purpose and multi-lingual thesaurus that contains millions of biomedical and health related concepts, their synonymous names and their relationships. It combines over 150 classifications, thesauri and lists of controlled terms in the biomedical domain. NCBO BioPortal is the largest repository of biomedical ontologies with over 300 ontologies. It hosts ontologies developed in OWL, OBO and other formats, as well as a large number of medical terminologies from UMLS Metathesaurus. Users can publish their ontologies to BioPortal, submit new versions, browse the ontologies, and access the ontologies and their components through a set of RESTful services, SPARQL and dereferenceable URIs. The European FP7 project TRANSFoRm (2010-2015) (www.transformproject.eu) has developed an integrated vocabulary service (VS) [4], based on LexEVS, with a focus on European primary care systems. The service extracted and loaded a subset of UMLS Metathesaurus, which contained the most important code systems for European primary care such as SNOMED CT, ICPC, ICD10, Clinical Terms v3, LOINC, DICOM, HL7, including their European language variants, as well as genetic ontologies such as Gene Ontology and OMIM. A number of national code systems e.g. UK Read Codes v2 and British National 1 Corresponding Author. E-mail: Lei.Zhao@warwick.ac.uk

Formulary (BNF) were also integrated. The service provided a web-based terminology browser and web services API to integrate with other TRANSFoRm software e.g. patient cohort identification and ecrfs. The service has experienced certain limitations in performance and scalability during its 4 years operation in TRANSFoRm community. This paper describes our attempt to address some of these limitations in a new version of the service which is based on BioPortal. 2. Methods In order to support federated medical database query and analysis, a terminology service, as a minimal set of functional operations, needs to allow users to browse a classification s subsumption hierarchies, expand a concept to a set of all the subsumed codes, lookup a concept by search text, and translate a concept from one code system to another if precise cross mappings are available. TRANSFoRm VS specifically used UMLS Metathesaurus as the main source of concept maps, and built its infrastructure on LexEVS 6.1. As the size of the content database grew however, expanding a parent concept to its children became an expensive operation in LexEVS due to its relational implementation. The raw database files (MySQL 5.1) in TRANSFoRm VS were about 30GB. Parent-child navigation involves joining tables with millions of rows, resulting in the high latency for a single navigation operation. We selected 10 clinical concepts (e.g. Type 2 diabetes, UMLS C0011860) from TRANSFoRm use cases and tested the latency of LexEVS API call for each concept to return its child concepts. The typical response time varied from 5 seconds to 30 seconds from our experience with the use case-related queries. Another common feature request by TRANSFoRm users is to make autocomplete search suggestions while users type the medical terms into the search field. LexEVS internally builds Lucene index to speed up full-text search for all the text fields when loading a terminology. However the index is not exposed and cannot be leveraged to implement the autocomplete user interface (UI). We evaluated existing services as alternative solutions due to these limitations. Both NCI Metathesaurus and NLM UMLS Metathesaurus have implemented web services for public access. Nevertheless, these services only allow to retrieve provided contents and cannot be replicated locally. BioPortal, on the other hand, not only has a public repository free to access and publish but also provides virtual appliance for local installation. The latest version of BioPortal organises ontology data in RDF format and uses a triple store as the primary storage. Parent-child navigation is much faster compared to LexEVS (less than one second on average for one level expansion). BioPortal provides a rich set of RESTful APIs which include pagination and search suggestions, convenient for UI development. BioPortal s own web UI is slow and cannot select codes to include in medical database queries. We therefore develop a new terminology browser based on its RESTful service, using modern responsive web design techniques e.g. HTML5, Bootstrap, JQuery, Fancytree and Typeahead.

3. Results Figure 1: Screenshot of our web-based terminology browser with autocomplete suggestions Built on BioPortal s REST infrastructure, our new terminology service allows our users to access terminologies in the public repository and license protected contents through a local instance. Many UMLS terminologies we have to manually load into LexEVS are available from BioPortal s public repository, having significantly reduced our maintenance overhead. As shown in [5], users can explore a terminology s subsumption hierarchy in a tree browser, expand a concept, and select all subsumed codes. The selected code set can be saved and included in a database query. Each concept has a hyperlink which will present more information about the concept, including its definition, semantic type, identifier (i.e. Concept Unique Identifiers for UMLS terminologies), and cross-mappings to other code systems. The search field allows users to search concepts by either terms or codes. As user types in the field, a list of suggested terms are presented in a dropdown window to save user input. The search results are listed in a paged table where users can navigate and filter further. 4. Discussion The main focus of our current terminology service is easy access to healthcare code systems in diverse formats and facilitate integration with patient data analytics. However, the service needs to extend its functional scope in order to be used in more healthcare contexts, e.g. to support the functions related to value set as specified by HL7 Fast Healthcare Interoperability Resources (FHIR) [6]. FHIR is a next generation standards framework created by HL7 for sharing healthcare data. Over fifty organisations worldwide use FHIR to exchange information. FHIR makes important differentiation between the concepts of code system and value set. A code system defines a set of codes with meanings. A value set selects a set of codes from one or more code systems to specify which codes can be used in a particular context. A coded element is bound to a value set instead of a code system. A value set can either contain an in-line code system or describe a set of rules to select codes defined in other code systems. These rules can be simply a direct list of codes from a specified version of a code system or can be complex query expressions. FHIR terminology server will expand the value set to a collection of enumerated codes ready to use for data entry or validation. For example, a value set of Myocardial infarction using SNOMED CT is defined as is-a SCT::22298006. The value set will be expanded to the set of all descendants of 22298006 including 22298006 itself. BioPortal provides the service infrastructure to distribute code systems but lacks support for value sets. We plan to investigate the FHIR terminology specification and extend our service to support definition and resolution of value sets, particularly with dynamic expressions.

5. Conclusions The robust and scalable software infrastructure makes BioPortal a good candidate for building online terminology service. We have developed a new version of our terminology service based on BioPortal with improved performance, more UI features and reduced maintenance cost. We plan to extend our service to a full-featured HL7 FHIR terminology server in next version, and also explore BioPortal s SPARQL capability for semantic reasoning in future. Acknowledgements: This work has been supported by the West Midlands Academic Health Science Network, UK. References [1] National Cancer Institute, LexEVS - NCI Wiki. [Online]. Available: https://wiki.nci.nih.gov/display/lexevs/lexevs. [Accessed: 10-May-2016]. [2] National Cancer Institute, NCI Metathesaurus. [Online]. Available: https://ncimeta.nci.nih.gov/ncimbrowser/. [Accessed: 10- May-2016]. [3] US National Library of Medicine, Unified Medical Language System (UMLS), 2016. [Online]. Available: http://www.nlm.nih.gov/research/umls/. [Accessed: 10-May-2016]. [4] M. Salvadores, P. R. Alexander, M. A. Musen, and N. F. Noy, BioPortal as a Dataset of Linked Biomedical Ontologies and Terminologies in RDF, Semantic Web, vol. 4, no. 3, pp. 277 284, 2013. [5] S. N. Lim Choi Keung, L. Zhao, E. Tyler, T. N. Arvanitis, and F.D.R. Hobbs, Integrated Vocabulary Service for Health Data Interoperability, in 4 th International Conference on ehealth, Telemedicine and Social Medicine (etelemed 2012), Valencia, Spain, 2012, pp. 124 127. [6] HL7, Terminology-service - FHIR v1.0.2. [Online]. Available: https://www.hl7.org/fhir/terminology-service.html. [Accessed: 10-May-2016].