Towards better data
Johannes Ulander
Standardisation and Harmonisation Specialist
S-Cubed
PhUSE SDE Beerse, 2017-11-28

Agenda
- What is data?
- Current state of submissions
- Introduction to linked data and graph db
- How does this improve quality?
- Summary

Disclaimer
- No magic tricks
- All similarities to actual data are intentional
- Everything is actual data
- CDISC pilot submission available at http://www.cdisc.org
- But sometimes I've made it PowerPoint friendly
- No data was harmed during the making of this presentation

How we think of data

Is this the data we're looking for?

Current state of submissions

FDA @ PhUSE CSS 2016
From a presentation by Mary Doi, M.D., M.S. (FDA CDER)

FDA @ PhUSE CSS 2017
From a presentation by Crystal Allard, Special Assistant to the Director, Office of Computational Science

Rate of Change
[Chart: one data series ("Series1"), axis scale 0 to 10, categories 1 to 7]

Left and Right Sides
- We map from here to datasets inconsistently
- We design here without fully understanding the right hand side

We Need Control
- Constant new versions: rate-of-change of versions
- Precision: Which version am I using?
- Visibility: What changed? When did it change? What is the impact of the change?
- Ease-of-use: Make it easier to use. Machine readable.

Is there another way of managing clinical data?
Could linked data or graph db provide something?

Introduction to linked data and graph db
[Diagram: an Investigator node (Name: Dr. X) connected by a "Works at" relationship to a Site node (ID: 1, City: London)]

Introduction to linked data and graph db
[Diagram: the same graph extended with a second Site node (ID: 2, City: Leeds) and two more Investigator nodes (Name: Dr. Who; Name: Dr. Y)]
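
The site and investigator diagram can be sketched as a tiny in-memory property graph. This is only an illustration, not the tool used in the presentation: the `Graph` class, the `WORKS_AT` relationship name, and the placement of Dr. Who and Dr. Y at the Leeds site are assumptions made for the example (the slides only show Dr. X working at site 1).

```python
from dataclasses import dataclass


@dataclass
class Node:
    label: str   # node type, e.g. "Site" or "Investigator"
    props: dict  # free-form properties, e.g. {"name": "Dr. X"}


class Graph:
    """Minimal property graph: a list of nodes and a list of typed edges."""

    def __init__(self):
        self.nodes = []
        self.edges = []  # (source node, relationship name, target node)

    def add_node(self, label, **props):
        node = Node(label, props)
        self.nodes.append(node)
        return node

    def relate(self, source, relationship, target):
        self.edges.append((source, relationship, target))

    def targets(self, source, relationship):
        """Follow one relationship type outwards from a node."""
        return [t for s, r, t in self.edges if s is source and r == relationship]

    def sources(self, target, relationship):
        """Follow one relationship type backwards to a node."""
        return [s for s, r, t in self.edges if t is target and r == relationship]


graph = Graph()
london = graph.add_node("Site", id=1, city="London")
leeds = graph.add_node("Site", id=2, city="Leeds")
dr_x = graph.add_node("Investigator", name="Dr. X")
dr_who = graph.add_node("Investigator", name="Dr. Who")
dr_y = graph.add_node("Investigator", name="Dr. Y")

graph.relate(dr_x, "WORKS_AT", london)  # shown on the slide
graph.relate(dr_who, "WORKS_AT", leeds)  # assumption for illustration
graph.relate(dr_y, "WORKS_AT", leeds)    # assumption for illustration

london_staff = [n.props["name"] for n in graph.sources(london, "WORKS_AT")]
leeds_staff = [n.props["name"] for n in graph.sources(leeds, "WORKS_AT")]
```

Adding a new site or investigator only adds nodes and edges; nothing about the existing structure has to be redesigned.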

Race/Ethnicity
MULTIPLE
[Diagram: a Subject node (USUBJID: 1) linked to a Sex node (Name: Female), an Ethnicity node (Name: Not Hispanic or Latino), and two Race nodes (Name: Asian; Name: Black or African American)]

The graph can be queried
Don't need to change the question
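
A minimal sketch of the race example, assuming a plain edge list as the graph. The relationship names (`HAS_RACE` and so on) are hypothetical, and the rule that collapses several races into the single value MULTIPLE is this example's reading of the slide, not a statement of how any particular tool derives it.

```python
# Hypothetical edge list: (subject id, relationship name, value node name).
edges = [
    ("1", "HAS_SEX", "Female"),
    ("1", "HAS_ETHNICITY", "Not Hispanic or Latino"),
    ("1", "HAS_RACE", "Asian"),
    ("1", "HAS_RACE", "Black or African American"),
]


def races(usubjid):
    """Ask the graph the real question: which races are recorded?"""
    return [value for subject, rel, value in edges
            if subject == usubjid and rel == "HAS_RACE"]


def tabular_race(usubjid):
    """Collapse to a single cell, as a two-dimensional dataset requires."""
    found = races(usubjid)
    if len(found) > 1:
        return "MULTIPLE"
    return found[0] if found else None


all_races = races("1")
single_cell = tabular_race("1")
```

The graph keeps both race values, so the same question ("which races?") works for subjects with one race or several; the single-cell value is derived only when a tabular output needs it.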

Medical History
[Diagram: Subject nodes (USUBJID: 01-701-1015; USUBJID: 01-701-1148) linked to Disease nodes (Name: Tinnitus; Name: Alzheimer's) by a "Significant pre-existing condition" relationship]

How does this impact data quality?
What would happen if we had tools that were built upon linked data?

Tools built with linked data
- Form Nodes view
- Form view
- Form built with Biomedical Concepts

Tool with linked data
- Add controlled terminology

Tool with linked data
Form and terminology
[Screenshot: versions 3.0.0 and 3.1.0]
Enables impact analysis
- Gives us control of version changes

Electronic Health Records (EHR) vs. Clinical data
- Uses UCUM units and LOINC codes
- Can they share definitions?

Tool with linked data: Map definitions
UCUM units and LOINC codes = Mapping = Shares the same definition!

Tool with linked data
Form + mappings
We just need some data here!

EHR demo server in US
Built on FHIR
- Uses UCUM units and LOINC codes

Updated graph with EHR data
- Machine readable
- Machine understandable
- Traversal
- EHR and Clinical Data can share definitions
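
To make "share definitions" concrete, here is a small sketch of one definition, keyed by a LOINC code and a UCUM unit, being used to read a FHIR-style observation into a clinical vital signs record. Everything here is an assumption for illustration: the observation is hand-written rather than fetched from the demo server, the pairing of LOINC 8480-6 (systolic blood pressure) with the test code SYSBP is this example's choice, and the unit is left in its UCUM form.

```python
# Hand-written FHIR-style Observation fragment (hypothetical values).
ehr_observation = {
    "resourceType": "Observation",
    "code": {"coding": [{"system": "http://loinc.org", "code": "8480-6"}]},
    "valueQuantity": {
        "value": 120,
        "system": "http://unitsofmeasure.org",
        "code": "mm[Hg]",
    },
}

# One definition shared by the clinical form and the EHR (assumed mapping).
shared_definition = {"loinc": "8480-6", "ucum": "mm[Hg]", "testcd": "SYSBP"}


def to_vs_record(observation, definition):
    """Return a vital signs style record if the observation matches the
    shared definition on both LOINC code and UCUM unit, else None."""
    loinc_codes = {
        coding["code"]
        for coding in observation["code"]["coding"]
        if coding["system"] == "http://loinc.org"
    }
    quantity = observation["valueQuantity"]
    if definition["loinc"] not in loinc_codes:
        return None
    if quantity["code"] != definition["ucum"]:
        return None
    return {
        "VSTESTCD": definition["testcd"],
        "VSORRES": quantity["value"],
        "VSORRESU": quantity["code"],  # unit kept as the UCUM code here
    }


vs_record = to_vs_record(ehr_observation, shared_definition)
```

Because both sides point at the same codes, no study-specific mapping table is needed between the EHR value and the clinical record.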

Updated graph
Add SDTM metadata

Linked data + Query = SDTM Domain

    // Medical History (MH) rows for subject '1', with the study it belongs to
    MATCH (s:subject)-[]->(n), (s:subject)-[]-(study:study)
    WHERE s.usubjid = '1' AND n.domain = 'MH'
    RETURN study.name AS STUDYID,
           s.usubjid AS USUBJID,
           n.domain AS DOMAIN,
           n.name AS MHTERM,
           n.stdtc AS MHSTDTC
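
The idea that a domain is just a query result can also be sketched without a graph database. The Python below mimics the MH query on a toy in-memory graph; the study name, the start date, and the node contents are made-up values, not data from the CDISC pilot, and edges are stored only in the subject-to-node direction as a simplification.

```python
# Toy graph with hypothetical values.
study = {"label": "study", "name": "STUDY-X"}
subject = {"label": "subject", "usubjid": "1"}
condition = {"label": "disease", "domain": "MH",
             "name": "Tinnitus", "stdtc": "2010-05-01"}
exposure = {"label": "exposure", "domain": "EX",
            "name": "Drug A", "stdtc": "2014-01-02"}

# Directed edges (source, target); relationship types are ignored,
# as in the untyped -[]-> pattern of the Cypher query.
edges = [(subject, study), (subject, condition), (subject, exposure)]


def mh_rows(usubjid):
    """Build MH rows the way the Cypher query does: match a subject,
    its MH nodes, and its study, then project SDTM variable names."""
    rows = []
    for source, node in edges:
        if source.get("label") != "subject" or source["usubjid"] != usubjid:
            continue
        if node.get("domain") != "MH":
            continue
        studies = [t for s, t in edges
                   if s is source and t.get("label") == "study"]
        for st in studies:
            rows.append({
                "STUDYID": st["name"],
                "USUBJID": source["usubjid"],
                "DOMAIN": node["domain"],
                "MHTERM": node["name"],
                "MHSTDTC": node["stdtc"],
            })
    return rows


rows = mh_rows("1")
```

The rows are produced on demand from the graph, so the MH dataset is an output of the data rather than a separately stored and mapped copy of it.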

SDTM domains are outputs from the graph

First exposure to study treatment
- Part of the query
- Not duplicated
- Not mapped

EX
    MATCH (s:subject)-[]->(n), (s:subject)-[]-(study:study)
    WHERE s.usubjid = '1' AND n.domain = 'EX'

DS
    MATCH (s:subject)-[]->(n), (s:subject)-[]-(study:study)
    WHERE s.usubjid = '1' AND n.domain = 'DS'

DM
    MATCH (s:subject)-[]->(n), (s:subject)-[]-(study:study)
    WHERE s.usubjid = '1' AND n.domain = 'DM'

Summary
- Linked data solves many of our current problems which cannot be solved with two-dimensional structures (relational databases)
- Excels at handling complex information
- No structural boundaries: metadata and data merged
- Gives us precision and enables control
  - No mapping
  - Impact management
  - Version control
- EHR and Clinical Data can co-exist
- Next steps: Focus on how we represent knowledge

Thanks for listening

Contact Information
Email: ju@a3informatics.com
More information at: www.a3informatics.com
A3 Informatics 2017