ModelChain: Decentralized Privacy-Preserving Healthcare Predictive Modeling Framework on Private Blockchain Networks
Tsung-Ting Kuo, Chun-Nan Hsu, and Lucila Ohno-Machado
pSCANNER Face-to-Face Meeting, October 13, 2016
Office of the National Coordinator for Health IT
Supported by the Patient-Centered Outcomes Research Institute (PCORI) Contract CDRN-1306-04819
Predictive Modeling
Healthcare predictive modeling
  » Machine learning from healthcare data to predict outcomes
Cross-institutional healthcare predictive modeling
  » More generalizable models
  » Comparative effectiveness research, biomedical discovery, patient care, etc.
  » Especially for large research networks like pSCANNER
[Figure: patient record counts per site (Records = 10, Records = 760, Records = 380). With Records = 10 it is hard to predict outcomes; with Records = 1,500 it is feasible to predict outcomes.]
Protecting Privacy of Individuals
Challenge of cross-institutional predictive modeling
  » Improper disclosure of protected health information (PHI)
Privacy-preserving algorithms transfer models but not PHI
  » Many methods exist [Wu et al. 2012] [Li et al. 2015] [Wang et al. 2013] [Yan et al. 2013]
[Figure: sharing observation-level patient healthcare data, contrasted with privacy-preserving predictive modeling algorithms that share partially trained machine learning models.]
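The idea of transferring partially trained models instead of PHI can be illustrated with a minimal sketch. It assumes a plain logistic-regression model trained by stochastic gradient descent; the site names, toy records, learning rate, and function names are hypothetical and are not taken from the cited methods.

```python
import math

def predict(weights, features):
    """Logistic-regression probability for one record."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def local_update(weights, records, lr=0.5, epochs=50):
    """One site's partial training pass; the records stay inside the site."""
    w = list(weights)
    for _ in range(epochs):
        for features, label in records:
            err = predict(w, features) - label
            w = [wi - lr * err * xi for wi, xi in zip(w, features)]
    return w

# Hypothetical toy data: (features with a leading bias term, outcome label).
site_a = [([1.0, 0.2], 0), ([1.0, 0.9], 1)]
site_b = [([1.0, 0.1], 0), ([1.0, 0.8], 1), ([1.0, 0.7], 1)]

# Only the weight vector travels from site to site, never the records.
model = [0.0, 0.0]
for site_records in (site_a, site_b):
    model = local_update(model, site_records)
```

Each site refines the model it receives and passes on only the weights, so the second site benefits from the first site's data without seeing it.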
Risk for Existing Algorithms
Centralized architecture
  » Institutional policies
  » Single-point-of-failure/breach
  » Sites cannot join/leave at any time
  » Mutable data and records
  » Consensus/synchronization issues
[Figure: sites connected to a Central Server.]
The Blockchain Technology
Desirable features
Decentralized architecture
  » Peer-to-peer
  » Sites keep full control of resources
  » No risk of single-point-of-failure
Sites can join/leave freely
  » No central server overhead
  » No disruption of learning process
Immutable audit trail
  » Tampering is difficult
[Figure: Block B1 holds the Hash of Block B0, Nonce N1, and Transactions T11 and T12; Block B2 holds the Hash of Block B1, Nonce N2, and Transactions T21 and T22.]
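Why a hash-linked chain makes tampering difficult can be shown with a small sketch. It assumes SHA-256 hashing and a Bitcoin-style proof-of-work check (a hash that starts with a fixed number of zeros); the difficulty value, field names, and transaction labels are hypothetical and are not specified on the slide.

```python
import hashlib
import json

def block_hash(block):
    """SHA-256 over a canonical JSON encoding of the block."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def make_block(prev_hash, transactions, difficulty=2):
    """Search for a nonce whose block hash starts with `difficulty` zeros."""
    nonce = 0
    while True:
        block = {"prev_hash": prev_hash, "nonce": nonce,
                 "transactions": transactions}
        if block_hash(block).startswith("0" * difficulty):
            return block
        nonce += 1

def chain_is_valid(chain, difficulty=2):
    """Check every block's proof of work and its link to the previous block."""
    for i, block in enumerate(chain):
        if not block_hash(block).startswith("0" * difficulty):
            return False
        if i > 0 and block["prev_hash"] != block_hash(chain[i - 1]):
            return False
    return True

b0 = make_block("0" * 64, ["genesis"])
b1 = make_block(block_hash(b0), ["T11", "T12"])
b2 = make_block(block_hash(b1), ["T21", "T22"])
chain = [b0, b1, b2]
```

Changing a transaction in B1 changes B1's hash, which no longer matches the hash stored in B2, so `chain_is_valid` rejects the altered chain unless every later block is mined again.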
ModelChain
1. Privacy-preserving online machine learning on blockchains
2. Transaction metadata to transfer partial models and info
3. Proof-of-information algorithm to decide order of learning
[Figure: Sites 1 to 4 each hold a local model and exchange models through Block B1 (Hash of Block B0, Nonce N1) and Block B2 (Hash of Block B1, Nonce N2). One transaction carries Flag TRANSFER, the hash of the transferred model, and an Error value; another carries Flag UPDATE, the hash of the updated model, and an Error value.]
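The transaction metadata in point 2 and the ordering decision in point 3 can be sketched as follows. The slide does not define the proof-of-information rule, so this sketch assumes one plausible rule: the model goes next to the other site where it currently has the highest error. The class, function names, site names, and error values are all hypothetical.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelTransaction:
    """Metadata fields named on the slide: flag, model hash, and error."""
    sender: str
    receiver: str
    flag: str          # "TRANSFER" or "UPDATE"
    model_hash: str
    error: float

def hash_model(weights):
    """SHA-256 fingerprint of a weight vector (assumed encoding: repr)."""
    return hashlib.sha256(repr(weights).encode("utf-8")).hexdigest()

def next_site(errors_by_site, current_site):
    """Assumed ordering rule: pick the other site with the highest error."""
    candidates = {s: e for s, e in errors_by_site.items() if s != current_site}
    return max(candidates, key=candidates.get)

model = [0.12, -0.40]                                     # hypothetical weights
errors = {"site1": 0.10, "site2": 0.35, "site3": 0.20}    # hypothetical errors
receiver = next_site(errors, current_site="site1")
tx = ModelTransaction("site1", receiver, "TRANSFER",
                      hash_model(model), errors[receiver])
```

Under this assumed rule the model moves to the site whose data it explains worst, which is where another round of learning should add the most information.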
Summary
Blockchain infrastructure as in Bitcoin
ModelChain: distributed predictive models across institutes
  » Privacy protection by exchange of model, not patient data
  » No single-point-of-failure
  » Verified data provenance
  » Secure record of transactions
  » Improved interoperability
White paper and full slides: https://healthit.gov/blockchain
Acknowledgement
The authors would like to thank Xiaoqian Jiang, PhD, and Shuang Wang, PhD, for very helpful discussions.
All authors are funded by PCORI CDRN-1306-04819.
LO-M is funded by NIH U54HL108460, UL1TR001442, and VA I01HX000982.
[Photos: Tsung-Ting Kuo, PhD; Chun-Nan Hsu, PhD; Lucila Ohno-Machado, MD, PhD; Xiaoqian Jiang, PhD; Shuang Wang, PhD]
Thank you! Questions?