Published Versions 1 Vol 3 (3) : 329-339 2021
Deep Learning with Heterogeneous Graph Embeddings for Mortality Prediction from Electronic Health Records
: 2020 - 12 - 24
: 2021 - 04 - 02
: 2021 - 04 - 30
Abstract & Keywords
Abstract: Computational prediction of in-hospital mortality in the setting of an intensive care unit can help clinical practitioners guide care and make early decisions about interventions. As clinical data are complex and varied in their structure and components, continued innovation in modelling strategies is required to identify architectures that can best model outcomes. In this work, we trained a Heterogeneous Graph Model (HGM) on electronic health record (EHR) data and used the resulting embedding vector as additional input to a Convolutional Neural Network (CNN) model for predicting in-hospital mortality. We show that the additional information provided by including time as a vector in the embedding captured the relationships between medical concepts, lab tests, and diagnoses, which enhanced predictive performance. We found that adding HGM to a CNN model increased mortality prediction accuracy by up to 4%. This framework serves as a foundation for future experiments involving different EHR data types on important healthcare prediction tasks.
Keywords: Electronic health records (EHRs); Convolutional Neural Networks (CNNs); Heterogeneous Graph Model (HGM); Machine learning; Deep learning
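The fusion idea in the abstract, where a graph-derived embedding is supplied to a CNN as extra input, can be illustrated with a small toy sketch. This is not the authors' implementation: the function names, the single-channel time series, the kernels, and all numeric values are hypothetical, and the graph embedding is assumed to have been computed beforehand by some separate model. The sketch only shows one common way to combine the two sources, namely concatenating pooled convolutional features with the embedding before a logistic output.

```python
import math


def conv1d_max(series, kernel):
    """Valid 1-D convolution (cross-correlation), ReLU, then global max pooling."""
    k = len(kernel)
    outputs = [
        sum(series[i + j] * kernel[j] for j in range(k))
        for i in range(len(series) - k + 1)
    ]
    return max(max(0.0, o) for o in outputs)


def predict_mortality(series, kernels, graph_embedding, weights, bias):
    """Concatenate pooled CNN features with a precomputed graph embedding
    and map the fused vector to a probability with a logistic unit."""
    cnn_features = [conv1d_max(series, kernel) for kernel in kernels]
    fused = cnn_features + list(graph_embedding)  # simple concatenation
    if len(weights) != len(fused):
        raise ValueError("weights must match the fused feature length")
    logit = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical toy inputs: one normalized vital-sign series and a
# three-dimensional patient embedding assumed to come from a graph model.
vital_series = [0.2, 0.4, 0.9, 0.7, 0.3]
kernels = [[1.0, -1.0], [0.5, 0.5]]
graph_embedding = [0.1, -0.3, 0.8]
weights = [0.6, -0.2, 0.4, 0.1, 0.5]

risk = predict_mortality(vital_series, kernels, graph_embedding, weights, bias=-0.5)
print(f"predicted in-hospital mortality risk: {risk:.3f}")
```

In a real model the kernels and output weights would be learned jointly, and the embedding would typically be concatenated at a fully connected layer; the sketch fixes them by hand only to keep the example self-contained.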
Article and author information
Cite As
Citation: Wanyan T.Y., et al.: Deep learning with heterogeneous graph embeddings for mortality prediction from electronic health records. Data Intelligence 3(3), 329-339 (2021). doi: 10.1162/dint_a_00097
Tingyi Wanyan
T.Y. Wanyan, A. Azad, Y. Ding, and B.S. Glicksberg conceived of the project. T.Y. Wanyan and B.S. Glicksberg collected the data. T.Y. Wanyan and H. Honarvar performed the analyses and made the figures. T.Y. Wanyan, H. Honarvar, and B.S. Glicksberg wrote the manuscript. T.Y. Wanyan, H. Honarvar, A. Azad, Y. Ding, and B.S. Glicksberg edited the manuscript and provided revisions.
Tingyi Wanyan is a PhD student in Artificial Intelligence at Indiana University, co-mentored by Dr. Ariful Azad from Indiana University and Dr. Ying Ding from the University of Texas at Austin. He is a visiting PhD student in the Glicksberg Lab at the Icahn School of Medicine at Mount Sinai. He is interested in research involving AI implemented in clinical care, especially regarding integrating various modalities of clinical data via representation learning through Heterogeneous Knowledge Graph models. He specializes in integrating various data types such as electronic health records, medical images, and clinical notes.
0000-0002-5011-3973
Hossein Honarvar
T.Y. Wanyan and H. Honarvar performed the analyses and made the figures. T.Y. Wanyan, H. Honarvar, and B.S. Glicksberg wrote the manuscript. T.Y. Wanyan, H. Honarvar, A. Azad, Y. Ding, and B.S. Glicksberg edited the manuscript and provided revisions.
Hossein Honarvar is a postdoctoral fellow at the Hasso Plattner Institute for Digital Health at Mount Sinai in New York City. Previously, he was a postdoctoral researcher in Computational Physics at JILA Research Institute in Boulder, and he received his PhD from the University of Colorado Boulder in 2018. His current research focuses on developing interpretable, fair, and multimodal deep learning models for electrocardiogram data and creating novel clinical applications.
0000-0002-5011-3973
Ariful Azad
T.Y. Wanyan, A. Azad, Y. Ding, and B.S. Glicksberg conceived of the project. T.Y. Wanyan, H. Honarvar, A. Azad, Y. Ding, and B.S. Glicksberg edited the manuscript and provided revisions. A. Azad, Y. Ding, and B.S. Glicksberg jointly supervised the work.
Ariful Azad is an Assistant Professor of Intelligent Systems Engineering at the Luddy School of Informatics, Computing, and Engineering at Indiana University. Dr. Azad obtained his PhD from Purdue University and his B.S. from Bangladesh University of Engineering and Technology, Bangladesh. His research interests are in graph machine learning, sparse matrix algorithms, high-performance computing, and bioinformatics.
0000-0003-1332-8630
Ying Ding
T.Y. Wanyan, A. Azad, Y. Ding, and B.S. Glicksberg conceived of the project. T.Y. Wanyan, H. Honarvar, A. Azad, Y. Ding, and B.S. Glicksberg edited the manuscript and provided revisions. A. Azad, Y. Ding, and B.S. Glicksberg jointly supervised the work.
Ying Ding is Bill & Lewis Suit Professor at the School of Information, University of Texas at Austin. Before that, she was a professor and director of graduate studies for the data science program at the School of Informatics, Computing, and Engineering at Indiana University. She has led the effort to develop the online data science graduate program for Indiana University. She also worked as a senior researcher at the Department of Computer Science, University of Innsbruck (Austria) and the Free University of Amsterdam (The Netherlands). She has been involved in various NIH, NSF and European Union funded projects. She has published more than 240 papers in journals, conferences, and workshops, and served as a program committee member for over 200 international conferences. She is the co-editor of the book series Semantic Web Synthesis published by Morgan & Claypool, the co-editor-in-chief of Data Intelligence published by MIT Press and the Chinese Academy of Sciences, and serves as an editorial board member for several top journals in Information Science and Semantic Web. She is the co-founder of Data2Discovery, a company advancing cutting-edge AI technologies in drug discovery and healthcare. Her current research interests include data-driven science of science, AI in healthcare, Semantic Web, knowledge graphs, data science, scholarly communication, and the application of Web technologies.
0000-0003-2567-2009
Benjamin S. Glicksberg
T.Y. Wanyan, A. Azad, Y. Ding, and B.S. Glicksberg conceived of the project. T.Y. Wanyan and B.S. Glicksberg collected the data. T.Y. Wanyan, H. Honarvar, and B.S. Glicksberg wrote the manuscript. T.Y. Wanyan, H. Honarvar, A. Azad, Y. Ding, and B.S. Glicksberg edited the manuscript and provided revisions. A. Azad, Y. Ding, and B.S. Glicksberg jointly supervised the work.
benjamin.glicksberg@mssm.edu
Benjamin Glicksberg is an Assistant Professor of Genetics and Genomic Sciences and a member of the Hasso Plattner Institute for Digital Health at the Icahn School of Medicine at Mount Sinai, New York. Dr. Glicksberg has extensive experience in clinical informatics and work involving electronic health record data. He uses machine learning to couple multi-omic patient health data to forward personalized medicine. He completed his PhD in Neuroscience at the Icahn School of Medicine at Mount Sinai in 2017 and post-doctoral work at the University of California, San Francisco in 2019.
0000-0003-4515-8090
Publication records
Published: Sept. 15, 2021 (Version 1)
Data Intelligence