TY - JOUR
T1 - At-admission prediction of mortality and pulmonary embolism in an international cohort of hospitalised patients with COVID-19 using statistical and machine learning methods
AU - Mazankowski Heart Institute
AU - ISARIC Characterisation Group
AU - Mesinovic, Munib
AU - Wong, Xin Ci
AU - Rajahram, Giri Shan
AU - Citarella, Barbara Wanjiru
AU - Peariasamy, Kalaiarasu M.
AU - van Someren Greve, Frank
AU - Olliaro, Piero
AU - Merson, Laura
AU - Clifton, Lei
AU - Kartsonaki, Christiana
AU - Abdukahil, Sheryl Ann
AU - Abdulkadir, Nurul Najmee
AU - Abe, Ryuzo
AU - Abel, Laurent
AU - Abrous, Amal
AU - Absil, Lara
AU - Acker, Andrew
AU - Adachi, Shingo
AU - Adam, Elisabeth
AU - Adriano, Enrico
AU - Adrião, Diana
AU - Ageel, Saleh Al
AU - Ahmed, Shakeel
AU - Aiello, Marina
AU - Ainscough, Kate
AU - Airlangga, Eka
AU - Aisa, Tharwat
AU - Hssain, Ali Ait
AU - Tamlihat, Younes Ait
AU - Akimoto, Takako
AU - Akmal, Ernita
AU - Qasim, Eman Al
AU - Alalqam, Razi
AU - Alberti, Angela
AU - Al-dabbous, Tala
AU - Alegesan, Senthilkumar
AU - Alegre, Cynthia
AU - Alessi, Marta
AU - Alex, Beatrice
AU - Alexandre, Kévin
AU - Al-Fares, Abdulrahman
AU - Alfoudri, Huda
AU - Ali, Adam
AU - Ali, Imran
AU - Alidjnou, Kazali Enagnon
AU - Aliudin, Jeffrey
AU - Alkhafajee, Qabas
AU - Allavena, Clotilde
AU - Allou, Nathalie
AU - Luque, Nestor
N1 - Publisher Copyright:
© The Author(s) 2024.
PY - 2024/12
Y1 - 2024/12
AB - By September 2022, more than 600 million cases of SARS-CoV-2 infection had been reported globally, resulting in over 6.5 million deaths. However, COVID-19 mortality risk estimators are often developed with small, unrepresentative samples and with methodological limitations. Pulmonary embolism (PE) is one of the most severe preventable complications of COVID-19, so predictive tools for PE in COVID-19 patients are highly important. Early recognition can help provide life-saving targeted anticoagulation therapy right at admission. Using a dataset of more than 800,000 COVID-19 patients from an international cohort, we propose a cost-sensitive gradient-boosted machine learning model that predicts the occurrence of PE and death at admission. Logistic regression, Cox proportional hazards models, and Shapley values were used to identify key predictors for PE and death. On a highly diverse, held-out test set, our prediction model had test AUROCs of 75.9% and 74.2% and sensitivities of 67.5% and 72.7% for PE and all-cause mortality, respectively. The PE prediction model was also evaluated separately on patients in the UK and Spain, with test results of 74.5% AUROC and 63.5% sensitivity, and 78.9% AUROC and 95.7% sensitivity, respectively. Age, sex, region of admission, comorbidities (chronic cardiac and pulmonary disease, dementia, diabetes, hypertension, cancer, obesity, smoking), and symptoms (any, confusion, chest pain, fatigue, headache, fever, muscle or joint pain, shortness of breath) were the most important clinical predictors at admission. Age, overall presence of symptoms, shortness of breath, and hypertension were found to be key predictors for PE using our extreme gradient-boosted model. This analysis, based on the largest global dataset to date for this set of problems, can inform hospital prioritisation policy and guide long-term clinical research and decision-making for COVID-19 patients globally.
Our machine learning model, developed from an international cohort, can serve to better regulate hospital risk prioritisation of at-risk patients.
UR - http://www.scopus.com/inward/record.url?scp=85199015852&partnerID=8YFLogxK
U2 - 10.1038/s41598-024-63212-7
DO - 10.1038/s41598-024-63212-7
M3 - Article
C2 - 39013928
AN - SCOPUS:85199015852
SN - 2045-2322
VL - 14
JO - Scientific Reports
JF - Scientific Reports
IS - 1
M1 - 16387
ER -