TY - JOUR
T1 - Application and utility of boosting machine learning model based on laboratory test in the differential diagnosis of non-COVID-19 pneumonia and COVID-19
AU - Baik, Seung Min
AU - Hong, Kyung Sook
AU - Park, Dong Jin
N1 - Publisher Copyright: © 2023 The Author(s)
PY - 2023/8
Y1 - 2023/8
AB - Background: Coronavirus disease 2019 (COVID-19) and non-COVID-19 pneumonia have similar clinical features but last for different periods and, consequently, require different treatment protocols. Therefore, they must be differentially diagnosed. This study uses artificial intelligence (AI) to classify the two forms of pneumonia, mainly using laboratory test data. Methods: Various AI models are applied, including boosting models known for deftly solving classification problems. In addition, important features that affect the classification prediction performance are identified using the feature importance technique and the SHapley Additive exPlanations method. Despite the data imbalance, the developed model exhibits robust performance. Results: eXtreme gradient boosting, category boosting, and light gradient boosted machine yield an area under the receiver operating characteristic curve of 0.99 or more, accuracy of 0.96–0.97, and F1-score of 0.96–0.97. In addition, D-dimer, eosinophil, glucose, aspartate aminotransferase, and basophil, which are rather nonspecific laboratory test results, are demonstrated to be important features in differentiating the two disease groups. Conclusions: The boosting model, which excels in producing classification models using categorical data, also excels in developing classification models using linear numerical data, such as laboratory test results. Finally, the proposed model can be applied in various fields to solve classification problems.
KW - Artificial intelligence
KW - Boosting model
KW - COVID-19
KW - Differential diagnosis
KW - Laboratory test
KW - Non-COVID-19 pneumonia
UR - http://www.scopus.com/inward/record.url?scp=85160095370&partnerID=8YFLogxK
U2 - 10.1016/j.clinbiochem.2023.05.003
DO - 10.1016/j.clinbiochem.2023.05.003
M3 - Article
C2 - 37211061
AN - SCOPUS:85160095370
SN - 0009-9120
VL - 118
JO - Clinical Biochemistry
JF - Clinical Biochemistry
M1 - 110584
ER -