TY - JOUR
T1 - Interpretable Deep-Learning Approaches for Osteoporosis Risk Screening and Individualized Feature Analysis Using Large Population-Based Data
T2 - Model Development and Performance Evaluation
AU - Suh, Bogyeong
AU - Yu, Heejin
AU - Kim, Hyeyeon
AU - Lee, Sanghwa
AU - Kong, Sunghye
AU - Kim, Jin Woo
AU - Choi, Jongeun
N1 - Publisher Copyright:
©Bogyeong Suh, Heejin Yu, Hyeyeon Kim, Sanghwa Lee, Sunghye Kong, Jin-Woo Kim, Jongeun Choi.
PY - 2023
Y1 - 2023
N2 - Background: Osteoporosis is one of the diseases that requires early screening and detection for its management. Common clinical tools and machine-learning (ML) models for screening osteoporosis have been developed, but they show limitations such as low accuracy. Moreover, these methods are confined to limited risk factors and lack individualized explanation. Objective: The aim of this study was to develop an interpretable deep-learning (DL) model for osteoporosis risk screening with clinical features. Clinical interpretation with individual explanations of feature contributions is provided using an explainable artificial intelligence (XAI) technique. Methods: We used two separate data sets: the National Health and Nutrition Examination Survey data sets from the United States (NHANES) and South Korea (KNHANES) with 8274 and 8680 respondents, respectively. The study population was classified according to the T-score of bone mineral density at the femoral neck or total femur. A DL model for osteoporosis diagnosis was trained on the data sets and significant risk factors were investigated with local interpretable model-agnostic explanations (LIME). The performance of the DL model was compared with that of ML models and conventional clinical tools. Additionally, contribution ranking of risk factors and individualized explanation of feature contribution were examined. Results: Our DL model showed area under the curve (AUC) values of 0.851 (95% CI 0.844-0.858) and 0.922 (95% CI 0.916-0.928) for the femoral neck and total femur bone mineral density, respectively, using the NHANES data set. The corresponding AUC values for the KNHANES data set were 0.827 (95% CI 0.821-0.833) and 0.912 (95% CI 0.898-0.927), respectively. Through the LIME method, significant features were induced, and each feature’s integrated contribution and interpretation for individual risk were determined. Conclusions: The developed DL model significantly outperforms conventional ML models and clinical tools. Our XAI model produces high-ranked features along with the integrated contributions of each feature, which facilitates the interpretation of individual risk. In summary, our interpretable model for osteoporosis risk screening outperformed state-of-the-art methods.
AB - Background: Osteoporosis is one of the diseases that requires early screening and detection for its management. Common clinical tools and machine-learning (ML) models for screening osteoporosis have been developed, but they show limitations such as low accuracy. Moreover, these methods are confined to limited risk factors and lack individualized explanation. Objective: The aim of this study was to develop an interpretable deep-learning (DL) model for osteoporosis risk screening with clinical features. Clinical interpretation with individual explanations of feature contributions is provided using an explainable artificial intelligence (XAI) technique. Methods: We used two separate data sets: the National Health and Nutrition Examination Survey data sets from the United States (NHANES) and South Korea (KNHANES) with 8274 and 8680 respondents, respectively. The study population was classified according to the T-score of bone mineral density at the femoral neck or total femur. A DL model for osteoporosis diagnosis was trained on the data sets and significant risk factors were investigated with local interpretable model-agnostic explanations (LIME). The performance of the DL model was compared with that of ML models and conventional clinical tools. Additionally, contribution ranking of risk factors and individualized explanation of feature contribution were examined. Results: Our DL model showed area under the curve (AUC) values of 0.851 (95% CI 0.844-0.858) and 0.922 (95% CI 0.916-0.928) for the femoral neck and total femur bone mineral density, respectively, using the NHANES data set. The corresponding AUC values for the KNHANES data set were 0.827 (95% CI 0.821-0.833) and 0.912 (95% CI 0.898-0.927), respectively. Through the LIME method, significant features were induced, and each feature’s integrated contribution and interpretation for individual risk were determined. Conclusions: The developed DL model significantly outperforms conventional ML models and clinical tools. Our XAI model produces high-ranked features along with the integrated contributions of each feature, which facilitates the interpretation of individual risk. In summary, our interpretable model for osteoporosis risk screening outperformed state-of-the-art methods.
KW - artificial intelligence
KW - deep learning
KW - machine learning
KW - osteoporosis
KW - risk factors
KW - screening
UR - http://www.scopus.com/inward/record.url?scp=85146364834&partnerID=8YFLogxK
U2 - 10.2196/40179
DO - 10.2196/40179
M3 - Article
C2 - 36482780
AN - SCOPUS:85146364834
SN - 1438-8871
VL - 25
JO - Journal of Medical Internet Research
JF - Journal of Medical Internet Research
M1 - e40179
ER -