Machine learning-powered prediction of recurrence in patients with non-small cell lung cancer using quantitative clinical and radiomic biomarkers

Sehwa Moon, Dahim Choi, Ji Yeon Lee, Myoung Hee Kim, Helen Hong, Bong Seog Kim, Jang Hwan Choi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations


Lung cancer is a fatal disease, and non-small cell lung cancer (NSCLC) is its most prevalent type. One of the main goals of NSCLC research is to identify patients at high risk of recurrence after surgical resection so that specific and suitable treatments can be found for them. The classification of cancer by anatomic disease extent, that is, by tumor size (T stage) and nodal involvement (N stage), is the most widely accepted determinant of appropriate treatment and prognosis among practicing clinicians. However, TN stage-based risk prediction can be inaccurate, as there is moderate observer variability when reporting the size of the lesion. Here, we propose a lung cancer recurrence prediction model that uses principal component analysis (PCA) and machine learning (ML) techniques and considers radiomic features and clinical data, including the TN stage. After being filtered by a statistical model, the principal components, which include T- and N-stage data and the handcrafted radiomic features from CT images, were applied to various ML models (i.e., random forests, support vector machines, naive Bayesian classifiers, and boosting). We studied not only recurrence overall but also recurrence within two years of surgical resection, since more than 80% of recurrences occur within this time frame. In both cases, the experimental results showed that combining radiomic features and clinical data improves the prediction of lung cancer recurrence over models that use only TN stage data, in terms of mean 5-fold cross-validation accuracy, the receiver operating characteristic (ROC) curve, the area under the ROC curve (AUC), and Kaplan-Meier curves. Finally, this model has been embedded in a website and is being prepared for medical device registration and approval by the Ministry of Food and Drug Safety (MFDS) in South Korea.
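The comparison described in the abstract (TN stage alone versus TN stage plus radiomic features, passed through PCA and scored with 5-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes scikit-learn, uses purely synthetic data in place of patient records, shows only a random forest, picks the number of principal components arbitrarily, and omits the statistical filtering step and the Kaplan-Meier analysis. All variable names and the `evaluate` helper are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_patients = 200

# Synthetic stand-ins for the inputs (assumption: NOT real patient data).
t_stage = rng.integers(1, 5, size=n_patients)   # hypothetical T stage, 1-4
n_stage = rng.integers(0, 4, size=n_patients)   # hypothetical N stage, 0-3
radiomic = rng.normal(size=(n_patients, 30))    # hypothetical handcrafted CT features

# Synthetic recurrence label that depends on stage and on one radiomic feature.
signal = 0.6 * (t_stage - 2.5) + 0.6 * (n_stage - 1.5) + 1.5 * radiomic[:, 0]
recurrence = (signal + rng.normal(size=n_patients) > 0).astype(int)

X_tn = np.column_stack([t_stage, n_stage])       # TN stage only
X_combined = np.column_stack([X_tn, radiomic])   # TN stage + radiomic features


def evaluate(X, y, n_components):
    """Return mean 5-fold CV accuracy and mean ROC AUC for a PCA + random forest pipeline."""
    model = make_pipeline(
        StandardScaler(),                 # PCA is scale-sensitive, so standardize first
        PCA(n_components=n_components),   # component count is an arbitrary choice here
        RandomForestClassifier(n_estimators=200, random_state=0),
    )
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    scores = cross_validate(model, X, y, cv=cv, scoring=("accuracy", "roc_auc"))
    return scores["test_accuracy"].mean(), scores["test_roc_auc"].mean()


acc_tn, auc_tn = evaluate(X_tn, recurrence, n_components=2)
acc_combined, auc_combined = evaluate(X_combined, recurrence, n_components=10)

print(f"TN stage only: accuracy={acc_tn:.3f}, AUC={auc_tn:.3f}")
print(f"TN + radiomic: accuracy={acc_combined:.3f}, AUC={auc_combined:.3f}")
```

Because the data are random, the printed scores say nothing about NSCLC. The sketch only shows how the two feature sets can be compared under the same cross-validation splits. Fitting the scaler and PCA inside the pipeline keeps them from seeing the held-out fold.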

Original language: English
Title of host publication: Medical Imaging 2020
Subtitle of host publication: Computer-Aided Diagnosis
Editors: Horst K. Hahn, Maciej A. Mazurowski
ISBN (Electronic): 9781510633957
State: Published - 2020
Event: Medical Imaging 2020: Computer-Aided Diagnosis - Houston, United States
Duration: 16 Feb 2020 to 19 Feb 2020

Publication series

Name: Progress in Biomedical Optics and Imaging - Proceedings of SPIE
ISSN (Print): 1605-7422


Conference: Medical Imaging 2020: Computer-Aided Diagnosis
Country/Territory: United States

Bibliographical note

Publisher Copyright:
© 2020 SPIE.


  • hand-crafted features
  • machine learning
  • non-small cell lung cancer
  • radiomics
  • recurrence risk

