An Interactive Online App for Predicting Diabetes via Machine Learning from Environment-Polluting Chemical Exposure Data

Rosy Oh, Hong Kyu Lee, Youngmi Kim Pak, Man Suk Oh

Research output: Contribution to journal › Article › peer-review

3 Scopus citations


The early prediction and identification of risk factors for diabetes may prevent or delay diabetes progression. In this study, we developed an interactive online application that provides the predictive probabilities of prediabetes and diabetes in 4 years based on a Bayesian network (BN) classifier, an interpretable machine learning technique. The BN was trained using a dataset from the Ansung cohort of the Korean Genome and Epidemiological Study (KoGES) in 2008, with a follow-up in 2012. The dataset contained not only traditional risk factors for future diabetes (current diabetes status, sex, age, etc.) but also serum biomarkers that quantified the individual level of exposure to environment-polluting chemicals (EPC). Based on accuracy and the area under the curve (AUC), a tree-augmented BN with 11 variables derived from feature selection was chosen as our prediction model. The online application that implements our BN prediction system provides a tool that performs customized diabetes prediction and allows users to simulate the effects of controlling risk factors on the future development of diabetes. The prediction results demonstrated that the EPC biomarkers had interactive effects on diabetes progression and that their use contributed to a substantial improvement in prediction performance.

Original language: English
Article number: 5800
Journal: International Journal of Environmental Research and Public Health
Issue number: 10
State: Published - 2 May 2022

Bibliographical note

Funding Information:
This research was supported by the Basic Science Research Program (2020R1I1A1A01067376 to R.O., 2020R1A2C1008699 and 2018R1A6A1A03025124 to Y.K.P., 2019R1A6A1A11051177 and 2019R1A2C1003086 to M.O.) through the National Research Foundation of Korea (NRF) funded by the Korean government (MIST). The funding source had no role in the collection of data or in the decision to submit the manuscript for publication.

Publisher Copyright:
© 2022 by the authors. Licensee MDPI, Basel, Switzerland.


  • Bayesian network
  • diabetes mellitus
  • environmental pollutants
  • glucose intolerance
  • machine learning

