TY - JOUR
T1 - Identification of important features in overweight and obesity among Korean adolescents using machine learning
AU - Lee, Serim
AU - Chun, Jong Serl
N1 - Publisher Copyright: © 2024 Elsevier Ltd
PY - 2024/6
Y1 - 2024/6
N2 - Overweight and obesity in adolescents have been reported as among the most serious threats worldwide, including in South Korea. This study investigates the complex factors contributing to overweight and obesity in Korean adolescents using various machine learning methods. The dataset comprises 43,268 records from the 16th Korean Youth Risk Behavior Web-based Survey and covers 71 factors, including sociodemographic characteristics, dietary habits, health, behavior problems, family, and peer and school-related factors. The analysis used nine algorithms: Logistic Regression, Ridge, LASSO, Elastic Net, Decision Tree, Bagging, Random Forest, AdaBoost, and XGBoost. The nine machine learning models achieved accuracies ranging from 0.7662 to 0.8403. Across the domains and sub-domains of factors, sociodemographic characteristics, dietary habits, physical health, psychological health, behavioral problems, family factors, and peer and school factors were all identified as important. The machine learning techniques also highlighted newly emerged features that warrant attention, including oral health, smartphone addiction, smoking, sexual behavior, school violence, and nationality of parents. The findings emphasize the critical need for collective and customized prevention programs that consider multifaceted features to prevent overweight and obesity among Korean adolescents.
KW - Feature importance
KW - Korean adolescents
KW - Machine learning
KW - Obesity
KW - Overweight
UR - http://www.scopus.com/inward/record.url?scp=85192839142&partnerID=8YFLogxK
U2 - 10.1016/j.childyouth.2024.107644
DO - 10.1016/j.childyouth.2024.107644
M3 - Article
AN - SCOPUS:85192839142
SN - 0190-7409
VL - 161
JO - Children and Youth Services Review
JF - Children and Youth Services Review
M1 - 107644
ER -