Abstract
Rice (Oryza sativa L.) is a widely consumed food source, and its geographical origin has long been a subject of discussion. In our study, we collected 44 and 20 rice samples from different regions of the Republic of Korea and China, respectively; 35 of these samples were white rice and 29 were brown rice. The samples were analyzed using nuclear magnetic resonance (NMR) spectroscopy, and the resulting data were processed with various normalization and scaling methods. Leave-one-out cross-validation (LOOCV) and external validation were then employed to evaluate various machine learning algorithms. Total area normalization, combined with unit variance scaling for white rice and Pareto scaling for brown rice, was identified as the best pre-processing method in orthogonal partial least squares–discriminant analysis. Among the tested algorithms, support vector machine (SVM) was the best for predicting the geographical origin of white and brown rice, with accuracies of 0.99 and 0.96, respectively. In external validation, the SVM-based prediction model for white and brown rice performed well, with an accuracy of 1.0. These results suggest that machine learning techniques based on NMR data can potentially be applied to differentiate and predict the geographical origins of white and brown rice.
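The pre-processing steps named in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so this assumes the conventional definitions: total area normalization divides each spectrum by its summed intensity, unit variance scaling divides each mean-centered variable by its sample standard deviation, and Pareto scaling divides by the square root of that standard deviation. The function names and toy intensities are hypothetical and are not taken from the study.

```python
import math


def total_area_normalize(spectrum):
    """Divide every bin by the summed intensity so the spectrum sums to 1."""
    total = sum(spectrum)
    if total == 0:
        raise ValueError("spectrum has zero total area")
    return [x / total for x in spectrum]


def scale_columns(rows, method="uv"):
    """Mean-center each variable (column), then scale it.

    method="uv"     -> divide by the sample standard deviation (assumed n - 1)
    method="pareto" -> divide by the square root of the standard deviation
    """
    if method not in ("uv", "pareto"):
        raise ValueError("method must be 'uv' or 'pareto'")
    n = len(rows)
    scaled_cols = []
    for col in zip(*rows):
        mean = sum(col) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / (n - 1))
        if sd == 0:
            # A constant variable carries no information; leave it at zero.
            scaled_cols.append([0.0] * n)
            continue
        divisor = sd if method == "uv" else math.sqrt(sd)
        scaled_cols.append([(x - mean) / divisor for x in col])
    # Transpose back to one row per sample.
    return [list(row) for row in zip(*scaled_cols)]


# Hypothetical binned intensities: three samples, three spectral bins.
toy_spectra = [
    [2.0, 6.0, 2.0],
    [1.0, 1.0, 2.0],
    [3.0, 1.0, 4.0],
]

normalized = [total_area_normalize(s) for s in toy_spectra]
uv_scaled = scale_columns(normalized, method="uv")
pareto_scaled = scale_columns(normalized, method="pareto")
```

Unit variance scaling gives every variable equal weight, whereas Pareto scaling shrinks large peaks less aggressively, so high-intensity signals keep more influence on the model.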
Original language | English |
---|---|
Article number | 1012 |
Journal | Metabolites |
Volume | 12 |
Issue number | 11 |
State | Published - Nov 2022 |
Bibliographical note
Funding Information: This work was supported by the SRC project (grant number 2022R1A5A6000760) and the Chung-Ang University Young Scientist Scholarship (CAYSS) in 2021.
Publisher Copyright:
© 2022 by the authors.
Keywords
- NMR spectroscopy
- geographical origin
- machine learning
- prediction model
- rice