TY - JOUR
T1 - Partial least squares fusing unsupervised learning
AU - Yoo, Jae Keun
N1 - Funding Information:
The authors are also grateful to the associate editor, the three referees and the new referee for many insightful and helpful comments. For Jae Keun Yoo, this work was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Korean Ministry of Education (NRF-2017R1A2B1004909/2009-0093827).
Publisher Copyright:
© 2018 Elsevier B.V.
PY - 2018/4/15
Y1 - 2018/4/15
AB - In this paper, a partial least squares method that fuses unsupervised learning, called fused clustered least squares (FCLS), is proposed. As the unsupervised method, the K-means clustering algorithm is adopted, and it clusters either the original predictors or their principal components. This unsupervised learning step serves to discover unknown structure in the predictors, and this information is utilized in their further reduction. Within each cluster, the covariance of the response and the predictors is computed and successively projected onto the covariance matrix of the predictors. This is called clustered least squares. Then all clustered least squares obtained from the various numbers of clusters are fused. FCLS thus combines supervised and unsupervised statistical methods, and it overcomes the deficits that ordinary least squares and its popular variant, partial least squares, have in practice. Numerical studies support the theory, and an application to near-infrared spectroscopy data confirms the potential advantage of FCLS in practice.
KW - Cluster analysis
KW - Fused approach
KW - Large p small n
KW - Multivariate analysis
KW - Partial least squares
KW - Unsupervised learning
UR - http://www.scopus.com/inward/record.url?scp=85042630848&partnerID=8YFLogxK
U2 - 10.1016/j.chemolab.2017.12.016
DO - 10.1016/j.chemolab.2017.12.016
M3 - Article
AN - SCOPUS:85042630848
SN - 0169-7439
VL - 175
SP - 82
EP - 86
JO - Chemometrics and Intelligent Laboratory Systems
JF - Chemometrics and Intelligent Laboratory Systems
ER -