A Projection Pursuit Forest Algorithm for Supervised Classification

Natalia da Silva, Dianne Cook, Eun Kyung Lee

Research output: Contribution to journal › Article › peer-review

7 Scopus citations

Abstract

This article presents a new ensemble learning method for classification problems called projection pursuit random forest (PPF). PPF uses the PPtree algorithm, in which trees are constructed by splitting on linear combinations of randomly chosen variables. Projection pursuit is used to choose a projection of the variables that best separates the classes. Using linear combinations of variables to separate classes takes the correlation between variables into account, which allows PPF to outperform a traditional random forest when separations between groups occur in combinations of variables. The method presented here can be used in multi-class problems and is implemented in an R package, PPforest, which is available on CRAN. Supplementary files for this article are available online.
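The idea of splitting on a linear combination of variables can be illustrated with a small sketch. It is not the PPforest or PPtree implementation: it assumes two classes, two variables, and Fisher's LDA direction as a stand-in for the projection pursuit index, and it omits the random selection of variables at each node. The function names `lda_direction` and `best_split_accuracy` and the toy data are hypothetical, chosen only for this example.

```python
def lda_direction(class_a, class_b):
    """Fisher's LDA direction for two classes of 2-D points: w = Sw^-1 (mean_a - mean_b)."""
    def mean(points):
        n = len(points)
        return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

    mean_a, mean_b = mean(class_a), mean(class_b)
    # Pooled within-class scatter matrix [[s11, s12], [s12, s22]].
    s11 = s12 = s22 = 0.0
    for points, m in ((class_a, mean_a), (class_b, mean_b)):
        for x1, x2 in points:
            d1, d2 = x1 - m[0], x2 - m[1]
            s11 += d1 * d1
            s12 += d1 * d2
            s22 += d2 * d2
    det = s11 * s22 - s12 * s12
    diff1, diff2 = mean_a[0] - mean_b[0], mean_a[1] - mean_b[1]
    # Closed-form 2x2 inverse applied to the difference of class means.
    return ((s22 * diff1 - s12 * diff2) / det, (s11 * diff2 - s12 * diff1) / det)


def best_split_accuracy(values, labels):
    """Best accuracy achievable by a single threshold on one-dimensional values."""
    cuts = sorted(set(values))
    best = 0.0
    for lo, hi in zip(cuts, cuts[1:]):
        threshold = (lo + hi) / 2
        hits = sum((v > threshold) == (y == 1) for v, y in zip(values, labels))
        # Allow either side of the threshold to be assigned to class 1.
        best = max(best, max(hits, len(values) - hits) / len(values))
    return best


# Toy data (assumed for illustration): the classes differ only in x2 - x1,
# so the two variables are strongly correlated within each class.
noise = [0.2 if t % 2 == 0 else -0.2 for t in range(10)]
class_a = [(float(t), t + 1 + noise[t]) for t in range(10)]
class_b = [(float(t), t - 1 + noise[t]) for t in range(10)]
points = class_a + class_b
labels = [1] * len(class_a) + [0] * len(class_b)

# Best split on a single variable, as an axis-aligned tree node would use.
axis_acc = max(best_split_accuracy([p[i] for p in points], labels) for i in range(2))

# Best split on the projected data, as a projection-based node would use.
w = lda_direction(class_a, class_b)
proj_acc = best_split_accuracy([w[0] * p[0] + w[1] * p[1] for p in points], labels)

print("axis-aligned:", axis_acc, "projected:", proj_acc)
```

On this toy data no single-variable threshold separates the two classes, while one threshold on the projected values does, which is the situation the abstract describes as separation occurring in combinations of variables.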

Original language: English
Pages (from-to): 1168-1180
Number of pages: 13
Journal: Journal of Computational and Graphical Statistics
Volume: 30
Issue number: 4
State: Published - 2021

Bibliographical note

Publisher Copyright:
© 2021 American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America.

Keywords

  • Data mining
  • Ensemble model
  • Exploratory data analysis
  • High-dimensional data
  • Statistical computing
