Support Vector Machines (SVMs) are among the most powerful classification algorithms in machine learning and data mining. However, their high training complexity limits their use on large datasets. Pattern selection methods have been proposed to reduce this complexity by selecting a smaller subset of important patterns from the full set of training patterns. In this paper, we propose a new pattern selection method called Expected Margin-based Pattern Selection (EMPS), which selects patterns based on an estimated margin for SVM classifiers. Using the estimated margin, EMPS selects patterns that are likely to become support vectors located on the margin boundary and inside the margin region, and discards all other patterns, including noise support vectors. Experimental results on 15 benchmark datasets and one real-world semiconductor manufacturing dataset showed that EMPS exhibits excellent performance and stability.
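The selection rule described above can be illustrated with a minimal sketch. It is not the EMPS procedure itself, since the abstract does not state how the margin is estimated. The sketch assumes a rough linear decision function f(x) = w·x + b is already available (for example, from a cheap model fitted on a small subsample), and it keeps only patterns whose functional margin y·f(x) lies between a hypothetical `noise_cutoff` and 1. The names `functional_margin`, `select_patterns`, and `noise_cutoff`, as well as the toy data, are assumptions made for illustration.

```python
from typing import List, Sequence


def functional_margin(w: Sequence[float], b: float,
                      x: Sequence[float], y: int) -> float:
    """Return y * (w . x + b) for a label y in {-1, +1}."""
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b)


def select_patterns(X: Sequence[Sequence[float]], y: Sequence[int],
                    w: Sequence[float], b: float,
                    noise_cutoff: float = -1.0) -> List[int]:
    """Keep indices of patterns on or inside the estimated margin.

    A pattern is kept when noise_cutoff <= y * f(x) <= 1:
      - y * f(x) > 1: well outside the margin on the correct side,
        unlikely to become a support vector, so it is dropped.
      - y * f(x) < noise_cutoff: far on the wrong side, treated here
        as a noise pattern and dropped (hypothetical threshold).
    """
    keep = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        m = functional_margin(w, b, xi, yi)
        if noise_cutoff <= m <= 1.0:
            keep.append(i)
    return keep


# Toy data with an assumed estimated hyperplane x1 = 0 (w = [1, 0], b = 0).
w_est = [1.0, 0.0]
b_est = 0.0
X_train = [
    [3.0, 0.0],    # far on the correct side
    [1.0, 0.0],    # exactly on the margin boundary
    [0.5, 1.0],    # inside the margin
    [-0.5, 0.0],   # inside the margin
    [-4.0, 2.0],   # far on the correct side
    [-3.0, 0.0],   # far on the wrong side (labelled +1)
    [0.2, 0.0],    # slightly on the wrong side (labelled -1)
]
y_train = [1, 1, 1, -1, -1, 1, -1]

selected = select_patterns(X_train, y_train, w_est, b_est)
print(selected)
```

An SVM would then be trained only on the patterns indexed by `selected`, which is where the reduction in training complexity comes from. The quality of such a scheme depends heavily on how well the margin is estimated, which is the part this sketch takes as given.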
Bibliographical note
Funding Information:
This work was supported by funding from Chungnam National University.
© 2019 Elsevier Ltd
- Large data
- Pattern selection
- Semiconductor manufacturing
- Support vector machines
- Training complexity