Abstract
Microarray experiments have raised challenging questions, such as how to accurately identify a set of marker genes responsible for various cancers. In statistics, this task can be posed as a feature selection problem. Because a support vector machine can handle a vast number of features, it has gained widespread use in microarray data analysis. We propose a stepwise feature selection method using the generalized logistic loss, a smooth approximation of the usual hinge loss. We compare the proposed method with the support vector machine with recursive feature elimination on both real and simulated datasets. The results illustrate that the proposed method can improve the quality of feature selection through standardization while retaining predictive performance similar to that of recursive feature elimination.
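
To make the phrase "smooth approximation of the usual hinge loss" concrete, the following minimal sketch compares the hinge loss with one commonly used form of the generalized logistic loss, (1/γ) log(1 + exp(−γ(m − 1))), where m is the classification margin. The abstract does not state the formula, so this form, the sharpness parameter `gamma`, and the function names `hinge_loss` and `generalized_logistic_loss` are assumptions made for illustration, not necessarily the paper's exact definition. Under this form, the largest gap between the two losses occurs at m = 1 and shrinks as `gamma` grows.

```python
import numpy as np


def hinge_loss(margin):
    """Standard SVM hinge loss, max(0, 1 - margin)."""
    return np.maximum(0.0, 1.0 - margin)


def generalized_logistic_loss(margin, gamma=5.0):
    """Assumed form: (1/gamma) * log(1 + exp(-gamma * (margin - 1))).

    np.logaddexp(0, x) evaluates log(1 + exp(x)) without overflow.
    """
    return np.logaddexp(0.0, -gamma * (margin - 1.0)) / gamma


# Margins on a grid that includes margin = 1, the kink of the hinge loss.
margins = np.linspace(-2.0, 3.0, 11)

for gamma in (1.0, 5.0, 50.0):
    gap = np.max(np.abs(generalized_logistic_loss(margins, gamma) - hinge_loss(margins)))
    # Under the assumed form, the largest gap equals log(2) / gamma.
    print(f"gamma={gamma:5.1f}  max gap to hinge loss = {gap:.4f}")
```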
Original language | English |
---|---|
Pages (from-to) | 3709-3718 |
Number of pages | 10 |
Journal | Computational Statistics and Data Analysis |
Volume | 52 |
Issue number | 7 |
State | Published - 15 Mar 2008 |