Abstract
Principal component analysis (PCA) is a well-known tool in multivariate statistics. One significant challenge in using PCA is the choice of the number of principal components. To address this challenge, we propose distribution-based methods with exact type 1 error controls for hypothesis testing and construction of confidence intervals for signals in a noisy matrix with finite samples. Assuming Gaussian noise, we derive exact type 1 error controls based on the conditional distribution of the singular values of a Gaussian matrix by utilizing a post-selection inference framework and extending the approach of Taylor, Loftus and Tibshirani (2013) to a PCA setting. In simulation studies, we find that our proposed methods compare well to existing approaches.
Original language | English |
---|---|
Pages (from-to) | 2590-2617 |
Number of pages | 28 |
Journal | Annals of Statistics |
Volume | 45 |
Issue number | 6 |
State | Published - Dec 2017 |
Keywords
- Exact p-value
- Hypothesis test
- Principal components