Abstract
Principal component analysis (PCA) is a well-known tool in multivariate statistics. One significant challenge in using PCA is choosing the number of principal components. To address this challenge, we propose distribution-based methods with exact type I error control for hypothesis testing and for constructing confidence intervals for signals in a noisy matrix with finite samples. Assuming Gaussian noise, we derive exact type I error control from the conditional distribution of the singular values of a Gaussian matrix, using a post-selection inference framework and extending the approach of [Taylor, Loftus and Tibshirani (2013)] to a PCA setting. In simulation studies, we find that our proposed methods compare well to existing approaches.
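To make the testing problem concrete, the following is a minimal sketch of the setting only, not the exact conditional test derived in the paper. It assumes a known noise level `sigma`, tests just the global null of "no signal" using the largest singular value, and swaps the paper's exact conditional distribution for a plain Monte Carlo calibration. The function names, matrix dimensions, and signal strength are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate_matrix(n, p, rank, strength, sigma, rng):
    """Return a rank-`rank` signal plus i.i.d. N(0, sigma^2) noise (illustrative model)."""
    noise = sigma * rng.standard_normal((n, p))
    if rank == 0:
        return noise
    # Orthonormal left/right factors so each signal singular value equals `strength`.
    u, _ = np.linalg.qr(rng.standard_normal((n, rank)))
    v, _ = np.linalg.qr(rng.standard_normal((p, rank)))
    return strength * u @ v.T + noise


def mc_pvalue_top_singular_value(y, sigma, n_draws, rng):
    """Monte Carlo p-value for H0: pure noise, based on the largest singular value.

    Assumption: sigma is known. This is a simulation-based stand-in,
    not the exact conditional test described in the abstract.
    """
    observed = np.linalg.svd(y, compute_uv=False)[0]
    null = np.array([
        np.linalg.svd(sigma * rng.standard_normal(y.shape), compute_uv=False)[0]
        for _ in range(n_draws)
    ])
    # The add-one correction keeps the p-value strictly positive.
    return (1 + np.sum(null >= observed)) / (n_draws + 1)


rng = np.random.default_rng(0)
y_signal = simulate_matrix(50, 10, rank=1, strength=30.0, sigma=1.0, rng=rng)
y_noise = simulate_matrix(50, 10, rank=0, strength=0.0, sigma=1.0, rng=rng)

p_signal = mc_pvalue_top_singular_value(y_signal, 1.0, 200, rng)
p_noise = mc_pvalue_top_singular_value(y_noise, 1.0, 200, rng)
print(f"p-value with a rank-1 signal: {p_signal:.4f}")
print(f"p-value under pure noise:     {p_noise:.4f}")
```

With a strong rank-one signal the p-value should sit at its Monte Carlo floor of `1 / (n_draws + 1)`, while under pure noise it is roughly uniform on (0, 1]. Testing the second and later components requires conditioning on the components already selected, which is where the post-selection framework described above comes in and which this sketch does not attempt.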
| Original language | English |
|---|---|
| Pages (from-to) | 2590-2617 |
| Number of pages | 28 |
| Journal | Annals of Statistics |
| Volume | 45 |
| Issue number | 6 |
| State | Published - Dec 2017 |
Keywords
- Exact p-value
- Hypothesis test
- Principal components