Primal path algorithm for compositional data analysis

Jong June Jeon, Yongdai Kim, Sungho Won, Hosik Choi

Research output: Contribution to journal, Article, peer-review

4 Scopus citations

Abstract

We consider the LASSO estimator for compositional data, in which the covariates are nonnegative and always sum to one. Because the sum-to-one condition imposes a linear constraint on the regression coefficients, standard LASSO algorithms cannot be applied directly to compositional data. Hence, a specific regularized regression model with linear constraints is commonly used. However, the linear constraints incur additional computational time, which becomes severe in high-dimensional cases. Additionally, exact computation of the regression solution has not been investigated for existing methods. In this paper, we first propose an exact solution path algorithm for l1-regularized regression with high-dimensional compositional data and then extend it to a classification model. We also compare its computational speed with that of previously developed algorithms and apply the proposed algorithm to income inequality data in economics and human gut microbiome data in biology. By analyzing simulated and real data sets, we illustrate that our specialized algorithm is significantly more efficient than the generalized LASSO algorithm for compositional data.
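The following toy sketch shows one way to see why the sum-to-one condition calls for a linear constraint on the coefficients. It is an illustration only, not the paper's path algorithm; the data, the coefficient values, and the helper name `predict` are all hypothetical. When every row of covariates sums to one, adding a constant to every coefficient has the same effect as adding that constant to the intercept, so the coefficients are not identified without some constraint.

```python
# Hypothetical toy data: each row is a composition (nonnegative, sums to one).
X = [
    [0.2, 0.3, 0.5],
    [0.6, 0.1, 0.3],
    [0.25, 0.25, 0.5],
]

def predict(rows, beta, intercept=0.0):
    """Linear predictor: intercept + sum_j x_j * beta_j for each row."""
    return [intercept + sum(x * b for x, b in zip(row, beta)) for row in rows]

beta = [1.0, -2.0, 0.5]   # arbitrary illustrative coefficients
c = 3.0                   # arbitrary shift

# Model A: original coefficients, with the shift placed in the intercept.
fit_a = predict(X, beta, intercept=c)

# Model B: every coefficient shifted by c, with no intercept.
shifted_beta = [b + c for b in beta]
fit_b = predict(X, shifted_beta, intercept=0.0)

# The two parameterizations give the same fitted values (up to rounding),
# because c * sum(row) == c whenever the row sums to one.
max_gap = max(abs(a - b) for a, b in zip(fit_a, fit_b))
print(max_gap)
```

Since both parameterizations fit the data equally well, a penalized estimator needs an extra linear restriction on the coefficients to have a well-defined solution, which is the kind of constraint the abstract refers to.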

Original language: English
Article number: 106958
Journal: Computational Statistics and Data Analysis
Volume: 148
State: Published - Aug 2020

Keywords

  • Constraint
  • Microbiome data
  • Penalized regression
  • Solution path algorithm
