Abstract
This paper develops a new approach to learning high-dimensional linear structural equation models (SEMs) without the commonly assumed faithfulness, Gaussian error distribution, and equal error distribution conditions. Key components of the algorithm are component-wise ordering and parent estimation, both of which can be efficiently addressed using ℓ1-regularized regression. This paper proves that sample sizes n = Ω(d^2 log p) and n = Ω(d^2 p^{2/m}) are sufficient for the proposed algorithm to recover linear SEMs with sub-Gaussian and (4m)-th bounded-moment error distributions, respectively, where p is the number of nodes and d is the maximum degree of the moralized graph. We further show that the worst-case computational complexity is O(n(p^3 + p^2 d^2)); hence, the proposed algorithm is statistically consistent and computationally feasible for learning a high-dimensional linear SEM when its moralized graph is sparse. Through simulations, we verify that the proposed algorithm is statistically consistent and computationally feasible, and that it performs well compared to the state-of-the-art US, GDS, LISTEN, and TD algorithms under our settings. We also demonstrate through real COVID-19 data that the proposed algorithm is well-suited to estimating a virus-spread map in China.
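As a rough illustration of the ℓ1-regularized parent-estimation idea mentioned in the abstract, the sketch below regresses each node on its predecessors in a causal ordering and keeps the variables with nonzero Lasso coefficients as estimated parents. This is a minimal, hypothetical example on toy data assuming the ordering is already known; the data, the regularization constant `alpha`, and the coefficient threshold are illustrative choices, not the paper's actual algorithm or settings.

```python
# Minimal sketch (not the paper's algorithm): parent estimation in a linear SEM
# via L1-regularized (Lasso) regression, assuming a known causal ordering.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

# Toy linear SEM over 4 nodes: X1 = 0.8*X0 + e1, X3 = 0.5*X1 - 0.7*X2 + e3.
n = 500
X = np.zeros((n, 4))
X[:, 0] = rng.normal(size=n)
X[:, 1] = 0.8 * X[:, 0] + rng.normal(size=n)
X[:, 2] = rng.normal(size=n)
X[:, 3] = 0.5 * X[:, 1] - 0.7 * X[:, 2] + rng.normal(size=n)

ordering = [0, 2, 1, 3]  # an assumed (known) causal ordering

parents = {ordering[0]: []}
for i, node in enumerate(ordering[1:], start=1):
    predecessors = ordering[:i]
    # Regress the current node on all nodes earlier in the ordering;
    # variables with nonzero Lasso coefficients are taken as estimated parents.
    lasso = Lasso(alpha=0.1).fit(X[:, predecessors], X[:, node])
    parents[node] = [p for p, c in zip(predecessors, lasso.coef_) if abs(c) > 1e-3]

print(parents)  # e.g. {0: [], 2: [], 1: [0], 3: [1, 2]}
```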
| Original language | English |
| --- | --- |
| Journal | Journal of Machine Learning Research |
| Volume | 22 |
| State | Published - 2021 |
Keywords
- ℓ1-regularization
- Bayesian networks
- Causal learning
- Directed acyclic graph
- Linear structural equation model
- Structure learning