Abstract
We consider the block coordinate descent methods of Gauss-Seidel type with proximal regularization (BCD-PR), which is a classical method of minimizing general nonconvex objectives under constraints that has a wide range of practical applications. We theoretically establish the worst-case complexity bound for this algorithm. Namely, we show that for general nonconvex smooth objective with block-wise constraints, the classical BCD-PR algorithm converges to an ε-stationary point within Oe(ε−1) iterations. Under a mild condition, this result still holds even if the algorithm is executed inexactly in each step. As an application, we propose a provable and efficient algorithm for 'Wasserstein CP-dictionary learning', which seeks a set of elementary probability distributions that can well-approximate a given set of d-dimensional joint probability distributions. Our algorithm is a version of BCD-PR that operates in the dual space, where the primal problem is regularized both entropically and proximally.
| Original language | English |
|---|---|
| Pages (from-to) | 18114-18134 |
| Number of pages | 21 |
| Journal | Proceedings of Machine Learning Research |
| Volume | 202 |
| State | Published - 2023 |
| Event | 40th International Conference on Machine Learning, ICML 2023 - Honolulu, United States Duration: 23 Jul 2023 → 29 Jul 2023 |
Fingerprint
Dive into the research topics of 'Complexity of Block Coordinate Descent with Proximal Regularization and Applications to Wasserstein CP-dictionary Learning'. Together they form a unique fingerprint.Cite this
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver