Flat Posterior Does Matter For Bayesian Model Averaging

  • Sungjun Lim
  • Jeyoon Yeom
  • Sooyon Kim
  • Hoyoon Byun
  • Jinho Kang
  • Yohan Jung
  • Jiyoung Jung
  • Kyungwoo Song

Research output: Contribution to journal › Conference article › peer-review

Abstract

Bayesian neural networks (BNNs) estimate the posterior distribution of model parameters and use posterior samples for Bayesian Model Averaging (BMA) in prediction. However, despite the crucial role that flatness of the loss landscape plays in improving the generalization of neural networks, its impact on BMA has been largely overlooked. In this work, we explore how posterior flatness influences BMA generalization and empirically demonstrate that (1) most approximate Bayesian inference methods fail to yield a flat posterior and (2) BMA predictions that do not account for posterior flatness are less effective at improving generalization. To address this, we propose Flat Posterior-aware Bayesian Model Averaging (FP-BMA), a novel training objective that explicitly encourages flat posteriors in a principled Bayesian manner. We also introduce a Flat Posterior-aware Bayesian Transfer Learning scheme that enhances generalization in downstream tasks. Empirically, we show that FP-BMA successfully captures flat posteriors, improving generalization performance.
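To make the two ingredients of the abstract concrete, here is a minimal, self-contained sketch of (a) BMA prediction, i.e., averaging predictive probabilities over posterior samples, and (b) a SAM-style sharpness proxy that measures the worst-case loss increase in a small neighborhood of a parameter. The 1D logistic-regression model, the toy data, and the grid-based neighborhood search are illustrative assumptions; the FP-BMA objective itself is not specified in this abstract and is not reproduced here.

```python
import math

def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))

def nll(w, data):
    """Negative log-likelihood of a 1D logistic regression with weight w.
    data is a list of (x, y) pairs with y in {0, 1}."""
    total = 0.0
    for x, y in data:
        p = sigmoid(w * x)
        total -= math.log(p if y == 1 else 1.0 - p)
    return total

def sharpness(w, data, rho=0.5, grid=11):
    """SAM-style flatness proxy (an illustrative stand-in, not FP-BMA itself):
    worst-case loss increase over a symmetric grid of perturbations in
    [-rho, rho] around w. A flat region yields a small value."""
    base = nll(w, data)
    deltas = [rho * (2.0 * i / (grid - 1) - 1.0) for i in range(grid)]
    worst = max(nll(w + d, data) for d in deltas)
    return worst - base  # >= 0 because delta = 0 is in the grid

def bma_predict(x, posterior_samples):
    """Bayesian Model Averaging: average the predictive probabilities
    over posterior samples (not the parameters themselves)."""
    return sum(sigmoid(w * x) for w in posterior_samples) / len(posterior_samples)

# Toy posterior samples and data (assumed for illustration only).
data = [(1.0, 1), (2.0, 1), (-1.0, 0), (-1.5, 0)]
samples = [0.8, 1.0, 1.2, 1.5]

p = bma_predict(1.0, samples)
s = sharpness(1.0, data)
```

Note that `bma_predict` averages probabilities rather than plugging the posterior-mean weight into the model; the two generally differ because the sigmoid is nonlinear, which is why posterior samples (and, per the abstract, their flatness) matter for the quality of BMA predictions.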

Original language: English
Pages (from-to): 2582-2594
Number of pages: 13
Journal: Proceedings of Machine Learning Research
Volume: 286
State: Published - 2025
Event: 41st Conference on Uncertainty in Artificial Intelligence, UAI 2025 - Rio de Janeiro, Brazil
Duration: 21 Jul 2025 – 25 Jul 2025
