Variable selection in Bayesian multiple instance regression using shotgun stochastic search

Seongoh Park, Joungyoun Kim, Xinlei Wang, Johan Lim

Research output: Contribution to journal › Article › peer-review

Abstract

In multiple instance learning (MIL), a bag represents a sample consisting of a set of instances, each described by a vector of explanatory variables, while the bag as a whole has only one label or response. Although many MIL methods have been developed, few have paid attention to the interpretability of models and results. The proposed Bayesian regression model has a two-level hierarchy that transparently shows how explanatory variables explain, and how instances contribute to, bag responses. Moreover, two selection problems are addressed simultaneously: instance selection, which identifies the instances in each bag responsible for the bag response, and variable selection, which searches for the important covariates. To explore the joint discrete space of indicator variables created for selecting both explanatory variables and instances, the shotgun stochastic search algorithm is modified to fit the MIL context. The proposed model also offers a natural and rigorous way to quantify uncertainty in coefficient estimation and outcome prediction, which many modern MIL applications call for. A simulation study shows that the proposed regression model selects variables and instances with high performance (AUC greater than 0.86) and therefore predicts responses well. The proposed method is applied to the musk data to predict binding strengths (labels) between molecules (bags) with different conformations (instances) and target receptors. It outperforms all existing methods and can identify the variables relevant to modeling responses.
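
As a rough illustration of the search idea mentioned in the abstract, the sketch below runs a generic shotgun stochastic search over a single vector of variable-inclusion indicators. It is not the modified MIL version proposed in the paper: there are no instance indicators, and `toy_log_score` is a hypothetical stand-in for a log marginal posterior. All function names and the hidden `target` set are assumptions made for this example. At each step the search scores every model in the add, delete, and swap neighborhoods, samples one candidate from each neighborhood with probability proportional to its score, and then samples the next state among those candidates.

```python
import math
import random


def neighbors(model, p):
    """Return the add, delete, and swap neighborhoods of a model.

    A model is a frozenset of included variable indices out of 0..p-1.
    """
    outside = [j for j in range(p) if j not in model]
    add = [model | {j} for j in outside]
    delete = [model - {j} for j in model]
    swap = [(model - {i}) | {j} for i in model for j in outside]
    return [add, delete, swap]


def sample_by_score(models, scores, rng):
    """Draw one model with probability proportional to exp(log score)."""
    top = max(scores[m] for m in models)
    # Subtract the maximum before exponentiating for numerical stability.
    weights = [math.exp(scores[m] - top) for m in models]
    return rng.choices(models, weights=weights, k=1)[0]


def shotgun_search(p, log_score, n_iter=30, seed=0):
    """Generic shotgun stochastic search; returns the best model and all scores."""
    rng = random.Random(seed)
    current = frozenset()
    scores = {current: log_score(current)}
    for _ in range(n_iter):
        candidates = []
        for group in neighbors(current, p):
            if not group:
                continue
            # Score every neighbor so that good models are recorded
            # even if the chain never moves to them.
            for m in group:
                if m not in scores:
                    scores[m] = log_score(m)
            candidates.append(sample_by_score(group, scores, rng))
        current = sample_by_score(candidates, scores, rng)
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy score: penalize each variable that differs from a hidden
# "true" set. A real application would use a log marginal posterior instead.
target = frozenset({0, 3})


def toy_log_score(model):
    return -5.0 * len(model ^ target)


best, scores = shotgun_search(p=6, log_score=toy_log_score, n_iter=30, seed=1)
print(sorted(best), len(scores))
```

Recording the score of every evaluated neighbor, rather than only the visited states, is what lets this kind of search report a list of high-scoring models after comparatively few moves.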

Original language: English
Article number: 107954
Journal: Computational Statistics and Data Analysis
Volume: 196
State: Published - Aug 2024

Keywords

  • Binding affinity prediction
  • Hierarchical model
  • MCMC
  • Model selection
  • Multiple instance learning
  • Musk data
