Semi-parametric hidden Markov model for large-scale multiple testing under dependency

Joungyoun Kim, Johan Lim, Jong Soo Lee

Research output: Contribution to journal › Article › peer-review

Abstract

In this article, we propose a new semiparametric hidden Markov model (HMM) for simultaneous hypothesis testing under dependency. The semi- and non-parametric HMMs in the literature require two conditions for model identifiability: (a) the latent Markov chain (MC) is ergodic and its transition probability matrix is full rank, and (b) the observational distributions of the different hidden states are disjoint or linearly independent. Unlike the existing models, our semiparametric HMM with two hidden states makes no assumption on the transition probability of the latent MC, but assumes that the observational distributions are extremal for the set of all stationary distributions of the model. To estimate the model, we propose a modified expectation-maximization algorithm whose M-step includes an additional purification step that makes the observational distributions extremal. We numerically investigate the performance of the proposed procedure in estimating the model and compare it with two recent existing methods in various multiple testing error settings. In addition, we apply our procedure to two real data examples: a gas chromatography/mass spectrometry experiment to differentiate the origin of herbal medicine, and the epidemiologic surveillance of an influenza-like illness.
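To make the testing side of the abstract concrete, the following is a minimal sketch of how a local index of significance (LIS) can be computed and thresholded once a two-state HMM has been fitted. It is not the authors' semiparametric estimator and contains no purification step. It assumes the initial distribution, the transition matrix, and both observational densities are already known, and it uses Gaussian densities purely for illustration. All function names, parameter values, and data are hypothetical. State 0 is taken to be the null state.

```python
import math


def normal_pdf(x, mean, sd):
    """Gaussian density, used here only as an illustrative observational density."""
    u = (x - mean) / sd
    return math.exp(-0.5 * u * u) / (sd * math.sqrt(2.0 * math.pi))


def lis_two_state(z, init, trans, f0, f1):
    """Return LIS_i = P(state_i = 0 | z_1, ..., z_n) by scaled forward-backward.

    init  : [P(state_1 = 0), P(state_1 = 1)]
    trans : 2x2 nested list, trans[r][s] = P(state_{i+1} = s | state_i = r)
    f0,f1 : densities of the observations under state 0 (null) and state 1
    """
    n = len(z)
    dens = [(f0(x), f1(x)) for x in z]

    # Forward pass; each step is renormalised to avoid numerical underflow.
    alpha = []
    for i in range(n):
        if i == 0:
            a = [init[s] * dens[0][s] for s in (0, 1)]
        else:
            prev = alpha[-1]
            a = [dens[i][s] * sum(prev[r] * trans[r][s] for r in (0, 1))
                 for s in (0, 1)]
        total = a[0] + a[1]
        alpha.append([a[0] / total, a[1] / total])

    # Backward pass, renormalised in the same way.
    beta = [[1.0, 1.0] for _ in range(n)]
    for i in range(n - 2, -1, -1):
        b = [sum(trans[s][r] * dens[i + 1][r] * beta[i + 1][r] for r in (0, 1))
             for s in (0, 1)]
        total = b[0] + b[1]
        beta[i] = [b[0] / total, b[1] / total]

    # The posterior null probability is proportional to alpha * beta.
    lis = []
    for i in range(n):
        g0 = alpha[i][0] * beta[i][0]
        g1 = alpha[i][1] * beta[i][1]
        lis.append(g0 / (g0 + g1))
    return lis


def lis_step_up(lis, level):
    """Reject the k hypotheses with the smallest LIS, where k is the largest
    rank whose running mean of sorted LIS values does not exceed `level`."""
    order = sorted(range(len(lis)), key=lambda i: lis[i])
    running, k = 0.0, 0
    for rank, i in enumerate(order, start=1):
        running += lis[i]
        if running / rank <= level:
            k = rank
    return sorted(order[:k])


# Hypothetical data and parameters, chosen only to exercise the functions.
z_scores = [0.1, -0.3, 3.2, 3.9, 2.8, 0.2]
lis_values = lis_two_state(
    z_scores,
    init=[0.8, 0.2],
    trans=[[0.9, 0.1], [0.2, 0.8]],
    f0=lambda x: normal_pdf(x, 0.0, 1.0),
    f1=lambda x: normal_pdf(x, 3.0, 1.0),
)
rejected = lis_step_up(lis_values, level=0.1)
```

In this sketch the dependency between neighbouring hypotheses enters only through the transition matrix, so an observation surrounded by likely non-null neighbours receives a smaller LIS than the same observation would in isolation. Replacing `f0` and `f1` with estimated densities would leave the two functions unchanged.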

Original language: English
Pages (from-to): 320-343
Number of pages: 24
Journal: Statistical Modelling
Volume: 24
Issue number: 4
State: Published - Aug 2024

Keywords

  • false discovery rate
  • identifiability
  • local index of significance
  • modified EM procedure
  • multiple testing
  • semi-parametric hidden Markov model
