TY - JOUR
T1 - Semi-parametric hidden Markov model for large-scale multiple testing under dependency
AU - Kim, Joungyoun
AU - Lim, Johan
AU - Lee, Jong Soo
N1 - Publisher Copyright: © 2022 The Author(s).
PY - 2024/8
Y1 - 2024/8
AB - In this article, we propose a new semiparametric hidden Markov model (HMM) for simultaneous hypothesis testing under dependency. Existing semi- and non-parametric HMMs in the literature require two conditions for model identifiability: (a) the latent Markov chain (MC) is ergodic and its transition probability matrix has full rank, and (b) the observational distributions of different hidden states are disjoint or linearly independent. Unlike the existing models, our semiparametric HMM with two hidden states makes no assumption on the transition probability of the latent MC but assumes that the observational distributions are extremal for the set of all stationary distributions of the model. To estimate the model, we propose a modified expectation-maximization algorithm whose M-step includes an additional purification step that makes the observational distribution extremal. We numerically investigate the performance of the proposed procedure in estimating the model and compare it with two recent methods under various multiple testing error settings. In addition, we apply our procedure to two real data examples: a gas chromatography/mass spectrometry experiment to differentiate the origin of herbal medicine, and the epidemiologic surveillance of an influenza-like illness.
KW - false discovery rate
KW - identifiability
KW - local index of significance
KW - modified EM procedure
KW - multiple testing
KW - semi-parametric hidden Markov model
UR - http://www.scopus.com/inward/record.url?scp=85139004813&partnerID=8YFLogxK
U2 - 10.1177/1471082X221121235
DO - 10.1177/1471082X221121235
M3 - Article
AN - SCOPUS:85139004813
SN - 1471-082X
VL - 24
SP - 320
EP - 343
JO - Statistical Modelling
JF - Statistical Modelling
IS - 4
ER -