TY - JOUR
T1 - Fuzzy restricted Boltzmann machine based probabilistic linear discriminant analysis for noise-robust text-dependent speaker verification on short utterances
AU - Yoon, Sung Hyun
AU - Koh, Min Sung
AU - Yu, Ha Jin
N1 - Publisher Copyright: © International Association of Engineers.
PY - 2020
Y1 - 2020
AB - In the i-vector-based speaker verification system, it is important to compensate for session variability on the i-vector to improve speaker verification performance. Linear discriminant analysis (LDA) is widely used to compensate for session variability by reducing the dimensionality of the i-vector. Restricted Boltzmann machine (RBM)-based probabilistic linear discriminant analysis (PLDA) has been proposed to improve the session variability compensation ability of LDA. It can be viewed as a probabilistic approach of LDA using RBM. However, since the RBM does not consider uncertainties in obtaining the parameters, the representation capability of RBM-based PLDA is limited. For instance, many real-world speaker verifications must consider noisy environments, which make the compensated session variability uncertain. The fuzzy restricted Boltzmann machine (FRBM) was proposed to improve the capability of the RBM. It showed a more robust performance than that of the RBM. Hence, in this paper, we propose FRBM-based PLDA to improve the representation capability of RBM-PLDA by replacing all the parameters of RBM-PLDA with fuzzy numbers. An evaluation with Part 1 of Robust Speaker Recognition (RSR) 2015 was conducted. In the experimental results, the proposed algorithm shows a better compensation for phonetic variability that exists in short utterances, and a robust speaker verification performance in diverse noisy environments where phonetic and noise variabilities are challenging issues in real-world applications.
KW - Discriminant analysis
KW - Fuzzy restricted Boltzmann machine
KW - I-vector
KW - Restricted Boltzmann machine
KW - Speaker verification
UR - http://www.scopus.com/inward/record.url?scp=85089810025&partnerID=8YFLogxK
M3 - Article
AN - SCOPUS:85089810025
SN - 1819-656X
VL - 47
SP - 468
EP - 480
JO - IAENG International Journal of Computer Science
JF - IAENG International Journal of Computer Science
IS - 3
ER -