Abstract
Similarity in vocal tone between speakers can degrade the performance of speaker verification. To improve speaker verification systems, we propose a multi-task learning technique in which a deep neural network learns both speaker information and age information. Multi-task learning can improve generalization because it keeps the hidden layers of a deep neural network from overfitting to a single task. However, our experiments showed that age information is not learned well during training of the deep neural network. To improve this learning, we propose a method that dynamically changes the objective function weights of speaker identification and age estimation during training. On the RSR2015 evaluation data set, the equal error rate is 6.91 % for the speaker verification system without age information, 6.77 % with age information only, and 4.73 % with age information when the weight change technique is applied.
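The weighted multi-task objective described above can be illustrated with a minimal sketch. The abstract does not state how the weights change, so the linear schedule in `age_weight`, its `start` and `end` values, and the treatment of age estimation as classification over age groups are all assumptions made for illustration; the function names are hypothetical and are not taken from the paper.

```python
import math


def cross_entropy(probs, target):
    """Negative log-likelihood of the target class under a probability vector."""
    return -math.log(probs[target])


def age_weight(epoch, total_epochs, start=0.1, end=0.5):
    """Hypothetical linear schedule for the age-task weight.

    Assumption: the paper's actual weight-change rule is not given in the
    abstract; a linear ramp from `start` to `end` is used only as an example.
    """
    if total_epochs <= 1:
        return end
    frac = epoch / (total_epochs - 1)
    return start + frac * (end - start)


def multitask_loss(speaker_probs, speaker_id, age_probs, age_class, w_age):
    """Weighted sum of the speaker and age losses; the two weights sum to 1.

    Assumption: age estimation is modeled as classification over age groups.
    """
    loss_speaker = cross_entropy(speaker_probs, speaker_id)
    loss_age = cross_entropy(age_probs, age_class)
    return (1.0 - w_age) * loss_speaker + w_age * loss_age


if __name__ == "__main__":
    # Toy network outputs for one utterance (made-up values).
    speaker_probs = [0.7, 0.2, 0.1]
    age_probs = [0.3, 0.6, 0.1]
    for epoch in range(5):
        w = age_weight(epoch, total_epochs=5)
        loss = multitask_loss(speaker_probs, 0, age_probs, 1, w)
        print(f"epoch {epoch}: w_age={w:.2f}, loss={loss:.4f}")
```

Keeping the two weights tied so they sum to 1 is one simple design choice; it keeps the overall loss scale roughly stable while the emphasis shifts between tasks.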
Original language | English |
---|---|
Pages (from-to) | 593-600 |
Number of pages | 8 |
Journal | Journal of the Acoustical Society of Korea |
Volume | 38 |
Issue number | 5 |
State | Published - 2019 |
Keywords
- Age estimation
- Deep neural network
- Multi-task learning
- Speaker verification