Abstract
In this paper, we propose the a-Feature Map Scaling (a-FMS) method, which extends the FMS method designed to enhance the discriminative power of the feature maps of deep neural networks in Speaker Verification (SV) systems. FMS derives a scale vector from a feature map and then adds it to the features, multiplies the features by it, or applies both operations sequentially. However, FMS uses an identical scale vector for both addition and multiplication, and in the case of addition it can only add a value between zero and one. To overcome these limitations, a-FMS adds a trainable parameter a to the feature map element-wise and then multiplies the result by a scale vector. We compare two variants: one in which a is a scalar and one in which it is a vector. Both a-FMS variants are applied after each residual block of the deep neural network. The proposed systems using the a-FMS methods are trained using RawNet2 and tested on the VoxCeleb1 evaluation set. The results show equal error rates of 2.47 % and 2.31 % for the two a-FMS methods, respectively.
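The a-FMS operation described above (add a, then multiply by a scale vector) can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation. The abstract does not state how the scale vector is derived, so the sketch assumes mean pooling over time followed by a linear layer and a sigmoid. The function names `scale_vector` and `a_fms`, the channel-by-time list layout, and all numeric values are hypothetical.

```python
import math


def scale_vector(feature_map, weights, bias):
    """Derive one scale in (0, 1) per channel.

    Assumption (not stated in the abstract): mean pooling over time,
    then a linear layer, then a sigmoid.
    """
    pooled = [sum(channel) / len(channel) for channel in feature_map]
    scales = []
    for w_row, b in zip(weights, bias):
        z = sum(w * p for w, p in zip(w_row, pooled)) + b
        scales.append(1.0 / (1.0 + math.exp(-z)))
    return scales


def a_fms(feature_map, scales, a):
    """Add a to the feature map element-wise, then multiply by the scales.

    `a` is either a single number (scalar variant) or a list with one
    value per channel (vector variant). In a real system it would be a
    trainable parameter; here it is a fixed, hypothetical value.
    """
    n_channels = len(feature_map)
    a_per_channel = [a] * n_channels if isinstance(a, (int, float)) else list(a)
    if len(a_per_channel) != n_channels:
        raise ValueError("vector a needs one value per channel")
    return [
        [(x + a_c) * s_c for x in channel]
        for channel, a_c, s_c in zip(feature_map, a_per_channel, scales)
    ]


# Toy feature map: 2 channels x 2 time steps (hypothetical values).
feature_map = [[1.0, 3.0], [2.0, 4.0]]
# Zero weights and biases make every scale exactly sigmoid(0) = 0.5.
scales = scale_vector(feature_map, [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])

out_scalar = a_fms(feature_map, scales, 1.0)          # scalar a
out_vector = a_fms(feature_map, scales, [1.0, -2.0])  # one a per channel
```

With these toy values, `out_scalar` is `[[1.0, 2.0], [1.5, 2.5]]` and `out_vector` is `[[1.0, 2.0], [0.0, 1.0]]`. The negative entry in the vector case assumes a is unconstrained, which would let the offset fall outside the zero-to-one range that the abstract names as a limitation of FMS.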
Original language | English |
---|---|
Pages (from-to) | 441-446 |
Number of pages | 6 |
Journal | Journal of the Acoustical Society of Korea |
Volume | 39 |
Issue number | 5 |
State | Published - 2020 |
Keywords
- Deep neural network
- Feature map scaling
- Raw waveform
- Speaker verification