TY - GEN
T1 - Histogram equalization using centroids of fuzzy C-means of background speakers' utterances for speaker identification
AU - Kim, Myung Jae
AU - Yang, Il Ho
AU - Yu, Ha Jin
PY - 2013
Y1 - 2013
N2 - In this paper, we propose a novel histogram equalization approach for speaker recognition with short utterances, which do not provide enough samples to build histograms. The proposed method clusters the features of randomly selected background speakers' utterances and estimates the cumulative distribution function from the cluster centroids, sorted in ascending order, together with the samples of the short test utterance. Ranks are obtained from both the test utterance and the sorted centroid set, and the sum of the two ranks is used to estimate the cumulative distribution function. For the evaluation, we use the ETRI PC database and simulate VoIP codecs for the test set. The system is compared with other feature normalization methods such as CMN, MVN, and conventional HEQ. The proposed method reduces the error rates by 27.9%, 35.9%, and 30.1% (relative) in the G.729, SILK, and Speex test environments, respectively.
KW - histogram equalization
KW - speaker identification
KW - speaker recognition
UR - http://www.scopus.com/inward/record.url?scp=84883176082&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-39593-2_13
DO - 10.1007/978-3-642-39593-2_13
M3 - Conference contribution
AN - SCOPUS:84883176082
SN - 9783642395925
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 143
EP - 151
BT - Statistical Language and Speech Processing - First International Conference, SLSP 2013, Proceedings
T2 - 1st International Conference on Statistical Language and Speech Processing, SLSP 2013
Y2 - 29 July 2013 through 31 July 2013
ER -