Histogram equalization using centroids of fuzzy C-means of background speakers' utterances for speaker identification

Myung Jae Kim, Il Ho Yang, Ha Jin Yu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

In this paper, we propose a novel approach to histogram equalization (HEQ) for speaker recognition with short utterances, which do not provide enough samples to build reliable histograms. The proposed method clusters the features of randomly selected background speakers' utterances with fuzzy C-means and estimates the cumulative distribution function from two sources: the cluster centroids, sorted in ascending order, and the samples of the short test utterance. Each feature value is ranked within the test utterance and within the sorted centroid set, and the sum of the two ranks is used to estimate the cumulative distribution function. For the evaluation, we use the ETRI PC database and simulate VoIP codecs for the test set. The system is compared with other feature normalization methods: cepstral mean normalization (CMN), mean and variance normalization (MVN), and conventional HEQ. The proposed method achieves relative error-rate reductions of 27.9%, 35.9%, and 30.1% in the G.729, SILK, and Speex test environments, respectively.
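The rank-sum idea in the abstract can be illustrated with a minimal sketch for a single feature dimension. Several details here are assumptions and are not taken from the paper: the centroids are supplied directly instead of being computed by fuzzy C-means, the CDF estimate uses a 0.5 continuity offset, the reference distribution is a standard normal, and the function names `rank_sum_cdf` and `equalize` are hypothetical.

```python
from bisect import bisect_right
from statistics import NormalDist


def rank_sum_cdf(sample, sorted_test, sorted_centroids):
    """Estimate the CDF at `sample` from the sum of two ranks.

    Assumed form (not from the paper): (r_test + r_centroid - 0.5) / (T + K).
    """
    # Rank within the test utterance: number of test values <= sample.
    r_test = bisect_right(sorted_test, sample)
    # Rank within the sorted background centroids.
    r_centroid = bisect_right(sorted_centroids, sample)
    total = len(sorted_test) + len(sorted_centroids)
    return (r_test + r_centroid - 0.5) / total


def equalize(test_values, centroids):
    """Map each value of one feature dimension onto a standard normal.

    Every element of `test_values` has a test rank of at least 1, so the
    estimated CDF stays strictly between 0 and 1.
    """
    sorted_test = sorted(test_values)
    sorted_centroids = sorted(centroids)
    reference = NormalDist()  # assumed reference distribution
    return [
        reference.inv_cdf(rank_sum_cdf(x, sorted_test, sorted_centroids))
        for x in test_values
    ]


if __name__ == "__main__":
    # Three frames of a short test utterance and four background centroids.
    print(equalize([0.2, 1.5, -0.7], [-1.0, 0.0, 1.0, 2.0]))
```

In this sketch the centroids act as extra pseudo-samples, so the CDF estimate rests on T + K points even when the test utterance contributes only a few frames.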

Original language: English
Title of host publication: Statistical Language and Speech Processing - First International Conference, SLSP 2013, Proceedings
Pages: 143-151
Number of pages: 9
State: Published - 2013
Event: 1st International Conference on Statistical Language and Speech Processing, SLSP 2013 - Tarragona, Spain
Duration: 29 Jul 2013 to 31 Jul 2013

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7978 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 1st International Conference on Statistical Language and Speech Processing, SLSP 2013
Country/Territory: Spain
City: Tarragona
Period: 29/07/13 to 31/07/13

Keywords

  • histogram equalization
  • speaker identification
  • speaker recognition
