A teacher student model based integrated feature speaker verification system robust to noisy environments

  • Kyo Won Koo
  • Jungwoo Heo
  • Hyun Seo Shin
  • Chan Yeong Lim
  • Seung Bin Kim
  • Jisoo Son
  • Kyung Wha Kim
  • Ha Jin Yu

Research output: Contribution to journal › Article › peer-review

Abstract

While existing speaker verification systems perform well in clean environments, their performance degrades when the input speech is contaminated with noise. Although recent research has employed teacher-student learning to enhance the noise robustness of speaker verification systems, these approaches are limited by their reliance on a single input modality. Real-world acoustic environments contain various types of noise, such as stationary and impulsive noise, whose characteristics manifest differently across modalities. We propose an integrated feature system that leverages multiple features, each of which represents different noise types differently. The system incorporates a CNN Extractor that processes spectrograms in parallel with a teacher-student learning-based Pre-trained Large Model (PLM) branch that processes raw waveforms. Features extracted from both branches are adaptively integrated through a feature fusion module designed to exploit the complementary advantages of each input representation. In experiments, the Equal Error Rate (EER) improved by approximately 18% in the in-domain noise environment and approximately 49% in the out-of-domain noise environment compared to the existing PLM-based single-input system. Furthermore, consistent performance improvements were observed across various real-world datasets, validating the competitive performance of the proposed system in noisy environments.
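
The results above are stated as relative improvements in Equal Error Rate. As a reading aid, the following is a minimal sketch of how an EER and a relative improvement are commonly computed from verification trial scores. It is not the authors' evaluation code: the function names, the threshold sweep, and all numbers are illustrative assumptions, and the abstract does not say whether its percentages are relative or absolute reductions (the relative reading is assumed here).

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Hypothetical helper for illustration; higher score = more likely same speaker.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection: a genuine (target) trial scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: an impostor (non-target) trial scored at or above it.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the operating point where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


def relative_improvement(baseline_eer, new_eer):
    """Fraction by which new_eer is lower than baseline_eer."""
    return (baseline_eer - new_eer) / baseline_eer


# Made-up trial scores, not data from the paper.
targets = [0.9, 0.8, 0.7, 0.4]
nontargets = [0.6, 0.3, 0.2, 0.1]
print(equal_error_rate(targets, nontargets))

# Made-up EER values chosen only to illustrate an 18% relative reduction.
print(relative_improvement(2.0, 1.64))
```

Production toolkits usually interpolate between ROC points rather than picking the closest observed threshold, so values from this sketch can differ slightly from reported figures.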

Original language: English
Pages (from-to): 548-555
Number of pages: 8
Journal: Journal of the Acoustical Society of Korea
Volume: 44
Issue number: 5
State: Published - 2025

Keywords

  • Integrated feature
  • Noise robustness
  • Self Supervised Pre-trained Large Model (Self Supervised PLM)
  • Speaker verification
