A complete end-to-end speaker verification system using deep neural networks: From raw signals to verification result

Jee Weon Jung, Hee Soo Heo, Il Ho Yang, Hye Jin Shim, Ha Jin Yu

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

55 Scopus citations

Abstract

End-to-end systems using deep neural networks have been widely studied in the field of speaker verification. Raw audio signal processing has also been widely studied in the fields of automatic music tagging and speech recognition. However, as far as we know, end-to-end systems using raw audio signals have not been explored in speaker verification. In this paper, a complete end-to-end speaker verification system is proposed, which inputs raw audio signals and outputs the verification results. A pre-processing layer and the embedded speaker feature extraction models were mainly investigated. The proposed pre-emphasis layer was combined with a strided convolution layer for pre-processing at the first two hidden layers. In addition, speaker feature extraction models using convolutionallayer and long short-term memory are proposed to be embedded in the proposed end-to-end system.

Original languageEnglish
Title of host publication2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages5349-5353
Number of pages5
ISBN (Print)9781538646588
DOIs
StatePublished - 10 Sep 2018
Event2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Calgary, Canada
Duration: 15 Apr 201820 Apr 2018

Publication series

NameICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume2018-April
ISSN (Print)1520-6149

Conference

Conference2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018
Country/TerritoryCanada
CityCalgary
Period15/04/1820/04/18

Keywords

  • End-to-end system
  • Raw audio signal
  • Speaker verification

Fingerprint

Dive into the research topics of 'A complete end-to-end speaker verification system using deep neural networks: From raw signals to verification result'. Together they form a unique fingerprint.

Cite this