ATTENTIVE MAX FEATURE MAP AND JOINT TRAINING FOR ACOUSTIC SCENE CLASSIFICATION

Hye Jin Shim, Jee Weon Jung, Ju Ho Kim, Ha Jin Yu

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

6 Scopus citations

Abstract

Various attention mechanisms are being widely applied to acoustic scene classification. However, we empirically found that the attention mechanism can excessively discard potentially valuable information, despite improving performance. We propose the attentive max feature map that combines two effective techniques, attention and a max feature map, to further elaborate the attention mechanism and mitigate the above-mentioned phenomenon. We also explore various joint training methods, including multi-task learning, that allocate additional abstract labels for each audio recording. Our proposed system demonstrates competitive performance with much larger state-of-the-art systems for single systems on Subtask A of the DCASE 2020 challenge by applying the two proposed techniques using relatively fewer parameters. Furthermore, adopting the proposed attentive max feature map, our team placed fourth in the recent DCASE 2021 challenge.

Original languageEnglish
Title of host publication2022 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 - Proceedings
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages1036-1040
Number of pages5
ISBN (Electronic)9781665405409
DOIs
StatePublished - 2022
Event47th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 - Hybrid, Singapore
Duration: 23 May 202227 May 2022

Publication series

NameICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume2022-May
ISSN (Print)1520-6149

Conference

Conference47th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022
Country/TerritorySingapore
CityHybrid
Period23/05/2227/05/22

Keywords

  • acoustic scene classification
  • attention
  • joint training
  • max feature map

Fingerprint

Dive into the research topics of 'ATTENTIVE MAX FEATURE MAP AND JOINT TRAINING FOR ACOUSTIC SCENE CLASSIFICATION'. Together they form a unique fingerprint.

Cite this