Abstract
A subword-based neural network model for continuous speech recognition is proposed. The system consists of three modules, each composed of simple neural networks. In the first module, a network segments the speech input into non-uniform units. Non-uniform units can model phoneme variations that span several phonemes or cross word boundaries. The second module recognizes the segmented units. Each unit has a stationary part and a transition part, and the network is divided accordingly. The last module spots words by modeling temporal representation. Results of speaker-independent word spotting of 520 words are described.
Original language | English |
---|---|
Pages | 506-509 |
Number of pages | 4 |
State | Published - 1996 |
Event | Proceedings of the 1996 International Conference on Spoken Language Processing, ICSLP. Part 1 (of 4) - Philadelphia, PA, USA. Duration: 3 Oct 1996 → 6 Oct 1996 |
Conference
Conference | Proceedings of the 1996 International Conference on Spoken Language Processing, ICSLP. Part 1 (of 4) |
---|---|
City | Philadelphia, PA, USA |
Period | 3 Oct 1996 → 6 Oct 1996 |