Abstract
A neural network model based on a non-uniform unit is proposed for speaker-independent continuous speech recognition. The model performs three functions: segmenting the input speech into sub-word units, classifying the units, and detecting words; each function is implemented by a separate module. The proposed recognition unit can include an arbitrary number of phonemes, so it can absorb co-articulation effects that spread across several phonemes. The unit classifier module separates the speech into stationary and transition parts and uses different parameters for each. The word detector module can learn all the pronunciation variations in the training data. The system is evaluated on a subset of the TIMIT speech data.
Original language | English |
---|---|
Pages (from-to) | 3277-3280 |
Number of pages | 4 |
Journal | Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing |
Volume | 4 |
State | Published - 1997 |
Event | Proceedings of the 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP. Part 1 (of 5) - Munich, Ger; Duration: 21 Apr 1997 → 24 Apr 1997 |