Large vocabulary Korean continuous speech recognition using a one-pass algorithm

Ha Jin Yu, Hoon Kim, Joon Mo Hong, Min Seong Kim, Jong Seok Lee

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Scopus citations

Abstract

In this paper, we describe problems in recognizing large-vocabulary Korean continuous speech and propose solutions to them. Korean sentences consist of eojeols, which are separated by spaces in text and are themselves composed of morphemes. When morphemes are used as recognition units, there are many word insertion and deletion errors because the units are too short. We introduce a between-word phone variation lexicon that can represent many alternative phone realizations of a word in one structure. The decoder is a one-pass algorithm, a modification of the token-passing algorithm. In this algorithm, we allow multiple tokens in a state at a time, so that the globally best path can be found without expanding the states when trigram language models are used. We confirmed that the between-word phone variation lexicon is useful for morpheme-based recognition by observing that the improvement is higher for morpheme units than for eojeol units. Allowing multiple tokens in a state also improved performance.
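The multiple-token idea in the abstract can be illustrated with a toy sketch. This is not the authors' decoder: it works on word slots rather than HMM states and acoustic frames, and the names `decode`, `FRAMES`, `TRIGRAMS`, `toy_trigram_logp`, and `max_tokens` are hypothetical, with made-up scores. It only shows why keeping one token per state can lose the globally best path under a trigram model, while keeping several tokens with distinct two-word histories in the same state recovers it.

```python
from collections import defaultdict

# Hypothetical toy trigram model: log-probabilities, with a flat default.
TRIGRAMS = {("b", "c", "d"): -0.5}

def toy_trigram_logp(w1, w2, w3):
    return TRIGRAMS.get((w1, w2, w3), -2.0)

# Hypothetical input: one dict of {word: acoustic log-score} per word slot.
FRAMES = [{"a": -1.0, "b": -1.2}, {"c": -1.0}, {"d": -1.0}]

def decode(frames, trigram_logp, max_tokens):
    """Token passing over word slots; each state is a word identity.

    Every state keeps up to `max_tokens` tokens, at most one per distinct
    two-word language-model context. A token is (score, word_history).
    """
    tokens = {"<s>": [(0.0, ("<s>", "<s>"))]}
    for candidates in frames:
        # arrived[word][context] holds the best token for that LM context.
        arrived = defaultdict(dict)
        for state_tokens in tokens.values():
            for score, hist in state_tokens:
                for word, acoustic in candidates.items():
                    new_score = score + acoustic + trigram_logp(hist[-2], hist[-1], word)
                    new_hist = hist + (word,)
                    context = new_hist[-2:]
                    best = arrived[word].get(context)
                    if best is None or new_score > best[0]:
                        arrived[word][context] = (new_score, new_hist)
        # Prune each state to its best `max_tokens` tokens.
        tokens = {
            word: sorted(by_context.values(), reverse=True)[:max_tokens]
            for word, by_context in arrived.items()
        }
    best_score, best_hist = max(t for ts in tokens.values() for t in ts)
    return best_score, list(best_hist[2:])  # drop the two "<s>" padding symbols

single = decode(FRAMES, toy_trigram_logp, max_tokens=1)
multi = decode(FRAMES, toy_trigram_logp, max_tokens=2)
print(single[1])  # ['a', 'c', 'd']: the "b" path was discarded at state "c"
print(multi[1])   # ['b', 'c', 'd']: the better-scoring path survives
```

With one token per state, the path through "b" is dropped at "c" before the favourable trigram ("b", "c", "d") can be applied. Keeping two tokens in that state preserves it without creating a separate copy of the state for each history.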

Original language: English
Title of host publication: 6th International Conference on Spoken Language Processing, ICSLP 2000
Publisher: International Speech Communication Association
ISBN (Electronic): 7801501144, 9787801501141
State: Published - 2000
Event: 6th International Conference on Spoken Language Processing, ICSLP 2000 - Beijing, China
Duration: 16 Oct 2000 → 20 Oct 2000

Publication series

Name: 6th International Conference on Spoken Language Processing, ICSLP 2000

Conference

Conference: 6th International Conference on Spoken Language Processing, ICSLP 2000
Country/Territory: China
City: Beijing
Period: 16/10/00 → 20/10/00
