A Syllable-Based Turkish Speech Recognition System by Using Time Delay Neural Networks (TDNNs)

CAN BUĞLALILAR B., ARTUNER H.

International Conference of Soft Computing and Pattern Recognition (SoCPaR), Ha-Noi, Vietnam, 15 December 2013 - 18 December 2015, pp.219-224, (Full Text Paper)

Publication Type: Conference Paper / Full Text Paper
Volume Number:
DOI Number: 10.1109/socpar.2013.7054130
City of Publication: Ha-Noi
Country of Publication: Vietnam
Page Numbers: pp.219-224
Hacettepe University Affiliated: Yes

Abstract

In this paper, we present a syllable-based model for Turkish speech recognition, in which syllables serve as the recognition units. The main goal of the model is to recognize as much as possible of a given continuous speech signal by identifying only a small set of syllables in the language. For that purpose, only the syllable types with higher frequency are selected for recognition. Using longer recognition units improves recognition performance, since the endpoints of syllables are easier to detect than those of phonemes. On the other hand, word-based recognition requires a very large dataset that includes all the words and word forms in the language, which is a challenge in its own right. We therefore take advantage of Turkish being an orthographically transparent and syllabified language. Our model employs time delay neural networks (TDNNs) for learning syllables. We achieve an accuracy of 65.6% on our large-vocabulary continuous speech corpus. In addition, we define an algorithm for the automatic detection of syllable boundaries, which achieves an accuracy of 44%. The automatic syllable boundary detection module is used for the recognition of isolated syllables rather than continuous speech.
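
The idea of covering most of an utterance with a small set of high-frequency syllable types can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `select_frequent_syllables`, the cutoff `k`, and the toy syllable stream are all hypothetical, and the input is assumed to be already syllabified.

```python
from collections import Counter


def select_frequent_syllables(syllables, k):
    """Return the k most frequent syllable types and the fraction of
    syllable tokens they cover (hypothetical helper, for illustration)."""
    counts = Counter(syllables)
    # most_common orders by descending count; ties keep first-seen order.
    top = [syl for syl, _ in counts.most_common(k)]
    covered = sum(counts[syl] for syl in top)
    coverage = covered / len(syllables) if syllables else 0.0
    return top, coverage


# Toy, hand-syllabified token stream (made-up data, not the paper's corpus).
tokens = ["ki", "tap", "lar", "ev", "ler", "ki",
          "lar", "ge", "lir", "lar", "ki", "ev"]
top, coverage = select_frequent_syllables(tokens, 3)
print(top, round(coverage, 2))
```

In this toy stream, three of the seven syllable types account for two thirds of the tokens, which is the kind of trade-off between inventory size and coverage that the abstract describes.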