
Unicode Sinhala and Phonetic English Bi-directional Conversion for Sinhala Speech Recognizer


dc.contributor.author Punchimudiyanse, M.
dc.contributor.author Meegama, R.G.N.
dc.date.accessioned 2017-10-23T09:06:55Z
dc.date.available 2017-10-23T09:06:55Z
dc.date.issued 2015
dc.identifier.citation Punchimudiyanse, M., Meegama, R.G.N. (2015). "Unicode Sinhala and Phonetic English Bi-directional Conversion for Sinhala Speech Recognizer", IEEE International Conference on Industrial and Information Systems 2015, pp. 01-06 en_US, si_LK
dc.identifier.uri http://dr.lib.sjp.ac.lk/handle/123456789/6035
dc.description.abstract Attached en_US, si_LK
dc.description.abstract An automated speech recognizer (ASR) with a large vocabulary is yet to be developed for the Sinhala language because gathering the training data needed to build a language model is time-consuming. Building the dictionary and the language model requires non-English text, in our case Sinhala Unicode, to be transcribed into phonetic English text. Unlike text-to-speech conversion, which only requires transcribing the non-English text into phonetic English text, an ASR must also correctly reproduce the original-language text when phonetic English text is produced as the output of the speech recognizer. In the present research, newspaper articles are used to gather a large set of sentences to build a language model with thousands of words for the Sphinx ASR. We present a decoder algorithm that produces phonetic English text from Sinhala Unicode text and an encoder algorithm that correctly reproduces Unicode Sinhala text from phonetic English. For a near-phonetic tag set for the Sinhala alphabet, results indicate 100% accuracy for the decoder algorithm, while for numberless text the accuracy of the encoder algorithm stands at 98.61% for distinct phonetic English words.
dc.language.iso en_US en_US, si_LK
dc.publisher IEEE International Conference on Industrial and Information Systems 2015 en_US, si_LK
dc.subject Sinhala to Phonetic English en_US, si_LK
dc.subject Phonetic English to Sinhala en_US, si_LK
dc.subject Sinhala ASR en_US, si_LK
dc.subject Sinhala Phonetic Tag set en_US, si_LK
dc.subject Sphinx en_US, si_LK
dc.title Unicode Sinhala and Phonetic English Bi-directional Conversion for Sinhala Speech Recognizer en_US, si_LK
dc.type Article en_US, si_LK
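
The abstract describes a two-way conversion between Sinhala Unicode text and phonetic English text. The sketch below illustrates the general idea only. Its tag set, the handful of characters it covers, and the function names `sinhala_to_phonetic` and `phonetic_to_sinhala` are all hypothetical and are not taken from the paper, whose actual tag set and algorithms are not given in this record. It assumes a consonant with no following vowel sign carries an inherent "a", and that the hal kirima (U+0DCA) suppresses that vowel.

```python
# Illustrative sketch only: a tiny, hypothetical tag set (NOT the paper's).
CONSONANTS = {"\u0d9a": "k", "\u0db8": "m", "\u0dbd": "l"}            # ka, ma, la
INDEPENDENT_VOWELS = {"\u0d85": "a", "\u0d86": "aa", "\u0d89": "i"}  # a, aa, i
VOWEL_SIGNS = {"\u0dcf": "aa", "\u0dd2": "i"}                        # dependent signs
HAL = "\u0dca"                                                       # hal kirima (virama)

TAG_TO_CONSONANT = {tag: ch for ch, tag in CONSONANTS.items()}
TAG_TO_INDEPENDENT = {tag: ch for ch, tag in INDEPENDENT_VOWELS.items()}
TAG_TO_SIGN = {tag: ch for ch, tag in VOWEL_SIGNS.items()}


def sinhala_to_phonetic(word: str) -> str:
    """Sinhala Unicode -> space-separated phonetic tags."""
    phones, i = [], 0
    while i < len(word):
        ch = word[i]
        nxt = word[i + 1] if i + 1 < len(word) else ""
        if ch in CONSONANTS:
            phones.append(CONSONANTS[ch])
            if nxt == HAL:              # bare consonant, no vowel
                i += 2
            elif nxt in VOWEL_SIGNS:    # consonant + explicit vowel sign
                phones.append(VOWEL_SIGNS[nxt])
                i += 2
            else:                       # inherent vowel "a" (assumption)
                phones.append("a")
                i += 1
        elif ch in INDEPENDENT_VOWELS:
            phones.append(INDEPENDENT_VOWELS[ch])
            i += 1
        else:
            raise ValueError(f"character not in the toy tag set: {ch!r}")
    return " ".join(phones)


def phonetic_to_sinhala(phonetic: str) -> str:
    """Space-separated phonetic tags -> Sinhala Unicode."""
    phones, out, i = phonetic.split(), [], 0
    while i < len(phones):
        tag = phones[i]
        nxt = phones[i + 1] if i + 1 < len(phones) else None
        if tag in TAG_TO_CONSONANT:
            out.append(TAG_TO_CONSONANT[tag])
            if nxt == "a":              # inherent vowel: nothing to write
                i += 2
            elif nxt in TAG_TO_SIGN:    # attach the dependent vowel sign
                out.append(TAG_TO_SIGN[nxt])
                i += 2
            else:                       # no vowel follows: add hal kirima
                out.append(HAL)
                i += 1
        elif tag in TAG_TO_INDEPENDENT:
            out.append(TAG_TO_INDEPENDENT[tag])
            i += 1
        else:
            raise ValueError(f"tag not in the toy tag set: {tag!r}")
    return "".join(out)


word = "\u0d85\u0db8\u0dca\u0db8\u0dcf"  # the word "amma"
tags = sinhala_to_phonetic(word)
restored = phonetic_to_sinhala(tags)
```

In this toy setting the round trip is lossless because every tag sequence maps back to exactly one spelling. A realistic tag set may lose that property wherever several spellings share one pronunciation, which is one plausible reason the reverse direction is the harder of the two.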

