29 November 2022
Writer Parinda Jangsook
An engineering professor from Chula has designed “Gowajee”, a Thai-language speech recognition AI that delivers Thai speech-to-text and Thai text-to-speech with the accuracy of a native speaker while keeping users’ data secure. Already rolled out in call centers and in a screening process for patients with depression, Gowajee is set to be adapted to many other functions.
‘OK, Google’
The main objective of Gowajee is to make AI technology more accessible to Thai people. We are getting used to giving voice commands to AIs like Google or Siri to search or carry out tasks instead of typing. But Thai speakers may have felt that those AI voice assistants don’t seem to understand the tones of spoken Thai. Often the transcription doesn’t match our words, so we have to adjust our Thai pronunciation to suit an AI that a foreign company built for multilingual use and aimed mainly at major languages such as English.
Realizing this problem, a team led by Dr. Ekapol Chuangsuwanich of the Department of Computer Engineering, Faculty of Engineering, Chulalongkorn University has developed “Gowajee”, a genuine Thai speech-recognition AI that understands and executes commands in the Thai language more naturally and accurately. Actual usage has shown only a 9% incidence of linguistic inaccuracy, compared to 15% for other language-recognition AIs.
The name Gowajee derives from the words ‘Go’ and ‘Wajee’, the latter meaning ‘words’. It is designed as a wake command similar to ‘OK Google’ or ‘Hey Siri’, and was chosen so as not to duplicate any word already used in the Thai language, making it unique to this Thai AI.
Foreign-made AI voice generators often misunderstand Thai, mainly because the structure of the Thai language differs from that of English. Different pronunciations, tones, inflections, and homophones can lead to misinterpretation, and this more complicated structure can be an obstacle to developing Thai speech-to-text and text-to-speech technology. Dr. Ekapol’s solution is to “create the most extensive Thai language database to enable Gowajee, or this Thai AI, to learn about the Thai language more effectively”.
Dr. Ekapol and his team have been compiling a Thai sound database for Gowajee since 2017. As he recalled,
“…we applied a variety of methods and formats such as creating a website for people to log in and read a text to be stored as a sound database, getting people to engage in a conversation or actors to perform emotional speaking. Altogether, we achieved a compilation totaling five thousand hours which made us confident that we had a big enough database to transcribe Thai accurately so as to achieve the best possible speech-to-text in Thai / Thai text-to-speech and speech recognition.”
This database was enough to enable the Gowajee team to develop an accurate Thai-language speech recognition AI with three main features:
Speech-to-text turns speech into text. “For example, if we record a lecture, Gowajee will transcribe it into texts for us to read without having to transcribe it ourselves,” Dr. Ekapol suggested.
Text-to-speech converts a written passage into spoken words, in the way we may know from Google or Siri, except that Gowajee delivers more natural speech thanks to its larger Thai database.
Speaker verification confirms identity by voice. It can be used when contacting a call center, or to indicate the speaker and time frame.
Since it was developed, Gowajee has been used by universities and by public- and private-sector organizations, especially in call centers, for both Thai speech-to-text and text-to-speech. Gowajee’s error rate is only 9%, compared with 15% for other AIs.
“Most clients have been satisfied with Gowajee’s level of accuracy. It is an improved version of what they have previously used, and the price is also more affordable. As for the errors, we are certain that they will decrease as the database grows.”
Because its database includes voices that convey various emotions, Gowajee has been able to help develop the systems used in DMIND for screening patients with depression.
“DMIND proved to be very challenging for us. Aside from transcriptions, a model for classifying and decoding emotions from voices in at-risk groups is also needed. Crying is usually involved which makes voices difficult to transcribe and decode, but Gowajee was able to do considerably well by determining the important keywords for decoding.”
Gowajee and AI technology can be used in many other areas such as …
“Data safety” is what puts Gowajee above other speech-recognition AI voice generators. As Dr. Ekapol tells us, “Normally, other transcription and speech recognition programs store their data on the cloud or compile them on users’ computers. With Gowajee, all the data is stored on the user’s database, ensuring its safety. This is useful for organizations like banks which need high data security.”
AIs are becoming increasingly capable, with linguistic abilities that are getting closer and closer to those of human beings, which has caused many people to worry about being replaced by technology. Dr. Ekapol, however, sees Gowajee and other Thai transcription AIs only as enablers that will make life easier now and in the future.
“AIs aren’t that disruptive to our lives. We are disrupting ourselves. Aging societies and a shortage of working-age labor are making it necessary for us to create technologies to substitute what we can’t find humans to do.” Dr. Ekapol concluded, “I’m not expecting that my work is going to be helpful to the aged of today, but I’m thinking that in the future when I reach an old age I will be making use of these technologies.”
The Thai speech recognition AI (both speech-to-text and text-to-speech) that Dr. Ekapol has dedicated himself to developing is therefore not a fearsome technology, nor one that will replace human labor, but one that will bring more ease and convenience to many people. The ability to convert speech to text and text to speech can be applied in many areas. As Thailand becomes an aging society, speech recognition technologies can be used to improve quality of life.
For more information and a trial of the Gowajee Thai speech recognition AI, please visit https://www.gowajee.ai/.