
“Gowajee” — a Thai Speech-Recognition AI from Chula


An engineering professor from Chula has designed “Gowajee”, a Thai-language speech AI that delivers Thai speech-to-text and text-to-speech with the accuracy of a native speaker while keeping users’ data secure. Already rolled out in call centers and in screening patients for depression, Gowajee is set to be adapted for many other uses.


‘OK, Google’

The main objective of Gowajee is to make AI technology more accessible to Thai people, as we grow used to giving voice commands to AIs like Google or Siri to search or carry out tasks instead of typing. But as a Thai speaker, have you ever felt that those AI voice generators don’t seem to understand the tones of spoken Thai? Often the transcription doesn’t match what we said, so we end up adjusting our Thai pronunciation to suit an AI that a foreign company built for multilingual use, mainly for major languages such as English.


Recognizing this problem, a team led by Dr. Ekapol Chuangsuwanich of the Department of Computer Engineering, Faculty of Engineering, Chulalongkorn University has developed “Gowajee”, a genuinely Thai speech-recognition AI that understands and executes commands in the Thai language more naturally and accurately. In actual use, it has shown an error rate of only 9%, compared to 15% for other speech-recognition AIs.

Dr. Ekapol Chuangsuwanich
Department of Computer Engineering, Faculty of Engineering, Chulalongkorn University

The name Gowajee derives from the words ‘Go’ and ‘Wajee’, the latter meaning ‘words’. It is designed as a command similar to ‘OK Google’ or ‘Hey Siri’, and was chosen so that it does not duplicate any existing word in the Thai language, making it unique to this Thai AI.

The Challenge of Developing Thai AI

Foreign-made AI voice generators often misunderstand Thai, mainly because the structure of the Thai language differs from that of English. Differences in pronunciation, tones, inflections, and homophones can all lead to misinterpretation, and this complexity can be an obstacle to developing Thai speech-to-text and text-to-speech technology. Dr. Ekapol’s solution is to “create the most extensive Thai language database to enable Gowajee, or this Thai AI, to learn about the Thai language more effectively”.

Thai language AI with a Thai sound database

Dr. Ekapol and his team have been compiling a Thai sound database for Gowajee since 2017. As he recalled,

“…We applied a variety of methods and formats, such as creating a website where people could log in and read a text aloud to be stored in the sound database, getting people to engage in conversation, and having actors perform emotional speech. Altogether, we compiled five thousand hours of recordings, which made us confident that we had a database big enough to transcribe Thai accurately and to achieve the best possible Thai speech-to-text, text-to-speech, and speech recognition.”

This database was enough to enable the Gowajee team to develop an accurate Thai speech-recognition AI with three main features:

Automated Speech Recognition (ASR)

ASR turns speech into text. “For example, if we record a lecture, Gowajee will transcribe it into text for us to read, without our having to transcribe it ourselves,” Dr. Ekapol explained.

Text-to-Speech (TTS)

TTS converts a written passage into spoken words, much as Google or Siri does, except that Gowajee delivers more natural speech thanks to its larger Thai database.

Automatic Speaker Verification (ASV)

ASV verifies a speaker’s identity by voice. It can be used when a customer contacts a call center, or to indicate who spoke and when in a recording.

Gowajee – a perfect solution for call centers

Since its development, Gowajee has been used by various organizations, including universities and public- and private-sector agencies, especially in call centers, for both Thai speech-to-text and text-to-speech. Gowajee’s error rate is only 9%, compared to 15% for other AIs.


“Most clients have been satisfied with Gowajee’s level of accuracy. It is an improvement on what they used previously, and the price is also more affordable. As for the errors, we are certain that they will decrease as the database grows.”

In search of meaning in the voice: Gowajee helps to screen patients with depression

Because the team gathered voice data conveying a range of emotions, Gowajee has been able to help develop the systems used in DMIND for screening patients with depression.

“DMIND proved to be very challenging for us. Aside from transcription, it also needs a model for classifying and decoding emotions from the voices of at-risk groups. Crying is usually involved, which makes voices difficult to transcribe and decode, but Gowajee was able to do considerably well by determining the important keywords for decoding.”

DMIND is an application that utilizes Gowajee.

How can Gowajee be adapted for use in other areas?

Gowajee and AI technology can be used in many other areas, such as:

  • Acting as a dental assistant that takes notes while the dentist is working on a patient.
  • Detecting stroke risk in patients with slurred speech.
  • Acting as a life coach that asks questions and analyzes people’s life goals from video interviews, for use in student and employee orientation.
  • Modifying and amplifying sounds so that the hard-of-hearing can hear more clearly.

Your data is safe with Gowajee

“Data safety” is what puts Gowajee above other speech-recognition AI voice generators. As Dr. Ekapol tells us, “Normally, other transcription and speech recognition programs store their data on the cloud or compile it on users’ computers. With Gowajee, all the data is stored in the user’s own database, ensuring its safety. This is useful for organizations like banks, which need high data security.”

AIs are becoming increasingly clever, with linguistic abilities that are getting closer and closer to those of human beings, which has caused many people to worry about being replaced by technology. Dr. Ekapol, however, sees Gowajee and other Thai transcription AIs simply as enablers that will make life easier for us, now and in the future.

“AIs aren’t that disruptive to our lives. We are disrupting ourselves. Aging societies and a shortage of working-age labor are making it necessary for us to create technologies to substitute for what we can’t find humans to do.” Dr. Ekapol concluded, “I’m not expecting my work to be helpful to the elderly of today, but I’m thinking that in the future, when I reach old age, I will be making use of these technologies.”


The Thai speech recognition AI (both speech-to-text and text-to-speech) that Dr. Ekapol has dedicated himself to developing is therefore not a fearsome technology, nor one that will replace human labor. Instead, it will bring more ease and convenience to many people. The ability to convert speech to text and text to speech can, on its own, be applied in many areas. As we become an aging society, speech recognition technologies can be applied to improve quality of life.

For more information and a trial of Gowajee Thai speech recognition AI, please visit https://www.gowajee.ai/.
