
【CUHK Distinguished Lecture Series】AI in Spoken Language Technologies for Learning and Wellbeing

  • 2018.09.07
  • Event

Topic: AI in Spoken Language Technologies for Learning and Wellbeing

Speaker: Prof. Helen Meng

Date: Monday, September 10, 2018

Time: 16:30-17:30

Venue: Governing Board Meeting Room, Dao Yuan Building

Language: English

Speaker's Profile:

Helen Meng is the Patrick Huen Wing Ming Professor of Systems Engineering and Engineering Management at The Chinese University of Hong Kong (CUHK). She is the Founding Director of the CUHK Ministry of Education (MoE)-Microsoft Key Laboratory for Human-Centric Computing and Interface Technologies, the Tsinghua-CUHK Joint Research Center for Media Sciences, Technologies and Systems, and the CUHK Stanley Ho Big Data Decision Analytics Research Center. She also established the CAS-CUHK Shenzhen Institute of Advanced Technology Ambient Intelligence and Multimodal Systems Laboratory and served as its Director from 2007 to 2011.

Abstract:

Spoken language is a primary form of human communication. AI in spoken language technologies must incorporate knowledge of acoustics, phonetics and linguistics in analyzing speech. While AI has made great strides in general speech recognition, even achieving human parity in performance, our research team at CUHK has been focusing on the problems of recognizing and analyzing non-native learners' speech for the purpose of mispronunciation detection and diagnosis in computer-aided pronunciation training. To generate personalized corrective feedback that enhances the learning experience, we have also developed an approach that uses phonetic posteriorgrams (PPGs) for personalized, cross-lingual text-to-speech synthesis given arbitrary textual input, based on voice conversion techniques. We have also extended our work to benefit those with speech disorders, focusing on assistive technologies for augmentative and alternative communication, as well as automated recognition and analysis of dysarthric speech recordings; these analyses are intended to inform intervention strategies. Additionally, voice conversion has been further developed to restore disordered speech to normal speech, based only on very sparse data from the target speaker. In this talk, I will present the challenges in these problems, our approaches and solutions, as well as our ongoing work.
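To make the PPG idea concrete, below is a minimal, self-contained sketch of a PPG-based voice conversion pipeline, not the speaker's actual system. Everything here is a hypothetical stand-in: the random linear layers W_asr and W_syn take the place of a trained speaker-independent ASR acoustic model and a trained target-speaker synthesis model, and the dimensions and function names are illustrative assumptions.

```python
# Minimal sketch of PPG-based voice conversion (illustrative only).
# Toy linear models stand in for trained neural networks.
import numpy as np

N_ACOUSTIC = 13   # per-frame acoustic feature dimension, e.g. MFCCs (assumption)
N_PHONES = 40     # size of the phonetic class inventory (assumption)
N_MEL = 80        # mel-spectrogram dimension for the target voice (assumption)

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Stage 1: a speaker-independent acoustic model maps each frame of
# source speech to a posterior distribution over phonetic classes.
# The sequence of these distributions is the phonetic posteriorgram (PPG).
W_asr = rng.normal(size=(N_ACOUSTIC, N_PHONES))  # stand-in for a trained ASR model

def extract_ppg(frames):
    """frames: (T, N_ACOUSTIC) -> PPG: (T, N_PHONES), each row sums to 1."""
    return softmax(frames @ W_asr)

# Stage 2: a target-speaker synthesis model maps the PPG trajectory to
# the target speaker's acoustic features. Because PPGs capture phonetic
# content while abstracting away speaker identity, the same mapping can
# render content from another speaker, or another language, in the
# target voice.
W_syn = rng.normal(size=(N_PHONES, N_MEL))  # stand-in for a trained synthesis model

def ppg_to_mel(ppg):
    """ppg: (T, N_PHONES) -> mel features in the target voice: (T, N_MEL)."""
    return ppg @ W_syn

# Toy run: 100 frames of source speech.
src_frames = rng.normal(size=(100, N_ACOUSTIC))
ppg = extract_ppg(src_frames)
mel = ppg_to_mel(ppg)          # a vocoder would turn this into a waveform
print(ppg.shape, mel.shape)    # (100, 40) (100, 80)
```

In a real system both stages are trained networks and the mel output is passed through a vocoder; the sketch only shows why the PPG acts as a speaker-independent intermediate representation that enables personalized, cross-lingual synthesis from sparse target-speaker data.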