Current Research Projects

Our collaborators at Texas A&M University

Development of L2-ARCTIC corpus

We are developing an L2 English speech corpus based on the Carnegie Mellon ARCTIC corpus to provide a repository of language data for voice conversion, accent conversion, mispronunciation detection, and pronunciation research. The corpus will contain four speakers (two male, two female) from each of six language backgrounds (Hindi, Korean, Mandarin, Spanish, Arabic, and Vietnamese). This project is carried out in coordination with Texas A&M University. Learn more here and here.

Golden Speaker Builder

We are testing whether learners demonstrate greater pronunciation improvement when they listen to a model voice maximally similar to their own than when they listen to someone else’s voice. The model voice is the learner’s own voice synthesized with that of a native speaker, so that it keeps the voice quality of the learner and the accent of the native speaker (a so-called “golden speaker”). This project is carried out in coordination with Texas A&M University. Learn more here.

Mispronunciation Awareness

We are investigating whether learners can notice the pronunciation differences between their own voice and a model voice, and whether they do so better when the model voice is a “golden speaker” than when it is the voice of a different speaker.

Our current research projects are funded through the NSF Robust Intelligence directive ($565,647) and the NSF Cyberlearning directive ($316,000).