Development of the L2-ARCTIC corpus
We are developing an L2 English spoken corpus based on the Carnegie Mellon ARCTIC corpus to provide a repository of language data for voice conversion, accent conversion, mispronunciation detection, and pronunciation research. The corpus will contain four speakers (two male, two female) from each of six language backgrounds (Hindi, Korean, Mandarin, Spanish, Arabic, and Vietnamese), for 24 speakers in total. This project is done in coordination with Texas A&M University. Learn more here and here.
Golden Speaker Builder
We are testing whether learners demonstrate greater pronunciation improvement when they listen to a model voice maximally similar to their own than when they listen to someone else’s voice. The model voice is the learner’s own voice synthesized with that of a native speaker, so that it keeps the voice quality of the learner and the accent of the native speaker (a so-called “golden speaker”). This project is done in coordination with Texas A&M University. Learn more here.
Mispronunciation Awareness
We are investigating whether learners are able to notice the pronunciation differences between their voice and that of a model voice, and whether they are able to do so better when the model voice is a “golden speaker” than when the model is the voice of a different speaker.