Laurent Besacier
Laboratoire LIG/GETALP
JEP Oral Session O1: Language Diversity, Monday 9 June, 13:30-15:30
Paper 1580
Multilingual acoustic modeling for automatic recognition of non-native speech
- Tien-Ping Tan ( Laboratoire LIG/GETALP)
- Laurent Besacier ( Laboratoire LIG/GETALP)
- Abstract: Automatic speech recognition performance on non-native speakers is still poor. Non-native speech has different characteristics from native speech, as it is influenced by the speaker's mother tongue, and sufficient non-native speech to train a non-native acoustic model is not always available. In this paper, we investigate the use of multilingual acoustic models to adapt the target-language acoustic model for non-native speakers. A hybrid of acoustic interpolation and merging is proposed for adapting the target acoustic model using multilingual acoustic models, without requiring the raw corpus. Three types of multilingual resources were tested for adapting to non-native speakers: the speakers' native language, non-native speech in a language different from the target language but produced by speakers of the same native origin, and a language close to the target language.
- article
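The acoustic interpolation mentioned in the abstract can be sketched as a weighted blend of per-phone model parameters from a target and a multilingual source model. The dict-based model layout, the `lam` weight, and the restriction to mean vectors below are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch (not the authors' implementation): linear interpolation
# of Gaussian mean vectors between a target-language acoustic model and a
# multilingual source model, for phones the two models share.

def interpolate_means(target, source, lam=0.5):
    """Blend per-phone mean vectors: lam * target + (1 - lam) * source.

    `target` and `source` map phone labels to mean vectors (lists of floats);
    phones missing from `source` keep their target-model means.
    """
    adapted = {}
    for phone, mean in target.items():
        if phone in source:
            src = source[phone]
            adapted[phone] = [lam * t + (1 - lam) * s
                              for t, s in zip(mean, src)]
        else:
            adapted[phone] = list(mean)
    return adapted

# Example: adapt a toy 2-dimensional model.
target = {"a": [1.0, 2.0], "i": [0.0, 4.0]}
source = {"a": [3.0, 0.0]}
print(interpolate_means(target, source, lam=0.5))
# prints {'a': [2.0, 1.0], 'i': [0.0, 4.0]}
```

Working directly on model parameters rather than raw audio is what makes the approach usable when, as the abstract notes, the raw corpus is not available.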
Paper 1624
Automatic speech recognition for the Khmer language: which units for language modeling and acoustic modeling?
- Sopheap Seng ( Laboratoire Informatique de Grenoble)
- Sethserey Sam ( Laboratoire Informatique de Grenoble)
- Viet-Bac Le ( Laboratoire Informatique de Grenoble)
- Brigitte Bigi ( Laboratoire Informatique de Grenoble)
- Laurent Besacier ( Laboratoire Informatique de Grenoble)
- Abstract: In this paper we present an overview of the development of a large-vocabulary continuous speech recognition system for the Khmer language. We describe the methods and tools used to collect language resources for the rapid development of an ASR system for a new under-resourced language. Faced with the scarcity of text data and with word segmentation errors in language modeling, we investigate how different views of the text data (word and sub-word units) can be exploited for Khmer language modeling. We propose to work both at the model level (building hybrid vocabularies that mix word and sub-word units) and at the ASR output level (system combination). For acoustic modeling, we use basic linguistic rules to automatically generate grapheme- or phoneme-based pronunciation dictionaries. An experimental framework is set up to evaluate the performance of each modeling unit.
- article
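Khmer text is written without word boundaries, so mixing word and sub-word units starts from a segmentation step. A minimal sketch, under assumed details not in the abstract (greedy longest-match search, single characters as the sub-word fallback, a Latin-script stand-in for Khmer):

```python
# Illustrative sketch (assumptions, not the paper's system): greedy
# longest-match segmentation of unsegmented text against a hybrid
# vocabulary, backing off to single characters as sub-word units
# when no vocabulary word matches.

def segment(text, vocab, max_len=4):
    """Split `text` into vocabulary words where possible, else characters."""
    units = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; a 1-character piece always matches.
        for n in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + n]
            if n == 1 or piece in vocab:
                units.append(piece)
                i += n
                break
    return units

# Toy example standing in for unsegmented Khmer script.
vocab = {"speech", "recognition"}
print(segment("speechrecognitionxyz", vocab, max_len=11))
# prints ['speech', 'recognition', 'x', 'y', 'z']
```

The resulting unit stream is what a hybrid word/sub-word vocabulary would then be built from; real systems weigh competing segmentations rather than committing greedily.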
JEP Oral Session O2: Pathologies, Tuesday 10 June, 16:30-18:30
Paper 1670
Adapting a deaf participant's lip production and classification: the case of vowels in the context of the LPC (French Cued Speech) code
- Noureddine Aboutabit ( Grenoble Images Parole Signal Automatique, département Parole & Cognition)
- Denis Beautemps ( Grenoble Images Parole Signal Automatique, département Parole & Cognition)
- Olivier Mathieu ( Grenoble Images Parole Signal Automatique, département Parole & Cognition)
- Laurent Besacier ( Laboratoire d'Informatique de Grenoble)
- Abstract: The phonetic translation of Cued Speech (CS) gestures requires combining the manual CS information with the lip information, taking into account the desynchronization delay between these two flows of information (Attina et al. [2], Aboutabit et al. [7]). This contribution focuses on modeling the lip flow in the case of French vowels. Classification models were previously developed for a professional normal-hearing CS speaker (Aboutabit et al. [7]); these models serve as a reference. Here, we address the case of a deaf CS speaker and discuss the possibilities of classification. The best performance (92.8%) is obtained by adapting the deaf speaker's data to the reference models.
- article
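The "adapt, then classify against reference models" scheme the abstract reports can be sketched as a global bias shift of the new speaker's lip features followed by nearest-mean classification. The bias estimate, the nearest-mean rule, and all numbers below are hypothetical illustrations, not the authors' method or data.

```python
# Illustrative sketch (not the authors' method): nearest-mean vowel
# classification against reference class models, after a simple global
# bias adaptation that shifts the new speaker's features toward the
# reference feature space.

def global_bias(ref_mean, spk_vectors):
    """Offset between the reference global mean and the speaker's mean."""
    dim = len(ref_mean)
    spk_mean = [sum(v[d] for v in spk_vectors) / len(spk_vectors)
                for d in range(dim)]
    return [ref_mean[d] - spk_mean[d] for d in range(dim)]

def classify(vec, class_means, bias):
    """Label of the closest reference class mean after applying `bias`."""
    shifted = [x + b for x, b in zip(vec, bias)]

    def dist2(label):
        return sum((s - m) ** 2 for s, m in zip(shifted, class_means[label]))

    return min(class_means, key=dist2)

# Toy 2-dimensional lip-feature data.
class_means = {"a": [4.0, 1.0], "i": [1.0, 4.0]}   # reference vowel models
ref_global = [2.5, 2.5]                             # reference global mean
speaker_data = [[3.0, 0.0], [0.0, 3.0]]             # offset deaf-speaker data
bias = global_bias(ref_global, speaker_data)
print(classify([3.0, 0.0], class_means, bias))
# prints a
```

A single global shift is the simplest form of such adaptation; per-class or affine transforms are common refinements when enough speaker data is available.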