Pascal Nocéra
Université d'Avignon
JEP poster session P5, Thursday 12 June, 10:30 to 12:30
-
paper 1615
Reconnaissance de la parole continue à grand vocabulaire en vietnamien, une langue syllabique tonale (Large-vocabulary continuous speech recognition in Vietnamese, a tonal syllabic language)
- Hong-Quang Nguyen (Université d'Avignon et des pays de Vaucluse)
- Pascal Nocéra (Université d'Avignon et des pays de Vaucluse)
- Eric Castelli (L'Institut Polytechnique de Ha Noi)
- Trinh Van-Loan (L'Institut Polytechnique de Ha Noi)
- Abstract: This paper proposes a method for building a Vietnamese Large Vocabulary Continuous Speech Recognition (LVCSR) system. The differences between Vietnamese and European languages are analyzed and used to adapt an LVCSR system designed for European languages to Vietnamese. Experiments are carried out on the VNSPEECHCORPUS. The results show that exploiting Vietnamese language characteristics increases the accuracy of the Vietnamese recognition system.
- article
JEP oral session O4, Speech and speaker recognition, Thursday 12 June, 14:00 to 16:00
-
paper 1574
Enrichissement dynamique du vocabulaire à partir du Web (Dynamic vocabulary enrichment from the Web)
- Stanislas Oger (Université d'Avignon)
- Georges Linarès (Université d'Avignon)
- Frédéric Béchet (Université d'Avignon)
- Pascal Nocéra (Université d'Avignon)
- Abstract: Most Web-based methods for lexicon augmentation consist of capturing global semantic features of the targeted domain in order to collect relevant documents from the Web. We suggest that the local context of out-of-vocabulary (OOV) words contains relevant information about those words. Using this information, we propose to use the Web to build locally augmented lexicons, which are used in a final local decoding pass. We first demonstrate the relevance of the Web for OOV word retrieval. Then, different methods are proposed to retrieve the hypothesis words. Finally, we present the integration of new words into the transcription process, based on part-of-speech models. This technique recovers 7.6% of the significant OOV words and slightly improves the accuracy of the system.
- article