Gérard Bailly

GIPSA-Lab

JEP oral session O5, Corpus, Thursday 12 June, 16:30 to 17:30

Paper 1606: Improving the conversion of whispered speech recorded with a NAM sensor into audible speech

Viet-Anh Tran (GIPSA-Lab)

Gérard Bailly (GIPSA-Lab)

Hélène Loevenbruck (GIPSA-Lab)

Christian Jutten (GIPSA-Lab)

Abstract: The NAM-to-speech conversion proposed by Toda and colleagues, which converts Non-Audible Murmur (NAM) to audible speech using a statistical mapping trained on aligned corpora, is a very promising technique, but its performance is still insufficient. In this paper, we present our current work on improving the intelligibility and naturalness of speech synthesized from whispered speech with this technique. The first system aims to improve F0 estimation and the voicing decision: a simple neural network detects voiced segments in the whisper, while a GMM, trained on the voiced segments of the training data, estimates a continuous melodic contour. In the second system, we attempt to integrate visual information to improve spectral estimation, F0 estimation and the voicing decision.
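To make the first system more concrete, here is a minimal sketch of how a voicing decision can gate a GMM-based F0 estimate. It is an illustration under stated assumptions, not the authors' implementation: the input feature is reduced to a single scalar per frame (real systems use multidimensional spectral vectors), the voicing flags are supplied directly rather than produced by a neural network, and the mapping is the frame-wise conditional mean of a joint GMM rather than any trajectory-based estimate. The names `gmm_conditional_mean`, `convert_f0` and all parameter values are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmm_conditional_mean(x, components):
    """Estimate y from x as E[y | x] under a joint GMM over (x, y).

    Each component is a dict with keys: weight, mean_x, mean_y, var_x, cov_xy.
    Weights are assumed to sum to 1.
    """
    likelihoods = [c["weight"] * gaussian_pdf(x, c["mean_x"], c["var_x"])
                   for c in components]
    total = sum(likelihoods)
    if total == 0.0:
        # x is numerically far from every component: fall back to the priors.
        posteriors = [c["weight"] for c in components]
    else:
        posteriors = [lik / total for lik in likelihoods]

    estimate = 0.0
    for p, c in zip(posteriors, components):
        # Per-component linear regression of y on x.
        slope = c["cov_xy"] / c["var_x"]
        estimate += p * (c["mean_y"] + slope * (x - c["mean_x"]))
    return estimate


def convert_f0(features, voiced_flags, components):
    """Return one F0 value per frame; frames flagged unvoiced get 0.0."""
    return [gmm_conditional_mean(x, components) if voiced else 0.0
            for x, voiced in zip(features, voiced_flags)]


# Toy joint GMM (hypothetical values, not taken from the paper).
toy_gmm = [
    {"weight": 0.5, "mean_x": -1.0, "mean_y": 110.0, "var_x": 1.0, "cov_xy": 5.0},
    {"weight": 0.5, "mean_x": 1.0, "mean_y": 180.0, "var_x": 1.0, "cov_xy": 8.0},
]

# Voicing flags would come from the voicing detector; here they are given.
contour = convert_f0([-1.2, 0.0, 0.9], [True, False, True], toy_gmm)
print(contour)
```

Separating the two decisions means the GMM only has to model F0 where it is defined, which is why the sketch evaluates it on voiced frames only and leaves unvoiced frames at zero.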
