Unsupervised speech-to-speech translation from monolingual data

Translatotron 3 is an unsupervised speech-to-speech translation architecture developed by Google Research. It addresses the scarcity of parallel speech data, a major challenge in machine translation. Using unsupervised machine translation techniques, Translatotron 3 learns speech-to-speech translation from monolingual data alone. This makes it possible to cover more language pairs and to preserve non-textual speech attributes such as pauses, speaking rate, and speaker identity. In experiments on translation between Spanish and English, Translatotron 3 outperforms a baseline cascade system.

https://blog.research.google/2023/12/unsupervised-speech-to-speech.html
