StreamVC is a streaming voice conversion system that preserves the content and prosody of the source speech while matching the voice timbre of a target speaker. Unlike previous approaches, it produces the converted waveform at low latency, which makes it suitable for real-time communication such as calls and video conferencing, and for use cases such as voice anonymization. The design builds on the SoundStream neural audio codec for lightweight, high-quality speech synthesis. Two ideas set StreamVC apart: soft speech units are learned causally, and whitened fundamental frequency (f0) information is supplied to improve pitch stability without leaking the source speaker's timbre.
https://research.google/pubs/streamvc-real-time-low-latency-voice-conversion/
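
To make the "whitened fundamental frequency" idea more concrete, here is a minimal Python sketch. It assumes that whitening means normalizing the f0 contour of an utterance to zero mean and unit variance over voiced frames, which is one common reading of the term. The helper name `whiten_f0`, the convention that unvoiced frames are marked with 0, and the example values are all hypothetical and are not taken from the StreamVC implementation.

```python
import math


def whiten_f0(f0_hz, eps=1e-8):
    """Normalize voiced f0 values to zero mean and unit variance.

    Hypothetical helper for illustration only. Frames with f0 <= 0 are
    treated as unvoiced (an assumption) and are returned as 0.0.
    """
    voiced = [f for f in f0_hz if f > 0]
    if not voiced:
        return [0.0] * len(f0_hz)
    mean = sum(voiced) / len(voiced)
    var = sum((f - mean) ** 2 for f in voiced) / len(voiced)
    std = math.sqrt(var)
    # eps guards against division by zero for a perfectly flat contour.
    return [(f - mean) / (std + eps) if f > 0 else 0.0 for f in f0_hz]


# Two made-up contours with the same shape but different pitch ranges.
low_voice = [100.0, 110.0, 0.0, 120.0]
high_voice = [200.0, 220.0, 0.0, 240.0]

print(whiten_f0(low_voice))
print(whiten_f0(high_voice))
```

Under these assumptions, both contours map to nearly the same normalized values: the rise and fall of the pitch is kept, while the absolute pitch range, which is a cue to speaker identity, is removed. A streaming system would also need to estimate the mean and variance causally (for example with running statistics) rather than over a whole utterance, which this sketch does not attempt.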