Amphion: An Open-Source Audio, Music, and Speech Generation Toolkit

Amphion is an open-source toolkit for audio, music, and speech generation that aims to help junior researchers and engineers get started in the field. A distinctive feature is its visualizations of classic models, which are intended to make them easier to understand. Amphion covers tasks such as Text-to-Speech and Voice Conversion and offers several models, including FastSpeech2 and VITS. It also provides evaluation metrics for assessing the quality of generated audio. Notably, Amphion supports the Emilia dataset and its preprocessing pipeline for speech data. It can be installed through a setup installer or a Docker image. Contributions are welcome, and the toolkit is free for research and commercial use under the MIT License.

https://github.com/open-mmlab/Amphion
