Here is a roundup of common text-to-speech projects currently available. Let's explore them together.

1 ChatTTS

GitHub: https://github.com/2noise/ChatTTS

2 MandarinTTS

GitHub: https://github.com/ranchlai/mandarin-tts

This is a modularized text-to-speech framework that aims to support fast research and product development. Main features include:

  • all modules are configurable via YAML,
  • speaker embedding, prosody embedding, and multi-stream text embedding are supported and configurable,
  • various vocoders (VocGAN, HiFi-GAN, WaveGlow, MelGAN) are supported through adapters, so that different vocoders can be compared easily,
  • duration, pitch, and energy variance predictors are supported, and other variances can be added easily,
  • and more on the roadmap.
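
To illustrate the adapter idea in the feature list above, here is a minimal sketch of how vocoders might sit behind a common interface and be selected from a config. This is not MandarinTTS code: the names `VocoderAdapter`, `register`, `build_vocoder`, and the two toy vocoders are hypothetical, the "vocoders" are trivial stand-ins rather than neural models, and a plain dict stands in for a parsed YAML file.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Type

Mel = List[List[float]]   # toy mel-spectrogram: frames x mel bins
Wave = List[float]        # toy waveform: a flat list of samples


class VocoderAdapter(ABC):
    """Hypothetical common interface that every vocoder is wrapped in."""

    @abstractmethod
    def synthesize(self, mel: Mel) -> Wave:
        """Turn a mel-spectrogram into a waveform."""


# Maps a config name to an adapter class (hypothetical registry).
REGISTRY: Dict[str, Type[VocoderAdapter]] = {}


def register(name: str) -> Callable[[Type[VocoderAdapter]], Type[VocoderAdapter]]:
    """Class decorator that records an adapter under a config name."""
    def decorator(cls: Type[VocoderAdapter]) -> Type[VocoderAdapter]:
        REGISTRY[name] = cls
        return cls
    return decorator


@register("toy_mean")
class MeanVocoder(VocoderAdapter):
    """Stand-in vocoder: repeats each frame's mean value hop_length times."""

    def __init__(self, hop_length: int = 4) -> None:
        self.hop_length = hop_length

    def synthesize(self, mel: Mel) -> Wave:
        wave: Wave = []
        for frame in mel:
            wave.extend([sum(frame) / len(frame)] * self.hop_length)
        return wave


@register("toy_max")
class MaxVocoder(VocoderAdapter):
    """Stand-in vocoder: repeats each frame's max value hop_length times."""

    def __init__(self, hop_length: int = 4) -> None:
        self.hop_length = hop_length

    def synthesize(self, mel: Mel) -> Wave:
        wave: Wave = []
        for frame in mel:
            wave.extend([max(frame)] * self.hop_length)
        return wave


def build_vocoder(config: dict) -> VocoderAdapter:
    """Instantiate the adapter named in the config (a dict standing in for YAML)."""
    cls = REGISTRY[config["type"]]
    return cls(**config.get("params", {}))


if __name__ == "__main__":
    mel = [[0.0, 1.0], [2.0, 4.0]]
    # Swapping vocoders only requires changing the "type" field.
    for name in ("toy_mean", "toy_max"):
        vocoder = build_vocoder({"type": name, "params": {"hop_length": 2}})
        print(name, vocoder.synthesize(mel))
```

Because the calling code depends only on `synthesize`, comparing vocoders in this sketch comes down to changing one field in the config, which is the kind of convenience the feature list describes.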

3 Chinese-FastSpeech2

GitHub: https://github.com/Executedone/Chinese-FastSpeech2

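This project continues training on the Biaobei (Data Baker) Chinese standard female voice dataset and improves on the FastSpeech2 model from the original paper by introducing a prosody representation and a prosody prediction module, making the synthesized Chinese speech more vivid and rhythmic.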