TTS - a RogerZhuo Collection

TTS

Updated Mar 31, 2025

Speech-related

VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design

Paper • 2307.16430 • Published Jul 31, 2023 • 4
Zyphra/Zonos-v0.1-transformer

Text-to-Speech • Updated Jun 3, 2025 • 10.3k • 421
IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

Paper • 2502.05512 • Published Feb 8, 2025 • 7
Step-Audio: Unified Understanding and Generation in Intelligent Speech Interaction

Paper • 2502.11946 • Published Feb 17, 2025 • 3
hexgrad/Kokoro-82M-v1.1-zh

Text-to-Speech • Updated Mar 4, 2025 • 29.5k • 196
Kokoro TTS ❤

Running on Zero • Featured • 3.26k
Upgraded to v1.0!
Zyphra/Zonos-v0.1-hybrid

Text-to-Speech • Updated Jun 3, 2025 • 2.02k • 1.1k
fishaudio/fish-speech-1.5

Text-to-Speech • Updated Mar 25, 2025 • 6.68k • 723
F5-TTS 🗣

Running on Zero • Featured • 2.84k
F5-TTS & E2-TTS: Zero-Shot Voice Cloning (Unofficial Demo)
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching

Paper • 2410.06885 • Published Oct 9, 2024 • 47
ylacombe/expresso

Viewer • Updated Apr 30, 2024 • 11.6k • 619 • 81
EXPRESSO: A Benchmark and Analysis of Discrete Expressive Speech Resynthesis

Paper • 2308.05725 • Published Aug 10, 2023 • 2
stepfun-ai/Step-Audio-TTS-3B

Text-to-Speech • 4B • Updated Feb 17, 2025 • 48 • 196
senstella/csm-1b-mlx

Updated Mar 14, 2025 • 14
HKUSTAudio/Llasa-3B

Text-to-Speech • 4B • Updated May 10, 2025 • 317 • 526