NeMo TTS API
Model Classes
MagpieTTS (Codec-based TTS)
MagpieTTS is an end-to-end TTS model that generates audio codes from a transcript and optional context (audio or text). It supports multiple architectures (e.g., multi-encoder context and decoder context) and can be used for standard, long-form, and streaming inference.
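As a rough illustration of the input/output shape described above (a transcript plus optional context going in, frames of discrete audio codes coming out), here is a minimal toy sketch. It does not use NeMo and is not the MagpieTTS implementation: `ToyCodecTTS`, `generate_codes`, and every parameter are hypothetical stand-ins, and the "model" is a deterministic arithmetic rule rather than a neural network.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyCodecTTS:
    """Hypothetical stand-in for a codec-based TTS model (not a NeMo class)."""

    num_codebooks: int = 4    # assumed number of codebooks per frame
    codebook_size: int = 16   # assumed number of entries per codebook
    frames_per_char: int = 2  # assumed fixed number of frames per input character

    def generate_codes(
        self, transcript: str, context: Optional[str] = None
    ) -> List[List[int]]:
        """Return a list of frames, each holding one code index per codebook."""
        # Fold the optional context into one integer offset; no context means 0.
        offset = sum(ord(c) for c in context) if context else 0
        frames: List[List[int]] = []
        for ch in transcript:
            for step in range(self.frames_per_char):
                # One code index per codebook, kept inside the codebook range.
                frame = [
                    (ord(ch) + offset + step + k) % self.codebook_size
                    for k in range(self.num_codebooks)
                ]
                frames.append(frame)
        return frames


model = ToyCodecTTS()
plain = model.generate_codes("hi")                           # transcript only
conditioned = model.generate_codes("hi", context="speaker A")  # with context
print(len(plain), len(plain[0]))  # number of frames, codes per frame
```

In a real codec-based system, a separate audio codec would decode frames like these into a waveform; that step is omitted here.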
Mel-Spectrogram Generators
Vocoders
Codecs
Base Classes
The classes below are the base of the TTS pipeline.