We release optimized checkpoints for our codecs:
Audio FSQ 22kHz
Audio FSQ 44kHz
Spectral FSQ 22kHz
Spectral FSQ 44kHz
We compare audio reconstructed by different codec models after compression, across multiple languages.
We compare synthesized speech from FastPitch trained with different codec models. For our best-performing codecs, we also provide samples from an autoregressive FastPitch model.