what does this model do?

by MayensGuds - opened

@MayensGuds it's a TTS model, so it converts text to speech.
It's pretty realistic imo and can even laugh and pause; later models might have even more emotions. However, it does not have voice cloning or prompting like some other TTS models. Not the best TTS possible, but pretty decent.

I was looking for a similar model to integrate into my site https://www.tiroalpalotv.es, but in Spanish. Does this model also work for Spanish (español)? And is it mobile responsive?

No, this is not for Spanish; it only supports Chinese and English, I believe.

It is also quite slow: you need a decent GPU for real-time inference, and it is far, far slower than real-time on a mobile device.

I would recommend XTTS v2 if you can run it on a GPU (it doesn't need to be decent or high-end; 4 GB of VRAM is probably fine).

If you are running on mobile, I suppose OpenVoice v2 is a possibility.
