deleted by creator
Agree. I think the developers stated they added cartoon voices on purpose to demonstrate expressiveness.
The demo video sounds really good, but in my testing the output is very noisy and poor quality, even with the 80M model. It is also not all that fast, and not instant like I would expect. It’s okay for memory-constrained environments, but without voice cloning you may as well stick with whatever built-in TTS your OS has.


