NeuTTS Nano: 120M Parameter On-Device TTS based on Llama3

Tools 213 points 44 comments 6 days ago

Hey everyone,

The team at Neuphonic is back with a new open-source release: NeuTTS Nano. After NeuTTS Air trended #1 on HuggingFace last October, we received a lot of requests for something even smaller that could fit into tighter VRAM/RAM constraints for robotics and embedded agents.

Key Specs:

* Model Size: 120M active parameters (3x smaller than NeuTTS Air).
* Architecture: Simple LM + codec architecture built off Llama3.
* Format: Provided in GGML for easy deployment on mobile, Jetson, and Raspberry Pi.
* Capabilities: Instant voice cloning (3s sample) and ultra-realistic prosody.

Why use this?

If you are building for smart home devices, robotics, or mobile apps where every MB of RAM matters, Nano is designed for you. It delivers the same "voice magic" but in a much lighter package.

Links:

* GitHub: https://github.com/neuphonic/neutts
* HuggingFace: https://huggingface.co/neuphonic/neutts-nano
* Spaces: https://huggingface.co/spaces/neuphonic/neutts-nano
* Website: https://www.neuphonic.com/

We’re curious to see the RTF (Real-Time Factor) benchmarks the community gets on different hardware. What’s the smallest device you’re planning to run this on?
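For anyone posting RTF numbers, here is a minimal sketch of one way to measure it so results are comparable across devices. RTF is taken here as synthesis wall-clock time divided by the duration of the generated audio, so values below 1.0 mean faster than real time. The `synthesize` callable and the 24 kHz sample rate in the usage lines are placeholders, not part of the NeuTTS API: wrap whatever inference call you actually use so that it takes text and returns a sequence of audio samples.

```python
import statistics
import time
from typing import Callable, Sequence


def real_time_factor(elapsed_s: float, num_samples: int, sample_rate: int) -> float:
    """RTF = time spent synthesizing / duration of the audio produced."""
    audio_s = num_samples / sample_rate
    if audio_s <= 0:
        raise ValueError("synthesized audio is empty")
    return elapsed_s / audio_s


def measure_rtf(
    synthesize: Callable[[str], Sequence[float]],  # hypothetical wrapper around your TTS call
    text: str,
    sample_rate: int,
    runs: int = 3,
) -> float:
    """Return the median RTF over several timed runs, after one untimed warm-up."""
    synthesize(text)  # warm-up so model loading and caches do not skew the timing
    rtfs = []
    for _ in range(runs):
        start = time.perf_counter()
        samples = synthesize(text)
        elapsed = time.perf_counter() - start
        rtfs.append(real_time_factor(elapsed, len(samples), sample_rate))
    return statistics.median(rtfs)


if __name__ == "__main__":
    # Stand-in synthesizer: returns one second of silence at an assumed 24 kHz.
    def fake_tts(text: str) -> Sequence[float]:
        return [0.0] * 24000

    print(f"RTF: {measure_rtf(fake_tts, 'Hello from a tiny device.', 24000):.4f}")
```

When sharing results, it helps to state the device, the model format or quantization, the number of threads, and the length of the input text alongside the RTF, since all of these can shift the number.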
