Posted by u/TeamNeuphonic
NeuTTS Nano: 120M Parameter On-Device TTS based on Llama3
Hey everyone, The team at Neuphonic is back with a new open-source release: NeuTTS Nano. After NeuTTS Air trended #1 on HuggingFace last October, we received a lot of requests for something even smaller that could fit into tighter VRAM/RAM constraints for robotics and embedded agents. Key Specs: * Model Size: 120M active parameters (3x smaller than NeuTTS Air). * Architecture: Simple LM + codec architecture built off Llama3. * Format: Provided in GGML for easy deployment on mobile, Jetson, and Raspberry Pi. * Capabilities: Instant voice cloning (3s sample) and ultra-realistic prosody. Why use this? If you are building for smart home devices, robotics, or mobile apps where every MB of RAM matters, Nano is designed for you. It delivers the same "voice magic" but in a much lighter package. Links: * GitHub: https://github.com/neuphonic/neutts * HuggingFace: https://huggingface.co/neuphonic/neutts-nano * Spaces: https://huggingface.co/spaces/neuphonic/neutts-nano * Website: https://www.neuphonic.com/ We’re curious to see the RTF (Real-Time Factor) benchmarks the community gets on different hardware. What’s the smallest device you’re planning to run this on?
External link:
https://v.redd.it/2nikcyj6ycdg1More from r/LocalLLaMA
My story of underestimating /r/LocalLLaMA's thirst for VRAM
zai-org/GLM-4.7-Flash · Hugging Face
NVIDIA's new 8B model is Orchestrator-8B, a specialized 8-billion-parameter AI designed not to answer everything itself, but to intelligently manage and route complex tasks to different tools (like web search, code execution, other LLMs) for greater efficiency
I’ve seen some arguments we’ve reached AGI, it’s just about putting the separate pieces together in the right context....