🗣️
Moshi AI
Kyutai's open-source real-time speech-to-speech conversational AI
Audio & Speech
Moshi is an open-weight, real-time speech-to-speech conversational AI model developed by Kyutai, a French AI research lab. Unlike cascaded voice assistants that transcribe speech to text, run a text model, and then synthesize a spoken reply, Moshi operates directly on audio streams, which keeps response latency at roughly 200 ms in practice (160 ms theoretical). It is full duplex: it listens and speaks at the same time, so it can be interrupted mid-sentence or interject while the user is still talking. The model weights are openly released, so researchers and developers can run and fine-tune the model themselves.
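As a rough illustration of where the latency figure comes from, the sketch below works through the arithmetic using numbers Kyutai has reported for Moshi's audio codec (24 kHz audio, 12.5 frames per second, one frame of acoustic delay). These constants and the helper function names are assumptions introduced for this example, not values taken from this listing.

```python
# Back-of-the-envelope latency arithmetic for a frame-based streaming speech model.
# The constants are figures Kyutai has reported for Moshi's codec; treat them as
# assumptions for illustration, not as part of any official API.

SAMPLE_RATE_HZ = 24_000      # assumed audio sample rate
FRAME_RATE_HZ = 12.5         # assumed codec frame rate (frames per second)
ACOUSTIC_DELAY_FRAMES = 1    # assumed delay between text and audio tokens


def frame_duration_ms(frame_rate_hz: float) -> float:
    """Duration of one codec frame in milliseconds."""
    return 1000.0 / frame_rate_hz


def samples_per_frame(sample_rate_hz: int, frame_rate_hz: float) -> int:
    """Number of raw audio samples covered by one codec frame."""
    return int(round(sample_rate_hz / frame_rate_hz))


def theoretical_latency_ms(frame_rate_hz: float, delay_frames: int) -> float:
    """One frame to buffer the input plus the acoustic delay, in milliseconds."""
    return frame_duration_ms(frame_rate_hz) * (1 + delay_frames)


if __name__ == "__main__":
    print("frame duration (ms):", frame_duration_ms(FRAME_RATE_HZ))
    print("samples per frame:", samples_per_frame(SAMPLE_RATE_HZ, FRAME_RATE_HZ))
    print("theoretical latency (ms):",
          theoretical_latency_ms(FRAME_RATE_HZ, ACOUSTIC_DELAY_FRAMES))
```

Real-world latency is higher than this theoretical floor because it also includes model compute time and audio I/O on the host hardware.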
Key Features
- ✓ Real-time speech AI
- ✓ About 200 ms latency (160 ms theoretical)
- ✓ Simultaneous listen/speak
- ✓ Open weights
- ✓ Full-duplex conversation
- ✓ Natural interruptions
#voice-ai #real-time #open-source #conversational-ai #speech
Quick Info
- Category: Audio & Speech
- Pricing: Free