Moshi AI

Kyutai's open-source real-time speech-to-speech conversational AI

Audio & Speech

Moshi is an open-weight, real-time speech-to-speech conversational AI model developed by Kyutai, a French AI research lab. Unlike cascaded voice assistants that transcribe speech to text before processing it, Moshi operates directly on audio streams, enabling natural real-time conversations with latency under 200ms. It is full duplex: it can listen and speak at the same time, so it can interrupt or be interrupted mid-sentence, much as in a human conversation. The model weights are openly released, so researchers and developers can run and fine-tune the model themselves.
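To make the latency figure concrete, here is a rough back-of-the-envelope sketch in Python. The numbers are assumptions not stated on this page: a 24 kHz audio stream, a 12.5 Hz codec frame rate, and a one-frame acoustic delay, as given in Kyutai's public description of Moshi. The function names are hypothetical and are not part of any Moshi API. Treat the result as a theoretical lower bound, not a measured latency.

```python
# Back-of-the-envelope latency budget for a frame-based speech model.
# ASSUMPTIONS (not from this page): 24 kHz audio, 12.5 Hz codec frame rate,
# and a one-frame acoustic delay. Function names are illustrative only.

SAMPLE_RATE_HZ = 24_000   # assumed audio sample rate
FRAME_RATE_HZ = 12.5      # assumed codec frames per second
DELAY_FRAMES = 1          # assumed acoustic delay, in frames


def frame_duration_ms(frame_rate_hz: float) -> float:
    """Duration of one codec frame in milliseconds."""
    return 1000.0 / frame_rate_hz


def samples_per_frame(sample_rate_hz: int, frame_rate_hz: float) -> int:
    """Number of raw audio samples covered by one codec frame."""
    return round(sample_rate_hz / frame_rate_hz)


def theoretical_latency_ms(frame_rate_hz: float, delay_frames: int) -> float:
    """One frame must be buffered before encoding, plus the modeled delay."""
    return frame_duration_ms(frame_rate_hz) * (1 + delay_frames)


if __name__ == "__main__":
    print(f"frame duration: {frame_duration_ms(FRAME_RATE_HZ):.0f} ms")
    print(f"samples per frame: {samples_per_frame(SAMPLE_RATE_HZ, FRAME_RATE_HZ)}")
    print(f"theoretical latency: {theoretical_latency_ms(FRAME_RATE_HZ, DELAY_FRAMES):.0f} ms")
```

Under these assumptions each frame covers 80 ms of audio and the theoretical floor is 160 ms. Real-world latency is higher because model inference and network transport add to it.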

Key Features

  • Real-time speech AI
  • Sub-200ms latency
  • Simultaneous listen/speak
  • Open weights
  • Full duplex conversation
  • Natural interruptions
#voice-ai #real-time #open-source #conversational-ai #speech

Free
Completely free to use

Quick Info

Category
Audio & Speech
Pricing
Free
