AudioLDM 2

Open-source latent diffusion model for text-to-audio and music generation

Audio & Speech

Free

AudioLDM 2 is an open-source text-to-audio generation model that creates sound effects, ambient audio, and music from text descriptions using latent diffusion. It can generate realistic environmental sounds (rain, crowds, machinery), musical compositions, and speech-like audio from natural language prompts. AudioLDM 2 is widely used in research, game audio prototyping, and creative audio applications where developers need programmatic control over generative audio without commercial API restrictions.
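As a rough illustration of the programmatic control described above, the sketch below generates a clip from a text prompt. It assumes the Hugging Face `diffusers` integration (`AudioLDM2Pipeline`) and the `cvssp/audioldm2` checkpoint, neither of which is named in this listing, and it assumes that checkpoint produces 16 kHz audio. `AudioRequest` and `generate_wav` are hypothetical helper names introduced only for this example; check the license of whichever checkpoint you load before relying on it.

```python
"""Sketch: text-to-audio with AudioLDM 2 via Hugging Face diffusers (assumed setup)."""
from dataclasses import dataclass

# Assumption: the cvssp/audioldm2 checkpoint generates 16 kHz mono audio.
SAMPLE_RATE_HZ = 16_000


@dataclass(frozen=True)
class AudioRequest:
    """Hypothetical container for one generation request."""

    prompt: str
    negative_prompt: str = "low quality, average quality"
    duration_s: float = 10.0
    steps: int = 200
    seed: int = 0

    def pipeline_kwargs(self) -> dict:
        """Validate the request and map it to AudioLDM2Pipeline call arguments."""
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if self.duration_s <= 0 or self.steps <= 0:
            raise ValueError("duration_s and steps must be positive")
        return {
            "prompt": self.prompt,
            "negative_prompt": self.negative_prompt,
            "audio_length_in_s": self.duration_s,
            "num_inference_steps": self.steps,
        }


def generate_wav(request: AudioRequest, out_path: str,
                 model_id: str = "cvssp/audioldm2") -> str:
    """Run the pipeline once and write the first waveform to a WAV file."""
    # Heavy dependencies are imported lazily so the helpers above stay lightweight.
    import torch
    from diffusers import AudioLDM2Pipeline
    from scipy.io import wavfile

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32

    pipe = AudioLDM2Pipeline.from_pretrained(model_id, torch_dtype=dtype).to(device)
    # A seeded generator makes repeated runs with the same request reproducible.
    generator = torch.Generator(device).manual_seed(request.seed)

    audio = pipe(generator=generator, **request.pipeline_kwargs()).audios[0]
    wavfile.write(out_path, rate=SAMPLE_RATE_HZ, data=audio)
    return out_path


if __name__ == "__main__":
    # Downloads the model weights on first run; a GPU is strongly recommended.
    request = AudioRequest("Heavy rain on a tin roof with distant thunder")
    print(generate_wav(request, "rain.wav"))
```

Lowering `steps` trades quality for speed, and changing `seed` yields a different take on the same prompt, which can be handy when prototyping several variants of one sound effect.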