DeepInfra
Serverless inference API for running open-source LLMs cheaply
DeepInfra is a serverless inference platform that provides API access to hundreds of open-source machine learning models—including Llama, Mistral, Mixtral, Whisper, and Stable Diffusion—at a fraction of the cost of proprietary API providers. Models are available as REST APIs with OpenAI-compatible endpoints, enabling drop-in replacement for applications built on the OpenAI SDK. DeepInfra's auto-scaling infrastructure handles bursts without pre-provisioning, with pay-per-token pricing and no minimum commitments.
Key Features
- ✓100+ open-source models
- ✓OpenAI-compatible API
- ✓Serverless scaling
- ✓Pay-per-token
- ✓Whisper/Stable Diffusion
- ✓Low cost
Quick Info
- Category
- Code & Development
- Pricing
- Paid
More Code & Development Tools
GitHub Copilot
Code & DevelopmentThe AI pair programmer trusted by millions of developers
Cursor
Code & DevelopmentThe code editor built around AI from the ground up
Tabnine
Code & DevelopmentPrivacy-first AI code completion
Codeium
Code & DevelopmentFree AI coding assistant with no usage limits