🦙 llama.cpp
Run Meta's Llama and other LLMs locally with CPU inference
Code & Development
llama.cpp is an open-source library that enables running large language models like Llama, Mistral, and others efficiently on consumer CPUs without requiring a GPU. It uses quantization techniques to dramatically reduce memory requirements. Privacy-focused users, developers, and researchers use llama.cpp to run powerful AI models entirely on their own hardware.
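The memory savings from quantization can be sketched with back-of-the-envelope arithmetic: weight storage is roughly parameter count times bits per weight. The function below is an illustrative sketch, not part of llama.cpp; real GGUF quantization formats store extra scale metadata per block, so actual files run somewhat larger than this estimate.

```python
def model_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight-storage footprint in decimal gigabytes.

    Ignores per-block scale metadata that real quantization formats
    (e.g. the GGUF 4-bit variants) add on top of the raw weights.
    """
    return n_params * bits_per_weight / 8 / 1e9

# A 7B-parameter model as an example:
fp16 = model_memory_gb(7e9, 16)  # 16-bit weights -> 14.0 GB
q4 = model_memory_gb(7e9, 4)     # 4-bit weights  -> 3.5 GB
print(f"FP16: {fp16:.1f} GB, 4-bit: {q4:.1f} GB")
```

This is why a 7B model that would not fit in 8 GB of RAM at full precision becomes practical on an ordinary laptop once quantized.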
Key Features
- ✓ CPU-optimized LLM inference
- ✓ GGUF model format support
- ✓ GPU offloading support
- ✓ OpenAI-compatible API server
- ✓ Quantization for memory efficiency
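Because the bundled server exposes an OpenAI-compatible API, any OpenAI-style client can talk to a local model. The sketch below only builds and prints a chat-completions request body; the URL, port, and model name are assumptions to adjust for your setup, and actually sending the request requires a running server.

```python
import json

# Assumed endpoint for a locally running llama.cpp server
# (host, port, and model name are illustrative, not guaranteed defaults).
URL = "http://localhost:8080/v1/chat/completions"

def build_chat_request(prompt: str, model: str = "local-model",
                       max_tokens: int = 128) -> dict:
    """Build an OpenAI-style chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("Why run LLMs locally?")
print(json.dumps(payload, indent=2))
# POST this JSON to URL with any HTTP client (curl, urllib, requests, ...).
```

Since the request shape matches OpenAI's API, existing tooling written against hosted models can often be pointed at the local server just by changing the base URL.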
#local LLM · #open source · #privacy · #CPU inference · #Llama
Quick Info
- Category: Code & Development
- Pricing: Free