Llama.cpp

Run Meta's Llama and other LLMs locally with CPU inference

llama.cpp is an open-source C/C++ library that runs large language models such as Llama and Mistral efficiently on consumer CPUs, with no GPU required. It relies on quantization, storing model weights at reduced precision (for example 4-bit instead of 16-bit), to cut memory requirements dramatically. Privacy-focused users, developers, and researchers use llama.cpp to run powerful AI models entirely on their own hardware.
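llama.cpp itself is a C/C++ project, but the same engine is commonly driven from Python through the community llama-cpp-python bindings. The sketch below assumes those bindings are installed and that a quantized GGUF model file is already available locally; the file path and generation parameters are placeholders, not values from this page:

```python
# Minimal local-inference sketch using the llama-cpp-python bindings
# (pip install llama-cpp-python). Assumes a quantized GGUF model file
# has already been downloaded; the path below is hypothetical.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-3-8b-instruct.Q4_K_M.gguf",  # hypothetical path
    n_ctx=4096,       # context window, in tokens
    n_threads=8,      # CPU threads used for inference
    n_gpu_layers=0,   # 0 = pure CPU; raise to offload layers to a GPU
)

out = llm("Q: Why does quantization save memory? A:", max_tokens=64, stop=["Q:"])
print(out["choices"][0]["text"])
```

The n_gpu_layers knob is how the GPU-offloading feature surfaces in these bindings: layers are loaded onto the GPU up to that count, and everything else stays on the CPU.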

Key Features

  • CPU-optimized LLM inference
  • GGUF model format support
  • GPU offloading support
  • OpenAI-compatible API server (see the sketch after the tag list below)
  • Quantization for memory efficiency
#local LLM #open source #privacy #CPU inference #Llama
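The bundled llama-server binary exposes an HTTP endpoint that follows the OpenAI chat-completions wire format, so existing OpenAI client code can be pointed at a local instance. A minimal sketch, assuming a server was already started along the lines of `llama-server -m model.gguf --port 8080` (the port and model name here are assumptions):

```python
# Queries a local llama-server instance through its OpenAI-compatible
# /v1/chat/completions endpoint (pip install openai).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # assumed local server address
    api_key="sk-no-key-required",         # ignored unless the server sets --api-key
)

resp = client.chat.completions.create(
    model="local-model",  # llama-server answers with whatever model it was started on
    messages=[{"role": "user", "content": "In one sentence, what is llama.cpp?"}],
    max_tokens=64,
)
print(resp.choices[0].message.content)
```

Because only the base URL changes, applications written against hosted OpenAI endpoints can switch to a fully local backend without restructuring their code.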

Quick Info

Category: Code & Development
Pricing: Free (completely free to use)
