DeepEval

Open-source LLM evaluation framework for testing AI applications

DeepEval is an open-source evaluation framework specifically designed for testing and benchmarking LLM applications. It provides 14+ evaluation metrics out of the box—including faithfulness, answer relevancy, hallucination, toxicity, and bias—that can be run as unit tests in CI/CD pipelines. DeepEval supports RAG evaluation, agent evaluation, fine-tuning evaluation, and integrates with pytest, making LLM testing as straightforward as traditional software testing.
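To make the pytest integration concrete, here is a minimal sketch of a DeepEval unit test following the project's documented quickstart; the threshold and the example strings are illustrative, and the built-in metrics assume an LLM judge is configured (an OpenAI API key by default).

```python
from deepeval import assert_test
from deepeval.test_case import LLMTestCase
from deepeval.metrics import AnswerRelevancyMetric

def test_answer_relevancy():
    # Judges whether the output actually addresses the input question.
    metric = AnswerRelevancyMetric(threshold=0.7)  # illustrative threshold
    test_case = LLMTestCase(
        input="What if these shoes don't fit?",
        actual_output="We offer a 30-day full refund at no extra cost.",
    )
    # Fails the test (and the CI job) if the metric score falls below the threshold.
    assert_test(test_case, [metric])
```

Run with `deepeval test run test_file.py` (or plain pytest), so a low-scoring output surfaces like any other failing unit test in a CI/CD pipeline.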

Key Features

  • 14+ eval metrics
  • RAG evaluation (sketched below)
  • CI/CD integration
  • pytest compatible
  • Hallucination detection
  • Agent evaluation
#llm-evaluation #open-source #testing #rag #hallucination
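For the RAG-oriented metrics, the test case also carries the retrieved context so the judge can check the answer against it. A minimal sketch, again assuming DeepEval's documented API, where FaithfulnessMetric scores how well the output is grounded in `retrieval_context`; the strings and threshold are illustrative.

```python
from deepeval.test_case import LLMTestCase
from deepeval.metrics import FaithfulnessMetric

# Faithfulness checks that claims in the output are supported by the
# retrieved context, which is how RAG hallucinations get caught.
metric = FaithfulnessMetric(threshold=0.7)  # illustrative threshold

test_case = LLMTestCase(
    input="What is the return policy?",
    actual_output="You can return shoes within 30 days for a full refund.",
    retrieval_context=[
        "All customers are eligible for a 30-day full refund at no extra cost."
    ],
)

metric.measure(test_case)
print(metric.score, metric.reason)  # score in [0, 1] plus a judge-written explanation
```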

Quick Info

Category
Data & Analytics
Pricing
Freemium (free plan + paid upgrades)
