
Zephyr

HuggingFace's distilled LLM trained with direct preference optimization

Code & Development

Zephyr is a series of language models from HuggingFace trained with distilled supervised fine-tuning (dSFT) followed by Direct Preference Optimization (DPO) on AI-generated preference data. Zephyr-7B-beta, fine-tuned from Mistral-7B, was a breakthrough model: it demonstrated that DPO training on AI feedback can produce a 7B model that outperforms much larger instruction-tuned models on chat benchmarks. HuggingFace releases Zephyr as a research artifact demonstrating alignment training techniques.
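To make the training objective concrete, here is a minimal sketch of the DPO loss for a single preference pair. The function name and example log-probability values are illustrative, not from Zephyr's actual training code; the formula itself is the standard DPO objective, which scores how strongly the policy prefers the chosen response over the rejected one relative to a frozen reference model.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair (illustrative sketch).

    Each argument is the summed log-probability of a full response under
    the trainable policy or the frozen reference model; beta controls how
    far the policy may drift from the reference.
    """
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_ratio - rejected_ratio)
    # -log sigmoid(margin): small when the policy prefers the chosen response
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Policy favors the chosen response relative to the reference -> loss below log 2
loss = dpo_loss(-1.0, -2.0, -1.5, -1.5, beta=0.1)
```

Because the loss depends only on log-probability ratios, DPO needs no separate reward model, which is part of why it is cheap enough to run on AI-generated preference data at 7B scale.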

Key Features

  • DPO training
  • 7B parameters
  • AI feedback alignment
  • HuggingFace native
  • Apache 2.0
  • Research focused
#llm #huggingface #dpo #alignment #open-source
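Since the model is HuggingFace native, prompts follow its chat template, which wraps turns in `<|system|>`, `<|user|>`, and `<|assistant|>` markers. The helper below is a hypothetical sketch of that format for illustration; in practice you would call the tokenizer's `apply_chat_template` rather than building strings by hand.

```python
def format_zephyr_prompt(system, user):
    # Sketch of Zephyr's chat format (assumed layout): role markers
    # separated by </s> end-of-sequence tokens, ending at the point
    # where the assistant's reply is generated.
    return (f"<|system|>\n{system}</s>\n"
            f"<|user|>\n{user}</s>\n"
            f"<|assistant|>\n")

prompt = format_zephyr_prompt("You are a helpful assistant.", "What is DPO?")
```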

Get Started

Visit Zephyr
🟢
Free
Completely free to use

Quick Info

Category
Code & Development
Pricing
Free

More Code & Development Tools