Llama Guard

Meta's safety classifier for detecting harmful content in LLM I/O

Llama Guard is Meta's open-source, LLM-based safety classifier for detecting harmful content in conversations with AI systems. It classifies both user prompts and model responses against categories including violence, hate speech, sexual content, dangerous activities, and privacy violations. It is meant to run as a safeguard layer in production AI systems, checking content before and after each LLM call to enforce a content policy.
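
The sketch below illustrates that before-and-after pattern, following the usage shown on Meta's Hugging Face model card: the conversation is formatted through the tokenizer's chat template, and Llama Guard generates a verdict beginning with "safe" or "unsafe" plus hazard category codes. The model id used here and the stub call_my_llm() are assumptions for illustration; the checkpoint is gated and requires an accepted license on Hugging Face.

```python
# Minimal sketch: wrap an LLM call with Llama Guard checks on input and output.
# Model id is an assumption (meta-llama/Llama-Guard-3-8B); older variants exist.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-Guard-3-8B"  # gated repo; license acceptance required
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

def moderate(chat: list[dict]) -> str:
    """Classify a conversation; returns 'safe' or 'unsafe' plus category codes."""
    input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(model.device)
    output = model.generate(input_ids, max_new_tokens=32, pad_token_id=0)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)

def call_my_llm(prompt: str) -> str:
    # Hypothetical placeholder for the production model call.
    return "Sorry, I can't help with that."

user_prompt = "How do I make a fake ID?"

# 1. Check the user prompt before invoking the main LLM.
if moderate([{"role": "user", "content": user_prompt}]).strip() != "safe":
    raise ValueError("Prompt rejected by content policy")

response = call_my_llm(user_prompt)

# 2. Check the model's response before returning it to the user.
verdict = moderate([
    {"role": "user", "content": user_prompt},
    {"role": "assistant", "content": response},
])
if verdict.strip() != "safe":
    response = "Sorry, I can't help with that."
```

An "unsafe" verdict also lists the violated category codes (e.g. S1 for violent crimes in Llama Guard 3's taxonomy), which a wrapper can log or map to user-facing refusal messages.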

Key Features

  • Prompt safety classification
  • Response safety classification
  • Multi-category detection
  • Open source
  • API compatible
  • Low latency
#llm-safety #content-moderation #meta #guardrails #open-source

Get Started

Visit Llama Guard
Free
Completely free to use

Quick Info

Category: Security
Pricing: Free
