DVC
Open-source data version control for ML projects with Git integration
DVC (Data Version Control) is an open-source tool that brings Git-like version control to ML datasets, models, and experiments. DVC keeps large files and directories (training datasets, model weights, preprocessed features) in remote storage such as a cloud bucket, and tracks them through lightweight metafiles committed to Git, so teams can version data alongside code with the Git workflow they already use. DVC pipelines define reproducible ML workflows as directed acyclic graphs (DAGs) of stages, and DVC experiments record hyperparameters and metrics so that runs can be compared. The tool is framework- and cloud-agnostic: it supports AWS, GCP, Azure, and local storage, and it integrates with CI/CD systems for automated pipeline execution and model testing.
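The idea of tracking a large file through a small metafile can be illustrated with a short Python sketch. This is not DVC's implementation or API. It is a simplified stand-in that assumes a SHA-256 content hash, a local cache directory, and a JSON pointer file. The names `track_file`, `restore_file`, and the `.pointer.json` suffix are hypothetical and chosen only for this example.

```python
import hashlib
import json
import shutil
import tempfile
from pathlib import Path


def file_digest(path: Path) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def track_file(data_path: Path, cache_dir: Path) -> Path:
    """Copy the file into a hash-addressed cache and write a small pointer file.

    Only the pointer file would be committed to Git; the cache stands in
    for remote storage (an assumption of this sketch).
    """
    digest = file_digest(data_path)
    cached = cache_dir / digest[:2] / digest[2:]
    cached.parent.mkdir(parents=True, exist_ok=True)
    if not cached.exists():  # identical content is stored only once
        shutil.copy2(data_path, cached)
    pointer = data_path.with_name(data_path.name + ".pointer.json")
    meta = {"sha256": digest, "size": data_path.stat().st_size, "path": data_path.name}
    pointer.write_text(json.dumps(meta, indent=2))
    return pointer


def restore_file(pointer: Path, cache_dir: Path) -> Path:
    """Recreate the data file next to its pointer from the cached copy."""
    meta = json.loads(pointer.read_text())
    digest = meta["sha256"]
    target = pointer.with_name(meta["path"])
    shutil.copy2(cache_dir / digest[:2] / digest[2:], target)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        data = root / "train.csv"
        data.write_text("x,y\n1,2\n")
        pointer = track_file(data, root / "cache")
        data.unlink()  # simulate a fresh checkout that has only the pointer
        restored = restore_file(pointer, root / "cache")
        print(restored.read_text())
```

Because the pointer file holds only a hash, a size, and a file name, it stays small no matter how large the data is, which is what makes it practical to version in Git next to the code.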
Key Features
- Data version control
- Git workflow integration
- ML pipeline DAGs
- Cloud storage support
- Experiment tracking
- CI/CD integration
Quick Info
- Category: Code & Development
- Pricing: Free
More Code & Development Tools
- GitHub Copilot (Code & Development): The AI pair programmer trusted by millions of developers
- Cursor (Code & Development): The code editor built around AI from the ground up
- Tabnine (Code & Development): Privacy-first AI code completion
- Codeium (Code & Development): Free AI coding assistant with no usage limits