

  • Object storage or NFS: Create folders or prefixes to add version semantics. Storing different versions of data this way is straightforward, but it lacks commit messages, metadata, and history tracking, and there is no way to tell which version is the latest.
  • S3 versioning: S3 Versioning provides object-level versioning. We can get the latest version of an object, and it is possible to roll back to a previous version.
  • Git LFS: Git LFS is an open-source Git extension for versioning large files, developed by GitHub.
  • DVC: DVC is built to make ML models shareable and reproducible. It is designed to handle large files, data sets, machine learning models, and metrics as well as code.
    • Use git commands to version small files or metadata, and use dvc to manage large files.
    • You need to know both git and dvc, and the workflow requires switching back and forth between the two commands. See the DVC tutorial.
  • LakeFS: LakeFS provides a multi-server solution that turns S3 into Git-like repositories.
    • The architecture is much heavier than ArtiVC's because it requires an extra database for metadata storage and additional S3 configuration (e.g., an S3 gateway).
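
To make the first alternative in the list concrete, here is a minimal Python sketch of folder-based versioning on a local or NFS-style filesystem. It is an illustration only, not how ArtiVC or any tool above works: `save_version`, `latest_version`, `demo`, and the `vN` folder naming are hypothetical names and conventions introduced for this example.

```python
import shutil
import tempfile
from pathlib import Path


def save_version(store: Path, source: Path, version: str) -> Path:
    """Copy `source` into `store/<version>/`; the folder name is the only version marker."""
    target = store / version
    shutil.copytree(source, target)  # raises if this version folder already exists
    return target


def latest_version(store: Path) -> str:
    """Guess the latest version from folder names, assuming a 'vN' naming convention."""
    versions = [p.name for p in store.iterdir() if p.is_dir()]
    # Nothing records which version is newest, so we must rely on the naming convention.
    return max(versions, key=lambda name: int(name.lstrip("v")))


def demo() -> str:
    """Store three versions of a small dataset and report the guessed latest one."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        data = root / "data"
        data.mkdir()
        store = root / "store"
        store.mkdir()
        for n in (1, 2, 10):
            (data / "train.csv").write_text(f"rows for v{n}\n")
            save_version(store, data, f"v{n}")
        return latest_version(store)


if __name__ == "__main__":
    print(demo())
```

Storing a version is a single copy, but there is nowhere to record a commit message or history, and "latest" is only a guess based on folder names: a plain string sort of `v1`, `v2`, `v10` would pick `v2`.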