LazyLFS
A quick way to version control data stored remotely
Lazy because
- it does not eagerly fetch the data, and
- it does not require a lot of work up front.
Usage
Install like
pip install lazylfs
Use like
cd path/to/repo
git init .
lazylfs link path/to/data/ ./
lazylfs track ./
lazylfs check ./
git add .
git commit -m "Adds some data"
# --root makes the very first commit of a repository list its files too
git diff-tree --root --no-commit-id --name-only -r HEAD \
| lazylfs check
Alternatives
There are many existing ways to handle large files and large repositories in git. This section explores these alternatives, their strengths, weaknesses and applicability to my use case.
Common to many of them is a higher barrier to entry when migrating from something like a NAS.
Git LFS
Downloading only a subset of the files is cumbersome. The best method I have found is to
1. export GIT_LFS_SKIP_SMUDGE=1,
2. git-lfs fetch files selectively using the -I and -X arguments,
3. git-lfs checkout files, explicitly if errors are to be avoided,
4. repeat 2 and 3 after every git operation that adds, removes or modifies files tracked by lfs.
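The steps above can be sketched as one small shell helper. This is only an illustration of the workflow, not something Git LFS ships: the function name lfs_fetch_subset and its two arguments are made up here, and it uses the equivalent "git lfs" spelling of the git-lfs command.

```shell
# Hypothetical helper (name and arguments are assumptions, not part of Git LFS).
# usage: lfs_fetch_subset INCLUDE_PATTERN [EXCLUDE_PATTERN]
lfs_fetch_subset() {
    local include="$1" exclude="${2:-}"

    # Step 1: stop git from downloading LFS content on checkout.
    export GIT_LFS_SKIP_SMUDGE=1

    # Step 2: download only the objects matching the patterns.
    if [ -n "$exclude" ]; then
        git lfs fetch -I "$include" -X "$exclude" || return 1
    else
        git lfs fetch -I "$include" || return 1
    fi

    # Step 3: replace the pointer files with the fetched content,
    # passing the pattern explicitly.
    git lfs checkout "$include"
}
```

Something like lfs_fetch_subset 'data/*.csv' 'data/raw/*' would then have to be rerun after every pull, merge or checkout that touches LFS-tracked files, which is exactly the repetition described in the last step.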
Reportedly, the footprint of a repo is likely to be much larger than just the files because they are stored once as objects in git and once as files in the working tree.
git-annex
Seems like default behavior is to store a copy of every file in .git/annex.
I made a very brief attempt at replacing the files with symlinks which seemed to make it unhappy.
It also seems hard to learn and cumbersome to use in general due to its flexibility. Admittedly this may be because I have not spent enough time with it, so that is definitely on my ToDo list.
VFS for Git
The short answer is that a Linux VFSForGit client is not yet available, but we're working on it!