LazyLFS

A quick way to version control data stored remotely

Lazy because

  • it does not eagerly fetch the data, and
  • it does not require a lot of work up front.

Usage

Install like

pip install lazylfs

Use like

cd path/to/repo
git init .
lazylfs link path/to/data/ ./
lazylfs track ./
lazylfs check ./
git add .
git commit -m "Adds some data"
git diff-tree --no-commit-id --name-only -r HEAD \
| lazylfs check
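As a rough mental model only, the idea of committing lightweight links plus checksums instead of the data itself can be sketched with standard tools. This is not a description of how lazylfs is implemented: the directory names, the checksums.sha256 file and the use of GNU coreutils' sha256sum are all assumptions made for illustration.

```shell
#!/bin/sh
set -eu

# Hypothetical layout: "remote" stands in for data on something like a NAS,
# "repo" stands in for the Git working tree.
work=$(mktemp -d)
mkdir -p "$work/remote" "$work/repo"
printf 'hello\n' > "$work/remote/sample.txt"

cd "$work/repo"

# Link to the data instead of copying it into the repository.
ln -s "$work/remote/sample.txt" sample.txt

# Record a checksum of the content the link points to.
# (sha256sum follows the symlink and hashes the target file.)
sha256sum sample.txt > checksums.sha256

# Verify: the data still matches what was recorded.
sha256sum -c checksums.sha256

# If the remote data changes later, verification fails and the change is noticed.
printf 'changed\n' > "$work/remote/sample.txt"
if sha256sum -c checksums.sha256 >/dev/null 2>&1; then
    status=unchanged
else
    status=changed
fi
echo "sample.txt is $status"
```

In a setup like this, only the symlink and the small checksum file would need to be committed, so the repository stays small and nothing is fetched until the link is actually read.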

Alternatives

There are many existing ways to handle large files and large repositories in git. This section explores these alternatives, their strengths, weaknesses and applicability to my use case.

Common to many of them is that they have a higher barrier to entry when migrating from something like a NAS.

Git LFS

Downloading only a small subset of the files is cumbersome. The best method I have found is to

  1. export GIT_LFS_SKIP_SMUDGE=1,
  2. git-lfs fetch files selectively using the -I and -X arguments,
  3. git-lfs checkout files, naming them explicitly if errors are to be avoided, and
  4. repeat 2 and 3 after every git operation that adds, removes or modifies files tracked by LFS.
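The steps above could be wrapped in a small helper along these lines. This is a sketch, not a tested recipe: the function name lfs_partial_update, the RUN dry-run switch and the example patterns are made up for illustration, and only the GIT_LFS_SKIP_SMUDGE variable, the -I/-X options and the fetch and checkout subcommands come from Git LFS itself.

```shell
#!/bin/sh
# Dry run by default: with RUN=echo the commands are only printed.
# Set RUN to the empty string (for example `RUN= sh script.sh`) to execute them.
RUN="${RUN-echo}"

# Step 1: make Git leave LFS pointer files in place instead of downloading content.
export GIT_LFS_SKIP_SMUDGE=1

# lfs_partial_update INCLUDE EXCLUDE (hypothetical helper)
lfs_partial_update() {
    include="$1"
    exclude="$2"
    # Step 2: fetch only the objects matching INCLUDE and not matching EXCLUDE.
    $RUN git lfs fetch -I "$include" -X "$exclude"
    # Step 3: replace pointer files with real content, naming the paths explicitly.
    $RUN git lfs checkout "$include"
}

# Step 4: call this again after every pull, merge or checkout that
# adds, removes or modifies LFS-tracked files.
lfs_partial_update 'data/raw/*' '*.tmp'
```

The patterns are quoted so that the shell passes them to Git LFS unexpanded rather than globbing them against the working tree.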

Reportedly, the footprint of a repo is likely to be much larger than just the files because they are stored once as objects in git and once as files in the working tree.

git-annex

It seems that the default behavior is to store a copy of every file in .git/annex. I made a very brief attempt at replacing the files with symlinks, which seemed to make it unhappy.

It also seems hard to learn and cumbersome to use in general due to its flexibility. Admittedly this may be because I have not spent enough time with it, so that is definitely on my ToDo list.

VFS for Git

"The short answer is that a Linux VFSForGit client is not yet available, but we're working on it!"

(VFSForGIT issue #1226)
