Monkey patch for huggingface_hub to download Git-LFS blobs from Storj

This patch demonstrates the transfer speed that the huggingface_hub Python library can achieve when it downloads from Storj Decentralized Cloud Storage.

HuggingFace Hub stores all large files in Git-LFS.

When the huggingface_hub Python library requests to download such a file, the download request is redirected to the Git-LFS CDN hosted at cdn-lfs.huggingface.co.

This monkey patch modifies the huggingface_hub library to redirect Git-LFS downloads to the Storj Linksharing service hosted at link.storjshare.io.
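
The redirect described above amounts to rewriting the download URL. The following is a minimal sketch of that idea, not the patch's actual code. It assumes that the Storj bucket mirrors the CDN's path layout, so the blob path can be appended to the bucket URL unchanged. The function name redirect_to_storj and the example blob path are hypothetical.

```python
from urllib.parse import urlsplit

# Hosts and default bucket URL as named in this document.
CDN_HOST = "cdn-lfs.huggingface.co"
STORJ_PREFIX = "https://link.storjshare.io/raw/juzlwaj7ovnst5gtkv2km3rkriha/lfs-huggingface"


def redirect_to_storj(url: str, prefix: str = STORJ_PREFIX) -> str:
    """Return the Storj URL for a Git-LFS CDN URL; leave other URLs untouched.

    Assumption: the bucket stores each blob under the same path as the CDN.
    The query string of the CDN URL is dropped in this sketch.
    """
    parts = urlsplit(url)
    if parts.hostname != CDN_HOST:
        return url
    return prefix.rstrip("/") + parts.path


# Hypothetical blob path, for illustration only.
print(redirect_to_storj("https://cdn-lfs.huggingface.co/repos/ab/cd/0123?Expires=1"))
```

URLs that do not point at the Git-LFS CDN pass through unchanged, which matches the idea that only Git-LFS downloads are redirected.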

Prerequisites

The Git-LFS blobs of the respective AI model must be replicated to a Storj bucket, and that bucket must be shared with the Storj Linksharing Service.

We have already replicated the Git-LFS blobs of the StarCoder model to a Storj bucket and shared it: https://link.storjshare.io/raw/juzlwaj7ovnst5gtkv2km3rkriha/lfs-huggingface

If you want to use another AI model, you need to use your own Storj bucket and then configure the patch to use it. See Configuration for more details.

Installation

First, install the patch module:

pip install huggingface-hub-storj-patch

Then add the following import statement at the top of your Python script, before any other import:

import huggingface_hub_storj_patch

Now you can run your script. If the patch is applied successfully, you will see it printing the URLs from which the huggingface_hub library is downloading.

Configuration

The following environment variables configure the behavior of the patch.

HF_HUB_NO_STORJ

If set to true, downloads are not redirected to the Storj Linksharing Service, as if the patch were not applied.

HF_HUB_STORJ_PARALLELISM

Configures how many parallel download connections are opened to the Storj Linksharing Service. The default value is 16.

HF_HUB_STORJ_URL_PREFIX

Configures the URL to the shared Storj bucket that replicates the Git-LFS blobs of the AI model. The default value is the bucket that replicates the StarCoder model: https://link.storjshare.io/raw/juzlwaj7ovnst5gtkv2km3rkriha/lfs-huggingface
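
As a sketch of how these variables might be set from Python, the snippet below assigns them through os.environ before importing the patch. Setting them before the import is a conservative assumption, since this page does not say when the patch reads them. The bucket URL is a hypothetical placeholder; substitute the URL of your own shared bucket.

```python
import os

# Use 32 parallel connections instead of the default of 16.
os.environ["HF_HUB_STORJ_PARALLELISM"] = "32"

# Hypothetical placeholder: replace with the URL of your own shared bucket.
os.environ["HF_HUB_STORJ_URL_PREFIX"] = (
    "https://link.storjshare.io/raw/YOUR_ACCESS_KEY/your-bucket"
)

# Import the patch only after the environment is prepared (assumption: the
# variables may be read at import time).
try:
    import huggingface_hub_storj_patch  # noqa: F401
    patch_loaded = True
except ImportError:
    patch_loaded = False
    print("huggingface-hub-storj-patch is not installed")
```

The same variables can equally be exported in the shell before starting the script.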
