A wrapper around huggingface datasets, invoking an IPFS model manager.

IPFS Huggingface Datasets

This is a model manager and wrapper for Hugging Face datasets. It looks up an index of models from a collection of models and downloads a model over HTTPS, S3, or IPFS, depending on which source is the fastest.

How to use

pip install .

Run python3 example.py for examples of usage.

This is designed to be a drop-in replacement that requires only 2 lines to be changed.

In your Python script:

# from datasets import load_dataset  (original import, replaced by the next line)
from ipfs_datasets import load_dataset
dataset = load_dataset.from_auto_download("bge-small-en-v1.5")

or

# from datasets import load_dataset  (original import, replaced by the next line)
from ipfs_datasets import load_dataset
dataset = load_dataset.from_ipfs("QmccfbkWLYs9K3yucc6b3eSt8s8fKcyRRt24e3CDaeRhM1")

or, to use it with S3 caching:

# from datasets import load_dataset  (original import, replaced by the next line)
from ipfs_datasets import load_dataset
dataset = load_dataset.from_auto_download(
    dataset_name="common-crawl",
    s3cfg={
        "bucket": "cloud",
        "endpoint": "https://storage.googleapis.com",
        "secret_key": "",
        "access_key": ""
    }
)
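
The empty `secret_key` and `access_key` values above are placeholders. One way to avoid hardcoding credentials is to build the `s3cfg` dictionary from environment variables. This is a sketch under assumptions: the helper `s3cfg_from_env` and the `IPFS_DATASETS_S3_*` variable names are made up for illustration and are not part of the library; only the four dictionary keys come from the example above.

```python
import os
from typing import Dict, Mapping, Optional

# Hypothetical environment variable names, mapped to the s3cfg keys shown above.
_ENV_TO_KEY = {
    "IPFS_DATASETS_S3_BUCKET": "bucket",
    "IPFS_DATASETS_S3_ENDPOINT": "endpoint",
    "IPFS_DATASETS_S3_SECRET_KEY": "secret_key",
    "IPFS_DATASETS_S3_ACCESS_KEY": "access_key",
}


def s3cfg_from_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build an s3cfg dict from environment variables, failing on missing values."""
    env = os.environ if env is None else env
    # Treat unset and empty variables the same way: both count as missing.
    missing = [name for name in _ENV_TO_KEY if not env.get(name)]
    if missing:
        raise KeyError("missing S3 settings: " + ", ".join(missing))
    return {key: env[name] for name, key in _ENV_TO_KEY.items()}
```

The result could then be passed as `s3cfg=s3cfg_from_env()` in the `load_dataset.from_auto_download` call shown above.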

IPFS Huggingface Bridge:

For the transformers Python library, visit: https://github.com/endomorphosis/ipfs_transformers/

For the transformers JS client, visit: https://github.com/endomorphosis/ipfs_transformers_js/

For the orbitdb_kit Node.js library, visit: https://github.com/endomorphosis/orbitdb_kit/

For the fireproof_kit Node.js library, visit: https://github.com/endomorphosis/fireproof_kit

For the Faiss KNN index Python library, visit: https://github.com/endomorphosis/ipfs_faiss/

For the Python model manager library, visit: https://github.com/endomorphosis/ipfs_model_manager/

For the Node.js model manager library, visit: https://github.com/endomorphosis/ipfs_model_manager_js/

For the Node.js IPFS Hugging Face scraper with pinning services, visit: https://github.com/endomorphosis/ipfs_huggingface_scraper/

Author - Benjamin Barber
QA - Kevin De Haan
