A wrapper around huggingface datasets, invoking an IPFS model manager.

Project description

IPFS Huggingface Datasets

This is a model manager and wrapper for Hugging Face datasets. It looks up an index of models from a collection of models and downloads a model over HTTPS, S3, or IPFS, depending on which source is the fastest.
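The "fastest source" idea can be illustrated with a minimal sketch. This is not the library's actual implementation: the function name `pick_fastest_source` and the probe callables are hypothetical, and the sketch simply assumes each source can be tested with a small timed request.

```python
import time
from typing import Callable, Dict, Optional


def pick_fastest_source(probes: Dict[str, Callable[[], None]]) -> Optional[str]:
    """Run each probe once and return the name of the quickest one.

    `probes` maps a source name (for example "https", "s3", "ipfs") to a
    callable that performs a small test request. These callables are
    hypothetical stand-ins, not part of ipfs_datasets.
    """
    timings: Dict[str, float] = {}
    for name, probe in probes.items():
        start = time.perf_counter()
        try:
            probe()
        except Exception:
            # A source that fails its probe is skipped entirely.
            continue
        timings[name] = time.perf_counter() - start
    if not timings:
        return None  # every source failed
    return min(timings, key=timings.get)
```

A real implementation would probably probe the sources concurrently and cache the result; the sequential loop here only keeps the sketch short.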

How to use

pip install .

Run python3 example.py for examples of usage.

This is designed to be a drop-in replacement that requires only two lines to be changed.

In your Python script:

# from datasets import load_dataset  (original import, replaced by the line below)
from ipfs_datasets import load_dataset
dataset = load_dataset.from_auto_download("bge-small-en-v1.5")

or

# from datasets import load_dataset  (original import, replaced by the line below)
from ipfs_datasets import load_dataset
dataset = load_dataset.from_ipfs("QmccfbkWLYs9K3yucc6b3eSt8s8fKcyRRt24e3CDaeRhM1")

or, to use it with S3 caching:

# from datasets import load_dataset  (original import, replaced by the line below)
from ipfs_datasets import load_dataset
dataset = load_dataset.from_auto_download(
    dataset_name="common-crawl",
    s3cfg={
        "bucket": "cloud",
        "endpoint": "https://storage.googleapis.com",
        "secret_key": "",
        "access_key": ""
    }
)
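Rather than hard-coding credentials, the s3cfg dictionary could be built from environment variables. The sketch below is only an illustration: the helper `s3cfg_from_env` and the variable names S3_BUCKET, S3_ENDPOINT, S3_SECRET_KEY and S3_ACCESS_KEY are assumptions, not something the library defines. Only the four dictionary keys come from the example above.

```python
import os
from typing import Dict, Mapping, Optional


def s3cfg_from_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build an s3cfg dict from environment variables.

    The environment variable names are hypothetical; the dict keys
    match the s3cfg example in this README.
    """
    env = os.environ if env is None else env
    cfg = {
        "bucket": env.get("S3_BUCKET", ""),
        "endpoint": env.get("S3_ENDPOINT", ""),
        "secret_key": env.get("S3_SECRET_KEY", ""),
        "access_key": env.get("S3_ACCESS_KEY", ""),
    }
    # Fail early instead of passing empty credentials downstream.
    missing = [key for key, value in cfg.items() if not value]
    if missing:
        raise ValueError("missing S3 settings: " + ", ".join(missing))
    return cfg
```

The result could then be passed as the s3cfg argument, as in load_dataset.from_auto_download(dataset_name="common-crawl", s3cfg=s3cfg_from_env()).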

IPFS Huggingface Bridge:

For the Transformers Python library, visit: https://github.com/endomorphosis/ipfs_transformers/

For the Transformers JS client, visit: https://github.com/endomorphosis/ipfs_transformers_js/

For the orbitdb_kit Node.js library, visit: https://github.com/endomorphosis/orbitdb_kit/

For the fireproof_kit Node.js library, visit: https://github.com/endomorphosis/fireproof_kit

For the Faiss KNN index Python library, visit: https://github.com/endomorphosis/ipfs_faiss/

For the Python model manager library, visit: https://github.com/endomorphosis/ipfs_model_manager/

For the Node.js model manager library, visit: https://github.com/endomorphosis/ipfs_model_manager_js/

For the Node.js IPFS Hugging Face scraper with pinning services, visit: https://github.com/endomorphosis/ipfs_huggingface_scraper/

Author - Benjamin Barber

QA - Kevin De Haan

Download files

Source Distribution

ipfs_embeddings_py-0.0.25.tar.gz (69.9 kB)

Uploaded Source

Built Distribution

ipfs_embeddings_py-0.0.25-py3-none-any.whl (78.8 kB)

Uploaded Python 3

File details

Details for the file ipfs_embeddings_py-0.0.25.tar.gz.

File metadata

  • Download URL: ipfs_embeddings_py-0.0.25.tar.gz
  • Upload date:
  • Size: 69.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.3

File hashes

Hashes for ipfs_embeddings_py-0.0.25.tar.gz
Algorithm Hash digest
SHA256 fe63619d3e7070e7096a9b5b96b49f3b8c47727e5b55c4c747eff2c43d8bfdee
MD5 2615cace7ff8ada07c84bc4c59f10b95
BLAKE2b-256 77ff9b4f77302839b0e0de5c146a956352c4fae5a098af488190ed41722e8e7e
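
A downloaded file can be checked against the SHA256 digest listed above using Python's standard hashlib module. This is a minimal sketch; the helper names `sha256_of_file` and `matches_digest` are illustrative, and the file path is assumed to point at a local copy of the archive.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, matches_digest("ipfs_embeddings_py-0.0.25.tar.gz", "fe63619d3e7070e7096a9b5b96b49f3b8c47727e5b55c4c747eff2c43d8bfdee") should return True for an intact copy of the source distribution.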

File details

Details for the file ipfs_embeddings_py-0.0.25-py3-none-any.whl.

File hashes

Hashes for ipfs_embeddings_py-0.0.25-py3-none-any.whl
Algorithm Hash digest
SHA256 1987f5bdb6afec6cae4278a10d4d80e47f88b542976c538171691aec6b34aec2
MD5 cb845009a18e39493ce63f1ce58e73e1
BLAKE2b-256 b2fc2cf339f6f6d83477e028b679c1148d157de0a6999e739248cf3dfc9ebdd8
