Integrate VDK with Huggingface as both data source and target
Huggingface
Versatile Data Kit (VDK) plugin for integrating with Huggingface as both a data source and a target. This plugin allows you to ingest data payloads into a Huggingface repository and makes it easier to work with datasets stored in Huggingface.
Usage
pip install vdk-huggingface
The plugin adds a new ingestion method, "huggingface", which can be used like this:
job_input.send_object_for_ingestion(data, method="huggingface")
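For context, the call above would normally sit inside a data job step. The following is a minimal sketch, assuming a standard VDK Python step whose `run(job_input)` function is invoked by VDK. The `build_rows` helper and its sample records are hypothetical and only stand in for your own data.

```python
def build_rows():
    """Hypothetical sample payloads; replace these with your own records."""
    return [
        {"id": 1, "text": "first example"},
        {"id": 2, "text": "second example"},
    ]


def run(job_input):
    # VDK passes the job_input object to run() when it executes the step.
    # Each payload is a plain dict, sent one at a time using the
    # "huggingface" ingestion method that this plugin registers.
    for row in build_rows():
        job_input.send_object_for_ingestion(row, method="huggingface")
```

Sending one dictionary per call keeps the sketch simple. Whether and how the plugin batches payloads before uploading them is not covered here.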
Configuration
(vdk config-help is a useful command for browsing all the config options of your vdk installation.)
Name | Description | (example) Value |
---|---|---|
HUGGINGFACE_TOKEN | HuggingFace API token for authentication. Get one from HuggingFace Settings | "" |
HUGGINGFACE_REPO_ID | HuggingFace Dataset repository ID | "username/test-dataset" |
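Before running a job, it can help to check that both options are set. The sketch below assumes the options are supplied as environment variables carrying a `VDK_` prefix; that prefix is an assumption about your setup, so confirm the exact names with vdk config-help. The `missing_settings` helper is hypothetical and is not part of the plugin.

```python
import os

# Option names taken from the configuration table above.
REQUIRED = ("HUGGINGFACE_TOKEN", "HUGGINGFACE_REPO_ID")


def missing_settings(env=None, prefix="VDK_"):
    """Return the required option names that are unset or empty.

    The "VDK_" prefix is an assumption; pass prefix="" if your
    installation reads the bare option names.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(prefix + name)]
```

For example, `missing_settings({"VDK_HUGGINGFACE_TOKEN": "some-token"})` returns `["HUGGINGFACE_REPO_ID"]`, because the repository ID is absent from the mapping.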
Build and testing
pip install -r requirements.txt
pip install -e .
pytest
In the VDK repo, the ../build-plugin.sh script can also be used.
Note about the CI/CD:
.plugin-ci.yaml is needed only for plugins that are part of the Versatile Data Kit Plugin repo.
The CI/CD is separated into two stages: a build stage and a release stage. The build stage is made up of a few jobs, all of which inherit from the same job configuration and differ only in the Python version they use (3.7, 3.8, 3.9 and 3.10). They run according to rules, which are ordered so that changes to a plugin's directory trigger that plugin's CI, but changes to a different plugin do not.
Download files
Source Distribution
File details
Details for the file vdk-huggingface-0.1.1066314998.tar.gz.
File metadata
- Download URL: vdk-huggingface-0.1.1066314998.tar.gz
- Upload date:
- Size: 4.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.12
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | 60cd92e41cfdf5a7c4374234c8d2d5618f144b71a4f9d0dcdff92fc45b9912ae |
MD5 | 59cc523fb1165b0b4e2f5255917ece9e |
BLAKE2b-256 | bbeff2bd7a9ed3fe61d1887085d0a5d1861e34239a068ec8df5deb14f6e4039c |