Hugging Face dataset-backed cloud file storage library
HuggingFaceStorage
Python library for cloud-style file storage backed by a private Hugging Face dataset repository.
Features
- Immutable file version history per logical remote path
- Soft delete via tombstone versions
- Content-addressed blob storage (sha256) to avoid duplicate uploads
- HF_TOKEN-based authentication
- Public API: put, put_zip, get, list, delete, history
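To illustrate the content-addressing idea, here is a minimal sketch of how a blob key could be derived from a file's SHA-256 digest so that identical content is uploaded only once. The function names and the `blobs/sha256/` key layout are assumptions made for illustration; they are not taken from the hf_storage source.

```python
import hashlib
from pathlib import Path


def blob_key(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return a hypothetical blob key derived from the file's SHA-256 digest."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    # Assumed key layout; the real repository layout may differ.
    return f"blobs/sha256/{digest.hexdigest()}"


def needs_upload(path: Path, existing_keys: set[str]) -> bool:
    """True only when no blob with identical content is already stored."""
    return blob_key(path) not in existing_keys
```

Two files with the same bytes map to the same key regardless of their names, which is what lets a second upload be skipped.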
Project Structure
HuggingFaceStorage/
src/hf_storage/
tests/unit/
tests/integration/
requirements.txt
pyproject.toml
Setup
- Create the virtual environment (Python 3.11):
py -3.11 -m venv .venv
- Activate:
.\.venv\Scripts\Activate.ps1
- Install dependencies:
python -m pip install --upgrade pip
python -m pip install -r requirements.txt
python -m pip install -e .
- Deactivate when finished:
deactivate
Authentication
Set your Hugging Face token before using the library:
$env:HF_TOKEN = "hf_xxx"
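If you want a script to fail early when the token is missing, a small guard like the following can help. `read_hf_token` is a hypothetical helper written for this example, not part of the hf_storage API; it only checks that the `HF_TOKEN` variable is present and non-empty.

```python
import os


def read_hf_token(env=None) -> str:
    """Return HF_TOKEN from the given mapping (os.environ by default)."""
    source = os.environ if env is None else env
    token = source.get("HF_TOKEN", "").strip()
    if not token:
        # Fail before any network call is attempted.
        raise RuntimeError("HF_TOKEN is not set; set it before using the library")
    return token
```

Passing a plain dict as `env` makes the check easy to exercise in unit tests without touching the real environment.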
Quick Example
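from hf_storage import HFStorage, StorageConfig

# Point the client at a private dataset repo; HF_TOKEN must be set in the environment.
storage = HFStorage(StorageConfig(repo_id="your-namespace/your-private-dataset"))
storage.setup(create_if_missing=True, private=True)

# Upload a file, and a folder as a zip archive, to logical remote paths.
version = storage.put("local.txt", "docs/local.txt")
zip_version = storage.put_zip("my_folder", "archives/my_folder")

# Download a stored file back to a local path.
storage.get("docs/local.txt", "restored.txt")

# Inspect stored paths under a prefix and one path's version history.
entries = storage.list(prefix="docs/")
history = storage.history("docs/local.txt")

# Delete the logical path (see Features: soft delete via tombstone versions).
deleted = storage.delete("docs/local.txt")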
Running Tests
Unit tests:
pytest tests/unit
Integration tests (real HF repo):
$env:HF_STORAGE_INTEGRATION = "1"
$env:HF_STORAGE_TEST_REPO = "your-namespace/your-private-dataset"
pytest tests/integration
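The two environment variables above act as an opt-in gate for tests that touch a real repository. A minimal sketch of such a gate, assuming the flag must equal `"1"` exactly and that an empty repo id counts as disabled (`integration_settings` is a hypothetical helper, not taken from the project's test suite):

```python
import os


def integration_settings(env=None):
    """Return the test repo id when integration tests are enabled, else None."""
    source = os.environ if env is None else env
    if source.get("HF_STORAGE_INTEGRATION") != "1":
        return None
    # An enabled flag without a repo id is treated as "not configured".
    repo_id = source.get("HF_STORAGE_TEST_REPO", "").strip()
    return repo_id or None
```

In a pytest suite, the result could feed `pytest.mark.skipif(integration_settings() is None, reason="integration tests disabled")` so that a plain `pytest` run never reaches the network.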
Publishing (Maintainers)
publish.bat is maintainer tooling for the package release workflow only. It is not part of the public runtime API.
Set Twine credentials via environment variables:
$env:TWINE_USERNAME = "__token__"
$env:TWINE_TEST_PASSWORD = "pypi-<testpypi-token>"
$env:TWINE_PASSWORD = "pypi-<pypi-production-token>"
Default release target is TestPyPI:
.\publish.bat
Publish to production PyPI explicitly:
.\publish.bat pypi
What publish.bat does:
- Runs unit tests (pytest tests/unit)
- Builds wheel + sdist (python -m build)
- Validates artifacts (twine check dist/*)
- Uploads to TestPyPI by default, or PyPI when pypi is passed
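For readers who are not on Windows, the release steps listed above can be sketched as an ordered command list. This is an approximation of the workflow, not a port of publish.bat: the `release_commands` function and the exact `twine upload --repository` invocation are assumptions.

```python
def release_commands(target: str = "testpypi") -> list[list[str]]:
    """Return the release steps in order for the given upload target."""
    if target not in {"testpypi", "pypi"}:
        raise ValueError(f"unknown release target: {target!r}")
    return [
        ["pytest", "tests/unit"],           # 1. run unit tests
        ["python", "-m", "build"],          # 2. build wheel + sdist
        ["twine", "check", "dist/*"],       # 3. validate artifacts
        # 4. upload; assumes twine's built-in repository names
        ["twine", "upload", "--repository", target, "dist/*"],
    ]
```

Each step should stop the release if it fails, for example by running the commands with `subprocess.run(cmd, check=True)`.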
Credential behavior:
- .\publish.bat (default, TestPyPI) uses TWINE_TEST_PASSWORD
- .\publish.bat pypi (production) uses TWINE_PASSWORD
- TWINE_USERNAME must be __token__ for both
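The credential rules can be expressed as a small lookup. The sketch below assumes the selected token is handed to twine through its standard `TWINE_PASSWORD` variable; how publish.bat actually passes it is not documented here, and `twine_env` is a hypothetical name.

```python
def twine_env(target: str, env: dict[str, str]) -> dict[str, str]:
    """Pick the upload credentials for a target from environment-style input."""
    password_vars = {"testpypi": "TWINE_TEST_PASSWORD", "pypi": "TWINE_PASSWORD"}
    if target not in password_vars:
        raise ValueError(f"unknown release target: {target!r}")
    if env.get("TWINE_USERNAME") != "__token__":
        raise ValueError("TWINE_USERNAME must be __token__")
    password = env.get(password_vars[target])
    if not password:
        raise ValueError(f"{password_vars[target]} is not set")
    # Assumption: twine reads the chosen token from TWINE_PASSWORD.
    return {"TWINE_USERNAME": "__token__", "TWINE_PASSWORD": password}
```

Keeping the TestPyPI and production tokens in separate variables means a default run cannot accidentally use the production token.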
Important: bump package version before each release. PyPI/TestPyPI do not allow re-uploading the same version.
One-command Zip Upload (Windows)
Use the batch wrapper to zip and upload a file or directory:
.\put_zip.bat "C:\path\to\folder_or_file" "backups/my_archive"
Notes:
- HF_TOKEN and HF_STORAGE_REPO_ID are read from .env (or current env vars).
- If the remote path does not end with .zip, .zip is appended automatically.
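The suffix rule from the notes is simple enough to show directly. This is a sketch of the described behavior, assuming a case-sensitive check; `normalize_zip_path` is a hypothetical name, not the wrapper's actual code.

```python
def normalize_zip_path(remote_path: str) -> str:
    """Append .zip unless the remote path already ends with it."""
    if remote_path.endswith(".zip"):
        return remote_path
    return remote_path + ".zip"
```

So `backups/my_archive` becomes `backups/my_archive.zip`, while `backups/my_archive.zip` is left unchanged.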
List and Download (Windows)
List stored logical paths:
.\list_files.bat
Include soft-deleted entries too:
.\list_files.bat 1
Download latest version by logical path:
.\get_file.bat "backup/venv.zip" ".\downloads\venv.zip"
Download a specific version:
.\get_file.bat "backup/venv.zip" ".\downloads\venv.zip" "version_id_here"
Soft delete a logical path:
.\delete_file.bat "backup/venv.zip"
Hard delete a logical path (removes manifest entry and unreferenced blob objects):
.\delete_file.bat "backup/venv.zip" hard
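The difference between soft and hard delete is easier to see with a toy in-memory model. The sketch below is an illustration only: the `Version` and `Manifest` classes, their fields, and the tombstone representation (a version with no blob) are assumptions, not the library's actual manifest format.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Version:
    version_id: str
    blob: Optional[str]  # None marks a tombstone (soft delete)


@dataclass
class Manifest:
    paths: dict[str, list[Version]] = field(default_factory=dict)
    blobs: set[str] = field(default_factory=set)

    def put(self, path: str, version_id: str, blob: str) -> None:
        # Versions are only ever appended, never rewritten.
        self.paths.setdefault(path, []).append(Version(version_id, blob))
        self.blobs.add(blob)

    def soft_delete(self, path: str, version_id: str) -> None:
        # History is kept; the tombstone just hides the path.
        self.paths[path].append(Version(version_id, None))

    def latest(self, path: str) -> Optional[Version]:
        versions = self.paths.get(path, [])
        if not versions or versions[-1].blob is None:
            return None
        return versions[-1]

    def hard_delete(self, path: str) -> None:
        # Drop the manifest entry, then discard blobs nothing else references.
        self.paths.pop(path, None)
        referenced = {v.blob for vs in self.paths.values() for v in vs if v.blob}
        self.blobs &= referenced
```

In this model a soft-deleted path still has its full history available, while a hard delete removes the entry and any blob that no other path still points to.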
Download files
Source Distribution
Built Distribution
File details
Details for the file hf_storage-0.1.0.tar.gz.
File metadata
- Download URL: hf_storage-0.1.0.tar.gz
- Upload date:
- Size: 10.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 004a68dc5394800e1b3f84e4c726746db9bea7b3133f589b2c89bb6cb78e14ee |
| MD5 | 8686e8b30c759f9479cdf2514da06f79 |
| BLAKE2b-256 | 555cef609de228fca5eeb4a11550910daf55b38a7d4c3e19bd124d04c09cfccb |
File details
Details for the file hf_storage-0.1.0-py3-none-any.whl.
File metadata
- Download URL: hf_storage-0.1.0-py3-none-any.whl
- Upload date:
- Size: 10.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bf25f9f55fc79c457d0d1a0f3cec70aa00f3ca5d58b883d273834fc2e7aa3b26 |
| MD5 | db0c82dda2cd3c8e0744a9479f117750 |
| BLAKE2b-256 | d9782804e4f8c19d3ce786318e0376d294349b5829612dd990a42ff6a15aea8e |