S3-compatible storage backend for genblaze (B2, R2, MinIO, AWS)
genblaze-s3
S3-compatible storage backend for genblaze AI media pipelines — durable, content-addressable, dedup-ready. Works with Backblaze B2 (recommended default), Cloudflare R2, MinIO, and AWS S3.
genblaze-s3 plugs into the genblaze ObjectStorageSink to persist AI-generated video, image, and audio — plus their SHA-256 provenance manifests — onto any S3-compatible object store. It handles streaming downloads from provider CDNs, SHA-256 hashing, multipart uploads with retries, pre-signed URLs for private buckets, and Object Lock retention for tamper-evident manifests on Backblaze B2.
Why genblaze-s3
- Durable by default: assets and manifests land in object storage instead of staying behind a provider's expiring CDN URL.
- Backblaze B2 first-class: one-line `S3StorageBackend.for_backblaze()` helper, plus Object Lock support for immutable provenance.
- Content-addressable dedup: `KeyStrategy.CONTENT_ADDRESSABLE` stores each unique asset once, keyed by SHA-256.
- Works with any S3 API: AWS S3, Backblaze B2, Cloudflare R2, MinIO, SeaweedFS, Wasabi, Ceph.
- Presigned URLs: private buckets get time-limited URLs; public buckets get permanent `public_url_base` links.
- Resilient multipart uploads: credential-preserving retries, preflight checks, no partial writes.
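The streaming-plus-hashing step mentioned above can be pictured with a small standard-library sketch. This is not genblaze-s3's internal code: `sha256_stream` and `CHUNK_SIZE` are hypothetical names, and an in-memory stream stands in for a provider CDN response. It only illustrates hashing an asset chunk by chunk so the whole file never has to sit in memory.

```python
import hashlib
import io
from typing import BinaryIO

CHUNK_SIZE = 1024 * 1024  # 1 MiB; an arbitrary choice for this sketch


def sha256_stream(source: BinaryIO, chunk_size: int = CHUNK_SIZE) -> tuple[str, int]:
    """Hash a binary stream chunk by chunk; return (hex digest, byte count)."""
    digest = hashlib.sha256()
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # empty read means end of stream
            break
        digest.update(chunk)
        total += len(chunk)
    return digest.hexdigest(), total


# Usage: io.BytesIO stands in for a downloaded response body (assumption).
sha, size = sha256_stream(io.BytesIO(b"hello world"), chunk_size=4)
print(sha, size)
```

The digest is identical regardless of chunk size, so the chunk size can be tuned for throughput without affecting the resulting key or manifest entry.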
Backends
| Provider | Helper | Notes |
|---|---|---|
| Backblaze B2 | `S3StorageBackend.for_backblaze("bucket")` | Reads `B2_KEY_ID` / `B2_APP_KEY`; Object Lock retention supported |
| AWS S3 | `S3StorageBackend(bucket="...", region="...")` | Standard AWS credential chain |
| Cloudflare R2 | `S3StorageBackend(bucket="...", endpoint_url="https://<acct>.r2.cloudflarestorage.com")` | |
| MinIO / self-hosted | `S3StorageBackend(bucket="...", endpoint_url="https://minio.example.com")` | |
Install
pip install genblaze-s3
Quickstart — Backblaze B2 (recommended)
export B2_KEY_ID="..."
export B2_APP_KEY="..."
from genblaze_core import KeyStrategy, ObjectStorageSink, Pipeline
from genblaze_s3 import S3StorageBackend
from genblaze_replicate import ReplicateProvider
backend = S3StorageBackend.for_backblaze(
    "my-genblaze-bucket",
    # Defaults to "us-west-004". Pass the region your bucket actually lives
    # in (e.g. "us-east-005", "eu-central-003") to skip the redirect hop.
    # The backend auto-corrects on first use, but a correct hint saves a round trip.
    region="us-west-004",
    # Optional: pass public_url_base for public buckets (get_url returns permanent URLs)
    public_url_base="https://f004.backblazeb2.com/file/my-genblaze-bucket",
)
sink = ObjectStorageSink(
    backend,
    prefix="genblaze-assets",
    key_strategy=KeyStrategy.CONTENT_ADDRESSABLE,  # dedupe by SHA-256
)
result = (
    Pipeline("b2-demo")
    .step(ReplicateProvider(), model="black-forest-labs/flux-schnell",
          prompt="a photorealistic cat wearing a tiny spacesuit")
    .run(sink=sink, timeout=120)
)
# Print the stored URL and SHA-256 of every asset produced by the run
for step in result.run.steps:
    for asset in step.assets:
        print(asset.url, asset.sha256)
backend.close()
Resulting bucket layout with CONTENT_ADDRESSABLE:
genblaze-assets/
├── assets/{sha[:2]}/{sha[2:4]}/{sha}.ext # one object per unique asset
└── manifests/{run_id}.json # one manifest per run
Switch to KeyStrategy.HIERARCHICAL for a runs/{date}/{run_id}/… layout, which is better for run-grouped browsing but worse for dedup.
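To see why the content-addressable layout deduplicates, here is a minimal sketch that derives a key of the shape shown in the tree above from an asset's bytes. The function name `content_addressed_key` is hypothetical and this is not the library's code; it only mirrors the documented `assets/{sha[:2]}/{sha[2:4]}/{sha}.ext` pattern.

```python
import hashlib


def content_addressed_key(prefix: str, data: bytes, ext: str) -> str:
    """Build `{prefix}/assets/{sha[:2]}/{sha[2:4]}/{sha}.{ext}` from the bytes."""
    sha = hashlib.sha256(data).hexdigest()
    return f"{prefix}/assets/{sha[:2]}/{sha[2:4]}/{sha}.{ext}"


first = content_addressed_key("genblaze-assets", b"same bytes", "png")
second = content_addressed_key("genblaze-assets", b"same bytes", "png")
other = content_addressed_key("genblaze-assets", b"different bytes", "png")

# Identical bytes map to the same key; different bytes map to a different key.
print(first == second, first == other)  # True False
```

Because the key depends only on the content, uploading the same asset from two different runs targets the same object, which is what lets the store keep one copy.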
Quickstart — AWS S3
export AWS_ACCESS_KEY_ID="..."
export AWS_SECRET_ACCESS_KEY="..."
from genblaze_s3 import S3StorageBackend
backend = S3StorageBackend(bucket="my-genblaze-bucket", region="us-east-1")
# get_url() returns pre-signed URLs when public_url_base is not set
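The URL behavior noted above (permanent links when `public_url_base` is set, pre-signed links otherwise) can be pictured as a small decision function. This is an illustrative sketch and not the genblaze-s3 implementation: `resolve_url` and `fake_presign` are hypothetical names, and the signer is a placeholder for a real one such as boto3's `generate_presigned_url`.

```python
from typing import Callable, Optional
from urllib.parse import quote


def resolve_url(
    key: str,
    public_url_base: Optional[str],
    presign: Callable[[str, int], str],
    expires_in: int = 3600,
) -> str:
    """Return a permanent URL for public buckets, else a time-limited one."""
    if public_url_base:
        # Public bucket: join the base and the (URL-quoted) object key.
        return f"{public_url_base.rstrip('/')}/{quote(key)}"
    # Private bucket: delegate to a signer that embeds an expiry.
    return presign(key, expires_in)


def fake_presign(key: str, expires_in: int) -> str:
    """Placeholder signer for the sketch; a real one would sign the request."""
    return f"https://example.invalid/{quote(key)}?expires={expires_in}"


print(resolve_url("assets/ab/cd/abcd.png", "https://cdn.example.com/", fake_presign))
print(resolve_url("assets/ab/cd/abcd.png", None, fake_presign, expires_in=60))
```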
Quickstart — Cloudflare R2 / MinIO
from genblaze_s3 import S3StorageBackend
# R2
backend = S3StorageBackend(
    bucket="my-bucket",
    endpoint_url="https://<account-id>.r2.cloudflarestorage.com",
    access_key_id="...", secret_access_key="...",
)
# MinIO
backend = S3StorageBackend(
    bucket="my-bucket",
    endpoint_url="https://minio.example.com",
    access_key_id="...", secret_access_key="...",
)
Object Lock for immutable manifests (Backblaze B2)
Genblaze can apply Object Lock retention to uploaded manifests, producing tamper-evident provenance suitable for compliance, legal, and content-authenticity workflows. See the main repo docs for the Object Lock guide.
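For orientation, the S3 API expresses Object Lock retention as two per-object parameters, `ObjectLockMode` and `ObjectLockRetainUntilDate`. The sketch below builds those parameters with the standard library only. How genblaze-s3 configures retention is not shown in this README, so the helper `object_lock_params` and its arguments are hypothetical; treat it as an illustration of the underlying S3 concept, not the package's API.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

VALID_MODES = ("COMPLIANCE", "GOVERNANCE")  # the two S3 Object Lock modes


def object_lock_params(
    retain_days: int,
    mode: str = "COMPLIANCE",
    now: Optional[datetime] = None,
) -> dict:
    """Build S3 PutObject keyword arguments for Object Lock retention."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}")
    if retain_days <= 0:
        raise ValueError("retain_days must be positive")
    start = now or datetime.now(timezone.utc)
    return {
        "ObjectLockMode": mode,
        "ObjectLockRetainUntilDate": start + timedelta(days=retain_days),
    }


params = object_lock_params(30, now=datetime(2025, 1, 1, tzinfo=timezone.utc))
print(params["ObjectLockMode"], params["ObjectLockRetainUntilDate"].isoformat())
```

With a boto3 client, a dictionary like this could be splatted into `put_object(Bucket=..., Key=..., Body=..., **params)`, provided the bucket was created with Object Lock enabled.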
Documentation
- Main repo: https://github.com/backblaze-labs/genblaze
- Storage feature doc: https://github.com/backblaze-labs/genblaze/blob/main/docs/features/object-storage.md
- Runnable examples: `b2_storage_pipeline.py`, `s3_storage_pipeline.py`
Related packages
- `genblaze-core`: the pipeline SDK
- Provider adapters: `genblaze-openai` · `genblaze-google` · `genblaze-runway` · `genblaze-luma` · `genblaze-replicate`
License
MIT