S3-compatible storage backend for genblaze (B2, R2, MinIO, AWS)

genblaze-s3

S3-compatible storage backend for genblaze AI media pipelines — durable, content-addressable, dedup-ready. Works with Backblaze B2 (recommended default), Cloudflare R2, MinIO, and AWS S3.

genblaze-s3 plugs into the genblaze ObjectStorageSink to persist AI-generated video, image, and audio — plus their SHA-256 provenance manifests — onto any S3-compatible object store. It handles streaming downloads from provider CDNs, SHA-256 hashing, multipart uploads with retries, pre-signed URLs for private buckets, and Object Lock retention for tamper-evident manifests on Backblaze B2.
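
The streaming SHA-256 step described above can be illustrated with the standard library alone. This is a minimal sketch, not genblaze-s3's implementation: the function name `sha256_stream` and the 1 MiB chunk size are assumptions made for illustration.

```python
import hashlib
import io
from typing import BinaryIO

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; an arbitrary choice for this sketch


def sha256_stream(stream: BinaryIO, chunk_size: int = CHUNK_SIZE) -> str:
    """Return the hex SHA-256 of a binary stream without loading it all into memory."""
    digest = hashlib.sha256()
    # iter() with a sentinel stops once read() returns b"" (end of stream).
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Any file-like object opened in binary mode works the same way.
    print(sha256_stream(io.BytesIO(b"example asset bytes")))
```

Hashing chunk by chunk keeps memory use flat regardless of asset size, which matters for video files.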

Why genblaze-s3

  • Durable by default — Assets + manifests land in object storage, never stuck in a provider's expiring CDN URL.
  • Backblaze B2 first-class — One-line S3StorageBackend.for_backblaze() helper, Object Lock support for immutable provenance.
  • Content-addressable dedup: KeyStrategy.CONTENT_ADDRESSABLE stores each unique asset once by SHA-256.
  • Works with any S3 API — AWS S3, Backblaze B2, Cloudflare R2, MinIO, SeaweedFS, Wasabi, Ceph.
  • Presigned URLs — private buckets get time-limited URLs; public buckets get permanent public_url_base links.
  • Resilient multipart uploads — credential-preserving retries, preflight checks, no partial writes.
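
To make the retry idea in the last bullet concrete, here is a generic sketch of retrying a failing operation with exponential backoff. It is an illustration only and does not reflect how genblaze-s3 retries multipart uploads internally; `with_retries` and its parameters are hypothetical names.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation(), retrying on OSError with exponentially growing delays."""
    for attempt in range(attempts):
        try:
            return operation()
        except OSError:
            # Re-raise once the final attempt has failed.
            if attempt == attempts - 1:
                raise
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

Network errors such as `ConnectionError` and `TimeoutError` are subclasses of `OSError`, so they are retried; other exceptions propagate immediately.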

Backends

Provider | Helper | Notes
Backblaze B2 | S3StorageBackend.for_backblaze("bucket") | Reads B2_KEY_ID / B2_APP_KEY; Object Lock retention supported
AWS S3 | S3StorageBackend(bucket="...", region="...") | Standard AWS credential chain
Cloudflare R2 | S3StorageBackend(bucket="...", endpoint_url="https://<acct>.r2.cloudflarestorage.com") |
MinIO / self-hosted | S3StorageBackend(bucket="...", endpoint_url="https://minio.example.com") |

Install

pip install genblaze-s3

Quickstart — Backblaze B2 (recommended)

export B2_KEY_ID="..."
export B2_APP_KEY="..."

from genblaze_core import KeyStrategy, ObjectStorageSink, Pipeline
from genblaze_s3 import S3StorageBackend
from genblaze_replicate import ReplicateProvider

backend = S3StorageBackend.for_backblaze(
    "my-genblaze-bucket",
    # Defaults to "us-west-004". Pass the region your bucket actually lives
    # in (e.g. "us-east-005", "eu-central-003") to skip the redirect hop —
    # the backend auto-corrects on first use, but a right hint saves an RTT.
    region="us-west-004",
    # Optional: pass public_url_base for public buckets (get_url returns permanent URLs)
    public_url_base="https://f004.backblazeb2.com/file/my-genblaze-bucket",
)

sink = ObjectStorageSink(
    backend,
    prefix="genblaze-assets",
    key_strategy=KeyStrategy.CONTENT_ADDRESSABLE,   # dedupe by SHA-256
)

result = (
    Pipeline("b2-demo")
    .step(ReplicateProvider(), model="black-forest-labs/flux-schnell",
          prompt="a photorealistic cat wearing a tiny spacesuit")
    .run(sink=sink, timeout=120)
)

for step in result.run.steps:
    for asset in step.assets:
        print(asset.url, asset.sha256)

backend.close()

Resulting bucket layout with CONTENT_ADDRESSABLE:

genblaze-assets/
├── assets/{sha[:2]}/{sha[2:4]}/{sha}.ext    # one object per unique asset
└── manifests/{run_id}.json                   # one manifest per run

Switch to KeyStrategy.HIERARCHICAL for a runs/{date}/{run_id}/… layout (better for browsing by run, worse for dedup).
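
The content-addressable layout shown above can be reproduced in a few lines. This is a sketch of the key scheme as documented here, not the library's own code; the helper name `content_addressable_key` is hypothetical, and `ext` is assumed to include the leading dot.

```python
import hashlib


def content_addressable_key(prefix: str, data: bytes, ext: str) -> str:
    """Build an object key of the form {prefix}/assets/{sha[:2]}/{sha[2:4]}/{sha}{ext}."""
    sha = hashlib.sha256(data).hexdigest()
    # The two-level fan-out keeps any single "directory" from growing too large.
    return f"{prefix}/assets/{sha[:2]}/{sha[2:4]}/{sha}{ext}"


if __name__ == "__main__":
    first = content_addressable_key("genblaze-assets", b"same bytes", ".png")
    second = content_addressable_key("genblaze-assets", b"same bytes", ".png")
    # Identical content maps to the same key, so it is stored only once.
    print(first == second)
```

Because the key depends only on the bytes, re-running a pipeline that yields an identical asset targets the same object rather than creating a duplicate.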

Quickstart — AWS S3

export AWS_ACCESS_KEY_ID="..."
export AWS_SECRET_ACCESS_KEY="..."

from genblaze_s3 import S3StorageBackend

backend = S3StorageBackend(bucket="my-genblaze-bucket", region="us-east-1")
# get_url() returns pre-signed URLs when public_url_base is not set
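
The URL behavior noted in the comment above (permanent links when public_url_base is set, pre-signed URLs otherwise) can be sketched as a simple branch. This is an illustration of the documented behavior, not the body of get_url(); `resolve_url` and the `presign` callable are hypothetical stand-ins.

```python
from typing import Callable, Optional
from urllib.parse import quote


def resolve_url(
    key: str,
    public_url_base: Optional[str],
    presign: Callable[[str], str],
) -> str:
    """Return a permanent public URL if a base is configured, else a pre-signed one."""
    if public_url_base:
        # quote() leaves "/" intact by default, so key path segments are preserved.
        return f"{public_url_base.rstrip('/')}/{quote(key)}"
    # presign stands in for whatever produces a time-limited URL for a private bucket.
    return presign(key)


if __name__ == "__main__":
    fake_presign = lambda key: f"https://example.invalid/{key}?signature=placeholder"
    print(resolve_url("assets/ab/cd/file.png", None, fake_presign))
    print(resolve_url("assets/ab/cd/file.png", "https://cdn.example.com/bucket/", fake_presign))
```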

Quickstart — Cloudflare R2 / MinIO

from genblaze_s3 import S3StorageBackend

# R2
backend = S3StorageBackend(
    bucket="my-bucket",
    endpoint_url="https://<account-id>.r2.cloudflarestorage.com",
    access_key_id="...", secret_access_key="...",
)

# MinIO
backend = S3StorageBackend(
    bucket="my-bucket",
    endpoint_url="https://minio.example.com",
    access_key_id="...", secret_access_key="...",
)
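
Rather than hard-coding keys as in the snippets above, the credentials can be read from the environment. The sketch below only builds the keyword arguments; the environment variable names `S3_ACCESS_KEY_ID` and `S3_SECRET_ACCESS_KEY` and the helper `s3_backend_kwargs` are assumptions for illustration, not names genblaze-s3 is known to read.

```python
import os
from typing import Dict, Mapping


def s3_backend_kwargs(
    bucket: str,
    endpoint_url: str,
    env: Mapping[str, str] = os.environ,
) -> Dict[str, str]:
    """Collect S3StorageBackend keyword arguments, failing fast on missing credentials."""
    # Hypothetical variable names; pick whatever fits your deployment.
    required = ("S3_ACCESS_KEY_ID", "S3_SECRET_ACCESS_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "bucket": bucket,
        "endpoint_url": endpoint_url,
        "access_key_id": env["S3_ACCESS_KEY_ID"],
        "secret_access_key": env["S3_SECRET_ACCESS_KEY"],
    }
```

The result could then be passed as `S3StorageBackend(**s3_backend_kwargs("my-bucket", "https://minio.example.com"))`, using the parameter names shown in the quickstart.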

Object Lock for immutable manifests (Backblaze B2)

Genblaze can apply Object Lock retention to uploaded manifests, producing tamper-evident provenance suitable for compliance, legal, and content-authenticity workflows. See the main repo docs for the Object Lock guide.
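
For orientation, the S3 API expresses Object Lock retention as a mode plus a retain-until timestamp. The sketch below builds those two request fields using the parameter names that boto3's put_object accepts (`ObjectLockMode`, `ObjectLockRetainUntilDate`). How genblaze exposes retention is not shown in this section, so the helper `object_lock_params` is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Optional


def object_lock_params(
    retention_days: int,
    mode: str = "COMPLIANCE",
    now: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Build S3 Object Lock request fields for an upload retained for retention_days."""
    if mode not in ("COMPLIANCE", "GOVERNANCE"):
        raise ValueError("mode must be COMPLIANCE or GOVERNANCE")
    if retention_days <= 0:
        raise ValueError("retention_days must be positive")
    # Use an explicit UTC timestamp so the retention deadline is unambiguous.
    start = now or datetime.now(timezone.utc)
    return {
        "ObjectLockMode": mode,
        "ObjectLockRetainUntilDate": start + timedelta(days=retention_days),
    }
```

The bucket must have Object Lock enabled for these fields to be accepted.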

License

MIT
