Skip to main content

S3-compatible storage backend for genblaze (B2, R2, MinIO, AWS)

Project description

genblaze-s3

S3-compatible storage backend for genblaze AI media pipelines — durable, content-addressable, dedup-ready. Works with Backblaze B2 (recommended default), Cloudflare R2, MinIO, and AWS S3.

genblaze-s3 plugs into the genblaze ObjectStorageSink to persist AI-generated video, image, and audio — plus their SHA-256 provenance manifests — onto any S3-compatible object store. It handles streaming downloads from provider CDNs, SHA-256 hashing, multipart uploads with retries, pre-signed URLs for private buckets, and Object Lock retention for tamper-evident manifests on Backblaze B2.

Why genblaze-s3

  • Durable by default — Assets + manifests land in object storage, never stuck in a provider's expiring CDN URL.
  • Backblaze B2 first-class — One-line S3StorageBackend.for_backblaze() helper, Object Lock support for immutable provenance.
  • Content-addressable dedupKeyStrategy.CONTENT_ADDRESSABLE stores each unique asset once by SHA-256.
  • Works with any S3 API — AWS S3, Backblaze B2, Cloudflare R2, MinIO, SeaweedFS, Wasabi, Ceph.
  • Presigned URLs — private buckets get time-limited URLs; public buckets get permanent public_url_base links.
  • Resilient multipart uploads — credential-preserving retries, preflight checks, no partial writes.

Backends

Provider Helper Notes
Backblaze B2 S3StorageBackend.for_backblaze("bucket") Reads B2_KEY_ID / B2_APP_KEY; Object Lock retention supported
AWS S3 S3StorageBackend(bucket="...", region="...") Standard AWS credential chain
Cloudflare R2 S3StorageBackend(bucket="...", endpoint_url="https://<acct>.r2.cloudflarestorage.com")
MinIO / self-hosted S3StorageBackend(bucket="...", endpoint_url="https://minio.example.com")

Install

pip install genblaze-s3

Quickstart — Backblaze B2 (recommended)

export B2_KEY_ID="..."
export B2_APP_KEY="..."
from genblaze_core import KeyStrategy, ObjectStorageSink, Pipeline
from genblaze_s3 import S3StorageBackend
from genblaze_replicate import ReplicateProvider

backend = S3StorageBackend.for_backblaze(
    "my-genblaze-bucket",
    region="us-west-004",
    # Optional: pass public_url_base for public buckets (get_url returns permanent URLs)
    public_url_base="https://f004.backblazeb2.com/file/my-genblaze-bucket",
)

sink = ObjectStorageSink(
    backend,
    prefix="genblaze-assets",
    key_strategy=KeyStrategy.CONTENT_ADDRESSABLE,   # dedupe by SHA-256
)

result = (
    Pipeline("b2-demo")
    .step(ReplicateProvider(), model="black-forest-labs/flux-schnell",
          prompt="a photorealistic cat wearing a tiny spacesuit")
    .run(sink=sink, timeout=120)
)

for step in result.run.steps:
    for asset in step.assets:
        print(asset.url, asset.sha256)

backend.close()

Resulting bucket layout with CONTENT_ADDRESSABLE:

genblaze-assets/
├── assets/{sha[:2]}/{sha[2:4]}/{sha}.ext    # one object per unique asset
└── manifests/{run_id}.json                   # one manifest per run

Switch to KeyStrategy.HIERARCHICAL for runs/{date}/{run_id}/… layout (better for run-grouped browsing, worse for dedup).

Quickstart — AWS S3

export AWS_ACCESS_KEY_ID="..."
export AWS_SECRET_ACCESS_KEY="..."
from genblaze_s3 import S3StorageBackend

backend = S3StorageBackend(bucket="my-genblaze-bucket", region="us-east-1")
# get_url() returns pre-signed URLs when public_url_base is not set

Quickstart — Cloudflare R2 / MinIO

from genblaze_s3 import S3StorageBackend

# R2
backend = S3StorageBackend(
    bucket="my-bucket",
    endpoint_url="https://<account-id>.r2.cloudflarestorage.com",
    access_key_id="...", secret_access_key="...",
)

# MinIO
backend = S3StorageBackend(
    bucket="my-bucket",
    endpoint_url="https://minio.example.com",
    access_key_id="...", secret_access_key="...",
)

Object Lock for immutable manifests (Backblaze B2)

Genblaze can apply Object Lock retention to uploaded manifests, producing tamper-evident provenance suitable for compliance, legal, and content-authenticity workflows. See the main repo docs for the Object Lock guide.

Documentation

Related packages

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

genblaze_s3-0.2.0.tar.gz (15.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

genblaze_s3-0.2.0-py3-none-any.whl (11.0 kB view details)

Uploaded Python 3

File details

Details for the file genblaze_s3-0.2.0.tar.gz.

File metadata

  • Download URL: genblaze_s3-0.2.0.tar.gz
  • Upload date:
  • Size: 15.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for genblaze_s3-0.2.0.tar.gz
Algorithm Hash digest
SHA256 0b7db4f27f7c2f461130c41d4a86420e14a9a5cf09052ebedca791e4d35d305e
MD5 9fb0bdae78c0309eb6c1fb036d76791e
BLAKE2b-256 971fa0e2e559f98a4a42ec04420e6d0fd081ffe183fe5f22cfe974d63605e371

See more details on using hashes here.

File details

Details for the file genblaze_s3-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: genblaze_s3-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 11.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for genblaze_s3-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 240f36070fbbe2bbdd4a198146c751b6119c8bacd036755a98d086b9402bc900
MD5 6d4a6d2dcabc1ae58e82dc7714ad23f1
BLAKE2b-256 aa471c73911f5c78bfb58082d1e2391dcb1edac875cb42f85abf9828aaa7d433

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page