S3-compatible storage backend for genblaze (B2, R2, MinIO, AWS)

genblaze-s3

S3-compatible storage backend for genblaze AI media pipelines — durable, content-addressable, dedup-ready. Works with Backblaze B2 (recommended default), Cloudflare R2, MinIO, and AWS S3.

genblaze-s3 plugs into the genblaze ObjectStorageSink to persist AI-generated video, image, and audio — plus their SHA-256 provenance manifests — onto any S3-compatible object store. It handles streaming downloads from provider CDNs, SHA-256 hashing, multipart uploads with retries, pre-signed URLs for private buckets, and Object Lock retention for tamper-evident manifests on Backblaze B2.
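
The streaming-plus-hashing step described above can be pictured with a short standard-library sketch. This is not genblaze-s3's actual implementation; `hash_while_copying` and its parameters are hypothetical names chosen for illustration. It shows the general idea: read the source in fixed-size chunks, feed each chunk to SHA-256, and write it onward, so the whole asset never has to sit in memory.

```python
import hashlib
import io
from typing import BinaryIO, Tuple


def hash_while_copying(
    src: BinaryIO, dst: BinaryIO, chunk_size: int = 1024 * 1024
) -> Tuple[str, int]:
    """Copy src to dst chunk by chunk, hashing every chunk on the way.

    Hypothetical helper for illustration only. Returns the hex SHA-256
    digest and the number of bytes copied.
    """
    digest = hashlib.sha256()
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # empty read means end of stream
            break
        digest.update(chunk)
        dst.write(chunk)
        total += len(chunk)
    return digest.hexdigest(), total


if __name__ == "__main__":
    # Stand-in for bytes streamed from a provider CDN.
    payload = b"fake image bytes" * 1000
    out = io.BytesIO()
    sha, size = hash_while_copying(io.BytesIO(payload), out, chunk_size=4096)
    print(sha, size)
```

Hashing during the copy means the digest is ready as soon as the last chunk arrives, without a second pass over the data.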

Why genblaze-s3

  • Durable by default — Assets + manifests land in object storage, never stuck in a provider's expiring CDN URL.
  • Backblaze B2 first-class — One-line S3StorageBackend.for_backblaze() helper, Object Lock support for immutable provenance.
  • Content-addressable dedup — KeyStrategy.CONTENT_ADDRESSABLE stores each unique asset once by SHA-256.
  • Works with any S3 API — AWS S3, Backblaze B2, Cloudflare R2, MinIO, SeaweedFS, Wasabi, Ceph.
  • Presigned URLs — private buckets get time-limited URLs; public buckets get permanent public_url_base links.
  • Resilient multipart uploads — credential-preserving retries, preflight checks, no partial writes.

Backends

| Provider | Helper | Notes |
|---|---|---|
| Backblaze B2 | S3StorageBackend.for_backblaze("bucket") | Reads B2_KEY_ID / B2_APP_KEY; Object Lock retention supported |
| AWS S3 | S3StorageBackend(bucket="...", region="...") | Standard AWS credential chain |
| Cloudflare R2 | S3StorageBackend(bucket="...", endpoint_url="https://<acct>.r2.cloudflarestorage.com") | |
| MinIO / self-hosted | S3StorageBackend(bucket="...", endpoint_url="https://minio.example.com") | |

Install

pip install genblaze-s3

Quickstart — Backblaze B2 (recommended)

export B2_KEY_ID="..."
export B2_APP_KEY="..."

from genblaze_core import KeyStrategy, ObjectStorageSink, Pipeline
from genblaze_s3 import S3StorageBackend
from genblaze_replicate import ReplicateProvider

backend = S3StorageBackend.for_backblaze(
    "my-genblaze-bucket",
    # Defaults to "us-west-004". Pass the region your bucket actually lives
    # in (e.g. "us-east-005", "eu-central-003") to skip the redirect hop —
    # the backend auto-corrects on first use, but a right hint saves an RTT.
    region="us-west-004",
    # Optional: pass public_url_base for public buckets (get_url returns permanent URLs)
    public_url_base="https://f004.backblazeb2.com/file/my-genblaze-bucket",
)

sink = ObjectStorageSink(
    backend,
    prefix="genblaze-assets",
    key_strategy=KeyStrategy.CONTENT_ADDRESSABLE,   # dedupe by SHA-256
)

result = (
    Pipeline("b2-demo")
    .step(ReplicateProvider(), model="black-forest-labs/flux-schnell",
          prompt="a photorealistic cat wearing a tiny spacesuit")
    .run(sink=sink, timeout=120)
)

for step in result.run.steps:
    for asset in step.assets:
        print(asset.url, asset.sha256)

backend.close()

Resulting bucket layout with CONTENT_ADDRESSABLE:

genblaze-assets/
├── assets/{sha[:2]}/{sha[2:4]}/{sha}.ext    # one object per unique asset
└── manifests/{run_id}.json                   # one manifest per run

Switch to KeyStrategy.HIERARCHICAL for a runs/{date}/{run_id}/… layout (better for run-grouped browsing, worse for dedup).
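
To make the CONTENT_ADDRESSABLE layout concrete, here is a minimal sketch of how a key of that shape can be derived from an asset's bytes. The function name `content_addressable_key` is hypothetical and the library's real key-building code may differ; only the path pattern is taken from the layout shown above.

```python
import hashlib


def content_addressable_key(
    data: bytes, ext: str, prefix: str = "genblaze-assets"
) -> str:
    """Build a key shaped like {prefix}/assets/{sha[:2]}/{sha[2:4]}/{sha}.{ext}.

    Hypothetical helper: mirrors the documented layout, not the library's code.
    """
    sha = hashlib.sha256(data).hexdigest()
    return f"{prefix}/assets/{sha[:2]}/{sha[2:4]}/{sha}.{ext}"


if __name__ == "__main__":
    first = content_addressable_key(b"same pixels", "png")
    second = content_addressable_key(b"same pixels", "png")
    other = content_addressable_key(b"different pixels", "png")
    # Identical bytes map to one key, so a second upload can be skipped.
    print(first == second, first == other)
```

Because the key depends only on the content, two runs that produce byte-identical assets resolve to the same object, which is what makes deduplication possible. The two short directory levels keep any single prefix from accumulating every object.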

Quickstart — AWS S3

export AWS_ACCESS_KEY_ID="..."
export AWS_SECRET_ACCESS_KEY="..."

from genblaze_s3 import S3StorageBackend

backend = S3StorageBackend(bucket="my-genblaze-bucket", region="us-east-1")
# get_url() returns pre-signed URLs when public_url_base is not set
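
The rule in that comment (a permanent link when public_url_base is set, a pre-signed link otherwise) can be sketched as a small decision function. This only illustrates the documented behaviour, not the backend's code: `resolve_url`, the `presign` callable, and the one-hour default expiry are all assumptions made for the example.

```python
from typing import Callable, Optional
from urllib.parse import quote


def resolve_url(
    key: str,
    public_url_base: Optional[str],
    presign: Callable[[str, int], str],
    expires_in: int = 3600,
) -> str:
    """Return a permanent public URL if a base is configured, else a signed one.

    Hypothetical helper: `presign` stands in for whatever signs a GET request
    for `key` that stays valid for `expires_in` seconds.
    """
    if public_url_base:
        # Public bucket: join base and key, percent-encoding unsafe characters.
        return f"{public_url_base.rstrip('/')}/{quote(key.lstrip('/'), safe='/')}"
    # Private bucket: delegate to the signer for a time-limited URL.
    return presign(key, expires_in)


if __name__ == "__main__":
    def fake_presign(key: str, ttl: int) -> str:
        # Placeholder signer so the sketch runs without network access.
        return f"https://signed.example.invalid/{key}?ttl={ttl}"

    print(resolve_url("assets/ab/cd/abcd.png", None, fake_presign))
    print(resolve_url("assets/ab/cd/abcd.png", "https://cdn.example.com/b", fake_presign))
```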

Quickstart — Cloudflare R2 / MinIO

from genblaze_s3 import S3StorageBackend

# R2
backend = S3StorageBackend(
    bucket="my-bucket",
    endpoint_url="https://<account-id>.r2.cloudflarestorage.com",
    access_key_id="...", secret_access_key="...",
)

# MinIO
backend = S3StorageBackend(
    bucket="my-bucket",
    endpoint_url="https://minio.example.com",
    access_key_id="...", secret_access_key="...",
)
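
In practice the "..." credentials above are better read from the environment than written into source files. The sketch below collects the keyword arguments shown in the examples from environment variables. The variable names S3_ACCESS_KEY_ID and S3_SECRET_ACCESS_KEY and the helper `backend_kwargs_from_env` are assumptions for this example; genblaze-s3 is not documented here as reading them itself.

```python
import os
from typing import Dict, Mapping, Optional

# Hypothetical variable names chosen for this sketch.
REQUIRED_VARS = ("S3_ACCESS_KEY_ID", "S3_SECRET_ACCESS_KEY")


def backend_kwargs_from_env(
    bucket: str, endpoint_url: str, env: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Collect constructor keyword arguments, failing early if a secret is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    return {
        "bucket": bucket,
        "endpoint_url": endpoint_url,
        "access_key_id": env["S3_ACCESS_KEY_ID"],
        "secret_access_key": env["S3_SECRET_ACCESS_KEY"],
    }


if __name__ == "__main__":
    demo_env = {"S3_ACCESS_KEY_ID": "demo-id", "S3_SECRET_ACCESS_KEY": "demo-secret"}
    kwargs = backend_kwargs_from_env("my-bucket", "https://minio.example.com", demo_env)
    # The result could then be unpacked as S3StorageBackend(**kwargs).
    print(sorted(kwargs))
```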

Object Lock for immutable manifests (Backblaze B2)

Genblaze can apply Object Lock retention to uploaded manifests, producing tamper-evident provenance suitable for compliance, legal, and content-authenticity workflows. See the main repo docs for the Object Lock guide.
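
For orientation, Object Lock retention at the S3 API level comes down to two extra parameters on the upload request: a lock mode and a retain-until timestamp. The sketch below only builds those request parameters; it does not show how genblaze configures retention (that is covered by the guide mentioned above). `object_lock_put_kwargs` is a hypothetical helper, and the example key follows the manifests/{run_id}.json layout with a made-up run id.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Optional


def object_lock_put_kwargs(
    bucket: str,
    key: str,
    body: bytes,
    retain_days: int,
    mode: str = "COMPLIANCE",
    now: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Build S3 PutObject parameters that place a retention lock on the object.

    Hypothetical helper. The resulting dict matches the parameter names of the
    S3 PutObject operation (for example boto3's put_object).
    """
    if mode not in ("COMPLIANCE", "GOVERNANCE"):
        raise ValueError(f"unsupported Object Lock mode: {mode}")
    if retain_days <= 0:
        raise ValueError("retain_days must be positive")
    now = now or datetime.now(timezone.utc)
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "ObjectLockMode": mode,
        "ObjectLockRetainUntilDate": now + timedelta(days=retain_days),
    }


if __name__ == "__main__":
    params = object_lock_put_kwargs(
        "my-genblaze-bucket",
        "genblaze-assets/manifests/run-123.json",  # made-up run id
        b"{}",
        retain_days=365,
    )
    print(params["ObjectLockMode"], params["ObjectLockRetainUntilDate"].isoformat())
```

The bucket itself must have Object Lock enabled for such a request to be accepted. Once a retention date is set, that object version cannot be deleted or overwritten before the date passes (governance mode allows specially privileged overrides), which is what makes a locked manifest tamper-evident.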

License

MIT
