NVIDIA NIM / build.nvidia.com media provider adapters for genblaze (video, image, audio, chat)

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

These details have not been verified by PyPI

Project description

genblaze-nvidia

NVIDIA NIM / build.nvidia.com provider adapters for genblaze. Covers four modalities on one nvapi- key: video (Cosmos, Edify), image (SDXL, SD 3.5, FLUX), audio (Fugatto, Riva TTS), and chat (Nemotron, Llama, Mistral, Qwen, …).

Install

pip install genblaze-nvidia            # video/image/audio providers
pip install "genblaze-nvidia[chat]"    # + the OpenAI SDK for LLM calls

Auth

export NVIDIA_API_KEY=nvapi-...

The free tier is rate-limited (~40 requests/minute per model) with no per-token billing. Some models (Cosmos video) are still enterprise-gated as of 2026-04 and will return AUTH_FAILURE for free-tier keys until you have access.

Two base URLs

NVIDIA's API spans two public hosts on the same key:

Surface	Base URL	Used by
OpenAI-compatible chat / embeddings	`https://integrate.api.nvidia.com/v1`	`chat`, `achat`
Model-specific generation	`https://ai.api.nvidia.com/v1/genai/{vendor}/{slug}`	`NvidiaVideoProvider`, `NvidiaImageProvider`, `NvidiaAudioProvider`
NVCF async status	`https://api.nvcf.nvidia.com/v2/nvcf/pexec/status`	Async video polling

All three are overridable per-constructor for self-hosted NIM deployments:

NvidiaImageProvider(
    api_key="...",
    gen_base_url="https://self-hosted.internal/v1",
    nvcf_status_url="https://self-hosted.internal/v2/nvcf/pexec/status",
)

Video — `NvidiaVideoProvider`

Cosmos and Edify Video return async (202 Accepted + NVCF-REQID header) and the provider polls NVCF for completion. Some fast models return inline synchronous responses — both paths converge on the same lifecycle.

from genblaze_core.models.step import Step
from genblaze_nvidia import NvidiaVideoProvider

provider = NvidiaVideoProvider()  # reads NVIDIA_API_KEY
step = Step(
    provider="nvidia-video",
    model="nvidia/cosmos-1.0-7b-diffusion-text2world",
    prompt="a drone flight over a coastal cliff at sunset",
)
result = provider.invoke(step)
print(result.assets[0].url)  # file:// or https:// depending on response shape

Image — `NvidiaImageProvider`

Synchronous inline base64 response. If an endpoint occasionally returns 202, the provider short-polls NVCF inside generate() so the caller still sees one blocking call.

from genblaze_nvidia import NvidiaImageProvider

provider = NvidiaImageProvider()
step = Step(
    provider="nvidia-image",
    model="stabilityai/stable-diffusion-3-5-large",
    prompt="a studio photo of a brass teapot",
    params={"cfg_scale": 4.5, "aspect_ratio": "1:1"},
)
result = provider.invoke(step)

SDXL's schema differs from SD 3.5 / FLUX — the registry handles that transparently, rewriting prompt + negative_prompt into the text_prompts array SDXL expects.

Audio — `NvidiaAudioProvider`

from genblaze_nvidia import NvidiaAudioProvider

provider = NvidiaAudioProvider()

# TTS (mono)
step = Step(provider="nvidia-audio", model="nvidia/riva-tts", prompt="Hello, world.")

# Music / SFX (stereo)
step = Step(provider="nvidia-audio", model="nvidia/fugatto", prompt="upbeat synthwave intro")

result = provider.invoke(step)

Chat — `chat` / `achat`

OpenAI-wire-compatible. Any model NIM currently serves works as a plain string — no enumeration.

from genblaze_nvidia import chat

resp = chat(
    "nvidia/nemotron-4-340b-instruct",
    prompt="Summarize the Cosmos world foundation model in one sentence.",
)
print(resp.text)

import asyncio
from genblaze_nvidia import achat

async def main():
    r = await achat("meta/llama-3.3-70b-instruct", prompt="hi")
    print(r.text)

asyncio.run(main())

Structured outputs

Pass a Pydantic class to response_format= and the JSON Schema is generated automatically (NIM, OpenAI, GMICloud all speak the same json_schema envelope).

from pydantic import BaseModel
from genblaze_nvidia import chat

class Summary(BaseModel):
    title: str
    key_points: list[str]

resp = chat(
    "nvidia/nemotron-3-nano-omni-30b-a3b-reasoning",
    prompt="Summarize: NVIDIA shipped a 30B-A3B omnimodal model.",
    response_format=Summary,
)
import json; obj = Summary.model_validate(json.loads(resp.text))

Chat as a Pipeline step — `NvidiaChatProvider`

Drop NIM chat into a Pipeline alongside generation steps. Multimodal input flows through step.inputs[Asset] — each asset becomes the right OpenAI-vision content block based on its media_type. Verified against the Nemotron 3 Nano Omni model card on build.nvidia.com.

from genblaze_core import Asset, Pipeline
from genblaze_nvidia import NvidiaChatProvider

# Eyes-and-ears: image + audio + video in one step.
inputs = [
    Asset(url="https://example.com/scene.png",  media_type="image/png"),
    Asset(url="https://example.com/voice.wav",  media_type="audio/wav"),
    Asset(url="https://example.com/clip.mp4",   media_type="video/mp4"),
]

pipe = Pipeline().step(
    NvidiaChatProvider(reasoning=False),  # turn off thinking for a fast perception pass
    model="nvidia/nemotron-3-nano-omni-30b-a3b-reasoning",
    prompt="Describe what's happening across these inputs.",
    external_inputs=inputs,  # caller-held Assets seeded directly into step.inputs
)
result = pipe.run()
print(result.steps[-1].assets[0].metadata["text"])

reasoning is tri-state: None (default) lets the server pick based on the model checkpoint, True/False overrides explicitly via extra_body["chat_template_kwargs"]["enable_thinking"]. Tuning fields like media_io_kwargs={"video": {"fps": 3.0}} and mm_processor_kwargs={"max_num_tiles": 3} are passed through to NIM untouched.

PDFs are not natively supported — Nemotron Omni processes documents as multi-page image sequences upstream, so callers must rasterize pages client-side and pass each one as Asset(media_type="image/png").

Models

genblaze-nvidia ships pattern-keyed ModelFamily rules — each family encodes the per-line param shape (SDXL's text_prompts, Cosmos's width/height/fps, etc.), and any slug fitting the pattern works the day NIM ships it. The audio / video / image endpoints declare DiscoverySupport.PARTIAL — slug liveness is confirmed via the empty-payload-POST probe attached to each family, so a retired slug like the historical nvidia/riva-tts surfaces as NOT_FOUND at preflight rather than mid-pipeline 404. Chat declares DiscoverySupport.NATIVE and reads the integrate.api.nvidia.com/v1/models catalog directly.

Modality	Family pattern(s)	Example slugs
Video	`^nvidia/cosmos-`	`nvidia/cosmos-1.0-7b-diffusion-text2world`, `.../video2world`, `nvidia/cosmos-2.0-diffusion-*`
Image	`^stabilityai/stable-diffusion`, `^black-forest-labs/flux`	`stable-diffusion-xl`, `stable-diffusion-3-5-{large,large-turbo,medium}`, `flux.1-{schnell,dev}`
Audio	`^nvidia/fugatto`, `^nvidia/(?:magpie-tts\|riva-tts\|maxine-)`	`nvidia/fugatto`, `nvidia/magpie-tts-multilingual`, `nvidia/maxine-voice-font`
Chat	n/a — `NATIVE` discovery	Any NIM chat model id

Pricing is not shipped — register a strategy from docs/reference/pricing-recipes.md when one is published for the model line you use. Until then, step.cost_usd is None.

Discover live models at runtime (if you want the fresh catalog) via the OpenAI-compatible /v1/models endpoint:

import httpx, os
r = httpx.get(
    "https://integrate.api.nvidia.com/v1/models",
    headers={"Authorization": f"Bearer {os.environ['NVIDIA_API_KEY']}"},
)
for m in r.json()["data"]:
    print(m["id"])

Error handling

NIM returns safety refusals as HTTP 400 with Nemoguard / safety markers in the body. map_nvidia_error classifies these as CONTENT_POLICY (non-retryable) instead of INVALID_INPUT — pipelines don't burn retries on a deterministic refusal.

HTTP / message	`ProviderErrorCode`
401, 403	`AUTH_FAILURE`
404	`MODEL_ERROR`
429	`RATE_LIMIT`
400 with safety marker	`CONTENT_POLICY`
400 plain	`INVALID_INPUT`
5xx	`SERVER_ERROR`
transport timeout	`TIMEOUT`

Project details

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

jdnyc

These details have not been verified by PyPI

Release history Release notifications | RSS feed

This version

0.3.0

May 20, 2026

0.2.1

Apr 28, 2026

0.2.0

Apr 24, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

genblaze_nvidia-0.3.0.tar.gz (39.1 kB view details)

Uploaded May 20, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

genblaze_nvidia-0.3.0-py3-none-any.whl (37.5 kB view details)

Uploaded May 20, 2026 Python 3

File details

Details for the file genblaze_nvidia-0.3.0.tar.gz.

File metadata

Download URL: genblaze_nvidia-0.3.0.tar.gz
Upload date: May 20, 2026
Size: 39.1 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for genblaze_nvidia-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`7f60a08573a58aaff1dc119d777f12317c33d878042c7000eeb95cce7150f461`
MD5	`e6670c3cce5810263993ff30fd45307e`
BLAKE2b-256	`aa1a29faf75de625e606d369d869ac03aeb5fb87fb394060ae5d9ed98cb31791`

See more details on using hashes here.

Provenance

The following attestation bundles were made for genblaze_nvidia-0.3.0.tar.gz:

Publisher: release.yml on backblaze-labs/genblaze

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: genblaze_nvidia-0.3.0.tar.gz
- Subject digest: 7f60a08573a58aaff1dc119d777f12317c33d878042c7000eeb95cce7150f461
- Sigstore transparency entry: 1585553741
- Sigstore integration time: May 20, 2026
Source repository:
- Permalink: backblaze-labs/genblaze@9aaf87d7eeb99128234b632233c7506914b00b7f
- Branch / Tag: refs/tags/v0.3.0
- Owner: https://github.com/backblaze-labs
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@9aaf87d7eeb99128234b632233c7506914b00b7f
- Trigger Event: release

File details

Details for the file genblaze_nvidia-0.3.0-py3-none-any.whl.

File metadata

Download URL: genblaze_nvidia-0.3.0-py3-none-any.whl
Upload date: May 20, 2026
Size: 37.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for genblaze_nvidia-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`98c59abc794ff2a24e289a6a296297b1ea825b20f62a4844a7bdf688fb80fd15`
MD5	`636878aa9dd90950103ee18ca0e9209a`
BLAKE2b-256	`5af9cabf2322d2c0d87708bffebd582947e138d8489f00179ca72409743a79e2`

See more details on using hashes here.

Provenance

The following attestation bundles were made for genblaze_nvidia-0.3.0-py3-none-any.whl:

Publisher: release.yml on backblaze-labs/genblaze

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: genblaze_nvidia-0.3.0-py3-none-any.whl
- Subject digest: 98c59abc794ff2a24e289a6a296297b1ea825b20f62a4844a7bdf688fb80fd15
- Sigstore transparency entry: 1585553824
- Sigstore integration time: May 20, 2026
Source repository:
- Permalink: backblaze-labs/genblaze@9aaf87d7eeb99128234b632233c7506914b00b7f
- Branch / Tag: refs/tags/v0.3.0
- Owner: https://github.com/backblaze-labs
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@9aaf87d7eeb99128234b632233c7506914b00b7f
- Trigger Event: release

genblaze-nvidia 0.3.0

Navigation

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Project description

genblaze-nvidia

Install

Auth

Two base URLs

Video — NvidiaVideoProvider

Image — NvidiaImageProvider

Audio — NvidiaAudioProvider

Chat — chat / achat

Structured outputs

Chat as a Pipeline step — NvidiaChatProvider

Models

Error handling

Project details

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance

Video — `NvidiaVideoProvider`

Image — `NvidiaImageProvider`

Audio — `NvidiaAudioProvider`

Chat — `chat` / `achat`

Chat as a Pipeline step — `NvidiaChatProvider`