
multi-model-image-gen

Unified Python library for image generation across OpenAI, fal.ai, Google (AI Studio + Vertex AI), and Higgsfield. One API, aisuite-style addressing, retry + fallback, async-first, plugin-extensible.

from image_gen import generate

result = generate(
    prompt="a neon-lit Tokyo alley at night",
    model="fal:flux/dev",
    fallbacks=["openai", "higgsfield"],
    output="out.png",
)

Supported models

Catalog of models known to the library (current as of 2026-04-17). Shorthands resolve to the provider-native model ID automatically. Any full model ID also works — unknown models just pass through without capability validation.

Single source of truth: src/image_gen/providers/catalog.yaml. To add or update a model, edit that file — no Python changes needed.

OpenAI — GPT Image family

| Shorthand | Model ID | Seed | Neg. prompt | Notes |
|---|---|---|---|---|
| gpt-image-1.5 | gpt-image-1.5 | | | Current SOTA — natively multimodal (text + image I/O). |
| gpt-image-mini | gpt-image-1-mini | | | Cheapest tier when quality is secondary. |
| gpt-image | gpt-image-1 | | | Legacy. Kept for back-compat. |

Pricing is token-based (text + image tokens) — see OpenAI's pricing page for current rates. The image API also supports dall-e-2 / dall-e-3 via full model IDs.

fal.ai — curated subset of 1000+ models

Any fal model works by full ID (generate(model="fal:fal-ai/sdxl/lightning", …)). The shorthands below cover the most common ones:

| Shorthand | Model ID | Seed | Neg. prompt | Cost / image |
|---|---|---|---|---|
| flux2-pro | fal-ai/flux-2-pro | | | ~$0.03 / MP |
| flux2-dev-turbo | fal-ai/flux-2-dev-turbo | | | ~$0.008 |
| flux-pro | fal-ai/flux-pro | | | $0.050 |
| flux | fal-ai/flux/dev | | | $0.025 |
| flux-schnell | fal-ai/flux/schnell | | | $0.003 |
| seedream / seedream-4.5 | fal-ai/bytedance/seedream/v4.5/text-to-image | | | $0.040 |
| ideogram | fal-ai/ideogram/v3 | | | $0.030–0.090 |
| recraft | fal-ai/recraft-v3 | | | $0.040 raster / $0.080 vector |
| nb2-fal | fal-ai/nano-banana-2 | | | $0.080 |
| nb-pro-fal | fal-ai/nano-banana-pro | | | $0.150 |
| imagen4-fal | fal-ai/imagen4/preview | | | variable |

fal acts as a unified gateway — you can use Google's Nano Banana and Imagen through fal's billing + API instead of Google's own credentials, which is occasionally useful for multi-tenant apps.

Google — Nano Banana + Imagen (AI Studio or Vertex AI)

Two providers, each with independent auth:

| Provider | Env var | Auth method | Use case |
|---|---|---|---|
| google | GOOGLE_VERTEX_API_KEY | Vertex AI (API key) | Railway / production |
| google | GOOGLE_CLOUD_PROJECT | Vertex AI (ADC) | Local dev |
| google-studio | GOOGLE_GEMINI_API_KEY | AI Studio | Gemini-only, no GCP project |
| google | GOOGLE_API_KEY + GOOGLE_GENAI_USE_VERTEXAI=True | Vertex AI (API key) | Legacy / compat |
| google | GOOGLE_API_KEY alone | AI Studio | Legacy / compat |

GOOGLE_VERTEX_API_KEY takes priority for the google provider — set it to use a Vertex-bound API key instead of ADC. Imagen models require Vertex. google-studio shorthands (nb2, nano-banana, nb-pro) also work via google-studio:model-id if you want them on AI Studio billing.
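The precedence above can be sketched as a small selection function. This is illustrative only, not the library's actual code (the real logic lives in google_images.py); in particular, the ordering of ADC versus a bare GOOGLE_API_KEY when both are set is an assumption:

```python
def google_auth_mode(env: dict) -> str:
    """Pick an auth mode for the "google" provider from the documented env vars."""
    if env.get("GOOGLE_VERTEX_API_KEY"):
        return "vertex-api-key"            # takes priority over everything else
    if env.get("GOOGLE_API_KEY") and env.get("GOOGLE_GENAI_USE_VERTEXAI") == "True":
        return "vertex-api-key"            # legacy / compat
    if env.get("GOOGLE_CLOUD_PROJECT"):
        return "vertex-adc"                # local dev via gcloud ADC
    if env.get("GOOGLE_API_KEY"):
        return "ai-studio"                 # legacy single key
    raise RuntimeError("no Google credentials configured")
```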

Nano Banana (Gemini Image) — three tiers, shared generate_content API surface:

| Shorthand | Model ID | Aliases | Description |
|---|---|---|---|
| nb-pro | gemini-3-pro-image-preview | Nano Banana Pro | Best reasoning ("Thinking"), high-fidelity text, most expensive. |
| nb2 | gemini-3.1-flash-image-preview | Nano Banana 2 | Fast, high-volume. Trades some compositional depth for speed. |
| nano-banana | gemini-2.5-flash-image | original Nano Banana | Legacy speed/efficiency tier. |

Imagen 4 — three tiers:

| Shorthand | Model ID | Notes |
|---|---|---|
| imagen4-ultra | imagen-4.0-ultra-generate-001 | Highest fidelity; supports 1K, 2K, and 4K. |
| imagen4 | imagen-4.0-generate-001 | Default Imagen 4. |
| imagen4-fast | imagen-4.0-fast-generate-001 | Cheapest / fastest. |

Output from all Google image models includes a SynthID watermark.

Higgsfield — async-task platform (submit → poll → download)

Polling is built in via poll_until. Higgsfield also gives day-0 access to third-party video models.

Soul (photorealistic image generation):

| Shorthand | Model ID |
|---|---|
| higgsfield-soul | higgsfield-ai/soul/standard |
| higgsfield-soul-pro | higgsfield-ai/soul/pro |

DoP (image-to-video, three quality/speed tiers):

| Shorthand | Model ID |
|---|---|
| higgsfield-dop | higgsfield-ai/dop/lite |
| higgsfield-dop-preview | higgsfield-ai/dop/preview |
| higgsfield-dop-turbo | higgsfield-ai/dop/turbo |

Kling (video via Higgsfield platform):

| Shorthand | Model ID | Notes |
|---|---|---|
| higgsfield-kling | kling-video/v3.0/master/image-to-video | Kling 3.0 — unified video + audio + images, multi-shot. |
| higgsfield-kling-v2 | kling-video/v2.1/pro/image-to-video | Kling 2.1 Master. |

Other Higgsfield platform models (Sora 2, WAN 2.5, MiniMax Hailuo 02, Seedance Pro, etc.) work by passing their full ID: generate(model="higgsfield:sora-2/preview", …).

All models support

Aspect ratios 9:16, 16:9, and 1:1; resolutions 720p and 1080p. Requests for capabilities beyond these are rejected pre-flight with UnsupportedCapabilityError — the provider SDK call is never made. Add a model, or update capabilities, by editing src/image_gen/providers/catalog.yaml — no Python changes needed.
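The catalog schema is not reproduced in this README. As a purely hypothetical sketch (field names are illustrative assumptions, not the actual schema — check catalog.yaml itself), an entry might look like:

```yaml
# Hypothetical catalog entry — field names are illustrative only.
flux-schnell:
  provider: fal
  model_id: fal-ai/flux/schnell
  cost_usd: 0.003
  aspect_ratios: ["9:16", "16:9", "1:1"]
  resolutions: ["720p", "1080p"]
```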

Install

pip install git+https://github.com/oj-rivas/multi-model-image-gen.git
# or pinned:
pip install git+https://github.com/oj-rivas/multi-model-image-gen.git@v0.2.0
# or editable from local checkout:
pip install -e /path/to/multi-model-image-gen
# with Prometheus metrics:
pip install "multi-model-image-gen[metrics] @ git+…"

Credentials

Set only the keys for providers you'll actually call:

# OpenAI
OPENAI_API_KEY=sk-…

# fal.ai
FAL_KEY=

# Google — Vertex AI (production / Railway)
GOOGLE_VERTEX_API_KEY=                   # "google" provider → Vertex with API key
GOOGLE_GEMINI_API_KEY=                   # "google-studio" provider → AI Studio
GOOGLE_CLOUD_PROJECT=your-gcp-project     # required for Vertex (Imagen 4)
GOOGLE_CLOUD_LOCATION=us-central1         # default
# Google — local dev (ADC)
# GOOGLE_CLOUD_PROJECT=your-gcp-project   # Vertex via gcloud auth application-default login
# Google — legacy single key
# GOOGLE_API_KEY=…                        # AI Studio, or Vertex with GOOGLE_GENAI_USE_VERTEXAI=True

# Higgsfield
HIGGSFIELD_API_KEY=
HIGGSFIELD_API_SECRET=
HIGGSFIELD_BASE_URL=https://platform.higgsfield.ai  # default

The library does not load .env for you. Use python-dotenv or your framework's loader.

Addressing

Three ways to name a model:

| Form | Example | Provider chosen by |
|---|---|---|
| provider:model | "fal:flux/dev" | explicit prefix |
| Bare shorthand | "flux" | catalog |
| Bare full model ID | "higgsfield-ai/soul/standard" | catalog → higgsfield default |

An unknown provider prefix raises ValueError. Unknown bare models route to higgsfield (which uses full IDs rather than shorthands).

Library usage

Sync

from image_gen import generate

result = generate(
    prompt="a neon fox",
    model="flux",                 # catalog → fal-ai/flux/dev
    output="fox.png",             # written when status == "completed"
    aspect_ratio="9:16",
    resolution="720p",
)
assert result.status == "completed"
print(result.request_id, result.cost_usd)

Async — for FastAPI / high-concurrency

import asyncio
from image_gen import generate_async

async def batch():
    return await asyncio.gather(*[
        generate_async(prompt=f"scene {i}", model="flux")
        for i in range(10)
    ])

Fallback chains

result = generate(
    prompt="…",
    model="fal:flux/dev",
    fallbacks=["openai", "higgsfield"],   # tried in order on retry-exhausted failure
    correlation_id="trace-42",
)

Direct provider client (bypass routing)

from image_gen import FalImageClient

with FalImageClient() as client:
    result = client.generate_image(
        prompt="cinematic forest",
        model="fal-ai/flux/dev",
        aspect_ratio="16:9",
        num_inference_steps=40,      # provider-specific kwarg
    )
    client.download(result.image_url, "forest.png")

Result shape

@dataclass
class GenerationResult:
    request_id: str
    status: str                # "completed" | "failed" | "blocked"
    image_bytes: bytes | None  # inline payload (OpenAI, Google)
    image_url: str | None      # remote URL (fal, Higgsfield)
    video_url: str | None      # Higgsfield video models
    cost_usd: float | None     # populated from catalog when available
    raw: dict | None           # full provider response + correlation_id if set

Callers that pass output= to generate() don't need to touch this — bytes/URL are handled automatically.

CLI

image-gen -p "a neon fox" -m flux -o fox.png
image-gen -p "…" -m "openai:gpt-image-1" --aspect 16:9 --resolution 1080p -o wide.png
image-gen -p "…" -m "higgsfield:higgsfield-ai/soul/standard" -o h.png

FastAPI integration

from fastapi import FastAPI, HTTPException
from image_gen import generate_async, FallbackExhausted

app = FastAPI()

@app.post("/images")
async def gen(prompt: str, model: str = "flux", x_correlation_id: str | None = None):
    try:
        r = await generate_async(prompt=prompt, model=model, correlation_id=x_correlation_id)
    except FallbackExhausted as e:
        raise HTTPException(502, str(e))
    if r.status == "blocked":
        raise HTTPException(422, r.raw)
    return {"url": r.image_url, "cost": r.cost_usd, "id": r.request_id}

Resilience

  • Retry: tenacity-backed exponential backoff on httpx.TimeoutException, ConnectError, ReadError, and HTTP 429/500/502/503/504. Configurable via IMAGE_GEN_RETRY_ATTEMPTS (default 3) and IMAGE_GEN_RETRY_MAX_WAIT (default 8s). Non-transient errors (ValueError, 4xx) are not retried.
  • Fallback: Router iterates [primary, *fallbacks]; each provider's retries run first, then the router moves on. FallbackExhausted is raised if all fail, carrying the list of underlying exceptions.
  • Blocked content: NSFW / policy refusals return status="blocked" instead of raising — never retried.
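The retry knobs above are plain environment variables, so tuning them is a one-liner in your deployment config (the values below are examples, not recommendations):

```shell
# Tune retry behaviour via environment variables.
export IMAGE_GEN_RETRY_ATTEMPTS=5    # default 3
export IMAGE_GEN_RETRY_MAX_WAIT=15   # default 8 (seconds)
echo "$IMAGE_GEN_RETRY_ATTEMPTS attempts, ${IMAGE_GEN_RETRY_MAX_WAIT}s max wait"
```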

Observability

Every generate() / generate_async() call emits one structured log record image_gen.request with canonical fields:

request_id, provider, model, latency_ms, status, bytes_out,
cost_usd, retry_count, fallback_used, correlation_id

Set IMAGE_GEN_LOG_FORMAT=json for one-line JSON per request (Datadog / CloudWatch / Loki-friendly).

Install the optional [metrics] extra to expose Prometheus counters: image_gen_requests_total, image_gen_retries_total, image_gen_fallbacks_total, image_gen_cost_usd_total, and histogram image_gen_latency_seconds.

Extending — add a provider as a plugin

No fork needed. In your own package:

# runway_image_gen/__init__.py
from image_gen.providers.base import ImageProvider

class RunwayClient:
    def generate_image(self, prompt, model, aspect_ratio, resolution, **kw): ...
    def download(self, url, output_path): ...
    def close(self): ...
# pyproject.toml
[project.entry-points."image_gen.providers"]
runway = "runway_image_gen:RunwayClient"

Then pip install your-package and generate(model="runway:gen-3") just works — entry-point discovery picks it up.

Also works at runtime:

from image_gen import register_provider
register_provider("custom", MyClientFactory)
generate(model="custom:foo")

A complete example lives in examples/runway_plugin/.

Layout

src/image_gen/
├── __init__.py                 # generate(), generate_async(), Router re-export
├── cli.py                      # image-gen CLI
├── config.py                   # lazy env readers
├── router.py                   # Router, FallbackExhausted
├── observability.py            # StructuredLogger, JSONFormatter, correlation_id
├── metrics.py                  # Prometheus (optional, gated on import)
├── py.typed                    # PEP 561 marker
└── providers/
    ├── __init__.py             # registry + plugin discovery (register_provider, get_provider)
    ├── base.py                 # ImageProvider Protocol (sync + async)
    ├── result.py               # GenerationResult
    ├── catalog.yaml            # SINGLE SOURCE OF TRUTH for models
    ├── catalog.py              # ModelEntry, resolve, validate
    ├── _retry.py               # @retry_policy, is_retryable
    ├── _poll.py                # poll_until, poll_until_async
    ├── README.md               # "how to add an async-task provider"
    ├── openai_images.py
    ├── fal_images.py
    ├── google_images.py        # AI Studio + Vertex AI
    └── higgsfield.py
examples/
├── runway_plugin/              # third-party plugin example
└── benchmark_concurrent.py     # async concurrency demo

License

MIT
