
multi-model-image-gen

Unified Python library for image generation across OpenAI, fal.ai, Google (AI Studio + Vertex AI), and Higgsfield. One API, aisuite-style addressing, retry + fallback, async-first, plugin-extensible.

from image_gen import generate

result = generate(
    prompt="a neon-lit Tokyo alley at night",
    model="fal:flux/dev",
    fallbacks=["openai", "higgsfield"],
    output="out.png",
)

Supported models

Catalog of models known to the library (current as of 2026-04-17). Shorthands resolve to the provider-native model ID automatically. Any full model ID also works — unknown models just pass through without capability validation.

Single source of truth: src/image_gen/providers/catalog.yaml. To add or update a model, edit that file; no Python changes needed.

OpenAI — GPT Image family

| Shorthand | Model ID | Seed | Neg. prompt | Notes |
|---|---|---|---|---|
| `gpt-image-1.5` | `gpt-image-1.5` | | | Current SOTA — natively multimodal (text + image I/O). |
| `gpt-image-mini` | `gpt-image-1-mini` | | | Cheapest tier when quality is secondary. |
| `gpt-image` | `gpt-image-1` | | | Legacy. Kept for back-compat. |

Pricing is token-based (text + image tokens) — see OpenAI's pricing page for current rates. The image API also supports dall-e-2 / dall-e-3 via full model IDs.

fal.ai — curated subset of 1000+ models

Any fal model works by full ID (generate(model="fal:fal-ai/sdxl/lightning", …)). The shorthands below cover the most common ones:

| Shorthand | Model ID | Seed | Neg. prompt | Cost / image |
|---|---|---|---|---|
| `flux2-pro` | `fal-ai/flux-2-pro` | | | ~$0.03 / MP |
| `flux2-dev-turbo` | `fal-ai/flux-2-dev-turbo` | | | ~$0.008 |
| `flux-pro` | `fal-ai/flux-pro` | | | $0.050 |
| `flux` | `fal-ai/flux/dev` | | | $0.025 |
| `flux-schnell` | `fal-ai/flux/schnell` | | | $0.003 |
| `seedream` / `seedream-4.5` | `fal-ai/bytedance/seedream/v4.5/text-to-image` | | | $0.040 |
| `ideogram` | `fal-ai/ideogram/v3` | | | $0.030–0.090 |
| `recraft` | `fal-ai/recraft-v3` | | | $0.040 raster / $0.080 vector |
| `nb2-fal` | `fal-ai/nano-banana-2` | | | $0.080 |
| `nb-pro-fal` | `fal-ai/nano-banana-pro` | | | $0.150 |
| `imagen4-fal` | `fal-ai/imagen4/preview` | | | variable |

fal acts as a unified gateway — you can use Google's Nano Banana and Imagen through fal's billing + API instead of Google's own credentials, which is occasionally useful for multi-tenant apps.
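Most of the prices above are flat per-image, but `flux2-pro` is billed per megapixel. A small sketch of that arithmetic, using the ~$0.03/MP figure from the table (illustrative only, not a billing guarantee):

```python
def megapixel_cost(width: int, height: int, usd_per_mp: float) -> float:
    """Approximate cost of one image billed per megapixel."""
    return (width * height) / 1_000_000 * usd_per_mp

# A 1920x1080 image is ~2.07 MP:
print(round(megapixel_cost(1920, 1080, 0.03), 4))  # → 0.0622
```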

Google — Nano Banana + Imagen (AI Studio or Vertex AI)

Two providers, each with independent auth:

| Provider | Env var | Auth method | Use case |
|---|---|---|---|
| `google` | `GOOGLE_VERTEX_API_KEY` | Vertex AI (API key) | Railway / production |
| `google` | `GOOGLE_CLOUD_PROJECT` | Vertex AI (ADC) | Local dev |
| `google-studio` | `GOOGLE_GEMINI_API_KEY` | AI Studio | Gemini-only, no GCP project |
| `google` | `GOOGLE_API_KEY` + `GOOGLE_GENAI_USE_VERTEXAI=True` | Vertex AI (API key) | Legacy / compat |
| `google` | `GOOGLE_API_KEY` alone | AI Studio | Legacy / compat |

GOOGLE_VERTEX_API_KEY takes priority for the google provider — set it to use a Vertex-bound API key instead of ADC. Imagen models require Vertex. google-studio shorthands (nb2, nano-banana, nb-pro) also work via google-studio:model-id if you want them on AI Studio billing.
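The documented precedence can be sketched as a pure function over the environment. `GOOGLE_VERTEX_API_KEY` winning is stated above; the relative ordering of the legacy keys and ADC below is an assumption for illustration, not the library's actual code:

```python
def resolve_google_auth(env: dict) -> str:
    """Sketch of the credential precedence for the "google" provider.

    GOOGLE_VERTEX_API_KEY is checked first (per the docs); the order of
    the remaining fallbacks is assumed for this illustration.
    """
    if env.get("GOOGLE_VERTEX_API_KEY"):
        return "vertex-api-key"
    if env.get("GOOGLE_API_KEY") and env.get("GOOGLE_GENAI_USE_VERTEXAI") == "True":
        return "vertex-api-key-legacy"
    if env.get("GOOGLE_API_KEY"):
        return "ai-studio-legacy"
    if env.get("GOOGLE_CLOUD_PROJECT"):
        return "vertex-adc"  # ADC via `gcloud auth application-default login`
    raise RuntimeError("no Google credentials configured")

print(resolve_google_auth({"GOOGLE_VERTEX_API_KEY": "k"}))  # → vertex-api-key
```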

Nano Banana (Gemini Image) — three tiers, shared generate_content API surface:

| Shorthand | Model ID | Aliases | Description |
|---|---|---|---|
| `nb-pro` | `gemini-3-pro-image-preview` | Nano Banana Pro | Best reasoning ("Thinking"), high-fidelity text, most expensive. |
| `nb2` | `gemini-3.1-flash-image-preview` | Nano Banana 2 | Fast, high-volume. Trades some compositional depth for speed. |
| `nano-banana` | `gemini-2.5-flash-image` | original Nano Banana | Legacy speed/efficiency tier. |

Imagen 4 — three tiers:

| Shorthand | Model ID | Notes |
|---|---|---|
| `imagen4-ultra` | `imagen-4.0-ultra-generate-001` | Highest fidelity; supports 1K, 2K, and 4K. |
| `imagen4` | `imagen-4.0-generate-001` | Default Imagen 4. |
| `imagen4-fast` | `imagen-4.0-fast-generate-001` | Cheapest / fastest. |

Output from all Google image models includes a SynthID watermark.

Higgsfield — async-task platform (submit → poll → download)

Polling is built in via poll_until. Higgsfield also gives day-0 access to third-party video models.

Soul (photorealistic image generation):

| Shorthand | Model ID |
|---|---|
| `higgsfield-soul` | `higgsfield-ai/soul/standard` |
| `higgsfield-soul-pro` | `higgsfield-ai/soul/pro` |

DoP (image-to-video, three quality/speed tiers):

| Shorthand | Model ID |
|---|---|
| `higgsfield-dop` | `higgsfield-ai/dop/lite` |
| `higgsfield-dop-preview` | `higgsfield-ai/dop/preview` |
| `higgsfield-dop-turbo` | `higgsfield-ai/dop/turbo` |

Kling (video via Higgsfield platform):

| Shorthand | Model ID | Notes |
|---|---|---|
| `higgsfield-kling` | `kling-video/v3.0/master/image-to-video` | Kling 3.0 — unified video + audio + images, multi-shot. |
| `higgsfield-kling-v2` | `kling-video/v2.1/pro/image-to-video` | Kling 2.1 Master. |

Other Higgsfield platform models (Sora 2, WAN 2.5, MiniMax Hailuo 02, Seedance Pro, etc.) work by passing their full ID: generate(model="higgsfield:sora-2/preview", …).

All models support

All catalog models support aspect ratios 9:16, 16:9, and 1:1 at resolutions 720p and 1080p. Requests for capabilities beyond these are rejected pre-flight with UnsupportedCapabilityError — the SDK call is never made. Add a model, or update capabilities, by editing src/image_gen/providers/catalog.yaml — no Python changes needed.
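The pre-flight rejection described above can be illustrated with a self-contained validator. The `UnsupportedCapabilityError` name comes from the docs, but this local class and function are a sketch, not the library's implementation:

```python
SUPPORTED_ASPECTS = {"9:16", "16:9", "1:1"}
SUPPORTED_RESOLUTIONS = {"720p", "1080p"}

class UnsupportedCapabilityError(ValueError):
    """Raised before any SDK call when a capability isn't in the catalog."""

def validate_request(aspect_ratio: str, resolution: str) -> None:
    """Reject unsupported combinations pre-flight, so no provider call is made."""
    if aspect_ratio not in SUPPORTED_ASPECTS:
        raise UnsupportedCapabilityError(f"aspect ratio {aspect_ratio!r} not supported")
    if resolution not in SUPPORTED_RESOLUTIONS:
        raise UnsupportedCapabilityError(f"resolution {resolution!r} not supported")

validate_request("16:9", "1080p")   # passes silently
# validate_request("4:3", "1080p")  # would raise UnsupportedCapabilityError
```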

Install

pip install git+https://github.com/oj-rivas/multi-model-image-gen.git
# or pinned:
pip install git+https://github.com/oj-rivas/multi-model-image-gen.git@v0.2.0
# or editable from local checkout:
pip install -e /path/to/multi-model-image-gen
# with Prometheus metrics:
pip install "multi-model-image-gen[metrics] @ git+…"

Credentials

Set only the keys for providers you'll actually call:

# OpenAI
OPENAI_API_KEY=sk-…

# fal.ai
FAL_KEY=

# Google — Vertex AI (production / Railway)
GOOGLE_VERTEX_API_KEY=                   # "google" provider → Vertex with API key
GOOGLE_GEMINI_API_KEY=                   # "google-studio" provider → AI Studio
GOOGLE_CLOUD_PROJECT=your-gcp-project     # required for Vertex (Imagen 4)
GOOGLE_CLOUD_LOCATION=us-central1         # default
# Google — local dev (ADC)
# GOOGLE_CLOUD_PROJECT=your-gcp-project   # Vertex via gcloud auth application-default login
# Google — legacy single key
# GOOGLE_API_KEY=…                        # AI Studio, or Vertex with GOOGLE_GENAI_USE_VERTEXAI=True

# Higgsfield
HIGGSFIELD_API_KEY=
HIGGSFIELD_API_SECRET=
HIGGSFIELD_BASE_URL=https://platform.higgsfield.ai  # default

The library does not load .env for you. Use python-dotenv or your framework's loader.
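python-dotenv's `load_dotenv()` is the usual choice. If you want zero dependencies, a minimal loader is easy to sketch (no interpolation, no quoting rules; existing environment variables win, matching python-dotenv's default):

```python
import os

def load_env_file(path: str = ".env") -> None:
    """Minimal .env loader: KEY=VALUE lines, '#' comments, nothing fancy."""
    try:
        lines = open(path).read().splitlines()
    except FileNotFoundError:
        return  # silently skip a missing file, like load_dotenv()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # setdefault: never clobber variables already set in the real environment
        os.environ.setdefault(key.strip(), value.strip())
```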

Addressing

Three ways to name a model:

| Form | Example | Provider chosen by |
|---|---|---|
| `provider:model` | `"fal:flux/dev"` | explicit prefix |
| Bare shorthand | `"flux"` | catalog |
| Bare full model ID | `"higgsfield-ai/soul/standard"` | catalog → higgsfield default |

Unknown provider prefix raises ValueError. Unknown bare models route to higgsfield (which uses full IDs rather than shorthands).
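The three resolution rules can be sketched with a tiny stand-in resolver. The catalog dict below is a two-entry stand-in mirroring the tables above, not the real catalog:

```python
# Illustrative stand-in for catalog.yaml: shorthand/full ID → (provider, model id)
CATALOG = {
    "flux": ("fal", "fal-ai/flux/dev"),
    "higgsfield-ai/soul/standard": ("higgsfield", "higgsfield-ai/soul/standard"),
}
KNOWN_PROVIDERS = {"openai", "fal", "google", "google-studio", "higgsfield"}

def resolve(model: str) -> tuple[str, str]:
    """Sketch of the documented order: explicit prefix, then catalog lookup,
    then higgsfield as the default for unknown bare IDs."""
    if ":" in model:
        provider, _, rest = model.partition(":")
        if provider not in KNOWN_PROVIDERS:
            raise ValueError(f"unknown provider {provider!r}")
        return provider, rest
    if model in CATALOG:
        return CATALOG[model]
    return "higgsfield", model  # bare full IDs fall through to higgsfield

print(resolve("flux"))  # → ('fal', 'fal-ai/flux/dev')
```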

Library usage

Sync

from image_gen import generate

result = generate(
    prompt="a neon fox",
    model="flux",                 # catalog → fal-ai/flux/dev
    output="fox.png",             # written when status == "completed"
    aspect_ratio="9:16",
    resolution="720p",
)
assert result.status == "completed"
print(result.request_id, result.cost_usd)

Async — for FastAPI / high-concurrency

import asyncio
from image_gen import generate_async

async def batch():
    return await asyncio.gather(*[
        generate_async(prompt=f"scene {i}", model="flux")
        for i in range(10)
    ])

Fallback chains

result = generate(
    prompt="…",
    model="fal:flux/dev",
    fallbacks=["openai", "higgsfield"],   # tried in order on retry-exhausted failure
    correlation_id="trace-42",
)
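The router's chain semantics can be illustrated with local stubs. `FallbackExhausted` is the library's exception name, but the class and router below are a self-contained sketch, not the real implementation:

```python
class FallbackExhausted(Exception):
    """Raised when every provider in the chain fails; carries the causes."""
    def __init__(self, errors):
        super().__init__(f"all {len(errors)} providers failed")
        self.errors = errors

def generate_with_fallbacks(prompt: str, chain: list, providers: dict) -> str:
    # Each provider's own retries are assumed to have already run inside its
    # callable; the router only advances after a provider gives up entirely.
    errors = []
    for name in chain:
        try:
            return providers[name](prompt)
        except Exception as exc:
            errors.append(exc)
    raise FallbackExhausted(errors)

def fal_provider(prompt):       # simulates retry-exhausted transient failure
    raise TimeoutError("fal down")

def openai_provider(prompt):
    return f"openai:{prompt}"

providers = {"fal": fal_provider, "openai": openai_provider}
print(generate_with_fallbacks("a neon fox", ["fal", "openai"], providers))
# → openai:a neon fox
```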

Direct provider client (bypass routing)

from image_gen import FalImageClient

with FalImageClient() as client:
    result = client.generate_image(
        prompt="cinematic forest",
        model="fal-ai/flux/dev",
        aspect_ratio="16:9",
        num_inference_steps=40,      # provider-specific kwarg
    )
    client.download(result.image_url, "forest.png")

Result shape

@dataclass
class GenerationResult:
    request_id: str
    status: str                # "completed" | "failed" | "blocked"
    image_bytes: bytes | None  # inline payload (OpenAI, Google)
    image_url: str | None      # remote URL (fal, Higgsfield)
    video_url: str | None      # Higgsfield video models
    cost_usd: float | None     # populated from catalog when available
    raw: dict | None           # full provider response + correlation_id if set

Callers that pass output= to generate() don't need to touch this — bytes/URL are handled automatically.

CLI

image-gen -p "a neon fox" -m flux -o fox.png
image-gen -p "…" -m "openai:gpt-image-1" --aspect 16:9 --resolution 1080p -o wide.png
image-gen -p "…" -m "higgsfield:higgsfield-ai/soul/standard" -o h.png

FastAPI integration

from fastapi import FastAPI, HTTPException
from image_gen import generate_async, FallbackExhausted

app = FastAPI()

@app.post("/images")
async def gen(prompt: str, model: str = "flux", x_correlation_id: str | None = None):
    try:
        r = await generate_async(prompt=prompt, model=model, correlation_id=x_correlation_id)
    except FallbackExhausted as e:
        raise HTTPException(502, str(e))
    if r.status == "blocked":
        raise HTTPException(422, r.raw)
    return {"url": r.image_url, "cost": r.cost_usd, "id": r.request_id}

Resilience

  • Retry: tenacity-backed exponential backoff on httpx.TimeoutException, ConnectError, ReadError, and HTTP 429/500/502/503/504. Configurable via IMAGE_GEN_RETRY_ATTEMPTS (default 3) and IMAGE_GEN_RETRY_MAX_WAIT (default 8s). Non-transient errors (ValueError, 4xx) are not retried.
  • Fallback: Router iterates [primary, *fallbacks]; each provider's retries run first, then the router moves on. FallbackExhausted is raised if all fail, carrying the list of underlying exceptions.
  • Blocked content: NSFW / policy refusals return status="blocked" instead of raising — never retried.
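The transient / non-transient split and the capped backoff can be sketched as two small pure functions. The status set comes from the bullet above; the exact wait curve tenacity produces may differ from this illustration:

```python
TRANSIENT_STATUSES = {429, 500, 502, 503, 504}

def is_retryable(status_code: int) -> bool:
    """Only the documented transient statuses retry; other 4xx fail fast."""
    return status_code in TRANSIENT_STATUSES

def backoff_wait(attempt: int, max_wait: float = 8.0) -> float:
    """Exponential backoff capped at IMAGE_GEN_RETRY_MAX_WAIT (default 8s)."""
    return min(2 ** attempt, max_wait)

print([backoff_wait(a) for a in range(5)])  # → [1, 2, 4, 8.0, 8.0]
```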

Observability

Every generate() / generate_async() call emits one structured log record image_gen.request with canonical fields:

request_id, provider, model, latency_ms, status, bytes_out,
cost_usd, retry_count, fallback_used, correlation_id

Set IMAGE_GEN_LOG_FORMAT=json for one-line JSON per request (Datadog / CloudWatch / Loki-friendly).
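If you'd rather wire up JSON output yourself than set the env var, a minimal formatter over the canonical fields looks like this. The field names are from the list above; the class is an illustration, not the library's `JSONFormatter`:

```python
import json
import logging

class RequestJSONFormatter(logging.Formatter):
    """Emit one JSON object per record; canonical fields are attached to
    the record via logging's `extra=` mechanism."""
    FIELDS = ("request_id", "provider", "model", "latency_ms", "status",
              "bytes_out", "cost_usd", "retry_count", "fallback_used",
              "correlation_id")

    def format(self, record: logging.LogRecord) -> str:
        payload = {"message": record.getMessage()}
        for field in self.FIELDS:
            if hasattr(record, field):
                payload[field] = getattr(record, field)
        return json.dumps(payload)

handler = logging.StreamHandler()
handler.setFormatter(RequestJSONFormatter())
logging.getLogger("image_gen.request").addHandler(handler)
```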

Install the optional [metrics] extra to expose Prometheus counters: image_gen_requests_total, image_gen_retries_total, image_gen_fallbacks_total, image_gen_cost_usd_total, and histogram image_gen_latency_seconds.

Extending — add a provider as a plugin

No fork needed. In your own package:

# runway_image_gen/__init__.py
from image_gen.providers.base import ImageProvider

class RunwayClient:
    def generate_image(self, prompt, model, aspect_ratio, resolution, **kw): ...
    def download(self, url, output_path): ...
    def close(self): ...
# pyproject.toml
[project.entry-points."image_gen.providers"]
runway = "runway_image_gen:RunwayClient"

Then pip install your-package and generate(model="runway:gen-3") just works — entry-point discovery picks it up.

Also works at runtime:

from image_gen import register_provider
register_provider("custom", MyClientFactory)
generate(model="custom:foo")

A complete example lives in examples/runway_plugin/.
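A runtime-registered provider only needs to satisfy the three-method surface shown above; no base class is required. The toy client below is self-contained (the `register_provider` call is shown as a comment because it needs the library installed):

```python
class EchoClient:
    """Toy provider satisfying the duck-typed three-method surface.
    A real client would call its vendor SDK inside generate_image."""

    def generate_image(self, prompt, model, aspect_ratio="1:1",
                       resolution="720p", **kw):
        return {"status": "completed", "model": model, "prompt": prompt}

    def download(self, url, output_path):
        raise NotImplementedError("echo provider returns inline results")

    def close(self):
        pass

# With the library installed, registration would look like:
#   from image_gen import register_provider
#   register_provider("echo", EchoClient)
#   generate(model="echo:any-model", prompt="hi")
client = EchoClient()
print(client.generate_image("hi", "any-model")["status"])  # → completed
```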

Layout

src/image_gen/
├── __init__.py                 # generate(), generate_async(), Router re-export
├── cli.py                      # image-gen CLI
├── config.py                   # lazy env readers
├── router.py                   # Router, FallbackExhausted
├── observability.py            # StructuredLogger, JSONFormatter, correlation_id
├── metrics.py                  # Prometheus (optional, gated on import)
├── py.typed                    # PEP 561 marker
└── providers/
    ├── __init__.py             # registry + plugin discovery (register_provider, get_provider)
    ├── base.py                 # ImageProvider Protocol (sync + async)
    ├── result.py               # GenerationResult
    ├── catalog.yaml            # SINGLE SOURCE OF TRUTH for models
    ├── catalog.py              # ModelEntry, resolve, validate
    ├── _retry.py               # @retry_policy, is_retryable
    ├── _poll.py                # poll_until, poll_until_async
    ├── README.md               # "how to add an async-task provider"
    ├── openai_images.py
    ├── fal_images.py
    ├── google_images.py        # AI Studio + Vertex AI
    └── higgsfield.py
examples/
├── runway_plugin/              # third-party plugin example
└── benchmark_concurrent.py     # async concurrency demo

License

MIT
