
multi-model-image-gen

Unified Python library for image generation across OpenAI, fal.ai, Google (AI Studio + Vertex AI), and Higgsfield. One API, aisuite-style addressing, retry + fallback, async-first, plugin-extensible.

from image_gen import generate

result = generate(
    prompt="a neon-lit Tokyo alley at night",
    model="fal:flux/dev",
    fallbacks=["openai", "higgsfield"],
    output="out.png",
)

Supported models

Catalog of models known to the library (current as of 2026-04-17). Shorthands resolve to the provider-native model ID automatically. Any full model ID also works — unknown models just pass through without capability validation.

Single source of truth: src/image_gen/providers/catalog.yaml. To add or update a model, edit that file — no Python changes needed.

OpenAI — GPT Image family

| Shorthand | Model ID | Seed | Neg. prompt | Notes |
|---|---|---|---|---|
| gpt-image-1.5 | gpt-image-1.5 | | | Current SOTA — natively multimodal (text + image I/O). |
| gpt-image-mini | gpt-image-1-mini | | | Cheapest tier when quality is secondary. |
| gpt-image | gpt-image-1 | | | Legacy. Kept for back-compat. |

Pricing is token-based (text + image tokens) — see OpenAI's pricing page for current rates. The image API also supports dall-e-2 / dall-e-3 via full model IDs.

fal.ai — curated subset of 1000+ models

Any fal model works by full ID (generate(model="fal:fal-ai/sdxl/lightning", …)). The shorthands below cover the most common ones:

| Shorthand | Model ID | Seed | Neg. prompt | Cost / image |
|---|---|---|---|---|
| flux2-pro | fal-ai/flux-2-pro | | | ~$0.03 / MP |
| flux2-dev-turbo | fal-ai/flux-2-dev-turbo | | | ~$0.008 |
| flux-pro | fal-ai/flux-pro | | | $0.050 |
| flux | fal-ai/flux/dev | | | $0.025 |
| flux-schnell | fal-ai/flux/schnell | | | $0.003 |
| seedream / seedream-4.5 | fal-ai/bytedance/seedream/v4.5/text-to-image | | | $0.040 |
| ideogram | fal-ai/ideogram/v3 | | | $0.030–0.090 |
| recraft | fal-ai/recraft-v3 | | | $0.040 raster / $0.080 vector |
| nb2-fal | fal-ai/nano-banana-2 | | | $0.080 |
| nb-pro-fal | fal-ai/nano-banana-pro | | | $0.150 |
| imagen4-fal | fal-ai/imagen4/preview | | | variable |

fal acts as a unified gateway — you can use Google's Nano Banana and Imagen through fal's billing + API instead of Google's own credentials, which is occasionally useful for multi-tenant apps.

Google — Nano Banana + Imagen (AI Studio or Vertex AI)

Two providers, each with independent auth:

| Provider | Env var | Auth method | Use case |
|---|---|---|---|
| google | GOOGLE_VERTEX_API_KEY | Vertex AI (API key) | Railway / production |
| google | GOOGLE_CLOUD_PROJECT | Vertex AI (ADC) | Local dev |
| google-studio | GOOGLE_GEMINI_API_KEY | AI Studio | Gemini-only, no GCP project |
| google | GOOGLE_API_KEY + GOOGLE_GENAI_USE_VERTEXAI=True | Vertex AI (API key) | Legacy / compat |
| google | GOOGLE_API_KEY alone | AI Studio | Legacy / compat |

GOOGLE_VERTEX_API_KEY takes priority for the google provider — set it to use a Vertex-bound API key instead of ADC. Imagen models require Vertex. google-studio shorthands (nb2, nano-banana, nb-pro) also work via google-studio:model-id if you want them on AI Studio billing.
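The precedence above can be sketched as a small selection function. This is illustrative only, not the library's actual code (the real logic lives in google_images.py); in particular, the ordering of ADC versus a bare GOOGLE_API_KEY when both are set is an assumption:

```python
def google_auth_mode(env: dict) -> str:
    """Pick an auth mode for the "google" provider from the documented env vars."""
    if env.get("GOOGLE_VERTEX_API_KEY"):
        return "vertex-api-key"            # takes priority over everything else
    if env.get("GOOGLE_API_KEY") and env.get("GOOGLE_GENAI_USE_VERTEXAI") == "True":
        return "vertex-api-key"            # legacy / compat
    if env.get("GOOGLE_CLOUD_PROJECT"):
        return "vertex-adc"                # local dev via gcloud ADC
    if env.get("GOOGLE_API_KEY"):
        return "ai-studio"                 # legacy single key
    raise RuntimeError("no Google credentials configured")
```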

Nano Banana (Gemini Image) — three tiers, shared generate_content API surface:

| Shorthand | Model ID | Aliases | Description |
|---|---|---|---|
| nb-pro | gemini-3-pro-image-preview | Nano Banana Pro | Best reasoning ("Thinking"), high-fidelity text, most expensive. |
| nb2 | gemini-3.1-flash-image-preview | Nano Banana 2 | Fast, high-volume. Trades some compositional depth for speed. |
| nano-banana | gemini-2.5-flash-image | original Nano Banana | Legacy speed/efficiency tier. |

Imagen 4 — three tiers:

| Shorthand | Model ID | Notes |
|---|---|---|
| imagen4-ultra | imagen-4.0-ultra-generate-001 | Highest fidelity; supports 1K, 2K, and 4K. |
| imagen4 | imagen-4.0-generate-001 | Default Imagen 4. |
| imagen4-fast | imagen-4.0-fast-generate-001 | Cheapest / fastest. |

Output from all Google image models includes a SynthID watermark.

Higgsfield — async-task platform (submit → poll → download)

Polling is built in via poll_until. Higgsfield also gives day-0 access to third-party video models.

Soul (photorealistic image generation):

| Shorthand | Model ID |
|---|---|
| higgsfield-soul | higgsfield-ai/soul/standard |
| higgsfield-soul-pro | higgsfield-ai/soul/pro |

DoP (image-to-video, three quality/speed tiers):

| Shorthand | Model ID |
|---|---|
| higgsfield-dop | higgsfield-ai/dop/lite |
| higgsfield-dop-preview | higgsfield-ai/dop/preview |
| higgsfield-dop-turbo | higgsfield-ai/dop/turbo |

Kling (video via Higgsfield platform):

| Shorthand | Model ID | Notes |
|---|---|---|
| higgsfield-kling | kling-video/v3.0/master/image-to-video | Kling 3.0 — unified video + audio + images, multi-shot. |
| higgsfield-kling-v2 | kling-video/v2.1/pro/image-to-video | Kling 2.1 Master. |

Other Higgsfield platform models (Sora 2, WAN 2.5, MiniMax Hailuo 02, Seedance Pro, etc.) work by passing their full ID: generate(model="higgsfield:sora-2/preview", …).

All models support

Aspect ratios 9:16, 16:9, and 1:1; resolutions 720p and 1080p. Requests for capabilities beyond these are rejected pre-flight with UnsupportedCapabilityError — the provider SDK call is never made. Add a model, or update capabilities, by editing src/image_gen/providers/catalog.yaml — no Python changes needed.
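The catalog schema is not reproduced in this README. As a purely hypothetical sketch (field names are illustrative assumptions, not the actual schema — check catalog.yaml itself), an entry might look like:

```yaml
# Hypothetical catalog entry — field names are illustrative only.
flux-schnell:
  provider: fal
  model_id: fal-ai/flux/schnell
  cost_usd: 0.003
  aspect_ratios: ["9:16", "16:9", "1:1"]
  resolutions: ["720p", "1080p"]
```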

Install

pip install git+https://github.com/oj-rivas/multi-model-image-gen.git
# or pinned:
pip install git+https://github.com/oj-rivas/multi-model-image-gen.git@v0.2.0
# or editable from local checkout:
pip install -e /path/to/multi-model-image-gen
# with Prometheus metrics:
pip install "multi-model-image-gen[metrics] @ git+…"

Credentials

Set only the keys for providers you'll actually call:

# OpenAI
OPENAI_API_KEY=sk-…

# fal.ai
FAL_KEY=

# Google — Vertex AI (production / Railway)
GOOGLE_VERTEX_API_KEY=                   # "google" provider → Vertex with API key
GOOGLE_GEMINI_API_KEY=                   # "google-studio" provider → AI Studio
GOOGLE_CLOUD_PROJECT=your-gcp-project     # required for Vertex (Imagen 4)
GOOGLE_CLOUD_LOCATION=us-central1         # default
# Google — local dev (ADC)
# GOOGLE_CLOUD_PROJECT=your-gcp-project   # Vertex via gcloud auth application-default login
# Google — legacy single key
# GOOGLE_API_KEY=…                        # AI Studio, or Vertex with GOOGLE_GENAI_USE_VERTEXAI=True

# Higgsfield
HIGGSFIELD_API_KEY=
HIGGSFIELD_API_SECRET=
HIGGSFIELD_BASE_URL=https://platform.higgsfield.ai  # default

The library does not load .env for you. Use python-dotenv or your framework's loader.

Addressing

Three ways to name a model:

| Form | Example | Provider chosen by |
|---|---|---|
| provider:model | "fal:flux/dev" | explicit prefix |
| Bare shorthand | "flux" | catalog |
| Bare full model ID | "higgsfield-ai/soul/standard" | catalog → higgsfield default |

An unknown provider prefix raises ValueError. Unknown bare models route to higgsfield (which uses full IDs rather than shorthands).

Library usage

Sync

from image_gen import generate

result = generate(
    prompt="a neon fox",
    model="flux",                 # catalog → fal-ai/flux/dev
    output="fox.png",             # written when status == "completed"
    aspect_ratio="9:16",
    resolution="720p",
)
assert result.status == "completed"
print(result.request_id, result.cost_usd)

Async — for FastAPI / high-concurrency

import asyncio
from image_gen import generate_async

async def batch():
    return await asyncio.gather(*[
        generate_async(prompt=f"scene {i}", model="flux")
        for i in range(10)
    ])

Fallback chains

result = generate(
    prompt="…",
    model="fal:flux/dev",
    fallbacks=["openai", "higgsfield"],   # tried in order on retry-exhausted failure
    correlation_id="trace-42",
)

Direct provider client (bypass routing)

from image_gen import FalImageClient

with FalImageClient() as client:
    result = client.generate_image(
        prompt="cinematic forest",
        model="fal-ai/flux/dev",
        aspect_ratio="16:9",
        num_inference_steps=40,      # provider-specific kwarg
    )
    client.download(result.image_url, "forest.png")

Result shape

@dataclass
class GenerationResult:
    request_id: str
    status: str                # "completed" | "failed" | "blocked"
    image_bytes: bytes | None  # inline payload (OpenAI, Google)
    image_url: str | None      # remote URL (fal, Higgsfield)
    video_url: str | None      # Higgsfield video models
    cost_usd: float | None     # populated from catalog when available
    raw: dict | None           # full provider response + correlation_id if set

Callers that pass output= to generate() don't need to touch this — bytes/URL are handled automatically.

CLI

image-gen -p "a neon fox" -m flux -o fox.png
image-gen -p "…" -m "openai:gpt-image-1" --aspect 16:9 --resolution 1080p -o wide.png
image-gen -p "…" -m "higgsfield:higgsfield-ai/soul/standard" -o h.png

FastAPI integration

from fastapi import FastAPI, HTTPException
from image_gen import generate_async, FallbackExhausted

app = FastAPI()

@app.post("/images")
async def gen(prompt: str, model: str = "flux", x_correlation_id: str | None = None):
    try:
        r = await generate_async(prompt=prompt, model=model, correlation_id=x_correlation_id)
    except FallbackExhausted as e:
        raise HTTPException(502, str(e))
    if r.status == "blocked":
        raise HTTPException(422, r.raw)
    return {"url": r.image_url, "cost": r.cost_usd, "id": r.request_id}

Resilience

  • Retry: tenacity-backed exponential backoff on httpx.TimeoutException, ConnectError, ReadError, and HTTP 429/500/502/503/504. Configurable via IMAGE_GEN_RETRY_ATTEMPTS (default 3) and IMAGE_GEN_RETRY_MAX_WAIT (default 8s). Non-transient errors (ValueError, 4xx) are not retried.
  • Fallback: Router iterates [primary, *fallbacks]; each provider's retries run first, then the router moves on. FallbackExhausted is raised if all fail, carrying the list of underlying exceptions.
  • Blocked content: NSFW / policy refusals return status="blocked" instead of raising — never retried.
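The retry knobs above are plain environment variables, so tuning them is a one-liner in your deployment config (the values below are examples, not recommendations):

```shell
# Tune retry behaviour via environment variables.
export IMAGE_GEN_RETRY_ATTEMPTS=5    # default 3
export IMAGE_GEN_RETRY_MAX_WAIT=15   # default 8 (seconds)
echo "$IMAGE_GEN_RETRY_ATTEMPTS attempts, ${IMAGE_GEN_RETRY_MAX_WAIT}s max wait"
```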

Observability

Every generate() / generate_async() call emits one structured log record image_gen.request with canonical fields:

request_id, provider, model, latency_ms, status, bytes_out,
cost_usd, retry_count, fallback_used, correlation_id

Set IMAGE_GEN_LOG_FORMAT=json for one-line JSON per request (Datadog / CloudWatch / Loki-friendly).

Install the optional [metrics] extra to expose Prometheus counters: image_gen_requests_total, image_gen_retries_total, image_gen_fallbacks_total, image_gen_cost_usd_total, and histogram image_gen_latency_seconds.

Extending — add a provider as a plugin

No fork needed. In your own package:

# runway_image_gen/__init__.py
from image_gen.providers.base import ImageProvider

class RunwayClient:
    def generate_image(self, prompt, model, aspect_ratio, resolution, **kw): ...
    def download(self, url, output_path): ...
    def close(self): ...
# pyproject.toml
[project.entry-points."image_gen.providers"]
runway = "runway_image_gen:RunwayClient"

Then pip install your-package and generate(model="runway:gen-3") just works — entry-point discovery picks it up.

Also works at runtime:

from image_gen import register_provider
register_provider("custom", MyClientFactory)
generate(model="custom:foo")

A complete example lives in examples/runway_plugin/.

Layout

src/image_gen/
├── __init__.py                 # generate(), generate_async(), Router re-export
├── cli.py                      # image-gen CLI
├── config.py                   # lazy env readers
├── router.py                   # Router, FallbackExhausted
├── observability.py            # StructuredLogger, JSONFormatter, correlation_id
├── metrics.py                  # Prometheus (optional, gated on import)
├── py.typed                    # PEP 561 marker
└── providers/
    ├── __init__.py             # registry + plugin discovery (register_provider, get_provider)
    ├── base.py                 # ImageProvider Protocol (sync + async)
    ├── result.py               # GenerationResult
    ├── catalog.yaml            # SINGLE SOURCE OF TRUTH for models
    ├── catalog.py              # ModelEntry, resolve, validate
    ├── _retry.py               # @retry_policy, is_retryable
    ├── _poll.py                # poll_until, poll_until_async
    ├── README.md               # "how to add an async-task provider"
    ├── openai_images.py
    ├── fal_images.py
    ├── google_images.py        # AI Studio + Vertex AI
    └── higgsfield.py
examples/
├── runway_plugin/              # third-party plugin example
└── benchmark_concurrent.py     # async concurrency demo

License

MIT
