multi-model-image-gen
Unified Python library for image generation across OpenAI, fal.ai, Google (AI Studio + Vertex AI), and Higgsfield. One API, aisuite-style addressing, retry + fallback, async-first, plugin-extensible.
```python
from image_gen import generate

result = generate(
    prompt="a neon-lit Tokyo alley at night",
    model="fal:flux/dev",
    fallbacks=["openai", "higgsfield"],
    output="out.png",
)
```
Supported models
Catalog of models known to the library (current as of 2026-04-17). Shorthands resolve to the provider-native model ID automatically. Any full model ID also works; unknown models simply pass through without capability validation.
Single source of truth: `src/image_gen/providers/catalog.yaml`. To add or update a model, edit that file; no Python changes are needed.
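For illustration, a catalog entry might look something like this. The field names below are assumptions, not the real schema; check `catalog.yaml` itself before editing:

```yaml
# Hypothetical entry shape (not the actual schema).
fal:
  flux:
    model_id: fal-ai/flux/dev
    supports_seed: true
    supports_negative_prompt: false
    cost_per_image_usd: 0.025
```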
OpenAI — GPT Image family
| Shorthand | Model ID | Seed | Neg. prompt | Notes |
|---|---|---|---|---|
| `gpt-image-1.5` | `gpt-image-1.5` | – | – | Current SOTA — natively multimodal (text + image I/O). |
| `gpt-image-mini` | `gpt-image-1-mini` | – | – | Cheapest tier when quality is secondary. |
| `gpt-image` | `gpt-image-1` | – | – | Legacy. Kept for back-compat. |
Pricing is token-based (text + image tokens); see OpenAI's pricing page for current rates. The image API also supports `dall-e-2` / `dall-e-3` via full model IDs.
fal.ai — curated subset of 1000+ models
Any fal model works by full ID (`generate(model="fal:fal-ai/sdxl/lightning", …)`). The shorthands below cover the most common ones:
| Shorthand | Model ID | Seed | Neg. prompt | Cost / image |
|---|---|---|---|---|
| `flux2-pro` | `fal-ai/flux-2-pro` | ✓ | – | ~$0.03 / MP |
| `flux2-dev-turbo` | `fal-ai/flux-2-dev-turbo` | ✓ | – | ~$0.008 |
| `flux-pro` | `fal-ai/flux-pro` | ✓ | – | $0.050 |
| `flux` | `fal-ai/flux/dev` | ✓ | – | $0.025 |
| `flux-schnell` | `fal-ai/flux/schnell` | ✓ | – | $0.003 |
| `seedream` / `seedream-4.5` | `fal-ai/bytedance/seedream/v4.5/text-to-image` | ✓ | ✓ | $0.040 |
| `ideogram` | `fal-ai/ideogram/v3` | ✓ | ✓ | $0.030–0.090 |
| `recraft` | `fal-ai/recraft-v3` | ✓ | – | $0.040 raster / $0.080 vector |
| `nb2-fal` | `fal-ai/nano-banana-2` | – | – | $0.080 |
| `nb-pro-fal` | `fal-ai/nano-banana-pro` | – | – | $0.150 |
| `imagen4-fal` | `fal-ai/imagen4/preview` | – | – | variable |
fal acts as a unified gateway — you can use Google's Nano Banana and Imagen through fal's billing + API instead of Google's own credentials, which is occasionally useful for multi-tenant apps.
Google — Nano Banana + Imagen (AI Studio or Vertex AI)
The backend is selected from the environment: if `GOOGLE_CLOUD_PROJECT` is set, Vertex AI is used (IAM auth, regional, enterprise); otherwise, if `GOOGLE_API_KEY` is set, AI Studio is used (simple API key). Imagen models generally require Vertex.
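The selection precedence described above can be sketched in a few lines. This is illustrative only; `pick_google_backend` is not a function the library exports:

```python
import os

def pick_google_backend() -> str:
    # Mirrors the documented precedence: Vertex AI wins when a GCP
    # project is configured; otherwise fall back to an AI Studio key.
    if os.environ.get("GOOGLE_CLOUD_PROJECT"):
        return "vertex"
    if os.environ.get("GOOGLE_API_KEY"):
        return "ai-studio"
    raise RuntimeError("set GOOGLE_CLOUD_PROJECT or GOOGLE_API_KEY")
```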
Nano Banana (Gemini Image) — three tiers sharing the `generate_content` API surface:
| Shorthand | Model ID | Aliases | Description |
|---|---|---|---|
| `nb-pro` | `gemini-3-pro-image-preview` | Nano Banana Pro | Best reasoning ("Thinking"), high-fidelity text, most expensive. |
| `nb2` | `gemini-3.1-flash-image-preview` | Nano Banana 2 | Fast, high-volume. Trades some compositional depth for speed. |
| `nano-banana` | `gemini-2.5-flash-image` | original Nano Banana | Legacy speed/efficiency tier. |
Imagen 4 — three tiers:
| Shorthand | Model ID | Notes |
|---|---|---|
| `imagen4-ultra` | `imagen-4.0-ultra-generate-001` | Highest fidelity; supports 1K + 2K + 4K. |
| `imagen4` | `imagen-4.0-generate-001` | Default Imagen 4. |
| `imagen4-fast` | `imagen-4.0-fast-generate-001` | Cheapest / fastest. |
Output from all Google image models includes a SynthID watermark.
Higgsfield — async-task platform (submit → poll → download)
Polling is built in via `poll_until`. Higgsfield also gives day-0 access to third-party video models.
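A generic submit → poll → download loop of the kind `poll_until` implements looks roughly like this. It is a sketch under assumed names, not the library's actual signature:

```python
import time

def poll_until(fetch_status, is_done, interval_s=2.0, timeout_s=300.0):
    # Repeatedly fetch the task status until the predicate reports the
    # task finished, or give up after timeout_s seconds.
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = fetch_status()
        if is_done(status):
            return status
        time.sleep(interval_s)
    raise TimeoutError(f"task did not finish within {timeout_s}s")
```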
Soul (photorealistic image generation):
| Shorthand | Model ID |
|---|---|
| `higgsfield-soul` | `higgsfield-ai/soul/standard` |
| `higgsfield-soul-pro` | `higgsfield-ai/soul/pro` |
DoP (image-to-video, three quality/speed tiers):
| Shorthand | Model ID |
|---|---|
| `higgsfield-dop` | `higgsfield-ai/dop/lite` |
| `higgsfield-dop-preview` | `higgsfield-ai/dop/preview` |
| `higgsfield-dop-turbo` | `higgsfield-ai/dop/turbo` |
Kling (video via Higgsfield platform):
| Shorthand | Model ID | Notes |
|---|---|---|
| `higgsfield-kling` | `kling-video/v3.0/master/image-to-video` | Kling 3.0 — unified video + audio + images, multi-shot. |
| `higgsfield-kling-v2` | `kling-video/v2.1/pro/image-to-video` | Kling 2.1 Master. |
Other Higgsfield platform models (Sora 2, WAN 2.5, MiniMax Hailuo 02, Seedance Pro, etc.) work by passing their full ID: `generate(model="higgsfield:sora-2/preview", …)`.
All models support
Aspect ratios 9:16, 16:9, and 1:1; resolutions 720p and 1080p. Requests beyond a model's capabilities are rejected pre-flight with `UnsupportedCapabilityError`; the SDK call is never made. To add a model or update its capabilities, edit `src/image_gen/providers/catalog.yaml`; no Python changes are needed.
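The pre-flight check can be pictured like this. It is a simplified sketch: in the library the allowed values come from per-model entries in `catalog.yaml`, not module-level constants:

```python
class UnsupportedCapabilityError(ValueError):
    """Raised before any SDK call when a request exceeds model capabilities."""

ALLOWED_ASPECT_RATIOS = {"9:16", "16:9", "1:1"}
ALLOWED_RESOLUTIONS = {"720p", "1080p"}

def validate_capabilities(aspect_ratio: str, resolution: str) -> None:
    # Reject unsupported combinations up front, before any network call.
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise UnsupportedCapabilityError(f"aspect_ratio {aspect_ratio!r} is not supported")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise UnsupportedCapabilityError(f"resolution {resolution!r} is not supported")
```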
Install
```shell
pip install git+https://github.com/oj-rivas/multi-model-image-gen.git

# or pinned:
pip install git+https://github.com/oj-rivas/multi-model-image-gen.git@v0.2.0

# or editable from local checkout:
pip install -e /path/to/multi-model-image-gen

# with Prometheus metrics:
pip install "multi-model-image-gen[metrics] @ git+…"
```
Credentials
Set only the keys for providers you'll actually call:
```shell
# OpenAI
OPENAI_API_KEY=sk-…

# fal.ai
FAL_KEY=…

# Google — pick ONE backend:
GOOGLE_CLOUD_PROJECT=your-gcp-project   # Vertex AI (enterprise)
GOOGLE_CLOUD_LOCATION=us-central1       # default
# …OR…
GOOGLE_API_KEY=…                        # AI Studio (simple)

# Higgsfield
HIGGSFIELD_API_KEY=…
HIGGSFIELD_API_SECRET=…
HIGGSFIELD_BASE_URL=https://platform.higgsfield.ai   # default
```
The library does not load .env for you. Use python-dotenv or your framework's loader.
Addressing
Three ways to name a model:
| Form | Example | Provider chosen by |
|---|---|---|
| `provider:model` | `"fal:flux/dev"` | explicit prefix |
| Bare shorthand | `"flux"` | catalog |
| Bare full model ID | `"higgsfield-ai/soul/standard"` | catalog → higgsfield default |
An unknown provider prefix raises `ValueError`. Unknown bare models route to `higgsfield` (which uses full IDs rather than shorthands).
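The three addressing forms reduce to a small resolution routine. This sketch assumes a catalog mapping shorthands to (provider, model ID) pairs; it is illustrative, not the library's internal code:

```python
KNOWN_PROVIDERS = {"openai", "fal", "google", "higgsfield"}

def resolve(model: str, catalog: dict[str, tuple[str, str]]) -> tuple[str, str]:
    # 1. explicit "provider:model" prefix
    if ":" in model:
        provider, _, model_id = model.partition(":")
        if provider not in KNOWN_PROVIDERS:
            raise ValueError(f"unknown provider prefix: {provider!r}")
        return provider, model_id
    # 2. bare shorthand known to the catalog
    if model in catalog:
        return catalog[model]
    # 3. bare full model ID routes to higgsfield by default
    return "higgsfield", model
```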
Library usage
Sync
```python
from image_gen import generate

result = generate(
    prompt="a neon fox",
    model="flux",          # catalog → fal-ai/flux/dev
    output="fox.png",      # written when status == "completed"
    aspect_ratio="9:16",
    resolution="720p",
)
assert result.status == "completed"
print(result.request_id, result.cost_usd)
```
Async — for FastAPI / high-concurrency
```python
import asyncio
from image_gen import generate_async

async def batch():
    return await asyncio.gather(*[
        generate_async(prompt=f"scene {i}", model="flux")
        for i in range(10)
    ])
```
Fallback chains
```python
result = generate(
    prompt="…",
    model="fal:flux/dev",
    fallbacks=["openai", "higgsfield"],  # tried in order on retry-exhausted failure
    correlation_id="trace-42",
)
```
Direct provider client (bypass routing)
```python
from image_gen import FalImageClient

with FalImageClient() as client:
    result = client.generate_image(
        prompt="cinematic forest",
        model="fal-ai/flux/dev",
        aspect_ratio="16:9",
        num_inference_steps=40,  # provider-specific kwarg
    )
    client.download(result.image_url, "forest.png")
```
Result shape
```python
@dataclass
class GenerationResult:
    request_id: str
    status: str                 # "completed" | "failed" | "blocked"
    image_bytes: bytes | None   # inline payload (OpenAI, Google)
    image_url: str | None       # remote URL (fal, Higgsfield)
    video_url: str | None       # Higgsfield video models
    cost_usd: float | None      # populated from catalog when available
    raw: dict | None            # full provider response + correlation_id if set
```
Callers that pass `output=` to `generate()` don't need to touch this; bytes/URL are handled automatically.
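What `output=` does for you can be approximated as follows. This is an illustration; `save_result` is not part of the public API:

```python
import urllib.request

def save_result(result, path: str) -> None:
    # Inline payloads (OpenAI, Google) are written directly; remote URLs
    # (fal, Higgsfield) are downloaded.
    if result.image_bytes is not None:
        with open(path, "wb") as f:
            f.write(result.image_bytes)
    elif result.image_url is not None:
        urllib.request.urlretrieve(result.image_url, path)
    else:
        raise ValueError("result carries neither image bytes nor an image URL")
```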
CLI
```shell
image-gen -p "a neon fox" -m flux -o fox.png
image-gen -p "…" -m "openai:gpt-image-1" --aspect 16:9 --resolution 1080p -o wide.png
image-gen -p "…" -m "higgsfield:higgsfield-ai/soul/standard" -o h.png
```
FastAPI integration
```python
from fastapi import FastAPI, HTTPException

from image_gen import FallbackExhausted, generate_async

app = FastAPI()

@app.post("/images")
async def gen(prompt: str, model: str = "flux", x_correlation_id: str | None = None):
    try:
        r = await generate_async(prompt=prompt, model=model, correlation_id=x_correlation_id)
    except FallbackExhausted as e:
        raise HTTPException(502, str(e))
    if r.status == "blocked":
        raise HTTPException(422, r.raw)
    return {"url": r.image_url, "cost": r.cost_usd, "id": r.request_id}
```
Resilience
- Retry: tenacity-backed exponential backoff on `httpx.TimeoutException`, `ConnectError`, `ReadError`, and HTTP 429/500/502/503/504. Configurable via `IMAGE_GEN_RETRY_ATTEMPTS` (default 3) and `IMAGE_GEN_RETRY_MAX_WAIT` (default 8 s). Non-transient errors (`ValueError`, 4xx) are not retried.
- Fallback: `Router` iterates `[primary, *fallbacks]`; each provider's retries run first, then the router moves on. `FallbackExhausted` is raised if all fail, carrying the list of underlying exceptions.
- Blocked content: NSFW / policy refusals return `status="blocked"` instead of raising, and are never retried.
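The router's fallback loop amounts to something like this sketch. Provider-level retries are assumed to happen inside `call_provider`; the helper names are illustrative, although `FallbackExhausted` itself is exported by the library:

```python
class FallbackExhausted(Exception):
    """All providers in the chain failed; carries the underlying errors."""
    def __init__(self, errors):
        super().__init__(f"all {len(errors)} providers failed")
        self.errors = errors

def run_chain(primary, fallbacks, call_provider):
    # Try each provider in order; each call is expected to do its own
    # retries internally before raising.
    errors = []
    for provider in [primary, *fallbacks]:
        try:
            return call_provider(provider)
        except Exception as exc:
            errors.append(exc)
    raise FallbackExhausted(errors)
```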
Observability
Every `generate()` / `generate_async()` call emits one structured log record, `image_gen.request`, with canonical fields:

```
request_id, provider, model, latency_ms, status, bytes_out,
cost_usd, retry_count, fallback_used, correlation_id
```

Set `IMAGE_GEN_LOG_FORMAT=json` for one-line JSON per request (Datadog / CloudWatch / Loki-friendly).
Install the optional `[metrics]` extra to expose the Prometheus counters `image_gen_requests_total`, `image_gen_retries_total`, `image_gen_fallbacks_total`, and `image_gen_cost_usd_total`, plus the histogram `image_gen_latency_seconds`.
Extending — add a provider as a plugin
No fork needed. In your own package:
```python
# runway_image_gen/__init__.py
from image_gen.providers.base import ImageProvider  # structural Protocol

class RunwayClient:
    def generate_image(self, prompt, model, aspect_ratio, resolution, **kw): ...
    def download(self, url, output_path): ...
    def close(self): ...
```

```toml
# pyproject.toml
[project.entry-points."image_gen.providers"]
runway = "runway_image_gen:RunwayClient"
```
Then `pip install your-package` and `generate(model="runway:gen-3")` just works; entry-point discovery picks it up.
Also works at runtime:
```python
from image_gen import generate, register_provider

register_provider("custom", MyClientFactory)
generate(model="custom:foo")
```
A complete example lives in `examples/runway_plugin/`.
Layout
```
src/image_gen/
├── __init__.py          # generate(), generate_async(), Router re-export
├── cli.py               # image-gen CLI
├── config.py            # lazy env readers
├── router.py            # Router, FallbackExhausted
├── observability.py     # StructuredLogger, JSONFormatter, correlation_id
├── metrics.py           # Prometheus (optional, gated on import)
├── py.typed             # PEP 561 marker
└── providers/
    ├── __init__.py      # registry + plugin discovery (register_provider, get_provider)
    ├── base.py          # ImageProvider Protocol (sync + async)
    ├── result.py        # GenerationResult
    ├── catalog.yaml     # SINGLE SOURCE OF TRUTH for models
    ├── catalog.py       # ModelEntry, resolve, validate
    ├── _retry.py        # @retry_policy, is_retryable
    ├── _poll.py         # poll_until, poll_until_async
    ├── README.md        # "how to add an async-task provider"
    ├── openai_images.py
    ├── fal_images.py
    ├── google_images.py # AI Studio + Vertex AI
    └── higgsfield.py

examples/
├── runway_plugin/           # third-party plugin example
└── benchmark_concurrent.py  # async concurrency demo
```
License
MIT
Download files
Source Distribution
Built Distribution
File details
Details for the file multi_model_image_gen-0.2.0.tar.gz.
File metadata
- Download URL: multi_model_image_gen-0.2.0.tar.gz
- Size: 57.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `70821517e441f480d98fec9a2841d97489a4bc4d8ebbbbe416143826527c70f6` |
| MD5 | `33821d7d206117e55336ee83d153d16a` |
| BLAKE2b-256 | `5874b63f2f39b7de4d4483d4c3405b1befd7b8e5f6e330c102e25b629bac34ea` |
File details
Details for the file multi_model_image_gen-0.2.0-py3-none-any.whl.
File metadata
- Download URL: multi_model_image_gen-0.2.0-py3-none-any.whl
- Size: 42.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `2a7aa0cea9e9957af6b2d739229355a8b6e739f0d4f5e307df08a744b7e2496c` |
| MD5 | `0e6a86f911c9702c1f5b250409958c94` |
| BLAKE2b-256 | `f6534e0ee8c318a117defd19788b8634c3a77cbf2e3cfe47cd0a35e7434adb3f` |