
AbstractVision

Model-agnostic generative vision API (images, optional video) for Python and the Abstract* ecosystem.

What you get

  • A stable task API: VisionManager (src/abstractvision/vision_manager.py)
  • A packaged capability registry (“what models can do”): VisionModelCapabilitiesRegistry backed by src/abstractvision/assets/vision_model_capabilities.json
  • Optional artifact-ref outputs (small JSON refs): LocalAssetStore / store adapters (src/abstractvision/artifacts.py)
  • Built-in backends (src/abstractvision/backends/):
    • OpenAI-compatible HTTP (openai_compatible.py)
    • Local Diffusers (huggingface_diffusers.py)
    • Local stable-diffusion.cpp / GGUF (stable_diffusion_cpp.py)
  • CLI/REPL for manual testing: abstractvision ... (src/abstractvision/cli.py)

Status (current backend support)

  • Built-in backends implement: text_to_image and image_to_image.
  • Video (text_to_video, image_to_video) is supported only via the OpenAI-compatible backend when endpoints are configured.
  • multi_view_image is part of the public API (VisionManager.generate_angles) but no built-in backend implements it yet.

Details: docs/reference/backends.md.

Installation

pip install abstractvision

Install optional integrations:

pip install "abstractvision[abstractcore]"

Some newer model pipelines may require Diffusers from GitHub main (see docs/getting-started.md):

pip install -U "abstractvision[huggingface-dev]"

For local dev (from a repo checkout):

pip install -e .

Usage

Start here:

  • Getting started: docs/getting-started.md
  • FAQ: docs/faq.md
  • API reference: docs/api.md
  • Architecture: docs/architecture.md
  • Docs index: docs/README.md

Capability-driven model selection

from abstractvision import VisionModelCapabilitiesRegistry

reg = VisionModelCapabilitiesRegistry()
assert reg.supports("Qwen/Qwen-Image-2512", "text_to_image")

print(reg.list_tasks())
print(reg.models_for_task("text_to_image"))
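Conceptually, the registry is a lookup over a JSON capability map. A minimal stand-in (hypothetical model names; not the real file's schema, which lives in src/abstractvision/assets/vision_model_capabilities.json) behaves like:

```python
# Toy stand-in for the packaged capability map -- a conceptual sketch
# only; the real registry may use a richer schema.
CAPS = {
    "example/model-a": ["text_to_image", "image_to_image"],
    "example/model-b": ["text_to_image", "text_to_video"],
}

def supports(model_id, task):
    # Unknown models simply support nothing.
    return task in CAPS.get(model_id, ())

def models_for_task(task):
    return sorted(m for m, tasks in CAPS.items() if task in tasks)

print(supports("example/model-a", "text_to_image"))  # True
print(models_for_task("text_to_video"))              # ['example/model-b']
```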

Backend wiring + generation (artifact outputs)

The default install is “batteries included” (Torch + Diffusers + stable-diffusion.cpp python bindings), but heavy modules are imported lazily (see src/abstractvision/backends/__init__.py).

from abstractvision import LocalAssetStore, VisionManager, VisionModelCapabilitiesRegistry, is_artifact_ref
from abstractvision.backends import OpenAICompatibleBackendConfig, OpenAICompatibleVisionBackend

reg = VisionModelCapabilitiesRegistry()

backend = OpenAICompatibleVisionBackend(
    config=OpenAICompatibleBackendConfig(
        base_url="http://localhost:1234/v1",
        api_key="YOUR_KEY",      # optional for local servers
        model_id="REMOTE_MODEL", # optional (server-dependent)
    )
)

vm = VisionManager(
    backend=backend,
    store=LocalAssetStore(),         # enables artifact-ref outputs
    model_id="zai-org/GLM-Image",    # optional: capability gating
    registry=reg,                    # optional: reuse loaded registry
)

out = vm.generate_image("a cinematic photo of a red fox in snow")
assert is_artifact_ref(out)
print(out)  # {"$artifact": "...", "content_type": "...", ...}

png_bytes = vm.store.load_bytes(out["$artifact"])  # type: ignore[union-attr]
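An artifact ref is a small JSON-serializable dict rather than raw bytes. A minimal stand-in check, mirroring the `{"$artifact": ...}` shape shown above (illustrative only; use the real is_artifact_ref in practice), might look like:

```python
import json

def looks_like_artifact_ref(obj):
    # Illustrative only: the documented shape is a dict carrying an
    # "$artifact" id plus metadata such as "content_type".
    return isinstance(obj, dict) and "$artifact" in obj

ref = {"$artifact": "a1b2c3", "content_type": "image/png"}
print(looks_like_artifact_ref(ref))         # True
print(looks_like_artifact_ref(b"\x89PNG"))  # False (raw bytes, not a ref)
print(json.dumps(ref))                      # refs serialize cleanly
```

Because refs are plain JSON, they stay small enough to pass through tool-calling payloads while the heavy image bytes remain in the store.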

Interactive testing (CLI / REPL)

abstractvision models
abstractvision tasks
abstractvision show-model zai-org/GLM-Image

abstractvision repl

Inside the REPL:

/backend openai http://localhost:1234/v1
/cap-model zai-org/GLM-Image
/set width 1024
/set height 1024
/t2i "a watercolor painting of a lighthouse" --open

The CLI/REPL can also be configured via ABSTRACTVISION_* env vars; see docs/reference/configuration.md.

One-shot commands (OpenAI-compatible HTTP backend only):

abstractvision t2i --base-url http://localhost:1234/v1 "a studio photo of an espresso machine"
abstractvision i2i --base-url http://localhost:1234/v1 --image ./input.png "make it watercolor"

Local GGUF via stable-diffusion.cpp

If you want to run GGUF diffusion models locally (e.g. Qwen Image), use the stable-diffusion.cpp backend (sdcpp).

Recommended (pip-only; no external binary download): the default pip install abstractvision already includes the stable-diffusion.cpp Python bindings (stable-diffusion-cpp-python).

Alternative (external executable): run the backend in CLI mode against an sd-cli binary instead of the bundled Python bindings.

In the REPL:

/backend sdcpp /path/to/qwen-image-2512-Q4_K_M.gguf /path/to/qwen_image_vae.safetensors /path/to/Qwen2.5-VL-7B-Instruct-*.gguf
/t2i "a watercolor painting of a lighthouse" --sampling-method euler --offload-to-cpu --diffusion-fa --flow-shift 3 --open

Extra flags are forwarded via request.extra. In CLI mode they are passed through to sd-cli; in Python-bindings mode, keys are mapped to binding kwargs when supported, and unsupported keys are ignored.
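The "unsupported keys are ignored" behavior can be pictured as signature-based filtering. This is an illustrative sketch with hypothetical function names, not the backend's actual internals:

```python
import inspect

def map_extras_to_kwargs(extras, fn):
    # Keep only keys matching the callable's keyword parameters;
    # anything the binding doesn't understand is silently dropped.
    accepted = set(inspect.signature(fn).parameters)
    return {k: v for k, v in extras.items() if k in accepted}

def fake_txt2img(prompt, sample_method="euler", cfg_scale=7.0):
    # Stand-in for a python-binding entry point.
    pass

extras = {"sample_method": "euler", "flow_shift": 3}
print(map_extras_to_kwargs(extras, fake_txt2img))
# {'sample_method': 'euler'} -- 'flow_shift' is not a known kwarg here
```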

AbstractCore tool integration (artifact refs)

If you’re using AbstractCore tool calling, AbstractVision can expose vision tasks as tools:

from abstractvision.integrations.abstractcore import make_vision_tools

tools = make_vision_tools(vision_manager=vm, model_id="zai-org/GLM-Image")

Project

  • Release notes: CHANGELOG.md
  • Contributing: CONTRIBUTING.md
  • Security: SECURITY.md
  • Acknowledgments: ACKNOWLEDMENTS.md

Requirements

  • Python >= 3.8

License

MIT License - see LICENSE file for details.

Author

Laurent-Philippe Albou

Contact

contact@abstractcore.ai
