
AbstractVision

Model-agnostic generative vision API (images, optional video) for Python and the Abstract* ecosystem.

What you get

  • A stable task API: VisionManager (src/abstractvision/vision_manager.py)
  • A packaged capability registry (“what models can do”): VisionModelCapabilitiesRegistry backed by src/abstractvision/assets/vision_model_capabilities.json
  • Optional artifact-ref outputs (small JSON refs): LocalAssetStore / store adapters (src/abstractvision/artifacts.py)
  • Built-in backends (src/abstractvision/backends/):
    • OpenAI-compatible HTTP (openai_compatible.py)
    • Local Diffusers (huggingface_diffusers.py)
    • Local stable-diffusion.cpp / GGUF (stable_diffusion_cpp.py)
  • CLI/REPL for manual testing: abstractvision ... (src/abstractvision/cli.py)

Status (current backend support)

  • Built-in backends implement: text_to_image and image_to_image.
  • Video (text_to_video, image_to_video) is supported only via the OpenAI-compatible backend when endpoints are configured.
  • multi_view_image is part of the public API (VisionManager.generate_angles) but no built-in backend implements it yet.

Details: docs/reference/backends.md.

Installation

pip install abstractvision

Install optional integrations:

pip install "abstractvision[abstractcore]"

Some newer model pipelines may require Diffusers from GitHub main (see docs/getting-started.md):

pip install -U "abstractvision[huggingface-dev]"

For local dev (from a repo checkout):

pip install -e .

Usage

Start here:

  • Getting started: docs/getting-started.md
  • FAQ: docs/faq.md
  • API reference: docs/api.md
  • Architecture: docs/architecture.md
  • Docs index: docs/README.md

Capability-driven model selection

from abstractvision import VisionModelCapabilitiesRegistry

reg = VisionModelCapabilitiesRegistry()
assert reg.supports("Qwen/Qwen-Image-2512", "text_to_image")

print(reg.list_tasks())
print(reg.models_for_task("text_to_image"))
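Conceptually, the registry is a lookup over a JSON capability map. A minimal stand-in (hypothetical model names; not the real file's schema, which lives in src/abstractvision/assets/vision_model_capabilities.json) behaves like:

```python
# Toy stand-in for the packaged capability map -- a conceptual sketch
# only; the real registry may use a richer schema.
CAPS = {
    "example/model-a": ["text_to_image", "image_to_image"],
    "example/model-b": ["text_to_image", "text_to_video"],
}

def supports(model_id, task):
    # Unknown models simply support nothing.
    return task in CAPS.get(model_id, ())

def models_for_task(task):
    return sorted(m for m, tasks in CAPS.items() if task in tasks)

print(supports("example/model-a", "text_to_image"))  # True
print(models_for_task("text_to_video"))              # ['example/model-b']
```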

Backend wiring + generation (artifact outputs)

The default install is “batteries included” (Torch + Diffusers + stable-diffusion.cpp python bindings), but heavy modules are imported lazily (see src/abstractvision/backends/__init__.py).

from abstractvision import LocalAssetStore, VisionManager, VisionModelCapabilitiesRegistry, is_artifact_ref
from abstractvision.backends import OpenAICompatibleBackendConfig, OpenAICompatibleVisionBackend

reg = VisionModelCapabilitiesRegistry()

backend = OpenAICompatibleVisionBackend(
    config=OpenAICompatibleBackendConfig(
        base_url="http://localhost:1234/v1",
        api_key="YOUR_KEY",      # optional for local servers
        model_id="REMOTE_MODEL", # optional (server-dependent)
    )
)

vm = VisionManager(
    backend=backend,
    store=LocalAssetStore(),         # enables artifact-ref outputs
    model_id="zai-org/GLM-Image",    # optional: capability gating
    registry=reg,                    # optional: reuse loaded registry
)

out = vm.generate_image("a cinematic photo of a red fox in snow")
assert is_artifact_ref(out)
print(out)  # {"$artifact": "...", "content_type": "...", ...}

png_bytes = vm.store.load_bytes(out["$artifact"])  # type: ignore[union-attr]
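An artifact ref is a small JSON-serializable dict rather than raw bytes. A minimal stand-in check, mirroring the `{"$artifact": ...}` shape shown above (illustrative only; use the real is_artifact_ref in practice), might look like:

```python
import json

def looks_like_artifact_ref(obj):
    # Illustrative only: the documented shape is a dict carrying an
    # "$artifact" id plus metadata such as "content_type".
    return isinstance(obj, dict) and "$artifact" in obj

ref = {"$artifact": "a1b2c3", "content_type": "image/png"}
print(looks_like_artifact_ref(ref))         # True
print(looks_like_artifact_ref(b"\x89PNG"))  # False (raw bytes, not a ref)
print(json.dumps(ref))                      # refs serialize cleanly
```

Because refs are plain JSON, they stay small enough to pass through tool-calling payloads while the heavy image bytes remain in the store.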

Interactive testing (CLI / REPL)

abstractvision models
abstractvision tasks
abstractvision show-model zai-org/GLM-Image

abstractvision repl

Inside the REPL:

/backend openai http://localhost:1234/v1
/cap-model zai-org/GLM-Image
/set width 1024
/set height 1024
/t2i "a watercolor painting of a lighthouse" --open

The CLI/REPL can also be configured via ABSTRACTVISION_* env vars; see docs/reference/configuration.md.

One-shot commands (OpenAI-compatible HTTP backend only):

abstractvision t2i --base-url http://localhost:1234/v1 "a studio photo of an espresso machine"
abstractvision i2i --base-url http://localhost:1234/v1 --image ./input.png "make it watercolor"

Local GGUF via stable-diffusion.cpp

If you want to run GGUF diffusion models locally (e.g. Qwen Image), use the stable-diffusion.cpp backend (sdcpp).

Recommended (pip-only; no external binary download): the default pip install abstractvision already includes the stable-diffusion.cpp Python bindings (stable-diffusion-cpp-python).

Alternative (external executable): run the backend in CLI mode against an sd-cli binary instead of the bundled Python bindings.

In the REPL:

/backend sdcpp /path/to/qwen-image-2512-Q4_K_M.gguf /path/to/qwen_image_vae.safetensors /path/to/Qwen2.5-VL-7B-Instruct-*.gguf
/t2i "a watercolor painting of a lighthouse" --sampling-method euler --offload-to-cpu --diffusion-fa --flow-shift 3 --open

Extra flags are forwarded via request.extra. In CLI mode they are passed through to sd-cli; in Python-bindings mode, keys are mapped to binding kwargs when supported, and unsupported keys are ignored.
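The "unsupported keys are ignored" behavior can be pictured as signature-based filtering. This is an illustrative sketch with hypothetical function names, not the backend's actual internals:

```python
import inspect

def map_extras_to_kwargs(extras, fn):
    # Keep only keys matching the callable's keyword parameters;
    # anything the binding doesn't understand is silently dropped.
    accepted = set(inspect.signature(fn).parameters)
    return {k: v for k, v in extras.items() if k in accepted}

def fake_txt2img(prompt, sample_method="euler", cfg_scale=7.0):
    # Stand-in for a python-binding entry point.
    pass

extras = {"sample_method": "euler", "flow_shift": 3}
print(map_extras_to_kwargs(extras, fake_txt2img))
# {'sample_method': 'euler'} -- 'flow_shift' is not a known kwarg here
```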

AbstractCore tool integration (artifact refs)

If you’re using AbstractCore tool calling, AbstractVision can expose vision tasks as tools:

from abstractvision.integrations.abstractcore import make_vision_tools

tools = make_vision_tools(vision_manager=vm, model_id="zai-org/GLM-Image")

Project

  • Release notes: CHANGELOG.md
  • Contributing: CONTRIBUTING.md
  • Security: SECURITY.md
  • Acknowledgments: ACKNOWLEDMENTS.md

Requirements

  • Python >= 3.8

License

MIT License - see LICENSE file for details.

Author

Laurent-Philippe Albou

Contact

contact@abstractcore.ai
