A permissive-license-aware framework for serving modern computer vision models locally and over Cloudflare Tunnel.

VisionServeX

Secure, beginner-friendly Python API serving for permissive computer vision models
Local inference · Cloudflare Tunnel · Stable JSON API · LLM-agent-friendly

Python 3.10+ ruff v0.4.0

Note on the CI badge: it turns green after the workflow runs successfully on GitHub for the first time. Until then the badge shows "no status" — this is expected for a new repository.

VisionServeX is a permissive-license-aware Python framework for running modern computer vision models locally, exposing them through a clean HTTP API, and optionally sharing them securely over Cloudflare Tunnel.

No CUDA expertise required. visionservex doctor tells you what your machine can run. GPU is preferred automatically when available and healthy; broken CUDA runtimes fall back to CPU with a clear warning.
One download command. visionservex pull rfdetr-nano — weights are cached and verified.
Stable contracts. Every prediction returns the same JSON envelope, whether from CLI, Python, or curl.
Honest. Registry entries say wired, partial, or stub. Stubs never silently fake results.
Secure defaults. Binds to 127.0.0.1, requires auth for public mode, and keeps SSRF and decompression-bomb guards on.

Quickstart (works on CPU, ~5 minutes)

pip install 'visionservex[server,hf,rfdetr]'

visionservex getting-started          # personalized guide for your machine

# RF-DETR detection — real, fast
visionservex pull rfdetr-nano
visionservex predict rfdetr-nano examples/images/street.jpg --save outputs/out.jpg

# Grounding DINO — text-prompted detection
visionservex pull grounding-dino-tiny
visionservex predict grounding-dino-tiny examples/images/street.jpg \
    --prompt "car,person" --save outputs/gd.jpg

# D-FINE — detection via HF Transformers
visionservex pull dfine-s
visionservex predict dfine-s examples/images/street.jpg --save outputs/dfine.jpg

# Start the API
visionservex serve
curl -F "image=@examples/images/street.jpg" \
     -F "model_id=rfdetr-nano" \
     http://127.0.0.1:8080/detect | jq
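
The same request can be built from Python with only the standard library. This is a minimal sketch, not part of VisionServeX: `build_detect_request` is a hypothetical helper, while the form fields `image` and `model_id` and the URL are taken from the curl call above. It builds the request without sending it.

```python
import uuid
import urllib.request


def build_detect_request(image_bytes, filename, model_id,
                         base_url="http://127.0.0.1:8080"):
    """Build (but do not send) a multipart POST to /detect.

    Hypothetical helper for illustration; it mirrors the curl form fields.
    """
    boundary = uuid.uuid4().hex

    # Text field: model_id
    model_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model_id"\r\n\r\n'
        f"{model_id}\r\n"
    ).encode()

    # File field: image (header only; the raw bytes follow)
    image_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="image"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()

    closing = f"\r\n--{boundary}--\r\n".encode()
    body = model_part + image_header + image_bytes + closing

    return urllib.request.Request(
        f"{base_url}/detect",
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

With `visionservex serve` running, passing the returned object to `urllib.request.urlopen` sends the request; the response body is the JSON envelope described under HTTP API.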

Recommendation engine:

visionservex recommend --task detect --simple

What works today

Family	Model IDs	Task	Status	Install
Mock (built-in)	`mock-*`	All tasks	stable	base
RF-DETR	`rfdetr-nano/small/base/medium/large`	detect	beta	`[rfdetr]`
RF-DETR-Seg	`rfdetr-seg-nano/small/medium`	segment	beta	`[rfdetr]`
D-FINE	`dfine-n/s/m/l/x`	detect	beta	`[hf]`
Grounding DINO	`grounding-dino-tiny/swin-t/swin-b`	open-vocab detect	beta	`[hf]`
SwinV2	`swinv2-tiny/small/base/large`	classify	beta	`[hf]`
SAM v1	`sam-vit-base/large/huge`	foundation segment	beta	`[hf]`
SAM 2	`sam2-hiera-tiny/small/base-plus/large`	foundation segment	beta	`[hf]`
Grounded SAM	`grounded-sam`	grounded segment	beta	`[hf]`
OneFormer	`oneformer-swin-large/dinat-large/convnext-large`	segment (semantic/instance/panoptic)	beta	`[hf]`

Not yet wired

Family	Why	Alternative
RTMPose	Requires OpenMMLab toolchain	`mock-pose` for schema
RTMDet-R/R2 (OBB)	Requires OpenMMLab + mmrotate	`mock-obb` for schema
Co-DINO-Inst	Requires heavy OpenMMLab	`rfdetr-seg-*` for instance seg
InternImage	Custom CUDA ops, build required	`swinv2-*` for classification
SEEM	Expert manual install	`oneformer-swin-large`
Grounded-SAM-2	Needs upstream `sam2` package	`grounded-sam` (works today)
ONNX export	CLI exists; engine-quality varies	Use HF model repos for ONNX
TensorRT	Future roadmap	—

We make no benchmark claims. Pick by task, license, and hardware. See docs/model_zoo.md.

Which model to start with?

I want	Start with	CPU?
Fast detection	`rfdetr-nano`	yes
More accurate detection	`dfine-s`	yes (slower)
Text-prompted detection	`grounding-dino-tiny`	yes (slower)
Instance segmentation	`rfdetr-seg-nano`	yes
SAM-style masking	`sam-vit-base` or `sam2-hiera-tiny`	yes (slow)
Text + mask together	`grounded-sam`	yes (slow)
Image classification	`swinv2-tiny`	yes
Semantic scene parsing	`oneformer-swin-large`	yes (slow)
Just testing/CI	`mock-detect`	yes (instant)
I have no GPU	Any `-nano` or `-tiny` model	yes
I have NVIDIA GPU	Run `visionservex doctor` first — GPU is used automatically when available	—

Installation

pip install visionservex                        # base: CLI, registry, mock
pip install 'visionservex[server]'              # + FastAPI HTTP server
pip install 'visionservex[hf]'                  # + D-FINE, GD, SwinV2, SAM, SAM2, OneFormer
pip install 'visionservex[rfdetr]'              # + RF-DETR and RF-DETR-Seg
pip install 'visionservex[server,hf,rfdetr]'    # full recommended install

For the OpenMMLab toolchain required by the not-yet-wired families (RTMPose, RTMDet-R, Co-DINO-Inst):

pip install openmim
mim install mmengine mmcv mmpose mmdet mmrotate

See docs/installation.md for platform-specific notes.

Python API

from visionservex import VisionModel

# Object detection
m = VisionModel("rfdetr-nano")
result = m.predict("image.jpg")
for det in result.detections:
    print(det.label, f"{det.score:.2f}", det.box.to_xyxy())
result.save("annotated.jpg")

# D-FINE detection (HF Transformers)
m = VisionModel("dfine-s")
result = m.predict("image.jpg")

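# SAM 2 (point prompt)
m = VisionModel("sam2-hiera-tiny")
x, y = 320, 240                      # example pixel coordinates
result = m.predict("image.jpg", points=[[x, y]], point_labels=[1])

# SAM 2 (box prompt)
x1, y1, x2, y2 = 100, 80, 400, 360   # example box corners in pixels
result = m.predict("image.jpg", boxes=[[x1, y1, x2, y2]])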

# OneFormer (choose task)
m = VisionModel("oneformer-swin-large")
result = m.predict("image.jpg", task="semantic")       # or "instance", "panoptic"

# Grounding DINO
m = VisionModel("grounding-dino-tiny")
result = m.predict("image.jpg", prompts=["red car", "person walking"])

# Auto-pull on first use
m = VisionModel("dfine-s", auto_pull=True)
result = m.predict("image.jpg")

Stable result fields: kind, model_id, task, device, precision, backend, latency_ms, model_loaded_from, fallback_reason, warnings.

HTTP API

Stable response envelope:

{
  "request_id": "...",
  "status": "completed",
  "model_id": "dfine-s",
  "task": "detect",
  "backend": "huggingface_dfine",
  "device": "cpu",
  "precision": "fp32",
  "latency_ms": 187.4,
  "results": [{"box": {...}, "score": 0.72, "label": "person", "class_id": 0}],
  "warnings": [],
  "metadata": {}
}
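
Because the envelope is stable, a client can consume it with plain `json`. The sketch below assumes only the fields shown above; `confident_labels` and the sample payload (including the second detection) are made up for illustration, and the `box` field is left out because its layout is not shown here.

```python
import json

# Illustrative payload shaped like the envelope above (values are invented).
SAMPLE = """{
  "request_id": "demo",
  "status": "completed",
  "model_id": "dfine-s",
  "task": "detect",
  "results": [
    {"score": 0.72, "label": "person", "class_id": 0},
    {"score": 0.31, "label": "car", "class_id": 2}
  ],
  "warnings": []
}"""


def confident_labels(envelope_text, min_score=0.5):
    """Return (label, score) pairs at or above min_score.

    Hypothetical helper; returns an empty list unless status is "completed".
    """
    envelope = json.loads(envelope_text)
    if envelope.get("status") != "completed":
        return []
    return [
        (r["label"], r["score"])
        for r in envelope["results"]
        if r["score"] >= min_score
    ]
```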

Error envelope:

{
  "request_id": "...",
  "error": {
    "code": "MODEL_MISSING",
    "message": "Model weights for 'dfine-s' are not cached.",
    "hint": "Run: visionservex pull dfine-s",
    "details": {}
  }
}
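
One way a client might handle both envelopes is to raise on `error` and return `results` otherwise. This is a sketch under that assumption; `ApiError` and `unwrap` are hypothetical names, not part of the VisionServeX package.

```python
import json


class ApiError(Exception):
    """Hypothetical client-side exception carrying the error envelope fields."""

    def __init__(self, code, message, hint=""):
        text = f"{code}: {message}"
        if hint:
            text += f" ({hint})"
        super().__init__(text)
        self.code = code
        self.hint = hint


def unwrap(envelope_text):
    """Return the results list, or raise ApiError for an error envelope."""
    envelope = json.loads(envelope_text)
    error = envelope.get("error")
    if error is not None:
        raise ApiError(error["code"], error["message"], error.get("hint", ""))
    return envelope.get("results", [])
```

Keeping `code` and `hint` as attributes lets calling code branch on `MODEL_MISSING` and surface the suggested `visionservex pull` command to the user.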

Key endpoints: GET /health, GET /devices, GET /models, POST /detect, POST /segment, POST /classify, POST /open-vocab/detect, POST /grounded-segment, GET /jobs/{id}, GET /metrics. Full reference: docs/api_reference.md.

Security defaults

Setting	Default
Server bind	`127.0.0.1` only
Public mode	disabled (explicit opt-in)
Authentication	disabled — enable before exposing
Remote URL inputs	disabled (SSRF protection)
CORS	disabled
Upload limit	20 MiB
Image pixel limit	~33 MP (decompression-bomb guard)
Rate limit	120 req/min per IP
Token redaction	enabled in all logs
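
The two size limits in the table guard against different things: the upload limit caps bytes on the wire, while the pixel limit catches a small compressed file that decodes into a huge image. The sketch below shows the arithmetic only. It is not VisionServeX's implementation; `check_upload` is a hypothetical helper and `33_000_000` is an assumed value for "~33 MP".

```python
MAX_UPLOAD_BYTES = 20 * 1024 * 1024  # 20 MiB, from the table above
MAX_PIXELS = 33_000_000              # assumed exact value for "~33 MP"


def check_upload(num_bytes, width, height):
    """Return a list of reasons to reject; an empty list means accept.

    width and height should come from the image header, read before the
    full image is decoded into memory.
    """
    problems = []
    if num_bytes > MAX_UPLOAD_BYTES:
        problems.append("upload too large")
    if width * height > MAX_PIXELS:
        problems.append("too many pixels")
    return problems
```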

See docs/security.md and SECURITY.md.
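
A per-IP limit such as 120 requests per minute is commonly enforced with a sliding window. The following is a generic sketch of that technique, not the limiter VisionServeX ships; the class name and the injectable `clock` parameter are assumptions made for illustration and testing.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window_s` seconds for each key."""

    def __init__(self, limit=120, window_s=60.0, clock=time.monotonic):
        self.limit = limit
        self.window_s = window_s
        self.clock = clock
        self._hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, ip):
        now = self.clock()
        hits = self._hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

Rejected requests are not recorded, so a client that keeps retrying while blocked does not extend its own lockout.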

Safe Cloudflare Tunnel

export VISIONSERVEX_AUTH__ENABLED=true
export VISIONSERVEX_AUTH__API_KEY=$(python -c "import secrets; print(secrets.token_urlsafe(48))")

visionservex tunnel doctor
visionservex tunnel create visionservex
visionservex tunnel route visionservex api.yourdomain.com
visionservex tunnel config api.yourdomain.com --out tunnel.yaml
visionservex serve &
visionservex tunnel run tunnel.yaml --i-understand-this-is-public

The CLI refuses to run the tunnel unless auth is enabled and the explicit confirmation flag is passed. The generated config always ends with a catch-all http_status:404 rule. See docs/cloudflare_tunnel.md.
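
The API key in the commands above comes from `secrets.token_urlsafe(48)`. The sketch below shows the same call from Python, plus a constant-time comparison with `hmac.compare_digest`. How VisionServeX itself checks keys is not stated here, so `new_api_key` and `key_matches` are hypothetical helpers that only illustrate the general practice.

```python
import hmac
import secrets


def new_api_key():
    # Same call as the shell one-liner: 48 random bytes, URL-safe base64.
    return secrets.token_urlsafe(48)


def key_matches(presented, expected):
    # compare_digest takes time independent of where the inputs differ,
    # which avoids leaking how much of a guessed key was correct.
    return hmac.compare_digest(presented.encode(), expected.encode())
```

Forty-eight random bytes encode to a 64-character key, which is long enough that guessing is not a practical attack.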

Documentation


Guide	Contents
Beginner quickstart	First prediction in 5 min
Device check	GPU/CPU/MPS diagnostics
Model zoo	All models, license table, "which model?"
Model downloads	Download system, auto-pull
Model licenses	Per-model license details
Cloudflare Tunnel	Safe public exposure
Security	Threat model, all protections
HTTP API reference	Endpoints, error codes
Python API	VisionModel, result types
CLI reference	Every command
Troubleshooting	Common errors
LLM agent guide	Stable CLI/JSON for agents
About	Author, citation, acknowledgment

License and upstream models

VisionServeX is Apache-2.0 (SPDX-License-Identifier: Apache-2.0). See LICENSE and NOTICE.

Each integrated model retains its own upstream license. Review the model, checkpoint, and training-data licenses before commercial use. VisionServeX does not provide legal advice. See docs/model_licenses.md.

Citation

@software{sajjadi2026visionservex,
  author = {Arash Sajjadi},
  title  = {{VisionServeX: A permissive-license-aware framework for
             local computer vision model serving}},
  year   = {2026},
  url    = {https://github.com/arashsajjadi/VisionServeX},
  note   = {Developed under the supervision of Prof. Mark Eramian,
            Department of Computer Science, University of Saskatchewan.}
}

Author: Arash Sajjadi — PhD Candidate, Department of Computer Science, University of Saskatchewan
Supervision: Prof. Mark Eramian, Computer Vision Lab, University of Saskatchewan
(This project is not an official product of the University of Saskatchewan.)
