A permissive-license-aware framework for serving modern computer vision models locally and over Cloudflare Tunnel.
VisionServeX
Secure, beginner-friendly Python API serving for permissive computer vision models.
VisionServeX is a permissive-license-aware Python framework for serving modern computer vision models locally and over Cloudflare Tunnel. It is designed for complete beginners: no CUDA expertise, HuggingFace Hub internals, or Cloudflare DNS knowledge is required.
Author: Arash Sajjadi · PhD Candidate, Department of Computer Science,
University of Saskatchewan
Supervision: Developed under the supervision of Prof. Mark Eramian,
Department of Computer Science, University of Saskatchewan, Computer Vision Lab.
This project is not an official product of the University of Saskatchewan.
Beginner quickstart (< 5 minutes)
pip install 'visionservex[server,hf]'
# 1. Diagnose your system
visionservex doctor
# 2. Get personalized recommendations
visionservex recommend --task detect --simple
# 3. Download a model
visionservex pull grounding-dino-tiny # real, wired, text-prompted
# or
visionservex pull rfdetr-nano # real, wired, COCO detection (needs `visionservex[rfdetr]`)
# or
visionservex pull mock-detect # always works, no deps
# 4. Predict
visionservex predict grounding-dino-tiny examples/images/street.jpg \
--prompt "car,person" --save outputs/out.jpg
# 5. Start the API
visionservex serve
In another terminal:
curl -F "image=@examples/images/street.jpg" \
-F "model_id=grounding-dino-tiny" \
-F "prompts=car,person" \
http://127.0.0.1:8080/predict | jq
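The same request can be sent from Python without extra dependencies. The sketch below mirrors the curl call above using only the standard library; the form field names (`image`, `model_id`, `prompts`) and the URL come from that example, while the helper names `build_predict_request` and `predict` are hypothetical and not part of VisionServeX.

```python
import json
import os
import urllib.request
import uuid


def build_predict_request(image_bytes, filename, model_id, prompts,
                          url="http://127.0.0.1:8080/predict"):
    """Build a multipart/form-data POST that mirrors the curl example."""
    boundary = uuid.uuid4().hex
    parts = []
    # Plain text fields, named like the -F flags in the curl call.
    for name, value in (("model_id", model_id), ("prompts", prompts)):
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    # The image file part.
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="image"; filename="{filename}"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'.encode("utf-8")
        + image_bytes + b"\r\n"
    )
    # Closing boundary.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return urllib.request.Request(url, data=b"".join(parts),
                                  headers=headers, method="POST")


def predict(image_path, model_id, prompts):
    """Send the request; `visionservex serve` must already be running."""
    with open(image_path, "rb") as fh:
        req = build_predict_request(fh.read(), os.path.basename(image_path),
                                    model_id, prompts)
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)
```

With the server running, `predict("examples/images/street.jpg", "grounding-dino-tiny", "car,person")` should return the decoded JSON response as a dict.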
"I have…" quickstart paths
| I have… | Recommended model | Install command |
|---|---|---|
| No GPU | grounding-dino-tiny (CPU) | `pip install 'visionservex[hf]'` |
| NVIDIA GPU | rfdetr-nano or GD-tiny | `pip install 'visionservex[rfdetr,hf]'` |
| Just Python | mock-detect | `pip install visionservex` (no extras) |
| Need detection | rfdetr-nano, rfdetr-small | `pip install 'visionservex[rfdetr]'` |
| Need segmentation | rfdetr-seg-nano | `pip install 'visionservex[rfdetr]'` |
| Need text-prompt detection | grounding-dino-tiny | `pip install 'visionservex[hf]'` |
| Need text-prompt masks | grounded-sam | `pip install 'visionservex[hf]'` |
| Need classification | swinv2-tiny | `pip install 'visionservex[hf]'` |
| Need SAM segmentation | sam-vit-base | `pip install 'visionservex[hf]'` |
| Need an API server | any model | `pip install 'visionservex[server]'` |
| Need Cloudflare Tunnel | any model + tunnel docs | `pip install 'visionservex[server]'` |
Real model backends (Pass 3)
| Family | Model IDs | Task | Status | Backend | CPU works? |
|---|---|---|---|---|---|
| Mock (built-in) | mock-* | All tasks | stable | built-in | yes |
| Grounding DINO | grounding-dino-tiny / swin-t / swin-b | text-prompt detect | beta | HF Transformers | yes |
| RF-DETR detection | rfdetr-nano / small / base / medium / large | detect | beta | rfdetr package | yes |
| RF-DETR segmentation | rfdetr-seg-nano / small / medium | segment | beta | rfdetr package | yes |
| SwinV2 classification | swinv2-tiny / small / base / large | classify | beta | HF Transformers | yes |
| SAM v1 | sam-vit-base / large / huge | foundation_seg | beta | HF Transformers | yes |
| Grounded SAM (composed) | grounded-sam | grounded_seg | beta | HF (GD + SAM) | yes |
Models that are still stubs (registry-only, no real inference yet):
D-FINE, SAM 2 / 2.1 (needs sam2 pip package), RTMPose, RTMDet-R/R2, InternImage, Co-DINO, SEEM, OneFormer, Grounded-SAM-2.
Python API
from visionservex import VisionModel
# Detection with RF-DETR (real, wired)
m = VisionModel("rfdetr-nano")
result = m.predict("examples/images/street.jpg")
print(result.summary())
for det in result.detections:
    print(det.label, det.score, det.box.to_xyxy())
# Text-prompted detection with Grounding DINO (real, wired)
m = VisionModel("grounding-dino-tiny")
result = m.predict("image.jpg", prompts=["red car", "person walking"])
# Classification with SwinV2 (real, wired)
m = VisionModel("swinv2-tiny")
result = m.predict("dog.jpg")
for label, score in result.top_k[:5]:
    print(label, score)
# Foundation segmentation with SAM v1 (real, wired)
m = VisionModel("sam-vit-base")
result = m.predict("image.jpg", points=[[100, 150]], point_labels=[1])
result.save("mask.jpg")
# Grounded segmentation (composed GD + SAM, real, wired)
m = VisionModel("grounded-sam")
result = m.predict("image.jpg", prompts=["dog", "leash"])
result.save("grounded.jpg")
Auto-pull (download weights on first use):
m = VisionModel("rfdetr-small", auto_pull=True)
result = m.predict("image.jpg")
Stable API response
{
"request_id": "abc123",
"status": "completed",
"model_id": "rfdetr-nano",
"task": "detect",
"backend": "rfdetr_package",
"device": "cpu",
"precision": "fp32",
"latency_ms": 61.4,
"model_loaded_from": "cache",
"results": [
{"box": {"x1": 59.0, "y1": 300.0, "x2": 201.5, "y2": 383.0},
"score": 0.723, "label": "fire hydrant", "class_id": 10}
],
"warnings": [],
"metadata": {}
}
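A client will usually want typed objects rather than raw JSON. The following is a minimal sketch of parsing the detection response shown above; the `Detection` dataclass and `parse_detections` helper are hypothetical names, and the code assumes that `"completed"` is the success status and that other tasks may use a different item shape in `results`.

```python
import json
from dataclasses import dataclass


@dataclass
class Detection:
    label: str
    score: float
    class_id: int
    xyxy: tuple  # (x1, y1, x2, y2) in pixels


def parse_detections(payload, min_score=0.0):
    """Turn a detect-task response dict into a list of Detection objects."""
    # Assumption: any status other than "completed" means no usable results.
    if payload.get("status") != "completed":
        raise ValueError(f"request not completed: {payload.get('status')!r}")
    detections = []
    for item in payload.get("results", []):
        box = item["box"]
        det = Detection(
            label=item["label"],
            score=float(item["score"]),
            class_id=int(item["class_id"]),
            xyxy=(box["x1"], box["y1"], box["x2"], box["y2"]),
        )
        if det.score >= min_score:
            detections.append(det)
    return detections


# The example response from this section, trimmed to the fields used here.
SAMPLE = json.loads("""
{
  "status": "completed",
  "model_id": "rfdetr-nano",
  "task": "detect",
  "results": [
    {"box": {"x1": 59.0, "y1": 300.0, "x2": 201.5, "y2": 383.0},
     "score": 0.723, "label": "fire hydrant", "class_id": 10}
  ],
  "warnings": []
}
""")

for det in parse_detections(SAMPLE, min_score=0.5):
    print(det.label, det.score, det.xyxy)
```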
CLI reference
# Diagnostics
visionservex getting-started # beginner guide with exact next commands
visionservex doctor # full system + device + dependency report
visionservex doctor --fix-suggestions # actionable fix commands
visionservex status # quick package/cache/device status
visionservex devices
# Models
visionservex list-models --friendly # human-readable table
visionservex list-models --easy --can-run # only easy auto-downloadable models
visionservex recommend --task detect --simple
visionservex info grounding-dino-tiny
# Downloads
visionservex pull rfdetr-nano
visionservex pull-easy # all beginner auto-downloadable
visionservex pull-all --task detect --yes-i-understand-large-downloads
# Inference
visionservex predict rfdetr-nano examples/images/street.jpg --save outputs/out.jpg
visionservex predict grounding-dino-tiny examples/images/street.jpg \
--prompt "car,person" --save outputs/gd.jpg
visionservex benchmark mock-detect examples/images/simple_shapes.jpg --n 20
# Server
visionservex serve --host 127.0.0.1 --port 8080
# Cloudflare Tunnel
visionservex tunnel doctor
visionservex tunnel config api.example.com --out tunnel.yaml
visionservex tunnel run tunnel.yaml --i-understand-this-is-public
Auto-pull during server requests
export VISIONSERVEX_MODELS__AUTO_PULL=true
export VISIONSERVEX_MODELS__AUTO_PULL_POLICY=easy_only
visionservex serve
Client-side: pass ?wait_for_download=false to get a job id when the model is missing.
Track progress at GET /jobs/{id}.
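A polling loop for the job endpoint might look like the sketch below. Only the `GET /jobs/{id}` path and the default server address come from this README; the response field `status` and the terminal values `"completed"` and `"failed"` are assumptions, so check `docs/api_reference.md` for the real schema. The names `wait_for_job` and `http_fetch_job` are hypothetical.

```python
import json
import time
import urllib.request


def http_fetch_job(job_id, base_url="http://127.0.0.1:8080"):
    """Fetch one job record from GET /jobs/{id} and decode it as JSON."""
    with urllib.request.urlopen(f"{base_url}/jobs/{job_id}", timeout=30) as resp:
        return json.load(resp)


def wait_for_job(job_id, fetch_job=http_fetch_job, interval_s=2.0,
                 timeout_s=600.0, done_states=("completed", "failed"),
                 sleep=time.sleep, clock=time.monotonic):
    """Poll until the job reaches a terminal state or the timeout expires.

    The `status` key and `done_states` values are assumed, not documented here.
    """
    deadline = clock() + timeout_s
    while True:
        job = fetch_job(job_id)
        if job.get("status") in done_states:
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish in {timeout_s} s")
        sleep(interval_s)
```

Passing `fetch_job`, `sleep`, and `clock` as parameters keeps the loop testable without a running server.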
Security defaults
| Default | Value |
|---|---|
| Server bind address | 127.0.0.1 |
| Public mode | disabled |
| API key authentication | disabled |
| Remote URL image input | disabled |
| Local file path input | disabled |
| CORS | disabled |
| Max upload size | 20 MiB |
| Max image pixels | ~33 MP |
| Rate limit | 120 / minute |
| Decompression-bomb guard | enabled |
| SSRF protection | enabled |
| Path traversal protection | enabled |
| Secret redaction in logs | enabled |
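Clients can check an image against these defaults before uploading, which avoids a round trip that the server would reject. This is a sketch only: the 20 MiB limit is taken from the table, the pixel limit is approximated as 33,000,000 because the table gives "~33 MP" without an exact figure, and `preflight` is a hypothetical helper name.

```python
MAX_UPLOAD_BYTES = 20 * 1024 * 1024  # 20 MiB, from the defaults table
MAX_PIXELS = 33_000_000              # assumption: "~33 MP" rounded; exact limit unknown


def preflight(num_bytes, width, height):
    """Return a list of problems; an empty list means the upload looks acceptable."""
    problems = []
    if num_bytes > MAX_UPLOAD_BYTES:
        problems.append(
            f"file is {num_bytes} bytes, above the {MAX_UPLOAD_BYTES}-byte upload limit"
        )
    if width * height > MAX_PIXELS:
        problems.append(
            f"image has {width * height} pixels, above the ~{MAX_PIXELS} pixel limit"
        )
    return problems


print(preflight(num_bytes=2_500_000, width=1920, height=1080))
```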
Secure Cloudflare Tunnel
export VISIONSERVEX_AUTH__ENABLED=true
export VISIONSERVEX_AUTH__API_KEY=$(python -c "import secrets;print(secrets.token_urlsafe(48))")
visionservex tunnel doctor
visionservex tunnel create visionservex
visionservex tunnel route visionservex api.example.com
visionservex tunnel config api.example.com --out tunnel.yaml
visionservex serve &
visionservex tunnel run tunnel.yaml --i-understand-this-is-public
- The CLI refuses to start the tunnel unless auth is enabled AND the `--i-understand-this-is-public` flag is passed.
- The generated `tunnel.yaml` always ends with a catch-all `http_status:404`.
- We recommend adding a Cloudflare Access policy (service tokens for automation, mTLS for high-value clients). See `docs/cloudflare_tunnel.md`.
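To make the catch-all rule concrete, the sketch below renders a config in the standard cloudflared ingress layout and checks that its last rule is `http_status:404`. It is not the output of `visionservex tunnel config`, which may contain more fields (for example a credentials file); `render_tunnel_config` and `ends_with_catch_all` are hypothetical helpers, and the origin address assumes the default `127.0.0.1:8080` bind.

```python
def render_tunnel_config(hostname, tunnel_name="visionservex",
                         origin="http://127.0.0.1:8080"):
    """Render a minimal cloudflared-style config with a final 404 catch-all."""
    lines = [
        f"tunnel: {tunnel_name}",
        "ingress:",
        # Requests for the public hostname go to the local API server.
        f"  - hostname: {hostname}",
        f"    service: {origin}",
        # Anything else is rejected instead of reaching the origin.
        "  - service: http_status:404",
    ]
    return "\n".join(lines) + "\n"


def ends_with_catch_all(config_text):
    """True if the last non-empty line is the http_status:404 rule."""
    rows = [row.strip() for row in config_text.splitlines() if row.strip()]
    return bool(rows) and rows[-1] == "- service: http_status:404"


config = render_tunnel_config("api.example.com")
print(config)
print(ends_with_catch_all(config))
```

The catch-all matters because cloudflared evaluates ingress rules in order; without a final rule, a request for an unexpected hostname has no defined destination.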
Documentation
| Document | Topic |
|---|---|
| `docs/beginner_quickstart.md` | 5-minute walkthrough |
| `docs/installation.md` | All install options |
| `docs/device_check.md` | GPU/CPU/MPS/ROCm guide |
| `docs/model_downloads.md` | Download system, auto-pull |
| `docs/model_zoo.md` | Model table + "Which model?" |
| `docs/model_licenses.md` | Per-model license details |
| `docs/cloudflare_tunnel.md` | Secure public exposure |
| `docs/security.md` | Security model, threat model |
| `docs/api_reference.md` | Full HTTP API spec |
| `docs/python_api.md` | Python API reference |
| `docs/cli.md` | CLI reference |
| `docs/troubleshooting.md` | Common errors + fixes |
| `docs/llm_agent_guide.md` | For LLM agents / automation |
| `docs/about.md` | Author, citation, acknowledgment |
Honest status
We do not claim benchmark superiority for any model. Each registry entry
carries implementation_status:
- `wired`: real inference runs when the optional extra is installed.
- `partial`: code path exists but has rough edges.
- `stub`: registry entry only; no real inference in this build.
Models with uncertain licensing are flagged license_uncertain=true and
disabled/labelled accordingly.
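As an illustration of how these two flags combine, the sketch below filters a list of registry-like entries down to models that are safe to run. The field names `implementation_status` and `license_uncertain` come from this section; the dict layout, the `id` key, the example IDs, and the `runnable_models` helper are hypothetical and do not reflect the real registry API.

```python
def runnable_models(entries, allow_partial=False):
    """Return IDs of entries with real inference and no licensing uncertainty."""
    accepted = {"wired", "partial"} if allow_partial else {"wired"}
    return [
        entry["id"]
        for entry in entries
        if entry.get("implementation_status") in accepted
        and not entry.get("license_uncertain", False)
    ]


# Hypothetical entries, for illustration only.
EXAMPLE_ENTRIES = [
    {"id": "example-wired", "implementation_status": "wired", "license_uncertain": False},
    {"id": "example-partial", "implementation_status": "partial", "license_uncertain": False},
    {"id": "example-stub", "implementation_status": "stub", "license_uncertain": False},
    {"id": "example-uncertain", "implementation_status": "wired", "license_uncertain": True},
]

print(runnable_models(EXAMPLE_ENTRIES))
print(runnable_models(EXAMPLE_ENTRIES, allow_partial=True))
```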
License
Apache-2.0 (SPDX-License-Identifier: Apache-2.0). See LICENSE,
NOTICE, and docs/model_licenses.md.
Citation
@software{sajjadi2026visionservex,
author = {Arash Sajjadi},
title = {{VisionServeX: A permissive-license-aware framework for local computer vision model serving}},
year = {2026},
url = {https://github.com/example/visionservex},
note = {Developed under the supervision of Prof. Mark Eramian,
Department of Computer Science, University of Saskatchewan.}
}
See also CITATION.cff.