
YOLOZU (萬)


Pronunciation: Yaoyorozu (yorozu). Official ASCII name: YOLOZU.

YOLOZU is an Apache-2.0-only, contract-first evaluation + tooling harness for:

  • real-time monocular RGB detection
  • monocular depth + 6DoF pose heads (RT-DETR-based scaffold)
  • semantic segmentation utilities (dataset prep + mIoU evaluation)
  • instance segmentation utilities (PNG-mask contract + mask mAP evaluation)

Recommended deployment path (canonical): PyTorch → ONNX → TensorRT (TRT).

It focuses on:

  • CPU-minimum dev/tests (GPU optional)
  • A stable predictions-JSON contract for evaluation (bring-your-own inference backend)
  • Minimal training scaffold (RT-DETR pose) with reproducible artifacts
  • Hessian-based refinement for regression head predictions (depth, rotation, offsets)

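The Hessian-based refinement above is only named, not specified. As a toy sketch of the idea (all names hypothetical; a finite-difference Jacobian stands in for the real per-head derivatives), a damped Gauss-Newton loop over a regression residual looks like:

```python
import numpy as np

def gauss_newton_refine(residual_fn, x0, iters=10, damping=1e-6):
    """Refine parameters x by damped Gauss-Newton on a residual vector.

    residual_fn maps a parameter vector to a residual vector. The
    Jacobian is approximated by finite differences for simplicity;
    damping (Levenberg-style) stabilizes the normal equations.
    """
    x = np.asarray(x0, dtype=float)
    eps = 1e-6
    for _ in range(iters):
        r = residual_fn(x)
        # Finite-difference Jacobian: column j is d r / d x_j.
        J = np.stack([
            (residual_fn(x + eps * np.eye(len(x))[j]) - r) / eps
            for j in range(len(x))
        ], axis=1)
        H = J.T @ J + damping * np.eye(len(x))  # Gauss-Newton Hessian approx.
        x = x - np.linalg.solve(H, J.T @ r)
    return x

# Toy use: recover a (depth, offset) pair from a linear residual.
target = np.array([2.0, -1.0])
refined = gauss_newton_refine(lambda x: x - target, np.zeros(2))
```

The real refinement operates per detection on the model's depth/rotation/offset outputs; this sketch only illustrates the optimizer shape.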
Quickstart (pip users)

python3 -m pip install yolozu
yolozu doctor --output -
yolozu predict-images --backend dummy --input-dir /path/to/images
yolozu demo instance-seg

Optional extras:

python3 -m pip install 'yolozu[demo]'    # torch demos (CPU OK)
python3 -m pip install 'yolozu[onnxrt]'  # ONNXRuntime CPU exporter
python3 -m pip install 'yolozu[coco]'    # pycocotools COCOeval
python3 -m pip install 'yolozu[full]'

Docs index (start here): docs/README.md

Why YOLOZU (what’s “sellable”)

  • Bring-your-own inference + contract-first evaluation: run inference in PyTorch / ONNXRuntime / TensorRT / C++ / Rust → export the same predictions.json → compare apples-to-apples.
  • Safe TTT (test-time training): presets + guard rails + reset policies (see docs/ttt_protocol.md).
  • Apache-2.0-only ops: license policy + checks to keep the toolchain clean (see docs/license_policy.md).
  • Unified CLI: yolozu (pip) + python3 tools/yolozu.py (repo) wrap backends with consistent args and caching (--cache), and always write run metadata (git SHA / env / GPU / config hash).
  • Parity + benchmarks: backend diff stats (torch vs onnxrt vs trt) and fixed-protocol latency/FPS reports.
  • AI-friendly repo surface: stable schemas + tools/manifest.json for tool discovery / automation.

Feature highlights (what you can do)

  • Dataset I/O: YOLO-format images/labels + optional per-image JSON metadata.
  • Stable evaluation contract: versioned predictions-JSON schema + adapter contract.
  • Unified CLI:
    • pip: yolozu (install-safe commands + CPU demos)
    • repo: python3 tools/yolozu.py (power-user research/eval workflows)
  • Inference/export: tools/export_predictions.py (torch adapter), tools/export_predictions_onnxrt.py, tools/export_predictions_trt.py.
  • Test-time adaptation options:
    • TTA: lightweight prediction-space post-transform (--tta).
    • TTT: pre-prediction test-time training (Tent or MIM) via --ttt (adapter + torch required).
  • Hessian solver: per-detection iterative refinement of regression outputs (depth, rotation, offsets) using Gauss-Newton optimization.
  • Evaluation: COCO mAP conversion/eval and scenario suite reporting.
  • Keypoints: YOLO pose-style keypoints in labels/predictions + PCK evaluation + optional COCO OKS mAP (tools/eval_keypoints.py --oks), plus parity/benchmark helpers.
  • Semantic seg: dataset prep helpers + tools/eval_segmentation.py (mIoU/per-class IoU/ignore_index + optional HTML overlays).
  • Instance seg: tools/eval_instance_segmentation.py (mask mAP from per-instance binary PNG masks + optional HTML overlays).
  • Training scaffold: minimal RT-DETR pose trainer with metrics output, ONNX export, and optional SDFT-style self-distillation.
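As a reference for what the semantic-seg evaluator reports, a minimal mIoU over integer label maps (independent of tools/eval_segmentation.py, but honoring the same ignore_index idea) can be sketched as:

```python
import numpy as np

def miou(pred, gt, num_classes, ignore_index=255):
    """Mean IoU over classes from integer label maps."""
    valid = gt != ignore_index          # drop ignored pixels entirely
    pred, gt = pred[valid], gt[valid]
    ious = []
    for c in range(num_classes):
        inter = np.sum((pred == c) & (gt == c))
        union = np.sum((pred == c) | (gt == c))
        if union:                       # skip classes absent from both
            ious.append(inter / union)
    return float(np.mean(ious)) if ious else 0.0

gt = np.array([[0, 0, 1], [1, 255, 1]])
pred = np.array([[0, 1, 1], [1, 1, 0]])
score = miou(pred, gt, num_classes=2)   # 5/12
```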

Instance segmentation (PNG masks)

YOLOZU evaluates instance segmentation using per-instance binary PNG masks (no RLE/polygons required).

Predictions JSON (minimal):

[
  {
    "image": "000001.png",
    "instances": [
      { "class_id": 0, "score": 0.9, "mask": "masks/000001_inst0.png" }
    ]
  }
]
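Before running the full validator, a quick in-process sanity check of the minimal shape above can be sketched as (not a replacement for tools/validate_instance_segmentation_predictions.py):

```python
def check_instance_seg_entry(entry):
    """Lightweight shape check for one instance-seg predictions entry."""
    assert isinstance(entry.get("image"), str), "image must be a string path"
    for inst in entry.get("instances", []):
        assert isinstance(inst["class_id"], int)
        assert 0.0 <= float(inst["score"]) <= 1.0
        assert inst["mask"].endswith(".png"), "masks are per-instance PNGs"
    return True

entry = {
    "image": "000001.png",
    "instances": [
        {"class_id": 0, "score": 0.9, "mask": "masks/000001_inst0.png"}
    ],
}
ok = check_instance_seg_entry(entry)
```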

Validate an artifact:

python3 tools/validate_instance_segmentation_predictions.py reports/instance_seg_predictions.json

Eval outputs:

  • mask mAP (map50, map50_95)
  • per-class AP table
  • per-image diagnostics (TP/FP/FN, mean IoU) and overlay selection (--overlay-sort {worst,best,first}; default: worst)
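Mask mAP ultimately reduces to per-pair IoU between binary masks, which is simply intersection over union of the foreground pixels:

```python
import numpy as np

def mask_iou(a, b):
    """IoU between two binary masks (bool or 0/1 arrays)."""
    a = a.astype(bool)
    b = b.astype(bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(a, b).sum() / union)

a = np.zeros((4, 4), dtype=np.uint8); a[:2, :2] = 1    # 4 px
b = np.zeros((4, 4), dtype=np.uint8); b[1:3, 1:3] = 1  # 4 px, 1 shared
iou = mask_iou(a, b)  # 1 / 7
```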

Run the synthetic demo and render overlays/HTML:

python3 tools/eval_instance_segmentation.py \
  --dataset examples/instance_seg_demo/dataset \
  --split val2017 \
  --predictions examples/instance_seg_demo/predictions/instance_seg_predictions.json \
  --pred-root examples/instance_seg_demo/predictions \
  --classes examples/instance_seg_demo/classes.txt \
  --html reports/instance_seg_demo_eval.html \
  --overlays-dir reports/instance_seg_demo_overlays \
  --max-overlays 10

Same via the unified CLI:

python3 tools/yolozu.py eval-instance-seg --dataset examples/instance_seg_demo/dataset --split val2017 --predictions examples/instance_seg_demo/predictions/instance_seg_predictions.json --pred-root examples/instance_seg_demo/predictions --classes examples/instance_seg_demo/classes.txt --html reports/instance_seg_demo_eval.html --overlays-dir reports/instance_seg_demo_overlays --max-overlays 10

Optional: prepare COCO instance-seg dataset with per-instance PNG masks (requires pycocotools):

python3 tools/prepare_coco_instance_seg.py --coco-root /path/to/coco --split val2017 --out data/coco-instance-seg

Optional: convert COCO instance-seg predictions (RLE/polygons) into YOLOZU PNG masks (requires pycocotools):

python3 tools/convert_coco_instance_seg_predictions.py \
  --predictions /path/to/coco_instance_seg_preds.json \
  --instances-json /path/to/instances_val2017.json \
  --output reports/instance_seg_predictions.json \
  --masks-dir reports/instance_seg_masks

Documentation

Start here: docs/training_inference_export.md

Roadmap (priorities)

  • P0: Unified CLI (torch / onnxruntime / tensorrt) with consistent args + same output schema; always write meta (git SHA / env / GPU / seed / config hash); keep tools/manifest.json updated.
  • P1: doctor (deps/GPU/driver/onnxrt/TRT diagnostics) + predict-images (folder input → predictions JSON + overlays) + HTML report.
  • P2: cache/re-run (fingerprinted runs) + sweeps (wrapper exists; expand sweeps for TTT/threshold/gate weights) + production inference cores (C++/Rust) as needed.
  • Long-form notes: docs/roadmap.md

Pros / Operational Notes (project-level)

Pros

  • Apache-2.0-only utilities and evaluation harnesses (no vendored GPL/AGPL inference code).
  • CPU-first development workflow: dataset tooling, validators, scenario suite, and unit tests run without a GPU.
  • Adapter interface decouples inference backend from evaluation (PyTorch/ONNXRuntime/TensorRT/custom), so you can run inference elsewhere and still score/compare locally.
  • Reproducible artifacts: stable JSON reports + optional JSONL history for regressions.
  • Symmetry + commonsense constraints are treated as first-class, test-covered utilities (not ad-hoc postprocess).

Operational notes and mitigations

  • Training remains scaffold-first in rtdetr_pose/ (data/loss/export wiring). Continual-learning behavior is testable straight from pip via yolozu demo continual --compare --markdown, and source training stays available via yolozu dev train <config> (docs/training_inference_export.md).
  • A one-command folder inference path is available from pip: yolozu predict-images --backend onnxrt --input-dir <dir> --onnx <model.onnx>, which writes predictions JSON + overlays + HTML in one run.
  • TensorRT remains NVIDIA/Linux-centric. macOS can still run CPU validation and ONNXRuntime export (yolozu onnxrt export ...); GPU/TRT build/eval is pinned to Runpod/container workflows (docs/tensorrt_pipeline.md).
  • Backend parity drift is handled by a dedicated checker: yolozu parity --reference reports/pred_torch.json --candidate reports/pred_onnxrt.json plus protocol-pinned eval settings (docs/yolo26_eval_protocol.md).
  • Lightweight metrics stay available for fast loops, and full COCOeval is directly exposed from pip: python3 -m pip install 'yolozu[coco]' then yolozu eval-coco --dataset <yolo-dataset> --predictions <predictions.json>.
  • Model weights/datasets stay outside git by design; reproducibility is maintained through stable JSON artifacts and pinned path conventions documented in docs/external_inference.md and docs/yolo26_inference_adapters.md.

Install (pip users)

python3 -m pip install yolozu
yolozu --help
yolozu doctor --output -

Optional extras (recommended as needed):

python3 -m pip install 'yolozu[demo]'    # torch demos (CPU OK)
python3 -m pip install 'yolozu[onnxrt]'  # ONNXRuntime CPU exporter
python3 -m pip install 'yolozu[coco]'    # pycocotools COCOeval
python3 -m pip install 'yolozu[full]'

CPU demos:

yolozu demo instance-seg
yolozu demo continual --method ewc_replay     # requires yolozu[demo]
yolozu demo continual --compare --markdown    # suite: naive/ewc/replay/ewc_replay

Source checkout (repo users)

This path unlocks the full repo tooling (tools/, rtdetr_pose/, scenarios, etc.).

python3 -m pip install -r requirements-test.txt
python3 -m pip install -e .

# Tiny smoke dataset (optional but useful for scenario runs)
bash tools/fetch_coco128.sh

python3 -m unittest -q

CLI: pip vs repo

  • Environment report
    • pip: yolozu doctor --output -
    • repo: python3 tools/yolozu.py doctor --output reports/doctor.json
  • Export smoke (no inference)
    • pip: yolozu export --backend labels --dataset /path/to/yolo --output reports/predictions.json --force
    • repo: same command
  • Folder inference + overlays/HTML
    • pip: yolozu predict-images --backend onnxrt --input-dir /path/to/images --onnx /path/to/model.onnx
    • repo: python3 tools/yolozu.py predict-images ...
  • Backend parity check
    • pip: yolozu parity --reference reports/pred_torch.json --candidate reports/pred_onnxrt.json
    • repo: python3 tools/check_predictions_parity.py ...
  • Validate dataset layout
    • pip: yolozu validate dataset /path/to/yolo --strict
    • repo: python3 tools/validate_dataset.py /path/to/yolo --strict
  • Validate predictions JSON
    • pip: yolozu validate predictions reports/predictions.json --strict
    • repo: python3 tools/validate_predictions.py reports/predictions.json --strict
  • COCOeval mAP
    • pip: yolozu eval-coco --dataset /path/to/yolo --predictions reports/predictions.json (requires yolozu[coco])
    • repo: python3 tools/eval_coco.py ...
  • Instance-seg eval (PNG masks)
    • pip: yolozu eval-instance-seg --dataset /path --predictions preds.json --output reports/instance_seg_eval.json
    • repo: python3 tools/eval_instance_segmentation.py ...
  • ONNXRuntime CPU export
    • pip: yolozu onnxrt export ... (requires yolozu[onnxrt])
    • repo: python3 tools/export_predictions_onnxrt.py ...
  • Training scaffold
    • pip: yolozu dev train configs/examples/train_setting.yaml (source checkout only)
    • repo: python3 rtdetr_pose/tools/train_minimal.py ...
  • Scenario suite
    • repo only: python3 tools/run_scenarios.py ...

Dev-only commands (source checkout): yolozu dev train, yolozu dev test (aliases: yolozu train/test).

The “power-user” unified CLI lives in-repo: python3 tools/yolozu.py --help.

Path behavior in tool CLIs:

  • Relative input paths are resolved from the current working directory (with repo-root fallback for compatibility).
  • Relative output paths are written under the current working directory.
  • For config-driven tools such as tools/tune_gate_weights.py, relative paths in the config are resolved from the config file directory.
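The config-relative rule for config-driven tools can be sketched with a small helper (hypothetical name; the repo's actual resolver may differ):

```python
from pathlib import Path

def resolve_config_path(config_file, value):
    """Resolve a possibly-relative path from a config file.

    Relative paths are interpreted from the config file's directory,
    matching the rule above; absolute paths pass through unchanged.
    """
    p = Path(value)
    return p if p.is_absolute() else (Path(config_file).parent / p).resolve()

resolved = resolve_config_path("/repo/configs/sweep.yaml", "data/coco128")
# -> /repo/configs/data/coco128
```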

Container images (GHCR)

YOLOZU can publish Docker images to GitHub Container Registry (GHCR) on tags vX.Y.Z.

  • Minimal (no torch): ghcr.io/<owner>/yolozu:<tag>
  • Demo (includes torch): ghcr.io/<owner>/yolozu-demo:<tag>

Examples:

docker run --rm ghcr.io/<owner>/yolozu:0.1.0 doctor --output -
docker run --rm ghcr.io/<owner>/yolozu-demo:0.1.0 demo continual --method ewc_replay

Publish trigger:

  • Push a tag vX.Y.Z to run .github/workflows/container.yml.
  • If the tag existed before the workflow was added, run it manually via GitHub Actions (workflow_dispatch) or cut a new tag.

Details: deploy/docker/README.md

GPU notes

  • GPU is supported (training/inference): install CUDA-enabled PyTorch in your environment and use --device cuda:0.
  • CI/dev does not require GPU; many checks are CPU-friendly.

Training scaffold (RT-DETR pose)

The minimal trainer is implemented in rtdetr_pose/tools/train_minimal.py.

Recommended usage is to set --run-dir, which writes a standard, reproducible artifact set:

  • metrics.jsonl (+ final metrics.json / metrics.csv)
  • checkpoint.pt (+ optional checkpoint_bundle.pt)
  • model.onnx (+ model.onnx.meta.json)
  • run_record.json (git SHA / platform / args)
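A run record of this kind (fields inferred from the list above; the trainer's actual schema may include more) can be captured with something like:

```python
import json
import os
import platform
import subprocess
import sys
import tempfile

def write_run_record(path, args):
    """Write a minimal reproducibility record (git SHA / platform / args)."""
    try:
        sha = subprocess.check_output(
            ["git", "rev-parse", "HEAD"],
            text=True, stderr=subprocess.DEVNULL).strip()
    except Exception:
        sha = None  # not running inside a git checkout
    record = {
        "git_sha": sha,
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "args": args,
    }
    with open(path, "w") as fh:
        json.dump(record, fh, indent=2)
    return record

path = os.path.join(tempfile.gettempdir(), "run_record.json")
record = write_run_record(path, {"epochs": 1})
```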

Plot a loss curve (requires matplotlib):

python3 tools/plot_metrics.py --jsonl runs/<run>/metrics.jsonl --out reports/train_loss.png

ONNX export

ONNX export runs when --run-dir is set (defaulting to <run-dir>/model.onnx) or when --onnx-out is provided.

Useful flags:

  • --run-dir <dir>
  • --onnx-out <path>
  • --onnx-meta-out <path>
  • --onnx-opset <int>
  • --onnx-dynamic-hw (dynamic H/W axes)

Dataset format (YOLO + optional metadata)

Base dataset format:

  • Images: images/<split>/*.(jpg|png|...)
  • Labels: labels/<split>/*.txt (YOLO: class cx cy w h normalized)

Optional per-image metadata (JSON): labels/<split>/<image>.json

  • Masks/seg: mask_path / mask / M
  • Depth: depth_path / depth / D_obj
  • Pose: R_gt / t_gt (or pose)
  • Intrinsics: K_gt / intrinsics (also supports OpenCV FileStorage-style camera_matrix: {rows, cols, data:[...]})
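The YOLO txt line format above (class cx cy w h, normalized to [0,1]) maps directly onto the bbox dict used in the predictions contract; a minimal parser:

```python
def parse_yolo_label_line(line):
    """Parse one YOLO label line: `class cx cy w h` (normalized)."""
    parts = line.split()
    cls = int(parts[0])
    cx, cy, w, h = map(float, parts[1:5])
    return {"class_id": cls, "bbox": {"cx": cx, "cy": cy, "w": w, "h": h}}

label = parse_yolo_label_line("3 0.5 0.5 0.25 0.1")
```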

Notes on units (pixels vs mm/m) and intrinsics coordinate frames are covered in the docs.

Mask-only labels (seg -> bbox/class)

If YOLO txt labels are missing and a mask is provided, bbox+class can be derived from masks. Details (including color/instance modes and multi-PNG-per-class options) are documented in the docs.

Evaluation / contracts (stable)

This repo evaluates models through a stable, versioned predictions JSON format.

Adapters power tools/export_predictions.py --adapter <name> and follow the adapter contract.

Precomputed predictions workflow (no torch required)

If you run real inference elsewhere (PyTorch/TensorRT/etc.), you can evaluate this repo without installing heavy deps locally.

  • Export predictions (in an environment where the adapter can run):
    • python3 tools/export_predictions.py --adapter rtdetr_pose --checkpoint /path/to.ckpt --max-images 50 --wrap --output reports/predictions.json
    • TTA (post-transform): python3 tools/export_predictions.py --adapter rtdetr_pose --tta --tta-seed 0 --tta-flip-prob 0.5 --wrap --output reports/predictions_tta.json
    • TTT (pre-prediction test-time training; updates model weights in-memory):
      • Tent (safe preset + guard rails): python3 tools/export_predictions.py --adapter rtdetr_pose --ttt --ttt-preset safe --ttt-reset sample --wrap --output reports/predictions_ttt_safe.json
      • MIM (safe preset + guard rails): python3 tools/export_predictions.py --adapter rtdetr_pose --ttt --ttt-preset mim_safe --ttt-reset sample --wrap --output reports/predictions_ttt_mim_safe.json
      • Optional log: add --ttt-log-out reports/ttt_log.json
      • Recommended protocol: docs/ttt_protocol.md
  • Validate the JSON:
    • python3 tools/validate_predictions.py reports/predictions.json
  • Consume predictions locally:
    • python3 tools/run_scenarios.py --adapter precomputed --predictions reports/predictions.json --max-images 50

Supported predictions JSON shapes:

  • [{"image": "...", "detections": [...]}, ...]
  • { "predictions": [ ... ] }
  • { "000000000009.jpg": [...], "/abs/path.jpg": [...] } (image -> detections)
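The three accepted shapes can be normalized into a single image → detections mapping with a small helper (a sketch of the idea, not the repo's actual loader):

```python
def normalize_predictions(payload):
    """Normalize the three accepted shapes into {image: detections}."""
    if isinstance(payload, dict) and "predictions" in payload:
        payload = payload["predictions"]  # unwrap {"predictions": [...]}
    if isinstance(payload, list):
        # List of {"image": ..., "detections": [...]} entries.
        return {e["image"]: e["detections"] for e in payload}
    return dict(payload)                  # already image -> detections

wrapped = {"predictions": [{"image": "a.jpg", "detections": []}]}
mapping = normalize_predictions(wrapped)
```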

Schema details are documented alongside the predictions contract.

COCO mAP (end-to-end, no NMS)

To compete on e2e mAP (NMS-free), evaluate detections as-is (no NMS postprocess applied).

This repo includes a COCO-style evaluator that:

  • Builds COCO ground truth from YOLO-format labels
  • Converts YOLOZU predictions JSON into COCO detections
  • Runs COCO mAP via pycocotools (optional dependency)

Example (coco128 quick run):

  • Export predictions (any adapter): python3 tools/export_predictions.py --adapter dummy --max-images 50 --wrap --output reports/predictions.json
  • Evaluate mAP: python3 tools/eval_coco.py --dataset data/coco128 --predictions reports/predictions.json --bbox-format cxcywh_norm --max-images 50

Note:

  • --bbox-format cxcywh_norm expects bbox dict {cx,cy,w,h} normalized to [0,1] (matching the RTDETR pose adapter bbox head).
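Converting that normalized {cx,cy,w,h} dict into COCO's pixel-space [x_min, y_min, w, h] detection format is a one-liner per field:

```python
def cxcywh_norm_to_coco(bbox, img_w, img_h):
    """Convert normalized {cx,cy,w,h} to COCO [x_min, y_min, w, h] in pixels."""
    w = bbox["w"] * img_w
    h = bbox["h"] * img_h
    x = bbox["cx"] * img_w - w / 2   # center -> top-left corner
    y = bbox["cy"] * img_h - h / 2
    return [x, y, w, h]

box = cxcywh_norm_to_coco({"cx": 0.5, "cy": 0.5, "w": 0.2, "h": 0.2}, 640, 480)
# -> [256.0, 192.0, 128.0, 96.0]
```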

Training recipe (v1)

Reference recipe for external training runs (augment, multiscale, schedule, EMA):

  • docs/training_recipe_v1.md

Training, inference, export (quick steps)

  • docs/training_inference_export.md

Hyperparameter sweep harness

Run a configurable sweep and emit CSV/MD tables:

  • docs/hpo_sweep.md

Latency/FPS benchmark harness

Report latency/FPS per YOLO26 bucket and archive runs over time:

  • docs/benchmark_latency.md

Inference-time gating / score fusion

Fuse detection/template/uncertainty signals into a single score and tune weights offline (CPU-only):

  • docs/gate_weight_tuning.md

TensorRT FP16/INT8 pipeline

Reproducible engine build + parity validation steps:

  • docs/tensorrt_pipeline.md

External baselines (Apache-2.0-friendly)

This repo does not require (or vendor) any GPL/AGPL inference code.

To compare against external baselines (including YOLO26) while keeping this repo Apache-2.0-only:

  • Run baseline inference in your own environment/implementation (ONNX Runtime / TensorRT / custom code).
  • Export detections to YOLOZU predictions JSON (see schema below).
  • (Optional) Normalize class ids using COCO classes.json mapping.
  • Validate + evaluate mAP in this repo:
    • python3 tools/validate_predictions.py reports/predictions.json
    • python3 tools/eval_coco.py --dataset /path/to/coco-yolo --split val2017 --predictions reports/predictions.json --bbox-format cxcywh_norm

Minimal predictions entry schema:

  • {"image": "/abs/or/rel/path.jpg", "detections": [{"class_id": 0, "score": 0.9, "bbox": {"cx": 0.5, "cy": 0.5, "w": 0.2, "h": 0.2}}]}

Optional class-id normalization (when your exporter produces COCO category_id):

  • python3 tools/normalize_predictions.py --input reports/predictions.json --output reports/predictions_norm.json --classes data/coco-yolo/labels/val2017/classes.json --wrap
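In essence, the normalization step translates COCO category_id values into the contiguous class_id values used by the evaluator, via the classes.json mapping. A sketch (mapping format assumed; the real tool reads it from classes.json):

```python
def remap_class_ids(detections, category_to_class):
    """Remap COCO category_id values to contiguous class_id values."""
    out = []
    for det in detections:
        det = dict(det)  # copy so the input list is left untouched
        det["class_id"] = category_to_class[det.pop("category_id")]
        out.append(det)
    return out

# COCO "person" is category_id 1; suppose the mapping assigns it class 0.
dets = remap_class_ids([{"category_id": 1, "score": 0.9}], {1: 0})
```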

COCO dataset prep (official JSON -> YOLO-format)

If you have the official COCO layout (images + annotations/instances_*.json), you can generate YOLO-format labels:

  • python3 tools/prepare_coco_yolo.py --coco-root /path/to/coco --split val2017 --out /path/to/coco-yolo

This creates:

  • /path/to/coco-yolo/labels/val2017/*.txt (YOLO normalized class cx cy w h)
  • /path/to/coco-yolo/labels/val2017/classes.json (category_id <-> class_id mapping)

Dataset layout under data/

For local development, keep datasets under data/:

  • Debug/smoke: data/coco128 (already included)
  • Full COCO (official): data/coco (your download)
  • YOLO-format labels generated from official JSON: data/coco-yolo (your output from tools/prepare_coco_yolo.py)

Size-bucket competition (yolo26n/s/m/l/x)

If you export yolo26n/s/m/l/x predictions as separate JSON files (e.g. reports/pred_yolo26n.json, ...), you can score them together:

  • Protocol details: docs/yolo26_eval_protocol.md
  • python3 tools/eval_suite.py --protocol yolo26 --dataset /path/to/coco-yolo --predictions-glob 'reports/pred_yolo26*.json' --output reports/eval_suite.json
  • Fill in targets: baselines/yolo26_targets.json
  • Validate targets: python3 tools/validate_map_targets.py --targets baselines/yolo26_targets.json
  • Check pass/fail: python3 tools/check_map_targets.py --suite reports/eval_suite.json --targets baselines/yolo26_targets.json --key map50_95
  • Print a table: python3 tools/print_leaderboard.py --suite reports/eval_suite.json --targets baselines/yolo26_targets.json --key map50_95
  • Archive the run (commands + hardware + suite output): python3 tools/import_yolo26_baseline.py --dataset /path/to/coco-yolo --predictions-glob 'reports/pred_yolo26*.json'

Debug without pycocotools

If you don't have pycocotools installed yet, you can still validate/convert predictions on data/coco128:

  • python3 tools/export_predictions.py --adapter dummy --max-images 10 --wrap --output reports/predictions_dummy.json
  • python3 tools/eval_coco.py --predictions reports/predictions_dummy.json --dry-run

Deployment notes

  • Keep symmetry/commonsense logic in lightweight postprocess utilities, outside any inference graph export.

License

Code in this repository is licensed under the Apache License, Version 2.0. See LICENSE.
