MATA | Model-Agnostic Task Architecture
Write your vision pipeline once. Swap any model — HuggingFace, ONNX, Torchvision — without changing a line of code.
For ML engineers and CV practitioners who want YOLO-like simplicity with HuggingFace-scale model choice, MATA is a task-centric computer vision framework built on three ideas:
- Universal model loading — load any model by HuggingFace ID, local ONNX file, or config alias with one API
- Composable graph pipelines — wire Detect → Segment → Embed into typed DAGs with parallel execution, conditional branching, and control flow
- Zero-shot everything — CLIP classify, GroundingDINO detect, SAM segment — no training required
See It in Action
Single-call inference — any HuggingFace model in a few lines:
import mata
result = mata.run("detect", "image.jpg", model="facebook/detr-resnet-50")
for det in result.instances:
    print(f"{det.label_name}: {det.score:.2f} at {det.bbox}")
Multi-task graph pipeline — MATA's unique power. Compose tasks into typed, parallel workflows:
import mata
from mata.nodes import Detect, Filter, PromptBoxes, Fuse

result = mata.infer(
    image="image.jpg",
    graph=[
        Detect(using="detector", text_prompts="cat . dog", out="dets"),
        Filter(src="dets", score_gt=0.3, out="filtered"),
        PromptBoxes(using="segmenter", dets="filtered", out="masks"),
        Fuse(dets="filtered", masks="masks", out="final"),
    ],
    providers={
        "detector": mata.load("detect", "IDEA-Research/grounding-dino-tiny"),
        "segmenter": mata.load("segment", "facebook/sam-vit-base"),
    }
)
CLI — run from the terminal, no script needed:
mata run detect image.jpg --model facebook/detr-resnet-50 --conf 0.4 --save
mata track video.mp4 --model facebook/detr-resnet-50 --tracker botsort --save
mata recognize person.jpg --gallery gallery.npz --model openai/clip-vit-base-patch32
Installation
pip install datamata
For GPU acceleration, install PyTorch with CUDA first:
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu126
pip install datamata
See INSTALLATION.md for CUDA version table, optional dependencies (ONNX, barcode, notebook, Valkey), and troubleshooting.
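To sanity-check the environment after installing, a minimal sketch (it uses only the mata.load call shown later in this README plus standard torch.cuda introspection):

import torch
import mata

# Confirms the package imports and whether PyTorch sees a CUDA device
print("CUDA available:", torch.cuda.is_available())

# Downloads weights from the HuggingFace Hub on first use
detector = mata.load("detect", "facebook/detr-resnet-50")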
Core Tasks
Detection
result = mata.run("detect", "image.jpg", model="facebook/detr-resnet-50", threshold=0.4)
for det in result.instances:
print(f"{det.label_name}: {det.score:.2f} at {det.bbox}")
Classification
result = mata.run("classify", "image.jpg", model="microsoft/resnet-50")
print(f"Top-1: {result.top1.label_name} ({result.top1.score:.2%})")
Segmentation
result = mata.run("segment", "image.jpg",
model="facebook/mask2former-swin-tiny-coco-instance", threshold=0.5)
instances = result.get_instances()
Depth Estimation
result = mata.run("depth", "image.jpg",
model="depth-anything/Depth-Anything-V2-Small-hf")
result.save("depth.png", colormap="magma")
And More
| Task | One-liner | Guide |
|---|---|---|
| OCR | mata.run("ocr", "doc.jpg", model="easyocr") | OCR Guide |
| Tracking | mata.track("video.mp4", model="...", tracker="botsort") | Tracking Guide |
| VLM | mata.run("vlm", "img.jpg", model="Qwen/Qwen3-VL-2B-Instruct", prompt="...") | VLM Guide |
| Embedding | mata.run("embed", "img.jpg", model="openai/clip-vit-base-patch32") | Embed Example |
| Barcode | mata.run("barcode", "img.jpg", model="pyzbar") | Barcode Examples |
| Recognition | mata.run("recognize", "img.jpg", gallery=gallery, model="...") | Recognition Guide |
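As a quick usage sketch for the Embedding row above, two images can be compared by cosine similarity of their vectors. The .vector attribute here is a hypothetical name for however the embed result exposes its feature vector; check the Embed Example for the actual field:

import numpy as np
import mata

model_id = "openai/clip-vit-base-patch32"
emb_a = mata.run("embed", "img_a.jpg", model=model_id)
emb_b = mata.run("embed", "img_b.jpg", model=model_id)

# .vector is a placeholder attribute name for the returned feature vector
a = np.asarray(emb_a.vector, dtype=np.float32)
b = np.asarray(emb_b.vector, dtype=np.float32)
cosine = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
print(f"cosine similarity: {cosine:.3f}")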
What Makes MATA Different
Graph Pipelines
Compose multi-task workflows as typed directed graphs. Run independent tasks in parallel for 1.5-3x speedup:
Example 1 — Parallel multi-task scene analysis (image):
from mata.nodes import Detect, Classify, EstimateDepth, Fuse
from mata.core.graph import Graph
result = mata.infer(
    image="scene.jpg",
    graph=Graph("scene_analysis").parallel([
        Detect(using="detector", out="dets"),
        Classify(using="classifier", text_prompts=["indoor", "outdoor"], out="cls"),
        EstimateDepth(using="depth", out="depth"),
    ]).then(
        Fuse(dets="dets", classification="cls", depth="depth", out="scene")
    ),
    providers={
        "detector": mata.load("detect", "facebook/detr-resnet-50"),
        "classifier": mata.load("classify", "openai/clip-vit-base-patch32"),
        "depth": mata.load("depth", "depth-anything/Depth-Anything-V2-Small-hf"),
    }
)
Example 2 — Natural-language video semantic search:
from mata.nodes import IndexVideo, EmbeddingSearch
from mata.core.graph import Graph
embedder = mata.load("embed", "Qwen/Qwen3-VL-Embedding-2B", dtype="bfloat16")
result = (
    Graph("urban_traffic_search")
    .then(IndexVideo(using="embedder", mode="frame", sample_fps=1.0))
    .then(EmbeddingSearch(
        using="embedder",
        text=[
            "person dangerously jaywalking between moving vehicles",
            "cyclist weaving through fast-moving traffic at night",
            "vehicle making an abrupt lane change near pedestrians",
        ],
        top_k=3,
        threshold=0.18,
    ))
).run(video="dashcam.mp4", providers={"embedder": embedder})

for qr in result["search_results"].results:
    print(f'"{qr.query}"')
    for rank, m in enumerate(qr.matches, 1):
        mm, ss = int(m.start_s) // 60, int(m.start_s) % 60
        print(f"  #{rank} sim={m.similarity:.4f} @ {mm:02d}m{ss:02d}s")
Control flow primitives (v1.9.5) — EarlyExit, While, and Graph.add(condition=...) for quality gates, feedback loops, and adaptive pipelines.
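A minimal sketch of how these primitives could gate a pipeline. The EarlyExit arguments, the shape of the condition callable, and chaining .add() like .then() are assumptions here rather than documented signatures; see the Graph API Reference for the real ones:

import mata
from mata.core.graph import Graph
from mata.nodes import Detect, EarlyExit, PromptBoxes

graph = (
    Graph("gated_segmentation")
    .then(Detect(using="detector", out="dets"))
    # EarlyExit kwargs and the condition callable's argument are illustrative
    # assumptions rather than documented signatures
    .add(
        EarlyExit(reason="no confident detections"),
        condition=lambda outputs: len(outputs["dets"].instances) == 0,
    )
    .then(PromptBoxes(using="segmenter", dets="dets", out="masks"))
)

result = graph.run(
    image="image.jpg",
    providers={
        "detector": mata.load("detect", "IDEA-Research/grounding-dino-tiny"),
        "segmenter": mata.load("segment", "facebook/sam-vit-base"),
    },
)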
Pre-built presets for common workflows:
from mata.presets import grounding_dino_sam, full_scene_analysis
result = mata.infer("image.jpg", grounding_dino_sam(), providers={...})
See Graph API Reference | Cookbook | Examples
Zero-Shot Vision
Perform any vision task without training — just provide text prompts:
# Classify into arbitrary categories
result = mata.run("classify", "image.jpg",
    model="openai/clip-vit-base-patch32",
    text_prompts=["cat", "dog", "bird"])

# Detect objects by description
result = mata.run("detect", "image.jpg",
    model="IDEA-Research/grounding-dino-tiny",
    text_prompts="red apple . green apple . banana")

# Segment anything with point/box/text prompts
result = mata.run("segment", "image.jpg",
    model="facebook/sam-vit-base",
    point_prompts=[(320, 240, 1)])
See Zero-Shot Guide for CLIP, GroundingDINO, OWL-ViT, SAM, and SAM3 details.
Object Tracking
Track objects across video with persistent IDs, ReID, and streaming support:
# One-liner video tracking
results = mata.track("video.mp4",
    model="facebook/detr-resnet-50", tracker="botsort", conf=0.3, save=True)

# Memory-efficient streaming for RTSP / long videos
for result in mata.track("rtsp://camera/stream",
        model="facebook/detr-resnet-50", stream=True):
    print(f"Active tracks: {len(result.instances)}")

# Appearance-based ReID — recover IDs after occlusion
results = mata.track("video.mp4", model="facebook/detr-resnet-50",
    reid_model="openai/clip-vit-base-patch32")
ByteTrack and BotSort are fully vendored — no external tracking dependencies. See Tracking Guide for ByteTrack vs BotSort comparison, cross-camera ReID, and YAML config.
Command-Line Interface
mata run detect image.jpg --model facebook/detr-resnet-50 --conf 0.4 --save
mata run classify image.jpg --model microsoft/resnet-50 --json
mata run vlm image.jpg --model Qwen/Qwen3-VL-2B-Instruct --prompt "Describe this"
mata track video.mp4 --model facebook/detr-resnet-50 --tracker botsort --save
mata val detect --data coco.yaml --model facebook/detr-resnet-50
mata --version
All subcommands support --help. See CLI Examples.
Supported Models
MATA works with any model from HuggingFace Transformers, Torchvision, or local ONNX/TorchScript files. Tested and recommended models:
| Task | Representative Models | Runtimes |
|---|---|---|
| Detection | DETR, RT-DETR, GroundingDINO, OWL-ViT, RetinaNet, Faster R-CNN, FCOS, SSD | PyTorch, ONNX, TorchScript, Torchvision |
| Classification | ResNet, ViT, ConvNeXt, EfficientNet, Swin, CLIP (zero-shot) | PyTorch, ONNX, TorchScript |
| Segmentation | Mask2Former, MaskFormer, SAM, SAM3 (zero-shot) | PyTorch |
| Depth | Depth Anything V1/V2 | PyTorch |
| VLM | Qwen3-VL, MedGemma, Florence-2, LLaVA-NeXT, SmolVLM, Moondream2, + 3 more | PyTorch |
| OCR | EasyOCR, PaddleOCR, Tesseract, GOT-OCR2, TrOCR | PyTorch |
| Embedding | CLIP, OSNet, X-CLIP | PyTorch, ONNX |
| Barcode | pyzbar, zxing-cpp | Native |
See Supported Models for model IDs, benchmarks, and runtime compatibility matrix.
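The same load call is meant to span these runtimes. A hedged sketch follows: the HuggingFace ID is documented above, while the local ONNX/TorchScript paths and the "resnet50" alias are illustrative placeholders, not verified identifiers:

import mata

# HuggingFace Hub ID (documented above)
detector = mata.load("detect", "facebook/detr-resnet-50")

# Local ONNX / TorchScript files; these paths are placeholders
onnx_detector = mata.load("detect", "exports/detector.onnx")
ts_classifier = mata.load("classify", "exports/classifier.torchscript")

# Config alias; "resnet50" is an assumed alias, not a verified one
tv_classifier = mata.load("classify", "resnet50")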
When NOT to Use MATA
- Training-first workflows — mata.train() is planned for v2.0.0. If training is your primary need today, use HuggingFace Trainer or PyTorch Lightning directly.
- Edge / mobile deployment — TensorRT and TFLite export are planned but not yet available.
- Single-model, maximum-throughput — MATA's adapter layer adds ~1-2ms overhead. For bare-metal speed on one model, use the runtime directly.
Architecture
              mata.run() / mata.load() / mata.infer()
                               |
              UniversalLoader (5-strategy auto-detection)
                               |
    Task Adapters (HuggingFace / ONNX / TorchScript / Torchvision)
              |                                  |
  VisionResult (single-task)          Graph System (multi-task)
              |                                  |
        Runtime Layer              Parallel scheduler + control flow
                               |
            Export (JSON / CSV / image overlay / crops)
For a deep-dive into design decisions and layer contracts, see docs/MATA_DESIGN_AND_ARCHITECTURE.md.
Roadmap
v1.9.7 is the final feature release. The 1.9.x line is now in maintenance mode.
See CHANGELOG.md for full version history.
- v1.9.x (maintenance) — bug fixes and documentation only
- v2.0.0 (Q2 2026) — Annotation tooling (mata.annotate()), training module (mata.train()), quantized ONNX export, breaking API cleanup. View Development of v2.0.0 Beta 2 Branch
- v2.x — HuggingFace Hub model recommendations, KACA CNN integration, V2L HyperLoRA research
- v2.5+ — 3D vision, edge deployment, Auto-ML
What's Next?
- Quickstart Guide — get running in 5 minutes
- Notebook Examples — interactive Jupyter tutorials
- Graph Cookbook — multi-task pipeline recipes
- Real-World Scenarios — 20 industry-ready pipelines
- Quick Reference — export, config, validation cheat sheet
- Validation Guide — mAP, accuracy, and depth metrics against COCO / ImageNet / DIODE
License
Apache License 2.0. See LICENSE and NOTICE.
MATA does not distribute model weights. Models fetched via mata.load() are governed by their own licenses (Apache 2.0, MIT, CC-BY-NC, etc.). You are responsible for complying with model-specific terms.
Contributing
Contributions welcome. See CONTRIBUTING.md for guidelines (Apache 2.0 compatibility, >80% test coverage, Black formatting, type hints).
Acknowledgments
Built on HuggingFace Transformers, PyTorch, and ONNX Runtime.