
MATA | Model-Agnostic Task Architecture

Write your vision pipeline once. Swap any model — HuggingFace, ONNX, Torchvision — without changing a line of code.



For ML engineers and CV practitioners who want YOLO-like simplicity with HuggingFace-scale model choice. MATA is a task-centric computer vision framework built on three ideas:

  1. Universal model loading — load any model by HuggingFace ID, local ONNX file, or config alias with one API
  2. Composable graph pipelines — wire Detect → Segment → Embed into typed DAGs with parallel execution, conditional branching, and control flow
  3. Zero-shot everything — CLIP classify, GroundingDINO detect, SAM segment — no training required

See It in Action

One-liner inference — run any HuggingFace model in a few lines:

import mata

result = mata.run("detect", "image.jpg", model="facebook/detr-resnet-50")
for det in result.instances:
    print(f"{det.label_name}: {det.score:.2f} at {det.bbox}")

Multi-task graph pipeline — MATA's headline feature. Compose tasks into typed, parallel workflows:

import mata
from mata.nodes import Detect, Filter, PromptBoxes, Fuse

result = mata.infer(
    image="image.jpg",
    graph=[
        Detect(using="detector", text_prompts="cat . dog", out="dets"),
        Filter(src="dets", score_gt=0.3, out="filtered"),
        PromptBoxes(using="segmenter", dets="filtered", out="masks"),
        Fuse(dets="filtered", masks="masks", out="final"),
    ],
    providers={
        "detector":  mata.load("detect", "IDEA-Research/grounding-dino-tiny"),
        "segmenter": mata.load("segment", "facebook/sam-vit-base"),
    }
)

CLI — run from the terminal, no script needed:

mata run detect image.jpg --model facebook/detr-resnet-50 --conf 0.4 --save
mata track video.mp4 --model facebook/detr-resnet-50 --tracker botsort --save
mata recognize person.jpg --gallery gallery.npz --model openai/clip-vit-base-patch32

Installation

pip install datamata

For GPU acceleration, install PyTorch with CUDA first:

pip install torch torchvision --index-url https://download.pytorch.org/whl/cu126
pip install datamata

See INSTALLATION.md for CUDA version table, optional dependencies (ONNX, barcode, notebook, Valkey), and troubleshooting.

Core Tasks

Detection

result = mata.run("detect", "image.jpg", model="facebook/detr-resnet-50", threshold=0.4)
for det in result.instances:
    print(f"{det.label_name}: {det.score:.2f} at {det.bbox}")

Classification

result = mata.run("classify", "image.jpg", model="microsoft/resnet-50")
print(f"Top-1: {result.top1.label_name} ({result.top1.score:.2%})")

Segmentation

result = mata.run("segment", "image.jpg",
    model="facebook/mask2former-swin-tiny-coco-instance", threshold=0.5)
instances = result.get_instances()

Depth Estimation

result = mata.run("depth", "image.jpg",
    model="depth-anything/Depth-Anything-V2-Small-hf")
result.save("depth.png", colormap="magma")

And More

| Task | One-liner | Guide |
|---|---|---|
| OCR | mata.run("ocr", "doc.jpg", model="easyocr") | OCR Guide |
| Tracking | mata.track("video.mp4", model="...", tracker="botsort") | Tracking Guide |
| VLM | mata.run("vlm", "img.jpg", model="Qwen/Qwen3-VL-2B-Instruct", prompt="...") | VLM Guide |
| Embedding | mata.run("embed", "img.jpg", model="openai/clip-vit-base-patch32") | Embed Example |
| Barcode | mata.run("barcode", "img.jpg", model="pyzbar") | Barcode Examples |
| Recognition | mata.run("recognize", "img.jpg", gallery=gallery, model="...") | Recognition Guide |
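Under the hood, embedding and recognition tasks boil down to comparing feature vectors against a gallery, typically by cosine similarity. A minimal plain-Python sketch of that matching step (illustrative only — the vectors and labels are made up, and this is not MATA's API):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def match_gallery(query, gallery):
    """Return the gallery label whose embedding is most similar to the query."""
    return max(gallery, key=lambda label: cosine_similarity(query, gallery[label]))

# Hypothetical 3-dim embeddings; real CLIP embeddings are 512-dim or larger.
gallery = {
    "alice": [0.9, 0.1, 0.0],
    "bob":   [0.1, 0.8, 0.2],
}
print(match_gallery([0.85, 0.15, 0.0], gallery))  # alice
```

In practice the gallery would hold model-produced embeddings (e.g. from the CLIP one-liner above) rather than hand-written vectors.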

What Makes MATA Different

Graph Pipelines

Compose multi-task workflows as typed directed graphs. Independent tasks run in parallel for a 1.5-3x speedup:

from mata.nodes import Detect, Classify, EstimateDepth, Fuse
from mata.core.graph import Graph

result = mata.infer(
    image="scene.jpg",
    graph=Graph("scene_analysis").parallel([
        Detect(using="detector", out="dets"),
        Classify(using="classifier", text_prompts=["indoor", "outdoor"], out="cls"),
        EstimateDepth(using="depth", out="depth"),
    ]).then(
        Fuse(dets="dets", classification="cls", depth="depth", out="scene")
    ),
    providers={
        "detector": mata.load("detect", "facebook/detr-resnet-50"),
        "classifier": mata.load("classify", "openai/clip-vit-base-patch32"),
        "depth": mata.load("depth", "depth-anything/Depth-Anything-V2-Small-hf"),
    }
)

Control flow primitives (v1.9.5) — EarlyExit, While, and Graph.add(condition=...) for quality gates, feedback loops, and adaptive pipelines.
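To make the semantics of these primitives concrete, here is a plain-Python sketch of the pattern a While-style feedback loop with an EarlyExit-style quality gate expresses — relax a detection threshold until enough detections pass. This is concept-only illustration with a fake detector, not MATA's node API:

```python
def run_detector(threshold):
    # Stand-in for a detection pass; returns fixed fake scores for illustration.
    scores = [0.9, 0.55, 0.35, 0.2]
    return [s for s in scores if s >= threshold]

threshold = 0.8
detections = run_detector(threshold)
# "While"-style feedback loop: relax the threshold until the gate passes.
while len(detections) < 3 and threshold > 0.3:
    threshold -= 0.1
    detections = run_detector(threshold)
# "EarlyExit"-style quality gate: stop the pipeline once results are good enough.
if len(detections) >= 3:
    print(f"accepted {len(detections)} detections at threshold {threshold:.1f}")
```

In a MATA graph, the loop body would be a subgraph and the gate a condition on a node's output rather than hand-written Python.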

Pre-built presets for common workflows:

from mata.presets import grounding_dino_sam, full_scene_analysis
result = mata.infer("image.jpg", grounding_dino_sam(), providers={...})

See Graph API Reference | Cookbook | Examples

Zero-Shot Vision

Perform any vision task without training — just provide text prompts:

# Classify into arbitrary categories
result = mata.run("classify", "image.jpg",
    model="openai/clip-vit-base-patch32",
    text_prompts=["cat", "dog", "bird"])

# Detect objects by description
result = mata.run("detect", "image.jpg",
    model="IDEA-Research/grounding-dino-tiny",
    text_prompts="red apple . green apple . banana")

# Segment anything with point/box/text prompts
result = mata.run("segment", "image.jpg",
    model="facebook/sam-vit-base",
    point_prompts=[(320, 240, 1)])

See Zero-Shot Guide for CLIP, GroundingDINO, OWL-ViT, SAM, and SAM3 details.
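The GroundingDINO-style prompts above separate class names with " . ". A small hypothetical helper (not part of MATA) for converting between a label list and that prompt format:

```python
def split_prompt(prompt):
    """Split a dot-separated detection prompt into individual class names."""
    return [part.strip() for part in prompt.split(".") if part.strip()]

def join_prompt(labels):
    """Build a dot-separated detection prompt from a list of class names."""
    return " . ".join(labels)

labels = split_prompt("red apple . green apple . banana")
print(labels)              # ['red apple', 'green apple', 'banana']
print(join_prompt(labels)) # red apple . green apple . banana
```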

Object Tracking

Track objects across video with persistent IDs, ReID, and streaming support:

# One-liner video tracking
results = mata.track("video.mp4",
    model="facebook/detr-resnet-50", tracker="botsort", conf=0.3, save=True)

# Memory-efficient streaming for RTSP / long videos
for result in mata.track("rtsp://camera/stream",
                         model="facebook/detr-resnet-50", stream=True):
    print(f"Active tracks: {len(result.instances)}")

# Appearance-based ReID — recover IDs after occlusion
results = mata.track("video.mp4", model="facebook/detr-resnet-50",
    reid_model="openai/clip-vit-base-patch32")

ByteTrack and BotSort are fully vendored — no external tracking dependencies. See Tracking Guide for ByteTrack vs BotSort comparison, cross-camera ReID, and YAML config.
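The core of trackers like these is associating new detections with existing tracks, largely by bounding-box IoU. A simplified plain-Python illustration of that association step (greedy matching with made-up boxes — the vendored trackers use more sophisticated assignment):

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def greedy_match(tracks, detections, iou_thresh=0.3):
    """Greedily assign each detection to the best unmatched track by IoU."""
    assignments, used = {}, set()
    for di, det in enumerate(detections):
        best, best_iou = None, iou_thresh
        for tid, box in tracks.items():
            score = iou(box, det)
            if tid not in used and score >= best_iou:
                best, best_iou = tid, score
        if best is not None:
            assignments[di] = best
            used.add(best)
    return assignments

tracks = {1: (10, 10, 50, 50), 2: (100, 100, 150, 150)}
dets = [(12, 12, 52, 52), (300, 300, 340, 340)]
print(greedy_match(tracks, dets))  # {0: 1} — detection 1 starts a new track
```

ReID adds an appearance term to this association, which is what lets a track's ID survive an occlusion where IoU alone would fail.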

Command-Line Interface

mata run detect image.jpg --model facebook/detr-resnet-50 --conf 0.4 --save
mata run classify image.jpg --model microsoft/resnet-50 --json
mata run vlm image.jpg --model Qwen/Qwen3-VL-2B-Instruct --prompt "Describe this"
mata track video.mp4 --model facebook/detr-resnet-50 --tracker botsort --save
mata val detect --data coco.yaml --model facebook/detr-resnet-50
mata --version

All subcommands support --help. See CLI Examples.

Supported Models

MATA works with any model from HuggingFace Transformers, Torchvision, or local ONNX/TorchScript files. Tested and recommended models:

| Task | Representative Models | Runtimes |
|---|---|---|
| Detection | DETR, RT-DETR, GroundingDINO, OWL-ViT, RetinaNet, Faster R-CNN, FCOS, SSD | PyTorch, ONNX, TorchScript, Torchvision |
| Classification | ResNet, ViT, ConvNeXt, EfficientNet, Swin, CLIP (zero-shot) | PyTorch, ONNX, TorchScript |
| Segmentation | Mask2Former, MaskFormer, SAM, SAM3 (zero-shot) | PyTorch |
| Depth | Depth Anything V1/V2 | PyTorch |
| VLM | Qwen3-VL, MedGemma, Florence-2, LLaVA-NeXT, SmolVLM, Moondream2, + 3 more | PyTorch |
| OCR | EasyOCR, PaddleOCR, Tesseract, GOT-OCR2, TrOCR | PyTorch |
| Embedding | CLIP, DINOv2, OSNet | PyTorch, ONNX |
| Barcode | pyzbar, zxing-cpp | Native |

See Supported Models for model IDs, benchmarks, and runtime compatibility matrix.

When NOT to Use MATA

  • Training-first workflows — mata.train() is in beta (v2.0.0b1). If training is your primary need today, consider HuggingFace Trainer directly.
  • Edge / mobile deployment — TensorRT and TFLite export are planned but not yet available.
  • Single-model, maximum-throughput — MATA's adapter layer adds ~1-2ms overhead. For bare-metal speed on one model, use the runtime directly.

Architecture

mata.run() / mata.load() / mata.infer()
         |
   UniversalLoader (5-strategy auto-detection)
         |
   Task Adapters (HuggingFace / ONNX / TorchScript / Torchvision)
         |                          |
   VisionResult (single-task)   Graph System (multi-task)
         |                          |
   Runtime Layer              Parallel scheduler + control flow
         |
   Export (JSON / CSV / image overlay / crops)
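The UniversalLoader → Task Adapter dispatch in the diagram can be pictured as a registry mapping task names to adapter classes. The following is an illustrative sketch of that pattern in plain Python — not MATA's internals, and every name in it is hypothetical:

```python
# Tiny task registry: load("detect", model_id) looks up the adapter
# registered under "detect" and instantiates it for the given model.
ADAPTERS = {}

def register(task):
    def wrap(cls):
        ADAPTERS[task] = cls
        return cls
    return wrap

@register("detect")
class DetectAdapter:
    def __init__(self, model_id):
        self.model_id = model_id

    def run(self, image):
        # A real adapter would run inference; here we just echo the request.
        return {"task": "detect", "model": self.model_id, "image": image}

def load(task, model_id):
    """Dispatch to the adapter registered for `task`."""
    return ADAPTERS[task](model_id)

result = load("detect", "facebook/detr-resnet-50").run("image.jpg")
print(result["model"])  # facebook/detr-resnet-50
```

The registry is what makes the framework model-agnostic at the call site: swapping models or runtimes changes only what gets registered, never the pipeline code.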

Roadmap

See CHANGELOG.md for full version history.

  • v2.0 (Q2 2026) — Training module (mata.train()), TensorRT, mobile export, breaking API cleanup
  • v2.x — HuggingFace Hub model recommendations, KACA CNN integration, V2L HyperLoRA research
  • v2.5+ — 3D vision, edge deployment, Auto-ML


License

Apache License 2.0. See LICENSE and NOTICE.

MATA does not distribute model weights. Models fetched via mata.load() are governed by their own licenses (Apache 2.0, MIT, CC-BY-NC, etc.). You are responsible for complying with model-specific terms.

Contributing

Contributions welcome. See CONTRIBUTING.md for guidelines (Apache 2.0 compatibility, >80% test coverage, Black formatting, type hints).

Acknowledgments

Built on HuggingFace Transformers, PyTorch, and ONNX Runtime.
