YOLO segmentation utilities for video streams with tracking and inference metrics

Project description

mini_vision

Simple Python library for object segmentation in video streams using YOLO models.

The goal of this library is to provide a modular pipeline to:

consume video streams
perform object segmentation
render contours or masks

The library follows data-oriented design and low-coupling principles, so the underlying computer vision model can be replaced without changing the rest of the pipeline.
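As a rough illustration of the low-coupling idea, the sketch below defines a structural interface that any segmentation backend could satisfy. The names `SegmenterLike`, `FakeSegmenter`, and `run_pipeline` are hypothetical and are not part of mini_vision; the only thing taken from this description is that a segmenter exposes a `segment(frame)` method.

```python
from typing import Any, List, Protocol


class SegmenterLike(Protocol):
    """Hypothetical interface: anything with a segment(frame) method."""

    def segment(self, frame: Any) -> List[Any]:
        ...


class FakeSegmenter:
    """Stand-in backend that returns one fixed detection (illustration only)."""

    def segment(self, frame: Any) -> List[Any]:
        return [{"label": "person", "score": 0.9}]


def run_pipeline(segmenter: SegmenterLike, frames: List[Any]) -> List[List[Any]]:
    # The pipeline depends only on the interface, so backends are swappable.
    return [segmenter.segment(frame) for frame in frames]


results = run_pipeline(FakeSegmenter(), frames=["frame-0", "frame-1"])
```

Because `run_pipeline` only relies on the method signature, a different model wrapper could be passed in without touching the calling code.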

Installation

pip install mini-vision

Usage

Example showing how to use mini_vision to consume a video stream, run YOLO segmentation, and render object contours.

Import the library

from mini_vision import (
    YoloSegmenter,
    SegmentationRenderer,
    WSFrameClient,
    InferenceMetrics,
)

Components

Component	Description
`YoloSegmenter`	Runs object segmentation using one or more YOLO segmentation models
`SegmentationRenderer`	Draws segmentation contours or masks on the frame
`WSFrameClient`	Connects to a WebSocket video stream and yields frames
`InferenceMetrics`	Calculates inference time, estimated FPS, and process RAM usage

Device (CPU / GPU)

YoloSegmenter supports explicit device selection for inference.

By default, it runs on CPU, but you can manually choose the execution device.

Device	Description
`"cpu"`	Runs inference on CPU (default)
`"cuda"`	Runs inference on NVIDIA GPU (faster)

Example using CPU

segmenter = YoloSegmenter("yolov8n-seg.pt", device="cpu")

Example using GPU (CUDA)

segmenter = YoloSegmenter("yolov8n-seg.pt", device="cuda")

Note: CUDA requires a compatible NVIDIA GPU and properly installed drivers.
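One way to avoid hard-coding the device is to check for a usable GPU first. The helper below is a minimal sketch, not part of mini_vision: `pick_device` is a hypothetical name, and it assumes the models run on PyTorch, whose `torch.cuda.is_available()` reports whether CUDA can be used. If PyTorch is not installed, it falls back to `"cpu"`.

```python
def pick_device(prefer_gpu: bool = True) -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    if not prefer_gpu:
        return "cpu"
    try:
        import torch  # imported lazily so the helper works without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
```

The resulting string can then be passed as the `device` argument, for example `YoloSegmenter("yolov8n-seg.pt", device=device)`.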

Multiple Models

YoloSegmenter supports loading more than one YOLO model at the same time.

You can pass:

a single .pt model file
or a directory containing multiple .pt model files

When a directory is provided, the library loads all supported model files found inside it and combines detections from all loaded models into a single output.

Example using a single model

segmenter = YoloSegmenter("models/yolov8n-seg.pt", device="cpu")
detections = segmenter.segment(frame)

Example using a directory with multiple models

segmenter = YoloSegmenter("models/", device="cuda")
detections = segmenter.segment(frame)

Example directory structure:

models/
├── car-seg.pt
├── person-seg.pt
└── animal-seg.pt

This is useful when you want to combine specialized models in the same segmentation pipeline.
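To check which files a directory would contribute before loading it, something like the following can help. It is a sketch under stated assumptions: `list_model_files` is a hypothetical helper, and it assumes that only files ending in `.pt` directly inside the directory count as model files. The library's actual discovery rules may differ.

```python
import tempfile
from pathlib import Path
from typing import List


def list_model_files(path: str) -> List[Path]:
    """Return the .pt files that a file or directory path refers to."""
    p = Path(path)
    if p.is_file():
        return [p] if p.suffix == ".pt" else []
    if not p.is_dir():
        return []
    # Sort for a stable order; files with other extensions are ignored.
    return sorted(f for f in p.iterdir() if f.is_file() and f.suffix == ".pt")


# Demo with a throwaway directory that mimics the structure above.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("car-seg.pt", "person-seg.pt", "notes.txt"):
        (Path(tmp) / name).touch()
    found = [f.name for f in list_model_files(tmp)]
```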

YoloSegmenter

Runs object segmentation using one or more YOLO segmentation models.

segmenter = YoloSegmenter("yolov8n-seg.pt", device="cpu")
detections = segmenter.segment(frame)

You can also load multiple models by passing a directory:

segmenter = YoloSegmenter("models/", device="cuda")
detections = segmenter.segment(frame)

Tracking during segmentation

To populate object tracking IDs, enable tracking when calling segment(...).

detections = segmenter.segment(frame, track=True)

This allows each detection to carry a track_id, which can be used by the renderer, JSON output, or TOON output.

Inference Metrics

YoloSegmenter can optionally return inference metrics for each processed frame.

To enable metrics, pass return_metrics=True when calling segment(...).

detections, metrics = segmenter.segment(
    frame,
    return_metrics=True
)

The returned metrics include:

inference_time_ms: model inference time in milliseconds
fps_estimated: estimated FPS based on inference time
ram_usage_mb: current RAM usage of the Python process in megabytes

Example output:

{
  "metrics": [
    {
      "inference_time_ms": 42.18,
      "fps_estimated": 23.71,
      "ram_usage_mb": 684.52
    }
  ]
}
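The relationship between the first two fields can be sketched as follows. This assumes the estimate is simply 1000 divided by the inference time in milliseconds, which is consistent with the example values but is not confirmed as the library's exact formula. `fps_from_ms` and `timed` are hypothetical helpers, and RAM usage is left out because it needs a process-inspection library.

```python
import time
from typing import Any, Callable, Dict, Tuple


def fps_from_ms(inference_time_ms: float) -> float:
    """Estimated FPS if every frame took this long (assumed formula)."""
    return round(1000.0 / inference_time_ms, 2)


def timed(fn: Callable[..., Any], *args: Any) -> Tuple[Any, Dict[str, float]]:
    """Run fn(*args) and report wall-clock time using the field names above."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    # Guard against a zero reading on very fast calls.
    fps = fps_from_ms(elapsed_ms) if elapsed_ms > 0 else float("inf")
    return result, {"inference_time_ms": round(elapsed_ms, 2), "fps_estimated": fps}


print(fps_from_ms(42.18))  # 23.71, the value shown in the example output
```

Note that an estimate of this kind covers inference only; frame decoding and rendering add to the real per-frame cost.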

Metrics can also be used together with tracking:

detections, metrics = segmenter.segment(
    frame,
    track=True,
    return_metrics=True
)

If return_metrics is not enabled, segment(...) keeps the default behavior and returns only the detections:

detections = segmenter.segment(frame)

SegmentationRenderer

Responsible for rendering segmentation contours or masks on frames.

renderer = SegmentationRenderer()
frame = renderer.draw(
    frame,
    detections,
    mode="contour",
    detect_object=["car", "person", "dog"],
    detect_color=["gold", "neon_pink", "lime"],
    thickness=2
)

Tracking by ID

The renderer also supports object tracking visualization by ID.

To enable tracking visualization, pass track=True in renderer.draw(...).

renderer = SegmentationRenderer()
frame = renderer.draw(
    frame,
    detections,
    mode="contour",
    detect_object=["car", "person", "dog"],
    detect_color=["gold", "neon_pink", "lime"],
    thickness=2,
    track=True
)

When tracking is enabled, the rendered label can include the tracked object ID together with the class label and confidence score.

Example output on frame:

car #3 0.87
person #1 0.92
dog #5 0.81
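The label layout shown above can be reproduced with a small formatter. This is only a sketch: `format_label` is a hypothetical function, not the renderer's code, and the fallback without a track ID (class label followed by score) is an assumption.

```python
from typing import Optional


def format_label(label: str, score: float, track_id: Optional[int] = None) -> str:
    """Build a label such as "car #3 0.87"; omit the ID when there is none."""
    if track_id is None:
        return f"{label} {score:.2f}"
    return f"{label} #{track_id} {score:.2f}"


print(format_label("car", 0.87, 3))  # car #3 0.87
print(format_label("dog", 0.81))     # dog 0.81
```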

Note: to have track_id populated in rendered labels, JSON output, or TOON output, tracking must be enabled during segmentation:
detections = segmenter.segment(frame, track=True)

JSON Output (optional)

SegmentationRenderer can optionally return structured detection data in JSON format.

This allows integration with logging systems, APIs, analytics pipelines, or other downstream processing tools.

To enable this feature, pass return_json=True.

frame, data = renderer.draw(
    frame,
    detections,
    mode="contour",
    detect_object=["car", "person", "dog"],
    detect_color=["gold", "neon_pink", "lime"],
    thickness=2,
    return_json=True
)

Example JSON output:

{
  "detections": [
    {
      "label": "person",
      "score": 0.92,
      "track_id": 3,
      "bbox": [120, 80, 240, 300]
    },
    {
      "label": "car",
      "score": 0.88,
      "track_id": 7,
      "bbox": [400, 210, 560, 350]
    }
  ]
}

If return_json is not enabled, renderer.draw(...) returns only the processed frame.
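A downstream consumer might filter this data before logging it. The sketch below assumes the payload has the shape of the example JSON output; whether `data` arrives as a dict or as a JSON string is not stated here, so the hypothetical helper `confident_detections` accepts both.

```python
import json
from typing import Any, Dict, List, Union


def confident_detections(
    data: Union[str, Dict[str, Any]], min_score: float
) -> List[Dict[str, Any]]:
    """Keep detections whose score is at least min_score."""
    # Accept a JSON string or an already-parsed dict (return type is an assumption).
    payload = json.loads(data) if isinstance(data, str) else data
    return [d for d in payload.get("detections", []) if d["score"] >= min_score]


# Sample payload copied from the example JSON output.
sample = {
    "detections": [
        {"label": "person", "score": 0.92, "track_id": 3, "bbox": [120, 80, 240, 300]},
        {"label": "car", "score": 0.88, "track_id": 7, "bbox": [400, 210, 560, 350]},
    ]
}

kept = confident_detections(sample, min_score=0.9)
log_line = json.dumps(kept)  # one compact line, suitable for a log record
```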

TOON Output

The renderer can also return detections in TOON format, a lightweight text representation designed for agent pipelines, logging, and token-efficient LLM processing.

Enable it using return_toon=True.

frame, toon = renderer.draw(
    frame,
    detections,
    return_toon=True
)

Example TOON output:

frame_width=1280 frame_height=720
label=person track_id=3 x=412 y=210 w=120 h=260 cx=472 cy=340 area=31200 score=0.91
label=car track_id=7 x=102 y=320 w=180 h=90 cx=192 cy=365 area=16200 score=0.88

If tracking is enabled, TOON output can include the tracked object ID through the track_id field.
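If the text needs to be turned back into structured data, the key=value layout of the example is straightforward to parse. The following is a minimal sketch based only on that example; `parse_toon` and `parse_value` are hypothetical helpers, and the approach assumes values contain no spaces, so a label such as "traffic light" would need different handling.

```python
from typing import Dict, List, Union

Value = Union[int, float, str]


def parse_value(text: str) -> Value:
    """Convert a token to int or float when possible, else keep the string."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def parse_toon(text: str) -> List[Dict[str, Value]]:
    """Parse one dict per line from space-separated key=value tokens."""
    rows = []
    for line in text.strip().splitlines():
        pairs = (token.split("=", 1) for token in line.split() if "=" in token)
        rows.append({key: parse_value(value) for key, value in pairs})
    return rows


sample = """frame_width=1280 frame_height=720
label=person track_id=3 x=412 y=210 w=120 h=260 cx=472 cy=340 area=31200 score=0.91
label=car track_id=7 x=102 y=320 w=180 h=90 cx=192 cy=365 area=16200 score=0.88"""

rows = parse_toon(sample)
header, detections = rows[0], rows[1:]
```

In the sample, `area` equals `w * h` and `cx`, `cy` are the box center, which makes the output easy to sanity-check after parsing.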

WSFrameClient

Connects to a WebSocket video stream and yields frames.

import asyncio

client = WSFrameClient("ws://127.0.0.1:8000/ws/frames")

async def main():
    # `async for` is only valid inside a coroutine
    async for frame in client.frames():
        detections = segmenter.segment(frame, track=True)

        frame = renderer.draw(
            frame,
            detections,
            mode="contour",
            detect_object=["car", "person", "dog"],
            detect_color=["gold", "neon_pink", "lime"],
            thickness=2,
            track=True
        )

asyncio.run(main())

Full Example

import asyncio

from mini_vision import (
    YoloSegmenter,
    SegmentationRenderer,
    WSFrameClient,
)

segmenter = YoloSegmenter("models/", device="cuda")
renderer = SegmentationRenderer()
client = WSFrameClient("ws://127.0.0.1:8000/ws/frames")

async def main():
    # `async for` is only valid inside a coroutine
    async for frame in client.frames():
        # Segment with tracking and collect per-frame metrics
        detections, metrics = segmenter.segment(
            frame,
            track=True,
            return_metrics=True
        )

        # Draw contours and track IDs on the frame
        frame = renderer.draw(
            frame,
            detections,
            mode="contour",
            detect_object=["car", "person", "dog"],
            detect_color=["gold", "neon_pink", "lime"],
            thickness=2,
            track=True
        )

        print(metrics)

asyncio.run(main())

Project details

Release history

0.5.1 (this version): May 5, 2026
0.5.0: Mar 25, 2026
0.3.1: Mar 7, 2026
0.3.0: Mar 6, 2026
0.2.4: Mar 6, 2026
0.2.3: Mar 5, 2026
0.2.1: Mar 4, 2026
0.2.0: Mar 4, 2026

