YOLO segmentation utilities for video streams with tracking and inference metrics
mini_vision
A simple Python library for object segmentation in video streams using YOLO models.
The goal of this library is to provide a modular pipeline to:
- consume video streams
- perform object segmentation
- render contours or masks
The library follows data-oriented design and low coupling principles, allowing easy replacement of computer vision models.
Installation
pip install mini-vision
Usage
Example showing how to use mini_vision to consume a video stream, run YOLO segmentation, and render object contours.
Import the library
from mini_vision import (
    YoloSegmenter,
    SegmentationRenderer,
    WSFrameClient,
    InferenceMetrics,
)
Components
| Component | Description |
|---|---|
| YoloSegmenter | Runs object segmentation using one or more YOLO segmentation models |
| SegmentationRenderer | Draws segmentation contours or masks on the frame |
| WSFrameClient | Connects to a WebSocket video stream and yields frames |
| InferenceMetrics | Calculates inference time, estimated FPS, and process RAM usage |
Device (CPU / GPU)
YoloSegmenter supports explicit device selection for inference.
By default, it runs on CPU, but you can manually choose the execution device.
| Device | Description |
|---|---|
| "cpu" | Runs inference on CPU (default) |
| "cuda" | Runs inference on NVIDIA GPU (faster) |
Example using CPU
segmenter = YoloSegmenter("yolov8n-seg.pt", device="cpu")
Example using GPU (CUDA)
segmenter = YoloSegmenter("yolov8n-seg.pt", device="cuda")
Note: CUDA requires a compatible NVIDIA GPU and properly installed drivers.
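A script that has to run on machines with and without a GPU can choose the device string at startup. The sketch below is one way to do that, assuming PyTorch is installed and is what reports CUDA availability. The pick_device helper is hypothetical and not part of mini_vision.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # assumption: PyTorch is installed alongside the YOLO backend
    except ImportError:
        # Without PyTorch there is nothing to query, so fall back to CPU
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)
```

The resulting string can then be passed as the device argument, for example YoloSegmenter("yolov8n-seg.pt", device=device).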
Multiple Models
YoloSegmenter supports loading more than one YOLO model at the same time.
You can pass:
- a single .pt model file
- a directory containing multiple .pt model files
When a directory is provided, the library loads all supported model files found inside it and combines detections from all loaded models into a single output.
Example using a single model
segmenter = YoloSegmenter("models/yolov8n-seg.pt", device="cpu")
detections = segmenter.segment(frame)
Example using a directory with multiple models
segmenter = YoloSegmenter("models/", device="cuda")
detections = segmenter.segment(frame)
Example directory structure:
models/
├── car-seg.pt
├── person-seg.pt
└── animal-seg.pt
This is useful when you want to combine specialized models in the same segmentation pipeline.
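To make the directory behavior concrete, here is a minimal sketch of the two steps described above: collecting the .pt files a path refers to, and concatenating the detections from each model. The find_model_files and merge_detections helpers are hypothetical. How mini_vision actually resolves paths and merges results is not documented here, so treat this as an illustration only.

```python
from pathlib import Path


def find_model_files(path: str) -> list[Path]:
    """Return the .pt files a path refers to: the file itself or a directory's contents."""
    p = Path(path)
    if p.is_dir():
        # Sorted so the load order is stable between runs
        return sorted(f for f in p.iterdir() if f.is_file() and f.suffix == ".pt")
    if p.suffix == ".pt":
        return [p]
    return []


def merge_detections(per_model: list[list[dict]]) -> list[dict]:
    """Concatenate the detection lists produced by each model into one list."""
    merged: list[dict] = []
    for detections in per_model:
        merged.extend(detections)
    return merged
```

Simple concatenation keeps every detection from every model. If two models can detect the same class, a deduplication step would be needed on top of this.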
YoloSegmenter
Runs object segmentation using one or more YOLO segmentation models.
segmenter = YoloSegmenter("yolov8n-seg.pt", device="cpu")
detections = segmenter.segment(frame)
You can also load multiple models by passing a directory:
segmenter = YoloSegmenter("models/", device="cuda")
detections = segmenter.segment(frame)
Tracking during segmentation
To populate object tracking IDs, enable tracking when calling segment(...).
detections = segmenter.segment(frame, track=True)
This allows each detection to carry a track_id, which can be used by the renderer, JSON output, or TOON output.
Inference Metrics
YoloSegmenter can optionally return inference metrics for each processed frame.
To enable metrics, pass return_metrics=True when calling segment(...).
detections, metrics = segmenter.segment(
    frame,
    return_metrics=True
)
The returned metrics include:
- inference_time_ms: model inference time in milliseconds
- fps_estimated: estimated FPS based on inference time
- ram_usage_mb: current RAM usage of the Python process in megabytes
Example output:
{
  "metrics": [
    {
      "inference_time_ms": 42.18,
      "fps_estimated": 23.71,
      "ram_usage_mb": 684.52
    }
  ]
}
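The relationship between inference time and estimated FPS can be sketched as follows. This assumes the estimate is simply 1000 divided by the inference time in milliseconds, which is consistent with the example values above but is not confirmed by the library. Both helper names are hypothetical.

```python
import time


def fps_from_inference_ms(inference_time_ms: float) -> float:
    """Estimate FPS as the number of inferences that fit in one second."""
    if inference_time_ms <= 0:
        return 0.0
    return 1000.0 / inference_time_ms


def timed_call(fn, *args):
    """Run fn(*args) and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


print(round(fps_from_inference_ms(42.18), 2))  # 23.71
```

An estimate of this kind covers model inference only. Frame decoding, rendering, and network latency would lower the end-to-end frame rate.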
Metrics can also be used together with tracking:
detections, metrics = segmenter.segment(
    frame,
    track=True,
    return_metrics=True
)
If return_metrics is not enabled, segment(...) keeps the default behavior and returns only the detections:
detections = segmenter.segment(frame)
SegmentationRenderer
Responsible for rendering segmentation contours or masks on frames.
renderer = SegmentationRenderer()
frame = renderer.draw(
    frame,
    detections,
    mode="contour",
    detect_object=["car", "person", "dog"],
    detect_color=["gold", "neon_pink", "lime"],
    thickness=2
)
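The detect_object and detect_color arguments are given as two parallel lists. A minimal sketch of how such lists can be paired, assuming they are matched by position (the library's actual matching rule is not stated here), with a hypothetical build_color_map helper:

```python
def build_color_map(detect_object: list[str], detect_color: list[str]) -> dict[str, str]:
    """Pair each class name with the color at the same list position."""
    if len(detect_object) != len(detect_color):
        raise ValueError("detect_object and detect_color must have the same length")
    return dict(zip(detect_object, detect_color))


color_map = build_color_map(
    ["car", "person", "dog"],
    ["gold", "neon_pink", "lime"],
)
print(color_map["person"])  # neon_pink
```

Checking the lengths up front avoids silently dropping a class, since zip stops at the shorter list.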
Tracking by ID
The renderer also supports object tracking visualization by ID.
To enable tracking visualization, pass track=True in renderer.draw(...).
renderer = SegmentationRenderer()
frame = renderer.draw(
    frame,
    detections,
    mode="contour",
    detect_object=["car", "person", "dog"],
    detect_color=["gold", "neon_pink", "lime"],
    thickness=2,
    track=True
)
When tracking is enabled, the rendered label can include the tracked object ID together with the class label and confidence score.
Example output on frame:
car #3 0.87
person #1 0.92
dog #5 0.81
Note: to have track_id populated in rendered labels, JSON output, or TOON output, tracking must be enabled during segmentation:
detections = segmenter.segment(frame, track=True)
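The label layout shown in the example output (class, then #ID, then score) can be reproduced with a small formatter. This is a sketch with a hypothetical format_label helper. The fallback used when no track ID is available is an assumption, not documented behavior.

```python
from typing import Optional


def format_label(label: str, score: float, track_id: Optional[int] = None) -> str:
    """Build a label such as "car #3 0.87"; the ID part is omitted when untracked."""
    parts = [label]
    if track_id is not None:
        parts.append(f"#{track_id}")
    parts.append(f"{score:.2f}")
    return " ".join(parts)


print(format_label("car", 0.87, 3))  # car #3 0.87
print(format_label("dog", 0.81))     # dog 0.81
```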
JSON Output (optional)
SegmentationRenderer can optionally return structured detection data in JSON format.
This allows integration with logging systems, APIs, analytics pipelines, or other downstream processing tools.
To enable this feature, pass return_json=True.
frame, data = renderer.draw(
    frame,
    detections,
    mode="contour",
    detect_object=["car", "person", "dog"],
    detect_color=["gold", "neon_pink", "lime"],
    thickness=2,
    return_json=True
)
Example JSON output:
{
  "detections": [
    {
      "label": "person",
      "score": 0.92,
      "track_id": 3,
      "bbox": [120, 80, 240, 300]
    },
    {
      "label": "car",
      "score": 0.88,
      "track_id": 7,
      "bbox": [400, 210, 560, 350]
    }
  ]
}
If return_json is not enabled, the renderer returns only the processed frame.
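As an illustration of downstream use, the sketch below filters the structured output by score and serializes it as one compact log line. It assumes data arrives as a Python dict shaped like the example above. Whether the library returns a dict or a JSON string is not stated here, and confident_labels is a hypothetical helper.

```python
import json

# Sample payload copied from the example JSON output
data = {
    "detections": [
        {"label": "person", "score": 0.92, "track_id": 3, "bbox": [120, 80, 240, 300]},
        {"label": "car", "score": 0.88, "track_id": 7, "bbox": [400, 210, 560, 350]},
    ]
}


def confident_labels(payload: dict, min_score: float) -> list[str]:
    """Return the labels of detections whose score reaches min_score."""
    return [d["label"] for d in payload["detections"] if d["score"] >= min_score]


# Compact separators keep each record on a single short log line
line = json.dumps(data, separators=(",", ":"))
print(confident_labels(data, 0.9))  # ['person']
```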
TOON Output
The renderer can also return detections in TOON format, a lightweight text representation designed for agent pipelines, logging, and token-efficient LLM processing.
Enable it using return_toon=True.
frame, toon = renderer.draw(
    frame,
    detections,
    return_toon=True
)
Example TOON output:
frame_width=1280 frame_height=720
label=person track_id=3 x=412 y=210 w=120 h=260 cx=472 cy=340 area=31200 score=0.91
label=car track_id=7 x=102 y=320 w=180 h=90 cx=192 cy=365 area=16200 score=0.88
If tracking is enabled, TOON output can include the tracked object ID through the track_id field.
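The example output consists of space-separated key=value tokens, one record per line, so it can be read back with a few lines of Python. The parser below is a sketch based only on that example. parse_toon is a hypothetical helper, and it assumes values never contain spaces.

```python
def _convert(value: str):
    """Turn a token value into int or float when possible, else keep the string."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value


def parse_toon(text: str) -> list[dict]:
    """Parse lines of space-separated key=value tokens into dictionaries."""
    records = []
    for line in text.strip().splitlines():
        record = {}
        for token in line.split():
            key, _, value = token.partition("=")
            record[key] = _convert(value)
        records.append(record)
    return records


sample = """frame_width=1280 frame_height=720
label=person track_id=3 x=412 y=210 w=120 h=260 cx=472 cy=340 area=31200 score=0.91
label=car track_id=7 x=102 y=320 w=180 h=90 cx=192 cy=365 area=16200 score=0.88"""

header, *objects = parse_toon(sample)
print(header["frame_width"], [o["label"] for o in objects])
```

In the sample values, cx and cy equal x + w/2 and y + h/2, and area equals w * h, which suggests that x and y mark the top-left corner of the box.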
WSFrameClient
Connects to a WebSocket video stream and yields frames.
import asyncio

client = WSFrameClient("ws://127.0.0.1:8000/ws/frames")


async def main():
    # "async for" is only valid inside a coroutine
    async for frame in client.frames():
        detections = segmenter.segment(frame, track=True)
        frame = renderer.draw(
            frame,
            detections,
            mode="contour",
            detect_object=["car", "person", "dog"],
            detect_color=["gold", "neon_pink", "lime"],
            thickness=2,
            track=True
        )


asyncio.run(main())
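The consumption pattern can be tried without a running WebSocket server. In the sketch below, a stand-in async generator plays the role of the frame stream. fake_frames and consume are hypothetical names, and integers stand in for image frames.

```python
import asyncio


async def fake_frames(count: int):
    """Stand-in for a frame stream: yields integers instead of images."""
    for i in range(count):
        await asyncio.sleep(0)  # hand control back to the event loop
        yield i


async def consume(source, max_frames: int) -> list:
    """Collect frames from an async iterator, stopping after max_frames."""
    seen = []
    async for frame in source:
        seen.append(frame)
        if len(seen) >= max_frames:
            break
    return seen


processed = asyncio.run(consume(fake_frames(10), max_frames=3))
print(processed)  # [0, 1, 2]
```

Capping the number of frames is convenient for tests and benchmarks, since a live stream would otherwise keep the loop running indefinitely.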
Full Example
import asyncio

from mini_vision import (
    YoloSegmenter,
    SegmentationRenderer,
    WSFrameClient
)

segmenter = YoloSegmenter("models/", device="cuda")
renderer = SegmentationRenderer()
client = WSFrameClient("ws://127.0.0.1:8000/ws/frames")


async def main():
    # "async for" is only valid inside a coroutine
    async for frame in client.frames():
        detections, metrics = segmenter.segment(
            frame,
            track=True,
            return_metrics=True
        )
        frame = renderer.draw(
            frame,
            detections,
            mode="contour",
            detect_object=["car", "person", "dog"],
            detect_color=["gold", "neon_pink", "lime"],
            thickness=2,
            track=True
        )
        print(metrics)


asyncio.run(main())
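Printing metrics for every frame gets noisy on long streams, so it can help to collect them and report a summary instead. The sketch below assumes each per-frame record is a flat dict with the three documented keys (the example output wraps such records in a "metrics" list). summarize is a hypothetical helper.

```python
from statistics import mean


def summarize(history: list[dict]) -> dict:
    """Reduce per-frame metric records to a few aggregate numbers."""
    if not history:
        return {}
    return {
        "frames": len(history),
        "mean_inference_time_ms": round(mean(m["inference_time_ms"] for m in history), 2),
        "mean_fps_estimated": round(mean(m["fps_estimated"] for m in history), 2),
        "peak_ram_usage_mb": max(m["ram_usage_mb"] for m in history),
    }


history = [
    {"inference_time_ms": 40.0, "fps_estimated": 25.0, "ram_usage_mb": 680.0},
    {"inference_time_ms": 44.0, "fps_estimated": 23.0, "ram_usage_mb": 690.0},
]
print(summarize(history))
```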