YOLO segmentation utilities for video streams

mini_vision

mini_vision is a small Python library for object segmentation in video streams using YOLO models.

The goal of this library is to provide a modular pipeline to:

  • consume video streams
  • perform object segmentation
  • render contours or masks

The library follows data-oriented design and low coupling principles, allowing easy replacement of computer vision models.
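
One way to picture the low-coupling goal is a small structural interface that any model wrapper can satisfy. The sketch below is an illustration only: `Segmenter`, `DummySegmenter`, and `run_once` are hypothetical names that are not part of mini_vision, and the only thing taken from this README is the `segment(frame)` call shape.

```python
from typing import Any, List, Protocol, runtime_checkable


@runtime_checkable
class Segmenter(Protocol):
    """Hypothetical interface: anything exposing segment(frame) fits the pipeline."""

    def segment(self, frame: Any) -> List[Any]:
        ...


class DummySegmenter:
    """Stand-in model that never detects anything; handy for wiring tests."""

    def segment(self, frame: Any) -> List[Any]:
        return []


def run_once(segmenter: Segmenter, frame: Any) -> List[Any]:
    # The caller depends only on the interface, not on a concrete YOLO class.
    return segmenter.segment(frame)
```

With an interface like this, swapping the model means passing a different object to `run_once`; the rest of the pipeline stays unchanged.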


Installation

pip install mini-vision

Usage

The examples in this section show how to use mini_vision to consume a video stream, run YOLO segmentation, and render object contours.

Import the library

from mini_vision import (
    YoloSegmenter,
    SegmentationRenderer,
    WSFrameClient
)

Components

Component              Description
YoloSegmenter          Runs object segmentation using one or more YOLO segmentation models
SegmentationRenderer   Draws segmentation contours or masks on the frame
WSFrameClient          Connects to a WebSocket video stream and yields frames

Device (CPU / GPU)

YoloSegmenter supports explicit device selection for inference.

By default, it runs on CPU, but you can manually choose the execution device.

Device    Description
"cpu"     Runs inference on CPU (default)
"cuda"    Runs inference on NVIDIA GPU (faster)

Example using CPU

segmenter = YoloSegmenter("yolov8n-seg.pt", device="cpu")

Example using GPU (CUDA)

segmenter = YoloSegmenter("yolov8n-seg.pt", device="cuda")

Note: CUDA requires a compatible NVIDIA GPU and properly installed drivers.
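
If the same script has to run on machines with and without a GPU, the device string can be chosen at startup. The sketch below assumes the YOLO models are backed by PyTorch (the usual case for `.pt` files, but not stated in this README); `pick_device` is a hypothetical helper, not part of mini_vision.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # assumption: PyTorch is the inference backend
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
# The result can then be passed through unchanged, for example:
# segmenter = YoloSegmenter("yolov8n-seg.pt", device=device)
```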


Multiple Models

YoloSegmenter supports loading more than one YOLO model at the same time.

You can pass:

  • a single .pt model file
  • or a directory containing multiple .pt model files

When a directory is provided, the library loads all supported model files found inside it and combines detections from all loaded models into a single output.

Example using a single model

segmenter = YoloSegmenter("models/yolov8n-seg.pt", device="cpu")
detections = segmenter.segment(frame)

Example using a directory with multiple models

segmenter = YoloSegmenter("models/", device="cuda")
detections = segmenter.segment(frame)

Example directory structure:

models/
├── car-seg.pt
├── person-seg.pt
└── animal-seg.pt

This is useful when you want to combine specialized models in the same segmentation pipeline.
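
Before pointing YoloSegmenter at a directory, it can help to check which files it is likely to pick up. The sketch below is an assumption about the selection rule (every file ending in `.pt`, non-recursive); the library's actual rule may differ, and `list_model_files` is a hypothetical helper.

```python
from pathlib import Path
from typing import List


def list_model_files(path: str) -> List[Path]:
    """Return the .pt files a path refers to: the file itself, or a directory's .pt files."""
    target = Path(path)
    if target.is_file():
        return [target]
    # Sorted so the load order is predictable across runs and platforms.
    return sorted(f for f in target.iterdir() if f.is_file() and f.suffix == ".pt")
```

For the directory structure shown above, this would list `animal-seg.pt`, `car-seg.pt`, and `person-seg.pt`.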


YoloSegmenter

Runs object segmentation using one or more YOLO segmentation models.

segmenter = YoloSegmenter("yolov8n-seg.pt", device="cpu")
detections = segmenter.segment(frame)

You can also load multiple models by passing a directory:

segmenter = YoloSegmenter("models/", device="cuda")
detections = segmenter.segment(frame)

Tracking during segmentation

To populate object tracking IDs, enable tracking when calling segment(...).

detections = segmenter.segment(frame, track=True)

This allows each detection to carry a track_id, which can be used by the renderer, JSON output, or TOON output.
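
A common use of track IDs is counting distinct objects across frames instead of counting detections per frame. The sketch below uses a hypothetical `Detection` dataclass as a stand-in, because this README does not document the detection type; the `label` and `track_id` attribute names are assumptions borrowed from the output examples.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set


@dataclass
class Detection:
    """Hypothetical stand-in for a mini_vision detection."""
    label: str
    track_id: Optional[int] = None


class TrackCounter:
    """Counts distinct track IDs seen per label across many frames."""

    def __init__(self) -> None:
        self._seen: Dict[str, Set[int]] = defaultdict(set)

    def update(self, detections: Iterable[Detection]) -> None:
        for det in detections:
            # Untracked detections have no stable identity, so they are skipped.
            if det.track_id is not None:
                self._seen[det.label].add(det.track_id)

    def counts(self) -> Dict[str, int]:
        return {label: len(ids) for label, ids in self._seen.items()}
```

Calling `update(...)` once per frame and `counts()` at the end gives the number of unique objects per class, regardless of how many frames each object appeared in.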


SegmentationRenderer

Responsible for rendering segmentation contours or masks on frames.

renderer = SegmentationRenderer()
frame = renderer.draw(
    frame,
    detections,
    mode="contour",
    detect_object=["car", "person", "dog"],
    detect_color=["gold", "neon_pink", "lime"],
    thickness=2
)

Tracking by ID

The renderer also supports object tracking visualization by ID.

To enable tracking visualization, pass track=True in renderer.draw(...).

renderer = SegmentationRenderer()
frame = renderer.draw(
    frame,
    detections,
    mode="contour",
    detect_object=["car", "person", "dog"],
    detect_color=["gold", "neon_pink", "lime"],
    thickness=2,
    track=True
)

When tracking is enabled, the rendered label can include the tracked object ID together with the class label and confidence score.

Example output on frame:

car #3 0.87
person #1 0.92
dog #5 0.81
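
The label layout above can be reproduced outside the renderer, for example when writing the same text to a log. This is a minimal sketch inferred from the example output only; `format_label` is a hypothetical helper, and the untracked form (label and score without an ID) is an assumption.

```python
from typing import Optional


def format_label(label: str, score: float, track_id: Optional[int] = None) -> str:
    """Build a label such as "car #3 0.87"; the ID part is omitted when untracked."""
    if track_id is None:
        return f"{label} {score:.2f}"
    return f"{label} #{track_id} {score:.2f}"


print(format_label("car", 0.87, track_id=3))  # prints "car #3 0.87"
```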

Note: to have track_id populated in rendered labels, JSON output, or TOON output, tracking must be enabled during segmentation:

detections = segmenter.segment(frame, track=True)

JSON Output (optional)

SegmentationRenderer can optionally return structured detection data in JSON format.

This allows integration with logging systems, APIs, analytics pipelines, or other downstream processing tools.

To enable this feature, pass return_json=True.

frame, data = renderer.draw(
    frame,
    detections,
    mode="contour",
    detect_object=["car", "person", "dog"],
    detect_color=["gold", "neon_pink", "lime"],
    thickness=2,
    return_json=True
)

Example JSON output:

{
  "detections": [
    {
      "label": "person",
      "score": 0.92,
      "track_id": 3,
      "bbox": [120, 80, 240, 300]
    },
    {
      "label": "car",
      "score": 0.88,
      "track_id": 7,
      "bbox": [400, 210, 560, 350]
    }
  ]
}

If return_json is not enabled, the renderer returns only the processed frame.
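
Downstream code usually filters this data before logging it or sending it on. The sketch below assumes the structure shown in the example JSON output; because this README does not say whether `data` arrives as a dict or as a JSON string, the hypothetical helper `high_confidence` accepts both.

```python
import json
from typing import Any, Dict, Iterable, List, Union


def high_confidence(data: Union[str, Dict[str, Any]], min_score: float = 0.9) -> List[Dict[str, Any]]:
    """Keep only detections whose score is at least min_score."""
    payload = json.loads(data) if isinstance(data, str) else data
    return [d for d in payload.get("detections", []) if d.get("score", 0.0) >= min_score]


def to_json_lines(detections: Iterable[Dict[str, Any]]) -> str:
    """Serialize one detection per line, a convenient shape for log files."""
    return "\n".join(json.dumps(d, sort_keys=True) for d in detections)


example = {
    "detections": [
        {"label": "person", "score": 0.92, "track_id": 3, "bbox": [120, 80, 240, 300]},
        {"label": "car", "score": 0.88, "track_id": 7, "bbox": [400, 210, 560, 350]},
    ]
}
kept = high_confidence(example)  # only the "person" entry passes the 0.9 threshold
```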


TOON Output

The renderer can also return detections in TOON format, a lightweight text representation designed for agent pipelines, logging, and token-efficient LLM processing.

Enable it using return_toon=True.

frame, toon = renderer.draw(
    frame,
    detections,
    return_toon=True
)

Example TOON output:

frame_width=1280 frame_height=720
label=person track_id=3 x=412 y=210 w=120 h=260 cx=472 cy=340 area=31200 score=0.91
label=car track_id=7 x=102 y=320 w=180 h=90 cx=192 cy=365 area=16200 score=0.88

If tracking is enabled, TOON output can include the tracked object ID through the track_id field.
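
TOON text is easy to turn back into Python values when a downstream step needs numbers rather than strings. The parser below is a sketch based only on the example output: it assumes space-separated `key=value` pairs, a first line that describes the frame, and values that contain no spaces. `parse_toon` is a hypothetical helper, not part of mini_vision.

```python
from typing import Dict, List, Tuple, Union

Value = Union[int, float, str]


def _convert(text: str) -> Value:
    """Try int, then float, and fall back to the raw string."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def parse_toon(text: str) -> Tuple[Dict[str, Value], List[Dict[str, Value]]]:
    """Split TOON text into (frame header, list of detection records)."""
    records: List[Dict[str, Value]] = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        pairs = (pair.split("=", 1) for pair in line.split())
        records.append({key: _convert(value) for key, value in pairs})
    if not records:
        return {}, []
    return records[0], records[1:]
```

Applied to the example above, the header becomes `{"frame_width": 1280, "frame_height": 720}` and each following line becomes one dict with numeric fields.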


WSFrameClient

Connects to a WebSocket video stream and yields frames.

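import asyncio

# segmenter and renderer are created as shown in the sections above
client = WSFrameClient("ws://127.0.0.1:8000/ws/frames")

async def main():
    # `async for` is only valid inside a coroutine, so the loop lives in main()
    async for frame in client.frames():
        # Enable tracking so each detection carries a track_id
        detections = segmenter.segment(frame, track=True)

        frame = renderer.draw(
            frame,
            detections,
            mode="contour",
            detect_object=["car", "person", "dog"],
            detect_color=["gold", "neon_pink", "lime"],
            thickness=2,
            track=True
        )

asyncio.run(main())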

Full Example

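import asyncio

from mini_vision import (
    YoloSegmenter,
    SegmentationRenderer,
    WSFrameClient
)

# Load every model in models/ and run inference on the GPU
segmenter = YoloSegmenter("models/", device="cuda")
renderer = SegmentationRenderer()
client = WSFrameClient("ws://127.0.0.1:8000/ws/frames")

async def main():
    # `async for` is only valid inside a coroutine, so the loop lives in main()
    async for frame in client.frames():
        # Enable tracking so each detection carries a track_id
        detections = segmenter.segment(frame, track=True)

        # Draw contours and tracked labels for the selected classes
        frame = renderer.draw(
            frame,
            detections,
            mode="contour",
            detect_object=["car", "person", "dog"],
            detect_color=["gold", "neon_pink", "lime"],
            thickness=2,
            track=True
        )

asyncio.run(main())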
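
The streaming loop can be exercised without a WebSocket server or a model by substituting a stub frame source. The sketch below is illustrative only: `fake_frames` and `process_stream` are hypothetical names, the stub yields integers instead of images, and `handle` stands in for the segment-and-draw step.

```python
import asyncio
from typing import Any, AsyncIterator, Callable, List


async def fake_frames(count: int) -> AsyncIterator[int]:
    """Hypothetical stand-in for client.frames(): yields integers, not images."""
    for index in range(count):
        await asyncio.sleep(0)  # hand control back to the event loop, as real I/O would
        yield index


async def process_stream(
    frames: AsyncIterator[Any],
    handle: Callable[[Any], Any],
    max_frames: int,
) -> List[Any]:
    """Apply handle to each frame and stop after max_frames results."""
    results: List[Any] = []
    async for frame in frames:
        if len(results) >= max_frames:
            break
        results.append(handle(frame))
    return results


results = asyncio.run(process_stream(fake_frames(10), lambda f: f * 2, max_frames=3))
print(results)  # prints [0, 2, 4]
```

Keeping the loop in a function that takes the frame source as a parameter makes it possible to pass `client.frames()` in production and a stub in tests.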
