
VLM Engine

A high-performance Python package for Vision-Language Model (VLM) based content tagging and analysis. This package provides an advanced implementation for automatic content detection and tagging, delivering superior accuracy compared to traditional image classification methods.

Features

  • Remote VLM Integration: Connects to any OpenAI-compatible VLM endpoint (no local model loading required)
  • Context-Aware Detection: Leverages Vision-Language Models' understanding of visual relationships for accurate content tagging
  • Flexible Architecture: Modular pipeline system with configurable models and processing stages
  • Asynchronous Processing: Built on asyncio for efficient video and image processing
  • Customizable Tag Sets: Easy configuration of detection categories
  • Production Ready: Includes retry logic, error handling, and comprehensive logging

Documentation

  • USER_GUIDE.md - Comprehensive configuration guide with detailed parameter descriptions, examples, and best practices
  • examples/ - Working code examples for various use cases
  • MULTIPLEXER_INTEGRATION.md - Detailed multiplexer setup and configuration


Installation

From PyPI

pip install vlm-engine

From Source

git clone https://github.com/Haven-hvn/haven-vlm-engine-package.git
cd haven-vlm-engine-package
pip install -e .

Requirements

  • Python 3.8+
  • Sufficient RAM: Video preprocessing loads entire videos into memory (not GPU memory)
  • Compatible VLM server endpoint:
    • Remote OpenAI-compatible API (recommended)
    • Local server using LM Studio

Quick Start

import asyncio
from vlm_engine import VLMEngine
from vlm_engine.config_models import EngineConfig, ModelConfig

# Configure the engine
config = EngineConfig(
    active_ai_models=["llm_vlm_model"],
    models={
        "llm_vlm_model": ModelConfig(
            type="vlm_model",
            model_id="HuggingFaceTB/SmolVLM-Instruct",
            api_base_url="http://localhost:7045",
            tag_list=["tag1", "tag2", "tag3"]  # Your custom tags
        )
    }
)

# Initialize and use
async def main():
    engine = VLMEngine(config)
    await engine.initialize()

    results = await engine.process_video(
        "path/to/video.mp4",
        frame_interval=2.0,
        threshold=0.5
    )
    print(f"Detected tags: {results}")

asyncio.run(main())

For more detailed configuration options, parameter descriptions, and best practices, see the USER_GUIDE.md.

Multiplexer Configuration (Load Balancing)

For high-performance deployments, you can configure multiple VLM endpoints with automatic load balancing:

from vlm_engine.config_models import EngineConfig, ModelConfig

config = EngineConfig(
    active_ai_models=["vlm_multiplexer_model"],
    models={
        "vlm_multiplexer_model": ModelConfig(
            type="vlm_model",
            model_id="HuggingFaceTB/SmolVLM-Instruct",
            use_multiplexer=True,  # Enable multiplexer mode
            multiplexer_endpoints=[
                {
                    "base_url": "http://server1:7045/v1",
                    "api_key": "",
                    "name": "primary-server",
                    "weight": 5,  # Higher weight = more requests
                    "is_fallback": False
                },
                {
                    "base_url": "http://server2:7045/v1",
                    "api_key": "",
                    "name": "secondary-server",
                    "weight": 3,
                    "is_fallback": False
                },
                {
                    "base_url": "http://backup:7045/v1",
                    "api_key": "",
                    "name": "backup-server",
                    "weight": 1,
                    "is_fallback": True  # Used only when primaries fail
                }
            ],
            tag_list=["tag1", "tag2", "tag3"]
        )
    }
)

Architecture

Core Components

  1. VLMEngine: Main entry point for the package

    • Manages model initialization and pipeline execution
    • Handles asynchronous processing of videos and images
  2. VLMClient: OpenAI-compatible API client with multiplexer support

    • Supports any VLM with chat completions endpoint
    • Load balancing across multiple endpoints using multiplexer-llm
    • Automatic failover for high availability
    • Includes retry logic with exponential backoff and jitter
    • Handles image encoding and prompt formatting
  3. Pipeline System: Flexible processing pipeline

    • Modular design allows custom processing stages
    • Built-in support for preprocessing, analysis, and postprocessing
    • Configurable through YAML or Python objects
  4. Model Management: Dynamic model loading

    • Supports multiple model types (VLM, preprocessors, postprocessors)
    • Lazy loading for efficient resource usage
    • Thread-safe model access

For detailed architecture information and component interactions, see USER_GUIDE.md.

Configuration

The VLM Engine uses four main configuration classes:

  1. EngineConfig - Global engine settings and behavior
  2. PipelineConfig - Defines processing workflows
  3. ModelConfig - Configures individual AI models and processors
  4. PipelineModelConfig - Defines how models integrate into pipelines

For detailed parameter descriptions, configuration examples, and best practices, see USER_GUIDE.md.

Basic Configuration

from vlm_engine.config_models import EngineConfig, ModelConfig, PipelineConfig, PipelineModelConfig

config = EngineConfig(
    active_ai_models=["my_vlm_model"],
    models={
        "my_vlm_model": ModelConfig(
            type="vlm_model",
            model_id="model-name",
            api_base_url="http://localhost:8000",
            tag_list=["action1", "action2", "action3"],
            max_batch_size=5,
            instance_count=3,
            model_return_confidence=True
        )
    },
    pipelines={
        "video_pipeline": PipelineConfig(
            inputs=["video_path", "frame_interval"],
            output="results",
            version=1.0,
            models=[
                PipelineModelConfig(
                    name="my_vlm_model",
                    inputs=["video_path"],
                    outputs=["results"]
                )
            ]
        )
    }
)

Multiplexer Configuration

For high-performance deployments with load balancing:

from vlm_engine.config_models import EngineConfig, ModelConfig

config = EngineConfig(
    active_ai_models=["vlm_multiplexer_model"],
    models={
        "vlm_multiplexer_model": ModelConfig(
            type="vlm_model",
            model_id="model-name",
            use_multiplexer=True,
            multiplexer_endpoints=[
                {
                    "api_base_url": "http://server1:7045/v1",
                    "model_id": "model-name",
                    "weight": 5
                },
                {
                    "api_base_url": "http://server2:7045/v1",
                    "model_id": "model-name",
                    "weight": 3
                }
            ],
            tag_list=["tag1", "tag2", "tag3"]
        )
    }
)

Advanced Configuration

The package supports complex configurations including:

  • Multiple models in a pipeline
  • Custom preprocessing and postprocessing stages
  • Category-specific settings (thresholds, durations, etc.)
  • Batch processing configurations
  • Category filtering and transformation rules

See the examples directory for detailed configuration examples.

For comprehensive configuration details, parameter descriptions, and best practices, see USER_GUIDE.md.

API Reference

VLMEngine

class VLMEngine:
    def __init__(self, config: EngineConfig)
    async def initialize(self)
    async def process_video(self, video_path: str, **kwargs) -> Dict[str, Any]

Processing Parameters

  • video_path: Path to the video file
  • frame_interval: Seconds between frame samples (default: 0.5)
  • threshold: Confidence threshold for tag detection (default: 0.5)
  • return_timestamps: Include timestamp information (default: True)
  • return_confidence: Include confidence scores (default: True)
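
Putting these parameters together, a typical call (inside an async function, as in the Quick Start) might look like this; the values are illustrative:

results = await engine.process_video(
    "path/to/video.mp4",
    frame_interval=1.0,      # sample one frame per second
    threshold=0.6,           # keep tags scored at or above 0.6
    return_timestamps=True,  # report when each tag was detected
    return_confidence=True   # report per-tag confidence scores
)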

For detailed parameter descriptions and configuration options, see USER_GUIDE.md.

Performance Optimization

Memory Requirements

  • Important: Video preprocessing loads the entire video into system RAM (not GPU memory)
  • Ensure sufficient RAM for your video sizes (e.g., a 1GB video may require 4-8GB of available RAM)
  • Consider processing videos in segments for very large files
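
One way to follow the last point is to pre-split large videos into separate files (for example with ffmpeg) and feed them to the engine one at a time, so only one segment's frames are held in RAM. A minimal sketch, assuming the segment files already exist; process_segments is a hypothetical helper:

import asyncio
from pathlib import Path
from vlm_engine import VLMEngine

async def process_segments(engine: VLMEngine, segment_dir: str) -> list:
    # Process segments sequentially so at most one video's frames
    # occupy system RAM at any time.
    results = []
    for segment in sorted(Path(segment_dir).glob("*.mp4")):
        results.append(await engine.process_video(str(segment), frame_interval=2.0))
    return results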

API Optimization

  • Configure retry settings based on your VLM server's capacity
  • Adjust max_batch_size to balance throughput vs memory usage
  • Use appropriate frame_interval to reduce processing time and API calls
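
As a concrete example, a conservative setup for a modest VLM server might look like this (field names are taken from the Basic Configuration example above; the values are illustrative, not recommendations):

from vlm_engine.config_models import ModelConfig

model = ModelConfig(
    type="vlm_model",
    model_id="model-name",
    api_base_url="http://localhost:8000",
    tag_list=["tag1", "tag2"],
    max_batch_size=2,  # smaller batches lower memory use per request
    instance_count=1   # shown in Basic Configuration; assumed to set parallel workers
)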

Processing Speed

  • Increase frame_interval to sample fewer frames (faster but less accurate)
  • Use batch processing when your VLM endpoint supports it
  • Consider running multiple VLM instances for parallel processing
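
Because process_video is a coroutine, several videos can also be processed concurrently against one engine; a minimal sketch (process_many is a hypothetical helper):

import asyncio

async def process_many(engine, paths):
    # One task per video; the configured endpoint(s) absorb the
    # concurrent API calls. Note: each in-flight video is held in
    # system RAM (see Memory Requirements above).
    tasks = [engine.process_video(p, frame_interval=2.0) for p in paths]
    return await asyncio.gather(*tasks)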

For detailed performance tuning guidelines and best practices, see USER_GUIDE.md.

Extending the Package

Custom Models

Create custom model classes by inheriting from the base Model class:

from vlm_engine.models import Model

class CustomModel(Model):
    async def process(self, inputs):
        results = {}  # your custom processing logic populates this
        return results

Custom Pipelines

Define custom pipelines for specific use cases:

from vlm_engine.config_models import PipelineConfig, PipelineModelConfig

custom_pipeline = PipelineConfig(
    inputs=["image_path"],
    output="analysis",
    models=[
        PipelineModelConfig(name="preprocessor", inputs=["image_path"], outputs=["processed_image"]),
        PipelineModelConfig(name="analyzer", inputs=["processed_image"], outputs=["analysis"])
    ]
)

For detailed information on model types, pipeline design, and best practices, see USER_GUIDE.md.

Troubleshooting

Common Issues

  1. "No valid pipelines loaded" Error

    • Cause: Configuration is missing required pipeline definitions or models
    • Solution: Ensure your EngineConfig includes:
      • At least one pipeline in the pipelines dictionary
      • Models defined in the models dictionary that are referenced by pipelines
      • Valid active_ai_models list pointing to existing model names
    • Best Practice: Use the provided haven_vlm_config.py as a reference configuration (a minimal sketch also follows this list)
  2. "Cannot import EngineConfig" Error

    • Cause: Incorrect import statement
    • Solution: Import from the correct module:
      from vlm_engine import VLMEngine  # Only VLMEngine is exposed
      from vlm_engine.config_models import EngineConfig  # Config classes are in separate module
      
  3. Connection Errors

    • Ensure your VLM server is running and accessible
    • Check the api_base_url configuration
    • Verify firewall settings
  4. GPU Memory Errors

    • Reduce batch size or frame interval
    • Ensure proper CUDA installation
    • Check GPU memory availability
  5. Slow Processing

    • Increase frame interval for faster processing
    • Use GPU acceleration if available
    • Optimize VLM server settings
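
For issue 1 above, a minimal configuration that satisfies all three checks might look like this (a sketch assembled from the Basic Configuration example; names are illustrative):

from vlm_engine.config_models import EngineConfig, ModelConfig, PipelineConfig, PipelineModelConfig

config = EngineConfig(
    active_ai_models=["my_vlm_model"],  # points at an existing model name
    models={
        "my_vlm_model": ModelConfig(    # model referenced by the pipeline
            type="vlm_model",
            model_id="model-name",
            api_base_url="http://localhost:8000",
            tag_list=["tag1"]
        )
    },
    pipelines={
        "video_pipeline": PipelineConfig(  # at least one pipeline
            inputs=["video_path", "frame_interval"],
            output="results",
            version=1.0,
            models=[
                PipelineModelConfig(
                    name="my_vlm_model",
                    inputs=["video_path"],
                    outputs=["results"]
                )
            ]
        )
    }
)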

Package Import Best Practices

What's exposed to consumers:

  • Only VLMEngine is exported via from vlm_engine import *
  • All configuration classes are in vlm_engine.config_models
  • Internal classes (Pipeline, ModelManager, etc.) are not exported

Correct usage pattern:

# ✅ CORRECT - Import what you need
from vlm_engine import VLMEngine
from vlm_engine.config_models import EngineConfig, ModelConfig, PipelineConfig

# ❌ WRONG - Don't try to import internal classes
from vlm_engine import Pipeline  # This will fail
from vlm_engine import ModelManager  # This will fail

Platform-Specific Notes

macOS Users:

  • The package uses PyAV for video processing (no decord required)
  • Video preprocessing loads entire videos into system RAM (not GPU memory)
  • Ensure sufficient RAM for your video sizes (e.g., 1GB video may require 4-8GB RAM)

Linux/Windows Users:

  • Optionally install decord for faster video decoding: pip install vlm-engine[decord]
  • PyAV is the default and works on all platforms

For detailed troubleshooting steps and validation checks, see USER_GUIDE.md.

Logging

Enable debug logging for troubleshooting:

import logging
logging.basicConfig(level=logging.DEBUG)
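
To keep third-party libraries quiet while still getting detail from the package, you can scope the debug level; this assumes the package follows the standard convention of module-named loggers:

import logging

logging.basicConfig(level=logging.INFO)
# Assumption: the package creates loggers with logging.getLogger(__name__),
# so its output lives under the "vlm_engine" namespace.
logging.getLogger("vlm_engine").setLevel(logging.DEBUG)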

Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

Development Setup

git clone https://github.com/Haven-hvn/haven-vlm-engine-package.git
cd haven-vlm-engine-package
pip install -e ".[dev]"

Running Tests

pytest tests/

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Built on top of modern Python async patterns
  • Inspired by production ML serving architectures
  • Haven's custom VLM models are trained with SmolVLM-Finetune; model downloads are available at https://havenmodels.orbiter.website/
  • Designed for integration with OpenAI-compatible VLM endpoints

Support

For issues and feature requests, please use the GitHub issue tracker.

For questions and discussions, join our community.


Note: This package requires an OpenAI-compatible VLM endpoint. Options include:

Remote Services

  • Any remote OpenAI-compatible API endpoint (recommended)

Local Setup

  • LM Studio - Easy local VLM hosting with OpenAI-compatible API

The package does not load VLM models directly; it communicates with external VLM services via API.
