VLM Engine
Advanced Vision-Language Model Engine for content tagging
A high-performance Python package for Vision-Language Model (VLM) based content tagging and analysis. This package provides an advanced implementation for automatic content detection and tagging, delivering superior accuracy compared to traditional image classification methods.
Features
- Remote VLM Integration: Connects to any OpenAI-compatible VLM endpoint (no local model loading required)
- Context-Aware Detection: Leverages Vision-Language Models' understanding of visual relationships for accurate content tagging
- Flexible Architecture: Modular pipeline system with configurable models and processing stages
- Asynchronous Processing: Built on asyncio for efficient video and image processing
- Customizable Tag Sets: Easy configuration of detection categories
- Production Ready: Includes retry logic, error handling, and comprehensive logging
Installation
From PyPI

```bash
pip install vlm-engine
```

From Source

```bash
git clone https://github.com/Haven-hvn/haven-vlm-engine-package.git
cd haven-vlm-engine-package
pip install -e .
```
Requirements
- Python 3.8+
- Sufficient RAM: video preprocessing loads entire videos into memory (not GPU memory)
- A compatible VLM server endpoint (a quick connectivity check is sketched below):
  - Remote OpenAI-compatible API (recommended)
  - Local server using LM Studio
  - Haven's custom VLM, available at https://havenmodels.orbiter.website/
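To confirm an endpoint is reachable before wiring it into the engine, a plain chat-completions request is enough. Here is a minimal sketch using the openai client; the base URL, API key, and model name are placeholders for your own deployment:

```python
# Connectivity check for an OpenAI-compatible VLM endpoint.
# base_url, api_key, and model below are placeholders, not package defaults.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:7045/v1", api_key="not-needed")
response = client.chat.completions.create(
    model="HuggingFaceTB/SmolVLM-Instruct",
    messages=[{"role": "user", "content": "Reply with OK if you can read this."}],
    max_tokens=8,
)
print(response.choices[0].message.content)
```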
Quick Start
```python
import asyncio
from vlm_engine import VLMEngine
from vlm_engine.config_models import EngineConfig, ModelConfig

# Configure the engine
config = EngineConfig(
    active_ai_models=["vlm_nsfw_model"],
    models={
        "vlm_nsfw_model": ModelConfig(
            type="vlm_model",
            model_id="HuggingFaceTB/SmolVLM-Instruct",
            api_base_url="http://localhost:7045",
            tag_list=["tag1", "tag2", "tag3"],  # Your custom tags
        )
    },
)

# Initialize and use
async def main():
    engine = VLMEngine(config)
    await engine.initialize()
    results = await engine.process_video(
        "path/to/video.mp4",
        frame_interval=2.0,
        threshold=0.5,
    )
    print(f"Detected tags: {results}")

asyncio.run(main())
```
Multiplexer Configuration (Load Balancing)
For high-performance deployments, you can configure multiple VLM endpoints with automatic load balancing:
```python
from vlm_engine.config_models import EngineConfig, ModelConfig

config = EngineConfig(
    active_ai_models=["vlm_multiplexer_model"],
    models={
        "vlm_multiplexer_model": ModelConfig(
            type="vlm_model",
            model_id="HuggingFaceTB/SmolVLM-Instruct",
            use_multiplexer=True,  # Enable multiplexer mode
            multiplexer_endpoints=[
                {
                    "base_url": "http://server1:7045/v1",
                    "api_key": "",
                    "name": "primary-server",
                    "weight": 5,  # Higher weight = more requests
                    "is_fallback": False,
                },
                {
                    "base_url": "http://server2:7045/v1",
                    "api_key": "",
                    "name": "secondary-server",
                    "weight": 3,
                    "is_fallback": False,
                },
                {
                    "base_url": "http://backup:7045/v1",
                    "api_key": "",
                    "name": "backup-server",
                    "weight": 1,
                    "is_fallback": True,  # Used only when primaries fail
                },
            ],
            tag_list=["tag1", "tag2", "tag3"],
        )
    },
)
```
Architecture
Core Components
- VLMEngine: Main entry point for the package
  - Manages model initialization and pipeline execution
  - Handles asynchronous processing of videos and images
- VLMClient: OpenAI-compatible API client with multiplexer support
  - Supports any VLM with a chat completions endpoint
  - Load balancing across multiple endpoints using multiplexer-llm
  - Automatic failover for high availability
  - Retry logic with exponential backoff and jitter (sketched below)
  - Handles image encoding and prompt formatting
- Pipeline System: Flexible processing pipeline
  - Modular design allows custom processing stages
  - Built-in support for preprocessing, analysis, and postprocessing
  - Configurable through YAML or Python objects
- Model Management: Dynamic model loading
  - Supports multiple model types (VLM, preprocessors, postprocessors)
  - Lazy loading for efficient resource usage
  - Thread-safe model access
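The retry behavior mentioned above follows the usual exponential-backoff-with-jitter pattern. Here is a minimal standalone sketch of that pattern; the function name, attempt count, and delay bounds are illustrative, not the package's internals:

```python
import asyncio
import random

async def call_with_retries(make_request, max_attempts=5, base_delay=1.0, max_delay=30.0):
    """Retry an async callable with exponential backoff and full jitter (illustrative)."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await make_request()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Double the delay cap on each attempt, then sleep a random slice of it.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            await asyncio.sleep(random.uniform(0, delay))
```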
Configuration
Basic Configuration
```python
from vlm_engine.config_models import EngineConfig, ModelConfig, PipelineConfig

config = EngineConfig(
    active_ai_models=["my_vlm_model"],
    models={
        "my_vlm_model": ModelConfig(
            type="vlm_model",
            model_id="model-name",
            api_base_url="http://localhost:8000",
            tag_list=["action1", "action2", "action3"],
            max_new_tokens=128,
            request_timeout=70,
            vlm_detected_tag_confidence=0.99,
        )
    },
    pipelines={
        "video_pipeline": PipelineConfig(
            inputs=["video_path", "frame_interval"],
            output="results",
            models=[{"name": "my_vlm_model", "inputs": ["frame"], "outputs": "tags"}],
        )
    },
)
```
Multiplexer Benefits
- Load Balancing: Distribute requests across multiple VLM endpoints based on configurable weights (sketched below)
- High Availability: Automatic failover to backup endpoints when primary endpoints fail
- Improved Performance: Parallel processing across multiple servers for higher throughput
- Seamless Integration: Drop-in replacement for single endpoint configurations
- Flexible Configuration: Mix of primary and fallback endpoints with custom weights
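How weights translate into traffic can be pictured as weighted random sampling. This is an illustrative sketch, not the internals of multiplexer-llm:

```python
import random

# Illustrative endpoints: with weights 5 and 3, the first receives
# about 5/8 of requests and the second about 3/8 over time.
endpoints = [
    {"name": "primary-server", "weight": 5},
    {"name": "secondary-server", "weight": 3},
]

def pick_endpoint():
    # Sample one endpoint in proportion to its weight.
    weights = [e["weight"] for e in endpoints]
    return random.choices(endpoints, weights=weights, k=1)[0]
```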
Advanced Configuration
The package supports complex configurations including:
- Multiple models in a pipeline
- Custom preprocessing and postprocessing stages
- Category-specific settings (thresholds, durations, etc.)
- Batch processing configurations
See the examples directory for detailed configuration examples.
For comprehensive multiplexer setup and configuration, see MULTIPLEXER_INTEGRATION.md.
API Reference
VLMEngine
```python
class VLMEngine:
    def __init__(self, config: EngineConfig)
    async def initialize()
    async def process_video(video_path: str, **kwargs) -> Dict[str, Any]
```
Processing Parameters
- video_path: Path to the video file
- frame_interval: Seconds between frame samples (default: 0.5)
- threshold: Confidence threshold for tag detection (default: 0.5)
- return_timestamps: Include timestamp information (default: True)
- return_confidence: Include confidence scores (default: True)
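Putting these parameters together in a call (the path and values are placeholders; run inside an async function with an initialized engine):

```python
results = await engine.process_video(
    "path/to/video.mp4",
    frame_interval=0.5,      # sample one frame every 0.5 seconds
    threshold=0.5,           # minimum confidence for a tag to be reported
    return_timestamps=True,  # include when each tag was detected
    return_confidence=True,  # include per-tag confidence scores
)
```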
Performance Optimization
Memory Requirements
- Important: Video preprocessing loads the entire video into system RAM (not GPU memory)
- Ensure sufficient RAM for your video sizes (e.g., a 1GB video may require 4-8GB of available RAM)
- Consider processing videos in segments for very large files (one approach is sketched below)
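One way to keep memory bounded is to split the file with ffmpeg and feed the engine one segment at a time. A sketch assuming ffmpeg is on your PATH; the file names and 300-second segment length are arbitrary:

```python
import glob
import subprocess

# Split into ~300-second chunks without re-encoding.
subprocess.run(
    ["ffmpeg", "-i", "large_video.mp4", "-c", "copy", "-map", "0",
     "-f", "segment", "-segment_time", "300", "segment_%03d.mp4"],
    check=True,
)

async def tag_segments(engine):
    # Process chunks one at a time so only one video is held in RAM.
    results = {}
    for segment in sorted(glob.glob("segment_*.mp4")):
        results[segment] = await engine.process_video(segment, frame_interval=2.0)
    return results
```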
API Optimization
- Configure retry settings based on your VLM server's capacity
- Adjust max_new_tokens to balance speed vs. accuracy
- Use an appropriate frame_interval to reduce processing time and API calls
Processing Speed
- Increase frame_interval to sample fewer frames (faster but less accurate)
- Use batch processing when your VLM endpoint supports it
- Consider running multiple VLM instances for parallel processing (a concurrency sketch follows)
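With multiple endpoints (or the multiplexer above) absorbing the load, several videos can be tagged concurrently from one engine. A sketch that assumes process_video is safe to call concurrently:

```python
import asyncio

async def tag_many(engine, paths):
    # One task per video; results come back in the same order as paths.
    tasks = [engine.process_video(path, frame_interval=2.0) for path in paths]
    return await asyncio.gather(*tasks)
```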
Extending the Package
Custom Models
Create custom model classes by inheriting from the base Model class:
```python
from vlm_engine.models import Model

class CustomModel(Model):
    async def process(self, inputs):
        # Your custom processing logic goes here
        results = {"tags": []}  # placeholder output
        return results
```
Custom Pipelines
Define custom pipelines for specific use cases:
```python
custom_pipeline = PipelineConfig(
    inputs=["image_path"],
    output="analysis",
    models=[
        {"name": "preprocessor", "inputs": ["image_path"], "outputs": "processed_image"},
        {"name": "analyzer", "inputs": ["processed_image"], "outputs": "analysis"},
    ],
)
```
Troubleshooting
Common Issues
- Connection Errors
  - Ensure your VLM server is running and accessible
  - Check the api_base_url configuration
  - Verify firewall settings
- GPU Memory Errors
  - Reduce batch size or frame interval
  - Ensure proper CUDA installation
  - Check GPU memory availability
- Slow Processing
  - Increase frame interval for faster processing
  - Use GPU acceleration if available
  - Optimize VLM server settings
Logging
Enable debug logging for troubleshooting:
```python
import logging

logging.basicConfig(level=logging.DEBUG)
```
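To keep the output focused, you can quiet everything else and leave only the engine verbose. This assumes the package logs under the vlm_engine logger name, which is the usual Python convention but an assumption here:

```python
import logging

logging.basicConfig(level=logging.WARNING)               # keep third-party libraries quiet
logging.getLogger("vlm_engine").setLevel(logging.DEBUG)  # verbose engine logs only (assumed logger name)
```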
Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
Development Setup
```bash
git clone https://github.com/Haven-hvn/haven-vlm-engine-package.git
cd haven-vlm-engine-package
pip install -e ".[dev]"
```
Running Tests
```bash
pytest tests/
```
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Built on top of modern Python async patterns
- Inspired by production ML serving architectures
- Haven's custom VLM models are trained using SmolVLM-Finetune; model downloads at https://havenmodels.orbiter.website/
- Designed for integration with OpenAI-compatible VLM endpoints
Support
For issues and feature requests, please use the GitHub issue tracker.
For questions and discussions, join our community:
- Discord: Link to Discord
Note: This package requires an OpenAI-compatible VLM endpoint. Options include:
Remote Services
- Any OpenAI-compatible API endpoint
- Akash deployment - https://github.com/Haven-hvn/haven-inference
Local Setup
- LM Studio - Easy local VLM hosting with OpenAI-compatible API
The package does not load VLM models directly - it communicates with external VLM services via API.