
Vision module for the OpenMMLA platform.


🎥 OpenMMLA Vision


Video module of the mBox - an open multimodal learning analytics platform. For more details, please refer to the mBox System Design.

Table of Contents

  • Installation
  • Usage
  • FAQ
  • Citation
  • References
  • License

Installation

Uber Server Setup

Before setting up the video base, you need to set up a server that hosts the InfluxDB, Redis, Mosquitto, and Nginx services. Please refer to the mbox-uber module.
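
If you want to confirm that the uber server is reachable before continuing, a quick connectivity probe helps. Below is a minimal sketch that only checks TCP reachability; the hostname and the default ports (InfluxDB 8086, Redis 6379, Mosquitto 1883, Nginx 80) are assumptions, so adjust them to your deployment.

    # Minimal reachability probe for the uber server's services.
    # The hostname and ports are assumptions (service defaults); adjust to your setup.
    import socket

    UBER_HOST = "uber-server.local"  # hypothetical hostname of your uber server
    SERVICES = {
        "InfluxDB": 8086,
        "Redis": 6379,
        "Mosquitto (MQTT)": 1883,
        "Nginx": 80,
    }

    for name, port in SERVICES.items():
        try:
            with socket.create_connection((UBER_HOST, port), timeout=3):
                print(f"{name:<17} reachable at {UBER_HOST}:{port}")
        except OSError as exc:
            print(f"{name:<17} NOT reachable at {UBER_HOST}:{port} ({exc})")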

Video Base & Server Setup

  1. Clone the repository

    git clone https://github.com/ucph-ccs/mbox-video.git
    
  2. Install openmmla-vision

    Set up Conda environment
    # For Raspberry Pi
    wget "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
    bash Miniforge3-$(uname)-$(uname -m).sh
    
    # For Mac and Linux
    wget "https://repo.anaconda.com/miniconda/Miniconda3-latest-$(uname)-$(uname -m).sh"
    bash Miniconda3-latest-$(uname)-$(uname -m).sh
    
    Install Video Base
    conda create -c conda-forge -n video-base python=3.10.12 -y
    conda activate video-base
    pip install openmmla-vision[base]  # for Linux and Raspberry Pi
    pip install 'openmmla-vision[base]'  # for Mac (zsh needs the quotes around the brackets)
    
    Install Video Server

    The video server provides video frame analyzer services.

    conda create -c conda-forge -n video-server python=3.10.12 -y
    conda activate video-server
    pip install openmmla-vision[server]  # for Linux and Raspberry Pi
    pip install 'openmmla-vision[server]'  # for Mac
    
  3. Set up folder structure

    cd mbox-video
    ./reset.sh
    

Standalone Setup

If you want to run the entire mBox Video system on a single machine, follow these steps:

  1. Set up the Uber Server on your machine following the instructions in the mbox-uber module.

  2. Install openmmla-vision with all dependencies:

    conda create -c conda-forge -n mbox-video python=3.10.12 -y
    conda activate mbox-video
    pip install openmmla-vision[all]  # for Linux and Raspberry Pi
    pip install 'openmmla-vision[all]'  # for Mac
    
  3. Set up the folder structure:

    cd mbox-video
    ./reset.sh
    

This setup will allow you to run all components of mBox Video on a single machine.

Usage

Real-time Indoor Positioning

Indoor Positioning System Diagram

Multi-camera Pipeline
  1. Stream video from camera(s)

    • Distributed: stream on each camera host machine (e.g., Raspberry Pi)
    • Centralized: stream to a centralized RTMP server (e.g., MacBook, see Raspberry Pi RTMP streaming setup)
  2. Calibrate each camera's intrinsic parameters

    1. Print the chessboard image from ./camera_calib/pattern/ and stick it on a flat surface
    2. Capture chessboard images with your camera and calibrate it by running ./calib_camera.sh (a hedged OpenCV sketch of this procedure appears after this list)
  3. Synchronize the cameras' coordinate systems

    Calculate the transformation matrix between the main camera and each alternative camera (a sketch of this estimation also follows the list):

    ./sync_camera.sh [-d <num_cameras>] [-s <num_sync_managers>]
    

    Default parameter settings:

    • -d: 2 (number of cameras to sync)
    • -s: 1 (number of camera sync managers)

    Modes:

    • Centralized:
      ./sync_camera.sh -d 2 -s 1
      
    • Distributed:
      # On camera host (e.g., Raspberry Pi)
      ./sync_camera.sh -d 1 -s 0
      # On synchronizer (e.g., MacBook)
      ./sync_camera.sh -d 0 -s 1
      
  4. Run real-time indoor-positioning system

    ./run.sh [-b <num_bases>] [-s <num_synchronizers>] [-v <num_visualizers>] [-g <display_graphics>] [-r <record_frames>] [-v <store_visualizations>]
    

    Default parameter settings:

    • -b: 1 (number of video bases)
    • -s: 1 (number of video base synchronizers)
    • -v: 1 (number of visualizers)
    • -g: true (display graphics window)
    • -r: false (record video frames as images)
    • -v: false (store real-time visualizations)

    Modes:

    • Centralized:
      ./run.sh
      
    • Distributed:
      # On camera host (e.g., Raspberry Pi)
      ./run.sh -b 1 -s 0 -v 0 -g false
      # On synchronizer (e.g., MacBook)
      ./run.sh -b 0 -s 1 -v 1
      
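As a companion to step 2, here is a hedged OpenCV sketch of what chessboard-based intrinsic calibration involves. It is not the contents of ./calib_camera.sh; the 9x6 inner-corner pattern, square size, capture folder, and output file are all assumptions.

    # Illustrative intrinsic calibration from chessboard captures.
    # This is NOT ./calib_camera.sh; pattern size, square size, and paths are assumptions.
    import glob

    import cv2
    import numpy as np

    pattern_size = (9, 6)   # inner corners per chessboard row and column (assumed)
    square_size = 0.025     # chessboard square edge length in metres (assumed)

    # 3D reference coordinates of the chessboard corners in the board plane (z = 0).
    objp = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:pattern_size[0], 0:pattern_size[1]].T.reshape(-1, 2) * square_size

    obj_points, img_points, image_size = [], [], None
    for path in glob.glob("calib_images/*.jpg"):  # hypothetical capture folder
        gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
        found, corners = cv2.findChessboardCorners(gray, pattern_size)
        if not found:
            continue
        corners = cv2.cornerSubPix(
            gray, corners, (11, 11), (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
        obj_points.append(objp)
        img_points.append(corners)
        image_size = gray.shape[::-1]

    # Recover the camera matrix and distortion coefficients from all detected boards.
    rms, camera_matrix, dist_coeffs, _, _ = cv2.calibrateCamera(
        obj_points, img_points, image_size, None, None)
    print(f"RMS reprojection error: {rms:.3f}")
    np.savez("camera_intrinsics.npz", K=camera_matrix, dist=dist_coeffs)

A low RMS reprojection error (well under a pixel) is a reasonable sign that the captures covered enough of the field of view.
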
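For step 3, the synchronization boils down to a rigid transform that maps coordinates from an alternative camera's frame into the main camera's frame. The sketch below estimates such a transform from paired 3D observations of the same reference points seen by both cameras, using the Kabsch (SVD) method; it illustrates the idea rather than what ./sync_camera.sh actually runs, and the sample points are placeholders.

    # Estimate the rigid transform (R, t) that maps alt-camera coordinates into
    # main-camera coordinates from N paired 3D points (Kabsch / Procrustes method).
    # Illustrative only; this is not the implementation behind ./sync_camera.sh.
    import numpy as np

    def estimate_rigid_transform(pts_alt, pts_main):
        """pts_alt, pts_main: (N, 3) arrays of corresponding 3D points."""
        c_alt, c_main = pts_alt.mean(axis=0), pts_main.mean(axis=0)
        H = (pts_alt - c_alt).T @ (pts_main - c_main)   # 3x3 cross-covariance
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:                        # guard against a reflection
            Vt[-1, :] *= -1
            R = Vt.T @ U.T
        t = c_main - R @ c_alt
        return R, t

    # Placeholder correspondences: the same reference points expressed in each frame.
    pts_alt = np.array([[0.0, 0.0, 1.0], [0.3, 0.0, 1.0], [0.0, 0.3, 1.0], [0.3, 0.3, 1.2]])
    pts_main = np.array([[0.5, 0.1, 1.1], [0.8, 0.1, 1.1], [0.5, 0.4, 1.1], [0.8, 0.4, 1.3]])

    R, t = estimate_rigid_transform(pts_alt, pts_main)
    T = np.eye(4)                                       # 4x4 homogeneous transform
    T[:3, :3], T[:3, 3] = R, t
    print(np.round(T, 3))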

Video Frame Analyzer

Video Analyzer Pipeline
  1. Serve VLM and LLM on the video server (a sketch of querying the served models directly follows this list)

    vllm
    vllm serve openbmb/MiniCPM-V-2_6 --dtype auto --max-model-len 2048 --port 8000 --api-key token-abc123 --gpu_memory_utilization 1 --trust-remote-code --enforce-eager
    vllm serve microsoft/Phi-3-small-128k-instruct --dtype auto --max-model-len 1028 --port 8001 --api-key token-abc123 --gpu_memory_utilization 0.8 --trust-remote-code --enforce-eager 
    
    ollama

    Install Ollama from the official website.

    ollama pull llava:13b
    ollama pull llama3.1
    
  2. Configure conf/video_base.ini

    [Server]
    backend = ollama
    top_p = 0.1
    temperature = 0
    vlm_model = llava:13b
    llm_model = llama3.1
    
  3. Serve frame analyzer on video server

    cd examples/
    python serve_video_frame_analyzer.py
    
  4. Run client script on video base

    python request_video_frame_analyzer.py
    
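To sanity-check the served models independently of the analyzer scripts, you can query them directly: vLLM exposes an OpenAI-compatible API, so a request like the sketch below (reusing the port, API key, and model name from step 1; the image path, prompt, and use of the openai client are assumptions) should return a description of the frame. The analyzer's own request format may differ.

    # Hedged sketch: query the vLLM-served VLM through its OpenAI-compatible API.
    # Port, API key, and model name come from the vllm command above; the image
    # path and prompt are made up. Requires `pip install openai`.
    import base64

    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="token-abc123")

    with open("frame.jpg", "rb") as f:  # hypothetical captured video frame
        image_b64 = base64.b64encode(f.read()).decode()

    response = client.chat.completions.create(
        model="openbmb/MiniCPM-V-2_6",
        temperature=0,
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe what the people in this frame are doing."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
            ],
        }],
    )
    print(response.choices[0].message.content)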

Visualization

After running the analyzers, logs and visualizations are stored in the /logs/ and /visualizations/ folders.

The following image shows a simple demo of the video frame analyzer:

Video Frame Analyzer Demo

FAQ

Citation

If you use this code in your research, please cite the following paper:

@inproceedings{inproceedings,
  author = {Li, Zaibei and Jensen, Martin and Nolte, Alexander and Spikol, Daniel},
  year = {2024},
  month = {03},
  pages = {785-791},
  title = {Field report for Platform mBox: Designing an Open MMLA Platform},
  doi = {10.1145/3636555.3636872}
}

References

License

This project is licensed under the MIT License - see the LICENSE file for details.

