
Vision module for the OpenMMLA platform.


🎥 OpenMMLA Vision


Video module of the mBox, an open multimodal learning analytics platform. For more details, please refer to the mBox System Design.

Table of Contents

Related Modules

Installation

Usage

FAQ

Citation

References

License

Uber Server Setup

Before setting up the video base, you need to set up a server hosting the InfluxDB, Redis, Mosquitto, and Nginx services. Please refer to the mbox-uber module.
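
Once the uber server is running, you can optionally confirm that its services are reachable from the machine that will host the video base. The snippet below is a minimal sketch: the hostname uber-server.local and the default ports are assumptions, so substitute the values from your own mbox-uber setup.

    # Hedged connectivity check; replace uber-server.local with your server's hostname or IP
    redis-cli -h uber-server.local ping                                # expect: PONG
    curl -s http://uber-server.local:8086/health                       # InfluxDB health endpoint
    mosquitto_sub -h uber-server.local -t '$SYS/broker/version' -C 1   # prints the broker version and exits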

Video Base & Server Setup

  1. Clone the repository

    git clone https://github.com/ucph-ccs/mbox-video.git
    
  2. Install openmmla-vision (a quick verification sketch follows this list)

    Set up Conda environment
    # For Raspberry Pi
    wget "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
    bash Miniforge3-$(uname)-$(uname -m).sh
    
    # For Linux
    wget "https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-$(uname -m).sh"
    bash Miniconda3-latest-Linux-$(uname -m).sh
    
    # For Mac (Miniconda installers are named with "MacOSX", not the uname value "Darwin")
    wget "https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-$(uname -m).sh"
    bash Miniconda3-latest-MacOSX-$(uname -m).sh
    
    Install Video Base
    conda create -c conda-forge -n video-base python=3.10.12 -y
    conda activate video-base
    pip install openmmla-vision[base]  # for Linux and Raspberry Pi
    pip install 'openmmla-vision[base]'  # for Mac
    
    Install Video Server

    The video server provides video frame analyzer services.

    conda create -c conda-forge -n video-server python=3.10.12 -y
    conda activate video-server
    pip install openmmla-vision[server]  # for Linux and Raspberry Pi
    pip install 'openmmla-vision[server]'  # for Mac
    
  3. Set up folder structure

    cd mbox-video
    ./reset.sh
    

Standalone Setup

If you want to run the entire mBox Video system on a single machine, follow these steps:

  1. Set up the Uber Server on your machine following the instructions in the mbox-uber module.

  2. Install openmmla-vision with all dependencies:

    conda create -c conda-forge -n mbox-video python=3.10.12 -y
    conda activate mbox-video
    pip install openmmla-vision[all]  # for Linux and Raspberry Pi
    pip install 'openmmla-vision[all]'  # for Mac
    
  3. Set up the folder structure:

    cd mbox-video
    ./reset.sh
    

This setup will allow you to run all components of mBox Video on a single machine.

Usage

Real-time Indoor Positioning

[Indoor positioning system diagram]

Multi-camera Pipeline
  1. Stream video from camera(s)

    • Distributed: stream on each camera host machine (e.g. Raspberry Pi, Mac, or Linux)
    • Centralized: stream to a centralized RTMP server (e.g. client/server; see the Raspberry Pi RTMP streaming setup and the ffmpeg sketch after this list)
  2. Calibrate camera's intrinsic parameters

    1. Print chessboard image from ./camera_calib/pattern/ and stick it on a flat surface
    2. Capture chessboard image with your camera and calibrate it by running ./calib_camera.sh
  3. Synchronize multiple cameras' coordinate systems

    Calculate transformation matrix between main and alternative cameras:

    ./sync_camera.sh [-d <num_cameras>] [-s <num_sync_managers>]
    

    Default parameter settings:

    • -d: 2 (number of cameras to sync)
    • -s: 1 (number of camera sync managers)

    Modes:

    • Centralized:
      ./sync_camera.sh -d 2 -s 1
      
    • Distributed:
      # On camera host (e.g., Raspberry Pi)
      ./sync_camera.sh -d 1 -s 0
      # On synchronizer (e.g., MacBook)
      ./sync_camera.sh -d 0 -s 1
      
  4. Run real-time indoor-positioning system

    ./run.sh [-b <num_bases>] [-s <num_synchronizers>] [-v <num_visualizers>] [-g <display_graphics>] [-r <record_frames>] [-v <store_visualizations>]
    

    Default parameter settings:

    • -b: 1 (number of video bases)
    • -s: 1 (number of video base synchronizers)
    • -v: 1 (number of visualizers)
    • -g: true (display graphic window)
    • -r: false (record video frames as images)
    • -v: false (store real-time visualizations)

    Modes:

    • Centralized:
      ./run.sh
      
    • Distributed:
      # On camera host (e.g., Raspberry Pi)
      ./run.sh -b 1 -s 0 -v 0 -g false
      # On synchronizer (e.g., MacBook)
      ./run.sh -b 0 -s 1 -v 1
      

Video Frame Analyzer

[Video frame analyzer pipeline diagram]
  1. Serve VLM and LLM on video server (a hedged request check is sketched after this list)

    vllm
    vllm serve openbmb/MiniCPM-V-2_6 --dtype auto --max-model-len 2048 --port 8000 --api-key token-abc123 --gpu_memory_utilization 1 --trust-remote-code --enforce-eager
    vllm serve microsoft/Phi-3-small-128k-instruct --dtype auto --max-model-len 1028 --port 8001 --api-key token-abc123 --gpu_memory_utilization 0.8 --trust-remote-code --enforce-eager 
    
    ollama

    Install Ollama from the official website.

    ollama pull llava:13b
    ollama pull llama3.1
    
  2. Configure conf/video_base.ini

    [Server]
    backend = ollama
    top_p = 0.1
    temperature = 0
    vlm_model = llava:13b
    llm_model = llama3.1
    
  3. Serve frame analyzer on video server

    cd examples/
    python video_frame_analyzer_server.py
    
  4. Run client script on video base

    python analyze_video_frame.py
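
Before running the client, you can verify that the models served in step 1 respond. This is a hedged sketch against the vLLM OpenAI-compatible endpoint started above (port 8000, api-key token-abc123) and Ollama's local REST API; adjust host, port, and model names to match your deployment.

    # Query the vLLM endpoint from step 1
    curl http://localhost:8000/v1/chat/completions \
      -H "Authorization: Bearer token-abc123" \
      -H "Content-Type: application/json" \
      -d '{"model": "openbmb/MiniCPM-V-2_6",
           "messages": [{"role": "user", "content": "Reply with OK if you can read this."}],
           "max_tokens": 16}'

    # If using the ollama backend, list the pulled models
    curl http://localhost:11434/api/tags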
    

Visualization

After running the analyzers, logs and visualizations are stored in the /logs/ and /visualizations/ folders.

The following image shows a simple demo of the video frame analyzer:

[Video frame analyzer demo]

FAQ

Citation

If you use this code in your research, please cite the following paper:

@inproceedings{inproceedings,
  author = {Li, Zaibei and Jensen, Martin and Nolte, Alexander and Spikol, Daniel},
  year = {2024},
  month = {03},
  pages = {785-791},
  title = {Field report for Platform mBox: Designing an Open MMLA Platform},
  doi = {10.1145/3636555.3636872}
}

References

License

This project is licensed under the MIT License - see the LICENSE file for details.
