Vision module for the OpenMMLA platform.
Project description
🎥 OpenMMLA Vision
Video module of the mBox - an open multimodal learning analytic platform. For more details, please refer to mBox System Design.
Table of Contents
Related Modules
Installation
Uber Server Setup
Before setting up the video base, you need to set up a server hosting the InfluxDB, Redis, Mosquitto, and Nginx services. Please refer to mbox-uber module.
Video Base & Server Setup
-
Clone the repository
git clone https://github.com/ucph-ccs/mbox-video.git
-
Install openmmla-vision
Set up Conda environment
# For Raspberry Pi wget "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh" bash Miniforge3-$(uname)-$(uname -m).sh # For Mac and Linux wget "https://repo.anaconda.com/miniconda/Miniconda3-latest-$(uname)-$(uname -m).sh" bash Miniconda3-latest-$(uname)-$(uname -m).sh
Install Video Base
conda create -c conda-forge -n video-base python=3.10.12 -y conda activate video-base pip install openmmla-vision[base] # for Linux and Raspberry Pi pip install 'openmmla-vision[base]' # for Mac
Install Video Server
The video server provides video frame analyzer services.
conda create -c conda-forge -n video-server python=3.10.12 -y conda activate video-server pip install openmmla-vision[server] # for Linux and Raspberry Pi pip install 'openmmla-vision[server]' # for Mac
-
Set up folder structure
cd mbox-video ./reset.sh
Standalone Setup
If you want to run the entire mBox Video system on a single machine, follow these steps:
-
Set up the Uber Server on your machine following the instructions in the mbox-uber module.
-
Install openmmla-vision with all dependencies:
conda create -c conda-forge -n mbox-video python=3.10.12 -y conda activate mbox-video pip install openmmla-vision[all] # for Linux and Raspberry Pi pip install 'openmmla-vision[all]' # for Mac
-
Set up the folder structure:
cd mbox-video ./reset.sh
This setup will allow you to run all components of mBox Video on a single machine.
Usage
Realtime Indoor-Positioning
-
Stream video from camera(s)
- Distributed: stream on each camera host machine (e.g., Raspberry Pi)
- Centralized: stream to a centralized RTMP server (e.g., MacBook, see Raspberry Pi RTMP streaming setup)
-
Calibrate camera's intrinsic parameters
- Print chessboard image from
./camera_calib/pattern/
and stick it on a flat surface - Capture chessboard image with your camera and calibrate it by running
./calib_camera.sh
- Print chessboard image from
-
Synchronize multi-cameras' coordinate systems
Calculate transformation matrix between main and alternative cameras:
./sync_camera.sh [-d <num_cameras>] [-s <num_sync_managers>]
Default parameter settings:
-d
: 2 (number of cameras to sync)-s
: 1 (number of camera sync manager)
Modes:
- Centralized:
./sync_camera.sh -d 2 -s 1
- Distributed:
# On camera host (e.g., Raspberry Pi) ./sync_camera.sh -d 1 -s 0 # On synchronizer (e.g., MacBook) ./sync_camera.sh -d 0 -s 1
-
Run real-time indoor-positioning system
./run.sh [-b <num_bases>] [-s <num_synchronizers>] [-v <num_visualizers>] [-g <display_graphics>] [-r <record_frames>] [-v <store_visualizations>]
Default parameter settings:
-b
: 1 (number of video base)-s
: 1 (number of video base synchronizer)-v
: 1 (number of visualizer)-g
: true (display graphic window)-r
: false (record video frames as images)-v
: false (store real-time visualizations)
Modes:
- Centralized:
./run.sh
- Distributed:
# On camera host (e.g., Raspberry Pi) ./run.sh -b 1 -s 0 -v 0 -g false # On synchronizer (e.g., MacBook) ./run.sh -b 0 -s 1 -v 1
Video Frame Analyzer
-
Serve VLM and LLM on video server
vllm
vllm serve openbmb/MiniCPM-V-2_6 --dtype auto --max-model-len 2048 --port 8000 --api-key token-abc123 --gpu_memory_utilization 1 --trust-remote-code --enforce-eager vllm serve microsoft/Phi-3-small-128k-instruct --dtype auto --max-model-len 1028 --port 8001 --api-key token-abc123 --gpu_memory_utilization 0.8 --trust-remote-code --enforce-eager
-
Configure
conf/video_base.ini
[Server] backend = ollama top_p = 0.1 temperature = 0 vlm_model = llava:13b llm_model = llama3.1
-
Serve frame analyzer on video server
cd examples/ python serve_video_frame_analyzer.py
-
Run client script on video base
python request_video_frame_analyzer.py
Visualization
After running the analyzers, logs and visualizations are stored in the /logs/
and /visualizations/
folders.
The following image shows a simple demo of the video frame analyzer:
FAQ
Citation
If you use this code in your research, please cite the following paper:
@inproceedings{inproceedings,
author = {Li, Zaibei and Jensen, Martin and Nolte, Alexander and Spikol, Daniel},
year = {2024},
month = {03},
pages = {785-791},
title = {Field report for Platform mBox: Designing an Open MMLA Platform},
doi = {10.1145/3636555.3636872}
}
References
License
This project is licensed under the MIT License - see the LICENSE file for details.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file openmmla_vision-0.1.0.post2.tar.gz
.
File metadata
- Download URL: openmmla_vision-0.1.0.post2.tar.gz
- Upload date:
- Size: 43.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.10.9
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 9304474439704e4e32d9d81dd1e4dd3257c49cca18bb0ae57859579e4cd85b0f |
|
MD5 | 39dad7cbafb7f4d3ab7d4945a354e555 |
|
BLAKE2b-256 | 26fe3e4518d4d456e9628d7c2b7309a7bb63cc861141228e663595cd9e620d5a |
File details
Details for the file openmmla_vision-0.1.0.post2-py3-none-any.whl
.
File metadata
- Download URL: openmmla_vision-0.1.0.post2-py3-none-any.whl
- Upload date:
- Size: 51.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.10.9
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 1afbdb4b4b9417c444d04cdda67d0f4e6a0949b5e45b830c8f8ac406284ff594 |
|
MD5 | e95ca7d52bdc7d95e068d68097bbde04 |
|
BLAKE2b-256 | 61c3c8a38ed55593ef4f574798440aa202b17065ad809e7b8938af8e3b2b70ad |