Jormungandr: End-to-End Video Object Detection with Spatial-Temporal Mamba
Description
Jormungandr is a novel end-to-end video object detection system that uses the Spatial-Temporal Mamba architecture to detect and track objects across video frames. By combining spatial and temporal information, it improves detection accuracy and robustness, making it suitable for applications such as surveillance, autonomous driving, and video analytics.
Getting started
Prerequisites
Before installing this package, ensure that your system meets the following requirements:
- Operating System: Linux
- Python: Version 3.12 or higher
- Hardware: CUDA-enabled GPU
- Software Dependencies:
  - NVIDIA drivers compatible with your GPU
  - CUDA Toolkit properly installed and configured; the driver and GPU status can be checked with:
    nvidia-smi
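The requirements above can also be checked from Python before installing. The following is a minimal sketch using only the standard library. `check_prerequisites` is a hypothetical helper, not part of the package, and finding `nvidia-smi` on the PATH only suggests that an NVIDIA driver is installed; it does not prove the CUDA Toolkit is configured.

```python
import platform
import shutil
import sys


def check_prerequisites(min_python=(3, 12)):
    """Return a dict mapping each requirement name to True or False."""
    return {
        # The README lists Linux as the supported operating system.
        "linux": platform.system() == "Linux",
        # Compare the running interpreter against the minimum version.
        "python": sys.version_info >= min_python,
        # nvidia-smi ships with the NVIDIA driver; its presence is a hint only.
        "nvidia_smi": shutil.which("nvidia-smi") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```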
Installation
PyPI package:
pip install jormungandr-ssm
Alternatively, from source:
pip install git+https://github.com/Knolaisen/jormungandr
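To confirm that the installation succeeded, the installed version can be read from the package metadata. This sketch assumes the distribution name `jormungandr-ssm` from the pip command above; `installed_version` is a hypothetical helper, not part of the package.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="jormungandr-ssm"):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_version() or "jormungandr-ssm is not installed")
```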
Usage
We expose several levels of interface through the Fafnir still-image detector and the Jormungandr video object detection (VOD) model. Both models follow a simple PyTorch-style API. Because of the Mamba architecture, the models are optimized for GPU execution and require CUDA for both inference and training.
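Because both models require CUDA, it can help to check up front whether PyTorch can see a GPU. A minimal sketch, assuming PyTorch is the installed backend; `cuda_ready` is a hypothetical helper and not part of the package.

```python
import importlib.util


def cuda_ready():
    """Return True only if PyTorch is importable and reports a usable CUDA device."""
    # Avoid an ImportError when PyTorch is not installed at all.
    if importlib.util.find_spec("torch") is None:
        return False
    import torch

    return torch.cuda.is_available()


if __name__ == "__main__":
    print("CUDA device detected." if cuda_ready() else "CUDA is not available.")
```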
Still Image Detection (Fafnir)
Use Fafnir when performing object detection on single images.
import torch
from jormungandr import Fafnir

device = torch.device("cuda")

# Dummy input batch of shape (batch, channels, height, width)
batch, channels, height, width = 2, 3, 224, 224
x = torch.randn(batch, channels, height, width).to(device)

# Initialize model
model = Fafnir(variant="fafnir-b", pretrained=True).to(device)
model.eval()

# Inference
with torch.no_grad():
    detections = model(x)
Video Object Detection (Jormungandr)
Use Jormungandr for end-to-end video object detection using spatial-temporal modeling.
import torch
from jormungandr import Jormungandr

device = torch.device("cuda")

# Assumed input layout: (frames, channels, height, width), with 8 frames per clip.
# Add a leading batch dimension if the model expects one.
frames, channels, height, width = 8, 3, 224, 224
x = torch.randn(frames, channels, height, width).to(device)

# Initialize model
model = Jormungandr(variant="jormungandr-b", pretrained=True).to(device)
model.eval()

# Inference
with torch.no_grad():
    detections = model(x)
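Videos are usually longer than a single input clip. One common approach, sketched here as an assumption rather than the package's documented workflow, is to split the frame indices into fixed-length clips and run each clip through the model in turn. `clip_ranges` and its parameters are hypothetical names.

```python
def clip_ranges(num_frames, clip_len, stride=None):
    """Return (start, end) index pairs covering num_frames; end is exclusive.

    The final clips may be shorter than clip_len. A stride smaller than
    clip_len produces overlapping clips.
    """
    if num_frames < 0 or clip_len <= 0:
        raise ValueError("num_frames must be >= 0 and clip_len must be > 0")
    stride = clip_len if stride is None else stride
    if stride <= 0:
        raise ValueError("stride must be > 0")
    ranges = []
    start = 0
    while start < num_frames:
        ranges.append((start, min(start + clip_len, num_frames)))
        start += stride
    return ranges


print(clip_ranges(20, 8))  # [(0, 8), (8, 16), (16, 20)]
```

Each pair can then slice a frame tensor, for example `video[start:end]`, before the clip is passed to the model.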
Pretrained Models
We provide pretrained models hosted on Hugging Face.
- The Fafnir models (fafnir-t, fafnir-s, fafnir-b) are pretrained on the COCO dataset.
- The Jormungandr models (jormungandr-t, jormungandr-s, jormungandr-b) are pretrained on the MOT17 dataset.
These models will be automatically downloaded when initialized in your code.
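When the variant name comes from a config file or the command line, validating it before constructing the model can give an earlier, clearer error. A minimal sketch: the table only restates the variant names and datasets listed above, and `describe_variant` is a hypothetical helper, not part of the package.

```python
# Variant names and pretraining datasets as listed in this README.
PRETRAINED_VARIANTS = {
    "fafnir-t": "COCO",
    "fafnir-s": "COCO",
    "fafnir-b": "COCO",
    "jormungandr-t": "MOT17",
    "jormungandr-s": "MOT17",
    "jormungandr-b": "MOT17",
}


def describe_variant(name):
    """Return (model_family, pretraining_dataset) for a known variant name."""
    try:
        dataset = PRETRAINED_VARIANTS[name]
    except KeyError:
        known = ", ".join(sorted(PRETRAINED_VARIANTS))
        raise ValueError(f"Unknown variant {name!r}; expected one of: {known}") from None
    family = name.split("-")[0]
    return family, dataset


print(describe_variant("fafnir-b"))  # ('fafnir', 'COCO')
```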
Documentation
Authors
- Kristoffer Nohr Olaisen
- Sverre Nystad
License
Distributed under the MIT License. See LICENSE for more information.