Fast, almost fully automated image annotation tool for computer vision tasks: detection, oriented bounding boxes, and segmentation.
VisioFirm: Fast, Almost Fully Automated Image Annotation for Computer Vision
[!IMPORTANT] A new release has just dropped.
VisioFirm 0.2.0 brings enhancements including bug fixes for image import, improved frontend loading, Cloud/SSH support for downloading images and saving annotations, SAM2 worker offloading for better performance, optimized SAM2-Auto annotation for faster computing, and thread optimizations for image uploading. This version builds on the stable GroundingDINO dependency from 0.1.4.
- Cloud/SSH Integration: Download images from cloud storage or SSH servers and save annotations remotely (using local absolute paths).
- Enhanced Image Handling: Fixed bugs in image import, faster frontend loading, and multi-threaded uploading for efficiency.
- SAM2 Optimizations: Worker-based offloading and improved auto-annotation for rapid, high-performance segmentation in the browser. You may notice a delay while the first label is generated; subsequent annotations are instant.
[!NOTE] If you prefer the HF transformers-based library (pre-0.2.0), install from the main branch via
pip install visiofirm==0.1.0.
VisioFirm is an open-source, AI-powered image annotation tool designed to accelerate labeling for computer vision tasks like object detection, oriented bounding boxes (OBB), and segmentation. Built for speed and simplicity, it leverages state-of-the-art models for semi-automated pre-annotations, allowing you to focus on refining rather than starting from scratch. Whether you're preparing datasets for YOLO, SAM, or custom models, VisioFirm streamlines your workflow with an intuitive web interface and powerful backend.
Perfect for researchers, data scientists, and ML engineers handling large image datasets—get high-quality annotations in minutes, not hours!
Why VisioFirm?
Unlike general-purpose annotation tools, VisioFirm focuses mainly on computer vision annotation tasks: detection (axis-aligned and oriented bounding boxes) and segmentation.
- AI-Driven Pre-Annotation: Automatically detect and segment objects using YOLOv10, SAM2, and Grounding DINO—saving up to 80% of manual effort.
- Multi-Task Support: Handles bounding boxes, oriented bounding boxes, and polygon segmentation in one tool.
- Browser-Based Editing: Interactive canvas for precise adjustments, with real-time SAM-powered segmentation in the browser.
- Offline-Friendly: Models download automatically (or pre-fetch for offline use), with SQLite backend for local projects.
- Extensible & Open-Source: Customize with your own ultralytics models or integrate into pipelines—contributions welcome!
- SAM2-base on WebGPU: Instant drawing of annotations via SAM2, with worker offloading and auto-annotation for faster computing!
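To make the difference between axis-aligned and oriented bounding boxes concrete, here is a minimal sketch that turns an oriented box into its four corner points. The `(cx, cy, w, h, angle)` parameterization and the function name `obb_corners` are assumptions chosen for illustration; VisioFirm's internal representation may differ.

```python
import math


def obb_corners(cx, cy, w, h, angle_deg):
    """Return the four corners of a rotated box as (x, y) tuples.

    Hypothetical helper: the box is centred at (cx, cy), has size w x h,
    and is rotated by angle_deg using the standard 2D rotation matrix.
    In image coordinates (y pointing down) a positive angle therefore
    appears clockwise on screen.
    """
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    half_w, half_h = w / 2.0, h / 2.0
    corners = []
    # Corner offsets relative to the centre, before rotation.
    for dx, dy in ((-half_w, -half_h), (half_w, -half_h),
                   (half_w, half_h), (-half_w, half_h)):
        x = cx + dx * cos_a - dy * sin_a
        y = cy + dx * sin_a + dy * cos_a
        corners.append((x, y))
    return corners


if __name__ == "__main__":
    # With a zero angle the result is an ordinary axis-aligned box.
    print(obb_corners(100, 50, 40, 20, 0))
    print(obb_corners(100, 50, 40, 20, 30))
```

With an angle of zero the four points reduce to a plain axis-aligned box, which is why one tool can reasonably handle both annotation types.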
Features
- Semi-Automated Labeling: Kickstart annotations with AI models like YOLO for detection, SAM for segmentation, and Grounding DINO for zero-shot object grounding.
- Flexible Annotation Types:
- Axis-aligned bounding boxes for standard detection.
- Oriented bounding boxes for rotated objects (e.g., aerial imagery).
- Polygon segmentation for precise boundaries.
- Interactive Frontend: Draw, edit, and refine labels on a responsive canvas. Click-to-segment with browser-based SAM for instant masks.
- Project Management: Create, manage, and export projects with SQLite database storage. Support for multiple classes and images.
- Export Formats: Seamless exports to YOLO, COCO, or custom formats for training.
- Performance Optimizations: Cluster overlapping detections, simplify contours, and filter by confidence for clean datasets.
- Cloud/SSH Integration: Seamlessly download images from cloud storage or SSH servers and save annotations remotely.
- Enhanced Image Handling: Fixed bugs in image import, faster frontend loading, and multi-threaded uploading for efficiency.
- SAM2 Optimizations: Worker-based offloading and improved auto-annotation for rapid, high-performance segmentation in the browser.
- Cross-Platform: Runs locally on Linux, macOS, or Windows via Python, with no cloud dependency.
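The clean-up steps named under Performance Optimizations (filtering by confidence, clustering overlapping detections, simplifying contours) can be sketched in plain Python. This is an illustration under stated assumptions, not VisioFirm's actual code: the function names, the thresholds, the greedy IoU-based clustering, and the choice of Ramer-Douglas-Peucker for simplification are all hypothetical stand-ins.

```python
from math import hypot


def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def filter_and_cluster(dets, min_conf=0.25, iou_thr=0.5):
    """Drop low-confidence detections, then keep only the highest-scoring
    box of each group of overlapping boxes (greedy, NMS-style).

    dets is a list of (box, score) pairs; both thresholds are assumptions.
    """
    confident = [d for d in dets if d[1] >= min_conf]
    kept = []
    for box, score in sorted(confident, key=lambda d: d[1], reverse=True):
        if all(iou(box, k[0]) < iou_thr for k in kept):
            kept.append((box, score))
    return kept


def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    if dx == 0 and dy == 0:
        return hypot(p[0] - a[0], p[1] - a[1])
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def simplify_contour(points, tol=1.0):
    """Ramer-Douglas-Peucker simplification of an open polyline."""
    if len(points) < 3:
        return list(points)
    dists = [_point_segment_distance(p, points[0], points[-1])
             for p in points[1:-1]]
    idx = max(range(len(dists)), key=dists.__getitem__) + 1
    if dists[idx - 1] <= tol:
        # Every interior point is close to the chord: keep the endpoints.
        return [points[0], points[-1]]
    left = simplify_contour(points[:idx + 1], tol)
    right = simplify_contour(points[idx:], tol)
    return left[:-1] + right  # avoid duplicating the split point


if __name__ == "__main__":
    dets = [((0, 0, 10, 10), 0.9), ((1, 1, 11, 11), 0.8),
            ((50, 50, 60, 60), 0.7), ((0, 0, 5, 5), 0.1)]
    print(filter_and_cluster(dets))
    print(simplify_contour([(0, 0), (1, 0.1), (2, 0), (3, 5), (4, 0)], 0.5))
```

Fewer boxes and fewer polygon vertices mean less manual correction in the editor, which is the point of these steps in a pre-annotation pipeline.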
Installation
VisioFirm is easy to install via pip from PyPI.
It was tested with Python 3.10+.
pip install -U visiofirm
For development or editable install (from a cloned repo):
git clone https://github.com/OschAI/VisioFirm.git
cd VisioFirm
pip install -e .
Quick Start
Launch VisioFirm with a single command—it auto-starts a local web server and opens in your browser.
visiofirm
- Create a new project and upload images.
- Define classes (e.g., "car", "person").
- For easy-to-detect objects, run AI pre-annotation (select a model: YOLO or Grounding DINO).
- Refine labels in the interactive editor.
- Export your annotated dataset.
The VisioFirm app uses cache directories to store settings locally.
Usage
Pre-Annotation with AI
VisioFirm uses advanced models for initial labels:
- YOLOv10: Fast detection.
- SAM2: Precise segmentation.
- Grounding DINO: Zero-shot detection via text prompts.
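As a rough illustration of zero-shot detection via text prompts: the Grounding DINO reference implementation takes a single lowercase string in which categories are separated by periods. The sketch below builds such a string from a project's class names. The helper name `build_text_prompt` is hypothetical, and how VisioFirm assembles its own prompt is not documented here.

```python
def build_text_prompt(class_names):
    """Join class names into a period-separated, lowercase text prompt.

    Hypothetical helper: blank names are skipped, and an empty class
    list yields an empty prompt.
    """
    cleaned = [name.strip().lower() for name in class_names if name.strip()]
    return (" . ".join(cleaned) + " .") if cleaned else ""


if __name__ == "__main__":
    print(build_text_prompt(["Car", " person "]))
```

Because the classes are expressed as text, new categories can be tried without retraining a detector, at the cost of less predictable results on unusual objects.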
Models download automatically on first run and are stored in the current directory or a cache directory. For offline use, pre-fetch them while you are still online.
Frontend Customization
The web interface (Flask + JS) supports hotkeys, undo/redo, and zoom. Edit static/js/sam.js for browser SAM tweaks.
Exporting Data
From the dashboard, export to JSON, TXT (YOLO format), or images with masks.
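For readers unfamiliar with the YOLO TXT format, each line of a label file describes one object as `class_id cx cy w h`, with the box center and size normalized by the image dimensions. The following minimal sketch converts a pixel-space box to such a line; the function name `to_yolo_line` and the six-decimal precision are assumptions for illustration, not VisioFirm's export code.

```python
def to_yolo_line(class_id, box, img_w, img_h):
    """Convert a pixel-space (x1, y1, x2, y2) box to a YOLO label line."""
    x1, y1, x2, y2 = box
    cx = (x1 + x2) / 2.0 / img_w   # normalized box center, x
    cy = (y1 + y2) / 2.0 / img_h   # normalized box center, y
    w = (x2 - x1) / img_w          # normalized box width
    h = (y2 - y1) / img_h          # normalized box height
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


if __name__ == "__main__":
    # A 200x200 px box in a 400x500 px image, class index 0.
    print(to_yolo_line(0, (100, 50, 300, 250), 400, 500))
```

Normalized coordinates keep the labels valid when images are resized for training, which is one reason the format is widely used.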
Community & Support
- Issues: Report bugs or request features on the GitHub repository's issue tracker.
- Discord: Coming soon—star the repo for updates!
- Roadmap: Multi-user support, video annotation, custom model integration.
License
Apache 2.0 - See LICENSE for details.
Built by Safouane El Ghazouali for the research community. Star the repo if it helps your workflow! 🚀
Citation
@misc{ghazouali2025visiofirm,
title={VisioFirm: Cross-Platform AI-assisted Annotation Tool for Computer Vision},
author={Safouane El Ghazouali and Umberto Michelucci},
year={2025},
eprint={2509.04180},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
TODOs
SOON:
- Documentation website
- Discord community
- Paper - detailing the implementation and AI preannotation pipeline
- Classification
Future:
- Support for video annotation
- Support for more ML frameworks (such as mmdetection and detectron2)