GroundedDINO-VL vision-language backend

GroundedDINO-VL

Modern Vision-Language Foundation Models for PyTorch 2.7 + CUDA 12.8


Overview

GroundedDINO-VL is a modern, independent vision-language framework derived from GroundingDINO, fully refactored and maintained for current GPU infrastructure with PyTorch 2.7 and CUDA 12.8 support.

This project is now fully independent of the original GroundingDINO implementation. The codebase has been cleaned up and modernized, remains compatible with GroundingDINO model weights, and keeps existing code working through a backward-compatibility layer.

Key Features

  • Fully Independent: Clean break from legacy code - single source of truth in groundeddino_vl/
  • Modern Stack: PyTorch 2.7 + CUDA 12.8 support
  • Zero-Shot Detection: Detect objects using natural language descriptions
  • High Performance: Compatible with GroundingDINO's COCO zero-shot 52.5 AP weights
  • Backward Compatible: Legacy groundingdino namespace supported via compatibility shim
  • Clean Architecture: Refactored package structure with better organization
  • Label Studio Integration: Real-time ML backend for auto-annotation workflows
  • 95% Package Size Reduction: Eliminated duplicate code (9,000+ lines removed)


Documentation Index

Migration & Compatibility

  • GroundingDINO Migration Guide - Migrating from legacy groundingdino namespace
  • API Migration Guide - Upgrading from previous versions
  • Independence Note: GroundedDINO-VL is now fully independent with all code in groundeddino_vl/. The groundingdino/ namespace provides backward compatibility via a lightweight shim.
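
As a rough illustration of what a namespace shim can look like, the sketch below aliases one module name to another through sys.modules. This is not taken from the GroundedDINO-VL source. The module names demo_old_pkg and demo_new_pkg and the helper install_alias are hypothetical stand-ins, and the real shim may be implemented differently (for example, as a thin package that re-exports names).

```python
import importlib
import sys
import types


def install_alias(old_name: str, new_name: str) -> None:
    """Make `import old_name` resolve to the module importable as `new_name`."""
    target = importlib.import_module(new_name)
    sys.modules[old_name] = target


# Stand-in for the new package (hypothetical; registered by hand for the demo).
new_pkg = types.ModuleType("demo_new_pkg")
new_pkg.load_model = lambda: "model from demo_new_pkg"
sys.modules["demo_new_pkg"] = new_pkg

# Register the legacy name so that old import statements keep working.
install_alias("demo_old_pkg", "demo_new_pkg")

import demo_old_pkg  # resolved through sys.modules, no file on disk needed

print(demo_old_pkg.load_model())
```

With this approach both names refer to the same module object, so there is still a single source of truth for the code.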


Quick Installation

Via PyPI (Recommended)

pip install groundeddino_vl

With GPU Support (CUDA 12.8)

pip install torch torchvision --index-url https://download.pytorch.org/whl/cu128
pip install groundeddino_vl

From Source

git clone https://github.com/ghostcipher1/GroundedDINO-VL.git
cd GroundedDINO-VL
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu128
pip install -e .

See Installation Guide for detailed instructions and system requirements.
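
After installing, it can help to confirm which PyTorch build is active before loading a model. The sketch below uses only standard PyTorch attributes (torch.__version__, torch.version.cuda, torch.cuda.is_available()). The helper name describe_torch_env is an illustrative assumption and is not part of groundeddino_vl.

```python
import importlib.util


def describe_torch_env() -> dict:
    """Summarize the installed PyTorch build, or report that torch is missing."""
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False}

    import torch

    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version the wheel was built against; None for CPU-only builds.
        "cuda_build": torch.version.cuda,
        # True only if a usable GPU and driver are visible at runtime.
        "cuda_available": torch.cuda.is_available(),
    }


print(describe_torch_env())
```

If cuda_build is None after a GPU install, the CPU-only wheel was probably picked up, and reinstalling torch with the cu128 index URL shown above is the usual remedy.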


Quick Start Example

from groundeddino_vl import load_model, predict, annotate

# Path to the input image, reused for detection and visualization
image_path = "path/to/image.jpg"

# Load model (auto-downloads weights on first run)
model = load_model(
    config_path="path/to/config.py",
    checkpoint_path="path/to/weights.pth",
    device="cuda",
)

# Run detection with a text prompt; categories are separated by " . "
result = predict(
    model=model,
    image=image_path,
    text_prompt="car . person . dog",
    box_threshold=0.35,
    text_threshold=0.25,
)

# Visualize results
annotated_image = annotate(image_path, result, show_labels=True)

See Quick Start Guide for more examples and usage patterns.
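
The text prompt above lists categories separated by " . ", and box_threshold drops low-confidence boxes. The sketch below shows both ideas in plain Python. The helpers build_prompt and filter_by_score, the lowercasing step, and the dictionary layout of the detections are illustrative assumptions. They do not describe the actual return type of predict or any function in groundeddino_vl.

```python
from typing import Dict, List, Sequence


def build_prompt(labels: Sequence[str]) -> str:
    """Join category names into a ' . '-separated prompt string."""
    cleaned = [label.strip().lower() for label in labels if label.strip()]
    return " . ".join(cleaned)


def filter_by_score(detections: List[Dict], box_threshold: float) -> List[Dict]:
    """Keep detections whose confidence is at least box_threshold."""
    return [d for d in detections if d["score"] >= box_threshold]


prompt = build_prompt(["Car", " person", "dog "])

# Hypothetical detections, written by hand for illustration only.
detections = [
    {"label": "car", "score": 0.82},
    {"label": "person", "score": 0.31},
    {"label": "dog", "score": 0.47},
]
kept = filter_by_score(detections, box_threshold=0.35)

print(prompt)                      # car . person . dog
print([d["label"] for d in kept])  # ['car', 'dog']
```

Raising box_threshold trades recall for precision, so a value such as 0.35 is a starting point to tune per dataset rather than a fixed recommendation.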


Label Studio Integration

GroundedDINO-VL includes an optional Label Studio ML Backend for real-time auto-annotation:

  • On-demand inference via FastAPI service
  • Auto-labeling with the "magic wand" feature
  • Batch annotation assistance
  • PostgreSQL/SQLite history logging

Complete setup guide: Label Studio Integration Documentation
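
For orientation, Label Studio expects rectangle predictions with coordinates expressed as percentages of the image size. The sketch below converts one pixel-space box into that shape. The function name box_to_label_studio is hypothetical, and the from_name and to_name defaults ("label" and "image") are assumptions that must match the control and object tag names in your own labeling configuration. This is not the backend's actual code.

```python
def box_to_label_studio(
    box_xyxy,
    label,
    score,
    image_width,
    image_height,
    from_name="label",  # assumed name of the RectangleLabels control tag
    to_name="image",    # assumed name of the Image object tag
):
    """Convert a pixel-space (x_min, y_min, x_max, y_max) box to a Label Studio region."""
    x_min, y_min, x_max, y_max = box_xyxy
    return {
        "from_name": from_name,
        "to_name": to_name,
        "type": "rectanglelabels",
        "original_width": image_width,
        "original_height": image_height,
        "score": score,
        "value": {
            # Label Studio stores positions and sizes as percentages (0-100).
            "x": 100.0 * x_min / image_width,
            "y": 100.0 * y_min / image_height,
            "width": 100.0 * (x_max - x_min) / image_width,
            "height": 100.0 * (y_max - y_min) / image_height,
            "rotation": 0,
            "rectanglelabels": [label],
        },
    }


region = box_to_label_studio((50, 100, 150, 300), "car", 0.8, 200, 400)
print(region["value"])
```

A list of such regions would typically go into the "result" field of a prediction returned to Label Studio.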


Research Foundation

This project is based on the groundbreaking work:

Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection

@article{liu2023grounding,
  title={Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection},
  author={Liu, Shilong and Zeng, Zhaoyang and Ren, Tianhe and Li, Feng and Zhang, Hao and Yang, Jie and Li, Chunyuan and Yang, Jianwei and Su, Hang and Zhu, Jun and others},
  journal={arXiv preprint arXiv:2303.05499},
  year={2023}
}

Original Project: IDEA-Research/GroundingDINO
Paper: arXiv:2303.05499


System Requirements

Component         Requirement
Python            3.9, 3.10, 3.11, or 3.12
PyTorch           2.7.0+
CUDA (optional)   12.6 or 12.8
C++ Compiler      GCC 7+, Clang 5+, or MSVC 2019+
GPU (optional)    NVIDIA with Compute Capability 6.0+
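
The Python and GPU requirements can be checked programmatically. The sketch below encodes the listed values as constants. The helper names python_supported and gpu_supported are illustrative assumptions and are not part of the package. On a machine with PyTorch and a GPU, torch.cuda.get_device_capability() returns a (major, minor) tuple that can be passed to gpu_supported.

```python
import sys

# Values copied from the requirements listed above.
SUPPORTED_PYTHON = {(3, 9), (3, 10), (3, 11), (3, 12)}
MIN_COMPUTE_CAPABILITY = (6, 0)


def python_supported(version=None) -> bool:
    """Return True if the (major, minor) Python version is in the supported set."""
    major_minor = tuple((version or sys.version_info)[:2])
    return major_minor in SUPPORTED_PYTHON


def gpu_supported(capability) -> bool:
    """Return True if a (major, minor) compute capability meets the minimum."""
    return tuple(capability) >= MIN_COMPUTE_CAPABILITY


print(python_supported())      # checks the running interpreter
print(gpu_supported((7, 5)))   # True: 7.5 is at least 6.0
```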

License

Copyright (c) 2025 GhostCipher. All rights reserved.

Licensed under the Apache License, Version 2.0. See LICENSE for details.

Original GroundingDINO License

Copyright (c) 2023 IDEA. All Rights Reserved.
Licensed under the Apache License, Version 2.0

This project maintains the original Apache 2.0 license and properly attributes the original GroundingDINO research team.


Acknowledgments

  • GroundingDINO Team at IDEA Research for the original research and implementation
  • Deformable DETR for the multi-scale deformable attention mechanism
  • DINO for the transformer-based detection architecture
  • PyTorch Team for the excellent deep learning framework


Built with ❤️ for the computer vision community
