Finetuning Grounding DINO and SAM on custom datasets

SAM + Grounding DINO Finetuner

A simple and flexible CLI tool to fine-tune:

  • Segment Anything Model (SAM) → segmentation
  • Grounding DINO → prompt-based object detection

Features

  • ✅ SAM mask decoder finetuning
  • ✅ Grounding DINO text-conditioned detection
  • ✅ CLI-based training (easy to use)
  • ✅ COCO dataset support
  • ✅ Balanced dataset sampling
  • ✅ Mixed precision (AMP)
  • ✅ Auto checkpoint saving
  • ✅ Sample prediction visualization

Installation

🔹 Clone & install locally

git clone https://github.com/pratim4dasude/finetuning_grounded_dino_sam.git
cd finetuning_grounded_dino_sam
pip install -e .

Dataset Format

Your dataset must follow COCO format:

dataset_root/
│
├── train/
│   ├── _annotations.coco.json
│   ├── image1.jpg
│   └── image2.jpg
│
└── test/
    ├── _annotations.coco.json
    ├── image1.jpg
    └── image2.jpg
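
Before starting a long training run, it can help to confirm that the folder layout above is in place. The following is a minimal sketch, not part of this tool: check_split and check_dataset are hypothetical helper names, and the only assumptions are the file name _annotations.coco.json shown above and the standard COCO top-level keys (images, annotations, categories).

```python
import json
from pathlib import Path

ANNOTATION_FILE = "_annotations.coco.json"  # file name taken from the layout above
REQUIRED_KEYS = ("images", "annotations", "categories")  # standard COCO top-level keys


def check_split(split_dir):
    """Return a list of human-readable problems found in one split folder."""
    split_dir = Path(split_dir)
    ann_path = split_dir / ANNOTATION_FILE
    if not ann_path.is_file():
        return [f"missing {ann_path}"]

    with ann_path.open(encoding="utf-8") as f:
        coco = json.load(f)

    problems = [f"{ann_path}: missing key '{k}'" for k in REQUIRED_KEYS if k not in coco]
    # Every image listed in the JSON should exist next to the annotation file.
    for image in coco.get("images", []):
        if not (split_dir / image["file_name"]).is_file():
            problems.append(f"{split_dir}: image not found: {image['file_name']}")
    return problems


def check_dataset(dataset_root, splits=("train", "test")):
    """Check every split under dataset_root and collect all problems."""
    problems = []
    for split in splits:
        problems.extend(check_split(Path(dataset_root) / split))
    return problems
```

Calling check_dataset("dataset_root") returns an empty list when both splits look complete, and one message per problem otherwise.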

Usage

🔹 1. Finetune Grounding DINO

python -m finetuning.cli dino --dataset_root data/Dataset_oRobot \
  --output_dir data/grounding_dino_test \
  --text_labels person cat wall \
  --image_size 512 \
  --batch_size 2 \
  --grad_accum_steps 4 \
  --num_epochs 10 \
  --learning_rate 1e-5 \
  --weight_decay 1e-4 \
  --num_workers 0 \
  --max_grad_norm 1.0 \
  --train_sample_limit 500 \
  --test_sample_limit 100 \
  --seed 42
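
The command above combines --batch_size 2 with --grad_accum_steps 4. With gradient accumulation, the effective batch size is usually the product of the two. The arithmetic below is a rough sketch under two assumptions that are not confirmed by this page: that --train_sample_limit caps training at 500 images, and that a final partial accumulation group still triggers an optimizer step. optimizer_steps_per_epoch is a hypothetical helper name.

```python
import math


def optimizer_steps_per_epoch(num_samples, batch_size, grad_accum_steps):
    """Optimizer updates per epoch, assuming a final partial group still steps."""
    micro_batches = math.ceil(num_samples / batch_size)
    return math.ceil(micro_batches / grad_accum_steps)


# Values copied from the example command above.
batch_size, grad_accum_steps, train_sample_limit = 2, 4, 500

effective_batch = batch_size * grad_accum_steps
steps = optimizer_steps_per_epoch(train_sample_limit, batch_size, grad_accum_steps)
print(f"effective batch size: {effective_batch}")  # 8
print(f"optimizer steps per epoch: {steps}")       # 63
```

Raising --grad_accum_steps increases the effective batch size without increasing GPU memory use per step, at the cost of fewer optimizer updates per epoch.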

🔹 2. Finetune SAM

python -m finetuning.cli sam --data_root data/cracks_cleaned \
  --output_dir data/sam_test \
  --model_name facebook/sam-vit-base \
  --max_train_samples 1000 \
  --max_test_samples 300 \
  --resize_to 512 \
  --batch_size 2 \
  --num_epochs 10 \
  --lr 1e-5 \
  --weight_decay 1e-4 \
  --num_workers 0 \
  --box_jitter 10 \
  --seed 42
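
The --box_jitter 10 flag suggests that box prompts are perturbed by up to 10 pixels during training, a common way to make SAM less sensitive to imprecise boxes. How this tool applies the jitter is not documented here, so the following is only a sketch under that assumption. jitter_box and its parameters are hypothetical names.

```python
import random


def jitter_box(box, max_jitter, width, height, rng=random):
    """Shift each box coordinate by a random integer in [-max_jitter, max_jitter].

    box is (x_min, y_min, x_max, y_max) in pixels. The result is clamped to the
    image and kept at least one pixel wide and one pixel tall.
    """
    x_min, y_min, x_max, y_max = (
        coord + rng.randint(-max_jitter, max_jitter) for coord in box
    )
    x_min = min(max(x_min, 0), width - 1)
    y_min = min(max(y_min, 0), height - 1)
    x_max = min(max(x_max, x_min + 1), width)
    y_max = min(max(y_max, y_min + 1), height)
    return x_min, y_min, x_max, y_max


# Seeded generator so that repeated runs produce the same jittered box.
rng = random.Random(42)
print(jitter_box((100, 120, 300, 340), 10, 512, 512, rng))
```

Passing a seeded random.Random instance keeps the perturbation reproducible, in the same spirit as the --seed flag.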

Outputs

After training, your output_dir will contain:

output_dir/
│
├── epoch_1/               # Optional checkpoints
├── best_model/            # Best model weights
├── train_log.txt          # Training logs
└── sample_prediction.jpg  # Visualization
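
When several runs or epochs have accumulated, a small script can pick the folder to load. This sketch relies only on the folder names shown above (epoch_N and best_model). list_epoch_checkpoints and pick_checkpoint are hypothetical helpers, and what each folder contains is not assumed.

```python
import re
from pathlib import Path


def list_epoch_checkpoints(output_dir):
    """Return (epoch_number, path) pairs for epoch_N folders, sorted by epoch."""
    found = []
    for child in Path(output_dir).iterdir():
        match = re.fullmatch(r"epoch_(\d+)", child.name)
        if child.is_dir() and match:
            found.append((int(match.group(1)), child))
    return sorted(found, key=lambda pair: pair[0])


def pick_checkpoint(output_dir):
    """Prefer best_model/; otherwise fall back to the highest-numbered epoch folder."""
    best = Path(output_dir) / "best_model"
    if best.is_dir():
        return best
    epochs = list_epoch_checkpoints(output_dir)
    return epochs[-1][1] if epochs else None
```

Sorting by the parsed integer rather than by name keeps epoch_10 after epoch_2.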

Metrics

Grounding DINO

  • Train Loss
  • Test Loss
  • Detection results

SAM

  • IoU
  • Dice Score
  • Mask quality
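
For reference, IoU and Dice score on binary masks follow the standard definitions: IoU = |A ∩ B| / |A ∪ B| and Dice = 2|A ∩ B| / (|A| + |B|). The sketch below uses plain nested lists so that it runs without extra packages; a training loop would normally compute the same quantities on tensors. Returning 1.0 when both masks are empty is a convention assumed here, not something this tool is known to do.

```python
def mask_overlap(pred, target):
    """Return (intersection, pred_area, target_area) for two same-sized binary masks."""
    intersection = pred_area = target_area = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p, t = bool(p), bool(t)
            intersection += p and t
            pred_area += p
            target_area += t
    return intersection, pred_area, target_area


def iou(pred, target):
    """Intersection over union; 1.0 when both masks are empty (assumed convention)."""
    inter, pred_area, target_area = mask_overlap(pred, target)
    union = pred_area + target_area - inter
    return inter / union if union else 1.0


def dice(pred, target):
    """Dice score; 1.0 when both masks are empty (assumed convention)."""
    inter, pred_area, target_area = mask_overlap(pred, target)
    total = pred_area + target_area
    return 2 * inter / total if total else 1.0


pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
print(iou(pred, target))   # 0.5  (2 shared pixels out of 4 in the union)
print(dice(pred, target))  # 2 * 2 / (3 + 3), about 0.667
```

The two metrics are monotonically related (Dice = 2·IoU / (1 + IoU)), so they rank predictions identically; Dice is simply more forgiving of small overlaps.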

Notes

❗ Do NOT use very small image sizes (<256) for DINO

Use quotes for multi-word labels:

--text_labels person cat "wall toys"
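
Why the quotes matter can be illustrated with Python's argparse. This is a sketch that assumes the CLI declares --text_labels as a multi-value option (nargs="+"); the tool's actual parser is not shown on this page. shlex.split stands in for the shell's word splitting, and parse_labels is a hypothetical helper.

```python
import argparse
import shlex

# Assumed declaration: one or more labels after a single flag.
parser = argparse.ArgumentParser()
parser.add_argument("--text_labels", nargs="+", required=True)


def parse_labels(command_line):
    """Split a command line the way a POSIX shell would, then parse the labels."""
    return parser.parse_args(shlex.split(command_line)).text_labels


print(parse_labels('--text_labels person cat "wall toys"'))
# ['person', 'cat', 'wall toys']
print(parse_labels("--text_labels person cat wall toys"))
# ['person', 'cat', 'wall', 'toys']
```

Without quotes, the shell hands "wall" and "toys" to the program as two separate labels.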

Install PyTorch (GPU)

Install manually based on your system:

👉 https://pytorch.org/get-started/locally/

Example (CUDA 12.4):

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
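
After installing, a quick check confirms whether the GPU build is actually in use. The snippet below only calls standard PyTorch attributes (torch.__version__, torch.version.cuda, torch.cuda.is_available, torch.cuda.get_device_name). describe_torch is a hypothetical helper name.

```python
def describe_torch():
    """Summarise the installed PyTorch build, or report that it is missing."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"
    if torch.cuda.is_available():
        return (
            f"PyTorch {torch.__version__}, CUDA {torch.version.cuda}, "
            f"GPU: {torch.cuda.get_device_name(0)}"
        )
    return f"PyTorch {torch.__version__}, CUDA not available (CPU only)"


print(describe_torch())
```

If the output reports CPU only on a machine with an NVIDIA GPU, the installed wheel most likely does not match the local CUDA driver, and reinstalling with the matching index URL is the usual fix.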

Development Mode

Run without installing:

python -m finetuning.cli dino --dataset_root data --output_dir out

Tech Stack

Tool                      | Purpose
PyTorch                   | Deep learning backbone
Hugging Face Transformers | Model hub & utilities
SAM (Meta)                | Segmentation model
Grounding DINO            | Text-conditioned detection
COCO API                  | Dataset handling



