CLI tool to finetune SAM and Grounding DINO on custom datasets
SAM + Grounding DINO Finetuner
A simple and flexible CLI tool to fine-tune:
- Segment Anything Model (SAM) → segmentation
- Grounding DINO → prompt-based object detection
Features
- ✅ SAM mask decoder finetuning
- ✅ Grounding DINO text-conditioned detection
- ✅ CLI-based training (easy to use)
- ✅ COCO dataset support
- ✅ Balanced dataset sampling
- ✅ Mixed precision (AMP)
- ✅ Auto checkpoint saving
- ✅ Sample prediction visualization
Installation
🔹 Clone & install locally
git clone https://github.com/pratim4dasude/finetuning_grounded_dino_sam.git
cd finetuning_grounded_dino_sam
pip install -e .
Dataset Format
Your dataset must follow COCO format:
dataset_root/
│
├── train/
│   ├── _annotations.coco.json
│   ├── image1.jpg
│   └── image2.jpg
│
└── test/
    ├── _annotations.coco.json
    ├── image1.jpg
    └── image2.jpg
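Before training, it can help to confirm that each split matches the layout above. The following is a minimal sketch, not part of this tool: `summarize_coco_split` is a hypothetical helper that reads `_annotations.coco.json` with the standard COCO keys (`images`, `annotations`, `categories`) and reports images that are listed but missing on disk.

```python
import json
from pathlib import Path


def summarize_coco_split(split_dir):
    """Return basic counts for one split and list images missing on disk.

    Hypothetical helper for illustration; it is not shipped with the CLI.
    """
    split_dir = Path(split_dir)
    with open(split_dir / "_annotations.coco.json", encoding="utf-8") as f:
        coco = json.load(f)

    # Standard COCO top-level keys: "images", "annotations", "categories".
    images = coco.get("images", [])
    missing = [
        img["file_name"]
        for img in images
        if not (split_dir / img["file_name"]).is_file()
    ]
    return {
        "images": len(images),
        "annotations": len(coco.get("annotations", [])),
        "categories": [c["name"] for c in coco.get("categories", [])],
        "missing_files": missing,
    }
```

Calling `summarize_coco_split("dataset_root/train")` and `summarize_coco_split("dataset_root/test")` gives a quick view of both splits; a non-empty `missing_files` list usually means the images and the annotation file are out of sync.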
Usage
🔹 1. Finetune Grounding DINO
python -m finetuning.cli dino --dataset_root data/Dataset_oRobot \
--output_dir data/grounding_dino_test \
--text_labels person cat wall \
--image_size 512 \
--batch_size 2 \
--grad_accum_steps 4 \
--num_epochs 10 \
--learning_rate 1e-5 \
--weight_decay 1e-4 \
--num_workers 0 \
--max_grad_norm 1.0 \
--train_sample_limit 500 \
--test_sample_limit 100 \
--seed 42
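The interaction between `--batch_size`, `--grad_accum_steps`, and `--train_sample_limit` is easy to misjudge, so here is a small arithmetic sketch. It assumes the usual meaning of gradient accumulation (one optimizer step per `grad_accum_steps` batches, with a final partial group still triggering a step); the README does not confirm how the tool handles the last partial group, and both function names are hypothetical.

```python
import math


def effective_batch_size(batch_size, grad_accum_steps):
    """Samples that contribute to each optimizer step."""
    return batch_size * grad_accum_steps


def optimizer_steps_per_epoch(num_samples, batch_size, grad_accum_steps):
    """Optimizer steps per epoch, assuming partial groups still step."""
    batches = math.ceil(num_samples / batch_size)
    return math.ceil(batches / grad_accum_steps)


# Values taken from the Grounding DINO command above.
print(effective_batch_size(2, 4))            # 8
print(optimizer_steps_per_epoch(500, 2, 4))  # 63
```

Under these assumptions, the example command trains with an effective batch of 8 images while holding only 2 in memory at a time.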
🔹 2. Finetune SAM
python -m finetuning.cli sam --data_root data/cracks_cleaned \
--output_dir data/sam_test \
--model_name facebook/sam-vit-base \
--max_train_samples 1000 \
--max_test_samples 300 \
--resize_to 512 \
--batch_size 2 \
--num_epochs 10 \
--lr 1e-5 \
--weight_decay 1e-4 \
--num_workers 0 \
--box_jitter 10 \
--seed 42
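The `--box_jitter` flag is not explained in this README. A common technique when fine-tuning SAM with box prompts is to shift each corner of the ground-truth box by a few random pixels so the model does not rely on perfectly tight boxes. The sketch below illustrates that idea only; it assumes the value is a maximum shift in pixels, and `jitter_box` is a hypothetical function, not the tool's implementation.

```python
import random


def jitter_box(box, max_shift, width, height, rng=random):
    """Shift each coordinate of (x_min, y_min, x_max, y_max) by up to max_shift pixels.

    Assumption: jitter is uniform in [-max_shift, max_shift] per coordinate.
    """
    jittered = [v + rng.randint(-max_shift, max_shift) for v in box]
    # Clamp every coordinate to the image bounds.
    x0 = min(max(jittered[0], 0), width)
    y0 = min(max(jittered[1], 0), height)
    x1 = min(max(jittered[2], 0), width)
    y1 = min(max(jittered[3], 0), height)
    # Keep the corners ordered in case a small box flipped after jittering.
    return (min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1))


rng = random.Random(42)  # seeded for reproducibility, mirroring --seed 42
print(jitter_box((100, 120, 300, 340), 10, 512, 512, rng))
```

Passing a seeded `random.Random` instance keeps the perturbation reproducible across runs.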
Outputs
After training, your output_dir will contain:
output_dir/
│
├── epoch_1/ # Optional checkpoints
├── best_model/ # Best model weights
├── train_log.txt # Training logs
└── sample_prediction.jpg # Visualization
Metrics
Grounding DINO
- Train Loss
- Test Loss
- Detection results
SAM
- IoU
- Dice Score
- Mask quality
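For reference, IoU and Dice for binary masks follow the standard definitions: IoU = |A ∩ B| / |A ∪ B| and Dice = 2|A ∩ B| / (|A| + |B|). The sketch below computes both on plain nested lists; it is an illustration of the definitions, not the tool's own metric code, and the convention of returning 1.0 when both masks are empty is an assumption.

```python
def iou_and_dice(pred, target):
    """Compute (IoU, Dice) for two binary masks given as 2D lists of 0/1."""
    inter = pred_area = target_area = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p, t = bool(p), bool(t)
            inter += p and t
            pred_area += p
            target_area += t
    union = pred_area + target_area - inter
    if union == 0:
        # Both masks are empty; treat as a perfect match (assumed convention).
        return 1.0, 1.0
    return inter / union, 2 * inter / (pred_area + target_area)


pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
iou, dice = iou_and_dice(pred, target)
print(f"IoU={iou:.3f} Dice={dice:.3f}")  # IoU=0.500 Dice=0.667
```

Dice is always at least as large as IoU for the same pair of masks, so the two numbers are not directly comparable across reports.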
Notes
❗ Do NOT use very small image sizes (<256) for Grounding DINO.
Use quotes for multi-word labels:
--text_labels person cat "wall toys"
Install PyTorch (GPU)
Install manually based on your system:
👉 https://pytorch.org/get-started/locally/
Example (CUDA 12.4):
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
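After installing, a quick check confirms whether PyTorch can see the GPU. This is a minimal sketch using standard PyTorch calls (`torch.cuda.is_available`, `torch.version.cuda`, `torch.cuda.get_device_name`); the helper name `describe_torch_install` is hypothetical.

```python
import importlib.util


def describe_torch_install():
    """Return a one-line summary of the local PyTorch install."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed"
    import torch

    if torch.cuda.is_available():
        return (
            f"PyTorch {torch.__version__} with CUDA {torch.version.cuda} "
            f"on {torch.cuda.get_device_name(0)}"
        )
    return f"PyTorch {torch.__version__} (CPU only)"


print(describe_torch_install())
```

If the summary reports a CPU-only build on a machine with an NVIDIA GPU, the wheel was probably installed from the default index rather than the CUDA index URL shown above.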
Development Mode
Run without installing:
python -m finetuning.cli dino --dataset_root data --output_dir out
Tech Stack
| Tool | Purpose |
|---|---|
| PyTorch | Deep learning backbone |
| Hugging Face Transformers | Model hub & utilities |
| SAM (Meta) | Segmentation model |
| Grounding DINO | Text-conditioned detection |
| COCO API | Dataset handling |