FlashDet: Ultra-lightweight real-time object detection with LoRA fine-tuning and Knowledge Distillation

FlashDet

Ultra-lightweight real-time object detection with advanced training methods, LoRA fine-tuning, tracking, and analytics

What is FlashDet?

FlashDet is an end-to-end object detection framework built for speed, accuracy, and extensibility. The core FlashDet model features a dual detection head (NMS-free one-to-one + dense one-to-many), STAL (Small-Target-Aware Label Assignment), ProgLoss (Progressive Loss Balancing), and the MuSGD (Muon+SGD hybrid) optimizer.

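The framework supports six training methods (standard, knowledge distillation, self-supervised, semi-supervised, few-shot, and active learning), all through a unified, registry-based, pluggable design.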

Training Pipeline:
  Dataset → Augmentation → FlashDet Model
    ├── Classification Loss (BCE)
    ├── Box Loss (CIoU + L1, ProgLoss weighted)
    └── STAL Assignment
        → MuSGD → Updated Weights

Model Sizes

Model	Backbone	Params (Inference)	FP16 Size	Notes
FlashDet-P (Pico)	LiteBackbone-0.5x / PicoBackbone + PicoNeck	~298K	0.57 MB	Sub-1MB, depthwise heads
FlashDet-N (Nano)	FlashBackbone (w=0.25, d=0.33)	~1.06M	2.01 MB	Lightweight
FlashDet-S (Small)	FlashBackbone (w=0.50, d=0.33)	~5.4M	10.3 MB	Balanced
FlashDet-M (Medium)	FlashBackbone (w=1.00, d=0.67)	~18M	34.3 MB	High accuracy
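
The FP16 sizes in the table appear consistent with two bytes per parameter, reported in binary megabytes (2**20 bytes). That relationship is inferred from the numbers rather than stated by the project, and the helper name `fp16_size_mib` is hypothetical. A quick check, as a sketch:

```python
def fp16_size_mib(num_params: int) -> float:
    """Approximate FP16 weight size: 2 bytes per parameter, in MiB (2**20 bytes)."""
    return num_params * 2 / 2**20


# Parameter counts are the rounded values from the table above.
table = [("P", 298_000), ("N", 1_060_000), ("S", 5_400_000), ("M", 18_000_000)]
for name, params in table:
    print(f"FlashDet-{name}: {fp16_size_mib(params):.2f} MiB")
```

Small differences from the table (for example on the Nano row) are expected because the listed parameter counts are rounded.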

FlashDet-P (Pico) is designed for extreme edge deployment (microcontrollers, mobile, browser). It uses:

LiteBackbone-0.5x with pretrained weights (channel mixing + depthwise convolutions)
PicoNeck with 64-ch output (lightweight modules for efficient feature generation)
Depthwise-separable E2E dual head (DW-conv + pointwise instead of full convolutions)
Same STAL + ProgLoss training recipe as larger variants

Installation

pip (recommended)

pip install flashdet

# With all extras (tracking, analytics, ONNX export)
pip install "flashdet[all]"

From source (for development)

git clone https://github.com/FlashVision/FlashDet.git
cd FlashDet
pip install -e ".[all]"

Optional extras

pip install -e ".[export]"      # ONNX export support
pip install -e ".[tracker]"     # FlashTracker, MotionTracker, AppearanceTracker
pip install -e ".[solutions]"   # Counting, speed, heatmaps
pip install -e ".[analytics]"   # Benchmarking, plots
pip install -e ".[all]"         # Everything

Verify installation

flashdet check       # runs full health check
flashdet settings    # shows Python, PyTorch, CUDA, GPU info
flashdet version     # prints version

Usage

Python API

from flashdet import FlashDet, Trainer

# Build sub-1MB Pico model for edge deployment
pico = FlashDet(num_classes=80, size="p")
print(pico.get_model_info())  # inference_fp16_mb: 0.57

# Build with reparameterizable backbone
pico_v2 = FlashDet(num_classes=80, size="p", backbone_type="repnext")

# Build larger model
model_n = FlashDet(num_classes=80, size="n")

# Train
trainer = Trainer(
    model_size="p",   # "p" (Pico), "n", "s", "m", "l", "x"
    train_images="data/train",
    val_images="data/val",
    epochs=100,
    device="cuda",
)
trainer.train()

CLI

# Train (use --model-size p for Pico, n for Nano, s for Small, etc.)
flashdet train --model-size p --epochs 100 --device cuda \
  --train-images data/train --val-images data/val

# Validate
flashdet val --model best.pth --val-images data/val

Standalone Scripts

# Full training with LoRA
python train.py --lora --lora-rank 8 --epochs 50 --device cuda

# Inference
python test.py --model best.pth --image photo.jpg

Training Methods

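The table below lists five training paradigms, each with a dedicated trainer class and CLI script. Knowledge distillation is implemented separately in `flashdet/engine/training/kd_trainer.py` (see Project Structure).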

Method	Trainer Class	CLI Script	Description
Standard	`Trainer`	`train.py`	Full supervised training with all augmentations
Self-Supervised (SSL)	`SSLTrainer`	`scripts/train_ssl.py`	BYOL pretraining on unlabeled data
Semi-Supervised	`SemiSupervisedTrainer`	`scripts/train_semi_supervised.py`	Teacher-student with pseudo-labels
Few-Shot	`FewShotTrainer`	`scripts/train_few_shot.py`	Learn from very few labeled examples
Active Learning	`ActiveLearningTrainer`	`scripts/train_active_learning.py`	Intelligently select samples for labeling

Self-Supervised Pretraining

python scripts/train_ssl.py \
  --method byol \
  --data-dir path/to/unlabeled/images \
  --epochs 100 --backbone-size n

Semi-Supervised Learning

python scripts/train_semi_supervised.py \
  --train-images data/train \
  --unlabeled-dir path/to/unlabeled/images \
  --pseudo-threshold 0.7
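
To illustrate what a pseudo-label threshold does in a teacher-student setup, here is a minimal sketch. It assumes the teacher emits per-box confidence scores and that boxes at or above the threshold are kept; the `Detection` type and `filter_pseudo_labels` helper are hypothetical and not part of the FlashDet API.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    box: tuple      # (x1, y1, x2, y2)
    score: float    # teacher confidence in [0, 1]
    cls: int        # predicted class index


def filter_pseudo_labels(detections, threshold=0.7):
    """Keep only teacher predictions confident enough to act as labels."""
    return [d for d in detections if d.score >= threshold]


teacher_out = [
    Detection((10, 10, 50, 50), 0.92, 0),
    Detection((20, 30, 60, 90), 0.55, 1),  # below threshold, discarded
    Detection((5, 5, 25, 25), 0.70, 2),
]
pseudo = filter_pseudo_labels(teacher_out, threshold=0.7)
print([d.cls for d in pseudo])  # [0, 2]
```

A higher threshold yields fewer but cleaner pseudo-labels; a lower one adds more training signal at the cost of label noise.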

Few-Shot Learning

python scripts/train_few_shot.py \
  --base-checkpoint path/to/base.pth \
  --n-shot 10 --freeze-backbone

Active Learning

python scripts/train_active_learning.py \
  --train-images data/train \
  --unlabeled-pool path/to/unlabeled/images \
  --query-strategy entropy --budget 50 --rounds 5
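
As a rough illustration of an entropy-based query strategy, the sketch below ranks unlabeled images by the Shannon entropy of a per-image class distribution and picks the most uncertain ones up to a budget. How FlashDet derives that distribution from detections is not documented here, so the per-image probability vectors and both function names are assumptions.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_labeling(pool, budget):
    """Return the `budget` image ids whose predictions are most uncertain."""
    ranked = sorted(pool, key=lambda item: entropy(item[1]), reverse=True)
    return [image_id for image_id, _ in ranked[:budget]]


pool = [
    ("img_a.jpg", [0.98, 0.01, 0.01]),  # confident prediction
    ("img_b.jpg", [0.34, 0.33, 0.33]),  # nearly uniform, most uncertain
    ("img_c.jpg", [0.60, 0.30, 0.10]),
]
selected = select_for_labeling(pool, budget=2)
print(selected)  # ['img_b.jpg', 'img_c.jpg']
```

In a multi-round setup, the selected images would be labeled, moved into the training set, and the model retrained before the next query.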

LoRA / QLoRA Fine-Tuning

Parameter-efficient fine-tuning: freeze the backbone and train only low-rank adapters.

# LoRA (6 variants: standard, dora, lora_plus, adalora, ortho, lora_fa)
python train.py --lora --lora-variant dora --lora-rank 8 --lora-alpha 16

# QLoRA (quantized base weights + LoRA)
python train.py --qlora --qlora-dtype nf4 --lora-rank 8
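
For intuition about `--lora-rank` and `--lora-alpha`: standard LoRA adds a low-rank update `(alpha / rank) * B @ A` to a frozen weight, so only the two small factors are trained. The sketch below counts trainable parameters for a hypothetical 256-to-256 pointwise convolution; the layer size is illustrative, and the other variants (DoRA, AdaLoRA, and so on) may add parameters beyond this.

```python
def lora_trainable_params(in_features: int, out_features: int, rank: int) -> int:
    """Parameters in the two low-rank factors A (rank x in) and B (out x rank)."""
    return rank * in_features + out_features * rank


def lora_scale(alpha: float, rank: int) -> float:
    """Standard LoRA scaling applied to the low-rank update B @ A."""
    return alpha / rank


# Hypothetical 256 -> 256 pointwise (1x1) convolution, viewed as a 256x256 matrix.
full = 256 * 256
adapter = lora_trainable_params(256, 256, rank=8)
print(full, adapter, adapter / full)   # 65536 4096 0.0625
print(lora_scale(alpha=16, rank=8))    # 2.0
```

With rank 8, the adapter trains about 6% of the parameters of this layer while the original weight stays frozen.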

Mixed Precision & Multi-GPU

python train.py --amp --multi-gpu --device cuda

Core Components

STAL (Small-Target-Aware Label Assignment)

Task-Aligned Assignment with small-target protection — temporarily expands tiny GT boxes during candidate selection so small objects always get positive anchor supervision.
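
The box-expansion idea can be sketched as follows. This is only an illustration under assumptions: the `min_size` value and the function name are hypothetical, and FlashDet's actual expansion rule is not specified in this document.

```python
def expand_small_box(box, min_size=8.0):
    """Grow a box around its center so width and height are at least min_size."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    w = max(x2 - x1, min_size)
    h = max(y2 - y1, min_size)
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


tiny = expand_small_box((100, 100, 103, 104))   # 3x4 box grows to 8x8
large = expand_small_box((0, 0, 50, 40))        # already large, unchanged
print(tiny)
print(large)
```

In such a scheme the expanded box would be used only when choosing candidate anchors; regression targets would still come from the original ground-truth box.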

ProgLoss (Progressive Loss Balancing)

Linearly shifts training emphasis from the dense one-to-many head (exploration) to the NMS-free one-to-one head (refinement) over the course of training: alpha(t): 1.0 → 0.0.
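
A minimal sketch of such a schedule, assuming the two head losses are mixed as a convex combination weighted by alpha. The source only states that alpha moves linearly from 1.0 to 0.0; the mixing formula and function names here are assumptions.

```python
def progloss_alpha(step: int, total_steps: int) -> float:
    """Linear schedule from 1.0 at the first step to 0.0 at the last step."""
    if total_steps <= 1:
        return 0.0
    t = min(max(step, 0), total_steps - 1) / (total_steps - 1)
    return 1.0 - t


def combine(loss_o2m: float, loss_o2o: float, alpha: float) -> float:
    """Weight the one-to-many head by alpha and the one-to-one head by 1 - alpha."""
    return alpha * loss_o2m + (1.0 - alpha) * loss_o2o


for step in (0, 50, 100):
    a = progloss_alpha(step, total_steps=101)
    print(step, a, combine(2.0, 4.0, a))
```

Early in training the dense head dominates the gradient; by the final step only the NMS-free head contributes.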

MuSGD (Muon + SGD Hybrid Optimizer)

Applies Muon-style orthogonal updates to multi-dimensional parameters (conv weights, attention) while using standard SGD for 1D parameters (biases, norms), combining faster convergence with training stability.
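
The parameter routing described above can be sketched by splitting parameters on their number of dimensions. The parameter names and shapes below are hypothetical, and this shows only the grouping step, not the Muon update itself.

```python
def split_param_groups(param_shapes):
    """Route parameters with 2+ dimensions to Muon and 1D parameters to SGD."""
    muon, sgd = [], []
    for name, shape in param_shapes.items():
        (muon if len(shape) >= 2 else sgd).append(name)
    return muon, sgd


shapes = {
    "backbone.conv1.weight": (16, 3, 3, 3),  # conv kernel
    "backbone.bn1.weight": (16,),            # norm scale
    "backbone.bn1.bias": (16,),              # norm bias
    "head.cls.weight": (80, 64, 1, 1),       # pointwise conv kernel
    "head.cls.bias": (80,),
}
muon_names, sgd_names = split_param_groups(shapes)
print("muon:", muon_names)
print("sgd: ", sgd_names)
```

With real PyTorch modules the same split could be made by checking `p.ndim` for each entry of `model.named_parameters()`.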

E2E Detection Loss

Combines CIoU box loss, BCE classification loss, and L1 regression loss across both dual heads, weighted by the ProgLoss schedule.
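
For reference, the CIoU term for a single box pair can be written in plain Python as below. This follows the standard Complete-IoU definition (IoU, normalized center distance, and an aspect-ratio term); how FlashDet weights it against the BCE and L1 terms is not reproduced here, and the function name is illustrative.

```python
import math


def ciou_loss(pred, target, eps=1e-7):
    """Complete-IoU loss for two (x1, y1, x2, y2) boxes."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    iou = inter / (pw * ph + tw * th - inter + eps)

    # Squared center distance over squared diagonal of the enclosing box
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)
    c2 = cw ** 2 + ch ** 2 + eps
    rho2 = ((px1 + px2 - tx1 - tx2) ** 2 + (py1 + py2 - ty1 - ty2) ** 2) / 4

    # Aspect-ratio consistency term
    v = (4 / math.pi ** 2) * (math.atan(tw / (th + eps)) - math.atan(pw / (ph + eps))) ** 2
    alpha = v / (1 - iou + v + eps)
    return 1 - iou + rho2 / c2 + alpha * v


print(ciou_loss((0, 0, 10, 10), (0, 0, 10, 10)))      # close to 0 for identical boxes
print(ciou_loss((0, 0, 10, 10), (20, 20, 30, 30)))    # above 1 for disjoint boxes
```

Unlike plain IoU loss, the center-distance term still provides a gradient when the boxes do not overlap.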

Solutions

Built-in high-level applications for real-world use cases:

from flashdet.solutions import ObjectCounter, SpeedEstimator, Heatmap
from flashdet.trackers import FlashTracker

tracker = FlashTracker()
# Solutions integrate with any detection model for real-world applications

Solution	Description
ObjectCounter	Count objects crossing lines or entering regions
SpeedEstimator	Estimate real-world speed from tracked objects
Heatmap	Visualize detection density over time
RegionCounter	Count objects in polygon zones
QueueManager	Monitor queue lengths and wait times
DistanceCalculator	Measure real-world distances between objects
ParkingManager	Track parking spot occupancy
SecurityAlarm	Alert on intrusions into restricted zones
WorkoutMonitor	Track exercise repetitions and form
LiveInference	Real-time webcam/stream detection
AnalyticsDashboard	Aggregated detection statistics and visualization

Trackers

Multi-object tracking with persistent IDs across frames:

from flashdet.trackers import FlashTracker, MotionTracker, AppearanceTracker

tracker = FlashTracker(max_age=30, min_hits=3, iou_threshold=0.3)
tracks = tracker.update(detections)  # [x1,y1,x2,y2,track_id,score,cls]

Tracker	Method	Best For
FlashTracker	IoU + Kalman filter	General purpose, fast
MotionTracker	Kalman + Hungarian matching	Speed-critical applications
AppearanceTracker	Appearance + motion fusion	Crowded scenes, re-identification
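
To show what an `iou_threshold` controls during association, here is a simplified greedy IoU matcher. It is a sketch, not FlashTracker's implementation: production trackers typically predict track positions with a Kalman filter first and often use Hungarian matching instead of a greedy pass. All names below are hypothetical.

```python
def iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_match(tracks, detections, iou_threshold=0.3):
    """Pair track boxes with detection boxes, highest IoU first."""
    pairs = sorted(
        ((iou(t, d), ti, di)
         for ti, t in enumerate(tracks)
         for di, d in enumerate(detections)),
        reverse=True,
    )
    used_t, used_d, matches = set(), set(), []
    for score, ti, di in pairs:
        if score < iou_threshold:
            break  # remaining pairs overlap too little to be the same object
        if ti not in used_t and di not in used_d:
            matches.append((ti, di))
            used_t.add(ti)
            used_d.add(di)
    return sorted(matches)


tracks = [(0, 0, 10, 10), (50, 50, 60, 60)]
detections = [(51, 50, 61, 60), (1, 0, 11, 10), (200, 200, 210, 210)]
matches = greedy_match(tracks, detections)
print(matches)  # [(0, 1), (1, 0)]
```

The unmatched third detection would start a new tentative track, which is where parameters like `min_hits` and `max_age` come in.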

Analytics

from flashdet.analytics import Benchmark, Profiler

bench = Benchmark(model_path="best.pth", device="cuda")
results = bench.run()  # {'fps': ..., 'latency_ms': ..., 'params': ..., ...}

profiler = Profiler(model_path="best.pth")
profiler.run()  # prints per-layer timing breakdown

Training Callbacks

Extend the training loop without modifying source code:

from flashdet import Trainer
from flashdet.engine.core.callbacks import EarlyStopping, CSVLogger, TensorBoardCallback

trainer = Trainer(model_size="n", train_images="data/train", val_images="data/val")

trainer.add_callback(EarlyStopping(patience=20, metric="val_mAP"))
trainer.add_callback(CSVLogger("metrics.csv"))
trainer.add_callback(TensorBoardCallback("runs/exp1"))

trainer.train()

Built-in callbacks: EarlyStopping, CSVLogger, TensorBoardCallback, LRSchedulerCallback.
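
The patience logic behind an early-stopping callback can be sketched independently of the training loop. The class below is a hypothetical stand-in, not FlashDet's `EarlyStopping`, and it assumes a metric where higher is better (such as mAP).

```python
class PatienceMonitor:
    """Signals a stop after `patience` epochs without improvement."""

    def __init__(self, patience: int = 20, min_delta: float = 0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, value: float) -> bool:
        """Record one epoch's metric; return True when training should stop."""
        if value > self.best + self.min_delta:
            self.best = value
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


monitor = PatienceMonitor(patience=2)
history = [0.30, 0.35, 0.34, 0.35, 0.40]
stopped_at = next((i for i, v in enumerate(history) if monitor.step(v)), None)
print(stopped_at)  # 3: two epochs in a row failed to beat 0.35
```

Note that a short patience can stop training just before a later improvement, as the final value in `history` shows.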

Registry System

FlashDet uses a pluggable registry for all major components. Adding a new architecture, backbone, head, or loss is as simple as decorating your class:

import torch.nn as nn

from flashdet.registry import DETECTORS, BACKBONES, HEADS

# Register a custom detector under a string key
@DETECTORS.register("MyDetector")
class MyDetector(nn.Module):
    ...

# Later, build from config
model = DETECTORS.build("MyDetector", num_classes=80)

Available registries: `DETECTORS`, `BACKBONES`, `NECKS`, `HEADS`, `LOSSES`, `DATASETS`, `TRANSFORMS`, `TRACKERS`.
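
For readers unfamiliar with the pattern, a registry of this kind can be approximated in a few lines. The sketch below is a generic illustration and not FlashDet's `flashdet.registry` implementation; `Registry`, `DEMO_DETECTORS`, and `TinyDetector` are hypothetical names.

```python
class Registry:
    """Maps string names to classes and builds instances on request."""

    def __init__(self, name: str):
        self.name = name
        self._classes = {}

    def register(self, key: str):
        def decorator(cls):
            if key in self._classes:
                raise KeyError(f"{key!r} is already registered in {self.name}")
            self._classes[key] = cls
            return cls  # return the class unchanged so it stays importable
        return decorator

    def build(self, key: str, **kwargs):
        if key not in self._classes:
            raise KeyError(f"{key!r} not found in {self.name}")
        return self._classes[key](**kwargs)


DEMO_DETECTORS = Registry("detectors")


@DEMO_DETECTORS.register("TinyDetector")
class TinyDetector:
    def __init__(self, num_classes: int):
        self.num_classes = num_classes


model = DEMO_DETECTORS.build("TinyDetector", num_classes=80)
print(type(model).__name__, model.num_classes)  # TinyDetector 80
```

Because components are looked up by string, a config file can select a backbone or head without the training code importing it directly.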

Examples

Ready-to-run scripts in examples/:

Script	What it does
`train_custom_dataset.py`	Train on your own COCO-format dataset
`train_with_lora.py`	LoRA fine-tuning (DoRA variant)

cd examples
python train_custom_dataset.py

Project Structure

FlashDet/
├── flashdet/                        # Main package
│   ├── __init__.py                  # Public API
│   ├── cli.py                       # CLI entry point
│   ├── registry.py                  # Pluggable component registry
│   ├── cfg/                         # Configuration
│   ├── data/                        # Datasets, loaders, transforms, download
│   ├── engine/
│   │   ├── core/                    # Callbacks, EMA, MuSGD optimizer
│   │   ├── training/                # All training paradigms
│   │   │   ├── trainer.py           # Standard Trainer
│   │   │   ├── kd_trainer.py        # Knowledge Distillation
│   │   │   ├── ssl_trainer.py       # Self-Supervised Learning
│   │   │   ├── semi_supervised_trainer.py
│   │   │   ├── few_shot_trainer.py
│   │   │   └── active_learning_trainer.py
│   │   └── evaluation/              # Validator
│   ├── models/
│   │   ├── architectures/
│   │   │   └── flashdet.py          # FlashDet + FlashDetPico
│   │   ├── backbone/                # LiteBackbone, PicoBackbone, FlashBackbone
│   │   ├── neck/                    # PicoNeck, YOLO necks
│   │   ├── head/                    # E2E dual detection head
│   │   ├── layers/                  # ConvBlock, PicoBlock, SpatialPool, RepNeXt blocks
│   │   ├── assignment/              # STAL
│   │   ├── detector.py              # build_model() factory
│   │   └── lora.py                  # LoRA / QLoRA (6 variants)
│   ├── losses/
│   │   ├── e2e_loss.py              # E2E dual-head loss + ProgLoss
│   │   └── kd_loss.py               # Knowledge distillation losses
│   ├── utils/                       # Metrics, visualization, checkpoints
│   ├── trackers/                    # SORT, ByteTrack, BoT-SORT, DeepSORT, OC-SORT, StrongSORT
│   ├── solutions/                   # 17 ready-to-use vision solutions
│   └── analytics/                   # Benchmark, profiling, plots
├── scripts/                         # Training scripts (SSL, few-shot, etc.)
├── examples/                        # Ready-to-run example scripts
├── tests/                           # Unit & integration tests (pytest)
├── docs/                            # Documentation
├── docker/                          # Dockerfile + docker-compose
├── train.py                         # Main training entry point
├── test.py                          # Main inference entry point
└── pyproject.toml                   # Package configuration

Docker

# Build
docker build -t flashdet -f docker/Dockerfile .

# Run inference
docker run --gpus all -v $(pwd)/data:/app/data flashdet \
  predict --model best.pth --source data/test.jpg

# Or use docker-compose
cd docker && docker compose up

Supported Formats

Import	Export
COCO JSON	ONNX
TXT labels	FP16 weights
Pascal VOC XML	TorchScript

Documentation

Full documentation is in the docs/ folder:

Document	Description
Installation	Detailed installation guide
LoRA Fine-Tuning	LoRA/QLoRA variants and usage
Trackers	Multi-object tracking guide
FAQ	Frequently asked questions
Changelog	Version history

Contributing

We welcome contributions!

git clone https://github.com/FlashVision/FlashDet.git
cd FlashDet
pip install -e ".[dev,all]"
pytest tests/
ruff check flashdet/
flashdet check

License

MIT License — see LICENSE for details.

FlashVision — Open-source lightweight AI

Release history

1.2.1 (Jun 27, 2026)
1.2.0 (Jun 27, 2026, this version)

Download files

Source Distribution

flashdet-1.2.0.tar.gz (181.8 kB)

Uploaded Jun 27, 2026 Source

Built Distribution

flashdet-1.2.0-py3-none-any.whl (248.4 kB)

Uploaded Jun 27, 2026 Python 3

File details

Details for the file flashdet-1.2.0.tar.gz.

File metadata

Download URL: flashdet-1.2.0.tar.gz
Upload date: Jun 27, 2026
Size: 181.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for flashdet-1.2.0.tar.gz
Algorithm	Hash digest
SHA256	`3cc3e6f29e85d86b51b70848dda4de42be9a09e371e892a96c932f3e202eb1d4`
MD5	`10641a5ee51dd78f2adb57c4f453d0a4`
BLAKE2b-256	`3969fcd5013e9e896299de267289c709a95258bcff2f33b9d56825311d461285`

File details

Details for the file flashdet-1.2.0-py3-none-any.whl.

File metadata

Download URL: flashdet-1.2.0-py3-none-any.whl
Upload date: Jun 27, 2026
Size: 248.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for flashdet-1.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ba4826e15b774a71598e4ed86da9c266db6b3eb895aa00726b885fa37788a5c6`
MD5	`35cd29c5a270f1b94a99b1db2095e342`
BLAKE2b-256	`2a504a6e451f3de94b9326e4712056c5b15325e4129127f7d474121082cc6334`
