Training and inference templates based on the D-FINE architecture.
Sinapsis D-FINE
Templates for training and inference with the D-FINE model
🐍 Installation • 🚀 Features • 📚 Usage example • 🌐 Webapp • 📙 Documentation • 🔍 License
The Sinapsis D-FINE module provides templates for training and inference with the D-FINE model, enabling advanced object detection tasks.
🐍 Installation
Install using your package manager of choice. We encourage the use of uv.
Example with uv:
uv pip install sinapsis-dfine --extra-index-url https://pypi.sinapsis.tech
or with raw pip:
pip install sinapsis-dfine --extra-index-url https://pypi.sinapsis.tech
🚀 Features
Templates Supported
The Sinapsis D-FINE module provides two main templates for inference and training:
DFINETraining: A highly flexible template for fine-tuning D-FINE models on custom data. It is designed for rapid setup while still offering deep control.
- Effortless Setup: Automatically infers class labels directly from the dataset, eliminating the need to manually create id2label maps.
- Flexible Data Sources: Seamlessly loads datasets from both local directories and the Hugging Face Hub.
- Adaptable to Your Data: Easily adapts to different dataset schemas by allowing users to specify custom keys for annotations (bbox, category, etc.) via the annotation_keys attribute.
- Powerful Customization: Provides granular control over every aspect of training through structured Pydantic models for hyperparameters, data mapping, and more.
DFINEInference: A streamlined and efficient template for running trained D-FINE models.
- High-Performance: Processes images in batches for maximum throughput on the target hardware.
- Structured Output: Generates clear, structured annotations for each image, including bounding boxes, confidence scores, and class labels, ready for downstream tasks.
🌍 General Attributes
Both templates share the following attributes:
- model_path (str, optional): The model identifier from the Hugging Face Hub or a local path to the model and processor files. Defaults to "ustc-community/dfine-nano-coco".
- model_cache_dir (str, optional): Directory to cache downloaded model files. Defaults to the path specified by the SINAPSIS_CACHE_DIR environment variable.
- threshold (float, required): The confidence score threshold (from 0.0 to 1.0) for filtering detections. For inference, it discards all detections below this value from the final output. For training, it is used on the validation dataset to filter predictions before calculating evaluation metrics.
- device (Literal["auto", "cuda", "cpu"], optional): The hardware device to run the model on. Defaults to "auto", which automatically selects "cuda" if a compatible GPU is available, otherwise falls back to "cpu".
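The effect of the threshold attribute can be illustrated with a minimal sketch. The Detection record and the filter_by_threshold helper below are hypothetical names made up for this illustration and are not part of the sinapsis-dfine API; treating a score exactly equal to the threshold as kept is also an assumption.

```python
from typing import TypedDict


class Detection(TypedDict):
    """Hypothetical detection record used only for this illustration."""

    label: str
    score: float
    bbox: list[float]  # [x, y, width, height]


def filter_by_threshold(detections: list[Detection], threshold: float) -> list[Detection]:
    """Keep detections whose confidence score is at least `threshold`."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    # Assumption: a score equal to the threshold is kept.
    return [det for det in detections if det["score"] >= threshold]


detections: list[Detection] = [
    {"label": "person", "score": 0.91, "bbox": [10.0, 20.0, 50.0, 80.0]},
    {"label": "dog", "score": 0.42, "bbox": [5.0, 5.0, 30.0, 30.0]},
]
kept = filter_by_threshold(detections, threshold=0.5)
print([det["label"] for det in kept])
```

With a threshold of 0.5, only the first detection survives; lowering the threshold keeps more low-confidence boxes at the cost of more false positives.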
Specific Attributes
There are some attributes specific to the templates used:
DFINEInference has one additional attribute:
- batch_size (int, optional): The number of images to process in a single batch. Defaults to 8.
DFINETraining has nine additional attributes:
- training_mode (Literal["fine-tune", "from-scratch"], optional): Specifies the training strategy.
- dataset_path (str, required): Path to the dataset to be loaded.
- id2label (dict[int, str] | None, optional): An optional mapping from class ID to label name. It's recommended to let the template infer this from the dataset. This attribute should only be used as a fallback if the dataset features are non-standard.
- annotation_keys (AnnotationKeys, optional): A configuration object that specifies the dictionary keys for accessing annotation data within the dataset.
  - bbox (str, optional): The dictionary key for the bounding box annotations. Defaults to "bbox".
  - category (str, optional): The dictionary key for the category/class label annotations. Defaults to "category".
  - area (str, optional): The dictionary key for the bounding box area. If not provided, area will be calculated from the bbox. Defaults to "area".
- validation_split_size (float, optional): The proportion of the dataset to reserve for validation. Defaults to 0.15.
- mapping_args (DatasetMappingArgs, optional): Parameters for the dataset preprocessing step.
  - batch_size (int, optional): The batch size for applying transformations. A larger size can speed up preprocessing but requires more RAM. Defaults to 16.
  - num_proc (int, optional): The number of CPU processes to use for mapping. Defaults to 0 (no multiprocessing).
- image_size (TrainingImageSize, optional): The target image size for image resizing.
  - width (int, optional): The target width for image resizing. Defaults to 640.
  - height (int, optional): The target height for image resizing. Defaults to 640.
- training_args (TrainingArgs, optional): A nested configuration object for all Hugging Face Trainer hyperparameters. Refer to the official documentation for the full list of possible arguments.
- save_dir (str, required): Path to the directory where the fine-tuned model will be saved.
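To make the role of annotation_keys more concrete, here is a minimal sketch of how configurable keys could be used to read annotations from a dataset whose field names differ from the defaults. AnnotationKeysSketch and read_annotations are hypothetical names for this illustration; the real AnnotationKeys model and the template's internal logic may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotationKeysSketch:
    """Hypothetical stand-in mirroring the documented defaults of annotation_keys."""

    bbox: str = "bbox"
    category: str = "category"
    area: str = "area"


def read_annotations(objects: dict, keys: AnnotationKeysSketch) -> dict:
    """Return annotations under canonical names, computing area if it is absent."""
    boxes = objects[keys.bbox]
    categories = objects[keys.category]
    areas = objects.get(keys.area)
    if areas is None:
        # Boxes are [x, y, width, height], so area is width * height.
        areas = [width * height for _, _, width, height in boxes]
    return {"bbox": boxes, "category": categories, "area": areas}


# A dataset that stores boxes under "boxes" and labels under "labels".
custom_keys = AnnotationKeysSketch(bbox="boxes", category="labels")
sample = {"boxes": [[22, 34, 100, 150]], "labels": [3]}
print(read_annotations(sample, custom_keys))
```

The point of the design is that the dataset itself does not need to be rewritten: only the key names change.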
📁 Supported Dataset Structure
To ensure compatibility and smooth training, the DFINETraining template relies on a specific dataset structure. This format is inspired by the widely used COCO dataset, making it easy to adapt many existing object detection datasets.
IMPORTANT: The DFINETraining template expects datasets to follow a specific nested (COCO-style) format. This ensures consistency and reliability during the data transformation process.
Each example in your dataset must contain at least two features:
- image: A PIL Image object.
- objects: A dictionary that acts as a container for all annotations related to the image.
The objects dictionary must contain parallel lists for the annotations. The keys for these lists are configurable via the annotation_keys attribute.
Example of a single dataset entry:
{
'image': <PIL.Image object>,
'objects': {
'bbox': [[x, y, width, height], [x, y, width, height], ...],
'category': [label_id_1, label_id_2, ...],
'area': [area_1, area_2, ...] # This is optional and will be calculated if not present
}
}
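Before training, it can help to check that the parallel lists in each entry actually line up. The helper below is a hypothetical sketch (validate_objects is not part of sinapsis-dfine) and only inspects the objects dictionary, so it does not need PIL. Requiring positive width and height is an assumption made for this illustration.

```python
def validate_objects(
    objects: dict,
    bbox_key: str = "bbox",
    category_key: str = "category",
    area_key: str = "area",
) -> list[str]:
    """Return a list of human-readable problems; an empty list means the entry looks fine."""
    boxes = objects.get(bbox_key)
    categories = objects.get(category_key)
    if boxes is None or categories is None:
        return [f"missing required key(s): expected '{bbox_key}' and '{category_key}'"]

    problems: list[str] = []
    if len(boxes) != len(categories):
        problems.append(f"{len(boxes)} boxes but {len(categories)} categories")
    for index, box in enumerate(boxes):
        if len(box) != 4:
            problems.append(f"box {index} has {len(box)} values, expected [x, y, width, height]")
        elif box[2] <= 0 or box[3] <= 0:
            # Assumption: degenerate boxes are treated as errors in this sketch.
            problems.append(f"box {index} has non-positive width or height")
    areas = objects.get(area_key)  # optional list
    if areas is not None and len(areas) != len(boxes):
        problems.append(f"{len(areas)} areas but {len(boxes)} boxes")
    return problems


good = {"bbox": [[22, 34, 100, 150]], "category": [3]}
bad = {"bbox": [[22, 34, 100, 150], [0, 0, 10]], "category": [3]}
print(validate_objects(good))
print(validate_objects(bad))
```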
Preparing a Local Dataset
To load a local dataset of images, the files must be structured with a metadata.jsonl file, which is the standard method for the Hugging Face datasets library.
- The folder structure should be organized as follows:
my_dataset/
|--- train/
|    |--- image1.jpg
|    |--- image2.png
|    |--- metadata.jsonl
|--- validation/
     |--- image3.jpg
     |--- metadata.jsonl
- A metadata.jsonl file must be created. Each line in this file is a JSON object describing one image and its annotations.
Example line in train/metadata.jsonl:
{"file_name": "image1.jpg", "objects": {"bbox": [[22, 34, 100, 150]], "category": [3]}}
- The dataset can be loaded by providing the path to the root folder (my_dataset/). The template will automatically find and parse the metadata.jsonl files.
For more detailed information on creating image datasets for object detection, refer to the official Hugging Face documentation.
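As a rough illustration of the metadata.jsonl format described above, the following sketch writes one JSON object per line using only the standard library. The write_metadata helper and the temporary directory are assumptions made for this example; in practice the file would sit next to the images in each split folder.

```python
import json
import tempfile
from pathlib import Path


def write_metadata(split_dir: Path, records: list[dict]) -> Path:
    """Write one JSON object per line to <split_dir>/metadata.jsonl."""
    split_dir.mkdir(parents=True, exist_ok=True)
    metadata_path = split_dir / "metadata.jsonl"
    with metadata_path.open("w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record) + "\n")
    return metadata_path


records = [
    {"file_name": "image1.jpg", "objects": {"bbox": [[22, 34, 100, 150]], "category": [3]}},
    {"file_name": "image2.png", "objects": {"bbox": [[5, 8, 40, 60]], "category": [1]}},
]

# A temporary directory stands in for my_dataset/ in this example.
with tempfile.TemporaryDirectory() as root:
    path = write_metadata(Path(root) / "train", records)
    print(path.read_text(encoding="utf-8"))
```

Each file_name is relative to the folder that holds the metadata.jsonl file, matching the example line shown above.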
Advanced Configuration
License Validation for Hub Datasets
For commercial safety, the DFINETraining template can validate that datasets from the Hugging Face Hub have a permissive license. The check is controlled by an environment variable and is skipped by default.
ALLOW_UNVETTED_DATASETS:
- Default Behavior (True): By default, the license check is skipped. This is to provide a smooth experience for local development and testing.
- Production Behavior (False): For production environments, this variable must be explicitly set to False to enforce the license validation and ensure only commercially safe datasets are used.
Example (for production):
export ALLOW_UNVETTED_DATASETS=False
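The following sketch shows one way a boolean flag like this could be read from the environment. It is not the template's actual implementation: the allow_unvetted_datasets helper and the set of accepted truthy strings are assumptions made for illustration only.

```python
import os


def allow_unvetted_datasets(default: bool = True) -> bool:
    """Read ALLOW_UNVETTED_DATASETS, falling back to the documented default of True."""
    raw = os.environ.get("ALLOW_UNVETTED_DATASETS")
    if raw is None:
        return default
    # Assumption: these strings count as true; anything else counts as false.
    return raw.strip().lower() in {"1", "true", "yes"}


if allow_unvetted_datasets():
    print("License check skipped (development default).")
else:
    print("License check enforced.")
```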
[!TIP] Use CLI command sinapsis info --example-template-config TEMPLATE_NAME to produce an example Agent config for the Template specified in TEMPLATE_NAME.
For example, for DFINEInference use sinapsis info --example-template-config DFINEInference to produce an example config like:
agent:
  name: my_test_agent
templates:
- template_name: InputTemplate
  class_name: InputTemplate
  attributes: {}
- template_name: DFINEInference
  class_name: DFINEInference
  template_input: InputTemplate
  attributes:
    model_path: ustc-community/dfine-nano-coco
    model_cache_dir: '/path/to/sinapsis/cache'
    threshold: '`replace_me:<class ''float''>`'
    device: auto
    batch_size: 8
📚 Usage example
The following example demonstrates how to use the DFINEInference template for object detection. This setup processes a folder of images, runs inference using the D-FINE model, and saves the results, including detected bounding boxes.
Config
agent:
  name: dfine_inference
  description: "run inferences with D-FINE"
templates:
- template_name: InputTemplate
  class_name: InputTemplate
  attributes: {}
- template_name: FolderImageDatasetCV2
  class_name: FolderImageDatasetCV2
  template_input: InputTemplate
  attributes:
    data_dir: datasets/coco
- template_name: DFINEInference
  class_name: DFINEInference
  template_input: FolderImageDatasetCV2
  attributes:
    model_path: ustc-community/dfine-small-coco
    batch_size: 16
    threshold: 0.5
    device: cuda
- template_name: BBoxDrawer
  class_name: BBoxDrawer
  template_input: DFINEInference
  attributes:
    overwrite: true
    randomized_color: false
- template_name: ImageSaver
  class_name: ImageSaver
  template_input: BBoxDrawer
  attributes:
    root_dir: datasets
    save_dir: output
    extension: png
This configuration defines an agent and a sequence of templates to run object detection with D-FINE.
[!IMPORTANT] The FolderImageDatasetCV2, BBoxDrawer, and ImageSaver templates come from the sinapsis-data-readers, sinapsis-data-visualization, and sinapsis-data-writers packages, respectively. To run this example, make sure those packages are installed.
To run the config, use the CLI:
sinapsis run name_of_config.yml
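The usage config above chains templates through template_input. As a quick local sanity check before running it, the chain can be verified with a few lines of Python. check_template_chain is a hypothetical helper, not a Sinapsis function, and the rule that every template_input must name an earlier template is an assumption of this sketch rather than documented Sinapsis behavior. The list is written by hand here to avoid a YAML dependency.

```python
def check_template_chain(templates: list[dict]) -> list[str]:
    """Report templates whose template_input does not name an earlier template."""
    seen: set[str] = set()
    problems: list[str] = []
    for template in templates:
        name = template["template_name"]
        source = template.get("template_input")
        if source is not None and source not in seen:
            problems.append(f"{name}: template_input '{source}' is not defined earlier")
        seen.add(name)
    return problems


# Same chain as the usage example, reduced to the fields this check needs.
templates = [
    {"template_name": "InputTemplate"},
    {"template_name": "FolderImageDatasetCV2", "template_input": "InputTemplate"},
    {"template_name": "DFINEInference", "template_input": "FolderImageDatasetCV2"},
    {"template_name": "BBoxDrawer", "template_input": "DFINEInference"},
    {"template_name": "ImageSaver", "template_input": "BBoxDrawer"},
]
print(check_template_chain(templates))
```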
🌐 Webapp
The webapps included in this project demonstrate the modularity of the templates, showcasing the capabilities of various object detection models for different tasks.
[!IMPORTANT] To run the app, you first need to clone this repository:
git clone git@github.com:Sinapsis-ai/sinapsis-object-detection.git
cd sinapsis-object-detection
[!NOTE] To enable external app sharing in Gradio, set the following environment variable:
export GRADIO_SHARE_APP=True
[!NOTE] Agent configuration can be changed through the AGENT_CONFIG_PATH env var. You can check the available configurations in each package's configs folder.
[!NOTE] When running the app with the D-FINE model, it defaults to a confidence threshold of 0.5, uses CUDA for acceleration, and employs the nano-sized D-FINE model trained on the COCO dataset. These settings can be customized by modifying the demo.yml file inside the packages/sinapsis_dfine/src/sinapsis_dfine/configs directory and restarting the webapp.
🐳 Docker
IMPORTANT: This docker image depends on the sinapsis-nvidia:base image. Please refer to the official sinapsis instructions to Build with Docker.
- Build the sinapsis-object-detection image:
docker compose -f docker/compose.yaml build
- Start the app container:
docker compose -f docker/compose_apps.yaml up sinapsis-dfine-gradio -d
- Check the status:
docker logs -f sinapsis-dfine-gradio
- The logs will display the URL to access the webapp, e.g.:
Running on local URL: http://127.0.0.1:7860
NOTE: The URL may differ; check the log output.
- To stop the app:
docker compose -f docker/compose_apps.yaml down
💻 UV
To run the webapp using the uv package manager, follow these steps:
- Create the virtual environment and sync the dependencies:
uv sync --frozen
- Install the sinapsis-object-detection package:
uv pip install sinapsis-object-detection[all] --extra-index-url https://pypi.sinapsis.tech
- Run the webapp:
uv run webapps/detection_demo.py
- The terminal will display the URL to access the webapp, e.g.:
Running on local URL: http://127.0.0.1:7860
NOTE: The URL may differ; check the terminal output.
📙 Documentation
Documentation for this and other sinapsis packages is available on the sinapsis website.
Tutorials for different projects within sinapsis are available on the sinapsis tutorials page.
🔍 License
This project is licensed under the AGPLv3 license, which encourages open collaboration and sharing. For more details, please refer to the LICENSE file.
For commercial use, please refer to our official Sinapsis website for information on obtaining a commercial license.