Controllable Image Augmentation framework using Stable Diffusion + ControlNet
CIA: Controllable Image Augmentation
CIA is a Python library for synthetic data augmentation using Stable Diffusion + ControlNet. Generate high-quality synthetic images from real seed images, evaluate their quality, and use them to improve downstream ML models.
Features
- Synthetic image generation using Stable Diffusion controlled by Canny edges, OpenPose, Segmentation, or MediaPipe face features
- Quality metrics -- Frechet Inception Distance (FID), Inception Score (IS), Mahalanobis distance
- Quality-based filtering -- keep only the best synthetic images via top-k, top-p, or threshold filtering
- Auto-captioning -- generate image captions using OpenAI or Ollama vision models
- Multiple interfaces -- Python API, CLI, and Hydra config
Try it now
Run CIA in your browser with Google Colab: no installation required. Open the Quickstart notebook to generate, evaluate, and filter synthetic images in under 15 minutes.
Installation
pip install ciagen
With optional dependencies:
pip install "ciagen[captioning]" # OpenAI/Ollama auto-captioning
pip install "ciagen[training]" # YOLO/classifier training
pip install "ciagen[datasets]" # COCO, Flickr30K, FER, MOCS datasets
pip install "ciagen[all]" # Everything
Development
git clone https://github.com/fennecinspace/ciagen.git
cd ciagen
pip install -e ".[all]"
Docker
./run_and_build_docker_file.sh nvidia
docker exec -it ciagen zsh
Quick Start
Python API
from ciagen import generate, evaluate, filter_generated
# Generate synthetic images
result = generate(
source="data/real/train/images/",
output="data/generated/",
extractor="canny",
sd_model="fennecinspace/sd-v15",
cn_model="lllyasviel/sd-controlnet-canny",
num_per_image=3,
prompt="a person walking in a park",
seed=42,
device="cuda",
)
print(f"Generated {result['total_generated']} images")
# Evaluate quality
scores = evaluate(
real="data/real/train/images/",
generated="data/generated/",
metrics=["fid", "mld"],
feature_extractor="vit",
)
print(f"FID: {scores['dtd']['fid']}")
# Filter to keep the best images
kept = filter_generated(
generated="data/generated/",
method="top-k",
value=100,
)
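The `mld` metric requested above is a per-image Mahalanobis distance. The following is a minimal, standard-library sketch of that idea on toy 2-D features. It is not CIA's implementation: the function name `mahalanobis_2d` and the 2-D restriction are assumptions made for illustration, and real feature vectors (for example from a ViT) are high-dimensional and would normally be handled with a linear algebra library.

```python
import math


def mahalanobis_2d(point, reference):
    """Distance from `point` to the distribution of 2-D `reference` points.

    Hypothetical helper for illustration only; not part of ciagen.
    """
    n = len(reference)
    mean_x = sum(p[0] for p in reference) / n
    mean_y = sum(p[1] for p in reference) / n

    # Sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((p[0] - mean_x) ** 2 for p in reference) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in reference) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in reference) / (n - 1)

    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix is singular")

    dx, dy = point[0] - mean_x, point[1] - mean_y
    # Quadratic form d^T Sigma^{-1} d, using the closed-form 2x2 inverse.
    quad = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(quad)


if __name__ == "__main__":
    real_features = [(0, 0), (2, 0), (0, 2), (2, 2)]
    print(mahalanobis_2d((1, 1), real_features))  # a point at the mean
    print(mahalanobis_2d((3, 1), real_features))  # a point further away
```

A synthetic image whose features sit close to the real distribution gets a small distance, which is why lower is better for this metric.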
CLI
# Generate images
ciagen generate \
--source data/real/train/images/ \
--output data/generated/ \
--extractor canny \
--sd-model fennecinspace/sd-v15 \
--cn-model lllyasviel/sd-controlnet-canny \
--num 3 \
--prompt "a person walking"
# Evaluate quality
ciagen evaluate \
--real data/real/train/images/ \
--generated data/generated/ \
--metrics fid mld
# Filter generated images
ciagen filter \
--generated data/generated/ \
--method top-k \
--value 100
# Auto-caption images
ciagen caption \
--images data/real/train/images/ \
--output data/real/train/captions/ \
--engine ollama \
--model llava
Hydra (Advanced)
python run.py task=gen model.cn_use=lllyasviel_canny prompt.base="a person"
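python run.py task=dtd # distribution-to-distribution metrics (FID, IS)
python run.py task=ptd # point-to-distribution metrics (Mahalanobis distance)
python run.py task=filtering # keep the best synthetic images
python run.py task=mix # combine real and filtered synthetic data
python run.py task=train # train the downstream model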
See ciagen/conf/config.yaml for all configuration options.
Pipeline
The recommended workflow:
real images ──► condition extraction ──► SD + ControlNet ──► synthetic images
                                                                │
real images ──────────────────────────────────────────────► evaluate ──► filter ──► mix ──► train
- Generate -- Extract a control condition (edges, pose, segmentation) from each real image, then generate synthetic variations using Stable Diffusion + ControlNet
- Evaluate -- Compute distribution-level metrics (FID, IS) and per-image metrics (Mahalanobis distance)
- Filter -- Select the best synthetic images based on quality scores
- Mix -- Combine real and filtered synthetic data into a training dataset
- Train -- Train your downstream model (YOLOv8 for detection, InceptionV3 for classification)
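The Filter step can be sketched as follows, given per-image scores where lower is better (as with Mahalanobis distance). This is an illustrative sketch, not the library's code: the function `filter_scores` is hypothetical, and it assumes that `top-p` means "keep the best fraction p of the images", which this README does not spell out.

```python
import math


def filter_scores(scores, method, value):
    """Return the names of images to keep, best first.

    `scores` maps image name -> score, lower is better.
    Hypothetical helper for illustration only; not part of ciagen.
    """
    ranked = sorted(scores, key=scores.get)  # lowest (best) score first

    if method == "top-k":
        # Keep the `value` best images.
        return ranked[: int(value)]
    if method == "top-p":
        # Assumption: keep the best fraction `value`, with 0 < value <= 1.
        keep = math.ceil(len(ranked) * value)
        return ranked[:keep]
    if method == "threshold":
        # Keep every image whose score does not exceed `value`.
        return [name for name in ranked if scores[name] <= value]
    raise ValueError(f"unknown filtering method: {method!r}")


if __name__ == "__main__":
    example = {"a.png": 3.0, "b.png": 1.0, "c.png": 2.0, "d.png": 4.0}
    print(filter_scores(example, "top-k", 2))
    print(filter_scores(example, "top-p", 0.5))
    print(filter_scores(example, "threshold", 3.0))
```

Top-k fixes the size of the kept set, while a threshold fixes the minimum quality, so the number of kept images then depends on how well generation went.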
Available Extractors
| Extractor | Description | Use Case |
|---|---|---|
| canny | Canny edge detection | General purpose, preserves structure |
| openpose | Human pose estimation | People, actions, body pose |
| segmentation | YOLOv8 semantic segmentation | Object boundaries |
| mediapipe_face | MediaPipe face landmarks | Facial emotion, face generation |
Available Metrics
| Metric | Type | Description |
|---|---|---|
| fid | Distribution-to-Distribution | Frechet Inception Distance -- lower is better |
| inception_score | Distribution-to-Distribution | Inception Score -- higher is better |
| mld | Point-to-Distribution | Mahalanobis distance -- per-image, lower is better |
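For reference, the standard textbook definitions of these three metrics are given below. The notation is an assumption made for this sketch: $f(x)$ is the feature vector of image $x$, $(\mu_r, \Sigma_r)$ and $(\mu_g, \Sigma_g)$ are the feature mean and covariance of the real and generated sets, and $p(y \mid x)$ is a classifier's label distribution for a generated image. CIA's implementation may differ in the feature extractor and in numerical details.

```latex
% Frechet Inception Distance between real (r) and generated (g) feature distributions
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)

% Inception Score over generated images
\mathrm{IS} = \exp\!\left( \mathbb{E}_{x \sim p_g}\,
  D_{\mathrm{KL}}\!\left( p(y \mid x) \,\Vert\, p(y) \right) \right)

% Mahalanobis distance of one generated image to the real feature distribution
d_M(x) = \sqrt{ \left( f(x) - \mu_r \right)^{\top} \Sigma_r^{-1} \left( f(x) - \mu_r \right) }
```

FID and IS summarize a whole set of images with one number, whereas $d_M(x)$ scores each image individually, which is what makes it usable for per-image filtering.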
Data Structure
data/
├── real/{dataset}/
│ ├── train/{images,labels,captions}/
│ ├── val/{images,labels,captions}/
│ └── test/{images,labels,captions}/
├── generated/{dataset}/{controlnet-model}/
│ ├── metadata.yaml
│ └── *.png
└── mixed/{dataset}/
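The layout above can be sketched with `pathlib`. The helper names (`real_dirs`, `generated_dir`, `create_layout`) and the example dataset and model directory names are hypothetical and only mirror the tree shown here; the library's data preparation tasks may create these directories differently.

```python
import tempfile
from pathlib import Path

SPLITS = ("train", "val", "test")
KINDS = ("images", "labels", "captions")


def real_dirs(root, dataset):
    """All real/{dataset}/{split}/{kind} directories from the tree above."""
    base = Path(root) / "real" / dataset
    return [base / split / kind for split in SPLITS for kind in KINDS]


def generated_dir(root, dataset, controlnet_model):
    """Directory that holds metadata.yaml and the generated *.png files."""
    return Path(root) / "generated" / dataset / controlnet_model


def create_layout(root, dataset, controlnet_model):
    """Create the empty directory tree and return every directory created."""
    dirs = real_dirs(root, dataset)
    dirs.append(generated_dir(root, dataset, controlnet_model))
    dirs.append(Path(root) / "mixed" / dataset)
    for directory in dirs:
        directory.mkdir(parents=True, exist_ok=True)
    return dirs


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # "coco" and "lllyasviel_canny" are example names, not required values.
        for directory in create_layout(tmp, "coco", "lllyasviel_canny"):
            print(directory.relative_to(tmp))
```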
Example Datasets
python run.py task=prepare_data data.base=coco # COCO People
python run.py task=prepare_data data.base=flickr30k # Flickr30K Entities
python run.py task=prepare_data data.base=fer # Facial Emotion Recognition
python run.py task=prepare_data data.base=mocs # Construction Sites
Documentation
Full documentation is available in the docs/ directory and can be built with MkDocs:
pip install mkdocs-material "mkdocstrings[python]"
mkdocs serve
Contributing
See CONTRIBUTING.md for development setup, code style, and PR guidelines.
License
This project is licensed under the GNU Affero General Public License v3.
Copyright (c) 2026 Universite de Mons, Multitel, Universite Libre de Bruxelles, Universite Catholique de Louvain.