CIA: Controllable Image Augmentation


CIA is a Python library for synthetic data augmentation using Stable Diffusion + ControlNet. Generate high-quality synthetic images from real seed images, evaluate their quality, and use them to improve downstream ML models.

Features

  • Synthetic image generation using Stable Diffusion controlled by Canny edges, OpenPose, Segmentation, or MediaPipe face features
  • Quality metrics -- Fréchet Inception Distance (FID), Inception Score (IS), Mahalanobis distance
  • Quality-based filtering -- keep only the best synthetic images via top-k, top-p, or threshold filtering
  • Auto-captioning -- generate image captions using OpenAI or Ollama vision models
  • Multiple interfaces -- Python API, CLI, and Hydra config
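
The three filtering strategies named above (top-k, top-p, threshold) can be illustrated in a few lines of plain Python. This is a sketch, not CIA's implementation: `filter_scores` and its `scores` dict (image path → per-image quality score, lower is better, as with Mahalanobis distance) are illustrative names.

```python
def filter_scores(scores, method="top-k", value=100):
    """Select images by quality score (lower is better).

    scores: dict mapping image path -> quality score.
    method: "top-k" keeps the k best images,
            "top-p" keeps the best fraction p (0 < p <= 1),
            "threshold" keeps images with score <= value.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1])  # best (lowest) first
    if method == "top-k":
        kept = ranked[: int(value)]
    elif method == "top-p":
        kept = ranked[: max(1, int(len(ranked) * value))]
    elif method == "threshold":
        kept = [(path, s) for path, s in ranked if s <= value]
    else:
        raise ValueError(f"unknown method: {method}")
    return [path for path, _ in kept]

scores = {"a.png": 0.9, "b.png": 0.2, "c.png": 0.5, "d.png": 1.4}
print(filter_scores(scores, "top-k", 2))         # ['b.png', 'c.png']
print(filter_scores(scores, "threshold", 0.6))   # ['b.png', 'c.png']
```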

Try it now

Run CIA in your browser with Google Colab: no installation required. Open the Quickstart notebook to generate, evaluate, and filter synthetic images in under 15 minutes.

Installation

pip install ciagen

With optional dependencies:

pip install "ciagen[captioning]"   # OpenAI/Ollama auto-captioning
pip install "ciagen[training]"     # YOLO/classifier training
pip install "ciagen[datasets]"     # COCO, Flickr30K, FER, MOCS datasets
pip install "ciagen[all]"          # Everything

Development

git clone https://github.com/fennecinspace/ciagen.git
cd ciagen
pip install -e ".[all]"

Docker

./run_and_build_docker_file.sh nvidia
docker exec -it ciagen zsh

Quick Start

Python API

from ciagen import generate, evaluate, filter_generated

# Generate synthetic images
result = generate(
    source="data/real/train/images/",
    output="data/generated/",
    extractor="canny",
    sd_model="fennecinspace/sd-v15",
    cn_model="lllyasviel/sd-controlnet-canny",
    num_per_image=3,
    prompt="a person walking in a park",
    seed=42,
    device="cuda",
)
print(f"Generated {result['total_generated']} images")

# Evaluate quality
scores = evaluate(
    real="data/real/train/images/",
    generated="data/generated/",
    metrics=["fid", "mld"],
    feature_extractor="vit",
)
print(f"FID: {scores['dtd']['fid']}")  # 'dtd' = distribution-to-distribution metrics

# Filter to keep the best images
kept = filter_generated(
    generated="data/generated/",
    method="top-k",
    value=100,
)

CLI

# Generate images
ciagen generate \
    --source data/real/train/images/ \
    --output data/generated/ \
    --extractor canny \
    --sd-model fennecinspace/sd-v15 \
    --cn-model lllyasviel/sd-controlnet-canny \
    --num 3 \
    --prompt "a person walking"

# Evaluate quality
ciagen evaluate \
    --real data/real/train/images/ \
    --generated data/generated/ \
    --metrics fid mld

# Filter generated images
ciagen filter \
    --generated data/generated/ \
    --method top-k \
    --value 100

# Auto-caption images
ciagen caption \
    --images data/real/train/images/ \
    --output data/real/train/captions/ \
    --engine ollama \
    --model llava

Hydra (Advanced)

python run.py task=gen model.cn_use=lllyasviel_canny prompt.base="a person"
python run.py task=dtd
python run.py task=ptd
python run.py task=filtering
python run.py task=mix
python run.py task=train

See ciagen/conf/config.yaml for all configuration options.

Pipeline

The recommended workflow:

real images ──► condition extraction ──► SD + ControlNet ──► synthetic images
                                                                    │
                                                                    ▼
real images ──────────────────────────────────────────────────► evaluate ──► filter ──► mix ──► train

  1. Generate -- Extract a control condition (edges, pose, segmentation) from each real image, then generate synthetic variations using Stable Diffusion + ControlNet
  2. Evaluate -- Compute distribution-level metrics (FID, IS) and per-image metrics (Mahalanobis distance)
  3. Filter -- Select the best synthetic images based on quality scores
  4. Mix -- Combine real and filtered synthetic data into a training dataset
  5. Train -- Train your downstream model (YOLOv8 for detection, InceptionV3 for classification)
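
As a rough illustration of step 4, mixing can be as simple as pooling the real images with the filtered synthetic ones at a chosen ratio. This pure-Python sketch is illustrative only; `mix_dataset` and its parameters are not CIA's actual `mix` API.

```python
import random

def mix_dataset(real_paths, synthetic_paths, synthetic_ratio=0.5, seed=42):
    """Combine real images with filtered synthetic ones.

    synthetic_ratio: how many synthetic images to add, expressed as a
    fraction of the real set's size (0.5 -> one synthetic per two real).
    """
    rng = random.Random(seed)  # deterministic sampling for reproducibility
    n_synth = min(len(synthetic_paths), int(len(real_paths) * synthetic_ratio))
    mixed = list(real_paths) + rng.sample(synthetic_paths, n_synth)
    rng.shuffle(mixed)
    return mixed

real = [f"real_{i}.png" for i in range(100)]
synth = [f"gen_{i}.png" for i in range(300)]
mixed = mix_dataset(real, synth, synthetic_ratio=0.5)
print(len(mixed))  # 150: all 100 real images plus 50 sampled synthetic ones
```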

Available Extractors

Extractor        Description                    Use Case
canny            Canny edge detection           General purpose, preserves structure
openpose         Human pose estimation          People, actions, body pose
segmentation     YOLOv8 semantic segmentation   Object boundaries
mediapipe_face   MediaPipe face landmarks       Facial emotion, face generation
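
For intuition about what an edge extractor feeds to ControlNet, here is a simplified gradient-magnitude edge map in NumPy. The real `canny` extractor uses the full Canny algorithm (typically via OpenCV); this is a toy stand-in, not CIA's implementation.

```python
import numpy as np

def edge_map(gray, threshold=0.25):
    """Simplified edge detection: central-difference gradient + threshold.

    gray: 2D float array in [0, 1]. Returns a binary edge image, the kind
    of conditioning map a Canny-style ControlNet consumes.
    """
    gx = np.zeros_like(gray)
    gy = np.zeros_like(gray)
    gx[:, 1:-1] = gray[:, 2:] - gray[:, :-2]   # horizontal gradient
    gy[1:-1, :] = gray[2:, :] - gray[:-2, :]   # vertical gradient
    magnitude = np.hypot(gx, gy)
    return (magnitude > threshold).astype(np.uint8)

# Toy image: dark left half, bright right half -> one vertical edge
img = np.zeros((8, 8))
img[:, 4:] = 1.0
edges = edge_map(img)
print(edges[0])  # edge pixels fire around the brightness step at columns 3-4
```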

Available Metrics

Metric            Type                           Description
fid               Distribution-to-Distribution   Fréchet Inception Distance -- lower is better
inception_score   Distribution-to-Distribution   Inception Score -- higher is better
mld               Point-to-Distribution          Mahalanobis distance -- per-image, lower is better
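
The per-image Mahalanobis distance can be illustrated with NumPy: fit a mean and covariance on real-image features, then score each generated feature vector against that distribution. The random vectors below are stand-ins for whatever the feature extractor (e.g. a ViT) produces; the function name is illustrative.

```python
import numpy as np

def mahalanobis_scores(real_feats, gen_feats, eps=1e-6):
    """Score each generated feature vector against the real distribution.

    real_feats: (n, d) features of real images (defines the distribution).
    gen_feats:  (m, d) features of generated images.
    Lower score = closer to the real distribution.
    """
    mu = real_feats.mean(axis=0)
    # Regularize the covariance so the inverse is well-conditioned
    cov = np.cov(real_feats, rowvar=False) + eps * np.eye(real_feats.shape[1])
    cov_inv = np.linalg.inv(cov)
    diff = gen_feats - mu                                   # (m, d)
    # Per-row quadratic form: sqrt(diff_i @ cov_inv @ diff_i)
    return np.sqrt(np.einsum("md,dk,mk->m", diff, cov_inv, diff))

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(500, 8))   # "real" feature cloud
near = rng.normal(0.0, 1.0, size=(5, 8))     # generated, in-distribution
far = rng.normal(5.0, 1.0, size=(5, 8))      # generated, off-distribution
print(mahalanobis_scores(real, near).mean() < mahalanobis_scores(real, far).mean())  # True
```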

Data Structure

data/
├── real/{dataset}/
│   ├── train/{images,labels,captions}/
│   ├── val/{images,labels,captions}/
│   └── test/{images,labels,captions}/
├── generated/{dataset}/{controlnet-model}/
│   ├── metadata.yaml
│   └── *.png
└── mixed/{dataset}/

Example Datasets

python run.py task=prepare_data data.base=coco        # COCO People
python run.py task=prepare_data data.base=flickr30k   # Flickr30K Entities
python run.py task=prepare_data data.base=fer         # Facial Emotion Recognition
python run.py task=prepare_data data.base=mocs        # Construction Sites

Documentation

Full documentation is available in the docs/ directory and can be built with MkDocs:

pip install mkdocs-material "mkdocstrings[python]"
mkdocs serve

Contributing

See CONTRIBUTING.md for development setup, code style, and PR guidelines.

License

This project is licensed under the GNU Affero General Public License v3.

Copyright (c) 2026 Universite de Mons, Multitel, Universite Libre de Bruxelles, Universite Catholique de Louvain.
