CIA: Controllable Image Augmentation


CIA is a Python library for synthetic data augmentation using Stable Diffusion + ControlNet. Generate high-quality synthetic images from real seed images, evaluate their quality, and use them to improve downstream ML models.

Features

  • Synthetic image generation using Stable Diffusion controlled by Canny edges, OpenPose, Segmentation, or MediaPipe face features
  • Quality metrics -- Fréchet Inception Distance (FID), Inception Score (IS), Mahalanobis distance
  • Quality-based filtering -- keep only the best synthetic images via top-k, top-p, or threshold filtering
  • Auto-captioning -- generate image captions using OpenAI or Ollama vision models
  • Multiple interfaces -- Python API, CLI, and Hydra config
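
The three filtering strategies named above (top-k, top-p, threshold) can be illustrated in a few lines of plain Python. This is a sketch, not CIA's implementation: `filter_scores` and its `scores` dict (image path → per-image quality score, lower is better, as with Mahalanobis distance) are illustrative names.

```python
def filter_scores(scores, method="top-k", value=100):
    """Select images by quality score (lower is better).

    scores: dict mapping image path -> quality score.
    method: "top-k" keeps the k best images,
            "top-p" keeps the best fraction p (0 < p <= 1),
            "threshold" keeps images with score <= value.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1])  # best (lowest) first
    if method == "top-k":
        kept = ranked[: int(value)]
    elif method == "top-p":
        kept = ranked[: max(1, int(len(ranked) * value))]
    elif method == "threshold":
        kept = [(path, s) for path, s in ranked if s <= value]
    else:
        raise ValueError(f"unknown method: {method}")
    return [path for path, _ in kept]

scores = {"a.png": 0.9, "b.png": 0.2, "c.png": 0.5, "d.png": 1.4}
print(filter_scores(scores, "top-k", 2))         # ['b.png', 'c.png']
print(filter_scores(scores, "threshold", 0.6))   # ['b.png', 'c.png']
```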

Try it now

Run CIA in your browser with Google Colab: no installation required. Open the Quickstart notebook to generate, evaluate, and filter synthetic images in under 15 minutes.

Installation

pip install ciagen

With optional dependencies:

pip install "ciagen[captioning]"   # OpenAI/Ollama auto-captioning
pip install "ciagen[training]"     # YOLO/classifier training
pip install "ciagen[datasets]"     # COCO, Flickr30K, FER, MOCS datasets
pip install "ciagen[all]"          # Everything

Development

git clone https://github.com/fennecinspace/ciagen.git
cd ciagen
pip install -e ".[all]"

Docker

./run_and_build_docker_file.sh nvidia
docker exec -it ciagen zsh

Quick Start

Python API

from ciagen import generate, evaluate, filter_generated

# Generate synthetic images
result = generate(
    source="data/real/train/images/",
    output="data/generated/",
    extractor="canny",
    sd_model="fennecinspace/sd-v15",
    cn_model="lllyasviel/sd-controlnet-canny",
    num_per_image=3,
    prompt="a person walking in a park",
    seed=42,
    device="cuda",
)
print(f"Generated {result['total_generated']} images")

# Evaluate quality
scores = evaluate(
    real="data/real/train/images/",
    generated="data/generated/",
    metrics=["fid", "mld"],
    feature_extractor="vit",
)
print(f"FID: {scores['dtd']['fid']}")  # 'dtd' = distribution-to-distribution metrics

# Filter to keep the best images
kept = filter_generated(
    generated="data/generated/",
    method="top-k",
    value=100,
)

CLI

# Generate images
ciagen generate \
    --source data/real/train/images/ \
    --output data/generated/ \
    --extractor canny \
    --sd-model fennecinspace/sd-v15 \
    --cn-model lllyasviel/sd-controlnet-canny \
    --num 3 \
    --prompt "a person walking"

# Evaluate quality
ciagen evaluate \
    --real data/real/train/images/ \
    --generated data/generated/ \
    --metrics fid mld

# Filter generated images
ciagen filter \
    --generated data/generated/ \
    --method top-k \
    --value 100

# Auto-caption images
ciagen caption \
    --images data/real/train/images/ \
    --output data/real/train/captions/ \
    --engine ollama \
    --model llava

Hydra (Advanced)

python run.py task=gen model.cn_use=lllyasviel_canny prompt.base="a person"
python run.py task=dtd
python run.py task=ptd
python run.py task=filtering
python run.py task=mix
python run.py task=train

See ciagen/conf/config.yaml for all configuration options.

Pipeline

The recommended workflow:

real images ──► condition extraction ──► SD + ControlNet ──► synthetic images
                                                                    │
                                                                    ▼
real images ──────────────────────────────────────────────────► evaluate ──► filter ──► mix ──► train

  1. Generate -- Extract a control condition (edges, pose, segmentation) from each real image, then generate synthetic variations using Stable Diffusion + ControlNet
  2. Evaluate -- Compute distribution-level metrics (FID, IS) and per-image metrics (Mahalanobis distance)
  3. Filter -- Select the best synthetic images based on quality scores
  4. Mix -- Combine real and filtered synthetic data into a training dataset
  5. Train -- Train your downstream model (YOLOv8 for detection, InceptionV3 for classification)
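
As a rough illustration of step 4, mixing can be as simple as pooling the real images with the filtered synthetic ones at a chosen ratio. This pure-Python sketch is illustrative only; `mix_dataset` and its parameters are not CIA's actual `mix` API.

```python
import random

def mix_dataset(real_paths, synthetic_paths, synthetic_ratio=0.5, seed=42):
    """Combine real images with filtered synthetic ones.

    synthetic_ratio: how many synthetic images to add, expressed as a
    fraction of the real set's size (0.5 -> one synthetic per two real).
    """
    rng = random.Random(seed)  # deterministic sampling for reproducibility
    n_synth = min(len(synthetic_paths), int(len(real_paths) * synthetic_ratio))
    mixed = list(real_paths) + rng.sample(synthetic_paths, n_synth)
    rng.shuffle(mixed)
    return mixed

real = [f"real_{i}.png" for i in range(100)]
synth = [f"gen_{i}.png" for i in range(300)]
mixed = mix_dataset(real, synth, synthetic_ratio=0.5)
print(len(mixed))  # 150: all 100 real images plus 50 sampled synthetic ones
```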

Available Extractors

Extractor        Description                    Use Case
canny            Canny edge detection           General purpose, preserves structure
openpose         Human pose estimation          People, actions, body pose
segmentation     YOLOv8 semantic segmentation   Object boundaries
mediapipe_face   MediaPipe face landmarks       Facial emotion, face generation
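
For intuition about what an edge extractor feeds to ControlNet, here is a simplified gradient-magnitude edge map in NumPy. The real `canny` extractor uses the full Canny algorithm (typically via OpenCV); this is a toy stand-in, not CIA's implementation.

```python
import numpy as np

def edge_map(gray, threshold=0.25):
    """Simplified edge detection: central-difference gradient + threshold.

    gray: 2D float array in [0, 1]. Returns a binary edge image, the kind
    of conditioning map a Canny-style ControlNet consumes.
    """
    gx = np.zeros_like(gray)
    gy = np.zeros_like(gray)
    gx[:, 1:-1] = gray[:, 2:] - gray[:, :-2]   # horizontal gradient
    gy[1:-1, :] = gray[2:, :] - gray[:-2, :]   # vertical gradient
    magnitude = np.hypot(gx, gy)
    return (magnitude > threshold).astype(np.uint8)

# Toy image: dark left half, bright right half -> one vertical edge
img = np.zeros((8, 8))
img[:, 4:] = 1.0
edges = edge_map(img)
print(edges[0])  # edge pixels fire around the brightness step at columns 3-4
```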

Available Metrics

Metric            Type                           Description
fid               Distribution-to-Distribution   Fréchet Inception Distance -- lower is better
inception_score   Distribution-to-Distribution   Inception Score -- higher is better
mld               Point-to-Distribution          Mahalanobis distance -- per-image, lower is better
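
The per-image Mahalanobis distance can be illustrated with NumPy: fit a mean and covariance on real-image features, then score each generated feature vector against that distribution. The random vectors below are stand-ins for whatever the feature extractor (e.g. a ViT) produces; the function name is illustrative.

```python
import numpy as np

def mahalanobis_scores(real_feats, gen_feats, eps=1e-6):
    """Score each generated feature vector against the real distribution.

    real_feats: (n, d) features of real images (defines the distribution).
    gen_feats:  (m, d) features of generated images.
    Lower score = closer to the real distribution.
    """
    mu = real_feats.mean(axis=0)
    # Regularize the covariance so the inverse is well-conditioned
    cov = np.cov(real_feats, rowvar=False) + eps * np.eye(real_feats.shape[1])
    cov_inv = np.linalg.inv(cov)
    diff = gen_feats - mu                                   # (m, d)
    # Per-row quadratic form: sqrt(diff_i @ cov_inv @ diff_i)
    return np.sqrt(np.einsum("md,dk,mk->m", diff, cov_inv, diff))

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(500, 8))   # "real" feature cloud
near = rng.normal(0.0, 1.0, size=(5, 8))     # generated, in-distribution
far = rng.normal(5.0, 1.0, size=(5, 8))      # generated, off-distribution
print(mahalanobis_scores(real, near).mean() < mahalanobis_scores(real, far).mean())  # True
```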

Data Structure

data/
├── real/{dataset}/
│   ├── train/{images,labels,captions}/
│   ├── val/{images,labels,captions}/
│   └── test/{images,labels,captions}/
├── generated/{dataset}/{controlnet-model}/
│   ├── metadata.yaml
│   └── *.png
└── mixed/{dataset}/

Example Datasets

python run.py task=prepare_data data.base=coco        # COCO People
python run.py task=prepare_data data.base=flickr30k   # Flickr30K Entities
python run.py task=prepare_data data.base=fer         # Facial Emotion Recognition
python run.py task=prepare_data data.base=mocs        # Construction Sites

Documentation

Full documentation is available in the docs/ directory and can be built with MkDocs:

pip install mkdocs-material "mkdocstrings[python]"
mkdocs serve

Contributing

See CONTRIBUTING.md for development setup, code style, and PR guidelines.

License

This project is licensed under the GNU Affero General Public License v3.

Copyright (c) 2026 Universite de Mons, Multitel, Universite Libre de Bruxelles, Universite Catholique de Louvain.
