AI image and video generation assistant with CLI and web UI (Gemini, Seedream, SeeDance)

These details have not been verified by PyPI

Project links

Project description

stable-delusion - AI-powered image generation and editing assistant

Features

Image Generation: Generate images from text prompts using Gemini 2.5 Flash Image Preview
Multi-image Support: Use multiple reference images for generation
Automatic Upscaling: Optional 2x or 4x upscaling using Google Cloud Vertex AI
Flexible Output: Specify custom output directories and filenames
Storage Options: Local filesystem or AWS S3 storage backends
Error Handling: Comprehensive error logging and diagnostic information
Web API: RESTful API for integration with other applications
Command Line Interface: Full-featured CLI for batch processing and automation

Installation

From PyPI (Recommended for Users)

pip install stable-delusion

Setup

Configuration

The application uses environment variables for configuration. You can set these either through:

.env file (recommended) - Copy .env.example to .env and customize
Environment variables - Set directly in your shell (overrides .env values)

Option 1: Using .env File (Recommended)

# Copy the example file and edit it
cp .env.example .env

# Edit .env with your actual values
# At minimum, you need to set GEMINI_API_KEY

Option 2: Using Environment Variables

# Required: Gemini API key for image generation
export GEMINI_API_KEY="your-api-key-here"

# Optional: Flask debug mode (development only)
export FLASK_DEBUG="true"  # Enable debug mode in development
# export FLASK_DEBUG="false"  # Disable debug mode (default/production)

Security Notes:

FLASK_DEBUG is disabled by default for security reasons. Only enable it in development environments, never in production.
Never commit your .env file to version control - it contains sensitive information!
The .env file is already in .gitignore to prevent accidental commits.

AWS S3 Configuration (Optional)

The application supports storing generated images in AWS S3 instead of the local filesystem. You can configure S3 either in your .env file or via environment variables:

Using .env file:

# Add these to your .env file
STORAGE_TYPE=s3
AWS_S3_BUCKET=your-s3-bucket-name
AWS_S3_REGION=us-east-1
AWS_PROFILE=your-aws-profile  # or use direct credentials

Using environment variables:

# S3 Storage Configuration
export STORAGE_TYPE="s3"                    # Use "s3" for AWS S3, "local" for filesystem (default)
export AWS_S3_BUCKET="your-s3-bucket-name" # S3 bucket name for image storage
export AWS_S3_REGION="us-east-1"           # AWS region where your bucket is located

# AWS Credentials (use one of the following methods)
# Method 1: Environment variables
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"

# Method 2: AWS CLI profiles (recommended)
# Configure with: aws configure --profile your-profile
export AWS_PROFILE="your-profile"

# Method 3: IAM roles (for EC2/Lambda deployment)
# No additional configuration needed if running on AWS with proper IAM roles

S3 Setup Requirements:

Create an S3 bucket in your desired AWS region
Ensure your AWS credentials have the following permissions for the bucket:
- s3:PutObject - Upload generated images
- s3:GetObject - Download images (if needed)
- s3:DeleteObject - Clean up old images
- s3:ListBucket - List bucket contents

Usage

CLI

Basic usage

$ poetry run python stable_delusion/hallucinate.py \
    --prompt "please make the women in the provided image look affectionately at each other" \
    --image samples/base.png

Advanced usage with all parameters

$ poetry run python stable_delusion/hallucinate.py \
    --prompt "a futuristic cityscape with flying cars" \
    --image samples/base.png \
    --image samples/reference.png \
    --output-filename custom_output.png \
    --output-dir ./generated \
    --project-id my-gcp-project \
    --location us-central1 \
    --scale 4

S3 storage examples

# Use S3 storage (requires S3 environment variables to be set)
$ poetry run python stable_delusion/hallucinate.py \
    --prompt "a beautiful landscape" \
    --image samples/base.png \
    --storage-type s3 \
    --output-dir generated-images

# Force local storage (override S3 configuration)
$ poetry run python stable_delusion/hallucinate.py \
    --prompt "a city at night" \
    --image samples/base.png \
    --storage-type local \
    --output-dir ./local-output

Command line parameters

--prompt: Text prompt for image generation (optional, defaults to sample prompt)
--image: Path to reference image(s), can be used multiple times
--output-filename: Output filename (default: "generated_gemini_image.png")
--output-dir: Directory where generated files will be saved (default: current directory)
--project-id: Google Cloud Project ID (defaults to value in conf.py)
--location: Google Cloud region (defaults to value in conf.py)
--scale: Upscale factor, 2 or 4 (optional, enables automatic upscaling)
--storage-type: Storage backend - "local" for filesystem or "s3" for AWS S3 (overrides configuration)

Web server

Start the server

$ poetry run python stable_delusion/main.py

Make a request to the web API

# Basic request
$ curl -X POST \
    -F "prompt=please make the women in the provided image look affectionately at each other" \
    -F "images=@samples/base_2.png" \
    http://127.0.0.1:5000/generate

# Request with custom output directory
$ curl -X POST \
    -F "prompt=create a sunset landscape" \
    -F "images=@samples/base.png" \
    -F "output_dir=./api_generated" \
    http://127.0.0.1:5000/generate

# Multiple images
$ curl -X POST \
    -F "prompt=blend these images creatively" \
    -F "images=@samples/image1.png" \
    -F "images=@samples/image2.png" \
    -F "output_dir=./results" \
    http://127.0.0.1:5000/generate

# S3 storage examples
# Save to S3 (requires S3 environment variables to be set)
$ curl -X POST \
    -F "prompt=a mountain landscape" \
    -F "images=@samples/base.png" \
    -F "storage_type=s3" \
    -F "output_dir=generated-images" \
    http://127.0.0.1:5000/generate

# Force local storage (override S3 configuration)
$ curl -X POST \
    -F "prompt=a city skyline" \
    -F "images=@samples/base.png" \
    -F "storage_type=local" \
    -F "output_dir=./local-results" \
    http://127.0.0.1:5000/generate

API Parameters

Content-Type: multipart/form-data

Parameters are sent as form fields (not JSON):

prompt: Text prompt for image generation (required)
images: Image file(s) to upload (required, can be multiple files)
output_dir: Directory where generated files will be saved (optional, default: ".")
storage_type: Storage backend - "local" for filesystem or "s3" for AWS S3 (optional, uses configuration default)

API Response

Content-Type: application/json

{
    "message": "Files uploaded successfully",
    "prompt": "your prompt text",
    "saved_files": ["/path/to/uploaded/file1.png", "/path/to/uploaded/file2.png"],
    "generated_file": "/path/to/generated_image.png",
    "output_dir": "/custom/output/directory"
}

Upscale generated images

Setup for upscaling

Preliminaries to get permissions sorted out:

$ gcloud init
$ gcloud auth login
$ gcloud auth application-default login
$ gcloud services enable aiplatform.googleapis.com

Upscale a specific image

$ poetry run python stable_delusion/upscale.py \
    --input generated_image.png \
    --scale 4 \
    --project-id my-gcp-project \
    --location us-central1

Upscale parameters

--input: Input image file to upscale (required)
--scale: Upscale factor, 2 or 4 (default: 2)
--project-id: Google Cloud Project ID (defaults to value in conf.py)
--location: Google Cloud region (defaults to value in conf.py)

Token Usage Tracking

The application automatically tracks API token usage for all image generation operations. You can view statistics and history using either the CLI tool or the web API.

CLI Tool

View token usage statistics:

# Display overall statistics (total tokens, breakdown by model and operation)
$ poetry run python stable_delusion/token_stats.py

# Show the last 10 token usage entries
$ poetry run python stable_delusion/token_stats.py --history 10

# Clear all token usage history
$ poetry run python stable_delusion/token_stats.py --clear

# Use a custom storage file
$ poetry run python stable_delusion/token_stats.py --storage-file /path/to/token_usage.json

Web API Endpoints

Query token usage via the REST API:

# Get token usage statistics
$ curl http://127.0.0.1:5000/token-usage

# Get token usage history (all entries)
$ curl http://127.0.0.1:5000/token-usage/history

# Get token usage history (last 20 entries)
$ curl http://127.0.0.1:5000/token-usage/history?limit=20

Token usage data is stored in ~/.stable-delusion/token_usage.json by default.

Error Handling

The application provides detailed error logging when image generation fails:

Safety filter violations with specific categories and probability levels
API response diagnostics including token usage and finish reasons
File upload details with metadata (size, MIME type, expiration times)
Comprehensive error messages for troubleshooting

Development

For development guidelines, code quality tools, CI/CD pipeline details, and security best practices, see doc/Development.md.

Documentation

Development Guide - Development practices, CI/CD pipeline, and code quality tools
Architecture - System architecture and design patterns
API Demo - API endpoint examples and usage
Changelog - Version history and release notes

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

0.1.6

Apr 4, 2026

0.1.5

Oct 10, 2025

0.1.4

Oct 4, 2025

0.1.3

Oct 3, 2025

0.1.2

Oct 3, 2025

0.1.1

Oct 1, 2025

0.1.0

Sep 28, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

stable_delusion-0.1.6.tar.gz (65.5 kB view details)

Uploaded Apr 4, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

stable_delusion-0.1.6-py3-none-any.whl (105.0 kB view details)

Uploaded Apr 4, 2026 Python 3

File details

Details for the file stable_delusion-0.1.6.tar.gz.

File metadata

Download URL: stable_delusion-0.1.6.tar.gz
Upload date: Apr 4, 2026
Size: 65.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.3.3 CPython/3.13.12 Linux/5.15.154+

File hashes

Hashes for stable_delusion-0.1.6.tar.gz
Algorithm	Hash digest
SHA256	`43e30cc4c0af0111d3ba8e252981ef9af27d2a3f999a8f2edb118029b8f192a0`
MD5	`0bdf8be769be94116b2a72094afaebb7`
BLAKE2b-256	`2c9b9e41cfcb9b6715750bf9f3411c0b398216d3644b07840c54d1b0c6273655`

See more details on using hashes here.

File details

Details for the file stable_delusion-0.1.6-py3-none-any.whl.

File metadata

Download URL: stable_delusion-0.1.6-py3-none-any.whl
Upload date: Apr 4, 2026
Size: 105.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.3.3 CPython/3.13.12 Linux/5.15.154+

File hashes

Hashes for stable_delusion-0.1.6-py3-none-any.whl
Algorithm	Hash digest
SHA256	`8a924eec218fcb5d28978ef9b46c0006e19ff29ae304ff79a4885b9826e12d1e`
MD5	`d76ec1a0fc58f7337f8086acf19b3991`
BLAKE2b-256	`2e15e60f47852d63e32e0fe174a5ed16b46bdb8a31cbfbdd13e9b6a3e02799a8`

See more details on using hashes here.

stable-delusion 0.1.6

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

stable-delusion - AI-powered image generation and editing assistant

Features

Installation

From PyPI (Recommended for Users)

Setup

Configuration

Option 1: Using .env File (Recommended)

Option 2: Using Environment Variables

AWS S3 Configuration (Optional)

Usage

CLI

Basic usage

Advanced usage with all parameters

S3 storage examples

Command line parameters

Web server

Start the server

Make a request to the web API

API Parameters

API Response

Upscale generated images

Setup for upscaling

Upscale a specific image

Upscale parameters

Token Usage Tracking

CLI Tool

Web API Endpoints

Error Handling

Development

Documentation

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes