Training tools and scripts for FluxFlow text-to-image generation
Project description
FluxFlow Training
Training tools and scripts for FluxFlow text-to-image generation models.
Installation
Production Install
pip install fluxflow-training
What gets installed:
fluxflow-training- Training scripts and configuration toolsfluxflowcore package (automatically installed as dependency)- CLI commands:
fluxflow-train,fluxflow-generate
Development Install
git clone https://github.com/danny-mio/fluxflow-training.git
cd fluxflow-training
pip install -e ".[dev]"
🚧 Training Status
Models Currently In Training: FluxFlow is actively training models following the systematic TRAINING_VALIDATION_PLAN.md.
Current Phase: Phase 1 - VAE Training (Weeks 1-4)
Progress:
- 🔄 Bezier VAE training in progress
- ⏳ ReLU baseline VAE pending
- ⏳ Flow models pending VAE completion
When Available: Trained checkpoints and empirical performance metrics will be published to MODEL_ZOO.md upon validation completion.
Note: All performance claims in documentation are theoretical targets pending empirical validation.
Hardware Requirements for Training
Current Setup (December 2025 validation):
- GPU: NVIDIA A6000 (48GB VRAM) via Paperspace
- Constraint: Service interrupts every 6 hours
- Solution: Automatic checkpoint/resume (every 30-60 min)
Alternative Options:
- Local: 1× RTX 4090 (24GB) or 2× RTX 3090 (24GB each)
- Cloud: AWS p3.2xlarge (V100 16GB), GCP A100 (40GB)
Minimum Requirements:
- 16GB VRAM for VAE training (batch size 2-4)
- 24GB VRAM for Flow training (batch size 1-2)
- 48GB VRAM enables larger batches (batch size 8+)
Cost Comparison:
| Platform | GPU | $/hr | Est. Total (500 hrs) |
|---|---|---|---|
| Paperspace | A6000 48GB | $0.76 | ~$456 ✓ |
| AWS | p3.2xlarge (V100 16GB) | $3.06 | ~$1,530 |
| GCP | A100 40GB | $3.67 | ~$1,835 |
| Lambda Labs | A100 40GB | $1.10 | ~$550 |
Pre-download LPIPS Weights (Optional)
The training uses LPIPS for perceptual loss, which requires VGG16 weights (~528MB). To pre-download:
python -c "import lpips; lpips.LPIPS(net='vgg')"
Weights will be cached in ~/.cache/torch/hub/checkpoints/. If not pre-downloaded, they'll download automatically on first training run.
Features
Core Training Capabilities
-
🎯 Pipeline Training Mode (NEW in v0.2.0)
- Multi-step sequential training with independent configs per step
- Per-step freeze/unfreeze of model components
- Loss-threshold transitions with early stopping
- Full checkpoint resume from any step/epoch/batch
- See PIPELINE_ARCHITECTURE.md
-
🎨 GAN-Only Training Mode (NEW in v0.2.0)
- Train encoder/decoder with adversarial loss only (no reconstruction)
- Spatial conditioning (SPADE) without pixel-perfect reconstruction
- Faster training with focused gradient flow
-
VAE Training
- Variational autoencoders with GAN losses
- LPIPS perceptual loss support
- SPADE spatial conditioning
- KL divergence with beta warmup and free bits
-
Flow Training
- Flow-based diffusion transformers
- Text-to-image generation
- Conditional and unconditional modes
Training Infrastructure
- Data Loading: Efficient dataset handling with WebDataset support for large-scale training
- Checkpointing: Robust checkpoint management with automatic resume from interruptions
- Training Visualization: Automatic diagram generation for loss curves and metrics
- Optimizers: Multiple optimizer support (AdamW, Adam, SGD, RMSprop)
- Schedulers: Various learning rate schedulers (CosineAnnealing, StepLR, ExponentialLR, ReduceLROnPlateau)
- Mixed Precision: Accelerate training with automatic mixed precision (fp16)
Monitoring & Logging
-
Enhanced Console Output (v0.2.0)
- Real-time batch timing (seconds/batch)
- Comprehensive loss display (VAE, KL, G, D, LPIPS)
- Step/epoch/batch progress tracking
-
Step-Specific Metrics (Pipeline Mode)
- Separate JSONL files per training step
- Automatic graph generation per step
- Full training history preservation
-
Mid-Epoch Sample Generation (v0.2.0)
- Samples generated at configurable intervals
- Structured naming with step/epoch/batch numbers
- Prevents overwrites with unique identifiers
Quick Start
Basic Training
# Create a config file (see config.example.yaml)
fluxflow-train --config config.yaml
# With automatic diagram generation
fluxflow-train --config config.yaml --generate_diagrams
Pipeline Training (NEW)
Multi-step training for hypothesis testing and staged optimization:
training:
pipeline:
steps:
# Step 1: VAE warmup
- name: "vae_warmup"
n_epochs: 10
train_vae: true
gan_training: false
freeze:
- flow_processor
- text_encoder
# Step 2: Add GAN
- name: "vae_gan"
n_epochs: 20
train_vae: true
gan_training: true
use_lpips: true
# Step 3: Flow training
- name: "flow"
n_epochs: 30
train_vae: false # Freeze VAE
train_diff: true # Train flow
Run:
fluxflow-train --config pipeline_config.yaml
Validate config before training:
fluxflow-train --config pipeline_config.yaml --validate-pipeline
See PIPELINE_ARCHITECTURE.md for complete documentation.
GAN-Only Training (NEW)
Train with adversarial loss only (no pixel-level reconstruction):
training:
pipeline:
steps:
- name: "gan_only"
n_epochs: 20
train_vae: false # Skip reconstruction loss
gan_training: true # Train GAN discriminator
train_spade: true # With spatial conditioning
use_lpips: false
Use cases:
- Spatial structure learning without pixel constraints
- Faster training (no reconstruction computation)
- SPADE conditioning experiments
Generating Images
# Create a directory with .txt files containing prompts
mkdir prompts
echo "a beautiful sunset over mountains" > prompts/sunset.txt
# Generate images
fluxflow-generate \
--model_checkpoint path/to/checkpoint.safetensors \
--text_prompts_path prompts/ \
--output_path outputs/
Visualizing Training Progress
# Training metrics are automatically logged to outputs/graph/training_metrics.jsonl
# In pipeline mode: outputs/graph/training_metrics_{stepname}.jsonl
# Generate diagrams from logged metrics
python src/fluxflow_training/scripts/generate_training_graphs.py outputs/
# Diagrams are saved to outputs/graph/:
# - training_losses.png (VAE, Flow, Discriminator, Generator, LPIPS)
# - kl_loss.png (KL divergence with beta warmup)
# - learning_rates.png (LR schedules)
# - batch_times.png (training speed)
# - training_overview.png (combined overview)
# - training_summary.txt (statistics)
Console Output Examples
Pipeline Training Mode
PIPELINE STEP 1/3: vae_warmup
Description: Warmup VAE without GAN
Duration: 10 epochs
================================================================================
[00:15:23] Step vae_warmup (1/3) | Epoch 5/10 | Batch 127/500 | VAE: 0.0234 | KL: 12.45 | 3.2s/batch
GAN Training Mode
[00:15:23] Step vae_gan (2/3) | Epoch 5/20 | Batch 127/500 | VAE: 0.0234 | KL: 12.45 | G: 0.156 | D: 0.089 | LPIPS: 0.0812 | 3.2s/batch
Metrics Shown:
VAE- Reconstruction loss (L1 + MSE)KL- KL divergenceG- GAN generator loss (whengan_training: true)D- GAN discriminator loss (whengan_training: true)LPIPS- Perceptual loss (whenuse_lpips: true)Xs/batch- Average batch processing time
Sample Naming Convention
Pipeline Mode (v0.2.0)
Mid-epoch samples:
vae_warmup_001_005_00127_abc123-original.webp
{stepname}_{step}_{epoch}_{batch}_{hash}-{suffix}.webp
End-of-epoch samples:
vae_warmup_001_005_abc123-original.webp
{stepname}_{step}_{epoch}_{hash}-{suffix}.webp
Suffixes:
-originalor_ns_i- Input image-nr_o- VAE reconstruction (no random sampling)_ns_o- VAE reconstruction (with sampling)-ctx- Context vector visualization-nc- Non-context visualization
Package Contents
-
fluxflow_training.training- Training logic and trainerspipeline_orchestrator.py- Multi-step pipeline executionpipeline_config.py- Pipeline configuration and validationvae_trainer.py- VAE/GAN training logicflow_trainer.py- Flow model trainingcheckpoint_manager.py- Checkpoint save/resume
-
fluxflow_training.data- Dataset implementations and transformsdatasets.py- Image/caption datasets with WebDataset supporttransforms.py- Data augmentation pipelines
-
fluxflow_training.scripts- CLI scripts for training and generationtrain.py- Main training script (legacy + pipeline mode)generate_training_graphs.py- Visualization generation
Configuration
Training is configured via YAML files. See docs/TRAINING_GUIDE.md for detailed configuration options.
Basic Configuration Example
model:
vae_dim: 128
feature_maps_dim: 128
text_embedding_dim: 1024
data:
data_path: "/path/to/images"
captions_file: "/path/to/captions.txt" # image_name<tab>caption
fixed_prompt_prefix: null # Optional: e.g., "style anime" to prepend to all prompts
img_size: 1024
reduced_min_sizes: null # Optional: [128, 256, 512]
training:
n_epochs: 100
batch_size: 1
lr: 0.00001
train_vae: true
gan_training: true
use_lpips: true
use_fp16: false
output:
output_path: "outputs/flux"
log_interval: 10
checkpoint_save_interval: 50
samples_per_checkpoint: 1 # Generate samples at each checkpoint
sample_sizes: # Optional: generate samples at various sizes
- 512
- [768, 512] # landscape
- 1024
Pipeline Configuration Example
See test_pipeline_minimal.yaml or config.example.yaml for complete examples.
Key difference: Pipeline configs use training.pipeline.steps instead of flat training config.
training:
batch_size: 4
workers: 8
pipeline:
steps:
- name: "step1"
n_epochs: 10
train_vae: true
# ... step-specific config
Documentation
Comprehensive Guides
-
PIPELINE_ARCHITECTURE.md - Pipeline training mode documentation
- Configuration reference
- Checkpoint format
- Sample naming conventions
- Troubleshooting
-
TRAINING_GUIDE.md - Complete training guide
- Detailed configuration options
- Dataset preparation
- Training strategies
- Best practices
-
CONTRIBUTING.md - Development guide
- Setting up development environment
- Running tests
- Code style guidelines
- Pipeline config testing
Example Configurations
config.example.yaml- Complete configuration template with all optionstest_pipeline_minimal.yaml- Minimal working pipeline exampleconfig.example.sh- Environment setup script
What's New in v0.2.0
Major Features
- Pipeline Training Mode - Multi-step sequential training with per-step configs
- GAN-Only Mode - Train with adversarial loss only (no reconstruction)
- Enhanced Logging - Batch timing, step-specific metrics, correct GAN loss display
- Mid-Epoch Samples - Configurable sample generation with batch numbers
Bug Fixes
- Fixed encoder gradient flow in GAN-only mode
- Fixed GAN loss logging (generator/discriminator keys)
- Fixed EMA creation for GAN-only mode
- Fixed metrics logging to respect training modes
- Fixed sample generation to prevent overwrites
Breaking Changes
Config format: Pipeline configs must use steps: wrapper:
# OLD (will error)
training:
pipeline:
- name: "step1"
# NEW (required)
training:
pipeline:
steps:
- name: "step1"
See CHANGELOG.md for complete version history.
Links
- GitHub Repository
- Core Package
- Web UI
- ComfyUI Plugin
- PyPI Package (coming soon - not yet published)
License
MIT License - see LICENSE file for details.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file fluxflow_training-0.2.1.tar.gz.
File metadata
- Download URL: fluxflow_training-0.2.1.tar.gz
- Upload date:
- Size: 101.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
2efe94f01507152a5cfe6f4bd8de46dd1ebb18046c5405c5d3dfef50ef2b62fb
|
|
| MD5 |
fd22c626ae2a4ca9e929bbc948400353
|
|
| BLAKE2b-256 |
d448053ccf45fbcb419fddb2e0a368a260a8d9e7d36208db405610b3ce272993
|
Provenance
The following attestation bundles were made for fluxflow_training-0.2.1.tar.gz:
Publisher:
ci.yml on danny-mio/fluxflow-training
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
fluxflow_training-0.2.1.tar.gz -
Subject digest:
2efe94f01507152a5cfe6f4bd8de46dd1ebb18046c5405c5d3dfef50ef2b62fb - Sigstore transparency entry: 754834769
- Sigstore integration time:
-
Permalink:
danny-mio/fluxflow-training@8fb8c494fcc23520d1e2bc78943d1291806d90fe -
Branch / Tag:
refs/tags/v0.2.1 - Owner: https://github.com/danny-mio
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
ci.yml@8fb8c494fcc23520d1e2bc78943d1291806d90fe -
Trigger Event:
push
-
Statement type:
File details
Details for the file fluxflow_training-0.2.1-py3-none-any.whl.
File metadata
- Download URL: fluxflow_training-0.2.1-py3-none-any.whl
- Upload date:
- Size: 80.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
725ec1a294b64fa8bff2efc9ee72b22dc2fb13f6f376998e0b6b21cc823f3da9
|
|
| MD5 |
5e13fa53355c1bb4e4b20c0941ea1fa7
|
|
| BLAKE2b-256 |
49fcf096ab9e7a8afed258e69ecb7b3c384e406bbe76005d227546ec849865ad
|
Provenance
The following attestation bundles were made for fluxflow_training-0.2.1-py3-none-any.whl:
Publisher:
ci.yml on danny-mio/fluxflow-training
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
fluxflow_training-0.2.1-py3-none-any.whl -
Subject digest:
725ec1a294b64fa8bff2efc9ee72b22dc2fb13f6f376998e0b6b21cc823f3da9 - Sigstore transparency entry: 754834814
- Sigstore integration time:
-
Permalink:
danny-mio/fluxflow-training@8fb8c494fcc23520d1e2bc78943d1291806d90fe -
Branch / Tag:
refs/tags/v0.2.1 - Owner: https://github.com/danny-mio
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
ci.yml@8fb8c494fcc23520d1e2bc78943d1291806d90fe -
Trigger Event:
push
-
Statement type: