🍌 Banana Straightener
Self-correcting image generation using Gemini - iterate until it's right!
Banana Straightener is an AI agent that automatically refines image generation through iterative improvement. It generates images based on your prompt, evaluates them against your original intent, and keeps improving until the image matches what you actually wanted.
✨ Features
- 🔄 Self-correcting loop: Automatically evaluates and improves images with intelligent iteration strategies
- 🎨 Gemini-powered: Uses Gemini 2.5 Flash Image Preview for both generation and evaluation
- 🧠 Smart iteration: Advanced prompt engineering prevents repetitive loops and adapts strategies
- 💻 Multiple interfaces: CLI, Python API, and Web UI with dark theme support
- 📊 Detailed feedback: Get insights into what's working and what needs improvement
- 💾 Session tracking: Save all iterations and see the improvement process
- ⚙️ Highly configurable: Customize models, thresholds, and iteration limits
- 🚀 Automated releases: Easy version management and publishing
🚀 Quick Start
Requirements: Python 3.12+ (tested on Python 3.12 & 3.13)
💡 New in v0.1.3: Check your installed version with straighten --version
For Local Development
If you're working with the source code locally:
# Clone the repository
git clone https://github.com/velvet-shark/banana-straightener.git
cd banana-straightener
# Install uv if you don't have it
curl -LsSf https://astral.sh/uv/install.sh | sh
# Create virtual environment and install in editable mode
uv venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
uv pip install -e .
# Set your API key (choose one method)
# Method 1: Create .env file (recommended for local development)
echo 'GEMINI_API_KEY=your-api-key-here' > .env
# Method 2: Set environment variable
export GEMINI_API_KEY="your-api-key-here"
# Test it works
uv run python -c "from banana_straightener import BananaStraightener; print('✅ Import successful!')"
# Test CLI (should show help)
uv run python -m banana_straightener.cli --help
# Run comprehensive local test
uv run python test_local.py
For Production Use
Install from PyPI:
# Install uv if you don't have it
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install Banana Straightener
uv pip install banana-straightener
Get your API key
- Visit Google AI Studio
- Create a new API key
- Set it up (choose one method):
Option A: .env file (recommended)
echo 'GEMINI_API_KEY=your-api-key-here' > .env
Option B: Environment variable
export GEMINI_API_KEY="your-api-key-here"
Basic Usage
Command Line:
# Generate from a prompt
straighten generate "a perfectly straight banana"
# Modify an existing image
straighten generate "add wings to the cat" --image cat.jpg
# With custom settings
straighten generate "futuristic city" --iterations 10 --save-all
Python API:
from banana_straightener import BananaStraightener
agent = BananaStraightener()
result = agent.straighten("a dragon reading a book in a library")
if result['success']:
    result['final_image'].save('dragon_librarian.png')
    print(f"Success in {result['iterations']} iterations!")
Web UI:
straighten ui
# Opens http://localhost:7860 in your browser
🧠 How It Works
- Generate: Create an initial image based on your prompt
- Evaluate: Use Gemini to analyze if the image matches your intent
- Improve: If needed, enhance the prompt with specific feedback
- Iterate: Repeat until the image is right or max iterations reached
- Success: Get your perfectly straightened banana! 🍌
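The loop above can be sketched in a few lines. This is a minimal illustration, not the package's implementation: `straighten_loop`, `fake_generate`, and `fake_evaluate` are hypothetical stand-ins, and the stub evaluator simply scores higher once the prompt carries feedback.

```python
from typing import Callable


def straighten_loop(
    prompt: str,
    generate: Callable[[str], str],
    evaluate: Callable[[str, str], tuple[float, str]],
    max_iterations: int = 5,
    threshold: float = 0.85,
) -> dict:
    """Generate, evaluate, and refine until the score reaches the threshold."""
    current_prompt = prompt
    best = {"confidence": 0.0, "image": None}
    for i in range(1, max_iterations + 1):
        image = generate(current_prompt)
        # Always judge the image against the original intent, not the refined prompt.
        confidence, feedback = evaluate(image, prompt)
        if confidence > best["confidence"]:
            best = {"confidence": confidence, "image": image}
        if confidence >= threshold:
            return {"success": True, "iterations": i, **best}
        # Fold the evaluator's feedback into the next generation prompt.
        current_prompt = f"{prompt}. Fix: {feedback}"
    return {"success": False, "iterations": max_iterations, **best}


def fake_generate(prompt: str) -> str:
    # Hypothetical stub: a real generator would return an image object.
    return f"<image for: {prompt}>"


def fake_evaluate(image: str, target: str) -> tuple[float, str]:
    # Hypothetical stub: pretend one round of feedback is enough.
    score = 0.9 if "Fix:" in image else 0.5
    return score, "banana is still curved"


result = straighten_loop("a perfectly straight banana", fake_generate, fake_evaluate)
print(result["success"], result["iterations"])  # True 2
```

Tracking the best attempt separately means a run that never crosses the threshold can still return its strongest image.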
🚀 Publishing & Releases
For Contributors
To create a new release (maintainers only):
# Method 1: Use the automated bump script (recommended)
uv run python scripts/bump-version.py --patch --release
# or
uv run python scripts/bump-version.py 0.1.4 --release
# Method 2: Manual version update
# 1. Update version in pyproject.toml: version = "0.1.4"
# 2. Update version in src/banana_straightener/__init__.py: __version__ = "0.1.4"
# 3. Commit and push - GitHub Actions will automatically create the release
# Method 3: Manual release with script only (no auto-publish)
# python scripts/bump-version.py 0.1.4 (without --release flag)
The automated release system will:
- ✅ Update version files
- ✅ Create GitHub release with changelog
- ✅ Publish to PyPI automatically
- ✅ Run all tests before publishing
For detailed release instructions, see RELEASE.md.
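As a rough illustration of what a patch bump does to a version string, here is a minimal sketch. The `bump` function is hypothetical and is not the code in scripts/bump-version.py. It assumes plain MAJOR.MINOR.PATCH versions, and only the patch case corresponds to a flag shown above.

```python
def bump(version: str, part: str = "patch") -> str:
    """Return the next version for a plain MAJOR.MINOR.PATCH string."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown version part: {part!r}")


print(bump("0.1.3"))  # 0.1.4
```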
For Users
To update to the latest version:
# Check current version
straighten --version
# Update to latest
uv pip install --upgrade banana-straightener
# Or with regular pip
pip install --upgrade banana-straightener
📖 Detailed Usage
Command Line Interface
The CLI provides the most comprehensive control over the straightening process:
# Basic generation
straighten generate "your prompt here"
# All available options:
#   --image       Starting image (optional)
#   --iterations  Max iterations (default: 5)
#   --threshold   Success threshold (default: 0.85)
#   --output      Output directory
#   --save-all    Save intermediate images
#   --open        Open results folder when done
straighten generate "a majestic dragon" \
  --image input.jpg \
  --iterations 10 \
  --threshold 0.90 \
  --output ./my_outputs \
  --save-all \
  --open
# Other useful commands
straighten examples # Show example prompts
straighten config # Show current configuration
straighten ui --port 8080 # Launch web UI on custom port
straighten --version # Show installed version
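To make the option defaults concrete, the sketch below mirrors the documented `generate` flags with the standard-library `argparse`. It is a hypothetical stand-in, not the package's CLI: the parser name `straighten-sketch` and the `./outputs` default for `--output` (borrowed from the OUTPUT_DIR setting) are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical mirror of the documented `straighten generate` options."""
    parser = argparse.ArgumentParser(prog="straighten-sketch")
    commands = parser.add_subparsers(dest="command", required=True)
    generate = commands.add_parser("generate")
    generate.add_argument("prompt")
    generate.add_argument("--image", default=None)            # starting image (optional)
    generate.add_argument("--iterations", type=int, default=5)
    generate.add_argument("--threshold", type=float, default=0.85)
    generate.add_argument("--output", default="./outputs")    # assumed default
    generate.add_argument("--save-all", action="store_true")
    generate.add_argument("--open", action="store_true")
    return parser


args = build_parser().parse_args(
    ["generate", "futuristic city", "--iterations", "10", "--save-all"]
)
print(args.iterations, args.threshold, args.save_all)  # 10 0.85 True
```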
Python API
For integration into your own applications:
from banana_straightener import BananaStraightener, Config
from PIL import Image
# Basic usage
agent = BananaStraightener()
result = agent.straighten("a sunset over mountains")
# Custom configuration
config = Config(
    api_key="your-key",
    default_max_iterations=8,
    success_threshold=0.90,
    save_intermediates=True
)
agent = BananaStraightener(config)
# With input image
input_img = Image.open("photo.jpg")
result = agent.straighten(
    prompt="make this photo look like a painting",
    input_image=input_img,
    max_iterations=5
)
# Process results
if result['success']:
    print(f"✅ Success after {result['iterations']} iterations!")
    result['final_image'].save('output.png')
else:
    print(f"⚠️ Best attempt: {result['best_confidence']:.1%} confidence")
# Real-time processing with generator
for iteration in agent.straighten_iterative("abstract art"):
    print(f"Iteration {iteration['iteration']}: {iteration['evaluation']['confidence']:.1%}")
    if iteration['success']:
        break
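When consuming the iterative generator, it can help to remember the strongest attempt so far, since confidence does not have to rise on every iteration. The sketch below assumes each yielded item has the `iteration`, `evaluation['confidence']`, and `success` keys used above. `best_so_far` and `fake_stream` are hypothetical and stand in for a real `straighten_iterative` call.

```python
from typing import Iterable, Iterator


def best_so_far(iterations: Iterable[dict]) -> Iterator[tuple[dict, dict]]:
    """Yield (current, best) pairs and stop after the first successful iteration."""
    best = None
    for current in iterations:
        score = current["evaluation"]["confidence"]
        if best is None or score > best["evaluation"]["confidence"]:
            best = current
        yield current, best
        if current["success"]:
            return


# Hypothetical data in the shape the generator is assumed to yield.
fake_stream = [
    {"iteration": 1, "evaluation": {"confidence": 0.62}, "success": False},
    {"iteration": 2, "evaluation": {"confidence": 0.55}, "success": False},
    {"iteration": 3, "evaluation": {"confidence": 0.91}, "success": True},
    {"iteration": 4, "evaluation": {"confidence": 0.95}, "success": True},  # never reached
]

for current, best in best_so_far(fake_stream):
    print(
        f"Iteration {current['iteration']}: "
        f"{current['evaluation']['confidence']:.1%} "
        f"(best so far: iteration {best['iteration']})"
    )
```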
Web Interface
The Gradio web UI provides an intuitive interface for interactive use:
- Real-time progress: Watch iterations happen live
- Gallery view: See all attempts side-by-side
- Detailed evaluation: Understand what the AI sees
- Easy sharing: Generate public links with --share
Launch with: straighten ui
⚙️ Configuration
Configuration Options
Recommended: Create a .env file in your project directory:
# Required
GEMINI_API_KEY=your_api_key_here
# Optional - Model Selection
GENERATOR_MODEL=gemini-2.5-flash
EVALUATOR_MODEL=gemini-2.5-flash
# Optional - Generation Settings
MAX_ITERATIONS=5
SUCCESS_THRESHOLD=0.85
SAVE_INTERMEDIATES=false
OUTPUT_DIR=./outputs
# Optional - UI Settings
GRADIO_PORT=7860
GRADIO_SHARE=false
Alternative: Environment variables (useful for deployment)
export GEMINI_API_KEY="your_api_key_here"
export MAX_ITERATIONS=5
export SUCCESS_THRESHOLD=0.85
# ... etc
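Environment values always arrive as strings, so they need converting before use. The sketch below shows one way to do that for the generation settings listed above. `EnvSettings` and `settings_from_env` are hypothetical names, not the package's `Config`, and the check that the threshold lies between 0 and 1 is an assumption.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping


@dataclass(frozen=True)
class EnvSettings:
    """Hypothetical typed view of the documented generation settings."""
    max_iterations: int = 5
    success_threshold: float = 0.85
    save_intermediates: bool = False
    output_dir: Path = Path("./outputs")


def settings_from_env(environ: Mapping[str, str]) -> EnvSettings:
    """Convert string environment values into typed settings."""
    threshold = float(environ.get("SUCCESS_THRESHOLD", "0.85"))
    if not 0.0 <= threshold <= 1.0:  # assumed valid range
        raise ValueError(f"SUCCESS_THRESHOLD must be in [0, 1], got {threshold}")
    save_flag = environ.get("SAVE_INTERMEDIATES", "false").strip().lower()
    return EnvSettings(
        max_iterations=int(environ.get("MAX_ITERATIONS", "5")),
        success_threshold=threshold,
        save_intermediates=save_flag in {"1", "true", "yes"},
        output_dir=Path(environ.get("OUTPUT_DIR", "./outputs")),
    )


settings = settings_from_env({"MAX_ITERATIONS": "8", "SAVE_INTERMEDIATES": "true"})
print(settings)
```

In real use, passing `os.environ` instead of a literal dict would read the live environment.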
The system will check for API keys in this order:
- .env file in current or parent directories
- GEMINI_API_KEY environment variable
- GOOGLE_API_KEY environment variable
Python Configuration
from pathlib import Path
from banana_straightener import Config
config = Config(
    api_key="your-key",
    generator_model="gemini-2.5-flash",
    evaluator_model="gemini-2.5-flash",
    default_max_iterations=5,
    success_threshold=0.85,
    save_intermediates=True,
    output_dir=Path("./my_outputs")
)
🎨 Example Prompts
Creative Prompts
- "A majestic dragon reading a book in an ancient library"
- "A steampunk robot tending a garden of mechanical flowers"
- "A cozy coffee shop on a rainy evening with warm lighting"
- "Abstract art inspired by the sound of ocean waves"
Technical/Specific
- "A perfectly straight banana on a white background"
- "Technical diagram of a spaceship engine with labels"
- "Logo design for a tech startup, minimalist, blue and white"
- "Architectural blueprint of a modern sustainable house"
Photo Editing
- "Add a rainbow to this landscape photo"
- "Make this portrait look like an oil painting"
- "Remove the background and add a sunset sky"
- "Convert this photo to black and white with dramatic lighting"
🔧 Advanced Usage
Custom Evaluation Criteria
# Modify evaluation prompt template
config = Config()
config.evaluation_prompt_template = """
Analyze this image for: "{target_prompt}"
Rate on these specific criteria:
1. Technical quality (composition, lighting)
2. Prompt adherence (how well it matches)
3. Artistic appeal (creativity, style)
Provide scores 0.0-1.0 for each...
"""
Batch Processing
from banana_straightener import BananaStraightener
prompts = [
    "a red car on a mountain road",
    "a cat wearing sunglasses",
    "abstract geometric patterns"
]
results = []
agent = BananaStraightener()
for prompt in prompts:
    result = agent.straighten(prompt, max_iterations=3)
    results.append(result)
    print(f"Completed: {prompt}")
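After a batch run, a short summary makes it easier to see how the prompts fared. The sketch below relies only on the `success` and `iterations` keys shown in the result dictionaries above. `summarize` and `fake_results` are hypothetical, so the example runs without API calls.

```python
def summarize(results: list[dict]) -> dict:
    """Aggregate success rate and mean iteration count over batch results."""
    total = len(results)
    succeeded = sum(1 for r in results if r["success"])
    return {
        "total": total,
        "succeeded": succeeded,
        "success_rate": succeeded / total if total else 0.0,
        "mean_iterations": sum(r["iterations"] for r in results) / total if total else 0.0,
    }


# Hypothetical results in the shape returned by agent.straighten().
fake_results = [
    {"success": True, "iterations": 2},
    {"success": False, "iterations": 3},
    {"success": True, "iterations": 1},
]

summary = summarize(fake_results)
print(f"{summary['succeeded']}/{summary['total']} succeeded, "
      f"mean iterations: {summary['mean_iterations']:.1f}")
```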
Integration with Other Tools
# Use with image preprocessing
from banana_straightener import BananaStraightener
from PIL import Image, ImageEnhance
def preprocess_image(img):
    """Increase contrast by 20% before straightening."""
    enhancer = ImageEnhance.Contrast(img)
    return enhancer.enhance(1.2)
# Apply preprocessing
agent = BananaStraightener()
input_img = Image.open("photo.jpg")
enhanced_img = preprocess_image(input_img)
result = agent.straighten(
    "improve the lighting in this photo",
    input_image=enhanced_img
)
🛠️ Development
Setup Development Environment
# Clone the repository
git clone https://github.com/velvet-shark/banana-straightener.git
cd banana-straightener
# Install uv if you don't have it
curl -LsSf https://astral.sh/uv/install.sh | sh
# Create and activate virtual environment
uv venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install in development mode with all dependencies
uv pip install -e ".[dev]" # Quotes keep shells such as zsh from expanding the brackets
# Set up pre-commit hooks
pre-commit install
Running Tests
⚠️ Note: Most tests require a valid GEMINI_API_KEY for full functionality.
# Run quick tests (no API calls required)
uv run pytest tests/test_quick.py -v
# Run all tests (requires API key)
uv run pytest
# Run tests excluding slow API-dependent tests
uv run pytest -m "not slow"
# Run only fast integration tests
uv run pytest tests/test_integration.py -m "not slow"
# Run image generation tests (requires API key)
uv run pytest tests/test_image_generation.py -v
# Local development testing (comprehensive check)
uv run python test_local.py
# Quick manual image generation test
uv run python tests/test_quick_manual.py
# With coverage
uv run pytest --cov=banana_straightener --cov-report=term-missing
# Run specific test files
uv run pytest tests/test_quick.py tests/test_integration.py -v
Test Categories:
- test_quick.py - Fast tests, no API calls required
- test_image_generation.py - Core image generation functionality (requires API key)
- test_integration.py - End-to-end workflow tests (requires API key)
- test_local.py - Comprehensive local development validation script
- test_quick_manual.py - Simple manual test for image generation
Code Quality
# Format code
uv run black src/ tests/
# Check style
uv run flake8 src/ tests/
# Type checking
uv run mypy src/banana_straightener/
🤝 Contributing
Contributions are welcome! Here's how to get started:
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Make your changes and add tests
- Ensure code quality: Run black, flake8, and mypy
- Test thoroughly: pytest should pass
- Commit changes: git commit -m 'Add amazing feature'
- Push to branch: git push origin feature/amazing-feature
- Open a Pull Request
Areas for Contribution
- 🎨 New model integrations (DALL-E, Midjourney, etc.)
- 🔧 Evaluation improvements (custom metrics, multi-modal)
- 🌐 UI enhancements (mobile support, themes)
- 📚 Documentation (tutorials, examples, guides)
- 🧪 Testing (more test cases, integration tests)
- 🚀 CI/CD improvements (release automation, testing)
Development Workflow
All changes automatically trigger CI/CD:
- ✅ Tests run on Python 3.12 & 3.13
- ✅ Code quality checks (black, flake8, mypy)
- ✅ Automated releases when version changes
- ✅ PyPI publishing for tagged releases
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
🆘 Support
- Issues: GitHub Issues
- Discussions: GitHub Discussions
🙏 Acknowledgments
- Google Gemini for the powerful multimodal AI
- Gradio for the amazing web UI framework
- Rich for beautiful command-line interfaces
- The open-source community for inspiration and support
Made with 🍌 and ❤️
Straightening bananas, one pixel at a time