A template extension for TTS Generation WebUI

SongBloom TTS WebUI Extension

A Gradio-based extension for TTS Generation WebUI that integrates the SongBloom AI music generation model.

Features

  • Interactive Gradio Interface: User-friendly web interface for music generation
  • Lyrics-to-Music: Generate music from text lyrics with style prompts
  • Audio Style Transfer: Use prompt audio to guide the style of generated music
  • Multiple Model Support: Choose between different SongBloom model variants
  • Batch Generation: Generate multiple samples with different variations
  • Memory Optimization: Support for both float32 and bfloat16 precision

Installation

Prerequisites

  1. Install the extension:
    pip install git+https://github.com/rsxdalv/tts_webui_extension.songbloom@main
  2. Install SongBloom (required dependency):
    pip install git+https://github.com/CypressYang/SongBloom.git

System Requirements

  • GPU: NVIDIA GPU with CUDA support (recommended)
  • Memory:
    • 8GB+ GPU memory for float32 precision
    • 4GB+ GPU memory for bfloat16 precision
  • Storage: ~2-4GB for model files (downloaded automatically)
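
The memory thresholds above can be turned into a small pre-flight check. The following is a minimal sketch, not part of the extension: the function names choose_precision and detect_gpu_memory_gb are hypothetical, and the detection step assumes PyTorch is installed (it returns None otherwise).

```python
from typing import Optional

# Thresholds taken from the requirements listed above (GB of GPU memory).
FLOAT32_MIN_GB = 8.0
BFLOAT16_MIN_GB = 4.0


def choose_precision(gpu_memory_gb: float) -> Optional[str]:
    """Return the highest precision the stated requirements allow, or None."""
    if gpu_memory_gb >= FLOAT32_MIN_GB:
        return "float32"
    if gpu_memory_gb >= BFLOAT16_MIN_GB:
        return "bfloat16"
    return None


def detect_gpu_memory_gb() -> Optional[float]:
    """Total memory of the first CUDA device in decimal GB, or None."""
    try:
        import torch  # optional; only needed for detection
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    # Decimal gigabytes (1e9 bytes), matching how GPU sizes are advertised.
    return torch.cuda.get_device_properties(0).total_memory / 1e9


if __name__ == "__main__":
    memory = detect_gpu_memory_gb()
    if memory is None:
        print("No CUDA GPU detected")
    else:
        print(f"{memory:.1f} GB -> {choose_precision(memory)}")
```

Treat the result as a rough guide only: other applications using the GPU reduce the memory actually available.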

Usage

Through TTS WebUI

  1. Install the extension in your TTS WebUI
  2. Navigate to the "Songbloom" tab
  3. Follow the interface instructions

Standalone Mode

Run the interface directly:

    cd tts_webui_extension/songbloom
    python gradio_ui.py

Interface Components

Input Section

  • Model: Choose between available SongBloom variants
    • songbloom_full_150s: Base model (generates up to 150 seconds of audio)
    • songbloom_full_150s_dpo: Enhanced model with DPO training
  • Lyrics: Enter your song lyrics (supports verse/chorus structure)
  • Prompt Audio: Upload an audio file to guide the musical style
  • Precision: Choose between float32 (higher quality) or bfloat16 (memory efficient)
  • Number of Samples: Generate 1-5 variations
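
The inputs above can be summarized as a small settings object with validation. This is an illustrative sketch only: the GenerationSettings class and its field names are hypothetical and are not the extension's actual API. The allowed values are the ones listed in this section.

```python
from dataclasses import dataclass

# Values listed in the Input Section above.
MODELS = ("songbloom_full_150s", "songbloom_full_150s_dpo")
PRECISIONS = ("float32", "bfloat16")
MIN_SAMPLES, MAX_SAMPLES = 1, 5


@dataclass(frozen=True)
class GenerationSettings:
    """Hypothetical container mirroring the interface inputs."""

    model: str
    lyrics: str
    prompt_audio_path: str
    precision: str = "float32"
    num_samples: int = 1

    def __post_init__(self) -> None:
        # Reject values the interface would not offer.
        if self.model not in MODELS:
            raise ValueError(f"unknown model: {self.model!r}")
        if self.precision not in PRECISIONS:
            raise ValueError(f"unknown precision: {self.precision!r}")
        if not MIN_SAMPLES <= self.num_samples <= MAX_SAMPLES:
            raise ValueError("num_samples must be between 1 and 5")
        if not self.lyrics.strip():
            raise ValueError("lyrics must not be empty")


if __name__ == "__main__":
    settings = GenerationSettings(
        model="songbloom_full_150s_dpo",
        lyrics="Verse 1:\nWalking down the street tonight",
        prompt_audio_path="prompt.wav",  # placeholder file name
        precision="bfloat16",
        num_samples=2,
    )
    print(settings)
```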

Output Section

  • Status: Real-time progress and error messages
  • Generated Audio: Individual audio players for each generated sample

Example Usage

  1. Upload Prompt Audio: Choose a song or instrumental that represents your desired style
  2. Enter Lyrics: Write structured lyrics like:
    Verse 1:
    Walking down the street tonight
    Under neon city lights
    
    Chorus:
    Let the rhythm take control
    Feel it deep within your soul
    
  3. Select Model: Choose your preferred model variant
  4. Generate: Click "Generate Music" and wait for results
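
To check that lyrics are structured as in step 2 before pasting them in, a short script can split them into labeled sections. This is a sketch under assumptions: the "Verse N:" and "Chorus:" header pattern is taken from the example above, the split_sections helper is hypothetical, and nothing here reflects how the extension or SongBloom parses lyrics internally.

```python
import re
from typing import List, Tuple

# Matches header lines such as "Verse 1:" or "Chorus:" (case-insensitive).
HEADER = re.compile(r"^(verse(?:\s+\d+)?|chorus)\s*:$", re.IGNORECASE)


def split_sections(lyrics: str) -> List[Tuple[str, List[str]]]:
    """Group lyric lines under the most recent section header."""
    sections: List[Tuple[str, List[str]]] = []
    for raw in lyrics.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines only separate sections visually
        match = HEADER.match(line)
        if match:
            sections.append((match.group(1), []))
        elif sections:
            sections[-1][1].append(line)
        else:
            # Text before any header goes into an unlabeled section.
            sections.append(("", [line]))
    return sections


if __name__ == "__main__":
    example = """
    Verse 1:
    Walking down the street tonight
    Under neon city lights

    Chorus:
    Let the rhythm take control
    Feel it deep within your soul
    """
    for label, lines in split_sections(example):
        print(f"{label or '(unlabeled)'}: {len(lines)} line(s)")
```

An empty or unlabeled section in the output is a quick hint that a header is missing or misspelled.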

Tips for Best Results

  1. Prompt Audio Quality: Use high-quality audio files with clear musical elements
  2. Lyrics Structure: Well-structured lyrics with clear verses and choruses work best
  3. Style Consistency: The prompt audio should match your desired output style
  4. Memory Management: Use bfloat16 if you encounter GPU memory issues
  5. Multiple Samples: Generate several samples to get the best results

Troubleshooting

Common Issues

  1. "SongBloom not installed" error:

    pip install git+https://github.com/CypressYang/SongBloom.git
    
  2. GPU memory errors:

    • Switch to bfloat16 precision
    • Reduce number of samples
    • Close other GPU-intensive applications
  3. Model download failures:

    • Check internet connection
    • Verify Hugging Face Hub access
    • Clear cache directory and retry
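
For the first issue, a quick way to see whether a package is importable in the current environment is to query the import system directly. The sketch below is an assumption-laden example: the module name "SongBloom" is a guess based on the repository name and may differ from the actual import name, and the helper names are hypothetical.

```python
import importlib.util
from typing import Optional

# Install command quoted from the troubleshooting step above.
INSTALL_COMMAND = "pip install git+https://github.com/CypressYang/SongBloom.git"


def is_installed(module_name: str) -> bool:
    """True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


def install_hint(module_name: str = "SongBloom") -> Optional[str]:
    """Return an install hint if the module is missing, else None.

    The default module name is an assumption, not confirmed by the source.
    """
    if is_installed(module_name):
        return None
    return f"{module_name} not found. Try: {INSTALL_COMMAND}"


if __name__ == "__main__":
    print(install_hint() or "SongBloom appears to be installed")
```

Run it with the same Python interpreter that launches the WebUI, since a package installed into a different environment will not be found.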

License

Apache License, Version 2.0
