A template extension for TTS Generation WebUI
SongBloom TTS WebUI Extension
A Gradio-based extension for TTS Generation WebUI that integrates the SongBloom AI music generation model.
Features
- Interactive Gradio Interface: User-friendly web interface for music generation
- Lyrics-to-Music: Generate music from text lyrics with style prompts
- Audio Style Transfer: Use prompt audio to guide the style of generated music
- Multiple Model Support: Choose between different SongBloom model variants
- Batch Generation: Generate multiple samples with different variations
- Memory Optimization: Support for both float32 and bfloat16 precision
Installation
Prerequisites
- Install the extension:
pip install git+https://github.com/rsxdalv/tts_webui_extension.songbloom@main
- Install SongBloom (required dependency):
pip install git+https://github.com/CypressYang/SongBloom.git
System Requirements
- GPU: NVIDIA GPU with CUDA support (recommended)
- Memory:
  - 8GB+ GPU memory for float32 precision
  - 4GB+ GPU memory for bfloat16 precision
- Storage: ~2-4GB for model files (downloaded automatically)
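The memory figures above suggest a simple rule for picking a precision. The sketch below encodes that rule as a pure function. `choose_precision` is a hypothetical helper for illustration, not part of the extension, and the thresholds are the approximate values listed above rather than measured limits.

```python
from typing import Optional


def choose_precision(gpu_memory_gb: float) -> Optional[str]:
    """Pick a precision from available GPU memory (in GB).

    Thresholds mirror the requirements listed above: roughly 8 GB for
    float32 and 4 GB for bfloat16. Hypothetical helper, not extension API.
    """
    if gpu_memory_gb >= 8:
        return "float32"
    if gpu_memory_gb >= 4:
        return "bfloat16"
    return None  # below the documented minimum for either precision


if __name__ == "__main__":
    for mem in (12, 6, 2):
        print(mem, "GB ->", choose_precision(mem))
```

If PyTorch is installed, the total memory of the first GPU can be read with `torch.cuda.get_device_properties(0).total_memory` (in bytes) and divided by `1024 ** 3` before being passed to the function.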
Usage
Through TTS WebUI
- Install the extension in your TTS WebUI
- Navigate to the "Songbloom" tab
- Follow the interface instructions
Standalone Mode
Run the interface directly:
cd tts_webui_extension/songbloom
python gradio_ui.py
Interface Components
Input Section
- Model: Choose between available SongBloom variants
  - songbloom_full_150s: Base model (150 seconds training)
  - songbloom_full_150s_dpo: Enhanced model with DPO training
- Lyrics: Enter your song lyrics (supports verse/chorus structure)
- Prompt Audio: Upload an audio file to guide the musical style
- Precision: Choose between float32 (higher quality) or bfloat16 (memory efficient)
- Number of Samples: Generate 1-5 variations
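The inputs above have a small set of valid values, so they can be checked before a generation run starts. The following is a minimal sketch of such a check. `GenerationRequest` and `validate` are hypothetical names, not the extension's own types, and the sketch assumes that lyrics and prompt audio are both required.

```python
from dataclasses import dataclass
from typing import List

# Values taken from the option descriptions above.
MODELS = ("songbloom_full_150s", "songbloom_full_150s_dpo")
PRECISIONS = ("float32", "bfloat16")


@dataclass
class GenerationRequest:
    """Hypothetical container for the inputs; not the extension's own type."""
    model: str
    lyrics: str
    prompt_audio_path: str
    precision: str = "float32"
    num_samples: int = 1


def validate(req: GenerationRequest) -> List[str]:
    """Return human-readable problems; an empty list means the request is valid."""
    problems = []
    if req.model not in MODELS:
        problems.append(f"unknown model: {req.model!r}")
    if not req.lyrics.strip():
        problems.append("lyrics are empty")
    if not req.prompt_audio_path:
        # Assumption: a prompt audio file is always required.
        problems.append("prompt audio is missing")
    if req.precision not in PRECISIONS:
        problems.append(f"unknown precision: {req.precision!r}")
    if not 1 <= req.num_samples <= 5:
        problems.append("number of samples must be between 1 and 5")
    return problems
```

Returning a list of problems rather than raising on the first one lets a UI show every issue in the status field at once.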
Output Section
- Status: Real-time progress and error messages
- Generated Audio: Individual audio players for each generated sample
Example Usage
- Upload Prompt Audio: Choose a song or instrumental that represents your desired style
- Enter Lyrics: Write structured lyrics like:
Verse 1:
Walking down the street tonight
Under neon city lights
Chorus:
Let the rhythm take control
Feel it deep within your soul
- Select Model: Choose your preferred model variant
- Generate: Click "Generate Music" and wait for results
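When lyrics come from another program rather than being typed by hand, the labelled sections can be assembled with a small helper. This is a sketch only: `format_lyrics` is a hypothetical function, and the "Label:" layout simply follows the lyrics example in the steps above. Whether SongBloom expects exactly this layout is an assumption to verify against its own documentation.

```python
from typing import Iterable, List, Tuple


def format_lyrics(sections: Iterable[Tuple[str, List[str]]]) -> str:
    """Join (label, lines) pairs into labelled blocks separated by blank lines.

    Hypothetical helper; the output layout is an assumption based on the
    example lyrics, not a format confirmed by SongBloom.
    """
    blocks = []
    for label, lines in sections:
        blocks.append(f"{label}:\n" + "\n".join(lines))
    return "\n\n".join(blocks)


if __name__ == "__main__":
    print(format_lyrics([
        ("Verse 1", ["Walking down the street tonight", "Under neon city lights"]),
        ("Chorus", ["Let the rhythm take control", "Feel it deep within your soul"]),
    ]))
```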
Tips for Best Results
- Prompt Audio Quality: Use high-quality audio files with clear musical elements
- Lyrics Structure: Well-structured lyrics with clear verses and choruses work best
- Style Consistency: The prompt audio should match your desired output style
- Memory Management: Use bfloat16 if you encounter GPU memory issues
- Multiple Samples: Generate several samples to get the best results
Troubleshooting
Common Issues
- "SongBloom not installed" error:
pip install git+https://github.com/CypressYang/SongBloom.git
- GPU memory errors:
  - Switch to bfloat16 precision
  - Reduce number of samples
  - Close other GPU-intensive applications
- Model download failures:
  - Check internet connection
  - Verify Hugging Face Hub access
  - Clear cache directory and retry
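Before working through the list above, it can help to confirm which dependencies Python can actually see. The sketch below uses the standard library's `importlib.util.find_spec` to test for a module without importing it. `is_installed` is a hypothetical helper, and the import name `SongBloom` is an assumption; the installed package may expose a different top-level module.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module with this name can be located.

    The module is not imported, so this is safe to run even when the
    package would fail at import time (for example, without a GPU).
    """
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "SongBloom" is an assumed import name; adjust it if the package differs.
    for name in ("SongBloom", "gradio", "torch"):
        status = "found" if is_installed(name) else "missing"
        print(f"{name}: {status}")
```

Run it with the same Python interpreter that TTS WebUI uses; a package installed into a different environment will show up as missing.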
Development
To run the extension standalone:
cd tts_webui_extension/songbloom
python gradio_ui.py
License
Apache License, Version 2.0
Credits
- Original SongBloom model by Cypress Yang
- TTS WebUI integration by rsxdalv
File details
Details for the file tts_webui_extension_songbloom-0.1.3-py3-none-any.whl.
File metadata
- Download URL: tts_webui_extension_songbloom-0.1.3-py3-none-any.whl
- Upload date:
- Size: 8.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.19
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | abe29f369cb25b881bde63ab75507f827fe3c2252dbe149f86f7f3a634c9975c |
| MD5 | 5d1d0db383fd898366ebf26b108c79d5 |
| BLAKE2b-256 | bfb9493bf83256c60bda2105b940cc166f9dc97f71c23b11a7098e2d26bbe821 |