Better Lyrics Translation Toolkit - AI-powered song translation with music constraint preservation
🥪 BLT - Better Lyrics Translation Toolkit
BLT is a toolkit for lyrics and singing voice. The toolkit contains three modular components that can be used independently or combined through pre-defined pipelines.
Demo
Quick Start

    from blt.translators import SoramimiTranslationAgent

    # Soramimi translation (phonetic matching)
    agent = SoramimiTranslationAgent()
    result = agent.translate(["Your lyrics here"])
    print(result.soramimi_lines)  # Phonetically matched translation

Components
1. Translator
IPA-based lyrics translation tools with music constraints:
| Tool | Description |
|---|---|
| `LyricsTranslationAgent` | Main translator with syllable/rhyme preservation |
| `SoramimiTranslationAgent` | Soramimi (そらみみ, 空耳, "misheard lyrics") translator - creates text that sounds like the original |
Music Constraints Extracted:
- `syllable_counts`: `list[int]` (e.g. `[4, 3]`)
  - Chinese: Character-based
  - Other languages: IPA vowel nuclei
- `syllable_patterns`: `list[list[int]]` (e.g. `[[1, 1, 2], [1, 2]]`)
  - With audio (WIP): Alignment problem - timing sync with vocals
  - Without audio: Word segmentation problem
    - Chinese: HanLP tokenizer
    - English: Space splitting
    - Other languages: LLM-based
- `rhyme_scheme`: `str` (e.g. `AB`)
  - Chinese: Pinyin finals
  - Other languages: IPA phonemes
- `ipa_similarity`: `float` (e.g. `0.5`)
  - Phonetic similarity threshold for soramimi translation
  - Measured using IPA phoneme matching between source and target
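
To make the four constraints concrete, here is a minimal sketch of how each could be computed. Every name in it (`count_syllables_chinese`, `count_syllables_ipa`, `syllable_pattern`, `rhyme_scheme`, `ipa_similarity`) is a hypothetical helper for illustration, not BLT's API. It assumes the IPA string has already been produced elsewhere, it treats any unbroken run of vowel symbols as one nucleus (so adjacent vowels in hiatus are undercounted), and it uses `difflib.SequenceMatcher` as a simple stand-in for real phoneme matching.

```python
import re
from difflib import SequenceMatcher

# Hypothetical helpers for illustration only; not BLT's API.
HAN = re.compile(r"[\u4e00-\u9fff]")
IPA_VOWELS = set("aeiouyɑɐɒæɛɜəɪɔʊʌøœɯɨʉ")  # simplified vowel inventory (assumption)
LENGTH_MARKS = set("ːˑ")


def count_syllables_chinese(line: str) -> int:
    """Character-based count: one syllable per Han character."""
    return len(HAN.findall(line))


def count_syllables_ipa(ipa: str) -> int:
    """Count vowel nuclei; an unbroken run of vowel symbols is one nucleus."""
    count, in_nucleus = 0, False
    for ch in ipa:
        if ch in IPA_VOWELS:
            if not in_nucleus:
                count += 1
            in_nucleus = True
        elif ch not in LENGTH_MARKS:  # length marks do not end a nucleus
            in_nucleus = False
    return count


def syllable_pattern(ipa_line: str) -> list[int]:
    """Per-word syllable counts for a space-separated IPA line."""
    return [count_syllables_ipa(word) for word in ipa_line.split()]


def rhyme_scheme(line_endings: list[str]) -> str:
    """Label each distinct line ending with the next unused letter."""
    labels: dict[str, str] = {}
    for ending in line_endings:
        labels.setdefault(ending, chr(ord("A") + len(labels)))
    return "".join(labels[ending] for ending in line_endings)


def ipa_similarity(source_ipa: str, target_ipa: str) -> float:
    """Similarity in [0, 1]; SequenceMatcher is a stand-in, not phoneme-aware."""
    return SequenceMatcher(None, source_ipa, target_ipa).ratio()


print(count_syllables_chinese("月亮代表我的心"))     # 7
print(syllable_pattern("həˈloʊ wɜːld"))            # [2, 1]
print(rhyme_scheme(["ing", "ao", "ing", "ao"]))    # ABAB
```

The line endings passed to `rhyme_scheme` would be Pinyin finals for Chinese or trailing IPA phonemes for other languages; extracting them is left out of this sketch.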
Translation Flow

    flowchart TD
        A[Source Lyrics] --> B[LyricsAnalyzer]
        B --> |Extract Constraints| C{TranslationAgent}
        C --> |Generate Translation| D[Validator]
        D --> |Check Constraints| E{Valid or Max Retries}
        E --> |No| C
        E --> |Yes| F[Target Lyrics]
        style B fill:#64b5f6,stroke:#1976d2,stroke-width:2px,color:#fff
        style C fill:#1976d2,stroke:#0d47a1,stroke-width:2px,color:#fff
        style D fill:#42a5f5,stroke:#1976d2,stroke-width:2px,color:#fff

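
The generate, validate, retry loop in the flow above can be sketched in plain Python. This is an illustration under stated assumptions, not BLT's implementation: `translate_with_retries`, the `Generate` callable type, and the feedback string are all hypothetical, only syllable counts are validated, and a canned stub stands in for the LLM call.

```python
from typing import Callable, Optional

# Hypothetical signature: (source lines, target syllable counts, feedback) -> draft lines
Generate = Callable[[list[str], list[int], Optional[str]], list[str]]


def translate_with_retries(
    source: list[str],
    target_counts: list[int],
    generate: Generate,
    count_syllables: Callable[[str], int],
    max_retries: int = 3,
) -> tuple[list[str], int]:
    """Return (lines, attempts). Stops when valid or when max_retries is reached."""
    feedback: Optional[str] = None
    candidate: list[str] = []
    for attempt in range(1, max_retries + 1):
        candidate = generate(source, target_counts, feedback)
        actual = [count_syllables(line) for line in candidate]
        if actual == target_counts:
            return candidate, attempt  # constraints satisfied
        # Feed the mismatch back so the next draft can correct it.
        feedback = f"expected syllables {target_counts}, got {actual}"
    return candidate, max_retries  # best effort after exhausting retries


# Stub generator standing in for the LLM: first draft is too short, second fits.
drafts = iter([["la la la"], ["la la la la"]])
result = translate_with_retries(
    source=["Your lyrics here"],
    target_counts=[4],
    generate=lambda src, counts, feedback: next(drafts),
    count_syllables=lambda line: len(line.split()),  # placeholder counter
)
print(result)  # (['la la la la'], 2)
```

Returning the last candidate when retries run out mirrors the "Valid or Max Retries" decision node, where either condition leads to the target lyrics.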
2. Synthesizer (WIP)
| Tool | Description |
|---|---|
| `VocalSeparator` | Vocal / instrumental separation |
| `VoiceConverter` | Voice conversion (RVC) |
| `LyricsAligner` | Timing alignment |
| `AudioMixer` | Audio mixing with automatic resampling |
| `VideoGenerator` | Video generation (KTV, Lip-Synced) |
3. Pipeline (WIP)
| Pipeline | Description |
|---|---|
| `RVCKTVPipeline` | RVC voice conversion + KTV video with subtitles |
Requirements
- Python 3.11+
- espeak-ng (IPA analysis)
- Ollama + Qwen3: `ollama pull qwen3:30b-a3b-instruct-2507-q4_K_M`
- (Optional) LangSmith API key for tracing/monitoring
- (Optional) RVC_ZERO for voice conversion
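
Before installing, it can help to confirm the required tools are present. The following is a small sketch, not part of BLT: `check_requirements` is a hypothetical helper, and it assumes the executables are named `espeak-ng` and `ollama` and are discoverable on `PATH`. It does not check whether the Qwen3 model has been pulled.

```python
import shutil
import sys


def check_requirements(
    min_python: tuple[int, int] = (3, 11),
    tools: tuple[str, ...] = ("espeak-ng", "ollama"),
) -> dict[str, bool]:
    """Map each requirement name to True if it looks satisfied."""
    report = {"python": sys.version_info[:2] >= min_python}
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        report[tool] = shutil.which(tool) is not None
    return report


for name, ok in check_requirements().items():
    print(f"{name}: {'ok' if ok else 'missing'}")
```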
Setup

    uv venv --python 3.11
    source .venv/bin/activate
    uv sync

Model Files
Download and place these model files in `assets/`:
- Wav2Lip model (for lip-sync): `assets/wav2lip_gan.pth`
- RVC model (for voice conversion): `assets/model.pth` and `assets/model.index`
  - Download: https://huggingface.co/spaces/r3gm/rvc_zero or train your own
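
A quick way to verify the files listed above are in place is sketched below. `missing_model_files` and the feature labels are hypothetical names for illustration; only the three file names come from the list above.

```python
from pathlib import Path

# File names are taken from the list above; the feature labels are illustrative.
REQUIRED_FILES = {
    "lip-sync (Wav2Lip)": ["wav2lip_gan.pth"],
    "voice conversion (RVC)": ["model.pth", "model.index"],
}


def missing_model_files(assets_dir: str = "assets") -> dict[str, list[str]]:
    """Return {feature: [missing file names]} for features with missing files."""
    root = Path(assets_dir)
    missing: dict[str, list[str]] = {}
    for feature, names in REQUIRED_FILES.items():
        absent = [name for name in names if not (root / name).is_file()]
        if absent:
            missing[feature] = absent
    return missing


for feature, names in missing_model_files().items():
    print(f"{feature}: missing {', '.join(names)}")
```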
Acknowledgments
Built with: LangGraph, LangChain, Ollama, PyTorch, Pydantic AI, Demucs, XTTS, HanLP, Phonemizer, Panphon, RVC, Wav2Lip, Whisper, Qwen3
Disclaimer
This project is intended for research and educational purposes only. All demo content is used for demonstration purposes. If you believe any content infringes on your rights, please contact us and we will remove it promptly.
License
Apache License 2.0