Generate synchronized SRT subtitles using the ElevenLabs Forced Alignment API with AI-powered semantic segmentation

ElevenLabs Force Alignment SRT Generator

🎬 A Python tool for generating synchronized SRT subtitles using the ElevenLabs Forced Alignment API, with optional AI-powered semantic segmentation.

✨ Features

High-Precision Alignment: Uses the ElevenLabs Forced Alignment API for accurate word-level timing
AI Semantic Segmentation: Leverages Google Gemini for intelligent subtitle breaking
Bilingual Support: Automatically generates bilingual subtitles (original + translation)
Multi-Language: Supports 99+ languages including Chinese, English, Japanese, Korean, etc.
Smart Formatting: Removes punctuation and optimizes line breaks for readability
Flexible Output: Configurable character limits and segmentation strategies
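
To illustrate the Smart Formatting item, punctuation removal could look roughly like the sketch below. The function name strip_punctuation is hypothetical and is not part of the package; the package's actual rules may differ (for example, this sketch also drops apostrophes).

```python
import unicodedata


def strip_punctuation(text: str) -> str:
    """Drop every character whose Unicode category is punctuation (P*).

    Hypothetical helper for illustration only.
    """
    kept = (ch for ch in text if not unicodedata.category(ch).startswith("P"))
    # Collapse whitespace runs left behind by removed punctuation
    return " ".join("".join(kept).split())
```

Because it relies on Unicode categories rather than an ASCII list, the same function handles full-width CJK punctuation as well as Latin punctuation.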

🚀 Quick Start

Prerequisites

Python 3.7+
ElevenLabs API key (see Getting API Keys below)
Google Gemini API key (see Getting API Keys below)

Installation

Option 1: Install from PyPI (Recommended)

pip install elevenlabs-srt-generator

Option 2: Install from Source

git clone https://github.com/preangelleo/script-force-alignment.git
cd script-force-alignment
pip install -r requirements.txt

Set up environment variables:

cp .env.example .env
# Edit .env and add your API keys

Run setup validation:

python setup.py

📖 Usage

Command Line Interface

After installing from PyPI, you can use the CLI directly:

# Basic usage
elevenlabs-srt audio.mp3 "Your transcript text" -o output.srt

# With options
elevenlabs-srt audio.mp3 transcript.txt \
  --output subtitles.srt \
  --max-chars 30 \
  --language chinese \
  --no-semantic  # Disable AI segmentation
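
The two examples above pass the transcript either as literal text or as a file name. One way a script could accept both forms is sketched here. This is an assumption about the behaviour, not the CLI's actual code, and resolve_transcript is a hypothetical name.

```python
from pathlib import Path


def resolve_transcript(arg: str) -> str:
    """Return file contents if arg names an existing file, else arg itself.

    Hypothetical helper; the real CLI may resolve its argument differently.
    """
    try:
        is_file = Path(arg).is_file()
    except (OSError, ValueError):
        # Long or unusual literal text may not be a valid path at all
        is_file = False
    if is_file:
        return Path(arg).read_text(encoding="utf-8").strip()
    return arg.strip()
```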

Python API

from main import elevenlabs_force_alignment_to_srt

# Generate subtitles
success, result = elevenlabs_force_alignment_to_srt(
    audio_file="path/to/audio.mp3",
    input_text="Your transcript text here",
    output_filepath="output/subtitles.srt",
    max_chars_per_line=20,
    language='chinese',
    use_semantic_segmentation=True,  # Enable AI segmentation
    model='gemini-2.0-flash'  # Optional: specify Gemini model
)

if success:
    print(f"Subtitles saved to: {result}")
else:
    # On failure, result holds the error message
    print(f"Subtitle generation failed: {result}")

Using the Example Script

Edit example_usage.py with your parameters:

# Configuration
AUDIO_FILE_PATH = "./samples/your_audio.mp3"
TEXT_CONTENT = "Your transcript here..."
OUTPUT_FILE_PATH = "./output/subtitles.srt"
LANGUAGE = 'chinese'
MAX_CHARS_PER_LINE = 20
USE_SEMANTIC_SEGMENTATION = True

Then run:

python example_usage.py

Running Tests

The test script allows you to compare semantic vs simple segmentation:

python test.py

🔧 API Configuration

Required Environment Variables

Create a .env file with:

ELEVENLABS_API_KEY=your_elevenlabs_api_key_here
GEMINI_API_KEY=your_gemini_api_key_here
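
Before calling the API, it can help to confirm that both variables are visible to the process. The sketch below is a minimal check that assumes the .env values have already been loaded into the environment; check_api_keys is a hypothetical helper, not a function provided by the package.

```python
import os

REQUIRED_KEYS = ("ELEVENLABS_API_KEY", "GEMINI_API_KEY")


def check_api_keys(env=None):
    """Return a list of human-readable problems with the required keys.

    Hypothetical helper; pass a dict to check something other than os.environ.
    """
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED_KEYS:
        value = env.get(name)
        if not value:
            problems.append(f"{name} is missing or empty")
        elif value != value.strip():
            problems.append(f"{name} has leading or trailing whitespace")
    return problems
```

An empty list means both keys are present and free of stray whitespace; it does not prove the keys are valid.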

Getting API Keys

ElevenLabs API Key:
- Sign up at ElevenLabs
- Go to your profile settings
- Copy your API key
- Important: Enable the Forced Alignment feature in your API settings (it's disabled by default)

Google Gemini API Key:
- Visit Google AI Studio
- Create a new API key
- Enable the Gemini API

📝 API Reference

Main Function

elevenlabs_force_alignment_to_srt(
    audio_file: str,           # Path to audio file
    input_text: str,           # Transcript text
    output_filepath: str,      # Output SRT path
    api_key: str = None,       # Optional API key override
    max_chars_per_line: int = 20,  # Max characters per line
    language: str = 'chinese',     # Language name, e.g. 'chinese' or 'english'
    use_semantic_segmentation: bool = True,  # Enable AI segmentation
    model: str = None          # Gemini model (default: gemini-2.0-flash)
) -> Tuple[bool, str]

Parameters

audio_file: Path to audio file (MP3, WAV, M4A, OGG, FLAC, etc.)
input_text: Exact transcript of the audio content
output_filepath: Where to save the SRT file
api_key: Optional ElevenLabs API key (overrides .env)
max_chars_per_line: Maximum characters per subtitle line
language: Language of the content (e.g., 'chinese', 'english')
use_semantic_segmentation: Enable AI-powered semantic breaking
model: Gemini model to use (default: 'gemini-2.0-flash'). Options:
- 'gemini-2.0-flash': Fast and efficient (default)
- 'gemini-2.0-flash-exp': Experimental features
- 'gemini-1.5-pro': Higher quality output
- 'gemini-2.0-flash-thinking': Complex reasoning

Returns

Tuple[bool, str]: (Success status, Output path or error message)
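
Since the function reports failure through its return value rather than an exception, callers need to check the tuple. The sketch below processes several files and collects the error messages. run_batch is a hypothetical helper; the align argument is expected to be elevenlabs_force_alignment_to_srt or any callable with the same keyword arguments and return shape.

```python
from pathlib import Path
from typing import Callable, Dict, Tuple

AlignFn = Callable[..., Tuple[bool, str]]


def run_batch(jobs: Dict[str, str], align: AlignFn, out_dir: str = "output") -> Dict[str, str]:
    """Align each audio path -> transcript job; return {audio: error} for failures.

    Hypothetical helper built around the documented (success, result) return.
    """
    failures = {}
    for audio, transcript in jobs.items():
        # Name each SRT after its audio file, e.g. talk.mp3 -> output/talk.srt
        target = Path(out_dir) / (Path(audio).stem + ".srt")
        success, result = align(
            audio_file=audio,
            input_text=transcript,
            output_filepath=str(target),
        )
        if not success:
            failures[audio] = result  # on failure, result is the error message
    return failures
```

Passing the aligner in as a parameter keeps the helper easy to test with a stand-in function that makes no network calls.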

🎯 Features Comparison

Feature	Semantic Segmentation	Simple Segmentation
Natural breaks	✅ Yes	❌ No
Bilingual support	✅ Yes	❌ No
AI-powered	✅ Yes	❌ No
Processing time	~3-5s	~1-2s
Quality	High	Basic
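
For intuition about the simple strategy, a greedy character-limit grouping over word-level timings could look like this. It is a sketch under stated assumptions, not the package's implementation: the (text, start, end) word format and the names simple_segments and _merge are hypothetical.

```python
from typing import List, Tuple

Word = Tuple[str, float, float]  # (text, start seconds, end seconds)


def _merge(group: List[Word], joiner: str) -> Word:
    """Join a group of words into one segment spanning first start to last end."""
    return (joiner.join(w[0] for w in group), group[0][1], group[-1][2])


def simple_segments(words: List[Word], max_chars: int = 20, joiner: str = " ") -> List[Word]:
    """Greedily pack words into segments of at most max_chars characters.

    A single word longer than max_chars still gets its own segment.
    """
    segments: List[Word] = []
    current: List[Word] = []
    for word in words:
        candidate = joiner.join(w[0] for w in current + [word])
        if current and len(candidate) > max_chars:
            # Adding this word would overflow the line, so close the segment
            segments.append(_merge(current, joiner))
            current = []
        current.append(word)
    if current:
        segments.append(_merge(current, joiner))
    return segments
```

For languages written without spaces, such as Chinese, pass joiner="". A purely length-based split like this can break in the middle of a phrase, which is the gap semantic segmentation is meant to close.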

🌍 Supported Languages

The tool supports 99+ languages including:

Chinese (Simplified & Traditional)
English
Japanese
Korean
Spanish
French
German
Russian
Arabic
Hindi
And many more...

📊 Output Format

The tool generates standard SRT format:

1
00:00:00,123 --> 00:00:02,456
这是第一行字幕
This is the first subtitle

2
00:00:02,456 --> 00:00:05,789
这是第二行字幕
This is the second subtitle
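
Each SRT entry is an index line, a timing line in HH:MM:SS,mmm form, and one or more text lines, with a blank line between entries. The sketch below shows how such an entry can be built from timings in seconds. The function names are illustrative and are not taken from the package.

```python
from typing import List


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def srt_block(index: int, start: float, end: float, lines: List[str]) -> str:
    """Build one SRT entry; join entries with a blank line to form a file."""
    timing = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
    return f"{index}\n{timing}\n" + "\n".join(lines) + "\n"
```

Note the comma before the milliseconds; SRT uses a comma there, not a period.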

🔍 Troubleshooting

Common Issues

API Key Errors:
- Ensure your API keys are valid
- Check that .env file is in the correct location
- Verify keys don't have extra spaces

Audio File Issues:
- Maximum file size: 1GB
- Supported formats: MP3, WAV, M4A, OGG, FLAC, AAC, OPUS, MP4
- Ensure file path is correct

Text Alignment Issues:
- Text must match audio content exactly
- Remove extra spaces or formatting
- Check language setting matches audio
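
The audio file checks above can be run locally before uploading. The sketch below is a hypothetical pre-flight helper, not part of the package. It assumes 1GB means 1024**3 bytes, which the source does not specify, and it judges the format by file extension only.

```python
import os

MAX_BYTES = 1024 ** 3  # assumption: "1GB" interpreted as 2**30 bytes
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".ogg", ".flac", ".aac", ".opus", ".mp4"}


def check_audio_file(path: str) -> list:
    """Return a list of problems with the audio file; empty means it looks fine."""
    if not os.path.isfile(path):
        return [f"file not found: {path}"]
    problems = []
    ext = os.path.splitext(path)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if os.path.getsize(path) > MAX_BYTES:
        problems.append("file is larger than 1GB")
    return problems
```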

Debug Mode

Enable detailed logging by setting environment variable:

export DEBUG=true
python example_usage.py
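
How the package reads the DEBUG variable is not documented here. As an illustration only, a script of your own could map the same variable to Python's logging level as follows; configure_logging is a hypothetical name.

```python
import logging
import os


def configure_logging(env=None) -> int:
    """Set the root log level from a DEBUG variable and return the level chosen.

    Hypothetical helper; the accepted truthy spellings are an assumption.
    """
    env = os.environ if env is None else env
    debug = env.get("DEBUG", "").strip().lower() in {"1", "true", "yes"}
    level = logging.DEBUG if debug else logging.INFO
    logging.basicConfig(level=level, format="%(levelname)s %(message)s")
    return level
```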

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Fork the repository
Create your feature branch (git checkout -b feature/AmazingFeature)
Commit your changes (git commit -m 'Add some AmazingFeature')
Push to the branch (git push origin feature/AmazingFeature)
Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

ElevenLabs for the Forced Alignment API
Google Gemini for AI semantic analysis
Community contributors

📧 Support

For issues, questions, or suggestions:

Open an issue on GitHub
Contact: your-email@example.com

Made with ❤️ for the subtitle generation community

Release history

1.2.2 (Aug 23, 2025)
1.2.1 (Aug 15, 2025)
1.2.0 (Aug 13, 2025)
1.1.0 (Aug 13, 2025)
1.0.2 (Aug 13, 2025, this version)
1.0.1 (Aug 13, 2025)
1.0.0 (Aug 13, 2025)
