Japanese ebook audio subtitle aligner - Create synchronized subtitles from Japanese audiobooks and EPUB files
Jebasa (Japanese ebook audio subtitle aligner)
Jebasa is a Python package that creates synchronized subtitles from Japanese audiobooks and EPUB files using forced alignment. It handles Japanese-specific challenges like furigana annotations and morphological analysis to produce high-quality subtitle files.
Features
- 🎵 Audio Processing: Convert and prepare audio files for alignment
- 📖 Text Extraction: Process EPUB and text files with furigana support
- 🗣️ Dictionary Creation: Generate custom pronunciation dictionaries
- ⚖️ Forced Alignment: Use Montreal Forced Aligner for precise timing
- 📝 Subtitle Generation: Create properly timed SRT files
- 🔄 Complete Pipeline: Run all stages automatically or individually
- 🇯🇵 Japanese Optimized: Handles furigana, tokenization, and text normalization
Installation
Prerequisites
- Python 3.8 or higher
- FFmpeg (for audio processing)
- Montreal Forced Aligner (for alignment)
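Before installing, it can help to confirm that the prerequisites above are actually reachable. The following is a minimal sketch using only the Python standard library; it is not part of Jebasa, and the helper names `missing_tools` and `python_is_supported` are hypothetical. It assumes the executables are named `ffmpeg` and `mfa`, as in the install commands in this document.

```python
import shutil
import sys


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


def python_is_supported(minimum=(3, 8)):
    """Compare the running interpreter with the minimum version listed above."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    absent = missing_tools(["ffmpeg", "mfa"])
    print("Python OK:", python_is_supported())
    print("Missing tools:", ", ".join(absent) if absent else "none")
```

If `mfa` is reported missing after installation, the environment it was installed into is probably not the active one.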
Install Jebasa
pip install jebasa
Install System Dependencies
Ubuntu/Debian
sudo apt update
sudo apt install ffmpeg
pip install montreal-forced-aligner
mfa model download acoustic japanese_mfa
macOS
brew install ffmpeg
pip install montreal-forced-aligner
mfa model download acoustic japanese_mfa
Windows
choco install ffmpeg
pip install montreal-forced-aligner
mfa model download acoustic japanese_mfa
Quick Start
Basic Usage
# Run complete pipeline
jebasa run --input-dir ./my_book --output-dir ./output
# Or run individual stages
jebasa prepare-audio --input-dir ./my_book/audio --output-dir ./processed
jebasa prepare-text --input-dir ./my_book/text --output-dir ./processed
jebasa create-dictionary --input-dir ./processed --output-dir ./dictionaries
jebasa align --corpus-dir ./processed --dictionary ./dictionaries/custom.dict --output-dir ./aligned
jebasa generate-subtitles --alignment-dir ./aligned --text-dir ./processed --output-dir ./subtitles
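The individual stages above can also be driven from a script, which makes it easier to stop at the first failing stage. This is a sketch only: it reuses exactly the commands and flags listed above, while the helper names `build_stage_commands` and `run_stages` are hypothetical and not part of Jebasa.

```python
import shlex
import subprocess


def build_stage_commands(book_dir="./my_book"):
    """Return the five stage commands from Basic Usage, in order, as argv lists."""
    return [
        ["jebasa", "prepare-audio", "--input-dir", f"{book_dir}/audio",
         "--output-dir", "./processed"],
        ["jebasa", "prepare-text", "--input-dir", f"{book_dir}/text",
         "--output-dir", "./processed"],
        ["jebasa", "create-dictionary", "--input-dir", "./processed",
         "--output-dir", "./dictionaries"],
        ["jebasa", "align", "--corpus-dir", "./processed",
         "--dictionary", "./dictionaries/custom.dict", "--output-dir", "./aligned"],
        ["jebasa", "generate-subtitles", "--alignment-dir", "./aligned",
         "--text-dir", "./processed", "--output-dir", "./subtitles"],
    ]


def run_stages(commands, dry_run=True):
    """Print each command; also execute it unless dry_run is set."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            # check=True raises CalledProcessError, so later stages are skipped.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_stages(build_stage_commands(), dry_run=True)
```

With `dry_run=True` the script only prints the commands, which is a cheap way to review paths before a long alignment run.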
Python API
from jebasa import JebasaPipeline
from jebasa.config import JebasaConfig
# Create configuration
config = JebasaConfig()
config.paths.input_dir = "./my_book"
config.paths.output_dir = "./output"
# Run pipeline
pipeline = JebasaPipeline(config)
results = pipeline.run_all()
print(f"Generated {len(results)} subtitle files")
Input Requirements
Audio Files
- Formats: MP3, M4A, WAV, FLAC, AAC
- Converted to 16 kHz mono WAV for alignment
- Quality: Clear speech with minimal background noise
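To make the conversion target concrete, here is a sketch of an FFmpeg invocation that produces 16 kHz mono 16-bit PCM WAV. It is an illustration of the target format, not necessarily the command Jebasa runs internally; the function name `build_ffmpeg_command` is hypothetical. The flags `-ar`, `-ac`, and `-acodec` are standard FFmpeg options.

```python
from pathlib import Path


def build_ffmpeg_command(source, output_dir, sample_rate=16000, channels=1):
    """Build an ffmpeg argv list that converts `source` to PCM WAV."""
    source = Path(source)
    target = Path(output_dir) / (source.stem + ".wav")
    return [
        "ffmpeg",
        "-y",                      # overwrite the target if it exists
        "-i", str(source),         # input file (MP3, M4A, FLAC, ...)
        "-ar", str(sample_rate),   # output sample rate in Hz
        "-ac", str(channels),      # number of output channels (1 = mono)
        "-acodec", "pcm_s16le",    # 16-bit little-endian PCM
        str(target),
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`; building it as a list avoids shell quoting problems with file names that contain spaces or Japanese characters.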
Text Files
- Formats: EPUB, XHTML, HTML, TXT
- Japanese text with optional furigana (ruby) annotations
- Should correspond to audio content
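Furigana in EPUB and HTML sources is normally encoded with `<ruby>`, `<rt>`, and `<rp>` elements. The sketch below shows one way to separate base text from readings using only the standard library's `html.parser`; it is an assumption-laden illustration, not Jebasa's extraction code (the Acknowledgments list BeautifulSoup for parsing), and the names `RubyExtractor` and `extract_ruby` are hypothetical.

```python
from html.parser import HTMLParser


class RubyExtractor(HTMLParser):
    """Collect base text and (base, reading) pairs from ruby markup."""

    def __init__(self):
        super().__init__()
        self.text_parts = []   # running text with readings dropped
        self.pairs = []        # (base, reading) tuples, one per <ruby>
        self._in_ruby = False
        self._in_rt = False
        self._in_rp = False
        self._base = []
        self._reading = []

    def handle_starttag(self, tag, attrs):
        if tag == "ruby":
            self._in_ruby = True
            self._base, self._reading = [], []
        elif tag == "rt":
            self._in_rt = True
        elif tag == "rp":
            self._in_rp = True

    def handle_endtag(self, tag):
        if tag == "rt":
            self._in_rt = False
        elif tag == "rp":
            self._in_rp = False
        elif tag == "ruby":
            self._in_ruby = False
            self.pairs.append(("".join(self._base), "".join(self._reading)))

    def handle_data(self, data):
        if self._in_rp:
            return                      # fallback parentheses, ignore
        if self._in_rt:
            self._reading.append(data)  # furigana reading
            return
        if self._in_ruby:
            self._base.append(data)     # annotated base text
        self.text_parts.append(data)


def extract_ruby(markup):
    """Return (plain_text, [(base, reading), ...]) for an HTML fragment."""
    parser = RubyExtractor()
    parser.feed(markup)
    parser.close()
    return "".join(parser.text_parts), parser.pairs
```

For example, `extract_ruby("<ruby>漢字<rt>かんじ</rt></ruby>を読む")` yields the plain text `漢字を読む` together with the pair `("漢字", "かんじ")`, which is the kind of information a pronunciation dictionary can draw on.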
Configuration
Jebasa can be configured via command-line options, configuration files, or environment variables.
Configuration File
Create a jebasa.yaml file:
audio:
  sample_rate: 16000
  channels: 1
  format: wav
text:
  tokenizer: mecab
  normalize_text: true
  extract_furigana: true
mfa:
  acoustic_model: japanese_mfa
  beam: 100
  retry_beam: 400
  num_jobs: 4
subtitles:
  max_line_length: 42
  max_lines: 2
  min_duration: 1.0
  max_duration: 7.0
paths:
  input_dir: ./input
  output_dir: ./output
  temp_dir: ./temp
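The `min_duration` and `max_duration` settings in the subtitles section bound how long each cue stays on screen. The sketch below shows how such limits could be applied when writing SRT cues, whose timestamps use the `HH:MM:SS,mmm` form. How Jebasa itself enforces the limits is not documented here, so adjusting the end time is an assumption, and the function names are hypothetical.

```python
def format_srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def clamp_cue(start, end, min_duration=1.0, max_duration=7.0):
    """Move the end time so the cue duration stays within the limits."""
    duration = min(max(end - start, min_duration), max_duration)
    return start, start + duration


def format_cue(index, start, end, text):
    """Render one SRT cue: index, time range, then the subtitle text."""
    start, end = clamp_cue(start, end)
    return (
        f"{index}\n"
        f"{format_srt_timestamp(start)} --> {format_srt_timestamp(end)}\n"
        f"{text}\n"
    )
```

A real implementation would also need to keep an extended cue from overlapping the next one; this sketch leaves that out for brevity.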
Command Line Options
jebasa run --help
jebasa prepare-audio --help
jebasa prepare-text --help
jebasa create-dictionary --help
jebasa align --help
jebasa generate-subtitles --help
Examples
Example 1: Basic Audiobook Processing
# Organize your files
mkdir -p my_book/{audio,text}
cp audiobook.mp3 my_book/audio/
cp book.epub my_book/text/
# Run complete pipeline
jebasa run --input-dir ./my_book --output-dir ./output
# Find your subtitles
ls output/srt/
Example 2: Custom Quality Settings
# Higher-quality alignment using wider search beams
jebasa align \
  --corpus-dir ./processed \
  --dictionary ./dictionaries/custom.dict \
  --output-dir ./aligned \
  --beam 200 \
  --retry-beam 600 \
  --num-jobs 8
Example 3: Processing with Configuration File
# Create configuration file
cat > jebasa.yaml << EOF
audio:
  sample_rate: 22050
  ffmpeg_options:
    acodec: pcm_s16le
text:
  min_chapter_length: 500
mfa:
  num_jobs: 8
  beam: 150
EOF
# Use configuration file
jebasa run --config jebasa.yaml --input-dir ./my_book
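The configuration file in Example 3 sets only a few keys, which suggests that unspecified values fall back to defaults. A recursive merge is one common way to implement that behaviour. The sketch below is an assumption about the general technique, not Jebasa's actual loader; `merge_config` is a hypothetical helper, and the `DEFAULTS` values are copied from the sample jebasa.yaml in the Configuration section rather than from the package source.

```python
import copy

# Illustrative defaults, taken from the sample jebasa.yaml shown earlier.
DEFAULTS = {
    "audio": {"sample_rate": 16000, "channels": 1, "format": "wav"},
    "mfa": {"acoustic_model": "japanese_mfa", "beam": 100,
            "retry_beam": 400, "num_jobs": 4},
}


def merge_config(defaults, overrides):
    """Return a new dict: `defaults` with `overrides` applied recursively."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are sections: merge key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalars and new sections replace whatever was there.
            merged[key] = copy.deepcopy(value)
    return merged
```

Applied to the Example 3 file, such a merge would keep `channels: 1` and `retry_beam: 400` while replacing `sample_rate`, `num_jobs`, and `beam`.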
Advanced Usage
Custom Audio Processing
from jebasa.audio import AudioProcessor
from jebasa.config import AudioConfig
config = AudioConfig(
    sample_rate=22050,
    channels=1,
    format="wav",
    ffmpeg_options={"acodec": "pcm_s16le"},
)
processor = AudioProcessor(config)
processed_files = processor.process_audio_files(
    input_dir="./audio",
    output_dir="./processed",
)
Custom Text Processing
from jebasa.text import TextProcessor
from jebasa.config import TextConfig
config = TextConfig(
    tokenizer="mecab",
    normalize_text=True,
    extract_furigana=True,
)
processor = TextProcessor(config)
processed_files = processor.process_text_files(
    input_dir="./text",
    output_dir="./processed",
)
Pipeline Stages
from jebasa.pipeline import JebasaPipeline
from jebasa.config import JebasaConfig
config = JebasaConfig()
pipeline = JebasaPipeline(config)
# Run individual stages
audio_files = pipeline.prepare_audio()
text_files = pipeline.prepare_text()
dictionary = pipeline.create_dictionary()
alignments = pipeline.run_alignment()
subtitles = pipeline.generate_subtitles()
Troubleshooting
Common Issues
- FFmpeg not found
  Error: FFmpeg not found. Please install FFmpeg.
  Solution: Install FFmpeg using your system's package manager.
- MFA model not found
  Error: Acoustic model 'japanese_mfa' not found
  Solution: Download the model with
  mfa model download acoustic japanese_mfa
- Poor alignment quality
  - Check audio quality (clear speech, minimal noise)
  - Verify text matches audio content
  - Try adjusting beam search parameters
  - Check pronunciation dictionary coverage
- Memory issues during alignment
  - Reduce the --num-jobs parameter
  - Process files in smaller batches
  - Ensure sufficient RAM (8GB+ recommended)
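One of the checks listed for poor alignment quality is pronunciation dictionary coverage. A rough way to measure it is to count how many tokens of the prepared text appear as headwords in the dictionary. The sketch below assumes the usual Montreal Forced Aligner layout of one entry per line, with the word first and its phones after it, separated by whitespace. The function names are hypothetical and the phone strings in the usage note are placeholders, not the real Japanese phone set.

```python
def load_dictionary_words(lines):
    """Collect the headword (first field) of each non-empty dictionary line."""
    words = set()
    for line in lines:
        fields = line.split()
        if fields:
            words.add(fields[0])
    return words


def coverage(tokens, dictionary_words):
    """Return (fraction of tokens covered, sorted out-of-vocabulary tokens)."""
    tokens = list(tokens)
    if not tokens:
        return 1.0, []
    missing = sorted({t for t in tokens if t not in dictionary_words})
    covered = sum(1 for t in tokens if t in dictionary_words)
    return covered / len(tokens), missing
```

For instance, with dictionary lines `["猫 n e k o", "犬 i n u"]` and tokens `["猫", "犬", "鳥", "猫"]`, three of the four tokens are covered and `鳥` is reported as missing. A long list of missing tokens is a hint that the dictionary stage should be rerun or its input checked.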
Getting Help
- Check the documentation
- Report issues on GitHub
- Join our discussion forum
Contributing
We welcome contributions! Please see our Contributing Guide for details.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Montreal Forced Aligner for alignment
- fugashi for Japanese tokenization
- BeautifulSoup for HTML/XML parsing
Citation
If you use Jebasa in your research, please cite:
@software{jebasa,
title={Jebasa: Japanese ebook audio subtitle aligner},
author={Your Name},
year={2024},
url={https://github.com/OCboy5/jebasa}
}
Download files
File details
Details for the file jebasa-0.1.2.tar.gz.
File metadata
- Download URL: jebasa-0.1.2.tar.gz
- Upload date:
- Size: 57.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4e09eff1506207ad174351419ae833236b9b65e265e1de7f04095395b46e6195 |
| MD5 | 4fe11c1031737239803ae6fad3ce91ef |
| BLAKE2b-256 | c526c8611c9a8de6f681e666d5ab0748b4dc7de47c3a982cb085ba181203d71c |
File details
Details for the file jebasa-0.1.2-py3-none-any.whl.
File metadata
- Download URL: jebasa-0.1.2-py3-none-any.whl
- Upload date:
- Size: 39.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 73621bf5e5237a63297accba352ddeeb338e07a15890f0bed719a74ae3432055 |
| MD5 | 5bd15b3fe7ff726fee7b76f264b2ab9c |
| BLAKE2b-256 | 2a4450b8ad730aaf4c85e6a5c9747cb99742c596cec1b182629842c3c877066b |