Hardware-aware, concurrent pipeline for subtitle generation.
🎬 Ultimate SRT Generator
Hardware-aware, zero-disk, highly concurrent AI pipeline for mass-generating broadcast-quality subtitles.
🌟 Overview
The Ultimate SRT Generator is a production-grade daemon built for power users, NAS hoarders, and sysadmins. It autonomously crawls massive network-attached media libraries, detects videos missing English subtitles, performs zero-disk audio extraction directly into RAM, and infers broadcast-quality .srt files using state-of-the-art faster-whisper AI models.
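As a rough illustration of the zero-disk extraction step, the sketch below asks FFmpeg to write raw 16 kHz mono PCM to stdout and collects it in memory, so nothing is written to a temporary file. The function names (`ffmpeg_pcm_command`, `read_stdout`, `pcm16_to_float`) are hypothetical and not taken from the project. The standard-library `array` module stands in for NumPy to keep the example dependency-free.

```python
import array
import asyncio
import sys


def ffmpeg_pcm_command(video_path: str, sample_rate: int = 16000) -> list[str]:
    """Build an FFmpeg command that writes mono 16-bit PCM to stdout."""
    return [
        "ffmpeg", "-nostdin", "-loglevel", "error",
        "-i", video_path,
        "-vn",                    # ignore the video stream
        "-ac", "1",               # downmix to mono
        "-ar", str(sample_rate),  # resample (16 kHz is what Whisper models expect)
        "-f", "s16le",            # raw signed 16-bit little-endian samples
        "pipe:1",                 # write to stdout instead of a file
    ]


async def read_stdout(cmd: list[str]) -> bytes:
    """Run a command without blocking the event loop; return its stdout bytes."""
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.DEVNULL,
    )
    data, _ = await proc.communicate()
    if proc.returncode != 0:
        raise RuntimeError(f"command failed with exit code {proc.returncode}")
    return data


def pcm16_to_float(pcm: bytes) -> list[float]:
    """Convert s16le bytes to floats in [-1.0, 1.0).

    With NumPy the equivalent would be:
    np.frombuffer(pcm, np.int16).astype(np.float32) / 32768.0
    """
    samples = array.array("h")
    samples.frombytes(pcm[: len(pcm) - len(pcm) % 2])  # drop a trailing odd byte
    if sys.byteorder == "big":
        samples.byteswap()  # the stream is little-endian
    return [s / 32768.0 for s in samples]
```

A caller would combine them as `pcm16_to_float(await read_stdout(ffmpeg_pcm_command(path)))` and pass the resulting samples to the transcription model.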
🛠️ Key Architectural Features
- Zero-Disk Audio Extraction: Reduces SSD wear by bypassing /tmp files completely. Audio streams are asynchronously ripped via FFmpeg and piped directly into RAM (NumPy arrays) for AI ingestion.
- Bounded Asynchronous Concurrency: Avoids Python GIL starvation via an asyncio.TaskGroup producer-consumer pipeline. It extracts audio streams only as fast as the GPU can transcribe them, strictly capping memory usage.
- Intelligent Hardware Routing Matrix: Auto-detects NVIDIA GPUs (VRAM), Apple Silicon, or pure CPU environments and routes to the most suitable large-v3-turbo, small, or quantized int8 model.
- NAS-Safe & Deduplicating: Backed by an asynchronous local SQLite database (WAL mode) tracking inode and size. It avoids parsing active downloads, skips duplicate hardlinks, and performs POSIX-atomic os.replace operations with original MKV metadata inheritance.
- Broadcast Formatting: Implements a strict chunking algorithm on top of Whisper's word-level timestamps. No more "walls of text": subtitles are limited to about 42 characters per line and 2 lines, breaking naturally on terminal punctuation.
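The broadcast formatting rule (about 42 characters per line, at most 2 lines, breaking on terminal punctuation) could be sketched roughly as follows. This is not the project's algorithm: the `Word` and `Cue` types and the greedy packing strategy are assumptions made for illustration, and each word is assumed to arrive already stripped of surrounding whitespace.

```python
import textwrap
from dataclasses import dataclass

MAX_CHARS = 42               # characters per subtitle line
MAX_LINES = 2                # lines per subtitle cue
TERMINAL = (".", "!", "?")   # punctuation that closes a cue


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


@dataclass
class Cue:
    start: float
    end: float
    lines: list[str]


def chunk_words(words: list[Word]) -> list[Cue]:
    """Greedily pack timestamped words into cues of at most two short lines."""
    cues: list[Cue] = []
    current: list[Word] = []

    def flush() -> None:
        if current:
            text = " ".join(w.text for w in current)
            lines = textwrap.wrap(text, MAX_CHARS)
            cues.append(Cue(current[0].start, current[-1].end, lines))
            current.clear()

    for word in words:
        candidate = " ".join(w.text for w in [*current, word])
        # Start a new cue if this word would spill onto a third line.
        if len(textwrap.wrap(candidate, MAX_CHARS)) > MAX_LINES:
            flush()
        current.append(word)
        # Sentence-ending punctuation closes the cue early.
        if word.text.endswith(TERMINAL):
            flush()
    flush()
    return cues


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm form used in .srt files."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"
```

Each cue's `start` and `end` come from its first and last word, so cue timing follows the word-level timestamps directly.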
🚀 Installation & Deployment
🐳 Docker (Recommended for TrueNAS / Unraid)
For maximum stability and ease-of-use with NVIDIA hardware, use the provided Docker stack.
- Clone the repository:
    git clone https://github.com/arvarik/srt-generator.git
    cd srt-generator
- Review and modify the docker-compose.yml to point to your media directory:
    volumes:
      - /mnt/user/media/movies:/media:rw
      - ./aisrt_data:/root/.config/aisrt:rw
- Deploy:
    docker compose up --build -d
💻 Native Python (Ubuntu Desktop / Server / macOS)
Prerequisites: Python 3.11+ and ffmpeg must be installed on your system.
# 1. Clone repository
git clone https://github.com/arvarik/srt-generator.git
cd srt-generator
# 2. Create virtual environment
python3 -m venv .venv
source .venv/bin/activate
# 3. Install the application
pip install -e .
# 4. (Optional) Install development dependencies
pip install -e ".[dev]"
🎮 Usage
The application features a beautifully formatted CLI built on Typer and Rich.
Dry-Run (Scan)
Safely scan a directory to see exactly what hardware will be loaded and what files will be processed, without actually running the AI model.
aisrt scan /path/to/movies --min-age-mins 60 --verbose
Live Run
Execute the extraction and inference pipeline.
aisrt run /path/to/movies
CLI Overrides & Environment Variables
You can manually override the hardware auto-detector and execution options.
Via CLI:
aisrt run /path/to/movies --force-device cuda --force-model large-v3-turbo --translate --watch --watch-interval 60
Via Docker Environment Variables:
Since AppConfig utilizes pydantic-settings, you can configure the daemon entirely through your docker-compose.yml:
- AISRT_TRANSLATE=True (Translate foreign-language audio into English subtitles)
- AISRT_WATCH=True (Run 24/7 as a daemon)
- AISRT_WATCH_INTERVAL_MINS=60 (Time between library scans)
- AISRT_FILTERS__MIN_AGE_MINS=30 (Skip active torrent/usenet downloads)
- AISRT_HARDWARE__FORCE_MODEL=large-v3-turbo
🏗️ Open Source Development
We welcome contributions! The codebase is strictly typed (checked with mypy) and linted and formatted with ruff.
Development Setup:
poetry install # Or pip install -e ".[dev]"
Running Tests & Linters:
ruff check . # Linter
ruff format . # Formatter
mypy src/aisrt tests # Strict Type Checking
pytest tests # Asynchronous Unit Tests
📜 License
Distributed under the MIT License. See LICENSE for more information.
File details
Details for the file aisrt-0.1.0.tar.gz.
File metadata
- Download URL: aisrt-0.1.0.tar.gz
- Upload date:
- Size: 20.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 73cbdd9b9a67bf47e97df350e630049b4f1ea7f10f1c1d74c1a7352d11b16cb8 |
| MD5 | 458a0edae1fb803618b7f668a9f00597 |
| BLAKE2b-256 | 109830cbbf92c36d2a72e39a96ef3f36407f31a481c4aece8428cb865bc557fb |
File details
Details for the file aisrt-0.1.0-py3-none-any.whl.
File metadata
- Download URL: aisrt-0.1.0-py3-none-any.whl
- Upload date:
- Size: 22.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | cc82a4233f991b6e2cc2766168d7ce14f351e6d859a070ffb3c7a4943eac4f64 |
| MD5 | 68246c0cdf01f18b28797c20fd05c180 |
| BLAKE2b-256 | 04a4456f12a5f0d22e3562e8691236c2e0ac6c461ff94f8251425478922ff141 |