AI Sub: AI-Powered Subtitle Generation with Translation



Project Overview

AI Sub uses AI (currently Google Gemini) to produce English and Japanese subtitles for videos, translating between the two languages as necessary. It is designed for and primarily tested on Hololive concert and cover videos, but may also work on other content.


Showcase

Here's an example of subtitles generated by AI Sub:

[Image: Video Screenshot]

For more examples, please visit the showcase directory.


Pros and cons of using Gemini as the AI model

Pros:

  • Multimodal Context: Gemini's multimodal capabilities let it analyze the video itself, including on-screen text, which gives it more context for generating accurate subtitles.
  • Cloud-Based Processing: The AI processing runs on Google Gemini's infrastructure, so you do not need a local GPU or extensive computational resources on your machine.

Cons:

  • Timestamp Precision: Subtitle timestamps may be off by a few seconds.
  • Network Usage: Uploading entire video files to Google's services will consume network bandwidth.

How AI Sub Works

  • Video Segmentation: The input video is first split into segments of 180 seconds each. This duration is configurable via the --split_seconds flag.
  • Concurrent Processing: Each video segment is then sent to the AI model (Google Gemini) for subtitle generation. You can adjust the number of concurrent processing threads using the --num_processing_threads flag to optimize performance.
  • Subtitle Compilation: All generated subtitle parts are then combined into a single, final subtitle file.
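
The three steps above can be sketched in a few lines of Python. This is only an illustration of the split, process concurrently, then combine flow, not AI Sub's actual code: the function names are hypothetical, and fake_generate_subtitles is a stand-in for the real request to Gemini.

```python
from concurrent.futures import ThreadPoolExecutor


def segment_ranges(duration_seconds: int, split_seconds: int = 180) -> list[tuple[int, int]]:
    """Return (start, end) pairs that cover the video in split_seconds-long pieces."""
    return [
        (start, min(start + split_seconds, duration_seconds))
        for start in range(0, duration_seconds, split_seconds)
    ]


def fake_generate_subtitles(segment: tuple[int, int]) -> list[str]:
    """Hypothetical stand-in for the per-segment request to the AI model."""
    start, end = segment
    return [f"subtitle for {start}-{end}s"]


def process_video(duration_seconds: int, split_seconds: int = 180, num_threads: int = 2) -> list[str]:
    """Split the video, process segments concurrently, and combine the parts."""
    segments = segment_ranges(duration_seconds, split_seconds)
    # pool.map() returns results in input order, so the parts stay in sequence
    # even when segments finish out of order.
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        parts = list(pool.map(fake_generate_subtitles, segments))
    # Flatten the per-segment parts into one combined list.
    return [line for part in parts for line in part]
```

With the default of 180 seconds, a 400-second video is covered by three segments, the last one shorter than the others.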

Getting Started: A Quick Guide

1. Obtain Your Google Gemini API Key

Follow these simple steps to acquire your API key:

  1. Sign in to Google AI Studio.
  2. Click "Create API Key."
  3. Copy and securely store your API key. Never disclose your API key publicly.

2. Set Up Your Python Environment (Python 3.10+ Required)

Prepare your Python virtual environment:

python -m venv venv
source venv/bin/activate  # On Windows, use `venv\Scripts\activate.bat`
pip install --upgrade ai-sub

3. Execute the Script

Run the application with your video file:

ai-sub --api_key=YOUR_API_KEY "path/to/your/video.mp4"

Note: Replace YOUR_API_KEY with your actual Google Gemini API key and "path/to/your/video.mp4" with the full path to your video file.
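
If you would rather launch AI Sub from a Python script, for example to keep the API key out of your shell history, a minimal sketch might look like the following. The flags are the ones named in this README; the --flag=value form for flags other than --api_key, the helper names, and the GEMINI_API_KEY environment variable are assumptions made for illustration.

```python
import os
import subprocess


def build_command(video_path, api_key, split_seconds=None,
                  num_processing_threads=None, temp_dir=None):
    """Build the ai-sub argument list; optional flags are added only when set."""
    command = ["ai-sub", f"--api_key={api_key}"]
    if split_seconds is not None:
        command.append(f"--split_seconds={split_seconds}")
    if num_processing_threads is not None:
        command.append(f"--num_processing_threads={num_processing_threads}")
    if temp_dir is not None:
        command.append(f"--temp_dir={temp_dir}")
    command.append(video_path)
    return command


def run_ai_sub(video_path, **options):
    """Run ai-sub with the key taken from a (hypothetical) environment variable."""
    api_key = os.environ["GEMINI_API_KEY"]  # assumed name, not defined by AI Sub
    # Passing a list avoids shell quoting problems with paths containing spaces.
    subprocess.run(build_command(video_path, api_key, **options), check=True)
```

Calling run_ai_sub("path/to/your/video.mp4", split_seconds=120) would then run the tool with shorter segments.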


Known Limitations

  1. Timestamp Accuracy: Subtitle timestamps may exhibit inaccuracies. This is an inherent characteristic of the Gemini AI model.

    • Observations indicate that shorter video segments generally lead to improved timestamp accuracy.
    • Requesting timestamps at second-level precision generally yields more accurate results than requesting millisecond-level precision, so the current implementation requests second-level timestamps.
  2. AI Hallucinations: Like all AI models, Gemini may occasionally produce "hallucinations" or inaccurate information. This is a known characteristic of current AI technology.

If you encounter issues related to these limitations, consider re-processing specific video segments as detailed in the "Re-processing Specific Video Segments" section below.
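
To make the second-level timestamps concrete, the sketch below shows how a whole-second timestamp inside one segment could be mapped onto the full video and rendered as text. It assumes that timestamps are relative to the start of each segment and that the output uses an SRT-style HH:MM:SS,mmm format; neither detail is confirmed by this README, and both function names are hypothetical.

```python
def to_global_seconds(segment_index: int, local_seconds: int, split_seconds: int = 180) -> int:
    """Shift a segment-relative timestamp onto the full video's timeline."""
    return segment_index * split_seconds + local_seconds


def format_timestamp(total_seconds: int) -> str:
    """Render whole seconds as HH:MM:SS,000 (milliseconds are always zero here)."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},000"
```

For example, a subtitle 15 seconds into the third segment (index 2) lands at 375 seconds, or 00:06:15,000, in the full video.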


Re-processing Specific Video Segments

Intermediate files generated during processing are stored in a temporary directory, which defaults to tmp_<input_file_name> and can be changed with the --temp_dir CLI flag. The part_XXX.json files in this directory hold the AI's results for individual segments, so you can open them to review each segment. To re-process a specific video segment, delete its corresponding part_XXX.json file. On the next run, the script re-processes only the segments whose part_XXX.json file is missing.
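
The same clean-up can be scripted. The helpers below are a small sketch with hypothetical names: one lists the part files in a temporary directory, the other deletes a single part file so that the next run regenerates it. The only assumption about naming is the part_ prefix and .json suffix described above; pass the exact file name you see in the directory.

```python
from pathlib import Path


def list_part_files(temp_dir):
    """Return the part_*.json files in temp_dir, sorted by name."""
    return sorted(Path(temp_dir).glob("part_*.json"))


def delete_part_file(temp_dir, file_name):
    """Delete one part file; return True if it existed, False otherwise."""
    target = Path(temp_dir) / file_name
    # Refuse anything that does not look like a part file, to avoid
    # deleting unrelated files by mistake.
    if not target.name.startswith("part_") or target.suffix != ".json":
        raise ValueError(f"not a part file: {file_name}")
    if target.exists():
        target.unlink()
        return True
    return False
```

After deleting the files for the segments you are unhappy with, run ai-sub again with the same arguments.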


