Generate and translate English and Japanese subtitles using AI.

AI Sub: AI-Powered Subtitle Generation with Translation


Project Overview

AI Sub is a tool that uses AI (currently Google Gemini) to produce English and Japanese subtitles for videos, translating between the two languages as necessary. It is designed for and primarily tested on Hololive concert and cover videos, but may work on other content.


Showcase

Here's an example of subtitles generated by AI Sub:

[Image: video screenshot with generated subtitles]

For more examples, please visit the showcase directory.


Pros and cons of using Gemini as the AI model

Pros:

  • Multimodal Context: Gemini's advanced multimodal capabilities enable it to analyze video content comprehensively, including on-screen text, for superior contextual understanding and more accurate subtitle generation.
  • Cloud-Based Processing: All processing is efficiently handled on Google Gemini's infrastructure, eliminating the need for local GPUs or extensive computational resources on your machine.

Cons:

  • Timestamp Precision: Subtitle timestamps may be off by a few seconds.
  • Network Usage: Uploading entire video files to Google's services will consume network bandwidth.

How AI Sub Works

  • Video Segmentation: The input video is first split into segments of 180 seconds each by default. The segment length is configurable via the --split_seconds flag.
  • Concurrent Processing: Each video segment is then sent to the AI model (Google Gemini) for subtitle generation. You can adjust the number of concurrent processing threads using the --num_processing_threads flag to optimize performance.
  • Subtitle Compilation: All generated subtitle parts are then combined into a single, final subtitle file.
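
The three steps above can be sketched in a few lines of Python. This is only an illustration of the split, process, and combine flow, not AI Sub's actual implementation: segment_bounds, fake_subtitle_model, and generate_subtitles are hypothetical names, and the model call is replaced by a stand-in that returns one placeholder cue per segment.

```python
from concurrent.futures import ThreadPoolExecutor


def segment_bounds(duration_s: int, split_seconds: int = 180) -> list[tuple[int, int]]:
    """Return (start, end) offsets in seconds that cover the whole video."""
    return [
        (start, min(start + split_seconds, duration_s))
        for start in range(0, duration_s, split_seconds)
    ]


def fake_subtitle_model(bounds: tuple[int, int]) -> list[dict]:
    """Hypothetical stand-in for the Gemini request: one placeholder cue per segment."""
    start, end = bounds
    return [{"start": start, "end": end, "text": f"segment {start}-{end}"}]


def generate_subtitles(duration_s: int, split_seconds: int = 180, num_threads: int = 4) -> list[dict]:
    bounds = segment_bounds(duration_s, split_seconds)
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # map() returns results in input order, so the parts stay chronological
        # even when the segments finish out of order.
        parts = list(pool.map(fake_subtitle_model, bounds))
    # Combine the per-segment parts into one flat list of cues.
    return [cue for part in parts for cue in part]


cues = generate_subtitles(duration_s=400)
print([(cue["start"], cue["end"]) for cue in cues])
# prints: [(0, 180), (180, 360), (360, 400)]
```

A thread pool fits here because each segment mostly waits on a network request, so several requests can be in flight at once.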

Getting Started: A Quick Guide

1. Obtain Your Google Gemini API Key

Follow these simple steps to acquire your API key:

  1. Sign in to Google AI Studio.
  2. Click "Create API Key."
  3. Copy and securely store your API key. Never disclose your API key publicly.

2. Set Up Your Python Environment (Python 3.10+ Required)

Create and activate a Python virtual environment, then install AI Sub:

python -m venv venv
source venv/bin/activate  # On Windows, use `venv\Scripts\activate.bat`
pip install --upgrade ai-sub

3. Execute the Script

Run the application with your video file:

ai-sub --api_key=YOUR_API_KEY "path/to/your/video.mp4"

Note: Replace YOUR_API_KEY with your actual Google Gemini API key and "path/to/your/video.mp4" with the full path to your video file.
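
If you prefer to drive the tool from a script, for example to keep the API key out of your shell history, a small wrapper along these lines may help. It is a sketch under assumptions: build_command and run_ai_sub are hypothetical helpers, the GEMINI_API_KEY environment variable name is my own choice rather than something AI Sub reads, and the optional flags are assumed to accept the same --flag=value form shown for --api_key.

```python
import os
import subprocess


def build_command(video_path, api_key, split_seconds=None,
                  num_processing_threads=None, temp_dir=None):
    """Build the ai-sub argument list; optional flags are added only when set."""
    cmd = ["ai-sub", f"--api_key={api_key}"]
    if split_seconds is not None:
        cmd.append(f"--split_seconds={split_seconds}")
    if num_processing_threads is not None:
        cmd.append(f"--num_processing_threads={num_processing_threads}")
    if temp_dir is not None:
        cmd.append(f"--temp_dir={temp_dir}")
    cmd.append(video_path)
    return cmd


def run_ai_sub(video_path, **options):
    """Run ai-sub with the key taken from a (hypothetical) environment variable."""
    api_key = os.environ["GEMINI_API_KEY"]  # assumed name; set it yourself
    subprocess.run(build_command(video_path, api_key, **options), check=True)


# Show the command that would be executed, without running it.
print(build_command("path/to/your/video.mp4", "YOUR_API_KEY", split_seconds=120))
# prints: ['ai-sub', '--api_key=YOUR_API_KEY', '--split_seconds=120', 'path/to/your/video.mp4']
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.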


Known Limitations

  1. Timestamp Accuracy: Subtitle timestamps may exhibit inaccuracies. This is an inherent characteristic of the Gemini AI model.

    • Observations indicate that shorter video segments generally lead to improved timestamp accuracy.
    • Asking the model for second-level timestamps generally yields more accurate results than asking for millisecond-level precision. The current implementation therefore requests second-level timestamps.
  2. AI Hallucinations: Like all AI models, Gemini may occasionally produce "hallucinations" or inaccurate information. This is a known characteristic of current AI technology.

If you encounter issues related to these limitations, consider re-processing specific video segments as detailed in the "Re-processing Specific Video Segments" section below.
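
To make the second-level timestamps mentioned above concrete, here is a minimal sketch of how a cue timed in whole seconds within one segment could be shifted to its position in the full video and formatted. It rests on several assumptions that the README does not confirm: that cue times are relative to the start of their segment, that the output uses SRT-style HH:MM:SS,mmm timestamps, and that a cue is a dict with start and end keys. to_srt_timestamp and shift_cue are hypothetical names.

```python
def to_srt_timestamp(total_seconds: int) -> str:
    """Format whole seconds as HH:MM:SS,000 (milliseconds are always zero)."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},000"


def shift_cue(cue: dict, segment_index: int, split_seconds: int = 180) -> dict:
    """Move a segment-relative cue to its absolute position in the full video."""
    offset = segment_index * split_seconds
    return {**cue, "start": cue["start"] + offset, "end": cue["end"] + offset}


# A cue 12 to 15 seconds into the third segment (index 2) of a 180-second split.
cue = shift_cue({"start": 12, "end": 15, "text": "..."}, segment_index=2)
print(to_srt_timestamp(cue["start"]), "-->", to_srt_timestamp(cue["end"]))
# prints: 00:06:12,000 --> 00:06:15,000
```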


Re-processing Specific Video Segments

Intermediate files generated during processing are stored in a temporary directory, which defaults to tmp_<input_file_name> and can be changed with the --temp_dir CLI flag. Each segment's result is saved there as a part_XXX.json file, which you can open to review the AI's output for that segment. To re-process a segment, delete its part_XXX.json file. On the next run, the script re-processes only the segments whose part_XXX.json file is missing.
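
Deleting part files by hand works fine; for many segments, a small helper like the one below may be more convenient. It is a sketch, not part of AI Sub: list_parts and delete_part are hypothetical helpers, and the demo uses a throwaway directory with made-up file names that merely follow the part_XXX.json pattern.

```python
import tempfile
from pathlib import Path


def list_parts(temp_dir: Path) -> list[Path]:
    """Return the part_*.json files in temp_dir, sorted by name."""
    return sorted(temp_dir.glob("part_*.json"))


def delete_part(temp_dir: Path, file_name: str) -> bool:
    """Delete one part file so that its segment is re-processed on the next run."""
    path = temp_dir / file_name
    if path.exists():
        path.unlink()
        return True
    return False


# Demo in a throwaway directory; point temp_dir at your real tmp_<input_file_name>.
with tempfile.TemporaryDirectory() as scratch:
    temp_dir = Path(scratch)
    for name in ("part_000.json", "part_001.json", "part_002.json"):
        (temp_dir / name).write_text("{}")
    delete_part(temp_dir, "part_001.json")
    print([p.name for p in list_parts(temp_dir)])
    # prints: ['part_000.json', 'part_002.json']
```

After deleting the files you want redone, run ai-sub again on the same video.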

Download files

Source Distribution

ai_sub-0.0.1.tar.gz (19.0 kB)

Uploaded Source

Built Distribution

ai_sub-0.0.1-py3-none-any.whl (19.3 kB)

Uploaded Python 3

File details

Details for the file ai_sub-0.0.1.tar.gz.

File metadata

  • Download URL: ai_sub-0.0.1.tar.gz
  • Upload date:
  • Size: 19.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for ai_sub-0.0.1.tar.gz
Algorithm    Hash digest
SHA256       afd5f950e6aec2abd07542d543ecbf261164116be1a5333c9a5fa99f0e20c016
MD5          2a336a9e11621ca9f93ec5f37c6e75e7
BLAKE2b-256  6e1af6c752542d827f8ebc8b34d7c3b6de4a326dddb83591f10a9b2fea2014ca

Provenance

The following attestation bundles were made for ai_sub-0.0.1.tar.gz:

Publisher: publish.yml on FlippFuzz/ai-sub

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ai_sub-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: ai_sub-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 19.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for ai_sub-0.0.1-py3-none-any.whl
Algorithm    Hash digest
SHA256       a08c87be49fa95549c29341630e550c0dd785be273908116cb5ab03a12e7d59c
MD5          5ef0a85b4f74da88905740d1eacf87cc
BLAKE2b-256  516f34b32db611132492b85d654e5b7b06bf7aa5f413c2c47b14861a9ef83238

Provenance

The following attestation bundles were made for ai_sub-0.0.1-py3-none-any.whl:

Publisher: publish.yml on FlippFuzz/ai-sub
