AI Sub: AI-Powered Subtitle Generation with Translation


Project Overview

AI Sub uses AI (currently Google Gemini) to generate English and Japanese subtitles for videos, translating between the two languages as needed. It is designed for and primarily tested on Hololive concert and cover videos, but may work on other content.


Showcase

Here's an example of subtitles generated by AI Sub:

[Image: Video Screenshot]

For more examples, please visit the showcase directory.


Pros and Cons of Using Gemini as the AI Model

Pros:

  • Multimodal Context: Gemini is multimodal, so it can use the video itself, including on-screen text, as context when generating subtitles. This can improve accuracy.
  • Cloud-Based Processing: All processing runs on Google's infrastructure, so no local GPU or heavy compute is needed on your machine.

Cons:

  • Timestamp Precision: Subtitle timestamps may be off by a few seconds.
  • Network Usage: Entire video files are uploaded to Google's services, which consumes network bandwidth.

How AI Sub Works

  • Video Segmentation: The input video is first split into 180-second segments. The segment length is configurable via the --split_seconds flag.
  • Concurrent Processing: Each segment is then sent to the AI model (Google Gemini) for subtitle generation. Segments are processed concurrently, and the --num_processing_threads flag sets the number of processing threads.
  • Subtitle Compilation: All generated subtitle parts are then combined into a single, final subtitle file.
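
The three steps above can be sketched in a few lines of Python. This is an illustration only, not AI Sub's actual implementation: the function names are hypothetical, and fake_generate_subtitles is a stand-in for the real model call. It shows how a duration is cut into fixed-length segments, how a thread pool processes them concurrently, and how the parts are joined back in segment order.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def segment_bounds(duration_seconds, split_seconds=180):
    """Return (start, end) pairs covering the video in split_seconds chunks."""
    count = math.ceil(duration_seconds / split_seconds)
    return [
        (i * split_seconds, min((i + 1) * split_seconds, duration_seconds))
        for i in range(count)
    ]


def fake_generate_subtitles(bounds):
    """Hypothetical stand-in for the model call; returns one placeholder line."""
    start, end = bounds
    return [f"[{start:g}s-{end:g}s] placeholder subtitle"]


def run_pipeline(duration_seconds, split_seconds=180, num_threads=2):
    """Split, process concurrently, and combine the parts in segment order."""
    segments = segment_bounds(duration_seconds, split_seconds)
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # map() yields results in input order, even if threads finish out of order.
        parts = list(pool.map(fake_generate_subtitles, segments))
    return [line for part in parts for line in part]


if __name__ == "__main__":
    # A 400-second video yields segments 0-180, 180-360, and 360-400.
    for line in run_pipeline(400):
        print(line)
```

Because the last segment is clipped to the video's duration, it may be shorter than the others.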

Getting Started: A Quick Guide

1. Obtain Your Google Gemini API Key

Follow these simple steps to acquire your API key:

  1. Sign in to Google AI Studio.
  2. Click "Create API Key."
  3. Copy and securely store your API key. Never disclose your API key publicly.

2. Set Up Your Python Environment (Python 3.10+ Required)

Create and activate a Python virtual environment, then install AI Sub:

python -m venv venv
source venv/bin/activate  # On Windows, use `venv\Scripts\activate.bat`
pip install --upgrade ai-sub

3. Execute the Script

Run the application with your video file:

ai-sub --api_key=YOUR_API_KEY "path/to/your/video.mp4"

Note: Replace YOUR_API_KEY with your actual Google Gemini API key and "path/to/your/video.mp4" with the full path to your video file.
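
If you would rather not type the key on every run, one option is a small wrapper script that reads it from an environment variable and calls the same command. This is a minimal sketch, not part of AI Sub: the variable name GEMINI_API_KEY and the function build_command are assumptions made for this example. Only the --api_key flag and the positional video path come from the command above.

```python
import os
import subprocess
import sys


def build_command(video_path, api_key):
    """Build the argument list for the ai-sub invocation shown above."""
    return ["ai-sub", f"--api_key={api_key}", video_path]


def main():
    # GEMINI_API_KEY is a hypothetical variable name chosen for this sketch.
    api_key = os.environ.get("GEMINI_API_KEY")
    if not api_key:
        raise SystemExit("Set GEMINI_API_KEY before running this script.")
    if len(sys.argv) != 2:
        raise SystemExit("Usage: python run_ai_sub.py path/to/your/video.mp4")
    # Passing a list (no shell) avoids quoting problems with spaces in the path.
    subprocess.run(build_command(sys.argv[1], api_key), check=True)


if __name__ == "__main__":
    main()
```

This keeps the key out of your shell history, although it is still passed to ai-sub as a command-line argument.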


Known Limitations

  1. Timestamp Accuracy: Subtitle timestamps may be inaccurate. This is a limitation of the Gemini model.

    • Shorter video segments have generally been observed to give more accurate timestamps.
    • Asking the model for timestamps with second-level precision generally gives more accurate results than asking for millisecond-level precision, so the current implementation requests second-level timestamps.
  2. AI Hallucinations: Like all AI models, Gemini may occasionally produce "hallucinations" or inaccurate information. This is a known characteristic of current AI technology.
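
To make the second-level timestamps concrete, the sketch below converts a whole-second offset within a segment into a position in the full video. It rests on two assumptions that the text above does not confirm: that the model's timestamps are relative to the start of each segment, and that the output uses the HH:MM:SS,mmm style of SubRip files. Both function names are hypothetical.

```python
def to_absolute_seconds(segment_index, offset_seconds, split_seconds=180):
    """Convert a segment-relative offset to seconds from the start of the video.

    Assumes timestamps are relative to their segment (not confirmed by the source).
    """
    return segment_index * split_seconds + offset_seconds


def format_timestamp(total_seconds):
    """Format whole seconds as HH:MM:SS,mmm; milliseconds are always 000."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},000"


if __name__ == "__main__":
    # 45 seconds into the third segment (index 2) of a 180-second split.
    print(format_timestamp(to_absolute_seconds(2, 45)))  # 00:06:45,000
```

With whole-second input the millisecond field carries no information, which is the trade-off of requesting second-level precision.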

If you encounter issues related to these limitations, consider re-processing specific video segments as detailed in the "Re-processing Specific Video Segments" section below.


Re-processing Specific Video Segments

Intermediate files generated during processing are stored in a temporary directory, which defaults to tmp_<input_file_name> and can be changed with the --temp_dir CLI flag. Each segment's result is saved there as a part_XXX.json file, which you can open to review the AI's output for that segment. To re-process a segment, delete its part_XXX.json file and run the script again. Only segments whose part_XXX.json file is missing are re-processed.
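
The delete-and-rerun workflow can also be scripted. The helper below is a minimal sketch, not part of AI Sub: it lists the part files in a temporary directory and deletes one by name so the next run regenerates it. The function names are hypothetical, and the only thing assumed about the file names is the part_ prefix and .json suffix described above.

```python
import fnmatch
import sys
from pathlib import Path

PART_PATTERN = "part_*.json"


def list_parts(temp_dir):
    """Return the part files in temp_dir, sorted by name."""
    return sorted(Path(temp_dir).glob(PART_PATTERN))


def mark_for_reprocessing(temp_dir, part_name):
    """Delete one part file by name; return True if a file was removed."""
    if not fnmatch.fnmatch(part_name, PART_PATTERN):
        return False  # refuse to delete anything that is not a part file
    target = Path(temp_dir) / part_name
    if target.is_file():
        target.unlink()
        return True
    return False


if __name__ == "__main__":
    # Usage: python reprocess.py <temp_dir> [<part file name>]
    temp_dir = sys.argv[1]
    if len(sys.argv) > 2:
        removed = mark_for_reprocessing(temp_dir, sys.argv[2])
        print("removed" if removed else "nothing removed")
    for part in list_parts(temp_dir):
        print(part.name)
```

After deleting a part file, run ai-sub again with the same input so that only the missing segment is re-processed.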

Project details

Version: 0.0.8, published from the FlippFuzz/ai-sub repository.
