# vid2cc-AI 🎙️🎬

AI-powered subtitle generator using Whisper and FFmpeg.

vid2cc-AI is a high-performance CLI tool designed to bridge the gap between raw video and accessible content. By leveraging OpenAI's Whisper models and FFmpeg's robust media handling, it automates the creation of perfectly synced `.srt` subtitles.
## Table of Contents
- 🚀 Key Features
- ⚙️ Installation
- 📖 How To Use
- ☁️ Run on Google Colab (with UI)
- 🧪 Testing
- 🗺️ Roadmap
- 🛠️ Tech Stack
- 📄 License
## 🚀 Key Features
- AI-Driven Transcription: Powered by OpenAI Whisper for industry-leading accuracy.
- Hardware Acceleration: Automatic CUDA detection for GPU-accelerated processing.
- Intelligent Pre-processing: FFmpeg-based audio extraction optimized for speech recognition (16kHz Mono).
- Professional Packaging: Fully installable via pip with a dedicated command-line entry point.
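The speech-optimized pre-processing described above (16 kHz, mono, no video stream) maps onto a plain FFmpeg invocation. The sketch below illustrates the idea with `subprocess`; the helper names are hypothetical and this is not the tool's actual internals:

```python
import subprocess

def build_extract_cmd(video_path: str, audio_path: str) -> list[str]:
    """Build an FFmpeg command that extracts speech-ready audio:
    no video stream (-vn), 16 kHz sample rate (-ar), mono (-ac 1)."""
    return [
        "ffmpeg", "-y",      # overwrite output without prompting
        "-i", video_path,    # input video
        "-vn",               # drop the video stream
        "-ar", "16000",      # resample to 16 kHz (Whisper's native rate)
        "-ac", "1",          # downmix to mono
        audio_path,
    ]

def extract_audio_sketch(video_path: str, audio_path: str) -> None:
    """Run the extraction; requires FFmpeg on the PATH."""
    subprocess.run(build_extract_cmd(video_path, audio_path), check=True)
```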
## ⚙️ Installation

### 1. Prerequisite: FFmpeg

This tool requires FFmpeg to be installed on your system. For step-by-step instructions covering Windows (Winget/Choco), macOS (Homebrew), and Linux (Apt/Dnf/Pacman), refer to the dedicated FFmpeg installation guide.
### 2. Install vid2cc-AI

```shell
pip install vid2cc-ai
```

Or install directly from source for development:

```shell
git clone https://github.com/0xdilshan/vid2cc-AI.git
cd vid2cc-AI
pip install -e .
```
## 📖 How To Use

Once installed, the `vid2cc` command is available globally in your terminal.

### Examples

For maximum accuracy with toggleable subtitles:

```shell
vid2cc example.mp4 --model large --embed
```
### 🛠️ Advanced Options

Fine-tune your output using the following flags:

| Flag | Description |
|---|---|
| `--model [size]` | Choose the Whisper model: `tiny`, `base`, `small`, `medium`, `large`, or `turbo`. |
| `--embed` | Soft subtitles: adds the SRT as a metadata track. Fast, and lets users toggle subtitles on/off in players like VLC. |
| `--hardcode` | Burn-in subtitles: permanently draws the subtitles onto the video. Essential for social media (Instagram/TikTok), where players don't support SRT files. |
| `--output-dir`, `-o` | Set the output directory: creates the destination directory if it doesn't exist and saves all generated files (SRT, audio, and video) there. |
| `--translate`, `-t` | Translate to English: automatically translates transcriptions from any supported language into English. |
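Since the CLI is built on Argparse (see Tech Stack), the flag set above can be declared roughly as follows. This is an illustrative sketch, not the project's actual parser:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Declare a CLI mirroring the documented vid2cc flags."""
    parser = argparse.ArgumentParser(
        prog="vid2cc", description="AI-powered subtitle generator")
    parser.add_argument("videos", nargs="+", help="one or more video files")
    parser.add_argument("--model", default="base",
                        choices=["tiny", "base", "small", "medium", "large", "turbo"],
                        help="Whisper model size")
    parser.add_argument("--embed", action="store_true",
                        help="add the SRT as a soft subtitle track")
    parser.add_argument("--hardcode", action="store_true",
                        help="burn subtitles into the video")
    parser.add_argument("--output-dir", "-o", default=".",
                        help="destination directory for generated files")
    parser.add_argument("--translate", "-t", action="store_true",
                        help="translate the transcription to English")
    return parser

args = build_parser().parse_args(["talk.mp4", "--model", "small", "--embed"])
print(args.model, args.embed)  # small True
```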
## 📦 Batch Processing

No need to run the command separately for every file; you can pass multiple videos at once:

```shell
# Process all mp4 files in the current directory
vid2cc *.mp4 --model small --embed

# Process multiple specific files
vid2cc video1.mp4 video2.mkv video3.mov --model base --embed
```
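Note that the `*.mp4` pattern above is expanded by the shell. On shells that don't expand wildcards (e.g. Windows `cmd.exe`), a tool can expand patterns itself with the standard `glob` module. A minimal sketch, with a hypothetical helper name:

```python
import glob

def expand_patterns(patterns: list[str]) -> list[str]:
    """Expand shell-style wildcards into a sorted, de-duplicated file list.
    Literal paths with no wildcard match are kept as-is so the CLI can
    report a clear 'file not found' error later."""
    files: list[str] = []
    for pattern in patterns:
        matches = sorted(glob.glob(pattern))
        files.extend(matches if matches else [pattern])
    # de-duplicate while preserving order
    return list(dict.fromkeys(files))
```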
## 📦 Usage as a Library

You can integrate vid2cc-AI directly into your Python projects:

```python
from vid2cc_ai import Transcriber, extract_audio

# Extract the audio track, then transcribe it
extract_audio("video.mp4", "audio.wav")
ts = Transcriber("base")
segments = ts.transcribe("audio.wav")

for s in segments:
    print(f"[{s['start']:.2f}s] {s['text']}")
```
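If you want to serialize the returned segments to an `.srt` file yourself, the SRT timestamp format is `HH:MM:SS,mmm`. A sketch, assuming each segment is a dict with `start`/`end` seconds and a `text` key as in the loop above (`write_srt` is a hypothetical helper, not part of the library):

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, e.g. 3661.5 -> '01:01:01,500'."""
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1_000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"

def write_srt(segments, path: str) -> None:
    """Write segments as a numbered SRT cue list."""
    with open(path, "w", encoding="utf-8") as f:
        for i, seg in enumerate(segments, start=1):
            f.write(f"{i}\n")
            f.write(f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n")
            f.write(seg["text"].strip() + "\n\n")
```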
## ☁️ Run on Google Colab (with UI)

You can run vid2cc-AI directly in your browser using Google Colab. This version includes a friendly interface to manage your Google Drive files and transcription settings without writing code.

1. Install & Mount: Run the first cell to install `vid2cc-ai` and connect your Google Drive.
2. Configure UI:
   - Video Path: Right-click your video in the Colab file sidebar and select "Copy Path."
   - Model: Choose `turbo` or `small` for speed, `large` for accuracy.
   - Output: Select Soft Subtitles (toggleable) or Hardcoded (burned-in).
3. Start: Click "Start Processing" and find your result in your Drive folder.

⚡ For 10x faster transcription, ensure your Colab runtime is set to GPU (Runtime > Change runtime type > T4 GPU).
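The GPU speed-up above relies on the automatic CUDA detection mentioned under Key Features. The check typically amounts to something like the following sketch (with a CPU fallback when PyTorch is absent; this is an illustration, not the project's actual code):

```python
def pick_device() -> str:
    """Return 'cuda' when a CUDA-capable GPU is visible to PyTorch, else 'cpu'."""
    try:
        import torch
        return "cuda" if torch.cuda.is_available() else "cpu"
    except ImportError:
        # PyTorch not installed: transcription would run on CPU anyway
        return "cpu"

print(f"Whisper will run on: {pick_device()}")
```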
## 🧪 Testing

```shell
# Install test dependencies
pip install pytest

# Run the test suite
pytest
```
## 🗺️ Roadmap

- [x] Local video → SRT subtitle/transcription
- [x] Embed subtitles into video containers (`--embed`)
- [x] Burn-in subtitles (`--hardcode`)
- [x] Set custom output directory (`--output-dir`)
- [x] Multilingual transcription
- [x] Support translation to English
- [ ] Transcription from YouTube/Vimeo URLs (yt-dlp)
- [x] Google Colab notebook support
## 🛠️ Tech Stack
- Inference: OpenAI Whisper
- Media Engine: FFmpeg
- Core: Python 3.9+, PyTorch
- CLI Framework: Argparse
## 📄 License
Distributed under the MIT License.
See LICENSE for more information.
## Project details

### Download files
### File details

Details for the file `vid2cc_ai-0.1.5.tar.gz`.

#### File metadata

- Download URL: vid2cc_ai-0.1.5.tar.gz
- Upload date:
- Size: 11.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
#### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `e818d1d562d8fb88c35d8ed2a7cbb5ca4076baf11e8c322a4a30845ebfd00da0` |
| MD5 | `e7b6a091cab6e0d564140863d7dd298a` |
| BLAKE2b-256 | `66110a834961b3b664f1a7160f3d58223d74b4fd61d7eadc934ece5af4ef89fe` |
#### Provenance

The following attestation bundles were made for `vid2cc_ai-0.1.5.tar.gz`:

Publisher: `publish.yml` on `0xdilshan/vid2cc-AI`

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: vid2cc_ai-0.1.5.tar.gz
- Subject digest: e818d1d562d8fb88c35d8ed2a7cbb5ca4076baf11e8c322a4a30845ebfd00da0
- Sigstore transparency entry: 969329073
- Sigstore integration time:
- Permalink: 0xdilshan/vid2cc-AI@e5f078544556cf73bb3e5fd7df924c65a6c38245
- Branch / Tag: refs/tags/v0.1.5
- Owner: https://github.com/0xdilshan
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@e5f078544556cf73bb3e5fd7df924c65a6c38245
- Trigger Event: release
### File details

Details for the file `vid2cc_ai-0.1.5-py3-none-any.whl`.

#### File metadata

- Download URL: vid2cc_ai-0.1.5-py3-none-any.whl
- Upload date:
- Size: 10.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
#### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `ccf6d5314a91c9e2146ba92cd629136c9957d2c90632018736ed8d7d567a4abd` |
| MD5 | `ea6d92fcca6d5fc9c02ec5be804ac995` |
| BLAKE2b-256 | `d26d3ff38d4c813be177b6d2f2f32d54692588e36c1cbb477120117d16cd3573` |
#### Provenance

The following attestation bundles were made for `vid2cc_ai-0.1.5-py3-none-any.whl`:

Publisher: `publish.yml` on `0xdilshan/vid2cc-AI`

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: vid2cc_ai-0.1.5-py3-none-any.whl
- Subject digest: ccf6d5314a91c9e2146ba92cd629136c9957d2c90632018736ed8d7d567a4abd
- Sigstore transparency entry: 969329076
- Sigstore integration time:
- Permalink: 0xdilshan/vid2cc-AI@e5f078544556cf73bb3e5fd7df924c65a6c38245
- Branch / Tag: refs/tags/v0.1.5
- Owner: https://github.com/0xdilshan
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@e5f078544556cf73bb3e5fd7df924c65a6c38245
- Trigger Event: release