Transcribe media files to SRT subtitles.

These details have not been verified by PyPI

Project links

Homepage

Project description

Audio2Sub

Audio2Sub is a command-line tool that automatically transcribes audio from video or audio files and generates subtitles in the .srt format. It uses FFmpeg for media handling and Silero VAD for precise voice activity detection, and it supports multiple transcription backends to convert speech to text.

Installation

Before installing, you must have FFmpeg installed and available in your system's PATH.
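One way to confirm the FFmpeg prerequisite before installing is a small shell check. This is a minimal sketch; `check_ffmpeg` is a hypothetical helper name and is not part of Audio2Sub.

```shell
# Report whether an ffmpeg executable can be found on PATH.
check_ffmpeg() {
  if command -v ffmpeg >/dev/null 2>&1; then
    echo "ffmpeg found: $(command -v ffmpeg)"
  else
    echo "ffmpeg not found on PATH; install it before running audio2sub" >&2
    return 1
  fi
}
```

Running `check_ffmpeg` prints the resolved path on success and returns a non-zero status otherwise, so it can gate an install script.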

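Install Audio2Sub with pip. The command below also installs the faster_whisper backend, which is the default. The quotes keep shells such as zsh from treating the square brackets as a glob pattern.

pip install "audio2sub[faster_whisper]"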

To install with a different backend, see the table in the Backends section below.

Usage

Basic Example

audio2sub my_video.mp4 -o my_video.srt --lang en

This command will transcribe the audio from my_video.mp4 into English and save the subtitles to my_video.srt.
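The same command can be applied to a whole directory with a short loop. The sketch below uses only the options shown in the basic example (`-o` and `--lang`); `transcribe_dir` is a hypothetical helper name, and the choice to process only `.mp4` files is an assumption made for illustration.

```shell
# Transcribe every .mp4 file in a directory, writing an .srt file next to each one.
# transcribe_dir is a hypothetical helper, not part of Audio2Sub.
transcribe_dir() {
  dir=$1
  lang=$2
  for video in "$dir"/*.mp4; do
    # Skip the unexpanded pattern when the directory holds no .mp4 files.
    [ -e "$video" ] || continue
    # my_video.mp4 -> my_video.srt
    audio2sub "$video" -o "${video%.mp4}.srt" --lang "$lang"
  done
}
```

For example, `transcribe_dir ./videos en` would run one `audio2sub` invocation per video in `./videos`.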

Notes:

First-Time Use: The first time you run the program, it will download the necessary transcription models. This may take some time and require significant disk space.
CUDA: Whisper-based local models run significantly slower without CUDA. The program prints a warning at startup if CUDA is not available. If your system has a compatible GPU, install the CUDA Toolkit first. If you are sure CUDA is installed correctly and still get the warning, you may need to manually reinstall a PyTorch version that is compatible with it. Reinstalling PyTorch at a different version than the one you currently have may break other dependencies; in that case, reinstall those dependencies as directed by the warnings shown.
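When diagnosing the CUDA warning, it can help to ask PyTorch directly what it sees. The sketch below assumes that `python3` is the interpreter Audio2Sub was installed into and that PyTorch is importable there; `cuda_status` is a hypothetical helper name.

```shell
# Print the installed PyTorch version, the CUDA version it was built against
# (None for CPU-only builds), and whether a CUDA device is usable right now.
# Assumes python3 points at the environment where audio2sub is installed.
cuda_status() {
  python3 -c 'import torch; print("torch", torch.__version__, "| CUDA build:", torch.version.cuda, "| CUDA available:", torch.cuda.is_available())'
}
```

If the output shows a CPU-only build, or `CUDA available: False` on a machine with a working GPU driver, that points toward the PyTorch reinstallation described in the note above.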

Using a Different Transcriber

Use the -t or --transcriber flag to select a different backend.

audio2sub my_audio.wav -o my_audio.srt --lang en -t whisper --model medium

Each transcriber has its own options. To see them, use --help with the transcriber specified.

audio2sub -t faster_whisper --help

Docker

Audio2Sub provides official Docker images for easy deployment without managing dependencies.

Quick Start

# With GPU support (recommended)
docker run --rm --gpus all -v "$(pwd):/media" xavierlam/audio2sub \
  my_video.mp4 -o my_video.srt --lang en

# Without GPU support, whisper backend
docker run --rm -v "$(pwd):/media" xavierlam/audio2sub:whisper \
  my_video.mp4 -o my_video.srt --lang en

Use --gpus all to enable GPU support, and use a different image tag to select a different backend.
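The two quick-start variants can be folded into one wrapper that adds `--gpus all` only when an NVIDIA driver appears to be present. This is a sketch under assumptions: `audio2sub_docker` is a hypothetical name, the presence of `nvidia-smi` on the host is used as a rough proxy for GPU availability (Docker also needs GPU support configured), and the default image is assumed to run without a GPU as well.

```shell
# Hypothetical wrapper around the quick-start commands.
# Mounts the current directory at /media and forwards all arguments to the image.
audio2sub_docker() {
  if command -v nvidia-smi >/dev/null 2>&1; then
    # NVIDIA driver tools found on the host: request GPU access.
    docker run --rm --gpus all -v "$(pwd):/media" xavierlam/audio2sub "$@"
  else
    # No NVIDIA driver tools found: run without GPU access.
    docker run --rm -v "$(pwd):/media" xavierlam/audio2sub "$@"
  fi
}
```

With this in place, `audio2sub_docker my_video.mp4 -o my_video.srt --lang en` mirrors the first quick-start command on a GPU host.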

Available Images

Image Tag	Backend	Description
`xavierlam/audio2sub:latest`	faster-whisper	Recommended (same as faster-whisper)
`xavierlam/audio2sub:faster-whisper`	faster-whisper	Fast CTranslate2-based Whisper
`xavierlam/audio2sub:whisper`	whisper	Original OpenAI Whisper
`xavierlam/audio2sub:gemini`	gemini	Google Gemini API

For detailed Docker documentation, GPU setup, and troubleshooting, see docker/README.md.

Backends

Audio2Sub supports the following transcription backends.

Backend Name	Description
`faster_whisper`	A faster reimplementation of Whisper using CTranslate2. See Faster Whisper. This is the default backend.
`whisper`	The original speech recognition model by OpenAI. See OpenAI Whisper.
`gemini`	Google's Gemini model via their API. Requires a `GEMINI_API_KEY` environment variable or `--api-key` argument.

Install support for a backend with pip install "audio2sub[<backend>]", then select the corresponding transcriber with the -t flag.
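For the gemini backend, the API key can be supplied through the environment before the tool is invoked. The sketch below assumes the gemini extra is already installed; `transcribe_gemini` is a hypothetical helper that fails early when `GEMINI_API_KEY` is unset and otherwise forwards any extra options (such as `--lang en`) to `audio2sub`.

```shell
# Hypothetical helper: run the gemini transcriber on one file.
# The output name is the input name with its extension replaced by .srt.
transcribe_gemini() {
  if [ -z "${GEMINI_API_KEY:-}" ]; then
    echo "GEMINI_API_KEY is not set; export it or pass --api-key to audio2sub" >&2
    return 1
  fi
  input=$1
  shift
  audio2sub "$input" -o "${input%.*}.srt" -t gemini "$@"
}
```

A typical session would export `GEMINI_API_KEY` once and then call `transcribe_gemini my_audio.wav --lang en`.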

Contributing

Contributions are welcome! Please open an issue or submit a pull request on the GitHub repository.

Release history

0.2.1 (Feb 18, 2026)
0.2.0 (Feb 14, 2026)
0.1.2 (Feb 9, 2026), this version
0.1.1 (Feb 7, 2026)
0.1.0 (Feb 5, 2026)

Download files

Source Distribution

audio2sub-0.1.2.tar.gz (16.0 kB)

Uploaded Feb 9, 2026 Source

Built Distribution

audio2sub-0.1.2-py3-none-any.whl (17.2 kB)

Uploaded Feb 9, 2026 Python 3

File details

Details for the file audio2sub-0.1.2.tar.gz.

File metadata

Download URL: audio2sub-0.1.2.tar.gz
Upload date: Feb 9, 2026
Size: 16.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for audio2sub-0.1.2.tar.gz
Algorithm	Hash digest
SHA256	`3fc247833d70f7b6fa5ddab4bae6cf742c12cae7b2d7477bd95ea2478e17c60d`
MD5	`c791e96857f3df6cd2d1a7a611396fcb`
BLAKE2b-256	`8451154eb744d4adc83fd9b7777ec17b77edd8509e8fa1d4cfb3c9c9b16e4ff3`

File details

Details for the file audio2sub-0.1.2-py3-none-any.whl.

File metadata

Download URL: audio2sub-0.1.2-py3-none-any.whl
Upload date: Feb 9, 2026
Size: 17.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for audio2sub-0.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`af8a622cd6cef73b72265bdaeb7d8cd44a52815f23db111a0913989c77cc6c03`
MD5	`d05eb866a3e7c10c430ab99b71000472`
BLAKE2b-256	`0404909c2f9d64694f82234360e1754d173bef9aa57772503537a705345adb7c`
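A downloaded file can be checked against the SHA256 digests listed above. The sketch below assumes the `sha256sum` utility from GNU coreutils is available; `verify_sha256` is a hypothetical helper name.

```shell
# Compare a file's SHA256 digest with an expected value.
# Relies on sha256sum (GNU coreutils); verify_sha256 is a hypothetical helper.
verify_sha256() {
  file=$1
  expected=$2
  # sha256sum prints "<digest>  <filename>"; keep only the digest.
  actual=$(sha256sum "$file" | cut -d ' ' -f 1)
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file" >&2
    return 1
  fi
}
```

For example, `verify_sha256 audio2sub-0.1.2.tar.gz 3fc247833d70f7b6fa5ddab4bae6cf742c12cae7b2d7477bd95ea2478e17c60d` checks the source distribution against its listed digest.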
