CLI to transcribe and translate audio and video files

Pytranscript 🎙️

Pytranscript is a Python library and command-line tool that transcribes video or audio files to text and can translate the result into other languages. It is a thin wrapper around Vosk, ffmpeg, and deep-translator.

Prerequisites

Before using pytranscript, ensure you have the following dependencies installed:

  • ffmpeg for audio conversion.
  • A Vosk model (vosk-models), required for speech recognition. Pass the path to your model directory with the --model argument.
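
Both prerequisites can be checked before a long transcription run. The following is a minimal sketch, not part of pytranscript: check_prerequisites is a hypothetical helper, and it only assumes that ffmpeg is expected on the PATH and that the model is an ordinary directory on disk.

```python
import shutil
from pathlib import Path
from typing import List


def check_prerequisites(model_dir: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means all good.

    Hypothetical helper for illustration, not a pytranscript function.
    """
    problems = []
    # shutil.which returns None when the executable is not on the PATH.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg executable not found on PATH")
    # The --model argument expects a path to a model directory.
    if not Path(model_dir).is_dir():
        problems.append(f"Vosk model directory not found: {model_dir}")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites("vosk-model-en-us-aspire-0.2"):
        print(problem)
```

The helper reports every problem at once instead of stopping at the first, so a missing ffmpeg and a wrong model path can be fixed in a single pass.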

Installation

pip install pytranscript

Usage

Command Line

pytranscript INPUT_FILE [OPTIONS]

Options

  • -m, --model - Path to the Vosk model directory. Always required.
  • -o, --output - Output file where the text will be saved. Default: input file name with .txt extension.
  • -f, --format - Format of the transcript. Must be one of 'csv', 'json', 'srt', 'txt', 'vtt' or 'all'. Default: input file extension.
  • -li, --lang_input - Language of the input / the model. Default: auto.
  • -lo, --lang_output - Language to translate the text to. Default: no translation.
  • -s, --start - Start time of the audio to transcribe in seconds.
  • -e, --end - End time of the audio to transcribe in seconds.
  • --max_size - Stops the transcription once the output file reaches the specified size in bytes. Takes precedence over the --end option.
  • --keep-wav - Keep the converted audio wav file after the process is done.
  • -v, --verbosity - Verbosity level. 0: no output, 1: only errors, 2: errors, info and progress bar, 3: debug. Default: 2.

Example

The most basic usage is:

pytranscript video.mp4 -m vosk-model-en-us-aspire-0.2 -lo fr -f srt

Where vosk-model-en-us-aspire-0.2 is the Vosk model directory. The text will be translated from English to French, and the output will be saved in video.srt.

The --keep-wav option is useful if you want to transcribe the same file several times: the converted .wav file can be reused for each transcription, which saves conversion time. ⚠️ The .wav file is cropped according to the start and end time options.
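
For scripted or batch use, the same command line can be assembled from Python. This is a minimal sketch under stated assumptions: build_command is a hypothetical helper, not part of pytranscript, and it only uses the short flags listed in the Options section plus --keep-wav. Start and end are treated as whole seconds here, which is an assumption.

```python
import subprocess
from typing import List, Optional


def build_command(
    input_file: str,
    model: str,
    fmt: Optional[str] = None,
    lang_output: Optional[str] = None,
    start: Optional[int] = None,
    end: Optional[int] = None,
    keep_wav: bool = False,
) -> List[str]:
    """Assemble a pytranscript argument list (hypothetical helper)."""
    # The model path is always required, so it is always included.
    cmd = ["pytranscript", input_file, "-m", model]
    if fmt is not None:
        cmd += ["-f", fmt]
    if lang_output is not None:
        cmd += ["-lo", lang_output]
    if start is not None:
        cmd += ["-s", str(start)]
    if end is not None:
        cmd += ["-e", str(end)]
    if keep_wav:
        cmd.append("--keep-wav")
    return cmd


if __name__ == "__main__":
    cmd = build_command(
        "video.mp4", "vosk-model-en-us-aspire-0.2", fmt="srt", lang_output="fr"
    )
    print(" ".join(cmd))
    # Uncomment to run it (needs pytranscript, ffmpeg and the model installed):
    # subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.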

API

The API provides a Transcript object containing the time and text. Its translate method returns another Transcript object with the translated text. The file written by the CLI is simply the result of the Transcript object's to_{format} method for the chosen format.

A reproduction of the previous example using the API:

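import pytranscript as pt

# Convert the input to a WAV file for transcription (start/end in seconds).
wav_file = pt.to_valid_wav("video.mp4", "video.wav", start=0, end=None)
# Transcribe the WAV file using the given Vosk model directory.
transcript = pt.transcribe(wav_file, model="vosk-model-en-us-aspire-0.2", max_size=None)
# Translate to French; translate returns the new Transcript and the errors.
transcript_fr, errors = transcript.translate("fr")

# Save the translated transcript.
transcript_fr.write("video.srt")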
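
To illustrate the to_{format} idea, here is a self-contained sketch that does not use pytranscript at all. SketchTranscript, Line, render and srt_timestamp are hypothetical names, and the internal layout of the real Transcript class is not described in this document. The sketch only shows how a format name can be mapped to a to_{format} method, and what standard SRT output looks like. Whether the real methods return a string or write to a path is not stated here; this sketch returns strings.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Line:
    start: float  # seconds
    end: float    # seconds
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


class SketchTranscript:
    """Hypothetical stand-in, NOT pytranscript's Transcript class."""

    def __init__(self, lines: List[Line]) -> None:
        self.lines = lines

    def to_txt(self) -> str:
        return "\n".join(line.text for line in self.lines)

    def to_srt(self) -> str:
        # Each SRT cue is: index, time range, text, then a blank separator.
        blocks = [
            f"{i}\n{srt_timestamp(l.start)} --> {srt_timestamp(l.end)}\n{l.text}\n"
            for i, l in enumerate(self.lines, start=1)
        ]
        return "\n".join(blocks)


def render(transcript: SketchTranscript, fmt: str) -> str:
    """Look up and call the to_{fmt} method, as a CLI dispatcher might."""
    method = getattr(transcript, f"to_{fmt}", None)
    if method is None:
        raise ValueError(f"unsupported format: {fmt}")
    return method()


if __name__ == "__main__":
    t = SketchTranscript([Line(0.0, 1.5, "Bonjour"), Line(1.5, 3.0, "le monde")])
    print(render(t, "srt"))
```

Dispatching through getattr keeps the format list in one place: adding a new output format only requires adding a new to_{format} method.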

Contributing

Contributions are welcome! For major changes, please open an issue first to discuss what you would like to change. Tests can be run with pytest. Use ruff with ruff format . to format the code before committing.

Download files

Source Distribution

pytranscript-0.3.0.tar.gz (10.3 kB)

Uploaded Source

Built Distribution

pytranscript-0.3.0-py3-none-any.whl (10.1 kB)

Uploaded Python 3

File details

Details for the file pytranscript-0.3.0.tar.gz.

File metadata

  • Download URL: pytranscript-0.3.0.tar.gz
  • Size: 10.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.9.20

File hashes

Hashes for pytranscript-0.3.0.tar.gz
Algorithm Hash digest
SHA256 cb1042521411a2b528e03cc8fe51c29f2ceb2fd7dec774dbd30c5e13bd445f97
MD5 d59a3eb30d52751640b2d90946d78649
BLAKE2b-256 e5e58b1a728050e3c455b609b64a61bf81a819a9ca4c96d5c9b7f36ef2988d46

File details

Details for the file pytranscript-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: pytranscript-0.3.0-py3-none-any.whl
  • Size: 10.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.9.20

File hashes

Hashes for pytranscript-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 c8bca0ad0a94b530ad7275d7073b40a8c3c643f7f6b6b98c371f5c39689aa163
MD5 f1997202c5c779c751dd79cc8db7644b
BLAKE2b-256 bb3c6d4e90f973b6cbb862b49a93f4f6d351804070d4a26e367cf54f5496af81
