A Python package for transcribing videos to text using various speech-to-text services

vid2txt

A Python package for transcribing video and audio files to text using speech-to-text services. AssemblyAI is currently the only supported service.

Features

  • Download and transcribe from YouTube or any URL (via yt-dlp)
  • Extract audio from video files using FFmpeg
  • Direct support for audio formats (MP3, WAV, M4A, AAC, FLAC, OGG, WMA)
  • Transcribe audio using AssemblyAI API
  • Export transcripts in multiple formats:
    • Plain text (.txt)
    • SubRip subtitles (.srt)
    • Interactive HTML (.html) with embedded video/audio player
  • Optional forced transcription language (e.g., 'en', 'ar', 'es')
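
To make the SubRip export concrete, here is a minimal sketch of how timed segments map onto the .srt layout (a cue number, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` line, then the text). It is not the package's implementation: the `(start, end, text)` tuple shape and the helper names `srt_timestamp` and `to_srt` are assumptions made for illustration.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Render hypothetical (start_seconds, end_seconds, text) tuples as SRT cues."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        # Each cue: running number, time range, text; cues are separated by a blank line.
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)


print(to_srt([(0.0, 1.5, "Hello"), (1.5, 3.0, "World")]))
```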

Installation

pip install vid2txt

Install from source:

git clone https://github.com/ahmedsalim3/vid2txt.git
cd vid2txt
pip install -e .

Setup

Set your AssemblyAI API key as an environment variable:

# On Linux/macOS:
export ASSEMBLYAI_API_KEY="your-api-key-here"

# On Windows (PowerShell):
$env:ASSEMBLYAI_API_KEY="your-api-key-here"

Get a free API key from: https://www.assemblyai.com/dashboard/signup
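
If you call the package from your own script, it can help to check the variable before starting a long download. The sketch below uses only the standard library; `read_api_key` is a hypothetical helper, not part of vid2txt, and whether the package itself reads `ASSEMBLYAI_API_KEY` when no key is passed explicitly is not shown here.

```python
import os


def read_api_key(env=os.environ, name="ASSEMBLYAI_API_KEY"):
    """Return the API key from the environment, or raise a clear error."""
    key = env.get(name, "").strip()
    if not key:
        # Fail early with an actionable message instead of partway through a run.
        raise RuntimeError(f"{name} is not set; export it before running vid2txt.")
    return key
```

The returned value can then be passed wherever a key is expected, so the key never has to be written into source code.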

Usage

Command Line Interface

vid2txt MEDIA_PATH [OPTIONS]

Where MEDIA_PATH can be:

  • A local video file (.mp4, .mkv, .mov, ...)
  • A local audio file (.mp3, .wav, ...)
  • A YouTube or other URL
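
One way to picture the three input kinds is a small classifier over the argument string. This is an illustrative sketch only, not the package's actual detection logic: `classify_media` is a hypothetical name, the audio extensions come from the Features list, and the video set holds just the three extensions named above (that list ends in "...", so the set is incomplete).

```python
from pathlib import Path
from urllib.parse import urlparse

AUDIO_EXTS = {".mp3", ".wav", ".m4a", ".aac", ".flac", ".ogg", ".wma"}
VIDEO_EXTS = {".mp4", ".mkv", ".mov"}  # incomplete: only the extensions named above


def classify_media(media: str) -> str:
    """Return 'url', 'audio', 'video', or 'unknown' for a MEDIA_PATH argument."""
    if urlparse(media).scheme in ("http", "https"):
        return "url"
    suffix = Path(media).suffix.lower()
    if suffix in AUDIO_EXTS:
        return "audio"
    if suffix in VIDEO_EXTS:
        return "video"
    return "unknown"


print(classify_media("podcast.MP3"))  # extension matching is case-insensitive
```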

Options

  • -o, --output-dir: Output directory (default: the directory containing the input file)
  • -l, --language: Force transcription language (e.g., 'en', 'ar', 'es')
  • --model: Speech-to-text model to use (currently only 'assemblyai')
  • --force-audio-extract: Force re-extraction of audio from video files
  • --audio: Download audio only when using YouTube URLs (faster)

Examples

# Transcribe a local video file
vid2txt video.mp4

# Transcribe a local audio file
vid2txt podcast.mp3

# Transcribe from YouTube (downloads best video+audio)
vid2txt "https://www.youtube.com/watch?v=dQw4w9WgXcQ"

# Download and transcribe audio only (faster)
vid2txt "https://www.youtube.com/watch?v=dQw4w9WgXcQ" --audio

# Specify output directory and language
vid2txt "https://www.youtube.com/watch?v=dQw4w9WgXcQ" -o ./output -l en

# Force re-extraction of audio even if cached
vid2txt video.mp4 --force-audio-extract

# Show help
vid2txt -h
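
To drive the CLI from a Python script, the commands above can be assembled as an argument list. This is a minimal sketch under the assumption that `vid2txt` is on your PATH; `build_command` is a hypothetical helper and uses only the flags documented in the Options section.

```python
def build_command(media, output_dir=None, language=None, audio_only=False):
    """Build a vid2txt argument list from the documented options."""
    cmd = ["vid2txt", str(media)]
    if output_dir is not None:
        cmd += ["-o", str(output_dir)]
    if language is not None:
        cmd += ["-l", language]
    if audio_only:
        cmd.append("--audio")
    return cmd


cmd = build_command(
    "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    output_dir="./output",
    language="en",
    audio_only=True,
)
print(cmd)
```

Passing the list to `subprocess.run(cmd, check=True)` runs the command without a shell, so characters such as `?` and `&` in URLs need no quoting.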

Python API

from pathlib import Path

from vid2txt import Transcriber

media_path = Path("video.mp4")  # or Path("audio.mp3"), or a URL
output_dir = Path("output")  # directory for the generated transcripts
api_key = "your_api_key"  # needed when model="assemblyai"

# Configure the transcriber
transcriber = Transcriber(
    output_dir=output_dir,
    language="en",
    model="assemblyai",
    api_key=api_key,
)

# Transcribe the media file into segments
segments = transcriber.transcribe(media_path=media_path)

# Save in different formats
transcriber.save_plain_text(
    segments=segments,
    out_path=output_dir / "transcript.txt",
)
transcriber.save_srt(
    segments=segments,
    out_path=output_dir / "transcript.srt",
)
transcriber.save_html(
    segments=segments,
    out_path=output_dir / "transcript.html",
    media_path=media_path,
)
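
The same calls can be wrapped in a loop to process several files. The sketch below is an assumption-laden illustration rather than a package feature: `transcribe_all` is a hypothetical helper that accepts any object exposing the `transcribe`, `save_plain_text`, and `save_srt` methods with the keyword arguments shown above, and it catches a broad `Exception` because the package's own error types are not documented here.

```python
from pathlib import Path


def transcribe_all(transcriber, media_files, output_dir):
    """Transcribe each file, write <stem>.txt and <stem>.srt, and return failures."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    failures = {}
    for media_path in map(Path, media_files):
        try:
            segments = transcriber.transcribe(media_path=media_path)
            transcriber.save_plain_text(
                segments=segments, out_path=output_dir / f"{media_path.stem}.txt"
            )
            transcriber.save_srt(
                segments=segments, out_path=output_dir / f"{media_path.stem}.srt"
            )
        except Exception as exc:  # broad on purpose: error types are an unknown here
            failures[media_path] = exc
    return failures
```

With the `transcriber` built above, `transcribe_all(transcriber, ["a.mp4", "b.mp3"], "output")` would return a dict mapping each failed path to its exception, so one bad file does not stop the rest.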

Contributing

Contributions are welcome. Feel free to open an issue or submit a pull request.

Here are a few ideas for future development:

TODO

  • Add support for additional speech-to-text models (e.g., OpenAI Whisper or other open-source models)
  • Extend the Summarizer to generate concise summaries of transcripts
  • Extend the Translator to support source → target language translation
