
Local audio and YouTube transcription CLI


ytxt

Local-first, privacy-focused transcription CLI.

ytxt is a developer-focused tool for transcribing audio from YouTube, other web URLs, or local files. It chains yt-dlp (download) and faster-whisper (transcription) into one automated pipeline that runs entirely on your machine.


Why ytxt?

  • Zero-Cloud Privacy: Transcription runs locally, so audio and transcripts are never uploaded to a third-party service. Well suited to sensitive meetings or private research.
  • High Performance: Powered by faster-whisper (CTranslate2), which is up to 4x faster than OpenAI's original implementation.
  • Batteries Included: Handles downloading, audio extraction (via ffmpeg), and transcription in one command.
  • Smart Caching: ytxt hashes its inputs and skips transcription for anything it has already processed, so repeated runs avoid redundant computation.
  • Universal: Supports 1,000+ sites including YouTube, Spotify (Podcasts), and SoundCloud via yt-dlp.
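
The caching idea above can be illustrated with a minimal sketch. This is not ytxt's actual implementation: the key scheme, the cache layout, and the helper names `cache_key` and `cached_transcribe` are all hypothetical. It assumes the cache key is derived from the input (file contents for local files, the raw string for URLs) together with the model name, and that the transcriber is only called on a cache miss.

```python
import hashlib
import json
from pathlib import Path


def cache_key(source: str, model: str) -> str:
    """Return a stable key for (input, model). Hypothetical scheme, not ytxt's."""
    h = hashlib.sha256()
    h.update(model.encode("utf-8"))
    h.update(b"\0")
    path = Path(source)
    if path.is_file():
        # Local file: hash the contents so a renamed copy still hits the cache.
        with path.open("rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
    else:
        # Anything else (for example a URL) is hashed as a plain string.
        h.update(source.encode("utf-8"))
    return h.hexdigest()


def cached_transcribe(source, model, transcribe, cache_dir):
    """Call transcribe(source, model) only when no cached result exists."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    entry = cache_dir / f"{cache_key(source, model)}.json"
    if entry.exists():
        return json.loads(entry.read_text(encoding="utf-8"))
    result = transcribe(source, model)
    entry.write_text(json.dumps(result), encoding="utf-8")
    return result
```

Including the model name in the key means that switching from one model size to another produces a fresh transcription instead of returning a stale result.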

Installation

Requires ffmpeg to be installed on your system.

# Using pip
pip install ytxt

# Using uv (recommended for speed)
uv tool install ytxt

Quick Start (CLI)

Transcribe any YouTube video to Markdown with timestamps:

ytxt "https://www.youtube.com/watch?v=dQw4w9WgXcQ" --format markdown --timestamps --output transcript.md

Power User Tricks

Pipe to an LLM for summarization:

ytxt <url> | llm "Summarize this transcript for a technical audience"

Extract the text of each segment with jq:

ytxt <url> --format json | jq '.[].text'

Data Pipelines & Automation

ytxt is designed to be a composable component in your data infrastructure. Status logs are routed to stderr, so stdout carries only the transcript and can be piped or parsed directly.
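
As a rough sketch of what the stdout/stderr split makes possible, the helper below runs a command, parses its stdout as JSON, and passes its stderr logs through untouched. The helper name `run_json_cli` is hypothetical, and the shape of the JSON (a list of objects with a `text` key) is an assumption inferred from the jq example, not a documented schema.

```python
import json
import subprocess
import sys


def run_json_cli(cmd: list[str]) -> list[dict]:
    """Run a command, parse its stdout as JSON, and forward its stderr logs."""
    proc = subprocess.run(cmd, capture_output=True, text=True, check=True)
    if proc.stderr:
        # Status messages stay out of the data stream; pass them through as-is.
        sys.stderr.write(proc.stderr)
    return json.loads(proc.stdout)


# Intended use (requires ytxt on PATH; "<url>" is a placeholder):
# segments = run_json_cli(["ytxt", "<url>", "--format", "json"])
# print(" ".join(s["text"] for s in segments))
```

With `check=True`, a non-zero exit status raises `subprocess.CalledProcessError`, so a failed run is never mistaken for an empty transcript.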

  • RAG Pipelines: Use ytxt as an ingestion layer to feed YouTube transcripts directly into vector databases like Pinecone or Chroma.
  • AI Agents: Pipe transcripts directly into LLMs for summarization, sentiment analysis, or entity extraction.
  • Subtitles: Generate industry-standard .srt files for video editing workflows.
  • Scheduled Jobs: Run ytxt in a cron job or GitHub Action to monitor and transcribe new videos from a playlist.
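
For the RAG use case, transcripts usually need to be grouped into passages before embedding. The sketch below is one possible approach, not part of ytxt: `chunk_segments` is a hypothetical helper, and it assumes segments are dicts with `start`, `end`, and `text` keys. Embedding the chunks and writing them to a vector database is left out.

```python
def chunk_segments(segments, max_chars=500):
    """Group consecutive segments into chunks of roughly max_chars characters.

    Each chunk keeps the start time of its first segment and the end time of
    its last, so a retrieved passage can be traced back to the recording.
    """
    chunks = []
    current = None
    for seg in segments:
        text = seg["text"].strip()
        if not text:
            continue
        # Close the current chunk if adding this segment would overflow it.
        if current is not None and len(current["text"]) + 1 + len(text) > max_chars:
            chunks.append(current)
            current = None
        if current is None:
            current = {"start": seg["start"], "end": seg["end"], "text": text}
        else:
            current["text"] += " " + text
            current["end"] = seg["end"]
    if current is not None:
        chunks.append(current)
    return chunks
```

A single segment longer than `max_chars` is kept whole as its own chunk, so no segment is ever split mid-sentence.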

Library Usage

ytxt is designed to be imported into your own Python automation scripts.

from ytxt import download_audio, transcribe_audio

# 1. Download & Extract
audio_path = download_audio("https://youtube.com/...")

# 2. Transcribe Locally
transcript = transcribe_audio(audio_path, model_size="medium")

# 3. Use the result (list of dicts with 'start', 'end', 'text')
for segment in transcript:
    print(f"[{segment['start']}] {segment['text']}")
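
To show how the segment list maps onto a subtitle file, here is a minimal sketch that renders it as SRT. The helpers `srt_timestamp` and `segments_to_srt` are hypothetical and not part of ytxt, and the code assumes `start` and `end` are seconds expressed as numbers. The CLI's own `--format srt` output may differ in detail.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    return "\n".join(cues)
```

Passing the `transcript` list from the example above to `segments_to_srt` and writing the result to a `.srt` file would produce one cue per segment.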

Configuration

| Option | Description | Default |
| --- | --- | --- |
| --model | Whisper model size (tiny, base, small, medium, large-v3) | base |
| --format | Output format (text, markdown, srt, json) | text |
| --timestamps | Include timestamps in text/markdown output | False |
| --no-cache | Force re-transcription by ignoring cache | False |

Development

git clone https://github.com/rayanrane/ytxt.git
cd ytxt
pip install -e .

License

MIT
