Transcribe any YouTube video into a structured Markdown document

yt2doc

yt2doc transcribes online videos into structured documents in Markdown format.

Supported video sources:

  • YouTube
  • Twitter

yt2doc is meant to work fully locally, without invoking any external API. The OpenAI SDK dependency is required solely to interact with Ollama.

Installation

Install with pipx:

pipx install yt2doc

Or install with uv:

uv tool install yt2doc

Usage

Show the help information:

yt2doc --help

To transcribe a video (on YouTube or Twitter) into a document:

yt2doc --video <video-url>

To save your transcription:

yt2doc --video <video-url> -o some_dir/transcription.md

To transcribe all videos in a YouTube playlist:

yt2doc --playlist <playlist-url> -o some_dir
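
For videos that are not grouped in a playlist, the single-video command above can be driven from a small script. The following is a minimal sketch, not part of yt2doc itself: the helper names `build_command` and `transcribe_all` and the `video_<n>.md` file naming are hypothetical, and it assumes `yt2doc` is on your `PATH`. It uses only the `--video` and `-o` options shown above.

```python
import subprocess
from pathlib import Path


def build_command(video_url: str, output_path: Path) -> list[str]:
    """Build the argument list for one yt2doc run (only --video and -o are used)."""
    return ["yt2doc", "--video", video_url, "-o", str(output_path)]


def transcribe_all(
    video_urls: list[str], output_dir: Path, dry_run: bool = False
) -> list[list[str]]:
    """Transcribe each URL into output_dir/video_<n>.md and return the commands.

    The video_<n>.md naming is a hypothetical choice made for this sketch.
    """
    output_dir.mkdir(parents=True, exist_ok=True)
    commands = []
    for index, url in enumerate(video_urls, start=1):
        command = build_command(url, output_dir / f"video_{index}.md")
        commands.append(command)
        if not dry_run:
            # check=True raises CalledProcessError if yt2doc exits with an error.
            subprocess.run(command, check=True)
    return commands
```

Calling `transcribe_all(urls, Path("some_dir"), dry_run=True)` returns the commands without running them, which is a cheap way to check the output paths first.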

(Ollama required) If the video is not chaptered, yt2doc can segment it into chapters and add a heading to each chapter:

yt2doc --video <video-url> --segment-unchaptered --llm-model <model-name>

Among the smaller models, Qwen 2.5 7B seems to work best.
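
Because this step depends on a local Ollama instance, it can help to confirm that the model has been pulled before starting a long transcription. The sketch below is an illustration under assumptions, not something yt2doc does for you: it assumes Ollama is listening on its default local address and queries Ollama's model-listing endpoint (`/api/tags`). The helper names are hypothetical, and the model tag `qwen2.5:7b` in the usage note is only an example of what you might pass to `--llm-model`.

```python
import json
from urllib.request import urlopen

# Assumption: Ollama is running locally on its default port.
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"


def installed_models(tags_payload: dict) -> set[str]:
    """Extract model names from the JSON returned by Ollama's /api/tags."""
    return {entry["name"] for entry in tags_payload.get("models", [])}


def is_model_available(model_name: str, tags_payload: dict) -> bool:
    """Return True if model_name appears among the locally pulled models."""
    return model_name in installed_models(tags_payload)


def fetch_tags(url: str = OLLAMA_TAGS_URL, timeout: float = 5.0) -> dict:
    """Fetch the list of local models; raises URLError if Ollama is unreachable."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

For example, `is_model_available("qwen2.5:7b", fetch_tags())` should tell you whether that tag is present locally; if it is not, pull the model with Ollama before running yt2doc.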

On macOS devices running Apple Silicon, whisper.cpp is supported (in a somewhat hacky way):

yt2doc --video <video-url> --whisper-backend whisper_cpp --whisper-cpp-executable <path-to-whisper-cpp-executable> --whisper-cpp-model <path-to-whisper-cpp-model>
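
Since the whisper.cpp backend takes two filesystem paths, a wrapper can check them before launching yt2doc. This is a minimal sketch under assumptions: the helper names `missing_files` and `build_whisper_cpp_command` are hypothetical, and the video URL is passed to `--video` in the same way as in the other commands. Only the flags shown above are used.

```python
from pathlib import Path


def missing_files(*paths: Path) -> list[Path]:
    """Return the paths that do not point to an existing file."""
    return [path for path in paths if not path.is_file()]


def build_whisper_cpp_command(
    video_url: str, executable: Path, model: Path
) -> list[str]:
    """Build the yt2doc argument list for the whisper.cpp backend."""
    return [
        "yt2doc",
        "--video", video_url,
        "--whisper-backend", "whisper_cpp",
        "--whisper-cpp-executable", str(executable),
        "--whisper-cpp-model", str(model),
    ]
```

A caller would typically stop with an error if `missing_files(executable, model)` is non-empty, and otherwise pass the result of `build_whisper_cpp_command` to `subprocess.run`.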

yt2doc uses Segment Any Text (SaT) to segment the transcript into sentences and paragraphs. You can change the SaT model:

yt2doc --video <video-url> --sat-model <sat-model>

List of available SaT models here.

TODOs

  • Tests and evaluation
  • CI/CD
  • Better Whisper prompting strategy (it currently depends heavily on the title and description of the video).


