Transcribe any YouTube video into a structured Markdown document
yt2doc
yt2doc transcribes online videos into structured documents in Markdown format.
Supported video sources:
- YouTube
yt2doc is meant to work fully locally, without invoking any external API. The OpenAI SDK dependency is required solely to interact with Ollama.
Installation
Install with pipx:
pipx install yt2doc
Or install with uv:
uv tool install yt2doc
Usage
Show help information:
yt2doc --help
To transcribe a video (on YouTube or Twitter) into a document:
yt2doc --video <video-url>
To save your transcription:
yt2doc --video <video-url> -o some_dir/transcription.md
To transcribe all videos in a YouTube playlist:
yt2doc --playlist <playlist-url> -o some_dir
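For scripted use, the commands above can be wrapped from Python. The following is a minimal sketch and not part of yt2doc: `build_yt2doc_command` and `run_yt2doc` are hypothetical helpers that only assemble the `--video`, `--playlist`, and `-o` options shown above, and they assume the `yt2doc` executable is on `PATH`.

```python
import subprocess
from typing import List, Optional


def build_yt2doc_command(
    video_url: Optional[str] = None,
    playlist_url: Optional[str] = None,
    output: Optional[str] = None,
) -> List[str]:
    """Assemble a yt2doc argument list (hypothetical helper, not part of yt2doc)."""
    # Exactly one source must be given: a single video or a playlist.
    if (video_url is None) == (playlist_url is None):
        raise ValueError("pass exactly one of video_url or playlist_url")

    command = ["yt2doc"]
    if video_url is not None:
        command += ["--video", video_url]
    else:
        command += ["--playlist", playlist_url]

    # In the examples above, -o is a file path for a single video
    # and a directory for a playlist.
    if output is not None:
        command += ["-o", output]
    return command


def run_yt2doc(command: List[str]) -> None:
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(command, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with URLs that contain characters such as `&` or `?`.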
(Ollama required) If the video has no chapters, yt2doc can split it into chapters and add a heading to each one:
yt2doc --video <video-url> --segment-unchaptered --llm-model <model-name>
Among the smaller models, qwen 2.5 7b seems to work best.
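Because this option needs a running Ollama server with the model already pulled, it can help to check for the model before starting a long transcription. The sketch below is an illustration and not part of yt2doc. It assumes Ollama is listening on its default local address and uses Ollama's `/api/tags` endpoint, which lists locally available models; the helper names are hypothetical.

```python
import json
import urllib.request
from typing import List

# Assumption: Ollama is running locally on its default port.
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"


def parse_model_names(tags_response: str) -> List[str]:
    """Extract model names from the JSON body returned by Ollama's /api/tags."""
    payload = json.loads(tags_response)
    return [model["name"] for model in payload.get("models", [])]


def is_model_available(model_name: str, tags_url: str = OLLAMA_TAGS_URL) -> bool:
    """Return True if the local Ollama server lists model_name."""
    with urllib.request.urlopen(tags_url, timeout=5) as response:
        names = parse_model_names(response.read().decode("utf-8"))
    return model_name in names
```

The name to check is presumably the same one passed to `--llm-model`, written the way Ollama reports it (for example a tag of the form `qwen2.5:7b`), though that mapping is an assumption here. `is_model_available` raises `urllib.error.URLError` if no server is reachable.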
On macOS devices running Apple Silicon, (hacky) support for whisper.cpp is available:
yt2doc --video <video-url> --whisper-backend whisper_cpp --whisper-cpp-executable <path-to-whisper-cpp-executable> --whisper-cpp-model <path-to-whisper-cpp-model>
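A wrapper script might want to select the whisper.cpp backend only when it applies. The following is a rough sketch under stated assumptions, not yt2doc's own logic: `is_apple_silicon` and `whisper_cpp_args` are hypothetical helpers, and the only options used are the three whisper.cpp flags shown above.

```python
import os
import platform
from typing import List


def is_apple_silicon(system: str = "", machine: str = "") -> bool:
    """Return True on macOS running natively on an arm64 CPU.

    Arguments default to the current platform; pass them explicitly to test.
    A Python interpreter running under Rosetta reports x86_64, so this
    returns False there even on Apple Silicon hardware.
    """
    system = system or platform.system()
    machine = machine or platform.machine()
    return system == "Darwin" and machine == "arm64"


def whisper_cpp_args(executable: str, model: str) -> List[str]:
    """Validate both paths and return the whisper.cpp options for yt2doc."""
    if not (os.path.isfile(executable) and os.access(executable, os.X_OK)):
        raise FileNotFoundError(f"not an executable file: {executable}")
    if not os.path.isfile(model):
        raise FileNotFoundError(f"model file not found: {model}")
    return [
        "--whisper-backend", "whisper_cpp",
        "--whisper-cpp-executable", executable,
        "--whisper-cpp-model", model,
    ]
```

The returned list can be appended to a `yt2doc --video <video-url>` argument list when `is_apple_silicon()` is true, leaving the default backend in place otherwise.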
yt2doc uses Segment Any Text (SaT) to segment the transcript into sentences and paragraphs. You can change the SaT model:
yt2doc --video <video-url> --sat-model <sat-model>
List of available SaT models here.
TODOs
- Tests and evaluation
- CI/CD
- Better Whisper prompting strategy (the prompt currently depends heavily on the title and description of the video).