yt-instruct

Convert YouTube videos into structured markdown instruction documents.

Downloads audio via yt-dlp, transcribes it with Mistral's Voxtral API, then generates a clean how-to document using Claude (the default backend; see --backend).

Quick Start

# Run with uvx from a local checkout (no install needed)
uvx --from . yt-instruct "https://www.youtube.com/watch?v=<id>"

# Or install from a local checkout (editable)
pip install -e .
yt-instruct "https://www.youtube.com/watch?v=<id>"

Requirements

ffmpeg: install with brew install ffmpeg (macOS) or apt install ffmpeg (Debian/Ubuntu)
MISTRAL_API_KEY: for transcription; available from console.mistral.ai
ANTHROPIC_API_KEY: for the default anthropic backend
NVIDIA_API_KEY: only for --backend nvidia
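
The requirements above can be checked before starting a long run. The following is a minimal sketch, not part of yt-instruct: the function name and the backend-to-variable mapping are illustrative, and it assumes the llm backend keeps its key in the llm tool's own key store rather than in an environment variable.

```python
import os
import shutil

# Assumed mapping from a --backend value to the environment variable it needs.
# The llm backend is assumed to manage its key via `llm keys set`.
BACKEND_KEYS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "nvidia": "NVIDIA_API_KEY",
    "llm": None,
}


def missing_requirements(backend="anthropic", env=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    problems = []
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    if not env.get("MISTRAL_API_KEY"):
        problems.append("MISTRAL_API_KEY is not set")
    key = BACKEND_KEYS[backend]
    if key and not env.get(key):
        problems.append(f"{key} is not set")
    return problems
```

Calling missing_requirements("nvidia") in a shell where only MISTRAL_API_KEY is exported would report the missing NVIDIA_API_KEY.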

Usage

yt-instruct [OPTIONS] URL [URL...]
yt-instruct [OPTIONS] --url-file urls.txt
yt-instruct [OPTIONS] --transcript-file transcript.txt --title "Name"
yt-instruct [OPTIONS] --audio-file recording.mp3 --title "Name"

Options:
  --output-dir PATH              Output directory [default: .]
  --keep                         Keep intermediate audio + transcript files
  --merge                        Merge all videos into one document
  --resume                       Skip already-generated outputs; reuse cached transcripts
  --no-generate                  Stop after transcription; skip LLM generation
  --content-type [tutorial|lecture|ib|auto]
                                 Prompt style [default: auto]
  --backend [anthropic|llm|nvidia]
                                 LLM backend [default: anthropic]
  --model TEXT                   Model name [default: claude-sonnet-4-6]
  --prompt-file PATH             Custom system prompt (overrides built-in)
  --language LANG                Output language (e.g. 'French'). Defaults to English.
  --transcript-file PATH         Use existing transcript; skips download and transcription
  --audio-file PATH              Use existing audio file; skips download, transcribes directly
  --title TEXT                   Video title for --transcript-file or --audio-file
  --draft                        Set draft: true in the output frontmatter [default: false]
  --mistral-model TEXT           [default: voxtral-mini-latest]
  --audio-format [mp3|m4a]       [default: mp3]
  --version                      Show version and exit

Output Frontmatter

Every generated file includes YAML frontmatter:

---
title: "Video Title"
url: https://youtu.be/...
description: "YouTube video description"
date: 2026-04-12
draft: false
---

Use --draft to set draft: true (useful for Hugo, Jekyll, or similar static site generators). Merged documents (--merge) do not include frontmatter.
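
If you post-process generated files yourself, the frontmatter can be separated from the body with a few lines of Python. This is a sketch that assumes the flat key: value layout shown above; split_frontmatter is a hypothetical helper, not something yt-instruct provides, and it is not a general YAML parser.

```python
def split_frontmatter(text):
    """Split a document into (frontmatter dict, body).

    Handles only flat `key: value` lines between two `---` delimiters.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # e.g. merged documents, which have no frontmatter
    try:
        end = lines.index("---", 1)
    except ValueError:
        return {}, text  # no closing delimiter, so treat everything as body
    meta = {}
    for line in lines[1:end]:
        # Split on the first colon only, so URLs keep their "https:" prefix.
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip().strip('"')
    body = "\n".join(lines[end + 1:])
    return meta, body
```

All values come back as strings, so a draft check would compare against "true" rather than a boolean.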

Content Types

Type	Use for
`auto`	Let the LLM detect (default)
`tutorial`	How-to / step-by-step videos
`lecture`	Tech talks, academic presentations
`ib`	IB student subject videos

Custom Prompts

Override the built-in system prompt with your own file via --prompt-file. The file may use these template variables: {title}, {channel}, {content_type}, {duration}.

yt-instruct <url> --prompt-file my_prompt.md
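
To preview how a prompt file might look once its variables are filled in, a small stand-in renderer is enough. The variable names come from the list above; the substitution mechanism is an assumption, since the README does not describe how yt-instruct performs it, and render_prompt is a hypothetical helper.

```python
# Variable names are taken from the README; everything else is illustrative.
TEMPLATE_VARIABLES = ("title", "channel", "content_type", "duration")


def render_prompt(template, **values):
    """Replace each known {placeholder} and leave any other braces alone."""
    rendered = template
    for name in TEMPLATE_VARIABLES:
        if name in values:
            rendered = rendered.replace("{" + name + "}", str(values[name]))
    return rendered
```

Plain string replacement is used instead of str.format so that unrelated braces in a prompt (for example a JSON snippet) do not raise a KeyError.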

Using the `llm` backend

pip install llm llm-anthropic
llm keys set anthropic
yt-instruct <url> --backend llm --model claude-sonnet-4-6

Using the `nvidia` backend

NVIDIA_API_KEY=... yt-instruct <url> --backend nvidia --model moonshotai/kimi-k2-instruct

Batch Processing

# Multiple URLs
yt-instruct url1 url2 url3 --output-dir ./docs

# Playlist (automatically expanded)
yt-instruct https://www.youtube.com/playlist?list=<id> --output-dir ./docs

# From stdin (use --url-file urls.txt to read a file directly)
cat urls.txt | yt-instruct --url-file /dev/stdin

# Merge all into one doc
yt-instruct url1 url2 --merge --output-dir ./docs
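
Batch runs can also be driven from a script. The sketch below only assembles flags documented in the Usage section and hands them to subprocess; build_command and run_batch are hypothetical helper names, and the sketch assumes yt-instruct is on PATH.

```python
import subprocess


def build_command(urls, output_dir=".", merge=False, resume=False):
    """Build the argument list for one yt-instruct invocation."""
    if not urls:
        raise ValueError("at least one URL is required")
    cmd = ["yt-instruct", *urls, "--output-dir", str(output_dir)]
    if merge:
        cmd.append("--merge")
    if resume:
        cmd.append("--resume")
    return cmd


def run_batch(urls, **options):
    """Run yt-instruct; check=True raises CalledProcessError on a non-zero exit."""
    return subprocess.run(build_command(urls, **options), check=True)
```

Passing the arguments as a list avoids shell quoting problems with characters such as ? and & in video URLs.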

Skip Steps — Use Existing Files

--audio-file and --transcript-file resolve relative to --output-dir if the file isn't found at the given path. This lets you reference files already in the output directory without typing the full path:

# Start from an existing transcript (skips download + transcription)
yt-instruct --transcript-file transcript.txt --title "My Video" --output-dir ./docs

# If my_transcript.txt is not found at the given path, it is looked up in ./docs
yt-instruct --transcript-file my_transcript.txt --output-dir ./docs

# Start from an existing audio file (skips download, still transcribes)
yt-instruct --audio-file recording.mp3 --output-dir ./docs
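
The lookup order described above can be pictured as follows. This is a sketch of the documented behaviour, not yt-instruct's actual code: resolve_input is a hypothetical name, and joining the whole relative path onto the output directory is an assumption.

```python
from pathlib import Path


def resolve_input(path, output_dir="."):
    """Return the given path if it exists, else the same path under output_dir."""
    candidate = Path(path)
    if candidate.exists():
        return candidate
    fallback = Path(output_dir) / candidate
    if fallback.exists():
        return fallback
    raise FileNotFoundError(f"{path} not found directly or under {output_dir}")
```

The given path always wins, so a file in the current directory shadows a file of the same name in the output directory.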

Resume an Interrupted Run

Use --keep to save transcripts alongside output files, then --resume to continue from where a previous run stopped:

# First run (interrupted partway through)
yt-instruct --url-file urls.txt --keep --output-dir ./docs

# Resume — skips videos with existing output; reuses cached transcripts
yt-instruct --url-file urls.txt --resume --output-dir ./docs

--resume checks at two levels per video:

1. Output .md already exists → skip the video entirely
2. Cached *_transcript.txt exists (saved by --keep) → skip download and transcription; run generation only
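
The two checks amount to a small decision function. The sketch below is illustrative only: resume_action and the stem argument are hypothetical, and the README does not say how yt-instruct derives file names beyond the *_transcript.txt suffix.

```python
from pathlib import Path


def resume_action(output_dir, stem):
    """Decide what a resumed run would do for one video.

    `stem` is an assumed per-video base name shared by the output file
    and its cached transcript.
    """
    out = Path(output_dir)
    if (out / f"{stem}.md").exists():
        return "skip"            # output already generated
    if (out / f"{stem}_transcript.txt").exists():
        return "generate-only"   # reuse the cached transcript
    return "full"                # download, transcribe, generate
```

Checking the output first means a finished video is skipped even when its transcript was never kept.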

Changelog

See CHANGELOG.md for release history.

Release history

1.2.0 (this version): May 2, 2026
1.1.0: May 2, 2026
1.0.0: Apr 12, 2026

Download files

Source Distribution

yt_instruct-1.2.0.tar.gz (18.6 kB), uploaded May 2, 2026, Source

Built Distribution

yt_instruct-1.2.0-py3-none-any.whl (18.4 kB), uploaded May 2, 2026, Python 3

File details

Details for the file yt_instruct-1.2.0.tar.gz.

File metadata

Download URL: yt_instruct-1.2.0.tar.gz
Upload date: May 2, 2026
Size: 18.6 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for yt_instruct-1.2.0.tar.gz
Algorithm	Hash digest
SHA256	`e67fb81c7da41eb1d93b0900abbc2795ac1a11902c9f54e93ac807466ab4fa16`
MD5	`f3bdf373fbd4b9ae8ec0e21fba089b70`
BLAKE2b-256	`9d9017529c1a29febd9f30ceac7b97c1882cb99e2d852c9155ac94c265fd8f99`

Provenance

The following attestation bundles were made for yt_instruct-1.2.0.tar.gz:

Publisher: publish.yml on divyavanmahajan/yt-instruct

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: yt_instruct-1.2.0.tar.gz
- Subject digest: e67fb81c7da41eb1d93b0900abbc2795ac1a11902c9f54e93ac807466ab4fa16
- Sigstore transparency entry: 1428900966
- Sigstore integration time: May 2, 2026
Source repository:
- Permalink: divyavanmahajan/yt-instruct@596c556d56ef0f6e0cd47d6278fcd6086f6ef7ad
- Branch / Tag: refs/tags/v1.2.0
- Owner: https://github.com/divyavanmahajan
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@596c556d56ef0f6e0cd47d6278fcd6086f6ef7ad
- Trigger Event: push

File details

Details for the file yt_instruct-1.2.0-py3-none-any.whl.

File metadata

Download URL: yt_instruct-1.2.0-py3-none-any.whl
Upload date: May 2, 2026
Size: 18.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for yt_instruct-1.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1f4b63918206a906511202bd306fb5974d2f6772039b8b40ce5caeef17f965b0`
MD5	`89630a766c4a2fdc6fc9fe7fe59441af`
BLAKE2b-256	`ad4c04691c9969e688fa4c741cd5dbce4ff8bc78752047cb102545e43259e35f`

Provenance

The following attestation bundles were made for yt_instruct-1.2.0-py3-none-any.whl:

Publisher: publish.yml on divyavanmahajan/yt-instruct

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: yt_instruct-1.2.0-py3-none-any.whl
- Subject digest: 1f4b63918206a906511202bd306fb5974d2f6772039b8b40ce5caeef17f965b0
- Sigstore transparency entry: 1428900969
- Sigstore integration time: May 2, 2026
Source repository:
- Permalink: divyavanmahajan/yt-instruct@596c556d56ef0f6e0cd47d6278fcd6086f6ef7ad
- Branch / Tag: refs/tags/v1.2.0
- Owner: https://github.com/divyavanmahajan
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@596c556d56ef0f6e0cd47d6278fcd6086f6ef7ad
- Trigger Event: push
