
Turn tutorial videos into reproducible Repeat Guides — and let agents run them.


WithVideo

License: MIT Python PyPI

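WithVideo is a command-line tool that turns tutorial videos into reproducible step-by-step guides.

It accepts a local video file or an online video URL and automatically handles media acquisition, subtitles/transcription, activity detection, semantic analysis, and optional visual analysis. It then writes two outputs:

  • guide.md: the Repeat Guide, for human readers
  • semantic.json: the structured semantic result, for machine consumption

The repository currently contains two tracks:

  • withvideo/: the main tool, wv learn
  • act_with_video/: an execution-side prototype that reads guide.md / semantic.json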

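What it can do today

  • Two kinds of input are supported:
    • local raw video files
    • network URLs
  • Verified URL platforms:
    • YouTube
    • Bilibili
  • Local videos automatically pick up sidecar files from the same directory:
    • source.info.json
    • .srt / .vtt
  • For YouTube, subtitles in the original language are downloaded first, rather than pulling every subtitle language by default
  • For Bilibili, site-level cookies are cached, which reduces repeated browser keychain authorization prompts
  • --transcript auto chooses dynamically between platform subtitles and Whisper based on the actual SourceMeta
  • Activity detection downsamples long videos automatically, so the ffmpeg frame-extraction stage is less likely to time out
  • Semantic analysis and guide generation accept a custom llm_command, which can point at codex, claude, or another command later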
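
The sidecar pickup described above can be pictured with a minimal sketch. It is not WithVideo's actual code: the function name `find_sidecars` and the matching rule (a sidecar shares the video's file stem) are assumptions made for illustration, and the real rule may differ.

```python
from pathlib import Path

SUBTITLE_SUFFIXES = {".srt", ".vtt"}


def find_sidecars(video_path):
    """Collect sidecar files that sit next to a local video.

    Assumption (not confirmed by the README): a sidecar is either
    `<stem>.info.json` or a subtitle file named `<stem>.<anything>.srt|vtt`.
    """
    video = Path(video_path)
    prefix = video.stem + "."
    info = video.with_name(f"{video.stem}.info.json")
    subtitles = sorted(
        (
            p
            for p in video.parent.iterdir()
            if p.is_file()
            and p.name.startswith(prefix)
            and p.suffix.lower() in SUBTITLE_SUFFIXES
        ),
        key=lambda p: p.name,
    )
    return {
        "info_json": info if info.is_file() else None,
        "subtitles": subtitles,
    }
```

For a directory holding source.mp4, source.info.json, and source.en.vtt, calling `find_sidecars("source.mp4")` would return the info file and the one subtitle file, and ignore subtitles that belong to other videos.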

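Current limitations

  • wv act supports real execution for Tier 0/1 (ShellExecutor, FileWriteExecutor, URLOpenExecutor, ClaudeCodeExecutor); Tier 2/3 (Browser / ComputerUse) are still on the roadmap
  • --vision none is currently the most stable way to run; the vision path is still evolving
  • Very long videos can still be slow in the LLM stage; pairing them with --llm-timeout is recommended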

Requirements

  • Python 3.11+
  • ffmpeg / ffprobe (see "System dependencies" below)
  • uv is recommended

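System dependencies

ffmpeg / ffprobe are not installed by pip. Install them at the system level first:

Platform        Install command
macOS           brew install ffmpeg
Ubuntu/Debian   sudo apt install -y ffmpeg
Windows         choco install ffmpeg, or download from ffmpeg.org

Once installed, ffmpeg -version should print normally. wv preflight checks for them automatically.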
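
If you want to check for the same system dependencies from your own script, the following is a minimal sketch. It only looks for the binaries on PATH with the standard library; it is not how wv preflight is implemented, and the function name `missing_tools` is hypothetical.

```python
import shutil


def missing_tools(tools=("ffmpeg", "ffprobe"), which=shutil.which):
    """Return the names of required tools that are not found on PATH."""
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing system dependencies:", ", ".join(missing))
    else:
        print("ffmpeg and ffprobe found on PATH")
```

The `which` parameter exists only so the lookup can be swapped out in tests.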

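Python dependencies

Common combinations for running the full URL pipeline:

# Apple Silicon (mlx-whisper is faster)
uv pip install "withvideo[whisper,youtube,bilibili]"

# Other platforms (faster-whisper, works on CPU and CUDA)
uv pip install "withvideo[faster-whisper,youtube,bilibili]"

If you only want to run it for local development, you can also add dependencies temporarily with uv run --with ....

YouTube / Bilibili download failing? Run pip install -U yt-dlp. Anti-scraping rules change frequently, and keeping yt-dlp up to date usually resolves it.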

Quick start

1. Analyze a local video

uv run --python 3.11 --with mlx-whisper python -m withvideo.cli learn \
  "/path/to/source.mp4" \
  -o .withvideo/demo-local \
  --vision none \
  -v

2. Analyze a YouTube video

uv run --python 3.11 --with youtube-transcript-api --with mlx-whisper python -m withvideo.cli learn \
  "https://www.youtube.com/watch?v=xZaSPw14Cfo" \
  -o .withvideo/demo-youtube \
  --vision none \
  --llm-timeout 600 \
  -v

3. Analyze a Bilibili video

Anonymous download:

uv run --python 3.11 --with mlx-whisper python -m withvideo.cli learn \
  "https://www.bilibili.com/video/BV1StX3B7E9X/" \
  -o .withvideo/demo-bilibili \
  --vision none \
  -v

Using the browser's logged-in session:

uv run --python 3.11 --with mlx-whisper python -m withvideo.cli learn \
  "https://www.bilibili.com/video/BV1StX3B7E9X/" \
  -o .withvideo/demo-bilibili-login \
  --cookies-from-browser chrome:Default \
  --vision none \
  -v

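The first time browser cookies are read, macOS may show a keychain authorization prompt. Later runs reuse the site-level cookies cached inside the project by default.

4. Use the installed wv directly

If you have installed the project as an executable script, you can run: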

wv learn "https://www.youtube.com/watch?v=xZaSPw14Cfo" -o .withvideo/demo --vision none
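
To drive wv learn from another Python program, one option is to build the argument list and hand it to subprocess. The sketch below uses only flags that appear in this README; the helper name `build_learn_command` is hypothetical, and it assumes wv is on PATH.

```python
import shlex
import subprocess


def build_learn_command(source, out_dir, vision="none", llm_timeout=None, verbose=False):
    """Build the argv list for `wv learn` from flags shown in this README."""
    cmd = ["wv", "learn", source, "-o", out_dir, "--vision", vision]
    if llm_timeout is not None:
        cmd += ["--llm-timeout", str(llm_timeout)]
    if verbose:
        cmd.append("-v")
    return cmd


def run_learn(source, out_dir, **options):
    """Run `wv learn` and raise CalledProcessError on a non-zero exit."""
    cmd = build_learn_command(source, out_dir, **options)
    print("Running:", shlex.join(cmd))
    subprocess.run(cmd, check=True)
```

Passing a list rather than a shell string avoids quoting problems with URLs that contain `?` or `&`.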

Typical output

After a successful run, the output directory usually looks like this:

.withvideo/demo/
└── 10_Months_of_Unity_Dev_with_Claude_Code/
    ├── guide.md
    ├── semantic.json
    ├── source.mp4
    ├── source.info.json
    ├── source.en-orig.vtt
    ├── source.en.vtt
    └── keyframes/

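Where:

  • guide.md is the reproduction guide for human readers
  • semantic.json is the authoritative input for wv act and later automation
  • source.* are the media and sidecar files acquired in Stage 0
  • keyframes/ holds intermediate artifacts from the activity detection / visual analysis stages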
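
A downstream script can pick up the two outputs roughly as follows. This is a sketch under one assumption only, that both files sit directly in the run directory as in the tree above. The schema of semantic.json is not documented here, so the code inspects just its top-level shape and assumes no field names. The helper names are hypothetical.

```python
import json
from pathlib import Path


def load_run_outputs(run_dir):
    """Read guide.md as text and semantic.json as parsed JSON."""
    run = Path(run_dir)
    guide = (run / "guide.md").read_text(encoding="utf-8")
    semantic = json.loads((run / "semantic.json").read_text(encoding="utf-8"))
    return guide, semantic


def top_level_shape(semantic):
    """Describe the parsed JSON without assuming any field names."""
    if isinstance(semantic, dict):
        return sorted(semantic.keys())
    return type(semantic).__name__
```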

Common options

  • --transcript auto|platform-subs|mlx-whisper
  • --vision cli|claude|ollama:<model>|none
  • --activity auto|mv|ssim
  • --media-access download|remote
  • --llm-command 'codex exec -'
  • --llm-command 'claude -p {prompt}'
  • --llm-timeout 600
  • --cookies /path/to/cookies.txt
  • --cookies-from-browser chrome:Default
  • --event-stream jsonl
  • --decision-mode auto|prompt|fail
  • --force

By default, llm_command is auto-detected in this order:

  1. WITHVIDEO_LLM_COMMAND / WV_LLM_COMMAND
  2. codex exec -
  3. claude -p {prompt}
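
The detection order above can be sketched as a small resolver. Two details are assumptions rather than documented behavior: that WITHVIDEO_LLM_COMMAND is consulted before WV_LLM_COMMAND, and that "detecting" codex or claude means finding the binary on PATH. The function name `resolve_llm_command` is hypothetical.

```python
import os
import shutil

# Environment variables named in the README; their relative order is assumed.
ENV_VARS = ("WITHVIDEO_LLM_COMMAND", "WV_LLM_COMMAND")

# (binary to look for, command template from the README)
FALLBACKS = (
    ("codex", "codex exec -"),
    ("claude", "claude -p {prompt}"),
)


def resolve_llm_command(env=None, which=shutil.which):
    """Return the first available LLM command, or None if nothing is found."""
    env = os.environ if env is None else env
    for name in ENV_VARS:
        value = env.get(name, "").strip()
        if value:
            return value
    for binary, command in FALLBACKS:
        if which(binary):
            return command
    return None
```

An explicit --llm-command on the command line would presumably bypass this lookup entirely.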

wv act

wv act reads guide.md and semantic.json and generates an execution plan.

wv act .withvideo/demo/10_Months_of_Unity_Dev_with_Claude_Code/guide.md --dry-run

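Current status:

  • Guide parsing works
  • Action plan generation works
  • Tier 0/1 executors run for real (Shell / FileWrite / URLOpen / ClaudeCode); Tier 2/3 (Browser / ComputerUse) are not implemented
  • For a first run, use --dry-run to preview routing decisions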
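
The split between implemented and roadmap executors can be illustrated with a small sketch. The executor names come from this README, but the plan format (a list of dicts with an "executor" key) and the function `split_plan` are hypothetical and do not reflect the real action plan structure.

```python
# Executors this README lists as implemented (Tier 0/1).
IMPLEMENTED_EXECUTORS = frozenset(
    {"ShellExecutor", "FileWriteExecutor", "URLOpenExecutor", "ClaudeCodeExecutor"}
)


def split_plan(steps):
    """Separate steps that can run today from steps that need a roadmap executor.

    `steps` is a hypothetical plan format: a list of dicts with an
    "executor" key. The real action plan may look different.
    """
    runnable = [s for s in steps if s["executor"] in IMPLEMENTED_EXECUTORS]
    deferred = [s for s in steps if s["executor"] not in IMPLEMENTED_EXECUTORS]
    return runnable, deferred
```

A dry run is the natural place to surface the deferred steps before anything is executed.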

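Run strategy

The wv learn pipeline is roughly:

  1. Stage 0: acquire media and platform metadata
  2. Stage 1: read subtitles or transcribe with Whisper
  3. Stage 1c: activity detection
  4. Stage 2: semantic analysis
  5. Stage 3: visual analysis
  6. Stage 4: generate guide.md and semantic.json

In the current implementation, platform subtitles take precedence over Whisper, but the platform-subtitle path is used only when a subtitle file is actually usable.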
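
The precedence rule can be sketched as follows. This is an illustration, not the project's code: "usable" is assumed here to mean a non-empty file on disk, the returned labels reuse the --transcript option values, and the function names are hypothetical.

```python
from pathlib import Path


def is_usable_subtitle(path):
    """Assumed definition of usable: the file exists and is not empty."""
    p = Path(path)
    return p.is_file() and p.stat().st_size > 0


def choose_transcript_source(subtitle_paths, usable=is_usable_subtitle):
    """Prefer the first usable platform subtitle; otherwise fall back to Whisper."""
    for path in subtitle_paths:
        if usable(path):
            return "platform-subs", path
    return "mlx-whisper", None
```

With this shape, a video whose subtitle download produced an empty file still ends up on the Whisper path instead of yielding an empty transcript.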

Development and verification

Run the tests:

uv run --python 3.11 -m unittest discover -s tests -v

Check syntax:

uv run --python 3.11 python -m compileall withvideo tests

Repository layout

withvideo/       # learn pipeline
act_with_video/  # act-side prototype
design/          # design docs

If you are using it as a tool, start with withvideo/cli.py, withvideo/pipeline.py, and this README.
