CLI workflow for AI-assisted livestream clip extraction

Maintainers

Rousseau512

Development Status
- 3 - Alpha
Environment
- Console
Intended Audience
- Developers
Programming Language
- Python :: 3
- Python :: 3.11
Topic
- Multimedia :: Video

Project description

Clipora

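Clipora is a CLI video-clipping workflow for live-commerce input videos.

It turns a single local video into a set of reviewable intermediate artifacts and final results:

The job workspace and analysis/edit_plan.json produced by run <video>
The render results produced by render <job> or render <job> --plan <plan.json>
The intermediate assets produced by the clean / semantic / analyze / render stages
A separate run log for each job or production directory

Using uv to manage the Python environment and dependencies is recommended. Clipora can be used from the CLI or through a local Web UI.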

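Current status (2026-04-17)

The current public main flow targets analysis.video_type=live_commerce:

run <video>: runs clean -> inspect -> semantic_discover -> semantic_cluster -> analyze -> render
clean <video>: optional cleaning stage; removes frozen intervals according to the config and updates effective_source_path
semantic-discover <job>: discovers candidate semantic intervals from the current effective source video and its transcript
semantic-cluster <job>: first drops discovered intervals shorter than analysis.semantic_min_duration_seconds, then clusters the remaining intervals and writes the final semantic item set
analyze <job>: generates analysis/edit_plan.json from semantic/transcript.json and semantic/semantic_item_set.json
render <job>: exports clip videos according to the plan already present in the job
render <job> --plan <plan.json>: runs deterministic clip production from the specified plan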
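
The pre-clustering filter described for semantic-cluster can be pictured as a plain duration check. The sketch below is only an illustration: the Interval type and the drop_short_intervals helper are hypothetical names, not Clipora's actual code. Only the threshold name analysis.semantic_min_duration_seconds and its example value of 30.0 come from the config, and keeping an interval whose duration equals the threshold is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Hypothetical stand-in for a discovered semantic interval."""
    start_seconds: float
    end_seconds: float

    @property
    def duration(self) -> float:
        return self.end_seconds - self.start_seconds


def drop_short_intervals(intervals, min_duration_seconds):
    """Keep only intervals that are at least `min_duration_seconds` long."""
    return [iv for iv in intervals if iv.duration >= min_duration_seconds]


discovered = [Interval(0.0, 12.5), Interval(40.0, 95.0), Interval(100.0, 130.0)]
kept = drop_short_intervals(discovered, min_duration_seconds=30.0)
print(kept)
```

Lowering the threshold in the config would let shorter discovered intervals reach the clustering step.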

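Public export locations of the main flow:

job-local plans: <workdir>/<job-id>/analysis/edit_plan.json
render outputs: <workdir>/<job-id>/renders/

Core capabilities already in place:

Local video input
The current public analysis flow targets live-commerce input videos with analysis.video_type=live_commerce
If analysis capabilities need to be differentiated later, they will be organized by input video type rather than named after the current internal implementation path
ASR is an internal capability of the current live-commerce chain, used by the semantic stage to produce the material needed for the transcript and intervals; it is not a publicly selectable strategy option
Multi-backend ASR adapters
- local: calls faster-whisper in-process
- aliyun: calls Alibaba Cloud ISI and exposes the audio URL through OSS
OpenAI-compatible provider integration, used for analysis and planning
ffmpeg render export
Each run has its own job workspace
Stage commands reuse completed results by default and support --force to rerun
Per-job isolated terminal / file logs

Current stage model: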

clean
inspect
semantic_discover
semantic_cluster
analyze
render

Running tests

Quick local tests:

uv run pytest -m "not integration and not e2e"

Run a single test file or a single test case:

uv run pytest tests/test_web_app.py
uv run pytest tests/test_web_app.py::test_api_config_saves_updated_values

Integration tests:

uv run pytest -m integration

End-to-end tests:

uv run pytest -m e2e

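The integration and e2e test tiers use generated media files and a local stub HTTP service, and do not depend on live external API credentials. Tests that depend on media tools are skipped automatically when ffmpeg / ffprobe is missing.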
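
One common way to implement this kind of automatic skip is to probe PATH for the tools before the tests run. The sketch below shows that idea with the standard library only; missing_media_tools is a hypothetical helper name, and how Clipora's own test suite performs the check is not shown in this document.

```python
import shutil


def missing_media_tools(tools=("ffmpeg", "ffprobe"), which=shutil.which):
    """Return the names of required media tools that are not on PATH."""
    return [name for name in tools if which(name) is None]


# In a pytest suite the result could feed a skip condition, for example
# pytest.mark.skipif(bool(missing), reason="ffmpeg / ffprobe not installed").
missing = missing_media_tools()
print("missing tools:", missing or "none")
```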

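Quick start

System dependencies:

Python >=3.11
ffmpeg

To actually run local ASR, you also need a working Whisper runtime on your machine. To enable analysis.pose_presence.enabled=true, you also need to download the MediaPipe Pose Landmarker .task model file separately and set analysis.pose_presence.model_path in your local config.

Option A: install the published package and use it directly

If you only want to use the CLI / Web UI rather than develop inside the repository, and the current version has been published to the package index, run the following in a Python 3.11+ environment:

pip install clipora

After installation, first confirm that the command is available:

clipora --help

If you manage tools with uv, you can also install it as a command-line tool:

uv tool install clipora

Option B: develop from source

If you are developing or debugging in this repository, continuing to use uv is recommended:

uv sync
uv run clipora --help

If you have already installed clipora into the current environment, you can also run it directly:

clipora --help

Prepare the config file

Users of the installed package can create a new local clipora.yaml directly; users of the source repository can copy the example config:

cp clipora.example.yaml clipora.yaml

A minimal config needs at least the provider API information and the working directory, for example: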

provider:
  base_url: https://api.openai.com/v1
  api_key: your-llm-api-key
  model: gpt-4.1-mini

paths:
  workdir: ./workdir

First run

clipora run /absolute/path/to/video.mp4 --config clipora.yaml

Or start the local Web UI:

clipora web --config clipora.yaml

Current CLI commands

clipora --help

The current public commands are:

clipora run <video>
clipora clean <video>
clipora semantic-discover <job-dir>
clipora semantic-cluster <job-dir>
clipora analyze <job-dir>
clipora analyze-item <job-dir> --semantic-item-id <id>
clipora render <job-dir>
clipora render <job-dir> --plan <plan.json>
clipora web

Inside the source repository you can also keep using:

uv run clipora --help

CLI entry point:

src/clipora/cli.py

Full pipeline entry point:

src/clipora/pipeline/run.py

Web UI

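The local Web UI provides video upload, job submission, progress viewing, and result download, which suits users who are not comfortable with the command line.

Start it:

clipora web --config clipora.yaml

It listens on http://127.0.0.1:8000 by default. The bind address can be changed with options:

clipora web --host 0.0.0.0 --port 9000 --config clipora.yaml

If you still use uv run inside the source repository, the equivalent command is:

uv run clipora web --config clipora.yaml

--config specifies the default config file path; both task submission on the home page and the /config editing page use this config by default.

Pages:

/: upload videos, submit jobs, and view the list of past jobs; the home-page form defaults are read from the current effective config, but cleaning, subtitle position, and subtitle font theme can still be overridden per task
/config: edit most config fields by section and save them back to YAML; the page shows the values saved in the file, and if a field is overridden by an environment variable, the actual runtime value is shown as well; leaving a password field empty keeps the existing secret
/jobs/<id>: view job progress and logs, and download output clips

JSON API:

GET /api/config: get the current state of the config editing page
POST /api/config: validate and save the config file
GET /api/jobs: list all jobs
GET /api/jobs/<id>: get a job snapshot
GET /api/jobs/<id>/logs: get the most recent log lines
GET /api/jobs/<id>/outputs/<index>: download the specified output file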
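
The read-only endpoints above can be called from a small script. The sketch below uses only the standard library and assumes the default bind address; job_url and get_json are hypothetical helper names, and the shape of the JSON responses is not documented here, so the code returns whatever the server sends without interpreting it.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:8000"  # default bind address from the text


def job_url(job_id: str, suffix: str = "", base_url: str = BASE_URL) -> str:
    """Build the URL of GET /api/jobs/<id> or one of its sub-resources."""
    return f"{base_url}/api/jobs/{quote(job_id, safe='')}{suffix}"


def get_json(url: str, timeout: float = 10.0):
    """Fetch a URL and decode the body as JSON (response shape is not assumed)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Only the URLs are printed here; get_json() needs a running `clipora web`.
print(job_url("example-job"))
print(job_url("example-job", "/logs"))
```

With the Web UI running, get_json(f"{BASE_URL}/api/jobs") would list the jobs known to the server.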

Web entry points:

src/clipora/web/app.py
src/clipora/web/service.py

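Configuration

Example config file:

clipora.example.yaml

Copying it to a local config is recommended:

cp clipora.example.yaml clipora.yaml

The current .gitignore already ignores:

clipora.yaml
workdir/

So you can put local secrets and working-directory settings directly in it.

The main artifacts under the default workdir:

<workdir>/<job-id>/: the job workspace produced by run or by the per-stage commands, containing analysis/edit_plan.json and renders/

Example config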

provider:
  # OpenAI-compatible API base URL used for analysis and clip planning
  base_url: https://api.openai.com/v1
  # Provider API key used for analysis and clip planning
  api_key: your-llm-api-key
  # Chat model used for analysis and clip planning
  model: gpt-4.1-mini
  # Request timeout in seconds for provider API calls
  timeout: 60.0
  # Number of retries for transient provider API failures
  max_retries: 2
  # Whether to read LLM responses from streaming chat completion chunks; enable this for gateways that return empty non-streaming bodies
  stream: true

paths:
  # Directory where clipora stores planning job workspaces, exported plans, and production outputs
  workdir: ./workdir

logging:
  # Terminal log level filter; supported values: debug, info, warning, error
  # This affects terminal output only; full accepted event stream and pipeline log is still written under each job/production logs directory
  level: info

asr:
  # Transcription backend: local uses faster-whisper, aliyun uses Aliyun ISI file transcription
  mode: local

  # Local Whisper model name used when mode=local
  local_model: small

  aliyun:
    # Alibaba Cloud AccessKey ID used for OSS and signed Aliyun OpenAPI requests
    access_key_id: your-aliyun-access-key-id
    # Alibaba Cloud AccessKey Secret used for OSS and signed Aliyun OpenAPI requests
    access_key_secret: your-aliyun-access-key-secret
    # Aliyun ISI AppKey for recording file transcription
    app_key: your-aliyun-app-key
    # Aliyun region ID used for recording file transcription OpenAPI requests; cn-beijing is recommended
    region_id: cn-beijing
    # Aliyun file transcription domain; cn-beijing is recommended
    filetrans_domain: filetrans.cn-beijing.aliyuncs.com
    # Aliyun file transcription API version
    api_version: '2018-08-17'
    # OSS endpoint used to upload extracted audio before Aliyun reads it by URL; cn-beijing is recommended
    oss_endpoint: https://oss-cn-beijing.aliyuncs.com
    # OSS bucket name used to store uploaded audio files
    oss_bucket_name: your-oss-bucket
    # Object key prefix for uploaded audio files
    oss_key_prefix: clipora/audio
    # Signed GET URL lifetime in seconds for Aliyun to fetch uploaded audio
    signed_url_expires_seconds: 3600
    # Aliyun transcription task version
    version: '4.0'
    # Whether to request word-level timestamps from Aliyun
    enable_words: false
    # Poll interval in seconds while waiting for Aliyun task completion
    poll_interval_seconds: 5.0
    # Maximum number of polling attempts before timing out
    max_polls: 240

cleaning:
  # Whether to remove frozen spans before inspect / semantic / analyze
  enabled: false
  # freezedetect noise floor in dB; more negative = stricter match for true freeze
  freezedetect_noise_db: -60.0
  # Minimum candidate freeze duration passed to freezedetect before merge/collapse
  freezedetect_min_duration_seconds: 0.75
  # Minimum removed interval duration after merge/collapse
  min_remove_duration_seconds: 4.0
  # Merge adjacent removed spans when gap is below this threshold
  merge_gap_seconds: 0.5
  # Minimum keep span after removal merging
  min_keep_duration_seconds: 0.5

analysis:
  # Current public analysis flow targets live-commerce input video
  video_type: live_commerce
  # Extract one frame every N seconds within each final semantic item window; gaps between merged windows are skipped
  frame_interval_seconds: 2.0
  # Number of transcript segments per semantic discovery window
  semantic_window_segments: 200
  # Number of overlapping transcript segments between consecutive semantic windows
  semantic_window_overlap_segments: 30
  # Breaking change: discovered intervals shorter than this are dropped before semantic-cluster unless you lower this threshold
  semantic_min_duration_seconds: 30.0
  # Number of candidate variants requested per final semantic item in the current live-commerce analysis flow
  variants_per_segment: 2

  pose_presence:
    # Whether to run MediaPipe Pose-based host presence detection during analyze
    enabled: false
    # Path to MediaPipe Pose Landmarker .task model bundle; relative paths resolve from config file location
    model_path:
    # Optional override for pose presence sampling interval in seconds; blank falls back to analysis.frame_interval_seconds
    sample_interval_seconds:
    # Whether to persist JSON presence artifacts under analysis/presence
    write_debug_json: true
    # Whether to persist annotated debug frames alongside presence artifacts
    write_debug_frames: false

concurrency:
  # Max parallel semantic items during analyze (frame sampling + provider calls)
  analyze: 1
  # Max parallel clips during render (ffmpeg re-encode, CPU bound)
  render: 1
  # Threads passed to each ffmpeg render process; auto divides cpu_count by render concurrency
  render_ffmpeg_threads: auto
  # Max windows rendered per segment chunk before stitching parts to reduce peak FFmpeg memory on many-window clips
  render_segment_window_chunk_size: 8

render_subtitles:
  # Subtitle output mode; supported values: disabled, sidecar_only, both, burned_only
  mode: disabled
  # Subtitle output formats to generate; supported values: srt, vtt. Must include srt when mode burns subtitles into video
  formats: [srt]
  # Burn-in subtitle position preset; supported values: bottom, top, short_video_safe
  position_preset: bottom
  # Burn-in subtitle font theme; supported values: default, short_video_bold, minimal
  font_theme: default
  # Base reference size for subtitle burn-in style scaling against final output dimensions
  font_scale_base_width: 1080
  # Extra multiplier applied after proportional scaling
  font_scale_multiplier: 1.0

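Config loading logic:

src/clipora/config.py

Web config editing behavior:

/config saves the raw YAML values from the config file; it does not write defaults or environment-variable overrides back into the file
The actual runtime config is still resolved with the precedence "YAML file + environment variables + CLI / per-task overrides"
cleaning_enabled, subtitle_position, and subtitle_font_theme on the home page remain one-off overrides for the current task and are not written back to the config file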
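
Layered resolution of this kind is often implemented as a sequence of deep merges. The sketch below is one reading of the precedence, assuming the YAML file is the base layer, environment variables override it, and CLI / per-task overrides win last. deep_merge and resolve_config are hypothetical helpers and not the code in src/clipora/config.py.

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict where values from `override` win over `base`."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_config(yaml_values: dict, env_values: dict, task_overrides: dict) -> dict:
    """Apply the layers from lowest to highest precedence (assumed order)."""
    return deep_merge(deep_merge(yaml_values, env_values), task_overrides)


effective = resolve_config(
    {"cleaning": {"enabled": False}, "logging": {"level": "info"}},
    {"logging": {"level": "debug"}},   # e.g. derived from environment variables
    {"cleaning": {"enabled": True}},   # e.g. a one-off home-page override
)
print(effective)
```

Because the merge returns a new dict, the values loaded from the file stay unchanged, which matches the rule that overrides are never written back.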

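Pose detection model

The current mediapipe dependency uses the mediapipe.tasks API and no longer bundles ready-to-use Pose solution assets. To enable analysis.pose_presence.enabled=true, first download the official landmarker model, for example: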

mkdir -p workdir/models
curl -L "https://storage.googleapis.com/mediapipe-models/pose_landmarker/pose_landmarker_full/float16/latest/pose_landmarker_full.task" -o "workdir/models/pose_landmarker_full.task"

Then configure it in your local clipora.yaml:

analysis:
  pose_presence:
    enabled: true
    model_path: ./workdir/models/pose_landmarker_full.task

model_path supports relative paths; relative paths are resolved from the directory containing the config file.
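
The resolution rule can be illustrated with pathlib. This is a minimal sketch of the stated behavior, with resolve_model_path as a hypothetical helper name; the example paths are made up.

```python
from pathlib import Path


def resolve_model_path(config_file: str, model_path: str) -> Path:
    """Resolve `model_path` against the directory that contains the config file."""
    candidate = Path(model_path)
    if candidate.is_absolute():
        return candidate
    return Path(config_file).parent / candidate


print(resolve_model_path("/srv/clipora/clipora.yaml", "./workdir/models/pose_landmarker_full.task"))
```

The practical consequence is that the same clipora.yaml keeps working when the command is launched from a different current directory.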

Render subtitle output

You can enable subtitle artifact generation for the render / produce stage in clipora.yaml:

render_subtitles:
  mode: both
  formats: [srt, vtt]
  font_scale_base_width: 1080
  font_scale_multiplier: 1.0

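Fields:

render_subtitles.mode: controls the subtitle artifact mode. disabled generates no subtitle artifacts; sidecar_only generates only sidecar subtitle files; both generates sidecar subtitle files and additionally outputs a video version with burned-in subtitles; burned_only makes the main output itself the burned-in subtitle video.
render_subtitles.formats: controls which sidecar subtitle formats are output; srt and vtt are currently supported. If render_subtitles.mode burns subtitles into the video (both or burned_only), this must include srt.
render_subtitles.font_scale_base_width: reference size for scaling the burn-in subtitle style; at burn-in time the theme's font size, outline, and shadow are scaled proportionally by (short side of the final output) / (reference size), and the vertical margin is scaled by the output height.
render_subtitles.font_scale_multiplier: a factor applied on top of that proportional scaling, used to enlarge or shrink the subtitles as a whole.
concurrency.render_segment_window_chunk_size: upper limit on the number of windows fed into the FFmpeg filter graph at a time within a single segment. Above this value, the render stage produces temporary parts chunk by chunk and then concats them into the final clip; smaller values usually mean lower peak memory but more render passes overall.
concurrency.render_segment_window_chunk_max_span_seconds: maximum time span a single chunk may cover (max(end_seconds) - min(start_seconds)). When windows are far apart, chunks are split early even if there are few windows, to avoid OOM from an oversized filter graph.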
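
The scaling ratio and the two chunking limits described above can be sketched in a few lines. Both functions below are hypothetical illustrations of those rules rather than Clipora's renderer: subtitle_scale follows the stated ratio, and chunk_windows assumes a simple greedy split in plan order, which the text does not spell out.

```python
def subtitle_scale(width: int, height: int, base_width: int = 1080, multiplier: float = 1.0) -> float:
    """Scale factor: (short side of the output) / (reference size), times the multiplier."""
    return min(width, height) / base_width * multiplier


def chunk_windows(windows, chunk_size, max_span_seconds=None):
    """Split (start_seconds, end_seconds) windows into consecutive chunks.

    A new chunk starts when adding a window would exceed `chunk_size`
    windows or push max(end) - min(start) past `max_span_seconds`.
    """
    chunks, current = [], []
    for window in windows:
        candidate = current + [window]
        span = max(end for _, end in candidate) - min(start for start, _ in candidate)
        too_many = len(candidate) > chunk_size
        too_wide = max_span_seconds is not None and span > max_span_seconds
        if current and (too_many or too_wide):
            chunks.append(current)
            current = [window]
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


print(subtitle_scale(720, 1280))
print(chunk_windows([(0, 5), (6, 10), (300, 310), (311, 320)], chunk_size=3, max_span_seconds=60))
```

In the second call the split happens after two windows even though chunk_size is 3, because the third window would stretch the chunk's span far beyond 60 seconds.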

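Behavior:

Sidecar subtitle files are written next to each rendered mp4; for example, clip-001.mp4 also gets clip-001.srt / clip-001.vtt.
Subtitle cue timing follows the timeline of the final exported clip rather than the timeline of the original footage; if a clip is stitched together from multiple windows, subtitle times are remapped onto the continuous time after stitching.
With mode: both, an additional burned-in subtitle file such as clip-001-subtitled.mp4 is written; it does not replace the original main output clip-001.mp4.
With mode: burned_only, the main output itself contains the burned-in subtitles, and no extra *-subtitled.mp4 is written.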
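
The timeline remapping can be made concrete with a small example. The sketch below assumes cues are (start, end, text) tuples in source time and windows are concatenated in the order given; remap_cues is a hypothetical helper, and clipping a cue at the window edge is an assumption about how boundary cases are handled.

```python
def remap_cues(cues, windows):
    """Map (start, end, text) cues from source time onto the stitched clip timeline.

    `windows` are (start_seconds, end_seconds) source ranges in the order
    they are concatenated; cues are clipped to each window they overlap.
    """
    remapped, offset = [], 0.0
    for win_start, win_end in windows:
        for cue_start, cue_end, text in cues:
            start, end = max(cue_start, win_start), min(cue_end, win_end)
            if start < end:  # the cue overlaps this window
                remapped.append((offset + start - win_start, offset + end - win_start, text))
        offset += win_end - win_start  # the next window starts where this one ends in the clip
    return remapped


cues = [(10.0, 12.0, "hello"), (58.0, 61.0, "price"), (200.0, 203.0, "skipped")]
print(remap_cues(cues, windows=[(9.0, 15.0), (57.0, 62.0)]))
```

The first window contributes 6 seconds to the clip, so the cue that starts one second into the second window lands at 7 seconds on the clip timeline, and the cue outside both windows is dropped.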

Usage

Option A: the recommended main flow `run`

clipora run /absolute/path/to/video.mp4 --config clipora.yaml

run executes the full main chain:

clean
inspect
semantic_discover
semantic_cluster
analyze
render

On success the CLI prints:

[clipora] job=/absolute/path/to/<workdir>/<job-id>

To repeat production later from the same analysis result, you can reuse the plan exported inside the job:

clipora render /absolute/path/to/<workdir>/<job-id> --plan /absolute/path/to/<workdir>/<job-id>/analysis/edit_plan.json --config clipora.yaml

render <job> --plan <plan.json> reruns render inside the current job workspace and reuses the same log and output directory.

On success the CLI prints:

[clipora] job=/absolute/path/to/<workdir>/<job-id>
[clipora] log=/absolute/path/to/<workdir>/<job-id>/logs/pipeline.log

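Option B: run only the cleaning stage

clipora clean /absolute/path/to/video.mp4 --config clipora.yaml

clean only creates or updates the job workspace and runs the cleaning stage; it does not automatically go on to run semantic-discover, semantic-cluster, analyze, and render.

On success the CLI prints: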

[clipora] job=/absolute/path/to/<workdir>/<job-id>
[clipora] log=/absolute/path/to/<workdir>/<job-id>/logs/pipeline.log

Option C: run stage by stage

Note: semantic-discover, semantic-cluster, analyze, and render all take an existing job directory as input; clean takes the source video path directly.

First run cleaning:

clipora clean /absolute/path/to/video.mp4 --config clipora.yaml

Then discover and cluster semantic items:

clipora semantic-discover ./workdir/<job-id> --config clipora.yaml
clipora semantic-cluster ./workdir/<job-id> --config clipora.yaml

Then analyze:

clipora analyze ./workdir/<job-id> --config clipora.yaml

Then render:

clipora render ./workdir/<job-id> --config clipora.yaml
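
If you prefer to drive the stage-by-stage sequence from a script, the commands above can be assembled programmatically. The sketch below only uses the commands and flags shown in this README; stage_commands and run_stages are hypothetical helper names, and the job directory is passed in as a parameter because it is only known after clean has printed its `[clipora] job=` line.

```python
import subprocess

JOB_STAGES = ("semantic-discover", "semantic-cluster", "analyze", "render")


def stage_commands(video, job_dir, config="clipora.yaml", force=False):
    """Build the argv lists for a clean -> render run, one command per stage."""
    extra = ["--force"] if force else []
    commands = [["clipora", "clean", video, "--config", config, *extra]]
    for stage in JOB_STAGES:
        commands.append(["clipora", stage, job_dir, "--config", config, *extra])
    return commands


def run_stages(commands):
    """Run each command in order and stop at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)


# Printing instead of running keeps this example free of side effects.
for argv in stage_commands("/absolute/path/to/video.mp4", "./workdir/<job-id>"):
    print(" ".join(argv))
```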

Force-rerun a stage

clipora clean /absolute/path/to/video.mp4 --config clipora.yaml --force
clipora semantic-discover ./workdir/<job-id> --config clipora.yaml --force
clipora semantic-cluster ./workdir/<job-id> --config clipora.yaml --force
clipora analyze ./workdir/<job-id> --config clipora.yaml --force
clipora render ./workdir/<job-id> --config clipora.yaml --force

Output directory structure

Public export directory of the main flow:

workdir/<job-id>/
  renders/
    execution_result.json
    *.mp4
  logs/
    pipeline.log

The job workspace still contains the full set of intermediate artifacts:

workdir/<job-id>/
  inspect/
    probe.json
  clean/
    active_source.mp4              # only present when cleaning.enabled=true and removable intervals were detected
    raw_probe.json
    packet_windows.json
    removed_intervals.json
    summary.json
    keep_intervals.json
    time_map.json
    parts/
      part-*.mp4
  semantic/
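    source.audio.wav               # global audio extracted by the semantic stage
    transcript.json                # global transcript artifact of the semantic stage
    window_plans.json              # first-pass sliding-window discovery results and discard reasons
    cluster_plan.json              # second-pass provider clustering results and final assembly state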
    semantic_item_set.json
    items/
      <semantic_item_id>.json
  analysis/
    edit_plan.json
    requests/
      <semantic_item_id>.json      # multimodal request metadata recorded by the analyze stage
    frames/
      <semantic_item_id>/
        frame-001.jpg              # sample frame extracted for the current semantic item by the analyze stage
  renders/
    execution_result.json
    *.mp4
    items/
      <item_plan_id>/
        execution_result.json
        *.mp4
  logs/
    pipeline.log
  manifest.json
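
Given this layout, a quick script can report what a job has produced so far. The sketch below checks only paths that appear in the tree above; summarize_job is a hypothetical helper, and it does not read or interpret the contents of manifest.json or execution_result.json.

```python
from pathlib import Path


def summarize_job(job_dir):
    """Report which documented artifacts exist in a job workspace."""
    job = Path(job_dir)
    renders = job / "renders"
    return {
        "has_edit_plan": (job / "analysis" / "edit_plan.json").is_file(),
        "has_pipeline_log": (job / "logs" / "pipeline.log").is_file(),
        "clips": sorted(p.name for p in renders.glob("*.mp4")) if renders.is_dir() else [],
    }


print(summarize_job("./workdir/<job-id>"))
```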

Workspace management logic:

src/clipora/io/workspace.py

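Runtime progress and logs

When any stage command runs, the CLI first prints the current job directory and log path:

[clipora] job=/absolute/path/to/workdir/<job-id>
[clipora] log=/absolute/path/to/workdir/<job-id>/logs/pipeline.log

Each stage then keeps emitting runtime events. For example:

clean: detects frozen intervals and rebuilds the effective source video
semantic-discover: extracts the global audio, transcribes it, and discovers candidate single-product intervals
semantic-cluster: first drops discovered intervals that are too short, then clusters the remaining intervals and writes the final semantic interval contract
analyze: records the current video type and writes edit_plan based on the semantic transcript, sampled frames, and provider calls
render: prints ffmpeg render progress per clip

These events are also written to:

workdir/<job-id>/logs/pipeline.log

Logs are isolated per job; different runs never share a log file, which makes them better suited to troubleshooting a single execution, reviewing stage progress, and confirming what happened earlier when rerunning stage by stage.

Current logging implementation:

logs/pipeline.log is automatically ensured to exist every time a job logger is created
logging.level supports debug / info / warning / error
logging.level only affects level filtering of terminal output
pipeline.log always keeps the full accepted stage event / job log
stage started / stage progress / stage completed use INFO; stage failed uses ERROR
ffmpeg progress is deduplicated and throttled to avoid flooding the terminal
If writing the log fails, the CLI still prints the warning to the terminal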
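
The split between a filtered terminal stream and a complete log file maps naturally onto two handlers of the standard logging module. The sketch below illustrates that pattern under the rules listed above; build_job_logger is a hypothetical function and is not the implementation in src/clipora/observability/job_logger.py, and it leaves out the ffmpeg progress throttling and the write-failure fallback.

```python
import logging
import sys
import tempfile
from pathlib import Path


def build_job_logger(job_dir, terminal_level="info"):
    """Create a logger whose file handler keeps everything and whose terminal handler filters."""
    log_path = Path(job_dir) / "logs" / "pipeline.log"
    log_path.parent.mkdir(parents=True, exist_ok=True)
    log_path.touch(exist_ok=True)  # make sure the log file exists up front

    logger = logging.getLogger(f"job.{Path(job_dir).name}")
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()
    logger.propagate = False

    file_handler = logging.FileHandler(log_path, encoding="utf-8")
    file_handler.setLevel(logging.DEBUG)  # the full event stream goes to the file

    terminal_handler = logging.StreamHandler(sys.stderr)
    terminal_handler.setLevel(getattr(logging, terminal_level.upper()))  # terminal-only filter

    for handler in (file_handler, terminal_handler):
        handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger


with tempfile.TemporaryDirectory() as tmp:
    logger = build_job_logger(tmp, terminal_level="warning")
    logger.info("stage started: clean")    # written to the file only
    logger.error("stage failed: render")   # written to the file and the terminal
    for handler in logger.handlers:
        handler.close()
    print((Path(tmp) / "logs" / "pipeline.log").read_text(encoding="utf-8"))
```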

Corresponding code:

src/clipora/observability/job_logger.py
src/clipora/pipeline/clean.py
src/clipora/pipeline/inspect.py
src/clipora/pipeline/semantic.py
src/clipora/pipeline/analyze.py
src/clipora/pipeline/render.py

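Current pipeline behavior

The full flow is roughly:

clean: detects frozen intervals according to the config and rebuilds effective_source_path when needed
inspect: probes the current effective source video and writes inspect/probe.json
Resolve the current analysis video type (precedence: job manifest > config)
semantic:
- First verifies that inspect has completed
- Extracts the global audio semantic/source.audio.wav from the current effective_source_path
- Transcribes it into the global transcript and writes semantic/transcript.json
- Runs first-pass sliding-window discovery over the global transcript and writes semantic/window_plans.json
- Sends the interval + summary pairs from the first pass to the provider for same-product clustering and writes semantic/cluster_plan.json
- Assembles the final semantic items locally from the clustering result, merging only windows whose boundaries touch, and writes semantic/semantic_item_set.json
analyze:
- Reads semantic/transcript.json and semantic/semantic_item_set.json
- For each semantic item's windows, samples frames from the current effective source video at source-global times and records analysis/requests/<semantic_item_id>.json
- Organizes the semantic item text, sampled frames, and provider calls according to the current analysis.video_type
- Under the current public contract, produces an edit plan of type live_commerce
- Writes analysis/edit_plan.json
- The array order of edit_plan.segments is the execution order; sorting by start_seconds is no longer required
render:
- Renders clips directly from the source-global time windows in EditPlanSegment.windows
- Subtitles are filtered and remapped from semantic/transcript.json and no longer depend on a separate segment transcript
- The render result's execution_result.json is placed in the same renders/ directory as the exported videos; item-level results are placed in the corresponding renders/items/<item_plan_id>/ directory, alongside that item's videos

Here semantic/transcript.json is the global transcript contract, semantic/window_plans.json / semantic/cluster_plan.json are semantic debugging artifacts, semantic/semantic_item_set.json is the final semantic interval contract, and analysis/edit_plan.json remains the render contract.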
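
The "merge only windows whose boundaries touch" rule in the semantic stage can be illustrated as follows. This is a sketch under assumptions: windows are plain (start, end) tuples, merge_touching is a hypothetical helper, the tolerance parameter is not a documented setting, and overlapping windows are deliberately left alone because the text does not say how they are treated.

```python
def merge_touching(windows, tolerance=0.0):
    """Merge consecutive (start, end) windows whose boundaries touch.

    Windows separated by a real gap stay separate. `tolerance` is a
    hypothetical knob for float noise, not a documented setting.
    """
    merged = []
    for start, end in sorted(windows):
        if merged and abs(start - merged[-1][1]) <= tolerance:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


print(merge_touching([(0.0, 30.0), (30.0, 55.0), (80.0, 120.0)]))
```

Keeping the gap between 55.0 and 80.0 means one semantic item can still carry several separate windows, which is why later stages work on a list of windows per item.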

Corresponding stage code:

clean: src/clipora/pipeline/clean.py
inspect: src/clipora/pipeline/inspect.py
semantic: src/clipora/pipeline/semantic.py
analyze: src/clipora/pipeline/analyze.py
render: src/clipora/pipeline/render.py

Dependencies

Key dependencies of the project:

typer
pydantic
httpx
PyYAML
faster-whisper
aliyun-python-sdk-core
oss2
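ffmpeg (system dependency)

What it is currently best suited for

The current version is already suitable for:

Trying out the full workflow locally
Adjusting the provider / ASR / analysis.video_type configuration
Reusing exported plans for deterministic production
Debugging the semantic -> analyze -> render main chain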

Build and verify locally

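If you want to verify the packaged artifacts rather than only running from the source repository, the following procedure is recommended before publishing.

First build the wheel and sdist:

uv build

Then check that the build artifacts contain the runtime resource files:

clipora/prompts/**/*.txt
clipora/web/templates/**/*.html
clipora/web/static/*

These files live inside the package directory and, after installation, are read by load_prompt_text() and create_app() directly via package-relative paths; so before publishing, be sure to confirm that they actually made it into both the wheel and the sdist.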
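
For the wheel, this check can be automated because a wheel is a zip archive. The sketch below is a simplified illustration: missing_resources is a hypothetical helper, it approximates the three glob patterns with prefix / suffix rules, and it assumes the package sits at the archive root as clipora/. The sdist is a tar.gz with a different internal layout and is not covered here.

```python
import glob
import zipfile

# (prefix, suffix) rules approximating the three patterns listed above.
REQUIRED = (
    ("clipora/prompts/", ".txt"),
    ("clipora/web/templates/", ".html"),
    ("clipora/web/static/", ""),
)


def missing_resources(wheel_path):
    """Return the (prefix, suffix) rules that no file in the wheel satisfies."""
    with zipfile.ZipFile(wheel_path) as wheel:
        names = [n for n in wheel.namelist() if not n.endswith("/")]
    return [
        (prefix, suffix)
        for prefix, suffix in REQUIRED
        if not any(n.startswith(prefix) and n.endswith(suffix) for n in names)
    ]


# Checks every wheel under dist/; prints nothing if no wheel has been built yet.
for wheel_file in glob.glob("dist/*.whl"):
    print(wheel_file, missing_resources(wheel_file) or "all resource rules satisfied")
```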

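It is recommended to install the wheel and the sdist in two fresh Python 3.11+ virtual environments and run a minimal smoke test in each. The example below uses uv venv to create a 3.11 environment explicitly: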

uv venv --python 3.11 .venv-wheel-smoke
uv pip install --python .venv-wheel-smoke/bin/python dist/*.whl
. .venv-wheel-smoke/bin/activate
clipora --help
python -c "import clipora; print(clipora.__version__)"
python -c "from clipora.prompts.loader import load_prompt_text; print(load_prompt_text('analyze', 'system')[:40])"
python -c "from clipora.web.app import create_app; app = create_app(); print(app.title)"
deactivate

uv venv --python 3.11 .venv-sdist-smoke
uv pip install --python .venv-sdist-smoke/bin/python dist/*.tar.gz
. .venv-sdist-smoke/bin/activate
clipora --help
python -c "import clipora; print(clipora.__version__)"
python -c "from clipora.prompts.loader import load_prompt_text; print(load_prompt_text('analyze', 'system')[:40])"
python -c "from clipora.web.app import create_app; app = create_app(); print(app.title)"
deactivate

For an extra regression check before publishing, you can run:

uv run pytest -m "not integration and not e2e"

Publish to PyPI

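The repository now includes a GitHub Actions publishing workflow: .github/workflows/publish-pypi.yml.

Automated publishing

When a tag of the form v0.1.0 is pushed, the workflow automatically:
1. Verifies that the versions in pyproject.toml, src/clipora/__init__.py, and the git tag are consistent.
2. Runs uv run pytest -m "not integration and not e2e".
3. Builds the wheel and sdist.
4. Checks that the build artifacts contain the CLI code and the key runtime resource files.
5. Runs uvx twine check dist/*.
6. Installs the wheel and the sdist separately in fresh Python 3.11 environments and runs a smoke test.
7. Uploads to PyPI automatically through GitHub OIDC trusted publishing.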
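
The version consistency check in step 1 can be reproduced locally before tagging. The sketch below is an assumption-laden illustration, not the workflow's actual script: it assumes pyproject.toml carries a static `version = "..."` line, that src/clipora/__init__.py assigns `__version__` (the smoke test above prints clipora.__version__), and it uses regular expressions instead of a TOML parser for brevity. extract_versions and versions_consistent are hypothetical names.

```python
import re


def extract_versions(pyproject_text, init_text, tag):
    """Pull the version string out of each of the three sources."""
    project = re.search(r'(?m)^version\s*=\s*"([^"]+)"', pyproject_text)
    package = re.search(r'(?m)^__version__\s*=\s*["\']([^"\']+)["\']', init_text)
    return {
        "pyproject.toml": project.group(1) if project else None,
        "__init__.py": package.group(1) if package else None,
        "git tag": tag[1:] if tag.startswith("v") else tag,
    }


def versions_consistent(pyproject_text, init_text, tag):
    """True when all three versions are present and identical."""
    versions = set(extract_versions(pyproject_text, init_text, tag).values())
    return None not in versions and len(versions) == 1


print(versions_consistent('[project]\nversion = "0.1.0"\n', '__version__ = "0.1.0"\n', "v0.1.0"))
```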

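To use this workflow, a trusted publisher must first be configured for the GitHub repository and the PyPI project.

Manual trigger

The workflow also supports workflow_dispatch, so you can trigger build-and-verify manually first to check the build chain; actual publishing to PyPI only happens on a v* tag push.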

Release history

This version

0.1.0

Apr 26, 2026

Download files

Source Distribution

clipora-0.1.0.tar.gz (104.4 kB view details)

Uploaded Apr 26, 2026 Source

Built Distribution

clipora-0.1.0-py3-none-any.whl (138.2 kB view details)

Uploaded Apr 26, 2026 Python 3

File details

Details for the file clipora-0.1.0.tar.gz.

File metadata

Download URL: clipora-0.1.0.tar.gz
Upload date: Apr 26, 2026
Size: 104.4 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for clipora-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`66411feebc8b4bbd2b3f79fbe267b72a00f19bd72d6b277dd1a494a9a545b8d2`
MD5	`d35736dc49f2de220fc348709adb7862`
BLAKE2b-256	`31607f5759dec7b9545e77262f2a7a96b356d6569d4335a57ff559e877791d36`

Provenance

The following attestation bundles were made for clipora-0.1.0.tar.gz:

Publisher: publish-pypi.yml on Rousseau512/Clipora

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: clipora-0.1.0.tar.gz
- Subject digest: 66411feebc8b4bbd2b3f79fbe267b72a00f19bd72d6b277dd1a494a9a545b8d2
- Sigstore transparency entry: 1385799587
- Sigstore integration time: Apr 26, 2026
Source repository:
- Permalink: Rousseau512/Clipora@f2828a8f681bc8450be453e9faacb6d6bda18858
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/Rousseau512
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@f2828a8f681bc8450be453e9faacb6d6bda18858
- Trigger Event: push

File details

Details for the file clipora-0.1.0-py3-none-any.whl.

File metadata

Download URL: clipora-0.1.0-py3-none-any.whl
Upload date: Apr 26, 2026
Size: 138.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for clipora-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`503a4559cea9e1bfe18fab491d4db18ccad568192781ff76836b0f96bd01cf02`
MD5	`92632173eb4cceeb46f9cca720535698`
BLAKE2b-256	`c5e78eb5d12a501dd258f25e1d45a9240c357c39a5b677d816be1e7c714c0bb0`

Provenance

The following attestation bundles were made for clipora-0.1.0-py3-none-any.whl:

Publisher: publish-pypi.yml on Rousseau512/Clipora

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: clipora-0.1.0-py3-none-any.whl
- Subject digest: 503a4559cea9e1bfe18fab491d4db18ccad568192781ff76836b0f96bd01cf02
- Sigstore transparency entry: 1385799631
- Sigstore integration time: Apr 26, 2026
Source repository:
- Permalink: Rousseau512/Clipora@f2828a8f681bc8450be453e9faacb6d6bda18858
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/Rousseau512
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@f2828a8f681bc8450be453e9faacb6d6bda18858
- Trigger Event: push
