
video-analyzer-tune

DSPy-based prompt optimizer for video-analyzer.

Automatically improves the two prompts that video-analyzer uses — the per-frame analysis prompt and the final video reconstruction prompt — based on examples of what good output looks like for your specific content and use case.

Overview

video-analyzer works in two stages: it analyzes each video frame individually (building up a running log of observations), then synthesizes all the frame notes into a final video description. Both stages are driven by prompt files that you can customize.

video-analyzer-tune uses DSPy MIPROv2 to optimize both prompts end-to-end. You provide a few examples of what ideal output looks like — both at the frame level and the final description level — and the tuner finds better prompt instructions automatically.

The main video-analyzer package is not affected in any way. Tuned prompts are written as new files that you point to via your config.

Requirements

  • Python 3.8+
  • video-analyzer >= 0.1.1
  • An Ollama instance with a vision model, or an OpenAI-compatible API

Installation

pip install video-analyzer-tune

Quick Start

Step 1 — Generate output with frames kept

Run video-analyzer on a representative video and keep the extracted frames:

video-analyzer my_video.mp4 --keep-frames

This produces an output/ directory containing:

  • analysis.json — frame-by-frame notes and the final description
  • frames/ — the extracted frame images
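
For a single video the layout looks roughly like this (frame filenames are illustrative and may vary):

output/
├── analysis.json
└── frames/
    ├── frame_0.jpg
    ├── frame_1.jpg
    └── ...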

Step 2 — Edit analysis.json with your ideal output

Open output/analysis.json and edit two things:

Required: Edit video_description.response to show what the ideal final description looks like for your use case.

Recommended: Edit each frame_analyses[i].response to show what ideal frame notes look like. This gives the optimizer a signal at both stages of the pipeline and produces better results.

{
  "frame_analyses": [
    {
      "frame": 0,
      "timestamp": 0.0,
      "response": "Your ideal frame note here — what details matter for your use case"
    }
  ],
  "video_description": {
    "response": "Your ideal final description here — the style, length, and focus you want"
  }
}

The more videos you edit and include as training examples, the better the results.

Step 3 — Create training_data.json

{
  "examples": [
    { "output_dir": "output" }
  ]
}

Add one entry per video you edited:

{
  "examples": [
    { "output_dir": "output/video1" },
    { "output_dir": "output/video2" },
    { "output_dir": "output/video3" }
  ]
}

Step 4 — Run the tuner

video-analyzer-tune --training-data training_data.json --output-dir tuned_prompts/

This runs MIPROv2 optimization, which will take some time depending on --num-candidates and --num-trials.
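
If you have the time, a more thorough (and slower) run just raises those two knobs, for example:

video-analyzer-tune \
  --training-data training_data.json \
  --output-dir tuned_prompts/ \
  --num-candidates 15 \
  --num-trials 30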

Step 5 — Update your config

When tuning completes, the tool prints a config snippet to paste into your config/config.json:

"prompt_dir": "tuned_prompts",
"prompts": [
  {"name": "Frame Analysis", "path": "frame_analysis_tuned.txt"},
  {"name": "Video Reconstruction", "path": "describe_tuned.txt"}
]
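
In context, the relevant part of config/config.json would look like this (any other keys already in your config stay as they are):

{
  "prompt_dir": "tuned_prompts",
  "prompts": [
    {"name": "Frame Analysis", "path": "frame_analysis_tuned.txt"},
    {"name": "Video Reconstruction", "path": "describe_tuned.txt"}
  ]
}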

Run video-analyzer as normal — it will use your tuned prompts automatically.

Training Data Format

training_data.json

{
  "examples": [
    { "output_dir": "path/to/output" }
  ]
}

Paths can be absolute or relative to the location of training_data.json.
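
For example, the two styles can be mixed in one file (the paths here are hypothetical):

{
  "examples": [
    { "output_dir": "runs/video1_output" },
    { "output_dir": "/data/video-analyzer/video2_output" }
  ]
}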

What to edit in analysis.json

Field                         Required      Description
video_description.response    Yes           Your ideal final video description
frame_analyses[i].response    Recommended   Your ideal frame note for each frame
prompt                        No            Leave as-is
transcript                    No            Leave as-is

CLI Reference

Flag                       Default                  Description
--training-data            (required)               Path to training_data.json
--output-dir               tuned_prompts            Directory to write tuned prompt files
--client                   ollama                   LLM client: ollama or openai_api
--model                    llama3.2-vision          Vision model to use for optimization runs
--ollama-url               http://localhost:11434   Ollama server URL
--api-key                  (none)                   API key (required when --client openai_api)
--api-url                  (none)                   API endpoint URL (required when --client openai_api)
--num-candidates           10                       Prompt variations generated per module; higher is more thorough but slower (range 5–20)
--num-trials               20                       Optimization trials; higher yields better results but is slower (range 10–50)
--max-bootstrapped-demos   3                        Max few-shot examples generated by bootstrapping
--max-labeled-demos        4                        Max few-shot examples taken from your training data
--description-weight       0.7                      Weight (0.0–1.0) of final-description quality in the score; the remainder weights frame-analysis quality. Use 0.5 to weight both equally, 1.0 to optimize only the final description (see the worked example below)
--log-level                INFO                     Logging level: DEBUG / INFO / WARNING / ERROR
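
As a worked example of --description-weight, assuming a simple linear combination of the two judge scores: with the default 0.7, a final description rated 4/5 and frame notes rated 3/5 combine to 0.7 × 4 + 0.3 × 3 = 3.7 out of 5.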

LLM Configuration

Using Ollama (default)

video-analyzer-tune \
  --training-data training_data.json \
  --output-dir tuned_prompts/ \
  --model llama3.2-vision

Using an OpenAI-compatible API (e.g. OpenRouter)

video-analyzer-tune \
  --training-data training_data.json \
  --output-dir tuned_prompts/ \
  --client openai_api \
  --model meta-llama/llama-3.2-11b-vision-instruct \
  --api-url https://openrouter.ai/api/v1 \
  --api-key YOUR_API_KEY

How It Works

video-analyzer uses two prompt files:

  1. frame_analysis.txt — called once per frame with the image and all previous frame notes. Produces the per-frame observation log.
  2. describe.txt — called once at the end with all frame notes and the audio transcript. Produces the final video description.

video-analyzer-tune wraps both prompts in a DSPy pipeline that mirrors the exact processing logic of video-analyzer. It then runs MIPROv2 — a Bayesian optimizer that generates candidate instruction variations and scores them against your training examples.
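
The shape of that pipeline, in rough DSPy terms (everything here is an illustrative sketch: the signatures, field names, example construction, and image handling are assumptions, not the package's actual code; judge_metric is sketched in the next example):

import dspy
from dspy.teleprompt import MIPROv2

class AnalyzeFrame(dspy.Signature):
    """Describe one video frame, building on notes from earlier frames."""
    previous_notes = dspy.InputField(desc="running log of earlier frame notes")
    frame = dspy.InputField(desc="the current frame")
    note = dspy.OutputField(desc="observations for this frame")

class DescribeVideo(dspy.Signature):
    """Synthesize all frame notes and the transcript into a final description."""
    frame_notes = dspy.InputField()
    transcript = dspy.InputField()
    description = dspy.OutputField()

class VideoPipeline(dspy.Module):
    def __init__(self):
        super().__init__()
        self.analyze = dspy.Predict(AnalyzeFrame)    # mirrors frame_analysis.txt
        self.describe = dspy.Predict(DescribeVideo)  # mirrors describe.txt

    def forward(self, frames, transcript):
        notes = []
        for frame in frames:  # one call per frame, threading prior notes through
            notes.append(self.analyze(previous_notes="\n".join(notes), frame=frame).note)
        final = self.describe(frame_notes="\n".join(notes), transcript=transcript)
        return dspy.Prediction(description=final.description, frame_notes="\n".join(notes))

# Training examples come from your edited analysis.json files (placeholder values here).
train_examples = [
    dspy.Example(
        frames=["frame_0.jpg"], transcript="(audio transcript)",
        frame_notes="ideal frame notes", description="ideal final description",
    ).with_inputs("frames", "transcript")
]

# MIPROv2 proposes instruction candidates for both modules and keeps the best
# combination according to the metric (judge_metric, sketched below).
optimizer = MIPROv2(metric=judge_metric, num_candidates=10)
tuned = optimizer.compile(
    VideoPipeline(),
    trainset=train_examples,
    num_trials=20,
    max_bootstrapped_demos=3,
    max_labeled_demos=4,
)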

Scoring uses an LLM-as-judge approach: the same model evaluates how well the generated output matches your ideal examples on a 1–5 scale. Frame note quality and final description quality are combined using the configurable --description-weight.
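
A metric along these lines could implement that scoring (again an illustrative sketch; the judge signature and rating parsing are assumptions):

import dspy

class JudgeOutput(dspy.Signature):
    """Rate how well a candidate text matches an ideal reference, from 1 to 5."""
    reference = dspy.InputField(desc="ideal output from the training data")
    candidate = dspy.InputField(desc="generated output to score")
    rating = dspy.OutputField(desc="integer from 1 (poor) to 5 (perfect)")

judge = dspy.Predict(JudgeOutput)

DESCRIPTION_WEIGHT = 0.7  # mirrors the --description-weight flag

def rate(reference: str, candidate: str) -> float:
    # Parse the first digit out of the judge's reply, clamped to 1-5.
    for ch in str(judge(reference=reference, candidate=candidate).rating):
        if ch.isdigit():
            return float(min(max(int(ch), 1), 5))
    return 1.0

def judge_metric(example, pred, trace=None) -> float:
    desc = rate(example.description, pred.description)
    frames = rate(example.frame_notes, pred.frame_notes)
    combined = DESCRIPTION_WEIGHT * desc + (1 - DESCRIPTION_WEIGHT) * frames
    return combined / 5.0  # normalize to 0-1 for the optimizer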

After optimization, the improved instruction text is written into new .txt files that preserve all the {TOKEN} placeholders ({PREVIOUS_FRAMES}, {FRAME_NOTES}, etc.) that video-analyzer uses for its string replacement — making the output files drop-in compatible.
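
That placeholder-preserving write could be as simple as keeping any line of the original prompt that contains a {TOKEN} (a hypothetical helper for illustration, not the package's code):

import re

PLACEHOLDER = re.compile(r"\{[A-Z_]+\}")

def merge_tuned_instruction(original_prompt: str, tuned_instruction: str) -> str:
    # Keep every line carrying a {TOKEN} placeholder (e.g. {PREVIOUS_FRAMES},
    # {FRAME_NOTES}) so the tuned file stays drop-in compatible with
    # video-analyzer's string replacement, with the new instruction on top.
    placeholder_lines = [ln for ln in original_prompt.splitlines() if PLACEHOLDER.search(ln)]
    return "\n\n".join([tuned_instruction, *placeholder_lines])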

Tips for Better Results

  • Use multiple videos. Even 3–5 diverse examples significantly improve optimization quality.
  • Edit frame notes too. If you only edit the final description, the optimizer has less signal about what good intermediate analysis looks like.
  • Be specific in your edits. The more clearly your ideal examples demonstrate the style and focus you want, the better the optimizer can learn from them.
  • Use the same model for tuning as for inference. The optimized prompts are tuned to the specific model's behavior.
  • Start with the default --num-candidates and --num-trials, then increase them if you have the time; higher values generally produce better prompts.
  • Use --description-weight 0.5 if you read the frame notes directly and care as much about their quality as the final description.

License

Apache License 2.0 — same as video-analyzer.
