Installable audio, video, and audio-visual analysis toolbox
av-toolbox
Upload a video, get visual/audio/AV diagnostics with overlay videos.
Live demo: demo.yan-peng.com. Choose a Video, Audio, or Audio-Visual tool, use the sample clip or upload a short non-sensitive file, then view or download the overlay MP4, metrics, and artifacts.
av-toolbox is an installable audio, video, and audio-visual analysis toolbox with one Python registry, one CLI, and a Streamlit demo UI.
PyPI distribution name: av-analysis-toolbox (the import package remains av_toolbox, and the CLI remains av-toolbox).
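Because the distribution name and the import name differ, it can help to check both after installing. The following is a minimal standard-library sketch; `describe_install` is a hypothetical helper written for illustration and is not part of av-toolbox.

```python
from importlib import metadata, util


def describe_install(dist_name="av-analysis-toolbox", import_name="av_toolbox"):
    """Report whether the PyPI distribution and the import package are visible."""
    try:
        # Installed metadata is keyed by the distribution name.
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    # Importability is keyed by the package (module) name.
    importable = util.find_spec(import_name) is not None
    return {
        "distribution": dist_name,
        "version": version,
        "import_name": import_name,
        "importable": importable,
    }


if __name__ == "__main__":
    print(describe_install())
```

A `version` of `None` alongside `importable: True` would suggest the package is on the import path (for example from a source checkout) without installed distribution metadata.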
Tool Catalog
See docs/tool-catalog.md for detailed per-tool instructions, CLI and Python examples, UI notes, generated config files, input types, output artifacts, optional dependency extras, and GPU/model requirements.
Overlay Examples
The overlays below are rendered on demo footage from this YouTube video. All rights to the original footage remain with its creator; it is included here for demonstration only. See Credits.
Video editing
| Cut Detection | Shot Type |
|---|---|
Video quality
| Image Quality | Camera Shake |
|---|---|
Motion detection
| Motion | Optical Flow | Foreground Motion |
|---|---|---|
Object and action understanding
| Object Detection | Segmentation | Action Recognition | Pose Detection |
|---|---|---|---|
Audio tools
| Beat Detection | Audio Energy | Audio Events |
|---|---|---|
Audio-visual foundation model
| DenseAV on CatFu |
|---|
Happy Path: Local Install And Demo
This path works from a fresh clone without private media, cloud services, or model checkpoints. It requires Python 3.10+ and FFmpeg on PATH. The steps install the local UI and the lightweight audio/video tools, generate a small synthetic demo clip, run one CLI tool, and start the web UI.
git clone https://github.com/yanpeng0520/av-toolbox.git
cd av-toolbox
python -m venv .venv
source .venv/bin/activate
python -m pip install -U pip
python -m pip install -e ".[web,audio,video]"
av-toolbox generate-demo-media --output-dir data_segments --duration 12
av-toolbox video motion \
data_segments/synthetic_hiphop_60s.mp4 \
--output outputs/motion_demo \
--sample-fps 5 \
--max-seconds 8
av-toolbox serve \
--host 127.0.0.1 \
--port 8501 \
--output-root outputs/web_runs
Open http://127.0.0.1:8501, choose a tool, use the generated sample or upload a short local clip, and inspect the overlay, transcript/metrics, and downloadable artifacts in Results.
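The two prerequisites stated for this path (Python 3.10+ and FFmpeg on PATH) can be checked before installing anything. The sketch below uses only the standard library; `check_prerequisites` is a hypothetical helper name, not a command or function shipped by av-toolbox.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 10), executables=("ffmpeg",)):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if sys.version_info < min_python:
        found = ".".join(map(str, sys.version_info[:3]))
        wanted = ".".join(map(str, min_python))
        problems.append(f"Python {wanted}+ required, found {found}")
    for exe in executables:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(exe) is None:
            problems.append(f"{exe} not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    print("ready" if not issues else "\n".join(issues))
```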
Optional Model Tools
Install heavier extras only for the tools you plan to run:
# YOLO object detection/segmentation/pose and shot-type classification
python -m pip install -e ".[vision-models]"
# TransNetV2/PySceneDetect cut detection backends
python -m pip install -e ".[cut-detection]"
# PyTorchVideo action recognition
python -m pip install -e ".[action]"
# faster-whisper transcription
python -m pip install -e ".[transcription]"
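To see which of these optional backends are already importable in the current environment, a small probe along the following lines may help. The mapping from extras to import names is an assumption made for illustration (for example, that YOLO support comes from `ultralytics`); the authoritative list is whatever the project's packaging metadata declares, and TransNetV2 is left out because its import name is not stated here.

```python
from importlib import util

# Assumed import names per extra; check the project's packaging metadata.
ASSUMED_MODULES = {
    "vision-models": ["ultralytics"],    # assumption: YOLO via ultralytics
    "cut-detection": ["scenedetect"],    # PySceneDetect's import name
    "action": ["pytorchvideo"],          # PyTorchVideo's import name
    "transcription": ["faster_whisper"],  # faster-whisper's import name
}


def available_extras(modules_by_extra=None):
    """Map each extra to True when all of its assumed modules can be found."""
    if modules_by_extra is None:
        modules_by_extra = ASSUMED_MODULES
    return {
        extra: all(util.find_spec(name) is not None for name in names)
        for extra, names in modules_by_extra.items()
    }


if __name__ == "__main__":
    for extra, ok in available_extras().items():
        print(f"{extra}: {'installed' if ok else 'missing'}")
```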
DenseAV is a separate heavyweight install because it needs the DenseAV Git package and checkpoint setup:
python -m pip install -e ".[denseav]"
python -m pip install "git+https://github.com/mhamilton723/DenseAV.git"
GPU And Model Cache
Classical tools run on CPU. Model-backed tools can use GPU when their PyTorch/accelerator stack is installed and the tool supports it.
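As a rough illustration of the CPU fallback described above, the sketch below returns `"cuda"` only when PyTorch is importable and reports a CUDA device, and `"cpu"` otherwise. `pick_device` is a hypothetical helper; how av-toolbox itself selects a device is not shown here.

```python
from importlib import util


def pick_device(prefer_gpu=True):
    """Return "cuda" when PyTorch is installed and sees a CUDA device, else "cpu"."""
    if not prefer_gpu or util.find_spec("torch") is None:
        return "cpu"
    try:
        import torch
    except ImportError:
        # A broken or partial PyTorch install falls back to CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(pick_device())
```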
Recommended cache setup:
export AV_TOOLBOX_CACHE_DIR=/mnt/models/av_toolbox_cache
mkdir -p "$AV_TOOLBOX_CACHE_DIR/weights"
You can also pass --cache-dir through CLI/runtime options. The default cache is under ~/.cache/av_toolbox/weights; if that directory is root-owned or unwritable, set AV_TOOLBOX_CACHE_DIR to a writable path before running model-backed tools.
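The cache layout described above can be mirrored in a few lines to check for the unwritable-directory case before launching a model-backed tool. This is a sketch under stated assumptions: `resolve_weights_dir` and `is_writable` are hypothetical helpers, and the rule that `AV_TOOLBOX_CACHE_DIR` holds a `weights` subdirectory is inferred from the commands above rather than taken from the library's code.

```python
import os
from pathlib import Path


def resolve_weights_dir(env=None):
    """Return the weights directory implied by AV_TOOLBOX_CACHE_DIR or the default."""
    env = os.environ if env is None else env
    root = env.get("AV_TOOLBOX_CACHE_DIR")
    base = Path(root) if root else Path.home() / ".cache" / "av_toolbox"
    return base / "weights"


def is_writable(path):
    """True if path, or its nearest existing parent, accepts writes."""
    probe = Path(path)
    # Walk up until something exists, so a not-yet-created cache still counts.
    while not probe.exists() and probe != probe.parent:
        probe = probe.parent
    return os.access(probe, os.W_OK)


if __name__ == "__main__":
    weights = resolve_weights_dir()
    status = "writable" if is_writable(weights) else "NOT writable"
    print(f"{weights}: {status}")
```

If the check reports an unwritable location, pointing `AV_TOOLBOX_CACHE_DIR` at a writable path, as recommended above, is the intended fix.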
DenseAV checkpoints require explicit setup. See docs/denseav.md.
Developer Docs
- Developer README: local development, CLI examples, tests, Docker, Python API, and web UI commands.
- Tool catalog: registered tools, CLI wrappers, inputs, outputs, and runtime controls.
- DenseAV setup: optional DenseAV dependencies, checkpoint names, cache paths, and GPU flags.
Contributing
Contributions are welcome. See CONTRIBUTING.md for the dev setup, how to add a tool, the overlay style guide, and PR expectations. To report a security issue, follow SECURITY.md (please do not open a public issue).
Credits
- Demo/sample footage is sourced from this YouTube video and used solely to demonstrate the tools' overlays. All rights to the original footage belong to its creator. The av-toolbox source code is licensed separately under the MIT License.
Download files
File details
Details for the file av_analysis_toolbox-0.1.0.tar.gz.
File metadata
- Download URL: av_analysis_toolbox-0.1.0.tar.gz
- Upload date:
- Size: 133.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a2455415133170c5b52a7789397fb7ad7edb7787e3887f55f5039211fa974a46 |
| MD5 | 10d5b0b0a6c7e934eab591adb5655a38 |
| BLAKE2b-256 | 66944ef209b33bdbdfb645d7f1d7de6a1dffc820a9d523a5f76268808f144814 |
File details
Details for the file av_analysis_toolbox-0.1.0-py3-none-any.whl.
File metadata
- Download URL: av_analysis_toolbox-0.1.0-py3-none-any.whl
- Upload date:
- Size: 141.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 64283241f466ee66fb68fa7087d5c8ba56a515ae70c21d6aad3132cf63912304 |
| MD5 | 0e8cd79b7f03560e367050d4daf924a3 |
| BLAKE2b-256 | 464ef15984f65d0dedfd1ece250327c5ca9febc83a95f97ff34b67fa3dd471df |
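A downloaded file can be compared against its published SHA256 digest with the standard library. The sketch below is illustrative only; `sha256_of` and `matches_published` are hypothetical helper names.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex):
    """Compare a file's SHA-256 with a published digest, ignoring case and spaces."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_published("av_analysis_toolbox-0.1.0.tar.gz", "a2455415133170c5b52a7789397fb7ad7edb7787e3887f55f5039211fa974a46")` should return `True` for an intact copy of the source distribution.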