Audio/video chop → analyze toolkit.
This project has been archived.
# ChopShop
A toolkit for turning messy A/V and text into clean, analysis-ready artifacts and features. Think of it as a small pit crew for your data: split → diarize → transcribe → gather text → extract features (dictionaries, archetypes, whisper embeddings) — with predictable filenames and folders. And everything in between.
**Status:** early WIP. It works, but expect rough edges and occasional breaking changes.
## What it does (high level)

- **Audio from video** — pull each audio stream from a container into WAV.
- **Diarize + transcribe** — wrapper around Mahmoud Ashraf's `whisper-diarization` (CSV/SRT/TXT outputs).
- **Per-speaker WAVs** — cut a source WAV into one file per speaker using the transcript.
- **Whisper encoder embeddings** — segment-level embeddings (and general audio modes) via Faster-Whisper (CTranslate2).
- **Text gatherer** — stream/scale a CSV or folder of `.txt` into a single “analysis-ready” CSV (optionally grouped).
- **Feature extraction**
  - Dictionary / ContentCoder across any number of dictionaries → one wide CSV with stable column order.
  - Archetypes using `archetypes` (sentence-transformer) → one CSV mirroring your analysis-ready file name.
- **Predictable outputs** — if you don't provide an output path, ChopShop writes to `./features/<kind>/<filename>.csv`, where `<filename>` comes from your analysis-ready CSV (so grouping/concat choices are visible in the name).
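The “one wide CSV with stable column order” point is worth a moment: when several dictionaries each contribute their own score columns, the column order should not depend on dict iteration or on which dictionary was listed first. A minimal stdlib sketch of that idea — the dictionary names and categories below are made up for illustration, not ChopShop's internals:

```python
import csv
import io

# Hypothetical per-dictionary scores for one row of text.
scores_by_dict = {
    "liwc22": {"posemo": 2.1, "negemo": 0.4},
    "empath": {"joy": 0.8, "anger": 0.1},
}

# Stable wide header: sort dictionaries, then categories, and prefix
# each column with its dictionary so names never collide across dicts.
columns = [
    f"{d}.{cat}"
    for d in sorted(scores_by_dict)
    for cat in sorted(scores_by_dict[d])
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["id", *columns])
writer.writeheader()
row = {"id": "speaker_A"}
for d, cats in scores_by_dict.items():
    for cat, val in cats.items():
        row[f"{d}.{cat}"] = val
writer.writerow(row)
print(buf.getvalue().splitlines()[0])
# → id,empath.anger,empath.joy,liwc22.negemo,liwc22.posemo
```

Because the header is derived by sorting rather than by insertion order, re-running with the same dictionaries always yields the same columns, which keeps downstream joins and diffs sane.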
## The API you'll use

ChopShop exposes namespaced sub-APIs for clarity:

```python
from chopshop import ChopShop

cs = ChopShop()

# Audio
wav_paths = cs.audio.extract_wavs_from_video(input_path="input.mp4", output_dir="audio_out/")
tp = cs.diarizer.with_thirdparty(audio_path=wav_paths[0], out_dir="transcripts/", whisper_model="small", device="cuda")
cs.audio.split_wav_by_speaker(source_wav=wav_paths[0], transcript_csv=tp["csv"], out_dir="per_speaker/")

# Embeddings (transcript-driven OR general-audio)
cs.audio.export_whisper_embeddings(source_wav=wav_paths[0], transcript_csv=tp["csv"])  # segment CSV
cs.audio.export_whisper_embeddings(source_wav=wav_paths[0], strategy="nonsilent", aggregate="mean")  # general audio

# Text gather → Dictionaries
feat_csv = cs.text.analyze_with_dictionaries(
    csv_path="transcripts/session.csv",
    dict_paths=["dictionaries/LIWC-22.dicx", "dictionaries/empath-default.dicx"],
    text_cols=["text"], id_cols=["speaker"], group_by=["speaker"], delimiter=",",
)

# Text gather → Archetypes
arch_csv = cs.text.analyze_with_archetypes(
    csv_path="transcripts/session.csv",
    archetype_csvs=["dictionaries/archetypes/Suicidality.csv", "dictionaries/archetypes/Resilience.csv"],
    text_cols=["text"], id_cols=["speaker"], group_by=["speaker"], delimiter=",",
)
```
## Default output locations

If you omit `out_features_csv`, ChopShop writes to:

- Dictionaries → `./features/dictionary/<analysis_ready_filename>.csv`
- Archetypes → `./features/archetypes/<analysis_ready_filename>.csv`
- Whisper embeddings → `./features/whisper_embed/<analysis_ready_filename>.csv`

The `<analysis_ready_filename>` comes from the text-gather step (e.g., `dataset_grouped_speaker.csv`), or from your provided `analysis_csv`.
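As an illustration of the convention (a sketch, not ChopShop's actual internals — `default_features_path` is a hypothetical helper name), the documented default path can be reproduced with a few lines of `pathlib`:

```python
from pathlib import Path

def default_features_path(kind: str, analysis_ready_csv: str) -> Path:
    """Mirror the documented convention: ./features/<kind>/<filename>.csv."""
    # Keep only the filename so grouping/concat choices stay visible in the name.
    filename = Path(analysis_ready_csv).name
    return Path("features") / kind / filename

print(default_features_path("dictionary", "out/dataset_grouped_speaker.csv"))
# → features/dictionary/dataset_grouped_speaker.csv (on POSIX)
```

This is why renaming your analysis-ready CSV changes the feature filenames too: the feature CSV name is derived, not generated.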
## CLI (quick hits)

Anything you can do in Python... well, you can also run from the terminal.

```bash
# Gather text from a CSV (auto-named output if --out omitted)
python -m chopshop.helpers.text_gather \
  --csv transcripts/session.csv \
  --text-col text --group-by speaker --delimiter , --encoding utf-8-sig

# Diarization (wrapper; writes CSV/SRT/TXT under out_dir/<basename>/)
python -m chopshop.audio.diarize_with_thirdparty \
  --audio_path audio/session_a1.wav --out_dir transcripts/ --whisper_model small --device cuda --num_speakers 2

# Whisper embeddings (general audio; nonsilent with mean pool)
python -m chopshop.audio.extract_whisper_embeddings \
  --source_wav audio/session_a1.wav \
  --strategy nonsilent --aggregate mean --output_dir features/whisper_embed/
```
## Installation

A fresh virtual environment is strongly recommended.

```bash
python -m venv venv-chopshop
source venv-chopshop/bin/activate
```

### Quick path (when available)

```bash
pip install "chopshop[diarization,cuda]"
```

Then install the three git extras used by the diarization wrapper:

```bash
pip install git+https://github.com/MahmoudAshraf97/demucs.git
pip install git+https://github.com/oliverguhr/deepmultilingualpunctuation.git
pip install git+https://github.com/MahmoudAshraf97/ctc-forced-aligner.git
```

Install PyTorch built for CUDA 12.4 (the stack ChopShop targets):

```bash
pip install --force-reinstall --no-cache-dir \
  torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 \
  --index-url https://download.pytorch.org/whl/cu124
```

And ensure FFmpeg is on your PATH (Ubuntu: `sudo apt-get install ffmpeg`; macOS: `brew install ffmpeg`).
### Manual stack (same versions, explicit)

```bash
# Core pieces
pip install "faster-whisper>=1.1.0"
pip install "nemo-toolkit[asr]>=2.dev"
pip install git+https://github.com/MahmoudAshraf97/demucs.git
pip install git+https://github.com/oliverguhr/deepmultilingualpunctuation.git
pip install git+https://github.com/MahmoudAshraf97/ctc-forced-aligner.git

# cuDNN user-space libs (CUDA 12)
pip install -U nvidia-cudnn-cu12

# PyTorch for CUDA 12.4
pip install --force-reinstall --no-cache-dir \
  torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 \
  --index-url https://download.pytorch.org/whl/cu124

# Text features
pip install contentcoder archetyper
```
If you hit CUDA/cuDNN loader errors, it usually means the runtime and wheel builds don't match. Keep CUDA 12.4, `cu124` wheels, and cuDNN 9 aligned.
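One quick way to check alignment: PyTorch wheels encode their build in `torch.__version__` as a local version tag after `+` (e.g. `2.6.0+cu124`). A small hedged helper (`wheel_cuda_tag` is an illustrative name, not part of ChopShop or PyTorch) makes the check explicit:

```python
def wheel_cuda_tag(version: str) -> "str | None":
    """Extract the build tag from a PyTorch version string, e.g. '2.6.0+cu124' -> 'cu124'."""
    _, _, tag = version.partition("+")
    return tag or None

# With torch installed, you would pass torch.__version__ here.
assert wheel_cuda_tag("2.6.0+cu124") == "cu124"  # GPU wheel from the cu124 index
assert wheel_cuda_tag("2.6.0+cpu") == "cpu"      # CPU-only wheel: CUDA ops will fail
assert wheel_cuda_tag("2.6.0") is None           # no local build tag
```

If the tag isn't `cu124`, reinstall from the cu124 index shown in the Installation section before chasing any other symptom.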
## Troubleshooting quickies

- **Delimiter or encoding issues when gathering text.** Pass `--delimiter` and `--encoding` explicitly for CSV inputs, just to be safe. If you run into errors, try `--delimiter ,` and `--encoding utf-8-sig` as a starting point.
- **Diarizer ignores `--num_speakers`.** Use the custom entrypoint (enabled by default), which wires `num_speakers` through properly... for now. If needed, pin `min_num_speakers == max_num_speakers == N`.
- **cuDNN / CUDA symbol errors.** Mismatched CUDA/cuDNN vs wheel builds. Reinstall the `cu124` PyTorch wheels and `nvidia-cudnn-cu12`.
- **Embeddings subprocess fails.** Use `device=cpu` to rule out GPU issues, or set `CHOPSHOP_DEBUG=1` to surface more logs.
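On why `utf-8-sig` is a good default for CSV inputs: Excel and some Windows tools prepend a byte-order mark (BOM) that, when read as plain `utf-8`, leaks into the first column name. A minimal standalone demonstration (no ChopShop required):

```python
import csv
import os
import tempfile

# Write a CSV with a UTF-8 BOM, the way Excel often does.
path = os.path.join(tempfile.mkdtemp(), "session.csv")
with open(path, "w", encoding="utf-8-sig", newline="") as f:
    csv.writer(f).writerows([["speaker", "text"], ["A", "hello"]])

# Read with plain utf-8: the BOM sticks to the first header name.
with open(path, encoding="utf-8") as f:
    bad_header = next(csv.reader(f))
print(repr(bad_header[0]))  # → '\ufeffspeaker'

# Read with utf-8-sig: the BOM is stripped and headers are clean.
with open(path, encoding="utf-8-sig") as f:
    good_header = next(csv.reader(f))
print(repr(good_header[0]))  # → 'speaker'
```

The invisible `\ufeff` prefix is why a column named `speaker` can mysteriously fail to match: the header looks right but compares unequal.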
## Credits

- Diarization stack adapted from Mahmoud Ashraf's excellent `whisper-diarization`.
- Dictionaries via ContentCoder-Py; archetypes via `archetypes` (sentence-transformers). Well, okay, I wrote those. But I didn't know at the time that they'd be so handy. So... good job, former me.
## License & status
MIT (see LICENSE). Active WIP; APIs and default paths may (read: will) shift as the project settles — release notes will most likely call out breaking changes.
Happy chopping.
## File details

Details for the file `chopshop-0.0.5.tar.gz`.
File metadata
- Download URL: chopshop-0.0.5.tar.gz
- Upload date:
- Size: 44.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.10
File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `263c9284148be36a644541b2c62ade199c2338e681b2a66e19b91b5995e96c16` |
| MD5 | `c561991cfc9393c972aab7df6135772f` |
| BLAKE2b-256 | `00270b5c9a8d9969ae5d9a218c2ac3946048eebb7e4a7b8853132aab53528db5` |
## File details

Details for the file `chopshop-0.0.5-py3-none-any.whl`.
File metadata
- Download URL: chopshop-0.0.5-py3-none-any.whl
- Upload date:
- Size: 54.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.10
File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `968aa284e69fe7365e7c44a1247204820be73be8f5e218048ffcdd494f76fac5` |
| MD5 | `f398a2c1e2f098c3fa7c1084c3c074dd` |
| BLAKE2b-256 | `db5b413b1d492f42eec273247379273dbdfd32c6357192d2864791dd478ad117` |