Classify audio segments in concert recordings
Concert Scribe
Classify audio in concert recordings into segments of silence, talking, music, and applause.
Takes video or audio files as input, extracts the audio, runs it through Google's YAMNet model, and produces a simple text file describing the timeline.
Example output
0.0-3.36: talking
3.36-33.12: silence
33.12-37.44: applause
37.44-50.4: silence
50.4-108.96: music (Cello)
108.96-118.56: silence
118.56-274.56: music (Cello, Piano)
274.56-285.6: silence
285.6-365.76: music (Cello)
365.76-377.28: applause
377.28-381.6: silence
With --verbose, instrument durations are included:
118.56-274.56: music (Cello: 82.6s, Piano: 15.4s)
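The timeline format above is simple enough to read back programmatically. The following is a minimal sketch, assuming every line follows the `start-end: label (detail)` shape shown in the examples; `Segment`, `parse_line`, and `total_by_label` are hypothetical helper names and are not part of concert-scribe.

```python
import re
from typing import Dict, Iterable, NamedTuple, Optional


class Segment(NamedTuple):
    start: float           # segment start, in seconds
    end: float             # segment end, in seconds
    label: str             # e.g. "silence", "talking", "music", "applause"
    detail: Optional[str]  # text inside the parentheses, if any


# Assumed line shape: "<start>-<end>: <label>" with an optional "(...)" suffix.
_LINE = re.compile(r"^(\d+(?:\.\d+)?)-(\d+(?:\.\d+)?):\s*(\w+)(?:\s*\((.*)\))?$")


def parse_line(line: str) -> Segment:
    """Parse one timeline line into a Segment (hypothetical helper)."""
    match = _LINE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized timeline line: {line!r}")
    start, end, label, detail = match.groups()
    return Segment(float(start), float(end), label, detail)


def total_by_label(lines: Iterable[str]) -> Dict[str, float]:
    """Sum segment durations per label, skipping blank lines."""
    totals: Dict[str, float] = {}
    for line in lines:
        if not line.strip():
            continue
        seg = parse_line(line)
        totals[seg.label] = totals.get(seg.label, 0.0) + (seg.end - seg.start)
    return totals
```

For example, `total_by_label(open("recording.txt"))` would give the number of seconds spent in each category, where `recording.txt` stands in for whichever output file was written.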
Install
pip install concert-scribe
Or with pipx:
pipx install concert-scribe
Requires ffmpeg on the system for audio extraction.
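To check the ffmpeg requirement before running, and to see roughly what a mono 16 kHz extraction looks like, here is a small sketch. The exact command concert-scribe runs is not documented here, so the flags below are an assumption based on standard ffmpeg options (`-vn`, `-ac`, `-ar`); `ffmpeg_available` and `build_extract_command` are hypothetical names.

```python
import shutil
from typing import List


def ffmpeg_available() -> bool:
    """Return True if an ffmpeg executable is found on PATH."""
    return shutil.which("ffmpeg") is not None


def build_extract_command(source: str, target_wav: str) -> List[str]:
    """Build an ffmpeg command for mono, 16 kHz audio extraction.

    Assumed flags, not necessarily what concert-scribe itself uses.
    """
    return [
        "ffmpeg",
        "-i", source,    # input video or audio file
        "-vn",           # drop any video stream
        "-ac", "1",      # downmix to one channel (mono)
        "-ar", "16000",  # resample to 16 kHz
        target_wav,
    ]
```

The resulting list can be passed to `subprocess.run(..., check=True)` once `ffmpeg_available()` returns `True`.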
Usage
# Single file
concert-scribe recording.mp4
# All videos in a directory
concert-scribe /path/to/videos/
# Custom output directory
concert-scribe recording.mp4 -o /path/to/output/
# Include per-instrument durations
concert-scribe recording.mp4 --verbose
How it works
- Extracts audio from video via ffmpeg (mono, 16kHz)
- Classifies each 0.48s frame using YAMNet (521 AudioSet classes mapped to 4 categories)
- Merges adjacent same-category frames into segments
- Filters out short spurious segments (< 1.5s for music/talking, < 2s for silence)
- Deduplicates music sub-types using the AudioSet hierarchy (keeps only the most specific instrument)
- Writes a .txt file per input clip
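The merge and filter steps above can be sketched as follows. This is an illustration under stated assumptions, not the package's actual code: the input is taken to be one category label per 0.48 s frame, no minimum length is applied to applause (none is listed above), and a too-short segment is absorbed into the segment before it, which is only one plausible way to handle dropped segments. `merge_frames` and `filter_short` are hypothetical names.

```python
from itertools import groupby
from typing import Dict, List, Sequence, Tuple

Segment = Tuple[float, float, str]  # (start seconds, end seconds, category)

FRAME_SECONDS = 0.48
# Minimum durations from the list above; applause has none stated (assumed 0).
MIN_SECONDS: Dict[str, float] = {"music": 1.5, "talking": 1.5, "silence": 2.0}


def merge_frames(labels: Sequence[str],
                 frame_seconds: float = FRAME_SECONDS) -> List[Segment]:
    """Collapse runs of identical per-frame labels into segments."""
    segments: List[Segment] = []
    index = 0
    for label, run in groupby(labels):
        count = sum(1 for _ in run)
        start = round(index * frame_seconds, 2)
        end = round((index + count) * frame_seconds, 2)
        segments.append((start, end, label))
        index += count
    return segments


def filter_short(segments: Sequence[Segment],
                 min_seconds: Dict[str, float] = MIN_SECONDS) -> List[Segment]:
    """Absorb too-short segments into the previous one (assumed policy)."""
    kept: List[Segment] = []
    for start, end, label in segments:
        too_short = (end - start) < min_seconds.get(label, 0.0)
        if too_short and kept:
            # Extend the previous segment over the spurious one.
            prev_start, _, prev_label = kept[-1]
            kept[-1] = (prev_start, end, prev_label)
        elif kept and kept[-1][2] == label:
            # Same category on both sides of an absorbed gap: join them.
            kept[-1] = (kept[-1][0], end, label)
        else:
            # Also reached for a short first segment, which is kept as-is.
            kept.append((start, end, label))
    return kept
```

With this policy, a two-frame blip of silence inside a stretch of talking disappears and the talking on either side becomes a single segment.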
License
Apache-2.0