Classify audio segments in concert recordings

Concert Scribe

Classify audio in concert recordings into segments of silence, talking, music, and applause.

Takes video files as input (typically output from SoundGraft), extracts the audio, runs it through Google's YAMNet model, and writes a plain-text file describing the timeline of each clip.

Example output

0.0-3.36: talking
3.36-33.12: silence
33.12-37.44: applause
37.44-50.4: silence
50.4-108.96: music (Cello)
108.96-118.56: silence
118.56-274.56: music (Cello, Piano)
274.56-285.6: silence
285.6-365.76: music (Cello)
365.76-377.28: applause
377.28-381.6: silence

With --verbose, instrument durations are included:

118.56-274.56: music (Cello: 82.6s, Piano: 15.4s)
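
The timeline format shown above is regular enough to read back with a few lines of Python. The following is a minimal sketch, not part of concert-scribe: the `Segment` type and `parse_timeline` function are hypothetical names, and the pattern assumes every line looks like the examples above (`start-end: category`, optionally followed by a parenthesised instrument list).

```python
import re
from typing import List, NamedTuple, Optional


class Segment(NamedTuple):
    start: float           # seconds from the beginning of the clip
    end: float             # seconds from the beginning of the clip
    label: str             # e.g. "silence", "talking", "music", "applause"
    detail: Optional[str]  # text inside the parentheses, if any


# Assumed line shape, based only on the example output:
#   <start>-<end>: <label>[ (<detail>)]
LINE_RE = re.compile(
    r"^(\d+(?:\.\d+)?)-(\d+(?:\.\d+)?):\s*(\w+)(?:\s*\((.*)\))?\s*$"
)


def parse_timeline(text: str) -> List[Segment]:
    """Parse timeline text into a list of Segment tuples."""
    segments = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        match = LINE_RE.match(line)
        if match is None:
            raise ValueError(f"unrecognised timeline line: {line!r}")
        start, end, label, detail = match.groups()
        segments.append(Segment(float(start), float(end), label, detail))
    return segments


if __name__ == "__main__":
    sample = "0.0-3.36: talking\n50.4-108.96: music (Cello)\n"
    for segment in parse_timeline(sample):
        print(segment)
```

With --verbose output, `detail` simply carries the whole parenthesised string (for example `Cello: 82.6s, Piano: 15.4s`); splitting it further is left to the caller.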

Install

pip install concert-scribe

Or with pipx:

pipx install concert-scribe

Requires ffmpeg on the system for audio extraction.
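
If you drive concert-scribe from a script, it can help to check that ffmpeg is on the PATH before starting a long batch. This is a small sketch using only the standard library; `require_tool` is a hypothetical helper, not something the package provides.

```python
import shutil


def require_tool(name: str = "ffmpeg") -> str:
    """Return the full path of an executable, or raise if it is not on PATH."""
    path = shutil.which(name)
    if path is None:
        raise RuntimeError(
            f"{name} not found on PATH; install it before running concert-scribe"
        )
    return path


if __name__ == "__main__":
    print(require_tool("ffmpeg"))
```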

Usage

# Single file
concert-scribe recording.mp4

# All videos in a directory
concert-scribe /path/to/videos/

# Custom output directory
concert-scribe recording.mp4 -o /path/to/output/

# Include per-instrument durations
concert-scribe recording.mp4 --verbose

How it works

  1. Extracts audio from the video via ffmpeg (mono, 16 kHz)
  2. Classifies each 0.48 s frame using YAMNet (its 521 AudioSet classes are mapped to the 4 categories)
  3. Merges adjacent same-category frames into segments
  4. Filters out short spurious segments (shorter than 1.5 s for music/talking, shorter than 2 s for silence)
  5. Deduplicates music sub-types using the AudioSet hierarchy (keeps only the most specific instrument)
  6. Writes a .txt file per input clip
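
Steps 3 and 4 can be illustrated with a short sketch. This is not the package's actual code: it assumes the per-frame category labels have already been produced, all function names are hypothetical, and two behaviours are guesses made for illustration. A too-short segment is folded into the segment before it, and applause has no minimum length because none is stated above.

```python
from itertools import groupby

FRAME_SECONDS = 0.48  # frame length from step 2
# Minimum durations from step 4. Categories not listed here (applause)
# are never filtered in this sketch; that is an assumption.
MIN_SECONDS = {"music": 1.5, "talking": 1.5, "silence": 2.0}


def merge_frames(labels):
    """Collapse runs of identical labels into (first_frame, end_frame, label)."""
    runs, index = [], 0
    for label, group in groupby(labels):
        count = len(list(group))
        runs.append((index, index + count, label))
        index += count
    return runs


def absorb_short_runs(runs, frame_seconds=FRAME_SECONDS, min_seconds=MIN_SECONDS):
    """Fold too-short runs into the previous run (assumed policy).

    A too-short run at the very start has no predecessor and is kept as-is.
    """
    kept = []
    for start, end, label in runs:
        too_short = (end - start) * frame_seconds < min_seconds.get(label, 0.0)
        if too_short and kept:
            # Extend the previous run over the spurious one.
            prev_start, _, prev_label = kept[-1]
            kept[-1] = (prev_start, end, prev_label)
        elif kept and kept[-1][2] == label:
            # Same category as the (possibly extended) previous run: join them.
            kept[-1] = (kept[-1][0], end, label)
        else:
            kept.append((start, end, label))
    return kept


def to_seconds(runs, frame_seconds=FRAME_SECONDS):
    """Convert frame indices to timestamps rounded to two decimals."""
    return [
        (round(start * frame_seconds, 2), round(end * frame_seconds, 2), label)
        for start, end, label in runs
    ]


if __name__ == "__main__":
    frames = ["music"] * 10 + ["silence"] * 2 + ["music"] * 5 + ["applause"] * 3
    for start, end, label in to_seconds(absorb_short_runs(merge_frames(frames))):
        print(f"{start}-{end}: {label}")
```

Working in frame indices and converting to seconds only at the end keeps segment boundaries on exact multiples of the frame length, which matches the timestamps in the example output.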

License

Apache-2.0
