Classify audio segments in concert recordings

Concert Scribe

Classify audio in concert recordings into segments of silence, talking, music, and applause.

Takes video files as input (typically output from SoundGraft), extracts the audio, runs it through Google's YAMNet model, and produces a simple text file describing the timeline.

Example output

0.0-3.36: talking
3.36-33.12: silence
33.12-37.44: applause
37.44-50.4: silence
50.4-108.96: music (Cello)
108.96-118.56: silence
118.56-274.56: music (Cello, Piano)
274.56-285.6: silence
285.6-365.76: music (Cello)
365.76-377.28: applause
377.28-381.6: silence

With --verbose, instrument durations are included:

118.56-274.56: music (Cello: 82.6s, Piano: 15.4s)
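If the timeline needs to be consumed by another script, the line format shown above is simple enough to parse with a regular expression. The following is a minimal sketch, not part of concert-scribe itself: the `Segment` type, `LINE_RE`, and `parse_line` are hypothetical names, and the format is inferred only from the example lines above. It assumes instrument names contain no ", " of their own, which may not hold for every AudioSet label.

```python
import re
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    start: float        # seconds
    end: float          # seconds
    category: str       # e.g. "silence", "talking", "music", "applause"
    instruments: dict   # name -> duration in seconds, or None without --verbose


# Hypothetical pattern, inferred from the example output: "START-END: category (details)"
LINE_RE = re.compile(r"^(\d+(?:\.\d+)?)-(\d+(?:\.\d+)?): (\w+)(?: \((.*)\))?$")


def parse_line(line: str) -> Segment:
    """Parse one timeline line into a Segment; raises ValueError on other input."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised timeline line: {line!r}")
    start, end, category, detail = match.groups()
    instruments: dict[str, Optional[float]] = {}
    for item in (detail.split(", ") if detail else []):
        # Verbose lines look like "Cello: 82.6s"; plain lines are just "Cello".
        name, sep, duration = item.partition(": ")
        instruments[name] = float(duration.rstrip("s")) if sep else None
    return Segment(float(start), float(end), category, instruments)
```

Segments without a parenthesised part (silence, talking, applause) come back with an empty `instruments` mapping.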

Install

pip install concert-scribe

Or with pipx:

pipx install concert-scribe

Requires ffmpeg on the system for audio extraction.
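To confirm that ffmpeg is reachable before running the tool, a check along these lines may help. This is a sketch under assumptions: `ffmpeg_available` and `build_extract_command` are hypothetical helpers, and the command shown is one common way to ask ffmpeg for mono 16 kHz audio, not necessarily the exact invocation concert-scribe uses.

```python
import shutil
from pathlib import Path


def ffmpeg_available() -> bool:
    """Return True if an ffmpeg executable can be found on PATH."""
    return shutil.which("ffmpeg") is not None


def build_extract_command(video: Path, wav: Path) -> list[str]:
    """Build an ffmpeg command that writes the audio track as mono 16 kHz.

    Assumed flags (standard ffmpeg options, not confirmed for this project):
    -y overwrite output, -vn drop video, -ac 1 mono, -ar 16000 sample rate.
    """
    return ["ffmpeg", "-y", "-i", str(video), "-vn", "-ac", "1", "-ar", "16000", str(wav)]
```

The returned list can be passed to `subprocess.run(..., check=True)` once `ffmpeg_available()` reports that the binary is present.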

Usage

# Single file
concert-scribe recording.mp4

# All videos in a directory
concert-scribe /path/to/videos/

# Custom output directory
concert-scribe recording.mp4 -o /path/to/output/

# Include per-instrument durations
concert-scribe recording.mp4 --verbose

How it works

  1. Extracts audio from video via ffmpeg (mono, 16 kHz)
  2. Classifies each 0.48 s frame using YAMNet (521 AudioSet classes mapped to 4 categories)
  3. Merges adjacent same-category frames into segments
  4. Filters out short spurious segments (< 1.5 s for music/talking, < 2 s for silence)
  5. Deduplicates music sub-types using the AudioSet hierarchy (keeps only the most specific instrument)
  6. Writes a .txt file per input clip
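Steps 3 to 5 above can be sketched in a few small functions. This is an illustration under stated assumptions rather than the project's actual code: all function names are hypothetical, the policy of folding a too-short run into the run before it is a guess at what "filters out" means, applause is given no minimum because none is stated, and `PARENT` is only a toy fragment of the AudioSet hierarchy.

```python
from itertools import groupby

FRAME_SECONDS = 0.48
# Minimum durations from step 4; no minimum is stated for applause.
MIN_SECONDS = {"music": 1.5, "talking": 1.5, "silence": 2.0}


def merge_frames(labels):
    """Step 3: collapse consecutive identical labels into (start_frame, end_frame, label)."""
    runs, index = [], 0
    for label, group in groupby(labels):
        count = len(list(group))
        runs.append((index, index + count, label))
        index += count
    return runs


def drop_short(runs):
    """Step 4 (assumed policy): a too-short run is absorbed by the run before it.

    Runs that end up adjacent with the same label are merged again.
    A too-short run at the very start has no predecessor and is kept as-is.
    """
    kept = []
    for start, end, label in runs:
        too_short = (end - start) * FRAME_SECONDS < MIN_SECONDS.get(label, 0.0)
        if kept and (too_short or kept[-1][2] == label):
            kept[-1] = (kept[-1][0], end, kept[-1][2])
        else:
            kept.append((start, end, label))
    return kept


def to_seconds(runs):
    """Convert frame indices to timestamps in seconds."""
    return [(round(s * FRAME_SECONDS, 2), round(e * FRAME_SECONDS, 2), label)
            for s, e, label in runs]


# Toy child -> parent fragment of the AudioSet hierarchy, for illustration only.
PARENT = {
    "Cello": "Bowed string instrument",
    "Bowed string instrument": "Musical instrument",
    "Piano": "Keyboard (musical)",
    "Keyboard (musical)": "Musical instrument",
}


def ancestors(label):
    """Return every label above the given one in the toy hierarchy."""
    found = set()
    while label in PARENT:
        label = PARENT[label]
        found.add(label)
    return found


def most_specific(labels):
    """Step 5: drop any label that is an ancestor of another detected label."""
    covered = set().union(*(ancestors(label) for label in labels))
    return [label for label in labels if label not in covered]
```

For example, a two-frame burst of "music" (0.96 s) between two stretches of silence would be folded into the surrounding silence, and a segment tagged with both "Musical instrument" and "Cello" would keep only "Cello".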

License

Apache-2.0
