Music structural segmentation for the Zigify pipeline (MSAF olda + scluster)

zigify-msaf

Music structural segmentation for the Zigify pipeline. A thin CLI wrapper around MSAF that pins to the olda boundary detector and scluster labeler — the combination that scored best in evaluation against hand-annotated ground truth.

Install

uvx zigify-msaf <audio>          # ephemeral, recommended
uv pip install zigify-msaf       # into a project

uvx resolves and caches a dedicated environment on first run; subsequent calls reuse the cached environment and start in ~100 ms.

Use

zigify-msaf path/to/track.mp3
zigify-msaf track.mp3 --out track.segments.json
zigify-msaf track.mp3 --feature mfcc --verbose
zigify-msaf track.mp3 --bpm 118        # skip detection, force a known tempo

stdout is newline-delimited JSON: progress events first, then a single final result line. stderr carries human-readable logs from msaf/librosa (silenced by default; pass --verbose to surface them).
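
When the CLI is driven from another program, the flags shown above can be assembled from an options object. The following is a minimal sketch: the CliOptions type and the buildArgs helper are hypothetical names introduced for illustration, and only the flags that appear in the usage examples are used.

```typescript
// Hypothetical options object; each field maps to one flag from the usage examples.
interface CliOptions {
  out?: string       // --out <path>
  feature?: string   // --feature <name>, e.g. 'mfcc'
  bpm?: number       // --bpm <tempo>
  verbose?: boolean  // --verbose
}

// Build the argument list for `zigify-msaf <audio> [flags]`.
function buildArgs(audioPath: string, opts: CliOptions = {}): string[] {
  const args = [audioPath]
  if (opts.out !== undefined) args.push('--out', opts.out)
  if (opts.feature !== undefined) args.push('--feature', opts.feature)
  if (opts.bpm !== undefined) args.push('--bpm', String(opts.bpm))
  if (opts.verbose) args.push('--verbose')
  return args
}

console.log(buildArgs('track.mp3', { feature: 'mfcc', verbose: true }).join(' '))
```

The returned array can be passed after 'zigify-msaf' in a spawn call, which avoids shell quoting issues with paths that contain spaces.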

Output schema

Each line on stdout is one JSON object. The shape of the final result line (pretty-printed here, with the beats and accents arrays abbreviated):

{
  "type": "result",
  "source": "path/to/track.mp3",
  "duration": 357.98,
  "tempo": 117.45,
  "tempoPrior": 117.45,
  "bpmOverride": null,
  "beatCount": 700,
  "accentCount": 88,
  "feature": "pcp",
  "boundaryAlgo": "olda",
  "labelAlgo": "scluster",
  "nSegments": 12,
  "nClusters": 5,
  "loudness": -18.4,
  "peakLoudness": -3.1,
  "segments": [
    {
      "start": 0.0,
      "end": 18.1,
      "duration": 18.1,
      "cluster": "S4",
      "beats": [0.51, 1.02, 1.53, 2.04],
      "beatCount": 36,
      "accents": [0.51, 4.6, 9.2, 13.8],
      "accentCount": 4,
      "topAccent": 9.2,
      "onsetCount": 22,
      "onsetRate": 1.215,
      "loudness": -23.7,
      "peakLoudness": -8.9,
      "dynamicRange": 14.8,
      "energy": 0.21,
      "brightness": 1840.5
    }
  ],
  "elapsed": 12.3
}
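
For typed consumers, the result line above can be described with TypeScript interfaces. This is a sketch transcribed from the example, not a published type definition: whether any field other than bpmOverride can be null or absent in practice is not stated, so treating them all as required is an assumption, and isResult is a hypothetical helper.

```typescript
interface Segment {
  start: number; end: number; duration: number   // seconds
  cluster: string                                 // e.g. "S4"
  beats: number[]; beatCount: number
  accents: number[]; accentCount: number; topAccent: number
  onsetCount: number; onsetRate: number
  loudness: number; peakLoudness: number; dynamicRange: number
  energy: number; brightness: number
}

interface SegmentResult {
  type: 'result'
  source: string
  duration: number
  tempo: number
  tempoPrior: number
  bpmOverride: number | null
  beatCount: number
  accentCount: number
  feature: string
  boundaryAlgo: string
  labelAlgo: string
  nSegments: number
  nClusters: number
  loudness: number
  peakLoudness: number
  segments: Segment[]
  elapsed: number
}

// Narrowing check for one parsed stdout line (checks only the discriminant
// and the presence of a segments array, not every field).
function isResult(value: unknown): value is SegmentResult {
  if (typeof value !== 'object' || value === null) return false
  const v = value as { type?: unknown; segments?: unknown }
  return v.type === 'result' && Array.isArray(v.segments)
}
```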

Per-segment fields beyond start/end/cluster describe musical character useful for downstream light-show or visualization generation:

Field | Meaning
--- | ---
beats, beatCount | Beat timestamps (s, absolute) inside the segment, from librosa.beat.beat_track.
accents, accentCount, topAccent | Strong onsets (top-quartile of onset-strength envelope): the "hits" to flash on. topAccent is the loudest one in the segment.
onsetCount, onsetRate | All detected onsets and their density (events / sec); distinguishes calm sections from busy ones.
loudness, peakLoudness, dynamicRange | Mean / peak RMS in dBFS, and their difference.
energy | 0..1 loudness normalized to the loudest segment in the track (peakLoudness − 30 dB ↦ 0, peakLoudness ↦ 1). Suitable for direct mapping to brightness/intensity.
brightness | Mean spectral centroid in Hz; higher = brighter / more high-frequency content.
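
The relationships described in the table can be checked with a few lines of arithmetic. The sketch below recomputes onsetRate and dynamicRange from the example segment and shows a linear dB-to-0..1 mapping of the kind described for energy. The function names are illustrative, and the reference level is left as a parameter because the table does not pin down exactly which measurement the tool normalizes against.

```typescript
// Onset density in events per second.
function onsetRate(onsetCount: number, durationSec: number): number {
  return durationSec > 0 ? onsetCount / durationSec : 0
}

// Difference between peak and mean RMS level, in dB.
function dynamicRange(loudnessDb: number, peakLoudnessDb: number): number {
  return peakLoudnessDb - loudnessDb
}

// Linear map: (referenceDb - rangeDb) -> 0, referenceDb -> 1, clamped to 0..1.
// Assumption: referenceDb is whatever level the caller treats as "loudest".
function energyFromDb(loudnessDb: number, referenceDb: number, rangeDb = 30): number {
  const x = (loudnessDb - (referenceDb - rangeDb)) / rangeDb
  return Math.min(1, Math.max(0, x))
}

// Values from the example segment above.
console.log(onsetRate(22, 18.1).toFixed(3))        // "1.215"
console.log(dynamicRange(-23.7, -8.9).toFixed(1))  // "14.8"
```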

The progress lines emitted before the result look like:

{"type":"stage","name":"loading","message":"reading track.mp3"}
{"type":"stage","name":"onsets","message":"computing onset strength envelope"}
{"type":"stage","name":"tempo","message":"estimating bpm and beats"}
{"type":"stage","name":"accents","message":"detecting onsets and accents"}
{"type":"stage","name":"energy","message":"computing loudness and brightness"}
{"type":"stage","name":"features","message":"extracting pcp"}
{"type":"stage","name":"boundaries","message":"olda"}
{"type":"stage","name":"labels","message":"scluster"}

On failure the tool emits a single {"type": "error", ...} line and exits non-zero.
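
The three line types can be modeled as a discriminated union so that a consumer handles each case explicitly. This is a sketch: the stage fields follow the examples above, the message field on error lines is taken from the Node example in this document (the error payload is otherwise elided), and parseEvent is a hypothetical helper.

```typescript
type StageEvent = { type: 'stage'; name: string; message?: string }
type ErrorEvent = { type: 'error'; message?: string }
// Only the discriminant is typed here; the full result shape is in "Output schema".
type ResultEvent = { type: 'result'; [key: string]: unknown }
type MsafEvent = StageEvent | ErrorEvent | ResultEvent

// Parse one stdout line; returns undefined for blank lines or unknown types.
function parseEvent(line: string): MsafEvent | undefined {
  const text = line.trim()
  if (text === '') return undefined
  const obj: unknown = JSON.parse(text)
  if (typeof obj !== 'object' || obj === null) return undefined
  const type = (obj as { type?: unknown }).type
  if (type === 'stage' || type === 'error' || type === 'result') {
    return obj as MsafEvent
  }
  return undefined
}

const evt = parseEvent('{"type":"stage","name":"tempo","message":"estimating bpm and beats"}')
if (evt?.type === 'stage') console.log(`[${evt.name}] ${evt.message ?? ''}`)
```

Ignoring unknown types rather than throwing is a design choice that keeps a consumer working if later versions add new progress events.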

Calling from Node / TypeScript

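import { spawn } from 'node:child_process'
import { createInterface } from 'node:readline'

// Minimal result shape; see "Output schema" for the full field list.
interface SegmentResult { type: 'result'; duration: number; tempo: number; segments: unknown[] }

async function segment(audioPath: string): Promise<SegmentResult> {
  const proc = spawn('uvx', ['zigify-msaf', audioPath], { stdio: ['ignore', 'pipe', 'pipe'] })
  proc.stderr.resume() // drain stderr so a full pipe cannot stall the child
  const rl = createInterface({ input: proc.stdout })

  let result: SegmentResult | undefined
  for await (const line of rl) {
    if (!line.trim()) continue // skip blank lines
    const evt = JSON.parse(line)
    if (evt.type === 'stage') console.log(`[${evt.name}] ${evt.message ?? ''}`)
    if (evt.type === 'result') result = evt
    if (evt.type === 'error') throw new Error(evt.message)
  }
  if (!result) throw new Error('zigify-msaf produced no result line')
  return result
}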
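
Once a result has been captured, the per-segment fields can drive a light show directly. The sketch below turns accents into flash cues whose intensity follows the segment's energy. The Cue shape, the accentCues helper, and the choice to give the top accent full intensity are illustrative assumptions, not part of the tool; accent timestamps are assumed to be absolute, as documented for beats.

```typescript
// Only the segment fields this sketch reads.
interface CueSegment {
  accents: number[]   // timestamps in seconds (assumed absolute, like beats)
  topAccent: number
  energy: number      // 0..1
}

interface Cue { time: number; intensity: number }

function accentCues(segments: CueSegment[]): Cue[] {
  const cues: Cue[] = []
  for (const seg of segments) {
    for (const t of seg.accents) {
      // Assumption: the strongest accent in a segment flashes at full intensity,
      // all other accents at the segment's energy level.
      const intensity = t === seg.topAccent ? 1 : seg.energy
      cues.push({ time: t, intensity })
    }
  }
  // Return cues in playback order regardless of segment order.
  return cues.sort((a, b) => a.time - b.time)
}

const cues = accentCues([{ accents: [0.51, 4.6, 9.2, 13.8], topAccent: 9.2, energy: 0.21 }])
console.log(cues)
```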

Algorithm choice

Evaluated against a 14-boundary ground truth on Michael Jackson — Thriller (tolerance ±5 s):

Boundary algo | Hits | Miss | Spurious
--- | --- | --- | ---
olda | 9 | 4 | 2
foote | 9 | 4 | 6
sf (default) | 6 | 7 | 5
cnmf | 5 | 8 | 2
scluster | 4 | 9 | 10
vmo | 12 | 1 | 457

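olda has the best precision (9 hits against 2 spurious boundaries) and ties foote for recall. vmo recovers more boundaries (12 hits), but at the cost of 457 spurious ones. scluster labels group the segments into ~5 clusters that align with verse/chorus/outro structure on test tracks.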
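
The comparison can be made explicit by computing precision and recall from the table. This is a worked sketch using the usual definitions (precision = hits / (hits + spurious), recall = hits / (hits + misses), F1 as their harmonic mean); the README does not state which formulas the evaluation used, so the definitions are an assumption.

```typescript
interface Row { algo: string; hits: number; miss: number; spurious: number }

// Counts copied from the table above.
const rows: Row[] = [
  { algo: 'olda', hits: 9, miss: 4, spurious: 2 },
  { algo: 'foote', hits: 9, miss: 4, spurious: 6 },
  { algo: 'sf', hits: 6, miss: 7, spurious: 5 },
  { algo: 'cnmf', hits: 5, miss: 8, spurious: 2 },
  { algo: 'scluster', hits: 4, miss: 9, spurious: 10 },
  { algo: 'vmo', hits: 12, miss: 1, spurious: 457 },
]

function score(r: Row) {
  const precision = r.hits / (r.hits + r.spurious)
  const recall = r.hits / (r.hits + r.miss)
  const f1 = (2 * precision * recall) / (precision + recall)
  return { algo: r.algo, precision, recall, f1 }
}

for (const s of rows.map(score)) {
  console.log(`${s.algo.padEnd(9)} P=${s.precision.toFixed(2)} R=${s.recall.toFixed(2)} F1=${s.f1.toFixed(2)}`)
}
```

Under these definitions olda comes out at roughly P = 0.82, R = 0.69, F1 = 0.75, while vmo reaches R = 0.92 with P = 0.03.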

License

MIT — see LICENSE.
