CoreML/ANE port of the CREPE pitch estimator with a torchcrepe-compatible API

coremlcrepe

A vibe-coded CoreML / Apple Neural Engine (ANE) port of the CREPE pitch estimator, exposing a torch-free, torchcrepe-compatible API. At inference time it only needs numpy + coremltools — no PyTorch.

  • 🎯 Drop-in-style API: coremlcrepe.predict(...), coremlcrepe.decode, coremlcrepe.filter, coremlcrepe.threshold, coremlcrepe.convert, ...
  • ⚡ Runs on the ANE via a float16 mlprogram (about 12–13× faster per frame than Torch CPU for the full model at batch sizes of 10 and above; see Benchmarks).
  • ✅ Validated against the original torchcrepe model (0.74 cents mean F0 error on the full model; see Validation vs torchcrepe).

Model I/O

Input   frames         (batch, 1024)  float32  16 kHz audio windows, per-frame mean-centered and normalized to unit std
Output  probabilities  (batch, 360)   float32  sigmoid probabilities over 360 pitch bins (20 cents/bin)
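
The input layout above can be illustrated with a small numpy sketch. This is not the library's own preprocess code: the 160-sample hop (10 ms at 16 kHz, the hop quoted in the benchmarks), the half-window zero padding, and the helper name frame_and_normalize are assumptions made for illustration.

```python
import numpy as np

SAMPLE_RATE = 16000   # model input rate stated in the table above
WINDOW_SIZE = 1024    # samples per frame, from the input shape (batch, 1024)


def frame_and_normalize(audio, hop_length=160):
    """Slice 16 kHz mono audio into normalized (n_frames, 1024) float32 frames.

    hop_length=160 (10 ms) and the centered zero padding are assumptions.
    """
    audio = np.asarray(audio, dtype=np.float32)
    # Pad half a window on each side so every frame is centered on its hop.
    padded = np.pad(audio, WINDOW_SIZE // 2)
    n_frames = 1 + len(audio) // hop_length
    starts = np.arange(n_frames) * hop_length
    frames = np.stack([padded[s:s + WINDOW_SIZE] for s in starts])
    # Per-frame mean-centering, then scaling to unit standard deviation.
    frames = frames - frames.mean(axis=1, keepdims=True)
    # The small floor avoids dividing by zero on silent frames.
    frames = frames / np.maximum(frames.std(axis=1, keepdims=True), 1e-10)
    return frames.astype(np.float32)
```

One second of 16 kHz audio yields 101 frames with these settings, each with zero mean and unit standard deviation.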

Project layout

coremlcrepe/            The library (torch-free, mirrors torchcrepe)
├── __init__.py         Public API + constants
├── core.py             predict / preprocess / infer / postprocess / *_from_file
├── model.py            CoreML MLModel wrapper (Crepe)
├── decode.py           argmax / weighted_argmax / viterbi
├── convert.py          bins <-> cents <-> frequency conversions
├── filter.py           mean / median (+ NaN-aware) sequence filters
├── threshold.py        At / Hysteresis / Silence thresholding
├── load.py             model + audio loading
├── loudness.py         A-weighted perceptual loudness
└── assets/             full.mlpackage, tiny.mlpackage  (the CoreML models)

convert_crepe.py        Torch -> CoreML conversion script
validate_and_benchmark.py  Validation + latency benchmark vs torchcrepe
examples/               Runnable usage examples
tests/                  pytest suite (torch-free + optional parity tests)
pyproject.toml          Package metadata (pip install -e .)

Install

# Python 3.9–3.13 recommended.
python3 -m venv .venv && source .venv/bin/activate

# Runtime only (numpy + coremltools) + the package:
pip install -e .

# Optional extras:
pip install -e ".[audio]"    # soundfile, to read audio files
pip install -e ".[dsp]"      # librosa + resampy (viterbi/loudness/hi-q resample)
pip install -e ".[convert]"  # torch + torchcrepe, to (re)build the .mlpackage
pip install -e ".[test]"     # pytest

The converted models are shipped in coremlcrepe/assets/. To rebuild them:

python convert_crepe.py --capacity full   # -> coremlcrepe/assets/full.mlpackage
python convert_crepe.py --capacity tiny   # -> coremlcrepe/assets/tiny.mlpackage
# Most ANE-friendly: bake a static batch size
python convert_crepe.py --capacity full --fixed-batch 100

Usage

The API mirrors torchcrepe, using numpy arrays of shape (1, time):

import numpy as np
import coremlcrepe

# Load audio (needs the `audio` extra) or bring your own numpy array.
audio, sr = coremlcrepe.load.audio("audio.wav")

# Predict pitch (Hz) and periodicity (confidence).
pitch, periodicity = coremlcrepe.predict(
    audio, sr,
    fmin=50., fmax=550.,
    model="full",                 # or "tiny"
    decoder=coremlcrepe.decode.weighted_argmax,
    return_periodicity=True,
)
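
If you bring your own array instead of loading a file, it needs the (1, time) shape mentioned above. The sketch below builds a synthetic test tone in that shape; make_tone is a hypothetical helper, not part of coremlcrepe, and the predict call is left as a comment so the snippet runs with numpy alone.

```python
import numpy as np


def make_tone(frequency=440.0, duration=1.0, sample_rate=16000):
    """Return a float32 sine tone shaped (1, time) and its sample rate."""
    t = np.arange(int(duration * sample_rate)) / sample_rate
    tone = 0.5 * np.sin(2 * np.pi * frequency * t)
    # Add the leading axis so the array has shape (1, time).
    return tone.astype(np.float32)[None, :], sample_rate


audio, sr = make_tone()
# With the package installed, this array can be passed to predict as above:
# pitch, periodicity = coremlcrepe.predict(audio, sr, return_periodicity=True)
```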

Recommended cleanup pipeline (matches torchcrepe)

# Remove periodicity in silent regions (needs the `dsp` extra for loudness).
periodicity = coremlcrepe.threshold.Silence(-60.)(periodicity, audio, sr)

# Mark low-confidence frames unvoiced (NaN).
pitch = coremlcrepe.threshold.At(0.21)(pitch, periodicity)

# Smooth.
pitch = coremlcrepe.filter.median(pitch, 3)
periodicity = coremlcrepe.filter.mean(periodicity, 3)
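
To make the thresholding and smoothing steps concrete, here is a rough numpy sketch of what such a pipeline does. It is not the library's implementation: threshold_at and nan_median are hypothetical names, and keeping unvoiced frames unvoiced after filtering is a choice made in this sketch that may differ from coremlcrepe.filter.median.

```python
import warnings

import numpy as np


def threshold_at(pitch, periodicity, value=0.21):
    """Set frames whose periodicity is below `value` to NaN (unvoiced)."""
    pitch = np.array(pitch, dtype=np.float64)  # copy, so the input is untouched
    pitch[np.asarray(periodicity) < value] = np.nan
    return pitch


def nan_median(pitch, win_length=3):
    """Median-filter a (1, time) array along time, ignoring NaNs per window."""
    pitch = np.asarray(pitch, dtype=np.float64)
    pad = win_length // 2
    # Pad with NaN so edge windows only see real samples.
    padded = np.pad(pitch, ((0, 0), (pad, pad)), constant_values=np.nan)
    windows = np.lib.stride_tricks.sliding_window_view(
        padded, win_length, axis=-1)
    with warnings.catch_warnings():
        # All-NaN windows legitimately produce NaN; silence the warning.
        warnings.simplefilter("ignore", RuntimeWarning)
        smoothed = np.nanmedian(windows, axis=-1)
    # Sketch-specific choice: frames that were unvoiced stay unvoiced.
    smoothed[np.isnan(pitch)] = np.nan
    return smoothed
```

Thresholding before filtering matters: a low-confidence outlier is removed first, so it cannot pull the median of its neighbours.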

Choosing a decoder

decoder                             notes                      needs librosa?
coremlcrepe.decode.weighted_argmax  default, sub-bin accurate  no
coremlcrepe.decode.argmax           fastest, bin-quantized     no
coremlcrepe.decode.viterbi          temporally smooth path     librosa if installed, else pure-numpy fallback
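
The difference between "bin-quantized" and "sub-bin accurate" can be sketched in numpy. The bin-to-cents offset and the 10 Hz reference below are the constants used by torchcrepe and are assumed to carry over to this port; the ±4-bin averaging window and the function names are likewise assumptions, not a copy of coremlcrepe.decode.

```python
import numpy as np

CENTS_PER_BIN = 20                  # from the Model I/O section
CENTS_OFFSET = 1997.3794084376191   # cents of bin 0 above 10 Hz (torchcrepe)


def bins_to_frequency(bins):
    """Map (possibly fractional) bin indices to Hz."""
    cents = CENTS_PER_BIN * np.asarray(bins, dtype=np.float64) + CENTS_OFFSET
    return 10.0 * 2.0 ** (cents / 1200.0)


def decode_argmax(probabilities):
    """Pick the most likely bin per frame: fast, quantized to 20 cents."""
    return bins_to_frequency(np.argmax(probabilities, axis=-1))


def decode_weighted_argmax(probabilities, radius=4):
    """Probability-weighted mean of cents in a window around the peak bin."""
    probabilities = np.asarray(probabilities, dtype=np.float64)
    idx = np.arange(probabilities.shape[-1])
    peak = np.argmax(probabilities, axis=-1)
    # Keep only bins within `radius` of each frame's peak.
    mask = np.abs(idx[None, :] - peak[:, None]) <= radius
    weights = probabilities * mask
    cents = CENTS_PER_BIN * idx + CENTS_OFFSET
    mean_cents = (weights * cents).sum(axis=-1) / weights.sum(axis=-1)
    return 10.0 * 2.0 ** (mean_cents / 1200.0)
```

Both functions take the (batch, 360) probabilities described under Model I/O and return one frequency per frame. When the probability mass is symmetric around the peak the two agree; otherwise the weighted version lands between bins.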

Selecting compute units

# Force a specific backend when loading. CPU_AND_NE restricts execution to CPU + ANE;
# ALL also allows the GPU and still lets the ANE be used.
coremlcrepe.load.model("full", compute_units="CPU_AND_NE")

Examples

python examples/01_basic_prediction.py          # synthetic tone
python examples/02_from_file_with_cleanup.py a.wav   # file + threshold + filter
python examples/03_pitch_sweep.py               # track a glissando, compare decoders

Tests

pytest                # torch-free core tests always run
                      # torchcrepe parity tests run only if torch is installed

The suite covers unit conversions, decoders, filters/thresholds, the end-to-end prediction pipeline, and — when torch is available — numerical parity with the original torchcrepe model.

Audio corpus suite

tests/audio_corpus.py deterministically generates a diverse, labeled corpus (73 cases): notes across octaves in several timbres (sine / harmonic / saw / square), vibrato, glissando, and harmonic tones at decreasing SNR. tests/test_audio_corpus.py runs prediction over the whole corpus and asserts per-case and per-category accuracy. Averaged over all cases, the per-case median error is ~2 cents.

Get a standalone accuracy report (and optionally export the corpus to wav):

python tests/corpus_report.py                       # full model, weighted_argmax
python tests/corpus_report.py --model tiny --decoder viterbi --verbose
python tests/corpus_report.py --export tests/audio/generated

Drop your own .wav / .flac files into tests/audio/real/ to include them automatically; encode an expected fundamental as _<f0>hz in the filename (e.g. violin_440hz.wav) to also assert accuracy.
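
The filename convention can be illustrated with a short parser. This is a guess at how such a label might be read, not the test suite's actual code: the regular expression, the acceptance of decimal values, and the name expected_f0 are all assumptions.

```python
import re
from pathlib import Path

# Matches a trailing "_<number>hz" in the file stem, e.g. "violin_440hz".
_F0_PATTERN = re.compile(r"_(\d+(?:\.\d+)?)hz$", re.IGNORECASE)


def expected_f0(path):
    """Return the fundamental encoded in the filename in Hz, or None."""
    match = _F0_PATTERN.search(Path(path).stem)
    return float(match.group(1)) if match else None
```

Files without the suffix return None, which corresponds to being included in the run without an accuracy assertion.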

Validation vs torchcrepe

CoreML (float16, ANE) vs the original Torch model on harmonic sine waves:

Model  max |Δprob| (CoreML vs Torch)  mean |F0 error|
full   3.4e-04                        0.74 cents
tiny   7.9e-03                        2.54 cents

Decoder parity vs torchcrepe (median |Δ| in cents): weighted_argmax ≈ 3.3, argmax ≈ 6.3, viterbi ≈ 6.0.
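
For reference, errors in cents are computed on a logarithmic frequency scale, 1200 cents per octave. The sketch below shows the standard formula and a NaN-aware summary; the helper names are hypothetical and this is not the code in validate_and_benchmark.py.

```python
import numpy as np


def cents_error(predicted_hz, reference_hz):
    """Absolute pitch error in cents: 1200 * |log2(predicted / reference)|."""
    predicted_hz = np.asarray(predicted_hz, dtype=np.float64)
    reference_hz = np.asarray(reference_hz, dtype=np.float64)
    return 1200.0 * np.abs(np.log2(predicted_hz / reference_hz))


def summarize(predicted_hz, reference_hz):
    """Mean and median error over voiced frames (NaN frames are skipped)."""
    err = cents_error(predicted_hz, reference_hz)
    return {"mean": float(np.nanmean(err)), "median": float(np.nanmedian(err))}
```

An octave error is 1200 cents and an equal-tempered semitone is 100 cents, so the sub-cent to few-cent figures above are far below a semitone.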

Benchmarks (Apple Silicon)

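Per-frame time is the compute time for one frame, which covers one 10 ms hop of audio. "×RT" is the realtime factor (10 ms divided by the per-frame time); values above 1 mean faster than realtime.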
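
The arithmetic behind the ×RT columns is simple enough to sketch. The function names are illustrative only; the inputs in the usage note are per-frame times taken from the full-model table below.

```python
HOP_MS = 10.0  # one frame advances the audio by one 10 ms hop


def realtime_factor(per_frame_ms, hop_ms=HOP_MS):
    """How many times faster than realtime a given per-frame time is."""
    return hop_ms / per_frame_ms


def speedup(torch_ms, coreml_ms):
    """Ratio of Torch CPU per-frame time to CoreML per-frame time."""
    return torch_ms / coreml_ms
```

For example, realtime_factor(2.99) rounds to 3.3 and speedup(3.99, 0.30) rounds to 13.3. Because the published per-frame times are themselves rounded, recomputing ×RT from them can differ from the table in the last digit.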

Full model

batch  CoreML per-frame  CoreML ×RT  Torch (CPU) per-frame  Torch ×RT
1      2.99 ms           3.3×        6.84 ms                1.5×
10     0.49 ms           20.6×       5.73 ms                1.7×
100    0.30 ms           33.4×       3.99 ms                2.5×
500    0.38 ms           26.3×       4.82 ms                2.1×

Tiny model

batch  CoreML per-frame  CoreML ×RT  Torch (CPU) per-frame  Torch ×RT
1      0.32 ms           30.9×       3.04 ms                3.3×
10     0.19 ms           52.1×       0.86 ms                11.6×
100    0.16 ms           63.8×       0.51 ms                19.6×
500    0.15 ms           65.4×       0.44 ms                22.6×

Single-frame latency (batch 1): full ≈ 3.0 ms, tiny ≈ 0.32 ms. Batch as many frames as your latency budget allows for the best throughput. Regenerate these numbers with:

python validate_and_benchmark.py --capacity full
python validate_and_benchmark.py --capacity tiny

ANE notes

  • Exported as an mlprogram with float16 precision and ComputeUnit.ALL, which lets the runtime schedule work on the ANE.
  • A fixed batch size (--fixed-batch) is the most ANE-friendly layout; flexible RangeDim shapes may fall back to GPU/CPU for some ops.
  • Verify actual placement by opening the .mlpackage in Xcode and generating a Performance report, or by profiling with the Core ML template in Instruments.
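
With a model exported using --fixed-batch, every call has to supply exactly that many frames. One way to feed an arbitrary number of frames is to zero-pad the last chunk and discard the padded outputs afterwards. The sketch below shows only that chunking step; whether coremlcrepe already does this internally is not stated here, and fixed_batches is a hypothetical helper.

```python
import numpy as np


def fixed_batches(frames, batch_size=100):
    """Yield (chunk, n_valid) pairs where every chunk has exactly batch_size rows.

    The final chunk is zero-padded; n_valid says how many rows are real.
    """
    frames = np.asarray(frames, dtype=np.float32)
    for start in range(0, frames.shape[0], batch_size):
        chunk = frames[start:start + batch_size]
        n_valid = chunk.shape[0]
        if n_valid < batch_size:
            pad = np.zeros((batch_size - n_valid, frames.shape[1]),
                           dtype=np.float32)
            chunk = np.concatenate([chunk, pad])
        yield chunk, n_valid
```

After running the model on each chunk, keeping only the first n_valid rows of the output restores the original frame count.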

Differences from torchcrepe

  • Torch-free at runtime — arrays are numpy, the model runs via CoreML.
  • embed() is not supported — the CoreML model outputs pitch-bin probabilities only. Re-export with an embedding output if you need it.
  • weighted_argmax is the default decoder (torchcrepe defaults to viterbi), so no librosa is required out of the box. Pass decoder=coremlcrepe.decode.viterbi to match torchcrepe's default.
  • Unvoiced pitch is represented as NaN (coremlcrepe.UNVOICED).
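
Because unvoiced frames are NaN, downstream code needs NaN-aware handling: a comparison such as pitch == UNVOICED is never true, since NaN does not equal itself. A minimal numpy sketch, with voiced_stats as a hypothetical helper and a local UNVOICED standing in for coremlcrepe.UNVOICED:

```python
import numpy as np

UNVOICED = np.nan  # stand-in for coremlcrepe.UNVOICED as described above


def voiced_stats(pitch):
    """Return the voiced fraction and the median pitch of voiced frames."""
    pitch = np.asarray(pitch, dtype=np.float64)
    voiced = ~np.isnan(pitch)  # use isnan, not ==, to detect unvoiced frames
    median_hz = float(np.median(pitch[voiced])) if voiced.any() else UNVOICED
    return {"voiced_fraction": float(voiced.mean()), "median_hz": median_hz}
```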
