CoreML/ANE port of the CREPE pitch estimator with a torchcrepe-compatible API
coremlcrepe
A vibe-coded CoreML / Apple Neural Engine (ANE) port of the CREPE pitch estimator, exposing a torch-free, torchcrepe-compatible API. At inference time it only needs numpy + coremltools; PyTorch is not required.
- 🎯 Drop-in-style API: coremlcrepe.predict(...), coremlcrepe.decode, coremlcrepe.filter, coremlcrepe.threshold, coremlcrepe.convert, ...
- ⚡ Runs on the ANE via a float16 mlprogram (10–13× faster per frame than Torch CPU for the full model at batch sizes of 10 or more).
- ✅ Validated against the original torchcrepe model (< 1 cent median F0 error on the full model).
Model I/O
| Tensor | Shape and contents |
|---|---|
| Input frames | (batch, 1024) float32: 16 kHz audio windows, per-frame mean-centered and unit-std normalized |
| Output probabilities | (batch, 360) float32: sigmoid probabilities over 360 pitch bins (20 cents/bin) |
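To illustrate the input contract in the table above, the following numpy sketch slices a mono 16 kHz signal into 1024-sample windows and normalizes each one. The function name frame_and_normalize, the 160-sample (10 ms) hop, the zero padding, and the 1e-10 floor on the standard deviation are assumptions made for this example; the library's own preprocess step may differ in detail.

```python
import numpy as np

SAMPLE_RATE = 16000   # Hz, the rate the model expects
WINDOW_SIZE = 1024    # samples per frame, from the Model I/O table


def frame_and_normalize(audio, hop_length=160):
    """Slice 1-D 16 kHz audio into (batch, 1024) float32 frames.

    hop_length=160 (10 ms) is an assumption for this sketch.
    """
    audio = np.asarray(audio, dtype=np.float32)
    # Zero-pad half a window on each side so frame i is centered on
    # sample i * hop_length.
    padded = np.pad(audio, WINDOW_SIZE // 2, mode="constant")
    n_frames = 1 + len(audio) // hop_length
    # Build an index matrix of shape (n_frames, WINDOW_SIZE).
    starts = hop_length * np.arange(n_frames)[:, None]
    idx = starts + np.arange(WINDOW_SIZE)[None, :]
    frames = padded[idx]
    # Per-frame mean-centering and unit-std normalization.
    frames = frames - frames.mean(axis=1, keepdims=True)
    std = frames.std(axis=1, keepdims=True)
    frames = frames / np.maximum(std, 1e-10)  # avoid dividing by zero on silence
    return frames.astype(np.float32)
```

One second of audio yields 101 frames with this hop, each with approximately zero mean and unit standard deviation.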
Project layout
coremlcrepe/ The library (torch-free, mirrors torchcrepe)
├── __init__.py Public API + constants
├── core.py predict / preprocess / infer / postprocess / *_from_file
├── model.py CoreML MLModel wrapper (Crepe)
├── decode.py argmax / weighted_argmax / viterbi
├── convert.py bins <-> cents <-> frequency conversions
├── filter.py mean / median (+ NaN-aware) sequence filters
├── threshold.py At / Hysteresis / Silence thresholding
├── load.py model + audio loading
├── loudness.py A-weighted perceptual loudness
└── assets/ full.mlpackage, tiny.mlpackage (the CoreML models)
convert_crepe.py Torch -> CoreML conversion script
validate_and_benchmark.py Validation + latency benchmark vs torchcrepe
examples/ Runnable usage examples
tests/ pytest suite (torch-free + optional parity tests)
pyproject.toml Package metadata (pip install -e .)
Install
# Python 3.9–3.13 recommended.
python3 -m venv .venv && source .venv/bin/activate
# Runtime only (numpy + coremltools) + the package:
pip install -e .
# Optional extras:
pip install -e ".[audio]" # soundfile, to read audio files
pip install -e ".[dsp]" # librosa + resampy (viterbi/loudness/hi-q resample)
pip install -e ".[convert]" # torch + torchcrepe, to (re)build the .mlpackage
pip install -e ".[test]" # pytest
The converted models are shipped in coremlcrepe/assets/. To rebuild them:
python convert_crepe.py --capacity full # -> coremlcrepe/assets/full.mlpackage
python convert_crepe.py --capacity tiny # -> coremlcrepe/assets/tiny.mlpackage
# Most ANE-friendly: bake a static batch size
python convert_crepe.py --capacity full --fixed-batch 100
Usage
The API mirrors torchcrepe, using numpy arrays of shape (1, time):
import numpy as np
import coremlcrepe
# Load audio (needs the `audio` extra) or bring your own numpy array.
audio, sr = coremlcrepe.load.audio("audio.wav")
# Predict pitch (Hz) and periodicity (confidence).
pitch, periodicity = coremlcrepe.predict(
    audio, sr,
    fmin=50., fmax=550.,
    model="full",  # or "tiny"
    decoder=coremlcrepe.decode.weighted_argmax,
    return_periodicity=True,
)
Recommended cleanup pipeline (matches torchcrepe)
# Remove periodicity in silent regions (needs the `dsp` extra for loudness).
periodicity = coremlcrepe.threshold.Silence(-60.)(periodicity, audio, sr)
# Mark low-confidence frames unvoiced (NaN).
pitch = coremlcrepe.threshold.At(0.21)(pitch, periodicity)
# Smooth.
pitch = coremlcrepe.filter.median(pitch, 3)
periodicity = coremlcrepe.filter.mean(periodicity, 3)
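For readers who want to see what the thresholding and smoothing steps above amount to, here is a pure-numpy sketch. It is not the library's implementation: threshold_at and nanmedian_filter are hypothetical names, and the NaN padding at the edges is an assumption of this example.

```python
import warnings

import numpy as np


def threshold_at(pitch, periodicity, value=0.21):
    """Return a copy of pitch with low-periodicity frames set to NaN."""
    out = np.array(pitch, dtype=np.float64)  # np.array copies by default
    out[np.asarray(periodicity) < value] = np.nan
    return out


def nanmedian_filter(x, win_length=3):
    """NaN-aware median filter along the time axis of a (1, time) array.

    Assumes an odd win_length so the output keeps the input length.
    """
    x = np.asarray(x, dtype=np.float64)
    pad = win_length // 2
    # Pad the time axis with NaN so edge windows ignore the padding.
    padded = np.pad(x, ((0, 0), (pad, pad)), mode="constant",
                    constant_values=np.nan)
    windows = np.lib.stride_tricks.sliding_window_view(
        padded, win_length, axis=1)
    with warnings.catch_warnings():
        # Windows that are entirely NaN stay NaN; silence the warning.
        warnings.simplefilter("ignore", RuntimeWarning)
        return np.nanmedian(windows, axis=-1)
```

Unvoiced (NaN) frames are skipped inside each window rather than propagated, so a single low-confidence frame does not erase its neighbours.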
Choosing a decoder
| decoder | notes | needs librosa? |
|---|---|---|
| coremlcrepe.decode.weighted_argmax | default, sub-bin accurate | no |
| coremlcrepe.decode.argmax | fastest, bin-quantized | no |
| coremlcrepe.decode.viterbi | temporally smooth path | librosa if installed, else pure-numpy fallback |
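The difference between bin-quantized and sub-bin decoding can be sketched in a few lines of numpy. The bin-to-frequency mapping below follows torchcrepe's convention (20 cents per bin, offset 1997.3794084376191 cents, reference 10 Hz); that coremlcrepe uses the same constants is an assumption here, as are the function names and the ±4-bin averaging window.

```python
import numpy as np

CENTS_PER_BIN = 20
CENTS_OFFSET = 1997.3794084376191  # cents value of bin 0 in torchcrepe


def bins_to_frequency(bins):
    """Map (possibly fractional) bin indices to frequency in Hz."""
    cents = CENTS_PER_BIN * np.asarray(bins, dtype=np.float64) + CENTS_OFFSET
    return 10.0 * 2.0 ** (cents / 1200.0)


def argmax_decode(probs):
    """Pick the most probable bin per frame: fast, quantized to 20 cents."""
    return bins_to_frequency(probs.argmax(axis=1))


def weighted_argmax_decode(probs, radius=4):
    """Average bin indices near the peak, weighted by probability."""
    n_bins = probs.shape[1]
    peak = probs.argmax(axis=1)
    bins = np.arange(n_bins)
    # Keep only bins within `radius` of each frame's peak.
    mask = np.abs(bins[None, :] - peak[:, None]) <= radius
    weights = probs * mask
    mean_bin = (weights * bins).sum(axis=1) / weights.sum(axis=1)
    return bins_to_frequency(mean_bin)
```

Because the weighted average can land between bins, it resolves pitch more finely than the 20-cent grid that argmax is limited to.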
Selecting compute units
# Force a specific backend when loading (ALL lets the ANE be used):
coremlcrepe.load.model("full", compute_units="CPU_AND_NE")
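The compute_units string presumably corresponds to a member of coremltools' ComputeUnit enum. The sketch below shows how such a string could be validated and used to load a .mlpackage directly with coremltools; resolve_compute_units and load_mlpackage are hypothetical helpers for illustration, not part of coremlcrepe.

```python
# The four members of coremltools.ComputeUnit.
VALID_COMPUTE_UNITS = ("ALL", "CPU_ONLY", "CPU_AND_GPU", "CPU_AND_NE")


def resolve_compute_units(name):
    """Translate a string such as "CPU_AND_NE" into a ComputeUnit member."""
    key = str(name).upper()
    if key not in VALID_COMPUTE_UNITS:
        raise ValueError(
            f"unknown compute unit {name!r}; expected one of {VALID_COMPUTE_UNITS}")
    import coremltools as ct  # imported lazily so validation works without it
    return getattr(ct.ComputeUnit, key)


def load_mlpackage(path, compute_units="ALL"):
    """Load a Core ML model package restricted to the given compute units."""
    import coremltools as ct
    return ct.models.MLModel(
        path, compute_units=resolve_compute_units(compute_units))
```

CPU_AND_NE excludes the GPU, which can make it easier to tell whether the ANE is actually being used; ALL leaves the choice to the runtime.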
Examples
python examples/01_basic_prediction.py # synthetic tone
python examples/02_from_file_with_cleanup.py a.wav # file + threshold + filter
python examples/03_pitch_sweep.py # track a glissando, compare decoders
Tests
pytest # torch-free core tests always run
# torchcrepe parity tests run only if torch is installed
The suite covers unit conversions, decoders, filters/thresholds, the end-to-end prediction pipeline, and — when torch is available — numerical parity with the original torchcrepe model.
Audio corpus suite
tests/audio_corpus.py deterministically generates a diverse, labeled corpus (73 cases): notes across octaves in several timbres (sine / harmonic / saw / square), vibrato, glissando, and harmonic tones at decreasing SNR. tests/test_audio_corpus.py runs prediction over the whole corpus and asserts per-case and per-category accuracy. Overall mean median error is ~2 cents.
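Errors in cents are measured on a logarithmic scale: 1200 cents per octave, 100 per equal-tempered semitone. A minimal sketch of such a metric, assuming a per-case median over voiced frames (the function names are illustrative and the test suite's own metric may be computed differently):

```python
import numpy as np


def cents_error(predicted_hz, reference_hz):
    """Signed pitch error in cents: 1200 * log2(predicted / reference)."""
    predicted = np.asarray(predicted_hz, dtype=np.float64)
    reference = np.asarray(reference_hz, dtype=np.float64)
    return 1200.0 * np.log2(predicted / reference)


def median_abs_cents(predicted_hz, reference_hz):
    """Median absolute error in cents, ignoring unvoiced (NaN) frames."""
    return float(np.nanmedian(np.abs(cents_error(predicted_hz, reference_hz))))
```

For scale, a 2 cent error at 440 Hz is roughly half a hertz.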
Get a standalone accuracy report (and optionally export the corpus to wav):
python tests/corpus_report.py # full model, weighted_argmax
python tests/corpus_report.py --model tiny --decoder viterbi --verbose
python tests/corpus_report.py --export tests/audio/generated
Drop your own .wav / .flac files into tests/audio/real/ to include them automatically; encode an expected fundamental as _<f0>hz in the filename (e.g. violin_440hz.wav) to also assert accuracy.
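The filename convention can be parsed with a short regular expression. The sketch below is one way to do it; the pattern, the support for decimal values, and the name expected_f0_from_filename are assumptions for illustration, and the parser in the test suite may behave differently.

```python
import re
from pathlib import Path

# Matches a trailing "_<number>hz" in the file stem, e.g. "violin_440hz".
_F0_PATTERN = re.compile(r"_(\d+(?:\.\d+)?)hz$", re.IGNORECASE)


def expected_f0_from_filename(path):
    """Return the fundamental in Hz encoded in the filename, or None."""
    match = _F0_PATTERN.search(Path(path).stem)
    return float(match.group(1)) if match else None
```

Files without the suffix return None, which matches the described behaviour of including them in the run without asserting accuracy.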
Validation vs torchcrepe
CoreML (float16, ANE) vs the original Torch model on harmonic sine waves:
| Model | max abs. Δprob (CoreML vs Torch) | mean abs. F0 error |
|---|---|---|
| full | 3.4e-04 | 0.74 cents |
| tiny | 7.9e-03 | 2.54 cents |
Decoder parity vs torchcrepe (median |Δ| in cents): weighted_argmax ≈ 3.3,
argmax ≈ 6.3, viterbi ≈ 6.0.
Benchmarks (Apple Silicon)
Per-frame time = one 10 ms hop of audio. "×RT" > 1 means faster than realtime.
Full model
| batch | CoreML per-frame | CoreML ×RT | Torch (CPU) per-frame | Torch ×RT |
|---|---|---|---|---|
| 1 | 2.99 ms | 3.3× | 6.84 ms | 1.5× |
| 10 | 0.49 ms | 20.6× | 5.73 ms | 1.7× |
| 100 | 0.30 ms | 33.4× | 3.99 ms | 2.5× |
| 500 | 0.38 ms | 26.3× | 4.82 ms | 2.1× |
Tiny model
| batch | CoreML per-frame | CoreML ×RT | Torch (CPU) per-frame | Torch ×RT |
|---|---|---|---|---|
| 1 | 0.32 ms | 30.9× | 3.04 ms | 3.3× |
| 10 | 0.19 ms | 52.1× | 0.86 ms | 11.6× |
| 100 | 0.16 ms | 63.8× | 0.51 ms | 19.6× |
| 500 | 0.15 ms | 65.4× | 0.44 ms | 22.6× |
Single-frame latency: full ≈ 3.0 ms, tiny ≈ 0.31 ms. Batch as many frames as latency allows for the best throughput. Regenerate these numbers with:
python validate_and_benchmark.py --capacity full
python validate_and_benchmark.py --capacity tiny
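The ×RT column in the tables above is the 10 ms hop divided by the per-frame time, and the speedup over Torch is the ratio of the two per-frame times. A small sketch of that arithmetic (the function names are illustrative):

```python
HOP_MS = 10.0  # one frame covers one 10 ms hop of audio


def realtime_factor(per_frame_ms, hop_ms=HOP_MS):
    """How many times faster than realtime a given per-frame time is."""
    return hop_ms / per_frame_ms


def speedup(baseline_ms, candidate_ms):
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


# Full model, batch 1: 2.99 ms per frame on CoreML.
full_batch1_rt = realtime_factor(2.99)
# Full model, batch 100: Torch CPU 3.99 ms vs CoreML 0.30 ms.
full_batch100_speedup = speedup(3.99, 0.30)
```

Recomputing from the rounded per-frame times can differ slightly from the published ×RT values, which were presumably derived from unrounded measurements.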
ANE notes
- Exported as an mlprogram with float16 precision and ComputeUnit.ALL, which lets the runtime schedule work on the ANE.
- A fixed batch size (--fixed-batch) is the most ANE-friendly layout; flexible RangeDim shapes may fall back to GPU/CPU for some ops.
- Verify actual placement with Xcode → Performance report (Instruments' Core ML template) on a .mlpackage.
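The export settings listed above correspond roughly to the following coremltools call. This is a sketch and not the contents of convert_crepe.py: the function names, the upper bound of 512 on the flexible batch, and the assumption that traced_model is a torch.jit-traced CREPE module are all illustrative.

```python
import numpy as np

WINDOW_SIZE = 1024  # samples per input frame


def build_input_shape(fixed_batch=None, max_batch=512):
    """Static shape when a batch size is fixed, otherwise a flexible RangeDim."""
    if fixed_batch is not None:
        return (int(fixed_batch), WINDOW_SIZE)
    import coremltools as ct  # only needed for the flexible-shape case
    batch = ct.RangeDim(lower_bound=1, upper_bound=max_batch, default=1)
    return (batch, WINDOW_SIZE)


def convert_to_mlprogram(traced_model, fixed_batch=None):
    """Convert a traced Torch model to a float16 mlprogram."""
    import coremltools as ct
    return ct.convert(
        traced_model,
        convert_to="mlprogram",
        inputs=[ct.TensorType(name="frames",
                              shape=build_input_shape(fixed_batch),
                              dtype=np.float32)],
        compute_precision=ct.precision.FLOAT16,
        compute_units=ct.ComputeUnit.ALL,
    )
```

Passing fixed_batch=100 mirrors the --fixed-batch 100 option shown in the install section and avoids the flexible shape entirely.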
Differences from torchcrepe
- Torch-free at runtime: arrays are numpy, the model runs via CoreML.
- embed() is not supported: the CoreML model outputs pitch-bin probabilities only. Re-export with an embedding output if you need it.
- weighted_argmax is the default decoder (torchcrepe defaults to viterbi), so no librosa is required out of the box. Pass decoder=coremlcrepe.decode.viterbi to match torchcrepe's default.
- Unvoiced pitch is represented as NaN (coremlcrepe.UNVOICED).