Fast change-point detection bindings backed by Rust.
changepoint-doctor Python Bindings (MVP-A)
changepoint-doctor exposes fast offline change-point detection from Rust into Python.
For citation and provenance policy, see ../CITATION.cff
and ../docs/clean_room_policy.md.
Install
From PyPI (target release 0.0.3):
python -m pip install --upgrade pip
python -m pip install changepoint-doctor==0.0.3
For local development from this repository:
cd cpd/python
python -m pip install --upgrade pip maturin
maturin develop --release --manifest-path ../crates/cpd-python/Cargo.toml
python -m pip install --upgrade ".[dev]"
Apple Silicon contributors should run the architecture checks and sanity path in
../docs/python_apple_silicon_toolchain.md
before debugging pyo3/linker errors.
Common extras:
- plot: python -m pip install "changepoint-doctor[plot]==0.0.3"
- notebooks: python -m pip install "changepoint-doctor[notebooks]==0.0.3"
- parity: python -m pip install "changepoint-doctor[parity]==0.0.3"
- dev: python -m pip install "changepoint-doctor[dev]==0.0.3"
plot/notebooks/parity extras only install optional Python tooling. They do
not toggle Rust compile-time features. Rust features are set when building the
extension (for example maturin develop --features preprocess,serde ...).
Install/import naming: install with
python -m pip install changepoint-doctor, then import with import cpd in Python. Optional compatibility alias: import changepoint_doctor as cpd.
API Map
- cpd.Pelt: high-level PELT detector.
- cpd.Binseg: high-level Binary Segmentation detector.
- cpd.Fpop: high-level FPOP detector (L2 cost only).
- cpd.detect_offline: low-level API for explicit detector/cost/constraints/stopping/preprocess selection, including detector="segneigh" (exact fixed-K DP; dynp alias supported).
- cpd.OfflineChangePointResult: typed result object with breakpoints and diagnostics.
Streaming update() vs update_many() Policy
update_many() now uses a size-aware GIL strategy in Rust bindings:
- Workloads with < 16 scalar work items (n * d) keep the GIL (lower overhead for tiny micro-batches).
- Workloads with >= 16 scalar work items (n * d) release the GIL (py.allow_threads) for throughput and thread fairness.
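The policy above reduces to a single comparison on the number of scalar work items. The sketch below restates it in Python for illustration only; the real decision is made inside the Rust bindings, and the names GIL_RELEASE_THRESHOLD and releases_gil are hypothetical.

```python
# Hypothetical names; the threshold value 16 comes from the policy above.
GIL_RELEASE_THRESHOLD = 16  # scalar work items (n * d)


def releases_gil(n: int, d: int) -> bool:
    """Return True when a batch of n rows with d dimensions would release the GIL."""
    return n * d >= GIL_RELEASE_THRESHOLD


# A 4 x 4 batch has 16 work items and crosses the threshold; 3 x 5 does not.
decisions = {(n, d): releases_gil(n, d) for n, d in [(4, 4), (3, 5), (16, 1)]}
```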
To reproduce the benchmark snapshot used for this policy:
cd cpd/python
python -m pip install --upgrade ".[dev]"
pytest -q tests/test_streaming_perf_contract.py
Optional controls:
- CPD_PY_STREAMING_PERF_ENFORCE=1: enable stricter ratio gates.
- CPD_PY_STREAMING_PERF_REPORT_OUT=/tmp/cpd-python-streaming-perf.json: write JSON metrics.
The perf contract uses median latency with outlier-triggered retry rounds to reduce scheduler-noise flakiness.
Reference run (local dev machine, tests/test_streaming_perf_contract.py, median ms):
| Batch size | update() median ms | update_many() median ms | update_many() speedup vs update() |
|---|---|---|---|
| 1 | 0.0035 | 0.0097 | 0.36x |
| 8 | 0.0177 | 0.0194 | 0.91x |
| 16 | 0.0356 | 0.0310 | 1.15x |
| 64 | 0.1308 | 0.0891 | 1.47x |
| 4096 | 7.8216 | 4.4616 | 1.75x |
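The speedup column is the ratio of the two medians (update() divided by update_many()), so values below 1 mean update_many() was slower. A small sketch that recomputes the column from the table's own numbers; REFERENCE_RUN and speedup are illustrative names, not part of the package.

```python
# Median latencies in milliseconds, copied from the reference table above:
# batch size -> (update() median, update_many() median)
REFERENCE_RUN = {
    1: (0.0035, 0.0097),
    8: (0.0177, 0.0194),
    16: (0.0356, 0.0310),
    64: (0.1308, 0.0891),
    4096: (7.8216, 4.4616),
}


def speedup(update_ms: float, update_many_ms: float) -> float:
    """Ratio of medians; a value above 1 means update_many() was faster."""
    return update_ms / update_many_ms


speedups = {batch: round(speedup(*pair), 2) for batch, pair in REFERENCE_RUN.items()}

# Smallest batch size in this run where update_many() is at least as fast.
crossover = min(batch for batch, ratio in speedups.items() if ratio >= 1.0)
```

In this run the crossover falls at batch size 16, the first row where the ratio exceeds 1.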
Masking Risk Guidance
If BinSeg diagnostics indicate masking risk (for example warnings that closely
spaced weaker changes may be hidden), prefer Wild Binary Segmentation (WBS) in
Rust/offline flows (cpd-offline::Wbs) for stronger recovery.
Python high-level APIs expose cpd.Pelt, cpd.Binseg, and cpd.Fpop.
WBS and SegNeigh are not yet exposed as Python high-level detector classes; use detect_offline(...).
Quickstart
See QUICKSTART.md for a full walkthrough.
Reproducibility Modes
detect_offline(..., repro_mode=...) supports strict, balanced (default),
and fast.
For deterministic contracts, cross-platform expectations, and tolerance gates,
see ../docs/reproducibility_modes.md.
Result JSON Contract
OfflineChangePointResult.to_json() / OfflineChangePointResult.from_json(...)
follow the versioned contract in
../docs/result_json_contract.md, with the
canonical schema marker at diagnostics.schema_version.
When available, build provenance is emitted under diagnostics.build (for
Python adapters this includes ABI and enabled feature context).
In 0.x, schema compatibility follows the bounded version window documented in
../VERSIONING.md: readers accept only supported
schema-marker versions (currently 1..=2 for offline result fixtures).
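As an illustration of the bounded version window, the sketch below checks the schema marker before a payload is handed to a reader. It assumes the payload is a JSON string with a top-level diagnostics object; the helper name check_schema_version is hypothetical and is not part of the package.

```python
import json

# Supported window for offline result fixtures, as documented above (1..=2).
SUPPORTED_SCHEMA_VERSIONS = range(1, 3)


def check_schema_version(payload: str) -> int:
    """Return the schema marker, or raise ValueError if it is outside the window."""
    doc = json.loads(payload)
    version = doc["diagnostics"]["schema_version"]
    if version not in SUPPORTED_SCHEMA_VERSIONS:
        raise ValueError(f"unsupported schema_version: {version!r}")
    return version


# Example payload containing only the fields this check needs.
example = json.dumps({"breakpoints": [40, 80], "diagnostics": {"schema_version": 1}})
accepted = check_schema_version(example)
```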
Serialization + plotting workflow:
import numpy as np
import cpd

x = np.concatenate([
    np.zeros(40, dtype=np.float64),
    np.full(40, 8.0, dtype=np.float64),
    np.full(40, -4.0, dtype=np.float64),
])

pelt = cpd.Pelt(model="l2").fit(x).predict(n_bkps=2)
binseg = cpd.Binseg(model="l2").fit(x).predict(n_bkps=2)
fpop = cpd.Fpop(min_segment_len=2).fit(x).predict(n_bkps=2)

low = cpd.detect_offline(
    x,
    detector="pelt",
    cost="l2",
    constraints={"min_segment_len": 2},
    stopping={"n_bkps": 2},
)
segneigh = cpd.detect_offline(
    x,
    detector="segneigh",  # 'dynp' alias also supported
    cost="l2",
    constraints={"min_segment_len": 2},
    stopping={"n_bkps": 2},
)

payload = pelt.to_json()
restored = cpd.OfflineChangePointResult.from_json(payload)
assert restored.breakpoints == pelt.breakpoints

try:
    fig = restored.plot(x, title="Detected breakpoints")
except ImportError:
    # Plotting remains optional.
    # Install with: python -m pip install "changepoint-doctor[plot]==0.0.3"
    fig = None
Compatibility + limitations:
- from_json(...) accepts only supported schema markers (diagnostics.schema_version, currently 1..=2 in 0.x).
- to_json() writes the current schema marker (currently 1) and preserves additive unknown fields when round-tripping payloads.
- plot() requires optional plotting dependencies (changepoint-doctor[plot]).
- plot(values=None, ...) requires per-segment summaries in the result; if segments are unavailable, pass explicit values.
- plot(ax=...) is supported only for univariate data (diagnostics.d == 1).
These paths are smoke-tested in CI in
tests/test_integration_mvp_a.py, including
fixture compatibility checks and example-script execution.
Stopping and Penalty Guide
Ruptures-compatible naming is supported in Python:
- n_bkps: exact number of change points (Stopping::KnownK)
- pen: manual penalty scalar (Stopping::Penalized(Penalty::Manual(...)))
- min_segment_len: minimum segment size (Constraints.min_segment_len)
When to use each stopping style:
- n_bkps (KnownK): use when you know the expected number of changes and need an exact count.
- pen="bic": good default when you want automatic model-selection behavior that scales with sample size.
- pen="aic": less conservative than BIC; can recover weaker changes but may over-segment noisy data.
- pen=<float>: use when you need tight operational control over sensitivity (lower finds more changes, higher finds fewer).
- stopping={"PenaltyPath": [...]} (pipeline serde form): request multiple penalties in one PELT sweep and inspect diagnostics notes for each path entry.
BIC/AIC complexity terms are model-aware by default:
- l2 uses params_per_segment=2 (mean + residual variance proxy)
- normal uses params_per_segment=3 (mean + variance + residual term)
- normal_full_cov uses model-aware effective complexity for BIC/AIC: 1 + d + d(d+1)/2 (mean vector + full covariance + residual term)
Advanced users can still override params_per_segment in low-level pipeline detector config.
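To make the complexity terms concrete, the sketch below computes params_per_segment for each model and plugs it into the textbook penalty forms (BIC: p * ln(n), AIC: 2 * p). The textbook forms are an assumption here; this README does not state the exact scaling the library applies, and the function names are hypothetical.

```python
import math


def params_per_segment(model: str, d: int = 1) -> int:
    """Per-segment parameter counts as listed above; d is the signal dimension."""
    if model == "l2":
        return 2
    if model == "normal":
        return 3
    if model == "normal_full_cov":
        return 1 + d + d * (d + 1) // 2
    raise ValueError(f"unknown model: {model!r}")


def textbook_penalty(kind: str, model: str, n: int, d: int = 1) -> float:
    """Assumed textbook penalty per change point, not the library's confirmed formula."""
    p = params_per_segment(model, d)
    if kind == "bic":
        return p * math.log(n)  # grows with sample size n
    if kind == "aic":
        return 2.0 * p  # independent of n
    raise ValueError(f"unknown penalty kind: {kind!r}")


bic = textbook_penalty("bic", "l2", n=120)
aic = textbook_penalty("aic", "l2", n=120)
```

Under these assumed forms, BIC exceeds AIC once ln(n) > 2 (n >= 8), which is consistent with AIC being the less conservative choice.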
SegNeigh Sizing Guide (detector="segneigh" / "dynp")
SegNeigh is exact dynamic programming for fixed-k segmentation (n_bkps / KnownK).
- Let m be the effective candidate count after constraints (jump, candidate_splits, min_segment_len filtering).
- Expected scaling is approximately:
  - runtime: O(k * m^2)
  - memory: O(k * m + m)
- Practical guidance:
  - Use SegNeigh when k is known and m is modest.
  - Increase jump and/or min_segment_len first when runtime or memory is high.
  - Prefer pelt/fpop when k is unknown or when very large n requires penalty-based model selection.
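The scaling above follows from the shape of the dynamic program: for each of the k + 1 segments, every candidate end point is compared against every earlier candidate. The pure-Python sketch below shows that structure for an L2 cost with every index treated as a candidate (m = n). It is an illustration of the technique, not the Rust implementation, and fixed_k_l2 is a hypothetical name.

```python
from itertools import accumulate


def fixed_k_l2(values, n_bkps, min_segment_len=1):
    """Exact fixed-k segmentation under an L2 cost; returns interior change points."""
    n = len(values)
    # Prefix sums give an O(1) segment cost: sum(x^2) - (sum x)^2 / length.
    s1 = [0.0, *accumulate(values)]
    s2 = [0.0, *accumulate(v * v for v in values)]

    def cost(a, b):  # squared deviation of values[a:b] from its mean
        total = s1[b] - s1[a]
        return (s2[b] - s2[a]) - total * total / (b - a)

    inf = float("inf")
    n_segments = n_bkps + 1
    # best[j][t]: minimal cost of splitting values[:t] into j segments.
    best = [[inf] * (n + 1) for _ in range(n_segments + 1)]
    prev = [[0] * (n + 1) for _ in range(n_segments + 1)]
    best[0][0] = 0.0
    for j in range(1, n_segments + 1):  # k + 1 layers
        for t in range(j * min_segment_len, n + 1):  # candidate end points
            for s in range((j - 1) * min_segment_len, t - min_segment_len + 1):
                cand = best[j - 1][s] + cost(s, t)
                if cand < best[j][t]:
                    best[j][t] = cand
                    prev[j][t] = s
    # Walk the back-pointers from the end of the signal.
    ends, t = [], n
    for j in range(n_segments, 0, -1):
        ends.append(t)
        t = prev[j][t]
    return sorted(ends)[:-1]  # drop the trailing end-of-signal index


signal = [0.0] * 40 + [8.0] * 40 + [-4.0] * 40
change_points = fixed_k_l2(signal, n_bkps=2, min_segment_len=2)
```

The three nested loops are the O(k * m^2) runtime term, and the two tables are the O(k * m) memory term; thinning the candidate set shrinks both, which is why raising jump or min_segment_len is the first lever.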
Reproducible local benchmark harness for representative (n, k) regimes:
cd cpd
cargo bench -p cpd-bench --bench offline_segneigh
Preprocess Config Contract
detect_offline(..., preprocess=...) validates keys and method payloads.
Unknown preprocess stage keys fail with ValueError.
Default PyPI wheels include preprocess support.
Canonical shape:
preprocess = {
    "detrend": {"method": "linear"},  # or {"method": "polynomial", "degree": 2}
    "deseasonalize": {"method": "differencing", "period": 2},  # or method="stl_like" (period >= 2)
    "winsorize": {"lower_quantile": 0.05, "upper_quantile": 0.95},  # optional fields
    "robust_scale": {"mad_epsilon": 1e-9, "normal_consistency": 1.4826},  # optional fields
}
Validation details:
- detrend.method: "linear" or "polynomial" (degree required for polynomial).
- deseasonalize.method: "differencing" (period >= 1) or "stl_like" (period >= 2).
- winsorize: defaults to lower_quantile=0.01, upper_quantile=0.99 when omitted.
- robust_scale: defaults to mad_epsilon=1e-9, normal_consistency=1.4826 when omitted.
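The rules above can be mirrored in a small client-side pre-check, which is handy for validating configs before they reach the extension. This is a sketch under assumptions: validate_preprocess is a hypothetical helper, it treats a missing period as an error, and the library's own validation may differ in details and messages.

```python
ALLOWED_STAGES = {"detrend", "deseasonalize", "winsorize", "robust_scale"}
MIN_PERIOD = {"differencing": 1, "stl_like": 2}


def validate_preprocess(config: dict) -> dict:
    """Return a copy of config with documented defaults filled in, or raise ValueError."""
    unknown = set(config) - ALLOWED_STAGES
    if unknown:
        raise ValueError(f"unknown preprocess stage keys: {sorted(unknown)}")
    out = {}
    if "detrend" in config:
        stage = dict(config["detrend"])
        method = stage.get("method")
        if method not in ("linear", "polynomial"):
            raise ValueError(f"unsupported detrend method: {method!r}")
        if method == "polynomial" and "degree" not in stage:
            raise ValueError("detrend polynomial requires degree")
        out["detrend"] = stage
    if "deseasonalize" in config:
        stage = dict(config["deseasonalize"])
        method = stage.get("method")
        if method not in MIN_PERIOD:
            raise ValueError(f"unsupported deseasonalize method: {method!r}")
        period = stage.get("period")
        # Assumption: a missing or non-integer period is rejected.
        if not isinstance(period, int) or period < MIN_PERIOD[method]:
            raise ValueError(f"{method} requires period >= {MIN_PERIOD[method]}")
        out["deseasonalize"] = stage
    if "winsorize" in config:
        defaults = {"lower_quantile": 0.01, "upper_quantile": 0.99}
        out["winsorize"] = {**defaults, **config["winsorize"]}
    if "robust_scale" in config:
        defaults = {"mad_epsilon": 1e-9, "normal_consistency": 1.4826}
        out["robust_scale"] = {**defaults, **config["robust_scale"]}
    return out


checked = validate_preprocess({"detrend": {"method": "linear"}, "winsorize": {}})
```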
Example Scripts
- examples/synthetic_signal.py: synthetic step-function detection with all MVP-A APIs.
- examples/csv_detect.py: detect breakpoints from a CSV column.
- examples/plot_breakpoints.py: render detected breakpoints over a synthetic signal.
Run from repo root:
cpd/python/.venv/bin/python cpd/python/examples/synthetic_signal.py
cpd/python/.venv/bin/python cpd/python/examples/csv_detect.py --csv /path/to/data.csv --column 0
cpd/python/.venv/bin/python cpd/python/examples/plot_breakpoints.py --out /tmp/cpd_breakpoints.png
Notebook Examples
- examples/notebooks/01_offline_algorithms.ipynb: quick comparison of offline detectors (Pelt, Binseg, Fpop, segneigh, and pipeline-form wbs).
- examples/notebooks/02_online_algorithms.ipynb: streaming workflows for Bocpd, Cusum, and PageHinkley.
- examples/notebooks/03_doctor_recommendations.ipynb: doctor recommendation workflow with live CLI execution and snapshot fallback.
- examples/notebooks/README.md: notebook launch instructions and workflow overview.
Launch from cpd/python:
python -m pip install --upgrade "changepoint-doctor[notebooks]==0.0.3"
jupyter lab
Ruptures Parity Suite
To run the differential parity suite locally:
cd cpd/python
python -m pip install --upgrade ".[parity]"
CPD_PARITY_PROFILE=smoke pytest -q tests/test_ruptures_parity.py
CPD_PARITY_PROFILE=full CPD_PARITY_REPORT_OUT=/tmp/cpd-parity-report.json pytest -q tests/test_ruptures_parity.py
See ../docs/parity_ruptures.md for corpus structure,
tolerance rules, and CI thresholds.
BOCPD Bayesian Parity Suite
To run BOCPD parity against
hildensia/bayesian_changepoint_detection (preferred pin with fallback):
cd cpd/python
python -m pip install --upgrade ".[parity]"
REF_REPO="https://github.com/hildensia/bayesian_changepoint_detection.git"
PREFERRED_REF="f3f8f03af0de7f4f98bd54c7ca0b5f6d0b0f6f8c"
python -m pip install "git+${REF_REPO}@${PREFERRED_REF}" || \
python -m pip install "git+${REF_REPO}"
CPD_BOCPD_PARITY_PROFILE=smoke pytest -q tests/test_bocpd_bayesian_parity.py
CPD_BOCPD_PARITY_PROFILE=full CPD_BOCPD_PARITY_REPORT_OUT=/tmp/cpd-bocpd-parity-report.json pytest -q tests/test_bocpd_bayesian_parity.py
See ../docs/parity_bocpd_bayesian.md for
comparison logic, corpus layout, and threshold gates.
Extras Validation
Run the metadata sanity checks for optional extras:
cd cpd/python
pytest -q tests/test_optional_extras_contract.py
Optional install commands (one per workflow extra):
python -m pip install "changepoint-doctor[plot]==0.0.3"
python -m pip install "changepoint-doctor[notebooks]==0.0.3"
python -m pip install "changepoint-doctor[parity]==0.0.3"
python -m pip install "changepoint-doctor[dev]==0.0.3"
Wheel CI Policy
Cross-platform wheel hardening is enforced by
../../.github/workflows/wheel-build.yml
and ../../.github/workflows/wheel-smoke.yml.
- Build backend: cibuildwheel
- Platforms:
  - Linux manylinux x86_64
  - macOS universal2 (validated on macos-13 and macos-14)
  - Windows amd64 (windows-2022)
- Python matrix:
  - Full (main/nightly/tag): 3.9, 3.10, 3.11, 3.12, 3.13
  - Tiered (pull_request): representative subset with at least one 3.13 row
- NumPy matrix:
  - 1.26.* and 2.*
  - 3.13 + numpy 1.26.* is excluded
- Python 3.13 rows are marked experimental and soft-gated (continue-on-error)
Default wheels are BLAS-free by policy:
- Native dependency reports are gated by ../../.github/scripts/wheel_dependency_gate.py using auditwheel (Linux), delocate (macOS), and delvewheel (Windows).
- Runtime smoke asserts low.diagnostics.blas_backend is None for default wheel installs.
Troubleshooting
- TypeError: expected float32 or float64
  Cause: integer/object arrays are passed into .fit(...) or detect_offline(...).
  Fix: cast first, e.g. x = np.asarray(x, dtype=np.float64).
- Input contains NaN/missing values and detection fails
  Cause: MVP-A Python APIs reject missing values under MissingPolicy::Error.
  Fix: impute/drop NaNs before calling detectors.
- RuntimeError: fit(...) must be called before predict(...)
  Cause: .predict(...) called on an unfitted high-level detector.
  Fix: always call .fit(x) first.
- Extension import fails after Rust/Python upgrade
  Cause: wheel/extension built against a different interpreter environment.
  Fix: rebuild via maturin develop --release in the active environment.
- Apple Silicon linker mismatch (arm64 vs x86_64)
  Cause: host shell/interpreter/libpython architectures do not match.
  Fix: follow ../docs/python_apple_silicon_toolchain.md to verify architecture and run the CI-aligned local sanity flow.
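The first two fixes above (cast to float, drop missing values) can be combined into a small pre-check that runs before any detector call. The sketch below uses only the standard library; clean_values is a hypothetical helper and is not part of the package.

```python
import math


def clean_values(values):
    """Cast every entry to float and drop NaNs, returning a plain list."""
    floats = [float(v) for v in values]  # integers become floats here
    return [v for v in floats if not math.isnan(v)]


raw = [1, 2, float("nan"), 4]
clean = clean_values(raw)
# The cleaned list can then be cast as described above, for example
# np.asarray(clean, dtype=np.float64), before calling .fit(...) or detect_offline(...).
```

Dropping samples shifts the index of every later observation, so reported breakpoints refer to positions in the cleaned series; impute instead when original positions matter.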
API Reference Outline
- Pelt(model="l2"|"normal"|"normal_full_cov", min_segment_len, jump, max_change_points)
  - .fit(x) -> detector
  - .predict(pen=..., n_bkps=...) -> OfflineChangePointResult
- Binseg(model="l2"|"normal"|"normal_full_cov", min_segment_len, jump, max_change_points, max_depth)
  - .fit(x) -> detector
  - .predict(pen=..., n_bkps=...) -> OfflineChangePointResult
- Fpop(min_segment_len, jump, max_change_points) (l2 only)
  - .fit(x) -> detector
  - .predict(pen=..., n_bkps=...) -> OfflineChangePointResult
- detect_offline(x, pipeline=None, detector, cost, constraints, stopping, preprocess, repro_mode, return_diagnostics)
  - detector accepts pelt, binseg, fpop, or segneigh (dynp alias).
  - fpop requires cost="l2".
  - segneigh is exact fixed-K dynamic programming (best when stopping is n_bkps/KnownK); runtime/memory can grow quickly on large n and high k.
  - cost accepts l1_median, l2, normal, normal_full_cov, and (pipeline-only) nig.
  - pipeline accepts both simplified Python dicts (for example {"detector": {"kind": "segneigh"}}) and Rust PipelineSpec serde shape (for example {"detector": {"Offline": {"SegNeigh": {...}}}, ...}).
- OfflineChangePointResult
  - fields: breakpoints, change_points, scores, segments, diagnostics
  - helpers: to_json(), from_json(payload), plot(values=None, *, ax=None, title=...)