Robust covariance, heavy-tail scatter estimation, anomaly diagnostics, and benchmark galleries

robustcov

robustcov is an experimental Python/C++ library for robust covariance, heavy-tail scatter estimation, and interpretable anomaly diagnostics.

It is designed for workflows where classical covariance estimates are unstable: contamination, heavy-tailed data, small samples, high-dimensional scatter estimation, and robust-distance based anomaly screening.
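To make the instability concrete, here is a small NumPy-only sketch (it does not use robustcov) showing how a 5% block of shifted rows inflates the classical sample covariance. The sample size, dimension, and shift are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Clean Gaussian sample with identity covariance (true trace = 3).
n, p = 500, 3
X_clean = rng.standard_normal((n, p))

# Contaminate 5% of the rows by shifting them far from the bulk.
X_contaminated = X_clean.copy()
n_outliers = 25
X_contaminated[:n_outliers] += 10.0

cov_clean = np.cov(X_clean, rowvar=False)
cov_contaminated = np.cov(X_contaminated, rowvar=False)

trace_clean = float(np.trace(cov_clean))
trace_contaminated = float(np.trace(cov_contaminated))

print(f"trace (clean):        {trace_clean:.2f}")
print(f"trace (contaminated): {trace_contaminated:.2f}")
```

The contaminated trace is several times larger than the clean one, even though 95% of the rows are unchanged. Robust estimators aim to limit this kind of distortion.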

Status: alpha / experimental. APIs and benchmark pages may change before a stable release.

Highlights

  • Fast robust covariance via FastMCD
  • Heavy-tail scatter estimators: RegularizedCauchy, StudentTScatter, RegularizedTyler
  • Robust anomaly detection with Mahalanobis-style robust distances
  • Cluster-aware robust diagnostics for multimodal data
  • Visual diagnostics: distance profiles, QQ plots, covariance heatmaps, anomaly panels
  • Optional OpenMP acceleration in the C++ backend
  • Sphinx documentation with benchmark and use-case galleries
  • Optional external/Kaggle examples for fraud, finance, maintenance, and medical screening

Installation

From PyPI after a release is published:

python -m pip install -U pip
python -m pip install robustcov

Supported release wheels are built for CPython 3.12, 3.13, and 3.14 on Ubuntu, Windows, and macOS by GitHub Actions. The package uses a C++/pybind11 backend built with scikit-build-core.

Inside a conda environment, install the PyPI wheels with pip:

conda create -n robustcov python=3.12 pip
conda activate robustcov
python -m pip install robustcov

For local development:

git clone https://github.com/smiryusupov/robustcov.git
cd robustcov

python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

python -m pip install -U pip
python -m pip install -e ".[dev,docs,examples]"
python -m compileall -q robustcov tests examples benchmarks docs
python -m pytest -q

Quickstart

import numpy as np
import robustcov as rc

rng = np.random.default_rng(0)

# Heavy-tailed data with injected outliers
X = rng.standard_t(df=3, size=(400, 5))
X[:30] += 8.0

est = rc.FastMCD(quality="balanced", random_state=42).fit(X)

print(est.location_)
print(est.covariance_)
print(est.radial_kurtosis_)

det = rc.RobustOutlierDetector(estimator=est, contamination=0.075).fit(X)
print(det.labels_)
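The detector above is described as using Mahalanobis-style robust distances. The following NumPy-only sketch illustrates that idea under stated assumptions: the location and covariance here are crude stand-ins (a coordinate-wise median and the covariance of the central half of the sample), not FastMCD, and the quantile cutoff is one plausible reading of `contamination`, not necessarily what `RobustOutlierDetector` does internally.

```python
import numpy as np

def mahalanobis_distances(X, location, covariance):
    """Return the Mahalanobis distance of each row of X."""
    centered = np.asarray(X, dtype=float) - location
    # Solve covariance @ Z = centered.T instead of forming an explicit inverse.
    solved = np.linalg.solve(covariance, centered.T)
    d2 = np.einsum("ij,ji->i", centered, solved)
    return np.sqrt(np.maximum(d2, 0.0))

rng = np.random.default_rng(0)
X = rng.standard_t(df=3, size=(400, 5))
X[:30] += 8.0  # same injected outliers as in the Quickstart

# Stand-in robust estimates (NOT FastMCD): coordinate-wise median, then the
# covariance of the half of the sample closest to that median.
location = np.median(X, axis=0)
radius = np.linalg.norm(X - location, axis=1)
core = X[np.argsort(radius)[: len(X) // 2]]
covariance = np.cov(core, rowvar=False)

distances = mahalanobis_distances(X, location, covariance)

# Flag the top `contamination` fraction of distances as anomalies.
contamination = 0.075
threshold = np.quantile(distances, 1.0 - contamination)
is_outlier = distances > threshold

print(int(is_outlier.sum()), "rows flagged")
print(int(is_outlier[:30].sum()), "of the 30 injected rows flagged")
```

Because the scatter is estimated from the central part of the data, the injected rows cannot inflate it, so their distances stay large.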

For small-sample or high-dimensional heavy-tailed data:

est = rc.RegularizedCauchy(alpha=0.10).fit(X)
print(est.covariance_)

student = rc.StudentTScatter(df=3, alpha=0.05).fit(X)
print(student.radial_kurtosis_)
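Heavy-tail scatter estimators of this kind are commonly described as M-estimators that downweight each observation by a function of its squared distance. The sketch below uses the textbook multivariate Student-t weight, with Cauchy as the one-degree-of-freedom case. It is an illustration of the general idea only; the weights robustcov actually applies, and how `alpha` enters, are not specified here.

```python
import numpy as np

def student_t_weight(d2, p, df):
    """Weight (df + p) / (df + d^2) from the multivariate Student-t M-estimator."""
    return (df + p) / (df + np.asarray(d2, dtype=float))

def cauchy_weight(d2, p):
    """Cauchy is the Student-t case with one degree of freedom."""
    return student_t_weight(d2, p, df=1.0)

p = 5
d2 = np.array([1.0, 5.0, 25.0, 100.0])  # squared robust distances

w_t3 = student_t_weight(d2, p, df=3.0)
w_cauchy = cauchy_weight(d2, p)

for value, a, b in zip(d2, w_t3, w_cauchy):
    print(f"d^2={value:6.1f}  t3 weight={a:.3f}  Cauchy weight={b:.3f}")
```

Both weights decrease as the distance grows, and the Cauchy weight falls off faster for distant points, which is why it suits very heavy tails.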

For automatic exploratory selection:

auto = rc.AutoRobustScatter(selection="diagnostic").fit(X)

print(auto.best_estimator_name_)
print(auto.summary())

Main estimators

Estimator                    | Best use case                                 | Notes
-----------------------------|-----------------------------------------------|------------------------------------------------------
FastMCD                      | Separable contamination, n >> p               | Fast robust covariance and support diagnostics
RegularizedCauchy            | Very heavy tails, small samples, p close to n | Strong radial downweighting plus shrinkage
StudentTScatter              | Diffuse heavy tails                           | Smooth heavy-tail scatter estimator
RegularizedTyler             | Heavy-tailed shape estimation                 | Scale-free shape unless scale correction is requested
AutoRobustScatter            | Exploratory estimator selection               | Diagnostic or stability-based selector
ClusterRobustOutlierDetector | Multimodal data                               | Cluster-then-local-robust-scatter diagnostic

KLRegularizedTyler and WieselTyler are currently documented as aliases or prototype variants of the regularized Tyler implementation. HellingerRegularizedTyler is experimental.
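For readers unfamiliar with Tyler-type estimators, the following is a minimal NumPy sketch of a regularized Tyler fixed-point iteration with shrinkage toward the identity and trace normalization. It is a textbook-style illustration, not robustcov's implementation: the function name `regularized_tyler_sketch`, the shrinkage target, and the stopping rule are all assumptions. The data are assumed to be centered already.

```python
import numpy as np

def regularized_tyler_sketch(X, alpha=0.1, n_iter=100, tol=1e-8):
    """Fixed-point iteration for a shrinkage Tyler shape matrix (trace = p)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    sigma = np.eye(p)
    for _ in range(n_iter):
        # q_i = x_i^T sigma^{-1} x_i for every row.
        solved = np.linalg.solve(sigma, X.T)
        q = np.einsum("ij,ji->i", X, solved)
        weights = p / (n * q)
        update = (X * weights[:, None]).T @ X
        # Shrink toward the identity, then fix the scale via the trace.
        new_sigma = (1.0 - alpha) * update + alpha * np.eye(p)
        new_sigma *= p / np.trace(new_sigma)
        converged = np.linalg.norm(new_sigma - sigma, ord="fro") < tol
        sigma = new_sigma
        if converged:
            break
    return sigma

rng = np.random.default_rng(0)
X = rng.standard_t(df=3, size=(400, 5))
X = X - np.median(X, axis=0)  # Tyler-type estimators assume centered data

shape = regularized_tyler_sketch(X, alpha=0.1)
shape_rescaled = regularized_tyler_sketch(10.0 * X, alpha=0.1)

print(np.round(np.trace(shape), 6))
print(np.allclose(shape, shape_rescaled))
```

Each row enters only through its direction, so rescaling the data leaves the result unchanged. This is the sense in which the estimate is a scale-free shape matrix: recovering a covariance scale requires a separate correction step.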

Visual diagnostics

est = rc.FastMCD(quality="balanced", random_state=0).fit(X)

rc.plot_robust_distance_profile(
    est,
    output_path="distance_profile.png",
    show=False,
)

rc.plot_mahalanobis_qq(
    est,
    output_path="qq.png",
    show=False,
)

rc.plot_covariance_heatmap(
    est.covariance_,
    title="FastMCD covariance",
    output_path="covariance.png",
    show=False,
)

Diagnostic reports summarize robust-distance behavior:

report = rc.diagnostic_report(est)
print(report.summary())

Reports include radial kurtosis, detected fraction, condition number, support fraction, QQ tail deviation, and heuristic recommendations.
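The exact definitions used by `diagnostic_report` are not spelled out here. As a hedged illustration of what such quantities can look like, the sketch below computes a condition number, a detected fraction above a chi-square cutoff, and QQ pairs against chi-square quantiles. The chi-square reference, the 0.975 cutoff, and the Wilson-Hilferty approximation (used to avoid a SciPy dependency) are assumptions of this example.

```python
from statistics import NormalDist

import numpy as np

def chi2_quantile_wh(prob, dof):
    """Approximate chi-square quantiles with the Wilson-Hilferty formula."""
    z = np.array([NormalDist().inv_cdf(float(q)) for q in np.atleast_1d(prob)])
    c = 2.0 / (9.0 * dof)
    return dof * (1.0 - c + z * np.sqrt(c)) ** 3

def qq_pairs(d2, dof):
    """Pair sorted squared distances with reference chi-square quantiles."""
    d2 = np.sort(np.asarray(d2, dtype=float))
    n = d2.size
    probs = (np.arange(1, n + 1) - 0.5) / n  # plotting positions in (0, 1)
    return chi2_quantile_wh(probs, dof), d2

# Clean Gaussian reference case: distances should track the chi-square line.
rng = np.random.default_rng(0)
n, p = 2000, 5
X = rng.standard_normal((n, p))
covariance = np.cov(X, rowvar=False)
centered = X - X.mean(axis=0)
d2 = np.einsum("ij,ji->i", centered, np.linalg.solve(covariance, centered.T))

theoretical, empirical = qq_pairs(d2, dof=p)
cutoff = chi2_quantile_wh(0.975, p)[0]

summary = {
    "condition_number": float(np.linalg.cond(covariance)),
    "detected_fraction": float(np.mean(d2 > cutoff)),
    "median_ratio": float(np.median(empirical) / np.median(theoretical)),
}
print(summary)
```

On clean Gaussian data the detected fraction sits near 2.5% and the QQ pairs stay close to the diagonal. Heavy tails or contamination show up as an upward bend in the largest quantiles.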

Multimodal data

A single global robust covariance model can fail when the data have several legitimate modes. Use cluster-aware diagnostics when modes correspond to meaningful groups, regimes, or segments.

det = rc.ClusterRobustOutlierDetector(
    n_clusters=3,
    contamination=0.05,
    random_state=0,
).fit(X)

scores = det.decision_function(X)
labels = det.predict(X)

rc.plot_cluster_robust_distances(
    det,
    X,
    output_path="cluster_distances.png",
    show=False,
)

This is not a full robust mixture model. It is a practical cluster-then-robust-scatter diagnostic.
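The failure mode of a single global model can be shown with a small NumPy-only sketch. It assumes cluster labels are already available from some clustering step and uses plain per-cluster means and covariances as stand-ins for robust local estimates; the helper names are hypothetical and this is not how `ClusterRobustOutlierDetector` is implemented.

```python
import numpy as np

def fit_local_models(X, labels):
    """Estimate one (location, covariance) pair per cluster label."""
    models = {}
    for label in np.unique(labels):
        members = X[labels == label]
        models[int(label)] = (members.mean(axis=0), np.cov(members, rowvar=False))
    return models

def distance(x, location, covariance):
    """Mahalanobis distance of a single point from one model."""
    diff = x - location
    return float(np.sqrt(diff @ np.linalg.solve(covariance, diff)))

def local_score(x, models):
    """Distance to the closest cluster model (smaller means more typical)."""
    return min(distance(x, loc, cov) for loc, cov in models.values())

rng = np.random.default_rng(0)
left = rng.standard_normal((200, 2)) + np.array([-6.0, 0.0])
right = rng.standard_normal((200, 2)) + np.array([6.0, 0.0])
X = np.vstack([left, right])
labels = np.repeat([0, 1], 200)  # assumed to come from a prior clustering step

models = fit_local_models(X, labels)
query = np.array([0.0, 0.0])  # sits in the empty gap between the two modes

global_score = distance(query, X.mean(axis=0), np.cov(X, rowvar=False))
gap_score = local_score(query, models)
print(f"global distance: {global_score:.2f}, local distance: {gap_score:.2f}")
```

The query point lies at the global mean, so a single global model treats it as perfectly typical, while the per-cluster models place it far from both modes.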

OpenMP acceleration

If OpenMP is available at build time, the C++ backend can parallelize distance evaluation, covariance accumulation, Tyler scatter updates, and FastMCD candidate evaluation.

import robustcov as rc

print(rc.has_openmp())
rc.set_num_threads(4)

est = rc.FastMCD(n_init=500, n_jobs=4, random_state=0).fit(X)

For reproducible scaling benchmarks, avoid BLAS/OpenMP oversubscription:

OMP_NUM_THREADS=4 OPENBLAS_NUM_THREADS=1 MKL_NUM_THREADS=1 \
python benchmarks/openmp_scaling.py \
  --n 8000 \
  --p 20 \
  --threads 1 2 4 \
  --csv results/openmp_scaling.csv
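The same thread limits can be applied from inside a Python script or notebook. This is a sketch assuming the limits are set before NumPy is first imported in the process; BLAS backends generally read these variables at load time, so setting them later may have no effect.

```python
import os

# These must be set before NumPy (and its BLAS backend) is first imported.
thread_limits = {
    "OMP_NUM_THREADS": "4",
    "OPENBLAS_NUM_THREADS": "1",
    "MKL_NUM_THREADS": "1",
}
for name, value in thread_limits.items():
    os.environ[name] = value

import numpy as np  # noqa: E402  (imported after the environment is configured)

print({name: os.environ[name] for name in thread_limits})
```

Keeping BLAS at one thread while OpenMP uses four avoids nested parallelism, where each OpenMP worker would otherwise spawn its own BLAS threads.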

Documentation

Build the Sphinx docs locally:

python -m pip install -e ".[docs]"
python -m sphinx -b html docs docs/_build/html

Main documentation entry points:

  • Use-case gallery: practical application pages organized by topic
  • Benchmark gallery: benchmark plots, tables, and interpretation
  • Algorithms: mathematical descriptions and references
  • Robust statistics background: influence functions, Gateaux derivatives, breakdown point, geodesic convexity, and small-sample issues
  • External and Kaggle gallery: optional external-data results

Do not commit docs/_build/; it is generated by Sphinx.

Benchmarks

Generate the benchmark report:

OMP_NUM_THREADS=4 OPENBLAS_NUM_THREADS=1 MKL_NUM_THREADS=1 \
python benchmarks/make_report.py --outdir results/report

This writes CSV files, plots, a Markdown report, and a standalone HTML report:

results/report/benchmark_report.html
results/report/benchmark_report.md
results/report/*.csv
results/report/*.png

The benchmark documentation reports losses as well as wins: robustcov is not expected to win every anomaly-detection task. The package is strongest when the signal is covariance-shaped, heavy-tailed, or high-dimensional, or when the task benefits from interpretable robust distances.

Examples

Run the reproducible use-case gallery:

python examples/run_use_case_gallery.py --all

Selected examples:

python examples/use_case_finance_risk.py
python examples/use_case_multimodal_anomaly.py
python examples/use_case_sensor_anomaly.py
python examples/use_case_breast_cancer_screening.py
python examples/use_case_digits_one_class_baselines.py
python examples/use_case_ml_preprocessing.py

Refresh generated gallery assets after editing examples:

python docs/generate_gallery_assets.py
python -m sphinx -b html docs docs/_build/html

External and Kaggle examples

External examples live under examples_external/. They are optional and are not part of the test suite because they require manual dataset downloads and may have separate licenses.

Example:

python examples_external/kaggle_credit_card_fraud.py \
  --data examples_external/data/creditcard.csv \
  --outdir results/external/credit_card_fraud

Collect external result summaries:

python examples_external/collect_external_results.py \
  --root results/external \
  --outdir results/external_registry

External result pages should be read as evidence, not as leaderboard claims. Some datasets are strong wins, some are competitive but slower, and some are included mainly to show limitations.

Scope

robustcov currently focuses on:

  1. efficient robust covariance for classical contamination;
  2. heavy-tail scatter estimators for small-sample/high-dimensional regimes;
  3. robust-distance anomaly diagnostics;
  4. application and benchmark galleries with reproducible scripts.

Minimum-volume ellipsoid and full robust mixture modeling are not core priorities yet. They may be added later as experimental features if they strengthen the package without distracting from the current scope.

Development

python -m pip install -e ".[dev,docs]"
python -m pytest -q
python -m sphinx -b html docs docs/_build/html

Build distribution artifacts:

python -m build
python -m twine check dist/*

Release wheels are built by .github/workflows/wheels.yml using cibuildwheel. After configuring the pypi environment on PyPI and GitHub, push a v* tag to publish to PyPI via Trusted Publishing. See RELEASE.md for the full checklist.

Project status

This is a pre-1.0 alpha package. Public APIs may change. The goal of the early releases is to make the estimators, diagnostics, benchmarks, and documentation easy to inspect before stabilizing the interface.

License

Apache-2.0. See LICENSE.

Citation

If you use robustcov in research or applied work, please cite the package using the metadata in CITATION.cff.

