Full conformal novelty detection with conformal e-values and e-BH.

Project description

fcnd

fcnd is a Python package for full conformal novelty detection with finite-sample false discovery rate (FDR) control. It provides implementations of the full conformal methodology developed in the following paper:

J. Lee, I. Popov, Z. Ren. "Full conformal novelty detection". (arXiv)

Novelty detection is the problem of identifying outliers in a test dataset given a reference dataset containing only inliers. The goal is to learn (via ML models) a representation that distinguishes the outliers in the test dataset from the inliers in both the test and reference datasets.
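For intuition, a standard split conformal p-value ranks a test point's nonconformity score among scores computed on held-out inliers; small p-values flag likely outliers. This is a plain-numpy sketch of the idea, independent of the package's API:

```python
import numpy as np

def conformal_pvalue(cal_scores, test_score):
    """Split conformal p-value: rank of the test score among the
    calibration (inlier) scores; larger score = more anomalous."""
    n = len(cal_scores)
    return (1 + np.sum(cal_scores >= test_score)) / (n + 1)

rng = np.random.default_rng(0)
cal = rng.normal(size=99)          # nonconformity scores of 99 inliers
print(conformal_pvalue(cal, 5.0))  # 0.01: more extreme than every calibration score
```

Under exchangeability of the test point with the calibration inliers, this p-value is valid in finite samples, which is what makes FDR control via BH/e-BH possible downstream.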

This package is for method development and is designed to fit into workflows that apply or build on top of full conformal novelty detection methods. It is not a reproduction repository for the paper above: reproduction-specific assets such as job scripts and plotting code are not included. Split conformal baselines are included in case these methods are also of interest.

Features

  • Full conformal novelty detection via FCND
  • Split conformal novelty detection via SCND
  • Model-selection full conformal novelty detection via MSFCND
  • Weighted conformal p-values and e-values for distribution-shifted settings
  • Fast and lightweight conditionally calibrated e-BH boosting via EBHCC
  • Flexible learner interface via BaseLearner, with built-in wrappers for common scikit-learn novelty-detection models
  • Numba-backed numerical kernels enabled by default

Installation

Install the package from PyPI:

pip install fcnd

For notebook examples, install the optional notebook dependencies:

pip install "fcnd[notebook]"

fcnd supports Python 3.10, 3.11, and 3.12. The default dependency set includes numba, numpy, and scikit-learn; scipy is installed transitively by scikit-learn.

Quickstart Example

import numpy as np

from fcnd import FCND, IsoForestLearner
from fcnd.synthetic import generate_wset, gen_data

rng = np.random.default_rng(123)
W = generate_wset(size=16, dims=8, random_state=rng)

X_ref = gen_data(W, 60, a=1.0, random_state=rng)
X_test = np.vstack([
    gen_data(W, 30, a=1.0, random_state=rng),
    gen_data(W, 10, a=3.5, random_state=rng),
])

detector = FCND(
    IsoForestLearner(random_state=1, n_estimators=50),
    use_numba=True,
)
result = detector.detect(X_ref, X_test, alpha=0.2, method="ebh")

print(result.rejections)
print(result.n_rejections)

Expected output:

[ 1 20 31 32 34 37 39]
7

The detect method returns a DetectionResult with conformal p- or e-values, non-conformity scores, the rejection set, and basic metadata.
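For instance, since the quickstart's test set places the a = 3.5 outliers at indices 30-39, the printed rejection set can be checked against the ground truth with plain numpy (using the output shown above):

```python
import numpy as np

# Rejection set printed by the quickstart; test indices 0-29 are the
# a=1.0 inliers, indices 30-39 are the a=3.5 outliers.
rejections = np.array([1, 20, 31, 32, 34, 37, 39])
true_outliers = np.arange(30, 40)

is_true = np.isin(rejections, true_outliers)
fdp = 1 - is_true.mean()                    # false discovery proportion
power = is_true.sum() / len(true_outliers)  # fraction of outliers found
print(f"FDP={fdp:.2f}, power={power:.2f}")  # FDP=0.29, power=0.50
```

Note that e-BH controls the FDR, i.e. the expectation of the FDP over repeated draws, so the realized FDP of a single run can exceed alpha.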

Example Notebook

The notebook examples/fcnd_quickstart.ipynb demonstrates FCND, weighted FCND, and MSFCND in a single workflow.

Main API

  • FCND: Full conformal novelty detector for reference and test samples
  • SCND: Split conformal novelty detector with separate train/calibration/test inputs
  • MSFCND: Model-selection full conformal novelty detector over a candidate learner library
  • EBHCC: Streaming conditionally calibrated e-BH booster
  • IsoForestLearner: Isolation Forest wrapper; larger scores mean more anomalous
  • OneClassSVMLearner: One-class SVM wrapper; larger scores mean more anomalous
  • BaseLearner: Abstract interface for custom score learners
  • bh, ebh: BH and e-BH multiple-testing procedures

Custom Learners

FCND, SCND, and MSFCND can use any learner that implements BaseLearner. A learner must provide:

  • reset(): return a fresh, unfitted learner state;
  • fit(X, **kwargs): fit on an array of observations and return self;
  • score(X): return a one-dimensional array of nonconformity scores. Larger scores should indicate more anomalous observations.

Example wrapper around a scikit-learn-compatible estimator:

import numpy as np
from sklearn.neighbors import LocalOutlierFactor

from fcnd import BaseLearner, FCND


class LOFLearner(BaseLearner):
    def __init__(self, **kwargs):
        self._params = kwargs
        self.model = None
        self.fitted_ = False

    def reset(self):
        self.model = LocalOutlierFactor(novelty=True, **self._params)
        self.fitted_ = False
        return self

    def fit(self, X, **kwargs):
        if self.model is None:
            self.reset()
        self.model.set_params(**kwargs)
        self.model.fit(X)
        self.fitted_ = True
        return self

    def score(self, X):
        if self.model is None:
            raise RuntimeError("Call fit before score.")
        return -np.asarray(self.model.score_samples(X), dtype=float).reshape(-1)


detector = FCND(LOFLearner(n_neighbors=20), use_numba=True)
result = detector.detect(X_ref, X_test, alpha=0.1, method="ebh")

Score Construction Modes

Full conformal methods can construct scores in two main ways:

  • In-sample (IS) scoring fits the learner once on the combined reference and test data and scores all observations with that fitted learner.
  • Leave-one-out (LOO) scoring refits the learner after removing each observation and then scores the held-out observation.

For a single learner, use FCND(..., leave_one_out=True) to enable LOO scoring. By default, FCND uses IS scoring.
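The two modes can be illustrated with a toy "learner" (a hypothetical mean-distance scorer, not one of the package's learners or its internals):

```python
import numpy as np

def is_scores(X, fit, score):
    """In-sample: fit once on all data, score every observation."""
    model = fit(X)
    return score(model, X)

def loo_scores(X, fit, score):
    """Leave-one-out: refit with observation i removed, then score it."""
    n = len(X)
    out = np.empty(n)
    for i in range(n):
        model = fit(np.delete(X, i, axis=0))
        out[i] = score(model, X[i:i + 1])[0]
    return out

# Toy learner: the fitted "model" is the mean; the score is the
# distance to that mean (larger = more anomalous).
fit = lambda X: X.mean(axis=0)
score = lambda mu, X: np.linalg.norm(X - mu, axis=1)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
print(is_scores(X, fit, score).shape, loo_scores(X, fit, score).shape)
```

LOO refits the learner n times, so it is more expensive than IS scoring but removes each point's influence on its own score.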

Model Selection

MSFCND accepts a dictionary or sequence of learner objects and performs block-wise model selection before forming conformal values:

from fcnd import MSFCND, IsoForestLearner, OneClassSVMLearner

learners = {
    "if100": IsoForestLearner(random_state=0, n_estimators=100),
    "svm_auto": OneClassSVMLearner(nu=0.05, kernel="rbf", gamma="auto"),
}
training_modes = {"if100": "is", "svm_auto": "loo"}

ms = MSFCND(
    learners,
    alpha=0.1,
    K=10,
    training_modes=training_modes,
    use_numba=True,
)
result = ms.detect(X_ref, X_test, method="ebh")

As seen above, MSFCND controls the score construction mode through training_modes: pass "is", "loo", "both", or a dictionary assigning a mode to each learner.

Weighted FCND

Weighted conformal calibration is available through load_weights or the high-level detect interface:

result = detector.detect(
    X_ref,
    X_test,
    alpha=0.1,
    method="ebh",
    weights_ref=w_ref,
    weights_test=w_test,
)

The weights should encode the relevant density-ratio information for the weighted-exchangeability setting. For further details, see Section 4 of the paper.
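One common way to obtain such weights (a sketch under a covariate-shift assumption, not part of the fcnd API) is to fit a probabilistic classifier distinguishing reference from test covariates and convert its predicted probabilities into odds:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def density_ratio_weights(X_ref, X_test):
    """Estimate w(x) ~ p_test(x) / p_ref(x) with a probabilistic
    classifier trained to separate test from reference covariates."""
    X = np.vstack([X_ref, X_test])
    y = np.r_[np.zeros(len(X_ref)), np.ones(len(X_test))]
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    p = clf.predict_proba(X)[:, 1]
    # Odds p/(1-p), rescaled to correct for the class-size imbalance.
    ratio = (p / (1 - p)) * (len(X_ref) / len(X_test))
    return ratio[:len(X_ref)], ratio[len(X_ref):]

rng = np.random.default_rng(0)
w_ref, w_test = density_ratio_weights(
    rng.normal(0.0, 1.0, size=(200, 4)),   # reference covariates
    rng.normal(0.3, 1.0, size=(100, 4)),   # shifted test covariates
)
print(w_ref.shape, w_test.shape)
```

Any consistent density-ratio estimator can stand in for the logistic regression here; what matters is that the weights approximate p_test / p_ref on the relevant covariates.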

Conditional Calibration

Because the full conformal method produces e-values, multiple testing is carried out with the e-BH procedure. However, e-BH can be further improved by leveraging an informative conditional distribution given a sufficient statistic.

EBHCC implements this "boosted" e-BH method, called e-BH-CC (e-BH with Conditional Calibration). It is a fast and lightweight implementation suited to conformal novelty detection, reducing memory use and computation time by avoiding resampling and Monte Carlo estimation.

The optimized shortcuts in EBHCC are deliberately restricted to the compatible built-in design choices used by the package:

  • built-in p-value auxiliary statistics for the prune=True skipping shortcut;
  • the built-in hybrid hatR denominator for the guarantee=True uniform-improvement shortcut.

For custom conditional-calibration statistics/denominators, use guarantee=False and prune=False unless the corresponding validity argument has been verified for that statistic/denominator pair. For further details, see Section 3.4 of the paper.
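For reference, the textbook e-BH procedure at level alpha sorts the m e-values in decreasing order and rejects the k* largest, where k* is the largest k with e_(k) >= m / (alpha k). A minimal numpy sketch (the package's ebh may differ in interface):

```python
import numpy as np

def ebh(e, alpha):
    """Textbook e-BH: reject the k* largest e-values, with k* the
    largest k such that the k-th largest e-value is >= m/(alpha*k)."""
    e = np.asarray(e, dtype=float)
    m = len(e)
    order = np.argsort(e)[::-1]                 # indices, descending e
    thresh = m / (alpha * np.arange(1, m + 1))  # m/(alpha*k) for k=1..m
    ok = e[order] >= thresh
    if not ok.any():
        return np.array([], dtype=int)
    k = np.max(np.nonzero(ok)[0]) + 1
    return np.sort(order[:k])

e = np.array([50.0, 1.0, 12.0, 0.5, 30.0])
print(ebh(e, alpha=0.2))  # [0 2 4]
```

Unlike BH on p-values, e-BH controls the FDR at level alpha for arbitrarily dependent e-values, which is what conditional calibration then sharpens.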

Citation

If you use fcnd in your research or workflow, please consider citing our accompanying paper!

@misc{lee2026fullconformal,
  title = {Full conformal novelty detection},
  author = {Lee, Junu and Popov, Ilia and Ren, Zhimei},
  year = {2026},
  eprint = {2501.02703v2},
  archivePrefix = {arXiv},
  primaryClass = {stat.ME},
  doi = {10.48550/arXiv.2501.02703},
  url = {https://arxiv.org/abs/2501.02703v2}
}

A machine-readable citation is also available in CITATION.cff.

License

fcnd is released under the MIT License. See LICENSE for details.

Download files

Download the file for your platform.

Source Distribution

fcnd-0.1.2.tar.gz (28.8 kB)

Uploaded Source

Built Distribution

fcnd-0.1.2-py3-none-any.whl (35.0 kB)

Uploaded Python 3

File details

Details for the file fcnd-0.1.2.tar.gz.

File metadata

  • Download URL: fcnd-0.1.2.tar.gz
  • Upload date:
  • Size: 28.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for fcnd-0.1.2.tar.gz:

  • SHA256: 4fef6af263ec6b3b4a9ae9c5f939992f9c170b5e681f2e04157d7025fea0a5f9
  • MD5: d720d519f7b7a91e70f08ff68a4d5ad6
  • BLAKE2b-256: 0b2c333bc6b76c69837a723050c9a0be28180d02db42e04d18449f911a3afcec


Provenance

The following attestation bundles were made for fcnd-0.1.2.tar.gz:

Publisher: publish.yml on leejunu/fcnd

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file fcnd-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: fcnd-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 35.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for fcnd-0.1.2-py3-none-any.whl:

  • SHA256: ea0d7dc91b245d16df42edfbd1ccf1815f9e24d0c58542747fa92711c4e9ed1b
  • MD5: 9d26d3829dd04403697718ba529b548c
  • BLAKE2b-256: 6920335a962c4022788bb9ac401c74e03e964f1102a851a5caeb8f868192a345


Provenance

The following attestation bundles were made for fcnd-0.1.2-py3-none-any.whl:

Publisher: publish.yml on leejunu/fcnd

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
