Modernized CADE package for concept drift detection, adapted for FIRCE integration.
CADE-FIRCE
Modernized CADE for concept drift detection in reusable Python workflows.
This package adapts the original CADE codebase from the USENIX Security 2021 paper into a library-oriented Python package for integration into other systems. In particular, it provides a runtime detector API that can be used inside streaming pipelines, evaluation frameworks, and drift monitoring components rather than only through the original experimental scripts.
What this fork adds
This fork keeps the core CADE idea intact while updating the codebase for modern Python packaging and programmatic use.
Key changes include:
- packaging through pyproject.toml
- modern dependency management with uv
- a runtime-facing detector class, CadeRuntimeDetector
- a clearer fit and detect workflow for integration into other projects
- improved validation and runtime checks for detector configuration and input data shapes
The main entry point for integration is cade.runtime.CadeRuntimeDetector, which exposes a direct library API for training on reference data and then scoring incoming batches for drift.
Background
CADE, short for Contrastive Autoencoder for Drift Detection and Explanation, was introduced in:
Limin Yang, Wenbo Guo, Qingying Hao, Arridhana Ciptadi, Ali Ahmadzadeh, Xinyu Xing, and Gang Wang.
CADE: Detecting and Explaining Concept Drift Samples for Security Applications.
USENIX Security 2021.
The original work targets a specific form of concept drift in security settings, especially cases where new samples no longer align well with previously learned class structure. This fork focuses on making that detector easier to embed in downstream systems.
If you build on this package in a project or publication, please cite the original CADE paper.
@inproceedings{yang2021cade,
title={$\{$CADE$\}$: Detecting and explaining concept drift samples for security applications},
author={Yang, Limin and Guo, Wenbo and Hao, Qingying and Ciptadi, Arridhana and Ahmadzadeh, Ali and Xing, Xinyu and Wang, Gang},
booktitle={30th USENIX Security Symposium (USENIX Security 21)},
pages={2327--2344},
year={2021}
}
Installation
This project uses uv for environment and dependency management.
Clone the repository, then sync dependencies:
uv sync
For development dependencies:
uv sync --group dev
For scripting helpers:
uv sync --group scripting
To install all configured dependency groups:
uv sync --all-groups
Development workflow
Common development commands:
uv lock
uv sync
uv run pytest -q
uv run pytest --cov=cade --cov-report=term-missing --cov-report=xml
uv run ruff format .
uv run ruff check .
uv run ruff check . --fix
uv build
uv run twine check dist/*
uv run deptry .
If you use the included Makefile, these commands are wrapped in targets such as make sync, make test, make lint, and make build.
Runtime drift detection
The primary integration surface is CadeRuntimeDetector.
It is designed for the common pattern:
- Fit the detector on known reference data
- Encode incoming samples into CADE's latent space
- Measure distance to learned class centroids
- Compute robust anomaly scores using per-class median and MAD statistics
- Flag row-level drift and summarize chunk-level drift status
Detector behavior
After fit, the detector stores:
- the observed training classes
- a label-to-index mapping
- latent centroids for each class
- per-class median distances
- per-class MAD-scaled distance statistics
- the trained encoder model
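The per-class statistics listed above can be sketched in plain Python. This is a minimal illustration, not the package's implementation: the function name fit_class_stats is hypothetical, the latent vectors are assumed to be already encoded, and the 1.4826 scale factor is the usual consistency constant for MAD, assumed here rather than confirmed from the package source.

```python
import math
from statistics import median

MAD_SCALE = 1.4826  # assumed: usual MAD consistency constant for normal data


def fit_class_stats(latents, labels):
    """Return {label: (centroid, median_distance, mad)} from latent vectors."""
    stats = {}
    for label in sorted(set(labels)):
        rows = [z for z, y in zip(latents, labels) if y == label]
        # Centroid: coordinate-wise mean of the class's latent vectors.
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        # Distance from each training row of this class to its own centroid.
        dists = [math.dist(z, centroid) for z in rows]
        med = median(dists)
        # Scaled median absolute deviation of those distances.
        mad = MAD_SCALE * median(abs(d - med) for d in dists)
        stats[label] = (centroid, med, mad)
    return stats


# Toy latent vectors for two classes (illustrative values only).
latents = [
    [0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [8.0, 0.0],
    [20.0, 0.0], [21.0, 0.0], [23.0, 0.0], [28.0, 0.0],
]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]
stats = fit_class_stats(latents, labels)
for label, (centroid, med, mad) in stats.items():
    print(label, centroid, med, mad)
```

Median and MAD are used instead of mean and standard deviation because a few far-away training rows then have little effect on the learned per-class scale.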
During detect(x), the detector:
- validates the input batch
- encodes each row into latent space
- computes distance from each encoded row to every class centroid
- converts those distances into anomaly scores
- marks a row as drifted if its minimum anomaly score exceeds mad_threshold
- marks the chunk as drifted if drift count or drift ratio exceeds configured thresholds
This makes the detector useful both for per-row inspection and for higher-level monitoring decisions.
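The scoring steps can be sketched as follows. This is a simplified stand-in under stated assumptions, not the detector's actual code: score_rows and chunk_is_drifted are hypothetical names, the stats layout (label mapped to centroid, median distance, MAD) is invented for illustration, and the chunk rule assumes that reaching either threshold (using >=) is enough to flag the chunk.

```python
import math

EPS = 1e-12  # guards against division by zero when a class has MAD == 0


def score_rows(latents, stats, mad_threshold=3.5):
    """Return (row_flags, scores, closest_classes) for encoded rows.

    stats maps label -> (centroid, median_distance, mad).
    """
    row_flags, scores, closest_classes = [], [], []
    for z in latents:
        # Distance from this row to every class centroid.
        dists = {label: math.dist(z, c) for label, (c, _, _) in stats.items()}
        # Robust anomaly score against each class.
        anomaly = {
            label: abs(dists[label] - med) / max(mad, EPS)
            for label, (_, med, mad) in stats.items()
        }
        # The row's score is its smallest anomaly score across classes.
        score = min(anomaly.values())
        scores.append(score)
        row_flags.append(score > mad_threshold)
        closest_classes.append(min(dists, key=dists.get))
    return row_flags, scores, closest_classes


def chunk_is_drifted(row_flags, min_drift_ratio, min_drift_count):
    """Assumed rule: flag the chunk when either threshold is reached."""
    count = sum(row_flags)
    if count == 0:
        return False
    ratio = count / len(row_flags)
    return count >= min_drift_count or ratio >= min_drift_ratio


# Illustrative per-class statistics and an encoded chunk of three rows.
stats = {
    "a": ([3.0, 0.0], 2.5, 2.2239),
    "b": ([23.0, 0.0], 2.5, 2.2239),
}
chunk = [[5.0, 0.0], [21.0, 0.0], [60.0, 40.0]]
flags, scores, closest = score_rows(chunk, stats)
drifted = chunk_is_drifted(flags, min_drift_ratio=0.05, min_drift_count=1)
print(flags, closest, drifted)
```

Taking the minimum over classes means a row is flagged only when it looks anomalous relative to every known class, which matches the idea of a sample that no longer fits any learned class.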
Basic example
A minimal runtime example looks like this:
from __future__ import annotations
import numpy as np
from cade.runtime import CadeRuntimeDetector
X_train = np.random.rand(1000, 32).astype(np.float32)
y_train = np.random.randint(0, 3, size=1000)
X_chunk = np.random.rand(128, 32).astype(np.float32)
detector = CadeRuntimeDetector(
    dims=[32, 64, 16],
    margin=10.0,
    mad_threshold=3.5,
    min_drift_ratio=0.05,
    min_drift_count=1,
    batch_size=64,
    epochs=25,
    lr=1e-3,
)
detector.fit(X_train, y_train)
out = detector.detect(X_chunk)
print("Chunk drift:", out.chunk_drift)
print("Drifted rows:", int(out.row_flags.sum()))
print("Scores shape:", out.scores.shape)
The returned object contains:
- row_flags: boolean drift flags for each row
- scores: per-row anomaly scores
- closest_classes: nearest learned class for each row
- chunk_drift: overall chunk-level drift decision
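One way to use these fields is to rank the flagged rows for manual inspection. The sketch below does not import the package: DetectOutput is a hypothetical stand-in with the four documented fields, and top_drifted_rows is an illustrative helper, not part of the library. The same indexing should work when the fields are NumPy arrays.

```python
from dataclasses import dataclass


@dataclass
class DetectOutput:
    """Hypothetical stand-in for the object returned by detect()."""

    row_flags: list
    scores: list
    closest_classes: list
    chunk_drift: bool


def top_drifted_rows(out, k=5):
    """Return (row_index, score, closest_class) for the k highest-scoring flagged rows."""
    flagged = [
        (i, out.scores[i], out.closest_classes[i])
        for i, flag in enumerate(out.row_flags)
        if flag
    ]
    return sorted(flagged, key=lambda item: item[1], reverse=True)[:k]


# Illustrative values only.
out = DetectOutput(
    row_flags=[False, True, True, False],
    scores=[0.2, 9.0, 4.1, 1.3],
    closest_classes=[0, 1, 1, 2],
    chunk_drift=True,
)
for index, score, nearest in top_drifted_rows(out, k=2):
    print(f"row {index}: score={score:.2f}, nearest class={nearest}")
```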
Integration example in a monitoring pipeline
One intended use of this package is wrapping the runtime detector inside a project-specific monitoring interface. For example, a monitoring component can fit CADE on training data and then translate CADE output into a framework-specific drift result object:
from __future__ import annotations

from typing import TYPE_CHECKING

import numpy as np

from cade.runtime import CadeRuntimeDetector
from firce.drift_monitor.base import DriftDetectionResult

from .cade_config import CadeMonitorConfig

if TYPE_CHECKING:
    from firce.utils.config import SimulationConfig
    from firce.utils.perf_stats import PerformanceStats


class CadeDriftMonitor:
    def __init__(self, config: SimulationConfig) -> None:
        # Build the detector from the monitor-specific section of the config.
        cade_cfg = CadeMonitorConfig(**config.monitor_kwargs)
        self._detector = CadeRuntimeDetector(
            dims=cade_cfg.dims,
            margin=cade_cfg.margin,
            mad_threshold=cade_cfg.mad_threshold,
            min_drift_ratio=cade_cfg.min_drift_ratio,
            min_drift_count=cade_cfg.min_drift_count,
            batch_size=cade_cfg.batch_size,
            epochs=cade_cfg.epochs,
            lr=cade_cfg.lr,
            cae_lambda_1=cade_cfg.cae_lambda_1,
            similar_ratio=cade_cfg.similar_ratio,
            display_interval=cade_cfg.display_interval,
            force_retrain=cade_cfg.force_retrain,
            weights_path=cade_cfg.weights_path,
            device=cade_cfg.device,
        )

    def fit(
        self,
        X_train: np.ndarray,
        y_train: np.ndarray,
        perf_stats: PerformanceStats | None = None,
    ) -> None:
        # perf_stats is accepted by the signature but not used by this wrapper.
        self._detector.fit(X_train, y_train)

    def detect(self, X: np.ndarray) -> DriftDetectionResult:
        out = self._detector.detect(X)
        # Normalize CADE output to flat arrays of a fixed dtype.
        row_flags = np.asarray(out.row_flags, dtype=bool).reshape(-1)
        scores = np.asarray(out.scores, dtype=float).reshape(-1)
        return DriftDetectionResult(
            row_flags=row_flags,
            # This wrapper flags the chunk if any row drifted;
            # out.chunk_drift holds the detector's own threshold-based decision.
            chunk_drift=bool(row_flags.any()),
            scores=scores,
            metadata={
                "drift_count": int(row_flags.sum()),
                "chunk_size": int(len(row_flags)),
                "drift_ratio": float(row_flags.mean()) if len(row_flags) else 0.0,
            },
        )
This pattern is useful when CADE is one detector among several, or when a larger framework expects a standard drift-monitor interface.
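The example above imports CadeMonitorConfig from a project-local module that is not shown. A minimal sketch of what such a config class might look like is given below. It is hypothetical: the field names mirror the keyword arguments passed to CadeRuntimeDetector, the first eight defaults are copied from the basic example, and the remaining defaults are placeholders rather than values confirmed by the package.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class CadeMonitorConfig:
    """Hypothetical config holder for a CADE-based drift monitor."""

    # Defaults taken from the basic runtime example.
    dims: List[int] = field(default_factory=lambda: [32, 64, 16])
    margin: float = 10.0
    mad_threshold: float = 3.5
    min_drift_ratio: float = 0.05
    min_drift_count: int = 1
    batch_size: int = 64
    epochs: int = 25
    lr: float = 1e-3
    # Placeholder defaults; not confirmed by the package.
    cae_lambda_1: float = 0.1
    similar_ratio: float = 0.25
    display_interval: int = 10
    force_retrain: bool = False
    weights_path: Optional[str] = None
    device: str = "/CPU:0"

    def __post_init__(self) -> None:
        # Fail early on values that cannot describe a usable detector.
        if len(self.dims) < 2:
            raise ValueError("dims needs at least an input and a latent size")
        if not 0.0 <= self.min_drift_ratio <= 1.0:
            raise ValueError("min_drift_ratio must be within [0, 1]")


# Keyword arguments as they might arrive from a framework config.
monitor_kwargs = {"dims": [8, 4], "epochs": 5}
cfg = CadeMonitorConfig(**monitor_kwargs)
print(cfg)
```

Unpacking the framework's keyword arguments into a dataclass means a misspelled option raises a TypeError at construction time instead of being silently ignored.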
API notes
CadeRuntimeDetector(...)
Important configuration parameters include:
- dims: network dimensions, including input and latent dimensions
- margin: contrastive margin used during training
- mad_threshold: row-level anomaly threshold
- min_drift_ratio: chunk-level ratio threshold
- min_drift_count: chunk-level count threshold
- batch_size: training batch size
- epochs: number of training epochs
- lr: optimizer learning rate
- cae_lambda_1: CAE training weight
- similar_ratio: ratio used for similar-pair construction
- display_interval: training log interval
- weights_path: optional saved weights path
- device: TensorFlow device string such as /CPU:0
- force_retrain: whether to discard an existing weights file before training
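To give a feel for what margin and cae_lambda_1 control, the sketch below shows a standard margin-based contrastive loss on latent vectors. It is an illustration only, not the package's training code: the function names are hypothetical, and treating cae_lambda_1 as the weight on the contrastive term next to a reconstruction loss is an assumption based on the contrastive autoencoder formulation in the CADE paper.

```python
import math


def contrastive_pair_loss(z_i, z_j, same_class, margin=10.0):
    """Loss for one pair of latent vectors."""
    distance = math.dist(z_i, z_j)
    if same_class:
        # Similar pairs are pulled together.
        return distance ** 2
    # Dissimilar pairs are pushed apart until they are at least `margin` away.
    return max(0.0, margin - distance) ** 2


def total_loss(reconstruction_loss, contrastive_loss, cae_lambda_1):
    """Assumed combination: reconstruction term plus weighted contrastive term."""
    return reconstruction_loss + cae_lambda_1 * contrastive_loss


# Same distance (5.0), different pair types.
near_same = contrastive_pair_loss([0.0, 0.0], [3.0, 4.0], same_class=True)
near_diff = contrastive_pair_loss([0.0, 0.0], [3.0, 4.0], same_class=False)
# A dissimilar pair already beyond the margin contributes nothing.
far_diff = contrastive_pair_loss([0.0, 0.0], [30.0, 40.0], same_class=False)
print(near_same, near_diff, far_diff)
```

Under this formulation, a larger margin asks for wider separation between classes in latent space, which in turn affects the centroid distances that the detector later scores.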
fit(x_train, y_train)
Fits the detector on labeled reference data. Input requirements:
- x_train must be a 2D array
- y_train must be a 1D array
- lengths must match
- x_train.shape[1] must equal dims[0]
- at least two classes must be present in training data
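Callers may want to run the same checks before handing data to the detector, for example to fail fast in a pipeline. The helper below is a hypothetical pre-check that works on shape tuples; it is not the package's own validation, and the error messages are illustrative.

```python
def check_fit_inputs(x_shape, y_shape, n_classes, dims):
    """Raise ValueError if the shapes violate the documented fit() requirements."""
    if len(x_shape) != 2:
        raise ValueError("x_train must be a 2D array")
    if len(y_shape) != 1:
        raise ValueError("y_train must be a 1D array")
    if x_shape[0] != y_shape[0]:
        raise ValueError("x_train and y_train must have the same length")
    if x_shape[1] != dims[0]:
        raise ValueError(f"expected {dims[0]} features, got {x_shape[1]}")
    if n_classes < 2:
        raise ValueError("training data must contain at least two classes")


# Passes silently for the shapes used in the basic example.
check_fit_inputs((1000, 32), (1000,), n_classes=3, dims=[32, 64, 16])
```

With NumPy arrays, the call would look like check_fit_inputs(X_train.shape, y_train.shape, len(np.unique(y_train)), dims).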
detect(x)
Scores a batch for drift. Input requirements:
- x must be a 2D array
- x.shape[1] must equal dims[0]
- the detector must already be fitted
When to use this package
This package is a good fit when you need:
- a drift detector that can be embedded directly into Python systems
- row-level drift flags and continuous anomaly scores
- chunk-level drift decisions based on configurable thresholds
- a detector that learns class structure in a latent space rather than relying only on raw-feature distances
It is especially useful in workflows where training data represents known classes and incoming data may contain new or shifted patterns that no longer fit those learned latent distributions.
Project status
This package is a maintained downstream adaptation of the original CADE research code. It is intended to make CADE easier to use in modern Python environments and in integration-heavy projects such as evaluation pipelines, security tooling, and drift monitoring frameworks.
It should not be treated as the official upstream release.
Attribution
This package is derived from the original CADE codebase and research work by:
- Limin Yang
- Wenbo Guo
- Qingying Hao
- Arridhana Ciptadi
- Ali Ahmadzadeh
- Xinyu Xing
- Gang Wang
If you use this fork, please credit both:
- the original CADE paper for the research contribution
- this package or repository for packaging and runtime integration work, where appropriate
License
This repository retains the original CADE licensing terms.
For ethical considerations, the code and data are covered by a modified BSD 3-Clause style license that restricts use to non-commercial scientific research and non-commercial education. Commercial use is prohibited.
Please review the LICENSE file before redistribution or use.
Repository links
- Upstream CADE research repository: whyisyoung/CADE
- This repository: DFAIR-LAB-Augusta/CADE_FIRCE