Skip to main content

A Python library for anomaly detection across tabular, time series, graph, text, and image data. 60+ detectors, benchmark-backed ADEngine orchestration, and an agentic workflow for AI agents.

Project description

PyOD 3: Agentic Anomaly Detection At Scale

PyPI version Anaconda version Documentation status GitHub stars GitHub forks Downloads Testing Coverage Status Maintainability License Benchmark


PyOD is now agentic. Any AI agent can drive an expert-level anomaly detection investigation on your data in plain English. The classic fit/predict API stays unchanged.


PyOD 3 is the most comprehensive Python library for anomaly detection. Four pillars:

Pillar

What it means

Multi-Modal

60+ detectors across tabular, time series, graph, text, and image data, one API

Full Lifecycle

From raw data to explained anomalies and next-step guidance in a single call

Agentic

Ask in plain English, and AI agents run expert-level detection without OD expertise

Most Used

38+ million downloads; benchmark-backed routing (ADBench, TSB-AD, BOND, NLP-ADBench)

Install

Core library (required for every activation path):

pip install pyod

Then pick the activation path that matches your agent stack:

# 1. Claude Code / Claude Desktop / Codex — enables the od-expert skill
pyod install skill              # Claude Code / Desktop: user-global (~/.claude/skills/)
pyod install skill --project    # Codex: project-local (./skills/, Codex has no user-global dir)

# 2. Any MCP-compatible LLM — requires the optional mcp extra
pip install pyod[mcp]
pyod mcp serve                 # alias for `python -m pyod.mcp_server`

# 3. Pure Python — no extra step
#    from pyod.utils.ad_engine import ADEngine

Run pyod info at any time to see version, detector counts, and the install state of each activation path. pyod info also detects which agent stack you have installed (~/.claude/ for Claude Code, ~/.codex/ for Codex) and recommends the right install command.

For conda, source install, dependency details, and troubleshooting, see the full installation guide. The legacy pyod-install-skill command from v3.0.0 still works as an alias for pyod install skill.

Outlier Detection with 5 Lines of Code (pip install pyod):

from pyod.models.iforest import IForest
clf = IForest()
clf.fit(X_train)
y_train_scores = clf.decision_scores_          # training anomaly scores
y_test_scores = clf.decision_function(X_test)   # test anomaly scores

Three ways to use PyOD:

Layer

Name

When to use

Entry point

1

Classic API

You know which detector you want

Layer 1 examples

2

ADEngine

You want PyOD to choose, compare, and assess automatically

Layer 2 walkthrough

3

Agentic Investigation

You want an AI agent to drive OD through natural conversation

Layer 3 walkthrough

Layers 2 and 3 are powered by ADEngine, PyOD’s intelligent orchestration core. Layer 3 adds two agentic activation paths: the od-expert skill for Claude Code and Codex, and an MCP server (python -m pyod.mcp_server) that works with any MCP-compatible LLM out of the box. See the Install block above for detailed setup instructions.

PyOD 3 agentic investigation demo on cardiotocography dataset

The figure above shows a real 5-turn agentic conversation on the UCI Cardiotocography dataset. See the full walkthrough, runnable agentic example, or interactive HTML demo.

PyOD Ecosystem & Resources: NLP-ADBench (NLP anomaly detection) | TODS (time-series) | PyGOD (graph) | ADBench (benchmark) | AD-LLM (LLM-based AD) [50] | Resources


About PyOD

PyOD, established in 2017, is the longest-running and most widely used Python library for anomaly detection. With 38+ million downloads, it serves both academic research (featured in Analytics Vidhya, KDnuggets, and Towards Data Science) and commercial products.

V3 extends the library with ADEngine (intelligent orchestration) and the od-expert skill (agentic workflow), while keeping the classic fit/predict API fully backward-compatible. V3 is built on SUOD [55] for fast parallel training and numba JIT for per-model speedups.

Impact & Recognition:

Area

Examples

Space & science

European Space Agency OPS-SAT spacecraft telemetry benchmark (Nature Scientific Data, 2025) uses PyOD for all 30 algorithms.

Enterprise deployment

Walmart (1M+ daily pricing updates, KDD 2019), Databricks (Kakapo framework integrating PyOD with MLflow/Hyperopt; insider-threat detection solution), IQVIA (123K+ pharmacy claims), Altair AI Studio, Ericsson (patent WO2023166515A1).

Books

Outlier Detection in Python (Brett Kennedy, Manning); Handbook of Anomaly Detection with Python (Chris Kuo, Columbia); Finding Ghosts in Your Data (Kevin Feasel, Apress).

Courses

DataCamp Anomaly Detection in Python (19M+ platform learners), Manning liveProject, O’Reilly video edition, multiple Udemy courses.

Podcasts

Talk Python To Me #497, Real Python Podcast #208.

International

Tutorials in 5 non-English languages: Chinese (CSDN, Zhihu, 搜狐, 机器之心, aidoczh.com full doc translation), Japanese, Korean, German, Spanish.

See the full impact page on Read the Docs for the complete list of citations, enterprise deployments, patents, and media coverage.

Citing PyOD:

If you use PyOD in a scientific publication, we would appreciate citations to the following paper(s):

PyOD 2: A Python Library for Outlier Detection with LLM-powered Model Selection is available as a preprint. If you use PyOD in a scientific publication, we would appreciate citations to the following paper:

@inproceedings{chen2025pyod,
  title={Pyod 2: A python library for outlier detection with llm-powered model selection},
  author={Chen, Sihan and Qian, Zhuangzhuang and Siu, Wingchun and Hu, Xingcan and Li, Jiaqi and Li, Shawn and Qin, Yuehan and Yang, Tiankai and Xiao, Zhuo and Ye, Wanghao and others},
  booktitle={Companion Proceedings of the ACM on Web Conference 2025},
  pages={2807--2810},
  year={2025}
}

PyOD paper is published in Journal of Machine Learning Research (JMLR) (MLOSS track).:

@article{zhao2019pyod,
    author  = {Zhao, Yue and Nasrullah, Zain and Li, Zheng},
    title   = {PyOD: A Python Toolbox for Scalable Outlier Detection},
    journal = {Journal of Machine Learning Research},
    year    = {2019},
    volume  = {20},
    number  = {96},
    pages   = {1-7},
    url     = {http://jmlr.org/papers/v20/19-011.html}
}

or:

Zhao, Y., Nasrullah, Z. and Li, Z., 2019. PyOD: A Python Toolbox for Scalable Outlier Detection. Journal of machine learning research (JMLR), 20(96), pp.1-7.

For a broader perspective on anomaly detection, see our NeurIPS papers on ADBench [17] and ADGym.

Table of Contents:


API Cheatsheet & Reference

The full API Reference is split by modality at PyOD Documentation: Tabular, Time Series, Graph, Embedding, ADEngine, Utilities. Below is a quick cheatsheet for all detectors:

  • fit(X): Fit the detector. The parameter y is ignored in unsupervised methods.

  • decision_function(X): Predict raw anomaly scores for X using the fitted detector.

  • predict(X): Determine whether a sample is an outlier or not as binary labels using the fitted detector.

  • predict_proba(X): Estimate the probability of a sample being an outlier using the fitted detector.

  • predict_confidence(X): Assess the model’s confidence on a per-sample basis (applicable in predict and predict_proba) [38].

  • predict_with_rejection(X): Allow the detector to reject (i.e., abstain from making) highly uncertain predictions (output = -2) [39].

Key Attributes of a fitted model:

  • decision_scores_: Outlier scores of the training data. Higher scores typically indicate more abnormal behavior. Outliers usually have higher scores.

  • labels_: Binary labels of the training data, where 0 indicates inliers and 1 indicates outliers/anomalies.


Benchmarks

  • ADBench [17]: 30 algorithms on 57 tabular datasets. See comparison.

  • NLP-ADBench: 19 methods on 8 text datasets. Two-step (embedding + detector) beats end-to-end.

  • TSB-AD [70]: 40 algorithms on 1070 time series datasets (NeurIPS 2024).

  • BOND [71]: 14 graph anomaly detection algorithms on 14 datasets (NeurIPS 2022).


Additional Topics


Implemented Algorithms

PyOD is organized into two functional groups: (i) Detection Algorithms, with dedicated subsections for tabular, time series, and graph data (EmbeddingOD inside the tabular table adds multi-modal support for text and image via foundation model encoders); and (ii) Utility Functions for data generation, evaluation, and intelligent orchestration.

(i-a) Tabular & Multi-Modal Detection Algorithms :

Type

Abbr

Algorithm

Year

Ref

Probabilistic

ECOD

Unsupervised Outlier Detection Using Empirical Cumulative Distribution Functions (example)

2022

[31]

Probabilistic

ABOD

Angle-Based Outlier Detection (example)

2008

[24]

Probabilistic

FastABOD

Fast Angle-Based Outlier Detection using approximation (example)

2008

[24]

Probabilistic

COPOD

COPOD: Copula-Based Outlier Detection (example)

2020

[30]

Probabilistic

MAD

Median Absolute Deviation (MAD) (example)

1993

[21]

Probabilistic

SOS

Stochastic Outlier Selection (example)

2012

[22]

Probabilistic

QMCD

Quasi-Monte Carlo Discrepancy outlier detection (example)

2001

[13]

Probabilistic

KDE

Outlier Detection with Kernel Density Functions (example)

2007

[26]

Probabilistic

Sampling

Rapid distance-based outlier detection via sampling (example)

2013

[46]

Probabilistic

GMM

Probabilistic Mixture Modeling for Outlier Analysis (example)

[1] [Ch.2]

Linear Model

PCA

Principal Component Analysis (sum of weighted projected distances to eigenvector hyperplanes) (example)

2003

[45]

Linear Model

KPCA

Kernel Principal Component Analysis (example)

2007

[20]

Linear Model

MCD

Minimum Covariance Determinant (Mahalanobis distances as outlier scores) (example)

1999

[18] [41]

Linear Model

CD

Cook’s distance for outlier detection (example)

1977

[11]

Linear Model

OCSVM

One-Class Support Vector Machines (example)

2001

[44]

Linear Model

LMDD

Deviation-based Outlier Detection (LMDD) (example)

1996

[6]

Proximity-Based

LOF

Local Outlier Factor (example)

2000

[8]

Proximity-Based

COF

Connectivity-Based Outlier Factor (example)

2002

[47]

Proximity-Based

(Incr.) COF

Memory Efficient Connectivity-Based Outlier Factor (slower, reduced storage) (example)

2002

[47]

Proximity-Based

CBLOF

Clustering-Based Local Outlier Factor (example)

2003

[19]

Proximity-Based

LOCI

LOCI: Fast outlier detection via local correlation integral (example)

2003

[36]

Proximity-Based

HBOS

Histogram-based Outlier Score (example)

2012

[14]

Proximity-Based

HDBSCAN

Density-based clustering via hierarchical density estimates (example)

2013

[10]

Proximity-Based

kNN

k Nearest Neighbors (distance to k-th neighbor as outlier score) (example)

2000

[40]

Proximity-Based

AvgKNN

Average kNN (average distance to k neighbors as outlier score) (example)

2002

[5]

Proximity-Based

MedKNN

Median kNN (median distance to k neighbors as outlier score) (example)

2002

[5]

Proximity-Based

SOD

Subspace Outlier Detection (example)

2009

[25]

Proximity-Based

ROD

Rotation-based Outlier Detection (example)

2020

[4]

Outlier Ensembles

IForest

Isolation Forest (example)

2008

[32]

Outlier Ensembles

INNE

Isolation-based Anomaly Detection via Nearest-Neighbor Ensembles (example)

2018

[7]

Outlier Ensembles

DIF

Deep Isolation Forest for Anomaly Detection (example)

2023

[49]

Outlier Ensembles

FB

Feature Bagging (example)

2005

[27]

Outlier Ensembles

LSCP

LSCP: Locally Selective Combination of Parallel Outlier Ensembles (example)

2019

[54]

Outlier Ensembles

XGBOD

Extreme Boosting Based Outlier Detection (Supervised) (example)

2018

[53]

Outlier Ensembles

LODA

Lightweight On-line Detector of Anomalies (example)

2016

[37]

Outlier Ensembles

SUOD

SUOD: Accelerating Large-scale Unsupervised Heterogeneous OD (Acceleration) (example)

2021

[55]

Neural Networks

AutoEncoder

Fully connected AutoEncoder (reconstruction error as outlier score) (example)

[1] [Ch.3]

Neural Networks

VAE

Variational AutoEncoder (reconstruction error as outlier score) (example)

2013

[23]

Neural Networks

Beta-VAE

Variational AutoEncoder with customized loss (gamma and capacity) (example)

2018

[9]

Neural Networks

SO_GAAL

Single-Objective Generative Adversarial Active Learning (example)

2019

[33]

Neural Networks

MO_GAAL

Multiple-Objective Generative Adversarial Active Learning (example)

2019

[33]

Neural Networks

DeepSVDD

Deep One-Class Classification (example)

2018

[42]

Neural Networks

AnoGAN

Anomaly Detection with Generative Adversarial Networks

2017

[43]

Neural Networks

ALAD

Adversarially learned anomaly detection (example)

2018

[52]

Neural Networks

AE1SVM

Autoencoder-based One-class Support Vector Machine (example)

2019

[34]

Neural Networks

DevNet

Deep Anomaly Detection with Deviation Networks (example)

2019

[35]

Graph-based

R-Graph

Outlier detection by R-graph (example)

2017

[51]

Graph-based

LUNAR

LUNAR: Unifying Local OD Methods via Graph Neural Networks (example)

2022

[15]

Embedding-based

EmbeddingOD

Multi-modal anomaly detection via foundation model embeddings, text and image (example)

2025

[28]

Ensemble methods (IForest, INNE, DIF, FB, LSCP, LODA, SUOD, XGBOD) are included in the table above. Score combination functions (average, maximization, AOM, MOA, median, majority vote) are in pyod.models.combination. See API docs for details.

(i-b) Time Series Anomaly Detection :

All time series detectors use the same fit/predict/decision_function API as tabular detectors, with one exception: MatrixProfile is transductive (train-only; use decision_scores_ and labels_ after fit(), no out-of-sample predict).

Input format: numpy array of shape (n_timestamps,) for univariate or (n_timestamps, n_channels) for multivariate. Each row is one timestep; columns are channels/features. Pandas DataFrames and lists are auto-converted. Output: decision_scores_ of shape (n_timestamps,) with one anomaly score per timestep.

Time series detection in 3 lines:

from pyod.models.ts_kshape import KShape      # or any TS detector
clf = KShape(window_size=20)
clf.fit(X_train)                               # shape (n_timestamps,) or (n_timestamps, n_channels)
scores = clf.decision_scores_                  # per-timestamp anomaly scores

Algorithm rankings from TSB-AD benchmark [70] (NeurIPS 2024, 1070 datasets):

Type

Abbr

Algorithm

Year

Ref

Windowed Bridge

TimeSeriesOD

Any PyOD detector on sliding windows (example)

2026

Subsequence

MatrixProfile

Matrix Profile via STOMP, transductive (example)

2016

[61]

Frequency

SpectralResidual

Spectral Residual: FFT-based saliency (example)

2019

[59]

Clustering

KShape

k-Shape clustering (#2 in TSB-AD) (example)

2015

[58]

Streaming

SAND

Streaming with drift adaptation, experimental (example)

2021

[56]

Deep Learning

LSTMAD

LSTM prediction error + Mahalanobis scoring

2015

[57]

Deep Learning

AnomalyTransformer

Transformer with association discrepancy (experimental)

2022

[60]

(i-c) Graph Anomaly Detection (pip install pyod[graph]):

All graph detectors are transductive in v1: use decision_scores_ and labels_ after fit(). No out-of-sample predict. Input: PyG Data object with x (node features) and edge_index (COO edges). SCAN works without features.

Graph detection in 3 lines (pip install pyod[graph]):

from pyod.models.pyg_dominant import DOMINANT
clf = DOMINANT(hidden_dim=64, epochs=100)
clf.fit(data)                                  # PyG Data object
scores = clf.decision_scores_                  # per-node anomaly scores

Algorithm rankings from BOND benchmark [71] (NeurIPS 2022, 14 datasets):

Type

Abbr

Algorithm

Year

Ref

GCN Autoencoder

DOMINANT

GCN AE, structure + attribute reconstruction (#1 BOND deep) (dominant example)

2019

[62]

Contrastive

CoLA

Contrastive self-supervised, local neighbor context (#2 BOND deep) (cola example)

2022

[63]

Contrastive+AE

CONAD

Contrastive with anomalous-view injection + dual reconstruction (conad example)

2022

[64]

Attention AE

AnomalyDAE

GAT structure encoder + MLP attribute encoder (anomalydae example)

2020

[65]

Motif AE

GUIDE

Dual GCN AE on original + triangle-motif adjacency (guide example)

2021

[66]

Matrix Factor.

Radar

Residual analysis via matrix factorization (radar example)

2017

[67]

Matrix Factor.

ANOMALOUS

Joint MF with Laplacian regularization (anomalous example)

2018

[68]

Structural

SCAN

Structural clustering, no features needed (scan example)

2007

[69]

(ii) Utility Functions:

Type

Name

Function

Data

generate_data

Synthesized data generation; normal data from multivariate Gaussian, outliers from uniform distribution

Data

generate_data_clusters

Synthesized data generation in clusters for more complex patterns

Evaluation

evaluate_print

Print ROC-AUC and Precision @ Rank n for a detector

Evaluation

precision_n_scores

Calculate Precision @ Rank n

Utility

get_label_n

Turn raw outlier scores into binary labels by assigning 1 to the top n scores

Stat

wpearsonr

Calculate the weighted Pearson correlation of two samples

Encoding

resolve_encoder

Resolve an encoder from a string name, BaseEncoder instance, or callable

Encoding

SentenceTransformerEncoder

Encode text via sentence-transformers models (e.g., MiniLM, mpnet)

Encoding

OpenAIEncoder

Encode text via OpenAI Embeddings API (text-embedding-3-small/large)

Encoding

HuggingFaceEncoder

Encode text or images via HuggingFace transformers (BERT, DINOv2, CLIP)


Quick Start for Outlier Detection

PyOD has been well acknowledged by the machine learning community with a few featured posts and tutorials.

Analytics Vidhya: An Awesome Tutorial to Learn Outlier Detection in Python using PyOD Library

KDnuggets: Intuitive Visualization of Outlier Detection Methods, An Overview of Outlier Detection Methods from PyOD

Towards Data Science: Anomaly Detection for Dummies

“examples/knn_example.py” demonstrates the basic API of using kNN detector. It is noted that the API across all other algorithms are consistent/similar.

More detailed instructions for running examples can be found in examples directory.

  1. Initialize a kNN detector, fit the model, and make the prediction.

    from pyod.models.knn import KNN   # kNN detector
    
    # train kNN detector
    clf_name = 'KNN'
    clf = KNN()
    clf.fit(X_train)
    
    # get the prediction label and outlier scores of the training data
    y_train_pred = clf.labels_  # binary labels (0: inliers, 1: outliers)
    y_train_scores = clf.decision_scores_  # raw outlier scores
    
    # get the prediction on the test data
    y_test_pred = clf.predict(X_test)  # outlier labels (0 or 1)
    y_test_scores = clf.decision_function(X_test)  # outlier scores
    
    # it is possible to get the prediction confidence as well
    y_test_pred, y_test_pred_confidence = clf.predict(X_test, return_confidence=True)  # outlier labels (0 or 1) and confidence in the range of [0,1]
  2. Evaluate the prediction by ROC and Precision @ Rank n (p@n).

    from pyod.utils.data import evaluate_print
    
    # evaluate and print the results
    print("\nOn Training Data:")
    evaluate_print(clf_name, y_train, y_train_scores)
    print("\nOn Test Data:")
    evaluate_print(clf_name, y_test, y_test_scores)
  3. See a sample output & visualization.

    On Training Data:
    KNN ROC:1.0, precision @ rank n:1.0
    
    On Test Data:
    KNN ROC:0.9989, precision @ rank n:0.9
    visualize(clf_name, X_train, y_train, X_test, y_test, y_train_pred,
        y_test_pred, show_figure=True, save_figure=False)

Reference

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pyod-3.2.1.tar.gz (304.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pyod-3.2.1-py3-none-any.whl (364.7 kB view details)

Uploaded Python 3

File details

Details for the file pyod-3.2.1.tar.gz.

File metadata

  • Download URL: pyod-3.2.1.tar.gz
  • Upload date:
  • Size: 304.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for pyod-3.2.1.tar.gz
Algorithm Hash digest
SHA256 f7b729d601e6442e67cbf9d845f3200657ca5ae96359c07fa597afdbb87c1114
MD5 3a92b7a73370c951cae8760bfc944086
BLAKE2b-256 a05d1cf8e3b991989b8b573c9e0079a0385a6c1444a8702bef5241fd99e6a939

See more details on using hashes here.

File details

Details for the file pyod-3.2.1-py3-none-any.whl.

File metadata

  • Download URL: pyod-3.2.1-py3-none-any.whl
  • Upload date:
  • Size: 364.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for pyod-3.2.1-py3-none-any.whl
Algorithm Hash digest
SHA256 258afe6d93a02935b221f19fc7bfa8e94b47a6e081cdd16d67f69bf377855bd5
MD5 805f33656a054422a4619e0508229be5
BLAKE2b-256 cb831ae33ae6db890f20b32bf63a9333df271ec1edeca222e60d03faa512a7c4

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page