OSBAD (Open-source Benchmark of Anomaly Detection)
Welcome to the Open-Source Benchmark of Anomaly Detection (OSBAD) repository, a unified, reproducible framework for evaluating how well statistical, distance-based, and machine learning methods detect anomalies in chemical and materials science applications. As data-driven analysis becomes central to fields ranging from batteries, catalysis, and polymers to alloys and nanomaterials, reliable and efficient anomaly detection is crucial for discovery, safety, and performance optimization.
What Are Anomalies?
Anomalies are observations in data that deviate significantly from expected or typical patterns. For example, in the context of battery systems, anomalies may signal degradation, faults, or unsafe conditions and can indicate issues like overheating, capacity fade, or internal short circuits.
We consider two primary types of anomalies:
- Point anomalies: A single data point that differs significantly from the rest. Examples: a sudden change in the voltage, current, or temperature measurement of a battery system, a sudden temperature spike during a reaction, or an outlying measurement in spectroscopic data.
- Collective anomalies: A sequence or group of data points that is anomalous when considered together, even if each point appears normal in isolation. Examples: a continuous series of abnormal voltage measurements across a window that diverges from the expected discharge behavior, or a time series of abnormal stress-strain measurements in a mechanical test that diverges from the expected material deformation behavior.
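The difference between the two anomaly types can be illustrated with a minimal sketch. The function names `point_anomalies` and `collective_anomalies`, the voltage traces, and the thresholds below are all hypothetical and chosen for illustration only; they are not part of the OSBAD API.

```python
from statistics import median


def point_anomalies(values, threshold):
    """Indices of single values that deviate from the series median by more than threshold."""
    centre = median(values)
    return [i for i, v in enumerate(values) if abs(v - centre) > threshold]


def collective_anomalies(values, window, threshold):
    """Start indices of windows whose mean deviates from the series median by more than threshold."""
    centre = median(values)
    starts = []
    for start in range(len(values) - window + 1):
        chunk = values[start:start + window]
        if abs(sum(chunk) / window - centre) > threshold:
            starts.append(start)
    return starts


# Hypothetical voltage traces in volts (illustrative values, not benchmark data).
spike = [3.70, 3.71, 3.69, 4.60, 3.70, 3.71, 3.70, 3.69]
drift = [3.70, 3.71, 3.69, 3.70, 3.78, 3.79, 3.80, 3.79, 3.70, 3.71]

spike_points = point_anomalies(spike, threshold=0.15)            # one isolated spike
drift_points = point_anomalies(drift, threshold=0.15)            # no single value stands out
drift_windows = collective_anomalies(drift, window=4, threshold=0.07)
```

In the `drift` trace no individual reading crosses the point threshold, yet the window starting at index 4 is flagged because its mean sits well away from the series median. Averaging over a window reduces noise, which is why a tighter threshold can be used for the collective check.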
Why Is Anomaly Detection Important?
Chemical and material systems are critical in applications such as energy storage, catalysis, electronics, structural design, and biomedical devices. Anomaly detection plays a vital role in:
- Research efficiency: Identifying irregular experimental data points for faster analysis and reproducibility.
- Material discovery: Detecting rare but valuable events that could indicate new material properties.
- Process safety: Early detection of unsafe conditions (e.g., thermal runaway in battery operations).
- Preventive maintenance: Identifying degradation or failure modes before performance drops.
- Regulatory compliance: Ensuring that processes and produced materials meet consistency and performance standards.
Robust anomaly detection helps improve reliability, accelerate innovation, and ensure safety across a wide range of chemical and material applications.
Methods Included in This Benchmark
This benchmark includes a broad spectrum of approaches grouped into three categories:
Statistical Methods
- Standard Deviation
- Median Absolute Deviation (MAD)
- Interquartile Range (IQR)
- Z-score
- Modified Z-score
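As a rough illustration of how three of these statistical rules behave, the sketch below computes z-scores, MAD-based modified z-scores, and IQR fences using only the Python standard library. The helper names, the sample temperatures, and the cut-offs (3 for the z-score, 3.5 for the modified z-score, 1.5 for the IQR multiplier) are common conventions assumed here, not values taken from OSBAD.

```python
from statistics import mean, median, pstdev, quantiles


def z_scores(values):
    """Standard z-score: distance from the mean in units of standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def modified_z_scores(values):
    """Modified z-score: distance from the median scaled by the median absolute deviation."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    return [0.6745 * (v - med) / mad for v in values]


def iqr_outliers(values, k=1.5):
    """Indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    lower, upper = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [i for i, v in enumerate(values) if v < lower or v > upper]


def flag(scores, threshold):
    """Indices whose absolute score exceeds the threshold."""
    return [i for i, s in enumerate(scores) if abs(s) > threshold]


# Hypothetical temperature readings in degrees Celsius; the last one is an outlier.
temps = [25.0, 25.2, 24.9, 25.1, 25.0, 24.8, 25.1, 31.0]

z_flags = flag(z_scores(temps), 3.0)
mad_flags = flag(modified_z_scores(temps), 3.5)
iqr_flags = iqr_outliers(temps)
```

On this small sample the plain z-score misses the outlier at a cut-off of 3, because the outlier itself inflates the mean and standard deviation, while the median-based rules still flag it. This masking effect is one reason robust statistics such as MAD and IQR are usually compared alongside the z-score.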
Distance-Based Metrics
- Euclidean Distance
- Manhattan Distance
- Minkowski Distance
- Mahalanobis Distance
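The four distances are closely related: Manhattan and Euclidean are the Minkowski distance with p = 1 and p = 2, while Mahalanobis additionally rescales by the covariance of the data. The sketch below is a minimal standard-library illustration; `minkowski` and `mahalanobis_2d` are hypothetical helpers, the Mahalanobis version is restricted to two features so the covariance matrix can be inverted by hand, and the data points are invented.

```python
def minkowski(a, b, p):
    """Minkowski distance of order p (p=1 is Manhattan, p=2 is Euclidean)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)


def mahalanobis_2d(point, data):
    """Mahalanobis distance of a 2-D point from the mean of 2-D data."""
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    # Sample covariance matrix entries.
    sxx = sum((x - mx) ** 2 for x, _ in data) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in data) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in data) / (n - 1)
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mx, point[1] - my
    # d^2 = [dx dy] S^-1 [dx dy]^T, with the 2x2 inverse written out explicitly.
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return d2 ** 0.5


euclidean = minkowski((0, 0), (3, 4), p=2)
manhattan = minkowski((0, 0), (3, 4), p=1)

# Hypothetical, strongly correlated two-feature measurements.
data = [(1, 1.1), (2, 1.9), (3, 3.2), (4, 3.8), (5, 5.0)]
on_trend = mahalanobis_2d((4, 4), data)    # follows the correlation
off_trend = mahalanobis_2d((4, 2), data)   # breaks the correlation
```

Both test points lie at the same Euclidean distance from the data mean, but the Mahalanobis distance of the off-trend point is far larger because it violates the correlation between the two features.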
Machine Learning Approaches
- Isolation Forest
- K-Nearest Neighbors (KNN)
- Gaussian Mixture Models (GMM)
- Local Outlier Factor (LOF)
- Principal Component Analysis (PCA)
- Autoencoders (AE)
Each method is applied and tested on curated benchmarking datasets to assess its suitability and effectiveness.
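To give a sense of how one of the listed machine learning approaches scores anomalies, here is a minimal standard-library sketch of a K-Nearest Neighbors outlier score, where each point is scored by the distance to its k-th nearest neighbor. The function name `knn_scores`, the choice of k, and the sample points are assumptions for illustration; this is not the implementation used in OSBAD, and a library implementation would normally replace this brute-force loop.

```python
from math import dist


def knn_scores(points, k):
    """Score each point by the distance to its k-th nearest neighbor (larger = more isolated)."""
    scores = []
    for i, p in enumerate(points):
        # Brute-force distances to every other point, sorted ascending.
        distances = sorted(dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(distances[k - 1])
    return scores


# Hypothetical two-feature samples: a tight cluster plus one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (8, 8)]

scores = knn_scores(points, k=2)
most_anomalous = scores.index(max(scores))
```

Points inside the cluster have a small k-th neighbor distance, whereas the isolated point has a large one, so ranking by score puts it first. Turning scores into labels still requires a threshold or a contamination rate, which is a separate design choice.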
Evaluation Metrics
This benchmark evaluates each method using the following metrics:
- Accuracy: Overall correctness of the anomaly detector
- Precision: Proportion of detected anomalies that are truly anomalous
- Recall: Proportion of actual anomalies that were correctly detected
- F1-score: Harmonic mean of precision and recall
- Matthews Correlation Coefficient (MCC): A balanced measure that remains informative on imbalanced datasets because it accounts for true positives, true negatives, false positives, and false negatives.
These metrics help ensure a fair and comprehensive comparison across different detection techniques.
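The metrics above all follow from the four confusion-matrix counts. The sketch below computes them with the standard definitions, assuming labels of 1 for anomalous and 0 for normal; the function name `evaluate`, the zero-division fallbacks, and the sample labels are assumptions for illustration rather than OSBAD's own code.

```python
from math import sqrt


def evaluate(y_true, y_pred):
    """Compute accuracy, precision, recall, F1 and MCC for binary labels (1 = anomaly)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Fall back to 0.0 when a denominator is zero (an assumed convention).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "mcc": mcc}


# Hypothetical imbalanced labels: 18 normal points and 2 anomalies.
y_true = [0] * 18 + [1] * 2

# A detector that never raises an alarm.
always_normal = evaluate(y_true, [0] * 20)

# A detector with one true positive, one false positive and one miss.
mixed = evaluate(y_true, [0] * 17 + [1] + [1, 0])
```

Both detectors reach the same accuracy on this imbalanced sample, yet the first one finds no anomalies at all. Recall, F1, and MCC separate the two, which is why accuracy alone is not sufficient for anomaly detection benchmarks.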
Documentation
The documentation for this project can be found here: OSBAD Documentation
Contributing
Contributions are welcome! Whether it's new methods, datasets, or performance improvements, feel free to open an issue or submit a pull request.
License
This project is licensed under the Apache License, Version 2.0.
Contact
For questions, collaborations, or feedback, please open an issue or contact the repository maintainer.
File details
Details for the file osbad-1.5.0-py3-none-any.whl.
File metadata
- Download URL: osbad-1.5.0-py3-none-any.whl
- Upload date:
- Size: 55.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 42fe7d82947912f889214e078081e37f9ee5a23a928426a9dc068d539d831b08 |
| MD5 | e9cb717a7a8e0d5fd73b4c47b82f9d3e |
| BLAKE2b-256 | b91d1ef4fdee7d99dc6791e7263355b9ebb029669a35e8091f53eecc136d022b |