Multi-Dimensional Outlier Detection (MDOD) using vector cosine similarity with added virtual dimension
MDOD - Multi-Dimensional Outlier Detection
MDOD is a Python library for outlier detection in multi-dimensional data using vector cosine similarity with an added virtual dimension.
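As a rough illustration of what "cosine similarity with an added virtual dimension" can mean geometrically, here is a minimal NumPy sketch. It is not taken from the library's source or from the paper. It assumes one plausible reading: each point p is lifted by norm_distance along one extra axis to an observer p', and the cosine of the angle at p' between p and another point q is norm_distance / sqrt(norm_distance^2 + ||q - p||^2). The function name virtual_dim_scores and the choice to sum the top_n similarities are hypothetical.

```python
import numpy as np


def virtual_dim_scores(X, norm_distance=1.0, top_n=5):
    """Illustrative score: sum of the top_n cosine similarities per point.

    Assumption (not the library's confirmed method): the observer for point p
    sits at distance norm_distance above p along one extra axis. Low sums mean
    the point has few close neighbours. Requires top_n <= len(X) - 1.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise squared Euclidean distances, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    sq_dist = np.einsum("ijk,ijk->ij", diff, diff)
    # Cosine of the angle seen from the lifted observer.
    cos_sim = norm_distance / np.sqrt(norm_distance**2 + sq_dist)
    # Exclude each point's similarity to itself.
    np.fill_diagonal(cos_sim, -np.inf)
    # Keep the top_n largest similarities in every row and add them up.
    top = np.partition(cos_sim, -top_n, axis=1)[:, -top_n:]
    return top.sum(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cluster = rng.normal(0.0, 0.1, size=(50, 2))
    X_demo = np.vstack([cluster, [[10.0, 10.0]]])  # last row is far away
    scores_demo = virtual_dim_scores(X_demo)
    print("lowest-scoring row:", int(np.argmin(scores_demo)))
```

Under this reading, a larger norm_distance flattens the similarities toward 1 and makes the score less sensitive to small distances, which matches the parameter's described role. The pairwise array needs O(n^2 · d) memory, so this sketch only suits small inputs.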
The core algorithm is based on the author's original academic paper:
"Outlier Detect using Vector Cosine Similarity by Adding a Dimension"
Published in: The 2024 International Conference on Artificial Intelligence in Information and Communications (ICAIIC 2024)
DOI: 10.1109/ICAIIC60209.2024.10463442
The code implementation was developed by the author with significant optimization assistance from Grok (xAI).
This software is licensed under the BSD 3-Clause License - see the LICENSE.txt file for details.
Disclaimer
This library is provided "AS IS", WITHOUT ANY WARRANTY of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose, and noninfringement. In no event shall the authors or copyright holders be liable for any claim, damages, or other liability, whether in an action of contract, tort, or otherwise, arising from, out of, or in connection with the software or the use or other dealings in the software.
Users are strongly advised to thoroughly test and validate the library for their specific use case, especially in production environments or critical applications.
Installation
pip install mdod
Or from source:
git clone https://github.com/mddod/mdod.git
cd mdod
python setup.py install
Usage Example
Please visit the repository for detailed examples:
https://github.com/mddod/mdod
or the documentation site: https://mddod.github.io/
You can also run testmdodmodelv3.py or testmdod_simple_example_en.py to see demonstrations.
Example Performance (Synthetic Dataset: 1000 samples, 2D, 150 outliers, sampling_rate=0.05)
- MDOD AUC: 1.0
- MDOD Runtime: ~0.004–0.08 seconds (significantly faster than LOF)
- Spearman correlation with LOF scores: 0.9191
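For readers who want to compute the same kind of figures on their own data, the two metrics above can be sketched in plain NumPy. This is a minimal sketch, not the script that produced the numbers listed here. The helper names (rankdata_average, roc_auc, spearman) are hypothetical, and it assumes that a higher score means "more outlying".

```python
import numpy as np


def rankdata_average(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    # Average the ranks inside each group of equal values.
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    sums = np.bincount(inverse, weights=ranks)
    return (sums / counts)[inverse]


def roc_auc(y_true, scores):
    """AUC via the Mann-Whitney U statistic (1 = outlier, 0 = normal)."""
    y_true = np.asarray(y_true).astype(bool)
    ranks = rankdata_average(scores)
    n_pos = int(y_true.sum())
    n_neg = len(y_true) - n_pos
    u = ranks[y_true].sum() - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


def spearman(a, b):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    return float(np.corrcoef(rankdata_average(a), rankdata_average(b))[0, 1])
```

To compare two detectors, pass the ground-truth labels and each detector's scores to roc_auc, then pass both score vectors to spearman. If one detector uses the opposite sign convention (for example, negative values for outliers), negate its scores first.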
Parameters
- norm_distance: Distance of the virtual dimension (default: 1.0); affects similarity sensitivity.
- top_n: Number of top similar points to consider (default: 5).
- contamination: Expected proportion of outliers (default: 0.1); used to set the threshold.
- sampling_rate: Sampling ratio (0–1, default: 1.0); lower values speed up computation.
- random_state: Random seed for reproducibility.
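To make the roles of contamination, sampling_rate, and random_state more concrete, here is a small sketch of how such parameters are commonly applied. It is an assumption about typical behaviour, not a description of MDOD's internals. Both helper names are hypothetical, and the thresholding assumes that a higher score means "more outlying".

```python
import numpy as np


def labels_from_contamination(scores, contamination=0.1):
    """Flag the highest-scoring fraction `contamination` of points as outliers (1)."""
    scores = np.asarray(scores, dtype=float)
    # The threshold is the (1 - contamination) quantile of the scores.
    threshold = np.quantile(scores, 1.0 - contamination)
    return (scores > threshold).astype(int), threshold


def sample_reference_rows(n_samples, sampling_rate=1.0, random_state=None):
    """Pick a random subset of row indices to compare every point against."""
    rng = np.random.default_rng(random_state)
    # Always keep at least one row, even for very small sampling rates.
    n_keep = max(1, int(round(n_samples * sampling_rate)))
    return rng.choice(n_samples, size=n_keep, replace=False)
```

With 1000 samples and sampling_rate=0.05, the second helper keeps 50 reference rows, so each point is compared against 50 rows instead of 1000. Fixing random_state makes that subset, and therefore the resulting scores, reproducible.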
Performance Comparison with LOF
On standard test datasets, MDOD consistently achieves perfect or near-perfect detection (AUC ≈ 1.0) while being 2–3x faster than scikit-learn's Local Outlier Factor (LOF).
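Speed comparisons like this depend on hardware, data size, and parameters, so it is worth re-measuring on your own data. The following is a minimal timing sketch using only the standard library. The helper names best_runtime and speedup are hypothetical and not part of MDOD.

```python
import time


def best_runtime(fn, repeats=5):
    """Return the fastest wall-clock time in seconds over `repeats` calls of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def speedup(baseline_fn, candidate_fn, repeats=5):
    """How many times faster candidate_fn is than baseline_fn (>1 means faster)."""
    return best_runtime(baseline_fn, repeats) / best_runtime(candidate_fn, repeats)
```

For example, baseline_fn could be a zero-argument function that fits scikit-learn's LocalOutlierFactor on X, and candidate_fn one that fits an MDOD model on the same X. Taking the minimum over several runs reduces noise from other processes.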
Quick Start
from mdod import MDOD
import numpy as np
# Example data (replace with your own dataset)
X = np.random.randn(1000, 10) # 1000 samples, 10 features
model = MDOD(
    norm_distance=1.0,
    top_n=5,
    contamination=0.1,
    sampling_rate=0.1,  # Use lower values for faster computation on large data
    random_state=42,
)
model.fit(X)
# Predict outliers
labels = model.predict() # 1 = outlier, 0 = normal
scores = model.decision_function(X)  # scores indicating how likely each sample is an outlier
Download files
File details
Details for the file mdod-3.0.5.tar.gz.
File metadata
- Download URL: mdod-3.0.5.tar.gz
- Upload date:
- Size: 6.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9cb59b1e16dd47145b48c43e9585ccaeca0aa05fab50fa08fedaea12649b8d79 |
| MD5 | 10b972538b0ea2ef18c59e7269b18440 |
| BLAKE2b-256 | 82ac738ef2b1299a3a7332b48faf7ac1ee2ad81d27ed415970b94ab838e1440c |
File details
Details for the file mdod-3.0.5-py3-none-any.whl.
File metadata
- Download URL: mdod-3.0.5-py3-none-any.whl
- Upload date:
- Size: 6.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e6cdefdcb5e6aa4693e21a454331034a09a53711e0211f285ef00a1a6a30ffaf |
| MD5 | e6d76f2633990bce9dcfd287b03a8c3c |
| BLAKE2b-256 | 84e1e97336817b9201083aa74c253d97f4817983fe8bbc5521a8e702a6907550 |