ProRCA - Root Cause Analysis Tool
ProRCA: A Causal Pathway Approach for Complex Operational Environments
Overview
ProRCA is an end-to-end framework for diagnosing anomalies in complex operational environments by uncovering multi-hop causal pathways. Unlike traditional anomaly detection methods that focus on correlations or feature importance (e.g., via SHAP), our approach leverages structural causal modeling to trace the full causal chain, from hidden root causes to observed anomalies.
Inspired by the paper:
Beyond Traditional Problem-Solving: A Causal Pathway Approach for Complex Operational Environments
Ahmed Dawoud & Shravan Talupula, February 9, 2025
This work introduces a methodology that combines conditional anomaly scoring with causal path discovery and ranking. By extending the DoWhy library, the framework provides decision-makers with actionable insights into the true source of complex operational disruptions.
Features
- Anomaly Detection: Detect anomalies in time series data using ADTK's InterQuartileRangeAD via the AnomalyDetector class.
- Synthetic Data Generation: Generate realistic synthetic transactional data with adjustable parameters using create_synthetic_data.py.
- Structural Causal Modeling: Build a causal graph and fit a Structural Causal Model (SCM) using ScmBuilder in the pathway.py module.
- Causal Root Cause Analysis: Discover and rank multi-hop causal pathways using CausalRootCauseAnalyzer, which combines structural and noise-based anomaly scoring.
- Visualization: Visualize causal pathways with Graphviz diagrams, using gradient backgrounds to indicate path importance via CausalResultsVisualizer.
Project Structure
ProRCA/
├── .gitignore
├── .github/
├── CHANGELOG.md
├── CONTRIBUTING.md
├── LICENSE
├── README.md
├── docs/
│   ├── Examples/
│   │   ├── Example_1/
│   │   └── Example_2/
│   └── research_paper/
├── pyproject.toml
└── src/
    ├── anomaly/
    │   ├── __init__.py
    │   └── adtk.py
    ├── data_generators/
    │   ├── __init__.py
    │   └── synthetic_sales_data.py
    └── prorca/
        ├── __init__.py
        ├── dag_builder.py
        └── pathway.py
Installation
Clone the repository and install dependencies:
git clone https://github.com/profitopsai/ProRCA.git
cd ProRCA
pip install .
Usage
1. Generate Synthetic Data
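# Note: the project structure above lists src/data_generators/synthetic_sales_data.py;
# adjust this import path if your checkout does not contain src/create_synthetic_data.py.
from src.create_synthetic_data import generate_fashion_data_with_brand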
df = generate_fashion_data_with_brand(start_date="2023-01-01", end_date="2023-12-31")
2. Inject Anomalies
from src.create_synthetic_data import inject_anomalies_by_date
anomaly_schedule = {
    '2023-06-10': ('ExcessiveDiscount', 0.8),
    '2023-06-15': ('COGSOverstatement', 0.4),
    '2023-07-01': ('FulfillmentSpike', 0.5)
}
df_anomalous = inject_anomalies_by_date(df, anomaly_schedule)
3. Detect Anomalies
from src.anomaly.adtk import AnomalyDetector
detector = AnomalyDetector(df_anomalous, date_col="ORDERDATE", value_col="PROFIT_MARGIN")
anomalies = detector.detect()
anomaly_dates = detector.get_anomaly_dates()
detector.visualize(figsize=(12, 6), ylim=(40, 60))
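For readers unfamiliar with interquartile-range detection, the rule behind this step can be sketched in a few lines of standard-library Python. This is an illustration of the general IQR technique, not the code inside AnomalyDetector or ADTK; the function name iqr_anomalies, the multiplier default c=1.5, and the sample margins are assumptions.

```python
from statistics import quantiles


def iqr_anomalies(values, c=1.5):
    """Return indices of values outside [Q1 - c*IQR, Q3 + c*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - c * iqr, q3 + c * iqr
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical daily profit margins (percent); the last day collapses.
margins = [50, 51, 49, 50, 52, 48, 50, 51, 49, 20]
flagged = iqr_anomalies(margins)
print(flagged)  # [9]
```

A larger c widens the accepted band and flags fewer points, so it is the main knob to tune when the detector is too noisy or too quiet.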
4. Build the Structural Causal Model (SCM)
from src.prorca.pathway import ScmBuilder
edges = [
    ("PRICEEACH", "UNIT_COST"), ("PRICEEACH", "SALES"),
    ("UNIT_COST", "COST_OF_GOODS_SOLD"),
    ("QUANTITYORDERED", "SALES"), ("QUANTITYORDERED", "COST_OF_GOODS_SOLD"),
    ("SALES", "DISCOUNT"), ("SALES", "NET_SALES"),
    ("DISCOUNT", "NET_SALES"),
    ("NET_SALES", "FULFILLMENT_COST"), ("NET_SALES", "MARKETING_COST"),
    ("NET_SALES", "RETURN_COST"), ("NET_SALES", "PROFIT"),
    ("FULFILLMENT_COST", "PROFIT"), ("MARKETING_COST", "PROFIT"),
    ("RETURN_COST", "PROFIT"), ("COST_OF_GOODS_SOLD", "PROFIT"),
    ("SHIPPING_REVENUE", "PROFIT"), ("PROFIT", "PROFIT_MARGIN"),
    ("NET_SALES", "PROFIT_MARGIN")
]
nodes = ["PRICEEACH", "UNIT_COST", "SALES", "COST_OF_GOODS_SOLD", "PROFIT_MARGIN"]
builder = ScmBuilder(edges=edges, nodes=nodes)
scm = builder.build(df_anomalous)
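A structural causal model requires the edge list to form a directed acyclic graph, so it can be worth checking a hand-written edge list before fitting. The sketch below does this with the standard-library graphlib module (Python 3.9+). It is independent of ScmBuilder, whose own validation behaviour is not described here; the helper topological_order and the shortened edge list are illustrative assumptions.

```python
from graphlib import CycleError, TopologicalSorter


def topological_order(edges):
    """Return nodes in cause-before-effect order, or raise ValueError on a cycle."""
    predecessors = {}
    for cause, effect in edges:
        predecessors.setdefault(cause, set())
        predecessors.setdefault(effect, set()).add(cause)
    try:
        return list(TopologicalSorter(predecessors).static_order())
    except CycleError as exc:
        # The detected cycle is the second element of the exception's args.
        raise ValueError(f"edge list contains a cycle: {exc.args[1]}") from exc


# A shortened, hypothetical subset of the edges used in this step.
sample_edges = [
    ("SALES", "DISCOUNT"), ("SALES", "NET_SALES"),
    ("DISCOUNT", "NET_SALES"), ("NET_SALES", "PROFIT"),
]
order = topological_order(sample_edges)
print(order)
```

If the call raises, the reported cycle points at the edges to reconsider before building the model.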
5. Perform Causal Root Cause Analysis
from src.prorca.pathway import CausalRootCauseAnalyzer
analyzer = CausalRootCauseAnalyzer(scm, min_score_threshold=0.8)
results = analyzer.analyze(df_anomalous, anomaly_dates, start_node='PROFIT_MARGIN')
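The notion of discovering and ranking multi-hop pathways can be illustrated with a small, self-contained sketch. This is not how CausalRootCauseAnalyzer works internally; it is a simplified assumption in which every upstream path from the anomalous node to a root is enumerated and then ordered by the mean of hypothetical per-node anomaly scores. The functions upstream_paths and rank_paths, the reduced graph, and the scores are all invented for the example, and the graph is assumed to be acyclic.

```python
def upstream_paths(edges, start_node):
    """Enumerate every path from start_node back to a root (a node with no parents)."""
    parents = {}
    for cause, effect in edges:
        parents.setdefault(effect, []).append(cause)

    paths = []

    def walk(node, path):
        if node not in parents:  # reached a root: the path is complete
            paths.append(path)
            return
        for parent in parents[node]:
            walk(parent, path + [parent])

    walk(start_node, [start_node])
    return paths


def rank_paths(paths, node_scores):
    """Order paths by mean node score, highest first."""
    def mean_score(path):
        return sum(node_scores.get(node, 0.0) for node in path) / len(path)

    return sorted(paths, key=mean_score, reverse=True)


# Hypothetical reduced graph and per-node anomaly scores in [0, 1].
sample_edges = [
    ("SALES", "NET_SALES"), ("DISCOUNT", "NET_SALES"),
    ("NET_SALES", "PROFIT"), ("COST_OF_GOODS_SOLD", "PROFIT"),
    ("PROFIT", "PROFIT_MARGIN"),
]
node_scores = {
    "PROFIT_MARGIN": 0.9, "PROFIT": 0.9, "NET_SALES": 0.8,
    "DISCOUNT": 0.95, "SALES": 0.1, "COST_OF_GOODS_SOLD": 0.2,
}

paths = upstream_paths(sample_edges, "PROFIT_MARGIN")
ranked = rank_paths(paths, node_scores)
print(ranked[0])
```

In this toy case the top-ranked path ends at DISCOUNT, because every node along it carries a high score, while the paths through SALES and COST_OF_GOODS_SOLD are pulled down by their low-scoring roots.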
6. Visualize Causal Pathways
from src.prorca.pathway import CausalResultsVisualizer
visualizer = CausalResultsVisualizer(analysis_results=results)
visualizer.plot_root_cause_paths()
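As a rough illustration of how ranked paths can be turned into a Graphviz diagram, the sketch below emits DOT source as a plain string, with each node's fill saturation scaled by the importance of the best path it belongs to. It does not reproduce the output of CausalResultsVisualizer: the function paths_to_dot, the flat fill (instead of a gradient), and the example paths are assumptions. Paths are written root cause first so that arrows follow the causal direction.

```python
def paths_to_dot(ranked_paths):
    """Render (path, importance) pairs as Graphviz DOT source.

    importance is expected in [0, 1]; higher values get a more saturated red fill.
    """
    node_importance = {}
    edges = set()
    for path, importance in ranked_paths:
        for node in path:
            node_importance[node] = max(node_importance.get(node, 0.0), importance)
        edges.update(zip(path, path[1:]))

    lines = ["digraph root_cause_paths {", "  rankdir=LR;", "  node [style=filled];"]
    for node, importance in sorted(node_importance.items()):
        # HSV colour: hue 0 (red), saturation scaled by importance, full brightness.
        lines.append(f'  "{node}" [fillcolor="0.000 {importance:.3f} 1.000"];')
    for src, dst in sorted(edges):
        lines.append(f'  "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical ranked paths with importance scores.
example_paths = [
    (["DISCOUNT", "NET_SALES", "PROFIT", "PROFIT_MARGIN"], 0.9),
    (["COST_OF_GOODS_SOLD", "PROFIT", "PROFIT_MARGIN"], 0.4),
]
dot_source = paths_to_dot(example_paths)
print(dot_source)
```

The resulting text can be saved to a .dot file and rendered with the Graphviz command-line tools, for example dot -Tpng paths.dot -o paths.png.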
Download files
File details
Details for the file profitops_rca-0.1.tar.gz.
File metadata
- Download URL: profitops_rca-0.1.tar.gz
- Upload date:
- Size: 20.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c8b8b909c44883f91b12c5354ed3cdee4ac89a51eb4cbc02e51821c2bfbe374c |
| MD5 | 02499734dc8906ae224479d81838b3bb |
| BLAKE2b-256 | d03d0a01454820fc6120afde7d54d7bb385c22b616505c9bdca27ea8e876c33e |
File details
Details for the file profitops_rca-0.1-py3-none-any.whl.
File metadata
- Download URL: profitops_rca-0.1-py3-none-any.whl
- Upload date:
- Size: 18.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7e942b596a7183075471da38e8de03ff8b561ca4eb95dc5945c7437f9c7a0492 |
| MD5 | 159a209cb50baf617346e43694acbcfa |
| BLAKE2b-256 | 178260d7ed820ae6d6bd714ac8c1b9b89861b7c4e216db78062b3c6d14250a4d |