ProRCA - Root Cause Analysis Tool
ProRca: A Causal Pathway Approach for Complex Operational Environments
Overview
ProRca is an end-to-end framework for diagnosing anomalies in complex operational environments by uncovering multi-hop causal pathways. Unlike traditional anomaly detection methods that focus on correlations or feature importance (e.g., via SHAP), our approach leverages structural causal modeling to trace the full causal chain—from hidden root causes to observed anomalies.
Inspired by the paper:
Beyond Traditional Problem-Solving: A Causal Pathway Approach for Complex Operational Environments
Ahmed Dawoud & Shravan Talupula, February 9, 2025
This work introduces a methodology that combines conditional anomaly scoring with causal path discovery and ranking. By extending the DoWhy library, the framework provides decision-makers with actionable insights into the true source of complex operational disruptions.
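To make "conditional anomaly scoring" more concrete, here is a minimal sketch of the general idea. It is not the ProRca or DoWhy implementation. It fits a one-parent linear model on reference data and scores a new observation by how far its residual falls from what the parent value predicts. The helper names `fit_linear` and `conditional_score` and the toy numbers are assumptions made for illustration.

```python
from statistics import mean, pstdev

def fit_linear(xs, ys):
    """Least-squares fit y = a + b*x; returns (a, b, residual_std)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    return a, b, pstdev(residuals)

def conditional_score(model, x, y):
    """Absolute z-score of the residual of y given its parent value x."""
    a, b, sigma = model
    return abs(y - (a + b * x)) / sigma

# Toy reference data (illustrative): child is roughly 2 * parent plus noise.
parent = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
child = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
model = fit_linear(parent, child)

explained = conditional_score(model, 6.0, 12.0)  # high value, but the parent is high too
shifted = conditional_score(model, 3.5, 12.0)    # same value, NOT explained by the parent
```

In this toy setup the same child value of 12.0 scores low when its parent accounts for it and high when it does not, which is the property that lets a conditional score separate a node that merely inherits an anomaly from one that originates it.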
Features
- Anomaly Detection: Detect anomalies in time series data using ADTK's InterQuartileRangeAD via the AnomalyDetector class.
- Synthetic Data Generation: Generate realistic synthetic transactional data with adjustable parameters using create_synthetic_data.py.
- Structural Causal Modeling: Build a causal graph and fit a Structural Causal Model (SCM) using ScmBuilder in the pathway.py module.
- Causal Root Cause Analysis: Discover and rank multi-hop causal pathways using CausalRootCauseAnalyzer, which combines structural and noise-based anomaly scoring.
- Visualization: Visualize causal pathways with Graphviz diagrams via CausalResultsVisualizer, using gradient backgrounds to indicate path importance.
- End-to-End Example: A complete example of the workflow is provided in the Jupyter Notebook notebooks/test.ipynb.
Project Structure
ProRca/
├── src/                  # Source code directory
│   ├── __init__.py
│   ├── anomaly.py
│   ├── create_synthetic_data.py
│   └── pathway.py
├── notebooks/            # Jupyter notebooks for demonstrations
│   └── test.ipynb
├── README.md             # Documentation
├── .gitignore            # Git ignore file
└── requirements.txt      # Dependencies list
Installation
Clone the repository and install dependencies:
git clone https://github.com/yourusername/causal-rca.git
cd causal-rca
pip install -r requirements.txt
Usage
1. Generate Synthetic Data
from src.create_synthetic_data import generate_fashion_data_with_brand
# Generate synthetic transactional data for calendar year 2023
df = generate_fashion_data_with_brand(start_date="2023-01-01", end_date="2023-12-31")
2. Inject Anomalies
from src.create_synthetic_data import inject_anomalies_by_date
# Map each date to an (anomaly type, numeric setting) tuple
anomaly_schedule = {
    '2023-06-10': ('ExcessiveDiscount', 0.8),
    '2023-06-15': ('COGSOverstatement', 0.4),
    '2023-07-01': ('FulfillmentSpike', 0.5)
}
df_anomalous = inject_anomalies_by_date(df, anomaly_schedule)
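As a rough illustration of what a date-keyed injection can look like, the sketch below perturbs one column on scheduled dates and recomputes a downstream column so the disturbance propagates. It is not the code in create_synthetic_data.py: the function `inject_excessive_discount`, the reading of the numeric value as "fraction of SALES given away", and the row layout are all assumptions.

```python
from datetime import date

def inject_excessive_discount(rows, schedule):
    """Return a copy of rows with DISCOUNT raised on scheduled dates.

    schedule maps an ISO date string to a severity in [0, 1], read here
    as the fraction of SALES given away as discount (an assumption).
    """
    out = []
    for row in rows:
        row = dict(row)  # copy so the input rows stay untouched
        severity = schedule.get(row["ORDERDATE"].isoformat())
        if severity is not None:
            row["DISCOUNT"] = severity * row["SALES"]
            # Recompute the downstream column so the anomaly propagates.
            row["NET_SALES"] = row["SALES"] - row["DISCOUNT"]
        out.append(row)
    return out

# Two illustrative rows; only the second falls on a scheduled date.
rows = [
    {"ORDERDATE": date(2023, 6, 9), "SALES": 100.0, "DISCOUNT": 10.0, "NET_SALES": 90.0},
    {"ORDERDATE": date(2023, 6, 10), "SALES": 100.0, "DISCOUNT": 10.0, "NET_SALES": 90.0},
]
injected = inject_excessive_discount(rows, {"2023-06-10": 0.8})
```

Recomputing the dependent column matters for the later steps: a root cause analysis can only trace a pathway if the injected change actually flows through the downstream variables.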
3. Detect Anomalies
from src.anomaly import AnomalyDetector
# Run detection on the PROFIT_MARGIN series, indexed by ORDERDATE
detector = AnomalyDetector(df_anomalous, date_col="ORDERDATE", value_col="PROFIT_MARGIN")
anomalies = detector.detect()
anomaly_dates = detector.get_anomaly_dates()
# Plot the detection results
detector.visualize(figsize=(12, 6), ylim=(40, 60))
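The detector in step 3 relies on an interquartile range rule. As a minimal, library-free sketch of that rule (not ADTK's code and not the AnomalyDetector class), the function below flags values outside [Q1 - c*IQR, Q3 + c*IQR]. The function name `iqr_anomalies`, the factor c=3.0, and the sample margins are assumptions for illustration.

```python
from statistics import quantiles

def iqr_anomalies(values, c=3.0):
    """Return indices of values outside [Q1 - c*IQR, Q3 + c*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - c * iqr, q3 + c * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]

# Illustrative profit margins (percent) with one obvious dip at index 6.
margins = [50.1, 49.8, 50.3, 50.0, 49.9, 50.2, 41.0, 50.1, 49.7, 50.0]
flagged = iqr_anomalies(margins)
```

Because the bounds come from quartiles rather than the mean and standard deviation, a single extreme value barely moves them, which is why this rule suits series with occasional sharp spikes or dips.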
4. Build the Structural Causal Model (SCM)
from src.pathway import ScmBuilder
# Directed edges of the causal graph as (cause, effect) pairs
edges = [
    ("PRICEEACH", "UNIT_COST"), ("PRICEEACH", "SALES"),
    ("UNIT_COST", "COST_OF_GOODS_SOLD"),
    ("QUANTITYORDERED", "SALES"), ("QUANTITYORDERED", "COST_OF_GOODS_SOLD"),
    ("SALES", "DISCOUNT"), ("SALES", "NET_SALES"),
    ("DISCOUNT", "NET_SALES"),
    ("NET_SALES", "FULFILLMENT_COST"), ("NET_SALES", "MARKETING_COST"),
    ("NET_SALES", "RETURN_COST"), ("NET_SALES", "PROFIT"),
    ("FULFILLMENT_COST", "PROFIT"), ("MARKETING_COST", "PROFIT"),
    ("RETURN_COST", "PROFIT"), ("COST_OF_GOODS_SOLD", "PROFIT"),
    ("SHIPPING_REVENUE", "PROFIT"), ("PROFIT", "PROFIT_MARGIN"),
    ("NET_SALES", "PROFIT_MARGIN")
]
nodes = ["PRICEEACH", "UNIT_COST", "SALES", "COST_OF_GOODS_SOLD", "PROFIT_MARGIN"]
# Build the causal graph and fit the SCM on the data
builder = ScmBuilder(edges=edges, nodes=nodes)
scm = builder.build(df_anomalous)
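A structural causal model needs a directed acyclic graph, so it can help to sanity-check an edge list before fitting. The sketch below is an optional pre-check using only the standard library (Python 3.9+). It is not part of ScmBuilder; the helpers `check_dag` and `is_acyclic` are hypothetical, and the edge list is a subset of the one in step 4.

```python
from graphlib import TopologicalSorter, CycleError

def check_dag(edges):
    """Return (topological order, root nodes); raises CycleError on a cycle."""
    parents = {}
    for cause, effect in edges:
        parents.setdefault(effect, set()).add(cause)
        parents.setdefault(cause, set())  # make sure pure causes are registered
    # static_order() yields every node after all of its predecessors.
    order = list(TopologicalSorter(parents).static_order())
    roots = sorted(n for n, p in parents.items() if not p)
    return order, roots

def is_acyclic(edges):
    """True if the (cause, effect) edge list forms a DAG."""
    try:
        check_dag(edges)
        return True
    except CycleError:
        return False

edges = [
    ("PRICEEACH", "SALES"), ("QUANTITYORDERED", "SALES"),
    ("SALES", "DISCOUNT"), ("SALES", "NET_SALES"), ("DISCOUNT", "NET_SALES"),
]
order, roots = check_dag(edges)
```

The root nodes returned here are the variables with no parents in the graph, which makes them the furthest upstream points a causal pathway can reach.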
5. Perform Causal Root Cause Analysis
from src.pathway import CausalRootCauseAnalyzer
analyzer = CausalRootCauseAnalyzer(scm, min_score_threshold=0.8)
# Trace causal pathways starting from the anomalous metric
results = analyzer.analyze(df_anomalous, anomaly_dates, start_node='PROFIT_MARGIN')
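The general shape of multi-hop path discovery and ranking can be sketched as follows. This is an illustration under stated assumptions, not CausalRootCauseAnalyzer's algorithm: it walks upstream from the start node through parents whose anomaly score meets a threshold, then ranks the resulting paths by mean score. The helpers `find_paths` and `rank_paths`, the score values, and the way the threshold is applied are all hypothetical.

```python
def find_paths(parents, scores, start, threshold):
    """Enumerate upstream paths from start through nodes scoring >= threshold."""
    paths = []

    def walk(node, path):
        hot = [p for p in parents.get(node, []) if scores.get(p, 0.0) >= threshold]
        if not hot:
            # No anomalous parent left: the path ends at a candidate root cause.
            paths.append(path)
            return
        for p in hot:
            walk(p, path + [p])

    walk(start, [start])
    return paths

def rank_paths(paths, scores):
    """Sort paths by mean node score, highest first."""
    return sorted(paths, key=lambda p: sum(scores[n] for n in p) / len(p), reverse=True)

# Parent lists for a small slice of the graph (child -> its direct causes).
parents = {
    "PROFIT_MARGIN": ["PROFIT", "NET_SALES"],
    "PROFIT": ["NET_SALES", "FULFILLMENT_COST"],
    "NET_SALES": ["SALES", "DISCOUNT"],
}
# Made-up per-node anomaly scores in [0, 1] for one anomalous date.
scores = {"PROFIT_MARGIN": 0.95, "PROFIT": 0.9, "NET_SALES": 0.85,
          "DISCOUNT": 0.99, "SALES": 0.1, "FULFILLMENT_COST": 0.2}

ranked = rank_paths(find_paths(parents, scores, "PROFIT_MARGIN", 0.8), scores)
```

With these toy scores both surviving paths end at DISCOUNT, and the shorter path through NET_SALES ranks first. Mean score is only one possible ranking rule; it was chosen here because it does not automatically favor longer paths.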
6. Visualize Causal Pathways
from src.pathway import CausalResultsVisualizer
visualizer = CausalResultsVisualizer(analysis_results=results)
visualizer.plot_root_cause_paths()
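For readers who want to see what a Graphviz rendering of a pathway involves, here is a minimal sketch that builds DOT text by hand and shades nodes by an importance value. It is not CausalResultsVisualizer's output: the function `path_to_dot` is hypothetical, and it uses a single solid fill per path rather than a gradient background.

```python
def path_to_dot(path, importance):
    """Render a root-cause path as Graphviz DOT text, shaded by importance in [0, 1]."""
    # Higher importance -> lower green/blue channels -> deeper red fill.
    level = int(round(255 * (1.0 - importance)))
    fill = f"#ff{level:02x}{level:02x}"
    lines = [
        "digraph root_cause_path {",
        "  rankdir=LR;",
        f'  node [shape=box, style=filled, fillcolor="{fill}"];',
    ]
    # One edge per consecutive (cause, effect) pair along the path.
    for cause, effect in zip(path, path[1:]):
        lines.append(f'  "{cause}" -> "{effect}";')
    lines.append("}")
    return "\n".join(lines)

dot = path_to_dot(["DISCOUNT", "NET_SALES", "PROFIT_MARGIN"], importance=0.9)
```

The resulting string can be saved to a file and rendered with the Graphviz command-line tool, for example `dot -Tpng path.dot -o path.png`.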
7. Run the End-to-End Example
Open notebooks/test.ipynb in Jupyter to see the complete workflow.
File details
Details for the file prorca-0.1.0.tar.gz.
File metadata
- Download URL: prorca-0.1.0.tar.gz
- Upload date:
- Size: 16.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e452ae9645863cdbb419b87acb3e8ced054ac3d2d124913eb21b09e380876359 |
| MD5 | 85c164149b424d1585ed69c8d0a83274 |
| BLAKE2b-256 | 60352fa58f5d0cab44a727530f6a7896b88fd9ac5c01f9ac7761facf0e6ae0e5 |
File details
Details for the file prorca-0.1.0-py3-none-any.whl.
File metadata
- Download URL: prorca-0.1.0-py3-none-any.whl
- Upload date:
- Size: 15.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2ad3bcde393482dab6613d380a11c11e448e982d60997982ada74f5859a57fbd |
| MD5 | 792f6d411cf1b2efd5155d3fb1221258 |
| BLAKE2b-256 | b62fc64fbf0c69596ff017aa6a12a44bf94408b3f518eceaa787526e289dd0da |