ProRCA - Root Cause Analysis Tool

ProRca: A Causal Pathway Approach for Complex Operational Environments

Overview

ProRca is an end-to-end framework for diagnosing anomalies in complex operational environments by uncovering multi-hop causal pathways. Unlike traditional anomaly detection methods that focus on correlations or feature importance (e.g., via SHAP), it uses structural causal modeling to trace the full causal chain, from hidden root causes to observed anomalies.

Inspired by the paper:

Beyond Traditional Problem-Solving: A Causal Pathway Approach for Complex Operational Environments
Ahmed Dawoud & Shravan Talupula, February 9, 2025

This work introduces a methodology that combines conditional anomaly scoring with causal path discovery and ranking. By extending the DoWhy library, the framework provides decision-makers with actionable insights into the true source of complex operational disruptions.
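To make the idea of conditional anomaly scoring concrete, here is a minimal standard-library sketch. It is not the DoWhy-based scoring the framework uses: it assumes a single parent, a linear mechanism, and a plain z-score on the residual, and the function names and sample numbers are hypothetical. The point it illustrates is that a value is scored against what its cause predicts, not against its own history.

```python
from statistics import mean, pstdev

def fit_mechanism(parent, child):
    """Least-squares fit of child = slope * parent + intercept + noise.

    Returns (slope, intercept, sigma), where sigma is the spread of the
    residuals (the "noise") on the reference data.
    """
    px, cy = mean(parent), mean(child)
    sxx = sum((p - px) ** 2 for p in parent)
    sxy = sum((p - px) * (c - cy) for p, c in zip(parent, child))
    slope = sxy / sxx
    intercept = cy - slope * px
    residuals = [c - (slope * p + intercept) for p, c in zip(parent, child)]
    return slope, intercept, pstdev(residuals)

def noise_score(mechanism, parent_value, child_value):
    """Absolute z-score of the child's residual given its parent's value."""
    slope, intercept, sigma = mechanism
    residual = child_value - (slope * parent_value + intercept)
    return abs(residual) / sigma  # assumes sigma > 0, i.e. noisy reference data

# Hypothetical reference data: NET_SALES is roughly 0.9 * SALES.
sales = [100, 120, 140, 160, 180, 200]
net_sales = [90.5, 107.5, 126.5, 143.5, 162.5, 179.5]
mechanism = fit_mechanism(sales, net_sales)

# Unusually high sales, but net sales follow as expected: low score.
explained = noise_score(mechanism, 400, 359)
# Ordinary sales, but net sales far below what they predict: high score.
unexplained = noise_score(mechanism, 150, 100)
```

In this reading, a node with a high score is a candidate origin of a disruption, while a node with an extreme value but a low score is merely passing on an anomaly from upstream.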

Features

  • Anomaly Detection:
    Detect anomalies in time series data using ADTK’s InterQuartileRangeAD via the AnomalyDetector class.

  • Synthetic Data Generation:
    Generate realistic synthetic transactional data with adjustable parameters using create_synthetic_data.py.

  • Structural Causal Modeling:
    Build a causal graph and fit a Structural Causal Model (SCM) using ScmBuilder in the pathway.py module.

  • Causal Root Cause Analysis:
    Discover and rank multi-hop causal pathways using CausalRootCauseAnalyzer, which combines structural and noise-based anomaly scoring.

  • Visualization:
    Visualize causal pathways with Graphviz diagrams, using gradient backgrounds to indicate path importance via CausalResultsVisualizer.

  • End-to-End Example:
    A complete example of the workflow is provided in the Jupyter Notebook notebooks/test.ipynb.

Project Structure

ProRca/
├── src/                  # Source code directory
│   ├── __init__.py
│   ├── anomaly.py
│   ├── create_synthetic_data.py
│   └── pathway.py
├── notebooks/            # Jupyter notebooks for demonstrations
│   └── test.ipynb
├── README.md             # Documentation
├── .gitignore            # Git ignore file
└── requirements.txt      # Dependencies list

Installation

Clone the repository and install dependencies:

git clone https://github.com/yourusername/causal-rca.git
cd causal-rca
pip install -r requirements.txt

Usage

1. Generate Synthetic Data

from src.create_synthetic_data import generate_fashion_data_with_brand

df = generate_fashion_data_with_brand(start_date="2023-01-01", end_date="2023-12-31")

2. Inject Anomalies

from src.create_synthetic_data import inject_anomalies_by_date

anomaly_schedule = {
    '2023-06-10': ('ExcessiveDiscount', 0.8),
    '2023-06-15': ('COGSOverstatement', 0.4),
    '2023-07-01': ('FulfillmentSpike', 0.5)
}

df_anomalous = inject_anomalies_by_date(df, anomaly_schedule)

3. Detect Anomalies

from src.anomaly import AnomalyDetector

detector = AnomalyDetector(df_anomalous, date_col="ORDERDATE", value_col="PROFIT_MARGIN")
anomalies = detector.detect()
anomaly_dates = detector.get_anomaly_dates()

detector.visualize(figsize=(12, 6), ylim=(40, 60))
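For readers unfamiliar with the interquartile-range rule behind this step, the following standard-library sketch shows the idea. It is an illustration only, not the AnomalyDetector or ADTK implementation; the function names, the factor `c=1.5`, and the sample values are assumptions.

```python
from statistics import quantiles

def iqr_bounds(values, c=1.5):
    """Return (lower, upper) limits of the IQR rule: Q1 - c*IQR and Q3 + c*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - c * iqr, q3 + c * iqr

def iqr_anomalies(series, c=1.5):
    """series is a list of (date_string, value) pairs.

    Returns the dates whose value falls outside the IQR limits.
    """
    lower, upper = iqr_bounds([value for _, value in series], c)
    return [date for date, value in series if value < lower or value > upper]

# Hypothetical daily profit-margin values (illustrative, not package output).
daily_margin = [
    ("2023-06-07", 50.1), ("2023-06-08", 49.8), ("2023-06-09", 50.4),
    ("2023-06-10", 41.0), ("2023-06-11", 50.0), ("2023-06-12", 49.9),
    ("2023-06-13", 50.2), ("2023-06-14", 50.3),
]
flagged = iqr_anomalies(daily_margin)  # only 2023-06-10 falls outside the limits
```

A larger `c` widens the limits and flags fewer dates, which is the main tuning choice with this rule.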

4. Build the Structural Causal Model (SCM)

from src.pathway import ScmBuilder

edges = [
    ("PRICEEACH", "UNIT_COST"), ("PRICEEACH", "SALES"),
    ("UNIT_COST", "COST_OF_GOODS_SOLD"),
    ("QUANTITYORDERED", "SALES"), ("QUANTITYORDERED", "COST_OF_GOODS_SOLD"),
    ("SALES", "DISCOUNT"), ("SALES", "NET_SALES"),
    ("DISCOUNT", "NET_SALES"),
    ("NET_SALES", "FULFILLMENT_COST"), ("NET_SALES", "MARKETING_COST"),
    ("NET_SALES", "RETURN_COST"), ("NET_SALES", "PROFIT"),
    ("FULFILLMENT_COST", "PROFIT"), ("MARKETING_COST", "PROFIT"),
    ("RETURN_COST", "PROFIT"), ("COST_OF_GOODS_SOLD", "PROFIT"),
    ("SHIPPING_REVENUE", "PROFIT"), ("PROFIT", "PROFIT_MARGIN"),
    ("NET_SALES", "PROFIT_MARGIN")
]

nodes = ["PRICEEACH", "UNIT_COST", "SALES", "COST_OF_GOODS_SOLD", "PROFIT_MARGIN"]

builder = ScmBuilder(edges=edges, nodes=nodes)
scm = builder.build(df_anomalous)
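A structural causal model needs an acyclic graph, so it can be worth checking the edge list before building. The sketch below uses only the standard library; `check_causal_graph` is a hypothetical helper, not part of ScmBuilder. It also reports nodes that appear in `edges` but not in `nodes`. In the example above, `nodes` names only five of the variables used in `edges`, and how ScmBuilder treats the rest is not documented here.

```python
from graphlib import TopologicalSorter, CycleError

def check_causal_graph(edges, declared_nodes):
    """Validate a list of (cause, effect) edges.

    Returns (topological order, nodes used in edges but not declared).
    Raises ValueError if the edges contain a cycle.
    """
    predecessors = {}
    for cause, effect in edges:
        predecessors.setdefault(effect, set()).add(cause)
        predecessors.setdefault(cause, set())
    try:
        order = list(TopologicalSorter(predecessors).static_order())
    except CycleError as err:
        raise ValueError(f"edge list contains a cycle: {err.args[1]}") from err
    undeclared = sorted(set(predecessors) - set(declared_nodes))
    return order, undeclared

# A subset of the edges from the example above.
sample_edges = [
    ("SALES", "DISCOUNT"), ("SALES", "NET_SALES"),
    ("DISCOUNT", "NET_SALES"),
    ("NET_SALES", "PROFIT"), ("PROFIT", "PROFIT_MARGIN"),
    ("NET_SALES", "PROFIT_MARGIN"),
]
order, undeclared = check_causal_graph(sample_edges, ["SALES", "PROFIT_MARGIN"])
```

The topological order lists every cause before its effects, so the target node (`PROFIT_MARGIN` here) comes last.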

5. Perform Causal Root Cause Analysis

from src.pathway import CausalRootCauseAnalyzer

analyzer = CausalRootCauseAnalyzer(scm, min_score_threshold=0.8)
results = analyzer.analyze(df_anomalous, anomaly_dates, start_node='PROFIT_MARGIN')
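The sketch below illustrates what discovering and ranking multi-hop pathways can look like. It is a simplified stand-in, not the algorithm in CausalRootCauseAnalyzer: it assumes each node already has an anomaly score, follows only parents whose score reaches a threshold, and ranks paths by mean score. The `scores` values, the `min_score` parameter (loosely modeled on `min_score_threshold`), and the function name are hypothetical.

```python
def upstream_paths(edges, scores, start_node, min_score=0.8):
    """Enumerate causal paths that end at start_node.

    A path is extended only through parents whose score is >= min_score and
    stops when no such parent remains. Returns (path, mean score) pairs,
    best first, each path ordered from root cause to start_node.
    """
    parents = {}
    for cause, effect in edges:
        parents.setdefault(effect, []).append(cause)

    found = []

    def walk(node, path):
        candidates = [p for p in parents.get(node, [])
                      if scores.get(p, 0.0) >= min_score and p not in path]
        if not candidates:
            if len(path) > 1:  # skip the trivial path holding only start_node
                path_mean = sum(scores.get(n, 0.0) for n in path) / len(path)
                found.append((list(reversed(path)), path_mean))
            return
        for parent in candidates:
            walk(parent, path + [parent])

    walk(start_node, [start_node])
    return sorted(found, key=lambda item: item[1], reverse=True)

# A subset of the example graph with made-up anomaly scores.
sample_edges = [
    ("SALES", "DISCOUNT"), ("SALES", "NET_SALES"), ("DISCOUNT", "NET_SALES"),
    ("NET_SALES", "PROFIT"), ("NET_SALES", "PROFIT_MARGIN"),
    ("PROFIT", "PROFIT_MARGIN"), ("FULFILLMENT_COST", "PROFIT"),
]
scores = {"PROFIT_MARGIN": 0.95, "PROFIT": 0.90, "NET_SALES": 0.92,
          "DISCOUNT": 0.97, "SALES": 0.10, "FULFILLMENT_COST": 0.20}
ranked = upstream_paths(sample_edges, scores, "PROFIT_MARGIN")
```

With these made-up scores, both surviving paths start at `DISCOUNT`, and the shorter one through `NET_SALES` ranks first because its mean score is higher.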

6. Visualize Causal Pathways

from src.pathway import CausalResultsVisualizer

visualizer = CausalResultsVisualizer(analysis_results=results)
visualizer.plot_root_cause_paths()

RCA Pathways
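As a rough picture of how ranked paths can become a Graphviz diagram, the sketch below emits DOT text with a fill color whose intensity follows the path score. It is a simplified illustration rather than the CausalResultsVisualizer implementation: it uses flat fills instead of gradients, and the function name and input format are assumptions.

```python
def paths_to_dot(ranked_paths):
    """Render (path, score) pairs as Graphviz DOT text, one chain per path.

    Each node is filled with a red whose saturation equals the path score
    clamped to [0, 1], written as a Graphviz "H S V" color string.
    """
    lines = ["digraph rca {", "  rankdir=LR;", "  node [shape=box, style=filled];"]
    for index, (path, score) in enumerate(ranked_paths):
        saturation = max(0.0, min(1.0, score))
        node_ids = []
        for node in path:
            node_id = f"p{index}_{node}"  # prefix keeps each path's nodes separate
            node_ids.append(node_id)
            lines.append(
                f'  {node_id} [label="{node}", '
                f'fillcolor="0.000 {saturation:.3f} 1.000"];'
            )
        for src, dst in zip(node_ids, node_ids[1:]):
            lines.append(f"  {src} -> {dst};")
    lines.append("}")
    return "\n".join(lines)

# Hypothetical ranked path, ordered from root cause to the anomalous metric.
dot_text = paths_to_dot([(["DISCOUNT", "NET_SALES", "PROFIT_MARGIN"], 0.9)])
```

The resulting text can be saved to a file and rendered with the Graphviz command-line tool, for example `dot -Tpng paths.dot -o paths.png`.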

7. Run the End-to-End Example

Open notebooks/test.ipynb in Jupyter to see the complete workflow.
