
ProRCA - Root Cause Analysis Tool


ProRCA: A Causal Pathway Approach for Complex Operational Environments

Overview

ProRCA is an end-to-end framework for diagnosing anomalies in complex operational environments by uncovering multi-hop causal pathways. Unlike traditional anomaly detection methods that focus on correlations or feature importance (e.g., via SHAP), our approach leverages structural causal modeling to trace the full causal chain, from hidden root causes to observed anomalies.

Inspired by the paper:

Beyond Traditional Problem-Solving: A Causal Pathway Approach for Complex Operational Environments
Ahmed Dawoud & Shravan Talupula, February 9, 2025

This work introduces a methodology that combines conditional anomaly scoring with causal path discovery and ranking. By extending the DoWhy library, the framework provides decision-makers with actionable insights into the true source of complex operational disruptions.
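To make "multi-hop causal pathway" concrete, the sketch below walks a small causal graph backwards from an anomalous node and lists every chain of causes that leads to it. It is an illustration in plain Python, not ProRCA's implementation: the function name `upstream_paths` is hypothetical, and the toy edge list only borrows a few column names from the Usage section.

```python
from collections import defaultdict


def upstream_paths(edges, start_node):
    """List every causal path that ends at start_node.

    Edges are (cause, effect) pairs. Each returned path is ordered root-first,
    e.g. ["A", "B", "C"] for A -> B -> C. Assumes the edges form a DAG.
    """
    parents = defaultdict(list)
    for cause, effect in edges:
        parents[effect].append(cause)

    paths = []

    def walk(node, suffix):
        path = [node] + suffix
        node_parents = parents.get(node, [])
        if not node_parents:
            # No incoming edges: this node is a candidate root cause.
            paths.append(path)
            return
        for parent in node_parents:
            walk(parent, path)

    walk(start_node, [])
    return paths


# Toy graph (illustrative subset, not the full model from the Usage section).
toy_edges = [
    ("SALES", "DISCOUNT"),
    ("SALES", "NET_SALES"),
    ("DISCOUNT", "NET_SALES"),
    ("NET_SALES", "PROFIT_MARGIN"),
]

for path in upstream_paths(toy_edges, "PROFIT_MARGIN"):
    print(" -> ".join(path))
```

Enumerating paths is only the discovery half; deciding which path best explains an anomaly additionally requires a score for each path.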

Features

  • Anomaly Detection:
    Detect anomalies in time series data using ADTK's InterQuartileRangeAD via the AnomalyDetector class.

  • Synthetic Data Generation:
    Generate realistic synthetic transactional data with adjustable parameters using create_synthetic_data.py.

  • Structural Causal Modeling:
    Build a causal graph and fit a Structural Causal Model (SCM) using ScmBuilder in the pathway.py module.

  • Causal Root Cause Analysis:
    Discover and rank multi-hop causal pathways using CausalRootCauseAnalyzer, which combines structural and noise-based anomaly scoring.

  • Visualization:
    Visualize causal pathways with Graphviz diagrams, using gradient backgrounds to indicate path importance via CausalResultsVisualizer.

Project Structure

ProRCA/
├── .gitignore
├── .github/
├── CHANGELOG.md
├── CONTRIBUTING.md
├── LICENSE
├── README.md
├── docs/
│   ├── Examples/
│   │   ├── Example_1/
│   │   └── Example_2/
│   └── research_paper/
├── pyproject.toml
├── src/
│   ├── anomaly/
│   │   ├── __init__.py
│   │   └── adtk.py
│   ├── data_generators/
│   │   ├── __init__.py
│   │   └── synthetic_sales_data.py
│   └── prorca/
│       ├── __init__.py
│       ├── dag_builder.py
│       └── pathway.py

Installation

Clone the repository and install dependencies:

git clone https://github.com/profitopsai/ProRCA.git
cd ProRCA
pip install .

Usage

1. Generate Synthetic Data

from src.create_synthetic_data import generate_fashion_data_with_brand

df = generate_fashion_data_with_brand(start_date="2023-01-01", end_date="2023-12-31")

2. Inject Anomalies

from src.create_synthetic_data import inject_anomalies_by_date

anomaly_schedule = {
    '2023-06-10': ('ExcessiveDiscount', 0.8),
    '2023-06-15': ('COGSOverstatement', 0.4),
    '2023-07-01': ('FulfillmentSpike', 0.5)
}

df_anomalous = inject_anomalies_by_date(df, anomaly_schedule)
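Since the schedule is a plain dictionary, a typo in a date string or a malformed tuple is easy to miss. The helper below is a hypothetical pre-flight check, not part of ProRCA. It assumes each key is an ISO date string and each value is an `(anomaly_type, magnitude)` pair with a positive magnitude. That shape is inferred from the example above, and the meaning of the magnitude is not documented here.

```python
from datetime import date


def validate_schedule(schedule):
    """Return schedule entries sorted by date; raise ValueError on malformed ones.

    Hypothetical helper: the (type, magnitude) shape is an assumption
    inferred from the example schedule, not a documented contract.
    """
    cleaned = []
    for day, entry in schedule.items():
        parsed = date.fromisoformat(day)  # raises ValueError for invalid dates
        if not isinstance(entry, tuple) or len(entry) != 2:
            raise ValueError(f"{day}: expected (anomaly_type, magnitude), got {entry!r}")
        anomaly_type, magnitude = entry
        if not isinstance(anomaly_type, str) or not anomaly_type:
            raise ValueError(f"{day}: anomaly type must be a non-empty string")
        if not isinstance(magnitude, (int, float)) or magnitude <= 0:
            raise ValueError(f"{day}: magnitude must be a positive number")
        cleaned.append((parsed, anomaly_type, float(magnitude)))
    return sorted(cleaned)


example_schedule = {
    "2023-07-01": ("FulfillmentSpike", 0.5),
    "2023-06-10": ("ExcessiveDiscount", 0.8),
}
for day, anomaly_type, magnitude in validate_schedule(example_schedule):
    print(day.isoformat(), anomaly_type, magnitude)
```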

3. Detect Anomalies

from src.anomaly.adtk import AnomalyDetector

detector = AnomalyDetector(df_anomalous, date_col="ORDERDATE", value_col="PROFIT_MARGIN")
anomalies = detector.detect()
anomaly_dates = detector.get_anomaly_dates()

detector.visualize(figsize=(12, 6), ylim=(40, 60))
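For intuition about what an interquartile-range detector flags, the sketch below applies the IQR rule by hand using only the standard library. It is not the `AnomalyDetector` or ADTK code. The function name `iqr_outliers`, the multiplier `c=3.0`, and the sample margins are assumptions chosen for illustration.

```python
from statistics import quantiles


def iqr_outliers(values, c=3.0):
    """Return the indices of values outside [Q1 - c*IQR, Q3 + c*IQR].

    Illustrative only: the multiplier c is an assumption of this sketch.
    Requires at least two values.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    low, high = q1 - c * spread, q3 + c * spread
    return [i for i, v in enumerate(values) if v < low or v > high]


# Made-up daily profit margins (percent) with one obvious dip.
margins = [50.1, 49.8, 50.3, 50.0, 49.9, 42.0, 50.2, 50.1]
print(iqr_outliers(margins))
```

Because the bounds come from quartiles rather than the mean and standard deviation, a single extreme day does not widen the "normal" band much, which is why the rule suits series with occasional sharp spikes.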

4. Build the Structural Causal Model (SCM)

from src.prorca.pathway import ScmBuilder

# Directed edges of the causal graph, written as (cause, effect) pairs.
edges = [
    ("PRICEEACH", "UNIT_COST"), ("PRICEEACH", "SALES"),
    ("UNIT_COST", "COST_OF_GOODS_SOLD"),
    ("QUANTITYORDERED", "SALES"), ("QUANTITYORDERED", "COST_OF_GOODS_SOLD"),
    ("SALES", "DISCOUNT"), ("SALES", "NET_SALES"),
    ("DISCOUNT", "NET_SALES"),
    ("NET_SALES", "FULFILLMENT_COST"), ("NET_SALES", "MARKETING_COST"),
    ("NET_SALES", "RETURN_COST"), ("NET_SALES", "PROFIT"),
    ("FULFILLMENT_COST", "PROFIT"), ("MARKETING_COST", "PROFIT"),
    ("RETURN_COST", "PROFIT"), ("COST_OF_GOODS_SOLD", "PROFIT"),
    ("SHIPPING_REVENUE", "PROFIT"), ("PROFIT", "PROFIT_MARGIN"),
    ("NET_SALES", "PROFIT_MARGIN")
]

nodes = ["PRICEEACH", "UNIT_COST", "SALES", "COST_OF_GOODS_SOLD", "PROFIT_MARGIN"]

builder = ScmBuilder(edges=edges, nodes=nodes)
scm = builder.build(df_anomalous)
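A structural causal model needs an acyclic graph, so it can help to sanity-check a hand-written edge list before fitting. The sketch below uses the standard-library `graphlib` module (Python 3.9+). The helpers `causal_order` and `undeclared_nodes` are hypothetical and independent of `ScmBuilder`. Whether `ScmBuilder` requires every edge endpoint to appear in `nodes` is not stated here, so the second helper only reports the difference.

```python
from graphlib import TopologicalSorter


def causal_order(edges):
    """Return the nodes in a cause-before-effect order.

    Raises graphlib.CycleError (a ValueError subclass) if the edges contain a cycle.
    """
    sorter = TopologicalSorter()
    for cause, effect in edges:
        sorter.add(effect, cause)  # the effect depends on its cause
    return list(sorter.static_order())


def undeclared_nodes(edges, nodes):
    """Return edge endpoints that are missing from the declared node list."""
    referenced = {name for edge in edges for name in edge}
    return sorted(referenced - set(nodes))


# Toy input (illustrative subset of the edge list above).
toy_edges = [("SALES", "NET_SALES"), ("DISCOUNT", "NET_SALES"), ("NET_SALES", "PROFIT")]
toy_nodes = ["SALES", "NET_SALES", "PROFIT"]

print(causal_order(toy_edges))
print(undeclared_nodes(toy_edges, toy_nodes))
```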

5. Perform Causal Root Cause Analysis

from src.prorca.pathway import CausalRootCauseAnalyzer

analyzer = CausalRootCauseAnalyzer(scm, min_score_threshold=0.8)
results = analyzer.analyze(df_anomalous, anomaly_dates, start_node='PROFIT_MARGIN')
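The idea of ranking pathways against a score threshold can be sketched in a few lines. The code below is not ProRCA's scoring: `CausalRootCauseAnalyzer` is described as combining structural and noise-based anomaly scores, and that formula is not reproduced here. As a stand-in, this hypothetical `rank_paths` averages made-up per-node anomaly scores along each path and keeps paths whose mean reaches the threshold.

```python
def rank_paths(paths, node_scores, min_score_threshold=0.8):
    """Rank candidate paths by mean per-node score, highest first.

    Hypothetical aggregation for illustration; paths scoring below the
    threshold are dropped, and nodes without a score count as 0.0.
    """
    ranked = []
    for path in paths:
        if not path:
            continue  # ignore empty paths rather than divide by zero
        score = sum(node_scores.get(node, 0.0) for node in path) / len(path)
        if score >= min_score_threshold:
            ranked.append((score, path))
    ranked.sort(key=lambda item: item[0], reverse=True)
    return ranked


# Made-up scores in [0, 1]; higher means "more anomalous on that date".
scores = {"PROFIT_MARGIN": 0.9, "NET_SALES": 0.9, "DISCOUNT": 0.9, "SALES": 0.3}
candidates = [
    ["SALES", "NET_SALES", "PROFIT_MARGIN"],
    ["DISCOUNT", "NET_SALES", "PROFIT_MARGIN"],
]
for score, path in rank_paths(candidates, scores):
    print(f"{score:.2f}", " -> ".join(path))
```

In this toy case only the path through DISCOUNT survives the threshold, because the unremarkable SALES score pulls the other path's mean down to 0.7.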

6. Visualize Causal Pathways

from src.prorca.pathway import CausalResultsVisualizer

visualizer = CausalResultsVisualizer(analysis_results=results)
visualizer.plot_root_cause_paths()

RCA Pathways
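For readers who want to see how a scored path might map onto Graphviz, here is a minimal sketch that emits DOT text with a fill color that deepens as the score rises. It is not the `CausalResultsVisualizer` implementation. The function `path_to_dot` and the white-to-red color ramp are assumptions, and the output is a plain string, so no Graphviz Python binding is needed.

```python
def path_to_dot(path, score, name="rca_path"):
    """Render one root-first path as Graphviz DOT text.

    The score is clamped to [0, 1]; 0 gives white nodes and 1 gives
    saturated red. The color ramp is an illustrative choice.
    """
    score = max(0.0, min(1.0, score))
    fade = round(255 * (1 - score))
    fill = f"#ff{fade:02x}{fade:02x}"
    lines = [
        f"digraph {name} {{",
        "  rankdir=LR;",
        f'  node [shape=box, style=filled, fillcolor="{fill}"];',
    ]
    for node in path:
        lines.append(f'  "{node}";')
    for cause, effect in zip(path, path[1:]):
        lines.append(f'  "{cause}" -> "{effect}";')
    lines.append("}")
    return "\n".join(lines)


print(path_to_dot(["DISCOUNT", "NET_SALES", "PROFIT_MARGIN"], score=1.0))
```

Saving the printed text to a file such as `path.dot` and running `dot -Tpng path.dot -o path.png` renders it, provided the Graphviz command-line tools are installed.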



