Causal Inference Framework for AWS (causalif)

Overview
Logical Flow
Why Hill Climb and BDeu Score?
Prerequisites
Installation
Usage Examples
Architecture
Limitations
Contributing
License

Overview

Causalif combines Large Language Models (LLMs) with Bayesian causal inference to discover causal relationships and associations from observational data and domain knowledge. Unlike traditional causal discovery algorithms that rely solely on statistical patterns, Causalif leverages:

Background Knowledge: LLM's pre-trained knowledge about causal relationships
Document Knowledge: Domain-specific documents retrieved via RAG
Statistical Evidence: Correlation patterns from observational data
Bayesian Structure Learning: Data-driven causal graph orientation

This hybrid approach enables the discovery of causal relationships and associations even with limited data or when statistical methods alone are insufficient.

Note: Causalif's results are best interpreted by an LLM when the library is used as a tool in agentic systems.

GitHub: awslabs/causalif
PyPI: causalif (reference paper for LACR 1 algorithm: https://arxiv.org/html/2402.15301v2)

Ideal Use Cases

Causalif is particularly powerful when you have both qualitative domain knowledge and quantitative observational data. It is best integrated as a tool in agentic workflows, so that the agent can interpret its results and provide an overall response to the user. The library excels at discovering causal relationships between derived factors by combining:

Qualitative Knowledge: Documents containing formulas, relationships, and domain expertise
Quantitative Data: Noisy observational data that fuels those formulas

Example: Financial Analysis

Scenario: A financial institution wants to understand what drives the behavior of derived financial metrics.

What They Have:

Qualitative Finance Data: Research papers, financial articles, analyst reports, and documents describing:
- Derived formulas (e.g., "ROE = Net Income / Shareholder Equity")
- Market relationships (e.g., "Interest rates affect bond prices inversely")
- Economic theories and domain expertise
Quantitative Data: Historical time-series data with noise:
- Stock prices, trading volumes, interest rates
- Company financials (revenue, earnings, debt ratios)
- Market indicators (VIX, sector indices)

What They Want to Discover:

Which factors causally drive a target metric (e.g., "Factors influencing volatility in Commodities?").
Why a derived factor is low or high around a specific time period.
What is causing a target factor to behave differently, and what is influencing it.

Key Advantages for Use Cases

Handles Noisy Data: Bayesian approach robust to measurement error and missing values
Leverages Domain Knowledge: RAG retrieval incorporates expert knowledge from documents
Discovers Hidden Relationships: Finds causal links not obvious from data alone
Quantifies Effects: Provides effect sizes, not just "yes/no" causality
Validates with Multiple Sources: Voting mechanism across LLM, documents, and data reduces false discoveries

When Causalif is Most Effective

✅ Use Causalif when you have:

Rich document corpus with domain knowledge and formulas
Observational data (even if noisy or limited)
Derived metrics whose dependencies are unclear
A need to understand "what causes what", not just "what correlates"

⚠️ Consider alternatives when:

You have no domain documents (pure data-driven methods may suffice)
You need real-time causal discovery (Causalif requires LLM calls)
Your data has <10 samples (insufficient for Bayesian structure learning)
Relationships are purely experimental (randomized controlled trials are better)

Logical Flow

Causalif implements a three-stage algorithm (two core stages plus an optional causal inference stage) with parallel processing and RAG integration.

Architecture Diagram

[Figure: Library Architecture]

Stage 1: Edge Existence Verification (Causalif 1)

Goal: Determine which pairs of variables are causally related

Process:

Initialize: Start with a complete undirected graph (all possible edges between variables)
Knowledge Base Assembly: For each variable pair (A, B):
- Query LLM's background knowledge
- Retrieve relevant documents via RAG
- Extract statistical evidence from data
Voting Mechanism: Each knowledge base votes on edge existence:
- +1: Variables are associated (edge should exist)
- -1: Variables are independent (edge should be removed)
- 0: Unknown (no vote)
Edge Removal: Remove edges where total vote score ≤ 0
Output: Skeleton graph (undirected graph of causal relationships)
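
The voting and edge-removal steps above can be sketched as follows. This is a minimal illustration rather than the library's implementation: the function name `build_skeleton` and the layout of the `votes` dictionary are assumptions made for the example.

```python
from itertools import combinations

def build_skeleton(variables, votes):
    """Return the undirected edges that survive voting.

    `votes` maps frozenset({a, b}) to a list of votes, one per
    knowledge base: +1 (associated), -1 (independent), 0 (unknown).
    Pairs with no entry are treated as having no votes.
    """
    skeleton = set()
    # Start from the complete undirected graph over all variables.
    for a, b in combinations(variables, 2):
        pair = frozenset((a, b))
        total = sum(votes.get(pair, []))
        # Remove the edge when the total vote score is <= 0.
        if total > 0:
            skeleton.add(pair)
    return skeleton

variables = ["sleep_hours", "stress_level", "productivity"]
votes = {
    frozenset(("sleep_hours", "stress_level")): [+1, +1, 0],
    frozenset(("sleep_hours", "productivity")): [+1, -1, 0],
    frozenset(("stress_level", "productivity")): [+1, 0, +1],
}
skeleton = build_skeleton(variables, votes)
```

Here the sleep_hours/productivity edge is dropped because its votes cancel out to 0, while the other two edges remain in the skeleton.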

Parallel Optimization: Causalif batches LLM queries for multiple variable pairs, executing them in parallel (configurable up to 50 concurrent queries) for significant speedup.
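
One way to picture the batching is a thread pool whose size is capped by the concurrency limit. The sketch below only illustrates that idea: `query_pair` is a hypothetical stand-in for a single LLM call, and `batch_queries` is not a Causalif function.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations

def query_pair(pair):
    """Hypothetical stand-in for one LLM association query."""
    a, b = pair
    return pair, f"association between {a} and {b}: unknown"

def batch_queries(pairs, max_parallel_queries=50):
    # Cap the number of worker threads at the configured limit.
    workers = max(1, min(max_parallel_queries, len(pairs)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so results line up with pairs.
        return dict(pool.map(query_pair, pairs))

pairs = list(combinations(["a", "b", "c", "d"], 2))
answers = batch_queries(pairs, max_parallel_queries=3)
```

Threads suit this workload because each query spends most of its time waiting on a network response.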

Stage 2: Causal Orientation (Causalif 2)

Goal: Determine the direction of causal relationships (A → B or B → A)

Process:

Input: Skeleton graph from Stage 1
Bayesian Structure Learning:
- Use Hill Climbing search with BDeu scoring
- Constrain search to edges in skeleton (prior knowledge)
- Weight edges by LLM confidence from Stage 1
Direction Determination: For each edge in skeleton:
- Compute Bayesian posterior: P(G | Data, Priors) ∝ P(Data | G) × P(G | Priors)
- Select direction that maximizes posterior probability
Output: Directed Acyclic Graph (DAG) representing causal relationships

Degree-Limited Analysis: Optionally focus on relationships within N degrees of separation from a target variable for faster analysis.
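
Degree-limited filtering can be thought of as a breadth-first search outward from the target variable. The following is a sketch under that assumption; `within_degrees` is a hypothetical helper, and the edge list is made up for illustration.

```python
from collections import deque

def within_degrees(edges, target, max_degrees):
    """Map each node within `max_degrees` hops of `target` to its distance.

    Passing max_degrees=None disables the limit.
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    distance = {target: 0}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        if max_degrees is not None and distance[node] >= max_degrees:
            continue  # do not expand past the limit
        for nxt in neighbors.get(node, ()):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    return distance

edges = [("exercise", "sleep"), ("sleep", "stress"), ("stress", "productivity")]
near = within_degrees(edges, "productivity", max_degrees=2)
```

With a limit of 2, `exercise` (three hops from `productivity`) is left out of the analysis.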

Stage 3: Causal Inference (Optional)

Goal: Quantify causal effects and enable interventional queries

Process:

Input: Causal DAG from Stage 2 + Observational data
Fit CPDs: Learn Conditional Probability Distributions using Maximum Likelihood Estimation
Create Bayesian Network: Combine structure (DAG) with parameters (CPDs)
Estimate Effects: Compute Average Treatment Effects (ATE) for each cause
Enable Queries: Support interventional queries P(Y | do(X))
Output: Quantitative causal model with effect sizes
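
To make the interventional query concrete, the sketch below estimates P(Y=1 | do(X=x)) and an Average Treatment Effect by adjusting for a single observed confounder Z. The DAG (Z → X, Z → Y, X → Y), the function name `p_y_do_x`, and the data are assumptions for illustration; the source does not show how Causalif computes these quantities internally.

```python
from collections import Counter

def p_y_do_x(rows, x):
    """P(Y=1 | do(X=x)) by adjusting for the single confounder Z.

    rows are (z, x, y) tuples with binary y; the assumed DAG is
    Z -> X, Z -> Y, X -> Y.
    """
    n = len(rows)
    z_counts = Counter(z for z, _, _ in rows)
    total = 0.0
    for z, n_z in z_counts.items():
        stratum = [y for zz, xx, y in rows if zz == z and xx == x]
        if not stratum:
            raise ValueError(f"no rows with X={x} and Z={z}")
        # Maximum-likelihood estimate of P(Y=1 | X=x, Z=z) times P(Z=z).
        total += (sum(stratum) / len(stratum)) * (n_z / n)
    return total

# Made-up data in which Z influences both X and Y.
rows = (
    [(0, 0, 1)] * 2 + [(0, 0, 0)] * 6 + [(0, 1, 1)] * 1 + [(0, 1, 0)] * 1
    + [(1, 0, 1)] * 1 + [(1, 0, 0)] * 1 + [(1, 1, 1)] * 6 + [(1, 1, 0)] * 2
)
ate = p_y_do_x(rows, 1) - p_y_do_x(rows, 0)

# Naive contrast that ignores Z, for comparison.
naive = (
    sum(y for _, x, y in rows if x == 1) / sum(1 for _, x, _ in rows if x == 1)
    - sum(y for _, x, y in rows if x == 0) / sum(1 for _, x, _ in rows if x == 0)
)
```

On this data the adjusted effect is 0.25, whereas the naive difference in conditional means is 0.4; the gap is the part of the correlation explained by the confounder.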

When to Enable:

Need effect sizes ("how much does X affect Y?")
Want to simulate interventions ("what if we change X?")
Need to identify confounders and adjustment sets
Require quantitative prioritization of causes

Note: This stage is optional and disabled by default. Enable with enable_causal_inference=True parameter.

Why Hill Climb and BDeu Score?

Why Hill Climbing?

Hill Climbing is a local search algorithm that iteratively improves a causal graph structure by:

Starting from an initial graph (skeleton from Stage 1)
Testing local modifications (add/remove/reverse edges)
Accepting changes that improve the score
Stopping at a local optimum
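
The loop described above can be sketched in a few lines. This is a generic hill climb, not Causalif's code: it starts from an empty graph (an assumption made here for simplicity), only considers edges allowed by the skeleton, and takes the scoring function as a parameter. The toy score at the end is purely illustrative; in practice a data-driven score such as BDeu would be plugged in.

```python
def is_acyclic(nodes, edges):
    """Kahn's algorithm: True when the directed graph has no cycle."""
    indegree = {n: 0 for n in nodes}
    for _, child in edges:
        indegree[child] += 1
    ready = [n for n in nodes if indegree[n] == 0]
    seen = 0
    while ready:
        node = ready.pop()
        seen += 1
        for parent, child in edges:
            if parent == node:
                indegree[child] -= 1
                if indegree[child] == 0:
                    ready.append(child)
    return seen == len(nodes)

def neighbors(edges, skeleton):
    """Yield graphs one add/remove/reverse move away, within the skeleton."""
    for a, b in skeleton:
        for u, v in ((a, b), (b, a)):
            if (u, v) in edges:
                yield edges - {(u, v)}                # remove
                yield (edges - {(u, v)}) | {(v, u)}   # reverse
            elif (v, u) not in edges:
                yield edges | {(u, v)}                # add

def hill_climb(nodes, skeleton, score):
    current = frozenset()
    best = score(current)
    while True:
        candidates = [g for g in neighbors(current, skeleton)
                      if is_acyclic(nodes, g)]
        if not candidates:
            return current
        top = max(candidates, key=score)
        if score(top) <= best:
            return current  # local optimum: no move improves the score
        current, best = frozenset(top), score(top)

nodes = ["sleep", "stress", "productivity"]
skeleton = [("sleep", "stress"), ("stress", "productivity")]
wanted = {("sleep", "stress"), ("stress", "productivity")}

def toy_score(graph):
    # Illustrative score: +1 per wanted edge, -1 per other edge.
    return sum(1 if e in wanted else -1 for e in graph)

result = hill_climb(nodes, skeleton, toy_score)
```

Because every candidate is restricted to skeleton edges and checked for cycles, the result is always a DAG that respects the Stage 1 output.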

Advantages for Causalif:

Constraint Compatibility: Easily incorporates prior knowledge (skeleton graph) as hard constraints
Computational Efficiency: Scales to moderate-sized graphs (10-20 variables) with reasonable runtime
Interpretability: Local search steps are traceable and explainable
Flexibility: Supports custom scoring functions (like Prior-Weighted BDeu)

Alternatives Considered:

PC Algorithm: Constraint-based, but doesn't naturally incorporate LLM priors
GES (Greedy Equivalence Search): Similar to Hill Climb but more complex
Exact Search: Computationally prohibitive for >5 variables
MCMC Sampling: More accurate but much slower; overkill for typical use cases

Why BDeu Score?

BDeu (Bayesian Dirichlet equivalent uniform) is a Bayesian scoring function that measures how well a causal graph explains the observed data.

Mathematical Foundation:

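BDeu(G, D) = P(D | G) = ∏ᵢ ∏ⱼ [Γ(αᵢⱼ) / Γ(αᵢⱼ + Nᵢⱼ)] × ∏ₖ [Γ(αᵢⱼₖ + Nᵢⱼₖ) / Γ(αᵢⱼₖ)]

Where:

G: Causal graph structure
D: Observational data
α: Equivalent sample size (prior strength)
αᵢⱼₖ = α / (rᵢ × qᵢ) and αᵢⱼ = α / qᵢ: Prior pseudo-counts, where rᵢ is the number of states of variable i and qᵢ is the number of configurations of its parents
Nᵢⱼₖ: Count of observations in which variable i takes its k-th state while its parents are in their j-th configuration
Nᵢⱼ = Σₖ Nᵢⱼₖ: Count of observations in which the parents of variable i are in their j-th configuration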
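
As a worked illustration, the contribution of a single variable to the BDeu score can be computed in log space with `math.lgamma`. The sketch assumes the standard BDeu pseudo-counts α / (r × q) per cell and α / q per parent configuration; `bdeu_local_score` is a hypothetical helper, not part of Causalif.

```python
from collections import Counter
from math import lgamma

def bdeu_local_score(child_values, parent_configs, r, q, alpha=1.0):
    """Log BDeu contribution of one variable.

    child_values[t]   : state of the variable in sample t (0..r-1)
    parent_configs[t] : index of its parents' joint state in sample t (0..q-1)
    r, q              : number of child states and parent configurations
    alpha             : equivalent sample size
    """
    a_ij = alpha / q          # prior mass per parent configuration
    a_ijk = alpha / (r * q)   # prior mass per (configuration, state) cell
    n_ijk = Counter(zip(parent_configs, child_values))
    n_ij = Counter(parent_configs)
    score = 0.0
    for j in range(q):
        score += lgamma(a_ij) - lgamma(a_ij + n_ij[j])
        for k in range(r):
            score += lgamma(a_ijk + n_ijk[(j, k)]) - lgamma(a_ijk)
    return score

x = [0, 0, 1, 1, 0, 1, 0, 1]
y = list(x)  # y copies x, so x is a perfect predictor of y

with_parent = bdeu_local_score(y, x, r=2, q=2)
without_parent = bdeu_local_score(y, [0] * len(y), r=2, q=1)
```

Since y is fully determined by x in this toy data, the score with x as a parent is higher than the score with no parents.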

Advantages for Causalif:

Bayesian Framework: Naturally combines prior knowledge (LLM) with data evidence
Score Equivalence: Assigns same score to equivalent graph structures (Markov equivalence)
Regularization: Built-in penalty for complex graphs (Occam's razor)
Theoretical Soundness: Proven consistency properties as data grows

Causalif Enhancement - Prior-Weighted BDeu:

Score(G) = BDeu(G | Data) + λ × Prior(G | LLM)

Where:

BDeu(G | Data): Standard BDeu score from data
Prior(G | LLM): LLM confidence scores from Stage 1
λ: Weight parameter balancing data vs. prior

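When both terms are log-probabilities and λ = 1, this additive score is the log of the Bayesian posterior: P(G | Data, LLM) ∝ P(Data | G) × P(G | LLM). Other values of λ strengthen or weaken the influence of the prior relative to the data.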
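
A small numerical sketch of the prior-weighted score follows. It assumes, without confirmation from the source, that the prior term is the sum of log-confidences of the edges in the graph; the function name and all numbers are made up for illustration.

```python
from math import log

def prior_weighted_score(bdeu_log_score, edge_confidences, lam=1.0):
    """Score(G) = BDeu(G | Data) + lam * Prior(G | LLM), in log space."""
    # Assumption: the prior term is the summed log-confidence of G's edges.
    log_prior = sum(log(c) for c in edge_confidences)
    return bdeu_log_score + lam * log_prior

# Two candidate graphs with nearly equal data fit; the LLM is more
# confident in the edges of the first one.
score_a = prior_weighted_score(-120.4, [0.9, 0.8])
score_b = prior_weighted_score(-120.1, [0.3, 0.8])
```

Graph B fits the data slightly better, but graph A wins overall once the prior is added, which is the tie-breaking role the prior is meant to play.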

Alternatives Considered:

BIC (Bayesian Information Criterion): Simpler but less theoretically principled
AIC (Akaike Information Criterion): Doesn't incorporate priors naturally
K2 Score: Similar to BDeu but requires variable ordering
MIT Score: More complex, no clear advantage for this use case

Prerequisites

1. AWS Bedrock Knowledge Base

Causalif requires a RAG knowledge base for document retrieval. Set up an AWS Bedrock Knowledge Base following the official instructions.

Recommended Configuration:

Vector Store: Amazon OpenSearch Serverless or Amazon Aurora
Embedding Model: Amazon Titan Embeddings or Cohere Embed
Document Format: Markdown, PDF, or plain text
Number of Results: 10-20 documents per query

2. Create Retriever Tool

After setting up the knowledge base, create a LangChain retriever tool:

from langchain_aws.retrievers import AmazonKnowledgeBasesRetriever
from langchain.tools.retriever import create_retriever_tool

retriever = AmazonKnowledgeBasesRetriever(
    knowledge_base_id="<your-knowledge-base-id>",
    retrieval_config={
        "vectorSearchConfiguration": {
            "numberOfResults": 20  # Adjust based on your needs
        }
    },
)

retriever_tool = create_retriever_tool(
    retriever,
    "domain_knowledge_retriever",
    "Retrieves domain-specific documents about causal relationships between factors",
)

3. LLM Model

Causalif works with any LangChain-compatible LLM. AWS Bedrock is recommended:

from langchain_aws import ChatBedrock

model = ChatBedrock(
    model_id="anthropic.claude-3-sonnet-20240229-v1:0",
    region_name="us-east-1",
    model_kwargs={
        "temperature": 0.0,  # Deterministic for causal reasoning
        "max_tokens": 4096
    }
)

Supported Models:

Anthropic Claude (recommended)
Amazon Titan
Meta Llama
Cohere Command
Any OpenAI-compatible model

4. Observational Data

Provide a pandas DataFrame with observational data:

import pandas as pd

df = pd.DataFrame({
    'sleep_hours': [7, 6, 8, 5, 7, 9],
    'exercise_minutes': [30, 20, 45, 10, 35, 60],
    'stress_level': [5, 7, 3, 8, 4, 2],
    'productivity': [8, 6, 9, 4, 7, 10]
})

Requirements:

At least 10 samples (100 or more recommended)
Numeric or categorical columns
No missing values (or handle them beforehand)

Installation

pip install causalif

Usage Examples

Basic Usage

import causalif  # makes the causalif.causalif(...) calls below resolvable
from causalif import set_causalif_engine, causalif_tool, visualize_causalif_results
from langchain_aws import ChatBedrock
import pandas as pd

# 1. Prepare your data
df = pd.DataFrame({
    'sleep_hours': [7, 6, 8, 5, 7, 9, 6, 8, 7, 5],
    'exercise_minutes': [30, 20, 45, 10, 35, 60, 25, 50, 40, 15],
    'stress_level': [5, 7, 3, 8, 4, 2, 6, 3, 5, 8],
    'productivity': [8, 6, 9, 4, 7, 10, 6, 9, 8, 5]
})

# 2. Initialize LLM
model = ChatBedrock(
    model_id="anthropic.claude-3-sonnet-20240229-v1:0",
    model_kwargs={"temperature": 0.0}
)

# 3. Configure Causalif engine
# retriever_tool is the LangChain retriever tool created in Prerequisites, step 2
set_causalif_engine(
    model=model,
    retriever_tool=retriever_tool,
    dataframe=df,
    max_degrees=None,  # None = no filtering (show entire graph); set an int (e.g., 2) to filter
    max_parallel_queries=50,  # Configurable; the code is tested with 50
    excluded_target_columns=None,  # List of factors that shouldn't be target columns
    excluded_related_columns=None,  # List of factors that shouldn't be related columns
    related_factors=None,  # Custom related factors, appended to the dataframe columns (mostly derived columns from documents)
    selected_dataframe_columns=None,  # List of dataframe columns to analyze if you don't want the whole dataframe
    enable_causal_estimate=True,  # Causal inference to find direct upstream or downstream effects of the target factor
)

# 4. Run causal analysis (the target must be a factor known to the engine)
result = causalif.causalif("Why is productivity so low?")

# 5. Visualize results
fig = visualize_causalif_results(result)
fig.show()

Query Formats

Causalif supports natural language queries in various formats. The <target_factor> is the column or factor whose dependencies with other variables you want to analyze:

"""
Allowed query formats (where <target_factor> is the variable to analyze):

1. why (is|are) <target_factor> so (low|high|poor|bad|good)
2. what (causes|affects|influences) <target_factor>
3. <target_factor> (is|are) too (low|high)
4. analyze the causes (of|for) <target_factor>
5. dependencies (of|for) <target_factor>
6. factors (affecting|influencing) <target_factor>
"""

# Format 1: Why questions
result = causalif.causalif("Why is stress_level so high?")
result = causalif.causalif("Why are sales so low?")

# Format 2: What causes questions
result = causalif.causalif("What causes low productivity?")
result = causalif.causalif("What affects customer satisfaction?")

# Format 3: Direct statements
result = causalif.causalif("productivity is too low")
result = causalif.causalif("revenue is too high")

# Format 4: Analysis requests
result = causalif.causalif("analyze the causes of high stress_level")
result = causalif.causalif("analyze the causes for poor performance")

# Format 5: Dependency queries
result = causalif.causalif("dependencies of productivity")
result = causalif.causalif("dependencies for stock_price")

# Format 6: Factor influence queries
result = causalif.causalif("factors affecting sleep_hours")
result = causalif.causalif("factors influencing market_volatility")
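
To show how such queries might be mapped to a target factor, here is a sketch that translates the six documented formats into regular expressions. The patterns and the `extract_target` helper are assumptions for illustration; Causalif's actual parsing logic is not shown in this document and may differ.

```python
import re

# Assumed regex translations of the six documented formats.
PATTERNS = [
    r"why (?:is|are) (?P<target>.+?) so (?:low|high|poor|bad|good)",
    r"what (?:causes|affects|influences) (?P<target>.+)",
    r"(?P<target>.+?) (?:is|are) too (?:low|high)",
    r"analyze the causes (?:of|for) (?P<target>.+)",
    r"dependencies (?:of|for) (?P<target>.+)",
    r"factors (?:affecting|influencing) (?P<target>.+)",
]

def extract_target(query):
    """Return the <target_factor> text, or None if no format matches."""
    text = query.strip().rstrip("?.!")
    for pattern in PATTERNS:
        # Match at the start of the query, ignoring case, so trailing
        # context after the matched phrase is tolerated.
        match = re.match(pattern, text, flags=re.IGNORECASE)
        if match:
            return match.group("target").strip()
    return None

target = extract_target("Why is stress_level so high?")
```

Case-insensitive matching is used instead of lowercasing the query so that column names keep their original casing.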

Visualization Features

The interactive visualization includes:

Node Colors: Degree of separation from target factor (red = direct, blue = distant)
Edge Colors: Same color scheme as nodes
Arrows: Direction of causality
Hover Information: Detailed relationship information
Interactive: Zoom, pan, and click for details

fig = visualize_causalif_results(result)

# Customize visualization
fig.update_layout(
    title="Custom Title",
    width=1200,
    height=800
)

# Save to file
fig.write_html("causal_graph.html")
fig.write_image("causal_graph.png")  # Requires kaleido

Architecture

System Integration

[Figure: Library Architecture]

Causalif integrates with agentic LLM applications as a tool:

Agent Layer: LangChain agents or custom orchestrators
Causalif Tool: Exposes causalif_tool for natural language queries
Engine Layer: CausalifEngine implements core algorithms
Knowledge Layer: RAG retriever + LLM background knowledge
Data Layer: Pandas DataFrame with observational data

Component Architecture

causalif/
├── core.py           # Data structures (AssociationResponse, CausalDirection, KnowledgeBase)
├── engine.py         # CausalifEngine (main algorithm implementation)
├── prompts.py        # CausalifPrompts (LLM prompt templates)
├── tools.py          # causalif_tool, set_causalif_engine (LangChain integration)
├── visualization.py  # visualize_causalif_results (Plotly graphs)
└── __init__.py       # Public API exports

Key Classes

CausalifEngine:

causalif_1_edge_existence_verification(): Stage 1 algorithm
causalif_2_orientation(): Stage 2 algorithm
run_complete_causalif(): End-to-end pipeline
batch_association_queries(): Parallel LLM queries
batch_causal_direction_queries(): Parallel direction queries
visualize_graph(): Visualization

KnowledgeBase:

kb_type: "BG" (background), "DOC" (document)
content: Knowledge content
source: Source identifier

Limitations

This method is not ideal for purely quantitative data or for inference driven by feedback loops. It is built to find hybrid associations and causality across qualitative and quantitative data sets.

Data & Computational

Minimum 10 samples required for Bayesian structure learning (100+ recommended)
Scalability: Practical limit of 15-20 variables without degree filtering
Time Complexity: O(n² × k) for n variables and k LLM queries per pair
LLM Costs: 2-5 LLM calls per variable pair

Mitigation: Use max_degrees parameter to focus analysis; increase max_parallel_queries for speed.
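
A quick back-of-the-envelope calculation shows how the number of LLM calls grows with the number of variables when no degree filtering is applied. The helper name `llm_call_range` is hypothetical; the 2-5 calls per pair figure is the one quoted above.

```python
from math import comb

def llm_call_range(n_variables, calls_per_pair=(2, 5)):
    """Lower and upper bound on LLM calls for a full pairwise analysis."""
    pairs = comb(n_variables, 2)  # n * (n - 1) / 2 unordered pairs
    low, high = calls_per_pair
    return pairs * low, pairs * high

print(llm_call_range(20))  # (380, 950)
```

At 20 variables there are 190 pairs, so a full run needs between 380 and 950 LLM calls, which is why the practical limit sits around 15-20 variables.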

LLM & Knowledge

Hallucination: LLM may invent unsupported relationships
Bias: Reflects training data biases
Consistency: Results may vary (use temperature=0 for determinism)
RAG Quality: Results depend on document corpus quality and retrieval accuracy

Mitigation: Validate outputs with domain expertise; use voting across multiple knowledge sources.

Causal Assumptions

Acyclicity: Assumes DAG structure (no feedback loops)
Causal Sufficiency: Assumes no unmeasured confounders
Markov Condition: Assumes conditional independence given parents

Mitigation: Include potential confounders in variable set; validate DAG assumption with domain knowledge.

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

Reporting Issues

Please report bugs and feature requests on GitHub Issues.

License

This project is licensed under the Apache-2.0 License. See LICENSE for details.

Version History

v0.1.9.1: Removed LLM-based causal directions and introduced Bayesian causal direction with Hill Climb search, plus immediate upstream and downstream effects. Builds a hybrid graph with associations and causal directions.
v0.1.6: Removed directed graph dependencies, added example notebook.
v0.1.5: README updates.
v0.1.4: Base version with complete Causalif algorithm.

Support

Documentation: GitHub README
Issues: GitHub Issues
Email: bossubhr@amazon.co.uk

Acknowledgments

Built with:

LangChain - LLM orchestration
NetworkX - Graph algorithms
Plotly - Interactive visualization
AWS Bedrock - LLM and RAG infrastructure
