A library for automated causal inference

CAIS - Causal AI Scientist

Causal AI Scientist (CAIS) is an LLM-powered tool for generating data-driven answers to natural language causal queries. It takes three inputs: a natural language query (for example, "Does participating in a job training program lead to higher income?"), an accompanying dataset, and a description of that dataset. CAIS then frames a suitable causal estimation problem by selecting appropriate treatment and outcome variables. It selects a suitable method for causal effect estimation, implements it, runs diagnostic tests, and finally interprets the numerical results in the context of the original query.

🚀 Quick Start

Installation

pip install causal-agent

Basic Usage

from causal_agent import run_causal_analysis

# Run causal analysis with a simple question
result = run_causal_analysis(
    query="What is the effect of education on income?",
    dataset_path="your_data.csv",
    dataset_description="Dataset containing education and income data"
)

print(f"Causal effect: {result['results']['results']['effect_estimate']}")
print(f"Method used: {result['results']['results']['method_used']}")
print(f"Explanation: {result['explanation']}")

Command Line Interface

# Single analysis
causal_agent run dataset.csv "What is the effect of treatment on outcome?"

# Batch analysis
causal_agent batch metadata.csv data_folder/ results.json

🔧 Setup

1. Configure LLM Provider

Set your API key for your preferred LLM provider:

import os

# OpenAI (default)
os.environ["OPENAI_API_KEY"] = "your-api-key"

# Or use Anthropic (uncomment both lines)
# os.environ["LLM_PROVIDER"] = "anthropic"
# os.environ["ANTHROPIC_API_KEY"] = "your-api-key"

# Or use Google Gemini (uncomment both lines)
# os.environ["LLM_PROVIDER"] = "gemini"
# os.environ["GOOGLE_API_KEY"] = "your-api-key"
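
Before starting a long analysis, it can help to confirm that the key for the selected provider is actually set. The following is a minimal sketch, not part of CAIS: the helper `missing_api_key` is hypothetical, and the provider label "openai" is an assumption (the text above only says OpenAI is the default). The environment variable names are the ones shown above.

```python
import os

# Variable names come from the setup snippet above.
# The label "openai" is an assumption; only "anthropic" and "gemini" appear in the README.
KEY_FOR_PROVIDER = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GOOGLE_API_KEY",
}

def missing_api_key(env=None):
    """Return the name of the unset API-key variable, or None if it is set."""
    env = os.environ if env is None else env
    provider = env.get("LLM_PROVIDER", "openai").lower()
    key_name = KEY_FOR_PROVIDER.get(provider)
    if key_name is None:
        raise ValueError(f"Unrecognized LLM_PROVIDER: {provider!r}")
    return None if env.get(key_name) else key_name

missing = missing_api_key()
if missing:
    print(f"Set {missing} before running an analysis")
```

Passing a plain dictionary instead of `os.environ` makes the check easy to try without touching the real environment.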

2. Prepare Your Data

- CSV format with clear column names
- Include relevant variables for causal analysis
- Ensure sufficient sample size (typically >100 observations)
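
The checklist above can be turned into a quick pre-flight check. This is a rough sketch using only the standard library; `check_dataset` is a hypothetical helper, not a CAIS function, and the threshold of 100 rows simply mirrors the guideline above.

```python
import csv
import io

def check_dataset(csv_file, min_rows=100):
    """Return a list of warnings for an open CSV file; an empty list means no issues found."""
    reader = csv.DictReader(csv_file)
    columns = reader.fieldnames or []
    warnings = []
    # Unnamed columns make it hard to identify treatment and outcome variables.
    if any(not c or not c.strip() for c in columns):
        warnings.append("some columns have no name")
    rows = list(reader)
    if len(rows) <= min_rows:
        warnings.append(f"only {len(rows)} rows (guideline: more than {min_rows})")
    # Count empty cells per column so missing values can be handled before analysis.
    for col in columns:
        n_missing = sum(1 for r in rows if not (r.get(col) or "").strip())
        if n_missing:
            warnings.append(f"column {col!r} has {n_missing} missing values")
    return warnings

# Tiny made-up sample, used only to show the call.
sample = io.StringIO("education,income\n12,30000\n16,\n")
for warning in check_dataset(sample):
    print(warning)
```

For a file on disk, open it with `open(path, newline="")` and pass the handle to `check_dataset`.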

📊 What CAIS Does

1. Parses your natural language causal question
2. Analyzes your dataset structure and variables
3. Selects the most appropriate causal inference method:
   - Randomized Controlled Trials (RCT)
   - Difference-in-Differences (DiD)
   - Instrumental Variables (IV)
   - Regression Discontinuity Design (RDD)
   - Propensity Score Matching/Weighting
   - Linear Regression with controls
   - And more...
4. Executes the analysis with proper diagnostics
5. Interprets results in the context of your original question

🎯 Example Use Cases

Education Research

result = run_causal_analysis(
    query="Does smaller class size improve student test scores?",
    dataset_path="education_data.csv",
    dataset_description="Student data with class sizes and test scores"
)

Healthcare

result = run_causal_analysis(
    query="What is the effect of the new treatment on patient recovery time?",
    dataset_path="clinical_trial_data.csv",
    dataset_description="Randomized trial data comparing treatments"
)

Economics

result = run_causal_analysis(
    query="How does minimum wage increase affect employment?",
    dataset_path="employment_data.csv",
    dataset_description="Employment data before and after policy change"
)

📈 Advanced Features

Batch Processing

Process multiple datasets at once:

import pandas as pd

# Create metadata file
metadata = pd.DataFrame({
    'natural_language_query': [
        'Effect of education on income',
        'Impact of training on employment'
    ],
    'data_files': ['education.csv', 'training.csv'],
    'data_description': ['Education dataset', 'Training program data']
})

# Save metadata to CSV file first
metadata.to_csv('metadata.csv', index=False)

# Run batch analysis using CLI
# causal_agent batch metadata.csv ./data/ results.json
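
A batch run is easier to debug if the metadata file is checked first. The sketch below is one possible check, assuming that entries in `data_files` are resolved relative to the data folder passed to the CLI (the README does not state this explicitly). `find_batch_problems` is a hypothetical helper; the column names are the ones used in the metadata example above.

```python
import csv
from pathlib import Path

# Column names follow the metadata example above.
REQUIRED_COLUMNS = ("natural_language_query", "data_files", "data_description")

def find_batch_problems(metadata_path, data_folder):
    """Return a list of human-readable problems found in a batch metadata CSV."""
    problems = []
    with open(metadata_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
        if missing:
            return [f"missing column: {c}" for c in missing]
        for line_no, row in enumerate(reader, start=2):  # line 1 is the header
            # Assumption: data files live directly inside data_folder.
            data_file = Path(data_folder) / (row["data_files"] or "")
            if not data_file.is_file():
                problems.append(f"line {line_no}: {data_file} not found")
            if not (row["natural_language_query"] or "").strip():
                problems.append(f"line {line_no}: empty query")
    return problems
```

Calling `find_batch_problems("metadata.csv", "./data/")` before the batch command reports missing files and empty queries without spending any LLM calls.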

Custom LLM Configuration

import os

# Use different models
os.environ["LLM_MODEL"] = "gpt-4o-mini"  # Faster, cheaper
# os.environ["LLM_MODEL"] = "gpt-4"      # More accurate
# os.environ["LLM_MODEL"] = "claude-3-haiku-20240307"  # Anthropic

🔍 Understanding Results

CAIS returns structured results including:

- Effect Estimate: The causal effect size
- Standard Error: Uncertainty in the estimate
- Confidence Interval: Range of plausible values
- Method Used: Which causal inference technique was applied
- Variables Identified: Treatment, outcome, and control variables
- Explanation: Plain-language interpretation of results

from causal_agent import run_causal_analysis

result = run_causal_analysis(
    query="What is the effect of education on income?",
    dataset_path="your_data.csv",
    dataset_description="Dataset containing education and income data"
)

# Access key results
effect = result['results']['results']['effect_estimate']
method = result['results']['results']['method_used']
variables = result['results']['variables']
explanation = result['explanation']

print(f"Using {method}, we found that {variables['treatment_variable']} "
      f"has an effect of {effect} on {variables['outcome_variable']}")
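
To see how an effect estimate, its standard error, and a confidence interval relate, here is a small worked sketch under a normal approximation. It is illustrative only: CAIS may compute intervals differently depending on the method it selects, `normal_confidence_interval` is a hypothetical helper, and the input numbers are made up.

```python
from statistics import NormalDist

def normal_confidence_interval(effect, standard_error, level=0.95):
    """Return (lower, upper) for a normal-approximation confidence interval."""
    if standard_error < 0:
        raise ValueError("standard_error must be non-negative")
    # Two-sided critical value; about 1.96 when level is 0.95.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return effect - z * standard_error, effect + z * standard_error

# Made-up numbers, not output from CAIS.
low, high = normal_confidence_interval(effect=2.0, standard_error=0.5)
print(f"95% CI: [{low:.2f}, {high:.2f}]")  # prints "95% CI: [1.02, 2.98]"
```

An interval that excludes zero, as in this example, is the usual informal signal that the estimated effect is distinguishable from no effect at the chosen level.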

🛠️ Supported Methods

CAIS automatically selects from:

- Experimental Methods: RCT analysis
- Quasi-Experimental: DiD, RDD, IV
- Observational: Propensity scoring, backdoor adjustment
- Machine Learning: Causal forests, double ML (coming soon)
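
As a concrete illustration of one method in this list, the simplest two-group, two-period difference-in-differences estimate is the change in the treated group minus the change in the control group. The sketch below shows only that textbook arithmetic, which is valid under the parallel-trends assumption. It is not how CAIS implements DiD, the function name is hypothetical, and the numbers are made up.

```python
from statistics import mean

def difference_in_differences(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period DiD estimate from four lists of outcomes."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    # The control group's change stands in for what the treated group
    # would have experienced without treatment (parallel trends).
    return treated_change - control_change

# Made-up illustrative outcomes, not real data.
estimate = difference_in_differences(
    treated_pre=[10, 12], treated_post=[15, 17],
    control_pre=[9, 11], control_post=[11, 13],
)
print(estimate)
```

Here the treated group rises by 5 on average and the control group by 2, so the estimated effect is 3.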

📚 Best Practices

Writing Good Causal Questions

✅ Good: "What is the causal effect of education on income?"
✅ Good: "Does job training increase employment rates?"
❌ Avoid: "Are education and income related?" (correlation, not causation)

Dataset Requirements

- Clear variable names
- Sufficient sample size
- Relevant control variables
- Clean data (handle missing values)

Providing Context

Include dataset descriptions with:

- Variable definitions
- Data collection method
- Time period covered
- Known confounders
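
One way to make sure a description covers all four items is to assemble it from structured pieces. This is a small sketch: `build_dataset_description` is a hypothetical helper, and the layout of the resulting text is an arbitrary choice rather than a format CAIS requires. The example values are invented.

```python
def build_dataset_description(variables, collection_method, time_period, confounders):
    """Assemble a plain-text dataset description from the four items listed above."""
    lines = ["Variables:"]
    lines += [f"- {name}: {meaning}" for name, meaning in variables.items()]
    lines.append(f"Data collection: {collection_method}")
    lines.append(f"Time period: {time_period}")
    lines.append("Known confounders: " + (", ".join(confounders) or "none listed"))
    return "\n".join(lines)

# Invented example values, used only to show the call.
description = build_dataset_description(
    variables={"education": "years of schooling", "income": "annual income"},
    collection_method="household survey",
    time_period="not specified",
    confounders=["age", "region"],
)
print(description)
```

The resulting string can be passed as the `dataset_description` argument of `run_causal_analysis`.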

🔄 Migration from Previous Versions

If you're upgrading from the old cais package, see our Migration Guide for step-by-step instructions.

Quick update:

pip uninstall cais
pip install causal-agent

Then update your imports:

# Old
from cais import run_causal_analysis

# New
from causal_agent import run_causal_analysis

🤝 Support

- Documentation: GitHub README
- Migration Guide: MIGRATION.md
- Issues: GitHub Issues
- Examples: Check the test examples

📄 License

MIT License - see LICENSE for details.

Citation

If you use CAIS in your research, please cite:

@software{causal_agent2025,
  title={CAIS: Causal AI Scientist for Automated Causal Inference},
  author={Verma, Vishal and Acharya, Sawal and Simko, Samuel and Bhardwaj, Devansh and Haghighat, Anahita and Jin, Zhijing},
  year={2025},
  url={https://github.com/causalNLP/causal-agent}
}

Get started with causal inference in minutes, not hours! 🎉

Release history

- 0.1.2 (this version): Oct 7, 2025
- 0.1.1: Aug 18, 2025

Download files

Source Distribution

causal_agent-0.1.2.tar.gz (294.3 kB)

Uploaded Oct 7, 2025 Source

Built Distribution

causal_agent-0.1.2-py3-none-any.whl (341.1 kB)

Uploaded Oct 7, 2025 Python 3

File details

Details for the file causal_agent-0.1.2.tar.gz.

File metadata

Download URL: causal_agent-0.1.2.tar.gz
Upload date: Oct 7, 2025
Size: 294.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.18

File hashes

Hashes for causal_agent-0.1.2.tar.gz
Algorithm	Hash digest
SHA256	`69e098b6799c5517c51198c03257afe96802d7d910d26434406788b591240896`
MD5	`2f6096c9f97e1389a84770a007aeb87f`
BLAKE2b-256	`fa93195945b3e5b975a5fe44ce2aca0707f09432c64d4857442c65536d722977`

File details

Details for the file causal_agent-0.1.2-py3-none-any.whl.

File metadata

Download URL: causal_agent-0.1.2-py3-none-any.whl
Upload date: Oct 7, 2025
Size: 341.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.18

File hashes

Hashes for causal_agent-0.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`4df126fa48571ca28e6444d48a011b4b2a6ebe00508552e2a041fb462a21d7bd`
MD5	`a0d58a0236298b18116a25b820502587`
BLAKE2b-256	`35daf628128b826c66e52d42a5036ced933cab4f591004411b13dea99bbc6ec3`
