Automated Individual Treatment Effect Estimation via residual-based latent environment discovery

AutoITE: Automated Individual Treatment Effect Estimation

A residual-based approach to causal inference that detects latent heterogeneity through baseline coupling, enabling Just-in-Time discovery of treatment effects.

Key Insight

Traditional causal inference methods condition on observed features, but latent confounders create hidden subgroups with dramatically different treatment responses. AutoITE exploits baseline coupling (the fact that latent confounders affect not just treatment response but also baseline outcomes) to discover these hidden subgroups through residual analysis.
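
To make baseline coupling concrete, here is a minimal simulation sketch. It is not AutoITE's code: the data-generating process, the coefficient values, and the helper names (pearson, ols_residuals) are illustrative assumptions. A binary latent variable U shifts the pre-treatment outcome, and the residual left after regressing that outcome on the observed feature ends up tracking U even though U is never observed.

```python
import random

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / (va * vb) ** 0.5

def ols_residuals(x, y):
    """Residuals of a one-feature least-squares fit y ~ a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return [yi - (my + slope * (xi - mx)) for xi, yi in zip(x, y)]

rng = random.Random(0)
n = 2000
x = [rng.gauss(0, 1) for _ in range(n)]       # observed feature
u = [rng.choice([0, 1]) for _ in range(n)]    # latent subgroup, never observed

# Assumed DGP: U shifts the baseline outcome by 2.0 (the baseline coupling).
y_pre = [1.0 * xi + 2.0 * ui + rng.gauss(0, 0.5) for xi, ui in zip(x, u)]

resid = ols_residuals(x, y_pre)
corr_resid_u = pearson(resid, u)
print(f"corr(baseline residual, U) = {corr_resid_u:.2f}")
```

In this toy setting the residual is a strong proxy for the hidden subgroup, which is the signal the residual-matching step relies on.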

Installation

pip install -r requirements.txt

Quick Start

from autoite import AutoITEEstimator, BimodalityDiagnostic

# Fit the model
model = AutoITEEstimator(k=1000, alpha=1.0)
model.fit(X_train, T_train, Y_train, Y_pre_train)

# Predict individual treatment effects
tau_pred = model.predict(X_test, Y_pre_test)

# Check for hidden subgroups
diag = BimodalityDiagnostic()
diag.fit(X_train, Y_pre_train)
result = diag.quantify_unknown(X_test, Y_pre_test)
print(f"Bimodality Score: {result['bimodality_score']:.4f}")
print(f"Interpretation: {result['interpretation']}")

Architecture

  1. Global Ridge: Baseline model predicting pre-treatment outcomes from features
  2. Residual Computation: Leave-one-out residuals encode latent causal state
  3. Residual Matching: k-NN in residual space finds individuals with similar latent states
  4. Local Ridge: Treatment effects estimated from residual neighbors
  5. Triage: High-uncertainty cases flagged for expert review
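
The five steps above can be sketched end to end with NumPy. This is a simplified illustration under stated assumptions, not the package's implementation: the function names (ridge_fit, loo_residuals, local_effects), the synthetic data, the local model Y ~ 1 + T, and the use of local residual spread as the uncertainty score are all choices made for this example.

```python
import numpy as np

def ridge_fit(A, y, alpha):
    """Closed-form ridge; column 0 of A is an unpenalized intercept."""
    P = alpha * np.eye(A.shape[1])
    P[0, 0] = 0.0
    return np.linalg.solve(A.T @ A + P, A.T @ y), P

def loo_residuals(X, y_pre, alpha=1.0):
    """Steps 1-2: global ridge on the baseline outcome, then leave-one-out
    residuals via the ridge shortcut e_i / (1 - h_ii)."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, P = ridge_fit(A, y_pre, alpha)
    G = np.linalg.solve(A.T @ A + P, A.T)     # shape (p, n)
    h = np.einsum("ij,ji->i", A, G)           # diagonal of the hat matrix
    return (y_pre - A @ beta) / (1.0 - h)

def local_effects(resid, T, Y, k, alpha=1.0):
    """Steps 3-4: for each unit, take its k nearest units in residual space
    (the unit itself included) and fit a local ridge Y ~ 1 + T.
    The coefficient on T is used as the effect estimate."""
    n = len(resid)
    tau, spread = np.empty(n), np.empty(n)
    for i in range(n):
        idx = np.argsort(np.abs(resid - resid[i]))[:k]
        A = np.column_stack([np.ones(k), T[idx]])
        beta, _ = ridge_fit(A, Y[idx], alpha)
        tau[i] = beta[1]
        spread[i] = np.std(Y[idx] - A @ beta)  # assumed uncertainty proxy
    return tau, spread

# Synthetic data (assumed DGP): latent U shifts both baseline and effect.
rng = np.random.default_rng(0)
n = 1200
X = rng.normal(size=(n, 2))
U = rng.integers(0, 2, size=n)
Y_pre = X @ np.array([1.0, -0.5]) + 3.0 * U + rng.normal(0, 0.3, n)
T = rng.integers(0, 2, size=n).astype(float)
Y = Y_pre + (1.0 + 2.0 * U) * T + rng.normal(0, 0.3, n)

resid = loo_residuals(X, Y_pre, alpha=1.0)
tau_hat, spread = local_effects(resid, T, Y, k=150)

# Step 5: flag the 15% of units with the largest uncertainty score.
n_flag = int(round(0.15 * n))
flagged = np.argsort(spread)[-n_flag:]

print(f"corr(tau_hat, U) = {np.corrcoef(tau_hat, U)[0, 1]:.2f}")
print(f"flagged for review: {len(flagged)} of {n}")
```

The leave-one-out shortcut avoids refitting the global model n times; it holds for ridge with a fixed penalty, which is why the sketch keeps alpha constant across folds.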

Key Results

From the accompanying paper:

Method          Corr(τ̂, U)   Detection Rate   MAE     Median
Causal Forest    0.00         27.3%            0.230   0.042
X-Learner        0.00         27.1%            0.245   0.045
AutoITE         -0.94         97.5%            0.095   0.034

AutoITE achieves a 59% lower MAE than Causal Forest (0.095 vs 0.230). With 15% triage, MAE falls to 0.042, only 18% of Causal Forest's error, and deaths drop from 8 to 5.
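
The two percentages follow directly from the MAE values quoted above. The short check below only redoes that arithmetic; the variable names are chosen for this example.

```python
mae_causal_forest = 0.230
mae_autoite = 0.095
mae_autoite_triaged = 0.042   # AutoITE with 15% triage

reduction = 1 - mae_autoite / mae_causal_forest
remaining = mae_autoite_triaged / mae_causal_forest

print(f"MAE reduction vs Causal Forest: {reduction:.0%}")    # 59%
print(f"Triaged MAE as share of Causal Forest: {remaining:.0%}")  # 18%
```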

Components

AutoITEEstimator

Core estimator for individual treatment effect prediction.

  • k: Number of residual neighbors (default: 1000; a fraction such as 0.10 is also accepted)
  • alpha: Ridge regularization strength
  • triage_percentile: Fraction of high-uncertainty cases to flag
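
The description of k says it can be a count or a fraction but does not say how a fraction is resolved. One plausible reading, shown here as a hypothetical helper (resolve_k is not part of the documented API), treats a float strictly between 0 and 1 as a share of the available samples and caps an integer count at the sample size.

```python
def resolve_k(k, n_samples):
    """Hypothetical helper: turn k (count or fraction) into a neighbor count.

    Assumption: a float strictly between 0 and 1 means a fraction of
    n_samples; anything else is treated as an absolute count.
    """
    if isinstance(k, float) and 0.0 < k < 1.0:
        k = int(round(k * n_samples))
    # Never ask for more neighbors than there are samples, or fewer than one.
    return max(1, min(int(k), n_samples))

print(resolve_k(0.10, 5000))   # 500
print(resolve_k(1000, 5000))   # 1000
print(resolve_k(1000, 300))    # 300
```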

BimodalityDiagnostic

Detects hidden subgroups via GMM-based residual analysis.

  • Bimodality score < 0.01: No hidden structure
  • Bimodality score 0.01-0.05: Weak structure
  • Bimodality score 0.05-0.10: Moderate structure
  • Bimodality score > 0.10: Strong hidden structure (likely latent confounder)
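
The thresholds above can be written as a small lookup. This is a sketch of the interpretation rule only; how the score itself is computed from the GMM fit is not described here, and the handling of the exact boundary values (0.01, 0.05, 0.10) is an assumption, as is the function name interpret_bimodality.

```python
def interpret_bimodality(score: float) -> str:
    """Map a bimodality score to the interpretation bands listed above.

    Assumption: lower bounds are inclusive, and exactly 0.10 still counts
    as moderate, since "strong" is listed as strictly greater than 0.10.
    """
    if score < 0.01:
        return "No hidden structure"
    if score < 0.05:
        return "Weak structure"
    if score <= 0.10:
        return "Moderate structure"
    return "Strong hidden structure (likely latent confounder)"

for s in (0.004, 0.03, 0.07, 0.25):
    print(f"{s:.3f} -> {interpret_bimodality(s)}")
```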

UnexplainedHeterogeneityIndex

Measures whether local models improve over global, indicating heterogeneity not captured by observed features.
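
The exact formula for this index is not given here. One simple way to express "local improves over global" is the relative reduction in mean squared error, sketched below with made-up numbers; the function name relative_improvement and the formula are assumptions for illustration, not the package's definition.

```python
def mse(y_true, y_pred):
    """Mean squared error of two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

def relative_improvement(y_true, pred_global, pred_local):
    """Assumed index: 1 - MSE_local / MSE_global.

    Near 0 means local models add nothing; values toward 1 mean local
    models explain variation that the global model misses.
    """
    return 1.0 - mse(y_true, pred_local) / mse(y_true, pred_global)

y = [1.0, 2.0, 3.0, 4.0]
pred_global = [2.5, 2.5, 2.5, 2.5]   # one model for everyone
pred_local = [1.5, 1.5, 3.5, 3.5]    # separate fits for two neighborhoods

index = relative_improvement(y, pred_global, pred_local)
print(f"relative improvement: {index:.2f}")   # 0.80
```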

Reproducing Paper Results

cd experiments/paper_experiments
python run_all_experiments.py

Real-World Validation

The UCI Student Performance experiment demonstrates AutoITE on real educational data:

cd experiments/paper_experiments
python uci_student_intervention.py

Fundamental Limits

AutoITE can detect latent confounders that affect baseline outcomes (baseline coupling). However, interaction-only confounders, those that affect only treatment response without leaving baseline fingerprints, are fundamentally undetectable by any observational method.
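
A small simulation illustrates the missing fingerprint. It is a sketch under an assumed data-generating process, not a proof of the general claim: the latent variable U changes the treatment effect but does not enter the pre-treatment outcome, so the baseline residuals look the same in both subgroups. All names and coefficients are illustrative.

```python
import random

rng = random.Random(1)
n = 4000
x = [rng.gauss(0, 1) for _ in range(n)]       # observed feature
u = [rng.choice([0, 1]) for _ in range(n)]    # latent subgroup, never observed

# Assumed DGP: U does not enter the baseline outcome, only the effect.
y_pre = [1.0 * xi + rng.gauss(0, 0.5) for xi in x]
tau = [1.0 + 2.0 * ui for ui in u]            # true effect differs by subgroup

# Residuals of a one-feature least-squares fit y_pre ~ a + b*x
mx, my = sum(x) / n, sum(y_pre) / n
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y_pre))
sxx = sum((xi - mx) ** 2 for xi in x)
slope = sxy / sxx
resid = [yi - (my + slope * (xi - mx)) for xi, yi in zip(x, y_pre)]

def group_mean(values, group):
    """Mean of values over units whose latent U equals group."""
    picked = [v for v, ui in zip(values, u) if ui == group]
    return sum(picked) / len(picked)

resid_gap = group_mean(resid, 1) - group_mean(resid, 0)
tau_gap = group_mean(tau, 1) - group_mean(tau, 0)
print(f"baseline residual gap between subgroups: {resid_gap:+.3f}")
print(f"true effect gap between subgroups:       {tau_gap:+.3f}")
```

The effect gap is large by construction while the residual gap stays near zero, so matching on baseline residuals has nothing to separate the subgroups with.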

Paper

See paper/auto_ite_final.pdf for the full manuscript:

AutoITE: Residual-Based Individual Treatment Effect Estimation via Baseline Coupling

Jake Peace, November 2025

Data Attribution

UCI Student Performance Dataset

The real-world validation uses the Student Performance dataset from the UCI Machine Learning Repository, provided under the CC BY 4.0 license.

  • Creator: Paulo Cortez
  • Source: https://archive.ics.uci.edu/dataset/320/student+performance
  • DOI: 10.24432/C5TG7T
  • Citation: P. Cortez and A. Silva. Using Data Mining to Predict Secondary School Student Performance. In A. Brito and J. Teixeira Eds., Proceedings of 5th FUture BUsiness TEChnology Conference (FUBUTEC 2008) pp. 5-12, Porto, Portugal, April, 2008, EUROSIS, ISBN 978-9077381-39-7.

License

MIT License - see LICENSE file for details.

Author

Jake Peace (2025)
