The identification contract engine for the LLM era.
██████╗ █████╗ ██████╗ ██████╗ ███████╗██████╗
██╔══██╗██╔══██╗██╔════╝ ██╔════╝ ██╔════╝██╔══██╗
██║ ██║███████║██║ ███╗██║ ███╗█████╗ ██████╔╝
██║ ██║██╔══██║██║ ██║██║ ██║██╔══╝ ██╔══██╗
██████╔╝██║ ██║╚██████╔╝╚██████╔╝███████╗██║ ██║
╚═════╝ ╚═╝ ╚═╝ ╚═════╝ ╚═════╝ ╚══════╝╚═╝ ╚═╝
Your AI agent doesn't test for parallel trends. DAGger does.
Quickstart · Why DAGger? · Architecture · MCP Server · References
The Problem
Modern AI tooling has made econometric execution trivially easy and causal validity invisibly catastrophic.
Ask an AI agent to run a DiD analysis. It will produce a beautifully formatted coefficient table with stars, clustered standard errors, and a significant p-value. What it will never do: test whether parallel trends hold, check for anticipation effects, or verify that your instrument has a strong first stage.
The output looks like science. It is causal fraud.
This isn't a model capability failure — it's an architectural one. There is no software primitive that makes "I must validate my identification strategy before I can estimate" a programmable constraint rather than a vague checklist item in a methods section.
DAGger is that primitive.
The Solution
import dagr as dg
# 1. Declare your identification strategy before touching data
contract = dg.DiffInDiffContract(
    estimand=dg.Estimand.ATT,
    outcome_var="log_employment",
    treatment_var="min_wage_increase",
    time_var="year",
    unit_var="county_fips",
    assumptions=frozenset([
        dg.Assumption.PARALLEL_TRENDS,
        dg.Assumption.NO_ANTICIPATION,
    ]),
    pre_periods=(-4, -3, -2, -1),
    post_periods=(0, 1, 2, 3),
)
# 2. Run the preflight battery, or don't estimate
with dg.AuditLedger(contract=contract, experiment_id="min_wage_2024",
                    ledger_path="artifacts/ledger.jsonld") as ledger:
    preflight = contract.validate(data, verbose=True)
    ledger.attach_preflight(preflight)
    preflight.assert_valid()  # raises IdentificationError if INVALID
    # 3. Estimate: @requires_contract is satisfied by the AuditLedger context
    results = contract.build_estimator(data).fit()
    ledger.attach_results(results)
    # 4. Quantify robustness
    report = ledger.generate_report()
    print(report.sensitivity.rr_breakdown_m_bar)  # breakdown M* for Rambachan-Roth
    print(report.sensitivity.oster_delta)  # Oster delta for selection on unobservables
The preflight renders this in your terminal:
+----------------------------------+-------------------+-----------+-----------+
| Assumption | Status | Statistic | p-value |
+----------------------------------+-------------------+-----------+-----------+
| parallel_trends | VALID | F=0.421 | p=0.657 |
| no_anticipation | VALID | F=0.183 | p=0.831 |
| no_differential_attrition | VALID | - | - |
+----------------------------------+-------------------+-----------+-----------+
VERDICT: VALID | Pass rate: 100% | Contract: eg:3f4a8c2b1d...
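The hard gate in step 2 can be pictured as an ordinary Python pattern. The sketch below is not DAGger's implementation: `Preflight`, `requires_valid_preflight`, and the local `IdentificationError` are hypothetical stand-ins that only illustrate how a failed preflight can make estimation raise instead of proceeding silently.

```python
from dataclasses import dataclass
from functools import wraps


class IdentificationError(RuntimeError):
    """Raised when estimation is attempted without a passing preflight (illustrative)."""


@dataclass(frozen=True)
class Preflight:
    # Hypothetical stand-in for a preflight result: assumption name -> passed?
    checks: dict

    @property
    def failed(self):
        return sorted(name for name, ok in self.checks.items() if not ok)

    def assert_valid(self):
        if self.failed:
            raise IdentificationError(f"violated assumptions: {', '.join(self.failed)}")


def requires_valid_preflight(fit):
    """Decorator: refuse to run `fit` unless the preflight passes."""
    @wraps(fit)
    def gated(preflight, *args, **kwargs):
        preflight.assert_valid()
        return fit(preflight, *args, **kwargs)
    return gated


@requires_valid_preflight
def fit(preflight, estimate):
    # Placeholder "estimator": it simply returns the number it was given.
    return estimate


ok = Preflight({"parallel_trends": True, "no_anticipation": True})
bad = Preflight({"parallel_trends": False, "no_anticipation": True})

print(fit(ok, -0.047))
try:
    fit(bad, -0.047)
except IdentificationError as err:
    print("blocked:", err)
```

The point of the pattern is that the failure path is an exception rather than a warning, so a pipeline or agent cannot ignore it by accident.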
Why DAGger?
A DAG (directed acyclic graph) is the mathematical foundation of graphical causal inference: Pearl's do-calculus, structural causal models, identification theory. The -ger ending echoes agent nouns such as logger and debugger. DAGger is the tool that brings DAG-rigour to production pipelines.
The mypy of causal inference.
Three principles:
1. Contracts before estimation. An IdentificationContract is a Pydantic v2 model that declares your entire causal strategy — estimand, assumptions, sensitivity analyses — in a typed, serializable, content-addressed document. You cannot call .fit() without one.
2. Tests, not checklists. Every declared assumption is backed by a statistically correct, peer-reviewed test. Parallel trends uses an event-study pre-trend F-test, with Rambachan-Roth (2023) bounds quantifying robustness to violations. First stage uses the Olea-Pflueger (2013) effective F, not the Staiger-Stock rule of thumb. The tests are the contract.
3. Machine-readable by default. Every result is a Pydantic model with semantic field names and paired interpretation strings. The audit ledger is SHA-256 content-addressed JSON-LD. The MCP server exposes everything to LLM agents natively. Provenance is not an afterthought — it's the architecture.
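As a rough illustration of what "content-addressed" means for a contract, the sketch below hashes a canonical JSON serialisation with SHA-256. `ToyContract` and `make_contract` are hypothetical and far simpler than a real contract model; the sketch assumes only that identical declarations should yield identical addresses.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ToyContract:
    # Hypothetical, simplified stand-in for an identification contract.
    estimand: str
    outcome_var: str
    treatment_var: str
    assumptions: tuple  # kept sorted so ordering cannot change the hash

    def content_hash(self) -> str:
        # Canonical JSON: sorted keys and fixed separators give a stable byte string.
        payload = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return "sha256:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()


def make_contract(estimand, outcome_var, treatment_var, assumptions):
    return ToyContract(estimand, outcome_var, treatment_var, tuple(sorted(assumptions)))


a = make_contract("ATT", "log_employment", "min_wage_increase",
                  {"parallel_trends", "no_anticipation"})
b = make_contract("ATT", "log_employment", "min_wage_increase",
                  {"no_anticipation", "parallel_trends"})
c = make_contract("ATT", "log_employment", "min_wage_increase",
                  {"parallel_trends"})

print(a.content_hash() == b.content_hash())  # same content, same address
print(a.content_hash() == c.content_hash())  # dropping an assumption changes it
```

Because the address is derived from the declaration itself, quietly editing an assumption after the fact produces a different identifier, which is what makes the contract usable as a pre-registration artifact.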
Quickstart
Install
pip install dagr
# With estimators (pyfixest, linearmodels, doubleml):
pip install "dagr[estimators]"
# With R bridge (HonestDiD, rdrobust, synthdid):
pip install "dagr[r-bridge]"
DiD in 6 steps
import dagr as dg
import polars as pl
# Step 1: Load your panel data
data = pl.read_parquet("county_employment_panel.parquet")
# Step 2: Declare the identification contract (pre-registration)
contract = dg.DiffInDiffContract(
    estimand=dg.Estimand.ATT,
    outcome_var="log_employment",
    treatment_var="min_wage_increase",
    time_var="year",
    unit_var="county_fips",
    assumptions=frozenset([
        dg.Assumption.PARALLEL_TRENDS,
        dg.Assumption.NO_ANTICIPATION,
    ]),
    pre_periods=(-4, -3, -2, -1),
    post_periods=(0, 1, 2, 3),
)
contract.to_file("artifacts/contract.json")  # pre-registration artifact
# Step 3: Validate assumptions with the preflight battery
with dg.AuditLedger(contract=contract, experiment_id="min_wage_2024",
                    ledger_path="artifacts/ledger.jsonld") as ledger:
    preflight = contract.validate(data)  # runs all declared tests
    ledger.attach_preflight(preflight)
    preflight.assert_valid()  # hard gate: stops here if INVALID
    # Step 4: Estimate
    results = contract.build_estimator(data).fit()
    ledger.attach_results(results)
    # Step 5: Sensitivity analysis
    rr = dg.RambachanRothSensitivity(results=results)
    rr_report = rr.compute(pre_period_max_abs=0.03)
    ledger.attach_sensitivity(rr_report)
    # Step 6: The machine-readable report
    report = ledger.generate_report()
    print(report.model_dump_json(indent=2))  # LLM-consumable, carries a SHA-256 audit hash
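To give a feel for what a Rambachan-Roth breakdown value measures, here is a deliberately simplified sketch of the relative-magnitudes idea for the first post-treatment period. It treats the event-study coefficients as known (no sampling error) and is not the HonestDiD procedure, which builds confidence sets. The function names and numbers are illustrative assumptions.

```python
def max_pre_jump(pre_coefs):
    """Largest period-to-period change in the pre-trend path.

    pre_coefs: pre-period event-study coefficients in time order; the
    reference period (normalised to zero) is appended internally.
    """
    path = list(pre_coefs) + [0.0]
    return max(abs(later - earlier) for earlier, later in zip(path, path[1:]))


def identified_set(pre_coefs, post_coef, m_bar):
    """Bounds on the first post-period effect if the post-period violation of
    parallel trends is at most m_bar times the largest pre-period jump."""
    half_width = m_bar * max_pre_jump(pre_coefs)
    return (post_coef - half_width, post_coef + half_width)


def breakdown_m_bar(pre_coefs, post_coef):
    """Smallest m_bar at which the identified set reaches zero."""
    jump = max_pre_jump(pre_coefs)
    return float("inf") if jump == 0 else abs(post_coef) / jump


pre = [0.02, -0.01, 0.01]   # illustrative pre-period coefficients
post = -0.045               # illustrative first post-period coefficient

print(identified_set(pre, post, m_bar=1.0))
print(breakdown_m_bar(pre, post))
```

A breakdown value well above 1 says the sign of the estimate survives post-period violations larger than anything seen before treatment; a value well below 1 says it does not.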
Instrumental Variables
contract = dg.IVContract(
    estimand=dg.Estimand.LATE,
    outcome_var="earnings",
    treatment_var="years_education",
    time_var="birth_cohort",
    unit_var="individual_id",
    assumptions=frozenset([
        dg.Assumption.INSTRUMENT_RELEVANCE,  # tested: Olea-Pflueger effective F
        dg.Assumption.INSTRUMENT_EXCLUSION,  # tested: reduced-form plausibility
    ]),
    instruments=("compulsory_schooling_law",),
    endogenous_vars=("years_education",),
    estimator_preference="2SLS",
    pre_periods=(-3, -2, -1),
    post_periods=(0, 1, 2),
)
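For intuition about what an instrument-relevance check looks at, the sketch below computes the textbook first-stage F-statistic for a single instrument under homoskedastic errors, using only the standard library. This is not the Olea-Pflueger effective F, which additionally accounts for heteroskedasticity, serial correlation, and clustering; the simulated data and the `first_stage_f` helper are illustrative assumptions.

```python
import random
import statistics


def first_stage_f(instrument, treatment):
    """F-statistic for H0: slope = 0 in treatment = a + b * instrument + e.

    With one instrument and homoskedastic errors this equals the squared
    t-statistic on the slope.
    """
    n = len(instrument)
    z_bar = statistics.fmean(instrument)
    d_bar = statistics.fmean(treatment)
    s_zz = sum((z - z_bar) ** 2 for z in instrument)
    s_zd = sum((z - z_bar) * (d - d_bar) for z, d in zip(instrument, treatment))
    slope = s_zd / s_zz
    intercept = d_bar - slope * z_bar
    rss = sum((d - intercept - slope * z) ** 2 for z, d in zip(instrument, treatment))
    slope_var = (rss / (n - 2)) / s_zz
    return slope ** 2 / slope_var


rng = random.Random(0)
z = [rng.choice([0.0, 1.0]) for _ in range(500)]
strong = [0.8 * zi + rng.gauss(0, 1) for zi in z]   # instrument moves treatment
weak = [0.02 * zi + rng.gauss(0, 1) for zi in z]    # instrument barely matters

for label, d in [("strong", strong), ("weak", weak)]:
    f = first_stage_f(z, d)
    print(f"{label}: F = {f:.1f}, passes F > 10 rule of thumb: {f > 10}")
```

The F > 10 cutoff used in the print statement is the Staiger-Stock rule of thumb mentioned above, included here only as a familiar baseline.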
Architecture
+-----------------------------------------------------------------------+
| DAGR STACK |
+-----------------------------------------------------------------------+
| Human Researcher (Python API) | AI Agent (MCP Tool Call) |
| | | | |
| v v |
| +--------------------------------------------------+ |
| | dagr.contracts | |
| | DiffInDiffContract | IVContract | |
| | Pydantic v2, frozen, content-addressed | |
| +---------------------+----------------------------+ |
| | .validate(data) |
| v |
| +--------------------------------------------------+ |
| | dagr.validators | |
| | TWFE event-study | Olea-Pflueger F | |
| | Rambachan-Roth | Sargan-Hansen J | |
| +---------------------+----------------------------+ |
| | ValidationSuiteResult |
| | VALID / VALID_CONDITIONAL |
| | FRAGILE / INVALID |
| v |
| +--------------------------------------------------+ |
| | contract.build_estimator(data).fit() | |
| | TWFE (pyfixest) | 2SLS / LIML / GMM-IV | |
| | Callaway-Sant'Anna | AIPW (doubly-robust) | |
| +---------------------+----------------------------+ |
| | EconGuardResults |
| v |
| +--------------------------------------------------+ |
| | dagr.sensitivity | |
| | Rambachan-Roth (2023) | Oster delta (2019) | |
| | Spec Curve | Rosenbaum bounds | |
| +---------------------+----------------------------+ |
| v |
| +--------------------------------------------------+ |
| | dagr.ledger | |
| | AuditLedger — SHA-256 content-addressed | |
| | IdentificationReport — LLM-optimised JSON | |
| | MCP Server — 4 tools, Claude/GPT-4 native | |
| +--------------------------------------------------+ |
+-----------------------------------------------------------------------+
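One plausible way to read the four-level verdict in the diagram is as a worst-case aggregation over per-assumption results. The sketch below assumes that rule, and assumes that VALID and VALID_CONDITIONAL both count toward the pass rate; neither is confirmed by the source, and `Status` and `summarise` are hypothetical names.

```python
from enum import IntEnum


class Status(IntEnum):
    # Ordered from best to worst so that max() picks the weakest link.
    VALID = 0
    VALID_CONDITIONAL = 1
    FRAGILE = 2
    INVALID = 3


def summarise(results):
    """results: mapping of assumption name -> Status."""
    verdict = max(results.values(), default=Status.INVALID)
    passed = sum(s in (Status.VALID, Status.VALID_CONDITIONAL) for s in results.values())
    pass_rate = passed / len(results) if results else 0.0
    return verdict, pass_rate


suite = {
    "parallel_trends": Status.VALID,
    "no_anticipation": Status.VALID,
    "no_differential_attrition": Status.VALID,
}
verdict, pass_rate = summarise(suite)
print(f"VERDICT: {verdict.name} | Pass rate: {pass_rate:.0%}")
```

Treating an empty suite as INVALID is a conservative choice in this sketch: no evidence is not the same as passing evidence.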
MCP Server
DAGger exposes its full validation and sensitivity stack as an MCP server that any LLM agent can call natively.
dagr serve --port 8080
Four tools:
| Tool | Description |
|---|---|
| run_identification_preflight | Validate contract assumptions against data. Returns ValidationSuiteResult. |
| compute_sensitivity | Rambachan-Roth bounds or Oster delta. Returns SensitivityReport. |
| generate_identification_report | Full audit report from a ledger file. |
| validate_did_contract | Flat-parameter convenience tool for LLM agents. |
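For readers unfamiliar with MCP, a tool invocation is a JSON-RPC 2.0 message with the method `tools/call`. The sketch below builds such a message for `validate_did_contract`. The argument names are hypothetical (they mirror the contract fields shown earlier), and the transport and actual tool schema are not specified in this document.

```python
import json


def tool_call(request_id, tool_name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical flat arguments; the real tool schema may use different names.
arguments = {
    "data_path": "county_employment_panel.parquet",
    "outcome_var": "log_employment",
    "treatment_var": "min_wage_increase",
    "time_var": "year",
    "unit_var": "county_fips",
    "pre_periods": [-4, -3, -2, -1],
    "post_periods": [0, 1, 2, 3],
}

request = tool_call(1, "validate_did_contract", arguments)
print(json.dumps(request, indent=2))
```

In practice an MCP client library would construct and send this message; the sketch only shows why flat, primitive-typed parameters are convenient for an LLM to fill in.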
The IdentificationReport is designed for LLM consumption: semantic field names, paired interpretation strings, controlled-vocabulary verdicts.
{
  "schema_version": "dagr/v1",
  "overall_verdict": "valid",
  "identification": {
    "strategy": "difference_in_differences",
    "estimand": "average_treatment_effect_on_treated",
    "status": "valid",
    "recommendation": "Identification is valid. Proceed with estimation.",
    "failed_assumptions": []
  },
  "sensitivity": {
    "rr_breakdown_m_bar": 1.43,
    "rr_verdict": "valid",
    "oster_delta": 2.14,
    "oster_verdict": "valid"
  },
  "audit_hash": "sha256:3f4a8c2b1d..."
}
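Controlled-vocabulary verdicts let a downstream agent branch on exact strings instead of parsing prose. The sketch below shows one way a consumer might gate on the report, using an abbreviated copy of the example above. The `decide` helper, and its rule that any verdict other than "valid" blocks, are assumptions made for illustration.

```python
import json

# Abbreviated copy of the example report; field names are taken from it.
REPORT = """
{
  "schema_version": "dagr/v1",
  "overall_verdict": "valid",
  "identification": {
    "status": "valid",
    "recommendation": "Identification is valid. Proceed with estimation.",
    "failed_assumptions": []
  },
  "sensitivity": {"rr_verdict": "valid", "oster_verdict": "valid"}
}
"""


def decide(report_json):
    """Return (proceed, reason) using only controlled-vocabulary fields."""
    report = json.loads(report_json)
    ident = report["identification"]
    failed = list(ident["failed_assumptions"])
    # Collect any sensitivity verdicts that are not plainly "valid".
    weak = [key for key, value in report.get("sensitivity", {}).items()
            if key.endswith("_verdict") and value != "valid"]
    proceed = report["overall_verdict"] == "valid" and not failed and not weak
    reason = ident["recommendation"] if proceed else f"blocked: {failed + weak}"
    return proceed, reason


print(decide(REPORT))
```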
Feature Matrix
| Feature | DAGger | Naive LLM | Manual Checklist |
|---|---|---|---|
| Assumption validation | Automated | None | Manual |
| Fails on violation | Hard gate | Silent | Sometimes |
| Parallel trends | Event-study F + max\|beta\| | - | Visual |
| Weak instruments | Olea-Pflueger (2013) | - | Rule-of-thumb |
| Rambachan-Roth bounds | Python + R bridge | - | - |
| Oster delta | Analytic (3dp verified) | - | - |
| Audit trail | SHA-256 JSON-LD | - | Notes |
| LLM-readable output | Pydantic + MCP | Unstructured | - |
| Pre-registration | OSF JSON-LD | - | Manual |
What DAGger Catches
The demo notebook notebooks/01_the_llm_got_it_wrong.ipynb walks through a real case:
- An AI agent produces a "significant" employment effect of a minimum wage increase
- DAGger runs the preflight — the pre-trend F-test fails
- Rambachan-Roth bounds show the CI crosses zero at M-bar = 0.38
- The corrected analysis on properly identified data: VALID verdict
[AI result]  ATT = -0.047***  SE = 0.018  p = 0.009      <- looks correct
[DAGger]     Pre-trend F(3,...) = 8.41  p = 0.002        <- assumption violated
             Rambachan-Roth: breakdown M* = 0.38         <- not robust
             Verdict: INVALID. Do not use this estimate.
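The "looks correct" row is internally consistent, which is exactly why it is persuasive. As a quick check, the sketch below recomputes the two-sided p-value from the reported ATT and SE under a normal approximation (an assumption; the original analysis may have used a t reference distribution). `two_sided_p` is an illustrative helper.

```python
import math


def two_sided_p(estimate, std_error):
    """Two-sided p-value for H0: effect = 0 under a normal approximation."""
    z = estimate / std_error
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal CDF Phi.
    return math.erfc(abs(z) / math.sqrt(2))


att, se = -0.047, 0.018
print(f"z = {att / se:.2f}, p = {two_sided_p(att, se):.3f}")  # z = -2.61, p = 0.009
```

The arithmetic checks out, yet it says nothing about whether parallel trends hold; the p-value answers a question about sampling noise, not about identification.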
Quality
| Metric | Value |
|---|---|
| Test suite | 330+ tests |
| Type checking | mypy --strict (zero errors) |
| Linting | ruff (zero violations) |
| Coverage | >= 80% |
| Build | uv build + twine check PASSED |
| License | Apache 2.0 |
| Python | 3.12+ |
Installation Options
# Core (contracts, validators, sensitivity, ledger, MCP, CLI)
pip install dagr
# With real estimators (pyfixest, linearmodels, doubleml)
pip install "dagr[estimators]"
# With R bridge (HonestDiD LP bounds, rdrobust, synthdid)
pip install "dagr[r-bridge]"
# Full installation
pip install "dagr[estimators,r-bridge]"
References
DAGger implements or wraps published statistical methods. All implementations cite their source paper and include analytic test cases with known expected values.
Callaway, Brantly and Pedro H.C. Sant'Anna. 2021. "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, 225(2), 200-230.
Montiel Olea, José Luis and Carolin Pflueger. 2013. "A Robust Test for Weak Instruments." Journal of Business & Economic Statistics, 31(3), 358-369.
Oster, Emily. 2019. "Unobservable Selection and Coefficient Stability: Theory and Evidence." Journal of Business & Economic Statistics, 37(2), 187-204.
Rambachan, Ashesh and Jonathan Roth. 2023. "A More Credible Approach to Parallel Trends." The Review of Economic Studies, 90(5), 2555-2591.
Rosenbaum, Paul R. 2002. Observational Studies (2nd ed.). Springer. Chapter 4.
Simonsohn, Uri, Joseph P. Simmons, and Leif D. Nelson. 2020. "Specification Curve Analysis." Nature Human Behaviour, 4, 1208-1214.
Contributing
See CONTRIBUTING.md. We especially welcome:
- New validators with cited source papers and analytic test cases
- RDDContract implementation (good first issue)
- Callaway-Sant'Anna doubly-robust with cross-fitting
- Spanish/Portuguese translations of interpretation strings
Critical rule: Any modification to a statistical validator must cite the source paper and include an analytic test with a known expected value. Statistical correctness is not negotiable.
DAGger — Because causal validity should be a compiler error, not a footnote.