# econirl

Behavioral inference: IRL and DDC with standard errors.

Benchmarking dynamic discrete choice and inverse RL algorithms on a variety of MDPs, comparing reward recovery, imitation, and generalization.
## Install

```shell
pip install econirl
```
## Try It

Load a bundled dataset and fit one estimator in under a second:

```python
from econirl.datasets import load_rust_bus
from econirl import CCP

df = load_rust_bus()
model = CCP(n_states=90, discount=0.9999)
model.fit(df, state="mileage_bin", action="replaced", id="bus_id")

print(model.params_)
print(model.summary())
```
## Benchmark

Compare the 10 default estimators on a simulated 5-state bus-engine replacement MDP. Each estimator runs sequentially; the full loop takes a few minutes.

```python
from econirl.evaluation.benchmark import (
    BenchmarkDGP,
    get_default_estimator_specs,
    run_single,
)

dgp = BenchmarkDGP(n_states=5, discount_factor=0.95)

for spec in get_default_estimator_specs():
    result = run_single(dgp, spec, n_agents=100, n_periods=50, seed=42)
    print(f"{result.estimator:12s} {result.pct_optimal:6.1f}% {result.time_seconds:5.1f}s")
```
## Results

The table below covers all 18 algorithms: the 10 defaults from `get_default_estimator_specs()` plus 8 more from `econirl.contrib`.
| Estimator | Category | Recovers Params | Recovers Reward | % Optimal | % Transfer | Time |
|---|---|---|---|---|---|---|
| **Structural Estimators** | | | | | | |
| NFXP | Structural | Yes | Yes | 99.7% | 99.8% | 13.9s |
| CCP | Structural | Yes | Yes | 99.7% | 99.8% | 18.6s |
| SEES | Structural | Yes | Yes | 99.6% | 99.6% | 28.6s |
| NNES | Structural | Yes | Yes | 99.6% | 99.1% | 13.7s |
| **Entropy-Based IRL** | | | | | | |
| MCE IRL | IRL | Yes | Yes | 99.7% | 99.7% | 20.6s |
| MaxEnt IRL | IRL | No | Yes | 98.2% | 97.8% | 9.1s |
| Deep MaxEnt | IRL | No | Yes | 98.3% | 98.2% | 52.3s |
| BIRL | IRL | No | Yes | 99.5% | 99.5% | 237.8s |
| **Margin-Based IRL** | | | | | | |
| Max Margin | IRL | Yes | Yes | 99.3% | 99.3% | 64.8s |
| Max Margin IRL | IRL | No | Yes | 31.1% | 34.2% | 0.3s |
| **Distribution Matching** | | | | | | |
| f-IRL | IRL | No | Yes | 99.1% | 99.1% | 44.9s |
| **Neural Estimators** | | | | | | |
| TD-CCP | Neural | Yes | Yes | 99.8% | 99.7% | 16.3s |
| GLADIUS | Neural | Yes | Yes | 99.6% | 88.7% | 4.2s |
| **Adversarial Methods** | | | | | | |
| GAIL | Adversarial | No | No | 54.3% | 50.9% | 112.9s |
| AIRL | Adversarial | No | Yes | 99.4% | 99.5% | 123.0s |
| GCL | Adversarial | No | Yes | 92.7% | 95.3% | 166.5s |
| **Inverse Q-Learning** | | | | | | |
| IQ-Learn | IRL | No | Yes | 99.5% | 99.1% | 0.0s |
| **Baseline** | | | | | | |
| BC | Baseline | No | No | 99.5% | 99.5% | 0.1s |
5-state MDP, 100 agents × 50 periods, seed 42. **% Optimal**: value achieved relative to the true optimal policy on the training dynamics (baseline-normalized). **% Transfer**: the same metric on held-out transition dynamics (same rewards, different wear rates). **Recovers Params**: recovers interpretable structural parameters. **Recovers Reward**: recovers a reward function, which enables transfer to new dynamics.
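To make the headline metric concrete, here is a hypothetical sketch (not econirl code) of baseline-normalized % Optimal on a toy two-state MDP: a policy's value is rescaled so that a uniform-random policy scores 0% and the optimal policy scores 100%.

```python
# Toy two-state MDP: action a moves deterministically to state a,
# and state 1 pays a reward of 1. All names here are illustrative.
S, gamma = 2, 0.9
R = [0.0, 1.0]

def value(pi, iters=500):
    # Average start-state value of a policy; pi[s] = P(action 1 | state s).
    V = [0.0, 0.0]
    for _ in range(iters):
        V = [R[s] + gamma * (pi[s] * V[1] + (1 - pi[s]) * V[0])
             for s in range(S)]
    return sum(V) / S

v_opt = value([1.0, 1.0])    # always move to the rewarding state
v_rand = value([0.5, 0.5])   # uniform-random baseline
v_pi = value([0.9, 1.0])     # a near-optimal "learned" policy

pct_optimal = 100 * (v_pi - v_rand) / (v_opt - v_rand)
print(round(pct_optimal, 1))  # → 98.9
```

Normalizing against the random baseline keeps the metric comparable across MDPs whose raw values sit on different scales.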
## Algorithms

### Structural Estimators

These estimators assume the econometrician knows the model, and they recover flow-utility parameters by maximum likelihood.
| Algorithm | Paper | Method |
|---|---|---|
| NFXP | Rust (1987) | Full-solution MLE via nested fixed point |
| CCP | Hotz & Miller (1993) | Two-step conditional choice probability with NPL iterations |
| SEES | Luo & Sang (2024) | Sieve basis V(s) approximation + penalized joint MLE |
| NNES | Nguyen (2025) | Neural V(s) network (NPL Bellman) + structural MLE |
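The nested-fixed-point idea behind NFXP can be sketched in a few lines. This is a self-contained illustration, not econirl's implementation: the model, utility function, and grid search below are all made up for the example. The inner loop solves the soft Bellman fixed point for a candidate parameter; the outer loop maximizes the likelihood of the observed choice probabilities over parameters.

```python
import math

# Stylized bus-engine model: state = mileage bin 0..4, action 1 = replace.
# Replacement cost RC is treated as known; theta scales maintenance cost.
S, gamma, RC = 5, 0.95, 2.0

def solve(theta):
    # Inner fixed point: soft (log-sum-exp) Bellman under type-1 EV shocks.
    def u(s, a):
        return -RC if a == 1 else -theta * s
    V = [0.0] * S
    for _ in range(500):
        V = [math.log(math.exp(u(s, 0) + gamma * V[min(s + 1, S - 1)])
                      + math.exp(u(s, 1) + gamma * V[0])) for s in range(S)]
    # Logit choice probabilities implied by theta.
    p = []
    for s in range(S):
        q0 = u(s, 0) + gamma * V[min(s + 1, S - 1)]
        q1 = u(s, 1) + gamma * V[0]
        p1 = 1.0 / (1.0 + math.exp(q0 - q1))
        p.append([1.0 - p1, p1])
    return p

p_true = solve(0.5)   # "data": population choice probabilities at theta = 0.5

# Outer loop: maximize expected log-likelihood over a grid of candidates.
def loglik(theta):
    p = solve(theta)
    return sum(p_true[s][a] * math.log(p[s][a])
               for s in range(S) for a in range(2))

grid = [0.1 * k for k in range(1, 11)]
theta_hat = max(grid, key=loglik)
print(theta_hat)  # → 0.5
```

In practice the outer loop is a derivative-based optimizer rather than a grid, but the structure is the same: every likelihood evaluation re-solves the inner fixed point.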
### Entropy-Based IRL

These methods recover reward functions from demonstrations using maximum-entropy or Bayesian principles.
| Algorithm | Paper | Method |
|---|---|---|
| MCE IRL | Ziebart (2010) | Maximum causal entropy IRL with soft value iteration |
| MaxEnt IRL | Ziebart et al. (2008) | Maximum entropy IRL with state visitation frequencies |
| Deep MaxEnt | Wulfmeier et al. (2016) | Neural reward network + MaxEnt feature matching |
| BIRL | Ramachandran & Amir (2007) | Bayesian MCMC (Metropolis-Hastings) over reward parameters |
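A minimal tabular sketch of the MaxEnt-style loop (hypothetical code, not the library's implementation): fit a soft policy to the current reward estimate, compute its discounted state-visitation distribution, and ascend the gradient of expert visitations minus model visitations.

```python
import math

# Toy 3-state chain: action 0 stays, action 1 moves right (state 2 absorbs).
S, A, gamma = 3, 2, 0.9
def nxt(s, a):
    return s if a == 0 else min(s + 1, S - 1)

def soft_policy(r):
    # Soft value iteration, then a softmax policy over soft Q-values.
    V = [0.0] * S
    for _ in range(200):
        V = [math.log(sum(math.exp(r[s] + gamma * V[nxt(s, a)])
                          for a in range(A))) for s in range(S)]
    pi = []
    for s in range(S):
        q = [r[s] + gamma * V[nxt(s, a)] for a in range(A)]
        m = max(q)
        e = [math.exp(x - m) for x in q]
        pi.append([x / sum(e) for x in e])
    return pi

def visitation(pi):
    # Discounted state-visitation distribution under policy pi.
    p0 = [1.0 / S] * S
    d = p0[:]
    for _ in range(300):
        nd = [(1 - gamma) * p0[s] for s in range(S)]
        for s in range(S):
            for a in range(A):
                nd[nxt(s, a)] += gamma * d[s] * pi[s][a]
        d = nd
    return d

d_expert = visitation(soft_policy([0.0, 0.0, 1.0]))  # demos from a known reward

# Gradient ascent: with one-hot state features, the MaxEnt gradient is simply
# expert visitations minus visitations under the current reward estimate.
r = [0.0] * S
for _ in range(400):
    d = visitation(soft_policy(r))
    r = [r[s] + 0.5 * (d_expert[s] - d[s]) for s in range(S)]

print(max(range(S), key=lambda s: r[s]))  # the rewarding state is identified
```

The forward pass (soft value iteration) and backward pass (visitation propagation) are what make MaxEnt IRL tractable on tabular MDPs; Deep MaxEnt swaps the tabular reward for a neural network but keeps the same gradient.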
### Margin-Based IRL

These methods recover rewards by maximizing the margin between expert and non-expert behavior.
| Algorithm | Paper | Method |
|---|---|---|
| Max Margin | Ratliff et al. (2006) | Structured max-margin planning |
| Max Margin IRL | Abbeel & Ng (2004) | Apprenticeship learning via margin maximization |
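The projection variant of Abbeel & Ng's apprenticeship learning can be sketched on a toy problem (illustrative code, not the benchmarked implementation): repeatedly solve the MDP for the reward direction `mu_E - mu_bar`, then project the expert's feature expectations onto the set achievable by mixtures of the policies found so far.

```python
import math

# Two-state MDP: action a moves deterministically to state a.
# Features are one-hot state indicators, so feature expectations
# are discounted state-visitation counts.
gamma = 0.9

def feat_exp(pi):
    # mu = E[sum_t gamma^t phi(s_t)] for deterministic policy pi, uniform start.
    F = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(500):
        F = [[(1.0 if i == s else 0.0) + gamma * F[pi[s]][i]
              for i in range(2)] for s in range(2)]
    return [(F[0][i] + F[1][i]) / 2 for i in range(2)]

def best_response(w):
    # Optimal policy for reward r(s) = w[s]; every state is reachable in one
    # step, so the best action targets the successor with the highest value.
    V = [0.0, 0.0]
    for _ in range(500):
        V = [w[s] + gamma * max(V) for s in range(2)]
    a = 0 if V[0] >= V[1] else 1
    return [a, a]

mu_E = feat_exp([1, 1])    # "expert" always moves to state 1
mu_bar = feat_exp([0, 0])  # initialize with an arbitrary policy
t = float("inf")
for _ in range(20):
    w = [mu_E[i] - mu_bar[i] for i in range(2)]  # margin direction
    t = math.sqrt(sum(x * x for x in w))         # distance to expert
    if t < 1e-6:
        break
    mu = feat_exp(best_response(w))
    # Project mu_E onto the line through mu_bar and the new mu.
    d = [mu[i] - mu_bar[i] for i in range(2)]
    den = sum(x * x for x in d)
    if den == 0:
        break
    step = sum(d[i] * w[i] for i in range(2)) / den
    mu_bar = [mu_bar[i] + step * d[i] for i in range(2)]

print(round(t, 6))  # the margin shrinks toward 0 as feature expectations match
```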
### Distribution Matching

These methods match state-marginal distributions rather than feature expectations.
| Algorithm | Paper | Method |
|---|---|---|
| f-IRL | Ni et al. (2022) | State-marginal matching via f-divergences (KL, chi-squared, TV) |
### Neural Estimators

These estimators approximate value functions with neural networks, scaling to large state spaces.
| Algorithm | Paper | Method |
|---|---|---|
| TD-CCP | Adusumilli & Eckardt (2022) | TD-learning + CCP with neural approximate value iteration |
| GLADIUS | Kang, Yoganarasimhan & Jain (2025) | Dual Q + EV networks with Bellman consistency penalty |
### Inverse Q-Learning

This approach recovers reward and policy by learning a single soft Q-function, avoiding adversarial training.
| Algorithm | Paper | Method |
|---|---|---|
| IQ-Learn | Garg et al. (2021) | Inverse soft-Q learning with chi-squared divergence |
### Adversarial Methods

These methods learn a reward or policy via a discriminator that distinguishes expert from generated behavior.
| Algorithm | Paper | Method |
|---|---|---|
| GAIL | Ho & Ermon (2016) | Generative adversarial imitation learning |
| AIRL | Fu et al. (2018) | Adversarial inverse RL with disentangled reward |
| GCL | Finn et al. (2016) | Guided cost learning with importance sampling |
### Baseline
| Algorithm | Paper | Method |
|---|---|---|
| BC | — | Supervised: empirical P(a\|s) from demonstrations |
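In the discrete-state setting, behavioral cloning reduces to counting (an illustrative sketch, not econirl's estimator — the data below is made up):

```python
from collections import Counter, defaultdict

# Behavioral cloning with discrete states and actions: the "model" is just
# the empirical conditional distribution P(a | s) from demonstrations.
demos = [(0, 1), (0, 1), (0, 1), (0, 0), (1, 1)]  # (state, action) pairs

counts = defaultdict(Counter)
for s, a in demos:
    counts[s][a] += 1

def bc_policy(s):
    c = counts[s]
    total = sum(c.values())
    return {a: n / total for a, n in c.items()}

print(bc_policy(0))  # → {1: 0.75, 0: 0.25}
```

Because BC fits the policy directly and never recovers a reward, it imitates well on the training dynamics but cannot transfer to changed dynamics, as the results table shows.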
## Pseudocode
## License

MIT