Fast counterfactual estimators for panel data — Python reimplementation of R fect
pyfector
Alpha (v0.1.x). Results have been checked against R fect on synthetic data, but edge cases may remain. APIs may change. Please verify critical results independently. Issues and pull requests are welcome at GitHub.
Counterfactual estimators for panel data in Python. Port of the R fect package (Liu, Wang & Xu, 2024, AJPS), built on numpy, scipy, polars, and joblib, with optional CuPy GPU support.
Installation
pip install pyfector
From source:
git clone https://github.com/AlanHuang99/pyfector.git
cd pyfector
pip install -e ".[dev]"
Optional extras: pip install pyfector[gpu] (CuPy), pip install pyfector[pandas].
Quick Start
import polars as pl
import pyfector
data = pl.read_parquet("panel_data.parquet")
result = pyfector.fect(
    data=data,
    Y="outcome",
    D="treatment",
    index=("unit_id", "year"),
    X=["gdp", "population"],
    method="ife",
    r=(0, 5),
    se=True,
    nboots=500,
    seed=42,
)
result.summary()
result.plot(kind="gap")
result.diagnose()
Estimators
method="fe" -- Two-way fixed effects
Unit and time fixed effects on the control sample, counterfactuals imputed for the treated sample. Assumes parallel trends.
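The imputation idea can be sketched on a toy panel. This is an illustrative sketch only, not pyfector's implementation: it assumes a balanced panel with no covariates and no noise, and the helper name fe_impute and all variable names are hypothetical.

```python
import numpy as np

def fe_impute(Y, D, n_iter=200):
    """Fit Y[t, i] ~ alpha[i] + xi[t] on untreated cells, then predict every cell."""
    T, N = Y.shape
    control = (D == 0)
    alpha = np.zeros(N)   # unit effects
    xi = np.zeros(T)      # time effects
    for _ in range(n_iter):
        # Unit effects: average residual over each unit's untreated periods.
        resid = np.where(control, Y - xi[:, None], 0.0)
        alpha = resid.sum(axis=0) / control.sum(axis=0)
        # Time effects: average residual over each period's untreated units.
        resid = np.where(control, Y - alpha[None, :], 0.0)
        xi = resid.sum(axis=1) / control.sum(axis=1)
    return xi[:, None] + alpha[None, :]

# Toy (T, N) panel: the first 3 units are treated from period T0 onward.
T, N, T0, tau = 8, 6, 5, 3.0
Y0 = 0.5 * np.arange(T, dtype=float)[:, None] + np.arange(N, dtype=float)[None, :]
D = np.zeros((T, N), dtype=int)
D[T0:, :3] = 1
Y = Y0 + tau * D          # constant treatment effect tau, parallel trends hold

Y_ct = fe_impute(Y, D)    # imputed counterfactuals
eff = Y - Y_ct            # cell-level effects
att = eff[D == 1].mean()  # average over treated cells
print(round(float(att), 6))
```

Because the toy outcome is exactly additive in unit and time effects, the imputed counterfactuals match the untreated outcomes and the averaged effect recovers tau. The sketch requires every unit and every period to have at least one untreated cell.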
method="ife" -- Interactive fixed effects (Bai 2009, Xu 2017)
Adds latent factors with unit loadings on top of fixed effects. Number of factors r can be fixed or chosen by cross-validation over r=(r_min, r_max).
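A minimal sketch of the factor step, under simplifying assumptions: it omits the additive fixed effects and covariates, uses a fixed r, and runs on noise-free simulated data. The function ife_impute is a hypothetical name, and the loop is a generic EM-style low-rank imputation rather than pyfector's actual code.

```python
import numpy as np

def ife_impute(Y, D, r, n_iter=500, tol=1e-10):
    """Alternate between filling treated cells and refitting a rank-r SVD."""
    control = (D == 0)
    fit = np.zeros_like(Y)
    for _ in range(n_iter):
        # Keep observed control cells; fill treated cells with the current fit.
        filled = np.where(control, Y, fit)
        # Best rank-r approximation of the filled matrix.
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        new_fit = (U[:, :r] * s[:r]) @ Vt[:r]
        converged = np.max(np.abs(new_fit - fit)) < tol
        fit = new_fit
        if converged:
            break
    return fit

rng = np.random.default_rng(0)
T, N, T0, n_treated, r, tau = 30, 40, 20, 10, 2, 2.0
F = rng.normal(size=(T, r))        # latent factors
L = rng.normal(size=(N, r))        # unit loadings
Y0 = F @ L.T                       # untreated potential outcomes
D = np.zeros((T, N), dtype=int)
D[T0:, :n_treated] = 1
Y = Y0 + tau * D

Y_ct = ife_impute(Y, D, r)
att = (Y - Y_ct)[D == 1].mean()
print(round(float(att), 4))
```

Each treated unit here has 20 pre-treatment periods, comfortably more than r, which is what allows its loadings to be pinned down from untreated data alone.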
method="mc" -- Matrix completion (Athey et al. 2021)
Nuclear-norm-penalized matrix completion of the counterfactual matrix. Penalty lam can be fixed or selected by cross-validation.
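The core operation behind nuclear-norm penalization is soft-thresholding of singular values. The sketch below shows that step and a basic soft-impute loop; the names svt and soft_impute are hypothetical, and the scaling of lam here is an assumption that may not match how pyfector or R fect normalize the penalty.

```python
import numpy as np

def svt(Z, lam):
    """Singular-value soft-thresholding: shrink each singular value by lam, floor at 0."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return (U * np.maximum(s - lam, 0.0)) @ Vt

def soft_impute(Y, observed, lam, n_iter=200):
    """Repeatedly fill unobserved cells with the current fit, then shrink."""
    fit = np.zeros_like(Y)
    for _ in range(n_iter):
        filled = np.where(observed, Y, fit)
        fit = svt(filled, lam)
    return fit

# Shrinking diag(3, 1) by 1.5 leaves singular values (1.5, 0).
shrunk = svt(np.diag([3.0, 1.0]), 1.5)

# With no penalty and nothing missing, the fit reproduces the input.
Y = np.outer([1.0, 2.0, 3.0], [1.0, 0.5])
observed = np.ones_like(Y, dtype=bool)
fit = soft_impute(Y, observed, lam=0.0, n_iter=5)
```

In a counterfactual setting, observed would be the mask of untreated cells, and the fitted values in treated cells would serve as the imputed counterfactuals. Larger lam values push more singular values to zero and so produce a lower-rank fit.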
method="cfe" -- Complex fixed effects
Unit/time-varying interactions with covariates Z and Q.
Cross-validation masks a fraction cv_prop of control cells and picks r (IFE) or lam (MC) by prediction error (mspe, gmspe, or mad).
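The selection logic can be sketched as follows. This is a hedged illustration: the helper names are hypothetical, the masking is simple random sampling, and the three loss definitions (mean squared error, geometric mean of squared errors, median absolute deviation) are one common reading of mspe, gmspe, and mad that pyfector may define differently.

```python
import math
import random
import statistics

def sample_cv_mask(control_cells, cv_prop, seed=None):
    """Pick a random fraction cv_prop of control cells to hold out."""
    rng = random.Random(seed)
    n_mask = round(cv_prop * len(control_cells))
    return set(rng.sample(control_cells, n_mask))

def cv_loss(errors, criterion="mspe"):
    """Summarise prediction errors on the held-out cells."""
    if criterion == "mspe":    # mean squared prediction error
        return statistics.fmean(e * e for e in errors)
    if criterion == "gmspe":   # geometric mean of squared errors (errors must be nonzero)
        return math.exp(statistics.fmean(math.log(e * e) for e in errors))
    if criterion == "mad":     # median absolute deviation from the median error
        med = statistics.median(errors)
        return statistics.median(abs(e - med) for e in errors)
    raise ValueError(f"unknown criterion: {criterion}")

# 10 periods x 20 units of control cells, 10% held out.
control_cells = [(t, i) for t in range(10) for i in range(20)]
held_out = sample_cv_mask(control_cells, cv_prop=0.1, seed=42)

errors = [1.0, -1.0, 2.0, -2.0]
losses = {c: cv_loss(errors, c) for c in ("mspe", "gmspe", "mad")}

# Made-up held-out losses per candidate r, for illustration only; the smallest wins.
mspe_by_r = {0: 1.92, 1: 1.10, 2: 0.41, 3: 0.44}
best_r = min(mspe_by_r, key=mspe_by_r.get)
```

In a full procedure, each candidate r or lam would be refit with the held-out cells hidden, and the errors would be the differences between the hidden outcomes and their predictions.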
Parameters
fect()
fect(
    data, Y, D, index,
    *, X=None, W=None, method="ife", force="two-way",
    r=0, lam=None, nlambda=10, CV=True, k=10, cv_prop=0.1, criterion="mspe",
    se=False, vartype="bootstrap", nboots=200, alpha=0.05,
    tol=1e-7, max_iter=5000, min_T0=1, normalize=False,
    device="cpu", n_jobs=1, seed=None,
) -> FectResult
| Parameter | Type | Default | Description |
|---|---|---|---|
| data | pl.DataFrame or pd.DataFrame | required | Panel data in long format |
| Y | str | required | Outcome column |
| D | str | required | Treatment indicator (0/1) |
| index | (str, str) | required | (unit_id, time) column names |
| X | list[str] | None | Time-varying covariate columns |
| W | str | None | Observation weight column |
| method | str | "ife" | "fe", "ife", "mc", "cfe" |
| force | str | "two-way" | "none", "unit", "time", "two-way" |
| r | int or (int, int) | 0 | Number of factors; tuple triggers CV over range |
| lam | float | None | MC nuclear-norm penalty; None selects by CV |
| CV | bool | True | Cross-validate over r or lam |
| k | int | 10 | CV folds |
| cv_prop | float | 0.1 | Fraction of control cells masked per CV fold |
| criterion | str | "mspe" | CV loss: "mspe", "gmspe", "mad" |
| se | bool | False | Compute standard errors |
| vartype | str | "bootstrap" | "bootstrap" or "jackknife" |
| nboots | int | 200 | Bootstrap replications |
| tol | float | 1e-7 | EM convergence tolerance |
| max_iter | int | 5000 | Max EM iterations |
| min_T0 | int | 1 | Minimum pre-treatment periods required per unit |
| normalize | bool | False | Normalize outcome by standard deviation |
| device | str | "cpu" | "cpu" or "gpu" (requires pyfector[gpu]) |
| n_jobs | int | 1 | Parallel workers for CV / bootstrap |
| seed | int | None | Random seed |
Data format
Long-format panel with one row per unit per period.
| Column | Type | Description |
|---|---|---|
| outcome | float | Outcome variable |
| unit id | int/str | Unit identifier |
| time | int | Time period |
| treatment | int | 0/1 treatment indicator |
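To make the layout concrete, the sketch below builds a tiny long-format panel as plain tuples and pivots it into (T, N) outcome and treatment matrices, the same orientation used by the matrix-valued result fields. The column values and variable names are made up for illustration, and this is not how pyfector itself reshapes data.

```python
import numpy as np

rows = [
    # (unit_id, year, outcome, treatment): one row per unit per period
    ("a", 2000, 1.0, 0), ("a", 2001, 1.5, 0), ("a", 2002, 4.0, 1),
    ("b", 2000, 0.5, 0), ("b", 2001, 0.9, 0), ("b", 2002, 1.2, 0),
]

units = sorted({u for u, _, _, _ in rows})
times = sorted({t for _, t, _, _ in rows})
u_pos = {u: j for j, u in enumerate(units)}   # unit -> column index
t_pos = {t: k for k, t in enumerate(times)}   # period -> row index

# Rows are periods, columns are units; cells absent from the data stay NaN.
Y = np.full((len(times), len(units)), np.nan)
D = np.zeros((len(times), len(units)), dtype=int)
for u, t, y, d in rows:
    Y[t_pos[t], u_pos[u]] = y
    D[t_pos[t], u_pos[u]] = d
```

Here unit "a" switches into treatment in 2002 and unit "b" is never treated, so D has a single nonzero cell.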
Output
fect() returns a FectResult with these fields:
| Field | Type | Description |
|---|---|---|
| att_avg | float | Overall ATT (weighted by cell counts) |
| att_avg_unit | float | Unit-averaged ATT |
| att_on | ndarray | Dynamic ATT by relative time |
| time_on | ndarray | Relative time indices |
| count_on | ndarray | Observation count per relative time |
| beta | ndarray | Covariate coefficients |
| Y_ct | ndarray | (T, N) counterfactual outcome matrix |
| eff | ndarray | (T, N) treatment effect matrix |
| factors | ndarray | (T, r) estimated factors (IFE) |
| loadings | ndarray | (N, r) estimated loadings (IFE) |
| sigma2 | float | Residual variance |
| r_cv | int | CV-selected r |
| lambda_cv | float | CV-selected lam |
| inference | InferenceResult | Bootstrap or jackknife results (if se=True) |
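The relationship between the effect matrix and the ATT summaries can be illustrated with a small hand-built example. This is a sketch under assumptions: relative time 1 is taken to be a unit's first treated period, every unit in the toy panel is eventually treated, and the arrays are built by hand rather than taken from a FectResult.

```python
import numpy as np

# (T, N) = (4, 2): unit 0 is treated from period 2, unit 1 from period 3.
eff = np.array([[0.1, 0.0],
                [0.2, -0.1],
                [1.0, 0.3],
                [2.0, 3.0]])
D = np.array([[0, 0],
              [0, 0],
              [1, 0],
              [1, 1]])
T, N = D.shape
treated = D == 1

# Cell-weighted ATT: every treated cell counts equally.
att_avg = eff[treated].mean()
# Unit-averaged ATT: average within each treated unit first, then across units.
att_avg_unit = np.mean([eff[treated[:, i], i].mean() for i in range(N)])

# Relative time: 1 at the first treated period, 0 and below before treatment.
first = D.argmax(axis=0)   # valid here because every unit is eventually treated
rel = np.arange(T)[:, None] - first[None, :] + 1
time_on = np.unique(rel)
att_on = np.array([eff[rel == k].mean() for k in time_on])
count_on = np.array([(rel == k).sum() for k in time_on])
```

Unit 0 contributes two treated cells and unit 1 only one, so the cell-weighted average (2.0) and the unit-averaged value (2.25) differ. Never-treated units would need separate handling, since argmax returns 0 for an all-zero column.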
Methods
| Method | Description |
|---|---|
| summary() | Formatted text summary |
| plot(kind, **kwargs) | Matplotlib figure. Kinds: gap, status, factors, counterfactual |
| diagnose(...) | Diagnostic tests (see below) |
Inference
When se=True, result.inference carries bootstrap or jackknife SEs and CIs for the overall ATT and per-period effects, plus the full bootstrap distribution (att_avg_boot, att_on_boot).
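A bootstrap distribution such as att_avg_boot can be summarised in a few lines. The sketch below assumes a percentile interval and a sample standard deviation; whether pyfector uses percentile or normal-approximation intervals is not stated here, and bootstrap_summary is a hypothetical helper. The draws are an evenly spaced stand-in, not real bootstrap output.

```python
import numpy as np

def bootstrap_summary(draws, alpha=0.05):
    """Standard error and percentile interval from bootstrap draws."""
    draws = np.asarray(draws, dtype=float)
    se = draws.std(ddof=1)
    lo, hi = np.quantile(draws, [alpha / 2, 1 - alpha / 2])
    return se, (lo, hi)

# Stand-in for a vector of bootstrap ATT estimates (201 evenly spaced values).
draws = np.linspace(1.8, 2.2, 201)
se, (ci_lo, ci_hi) = bootstrap_summary(draws, alpha=0.05)
print(round(float(se), 4), round(float(ci_lo), 3), round(float(ci_hi), 3))
```

With alpha=0.05 the interval spans the 2.5th to 97.5th percentiles of the draws, which matches the default alpha in the fect() signature.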
Diagnostics
diag = result.diagnose(
f_threshold=0.5,
tost_threshold=0.36,
placebo_period=(-5, -1),
loo=True,
)
diag.summary()
| Test | Output |
|---|---|
| Pre-trend F-test | f_stat, f_pval |
| Equivalence F-test | equiv_f_pval |
| TOST | tost_pvals |
| Placebo | placebo_att, placebo_pval |
| Carryover | carryover_att, carryover_pval |
| Leave-one-out | loo_max_change |
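For readers unfamiliar with TOST (two one-sided tests), the sketch below computes an equivalence p-value for a single pre-treatment estimate. It is an assumption-laden illustration: it uses a normal approximation, treats the equivalence bound as already expressed in outcome units, and the function tost_pvalue is hypothetical. pyfector's own test may scale the threshold or use a different reference distribution.

```python
from statistics import NormalDist

def tost_pvalue(est, se, bound):
    """Equivalence test of H0: |effect| >= bound against H1: |effect| < bound."""
    z = NormalDist()
    p_lower = 1.0 - z.cdf((est + bound) / se)   # one-sided test of effect <= -bound
    p_upper = z.cdf((est - bound) / se)         # one-sided test of effect >= +bound
    # Equivalence is declared only if both one-sided tests reject.
    return max(p_lower, p_upper)

# A pre-treatment estimate near zero with a small SE is well inside the bound.
p_inside = tost_pvalue(est=0.0, se=0.1, bound=0.36)
# An estimate sitting exactly on the bound cannot be declared equivalent.
p_edge = tost_pvalue(est=0.36, se=0.1, bound=0.36)
```

A small p-value is evidence that the pre-treatment effect lies within the bound, which is the reverse of the usual reading of a pre-trend F-test.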
Validation against R fect
Synthetic DGP, N=200, T=50. Point estimates:
| Scenario | pyfector | R fect | Difference |
|---|---|---|---|
| FE | 4.995583 | 4.995640 | -0.000057 |
| FE + X | 4.975683 | 4.975809 | -0.000126 |
| IFE r=2 | 3.010223 | 3.013046 | -0.002822 |
| IFE r=2 + X | 2.993155 | 2.996099 | -0.002944 |
| MC lambda=0.01 | 3.176671 | 3.176721 | -0.000050 |
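FE and MC estimates differ from R fect by at most about 1.3e-4 in absolute value, so they agree to three or four decimal places. IFE estimates differ by about 3e-3; these differences reflect SVD rotation non-uniqueness.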
Bootstrap SEs (500 reps):
| Scenario | pyfector | R fect | Ratio |
|---|---|---|---|
| FE | 0.020011 | 0.021291 | 0.94 |
| IFE r=2 | 0.017128 | 0.018382 | 0.93 |
Covariate coefficients agree to 6 decimal places. Comparison scripts are in benchmarks/.
Performance notes
Hot paths use vectorised numpy and a randomised truncated SVD for the interactive-FE update when r << min(T, N). Cross-validation and bootstrap replications parallelise over n_jobs via joblib. GPU execution is available through device="gpu" when CuPy is installed. See benchmarks/ for reproducible timing scripts.
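The randomised truncated SVD idea can be sketched as below, in the style of the Halko, Martinsson and Tropp range-finder. This is a generic illustration, not pyfector's code: the function name, the oversampling and power-iteration defaults, and the test matrix are all assumptions.

```python
import numpy as np

def randomized_svd(A, r, oversample=5, n_power=2, seed=None):
    """Approximate the top-r SVD of A using a random range sketch."""
    rng = np.random.default_rng(seed)
    k = min(r + oversample, min(A.shape))
    # Sketch the column space of A with a Gaussian test matrix.
    Q, _ = np.linalg.qr(A @ rng.normal(size=(A.shape[1], k)))
    # Power iterations sharpen the subspace when singular values decay slowly.
    for _ in range(n_power):
        Q, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Q)
    # Exact SVD of the small (k x n) projected matrix, mapped back through Q.
    Ub, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ Ub)[:, :r], s[:r], Vt[:r]

rng = np.random.default_rng(1)
A = rng.normal(size=(50, 2)) @ rng.normal(size=(2, 30))   # exactly rank 2
U, s, Vt = randomized_svd(A, r=2, seed=0)
recon = (U * s) @ Vt
s_exact = np.linalg.svd(A, compute_uv=False)[:2]
```

The saving comes from running the dense SVD on a k-by-N matrix rather than the full T-by-N one, which pays off when r is much smaller than min(T, N). For an exactly low-rank input like the one above, the sketch captures the column space and the reconstruction matches the input up to floating-point error.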
Testing
uv run pytest tests/ # full suite
uv run pytest tests/test_vs_r.py -v -s # R validation (requires R + fect)
API reference
- REFERENCE.md -- parameter tables, return types
- API docs -- searchable HTML reference (pdoc)
References
- Liu, L., Wang, Y., & Xu, Y. (2024). "A Practical Guide to Counterfactual Estimators for Causal Inference with Time-Series Cross-Sectional Data." American Journal of Political Science, 68(1), 160-176.
- Xu, Y. (2017). "Generalized Synthetic Control Method: Causal Inference with Interactive Fixed Effects Models." Political Analysis, 25(1), 57-76.
- Athey, S., Bayati, M., Doudchenko, N., Imbens, G., & Khosravi, K. (2021). "Matrix Completion Methods for Causal Panel Data Models." Journal of the American Statistical Association, 116(536), 1716-1730.
- Bai, J. (2009). "Panel Data Models with Interactive Fixed Effects." Econometrica, 77(4), 1229-1279.
License
MIT