A Python package for Recentered Influence Function (RIF) regression
pyrifreg
A Python package for Recentered Influence Function (RIF) regression analysis. Provides tools for analyzing distributional effects in econometrics and data science applications. Bridges the gap between Python developers and econometricians to enable deeper unconditional distributional analysis.
Installation
You can install the package using pip:
pip install pyrifreg
Features
- Implementation of Recentered Influence Function (RIF) regression
- Support for various distributional statistics (mean, quantiles, variance, gini, etc.)
- Easy-to-use API for regression analysis
- Integration with pandas and scikit-learn
Background
Motivation: From Conditional to Unconditional Effects
Most of you are familiar with conditional moments, e.g.
$$ \mathbb{E}[Y \mid X = x] $$
or conditional quantiles
$$ Q_\tau(Y \mid X = x). $$
But policy questions often concern how a change in some covariate $X$ shifts the entire (marginal or unconditional) distribution of an outcome $Y$. For instance:
- Inequality analysis: How would increasing education change the 90th vs. 10th percentile of the wage distribution?
- Welfare evaluation: What is the impact of a cash transfer on the variance or Gini of consumption?
Formally, let $F_Y$ be the baseline distribution of $Y$, and imagine a small intervention on $X$ that perturbs $F_Y$ to $G_Y$. For a scalar functional $\nu(\cdot)$ (e.g. mean, variance, quantile), define the unconditional effect:
$$ \Delta\nu =\nu(G_Y)-\nu(F_Y). $$
Our goal is to estimate how “marginal shifts” in $X$ translate into $\Delta\nu$.
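To make the definition of $\Delta\nu$ concrete, here is a small simulation sketch. The data-generating process, the shift size of 0.1, and all variable names are assumptions chosen for illustration; none of them come from the package.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical data-generating process (illustration only):
# the covariate x moves both the location and the spread of y.
x = rng.uniform(0.0, 2.0, size=n)
e = rng.normal(size=n)

def outcome(x, e):
    return 1.0 + 0.5 * x + (1.0 + 0.3 * x) * e

y_baseline = outcome(x, e)        # draws from F_Y
y_shifted = outcome(x + 0.1, e)   # draws from G_Y after a small shift in x

functionals = {
    "mean": np.mean,
    "variance": np.var,
    "q10": lambda y: np.quantile(y, 0.10),
    "q90": lambda y: np.quantile(y, 0.90),
}

# Delta nu = nu(G_Y) - nu(F_Y), one value per functional
delta = {name: float(nu(y_shifted) - nu(y_baseline)) for name, nu in functionals.items()}
for name, value in delta.items():
    print(f"{name:>8}: {value:+.4f}")
```

In this toy setup the same shift in $X$ moves different functionals by different amounts, which is the kind of heterogeneity that a single conditional-mean coefficient does not show.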
Influence Functions (IF)
An influence function captures the first‐order sensitivity of a distributional functional $T(F)$ to an infinitesimal contamination at the point $y$. Concretely, define
$$ F_\varepsilon = (1-\varepsilon)F + \varepsilon\,\delta_y, $$
where $\delta_y$ is a point‐mass at $y$. Then
$$ \mathrm{IF}(y; T, F) = \lim_{\varepsilon\to 0} \frac{T(F_\varepsilon) - T(F)}{\varepsilon}. $$
IFs tell us how much a single observation at $y$ “pulls” the estimator of $T$ away from its nominal value.
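The limit in the definition can be checked numerically. The sketch below contaminates an empirical distribution with a small mass at a point and compares the finite-difference quotient with the textbook closed forms $y-\mu$ for the mean and $(y-\mu)^2-\sigma^2$ for the variance. The helper names are hypothetical and are not part of pyrifreg.

```python
import numpy as np

def weighted_mean(values, weights):
    """Mean of a discrete distribution given support points and probabilities."""
    return float(np.sum(weights * values))

def weighted_variance(values, weights):
    """Variance of a discrete distribution given support points and probabilities."""
    mu = np.sum(weights * values)
    return float(np.sum(weights * (values - mu) ** 2))

def numerical_if(functional, sample, y, eps=1e-6):
    """Finite-difference approximation of IF(y; T, F_hat) on the empirical distribution."""
    n = len(sample)
    base_w = np.full(n, 1.0 / n)
    points = np.append(sample, y)
    w_eps = np.append((1.0 - eps) * base_w, eps)  # (1 - eps) * F_hat + eps * delta_y
    w_0 = np.append(base_w, 0.0)                  # the uncontaminated F_hat
    return (functional(points, w_eps) - functional(points, w_0)) / eps

rng = np.random.default_rng(0)
sample = rng.normal(size=500)
y0 = 2.0

if_mean_numeric = numerical_if(weighted_mean, sample, y0)
if_mean_analytic = y0 - sample.mean()

if_var_numeric = numerical_if(weighted_variance, sample, y0)
if_var_analytic = (y0 - sample.mean()) ** 2 - sample.var()

print(if_mean_numeric, if_mean_analytic)
print(if_var_numeric, if_var_analytic)
```

The two columns should agree up to the finite-difference error, which is of order `eps`.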
Recentered Influence Functions (RIF)
Since $\mathbb{E}[\mathrm{IF}(Y;T,F)] = 0$, we cannot regress $\mathrm{IF}(Y)$ directly to target $T(F)$. The recentered influence function adds back the functional itself:
$$ \mathrm{RIF}(y; T, F) = T(F)+\mathrm{IF}(y; T, F). $$
Its key property is:
$$ \mathbb{E}[\mathrm{RIF}(Y)] = T(F). $$
Thus $\mathrm{RIF}(Y)$ is an unbiased “pseudo‐outcome” for $T(F)$, which we can now relate to covariates.
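The recentering property is easy to verify on a sample. The sketch below builds the RIF for the mean and for the variance from their standard influence functions and checks that each averages back to the plug-in statistic. The function names are illustrative assumptions, not the package's API.

```python
import numpy as np

def rif_mean(y):
    """RIF for the mean: mu + (y - mu), which is just y."""
    mu = y.mean()
    return mu + (y - mu)

def rif_variance(y):
    """RIF for the variance: sigma^2 + ((y - mu)^2 - sigma^2)."""
    mu = y.mean()
    sigma2 = y.var()  # plug-in variance (divides by n)
    return sigma2 + ((y - mu) ** 2 - sigma2)

rng = np.random.default_rng(1)
y = rng.lognormal(mean=0.0, sigma=0.5, size=2_000)

rif_m = rif_mean(y)
rif_v = rif_variance(y)

# The sample average of each RIF reproduces the plug-in statistic.
print(rif_m.mean(), y.mean())
print(rif_v.mean(), y.var())
```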
RIF Regression
A RIF regression proceeds in two steps:
- Compute the plug‐in estimate $T(\widehat F)$ and the influence function $\mathrm{IF}(y_i;T,\widehat F)$ for each $i$.
- Form the RIF outcome $r_i = T(\widehat F) + \mathrm{IF}(y_i;T,\widehat F)$ and estimate the linear model
$$ r_i = x_i^\top\beta +\varepsilon_i. $$
Under regularity conditions (smoothness of $T$, overlap in $X$, etc.), each component $\beta_j$ approximates the marginal effect of $X_j$ on the unconditional functional $T(F_Y)$.
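The two steps above can be sketched with plain NumPy, here for the variance functional. This is a minimal illustration under assumed simulated data. It does not reproduce how `RIFRegression` is implemented internally, and `rif_variance` and `rif_ols` are hypothetical helper names.

```python
import numpy as np

def rif_variance(y):
    """Step 1: plug-in variance plus the influence function of each observation."""
    mu = y.mean()
    sigma2 = y.var()
    return sigma2 + ((y - mu) ** 2 - sigma2)

def rif_ols(X, rif):
    """Step 2: OLS of the RIF outcome on the covariates (an intercept is added here)."""
    design = np.column_stack([np.ones(len(rif)), X])
    beta, *_ = np.linalg.lstsq(design, rif, rcond=None)
    return beta, design

rng = np.random.default_rng(2)
n = 5_000
X = rng.normal(size=(n, 2))
# Hypothetical outcome used only to have something to regress.
y = 1.0 + 2.0 * X[:, 0] + rng.normal(size=n)

r = rif_variance(y)
beta, design = rif_ols(X, r)
print("coefficients (intercept, x1, x2):", np.round(beta, 3))

# With an intercept in the model, the fitted values average back to T(F_hat).
fitted_mean = float((design @ beta).mean())
print(fitted_mean, float(y.var()))
```

Because OLS residuals sum to zero when an intercept is included, the average fitted value equals the plug-in statistic, which is a handy sanity check for any RIF regression.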
Unconditional Quantile Regression (UQR)
Unconditional quantile regression is simply RIF regression with $T(F)=Q_\tau(Y)$. Then:
$$ r_i = Q_\tau(Y) +\frac{\tau - \mathbf{1}\{y_i \le Q_\tau\}}{f_Y(Q_\tau)}, $$
and regressing $r_i$ on $X$ yields estimates of how a marginal change in each $X_j$ shifts the $\tau$-th marginal quantile of $Y$.
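As a rough sketch of the quantile case, the code below estimates $f_Y(Q_\tau)$ with a hand-written Gaussian kernel and a normal-reference bandwidth, builds the RIF, and runs OLS. The density estimator, the bandwidth rule, and the helper names are assumptions made for this example; the package may use different choices.

```python
import numpy as np

def gaussian_kde_at(y, point):
    """Gaussian kernel density estimate of f_Y at one point (normal-reference bandwidth)."""
    n = len(y)
    h = 1.06 * y.std(ddof=1) * n ** (-1 / 5)
    z = (y - point) / h
    return float(np.exp(-0.5 * z ** 2).sum() / (n * h * np.sqrt(2.0 * np.pi)))

def rif_quantile(y, tau):
    """RIF for the tau-th quantile: q + (tau - 1{y <= q}) / f_Y(q)."""
    q = np.quantile(y, tau)
    f_q = gaussian_kde_at(y, q)
    return q + (tau - (y <= q).astype(float)) / f_q

def uqr(X, y, tau):
    """OLS of the quantile RIF on the covariates, with an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, rif_quantile(y, tau), rcond=None)
    return beta

rng = np.random.default_rng(3)
n = 5_000
X = rng.normal(size=(n, 2))
# Hypothetical location-shift model: x1 moves y one-for-two, x2 has no effect.
y = 1.0 + 2.0 * X[:, 0] + rng.normal(size=n)

for tau in (0.1, 0.5, 0.9):
    print(f"tau={tau:.1f}:", np.round(uqr(X, y, tau), 3))

beta_median = uqr(X, y, 0.5)
```

In a pure location-shift design like this one, the coefficient on the first covariate should be close to 2 at every quantile, up to sampling noise and the smoothing bias of the density estimate.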
Traditional conditional quantile regression (Koenker & Bassett, 1978) estimates how covariates $X$ shift the conditional quantile $Q_\tau(Y\mid X)$, which effectively amounts to examining the unconditional distribution of the residual $\varepsilon$. By contrast, unconditional quantile regression (UQR) assesses how marginal changes in $X$ directly alter the overall distribution of $Y$. Personally, I find the conditional approach far less interpretable and meaningful.
Inference
Inference in RIF regression proceeds via a two‐stage procedure. The first stage estimates the target functional $T(\widehat F)$, any necessary density (e.g. $f_Y(Q_\tau)$), and the influence values $\mathrm{IF}(y_i)$. The second stage regresses the recentered outcomes on covariates. Because the RIFs are themselves estimated, naïve OLS standard errors are inconsistent and must be adjusted. The package supports bootstrap estimation of standard errors.
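One common way to account for the first stage is a pairs bootstrap that recomputes the RIF inside every resample. The sketch below does this by hand for the variance functional. It is not the package's bootstrap interface, which is not shown here, and the function names and the choice of 200 replications are illustrative assumptions.

```python
import numpy as np

def rif_variance(y):
    """RIF for the variance, using the plug-in mean and variance of the given sample."""
    mu = y.mean()
    sigma2 = y.var()
    return sigma2 + ((y - mu) ** 2 - sigma2)

def rif_ols(X, y):
    """Both stages in one call: build the RIF from y, then run OLS with an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, rif_variance(y), rcond=None)
    return beta

def bootstrap_se(X, y, n_boot=200, seed=0):
    """Pairs-bootstrap standard errors for the RIF regression coefficients."""
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = np.empty((n_boot, X.shape[1] + 1))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample (x_i, y_i) pairs with replacement
        # The RIF is rebuilt on each resample, so first-stage error enters the spread.
        draws[b] = rif_ols(X[idx], y[idx])
    return draws.std(axis=0, ddof=1)

rng = np.random.default_rng(4)
n = 1_000
X = rng.normal(size=(n, 2))
y = 1.0 + 2.0 * X[:, 0] + rng.normal(size=n)  # hypothetical data

beta = rif_ols(X, y)
se = bootstrap_se(X, y)
print("coefficients:", np.round(beta, 3))
print("bootstrap SE:", np.round(se, 3))
```

Resampling the RIF values directly, without recomputing them, would treat the first-stage estimates as known and would tend to understate the uncertainty.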
Quick Start
import numpy as np
import pandas as pd
from pyrifreg import RIFRegression
# Create sample data
X = np.random.randn(1000, 2)
y = np.random.randn(1000)
# Initialize and fit RIF regression
median_rif = RIFRegression(statistic='quantile', q=0.5)
median_rif.fit(X, y)
# Get regression results
results = median_rif.summary()
print(results)
You can find more examples in example.py.
References
- Firpo, S., Fortin, N. M., & Lemieux, T. (2009). Unconditional Quantile Regressions. Econometrica, 77(3), 953–973.
- Hampel, F. R. (1974). The Influence Curve and Its Role in Robust Estimation. Journal of the American Statistical Association, 69(346), 383–393.
- Koenker, R., & Bassett, G., Jr. (1978). Regression Quantiles. Econometrica, 46(1), 33–50.
- Rios-Avila, F. (2020). Recentered influence functions (RIFs) in Stata: RIF regression and RIF decomposition. The Stata Journal, 20(1), 51–94.
License
This project is licensed under the MIT License - see the LICENSE file for details.