
A Python package for Recentered Influence Function (RIF) regression


pyrifreg

A Python package for Recentered Influence Function (RIF) regression analysis. Provides tools for analyzing distributional effects in econometrics and data science applications. Bridges the gap between Python developers and econometricians to enable deeper unconditional distributional analysis.

Installation

You can install the package using pip:

pip install pyrifreg

Features

  • Implementation of Recentered Influence Function (RIF) regression
  • Support for various distributional statistics (mean, quantiles, variance, gini, etc.)
  • Easy-to-use API for regression analysis
  • Integration with pandas and scikit-learn

Background

Motivation: From Conditional to Unconditional Effects

Most readers will be familiar with conditional moments, e.g.

$$ \mathbb{E}[Y \mid X = x] $$

or conditional quantiles

$$ Q_\tau(Y \mid X = x). $$

But policy questions often concern how a change in some covariate $X$ shifts the entire (marginal or unconditional) distribution of an outcome $Y$. For instance:

  • Inequality analysis: How would increasing education change the 90th vs. 10th percentile of the wage distribution?
  • Welfare evaluation: What is the impact of a cash transfer on the variance or Gini of consumption?

Formally, let $F_Y$ be the baseline distribution of $Y$, and imagine a small intervention on $X$ that perturbs $F_Y$ to $G_Y$. For a scalar functional $\nu(\cdot)$ (e.g. mean, variance, quantile), define the unconditional effect:

$$ \Delta\nu =\nu(G_Y)-\nu(F_Y). $$

Our goal is to estimate how “marginal shifts” in $X$ translate into $\Delta\nu$.


Influence Functions (IF)

An influence function captures the first‐order sensitivity of a distributional functional $T(F)$ to an infinitesimal contamination at the point $y$. Concretely, define

$$ F_\varepsilon = (1-\varepsilon)F + \varepsilon\,\delta_y, $$

where $\delta_y$ is a point‐mass at $y$. Then

$$ \mathrm{IF}(y; T, F) = \lim_{\varepsilon\to 0} \frac{T(F_\varepsilon) - T(F)}{\varepsilon}. $$

Intuitively, the IF measures how much a single observation at $y$ pulls the estimate of $T$ away from its value under $F$.
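
The limit in this definition can be checked numerically. The sketch below is only an illustration and not part of pyrifreg: the helper names (`weighted_mean`, `weighted_var`, `numeric_if`) are made up for this example. It builds the contaminated distribution $F_\varepsilon$ from an empirical sample with a small $\varepsilon$ and compares the finite difference with the standard closed forms for the mean, $y - \mu$, and the variance, $(y-\mu)^2 - \sigma^2$.

```python
import numpy as np

def weighted_mean(values, weights):
    """Mean of a discrete distribution with the given probability weights."""
    return np.sum(weights * values)

def weighted_var(values, weights):
    """Variance of a discrete distribution with the given probability weights."""
    mu = weighted_mean(values, weights)
    return np.sum(weights * (values - mu) ** 2)

def numeric_if(functional, sample, y, eps=1e-6):
    """Finite-eps approximation of IF(y; T, F_hat) via the contaminated mixture."""
    n = sample.size
    base_weights = np.full(n, 1.0 / n)
    # F_eps puts mass (1 - eps)/n on each sample point and mass eps on y
    values = np.append(sample, y)
    weights = np.append((1.0 - eps) * base_weights, eps)
    return (functional(values, weights) - functional(sample, base_weights)) / eps

rng = np.random.default_rng(0)
sample = rng.normal(loc=2.0, scale=1.0, size=5_000)
y0 = 4.0  # contamination point (arbitrary choice for the illustration)

mu, sigma2 = sample.mean(), sample.var()
if_mean_numeric = numeric_if(weighted_mean, sample, y0)
if_var_numeric = numeric_if(weighted_var, sample, y0)

# Each line prints the numerical value next to the closed form
print(if_mean_numeric, y0 - mu)
print(if_var_numeric, (y0 - mu) ** 2 - sigma2)
```

The two numbers on each printed line should agree to several decimal places; the remaining gap comes from using a small but nonzero $\varepsilon$.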


Recentered Influence Functions (RIF)

Because $\mathbb{E}[\mathrm{IF}(Y;T,F)] = 0$, regressing $\mathrm{IF}(Y)$ on covariates cannot recover the level of $T(F)$. The recentered influence function adds back the functional itself:

$$ \mathrm{RIF}(y; T, F) = T(F) + \mathrm{IF}(y; T, F). $$

Its key property is:

$$ \mathbb{E}[\mathrm{RIF}(Y)] = T(F). $$

Thus $\mathrm{RIF}(Y)$ is an unbiased “pseudo‐outcome” for $T(F)$, which we can now relate to covariates.
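
As a quick check of this property, here is a minimal sketch for two functionals whose RIFs have simple closed forms. The function names `rif_mean` and `rif_variance` are illustrative assumptions and are not taken from the pyrifreg API.

```python
import numpy as np

def rif_mean(y):
    """RIF for the mean: mu + (y - mu) = y."""
    return np.asarray(y, dtype=float).copy()

def rif_variance(y):
    """RIF for the variance: sigma^2 + ((y - mu)^2 - sigma^2) = (y - mu)^2."""
    y = np.asarray(y, dtype=float)
    return (y - y.mean()) ** 2

rng = np.random.default_rng(1)
y = rng.lognormal(mean=0.0, sigma=0.5, size=10_000)

# Averaging the RIF recovers the plug-in statistic (up to floating-point error)
print(rif_mean(y).mean(), y.mean())
print(rif_variance(y).mean(), y.var())  # y.var() uses ddof=0, the plug-in variance
```

For the mean the RIF is just $y$ itself, which is why an ordinary OLS regression of $Y$ on $X$ can be read as the simplest RIF regression.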


RIF Regression

A RIF regression proceeds in two steps:

  1. Compute the plug‐in estimate $T(\widehat F)$ and the influence function $\mathrm{IF}(y_i;T,\widehat F)$ for each $i$.

  2. Form the RIF outcome $r_i = T(\widehat F) + \mathrm{IF}(y_i; T, \widehat F)$ and estimate the linear model

    $$ r_i = x_i^\top\beta +\varepsilon_i. $$

Under regularity conditions (smoothness of $T$, overlap in $X$, etc.), each component $\beta_j$ approximates the marginal effect of $X_j$ on the unconditional functional $T(F_Y)$.
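
The two steps can be sketched with plain NumPy. This is a hand-rolled illustration on simulated data, not the package's implementation; it uses the variance functional and a hypothetical binary covariate `d` that doubles the standard deviation of the outcome, so the group variances are 1 and 4 by construction.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000
d = rng.integers(0, 2, size=n)            # binary covariate (assumed for the example)
y = (1.0 + d) * rng.normal(size=n)        # sd is 1 when d == 0 and 2 when d == 1

# Step 1: plug-in statistic and influence function (variance functional)
mu_hat = y.mean()
var_hat = y.var()
influence = (y - mu_hat) ** 2 - var_hat

# Step 2: RIF outcome and OLS on [1, d]
rif = var_hat + influence
X = np.column_stack([np.ones(n), d])
beta, *_ = np.linalg.lstsq(X, rif, rcond=None)

print(beta)  # [intercept, slope on d]
```

In this design the slope on `d` should land near 3, the gap between the two group variances, and the fitted values average back to the overall plug-in variance because the model includes an intercept.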


Unconditional Quantile Regression (UQR)

Unconditional quantile regression is simply RIF regression with $T(F)=Q_\tau(Y)$. Then:

$$ r_i = Q_\tau(Y) +\frac{\tau - \mathbf{1}\{y_i \le Q_\tau\}}{f_Y(Q_\tau)}, $$

and regressing $r_i$ on $X$ yields estimates of how a marginal change in each $X_j$ shifts the $\tau$-th marginal quantile of $Y$.
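
A minimal sketch of this recipe follows, assuming a Gaussian kernel with a Silverman rule-of-thumb bandwidth for $f_Y(Q_\tau)$. The helpers `kde_at` and `rif_quantile` are hypothetical names for this example, and the density estimator pyrifreg actually uses may differ. The simulated outcome is a pure location shift in `x` with unit coefficient.

```python
import numpy as np

def kde_at(sample, point):
    """Gaussian kernel density estimate at one point, Silverman rule-of-thumb bandwidth."""
    n = sample.size
    q75, q25 = np.percentile(sample, [75, 25])
    h = 0.9 * min(sample.std(ddof=1), (q75 - q25) / 1.34) * n ** (-0.2)
    u = (point - sample) / h
    return np.exp(-0.5 * u ** 2).sum() / (n * h * np.sqrt(2.0 * np.pi))

def rif_quantile(y, tau):
    """RIF for the tau-th quantile, with plug-in estimates of Q_tau and f_Y(Q_tau)."""
    q = np.quantile(y, tau)
    indicator = (y <= q).astype(float)
    return q + (tau - indicator) / kde_at(y, q)

rng = np.random.default_rng(3)
n = 10_000
x = rng.normal(size=n)
y = x + rng.normal(size=n)                # pure location shift in x
X = np.column_stack([np.ones(n), x])

uqr_slopes = {}
for tau in (0.1, 0.5, 0.9):
    beta, *_ = np.linalg.lstsq(X, rif_quantile(y, tau), rcond=None)
    uqr_slopes[tau] = beta[1]

print(uqr_slopes)
```

Because shifting `x` moves the whole distribution of `y` one-for-one in this design, the estimated slopes should be close to 1 at every quantile. With heteroskedastic or skewed data they would generally differ across $\tau$.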

Traditional conditional quantile regression (Koenker & Bassett, 1978) estimates how covariates $X$ shift the conditional quantile $Q_\tau(Y\mid X)$, which in a location-shift model effectively amounts to examining the distribution of the residual $\varepsilon$. By contrast, unconditional quantile regression (UQR) assesses how marginal changes in $X$ alter the overall (marginal) distribution of $Y$. Personally, I find the conditional approach far less interpretable and meaningful for population-level questions like those above.


Inference

Inference in RIF regression has to account for its two-stage procedure. The first stage estimates the target functional $T(\widehat F)$, any necessary density (e.g. $f_Y(Q_\tau)$), and the influence values $\mathrm{IF}(y_i)$; the second stage regresses the recentered outcomes on covariates. Because the RIFs are themselves estimated, naïve OLS standard errors ignore the first-stage estimation error and must be adjusted. The package supports bootstrap estimation of standard errors.
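
One common way to do this is a pairs bootstrap that repeats both stages in every replicate. The sketch below is a generic illustration under that assumption; `bootstrap_se`, `ols`, and `rif_variance` are hypothetical helpers, and the package's own bootstrap routine may be organised differently.

```python
import numpy as np

def rif_variance(y):
    """RIF for the variance functional: (y - mean)^2."""
    return (y - y.mean()) ** 2

def ols(X, r):
    """Least-squares coefficients of r on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, r, rcond=None)
    return beta

def bootstrap_se(X, y, rif_fn, n_boot=200, seed=0):
    """Pairs-bootstrap standard errors for RIF-regression coefficients."""
    rng = np.random.default_rng(seed)
    n = y.size
    draws = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample rows with replacement
        # Recompute the RIF inside each replicate so first-stage error is reflected
        draws[b] = ols(X[idx], rif_fn(y[idx]))
    return draws.std(axis=0, ddof=1)

rng = np.random.default_rng(4)
n = 2_000
d = rng.integers(0, 2, size=n)             # simulated binary covariate
y = (1.0 + d) * rng.normal(size=n)
X = np.column_stack([np.ones(n), d])

beta_hat = ols(X, rif_variance(y))
se_hat = bootstrap_se(X, y, rif_variance)
print(beta_hat, se_hat)
```

The key design choice is that `rif_fn` is called on each resampled `y`, rather than resampling a RIF computed once on the full sample; the latter would treat the first-stage estimates as known.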

Quick Start

import numpy as np
import pandas as pd
from pyrifreg import RIFRegression

# Create sample data
X = np.random.randn(1000, 2)
y = np.random.randn(1000)

# Initialize and fit RIF regression
median_rif = RIFRegression(statistic='quantile', q=0.5)
median_rif.fit(X, y)

# Get regression results
results = median_rif.summary()
print(results)

You can find more examples in example.py.

References

  • Firpo, S., Fortin, N. M., & Lemieux, T. (2009). Unconditional Quantile Regressions. Econometrica, 77(3), 953–973.
  • Hampel, F. R. (1974). The Influence Curve and Its Role in Robust Estimation. Journal of the American Statistical Association, 69(346), 383–393.
  • Koenker, R., & Bassett, G., Jr. (1978). Regression Quantiles. Econometrica, 46(1), 33–50.
  • Rios-Avila, F. (2020). Recentered influence functions (RIFs) in Stata: RIF regression and RIF decomposition. The Stata Journal, 20(1), 51–94.

License

This project is licensed under the MIT License - see the LICENSE file for details.
