pyrifreg

A Python package for Recentered Influence Function (RIF) regression analysis. Provides tools for analyzing distributional effects in econometrics and data science applications.

Installation

You can install the package using pip:

pip install pyrifreg

Features

  • Implementation of Recentered Influence Function (RIF) regression
  • Support for various distributional statistics (mean, quantiles, variance, Gini coefficient, etc.)
  • Easy-to-use API for regression analysis
  • Integration with pandas and scikit-learn

Quick Start

import numpy as np
import pandas as pd
from pyrifreg import RIFRegression

# Create sample data
X = np.random.randn(1000, 2)
y = np.random.randn(1000)

# Initialize and fit RIF regression
median_rif = RIFRegression(statistic='quantile', q=0.5)
median_rif.fit(X, y)

# Get regression results
results = median_rif.summary()
print(results)

You can find more examples in example.py.

Examples

You can find detailed usage examples in the examples/ directory.

Background

From Conditional to Unconditional Effects

Many regression models focus on conditional statistics like:

$$ \mathbb{E}[Y \mid X = x] $$

or conditional quantiles

$$ Q_\tau(Y \mid X = x). $$

But policy questions often require understanding how a variable like education or income influences the entire distribution of an outcome, not just its conditional mean or conditional quantiles. For example:

  • How would expanding access to college change the 90th percentile of the wage distribution?
  • What is the effect of a tax policy on income inequality or the Gini index?

Instead of looking at changes within subgroups (conditional on $X$), RIF regression helps us estimate how changes in covariates shift the overall, or unconditional, distribution of $Y$.

Let $F_Y$ be the original distribution of $Y$, and suppose an intervention shifts it to $G_Y$. For a statistic $\nu$ (like the mean, a quantile, or variance), we want to estimate:

$$ \Delta\nu = \nu(G_Y) - \nu(F_Y), $$

i.e., how that statistic changes when the distribution shifts. RIF regression provides a way to estimate how different variables contribute to such shifts.
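To make $\Delta\nu$ concrete, here is a minimal sketch that simulates a baseline and a shifted outcome distribution and compares a few statistics across them. The data and the helper names (`delta_nu`, `p10`, `p90`) are hypothetical and are not part of pyrifreg.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: F_Y is a baseline wage distribution; G_Y is the same
# population after a 10% raise for everyone above the baseline median.
y_before = rng.lognormal(mean=3.0, sigma=0.5, size=50_000)
y_after = np.where(y_before > np.median(y_before), 1.10 * y_before, y_before)

def delta_nu(stat, before, after):
    """Change in a distributional statistic: nu(G_Y) - nu(F_Y)."""
    return stat(after) - stat(before)

def p10(y):
    return np.quantile(y, 0.10)

def p90(y):
    return np.quantile(y, 0.90)

delta_mean = delta_nu(np.mean, y_before, y_after)
delta_p10 = delta_nu(p10, y_before, y_after)
delta_p90 = delta_nu(p90, y_before, y_after)
delta_var = delta_nu(np.var, y_before, y_after)

print(f"change in mean:            {delta_mean:.3f}")
print(f"change in 10th percentile: {delta_p10:.3f}")
print(f"change in 90th percentile: {delta_p90:.3f}")
print(f"change in variance:        {delta_var:.3f}")
```

The same intervention leaves the 10th percentile untouched while moving the mean, the 90th percentile, and the variance, which is why a single conditional-mean coefficient cannot summarize it.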

Influence Functions (IF)

The influence function measures how sensitive a statistic is to a small change in the data. More precisely, it tells us how much an individual observation $y$ influences a statistic like the mean or a quantile.

Formally, imagine a slightly perturbed distribution:

$$ F_\varepsilon = (1 - \varepsilon) F + \varepsilon\, \delta_y, $$

where $\delta_y$ is a point mass at $y$. Then the influence function is:

$$ \mathrm{IF}(y; T, F) = \lim_{\varepsilon \to 0} \frac{T(F_\varepsilon) - T(F)}{\varepsilon}. $$

This gives us a first-order approximation of how $y$ affects the statistic $T$.
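The limit can be checked numerically by taking $F$ to be an empirical distribution and using a small but finite $\varepsilon$. The sketch below does this for the mean and the variance, whose influence functions are known in closed form: $y - \mu$ and $(y - \mu)^2 - \sigma^2$. The function names are illustrative assumptions, not part of pyrifreg.

```python
import numpy as np

def weighted_mean(values, weights):
    return np.sum(weights * values)

def weighted_variance(values, weights):
    mu = weighted_mean(values, weights)
    return np.sum(weights * (values - mu) ** 2)

def numerical_if(stat, sample, y, eps=1e-6):
    """Finite-difference approximation of IF(y; T, F) for the empirical F."""
    n = sample.size
    values = np.append(sample, y)
    # F puts mass 1/n on each sample point and no mass on y.
    w_base = np.append(np.full(n, 1.0 / n), 0.0)
    # F_eps puts mass (1 - eps)/n on each sample point and mass eps on y.
    w_eps = np.append(np.full(n, (1.0 - eps) / n), eps)
    return (stat(values, w_eps) - stat(values, w_base)) / eps

rng = np.random.default_rng(0)
sample = rng.normal(loc=2.0, scale=1.5, size=2_000)
y = 6.0  # the point whose influence we measure

mu, sigma2 = sample.mean(), sample.var()
if_mean_numeric = numerical_if(weighted_mean, sample, y)
if_var_numeric = numerical_if(weighted_variance, sample, y)
if_mean_exact = y - mu
if_var_exact = (y - mu) ** 2 - sigma2

print(if_mean_numeric, if_mean_exact)
print(if_var_numeric, if_var_exact)
```

The numerical and closed-form values should agree up to an error of order $\varepsilon$.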

Recentered Influence Functions (RIF)

Because the influence function averages to zero over $F$, regressing it on covariates would say nothing about the level of the statistic. To fix this, we “recenter” it by adding the original statistic back:

$$ \mathrm{RIF}(y; T, F) = T(F) + \mathrm{IF}(y; T, F). $$

Now, the expected value of the RIF is equal to the statistic itself:

$$ \mathbb{E}_F[\mathrm{RIF}(Y; T, F)] = T(F). $$

This makes it a useful outcome variable for regression, allowing us to relate changes in the statistic $T$ to changes in covariates.
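As a quick check of this property, the sketch below builds the RIF for the mean and for the variance from their closed-form influence functions and confirms that each averages to the statistic itself. The helper names `rif_mean` and `rif_variance` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def rif_mean(y):
    """RIF for the mean: mu + (y - mu), which is just y."""
    y = np.asarray(y, dtype=float)
    mu = y.mean()
    return mu + (y - mu)

def rif_variance(y):
    """RIF for the variance: sigma^2 + ((y - mu)^2 - sigma^2) = (y - mu)^2."""
    y = np.asarray(y, dtype=float)
    mu, sigma2 = y.mean(), y.var()
    return sigma2 + ((y - mu) ** 2 - sigma2)

rng = np.random.default_rng(1)
y = rng.exponential(scale=2.0, size=10_000)

print(rif_mean(y).mean(), y.mean())      # both equal the sample mean
print(rif_variance(y).mean(), y.var())   # both equal the sample variance
```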

RIF Regression

RIF regression works in two main steps:

  1. Estimate the target statistic $T(F)$ (e.g. median or Gini) and compute the influence value for each observation.

  2. Construct the RIF pseudo-outcome $r_i = \mathrm{RIF}(y_i; T, F)$ for each data point and regress it on $X$ using linear regression:

    $$ r_i = x_i^\top \beta + \varepsilon_i. $$

Each regression coefficient $\beta_j$ can then be interpreted as the approximate effect on the statistic of interest of a small increase in the mean of $X_j$, holding the other covariates fixed.
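The two steps can be sketched with plain NumPy, here for the variance, whose influence function has a simple closed form. This is an illustration under simulated data, not the package's internal implementation; `variance_rif` and `rif_ols` are hypothetical names.

```python
import numpy as np

def variance_rif(y):
    """Step 1: estimate T(F) = Var(Y) and the influence value of each point."""
    mu = y.mean()
    sigma2 = y.var()                      # plug-in estimate of T(F)
    influence = (y - mu) ** 2 - sigma2    # IF(y_i; Var, F)
    return sigma2 + influence             # RIF pseudo-outcome r_i

def rif_ols(X, rif):
    """Step 2: regress the pseudo-outcome on X (plus an intercept) by OLS."""
    design = np.column_stack([np.ones(len(rif)), X])
    beta, *_ = np.linalg.lstsq(design, rif, rcond=None)
    return design, beta

rng = np.random.default_rng(2)
n = 5_000
X = rng.normal(size=(n, 2))
# Hypothetical data: the second covariate widens the spread of y.
y = 1.0 + 0.5 * X[:, 0] + np.exp(0.3 * X[:, 1]) * rng.normal(size=n)

r = variance_rif(y)
design, beta = rif_ols(X, r)

print("coefficients (intercept, x1, x2):", beta)
# With an intercept, the fitted values average back to the statistic itself.
print(design.mean(axis=0) @ beta, y.var())
```

Because the regression includes an intercept, evaluating the fitted line at the covariate means reproduces the estimated statistic, which is what lets the coefficients be read as contributions to it.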

Unconditional Quantile Regression (UQR)

UQR is a special case of RIF regression, where the statistic of interest is an unconditional quantile $Q_\tau(Y)$. For each observation $y_i$, we compute:

$$ r_i = Q_\tau(Y) + \frac{\tau - \mathbf{1}\{y_i \le Q_\tau\}}{f_Y(Q_\tau)}, $$

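where $Q_\tau$ is shorthand for $Q_\tau(Y)$ and $f_Y(Q_\tau)$ is the density of $Y$ evaluated at the $\tau$-th quantile, which has to be estimated in practice (for example, with a kernel density estimator). Regressing $r_i$ on $X$ tells us how each covariate shifts the $\tau$-th quantile of the overall outcome distribution.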

This is in contrast to conditional quantile regression (Koenker & Bassett, 1978), which examines changes in $Q_\tau(Y \mid X)$, a different and often less intuitive object for understanding broad policy effects.
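The quantile RIF above can be computed by hand as follows. This sketch assumes a Gaussian kernel with Silverman's rule-of-thumb bandwidth for the density estimate; the package may use a different estimator, and `kernel_density_at` and `quantile_rif` are hypothetical helper names.

```python
import numpy as np

def kernel_density_at(point, y):
    """Gaussian kernel estimate of f_Y(point) with a Silverman bandwidth."""
    q75, q25 = np.percentile(y, [75, 25])
    spread = min(y.std(ddof=1), (q75 - q25) / 1.34)
    h = 0.9 * spread * y.size ** (-1 / 5)
    z = (point - y) / h
    return np.mean(np.exp(-0.5 * z ** 2)) / (h * np.sqrt(2 * np.pi))

def quantile_rif(y, tau):
    """r_i = Q_tau + (tau - 1{y_i <= Q_tau}) / f_Y(Q_tau)."""
    q_tau = np.quantile(y, tau)
    f_q = kernel_density_at(q_tau, y)
    return q_tau + (tau - (y <= q_tau)) / f_q

rng = np.random.default_rng(3)
n = 10_000
x = rng.normal(size=n)
y = 1.0 + x + rng.normal(size=n)  # hypothetical pure location-shift model

tau = 0.9
r = quantile_rif(y, tau)
design = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(design, r, rcond=None)

print("distinct RIF values:", np.unique(r))
print("UQR coefficients (intercept, x):", beta)
```

The pseudo-outcome takes only two values, one for observations at or below the quantile and one for those above it. In this simulated model a unit shift in `x` moves every quantile of `y` by one unit, so the slope should come out close to 1.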

Confidence Intervals

Since the RIF is itself estimated in a first step before the regression, the usual OLS standard errors ignore that estimation error and can be misleading. To correct this, inference proceeds in two stages:

  1. Estimate the statistic $T$, the influence function, and any needed density estimates.
  2. Run the regression and compute corrected standard errors using bootstrap.

The package includes support for bootstrap inference out of the box.
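For intuition, a nonparametric bootstrap for a median RIF regression might look like the sketch below. It resamples rows with replacement and repeats both stages (quantile, density, RIF, then OLS) inside every replicate so that first-step estimation error is reflected in the spread of the coefficients. This is an assumption about how such a procedure can be written, not a description of the package's own bootstrap; all function names are hypothetical.

```python
import numpy as np

def quantile_rif(y, tau):
    """Quantile RIF with a Gaussian-kernel, Silverman-bandwidth density."""
    q_tau = np.quantile(y, tau)
    q75, q25 = np.percentile(y, [75, 25])
    h = 0.9 * min(y.std(ddof=1), (q75 - q25) / 1.34) * y.size ** (-1 / 5)
    f_q = np.mean(np.exp(-0.5 * ((q_tau - y) / h) ** 2)) / (h * np.sqrt(2 * np.pi))
    return q_tau + (tau - (y <= q_tau)) / f_q

def rif_ols(X, y, tau):
    """Both stages: build the RIF, then regress it on X with an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, quantile_rif(y, tau), rcond=None)
    return beta

def bootstrap_rif(X, y, tau, n_boot=200, seed=0):
    """Bootstrap standard errors and 95% percentile intervals."""
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = np.empty((n_boot, X.shape[1] + 1))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample rows with replacement
        draws[b] = rif_ols(X[idx], y[idx], tau)
    se = draws.std(axis=0, ddof=1)
    ci = np.percentile(draws, [2.5, 97.5], axis=0)
    return se, ci

rng = np.random.default_rng(4)
n = 2_000
X = rng.normal(size=(n, 1))
y = 1.0 + X[:, 0] + rng.normal(size=n)  # hypothetical simulated data

beta_hat = rif_ols(X, y, tau=0.5)
se, ci = bootstrap_rif(X, y, tau=0.5)
print("estimate:", beta_hat)
print("bootstrap SE:", se)
print("95% percentile CI:", ci)
```

Recomputing the RIF within each resample, rather than resampling a fixed pseudo-outcome, is the design choice that lets the intervals account for the first stage.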

References

  • Firpo, S., Fortin, N. M., & Lemieux, T. (2009). Unconditional Quantile Regressions. Econometrica, 77(3), 953–973.
  • Koenker, R., & Bassett, G., Jr. (1978). Regression Quantiles. Econometrica, 46(1), 33–50.
  • Rios-Avila, F. (2020). Recentered influence functions (RIFs) in Stata: RIF regression and RIF decomposition. The Stata Journal, 20(1), 51–94.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Citation

To cite this package in publications, please use the following BibTeX entry:

@misc{yasenov2025pyrifreg,
  author       = {Vasco Yasenov},
  title        = {pyrifreg: Python Tools for Recentered Influence Function (RIF) Regression},
  year         = {2025},
  howpublished = {\url{https://github.com/vyasenov/pyrifreg}},
  note         = {Version 0.1.0}
}
