Polyads: Statistical Inference in Large Multi-way Networks
A Python package for estimating multi-way gravity models with high-dimensional fixed effects. The polyad estimator addresses the incidental parameter problem in Poisson models by conditioning on sufficient statistics for fixed effects.
Overview
Traditional Poisson pseudo-maximum likelihood (PPML) estimators may suffer from the incidental parameter problem when the number of fixed effects grows with the sample size. This package implements a conditional likelihood approach that eliminates the fixed effects from the estimation problem entirely.
Key Features:
- Handles arbitrary fixed effects structures (two-way, three-way, four-way, etc.)
- Computationally efficient for sparse network data
- Provides asymptotically valid inference
Quick Start
import numpy as np
import pandas as pd
from src.polyads.data import generate_data
from src.polyads.model import PolyadEstimator

# Generate synthetic three-way gravity data
beta_true = np.array([1.0, -0.5])
df, X = generate_data(
    seed=1,
    n_ds=(100, 100, 100),            # Dimensions: n1 × n2 × n3
    c=-5,                            # Baseline intensity
    shape=np.inf,                    # Poisson model
    beta=beta_true,
    groups=[[0, 1], [0, 2], [1, 2]]  # Three-way fixed effects
)

# Fit the model
columns = df.columns.tolist()
estimator = PolyadEstimator(use_tqdm=True)
estimator.fit(
    df=df,
    indices=columns[:-1],  # Index columns
    values=columns[-1],    # Count column
    beta_init=np.zeros(2),
    X=X
)

# Display results
estimator.summary()
The Method
Multi-Way Gravity Model
Consider count data indexed by D dimensions:
log λ_{i₁,...,iD} = β'X_{i₁,...,iD} + Σ_g θ^g_{g(i)}
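where β is the vector of structural parameters, g ranges over the groups of dimensions that define the fixed effects (for example [0,1], [0,2], [1,2] in the three-way case), g(i) denotes the components of the index i = (i₁,...,iD) that belong to g, and θ^g are the corresponding fixed effects.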
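To make the notation concrete, the following sketch builds the intensity array for the three-way case with pair fixed effects using NumPy broadcasting. It illustrates the formula only and is not the package's `generate_data`; the names `u`, `v`, `w` and the array sizes are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2, n3, p = 4, 5, 6, 2            # illustrative sizes (assumed)
beta = np.array([1.0, -0.5])

X = rng.normal(size=(n1, n2, n3, p))  # covariates X_{ijt}
u = rng.normal(size=(n1, n2))         # fixed effects for group [0, 1]
v = rng.normal(size=(n1, n3))         # fixed effects for group [0, 2]
w = rng.normal(size=(n2, n3))         # fixed effects for group [1, 2]

# log lambda_{ijt} = beta'X_{ijt} + u_{ij} + v_{it} + w_{jt}
log_lam = X @ beta + u[:, :, None] + v[:, None, :] + w[None, :, :]
lam = np.exp(log_lam)

# Poisson counts drawn independently given covariates and fixed effects
y = rng.poisson(lam)
```

Each fixed-effect array is indexed by the dimensions in its group and broadcast over the remaining one, which is what the term θ^g_{g(i)} expresses.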
The Incidental Parameter Problem
When the number of fixed effects grows with sample size:
- Two-way models (D=2): PPML is consistent
- Three-way and higher models (D≥3): PPML yields unreliable confidence intervals
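The growth of the nuisance parameter count can be tallied directly. The sketch below uses the dimensions and groups from the Quick Start example; `count_fixed_effects` is a hypothetical helper written for this illustration and is not part of the package.

```python
from math import prod

def count_fixed_effects(n_ds, groups):
    """Number of fixed-effect parameters: one per cell of each group."""
    return sum(prod(n_ds[d] for d in g) for g in groups)

n_ds = (100, 100, 100)
groups = [[0, 1], [0, 2], [1, 2]]

n_obs = prod(n_ds)                          # 1,000,000 cells
n_fe = count_fixed_effects(n_ds, groups)    # 3 * 100 * 100 = 30,000 effects
print(n_fe, n_obs)
```

With pair fixed effects, each effect such as θ for the pair (i, j) is informed only by the cells along the remaining dimension, so the number of nuisance parameters grows together with the sample.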
The Polyad Solution
The method conditions on node degrees (sufficient statistics for fixed effects) and maximizes a conditional likelihood that depends only on β. This eliminates the incidental parameter problem by removing fixed effects from the objective function.
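The cancellation is easiest to see in the two-way case, log λ_ij = β'X_ij + u_i + v_j. For two rows and two columns, the double difference of log intensities removes u and v and leaves a quantity that depends on β and X only. The sketch below checks this numerically. It illustrates the principle only; how the package forms and combines polyads for general D is not shown here, and all function names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2, p = 6, 7, 2
beta = np.array([1.0, -0.5])
X = rng.normal(size=(n1, n2, p))

def log_intensity(X, beta, u, v):
    # log lambda_ij = beta'X_ij + u_i + v_j
    return X @ beta + u[:, None] + v[None, :]

def tetrad(a, i, i2, j, j2):
    # Double difference over rows (i, i2) and columns (j, j2)
    return a[i, j] + a[i2, j2] - a[i, j2] - a[i2, j]

# Two very different sets of fixed effects, same beta and X
la = log_intensity(X, beta, rng.normal(size=n1), rng.normal(size=n2))
lb = log_intensity(X, beta, 10 * rng.normal(size=n1), 10 * rng.normal(size=n2))

t_a = tetrad(la, 0, 3, 1, 5)
t_b = tetrad(lb, 0, 3, 1, 5)
t_x = tetrad(X, 0, 3, 1, 5) @ beta   # built from beta and X alone

print(np.isclose(t_a, t_b), np.isclose(t_a, t_x))
```

Both comparisons should hold regardless of the fixed effects drawn, which is the sense in which the fixed effects drop out of the objective.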
Basic Usage
Model Setup
from src.polyads.model import PolyadEstimator

estimator = PolyadEstimator(
    max_iter=100,             # Maximum iterations
    tol=1e-4,                 # Convergence tolerance
    max_n_polyads=int(1e8),   # Max polyads to process
    use_tqdm=False            # Progress bar
)
Two-Way Example (Trade)
# Bilateral trade: log λ_ij = β'X_ij + u_i + v_j
df = pd.DataFrame({
    'exporter': [...],
    'importer': [...],
    'flow': [...]
})

estimator.fit(
    df=df,
    indices=['exporter', 'importer'],
    values='flow',
    beta_init=np.zeros(p),
    X=features
)
Three-Way Example (Panel)
# Trade panel: log λ_ijt = β'X_ijt + u_ij + v_it + w_jt
df = pd.DataFrame({
    'exporter': [...],
    'importer': [...],
    'year': [...],
    'flow': [...]
})

estimator.fit(
    df=df,
    indices=['exporter', 'importer', 'year'],
    values='flow',
    beta_init=np.zeros(p),
    X=features
)
Advanced Features
Custom Feature Function
Compute features on the fly instead of storing the full feature array in memory:
def compute_features(indices):
    i, j, t = indices
    return np.array([
        np.log(distance[i, j]),
        fta_indicator[i, j, t],
        border[i, j]
    ])

estimator.fit(
    df=df,
    indices=['i', 'j', 't'],
    values='y',
    beta_init=np.zeros(3),
    eval_X=compute_features
)
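One way to sanity-check such a function before fitting is to evaluate it on a small grid and inspect the resulting array. The sketch below uses toy arrays for `distance`, `fta_indicator`, and `border`, and assumes, as in the snippet above, that the callable receives one index tuple and returns a length-p vector; the exact contract of `eval_X` is not confirmed here. `materialize` is a hypothetical helper, not a package function.

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2, n3 = 3, 3, 2
distance = rng.uniform(1.0, 10.0, size=(n1, n2))        # toy data
fta_indicator = rng.integers(0, 2, size=(n1, n2, n3))   # toy data
border = rng.integers(0, 2, size=(n1, n2))              # toy data

def compute_features(indices):
    i, j, t = indices
    return np.array([
        np.log(distance[i, j]),
        fta_indicator[i, j, t],
        border[i, j],
    ])

def materialize(eval_X, shape, p):
    """Hypothetical helper: build the dense (n1, ..., nD, p) array."""
    X = np.empty(shape + (p,))
    for idx in np.ndindex(*shape):
        X[idx] = eval_X(idx)
    return X

X_dense = materialize(compute_features, (n1, n2, n3), p=3)
print(X_dense.shape)
```

Materializing the full array defeats the memory saving, so a check like this is only sensible on a small subset of the index grid.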
Results and Inference
# Point estimates
beta_hat = estimator.beta_
# Standard errors
se = np.sqrt(np.diag(estimator.var_))
# Confidence intervals
estimator.summary(alpha=0.05) # 95% CI
# Diagnostics
print(f"Converged: {estimator.converged_}")
print(f"Iterations: {estimator.iterations_}")
print(f"Active polyads: {estimator.n_polyads_}")
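For working with the numbers directly, normal-approximation intervals can be assembled from the point estimates and the covariance matrix. The sketch below assumes that `estimator.var_` is the estimated covariance matrix of `estimator.beta_`, as the standard-error line above suggests; whether `summary()` computes its intervals in exactly this way is not confirmed. `wald_intervals` is a hypothetical helper, and stand-in values replace the fitted attributes.

```python
import numpy as np
from statistics import NormalDist

def wald_intervals(beta, var, alpha=0.05):
    """Normal-approximation intervals: beta +/- z_{1-alpha/2} * se."""
    beta = np.asarray(beta, dtype=float)
    se = np.sqrt(np.diag(var))
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Columns: estimate, standard error, lower bound, upper bound
    return np.column_stack([beta, se, beta - z * se, beta + z * se])

# Stand-in values; in practice pass estimator.beta_ and estimator.var_
beta_hat = np.array([1.0, -0.5])
var_hat = np.diag([0.01, 0.04])
table = wald_intervals(beta_hat, var_hat)
print(table)
```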
Data Format
Input DataFrame
df = pd.DataFrame({
    'i1': [0, 0, 1, ...],  # First dimension
    'i2': [0, 1, 0, ...],  # Second dimension
    'i3': [0, 0, 1, ...],  # Third dimension (if 3-way)
    'y': [5, 0, 3, ...]    # Non-negative counts
})
Feature Matrix
# 2-way: (n1, n2, p)
# 3-way: (n1, n2, n3, p)
# 4-way: (n1, n2, n3, n4, p)
X = np.random.randn(n1, n2, n3, p)
Alternatively, pass a custom feature function through eval_X, as described under Custom Feature Function.
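When the raw data carry labels such as country codes, the index columns presumably need to be mapped to integer positions that line up with the axes of `X`. That requirement is an assumption based on the integer indices shown above, not something the package documentation confirms. A minimal sketch of one such mapping, using a shared label set so that exporter and importer codes agree (an illustrative choice):

```python
import numpy as np
import pandas as pd

raw = pd.DataFrame({
    "exporter": ["FRA", "FRA", "DEU", "USA"],
    "importer": ["DEU", "USA", "FRA", "FRA"],
    "flow": [5, 0, 3, 2],
})

# Labels in order of first appearance across both columns
labels = pd.Index(pd.unique(raw[["exporter", "importer"]].to_numpy().ravel()))

df = pd.DataFrame({
    "exporter": labels.get_indexer(raw["exporter"]),
    "importer": labels.get_indexer(raw["importer"]),
    "flow": raw["flow"].to_numpy(),
})

n = len(labels)
p = 2
X = np.zeros((n, n, p))   # placeholder features with shape (n1, n2, p)
```

Keeping `labels` around makes it possible to translate integer positions back to the original codes when reading results.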
Diagnostics
Check Convergence
if not estimator.converged_:
    print("Warning: Did not converge")

if estimator.det_ < 1e-8:
    print("Singular Hessian - possible collinearity")
    estimator.summary()  # Shows eigenstructure
Common Issues
No polyads found: Data too sparse or no variation
print(f"Positive edges: {estimator.n_edges_}")
print(f"Active polyads: {estimator.n_polyads_}")
Singular Hessian: Collinear features or absorbed by fixed effects
# Check correlation
import pandas as pd
pd.DataFrame(X.reshape(-1, p)).corr()
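A correlation matrix will not reveal a feature that is absorbed by fixed effects. A feature that varies only along the dimensions of one fixed-effect group is indistinguishable from that group's fixed effect. The sketch below flags such features for a dense feature array; `absorbed_features` is a hypothetical helper, and the `groups` list follows the Quick Start convention.

```python
import numpy as np

def absorbed_features(X, groups):
    """Return (feature, group) pairs where the feature varies only along
    the axes in `group`, so a fixed effect on that group absorbs it."""
    D = X.ndim - 1
    hits = []
    for k in range(X.shape[-1]):
        for g in groups:
            other = tuple(d for d in range(D) if d not in g)
            if not other:
                continue
            # Constant along every axis outside g <=> zero range there
            spread = X[..., k].max(axis=other) - X[..., k].min(axis=other)
            if np.allclose(spread, 0.0):
                hits.append((k, tuple(g)))
    return hits

rng = np.random.default_rng(0)
n1, n2, n3 = 4, 4, 3
pair_only = rng.normal(size=(n1, n2))     # depends on (i, j) only
X = np.stack([
    rng.normal(size=(n1, n2, n3)),        # varies in all dimensions
    np.broadcast_to(pair_only[:, :, None], (n1, n2, n3)),
], axis=-1)

groups = [[0, 1], [0, 2], [1, 2]]
print(absorbed_features(X, groups))
rank = np.linalg.matrix_rank(X.reshape(-1, X.shape[-1]))
```

In this toy example the flattened feature matrix has full column rank, yet the second feature is flagged for group (0, 1): a time-invariant pair characteristic cannot be separated from a pair fixed effect.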
Comparison with PPML
| Method | Best For | Pros | Cons |
|---|---|---|---|
| Polyads | D≥3, sparse | No bias, valid inference | Slower for dense data |
| PPML | D=2, dense | Fast, familiar | Incidental parameter problem for D≥3 |
Best Practices
- Start simple: Fewer features initially
- Check sparsity: The method works best when the number of positive edges |E| is much smaller than n
- Scale features: Normalize for numerical stability
- Warm-up: Run a small problem first to trigger JIT compilation
- Validate: Check convergence and Hessian determinant
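The scaling advice can be sketched as follows for a dense feature array whose last axis indexes features. `standardize` is a hypothetical helper, and `beta_scaled` is a stand-in for coefficients fitted on the standardized features.

```python
import numpy as np

def standardize(X):
    """Center and scale each feature over all index dimensions."""
    axes = tuple(range(X.ndim - 1))
    mu = X.mean(axis=axes)
    sigma = X.std(axis=axes)
    return (X - mu) / sigma, mu, sigma

rng = np.random.default_rng(0)
X = rng.normal(loc=[0.0, 50.0], scale=[1.0, 1000.0], size=(5, 6, 7, 2))
X_std, mu, sigma = standardize(X)

# Coefficients fitted on X_std map back to the original scale by division.
# The shift from centering is a constant, which any fixed effect absorbs.
beta_scaled = np.array([1.0, -0.5])   # stand-in for a fitted estimator.beta_
beta_original = beta_scaled / sigma
```

The linear index on the standardized scale differs from the one on the original scale only by a constant, so the fit is unchanged while the optimizer works with features of comparable magnitude.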
Limitations
- Assumes conditional independence given fixed effects and covariates
- Designed for count data (not continuous)
- Slower than PPML for very dense networks
- Requires sufficient within-group variation
Citation
@misc{resende2025polyads,
title={Statistical Inference in Large Multi-way Networks},
author={Lucas Resende and Guillaume Lecué and Lionel Wilner and Philippe Choné},
year={2025},
eprint={2512.02203},
archivePrefix={arXiv},
primaryClass={econ.EM},
url={https://arxiv.org/abs/2512.02203},
}