
firthmodels


Firth-penalized models in Python:

  • FirthLogisticRegression: scikit-learn–compatible Firth logistic regression
  • FirthCoxPH: scikit-survival-style Firth Cox proportional hazards

Firth penalization reduces small-sample bias and produces finite estimates even when standard MLE fails due to (quasi-)complete separation or monotone likelihood.

See the benchmarking results comparing firthmodels, logistf, brglm2, and coxphf.

Why Firth penalization?

Standard maximum-likelihood logistic regression fails when the data exhibit complete or quasi-complete separation, i.e. when a predictor (or combination of predictors) perfectly separates the outcome classes. In these cases, MLE produces infinite coefficient estimates.
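
For illustration, here is a minimal sketch using scikit-learn (not part of firthmodels; penalty=None requires scikit-learn 1.2+). The unpenalized fit returns a very large coefficient, and typically a convergence warning, because the likelihood keeps increasing as the coefficient grows:

import numpy as np
from sklearn.linear_model import LogisticRegression

# Perfectly separated data: x = 1 always predicts y = 1
X = np.array([[0.0], [0.0], [0.0], [1.0], [1.0], [1.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# Unpenalized MLE: the estimate drifts toward +infinity; the solver
# merely stops at some large finite value
mle = LogisticRegression(penalty=None, max_iter=10_000).fit(X, y)
print(mle.coef_)  # very large, not a meaningful finite estimate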

In Cox proportional hazards, an analogous failure mode is monotone likelihood, where the partial likelihood becomes unbounded (often due to small samples, rare events, or near-perfect risk separation).

These problems are common in:

  • Case-control studies with rare exposures
  • Small clinical trials
  • Genome-wide or Phenome-wide association studies (GWAS/PheWAS)
  • Any dataset where events are rare relative to predictors

Firth's method adds a penalty term that:

  • Produces finite, well-defined estimates even with separated data
  • Reduces small-sample bias in coefficient estimates
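
Concretely, Firth's method (exposed here through the penalty_weight parameter w) maximizes a penalized (partial) log-likelihood of the standard form

ℓ*(β) = ℓ(β) + w · log |I(β)|

where I(β) is the Fisher information matrix. The default w = 0.5 gives Firth (1993) bias reduction and is equivalent to using Jeffreys' invariant prior.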

Kosmidis and Firth (2021) formally proved that bias reduction for logistic regression models guarantees finite estimates as long as the model matrix has full rank.

Detecting separation

You can use detect_separation to check for separation before fitting. It implements the linear-programming method of Konis (2007), as used in the R detectseparation package by Kosmidis et al. (2022).

The following example is based on the endometrial dataset used in Heinze and Schemper (2002), where the NV feature causes quasi-complete separation.

from firthmodels import detect_separation

# X: the three endometrial predictors (NV, PI, EH); y: the binary outcome
result = detect_separation(X, y)
result.separation    # True
result.is_finite     # array([False,  True,  True,  True])
result.directions    # array([1, 0, 0, 0])  # where 1 = +Inf, -1 = -Inf, 0 = finite
print(result.summary())
# Separation: True
#   NV         +Inf
#   PI         finite
#   EH         finite
#   intercept  finite

Installation

pip

pip install firthmodels

Requires Python 3.10+ and depends on NumPy, SciPy, and scikit-learn.

Optional dependencies:

  • Numba acceleration: pip install firthmodels[numba]

    • The first run with the Numba backend after installing or updating firthmodels may take 10-30 seconds due to JIT compilation. Subsequent runs are fast thanks to caching.
  • Formula interface for the statsmodels adapter: pip install firthmodels[formula] (or simply install formulaic).

Note: Performance is significantly improved when NumPy/SciPy are built against a well-optimized BLAS/LAPACK library. You can check which library yours is using with np.show_config(). As a rule of thumb, MKL offers the best performance for Intel CPUs, while OpenBLAS is also a good choice for Intel and generally the best option for AMD. On macOS, ensure NumPy/SciPy are linked to Apple Accelerate.
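
For example:

import numpy as np

np.show_config()  # build info includes the BLAS/LAPACK libraries in use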

The most straightforward way to control the BLAS/LAPACK library is to install firthmodels in a conda environment:

conda

conda install -c conda-forge firthmodels  # usually defaults to OpenBLAS
conda install -c conda-forge firthmodels "libblas=*=*_newaccelerate"  # Apple Accelerate
conda install -c conda-forge firthmodels "libblas=*=*mkl"  # Intel MKL
conda install -c conda-forge firthmodels "libblas=*=*openblas"  # OpenBLAS

Add numba to the conda install command to enable Numba acceleration.

Quick start

Firth logistic regression

import numpy as np
from firthmodels import FirthLogisticRegression

# Separated data: x=1 perfectly predicts y=1
X = np.array([[0], [0], [0], [1], [1], [1]])
y = np.array([0, 0, 0, 1, 1, 1])

# Standard logistic regression would fail here
model = FirthLogisticRegression().fit(X, y)

print(model.coef_)        # array([3.89181893])
print(model.intercept_)   # -2.725...
print(model.pvalues_)     # Wald p-values
print(model.bse_)         # Standard errors

Firth Cox proportional hazards

import numpy as np
from firthmodels import FirthCoxPH

X = np.array([[1.0], [0.0]])
event = np.array([True, False])
time = np.array([1.0, 2.0])

model = FirthCoxPH().fit(X, (event, time))
print(model.coef_)          # log hazard ratios
print(model.pvalues_)       # Wald p-values

# Survival curves evaluated at the training event times
S = model.predict_survival_function(X)  # shape: (n_samples, n_event_times)

FirthCoxPH also accepts y as a structured array with boolean event and float time fields (scikit-survival style).
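
For example, a minimal sketch of the structured-array form (the field names are illustrative, following the scikit-survival convention of a boolean event field and a float time field):

y_struct = np.array([(True, 1.0), (False, 2.0)], dtype=[("event", bool), ("time", float)])
model = FirthCoxPH().fit(X, y_struct)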

Both estimators take a backend parameter that can be one of 'auto' (default), 'numba', or 'numpy'. With 'auto', firthmodels uses Numba when it is installed and falls back to NumPy/SciPy otherwise.
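
For example, to force the NumPy/SciPy implementation (reusing the quick-start data):

model = FirthLogisticRegression(backend='numpy').fit(X, y)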

Estimators

scikit-learn compatible API

FirthLogisticRegression follows the scikit-learn estimator API (fit, predict, predict_proba, get_params, set_params, etc.), and can be used with pipelines, cross-validation, and other sklearn tools:

from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

from firthmodels import FirthLogisticRegression

pipe = make_pipeline(StandardScaler(), FirthLogisticRegression())
scores = cross_val_score(pipe, X, y, cv=5)

FirthCoxPH likewise follows the sklearn estimator API (fit, predict, score, etc.) and adds a scikit-survival-like interface (sketched after this list):

  • methods: fit(X, y), predict(X) (linear predictor), score(X, y) (C-index), predict_survival_function(X, return_array=True), predict_cumulative_hazard_function(X, return_array=True).
  • attributes: unique_times_, cum_baseline_hazard_, baseline_survival_ (Breslow-style baseline).
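
Continuing the Firth Cox quick-start example (a sketch; it assumes score accepts the same (event, time) form as fit):

# model, X, event, time from the quick start above
chf = model.predict_cumulative_hazard_function(X, return_array=True)  # (n_samples, n_event_times)
cindex = model.score(X, (event, time))  # concordance index
print(model.unique_times_)              # event times at which the curves are evaluated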

Inference

Likelihood ratio tests (LRT)

Both estimators support penalized likelihood ratio tests for individual coefficients. These are often more reliable than Wald p-values in small samples.

lrt() populates lrt_pvalues_ and lrt_bse_ (a back-corrected standard error such that (beta / lrt_bse_)**2 matches the 1-df chi-squared statistic), which can be useful for meta-analysis weighting:

model.fit(X, y).lrt()  # Compute LRT for all coefficients

model.lrt_pvalues_     # LRT p-values
model.lrt_bse_         # Back-corrected standard errors (separate from Wald bse_)
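
Equivalently, if Λ is the 1-df penalized likelihood-ratio statistic for a coefficient β̂, then lrt_bse_ = |β̂| / √Λ, so that (β̂ / lrt_bse_)² = Λ.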

Each feature requires a separate constrained model fit, so you can test selectively to avoid unnecessary computation. By default, LRT uses a warm start based on the full-model covariance to reduce Newton-Raphson iterations; pass warm_start=False to disable it.

model.lrt(0)              # Single feature by index
model.lrt([0, 2])         # Multiple features
model.lrt(['snp', 'age']) # By name (if fitted with DataFrame)
model.lrt(['snp', 2])     # Mixed
model.lrt(warm_start=False)  # Disable warm start

Confidence intervals

model.conf_int()                    # 95% Wald CIs
model.conf_int(alpha=0.1)           # 90% CIs
model.conf_int(method='pl')         # Profile likelihood CIs (more accurate)
model.conf_int(method='pl', features=['snp', 'age'])  # can selectively compute as with LRT

Sample weights and offsets

# currently for FirthLogisticRegression only
model.fit(X, y, sample_weight=weights)
model.fit(X, y, offset=offset)

Statsmodels adapter (FirthLogit)

The statsmodels adapter wraps FirthLogisticRegression behind a statsmodels-like API and returns a results object with common statsmodels attributes/methods (params, bse, pvalues, summary(), cov_params(), etc.).

Notes:

  • Unlike sklearn, FirthLogit does not add an intercept automatically; use sm.add_constant(X).
  • fit(pl=True) (default) computes likelihood ratio p-values and uses profile likelihood confidence intervals by default, matching the R logistf convention. Standard errors (bse) remain Wald standard errors.

import numpy as np
import statsmodels.api as sm
from firthmodels.adapters.statsmodels import FirthLogit

X = np.array([[1], [2], [3], [4], [5], [6]])
y = np.array([0, 0, 0, 1, 1, 1])

X = sm.add_constant(X)
res = FirthLogit(y, X).fit()

print(res.params)
print(res.pvalues)      # LRT p-values when pl=True
print(res.conf_int())   # profile likelihood CIs when pl=True
print(res.summary())

Formula interface

If you install firthmodels[formula] (or pip install formulaic), you can fit from a formula and a pandas DataFrame:

import pandas as pd
from firthmodels.adapters.statsmodels import FirthLogit

df = pd.DataFrame({"y": [0, 1, 0, 1], "age": [20, 30, 40, 50], "treatment": [0, 1, 0, 1]})
res = FirthLogit.from_formula("y ~ age + treatment", df).fit()

API notes

FirthLogisticRegression parameters

  • backend (default 'auto'): 'auto', 'numba', or 'numpy'. 'auto' uses numba if available.
  • fit_intercept (default True): Whether to add an intercept term.
  • max_iter (default 25): Maximum Newton-Raphson iterations.
  • gtol (default 1e-4): Gradient convergence tolerance (converged when max|gradient| < gtol).
  • xtol (default 1e-4): Parameter convergence tolerance (converged when max|delta| < xtol).
  • max_step (default 5.0): Maximum step size per coefficient.
  • max_halfstep (default 25): Maximum step-halvings per iteration.
  • penalty_weight (default 0.5): Weight of the Firth penalty term. The default 0.5 corresponds to the standard Firth bias reduction method (Firth, 1993), equivalent to using Jeffreys' invariant prior. Set to 0.0 for unpenalized maximum likelihood estimation.

FirthLogisticRegression attributes (after fitting)

  • coef_: Coefficient estimates
  • intercept_: Intercept (0.0 if fit_intercept=False)
  • bse_: Wald standard errors; includes intercept if fit_intercept=True
  • pvalues_: Wald p-values; includes intercept if fit_intercept=True
  • loglik_: Penalized log-likelihood
  • n_iter_: Number of iterations
  • converged_: Whether the solver converged
  • lrt_pvalues_: LRT p-values (after calling lrt()); includes intercept if fit_intercept=True
  • lrt_bse_: Back-corrected SEs (after calling lrt()); includes intercept if fit_intercept=True
  • classes_: Class labels (shape (2,))
  • n_features_in_: Number of features seen during fit
  • feature_names_in_: Feature names (if X had string column names)

FirthCoxPH parameters

  • backend (default 'auto'): 'auto', 'numba', or 'numpy'. 'auto' uses numba if available.
  • max_iter (default 50): Maximum Newton-Raphson iterations.
  • gtol (default 1e-4): Gradient convergence tolerance (converged when max|gradient| < gtol).
  • xtol (default 1e-6): Parameter convergence tolerance (converged when max|delta| < xtol).
  • max_step (default 5.0): Maximum step size per coefficient.
  • max_halfstep (default 5): Maximum step-halvings per iteration.
  • penalty_weight (default 0.5): Weight of the Firth penalty term. The default 0.5 corresponds to the standard Firth bias reduction method (Heinze and Schemper, 2001), equivalent to using Jeffreys' invariant prior. Set to 0.0 for unpenalized Cox partial likelihood estimation.

FirthCoxPH attributes (after fitting)

  • coef_: Coefficient estimates (log hazard ratios)
  • bse_: Wald standard errors
  • pvalues_: Wald p-values
  • loglik_: Penalized log partial likelihood
  • n_iter_: Number of iterations
  • converged_: Whether the solver converged
  • lrt_pvalues_: LRT p-values (after calling lrt())
  • lrt_bse_: Back-corrected SEs (after calling lrt())
  • unique_times_: Unique event times (ascending order)
  • cum_baseline_hazard_: Breslow cumulative baseline hazard at unique_times_
  • baseline_survival_: Baseline survival function at unique_times_
  • n_features_in_: Number of features seen during fit
  • feature_names_in_: Feature names (if X had string column names)

predict(X) returns the linear predictor X @ coef_ (log partial hazard). predict_cumulative_hazard_function(X) and predict_survival_function(X) return arrays evaluated at unique_times_.

References

Firth D (1993). Bias reduction of maximum likelihood estimates. Biometrika 80, 27-38.

Heinze G, Schemper M (2001). A solution to the problem of monotone likelihood in Cox regression. Biometrics 57, 114-119.

Heinze G, Schemper M (2002). A solution to the problem of separation in logistic regression. Statistics in Medicine 21, 2409-2419.

Konis K (2007). Linear programming algorithms for detecting separated data in binary logistic regression models. DPhil thesis, University of Oxford.

Kosmidis I, Firth D (2021). Jeffreys-prior penalty, finiteness and shrinkage in binomial-response generalized linear models. Biometrika 108, 71-82.

Kosmidis I, Schumacher D, Schwendinger F (2022). detectseparation: Detect and check for separation and infinite maximum likelihood estimates. R package version 0.3. https://CRAN.R-project.org/package=detectseparation. doi:10.32614/CRAN.package.detectseparation.

Mbatchou J et al. (2021). Computationally efficient whole-genome regression for quantitative and binary traits. Nature Genetics 53, 1097-1103.

Venzon DJ, Moolgavkar SH (1988). A method for computing profile-likelihood-based confidence intervals. Applied Statistics 37, 87-94.

License

MIT
