A Stochastic Frontier Analysis (SFA) library featuring MLE and Bayesian (PyMC) estimations.

Depp_sfa: stochastic frontier analysis

Depp_sfa is a Python library for estimating stochastic frontier analysis (SFA) models.

It is designed to be robust to the numerical convergence problems frequently encountered in applied econometrics. For cross-sectional data, the library relies primarily on maximum likelihood estimation (MLE), with an automatic fallback to Bayesian estimation (MCMC via PyMC) if the optimization fails. For panel data, it implements a strictly Bayesian estimation of the dynamic Battese and Coelli (1992) model.

Main features

Production and cost frontiers: supports both orientations.
Functional forms: linear, Cobb-Douglas, and translog specifications. Includes support for dummy variables.
Cross-sectional data: standard estimation with optional inclusion of inefficiency determinants (BC95 model).
Panel data (time-varying): implementation of the Battese and Coelli (1992) model, capturing the temporal evolution of technical inefficiency.
True effects (Greene 2005): separates unobserved heterogeneity from inefficiency.
Efficiency decomposition methods: Jondrow et al. (1982), Battese and Coelli (1988), and a modified approach.
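To make the two named decomposition methods concrete, here is a minimal sketch of the textbook Jondrow et al. (1982) and Battese and Coelli (1988) estimators for the normal/half-normal model. The function name efficiency_scores and its arguments are illustrative assumptions, not part of the library's API, and the "modified approach" is not covered.

```python
import math


def _phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def _Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def efficiency_scores(eps, sigma_u, sigma_v, cost=False):
    """Return (JLMS, BC88) efficiency for one composed residual eps.

    Production frontier: eps = v - u. Cost frontier: eps = v + u.
    Hypothetical helper, assuming a normal/half-normal error structure.
    """
    sigma2 = sigma_u ** 2 + sigma_v ** 2
    sign = 1.0 if cost else -1.0
    # Mean and standard deviation of u conditional on eps (before truncation)
    mu_star = sign * eps * sigma_u ** 2 / sigma2
    sigma_star = sigma_u * sigma_v / math.sqrt(sigma2)
    r = mu_star / sigma_star
    # Jondrow et al. (1982): plug the conditional mean E[u | eps] into exp(-u)
    e_u = mu_star + sigma_star * _phi(r) / _Phi(r)
    te_jlms = math.exp(-e_u)
    # Battese and Coelli (1988): conditional mean of exp(-u) itself
    te_bc88 = (_Phi(r - sigma_star) / _Phi(r)) * math.exp(-mu_star + 0.5 * sigma_star ** 2)
    return te_jlms, te_bc88


te_jlms, te_bc88 = efficiency_scores(eps=-0.2, sigma_u=0.3, sigma_v=0.2)
print(te_jlms, te_bc88)
```

Both scores lie between 0 and 1, and the Battese and Coelli score is never below the Jondrow et al. score because the exponential is convex (Jensen's inequality).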

MLE versus PyMC

The library offers two distinct inference methods to adapt to your data structure and overcome traditional solver limitations.

MLE (maximum likelihood estimation) is the standard frequentist approach. It is computationally fast and works well on large, balanced cross-sectional datasets. However, it frequently suffers from boundary failures (where the estimated inefficiency variance collapses to zero) on unbalanced panels or when a large share of units are singletons, i.e. observed only once.
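One common warning sign for this kind of boundary failure in the normal/half-normal model is residual skewness of the "wrong" sign: when ordinary least squares residuals are skewed against the assumed direction of inefficiency, the likelihood typically has a maximum at zero inefficiency variance. The sketch below is a generic diagnostic under that assumption; the helper names are hypothetical and it is not claimed to be the check the library performs.

```python
def residual_skewness(residuals):
    """Sample skewness m3 / m2**1.5 of a list of residuals."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    return m3 / m2 ** 1.5


def has_wrong_skewness(residuals, cost=False):
    """True when residuals are skewed against the assumed inefficiency sign.

    A production frontier (eps = v - u) expects negative skewness;
    a cost frontier (eps = v + u) expects positive skewness.
    """
    skew = residual_skewness(residuals)
    return skew < 0 if cost else skew > 0


# Toy residuals with a long left tail (negative skewness)
ols_residuals = [-3.0, 0.5, 1.0, 1.5]
print(has_wrong_skewness(ols_residuals))             # production orientation
print(has_wrong_skewness(ols_residuals, cost=True))  # cost orientation
```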

PyMC (Bayesian inference) is the more robust alternative. By incorporating prior distributions, it guards against boundary collapse and can separate signal from noise even when most of the panel consists of single observations. It provides full posterior distributions but requires more computation time.

Installation

The library can be installed from PyPI with pip:

pip install depp_sfa

Usage and data preparation

It is strongly recommended to standardize (center and scale) continuous variables prior to estimation to help the optimization algorithms converge.

import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler
from depp_SFA import SFA, FUN_COST

# 1. data preparation
df = pd.read_csv("data.csv")
df = df[(df["cost"] > 0) & (df["output"] > 0)]

# standardization
vars_to_scale = ["density", "quality_index"]
scaler = StandardScaler()
df[vars_to_scale] = scaler.fit_transform(df[vars_to_scale])

# 2. vector extraction
y = df["cost"].to_numpy(dtype=float)
X = df[["output"] + vars_to_scale].to_numpy(dtype=float)
firm_ids = df["firm_id"].to_numpy()
years = df["year"].to_numpy(dtype=float)
Z_vars = df[["contract_type", "vandalism_risk"]].to_numpy(dtype=float)
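For readers unfamiliar with the standardization step, the following standard-library sketch shows the arithmetic involved: subtract the column mean and divide by the population standard deviation, which is the default scaling StandardScaler applies. The helper name standardize is an assumption for illustration only.

```python
import statistics


def standardize(values):
    """Center a column on zero and scale it to unit (population) variance."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # Unlike StandardScaler, this sketch refuses constant columns
        raise ValueError("cannot scale a constant column")
    return [(v - mean) / std for v in values]


density = [120.0, 80.0, 100.0, 140.0]
print(standardize(density))
```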

Implementation guide

1. Panel data with time evolution (BC92)

The Battese and Coelli (1992) model captures how inefficiency evolves over time using a decay parameter. For short or unbalanced panels, it is best estimated with PyMC.

model_bc92 = SFA(
    y=y, 
    x=X, 
    id_var=firm_ids, 
    time_var=years, 
    fun=FUN_COST, 
    panel_model='bc92',
    inference_method='pymc',  # or 'mle'
    draws=2000
)

# display results
model_bc92.summary()

# retrieve efficiency scores
efficiency_scores = model_bc92.get_technical_efficiency()
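The decay mechanism can be sketched independently of the library. In the Battese and Coelli (1992) specification, the inefficiency of firm i in period t is u_it = exp(-eta * (t - T_i)) * u_i, where T_i is the firm's last observed period and eta is the decay parameter. The function below is a hypothetical illustration of that formula, not the library's internal code.

```python
import math


def bc92_inefficiency(u_i, t, last_period, eta):
    """Inefficiency at time t given the last-period level u_i and decay eta."""
    return u_i * math.exp(-eta * (t - last_period))


# With eta > 0, inefficiency shrinks toward u_i as the last period approaches
path = [bc92_inefficiency(0.2, t, last_period=2020, eta=0.1) for t in range(2016, 2021)]
efficiency = [math.exp(-u) for u in path]
print(path)
print(efficiency)
```

A positive eta means firms become more efficient over time, a negative eta means inefficiency grows, and eta equal to zero reduces to a time-invariant model.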

2. Inefficiency effects model (BC95)

The Battese and Coelli (1995) model uses environmental variables (z) to explain why certain units are less efficient, rather than treating these variables as direct cost drivers.

model_bc95 = SFA(
    y=y, 
    x=X, 
    z=Z_vars,
    fun=FUN_COST,
    inference_method='pymc',  # or 'mle'
    draws=2000
)

# display results
model_bc95.summary()
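To see how the z variables act on inefficiency rather than on the frontier, recall that the Battese and Coelli (1995) model draws u_i from a normal distribution with mean z_i'delta, truncated at zero. The sketch below computes the resulting expected inefficiency for one unit. The function name, the delta values, and the absence of an intercept in z are all illustrative assumptions.

```python
import math


def expected_inefficiency(z_row, delta, sigma_u):
    """Mean of a N(z'delta, sigma_u^2) variable truncated below at zero."""
    mu = sum(z * d for z, d in zip(z_row, delta))
    r = mu / sigma_u
    pdf = math.exp(-0.5 * r * r) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(r / math.sqrt(2.0)))
    return mu + sigma_u * pdf / cdf


delta = [0.1, 0.2]  # hypothetical coefficients on two z variables
low_risk = expected_inefficiency([1.0, 0.0], delta, sigma_u=0.3)
high_risk = expected_inefficiency([1.0, 1.0], delta, sigma_u=0.3)
print(low_risk, high_risk)
```

A positive coefficient in delta raises expected inefficiency for units with larger values of the corresponding z variable, while the frontier itself is unchanged.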

3. True random effects (Greene 2005)

This specification separates unobserved structural heterogeneity (specific to each firm) from pure managerial inefficiency. It keeps the model from penalizing firms for structural disadvantages such as geography.

model_greene = SFA(
    y=y, 
    x=X, 
    id_var=firm_ids,
    time_var=years,
    fun=FUN_COST,
    panel_model='greene',
    inference_method='pymc',  # or 'mle'
    draws=2000
)

# display results
model_greene.summary()
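The separation this model aims for is easiest to see in simulated data. The sketch below generates a small cost panel in which each firm carries a time-invariant effect w_i (heterogeneity) alongside a period-specific half-normal inefficiency term u_it. All names, coefficients, and variances are made up for illustration and do not come from the library.

```python
import random


def simulate_tre_cost(n_firms=3, n_years=4, seed=0):
    """Simulate y_it = 1 + 0.5 * x_it + w_i + v_it + u_it for a cost frontier."""
    rng = random.Random(seed)
    rows = []
    for firm in range(n_firms):
        w_i = rng.gauss(0.0, 0.3)  # time-invariant firm heterogeneity
        for year in range(n_years):
            x = rng.gauss(0.0, 1.0)       # standardized cost driver
            v = rng.gauss(0.0, 0.1)       # symmetric statistical noise
            u = abs(rng.gauss(0.0, 0.2))  # half-normal inefficiency, always >= 0
            y = 1.0 + 0.5 * x + w_i + v + u
            rows.append({"firm": firm, "year": year, "x": x, "w": w_i, "u": u, "y": y})
    return rows


for row in simulate_tre_cost()[:4]:
    print(row)
```

A model without the firm effect would attribute w_i to inefficiency, which is the confusion the true random effects specification is meant to avoid.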

API documentation (SFA class)

Instantiation: SFA(...)

The class constructor configures the model parameters and transforms the data according to the specified functional form.

Parameters:

y (array-like): dependent variable (output for production frontier, cost for cost frontier).
x (array-like, 2d): independent variables (inputs or prices/outputs).
z (array-like, 2d, optional): explanatory variables for inefficiency (cross-sectional data only).
id_var (array-like, optional): individual identifiers for panel data.
time_var (array-like, optional): time variable for panel data.
fun (constant): frontier type. Use SFA.FUN_PROD (default) or SFA.FUN_COST.
intercept (bool): include an intercept in the model (default: True).
lamda0 (float): initial value of lambda for MLE optimization (default: 1).
method (constant): method for computing technical efficiency. Choose among SFA.TE_teJ (Jondrow et al.), SFA.TE_te (Battese and Coelli), or SFA.TE_teMod.
form (str): functional form. Choose among 'linear', 'cobb_douglas', or 'translog'.
dummy_indices (list): list of column indices in x that are indicator variables (0/1) and should not be log-transformed.
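As an illustration of what the form and dummy_indices parameters imply, the sketch below builds translog regressors for one observation: first-order log terms, second-order terms, and untouched dummy columns. The function name, the column ordering, and the 0.5 scaling on squared terms follow a common textbook convention and are assumptions; the library's internal layout may differ.

```python
import math
from itertools import combinations_with_replacement


def translog_row(x_row, dummy_indices=()):
    """Expand one row of positive regressors into translog terms."""
    dummies = set(dummy_indices)
    # First-order terms: logs of the continuous columns only
    logs = [math.log(v) for j, v in enumerate(x_row) if j not in dummies]
    # Second-order terms: 0.5 * ln(x_j)^2 and ln(x_j) * ln(x_k) for j < k
    second = [
        (0.5 if j == k else 1.0) * logs[j] * logs[k]
        for j, k in combinations_with_replacement(range(len(logs)), 2)
    ]
    # Dummy columns (0/1) are passed through without a log transform
    kept_dummies = [v for j, v in enumerate(x_row) if j in dummies]
    return logs + second + kept_dummies


print(translog_row([2.0, 5.0, 1.0], dummy_indices=[2]))
```

Dropping the second-order terms from this expansion gives the Cobb-Douglas form, which is why continuous variables must be strictly positive for both specifications.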

Public methods

Once the model is instantiated, the following methods are available. Calling any of these methods will automatically trigger the estimation process (optimize()) if it has not already been executed.

optimize(): triggers the estimation algorithm. Automatically routes to panel estimation (MCMC) or cross-sectional estimation (MLE, with an MCMC fallback in case of convergence failure).
summary(): computes and prints a summary table of the results to the console, including estimated coefficients, standard errors, t-values, p-values, associated significance levels, and the log-likelihood value (for MLE).
get_beta(): returns a NumPy array containing the estimated coefficients for the frontier variables (including the intercept if intercept=True).
get_residuals(): returns a NumPy array containing the model residuals.
get_sigma2(): returns the estimated total variance of the composite error.
get_lambda(): returns the ratio of the inefficiency standard deviation to the noise standard deviation, which measures the relative importance of inefficiency compared to statistical noise.
get_technical_efficiency(): returns a NumPy array containing the technical efficiency scores (bounded between 0 and 1) computed for each observation in the sample, using the method specified during instantiation.
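The values returned by get_sigma2() and get_lambda() can be turned back into separate scale parameters. The sketch below assumes the usual parameterization, sigma2 = sigma_u^2 + sigma_v^2 and lambda = sigma_u / sigma_v; whether the library follows exactly this convention is an assumption, and split_variance is a hypothetical helper.

```python
import math


def split_variance(sigma2, lam):
    """Recover (sigma_u, sigma_v) from sigma2 = sigma_u^2 + sigma_v^2 and lam = sigma_u / sigma_v."""
    sigma_v2 = sigma2 / (1.0 + lam ** 2)
    sigma_u2 = sigma2 - sigma_v2
    return math.sqrt(sigma_u2), math.sqrt(sigma_v2)


sigma_u, sigma_v = split_variance(sigma2=0.13, lam=1.5)
gamma = sigma_u ** 2 / 0.13  # share of sigma2 attributed to the inefficiency term
print(sigma_u, sigma_v, gamma)
```

A lambda close to zero means the composite error is almost entirely noise, which is the boundary case where maximum likelihood estimation tends to break down.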

Credits and licenses

This software is distributed under the MIT License.

The base architecture, matrix processing, and classic log-likelihood derivations are inspired by and adapted from the work of Sheng Dai (Copyright (c) 2023, MIT License).
The integration of Bayesian estimators (Markov chain Monte Carlo via PyMC), the numerical stabilization of panel models (Battese and Coelli 1992), and the algorithmic exception handling were developed specifically for this project.

Release history

0.1.3: Apr 26, 2026
0.1.2: Mar 25, 2026
0.1.1: Mar 25, 2026 (this version)
0.1.0: Mar 25, 2026

