
The samplefit library implements the Sample Fit Reliability algorithm for assessing sample fit in econometric models.


samplefit

samplefit is a Python library to assess sample fit, as opposed to model fit, via the Sample Fit Reliability algorithm developed by Okasa & Younge (2022). samplefit works with the statsmodels library (Seabold & Perktold, 2010) and follows the same command workflow.

Copyright (c) 2022 Gabriel Okasa & Kenneth A. Younge.

AUTHOR:  Gabriel Okasa & Kenneth A. Younge
SOURCE:  https://github.com/okasag/samplefit
LICENSE: Access to this code is provided under an MIT License.

Repo maintainer: Gabriel Okasa (okasa.gabriel@gmail.com)

Introduction

samplefit is a Python library for the assessment of sample fit in econometric models. In particular, samplefit implements the Sample Fit Reliability (SFR) algorithm, a re-sampling procedure to estimate the reliability of data and check the sensitivity of results. To that end, SFR is a computational approach with three aspects: Scoring, to estimate a point-wise reliability score for every observation in a sample based on the expected estimation loss over sub-samples; Annealing, to test the sensitivity of results to the sequential removal of unreliable data points; and Fitting, to estimate a weighted regression that adjusts for the reliability of the data.
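To make the Scoring idea concrete, here is a minimal sketch in plain Python. It is not the samplefit implementation: the helper names (fit_line, reliability_scores), the parameters (n_draws, frac, seed), the squared-error loss, and the 1 / (1 + loss) transform are all assumptions chosen for illustration. It only shows the general pattern of averaging each observation's loss over models fitted on random sub-samples.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (closed form, one regressor)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def reliability_scores(xs, ys, n_draws=200, frac=0.8, seed=0):
    """Hypothetical score: 1 / (1 + mean squared error over sub-sample fits)."""
    rng = random.Random(seed)
    n = len(xs)
    k = max(2, int(frac * n))  # sub-sample size
    loss = [0.0] * n
    for _ in range(n_draws):
        idx = rng.sample(range(n), k)  # draw a sub-sample without replacement
        a, b = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        for i in range(n):  # accumulate the loss of every observation
            loss[i] += (ys[i] - (a + b * xs[i])) ** 2
    return [1.0 / (1.0 + total / n_draws) for total in loss]

# Toy data: y = 1 + 2x, with one deliberately contaminated observation.
xs = [float(i) for i in range(10)]
ys = [1.0 + 2.0 * x for x in xs]
ys[7] = 40.0

scores = reliability_scores(xs, ys)
worst = min(range(len(xs)), key=lambda i: scores[i])
print("least reliable observation:", worst)
```

Scores close to 1 mark observations that the sub-sample fits predict well, and scores close to 0 mark observations with a high average loss. The actual scoring rule used by samplefit may differ.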

Detailed documentation of the samplefit library is available here.

Installation

To install the samplefit library from PyPI, run:

pip install samplefit

Alternatively, clone the repository:

git clone https://github.com/okasag/samplefit.git

The required modules can be installed by navigating to the root of the cloned project and executing the following command: pip install -r requirements.txt.

Example

The example below demonstrates the workflow of using the samplefit library in conjunction with the well-known statsmodels library.

Import libraries:

import samplefit as sf
import statsmodels.api as sm

Get data:

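# Download the Boston housing data from the R package MASS (needs internet access)
boston = sm.datasets.get_rdataset("Boston", "MASS")
Y = boston.data['crim']   # outcome: per-capita crime rate
X = boston.data['lstat']  # regressor: percentage of lower-status population
X = sm.add_constant(X)    # add an intercept column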

Assess model fit:

model = sm.OLS(endog=Y, exog=X)
model_fit = model.fit()
model_fit.summary()

Assess sample fit:

sample = sf.SFR(linear_model=model)
sample_fit = sample.fit()
sample_fit.summary()
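The internals of sample.fit() are not shown here. As a rough illustration of what a regression weighted by reliability can look like, the sketch below fits a weighted least-squares line in plain Python. The function name fit_weighted_line and the weights are hypothetical and are not taken from samplefit.

```python
def fit_weighted_line(xs, ys, ws):
    """Weighted least squares for y = a + b*x: minimizes sum(w * residual**2)."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw  # weighted mean of x
    my = sum(w * y for w, y in zip(ws, ys)) / sw  # weighted mean of y
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 20.0, 9.0]        # y = 1 + 2x except at x = 3
equal = [1.0] * 5                      # ordinary least squares
weights = [1.0, 1.0, 1.0, 0.0, 1.0]    # hypothetical reliability weights

ols_intercept, ols_slope = fit_weighted_line(xs, ys, equal)
wls_intercept, wls_slope = fit_weighted_line(xs, ys, weights)
print("equal weights:      ", ols_intercept, ols_slope)
print("reliability weights:", wls_intercept, wls_slope)
```

Giving the contaminated point a weight of zero moves the slope back to the value implied by the remaining observations. With statsmodels, the same kind of fit is available through sm.WLS(endog, exog, weights=...).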

Assess sample reliability:

sample_scores = sample.score()
sample_scores.plot()

Assess sample sensitivity:

sample_annealing = sample.anneal()
sample_annealing.plot()
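The idea behind annealing, removing the least reliable observations one at a time and watching how the estimate moves, can be sketched as follows. This is an illustration under stated assumptions, not the samplefit implementation: fit_line, anneal_path, max_drop, and the scores are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (closed form, one regressor)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def anneal_path(xs, ys, scores, max_drop=3):
    """Refit after dropping the 0, 1, ..., max_drop least reliable points."""
    order = sorted(range(len(xs)), key=lambda i: scores[i])  # least reliable first
    path = []
    for n_dropped in range(max_drop + 1):
        keep = order[n_dropped:]
        _, slope = fit_line([xs[i] for i in keep], [ys[i] for i in keep])
        path.append((n_dropped, slope))
    return path

xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 3.0, 5.0, 20.0, 9.0, 11.0]      # y = 1 + 2x except at x = 3
scores = [0.9, 0.8, 0.95, 0.1, 0.85, 0.7]  # hypothetical reliability scores

path = anneal_path(xs, ys, scores)
for n_dropped, slope in path:
    print(n_dropped, round(slope, 3))
```

A slope that shifts sharply after the first removals and then stabilizes suggests that the original estimate was driven by a few unreliable observations.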

References

  • Okasa, Gabriel, and Kenneth A. Younge. “Sample Fit.” Working Paper. 2022.
  • Seabold, Skipper, and Josef Perktold. “statsmodels: Econometric and statistical modeling with python.” Proceedings of the 9th Python in Science Conference. 2010.
