binsmooth - Better Estimates from Binned Income Data.

binsmooth

Python implementation of "Better Estimates from Binned Income Data"

von Hippel, P. T., Hunter, D. J. and Drown, M. (2017). Better Estimates from Binned Income Data: Interpolated CDFs and Mean-Matching. Sociological Science, 4(26), 641-655.

Originally implemented in the R package binsmooth.

Usage

import numpy as np
from binsmooth import BinSmooth

# Binned income data: bin edges and bin counts
bin_edges = np.array([0, 18200, 37000, 87000, 180000])
counts = np.array([0, 7527, 13797, 75481, 50646, 803])

# Fit the smoothed CDF to the binned data
bs = BinSmooth()
bs.fit(bin_edges, counts)

# Print median estimate
print(bs.inv_cdf(0.5))

Installation

Install via pip:

pip install binsmooth

PyPI page: https://pypi.org/project/binsmooth/

Improvements

Better tail estimate: the upper bound of the top bin is found by bounded optimisation rather than the ad hoc search method used in the R implementation.
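The idea of estimating the tail by bounded optimisation can be sketched as follows. This is a minimal illustration, not the code used inside binsmooth: the helper names `fitted_mean` and `estimate_upper_bound`, the search cap `max_multiple`, and the target mean are all hypothetical, and the sketch assumes the first count is zero so that the CDF starts at 0 on the first edge.

```python
import numpy as np
from scipy.interpolate import PchipInterpolator
from scipy.optimize import minimize_scalar


def fitted_mean(edges, cum_props, upper):
    """Mean implied by a monotone CDF whose open top bin is closed at `upper`."""
    x = np.append(edges, upper)
    cdf = PchipInterpolator(x, cum_props)
    # With F(x[0]) = 0 and F(upper) = 1: E[X] = upper - integral of F on [x[0], upper].
    return upper - cdf.integrate(x[0], upper)


def estimate_upper_bound(edges, cum_props, mean, max_multiple=10.0):
    """Search a bounded interval for an upper bound whose fitted mean matches `mean`."""
    lo = edges[-1] * (1.0 + 1e-6)   # just above the last finite edge
    hi = edges[-1] * max_multiple   # hypothetical cap on the search interval
    result = minimize_scalar(
        lambda u: (fitted_mean(edges, cum_props, u) - mean) ** 2,
        bounds=(lo, hi),
        method="bounded",
    )
    return result.x


edges = np.array([0.0, 18200.0, 37000.0, 87000.0, 180000.0])
counts = np.array([0, 7527, 13797, 75481, 50646, 803])
cum_props = np.cumsum(counts) / counts.sum()

# Illustrative target only: the mean implied by an upper bound of 400000.
target_mean = fitted_mean(edges, cum_props, 400000.0)
upper = estimate_upper_bound(edges, cum_props, target_mean)
print(upper, fitted_mean(edges, cum_props, upper))
```

Because the spline's slope at the last finite edge also depends on the upper bound, more than one bound may reproduce the same mean, so the search interval in a sketch like this needs to be chosen with care.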

More precise inverse CDF: the CDF is sampled adaptively, with sample density proportional to its steepness, so more points are placed where the CDF rises quickly.
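One way to sample a CDF in proportion to its steepness is sketched below. It is an assumption about how such a scheme could look, not binsmooth's own routine: `make_inverse_cdf`, the grid sizes, and the upper bound of 400000 are hypothetical. Points that are equally spaced in CDF value are mapped back to x, which crowds them into the steep regions.

```python
import numpy as np
from scipy.interpolate import PchipInterpolator


def make_inverse_cdf(cdf, lower, upper, n_coarse=64, n_fine=1024):
    """Return a function p -> x approximating the inverse of a monotone CDF."""
    # Coarse uniform pass to see where the CDF rises quickly.
    x_coarse = np.linspace(lower, upper, n_coarse)
    f_coarse = np.maximum.accumulate(cdf(x_coarse))
    # Levels equally spaced in CDF value map back to x values whose density
    # is roughly proportional to the slope of the CDF.
    levels = np.linspace(f_coarse[0], f_coarse[-1], n_fine)
    x_steep = np.interp(levels, f_coarse, x_coarse)
    # Merge with the coarse grid so that flat regions are still covered.
    x_fine = np.unique(np.concatenate([x_coarse, x_steep]))
    f_fine = np.maximum.accumulate(cdf(x_fine))

    def inverse(p):
        # Swap the roles of x and F(x) to look up quantiles.
        return np.interp(p, f_fine, x_fine)

    return inverse


edges = np.array([0.0, 18200.0, 37000.0, 87000.0, 180000.0])
counts = np.array([0, 7527, 13797, 75481, 50646, 803])
upper = 400000.0  # hypothetical upper bound for the open top bin

cdf = PchipInterpolator(np.append(edges, upper), np.cumsum(counts) / counts.sum())
inverse = make_inverse_cdf(cdf, edges[0], upper)
median = float(inverse(0.5))
print(median)
```

Evaluating the exact CDF again on the refined grid keeps the final lookup table consistent with the spline even though the first pass was only approximate.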

Warnings

Results may not exactly match R binsmooth because of:

  1. a different approach to estimating the tail (upper bound) and
  2. differences in the spline interpolation method

This implementation uses SciPy's PchipInterpolator, which implements the Fritsch-Carlson method [1], whereas the R implementation defaults to Hyman's method [2]. The interpolator in the R implementation can be changed to [1] by setting monoMethod="monoH.FC".
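As a rough illustration of the interpolation step, the snippet below fits SciPy's PchipInterpolator to cumulative proportions and checks that the resulting CDF never decreases. It is a sketch under stated assumptions rather than binsmooth's internals: the upper bound of 400000 is hypothetical, and the first count is taken to be zero so that the CDF starts at 0.

```python
import numpy as np
from scipy.interpolate import PchipInterpolator

edges = np.array([0.0, 18200.0, 37000.0, 87000.0, 180000.0])
counts = np.array([0, 7527, 13797, 75481, 50646, 803])
upper = 400000.0  # hypothetical upper bound for the open top bin

x = np.append(edges, upper)
cum_props = np.cumsum(counts) / counts.sum()

cdf = PchipInterpolator(x, cum_props)  # monotone piecewise cubic through the knots
pdf = cdf.derivative()                 # implied density (piecewise quadratic)

grid = np.linspace(x[0], x[-1], 2001)
cdf_values = cdf(grid)
pdf_values = pdf(grid)

# A monotone interpolant should give a non-decreasing CDF and a non-negative density.
print(bool(np.all(np.diff(cdf_values) >= -1e-12)))
print(bool(pdf_values.min() >= -1e-12))
```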

Accuracy depends on the mean of the distribution. If you do not supply a mean, one is estimated in an ad hoc manner and the resulting estimates may be poor.

References

[1]: Fritsch, F. N. and Carlson, R. E. (1980). Monotone piecewise cubic interpolation. SIAM Journal on Numerical Analysis, 17(2), 238-246.
[2]: Hyman, J. M. (1983). Accurate monotonicity preserving cubic interpolation. SIAM Journal on Scientific and Statistical Computing, 4(4), 645-653.
