Weight Factors

Calculate weight factors for survey data to approximate a representative sample

Installation

pip install weightfactors

Or clone the repository and install from source with Poetry:

git clone https://github.com/markteffect/weightfactors
cd weightfactors
poetry install

Usage

Currently, the package implements a generalized raking algorithm.
If you'd like to see support for other algorithms, please open an issue or submit a pull request.
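For readers new to raking, here is a minimal sketch of classic raking (iterative proportional fitting), which rescales the weights one variable at a time until every weighted share matches its target. It is an illustration only and not the package's implementation, which may differ in how it solves for the weights. The function `rake_sketch`, the `Age` column, and the target shares are hypothetical.

```python
import numpy as np
import pandas as pd


def rake_sketch(df, targets, max_iter=100, tol=1e-10):
    """Classic raking (iterative proportional fitting); illustration only."""
    weights = np.ones(len(df), dtype=float)
    for _ in range(max_iter):
        max_gap = 0.0
        for column, shares in targets.items():
            # Current weighted share of each category in this column.
            current = (
                pd.Series(weights).groupby(df[column].to_numpy()).sum()
                / weights.sum()
            )
            for category, target in shares.items():
                max_gap = max(max_gap, abs(current[category] - target))
                # Rescale this category so its share matches the target.
                mask = (df[column] == category).to_numpy()
                weights[mask] *= target / current[category]
        if max_gap < tol:
            break
    # Normalize so the weights average to 1.
    return weights * len(weights) / weights.sum()


# Hypothetical two-variable sample; the Age column is made up for this sketch.
demo = pd.DataFrame(
    {
        "Gender": ["Male", "Male", "Female", "Female", "Female",
                   "Male", "Female", "Female", "Male", "Female"],
        "Age": ["Young", "Old", "Young", "Old", "Young",
                "Old", "Young", "Old", "Young", "Old"],
    }
)
demo_targets = {
    "Gender": {"Male": 0.5, "Female": 0.5},
    "Age": {"Young": 0.4, "Old": 0.6},
}
demo_weights = rake_sketch(demo, demo_targets)
```

The sketch assumes every target category occurs at least once in the sample; a category with no respondents cannot be weighted up and would raise a `KeyError` here.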

Let's use the following dataset as an example:

import numpy as np
import pandas as pd

sample = pd.DataFrame(
    {
        "Gender": [
            "Male",
            "Male",
            "Female",
            "Female",
            "Female",
            "Male",
            "Female",
            "Female",
            "Male",
            "Female",
        ],
        "Score": [7.0, 6.0, 8.5, 7.5, 8.0, 5.0, 9.5, 8.0, 4.5, 8.5],
    }
)

This sample comprises 40% males (4 of 10) and 60% females (6 of 10).
If we calculate the unweighted average score, we get:

np.average(sample["Score"])
# 7.25

Now, assuming a 50/50 gender distribution in the population,
let's calculate weight factors to approximate the population distribution:

from weightfactors import GeneralizedRaker

raker = GeneralizedRaker({"Gender": {"Male": 0.5, "Female": 0.5}})
weights = raker.rake(sample)
# [1.25000008 1.25000008 0.83333334 0.83333334 0.83333334 1.25000008
# 0.83333334 0.83333334 1.25000008 0.83333334]
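With a single target variable, these weights can be checked by hand: each respondent's weight is the target share of their category divided by its observed share, so 0.5 / 0.4 = 1.25 for males and 0.5 / 0.6 ≈ 0.8333 for females. The sketch below reproduces that calculation with pandas alone; the names `genders`, `targets`, `observed`, and `manual_weights` are illustrative and not part of the package. The tiny difference from the package output (1.25000008 versus 1.25) is presumably numerical tolerance in the solver.

```python
import pandas as pd

genders = pd.Series(
    ["Male", "Male", "Female", "Female", "Female",
     "Male", "Female", "Female", "Male", "Female"]
)
targets = {"Male": 0.5, "Female": 0.5}

# Observed share of each category in the sample (Male 0.4, Female 0.6).
observed = genders.value_counts(normalize=True)

# Weight = target share / observed share, looked up per respondent.
manual_weights = genders.map(lambda g: targets[g] / observed[g]).to_numpy()
```

The manual weights sum to the sample size (10), so they average to 1, as the package's weights do.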

Let's calculate the average score again, this time applying the weight factors:

np.average(sample["Score"], weights=weights)
# 6.9791666284520835
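As a sanity check, the weighted average under a 50/50 target should equal the plain average of the two group means. The sketch below verifies this without the package; the variable names are illustrative.

```python
import numpy as np

scores = np.array([7.0, 6.0, 8.5, 7.5, 8.0, 5.0, 9.5, 8.0, 4.5, 8.5])
is_male = np.array(
    [True, True, False, False, False, True, False, False, True, False]
)

male_mean = scores[is_male].mean()     # 22.5 / 4 = 5.625
female_mean = scores[~is_male].mean()  # 50.0 / 6

# Each group contributes half of the weighted average under a 50/50 target.
expected = 0.5 * male_mean + 0.5 * female_mean
```

`expected` comes out at roughly 6.9792, which agrees with the weighted result above to about seven decimal places.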

For more detailed information and customization options, please refer to the docstrings.

Download files

Source Distribution

weightfactors-0.0.3.tar.gz (6.3 kB)

Uploaded Source

Built Distribution

weightfactors-0.0.3-py3-none-any.whl (7.3 kB)

Uploaded Python 3
