Calculate weight factors for survey data to approximate a representative sample

Weight Factors

Installation

pip install weightfactors

Or clone the repository and install from source with Poetry:

git clone https://github.com/markteffect/weightfactors
cd weightfactors
poetry install

Usage

Currently, the package implements a generalized raking algorithm.
If you'd like to see support for other algorithms, please open an issue or submit a pull request.
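
For orientation, here is a minimal sketch of classic raking (iterative proportional fitting), a simpler relative of generalized raking. It is not the package's implementation: the `rake` function, the `Age` variable, and the data below are hypothetical and use only the standard library.

```python
def rake(rows, targets, max_iter=1000, tol=1e-9):
    """Classic raking: rescale weights one variable at a time until margins match."""
    weights = [1.0] * len(rows)
    for _ in range(max_iter):
        max_shift = 0.0
        for var, shares in targets.items():
            total = sum(weights)
            # Current weighted share of each category of this variable.
            current = {cat: 0.0 for cat in shares}
            for row, w in zip(rows, weights):
                current[row[var]] += w / total
            # Scale every respondent so this variable's margin hits its target.
            for i, row in enumerate(rows):
                factor = shares[row[var]] / current[row[var]]
                weights[i] *= factor
                max_shift = max(max_shift, abs(factor - 1.0))
        if max_shift < tol:  # no margin needed a meaningful adjustment
            break
    # Normalize so the weights average to 1.
    mean_w = sum(weights) / len(weights)
    return [w / mean_w for w in weights]


# Hypothetical respondents and population targets (assumptions for illustration).
rows = [
    {"Gender": "Male", "Age": "Young"},
    {"Gender": "Male", "Age": "Old"},
    {"Gender": "Female", "Age": "Young"},
    {"Gender": "Female", "Age": "Young"},
    {"Gender": "Female", "Age": "Old"},
]
targets = {
    "Gender": {"Male": 0.5, "Female": 0.5},
    "Age": {"Young": 0.4, "Old": 0.6},
}

weights = rake(rows, targets)


def weighted_share(var, cat):
    """Weighted share of respondents whose `var` equals `cat`."""
    hit = sum(w for row, w in zip(rows, weights) if row[var] == cat)
    return hit / sum(weights)


print(round(weighted_share("Gender", "Male"), 4))
print(round(weighted_share("Age", "Young"), 4))
```

With more than one weighting variable, no single ratio fixes every margin at once, which is why the adjustment is repeated until the shares stop moving.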

Let's use the following dataset as an example:

import pandas as pd

sample = pd.DataFrame(
    {
        "Gender": [
            "Male",
            "Male",
            "Female",
            "Female",
            "Female",
            "Male",
            "Female",
            "Female",
            "Male",
            "Female",
        ],
        "Score": [7.0, 6.0, 8.5, 7.5, 8.0, 5.0, 9.5, 8.0, 4.5, 8.5],
    }
)

This sample comprises 40% males and 60% females (4 and 6 of the 10 respondents).
The unweighted average score is:

import numpy as np

np.average(sample["Score"])
# 7.25

Now, assuming a 50/50 gender distribution in the population,
let's calculate weight factors to approximate the population distribution:

from weightfactors import GeneralizedRaker

raker = GeneralizedRaker({"Gender": {"Male": 0.5, "Female": 0.5}})
weights = raker.rake(sample)
# [1.25000008 1.25000008 0.83333334 0.83333334 0.83333334 1.25000008
# 0.83333334 0.83333334 1.25000008 0.83333334]
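
With a single weighting variable, these values can be checked by hand: each respondent's weight is the target share of their category divided by its share in the sample. The sketch below reproduces that arithmetic with the standard library only. It does not call the package, and the names `target`, `sample_share`, and `manual_weights` are illustrative.

```python
from collections import Counter

# Gender column from the example dataset.
gender = [
    "Male", "Male", "Female", "Female", "Female",
    "Male", "Female", "Female", "Male", "Female",
]

# Assumed population shares (the 50/50 target used in the example).
target = {"Male": 0.5, "Female": 0.5}

# Observed share of each category in the sample: 0.4 male, 0.6 female.
counts = Counter(gender)
sample_share = {cat: n / len(gender) for cat, n in counts.items()}

# Weight = target share / sample share of the respondent's category.
manual_weights = [target[g] / sample_share[g] for g in gender]

print([round(w, 4) for w in manual_weights])
```

Males get roughly 0.5 / 0.4 = 1.25 and females roughly 0.5 / 0.6 = 0.8333, which agrees with the raked weights up to numerical tolerance.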

Let's calculate the average score again, this time applying the weight factors:

np.average(sample["Score"], weights=weights)
# 6.9791666284520835
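
As a sanity check, with one weighting variable the weighted average equals the group means combined in the target proportions. The sketch below computes that directly from the example data using only the standard library. The helper `group_mean` is illustrative and not part of the package.

```python
# Columns from the example dataset.
gender = [
    "Male", "Male", "Female", "Female", "Female",
    "Male", "Female", "Female", "Male", "Female",
]
scores = [7.0, 6.0, 8.5, 7.5, 8.0, 5.0, 9.5, 8.0, 4.5, 8.5]


def group_mean(label):
    """Unweighted mean score of respondents in one gender category."""
    values = [s for g, s in zip(gender, scores) if g == label]
    return sum(values) / len(values)


# Combine the group means using the assumed 50/50 population shares.
balanced = 0.5 * group_mean("Male") + 0.5 * group_mean("Female")

print(round(group_mean("Male"), 4))    # 5.625
print(round(group_mean("Female"), 4))  # 8.3333
print(round(balanced, 4))              # 6.9792
```

The tiny difference from the weighted average above appears to come from the numerical tolerance of the computed weights (1.25000008 rather than exactly 1.25).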

For more detailed information and customization options, please refer to the docstrings.
