Weight Factors

Calculate weight factors for survey data to approximate a representative sample

Installation

pip install weightfactors

Or clone the repository and install from source:

git clone https://github.com/markteffect/weightfactors
cd weightfactors
poetry install

Usage

Currently, the package implements a generalized raking algorithm.
If you'd like to see support for other algorithms, please open an issue or submit a pull request.

Let's use the following dataset as an example:

import numpy as np
import pandas as pd

sample = pd.DataFrame(
    {
        "Gender": [
            "Male",
            "Male",
            "Female",
            "Female",
            "Female",
            "Male",
            "Female",
            "Female",
            "Male",
            "Female",
        ],
        "Score": [7.0, 6.0, 8.5, 7.5, 8.0, 5.0, 9.5, 8.0, 4.5, 8.5],
    }
)

This sample comprises 40% males and 60% females.
The unweighted average score is:

np.average(sample["Score"])
# 7.25

Now, assuming a 50/50 gender distribution in the population,
let's calculate weight factors to approximate the population distribution:

from weightfactors import GeneralizedRaker

raker = GeneralizedRaker({"Gender": {"Male": 0.5, "Female": 0.5}})
weights = raker.rake(sample)
# [1.25000008 1.25000008 0.83333334 0.83333334 0.83333334 1.25000008
# 0.83333334 0.83333334 1.25000008 0.83333334]
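With a single weighting variable, the weights above are approximately the target share divided by the sample share for each category (0.5 / 0.4 for males, 0.5 / 0.6 for females). The following is a minimal sketch that reproduces them by hand as a sanity check. It uses only pandas, not the package, and the names `target`, `sample_share`, and `manual_weights` are illustrative assumptions.

```python
import pandas as pd

# Same Gender column as in the example dataset.
gender = pd.Series(
    ["Male", "Male", "Female", "Female", "Female",
     "Male", "Female", "Female", "Male", "Female"]
)

# Assumed population shares (the 50/50 split used above).
target = {"Male": 0.5, "Female": 0.5}

# Observed shares in the sample: Female 0.6, Male 0.4.
sample_share = gender.value_counts(normalize=True)

# Weight per respondent = target share / sample share of their category.
manual_weights = gender.map(lambda g: target[g] / sample_share[g]).to_numpy()
```

The weights average to 1, so the weighted sample keeps its original size of 10.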

Let's calculate the average score again, this time applying the weight factors:

np.average(sample["Score"], weights=weights)
# 6.9791666284520835
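The weighted average can be cross-checked without any weights: with a 50/50 target, it should equal the mean of the two group means. A short sketch of that arithmetic, assuming only pandas (the names `group_means` and `expected` are illustrative):

```python
import pandas as pd

sample = pd.DataFrame(
    {
        "Gender": ["Male", "Male", "Female", "Female", "Female",
                   "Male", "Female", "Female", "Male", "Female"],
        "Score": [7.0, 6.0, 8.5, 7.5, 8.0, 5.0, 9.5, 8.0, 4.5, 8.5],
    }
)

# Mean score per gender: Male 5.625, Female 8.333...
group_means = sample.groupby("Gender")["Score"].mean()

# Combine the group means using the assumed population shares.
expected = 0.5 * group_means["Male"] + 0.5 * group_means["Female"]
```

The result agrees with the weighted average above to about seven decimal places; the small difference comes from the numerical tolerance of the raked weights.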

For more detailed information and customization options, please refer to the docstrings.
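Raking is typically applied to several variables at once. As a rough illustration of the idea, here is a sketch of classic raking (iterative proportional fitting) written in plain NumPy and pandas. It is not the package's generalized raking implementation; the function `rake_ipf`, its parameters, and the `Age` column are hypothetical and exist only for this example. It assumes every target category occurs in the data and that each variable's target shares sum to 1.

```python
import numpy as np
import pandas as pd


def rake_ipf(df, targets, max_iter=100, tol=1e-8):
    """Classic raking sketch: rescale weights one variable at a time."""
    weights = np.ones(len(df))
    for _ in range(max_iter):
        max_gap = 0.0
        for column, shares in targets.items():
            total = weights.sum()
            for category, share in shares.items():
                mask = (df[column] == category).to_numpy()
                # Current weighted share of this category.
                current = weights[mask].sum() / total
                max_gap = max(max_gap, abs(current - share))
                # Scale the category so it matches its target share.
                weights[mask] *= share / current
        # Stop once a full pass found every margin within tolerance.
        if max_gap < tol:
            break
    # Normalize so the weights average to 1.
    return weights * len(df) / weights.sum()


demo = pd.DataFrame(
    {
        "Gender": ["Male", "Male", "Female", "Female", "Female",
                   "Male", "Female", "Female", "Male", "Female"],
        "Age": ["Young", "Old", "Young", "Old", "Young",
                "Young", "Old", "Young", "Old", "Old"],
    }
)
targets = {
    "Gender": {"Male": 0.5, "Female": 0.5},
    "Age": {"Young": 0.4, "Old": 0.6},
}
demo_weights = rake_ipf(demo, targets)
```

Each pass adjusts one variable's margin and may disturb the others slightly, which is why the loop repeats until all margins are within `tol` of their targets.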
