Get weighted median values, treating weights as the number of occurrences of a given value.

stacked_quantile

'Stacked' quantile functions. Close to weighted quantile functions.

These functions are used to calculate quantiles of a set of values, where each value has a weight. The typical process for calculating a weighted quantile is to create a CDF from the weights, then interpolate the values to find the quantile.
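For contrast, the CDF-interpolation approach described above can be sketched roughly as follows. There are several conventions for building the CDF; this sketch assumes the one that places each value at the midpoint of its own weight span. The name `interpolated_weighted_quantile` is hypothetical and is not part of this package.

```python
def interpolated_weighted_quantile(values, weights, quantile):
    """One common CDF-interpolation convention (an assumption, not this package)."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    # Place each value at the midpoint of its own weight span on the CDF.
    positions = []
    cumulative = 0.0
    for _, weight in pairs:
        positions.append((cumulative + weight / 2) / total)
        cumulative += weight
    # Clamp quantiles that fall outside the first and last CDF positions.
    if quantile <= positions[0]:
        return pairs[0][0]
    if quantile >= positions[-1]:
        return pairs[-1][0]
    # Linearly interpolate between the two neighbouring values.
    for i in range(1, len(pairs)):
        if quantile <= positions[i]:
            span = positions[i] - positions[i - 1]
            fraction = (quantile - positions[i - 1]) / span
            return pairs[i - 1][0] + fraction * (pairs[i][0] - pairs[i - 1][0])
    return pairs[-1][0]


interpolated = interpolated_weighted_quantile([1, 2, 3], [4, 5, 6], 0.5)
```

Under this convention the result lies strictly between 2 and 3, a value that does not occur in the input.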

These functions, however, treat weighted values (given integer weights) exactly as multiple values.

So, values (1, 2, 3) with weights (4, 5, 6) will be treated as

(1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3)
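The expansion above can be reproduced with the standard library alone. This is only a sketch of the "weights as occurrences" idea for integer weights; the helper name `expand` is hypothetical, and the package is not claimed to work by materializing the repeated values.

```python
from statistics import median


def expand(values, weights):
    """Repeat each value `weight` times (integer weights only)."""
    stacked = []
    for value, weight in zip(values, weights):
        stacked.extend([value] * weight)
    return stacked


stacked = expand([1, 2, 3], [4, 5, 6])
# The ordinary median of the expanded list is the "stacked" median.
stacked_median = median(stacked)
```

Here the expanded list has 15 entries, so the median is the 8th entry, which is 2.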

If the quantile falls exactly between two values, the non-weighted average of the two values is returned. This is consistent with the "weights as occurrences" interpretation. Zero-weight values are stripped first, so they are never included in such averages.
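The tie rule can be illustrated without expanding the values. The sketch below is an assumed reading of the description, not the package's implementation: it accumulates weights over the sorted values and averages two neighbours only when the target position lands exactly on the boundary between them. The name `stacked_quantile_sketch` is hypothetical.

```python
import math


def stacked_quantile_sketch(values, weights, quantile):
    """Hypothetical pure-Python sketch; not the package's implementation."""
    # Drop zero-weight pairs, then sort the rest by value.
    pairs = sorted((v, w) for v, w in zip(values, weights) if w > 0)
    if not pairs:
        raise ValueError("no values with positive weight")
    target = quantile * sum(w for _, w in pairs)
    cumulative = 0.0
    for index, (value, weight) in enumerate(pairs):
        cumulative += weight
        if math.isclose(cumulative, target) and index + 1 < len(pairs):
            # The quantile lands exactly on the boundary between two values:
            # return their plain (non-weighted) average.
            return (value + pairs[index + 1][0]) / 2
        if cumulative > target:
            return value
    return pairs[-1][0]


# Stacked as (1, 1, 2, 3): the median falls between 1 and 2.
between = stacked_quantile_sketch([1, 2, 3], [2, 1, 1], 0.5)
# The zero-weight 2 is stripped, so the average is taken over 1 and 10.
stripped = stacked_quantile_sketch([1, 2, 10], [2, 0, 2], 0.5)
```

In the second call, keeping the zero-weight value would have changed which neighbours are averaged, which is why it is removed before sorting.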

If using non-integer weights, the results will be as if some scalar were applied to make all weights into integers.
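As an illustration of that scaling, the sketch below finds a scalar that turns decimal weights into integers and then expands as before. It assumes the weights have short decimal representations; the helper name `to_integer_weights` is hypothetical, and the package is not claimed to compute such a scalar explicitly.

```python
from fractions import Fraction
from math import gcd
from statistics import median


def to_integer_weights(weights):
    """Scale decimal weights by the least common multiple of their denominators."""
    fractions = [Fraction(str(w)) for w in weights]
    scale = 1
    for f in fractions:
        scale = scale * f.denominator // gcd(scale, f.denominator)
    return [int(f * scale) for f in fractions]


int_weights = to_integer_weights([0.4, 0.5, 0.6])
stacked = [v for v, w in zip([1, 2, 3], int_weights) for _ in range(w)]
scaled_median = median(stacked)
```

Weights (0.4, 0.5, 0.6) scale by 10 to (4, 5, 6), so the result matches the integer-weight example.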

This "weights as occurrences" interpretation has two pitfalls:

  1. Identical values may be returned for different quantiles (e.g., the results for quantiles 0.5, 0.6, and 0.7 might be identical). As a result, some common data practices such as "robust scaler" will not be robust, because the interquartile range can be 0. Again, this is consistent, because the same thing could happen with repeated, non-weighted values.

  2. With any number of values, the stacked_median could still be the first or last value (if it has enough weight), so separating by the median is not robust. This could also happen with repeated, non-weighted values. One workaround is to divide the values into group_a = values strictly < median and group_b = values strictly > median, then add the values == median to the smaller group.
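Both pitfalls can be reproduced on expanded data with the standard library. The sketch below is illustrative only: the quartiles come from `statistics.quantiles` rather than from this package, the helper `split_by_median` is hypothetical, and "smaller group" is assumed to mean the group with less total weight.

```python
from statistics import median, quantiles

# Pitfall 1: one heavily weighted value collapses the interquartile range.
values, weights = [1, 2, 3], [1, 10, 1]
stacked = [v for v, w in zip(values, weights) for _ in range(w)]
q1, _, q3 = quantiles(stacked, n=4, method="inclusive")
iqr = q3 - q1

# Pitfall 2: the median is the first value, so nothing lies strictly below it.
heavy_values, heavy_weights = [1, 2, 3], [10, 1, 1]
heavy_stacked = [v for v, w in zip(heavy_values, heavy_weights) for _ in range(w)]
med = median(heavy_stacked)


def split_by_median(values, weights, med):
    """Split (value, weight) pairs around `med`, giving ties to the lighter side."""
    pairs = list(zip(values, weights))
    below = [(v, w) for v, w in pairs if v < med]
    above = [(v, w) for v, w in pairs if v > med]
    ties = [(v, w) for v, w in pairs if v == med]
    # Assumption: "smaller" is measured by total weight, not by item count.
    if sum(w for _, w in below) <= sum(w for _, w in above):
        below += ties
    else:
        above += ties
    return below, above


below, above = split_by_median(heavy_values, heavy_weights, med)
```

With the tie assigned to the otherwise empty lower group, both groups are non-empty even though the median equals the smallest value.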

where FPArray: TypeAlias = npt.NDArray[np.floating[Any]]

def get_stacked_quantile(values: FPArray, weights: FPArray, quantile: float) -> float:
    """Get a weighted quantile for a vector of values.

    :param values: array of values with shape (n,)
    :param weights: array of weights where weights.shape == values.shape
    :param quantile: quantile to calculate, in [0, 1]
    :return: weighted quantile of values
    :raises ValueError: if values and weights do not have the same length
    :raises ValueError: if quantile is not in interval [0, 1]
    :raises ValueError: if values array is empty (after removing zero-weight values)
    :raises ValueError: if weights are not all positive
    """
def get_stacked_quantiles(
    values: FPArray, weights: FPArray, quantile: float
) -> FPArray:
    """Get a weighted quantile for an array of vectors.

    :param values: array of vectors with shape (..., m)
        will return one m-length vector
    :param weights: array of weights with shape (..., 1)
        where shape[:-1] == values.shape[:-1]
    :param quantile: quantile to calculate, in [0, 1]
    :return: axiswise weighted quantile of an m-length vector
    :raises ValueError: if values and weights do not have the same shape[:-1]

    The "gotcha" here is that the weights must be passed as 1D vectors, not scalars.
    """
def get_stacked_median(values: FPArray, weights: FPArray) -> float:
    """Get a weighted median for a vector of values.

    :param values: array of values with shape (n,)
    :param weights: array of weights where weights.shape == values.shape
    :return: weighted median of values
    :raises ValueError: if values and weights do not have the same length
    :raises ValueError: if values array is empty (after removing zero-weight values)
    :raises ValueError: if weights are not all positive
    """
def get_stacked_medians(values: FPArray, weights: FPArray) -> FPArray:
    """Get a weighted median for an array of vectors.

    :param values: array of vectors with shape (..., m)
        will return one m-length vector
    :param weights: array of weights with shape (..., 1)
        where shape[:-1] == values.shape[:-1]
    :return: axiswise weighted median of an m-length vector
    :raises ValueError: if values and weights do not have the same shape[:-1]

    The "gotcha" here is that the weights must be passed as 1D vectors, not scalars.
    """

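The shape "gotcha" for the plural functions can be sketched with NumPy. The function below is a hypothetical reference written from the docstrings above (integer weights, expansion by repetition); it is not the package's implementation, and `stacked_medians_sketch` is not a name from the package. The point is the weight shape: one length-1 vector per value vector.

```python
import numpy as np


def stacked_medians_sketch(values, weights):
    """Hypothetical reference: expand rows by integer weight, then take column medians."""
    if values.shape[:-1] != weights.shape[:-1] or weights.shape[-1] != 1:
        raise ValueError("weights must have shape values.shape[:-1] + (1,)")
    # Flatten all leading axes so each row is one m-length vector.
    rows = values.reshape(-1, values.shape[-1])
    counts = weights.reshape(-1).astype(int)
    # Repeat each row `count` times, then take the median of every column.
    return np.median(np.repeat(rows, counts, axis=0), axis=0)


values = np.array([[1.0, 30.0], [2.0, 20.0], [3.0, 10.0]])  # shape (3, 2)
flat_weights = np.array([4.0, 5.0, 6.0])  # shape (3,): not the documented shape
weights = flat_weights[:, np.newaxis]  # shape (3, 1): one 1D vector per row
medians = stacked_medians_sketch(values, weights)
```

Adding the trailing axis with `np.newaxis` is one way to turn a flat weight array into the documented `(..., 1)` shape.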