Get weighted median values, treating weights as the number of occurrences of a given value.
stacked_quantile
'Stacked' quantile functions. Close to weighted quantile functions.
These functions are used to calculate quantiles of a set of values, where each value has a weight. The typical process for calculating a weighted quantile is to create a CDF from the weights, then interpolate the values to find the quantile.
These functions, however, treat weighted values (given integer weights) exactly as multiple values.
So, values (1, 2, 3) with weights (4, 5, 6) will be treated as (1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3).
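The expansion above can be reproduced with the standard library alone. The sketch below only illustrates the "weights as occurrences" idea and is not the package's implementation; `expand_by_weight` is a hypothetical helper name.

```python
from statistics import median


def expand_by_weight(values, weights):
    """Repeat each value as many times as its (integer) weight."""
    return [v for v, w in zip(values, weights) for _ in range(w)]


stacked = expand_by_weight((1, 2, 3), (4, 5, 6))
print(stacked)          # four 1s, five 2s, six 3s: 15 values in total
print(median(stacked))  # the 8th of 15 sorted values, which is 2
```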
If the quantile falls exactly between two values, the non-weighted average of the two values is returned. This is consistent with the "weights as occurrences" interpretation. Zero-weight values are stripped before the calculation, so they will never be included in such averages.
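One way to get this behavior without materializing the repeated values is to walk the cumulative weights. The function below is a minimal sketch under that assumption; `stacked_quantile_sketch` is a hypothetical name, and the package's own code may differ in structure and in edge-case handling.

```python
import math


def stacked_quantile_sketch(values, weights, quantile):
    """Quantile of values, counting each value `weight` times (illustrative only)."""
    if not 0 <= quantile <= 1:
        raise ValueError("quantile must be in [0, 1]")
    if any(w < 0 for w in weights):
        raise ValueError("weights must not be negative")
    # Drop zero-weight values, then sort the remaining (value, weight) pairs.
    pairs = sorted((v, w) for v, w in zip(values, weights) if w > 0)
    if not pairs:
        raise ValueError("no values with positive weight")
    target = quantile * sum(w for _, w in pairs)
    cumulative = 0.0
    for i, (value, weight) in enumerate(pairs):
        cumulative += weight
        if math.isclose(cumulative, target):
            # The target lands exactly on the boundary between two stacked values.
            if i + 1 < len(pairs):
                return (value + pairs[i + 1][0]) / 2
            return float(value)
        if cumulative > target:
            return float(value)
    return float(pairs[-1][0])


print(stacked_quantile_sketch((1, 2, 3), (4, 5, 6), 0.5))  # 2.0
print(stacked_quantile_sketch((1, 3), (2, 2), 0.5))        # 2.0, the average of 1 and 3
print(stacked_quantile_sketch((1, 2, 5), (2, 0, 2), 0.5))  # 3.0, the zero-weight 2 is ignored
```

The boundary test uses `math.isclose` rather than `==` so that a floating-point product such as `quantile * total` does not miss an exact tie by a rounding error.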
If using non-integer weights, the results will be as if some scalar were applied to make all weights into integers.
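To make the scaling idea concrete, the sketch below converts fractional weights to integer counts with a common multiplier and then expands the values. It is an illustration only: `expand_fractional` is a hypothetical helper, and rounding each weight to a fraction with a denominator of at most 1000 is an assumption made here to keep the expansion small.

```python
from fractions import Fraction
from math import lcm
from statistics import median


def expand_fractional(values, weights):
    """Scale fractional weights to integers, then repeat each value that many times."""
    # Assumption: weights are well approximated by fractions with denominator <= 1000.
    fracs = [Fraction(w).limit_denominator(1000) for w in weights]
    # The least common multiple of the denominators turns every weight into an integer.
    scale = lcm(*(f.denominator for f in fracs))
    counts = [int(f * scale) for f in fracs]
    return [v for v, c in zip(values, counts) for _ in range(c)]


stacked = expand_fractional((10, 20, 30), (0.5, 1.5, 1.0))
print(stacked)          # [10, 20, 20, 20, 30, 30]: weights scaled by 2 to (1, 3, 2)
print(median(stacked))  # 20.0
```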
This "weights as occurrences" interpretation has two pitfalls:
- Identical values may be returned for different quantiles (e.g., the results for quantiles 0.5, 0.6, and 0.7 might be identical). As a result, some common data practices like "robust scaler" will not be robust, because the interquartile range can be 0. Again, this is consistent, because the same thing could happen with repeated, non-weighted values.
- With any number of values, the stacked_median could still be the first or last value (if it has enough weight), so separating by the median is not robust. This could also happen with repeated, non-weighted values. One workaround is to divide the values into group_a = values strictly < median and group_b = values strictly > median, then add the values == median to the smaller group.
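The workaround for the second pitfall might look like the sketch below. `split_around_median` is a hypothetical helper, and reading "smaller group" as the group with the smaller total weight (rather than fewer entries) is an assumption.

```python
def split_around_median(values, weights, median):
    """Split (value, weight) pairs into two groups around `median`."""
    pairs = list(zip(values, weights))
    group_a = [(v, w) for v, w in pairs if v < median]
    group_b = [(v, w) for v, w in pairs if v > median]
    at_median = [(v, w) for v, w in pairs if v == median]
    # Assumption: "smaller" means smaller total weight, not fewer entries.
    weight_a = sum(w for _, w in group_a)
    weight_b = sum(w for _, w in group_b)
    if weight_a <= weight_b:
        group_a += at_median
    else:
        group_b += at_median
    return group_a, group_b


# With weights (10, 1, 1), the median of values (1, 2, 3) is the first value, 1.
group_a, group_b = split_around_median((1, 2, 3), (10, 1, 1), median=1)
print(group_a)  # [(1, 10)]
print(group_b)  # [(2, 1), (3, 1)]
```

Without the final step, group_a would be empty here, which is the failure mode described in the second pitfall.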
In the signatures below, FPArray: TypeAlias = npt.NDArray[np.floating[Any]]
def get_stacked_quantile(values: FPArray, weights: FPArray, quantile: float) -> float:
"""Get a weighted quantile for a vector of values.
:param values: array of values with shape (n,)
:param weights: array of weights where weights.shape == values.shape
:param quantile: quantile to calculate, in [0, 1]
:return: weighted quantile of values
:raises ValueError: if values and weights do not have the same length
:raises ValueError: if quantile is not in interval [0, 1]
:raises ValueError: if values array is empty (after removing zero-weight values)
:raises ValueError: if weights are not all positive
"""
def get_stacked_quantiles(
values: FPArray, weights: FPArray, quantile: float
) -> FPArray:
"""Get a weighted quantile for an array of vectors.
:param values: array of vectors with shape (..., m)
will return one m-length vector
:param weights: array of weights with shape (..., 1)
where shape[:-1] == values.shape[:-1]
:param quantile: quantile to calculate, in [0, 1]
:return: axiswise weighted quantile of an m-length vector
:raises ValueError: if values and weights do not have the same shape[:-1]
The "gotcha" here is that the weights must be passed as 1D vectors, not scalars.
"""
def get_stacked_median(values: FPArray, weights: FPArray) -> float:
"""Get a weighted median for a vector of values.
:param values: array of values with shape (n,)
:param weights: array of weights where weights.shape == values.shape
:return: weighted median of values
:raises ValueError: if values and weights do not have the same length
:raises ValueError: if values array is empty (after removing zero-weight values)
:raises ValueError: if weights are not all positive
"""
def get_stacked_medians(values: FPArray, weights: FPArray) -> FPArray:
"""Get a weighted median for an array of vectors.
:param values: array of vectors with shape (..., m)
will return one m-length vector
:param weights: array of weights with shape (..., 1)
where shape[:-1] == values.shape[:-1]
:return: axiswise weighted median of an m-length vector
:raises ValueError: if values and weights do not have the same shape[:-1]
The "gotcha" here is that the weights must be passed as 1D vectors, not scalars.
"""