Disaggregation under Generalized Proportionality Assumptions
This package disaggregates an estimated count observation into buckets, based on the assumption that the rate (in a suitably transformed space) is proportional to some baseline rate.
The most basic functionality is to perform disaggregation under the rate multiplicative model that is currently in use.
The setup is as follows:
Let $D_{1,...,k}$ be an aggregated measurement across groups $\{g_1,...,g_k\}$, where the populations of the groups are $p_1,...,p_k$. Let $f_1,...,f_k$ be the baseline pattern of the rates across groups, which could have been estimated on a larger dataset or on a population for which we have higher-quality data. Using this data, we generate estimates for $D_i$, the number of events in group $g_i$, and $\hat{f}_i$, the rate in each group in the population of interest, by combining $D_{1,...,k}$ with $f_1,...,f_k$ so that the estimates are self-consistent.
Mathematically, in the simpler rate multiplicative model, we find $\beta$ such that $$D_{1,...,k} = \sum_{i=1}^{k}\hat{f}_i \cdot p_i $$ where $$\hat{f}_i = T^{-1}(\beta + T(f_i)) $$
This yields the estimates for the per-group event count,
$$D_i = \hat{f}_i \cdot p_i $$
For the current models in use, $T$ is just a logarithm, which assumes that each group's rate is the baseline rate $f_i$ multiplied by a common constant, $e^{\beta}$. Allowing a more general transformation $T$, such as a log-odds transformation, assumes multiplicativity in the associated odds rather than in the rate. This can produce better estimates both statistically (it may be a more realistic assumption in some cases) and practically, since it restricts the estimated rates to lie within a reasonable interval.
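As a rough illustration of the logarithmic case, the constraint can be solved in closed form, because $T^{-1}(\beta + \log f_i) = e^{\beta} f_i$. The sketch below uses only the standard library; the function name `disaggregate_rate_multiplicative` and the example numbers are hypothetical and are not taken from the package's API.

```python
import math


def disaggregate_rate_multiplicative(total, populations, baseline_rates):
    """Split an aggregate count across groups, assuming T = log.

    Hypothetical helper for illustration; not the package's own function.
    """
    # With T = log, f_hat_i = exp(beta) * f_i, so the constraint
    # total = sum_i exp(beta) * f_i * p_i fixes exp(beta) directly.
    expected = sum(f * p for f, p in zip(baseline_rates, populations))
    scale = total / expected  # this is exp(beta)
    rates = [scale * f for f in baseline_rates]
    counts = [r * p for r, p in zip(rates, populations)]
    return math.log(scale), rates, counts


# Made-up inputs: three groups and an aggregate of 340 events.
populations = [1000.0, 2000.0, 3000.0]
baseline_rates = [0.01, 0.02, 0.04]
beta, rates, counts = disaggregate_rate_multiplicative(
    340.0, populations, baseline_rates
)
```

Here the baseline pattern implies 170 events, so every rate is doubled and the per-group counts sum back to the aggregate. Nothing in this scaling prevents an estimated rate from exceeding 1, which is one motivation for the more general transformations.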
Current Package Capabilities and Models
Currently, the multiplicative-in-rate model RateMultiplicativeModel with $T(x)=\log(x)$ and the Log Modified Odds model LMO_model(m) with $T(x)=\log(\frac{x}{1-x^{m}})$ are implemented. Note that LMO_model with m=1 gives a multiplicative-in-odds model.
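To see why m=1 corresponds to multiplicativity in the odds, note that $\log(\frac{x}{1-x})$ is the usual log-odds, whose inverse is the logistic function. A minimal check of this, using hypothetical helper names that are not part of the package:

```python
import math


def log_modified_odds(x, m):
    """T(x) = log(x / (1 - x**m)) for 0 < x < 1 (illustrative helper)."""
    return math.log(x / (1.0 - x ** m))


def logit(x):
    """Ordinary log-odds."""
    return math.log(x / (1.0 - x))


def expit(y):
    """Logistic function, the inverse of logit."""
    return 1.0 / (1.0 + math.exp(-y))


rate = 0.2
beta = math.log(3.0)  # shifting by beta should multiply the odds by 3

# With m = 1 the transform is the log-odds, so its inverse is expit.
shifted_rate = expit(beta + log_modified_odds(rate, 1))

odds_before = rate / (1.0 - rate)
odds_after = shifted_rate / (1.0 - shifted_rate)
```

The ratio `odds_after / odds_before` equals $e^{\beta}$, while the shifted rate itself stays strictly between 0 and 1.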
A useful (but slightly wrong) analogy is that the multiplicative-in-rate model is to the multiplicative-in-odds model as ordinary least squares is to logistic regression, in terms of the relationship between covariates and output (not in terms of anything like the likelihood).
Increasing m in the model LMO_model(m) gives results that are more similar to the multiplicative-in-rate model currently in use, while preserving the property that rate estimates are bounded by 1.
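The behaviour described above can be explored with a small numerical sketch. It is not the package's implementation: the function names are hypothetical, $T^{-1}$ is computed by bisection because no closed form is assumed for general m, and the search bracket for $\beta$ is an arbitrary choice.

```python
import math


def lmo_transform(x, m):
    """T(x) = log(x / (1 - x**m)), defined for 0 < x < 1."""
    if x <= 0.0:
        return -math.inf
    denom = 1.0 - x ** m
    if denom <= 0.0:  # x is numerically indistinguishable from 1
        return math.inf
    return math.log(x / denom)


def lmo_inverse(y, m, iters=100):
    """Invert T by bisection on (0, 1); T is increasing there."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if lmo_transform(mid, m) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def disaggregate_lmo(total, populations, baseline_rates, m, iters=100):
    """Find beta so that sum_i p_i * T^{-1}(beta + T(f_i)) equals total.

    Assumes 0 < f_i < 1 and 0 < total < sum(populations).
    """
    transformed = [lmo_transform(f, m) for f in baseline_rates]

    def implied_total(beta):
        return sum(
            p * lmo_inverse(beta + t, m)
            for p, t in zip(populations, transformed)
        )

    lo, hi = -50.0, 50.0  # assumed bracket for beta
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if implied_total(mid) < total:  # implied total increases with beta
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    rates = [lmo_inverse(beta + t, m) for t in transformed]
    counts = [r * p for r, p in zip(rates, populations)]
    return beta, rates, counts


# Made-up inputs for illustration.
populations = [100.0, 100.0]
baseline_rates = [0.2, 0.6]

# To reach 140 events, a log transform would need rates
# 1.75 * [0.2, 0.6] = [0.35, 1.05]; with m = 1 both rates stay below 1.
_, rates_odds, counts_odds = disaggregate_lmo(
    140.0, populations, baseline_rates, m=1
)

# For a smaller total of 40 events, a large m closely tracks the
# multiplicative-in-rate answer of [0.1, 0.3].
_, rates_m50, counts_m50 = disaggregate_lmo(
    40.0, populations, baseline_rates, m=50
)
```

Bisection is used here only to keep the sketch dependency-free; a production solver would more likely use a dedicated root finder.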