Population table manipulation.
estime2
This is a Python package for manipulating and correcting the end-of-period population of a given table, as computed by the component method. The program "distributes" component values to other records so that no end-of-period population estimate is negative. It also enforces sum constraints across regional levels (provincial and subprovincial), so that the total end-of-period population after the corrections equals that of the original population table.
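As a rough illustration of the component method referred to above, the sketch below computes an end-of-period population for a single record as the initial population plus gains minus losses. The component names (births, immigrants, deaths, emigrants) and the function end_of_period are hypothetical and are not taken from estime2; the sketch also ignores the ageing of cohorts from one age to the next.

```python
# Hypothetical component names, chosen only for this illustration.
ADDITIVE = ("births", "immigrants")      # components that add people
SUBTRACTIVE = ("deaths", "emigrants")    # components that remove people


def end_of_period(initial, components):
    """Return initial population plus gains minus losses for one record."""
    gains = sum(components.get(name, 0) for name in ADDITIVE)
    losses = sum(components.get(name, 0) for name in SUBTRACTIVE)
    return initial + gains - losses


# A record with a small initial population and more deaths than people:
record = {"births": 0, "immigrants": 1, "deaths": 6, "emigrants": 0}
end_pop = end_of_period(1, record)
print(end_pop)  # 1 + 1 - 6 = -4
```

A negative result like this one is the kind of estimate the package is meant to correct.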
Public version: https://gitlab.com/joon3216/estime2 (private repository)
StatCan version: https://f3eaipitcap01.statcan.ca/junkpar/estime2 (not available to the public)
Refer to the documentation for details.
Installation
In the command line, simply type:
pip install estime2
To update to the latest version, type:
pip install estime2 --upgrade
To install from source, first download the whole repository using a proper git clone command. Then, move your working directory to that repository, and type:
python setup.py install --user
Example
Suppose tbl is a pandas.DataFrame that qualifies to become an estime2.ProvPopTable. Creating an instance of ProvPopTable is done as follows:
import estime2
poptbl = estime2.ProvPopTable(tbl)
print(poptbl)
#> Sex Age Initial Population BTH ... NPR, 2019-07-01 IMM IIM RAI
#> 0 1 -1 0 473 ... 0 0 5 2
#> 1 1 0 455 0 ... 0 0 12 2
#> 2 1 1 449 0 ... 0 0 10 2
#> 3 1 2 446 0 ... 0 0 10 2
#> 4 1 3 435 0 ... 0 0 11 2
#> .. ... ... ... ... ... ... ... ... ...
#> 97 1 96 0 0 ... 0 0 0 1
#> 98 1 97 0 0 ... 0 0 0 2
#> 99 1 98 1 0 ... 0 0 0 2
#> 100 1 99 0 0 ... 0 0 0 2
#> 101 1 100+ 1 0 ... 0 0 0 2
#>
#> [102 rows x 15 columns]
See the source code for more information about the arguments of ProvPopTable.
ProvPopTable.calculate_pop() is the method that computes the end-of-period population:
calculated_poptbl = poptbl.calculate_pop()
print(calculated_poptbl)
#> Sex Age Postcensal Population
#> 0 1 0 461
#> 1 1 1 449
#> 2 1 2 446
#> 3 1 3 442
#> 4 1 4 435
#> .. ... ... ...
#> 96 1 96 1
#> 97 1 97 -4
#> 98 1 98 1
#> 99 1 99 2
#> 100 1 100+ 2
#>
#> [101 rows x 3 columns]
Note that the total end-of-period population of poptbl before applying the corrections is:
print(calculated_poptbl[estime2.options.pop.end].sum())
#> 20023
estime2.options holds the global options that the package relies on. See the source code for details.
ProvPopTable.fix_issues() returns a fixed version of the original ProvPopTable in which no end-of-period population is negative:
poptbl_fixed_tbl = poptbl.fix_issues()
print(poptbl_fixed_tbl)
#> Sex Age Initial Population BTH ... NPR, 2019-07-01 IMM IIM RAI
#> 0 1 -1 0 473 ... 0 0 5 2
#> 1 1 0 455 0 ... 0 0 12 2
#> 2 1 1 449 0 ... 0 0 10 2
#> 3 1 2 446 0 ... 0 0 10 2
#> 4 1 3 435 0 ... 0 0 11 2
#> .. ... ... ... ... ... ... ... ... ...
#> 97 1 96 0 0 ... 0 0 0 1
#> 98 1 97 0 0 ... 0 0 0 2
#> 99 1 98 1 0 ... 0 0 0 2
#> 100 1 99 0 0 ... 0 0 0 2
#> 101 1 100+ 1 0 ... 0 0 0 2
#>
#> [102 rows x 15 columns]
Any negative end-of-period population is brought up to 0, and the counter-modifications are applied to records of neighbouring ages:
calculated_poptbl_fixed = poptbl_fixed_tbl.calculate_pop()
print(calculated_poptbl_fixed)
#> Sex Age Postcensal Population
#> 0 1 0 461
#> 1 1 1 449
#> 2 1 2 446
#> 3 1 3 442
#> 4 1 4 435
#> .. ... ... ...
#> 96 1 96 1
#> 97 1 97 0
#> 98 1 98 1
#> 99 1 99 2
#> 100 1 100+ 2
#>
#> [101 rows x 3 columns]
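The idea of raising a negative value to 0 and offsetting it on neighbouring ages can be sketched in a few lines. This is not the algorithm estime2 uses (the package modifies component values, and its rules for choosing neighbours are not described here); it is a minimal illustration that works directly on a list of end-of-period populations, with a hypothetical function name, and takes the deficit from the nearest ages that have a positive population.

```python
def redistribute_negatives(pops):
    """Raise negative entries to 0, taking the deficit from nearest positive neighbours."""
    fixed = list(pops)
    n = len(fixed)
    for i in range(n):
        deficit = -fixed[i]
        if deficit <= 0:
            continue  # nothing to fix at this age
        fixed[i] = 0
        # Visit neighbours in order of increasing distance from age i.
        for dist in range(1, n):
            for j in (i - dist, i + dist):
                if deficit == 0:
                    break
                if 0 <= j < n and fixed[j] > 0:
                    take = min(fixed[j], deficit)
                    fixed[j] -= take
                    deficit -= take
            if deficit == 0:
                break
        if deficit > 0:
            raise ValueError("not enough positive population to absorb the deficit")
    return fixed


before = [3, 1, -4, 1, 2]
after = redistribute_negatives(before)
print(after)                    # [1, 0, 0, 0, 2]
print(sum(before), sum(after))  # 3 3
```

Because every unit added to a negative record is removed from another record, the total is unchanged, which is the same property the package guarantees for its own corrections.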
ProvPopTable.fix_issues() preserves the total end-of-period population of the original table:
print(calculated_poptbl_fixed[estime2.options.pop.end].sum())
#> 20023
If you set return_all_mods to True in ProvPopTable.fix_issues(), you get a wrapper object that allows you to compute relevant metrics:
poptbl_fixed = poptbl.fix_issues(return_all_mods = True)
For example, you may compute the standard deviation of all the corrections applied to poptbl as follows:
poptbl_sd = poptbl_fixed.get_metric_sd()
print(poptbl_sd)
#> Sex Age Component sd
#> 0 1 97 DTH 2.236068
The wrapper object also comes with some visualization tools. For example, you can visualize pre- and post-correction end-of-period populations as follows:
poptbl_fixed.plot_pop(age_range = [87, 97])