Population table manipulation.
Project description
estime2
This is a Python package to manipulate and make corrections on the end-of-period population of a given table based on the component method. The program aims to “distribute” values of components to other records so that no end-of-period population estimates are negative. Moreover, it incorporates sum constraints across regional levels, provincial and subprovincial, so that the total end-of-period population is the same as the original population table after it goes through the process.
Public version: https://gitlab.com/joon3216/estime2 (private
repository)
StatCan version: https://f3eaipitcap01.statcan.ca/junkpar/estime2 (not
available to public)
Refer to documentations for details.
Installation
In the command line, simply type:
pip install estime2
To update to the latest version, type:
pip install estime2 --upgrade
To install from source, first download the whole repository using a
proper git clone
command. Then, move your working directory to that
repository, and type:
python setup.py install --user
Example
Suppose tbl
is a pandas.DataFrame
that qualifies to become a
estime2.ProvPopTable
. Creating an instance of ProvPopTable
is done
as follows:
import estime2
poptbl = estime2.ProvPopTable(tbl)
print(poptbl)
#> Sex Age Initial Population BTH ... NPR, 2019-07-01 IMM IIM RAI
#> 0 1 -1 0 473 ... 0 0 5 2
#> 1 1 0 455 0 ... 0 0 12 2
#> 2 1 1 449 0 ... 0 0 10 2
#> 3 1 2 446 0 ... 0 0 10 2
#> 4 1 3 435 0 ... 0 0 11 2
#> .. ... ... ... ... ... ... ... ... ...
#> 97 1 96 0 0 ... 0 0 0 1
#> 98 1 97 0 0 ... 0 0 0 2
#> 99 1 98 1 0 ... 0 0 0 2
#> 100 1 99 0 0 ... 0 0 0 2
#> 101 1 100+ 1 0 ... 0 0 0 2
#>
#> [102 rows x 15 columns]
See the source code for more information about the arguments of
ProvPopTable
.
ProvPopTable.calculate_pop()
is the method that computes the
end-of-period population:
calculated_poptbl = poptbl.calculate_pop()
print(calculated_poptbl)
#> Sex Age Postcensal Population
#> 0 1 0 461
#> 1 1 1 449
#> 2 1 2 446
#> 3 1 3 442
#> 4 1 4 435
#> .. ... ... ...
#> 96 1 96 1
#> 97 1 97 -4
#> 98 1 98 1
#> 99 1 99 2
#> 100 1 100+ 2
#>
#> [101 rows x 3 columns]
Note that the total end-of-period population of poptbl
before applying
the corrections is:
print(calculated_poptbl[estime2.options.pop.end].sum())
#> 20023
estime2.options
has many global options available for the package to
work. See the source codes for details.
ProvPopTable.fix_issues()
returns the fixed version of the original
ProvPopTable
where there are no negative end-of-period population(s):
poptbl_fixed_tbl = poptbl.fix_issues()
print(poptbl_fixed_tbl)
#> Sex Age Initial Population BTH ... NPR, 2019-07-01 IMM IIM RAI
#> 0 1 -1 0 473 ... 0 0 5 2
#> 1 1 0 455 0 ... 0 0 12 2
#> 2 1 1 449 0 ... 0 0 10 2
#> 3 1 2 446 0 ... 0 0 10 2
#> 4 1 3 435 0 ... 0 0 11 2
#> .. ... ... ... ... ... ... ... ... ...
#> 97 1 96 0 0 ... 0 0 0 1
#> 98 1 97 0 0 ... 0 0 0 2
#> 99 1 98 1 0 ... 0 0 0 2
#> 100 1 99 0 0 ... 0 0 0 2
#> 101 1 100+ 1 0 ... 0 0 0 2
#>
#> [102 rows x 15 columns]
Any negative end-of-period is brought up to 0, and the counter-modifications are applied to records of neighbouring ages:
calculated_poptbl_fixed = poptbl_fixed_tbl.calculate_pop()
print(calculated_poptbl_fixed)
#> Sex Age Postcensal Population
#> 0 1 0 461
#> 1 1 1 449
#> 2 1 2 446
#> 3 1 3 442
#> 4 1 4 435
#> .. ... ... ...
#> 96 1 96 1
#> 97 1 97 0
#> 98 1 98 1
#> 99 1 99 2
#> 100 1 100+ 2
#>
#> [101 rows x 3 columns]
ProvPopTable.fix_issues()
preserves the total end-of-period population
of the original table:
print(calculated_poptbl_fixed[estime2.options.pop.end].sum())
#> 20023
If you let return_all_mods
to be True
in
ProvPopTable.fix_issues()
, you get the wrapper object which allows you
to compute relevant metrics:
poptbl_fixed = poptbl.fix_issues(return_all_mods = True)
For example, you may compute the standard deviation of all the
corrections applied to poptbl
as follows:
poptbl_sd = poptbl_fixed.get_metric_sd()
print(poptbl_sd)
#> Sex Age Component sd
#> 0 1 97 DTH 2.236068
The wrapper object also comes with some visualization tools. For example, you can visualize pre- and post-correction end-of-period populations as follows:
poptbl_fixed.plot_pop(age_range = [87, 97])
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.