Value Based Prioritization
ValueBasedPrioritization
Article
- PDF: value_based_prioritization.pdf
- TeX Source: value_based_prioritization.tex
vbp
Value Based Prioritization (vbp) uses value theory to quantitatively prioritize potential actions to accomplish a goal.
https://github.com/freeradical13/ValueBasedPrioritization
This package provides abstract classes and utility methods to run VBP. It focuses mostly on Modeled VBP, which uses time series data to predict future values and prioritizes actions based on the relative predicted values.
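As a rough illustration of the idea (not the package's actual implementation), the sketch below ranks a few actions by their relative predicted values. The action names, the numbers, and the `prioritize` helper are all assumptions made up for this example.

```python
def prioritize(predicted_values):
    """Return (action, relative share) pairs sorted from highest to lowest predicted value."""
    total = sum(predicted_values.values())
    ranked = sorted(predicted_values.items(), key=lambda kv: kv[1], reverse=True)
    return [(action, value / total) for action, value in ranked]


# Hypothetical predicted values for three hypothetical actions.
predictions = {"Action A": 120.0, "Action B": 300.0, "Action C": 80.0}
ranking = prioritize(predictions)
for action, share in ranking:
    print(f"{action}: {share:.1%}")
```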
The DataSource class is the base abstract class for VBP.
The TimeSeriesDataSource abstract class inherits from DataSource and may be used for Modeled VBP. The ExampleDataSource class demonstrates a simple data source based on TimeSeriesDataSource.
Built-in Modeled VBPs include Underlying Cause of Death models for the United States (UCODUnitedStates) and the World (UCODWorld). These data sources both inherit from ICDDataSource which inherits from TimeSeriesDataSource.
The run module may be used from the command line to perform different VBP actions such as listing actions (list), counting actions (count), predicting values (predict), running Modeled VBP (modeled_value_based_prioritization), and more. For usage, run:
python -m vbp.run
Source Code
Prerequisites:
pip3 install numpy pandas matplotlib statsmodels scipy fbprophet
Usage
python3 -m vbp.run -h
Any non-screen output goes to the vbpoutput folder.
Running
The model type is specified with --ets, --ols, and/or --prophet.
These are not mutually exclusive; if combined during modeled_value_based_prioritization, an average is taken of the results. The default is --ets.
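The averaging step might look something like the sketch below. It assumes a simple unweighted mean per action; the text above only says that an average is taken, so the exact method, the `average_predictions` helper, and the sample numbers are assumptions.

```python
from statistics import mean


def average_predictions(per_model):
    """Average each action's predicted value across all models that were run."""
    # Only keep actions for which every model produced a prediction.
    actions = set.intersection(*(set(p) for p in per_model.values()))
    return {a: mean(p[a] for p in per_model.values()) for a in sorted(actions)}


# Hypothetical predictions from two model types for the same two actions.
per_model = {
    "ets": {"Action A": 10.0, "Action B": 30.0},
    "ols": {"Action A": 14.0, "Action B": 26.0},
}
combined = average_predictions(per_model)
print(combined)
```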
By default, action names are obfuscated to reduce bias during model building and testing. Specify --do-not-obfuscate to show actual names.
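One way such obfuscation could work is to replace each real name with a neutral label in shuffled order, as in the sketch below. The `obfuscate` helper and the "Action N" label format are assumptions for illustration, not the package's actual scheme.

```python
import random


def obfuscate(names, seed=None):
    """Map each real action name to a neutral label such as 'Action 1'."""
    shuffled = list(names)
    # Shuffle so the label number does not reveal the original ordering.
    random.Random(seed).shuffle(shuffled)
    return {name: f"Action {i}" for i, name in enumerate(shuffled, start=1)}


mapping = obfuscate(["Ischemic heart diseases", "Malaria"], seed=0)
for real_name, label in mapping.items():
    print(label, "<-", real_name)
```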
Some data sources have different data types (e.g. mutually exclusive groupings of data). Add the -a argument before the data source name to run for all data types. Add the --data-type X argument after the data source name to select a single data type.
In general, a list of actions may be given to restrict the run to those actions; without one, all actions are processed.
Examples:
python3 -m vbp.run modeled_value_based_prioritization UnderlyingCausesOfDeathUnitedStates
python3 -m vbp.run modeled_value_based_prioritization UnderlyingCausesOfDeathUnitedStates --do-not-obfuscate "Ischemic heart diseases" Malaria
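When scripting several runs, the argument placement described above can be captured in a small helper. The sketch below is a hypothetical wrapper (the `build_command` function is not part of vbp); it only uses flags mentioned in this README and assumes -a goes directly before the data source name.

```python
import shlex


def build_command(action, data_source, *names, all_types=False, data_type=None, obfuscate=True):
    """Assemble an argv list for `python3 -m vbp.run` using the flag placement described above."""
    argv = ["python3", "-m", "vbp.run", action]
    if all_types:
        argv.append("-a")  # goes before the data source name
    argv.append(data_source)
    if data_type is not None:
        argv += ["--data-type", data_type]  # goes after the data source name
    if not obfuscate:
        argv.append("--do-not-obfuscate")
    argv += names  # optional list of actions to restrict the run to
    return argv


cmd = build_command(
    "modeled_value_based_prioritization",
    "UnderlyingCausesOfDeathUnitedStates",
    "Ischemic heart diseases",
    "Malaria",
    obfuscate=False,
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` to execute the command.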
Exponential Smoothing
Using exponential smoothing:
python3 -m vbp.run modeled_value_based_prioritization ${DATA_SOURCE} --ets
Specify --ets-no-multiplicative-models to only use additive models.
Specify --ets-no-additive-models to only use multiplicative models.
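To illustrate the difference between the two model families, the sketch below implements Holt's exponential smoothing with either an additive or a multiplicative trend. It assumes the flags refer to the trend component and uses fixed smoothing parameters; the package itself may fit parameters and may include other components, so treat this as a conceptual sketch only.

```python
def holt_forecast(series, steps, alpha=0.8, beta=0.2, multiplicative=False):
    """Holt's exponential smoothing with an additive or multiplicative trend."""
    level = series[0]
    # Initial trend: a difference (additive) or a ratio (multiplicative).
    trend = series[1] / series[0] if multiplicative else series[1] - series[0]
    for y in series[1:]:
        prev_level = level
        if multiplicative:
            level = alpha * y + (1 - alpha) * prev_level * trend
            trend = beta * (level / prev_level) + (1 - beta) * trend
        else:
            level = alpha * y + (1 - alpha) * (prev_level + trend)
            trend = beta * (level - prev_level) + (1 - beta) * trend
    if multiplicative:
        return [level * trend ** h for h in range(1, steps + 1)]
    return [level + h * trend for h in range(1, steps + 1)]


# Hypothetical series: one grows by a constant amount, the other by a constant factor.
print(holt_forecast([10, 12, 14, 16], steps=2))
print(holt_forecast([1, 2, 4, 8], steps=2, multiplicative=True))
```

An additive trend extrapolates a constant step per period, while a multiplicative trend extrapolates a constant growth factor, which is why the choice matters for long-range predictions.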
Linear Regression
Using linear regression:
python3 -m vbp.run modeled_value_based_prioritization ${DATA_SOURCE} --ols
Specify --ols-max-degrees X to model higher degrees.
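Assuming "higher degrees" means fitting polynomials of degree greater than one, the sketch below shows an ordinary least squares polynomial fit in plain Python and compares the fit error across degrees. The `polyfit` and `predict` helpers and the sample data are hypothetical; in practice a library routine such as `numpy.polyfit` would be the usual choice.

```python
def polyfit(xs, ys, degree):
    """Ordinary least squares polynomial fit; returns c such that y ~ c[0] + c[1]*x + ..."""
    n = degree + 1
    # Normal equations (X^T X) c = X^T y, stored as an augmented matrix.
    rows = [
        [sum(x ** (i + j) for x in xs) for j in range(n)]
        + [sum(y * x ** i for x, y in zip(xs, ys))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][n] / rows[i][i] for i in range(n)]


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


xs = [0, 1, 2, 3, 4]
ys = [1, 6, 17, 34, 57]  # hypothetical values following 1 + 2x + 3x^2
for degree in range(1, 3):
    coeffs = polyfit(xs, ys, degree)
    sse = sum((predict(coeffs, x) - y) ** 2 for x, y in zip(xs, ys))
    print(f"degree {degree}: sum of squared errors = {sse:.6f}")
```

With real data indexed by calendar year, the x values should be shifted (for example, to years since the first observation) before fitting, since raw normal equations become poorly conditioned for large x.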
Prophet
Using Facebook Prophet:
python3 -m vbp.run modeled_value_based_prioritization ${DATA_SOURCE} --prophet
United States
As of 2019-03-01, the unzipped U.S. mortality data occupies ~36 GB of disk space. It is downloaded and unzipped automatically when a function that needs it is used.
Long-term, comparable, leading causes of death
Generate data/ucod/united_states/comparable_data_since_1959.xlsx for all long-term, comparable, leading causes of death in https://www.cdc.gov/nchs/data/dvs/lead1900_98.pdf:
python3 -m vbp.run prepare_data UnderlyingCausesOfDeathUnitedStates
Rows 1900:1957 and the sheet Comparability Ratios in data/ucod/united_states/comparable_ucod_estimates.xlsx were manually input from https://www.cdc.gov/nchs/data/dvs/lead1900_98.pdf
Open comparable_data_since_1959.xlsx and copy rows 1959:Present.
Open comparable_ucod_estimates.xlsx and paste on top starting at 1959.
Process comparable_ucod_estimates.xlsx with its Comparability Ratios sheet to generate comparable_ucod_estimates_ratios_applied.xlsx:
python3 -m vbp.run prepare_data UnderlyingCausesOfDeathUnitedStates --comparable-ratios
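Conceptually, applying comparability ratios rescales values recorded under an older classification revision so they can be compared with values recorded under the latest one. The sketch below shows one common way to do this, multiplying each older value by the ratios of every later revision. The `apply_comparability_ratios` helper, the revision years, and the ratio values are hypothetical, and the package's own processing may differ.

```python
def apply_comparability_ratios(rates_by_year, revisions):
    """Scale older rates so they are comparable with the most recent revision.

    revisions is a list of (first_year, ratio) pairs, where ratio converts
    values from the previous revision to the revision that starts in first_year.
    """
    adjusted = {}
    for year, rate in rates_by_year.items():
        factor = 1.0
        for first_year, ratio in revisions:
            if year < first_year:
                # This year predates the revision, so its ratio must be applied.
                factor *= ratio
        adjusted[year] = rate * factor
    return adjusted


# Hypothetical rates for one cause and two hypothetical revision changes.
rates = {1960: 100.0, 1970: 90.0, 1980: 80.0}
revisions = [(1968, 1.1), (1979, 0.5)]
adjusted = apply_comparability_ratios(rates, revisions)
print(adjusted)
```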
Final output:
python3 -m vbp.run modeled_value_based_prioritization UnderlyingCausesOfDeathUnitedStates --data-type UCOD_LONGTERM_COMPARABLE_LEADING
World
As of 2019-03-01, the unzipped World mortality data occupies ~320 MB of disk space. It is downloaded and unzipped automatically when a function that needs it is used.
When testing, writing data spreadsheets is slow and can be skipped with --do-not-write-spreadsheets.
Creating a new Data Source
Review vbp/example.py for a simple example. The basic process is:
- Create a sub-class of vbp.DataSource in somename.py
- Implement all @abc.abstractmethod methods and override any other superclass methods as needed.
- Import somename.py at the top of vbp/run.py
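The general pattern of the steps above can be sketched as follows. The `DataSource` class here is a hypothetical stand-in defined only so the example runs on its own; the real vbp.DataSource declares its own abstract methods, and the method names below are assumptions, so consult vbp/example.py for the actual interface.

```python
import abc


class DataSource(abc.ABC):
    """Hypothetical stand-in for vbp.DataSource (the real abstract methods differ)."""

    @abc.abstractmethod
    def get_action_names(self):
        """Return the names of all actions this data source knows about."""

    @abc.abstractmethod
    def get_values(self, action):
        """Return the historical values for one action."""


class SomeName(DataSource):
    """Example subclass that would live in somename.py."""

    def __init__(self):
        # Hypothetical in-memory data; a real source would load files or download data.
        self._data = {"Action A": [3.0, 4.0, 5.0], "Action B": [9.0, 8.0, 7.0]}

    def get_action_names(self):
        return sorted(self._data)

    def get_values(self, action):
        return self._data[action]


source = SomeName()
for name in source.get_action_names():
    print(name, source.get_values(name))
```

Python refuses to instantiate a subclass that leaves any `@abc.abstractmethod` unimplemented, which makes missing methods easy to spot early.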
Development
# Edit version in setup.py and __init__.py
python3 setup.py sdist bdist_wheel
python3 -m twine upload --skip-existing dist/*
# https://pypi.org/project/vbp/