Modelling COVID-19 using SIR-like models
Project description
Epidemia
Epidemia is a Python library that simulates epidemic outbreaks using a SIR-like model.
Install
pip install epidemia
It is recommended to use PyPy instead of CPython (the default Python interpreter) if you have performance problems.
Models
The following SIR-like models have been implemented.
- SIR: susceptible, infected, recovered model
- SEIR: susceptible, exposed, infected, recovered model
- ModelReport2: the model defined in report 2 at COVID-19 en Chile
Optimization methods
List of available optimization methods.
Nelder-Mead
,Powell
,CG
,BFGS
,Newton-CG
,COBYLA
,trust-constr
,dogleg
,trust-ncg
,trust-exact
,trust-krylov
using scipy.optimize.minimize without boundslm
using scipy.optimize.least_squared without boundsL-BFGS-B
,TNC
,SLSQP
using scipy.optimize.minimizetrf
,dogbox
using scipy.optimize.least_squaredannealing
using scipy.optimize.dual_annealingbayesian
using skopt.gp_minimizegbrt
using skopt.gbrt_minimizeforest
using skopt.forest_minimizePSO
using pyswarms.single.global_best.GlobalBestPSO
From experience it is necessary to have a good initial estimate of parameters before fitting. An initial optimization round using either annealing
or especially PSO
work very well. For improved speed it is recommended to use the fast=True
flag for naive integration. Finally, an L-BFGS-B
with fast=False
(i.e. using RK4) will tweak the parameters to fit the curve as best as possible.
Examples
Jupyter notebooks of trained models for specific countries.
Simulating
We need to define the start time and initial state the compartments are in. Using a parameter function we can feed parameter values to the model dependent on time. To run the simulation we additionally define the end time.
import numpy as np
from epidemic import *
# Initial time and state
t0 = np.datetime64('2020-01-01')
y0 = {'S': 1e7, 'E': 200, 'Im': 200, 'I': 20, 'H': 0, 'Hc': 0, 'R': 0, 'D': 0}
# Our alpha at time 't'
def α(t):
if t >= np.datetime64('2020-06-01') and t < np.datetime64('2020-09-01'):
return 0.5
return 1.0
# Our parameters at time 't'
params = lambda Y,t: {
'βE': α(t) * 0.062015625,
'βIm': α(t) * 0.12403125,
'βI': α(t) * 0.165375,
'βH': α(t) * 0.0,
'βHc': α(t) * 0.0,
'γE': 0.2,
'γIm': 0.1,
'γI': 0.1,
'γH': 0.1666,
'γHc': 0.1,
'μb': 3.57e-5,
'μd': 1.57e-5,
'φEI': 0.50,
'φIR': 0.85,
'φHR': 0.85,
'φD': 0.50,
}
# Create and run model till time 'tmax'
model = ModelReport2()
epidemic = Epidemic(model, t0, tmax=np.datetime64('2021-06-01'))
epidemic.run(y0, params)
epidemic.run_parameter('R_effective', model.R_effective)
epidemic.plot('Epidemic', cols=['I', 'H', 'Hc', 'D'])
epidemic.plot_params('Epidemic (R effective)', cols=['R_effective'])
See example.py.
Training parameters
In order to train our parameters, we define a mapping function x_params
: x
=> params
with bounds x_bounds
for x
. The first parameter to x_params
is the initial state y0
, and the following parameters are those that are being optimized. The x_params
function will return a new initial state y0
and a new params
function to define how parameters develop over time (see above).
First we load our model like above, but we pass a data
DataFrame from which we can calculate the error. The DataFrame must have column names that correspond to the model's compartments and derived compartments (more on that later). We define our training variables, bounds and function in order to train the model.
# Define our training parameters: initial value, bounds, and mapping function to model parameters
x = [
0.74, # E0
10.3, # Im0
0.38, # CE
0.75, # CIm
0.165, # βI
0.2, # γE
0.1, # γIm
0.1, # γI
0.1667, # γH
0.1, # γHc
]
x_bounds = [
(0,20), # E0
(0,20), # Im0
(0.0,0.4), # CE
(0.0,0.9), # CIm
(0.0,0.75), # βI
(0.17,0.25), # γE
(0.07,0.14), # γIm
(0.07,0.14), # γI
(0.1,0.5), # γH
(0.0625,0.14), # γHc
]
def x_params(E0, Im0, CE, CIm, βI, γE, γIm, γI, γH, γHc):
y0 = {
'S': 1e7,
'E': E0 * I0,
'Im': Im0 * I0,
'I': I0,
'H': 0,
'Hc': 0,
'R': 0,
'D': D0,
}
λ1 = np.datetime64('2020-04-01')
κ1 = 0.05
α2 = 0.75
α = lambda t: 1.0 if t < λ1 else α2 + (1.0-α2)*np.exp(-κ1*(t-λ1)/np.timedelta64(1,'D'))
return y0, lambda t: {
'βE': α(t) * CE * βI,
'βIm': α(t) * CIm * βI,
'βI': α(t) * βI,
'βH': 0.0,
'βHc': 0.0,
'γE': γE,
'γIm': γIm,
'γI': γI,
'γH': γH,
'γHc': γHc,
'μb': 3.57e-5,
'μd': 1.57e-5,
'φEI': 0.5,
'φIR': 0.6,
'φHR': 0.6,
'φD': 0.2,
}
Now we can train the parameters in x
using a variety of methods. Currently implemented are the scipy.optimize.minimize
methods, and the scipy.optimize.dual_annealing
, scipy.optimize.least_squares
, and skopt.*_minimize
methods. It is recommended to use fast=True
(which doesn't use Runge-Kutta 4 and is this ~4 times faster) for all but the last optimization.
options = {
'bayesian': {
'n_calls': 100,
'n_random_starts': 10,
'fast': True,
},
'annealing': {
'seed': 1234567,
'fast': True,
},
'L-BFGS-B': {
'disp': True,
},
}
for method in ['annealing', 'L-BFGS-B']:
opt = {}
if method in options:
opt = options[method]
x = epidemic.optimize(x, x_bounds, x_params, method=method, **opt)
epidemic.plot(cols=['I_cases', 'I'])
epidemic.plot(cols=['D'])
See example_train.py.
Loading data
In order to optimize our simulation, we need to pass training data to the model. The Epidemic
class accepts a data
argument of that should be a DataFrame, where its columns correspond to the compartments or semi-compartments of the model. For instance, I
, or D
are valid compartments, but also derived compartments such as the cumulative I_cases
or H_cases
.
df_infectados = pd.read_csv('data/chile_minsal_infectados.csv', sep=';', index_col=0)
df_infectados = df_infectados.transpose()
df_infectados.index = pd.to_datetime(df_infectados.index, format='%d-%b') + pd.offsets.DateOffset(years=120)
df_fallecidos = pd.read_csv('data/chile_minsal_fallecidos.csv', sep=';', index_col=0)
df_fallecidos = df_fallecidos.transpose()
df_fallecidos.index = pd.to_datetime(df_fallecidos.index, format='%d-%b') + pd.offsets.DateOffset(years=120)
data = pd.DataFrame({
'I_cases': df_infectados['Región Metropolitana'],
'D': df_fallecidos['Región Metropolitana'],
})
epidemic = Epidemic(model, t0, tmax, data=data)
Change simulation time range
Given an Epidemic, we can extend the simulation time range. This will run all simulations for the new time range.
epidemic.extend(np.datetime64('2021-06-01'))
Add confidence intervals
When calling run
on an Epidemic
, we can pass the tag
argument. If this is anything but None
, empty, or mean
, we will assume this is an extra curve that will be saved. If the tag name is lower
or upper
, it will serve as the lower and upper bounds respectively for the confidence intervals while plotting.
epidemic.run(y0_lower, params_lower, tag='lower')
epidemic.run(y0_upper, params_upper, tag='upper')
# or
epidemic.run(*x_params(*x_lower), tag='lower')
epidemic.run(*x_params(*x_upper), tag='upper')
Visualization and parameter analysis
Plotting data columns
epidemic.plot(cols=['I', 'H', 'Hc', 'D'])
Plotting derived parameters
epidemic.plot_params(cols=['Reff'])
Printing parameters
Display the model parameters and their values.
epidemic.print_params()
Parameter | Value |
---|---|
βE | 0.02293 |
βIm | 0.06112 |
βI | 0.1891 |
βH | 0 |
βHc | 0 |
γE | 0.186 |
γIm | 0.1376 |
γI | 0.1116 |
γH | 0.14 |
γHc | 0.0626 |
μb | 3.57e-05 |
μd | 1.57e-05 |
φEI | 0.5 |
φIR | 0.6006 |
φHR | 0.6049 |
φD | 0.2 |
Printing training parameters
Display the training parameters and their values and ranges.
epidemic.print_x_params(x, x_bounds, x_params)
Parameter | Value | Range |
---|---|---|
E0 | 0.8514 | [0, 20] |
Im0 | 2.526 | [0, 20] |
CE | 0.1213 | [0, 0.4] |
CIm | 0.3233 | [0, 0.9] |
βI | 0.2521 | [0, 0.75] |
γE | 0.186 | [0.17, 0.25] |
γIm | 0.1376 | [0.07, 0.14] |
γI | 0.1116 | [0.07, 0.14] |
γH | 0.14 | [0.07, 0.14] |
γHc | 0.0626 | [0.0625, 0.14] |
Printing statistics
Print relevant statistics about the simulation.
epidemic.print_stats()
Parameter | Value | Date |
---|---|---|
R effective | 1.76 | 2020-03-15 |
R effective | 1.41 | 2020-05-01 |
Fatality | 0.02 | 2020-05-01 |
max(I) | 3011 | 2020-05-01 |
max(H) | 1974 | 2020-05-01 |
max(Hc) | 1139 | 2020-05-01 |
max(D) | 173 | 2020-05-01 |
Parameter sensitivity
Probing parameter sensitivity to the error. Each parameter is moved a small distance epsilon and evaluated to see how much it impacts the error. Higher values means these parameter are very sensitive. When zero it means it has no impact on the error and is thus independent.
Each column shows the impact of the error on that data series, while * is the total model error.
epidemic.param_sensitivity()
* | D | I_cases | |
---|---|---|---|
E0 | 0.0025952 | 0.00267847 | 0.00253102 |
Im0 | 0.000353852 | 0.000319161 | 0.00036079 |
CE | 0.00634905 | 0.00140165 | 0.0113138 |
CIm | 0.00543327 | 0.00270623 | 0.00813083 |
βI | 0.0271594 | 0.00895137 | 0.0453727 |
γE | 0.0110189 | 0.0161675 | 0.00588935 |
γIm | 0.0115601 | 0.00123512 | 0.0219025 |
γI | 0.00850704 | 0.0615199 | 0.0445023 |
γH | 0.0087152 | 0.0174582 | 0 |
γHc | 0.00790336 | 0.0157928 | 0 |
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file epidemia-0.0.3.tar.gz
.
File metadata
- Download URL: epidemia-0.0.3.tar.gz
- Upload date:
- Size: 1.7 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/45.1.0 requests-toolbelt/0.9.1 tqdm/4.42.0 CPython/3.7.5
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | c10d7a158edd5cbf4efa5dd53aaefdee16bb9180e1d11c7964cc3818ca718789 |
|
MD5 | d2f584315b75edde7ff6532f3bc50cb7 |
|
BLAKE2b-256 | b91f399ad22cf3489331f9c12563677c1857ad8e57f812f37385c183ea0049e8 |
Provenance
File details
Details for the file epidemia-0.0.3-py3-none-any.whl
.
File metadata
- Download URL: epidemia-0.0.3-py3-none-any.whl
- Upload date:
- Size: 28.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/45.1.0 requests-toolbelt/0.9.1 tqdm/4.42.0 CPython/3.7.5
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | b44d7f785de7ff569473102616aadc39a6d89d4265efe91839af2a667e75eb98 |
|
MD5 | e1f0c5553b23a6831134f0a70af99d00 |
|
BLAKE2b-256 | d2c49acd28e2974a584fea5ece072df52990c23332a822c819d4d36327efeb02 |