Maximum likelihood estimation and model selection of EMMs

exp_mixture_model

Maximum likelihood estimation and model selection for the exponential mixture model (EMM), i.e., a mixture of exponential distributions.

When you use this code, please cite the following paper:

Makoto Okada, Kenji Yamanishi, Naoki Masuda. Long-tailed distributions of inter-event times as mixtures of exponential distributions. arXiv:19xx.xxxxx

Installation

exp_mixture_model is hosted on PyPI, so you can install it by running

pip install exp_mixture_model

If you want to install from source, clone the exp_mixture_model git repository by running

git clone https://github.com/naokimas/exp_mixture_model.git

Then, navigate to the top level of the cloned directory and run

python setup.py install

You can test our code by running

python setup.py test

If you use Anaconda, you can install the required packages by running

conda install --file requirements.txt

Quick Use

Fit an EMM to data stored in a file (e.g., sample.dat) by running

python emmfit.py -f sample.dat -k 10
  • '10' is the initial number of components.
  • 'sample.dat' is provided as part of this package. It is synthetic data that we generated by running
from emm import generate_emm
x = generate_emm(1000, 10)
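
To make concrete what synthetic EMM data looks like, here is a minimal NumPy sketch that samples from a mixture of exponential distributions. It is not the implementation of 'generate_emm'; the function name 'sample_exp_mixture', its arguments, and the parameterization of each component by its mean are assumptions made for illustration.

```python
import numpy as np

def sample_exp_mixture(n, pi, mu, seed=None):
    """Draw n samples from a mixture of exponential distributions.

    pi: mixing weights (non-negative, summing to 1)
    mu: component means (positive); hypothetical parameterization by mean
    """
    rng = np.random.default_rng(seed)
    pi = np.asarray(pi, dtype=float)
    mu = np.asarray(mu, dtype=float)
    # Pick a component index for each sample according to the weights.
    labels = rng.choice(len(pi), size=n, p=pi)
    # Draw each sample from the exponential with the chosen component's mean.
    return rng.exponential(scale=mu[labels])

# Two components: a fast one (mean 1) and a slow one (mean 50).
x = sample_exp_mixture(1000, pi=[0.7, 0.3], mu=[1.0, 50.0], seed=0)
```

Saving such an array with np.savetxt writes one value per line, which np.loadtxt reads back as a one-dimensional array.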

To select the best model among EMMs with different numbers of components, omit '-k' and specify the model selection criterion with '-c' as follows.

python emmfit.py -f sample.dat -c DNML
  • One can specify either 'marginal_log_likelihood', 'joint_log_likelihood', 'AIC', 'BIC', 'AIC_LVC', 'BIC_LVC', 'NML_LVC', or 'DNML' as the argument of '-c'.
  • Default is 'DNML'.

Check details of the usage by running

python emmfit.py --help

Usage

Fit an EMM to data by running

from exp_mixture_model import EMM
x = [1.5, 2.3, ...]  # data can be either a list or numpy array
#
# Alternatively, one can load data from a file using numpy as follows.
#
# import numpy as np
# x = np.loadtxt("sample.dat")
#
model = EMM()
pi, mu = model.fit(x)  # estimate the parameters
model.print_result()  # print 'k_final' (i.e., the estimated effective number of components) and the estimated parameters
model.plot_survival_probability()  # plot the survival probability (= complementary cumulative distribution function) for the estimated EMM and the given data 'x'.
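
For readers who want to see what the survival probability comparison involves, the following sketch computes the survival function of an exponential mixture and the empirical survival function of a data set. It is independent of the package: the helper names are hypothetical, and it assumes that 'pi' holds the mixing weights and 'mu' the component means, which the text above does not state explicitly.

```python
import numpy as np

def emm_survival(t, pi, mu):
    """P(X > t) = sum_j pi_j * exp(-t / mu_j), assuming mu_j are component means."""
    t = np.asarray(t, dtype=float)[..., None]  # add an axis for the components
    pi = np.asarray(pi, dtype=float)
    mu = np.asarray(mu, dtype=float)
    return np.sum(pi * np.exp(-t / mu), axis=-1)

def empirical_survival(x):
    """Return the sorted data and, for each sorted position i (1-based), 1 - i/n."""
    xs = np.sort(np.asarray(x, dtype=float))
    n = len(xs)
    surv = 1.0 - np.arange(1, n + 1) / n
    return xs, surv

# Example with illustrative (not fitted) parameter values.
xs, surv_data = empirical_survival([0.2, 1.5, 2.3, 40.0])
surv_model = emm_survival(xs, pi=[0.7, 0.3], mu=[1.0, 50.0])
```

Plotting 'surv_data' and 'surv_model' against 'xs' on logarithmic axes gives a comparison in the same spirit; a long tail appears as a slow decay at large values.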

Select the number of components based on a model selection criterion by running

from exp_mixture_model import EMMs
x = [1.5, 2.3, ...]
emms = EMMs()
emms.fit(x)  # fit EMMs with different values of 'k', i.e., the number of components. The default uses 13 values of 'k'. This process is computationally heavy.
best_model = emms.select('DNML')  # select the best number of components under the 'DNML' criterion. One can specify either 'marginal_log_likelihood', 'joint_log_likelihood', 'AIC', 'BIC', 'AIC_LVC', 'BIC_LVC', 'NML_LVC', or 'DNML' as the argument of 'emms.select'.
emms.print_result_table()  # print the values of 'k_final', likelihoods, and 'DNML' for each 'k' value
best_model.print_result() # print 'k_final' and the estimated parameter values of the selected EMM
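
As a rough illustration of how an information criterion trades goodness of fit against the number of components, the sketch below computes the textbook AIC and BIC for an exponential mixture. It is an assumption-laden example rather than the package's code: the helper names are hypothetical, 'mu' is assumed to hold component means, and the package may scale or define its criteria differently (the LVC, NML, and DNML variants are not covered here).

```python
import numpy as np

def emm_log_likelihood(x, pi, mu):
    """Log-likelihood of data x under density sum_j (pi_j / mu_j) * exp(-x / mu_j)."""
    x = np.asarray(x, dtype=float)[:, None]  # shape (n, 1)
    pi = np.asarray(pi, dtype=float)
    mu = np.asarray(mu, dtype=float)
    dens = np.sum(pi / mu * np.exp(-x / mu), axis=1)  # shape (n,)
    return float(np.sum(np.log(dens)))

def aic_bic(log_lik, k, n):
    """Textbook AIC and BIC for a k-component mixture fitted to n observations."""
    n_params = 2 * k - 1  # k means plus k - 1 free mixing weights
    aic = -2.0 * log_lik + 2.0 * n_params
    bic = -2.0 * log_lik + n_params * np.log(n)
    return aic, bic

# Illustrative values only; in practice pi and mu come from a fitted model.
data = [0.2, 1.5, 2.3, 40.0]
ll = emm_log_likelihood(data, pi=[0.7, 0.3], mu=[1.0, 50.0])
aic, bic = aic_bic(ll, k=2, n=len(data))
```

Under either criterion, a lower value indicates a better balance between fit and complexity.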
