A Python Dirichlet Multinomial Mixture Model
pyDIMM
A Python Dirichlet Multinomial Mixture Model.
Ready
Typically, when you import pyDIMM in your program, the C library in clibs is compiled automatically, and you can skip this part.
To compile it manually, first build the files in clibs. A makefile is provided.
cd clibs && make
You can also compile it yourself with gcc. Compile ./clibs/pyDIMM_libs.c following the instructions at the head of that file, and you will get a ./clibs/pyDIMM_libs.so file.
cd clibs && gcc -shared -fPIC -o pyDIMM_libs.so pyDIMM_libs.c -lm
Check the files now.
└───pyDIMM
    |
    ├───pyDIMM
    |   ├───__init__.py
    |   ├───class_DIMM.py
    |   └───clibs
    |       ├───makefile
    |       ├───pyDIMM_libs.c
    |       └───pyDIMM_libs.so
    |
    └───Some other files...
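To confirm that the compiled library is usable, one option is to try loading it with ctypes. The following is a minimal sketch and not part of pyDIMM: the helper name shared_lib_loads is hypothetical, and the path assumes the layout listed above, with the script run from the repository root.

```python
import ctypes
from pathlib import Path


def shared_lib_loads(lib_path):
    """Return True if lib_path is a file that ctypes can load."""
    path = Path(lib_path)
    if not path.is_file():
        return False
    try:
        # CDLL raises OSError if the file is not a loadable shared library.
        ctypes.CDLL(str(path.resolve()))
    except OSError:
        return False
    return True


# Assumed location of the compiled library (adjust to your checkout).
print(shared_lib_loads("./pyDIMM/clibs/pyDIMM_libs.so"))
```

If this prints False, rebuilding with the makefile is a reasonable first step.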
How to use
All the methods are based on the class DIMM. You need an instance of pyDIMM.DIMM to get started.
import pyDIMM
- Example 1
dimm_0 = pyDIMM.DIMM(observe_data=your_data, n_components=3, alpha_init='kmeans')
- Example 2
dimm_1 = pyDIMM.DIMM(observe_data=your_data, n_components=5, alpha_init='manual', prior_label=your_label, print_log=True)
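The examples above leave your_data and your_label undefined. If you just want something to experiment with, the sketch below generates toy count data using only the standard library. It assumes that observe_data holds one row per sample and one column per category, and that prior_label holds one integer component label per row; the source does not state either, so check the docstrings. The function names are hypothetical, and pyDIMM may expect NumPy arrays, in which case a conversion such as numpy.asarray(your_data) would be needed.

```python
import random


def sample_dirichlet_multinomial(alpha, n_trials, rng):
    """Draw one count vector: p ~ Dirichlet(alpha), x ~ Multinomial(n_trials, p)."""
    # A Dirichlet draw is a vector of independent gamma draws, normalized.
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    probs = [g / total for g in gammas]
    # A multinomial draw is n_trials categorical draws, tallied per category.
    counts = [0] * len(alpha)
    for category in rng.choices(range(len(alpha)), weights=probs, k=n_trials):
        counts[category] += 1
    return counts


def make_toy_data(alphas, n_per_component, n_trials, seed=0):
    """Return (data, labels) with n_per_component rows from each component."""
    rng = random.Random(seed)
    data, labels = [], []
    for k, alpha in enumerate(alphas):
        for _ in range(n_per_component):
            data.append(sample_dirichlet_multinomial(alpha, n_trials, rng))
            labels.append(k)
    return data, labels


# Three well-separated components over three categories (illustrative values).
your_data, your_label = make_toy_data(
    alphas=[[8.0, 1.0, 1.0], [1.0, 8.0, 1.0], [1.0, 1.0, 8.0]],
    n_per_component=20,
    n_trials=50,
)
print(len(your_data), your_data[0], your_label[0])
```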
Train (by EM algorithm)
Use the EM algorithm to train the model. The EM algorithm is written in C (yes, it's the code in pyDIMM_libs.c) and is called from Python through ctypes.
- Example
dimm_0.EM(max_iter=100, max_loglik_tol=1e-3, max_alpha_tol=1)
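For readers who want to see what the E-step of such a model computes, here is a pure-Python sketch of the standard Dirichlet-multinomial log-probability and the resulting component membership weights. It is not pyDIMM's C code, and the function names are hypothetical. It also assumes that 'pie' denotes the mixing weights and that membership weights are what pyDIMM calls 'delta', which the source does not spell out.

```python
import math


def dm_log_pmf(x, alpha):
    """Log-probability of count vector x under a Dirichlet-multinomial(alpha)."""
    n = sum(x)
    a = sum(alpha)
    # Multinomial coefficient: n! / prod(x_j!)
    log_coef = math.lgamma(n + 1) - sum(math.lgamma(xj + 1) for xj in x)
    # Gamma(a) / Gamma(n + a)
    log_ratio = math.lgamma(a) - math.lgamma(n + a)
    # prod_j Gamma(x_j + alpha_j) / Gamma(alpha_j)
    log_terms = sum(
        math.lgamma(xj + aj) - math.lgamma(aj) for xj, aj in zip(x, alpha)
    )
    return log_coef + log_ratio + log_terms


def responsibilities(x, alphas, pie):
    """Posterior weight of each component for one sample (E-step)."""
    log_w = [math.log(p) + dm_log_pmf(x, a) for p, a in zip(pie, alphas)]
    # Subtract the maximum before exponentiating to avoid underflow.
    m = max(log_w)
    w = [math.exp(v - m) for v in log_w]
    total = sum(w)
    return [v / total for v in w]


weights = responsibilities(
    [9, 1, 0], alphas=[[8.0, 1.0, 1.0], [1.0, 8.0, 1.0]], pie=[0.5, 0.5]
)
print(weights)
```

The M-step then updates the mixing weights and the alpha parameters from these weights; that part is omitted here.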
OK, the DIMM is now trained. Next, retrieve the result. All the result information is in one dictionary.
- Example
result_0 = dimm_0.get_model()
print(result_0)
You should see:
{ 'alpha': array([...]), 'pie': array([...]), 'delta': array([...]), 'loglik': ..., 'AIC': ..., 'BIC': ... }
That's all! You get it.
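The 'AIC' and 'BIC' entries can be used to compare fits with different n_components (lower is better). As a reminder of how these criteria relate to 'loglik', here is a sketch using the textbook definitions. The parameter count used below (one alpha per component and category, plus the free mixing weights) is an assumption; pyDIMM may count parameters differently, so its reported values need not match this calculation.

```python
import math


def dmm_param_count(n_components, n_categories):
    """Assumed free-parameter count: K*D alphas plus K-1 mixing weights."""
    return n_components * n_categories + (n_components - 1)


def information_criteria(loglik, n_params, n_samples):
    """Return (AIC, BIC) from the textbook definitions."""
    aic = 2 * n_params - 2 * loglik
    bic = n_params * math.log(n_samples) - 2 * loglik
    return aic, bic


# Illustrative numbers only: 3 components, 4 categories, 60 samples.
k = dmm_param_count(n_components=3, n_categories=4)
aic, bic = information_criteria(loglik=-100.0, n_params=k, n_samples=60)
print(k, aic, bic)
```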
Note:
Once the DIMM model is trained, the resulting parameters are stored in the DIMM instance, so whenever you want to retrieve the result, just call the get_model() method. (This holds only if you have not changed the instance in the meantime, for example by calling EM() again, which will of course change the stored result.)
Predict
Sometimes you want not only to fit a DIMM but also to use the fitted model to predict labels for other data (if you don't, skip this part). For that, there is the predict() method.
- Example
predict_res = dimm_0.predict(another_data)
You will then get the predicted label and delta in the predict_res dictionary. See the documentation in the code for a detailed explanation.
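If delta is a per-sample table of component membership weights (an assumption; the source only names the key), then a hard label can be derived from it by taking the largest weight in each row. The sketch below illustrates that relationship on made-up numbers. The helper labels_from_delta is hypothetical, and it is not confirmed that pyDIMM computes label this way.

```python
def labels_from_delta(delta):
    """Return, for each row, the index of the largest membership weight."""
    labels = []
    for row in delta:
        best = max(range(len(row)), key=lambda k: row[k])
        labels.append(best)
    return labels


# Made-up membership weights for three samples and three components.
toy_delta = [
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
    [0.3, 0.4, 0.3],
]
print(labels_from_delta(toy_delta))  # prints [0, 2, 1]
```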
Save & Load
All information in your DIMM instance can be saved to a .npy file and loaded again anytime and anywhere.
- Example
dimm_0.save('./dimm_0_file')
After this, a new file called dimm_0_file.npy (the .npy suffix is added automatically) will appear in your current folder. You can read from the file later.
dimm_load = pyDIMM.DIMM.load('./dimm_0_file.npy')
Contact
Ziqi Rong rongziqi@sjtu.edu.cn
Download files
Source Distribution
File details
Details for the file pyDIMM-0.0.1.tar.gz.
File metadata
- Download URL: pyDIMM-0.0.1.tar.gz
- Upload date:
- Size: 11.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.8.10
File hashes
Algorithm | Hash digest
---|---
SHA256 | e7b67f0d7d146e6ca377587b1d21eed225fc7ea9454813366c4a46b10cf0cbf9
MD5 | a9251fbe9987fe7b99d29cae8af605c7
BLAKE2b-256 | 4e8f5ce51b694eea2859cddb0a45b594398f406ad6de74cfccefe49a3505d429