Skip to main content

Various bayesian models based on stan and pystan with a elegant interface like a scikit-learn or keras.

Project description

scikit-stan
===========

|Build Status| |codecov|

What is scikit-stan
-------------------

``scikit-stan`` will enable you to use various bayesian models based on
``stan``\ (http://mc-stan.org) and ``pystan`` with a elegant interface
like a ``scikit-learn`` or ``keras``.

Demo
----

.. code:: python

import numpy as np

from skstan.regression.linear_models import LogisticRegression

if __name__ == '__main__':
x = np.array(
[
[1,2,3,],
[1,2,7,],
[1,0,3,],
[1,1,3,],
[3,7,3,],
]
)
y = np.array([0,0,0,0,1])

glm = LogisticRegression(shrinkage=10, chains=8)
fit = glm.fit(x, y)

Then we got result object ``fit``, and field ``stanfit`` is a stanfit
object of pystan.

.. code:: python

print(fit.stanfit)

It gives following

::

Inference for Stan model: anon_model_f63cd5ccdd67c22034b2490ae4c9cdd1.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.

mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
alpha[0] 2.23 0.29 8.88 -14.39 -3.87 2.02 8.07 20.67 966 1.0
alpha[1] 7.81 0.18 5.29 -1.08 4.01 7.48 11.23 19.01 880 1.0
alpha[2] -9.79 0.22 5.87 -22.91 -13.41 -9.37 -5.38 -0.17 728 1.0
beta -2.48 0.29 9.8 -22.63 -9.03 -2.3 3.99 16.91 1146 1.0
yp[0] -13.99 0.32 11.19 -40.69 -20.24 -11.35 -5.42 0.3 1259 1.0
yp[1] -53.15 1.14 32.08 -128.4 -71.99 -48.46 -29.35 -5.24 790 1.0
yp[2] -29.61 0.6 16.66 -67.24 -39.44 -27.97 -17.0 -4.37 771 1.0
yp[3] -21.8 0.44 13.17 -52.98 -29.03 -19.57 -11.91 -3.23 894 1.0
yp[4] 29.51 0.69 24.68 0.58 10.3 23.36 42.72 90.17 1276 1.0
lp__ -2.16 0.05 1.48 -5.93 -2.9 -1.81 -1.07 -0.32 956 1.0

Samples were drawn using NUTS at Thu Apr 13 07:52:33 2017.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).

Result object of ``skstan`` also have prediction methods. Predicted
values can be obtained as samples from distribution with a
``predict_dist`` method, because it is bayesian model.

.. code:: python

yp_dist = fit.predict_dist(x)
print(yp_dist)

Then we got

::

array([[ 2.63886682e-08, 5.23976746e-04, 5.54863097e-05, ...,
2.46008578e-08, 3.74830192e-01, 3.45994043e-03],
[ 1.07746578e-22, 1.01664809e-18, 4.12813154e-26, ...,
5.64992544e-19, 7.24386097e-12, 1.75795155e-23],
[ 8.04688037e-22, 4.44522113e-12, 1.42920488e-11, ...,
7.71565191e-13, 5.13118658e-05, 4.26331280e-05],
[ 4.60810657e-15, 4.82743551e-08, 2.81612678e-08, ...,
1.37772153e-10, 5.51614998e-03, 3.84594197e-04],
[ 9.99999998e-01, 1.00000000e+00, 1.00000000e+00, ...,
9.99965378e-01, 1.00000000e+00, 1.00000000e+00]])

So let's check the histgram of first row with ``pandas.Series``.

.. code:: python

import pandas as pd
pd.Series(yp_dist[0]).hist(bins=20)

.. figure:: image/hist.png
:alt: Histgram of first row

Histgram of first row
If you need a median of samples, you can get it with just ``predict``
method

.. code:: python

yp = fit.predict(x)
print(yp)

gives

::

array([ 1.17280235e-05, 9.01419773e-22, 7.16023732e-13,
3.18368664e-09, 1.00000000e+00])

How to install
--------------

Install
~~~~~~~

.. code:: sh

git clone https://github.com/BayesianFreaks/scikit-stan
cd scikit-stan
python3 setup.py install

Uninstall
~~~~~~~~~

.. code:: sh

pip3 uninstall scikit-stan

Using python2?
==============

Are you joking?

We can't touch you because we are living in the future from you, and
you're living in past ages. Please say hello to Nobunaga Oda.

We will always use newest features of the latest version of python, so
you should use the latest version of python.

Models
======

Ready
-----

Regression Models
~~~~~~~~~~~~~~~~~

- Linear Regrassion
- Poisson Regression
- Logistic Regression

Next Steps
----------

Regression Models
~~~~~~~~~~~~~~~~~

- Gamma Regression
- GLMM
- etc...

Time Series Models
~~~~~~~~~~~~~~~~~~

- AR Model
- MA Model
- ARMA Model
- ARIMA Model
- ARCH Model
- GARCH Model
- TAR Model
- State Space Model
- or Some Dynamic Regression Models
- etc...

Clustering Model
~~~~~~~~~~~~~~~~

- Gaussian Mixture Model
- Latent Dirichlet Allocation
- etc...

Particular Application
~~~~~~~~~~~~~~~~~~~~~~

- Modeling about online-advertisement
- Decompose time series data
- Empirical Bayesian Estimation

.. |Build Status| image:: https://travis-ci.org/BayesianFreaks/scikit-stan.svg?branch=master
:target: https://travis-ci.org/BayesianFreaks/scikit-stan
.. |codecov| image:: https://codecov.io/gh/BayesianFreaks/scikit-stan/branch/master/graph/badge.svg
:target: https://codecov.io/gh/BayesianFreaks/scikit-stan

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Files for skstan, version 0.0.0.dev2
Filename, size File type Python version Upload date Hashes
Filename, size skstan-0.0.0.dev2.tar.gz (7.0 kB) File type Source Python version None Upload date Hashes View hashes

Supported by

Elastic Elastic Search Pingdom Pingdom Monitoring Google Google BigQuery Sentry Sentry Error logging AWS AWS Cloud computing DataDog DataDog Monitoring Fastly Fastly CDN DigiCert DigiCert EV certificate StatusPage StatusPage Status page