Statistical computations and models for use with SciPy
Project description
Statsmodels is a Python package that complements SciPy for statistical computations, including descriptive statistics and the estimation of statistical models.
scikits.statsmodels provides classes and functions for estimating several categories of statistical models. These currently include linear regression models (OLS, GLS, WLS, and GLS with AR(p) errors); generalized linear models for six distribution families; M-estimators for robust linear models; regression with discrete dependent variables (Logit, Probit, MNLogit, Poisson), based on maximum likelihood estimators; and time series models (ARMA, AR, and VAR). An extensive list of result statistics is available for each estimation problem. Statsmodels also contains descriptive statistics, a wide range of statistical tests, and more.
We welcome feedback on the mailing list at http://groups.google.com/group/pystatsmodels or through our bug tracker at https://bugs.launchpad.net/statsmodels
For updated versions between releases, we recommend our repository at http://code.launchpad.net/statsmodels. We will move to GitHub in the near future: https://github.com/statsmodels
Main changes for 0.3.0
Changes that break backwards compatibility
Added api.py for importing. The new convention for importing is:
import scikits.statsmodels.api as sm
Importing from modules directly now avoids unnecessary imports and increases the import speed if a library or user only needs specific functions.
sandbox/output.py -> iolib/table.py
lib/io.py -> iolib/foreign.py (Now contains Stata .dta format reader)
family -> families
families.links.inverse -> families.links.inverse_power
The datasets' Load class is now a load function.
regression.py -> regression/linear_model.py
discretemod.py -> discrete/discrete_model.py
rlm.py -> robust/robust_linear_model.py
glm.py -> genmod/generalized_linear_model.py
model.py -> base/model.py
t() method -> tvalues attribute (t() still exists but raises a warning)
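The last item above, keeping t() callable while steering users to tvalues, follows a common deprecation pattern. The sketch below is only an illustration of that pattern, not the statsmodels source: the class name ResultsSketch, the choice of DeprecationWarning as the warning category, and the constructor arguments are all assumptions made for the example.

```python
import warnings


class ResultsSketch(object):
    """Hypothetical results class; not the statsmodels implementation."""

    def __init__(self, params, bse):
        self.params = list(params)  # coefficient estimates
        self.bse = list(bse)        # standard errors of the estimates

    @property
    def tvalues(self):
        # New-style access: t statistic = estimate / standard error.
        return [p / s for p, s in zip(self.params, self.bse)]

    def t(self):
        # Old-style access: still works, but tells the caller to migrate.
        # The warning category is an assumption; the notes only say "a warning".
        warnings.warn("t() is deprecated; use the tvalues attribute",
                      DeprecationWarning, stacklevel=2)
        return self.tvalues


res = ResultsSketch(params=[2.0, -1.5], bse=[0.5, 0.5])

# Record warnings so the example can show that the old call emits one.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_style = res.t()

new_style = res.tvalues
```

Both calls return the same values, so existing code that calls t() keeps working while the warning points it at the attribute.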
Main changes and additions
Numerous bugfixes.
Time Series Analysis models (tsa)
Vector Autoregression Models VAR (tsa.VAR)
Autoregressive Models AR (tsa.AR)
Autoregressive Moving Average Models ARMA (tsa.ARMA): optionally uses Cython for Kalman filtering; to enable it, run setup.py install with the option --with-cython
Baxter-King band-pass filter (tsa.filters.baxter_king)
Hodrick-Prescott filter (tsa.filters.hpfilter)
Christiano-Fitzgerald filter (tsa.filters.cffilter)
Improved maximum likelihood framework uses all available scipy.optimize solvers
Refactor of the datasets sub-package.
Added more datasets for examples.
Removed RPy dependency for running the test suite.
Refactored the test suite.
Refactored codebase/directory structure.
Support for offset and exposure in GLM.
Removed data_weights argument to GLM.fit for Binomial models.
New statistical tests, especially diagnostic and specification tests
Multiple test correction
Generalized Method of Moments framework in sandbox
Improved documentation
and other additions
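To make the multiple test correction item above concrete, here is a small pure-Python sketch of two well-known adjustments, Bonferroni and Holm. Which methods this release actually ships is not stated in these notes, so the function names and the choice of methods are assumptions; this is not the statsmodels implementation.

```python
def bonferroni(pvalues):
    """Bonferroni adjustment: multiply each p-value by the number of tests."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def holm(pvalues):
    """Holm step-down adjustment, returned in the original order."""
    m = len(pvalues)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is scaled by (m - k), counting from 0.
        candidate = min(1.0, (m - rank) * pvalues[idx])
        # Enforce monotonicity: adjusted values never decrease along the order.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]
bonf = bonferroni(raw)
holm_adj = holm(raw)
```

Holm-adjusted p-values are never larger than the Bonferroni ones, which is why the step-down procedure is usually preferred when both control the same error rate.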
Main Changes in 0.2.0
Improved documentation, with expanded and additional examples
Added four discrete choice models: Poisson, Probit, Logit, and Multinomial Logit.
Added PyDTA. Tools for reading Stata binary datasets (*.dta) and putting them into numpy arrays.
Added four new datasets for examples and tests.
Results classes have been refactored to use lazy evaluation.
Improved support for maximum likelihood estimation.
bugfixes
renames for more consistency
RLM.fitted_values -> RLM.fittedvalues
GLMResults.resid_dev -> GLMResults.resid_deviance
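The lazy evaluation mentioned above for results classes can be pictured as attributes that are computed on first access and then stored. The sketch below shows one way to do that with a non-data descriptor; the names cached_attribute and LazyResults are hypothetical, and the actual mechanism used in statsmodels may differ.

```python
class cached_attribute(object):
    """Hypothetical descriptor: compute on first access, then reuse."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        value = self.func(instance)
        # Storing the value in the instance dict shadows this descriptor,
        # so later lookups return the stored value without recomputing.
        instance.__dict__[self.name] = value
        return value


class LazyResults(object):
    """Hypothetical results object with lazily computed statistics."""

    def __init__(self, observed, fitted):
        self.observed = list(observed)
        self.fitted = list(fitted)
        self.calls = 0  # counts how often residuals are actually computed

    @cached_attribute
    def resid(self):
        self.calls += 1
        return [y - f for y, f in zip(self.observed, self.fitted)]

    @cached_attribute
    def ssr(self):
        # Sum of squared residuals; triggers resid on first use.
        return sum(r * r for r in self.resid)


res = LazyResults(observed=[1.0, 2.0, 4.0], fitted=[1.5, 2.0, 3.0])
calls_before = res.calls   # nothing has been computed yet
total = res.ssr            # computes resid once, then ssr
again = res.ssr            # served from the instance dict
calls_after = res.calls
```

The benefit is that creating a results object stays cheap, and statistics the user never asks for are never computed.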
Python 3
scikits.statsmodels has been ported to and tested on Python 3.2. The Python 3 version of the code can be obtained by running 2to3.py over the entire statsmodels source. The numerical core of statsmodels worked almost without changes; however, there can be problems with data input and plotting. The Stata file reader and writer in iolib.foreign have not been ported yet, and there are still some problems with the matplotlib version for Python 3 that was used in testing. Running the test suite with Python 3.2 shows some errors related to foreign and matplotlib.
Sandbox
We are continuing to work on support for systems of equations models, panel data models, time series analysis, and information and entropy econometrics in the sandbox. This code is often merged into trunk as it becomes more robust.
Windows Help
The source distribution for Windows includes an HTML Help file (statsmodels.chm). This can be opened from the Python interpreter:
>>> import scikits.statsmodels.api as sm
>>> sm.open_help()
Download files
Source Distributions
Hashes for scikits.statsmodels-0.3.0rc1_with_winhelp.zip
Algorithm | Hash digest
---|---
SHA256 | 9e25352b82e4f464525553c14a6238b6ab758532c335f559e7cdb13b0f582379
MD5 | 585260c3c69d66b089c6ba2c9ee7db39
BLAKE2b-256 | 53ef0af0b9964c5c4ac9d455085b103f43ecbe96a6bb00537d0cfad2cacc639d
Hashes for scikits.statsmodels-0.3.0rc1_python3.zip
Algorithm | Hash digest
---|---
SHA256 | 874e5c20d4acf261485f0cbf5f3dad3a64f3e70fd2e675f8b2c78b3ca2c36692
MD5 | 502369b478367ca9cbbb4383d4ee50c6
BLAKE2b-256 | 75ad3c2666879e2402acdbd7a2ad98635ec683c73f987a21fc94e9ea33fff3a0
Hashes for scikits.statsmodels-0.3.0rc1.zip
Algorithm | Hash digest
---|---
SHA256 | d1b094272d233db23ea66f8a43110504c4e21f8d5bca3cec273c6d8c3d20aeef
MD5 | 6c6e0ddcc404bbf9518b9ee670cda54c
BLAKE2b-256 | 014ddf93c2f79f3ab81f3fae6acc34c85365c0428d95d828785b777ee8dbc809
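A downloaded archive can be checked against the digests listed above using only the Python standard library. The sketch below uses the SHA256 value given for scikits.statsmodels-0.3.0rc1.zip; the helper names and the local file path in the usage note are assumptions for illustration.

```python
import hashlib

# SHA256 digest listed above for scikits.statsmodels-0.3.0rc1.zip
EXPECTED_SHA256 = (
    "d1b094272d233db23ea66f8a43110504c4e21f8d5bca3cec273c6d8c3d20aeef"
)


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected.lower()
```

Assuming the archive was saved in the current directory, matches_expected("scikits.statsmodels-0.3.0rc1.zip") returns True only when the file is byte-for-byte identical to the one whose digest is published here.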