Statistical computations and models for use with SciPy

What it is

Statsmodels is a Python package that provides a complement to scipy for statistical computations including descriptive statistics and estimation and inference for statistical models.

Documentation for the 0.4 version is currently at http://statsmodels.sourceforge.net/devel/

Main Features

  • linear regression models: Generalized least squares (including weighted least squares and least squares with autoregressive errors), ordinary least squares.

  • glm: Generalized linear models with support for all of the one-parameter exponential family distributions.

  • discrete: regression with discrete dependent variables, including Logit, Probit, MNLogit, Poisson, based on maximum likelihood estimators

  • rlm: Robust linear models with support for several M-estimators.

  • tsa: models for time series analysis

    • univariate time series analysis: AR, ARIMA

    • vector autoregressive models, VAR and structural VAR

    • descriptive statistics and process models for time series analysis

  • nonparametric: (univariate) kernel density estimators

  • datasets: Datasets to be distributed and used for examples and in testing.

  • stats: a wide range of statistical tests

    • diagnostics and specification tests

    • goodness-of-fit and normality tests

    • functions for multiple testing

    • various additional statistical tests

  • iolib

    • Tools for reading Stata .dta files into numpy arrays.

    • printing table output to ascii, latex, and html

  • miscellaneous models

  • sandbox: statsmodels contains a sandbox folder with code in various stages of development and testing that is not considered “production ready”. This covers, among others, mixed (repeated measures) models, GARCH models, generalized method of moments (GMM) estimators, kernel regression, various extensions to scipy.stats.distributions, panel data models, generalized additive models, and information-theoretic measures.

Where to get it

The master branch on GitHub has the most up-to-date code

https://www.github.com/statsmodels/statsmodels

Source downloads of release tags are available on GitHub

https://github.com/statsmodels/statsmodels/tags

Binaries and source distributions are available from PyPI

http://pypi.python.org/pypi/statsmodels/

Installation from sources

See INSTALL.txt for requirements or see the documentation

http://statsmodels.sf.net/devel/install.html

License

Modified BSD (3-clause)

Documentation

The official documentation is hosted on SourceForge

http://statsmodels.sf.net/

Windows Help

An HTML Help file (statsmodels.chm) will be available. It can be opened from the Python interpreter:

>>> import statsmodels.api as sm
>>> sm.open_help()

Discussion and Development

Discussions take place on our mailing list.

http://groups.google.com/group/pystatsmodels

We are very interested in feedback about usability and suggestions for improvements.

Bug Reports

Bug reports can be submitted to the issue tracker at

https://github.com/statsmodels/statsmodels/issues

Release History

0.4.0

Main Changes and Additions

  • Added pandas dependency.

  • Cython source is built automatically if Cython and a compiler are present

  • Support use of dates in timeseries models

  • Improved plots

    • Violin plots

    • Bean plots

    • QQ plots

  • Added lowess function

  • Support for pandas Series and DataFrame objects. Results instances return pandas objects if the models are fit using pandas objects.

  • Full Python 3 compatibility

  • Fix bugs in genfromdta. Convert Stata .dta format to structured array preserving all types. Conversion is much faster now.

  • Improved documentation

  • Models and results are pickleable via save/load, optionally saving the model data.

  • Kernel Density Estimation now uses Cython and is considerably faster.

  • Diagnostics for outlier and influence statistics in OLS

  • Added El Nino Sea Surface Temperatures dataset

  • Numerous bug fixes

  • Internal code refactoring

  • Improved documentation including examples as part of HTML
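
The save/load item above can be pictured with the standard library's pickle module. This is a minimal sketch, not the statsmodels implementation: the class `PickleToyResults` and the `remove_data` flag are hypothetical names chosen for illustration.

```python
import os
import pickle
import tempfile


class PickleToyResults:
    """Hypothetical stand-in for a fitted-results object (not statsmodels code)."""

    def __init__(self, params, data):
        self.params = params
        self.data = data  # the (possibly large) data the model was fit on

    def save(self, fname, remove_data=False):
        # Optionally leave the data out to keep the pickle file small.
        state = {"params": self.params,
                 "data": None if remove_data else self.data}
        with open(fname, "wb") as fh:
            pickle.dump(state, fh)

    @classmethod
    def load(cls, fname):
        with open(fname, "rb") as fh:
            state = pickle.load(fh)
        return cls(state["params"], state["data"])


res = PickleToyResults(params=[1.0, 2.0], data=[[1, 0.5], [1, 1.5]])
path = os.path.join(tempfile.mkdtemp(), "toy_results.pickle")
res.save(path, remove_data=True)
restored = PickleToyResults.load(path)
print(restored.params, restored.data)
```

Saving without the data trades file size against the ability to recompute anything that needs the original observations.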

Changes that break backwards compatibility

  • Deprecated scikits namespace. The recommended import is now:

    import statsmodels.api as sm

  • The signature of model.predict methods is now (params, exog, …), where before it assumed that the model had been fit and omitted the params argument.

  • For consistency with other multi-equation models, the parameters of MNLogit are now transposed.

  • tools.tools.ECDF -> distributions.ECDF

  • tools.tools.monotone_fn_inverter -> distributions.monotone_fn_inverter

  • tools.tools.StepFunction -> distributions.StepFunction
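
The new predict signature listed above can be illustrated with a toy linear model. This is only a sketch of the call pattern: `ToyLinearModel` is a hypothetical class written in plain Python, and the parameter values are made up.

```python
class ToyLinearModel:
    """Hypothetical minimal model, used only to illustrate the call pattern."""

    def __init__(self, exog):
        self.exog = exog

    def predict(self, params, exog=None):
        # params is passed explicitly, so the model need not be fit first.
        if exog is None:
            exog = self.exog
        return [sum(p * x for p, x in zip(params, row)) for row in exog]


model = ToyLinearModel(exog=[[1.0, 0.0], [1.0, 2.0]])
in_sample = model.predict([0.5, 2.0])
out_of_sample = model.predict([0.5, 2.0], exog=[[1.0, 10.0]])
print(in_sample, out_of_sample)
```

Because the parameters are an argument, the same model object can produce predictions for any parameter vector, not just the fitted one.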

0.3.1

  • Removed academic-only WFS dataset.

  • Fix easy_install issue on Windows.

0.3.0

Changes that break backwards compatibility

Added api.py for importing. So the new convention for importing is:

import scikits.statsmodels.api as sm

Importing from modules directly now avoids unnecessary imports and increases the import speed if a library or user only needs specific functions.

  • sandbox/output.py -> iolib/table.py

  • lib/io.py -> iolib/foreign.py (Now contains Stata .dta format reader)

  • family -> families

  • families.links.inverse -> families.links.inverse_power

  • Datasets’ Load class is now load function.

  • regression.py -> regression/linear_model.py

  • discretemod.py -> discrete/discrete_model.py

  • rlm.py -> robust/robust_linear_model.py

  • glm.py -> genmod/generalized_linear_model.py

  • model.py -> base/model.py

  • t() method -> tvalues attribute (t() still exists but raises a warning)
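
The last item describes a common deprecation pattern: the new attribute holds the values, and the old method stays as a thin wrapper that warns. A minimal sketch of that pattern follows; `TToyResults` is a hypothetical class, not the statsmodels code, and the warning category is an assumption.

```python
import warnings


class TToyResults:
    """Hypothetical results class; not the statsmodels implementation."""

    def __init__(self, params, bse):
        self.params = params
        self.bse = bse  # standard errors of the estimates

    @property
    def tvalues(self):
        # t statistic for each parameter: estimate / standard error
        return [p / s for p, s in zip(self.params, self.bse)]

    def t(self):
        # Old spelling kept for compatibility; it warns and forwards.
        warnings.warn("t() is deprecated, use the tvalues attribute",
                      DeprecationWarning, stacklevel=2)
        return self.tvalues


res = TToyResults(params=[2.0, 3.0], bse=[0.5, 1.5])
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_style = res.t()
print(res.tvalues, old_style, len(caught))
```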

Main changes and additions

  • Numerous bugfixes.

  • Time Series Analysis model (tsa)

    • Vector Autoregression Models VAR (tsa.VAR)

    • Autoregressive Models AR (tsa.AR)

    • Autoregressive Moving Average Models ARMA (tsa.ARMA). Optionally uses Cython for Kalman filtering; to enable it, run setup.py install with the option --with-cython

    • Baxter-King band-pass filter (tsa.filters.bkfilter)

    • Hodrick-Prescott filter (tsa.filters.hpfilter)

    • Christiano-Fitzgerald filter (tsa.filters.cffilter)

  • Improved maximum likelihood framework uses all available scipy.optimize solvers

  • Refactor of the datasets sub-package.

  • Added more datasets for examples.

  • Removed RPy dependency for running the test suite.

  • Refactored the test suite.

  • Refactored codebase/directory structure.

  • Support for offset and exposure in GLM.

  • Removed data_weights argument to GLM.fit for Binomial models.

  • New statistical tests, especially diagnostic and specification tests

  • Multiple test correction

  • Generalized Method of Moments framework in sandbox

  • Improved documentation

  • and other additions
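
To make the offset and exposure item above concrete: in a model with a log link, such as a Poisson regression, an offset is conventionally added to the linear predictor with a fixed coefficient of one, and exposure enters as log(exposure). The sketch below shows only that arithmetic; `poisson_mean` is a hypothetical helper, not a statsmodels function, and the coefficients are made up.

```python
import math


def poisson_mean(params, row, offset=0.0, exposure=1.0):
    """Expected count under a log link (illustrative helper only)."""
    linear_predictor = sum(p * x for p, x in zip(params, row))
    # The offset is added as-is; exposure enters through its logarithm.
    return math.exp(linear_predictor + offset + math.log(exposure))


params = [0.0, math.log(2.0)]  # made-up coefficients
row = [1.0, 1.0]               # intercept and one regressor
base = poisson_mean(params, row)
scaled = poisson_mean(params, row, exposure=10.0)
print(base, scaled)
```

Multiplying the exposure by ten multiplies the expected count by ten, which is why exposure suits rates observed over different time spans or population sizes.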

0.2.0

Main changes

  • renames for more consistency

    • RLM.fitted_values -> RLM.fittedvalues

    • GLMResults.resid_dev -> GLMResults.resid_deviance

  • GLMResults, RegressionResults: lazy calculations, convert attributes to properties with _cache

  • fix tests to run without rpy

  • expanded examples in examples directory

  • add PyDTA to lib.io – functions for reading Stata .dta binary files and converting them to numpy arrays

  • made tools.categorical much more robust

  • add_constant now takes a prepend argument

  • fix GLS to work with a single-column design
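
The lazy calculations mentioned above, with attributes turned into properties backed by a _cache, follow a pattern like the one below. This is a sketch under stated assumptions: `LazyToyResults` and its `n_computations` counter are hypothetical and exist only to show that the value is computed once.

```python
class LazyToyResults:
    """Hypothetical results class illustrating cached, lazily computed attributes."""

    def __init__(self, endog, fitted):
        self.endog = endog
        self.fitted = fitted
        self._cache = {}
        self.n_computations = 0  # counter, only to show that caching works

    @property
    def resid(self):
        # Compute on first access, then reuse the stored value.
        if "resid" not in self._cache:
            self.n_computations += 1
            self._cache["resid"] = [y - f for y, f in zip(self.endog, self.fitted)]
        return self._cache["resid"]


res = LazyToyResults(endog=[1.0, 2.0, 4.0], fitted=[1.5, 2.0, 3.0])
first = res.resid
second = res.resid
print(first, res.n_computations)
```

Nothing is computed until the attribute is read, so quantities a user never asks for cost nothing.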

New

  • add four new datasets

    • A dataset from the American National Election Studies (1996)

    • Grunfeld (1950) investment data

    • Spector and Mazzeo (1980) program effectiveness data

    • A US macroeconomic dataset

  • add four new maximum likelihood estimators for models with discrete dependent variables, with examples

    • Logit

    • Probit

    • MNLogit (multinomial logit)

    • Poisson

Sandbox

  • add qqplot in sandbox.graphics

  • add sandbox.tsa (time series analysis) and sandbox.regression (anova)

  • add principal component analysis in sandbox.tools

  • add Seemingly Unrelated Regression (SUR) and Two-Stage Least Squares for systems of equations in sandbox.sysreg.Sem2SLS

  • add restricted least squares (RLS)

0.1.0b1

  • initial release
