Data science tools built with a JAX backend.

statjax

This library is my attempt to collect the data science tools I use day to day in a single place, while keeping my coding skills sharp and checking my understanding of various models by implementing them.

It consists of a handful of causal estimators and a flexible linear model framework. The main convenience feature over scikit-learn or statsmodels is a port of the Python Stargazer package that can produce LaTeX tables displaying any of the linear models in the package side by side. The backend of the package is written in JAX and Oryx. Overhead is higher than in those libraries, but the package can outperform them in large-sample or high-dimensional cases.
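To make the overhead trade-off concrete, here is a minimal sketch of a JIT-compiled least-squares fit in plain JAX. It is not statjax code: `ols_fit` and the simulated data are assumptions made for illustration. The first call pays for tracing and compilation, and later calls on inputs of the same shape reuse the compiled function.

```python
import time

import jax
import jax.numpy as jnp


@jax.jit
def ols_fit(X, y):
    """Least-squares coefficients via the normal equations (illustrative only)."""
    # Solve (X'X) beta = X'y; adequate for a well-conditioned simulated design.
    return jnp.linalg.solve(X.T @ X, X.T @ y)


# Simulated data (hypothetical, not from the package).
key_x, key_e = jax.random.split(jax.random.PRNGKey(0))
n, p = 10_000, 5
X = jax.random.normal(key_x, (n, p))
beta_true = jnp.arange(1.0, p + 1.0)
y = X @ beta_true + 0.1 * jax.random.normal(key_e, (n,))

t0 = time.perf_counter()
beta_hat = ols_fit(X, y).block_until_ready()  # first call: trace, compile, run
first_call = time.perf_counter() - t0

t0 = time.perf_counter()
ols_fit(X, y).block_until_ready()  # later calls reuse the compiled function
second_call = time.perf_counter() - t0

print(f"first call: {first_call:.4f}s, second call: {second_call:.4f}s")
```

Timings depend on hardware and problem size, so the sketch prints them rather than promising a particular speedup.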

The probabilistic GLM framework in the package is designed to be very modular: a GLM is defined by a passed link function and error distribution. There are two GLM extensions that are potentially novel to Python. The first is elastic net regularization based on glmnet in R. The second is the ability to easily generate Bayesian linear models defined by a link function, error distribution, and prior distribution, in a more standard API than that offered by PyMC3. The novelty comes from the flexibility of Oryx: rather than restricting the models to a predetermined family of distributions, any Oryx distribution, including a user-defined one, can be passed for any of the distribution arguments. Similarly, any user-defined link function can be used to initialize the model, and this flexibility doesn't compromise the simplicity of the API.
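The idea of assembling a GLM from a link function and an error distribution can be sketched in plain JAX. This is not the statjax API: `make_glm_loss`, `poisson_log_prob`, and `fit` are hypothetical names, the error distribution is represented by a bare log-density function rather than an Oryx distribution, and the penalty follows the glmnet form lam * (alpha * ||beta||_1 + (1 - alpha) / 2 * ||beta||_2^2).

```python
import jax
import jax.numpy as jnp


def make_glm_loss(inv_link, log_prob, lam=0.0, alpha=0.5):
    """Build a penalized negative log-likelihood from GLM ingredients.

    inv_link: maps the linear predictor X @ beta to the mean.
    log_prob: elementwise log density/mass of y given the mean.
    lam, alpha: glmnet-style penalty strength and L1/L2 mixing weight.
    """
    def loss(beta, X, y):
        mu = inv_link(X @ beta)
        nll = -jnp.mean(log_prob(y, mu))
        penalty = lam * (alpha * jnp.sum(jnp.abs(beta))
                         + 0.5 * (1.0 - alpha) * jnp.sum(beta ** 2))
        return nll + penalty
    return loss


def poisson_log_prob(y, mu):
    # Poisson log-mass without the -log(y!) term, which does not depend on beta.
    return y * jnp.log(mu) - mu


def fit(loss, X, y, steps=500, lr=0.1):
    """Plain gradient descent; chosen for brevity, not for efficiency."""
    grad = jax.jit(jax.grad(loss))
    beta = jnp.zeros(X.shape[1])
    for _ in range(steps):
        beta = beta - lr * grad(beta, X, y)
    return beta


# Simulated Poisson regression data (hypothetical).
key_x, key_y = jax.random.split(jax.random.PRNGKey(1))
X = jax.random.normal(key_x, (2000, 3))
beta_true = jnp.array([0.5, -0.3, 0.2])
y = jax.random.poisson(key_y, jnp.exp(X @ beta_true)).astype(jnp.float32)

# Log link (inverse link = exp) with a Poisson error distribution.
beta_hat = fit(make_glm_loss(jnp.exp, poisson_log_prob), X, y)
beta_pen = fit(make_glm_loss(jnp.exp, poisson_log_prob, lam=0.1), X, y)
print(beta_hat, beta_pen)
```

Swapping `jnp.exp` and `poisson_log_prob` for another inverse link and log-density changes the model without touching the fitting code. Gradient descent will not set coefficients exactly to zero under the L1 term; glmnet itself uses coordinate descent for that. Replacing the penalty with the negative log-density of a prior would turn the same objective into a MAP estimate, which is one simple reading of the Bayesian variant described above.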

See demo.ipynb for a full demonstration of the package's functionality.

While the models don't directly support R-style formulas as arguments, all are natively compatible with Formulaic model matrices. Initializing the design matrix with Formulaic and then passing it into a statjax model reproduces the R workflow of passing a formula as a model argument, at the cost of a single additional line of code. On dataframe inputs to predict, score, and similar functions, the model automatically applies the formula of the design matrix it was fitted with. It also rejects model matrices built from a different formula, which minimizes headaches when iterating on formulas.


Source Distribution

statjax-0.0.8.tar.gz (24.5 kB)

Built Distribution

statjax-0.0.8-py3-none-any.whl (26.8 kB)

File details

Details for the file statjax-0.0.8.tar.gz.

File metadata

  • Download URL: statjax-0.0.8.tar.gz
  • Upload date:
  • Size: 24.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.4

File hashes

Hashes for statjax-0.0.8.tar.gz
Algorithm Hash digest
SHA256 20f4511e666079280668d95f978ae673069e1bb0598e7dfa57e93f96de62c3f9
MD5 1be6f9278344e9f2a13053fadcb8bc74
BLAKE2b-256 8e1f408ec09715bf60810d594acc52b6037419ec49afdd2d45835d814bc99246

File details

Details for the file statjax-0.0.8-py3-none-any.whl.

File metadata

  • Download URL: statjax-0.0.8-py3-none-any.whl
  • Upload date:
  • Size: 26.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.4

File hashes

Hashes for statjax-0.0.8-py3-none-any.whl
Algorithm Hash digest
SHA256 06bd1626f7178eb1b5e78f1c7bb9415811dbfd1fa85962f11e83794b15980588
MD5 cc100f66abebfdc5012904f885c67754
BLAKE2b-256 c20de8747fea532ec09cd458ab20ca24fe855b2a31ac8592c4074df7a1dc4b10
