Do likelihood-based parameter estimation using maximum likelihood and Bayesian methods

dataprob was designed to allow scientists to easily fit user-defined models to experimental data. It allows maximum likelihood, bootstrap, and Bayesian analyses with a simple and consistent interface.

Design principles

  • ease of use: Users write a python function that describes their model, then load in their experimental data as a dataframe.

  • dataframe centric: Uses a pandas dataframe to specify parameter bounds, guesses, fixedness, and priors. Observed data can be passed in as a dataframe or numpy vector. All outputs are pandas dataframes.

  • consistent experience: Users can run maximum-likelihood, bootstrap resampling, or Bayesian MCMC analyses with an identical interface and nearly identical diagnostic outputs.

  • interpretable: Provides diagnostic plots and runs tests to validate fit results.

Simple example

The following code generates noisy linear data and uses dataprob to find the maximum likelihood estimate of its slope and intercept. Run on Google Colab.

import dataprob
import numpy as np

# Generate "experimental" linear data (slope = 5, intercept = 5.7) that has
# random noise on each point.
x_array = np.linspace(0,10,25)
noise = np.random.normal(loc=0,scale=0.5,size=x_array.shape)
y_obs = 5*x_array + 5.7 + noise

# 1. Define a linear model
def linear_model(m=1,b=1,x=[]):
    return m*x + b

# 2. Set up the analysis. 'method' can be "ml", "mcmc", or "bootstrap"
f = dataprob.setup(linear_model,
                   method="ml",
                   non_fit_kwargs={"x":x_array})

# 3. Fit the parameters of linear_model to y_obs, assuming an uncertainty
#    of 0.5 on each observed point.
f.fit(y_obs=y_obs,
      y_std=0.5)

# 4. Access results
fig = dataprob.plot_summary(f)
fig = dataprob.plot_corner(f)
print(f.fit_df)
print(f.fit_quality)

The plots will be:

[Figure: dataprob.plot_summary result]
[Figure: dataprob.plot_corner result]

The f.fit_df dataframe will look something like:

index  name  estimate  std    low_95  high_95  prior_std
m      m     5.009     0.045  4.817   5.202    NaN
b      b     5.644     0.274  4.465   6.822    NaN
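
To give a feel for where the estimate and std columns come from, here is a minimal sketch of a maximum likelihood fit of a straight line. It assumes independent Gaussian noise with the same known standard deviation on every point, in which case the fit has a closed form. The function fit_line is a hypothetical illustration and not part of dataprob. The library may compute its uncertainties differently, so the numbers will not match the table exactly.

```python
import math
import random


def fit_line(x, y_obs, y_std):
    """Closed-form maximum likelihood fit of y = m*x + b.

    Assumes independent Gaussian noise with one known standard deviation
    (y_std) shared by all points. Hypothetical helper, not dataprob's API.
    Returns {"m": (estimate, std), "b": (estimate, std)}.
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y_obs) / n

    # Sums of squares about the means
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y_obs))

    # Parameter estimates
    m = s_xy / s_xx
    b = y_mean - m * x_mean

    # Standard errors propagated from the known noise level
    m_std = y_std / math.sqrt(s_xx)
    b_std = y_std * math.sqrt(1.0 / n + x_mean ** 2 / s_xx)

    return {"m": (m, m_std), "b": (b, b_std)}


# Same setup as the example above: slope = 5, intercept = 5.7, noise = 0.5
random.seed(0)
x = [10 * i / 24 for i in range(25)]  # same grid as np.linspace(0, 10, 25)
y_obs = [5 * xi + 5.7 + random.gauss(0, 0.5) for xi in x]

result = fit_line(x, y_obs, y_std=0.5)
for name, (estimate, std) in result.items():
    print(f"{name}: {estimate:.3f} +/- {std:.3f}")
```

For models without a closed-form solution, the same likelihood has to be maximized numerically, which is the case a general-purpose fitting library handles.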

The f.fit_quality dataframe will look something like:

name           description                                  is_good  value
num_obs        number of observations                       True     25.000
num_param      number of fit parameters                     True     2.000
lnL            log likelihood                               True     -18.761
chi2           chi^2 goodness-of-fit                        True     0.241
reduced_chi2   reduced chi^2                                True     1.192
mean0_resid    t-test for residual mean != 0                True     1.000
durbin-watson  Durbin-Watson test for correlated residuals  True     2.265
ljung-box      Ljung-Box test for correlated residuals      True     0.943
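
Several of these quantities have standard textbook definitions. The sketch below computes three of them (log likelihood, reduced chi^2, and the Durbin-Watson statistic) from observed values, model values, and a known noise level. The function names are hypothetical, the likelihood is assumed to be independent Gaussian, and the degrees of freedom are taken as num_obs - num_param. dataprob's own conventions may differ, so treat this as an aid to reading the table rather than a description of the library's internals.

```python
import math


def gaussian_lnL(y_obs, y_calc, y_std):
    """Log likelihood for independent Gaussian errors with known y_std."""
    norm = -math.log(y_std) - 0.5 * math.log(2 * math.pi)
    return sum(-0.5 * ((yo - yc) / y_std) ** 2 + norm
               for yo, yc in zip(y_obs, y_calc))


def reduced_chi2(y_obs, y_calc, y_std, num_param):
    """chi^2 divided by degrees of freedom (assumed: num_obs - num_param)."""
    chi2 = sum(((yo - yc) / y_std) ** 2 for yo, yc in zip(y_obs, y_calc))
    return chi2 / (len(y_obs) - num_param)


def durbin_watson(residuals):
    """Durbin-Watson statistic. Values near 2 suggest uncorrelated
    residuals; values near 0 or 4 suggest positive or negative serial
    correlation, respectively."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    den = sum(r ** 2 for r in residuals)
    return num / den


# Tiny illustration with made-up numbers
y_obs = [1.0, 2.0, 3.0, 5.0]
y_calc = [1.0, 2.0, 3.0, 4.0]
residuals = [yo - yc for yo, yc in zip(y_obs, y_calc)]

print(gaussian_lnL(y_obs, y_calc, y_std=0.5))
print(reduced_chi2(y_obs, y_calc, y_std=0.5, num_param=2))
print(durbin_watson(residuals))
```

A reduced chi^2 near 1 indicates that the scatter about the model is consistent with the stated y_std. Values much larger than 1 point to a poor model or underestimated uncertainties.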

Installation

We recommend installing dataprob with pip:

pip install dataprob

To install from source and run tests:

git clone https://github.com/harmslab/dataprob.git
cd dataprob
pip install .

# to run test-suite
pytest --runslow
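
After either install route, one quick way to confirm which version ended up in the active environment is to query the package metadata from Python. This is a small sketch using only the standard library (Python 3.8 or later). The helper name installed_version is an illustration, not something dataprob provides.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if not installed."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints a version string if dataprob is installed, otherwise None
print(installed_version("dataprob"))
```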

Examples

A good way to learn how to use the library is by working through examples. The following notebooks are included in the dataprob/examples/ directory. They are self-contained demonstrations in which dataprob is used to analyze various classes of experimental data. The links below launch each notebook in Google Colab:

Documentation

Full documentation is on readthedocs.
