
Classy Probabilistic Programming



Installation Guide | readthedocs | Introduction on Binder | HowToFit

PyAutoFit is a Python-based probabilistic programming language for model fitting and Bayesian inference of large datasets.

The basic PyAutoFit API allows a user to quickly compose a probabilistic model and fit it to data via a log likelihood function, using a range of non-linear search algorithms (e.g. MCMC, nested sampling).

Users can then set up a PyAutoFit scientific workflow, which enables streamlined modeling of small datasets with tools to scale up to large datasets.

PyAutoFit supports advanced statistical methods, most notably a big data framework for Bayesian hierarchical analysis.

Getting Started

The following links are useful for new starters:

Support

Support for installation issues, help with model fitting, and help using PyAutoFit are available by raising an issue on the GitHub issues page.

We also offer support on the PyAutoFit Slack channel, where we post the latest updates on PyAutoFit. Slack is invitation-only, so if you’d like to join, send an email requesting an invite.

HowToFit

Users less familiar with Bayesian inference and scientific analysis may wish to read through the HowToFit lectures. These teach the basic principles of Bayesian inference, with the content pitched at undergraduate level and above.

A complete overview of the lectures is provided on the HowToFit readthedocs page.

API Overview

To illustrate the PyAutoFit API, we use a toy model: fitting a one-dimensional Gaussian to noisy 1D data. Here’s the data (black) and the model (red) we’ll fit:

https://raw.githubusercontent.com/rhayes777/PyAutoFit/master/files/toy_model_fit.png

We define our model, a 1D Gaussian, by writing a Python class using the format below:

import numpy as np

class Gaussian:

    def __init__(
        self,
        centre=0.0,        # <- PyAutoFit recognises these
        normalization=0.1, # <- constructor arguments are
        sigma=0.01,        # <- the Gaussian's parameters.
    ):
        self.centre = centre
        self.normalization = normalization
        self.sigma = sigma

    def model_data_1d_via_xvalues_from(self, xvalues):
        """
        An instance of the Gaussian class will be available during model fitting.

        This method will be used to fit the model to data and compute a likelihood.
        """

        # Shift the x coordinates so the profile is centred on `centre`.
        transformed_xvalues = xvalues - self.centre

        return (self.normalization / (self.sigma * (2.0 * np.pi) ** 0.5)) * \
                np.exp(-0.5 * (transformed_xvalues / self.sigma) ** 2.0)

PyAutoFit recognises that this Gaussian may be treated as a model component whose parameters can be fitted via a non-linear search like emcee.
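As a quick sanity check of the profile itself, the sketch below evaluates the same formula with only the standard library, so it runs without NumPy or PyAutoFit installed. The function name `gaussian_1d`, the grid, and the parameter values are assumptions chosen for illustration; they are not part of the PyAutoFit API.

```python
import math


def gaussian_1d(xvalues, centre=0.0, normalization=0.1, sigma=0.01):
    """Evaluate the normalized 1D Gaussian profile at each x value."""
    prefactor = normalization / (sigma * math.sqrt(2.0 * math.pi))
    return [
        prefactor * math.exp(-0.5 * ((x - centre) / sigma) ** 2.0)
        for x in xvalues
    ]


# Hypothetical grid and parameter values, chosen only for illustration.
xvalues = [float(i) for i in range(100)]
model_data = gaussian_1d(xvalues, centre=50.0, normalization=25.0, sigma=10.0)

# The profile peaks at the grid point closest to `centre`.
peak_index = model_data.index(max(model_data))

# With unit grid spacing the sum approximates the integral of the profile,
# which should be close to `normalization`.
total = sum(model_data)
```

Because the prefactor divides by `sigma * sqrt(2 * pi)`, `normalization` is the area under the curve rather than the peak height, which is why the sum over a wide enough grid recovers it.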

To fit this Gaussian to the data we create an Analysis object, which gives PyAutoFit the data and a log_likelihood_function describing how to fit the data with the model:

import autofit as af
import numpy as np

class Analysis(af.Analysis):

    def __init__(self, data, noise_map):

        self.data = data
        self.noise_map = noise_map

    def log_likelihood_function(self, instance):
        """
        The 'instance' that comes into this method is an instance of the Gaussian class
        above, with the parameters set to values chosen by the non-linear search.
        """

        # Printed on every likelihood evaluation, for illustration only.
        print("Gaussian Instance:")
        print("Centre = ", instance.centre)
        print("normalization = ", instance.normalization)
        print("Sigma = ", instance.sigma)

        # We fit the data with the Gaussian instance, using its
        # model_data_1d_via_xvalues_from method to create the model data.
        xvalues = np.arange(self.data.shape[0])

        model_data = instance.model_data_1d_via_xvalues_from(xvalues=xvalues)
        residual_map = self.data - model_data
        chi_squared_map = (residual_map / self.noise_map) ** 2.0
        log_likelihood = -0.5 * np.sum(chi_squared_map)

        return log_likelihood
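The likelihood arithmetic can be followed by hand on a tiny dataset. The sketch below reproduces the residual, chi-squared, and log likelihood steps using plain Python lists; the helper name `log_likelihood_from` and the three-point dataset are hypothetical and exist only for this illustration.

```python
def log_likelihood_from(data, model_data, noise_map):
    """Return -0.5 * chi-squared for data, model and noise given as sequences."""
    chi_squared = sum(
        ((d - m) / n) ** 2.0 for d, m, n in zip(data, model_data, noise_map)
    )
    return -0.5 * chi_squared


# Hypothetical three-point dataset, for illustration only.
data = [1.0, 2.0, 3.0]
noise_map = [0.5, 0.5, 0.5]

# A model that matches the data exactly gives the maximum value, zero.
perfect = log_likelihood_from(data, data, noise_map)

# Each point is offset by one noise standard deviation, so chi-squared is 3.
offset = log_likelihood_from(data, [1.5, 2.5, 3.5], noise_map)
```

This form omits the noise normalization term of a full Gaussian likelihood. When the noise map is fixed, that term is constant with respect to the model parameters, so it does not change which parameters maximize the likelihood.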

We can now fit our model to the data using a non-linear search:

model = af.Model(Gaussian)

analysis = Analysis(data=data, noise_map=noise_map)

emcee = af.Emcee(nwalkers=50, nsteps=2000)

result = emcee.fit(model=model, analysis=analysis)

The result contains information on the model-fit, for example the parameter samples, maximum log likelihood model and marginalized probability density functions.
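The fit above assumes that `data` and `noise_map` already exist. A minimal sketch of simulating such a dataset is given below, using only the standard library. The "true" parameter values, the noise level, and the seed are assumptions made for illustration, not values taken from the PyAutoFit example.

```python
import math
import random

random.seed(1)  # fixed seed so the simulated dataset is reproducible

# Hypothetical "true" parameters and noise level, for illustration only.
true_centre, true_normalization, true_sigma = 50.0, 25.0, 10.0
noise_sigma = 0.1

xvalues = range(100)
signal = [
    true_normalization / (true_sigma * math.sqrt(2.0 * math.pi))
    * math.exp(-0.5 * ((x - true_centre) / true_sigma) ** 2.0)
    for x in xvalues
]

# Add independent Gaussian noise to each point; the noise map stores its RMS.
data = [s + random.gauss(0.0, noise_sigma) for s in signal]
noise_map = [noise_sigma] * len(data)
```

The `log_likelihood_function` shown earlier reads `self.data.shape`, so these lists would need converting with `np.array(...)` before being passed to `Analysis`.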


