
PyAutoFit: Classy Probabilistic Programming

.. |binder| image:: https://mybinder.org/badge_logo.svg
   :target: https://mybinder.org/v2/gh/Jammy2211/autofit_workspace/HEAD

.. |JOSS| image:: https://joss.theoj.org/papers/10.21105/joss.02550/status.svg
   :target: https://doi.org/10.21105/joss.02550

|binder| |JOSS|

`Installation Guide <https://pyautofit.readthedocs.io/en/latest/installation/overview.html>`_ | `readthedocs <https://pyautofit.readthedocs.io/en/latest/index.html>`_ | `Introduction on Binder <https://mybinder.org/v2/gh/Jammy2211/autofit_workspace/release?filepath=introduction.ipynb>`_ | `HowToFit <https://pyautofit.readthedocs.io/en/latest/howtofit/howtofit.html>`_

PyAutoFit is a Python-based probabilistic programming language for the fully Bayesian analysis of extremely large datasets. It:

  • Makes it simple to compose and fit multi-level models using a range of Bayesian inference libraries, such as `emcee <https://github.com/dfm/emcee>`_ and `dynesty <https://github.com/joshspeagle/dynesty>`_.

  • Handles the 'heavy lifting' that comes with model-fitting, including model composition & customization (see the sketch after this list), outputting results, model-specific visualization and posterior analysis.

  • Is built for big-data analysis, whereby results are output as an SQLite database which can be queried after model-fitting is complete.
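As a flavour of model composition & customization, the sketch below composes a model from two of the ``Gaussian`` components defined in the API Overview further down this page and customizes its priors. ``af.Collection``, ``af.Model`` and ``af.UniformPrior`` follow the documented API, but treat the snippet as an illustrative sketch rather than a definitive recipe:

.. code-block:: python

import autofit as af

# Compose a model from two Gaussian components (the Gaussian class is
# defined in the API Overview below) and customize their priors.
model = af.Collection(gaussian_0=af.Model(Gaussian), gaussian_1=af.Model(Gaussian))

# Override the default prior on one parameter and fix another to a value.
model.gaussian_0.centre = af.UniformPrior(lower_limit=0.0, upper_limit=100.0)
model.gaussian_1.sigma = 0.5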

PyAutoFit supports advanced statistical methods such as `graphical and hierarchical models <https://pyautofit.readthedocs.io/en/latest/features/graphical.html>`_, `model-fit chaining <https://pyautofit.readthedocs.io/en/latest/features/search_chaining.html>`_, `sensitivity mapping <https://pyautofit.readthedocs.io/en/latest/features/sensitivity_mapping.html>`_ and `massively parallel model-fits <https://pyautofit.readthedocs.io/en/latest/features/search_grid_search.html>`_.

Getting Started

The following links are useful for new starters:

  • The `introduction Jupyter Notebook on Binder <https://mybinder.org/v2/gh/Jammy2211/autofit_workspace/release?filepath=introduction.ipynb>`_, where you can try PyAutoFit in a web browser (without installation).

  • The `PyAutoFit readthedocs <https://pyautofit.readthedocs.io/en/latest>`_, which includes an `installation guide <https://pyautofit.readthedocs.io/en/latest/installation/overview.html>`_ and an overview of PyAutoFit's core features.

  • The `autofit_workspace GitHub repository <https://github.com/Jammy2211/autofit_workspace>`_, which includes example scripts and the `HowToFit Jupyter notebook tutorials <https://github.com/Jammy2211/autofit_workspace/tree/master/notebooks/howtofit>`_, which give new users a step-by-step introduction to PyAutoFit.

Why PyAutoFit?

PyAutoFit began as an Astronomy project for fitting large imaging datasets of galaxies, after the developers found that existing PPLs (e.g., `PyMC3 <https://github.com/pymc-devs/pymc3>`_, `Pyro <https://github.com/pyro-ppl/pyro>`_, `STAN <https://github.com/stan-dev/stan>`_) were not suited to the model-fitting problems many Astronomers faced. These include:

  • Efficiently analysing large and homogeneous datasets with an identical model-fitting procedure, with tools for processing the large libraries of results that are output.

  • Problems where likelihood evaluations are expensive (e.g. run times of days per model-fit), necessitating highly customizable model-fitting pipelines with support for massively parallel computing.

  • Fitting many different models to the same dataset with tools that streamline model comparison.

If these challenges sound familiar, then PyAutoFit may be the right software for your model-fitting needs!

API Overview

To illustrate the PyAutoFit API, we'll use the toy problem of fitting a one-dimensional Gaussian to noisy 1D data. Here's the data (black) and the model (red) we'll fit:

.. image:: https://raw.githubusercontent.com/rhayes777/PyAutoFit/master/files/toy_model_fit.png
   :width: 400
   :alt: Toy model fit of a 1D Gaussian (red) to noisy 1D data (black)

We define our model, a 1D Gaussian, by writing a Python class in the format below:

.. code-block:: python

import numpy as np


class Gaussian:

    def __init__(
        self,
        centre=0.0,        # <- PyAutoFit recognises these
        normalization=0.1, # <- constructor arguments are
        sigma=0.01,        # <- the Gaussian's parameters.
    ):
        self.centre = centre
        self.normalization = normalization
        self.sigma = sigma

    def model_data_1d_via_xvalues_from(self, xvalues):
        """
        An instance of the Gaussian class will be available during model fitting.

        This method maps the instance's parameters to model data, and is used to
        fit the model to data and compute a likelihood.
        """
        transformed_xvalues = xvalues - self.centre

        return (self.normalization / (self.sigma * (2.0 * np.pi) ** 0.5)) * \
            np.exp(-0.5 * (transformed_xvalues / self.sigma) ** 2.0)

PyAutoFit recognises that this Gaussian may be treated as a model component whose parameters can be fitted via a non-linear search such as `emcee <https://github.com/dfm/emcee>`_.
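For completeness, here is a minimal sketch of how noisy 1D data like that shown above could be simulated using the ``Gaussian`` class; the parameter values, x-range and noise level are illustrative assumptions, not the values used in the autofit_workspace:

.. code-block:: python

import numpy as np

# Illustrative simulation of noisy 1D Gaussian data (all values are assumptions).
gaussian = Gaussian(centre=50.0, normalization=25.0, sigma=10.0)

xvalues = np.arange(100)
signal = gaussian.model_data_1d_via_xvalues_from(xvalues=xvalues)

noise_map = np.full(xvalues.shape, 1.0)  # constant RMS noise of 1.0
rng = np.random.default_rng(seed=1)
data = signal + rng.normal(scale=noise_map)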

To fit this Gaussian to the data we create an Analysis object, which gives PyAutoFit the data and a log_likelihood_function describing how to fit the data with the model:

.. code-block:: python

import autofit as af


class Analysis(af.Analysis):

    def __init__(self, data, noise_map):

        self.data = data
        self.noise_map = noise_map

    def log_likelihood_function(self, instance):

        """
        The 'instance' that comes into this method is an instance of the Gaussian class
        above, with the parameters set to values chosen by the non-linear search.
        """

        print("Gaussian Instance:")
        print("Centre = ", instance.centre)
        print("normalization = ", instance.normalization)
        print("Sigma = ", instance.sigma)

        """
        We fit the ``data`` with the Gaussian instance, using its
        "model_data_1d_via_xvalues_from" function to create the model data.
        """

        xvalues = np.arange(self.data.shape[0])

        model_data = instance.model_data_1d_via_xvalues_from(xvalues=xvalues)
        residual_map = self.data - model_data
        chi_squared_map = (residual_map / self.noise_map) ** 2.0
        log_likelihood = -0.5 * np.sum(chi_squared_map)

        return log_likelihood

We can now fit our model to the data using a non-linear search:

.. code-block:: python

model = af.Model(Gaussian)

analysis = Analysis(data=data, noise_map=noise_map)

emcee = af.Emcee(nwalkers=50, nsteps=2000)

result = emcee.fit(model=model, analysis=analysis)

The result contains information on the model-fit, for example the parameter samples, maximum log likelihood model and marginalized probability density functions.
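As a rough sketch of how the result might be inspected, the snippet below uses attribute names (``samples``, ``max_log_likelihood_instance``) that follow common PyAutoFit usage; treat them as assumptions, since the exact API may differ between versions:

.. code-block:: python

# Inspect the result; attribute names here are assumptions based on
# common PyAutoFit usage and may differ between versions.
samples = result.samples

instance = samples.max_log_likelihood_instance

print("Maximum Log Likelihood Gaussian:")
print("Centre = ", instance.centre)
print("Normalization = ", instance.normalization)
print("Sigma = ", instance.sigma)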

Support

Support for installation issues, help with model-fitting and using PyAutoFit is available by raising an issue on the `GitHub issues page <https://github.com/rhayes777/PyAutoFit/issues>`_.

We also offer support on the `PyAutoFit Slack channel <https://pyautoFit.slack.com/>`_, where we provide the latest updates on PyAutoFit. Slack is invitation-only, so if you'd like to join, `send an email <https://github.com/Jammy2211>`_ requesting an invite.


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

autofit-2022.6.29.1.tar.gz (7.1 MB)

Uploaded Source

Built Distribution

autofit-2022.6.29.1-py3-none-any.whl (326.2 kB)

Uploaded Python 3

File details

Details for the file autofit-2022.6.29.1.tar.gz.

File metadata

  • Download URL: autofit-2022.6.29.1.tar.gz
  • Upload date:
  • Size: 7.1 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.4.2 requests/2.22.0 setuptools/45.2.0 requests-toolbelt/0.8.0 tqdm/4.30.0 CPython/3.8.5

File hashes

Hashes for autofit-2022.6.29.1.tar.gz:

  • SHA256: c5bd675ea829698145c481e08078f34ff80fc176a5e0f5e8fd9e9b520a540239
  • MD5: 5301f482e1c31164cbde3cb2127cd1fe
  • BLAKE2b-256: 09af58f0e2b64796bcd3ea20fbfd87b0e6335311d0ed863c4b4cb57ce8da8e7c


File details

Details for the file autofit-2022.6.29.1-py3-none-any.whl.

File metadata

  • Download URL: autofit-2022.6.29.1-py3-none-any.whl
  • Upload date:
  • Size: 326.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.4.2 requests/2.22.0 setuptools/45.2.0 requests-toolbelt/0.8.0 tqdm/4.30.0 CPython/3.8.5

File hashes

Hashes for autofit-2022.6.29.1-py3-none-any.whl:

  • SHA256: 0bf27cbc28f3984ac39b2edb70510c2d960e00eb8a6fecce3616cba7bf64b268
  • MD5: bbdd382961ec521929f73e00ad02d217
  • BLAKE2b-256: 64c4d6a900f8b383f939e3865788d9e227f21851ec109ba26b1d5a528c198331

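If you want to confirm that a download matches the hashes listed above, a minimal sketch using Python's standard ``hashlib`` module is shown below (the filename is the source distribution above; adjust the path to wherever you saved the file):

.. code-block:: python

import hashlib

# Compute the SHA256 of the downloaded source distribution and compare it
# to the value listed above.
with open("autofit-2022.6.29.1.tar.gz", "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()

print(digest)
# Expected (from the hash list above):
# c5bd675ea829698145c481e08078f34ff80fc176a5e0f5e8fd9e9b520a540239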
