

Project description

PyAutoFit: Classy Probabilistic Programming

.. |binder| image:: https://mybinder.org/badge_logo.svg
   :target: https://mybinder.org/v2/gh/Jammy2211/autofit_workspace/HEAD

|binder|

PyAutoFit is a Python-based probabilistic programming language which:

  • Makes it simple to compose and fit models using a range of Bayesian inference libraries, such as `emcee <https://github.com/dfm/emcee>`_ and `dynesty <https://github.com/joshspeagle/dynesty>`_.

  • Handles the 'heavy lifting' that comes with model-fitting, including model composition & customization, outputting results, model-specific visualization and posterior analysis.

  • Is built for big-data analysis, whereby results are output as a database which can be loaded after model-fitting is complete (a sketch of loading this database follows below).

PyAutoFit supports advanced statistical methods such as `massively parallel non-linear search grid-searches <https://pyautofit.readthedocs.io/en/latest/features/search_grid_search.html>`_, `chaining together model-fits <https://pyautofit.readthedocs.io/en/latest/features/search_chaining.html>`_ and `sensitivity mapping <https://pyautofit.readthedocs.io/en/latest/features/sensitivity_mapping.html>`_.
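
As a concrete illustration of the database output mentioned above, completed model-fits can be loaded back via the aggregator. The snippet below is a minimal sketch only: the ``output`` directory is an assumed path, and the exact ``Aggregator`` import and method names may differ between PyAutoFit versions.

.. code-block:: python

    import autofit as af

    # Point the aggregator at the folder the model-fit results were written to
    # ("output" is an assumed path; use the folder your fits were output to).
    agg = af.Aggregator(directory="output")

    # Iterate over the Samples object of every model-fit found in that folder.
    for samples in agg.values("samples"):
        print(samples.max_log_likelihood_instance)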

Getting Started

You can try PyAutoFit now by following the `introduction Jupyter Notebook on Binder <https://gesis.mybinder.org/binder/v2/gh/Jammy2211/autofit_workspace/7586a67b726dca612404cf5fab1d77d8738f3737?filepath=introduction.ipynb>`_.

On `readthedocs <https://pyautofit.readthedocs.io/>`_ you'll find the installation guide, a complete overview of PyAutoFit's features, example scripts and the `HowToFit Jupyter notebook tutorials <https://pyautofit.readthedocs.io/en/latest/howtofit/howtofit.html>`_, which introduce new users to PyAutoFit.

Why PyAutoFit?

PyAutoFit is developed by astronomers for fitting large imaging datasets of galaxies. We found that existing probabilistic programming languages (e.g. `PyMC3 <https://github.com/pymc-devs/pymc3>`_, `Pyro <https://github.com/pyro-ppl/pyro>`_, `STAN <https://github.com/stan-dev/stan>`_) were not suited to the model-fitting problems astronomers face, for example:

  • Fitting large and homogeneous datasets with an identical model-fitting procedure, with tools for processing the large libraries of results that are output.

  • Problems where likelihood evaluations are expensive, leading to run times of days per fit and necessitating support for massively parallel computing.

  • Fitting many different models to the same dataset with tools that streamline model comparison.

If these challenges sound familiar, then PyAutoFit may be the right software for your model-fitting needs!

API Overview

To illustrate the PyAutoFit API, we'll use the toy problem of fitting a one-dimensional Gaussian to noisy 1D data. Here's the data (black) and the model (red) we'll fit:

.. image:: https://raw.githubusercontent.com/rhayes777/PyAutoFit/master/toy_model_fit.png
   :width: 400
   :alt: A 1D Gaussian model fitted to noisy 1D data
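
Data of this form can be simulated with a few lines of NumPy. The sketch below is purely illustrative: the 'true' parameter values are made up, and the autofit_workspace provides ready-made example datasets. The resulting ``data`` and ``noise_map`` arrays are of the kind used by the fit further below.

.. code-block:: python

    import numpy as np

    # Simulate noisy 1D Gaussian data (illustrative only).
    xvalues = np.arange(100.0)

    centre, intensity, sigma = 50.0, 25.0, 10.0  # assumed 'true' parameters

    signal = (intensity / (sigma * (2.0 * np.pi) ** 0.5)) * np.exp(
        -0.5 * ((xvalues - centre) / sigma) ** 2.0
    )

    noise_map = np.full(shape=xvalues.shape, fill_value=1.0)  # RMS noise per point
    data = signal + np.random.normal(scale=noise_map)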

We define our model, a 1D Gaussian, by writing a Python class using the format below:

.. code-block:: python

    import numpy as np


    class Gaussian:

        def __init__(
            self,
            centre=0.0,     # <- PyAutoFit recognises these
            intensity=0.1,  # <- constructor arguments are
            sigma=0.01,     # <- the Gaussian's parameters.
        ):
            self.centre = centre
            self.intensity = intensity
            self.sigma = sigma

        """
        An instance of the Gaussian class will be available during model fitting.

        This method will be used to fit the model to ``data`` and compute a likelihood.
        """

        def profile_from_xvalues(self, xvalues):

            transformed_xvalues = xvalues - self.centre

            return (self.intensity / (self.sigma * (2.0 * np.pi) ** 0.5)) * \
                np.exp(-0.5 * (transformed_xvalues / self.sigma) ** 2.0)

PyAutoFit recognises that this Gaussian may be treated as a model component whose parameters can be fitted via a ``NonLinearSearch`` such as `emcee <https://github.com/dfm/emcee>`_.
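
In practice, the model composition and customization mentioned above amounts to wrapping the class in a ``PriorModel`` and, optionally, overriding the priors on its parameters. The sketch below uses illustrative prior ranges, not recommended values; by default, priors are drawn from PyAutoFit's configuration files.

.. code-block:: python

    import autofit as af

    # Treat the Gaussian class defined above as a model component with free parameters.
    model = af.PriorModel(Gaussian)

    # Optionally customize the priors (illustrative ranges only).
    model.centre = af.UniformPrior(lower_limit=0.0, upper_limit=100.0)
    model.intensity = af.LogUniformPrior(lower_limit=1e-2, upper_limit=1e2)
    model.sigma = af.GaussianPrior(mean=0.01, sigma=0.005)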

To fit this Gaussian to the data we create an Analysis object, which gives PyAutoFit the data and a ``log_likelihood_function`` describing how to fit the data with the model:

.. code-block:: python

    import autofit as af
    import numpy as np


    class Analysis(af.Analysis):

        def __init__(self, data, noise_map):

            self.data = data
            self.noise_map = noise_map

        def log_likelihood_function(self, instance):

            """
            The 'instance' that comes into this method is an instance of the Gaussian
            class above, with the parameters set to values chosen by the non-linear search.
            """

            print("Gaussian Instance:")
            print("Centre = ", instance.centre)
            print("Intensity = ", instance.intensity)
            print("Sigma = ", instance.sigma)

            """
            We fit the ``data`` with the Gaussian instance, using its
            ``profile_from_xvalues`` function to create the model data.
            """

            xvalues = np.arange(self.data.shape[0])

            model_data = instance.profile_from_xvalues(xvalues=xvalues)
            residual_map = self.data - model_data
            chi_squared_map = (residual_map / self.noise_map) ** 2.0
            log_likelihood = -0.5 * np.sum(chi_squared_map)

            return log_likelihood

We can now fit our model to the data using a NonLinearSearch:

.. code-block:: python

    model = af.PriorModel(Gaussian)

    analysis = Analysis(data=data, noise_map=noise_map)

    emcee = af.Emcee(nwalkers=50, nsteps=2000)

    result = emcee.fit(model=model, analysis=analysis)

The ``result`` contains information on the model-fit, for example the parameter samples, the maximum log likelihood model and marginalized probability density functions.
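
For example, these quantities can be inspected via the result's ``Samples`` object. A minimal sketch; attribute names such as ``max_log_likelihood_instance`` follow the Samples API and may differ slightly between versions.

.. code-block:: python

    samples = result.samples

    # The parameter values of every sample accepted by the non-linear search.
    print(samples.parameters)

    # The model instance with the maximum log likelihood.
    instance = samples.max_log_likelihood_instance
    print(instance.centre, instance.intensity, instance.sigma)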

Support

Support for installation issues and integrating your modeling software with PyAutoFit is available by `raising an issue on the autofit_workspace GitHub page <https://github.com/Jammy2211/autofit_workspace/issues>`_ or by joining the `PyAutoFit Slack channel <https://pyautofit.slack.com/>`_, where we also provide the latest updates on PyAutoFit.

Slack is invitation-only, so if you'd like to join, `send an email <https://github.com/Jammy2211>`_ requesting an invite.


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

autofit-0.72.1.tar.gz (2.6 MB)

Uploaded Source

Built Distribution

autofit-0.72.1-py3-none-any.whl (1.9 MB)

Uploaded Python 3

File details

Details for the file autofit-0.72.1.tar.gz.

File metadata

  • Download URL: autofit-0.72.1.tar.gz
  • Upload date:
  • Size: 2.6 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.4.2 requests/2.22.0 setuptools/45.2.0 requests-toolbelt/0.8.0 tqdm/4.30.0 CPython/3.8.5

File hashes

Hashes for autofit-0.72.1.tar.gz
  • SHA256: dfdc01b277dea79650ae9865368fea0ef5bf77f285a37ea512df7dd5fb1e89aa
  • MD5: 5eb104f6a1e6fae691ed8da03f1cfab2
  • BLAKE2b-256: 7a978dacb4aa2cc5152c74f18ceb4d6a78d1d7b29c61fa9d6017c423ce71c7e3


File details

Details for the file autofit-0.72.1-py3-none-any.whl.

File metadata

  • Download URL: autofit-0.72.1-py3-none-any.whl
  • Upload date:
  • Size: 1.9 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.4.2 requests/2.22.0 setuptools/45.2.0 requests-toolbelt/0.8.0 tqdm/4.30.0 CPython/3.8.5

File hashes

Hashes for autofit-0.72.1-py3-none-any.whl
  • SHA256: b770abb35f11a342668eb8df69a1f373713c4a59c1cd3116ffab62d1e51e3731
  • MD5: 3ffdab4e5fc5e256c869551886efd421
  • BLAKE2b-256: ee9355d436e439fcf45e99ebbf80c3f5bf6d1e1654bc51109d54cb859306418d

