
This package contains methods that assist in performing bivariate analysis of datasets.


Bivariate

This package is meant to make life easier for students and teachers working on probabilistic applications. It was initially developed for MSc education at TU Delft. The focus is on visualizing multivariate distributions in 2D plots (hence the name bivariate) and on making this as easy as possible, along with defining the multivariate distribution and carrying out common calculations.

Contributors:

  • 2022: Thirza Feenstra, Jelle Knibbe, Irene van der Veer, Caspar Jungbacker
  • 2023: Benjamin Ramousse
  • 2024: Siemen Algra, Max Guichard

Version history:

  • v0.0: never published, originally stored on TU Delft GitLab, only used with students in class
  • v1.0: never published, moved to GitHub, only used with students in class (Q4, Q1 2023) and for book figures
  • v2.0: first release on PyPI for use with students. Wait for v3.0 for better organization and documentation, and for something that begins to approach stability

License: still working out the details, but it will eventually have an open license along the lines of GPL 3.0 or CC BY 4.0. Stay tuned...

Installation instructions

Using Python virtual environment:

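# Create the virtual environment
PATH_TO_YOUR/python -m venv .venv
# Activate it (Git Bash on Windows; on Linux/macOS use: source .venv/bin/activate)
source .venv/Scripts/activate
# Install the dependencies
python -m pip install -r requirements.txt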

Guidelines - 06/10/2023 (written by Benjamin Ramousse)

NOTE THAT EVERYTHING FROM HERE AND BELOW HAS NOT BEEN UPDATED AFTER OCTOBER 1, 2023.

  • src/bivariate contains the source code for the package
  • tests/test_bivariate.py is the test file used for automated testing with pytest. Its test functions follow pytest's conventions.
  • tests/examples.ipynb illustrates the in-use behaviour of the package and the purpose of its methods.
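As a rough illustration of the pytest conventions mentioned above, the sketch below shows the general shape of such a test file. The function `standard_normal_cdf` and both tests are hypothetical and are not taken from tests/test_bivariate.py; only the Python standard library is used.

```python
import math
from statistics import NormalDist


def standard_normal_cdf(x):
    """Hypothetical function under test: CDF of the standard normal distribution."""
    return NormalDist(mu=0.0, sigma=1.0).cdf(x)


def test_cdf_at_median():
    # pytest collects functions whose names start with "test_"
    # and reports any failing assert statement.
    assert math.isclose(standard_normal_cdf(0.0), 0.5)


def test_cdf_is_increasing():
    assert standard_normal_cdf(-1.0) < standard_normal_cdf(1.0)
```

Saved in a file whose name starts with `test_`, these functions would be picked up by running `python -m pytest` from the repository root.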

Focus on src/bivariate/class_multivar.py:

The Multivariate class was created to construct an object with the following attributes:

  • 3 random variables $X_0, X_1, X_2$ ;
  • 2 bivariate copulas $C_{0,1}$ and $C_{1,2}$ between $X_0, X_1$ and $X_1, X_2$, respectively;
  • the conditional copula $C_{0,2|1}$.

Bivariate class

To simplify the definition of the Multivariate class, a Bivariate class was created. Like Multivariate, it has three attributes:

  • a list of two random variables (RVs) $[X_0, X_1]$;
  • the family of the bivariate copula $C_{0,1}$ (and its parameter);
  • the bivariate copula itself, defined with the package pyvinecopulib from the family/parameter pair above.

All Bivariate methods access the random variables by index (0 or 1). Multivariate uses the same convention, with indices ranging from 0 to 2. The plotting methods follow the usual matplotlib pattern and return f (a matplotlib.pyplot.Figure object) and ax (a matplotlib.pyplot.Axes object). Most of Bivariate's plotting methods accept an ax keyword argument, which makes it possible to add several plots to the same Figure in sequence.

Multivariate class

A Multivariate is defined as follows:

M = Multivariate([X_0, X_1, X_2], [(family1, parameter1), (family2, parameter2), (family3, parameter3)])

where family1 and parameter1 are the family and parameter of $C_{0,1}$, family2 and parameter2 those of $C_{1,2}$, and family3 and parameter3 those of the conditional copula $C_{0,2|1}$.

Using these arguments, two Bivariate objects are created for $X_0, X_1$ and $X_1, X_2$. The conditional copula and the two bivariate copulas are then used to sample the copula $C_{0,2}$ of $X_0, X_2$.

Note: the current version only works when all copulas, including the conditional one, are Gaussian (normal). Other families, and the corresponding sampling procedure, still need to be implemented for the general case.
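For the all-Gaussian case, the way two pair copulas and a conditional copula determine the dependence between $X_0$ and $X_2$ can be sketched with the standard library alone. This is a minimal illustration, not the package's implementation: the function names are hypothetical, each copula is assumed to be summarized by a single correlation parameter, and the samples are generated on the standard normal scale (applying the standard normal CDF to each component would give samples on the copula scale).

```python
import math
import random


def implied_correlation(rho_01, rho_12, rho_02_given_1):
    """Correlation of (X0, X2) implied by rho_01, rho_12 and the partial correlation."""
    return rho_01 * rho_12 + rho_02_given_1 * math.sqrt(
        (1.0 - rho_01**2) * (1.0 - rho_12**2)
    )


def sample_gaussian_vine(n, rho_01, rho_12, rho_02_given_1, seed=0):
    """Draw n triples (z0, z1, z2) of standard normals with the requested dependence."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # Residuals of Z0 and Z2 given Z1, correlated through rho_02_given_1.
        e0 = rng.gauss(0.0, 1.0)
        e2 = rho_02_given_1 * e0 + math.sqrt(1.0 - rho_02_given_1**2) * rng.gauss(0.0, 1.0)
        z0 = rho_01 * z1 + math.sqrt(1.0 - rho_01**2) * e0
        z2 = rho_12 * z1 + math.sqrt(1.0 - rho_12**2) * e2
        samples.append((z0, z1, z2))
    return samples


def correlation(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


draws = sample_gaussian_vine(20000, rho_01=0.6, rho_12=0.5, rho_02_given_1=0.3)
z0, _, z2 = zip(*draws)
print(implied_correlation(0.6, 0.5, 0.3), correlation(z0, z2))
```

The two printed numbers should be close: the first is the closed-form correlation of $(X_0, X_2)$, the second its estimate from the samples.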

A key method of the Multivariate class is bivariate_plot, which plots a (limit-state) function together with the contours of the multivariate joint distribution in a given bivariate plane. x_index and y_index are the indices of the variables shown on the plot's x and y axes: for instance, x_index=1 and y_index=0 places the plot in the plane $(x_1, x_0)$. z_value is the value at which the third (not plotted) random variable is fixed: in the previous example, with the plot in the plane $(x_1, x_0)$, z_value sets the value of $X_2$ used for the plot.

The same x_index and y_index system is used in the other plotting methods of the class.
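The index convention can be illustrated without any plotting. The sketch below is not the package's code: `fix_third_variable` and `limit_state` are hypothetical names, and the helper only shows how x_index, y_index and z_value could map plot coordinates back to the arguments of a three-variable function.

```python
def fix_third_variable(func, x_index, y_index, z_value):
    """Return g(x, y) that evaluates func(x0, x1, x2) in the plane (x_index, y_index)."""
    indices = {0, 1, 2}
    if x_index == y_index or not {x_index, y_index} <= indices:
        raise ValueError("x_index and y_index must be two different values among 0, 1, 2")
    # The remaining index belongs to the variable that is not plotted.
    z_index = (indices - {x_index, y_index}).pop()

    def sliced(x, y):
        args = [None, None, None]
        args[x_index] = x
        args[y_index] = y
        args[z_index] = z_value
        return func(*args)

    return sliced


def limit_state(x0, x1, x2):
    # Hypothetical limit-state function, used only for illustration.
    return x0 + 2.0 * x1 - x2


# Plane (x_1, x_0) with X_2 fixed at 3: the first argument of g is x_1, the second x_0.
g = fix_third_variable(limit_state, x_index=1, y_index=0, z_value=3.0)
print(g(10.0, 1.0))  # x1 = 10, x0 = 1, x2 = 3  ->  1 + 20 - 3 = 18.0
```

Evaluating `g` on a grid of (x, y) values would then give the values needed to draw the limit-state contour in the chosen plane.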

Structure of the bivariate package (Siemen)

├── github
│   └── ... # GitHub workflows
├── src/bivariate
│   └── ... # All package source files
├── tests
│   └── ... # All testing files
├── ...
└── README.md
