This package contains methods that assist in performing bivariate analysis of datasets.
Bivariate
This package is meant to make life easier for students and teachers working on probabilistic applications. It was initially developed for MSc education at TU Delft. The focus is on visualizing multivariate distributions in 2D plots (hence the name bivariate) and on making this process as easy as possible, as well as on defining the multivariate distribution and performing common calculations.
Contributors:
- 2022: Thirza Feenstra, Jelle Knibbe, Irene van der Veer, Caspar Jungbacker
- 2023: Benjamin Ramousse
- 2024: Siemen Algra, Max Guichard
Version history:
- v0.0: never published, originally stored on TU Delft GitLab, only used with students in class
- v1.0: never published, moved to GitHub, only used with students in class (Q4, Q1 2023) and for book figures
- v2.0: first release on PyPI for use with students. Wait for v3.0 for better organization and documentation, and for something that begins to approach stability
License: still working out the details, but it will eventually have an open license along the lines of GPL 3.0 or CC BY 4.0. Stay tuned...
Installation instructions
Using Python virtual environment:
PATH_TO_YOUR/python -m venv .venv
source .venv/Scripts/activate  # Git Bash on Windows; on Linux/macOS use: source .venv/bin/activate
python -m pip install -r requirements.txt
Guidelines - 06/10/2023 (written by Benjamin Ramousse)
NOTE THAT EVERYTHING FROM HERE AND BELOW HAS NOT BEEN UPDATED AFTER OCTOBER 1, 2023.
- src/bivariate contains the source code for the package.
- tests/test_bivariate.py is the test file used for automated testing with pytest. In particular, the syntax of the methods defined in this file follows that of the pytest module.
- tests/examples.ipynb illustrates the in-use behaviour of the package and the purpose of its methods.
Focus on src/bivariate/class_multivar.py:
The rationale behind the creation of the Multivariate class is to construct an object which contains several attributes:
- 3 random variables $X_0, X_1, X_2$ ;
- 2 bivariate copulas $C_{0,1}$ and $C_{1,2}$ between $X_0, X_1$ and $X_1, X_2$, respectively;
- the conditional copula $C_{0,2|1}$.
Bivariate class
To facilitate the definition of the Multivariate class, a Bivariate class was created. Similarly to Multivariate, it contains 3 attributes:
- a list of two random variables (RVs) $[X_0, X_1]$;
- the family of the bivariate copula $C_{0,1}$ (and its parameter);
- the bivariate copula itself, defined using the package pyvinecopulib and the aforementioned family/parameter pair.
All the methods of Bivariate access the random variables by their index (0 or 1). The same convention is used in Multivariate, where the index ranges from 0 to 2. The plotting methods follow the usual matplotlib pattern and return f (a matplotlib.pyplot.Figure object) and ax (a matplotlib.pyplot.Axes object). The ax keyword argument, present in most of Bivariate's plotting methods, makes it possible to add plots sequentially to the same Figure object.
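The f/ax convention described above can be sketched with plain matplotlib. This is only an illustration of the pattern: plot_points is a hypothetical helper and is not part of the package, and the data are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt


def plot_points(x, y, ax=None, **kwargs):
    """Hypothetical helper (not from the package) following the (f, ax) convention."""
    if ax is None:
        # No Axes supplied: create a new Figure and Axes.
        f, ax = plt.subplots()
    else:
        # Axes supplied: draw on it and return the Figure it belongs to.
        f = ax.figure
    ax.scatter(x, y, **kwargs)
    return f, ax


# The first call creates the figure; the second adds a plot to the same Axes.
f, ax = plot_points([0.1, 0.5, 0.9], [0.2, 0.4, 0.8], label="first")
f2, ax2 = plot_points([0.3, 0.6], [0.7, 0.1], ax=ax, label="second")
ax.legend()
```

Passing the returned ax back in is what allows several plots to be layered on one Figure.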
Multivariate class
A Multivariate is defined as follows:
M = Multivariate([X_0, X_1, X_2], [(family1, parameter1), (family2, parameter2), (family3, parameter3)])
where family1 (family2) is the family of $C_{0,1}$ ($C_{1,2}$) and parameter1 (parameter2) is its parameter; family3 and parameter3 relate to the conditional copula $C_{0,2|1}$.
Using these arguments, two Bivariate objects are created for $X_0, X_1$ and $X_1, X_2$. The conditional copula and the two bivariate copulas are then used to sample the copula $C_{0,2}$ of $X_0, X_2$.
Note: the current version only applies if all the (conditional) copulas are normal. Other special cases (and treatment of the sampling) should be implemented for generalization.
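For the all-normal (Gaussian) case mentioned in the note, the dependence between $X_0$ and $X_2$ follows from the standard partial correlation formula. The sketch below is not the package's implementation: it uses only the Python standard library, assumes that each copula parameter is a correlation coefficient, and all function names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def implied_correlation(rho_01, rho_12, rho_02_given_1):
    """Correlation of the Gaussian copula C_{0,2} implied by C_{0,1}, C_{1,2}
    and the conditional copula C_{0,2|1} (partial correlation formula)."""
    return rho_01 * rho_12 + rho_02_given_1 * math.sqrt(
        (1 - rho_01**2) * (1 - rho_12**2)
    )


def sample_gaussian_vine(n, rho_01, rho_12, rho_02_given_1, seed=0):
    """Draw n samples (u_0, u_1, u_2) on the unit cube with the given Gaussian
    pair copulas. Hypothetical helper for illustration only."""
    rng = random.Random(seed)
    phi = NormalDist().cdf
    samples = []
    for _ in range(n):
        # Three independent standard normal draws.
        e0, e1, e2 = (rng.gauss(0.0, 1.0) for _ in range(3))
        z1 = e1
        # z0 has correlation rho_01 with z1.
        z0 = rho_01 * z1 + math.sqrt(1 - rho_01**2) * e0
        # z2 has correlation rho_12 with z1; its residual has correlation
        # rho_02_given_1 with the residual of z0 (the partial correlation).
        resid = rho_02_given_1 * e0 + math.sqrt(1 - rho_02_given_1**2) * e2
        z2 = rho_12 * z1 + math.sqrt(1 - rho_12**2) * resid
        # Map standard normal scores to uniform margins.
        samples.append((phi(z0), phi(z1), phi(z2)))
    return samples


rho_02 = implied_correlation(0.6, 0.8, 0.5)
samples = sample_gaussian_vine(1000, 0.6, 0.8, 0.5)
```

The uniform samples can then be mapped to the scale of each random variable with its inverse CDF. Non-Gaussian families have no such closed form and would need a general vine sampling routine.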
A key method of the Multivariate class is bivariate_plot, which plots a (limit-state) function and the contours of the multivariate joint distribution in a given bivariate plane. x_index and y_index are the indices of the variables used for the plot's x and y axes: for instance, x_index=1 and y_index=0 places the plot in the plane $(x_1, x_0)$. z_value is the value at which the third (not plotted) random variable is fixed: in the previous example in the plane $(x_1, x_0)$, z_value sets the value of $X_2$ used for the plot.
The same x_index and y_index system is used in the other plotting methods of the class.
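The index convention can be illustrated without the package. In the sketch below, remaining_index and evaluate_in_plane are hypothetical helpers (they are not methods of Multivariate), and the limit-state function g is an arbitrary example that takes a list of three values.

```python
def remaining_index(x_index, y_index):
    """Return the index (0, 1 or 2) of the variable that is not plotted."""
    if x_index == y_index or not {x_index, y_index} <= {0, 1, 2}:
        raise ValueError("x_index and y_index must be two distinct values in {0, 1, 2}")
    (z_index,) = {0, 1, 2} - {x_index, y_index}
    return z_index


def evaluate_in_plane(func, x_values, y_values, x_index, y_index, z_value):
    """Evaluate func([x_0, x_1, x_2]) on a grid in the chosen plane, with the
    third variable fixed at z_value. Rows follow y_values, columns x_values."""
    z_index = remaining_index(x_index, y_index)
    grid = []
    for y in y_values:
        row = []
        for x in x_values:
            point = [None, None, None]
            point[x_index] = x
            point[y_index] = y
            point[z_index] = z_value
            row.append(func(point))
        grid.append(row)
    return grid


def g(x):
    """Arbitrary example function of the three variables."""
    return x[0] + 10 * x[1] + 100 * x[2]


# Plane (x_1, x_0): X_1 on the x axis, X_0 on the y axis, X_2 fixed at 5.
grid = evaluate_in_plane(g, [1, 2], [3], x_index=1, y_index=0, z_value=5)
```

With x_index=1 and y_index=0, the fixed variable is $X_2$, matching the example in the text.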
Structure of bivariate package, Siemen
├── github
│ └── ... # Github workflows
├── src\bivariate
│ └── ... # All package source files
├── tests
│ └── ... # All testing files
├── ...
└── README.md