
Draw Logistic Regression Plots in Python


lorepy: Logistic Regression Plots for Python

Logistic regression plots show the distribution of a categorical dependent variable as a function of a continuous independent variable.

If you prefer an R implementation of this package, have a look at loreplotr.

LoRePlot example on Iris Dataset
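
The idea behind such a plot can be sketched with scikit-learn and pandas alone. The snippet below is an illustration of the general technique, not lorepy's own implementation; the helper name `class_probabilities` and its `num_points` parameter are hypothetical. It fits a logistic regression of the category on a single continuous variable and predicts the probability of each category along an evenly spaced grid.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression


def class_probabilities(data, x, y, num_points=200):
    """Fit a logistic regression of y on x and predict class probabilities on a grid.

    Hypothetical helper for illustration only; not part of lorepy.
    """
    model = LogisticRegression()
    model.fit(data[[x]].to_numpy(), data[y])

    # Evenly spaced x values covering the observed range
    grid = np.linspace(data[x].min(), data[x].max(), num_points)
    probs = model.predict_proba(grid.reshape(-1, 1))

    # One column per category, indexed by the x value
    return pd.DataFrame(probs, columns=model.classes_, index=pd.Index(grid, name=x))


iris = load_iris()
iris_df = pd.DataFrame(iris.data, columns=iris.feature_names)
iris_df["species"] = iris.target_names[iris.target]

probs = class_probabilities(iris_df, x="sepal width (cm)", y="species")
print(probs.head())
```

Each row of `probs` sums to 1, so calling `probs.plot.area()` stacks the categories into a band that fills the full height of the plot.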

Installation

Lorepy can be installed with pip using the command below.

pip install lorepy

Usage

Data must be provided as a DataFrame, and the columns holding the x (independent, continuous) and y (dependent, categorical) variables must be specified. Here the iris dataset is loaded and converted to an appropriate DataFrame. Once the data is in shape, it can be plotted with a single line of code: loreplot(data=iris_df, x="sepal width (cm)", y="species").

from lorepy import loreplot

from sklearn.datasets import load_iris
import matplotlib.pyplot as plt
import pandas as pd

iris_obj = load_iris()
iris_df = pd.DataFrame(iris_obj.data, columns=iris_obj.feature_names)

iris_df["species"] = [iris_obj.target_names[s] for s in iris_obj.target]

loreplot(data=iris_df, x="sepal width (cm)", y="species")

plt.show()
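
For your own data, the same shape is needed: one numeric column for x and one column of category labels for y. The sketch below uses made-up measurements with hypothetical column names (`age`, `group`) and a hypothetical helper `prepare`. How lorepy treats missing values is not described here, so the sketch drops them up front as a precaution.

```python
import pandas as pd

# Hypothetical measurements; the column names are illustrative only
raw = pd.DataFrame(
    {
        "age": [23, 35, None, 52, 41, 67],
        "group": ["a", "b", "a", None, "b", "c"],
    }
)


def prepare(data, x, y):
    """Return rows with a numeric x, a string-labelled y and no missing values."""
    clean = data[[x, y]].dropna().copy()
    clean[x] = pd.to_numeric(clean[x])  # continuous independent variable
    clean[y] = clean[y].astype(str)     # categorical dependent variable
    return clean


plot_df = prepare(raw, x="age", y="group")
print(plot_df)
```

The resulting `plot_df` can then be passed as `data`, with `x="age"` and `y="group"`.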

Options

While lorepy has very few customizations, it is possible to pass arguments through to Pandas' DataFrame.plot.area and Matplotlib's pyplot.scatter to change the aesthetics of the plots.

Disable sample dots

Dots indicating where samples are located can be enabled or disabled using the add_dots argument.

loreplot(data=iris_df, x="sepal width (cm)", y="species", add_dots=False)
plt.show()

LoRePlot dots can be disabled

Custom styles

Additional keyword arguments are passed to Pandas' DataFrame.plot.area. This can be used, among other things, to define a custom colormap. For more options to customize these plots consult Pandas' documentation.

from matplotlib.colors import ListedColormap

colormap = ListedColormap(['red', 'green', 'blue'])

loreplot(data=iris_df, x="sepal width (cm)", y="species", colormap=colormap)
plt.show()

LoRePlot custom colors

The scatter_kws argument passes options to pyplot.scatter, which can be used to change the appearance of the sample markers.

scatter_options = {
    's': 20,                  # Marker size
    'alpha': 1,               # Fully opaque
    'color': 'black',         # Set color to black
    'marker': 'x'             # Set style to crosses
}

loreplot(data=iris_df, x="sepal width (cm)", y="species", scatter_kws=scatter_options)
plt.show()

LoRePlot custom markers

You can use LoRePlots in subplots as you would expect.

fig, ax = plt.subplots(1,2, sharex=False, sharey=True)
loreplot(data=iris_df, x="sepal width (cm)", y="species", ax=ax[0])
loreplot(data=iris_df, x="petal width (cm)", y="species", ax=ax[1])

ax[0].get_legend().remove()
ax[0].set_title("Sepal Width")
ax[1].set_title("Petal Width")

plt.savefig('./docs/img/loreplot_subplot.png', dpi=150)
plt.show()

LoRePlot in subplots

By default, lorepy uses a multi-class logistic regression model. This can be replaced with any scikit-learn classifier that implements predict_proba and fit. Below you can see the code and output with a Support Vector Classifier (SVC) and a Random Forest Classifier (RF).

from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier

fig, ax = plt.subplots(1, 2, sharex=False, sharey=True)

svc = SVC(probability=True)
rf = RandomForestClassifier(n_estimators=10, max_depth=2)

loreplot(data=iris_df, x="sepal width (cm)", y="species", clf=svc, ax=ax[0])
loreplot(data=iris_df, x="sepal width (cm)", y="species", clf=rf, ax=ax[1])

ax[0].get_legend().remove()
ax[0].set_title("SVC")
ax[1].set_title("RF")

plt.savefig("./docs/img/loreplot_other_clf.png", dpi=150)
plt.show()

Lorepy with different types of classifiers
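
The requirement that a classifier implements both fit and predict_proba can be checked before plotting. The sketch below uses a hypothetical helper, `is_usable`, that is not part of lorepy. It also shows why the example above constructs the SVC with `probability=True`: without it, scikit-learn's SVC does not expose predict_proba.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC


def is_usable(clf):
    """True if clf exposes both fit and predict_proba (hypothetical helper)."""
    return all(callable(getattr(clf, name, None)) for name in ("fit", "predict_proba"))


candidates = {
    "LogisticRegression": LogisticRegression(),
    "RandomForestClassifier": RandomForestClassifier(),
    "SVC(probability=True)": SVC(probability=True),
    "SVC()": SVC(),  # probability defaults to False
}

usable = {name: is_usable(clf) for name, clf in candidates.items()}
for name, ok in usable.items():
    print(f"{name}: {ok}")
```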

If there are confounders, they can be taken into account with the confounders argument. It takes a list of tuples, each holding a feature name and the reference value to use for that feature in plots. For example, to adjust for Body Mass Index (BMI) and use a BMI of 25 in plots, set this to [("BMI", 25)].

loreplot(
    data=iris_df,
    x="sepal width (cm)",
    y="species",
    confounders=[("petal width (cm)", 1)],
)
plt.savefig("./docs/img/loreplot_confounder.png", dpi=150)
plt.show()

Loreplot with a confounder
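
One common way to adjust for a confounder with a reference value, as described above, is to fit the model on both the variable of interest and the confounder, then predict along x while holding the confounder fixed. The sketch below illustrates that general approach with scikit-learn. It is an assumption about the technique and not a description of lorepy's internals.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

iris = load_iris()
iris_df = pd.DataFrame(iris.data, columns=iris.feature_names)
iris_df["species"] = iris.target_names[iris.target]

x, confounder, reference = "sepal width (cm)", "petal width (cm)", 1

# Fit on both the variable of interest and the confounder
model = LogisticRegression(max_iter=1000)
model.fit(iris_df[[x, confounder]].to_numpy(), iris_df["species"])

# Predict along x while holding the confounder at its reference value
grid = np.linspace(iris_df[x].min(), iris_df[x].max(), 100)
features = np.column_stack([grid, np.full_like(grid, reference)])
adjusted = pd.DataFrame(
    model.predict_proba(features), columns=model.classes_, index=grid
)
print(adjusted.head())
```

The curves in `adjusted` describe how the predicted probabilities change with sepal width for a flower whose petal width is fixed at 1 cm.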

Assess uncertainty

A loreplot alone does not show how certain the estimated prevalence of each group is across the range. The uncertainty_plot function provides a view into this and can be used as shown below. It uses resampling (or jackknifing) to determine the 50% and 95% intervals of the predicted values and shows these in a multi-panel plot with one panel per category.

from lorepy import uncertainty_plot

uncertainty_plot(
    data=iris_df,
    x="sepal width (cm)",
    y="species",
)
plt.savefig("./docs/img/uncertainty_default.png", dpi=150)
plt.show()

Default uncertainty plot

This also supports custom colors, ranges and classifiers. More examples are available in example_uncertainty.py.
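
The resampling idea behind such intervals can be sketched as follows. This is a minimal illustration that assumes bootstrap resampling (rows drawn with replacement) and percentile intervals; the number of resamples and grid points are arbitrary choices, and lorepy's own procedure may differ.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

iris = load_iris()
iris_df = pd.DataFrame(iris.data, columns=iris.feature_names)
iris_df["species"] = iris.target_names[iris.target]

x, y = "sepal width (cm)", "species"
grid = np.linspace(iris_df[x].min(), iris_df[x].max(), 50).reshape(-1, 1)
classes = sorted(iris_df[y].unique())
rng = np.random.default_rng(0)

curves = []
for _ in range(100):
    # Resample rows with replacement and refit the model
    seed = int(rng.integers(1_000_000))
    sample = iris_df.sample(n=len(iris_df), replace=True, random_state=seed)
    if sample[y].nunique() < len(classes):
        continue  # skip a resample that lost a category
    model = LogisticRegression().fit(sample[[x]].to_numpy(), sample[y])
    curves.append(model.predict_proba(grid))

curves = np.stack(curves)  # shape: (resamples, grid points, categories)

# Bounds of the 95% and 50% intervals at each grid point, per category
low95, low50, high50, high95 = np.percentile(curves, [2.5, 25, 75, 97.5], axis=0)
print(low95.shape, high95.shape)
```

For category `k`, `low95[:, k]` and `high95[:, k]` bound the outer band and `low50[:, k]` and `high50[:, k]` the inner band, which could be drawn with Matplotlib's `fill_between`.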

Development

Additional documentation for developers is included, with details on running tests, building, and deploying to PyPI.

Contributing

Any contributions you make are greatly appreciated.

  • Found a bug or have some suggestions? Open an issue.
  • Pull requests are welcome, but please open an issue first to discuss the features or changes you wish to implement.

Contact

lorepy was developed by Sebastian Proost at the RaesLab and was based on R code written by Sara Vieira-Silva. As of version 0.2.0 lorepy is available under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.

For commercial access inquiries, please contact Jeroen Raes.
