Tools for creating and analyzing confidence ellipses, including Hotelling's T-squared ellipses for multivariate statistical analysis and data visualization.

pyEllipse

A Python package for computing Hotelling's T² statistics and generating confidence ellipse/ellipsoid coordinates for multivariate data analysis and visualization.

Overview

pyEllipse provides three main functions for analyzing multivariate data:

hotelling_parameters - Calculate Hotelling's T² statistics and ellipse parameters
hotelling_coordinates - Generate Hotelling's ellipse/ellipsoid coordinates from PCA/PLS scores
confidence_ellipse - Compute confidence ellipse/ellipsoid coordinates from raw data with grouping support
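As a rough illustration of the statistic these functions are built around, the sketch below computes a per-sample Hotelling's T² from the first k columns of a score matrix with plain NumPy. It assumes mean-centered, mutually uncorrelated scores (as PCA produces) and is not pyEllipse's implementation; the function name `t2_from_scores` and the synthetic data are hypothetical.

```python
import numpy as np

def t2_from_scores(scores, k=2):
    """Per-sample Hotelling's T² over the first k score columns.

    Assumes the columns are mean-centered and mutually uncorrelated,
    which holds for PCA scores.
    """
    t = np.asarray(scores, dtype=float)[:, :k]
    # Sample variance of each component (ddof=1 for the unbiased estimate)
    variances = t.var(axis=0, ddof=1)
    # T² is the sum of squared scores, each scaled by its component variance
    return np.sum(t**2 / variances, axis=1)

# Synthetic, centered scores stand in for real PCA output
rng = np.random.default_rng(0)
scores = rng.normal(size=(50, 3)) * np.array([3.0, 2.0, 1.0])
scores -= scores.mean(axis=0)

t2 = t2_from_scores(scores, k=2)
print(t2.shape)  # (50,)
```

Samples with a large T² lie far from the center of the score space relative to the spread of each component, which is why the statistic is commonly used for outlier screening.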

Installation

pip install pyEllipse

Usage Examples

Example 1: Hotelling's T² statistic and confidence ellipse from PCA Scores

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from pathlib import Path
from pyEllipse import hotelling_parameters, hotelling_coordinates, confidence_ellipse

def load_wine_data():
    """Load wine dataset and add cultivar labels"""
    wine_df = pd.read_csv('data/wine.csv')
    
    # Add cultivar labels based on the standard Wine dataset row order:
    # rows 0-58 are cultivar 1, rows 59-129 cultivar 2, the rest cultivar 3
    row = np.arange(len(wine_df))
    wine_df['Cultivar'] = np.select(
        [row < 59, row < 130], ['Cultivar 1', 'Cultivar 2'], default='Cultivar 3'
    )
    return wine_df

wine_df = load_wine_data()
X = wine_df.drop('Cultivar', axis=1)
y = wine_df['Cultivar']

# Standardize the features, then perform PCA on the scaled data
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
pca = PCA()
pca_scores = pca.fit_transform(X_scaled)
explained_var = pca.explained_variance_ratio_

plt.style.use('bmh')
# Calculate T² statistics
results = hotelling_parameters(pca_scores, k=2)
t2 = results['Tsquared'].values

# Generate ellipse coordinates for plotting
ellipse_95 = hotelling_coordinates(pca_scores, pcx=1, pcy=2, conf_limit=0.95)
ellipse_99 = hotelling_coordinates(pca_scores, pcx=1, pcy=2, conf_limit=0.99)

# Plot the PCA scores with Hotelling's T² ellipse
plt.figure(figsize=(8, 6))
scatter = plt.scatter(
    pca_scores[:, 0], pca_scores[:, 1], 
    c=t2, cmap='jet', alpha=0.85, s=70, label='Wine samples'
    )
cbar = plt.colorbar(scatter)
cbar.set_label('Hotelling T² Statistic', rotation=270, labelpad=20)

plt.plot(ellipse_95['x'], ellipse_95['y'], 'r-', linewidth=1, label='95% Confidence level')
plt.plot(ellipse_99['x'], ellipse_99['y'], 'k-', linewidth=1, label='99% Confidence level')
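# No fixed axis limits: scores of standardized data span only a few units,
# so matplotlib's automatic limits keep the points and ellipses visible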
plt.xlabel(f'PC1 ({explained_var[0]*100:.2f}%)', fontsize=14, labelpad=10, fontweight='bold')
plt.ylabel(f'PC2 ({explained_var[1]*100:.2f}%)', fontsize=14, labelpad=10, fontweight='bold')
plt.title("Hotelling's T² Ellipse from PCA Scores", fontsize=16, pad=10, fontweight='bold')
plt.legend(
    loc='upper left', fontsize=10, frameon=True, framealpha=0.9, 
    edgecolor='black', shadow=True, facecolor='white', borderpad=1
    )
plt.show()

Figure: Hotelling's T² Ellipse from PCA Scores
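To make the geometry behind the ellipses in the example above more concrete, here is a minimal sketch of how an axis-aligned Hotelling ellipse can be derived from two score columns. It assumes one common convention for the T² limit, 2(n-1)/(n-2) × F(2, n-2), which may differ from what pyEllipse uses internally. The helper names `hotelling_cutoff_2d` and `ellipse_from_scores` are hypothetical and not part of the package.

```python
import numpy as np

def hotelling_cutoff_2d(n, conf=0.95):
    """T² limit for two components: 2(n-1)/(n-2) * F_{2, n-2}(conf).

    With 2 numerator degrees of freedom the F quantile has a closed form;
    for other dimensions scipy.stats.f.ppf would be the usual route.
    """
    d2 = n - 2
    f_quantile = (d2 / 2.0) * ((1.0 - conf) ** (-2.0 / d2) - 1.0)
    return 2.0 * (n - 1) / d2 * f_quantile

def ellipse_from_scores(scores, conf=0.95, n_points=200):
    """Boundary points of an axis-aligned ellipse on the first two score columns."""
    t = np.asarray(scores, dtype=float)[:, :2]
    n = t.shape[0]
    limit = hotelling_cutoff_2d(n, conf)
    # Semi-axis lengths: sqrt(cutoff * component variance)
    a, b = np.sqrt(limit * t.var(axis=0, ddof=1))
    theta = np.linspace(0.0, 2.0 * np.pi, n_points)
    return a * np.cos(theta), b * np.sin(theta), (a, b)

# Synthetic, centered scores stand in for real PCA output
rng = np.random.default_rng(1)
scores = rng.normal(size=(100, 2)) * np.array([2.0, 0.5])
scores -= scores.mean(axis=0)

x, y, (a, b) = ellipse_from_scores(scores, conf=0.95)
print(f"semi-axes: a={a:.2f}, b={b:.2f}")
```

The ellipse is axis-aligned because PCA scores are uncorrelated; for correlated raw variables the covariance matrix would also determine a rotation.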

Example 2: Grouped Confidence Ellipses

wine_df['PC1'] = pca_scores[:, 0]
wine_df['PC2'] = pca_scores[:, 1]

colors = ['red', 'blue', 'green']
cultivars = wine_df['Cultivar'].unique()
color_map = {cultivar: color for cultivar, color in zip(cultivars, colors)}
point_colors = wine_df['Cultivar'].map(color_map)

# Plot PCA scores with confidence ellipses for each cultivar
plt.figure(figsize=(8, 6))

for i, cultivar in enumerate(cultivars):
    mask = wine_df['Cultivar'] == cultivar
    plt.scatter(
        wine_df.loc[mask, 'PC1'], wine_df.loc[mask, 'PC2'], # type: ignore
        c=colors[i], alpha=0.6, s=70, label=cultivar
        ) 

ellipse_coords = confidence_ellipse(
    data=wine_df,
    x='PC1',
    y='PC2',
    group_by='Cultivar',
    conf_level=0.95,
    robust=True,
    distribution='hotelling'
)

for i, cultivar in enumerate(cultivars):
    ellipse_data = ellipse_coords[ellipse_coords['Cultivar'] == cultivar]
    plt.plot(
        ellipse_data['x'], ellipse_data['y'], 
        color=colors[i], linewidth=1, linestyle='-', label=f'{cultivar} (95% CI)'
        )

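# No fixed axis limits: scores of standardized data span only a few units,
# so matplotlib's automatic limits keep the points and ellipses visible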
plt.xlabel(f'PC1 ({explained_var[0]*100:.2f}%)', fontsize=14, labelpad=10, fontweight='bold')
plt.ylabel(f'PC2 ({explained_var[1]*100:.2f}%)', fontsize=14, labelpad=10, fontweight='bold')
plt.title("PCA Scores with Cultivar Group Confidence Ellipses", fontsize=16, pad=10, fontweight='bold')
plt.legend(
    loc='upper left', fontsize=10, frameon=True, framealpha=0.9, 
    edgecolor='black', shadow=True, facecolor='white', borderpad=1
    )
plt.show()

Figure: PCA Scores with Cultivar Group Confidence Ellipses
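For readers curious about what a per-group confidence ellipse involves, the sketch below builds one from the classical mean and covariance of each group under a bivariate normal approximation. This is a simplified stand-in: it uses neither the robust estimator nor the Hotelling scaling requested in the example above, and `normal_ellipse` and the synthetic groups are hypothetical.

```python
import numpy as np

def normal_ellipse(x, y, conf_level=0.95, n_points=100):
    """Ellipse boundary for bivariate data under a normal approximation."""
    data = np.column_stack([x, y]).astype(float)
    center = data.mean(axis=0)
    cov = np.cov(data, rowvar=False)
    # Chi-square quantile with 2 degrees of freedom: closed form -2*ln(1 - p)
    radius2 = -2.0 * np.log(1.0 - conf_level)
    # Eigen-decomposition gives the ellipse orientation and axis variances
    eigvals, eigvecs = np.linalg.eigh(cov)
    theta = np.linspace(0.0, 2.0 * np.pi, n_points)
    unit_circle = np.stack([np.cos(theta), np.sin(theta)])
    # Scale the unit circle by the axis lengths, rotate, then shift to the center
    boundary = eigvecs @ (np.sqrt(radius2 * eigvals)[:, None] * unit_circle)
    return boundary.T + center

# Two synthetic groups with different centers and correlations
rng = np.random.default_rng(2)
groups = {
    "A": rng.multivariate_normal([0, 0], [[2.0, 0.8], [0.8, 1.0]], size=200),
    "B": rng.multivariate_normal([4, 1], [[1.0, -0.3], [-0.3, 0.5]], size=200),
}
ellipses = {name: normal_ellipse(pts[:, 0], pts[:, 1]) for name, pts in groups.items()}
print({name: e.shape for name, e in ellipses.items()})
```

Every boundary point sits at the same Mahalanobis distance from its group center, so each ellipse follows the orientation and spread of its own group.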

Example 3: Grouped 3D Confidence Ellipsoids

wine_df['PC1'] = pca_scores[:, 0]
wine_df['PC2'] = pca_scores[:, 1]
wine_df['PC3'] = pca_scores[:, 2]

colors = ['red', 'blue', 'green']
light_colors = ['lightcoral', 'lightblue', 'lightgreen']
cultivars = wine_df['Cultivar'].unique()

ellipse_coords = confidence_ellipse(
    data=wine_df,
    x='PC1',
    y='PC2',
    z='PC3',
    group_by='Cultivar',
    conf_level=0.95,
    robust=True,
    distribution='hotelling'
)

fig = plt.figure(figsize=(10, 6), facecolor='white')
ax = fig.add_subplot(111, projection='3d', facecolor='white')

for i, cultivar in enumerate(cultivars):
    mask = wine_df['Cultivar'] == cultivar
    ax.scatter(
        wine_df.loc[mask, 'PC1'], 
        wine_df.loc[mask, 'PC2'], 
        wine_df.loc[mask, 'PC3'], # type: ignore
        c=colors[i], 
        alpha=0.8, 
        s=50, 
        label=cultivar, 
        edgecolors='black', 
        linewidth=0.5
        )
     
    ellipse_data = ellipse_coords[ellipse_coords['Cultivar'] == cultivar]
    n_points = int(np.sqrt(len(ellipse_data)))
    
    x_2d = ellipse_data['x'].values.reshape(n_points, -1)
    y_2d = ellipse_data['y'].values.reshape(n_points, -1)
    z_2d = ellipse_data['z'].values.reshape(n_points, -1)
    
    ax.plot_surface(
        x_2d, 
        y_2d, 
        z_2d, 
        color=light_colors[i], 
        alpha=0.4, 
        linewidth=0, 
        antialiased=True
        )

ax.set_xlabel(f'PC1 ({explained_var[0]*100:.2f}%)', fontsize=12, labelpad=5, fontweight='bold')
ax.set_ylabel(f'PC2 ({explained_var[1]*100:.2f}%)', fontsize=12, labelpad=5, fontweight='bold')
ax.set_zlabel(f'PC3 ({explained_var[2]*100:.2f}%)', fontsize=12, labelpad=1, fontweight='bold')
ax.set_title('3D PCA Scores with 95% Confidence Ellipsoids', fontsize=16, fontweight='bold')
ax.legend(
    loc='upper right', fontsize=10, frameon=True, framealpha=0.9, 
    edgecolor='black', shadow=True, facecolor='white', borderpad=1
    )
ax.grid(True, alpha=0.3, color='gray')
ax.view_init(elev=20, azim=65)
plt.tight_layout()
plt.show()

Figure: 3D PCA Scores with 95% Confidence Ellipsoids
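The reshape step in the example above suggests that the ellipsoid coordinates arrive as a flat table laid out on a square parameter grid. The sketch below shows one way such a surface can be generated and reshaped, assuming a known center, covariance matrix, and squared radius. It is an illustration only; `ellipsoid_grid` is a hypothetical helper, and the grid layout pyEllipse actually uses is an assumption inferred from that reshape.

```python
import numpy as np

def ellipsoid_grid(center, cov, radius2, n_points=30):
    """Points on the ellipsoid (p - c)^T inv(cov) (p - c) = radius2."""
    u = np.linspace(0.0, 2.0 * np.pi, n_points)   # azimuth
    v = np.linspace(0.0, np.pi, n_points)         # polar angle
    # Unit sphere sampled on the (u, v) grid, stacked as a 3 x n^2 array
    sphere = np.stack([
        np.outer(np.cos(u), np.sin(v)).ravel(),
        np.outer(np.sin(u), np.sin(v)).ravel(),
        np.outer(np.ones_like(u), np.cos(v)).ravel(),
    ])
    # The Cholesky factor maps the unit sphere onto the covariance ellipsoid
    chol = np.linalg.cholesky(np.asarray(cov, dtype=float))
    points = (np.sqrt(radius2) * (chol @ sphere)).T + np.asarray(center, dtype=float)
    return points  # flat (n^2, 3) array, one row per surface point

cov = np.array([[4.0, 1.0, 0.5], [1.0, 2.0, 0.3], [0.5, 0.3, 1.0]])
# 7.81 is roughly the 95% chi-square quantile for 3 degrees of freedom
flat = ellipsoid_grid(center=[1.0, -1.0, 0.0], cov=cov, radius2=7.81)

# Recover the square grids that plot_surface expects
n = int(np.sqrt(len(flat)))
x_2d, y_2d, z_2d = (flat[:, i].reshape(n, -1) for i in range(3))
print(x_2d.shape)  # (30, 30)
```

The reshape only works because the flat rows keep the row-major order of the original grid; shuffling or filtering the rows first would break the surface.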

Key Differences Between Functions

Feature	`hotelling_parameters`	`hotelling_coordinates`	`confidence_ellipse`
Input	Component scores	Component scores	Raw data
Purpose	T² statistics	Plot coordinates	Plot coordinates
Grouping	--	--	Yes
Robust	--	--	Yes
2D/3D	2D only for ellipse params	Both	Both
Distribution	Hotelling only	Hotelling only	Normal or Hotelling
Use Case	Outlier detection, QC	Visualizing PCA	Exploratory data analysis

When to Use Each Function

Use `hotelling_parameters` when:

You need T² statistics for outlier detection
You want confidence cutoff values
You're performing quality control or process monitoring
You need ellipse parameters (semi-axes lengths)

Use `hotelling_coordinates` when:

You have PCA/PLS component scores
You want to visualize confidence regions on score plots
You need precise control over which components to plot
You're creating publication-quality figures from multivariate models

Use `confidence_ellipse` when:

You're working with raw data (not scores)
You need to compare multiple groups
You want robust estimation for outlier-resistant analysis
You need flexibility in distribution choice (normal vs Hotelling)
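To give a feel for the normal versus Hotelling choice, the sketch below compares the squared ellipse radius under the two scalings for a 2D ellipse. It assumes the common Hotelling convention 2(n-1)/(n-2) × F(2, n-2), which is not confirmed to be the one pyEllipse applies. Both function names are hypothetical.

```python
import math

def chi2_radius2(conf):
    """Squared radius under the normal approximation (chi-square, 2 dof)."""
    return -2.0 * math.log(1.0 - conf)

def hotelling_radius2(n, conf):
    """Squared radius 2(n-1)/(n-2) * F_{2, n-2}(conf).

    Uses the closed-form F quantile for 2 numerator degrees of freedom,
    which simplifies the product to (n-1) * ((1-conf)^(-2/(n-2)) - 1).
    """
    d2 = n - 2
    return (n - 1) * ((1.0 - conf) ** (-2.0 / d2) - 1.0)

# Print the sample size, the Hotelling radius, and the normal radius
for n in (10, 30, 100, 1000):
    print(n, round(hotelling_radius2(n, 0.95), 3), round(chi2_radius2(0.95), 3))
```

Under this convention the Hotelling radius is noticeably larger for small groups and approaches the normal one as the sample size grows, so the choice matters most when groups contain few observations.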

References

Hotelling, H. (1931). The generalization of Student's ratio. Annals of Mathematical Statistics, 2(3), 360-378.
Brereton, R. G. (2016). Hotelling's T-squared distribution, its relationship to the F distribution and its use in multivariate space. Journal of Chemometrics, 30(1), 18-21.
Raymaekers, J., & Rousseeuw, P. J. (2021). Fast robust correlation for high-dimensional data. Technometrics, 63(2), 184-198.
Jackson, J. E. (1991). A User's Guide to Principal Components. Wiley.

Release history

0.1.3 (Oct 19, 2025), this version
0.1.2 (Oct 19, 2025)
0.1.1 (Oct 16, 2025)
0.1.0 (Oct 16, 2025)

Download files

Source Distribution

pyellipse-0.1.3.tar.gz (13.3 kB view details)

Uploaded Oct 19, 2025 Source

Built Distribution

pyellipse-0.1.3-py3-none-any.whl (12.4 kB view details)

Uploaded Oct 19, 2025 Python 3

File details

Details for the file pyellipse-0.1.3.tar.gz.

File metadata

Download URL: pyellipse-0.1.3.tar.gz
Upload date: Oct 19, 2025
Size: 13.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.2.0 CPython/3.13.7 Darwin/24.6.0

File hashes

Hashes for pyellipse-0.1.3.tar.gz
Algorithm	Hash digest
SHA256	`4a3add0fdd984af31f1f6d8abb6b5c4e0e75eb6f2bbfc607d5bd6d5ccc68508f`
MD5	`5dd9054888f7ded92eaebc82964f0d65`
BLAKE2b-256	`602a3aef4bc0aeecec019ac1de9d7e3636c3244ab9bfc80fda4f4ec1a4549bc2`

File details

Details for the file pyellipse-0.1.3-py3-none-any.whl.

File metadata

Download URL: pyellipse-0.1.3-py3-none-any.whl
Upload date: Oct 19, 2025
Size: 12.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.2.0 CPython/3.13.7 Darwin/24.6.0

File hashes

Hashes for pyellipse-0.1.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`17d9428b9908c6d6fbfa15a72176b4b28564b247a0bce52facfef33d44321d27`
MD5	`89d7cc752fff164a8e57375e7866f8e7`
BLAKE2b-256	`58a51967237d6b8d5b8b88df080886b13c4ceedab66bab9c64ee5de5a098590a`
