XMPro Dimensionality is a Python library for dimensionality reduction.

XMdim

XMdim is a Python library for dimensionality reduction on embedding data, with a primary focus on Principal Component Analysis (PCA). It covers reducing data dimensions, analyzing explained variance, and reconstructing data from the reduced representation.

Features

  • PCA Transformation: Perform PCA on your embedding data with a configurable number of components.
  • Flexible Scaling: Apply standard scaling or min-max scaling before PCA.
  • Variance Analysis: Calculate and retrieve explained variance ratios and cumulative explained variance.
  • Component Loadings: Access the loadings (principal components) of the PCA.
  • Data Reconstruction: Inverse transform PCA results to reconstruct an approximation of the original data.
  • Reconstruction Error: Calculate the mean squared error between original and reconstructed data.
  • Optimal Components: Find the optimal number of components for a given variance threshold.
  • New Data Projection: Project new data onto the existing PCA space.
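
To make the scaling option concrete, here is a minimal sketch of scaling before PCA written directly against scikit-learn, which XMdim lists as a dependency. It is not XMdim's implementation: the helper `scale_then_pca` and the toy matrix `X` are hypothetical names introduced only for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Toy data (hypothetical): 6 samples, 3 features on very different scales.
X = np.array([
    [1.0, 200.0, 0.01],
    [2.0, 180.0, 0.03],
    [3.0, 240.0, 0.02],
    [4.0, 210.0, 0.05],
    [5.0, 260.0, 0.04],
    [6.0, 230.0, 0.06],
])


def scale_then_pca(data, scaler, n_components=2):
    """Scale the data with the given scaler, then fit PCA on the scaled data.

    Returns the projected data and the fitted PCA object.
    """
    scaled = scaler.fit_transform(data)
    pca = PCA(n_components=n_components)
    return pca.fit_transform(scaled), pca


# Standard scaling: each feature gets zero mean and unit variance.
standard_proj, standard_pca = scale_then_pca(X, StandardScaler())

# Min-max scaling: each feature is mapped to the range [0, 1].
minmax_proj, minmax_pca = scale_then_pca(X, MinMaxScaler())

print("Standard scaling, explained variance ratio:", standard_pca.explained_variance_ratio_)
print("Min-max scaling, explained variance ratio:", minmax_pca.explained_variance_ratio_)
```

Without scaling, the feature with the largest numeric range tends to dominate the first component, which is why a scaling step is commonly applied before PCA.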

Installation

Install XMdim using pip:

pip install xmdim

Usage

Here's a basic example of how to use XMdim:

from xmdim import PCAAnalyzer, ScalingType

# Sample embeddings
embeddings = {
    'key1': [[1, 2, 3, 4], [4, 5, 6, 7], [7, 8, 9, 10], [10, 11, 12, 13]],
    'key2': [[2, 3, 4, 5], [5, 6, 7, 8], [8, 9, 10, 11], [11, 12, 13, 14]]
}

# Create a PCAAnalyzer instance
analyzer = PCAAnalyzer(embeddings)

# Perform PCA
transformed_data = analyzer.perform_pca(n_components=2, scaling=ScalingType.STANDARD)

# Get explained variance ratio
explained_variance = analyzer.get_explained_variance_ratio()

# Get cumulative explained variance
cumulative_variance = analyzer.get_cumulative_explained_variance()

print("Transformed Data:", transformed_data)
print("Explained Variance Ratio:", explained_variance)
print("Cumulative Explained Variance:", cumulative_variance)
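
For readers who want to see what the two variance quantities mean, the following sketch computes them with scikit-learn on the same sample values. It assumes that the rows under every key are stacked into a single matrix; how `PCAAnalyzer` actually treats the dictionary is not stated here, so treat that step as an assumption rather than a description of XMdim.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

embeddings = {
    "key1": [[1, 2, 3, 4], [4, 5, 6, 7], [7, 8, 9, 10], [10, 11, 12, 13]],
    "key2": [[2, 3, 4, 5], [5, 6, 7, 8], [8, 9, 10, 11], [11, 12, 13, 14]],
}

# Assumption: rows from all keys are stacked into one (8, 4) matrix.
X = np.vstack([np.asarray(rows, dtype=float) for rows in embeddings.values()])

# Standard scaling followed by a 2-component PCA.
X_scaled = StandardScaler().fit_transform(X)
pca = PCA(n_components=2).fit(X_scaled)

# Fraction of total variance captured by each component.
ratio = pca.explained_variance_ratio_

# Running total of those fractions.
cumulative = np.cumsum(ratio)

print("Explained variance ratio:", ratio)
print("Cumulative explained variance:", cumulative)
```

In this sample data every feature is a shifted copy of the first one, so under the stacking assumption the first component captures essentially all of the variance. Real embeddings will spread variance across more components.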

Advanced Usage

Loadings and Data Reconstruction

# Continues from the basic example: `analyzer` has already run perform_pca()
# Get loadings
loadings = analyzer.get_loadings()

# Inverse transform
reconstructed_data = analyzer.inverse_transform()

# Get reconstruction error
error = analyzer.get_reconstruction_error()

print("Loadings:", loadings)
print("Reconstructed Data:", reconstructed_data)
print("Reconstruction Error:", error)
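
The reconstruction step can be sketched with scikit-learn as follows. This is an illustration of the underlying idea under stated assumptions, not XMdim's code: the random matrix `X` is made up, and `components_` is used for what this README calls loadings (principal components).

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data: 30 samples with 5 features.
rng = np.random.default_rng(42)
X = rng.normal(size=(30, 5))

# Reduce to 2 components.
pca = PCA(n_components=2)
Z = pca.fit_transform(X)

# Principal axes, one row per component: shape (2, 5).
loadings = pca.components_

# Map the reduced data back to the original 5-dimensional space.
X_hat = pca.inverse_transform(Z)

# Mean squared error between the original and the reconstruction.
mse = float(np.mean((X - X_hat) ** 2))

# Keeping every component makes the reconstruction exact up to float error.
full = PCA(n_components=5).fit(X)
X_full = full.inverse_transform(full.transform(X))
mse_full = float(np.mean((X - X_full) ** 2))

print("Loadings shape:", loadings.shape)
print("MSE with 2 components:", mse)
print("MSE with all 5 components:", mse_full)
```

The error shrinks as more components are kept, which makes it a practical check on how much information a given reduction discards.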

Optimal Components and New Data Projection

# Continues from the basic example: `analyzer` has already run perform_pca()
# Get optimal number of components
optimal_components = analyzer.get_optimal_components(variance_threshold=0.95)

# Project new data
new_data = {
    'key1': [[2, 3, 4, 5], [5, 6, 7, 8]],
    'key2': [[3, 4, 5, 6], [6, 7, 8, 9]]
}
projected_data = analyzer.project_new_data(new_data)

print("Optimal Number of Components:", optimal_components)
print("Projected New Data:", projected_data)
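
One common way to choose the number of components for a variance threshold is to take the smallest count whose cumulative explained variance reaches it. The sketch below shows that rule and the projection of new samples using scikit-learn. The function `optimal_components` and the synthetic data are hypothetical, and XMdim may select components differently.

```python
import numpy as np
from sklearn.decomposition import PCA


def optimal_components(X, variance_threshold=0.95):
    """Smallest number of components whose cumulative variance reaches the threshold."""
    cumulative = np.cumsum(PCA().fit(X).explained_variance_ratio_)
    # searchsorted gives the first index where cumulative >= threshold.
    k = int(np.searchsorted(cumulative, variance_threshold)) + 1
    # Guard against float round-off when the threshold is 1.0.
    return min(k, len(cumulative))


# Hypothetical data: 6 features driven by 3 latent factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(100, 3))
mixing = rng.normal(size=(3, 6))
X = latent @ mixing + 0.01 * rng.normal(size=(100, 6))

k = optimal_components(X, variance_threshold=0.95)

# Fit once on the original data, then reuse the fitted model for new samples.
pca = PCA(n_components=k).fit(X)
X_new = rng.normal(size=(5, 3)) @ mixing
projected = pca.transform(X_new)

print("Components needed for 95% variance:", k)
print("Projected shape:", projected.shape)
```

New data must have the same number of features as the data used to fit the model. scikit-learn can also perform this selection itself when `n_components` is given as a float between 0 and 1.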

Dependencies

  • numpy
  • scikit-learn

Contributing

We welcome contributions! Please see our contributing guidelines for more details.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contact

For any queries or support, please contact [your contact information].
