Adaptive PCA with parallel scaling and dimensionality reduction

Project description

AdaptivePCA

AdaptivePCA is a flexible, scalable Python package that enables dimensionality reduction with PCA, automatically selecting the best scaler and the optimal number of components to meet a specified variance threshold. Built for efficiency, AdaptivePCA includes parallel processing capabilities to speed up large-scale data transformations, making it ideal for data scientists and machine learning practitioners working with high-dimensional datasets.
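This page does not spell out how the scaler and the number of components are chosen. The following is a minimal sketch of one way such a selection could work, written directly against scikit-learn. The function names `evaluate_scaler` and `select_configuration`, the use of worker threads for the parallel path, and the ranking rule (fewest components first, then highest cumulative variance) are assumptions made for illustration and may differ from what AdaptivePCA actually does.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import MinMaxScaler, StandardScaler


def evaluate_scaler(scaler, X, variance_threshold, max_components):
    """Scale X, fit PCA, and return (k, cumulative variance at k) for the
    smallest k whose cumulative explained variance reaches the threshold.
    Returns None if the threshold is not reached within max_components."""
    X_scaled = scaler.fit_transform(X)
    # PCA cannot use more components than min(n_samples, n_features).
    n_components = min(max_components, *X_scaled.shape)
    pca = PCA(n_components=n_components).fit(X_scaled)
    cumulative = np.cumsum(pca.explained_variance_ratio_)
    reached = np.flatnonzero(cumulative >= variance_threshold)
    if reached.size == 0:
        return None
    k = int(reached[0]) + 1
    return k, float(cumulative[k - 1])


def select_configuration(X, variance_threshold=0.95, max_components=10, parallel=False):
    """Return (scaler name, k, cumulative variance) for the preferred scaler, or None."""
    scalers = {"StandardScaler": StandardScaler(), "MinMaxScaler": MinMaxScaler()}

    def run(scaler):
        return evaluate_scaler(scaler, X, variance_threshold, max_components)

    if parallel:
        # Evaluate the candidate scalers concurrently in worker threads.
        with ThreadPoolExecutor() as pool:
            results = list(pool.map(run, scalers.values()))
    else:
        results = [run(scaler) for scaler in scalers.values()]

    best = None
    for name, result in zip(scalers, results):
        if result is None:
            continue
        k, score = result
        # Assumed ranking: fewer components first, then higher explained variance.
        if best is None or (k, -score) < (best[1], -best[2]):
            best = (name, k, score)
    return best


# Small illustrative dataset with three perfectly correlated features.
X = np.array(
    [[1, 10, 2], [2, 9, 4], [3, 8, 6], [4, 7, 8], [5, 6, 10]], dtype=float
)
best_sequential = select_configuration(X, variance_threshold=0.95, max_components=2)
best_parallel = select_configuration(X, variance_threshold=0.95, max_components=2, parallel=True)
print(best_sequential, best_parallel)
```

Because the three features in this toy dataset are linearly dependent, a single component is enough to pass the 0.95 threshold under either scaler. With only two candidate scalers, threads offer little speedup; the parallel branch is there only to illustrate the idea.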

Features

  • Automatic Component Selection: Automatically selects the optimal number of principal components based on a specified variance threshold.
  • Scaler Selection: Compares multiple scalers (StandardScaler and MinMaxScaler) to find the best fit for the data.
  • Parallel Processing: Optional concurrent scaling for faster computation.
  • Easy Integration: Built on top of widely used libraries such as scikit-learn and NumPy.

Installation

You can install AdaptivePCA via pip:

    pip install adaptivepca

Usage

Import and Initialize

    from adaptivepca import AdaptivePCA
    import pandas as pd

    # Load your dataset as a pandas DataFrame
    X = pd.read_csv("your_data.csv")

Basic Usage

Initialize AdaptivePCA and fit it to your data:

    # Initialize AdaptivePCA with the desired variance threshold and maximum number of components
    adaptive_pca = AdaptivePCA(variance_threshold=0.95, max_components=10)

    # Fit and transform the data
    X_transformed = adaptive_pca.fit_transform(X)

Parallel Processing

For larger datasets, enable parallel processing to speed up computations:

    # Fit AdaptivePCA with parallel processing
    adaptive_pca.fit(X, parallel=True)

Accessing Best Parameters

After fitting, you can retrieve the best scaler, number of components, and explained variance score:

    print(f"Best Scaler: {adaptive_pca.best_scaler}")
    print(f"Optimal Components: {adaptive_pca.best_n_components}")
    print(f"Explained Variance Score: {adaptive_pca.best_explained_variance}")

Parameters

  • variance_threshold (float): Desired variance threshold for component selection. Default is 0.95.
  • max_components (int): Maximum number of PCA components to consider. Default is 10.

Methods

  • fit(X, parallel=False): Fits AdaptivePCA to the dataset X. Use parallel=True to enable parallel processing.
  • transform(X): Transforms the dataset X using the previously fitted configuration.
  • fit_transform(X): Combines the fit and transform steps in one call.

Example

    from adaptivepca import AdaptivePCA
    import pandas as pd

    # Example dataset
    X = pd.DataFrame({
        'feature1': [1, 2, 3, 4, 5],
        'feature2': [10, 9, 8, 7, 6],
        'feature3': [2, 4, 6, 8, 10]
    })

    adaptive_pca = AdaptivePCA(variance_threshold=0.95, max_components=2)
    X_transformed = adaptive_pca.fit_transform(X)

    # Retrieve best configuration details
    print(f"Best Scaler: {adaptive_pca.best_scaler}")
    print(f"Optimal Components: {adaptive_pca.best_n_components}")
    print(f"Explained Variance Score: {adaptive_pca.best_explained_variance}")

Dependencies

  • scikit-learn>=0.24
  • numpy>=1.19
  • pandas>=1.1

License

This project is licensed under the MIT License. See the LICENSE file for details.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

adaptivepca-1.0.0.tar.gz (4.5 kB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

adaptivepca-1.0.0-py3-none-any.whl (4.8 kB)

Uploaded Python 3
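For readers unsure how to read the wheel file name above, here is a small sketch that splits it into its standard fields (distribution, version, optional build tag, Python tag, ABI tag, platform tag). The helper name `parse_wheel_name` is hypothetical and the parsing is deliberately simple; it is not part of AdaptivePCA or pip.

```python
def parse_wheel_name(filename):
    """Split a wheel file name into its dash-separated fields.

    Expected layout:
    {distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        name, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected number of fields in {filename!r}")
    return {
        "distribution": name,
        "version": version,
        "build": build,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


fields = parse_wheel_name("adaptivepca-1.0.0-py3-none-any.whl")
print(fields)
```

The tags `py3`, `none`, and `any` indicate a pure-Python wheel that targets Python 3, has no ABI requirement, and is not tied to a specific platform.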

File details

Details for the file adaptivepca-1.0.0.tar.gz.

File metadata

  • Download URL: adaptivepca-1.0.0.tar.gz
  • Upload date:
  • Size: 4.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.3

File hashes

Hashes for adaptivepca-1.0.0.tar.gz
Algorithm    Hash digest
SHA256       dfceba8f71f43db78c40aeb82d1e06acb668b55a75ad2e9ae096f08bb858865c
MD5          c646106d55db95bff8d4a4a9924f326a
BLAKE2b-256  649c557c3706c1978df838ab5ba8d06dc6954a28a74f2905a9c9ca411463cd0d
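The SHA256 digest listed above can be used to check a downloaded copy of the source distribution. The sketch below uses only Python's standard `hashlib` module; the helper names `sha256_of` and `verify_download`, and the assumption that the file sits in the current directory, are illustrative.

```python
import hashlib

# SHA256 digest published for adaptivepca-1.0.0.tar.gz on this page.
EXPECTED_SHA256 = "dfceba8f71f43db78c40aeb82d1e06acb668b55a75ad2e9ae096f08bb858865c"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest matches the expected value."""
    return sha256_of(path) == expected
```

For example, `verify_download("adaptivepca-1.0.0.tar.gz")` should return True for an intact download saved under that name, and False if the file was corrupted or altered.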


File details

Details for the file adaptivepca-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: adaptivepca-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 4.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.3

File hashes

Hashes for adaptivepca-1.0.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       27aa9874b0298934e933f8d187f27bb03fc31bdc0ca3d958490b6c6ef675e55e
MD5          70da688e7e6112ee07d196462ac31ab4
BLAKE2b-256  67cd315289b64e2f638ded3d2836f86dbb6af1b1cf12fd77c090b1c32e64b293

