pca is a Python package that performs principal component analysis and makes insightful plots.

pca

pca is a Python package that performs principal component analysis and creates insightful plots:
Biplot to plot the loadings
Explained variance
Scatter plot with the loadings

Method overview

# fit
model=pca.fit(X)
# biplot
ax=pca.biplot(model)
ax=pca.biplot3d(model)
# plot explained variance
ax = pca.plot(model)
# Normalize out components from your dataset
Xnorm=pca.norm(X)

Installation
Requirements
Quick Start
Contribute
Citation
Maintainers
License

Installation

Install pca from PyPI (recommended). pca is compatible with Python 3.6+ and runs on Linux, MacOS X and Windows.
It is distributed under the MIT license.

Requirements

Creating a new environment is optional.

conda create -n env_pca python=3.6
conda activate env_pca
pip install numpy matplotlib scikit-learn

Quick Start

pip install pca

Alternatively, install pca from the GitHub source:

git clone https://github.com/erdogant/pca.git
cd pca
python setup.py install

Import pca package

import pca as pca

Load example data

import numpy as np
from sklearn.datasets import load_iris
# Keep the dataset object so that the names and targets can be read from it
iris = load_iris()
X = iris.data
label = iris.feature_names
labx = iris.target

X looks like this:

X=array([[5.1, 3.5, 1.4, 0.2],
         [4.9, 3. , 1.4, 0.2],
         [4.7, 3.2, 1.3, 0.2],
         [4.6, 3.1, 1.5, 0.2],
         ...
         [5. , 3.6, 1.4, 0.2],
         [5.4, 3.9, 1.7, 0.4],
         [4.6, 3.4, 1.4, 0.3],
         [5. , 3.4, 1.5, 0.2],
         ...])

labx=[0, 0, 0, 0,...,2, 2, 2, 2, 2]
label=['label1','label2','label3','label4']

Reduce dimensions with PCA and plot the explained variance

# Fit
model = pca.fit(X)
# Plot the explained variance. The total captured variance is 1 and PC1 captures more than 90% of it.
ax = pca.plot(model)
# Biplot in 2D, which shows the directions of the features and their weights of influence
ax  = pca.biplot(model)
# Biplot in 3D
ax  = pca.biplot3d(model)
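
To make the explained-variance plot concrete, here is a minimal sketch of how such ratios can be computed with NumPy alone. It is not taken from the pca source: the helper name explained_variance_ratio and the random stand-in data X_demo are assumptions made for illustration, and the iris values are not used.

```python
import numpy as np

def explained_variance_ratio(X):
    """Return the fraction of the total variance captured by each principal component."""
    Xc = X - X.mean(axis=0)                    # center every feature
    s = np.linalg.svd(Xc, compute_uv=False)    # singular values, largest first
    var = s ** 2 / (X.shape[0] - 1)            # variance along each component
    return var / var.sum()

rng = np.random.default_rng(0)
# Hypothetical stand-in data: 150 samples, 4 features with very different spreads
X_demo = rng.normal(size=(150, 4)) * np.array([5.0, 1.0, 0.5, 0.2])

ratio = explained_variance_ratio(X_demo)
print(ratio)             # per-component fraction, sorted from largest to smallest
print(ratio.cumsum())    # cumulative curve, ending at 1 up to rounding
```

The cumulative curve is the quantity a plot of explained variance typically shows: it rises quickly when a few components dominate the data.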

Reduce dimensions as above but now plot with labx and label names

model = pca.fit(X, labx=labx, feat=label)
ax  = pca.biplot(model)
ax  = pca.biplot3d(model)
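
A biplot combines two things: the samples projected onto the first components, and one arrow per feature showing its weight on those components. The sketch below computes both with NumPy. It is an illustration under assumptions, not the package's implementation: the feature names, the stand-in data, and the variable names scores and loadings are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical feature names and stand-in data (not the iris values)
feat = ['f1', 'f2', 'f3', 'f4']
X_demo = rng.normal(size=(100, 4)) * np.array([3.0, 2.0, 1.0, 0.5])

Xc = X_demo - X_demo.mean(axis=0)
# Rows of Vt are the principal axes; Vt[i, j] is the weight of feature j on PC(i+1)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)

scores = Xc @ Vt[:2].T    # sample coordinates, drawn as points in a 2D biplot
loadings = Vt[:2].T       # one (PC1, PC2) pair per feature, drawn as arrows

for name, (w1, w2) in zip(feat, loadings):
    print(f"{name}: PC1={w1:+.2f}, PC2={w2:+.2f}")
```

Features with a large absolute weight on a component are the ones that drive the separation along that axis.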

Reduce dimensions to the number of components that capture 95% of the explained variance

# Fit the model and determine the number of components required to capture 95% of the explained variance.
model = pca.fit(X, components=0.95)
# Plot the explained variance. Two components are required to capture 95% of the variance.
ax = pca.plot(model)
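
Choosing the number of components from a variance threshold comes down to a cumulative sum. The following sketch shows one way to do it. The helper n_components_for is hypothetical and the ratios are made-up illustration values, so it should not be read as the package's own logic.

```python
import numpy as np

def n_components_for(ratio, threshold=0.95):
    """Smallest number of leading components whose cumulative ratio reaches the threshold."""
    cumulative = np.cumsum(ratio)
    # Index of the first cumulative value >= threshold, converted to a count
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(ratio))   # guard against rounding when the threshold is close to 1

# Hypothetical explained-variance ratios, for illustration only
ratio = np.array([0.92, 0.05, 0.02, 0.01])
k = n_components_for(ratio, 0.95)
print(k)  # 2, because 0.92 < 0.95 <= 0.97
```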

Reduce dimensions to exactly 2d and 3d

# Set components=2 to reduce to 2d
model = pca.fit(X, components=2)
# Set components=3 to reduce to 3d
model = pca.fit(X, components=3)

PCA normalization.

# Normalize out the first component (or more) from the data.
# This is useful if the data is separated in its first component(s) by unwanted or biased variance, such as sex or experiment location.

print(X.shape)
(150, 4)

# Normalize out 1st component and return data
Xnorm = pca.norm(X, pcexclude=[1])

print(Xnorm.shape)
(150, 4)

# In this case, PC1 is "removed" and PC2 has become PC1, etc.
# Refit on the normalized data so that the biplot reflects the change.
model = pca.fit(Xnorm)
ax = pca.biplot(model)
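
One common way to normalize out a component is to project the data onto the principal axes, zero the unwanted scores, and map the result back to feature space. The sketch below follows that recipe with NumPy. Whether pca.norm works exactly this way is an assumption: the function remove_components and the stand-in data are hypothetical, and only the 1-based pcexclude convention is borrowed from the call above.

```python
import numpy as np

def remove_components(X, pcexclude=(1,)):
    """Reconstruct X without the listed principal components (1-based indices)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt.T                       # coordinates in PC space
    drop = [pc - 1 for pc in pcexclude]      # convert to 0-based indices
    scores[:, drop] = 0.0                    # discard the unwanted components
    return scores @ Vt + mean                # map back to the original feature space

rng = np.random.default_rng(0)
# Hypothetical stand-in data with one dominant direction of variance
X_demo = rng.normal(size=(150, 4)) * np.array([5.0, 1.0, 0.5, 0.2])

Xnorm_demo = remove_components(X_demo, pcexclude=[1])
print(Xnorm_demo.shape)  # same shape as X_demo; no variance remains along the old PC1
```

Because the data is mapped back to feature space, the shape is unchanged, and a new PCA on the result ranks the former PC2 first.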

Citation

Please cite pca in your publications if this is useful for your research. Here is an example BibTeX entry:

@misc{erdogant2019pca,
  title={pca},
  author={Erdogan Taskesen},
  year={2019},
  howpublished={\url{https://github.com/erdogant/pca}},
}

Maintainers

Erdogan Taskesen, github: erdogant

Contribute

Contributions are welcome.

© Copyright

See LICENSE for details.
