
pca


     Star it if you like it!
  • pca is a python package that performs the principal component analysis and creates insightful plots.
  • Biplot to plot the loadings
  • Explained variance
  • Scatter plot with the loadings

Installation

  • Install pca from PyPI (recommended). pca is compatible with Python 3.6+ and runs on Linux, MacOS X and Windows.
  • It is distributed under the MIT license.

Requirements

  • Creating a new environment is optional, but if you want one:
conda create -n env_pca python=3.6
conda activate env_pca
pip install numpy pandas matplotlib scikit-learn

Installation

pip install pca
  • Install the latest version from the GitHub source:
git clone https://github.com/erdogant/pca.git
cd pca
python setup.py install

Import pca package

from pca import pca

Load example data

import pandas as pd
from sklearn.datasets import load_iris

# Load the iris dataset into a DataFrame (rows indexed by class label)
iris = load_iris()
X = pd.DataFrame(data=iris.data, columns=iris.feature_names, index=iris.target)

# Load pca
from pca import pca

# Initialize to reduce the data to the number of components that explains 95% of the variance.
model = pca(n_components=0.95)

# Alternatively, reduce the data to 3 PCs
model = pca(n_components=3)

# Fit transform
results = model.fit_transform(X)
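
To make the meaning of a fractional n_components more concrete, here is a minimal sketch of the selection rule it implies: keep the smallest number of leading components whose cumulative explained variance reaches the threshold. The helper name `n_components_for` and the ratios are hypothetical and purely illustrative; they are not output from the pca package, and the package may implement the selection differently.

```python
from itertools import accumulate

def n_components_for(explained_ratio, threshold=0.95):
    """Return the smallest number of leading components whose cumulative
    explained-variance ratio reaches `threshold` (hypothetical helper)."""
    for k, total in enumerate(accumulate(explained_ratio), start=1):
        if total >= threshold:
            return k
    # Threshold never reached: keep all components.
    return len(explained_ratio)

# Illustrative explained-variance ratios, sorted from PC1 downwards.
ratios = [0.70, 0.20, 0.08, 0.02]
k = n_components_for(ratios, threshold=0.95)
print(k)
```

With these illustrative ratios, the first two components cover about 90% of the variance, so a third component is needed to pass the 95% threshold.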

X looks like this:

X=array([[5.1, 3.5, 1.4, 0.2],
         [4.9, 3. , 1.4, 0.2],
         [4.7, 3.2, 1.3, 0.2],
         [4.6, 3.1, 1.5, 0.2],
         ...
         [5. , 3.6, 1.4, 0.2],
         [5.4, 3.9, 1.7, 0.4],
         [4.6, 3.4, 1.4, 0.3],
         [5. , 3.4, 1.5, 0.2]])

labx=[0, 0, 0, 0,...,2, 2, 2, 2, 2]
label=['label1','label2','label3','label4']

Make scatter plot

fig, ax = model.scatter()

Make biplot

fig, ax = model.biplot(n_feat=4)
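
As background for the biplot, the sketch below shows one common way to obtain loadings with plain NumPy, assuming "loadings" means the entries of the principal axes (the right singular vectors of the centered data). The function `principal_axes` and the synthetic data are assumptions made for illustration; this is not the package's own code.

```python
import numpy as np

def principal_axes(X):
    """Return (axes, explained_ratio) for data X of shape (n_samples, n_features).

    Row i of `axes` holds the loadings of every feature on PC(i+1).
    """
    Xc = X - X.mean(axis=0)                      # center each feature
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / np.sum(s**2)              # variance share per component
    return Vt, explained

# Synthetic data: four features with decreasing spread.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 4)) * np.array([5.0, 2.0, 1.0, 0.5])

axes, explained = principal_axes(X_demo)

# Indices of the two features with the largest absolute loading on PC1.
top = np.argsort(-np.abs(axes[0]))[:2]
print(axes.shape, top)
```

Ranking features by absolute loading is one plausible way to decide which arrows to draw when only a few features should appear in a biplot.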

Make plot

fig, ax = model.plot()

Make 3d plots

fig, ax = model.scatter3d()
fig, ax = model.biplot3d(n_feat=2)

PCA normalization

Normalize out the 1st (or more) components from the data. This is useful if the data is separated in its first component(s) by unwanted or biased variance, such as sex or experiment location.

print(X.shape)
# (150, 4)

# Normalize out the 1st component and return the data
model = pca()
Xnorm = model.norm(X, pcexclude=[1])

print(Xnorm.shape)
# (150, 4)

# In this case, PC1 is "removed" and PC2 has become PC1, etc.
fig, ax = model.biplot()
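
The idea of normalizing out a component can be sketched with NumPy alone: project the centered data onto its principal axes, zero the scores of the unwanted components, and map the result back to the original feature space. The function `remove_components` is a hypothetical illustration under those assumptions, not the package's `norm` implementation. Note that it uses 0-based indices, whereas `pcexclude=[1]` above appears to count from 1.

```python
import numpy as np

def remove_components(X, exclude=(0,)):
    """Return X with the variance along the given principal components removed.

    `exclude` holds 0-based component indices (hypothetical helper).
    """
    mean = X.mean(axis=0)
    Xc = X - mean                                 # center each feature
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt.T                            # coordinates in PC space
    scores[:, list(exclude)] = 0.0                # drop the unwanted components
    return scores @ Vt + mean                     # back to feature space

# Synthetic data with the same shape as the iris example.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(150, 4)) * np.array([5.0, 2.0, 1.0, 0.5])

X_norm = remove_components(X_demo, exclude=(0,))
print(X_demo.shape, X_norm.shape)
```

The output keeps the original shape, but the data no longer varies along the removed direction, so a subsequent PCA would report the former PC2 as its first component.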

Maintainer

Erdogan Taskesen, github: [erdogant](https://github.com/erdogant)
Contributions are welcome.
