Factor analysis in Python: PCA, CA, MCA, MFA, FAMD, GPA
Prince is a Python library for multivariate exploratory data analysis. It includes a variety of methods for summarizing tabular data, including principal component analysis (PCA) and correspondence analysis (CA). Prince provides efficient implementations that follow the scikit-learn API.
I made Prince when I was at university, back in 2016. I spent a significant amount of time in 2022 revamping the entire package. It is thoroughly tested and supports many features, such as supplementary rows/columns and row/column weights.
Example usage
>>> import prince
>>> dataset = prince.datasets.load_decathlon()
>>> decastar = dataset.query('competition == "Decastar"')
>>> pca = prince.PCA(n_components=5)
>>> pca = pca.fit(decastar, supplementary_columns=['rank', 'points'])
>>> pca.eigenvalues_summary
          eigenvalue % of variance % of variance (cumulative)
component
0              3.114        31.14%                     31.14%
1              2.027        20.27%                     51.41%
2              1.390        13.90%                     65.31%
3              1.321        13.21%                     78.52%
4              0.861         8.61%                     87.13%
>>> pca.transform(dataset).tail()
component                       0         1         2         3         4
competition athlete
OlympicG    Lorenzo      2.070933  1.545461 -1.272104 -0.215067 -0.515746
            Karlivans    1.321239  1.318348  0.138303 -0.175566 -1.484658
            Korkizoglou -0.756226 -1.975769  0.701975 -0.642077 -2.621566
            Uldal        1.905276 -0.062984 -0.370408 -0.007944 -2.040579
            Casarsa      2.282575 -2.150282  2.601953  1.196523 -3.571794
>>> chart = pca.plot(dataset)
The chart is interactive, but the interactivity doesn't show on GitHub. The green points are the column loadings.
>>> chart = pca.plot(
... dataset,
... show_row_labels=True,
... show_row_markers=False,
... row_labels_column='athlete',
... color_rows_by='competition'
... )
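The percentage columns in the eigenvalue summary above can be reproduced by hand. The sketch below is not part of Prince: it is plain Python, `summarise` is a hypothetical helper, and it assumes that the ten active columns (the decathlon events, with `rank` and `points` left out as supplementary) are standardised, so that the total inertia equals 10. The printed eigenvalues are consistent with that assumption.

```python
# Eigenvalues as printed in the summary above (rounded to 3 decimals).
eigenvalues = [3.114, 2.027, 1.390, 1.321, 0.861]

# Assumption: 10 standardised active columns, so the total inertia is 10.
total_inertia = 10


def summarise(eigenvalues, total_inertia):
    """Return (component, eigenvalue, % of variance, cumulative %) rows."""
    rows = []
    running = 0.0
    for component, eigenvalue in enumerate(eigenvalues):
        running += eigenvalue
        share = 100 * eigenvalue / total_inertia
        cumulative = 100 * running / total_inertia
        rows.append((component, eigenvalue, round(share, 2), round(cumulative, 2)))
    return rows


summary = summarise(eigenvalues, total_inertia)
for component, eigenvalue, share, cumulative in summary:
    print(f"{component}  {eigenvalue:.3f}  {share:.2f}%  {cumulative:.2f}%")
```

Each component's share is its eigenvalue divided by the total inertia, and the cumulative column is the running sum of those shares.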
Installation
pip install prince
🎨 Prince uses Altair for making charts.
Methods
flowchart TD
cat?(Categorical data?) --> |"✅"| num_too?(Numerical data too?)
num_too? --> |"✅"| FAMD
num_too? --> |"❌"| multiple_cat?(More than two columns?)
multiple_cat? --> |"✅"| MCA
multiple_cat? --> |"❌"| CA
cat? --> |"❌"| groups?(Groups of columns?)
groups? --> |"✅"| MFA
groups? --> |"❌"| shapes?(Analysing shapes?)
shapes? --> |"✅"| GPA
shapes? --> |"❌"| PCA
Principal component analysis (PCA)
Correspondence analysis (CA)
Multiple correspondence analysis (MCA)
Multiple factor analysis (MFA)
Factor analysis of mixed data (FAMD)
Generalized procrustes analysis (GPA)
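The decision flowchart above can also be read as a small function. This is only an illustrative sketch: `choose_method` and its parameter names are hypothetical and are not part of Prince's API. It returns the abbreviation of the method the flowchart points to.

```python
def choose_method(
    has_categorical,
    has_numerical=False,
    more_than_two_columns=False,
    has_column_groups=False,
    analysing_shapes=False,
):
    """Walk the flowchart and return the name of the suggested method."""
    if has_categorical:
        # Categorical data: mixed data goes to FAMD, otherwise MCA or CA.
        if has_numerical:
            return "FAMD"
        return "MCA" if more_than_two_columns else "CA"
    # No categorical data: grouped columns go to MFA, shapes to GPA.
    if has_column_groups:
        return "MFA"
    return "GPA" if analysing_shapes else "PCA"


print(choose_method(has_categorical=False))
print(choose_method(has_categorical=True, has_numerical=True))
```

The first call follows the all-"❌" branch and ends at PCA; the second follows the two "✅" branches and ends at FAMD.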
Correctness
Prince is tested against scikit-learn and FactoMineR. For the latter, rpy2 is used to run code in R and convert the results to Python, which makes automated tests possible. See the tests directory for more details.
Citation
Please use this citation if you use this software as part of a scientific publication.
@software{Halford_Prince,
    author = {Halford, Max},
    license = {MIT},
    title = {{Prince}},
    url = {https://github.com/MaxHalford/prince}
}
License
The MIT License (MIT). Please see the license file for more information.