Compositional data (CoDa) analysis tools for Python
pyCoDaMath
pyCoDaMath provides compositional data (CoDa) analysis tools for Python.
- Source code: https://bitbucket.org/genomicepidemiology/pycoda
Getting Started
This package extends the Pandas dataframe object with various CoDa tools. It also provides a set of plotting functions for CoDa figures.
Installation
Clone the git repo to your local hard drive:
git clone https://brinch@bitbucket.org/genomicepidemiology/pycoda.git
Enter the pycoda directory and run
pip install ./
Usage
The pyCoDaMath module is loaded as
import pycodamath
Once the module is imported, CLR values for a Pandas DataFrame df are obtained with
df.coda.clr()
Documentation
CLR transformation - point estimate
df.coda.clr()
Returns centered logratio coefficients. If the data frame contains zeros, values will be replaced by the Aitchison mean point estimate.
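For readers new to CoDa, the CLR coefficient of a part is the log of that part minus the mean of the logs of all parts. The following is a minimal pure-Python sketch of that definition for a single composition with strictly positive parts. The function name clr and the sample values are illustrative only, and it does not reproduce pyCoDaMath's zero handling.

```python
import math

def clr(composition):
    """Centered logratio of one composition with strictly positive parts."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [value - mean_log for value in logs]

sample = [10.0, 20.0, 70.0]  # illustrative values, not from the package
coefficients = clr(sample)
print(coefficients)
```

By construction the coefficients sum to zero, and rescaling the whole composition leaves them unchanged.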
CLR transformation - standard deviation
df.coda.clr_std(n_samples=5000)
Returns the standard deviation of n_samples random draws in CLR space.
Parameters
- n_samples (int) - Number of random draws from a Dirichlet distribution.
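The idea of taking a standard deviation over random draws can be sketched with the standard library alone. The sketch below assumes that each draw comes from a Dirichlet distribution with parameters equal to the counts plus a small prior; the prior value 0.5, the seed argument, and the function names are assumptions made for illustration and are not taken from pyCoDaMath.

```python
import math
import random
import statistics

def clr(composition):
    """Centered logratio of one strictly positive composition."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

def dirichlet_draw(alpha, rng):
    """One Dirichlet draw, built from independent gamma variates."""
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    return [g / total for g in gammas]

def clr_std_sketch(counts, n_samples=5000, prior=0.5, seed=0):
    """Per-part standard deviation of CLR values over Dirichlet draws."""
    rng = random.Random(seed)
    alpha = [c + prior for c in counts]  # assumed prior, see lead-in
    draws = [clr(dirichlet_draw(alpha, rng)) for _ in range(n_samples)]
    # zip(*draws) regroups the draws part by part
    return [statistics.stdev(part) for part in zip(*draws)]

stds = clr_std_sketch([0, 5, 95], n_samples=500)
print(stds)
```

Parts with low or zero counts get the widest spread, which is the uncertainty the standard deviation is meant to expose.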
ALR transformation - point estimate
df.coda.alr(part=None)
Same as clr() but returning additive logratio values. If part is None, then the last part of the composition is used, otherwise part is used as denominator.
Parameters
- part (str) - Name of the part to be used as denominator.
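As a rough illustration of what the ALR values are, the sketch below divides every part by the chosen denominator part and takes the log, defaulting to the last part when part is None. It works on a plain dict rather than a DataFrame, and the function name alr and the part names are hypothetical.

```python
import math

def alr(composition, part=None):
    """Additive logratio of a dict {part name: positive value}."""
    names = list(composition)
    denominator = names[-1] if part is None else part
    return {
        name: math.log(composition[name] / composition[denominator])
        for name in names
        if name != denominator  # the denominator itself is dropped
    }

sample = {"a": 1.0, "b": 2.0, "c": 4.0}  # illustrative values
default_alr = alr(sample)            # denominator is "c"
custom_alr = alr(sample, part="a")   # denominator is "a"
print(default_alr, custom_alr)
```

The result has one coordinate fewer than the composition has parts.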
ALR transformation - standard deviation
df.coda.alr_std(part=None, n_samples=5000)
Same as clr_std, but in ALR space.
Parameters
- part (str) - Name of the part to be used as denominator.
- n_samples (int) - Number of random draws from a Dirichlet distribution.
ILR transformation - point estimate
df.coda.ilr(psi=None)
Same as clr() but for isometric logratio transform. An orthonormal basis can be provided as psi. If no basis is given, a default sequential binary partition basis will be used.
Parameters
- psi (array_like) - Orthonormal basis.
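To make the idea of a sequential binary partition concrete, the sketch below computes ILR coordinates for one particular partition: each part in turn is balanced against the geometric mean of all parts that follow it (often called pivot coordinates). This is one possible orthonormal basis chosen for illustration; it is an assumption that it resembles, but may not match, the default basis used by pyCoDaMath.

```python
import math

def ilr_pivot(composition):
    """ILR coordinates for the partition: part i versus all later parts."""
    logs = [math.log(x) for x in composition]
    coords = []
    for i in range(len(logs) - 1):
        rest = logs[i + 1:]
        r = len(rest)
        # balance between one part and the geometric mean of the r parts after it
        coords.append(math.sqrt(r / (r + 1)) * (logs[i] - sum(rest) / r))
    return coords

sample = [10.0, 20.0, 70.0]  # illustrative values
coords = ilr_pivot(sample)
print(coords)
```

A composition with D parts yields D-1 coordinates, and their squared norm equals the squared norm of the CLR coefficients, which is what "isometric" refers to.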
ILR transformation - standard deviation
df.coda.ilr_std(psi=None, n_samples=5000)
This method does not exist (yet).
Bayesian zero replacement
df.coda.zero_replacement(n_samples=5000)
Returns a count table with zero values replaced by finite values using Bayesian inference.
Parameters
- n_samples (int) - Number of random draws from a Dirichlet distribution.
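One simple way to see how Bayesian inference removes zeros is the closed-form posterior mean of a Dirichlet-multinomial model. The sketch below is a stand-in for illustration, not pyCoDaMath's estimator: it uses the arithmetic posterior mean instead of random draws, and the prior value 0.5 and the function name replace_zeros are assumptions.

```python
def replace_zeros(counts, prior=0.5):
    """Posterior-mean counts under a symmetric Dirichlet prior (assumed)."""
    total = sum(counts)
    n_parts = len(counts)
    # posterior mean proportion is (count + prior) / (total + n_parts * prior);
    # multiplying by total puts the result back on the count scale
    return [total * (c + prior) / (total + n_parts * prior) for c in counts]

counts = [0, 5, 95]  # illustrative counts
replaced = replace_zeros(counts)
print(replaced)
```

Every part ends up strictly positive while the total count is preserved, so log-ratio transforms can be applied afterwards.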
Closure
df.coda.closure(N)
Applies closure to the composition, rescaling it so that the parts sum to the constant N.
Parameters
- N (int) - Closure constant.
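Closure is a plain rescaling, as the following sketch for a single composition shows. The function name closure and its total argument are illustrative and stand in for the constant N above.

```python
def closure(composition, total=1.0):
    """Rescale a composition so that its parts sum to total."""
    current_sum = sum(composition)
    return [total * x / current_sum for x in composition]

counts = [2, 3, 5]  # illustrative values
percent = closure(counts, 100)
print(percent)
```

The ratios between parts are unchanged; only the sum is fixed.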
Total variance
df.coda.totvar()
Calculates the total variance of a set of compositions.
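A common definition of the total variance is the sum, over all parts, of the variance of that part's CLR coefficient across samples. The sketch below follows that definition and uses the sample variance (n-1 denominator); whether pyCoDaMath uses the same normalization is an assumption.

```python
import math
import statistics

def clr(composition):
    """Centered logratio of one strictly positive composition."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

def total_variance(compositions):
    """Sum over parts of the sample variance of the CLR coefficients."""
    clr_rows = [clr(row) for row in compositions]
    # zip(*clr_rows) regroups the values part by part
    return sum(statistics.variance(part) for part in zip(*clr_rows))

data = [[1.0, 2.0, 7.0], [3.0, 3.0, 4.0], [2.0, 5.0, 3.0]]  # illustrative
spread = total_variance(data)
print(spread)
```

Samples that differ only by a scale factor contribute no variance, since their CLR coefficients are identical.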
Geometric mean
df.coda.gmean()
Calculates the geometric mean of a set of compositions.
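The sketch below reads "geometric mean of a set of compositions" as the part-wise geometric mean across samples, closed to sum to 1 (the compositional center). That reading, the closing step, and the function name are assumptions made for illustration.

```python
import math

def geometric_mean_composition(compositions):
    """Part-wise geometric mean across samples, closed to sum to 1."""
    n = len(compositions)
    means = [
        math.exp(sum(math.log(x) for x in part) / n)
        for part in zip(*compositions)  # iterate part by part
    ]
    total = sum(means)
    return [m / total for m in means]

data = [[1.0, 4.0], [4.0, 1.0]]  # illustrative values
center_point = geometric_mean_composition(data)
print(center_point)
```

Averaging in log space is what keeps the result consistent with the log-ratio transforms above.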
Centering
df.coda.center()
Centers (and scales) the composition by dividing by the geometric mean and powering by the reciprocal variance.
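The centering step can be sketched as dividing each part by its geometric mean across samples and then closing each sample again. The sketch below covers centering only and leaves out the scaling (powering) step; the function name and the closure to 1 are assumptions.

```python
import math

def center_compositions(compositions):
    """Divide each part by its geometric mean across samples, then reclose."""
    n = len(compositions)
    part_gmeans = [
        math.exp(sum(math.log(x) for x in part) / n)
        for part in zip(*compositions)
    ]
    centered = []
    for row in compositions:
        shifted = [x / g for x, g in zip(row, part_gmeans)]
        total = sum(shifted)
        centered.append([value / total for value in shifted])
    return centered

data = [[1.0, 2.0, 7.0], [3.0, 3.0, 4.0], [2.0, 5.0, 3.0]]  # illustrative
centered_data = center_compositions(data)
print(centered_data)
```

After centering, every part has the same geometric mean across samples, so the data set is centered on the uniform composition.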
Plotting functions
PCA biplot
class pycodamath.pca.Biplot(data, default=True)
Plots a PCA biplot. Set default to False for an empty plot. The parameter data (DataFrame) is the data to be analyzed. Use counts, not CLR values.
A number of methods are available for customizing the biplot:
- plotloadings(cutoff=0, scale=None, labels=None)
- plotloadinglabels(labels=None)
- plotscores(group=None, palette=None, legend=True, labels=None)
- plotscorelables(labels=None)
- plotellipses(group=None, palette=None)
- plotcentroids(group=None, palette=None)
- plothulls(group=None, palette=None)
- plotcontours(group=None, palette=None, size=None, levels=None)
- removepatches()
- removescores()
- removelabels()
The keyword labels is a list of label names. If labels is None, all labels are plotted. Use labels=[] for no labels.
The keyword group is a Pandas DataFrame with an index equal to the index of data.
The keyword palette is a dict that maps each unique member of group to the color used for it.
Example
import pycodamath as coda
import pandas as pd
data = pd.read_csv('example/kilauea_iki_chem.csv')
mypca = coda.pca.Biplot(data)
mypca.plothulls()
mypca.removelabels()
mypca.plotloadinglabels(['FeO'])
Ternary diagram
pycodamath.plot.ternary()
File details
Details for the file pyCoDaMath-1.0.tar.gz.
File metadata
- Download URL: pyCoDaMath-1.0.tar.gz
- Upload date:
- Size: 11.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c6a336d9b65185539324c6cfce53e28bbdfcd9b03936c5d33a53770d851165e9 |
| MD5 | e2f5894f6c72f48a9985c4364f3f9be1 |
| BLAKE2b-256 | 2b01cb504ee3da026b066d5c692d93996a21ef6e706b009c7a26aba2b121c7f5 |
File details
Details for the file pyCoDaMath-1.0-py3-none-any.whl.
File metadata
- Download URL: pyCoDaMath-1.0-py3-none-any.whl
- Upload date:
- Size: 12.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a60231a42751c33e6d2931d802253e2cf47fe93c01e20f13d86f9d32cc8d6f1a |
| MD5 | 2bcb394670ab6f363a3869522d36a1ee |
| BLAKE2b-256 | 18424722ee92c467bb258c26b67253f7bb0b1d00bf5e7c4c270d25c879ece1dc |