Compositional data (CoDa) analysis tools for Python

pyCoDaMath

pyCoDaMath provides compositional data (CoDa) analysis tools for Python.

Getting Started

This package extends the Pandas dataframe object with various CoDa tools. It also provides a set of plotting functions for CoDa figures.

Installation

Clone the git repo to your local hard drive:

git clone https://brinch@bitbucket.org/genomicepidemiology/pycoda.git

Enter the pycoda directory and run

pip install ./

Usage

The pyCoDaMath module is loaded as

import pycodamath

Once the module is imported, the coda accessor is available on every Pandas DataFrame. To get CLR values from a DataFrame df, call

df.coda.clr()

Documentation

CLR transformation - point estimate

df.coda.clr()

Returns centered logratio coefficients. If the data frame contains zeros, values will be replaced by the Aitchison mean point estimate.
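For reference, the CLR definition itself can be sketched in plain Python. This is only an illustration of the formula for a single composition with strictly positive parts. It is not the pyCoDaMath implementation and it does no zero handling; the function name and the input values are made up for the example.

```python
import math

def clr(composition):
    """Centered logratio: log of each part minus the log of the geometric mean."""
    logs = [math.log(x) for x in composition]  # requires strictly positive parts
    mean_log = sum(logs) / len(logs)           # log of the geometric mean
    return [value - mean_log for value in logs]

coefficients = clr([10.0, 20.0, 70.0])
# CLR coefficients sum to zero (up to rounding) and do not depend on the scale
# of the input, so clr([1, 2, 7]) gives the same result.
print(coefficients, sum(coefficients))
```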

CLR transformation - standard deviation

df.coda.clr_std(n_samples=5000)

Returns the standard deviation of n_samples random draws in CLR space.

Parameters

  • n_samples (int) - Number of random draws from a Dirichlet distribution.
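The idea of estimating a spread from Dirichlet draws can be sketched as follows. This is a rough standard-library illustration, not the package's code. The prior of 0.5 added to each count, the seed, and all function names are assumptions made for the example.

```python
import math
import random
import statistics

def clr(composition):
    """Centered logratio of one composition with strictly positive parts."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

def dirichlet_draw(alphas, rng):
    """One Dirichlet draw, built from normalized independent gamma variates."""
    gammas = [rng.gammavariate(alpha, 1.0) for alpha in alphas]
    total = sum(gammas)
    return [g / total for g in gammas]

def clr_std_sketch(counts, n_samples=5000, prior=0.5, seed=0):
    """Standard deviation per part of n_samples Dirichlet draws, in CLR space."""
    rng = random.Random(seed)
    alphas = [count + prior for count in counts]  # assumed prior, not from the package
    draws = [clr(dirichlet_draw(alphas, rng)) for _ in range(n_samples)]
    return [statistics.stdev(column) for column in zip(*draws)]

spread = clr_std_sketch([0, 12, 88], n_samples=2000)
print(spread)
```

In this sketch the part with a zero count ends up with the largest spread, since its proportion is the least constrained by the data.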

ALR transformation - point estimate

df.coda.alr(part=None)

Same as clr() but returning additive logratio values. If part is None, then the last part of the composition is used, otherwise part is used as denominator.

Parameters

  • part (str) - Name of the part to be used as denominator.
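The role of part can be illustrated with a small sketch of the ALR definition for one composition stored as a dict. This is plain Python for illustration only, not the pyCoDaMath implementation; the function name and the sample values are made up.

```python
import math

def alr(composition, part=None):
    """Additive logratio: log of each part over the chosen denominator part."""
    names = list(composition)
    denominator = names[-1] if part is None else part  # default: last part
    log_denominator = math.log(composition[denominator])
    return {
        name: math.log(composition[name]) - log_denominator
        for name in names
        if name != denominator                         # the denominator drops out
    }

sample = {"SiO2": 48.0, "MgO": 12.0, "FeO": 6.0}
default_alr = alr(sample)            # FeO (last part) is the denominator
mgo_alr = alr(sample, part="MgO")    # MgO is the denominator
print(default_alr)
print(mgo_alr)
```

A composition with D parts yields D - 1 ALR values, because the denominator part is removed from the result.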

ALR transformation - standard deviation

df.coda.alr_std(part=None, n_samples=5000)

Same as clr_std, but in ALR space.

Parameters

  • part (str) - Name of the part to be used as denominator.

  • n_samples (int) - Number of random draws from a Dirichlet distribution.

ILR transformation - point estimate

df.coda.ilr(psi=None)

Same as clr() but for the isometric logratio transform. An orthonormal basis can be provided as psi. If no basis is given, a default sequential binary partition basis will be used.

Parameters

  • psi (array_like) - Orthonormal basis.
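As an illustration of what an ILR transform computes, the sketch below uses one particular sequential binary partition, where at step i the first i parts are balanced against part i + 1. Whether this matches the package's default basis is an assumption; the function names are made up and the code is independent of pyCoDaMath.

```python
import math

def clr(composition):
    """Centered logratio of one composition with strictly positive parts."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

def ilr(composition):
    """ILR coordinates for the partition 'first i parts versus part i + 1'."""
    logs = [math.log(x) for x in composition]
    coordinates = []
    for i in range(1, len(logs)):
        mean_first = sum(logs[:i]) / i                 # log geometric mean of first i parts
        balance = math.sqrt(i / (i + 1)) * (mean_first - logs[i])
        coordinates.append(balance)
    return coordinates

x = [10.0, 20.0, 70.0]
ilr_coordinates = ilr(x)
# The transform is an isometry: ILR and CLR vectors have the same Euclidean norm.
ilr_norm = math.sqrt(sum(v * v for v in ilr_coordinates))
clr_norm = math.sqrt(sum(v * v for v in clr(x)))
print(ilr_coordinates, ilr_norm, clr_norm)
```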

ILR transformation - standard deviation

df.coda.ilr_std(psi=None, n_samples=5000)

This method does not exist (yet).

Bayesian zero replacement

df.coda.zero_replacement(n_samples=5000)

Returns a count table with zero values replaced by finite values using Bayesian inference.

Parameters

  • n_samples (int) - Number of random draws from a Dirichlet distribution.
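A simplified picture of Bayesian zero replacement is given below. Instead of drawing n_samples from a Dirichlet distribution, the sketch uses the closed-form expected value of a Dirichlet posterior, rescaled to the original total. The prior of 0.5 per part and the use of the arithmetic mean are assumptions made for the example; the package may use a different prior and a different estimate.

```python
def replace_zeros(counts, prior=0.5):
    """Replace zeros using the mean of a Dirichlet posterior with a flat prior."""
    total = sum(counts)
    posterior = [count + prior for count in counts]  # Dirichlet posterior parameters
    scale = total / sum(posterior)                   # keep the original total count
    return [alpha * scale for alpha in posterior]

counts = [0, 12, 88]
replaced = replace_zeros(counts)
# Every part is now strictly positive, so logratios can be taken.
print(replaced, sum(replaced))
```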

Closure

df.coda.closure(N)

Closes the composition to the constant N, i.e. rescales each composition so that its parts sum to N.

Parameters

  • N (int) - Closure constant.
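The closure operation is simple enough to sketch in a few lines of plain Python. The function below is illustrative and not taken from pyCoDaMath.

```python
def closure(composition, N=1):
    """Rescale a composition so that its parts sum to the closure constant N."""
    total = sum(composition)
    return [N * x / total for x in composition]

closed = closure([2, 3, 5], N=100)
print(closed)  # [20.0, 30.0, 50.0]
```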

Total variance

df.coda.totvar()

Calculates the total variance of a set of compositions.
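One common definition of the total variance is the sum of the variances of the CLR coefficients over all parts. The sketch below follows that definition and uses the sample variance (n - 1 in the denominator). Both choices are assumptions, and the package may normalize differently. The data are made-up values.

```python
import math
import statistics

def clr(composition):
    """Centered logratio of one composition with strictly positive parts."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

def total_variance(compositions):
    """Sum over parts of the sample variance of the CLR coefficients."""
    clr_rows = [clr(row) for row in compositions]
    return sum(statistics.variance(column) for column in zip(*clr_rows))

rows = [[1.0, 2.0, 7.0], [2.0, 2.0, 6.0], [1.0, 5.0, 4.0]]
tv = total_variance(rows)
print(tv)
```

Because CLR values are scale invariant, a set of compositions that differ only by a constant factor has a total variance of zero.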

Geometric mean

df.coda.gmean()

Calculates the geometric mean of a set of compositions.

Centering

df.coda.center()

Centers (and scales) the composition by dividing by the geometric mean and powering by the reciprocal variance.
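The geometric mean and the centering step can be illustrated as follows. The sketch computes the part-wise geometric mean of a set of compositions, divides each composition by it, and closes the result to 1. The scaling (powering) step is left out, and the closure to 1 is an assumption. None of this is taken from the pyCoDaMath source.

```python
import math

def gmean(compositions):
    """Part-wise geometric mean of a set of compositions."""
    n = len(compositions)
    return [math.exp(sum(math.log(x) for x in column) / n)
            for column in zip(*compositions)]

def center(compositions):
    """Divide every composition by the geometric mean, then close to 1."""
    g = gmean(compositions)
    centered = []
    for row in compositions:
        shifted = [x / gj for x, gj in zip(row, g)]
        total = sum(shifted)
        centered.append([x / total for x in shifted])
    return centered

rows = [[1.0, 2.0, 7.0], [2.0, 2.0, 6.0], [1.0, 5.0, 4.0]]
centered_rows = center(rows)
# After centering, the geometric mean has equal parts, i.e. it is the neutral composition.
centered_gmean = gmean(centered_rows)
print(centered_gmean)
```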

Plotting functions

PCA biplot

class pycodamath.pca.Biplot(data, default=True)

Plots a PCA biplot. Set default to False for an empty plot. The parameter data (DataFrame) is the data to be analyzed. Use counts, not CLR values.

A number of methods are available for customizing the biplot:

  • plotloadings(cutoff=0, scale=None, labels=None)
  • plotloadinglabels(labels=None)
  • plotscores(group=None, palette=None, legend=True, labels=None)
  • plotscorelabels(labels=None)
  • plotellipses(group=None, palette=None)
  • plotcentroids(group=None, palette=None)
  • plothulls(group=None, palette=None)
  • plotcontours(group=None, palette=None, size=None, levels=None)
  • removepatches()
  • removescores()
  • removelabels()

The keyword labels is a list of label names. If labels is None, all labels are plotted. Use labels=[] for no labels.

The keyword group is a Pandas dataframe with index equal to the index of data.

The keyword palette is a dict with the color to use for each unique member of group.

Example

import pycodamath as coda
import pandas as pd

data = pd.read_csv('example/kilauea_iki_chem.csv')
mypca = coda.pca.Biplot(data)
mypca.plothulls()
mypca.removelabels()
mypca.plotloadinglabels(['FeO'])

Ternary diagram

pycodamath.plot.ternary()

Project details

Version 1.0
