Hierarchical partitioning and variation partitioning for canonical analyses in Python.

rdacca_hp

Python implementation of hierarchical partitioning and variation partitioning for canonical analyses, inspired by the R package rdacca.hp.

rdacca_hp provides hierarchical partitioning and variation partitioning for:

RDA (Redundancy Analysis)
CCA (Canonical Correspondence Analysis)
dbRDA (distance-based Redundancy Analysis)

It is designed for users who want a Python workflow similar to rdacca.hp, while supporting mixed predictor types such as:

numeric variables
unordered categorical variables
ordered categorical variables
grouped predictor sets

The package also provides:

permutation-based significance testing
plotting utilities for hierarchical partitioning and variation partitioning results

Features

Hierarchical partitioning (hier_part)
Variation partitioning (var_part)
Support for:
- numeric predictors
- unordered factors
- ordered factors
- grouped predictors
Permutation testing with permu_hp()
Plotting utilities for single results and result comparison
Baseline validation against R outputs for key RDA use cases

Current status

This package is in an early public release stage.

At this stage:

the RDA workflow has been checked carefully against the R package rdacca.hp
mixed predictor inputs (numeric + unordered factor + ordered factor) are supported
permutation testing is available
baseline tests against R outputs are included for selected cases

Notes:

results for RDA are expected to closely match the R implementation in validated scenarios
CCA and dbRDA are implemented and tested; further benchmark expansion is planned for future releases
permutation p-values may show small Monte Carlo differences relative to R because random permutation sequences differ across platforms

Installation

Install from local source

pip install .

Install in editable mode for development

pip install -e ".[dev]"

Install optional plotting dependencies

pip install -e ".[plot]"

Install from a published package

pip install rdacca_hp

Public API

The main public functions can be imported directly from the package top level:

from rdacca_hp import rdacca_hp, permu_hp, plot_rdaccahp, plot_comparison

Main public objects include:

rdacca_hp
RdaccaHpResult
calculate_rda
calculate_cca
calculate_dbrda
create_test_data
create_cca_test_data
create_distance_test_data
permu_hp
plot_rdaccahp
plot_comparison

Quick start

1. Numeric predictors only

from rdacca_hp import create_test_data, rdacca_hp

dv, iv = create_test_data()

result = rdacca_hp(
    dv=dv,
    iv=iv,
    method="RDA",
    type="adjR2",
    scale=False,
    var_part=True
)

print(result.total_explained_variation)
print(result.hier_part)
print(result.var_part)
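For orientation, the difference between `type="R2"` and `type="adjR2"` can be illustrated with Ezekiel's adjustment, which is the usual choice for RDA. The sketch below assumes that this is the adjustment in use here; `adjusted_r2` is a hypothetical helper, not part of the package, and the numbers are invented.

```python
def adjusted_r2(r2: float, n_samples: int, n_predictors: int) -> float:
    """Ezekiel's adjusted R2: 1 - (1 - R2) * (n - 1) / (n - m - 1)."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need n_samples > n_predictors + 1")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof


# Illustrative numbers only: R2 = 0.60 from 30 sites and 4 predictors.
adj = adjusted_r2(0.60, n_samples=30, n_predictors=4)
print(round(adj, 3))  # 0.536
```

The adjusted value is always at most the raw R2, and the gap grows as the number of predictors approaches the number of samples.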

2. Mixed predictors: numeric + unordered factor + ordered factor

If your predictors contain mixed types, you can explicitly specify factor handling.

import pandas as pd
from rdacca_hp import rdacca_hp

dv = pd.DataFrame({
    "sp1": [2, 3, 5, 4, 6, 7],
    "sp2": [1, 2, 2, 3, 4, 5],
})

iv = pd.DataFrame({
    "WatrCont": [10.1, 9.8, 8.7, 7.5, 6.9, 6.1],
    "Substrate": ["A", "B", "A", "C", "B", "A"],
    "Shrub": ["None", "Few", "Few", "Many", "Many", "Few"],
})

result = rdacca_hp(
    dv=dv,
    iv=iv,
    method="RDA",
    type="adjR2",
    scale=False,
    var_part=True,
    categorical_factors=["Substrate"],
    ordered_factors={"Shrub": ["None", "Few", "Many"]}
)

print(result.hier_part)
print(result.var_part)

3. Grouped predictors

import pandas as pd
from rdacca_hp import create_test_data, rdacca_hp

dv, iv = create_test_data(n_predictors=4)

groups = {
    "Climate": pd.DataFrame(iv[:, :2], columns=["Temp", "Rain"]),
    "Soil": pd.DataFrame(iv[:, 2:], columns=["N", "C"]),
}

result = rdacca_hp(
    dv=dv,
    iv=groups,
    method="RDA",
    type="R2",
    var_part=True
)

print(result.hier_part)
print(result.var_part)

4. Permutation test

from rdacca_hp import create_test_data, permu_hp

dv, iv = create_test_data()

perm_result = permu_hp(
    dv=dv,
    iv=iv,
    method="RDA",
    type="adjR2",
    permutations=99,
    scale=False,
    random_state=123,
    verbose=False
)

print(perm_result)

5. Plotting

from rdacca_hp import create_test_data, rdacca_hp, plot_rdaccahp

dv, iv = create_test_data()
result = rdacca_hp(dv=dv, iv=iv, method="RDA", type="R2", var_part=True)

fig = plot_rdaccahp(result, plot_type="bar")

You can also use the convenience method on the result object:

fig = result.plot(plot_type="bar")

Main functions

`rdacca_hp()`

Main function for hierarchical partitioning and variation partitioning.

`permu_hp()`

Permutation test for hierarchical partitioning results.

`plot_rdaccahp()`

Plot a single hierarchical partitioning result.

`plot_comparison()`

Compare multiple hierarchical partitioning results in one figure.

Input conventions

Response matrix (`dv`)

dv can be:

a NumPy array
a pandas DataFrame

For RDA, users often apply Hellinger transformation before analysis when working with community data.
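A minimal sketch of the Hellinger transformation (square root of row-wise relative abundances), written with NumPy only. The helper name `hellinger` and the abundance table are assumptions for illustration; the package may or may not ship its own transformation utility.

```python
import numpy as np


def hellinger(counts: np.ndarray) -> np.ndarray:
    """Square root of row-wise relative abundances."""
    counts = np.asarray(counts, dtype=float)
    row_sums = counts.sum(axis=1, keepdims=True)
    # Leave all-zero rows as zeros instead of dividing by zero.
    rel = np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)
    return np.sqrt(rel)


# Made-up abundance table: 3 sites x 3 species.
abund = np.array([[4, 0, 12], [1, 1, 2], [0, 9, 0]])
dv_hel = hellinger(abund)
print(dv_hel)
```

The transformed array could then be passed as `dv` with `method="RDA"`.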

For dbRDA, dv should be a square symmetric distance matrix.
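As a sketch of what such an input might look like, the following builds a Bray-Curtis dissimilarity matrix with NumPy and checks that it is square and symmetric. The choice of Bray-Curtis and the helper name `bray_curtis_matrix` are assumptions for illustration; any valid distance matrix should have the same shape properties.

```python
import numpy as np


def bray_curtis_matrix(abund: np.ndarray) -> np.ndarray:
    """Pairwise Bray-Curtis dissimilarities between rows (sites)."""
    x = np.asarray(abund, dtype=float)
    # |x_i - x_j| and (x_i + x_j), summed over species, for every site pair.
    num = np.abs(x[:, None, :] - x[None, :, :]).sum(axis=2)
    den = (x[:, None, :] + x[None, :, :]).sum(axis=2)
    return np.divide(num, den, out=np.zeros_like(num), where=den > 0)


abund = np.array([[4, 0, 12], [1, 1, 2], [0, 9, 0]])
dist = bray_curtis_matrix(abund)

# Checks matching the stated requirement: square, symmetric, zero diagonal.
assert dist.shape[0] == dist.shape[1]
assert np.allclose(dist, dist.T)
assert np.allclose(np.diag(dist), 0.0)
```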

Predictor matrix (`iv`)

iv can be:

a NumPy array
a pandas DataFrame
a dict mapping group names to predictor tables (grouped predictors)
a list of predictor tables (grouped predictors)

Supported predictor types include:

continuous numeric columns
unordered categorical columns
ordered categorical columns

Predictor handling

rdacca_hp supports several predictor formats.

1. Numeric matrix or array

If iv is given as a numeric array or numeric matrix, all predictors are treated as numeric variables.

result = rdacca_hp(dv=dv, iv=iv_numeric)

2. pandas DataFrame with mixed predictor types

If iv is given as a pandas DataFrame, the package can handle mixed predictor types, including:

continuous numeric variables
unordered categorical factors
ordered factors

Numeric columns are handled directly as numeric predictors.

For non-numeric predictors, users can explicitly specify variable types when needed:

use categorical_factors=[...] for unordered categorical variables
use ordered_factors={...} for ordered variables with a declared level order

result = rdacca_hp(
    dv=dv,
    iv=iv_df,
    categorical_factors=["Substrate"],
    ordered_factors={"Shrub": ["None", "Few", "Many"]},
)

For mixed-type DataFrames, explicit specification is recommended, especially when:

the dataset contains string-based predictors
factor level order matters
reproducible encoding behavior is important

In practice:

numeric variables are supported directly
unordered factors should be declared with categorical_factors
ordered factors should be declared with ordered_factors

This makes the package easy to use for standard numeric analyses, while still allowing precise control over how mixed predictor data are encoded.

3. Grouped predictors as a dictionary

result = rdacca_hp(dv=dv, iv={"Climate": climate_df, "Soil": soil_df})

4. Grouped predictors as a list

result = rdacca_hp(dv=dv, iv=[group1_df, group2_df, group3_df])

Returned object

rdacca_hp() returns an RdaccaHpResult object containing at least:

method_type
total_explained_variation
hier_part

and optionally:

var_part

It also provides:

summary()
plot()

Example:

result = rdacca_hp(dv=dv, iv=iv)
result.summary()
fig = result.plot(plot_type="bar")

`hier_part`

A table containing:

Unique
Average.share
Individual
I.perc(%)
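As a reading aid, the columns can be related to each other in the two-predictor case. The sketch assumes the convention of the R package rdacca.hp: each shared fraction is split equally among the predictors involved, `Individual` is `Unique` plus `Average.share`, and `I.perc(%)` is each predictor's share of the total. The fraction values are invented.

```python
# Invented variation fractions for two predictors, X1 and X2.
unique = {"X1": 0.20, "X2": 0.10}
shared_x1_x2 = 0.06  # fraction explained jointly by X1 and X2

rows = {}
for name, uniq in unique.items():
    avg_share = shared_x1_x2 / 2  # joint fraction split equally
    rows[name] = {
        "Unique": uniq,
        "Average.share": avg_share,
        "Individual": uniq + avg_share,
    }

# Individual effects add up to the total explained variation.
total = sum(r["Individual"] for r in rows.values())
for r in rows.values():
    r["I.perc(%)"] = 100 * r["Individual"] / total

for name, r in rows.items():
    print(name, {k: round(v, 2) for k, v in r.items()})
```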

`var_part`

A table containing:

Fractions
% Total

Running tests

Run all tests:

pytest -q

Run coverage:

pytest --cov=rdacca_hp --cov-report=term-missing

Run only R baseline tests:

pytest tests/test_r_baselines.py -q

R baseline validation

This project includes a benchmark workflow against R outputs.

Benchmark directories

benchmark/data/: fixed input data
benchmark/expected/: expected outputs exported from R
benchmark/r_scripts/: scripts used to generate expected R outputs

Current validated RDA baselines

rda_numeric_2vars
rda_unordered_factor
rda_mite_full_mixed
rda_ordered_factor_mixed

These baselines are used to check that Python results remain aligned with the corresponding R workflow for validated RDA scenarios.

Important notes

1. Small p-value differences are normal

Permutation p-values may differ slightly from R because:

permutation sequences differ between R and Python
the random number generators differ, so the same seed does not reproduce R's permutations
permutation p-values are Monte Carlo estimates
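The Monte Carlo point can be sketched with the common permutation p-value convention, (1 + number of permuted statistics at least as large as the observed one) / (1 + number of permutations). This is an assumption about the convention, not a description of `permu_hp()` internals, and the null distribution below is simulated rather than computed from permuted data.

```python
import random


def permutation_p_value(observed: float, permuted: list[float]) -> float:
    """(1 + number of permuted statistics >= observed) / (1 + permutations)."""
    exceed = sum(1 for value in permuted if value >= observed)
    return (1 + exceed) / (1 + len(permuted))


def simulated_p(seed: int, permutations: int = 99) -> float:
    # Stand-in null distribution; a real test would recompute the statistic
    # on permuted data rather than draw random numbers.
    rng = random.Random(seed)
    null_stats = [rng.gauss(0.0, 1.0) for _ in range(permutations)]
    return permutation_p_value(1.5, null_stats)


# Same observed statistic, different random sequences.
for seed in (1, 2, 3):
    print(seed, simulated_p(seed))
```

Under this convention, 99 permutations cannot give a p-value below 0.01, and results near a significance threshold are best confirmed with more permutations.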

2. Ordered factors matter

Ordered factors should not be treated the same way as ordinary categorical variables. If you have ordered predictor levels, specify them explicitly.
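The distinction can be seen in pandas itself. The sketch below only shows how pandas represents unordered and ordered categoricals; how the package encodes ordered factors internally is not described here, so treat this as background rather than as the package's encoding.

```python
import pandas as pd

shrub = pd.Series(["None", "Few", "Few", "Many", "Many", "Few"])

# Unordered view: levels are only labels (pandas sorts them alphabetically).
unordered = pd.Categorical(shrub)

# Ordered view: the declared level order carries information.
ordered = pd.Categorical(shrub, categories=["None", "Few", "Many"], ordered=True)

print(list(unordered.categories))  # ['Few', 'Many', 'None']
print(ordered.codes.tolist())      # [0, 1, 1, 2, 2, 1]
```

Without a declared order, the alphabetical default would place "None" last, which is not the intended ranking.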

3. CSV reading and `"None"`

If a valid category level is literally "None", make sure it is not accidentally parsed as missing data when reading CSV files.

For example:

import pandas as pd
iv = pd.read_csv("file.csv", keep_default_na=False)
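Note that `keep_default_na=False` turns off every built-in missing-value marker, including empty cells. If empty cells should still count as missing, one option is to list them in `na_values`. The sketch below uses an in-memory CSV with invented values.

```python
import io

import pandas as pd

csv_text = "WatrCont,Shrub\n10.1,None\n,Few\n8.7,Many\n"

# keep_default_na=False turns off all built-in missing-value markers;
# na_values=[""] then re-enables only empty cells as missing.
iv = pd.read_csv(io.StringIO(csv_text), keep_default_na=False, na_values=[""])

print(iv["Shrub"].tolist())            # ['None', 'Few', 'Many']
print(iv["WatrCont"].isna().tolist())  # [False, True, False]
```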

Limitations

RDA is currently the most thoroughly validated workflow
CCA and dbRDA are available, but more benchmark expansion is still desirable
very large permutation jobs may be slow in pure Python workflows

Recommended usage for reproducibility

For the most reproducible results:

keep benchmark datasets fixed
explicitly specify unordered and ordered factors when needed
use baseline tests against R outputs
report the package version and analysis settings
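One way to capture the version and settings together is a small record written alongside the results. This is a sketch using only the standard library; `analysis_record` is a hypothetical helper, and the settings shown mirror the permutation example in the quick start.

```python
import json
from importlib.metadata import PackageNotFoundError, version


def analysis_record(settings: dict) -> dict:
    """Bundle the installed package version with the analysis settings."""
    try:
        pkg_version = version("rdacca_hp")
    except PackageNotFoundError:
        pkg_version = "not installed"
    return {"package": "rdacca_hp", "package_version": pkg_version, "settings": settings}


record = analysis_record(
    {"method": "RDA", "type": "adjR2", "scale": False, "permutations": 99, "random_state": 123}
)
print(json.dumps(record, indent=2))
```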

Package structure

rdacca_hp/
├── rdacca_hp/
│   ├── __init__.py
│   ├── core.py
│   ├── utils.py
│   ├── permutation.py
│   └── plotting.py
│
├── tests/
│   ├── test_r_baselines.py
│   ├── test_assertions.py
│   ├── test_core.py
│   ├── test_cca.py
│   ├── test_dbrda.py
│   ├── test_permutation.py
│   ├── test_plotting.py
│   └── test_public_api.py
│
├── benchmark/
│   ├── data/
│   ├── expected/
│   └── r_scripts/
│
├── scripts/
│   └── test_time.py
│
├── README.md
├── pyproject.toml
└── LICENSE

Citation / inspiration

This Python project is inspired by the R package rdacca.hp and its hierarchical partitioning framework for canonical analyses.

If you use this package in academic work, you should also cite the original methodological and/or R package sources as appropriate.

License

This project is licensed under the MIT License.

Contact

Author: Jiangshan Lai
Email: lai@njfu.edu.cn

Repository: https://github.com/peony-peo/rdacca_hp

Release history

0.1.1 (this version): Mar 17, 2026
0.1.0: Mar 13, 2026
