Fast Computation of Latent Correlations for Mixed Data
Free software: GNU General Public License v3
Documentation: https://latentcor-py.readthedocs.io.
Introduction
latentcor is a Python package for estimation of latent correlations with mixed data types (continuous, binary, truncated, and ternary) under the latent Gaussian copula model. For references on the estimation framework, see:
Fan, J., Liu, H., Ning, Y., and Zou, H. (2017), “High Dimensional Semiparametric Latent Graphical Model for Mixed Data.” JRSS B. Continuous/binary types.
Quan X., Booth J.G. and Wells M.T. “Rank-based approach for estimating correlations in mixed ordinal data.” arXiv. Ternary type.
Yoon G., Carroll R.J. and Gaynanova I. (2020). “Sparse semiparametric canonical correlation analysis for data of mixed types.” Biometrika. Truncated type for zero-inflated data.
Yoon G., Müller C.L. and Gaynanova I. (2021). “Fast computation of latent correlations.” Approximation method of computation; see math framework for details.
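For intuition about what a latent correlation is, the continuous/continuous case has a well-known closed form: under the Gaussian copula model, Kendall's tau and the latent correlation r satisfy tau = (2/pi) * arcsin(r), so r = sin(pi * tau / 2). The sketch below uses only the Python standard library and is not how latentcor itself is implemented; the helper names kendall_tau and latent_cor_continuous are hypothetical and exist only for this illustration.

```python
import math
import random


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant pairs) / total pairs."""
    n = len(x)
    total = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            # +1 for a concordant pair, -1 for a discordant pair, 0 for a tie
            total += (prod > 0) - (prod < 0)
    return total / (n * (n - 1) / 2)


def latent_cor_continuous(x, y):
    """Sine bridge for two continuous variables (hypothetical helper)."""
    return math.sin(math.pi / 2 * kendall_tau(x, y))


# Simulate a latent Gaussian pair with correlation rho.
random.seed(0)
rho = 0.6
z1 = [random.gauss(0, 1) for _ in range(500)]
z2 = [rho * a + math.sqrt(1 - rho**2) * random.gauss(0, 1) for a in z1]

# Observed data are strictly increasing transforms of the latent variables.
x = [math.exp(a) for a in z1]
y = [b**3 for b in z2]

estimate = latent_cor_continuous(x, y)
print(round(estimate, 2))
```

Because Kendall's tau depends only on ranks, the estimate is unchanged by the monotone transforms applied to the latent variables, which is why the rank-based approach can recover the correlation on the latent scale.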
Statement of need
No Python software package is currently available that allows accurate and fast correlation estimation from mixed variable data in a unifying manner.
The Python package latentcor, introduced here, thus represents the first stand-alone Python package for computation of latent correlation that takes into account all variable types (continuous/binary/ordinal/zero-inflated), comes with an optimized memory footprint, and is computationally efficient, essentially making latent correlation estimation almost as fast as rank-based correlation estimation.
Installation
The easiest way to install latentcor is with pip:
pip install latentcor
Example
Let’s import gen_data, get_tps and latentcor from latentcor.
from latentcor import gen_data, get_tps, latentcor
First, we will generate a pair of variables of different types, using a sample size of n=100, to serve as example data. Here the first variable will be ternary and the second variable will be continuous.
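# Simulate 100 observations: one ternary ("ter") and one continuous ("con") variable
simdata = gen_data(n=100, tps=["ter", "con"])
# Show the first six rows of the simulated data matrix
print(simdata['X'][:6, :])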
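To picture what a ternary variable means under the latent Gaussian copula model, it can be thought of as a latent Gaussian value cut at two thresholds into the levels 0, 1 and 2. The following is a minimal standard-library sketch of that idea only; it does not reproduce how gen_data works internally, and the function ternarize and its cutoffs of -0.5 and 0.5 are hypothetical choices for illustration.

```python
import random


def ternarize(z, lower=-0.5, upper=0.5):
    """Map a latent value to 0, 1 or 2 using two cutoffs (hypothetical values)."""
    return int(z > lower) + int(z > upper)


random.seed(1)
latent = [random.gauss(0, 1) for _ in range(6)]
ternary = [ternarize(z) for z in latent]

# Print each latent value next to the ternary level it falls into.
for z, t in zip(latent, ternary):
    print(f"{z: .3f} -> {t}")
```

Only the ternary levels would be observed in practice; the latent values are what the latent correlation refers to.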
Then we can estimate the latent correlation matrix of these two variables using the latentcor function.
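# Estimate the latent correlation matrix, declaring the type of each column
estimate = latentcor(simdata['X'], tps=["ter", "con"])
# 'R' holds the estimated latent correlation matrix
print(estimate['R'])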
Community Guidelines
Contributions and suggestions to the software are always welcome. Please consult our contribution guidelines prior to submitting a pull request.
Report issues or problems with the software using GitHub’s issue tracker.
The easiest way to replicate the development environment of latentcor is with pip:
pip install -r requirements_dev.txt
Credits
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
History
0.1.0 (2021-12-28)
First version.
0.1.1 (2022-01-06)
Fix some typos.
0.1.2 (2022-01-06)
Fix a bug in the use_nearPD argument of the function latentcor.
0.1.3 (2022-01-07)
Fix syntax errors for jupyter-execute in README.txt.
0.1.4 (2022-05-23)
Fix error for continuous estimation.
0.2.0 (2022-08-16)
Increase the maximum number of iterations for the positive definiteness adjustment.
Return function outputs as a dictionary.
0.2.1 (2022-08-22)
Return the output latent correlation matrix as a pandas.DataFrame.
Polish the output heatmap.
0.2.2 (2022-08-22)
Update README file.
0.2.3 (2022-08-22)
Correct update history.
0.2.4 (2022-09-07)
Correct incompatible versions.
0.2.5 (2023-11-05)
Regenerate interpolants for approximation method and fix version compatibility for Python 3.7.