PyNetCor is a fast Python C++ extension for correlation and network analysis

Reason this release was yanked:

Package uploaded with incorrect build artifacts affecting functionality.

Project description

pyNetCor

PyNetCor is a fast Python C++ extension for correlation and network analysis on high-dimensional datasets. It aims to serve as a scalable foundational package to accelerate large-scale computations.

Features

  • Calculate correlation matrices using the Pearson, Spearman, or Kendall method
  • Process large-scale computations in chunks (larger than RAM)
  • Find top-k and differential correlations between each row of two arrays
  • Efficient P-value approximation and multiple testing correction
  • Handle missing values
  • Multi-threaded execution

Installation

You can install pyNetCor using pip:

pip install pynetcor

Quick Start

Create Data

import numpy as np

features = 100000
samples = 100
arr1 = np.random.random((features, samples))
arr2 = np.random.random((features, samples))

Calculate correlation matrix

Compute and return the full matrix at once.

from pynetcor.cor import corrcoef

# using 8 threads
# Pearson correlations between `arr1` and itself
cor_result = corrcoef(arr1, threads=8)
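
To see why returning the full matrix at once can be a problem for data of this size, here is a rough size estimate. It assumes the result is a dense matrix of 8-byte float64 values, which is an assumption made for illustration and is not stated here about PyNetCor's output type.

```python
# Rough memory estimate for a dense feature-by-feature correlation matrix.
# Assumption: 8-byte float64 entries (not confirmed for PyNetCor's output).
features = 100_000
bytes_per_value = 8

full_bytes = features * features * bytes_per_value
block_bytes = 1024 * features * bytes_per_value  # a block of 1024 rows

print(f"full matrix:    {full_bytes / 1e9:.1f} GB")
print(f"1024-row block: {block_bytes / 1e9:.2f} GB")
```

Under that assumption the full matrix for 100000 features needs about 80 GB, while a block of 1024 rows needs under 1 GB.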

Compute the matrix in chunks and return an iterator; recommended for large-scale analyses whose results exceed RAM.

from pynetcor.cor import chunked_corrcoef

# Calculate and return `chunk_size=1024` rows of the correlation matrix with each iteration.
cor_iter = chunked_corrcoef(arr1, chunk_size=1024, threads=8)
for cor_chunk_matrix in cor_iter:
    ...
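
The idea behind chunking can be sketched in plain NumPy. This is only an illustration of the technique, not PyNetCor's implementation: the helper names `standardize_rows` and `chunked_pearson` are hypothetical, and missing values and threading are ignored.

```python
import numpy as np

def standardize_rows(a):
    """Center each row and scale it to unit length."""
    a = np.asarray(a, dtype=float)
    a = a - a.mean(axis=1, keepdims=True)
    return a / np.linalg.norm(a, axis=1, keepdims=True)

def chunked_pearson(a, chunk_size):
    """Yield the Pearson correlation matrix `chunk_size` rows at a time."""
    z = standardize_rows(a)
    for start in range(0, z.shape[0], chunk_size):
        # Rows start..start+chunk_size of the full matrix, against all rows.
        yield z[start:start + chunk_size] @ z.T

rng = np.random.default_rng(0)
small = rng.random((10, 50))
for block in chunked_pearson(small, chunk_size=4):
    print(block.shape)
```

Only one block of rows is held in memory per iteration, so each block can be filtered or written to disk before the next one is computed.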

Top-k correlation search

Identify the exact top-k correlations (here using Spearman correlation).

from pynetcor.cor import cor_topk

# top 0.1% of correlations (k given as a fraction)
cor_topk_result = cor_topk(arr1, method="spearman", k=0.001, threads=8)

# top 100 correlations
cor_topk_result = cor_topk(arr1, method="spearman", k=100, threads=8)

# Returns a 2D array with 4 columns: [row_index, col_index, correlation, pvalue]
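
For intuition, a brute-force top-k search on a small array can be written with NumPy alone. This is a sketch under stated assumptions, not PyNetCor's algorithm: `rank_rows` and `topk_spearman` are hypothetical helpers, ties are not averaged when ranking, "top" is taken to mean largest absolute correlation, and no P-value column is produced.

```python
import numpy as np

def rank_rows(a):
    """Ordinal ranks within each row (ties are NOT averaged)."""
    return np.argsort(np.argsort(a, axis=1), axis=1).astype(float)

def topk_spearman(a, k):
    """Return the k row pairs with the largest |Spearman correlation|.

    Output columns: [row_index, col_index, correlation].
    """
    cor = np.corrcoef(rank_rows(np.asarray(a)))
    i, j = np.triu_indices(cor.shape[0], k=1)  # each pair once, no diagonal
    vals = cor[i, j]
    keep = np.argpartition(-np.abs(vals), k - 1)[:k]
    keep = keep[np.argsort(-np.abs(vals[keep]))]  # strongest first
    return np.column_stack([i[keep], j[keep], vals[keep]])

rng = np.random.default_rng(0)
print(topk_spearman(rng.random((20, 30)), k=5))
```

This version materializes the whole matrix, so it only suits small inputs.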

Top-k differential correlation search

Identify the exact top-k differences in correlation between pairs of features across two states or time points.

# Compute the pairwise correlations separately for `arr1` with `arr1`, and `arr2` with `arr2`, then identify the feature pairs with the largest difference
from pynetcor.cor import cor_topkdiff

# top 0.1% of differential correlations (k given as a fraction)
cor_topkdiff_result = cor_topkdiff(x1=arr1, y1=arr2, x2=arr1, y2=arr2, k=0.001, threads=8)

# top 100 differential correlations
cor_topkdiff_result = cor_topkdiff(x1=arr1, y1=arr2, x2=arr1, y2=arr2, k=100, threads=8)

# Returns a 2D array with 5 columns: [row_index, col_index, diffCor, cor1, cor2]
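
The differential search can be illustrated with a brute-force NumPy sketch. Everything here is an assumption for illustration: `row_corr` and `topk_diff` are hypothetical helpers, their argument names do not correspond to PyNetCor's `x1`/`y1`/`x2`/`y2` parameters, the difference is taken as `cor1 - cor2`, and pairs are ranked by absolute difference.

```python
import numpy as np

def row_corr(x, y):
    """Pearson correlation between every row of x and every row of y."""
    def unit(a):
        a = np.asarray(a, dtype=float)
        a = a - a.mean(axis=1, keepdims=True)
        return a / np.linalg.norm(a, axis=1, keepdims=True)
    return unit(x) @ unit(y).T

def topk_diff(state1_a, state1_b, state2_a, state2_b, k):
    """Top-k feature pairs by |cor1 - cor2|.

    Output columns: [row_index, col_index, diffCor, cor1, cor2].
    """
    cor1 = row_corr(state1_a, state1_b)  # correlations in the first state
    cor2 = row_corr(state2_a, state2_b)  # correlations in the second state
    diff = cor1 - cor2
    flat = np.argpartition(-np.abs(diff), k - 1, axis=None)[:k]
    flat = flat[np.argsort(-np.abs(diff.ravel()[flat]))]  # largest first
    rows, cols = np.unravel_index(flat, diff.shape)
    return np.column_stack(
        [rows, cols, diff[rows, cols], cor1[rows, cols], cor2[rows, cols]]
    )

rng = np.random.default_rng(0)
s1, s2 = rng.random((15, 40)), rng.random((15, 40))
print(topk_diff(s1, s1, s2, s2, k=3))
```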

P-value computation

Compute P-values for correlations (Pearson or Spearman) using the Student's t-distribution. The approximation method is significantly faster than the classical method, with absolute errors generally below 1e-8.

from pynetcor.cor import corrcoef, pvalue_student_t
samples = arr1.shape[1]

# Generate the Pearson correlation matrix
cor_result = corrcoef(arr1, threads=8)

# P-value approximation
pvalue_result = pvalue_student_t(cor_result, df=samples-2, approx=True, threads=8)

# P-value classic
pvalue_result = pvalue_student_t(cor_result, df=samples-2, approx=False, threads=8)
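
For reference, the classical (non-approximate) calculation can be sketched with the standard library only. A correlation r with df degrees of freedom gives the statistic t = r * sqrt(df / (1 - r^2)), and the two-sided P-value equals the regularized incomplete beta function I_x(df/2, 1/2) at x = df / (df + t^2) = 1 - r^2. The helper names below are hypothetical, and this is not PyNetCor's code or its approximation method.

```python
from math import exp, lgamma, log, log1p

def _betacf(a, b, x, max_iter=500, eps=1e-15):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Each iteration applies the even term, then the odd term.
        for aa in (
            m * (b - m) * x / ((qam + m2) * (a + m2)),
            -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2)),
        ):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = exp(lgamma(a + b) - lgamma(a) - lgamma(b)
                + a * log(x) + b * log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b

def pvalue_from_r(r, df):
    """Two-sided Student's t P-value for a correlation r."""
    return betainc(df / 2.0, 0.5, 1.0 - r * r)

print(pvalue_from_r(0.3, 98))
```

Evaluating this per matrix entry in pure Python is slow, which is the kind of cost a vectorized or approximate method is meant to avoid.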

Unified implementation for calculating correlations and P-values.

from pynetcor.cor import cortest, chunked_cortest

# Pearson correlation & P-value approximation
cortest_result = cortest(arr1, approx=True, threads=8)

# chunked computation, recommended for large-scale analyses that exceed RAM
for chunk in chunked_cortest(arr1, approx=True, threads=8):
    for (row_index, col_index, correlation, pvalue) in chunk:
        ...

# Returns a 2D array with 4 columns: [row_index, col_index, correlation, pvalue]

Multiple testing correction: holm, hochberg, bonferroni, BH, BY.

from pynetcor.cor import cortest, chunked_cortest

# Pearson correlation & multiple testing correction
cortest_result = cortest(arr1, adjust_pvalue=True, adjust_method="BH", threads=8)

# chunked computation, recommended for large-scale analyses that exceed RAM
for chunk in chunked_cortest(arr1, adjust_pvalue=True, adjust_method="BH", threads=8):
    for (row_index, col_index, correlation, pvalue, adjusted_pvalue) in chunk:
        ...

# Returns a 2D array with 5 columns: [row_index, col_index, correlation, pvalue, adjusted_pvalue]

NOTE: the chunked functions only support approximate adjusted P-values. PyNetCor uses approximation methods to achieve effective FDR control before all P-values have been computed.
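
For comparison, the exact Benjamini-Hochberg (BH) adjustment needs every P-value at once, which is why a chunked pipeline has to approximate it. A minimal NumPy sketch of the exact procedure follows; `bh_adjust` is a hypothetical helper, not part of PyNetCor, and it does not reproduce the approximation mentioned in the note.

```python
import numpy as np

def bh_adjust(pvalues):
    """Exact Benjamini-Hochberg adjusted P-values for a 1D array."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    # p_(i) * m / i for the i-th smallest P-value
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working from the largest P-value downwards.
    adjusted_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.minimum(adjusted_sorted, 1.0)
    return adjusted

print(bh_adjust([0.01, 0.04, 0.03, 0.005]))
```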

Download files

Source Distributions

No source distribution files available for this release.

Built Distributions

pynetcor-0.0.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.0 MB)
CPython 3.12, manylinux: glibc 2.17+, x86-64

pynetcor-0.0.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.0 MB)
CPython 3.11, manylinux: glibc 2.17+, x86-64

pynetcor-0.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.0 MB)
CPython 3.10, manylinux: glibc 2.17+, x86-64

pynetcor-0.0.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.0 MB)
CPython 3.9, manylinux: glibc 2.17+, x86-64

pynetcor-0.0.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.0 MB)
CPython 3.8, manylinux: glibc 2.17+, x86-64

pynetcor-0.0.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.0 MB)
CPython 3.7m, manylinux: glibc 2.17+, x86-64

File hashes

Hashes for pynetcor-0.0.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
SHA256 81d5fd71867a8b42608d9a0e9a41958b9bb0788122f39818a94a7df83a7dc576
MD5 8248eb0be3b500c79d42d810a653d7b9
BLAKE2b-256 0967655142c48219410e4636a814ef12aec3432bc6fbee0db88349dc313bcd46

Hashes for pynetcor-0.0.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
SHA256 e6118b6d75de0a0909c009bf0b8510d7d64f99d7d313579ebe8138ddd2e0cf9a
MD5 103ba690d61d0f7559ba682eb306cca0
BLAKE2b-256 818a7b1d1f9949f5bb228d3b4e47dfc63d3ba92c1d22c3e2ebb669fcf8035f38

Hashes for pynetcor-0.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
SHA256 c6559753f4a71fee72ef0327394d2b4e42914328cc332415c75b69f361141291
MD5 792632ecf9996dfe1f2dbe760b74658f
BLAKE2b-256 e74fa068321d2149198a96d9a7465f852b2693f229eda3d33fdef964748fb060

Hashes for pynetcor-0.0.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
SHA256 5ae22f7a31ae7c8e4531f067a04534fc4b62783baf1613e84225cbd5852d4dd1
MD5 c36426ed387e70b563decb18f022a4cb
BLAKE2b-256 b4d73e4effa1462e774f7d8e70cb903411f7f686405eb4200cccd506029a18e3

Hashes for pynetcor-0.0.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
SHA256 67ed7099c48c948234fc7ef79fbac9a10be10839977474cb78a5e96bfb9c66cf
MD5 8fd4a160425d1091c5c3b20173de897f
BLAKE2b-256 51609af4bf784948aae6fa8afc0ab2a2ac1888716d3c6750a4025b08ef5fdb39

Hashes for pynetcor-0.0.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
SHA256 26bfd8dad8d2ede1c921a5013fb7a6c6af35dedb8057130024c53579f7cd8318
MD5 8d5a03b8146413003401e357eb689b83
BLAKE2b-256 9f94c0c22e9d3d5460ea648b49b4a999279d5b8d0376ba5d45c033ed16f24494
