
A Python module to reduce features using a correlation matrix


corrfeatred

Select features using a correlation matrix.

Installation

pip install corrfeatred

Usage

from corrfeatred import reduce_features

# Placeholder: supply your own correlation matrix here.
correlation_matrix = ...
feature_set = reduce_features(correlation_matrix, threshold=0.8, policy='min')

# To get a different feature set for the same correlation matrix,
# pass a random seed to change the output.
different_feature_set = reduce_features(correlation_matrix, threshold=0.8, policy='min', random_seed=42)

Idea and workflow

Currently there is only one function, which takes a correlation matrix and a threshold as input and constructs a graph from them.

We create a graph in which each node represents a feature and each edge represents collinearity between two features. The maximal cliques of the graph are then computed.
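The graph construction and clique step can be sketched in plain Python. This is only an illustration, not the package's own code: the helper names `build_graph` and `maximal_cliques` are hypothetical, the matrix is assumed to be a square list of lists, and an edge is assumed to mean that the absolute correlation is at or above the threshold. The cliques are found with the basic Bron-Kerbosch recursion.

```python
from itertools import combinations


def build_graph(corr, threshold):
    """Return adjacency sets: i and j are linked when |corr[i][j]| >= threshold.

    Hypothetical helper; the edge rule is an assumption for illustration.
    """
    n = len(corr)
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if abs(corr[i][j]) >= threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates that extend it, x: already explored
        if not p and not x:
            cliques.append(sorted(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return sorted(cliques)


# Toy matrix: features 0, 1 and 2 are mutually correlated, feature 3 is not.
corr = [
    [1.0, 0.9, 0.85, 0.1],
    [0.9, 1.0, 0.95, 0.2],
    [0.85, 0.95, 1.0, 0.3],
    [0.1, 0.2, 0.3, 1.0],
]
graph = build_graph(corr, threshold=0.8)
cliques = maximal_cliques(graph)
print(cliques)  # [[0, 1, 2], [3]]
```

In this toy example the three mutually correlated features form one clique, and the uncorrelated feature forms a clique of its own.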

Each clique represents a cluster of features that are correlated with each other, so a single feature from the cluster is enough to represent the whole cluster in the final feature set. This allows multiple policies for choosing the features (minimum number of features, maximum number of features, etc.).

Our goal is to have at most one feature from each clique.
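Taking at most one feature from each clique is the same as choosing features so that no two chosen features share an edge. One simple way to do that, shown here as a rough sketch rather than the package's actual selection logic, is a greedy pass over the nodes in a seeded random order. The function name `pick_features`, its `seed` parameter, and the feature names are hypothetical.

```python
import random


def pick_features(adj, seed=None):
    """Greedily keep a node, then block all of its neighbours.

    Hypothetical helper: the result contains no two adjacent nodes,
    so it holds at most one feature from every clique.
    """
    order = sorted(adj)
    random.Random(seed).shuffle(order)  # the seed controls the visiting order
    kept, blocked = [], set()
    for node in order:
        if node not in blocked:
            kept.append(node)
            blocked |= adj[node]
    return sorted(kept)


# Toy graph: a, b and c are mutually correlated; d is correlated with nothing.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": set(),
}
selection = pick_features(adj, seed=42)
print(selection)  # one of a/b/c, plus d
```

Different seeds can yield different, equally valid selections, which mirrors the role of `random_seed` described in the usage section.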

Finally, all features in the set returned by this function have pairwise correlation below the threshold.
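This property is easy to check after the fact. The sketch below assumes the selected features are given as row and column indices into a list-of-lists correlation matrix and compares absolute correlations; the helper names are hypothetical and the package's real return type may differ.

```python
from itertools import combinations


def max_pairwise_correlation(corr, selected):
    """Largest absolute correlation between any two selected features."""
    return max(
        (abs(corr[i][j]) for i, j in combinations(selected, 2)),
        default=0.0,  # fewer than two features: nothing to compare
    )


def satisfies_threshold(corr, selected, threshold):
    """True when every selected pair is correlated below the threshold."""
    return max_pairwise_correlation(corr, selected) < threshold


corr = [
    [1.0, 0.9, 0.85, 0.1],
    [0.9, 1.0, 0.95, 0.2],
    [0.85, 0.95, 1.0, 0.3],
    [0.1, 0.2, 0.3, 1.0],
]
print(satisfies_threshold(corr, [0, 3], threshold=0.8))     # True
print(satisfies_threshold(corr, [0, 1, 3], threshold=0.8))  # False
```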

[Figure: workflow]
