
Data Complexity Measures


data-complexity

Data complexity measures for classification datasets, implemented in Python.

Install

$ pip install data-complexity

How it works

Maximum Fisher's Discriminant Ratio (F1)

from dcm import dcm
from sklearn import datasets

iris = datasets.load_iris()
X = iris.data
y = iris.target

index, F1 = dcm.F1(X, y)
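To make the idea behind this measure concrete, here is a minimal NumPy sketch of a per-feature Fisher discriminant ratio, following the between-class over within-class scatter formulation described in [1]. It is an illustration only, not the package's implementation: the helper names `fisher_ratio_per_feature` and `max_fisher_ratio` are hypothetical, and `dcm.F1` may normalize or report the value differently.

```python
import numpy as np


def fisher_ratio_per_feature(X, y):
    """Between-class scatter divided by within-class scatter, per feature.

    Hypothetical helper for illustration; not part of the dcm package.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for label in np.unique(y):
        Xc = X[y == label]
        class_mean = Xc.mean(axis=0)
        # Scatter of the class mean around the overall mean, weighted by class size
        between += len(Xc) * (class_mean - overall_mean) ** 2
        # Scatter of the class samples around their own class mean
        within += ((Xc - class_mean) ** 2).sum(axis=0)
    # Features with zero within-class scatter get an infinite ratio
    return np.divide(
        between, within, out=np.full_like(between, np.inf), where=within > 0
    )


def max_fisher_ratio(X, y):
    """Return (index of the most discriminative feature, its ratio)."""
    ratios = fisher_ratio_per_feature(X, y)
    best = int(np.argmax(ratios))
    return best, float(ratios[best])
```

A larger ratio means the classes are better separated along that single feature. Some formulations, including the one surveyed in [1], report `1 / (1 + max_ratio)` instead so that higher values indicate a harder problem.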

Fraction of Borderline Points (N1)

from dcm import dcm
from sklearn import datasets

bc = datasets.load_breast_cancer(as_frame=True)
X = bc.data.values
y = bc.target.values

N1 = dcm.N1(X, y)
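As a rough illustration of what this measure captures, the sketch below builds a minimum spanning tree over the samples and reports the fraction of points that share a tree edge with a point of another class, which is the construction described in [1]. Several choices here are assumptions rather than the package's behavior: the function name `borderline_fraction` is hypothetical, Euclidean distance is used (other implementations may use a different metric such as Gower distance), and the tree is built with a simple dense Prim's algorithm that is only suitable for small datasets.

```python
import numpy as np


def borderline_fraction(X, y):
    """Fraction of points joined by an MST edge to a point of another class.

    Hypothetical helper for illustration; not part of the dcm package.
    Uses O(n^2) memory, so it is only meant for small datasets.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n = len(X)

    # Dense pairwise Euclidean distance matrix
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Prim's algorithm, starting from sample 0
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best_dist = dist[0].copy()            # cheapest known link to the tree
    best_parent = np.zeros(n, dtype=int)  # tree node providing that link
    borderline = np.zeros(n, dtype=bool)

    for _ in range(n - 1):
        # Pick the closest sample that is not yet in the tree
        candidates = np.where(in_tree, np.inf, best_dist)
        j = int(np.argmin(candidates))
        i = int(best_parent[j])
        in_tree[j] = True
        # An edge between different classes marks both endpoints as borderline
        if y[i] != y[j]:
            borderline[i] = True
            borderline[j] = True
        # Update the cheapest links using the newly added sample
        closer = dist[j] < best_dist
        best_dist = np.where(closer, dist[j], best_dist)
        best_parent = np.where(closer, j, best_parent)

    return float(borderline.mean())
```

Values near 0 suggest well-separated classes, while values near 1 suggest that most points lie close to a class boundary.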

Entropy of Class Proportions (C1) and Imbalance Ratio (C2)

from dcm import dcm
from sklearn import datasets

bc = datasets.load_breast_cancer(as_frame=True)
X = bc.data.values
y = bc.target.values

C1, C2 = dcm.C12(X, y)
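Both measures depend only on the class labels. The sketch below computes a normalized entropy of the class proportions and an imbalance-ratio score along the lines of the formulas surveyed in [1]. The function name `class_balance_measures` is hypothetical, and the exact conventions are assumptions: some versions of these measures report one minus the entropy, so the values returned by `dcm.C12` may be oriented differently.

```python
import numpy as np


def class_balance_measures(y):
    """Return (normalized class entropy, imbalance-ratio score).

    Hypothetical helper for illustration; not part of the dcm package.
    Assumes at least two distinct classes in y.
    """
    _, counts = np.unique(np.asarray(y), return_counts=True)
    n_classes = len(counts)
    n = counts.sum()

    # Entropy of the class proportions, scaled to [0, 1] by log(n_classes)
    p = counts / n
    entropy = float(-(p * np.log(p)).sum() / np.log(n_classes))

    # Multi-class imbalance ratio; equals 1 for perfectly balanced classes
    ir = (n_classes - 1) / n_classes * (counts / (n - counts)).sum()
    imbalance = float(1.0 - 1.0 / ir)

    return entropy, imbalance
```

Under these conventions, a perfectly balanced dataset gives an entropy of 1 and an imbalance score of 0, and the imbalance score grows toward 1 as the class sizes diverge.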

Other Measures

Coming soon...

References

[1] How Complex is your classification problem? A survey on measuring classification complexity, https://arxiv.org/abs/1808.03591

[2] The Extended Complexity Library (ECoL), https://github.com/lpfgarcia/ECoL


