Data Complexity Measures
data-complexity
Data complexity measures for classification datasets, implemented in Python.
Install
$ pip install data-complexity
How it works
Maximum Fisher's Discriminant Ratio (F1)
from dcm import dcm
from sklearn import datasets
iris = datasets.load_iris()
X = iris.data
y = iris.target
index, F1 = dcm.F1(X, y)
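To make the quantity behind this measure concrete, here is a minimal sketch of the per-feature Fisher discriminant ratio as described in [1]: between-class scatter divided by within-class scatter, computed one feature at a time. It is written in plain Python so the arithmetic is visible. The function name `fisher_ratios` is hypothetical and this is not the package's own implementation; whether `dcm.F1` returns this raw ratio or a rescaled value such as 1 / (1 + max ratio) is not stated here.

```python
from collections import defaultdict


def fisher_ratios(X, y):
    """Return one discriminant ratio per feature (hypothetical helper).

    Assumes every feature varies inside at least one class; otherwise
    the within-class scatter is zero and the division fails.
    """
    # Group the rows by class label.
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)

    ratios = []
    for f in range(len(X[0])):
        overall_mean = sum(row[f] for row in X) / len(X)
        between = 0.0  # scatter of class means around the overall mean
        within = 0.0   # scatter of points around their own class mean
        for rows in groups.values():
            class_mean = sum(row[f] for row in rows) / len(rows)
            between += len(rows) * (class_mean - overall_mean) ** 2
            within += sum((row[f] - class_mean) ** 2 for row in rows)
        ratios.append(between / within)
    return ratios


# Toy data: feature 0 separates the two classes, feature 1 does not.
toy_X = [[0, 0], [1, 1], [10, 0], [11, 1]]
toy_y = [0, 0, 1, 1]
toy_ratios = fisher_ratios(toy_X, toy_y)
best_feature = max(range(len(toy_ratios)), key=lambda f: toy_ratios[f])
```

On the toy data the first feature gets a large ratio and the second gets zero, so `best_feature` is 0. A large maximum ratio suggests that at least one feature separates the classes well on its own.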
Fraction of Borderline Points (N1)
from dcm import dcm
from sklearn import datasets
bc = datasets.load_breast_cancer(as_frame=True)
X = bc.data.values
y = bc.target.values
N1 = dcm.N1(X, y)
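As a rough illustration of what N1 measures, the sketch below follows the description in [1]: build a minimum spanning tree over all points, then report the fraction of points that share a tree edge with a point of another class. The function name `borderline_fraction` is hypothetical, Euclidean distance is an assumption, and the package may compute distances and the tree differently. The tree is built with a simple O(n²) Prim's algorithm, so this is meant for small datasets only.

```python
import math


def borderline_fraction(X, y):
    """Fraction of points joined to another class by an MST edge (sketch)."""
    n = len(X)
    in_tree = [False] * n
    best_dist = [math.inf] * n  # cheapest known edge from the tree to each point
    parent = [-1] * n           # tree endpoint of that cheapest edge
    best_dist[0] = 0.0          # start the tree at point 0
    borderline = set()

    for _ in range(n):
        # Prim's step: add the point outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best_dist[i])
        in_tree[u] = True
        # The edge (parent[u], u) is now part of the tree.
        if parent[u] != -1 and y[u] != y[parent[u]]:
            borderline.update((u, parent[u]))
        # Relax the distances of the points still outside the tree.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(X[u], X[v])  # Euclidean distance, Python 3.8+
                if d < best_dist[v]:
                    best_dist[v] = d
                    parent[v] = u
    return len(borderline) / n


# Two well-separated clusters on a line: only one tree edge crosses classes.
toy_X = [[0], [1], [2], [10], [11], [12]]
toy_y = [0, 0, 0, 1, 1, 1]
toy_n1 = borderline_fraction(toy_X, toy_y)
```

Here only the edge between the points at 2 and 10 links different classes, so two of the six points are borderline. When several edges have equal length the minimum spanning tree is not unique, so the value can depend on tie-breaking.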
Entropy of Class Proportions (C1) and Imbalance Ratio (C2)
from dcm import dcm
from sklearn import datasets
bc = datasets.load_breast_cancer(as_frame=True)
X = bc.data.values
y = bc.target.values
C1, C2 = dcm.C12(X, y)
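Both measures depend only on the class counts, so they are easy to sketch by hand. The example below follows the definitions in [1]: the entropy of the class proportions normalized by log(number of classes), and C2 = 1 - 1/IR, where IR = ((k - 1) / k) * sum(n_i / (n - n_i)) over the k classes. The function name `class_balance_measures` is hypothetical and this is not the package's code. Some formulations report one minus the normalized entropy so that larger values mean more imbalance; which convention `dcm.C12` follows is not stated here.

```python
import math
from collections import Counter


def class_balance_measures(y):
    """Return (normalized entropy, C2) for a label sequence (sketch).

    Assumes at least two distinct classes are present.
    """
    counts = Counter(y)
    n = len(y)
    k = len(counts)

    # Entropy of the class proportions, scaled to [0, 1] by log(k).
    proportions = [c / n for c in counts.values()]
    entropy = -sum(p * math.log(p) for p in proportions) / math.log(k)

    # Multi-class imbalance ratio; equals 1 for perfectly balanced classes.
    imbalance_ratio = (k - 1) / k * sum(c / (n - c) for c in counts.values())
    c2 = 1 - 1 / imbalance_ratio
    return entropy, c2


balanced = class_balance_measures([0] * 5 + [1] * 5)
skewed = class_balance_measures([0] * 9 + [1] * 1)
```

For the balanced labels the normalized entropy is 1 and C2 is 0. For the 9-to-1 split the entropy drops below 1 and C2 rises toward 1.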
Other Measures
Coming soon...
References
[1] How Complex is your classification problem? A survey on measuring classification complexity, https://arxiv.org/abs/1808.03591
[2] The Extended Complexity Library (ECoL), https://github.com/lpfgarcia/ECoL