Data Complexity Measures
data-complexity
Data complexity measures for classification problems, implemented in Python.
Install
$ pip install data-complexity
How it works
Maximum Fisher's Discriminant Ratio (F1)
from dcm import dcm
from sklearn import datasets
iris = datasets.load_iris()
X = iris.data
y = iris.target
index, F1 = dcm.F1(X, y)
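For readers who want to see what the measure computes, here is a minimal NumPy sketch of F1 as defined in the survey [1]: for each feature, the between-class scatter is divided by the within-class scatter, and F1 is 1 / (1 + the largest ratio). The function name `fisher_ratio_f1` is hypothetical and this is not the package's own code. Returning the index of the most discriminative feature alongside the value is an assumption about what `index` means in the call above.

```python
import numpy as np


def fisher_ratio_f1(X, y):
    """Illustrative F1 = 1 / (1 + max_f r_f); not the data-complexity implementation."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])  # class-size-weighted spread of class means, per feature
    within = np.zeros(X.shape[1])   # spread of samples around their own class mean, per feature
    for label in np.unique(y):
        Xc = X[y == label]
        class_mean = Xc.mean(axis=0)
        between += len(Xc) * (class_mean - overall_mean) ** 2
        within += ((Xc - class_mean) ** 2).sum(axis=0)
    # A feature with zero within-class scatter gets an infinite ratio if the
    # class means differ, and 0 if it is constant (a choice made for this sketch).
    with np.errstate(divide="ignore", invalid="ignore"):
        ratios = between / within
    ratios = np.nan_to_num(ratios, nan=0.0, posinf=np.inf)
    best = int(np.argmax(ratios))
    return best, 1.0 / (1.0 + ratios[best])
```

Under this definition, values near 0 indicate that at least one feature separates the classes well, and values near 1 indicate heavy overlap on every feature.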
Fraction of Borderline Points (N1)
from dcm import dcm
from sklearn import datasets
bc = datasets.load_breast_cancer(as_frame=True)
X = bc.data.values
y = bc.target.values
N1 = dcm.N1(X, y)
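As described in the survey [1], N1 is obtained by building a minimum spanning tree over all samples and counting the points that share a tree edge with a point of another class. The sketch below illustrates that idea with SciPy. The function name `borderline_fraction_n1` is hypothetical, and the use of plain Euclidean distance is an assumption, so the package may use a different distance and return different values.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform


def borderline_fraction_n1(X, y):
    """Illustrative N1: share of points joined to another class by an MST edge."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Dense pairwise Euclidean distances (assumption: no feature scaling).
    dist = squareform(pdist(X))
    # SciPy treats zero entries as missing edges, so exact duplicate
    # points are not linked to each other in this sketch.
    mst = minimum_spanning_tree(dist).tocoo()
    # Tree edges whose two endpoints carry different labels.
    mixed = y[mst.row] != y[mst.col]
    border_points = np.union1d(mst.row[mixed], mst.col[mixed])
    return len(border_points) / len(y)
```

The dense distance matrix needs memory quadratic in the number of samples, which is fine for small datasets such as the breast cancer data but not for large ones.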
Entropy of Class Proportions (C1) and Imbalance Ratio (C2)
from dcm import dcm
from sklearn import datasets
bc = datasets.load_breast_cancer(as_frame=True)
X = bc.data.values
y = bc.target.values
C1, C2 = dcm.C12(X, y)
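Both measures depend only on the class counts, so they can be illustrated without the feature matrix. The sketch below follows the definitions in the survey [1] as I understand them: C1 is one minus the normalized entropy of the class proportions, and C2 is 1 - 1/IR with IR = ((nc - 1) / nc) * sum(n_i / (n - n_i)). The function name `class_balance_c1_c2` is hypothetical. Whether the package reports the entropy itself or its complement is not stated here, so treat that orientation as an assumption.

```python
import numpy as np


def class_balance_c1_c2(y):
    """Illustrative C1 and C2 from class labels; requires at least two classes."""
    _, counts = np.unique(np.asarray(y), return_counts=True)
    n = counts.sum()
    n_classes = len(counts)
    p = counts / n
    # Entropy of class proportions, normalized to [0, 1] by log(n_classes).
    entropy = -(p * np.log(p)).sum() / np.log(n_classes)
    c1 = 1.0 - entropy  # assumption: 0 means perfectly balanced
    # Multi-class imbalance ratio and its rescaling to [0, 1).
    imbalance_ratio = (n_classes - 1) / n_classes * (counts / (n - counts)).sum()
    c2 = 1.0 - 1.0 / imbalance_ratio
    return c1, c2
```

With this orientation, a perfectly balanced dataset gives 0 for both measures, and both grow as the class distribution becomes more skewed.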
Other Measures
Coming soon...
References
[1] How Complex is your classification problem? A survey on measuring classification complexity, https://arxiv.org/abs/1808.03591
[2] The Extended Complexity Library (ECoL), https://github.com/lpfgarcia/ECoL