Data Complexity Measures
data-complexity
Data complexity measures for classification problems, implemented in Python.
Install
$ pip install data-complexity
How it works
Maximum Fisher's Discriminant Ratio (F1)
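from dcm import dcm
from sklearn import datasets
# Load the iris dataset: X holds the features, y the class labels
iris = datasets.load_iris()
X = iris.data
y = iris.target
# dcm.F1 returns a pair, unpacked here as index and F1
index, F1 = dcm.F1(X, y)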
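To make the measure concrete, here is a minimal NumPy sketch of the per-feature Fisher discriminant ratio, following the definition in the survey [1]. It is not the package's implementation. The function names `fisher_ratios` and `f1_sketch` are hypothetical, and the normalization `1 / (1 + max ratio)` is the survey's convention; whether `dcm.F1` returns the raw maximum ratio or the normalized value is not stated here.

```python
import numpy as np


def fisher_ratios(X, y):
    """Per-feature ratio of between-class to within-class scatter (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for label in np.unique(y):
        Xc = X[y == label]
        class_mean = Xc.mean(axis=0)
        # Between-class scatter: class size times squared distance of the
        # class mean from the overall mean, per feature.
        between += len(Xc) * (class_mean - overall_mean) ** 2
        # Within-class scatter: squared deviations from the class mean.
        within += ((Xc - class_mean) ** 2).sum(axis=0)
    # Assumption: a feature with zero within-class scatter gets an infinite
    # ratio if it separates the class means, and 0 if it is constant.
    fallback = np.where(between > 0, np.inf, 0.0)
    return np.divide(between, within, out=fallback, where=within > 0)


def f1_sketch(X, y):
    """Return (index of the most discriminative feature, 1 / (1 + max ratio))."""
    ratios = fisher_ratios(X, y)
    best = int(np.argmax(ratios))
    return best, 1.0 / (1.0 + ratios[best])


# Toy data: feature 0 separates the two classes, feature 1 does not.
X_toy = np.array([[0.0, 5.0], [1.0, 6.0], [10.0, 5.0], [11.0, 6.0]])
y_toy = np.array([0, 0, 1, 1])
best_feature, f1_value = f1_sketch(X_toy, y_toy)
```

Under the survey's convention, values close to 0 mean that at least one feature separates the classes well, and values close to 1 mean that no single feature does.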
Fraction of Borderline Points (N1)
from dcm import dcm
from sklearn import datasets
# Load the breast cancer dataset as pandas objects, then convert to NumPy arrays
bc = datasets.load_breast_cancer(as_frame=True)
X = bc.data.values
y = bc.target.values
N1 = dcm.N1(X, y)
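As an illustration of what N1 measures, the sketch below builds a minimum spanning tree over the points and reports the fraction of points that sit on a tree edge joining two different classes, as described in the survey [1]. It is not the package's code. The name `n1_sketch` is hypothetical, and Euclidean distance is an assumption; the distance used by `dcm.N1` is not stated here. The dense distance matrix makes this suitable for small datasets only.

```python
import numpy as np


def n1_sketch(X, y):
    """Fraction of points incident to a cross-class edge of the MST (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n = len(X)
    # Dense pairwise Euclidean distances (assumed metric).
    dist = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))

    # Prim's algorithm, starting from point 0.
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best_dist = dist[0].copy()            # cheapest known link into the tree
    best_parent = np.zeros(n, dtype=int)  # tree vertex that provides that link
    borderline = np.zeros(n, dtype=bool)

    for _ in range(n - 1):
        # Pick the closest point that is not yet in the tree.
        candidates = np.where(in_tree, np.inf, best_dist)
        j = int(np.argmin(candidates))
        p = best_parent[j]
        # An edge between different classes marks both endpoints as borderline.
        if y[j] != y[p]:
            borderline[j] = True
            borderline[p] = True
        in_tree[j] = True
        # Update the cheapest links for points still outside the tree.
        closer = (dist[j] < best_dist) & ~in_tree
        best_dist[closer] = dist[j][closer]
        best_parent[closer] = j

    return borderline.mean()


# Two well-separated 1-D clusters: only one tree edge crosses the classes.
X_toy = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])
n1_value = n1_sketch(X_toy, y_toy)
```

When several pairs of points are equally distant, the minimum spanning tree is not unique, so the value can depend on how ties are broken.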
Entropy of Class Proportions (C1) and Imbalance Ratio (C2)
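from dcm import dcm
from sklearn import datasets
# Load the breast cancer dataset as pandas objects, then convert to NumPy arrays
bc = datasets.load_breast_cancer(as_frame=True)
X = bc.data.values
y = bc.target.values
# dcm.C12 returns both class-balance measures in one call
C1, C2 = dcm.C12(X, y)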
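Both measures depend only on the class counts, so they can be sketched in plain Python. The code below computes the normalized entropy of the class proportions and a multi-class imbalance ratio, using the definitions in the survey [1] as an assumption. The function names are hypothetical, and this is not the package's implementation. Some versions of the survey and of ECoL [2] report `1 - entropy` and `1 - 1/IR` instead, so that larger values mean more imbalance; which convention `dcm.C12` follows is not stated here.

```python
import math
from collections import Counter


def class_entropy(y):
    """Entropy of the class proportions, normalized by log(number of classes)."""
    counts = Counter(y)
    n = len(y)
    n_classes = len(counts)
    if n_classes < 2:
        raise ValueError("at least two classes are required")
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return entropy / math.log(n_classes)


def imbalance_ratio(y):
    """Multi-class imbalance ratio: (k - 1) / k * sum(n_i / (n - n_i))."""
    counts = Counter(y)
    n = len(y)
    k = len(counts)
    if k < 2:
        raise ValueError("at least two classes are required")
    return (k - 1) / k * sum(c / (n - c) for c in counts.values())


# A 3:1 split between two classes.
labels = [0, 0, 0, 1]
entropy_value = class_entropy(labels)
ir_value = imbalance_ratio(labels)
```

For a perfectly balanced problem both functions return 1. As the classes become more uneven, the entropy falls toward 0 and the imbalance ratio grows.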
Other Measures
Coming soon...
References
[1] How Complex is your classification problem? A survey on measuring classification complexity, https://arxiv.org/abs/1808.03591
[2] The Extended Complexity Library (ECoL), https://github.com/lpfgarcia/ECoL