lapros data for better AI

Project description

LaPros

Install

pip install -U lapros

How to use

LaPros works with classifiers. It ranks suspicious labels given the class probabilities predicted by a classification model. Inputs can be plain Python lists, NumPy arrays, or Pandas data. Return values are in a NumPy array or a Pandas series; the larger the value, the more suspicious the corresponding label.

from lapros import suspect

# Observed labels, one per sample.
labels = [1, 0, 0, 1, 1]
probas = [
    # One row per class, one column per sample (each column sums to 1).
    [0.5, 0.6, 0.7, 0.8, 0.9],
    [0.5, 0.4, 0.3, 0.2, 0.1],
]
suspect(
    probas,
    labels=labels,
)
   err  suspected
0  0.4      False
1  0.3      False
2  0.1      False
3  0.7      False
4  0.9       True
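
To read the result as a ranking, the rows can be ordered from most to least suspicious. The sketch below does not call lapros: it copies the `err` and `suspected` values from the example output into plain Python tuples, and the names `rows`, `ranked` and `flagged_ids` are illustrative only.

```python
# Values copied from the example output: (row id, err, suspected).
rows = [
    (0, 0.4, False),
    (1, 0.3, False),
    (2, 0.1, False),
    (3, 0.7, False),
    (4, 0.9, True),
]

# Sort by err, largest (most suspicious) first.
ranked = sorted(rows, key=lambda row: row[1], reverse=True)

# Keep only the ids of rows flagged as suspected.
flagged_ids = [row_id for row_id, _err, suspected in rows if suspected]

for row_id, err, suspected in ranked:
    print(row_id, err, suspected)
print("flagged:", flagged_ids)
```

On a Pandas DataFrame with these columns, the same ordering would presumably come from sorting on the `err` column, for example with `sort_values("err", ascending=False)`.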

docstring


suspect

Rank the suspicious labels given probas from a classifier. Accepts NumPy arrays, Pandas dataframes and series, and plain Python lists. Labels can be integers, strings or even floats, provided that the probability matrix’s columns are indexed by the same label set.

Args:

  • probas (n x m matrix): probabilities for possible classes.

KwArgs:

  • labels (n x 1 vector): observed class labels

Returns: a Pandas DataFrame with 1 index and 2 columns:

  • id (int): the index, identical to the row index of the original data
  • err (float): the magnitude of suspiciousness, in the range [0, 1]
  • suspected (bool): whether the data row is suspected of having a label error.
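
The requirement that the probability matrix's columns be indexed by the same label set as the labels can be illustrated with a minimal sketch in plain Python. This is not lapros code: the class names, the `classes` list and the helper `prob_of_observed` are hypothetical, and the matrix is laid out as n rows (samples) by m columns (classes), following the docstring.

```python
# Hypothetical label set; column j of `probas` belongs to classes[j].
classes = ["cat", "dog", "bird"]

# n x m matrix: one row per sample, one column per class.
probas = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
]

# n observed labels, drawn from the same label set as the columns.
labels = ["cat", "bird", "bird"]


def prob_of_observed(probas, labels, classes):
    """Return, for each row, the probability given to the observed label."""
    if len(probas) != len(labels):
        raise ValueError("probas and labels must have the same number of rows")
    # Map each class name to its column position.
    column = {name: j for j, name in enumerate(classes)}
    result = []
    for row, label in zip(probas, labels):
        if len(row) != len(classes):
            raise ValueError("each row needs one probability per class")
        if label not in column:
            raise ValueError(f"label {label!r} is not in the label set")
        result.append(row[column[label]])
    return result


print(prob_of_observed(probas, labels, classes))
```

A low probability for the observed label (the second row here) is the kind of disagreement a label-error detector examines; how lapros turns it into `err` is not described in this document.
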
help(suspect)
Help on Function in module lapros.api:

suspect(...)
    Rank the suspicious labels given probas from a classifier.
    Accept Numpy arrays, Pandas dataframes and series, and normal Python lists.
    We can use interger, string or even float labels, given that
    the probability matrix's columns are indexed by the same label set.
    
    Args:
    - probas (n x m matrix): probabilites for possible classes.
    
    KwArgs:
    - labels (n x 1 vector): observed class labels
    
    Returns: a Pandas DataFrame including 1 index and 2 columns:
    
    - id (int): the index which is the same to the original data row index
    - err (float): the magnitude of suspiciousness, valued between [0, 1]
    - suspected (bool):  whether the data row is suspected as having a label error.

Download files

Source Distribution: lapros-0.2.1.tar.gz (13.0 kB)

Built Distribution: lapros-0.2.1-py3-none-any.whl (11.8 kB, Python 3)