lapros data for better AI

LaPros

Install

pip install -U lapros

How to use

LaPros works with classifiers. Given the class probabilities predicted by a classification model, it ranks the labels that look suspicious. Inputs can be plain Python lists, NumPy arrays, or Pandas objects. Results come back as a NumPy array or a Pandas series; the larger the value, the more suspicious the corresponding label.

import lapros
import pandas as pd
from lapros import suspect

assert lapros.__version__ == '0.3'
labels = pd.Series(["cat", "dog", "dog", "cat", "cat"])
0    cat
1    dog
2    dog
3    cat
4    cat
dtype: object
probas = pd.DataFrame(dict(
    cat=[0.5, 0.4, 0.3, 0.2, 0.1],
    dog=[0.5, 0.6, 0.7, 0.8, 0.9],
))
   cat  dog
0  0.5  0.5
1  0.4  0.6
2  0.3  0.7
3  0.2  0.8
4  0.1  0.9
suspect(
    probas,
    labels=labels,
)
lapros.classification.estimate_noise.avg_confidence:36 [0.26666667 0.65      ]
        err  suspected
0  0.000000      False
1  0.183333       True
2  0.000000      False
3  0.216667       True
4  0.416667       True
residual = suspect(
    probas,
    labels=labels,
    rank_method="residual",
    return_non_errors=False,
)
lapros.classification.estimate_noise.avg_confidence:36 [0.26666667 0.65      ]
   err
1  0.4
3  0.8
4  0.9
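
The residual values printed above match one minus the probability the model gave to the observed label. The sketch below reproduces them in plain Python. It is an inference from this example's output, not LaPros's own code, and `residual_scores` is a hypothetical helper name.

```python
# Data copied from the example above.
labels = ["cat", "dog", "dog", "cat", "cat"]
probas = {
    "cat": [0.5, 0.4, 0.3, 0.2, 0.1],
    "dog": [0.5, 0.6, 0.7, 0.8, 0.9],
}

def residual_scores(probas, labels):
    """Assumed reading of rank_method="residual": 1 - P(observed label)."""
    return [1.0 - probas[label][i] for i, label in enumerate(labels)]

scores = residual_scores(probas, labels)

# Rows 1, 3 and 4 are the ones LaPros returned above.
for i in (1, 3, 4):
    print(i, round(scores[i], 6))
# 1 0.4
# 3 0.8
# 4 0.9
```

Rows 0 and 2 also have a nonzero residual under this reading (0.5 and 0.3), yet they are absent from the output, so the choice of which rows count as suspects appears to be made separately from the score.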
set_logger("INFO")
confidence = suspect(
    probas,
    labels=labels,
    rank_method="confidence",
    return_non_errors=False,
)
lapros.classification.estimate_noise.avg_confidence:36 [0.26666667 0.65      ]
         err
id
1   0.183333
3   0.216667
4   0.416667
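
The logged `avg_confidence` values and the confidence scores above can be reproduced with simple arithmetic. The sketch below is a reconstruction from this two-class example only, not LaPros's implementation; the helper names and the exact flagging rule are assumptions. It takes each class's threshold to be the mean probability of that class over the rows labelled with it. It flags a row when its own label falls below its threshold while the other class reaches its threshold, and scores the row by the sum of the two gaps.

```python
# Data copied from the example above.
labels = ["cat", "dog", "dog", "cat", "cat"]
probas = {
    "cat": [0.5, 0.4, 0.3, 0.2, 0.1],
    "dog": [0.5, 0.6, 0.7, 0.8, 0.9],
}

def class_thresholds(probas, labels):
    """Mean probability of each class over the rows labelled with that class.

    Assumes every class appears at least once in `labels`.
    """
    thresholds = {}
    for cls, column in probas.items():
        own = [p for p, label in zip(column, labels) if label == cls]
        thresholds[cls] = sum(own) / len(own)
    return thresholds

def confidence_scores(probas, labels):
    """Assumed two-class reading of rank_method="confidence"."""
    thresholds = class_thresholds(probas, labels)
    scores = []
    for i, label in enumerate(labels):
        # Margin of every class over its own threshold for this row.
        margins = {cls: probas[cls][i] - thresholds[cls] for cls in probas}
        own = margins[label]
        other = max(m for cls, m in margins.items() if cls != label)
        # Suspect: the given label is under its threshold, another class is not.
        scores.append(other - own if own < 0 <= other else 0.0)
    return scores

thresholds = class_thresholds(probas, labels)
scores = confidence_scores(probas, labels)
print({cls: round(t, 6) for cls, t in thresholds.items()})
# {'cat': 0.266667, 'dog': 0.65}
print([round(s, 6) for s in scores])
# [0.0, 0.183333, 0.0, 0.216667, 0.416667]
```

How the library handles more than two classes cannot be read off this example, so the `max` over the other classes is only one plausible generalization.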
probas.assign(labels=labels, residual=residual, confidence=confidence)
   cat  dog labels  residual  confidence
0  0.5  0.5    cat       NaN         NaN
1  0.4  0.6    dog       0.4    0.183333
2  0.3  0.7    dog       NaN         NaN
3  0.2  0.8    cat       0.8    0.216667
4  0.1  0.9    cat       0.9    0.416667
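
Since larger values mean more suspicious labels, a natural next step is to review the flagged rows from the highest score down. The sketch below does this in plain Python on the values from the table above. `review_order` is a hypothetical helper, not part of LaPros.

```python
# Flagged rows copied from the combined table above.
flagged = [
    {"id": 1, "label": "dog", "residual": 0.4, "confidence": 0.183333},
    {"id": 3, "label": "cat", "residual": 0.8, "confidence": 0.216667},
    {"id": 4, "label": "cat", "residual": 0.9, "confidence": 0.416667},
]

def review_order(rows, score):
    """Return row ids from most to least suspicious under the given score."""
    ranked = sorted(rows, key=lambda row: row[score], reverse=True)
    return [row["id"] for row in ranked]

print(review_order(flagged, "confidence"))  # [4, 3, 1]
print(review_order(flagged, "residual"))    # [4, 3, 1]
```

On this small example both ranking methods agree on the order. Nothing here suggests that they agree in general.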

docstring


suspect

Rank the suspicious labels given probas from a classifier. Accepts NumPy arrays, Pandas dataframes and series. Labels can be integers, strings, or even floats, provided that the probability matrix's columns are indexed by the same label set.

Args

  • probas (n x m matrix): probabilities for possible classes.

KwArgs

  • labels (n x 1 vector): observed class labels
  • rank_method (str): residual or confidence
  • return_non_errors (bool, default = True): return all rows, including non-errors

Returns

A Pandas DataFrame with one index and up to two columns:

  • id (int): the index, identical to the row index of the original data
  • err (float): the magnitude of suspiciousness, in the range [0, 1]
  • suspected (bool): whether the data row is suspected of having a label error. This column is returned iff return_non_errors=True.
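
The docstring requires the probability columns to be indexed by the same label set as the observed labels. A quick check before calling `suspect` could look like the sketch below. `check_label_set` is a hypothetical helper, not part of LaPros, and it only tests that every observed label has a probability column. Whether the library also requires every column to occur among the labels is not stated in the source.

```python
def check_label_set(class_names, labels):
    """Hypothetical pre-check: every observed label must name a probability column."""
    missing = sorted(set(labels) - set(class_names), key=str)
    if missing:
        raise ValueError(f"labels without a probability column: {missing}")

# The example data from this page passes silently.
check_label_set(["cat", "dog"], ["cat", "dog", "dog", "cat", "cat"])

# A label with no matching column is reported.
try:
    check_label_set(["cat", "dog"], ["cat", "bird"])
except ValueError as error:
    print(error)  # labels without a probability column: ['bird']
```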
help(suspect)
Help on function suspect in module lapros.api:

suspect(...)
    Rank the suspicious labels given probas from a classifier.
    Accept Numpy arrays, Pandas dataframes and series.
    We can use integer, string or even float labels, given that
    the probability matrix's columns are indexed by the same label set.
    
    #### Args
    
    - probas (n x m matrix): probabilities for possible classes.
    
    #### KwArgs
    
    - labels (n x 1 vector): observed class labels
    - rank_method (str): `residual` or `confidence`
    - return_non_errors (bool, default = True): return all rows, including non-errors
    
    #### Returns
    
    A Pandas DataFrame with one index and up to two columns:
    
    - id (int): the index, identical to the row index of the original data
    - err (float): the magnitude of suspiciousness, in the range [0, 1]
    - suspected (bool): whether the data row is suspected of having a label error. This column is returned iff return_non_errors=True.
