lapros data for better AI
LaPros
Install
pip install -U lapros
How to use
LaPros works with classifiers. It ranks suspicious labels given the class probabilities predicted by a classification model. Inputs can be plain Python lists, NumPy arrays, or Pandas data. Results are returned as a NumPy array or a Pandas series; the larger the value, the more suspicious the corresponding label.
import pandas as pd
import lapros
from lapros import suspect
assert lapros.__version__ == '0.3'
labels = pd.Series(["cat", "dog", "dog", "cat", "cat"])
0 cat
1 dog
2 dog
3 cat
4 cat
dtype: object
probas = pd.DataFrame(dict(
cat=[0.5, 0.4, 0.3, 0.2, 0.1],
dog=[0.5, 0.6, 0.7, 0.8, 0.9],
))
| | cat | dog |
|---|---|---|
| 0 | 0.5 | 0.5 |
| 1 | 0.4 | 0.6 |
| 2 | 0.3 | 0.7 |
| 3 | 0.2 | 0.8 |
| 4 | 0.1 | 0.9 |
suspect(
probas,
labels=labels,
)
lapros.classification.estimate_noise.avg_confidence:36 [0.26666667 0.65 ]
| | err | suspected |
|---|---|---|
| 0 | 0.000000 | False |
| 1 | 0.183333 | True |
| 2 | 0.000000 | False |
| 3 | 0.216667 | True |
| 4 | 0.416667 | True |
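The pair `[0.26666667 0.65]` in the log line above matches the mean probability that each class receives over the rows carrying that label. The sketch below reproduces those two numbers in plain Python. It assumes that reading of the log line; the helper name `avg_confidence_per_class` is illustrative and not part of the LaPros API.

```python
from collections import defaultdict

labels = ["cat", "dog", "dog", "cat", "cat"]
probas = {
    "cat": [0.5, 0.4, 0.3, 0.2, 0.1],
    "dog": [0.5, 0.6, 0.7, 0.8, 0.9],
}

def avg_confidence_per_class(probas, labels):
    """Mean probability assigned to each class over the rows labelled with it.

    Assumption: this is only a reading of the logged values, not the
    library's own code.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row, label in enumerate(labels):
        totals[label] += probas[label][row]
        counts[label] += 1
    return {label: totals[label] / counts[label] for label in totals}

thresholds = avg_confidence_per_class(probas, labels)
print({label: round(value, 8) for label, value in thresholds.items()})
```

For "cat" this averages 0.5, 0.2 and 0.1 (rows 0, 3, 4); for "dog" it averages 0.6 and 0.7 (rows 1, 2).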
residual = suspect(
probas,
labels=labels,
rank_method="residual",
return_non_errors=False,
)
lapros.classification.estimate_noise.avg_confidence:36 [0.26666667 0.65 ]
| | err |
|---|---|
| 1 | 0.4 |
| 3 | 0.8 |
| 4 | 0.9 |
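In this example, the `residual` scores equal one minus the probability the model assigns to the observed label (row 1 is labelled "dog" with probability 0.6, giving 0.4). A minimal sketch under that assumption, inferred from the table above rather than taken from the LaPros source:

```python
labels = ["cat", "dog", "dog", "cat", "cat"]
probas = {
    "cat": [0.5, 0.4, 0.3, 0.2, 0.1],
    "dog": [0.5, 0.6, 0.7, 0.8, 0.9],
}

# Assumption: residual = 1 - P(observed label); this matches rows 1, 3 and 4 above.
residual = {
    row: round(1 - probas[label][row], 6)
    for row, label in enumerate(labels)
}

# Keep only the rows that the earlier call flagged as suspected.
suspected_rows = [1, 3, 4]
print({row: residual[row] for row in suspected_rows})
```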
set_logger("INFO")
confidence = suspect(
probas,
labels=labels,
rank_method="confidence",
return_non_errors=False,
)
lapros.classification.estimate_noise.avg_confidence:36 [0.26666667 0.65 ]
| id | err |
|---|---|
| 1 | 0.183333 |
| 3 | 0.216667 |
| 4 | 0.416667 |
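One reading that reproduces the `confidence` scores in this example: add how far the observed label falls below its class threshold to how far each other class rises above its own threshold, then clip negative totals to zero. This is an inference from the printed numbers, not a statement of how LaPros computes them; `margin_score` is a hypothetical helper, and the thresholds are copied from the log line.

```python
labels = ["cat", "dog", "dog", "cat", "cat"]
probas = {
    "cat": [0.5, 0.4, 0.3, 0.2, 0.1],
    "dog": [0.5, 0.6, 0.7, 0.8, 0.9],
}
# Per-class thresholds, copied from the log line in the example above.
thresholds = {"cat": 0.26666667, "dog": 0.65}

def margin_score(row):
    """Hypothetical score: shortfall of the observed label plus excess of the others."""
    observed = labels[row]
    shortfall = thresholds[observed] - probas[observed][row]
    excess = sum(
        probas[cls][row] - thresholds[cls] for cls in probas if cls != observed
    )
    # Negative totals mean the row looks consistent with its label.
    return max(0.0, shortfall + excess)

scores = {row: round(margin_score(row), 6) for row in range(len(labels))}
print(scores)
```

Under this assumption, rows 0 and 2 score zero and rows 1, 3 and 4 receive the values shown in the table, which also agrees with the `err` column of the first `suspect` call.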
probas.assign(labels=labels, residual=residual, confidence=confidence)
| | cat | dog | labels | residual | confidence |
|---|---|---|---|---|---|
| 0 | 0.5 | 0.5 | cat | NaN | NaN |
| 1 | 0.4 | 0.6 | dog | 0.4 | 0.183333 |
| 2 | 0.3 | 0.7 | dog | NaN | NaN |
| 3 | 0.2 | 0.8 | cat | 0.8 | 0.216667 |
| 4 | 0.1 | 0.9 | cat | 0.9 | 0.416667 |
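Because larger values mean more suspicious labels, a natural next step is to review rows in descending score order. A small sketch using the `confidence` values from the table above, written in plain Python so it does not depend on any particular return type:

```python
# Confidence scores of the suspected rows, copied from the table above.
err = {1: 0.183333, 3: 0.216667, 4: 0.416667}

# Most suspicious first: sort row ids by score, descending.
review_order = sorted(err, key=err.get, reverse=True)
print(review_order)  # [4, 3, 1]
```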
docstring
suspect
Rank the suspicious labels given probas from a classifier. Accepts NumPy arrays, Pandas dataframes and series. Labels can be integers, strings, or even floats, provided that the probability matrix's columns are indexed by the same label set.
Args
- probas (n x m matrix): probabilities for possible classes.
KwArgs
- labels (n x 1 vector): observed class labels
- rank_method (str): residual or confidence
- return_non_errors (bool, default = True): return all rows, including non-errors
Returns
a Pandas DataFrame including 1 index and 2 columns:
- id (int): the index, which is the same as the original data row index
- err (float): the magnitude of suspiciousness, in the range [0, 1]
- suspected (bool): whether the data row is suspected of having a label error. This column is returned iff return_non_errors=True.
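Since the labels must come from the same set that indexes the probability columns, it can help to check this before calling `suspect`. The helper below is a hypothetical pre-check, not part of LaPros; it only compares the two sets.

```python
def missing_label_columns(columns, labels):
    """Return observed labels that have no matching probability column.

    Hypothetical helper for illustration; LaPros may perform its own checks.
    """
    known = set(columns)
    return sorted({label for label in labels if label not in known}, key=str)

# All labels are covered by the columns: nothing is missing.
ok = missing_label_columns(["cat", "dog"], ["cat", "dog", "dog", "cat", "cat"])

# "bird" has no probability column, so it is reported.
bad = missing_label_columns(["cat", "dog"], ["cat", "bird", "dog"])
print(ok, bad)  # [] ['bird']
```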
help(suspect)
Help on function suspect in module lapros.api:
suspect(...)
Rank the suspicious labels given probas from a classifier.
Accept Numpy arrays, Pandas dataframes and series.
We can use interger, string or even float labels, given that
the probability matrix's columns are indexed by the same label set.
#### Args
- probas (n x m matrix): probabilites for possible classes.
#### KwArgs
- labels (n x 1 vector): observed class labels
- rank_method (str): `residual` or `confidence`
- return_non_errors (bool, default = True): return all rows, including non-errors
#### Returns
a Pandas DataFrame including 1 index and 2 columns:
- id (int): the index which is the same to the original data row index
- err (float): the magnitude of suspiciousness, valued between [0, 1]
- suspected (bool): whether the data row is suspected as having a label error. This collum is returned iff return_non_errors=True.
File details
Details for the file lapros-0.3.tar.gz.
File metadata
- Download URL: lapros-0.3.tar.gz
- Upload date:
- Size: 15.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.8.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 383567e6ca4056b1f4dcc66784cbeada9aa7bcf50488920ab0e488808c84f213 |
| MD5 | 3c6bdc8fdc37cf2560c0c5e6f91909d0 |
| BLAKE2b-256 | 4de1cfd820b9190d14ef94d3cc02edfdc35d733af6483d595a00de74ea0b8d01 |
File details
Details for the file lapros-0.3-py3-none-any.whl.
File metadata
- Download URL: lapros-0.3-py3-none-any.whl
- Upload date:
- Size: 17.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.8.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 01ea80f8f1f698482c39fa8c685cadf31f537af4f6fb8b6462fe4864a6fdabcb |
| MD5 | bd24a51dd23a285ff18891e8cc2d396b |
| BLAKE2b-256 | 5b9c2812df07e08ea9d712996dd3cbe5c1270d729f71f47a0c621eb24ea5c71b |