Scoring and metrics app

Description

This app improves the efficiency of data annotation by improving annotation quality while reducing the time required to produce annotations.

The components of this app are:

  1. Functions to calculate scores for quality tasks and model predictions.
  2. Custom nodes that can be added to pipelines to calculate scores when a task's quality items are completed.

Also check this notebook for more information.

Quality Flows

To learn more about each of these task types, refer to the linked Dataloop documentation:

  1. Qualification tasks
  2. Honeypot tasks
  3. Consensus tasks

In general, an annotator will receive an assignment to complete their annotation task. For a given item in a consensus task, each assignment will be cross-compared with every other assignment. In the case of qualification and honeypot tasks, each item will only have one assignment associated with it.
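
For example, in a consensus task with three assignments on one item, the pairwise cross-comparisons can be enumerated as below (a minimal sketch in plain Python; the assignment names are placeholders, not real IDs):

import itertools

# Placeholder assignment identifiers for one consensus item.
assignments = ["assignment_A", "assignment_B", "assignment_C"]

# Every assignment is cross-compared with every other assignment exactly once.
for first, second in itertools.combinations(assignments, 2):
    print(f"compare annotations of {first} <-> {second}")
# compare annotations of assignment_A <-> assignment_B
# compare annotations of assignment_A <-> assignment_C
# compare annotations of assignment_B <-> assignment_C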

What's Supported?

Supported file types:

  • image
  • video

Scoring is currently supported for quality tasks with the following annotation types (geometry score method in parentheses, where applicable); a minimal geometry-score sketch follows the list:

  • classification
  • bounding box (IOU)
  • polygon (IOU)
  • segmentation (IOU)
  • point (distance)
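
As an illustration of the geometry scores named above, here is a small self-contained sketch. It assumes boxes given as [x_min, y_min, x_max, y_max] and an arbitrary distance cutoff (max_distance) for points; this is not the library's internal implementation:

import math


def box_iou(box_a, box_b):
    # Intersection-over-union of two boxes given as [x_min, y_min, x_max, y_max].
    ix_min, iy_min = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix_max, iy_max = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def point_distance_score(point_a, point_b, max_distance=100.0):
    # Distance between two (x, y) points mapped to [0, 1]; max_distance is an assumed cutoff.
    return max(0.0, 1.0 - math.dist(point_a, point_b) / max_distance)


print(box_iou([0, 0, 10, 10], [5, 5, 15, 15]))    # 25 / 175, roughly 0.143
print(point_distance_score((0, 0), (30, 40)))     # distance 50 -> 0.5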

Score Types

During scoring, the following score types are created:

  • raw_annotation_scores - the geometry, label, and attribute matching scores for each annotation comparison
  • annotation_overall - the mean of each annotation’s raw scores
  • user_confusion_score - the mean of an assignee’s annotation overall scores, relative to the reference or another assignee
  • item_confusion_score - the count of each (assignee label, reference label) pair for the item’s annotations (the label_confusion score described below)
  • item_overall_score - the mean of the annotation overall scores associated with an item

1) Raw annotation scores

There are three types of raw annotation scores: annotation_iou, annotation_label, and annotation_attribute. The user can choose which of these are calculated; by default, all three are included, and each has a default value of 1 (which can be modified).

2) Annotation overall

The annotation_overall score is the mean of all raw annotation scores for a given annotation.
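
As a concrete sketch of sections 1 and 2 (illustrative values only, not the library's internal code):

from statistics import mean

# Raw scores for one annotation comparison: geometry (IoU), label match, attribute match.
# The values are illustrative; by default each score starts at 1 and is overwritten
# by the corresponding comparison result.
raw_annotation_scores = {
    "annotation_iou": 0.82,
    "annotation_label": 1.0,     # labels match
    "annotation_attribute": 0.5  # one of two attributes matches (illustrative)
}

# annotation_overall is the mean of that annotation's raw scores.
annotation_overall = mean(raw_annotation_scores.values())
print(round(annotation_overall, 3))  # 0.773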

3) User confusion score

The user_confusion score is the mean annotation overall score of a given assignee relative to another set of annotations (either the reference or another assignee).
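
A minimal sketch of this aggregation, with illustrative values and placeholder assignee names:

from statistics import mean

# annotation_overall scores of one assignee's annotations, grouped by who they were compared against.
overall_scores_vs = {
    "reference": [1.0, 0.75, 0.5],
    "assignee_2": [0.5, 0.25, 0.75],
}

# user_confusion is the mean annotation_overall score relative to each counterpart.
user_confusion = {relative: mean(scores) for relative, scores in overall_scores_vs.items()}
print(user_confusion)  # {'reference': 0.75, 'assignee_2': 0.5}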

4) Label confusion score

The label_confusion score counts, for each label annotated by a given assignee, how many times it was matched with each label class in the other set of annotations (either the reference or another assignee).
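
A minimal sketch of the counting, using (assignee label, reference label) pairs that reproduce the cat/dog example shown below:

from collections import Counter

# (assignee_label, reference_label) pairs for the matched annotations of one item.
matched_label_pairs = [
    ("cat", "cat"),
    ("dog", "dog"), ("dog", "dog"), ("dog", "dog"),
    ("dog", "cat"), ("dog", "cat"),
]

# label_confusion counts how often each assignee label co-occurs with each reference label.
label_confusion = Counter(matched_label_pairs)
for (assignee_label, reference_label), count in label_confusion.items():
    print(f"entityId={assignee_label!r} relative={reference_label!r} value={count}")
# entityId='cat' relative='cat' value=1
# entityId='dog' relative='dog' value=3
# entityId='dog' relative='cat' value=2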

5) Item overall score

The item_overall score is the mean of the annotation overall scores of all annotations associated with an item.

Any calculated and uploaded scores will replace any previous scores for all items of a given task.

Note about videos: Video scores differ slightly from image scores. Video scores are calculated frame by frame, and each annotation's score is the average of these per-frame scores across all relevant frames for that annotation. Confusion scores are not calculated due to the multi-frame nature of videos. The item overall score remains an average over all annotations of the video item.
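
A minimal sketch of the per-frame averaging for a single video annotation, with illustrative frame scores:

from statistics import mean

# Per-frame annotation_overall scores for one video annotation (frame number -> score).
frame_scores = {0: 1.0, 1: 0.75, 2: 0.5, 3: 0.75}

# The annotation's score is the average over all frames in which it was compared.
video_annotation_score = mean(frame_scores.values())
print(video_annotation_score)  # 0.75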

Confusion Example

There are generally two kinds of scores: regular scores, and “confusion” scores.

Regular scores show the level of agreement or overlap between two sets of annotations. They use the IDs of the entities being compared for the entityId and relative fields, whether the comparison is between annotations or items, and value is typically a number between 0 and 1.
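
For illustration only, a regular annotation score might look like the following Python dict, modeled on the confusion payloads shown below (the exact fields stored by the platform may differ):

# Illustrative only: a regular annotation score, modeled on the confusion payloads below.
# Values in angle brackets are placeholders.
regular_score = {
    "type": "annotation_iou",
    "value": 0.82,
    "entityId": "<ANNOTATION_ID>",          # the annotation being scored
    "context": {
        "relative": "<REF_ANNOTATION_ID>",  # the annotation it was compared against
        "taskId": "<TASK_ID>",
        "itemId": "<ITEM_ID>",
        "datasetId": "<DATASET_ID>",
    },
}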

There are two types of confusion scores: item label confusion, and user confusion. Item label confusion shows the number of instances in which an assignee’s label corresponds with the ground truth labels.

Ground truth annotations:

[Image: ground truth cat and dog annotations]

import dtlpy as dl

item = dl.items.get(item_id='64c0fc0730b03f27ca3a58db')

Assignee annotations:

[Image: assignee cat and dog annotations]

item = dl.items.get(item_id='64c0f2e1ec9103d52eaedbe2')

In this example item, the ground truth has 3 annotations for each of the cat and dog classes. The assignee, however, labels 1 as cat and 5 as dog. This results in the following item label confusion scores:

{
        "type": "label_confusion",
        "value": 1,
        "entityId": "cat",
        "context": {
            "relative": "cat",
            "taskId": "<TASK_ID>",
            "itemId": "<ITEM_ID">,
            "datasetId": "<DATASET_ID>"
        }
},
{
        "type": "label_confusion",
        "value": 3,
        "entityId": "dog",
        "context": {
            "relative": "dog",
            "taskId": "<TASK_ID>",
            "itemId": "<ITEM_ID">,
            "datasetId": "<DATASET_ID>"
        }
},
{
        "type": "label_confusion",
        "value": 2,
        "entityId": "dog",
        "context": {
            "relative": "cat",
            "taskId": "<TASK_ID>",
            "itemId": "<ITEM_ID">,
            "datasetId": "<DATASET_ID>"
        }
}

Python installation

pip install dtlpymetrics
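
After installation, scoring is typically run against a quality task. The snippet below is only a sketch: calculate_task_score is a hypothetical name standing in for the scoring functions documented on the functions page linked below.

import dtlpy as dl
import dtlpymetrics  # confirms the package imports after installation

# Fetch the quality task to score (standard dtlpy SDK call; requires prior dl.login()).
task = dl.tasks.get(task_id='<TASK_ID>')

# Hypothetical entry point; see the functions page below for the real function names:
# scores = dtlpymetrics.calculate_task_score(task=task)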

Functions

See this page for details on additional functions.

Contributions, Bugs and Issues - How to Contribute

We welcome anyone to help us improve this app.
Here are detailed instructions to help you open a bug report or ask for a feature request.
