Automatic image annotation library

Auto Annotator

An extendable tool for automatic annotation of image data by a combination of deep neural networks.

The primary objective of this annotator is to prioritize the accuracy and quality of predictions over speed. The autoannotator has been specifically designed to surpass the precision offered by most publicly available tools. It leverages ensembles of deep neural models to ensure the utmost quality in its predictions. It is important to note that neural networks trained on clean datasets tend to yield superior results compared to those trained on larger but noisier datasets.

Supported tasks

Human Face Detection and Recognition

  • Face and landmarks detection
  • Face alignment via keypoints
  • Face descriptor extraction
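
These three steps form the typical pipeline: detect faces and landmarks, warp each face to a canonical template, then extract a descriptor. The alignment step is a plain least-squares similarity fit. Below is a minimal NumPy sketch of that step using Umeyama's method; the 5-point template coordinates are illustrative, not the library's actual constants:

```python
import numpy as np

def estimate_similarity(src: np.ndarray, dst: np.ndarray) -> np.ndarray:
    """Least-squares similarity transform (Umeyama) mapping src -> dst.

    Returns a 2x3 matrix M so that dst ~= src @ M[:, :2].T + M[:, 2].
    """
    src_mean, dst_mean = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - src_mean, dst - dst_mean
    cov = dst_c.T @ src_c / len(src)              # cross-covariance
    U, S, Vt = np.linalg.svd(cov)
    d = np.sign(np.linalg.det(U) * np.linalg.det(Vt))
    D = np.diag([1.0, d])                         # guard against reflections
    R = U @ D @ Vt
    scale = (S * np.diag(D)).sum() / (src_c ** 2).sum() * len(src)
    t = dst_mean - scale * R @ src_mean
    return np.hstack([scale * R, t[:, None]])

# A 5-point landmark template (illustrative coordinates for a 112x112 crop)
TEMPLATE = np.array([
    [38.3, 51.7], [73.5, 51.5],   # eye centers
    [56.0, 71.7],                 # nose tip
    [41.5, 92.4], [70.7, 92.2],   # mouth corners
])

# Example: landmarks "detected" at a known rotation, scale, and offset
theta = 0.1
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
detected = 1.2 * TEMPLATE @ R_true.T + np.array([5.0, -3.0])
M = estimate_similarity(detected, TEMPLATE)       # warp detected -> canonical
```

The resulting 2x3 matrix can be fed to any affine warp (e.g. `cv2.warpAffine`) to produce the aligned crop that the descriptor network expects.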

Human Body Detection

  • UniHCP
  • IterDETR (Progressive DETR)
  • RTDETR, ResNet-101
  • InternImage-XL

Other

  • DBSCAN clustering
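
DBSCAN fits this workflow because it does not require the number of identities up front and marks low-density points as noise. Here is a self-contained sketch using scikit-learn on synthetic L2-normalized descriptors; the `eps` and `min_samples` values are illustrative, not the library's defaults:

```python
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)

# Synthetic "face descriptors": two identities (tight groups on the unit
# sphere) plus one unrelated outlier. Real descriptors would come from the
# face embedding model.
id_a = np.eye(128)[0] + rng.normal(0, 0.01, (5, 128))
id_b = np.eye(128)[1] + rng.normal(0, 0.01, (5, 128))
outlier = rng.normal(0, 1, (1, 128))
X = np.vstack([id_a, id_b, outlier])
X /= np.linalg.norm(X, axis=1, keepdims=True)     # L2-normalize

# Cosine distance groups same-identity descriptors; isolated points get -1
labels = DBSCAN(eps=0.3, min_samples=3, metric="cosine").fit_predict(X)
```

With these settings the two identity groups come out as clusters 0 and 1, and the outlier is labeled -1 (noise), which is exactly the behavior you want when unknown faces appear in the data.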

📊 Benchmarks

Face Recognition

Speed

| Task | Hardware | Time, s |
|------|----------|---------|
| Face detection + landmarks + extraction | Xeon E5-2678 v3 | ~1 |

Human Detection

Quality

AP@50 of an RT-DETR (ResNet-18) model trained on CUHK-SYSU with different markups:

| Test set | Original markup | Markup via DDQ-DETR | Markup via ensemble |
|----------|-----------------|---------------------|---------------------|
| CrowdHuman | 52.30 | 76.97 | 77.31 |
| WiderPerson | 54.66 | 60.89 | 63.71 |
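
To illustrate why an ensemble markup can beat any single model, here is a generic consensus-fusion sketch (not the library's actual fusion code, whose details are described in the project documentation): boxes from several detectors are grouped by IoU, clusters supported by too few models are discarded, and the survivors are averaged with score weighting.

```python
import numpy as np

def iou(a, b):
    """IoU of two [x1, y1, x2, y2] boxes."""
    x1, y1 = np.maximum(a[:2], b[:2])
    x2, y2 = np.minimum(a[2:], b[2:])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def fuse(model_boxes, iou_thr=0.5, min_votes=2):
    """model_boxes: one (N_i, 5) array of [x1, y1, x2, y2, score] per model."""
    flat = [(m, box) for m, boxes in enumerate(model_boxes) for box in boxes]
    flat.sort(key=lambda t: -t[1][4])                 # highest score first
    clusters = []
    for m, box in flat:                               # greedy IoU grouping
        for c in clusters:
            if iou(c["ref"][:4], box[:4]) >= iou_thr:
                c["members"].append((m, box))
                break
        else:
            clusters.append({"ref": box, "members": [(m, box)]})
    fused = []
    for c in clusters:
        if len({m for m, _ in c["members"]}) < min_votes:
            continue                                   # too few models agree
        boxes = np.array([b for _, b in c["members"]])
        w = boxes[:, 4:5]                              # score-weighted average
        fused.append(np.append((boxes[:, :4] * w).sum(0) / w.sum(),
                               boxes[:, 4].mean()))
    return np.array(fused)

preds = [
    np.array([[0, 0, 10, 10, 0.90], [50, 50, 60, 60, 0.80]]),  # model A
    np.array([[1, 1, 11, 11, 0.85]]),                          # model B
]
fused = fuse(preds, iou_thr=0.5, min_votes=2)
```

Here the two overlapping detections are merged into one consensus box, while the box only model A produced is rejected, which is how cross-model voting suppresses single-model false positives.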

🏗 Installation

PIP package

pip install autoannotator

🚲 Getting started

Face recognition example

Check out our demo face recognition pipeline at: examples/face_recognition_example.py

[Optional] Run frontend and backend

git clone https://github.com/CatsWhoTrain/autoannotator_client
cd autoannotator_client
docker compose up

The web interface will then be available locally at: http://localhost:8080/

Human full-body detection

A detailed description is given in a separate document

Human detection example

Check out our demo human detection pipeline at: examples/human_detection_example.py

FAQ

Do companies and engineers actually need this tool?

We have asked engineers in the field of video analytics whether they are interested in such a library. Their responses were:

  • IREX: would use this library and contribute to it.
  • NapoleonIT: would use this library and contribute to it.
  • Linza Metrics: would use this library.

What are the reasons for choosing this data labeling tool over the alternative of employing human annotators?

Human accuracy is not so good

Some time ago, Andrej Karpathy observed that his own accuracy was only 94% when he tried to label just 400 images of the CIFAR-10 dataset, while the SOTA "Efficient adaptive ensembling for image classification" (August 29, 2023) achieves >99.6% accuracy. When expert labelers had to choose from ~100 labels while annotating ImageNet, their error rate rose to 13-15%.

Andrej's error rate was determined to be 5.1%, and he initially spent approximately one minute labeling a single image. By contrast, Florence or newer models can deliver a top-5 error rate of less than 1% on the same task.

Industry case: human face classification.

A certain undisclosed company, bound by a non-disclosure agreement (NDA), pre-processed face images captured under challenging environmental conditions by applying a face recognition network together with the DBSCAN algorithm to split the images into distinct individuals. Human annotators then validated the pre-processed data, and their work was inspected by their team leader. Ultimately, an ML engineer determined that 1.4% of the clustered face images were still mislabeled.

🏰 Legacy

This repository takes ideas and some code from the following projects:

✒ Legal

All individuals depicted in the "assets" images have given explicit consent for their photos to be used to showcase this library's work. Please refrain from using these images in your own projects.
