
Globox is a package and command line interface to read and convert object detection databases (COCO, YOLO, PascalVOC, LabelMe, CVAT, OpenImage, ...) and evaluate them with COCO and PascalVOC.


Globox — Object Detection Toolbox

This framework can:

  • parse all kinds of object detection datasets (ImageNet, COCO, YOLO, PascalVOC, OpenImage, CVAT, LabelMe, etc.) and show statistics,
  • convert them to other formats (ImageNet, COCO, YOLO, PascalVOC, OpenImage, CVAT, LabelMe, etc.),
  • and evaluate predictions using standard object detection metrics such as AP@[.5:.05:.95], AP@50, mAP, AR1, AR10, AR100.

This framework can be used both as a library in your own code and as a command line tool. This tool is designed to be simple to use, fast and correct.

Quick Start

Install

You can install the package using pip:

pip install globox

Use as a library

Parse Annotations

The library has three main components:

  • BoundingBox: represents a bounding box with a label and an optional confidence score
  • Annotation: represents the bounding box annotations for one image
  • AnnotationSet: represents annotations for a set of images (a database)
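
To make the relationship between the three components concrete, here is a minimal stand-in written with dataclasses. The names `SketchBox`, `SketchAnnotation` and all field names are hypothetical and deliberately simplified; this is not Globox's actual implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SketchBox:
    """Hypothetical bounding box: a label, corner coordinates, optional confidence."""
    label: str
    xmin: float
    ymin: float
    xmax: float
    ymax: float
    confidence: Optional[float] = None  # None is used here to mean "ground truth"

    @property
    def area(self) -> float:
        return max(0.0, self.xmax - self.xmin) * max(0.0, self.ymax - self.ymin)


@dataclass
class SketchAnnotation:
    """Hypothetical per-image container: one image id and its boxes."""
    image_id: str
    boxes: List[SketchBox] = field(default_factory=list)


# A "database" is then a mapping from image id to that image's annotation.
database: Dict[str, SketchAnnotation] = {}
annotation = SketchAnnotation("image_123.jpg", [SketchBox("cat", 10, 20, 110, 70)])
database[annotation.image_id] = annotation
```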

The AnnotationSet class contains static methods to read different databases:

# COCO
coco = AnnotationSet.from_coco(file_path="path/to/file.json")

# YOLOv5
yolo = AnnotationSet.from_yolo_v5(
    folder="path/to/files/",
    image_folder="path/to/images/"
)

# Pascal VOC
pascal = AnnotationSet.from_pascal_voc(folder="path/to/files/")

Annotation offers file-level granularity for compatible datasets:

annotation = Annotation.from_labelme(file_path="path/to/file.xml")

For more specific implementations the BoundingBox class contains lots of utilities to parse bounding boxes in different formats, like the create() method.
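
Box formats differ mainly in how the four numbers are interpreted. The helper below is a hypothetical sketch (the function name and the format strings are assumptions for illustration, not Globox's API) that normalizes three common layouts to corner coordinates.

```python
def to_ltrb(coords, box_format):
    """Convert four numbers to (xmin, ymin, xmax, ymax).

    The format names are illustrative:
      "ltrb": left, top, right, bottom
      "ltwh": left, top, width, height
      "xywh": center x, center y, width, height
    """
    a, b, c, d = coords
    if box_format == "ltrb":
        return (a, b, c, d)
    if box_format == "ltwh":
        return (a, b, a + c, b + d)
    if box_format == "xywh":
        return (a - c / 2, b - d / 2, a + c / 2, b + d / 2)
    raise ValueError(f"unknown box format: {box_format!r}")
```

Converting everything to one internal layout at parse time keeps later steps such as area or overlap computations independent of the source format.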

AnnotationSets are set-like objects. They can be combined and annotations can be added:

gts = coco + yolo
gts.add(annotation)

Inspect Datasets

Iteration and efficient lookup by image_id are easy to use:

if annotation in gts:
    print("This annotation is present.")

if "image_123.jpg" in gts.image_ids:
    print("Annotation of image 'image_123.jpg' is present.")

for box in gts.all_boxes:
    print(box.label, box.area, box.is_ground_truth)

for annotation in gts:
    nb_boxes = len(annotation.boxes)
    print(f"{annotation.image_id}: {nb_boxes} boxes")

Database stats can be printed to the console:

coco_gts.show_stats()
         Database Stats         
┏━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ Label       ┃ Images ┃ Boxes ┃
┡━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ aeroplane   │     10 │    15 │
│ bicycle     │      7 │    14 │
│ bird        │      4 │     6 │
│ boat        │      7 │    11 │
│ bottle      │      9 │    13 │
│ bus         │      5 │     6 │
│ car         │      6 │    14 │
│ cat         │      4 │     5 │
│ chair       │      9 │    15 │
│ cow         │      6 │    14 │
│ diningtable │      7 │     7 │
│ dog         │      6 │     8 │
│ horse       │      7 │     7 │
│ motorbike   │      3 │     5 │
│ person      │     41 │    91 │
│ pottedplant │      6 │     7 │
│ sheep       │      4 │    10 │
│ sofa        │     10 │    10 │
│ train       │      5 │     6 │
│ tvmonitor   │      8 │     9 │
├─────────────┼────────┼───────┤
│ Total       │    100 │   273 │
└─────────────┴────────┴───────┘
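
The table reports, per label, how many images contain that label and how many boxes carry it. Note that the Total row gives the number of images (100), not the sum of the Images column, since one image can contain several labels. A minimal sketch of that counting, using plain dictionaries rather than Globox objects (the `dataset` variable and its contents are made up for illustration):

```python
from collections import Counter

# Hypothetical toy data: image id -> labels of the boxes in that image.
dataset = {
    "img_1.jpg": ["cat", "cat", "dog"],
    "img_2.jpg": ["dog"],
    "img_3.jpg": [],
}

boxes_per_label = Counter()
images_per_label = Counter()

for image_id, labels in dataset.items():
    boxes_per_label.update(labels)        # every box counts
    images_per_label.update(set(labels))  # each label counts once per image

for label in sorted(boxes_per_label):
    print(f"{label:<12}{images_per_label[label]:>7}{boxes_per_label[label]:>7}")

# Total images is the dataset size, total boxes is the column sum.
print(f"{'Total':<12}{len(dataset):>7}{sum(boxes_per_label.values()):>7}")
```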

Convert and Save to many Formats

Datasets can easily be converted to and saved in other formats:

# ImageNet
gts.save_imagenet(save_dir="pascalVOC_db/")

# YOLO Darknet
gts.save_yolo_darknet(
    save_dir="yolo_train/", 
    label_to_id={"cat": 0, "dog": 1, "racoon": 2}
)

# YOLOv5
gts.save_yolo_v5(
    save_dir="yolo_train/", 
    label_to_id={"cat": 0, "dog": 1, "racoon": 2},
)

# CVAT
gts.save_cvat(path="train.xml")
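
The `label_to_id` mapping in the YOLO examples is needed because YOLO text files store a numeric class id instead of a label name, followed by the box center and size normalized by the image dimensions. The helper below is a hypothetical illustration of that line format (the function name and arguments are assumptions, not Globox's code):

```python
def yolo_line(label, box, image_size, label_to_id):
    """Format one absolute (xmin, ymin, xmax, ymax) box as a YOLO text line."""
    xmin, ymin, xmax, ymax = box
    img_w, img_h = image_size
    # YOLO stores the box center and size as fractions of the image size.
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    class_id = label_to_id[label]  # raises KeyError for an unmapped label
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


line = yolo_line("dog", (100, 50, 300, 250), (400, 500), {"cat": 0, "dog": 1})
print(line)
```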

COCO Evaluation

Evaluating is as easy as:

evaluator = COCOEvaluator(
    ground_truths=gts, 
    predictions=dets
)

ap = evaluator.ap()
ar_100 = evaluator.ar_100()
ap_75 = evaluator.ap_75()
ap_small = evaluator.ap_small()
...

All COCO standard metrics can be displayed in a pretty printed table with:

evaluator.show_summary()

which outputs:

                              COCO Evaluation
┏━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━┳...┳━━━━━━━━┳━━━━━━━━┳━━━━━━━━┓
┃ Label     ┃ AP 50:95 ┃  AP 50 ┃   ┃   AR S ┃   AR M ┃   AR L ┃
┡━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━╇...╇━━━━━━━━╇━━━━━━━━╇━━━━━━━━┩
│ airplane  │    22.7% │  25.2% │   │   nan% │  90.0% │   0.0% │
│ apple     │    46.4% │  57.4% │   │  48.5% │   nan% │   nan% │
│ backpack  │    54.8% │  85.1% │   │ 100.0% │  72.0% │   0.0% │
│ banana    │    73.6% │  96.4% │   │   nan% │ 100.0% │  70.0% │
.           .          .        .   .        .        .        .
.           .          .        .   .        .        .        .
.           .          .        .   .        .        .        .
├───────────┼──────────┼────────┼...┼────────┼────────┼────────┤
│ Total     │    50.3% │  69.7% │   │  65.4% │  60.3% │  55.3% │
└───────────┴──────────┴────────┴...┴────────┴────────┴────────┘
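
The `AP 50:95` column corresponds to the AP@[.5:.05:.95] metric listed earlier: average precision averaged over the ten IoU thresholds from 0.50 to 0.95 in steps of 0.05. A small sketch of that averaging, where `ap_at` is a hypothetical callable returning the AP at a given threshold:

```python
def coco_iou_thresholds():
    """Return the ten IoU thresholds 0.50, 0.55, ..., 0.95."""
    return [round(0.5 + 0.05 * i, 2) for i in range(10)]


def mean_ap(ap_at):
    """Average a per-threshold AP function over the thresholds above."""
    thresholds = coco_iou_thresholds()
    return sum(ap_at(t) for t in thresholds) / len(thresholds)


# Toy AP curve for illustration only: AP drops linearly as the threshold rises.
print(mean_ap(lambda t: 1.0 - t))
```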

The array of results can be saved in CSV format:

evaluator.save_csv("where/to/save/results.csv")

Custom evaluations can be achieved with:

evaluation = evaluator.evaluate(
    iou_threshold=0.33,
    max_detections=1_000,
    size_range=(0.0, 10_000)
)

ap = evaluation.ap()
cat_ar = evaluation["cat"].ar
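
The `iou_threshold` parameter refers to the intersection over union (IoU) between a predicted box and a ground-truth box. A minimal, library-independent sketch of IoU for boxes given as `(xmin, ymin, xmax, ymax)` follows; the function name is illustrative and this is not Globox's implementation.

```python
def iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped to zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Two 10x10 boxes shifted by half their width overlap with IoU = 50 / 150.
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))
```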

Evaluations are cached by (iou_threshold, max_detections, size_range) keys, so repeated queries to the evaluator are fast and you do not need to worry about performance.
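
The caching idea can be illustrated with a plain dictionary keyed by the same triple. This is a sketch of the general technique, not Globox's internals; `CachedEvaluator` and `_compute` are hypothetical names.

```python
class CachedEvaluator:
    """Hypothetical evaluator that memoizes results per parameter triple."""

    def __init__(self):
        self._cache = {}
        self.computations = 0  # number of times the expensive path ran

    def _compute(self, iou_threshold, max_detections, size_range):
        # Placeholder for the expensive matching of detections to ground truths.
        self.computations += 1
        return {
            "iou_threshold": iou_threshold,
            "max_detections": max_detections,
            "size_range": size_range,
        }

    def evaluate(self, iou_threshold, max_detections, size_range):
        # The key must be hashable, so size_range is expected to be a tuple.
        key = (iou_threshold, max_detections, size_range)
        if key not in self._cache:
            self._cache[key] = self._compute(*key)
        return self._cache[key]


cached = CachedEvaluator()
first = cached.evaluate(0.33, 1_000, (0.0, 10_000))
second = cached.evaluate(0.33, 1_000, (0.0, 10_000))  # served from the cache
```

A dictionary lookup replaces the second computation, which is why asking for several metrics that share the same parameters costs little more than asking for one.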

Use in command line

Get a summary of annotations for one dataset:

globox summary /yolo/folder/ --format yolo

Convert annotations from one format to another one:

globox convert input/yolo/folder/ output_coco_file_path.json --format yolo --save_fmt coco

Evaluate a set of detections with COCO metrics:

globox evaluate groundtruths/ predictions.json --format yolo --format_dets coco

Show the help message for an exhaustive list of options:

globox summary -h
globox convert -h
globox evaluate -h

Tests

  1. Clone the repo with its test data:
git clone https://github.com/laclouis5/globox --recurse-submodules=tests/globox_test_data
  2. Install development dependencies (virtual env recommended):
pip install -e ".[dev]"
  3. Run tox:
tox

Speed


The speed benchmark can be run with:

python3 tests/benchmark.py

The speed test uses timeit with 5 iterations on an early 2015 MacBook Air (8 GB RAM, dual-core 1.6 GHz). The dataset is COCO 2017 Validation, which comprises 5k images and 36 781 bounding boxes.

Task     COCO   CVAT   OpenImage  LabelMe  PascalVOC  YOLO   TXT
Parsing  0.52s  0.59s  3.44s      1.84s    2.45s      3.01s  2.54s
Saving   1.12s  0.74s  0.42s      4.39s    4.46s      3.75s  3.52s

OpenImage, YOLO and TXT are slower to parse because they store relative bounding box coordinates and do not record the image size, which must then be read from the image file.
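
The dependency on the image size is easy to see in code: relative coordinates are fractions of the image width and height, so pixel values cannot be recovered without them. The helper below is a hypothetical illustration for a YOLO-style relative box given as center and size; it is not part of Globox.

```python
def relative_to_absolute(rel_box, image_size):
    """Convert a relative (x_center, y_center, width, height) box to pixels.

    Returns (xmin, ymin, xmax, ymax). image_size is (width, height) in pixels
    and has to come from somewhere else, typically the image file itself.
    """
    x_c, y_c, w, h = rel_box
    img_w, img_h = image_size
    xmin = (x_c - w / 2) * img_w
    ymin = (y_c - h / 2) * img_h
    xmax = (x_c + w / 2) * img_w
    ymax = (y_c + h / 2) * img_h
    return (xmin, ymin, xmax, ymax)


# The same relative box maps to different pixel boxes for different images.
print(relative_to_absolute((0.5, 0.3, 0.5, 0.4), (400, 500)))
print(relative_to_absolute((0.5, 0.3, 0.5, 0.4), (800, 1000)))
```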

According to the table above, COCO and CVAT are the fastest formats to parse, and OpenImage is the fastest to save.

AnnotationSet.show_stats(): 0.12 s
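
A timing of this kind can be sketched with the standard library's timeit module. The `parse_dataset` function below is a hypothetical placeholder for the operation being measured, and the actual benchmark script may be organized differently. The source does not state whether the reported figures are totals or per-run averages; the sketch prints a per-run average.

```python
import timeit


def parse_dataset():
    """Hypothetical stand-in for the work being timed (e.g. parsing a dataset)."""
    return sum(i * i for i in range(10_000))


iterations = 5
# timeit.timeit returns the total time in seconds for `number` calls.
total = timeit.timeit(parse_dataset, number=iterations)
print(f"{total / iterations:.4f} s per run")
```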

TODO

  • Basic data structures and utilities
  • Parsers (ImageNet, COCO, YOLO, Pascal, OpenImage, CVAT, LabelMe)
  • Parser tests
  • Database summary and stats
  • Database converters
  • Visualization options
  • COCO Evaluation
  • Tests with a huge load (5k images)
  • CLI interface
  • Make image_size optional and raise err when required (bbox conversion)
  • Make file saving atomic with a temporary to avoid file corruption
  • Pip package!
  • PascalVOC Evaluation
  • Parsers for TFRecord and TensorFlow
  • UI interface?

Acknowledgement

This repo is based on the work of Rafael Padilla. The goal of this repo is to improve the performance and flexibility and to provide additional tools.

Contribution

Feel free to contribute; any help you can offer with this project is most welcome. Some suggestions where help is needed:

  • CLI tools and scripts
  • Building a PIP package
  • Developing a UI interface
