Web platform for fast image dataset cleaning

Project description

Fast Dataset Cleaner by PhotoRoom - 🏃

Installation

  • Run pip3 install fast-dataset-cleaner.

Launch the platform

Run fast-dataset-cleaner in your CLI. You can optionally add a specific port: fast-dataset-cleaner --port _CUSTOM_PORT_ (default: 1747). Open your browser and go to localhost:1747 (or your custom port) to see the live platform.

Requirements

  • Save images for annotation in a single folder.
  • Create a CSV file with an id column containing the ids or names of all the images to annotate. For instance, if your images are named image_{id}.jpg, your CSV should look like:
id
image_0
image_1
image_2
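
For larger folders, the id list can be generated rather than typed by hand. The following is a minimal sketch, not part of fast-dataset-cleaner itself: the function name write_id_csv and the set of accepted extensions are assumptions made for illustration, and it takes the id to be the file name without its extension, as in the example above.

```python
import csv
from pathlib import Path

# Assumption: which file extensions count as images; adjust to your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def write_id_csv(image_dir, csv_path):
    """Write a CSV with a single `id` column listing image names without extension."""
    ids = sorted(
        path.stem
        for path in Path(image_dir).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["id"])  # header column described in the requirements
        writer.writerows([image_id] for image_id in ids)
    return ids
```

Calling write_id_csv("my_images", "my_images.csv") (both paths are placeholders) writes the file and returns the list of ids it found, which makes it easy to check the count against the folder.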

How it works

This platform is designed for binary classification of images. This can be helpful either to clean up datasets or to add a label to each image.

When launching the platform for the first time, fill in the entries in the left menu, which you can open by clicking on the banner or by pressing the Space bar. Once you are finished, click on the Get images button or reload the page. The required password is the one displayed in your CLI.

The entire annotation process can be done using the keyboard. The images are displayed with a number on their left. To annotate one of them, press the associated key or click on the card. By default, each image has the value true. When all the images on a page are annotated, press the Enter key to validate the annotations. You can then check in your files that a new CSV has been created, named after the initial CSV with the suffix _annotated. It contains two new columns, one for the annotator and one for the annotation, and this is where the annotations are saved.
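
Once the annotated CSV exists, it can be filtered with a few lines of Python. This is only a sketch under stated assumptions: the description above does not give the exact names of the two new columns or how true and false are written, so the annotation column name is passed in as a parameter and the accepted "true" spellings are a guess to adapt after opening your own file. The helper name kept_ids is hypothetical.

```python
import csv

# Assumption: spellings that count as a "true" annotation in the output file.
TRUE_VALUES = {"true", "1", "yes"}


def kept_ids(annotated_csv_path, annotation_column):
    """Return the ids of rows whose annotation column holds a true-like value."""
    with open(annotated_csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        return [
            row["id"]
            for row in reader
            if row[annotation_column].strip().lower() in TRUE_VALUES
        ]
```

Check the header row of your _annotated file first and pass the actual annotation column name as the second argument.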

You can change pages with the arrow keys, which lets you navigate through your dataset and re-annotate some images if necessary. BEWARE: ONLY the Enter key saves the annotations.

After a page refresh, the images that are still unlabeled are displayed. If the final screen is displayed after a page refresh instead, you're done labeling your dataset! 🎉

Use masks

You can also use masks to check segmentations. For this task, save all your binary masks in another folder with the same ids as the original images. Add this folder to the platform entries and you should be able to see the segmented images after a page refresh.
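
Before adding the mask folder, it can help to confirm that every image has a matching mask. The sketch below is an illustration rather than part of the platform: it assumes that "same ids" means the same file name without its extension, and the function name ids_missing_masks is hypothetical.

```python
from pathlib import Path


def ids_missing_masks(image_dir, mask_dir):
    """Return the sorted image ids that have no mask file with the same id."""
    # Assumption: an id is the file name without its extension.
    image_ids = {p.stem for p in Path(image_dir).iterdir() if p.is_file()}
    mask_ids = {p.stem for p in Path(mask_dir).iterdir() if p.is_file()}
    return sorted(image_ids - mask_ids)
```

An empty list means every image in the first folder has a counterpart in the mask folder.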

Shortcuts

For convenience and speed, we implemented a few keyboard shortcuts:

  • Open/Close the menu: m or Space bar.
  • Navigate between images: Keyboard arrows.
  • Annotate an image: Press the number key associated with the image number.
  • Validate annotations: Enter.
  • Load images when the menu is open: i or g.

Download files

Source Distribution

fast-dataset-cleaner-1.0.0.tar.gz (653.0 kB view details)

Uploaded Source

Built Distribution

fast_dataset_cleaner-1.0.0-py3-none-any.whl (1.2 MB view details)

Uploaded Python 3

File details

Details for the file fast-dataset-cleaner-1.0.0.tar.gz.

File metadata

  • Download URL: fast-dataset-cleaner-1.0.0.tar.gz
  • Size: 653.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.25.0 setuptools/49.6.0.post20201009 requests-toolbelt/0.9.1 tqdm/4.54.1 CPython/3.7.8

File hashes

Hashes for fast-dataset-cleaner-1.0.0.tar.gz
Algorithm Hash digest
SHA256 dda9da57545cf73f091319eb08a4d1199f24f5659e8280b3b719f71a4ac1fe5d
MD5 c217c421685922548ce19c58f17a31a5
BLAKE2b-256 ed4151844532ba79620d4d0e1b3f99da9dc5c387f055aae8b201b56b544d3ec6

File details

Details for the file fast_dataset_cleaner-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: fast_dataset_cleaner-1.0.0-py3-none-any.whl
  • Size: 1.2 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.25.0 setuptools/49.6.0.post20201009 requests-toolbelt/0.9.1 tqdm/4.54.1 CPython/3.7.8

File hashes

Hashes for fast_dataset_cleaner-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 b438b89d4b7b8cdae7416c7a676b7561f3f138164c11d574bf7fa7d9c3d5b940
MD5 59b69a98488cd225d29dfc2466ae1bc7
BLAKE2b-256 36432f6b023e8708b0991263835a9584fca5fc6b65e6a57f214ef9b6b4e1df6e
