Web platform for fast image dataset cleaning
Fast Dataset Cleaner by PhotoRoom - 🏃
Installation
- Run pip3 install fast-dataset-cleaner.
Launch the platform
Run fast-dataset-cleaner in your CLI. You can optionally specify a port: fast-dataset-cleaner --port _CUSTOM_PORT_ (default: 1747).
Open your browser and go to localhost:1747 (or your custom port) to see the live platform.
Requirements
- Save images for annotation in a single folder.
- Create a CSV file with an id column containing the ids or names of all the images to annotate. For instance, if your images are named in the image_{id}.jpg format, your CSV should look like this:
id
image_0
image_1
image_2
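If your images are already in a folder, a CSV like the one above can be generated rather than typed by hand. The following is a minimal sketch, not part of the package: the function name write_id_csv and the list of accepted extensions are assumptions made for illustration, and it assumes the expected id is the file name without its extension, as in the example above.

```python
import csv
from pathlib import Path


def write_id_csv(image_dir, csv_path, extensions=(".jpg", ".jpeg", ".png")):
    """Write a one-column CSV (header: id) listing image names without extension.

    Hypothetical helper: the extension filter is an assumption, adjust it
    to the formats present in your folder.
    """
    ids = sorted(
        p.stem
        for p in Path(image_dir).iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    )
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id"])  # the id column required by the platform
        writer.writerows([image_id] for image_id in ids)
    return ids
```

For example, write_id_csv("my_images", "my_images.csv") would list every .jpg, .jpeg and .png file of the my_images folder in my_images.csv.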
How it works
This platform is designed for binary classification of images. This can be helpful either to clean up datasets or to add a label to each image.
When launching the platform for the first time, you have to fill in the entries in the left menu, which you can open by clicking on the banner or by pressing the Space bar. Once you are finished, click on the Get images button or reload the page. The required password is the one displayed in your CLI.
The entire annotation process can be done using the keyboard. Each image is displayed with a number on its left. To annotate an image, press the associated number key or click on its card. By default, each image has the value true. When all the images on a page are annotated, press the Enter key to validate the annotations. You can then check in your files that a new CSV was created, named after the initial CSV with the suffix _annotated, with two new columns for the annotator and the annotation, and that the annotations have been saved.
You can change pages with the arrow keys, which lets you navigate through your dataset and re-annotate some images if necessary. BEWARE: ONLY the Enter key saves the annotations.
When refreshing the page, unlabeled images are displayed. If after a page refresh the final screen is displayed, you're done labeling your dataset! 🎉
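Once annotations are saved, the _annotated CSV can be inspected outside the platform. The sketch below is an illustration under stated assumptions: it assumes the suffix is inserted before the .csv extension, and the column name "annotation" is hypothetical, since the exact header names are not documented here. Check the header of your own file and pass the right name.

```python
import csv
from collections import Counter
from pathlib import Path


def annotated_path(csv_path):
    """Return the assumed path of the annotated CSV (suffix _annotated)."""
    p = Path(csv_path)
    return p.with_name(f"{p.stem}_annotated{p.suffix}")


def summarize(csv_path, annotation_column="annotation"):
    """Count how many rows carry each annotation value.

    annotation_column is a hypothetical default; values are returned
    as the raw strings stored in the file.
    """
    with open(annotated_path(csv_path), newline="") as f:
        return Counter(row[annotation_column] for row in csv.DictReader(f))
```

Calling summarize("my_images.csv") reads my_images_annotated.csv and returns the number of rows per annotation value, which gives a quick view of how many images were kept or rejected.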
Use masks
You can also use masks to check segmentations. For this task, save all your binary masks in another folder with the same ids as the original images. Add this folder to the platform entries and you should be able to see the segmented images after a page refresh.
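Before adding the masks folder to the platform entries, it can help to confirm that every image has a mask with the same id. This is a minimal sketch, assuming that "same ids" means the file name without its extension; the function name ids_missing_masks is made up for illustration and is not part of the package.

```python
from pathlib import Path


def ids_missing_masks(image_dir, mask_dir):
    """Return the sorted ids of images that have no mask with the same id.

    Assumption: an id is the file name without its extension, so
    image_0.jpg and image_0.png count as the same id.
    """
    image_ids = {p.stem for p in Path(image_dir).iterdir() if p.is_file()}
    mask_ids = {p.stem for p in Path(mask_dir).iterdir() if p.is_file()}
    return sorted(image_ids - mask_ids)
```

An empty list means every image in the first folder has a matching mask in the second.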
Shortcuts
For convenience and speed, we implemented a few keyboard shortcuts:
- Open/Close the menu: m or Space bar.
- Navigate between images: Keyboard arrows.
- Annotate an image: Press the number key associated with the image number.
- Validate annotations: Enter.
- Load images when the menu is open: i or g.