Python scripts to convert labelme-generated JSON files to VOC/COCO-style datasets.

labelme2Datasets

(Chinese README)

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Contributing
  6. License
  7. Contact
  8. Acknowledgments

About The Project

Scripts in this repository convert labelme-annotated JSON files into standard datasets in PASCAL VOC or MS COCO format.

Scripts are written in Python.

Most of the scripts are based on the examples section of labelme. I then added some features for my own dataset, such as class name conversion and custom image names.

Attention: these scripts are not complicated. If you know basic Python, please read through the conversion workflows and make sure they fit your datasets. Some places in the code are annotated with MARK; these need attention, and you can customize them to fit your needs.

Customize: these scripts only cover the conversions needed for the data I currently have. If you want to convert datasets for other tasks, such as instance segmentation, semantic segmentation, or video annotation, please take a look at the examples section in labelme.
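
For bounding-box annotations, the conversion described in this section mostly comes down to the different box conventions of the three formats. The snippet below is a minimal sketch of that idea, not the code used in this repository: the function names and the sample shape are hypothetical, and it ignores the 1-pixel offset that some VOC tools apply.

```python
def labelme_rect_to_voc(points):
    """Turn the two corner points of a labelme rectangle into (xmin, ymin, xmax, ymax)."""
    (x1, y1), (x2, y2) = points
    # labelme stores the corners in the order they were clicked, so sort them.
    return min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2)


def voc_to_coco(box):
    """Turn a VOC-style (xmin, ymin, xmax, ymax) box into COCO-style [x, y, width, height]."""
    xmin, ymin, xmax, ymax = box
    return [xmin, ymin, xmax - xmin, ymax - ymin]


# A hypothetical rectangle shape, as found in the "shapes" list of a labelme JSON.
shape = {
    "label": "cat",
    "shape_type": "rectangle",
    "points": [[120.0, 40.0], [20.0, 90.0]],
}

voc_box = labelme_rect_to_voc(shape["points"])
coco_box = voc_to_coco(voc_box)
print(voc_box)   # (20.0, 40.0, 120.0, 90.0)
print(coco_box)  # [20.0, 40.0, 100.0, 50.0]
```

VOC keeps the two opposite corners of the box, while COCO keeps the top-left corner plus width and height, which is why the conversion to COCO goes through the VOC box here.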

Getting Started

Prerequisites

  1. Gather the labelme-annotated JSON files into a folder. The following steps refer to this folder as labelme_jsons_dir.

  2. Prepare a text file that stores the class names in your dataset, and name it label_names.txt. See test/label_names.txt for an example.

  3. If you need class name conversion, prepare a text file that stores the conversion rules, and name it label_dict.txt. See test/label_dict.txt for an example.
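
If you are not sure which class names your annotations contain, a small helper can collect them for you. The snippet below is a rough sketch, not part of this package: collect_labels and write_label_names are hypothetical names, and writing one class name per line is an assumption, so compare the result with test/label_names.txt before using it.

```python
import json
from pathlib import Path


def collect_labels(labelme_jsons_dir):
    """Return the sorted, de-duplicated shape labels used across all labelme JSON files."""
    labels = set()
    for json_path in Path(labelme_jsons_dir).glob("*.json"):
        with open(json_path, encoding="utf-8") as f:
            data = json.load(f)
        # Each annotated shape in a labelme file carries a "label" field.
        for shape in data.get("shapes", []):
            labels.add(shape["label"])
    return sorted(labels)


def write_label_names(labels, out_file="label_names.txt"):
    """Write one class name per line (assumed layout, check test/label_names.txt)."""
    Path(out_file).write_text("\n".join(labels) + "\n", encoding="utf-8")
```

For example, write_label_names(collect_labels("labelme_jsons_dir")) writes every label found in the folder to label_names.txt in the current directory.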

Installation

Install in development mode

  1. Clone the repo and enter it.

    git clone git@github.com:veraposeidon/labelme2Datasets.git
    cd labelme2Datasets

  2. Create a virtual environment (conda in this example) and install the dependencies. requirements.txt comes from the repo, so run this inside the cloned folder.

    conda create --name=labelme python=3.6
    conda activate labelme
    pip install -r requirements.txt

  3. Install the package.

    # (preferred) install in editable mode, so that you can modify the package
    pip install -e .
    # or install in non-editable mode, so that you can use the package, but cannot modify it
    #python setup.py install
    

Install from PyPI

The package is also published on PyPI as labelme2datasets.

Install it with pip3 install labelme2datasets.

If the baseline in this project does not work for your datasets, install in development mode and modify the code yourself.

Usage

  • Convert a single JSON file into a dataset. (labelme_json2dataset.py)

    labelme_json2dataset --json_file=data/test.json \
      --output_dir=output/test_single_output
    
  • Convert a folder of JSON files into a VOC-format dataset. (labelme_bbox_json2voc.py)

    • without label conversion
      labelme_bbox_json2voc --json_dir=data/test_jsons \
        --output_dir=output/test_voc_output --labels data/label_names.txt
      
    • with label conversion
      labelme_bbox_json2voc --json_dir=data/test_jsons \
        --output_dir=output/test_voc_output \
        --labels data/label_names.txt \
        --label_dict data/label_dict.txt
      
  • Split a VOC dataset into a train set and a test set. (split_voc_datasets.py)

      split_voc_datasets --voc_dir output/test_voc_output --test_ratio 0.3 --random_seed 42
    

    train.txt and test.txt should be generated in voc_dir/ImageSets/Main/.

  • Convert a VOC-format dataset into a COCO-style dataset. (voc2coco.py)

      voc2coco --voc_dir output/test_voc_output --coco_dir output/test_coco_output
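
To make the split step above more concrete, here is a minimal sketch of what a seeded train/test split could look like. It is not the implementation of split_voc_datasets: split_ids and write_split are hypothetical helpers, and the only details taken from this README are the test_ratio and random_seed options and the train.txt / test.txt files under voc_dir/ImageSets/Main/.

```python
import random
from pathlib import Path


def split_ids(image_ids, test_ratio=0.3, random_seed=42):
    """Shuffle image ids with a fixed seed and cut them into (train, test)."""
    ids = sorted(image_ids)  # sort first so the result does not depend on input order
    random.Random(random_seed).shuffle(ids)
    n_test = int(len(ids) * test_ratio)
    return ids[n_test:], ids[:n_test]


def write_split(voc_dir, train_ids, test_ids):
    """Write train.txt and test.txt (one image id per line) under voc_dir/ImageSets/Main/."""
    main_dir = Path(voc_dir) / "ImageSets" / "Main"
    main_dir.mkdir(parents=True, exist_ok=True)
    (main_dir / "train.txt").write_text("\n".join(train_ids) + "\n", encoding="utf-8")
    (main_dir / "test.txt").write_text("\n".join(test_ids) + "\n", encoding="utf-8")
```

Using a private random.Random(random_seed) instance keeps the split reproducible without touching the global random state, so running the split twice with the same seed gives the same train and test sets.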
    

Roadmap

  • add all scripts, with pylint passing
  • Chinese and English README
  • restructure the project
  • publish as a package

See the open issues for a full list of proposed features (and known issues).

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

Distributed under the MIT License. See LICENSE for more information.

Contact

veraposeidon - veraposeidon@gmail.com

Project Link: https://github.com/veraposeidon/labelme2Datasets

Acknowledgments

