
A simple deep learning dataset conversion toolkit


dataset_convert_toolkit


A deep learning dataset conversion toolkit.

labelme

labelme is an annotation tool whose labels can be converted to VOC. Launch labelme and prepare labels.txt with the label names; the GUI is then displayed and you can annotate in it (data_annotated is the path where the annotated labels are saved).

e.g. to annotate traffic lights, run the command below; data_annotated can be replaced with ./datasetsample/traffic_light/label
labelme data_annotated --labels config/labels.txt --nodata --autosave

Convert datasets to YOLO format

  1. Convert labelme annotations to the YOLO dataset format with one of the following commands.
python ./dataset_convert_tool.py  --rectjson2yolo --json_dir "./datasetsample/traffic_light_labelme/label/" --output_dir './traffic_light_yolo' --labels "./config/traffic_light_labelme.yaml"
or 
python ./dataset_convert_tool.py  --rectjson2yolo --json_dir "./datasetsample/traffic_light_labelme/label/" --output_dir './traffic_light_yolo' --labels "config/labels.txt"
or, to only display the result (normal datasets should use one of the two commands above):
python ./dataset_convert_tool.py  --rectjson2yolo --json_dir "datasetsample/traffic_light_labelme/label" --output_dir './traffic_light_yolo'
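
For reference, a YOLO label file stores one object per line as `class_id cx cy w h`, with the box center and size normalized by the image width and height. The sketch below shows that arithmetic for a single labelme rectangle. It is an illustration only, not the toolkit's implementation; the function name `labelme_rect_to_yolo` is hypothetical.

```python
def labelme_rect_to_yolo(points, image_width, image_height, class_id):
    """Convert one labelme rectangle (two corner points, in pixels) to a YOLO label line."""
    (x1, y1), (x2, y2) = points
    # labelme does not guarantee the corner order, so sort the coordinates.
    x_min, x_max = sorted((x1, x2))
    y_min, y_max = sorted((y1, y2))
    # YOLO uses the box center and size, normalized to the range [0, 1].
    cx = (x_min + x_max) / 2 / image_width
    cy = (y_min + y_max) / 2 / image_height
    w = (x_max - x_min) / image_width
    h = (y_max - y_min) / image_height
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


# A 200x100 px box in a 400x400 image, drawn from right to left.
line = labelme_rect_to_yolo([[300, 100], [100, 200]], 400, 400, class_id=0)
print(line)
```

The class index would normally come from the position of the label name in the file passed via --labels.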

Conversion result demo:

  1. Convert VOC labels to the YOLO dataset format with the command:
python ./dataset_convert_tool.py  --rectvoc2yolo --xml_dir "datasetsample/SafetyHelmet_VOC/Annotations" --output_dir  "./helmet_yolo" --labels "config/labels_safetyhelmet_voc.txt"

Result of converting the VOC-format safety helmet dataset to a YOLO-format dataset
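
A VOC annotation stores each box as pixel corners (`xmin`, `ymin`, `xmax`, `ymax`) inside an XML file, while YOLO expects a normalized center and size. The following is a minimal sketch of that conversion using only the standard library. The function name `voc_xml_to_yolo`, the sample XML, and the class list are assumptions for illustration, not the toolkit's code.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal VOC annotation with one object.
VOC_XML = """<annotation>
  <size><width>200</width><height>100</height><depth>3</depth></size>
  <object>
    <name>helmet</name>
    <bndbox><xmin>50</xmin><ymin>20</ymin><xmax>150</xmax><ymax>60</ymax></bndbox>
  </object>
</annotation>"""


def voc_xml_to_yolo(xml_text, class_names):
    """Return one YOLO label line per <object> in a VOC annotation."""
    root = ET.fromstring(xml_text)
    width = float(root.findtext("size/width"))
    height = float(root.findtext("size/height"))
    lines = []
    for obj in root.iter("object"):
        # The class id is the position of the name in the label list.
        class_id = class_names.index(obj.findtext("name"))
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        cx = (xmin + xmax) / 2 / width
        cy = (ymin + ymax) / 2 / height
        w = (xmax - xmin) / width
        h = (ymax - ymin) / height
        lines.append(f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}")
    return lines


yolo_lines = voc_xml_to_yolo(VOC_XML, class_names=["helmet", "head"])
print(yolo_lines)
```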

Convert labelme annotations to the VOC dataset format with the command below (data_dataset_voc is the folder that holds the converted VOC labels).

./labelme2voc.py data_annotated data_dataset_voc --labels labels.txt

This code is adapted from https://github.com/labelmeai/labelme/tree/main/examples/bbox_detection

  1. COCO to YOLO

COCO folder:

python dataset_convert_tool.py --cocotoyolo --cocojson_file 'datasetsample/coco_test/train' --output_dir './coco_test' --labels 'config/coco_test.yaml'

The COCO to YOLO conversion is currently untested, because I do not have a COCO dataset at hand.
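
Since this path is untested, it may help to spell out the expected arithmetic. COCO stores each box as `[x, y, width, height]` in pixels with `(x, y)` at the top-left corner, and its category ids are not necessarily contiguous. The sketch below is an independent illustration under those assumptions, not the toolkit's code; `coco_to_yolo` and the sample dictionary are hypothetical.

```python
from collections import defaultdict


def coco_to_yolo(coco):
    """Map each image file name to its list of YOLO label lines."""
    # COCO category ids may have gaps, so remap them to 0..N-1 in id order.
    ordered = sorted(coco["categories"], key=lambda c: c["id"])
    class_index = {c["id"]: i for i, c in enumerate(ordered)}
    images = {im["id"]: im for im in coco["images"]}
    labels = defaultdict(list)
    for ann in coco["annotations"]:
        im = images[ann["image_id"]]
        x, y, w, h = ann["bbox"]  # top-left corner plus size, in pixels
        cx = (x + w / 2) / im["width"]
        cy = (y + h / 2) / im["height"]
        labels[im["file_name"]].append(
            f"{class_index[ann['category_id']]} {cx:.6f} {cy:.6f} "
            f"{w / im['width']:.6f} {h / im['height']:.6f}"
        )
    return dict(labels)


# Hypothetical in-memory COCO annotation with a gap in the category ids.
sample = {
    "images": [{"id": 7, "file_name": "a.jpg", "width": 640, "height": 480}],
    "categories": [{"id": 1, "name": "car"}, {"id": 3, "name": "bus"}],
    "annotations": [{"image_id": 7, "category_id": 3, "bbox": [160, 120, 320, 240]}],
}
result = coco_to_yolo(sample)
print(result)
```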

Convert labelme JSON (line) to CULane

  1. Convert labelme line annotations to the CULane format.
python dataset_convert_tool.py --linejson2culane --line_json_dir 'datasetsample/laneline_test/label' --output_dir './laneline_culane' --labels 'config/laneline.yaml' --sample_dir 'datasetsample/laneline_test/foxcon' --crop_height 0.5

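Generated output:

  • datasetsample/laneline_test is the demo sample provided as a conversion reference,
  • foxcon/ holds the training samples,
  • laneline_culane/laneseg_label_w16/foxcon holds the ground-truth labels, which are used in training,
  • laneline_culane/list/trainval_gt.txt is the sample list for train and val,
  • laneline_culane/marsk_disp/foxcon holds the visualization results.

Note that laneline_culane/foxcon/1682315093.575532.lines.txt can be used as the ground-truth sampled lane points for scoring. Each line holds the sample points of one lane, and the first value on the line is the lane's attribute value: 1 means the lane with label 1, 2 means the lane with label 2. Note this difference from the CULane dataset; [ * lines.txt ] files are generally used only for scoring.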
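
A minimal sketch of reading such a file follows. It assumes each row is whitespace-separated, with the lane label first and then alternating x and y values as in CULane's own lines.txt; the exact layout is not stated here, so treat it as an assumption. The function name `parse_lines_txt` is hypothetical.

```python
def parse_lines_txt(text):
    """Parse rows of 'label x1 y1 x2 y2 ...' into (label, [(x, y), ...]) tuples."""
    lanes = []
    for row in text.splitlines():
        values = row.split()
        if not values:
            continue  # skip blank rows
        # Assumption: the first value is the lane label, the rest are x/y pairs.
        label = int(float(values[0]))
        coords = [float(v) for v in values[1:]]
        points = list(zip(coords[0::2], coords[1::2]))
        lanes.append((label, points))
    return lanes


# Hypothetical file content with two lanes.
sample_text = "1 100.0 590 120.5 580\n2 400 590 390 580\n"
lanes = parse_lines_txt(sample_text)
print(lanes)
```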

What labels 1, 2, 3, ... mean is defined in config/laneline.yaml; 0 is generally the background. For example:

names: ['ll', 'l', 'r', 'rr']

Here 0 is the background, 1 is ll, 2 is l, 3 is r, and 4 is rr.
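
The same mapping can be written out in a few lines of Python. This is a sketch of the indexing rule only: the `names` list is copied from the example above rather than loaded from the YAML file, and the variable name `label_to_name` is hypothetical.

```python
# Names as listed in the example config; index 0 is reserved for the background.
names = ['ll', 'l', 'r', 'rr']

label_to_name = {0: "background"}
# Lane labels start at 1, in the order the names appear in the list.
label_to_name.update({i: name for i, name in enumerate(names, start=1)})

print(label_to_name)
```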

Support planned for the future:

  • KITTI Road
  • BDD100K
  • Waymo Open Dataset
  • Argoverse 2
  • Lyft Level 5

Convert labelme JSON (rect) to VOC

Convert rect labels annotated with labelme to the VOC dataset format

 python dataset_convert_tool.py --labelmejson2voc --json_dir 'datasetsample/traffic_light_labelme/label' --output_dir './traffic_light_voc' --labels 'config/labels.txt'
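
To illustrate what this conversion involves, the sketch below builds a minimal VOC-style XML annotation from a labelme JSON dictionary with the standard library. It covers only the file name, image size, and boxes. The function name `labelme_to_voc_xml` and the sample data are hypothetical, and the toolkit's actual output may contain more fields.

```python
import xml.etree.ElementTree as ET


def labelme_to_voc_xml(labelme):
    """Build a minimal VOC annotation string from a parsed labelme JSON dict."""
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = labelme["imagePath"]
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(labelme["imageWidth"])
    ET.SubElement(size, "height").text = str(labelme["imageHeight"])
    for shape in labelme["shapes"]:
        if shape.get("shape_type") != "rectangle":
            continue  # only rectangles map to VOC bounding boxes
        (x1, y1), (x2, y2) = shape["points"]
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = shape["label"]
        box = ET.SubElement(obj, "bndbox")
        # VOC stores integer pixel corners; sort because corner order may vary.
        corners = (("xmin", min(x1, x2)), ("ymin", min(y1, y2)),
                   ("xmax", max(x1, x2)), ("ymax", max(y1, y2)))
        for tag, value in corners:
            ET.SubElement(box, tag).text = str(int(round(value)))
    return ET.tostring(root, encoding="unicode")


# Hypothetical labelme annotation with one rectangle.
sample = {
    "imagePath": "0001.jpg", "imageWidth": 640, "imageHeight": 480,
    "shapes": [{"label": "red", "shape_type": "rectangle",
                "points": [[120.0, 40.0], [100.0, 90.0]]}],
}
xml_text = labelme_to_voc_xml(sample)
print(xml_text)
```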

Convert the UA-DETRAC dataset to the VOC dataset format

 python ua_detrac_convert_tool.py --rectxml2voc --image_dir /home/udi/DataSet1/车辆检测数据集/train --xml_dir /home/udi/DataSet1/车辆检测数据集/Train-XML/DETRAC-Train-Annotations-XML --output_dir ./detrac_train

Parameter description

To split a YOLO- or VOC-format dataset into train.txt and val.txt on its own, you can refer to the following:

  • Using absolute paths
from dataset_convert_toolkit.utils.file import filetool
if __name__ == '__main__':
    #  python train_val_split_2.py
    filetool.generate_sets_val_train_txt_absolute_paths('/home/udi/workspace/panchuanchao/yolov8/datasets/traffic_light_yolo/images',
                                                        endwith='.jpg',
                                                        test_ratio=0.2, random_seed=42,
                                                        tranval_save_dir='/home/udi/workspace/panchuanchao/yolov8/datasets/traffic_light_yolo/ImageSets/Main', 
                                                        images_dir='/home/udi/workspace/panchuanchao/yolov8/datasets/traffic_light_yolo/images')
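
For readers who want to see the idea without the package, here is a minimal sketch of a seeded train/val split over a list of image paths. It is not the implementation of `generate_sets_val_train_txt_absolute_paths`; the function `split_train_val` and the example paths are hypothetical, and only the `test_ratio` and `random_seed` parameter names are borrowed from the call above.

```python
import random


def split_train_val(image_paths, test_ratio=0.2, random_seed=42):
    """Shuffle paths with a fixed seed and return (train, val) lists."""
    # Sort first so the result does not depend on the input order.
    paths = sorted(image_paths)
    random.Random(random_seed).shuffle(paths)
    n_val = int(round(len(paths) * test_ratio))
    return paths[n_val:], paths[:n_val]


# Hypothetical absolute image paths.
paths = [f"/data/images/{i:03d}.jpg" for i in range(10)]
train, val = split_train_val(paths)
print(len(train), len(val))

# Each list could then be written one path per line, e.g.:
# with open("train.txt", "w") as f:
#     f.write("\n".join(train) + "\n")
```

Fixing the seed keeps the split reproducible across runs, so later experiments use the same validation set.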
