Tools to merge and remap computer vision datasets
datakit
Python package for YOLO-format dataset operations:
- merge multiple datasets into one
- merge multiple class names into a target class
- remap class IDs
- visualize labeled samples
CLI Usage
1) Merge datasets
datakit merge /path/ds1 /path/ds2 --out /path/out
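How datakit reconciles the class lists of the input datasets is not described here. As a rough sketch, assuming each dataset has its own ordered class-name list (as in a YOLO data.yaml), a merge can build one unified list plus a per-dataset ID map. The helper `unify_class_names` is hypothetical and not part of datakit.

```python
def unify_class_names(names_per_dataset):
    """Return (unified_names, id_maps) for several ordered class-name lists.

    Hypothetical helper for illustration; not part of datakit.
    """
    unified = []   # unified class names, in first-seen order
    index = {}     # class name -> unified class ID
    id_maps = []   # one {old_id: new_id} dict per input dataset
    for names in names_per_dataset:
        mapping = {}
        for old_id, name in enumerate(names):
            if name not in index:
                index[name] = len(unified)
                unified.append(name)
            mapping[old_id] = index[name]
        id_maps.append(mapping)
    return unified, id_maps


unified, id_maps = unify_class_names([["person", "bag"], ["bag", "car"]])
print(unified)   # unified class list
print(id_maps)   # per-dataset old-ID -> new-ID maps
```

Each label file from dataset i would then have its class IDs rewritten through `id_maps[i]` before being copied into the output directory.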
2) Merge classes
datakit merge-classes /path/dataset --from Backpack Backpacks --to bag
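Conceptually, merging classes collapses several names into one and shifts the remaining IDs to stay contiguous. The sketch below illustrates that idea on a plain list of names; `merge_class_names` is a hypothetical helper, and the three-class input list is an assumed example, not taken from a real dataset.

```python
def merge_class_names(names, sources, target):
    """Collapse every name in `sources` into `target`.

    Returns (new_names, id_map), where id_map maps old class IDs to new ones.
    Hypothetical helper for illustration; not part of datakit.
    """
    source_set = set(sources)
    new_names = []
    index = {}    # merged name -> new class ID
    id_map = {}   # old class ID -> new class ID
    for old_id, name in enumerate(names):
        merged = target if name in source_set else name
        if merged not in index:
            index[merged] = len(new_names)
            new_names.append(merged)
        id_map[old_id] = index[merged]
    return new_names, id_map


new_names, id_map = merge_class_names(
    ["Backpack", "Backpacks", "person"],  # assumed original class list
    ["Backpack", "Backpacks"],
    "bag",
)
print(new_names, id_map)
```

For this assumed input, the result is the same names list and ID map used in the remap example (`--names bag person --map 0:0 1:0 2:1`).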
3) Remap classes
datakit remap /path/dataset --names bag person --map 0:0 1:0 2:1
Remap safety behavior:
- validates that every mapped target ID is a valid index into the given class names list
- pre-scans all label files to ensure every class ID has a mapping before writing
- only writes labels and data.yaml after validation succeeds
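The validate-then-write pattern described above can be sketched as follows. This is a minimal illustration that works on in-memory label text rather than files, and assumes each label line starts with an integer class ID. `validate_remap` and `remap_label_text` are hypothetical names, not datakit's actual functions.

```python
def validate_remap(label_texts, names, id_map):
    """Raise ValueError if the remap is unsafe; nothing is modified here."""
    # 1) Every target ID must be a valid index into `names`.
    for src, dst in id_map.items():
        if not 0 <= dst < len(names):
            raise ValueError(f"target ID {dst} (from {src}) is outside 0..{len(names) - 1}")
    # 2) Pre-scan: every class ID found in the labels must have a mapping.
    for text in label_texts:
        for line in text.splitlines():
            parts = line.split()
            if not parts:
                continue  # skip blank lines
            class_id = int(parts[0])
            if class_id not in id_map:
                raise ValueError(f"class ID {class_id} has no mapping")


def remap_label_text(text, id_map):
    """Return the label text with each line's class ID replaced via id_map."""
    out = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        parts[0] = str(id_map[int(parts[0])])
        out.append(" ".join(parts))
    return "\n".join(out)


labels = ["0 0.5 0.5 0.2 0.2\n2 0.1 0.1 0.05 0.05\n"]
names = ["bag", "person"]
id_map = {0: 0, 1: 0, 2: 1}

validate_remap(labels, names, id_map)                    # raises before any write
remapped = [remap_label_text(t, id_map) for t in labels]  # only reached if valid
print(remapped[0])
```

Running the full validation first means a missing mapping is reported before any label has been rewritten, so the dataset is never left half-converted.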
4) Visualize samples
datakit visualize --images-dir /path/dataset/val/images --labels-dir /path/dataset/val/labels --n 12 --seed 1
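To draw a label, a visualizer has to convert YOLO's normalized `class x_center y_center width height` lines into pixel corners, and a seed makes the random selection repeatable. The sketch below shows both steps under the assumption that the labels are plain detection boxes. `yolo_to_pixel_box` and `pick_samples` are hypothetical helpers and do not reflect datakit's internals.

```python
import random


def yolo_to_pixel_box(line, img_w, img_h):
    """Convert 'cls cx cy w h' (normalized) to (cls, x1, y1, x2, y2) in pixels."""
    cls, cx, cy, w, h = line.split()
    cx, w = float(cx) * img_w, float(w) * img_w
    cy, h = float(cy) * img_h, float(h) * img_h
    return int(cls), cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2


def pick_samples(paths, n, seed):
    """Pick up to n paths reproducibly: same inputs and seed, same choice."""
    ordered = sorted(paths)  # sort so the result does not depend on listing order
    return random.Random(seed).sample(ordered, min(n, len(ordered)))


box = yolo_to_pixel_box("1 0.5 0.5 0.5 0.5", img_w=640, img_h=480)
print(box)
```

Sorting before sampling keeps the selection stable across file systems that list directories in different orders.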
Python API
from datakit import merge_datasets, merge_classes, remap_dataset, plot_random_samples
merge_datasets(["/path/ds1", "/path/ds2"], "/path/out")
merge_classes("/path/dataset", ["Backpack", "Backpacks"], "bag")
remap_dataset("/path/dataset", ["bag", "person"], {0: 0, 1: 0, 2: 1})
plot_random_samples("/path/dataset/val/images", "/path/dataset/val/labels", n=12, seed=1)
Download files
Source Distribution
cv_datakit-0.1.1.tar.gz (4.4 kB)
Built Distribution
File details
Details for the file cv_datakit-0.1.1.tar.gz.
File metadata
- Download URL: cv_datakit-0.1.1.tar.gz
- Upload date:
- Size: 4.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | af7a31d02ca53d0d0de56148ac0331caf9b6809134966b71328700d5ae71ec73 |
| MD5 | 7e943eed8cf8bd7f54c20798dcbf30b9 |
| BLAKE2b-256 | 8b8c91aaa59a228f8a36236793d0ba770fe53e74d52249b38c98fcdcf162d28f |
File details
Details for the file cv_datakit-0.1.1-py3-none-any.whl.
File metadata
- Download URL: cv_datakit-0.1.1-py3-none-any.whl
- Upload date:
- Size: 2.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | db1b9a95e1a251b049b35346d871cbd5a8e3434d22120200878003a77771bd86 |
| MD5 | f81ca64a4336da76b1665929e086863a |
| BLAKE2b-256 | 38863c17e7072cf5893e0cfc1074862b6c71a19f1b7990d8c8555790a3f854b6 |
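A downloaded file can be checked against the SHA256 digests listed above using only the standard library. This is a minimal sketch: `sha256_of_file` is a hypothetical helper name, and the local path in the usage comment is an assumption about where the file was saved.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage (assumes the sdist was saved to the current directory):
# expected = "af7a31d02ca53d0d0de56148ac0331caf9b6809134966b71328700d5ae71ec73"
# assert sha256_of_file("cv_datakit-0.1.1.tar.gz") == expected
```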