Split training data images into training, validation and test (dataset) folders.
Annotated images
Split folders with files (e.g. images) into train, validation and test folders.
Annotation files, if there are any, are kept together with their images.
Given the input folder in the following format:
input/
    img1.jpg
    img1.xml
    img1.json
    img1.*
    img2.jpg
    img2.xml
    img2.json
    img2.*
    ...
Gives you this:
output/
    train/
        img1.jpg
        img1.xml
        img1.json
        img1.*
        ...
    val/
        img2.jpg
        img2.xml
        img2.json
        img2.*
        ...
    test/
        whatever.jpg
        whatever.xml
        whatever.json
        whatever.*
        ...
- Works with any file type.
- A seed lets you reproduce the splits.
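The two ideas above, keeping every file that shares a name with its image and reproducing a split from a seed, can be sketched with the standard library alone. This is an illustrative sketch, not the package's actual implementation: the helper names `group_by_stem` and `shuffled_stems` are hypothetical, and grouping by file stem is an assumption about how images and annotations are matched.

```python
import random
from collections import defaultdict
from pathlib import Path


def group_by_stem(folder):
    """Map each file stem (e.g. 'img1') to the names of all files sharing it.

    Hypothetical helper: assumes an image and its annotations differ
    only in their extension.
    """
    groups = defaultdict(list)
    for path in sorted(Path(folder).iterdir()):
        if path.is_file():
            groups[path.stem].append(path.name)
    return dict(groups)


def shuffled_stems(groups, seed):
    """Return the stems in a shuffled order that depends only on the seed."""
    stems = sorted(groups)  # sort first so the starting order is deterministic
    random.Random(seed).shuffle(stems)
    return stems
```

Because whole groups are shuffled rather than individual files, an image and its `.xml` or `.json` annotation always land in the same output folder, and the same seed yields the same order on every run.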
Counting occurrences of tags
This package includes functions to count the occurrences of a tag in JSON and XML files.
They go through all files in a folder and count how often the tag occurs across the annotations of every image.
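To make the idea of tag counting concrete, here is a minimal sketch built on the standard library. It is not the package's code: the function names are hypothetical, and it assumes that a "tag" means a key name in JSON files and an element name in XML files, which the description above does not spell out.

```python
import json
import xml.etree.ElementTree as ET
from pathlib import Path


def count_json_key(obj, tag):
    """Count how often `tag` appears as a key anywhere in parsed JSON data."""
    if isinstance(obj, dict):
        return sum((key == tag) + count_json_key(value, tag)
                   for key, value in obj.items())
    if isinstance(obj, list):
        return sum(count_json_key(item, tag) for item in obj)
    return 0


def count_tag_in_json_files(folder, tag):
    """Total occurrences of `tag` as a key across all .json files in `folder`."""
    total = 0
    for path in Path(folder).glob("*.json"):
        with open(path, encoding="utf-8") as f:
            total += count_json_key(json.load(f), tag)
    return total


def count_tag_in_xml_files(folder, tag):
    """Total number of elements named `tag` across all .xml files in `folder`."""
    total = 0
    for path in Path(folder).glob("*.xml"):
        root = ET.parse(str(path)).getroot()
        total += sum(1 for _ in root.iter(tag))  # iter() includes the root itself
    return total
```

A count like this is a quick way to check class balance in an annotated dataset, for example how many `object` elements appear across Pascal VOC style XML files.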
Install
pip install annotated-images
Module
import annotated_images
# To split into only a training and a validation set, pass a two-element tuple as `ratio`, e.g. `(.8, .2)`.
annotated_images.split('input_folder', output="output", seed=1337, ratio=(.8, .1, .1))
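How a `ratio` tuple could translate into the sizes of the train, validation and test sets can be sketched as follows. This is an assumption-laden illustration, not the package's implementation: `split_by_ratio` is a hypothetical helper, and the rounding rule (truncate each share, give the remainder to the last part) is a guess that the real package may not follow.

```python
def split_by_ratio(items, ratio):
    """Slice `items` into len(ratio) consecutive parts sized by `ratio`.

    Hypothetical helper. Every part except the last gets
    int(len(items) * share) items; the last part takes whatever is left.
    """
    if abs(sum(ratio) - 1.0) > 1e-9:
        raise ValueError("ratio must sum to 1")
    n = len(items)
    parts, start = [], 0
    for share in ratio[:-1]:
        end = start + int(n * share)
        parts.append(items[start:end])
        start = end
    parts.append(items[start:])
    return parts
```

With ten items, `(.8, .1, .1)` yields parts of 8, 1 and 1 items, while `(.8, .2)` yields 8 and 2, which corresponds to omitting the test set.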
import annotated_images
# Returns the total count of 'tag' found in all JSON files in 'path'
annotated_images.findTagsJson('path', 'tag')
# Returns the total count of 'tag' found in all XML files in 'path'
annotated_images.findTagsXml('path', 'tag')
Ref
This package is forked from https://github.com/jfilter/split-folders
Source Distribution
File details
Details for the file annotated_images-0.1.4.tar.gz.
File metadata
- Download URL: annotated_images-0.1.4.tar.gz
- Upload date:
- Size: 3.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.6.8
File hashes
Algorithm | Hash digest
---|---
SHA256 | a2ab7003e62649936f90643d0b4cd0101777b7579fa74248732a5de503653af9
MD5 | b290e5d9663fb131f510e5154972989b
BLAKE2b-256 | 50d6e04b595a2d16847e66afdcc047bfff42607c34b6010327758aec89c5cc0a