Split training data images into training, validation and test (dataset) folders.

Annotated images

Split folders with files (e.g. images) into train, validation and test folders.

Keeps the annotation data (if there are any) together with their images.
Given the input folder in the following format:

input/
    img1.jpg
    img1.xml
    img1.json
    img1.*
    img2.jpg
    img2.xml
    img2.json
    img2.*
    ...

Gives you this:

output/
    train/
        img1.jpg
        img1.xml
        img1.json
        img1.*
        ...
    val/
        img2.jpg
        img2.xml
        img2.json
        img2.*
        ...
    test/
        whatever.jpg
        whatever.xml
        whatever.json
        whatever.*
        ...
  • Works on any file types.
  • A seed lets you reproduce the splits.
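
To illustrate the idea, here is a minimal sketch of how a seeded split that keeps annotation files next to their images could work. It is not the package's actual implementation: the helper names `group_by_stem`, `split_stems` and `split_folder` are hypothetical, and it assumes that files belonging together share the same base name (`img1.jpg`, `img1.xml`, ...).

```python
import random
import shutil
from collections import defaultdict
from pathlib import Path


def group_by_stem(input_dir):
    """Map each base name (e.g. 'img1') to all files that share it."""
    groups = defaultdict(list)
    for path in sorted(Path(input_dir).iterdir()):
        if path.is_file():
            groups[path.stem].append(path)
    return groups


def split_stems(stems, ratio=(0.8, 0.1, 0.1), seed=1337):
    """Shuffle the base names with a fixed seed and cut them by `ratio`."""
    stems = sorted(stems)  # fixed starting order, independent of the OS
    random.Random(seed).shuffle(stems)
    cut1 = int(len(stems) * ratio[0])
    if len(ratio) == 2:
        # Two-element ratio: everything after the training cut is validation.
        return {"train": stems[:cut1], "val": stems[cut1:]}
    cut2 = cut1 + int(len(stems) * ratio[1])
    # Three-element ratio: the remainder after train and val goes to test.
    return {"train": stems[:cut1], "val": stems[cut1:cut2], "test": stems[cut2:]}


def split_folder(input_dir, output_dir, ratio=(0.8, 0.1, 0.1), seed=1337):
    """Copy every group of files into output_dir/train, /val or /test."""
    groups = group_by_stem(input_dir)
    parts = split_stems(groups, ratio, seed)
    for name, stems in parts.items():
        target = Path(output_dir) / name
        target.mkdir(parents=True, exist_ok=True)
        for stem in stems:
            for src in groups[stem]:
                shutil.copy2(src, target / src.name)
    return parts
```

Because the shuffle uses its own `random.Random(seed)` instance on a sorted list, calling the sketch twice with the same seed and the same input yields the same split.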

Counting occurrences of tags

This package includes functions to count the occurrences of a tag in JSON and XML files.
They go through all files in a folder and return the total number of times the given tag occurs across the (annotated) images.
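
As a rough sketch of what such counting can look like with only the standard library: the helpers below are hypothetical and are not the package's own code. They assume that a "tag" means a key name in JSON files and an element name in XML files, which the description above does not spell out.

```python
import json
import xml.etree.ElementTree as ET
from pathlib import Path


def count_key_in_json(obj, key):
    """Recursively count how often `key` appears as a dictionary key."""
    if isinstance(obj, dict):
        return sum((k == key) + count_key_in_json(v, key) for k, v in obj.items())
    if isinstance(obj, list):
        return sum(count_key_in_json(item, key) for item in obj)
    return 0


def count_json_tags(folder, tag):
    """Total occurrences of `tag` as a key in all *.json files in `folder`."""
    total = 0
    for path in sorted(Path(folder).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            total += count_key_in_json(json.load(fh), tag)
    return total


def count_xml_tags(folder, tag):
    """Total number of <tag> elements in all *.xml files in `folder`."""
    total = 0
    for path in sorted(Path(folder).glob("*.xml")):
        root = ET.parse(str(path)).getroot()
        # iter() walks the whole tree, including the root element itself.
        total += sum(1 for _ in root.iter(tag))
    return total
```

If your annotations store labels as values rather than as key or element names, the comparison in these helpers would need to be adapted.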

Install

pip install annotated-images

Module

import annotated_images

# To split into training and validation sets only, pass a two-element tuple as `ratio`, e.g. `(.8, .2)`.
annotated_images.split('input_folder', output="output", seed=1337, ratio=(.8, .1, .1))

import annotated_images

# Returns the total count of 'tag' found in all JSON files in 'path'
annotated_images.findTagsJson('path', 'tag')

# Returns the total count of 'tag' found in all XML files in 'path'
annotated_images.findTagsXml('path', 'tag')

Ref

This package is forked from https://github.com/jfilter/split-folders

Download files

Source Distribution

annotated_images-0.1.4.tar.gz (3.5 kB view details)

Uploaded Source

File details

Details for the file annotated_images-0.1.4.tar.gz.

File metadata

  • Download URL: annotated_images-0.1.4.tar.gz
  • Upload date:
  • Size: 3.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.6.8

File hashes

Hashes for annotated_images-0.1.4.tar.gz
Algorithm Hash digest
SHA256 a2ab7003e62649936f90643d0b4cd0101777b7579fa74248732a5de503653af9
MD5 b290e5d9663fb131f510e5154972989b
BLAKE2b-256 50d6e04b595a2d16847e66afdcc047bfff42607c34b6010327758aec89c5cc0a