mlutils

This package is useful for checking image data quality, separating image and label files, and splitting datasets.

Requirements

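The package relies on the following modules:

  • os (standard library)
  • PIL (Python Imaging Library, provided by the Pillow package)
  • cv2 (OpenCV, provided for example by the opencv-python package)
  • random (standard library)
  • shutil (standard library)

Only Pillow and OpenCV need to be installed separately; os, random, and shutil ship with Python.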

Installation

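You can install mlutils via pip: pip install mlutils. Note that the distribution files listed under Download files are named pymlimageutils, so pip install pymlimageutils may be the command that actually installs this package.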

Usage

Check for corrupted images

Use the check_corrupted_images_in_directory(dir_in) function to check for corrupted images in a directory. It returns the number of corrupted images found.

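from mlutils import check_corrupted_images_in_directory
# The return value is the number of corrupted images found
num_corrupted = check_corrupted_images_in_directory("/path/to/images/folder")
print(f"Corrupted images: {num_corrupted}")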
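
For readers who want to see what such a check can look like, here is a minimal sketch. It is not the mlutils implementation; it assumes that "corrupted" means Pillow cannot open or verify the file. The function name count_corrupted_images and the IMAGE_EXTENSIONS tuple are hypothetical names chosen for this example.

```python
import os

from PIL import Image

# Assumed set of extensions to treat as images (not taken from mlutils).
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".bmp", ".gif")


def count_corrupted_images(dir_in):
    """Return the number of image files in dir_in that Pillow cannot verify."""
    corrupted = 0
    for name in sorted(os.listdir(dir_in)):
        if not name.lower().endswith(IMAGE_EXTENSIONS):
            continue  # skip non-image files
        path = os.path.join(dir_in, name)
        try:
            with Image.open(path) as img:
                # verify() checks file integrity without decoding pixel data
                img.verify()
        except Exception:
            print(f"Corrupted image: {path}")
            corrupted += 1
    return corrupted
```

Because verify() does not decode the pixel data, a file that passes this check can still fail later when fully loaded; a stricter check would also call load() on a freshly opened copy.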

Check image naming conventions

Use the check_naming_conventions(dir_in) function to check and fix image naming conventions in a directory. This function renames images with upper-case file extensions to lower-case and changes "jpeg" to "jpg" file extensions.

from mlutils import check_naming_conventions
check_naming_conventions("/path/to/images/folder")
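
The renaming rule described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the package's code: the function name normalize_image_extensions, the IMAGE_EXTENSIONS set, and the choice to skip a rename when the target name already exists are all hypothetical.

```python
import os

# Assumed set of extensions to treat as images (not taken from mlutils).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}


def normalize_image_extensions(dir_in):
    """Lower-case image extensions and turn .jpeg into .jpg.

    Returns the number of files that were renamed.
    """
    renamed = 0
    for name in sorted(os.listdir(dir_in)):
        src = os.path.join(dir_in, name)
        if not os.path.isfile(src):
            continue
        stem, ext = os.path.splitext(name)
        new_ext = ext.lower()
        if new_ext not in IMAGE_EXTENSIONS:
            continue  # leave non-image files untouched
        if new_ext == ".jpeg":
            new_ext = ".jpg"
        if new_ext == ext:
            continue  # already follows the convention
        dst = os.path.join(dir_in, stem + new_ext)
        # Do not overwrite a different file that already has the target name.
        if os.path.exists(dst) and not os.path.samefile(src, dst):
            print(f"Skipping {src}: {dst} already exists")
            continue
        os.rename(src, dst)
        renamed += 1
    return renamed
```

Skipping on a name clash is a conservative choice for this sketch; the source does not say how mlutils handles that case.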

Detect and fix images with a premature ending

Use the detect_and_fix_premature_ending(dir_path) function to detect and fix images with a premature ending in a directory. It returns the number of fixed images.

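from mlutils import detect_and_fix_premature_ending
# The return value is the number of images that were fixed
num_fixed = detect_and_fix_premature_ending("/path/to/images/folder")
print(f"Fixed images: {num_fixed}")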
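
The source does not define "premature ending". One plausible reading, assumed here, is a JPEG file that lacks the end-of-image (EOI) marker, the two bytes FF D9 that close a JPEG stream. The sketch below detects that case and appends the marker. The function name fix_missing_jpeg_eoi is hypothetical, and mlutils may well use a different repair, such as re-encoding the image with OpenCV.

```python
import os

JPEG_SOI = b"\xff\xd8"  # start-of-image marker
JPEG_EOI = b"\xff\xd9"  # end-of-image marker


def fix_missing_jpeg_eoi(dir_path):
    """Append the EOI marker to JPEG files that lack it.

    Returns the number of files that were modified.
    """
    fixed = 0
    for name in sorted(os.listdir(dir_path)):
        if not name.lower().endswith((".jpg", ".jpeg")):
            continue
        path = os.path.join(dir_path, name)
        with open(path, "rb") as f:
            data = f.read()
        if not data.startswith(JPEG_SOI):
            continue  # not a JPEG stream, leave it alone
        if data.endswith(JPEG_EOI):
            continue  # file already ends with the marker
        with open(path, "ab") as f:
            f.write(JPEG_EOI)
        fixed += 1
    return fixed
```

Appending the marker does not recover image data lost to truncation; it only closes the stream. Files with padding bytes after a valid EOI marker would also be flagged by this simple test.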

Separate and copy files

Use the separate_and_copy_files(folders, dest_main_folder) function to separate image and label files from multiple folders and copy them to a new destination folder.

from mlutils import separate_and_copy_files
folders = ["/path/to/1st/images/folder", "/path/to/2nd/images/folder"]
dest_main_folder = "/path/to/destination/folder"
separate_and_copy_files(folders, dest_main_folder)
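
As a rough illustration of the idea, the sketch below sorts files by extension and copies them into two subfolders. Everything specific in it is an assumption rather than documented mlutils behavior: the function name separate_and_copy, the extension tuples, and the images/ and labels/ subfolder layout.

```python
import os
import shutil

# Assumed extensions; the source does not say which ones mlutils recognizes.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".bmp")
LABEL_EXTENSIONS = (".txt", ".xml", ".json")


def separate_and_copy(folders, dest_main_folder):
    """Copy images and labels from folders into separate subfolders.

    Returns a dict with the number of copied images and labels.
    """
    # Assumed destination layout: <dest>/images and <dest>/labels
    images_dir = os.path.join(dest_main_folder, "images")
    labels_dir = os.path.join(dest_main_folder, "labels")
    os.makedirs(images_dir, exist_ok=True)
    os.makedirs(labels_dir, exist_ok=True)

    counts = {"images": 0, "labels": 0}
    for folder in folders:
        for name in sorted(os.listdir(folder)):
            src = os.path.join(folder, name)
            if not os.path.isfile(src):
                continue
            lower = name.lower()
            if lower.endswith(IMAGE_EXTENSIONS):
                shutil.copy2(src, os.path.join(images_dir, name))
                counts["images"] += 1
            elif lower.endswith(LABEL_EXTENSIONS):
                shutil.copy2(src, os.path.join(labels_dir, name))
                counts["labels"] += 1
    return counts
```

In this sketch, files with the same name in different source folders overwrite each other in the destination, so unique file names across folders are assumed.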

Split a dataset

Use the split_dataset(images_path, labels_path, img_val_path, label_val_path, split_fraction) function to split a dataset into training and validation sets. The function moves a fraction of images and their corresponding label files to validation folders.

from mlutils import split_dataset
images_path = "/path/to/training/images"
labels_path = "/path/to/training/labels"
img_val_path = "/path/to/validation/images"
label_val_path = "/path/to/validation/labels"
split_fraction = 0.2 # 20% of the data will be moved to validation set
split_dataset(images_path, labels_path, img_val_path, label_val_path, split_fraction)
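
The move-a-random-fraction step can be sketched as follows. This is an illustration, not the mlutils source: the function name split_dataset_sketch and the label_ext and seed parameters are hypothetical, and it assumes each label file shares its image's base name with a .txt extension.

```python
import os
import random
import shutil


def split_dataset_sketch(images_path, labels_path, img_val_path,
                         label_val_path, split_fraction,
                         label_ext=".txt", seed=None):
    """Move a random fraction of images and matching labels to validation folders.

    Returns the number of images that were moved.
    """
    os.makedirs(img_val_path, exist_ok=True)
    os.makedirs(label_val_path, exist_ok=True)

    images = sorted(
        name for name in os.listdir(images_path)
        if os.path.isfile(os.path.join(images_path, name))
    )
    num_val = int(len(images) * split_fraction)

    # A seeded generator makes the split reproducible (hypothetical option).
    rng = random.Random(seed)
    for image_name in rng.sample(images, num_val):
        shutil.move(os.path.join(images_path, image_name),
                    os.path.join(img_val_path, image_name))
        # Assumed pairing rule: same base name, label_ext extension.
        label_name = os.path.splitext(image_name)[0] + label_ext
        label_src = os.path.join(labels_path, label_name)
        if os.path.exists(label_src):
            shutil.move(label_src, os.path.join(label_val_path, label_name))
    return num_val
```

Because files are moved rather than copied, running a split like this twice takes a further fraction out of the remaining training data.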

Download files

Source Distribution

pymlimageutils-0.1.0.tar.gz (3.9 kB)

Uploaded Source

Built Distribution

pymlimageutils-0.1.0-py3-none-any.whl (4.6 kB)

Uploaded Python 3

File details

Details for the file pymlimageutils-0.1.0.tar.gz.

File metadata

  • Download URL: pymlimageutils-0.1.0.tar.gz
  • Upload date:
  • Size: 3.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.4.0 CPython/3.10.6 Linux/5.19.0-35-generic

File hashes

Hashes for pymlimageutils-0.1.0.tar.gz
Algorithm Hash digest
SHA256 113202225b52e22c80f7e3fc09f720def229388b1f22abc17a63e721d2adedb8
MD5 b7673ba40510fb6de4af049e89519102
BLAKE2b-256 56a42d83c3305de744d61e1c085f039ec6aac115fb281e2166a8b71bbeb9b42c

File details

Details for the file pymlimageutils-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: pymlimageutils-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 4.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.4.0 CPython/3.10.6 Linux/5.19.0-35-generic

File hashes

Hashes for pymlimageutils-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 75cedaf24f0a4d85428eedbfa12c2d3ad476c50d63a8a50688c2bf6782fb44df
MD5 acd2927495fe8b98b4f47448ecc91190
BLAKE2b-256 92c844ba359df15d648f7ea84affd164e1ecfdecbc464e788d22c94d5d5c301e
