mlutils
This package is useful for checking image data quality, separating image and label files, and splitting datasets.
Requirements
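The package imports the modules listed below. Only PIL and cv2 need to be installed (they are distributed on PyPI as Pillow and opencv-python); os, random, and shutil are part of the Python standard library.
- os
- PIL (Python Imaging Library)
- cv2 (OpenCV)
- random
- shutil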
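Before using the package, it can help to confirm that these modules are importable. The following is a minimal sketch and not part of mlutils: the helper name `missing_modules` is hypothetical, and it relies on `importlib.util.find_spec` from the standard library to look up each top-level module without importing it.

```python
import importlib.util

# Module names taken from the requirements list.
REQUIRED_MODULES = ["os", "PIL", "cv2", "random", "shutil"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be located by this interpreter.

    Hypothetical helper for illustration; mlutils does not provide it.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

An empty result means every listed module was found; any name that is returned needs to be installed first.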
Installation
You can install mlutils via pip:
pip install mlutils
Usage
Check for corrupted images
Use the check_corrupted_images_in_directory(dir_in) function to check for corrupted images in a directory. It returns the number of corrupted images found.
from mlutils import check_corrupted_images_in_directory
check_corrupted_images_in_directory("/path/to/images/folder")
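To give a feel for what a corruption check can look like, here is a minimal sketch that uses only the standard library. It is not the mlutils implementation: the function name `count_suspect_images` and the signature table are assumptions made for illustration, and the check is deliberately shallow (it compares the first bytes of each file with the signature expected for its extension, rather than decoding the image as an imaging library would).

```python
import os

# Leading "magic bytes" for two common formats (assumed set, for illustration).
SIGNATURES = {
    ".jpg": b"\xff\xd8\xff",
    ".jpeg": b"\xff\xd8\xff",
    ".png": b"\x89PNG\r\n\x1a\n",
}


def count_suspect_images(dir_in):
    """Count files whose first bytes do not match their extension's signature.

    Hypothetical helper; a full check would also try to decode the image.
    """
    suspect = 0
    for name in sorted(os.listdir(dir_in)):
        path = os.path.join(dir_in, name)
        expected = SIGNATURES.get(os.path.splitext(name)[1].lower())
        if expected is None or not os.path.isfile(path):
            continue  # skip directories and file types this sketch does not know
        with open(path, "rb") as fh:
            header = fh.read(len(expected))
        if header != expected:
            suspect += 1
    return suspect
```

Like the function described above, the sketch returns a count, so a result of 0 means no file failed this particular check.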
Check image naming conventions
Use the check_naming_conventions(dir_in) function to check and fix image naming conventions in a directory. It renames images whose file extensions are upper-case to lower-case and changes the "jpeg" extension to "jpg".
from mlutils import check_naming_conventions
check_naming_conventions("/path/to/images/folder")
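The renaming rule described here can be sketched with the standard library alone. This is an illustration under stated assumptions, not the package's code: the name `normalize_extensions`, the set of image extensions, and the choice to skip a file when the target name is already taken are all hypothetical.

```python
import os

# Extensions treated as images in this sketch (an assumption).
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp"}


def normalize_extensions(dir_in):
    """Lower-case image extensions and turn ".jpeg" into ".jpg".

    Returns the number of files renamed. Hypothetical helper for illustration.
    """
    renamed = 0
    for name in sorted(os.listdir(dir_in)):
        stem, ext = os.path.splitext(name)
        new_ext = ext.lower()
        if new_ext not in IMAGE_EXTS:
            continue  # leave non-image files untouched
        if new_ext == ".jpeg":
            new_ext = ".jpg"
        if new_ext == ext:
            continue  # already follows the convention
        src = os.path.join(dir_in, name)
        dst = os.path.join(dir_in, stem + new_ext)
        # Do not overwrite a different file that already uses the target name.
        if os.path.exists(dst) and not os.path.samefile(src, dst):
            continue
        os.rename(src, dst)
        renamed += 1
    return renamed
```

Skipping on a name clash is a conservative choice: it leaves both files in place so that nothing is silently overwritten.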
Detect and fix images with a premature ending
Use the detect_and_fix_premature_ending(dir_path) function to detect and fix images with a premature ending in a directory. It returns the number of fixed images.
from mlutils import detect_and_fix_premature_ending
detect_and_fix_premature_ending("/path/to/images/folder")
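One common way to spot a prematurely ended JPEG is to check whether the file finishes with the End Of Image marker (the bytes `FF D9`). The sketch below covers the detection half only and is not taken from mlutils: the name `find_truncated_jpegs` is hypothetical, the check is a heuristic (a valid file with extra trailing bytes would also be flagged), and the repair step, which would typically re-encode the image with an imaging library, is not shown.

```python
import os

JPEG_EOI = b"\xff\xd9"  # End Of Image marker that closes a complete JPEG stream


def find_truncated_jpegs(dir_path):
    """Return the names of JPEG files that do not end with the EOI marker.

    Hypothetical helper for illustration; detection only, no repair.
    """
    truncated = []
    for name in sorted(os.listdir(dir_path)):
        if not name.lower().endswith((".jpg", ".jpeg")):
            continue
        path = os.path.join(dir_path, name)
        if not os.path.isfile(path):
            continue
        if os.path.getsize(path) < 2:
            truncated.append(name)  # too short to hold the marker at all
            continue
        with open(path, "rb") as fh:
            fh.seek(-2, os.SEEK_END)  # jump to the last two bytes
            if fh.read(2) != JPEG_EOI:
                truncated.append(name)
    return truncated
```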
Separate and copy files
Use the separate_and_copy_files(folders, dest_main_folder) function to separate image and label files from multiple folders and copy them to a new destination folder.
from mlutils import separate_and_copy_files
folders = ["/path/to/1st/images/folder", "/path/to/2nd/images/folder"]
dest_main_folder = "/path/to/destination/folder"
separate_and_copy_files(folders, dest_main_folder)
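The separation step can be pictured roughly as follows. This is a sketch under assumptions rather than the package's implementation: the function name `separate_and_copy_sketch`, the `images` and `labels` subfolder names, and the idea that labels are `.txt` files are all hypothetical choices made for the example.

```python
import os
import shutil

IMAGE_EXTS = {".jpg", ".jpeg", ".png"}  # assumed image file types
LABEL_EXTS = {".txt"}                   # assumed label file type


def separate_and_copy_sketch(folders, dest_main_folder):
    """Copy images and labels from several folders into two subfolders.

    Returns a dict with the number of files copied per kind.
    Hypothetical helper; subfolder names are an assumption.
    """
    images_dir = os.path.join(dest_main_folder, "images")
    labels_dir = os.path.join(dest_main_folder, "labels")
    os.makedirs(images_dir, exist_ok=True)
    os.makedirs(labels_dir, exist_ok=True)

    copied = {"images": 0, "labels": 0}
    for folder in folders:
        for name in sorted(os.listdir(folder)):
            src = os.path.join(folder, name)
            if not os.path.isfile(src):
                continue
            ext = os.path.splitext(name)[1].lower()
            # Files with the same name in different folders overwrite each other.
            if ext in IMAGE_EXTS:
                shutil.copy2(src, os.path.join(images_dir, name))
                copied["images"] += 1
            elif ext in LABEL_EXTS:
                shutil.copy2(src, os.path.join(labels_dir, name))
                copied["labels"] += 1
    return copied
```

Because the files are copied rather than moved, the source folders are left unchanged.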
Split a dataset
Use the split_dataset(images_path, labels_path, img_val_path, label_val_path, split_fraction) function to split a dataset into training and validation sets. The function moves a fraction of images and their corresponding label files to validation folders.
from mlutils import split_dataset
images_path = "/path/to/training/images"
labels_path = "/path/to/training/labels"
img_val_path = "/path/to/validation/images"
label_val_path = "/path/to/validation/labels"
split_fraction = 0.2 # 20% of the data will be moved to validation set
split_dataset(images_path, labels_path, img_val_path, label_val_path, split_fraction)
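For readers who want to see the mechanics of such a split, here is a minimal sketch built on `random` and `shutil`. It is not the mlutils implementation: the name `split_dataset_sketch`, the optional `seed` parameter, and the assumption that each image has a label file with the same stem and a `.txt` extension are introduced only for this example.

```python
import os
import random
import shutil


def split_dataset_sketch(images_path, labels_path, img_val_path,
                         label_val_path, split_fraction, seed=None):
    """Move a random fraction of images, with their labels, to validation folders.

    Returns the sorted list of image names that were moved.
    Hypothetical helper; `seed` and the ".txt" label naming are assumptions.
    """
    os.makedirs(img_val_path, exist_ok=True)
    os.makedirs(label_val_path, exist_ok=True)

    images = sorted(
        name for name in os.listdir(images_path)
        if os.path.isfile(os.path.join(images_path, name))
    )
    n_val = int(len(images) * split_fraction)  # rounds down
    chosen = random.Random(seed).sample(images, n_val)

    for name in chosen:
        shutil.move(os.path.join(images_path, name),
                    os.path.join(img_val_path, name))
        label_name = os.path.splitext(name)[0] + ".txt"
        label_src = os.path.join(labels_path, label_name)
        if os.path.exists(label_src):  # an image may have no label file
            shutil.move(label_src, os.path.join(label_val_path, label_name))
    return sorted(chosen)
```

Passing a fixed `seed` makes the selection repeatable, which is convenient when the same split has to be recreated later.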
File details
Details for the file pymlimageutils-0.1.0.tar.gz.
File metadata
- Download URL: pymlimageutils-0.1.0.tar.gz
- Upload date:
- Size: 3.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.4.0 CPython/3.10.6 Linux/5.19.0-35-generic
File hashes
Algorithm | Hash digest
---|---
SHA256 | 113202225b52e22c80f7e3fc09f720def229388b1f22abc17a63e721d2adedb8
MD5 | b7673ba40510fb6de4af049e89519102
BLAKE2b-256 | 56a42d83c3305de744d61e1c085f039ec6aac115fb281e2166a8b71bbeb9b42c
File details
Details for the file pymlimageutils-0.1.0-py3-none-any.whl.
File metadata
- Download URL: pymlimageutils-0.1.0-py3-none-any.whl
- Upload date:
- Size: 4.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.4.0 CPython/3.10.6 Linux/5.19.0-35-generic
File hashes
Algorithm | Hash digest
---|---
SHA256 | 75cedaf24f0a4d85428eedbfa12c2d3ad476c50d63a8a50688c2bf6782fb44df
MD5 | acd2927495fe8b98b4f47448ecc91190
BLAKE2b-256 | 92c844ba359df15d648f7ea84affd164e1ecfdecbc464e788d22c94d5d5c301e