NPEC pipeline
pyphenotyper
PyPhenotyper is a Python library and command-line tool, hosted on GitHub, for high-throughput phenotyping of Arabidopsis plants. It automates the measurement of morphological traits and offers a user-friendly interface for both novice and advanced users. Because it can handle large datasets and its analysis options are customizable, PyPhenotyper supports comprehensive studies of plant growth and development.
Getting Started
Installing
The pyphenotyper package can be installed using pip:
pip install pyphenotyper
Usage
pyphenotyper can be used both from the command line and as a Python library.
Command line usage
The interactive CLI prompts (described below) remove the need for command-line arguments:
poetry run python main.py
CLI interaction
1. Inference Pipeline
Starting Pyphenotyper
First Prompt
The first prompt asks the user to confirm that all of the images to be analyzed have been placed in the folder called 'input'.
Second Prompt
The second prompt lets the user either use the pre-trained models or choose their own. If **'n'** is entered, a third prompt appears asking the user to provide the full path to the model.
Output
The output of the pipeline is saved in the newly created timeseries folder.
The output for each image follows the structure below.
./timeseries/
- {IMAGE NAME}
- {IMAGE NAME}
- plant_{n}
- landmarked_image.png
- landmarks.xlsx
- plant_data.xlsx
- plant_measurements.xlsx
- root_mask.png
- shoot_mask.png
- shoot_root_mask.png
- image_mask.png
- measurements.xlsx
- occlusion_mask.png
- root_mask.png
- root_mask_fixed.png
- root_structure.rsml
- shoot_structure.rsml
- plant_{n}
- assets
- lateral_length.png
- plant_{n}.png
- primary_length.png
- total_length.png
- {IMAGE NAME}.png
- {IMAGE NAME}
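Because each analyzed image gets its own folder under timeseries, the results can be collected with a short script. The sketch below is not part of pyphenotyper: `find_measurement_files` is a hypothetical helper, and it relies only on the file name measurements.xlsx listed above.

```python
from pathlib import Path


def find_measurement_files(root="timeseries", filename="measurements.xlsx"):
    """Return every file named `filename` below `root`, sorted by path."""
    root_path = Path(root)
    if not root_path.is_dir():
        # The pipeline has not written any output yet.
        return []
    return sorted(p for p in root_path.rglob(filename) if p.is_file())


if __name__ == "__main__":
    for path in find_measurement_files():
        print(path)
```

The same pattern works for the other outputs, for example by passing `filename="root_structure.rsml"`.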
2. Data Preparation Pipeline
Requirements
The only extra module the script uses is shutil.
from data.data_processing import padder, patch_image, roi_extraction_coords_direct
Usage
To run the script, use the following command in your terminal:
python data_prep_pipeline.py <image_folder> <masks_folder>
- image_folder: Path to the folder containing images.
- masks_folder: Path to the folder containing masks.
Ensure that:
- masks and images must both be in .png format
- there must be at least 10 images (with their respective masks); otherwise the data cannot be prepared
- each mask must have exactly the same file name as its corresponding image
- the masks do not have to be normalized (values between 0 and 1), but normalizing them beforehand is recommended because the script otherwise takes longer to run
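These requirements can be checked before launching the script. The following is a minimal sketch, not the script's own validation code: `check_dataset` and `MIN_IMAGES` are hypothetical names, and the checks mirror only the rules listed above.

```python
from pathlib import Path

MIN_IMAGES = 10  # minimum number of images stated in the requirements


def check_dataset(image_folder, masks_folder):
    """Return a list of problems; an empty list means the folders look usable."""
    problems = []
    png_names = {}
    for label, folder in (("image", image_folder), ("mask", masks_folder)):
        path = Path(folder)
        if not path.is_dir():
            problems.append(f"{label} folder does not exist: {folder}")
            continue
        png_names[label] = {p.name for p in path.iterdir() if p.suffix == ".png"}
        if not png_names[label]:
            problems.append(f"{label} folder contains no .png files: {folder}")
    if problems:
        return problems

    images, masks = png_names["image"], png_names["mask"]
    if len(images) < MIN_IMAGES:
        problems.append(f"found {len(images)} images, need at least {MIN_IMAGES}")
    # Names present in only one of the two folders have no partner file.
    unmatched = sorted(images ^ masks)
    if unmatched:
        problems.append("files without a same-named partner: " + ", ".join(unmatched))
    return problems
```

Calling `check_dataset("personal_data/images", "personal_data/masks")` returns an empty list when both folders satisfy the rules.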
Example
python data_prep_pipeline.py personal_data/images personal_data/masks
Steps
- Validation: The script checks that the specified folders exist and contain .png files. It also ensures that the filenames in both folders match and that there are at least 10 images for the split.
- Cropping (Optional): The script prompts the user to decide whether to crop the images and masks. If cropping is chosen, it creates new folders with the cropped images and masks.
- Folder Structure Creation: The script creates the following folder structure in the base directory of the provided image and mask folders:
  - train_images/train
  - train_masks/train
  - val_images/val
  - val_masks/val
  - test_images/test
  - test_masks/test
- Data Splitting: The script splits the images and masks into training (60%), validation (20%), and test (20%) sets, and copies them to the respective folders.
- Padding: The script prompts the user to input a patch size (256 or 512). It pads all the images and masks in the created folders to match the specified patch size.
- Patching: The script divides each padded image and mask into smaller patches and saves them with a naming convention indicating the original image and patch number.
- Cleanup: The script ensures all files in the matching folders (e.g., train_images/train and train_masks/train) have the same names and deletes the original padded images, keeping only the patches.
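The 60/20/20 split in the Data Splitting step can be illustrated as follows. This is a sketch under assumptions, not the script's split_data implementation: `split_files` is a hypothetical name, and the seeded shuffle is an illustrative choice that the source does not describe.

```python
import random


def split_files(files, train_ratio=0.6, val_ratio=0.2, seed=0):
    """Split file names into (train, val, test); test receives the remainder."""
    shuffled = sorted(files)
    # A fixed seed keeps the split reproducible between runs (assumption).
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_ratio)
    n_val = int(len(shuffled) * val_ratio)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


if __name__ == "__main__":
    names = [f"img_{i}.png" for i in range(10)]
    train, val, test = split_files(names)
    print(len(train), len(val), len(test))
```

Because a mask shares its file name with its image, applying the same list of names to both folders keeps every image paired with its mask.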
Functions
- validate_folder(folder: str, folder_type: str): Validates if the folder exists and contains .png files.
- create_folder_structure(base_path: str): Creates the required folder structure.
- split_data(files: list, train_ratio: float, val_ratio: float): Splits data into training, validation, and test sets.
- copy_files(files: list, src_folder: str, dest_folder: str): Copies files from the source folder to the destination folder.
- normalize_masks(mask_folder: str): Normalizes mask files to be binary (0 and 1). If they are between 0 and 255, they are divided by 255.
- pad_and_save(folder: str, patch_size: int): Pads images to the specified patch size and saves them.
- patch_and_save(folder: str, patch_size: int): Patches images into smaller patches and saves the patches.
- validate_and_cleanup(images_folder: str, masks_folder: str): Validates and cleans up the padded images, keeping only the patches.
- main(image_folder: str, masks_folder: str): Main function to process images and masks, including validation, optional cropping, folder structure creation, data splitting, normalization, padding, patching, and cleanup.
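The normalization rule described for normalize_masks can be shown on plain pixel values. This is only an illustration of the rule, not the library's code: `normalize_mask_values` is a hypothetical helper that works on a flat list of numbers rather than on image files.

```python
def normalize_mask_values(pixels):
    """Scale a flat sequence of mask values into the range [0, 1]."""
    if max(pixels, default=0) <= 1:
        # Already normalized (or empty): return a copy unchanged.
        return list(pixels)
    # Values on the 0-255 scale are divided by 255.
    return [value / 255 for value in pixels]


if __name__ == "__main__":
    print(normalize_mask_values([0, 255, 0, 255]))
```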
Notes
- The cropping functionality is optional and can be skipped.
- The patch size can be specified as either 256 or 512.
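To see what a patch size implies for a given image, the arithmetic can be sketched as below. The source does not say how padding is applied, so this assumes each dimension is padded up to the next multiple of the patch size and that patches do not overlap; `padded_size` and `patch_boxes` are hypothetical names, not functions of the package.

```python
def padded_size(size, patch_size):
    """Smallest multiple of patch_size that is greater than or equal to size."""
    return -(-size // patch_size) * patch_size  # ceiling division


def patch_boxes(height, width, patch_size):
    """Yield (row, col, top, left) for each tile of the padded image."""
    padded_h = padded_size(height, patch_size)
    padded_w = padded_size(width, patch_size)
    for row, top in enumerate(range(0, padded_h, patch_size)):
        for col, left in enumerate(range(0, padded_w, patch_size)):
            yield row, col, top, left


if __name__ == "__main__":
    boxes = list(patch_boxes(1000, 600, 256))
    print(padded_size(1000, 256), padded_size(600, 256), len(boxes))
```

Under these assumptions, a 1000 x 600 image padded for a patch size of 256 becomes 1024 x 768 and yields 12 patches.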
Purpose
This script streamlines and automates the preprocessing of image and mask data: it validates the files, optionally crops them, pads them to match the patch size, divides them into patches, and organizes them into training, validation, and test sets.
Server usage
Requirements
Server usage requires a Docker environment.
Library usage
For more information check out our official Sphinx documentation.
Versioning
We use Docker Hub for versioning.