
NPEC pipeline


pyphenotyper


PyPhenotyper is a Python library and command-line tool hosted on GitHub, specializing in high-throughput phenotyping of Arabidopsis plants. It automates the measurement of various morphological traits, offering a user-friendly interface for both novice and advanced users. With its capability to handle large datasets and customizable analysis options, PyPhenotyper facilitates comprehensive studies of plant growth and development.

Getting Started


Installing

The pyphenotyper package can be installed using pip:

pip install pyphenotyper

Usage

pyphenotyper can be used both from the command line and as a Python library.

Command line usage

The interactive CLI prompts (shown below) eliminate the need for command-line arguments:

poetry run python main.py

CLI interaction

1. Inference Pipeline


Starting Pyphenotyper

First Prompt


The first prompt asks the user to confirm that all of the images they want to analyze have been placed in the folder called 'input'.

Second Prompt


The second prompt gives the user the choice of either using the pre-trained models or supplying their own. If **'n'** is entered, a third prompt appears asking the user to provide the full path to the model.
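The confirmation flow described above can be sketched as a simple yes/no helper; this is an illustrative snippet, not the tool's actual prompt code, and the prompt texts are paraphrased from the descriptions above:

```python
def confirm(question: str) -> bool:
    """Ask a yes/no question on stdin, re-asking until 'y' or 'n' is entered."""
    while True:
        answer = input(f"{question} [y/n]: ").strip().lower()
        if answer in ("y", "n"):
            return answer == "y"


# Illustrative flow mirroring the prompts described above:
# if not confirm("Use the pre-trained models?"):
#     model_path = input("Full path to the model: ")
```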


Output

The output of the pipeline is saved in the newly created timeseries folder.

The output for each image follows the structure below.

./timeseries/

  • {IMAGE NAME}
    • {IMAGE NAME}
      • plant_{n}
        • landmarked_image.png
        • landmarks.xlsx
        • plant_data.xlsx
        • plant_measurements.xlsx
        • root_mask.png
        • shoot_mask.png
        • shoot_root_mask.png
      • image_mask.png
      • measurements.xlsx
      • occlusion_mask.png
      • root_mask.png
      • root_mask_fixed.png
      • root_structure.rsml
      • shoot_structure.rsml
    • assets
      • lateral_length.png
      • plant_{n}.png
      • primary_length.png
      • total_length.png
    • {IMAGE NAME}.png

2. Data Preparation Pipeline

Requirements

The only extra module it uses is shutil (from the Python standard library).

from data.data_processing import padder, patch_image, roi_extraction_coords_direct

Usage

To run the script, use the following command in your terminal:

python data_prep_pipeline.py <image_folder> <masks_folder>
  • image_folder: Path to the folder containing images.
  • masks_folder: Path to the folder containing masks.

Ensure that:

  • masks and images are both in .png format
  • there are at least 10 images (with their respective masks); with fewer than 10, the data cannot be split
  • each mask has exactly the same filename as its corresponding image
  • masks do not have to be normalized (values between 0 and 1), but normalization is recommended; otherwise the script will take longer to run

Example

python data_prep_pipeline.py personal_data/images personal_data/masks

Steps

  1. Validation: The script checks if the specified folders exist and contain .png files. It also ensures that the filenames in both folders match and that there are at least 10 images for the split.

  2. Cropping (Optional): The script prompts the user to decide whether to crop the images and masks. If cropping is chosen, it creates new folders with the cropped images and masks.

  3. Folder Structure Creation: The script creates the following folder structure in the base directory of the provided image and mask folders:

    train_images/train
    train_masks/train
    val_images/val
    val_masks/val
    test_images/test
    test_masks/test
    
  4. Data Splitting: The script splits the images and masks into training (60%), validation (20%), and test (20%) sets, and copies them to the respective folders.

  5. Padding: The script prompts the user to input a patch size (256 or 512). It pads all the images and masks in the created folders to match the specified patch size.

  6. Patching: The script divides each padded image and mask into smaller patches and saves them with a naming convention indicating the original image and patch number.

  7. Cleanup: The script ensures all files in the matching folders (e.g., train_images/train and train_masks/train) have the same names and deletes the original padded images, keeping only the patches.
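The splitting, padding, and patching steps above can be sketched roughly as follows. This is an illustrative re-implementation based on the documented 60/20/20 split and patch-size rules, not the package's actual code; function names are ours:

```python
import random
import numpy as np


def split_files(files, train_ratio=0.6, val_ratio=0.2, seed=0):
    """Shuffle and split filenames into train/val/test sets (60/20/20 by default)."""
    files = sorted(files)
    random.Random(seed).shuffle(files)
    n_train = int(len(files) * train_ratio)
    n_val = int(len(files) * val_ratio)
    return files[:n_train], files[n_train:n_train + n_val], files[n_train + n_val:]


def pad_to_patch_size(img: np.ndarray, patch_size: int) -> np.ndarray:
    """Zero-pad height and width up to the next multiple of patch_size."""
    h, w = img.shape[:2]
    pad = [(0, (-h) % patch_size), (0, (-w) % patch_size)] + [(0, 0)] * (img.ndim - 2)
    return np.pad(img, pad)


def make_patches(img: np.ndarray, patch_size: int):
    """Cut a padded image into non-overlapping patch_size x patch_size tiles."""
    h, w = img.shape[:2]
    return [
        img[r:r + patch_size, c:c + patch_size]
        for r in range(0, h, patch_size)
        for c in range(0, w, patch_size)
    ]
```

For example, a 300x500 image padded for patch size 256 becomes 512x512 and yields four 256x256 patches, which the real pipeline would then save with a name encoding the original image and patch number.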

Functions

  • validate_folder(folder: str, folder_type: str): Validates if the folder exists and contains .png files.
  • create_folder_structure(base_path: str): Creates the required folder structure.
  • split_data(files: list, train_ratio: float, val_ratio: float): Splits data into training, validation, and test sets.
  • copy_files(files: list, src_folder: str, dest_folder: str): Copies files from the source folder to the destination folder.
  • normalize_masks(mask_folder: str): Normalizes mask files to be binary (0 and 1). If they are between 0 and 255, they are divided by 255.
  • pad_and_save(folder: str, patch_size: int): Pads images to the specified patch size and saves them.
  • patch_and_save(folder: str, patch_size: int): Patches images into smaller patches and saves the patches.
  • validate_and_cleanup(images_folder: str, masks_folder: str): Validates and cleans up the padded images, keeping only the patches.
  • main(image_folder: str, masks_folder: str): Main function to process images and masks, including validation, optional cropping, folder structure creation, data splitting, normalization, padding, patching, and cleanup.
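The normalization rule described for normalize_masks (values between 0 and 255 are divided by 255) can be sketched for a single mask array as follows; this is an illustrative snippet of the documented behavior, not the script's actual function:

```python
import numpy as np


def normalize_mask(mask: np.ndarray) -> np.ndarray:
    """Return a binary 0/1 mask; 0-255 masks are divided by 255, 0/1 masks pass through."""
    if mask.max() > 1:
        mask = mask // 255
    return mask.astype(np.uint8)
```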

Notes

  • The cropping functionality is optional and can be skipped.
  • The patch size can be specified as either 256 or 512.

Purpose

This script streamlines and automates the preprocessing of image and mask data: the data are validated, optionally cropped, padded to the chosen patch size, divided into patches, and organized into training, validation, and test sets.

Server usage

Requirements

Server usage requires a Docker environment.

Library usage

For more information check out our official Sphinx documentation.

Versioning

We use Docker Hub for versioning.

Project details


Download files

Download the file for your platform.

Source Distribution

pyphenotyper-0.1.2b1.tar.gz (57.8 MB)

Uploaded Source

Built Distribution


pyphenotyper-0.1.2b1-py3-none-any.whl (57.8 MB)

Uploaded Python 3

File details

Details for the file pyphenotyper-0.1.2b1.tar.gz.

File metadata

  • Download URL: pyphenotyper-0.1.2b1.tar.gz
  • Upload date:
  • Size: 57.8 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.10.14 Linux/6.5.0-1022-azure

File hashes

Hashes for pyphenotyper-0.1.2b1.tar.gz:

  • SHA256: f8b8bf74936869e2e4b75019f3d0b346362c0455bf2700526e0fe436482a11ec
  • MD5: a677babb64b5af2ec736ad02a283aed7
  • BLAKE2b-256: ca21f464702a879bf4166ef9e24abfdfbddbcb7ce4ce58b25654471e79004d46
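To check a downloaded file against the digests above, you can compute its SHA256 locally, for instance with Python's standard hashlib module:

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 8192) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


# Compare the result against the published digest, e.g.:
# sha256_of_file("pyphenotyper-0.1.2b1.tar.gz") == "f8b8bf74...482a11ec"
```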


File details

Details for the file pyphenotyper-0.1.2b1-py3-none-any.whl.

File metadata

  • Download URL: pyphenotyper-0.1.2b1-py3-none-any.whl
  • Upload date:
  • Size: 57.8 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.10.14 Linux/6.5.0-1022-azure

File hashes

Hashes for pyphenotyper-0.1.2b1-py3-none-any.whl:

  • SHA256: 8758acd546aa408cd5b327594e98737014943453672f2144c7dbadb1d02f52da
  • MD5: 7420ef3aa8798569e2667768bc533cf5
  • BLAKE2b-256: 60a9e4f4d0f3708043f3add95f42ea256254cd40adaa3b7d55e4bdef9fae5791

