Skip to main content

Package for pre- and post-processing of images and data for working with ilastik-software

Project description

caactus

caactus (cell analysis and counting tool using ilastik software) is a collection of python scripts to provide a streamlined workflow for ilastik-software, including data preparation, processing and analysis. It aims to provide biologist with an easy-to-use tool for counting and analyzing cells from a large number of microscopy pictures.

workflow

Introduction

The goal of this script collection is to provide an easy-to-use completion for the Boundary-based segmentation with Multicut-workflow in ilastik. This workflow allows for the automatization of cell-counting from messy microscopic images with different (touching) cell types for biological research. For easy copy & paste, commands are provided in grey code boxes with one-click copy & paste.

Installation

Install miniconda, create an environment and install Python and vigra

  • Download and install miniconda for your respective operating system according to the instructions.
    • Miniconda provides a lightweight package and environment manager. It allows you to create isolated environments so that Python versions and package dependencies required by caactus do not interfere with your system Python or other projects.
  • Once installed, create an environment for using caactus with the following command from your cmd-line
    conda create -n caactus-env -c pytorch -c conda-forge python=3.12 pytorch vigra h5py
    

Install caactus

  • Activate the caactus-env from the cmd-line with
    conda activate caactus-env
    
  • To install caactus plus the needed dependencies inside your environment, use
    pip install caactus
    
  • During the below described steps that call the caactus-scripts, make sure to have the caactus-env activated.

Install ilastik

[!NOTE] We developed the pipeline on ilastik 1.4.0. For optimal user experience, we recommend installing ilastik 1.4.0. Scroll down to "Previous stable versions" on the ilastik download webpage.

Quick Overview of the workflow

Below is a short version of the steps performed. For more detail, please consult Detailed Description of the Workflow.

  1. Culture organism of interest in 96-well plate
  2. Acquire images of cells via microscopy.
  3. Create project directory
  4. Rename Files with caactus.
  5. Convert files to HDF5 Format with the caactus.
  6. Train a pixel classification model in ilastik for and later run it batch-mode.
  7. Train a boundary-based segmentation with Multicut model in ilastik for and later run it batch-mode.
  8. Remove the background from the *_Multicut Segmentation.h5files with caactus.
  9. Train a object classification model in ilastik for and later run it batch-mode.
  10. Pool all csv-tables from the individual images into one global table with caactus.
  • output generated:
    • df_clean.csv
  1. Summarize the data with caactus.
  • output generated:
    • a) df_summary_complete.csv = .csv-table containing also not usable category,
    • b) df_refined_complete.csv = .csv-table without not usable category",
    • c) counts.csv dataframe used in PlnModelling
    • d) stacked bar graph (barchart.png)
  1. Model the count data with caactus
  • output generated:
    • a) correlation_circle.png
    • b) pca_plot.png

[!NOTE] Power users may direclty edit the config.toml and run the scripts from the cmd-line. For instructions, go to 7.1-7.7.

Sample Dataset

  • a sample dataset to quickly test the workflow can be accessed via zenodo
  • to showcase the functionalties, the ilastik steps have been pretrained. Use caactus in batch-modes.

[!IMPORTANT] go to 8.1-8.10 for a detailed tutorial with the sample data set

Detailed Description of the Workflow

1. Culturing

  • Culture your cells in a flat bottom plate of your choice and according to the needs of the organims being researched.

2. Image acquisition

  • In your respective microscopy software environment, export the images of interest to .tif-format.
  • From the image metadata, copy the pixel size.

3. Data Preparation

3.1 Create Project Directory

  • For portability of the ilastik projects create the directory in the following structure:\

[!NOTE] The directory structure below already includes examples of resulting files in each sub-directory.

  • This allows you to copy an already trained workflow and use it multiple times with new datasets, when relative paths are enabled.
project_directory = Main folder  
├── 1_pixel_classification.ilp  
├── 2_boundary_segmentation.ilp  
├── 3_object_classification.ilp
├── renaming.csv
├── conif.toml
├── 0_1_original_tif_training_images
  ├── training-1.tif
  ├── training-2.tif
  ├── ...
├── 0_2_original_tif_batch_images
  ├── image-1.tif
  ├── image-2.tif
  ├── ..
├── 0_3_batch_tif_renamed
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1.tif
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2.tif
  ├── ..
├── 1_images
  ├── training-1.h5
  ├── training-2.h5
  ├── ...
├── 2_probabilities
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_Probabilities.h5
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_Probabilities.h5
  ├── ...
├── 3_multicut
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_Multicut Segmentation.h5
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_Multicut Segmentation.h5
  ├── ...
├── 4_objectclassification
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_Object Predictions.h5
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_table.csv
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_Object Predictions.h5
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_table.csv
  ├── ...
├── 5_batch_images
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1.h5
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2.h5
  ├── ...
├── 6_batch_probabilities
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_Probabilities.h5
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_Probabilities.h5
  ├── ...
├── 7_batch_multicut
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_Multicut Segmentation.h5
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_Multicut Segmentation.h5
  ├── ...
├── 8_batch_objectclassification
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_Object Predictions.h5
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_table.csv
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_Object Predictions.h5
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_table.csv
  ├── ...
├── 9_data_analysis

3.2 Getting started

  • Open the caactus Graphical User Interface (GUI) by opening the command line in Unix or Anaconda Powershell/Prompt in Windows.
  • Make sure you have the caactus environment activated:
conda activate caactus-env
  • Type caactus and hit Enter to start the GUI:
caactus

3.3 The Graphic User Interface (GUI)

The graphic user interface is structured in four parts.

3.3.1 Global Settings

global_settings

  • At the top, enter the path to your Main Folder (use the Browse button or type/ copy&paste the full path).
  • Select Modebetween training and batch.
    • trainingrefers to all steps during annotation of the ilastik classifiers
    • batchrefers to all steps performed on large datasets with ready trained ilastik models used in batch mode and subsequent data analysis.
  • Set shared analysis parameters once in Global Settings: Pixel Size, Variable Names, Class Order, Color Mapping. EUCAST-specific settings can be expanded below.
  • When working with a EUCAST dataset, edit EUCAST settings from the dropdown menu.

3.3.2 Pre-Processing

pre_process

  • The workflow is shown as a numbered list of steps. When in training or batch modes, select the respective mode from the dropdown Global Settings.
  • Click Run to execute a step.
  • Each step includes description in ? Help pop-up.
  • Processing messages appear in the log panel at the bottom.
  • The output can be accessed in the respective subdirectory of your main folder.

3.3.3 ilastik

ilastiks_steps

  • For ilastik steps, detailed step-by-step instructions are included.
  • When in training or batch modes, select the respective mode from the dropdown Global Settings.
  • Click Run to execute a Backgrground processing, Processing messages appear in the log panel at the bottom.
  • All other steps have to be performed in ilastik.
  • Each step includes description in ? Help pop-up.

3.3.4 Data analysis

data_analysis

  • Data analysis steps are performed with the results of batch-processing.
  • Click Run to execute a step and ? Help for pop-up window instructions.
  • Use 9. insead of 8. When working with a EUCAST dataset.

4. Training

To facilitate cross-platform reusability of the ilastik models, make sure to store Raw Data, Probabilities and Prediction Maps in Relative Links. This allows for portability of the models to other storage locations.

relative_link

In case absolute file path is selected, right click on the location and select edit properties under storage the path logic can be modified

relative_storage

4.1. Selection of Training Images and Conversion

4.1.1 Selection of Training data

  • select a set of images that represant the different experimental conditions best
  • store them in 0_1_original_tif_training_images

4.1.2 Conversion

  1. In the caactus GUI, select training from the dropdown menu in Global Settings
  2. Find 2. Tif to H5.
  1. Click Run.

4.2. Pixel Classification

  1. When first training a pixel classification model in ilastik, open ilastik.

  2. Create a new project and select Pixel Classification as the workflow.

  3. Save it as 1_pixel_classification.ilp inside the main project directory.

  4. Under Raw Data, add the *.h5 files from 1_images folder.

  5. Feature selection. Select the features you want to use for training. It is recommended to use all features.

  6. For working with neighbouring / touching cells, it is suggested to create three classes: 0 = interior, 1 = background, 2 = boundary (This follows python's 0-indexing logic where counting is started at 0).

pixel_classes

  1. Annotate the classes by drawing on the images.

  2. Export the Predictions. In prediction export change the settings to

  • Convert to Data Type: integer 8-bit
  • Renormalize from 0.00 1.00 to 0 255
  • File:
    {dataset_dir}/../2_probabilties/{nickname}_{result_type}.h5
    

export_prob

  1. Click OK.

  2. Click Export All.

  3. The output will be saved as *_Probabilities.h5 files in the 2_probabilities folder.

4.3 Boundary-based Segmentation with Multicut

  1. When first training a boundary-based Segmentation model in ilastik, open ilastik.

  2. Create a new project and select Boundary-based Segmentation with Multicut as the workflow.

  3. Save it as 2_boundary_segmentation.ilp inside the main project directory.

  4. Under Raw Data, add the .h5 files from 1_images folder.

  5. Under Probabilities, add the data_Probabilities.h5 files from 2_probabilites folder.

  6. in DT Watershed, use the input channel the corresponds to the order you used under project setup (in this case input channel = 2).

watershed

  1. Annotate the edges by clicking on the edges between cells. Annotate the background by clicking on the background.

  2. Export the Multicut Segmentation. In prediction export change the settings to

  • Convert to Data Type: integer 8-bit
  • Renormalize from 0.00 1.00 to 0 255
  • Format: compressed hdf5
  • File:
    {dataset_dir}/../3_multicut/{nickname}_{result_type}.h5
    

export_multicut

  1. Click OK.

  2. Click Export All.

  3. The output will be saved as *_Multicut Segmentation.h5 files in the 3_multicut folder.

4.4 Background Processing

For further processing in object classification, the background must be removed from the multicut data sets. This script sets the numerical value of the largest region to 0, making it transparent in the next step. The operation runs in-place on all *_Multicut Segmentation.h5 files in 3_multicut/.

  1. In the caactus GUI, find 5. Background Processing.
  2. Make sure trainingis still selected in Mode under Global Settings.
  3. Click Run.

4.5. Object Classification

  1. When first training a Object classification model in ilastik, open ilastik.

  2. Create a new project and select Object Classification [Inputs: Raw, Data, Pixel Prediction Map] as the workflow.

  3. Save it as 3_object_classification.ilp inside the main project directory.

  4. Under Raw Data, add the .h5 files from 1_images folder.

  5. Under Segmentation Image, add the *_Multicut Segmentation.h5 files from 3_multicut folder.

  6. Define your cell types plus an additional category for not usable objects, e.g. cell debris and cut-off objects on the side of the images.

[!NOTE] Default class names in caactus are resting, swollen, germling, hyphae, notusable (and mycelium for the EUCAST workflow). You are welcome to change them — just make sure to also update the names in the caactus GUI when performing the analysis steps below.

  1. Annotate the edges by clicking on the edges between cells. Annotate the background by clicking on the background.

  2. Export the Object_Predictions. Under 4. Object Information Export:

    • From the dropdown select Object Predictions

object_pred

In Choose Export Image Settings change settings to

  • Convert to Data Type: integer 8-bit
  • Renormalize from 0.00 1.00 to 0 255
  • Format: compressed hdf5
  • File:
    {dataset_dir}/../4_objectclassification/{nickname}_{result_type}.h5
    

export_multicut

  1. Export the Object data_table.csv-files In Configure Feature Table Export General change seetings to
  • format .csv and output directory File:
    {dataset_dir}/../4_objectclassification/{nickname}.csv
    
  • select your features of interest for exporting export_prob
  1. Click OK.

  2. Click Export All.

  3. The output will be saved as *_Object Predictions.h5 files and *_table.csv in the 4_objectclassification folder.

5. Batch Processing

  • Once you have successfully trained all three ilastik models, you are ready to process large image datasets with the caactus pipeline.
  1. store the images you want to process in the 0_2_original_tif_batch_images directory
  2. Perform steps 4.1 to 4.5 in batch mode, as explained in detail below (5.1 to 5.5).
  3. Select Mode batch in the dropdown menu in Global settings in the caactus GUI.

5.1 Rename Files

  • Rename the .tif-files so that they contain information about your cells and experimental conditions
  1. Create a csv-file that contains the information you need in columns. Each row corresponds to one image. Follow the same order as your images files are stored in the respective directory (alphabetically).
  • The script will rename your files in the following format columnA-value1_columnB-value2_columnC_etc.tif eg. as seen in the example below picture 1 (well A1 from our plate) will be named

strain-ATCC11559_date-20241707_timepoint-6h_biorep-A_techrep-1.tif

[!CAUTION] Do not use underscores (_) or dashes (-) in column names or values — these characters are used as delimiters in the new file names.

[!IMPORTANT] The only hardcoded column names required are biorep and techrep. They are needed in downstream analysis for calculating averages.

  1. In the caactus GUI under Pre-Processing, find 1. Renaming and click Run. 96_well

[!TIP] After renaming, we recommend deleting the contents of 0_2_original_tif_batch_images to save disk space.

5.2 Conversion

  1. In the caactus GUI, find 2. Tif to H5. Select batch from the dropdown menu.

The script converts .tif files to .h5 format for better performance in ilastik. 2. Click Run.

[!TIP] After converting, we recommend deleting the contents of 0_3_batch_renamed to save disk space.

5.3 Batch Processing Pixel Classification

In the caactus GUI, find 3. Pixel Classification, click ? Help for the full ilastik instructions. Summary:

  1. Open ilastik.

  2. Open your trained pixel classification project (e.g. 1_pixel_classification.ilp).

[!CAUTION] DO NOT CHANGE anything in 1. Input Data, 2. Feature Selection, or 3. Training when running Batch Processing!

  1. Under 4. Prediction Export:

    • Select Probabilities from the dropdown.

    pixel_prob

    • Click Choose Export Image Settings and set the output file path at File:
{dataset_dir}/../6_batch_probabilities/{nickname}_{result_type}.h5

batch_pixel

  1. Click OK

  2. Go to 5. Batch processing tab

  3. Under Raw data, add the .h5 files from 5_batch_images folder.

  4. Now click Process all files.

  5. The output will be saved as *_Probabilities.h5 files in the output folder (6_batch_probabilities).

5.4 Batch Processing Multicut Segmentation

In the caactus GUI, find 4. Boundary Segmentation, click ? Help for the full ilastik instructions. Summary:

  1. Open ilastik.

  2. Open your trained Boundary Segmentation project (e.g. 2_boundary_segmentation.ilp).

[!CAUTION] DO NOT CHANGE anything in 1. Input Data, 2. DT Watershed, or 3. Training and Multicut when running Batch Processing!

[!NOTE] The *_Multicut Segmentation.h5 output files are generated by ilastik in this step — they do not exist beforehand.

  1. Under 4. Data Export,

batch_multicut

  • Click Choose Export Image Settings and set the output path at File:
    {dataset_dir}/../7_batch_multicut/{nickname}_{result_type}.h5
    
  1. Click OK

  2. Go to 5. Batch processing.

  3. Under,Raw data, add the .h5 files from 5_batch_images folder.

  4. Under Probabilities, add the data_Probabilities.h5 files from 6_batch_probabilities folder.

batch_multicut

  1. Go to 5. Batch Processing and click Process all files.

  2. The output will be saved as *_Multicut Segmentation.h5 files in the output folder (7_batch_multicut).

5.5 Background Processing

  1. In the caactus GUI, find 5. Background Processing, click ? Help for the description.
  2. Click Run. This removes the background from all *_Multicut Segmentation.h5 files in 7_batch_multicut/ by setting the largest region value to 0 (transparent in ilastik).

5.6 Batch processing Object classification

In the caactus GUI, find 6. Object Classification, click ? Help for the full ilastik instructions. Summary:

  1. Open ilastik.

  2. Open your trained object classification project (3_object_classification.ilp).

[!CAUTION] DO NOT CHANGE anything in 1. Input Data, 2. Object Feature Selection, or 3. Object Classification when running Batch Processing!

  1. Under 4. Object Information Export:
    • From the dropdown select Object Predictions (default). object_pred
    • Click Choose Export Image Settings and set the output path at File:
{dataset_dir}/../8_batch_objectclassification/{nickname}_{result_type}.h5

object_image

  1. Under "4. Object Information Export", choose "Configure Feature Table Export" with the following settings: feature_table

  2. In Configure Feature Table Export General choose format .csv and change output directory to:

{dataset_dir}/../8_batch_objectclassification/{nickname}.csv

Choose Features to choose the Feature you are interested in exporting

feature_features

  1. Click OK

  2. Go to 5. Batch Processing tab

  3. Under Raw data, add the .h5 files from 5_batch_images folder.

  4. Under Segmentation Image, add the data_Multicut Segmentation.h5 files from 7_batch_multicut folder.

  5. Go to 5. Batch Processing and click Process all files.

  6. The output will be saved as *_Object Predictions.h5 files and *_table.csv in the output folder (8_batch_objectclassification).

6. Post-Processing and Data Analysis

[!NOTE]
Please be aware, the last two scripts, summary_statisitcs.py and pln_modelling.py at this stage are written for the analysis and visualization of two independent variables. The take the result of the batch-processing steps as input.

  1. In the caactus GUI, set Variable Names, Class Order, Color Mapping and Pixel Size (µm) in Global Settings.

6.1 Merging Data Tables and Table Export

The next script will combine all tables from all images into one global table for further analysis. Additionally, the information stored in the file name will be added as columns to the dataset.

  1. Find 7. CSV Summary and click Run.
  2. The output generated will be df_clean.csv in 9_data_analysis
  3. This spreadsheet now has all feature tables that are the output of 5.6 Object classification united in one spreadsheet.

[!TIP]
Technically from this point on, you can continue to use whatever software / workflow that is easiest for you for subsequent data analysis (e.g. GraphPadPrism, EXCEL etc.)

6.2 Creating Summary Statistics

  • This script processes EUCAST data and generates summary statistics and a stacked bar plot of predicted classes cell categories.
  • If working with EUCAST antifungal susceptibility testing, use the 9. EUCAST Summary Statistics
  • For the stacked bar plot, it groups data by the two variables that you enter.
  • It computes the average count and percentage of each predicted class, across replicates (technical and biological), for each combination of the two grouping variables.
  • It visualizes the distribution in stacked bar plots of classes across different conditions.
  • The first variable you enter will be displayed on the x-axis (e.g. incubation temperature), and the second variable will be used for faceting (e.g. timepoint).
  • This will create separate subplots for each level of that variable.
  • The plot will show the percentage distribution of predicted classes for each condition, allowing you to compare how the classes are distributed across different experimental conditions defined by the two grouping variables.
  • The colors of the bars will correspond to the predicted classes, as defined in your color mapping.
  • By default the IBM coloor-blind friendly palette is used, but you can customize the colors by providing the HEX color code.
  1. Find 8. Summary Statistics and click Run.
  2. Output:
    • a) df_summary_complete.csv — full table including "not usable" category
    • b) df_refined_complete.csv — table without "not usable" category
    • c) counts.csv — count data used for PLN modelling
    • d) barchart.png — stacked bar chart

6.3 PLN Modelling

  • This script runs ZIPln modelling on input data with dynamic design and generates PCA visualizations and a correlation circle plot.

  • The two grouping variables you enter will be used in the model formula and for visualizing the PCA results.

  • The will be combined into a single factor for the model, and the PCA plot will show the latent variable projections colored by this combined category.

  • The correlation circle plot will show how the original variables relate to the latent dimensions, helping you interpret the PCA results in terms of the original grouping variables.

[!WARNING] The limit of categories displayed in the PCA plot is n=15.

  1. In the caactus GUI, make sure Variable Names and Class Order are set correctly in Global Settings.
  2. Find step 10 – PLN Modelling and click Run.
  3. Output:
    • a) correlation_circle.png
    • b) pca_plot.png

[!NOTE] Variable Names and Class Order are shared with Summary Statistics — set them once in Global Settings.

7. Running caactus from the command line

7.1 Setup config.toml-file

  • copy config/config.toml to your working directory and modify it as needed.
  • the caactus scripts are setup for pulling the information needed for running from the file
    • CAVE: for Windows users make sure to change the backlash fro /path/to/config.toml to \path\to\config.toml, when copying the path to your working directory
  • open the command line (for Windows: Anaconda Powershell) and save the path to your project file to a variable
    • whole command UNIX:
    p = "\path\to\config.toml" 
    
  • whole command Windows:
    $p = "\path\to\config.toml"
    
  • enter "-c" and enter path to config.toml
  • enter "-m" and choose "training" or "batch" to switch between modes

7.2 Conversion

  • call the tif2h5py script from the cmd prompt to transform all .tif-files to .h5-format.
  • whole command UNIX:
    tif2h5py -c "$p" -m training
    
  • whole command Windows:
    tif2h5py.exe -c $p -m training
    
  • For batch processing enter "-m batch" for batch mode.
    • whole command UNIX:
    tif2h5py -c "$p" -m batch
    - whole command Windows:
    
    ```bash
    tif2h5py.exe -c $p -m batch
    

7.3 Background Processing

  • call the background-processing script from the cmd prompt
    • whole command UNIX:
    background_processing -c "$p" -m training
    - whole command Windows:
    ```bash
    background_processing.exe -c $p -m training
    
  • For batch processing enter "-m batch" for batch mode
    • whole command Unix:
    background_processing -c "$p" -m batch
    - whole command Windows:
    ```bash
    background_processing.exe -c $p -m batch
    

7.4 Rename Files

  • Call the rename script from the cmd prompt to rename all your original .tif-files to their new name.
    • whole command Unix:
    renaming -c "$p"
    - whole command Windows:
    ```bash
    renaming.exe -c $p
    

7.5 CSV-Summary

  • call the csv_summary.py script from the cmd prompt
    • whole command Unix:
    csv_summary -c "$p"
    - whole command Windows
    ```bash
    csv_summary.exe -c $p
    

7.6 Creating Summary Statistics

  • call the summary_statistics.py script from the cmd prompt
    • whole command Unix:
    summary_statistics -c "$p"
    - whole command Windows:
    ```bash
    summary_statistics.exe -c $p
    
  • if working with EUCAST antifungal susceptibility testing, call
    • whole command Unix:
    summary_statistics_eucast -c "$p" 
    - whole command Windows:
    ```bash 
    summary_statistics_eucast -c $p
    

7.7 PLN Modelling

  • call the pln_modelling.py script from the cmd prompt`
  • whole command Unix:
    pln_modelling -c "$p"
    
  • whole command Windows:
    pln_modelling.exe -c $p
    

8. Tutorial

8.1 Download Sample Data

  1. Go to zenodo to download the sample data.
  2. Unpack the .zip-file into your project folder.
  3. The path to where you unpacked the sample data will be your main folder (e.g. /home/usr/Documents/sampledata_CD6_zenodo).
  4. To showcase the functionalties, the ilastik steps have been pretrained. Use caactus in batch-mode for the following steps. From the dropdown menu in Global settings in the GUI, select batch

[!NOTE] Some subdirectories are intentionally left empty. The tutorial is designed to show how the batch mode works with pretrained models. 0_1_original_tif_training_images stays empty; the other empty subdirectories will be filled as you follow the steps below.

  1. make sure you have caactus installed (see Installation above)
  2. make sure you have the caactus environmnet activated
conda activate caactus-env
  1. now simply type caactus and hit enterto start the graphical user interace
caactus
  1. We recommend working with two screens. This allows to follow the instructions implemented in the caactus GUI while performing the steps in ilastik and quickly switiching back to the caactus steps for fast completion of the pipeline.

8.2 Global Settings

  1. On the top, enter the path to your mainfolder.

main_folder

  1. Change the default values ['strain', 'timepoint'] to
['condition1','condition2']

change_variable

  1. Set Mode to batch.

mode_batch

8.3 Pre-Processing - Renaming

  1. In your main folder (sampledata_CD6_zenodo), inspect the renaming.csv spreadsheet to see how it is constructed.
  2. In the GUI, go to `Pre-Processing 1. Renaming and click Run.
  3. If you click on the dropdown menu Advanced paths, a menu will open that will allow you to change the input and output folders, as well as the name of the renaming file.

renaming

8.4 Pre-Processing - Tif to h5

  1. In the GUI, go to `Pre-Processing 2. Tif to h5 and click Run.
  2. If you click on the dropdown menu Advanced paths, a menu will open that will allow you to change the input and output folders.

tif2h5

8.4 Batch Pixel Classification

In the caactus GUI, find 3. Pixel Classification, click ? Help for the full instructions. Summary:

  1. Open ilastik.

  2. Open the pre-trained pixel classification project from the sample data (1_pixel_classification.ilp).

[!CAUTION] DO NOT CHANGE anything in 1. Input Data, 2. Feature Selection, or 3. Training when running Batch Processing!

  1. Under 4. Prediction Export:
    • Select Probabilities from the dropdown.

      pixel_prob

    • Click Choose Export Image Settings and set the output path at File:

{dataset_dir}/../6_batch_probabilities/{nickname}_{result_type}.h5

batch_pixel

  1. Click OK

  2. Go to 5. Batch processing tab

  3. Under Raw data, add the .h5 files from 5_batch_images folder.

  4. Now click Process all files.

  5. The output will be saved as _Probabilities.h5 files in the output folder.

8.5 Batch Processing Multicut Segmentation

In the caactus GUI, find 4. Boundary Segmentation, click ? Help for the full instructions. Summary:

  1. In ilastik, open the pre-trained Boundary Segmentation project (2_boundary_segmentation.ilp).

[!CAUTION] DO NOT CHANGE anything in 1. Input Data, 2. DT Watershed, or 3. Training and Multicut when running Batch Processing!

[!NOTE] The *_Multicut Segmentation.h5 files are generated here — they do not exist beforehand.

  1. Under 4. Data Export,

batch_multicut

  • click Choose Export Image Settings and set the output path at File:
    {dataset_dir}/../7_batch_multicut/{nickname}_{result_type}.h5
    
  1. Click OK

  2. Go to 5. Batch processing.

  3. Under,Raw data, add the .h5 files from 5_batch_images folder.

  4. Under Probabilities, add the data_Probabilities.h5 files from 6_batch_probabilities folder.

batch_multicut

  1. Go to 5. Batch Processing and click Process all files.

  2. The output will be saved as *_Multicut Segmentation.h5 files in the output folder (7_batch_multicut).

  3. Close the 2_boundary_segmentation.ilpproject-file in ilastik.

8.6 Batch Background Processing

  1. Switch back to the caactus GUI.
  2. Find 5. Background Processing.
  3. Click Run. The background is now removed and you can continue with object classification in ilastik.

batch_background

8.7 Batch Object Classification

In the caactus GUI, find 6. Object Classification, and click ? Help for the full instructions. Summary:

  1. Switch back to ilastik.

  2. Open your trained object classification project (3_object_classification.ilp).

[!CAUTION] DO NOT CHANGE anything in 1. Input Data, 2. Object Feature Selection, or 3. Object Classification when running Batch Processing!

  1. Under 4. Object Information Export:
    • Select Object Predictions from the dropdown.

object_pred

  • Click Choose Export Image Settings and set the output path at File:
{dataset_dir}/../8_batch_objectclassification/{nickname}_{result_type}.h5

object_image

  1. Under "4. Object Information Export", choose "Configure Feature Table Export" with the following settings: feature_table

  2. In Configure Feature Table Export General choose format .csv and change output directory to:

{dataset_dir}/../8_batch_objectclassification/{nickname}.csv

Choose Features to choose the Feature you are interested in exporting.

feature_features

  1. Click OK

  2. Go to 5. Batch Processing tab

  3. Under Raw data, add the .h5 files from 5_batch_images folder.

  4. Under Segmentation Image, add the data_Multicut Segmentation.h5 files from 7_batch_multicut folder.

  5. Go to 5. Batch Processing and click Process all files.

  6. The output will be saved as *_Object Predictions.h5 files and *_table.csv in the output folder (8_batch_objectclassification).

  7. Now you have performed all steps in ilastik. You can close ilastik.

8.8 CSV Summary

  1. Switch back to the caactus GUI. Scroll down to Data Analysis

data_analysis

  1. The default Pixel Size is already set in Global Settings — you can leave it as-is for the sample data.
  2. Find 7. CSV Summary and click Run.
  3. Inspect the generated df_clean.csv. This spreadsheet combines all feature tables from Object Classification into one file for downstream analysis.

8.9 Summary Statistics

  1. Find 8. Summary Statistics and click Run.
  2. Inspect the generated results. The output generated will be
    • a) df_summary_complete.csv = .csv-table containing also not usable category,
    • b) df_refined_complete.csv = .csv-table without not usable category",
    • c) counts.csv dataframe used in PlnModelling
    • d) bar graph (barchart.png) (faceted by condition1 on x-axis, percent of morphotypes "Predicted Class" on the y-axis and condition2 as the facetting variable in rows.) You can play around by putting 'condition2' first and 'condition1' second to see how it changes the plot.
  3. You may also change the colors: change the default in Global Settings
{'resting': '#FE6100', 'swollen': '#648FFF', 'germling': '#785EF0', 'hyphae': '#DC267F'}

to

{'resting': 'yellow', 'swollen': 'cyan', 'germling': 'blue', 'hyphae': 'magenta'}
  1. Again, find 8. Summary Statistics and click Run. The colors now should be changed.

  2. Similarly, you my change the morphotype names. Open df_clean.csv in a speadsheet software (e.g. Excel). Replace all restingwith dormant (use Ctrl+F - Replace all, save df_clean.csv). Now re-do step 7.9 Summary Statistics.

Before you click Run, make sure you replace resting with dormantin both Class order

['resting', 'swollen', 'germling', 'hyphae']

and Color Mapping fields.

{'dormant': 'yellow', 'swollen': 'cyan', 'germling': 'blue', 'hyphae': 'magenta'}
  1. Again, find 8. Summary Statistics and click Run. The names now should be changed.

  2. Let's imagine you only have 3 cell categories in your dataset. Again, open df_clean.csv in a speadsheet software (e.g. Excel). Replace all dormantwith spores (use Ctrl+F - Replace all, save df_clean.csv).

Similarly, replace all swollenwith spores (use Ctrl+F - Replace all, save df_clean.csv). Now change the Class Orderfield to

['spores', 'germling', 'hyphae']

and the Color Mappingfield to

{'spores': 'yellow', 'germling': 'blue', 'hyphae': 'magenta'}
  1. Again, find 8. Summary Statistics and click Run. The names now should be changed.

8.10 PLN Modelling

  1. Find 10. PLN Modelling and click Run.
  2. Inspect the generated results in the subdirecory /sampledata_CD6_zenodo/9_data_analysis/ The output generated will be
    • a) correlation_circle.png. Shows that PCA1, accounting for ~57% of the variance, primarily separated samples by condition2, whereas PCA2 accounted for ~25% of the variance based on condition1.
    • b) pca_plot.png. The PCA plot shows how the images are grouped together in 2D-space based on combined category of condition1 and condition2 (the categorical levels will be combined).

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

caactus-0.3.0.tar.gz (2.4 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

caactus-0.3.0-py3-none-any.whl (2.4 MB view details)

Uploaded Python 3

File details

Details for the file caactus-0.3.0.tar.gz.

File metadata

  • Download URL: caactus-0.3.0.tar.gz
  • Upload date:
  • Size: 2.4 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for caactus-0.3.0.tar.gz
Algorithm Hash digest
SHA256 acb92a87fc0c2efaadb8d81876cc6edc12342ec76a2d196d49444e9a99027e53
MD5 1afced8c6ab0768e41ea3eab50fab350
BLAKE2b-256 3b037d42b9f0d1ec44110059709e146ac7b13aa4d78a3dcd8d3d11d3622c0275

See more details on using hashes here.

File details

Details for the file caactus-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: caactus-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 2.4 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for caactus-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 35537559a902302a09f55f61daaabbc850d1504bb875df2c7a22dfb35e90c4ac
MD5 71a1ad4c6691befbb18c46f58367ac62
BLAKE2b-256 f58bf904fdd4c4cefdaadf6fe3c6d4a8c6b099f2e16c54bbc0138c48b8ba39a2

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page