Package for pre- and post-processing of images and data for working with ilastik-software
caactus
caactus (cell analysis and counting tool using ilastik software) is a collection of python scripts to provide a streamlined workflow for ilastik-software, including data preparation, processing and analysis. It aims to provide biologists with an easy-to-use tool for counting and analyzing cells from a large number of microscopy pictures.
Introduction
The goal of this script collection is to provide an easy-to-use companion to the Boundary-based Segmentation with Multicut workflow in ilastik.
This workflow automates cell counting from messy microscopic images with different (touching) cell types for biological research.
Commands are provided in grey code boxes that can be copied with one click.
Installation
Install miniconda, create an environment and install Python and vigra
- Download and install miniconda for your respective operating system according to the instructions.
- Miniconda provides a lightweight package and environment manager. It allows you to create isolated environments so that Python versions and package dependencies required by caactus do not interfere with your system Python or other projects.
- Once installed, create an environment for using caactus with the following command from your command line:
conda create -n caactus-env -c pytorch -c conda-forge python=3.12 pytorch vigra h5py
Install caactus
- Activate the caactus-env from the command line with
conda activate caactus-env
- To install caactus plus the needed dependencies inside your environment, use
pip install caactus
- During the steps described below that call the caactus scripts, make sure the caactus-env is activated.
Install ilastik
- Download and install ilastik for your respective operating system.
[!NOTE] We developed the pipeline on ilastik 1.4.0. For optimal user experience, we recommend installing ilastik 1.4.0. Scroll down to "Previous stable versions" on the ilastik download webpage.
Quick Overview of the workflow
Below is a short version of the steps performed. For more detail, please consult Detailed Description of the Workflow.
- Culture the organism of interest in a 96-well plate.
- Acquire images of cells via microscopy.
- Create the project directory.
- Rename files with caactus.
- Convert files to HDF5 format with caactus.
- Train a pixel classification model in ilastik and later run it in batch mode.
- Train a boundary-based segmentation with Multicut model in ilastik and later run it in batch mode.
- Remove the background from the *_Multicut Segmentation.h5 files with caactus.
- Train an object classification model in ilastik and later run it in batch mode.
- Pool all csv-tables from the individual images into one global table with caactus.
  - output generated: df_clean.csv
- Summarize the data with caactus.
  - output generated:
    - a) df_summary_complete.csv = .csv-table that also contains the not usable category
    - b) df_refined_complete.csv = .csv-table without the not usable category
    - c) counts.csv = dataframe used in PLN modelling
    - d) stacked bar graph (barchart.png)
- Model the count data with caactus.
  - output generated:
    - a) correlation_circle.png
    - b) pca_plot.png
[!NOTE] Power users may directly edit the config.toml and run the scripts from the cmd-line. For instructions, go to 7.1-7.7.
Sample Dataset
- A sample dataset to quickly test the workflow can be accessed via zenodo.
- To showcase the functionalities, the ilastik steps have been pretrained. Use caactus in batch mode.
[!IMPORTANT] Go to 8.1-8.11 for a detailed tutorial with the sample data set.
Detailed Description of the Workflow
1. Culturing
- Culture your cells in a flat bottom plate of your choice and according to the needs of the organisms being researched.
2. Image acquisition
- In your microscopy software, export the images of interest to .tif-format.
- For the workflow and file conversion steps, caactus currently supports grayscale (1-channel) and RGB (3-channel) images in .tif-format.
- From the image metadata, copy the pixel size.
[!NOTE] We recommend exporting the images without scale bars, because they distract the classifier during annotation.
3. Data Preparation
3.1 Create Project Directory
- For portability of the ilastik projects, create the directory in the structure shown below.
- This allows you to copy an already trained workflow and reuse it with new datasets, when relative paths are enabled.
[!NOTE] The directory structure below already includes examples of resulting files in each sub-directory.
project_directory = Main folder
├── 1_pixel_classification.ilp
├── 2_boundary_segmentation.ilp
├── 3_object_classification.ilp
├── renaming.csv
├── config.toml
├── 0_1_original_tif_training_images
│   ├── training-1.tif
│   ├── training-2.tif
│   ├── ...
├── 0_2_original_tif_batch_images
│   ├── image-1.tif
│   ├── image-2.tif
│   ├── ...
├── 0_3_batch_tif_renamed
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1.tif
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2.tif
│   ├── ...
├── 1_images
│   ├── training-1.h5
│   ├── training-2.h5
│   ├── ...
├── 2_probabilities
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_Probabilities.h5
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_Probabilities.h5
│   ├── ...
├── 3_multicut
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_Multicut Segmentation.h5
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_Multicut Segmentation.h5
│   ├── ...
├── 4_objectclassification
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_Object Predictions.h5
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_table.csv
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_Object Predictions.h5
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_table.csv
│   ├── ...
├── 5_batch_images
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1.h5
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2.h5
│   ├── ...
├── 6_batch_probabilities
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_Probabilities.h5
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_Probabilities.h5
│   ├── ...
├── 7_batch_multicut
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_Multicut Segmentation.h5
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_Multicut Segmentation.h5
│   ├── ...
├── 8_batch_objectclassification
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_Object Predictions.h5
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_table.csv
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_Object Predictions.h5
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_table.csv
│   ├── ...
├── 9_data_analysis
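If you prefer not to create the sub-directories by hand, the empty skeleton can be generated with a few lines of Python. This is a minimal sketch and not part of caactus: the function name create_project_skeleton is hypothetical, and only the folder names are taken from the structure above.

```python
from pathlib import Path

# Sub-directory names copied from the structure above.
SUBDIRECTORIES = [
    "0_1_original_tif_training_images",
    "0_2_original_tif_batch_images",
    "0_3_batch_tif_renamed",
    "1_images",
    "2_probabilities",
    "3_multicut",
    "4_objectclassification",
    "5_batch_images",
    "6_batch_probabilities",
    "7_batch_multicut",
    "8_batch_objectclassification",
    "9_data_analysis",
]


def create_project_skeleton(main_folder):
    """Create the empty sub-directories inside main_folder and return their paths."""
    root = Path(main_folder)
    created = []
    for name in SUBDIRECTORIES:
        path = root / name
        # exist_ok=True makes the function safe to re-run on an existing project.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling create_project_skeleton("project_directory") creates the twelve folders; the .ilp projects, renaming.csv and config.toml still have to be added to the main folder separately.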
3.2 Getting started
- Open the caactus Graphical User Interface (GUI) by opening the command line in Unix or Anaconda Powershell/Prompt in Windows.
- Make sure you have the caactus environment activated:
conda activate caactus-env
- Type caactus and hit Enter to start the GUI:
caactus
3.3 The Graphical User Interface (GUI)
The graphical user interface is structured in four parts.
3.3.1 Global Settings
- At the top, enter the path to your Main Folder (use the Browse button or type/paste the full path).
- Select Mode between training and batch.
  - training refers to all steps during annotation of the ilastik classifiers.
  - batch refers to all steps performed on large datasets with fully trained ilastik models used in batch mode, and to the subsequent data analysis.
- Set shared analysis parameters once in Global Settings: Pixel Size, Variable Names, Class Order, Color Mapping. EUCAST-specific settings can be expanded below.
- When working with a EUCAST dataset, edit EUCAST settings from the dropdown menu.
3.3.2 Pre-Processing
- The workflow is shown as a numbered list of steps. Depending on whether you are training or batch processing, select the respective mode from the dropdown in Global Settings.
- Click Run to execute a step.
- Each step includes a description in its ? Help pop-up.
- Processing messages appear in the log panel at the bottom.
- The output can be accessed in the respective subdirectory of your main folder.
3.3.3 ilastik
- For ilastik steps, detailed step-by-step instructions are included.
- Depending on whether you are training or batch processing, select the respective mode from the dropdown in Global Settings.
- Click Run to execute Background Processing. Processing messages appear in the log panel at the bottom.
- All other steps have to be performed in ilastik.
- Each step includes a description in its ? Help pop-up.
3.3.4 Data analysis
- Data analysis steps are performed with the results of batch-processing.
- Click Run to execute a step and ? Help for pop-up window instructions.
- Use step 9 (EUCAST Summary Statistics) instead of step 8 (Summary Statistics) when working with a EUCAST dataset.
4. Training
To facilitate cross-platform reusability of the ilastik models, make sure to store Raw Data, Probabilities and Prediction Maps as relative links. This makes the models portable to other storage locations.
If an absolute file path is selected, right-click on the location and select Edit properties; under Storage, the path logic can be modified.
4.1. Selection of Training Images and Conversion
4.1.1 Selection of Training data
- Select a set of images that best represents the different experimental conditions.
- Store them in 0_1_original_tif_training_images.
4.1.2 Conversion
- In the caactus GUI, select training from the dropdown menu in Global Settings.
- Find 2. Tif to H5.
  - The script converts .tif files to .h5 format for better performance in ilastik.
- Click Run.
4.2. Pixel Classification
- When first training a pixel classification model, open ilastik.
- Create a new project and select Pixel Classification as the workflow.
- Save it as 1_pixel_classification.ilp inside the main project directory.
- Under Raw Data, add the *.h5 files from the 1_images folder.
- Feature selection: select the features you want to use for training. It is recommended to use all features.
- For working with neighbouring / touching cells, it is suggested to create three classes: 0 = interior, 1 = background, 2 = boundary (this follows Python's 0-indexing, where counting starts at 0).
- Annotate the classes by drawing on the images.
- Export the predictions. In prediction export, change the settings to
  - Convert to Data Type: integer 8-bit
  - Renormalize from 0.00 1.00 to 0 255
  - File: {dataset_dir}/../2_probabilities/{nickname}_{result_type}.h5
- Click OK.
- Click Export All.
- The output will be saved as *_Probabilities.h5 files in the 2_probabilities folder.
- For more information, consult the documentation for pixel classification with ilastik.
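To see where an export path such as {dataset_dir}/../2_probabilities/{nickname}_{result_type}.h5 ends up on disk, the placeholder expansion can be imitated in Python. This is only an illustrative sketch: the function expand_export_path is hypothetical, and the assumption that the nickname is the file stem followed by "-data" is inferred from the example file names in the directory structure, not from ilastik's source.

```python
import posixpath


def expand_export_path(template, input_file, dataset="data", result_type="Probabilities"):
    """Expand an ilastik-style export template for one input .h5 file (illustrative only)."""
    dataset_dir = posixpath.dirname(input_file)
    stem = posixpath.splitext(posixpath.basename(input_file))[0]
    # Assumption: the nickname is "<file stem>-<internal dataset name>",
    # as suggested by names such as "...-data_Probabilities.h5" in the directory listing.
    nickname = f"{stem}-{dataset}"
    filled = template.format(
        dataset_dir=dataset_dir, nickname=nickname, result_type=result_type
    )
    # normpath collapses the "1_images/.." segment into the project root.
    return posixpath.normpath(filled)


template = "{dataset_dir}/../2_probabilities/{nickname}_{result_type}.h5"
print(expand_export_path(template, "/project/1_images/training-1.h5"))
# /project/2_probabilities/training-1-data_Probabilities.h5
```

Because the template starts from the folder of the input file and steps one level up, the output lands in a sibling folder of 1_images regardless of where the project directory is stored.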
4.3 Boundary-based Segmentation with Multicut
- When first training a boundary-based segmentation model, open ilastik.
- Create a new project and select Boundary-based Segmentation with Multicut as the workflow.
- Save it as 2_boundary_segmentation.ilp inside the main project directory.
- Under Raw Data, add the .h5 files from the 1_images folder.
- Under Probabilities, add the data_Probabilities.h5 files from the 2_probabilities folder.
- In DT Watershed, use the input channel that corresponds to the order you used during project setup (in this case input channel = 2).
- Annotate the edges by clicking on the edges between cells. Annotate the background by clicking on the background.
- Export the Multicut Segmentation. In prediction export, change the settings to
  - Convert to Data Type: integer 8-bit
  - Renormalize from 0.00 1.00 to 0 255
  - Format: compressed hdf5
  - File: {dataset_dir}/../3_multicut/{nickname}_{result_type}.h5
- Click OK.
- Click Export All.
- The output will be saved as *_Multicut Segmentation.h5 files in the 3_multicut folder.
- For more information, follow the documentation for boundary-based segmentation with Multicut.
4.4 Background Processing
For further processing in object classification, the background must be removed from the multicut data sets. This script sets the numerical value of the largest region to 0, making it transparent in the next step. The operation runs in-place on all *_Multicut Segmentation.h5 files in 3_multicut/.
- In the caactus GUI, find 5. Background Processing.
- Make sure training is still selected in Mode under Global Settings.
- Click Run.
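The idea behind the background removal can be illustrated on a tiny label image. The sketch below is not the caactus implementation: it uses a nested list instead of an HDF5 dataset, the function name remove_background is hypothetical, and it assumes that the "largest region" is the label covering the most pixels.

```python
from collections import Counter


def remove_background(labels):
    """Return a copy of a 2D label image with the most frequent label set to 0."""
    counts = Counter(value for row in labels for value in row)
    if not counts:
        return []
    # Assumption: the "largest region" is the label covering the most pixels.
    background_label, _ = counts.most_common(1)[0]
    return [[0 if value == background_label else value for value in row] for row in labels]


segmentation = [
    [3, 3, 3, 3],
    [3, 1, 1, 3],
    [3, 3, 2, 3],
]
print(remove_background(segmentation))
# [[0, 0, 0, 0], [0, 1, 1, 0], [0, 0, 2, 0]]
```

Label 3 covers nine of the twelve pixels and is therefore treated as background, while the two cells (labels 1 and 2) keep their values.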
4.5. Object Classification
- When first training an object classification model, open ilastik.
- Create a new project and select Object Classification [Inputs: Raw Data, Pixel Prediction Map] as the workflow.
- Save it as 3_object_classification.ilp inside the main project directory.
- Under Raw Data, add the .h5 files from the 1_images folder.
- Under Segmentation Image, add the *_Multicut Segmentation.h5 files from the 3_multicut folder.
- Define your cell types plus an additional category for not usable objects, e.g. cell debris and cut-off objects on the side of the images.
[!NOTE] Default class names in caactus are resting, swollen, germling, hyphae, notusable (and mycelium for the EUCAST workflow). You are welcome to change them. Just make sure to also update the names in the caactus GUI when performing the analysis steps below.
- Annotate the objects by clicking on them and assigning each one to one of the classes defined above.
- Export the Object Predictions. Under 4. Object Information Export:
  - From the dropdown select Object Predictions.
  - In Choose Export Image Settings, change the settings to
    - Convert to Data Type: integer 8-bit
    - Renormalize from 0.00 1.00 to 0 255
    - Format: compressed hdf5
    - File: {dataset_dir}/../4_objectclassification/{nickname}_{result_type}.h5
- Export the object data_table.csv files. In Configure Feature Table Export, under General, change the settings to
  - Format .csv and output directory File: {dataset_dir}/../4_objectclassification/{nickname}.csv
  - Select your features of interest for exporting.
- Click OK.
- Click Export All.
- The output will be saved as *_Object Predictions.h5 files and *_table.csv files in the 4_objectclassification folder.
- For more information, follow the documentation for object classification.
5. Batch Processing
- Once you have successfully trained all three ilastik models, you are ready to process large image datasets with the caactus pipeline.
- Store the images you want to process in the 0_2_original_tif_batch_images directory.
- Perform steps 4.1 to 4.5 in batch mode, as explained in detail below (5.1 to 5.6).
- Select Mode batch in the dropdown menu in Global Settings in the caactus GUI.
- For more information, follow the documentation for batch processing.
5.1 Rename Files
- Rename the .tif files so that they contain information about your cells and experimental conditions.
- Create a csv-file that contains the information you need in columns. Each row corresponds to one image. Follow the same order in which your image files are stored in the respective directory (alphabetically).
- The script will rename your files in the following format: columnA-value1_columnB-value2_columnC_etc.tif. For example, picture 1 (well A1 from our plate) will be named
strain-ATCC11559_date-20241707_timepoint-6h_biorep-A_techrep-1.tif
[!CAUTION] Do not use underscores (_) or dashes (-) in column names or values; these characters are used as delimiters in the new file names.
[!IMPORTANT] The only hardcoded column names required are biorep and techrep. They are needed in downstream analysis for calculating averages.
- In the caactus GUI under Pre-Processing, find 1. Renaming and click Run.
[!TIP] After renaming, we recommend deleting the contents of 0_2_original_tif_batch_images to save disk space.
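The naming scheme described above can be sketched in a few lines of Python, which also shows why underscores and dashes must not appear in column names or values. This is an illustration under stated assumptions, not the caactus renaming script: build_name is a hypothetical helper and the small table is made up to match the example file name.

```python
import csv
import io

FORBIDDEN = ("_", "-")  # delimiters used in the new file names


def build_name(row, extension=".tif"):
    """Join the column/value pairs of one CSV row into a new file name."""
    parts = []
    for column, value in row.items():
        for text in (column, value):
            if any(char in text for char in FORBIDDEN):
                raise ValueError(f"'{text}' contains an underscore or dash")
        parts.append(f"{column}-{value}")
    return "_".join(parts) + extension


# Hypothetical renaming table with one row per image.
table = io.StringIO(
    "strain,date,timepoint,biorep,techrep\n"
    "ATCC11559,20241707,6h,A,1\n"
)
for row in csv.DictReader(table):
    print(build_name(row))
# strain-ATCC11559_date-20241707_timepoint-6h_biorep-A_techrep-1.tif
```

A value such as ATCC-11559 would make the file name ambiguous when it is later split back into columns, so the sketch rejects it.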
5.2 Conversion
- In the caactus GUI, find 2. Tif to H5. Select batch from the dropdown menu.
  - The script converts .tif files to .h5 format for better performance in ilastik.
- Click Run.
[!TIP] After converting, we recommend deleting the contents of 0_3_batch_tif_renamed to save disk space.
5.3 Batch Processing Pixel Classification
In the caactus GUI, find 3. Pixel Classification, click ? Help for the full ilastik instructions. Summary:
- Open ilastik.
- Open your trained pixel classification project (e.g. 1_pixel_classification.ilp).
[!CAUTION] DO NOT CHANGE anything in 1. Input Data, 2. Feature Selection, or 3. Training when running Batch Processing!
- Under 4. Prediction Export:
  - Select Probabilities from the dropdown.
  - Click Choose Export Image Settings and set the output file path at File:
  {dataset_dir}/../6_batch_probabilities/{nickname}_{result_type}.h5
- Click OK.
- Go to the 5. Batch Processing tab.
- Under Raw data, add the .h5 files from the 5_batch_images folder.
- Now click Process all files.
- The output will be saved as *_Probabilities.h5 files in the output folder (6_batch_probabilities).
5.4 Batch Processing Multicut Segmentation
In the caactus GUI, find 4. Boundary Segmentation, click ? Help for the full ilastik instructions. Summary:
- Open ilastik.
- Open your trained Boundary Segmentation project (e.g. 2_boundary_segmentation.ilp).
[!CAUTION] DO NOT CHANGE anything in 1. Input Data, 2. DT Watershed, or 3. Training and Multicut when running Batch Processing!
[!NOTE] The *_Multicut Segmentation.h5 output files are generated by ilastik in this step; they do not exist beforehand.
- Under 4. Data Export:
  - Click Choose Export Image Settings and set the output path at File:
  {dataset_dir}/../7_batch_multicut/{nickname}_{result_type}.h5
- Click OK.
- Go to the 5. Batch Processing tab.
- Under Raw data, add the .h5 files from the 5_batch_images folder.
- Under Probabilities, add the data_Probabilities.h5 files from the 6_batch_probabilities folder.
- Click Process all files.
- The output will be saved as *_Multicut Segmentation.h5 files in the output folder (7_batch_multicut).
5.5 Background Processing
- In the caactus GUI, find 5. Background Processing, click ? Help for the description.
- Click Run. This removes the background from all *_Multicut Segmentation.h5 files in 7_batch_multicut/ by setting the largest region value to 0 (transparent in ilastik).
5.6 Batch processing Object classification
In the caactus GUI, find 6. Object Classification, click ? Help for the full ilastik instructions. Summary:
- Open ilastik.
- Open your trained object classification project (3_object_classification.ilp).
[!CAUTION] DO NOT CHANGE anything in 1. Input Data, 2. Object Feature Selection, or 3. Object Classification when running Batch Processing!
- Under 4. Object Information Export:
  - From the dropdown select Object Predictions (default).
  - Click Choose Export Image Settings and set the output path at File:
  {dataset_dir}/../8_batch_objectclassification/{nickname}_{result_type}.h5
- Under 4. Object Information Export, choose Configure Feature Table Export with the following settings:
  - Under General, choose format .csv and change the output directory to:
  {dataset_dir}/../8_batch_objectclassification/{nickname}.csv
  - Under Choose Features, select the features you are interested in exporting.
- Click OK.
- Go to the 5. Batch Processing tab.
- Under Raw data, add the .h5 files from the 5_batch_images folder.
- Under Segmentation Image, add the data_Multicut Segmentation.h5 files from the 7_batch_multicut folder.
- Click Process all files.
- The output will be saved as *_Object Predictions.h5 files and *_table.csv files in the output folder (8_batch_objectclassification).
6. Post-Processing and Data Analysis
[!NOTE] Please be aware that the last two scripts, summary_statistics.py and pln_modelling.py, are currently written for the analysis and visualization of two independent variables. They take the result of the batch-processing steps as input.
- In the caactus GUI, set Variable Names, Class Order, Color Mapping and Pixel Size (µm) in Global Settings.
6.1 Merging Data Tables and Table Export
The next script will combine all tables from all images into one global table for further analysis. Additionally, the information stored in the file name will be added as columns to the dataset.
- Find 7. CSV Summary and click Run.
- The output generated will be df_clean.csv in 9_data_analysis.
- This spreadsheet now unites all feature tables that are the output of 5.6 Object classification.
[!TIP] Technically, from this point on you can use whatever software or workflow is easiest for you for subsequent data analysis (e.g. GraphPad Prism, Excel).
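How the information stored in a file name can be turned back into table columns is sketched below. This is not the caactus code: parse_file_name is a hypothetical helper, and the "-data_table.csv" suffix is an assumption taken from the example file names in the directory structure.

```python
def parse_file_name(file_name, suffix="-data_table.csv"):
    """Split 'colA-val1_colB-val2...' into a {column: value} dictionary."""
    # Assumption: table files end in "-data_table.csv", as in the directory listing.
    if file_name.endswith(suffix):
        file_name = file_name[: -len(suffix)]
    metadata = {}
    for part in file_name.split("_"):
        # partition splits at the first dash only: "strain-ATCC11559" -> column, value
        column, _, value = part.partition("-")
        metadata[column] = value
    return metadata


name = "strain-ATCC11559_date-20241707_timepoint-6h_biorep-A_techrep-1-data_table.csv"
print(parse_file_name(name))
# {'strain': 'ATCC11559', 'date': '20241707', 'timepoint': '6h', 'biorep': 'A', 'techrep': '1'}
```

Each key of the returned dictionary would become one additional column for all objects measured in that image.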
6.2 Creating Summary Statistics
- This script processes the data and generates summary statistics and a stacked bar plot of the predicted classes (cell categories).
- If working with EUCAST antifungal susceptibility testing, use 9. EUCAST Summary Statistics instead.
- For the stacked bar plot, it groups data by the two variables that you enter.
- It computes the average count and percentage of each predicted class, across replicates (technical and biological), for each combination of the two grouping variables.
- It visualizes the distribution of classes across different conditions in stacked bar plots.
- The first variable you enter will be displayed on the x-axis (e.g. incubation temperature), and the second variable will be used for faceting (e.g. timepoint).
- This will create separate subplots for each level of the second variable.
- The plot will show the percentage distribution of predicted classes for each condition, allowing you to compare how the classes are distributed across the experimental conditions defined by the two grouping variables.
- The colors of the bars will correspond to the predicted classes, as defined in your color mapping.
- By default the IBM color-blind friendly palette is used, but you can customize the colors by providing HEX color codes.
- Find 8. Summary Statistics and click Run.
- Output:
  - a) df_summary_complete.csv: full table including the "not usable" category
  - b) df_refined_complete.csv: table without the "not usable" category
  - c) counts.csv: count data used for PLN modelling
  - d) barchart.png: stacked bar chart
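The aggregation described above (average count and percentage of each class across replicates, per combination of two grouping variables) can be sketched in plain Python. This is a simplified illustration and may differ from what summary_statistics.py actually computes: class_percentages and make are hypothetical names, the "Predicted Class" column name is assumed, and the records are invented.

```python
from collections import Counter, defaultdict


def class_percentages(records, var1, var2, class_key="Predicted Class"):
    """Return {group: {class: (mean count per replicate, percentage)}}."""
    # Count objects per (group, replicate, class).
    per_replicate = defaultdict(Counter)
    for rec in records:
        group = (rec[var1], rec[var2])
        replicate = (rec["biorep"], rec["techrep"])
        per_replicate[(group, replicate)][rec[class_key]] += 1

    # Sum the replicate counts per group and remember how many replicates there were.
    totals = defaultdict(Counter)
    n_replicates = Counter()
    for (group, _), counts in per_replicate.items():
        totals[group].update(counts)
        n_replicates[group] += 1

    summary = {}
    for group, counts in totals.items():
        means = {cls: n / n_replicates[group] for cls, n in counts.items()}
        total = sum(means.values())
        summary[group] = {cls: (m, 100 * m / total) for cls, m in means.items()}
    return summary


def make(techrep, cls, n):
    """Create n invented object records for one technical replicate."""
    return [{"strain": "A", "timepoint": "6h", "biorep": "A",
             "techrep": techrep, "Predicted Class": cls}] * n


records = (make("1", "resting", 3) + make("1", "swollen", 1)
           + make("2", "resting", 3) + make("2", "swollen", 1))
print(class_percentages(records, "strain", "timepoint"))
# {('A', '6h'): {'resting': (3.0, 75.0), 'swollen': (1.0, 25.0)}}
```

The biorep and techrep columns are used here to tell replicates apart, which is why they are the only hardcoded column names required during renaming.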
6.3 PLN Modelling
- This script runs ZIPln modelling on the input data with a dynamic design and generates PCA visualizations and a correlation circle plot.
- The two grouping variables you enter will be used in the model formula and for visualizing the PCA results.
- They will be combined into a single factor for the model, and the PCA plot will show the latent variable projections colored by this combined category.
- The correlation circle plot will show how the original variables relate to the latent dimensions, helping you interpret the PCA results in terms of the original grouping variables.
[!WARNING] The limit of categories displayed in the PCA plot is n=15.
- In the caactus GUI, make sure Variable Names and Class Order are set correctly in Global Settings.
- Find 10. PLN Modelling and click Run.
- Output:
  - a) correlation_circle.png
  - b) pca_plot.png
[!NOTE] Variable Names and Class Order are shared with Summary Statistics; set them once in Global Settings.
7. Running caactus from the command line
7.1 Setup config.toml-file
- Copy config/config.toml to your working directory and modify it as needed.
- The caactus scripts are set up to pull the information they need from this file.
  - CAVE: Windows users, make sure to change the forward slashes to backslashes (from /path/to/config.toml to \path\to\config.toml) when copying the path to your working directory.
- Open the command line (for Windows: Anaconda Powershell) and save the path to your config file in a variable.
  - whole command UNIX:
p="/path/to/config.toml"
  - whole command Windows:
$p = "\path\to\config.toml"
- Enter "-c" followed by the path to config.toml.
- Enter "-m" and choose "training" or "batch" to switch between modes.
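For readers who want to see how the two options fit together, the following sketch builds a command-line parser with the same short flags. It is an assumption-laden illustration and not the parser used by caactus: the long option names --config and --mode, the default mode, and the function name build_parser are all hypothetical.

```python
import argparse


def build_parser():
    """Build a parser with the -c and -m options described above."""
    parser = argparse.ArgumentParser(description="Illustrative caactus-style CLI")
    # Assumption: long option names are invented for readability.
    parser.add_argument("-c", "--config", required=True, help="path to config.toml")
    parser.add_argument(
        "-m", "--mode", choices=["training", "batch"], default="training",
        help="switch between training and batch mode",
    )
    return parser


args = build_parser().parse_args(["-c", "/path/to/config.toml", "-m", "batch"])
print(args.config, args.mode)
# /path/to/config.toml batch
```

Restricting -m to the two allowed values means a typo such as "-m bach" is rejected immediately instead of silently running in the wrong mode.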
7.2 Conversion
- Call the tif2h5py script from the command prompt to convert all .tif files to .h5 format.
  - whole command UNIX:
tif2h5py -c "$p" -m training
  - whole command Windows:
tif2h5py.exe -c $p -m training
- For batch processing, enter "-m batch" for batch mode.
  - whole command UNIX:
tif2h5py -c "$p" -m batch
  - whole command Windows:
tif2h5py.exe -c $p -m batch
7.3 Background Processing
- Call the background_processing script from the command prompt.
  - whole command UNIX:
background_processing -c "$p" -m training
  - whole command Windows:
background_processing.exe -c $p -m training
- For batch processing, enter "-m batch" for batch mode.
  - whole command UNIX:
background_processing -c "$p" -m batch
  - whole command Windows:
background_processing.exe -c $p -m batch
7.4 Rename Files
- Call the renaming script from the command prompt to rename all your original .tif files to their new names.
  - whole command UNIX:
renaming -c "$p"
  - whole command Windows:
renaming.exe -c $p
7.5 CSV-Summary
- Call the csv_summary script from the command prompt.
  - whole command UNIX:
csv_summary -c "$p"
  - whole command Windows:
csv_summary.exe -c $p
7.6 Creating Summary Statistics
- Call the summary_statistics script from the command prompt.
  - whole command UNIX:
summary_statistics -c "$p"
  - whole command Windows:
summary_statistics.exe -c $p
- If working with EUCAST antifungal susceptibility testing, call
  - whole command UNIX:
summary_statistics_eucast -c "$p"
  - whole command Windows:
summary_statistics_eucast.exe -c $p
7.7 PLN Modelling
- Call the pln_modelling script from the command prompt.
  - whole command UNIX:
pln_modelling -c "$p"
  - whole command Windows:
pln_modelling.exe -c $p
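The order in which the command-line tools from 7.2 to 7.7 are used during a batch run can be written down as a small Python helper. This is a sketch only: batch_commands is a hypothetical function that returns the command lines without executing them, and the ilastik steps in between still have to be performed in ilastik.

```python
def batch_commands(config_path):
    """Return the caactus command lines for a batch run, in workflow order."""
    config = ["-c", config_path]
    return [
        ["renaming", *config],
        ["tif2h5py", *config, "-m", "batch"],
        # ... run pixel classification and multicut in ilastik here ...
        ["background_processing", *config, "-m", "batch"],
        # ... run object classification in ilastik here ...
        ["csv_summary", *config],
        ["summary_statistics", *config],
        ["pln_modelling", *config],
    ]


for command in batch_commands("/path/to/config.toml"):
    print(" ".join(command))
```

Each inner list can be passed to subprocess.run(command, check=True) from an activated caactus-env, one step at a time, once the preceding ilastik step has finished.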
8. Tutorial
8.1 Download Sample Data
- Go to zenodo to download the sample data.
- Unpack the .zip-file into your project folder.
- The path to where you unpacked the sample data will be your main folder (e.g. /home/usr/Documents/sampledata_CD6_zenodo).
- To showcase the functionalities, the ilastik steps have been pretrained. Use caactus in batch mode for the following steps. From the dropdown menu in Global Settings in the GUI, select batch.
[!NOTE] Some subdirectories are intentionally left empty. The tutorial is designed to show how the batch mode works with pretrained models. 0_1_original_tif_training_images stays empty; the other empty subdirectories will be filled as you follow the steps below.
- Make sure you have caactus installed (see Installation above).
- Make sure you have the caactus environment activated:
conda activate caactus-env
- Now simply type caactus and hit Enter to start the graphical user interface:
caactus
- We recommend working with two screens. This allows you to follow the instructions implemented in the caactus GUI while performing the steps in ilastik, and to switch back quickly to the caactus steps to complete the pipeline.
8.2 Global Settings
- At the top, enter the path to your main folder.
- Change the default values of Variable Names from ['strain', 'timepoint'] to
['condition1','condition2']
- Set Mode to batch.
8.3 Pre-Processing - Renaming
- In your main folder (sampledata_CD6_zenodo), inspect the renaming.csv spreadsheet to see how it is constructed.
- In the GUI, go to Pre-Processing, find 1. Renaming and click Run.
- If you click on the dropdown menu Advanced paths, a menu will open that allows you to change the input and output folders, as well as the name of the renaming file.
8.4 Pre-Processing - Tif to h5
- In the GUI, go to Pre-Processing, find 2. Tif to h5 and click Run.
- If you click on the dropdown menu Advanced paths, a menu will open that allows you to change the input and output folders.
8.5 Batch Pixel Classification
In the caactus GUI, find 3. Pixel Classification, click ? Help for the full instructions. Summary:
- Open ilastik.
- Open the pre-trained pixel classification project from the sample data (1_pixel_classification.ilp).
[!CAUTION] DO NOT CHANGE anything in 1. Input Data, 2. Feature Selection, or 3. Training when running Batch Processing!
- Under 4. Prediction Export:
  - Select Probabilities from the dropdown.
  - Click Choose Export Image Settings and set the output path at File:
  {dataset_dir}/../6_batch_probabilities/{nickname}_{result_type}.h5
- Click OK.
- Go to the 5. Batch Processing tab.
- Under Raw data, add the .h5 files from the 5_batch_images folder.
- Now click Process all files.
- The output will be saved as *_Probabilities.h5 files in the output folder (6_batch_probabilities).
8.6 Batch Processing Multicut Segmentation
In the caactus GUI, find 4. Boundary Segmentation, click ? Help for the full instructions. Summary:
- In ilastik, open the pre-trained Boundary Segmentation project (2_boundary_segmentation.ilp).
[!CAUTION] DO NOT CHANGE anything in 1. Input Data, 2. DT Watershed, or 3. Training and Multicut when running Batch Processing!
[!NOTE] The *_Multicut Segmentation.h5 files are generated here; they do not exist beforehand.
- Under 4. Data Export:
  - Click Choose Export Image Settings and set the output path at File:
  {dataset_dir}/../7_batch_multicut/{nickname}_{result_type}.h5
- Click OK.
- Go to the 5. Batch Processing tab.
- Under Raw data, add the .h5 files from the 5_batch_images folder.
- Under Probabilities, add the data_Probabilities.h5 files from the 6_batch_probabilities folder.
- Click Process all files.
- The output will be saved as *_Multicut Segmentation.h5 files in the output folder (7_batch_multicut).
- Close the 2_boundary_segmentation.ilp project file in ilastik.
8.7 Batch Background Processing
- Switch back to the caactus GUI.
- Find 5. Background Processing.
- Click Run. The background is now removed and you can continue with object classification in ilastik.
8.8 Batch Object Classification
In the caactus GUI, find 6. Object Classification, and click ? Help for the full instructions. Summary:
- Switch back to ilastik.
- Open your trained object classification project (3_object_classification.ilp).
[!CAUTION] DO NOT CHANGE anything in 1. Input Data, 2. Object Feature Selection, or 3. Object Classification when running Batch Processing!
- Under 4. Object Information Export:
  - Select Object Predictions from the dropdown.
  - Click Choose Export Image Settings and set the output path at File:
  {dataset_dir}/../8_batch_objectclassification/{nickname}_{result_type}.h5
- Under 4. Object Information Export, choose Configure Feature Table Export with the following settings:
  - Under General, choose format .csv and change the output directory to:
  {dataset_dir}/../8_batch_objectclassification/{nickname}.csv
  - Under Choose Features, select the features you are interested in exporting.
- Click OK.
- Go to the 5. Batch Processing tab.
- Under Raw data, add the .h5 files from the 5_batch_images folder.
- Under Segmentation Image, add the data_Multicut Segmentation.h5 files from the 7_batch_multicut folder.
- Click Process all files.
- The output will be saved as *_Object Predictions.h5 files and *_table.csv files in the output folder (8_batch_objectclassification).
- Now you have performed all steps in ilastik. You can close ilastik.
8.9 CSV Summary
- Switch back to the caactus GUI. Scroll down to Data Analysis.
- The default Pixel Size is already set in Global Settings; you can leave it as-is for the sample data.
- Find 7. CSV Summary and click Run.
- Inspect the generated df_clean.csv. This spreadsheet combines all feature tables from Object Classification into one file for downstream analysis.
8.10 Summary Statistics
- Find 8. Summary Statistics and click Run.
- Inspect the generated results. The output generated will be
  - a) df_summary_complete.csv = .csv-table that also contains the not usable category
  - b) df_refined_complete.csv = .csv-table without the not usable category
  - c) counts.csv = dataframe used in PLN modelling
  - d) bar graph (barchart.png) with condition1 on the x-axis, the percentage of morphotypes ("Predicted Class") on the y-axis, and condition2 as the faceting variable in rows. You can experiment by putting 'condition2' first and 'condition1' second to see how it changes the plot.
- You may also change the colors: change the default in Global Settings
{'resting': '#FE6100', 'swollen': '#648FFF', 'germling': '#785EF0', 'hyphae': '#DC267F'}
to
{'resting': 'yellow', 'swollen': 'cyan', 'germling': 'blue', 'hyphae': 'magenta'}
- Again, find 8. Summary Statistics and click Run. The colors should now be changed.
- Similarly, you may change the morphotype names. Open df_clean.csv in a spreadsheet software (e.g. Excel). Replace all resting with dormant (use Ctrl+F - Replace all, then save df_clean.csv). Now re-do step 8.10 Summary Statistics. Before you click Run, make sure you replace resting with dormant in both the Class Order field
['dormant', 'swollen', 'germling', 'hyphae']
and the Color Mapping field
{'dormant': 'yellow', 'swollen': 'cyan', 'germling': 'blue', 'hyphae': 'magenta'}
- Again, find 8. Summary Statistics and click Run. The names should now be changed.
- Let's imagine you only have 3 cell categories in your dataset. Again, open df_clean.csv in a spreadsheet software (e.g. Excel). Replace all dormant with spores (use Ctrl+F - Replace all, then save df_clean.csv). Similarly, replace all swollen with spores. Now change the Class Order field to
['spores', 'germling', 'hyphae']
and the Color Mapping field to
{'spores': 'yellow', 'germling': 'blue', 'hyphae': 'magenta'}
- Again, find 8. Summary Statistics and click Run. The categories should now be changed.
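The spreadsheet find-and-replace used in this tutorial can also be scripted, which has the advantage that only the class column is touched. The sketch below is an illustration and not part of caactus: rename_classes is a hypothetical helper, the column name "Predicted Class" is assumed from the plot label mentioned above, and the two-column table is a made-up excerpt rather than the real layout of df_clean.csv.

```python
import csv
import io


def rename_classes(csv_text, mapping, column="Predicted Class"):
    """Return CSV text in which the values of `column` are replaced via `mapping`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    output = io.StringIO()
    writer = csv.DictWriter(output, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Only the class column is touched; other cells are left unchanged.
        row[column] = mapping.get(row[column], row[column])
        writer.writerow(row)
    return output.getvalue()


# Hypothetical two-column excerpt of df_clean.csv.
table = "object_id,Predicted Class\n1,resting\n2,swollen\n3,hyphae\n"
print(rename_classes(table, {"resting": "spores", "swollen": "spores"}))
```

After such a change, the Class Order and Color Mapping fields in Global Settings still need to be updated to the new names before Summary Statistics is run again.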
8.11 PLN Modelling
- Find 10. PLN Modelling and click Run.
- Inspect the generated results in the subdirectory /sampledata_CD6_zenodo/9_data_analysis/. The output generated will be
  - a) correlation_circle.png. Shows that PCA1, accounting for ~57% of the variance, primarily separates samples by condition2, whereas PCA2 accounts for ~25% of the variance based on condition1.
  - b) pca_plot.png. The PCA plot shows how the images are grouped together in 2D-space based on the combined category of condition1 and condition2 (the categorical levels will be combined).