
Package for pre- and post-processing of images and data for working with the ilastik software


caactus

caactus (cell analysis and counting tool using ilastik software) is a collection of Python scripts that provide a streamlined workflow for the ilastik software, including data preparation, processing and analysis. It aims to provide biologists with an easy-to-use tool for counting and analyzing cells in large numbers of microscopy images.

[Figure: workflow overview]

Introduction

The goal of this script collection is to provide an easy-to-use complement to the boundary-based segmentation with Multicut workflow in ilastik. This workflow allows for the automation of cell counting from messy microscopy images with different (touching) cell types for biological research. Commands are provided in grey code boxes for one-click copy & paste.

Installation

Install miniconda, create an environment and install Python and vigra

  • Download and install miniconda for your respective operating system according to the instructions.
    • Miniconda provides a lightweight package and environment manager. It allows you to create isolated environments so that Python versions and package dependencies required by caactus do not interfere with your system Python or other projects.
  • Once installed, create an environment for using caactus with the following command from your cmd-line
    conda create -n caactus-env -c conda-forge "python>=3.10.12" vigra 
    

Install caactus

  • Activate the caactus-env from the cmd-line with
    conda activate caactus-env
    
  • To install caactus plus the needed dependencies inside your environment, use
    pip install caactus
    
  • During the steps described below that call the caactus scripts, make sure to have the caactus-env activated.

Install ilastik

  • Download and install ilastik for your operating system according to the instructions on the ilastik website.

Quick Overview of the workflow

  1. Culture the organism of interest in a 96-well plate.
  2. Acquire images of the cells via microscopy.
  3. Create the project directory.
  4. Rename the files with the caactus script renaming.
  5. Convert the files to HDF5 format with the caactus script tif2h5py.
  6. Train a pixel classification model in ilastik and later run it in batch mode.
  7. Train a boundary-based segmentation with Multicut model in ilastik and later run it in batch mode.
  8. Remove the background from the images using background_processing.
  9. Train an object classification model in ilastik and later run it in batch mode.
  10. Pool all csv-tables from the individual images into one global table with csv_summary.
      • output generated:
        • "df_clean.csv"
  11. Summarize the data with summary_statistics.
      • output generated:
        • a) "df_summary_complete.csv" = .csv-table that also contains the "not usable" category
        • b) "df_refined_complete.csv" = .csv-table without the "not usable" category
        • c) "counts.csv" = dataframe used in pln_modelling
        • d) bar graph ("barchart.png")
  12. Model the count data with pln_modelling.

Detailed Description of the Workflow

1. Culturing

  • Culture your cells in a flat-bottom plate of your choice and according to the needs of the organism being researched.

2. Image acquisition

  • In your respective microscopy software environment, save the images of interest to .tif-format.
  • From the image metadata, copy the pixel size and magnification used.

3. Data Preparation

3.1 Create Project Directory

  • For portability of the ilastik projects, create the directory with the following structure:
    (Please note: the example below already includes examples of the resulting files in each sub-directory.)
  • This allows you to copy an already trained workflow and use it multiple times with new datasets.
project_directory  
├── 1_pixel_classification.ilp  
├── 2_boundary_segmentation.ilp  
├── 3_object_classification.ilp
├── renaming.csv
├── config.toml
├── 0_1_original_tif_training_images
  ├── training-1.tif
  ├── training-2.tif
  ├── ...
├── 0_2_original_tif_batch_images
  ├── image-1.tif
  ├── image-2.tif
  ├── ..
├── 0_3_batch_tif_renamed
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1.tif
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2.tif
  ├── ..
├── 1_images
  ├── training-1.h5
  ├── training-2.h5
  ├── ...
├── 2_probabilities
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_Probabilities.h5
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_Probabilities.h5
  ├── ...
├── 3_multicut
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_Multicut Segmentation.h5
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_Multicut Segmentation.h5
  ├── ...
├── 4_objectclassification
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_Object Predictions.h5
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_table.csv
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_Object Predictions.h5
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_table.csv
  ├── ...
├── 5_batch_images
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1.h5
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2.h5
  ├── ...
├── 6_batch_probabilities
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_Probabilities.h5
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_Probabilities.h5
  ├── ...
├── 7_batch_multicut
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_Multicut Segmentation.h5
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_Multicut Segmentation.h5
  ├── ...
├── 8_batch_objectclassification
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_Object Predictions.h5
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_table.csv
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_Object Predictions.h5
  ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_table.csv
  ├── ...
├── 9_data_analysis
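
If you prefer not to create these folders by hand, the skeleton can be generated with a few lines of Python. This is a convenience sketch, not part of caactus; the directory names are taken verbatim from the layout above.

    # create_skeleton.py -- build the empty caactus project layout
    from pathlib import Path

    SUBDIRS = [
        "0_1_original_tif_training_images",
        "0_2_original_tif_batch_images",
        "0_3_batch_tif_renamed",
        "1_images",
        "2_probabilities",
        "3_multicut",
        "4_objectclassification",
        "5_batch_images",
        "6_batch_probabilities",
        "7_batch_multicut",
        "8_batch_objectclassification",
        "9_data_analysis",
    ]

    def create_project(root: str) -> None:
        """Create the caactus project directory skeleton under `root`."""
        base = Path(root)
        for sub in SUBDIRS:
            (base / sub).mkdir(parents=True, exist_ok=True)

    if __name__ == "__main__":
        create_project("project_directory")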

3.2 Setup config.toml-file

  • Copy config/config.toml to your working directory and modify it as needed.
  • The caactus scripts pull the information they need for running from this file.
    • CAVE: Windows users should change the forward slashes, i.e. /path/to/config.toml, to backslashes, i.e. \path\to\config.toml, when copying the path to their working directory.
  • Open the command line (for Windows: Anaconda Powershell) and save the path to your config file to a variable.
  • whole command UNIX:
    p="/path/to/config.toml"
    
  • whole command Windows:
    $p = "\path\to\config.toml"
    

4. Training

4.1. Selection of Training Images and Conversion

4.1.1 Selection of Training data

  • select a set of images that best represent the different experimental conditions
  • store them in 0_1_original_tif_training_images

4.1.2 Conversion

  • call the tif2h5py script from the cmd prompt to convert all .tif-files to .h5-format (a sketch of what the conversion does follows this list). The .h5-format allows for better performance when working with ilastik.
  • select "-c" and enter the path to config.toml
  • select "-m" and choose "training"
  • whole command UNIX:
    tif2h5py -c "$p" -m training
    
  • whole command Windows:
    tif2h5py.exe -c $p -m training
    
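
For illustration, the conversion essentially does the following. This is a minimal sketch assuming vigra's readImage and writeHDF5 functions; the internal dataset name "data" and the exact axis and metadata handling of the real tif2h5py script are assumptions here.

    # tif -> h5 conversion sketch, using the vigra package installed above
    from pathlib import Path
    import vigra

    def convert_folder(src: str, dst: str) -> None:
        """Convert every .tif in `src` to an .h5 file in `dst`."""
        Path(dst).mkdir(parents=True, exist_ok=True)
        for tif in sorted(Path(src).glob("*.tif")):
            image = vigra.readImage(str(tif))           # load the .tif image
            out = Path(dst) / (tif.stem + ".h5")
            vigra.writeHDF5(image, str(out), "data")    # dataset name is an assumption

    convert_folder("0_1_original_tif_training_images", "1_images")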

4.2. Pixel Classification

4.2.1 Project setup

  • Follow the documentation for pixel classification with ilastik.
  • Create the 1_pixel_classification.ilp project file inside the project directory.
  • For working with neighbouring / touching cells, it is suggested to create three classes: 0 = interior, 1 = background, 2 = boundary (this follows Python's 0-indexing logic, where counting starts at 0).

[Figure: pixel classes]

4.2.2 Export Probabilities

In Prediction Export, change the settings to:

  • Convert to Data Type: integer 8-bit
  • Renormalize from 0.00 1.00 to 0 255
  • File:
    {dataset_dir}/../2_probabilities/{nickname}_{result_type}.h5
    

[Figure: probability export settings]

4.3 Boundary-based Segmentation with Multicut

4.3.1 Project setup

  • Follow the documentation for boundary-based segmentation with Multicut in ilastik.
  • Create the 2_boundary_segmentation.ilp project file inside the project directory.

[Figure: watershed settings]

4.3.2 Export Multicut Segmentation

In Prediction Export, change the settings to:

  • Convert to Data Type: integer 8-bit
  • Renormalize from 0.00 1.00 to 0 255
  • Format: compressed hdf5
  • File:
    {dataset_dir}/../3_multicut/{nickname}_{result_type}.h5
    

[Figure: Multicut export settings]

4.4 Background Processing

For further processing in the object classification, the background needs to be eliminated from the Multicut datasets. For this, the next script will set the numerical value of the largest region to 0. It will thus be shown as transparent in the next step of the workflow. This operation will be performed in place on all *data_Multicut Segmentation.h5-files in project_directory/3_multicut/ (a sketch follows the commands below).

  • call the background_processing script from the cmd prompt
  • select "-c" and enter path to config.toml
  • enter "-m training" for training mode
  • whole command UNIX:
    background_processing -c "$p" -m training
    
  • whole command Windows:
    background_processing.exe -c $p -m training
    
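
Conceptually, the background removal boils down to the following sketch. The HDF5 dataset name "exported_data" (ilastik's default export name) is an assumption, and the actual script may differ in its details.

    # background removal sketch: zero out the largest labelled region
    from pathlib import Path
    import h5py
    import numpy as np

    def zero_largest_region(h5_path: Path, dataset: str = "exported_data") -> None:
        """Set the label of the largest segment (the background) to 0, in place."""
        with h5py.File(h5_path, "r+") as f:
            labels = f[dataset][...]
            ids, counts = np.unique(labels, return_counts=True)
            background = ids[np.argmax(counts)]   # largest region by pixel count
            labels[labels == background] = 0
            f[dataset][...] = labels

    for path in Path("3_multicut").glob("*data_Multicut Segmentation.h5"):
        zero_largest_region(path)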

4.5. Object Classification

4.5.1 Project setup

  • Follow the documentation for object classification.
  • Define your cell types plus an additional category for "not usable" objects, e.g. cell debris and cut-off objects at the edges of the images.

4.5.2 Export Object Information

In Choose Export Image Settings, change the settings to:

  • Convert to Data Type: integer 8-bit
  • Renormalize from 0.00 1.00 to 0 255
  • Format: compressed hdf5
  • File:
    {dataset_dir}/../4_objectclassification/{nickname}_{result_type}.h5
    

[Figure: export settings]

In Configure Feature Table Export General, change the settings to:

  • format .csv and output directory File:
    {dataset_dir}/../4_objectclassification/{nickname}.csv
    
  • select your features of interest for exporting

[Figure: feature table export settings]

5. Batch Processing

  • Follow the documentation for batch processing
  • store the images you want to process in the 0_2_original_tif_batch_images directory
  • Perform steps 4.2 to 4.5 in batch mode, as explained in detail below (sections 5.2 to 5.6)

5.1 Rename Files

  • Rename the .tif-files so that they contain information about your cells and experimental conditions.
  • Create a csv-file that contains the information you need in columns. Each row corresponds to one image; follow the same order as the sequence of image acquisition.
  • The only hardcoded columns that have to be added are biorep for "biological replicate" and techrep for "technical replicate". They are needed in the downstream analysis for calculating the averages.
  • The script will rename your files in the following format: columnA-value1_columnB-value2_columnC-value3_etc.tif. E.g., as seen in the example below, picture 1 (well A1 from our plate) will be named strain-ATCC11559_date-20241707_timepoint-6h_biorep-A_techrep-1.tif. (A sketch of the renaming logic follows this list.)
  • Call the renaming script from the cmd prompt to rename all your original .tif-files to their new names.
  • whole command Unix:
    renaming -c "$p"
    
  • whole command Windows:
    renaming.exe -c $p
    
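
The renaming logic is roughly equivalent to this sketch. It assumes the rows of renaming.csv are in the same order as the alphabetically sorted original files; the real script takes its paths from config.toml and writes into 0_3_batch_tif_renamed as shown in the project layout.

    # renaming sketch: build "column-value" file names from renaming.csv
    from pathlib import Path
    import shutil
    import pandas as pd

    src = Path("0_2_original_tif_batch_images")
    dst = Path("0_3_batch_tif_renamed")
    dst.mkdir(exist_ok=True)

    table = pd.read_csv("renaming.csv")        # one row per image
    originals = sorted(src.glob("*.tif"))      # acquisition order assumed

    for tif, (_, row) in zip(originals, table.iterrows()):
        # join the "column-value" pairs with underscores
        new_name = "_".join(f"{col}-{row[col]}" for col in table.columns) + ".tif"
        shutil.copy2(tif, dst / new_name)      # copy, keeping the originals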

5.2 Conversion

  • call the tif2h5py script from the cmd prompt to convert all .tif-files to .h5-format.
  • select "-c" and enter the path to config.toml
  • select "-m" and choose "batch"
  • whole command UNIX:
    tif2h5py -c "$p" -m batch
    
  • whole command Windows:
    tif2h5py.exe -c $p -m batch
    

[Figure: 96-well plate]

5.3 Batch Processing Pixel Classification

  • open the 1_pixel_classification.ilp project file
  • under Prediction Export change the export directory to File:
    {dataset_dir}/../6_batch_probabilities/{nickname}_{result_type}.h5
    
  • under Batch Processing Raw Data select all files from 5_batch_images

5.4 Batch Processing Multicut Segmentation

  • open the 2_boundary_segmentation.ilp project file
  • under Choose Export Image Settings change the export directory to File:
    {dataset_dir}/../7_batch_multicut/{nickname}_{result_type}.h5
    
  • under Batch Processing Raw Data select all files from 5_batch_images
  • under Batch Processing Probabilities select all files from 6_batch_probabilities

5.5 Background Processing

For further processing in the object classification, the background needs to be eliminated from the Multicut datasets. As in step 4.4, the script will set the numerical value of the largest region to 0, so it will be shown as transparent in the next step of the workflow. This operation will be performed in place on all *data_Multicut Segmentation.h5-files in project_directory/7_batch_multicut/.

  • call the background_processing script from the cmd prompt
  • select "-c" and enter the path to config.toml
  • enter "-m batch" for batch mode
  • whole command Unix:
    background_processing -c "$p" -m batch
    
  • whole command Windows:
    background_processing.exe -c $p -m batch
    

5.6 Batch processing Object classification

  • open the 3_object_classification.ilp project file
  • under Choose Export Image Settings change the export directory to File:
    {dataset_dir}/../8_batch_objectclassification/{nickname}_{result_type}.h5
    
  • in Configure Feature Table Export General choose format .csv and change output directory to:
    {dataset_dir}/../8_batch_objectclassification/{nickname}.csv
    
  • select your features of interest for exporting
  • under Batch Processing Raw Data select all files from 5_batch_images
  • under Batch Processing Segmentation Image select all files from 7_batch_multicut

6. Post-Processing and Data Analysis

  • Please be aware that the last two scripts, summary_statistics and pln_modelling, are at this stage written for the analysis and visualization of two independent variables.

6.1 Merging Data Tables and Table Export

The next script will combine all tables from all images into one global table for further analysis. Additionally, the information stored in the file name will be added as columns to the dataset.
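
Conceptually, the pooling works roughly like the sketch below. The file-name parsing is simplified, the input directory is taken from the project layout above, and where df_clean.csv is written is an assumption.

    # pooling sketch: merge all per-image tables and decode the file names
    from pathlib import Path
    import pandas as pd

    frames = []
    for csv in Path("8_batch_objectclassification").glob("*table.csv"):
        df = pd.read_csv(csv)
        # decode e.g. "strain-xx_timepoint-zz" back into columns
        for pair in csv.stem.split("_"):
            if "-" in pair:
                key, _, value = pair.partition("-")
                df[key] = value
        frames.append(df)

    df_clean = pd.concat(frames, ignore_index=True)
    df_clean.to_csv("9_data_analysis/df_clean.csv", index=False)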

  • call the csv_summary script from the cmd prompt
  • whole command Unix:
    csv_summary -c "$p"
    
  • whole command Windows
    csv_summary.exe -c $p
    
  • Technically, from this point on you can continue with whatever software / workflow is easiest for you for the subsequent data analysis.

6.2 Creating Summary Statistics

  • call the summary_statistics script from the cmd prompt (a simplified sketch of the averaging follows this list)
  • whole command Unix:
    summary_statistics -c "$p"
    
  • whole command Windows:
    summary_statistics.exe -c $p
    
  • if working with EUCAST antifungal susceptibility testing, call summary_statistics_eucast
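
A simplified sketch of the averaging: count the objects per replicate and predicted class, average the technical replicates, then summarize. Apart from biorep and techrep, which the renaming step in 5.1 requires, the column names (in particular "Predicted Class") are assumptions.

    # summary sketch: per-class counts averaged over replicates
    import pandas as pd

    df = pd.read_csv("9_data_analysis/df_clean.csv")

    # objects per biological/technical replicate and predicted class
    counts = (df.groupby(["biorep", "techrep", "Predicted Class"])
                .size()
                .reset_index(name="count"))

    # mean over technical replicates, then summarize per class
    per_biorep = (counts.groupby(["biorep", "Predicted Class"])["count"]
                        .mean()
                        .reset_index())
    summary = per_biorep.groupby("Predicted Class")["count"].agg(["mean", "std"])
    print(summary)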

6.3 PLN Modelling

  • call the pln_modelling script from the cmd prompt (a simplified stand-in sketch follows this list)
  • whole command Unix:
    pln_modelling -c "$p"
    
  • whole command Windows:
    pln_modelling.exe -c $p
    
  • please note: the limit of categories for display in the PCA-plot is n=15
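
The script fits a Poisson log-normal (PLN) model to the count data. As a rough stand-in for readers unfamiliar with PLN, here is a plain Poisson GLM with statsmodels; note that it lacks the log-normal random effect of a true PLN model, and the covariate names used here are hypothetical.

    # simplified stand-in: Poisson GLM on the count table from 6.2
    import pandas as pd
    import statsmodels.api as sm

    counts = pd.read_csv("9_data_analysis/counts.csv")   # produced by summary_statistics

    # hypothetical design: counts as a function of two independent variables
    X = pd.get_dummies(counts[["condition", "timepoint"]], drop_first=True)
    X = sm.add_constant(X.astype(float))
    model = sm.GLM(counts["count"], X, family=sm.families.Poisson())
    print(model.fit().summary())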
