Package for pre- and post-processing of images and data for working with ilastik-software
caactus
caactus (cell analysis and counting Tool using ilastik software) is a collection of Python scripts that provides a streamlined workflow for the ilastik software, including data preparation, processing and analysis. It aims to provide biologists with an easy-to-use tool for counting and analyzing cells from a large number of microscopy pictures.
Introduction
The goal of this script collection is to provide an easy-to-use complement to the Boundary-based Segmentation with Multicut workflow in ilastik. This workflow allows for the automation of cell counting from messy microscopic images with different (touching) cell types for biological research.
Installation
Install python
- Download and install Python for your respective operating system.
- Make sure that the pip installer was installed along with the Python installation by typing `pip help` in the command prompt.
Install ilastik
- Download and install ilastik for your respective operating system.
Install vigra
- Follow the install instructions for vigra on the developer's website
Install caactus
- To install caactus, use `pip install caactus`. This installs all scripts plus the needed dependencies.
Workflow
A Culturing
- Culture your cells in a flat-bottom plate of your choice and according to the needs of the organisms being researched.
B Image acquisition
- In your respective microscopy software environment, save the images of interest in .tif format.
- From the image metadata, copy the pixel size and magnification used.
C Data Preparation
C.1 Create Project Directory
- For portability of the ilastik projects create the directory in the following structure:
(Please note: the below example already includes examples of resulting files in each sub-directory)
project_directory
├── 1_pixel_classification.ilp
├── 2_boundary_segmentation.ilp
├── 3_object_classification.ilp
├── renaming.csv
├── config.toml
├── 0_1_original_tif_training_images
│   ├── training-1.tif
│   ├── training-2.tif
│   └── ...
├── 0_2_original_tif_batch_images
│   ├── image-1.tif
│   ├── image-2.tif
│   └── ...
├── 0_3_batch_tif_renamed
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1.tif
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2.tif
│   └── ...
├── 1_images
│   ├── training-1.h5
│   ├── training-2.h5
│   └── ...
├── 2_probabilities
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_Probabilities.h5
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_Probabilities.h5
│   └── ...
├── 3_multicut
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_Multicut Segmentation.h5
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_Multicut Segmentation.h5
│   └── ...
├── 4_objectclassification
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_Object Predictions.h5
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_table.csv
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_Object Predictions.h5
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_table.csv
│   └── ...
├── 5_batch_images
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1.h5
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2.h5
│   └── ...
├── 6_batch_probabilities
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_Probabilities.h5
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_Probabilities.h5
│   └── ...
├── 7_batch_multicut
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_Multicut Segmentation.h5
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_Multicut Segmentation.h5
│   └── ...
├── 8_batch_objectclassification
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_Object Predictions.h5
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-1-data_table.csv
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_Object Predictions.h5
│   ├── strain-xx_day-yymmdd_condition1-yy_timepoint-zz_parallel-2-data_table.csv
│   └── ...
└── 9_data_analysis
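The empty sub-directories of this layout can also be created with a few lines of Python. The following is only a minimal sketch and not part of caactus: the helper name `create_project_skeleton` is hypothetical, and the directory names are copied from the structure shown above.

```python
from pathlib import Path
import tempfile

# Sub-directory names copied from the project structure shown above.
SUBDIRS = [
    "0_1_original_tif_training_images",
    "0_2_original_tif_batch_images",
    "0_3_batch_tif_renamed",
    "1_images",
    "2_probabilities",
    "3_multicut",
    "4_objectclassification",
    "5_batch_images",
    "6_batch_probabilities",
    "7_batch_multicut",
    "8_batch_objectclassification",
    "9_data_analysis",
]


def create_project_skeleton(root):
    """Create the empty sub-directories under `root` and return their paths.

    Hypothetical helper for illustration; caactus itself does not ship it.
    """
    root = Path(root)
    created = []
    for name in SUBDIRS:
        path = root / name
        path.mkdir(parents=True, exist_ok=True)  # safe to run more than once
        created.append(path)
    return created


if __name__ == "__main__":
    # Demo in a throw-away location; use your real project directory instead.
    with tempfile.TemporaryDirectory() as tmp:
        for path in create_project_skeleton(Path(tmp) / "project_directory"):
            print(path.name)
```

The .ilp project files, renaming.csv and config.toml are not created by this sketch; they are added in the later steps of the workflow.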
C.2 Set up the config.toml file
- copy config/config.toml to your working directory and modify it as needed.
- the caactus scripts read the information they need for running from this file
D Training
D.1. Selection of Training Images and Conversion
D.1.1 Selection of Training data
- select a set of images that best represent the different experimental conditions
- store them in 0_1_original_tif_training_images
D.1.2 Conversion
- call the `tif2h5py` script from the cmd prompt to transform all .tif files to .h5 format. The .h5 format allows for better performance when working with ilastik.
- select "-c" and enter the path to config.toml
- select "-m" and choose "training"
- whole command: `tif2h5py -c \path\to\config.toml -m training`
D.2. Pixel Classification
D.2.1 Project setup
- Follow the documentation for pixel classification with ilastik.
- Create the 1_pixel_classification.ilp project file inside the project directory.
- For working with neighbouring / touching cells, it is suggested to create three classes: 0 = interior, 1 = background, 2 = boundary (this follows Python's 0-indexing logic, where counting starts at 0).
D.2.2 Export Probabilities
In Prediction Export, change the settings to:
- Convert to Data Type: integer 8-bit
- Renormalize from [0.00, 1.00] to [0, 255]
- File: {dataset_dir}/../2_probabilities/{nickname}_{result_type}.h5
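To see where such an export template ends up on disk, the placeholders can be filled in by hand. This is only a sketch of the substitution, not ilastik's own code: the helper `resolve_export_path` and the example values (a training image in 1_images with the nickname `training-1-data`) are assumptions for illustration.

```python
import posixpath

# Export template from the File setting above; ilastik fills in the placeholders.
TEMPLATE = "{dataset_dir}/../2_probabilities/{nickname}_{result_type}.h5"


def resolve_export_path(template, dataset_dir, nickname, result_type):
    """Fill in the placeholders and collapse the '..' step (illustrative helper)."""
    filled = template.format(
        dataset_dir=dataset_dir, nickname=nickname, result_type=result_type
    )
    return posixpath.normpath(filled)


# Hypothetical values for one training image stored in 1_images.
print(
    resolve_export_path(
        TEMPLATE,
        dataset_dir="project_directory/1_images",
        nickname="training-1-data",
        result_type="Probabilities",
    )
)
# prints: project_directory/2_probabilities/training-1-data_Probabilities.h5
```

Because `{dataset_dir}` points at the folder of the input image, the `..` step moves one level up to the project directory, which is what keeps the project portable.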
D.3 Boundary-based Segmentation with Multicut
D.3.1 Project setup
- Follow the documentation for boundary-based segmentation with Multicut.
- Create the 2_boundary_segmentation.ilp project file inside the project directory.
- In DT Watershed, use the input channel that corresponds to the class order you used under project setup (in this case input channel = 2, the boundary class).
D.3.2 Export Multicut Segmentation
In the export settings, change the settings to:
- Convert to Data Type: integer 8-bit
- Renormalize from [0.00, 1.00] to [0, 255]
- Format: compressed hdf5
- File: {dataset_dir}/../3_multicut/{nickname}_{result_type}.h5
D.4 Background Processing
For further processing in the object classification, the background needs to be eliminated from the multicut data sets. To do this, the next script sets the numerical value of the largest region to 0, so that it is shown as transparent in the next step of the workflow. This operation is performed in place on all .*data_Multicut Segmentation.h5 files in project_directory/3_multicut/.
- call the `background-processing` script from the cmd prompt
- select "-c" and enter the path to config.toml
- enter "-m training" for training mode
- whole command: `background-processing -c \path\to\config.toml -m training`
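The idea behind this step can be illustrated on a tiny label image. The sketch below is not the caactus implementation (which works on the .h5 files); it uses plain Python lists, assumes that "largest region" means the label covering the most pixels, and the function name `zero_largest_region` is hypothetical.

```python
from collections import Counter


def zero_largest_region(labels):
    """Return a copy of a 2D label grid with the most frequent label set to 0.

    Assumption: the "largest region" is the label that covers the most pixels.
    """
    counts = Counter(value for row in labels for value in row)
    background, _ = counts.most_common(1)[0]
    return [[0 if value == background else value for value in row] for row in labels]


# Toy segmentation: label 1 covers the most pixels, so it is treated as background.
segmentation = [
    [1, 1, 1, 1],
    [1, 2, 2, 1],
    [1, 3, 1, 1],
]
for row in zero_largest_region(segmentation):
    print(row)
# [0, 0, 0, 0]
# [0, 2, 2, 0]
# [0, 3, 0, 0]
```

The cell labels 2 and 3 are left untouched; only the background label changes, which is why it is displayed as transparent in the object classification.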
D.5. Object Classification
D.5.1 Project setup
- Follow the documentation for object classification.
- define your cell types plus an additional category for "not usable" objects, e.g. cell debris and cut-off objects at the edges of the images
D.5.2 Export Object Information
In Choose Export Image Settings, change the settings to:
- Convert to Data Type: integer 8-bit
- Renormalize from [0.00, 1.00] to [0, 255]
- Format: compressed hdf5
- File: {dataset_dir}/../4_objectclassification/{nickname}_{result_type}.h5
In Configure Feature Table Export General, change the settings to:
- File: {dataset_dir}/../4_objectclassification/{nickname}.csv as the output directory, and choose the format .csv
- select your features of interest for exporting
E Batch Processing
- Follow the documentation for batch processing
- store the images you want to process in the 0_2_original_tif_batch_images directory
- Perform steps D.2 to D.5 in batch mode, as explained in detail below (E.2 to E.5)
E.1 Rename Files
- Rename the .tif files so that they contain information about your cells and experimental conditions.
- Create a csv file that contains the information you need in columns. Each row corresponds to one image. Follow the same order as the sequence of image acquisition.
- the only hardcoded columns that have to be added are biorep for "biological replicate" and techrep for "technical replicate". They are needed in the downstream analysis for calculating the averages.
- The script will rename your files in the following format: columnA-value1_columnB-value2_columnC_etc.tif, e.g. picture 1 (well A1 from our plate) will be named strain-ATCC11559_date-20241707_timepoint-6h_biorep-A_techrep-1.tif
- Call the `rename` script from the cmd prompt to rename all your original .tif files to their new names.
- whole command: `rename -c \path\to\config.toml`
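The naming scheme can be sketched in a few lines of Python. This is an illustration of the format only, not the code of the rename script: the CSV content is a made-up example based on the file name given above, and the helper `build_names` is hypothetical.

```python
import csv
import io

# Hypothetical renaming table: one row per image, in the order of acquisition.
RENAMING_CSV = """strain,date,timepoint,biorep,techrep
ATCC11559,20241707,6h,A,1
ATCC11559,20241707,6h,A,2
"""


def build_names(csv_text, extension=".tif"):
    """Build one 'column-value_column-value' file name per CSV row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        "_".join(f"{column}-{value}" for column, value in row.items()) + extension
        for row in reader
    ]


for name in build_names(RENAMING_CSV):
    print(name)
# strain-ATCC11559_date-20241707_timepoint-6h_biorep-A_techrep-1.tif
# strain-ATCC11559_date-20241707_timepoint-6h_biorep-A_techrep-2.tif
```

Since the underscore separates the columns and the first hyphen separates column name from value, it is safest to avoid underscores in the values themselves.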
E.2 Batch Processing Pixel Classification
- open the 1_pixel_classification.ilp project file
- under Prediction Export, change the export directory to File: {dataset_dir}/../6_batch_probabilities/{nickname}_{result_type}.h5
- under Batch Processing > Raw Data, select all files from 5_batch_images
E.3 Batch Processing Multicut Segmentation
- open the 2_boundary_segmentation.ilp project file
- under Choose Export Image Settings, change the export directory to File: {dataset_dir}/../7_batch_multicut/{nickname}_{result_type}.h5
- under Batch Processing > Raw Data, select all files from 5_batch_images
- under Batch Processing > Probabilities, select all files from 6_batch_probabilities
E.4 Background Processing
For further processing in the object classification, the background needs to be eliminated from the multicut data sets. To do this, the next script sets the numerical value of the largest region to 0, so that it is shown as transparent in the next step of the workflow. This operation is performed in place on all .*data_Multicut Segmentation.h5 files in project_directory/7_batch_multicut/.
- call the `background-processing` script from the cmd prompt
- enter "-m batch" for batch mode
- whole command: `background-processing -c \path\to\config.toml -m batch`
E.5 Batch processing Object classification
- under Choose Export Image Settings, change the export directory to File: {dataset_dir}/../8_batch_objectclassification/{nickname}_{result_type}.h5
- in Configure Feature Table Export General, choose {dataset_dir}/../8_batch_objectclassification/{nickname}.csv as the output directory and the format .csv
- select your features of interest for exporting
- under Batch Processing > Raw Data, select all files from 5_batch_images
- under Batch Processing > Segmentation Image, select all files from 7_batch_multicut
F Post-Processing and Data Analysis
- Please be aware that the last two scripts, summary_statistics.py and pln_modelling.py, are at this stage written for the analysis and visualization of two independent variables.
F.1 Merging Data Tables and Table Export
The next script will combine all tables from all images into one global table for further analysis. Additionally, the information stored in the file name will be added as columns to the dataset.
- call the csv_summary.py script from the cmd prompt
- whole command: `python csv_summary.py`
- Technically, from this point on you can continue with whatever software or workflow is easiest for you for the subsequent data analysis.
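How the information in a file name can be turned back into columns while the tables are merged is sketched below. This is not the code of csv_summary.py: the helpers `parse_file_name` and `merge_tables` are hypothetical, the table rows are invented, and the handling of suffixes that ilastik appends to the exported table names is left out.

```python
def parse_file_name(file_name):
    """Split 'key-value_key-value.ext' into a dict of column -> value."""
    stem = file_name.rsplit(".", 1)[0]
    fields = {}
    for part in stem.split("_"):
        key, _, value = part.partition("-")  # split at the first hyphen only
        fields[key] = value
    return fields


def merge_tables(tables):
    """Combine per-image rows into one list and add the file-name columns to each row."""
    merged = []
    for file_name, rows in tables.items():
        meta = parse_file_name(file_name)
        for row in rows:
            merged.append({**meta, **row})
    return merged


# Hypothetical per-image tables; column names and values are invented.
tables = {
    "strain-ATCC11559_date-20241707_timepoint-6h_biorep-A_techrep-1.tif": [
        {"object_id": "1", "class": "class_a"},
        {"object_id": "2", "class": "class_b"},
    ],
    "strain-ATCC11559_date-20241707_timepoint-6h_biorep-A_techrep-2.tif": [
        {"object_id": "1", "class": "class_a"},
    ],
}
for row in merge_tables(tables):
    print(row)
```

Each merged row then carries both the measured object information and the experimental conditions, so the global table can be grouped by any of the file-name columns.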
F.2 Creating Summary Statistics
- call the summary_statistics.py script from the cmd prompt
- whole command: `summary_statistics`
- if working with EUCAST antifungal susceptibility testing, call `summary_statistics_eucast`
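The biorep and techrep columns introduced during renaming are what make replicate averages possible. The sketch below shows one plausible way to average, first over technical replicates and then over biological replicates; it is an assumption for illustration and not necessarily what summary_statistics.py computes. The counts and the helper `average_replicates` are made up.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical cell counts per image, keyed by (biorep, techrep).
counts = {
    ("A", "1"): 120,
    ("A", "2"): 100,
    ("B", "1"): 90,
    ("B", "2"): 110,
}


def average_replicates(counts):
    """Average technical replicates within each biological replicate, then across them."""
    per_biorep = defaultdict(list)
    for (biorep, _techrep), value in counts.items():
        per_biorep[biorep].append(value)
    biorep_means = {biorep: mean(values) for biorep, values in per_biorep.items()}
    return biorep_means, mean(list(biorep_means.values()))


biorep_means, overall = average_replicates(counts)
print(biorep_means)  # {'A': 110, 'B': 100}
print(overall)       # 105
```

Averaging the technical replicates first keeps a biological replicate with more images from dominating the overall mean.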
F.3 PLN Modelling
- call the pln_modelling.py script from the cmd prompt
- whole command: `pln_modelling`
- please note: the limit of categories for display in the PCA plot is n=15