Skip to main content

ImSpiRE is a python script (python 3.8+) for spatial resolution enhancenment by solving the entropic regularized fused Gromov-Wasserstein transport (FGW) problem for in situ capturing (ISC) datasets.

Project description

ImSpiRE: Image-aided Spatial Resolution Enhancement

label1 label2

| Overview | Installation | Quick Start | Parameter Details | Run ImSpiRE in Python interface | Data&Code | Citation |

Figure1

Overview

ImSpiRE is an Image-aided Spatial Resolution Enhancement method for in situ capturing (ISC) datasets. It is written in Python 3.8 and requires CellProfiler 4.2.1. It is available as a command line tool and a Python library to meet the needs of different users. We also commit ImSpiRE as a docker image, which eliminates the cumbersome installation and allows for portability across different computing systems.

ImSpiRE arranges virtual spots densely on the tissue section, and extract image features of sub-images from original spots and virtual spots in the histology image. Then, it calculates an optimal probabilistic embedding from the original spots to the virtual spots to construct a high-resolution spatial transcriptional profiling, according to the assumptions that all spots with similar gene expression levels are close to each other in spatial dimension and have similar image features in image dimension.

Installation

Installation as a docker image

For convenience, we committed ImSpiRE as a docker image, which eliminates the cumbersome installation and allows for portability across different computing systems. You can use the command below to pull the image, then start a container based on the image and run ImSpiRE in the container. It will take a few minutes to pull the image.

$ docker pull tongjizhanglab/imspire:1.1.1
$ docker run -it --name `whoami`_imspire -v ~:/home fb6f4c5bb574 /bin/bash

Here, fb6f4c5bb574 is the IMAGE ID of ImSpiRE.

Installation as a Python library

ImSpiRE has been packaged and uploaded to PyPI. Before your installation, ensure that you have pip available. pip3 is the package installer for Python. If you do not have pip3 on your machine, click here to install it.

ImSpiRE and other relevant packages can be installed using a single command.

$ pip3 install imspire 

You may need to use the command below to add the default installation path of pip3 to your system path.

$ export PATH=~/.local/bin:$PATH

Then, type the command below to check whether ImSpiRE has been installed successfully.

$ ImSpiRE -h

You can also download CellProfiler to extract image features. To avoid installation problems, we recommend installing CellProfiler 4.2.1 in advance. CellProfiler 4 should be pip installable in Python 3.8+ when a number of prerequisite packages are installed. More details can be found here.

$ pip3 install cellprofiler==4.2.1

Or you can install CellProfiler 4.2.1 via source code.

$ wget -O core-4.2.1.tar.gz https://github.com/CellProfiler/core/archive/refs/tags/v4.2.1.tar.gz
$ wget -O CellProfiler-4.2.1.tar.gz https://github.com/CellProfiler/CellProfiler/archive/refs/tags/v4.2.1.tar.gz
$ tar -xzvf CellProfiler-4.2.1.tar.gz
$ tar -xzvf core-4.2.1.tar.gz

Due to problems with package dependencies, we recommend modifying the setup.py files of cellprofiler and cellprofiler-core to change the pyzmq==18.0.1 to pyzmq==18.1.1, as they say too here. And now, you can install cellprofiler-core and cellprofiler in order.

$ cd core-4.2.1
$ pip3 install .
$ cd CellProfiler-4.2.1
$ pip3 install .

Then, type the command below to check whether CellProfiler has been installed successfully.

$ cellprofiler -h

We note that you may encounter a failure to install wxPython 4.1.0 when installing CellProfiler. You can download the corresponding WHL file and then install it. As an example, for Ubuntu 18.0.4, you can install wxPython 4.1.0 with the following code.

$ wget https://extras.wxpython.org/wxPython4/extras/linux/gtk3/ubuntu-18.04/wxPython-4.1.0-cp38-cp38-linux_x86_64.whl
$ pip3 install wxPython-4.1.0-cp38-cp38-linux_x86_64.whl

You may also need to download CellProfiler piplines used in ImSpiRE.

$ wget https://github.com/Yizhi-Zhang/ImSpiRE/raw/master/CellProfiler_piplines/cellprofiler_piplines.zip
$ unzip cellprofiler_piplines.zip

ImSpiRE uses two different methods of extracting the foreground, one of which uses the Python package backgroundremover. Please note that when you first run the program, it will check to see if you have the u2net models, if you do not, it will get them from u2net's google drive, as they say too here. Download the pre-trained model u2net.pth and push it into the dirctory ~/.u2net.

Quick Start

Step 1. ImSpiRE installation following the above tutorial.

Step 2. Input preparation

ImSpiRE utilizes the count file in tab-delimited format or hierarchical-data format (HDF5 or H5) and the image file in TIFF format, as well as a file containing spot coordinates as input.

We provided a small test dataset containing the raw count matrix, image and spot coordinates. A CellProfiler pipeline is also included in the test dataset for use if required.

Or type the command below to download.

$ wget https://github.com/Yizhi-Zhang/ImSpiRE/raw/master/test/test_data.zip
$ unzip test_data.zip

Step 3. Operation of ImSpiRE

Type the command below to run ImSpiRE.

$ ImSpiRE -i test_data/ -c test_data/count_matrix.tsv -s test_data/image.tif -p ST -n test_output -O

Or use CellProfiler to extract image features.

$ ImSpiRE -i test_data/ -c test_data/count_matrix.tsv -s test_data/image.tif -p ST -n test_output -m 2 --CellProfilerParam_Pipeline Cellprofiler_Pipeline_HE.cppipe -O

Step 4. Output

The contents of the output directory in tree format will be displayed as described below, including the high-resolution spatial transcriptional profiling stored in Anndata h5ad file format, the text file containing patch coordinates, the spot sub-images and patch sub-images if CellProfiler is used to extract image features, the image features, the matrices involved in OT and other supplementary results.

PATH/ProjectName
├── ProjectName_ResolutionEnhancementResult.h5ad  ## the high-resolution spatial transcriptional profiling
├── ProjectName_PatchLocations.txt  ## the text file containing patch coordinates
├── ImageResults  ## the spot and patch sub-images if CellProfiler is used to extract image features   ├── SpotImage
│   └── PatchImage
├── FeatureResults  ## the image features   ├── SpotFeature
│   └── PatchFeature
└── SupplementaryResults  ## the matrices involved in OT and other supplementary results
    ├── ot_C1_alpha_beta_epsilon.npy
    ├── ot_C2_alpha_beta_epsilon.npy
    ├── ot_M_alpha_beta_epsilon.npy
    ├── ot_T_alpha_beta_epsilon.npy
    └── ...

Parameter Details

The parameter details of ImSpiRE are as follows:

usage: ImSpiRE [-h] <-i InputDir> <-c filtered_feature_bc_matrix.h5> <-s image.tif> [-n ProjectName] [-o OutputDir] [options]

ImSpiRE is an Image-aided Spatial Resolution Enhancenment method for in situ capturing (ISC) datasets.

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show version number of ImSpiRE and exit
  -i INPUT_DIR, --Input_Dir INPUT_DIR
                        This indicates the path to the directory for input
                        datafiles. For 10X Visium, it is similar to the
                        standard output format of Space Ranger, which should
                        contain a count file in the {Input_Dir} and a text
                        file named "tissue_positions_list.csv" that describes
                        the spot locations in the {Input_Dir}/spatial/. For
                        ST, it should contain a count file and a text file
                        named "pxl_pos.txt" that describes the spot locations
                        in the {Input_Dir}. Note that the file
                        "tissue_positions_list.csv" does not contain a header
                        column, while the file "pxl_pos.txt" includes a header
                        column and contains two columns, which represent the
                        pixel coordinates of rows and columns of spots,
                        respectively.
  -c INPUT_COUNT_FILE, --Input_Count_File INPUT_COUNT_FILE
                        The input count file. It would typically be
                        "filtered_feature_bc_matrix.h5" and "count_matrix.tsv"
                        for 10X Visium and ST, respectively. Note that the
                        count file of ST platform should be tab-delimited.
  -s INPUT_SECTION_IMAGE_FILE, --Input_Section_Image_File INPUT_SECTION_IMAGE_FILE
                        The high-resolution tissue image, which should be a
                        Tagged Image File Format (TIF or TIFF) file.
  -t {H&E,IF}, --Input_Image_Type {H&E,IF}
                        The types of staining, including Haematoxylin & Eosin
                        (H&E) staining and Immunofluorescence (IF) staining.
                        DEFAULT: "H&E".
  -o OUTPUT_DIR, --Output_Dir OUTPUT_DIR
                        The project directory, which is used to save all
                        output files. DEFAULT: "./".
  -n OUTPUT_NAME, --Output_Name OUTPUT_NAME
                        The project name, which is used to generate output
                        file names as a prefix. DEFAULT: "ImSpiRE".
  -p {Visium,ST}, --Platform {Visium,ST}
                        The platform that generates the dataset, Visium or ST.
                        DEFAULT: "Visium".
  -m {1,2}, --Mode {1,2}
                        Two types of extracted image features. When this
                        parameter is set to 1, ImSpiRE will extract intensity
                        and texture features of the image, which are the
                        objective features of the image itself. When this
                        parameter is set to 2, ImSpiRE will use CellProfiler
                        to extract image features, which are more biologically
                        significant. DEFAULT: 1.
  -O, --Overwriting     The switch of overwriting. If add the parameter "-O",
                        ImSpiRE will overwrite the exiting folders.
  --Verbose             The verbose flag. If add the parameter "--Verbose",
                        ImSpiRE will verbose the output.
  --Random_State RANDOM_STATE
                        Fix the seed for reproducibility. DEFAULT: 0.
  --Switch_Preprocess {ON,OFF}
                        The switch of basic filtering of spots and genes, ON
                        or OFF. DEFAULT: "ON".
  --Threshold_MinCounts THRESHOLD_MINCOUNTS
                        The minimum number of counts required for a spot to
                        pass filtering, which is enabled only if "--
                        Switch_SpotFilter" is "ON". DEFAULT: 100.
  --Threshold_MaxCounts THRESHOLD_MAXCOUNTS
                        The maximum number of counts required for a spot to
                        pass filtering, which is enabled only if "--
                        Switch_SpotFilter" is "ON". DEFAULT: 10000.
  --Threshold_MitoPercent THRESHOLD_MITOPERCENT
                        The maximum count percent of mitochondrial genes
                        required for a spot to pass filtering, which is
                        enabled only if "--Switch_SpotFilter" is "ON".
                        DEFAULT: 20.
  --Threshold_MinSpot THRESHOLD_MINSPOT
                        The minimum number of spots expressed required for a
                        gene to pass filtering, which is enabled only if "--
                        Switch_SpotFilter" is "ON". DEFAULT: 10.
  --ImageParam_CropSize IMAGEPARAM_CROPSIZE
                        The pixel size of each patch subimage. For example, "
                        --ImageParam_CropSize 100" means each patch subimage
                        is 100*100 pixels. DEFAULT: 100.
  --ImageParam_PatchDist IMAGEPARAM_PATCHDIST
                        The pixel distance between adjacent patches. DEFAULT:
                        100.
  --ImageParam_TotalChannelNumber {3,4}
                        The total number of channels, which is needed only if
                        "--Input_Image_Type" is "IF". For example, "--
                        ImageParam_TotalChannelNumber 3" means the TIFF file
                        contain three channels.
  --ImageParam_DAPIChannel {1,2,3,4}
                        The channel of the DAPI, which is needed only if "--
                        Input_Image_Type" is "IF". For example: "--
                        ImageParam_DAPIChannel 1".
  --ImageParam_FiducialFrameChannel {1,2,3,4}
                        The channel of the fiducial frame, which is needed
                        only if "--Input_Image_Type" is "IF". For example: "--
                        ImageParam_FiducialFrameChannel 3".
  --FeatureParam_ProcessNumber FEATUREPARAM_PROCESSNUMBER
                        The number of worker processes to create when
                        extracting texture and intensity features, which is
                        used when "-m" is 1. DEFAULT: 10.
  --FeatureParam_FeatureType {0,1,2}
                        This determines which type of image features to use
                        when "-m" is 1. 0 for both texture and intensity
                        features, 1 for texture features only and 2 for
                        intensity features only. DEFAULT: 0.
  --FeatureParam_ClipLimit FEATUREPARAM_CLIPLIMIT
                        The clipping limit, which is used when "-m" is 1. It
                        is normalized between 0 and 1, with higher values
                        representing more contrast. DEFAULT: 0.01.
  --FeatureParam_IterCount FEATUREPARAM_ITERCOUNT
                        Number of iterations Grabcut image segmentation
                        algorithm should make before returning the result.
                        DEFAULT: 50.
  --CellProfilerParam_Pipeline CELLPROFILERPARAM_PIPELINE
                        The path to the CellProfiler pipline. It would be
                        better to use different piplines for H&E and IF
                        samples. For H&E samples,
                        "Cellprofiler_Pipeline_HE.cppipe" is recommended. For
                        IF samples, "Cellprofiler_Pipeline_IF_C3/4.cppipe" is
                        recommended based on the total number of channels. In
                        the docker image, the pipelines are stored in "/root".
                        You can also download them from github to your own
                        working directory. DEFAULT:
                        "Cellprofiler_Pipeline_HE.cppipe".
  --CellProfilerParam_KernelNumber CELLPROFILERPARAM_KERNELNUMBER
                        This option specifies the number of kernel to use to
                        run CellProfiler. DEFAULT: 10.
  --OptimalTransportParam_Alpha OPTIMALTRANSPORTPARAM_ALPHA
                        The constant interpolating between image features cost
                        matrices and locations cost matirces, ranging from 0
                        to 1. For example, "--OptimalTransportParam_Alpha 0.5"
                        means ImSpiRE will equally consider the weight of
                        similarity from the saptial and image dimensions. The
                        larger the value of alpha is, the more ImSpiRE
                        considers the weight of similarity from the image
                        dimension. DEFAULT: 0.5.
  --OptimalTransportParam_Beta OPTIMALTRANSPORTPARAM_BETA
                        The trade-off parameter of Fused-gromov-Wasserstein
                        transport ranging from 0 to 1. DEFAULT: 0.5.
  --OptimalTransportParam_NumNeighbors OPTIMALTRANSPORTPARAM_NUMNEIGHBORS
                        The number of neighbors for nearest neighbors graph.
                        DEFAULT: 5.
  --OptimalTransportParam_Epsilon OPTIMALTRANSPORTPARAM_EPSILON
                        The entropic regularization term with value greater
                        than 0. DEFAULT: 0.001.
  --OptimalTransportParam_NumIterMax OPTIMALTRANSPORTPARAM_NUMITERMAX
                        Max number of iterations when solving the OT problem.
                        DEFAULT: 10.

Run ImSpiRE in Python interface

ImSpiRE can also be used step by step in the Python interface and easily integrated into custom scripts. Here is a tutorial written using Jupyter Notebook.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

imspire-1.1.1.tar.gz (32.9 kB view hashes)

Uploaded Source

Built Distribution

imspire-1.1.1-py3-none-any.whl (29.9 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page