3D shape analysis using deep learning

Cellshape logo by Matt De Vries

3D single-cell shape analysis of cancer cells using geometric deep learning

This is a Python package for learning 3D cell shape features and classes using deep learning. Please refer to our preprint on bioRxiv for details.

cellshape is the main package which imports from sub-packages:

  • cellshape-helper: Facilitates point cloud generation from 3D binary masks.
  • cellshape-cloud: Implementations of graph-based autoencoders for shape representation learning on point cloud input data.
  • cellshape-voxel: Implementations of 3D convolutional autoencoders for shape representation learning on voxel input data.
  • cellshape-cluster: Implementation of deep embedded clustering to add to autoencoder models.

Installation and requirements

Dependencies

The software requires Python 3.7 or greater, PyTorch, torchvision, pyntcloud, numpy, scikit-learn, tensorboard, tqdm, and datetime. This repo makes extensive use of cellshape-cloud, cellshape-cluster, cellshape-helper, and cellshape-voxel. To reproduce the results in our paper, only cellshape-cloud and cellshape-cluster are needed.

To install

  1. We recommend creating a new conda environment:
conda create --name cellshape-env python=3.8
conda activate cellshape-env
pip install --upgrade pip
  2. Install cellshape from pip:
pip install cellshape
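
After installing, one quick way to confirm that the dependencies listed above can be imported is sketched below. The module names are assumptions about each package's import name (for example, scikit-learn is imported as sklearn), and missing_modules is a hypothetical helper, not part of cellshape.

```python
from importlib.util import find_spec

# Assumed import names for the dependencies listed in this README.
# scikit-learn installs the "sklearn" module; the rest are assumed to
# match their package names.
REQUIRED_MODULES = [
    "torch",
    "torchvision",
    "pyntcloud",
    "numpy",
    "sklearn",
    "tensorboard",
    "tqdm",
    "cellshape",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]
```

Calling missing_modules() inside the activated cellshape-env environment should return an empty list if everything installed correctly.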

Hardware requirements

We have tested this software on Ubuntu 20.04 LTS with 128 GB of RAM and an NVIDIA Quadro RTX 6000 GPU.

Data structure

Our data is structured in the following way:

cellshapeData/
    all_data_stats.csv
    Plate1/
        stacked_pointcloud/
            Binimetinib/
                0010_0001_accelerator_20210315_bakal01_erk_main_21-03-15_12-37-27.ply
                ...
            Blebbistatin/
            ...
    Plate2/
        stacked_pointcloud/
    Plate3/
        stacked_pointcloud/
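
Before training, it can be useful to check that your own data follows this layout. The sketch below counts the .ply files per plate and treatment. It assumes exactly the folder structure shown above; count_point_clouds is a hypothetical helper and not part of cellshape.

```python
from collections import Counter
from pathlib import Path


def count_point_clouds(root):
    """Count .ply files per (plate, treatment) under root.

    Assumes the layout:
    root/<Plate>/stacked_pointcloud/<Treatment>/<file>.ply
    """
    counts = Counter()
    for ply_path in sorted(Path(root).glob("*/stacked_pointcloud/*/*.ply")):
        treatment = ply_path.parent.name   # e.g. "Binimetinib"
        plate = ply_path.parents[2].name   # e.g. "Plate1"
        counts[(plate, treatment)] += 1
    return counts
```

For example, count_point_clouds("path/to/cellshapeData/") returns a Counter keyed by (plate, treatment) pairs. An empty result suggests the path or the folder names do not match the expected structure.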

Data availability

Datasets to reproduce our results in our paper are available here.

Usage

The following steps assume that you already have point cloud representations of cells or nuclei. If you need to generate point clouds from 3D binary masks, see cellshape-helper.

The training procedure follows two steps:

  1. Training the dynamic graph convolutional foldingnet (DFN) autoencoder to automatically learn shape features.
  2. Adding the clustering layer to refine shape features and learn shape classes simultaneously.

Inference can be done after each step.

For help on all command line options run:

cellshape-train -h

1. Train DFN autoencoder

# Change the two data paths below to where you saved the data.
cellshape-train \
--model_type "cloud" \
--train_type "pretrain" \
--cloud_dataset_path "path/to/cellshapeData/" \
--dataset_type "SingleCell" \
--dataframe_path "path/to/cellshapeData/all_data_stats.csv" \
--output_dir "path/to/output/" \
--num_epochs_autoencoder 250 \
--encoder_type "dgcnn" \
--decoder_type "foldingnetbasic" \
--num_features 128

This step creates the output directory "path/to/output/" with the subfolders nets, reports, and runs, which contain the model weights, logged outputs, and TensorBoard runs, respectively, for each experiment. Each experiment is named with the convention {encoder_type}_{decoder_type}_{num_features}_{train_type}_{xxx}, where {xxx} is a counter. For example, if this is the first experiment you have run, the trained model weights are saved to: path/to/output/nets/dgcnn_foldingnetbasic_128_pretrain_001.pt.
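
To make the naming convention concrete, the sketch below builds the next experiment name for a given nets folder. It is only an illustration: next_experiment_name is a hypothetical helper, and the rule for the counter (one more than the highest counter already in the folder) is an assumption, not necessarily how cellshape-train picks it.

```python
from pathlib import Path


def next_experiment_name(nets_dir, encoder_type, decoder_type, num_features, train_type):
    """Return a name such as dgcnn_foldingnetbasic_128_pretrain_001."""
    prefix = f"{encoder_type}_{decoder_type}_{num_features}_{train_type}_"
    used = []
    nets_dir = Path(nets_dir)
    if nets_dir.is_dir():
        for path in nets_dir.glob(prefix + "*.pt"):
            suffix = path.stem[len(prefix):]
            if suffix.isdigit():
                used.append(int(suffix))
    # Assumed rule: one more than the highest existing counter, zero-padded to 3 digits.
    return f"{prefix}{max(used, default=0) + 1:03d}"
```

With an empty nets folder and the settings from step 1, this gives dgcnn_foldingnetbasic_128_pretrain_001, matching the example above.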

To monitor training with TensorBoard, run:

tensorboard --logdir "path/to/output/runs/"

2. Add clustering layer to refine shape features and learn shape classes simultaneously

# Pretraining was done in the previous step, so --pretrain is set to False.
cellshape-train \
--model_type "cloud" \
--train_type "DEC" \
--pretrain False \
--cloud_dataset_path "path/to/cellshapeData/" \
--dataset_type "SingleCell" \
--dataframe_path "path/to/cellshapeData/all_data_stats.csv" \
--output_dir "path/to/output/" \
--num_epochs_clustering 250 \
--num_features 128 \
--num_clusters 5 \
--pretrained_path "path/to/output/nets/pretrained_autoencoder.pt" # path/to/output/nets/dgcnn_foldingnetbasic_128_pretrain_001.pt in our example
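
For readers unfamiliar with deep embedded clustering (DEC), the sketch below illustrates what a clustering layer typically computes, following the standard DEC formulation: soft cluster assignments from a Student's t kernel, a sharpened target distribution, and a KL-divergence loss between the two. Whether cellshape-cluster uses exactly these formulas is an assumption. The function names are hypothetical, and plain Python lists are used only to keep the example self-contained (a real implementation would operate on PyTorch tensors).

```python
import math


def soft_assignments(embeddings, centroids, alpha=1.0):
    """q[i][j]: probability of assigning embedding i to cluster j (Student's t kernel)."""
    q = []
    for z in embeddings:
        weights = []
        for mu in centroids:
            sq_dist = sum((zi - mi) ** 2 for zi, mi in zip(z, mu))
            weights.append((1.0 + sq_dist / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(weights)
        q.append([w / total for w in weights])
    return q


def target_distribution(q):
    """p[i][j]: sharpened targets, q squared and normalised by soft cluster frequency."""
    n_clusters = len(q[0])
    cluster_freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        weights = [qij ** 2 / fj for qij, fj in zip(row, cluster_freq)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p


def kl_divergence(p, q):
    """Clustering loss KL(P || Q), summed over all embeddings and clusters."""
    return sum(
        pij * math.log(pij / qij)
        for p_row, q_row in zip(p, q)
        for pij, qij in zip(p_row, q_row)
    )
```

In this formulation, --num_clusters would set the number of centroids and --num_features the length of each embedding. Minimising the KL divergence pulls each embedding towards the centroid it is already most confidently assigned to, which is how shape features and shape classes can be refined together.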

For developers

  • Fork the repository
  • Clone your fork
git clone https://github.com/USERNAME/cellshape
  • Install an editable version (-e) with the development requirements (dev)
cd cellshape
pip install -e ".[dev]"
  • To install pre-commit hooks to ensure formatting is correct:
pre-commit install
  • To release a new version:

Firstly, update the version with bump2version (bump2version patch, bump2version minor or bump2version major). This will increment the package version (to a release candidate - e.g. 0.0.1rc0) and tag the commit. Push this tag to GitHub to run the deployment workflow:

git push --follow-tags

Once the release candidate has been tested, the release version can be created with:

bump2version release

References

[1] An Tao, 'Unsupervised Point Cloud Reconstruction for Classific Feature Learning', GitHub Repo, 2020
