Histomic Atlases of Variation Of Cancers (HAVOC)

HAVOC is a versatile tool that maps histomic heterogeneity across H&E-stained digital slide images, helping to guide the regional deployment of molecular resources to the most relevant/biodiverse tumor niches.

Cloud usage

To run HAVOC on the cloud, explore it on https://www.codido.co.

Installation

Use the package manager pip to install havoc-clustering.

NOTE: TensorFlow and OpenSlide (https://openslide.org/download/) must also be installed.

pip install havoc-clustering
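
Because TensorFlow and OpenSlide are not pulled in automatically, it can help to confirm that everything is importable before running HAVOC. The following is a minimal sketch using only the Python standard library. The module names `tensorflow` and `openslide` (the import name of the openslide-python bindings) are assumptions about which packages your environment uses, and finding the `openslide` module does not prove that the native OpenSlide library itself loads.

```python
from importlib.util import find_spec

# Assumed mapping of requirement name -> top-level module name.
REQUIRED_MODULES = {
    "havoc-clustering": "havoc_clustering",
    "tensorflow": "tensorflow",
    "openslide-python": "openslide",
}


def missing_requirements(modules=REQUIRED_MODULES):
    """Return the requirement names whose top-level module cannot be found."""
    return [name for name, module in modules.items() if find_spec(module) is None]


missing = missing_requirements()
print("Missing requirements:", ", ".join(missing) if missing else "none")
```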

Usage

from havoc_clustering.havoc import HAVOC
from havoc_clustering.general_utility.slide import Slide

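# placeholder: replace with the path to your slide image
slide_path = 'path/to/your/slide'

# create a new Slide object that represents the image.
# img_requirements lets you filter out undesired images, e.g. when looping over many slides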
s = Slide(
    slide_path,
    img_requirements={
        'compression': [70],
        'mpp': None  # all mpp values (magnification) are supported currently
    }
)

# instantiating HAVOC requires the following:
# 1. a Slide object
# 2. the path to a TensorFlow model that acts as the feature extractor on which clustering is based.
#    The model used in our work can be found at https://bitbucket.org/diamandislabii/faust-feature-vectors-2019/src/master/models/74_class/
# 3. the directory to save output to
# 4. the size of the tiles to extract from the slide. The default is 1024 (the size the model above was trained on); this example uses 512
# 5. hd_backdrop: by default, the slide's resized thumbnail is used as the background for the colortile map.
#    Set it to True for an HD backdrop at the expense of time
feature_extractor_path = 'path/to/feature_extractor_model'  # placeholder
save_dir = 'path/to/output_dir'  # placeholder
havoc = HAVOC(s, feature_extractor_path, save_dir, tile_size=512, hd_backdrop=False)

# running HAVOC requires the following:
# 1. the k values to use for clustering
# 2. the blank filter cutoff. 0.5 means a tile is used only if less than 50% (100-50) of it is blank,
#    i.e. tiles that are >50% blank are skipped; a non-conservative value restricts clustering to tiles with plentiful tissue
# 3. the name of the layer within the feature extractor model that generates the features
# 4. additional kwargs (OPTIONAL)
kwargs = {
    # saves a thumbnail image of the original slide
    'save_thumbnail': False,
    # make a dendrogram of the clustering used to make the colortile maps (generated for each k value)
    'make_dendrogram': True,
    # make a t-SNE plot of each color cluster (generated for each k value)
    'make_tsne': True,
    # make a Pearson correlation coefficient clustermap of each color cluster (generated for each k value)
    'make_corr_map': True,
    # save the tiles belonging to each color cluster within the colortile map for a given k
    # e.g. [4, 9] would save the colored tiles belonging to k=4 and k=9
    # NOTE: this should be a subset of k_vals
    'save_tiles_k_vals': []
}
havoc.run(k_vals=[9], min_non_blank_amt=0.5, layer_name='global_average_pooling2d_1', **kwargs)
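
To make the blank filter cutoff more concrete, here is a small sketch of how a non-blank fraction could be computed for one tile. It is not HAVOC's actual implementation: the grayscale representation, the `blank_threshold` of 220, and the helper names `non_blank_fraction` and `keep_tile` are all assumptions made for illustration. Only the meaning of `min_non_blank_amt` (a tile is used when less than 50% of it is blank at 0.5) comes from the usage notes above.

```python
def non_blank_fraction(tile, blank_threshold=220):
    """Fraction of pixels darker than blank_threshold.

    `tile` is a list of rows of grayscale values in 0-255 (hypothetical format);
    pixels at or above blank_threshold are treated as blank background.
    """
    pixels = [value for row in tile for value in row]
    if not pixels:
        return 0.0
    return sum(1 for value in pixels if value < blank_threshold) / len(pixels)


def keep_tile(tile, min_non_blank_amt=0.5, blank_threshold=220):
    """Keep the tile only if more than min_non_blank_amt of it is non-blank.

    The strict comparison at the boundary is an assumption.
    """
    return non_blank_fraction(tile, blank_threshold) > min_non_blank_amt


mostly_blank = [[255, 255], [100, 255]]   # 1 of 4 pixels contains tissue
mostly_tissue = [[100, 100], [100, 255]]  # 3 of 4 pixels contain tissue
print(keep_tile(mostly_blank), keep_tile(mostly_tissue))
```

Under these assumptions, raising `min_non_blank_amt` keeps fewer tiles, each with more tissue in it.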

Result output

  • Colortiled maps
  • CSV file of cluster info + DLFVs (cluster_info_df.csv)
  • Optionally:
    • Original slide thumbnail
    • t-SNE plots
    • Dendrograms
    • Correlation clustermap
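
The column layout of cluster_info_df.csv is not described here, so a quick first step is to read the header and count the rows. The sketch below uses only the standard library. The helper name `summarize_csv` and its optional `column` argument are hypothetical; pass a column name only after checking which names the header actually contains.

```python
import csv
from collections import Counter


def summarize_csv(csv_path, column=None):
    """Return (column names, number of rows, value counts for `column` if given)."""
    counts = Counter()
    n_rows = 0
    with open(csv_path, newline="") as f:
        reader = csv.DictReader(f)
        for row in reader:
            n_rows += 1
            if column is not None:
                counts[row[column]] += 1
        fieldnames = reader.fieldnames
    return fieldnames, n_rows, dict(counts)


# Example call (the path is a placeholder):
# columns, n_rows, _ = summarize_csv('path/to/output_dir/cluster_info_df.csv')
```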

Multi-slide correlation map

After running HAVOC on multiple slides, you may want to combine all the generated correlation clustermaps into a single large clustermap.

  1. Create a folder containing each slide's cluster_info_df.csv file
  2. Pass that folder to create_correlation_clustermap_multi_slide:

from havoc_clustering.correlation_of_dlfv_groups import create_correlation_clustermap_multi_slide

create_correlation_clustermap_multi_slide(folder_of_csvs, target_k=9)

NOTE: the target_k should be a k-value you ran HAVOC with
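
Gathering the CSV files into one folder can be scripted. The sketch below assumes that each slide was run with its own output subfolder under a common root, and it prefixes every copy with that subfolder's name so the files do not overwrite one another. Both the folder layout and the renaming scheme are assumptions, and the helper `collect_cluster_csvs` is hypothetical. Whether create_correlation_clustermap_multi_slide expects particular file names is not stated here, so check this against your installed version.

```python
import shutil
from pathlib import Path


def collect_cluster_csvs(output_root, folder_of_csvs, csv_name="cluster_info_df.csv"):
    """Copy every `csv_name` found under output_root into folder_of_csvs.

    Each copy is named '<parent folder>_<csv_name>' to avoid collisions.
    Returns the list of copied file paths.
    """
    dest = Path(folder_of_csvs)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    # sorted() materializes the matches before any file is copied
    for src in sorted(Path(output_root).rglob(csv_name)):
        target = dest / f"{src.parent.name}_{csv_name}"
        shutil.copy2(src, target)
        copied.append(target)
    return copied


# Example call (both paths are placeholders):
# collect_cluster_csvs('path/to/all_outputs', 'path/to/folder_of_csvs')
```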

Citation

Please refer to the paper "HAVOC: Small-scale histomic mapping of biodiversity across entire tumor specimens using deep neural networks".

License

GNU General Public License v3 (GPLv3)
