
Histomic Atlases of Variation Of Cancers (HAVOC)

HAVOC is a versatile tool that maps histomic heterogeneity across H&E-stained digital slide images to help guide regional deployment of molecular resources to the most relevant/biodiverse tumor niches

Cloud usage

Explore HAVOC at https://www.codido.co to run it in the cloud

Installation

Use the package manager pip to install havoc-clustering.

!! TensorFlow and OpenSlide (https://openslide.org/download/) must be installed separately !!

pip install havoc-clustering

Usage

from havoc_clustering.havoc import HAVOC
from havoc_clustering.general_utility.slide import Slide

# Create a new Slide object that represents the image.
# img_requirements filters out undesired images, e.g. when iterating over many slides in a loop
s = Slide(
    slide_path,
    img_requirements={
        'compression': [70],
        'mpp': None  # all mpp (magnification) values are currently supported
    }
)

# Instantiating HAVOC requires:
# 1. a Slide object
# 2. the path to a TensorFlow model that acts as the feature extractor the clustering is based on.
#    The model used in our work is available at https://bitbucket.org/diamandislabii/faust-feature-vectors-2019/src/master/models/74_class/
# 3. the directory to save output to
# 4. the size of the tiles to extract from the slide; the default is 1024 (the tile size the model above was trained on)
# 5. whether to use an HD backdrop for the colortile map; by default the slide's resized
#    thumbnail is used. Set hd_backdrop=True for an HD backdrop at the expense of runtime
havoc = HAVOC(s, feature_extractor_path, save_dir, tile_size=512, hd_backdrop=False)

# Running HAVOC requires:
# 1. the k values to use for clustering
# 2. the blank-filter cutoff: 0.5 means a tile must be less than 50% blank to be used,
#    i.e. tiles that are >50% blank are skipped. Raise this value to cluster only tiles
#    with plentiful tissue
# 3. the name of the layer in the feature extractor model that generates the features
# 4. additional kwargs; OPTIONAL
kwargs = {
    # saves a thumbnail image of the original slide
    'save_thumbnail': False,
    # make a dendrogram of the clustering used to make the colortile maps (generated for each k value)
    'make_dendrogram': True,
    # make a tsne of each color cluster (generated for each k value)
    'make_tsne': True,
    # make a Pearson coefficient clustermap of each color cluster (generated for each k value)
    'make_corr_map': True,
    # save the tiles belonging to each color cluster within the colortile map for a given k
    # ie [4,9] would save the colored tiles belonging to k=4 and k=9
    # NOTE: this should be a subset of k_vals
    'save_tiles_k_vals': []
}
havoc.run(k_vals=[9], min_non_blank_amt=0.5, layer_name='global_average_pooling2d_1', **kwargs)
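As a standalone illustration of the blank-filter cutoff described above (this is not HAVOC's internal implementation; the near-white pixel cutoff of 220 is an assumption made for this sketch):

```python
import numpy as np

def non_blank_fraction(tile: np.ndarray, white_cutoff: int = 220) -> float:
    """Fraction of pixels in an RGB tile darker than a near-white cutoff."""
    # A pixel counts as 'blank' when all three channels are >= the cutoff
    blank = np.all(tile >= white_cutoff, axis=-1)
    return 1.0 - float(blank.mean())

def keep_tile(tile: np.ndarray, min_non_blank_amt: float = 0.5) -> bool:
    # min_non_blank_amt=0.5 keeps tiles that are at least 50% tissue,
    # i.e. skips tiles that are more than 50% blank
    return non_blank_fraction(tile) >= min_non_blank_amt
```

A mostly white tile is rejected, while a tile with at least half of its pixels occupied by tissue passes the filter.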

Result output

  • Colortile maps
  • CSV file of cluster info + DLFVs (cluster_info_df.csv)
  • Optionally:
    • Original slide thumbnail
    • TSNEs
    • Dendrograms
    • Correlation clustermap

Multi-slide correlation map

After running HAVOC on multiple slides, you may want to combine all the generated correlation clustermaps into one mega clustermap.

  1. Create a folder containing each slide's cluster_info_df.csv file
  2. Call create_correlation_clustermap_multi_slide on that folder:
from havoc_clustering.correlation_of_dlfv_groups import create_correlation_clustermap_multi_slide

create_correlation_clustermap_multi_slide(folder_of_csvs, target_k=9)

NOTE: target_k must be one of the k values you ran HAVOC with

Citation

Please refer to the paper "HAVOC: Small-scale histomic mapping of biodiversity across entire tumor specimens using deep neural networks"

License

GNU General Public License v3 (GPLv3)
