Optimized TCRDist calculation for TCR repertoire data analysis
Project description
fast_tcrdist
fast_tcrdist is an optimized version of the TCRDist algorithm published by Dash et al. Nature (2017): doi:10.1038/nature22383
To enhance the original implementation of TCRDist, fast_tcrdist uses the Needleman-Wunsch alignment algorithm to align TCR sequences and creates the TCRDist matrix via cython. To integrate well with other common single cell analysis tools, fast_tcrdist utilizes the Anndata data structure to store the TCRDist matrix and associated metadata. Currently, this has been tested on TCR/gene expression output from Cellranger (https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/what-is-cell-ranger), but future releases will aim to allow for other file formats.
In addition to running the TCRDist algorithm, fast_tcrdist allows you to aggregate TCR info with single-cell gene expression into a single anndata object to allow for integrated downstream analyses.
Outside files
The BLOSUM62 matrix used for alignments came from NCBI:https://www.ncbi.nlm.nih.gov/IEB/ToolBox/C_DOC/lxr/source/data/BLOSUM62
CDR amino acid info was taken from the TCRDist database file "alphabeta_db.tsv" (https://www.dropbox.com/s/kivfp27gbz2m2st/tcrdist_extras_v2.tgz) and reformated into .json format (mouse_CDRs_for_10X.json and human_CDRs_for_10X.json)
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for fast_tcrdist-0.0.3-cp37-cp37m-win_amd64.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 785865ad3dc147a124300808274aaabb74c0306fcc0c9a4d054f1aa8744ac39d |
|
MD5 | cbe11c591dbd0ec74ba6c9ada8492dc0 |
|
BLAKE2b-256 | 70b065c0bef0b00ff1e59f1cb3893858c034e65eda015ec3c51c75f03be60380 |