tcrdist3
Flexible distance measures for comparing T cell receptors
tcrdist3 is a Python toolkit, usable through an API, for analyzing T cell receptor (TCR) repertoires. Some of the functionality and code is adapted from the original tcr-dist package, which was released with the publication of Dash et al., Nature (2017), doi:10.1038/nature22383. This package provides a new API for computing TCR distance measures as well as new features for biomarker development (bioRxiv (2020)). It has been expanded to include gamma-delta TCRs and has been recoded with numba, a high-performance just-in-time compiler, to increase CPU efficiency.
Installation
pip install tcrdist3
or
pip install git+https://github.com/kmayerb/tcrdist3.git@0.2.2
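One way to confirm that the installation succeeded is to query the installed version from Python. The sketch below uses only the standard library; the helper name `installed_version` is hypothetical and is not part of tcrdist3.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# "tcrdist3" is the distribution name used by `pip install tcrdist3`.
found = installed_version("tcrdist3")
print("tcrdist3 version:", found if found is not None else "not installed")
```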
Docker
docker pull quay.io/kmayerb/tcrdist3:0.2.2
User-Contributed Colab Notebook Examples Using tcrdist3
1. Example K Nearest Neighbor Classification using tcrdist3
(Author: Liel Cohen-Lavi). This notebook illustrates how to integrate tcrdist3 with scikit-learn's implementation of K Nearest Neighbor classification. TCRdist-based KNN classification performance on a set of labeled receptors is assessed with cross-validation or training/test splits. This simple method is proposed as a quickly implementable benchmark for the performance of more computationally intensive TCR-epitope specificity prediction approaches.
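The idea behind the notebook, KNN classification on a precomputed distance matrix, can be sketched in a few lines. This is a pure-Python illustration, not the notebook's code: the distances and epitope labels are made up, and `knn_predict` is a hypothetical helper. With scikit-learn, the same kind of matrix would typically be passed to `KNeighborsClassifier(metric="precomputed")`.

```python
from collections import Counter


def knn_predict(dist_to_train, train_labels, k=3):
    """Majority-vote label for each query row of a precomputed distance matrix.

    Rows are query receptors; columns are labeled training receptors.
    """
    predictions = []
    for row in dist_to_train:
        # Training indices ordered by increasing distance (ties broken by index).
        nearest = sorted(range(len(row)), key=lambda j: (row[j], j))[:k]
        votes = Counter(train_labels[j] for j in nearest)
        # most_common keeps first-seen order among equal counts, so a tied
        # vote falls back to the label of the closest tied neighbor.
        predictions.append(votes.most_common(1)[0][0])
    return predictions


# Toy data (assumed values): 5 labeled training receptors, 2 query receptors.
train_labels = ["PA", "PA", "NP", "NP", "NP"]
dist_to_train = [
    [12, 24, 96, 120, 108],  # query 0 is close to the "PA" receptors
    [110, 95, 18, 30, 36],   # query 1 is close to the "NP" receptors
]

predictions = knn_predict(dist_to_train, train_labels, k=3)
print(predictions)
```

In a train/test evaluation, the rows would hold distances from held-out receptors to the training receptors, and the predictions would be compared against the held-out labels.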
Package Documentation
More documentation can be found at tcrdist3.readthedocs.
Basic Usage
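import pandas as pd
from tcrdist.repertoire import TCRrep

# Load the example dataset.
df = pd.read_csv("dash.csv")

# Build a repertoire object for paired alpha/beta chains.
tr = TCRrep(cell_df = df,
            organism = 'mouse',
            chains = ['alpha','beta'],
            db_file = 'alphabeta_gammadelta_db.tsv')

# Pairwise distance matrices are stored as attributes of the TCRrep object.
tr.pw_alpha
tr.pw_beta
tr.pw_cdr3_a_aa
tr.pw_cdr3_b_aa

# Find the neighbors of each receptor within a beta-chain distance radius of 50.
from tcrdist.public import _neighbors_fixed_radius
_neighbors_fixed_radius(tr.pw_beta, 50)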
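The last call looks up, for each receptor, the receptors whose beta-chain distance falls within the given radius. A minimal pure-Python sketch of that idea follows. It assumes the threshold is inclusive (distance <= radius) and is not the library's implementation; the matrix values are made up, and `neighbors_within_radius` is a hypothetical name.

```python
def neighbors_within_radius(pw, radius):
    """For each row of a square distance matrix, list the column indices
    whose distance is at most `radius` (inclusive threshold, an assumption)."""
    return [[j for j, d in enumerate(row) if d <= radius] for row in pw]


# Toy symmetric distance matrix for four receptors (assumed values).
pw_toy = [
    [0, 24, 120, 48],
    [24, 0, 96, 60],
    [120, 96, 0, 150],
    [48, 60, 150, 0],
]

neighbors = neighbors_within_radius(pw_toy, 50)
print(neighbors)
```

Because the distance from a receptor to itself is 0, every receptor appears in its own neighbor list under this definition.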
Sparse Matrix Representation
import pandas as pd
from tcrdist.repertoire import TCRrep
from tcrdist.breadth import get_safe_chunk

df = pd.read_csv("dash.csv")

# Build a beta-chain-only repertoire without computing dense pairwise distances.
tr = TCRrep(cell_df = df[['subject','epitope','count','v_b_gene','j_b_gene','cdr3_b_aa','cdr3_b_nucseq']],
            organism = 'mouse',
            chains = ['beta'],
            compute_distances = False)

# Set to desired number of CPUs
tr.cpus = 2

# Identify a safe chunk size based on input data shape and target number of
# pairwise distances to be temporarily held in memory per node.
safe_chunk_size = get_safe_chunk(
    tr.clone_df.shape[0],
    tr.clone_df.shape[0],
    target = 10**7)

# Compute distances in chunks, keeping only those within the radius.
tr.compute_sparse_rect_distances(
    df = tr.clone_df,
    radius=50,
    chunk_size = safe_chunk_size)

# The result is stored as a sparse matrix.
print(tr.rw_beta)
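The example above combines two ideas: distances are computed a chunk of rows at a time, so only one dense block is held in memory, and only pairs within the radius are kept. The pure-Python sketch below illustrates both under stated assumptions: `toy_distance` is a simple stand-in metric, not TCRdist; a dict keyed by `(row, column)` stands in for a sparse matrix; and `chunk_size_for_target` is a hypothetical helper that is not claimed to match the formula used by `get_safe_chunk`.

```python
def chunk_size_for_target(n_rows, n_cols, target=10**7):
    """Rows per chunk such that rows * n_cols does not exceed `target`
    (with a floor of one row). Hypothetical helper for illustration."""
    return max(1, min(n_rows, target // max(1, n_cols)))


def toy_distance(a, b):
    """Stand-in metric (not TCRdist): mismatches over the shared prefix
    length plus the difference in length."""
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches + abs(len(a) - len(b))


def sparse_rect_distances(rows, cols, radius, chunk_size):
    """Return {(i, j): distance} for all pairs with distance <= radius,
    computing one dense block of `chunk_size` rows at a time."""
    sparse = {}
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]
        # Dense block for this chunk only; it is discarded after filtering.
        block = [[toy_distance(r, c) for c in cols] for r in chunk]
        for offset, dist_row in enumerate(block):
            for j, d in enumerate(dist_row):
                if d <= radius:
                    sparse[(start + offset, j)] = d
    return sparse


# Made-up CDR3 sequences for illustration.
cdr3s = ["CASSLGQAYEQYF", "CASSLGQGYEQYF", "CASSPDRGNTLYF", "CASSLGQAYEQ"]

# A tiny target forces chunking: at most 8 distances per dense block.
size = chunk_size_for_target(len(cdr3s), len(cdr3s), target=8)
kept = sparse_rect_distances(cdr3s, cdr3s, radius=2, chunk_size=size)
print(size, len(kept), "of", len(cdr3s) ** 2, "pairs kept")
```

The stored result does not depend on the chunk size; chunking changes only how much intermediate memory is needed at any moment.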
Citing
TCR meta-clonotypes for biomarker discovery with tcrdist3 enabled identification of public, HLA-restricted clusters of SARS-CoV-2 TCRs
Mayer-Blackwell K, Schattgen S, Cohen-Lavi L, Crawford JC, Souquette A, Gaevert JA, Hertz T, Thomas PG, Bradley PH, Fiore-Gartland A. eLife (2021).
Quantifiable predictive features define epitope-specific T cell receptor repertoires
Pradyot Dash, Andrew J. Fiore-Gartland, Tomer Hertz, George C. Wang, Shalini Sharma, Aisha Souquette, Jeremy Chase Crawford, E. Bridie Clemens, Thi H. O. Nguyen, Katherine Kedzierska, Nicole L. La Gruta, Philip Bradley & Paul G. Thomas Nature (2017).