
a Python interface for rapid clustering of large sets of CDR3 sequences


ImmuneWatch ClusTCR

A Python interface for rapid clustering of large sets of CDR3 sequences with unknown antigen specificity

A two-step clustering approach that combines the speed of the Faiss Clustering Library with the accuracy of the Markov Clustering Algorithm.
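To make the two-step idea concrete, here is a minimal, dependency-free sketch. It is not clusTCR's implementation: the coarse step below simply buckets sequences by length as a stand-in for Faiss, and the refinement step is a tiny dense Markov clustering run on a graph that links sequences at Hamming distance 1. The function names (`hamming`, `markov_cluster`, `two_step_cluster`) and the parameter defaults are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def normalise_columns(m):
    """Scale every column of a square matrix so that it sums to 1."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def markov_cluster(adj, inflation=2.0, max_iter=50, tol=1e-9):
    """Toy dense Markov clustering on a boolean adjacency matrix."""
    n = len(adj)
    # Add self-loops and turn the graph into a column-stochastic matrix.
    m = normalise_columns(
        [[float(adj[i][j] or i == j) for j in range(n)] for i in range(n)]
    )
    for _ in range(max_iter):
        # Expansion: square the matrix (two steps of a random walk).
        sq = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
              for i in range(n)]
        # Inflation: element-wise power, then renormalise the columns.
        nxt = normalise_columns([[v ** inflation for v in row] for row in sq])
        delta = max(abs(nxt[i][j] - m[i][j]) for i in range(n) for j in range(n))
        m = nxt
        if delta < tol:
            break
    # Rows with mass on the diagonal are attractors; each one defines a cluster.
    return {
        frozenset(j for j in range(n) if m[i][j] > 1e-6)
        for i in range(n) if m[i][i] > 1e-6
    }


def two_step_cluster(cdr3s):
    """Coarse bucketing (stand-in for Faiss), then MCL inside each bucket."""
    buckets = defaultdict(list)
    for seq in sorted(set(cdr3s)):
        buckets[len(seq)].append(seq)  # ASSUMPTION: length as the coarse key
    clusters = []
    for seqs in buckets.values():
        n = len(seqs)
        adj = [[False] * n for _ in range(n)]
        for i, j in combinations(range(n), 2):
            if hamming(seqs[i], seqs[j]) == 1:
                adj[i][j] = adj[j][i] = True
        for members in markov_cluster(adj):
            if len(members) > 1:  # drop singletons
                clusters.append(sorted(seqs[k] for k in members))
    return sorted(clusters)


if __name__ == "__main__":
    example = ["CASSLGQYF", "CASSLGEYF", "CASSLGDYF",
               "CASRPGTAF", "CASRPGTEF", "CASSPGQGNTEAFF"]
    for cluster in two_step_cluster(example):
        print(cluster)
```

The coarse step keeps the expensive pairwise comparison and the matrix operations confined to small groups, which is the reason a cheap pre-clustering stage can make an accurate but costly second stage scale to large repertoires.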

On a standard machine*, clusTCR can cluster 1 million CDR3 sequences in under 5 minutes.
*Intel(R) Core(TM) i7-10875H CPU @ 2.30GHz, using 8 CPUs

Compared to other state-of-the-art clustering algorithms (GLIPH2, iSMART and tcrdist), clusTCR shows comparable clustering quality, but provides a steep increase in speed and scalability.

Documentation & Install

All of our documentation, installation info and examples can be found at the link above. To get you started, here's how to install clusTCR:

$ pip install immunewatch-clustcr
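
After installing, a quick way to confirm that the distribution is visible to the active interpreter is to query its metadata with the standard library. This is a small sketch: `installed_version` is a hypothetical helper written for this example, not part of clusTCR. Note that the distribution name is `immunewatch-clustcr`, while the test commands in this README suggest the importable package is named `clustcr`.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("immunewatch-clustcr")
    if found is None:
        print("immunewatch-clustcr is not installed in this environment")
    else:
        print(f"immunewatch-clustcr {found} is installed")
```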

Development Guide

Environment

uv venv
source .venv/bin/activate
uv pip install -e ".[dev]"

Testing

# Run all tests
pytest

# Run with coverage report
pytest --cov=clustcr --cov-report=html

Build Distribution

# Install build tools
uv pip install build twine

# Build source and wheel distributions
python -m build

# Check the built distributions
twine check dist/*

# Upload to TestPyPI
twine upload --repository testpypi dist/*

# Upload to PyPI
twine upload dist/*
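
Before or after uploading, it can be useful to record the SHA256 digest of each built artifact so that it can be compared with the digest PyPI reports for the uploaded file. The sketch below uses only the standard library; `sha256_of` and `digest_dist` are hypothetical helper names, and `dist` is assumed to be the output directory of `python -m build`.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digest_dist(dist_dir="dist"):
    """Map each file name in dist_dir to its SHA256 digest (empty if no such dir)."""
    root = Path(dist_dir)
    if not root.is_dir():
        return {}
    return {p.name: sha256_of(p) for p in sorted(root.iterdir()) if p.is_file()}


if __name__ == "__main__":
    for name, digest in digest_dist().items():
        print(f"{digest}  {name}")
```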

Cite

Please cite as:

Sebastiaan Valkiers, Max Van Houcke, Kris Laukens, Pieter Meysman, ClusTCR: a Python interface for rapid clustering of large sets of CDR3 sequences with unknown antigen specificity, Bioinformatics, 2021, btab446, https://doi.org/10.1093/bioinformatics/btab446

Bibtex:

@article{valkiers2021clustcr,
    author = {Valkiers, Sebastiaan and Van Houcke, Max and Laukens, Kris and Meysman, Pieter},
    title = "{ClusTCR: a Python interface for rapid clustering of large sets of CDR3 sequences with unknown antigen specificity}",
    journal = {Bioinformatics},
    year = {2021},
    month = {06},
    issn = {1367-4803},
    doi = {10.1093/bioinformatics/btab446},
    url = {https://doi.org/10.1093/bioinformatics/btab446},
    note = {btab446},
    eprint = {https://academic.oup.com/bioinformatics/advance-article-pdf/doi/10.1093/bioinformatics/btab446/38660282/btab446.pdf},
}
