A Python interface for rapid clustering of large sets of CDR3 sequences

ImmuneWatch ClusTCR

A Python interface for rapid clustering of large sets of CDR3 sequences with unknown antigen specificity

A two-step clustering approach that combines the speed of the Faiss clustering library with the accuracy of the Markov Clustering Algorithm (MCL).

On a standard machine*, clusTCR can cluster 1 million CDR3 sequences in under 5 minutes.
*Intel(R) Core(TM) i7-10875H CPU @ 2.30GHz, using 8 CPUs

Compared to other state-of-the-art clustering algorithms (GLIPH2, iSMART and tcrdist), clusTCR shows comparable clustering quality, but provides a steep increase in speed and scalability.
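The two-step idea can be illustrated with a small, dependency-free sketch. This is not clusTCR's implementation: the coarse step here simply buckets sequences by length (a stand-in for the fast Faiss pre-clustering), and the fine step takes connected components of a Hamming-distance graph (a stand-in for Markov clustering). All function names and the example sequences are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def coarse_groups(seqs):
    """Step 1 (stand-in for the fast pre-clustering): bucket by length."""
    groups = defaultdict(list)
    for s in seqs:
        groups[len(s)].append(s)
    return list(groups.values())


def fine_clusters(group, max_dist=1):
    """Step 2 (stand-in for graph clustering): connected components of the
    graph that links sequences differing in at most max_dist positions."""
    parent = {s: s for s in group}

    def find(s):
        # Union-find lookup with path halving.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    # Pairwise comparison is only affordable because each group is small.
    for a, b in combinations(group, 2):
        if hamming(a, b) <= max_dist:
            parent[find(a)] = find(b)

    components = defaultdict(set)
    for s in group:
        components[find(s)].add(s)
    # Report only groups of two or more sequences as clusters.
    return [c for c in components.values() if len(c) > 1]


def two_step_cluster(seqs):
    """Run the coarse step, then the fine step inside each coarse group."""
    clusters = []
    for group in coarse_groups(set(seqs)):
        clusters.extend(fine_clusters(group))
    return clusters


# Made-up CDR3-like sequences, for illustration only.
example = [
    "CASSLGQAYEQYF",
    "CASSLGQSYEQYF",
    "CASSPGQSYEQYF",
    "CASSIRSSYEQYF",
    "CASSLAPGATNEKLFF",
    "CASSLAPGTTNEKLFF",
]

for cluster in two_step_cluster(example):
    print(sorted(cluster))
```

The point of the split is that the expensive pairwise comparison only ever runs inside a coarse group, so the cost grows with the group sizes rather than with the square of the full dataset.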

Documentation & Install

All of our documentation, installation instructions and examples can be found at the link above. To get started, install clusTCR with pip:

$ pip install immunewatch-clustcr
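To confirm that the install succeeded, a short check along these lines may help. It uses only the standard library (`importlib.metadata`, Python 3.8+) and the distribution name from the pip command above; the helper name `installed_version` is hypothetical.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("immunewatch-clustcr")
if version is None:
    print("immunewatch-clustcr is not installed in this environment")
else:
    print(f"immunewatch-clustcr {version} is installed")
```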

Development Guide

Environment

uv venv
source .venv/bin/activate
uv pip install -e ".[dev]"

Testing

# Run all tests
pytest

# Run with coverage report
pytest --cov=clustcr --cov-report=html

Build Distribution

# Install build and upload tools
uv pip install build twine

# Build source and wheel distributions
python -m build

# Check the built distributions
twine check dist/*

# Upload to TestPyPI
twine upload --repository testpypi dist/*

# Upload to PyPI
twine upload dist/*

Cite

Please cite as:

Sebastiaan Valkiers, Max Van Houcke, Kris Laukens, Pieter Meysman, ClusTCR: a Python interface for rapid clustering of large sets of CDR3 sequences with unknown antigen specificity, Bioinformatics, 2021, btab446, https://doi.org/10.1093/bioinformatics/btab446

Bibtex:

@article{valkiers2021clustcr,
    author = {Valkiers, Sebastiaan and Van Houcke, Max and Laukens, Kris and Meysman, Pieter},
    title = "{ClusTCR: a Python interface for rapid clustering of large sets of CDR3 sequences with unknown antigen specificity}",
    journal = {Bioinformatics},
    year = {2021},
    month = {06},
    issn = {1367-4803},
    doi = {10.1093/bioinformatics/btab446},
    url = {https://doi.org/10.1093/bioinformatics/btab446},
    note = {btab446},
    eprint = {https://academic.oup.com/bioinformatics/advance-article-pdf/doi/10.1093/bioinformatics/btab446/38660282/btab446.pdf},
}
