
Structure-based prediction of TCR recognition of epitopes via residue-level statistical potentials

tcren — structure-based prediction of TCR–epitope recognition

TCRen predicts which epitopes a T-cell receptor recognises from a single TCR–peptide–MHC structure (experimental or modelled). It extracts the TCR–peptide contact map and scores every candidate peptide with a residue-level statistical potential derived from contact preferences in TCR:pMHC crystal structures — answering not "what fancy complex can a model draw?" but "is this binding physically plausible?".
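
The scoring idea can be illustrated with a minimal sketch. It assumes an energy-like convention (lower is more favourable) and a fixed contact map reused for every candidate; the contact tuples, the toy potential values, and the function names below are invented for illustration and are not tcren's actual data or API.

```python
from typing import Dict, List, Tuple

# Hypothetical contact record: (TCR residue amino acid, 0-based peptide position).
Contact = Tuple[str, int]

# Toy pairwise values, invented for illustration only (lower = more favourable).
TOY_POTENTIAL: Dict[Tuple[str, str], float] = {
    ("Y", "L"): -0.8,
    ("Y", "K"): 0.3,
    ("D", "K"): -1.1,
    ("D", "L"): 0.6,
}


def score_peptide(contacts: List[Contact], peptide: str,
                  potential: Dict[Tuple[str, str], float]) -> float:
    """Sum pairwise terms over the contact map, threading the candidate
    peptide's residues onto the contacting peptide positions."""
    total = 0.0
    for tcr_aa, pos in contacts:
        # Pairs missing from the toy table contribute nothing.
        total += potential.get((tcr_aa, peptide[pos]), 0.0)
    return total


def rank_peptides(contacts: List[Contact], peptides: List[str],
                  potential: Dict[Tuple[str, str], float]) -> List[Tuple[str, float]]:
    """Return (peptide, score) pairs sorted from lowest to highest score."""
    scored = [(p, score_peptide(contacts, p, potential)) for p in peptides]
    return sorted(scored, key=lambda item: item[1])


# Two contacts: a TCR Tyr touching peptide position 0, a TCR Asp touching position 1.
contacts = [("Y", 0), ("D", 1)]
ranked = rank_peptides(contacts, ["KL", "LK"], TOY_POTENTIAL)
```

Because the contact map is held fixed, ranking many candidates only requires re-reading the peptide residue at each contacting position, which is why scoring is cheap compared with building the map.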

This is a documented, tested, CLI-driven Python library. TCR chains are annotated with the sibling arda; MHC chains are mapped and the groove partitioned against a curated reference; structures are oriented into one canonical frame; and the original contact maps, potential, and scores are reproduced numerically (validated against committed oracles to floating-point precision).

While the original tcren focused on TCR:peptide contacts, the new version also scores TCR:MHC and peptide:MHC interactions, which are needed for a full picture of TCR:pMHC binding mechanics and for estimating ddG values.

Install

bash setup.sh              # creates the `tcren` conda env, installs arda + tcren, fetches data/
conda activate tcren

Alternatively, install the released package from PyPI with pip install tcren (binary wheels ship the C++ extension).

tcren ships a small pybind11/C++ extension (tcren._align) for the MHC-pseudosequence fitting-alignment hot path, built on install by scikit-build-core (a Biopython fallback runs if it is not built). TCR annotation is provided by arda, a runtime dependency published to PyPI as arda-mapper (it imports as arda); pip/setup.sh pull it automatically. setup.sh also runs tcren fetch-data to populate data/ with the reference structure sets (Native2026, Canonical2026) used by orient/superimpose (set TCREN_NO_FETCH=1 to skip).

Command line

# Full pipeline: annotate -> superimpose -> resmarkup / canonical Cα / contacts -> per-interface
# energies (TCRen for TCR↔peptide, MJ for TCR↔MHC and peptide↔MHC) + total
tcren pipeline -s complex.pdb -o scores.csv

# End-to-end candidate-epitope scoring from a structure
tcren score -s complex.pdb -c candidates.txt -o ranked.csv

# Structures: any of .pdb / .cif / .pdb.gz / .cif.gz, a directory, or a .tar.gz batch
tcren contacts -s batch.tar.gz -o contacts.csv --interface tcr_peptide

# Per-residue markup: TCR (CDR/FR) + MHC groove (helix/floor) + peptide in one table.
# --regions all|tcr|mhc|peptide filters; --pseudo also marks NetMHCpan groove residues (MPS).
tcren annotate -s complex.cif.gz -o markup.csv --regions mhc --pseudo

# Superimpose structure(s) onto the canonical frame, by MHC, against the canonical database
# (data/Canonical2026, fetched at install). Detects MHC class + species and averages the
# superposition over every database structure of that class/species. Chains -> A=Vα B=Vβ
# C=peptide D=MHCα E=MHCβ/β2m. -s takes a file / directory / .tar.gz / glob; -o is a directory,
# or a single structure file (one input) whose extension must match --mmCIF/--compress; -t threads.
tcren superimpose -s complex.pdb -o oriented.pdb           # single file
tcren superimpose -s 'data/*.pdb' -o oriented/ -t 8        # glob -> directory, threaded

# Build a canonical database from native complexes (how Canonical2026 is produced). Annotation
# is one batched mmseqs call; -t threads only the structural alignment + write.
tcren orient -s data/Native2026 -o data/Canonical2026 -t 8

# Structure outputs are plain .pdb by default; add --mmCIF for .cif and --compress for .gz.
tcren superimpose -s complex.pdb -o oriented/ --mmCIF --compress   # -> oriented/<id>.cif.gz

# Fetch recent TCR-pMHC structures from RCSB -> data/pdb_recent (mmCIF .cif.gz, 5-chain validated)
tcren fetch-recent --discover --after 2024-01-01

# Build the MHC reference once (IMGT/HLA + mouse H-2; cached, not committed)
tcren build-mhc-ref

tcren info
tcren --install-completion        # shell tab-completion (bash/zsh/fish)

tcren orient and tcren superimpose need the reference sets in data/ (Native2026, Canonical2026); setup.sh fetches them at install via tcren fetch-data (re-run it any time).

Library

from tcren import run_pipeline, parse_structure, import_structure, ContactMap, score_peptides
from tcren.annotation import classify_chains
from tcren.potential import tcren

# One call: annotate -> superimpose -> contacts -> per-interface energies + total
res = run_pipeline("complex.pdb")              # res.scores, res.markup, res.contacts, res.oriented

# …or the individual steps:
s = parse_structure("complex.pdb.gz")          # also .cif/.cif.gz; import_structure trims the C-gene
classify_chains(s, organism="human")           # TRA/TRB via arda, peptide, MHC
cm = ContactMap.from_structure(s)              # 5 Å contacts + interface partitioning
ranked = score_peptides(cm, ["KQWLVWLFL", "RLLHPHHPL"], tcren())
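
To make the "5 Å contacts" step concrete, here is a brute-force sketch of a distance-cutoff contact search in pure Python. The Atom record and residue_contacts function are hypothetical names for illustration and are not part of tcren. The Performance section mentions a cKDTree, so the library presumably uses a spatial index rather than this all-pairs loop.

```python
import math
from typing import Iterable, NamedTuple, Set, Tuple


class Atom(NamedTuple):
    chain: str                          # chain identifier, e.g. "B" (TCR) or "C" (peptide)
    resnum: int                         # residue number within the chain
    xyz: Tuple[float, float, float]     # coordinates in Å


ResidueKey = Tuple[str, int]


def residue_contacts(atoms_a: Iterable[Atom], atoms_b: Iterable[Atom],
                     cutoff: float = 5.0) -> Set[Tuple[ResidueKey, ResidueKey]]:
    """Residue pairs that have at least one atom pair within `cutoff` Å."""
    atoms_b = list(atoms_b)  # reused in the inner loop
    pairs = set()
    for a in atoms_a:
        for b in atoms_b:
            if math.dist(a.xyz, b.xyz) <= cutoff:
                pairs.add(((a.chain, a.resnum), (b.chain, b.resnum)))
    return pairs


# Made-up coordinates: only the first TCR atom lies within 5 Å of a peptide atom.
tcr_atoms = [Atom("B", 95, (0.0, 0.0, 0.0)), Atom("B", 96, (10.0, 0.0, 0.0))]
peptide_atoms = [Atom("C", 4, (3.0, 3.0, 0.0)), Atom("C", 5, (20.0, 0.0, 0.0))]
found = residue_contacts(tcr_atoms, peptide_atoms)
```

The all-pairs loop is quadratic in the number of atoms, which is acceptable for a small interface but is the reason a k-d tree is the usual choice for whole structures.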

Batch inputs, gzip, archives

from tcren.structure import iter_structures
for pdb_id, structure in iter_structures("batch.tar.gz"):   # file | directory | .tar.gz
    classify_chains(structure, organism="human")
    ...
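
For readers curious how one entry point can accept a file, a directory, or a .tar.gz batch, the following standard-library sketch shows one way to do the dispatch. It yields raw text rather than parsed structures, and iter_structure_texts and its helpers are hypothetical names; it is not the implementation of iter_structures.

```python
import gzip
import tarfile
from pathlib import Path
from typing import Iterator, Tuple

STRUCTURE_SUFFIXES = (".pdb", ".cif", ".pdb.gz", ".cif.gz")


def _structure_id(name: str) -> str:
    """Strip the directory and the longest matching structure suffix."""
    base = Path(name).name
    for suffix in sorted(STRUCTURE_SUFFIXES, key=len, reverse=True):
        if base.endswith(suffix):
            return base[: -len(suffix)]
    return base


def _decode(name: str, raw: bytes) -> str:
    """Gunzip if the name says so, then decode to text."""
    if name.endswith(".gz"):
        raw = gzip.decompress(raw)
    return raw.decode("utf-8", errors="replace")


def iter_structure_texts(source) -> Iterator[Tuple[str, str]]:
    """Yield (id, text) from a single file, a directory, or a .tar.gz batch."""
    path = Path(source)
    if path.is_dir():
        for child in sorted(path.iterdir()):
            if child.name.endswith(STRUCTURE_SUFFIXES):
                yield _structure_id(child.name), _decode(child.name, child.read_bytes())
    elif path.name.endswith((".tar.gz", ".tgz")):
        with tarfile.open(path, "r:gz") as archive:
            for member in archive:
                if member.isfile() and member.name.endswith(STRUCTURE_SUFFIXES):
                    handle = archive.extractfile(member)
                    yield _structure_id(member.name), _decode(member.name, handle.read())
    else:
        yield _structure_id(path.name), _decode(path.name, path.read_bytes())
```

The archive check comes before the single-file fallback because a .tar.gz name also ends in .gz and would otherwise be treated as one gzipped structure.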

Canonical orientation, contacts, docking geometry

from tcren.mhc import annotate_mhc
from tcren.orient import canonicalize_structure, superimpose, docking_angles
from tcren.contacts import multi_contacts, ContactDefinition

annotate_mhc(s)
oriented, info = canonicalize_structure(s)     # frame: z=MHC→TCR, y=peptide, x=thin; chains A–E
oriented, info = superimpose(s)                # orient onto data/Canonical2026 by MHC (class+species ensemble)
layers = multi_contacts(s, ContactDefinition(d1=5, d2=8, d3=12))   # heavy-atom / Cβ / Cα
d = docking_angles(s)                          # crossing (~20–70° αβ) + incident angle
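
As a rough illustration of what the two docking angles measure once a complex is in the canonical frame (z = MHC→TCR, y = peptide), the sketch below derives them from three vectors. The inputs (Vα and Vβ domain centroids and a TCR long-axis vector) and the simplified definitions are assumptions made for this example; docking_angles may define and compute these quantities differently.

```python
import math
from typing import Sequence, Tuple


def _angle_deg(u: Sequence[float], v: Sequence[float]) -> float:
    """Angle between two 3D vectors in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos = max(-1.0, min(1.0, dot / (norm_u * norm_v)))  # clamp rounding noise
    return math.degrees(math.acos(cos))


def docking_angles_sketch(va_centroid: Sequence[float],
                          vb_centroid: Sequence[float],
                          tcr_axis: Sequence[float]) -> Tuple[float, float]:
    """Return (crossing, incident) angles under the simplified definitions above.

    crossing: angle between the Vα→Vβ vector, projected onto the groove (xy)
              plane, and the peptide axis (y).
    incident: tilt of the TCR long axis away from the MHC→TCR direction (z).
    """
    inter_domain = [b - a for a, b in zip(va_centroid, vb_centroid)]
    in_plane = (inter_domain[0], inter_domain[1], 0.0)
    crossing = _angle_deg(in_plane, (0.0, 1.0, 0.0))
    incident = _angle_deg(tcr_axis, (0.0, 0.0, 1.0))
    return crossing, incident


# Made-up coordinates: a diagonal footprint with an untilted TCR.
crossing, incident = docking_angles_sketch((0.0, 0.0, 20.0), (10.0, 10.0, 22.0),
                                           (0.0, 0.0, 1.0))
```

Working in the canonical frame is what keeps this short: the peptide axis and the groove normal are simply the y and z unit vectors, so no per-structure fitting is needed at this step.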

2D complementarity maps & region-pair contacts

from tcren.project2d import (project_structure, residue_markup_table, contacts_table,
                             region_pair_summary)
from tcren.viz import render_complementarity_map, view_pocket_cdr

proj = project_structure(s)                                   # canonical groove plane
svg  = render_complementarity_map(residue_markup_table(s, proj),
                                  contacts=contacts_table(s, threshold=5.0))
region_pair_summary(s, kind="closest")        # contacts per region pair + bond types (cb/ca too)
view_pocket_cdr(s).show()                      # interactive 3D pocket + CDR overlay (py3Dmol)

Data

Structures live in the Hugging Face dataset isalgo/tcren_structures, all gzipped:

| folder | contents |
|---|---|
| Native2022 | the 2022 paper set (oracle) |
| Native2026 | the comprehensive 2026 TCR:pMHC set the current potential is derived from |
| Canonical2026 | Native2026 re-oriented into the canonical frame (tcren orient) |

tcren reads .pdb/.cif/.pdb.gz/.cif.gz and .tar.gz batches; an installed library lazily fetches the canonical reference structures from the Hub when orienting a new complex. The root data/ holds Native2026 (+ Canonical2026, gitignored, fetched on demand), PDB_date.tsv, orient_metadata.json, and TCRen_potential.csv — the current potential derived from the Native2026 set (use it with tcren score -p data/TCRen_potential.csv).

Notebooks

Runnable examples under notebooks/ (rendered in the docs):

  • complementarity_map_2d — 2D interface maps, multiple structural + map views of 1ao7
  • contact_thresholds_and_bondtypes — region-pair contact counts (closest/Cβ/Cα) + bond types
  • canonical_frame_figures — canonical-frame QC across the Native2026 set
  • pymol_canonical_figures — ray-traced PyMOL panels (overlay, groove, interface) by class/species
  • mhc_pseudosequence_mps — NetMHCpan MHC pseudosequence (MPS) residues vs. peptide contacts
  • example_gil_a02_rs_motif — GILGFVFTL/HLA-A*02 and the public CDR3β Arg–Ser motif
  • natcompsci2022/ — full reproduction of the Nat Comput Sci 2022 analyses

Performance

Per-stage timings on a TCR-pMHC complex (1ao7), Apple M3, single thread (RUN_BENCHMARK=1 pytest -k benchmark -s to reproduce):

| stage | time | notes |
|---|---|---|
| parse a gzipped structure | ~19 ms | .pdb.gz / .cif.gz |
| contact map (5 Å, cKDTree) | ~9 ms | per structure |
| score 1000 candidate peptides | ~8 ms | ~8 µs/peptide (vectorised) |
| annotate (TCR + MHC), batched | ~213 ms/structure | one mmseqs2 call for the whole set; vs ~1.5 s/structure unbatched |
| peak RSS, single-structure pipeline | ~195 MB | |

Annotation is the only network/compute-heavy step and is always batched (one mmseqs2 search over all chains; mmseqs2 parallelises internally — never per-structure, never Python-threaded). Threads are used only for the embarrassingly-parallel, mmseqs-free stages (structural alignment, write, rendering): tcren orient -t N.
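
The batch-then-thread pattern described above can be sketched as follows. Both stage functions are placeholders invented for this example (no mmseqs2 or structural alignment is run); the point is only the control flow: one batched call over everything, then a thread pool over the independent per-structure work.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple


def annotate_all(chains_by_structure: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Placeholder for the single batched annotation call over every chain."""
    return {sid: [f"annotated:{chain}" for chain in chains]
            for sid, chains in chains_by_structure.items()}


def align_and_write(item: Tuple[str, List[str]]) -> str:
    """Placeholder for a per-structure stage that needs no shared state."""
    sid, annotated = item
    return f"{sid}:{len(annotated)} chains"


def run(chains_by_structure: Dict[str, List[str]], threads: int = 4) -> List[str]:
    # Stage 1: annotate once for the whole set, never per structure.
    annotations = annotate_all(chains_by_structure)
    # Stage 2: fan the independent per-structure work out over a thread pool.
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Executor.map returns results in input order.
        return list(pool.map(align_and_write, annotations.items()))


results = run({"1ao7": ["A", "B", "C", "D", "E"], "2bnr": ["A", "B"]}, threads=2)
```

Keeping the batched step outside the pool avoids launching one external search per structure, which is the cost the ~213 ms versus ~1.5 s comparison in the table reflects.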

Tests

pytest -m "not slow"          # unit + fast regression (the CI gate)
pytest                        # add the arda/mmseqs-backed regression tests
RUN_BENCHMARK=1 pytest -k benchmark -s

Citing

TCRen is free for academic and non-commercial use. If you use it, please cite our latest Nature Computational Science 2024 paper:

Karnaukhov VK, Shcherbinin DS, Chugunov AO, Chudakov DM, Efremov RG, Zvyagin IV, Shugay M. Structure-based prediction of T cell receptor recognition of unseen epitopes using TCRen. Nat Comput Sci. 2024 Jul;4(7):510-521. doi: 10.1038/s43588-024-00653-0. Epub 2024 Jul 10. PMID: 38987378.
