Integrated TCR-Gene-Antigen Prediction: dataset tooling and models for TCR-antigen recognition.

ITGAP — Integrated TCR-Gene-Antigen Prediction

itgap is a Python package for building TCR-peptide datasets from the 10x Genomics CD8+ T-cell multi-omics benchmark and training TCR-antigen recognition models that integrate gene expression (GEX), V/J gene usage, and CDR3 sequence information.

It bundles:

  • NegativeSamplingTool — two-stage synthetic negative TCR-peptide sampling.
  • Sequence encoding utilities (Atchley factors + positional encoding) with the Atchley table shipped as a package resource.
  • Autoencoder + encoder–decoder integration models for combining sequence, V/J, and GEX modalities.
  • Residual-MLP binary classifiers, sklearn baselines (logistic regression, random forest), and standard evaluation/plotting helpers.
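
To make the sequence-encoding bullet concrete, here is a minimal sketch of combining per-residue factors with a sinusoidal positional encoding. It is an illustration under assumptions, not the package's implementation: `toy_factors` is a placeholder table (not the real Atchley values), and `encode_cdr3`, `positional_encoding`, the additive combination, and the padding length of 20 are all hypothetical choices.

```python
import math

# Placeholder factor table: NOT the real Atchley values. The packaged
# atchley.txt holds the actual five factors per amino acid.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
N_FACTORS = 5
toy_factors = {
    aa: [((i * 7 + k * 3) % 11 - 5) / 5.0 for k in range(N_FACTORS)]
    for i, aa in enumerate(AMINO_ACIDS)
}

def positional_encoding(position, dim):
    """Sinusoidal encoding: sin on even indices, cos on odd indices."""
    enc = []
    for k in range(dim):
        angle = position / (10000 ** (2 * (k // 2) / dim))
        enc.append(math.sin(angle) if k % 2 == 0 else math.cos(angle))
    return enc

def encode_cdr3(seq, max_len=20):
    """Return a max_len x N_FACTORS matrix; rows past len(seq) are zero padding."""
    rows = []
    for pos in range(max_len):
        if pos < len(seq):
            factors = toy_factors[seq[pos]]
            pe = positional_encoding(pos, N_FACTORS)
            # Assumption: factors and position signal are summed element-wise.
            rows.append([f + p for f, p in zip(factors, pe)])
        else:
            rows.append([0.0] * N_FACTORS)
    return rows

encoded = encode_cdr3("CASSLGQAYEQYF")
print(len(encoded), len(encoded[0]))  # 20 5
```

Sequences longer than `max_len` are truncated in this sketch; whether the package truncates, pads, or concatenates the positional signal instead of adding it is not stated here.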

Install

Core install (small footprint; depends only on numpy, pandas, scikit-learn, and matplotlib):

pip install itgap

Optional extras:

pip install 'itgap[gex]'   # adds scanpy + anndata for h5ad loading
pip install 'itgap[tf]'    # adds tensorflow (or tensorflow-macos on Apple Silicon)
pip install 'itgap[all]'   # everything

itgap[tf] resolves to tensorflow-macos>=2.9 on macOS arm64 and to tensorflow>=2.9 elsewhere.
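
If it helps to confirm which extras are usable in the current environment, the following sketch checks for the optional modules with the standard library only. The helper names `available_extras` and `expected_tf_distribution` are hypothetical and not part of itgap; the module lists are assumed from the extras described above.

```python
import importlib.util
import platform

# Import names assumed to correspond to each extra described above.
OPTIONAL_MODULES = {
    "gex": ["scanpy", "anndata"],
    "tf": ["tensorflow"],  # tensorflow-macos is also imported as "tensorflow"
}

def available_extras(modules=OPTIONAL_MODULES):
    """Map each extra name to True when all of its modules are importable."""
    return {
        extra: all(importlib.util.find_spec(name) is not None for name in names)
        for extra, names in modules.items()
    }

def expected_tf_distribution():
    """Name of the distribution itgap[tf] is described as resolving to here."""
    on_apple_silicon = platform.system() == "Darwin" and platform.machine() == "arm64"
    return "tensorflow-macos" if on_apple_silicon else "tensorflow"

print(available_extras())
print(expected_tf_distribution())
```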

Data

The package ships only the small atchley.txt reference table. The large benchmark file merge_gex_all_donors_all_peptides_meta.h5ad (~200 MB) is not included; download it from the 10x Genomics CD8+ T-cell multi-omics dataset and pass its path to NegativeSamplingTool(data_dir=...) or to load_dataset(h5ad_path=...). The pre-computed CSV embeddings used in the example notebooks live in the project repository under examples/data/.
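
Because the benchmark file has to be downloaded separately, it can be worth checking that it is present before constructing the tool. This is a small sketch using only the standard library; `resolve_h5ad` is a hypothetical helper, not an itgap function.

```python
from pathlib import Path

H5AD_NAME = "merge_gex_all_donors_all_peptides_meta.h5ad"

def resolve_h5ad(data_dir):
    """Return the benchmark file path inside data_dir, or raise a clear error."""
    path = Path(data_dir).expanduser() / H5AD_NAME
    if not path.is_file():
        raise FileNotFoundError(
            f"{H5AD_NAME} not found in {path.parent}; download it from the "
            "10x Genomics CD8+ T-cell multi-omics dataset first."
        )
    return path
```

The returned path could then be passed as `h5ad_path`, and its parent directory as `data_dir`.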

Quickstart

Generate a negative-sampled training set:

from itgap import NegativeSamplingTool

tool = NegativeSamplingTool(
    data_dir="path/to/10x",   # contains merge_gex_all_donors_all_peptides_meta.h5ad
    negative_ratio=3.0,
    random_seed=42,
)
result = tool.create_combined_dataset(negative_ratio=3.0)
print(result["dataset"].shape, result["statistics"])
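
For readers new to negative sampling, the sketch below shows the general idea with a simple mismatched-pair scheme: pair each TCR with peptides it was not observed with, up to a target ratio. It is a generic illustration, not the two-stage procedure that NegativeSamplingTool implements. `sample_negatives` is a hypothetical function, and the pairs are toy inputs invented for the example, not real binding data.

```python
import random

def sample_negatives(positive_pairs, negative_ratio=3.0, seed=42):
    """Pair each TCR with peptides it was not observed binding.

    A generic shuffled-pair scheme, not the package's two-stage procedure.
    """
    rng = random.Random(seed)
    observed = set(positive_pairs)
    tcrs = sorted({t for t, _ in positive_pairs})
    peptides = sorted({p for _, p in positive_pairs})
    # All TCR-peptide combinations never seen as positives.
    candidates = [(t, p) for t in tcrs for p in peptides if (t, p) not in observed]
    n_wanted = int(round(negative_ratio * len(positive_pairs)))
    # Cap at the number of distinct mismatched pairs that exist.
    n_taken = min(n_wanted, len(candidates))
    return rng.sample(candidates, n_taken)

# Toy positives for illustration only.
positives = [
    ("CASSLGQAYEQYF", "GILGFVFTL"),
    ("CASSIRSSYEQYF", "GILGFVFTL"),
    ("CASSPGTGGYEQYF", "NLVPMVATV"),
    ("CASSQDRGYGYTF", "ELAGIGILTV"),
]
negatives = sample_negatives(positives, negative_ratio=1.0)
dataset = [(t, p, 1) for t, p in positives] + [(t, p, 0) for t, p in negatives]
print(len(positives), len(negatives))  # 4 4
```

With only four TCRs and three peptides there are eight possible mismatched pairs, so a ratio of 3.0 would be capped at eight negatives in this toy setting.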

Train a residual-MLP classifier on assembled features (requires itgap[tf]):

from itgap import (
    load_atchley, build_residual_mlp, compile_and_train, evaluate_classifier,
)

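# X_train, y_train, X_val, y_val, X_test, and y_test are assumed to be feature
# matrices and binary labels that you have already assembled and split.
word_vectors, aa_idx = load_atchley()   # uses the packaged atchley.txt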
model = build_residual_mlp(input_dim=X_train.shape[1])
history = compile_and_train(model, X_train, y_train, X_val, y_val, epochs=50)
metrics = evaluate_classifier(model, X_test, y_test)
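
The snippet above expects train, validation, and test splits to exist already. One way to produce them is a stratified split that preserves the positive-to-negative balance, which matters when negatives outnumber positives. The sketch below uses only the standard library; `stratified_split` and the 70/15/15 fractions are assumptions for illustration, and scikit-learn's `train_test_split` with `stratify=y` is the more common choice in practice.

```python
import random

def stratified_split(X, y, fractions=(0.7, 0.15, 0.15), seed=42):
    """Split rows into train/val/test while keeping the label balance."""
    rng = random.Random(seed)
    splits = [([], []) for _ in fractions]
    for label in sorted(set(y)):
        idx = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(idx)
        n_train = int(fractions[0] * len(idx))
        n_val = int(fractions[1] * len(idx))
        # The test split takes whatever remains for this label.
        bounds = [0, n_train, n_train + n_val, len(idx)]
        for k, (xs, ys) in enumerate(splits):
            for i in idx[bounds[k]:bounds[k + 1]]:
                xs.append(X[i])
                ys.append(y[i])
    return splits

# Toy features with a 1:3 positive-to-negative ratio.
X = [[float(i), float(i % 3)] for i in range(40)]
y = [1] * 10 + [0] * 30
(X_train, y_train), (X_val, y_val), (X_test, y_test) = stratified_split(X, y)
print(len(X_train), len(X_val), len(X_test))
```

Rows come out grouped by label within each split, so shuffle them (or rely on the trainer's own shuffling) before fitting.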

Command-line

Installing the package exposes a console script:

itgap-negative-sampling   # runs NegativeSamplingTool with default settings

Examples

End-to-end Jupyter notebooks live in examples/ of the repository:

  • data_preparation_notebook.ipynb — build the labeled dataset.
  • tcr_beta_prediction_notebook.ipynb — beta-chain only model.
  • tcr_alpha_beta_prediction_notebook.ipynb — alpha + beta + GEX + VJ.

Development

pip install -e '.[dev,all]'
pytest
python -m build

License

MIT. See LICENSE.

Citation

If you use ITGAP in a publication, please cite the project repository: https://github.com/mlizhangx/ITGAP.
