Pure-Python port of the R package turboGliph — GLIPH and GLIPH2 specificity-group clustering of T-cell receptor CDR3 repertoires.
py-gliph
Pure-Python port of the R package turboGliph — implementations of GLIPH (Glanville et al., Nature 2017) and GLIPH2 (Huang et al., Nat. Biotechnol. 2020): Grouping of Lymphocyte Interactions by Paratope Hotspots.
pygliph clusters T-cell receptors (TCRs) into specificity groups
predicted to bind the same MHC-restricted peptide antigen, based on
shared CDR3 motifs. It is a standalone, dependency-light re-implementation
that does not require R or rpy2.
| | |
|---|---|
| PyPI / import name | pygliph |
| Repository | omicverse/py-gliph |
| License | Apache-2.0 |
| Upstream | turboGliph 0.99.2 (GPL-3, R) |
| Numerical parity | deterministic parts bit-exact vs turboGliph |
Install
pip install pygliph # once published
# or, from a checkout:
pip install -e .
Dependencies: numpy, scipy, pandas, networkx (and matplotlib
for plot_network, an optional extra).
What it does
GLIPH/GLIPH2 build specificity groups from two kinds of CDR3 similarity:
- Local (motif) similarity — enriched short continuous CDR3 motifs (2–4-mers). GLIPH2 scores enrichment with Fisher's exact test (hypergeometric) against a naive reference TCR pool; motif clusters are position-restricted (a motif may shift by < 3 residues). GLIPH v1 instead uses repeated random sampling from the reference.
- Global similarity — CDR3s of equal length differing by a single amino acid (Hamming distance 1). GLIPH2 uses position-specific structures with an optional BLOSUM62 interchangeability constraint.
- Specificity-group construction — connected components of the combined local + global similarity graph.
- Per-group scoring: network size, CDR3-length restriction, V-gene bias, clonal-expansion enrichment and shared-HLA enrichment, combined into a total.score (GLIPH2's convergence_groups.txt).
- N/P-residue up-weighting: local cluster significance is boosted for motifs overlapping non-germline (N/P-nucleotide) encoded residues.
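The steps above can be sketched in a few lines of standard-library Python. This is a simplified illustration, not pygliph's code: the function names, the motif "SLAP" and the CDR3 strings are made up, the enrichment test uses the textbook one-sided Fisher formulation (pygliph may parametrize it differently), and the position restriction, BLOSUM62 filter and scoring are left out.

```python
from itertools import combinations
from math import comb


def enrichment_p_value(hits_sample, n_sample, hits_ref, n_ref):
    """One-sided Fisher (hypergeometric upper-tail) p-value for one motif."""
    total = n_sample + n_ref        # pooled number of sequences
    hits = hits_sample + hits_ref   # pooled sequences carrying the motif
    upper = min(hits, n_sample)
    tail = sum(
        comb(hits, x) * comb(total - hits, n_sample - x)
        for x in range(hits_sample, upper + 1)
    )
    return tail / comb(total, n_sample)


def is_global_neighbour(a, b):
    """True when two CDR3s have equal length and Hamming distance 1."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def build_edges(cdr3s, enriched_motifs):
    """Link CDR3 pairs that share an enriched motif or are global neighbours."""
    edges = set()
    for a, b in combinations(cdr3s, 2):
        local = any(m in a and m in b for m in enriched_motifs)
        if local or is_global_neighbour(a, b):
            edges.add((a, b))
    return edges


def connected_components(nodes, edges):
    """Group nodes into connected components with a depth-first search."""
    adjacency = {n: set() for n in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, groups = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adjacency[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Illustrative (made-up) CDR3 sequences.
cdr3s = [
    "CASSLAPGATNEKLFF",
    "CASSLAPGTTNEKLFF",
    "CSASLAPNTEAFF",
    "CASSIRSSYEQYF",
    "CASSIRSAYEQYF",
    "CASRGDTQYF",
]
groups = connected_components(cdr3s, build_edges(cdr3s, ["SLAP"]))

# Motif seen in 3 of 4 sample CDR3s and 1 of 4 reference CDR3s.
p_value = enrichment_p_value(3, 4, 1, 4)
```

Here the first three sequences end up in one group (shared motif plus one Hamming-1 pair), the next two form a second group through a single substitution, and the last sequence stays a singleton.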
Quick start
import pygliph as pg
# bundled example: ~2000 TCRs of known specificity
data = pg.datasets.load_gliph_input_data()
# GLIPH2
res = pg.gliph2(data, sim_depth=1000)
res["cluster_properties"] # scored specificity groups
res["motif_enrichment"]["selected_motifs"] # enriched local motifs
res["global_enrichment"] # global-similarity structures
res["cluster_list"] # {tag: member DataFrame}
# original GLIPH (v1)
res1 = pg.turbo_gliph(data, sim_depth=1000)
# configurable hybrid
res2 = pg.gliph_combined(data, local_method="fisher", global_method="fisher")
API
| Function | Purpose |
|---|---|
| gliph2 | GLIPH2 algorithm: Fisher-test local motifs + global structures |
| turbo_gliph | original GLIPH: resampling local motifs + Hamming globals |
| gliph_combined | configurable hybrid of GLIPH / GLIPH2 |
| cluster_scoring | per-cluster enrichment scoring |
| find_motifs | continuous / discontinuous motif enumeration |
| de_novo_TCRs | de-novo CDR3 generation from a specificity group |
| plot_network | specificity-group network visualisation |
| save_gliph_output / load_gliph_output | result-folder I/O |
| datasets | bundled example data and reference resources |
Each result is a dict mirroring turboGliph: motif_enrichment,
global_enrichment, connections, cluster_properties,
cluster_list, parameters (and sample_log for GLIPH v1).
Input format
A pandas.DataFrame (or a plain list of CDR3b strings) with column
CDR3b and any of the optional columns TRBV, patient, HLA,
counts — the same schema as turboGliph.
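As a rough illustration of that schema, the sketch below reads a small tab-separated table into records keyed by the documented column names. The helper `read_tcr_table`, the sample rows and the TRBV / patient values are hypothetical; how pygliph treats extra columns or encodes HLA is not covered here.

```python
import csv
import io

# Column names taken from the schema described above.
KNOWN_COLUMNS = ("CDR3b", "TRBV", "patient", "HLA", "counts")

# Made-up example rows; only CDR3b is mandatory.
RAW = (
    "CDR3b\tTRBV\tpatient\tcounts\n"
    "CASSLAPGATNEKLFF\tTRBV7-2\tP1\t12\n"
    "CASSIRSSYEQYF\tTRBV19\tP2\t3\n"
)


def read_tcr_table(text):
    """Parse a TSV string into a list of dicts restricted to known columns."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    if "CDR3b" not in (reader.fieldnames or []):
        raise ValueError("a CDR3b column is required")
    records = []
    for row in reader:
        record = {k: row[k] for k in KNOWN_COLUMNS if k in row}
        if "counts" in record:
            record["counts"] = int(record["counts"])  # clone counts as integers
        records.append(record)
    return records


records = read_tcr_table(RAW)
```

Passing `records` to `pandas.DataFrame` would then give a frame with the `CDR3b`, `TRBV`, `patient` and `counts` columns.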
R parity
tests/ runs the same input through turboGliph (R) and pygliph
and asserts agreement. The deterministic parts are bit-exact:
- find_motifs continuous and discontinuous motif counts;
- GLIPH2 local-motif hypergeometric ("Fisher") p-values, fold change, selected motifs and their position ranges;
- GLIPH2 global-similarity structures and their Fisher scores;
- the clone-network edge list (directed 4-tuples);
- cluster membership, sizes and cluster-level Fisher scores;
- de_novo_TCRs position weight matrices and sample-sequence scores.
Unavoidably approximate parts: any score derived from random
resampling depends on the RNG. This covers GLIPH v1's
repeated-random-sampling local-motif selection and the total.score,
length, V-gene and clonal-expansion sub-scores in both algorithms.
R's sample.int (Mersenne-Twister) and NumPy's PCG64 produce different
draws, so these are checked for high set and rank agreement
(correlation > 0.95) rather than bit equality.
Fix random_state for reproducible Python runs.
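One way to quantify "set / rank agreement" is a Jaccard overlap for selected motif sets and a Spearman rank correlation for scores. The sketch below is an assumption about how such a check could look, not the code in tests/; the helper names and all numbers are invented for illustration.

```python
from math import sqrt


def jaccard(a, b):
    """Set overlap: |A and B| / |A or B|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


def average_ranks(values):
    """1-based ranks, with ties sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(vx * vy)


# Invented scores for the same five clusters from an R run and a Python run.
r_scores = [0.91, 0.40, 0.75, 0.10, 0.62]
py_scores = [0.89, 0.43, 0.77, 0.12, 0.58]
rank_agreement = spearman(r_scores, py_scores)

# Invented motif selections from the two runs.
set_agreement = jaccard({"SLAP", "IRS", "GDT"}, {"SLAP", "IRS", "RSS"})
```

In this toy case the two score lists rank the clusters identically even though no value matches exactly, which is the situation the 0.95 threshold is meant to accept.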
References
- Glanville, J. et al. Identifying specificity groups in the T cell receptor repertoire. Nature 547, 94–98 (2017).
- Huang, H. et al. Analyzing the Mycobacterium tuberculosis immune response by T-cell receptor clustering with GLIPH2 and genome-wide antigen screening. Nat. Biotechnol. 38, 1194–1202 (2020).
- turboGliph: https://github.com/HetzDra/turboGliph
License
Apache-2.0. The upstream R package turboGliph is GPL-3; pygliph is an
independent re-implementation from the published algorithms and the
turboGliph source, and ships the small reference data tables exported
from turboGliph for parity testing.