Fast Python implementation of confind — protein side-chain contact-degree analysis.
pyconfind
A modern Python implementation of confind — the rotamer-based protein side-chain contact-degree analysis introduced in Zheng & Grigoryan's work on tertiary structural motifs.
The Python output is byte-for-byte identical to the upstream C++ binary on 248 of 253 real structures tested (100 single-chain PDB + 100 AlphaFold DB + 50 multi-chain + 3 high-resolution; see docs/stress_test_results.md) and on all 11 example structures shipped with the original codebase. The 5 exceptions are structures with insertion codes, where the C++ output ordering relies on undefined behavior; these cases are documented.
pyconfind is also faster than the C++ binary, with two interchangeable contact-degree backends (both byte-identical to the reference):
- a pure NumPy/SciPy reference, which on its own already beats the C++ binary;
- an optional Numba JIT/multi-threaded backend (pip install pyconfind[fast]) that is ~2-3× faster again.
With the Numba backend and the rotamer library amortized across a batch, the per-structure analysis is ~8-18× faster than the C++ binary.
Runtime scales sub-quadratically with sequence length (the CA-distance cutoff bounds each residue's neighbor count). See docs/benchmark.md for details.
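To illustrate why a distance cutoff keeps the neighbor count per residue bounded, here is a minimal pure-Python sketch of a cell-list neighbor search. It is not pyconfind's implementation: the function name `neighbors_within`, the 10.0 cutoff, and the idealized straight-chain coordinates are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import product
from math import dist, floor


def neighbors_within(coords, cutoff):
    """Return {i: sorted list of j} for every pair of points closer than `cutoff`."""
    # Bin each point into a cubic cell whose edge length equals the cutoff,
    # so any neighbor within the cutoff lies in the same or an adjacent cell.
    cells = defaultdict(list)
    for idx, (x, y, z) in enumerate(coords):
        cells[(floor(x / cutoff), floor(y / cutoff), floor(z / cutoff))].append(idx)

    result = {i: [] for i in range(len(coords))}
    offsets = list(product((-1, 0, 1), repeat=3))
    for (cx, cy, cz), members in cells.items():
        for dx, dy, dz in offsets:
            # .get() avoids creating empty cells while we iterate.
            others = cells.get((cx + dx, cy + dy, cz + dz), ())
            for i in members:
                for j in others:
                    if i != j and dist(coords[i], coords[j]) < cutoff:
                        result[i].append(j)
    return {i: sorted(js) for i, js in result.items()}


# Hypothetical input: 200 points spaced 3.8 units apart along the x axis.
chain = [(3.8 * i, 0.0, 0.0) for i in range(200)]
neighbors = neighbors_within(chain, cutoff=10.0)
print(max(len(js) for js in neighbors.values()))
```

In this toy chain the largest neighbor list stays the same size no matter how long the chain gets, which is the property that keeps the pairwise work from growing quadratically.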
Install
pip install -e ".[dev]" # includes the Numba fast backend
# or, runtime only:
pip install -e . # pure-Python reference backend
pip install -e ".[fast]" # + Numba backend
Example notebook
examples/pyconfind_demo.ipynb is a runnable
walkthrough (install → fetch a PDB → analyze via the library API → visualize a
contact map, per-residue scores, and a 3D structure colored by contact degree).
It runs on a free Colab CPU runtime.
Quick start
CLI (matches the original confind flag names, so existing pipelines drop in):
pyconfind --p input.pdb --rLib path/to/rotlibs --o out.cont
# Inputs may be PDB or mmCIF (format auto-detected via gemmi):
pyconfind --p input.cif --rLib path/to/rotlibs --o out.cont
# Modern structured output:
pyconfind --p input.pdb --rLib path/to/rotlibs --json --o out.json
# Only consider the native AA at each position (no AA substitution):
pyconfind --p input.pdb --rLib path/to/rotlibs --native-only --o out.cont
# Restrict the computed/output residues (MSL selection language):
pyconfind --p input.pdb --rLib path/to/rotlibs --sel "chain A AND resi 20-60" --o out.cont
# Pre-select part of the structure before anything runs:
pyconfind --p input.pdb --rLib path/to/rotlibs --psel "NAME CA WITHIN 25 OF CHAIN A" --o out.cont
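For running the CLI over a whole directory, a small wrapper can build one command per structure. The sketch below uses only the --p, --rLib, and --o flags shown above; the helper name `build_commands`, the directory layout, and the `.cont` output naming are assumptions for illustration, not part of pyconfind.

```python
from pathlib import Path


def build_commands(structure_dir, rotlib, out_dir):
    """Build one `pyconfind` argument list per .pdb/.cif file in a directory."""
    commands = []
    for path in sorted(Path(structure_dir).iterdir()):
        # Skip anything that is not a PDB or mmCIF file.
        if path.suffix.lower() not in {".pdb", ".cif"}:
            continue
        out_path = Path(out_dir) / (path.stem + ".cont")
        commands.append(
            ["pyconfind", "--p", str(path), "--rLib", str(rotlib), "--o", str(out_path)]
        )
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)`. Note that every CLI invocation is a separate process, so this wrapper says nothing about sharing the rotamer library between structures.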
Library API:
from pyconfind import analyze, format_confind_text

result = analyze("input.pdb", rotamer_library="path/to/rotlibs")
print(format_confind_text(result.positions, result.report))

# Inspect raw contacts:
for c in result.report.contacts:
    pi, pj = result.positions[c.pos_i], result.positions[c.pos_j]
    print(f"{pi.position.chain},{pi.position.resnum} <-> "
          f"{pj.position.chain},{pj.position.resnum}: degree={c.degree}")
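The contact list can be turned into a dense contact map for plotting. The sketch below is self-contained and uses a stand-in `Contact` dataclass that mirrors only the `pos_i`, `pos_j`, and `degree` attributes used above. The helper name `contact_matrix` is hypothetical, and mirroring each value across the diagonal is an assumption made for display purposes.

```python
from dataclasses import dataclass


@dataclass
class Contact:
    """Stand-in for a pyconfind contact record (hypothetical, for illustration)."""
    pos_i: int
    pos_j: int
    degree: float


def contact_matrix(n_positions, contacts):
    """Return an n x n list of lists with each degree mirrored across the diagonal."""
    matrix = [[0.0] * n_positions for _ in range(n_positions)]
    for c in contacts:
        matrix[c.pos_i][c.pos_j] = c.degree
        matrix[c.pos_j][c.pos_i] = c.degree
    return matrix


# Toy data, not real pyconfind output.
demo = [Contact(0, 2, 0.35), Contact(1, 2, 0.10)]
for row in contact_matrix(3, demo):
    print(row)
```

With a real result, something like `contact_matrix(len(result.positions), result.report.contacts)` should work, assuming `result.positions` supports `len()`.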
Rotamer libraries
Out of the box, pyconfind supports the Dunbrack 2010 MSL-format library that
ships with the upstream confind source (EBL.out + BEBL.out). Point
--rLib at a directory containing both files (backbone-dependent) or at a
single EBL.out-style file (backbone-independent).
Modern Dunbrack and Richardson-style libraries are next on the roadmap.
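Before launching a long run, it can help to check what kind of library a path points to. The following is a minimal sketch of the directory-versus-file rule described above; `describe_rotamer_library` is a hypothetical helper, not a pyconfind function, and pyconfind's own validation may differ.

```python
from pathlib import Path


def describe_rotamer_library(path):
    """Classify a --rLib path using the directory/file rule described above."""
    p = Path(path)
    if p.is_dir():
        # A directory is expected to hold both library files.
        missing = [name for name in ("EBL.out", "BEBL.out") if not (p / name).is_file()]
        if missing:
            raise FileNotFoundError(f"{p} is missing: {', '.join(missing)}")
        return "backbone-dependent"
    if p.is_file():
        # A single EBL.out-style file.
        return "backbone-independent"
    raise FileNotFoundError(f"no rotamer library at {p}")
```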
Native-only mode (extension over the C++ binary)
The original C++ confind substitutes in all 18 non-Gly/Pro amino acids at
every position and computes contact degree across the full rotamer space.
pyconfind adds --native-only: at each position, only place rotamers of the
native amino acid (but still consider every rotamer of that AA). Useful when
you want a contact-degree estimate that holds the sequence fixed.
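The difference between the two modes comes down to which amino acids are placed at each position. The toy sketch below shows that choice only; `candidate_amino_acids` is a hypothetical name, and how pyconfind treats a native Gly or Pro in native-only mode is not covered here.

```python
# The 18 amino acids other than Gly and Pro, as three-letter codes.
SUBSTITUTED = (
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "SER", "THR", "TRP", "TYR", "VAL",
)


def candidate_amino_acids(native, native_only=False):
    """Return the amino acids whose rotamers would be placed at one position."""
    if native_only:
        # Hold the sequence fixed: only the native residue type is considered.
        return (native,)
    # Default behavior: every non-Gly/Pro amino acid is substituted in.
    return SUBSTITUTED


print(len(candidate_amino_acids("LEU")))
print(candidate_amino_acids("LEU", native_only=True))
```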
Validation
The C++ reference binary is built from the upstream tarball by:
scripts/build-reference.sh
The byte-identity tests then compare pyconfind's output against the C++ output on every example PDB. To run them yourself:
pytest tests/
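Outside the test suite, a byte-for-byte comparison of two output directories can be sketched with the standard library. This is not the project's test code: the helper name `mismatched_outputs`, the two-directory layout, and the `*.cont` pattern are assumptions for illustration.

```python
import filecmp
from pathlib import Path


def mismatched_outputs(python_dir, reference_dir, pattern="*.cont"):
    """Return names of files that differ from, or are missing in, the reference."""
    mismatches = []
    for py_file in sorted(Path(python_dir).glob(pattern)):
        ref_file = Path(reference_dir) / py_file.name
        # shallow=False forces a comparison of file contents, not just metadata.
        if not ref_file.is_file() or not filecmp.cmp(py_file, ref_file, shallow=False):
            mismatches.append(py_file.name)
    return mismatches
```

An empty list means every output file found in `python_dir` has a byte-identical counterpart in `reference_dir`.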
References
- "Sequence statistics of tertiary structural motifs reflect protein stability", F. Zheng, G. Grigoryan, PLoS ONE, 12(5): e0178272, 2017.
- "Tertiary Structural Propensities Reveal Fundamental Sequence/Structure Relationships", F. Zheng, J. Zhang, G. Grigoryan, Structure, 23(5): 961-971, 2015.