Procustes

procustes truncates a protein around a ligand using cutoff-based residue selection.

CLI

procustes INPUT_STRUCTURE OUTPUT_DIR [options]

Required positional arguments:

INPUT_STRUCTURE: input .pdb or .cif containing protein + ligand
OUTPUT_DIR: base output directory where all outputs are written

Options:

--ligand: ligand residue name (default: LIG)
--cutoff: cutoff distance in angstrom (default: 4.0)
--ca: use only alpha-carbon distances (default: use any residue atom)
--fill-gaplength: internal removed gaps shorter than this value are restored to original residues; gaps at or above it are considered for alanine-based filling (default: 4)
--extra-residues: comma-separated extra protein residues to force-keep before gap logic (RESID for single-chain inputs, CHAIN:RESID for multi-chain; spaces/trailing commas are accepted)
--nofill: disable long-gap filling
--caps: add ACE/NME caps to all no-fill biopolymer chains (requires --nofill)
--fill-method: filling backend: pdbfixer or boltz (default: pdbfixer)
--fill-models-count: number of Boltz fill candidates per cutoff (default: 3, max: 20)
--aa-length: residue spacing used to estimate minimum bridge alanines from terminal CA distance (default: 4.0)
--boltz-cache: optional Boltz cache path
--boltz-diffusion-samples: diffusion samples passed to boltz predict (default: 1)
--boltz-devices: device count passed to boltz predict (default: 1)
--boltz-accelerator: accelerator passed to boltz predict: cpu, gpu, tpu (default: gpu)
--boltz-use-msa-server: pass --use_msa_server to boltz predict
--no-boltz-potentials: disable Boltz --use_potentials (enabled by default)
--boltz-template-threshold: template force threshold written in Boltz YAML (default: 0.1)
--color: colorized progress output mode: auto (default), always, never
--quiet: disable progress output
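
The --extra-residues format can be illustrated with a small parser. This is only a sketch of the syntax described above, not procustes' own code: the function name parse_extra_residues, the (chain, resid) tuple layout, and the choice to reject a bare RESID on multi-chain input are all assumptions.

```python
from typing import List, Optional, Tuple


def parse_extra_residues(spec: str, multi_chain: bool) -> List[Tuple[Optional[str], int]]:
    """Parse 'RESID' or 'CHAIN:RESID' tokens (hypothetical helper).

    Spaces around tokens and trailing commas are tolerated.
    """
    selections = []
    for token in spec.split(","):
        token = token.strip()
        if not token:  # empty piece left behind by a trailing comma
            continue
        if ":" in token:
            chain, resid = token.split(":", 1)
            selections.append((chain.strip(), int(resid)))
        elif multi_chain:
            # Assumption: a bare RESID is ambiguous when several chains exist.
            raise ValueError(f"multi-chain input needs CHAIN:RESID, got {token!r}")
        else:
            selections.append((None, int(token)))
    return selections


print(parse_extra_residues("45, 102,", multi_chain=False))
print(parse_extra_residues("A:45, B:7", multi_chain=True))
```

Insertion codes and negative residue numbers written with extra punctuation are not handled in this sketch.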

Outputs are written directly under OUTPUT_DIR:

OUTPUT_DIR/
  _boltz/
    cutoff_<cutoff>_template.pdb
    cutoff_<cutoff>_<candidate>/
      <job>.yaml
      predictions/...
  a<cutoff>truncated.pdb
  b<cutoff>truncated.pdb
  ...
  <cutoff>truncated.pdb
  summary.json

OUTPUT_DIR is created if missing, but only when its parent directory already exists. If the parent path does not exist, procustes fails with an error.
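
The directory rule above matches what pathlib does when parent creation is left off. A minimal sketch, assuming a hypothetical helper named ensure_output_dir and FileNotFoundError as the error type (the source does not say which exception procustes raises):

```python
from pathlib import Path


def ensure_output_dir(output_dir: str) -> Path:
    """Create output_dir if missing, but never create its parents."""
    path = Path(output_dir)
    if not path.parent.exists():
        raise FileNotFoundError(f"parent directory does not exist: {path.parent}")
    path.mkdir(exist_ok=True)  # parents defaults to False
    return path
```

With this behavior, a typo in a long output path fails early instead of silently creating a new directory tree.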

summary.json is written once per run. It records the run parameters (including extra_residues_requested) and a cutoffs array with a single entry that holds residue counts, candidate scores, winning candidate metadata, and extra_residues_applied.

When --nofill is set, Boltz is skipped and only <cutoff>truncated.pdb is written.

When both --nofill and --caps are set, every resulting protein chain is capped with ACE and NME, chain IDs are reassigned deterministically starting at A, and small-molecule binder chain IDs are reassigned from X to avoid collisions.

If --nofill is set, custom fill arguments (--fill-models-count, --aa-length, or any --boltz-* option) raise an error.

If --caps is set without --nofill, procustes raises an error.

If --fill-method pdbfixer is selected, any --boltz-* options raise an error.
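
The three error rules above can be summarized as one check. The sketch below is an illustration under assumptions: check_option_conflicts is a hypothetical name, explicit stands for the set of option names the user actually typed, and ValueError stands in for whatever error procustes raises. Only options with the literal --boltz- prefix are treated as Boltz options here; how --no-boltz-potentials is classified is not stated in the text.

```python
def check_option_conflicts(explicit, nofill=False, caps=False, fill_method="pdbfixer"):
    """Raise ValueError for the flag combinations described as errors."""
    explicit = set(explicit)
    boltz_opts = sorted(o for o in explicit if o.startswith("--boltz-"))
    fill_opts = sorted(explicit & {"--fill-models-count", "--aa-length"}) + boltz_opts

    if caps and not nofill:
        raise ValueError("--caps requires --nofill")
    if nofill and fill_opts:
        raise ValueError(f"--nofill cannot be combined with: {', '.join(fill_opts)}")
    if fill_method == "pdbfixer" and boltz_opts:
        raise ValueError(f"--fill-method pdbfixer rejects: {', '.join(boltz_opts)}")


# Accepted: capping in no-fill mode, and Boltz options with the Boltz backend.
check_option_conflicts(set(), nofill=True, caps=True)
check_option_conflicts({"--boltz-cache"}, fill_method="boltz")
```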

Final output normalization is always applied to <cutoff>truncated.pdb: ligand/small-molecule residues are written first, small-molecule chain IDs are assigned from X, and biopolymer chains are assigned from A to avoid chain-ID collisions.
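
As a rough illustration of this normalization, the sketch below hands out chain IDs for a given number of small-molecule and biopolymer chains. The function name assign_chain_ids and the behavior once letters run out are assumptions; the source only states that small molecules start at X and biopolymers start at A.

```python
import string


def assign_chain_ids(n_small_molecules: int, n_biopolymers: int):
    """Return (small_molecule_ids, biopolymer_ids) without collisions."""
    letters = string.ascii_uppercase
    start = letters.index("X")
    if n_small_molecules > len(letters) - start:
        # Assumption: this sketch simply refuses more than X, Y, Z.
        raise ValueError("too many small-molecule chains for this sketch")
    small_ids = list(letters[start:start + n_small_molecules])

    # Biopolymers take letters from A, skipping any ID already used above.
    free = [c for c in letters if c not in small_ids]
    if n_biopolymers > len(free):
        raise ValueError("too many biopolymer chains for this sketch")
    return small_ids, free[:n_biopolymers]


print(assign_chain_ids(1, 2))
```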

During CLI execution, procustes prints per-cutoff stage logs (residue selection, detected gap ranges/lengths, Boltz command invocation, candidate scores) plus final summaries with kept residues, alanine-filled residues, elapsed time, and output file path.
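
The gap ranges and lengths reported in these logs follow from the --fill-gaplength and --aa-length options. The sketch below shows one way to derive them; it is not taken from procustes itself. The helper names are hypothetical, and the bridge formula is an assumed reading of "minimum bridge alanines from terminal CA distance": n inserted residues give n + 1 CA-to-CA steps, so n must satisfy (n + 1) * aa_length >= distance.

```python
import math


def find_internal_gaps(kept_resids, fill_gaplength=4):
    """List (first, last, length, action) for gaps between kept residues."""
    gaps = []
    ordered = sorted(set(kept_resids))
    for prev, nxt in zip(ordered, ordered[1:]):
        length = nxt - prev - 1
        if length > 0:
            # Shorter than the threshold: restore original residues.
            # At or above it: candidate for alanine-based filling.
            action = "restore" if length < fill_gaplength else "fill"
            gaps.append((prev + 1, nxt - 1, length, action))
    return gaps


def min_bridge_alanines(ca_distance, aa_length=4.0):
    """Assumed estimate: fewest residues whose spacing can span the distance."""
    return max(0, math.ceil(ca_distance / aa_length) - 1)


print(find_internal_gaps([10, 11, 13, 14, 20]))
print(min_bridge_alanines(10.0))
```

Only internal gaps are considered; residues removed before the first or after the last kept residue are not gaps in this sense.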

For Boltz fill runs (--fill-method boltz), each candidate YAML includes a templates entry pointing to OUTPUT_DIR/_boltz/cutoff_<cutoff>_template.pdb (protein after short-gap restoration), with chain_id, template_id, force: true, and threshold so Boltz can enforce template guidance while modeling alanine bridge regions.
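
A template entry of this kind might be assembled as follows before being serialized to YAML. The fields chain_id, template_id, force, and threshold come from the description above; the pdb key for the file path, the helper name build_template_entry, and the way <cutoff> is formatted in the file name are assumptions that should be checked against the Boltz input documentation and the generated <job>.yaml.

```python
from pathlib import Path


def build_template_entry(output_dir, cutoff, chain_id, template_id, threshold=0.1):
    """Build one item for the 'templates' list of a Boltz input file."""
    template_path = Path(output_dir) / "_boltz" / f"cutoff_{cutoff}_template.pdb"
    return {
        "pdb": str(template_path),  # assumed key name for a PDB template
        "chain_id": chain_id,
        "template_id": template_id,
        "force": True,              # ask Boltz to enforce the template
        "threshold": threshold,     # --boltz-template-threshold
    }


entry = build_template_entry("out", "9", chain_id="A", template_id="A")
print(entry)
```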

After each Boltz candidate model is generated, procustes aligns it to the cutoff template with MDAnalysis using only non-inserted residues (the original kept residues, excluding alanine bridge insertions), then grafts template coordinates for those non-gap residues before merging ligand atoms.
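
The grafting step can be pictured with plain dictionaries. This sketch assumes the candidate has already been superposed onto the template (the MDAnalysis alignment is omitted), that both structures share residue numbering, and that atoms are keyed by (chain, resid, atom name); none of these data structures come from procustes.

```python
def graft_template_coords(candidate, template, inserted_residues):
    """Merge coordinates: template for kept residues, model for bridges.

    candidate, template: dict mapping (chain, resid, atom) -> (x, y, z)
    inserted_residues: set of (chain, resid) for alanine bridge insertions
    """
    merged = {}
    for key, coord in candidate.items():
        if key[:2] in inserted_residues:
            merged[key] = coord  # modeled bridge residue: keep Boltz coordinates
        else:
            # Original kept residue: prefer template coordinates when present.
            merged[key] = template.get(key, coord)
    return merged


candidate = {("A", 10, "CA"): (1.0, 1.0, 1.0), ("A", 11, "CA"): (2.0, 2.0, 2.0)}
template = {("A", 10, "CA"): (0.0, 0.0, 0.0)}
print(graft_template_coords(candidate, template, {("A", 11)}))
```

Ligand atoms would be merged afterwards, as described above.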

Integration reference workflow

The TYK2 end-to-end integration suite lives in tests/integration/test_tyk2_end_to_end.py and validates four compressed fixtures (ejm31, ejm42, jmc27, jmc28) by running the full CLI entrypoint in-process.

Reference artifacts are stored under tests/reference/<complex>/ as:

9truncated.pdb (byte-for-byte comparison after stripping hydrogen records, to avoid OpenMM/PDBFixer hydrogen-placement nondeterminism)
summary.json (field-aware JSON comparison)
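
A hydrogen-stripped comparison along these lines could look like the sketch below. It is an assumption about the approach, not the test suite's code: the helper names are hypothetical, and hydrogens are detected only through the element field in columns 77-78 of ATOM/HETATM records, which must be populated for this to work.

```python
def strip_hydrogens(pdb_text: str) -> str:
    """Drop ATOM/HETATM records whose element field is H."""
    kept = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            element = line[76:78].strip().upper()
            if element == "H":
                continue
        kept.append(line)
    return "\n".join(kept) + "\n"


def same_heavy_atoms(pdb_a: str, pdb_b: str) -> bool:
    """Byte-for-byte comparison after hydrogen records are removed."""
    return strip_hydrogens(pdb_a).encode() == strip_hydrogens(pdb_b).encode()
```

Atom serial numbers are left untouched, so two files that differ only in how many hydrogens were placed before a heavy atom would still compare as different under this sketch.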

To regenerate these references intentionally (one-time baseline refresh), run:

uv run --extra dev python scripts/generate_tyk2_references.py

By default, integration test temporary directories are deleted. Set PROCUSTES_KEEP_ITEST_TMP=1 to retain them for debugging.
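
One way such a switch is commonly implemented is a context manager that skips cleanup when the variable is set. This is a sketch, not the suite's fixture: the name integration_tmpdir and the directory prefix are hypothetical, and it treats only the exact value 1 as enabling retention.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def integration_tmpdir(environ=None):
    """Yield a temp dir; delete it unless PROCUSTES_KEEP_ITEST_TMP=1."""
    env = os.environ if environ is None else environ
    path = tempfile.mkdtemp(prefix="procustes_itest_")
    try:
        yield path
    finally:
        if env.get("PROCUSTES_KEEP_ITEST_TMP") != "1":
            shutil.rmtree(path, ignore_errors=True)
```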

Development

Use the project dev environment with uv:

uv sync --extra dev

Run formatting and linting:

uv run --extra dev ruff format src tests scripts
uv run --extra dev ruff check src tests scripts

Run tests:

uv run --extra dev pytest -q

Run only the TYK2 integration tests:

uv run --extra dev pytest -q tests/integration

PyPI Release

Tag-based releases use hatch-vcs dynamic versioning and upload wheel-only artifacts.

Prerequisites:

clean git working tree
local branch fully synced with upstream
~/.pypirc configured for [pypi] credentials
git and uv available on PATH

Run:

python scripts/release_pypi.py X.Y.Z

The release script will:

validate X.Y.Z format
verify git cleanliness and upstream sync
ensure tag does not already exist locally/remotely
create annotated tag X.Y.Z
build exactly one wheel into dist/ (no sdist)
upload only that wheel via twine using ~/.pypirc pypi section
push the release tag to origin
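
The first step, validating the X.Y.Z format, might be as simple as the sketch below. The exact pattern used by scripts/release_pypi.py is not shown in this document, so the regular expression (three dot-separated integers, no v prefix, no pre-release suffix) and the function name validate_version are assumptions.

```python
import re

_VERSION_RE = re.compile(r"\d+\.\d+\.\d+")


def validate_version(version: str) -> str:
    """Return version unchanged if it looks like X.Y.Z, else raise."""
    if not _VERSION_RE.fullmatch(version):
        raise ValueError(f"expected X.Y.Z, got {version!r}")
    return version


print(validate_version("0.1.0"))
```

Because the tag name feeds hatch-vcs dynamic versioning, rejecting malformed input before any tag is created keeps a bad version string from reaching the built wheel.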

This version: 0.1.0 (Mar 5, 2026)

File details

Details for the file procustes-0.1.0-py3-none-any.whl.

File metadata

Download URL: procustes-0.1.0-py3-none-any.whl
Upload date: Mar 5, 2026
Size: 29.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.8

File hashes

Hashes for procustes-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`661b25b196736650f5a9a30e0afe0b6b631aa07142bf9f3c571c3cb4078e8229`
MD5	`19e6da3470fc3e59e303ec920008cb21`
BLAKE2b-256	`b63350f1fcc5084d59c8d132c540b9046b3d94600704f1415a4c9584bc4062bf`
