Beta-barrel-like chain detector for PDB/mmCIF structures.

Cooper-Beta

Cooper-Beta is named after the cooper, a traditional barrel maker, reflecting its focus on beta-barrel fold detection. It detects beta-barrel-like protein chains in PDB, CIF, and mmCIF structures: it parses each structure with Biopython, runs DSSP, slices the beta-sheet C-alpha coordinates, fits ellipses to the cross sections, applies geometric consistency rules, and returns chain-level results.
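
To give a feel for the cross-section step, the sketch below estimates an ellipse from one ring of 2D points using second moments. This is an illustration only: the function name ellipse_from_moments is hypothetical, and the moment-based estimate is a stand-in technique, not Cooper-Beta's actual fitting routine. It assumes the points are spread roughly evenly around the ring.

```python
import math


def ellipse_from_moments(points):
    """Estimate the centre and semi-axes of a 2D ring of points.

    Assumption: points are spread roughly evenly around the ring, so the
    covariance eigenvalues are about a^2/2 and b^2/2 for semi-axes a >= b.
    """
    n = len(points)
    if n < 3:
        raise ValueError("need at least three points")
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    sxx = sum((x - cx) ** 2 for x, _ in points) / n
    syy = sum((y - cy) ** 2 for _, y in points) / n
    sxy = sum((x - cx) * (y - cy) for x, y in points) / n
    # Closed-form eigenvalues of the 2x2 covariance [[sxx, sxy], [sxy, syy]].
    mean = (sxx + syy) / 2.0
    spread = math.hypot((sxx - syy) / 2.0, sxy)
    major = math.sqrt(2.0 * (mean + spread))
    minor = math.sqrt(2.0 * max(mean - spread, 0.0))
    return (cx, cy), major, minor


# Synthetic cross section: 12 points on an ellipse with semi-axes 3 and 2.
ring = [
    (1 + 3 * math.cos(2 * math.pi * k / 12), -2 + 2 * math.sin(2 * math.pi * k / 12))
    for k in range(12)
]
centre, major, minor = ellipse_from_moments(ring)
print(centre, round(major, 6), round(minor, 6))
```

A ratio of minor to major close to 1 indicates a nearly circular cross section; how Cooper-Beta scores such layers is governed by its own rules and thresholds.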

Quick Start

Cooper-Beta requires Python 3.10 or newer and a DSSP executable (mkdssp or dssp) on PATH.

pip install cooper-beta
cooper-beta --check-env
cooper-beta path/to/structures --out cooper_beta_results.csv

If DSSP is installed outside PATH, pass its location as a configuration override:

cooper-beta path/to/structures runtime.dssp_bin_path=/absolute/path/to/mkdssp
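
If you want to verify the prerequisites from your own Python code before launching a run, a minimal sketch along these lines may help. It is not how cooper-beta --check-env is implemented; the helper names find_dssp and python_is_supported and the lookup order (mkdssp first, then dssp) are assumptions for illustration.

```python
import shutil
import sys


def find_dssp(candidates=("mkdssp", "dssp"), which=shutil.which):
    """Return the first DSSP executable found on PATH, or None.

    The `which` parameter exists only so the lookup can be replaced in tests.
    """
    for name in candidates:
        path = which(name)
        if path:
            return path
    return None


def python_is_supported(version_info=sys.version_info, minimum=(3, 10)):
    """Check the interpreter against the documented minimum (Python 3.10)."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("DSSP:", find_dssp() or "not found on PATH")
```

When find_dssp() returns None, pass the executable location through the runtime.dssp_bin_path override shown above.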

Installation

Install the detector:

pip install cooper-beta

Install optional tools:

pip install "cooper-beta[eval]"   # pandas for evaluation helpers
pip install "cooper-beta[scripts]" # dependencies for source-checkout helper scripts
pip install "cooper-beta[full]"   # all optional extras

For development from a source checkout:

pip install -e ".[full,dev]"

The repository also includes environment.yml and scripts/setup_env.sh for a Conda or Mamba environment that installs DSSP from conda-forge:

bash scripts/setup_env.sh --dev

Command Line

Run Cooper-Beta on a single file or a directory:

cooper-beta path/to/structure.cif --out results.csv
cooper-beta path/to/structures --workers 8 --prepare-workers 8 --out results.csv

Useful options:

  • --check-env: print the Python executable and resolved DSSP executable.
  • --workers: number of analysis worker processes.
  • --prepare-workers: number of structure-preparation worker processes.
  • --out: output CSV path.
  • --version: print the installed Cooper-Beta version.

Advanced configuration uses Hydra-style KEY=VALUE overrides:

cooper-beta path/to/structures \
  runtime.dssp_bin_path=/absolute/path/to/mkdssp \
  analyzer.rules.angle.max_gap_deg=160 \
  output.summary_limit=-1
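
Each dotted key addresses one field in a nested configuration. The sketch below shows that mapping with a simplified parser; it is not Hydra's override grammar (which is richer) and not Cooper-Beta's build_config. The names parse_overrides and _coerce are hypothetical.

```python
def _coerce(raw):
    """Turn numeric-looking strings into int or float; leave the rest as str."""
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    return raw


def parse_overrides(pairs):
    """Convert ["a.b=1", "c=x"] into {"a": {"b": 1}, "c": "x"}."""
    config = {}
    for pair in pairs:
        key, sep, raw = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {pair!r}")
        *parents, leaf = key.split(".")
        node = config
        for part in parents:
            # Walk down, creating intermediate sections as needed.
            node = node.setdefault(part, {})
        node[leaf] = _coerce(raw)
    return config


overrides = parse_overrides(
    [
        "runtime.dssp_bin_path=/absolute/path/to/mkdssp",
        "analyzer.rules.angle.max_gap_deg=160",
        "output.summary_limit=-1",
    ]
)
print(overrides)
```

The same dotted keys are what the Python API accepts in its overrides mapping.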

Python API

The recommended Python entry point is detect, which returns a structured PipelineRunResult. CSV output is written only when output is provided or write_csv=True.

from cooper_beta import detect

run = detect(
    "path/to/structures",
    workers=4,
    output="results.csv",
    overrides={"runtime.dssp_bin_path": "/usr/bin/mkdssp"},
)

print(run.result_counts)
for row in run.rows:
    print(row.filename, row.chain, row.result, row.reason)

For visualization or debugging of one chain, extract_chain_slices returns the aligned slice intersections and, optionally, the analyzer's selected sequence cores and layer diagnostics:

from cooper_beta import extract_chain_slices

bundle = extract_chain_slices(
    "path/to/structure.pdb",
    chain="A",
    include_raw_axis_slices=True,
    include_core_slices=True,
    include_layer_diagnostics=True,
)

print(bundle.informative_slices)
first_z, first_points = next(iter(bundle.raw_axis_slices.items()))
print(first_z, first_points)
print(bundle.layer_diagnostics[0].reason)

Public interfaces:

  • cooper_beta.detect(...): run detection and return structured results.
  • cooper_beta.extract_chain_slices(...): extract one chain's aligned slice intersections, optional core slices, and optional layer diagnostics.
  • cooper_beta.main(...): backward-compatible entry point returning row dicts.
  • cooper_beta.build_config(...): build an AppConfig from overrides.
  • cooper_beta.PipelineRunResult: complete run result with rows, input_files, output_path, and result_counts.
  • cooper_beta.ChainSliceBundle: one-chain slice extraction result.
  • cooper_beta.DetectionResult: one chain-level result row.
  • cooper_beta.ProteinLoader: parse structures and collect per-chain C-alpha and DSSP annotations.
  • cooper_beta.PCAAligner, ProteinSlicer, and BarrelAnalyzer: lower-level analysis components for custom workflows.

User-facing failures raise Cooper-Beta exceptions such as InputValidationError, DsspNotFoundError, DsspError, StructureParseError, and ChainNotFoundError.

Output

The result CSV includes one row per chain. Core columns include:

  • filename and chain
  • result: BARREL, NON_BARREL, FILTERED_OUT, or ERROR
  • result_stage and reason
  • decision_score, decision_basis, and decision_threshold
  • score_raw and score_adjust
  • valid_layers, scored_layers, total_layers, junk_layers, and invalid_layers
  • chain_residues, sheet_residues, and informative_slices
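
Because the output is plain CSV, it can be post-processed with the standard library alone. The sketch below tallies the result column and collects the chains labelled BARREL. The sample rows are made up for illustration and include only a few of the columns listed above; summarize_results is a hypothetical helper, not part of the package.

```python
import csv
import io
from collections import Counter


def summarize_results(csv_file):
    """Count rows per result label and collect (filename, chain) for barrels."""
    counts = Counter()
    barrels = []
    for row in csv.DictReader(csv_file):
        counts[row["result"]] += 1
        if row["result"] == "BARREL":
            barrels.append((row["filename"], row["chain"]))
    return counts, barrels


# Made-up rows standing in for a real results file.
sample = io.StringIO(
    "filename,chain,result,reason\n"
    "a.cif,A,BARREL,\n"
    "a.cif,B,NON_BARREL,\n"
    "b.pdb,A,FILTERED_OUT,\n"
)
counts, barrels = summarize_results(sample)
print(dict(counts))
print(barrels)
```

For a real run, open the file with open("results.csv", newline="") and pass the handle to summarize_results.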

Large directory runs use bounded prepare and analysis batches, and the CLI writes the CSV with deterministic file/chain ordering. When CSV output is enabled, Cooper-Beta also writes <results.csv>.manifest.json with the resolved config, config hash, package versions, input files, and DSSP executable metadata. The console summary is capped by default; set output.summary_limit=-1 to print every row.
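
The ideas behind deterministic ordering and a config hash can be sketched in a few lines. The manifest format and hashing scheme used by Cooper-Beta are not documented here, so the choices below (SHA-256 over canonical JSON, sorting by filename then chain) are assumptions, and sort_rows and config_hash are hypothetical names.

```python
import hashlib
import json


def sort_rows(rows):
    """Order rows by (filename, chain) so repeated runs list them identically."""
    return sorted(rows, key=lambda row: (row["filename"], row["chain"]))


def config_hash(config):
    """Hash a config dict independently of key insertion order.

    Assumption: SHA-256 over JSON with sorted keys; the real scheme may differ.
    """
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


rows = [
    {"filename": "b.pdb", "chain": "A"},
    {"filename": "a.cif", "chain": "B"},
    {"filename": "a.cif", "chain": "A"},
]
print(sort_rows(rows))
print(config_hash({"output": {"summary_limit": -1}}))
```

Recording such a hash next to the results makes it easy to tell whether two CSV files were produced with the same settings.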

Evaluation Helpers

Evaluation utilities are available after installing cooper-beta[eval]:

cooper-beta-eval \
  --positives path/to/positive-structures \
  --negatives path/to/negative-structures \
  --save-dir evaluation-results
python -m cooper_beta.evaluation \
  --positives path/to/positive-structures \
  --negatives path/to/negative-structures \
  --ablation
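
Conceptually, an evaluation over a positive and a negative set reduces to a confusion matrix. The sketch below computes one from lists of result labels; it does not reproduce the output of cooper-beta-eval, binary_metrics is a hypothetical helper, and treating every label other than BARREL (including FILTERED_OUT and ERROR) as a negative prediction is an assumption.

```python
def binary_metrics(positive_results, negative_results, positive_label="BARREL"):
    """Confusion-matrix counts plus precision and recall.

    positive_results: result labels for chains known to be barrels.
    negative_results: result labels for chains known not to be barrels.
    """
    tp = sum(1 for label in positive_results if label == positive_label)
    fn = len(positive_results) - tp
    fp = sum(1 for label in negative_results if label == positive_label)
    tn = len(negative_results) - fp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {
        "tp": tp,
        "fn": fn,
        "fp": fp,
        "tn": tn,
        "precision": precision,
        "recall": recall,
    }


# Made-up labels for illustration.
metrics = binary_metrics(
    ["BARREL", "BARREL", "NON_BARREL", "FILTERED_OUT"],
    ["NON_BARREL", "BARREL", "NON_BARREL"],
)
print(metrics)
```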

The GitHub repository also contains helper scripts for local datasets and Cooper-Beta CSV outputs. Local structure datasets, manual review notes, and research-only helper scripts are intentionally excluded from the package artifacts.

External Baselines

Evaluation-only adapters for external methods live in external_methods/:

  • isitabarrel_structure_map: generates structure-derived contact-map pickles from PDB/CIF/mmCIF inputs, invokes an external isitabarrel.py checkout, and normalizes its results.tsv output.
  • pred_tmbb2_single_juchmme: a sequence-only baseline that extracts chain FASTA records from structures, invokes an external JUCHMME/PRED-TMBB2 checkout, and normalizes topology-derived results.
  • foldseek_tmalign_structure_search: exports one structure file per chain, invokes an external Foldseek binary against a curated beta-barrel reference database using global TMalign mode, and normalizes TM-score-derived results.

The upstream AGPL-3.0/GPL-3.0 code is not vendored into Cooper-Beta. These adapters are kept for repository-level reproducibility and are intentionally excluded from PyPI package artifacts.

Changelog

0.1.1

  • Engineering-quality hardening release focused on reproducibility, packaging, and public interfaces.
  • Added deterministic result ordering, richer run manifests, safer prepare caching, and clearer decision audit fields.
  • Hardened public scripts, external baseline adapters, mmCIF/DSSP handling, release CI, and package validation.

0.1.0

  • Initial public release.
  • CLI and Python API for PDB/CIF/mmCIF beta-barrel-like chain detection.
  • DSSP-backed secondary-structure parsing.
  • Ellipse fitting, PCA axis search, geometric rules, and CSV output.
  • Evaluation helpers and ablation utilities.

License

Cooper-Beta is released under the MIT License. See LICENSE.
