CPU inference tools for CophyloForge cophylogeny scenario prediction

cophyloforge

cophyloforge is a CPU inference package for predicting cophylogeny scenario families from a host tree, a symbiont tree, and observed host-symbiont associations. It packages the end-user prediction workflow separately from the research simulator, dataset builder, and training code in this repository.

The default frozen model is archived on Zenodo:

https://doi.org/10.5281/zenodo.19656529

Installation

pip install cophyloforge

The package uses PyTorch for CPU inference. A GPU is not required.

Quickstart

cophyloforge download-model
cophyloforge predict \
  --host-tree host.nwk \
  --sym-tree sym.nwk \
  --associations assoc.tsv \
  --outdir results

The prediction command writes:

  • results/prediction.json
  • results/prediction.tsv
  • results/prediction_report.txt

The JSON and TSV outputs include the package version, model name, model version, model DOI/source, timestamp, input filenames, scenario prediction, scenario probabilities, tracking_score, switch_score, multi_host_fraction, difficulty_score, and recommended_abstain.
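
As a rough illustration of consuming these outputs, the sketch below loads `results/prediction.json` and builds a one-line summary. It assumes the listed fields are top-level JSON keys, that `scenario_probabilities` maps scenario names to probabilities, and that `recommended_abstain` is a boolean; none of this is confirmed by the description above. The helper names `load_prediction` and `summarize_prediction` are hypothetical.

```python
import json
from pathlib import Path


def load_prediction(outdir: str) -> dict:
    """Read prediction.json from an output directory."""
    path = Path(outdir) / "prediction.json"
    return json.loads(path.read_text(encoding="utf-8"))


def summarize_prediction(record: dict) -> str:
    """Build a one-line summary of a prediction record.

    Assumes scenario_probabilities is a {name: probability} mapping and
    recommended_abstain is a boolean (both are assumptions).
    """
    probs = record["scenario_probabilities"]
    top = max(probs, key=probs.get)  # scenario with the highest probability
    summary = f"{record['scenario_prediction']} (top: {top}, p={probs[top]:.2f})"
    if record["recommended_abstain"]:
        summary += " [abstain recommended]"
    return summary
```

Checking `recommended_abstain` before acting on `scenario_prediction` keeps flagged predictions from being reported silently.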

Input Files

cophyloforge expects:

  • a host tree in Newick format
  • a symbiont tree in Newick format
  • an association table or matrix in TSV or CSV format

The association file describes the observed biological links between host tips and symbiont tips. It usually comes from your interaction data, a spreadsheet, field observations, museum or sequence metadata, or literature curation. Tip names in this file must match the labels in the two Newick trees.

Two public association formats are supported.

Edge-list TSV, with one observed link per row:

host	symbiont
host_A	sym_1
host_B	sym_1
host_C	sym_2
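
The name-matching requirement can be checked by hand along these lines. This is a minimal sketch, not the package's validation logic: `read_edge_list` and `unmatched_names` are hypothetical helpers, and the tip label sets are assumed to come from whatever Newick parser you already use.

```python
import csv
import io


def read_edge_list(text: str) -> list[tuple[str, str]]:
    """Parse an edge-list TSV with 'host' and 'symbiont' header columns."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [(row["host"], row["symbiont"]) for row in reader]


def unmatched_names(edges, host_tips, sym_tips):
    """Return (hosts, symbionts) named in the edges but absent from the trees."""
    hosts = {host for host, _ in edges}
    syms = {sym for _, sym in edges}
    return sorted(hosts - set(host_tips)), sorted(syms - set(sym_tips))
```

Both returned lists being empty means every name in the association file was found among the supplied tip labels.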

Binary matrix TSV or CSV, with symbiont names in the first column and one column per host, headed by the host name:

symbiont	host_A	host_B	host_C
sym_1	1	1	0
sym_2	0	0	1
sym_3	0	0	0

Use 1, true, or any nonzero/nonempty value for a present association. Use 0, false, or a blank cell for no association.
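
One way to read the presence rule is sketched below, along with a conversion from the matrix format to edge-list pairs. The case-insensitive handling of `false` and the numeric treatment of values such as `0.0` are assumptions; the package's own parser may behave differently. `is_present` and `matrix_to_edges` are hypothetical names.

```python
import csv
import io


def is_present(cell: str) -> bool:
    """Interpret a matrix cell: blank, 0, and false mean no association."""
    value = cell.strip()
    if value == "" or value.lower() == "false":
        return False
    try:
        return float(value) != 0.0  # numeric cells: any nonzero value counts
    except ValueError:
        return True  # non-numeric, nonempty text such as "true"


def matrix_to_edges(text: str, delimiter: str = "\t") -> list[tuple[str, str]]:
    """Convert a symbiont-by-host matrix into (host, symbiont) pairs."""
    rows = [row for row in csv.reader(io.StringIO(text), delimiter=delimiter) if row]
    hosts = rows[0][1:]  # the first header cell labels the symbiont column
    edges = []
    for row in rows[1:]:
        symbiont, cells = row[0], row[1:]
        for host, cell in zip(hosts, cells):
            if is_present(cell):
                edges.append((host, symbiont))
    return edges
```

Applied to the matrix example above, this yields the same three links as the edge-list example; `sym_3` contributes none because its row is all zeros.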

Create a template from tree tip labels:

cophyloforge init-association-template \
  --host-tree host.nwk \
  --sym-tree sym.nwk \
  --format matrix \
  --out association_template.tsv

For edge lists, the template contains the required header. For matrices, it contains all symbiont rows and host columns with zero-filled cells.
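
For orientation, the zero-filled matrix layout described here can be sketched as follows. This only illustrates the expected shape of the file; it is not the `init-association-template` implementation, and `matrix_template` is a hypothetical helper that takes tip labels you have already extracted from the trees.

```python
def matrix_template(host_tips: list[str], sym_tips: list[str]) -> str:
    """Return a tab-separated matrix with one zero-filled row per symbiont."""
    lines = ["\t".join(["symbiont", *host_tips])]  # header row
    zeros = ["0"] * len(host_tips)
    for sym in sym_tips:
        lines.append("\t".join([sym, *zeros]))
    return "\n".join(lines) + "\n"
```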

Validate inputs before prediction:

cophyloforge validate-input \
  --host-tree host.nwk \
  --sym-tree sym.nwk \
  --associations assoc.tsv

CLI Reference

cophyloforge --help
cophyloforge version
cophyloforge download-model --model default
cophyloforge validate-input --host-tree host.nwk --sym-tree sym.nwk --associations assoc.tsv
cophyloforge predict --host-tree host.nwk --sym-tree sym.nwk --associations assoc.tsv --outdir results

Batch prediction from a manifest:

cophyloforge batch-predict \
  --manifest cases.csv \
  --outdir batch-results

The manifest must contain host_tree, sym_tree, and associations columns. An optional case_id column is used for output folder names.
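
A manifest with these columns can be generated with the standard `csv` module, for example from a directory of case folders. The sketch assumes each folder holds `host.nwk`, `sym.nwk`, and `assoc.tsv` and uses the folder name as `case_id`. Whether the tool resolves relative paths against the manifest location or the working directory is not stated here, so the sketch writes paths exactly as given. `write_manifest` is a hypothetical helper.

```python
import csv
from pathlib import Path

FIELDS = ["case_id", "host_tree", "sym_tree", "associations"]


def write_manifest(cases_dir: str, manifest_path: str) -> int:
    """Write one manifest row per case folder and return the row count."""
    rows = []
    for case in sorted(Path(cases_dir).iterdir()):
        if not case.is_dir():
            continue  # skip stray files next to the case folders
        rows.append({
            "case_id": case.name,
            "host_tree": str(case / "host.nwk"),
            "sym_tree": str(case / "sym.nwk"),
            "associations": str(case / "assoc.tsv"),
        })
    with open(manifest_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```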

Batch prediction from folders:

cophyloforge batch-predict \
  --input-dir cases \
  --outdir batch-results

Each case folder can contain either host.nwk, sym.nwk, and assoc.tsv, or the research dataset layout trees/host_obs_sampled.nwk, trees/sym_obs_sampled.nwk, and associations/assoc_obs.tsv.
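
To check in advance which layout a case folder uses, something like the following sketch may help. It tries the flat layout first and then the research dataset layout; that order of preference is an assumption, as is the hypothetical helper name `find_case_files`.

```python
from pathlib import Path

# (host tree, symbiont tree, associations), relative to the case folder
FLAT_LAYOUT = ("host.nwk", "sym.nwk", "assoc.tsv")
RESEARCH_LAYOUT = (
    "trees/host_obs_sampled.nwk",
    "trees/sym_obs_sampled.nwk",
    "associations/assoc_obs.tsv",
)


def find_case_files(case_dir: str):
    """Return the three input paths for a case folder, or None if incomplete."""
    base = Path(case_dir)
    for layout in (FLAT_LAYOUT, RESEARCH_LAYOUT):
        paths = tuple(base / rel for rel in layout)
        if all(path.is_file() for path in paths):
            return paths
    return None
```

A `None` result flags a folder that matches neither layout, which is worth catching before a long batch run.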

Python API

from cophyloforge import download_model, predict

download_model()

result = predict(
    host_tree="host.nwk",
    sym_tree="sym.nwk",
    associations="assoc.tsv",
    outdir="results",
)

print(result["scenario_prediction"])
print(result["scenario_probabilities"])

Model Cache

Model artifacts are not embedded in the wheel. cophyloforge download-model downloads the default Zenodo checkpoint into the user cache directory:

  • Linux: ~/.cache/cophyloforge
  • macOS: ~/Library/Caches/cophyloforge
  • Windows: %LOCALAPPDATA%\cophyloforge

Set COPHYLOFORGE_CACHE_DIR or pass --cache-dir to use a different cache location.
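
The lookup described above could be reproduced roughly as follows, for example to locate the cache from your own scripts. The precedence (explicit argument, then environment variable, then platform default) and the Windows fallback when `LOCALAPPDATA` is unset are assumptions, not documented behavior. `resolve_cache_dir` is a hypothetical helper.

```python
import os
import sys
from pathlib import Path


def resolve_cache_dir(cache_dir=None, env=None, platform=None, home=None) -> Path:
    """Guess the cache directory from an override, the environment, or the OS."""
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    home = Path.home() if home is None else Path(home)

    if cache_dir:  # explicit override, mirroring --cache-dir (assumed to win)
        return Path(cache_dir)
    if env.get("COPHYLOFORGE_CACHE_DIR"):
        return Path(env["COPHYLOFORGE_CACHE_DIR"])
    if platform == "darwin":
        return home / "Library" / "Caches" / "cophyloforge"
    if platform.startswith("win"):
        local = env.get("LOCALAPPDATA")
        base = Path(local) if local else home / "AppData" / "Local"  # assumed fallback
        return base / "cophyloforge"
    return home / ".cache" / "cophyloforge"
```

The `env`, `platform`, and `home` parameters exist only so the function can be exercised without touching the real environment.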

Citation

If you use the software or model, cite the Zenodo model record:

CoPhyloForge frozen inference model. Zenodo. https://doi.org/10.5281/zenodo.19656529

Also cite this repository or release when appropriate for the software version used.

Development

In a fresh virtual environment, install the package in editable mode and run the tests:

python -m pip install --upgrade pip
python -m pip install -e ".[dev]"
pytest

Build and check distributions:

python -m build
python -m twine check dist/*

The installable package uses a src/ layout and contains only inference-facing code. Research scripts, simulator code, generated datasets, training code, and checkpoint artifacts remain in the repository but are excluded from the package distribution.

For a manual-style guide covering end-user commands, developer checks, and release/publish workflow, see README_MANUAL.md.

Download files

Source Distribution

cophyloforge-0.1.2.tar.gz (27.7 kB)

Built Distribution

cophyloforge-0.1.2-py3-none-any.whl (25.2 kB)

File details

Details for the file cophyloforge-0.1.2.tar.gz.

File metadata

  • Download URL: cophyloforge-0.1.2.tar.gz
  • Size: 27.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for cophyloforge-0.1.2.tar.gz
Algorithm Hash digest
SHA256 56a183ea0a356d73491543d06efc3f6265945fe2491c8f125b93fe6206c26880
MD5 9ecad5f1278a230d91f71263d0b2a2b2
BLAKE2b-256 6ffb2729814757e27d1fe9a97208cf45e0269e308cd80a8768e594224c290064
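
A downloaded file can be compared against the SHA256 digest listed above using the standard library. This is a generic sketch; `sha256_of` is a hypothetical helper, and the local file path is whatever location you saved the archive to.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For example, `sha256_of("cophyloforge-0.1.2.tar.gz")` should equal the SHA256 value listed for that file if the download is intact.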

Provenance

The following attestation bundles were made for cophyloforge-0.1.2.tar.gz:

Publisher: publish.yml on sinhakrishnendu/CoPhyloForge

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file cophyloforge-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: cophyloforge-0.1.2-py3-none-any.whl
  • Size: 25.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for cophyloforge-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 dd63455b429d0e94b714da703c298461656474ff6c8c3072ae39ba5522a3e624
MD5 0d719340757342ce9d346a9b02ed5290
BLAKE2b-256 c9d5694675a99d2deb07414da96f3028079587584e7e3bb9a687d266067b6da5

Provenance

The following attestation bundles were made for cophyloforge-0.1.2-py3-none-any.whl:

Publisher: publish.yml on sinhakrishnendu/CoPhyloForge

