CPU inference tools for CophyloForge cophylogeny scenario prediction
cophyloforge
cophyloforge is a CPU inference package for predicting cophylogeny scenario families from a host tree, a symbiont tree, and observed host-symbiont associations. It packages the end-user prediction workflow separately from the research simulator, dataset builder, and training code in this repository.
The default frozen model is archived on Zenodo:
https://doi.org/10.5281/zenodo.19656529
Installation
pip install cophyloforge
The package uses PyTorch for CPU inference. A GPU is not required.
Quickstart
cophyloforge download-model
cophyloforge predict \
--host-tree host.nwk \
--sym-tree sym.nwk \
--associations assoc.tsv \
--outdir results
The prediction command writes:
- results/prediction.json
- results/prediction.tsv
- results/prediction_report.txt
The JSON and TSV outputs include the package version, model name, model version, model DOI/source, timestamp, input filenames, scenario prediction, scenario probabilities, tracking_score, switch_score, multi_host_fraction, difficulty_score, and recommended_abstain.
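As a sketch of how the JSON output might be consumed downstream, the helper below builds a one-line summary from a prediction record. It assumes the JSON keys match the field names listed above, that scenario_prediction is the key used for the predicted scenario (as in the Python API example), and that scenario_probabilities maps scenario names to probabilities. Check these assumptions against an actual prediction.json before relying on them.

```python
import json
from pathlib import Path


def summarize_prediction(record: dict) -> str:
    """Return a one-line summary of a prediction record.

    Assumes the record uses the field names listed in this README and
    that scenario_probabilities is a mapping of scenario name to probability.
    """
    scenario = record["scenario_prediction"]
    probability = record["scenario_probabilities"][scenario]
    flag = " (abstain recommended)" if record["recommended_abstain"] else ""
    return f"{scenario}: p={probability:.2f}{flag}"


if __name__ == "__main__":
    path = Path("results/prediction.json")
    if path.exists():
        print(summarize_prediction(json.loads(path.read_text())))
```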
Input Files
cophyloforge expects:
- a host tree in Newick format
- a symbiont tree in Newick format
- an association table or matrix in TSV or CSV format
The association file describes the observed biological links between host tips and symbiont tips. It usually comes from your interaction data, a spreadsheet, field observations, museum or sequence metadata, or literature curation. Tip names in this file must match the labels in the two Newick trees.
Two public association formats are supported.
Edge-list TSV, with one observed link per row:
host symbiont
host_A sym_1
host_B sym_1
host_C sym_2
Binary matrix TSV or CSV, with symbionts in the first column and host names in the remaining columns:
symbiont host_A host_B host_C
sym_1 1 1 0
sym_2 0 0 1
sym_3 0 0 0
Use 1, true, or any nonzero/nonempty value for a present association. Use 0, false, or a blank cell for no association.
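The cell rule above can be illustrated with a small converter from the matrix format to the edge-list format. This is a sketch of the stated rule, not the package's own parser; is_present and matrix_to_edges are hypothetical helper names, and case-insensitive matching of "false" is an assumption.

```python
import csv
import io

# Values treated as "no association" (compared after strip + lowercase).
ABSENT = {"", "0", "false"}


def is_present(cell: str) -> bool:
    """Apply the rule: 0, false, or blank is absent; anything else is present."""
    text = cell.strip().lower()
    if text in ABSENT:
        return False
    try:
        return float(text) != 0.0  # numeric zero such as "0.0" is absent
    except ValueError:
        return True  # any other nonempty value counts as present


def matrix_to_edges(text: str, delimiter: str = "\t") -> list[tuple[str, str]]:
    """Convert a symbiont-by-host matrix into (host, symbiont) pairs."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    hosts = rows[0][1:]  # first header cell labels the symbiont column
    edges = []
    for row in rows[1:]:
        if not row:
            continue  # skip blank lines
        symbiont, cells = row[0], row[1:]
        for host, cell in zip(hosts, cells):
            if is_present(cell):
                edges.append((host, symbiont))
    return edges
```

Applied to the example matrix above, this yields the same three links as the example edge list.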
Create a template from tree tip labels:
cophyloforge init-association-template \
--host-tree host.nwk \
--sym-tree sym.nwk \
--format matrix \
--out association_template.tsv
For edge lists, the template contains the required header. For matrices, it contains all symbiont rows and host columns with zero-filled cells.
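To make the matrix layout concrete, the following sketch builds a zero-filled template from lists of tip labels. It only illustrates the described layout; matrix_template is a hypothetical helper, and the "symbiont" header cell is taken from the example matrix above rather than from the package source.

```python
def matrix_template(hosts: list[str], symbionts: list[str], delimiter: str = "\t") -> str:
    """Build a zero-filled association matrix: one row per symbiont, one column per host."""
    lines = [delimiter.join(["symbiont"] + list(hosts))]
    for symbiont in symbionts:
        lines.append(delimiter.join([symbiont] + ["0"] * len(hosts)))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(matrix_template(["host_A", "host_B", "host_C"], ["sym_1", "sym_2"]), end="")
```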
Validate inputs before prediction:
cophyloforge validate-input \
--host-tree host.nwk \
--sym-tree sym.nwk \
--associations assoc.tsv
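For intuition about the name-matching requirement, here is a rough standalone check that lists association names missing from the trees. It is not how validate-input is implemented. The regular expression is a simplification that only handles unquoted tip labels in Newick strings without bracketed comments, and tip_labels and unmatched_names are hypothetical helpers.

```python
import re


def tip_labels(newick: str) -> set[str]:
    """Extract tip labels from a simple Newick string.

    A tip label is taken to be a name that directly follows '(' or ','.
    Quoted labels and bracketed comments are not handled.
    """
    return set(re.findall(r"[(,]\s*([^\s(),:;]+)", newick))


def unmatched_names(host_newick: str, sym_newick: str, edges):
    """Return (hosts, symbionts) named in edges but absent from the trees."""
    hosts = tip_labels(host_newick)
    symbionts = tip_labels(sym_newick)
    bad_hosts = sorted({h for h, _ in edges if h not in hosts})
    bad_symbionts = sorted({s for _, s in edges if s not in symbionts})
    return bad_hosts, bad_symbionts
```

Two empty lists mean every name in the association data was found among the tree tips.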
CLI Reference
cophyloforge --help
cophyloforge version
cophyloforge download-model --model default
cophyloforge validate-input --host-tree host.nwk --sym-tree sym.nwk --associations assoc.tsv
cophyloforge predict --host-tree host.nwk --sym-tree sym.nwk --associations assoc.tsv --outdir results
Batch prediction from a manifest:
cophyloforge batch-predict \
--manifest cases.csv \
--outdir batch-results
The manifest must contain host_tree, sym_tree, and associations columns. An optional case_id column is used for output folder names.
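A manifest with these columns can be written with the standard csv module. The sketch below is illustrative only: build_manifest is a hypothetical helper, the column order is arbitrary, and whether relative paths are resolved against the manifest location or the working directory is not stated here, so absolute paths may be the safer choice.

```python
import csv
import io

# Column names as required by batch-predict; case_id is optional.
FIELDS = ["case_id", "host_tree", "sym_tree", "associations"]


def build_manifest(cases: list[dict]) -> str:
    """Render a list of case dictionaries as manifest CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(cases)
    return buffer.getvalue()


if __name__ == "__main__":
    example = [
        {
            "case_id": "case1",
            "host_tree": "case1/host.nwk",
            "sym_tree": "case1/sym.nwk",
            "associations": "case1/assoc.tsv",
        }
    ]
    print(build_manifest(example), end="")
```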
Batch prediction from folders:
cophyloforge batch-predict \
--input-dir cases \
--outdir batch-results
Each case folder can contain either host.nwk, sym.nwk, and assoc.tsv, or the research dataset layout trees/host_obs_sampled.nwk, trees/sym_obs_sampled.nwk, and associations/assoc_obs.tsv.
Python API
from cophyloforge import download_model, predict
download_model()
result = predict(
    host_tree="host.nwk",
    sym_tree="sym.nwk",
    associations="assoc.tsv",
    outdir="results",
)
print(result["scenario_prediction"])
print(result["scenario_probabilities"])
Model Cache
Model artifacts are not embedded in the wheel. cophyloforge download-model downloads the default Zenodo checkpoint into the user cache directory:
- Linux: ~/.cache/cophyloforge
- macOS: ~/Library/Caches/cophyloforge
- Windows: %LOCALAPPDATA%\cophyloforge
Set COPHYLOFORGE_CACHE_DIR or pass --cache-dir to use a different cache location.
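The lookup order can be pictured roughly as follows. This is a sketch under assumptions, not the package's implementation: resolve_cache_dir is a hypothetical function, the precedence of --cache-dir over COPHYLOFORGE_CACHE_DIR is assumed rather than documented, and the AppData/Local fallback for a missing LOCALAPPDATA is likewise an assumption.

```python
import os
import sys
from pathlib import Path


def resolve_cache_dir(cli_value=None, env=None, platform=None) -> Path:
    """Pick a cache directory: explicit flag, then env var, then platform default."""
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform

    if cli_value:  # assumed to take precedence over the environment variable
        return Path(cli_value)
    if env.get("COPHYLOFORGE_CACHE_DIR"):
        return Path(env["COPHYLOFORGE_CACHE_DIR"])

    home = Path.home()
    if platform == "darwin":
        return home / "Library" / "Caches" / "cophyloforge"
    if platform.startswith("win"):
        local = env.get("LOCALAPPDATA")
        base = Path(local) if local else home / "AppData" / "Local"  # assumed fallback
        return base / "cophyloforge"
    return home / ".cache" / "cophyloforge"
```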
Citation
If you use the software or model, cite the Zenodo model record:
CoPhyloForge frozen inference model. Zenodo. https://doi.org/10.5281/zenodo.19656529
Also cite this repository or release when appropriate for the software version used.
Development
Create an environment and install the package in editable mode:
python -m pip install --upgrade pip
python -m pip install -e ".[dev]"
pytest
Build and check distributions:
python -m build
python -m twine check dist/*
The installable package uses a src/ layout and contains only inference-facing code. Research scripts, simulator code, generated datasets, training code, and checkpoint artifacts remain in the repository but are excluded from the package distribution.
For a manual-style guide covering end-user commands, developer checks, and release/publish workflow, see README_MANUAL.md.