Intermediate representation and converters for protein folding model inputs
SPIR
SPIR (Structure Prediction Intermediate Representation) exists to make it practical to compare and iterate across multiple structure prediction models without constantly rewriting inputs by hand. Different predictors (AlphaFold3 Server/non-Server, Chai-1, Boltz-2, Protenix) can yield meaningfully different structures, confidence metrics, and binding/interface hypotheses on the same biological system; being able to run the same job across models is essential for validating conclusions, spotting model-specific artifacts, and choosing the best tool for a given target or constraint set.
In practice, such comparisons are hindered by format fragmentation across models, especially for glycans, where representations range from compact tree strings with implicit chemistry (e.g., AF3 Server) to fully specified multi-component ligands with explicit bonded atom pairs. Reliably converting between formats takes more than renaming fields: it requires an intermediate graph-like representation that preserves residue identity, connectivity, attachment sites, and (when needed) explicit linkage atoms/positions, while also handling cases where a target format omits or infers chemistry. SPIR provides that IR together with model-specific converters so that scientific questions, not input wrangling, drive the workflow.
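To make the idea of a graph-like glycan representation concrete, here is a minimal sketch. It is not SPIR's actual data model: the class names (`Residue`, `Linkage`, `Glycan`), their fields, and the nested-parenthesis output are all assumptions chosen for illustration. It shows how one graph can carry explicit linkage atoms when a target format needs them and still be flattened to a compact tree string when a target format leaves the chemistry implicit.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Residue:
    """One node of the glycan graph (hypothetical schema, not SPIR's own)."""
    index: int  # position of the residue within the glycan
    code: str   # chemical component code, e.g. "NAG" or "MAN"


@dataclass(frozen=True)
class Linkage:
    """A parent -> child bond. Atom names are optional because some
    formats leave the linkage chemistry implicit."""
    parent: int
    child: int
    parent_atom: Optional[str] = None
    child_atom: Optional[str] = None


@dataclass
class Glycan:
    attachment_chain: str     # protein chain the glycan is attached to
    attachment_position: int  # residue position of the attachment site
    residues: list[Residue] = field(default_factory=list)
    linkages: list[Linkage] = field(default_factory=list)

    def children(self, index: int) -> list[int]:
        """Indices of residues bonded below the given residue."""
        return [link.child for link in self.linkages if link.parent == index]

    def to_tree_string(self, root: int = 0) -> str:
        """Flatten to a nested string, dropping all atom-level detail."""
        codes = {res.index: res.code for res in self.residues}
        branches = "".join(
            f"({self.to_tree_string(child)})" for child in self.children(root)
        )
        return codes[root] + branches

    def has_explicit_chemistry(self) -> bool:
        """True if every linkage names both bonded atoms."""
        return all(
            link.parent_atom is not None and link.child_atom is not None
            for link in self.linkages
        )


# Illustrative branched glycan; the atom names are example values only.
glycan = Glycan(
    attachment_chain="A",
    attachment_position=42,
    residues=[
        Residue(0, "NAG"), Residue(1, "NAG"), Residue(2, "BMA"),
        Residue(3, "MAN"), Residue(4, "MAN"),
    ],
    linkages=[
        Linkage(0, 1, parent_atom="O4", child_atom="C1"),
        Linkage(1, 2, parent_atom="O4", child_atom="C1"),
        Linkage(2, 3, parent_atom="O3", child_atom="C1"),
        Linkage(2, 4, parent_atom="O6", child_atom="C1"),
    ],
)

print(glycan.to_tree_string())          # NAG(NAG(BMA(MAN)(MAN)))
print(glycan.has_explicit_chemistry())  # True
```

Going from the graph to the tree string loses information (the atom names), while the reverse direction has to infer or ask for it. That asymmetry is why a converter needs the richer representation in the middle.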
Installation
The easiest way to install SPIR is to use pip:
pip install spir
If you want to build from source, you can clone the repository and run:
git clone https://github.com/briney/spir
cd spir
pip install -e .
Usage
spir convert --from DIALECT INPUT_FILE --to DIALECT OUTPUT_PREFIX
As a more concrete example, to convert an AlphaFold3 Server input to an AlphaFold3 (non-Server) output, you can run:
spir convert --from alphafold3server path/to/input.json --to alphafold3 path/to/output
[!NOTE] The output_prefix should only contain the prefix for the output files (no extension). The correct extension will be added automatically.
If your input is Chai-1 formatted and includes restraints, you can specify the restraints file with the --restraints option:
spir convert --from chai1 input.fasta --to protenix output --restraints restraints.csv
Supported Models
SPIR supports the following structure prediction models:
| model | dialect |
|---|---|
| AlphaFold3 Server | alphafold3server |
| AlphaFold3 (non-Server) | alphafold3 |
| Boltz-2 | boltz2 |
| Chai-1 | chai1 |
| Protenix | protenix |
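Since the point of SPIR is running one job across several models, a small wrapper that fans a single input out to every other dialect can be handy. The sketch below only assembles command lines from the `spir convert` syntax and the dialect names listed above; the helper names (`build_convert_command`, `fan_out`) and the `job_<dialect>` output prefix scheme are assumptions, not part of SPIR.

```python
import shlex
from typing import List, Optional

# Dialect names as listed in the table above.
DIALECTS = ("alphafold3server", "alphafold3", "boltz2", "chai1", "protenix")


def build_convert_command(
    src_dialect: str,
    input_file: str,
    dst_dialect: str,
    output_prefix: str,
    restraints: Optional[str] = None,
) -> List[str]:
    """Assemble the argv list for one `spir convert` call (hypothetical helper)."""
    for dialect in (src_dialect, dst_dialect):
        if dialect not in DIALECTS:
            raise ValueError(f"unknown dialect: {dialect!r}")
    cmd = [
        "spir", "convert",
        "--from", src_dialect, input_file,
        "--to", dst_dialect, output_prefix,
    ]
    if restraints is not None:
        # Documented above for Chai-1 inputs that come with a restraints file.
        cmd += ["--restraints", restraints]
    return cmd


def fan_out(src_dialect: str, input_file: str, output_stem: str) -> List[List[str]]:
    """One command per target dialect, skipping the source dialect itself."""
    return [
        build_convert_command(src_dialect, input_file, dst, f"{output_stem}_{dst}")
        for dst in DIALECTS
        if dst != src_dialect
    ]


commands = fan_out("alphafold3server", "input.json", "job")
for cmd in commands:
    # Print the shell-quoted command; pass `cmd` to subprocess.run(cmd, check=True)
    # instead to actually execute it (requires spir to be installed).
    print(shlex.join(cmd))
```

Building the argument list separately from running it keeps the commands easy to inspect or log before anything is executed.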
Custom MSAs
AlphaFold3 (non-Server) and Boltz-2 support custom MSA paths as part of their respective input formats. We anticipate many users will want to convert from the AlphaFold3 Server format to one of these dialects, since the AlphaFold3 Server format is particularly user-friendly with respect to glycans. While the official AlphaFold3 Server input format does not support custom MSA paths, SPIR allows users to supply a custom MSA using an unofficial msa_path field for any proteinChain, dnaSequence, or rnaSequence in an AlphaFold3 Server input, like so:
{
  "name": "msa_test",
  "modelSeeds": [42],
  "sequences": [
    {
      "proteinChain": {
        "id": "A",
        "sequence": "MVLSPADKTN",
        "msa_path": "/path/to/msa/protein_a.a3m"
      }
    }
  ]
}
SPIR will then write the custom MSA path to the appropriate field for the target dialect. For example, if you convert to AlphaFold3 (non-Server), the supplied MSA path will be added to the unpairedMsaPath field. For Boltz-2, the supplied MSA path will be added to the msa field.
[!NOTE] The unofficial msa_path field in AlphaFold3 Server is only supported for input files. If an AlphaFold3 (non-Server) or Boltz-2 input file containing an MSA path is converted to AlphaFold3 Server format, the MSA path will be ignored.
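If you already have AlphaFold3 Server job files and want to attach MSAs before converting, the msa_path field can be added programmatically. The following is a minimal sketch that assumes the job has the same layout as the example above (one object with a "sequences" list whose entities carry an "id"); the function name `add_msa_paths` and the id-to-path mapping are hypothetical and not part of SPIR.

```python
import copy
import json

# Entity types that, per the text above, accept the unofficial msa_path field.
MSA_CAPABLE_KEYS = ("proteinChain", "dnaSequence", "rnaSequence")


def add_msa_paths(job: dict, msa_paths: dict) -> dict:
    """Return a copy of `job` with msa_path set on every entity whose id
    appears in `msa_paths` (a mapping of entity id -> MSA file path)."""
    updated = copy.deepcopy(job)  # leave the caller's job untouched
    for entry in updated.get("sequences", []):
        for key in MSA_CAPABLE_KEYS:
            entity = entry.get(key)
            if entity is not None and entity.get("id") in msa_paths:
                entity["msa_path"] = msa_paths[entity["id"]]
    return updated


# Same job as the example above, minus the msa_path field.
job = {
    "name": "msa_test",
    "modelSeeds": [42],
    "sequences": [
        {"proteinChain": {"id": "A", "sequence": "MVLSPADKTN"}},
    ],
}

with_msa = add_msa_paths(job, {"A": "/path/to/msa/protein_a.a3m"})
print(json.dumps(with_msa, indent=2))
```

The printed JSON can be saved to a file and passed to spir convert with --from alphafold3server.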
Format Validation
SPIR also provides a tool for validating inputs against the appropriate format schema:
spir validate --dialect DIALECT INPUT_FILE
License
SPIR is licensed under the permissive MIT License. Both commercial and non-commercial use are permitted without restriction. See the LICENSE file for details.