Fast analysis of massive-scale data produced with MassiveFold
Introduction
Massif is a high-throughput analysis suite built to process the large structural ensembles generated by MassiveFold. It helps MassiveFold users review many predictions at once, evaluate interfaces and distances, and identify models that warrant follow-up. Instead of working through raw model folders manually, Massif gathers the metrics needed for filtering, ranking, and selecting structures in one place.
Getting Started
Massif can be installed via pip as both a CLI tool and a Python library (requires a Rust toolchain).
Install the latest Massif release from PyPI with:
python -m pip install massif
Or, from the source directory, build the current development version, which includes new but untested features:
python -m pip install .
Python-installed CLI
After pip install, the massif command is available in your environment and uses the same CLI
syntax as the Rust binary:
massif --help
massif fit <OUTPUT_DIR> <REFERENCE_PDB> <CHAIN_IDS> <STRUCTURE_DIR> <OUTPUT_CSV>
Python Package
Example usage:
import massif
files = massif.structure_files("path/to/structures")
distances = massif.distances(
"path/to/structures",
"path/to/reference.pdb",
distance_mode="TM-score",
)
Notes:
- massif.distances writes a CSV report in the current working directory.
- Functions print progress output to stdout while running.
- pip install also exposes a massif console script that runs the Rust CLI.
Building from source
Prerequisites
- Rust toolchain >= 1.74 (install via rustup)
- A directory containing the structures you want to process (PDB or mmCIF files); filenames are sorted numerically on the first _-separated index
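The numeric ordering of filenames can be illustrated with a minimal Python sketch. It assumes that the index is the leading field of the file stem; the filenames and the numeric_index helper are hypothetical, and Massif's own parsing may differ in edge cases.

```python
from pathlib import Path


def numeric_index(filename: str) -> int:
    """Return the integer found in the first '_'-separated field of the stem."""
    first_field = Path(filename).stem.split("_")[0]
    return int(first_field)


# Hypothetical filenames, for illustration only.
names = ["10_model.pdb", "2_model.pdb", "1_model.cif"]

# A plain string sort would place "10_model.pdb" before "2_model.pdb";
# sorting on the parsed integer keeps the models in numeric order.
ordered = sorted(names, key=numeric_index)
print(ordered)
```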
Build
cargo build --release
Command Help
cargo run -- --help
Usage
Massif expects positional arguments in the following order:
massif <COMMAND> [COMMAND OPTIONS] <STRUCTURE_DIR> <OUTPUT_CSV> [OPTIONS]
- STRUCTURE_DIR: directory containing the input PDB/CIF files
- OUTPUT_CSV: base report name; data is currently written to <OUTPUT_CSV>_alternative.*
- --disable-parallel: force single-threaded execution (Rayon is enabled by default)
The COMMAND argument selects one of the following subcommands:
fit
Align every structure against a reference chain, save aligned coordinates, and compute distances (TM-score by default, or rmsd-cur).
massif fit <OUTPUT_DIR> <REFERENCE_PDB> <CHAIN_IDS> [METRIC] [DISTANCE_CHAINS] <STRUCTURE_DIR> <OUTPUT_CSV>
- OUTPUT_DIR: folder where aligned structures are written
- REFERENCE_PDB: path to the reference structure used for alignment and distance computation
- CHAIN_IDS: concatenated chain identifiers (for example AB or C) that define the fitting anchor in both reference and target structures
- METRIC (optional): TM-score (default) or rmsd-cur
- DISTANCE_CHAINS (optional): chain group used for the post-fit distance computation, including both rmsd-cur and TM-score (for example AB)
- Output columns: TM-score to <reference> plus Models
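For readers unfamiliar with the default metric, the sketch below evaluates the standard TM-score formula for a fixed superposition, given per-residue distances between already paired residues. It is not Massif's implementation: the tm_score and d0_scale names are hypothetical, and the 0.5 Å floor for short chains is an assumption of this sketch.

```python
def d0_scale(target_length: int) -> float:
    """Length-dependent distance scale d0 (in Å) of the TM-score formula."""
    if target_length <= 15:
        # Assumption of this sketch: floor d0 so that it stays positive
        # for very short chains.
        return 0.5
    return max(1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8, 0.5)


def tm_score(distances, target_length: int) -> float:
    """TM-score for one fixed superposition.

    distances: distances (Å) between paired residues after alignment.
    target_length: number of residues in the reference structure.
    """
    d0 = d0_scale(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


# A model identical to the reference scores 1.0; larger deviations lower it.
identical = tm_score([0.0] * 100, 100)
shifted = tm_score([2.0] * 100, 100)
print(identical, shifted)
```

The full TM-score definition also maximises this sum over superpositions; the sketch only scores the superposition it is given.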
contacts
Characterise interface contacts and clashes across the ensemble.
massif contacts <OUTPUT_DIR> <STRUCTURE_DIR> <OUTPUT_CSV>
- Extracts direct residue-residue contacts from each model interface and writes one <model>_contact_details.csv file per structure
- Reports the number of atomic clashes per model and prints the automatic exclusion threshold (mean + 2×SD)
- Adds interface score placeholders (future integration of pTM/ipTM based scoring)
- Aligned structures are not emitted; OUTPUT_DIR is reserved for future extensions
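The mean + 2×SD exclusion threshold can be reproduced on a toy ensemble as follows. This is a minimal sketch with made-up clash counts; it assumes the population standard deviation and assumes that models strictly above the threshold are the ones flagged, neither of which is confirmed by the description above.

```python
from statistics import mean, pstdev

# Hypothetical clash counts for ten models, for illustration only.
counts = [3, 5, 4, 4, 3, 5, 4, 4, 4, 30]
clashes = {f"model_{i}": n for i, n in enumerate(counts)}

values = list(clashes.values())
# Threshold = mean + 2 x standard deviation of the per-model clash counts.
threshold = mean(values) + 2 * pstdev(values)

# Models whose clash count exceeds the threshold are candidates for exclusion.
flagged = [name for name, n in clashes.items() if n > threshold]
print(f"threshold = {threshold:.2f}, flagged = {flagged}")
```

With very small ensembles a single outlier inflates the standard deviation enough to hide itself, so the threshold is most informative when many models are analysed together.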
iplddt
Compute the mean pLDDT over residues at a user-defined interface.
massif iplddt <AGGREGATE_1> <AGGREGATE_2> <THRESHOLD> <STRUCTURE_DIR> <OUTPUT_CSV>
- AGGREGATE_1 / AGGREGATE_2: chain groups (for example AB vs C)
- THRESHOLD: distance cutoff (Å) between atoms to treat residues as contacting
- Returns an i-plddt column per model; failures are reported as -1
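The interface pLDDT computation can be sketched on a toy atom list. The data layout and the interface_plddt function are hypothetical, chain identifiers are assumed to be single characters, and returning -1 when no residue is in contact is one assumed reading of "failure".

```python
from math import dist


def interface_plddt(atoms, group_1, group_2, threshold):
    """Mean pLDDT over residues with an atom within `threshold` Å of the other group.

    atoms: iterable of (chain_id, residue_number, (x, y, z), residue_plddt).
    group_1, group_2: strings of single-character chain identifiers, e.g. "AB".
    """
    side_1 = [a for a in atoms if a[0] in group_1]
    side_2 = [a for a in atoms if a[0] in group_2]
    residue_plddt = {}
    for a in side_1:
        for b in side_2:
            if dist(a[2], b[2]) <= threshold:
                # Record each contacting residue once, keyed by chain and number.
                residue_plddt[(a[0], a[1])] = a[3]
                residue_plddt[(b[0], b[1])] = b[3]
    if not residue_plddt:
        return -1.0  # assumed failure value when no interface is found
    return sum(residue_plddt.values()) / len(residue_plddt)


# Toy model: only residue A1 and residue C1 are within 5 Å of each other.
atoms = [
    ("A", 1, (0.0, 0.0, 0.0), 90.0),
    ("A", 2, (20.0, 0.0, 0.0), 50.0),
    ("C", 1, (3.0, 0.0, 0.0), 70.0),
    ("C", 2, (40.0, 0.0, 0.0), 30.0),
]
score = interface_plddt(atoms, "A", "C", 5.0)
print(score)
```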
cluster
Align every structure on a reference, reduce a selected chain group to one 3D point, and assign complete-linkage clusters in the reduced space.
massif cluster <REFERENCE_PDB> <ANCHOR_CHAINS> <REDUCTION_CHAINS> <CUTOFF> <STRUCTURE_DIR> <OUTPUT_CSV> [--aligned-output-dir <OUTPUT_DIR>]
- REFERENCE_PDB: path to the reference structure used for alignment
- ANCHOR_CHAINS: concatenated chain identifiers used as the alignment anchor (for example AB or C)
- REDUCTION_CHAINS: concatenated chain identifiers whose aligned atoms are averaged into one point per model
- CUTOFF: complete-linkage cutoff (Å) applied to the reduced 3D points
- --aligned-output-dir: optional directory where the aligned reference and aligned models are written
- Output columns: point_x, point_y, point_z, cluster_id, and Models
- When --aligned-output-dir is not provided, Massif reuses cached reduced coordinates from the existing structured CSV when possible
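The reduction and clustering steps can be sketched in pure Python. This is an illustration of complete-linkage clustering on centroids, not Massif's code: the function names are hypothetical, the cutoff is assumed to be inclusive, and the cluster numbering is arbitrary.

```python
from math import dist


def centroid(coords):
    """Average a list of (x, y, z) atom coordinates into one 3D point."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))


def complete_linkage(points, cutoff):
    """Merge clusters while the largest pairwise distance stays within `cutoff`."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Complete linkage: distance between the two farthest members.
        return max(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > 1:
        d, ia, ib = min(
            (linkage(a, b), ia, ib)
            for ia, a in enumerate(clusters)
            for ib, b in enumerate(clusters)
            if ia < ib
        )
        if d > cutoff:
            break
        clusters[ia] = clusters[ia] + clusters[ib]
        del clusters[ib]

    labels = [0] * len(points)
    for cluster_id, members in enumerate(clusters):
        for i in members:
            labels[i] = cluster_id
    return labels


# Four toy models: the first point is the centroid of two aligned atoms.
points = [
    centroid([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]),
    (1.5, 0.0, 0.0),
    (10.0, 0.0, 0.0),
    (10.5, 0.0, 0.0),
]
labels = complete_linkage(points, cutoff=2.0)
print(labels)
```

Because complete linkage uses the farthest pair of members, no two models in the same cluster are further apart than the cutoff in the reduced space.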
distances
Measure minimal distances between every pair of chains and optionally retain a subset.
massif distances <FILENAME> <CHAIN_PAIRS> <STRUCTURE_DIR> <OUTPUT_CSV>
- FILENAME: reserved for future use (currently ignored)
- CHAIN_PAIRS: comma-separated list (for example AB,AC,BC); each pair becomes a CSV column
- Records minimal heavy-atom distances in Å
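A minimal sketch of the per-pair measurement, using a hypothetical atom list and a hypothetical min_chain_distance helper. It assumes single-character chain identifiers and treats every non-hydrogen atom as a heavy atom.

```python
from math import dist


def min_chain_distance(atoms, chain_a, chain_b):
    """Smallest heavy-atom distance (Å) between two chains.

    atoms: iterable of (chain_id, element, (x, y, z)).
    """
    heavy = [a for a in atoms if a[1] != "H"]  # ignore hydrogens
    side_a = [a[2] for a in heavy if a[0] == chain_a]
    side_b = [a[2] for a in heavy if a[0] == chain_b]
    return min(dist(p, q) for p in side_a for q in side_b)


# Toy model with three chains; the hydrogen on chain A is skipped.
atoms = [
    ("A", "C", (0.0, 0.0, 0.0)),
    ("A", "H", (0.0, 0.0, 3.5)),
    ("B", "N", (0.0, 0.0, 4.0)),
    ("C", "O", (0.0, 3.0, 0.0)),
]

# Each entry of the comma-separated pair list becomes one column.
pairs = "AB,AC,BC".split(",")
row = {pair: min_chain_distance(atoms, pair[0], pair[1]) for pair in pairs}
print(row)
```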
scoring
Placeholder for future scoring pipelines.
massif scoring <STRUCTURE_DIR> <OUTPUT_CSV>
- Currently returns a vector of 1.0 for each model and does not write extra columns
Output Layout
- <OUTPUT_CSV>_alternative.csv: structured report with stable column ordering that merges new results with previous runs
- Aligned structures are written to the provided OUTPUT_DIR for fit and to --aligned-output-dir for cluster
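One way to picture the merging behaviour of the structured report is the sketch below. It is an assumption about the general idea, not a description of Massif's file handling: rows are keyed on the Models column, existing columns keep their order, and new columns are appended. The merge_reports function and the sample values are hypothetical.

```python
def merge_reports(previous, new, key="Models"):
    """Merge two lists of row dicts on `key`, keeping existing column order."""
    columns = list(previous[0].keys()) if previous else [key]
    for row in new:
        for name in row:
            if name not in columns:
                columns.append(name)  # new columns go after the existing ones

    merged = {row[key]: dict(row) for row in previous}
    for row in new:
        merged.setdefault(row[key], {key: row[key]}).update(row)

    # Fill cells that were never computed for a model with an empty string.
    rows = [{name: row.get(name, "") for name in columns} for row in merged.values()]
    return columns, rows


# Hypothetical rows from an earlier iplddt run and a later cluster run.
previous = [{"Models": "1_model.pdb", "i-plddt": "81.2"}]
new = [
    {"Models": "1_model.pdb", "cluster_id": "0"},
    {"Models": "2_model.pdb", "cluster_id": "1"},
]
columns, rows = merge_reports(previous, new)
print(columns)
```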