Building and managing MSAs prior to structure inference

Multiple Sequence Ali/Alpha Fold

Streamlining the MSA building stages

Installation

External dependencies

MSAF uses the following tools:

  • mmseqs2 for database search
  • mafft for multiple sequence alignment

Both programs must be installed.

Python package

Just run pip install msaf

Global setup

MSAF requires a configuration file as its first parameter. This file is written in YAML and has the following shape:

databases:
  - /path/to/databases/mmseqs
executables:
  mafft: /usr/local/bin/mafft
  mmseqs: /opt/homebrew/bin/mmseqs
settings:
  cache: /path/to/msaf/cache
cocktails:
  test:
    ingredients:
      - target: swissprot
        label: pif.sto
      - target: uniprot
        label: paf.a3m

Where,

  • databases is a list of folders in which MSAF recursively looks for mmseqs databases
  • executables maps each external dependency to the path of its executable
  • cache points to a folder where MSAF stores its working files; it MUST exist
  • cocktails is a dictionary of recipes
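
The constraints listed above can be checked before launching a search. The following is only a minimal sketch, not MSAF's own validation: check_config is a hypothetical helper, and it assumes the YAML file has already been parsed into a plain dictionary (for instance with a YAML library).

```python
from pathlib import Path


def check_config(config: dict) -> list[str]:
    """Return a list of human-readable problems found in a parsed config.

    Hypothetical helper: MSAF may perform different or additional checks.
    """
    problems = []

    # Each entry of `databases` should be an existing folder.
    for folder in config.get("databases", []):
        if not Path(folder).is_dir():
            problems.append(f"database folder not found: {folder}")

    # Each entry of `executables` should point to an existing file.
    for name, path in config.get("executables", {}).items():
        if not Path(path).is_file():
            problems.append(f"executable '{name}' not found: {path}")

    # The cache folder MUST exist beforehand.
    cache = config.get("settings", {}).get("cache")
    if cache is None or not Path(cache).is_dir():
        problems.append(f"cache folder missing: {cache}")

    # Every ingredient of every recipe needs a target and a label.
    for recipe, body in config.get("cocktails", {}).items():
        for item in body.get("ingredients", []):
            if "target" not in item or "label" not in item:
                problems.append(f"recipe '{recipe}': ingredient needs target and label")

    return problems
```

An empty list means that nothing obviously wrong was found; it does not guarantee that MSAF will accept the file.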

MSAF recipes

Recipes are declared in the configuration file. A recipe is characterized by a name (e.g. test) and a list of ingredients. Each ingredient describes one database search and how its result is saved, as a pair of target and label keys: target names the database to search, and label names the resulting MSA file (and, through it, the format). Recipes also accept an optional PDQT parameter which, if set to TRUE, wraps all a3m files in an aligned.pdqt file.

In the above example, the test recipe triggers a search in swissprot and in uniprot for every supplied query.

  • The result of the swissprot search will be saved in Stockholm format in a file named pif.sto
  • The result of the uniprot search will be saved in a3m format in a file named paf.a3m
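
To make the target/label convention concrete, here is a small sketch that turns a recipe into a list of planned searches. It assumes that the output format is derived from the label's file extension, as the example suggests; plan_recipe and the FORMATS table are hypothetical names, not part of MSAF, and MSAF may support other extensions.

```python
from pathlib import PurePath

# Assumed mapping from label extension to output format (only the two
# formats that appear in the example recipe).
FORMATS = {".sto": "stockholm", ".a3m": "a3m"}


def plan_recipe(recipe: dict) -> list[tuple[str, str, str]]:
    """Return (target, label, format) for every ingredient of a recipe."""
    plan = []
    for item in recipe["ingredients"]:
        suffix = PurePath(item["label"]).suffix.lower()
        if suffix not in FORMATS:
            raise ValueError(f"unsupported label extension: {item['label']}")
        plan.append((item["target"], item["label"], FORMATS[suffix]))
    return plan


test_recipe = {
    "ingredients": [
        {"target": "swissprot", "label": "pif.sto"},
        {"target": "uniprot", "label": "paf.a3m"},
    ]
}
for target, label, fmt in plan_recipe(test_recipe):
    print(f"search {target} -> {label} ({fmt})")
```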

Usage

List available databases

At startup, MSAF recursively searches every folder listed under databases in the configuration file for mmseqs database files (<database_name>_h, <database_name>_.index, <database_name>.lookup, <database_name>.index).

The registered <database_name> values can be displayed with python -m msaf config.yaml --list
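
The recursive discovery described above could look roughly like the sketch below. It is an illustration under assumptions, not MSAF's implementation: find_mmseqs_databases is a hypothetical function, and it only checks for three of the listed files (<database_name>_h, <database_name>.index and <database_name>.lookup).

```python
from pathlib import Path


def find_mmseqs_databases(roots):
    """Map each database name to its path prefix, searching roots recursively."""
    found = {}
    for root in roots:
        # A file named <database_name>_h marks a candidate database.
        for header in sorted(Path(root).rglob("*_h")):
            if len(header.name) <= 2:
                continue  # a file named just "_h" has no database name
            prefix = header.with_name(header.name[:-2])  # strip trailing "_h"
            companions = [
                prefix.with_name(prefix.name + ext) for ext in (".index", ".lookup")
            ]
            if header.is_file() and all(p.is_file() for p in companions):
                # Keep the first hit if the same name appears twice.
                found.setdefault(prefix.name, prefix)
    return found
```

Calling find_mmseqs_databases(["/path/to/databases/mmseqs"]) would return a dictionary whose keys play the role of the names printed by --list.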

Run a search

python -m msaf config.yaml --query query1.fasta query2.fasta --bp test

--bp refers to a recipe defined in the configuration file, and --query to the absolute path(s) of the query sequence file(s) in FASTA format.
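
When the search is launched from a script, the query paths can be made absolute before the command line is assembled. The sketch below only builds the argument list from the flags documented on this page (--query, --bp, --output); build_search_command is a hypothetical helper, and the order of the flags is an assumption.

```python
import sys
from pathlib import Path


def build_search_command(config, queries, recipe, output="msas"):
    """Assemble the argument list for an MSAF search (hypothetical helper)."""
    command = [sys.executable, "-m", "msaf", str(config), "--query"]
    # Query files are passed as absolute paths.
    command += [str(Path(q).resolve()) for q in queries]
    command += ["--bp", recipe, "--output", str(output)]
    return command


command = build_search_command("config.yaml", ["query1.fasta", "query2.fasta"], "test")
print(" ".join(command))
```

The resulting list can then be passed to subprocess.run(command, check=True).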

Multimer search

Results are saved in the --output folder (msas by default), in subfolders named with sequential one-letter chain identifiers that follow the order of the query files. If the same file is provided more than once as a query, only one folder is created. Hence, the results of a homodimer search are stored under a single A/ subfolder.
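
The folder naming rule can be illustrated with a short sketch. It assumes that letters are handed out in order of first appearance and that duplicates are detected by comparing the path strings as given; assign_chain_folders is a hypothetical function, and MSAF's behaviour beyond the homodimer case described above is not documented here.

```python
from string import ascii_uppercase


def assign_chain_folders(query_files):
    """Map each distinct query file to a one-letter subfolder name."""
    chains = {}
    for query in query_files:
        if query in chains:
            continue  # a repeated file reuses its existing folder
        if len(chains) >= len(ascii_uppercase):
            raise ValueError("more than 26 distinct query files")
        chains[query] = ascii_uppercase[len(chains)]
    return chains


# Homodimer: the same file twice gives a single A/ subfolder.
print(assign_chain_folders(["query1.fasta", "query1.fasta"]))
# Heterodimer: two distinct files give A/ and B/.
print(assign_chain_folders(["query1.fasta", "query2.fasta"]))
```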
