Building and managing MSAs prior to structure inference
Multiple Sequence Align/Alpha Fold
Streamlining the MSA building stages
Gives you control over the database search and the bundling of MSA files prior to structure inference.
Installation
External dependencies
MSAF uses the following tools:
- mmseqs2 for database search
- mafft for multiple sequence alignment
Both tools must be installed on your system.
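Before installing the Python package, it can help to confirm that both tools are reachable. The following is a minimal sketch, not part of MSAF: the helper name find_missing_tools is hypothetical, and it only checks PATH, whereas MSAF itself reads executable paths from its configuration file.

```python
import shutil


def find_missing_tools(tools=("mmseqs", "mafft")):
    """Return the names of the tools that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing external tools:", ", ".join(missing))
    else:
        print("mmseqs and mafft are both available on PATH")
```

If a tool is installed outside PATH, its full path can still be given in the executables section of the configuration file.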
Python package
Install the package with pip:
pip install msaf2
Global setup
MSAF often requires a configuration file, passed with the -c flag.
This configuration file is in YAML format and has the following shape:
databases:
  - /path/to/databases/mmseqs
executables:
  mafft: /usr/local/bin/mafft
  mmseqs: /opt/homebrew/bin/mmseqs
settings:
  cache: /path/to/msaf/cache
cocktails:
  test:
    ingredients:
      - target: swissprot
        label: pif.sto
      - target: uniprot
        label: paf.a3m
Where:
- databases is a list of folders in which MSAF recursively looks for mmseqs databases
- executables maps each external dependency to the path of its executable
- cache points to a folder used to store MSAF working files; it MUST exist
- cocktails is a dictionary of recipes
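The constraints listed above can be checked before launching a run. This is a minimal sketch under stated assumptions, not MSAF's own validation: check_config is a hypothetical helper, it takes a configuration already parsed into a dictionary (for instance with a YAML loader), and it assumes that cache is nested under settings and that cocktails is a top-level key.

```python
from pathlib import Path


def check_config(cfg):
    """Return a list of human-readable problems found in a parsed config."""
    problems = []
    if not isinstance(cfg.get("databases"), list):
        problems.append("databases must be a list of folders")

    executables = cfg.get("executables")
    if not isinstance(executables, dict):
        problems.append("executables must map tool names to paths")
    else:
        for tool in ("mafft", "mmseqs"):
            if tool not in executables:
                problems.append(f"executables is missing an entry for {tool}")

    # Assumption: the cache folder is declared as settings.cache.
    settings = cfg.get("settings")
    cache = settings.get("cache") if isinstance(settings, dict) else None
    if cache is None or not Path(cache).is_dir():
        problems.append("settings.cache must point to an existing folder")

    if not isinstance(cfg.get("cocktails"), dict):
        problems.append("cocktails must be a dictionary of recipes")
    return problems
```

An empty list means the four sections look structurally sound; it says nothing about whether the databases or executables actually work.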
A configuration template file can be generated with the following command:
python -m msaf2 --generate
You can then edit it to match your setup.
MSAF recipes
Recipes are declared in the configuration file. A recipe is characterized by a name (e.g. test) and a list of ingredients. The ingredients define the database searches and the save schema as a list of target and label pairs. The target key names the database to search and the label key names the resulting MSA file (and its format).
Recipes may also feature an optional PDQT parameter which, if set to TRUE, wraps all a3m files in an aligned.pdqt file.
In the above example, the test recipe triggers a search in swissprot and uniprot for all supplied queries.
- The result of the swissprot search is saved in Stockholm format in a file named pif.sto
- The result of the uniprot search is saved in a3m format in a file named paf.a3m
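To make the recipe structure concrete, the sketch below expands a recipe into its individual search steps. It is an illustration only: plan_recipe and the FORMATS table are hypothetical names, and the idea that the label extension selects the output format is inferred from the example above rather than taken from MSAF's code.

```python
from pathlib import PurePath

# Assumed mapping from label extension to output format.
FORMATS = {".sto": "stockholm", ".a3m": "a3m"}


def plan_recipe(recipe):
    """Turn a recipe dict into a list of (target, label, format) steps."""
    steps = []
    for ingredient in recipe["ingredients"]:
        label = ingredient["label"]
        suffix = PurePath(label).suffix.lower()
        if suffix not in FORMATS:
            raise ValueError(f"unsupported label extension: {label}")
        steps.append((ingredient["target"], label, FORMATS[suffix]))
    return steps


# The "test" recipe from the configuration example.
test_recipe = {
    "ingredients": [
        {"target": "swissprot", "label": "pif.sto"},
        {"target": "uniprot", "label": "paf.a3m"},
    ]
}

for target, label, fmt in plan_recipe(test_recipe):
    print(f"search {target} -> {label} ({fmt})")
```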
Usage
List available databases
At startup, MSAF recursively searches every folder listed under databases in the configuration file for mmseqs database files (<database_name>_h, <database_name>_.index, <database_name>.lookup, <database_name>.index).
The registered <database_name> values can be displayed with
python -m msaf2 -c config.yaml --list
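The recursive lookup can be pictured with the sketch below. It is not MSAF's implementation: list_mmseqs_databases is a hypothetical helper, and the rule it applies (a database is registered when both <database_name>_h and <database_name>.index are present) is a simplifying assumption.

```python
from pathlib import Path


def list_mmseqs_databases(roots):
    """Map database names to their path prefix, searching roots recursively."""
    found = {}
    for root in roots:
        for header in sorted(Path(root).rglob("*_h")):
            name = header.name[:-2]  # drop the "_h" suffix
            if not name or not header.is_file():
                continue
            prefix = header.with_name(name)
            # Assumption: only keep databases whose .index file also exists.
            if prefix.with_name(name + ".index").is_file():
                # The first occurrence of a name wins if it appears twice.
                found.setdefault(name, prefix)
    return found
```

Calling it with the databases list from the configuration file returns a dictionary such as {"swissprot": Path(".../swissprot")} for each database it recognises.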
Run a search
python -m msaf2 -c config.yaml --query <abs_path_query1.fasta> <abs_path_query2.fasta> --bp test
Here --bp refers to one recipe defined in the config file and --query takes the absolute path(s) of the query sequence file(s) in FASTA format.
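When launching searches from a script, the command line can be assembled programmatically. The sketch below is one way to do it, using only the -c, --query and --bp flags shown above; build_search_command is a hypothetical helper, and the absolute-path check simply mirrors the requirement stated for --query.

```python
import sys
from pathlib import Path


def build_search_command(config, queries, recipe):
    """Build the argument list for one MSAF search run."""
    for query in queries:
        if not Path(query).is_absolute():
            raise ValueError(f"query path must be absolute: {query}")
    return [
        sys.executable, "-m", "msaf2",
        "-c", str(config),
        "--query", *[str(query) for query in queries],
        "--bp", recipe,
    ]
```

The resulting list can be passed to subprocess.run(cmd, check=True) to run the search with the current Python interpreter.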
Multimer search
Results are saved in the --output folder (msas by default), in subfolders named with sequential one-letter chain identifiers that follow the order of the query files. If the same file is provided more than once as a query, only one folder is created. Hence, the results of a homodimer search are stored under a single A/ subfolder.
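The folder naming rule can be sketched as follows. This is an illustration, not MSAF's code: assign_chain_folders is a hypothetical helper, and it assumes that letters are handed out in first-seen order and that two queries count as the same file when their paths are identical strings.

```python
from string import ascii_uppercase


def assign_chain_folders(query_files):
    """Map each distinct query file to a one-letter chain folder name."""
    chains = {}
    for path in query_files:
        if path in chains:
            continue  # a repeated file reuses its existing folder
        if len(chains) >= len(ascii_uppercase):
            raise ValueError("more than 26 distinct query files")
        chains[path] = ascii_uppercase[len(chains)]
    return chains


# Homodimer: the same file twice gives a single A/ folder.
print(assign_chain_folders(["/data/a.fasta", "/data/a.fasta"]))
# Heterodimer: two distinct files give A/ and B/.
print(assign_chain_folders(["/data/a.fasta", "/data/b.fasta"]))
```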
Wrap a preexisting folder of MSAs
If a preexisting folder is passed with the --pdqt flag, the a3m MSA files present in this folder will be archived in an aligned.pdqt file.
python -m msaf2 --pdqt <results_a3m_folder>