Find Lysine Acylations & other Modification Sites
Project description
FLAMS: Find Lysine Acylations & other Modification Sites
A bioinformatics tool to analyze the conservation of post-translational modifications (PTMs), by means of a position-based search against the Compendium of Protein Lysine Modifications (CPLM database) v.4 and the experimental PTM sites in dbPTM. FLAMS is available as command-line tool and as a web service.
Table of contents
- Introduction
- System requirements
- Installation
- Usage
- Output
- Supported PTMs
- Contact
- References
- License
Introduction
FLAMS is a bioinformatics tool to analyze the conservation of post-translational modifications, by means of a position-based search against the CPLM database v.4 (Zhang, W. et al. Nucleic Acids Research. 2021, 44 (5): 243–250.) and the part of the dbPTM database with experimental support (Li, Z. et al. Nucleic Acids Research. 2022, 50:D471–D479.). FLAMS can be used (i) to quickly verify whether modifications in a specific protein have been reported before, (ii) to assess whether findings in one species might translate to other species, and (iii) to systematically assess the novelty and conservation of reported modification sites.
The tool takes as input a protein (identifier or sequence) and the position of an amino acid. This repository contains the command-line tool FLAMS
, which obtains an overview of the previously reported post-translational modifications matching your query, by using the following scripts:
- input.py: processing the user-provided input
- cplmv4.py, dbptm.py and setup.py: downloading and preparing the modification-specific databases
- run_blast.py: searching your query against the databases of proteins with post-translational modifications
- display.py: formatting the list of conserved post-translational modifications to a tab delimited output file
- utils.py: dealing with OS-dependent directory systems
FLAMS is also available as a web service at https://www.biw.kuleuven.be/m2s/cmpg/research/CSB/tools/flams/ .
System requirements
Linux 64-bit, Windows and Mac OS supported.
General dependencies
- Python3 (>=3.10, <3.12)
Third-party dependencies
Installation
The recommended installation for Mac OS and Linux is through conda:
conda install -c conda-forge -c bioconda flams
It is also possible to install FLAMS through pip (recommended installation for Windows):
pip install flams
Please note that the pip install requires users to have BLAST+ installed locally and available in PATH. For more information on how to install BLAST+ on Windows, click here .
Usage
Run the tool:
FLAMS [-h] (--in inputFilePath | --id UniProtID | --batch batchFilePath) [-p position] [--range errorRange] [-o outputFilePath] [-d dataDir] [-t threadsBLAST] [-e evalueBLAST] [-m modification [modification ...]]
Required argument:
- one of:
inputFilePath
, used with--in
, is the path to a .fasta file with the protein you wish to query against. (has to contain only 1 protein)UniProtID
, used with--id
, is the UniProt ID of the protein you wish to query against.batchFilePath
, used with--batch
, is the path to a tab seperated file for batch runs. The file should contain 1 entry per line, with UniProt IDs in the 1st column, and positions in the 2nd column.
Required argument when running FLAMS with --id/--in:
position
is the position of a lysine in the protein, which you want to query against.
Optional arguments:
errorRange
is an number of positions before and afterposition
to also search for modifications. [default: 0]outputFilePath
is the path to where the result will be saved (in a .tsv file format). [default: out.tsv] If FLAMS is run with --batch, the specified -o/--output is used as preposition, followed by '_$UniProtID_$position.tsv'. [default: '']dataDir
is the path to the directory where intermediate files (the UniProt sequence files) are stored. [default: $PWD/data]threadsBLAST
is a BLAST parameter, allows you to speed up the search by multithreading. [default: 1]evalueBLAST
is a BLAST parameter, allows you to filter out low quality BLAST hits. [default: 0.01]modification
is a space-separated list of modifications (all lower case) to search for at the given position. Possible values are any (combinations) of the CPLM and dbPTM modifications. We also provide aggregated combinations for each amino acid ($AA-All), and the CPLM combinations. For a full list of all supported PTMs, and how they are named, see the Supported PTM types section of the README. In general, PTMs are written all lowercase, and spaces within a PTM name are replaced by underscores. [default: K-All]
Example use case
We provide two example use cases for FLAMS:
With the following command, you search whether the TatA (UniProt ID: A0A916NWA0) acetylation on K66 in Dehalococcoide mccartyi strain CBDB1, as described by Greiner-Haas (2021), had been previously detected.
FLAMS --in A0A916NWA0.fa -p 66 -m acetylation -o tatA.tsv
With the following command, you search whether the Mycobabcterium smegmatis' FadD2 (UniProt ID: A0QQ22) K537 is known to carry any modifications of the 'acylations' category, similar to what was reported by Xu (2020).
FLAMS --id A0QQ22 -p 537 -m CPLM-Acylations -o FadD2.tsv
You can find the example input and output data in the folder test_data
. The output data is organized in folders reflecting the FLAMS version used to generate it, as the output can vary depending on the exact FLAMS version (due to FLAMS database updates).
For more example use cases, see the Supplementary information of the paper.
Output
The output file is a .tsv containing one row per modification that matched the query, i.e., a modification aligning (within the user-specified range) to the query position, in a protein similar to the query protein. In case of batch jobs (ran with --batch), one output file per query (= a single line in the batch job file) will be generated.
The output file contains 14 columns:
- UniProt ID: UniProt identifier of the matched protein
- Protein name: protein name of the matched protein
- Modification: the type of modification found in the matched protein
- $AA location: the location of this matched modification in the matched protein
- $AA window: the local sequence containing the conserved modification (window of five amino acids before and after°)
- Species: the textual description of the species of the matched protein
- BLAST E Value: E value of BLASTp search of the matched protein against your query protein
- BLAST identity: % identity of BLASTp search of the matched protein against your query protein
- BLAST coverage: % coverage of BLASTp search of the matched protein against your query protein
- CPLM ID: CPLM ID of matched protein modification (if found in CPLM, otherwise empty)
- CPLM evidence code: CPLM evidence code of matched protein modification. Can be Exp(erimental), Dat(abase) or both. (if found in CPLM, otherwise empty)
- CPLM evidence links: CPLM evidence link of matched protein modification. Can be PubMed ID (for Exp.), or a database identified (for Dat) or both. (if found in CPLM, otherwise empty)
- dbPTM evidence code: dbPTM evidence code of matched protein modification (if found in dbPTM, otherwise empty)
- dbPTM evidence links: dbPTM evidence link of matched protein modification. Refers to PubMed IDs. (if found in dbPTM, otherwise empty)
°: window can be smaller than the [-5;+5] window if the sequence alignment ends sooner, which can happen for modified sites near the start/end of the protein
Supported PTMs
Supported PTM databases
FLAMS updates its search databases regularly. To get an overview of the supported databases, see the table below.
FLAMS version | CPLM version | dbPTM version | database available for download | UniProt release |
---|---|---|---|---|
v1.1.4 | v4 | 2024_April | yes | 2024_02 |
v1.1.0-3 | v4 | 2023_November | yes | 2023_05 |
v1.0 | v4 | yes | NA |
Please note that only part of dbPTM is integrated into FLAMS, namely the PTM sites with experimental evidence, as found here. As dbPTM does not store complete protein sequences, these are fetched during database creation based on UniProt identifiers reported in dbPTM and the UniProt release available at the time of database creation. As a consequence, FLAMS database updates can change the content of the PTM databases, beyond the simple addition of new dbPTM and/or CPLM entries, reflecting changes in UniProt. The most common UniProt changes affecting FLAMS databases are removed UniProt entries (leading to the removal of PTM entries on the affected protein in our database) and sequence updates. We are aware of this issue, impacting the completeness and interpretation of FLAMS' results, and will consider solutions in future FLAMS releases.
Instructions on how to download the CPLM and dbPTM database yourself are in section 'Local CPLM and dbPTM install'. This is not recommended, as it takes multiple hours to generate some databases.
Supported PTM types
FLAMS allows searches for all PTM types included in CPLM, and for those with experimental evidence in dbPTM. An overview of the PTM types, how to call them in FLAMS, how they are called in CPLM and/or dbPTM, and on which amino acid they can be found is given in the table below. This table can also be found as a tab seperated file named FLAMS_supported_ptms_v11.txt .
FLAMS PTM name | CPLM name | dbPTM name | A (Ala) | C (Cys) | D (Asp) | E (Glu) | F (Phe) | G (Gly) | H (His) | I (Ile) | K (Lys) | L (Leu) | M (Met) | N (Asn) | P (Pro) | Q (Gln) | R (Arg) | S (Ser) | T (Thr) | V (Val) | W (Trp) | Y (Tyr) | CPLM-Acylations | CPLM-Ubs | CPLM-Others | CPLM-All |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
acetylation | Acetylation | Acetylation | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | |||||||||
adp-ribosylation | ADP-ribosylation | X | X | X | X | X | X | X | X | X | X | |||||||||||||||
amidation | Amidation | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | |||||
ampylation | AMPylation | X | X | X | ||||||||||||||||||||||
benzoylation | Benzoylation | X | X | X | ||||||||||||||||||||||
beta-hydroxybutyrylation | β-Hydroxybutyrylation | X | X | X | ||||||||||||||||||||||
biotinylation | Biotinylation | Biotinylation | X | X | X | |||||||||||||||||||||
blocked_amino_end | Blocked amino end | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | |||||||||
butyrylation | Butyrylation | Butyrylation | X | X | X | |||||||||||||||||||||
carbamidation | Carbamidation | X | ||||||||||||||||||||||||
carboxyethylation | Carboxyethylation | Carboxyethylation | X | X | X | |||||||||||||||||||||
carboxylation | Carboxylation | Carboxylation | X | X | X | |||||||||||||||||||||
carboxymethylation | Carboxymethylation | X | X | X | ||||||||||||||||||||||
cholesterol_ester | Cholesterol ester | X | ||||||||||||||||||||||||
citrullination | Citrullination | X | ||||||||||||||||||||||||
crotonylation | Crotonylation | Crotonylation | X | X | X | |||||||||||||||||||||
c-linked_glycosylation | C-linked Glycosylation | X | ||||||||||||||||||||||||
deamidation | Deamidation | X | X | |||||||||||||||||||||||
deamination | Deamination | X | ||||||||||||||||||||||||
decanoylation | Decanoylation | X | X | |||||||||||||||||||||||
decarboxylation | Decarboxylation | X | X | |||||||||||||||||||||||
dephosphorylation | Dephosphorylation | X | X | X | ||||||||||||||||||||||
dietylphosphorylation | Dietylphosphorylation | X | X | X | ||||||||||||||||||||||
disulfide_bond | Disulfide bond | X | ||||||||||||||||||||||||
d-glucuronylation | D-glucuronoylation | X | ||||||||||||||||||||||||
farnesylation | Farnesylation | X | ||||||||||||||||||||||||
formation_of_an_isopeptide_bond | Formation of an isopeptide bond | X | X | |||||||||||||||||||||||
formylation | Formylation | Formylation | X | X | X | X | X | |||||||||||||||||||
gamma-carboxyglutamic_acid | Gamma-carboxyglutamic acid | X | ||||||||||||||||||||||||
geranylgeranylation | Geranylgeranylation | X | ||||||||||||||||||||||||
glutarylation | Glutarylation | Glutarylation | X | X | X | |||||||||||||||||||||
glutathionylation | Glutathionylation | X | ||||||||||||||||||||||||
glycation | Glycation | X | X | X | ||||||||||||||||||||||
gpi-anchor | GPI-anchor | X | X | X | X | X | X | X | ||||||||||||||||||
hmgylation | HMGylation | X | X | X | ||||||||||||||||||||||
hydroxyceramide_ester | Hydroxyceramide ester | X | ||||||||||||||||||||||||
hydroxylation | Hydroxylation | Hydroxylation | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | ||||||
iodination | Iodination | X | ||||||||||||||||||||||||
lactoylation | Lactoylation | X | ||||||||||||||||||||||||
lactylation | Lactylation | Lactylation | X | X | X | |||||||||||||||||||||
lipoylation | Lipoylation | Lipoylation | X | X | X | |||||||||||||||||||||
malonylation | Malonylation | Malonylation | X | X | X | |||||||||||||||||||||
methylation | Methylation | Methylation | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | |||
mgcylation | MGcylation | X | X | X | ||||||||||||||||||||||
mgylation | MGylation | X | X | X | ||||||||||||||||||||||
myristoylation | Myristoylation | X | X | X | ||||||||||||||||||||||
neddylation | Neddylation | Neddylation | X | X | X | |||||||||||||||||||||
nitration | Nitration | X | ||||||||||||||||||||||||
n-carbamoylation | N-carbamoylation | X | ||||||||||||||||||||||||
n-linked_glycosylation | N-linked Glycosylation | X | X | X | X | X | X | X | X | X | ||||||||||||||||
n-palmitoylation | N-palmitoylation | X | X | X | ||||||||||||||||||||||
octanoylation | Octanoylation | X | X | |||||||||||||||||||||||
oxidation | Oxidation | X | X | X | X | X | ||||||||||||||||||||
o-linked_glycosylation | O-linked Glycosylation | X | X | X | X | X | ||||||||||||||||||||
o-palmitoleoylation | O-palmitoleoylation | X | ||||||||||||||||||||||||
o-palmitoylation | O-palmitoylation | X | X | |||||||||||||||||||||||
phosphatidylethanolamine_amidation | Phosphatidylethanolamine amidation | X | ||||||||||||||||||||||||
phosphoglycerylation | Phosphoglycerylation | X | X | X | ||||||||||||||||||||||
phosphorylation | Phosphorylation | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | ||||||
propionylation | Propionylation | Propionylation | X | X | X | |||||||||||||||||||||
pupylation | Pupylation | X | X | X | ||||||||||||||||||||||
pyrrolidone_carboxylic_acid | Pyrrolidone carboxylic acid | X | X | |||||||||||||||||||||||
pyrrolylation | Pyrrolylation | X | ||||||||||||||||||||||||
pyruvate | Pyruvate | X | X | |||||||||||||||||||||||
serotonylation | Serotonylation | X | ||||||||||||||||||||||||
stearoylation | Stearoylation | X | ||||||||||||||||||||||||
succinylation | Succinylation | Succinylation | X | X | X | X | X | |||||||||||||||||||
sulfation | Sulfation | X | X | X | X | |||||||||||||||||||||
sulfhydration | Sulfhydration | X | ||||||||||||||||||||||||
sulfoxidation | Sulfoxidation | X | ||||||||||||||||||||||||
sumoylation | Sumoylation | Sumoylation | X | X | X | |||||||||||||||||||||
s-archaeol | S-archaeol | X | ||||||||||||||||||||||||
s-carbamoylation | S-carbamoylation | X | ||||||||||||||||||||||||
s-cyanation | S-Cyanation | X | ||||||||||||||||||||||||
s-cysteinylation | S-cysteinylation | X | ||||||||||||||||||||||||
s-diacylglycerol | S-diacylglycerol | X | ||||||||||||||||||||||||
s-linked_glycosylation | S-linked Glycosylation | X | ||||||||||||||||||||||||
s-nitrosylation | S-nitrosylation | X | ||||||||||||||||||||||||
s-palmitoylation | S-palmitoylation | X | ||||||||||||||||||||||||
thiocarboxylation | Thiocarboxylation | X | ||||||||||||||||||||||||
ubiquitination | Ubiquitination | Ubiquitination | X | X | X | X | X | X | ||||||||||||||||||
umpylation | UMPylation | X | X | X | ||||||||||||||||||||||
2-hydroxyisobutyrylation | 2-Hydroxyisobutyrylation | X | X | X |
Local CPLM and dbPTM install
It is possible to install the CPLM and dbPTM databases directly, instead of using the pre-generated databases that are hosted on Zenodo. This is however not recommended as the download takes several hours for larger databases, such as phosphorylation, ubiquitination and acetylation.
However, if desired, follow these instructions to modify the scripts:
- Make sure you are working in an environment with the correct dependencies:
- on Linux/MacOS: create a FLAMS conda environment, get all dependencies by installing FLAMS as specified for Linux/MacOS
- on Windows: create a FLAMS conda environment, get all dependencies by installing FLAMS as specified for Windows. Make sure BLAST+ is correctly installed.
-
Download the latest FLAMS version from GitHub.
-
Adapt the scripts:
-
on a fresh install (= never ran FLAMS before, so no FLAMS databases yet):
- go to
src/flams/databases/setup.py
- comment out lines 503-508 (function
_generate_blastdb_if_not_up_to_date
- the try/except _get_fasta_from_zenodo) - uncomment line 512 (function
_generate_blastdb_if_not_up_to_date
- the _get_fasta_for_blast)
- go to
-
on a FLAMS version with previously generated BLAST databases:
- go to
src/flams/databases/setup.py
- change the version numbers of the databases you wish to update on lines 75-458. E.g.:
"2-hydroxyisobutyrylation": ModificationType( "2-hydroxyisobutyrylation", 1.0, [ModificationDatabase(cplmv4, "2-Hydroxyisobutyrylation")], ["K"] ),
becomes
"2-hydroxyisobutyrylation": ModificationType( "2-hydroxyisobutyrylation", 2.0, [ModificationDatabase(cplmv4, "2-Hydroxyisobutyrylation")], ["K"] ),
- comment out lines 503-508 (function
_generate_blastdb_if_not_up_to_date
- the try/except _get_fasta_from_zenodo) - uncomment line 512 (function
_generate_blastdb_if_not_up_to_date
- the _get_fasta_for_blast)
- go to
-
Install your adapted FLAMS version locally:
python -m pip install ./PathToLocalFLAMS
Contact
Laboratory of Computational Systems Biology, KU Leuven.
References
If you use FLAMS in your work, please cite:
Longin, H. et al (2024) "FLAMS: Find Lysine Acylations and other Modification Sites." Bioinformatics. 40(1):btae005.
In addition, FLAMS relies on third-party software & databases:
Altschul, S.F. et al (1990) "Basic local alignment search tool." J. Mol. Biol. 215:403-410.
Li, Z. et al (2022) "dbPTM in 2022: an updated database for exploring regulatory networks and functional associations of protein post-translational modifications." Nucleic Acids Research. 50:D471–D479.
Zhang, W. et al (2021) "CPLM 4.0: an updated database with rich annotations for protein lysine modifications." Nucleic Acids Research. 44(5):243–250.
License
FLAMS is freely available under an MIT license.
Use of the third-party software, libraries or code referred to in the References section above may be governed by separate terms and conditions or license provisions. Your use of the third-party software, libraries or code is subject to any such terms and you should check that you can comply with any applicable restrictions or terms and conditions before use.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file flams-1.1.4.tar.gz
.
File metadata
- Download URL: flams-1.1.4.tar.gz
- Upload date:
- Size: 34.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.10.13
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 0d590d0b51f41bd47830136060f1c5d38475c1905d66a7c2ea11e417b7e641b5 |
|
MD5 | 99406297502d1c5306e13fa5e2ff7322 |
|
BLAKE2b-256 | 33b39e8978cfdeb2549670f04b9fe7fd3e9d8485a939292b7f851f17810b46ff |
File details
Details for the file flams-1.1.4-py3-none-any.whl
.
File metadata
- Download URL: flams-1.1.4-py3-none-any.whl
- Upload date:
- Size: 32.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.10.13
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 2ce913fe9f1b9066aab1c91bec15245a3ef3fb8de65f910acfd2cae3963e2fbe |
|
MD5 | a4874ebf9a966c35694ed9d7a8fc6f91 |
|
BLAKE2b-256 | a7080d4f422c862195c48c9ff64fd0233565635cdbdbfb60035b5d49195c2e00 |