A tool to discover and annotate tandem protein kinases
tkp-finder
tkp-finder is a CLI tool to discover and annotate tandem protein kinases (TKPs).
It is built on lXtractor, a general-purpose library for mining data from sequences and structures. lXtractor is under active development, so bugs are possible.
Installation
pip install tkp-finder
License
tkp-finder is distributed under the terms of the MIT license.
Usage
The installation should make the script tkp-finder globally available.
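One way to confirm that the script is reachable after installation is to look it up on PATH. The following is a minimal sketch using only the Python standard library; the helper name `find_cli` is hypothetical and not part of tkp-finder.

```python
import shutil
from typing import Optional


def find_cli(name: str = "tkp-finder") -> Optional[str]:
    """Return the absolute path of an executable on PATH, or None if absent."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_cli()
    if path is None:
        print("tkp-finder was not found on PATH; check the active environment.")
    else:
        print(f"tkp-finder is available at {path}")
```

If the lookup returns nothing, the package was probably installed into a different environment than the one currently active.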
The interface has two commands: setup and find.
The setup command downloads and prepares HMM models for annotation.
→ tkp-finder setup --help
Usage: tkp-finder setup [OPTIONS]

  Command to initialize the HMM data needed for TKPs' annotation.

Options:
  -H, --hmm_dir DIRECTORY  Path to a directory to store hmm-related data.
                           [required]
  -d, --download           If True, download the Pfam data from interpro.
  -q, --quiet              Disable verbose output.
  --path_pfam_a FILE       A path to downloaded Pfam-A HMM profiles. By
                           default, if `download` is ``False``,will try to
                           find it within the `hmm_dir`.
  --path_pfam_dat FILE     A path to downloaded Pfam-A (meta)data file. By
                           default, if `download` is ``False``,will try to
                           find it within the `hmm_dir`.
  -h, --help               Show this message and exit.
For first-time use, run:
→ tkp-finder setup -H hmm -d
This will download Pfam-A HMMs and accompanying metadata, and split the models into categories. The resulting directory:
→ tree -L 2 hmm
hmm
├── PF00069.hmm
├── Pfam-A.hmm
├── Pfam-A.hmm.dat
├── pfam_entries.tsv
└── profiles
    ├── Coiled-coil
    ├── Disordered
    ├── Domain
    ├── Family
    ├── Motif
    ├── Repeat
    └── unknown
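Before running the find command, it may help to check that the prepared directory has the expected layout. The sketch below is an assumption-laden illustration, not part of tkp-finder: the function `missing_entries` is hypothetical, the file name PF00069.hmm and the `profiles` folder come from the listing above, and the checked categories are the documented defaults of the `--hmm_type` option (Family, Domain, Motif).

```python
from pathlib import Path

# Names taken from the directory listing; the helper itself is hypothetical.
REQUIRED_FILES = ("PF00069.hmm",)
DEFAULT_HMM_TYPES = ("Family", "Domain", "Motif")


def missing_entries(hmm_dir, hmm_types=DEFAULT_HMM_TYPES):
    """List the paths expected under `hmm_dir` that do not exist."""
    root = Path(hmm_dir)
    expected = [root / name for name in REQUIRED_FILES]
    expected += [root / "profiles" / hmm_type for hmm_type in hmm_types]
    return [str(path) for path in expected if not path.exists()]


if __name__ == "__main__":
    missing = missing_entries("hmm")
    if missing:
        print(f"missing: {missing}")
    else:
        print("layout looks complete")
```

The check only tests for the presence of paths; it does not validate the contents of the HMM files.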
To discover and annotate TKPs, use the tkp-finder find command:
→ tkp-finder find --help
Usage: tkp-finder find [OPTIONS] [FASTA]...

Options:
  -H, --hmm_dir DIRECTORY    Directory with HMM profiles. Expected to contain
                             `profiles` dir and target PK profile
                             (PF00069.hmm). See `tkp-finder setup` on how to
                             prepare this dir.
  -t, --hmm_type TEXT        Which HMM types to use for annotating the
                             discovered TKPs. The names must correspond to
                             folders within he `hmm_dir`. [default: Family,
                             Domain, Motif]
  -p, --pk_profile FILE      A path to the PK HMM profile. By default, will
                             try to find it within the `hmm_dir`.
  -m, --motif TEXT           A motif to discriminate between PKs and pseudo
                             PKs. This corresponds to the following conserved
                             elements:: (1) b3-Lys(2) aC-helix Glu(3-4-5) HRD
                             motif(6-7-8) DFG motif [default: KEXXDDXX]
  -o, --output DIRECTORY     Output directory to store the results. Be
                             default, will store within `./tkp-finder`.
  -n, --num_proc INTEGER     The number of cpus for data parallelism: each
                             input fasta will be annotated within separate
                             process. HINT: one may split large fasta files
                             for faster processing.
  -q, --quiet                Disable logging and progress bar
  --pk_map_name TEXT         Use this name for the protein kinase domain.
                             [default: PK]
  --ppk_map_name TEXT        Use this name for pseudo protein kinases.
                             [default: PPK]
  --min_domain_size INTEGER  The minimum number of amino acid residues within
                             a PK domain. [default: 150]
  --min_domains INTEGER      The number of domains to classify a protein as
                             TKP.
  --timeout INTEGER          For parallel processing, indicate timeout for
                             getting results of a single process.
  -h, --help                 Show this message and exit.
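The hint under `--num_proc` suggests splitting large FASTA files so that each part is annotated in its own process. tkp-finder does not document a splitting utility, so the following is only a sketch of one possible approach using the Python standard library. The functions `read_fasta` and `split_fasta` and the output naming scheme are hypothetical. Note that the sketch holds all records in memory, which may not suit very large inputs.

```python
from pathlib import Path


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_fasta(path, out_dir, n_parts):
    """Distribute records round-robin into at most `n_parts` FASTA files."""
    if n_parts < 1:
        raise ValueError("n_parts must be at least 1")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    buckets = [[] for _ in range(n_parts)]
    for i, record in enumerate(read_fasta(path)):
        buckets[i % n_parts].append(record)
    written = []
    stem = Path(path).stem
    for i, bucket in enumerate(buckets):
        if not bucket:
            continue  # fewer records than parts: do not write empty files
        out = out_dir / f"{stem}.part{i}.fasta"
        with open(out, "w") as handle:
            for header, seq in bucket:
                handle.write(f">{header}\n{seq}\n")
        written.append(out)
    return written
```

Assuming the parts were written to a directory named parts, they could then be passed together as positional FASTA arguments, with `-n` set to the number of available CPUs:

→ tkp-finder find -H hmm -n 4 parts/*.fasta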