A tool to discover and annotate tandem protein kinases

tkp-finder

tkp-finder is a CLI tool to discover and annotate tandem protein kinases.

It's based on lXtractor -- a general-purpose library for data mining from sequences and structures. The latter is under active development, so bugs are possible.


Installation

pip install tkp-finder

License

tkp-finder is distributed under the terms of the MIT license.

Usage

Installation should make the tkp-finder script globally available. The interface has two commands: setup and find.

The setup command will download and prepare HMM models for annotation.

→ tkp-finder setup --help

Usage: tkp-finder setup [OPTIONS]

  Command to initialize the HMM data needed for TKPs' annotation.

Options:
  -H, --hmm_dir DIRECTORY  Path to a directory to store hmm-related data.
                           [required]
  -d, --download           If True, download the Pfam data from interpro.
  -q, --quiet              Disable verbose output.
  --path_pfam_a FILE       A path to downloaded Pfam-A HMM profiles. By
                           default, if `download` is ``False``,will try to
                           find it within the `hmm_dir`.
  --path_pfam_dat FILE     A path to downloaded Pfam-A (meta)data file. By
                           default, if `download` is ``False``,will try to
                           find it within the `hmm_dir`.
  -h, --help               Show this message and exit.

For first-time use, run:

→ tkp-finder setup -H hmm -d

This will download Pfam-A HMMs and accompanying metadata, and split the models into categories. The resulting directory:

→ tree -L 2 hmm

hmm
├── PF00069.hmm
├── Pfam-A.hmm
├── Pfam-A.hmm.dat
├── pfam_entries.tsv
└── profiles
    ├── Coiled-coil
    ├── Disordered
    ├── Domain
    ├── Family
    ├── Motif
    ├── Repeat
    └── unknown
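
Before running the find command, it can help to confirm that the prepared directory has the expected layout. The following is a minimal sketch, not part of tkp-finder: the function name `missing_hmm_items` is hypothetical, and it assumes that the only required items are the `PF00069.hmm` profile and one `profiles` subfolder per HMM type (by default Family, Domain, and Motif, as listed in the find help).

```python
from pathlib import Path


def missing_hmm_items(hmm_dir, hmm_types=("Family", "Domain", "Motif")):
    """Return relative paths expected in `hmm_dir` that are absent.

    Hypothetical helper: the expected layout is inferred from the
    directory listing and the `find --help` text, not from tkp-finder's code.
    """
    root = Path(hmm_dir)
    missing = []
    # The target protein kinase profile sits at the top level.
    if not (root / "PF00069.hmm").is_file():
        missing.append("PF00069.hmm")
    # Each requested HMM type should be a folder under `profiles`.
    for name in hmm_types:
        if not (root / "profiles" / name).is_dir():
            missing.append(f"profiles/{name}")
    return missing


if __name__ == "__main__":
    # "hmm" matches the directory name used in the setup example.
    print(missing_hmm_items("hmm"))
```

An empty list suggests the directory is ready; any listed item points to a setup step that did not complete.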

To discover and annotate TKPs, use the tkp-finder find command:

→ tkp-finder find --help

Usage: tkp-finder find [OPTIONS] [FASTA]...

Options:
  -H, --hmm_dir DIRECTORY    Directory with HMM profiles. Expected to contain
                             `profiles` dir and target PK profile
                             (PF00069.hmm). See `tkp-finder setup` on how to
                             prepare this dir.
  -t, --hmm_type TEXT        Which HMM types to use for annotating the
                             discovered TKPs. The names must correspond to
                             folders within he `hmm_dir`.  [default: Family,
                             Domain, Motif]
  -p, --pk_profile FILE      A path to the PK HMM profile. By default, will
                             try to find it within the `hmm_dir`.
  -m, --motif TEXT           A motif to discriminate between PKs and pseudo
                             PKs. This corresponds to the following conserved
                             elements::  (1) b3-Lys(2) aC-helix Glu(3-4-5) HRD
                             motif(6-7-8) DFG motif  [default: KEXXDDXX]
  -o, --output DIRECTORY     Output directory to store the results. Be
                             default, will store within `./tkp-finder`.
  -n, --num_proc INTEGER     The number of cpus for data parallelism: each
                             input fasta will be annotated within separate
                             process. HINT: one may split large fasta files
                             for faster processing.
  -q, --quiet                Disable logging and progress bar
  --pk_map_name TEXT         Use this name for the protein kinase domain.
                             [default: PK]
  --ppk_map_name TEXT        Use this name for pseudo protein kinases.
                             [default: PPK]
  --min_domain_size INTEGER  The minimum number of amino acid residues within
                             a PK domain.  [default: 150]
  --min_domains INTEGER      The number of domains to classify a protein as
                             TKP.
  --timeout INTEGER          For parallel processing, indicate timeout for
                             getting results of a single process.
  -h, --help                 Show this message and exit.
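
The `--num_proc` hint above suggests splitting large FASTA files, since each input file is annotated in its own process. The following is a minimal sketch of one way to do that, assuming plain-text FASTA where every record starts with a `>` header line. The function `split_fasta` and the chunk naming scheme are hypothetical and not part of tkp-finder.

```python
from pathlib import Path


def split_fasta(path, out_dir, n_chunks):
    """Distribute the records of one FASTA file over up to `n_chunks` files.

    Records are assigned round-robin, so chunk sizes differ by at most one
    record. Returns the paths of the files written.
    """
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")

    # Group lines into records; each record starts at a '>' header.
    records = []
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                records.append([line])
            elif records:
                records[-1].append(line)

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stem = Path(path).stem

    written = []
    # Never create more files than there are records.
    for i in range(min(n_chunks, len(records))):
        chunk_path = out_dir / f"{stem}_{i}.fasta"
        with open(chunk_path, "w") as out:
            for record in records[i::n_chunks]:
                out.writelines(record)
        written.append(chunk_path)
    return written
```

The resulting files can then be passed together as the `[FASTA]...` arguments, for example `tkp-finder find -H hmm -n 4` followed by the chunk paths. This sketch holds all records in memory, which may not suit very large inputs.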
