A CLI tool for KEGG. It supports all conventional KEGG REST API commands, plus a custom command, get-seq, that fetches nucleotide or amino acid sequences for given gene IDs, a pathway, or a module.

kegg-cli


[!NOTE] You MUST make absolutely sure to comply with the conditions of using KEGG and its API: http://www.kegg.jp/kegg/legal.html and http://www.kegg.jp/kegg/rest/.

Features

  • Query KEGG database information
  • Retrieve entries from KEGG databases
  • List database entries
  • Search by keywords
  • Convert between KEGG and external database IDs
  • Find related entries between KEGG databases
  • Download gene sequences (nucleotide or amino acid)

Installation

Via pip

pip install kegg-cli

From source

git clone https://www.github.com/vidyasagar0405/kegg-cli
cd kegg-cli
pip install .

Commands

You can use the conventional KEGG API commands info, get, list, find, conv, and link. Example usage for each is shown below; see the KEGG REST API documentation for more information: https://www.kegg.jp/kegg/rest/keggapi.html

info

kegg-cli info kegg

get

  • Retrieves a gene entry
kegg-cli get eco:b0002
  • Enclose multiple IDs in "double quotes" and pass an option with -op/--option; for the available options, see the KEGG REST API documentation
kegg-cli get "hsa:10458 ece:Z5100" -op aaseq
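To make the relationship to the REST API concrete, here is a minimal sketch of how a get call like the ones above could map onto a KEGG REST URL. It assumes the public base URL https://rest.kegg.jp and the documented /get/<entries>[/<option>] path form with entries joined by '+'. The helper name kegg_get_url is hypothetical and is not part of kegg-cli; how kegg-cli builds its requests internally is not stated here.

```python
from typing import Optional, Sequence

# Base URL of the public KEGG REST service (assumption: unchanged at time of use).
KEGG_REST_BASE = "https://rest.kegg.jp"


def kegg_get_url(entry_ids: Sequence[str], option: Optional[str] = None) -> str:
    """Build a KEGG REST `get` URL (hypothetical helper, not part of kegg-cli)."""
    if not entry_ids:
        raise ValueError("at least one entry ID is required")
    # Multiple entries are joined with '+' in the URL path.
    url = f"{KEGG_REST_BASE}/get/{'+'.join(entry_ids)}"
    if option:
        # An option such as 'aaseq' or 'ntseq' is appended as a final path segment.
        url += f"/{option}"
    return url


# The quoted CLI argument is split on whitespace to recover the individual IDs.
print(kegg_get_url("hsa:10458 ece:Z5100".split(), "aaseq"))
```

Building the URL in a separate function keeps the request logic easy to check without touching the network.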

list

  • Returns the list of KEGG organisms with taxonomic classification
kegg-cli list organism
  • Returns the list entries for the given genes
kegg-cli list "rsz:19816419 rsz:19816420" 
  • Returns the list of human pathways
kegg-cli list pathway -org hsa

find

  • Returns compound IDs with the given formula
kegg-cli find C7H10O5 -db compound -op formula
  • Returns genes involved in cancer in humans
kegg-cli find "cancer hsa"
  • Returns pathways with "rna" in their name
kegg-cli find "rna" -db pathway

conv

  • Converts KEGG gene IDs to NCBI protein IDs
kegg-cli conv ncbi-proteinid "rsz:19816419 rsz:19816420 rsz:19816421 rsz:19816422"
  • Converts KEGG gene IDs to NCBI gene IDs
kegg-cli conv ncbi-geneid "rsz:19816419 rsz:19816420 rsz:19816421 rsz:19816422"
  • Converts NCBI gene IDs to KEGG gene IDs
kegg-cli conv genes "ncbi-geneid:19816419 ncbi-geneid:19816420 ncbi-geneid:19816421 ncbi-geneid:19816422"
  • Converts NCBI protein IDs to KEGG gene IDs
kegg-cli conv genes "ncbi-proteinid:YP_009046967 ncbi-proteinid:YP_009046968 ncbi-proteinid:YP_009046969 ncbi-proteinid:YP_009046970"

link

  • Returns genes linked to the rsz00966 pathway
kegg-cli link rsz rsz00966 
  • Returns compounds linked to the map00010 pathway
kegg-cli link cpd map00010 
  • Returns pathways linked to the given genes
kegg-cli link pathway "hsa:10458 ece:Z5100"
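The results of link (and conv) are plain text that usually needs a little post-processing. The sketch below groups such output by its first column. It assumes each line holds exactly two tab-separated columns, which should be checked against real output. The function name group_pairs is hypothetical, and the sample text uses placeholder IDs rather than real KEGG results.

```python
from collections import defaultdict
from typing import Dict, List


def group_pairs(text: str) -> Dict[str, List[str]]:
    """Group second-column values by first-column value (hypothetical helper).

    Assumes two tab-separated columns per non-blank line.
    """
    grouped: Dict[str, List[str]] = defaultdict(list)
    for line_number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines, e.g. from a trailing newline
        parts = line.split("\t")
        if len(parts) != 2:
            raise ValueError(f"line {line_number}: expected 2 columns, got {len(parts)}")
        grouped[parts[0]].append(parts[1])
    return dict(grouped)


# Placeholder text for illustration only; these are not real KEGG identifiers.
sample = "org:gene1\tpath:org00001\norg:gene1\tpath:org00002\norg:gene2\tpath:org00001\n"
for source, targets in group_pairs(sample).items():
    print(source, targets)
```

Raising on malformed lines rather than skipping them makes it obvious when the two-column assumption does not hold.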

get-seq

get-seq is not part of the KEGG REST API; it uses the link and get API calls to fetch the nucleotide sequence (ntseq) or amino acid sequence (aaseq).
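As a rough illustration of the get half of that workflow, the sketch below turns a list of gene IDs (for example, ones collected from a link call) into sequence request URLs. It assumes the KEGG REST documentation's limit of 10 entries per get request and the https://rest.kegg.jp base URL; both should be verified before use. The names batched and sequence_urls are hypothetical, and this is not necessarily how kegg-cli implements get-seq.

```python
from typing import Iterator, List, Sequence

KEGG_REST_BASE = "https://rest.kegg.jp"
# Assumption: the KEGG REST documentation limits `get` to 10 entries per request.
MAX_ENTRIES_PER_GET = 10


def batched(ids: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `ids` with at most `size` items each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


def sequence_urls(gene_ids: Sequence[str], seq_type: str = "ntseq") -> List[str]:
    """Return one `get` URL per batch of gene IDs (hypothetical helper)."""
    if seq_type not in ("ntseq", "aaseq"):
        raise ValueError("seq_type must be 'ntseq' or 'aaseq'")
    return [
        f"{KEGG_REST_BASE}/get/{'+'.join(batch)}/{seq_type}"
        for batch in batched(gene_ids, MAX_ENTRIES_PER_GET)
    ]


for url in sequence_urls(["rsz:108806876", "rsz:108839148"]):
    print(url)
```

The responses to these URLs would then be concatenated into a single FASTA file.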

get-seq can be used in four different ways, taking gene IDs, a file of gene IDs, a pathway, or a module as input:

  • Fetches the ntseq (the default) of the given genes and saves it to a timestamped file named time_date.fasta
kegg-cli get-seq "rsz:108806876 rsz:108839148"
  • Fetches the aaseq of the given genes and saves it to a timestamped file named time_date.fasta
kegg-cli get-seq "rsz:108806876 rsz:108839148" --seq-type aaseq
  • Fetches the aaseq of the given genes and saves it to path/to/file.fasta
kegg-cli get-seq "rsz:108806876 rsz:108839148" --seq-type aaseq -o path/to/file.fasta
  • Fetches the aaseq of the genes listed in the file (one per line) and saves it to expected_aaseq.fasta
kegg-cli get-seq /home/vs/Documents/bioinfo/practise/kegg/rsz_rsz_M00005_genes.tsv --seq-type aaseq -o expected_aaseq.fasta
  • Fetches the aaseq of the genes in the second column of the file (--field uses 0-based indexing; one gene per line) and saves it to expected_aaseq.fasta. The delimiter can be changed with --delimiter and defaults to '\t'
kegg-cli get-seq /home/vs/Documents/bioinfo/practise/kegg/rsz_rsz_M00005_genes.tsv --field 1 --seq-type aaseq -o expected_aaseq.fasta
  • Fetches the ntseq of all the genes in the given pathway. Add the 'path:' prefix to the pathway ID; only one pathway can be given at a time
kegg-cli get-seq path:minc00966 --seq-type ntseq -o tests/data/expected_pathway_ntseq.fasta
  • Fetches the ntseq of all the genes in the given module. Add the 'md:' prefix to the module ID; only one module can be given at a time
kegg-cli get-seq md:rsz_M00005 --seq-type ntseq -o tests/data/expected_module_ntseq.fasta 
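For the file-based inputs, the column selection described above can be pictured with the following minimal sketch. It reads one column of a delimited file using a 0-based field index and a tab as the default delimiter. The function name read_gene_ids and the default of field 0 are assumptions made for illustration; the actual kegg-cli implementation may differ.

```python
import csv
from typing import List


def read_gene_ids(path: str, field: int = 0, delimiter: str = "\t") -> List[str]:
    """Read gene IDs from one column of a delimited file (hypothetical helper).

    `field` is a 0-based column index, so field=1 selects the second column.
    """
    gene_ids: List[str] = []
    with open(path, newline="", encoding="utf-8") as handle:
        for row in csv.reader(handle, delimiter=delimiter):
            if len(row) <= field:
                continue  # skip blank lines and rows without the requested column
            value = row[field].strip()
            if value:
                gene_ids.append(value)
    return gene_ids
```

Called as read_gene_ids("genes.tsv", field=1), it would return the second column of a tab-separated file as a list of strings, ready to be passed on as gene IDs.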

License

kegg-cli is distributed under the terms of the MIT license.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
