A CLI tool for KEGG. It supports all conventional KEGG REST API commands, plus a custom command, get-seq, that fetches nucleotide or amino acid sequences for given gene IDs, a pathway, or a module.
kegg-cli
[!NOTE] You MUST make absolutely sure to comply with the conditions of using KEGG and its API: http://www.kegg.jp/kegg/legal.html and http://www.kegg.jp/kegg/rest/.
Features
- Query KEGG database information
- Retrieve entries from KEGG databases
- List database entries
- Search by keywords
- Convert between KEGG and external database IDs
- Find related entries between KEGG databases
- Download gene sequences (nucleotide or amino acid)
Installation
Via pip
pip install kegg-cli
From source
git clone https://www.github.com/vidyasagar0405/kegg-cli
cd kegg-cli
pip install .
Commands
You can use the conventional KEGG API commands info, get, list, find, conv, and link; example usage is shown below. For more information, check the KEGG REST API documentation: https://www.kegg.jp/kegg/rest/keggapi.html
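For orientation, the KEGG REST API addresses each operation as a URL path of the form https://rest.kegg.jp/<operation>/<argument>/..., with multiple entry IDs joined by "+". The sketch below shows how the arguments in the examples that follow could map onto such URLs. The helper name kegg_url is hypothetical, and this is not necessarily how kegg-cli builds its requests internally.

```python
BASE_URL = "https://rest.kegg.jp"


def kegg_url(operation, *arguments):
    """Build a KEGG REST URL (hypothetical helper, not part of kegg-cli)."""
    # Space-separated entry IDs are joined with '+' inside one path segment.
    parts = ["+".join(arg.split()) for arg in arguments if arg]
    return "/".join([BASE_URL, operation, *parts])


print(kegg_url("info", "kegg"))
print(kegg_url("get", "hsa:10458 ece:Z5100", "aaseq"))
print(kegg_url("list", "pathway", "hsa"))
```

Building the URL in one place keeps the quoting rule (space-separated IDs on the command line, "+" in the request) out of the individual commands.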
info
kegg-cli info kegg
get
- Retrieves a gene entry
kegg-cli get eco:b0002
- Enclose multiple IDs in "double quotes" and pass options with -op/--option. For the available options, check the KEGG REST API documentation.
kegg-cli get "hsa:10458 ece:Z5100" -op aaseq
list
- Returns the list of KEGG organisms with taxonomic classification
kegg-cli list organism
- Returns the list entries for the given genes
kegg-cli list "rsz:19816419 rsz:19816420"
- Returns the list of human pathways
kegg-cli list pathway -org hsa
find
- Returns compound ids with the said formula
kegg-cli find C7H10O5 -db compound -op formula
- Returns genes involved in cancer in humans
kegg-cli find "cancer hsa"
- Returns pathways with RNA in their name
kegg-cli find "rna" -db pathway
conv
- Converts KEGG gene IDs to NCBI protein IDs
kegg-cli conv ncbi-proteinid "rsz:19816419 rsz:19816420 rsz:19816421 rsz:19816422"
- Converts KEGG gene IDs to NCBI gene IDs
kegg-cli conv ncbi-geneid "rsz:19816419 rsz:19816420 rsz:19816421 rsz:19816422"
- Converts NCBI gene IDs to KEGG gene IDs
kegg-cli conv genes "ncbi-geneid:19816419 ncbi-geneid:19816420 ncbi-geneid:19816421 ncbi-geneid:19816422"
- Converts NCBI protein IDs to KEGG gene IDs
kegg-cli conv genes "ncbi-proteinid:YP_009046967 ncbi-proteinid:YP_009046968 ncbi-proteinid:YP_009046969 ncbi-proteinid:YP_009046970"
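The KEGG conv operation returns plain text with two tab-separated columns, one ID pair per line. If you capture that output, a small parser along the following lines could turn it into a lookup table. This is a minimal sketch: parse_conv is a hypothetical name, and the sample text is made up to match the two-column shape rather than copied from a real response.

```python
def parse_conv(text):
    """Parse two-column, tab-separated text into (first_id, second_id) pairs."""
    pairs = []
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        pairs.append((fields[0].strip(), fields[1].strip()))
    return pairs


# Made-up sample shaped like conv output; not a real KEGG response.
sample = (
    "rsz:19816419\tncbi-geneid:19816419\n"
    "rsz:19816420\tncbi-geneid:19816420\n"
)
mapping = dict(parse_conv(sample))
print(mapping["rsz:19816419"])
```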
link
- Returns genes linked to the rsz00966 pathway
kegg-cli link rsz rsz00966
- Returns compounds linked to the map00010 pathway
kegg-cli link cpd map00010
- Returns pathways linked to the given genes
kegg-cli link pathway "hsa:10458 ece:Z5100"
get-seq
get-seq is not part of the KEGG REST API; it uses the link and get API calls to fetch the nucleotide sequence (ntseq) or amino acid sequence (aaseq).
get-seq accepts four kinds of input: gene IDs, a file of gene IDs, a pathway, or a module:
- Fetches the ntseq (default) of the given genes and saves it to a file named with time_date.fasta
kegg-cli get-seq "rsz:108806876 rsz:108839148"
- Fetches the aaseq of the given genes and saves it to a file named with time_date.fasta
kegg-cli get-seq "rsz:108806876 rsz:108839148" --seq-type aaseq
- Fetches the aaseq of the given genes and saves it to a file named path/to/file.fasta
kegg-cli get-seq "rsz:108806876 rsz:108839148" --seq-type aaseq -o path/to/file.fasta
- Fetches the aaseq of the genes listed in the file (one per line) and saves it to expected_aaseq.fasta
kegg-cli get-seq /home/vs/Documents/bioinfo/practise/kegg/rsz_rsz_M00005_genes.tsv --seq-type aaseq -o expected_aaseq.fasta
- Fetches the aaseq of the genes in the second column of the file (--field uses 0-based indexing, one gene per line) and saves it to expected_aaseq.fasta. The delimiter can be changed with --delimiter and defaults to '\t'.
kegg-cli get-seq /home/vs/Documents/bioinfo/practise/kegg/rsz_rsz_M00005_genes.tsv --field 1 --seq-type aaseq -o expected_aaseq.fasta
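To illustrate what a 0-based field and a delimiter mean for the input file, here is a minimal sketch of reading gene IDs from one column of a delimited file. The function name read_gene_ids, its default of column 0, and the two-column sample are assumptions for illustration; kegg-cli's own parsing may differ.

```python
import csv
import io


def read_gene_ids(handle, field=0, delimiter="\t"):
    """Collect one gene ID per row from the given 0-based column."""
    gene_ids = []
    for row in csv.reader(handle, delimiter=delimiter):
        # Ignore rows that are too short or have an empty cell in that column.
        if len(row) > field and row[field].strip():
            gene_ids.append(row[field].strip())
    return gene_ids


# Made-up two-column sample; the real file layout is not shown in this README.
sample = io.StringIO(
    "md:rsz_M00005\trsz:108806876\n"
    "md:rsz_M00005\trsz:108839148\n"
)
print(read_gene_ids(sample, field=1))
```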
- Fetches the ntseq of all the genes in the given pathway (add the 'path:' prefix to the pathway ID; one pathway at a time)
kegg-cli get-seq path:minc00966 --seq-type ntseq -o tests/data/expected_pathway_ntseq.fasta
- Fetches the ntseq of all the genes in the given module (add the 'md:' prefix to the module ID; one module at a time)
kegg-cli get-seq md:rsz_M00005 --seq-type ntseq -o tests/data/expected_module_ntseq.fasta
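Since get-seq combines link and get calls, a long gene list (for example, every gene in a pathway) has to be fetched in several requests: the KEGG get operation accepts at most 10 entries at a time. The sketch below shows one way such batching could be planned. The function name and the placeholder IDs are hypothetical, and this is not presented as kegg-cli's actual implementation.

```python
BASE_URL = "https://rest.kegg.jp"
MAX_ENTRIES_PER_GET = 10  # KEGG limits a single get request to 10 entries


def batched_get_urls(gene_ids, seq_type="ntseq"):
    """Plan one get URL per batch of gene IDs (hypothetical helper)."""
    if seq_type not in ("ntseq", "aaseq"):
        raise ValueError("seq_type must be 'ntseq' or 'aaseq'")
    urls = []
    for start in range(0, len(gene_ids), MAX_ENTRIES_PER_GET):
        batch = gene_ids[start:start + MAX_ENTRIES_PER_GET]
        urls.append(f"{BASE_URL}/get/{'+'.join(batch)}/{seq_type}")
    return urls


# 25 placeholder IDs, only to show the batching; they are not real gene IDs.
placeholder_ids = [f"rsz:{n}" for n in range(1, 26)]
urls = batched_get_urls(placeholder_ids, seq_type="aaseq")
print(len(urls))
print(urls[-1])
```

With 25 IDs the plan contains three requests (10, 10, and 5 entries); the responses would then be concatenated into a single FASTA file.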
License
kegg-cli is distributed under the terms of the MIT license.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.