A collection of scripts that wrap Biopython's Bio.Entrez module to query and download portions of NCBI databases
ncbi_db
Collection of commands to query or process NCBI data
Installation
conda install -c mmariotti -c conda-forge -c etetoolkit ncbi_db
Tools
These command-line tools are available:
- ncbi_assembly: search and download assemblies/genomes for any species or lineage, or their annotations/proteomes
- ncbi_sequences: search and download nucleotide/protein sequences or their metadata
- ncbi_pubmed: search and format NCBI PubMed entries
- ncbi_taxonomy: search the NCBI taxonomy for species or lineages
- ncbi_taxonomy_tree: obtain a tree from the NCBI taxonomy for a set of input species
- ncbi_search: generic search tool for any NCBI database
- parse_genbank: parse a GenBank flat file; requires installation of GBParsy
Run any tool with the -h option to display its usage.
Most tools require an internet connection, since they query NCBI online.
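As a quick check after installation, the sketch below looks up each of the tools listed above on the PATH and can run one of them with -h. It uses only the Python standard library; the helper names find_tools and show_usage are made up for this example and are not part of ncbi_db.

```python
import shutil
import subprocess

# Tool names taken from the list above.
TOOLS = [
    "ncbi_assembly",
    "ncbi_sequences",
    "ncbi_pubmed",
    "ncbi_taxonomy",
    "ncbi_taxonomy_tree",
    "ncbi_search",
    "parse_genbank",
]


def find_tools(names=TOOLS):
    """Map each tool name to its executable path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def show_usage(name):
    """Run `<name> -h` and return whatever the tool prints (stdout plus stderr)."""
    path = shutil.which(name)
    if path is None:
        raise FileNotFoundError(f"{name} not found on PATH; is ncbi_db installed?")
    completed = subprocess.run([path, "-h"], capture_output=True, text=True)
    return completed.stdout + completed.stderr


if __name__ == "__main__":
    # Report which tools are visible in the current environment.
    for name, path in find_tools().items():
        print(f"{name:20s} {path or 'not found'}")
```

If a tool is reported as not found, the conda environment where ncbi_db was installed is probably not active.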
Using ncbi_db as a module
To use these tools from another Python module, import them from ncbi_db and call their "main" function with the same arguments you would give on the command line, passed as a dictionary. Set the 'silent' option to avoid printing the results to the screen. For example:
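from ncbi_db import ncbi_sequences
# Keys are the same option names used on the command line;
# 'silent' suppresses printing of the results
arguments = {'m': 'P', 'f': 1, 'I': 'AAB88790', 'silent': 1}
results = ncbi_sequences.main(arguments)
print(results)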
{'AAB88790':
['AAB88790.1 gi|2411487|gb|AAB88790.1| selenophosphate synthetase [Drosophila melanogaster]',
'MSYAADVLNSAHLELHGGGDAELRRPFDPTAHDLDASFRLTRFADLKGRGCKVPQDVLSKLVSALQQDYSAQDQEPQFLNVAIPRIGIGLDCSVIPLRHGGLCLVQTTDFFYPIVDDPYMMGKIACANVLSDLYAMGVTDCDNMLMLLAVSTKMTEKERDVVIPLIMRGFKDSALEAGTTVTGGQSVVNPWCTIGGVASTICQPNEYIVPDNAVVGDVLVLTKPLGTQVAVNAHQWIDQPERWNRIKLVVSEKNVRKAYHRAMNSMARLNRVAARLMHKYNAHGATDITGFGLLGHAQTLAAHQKKDVSFVIHNLPVIAKMAAVAKACGNMFQLLQGHSAETSGGLLICLPREQAAAYCKDIEKQEGYQAWIIGIVEKGNKTARIIDKPRVIEVPAKD']}
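Judging from the output above, the returned dictionary maps each ID to a two-item list holding the header and the sequence. Assuming that shape holds for your query (it may differ with other options), a minimal sketch for turning such a result into FASTA text could look like this. The function results_to_fasta and the sample record are made up for illustration and are not part of ncbi_db.

```python
def results_to_fasta(results, width=60):
    """Format a {id: [header, sequence]} dictionary as FASTA text.

    Sequences are wrapped at `width` characters per line.
    """
    records = []
    for header, sequence in results.values():
        # Split the sequence into fixed-width chunks.
        chunks = [sequence[i:i + width] for i in range(0, len(sequence), width)]
        records.append("\n".join([">" + header] + chunks) + "\n")
    return "".join(records)


if __name__ == "__main__":
    # Hypothetical record with the same shape as the output shown above.
    sample = {"ID1": ["ID1 example protein", "ABCDEFGHIJ"]}
    print(results_to_fasta(sample, width=4), end="")
```

In real use, the dictionary returned by ncbi_sequences.main(arguments) would take the place of the sample record.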
Developers
Marco Mariotti https://github.com/marco-mariotti
Didac Santesmasses https://github.com/didacs