Collection of scripts that wrap the Biopython Entrez module (Bio.Entrez) to query and download portions of NCBI databases

ncbi_db

Collection of commands to query or process NCBI data

Installation

conda install -c mmariotti -c conda-forge -c etetoolkit ncbi_db

Tools

These command line tools are available:

  • ncbi_assembly: search and download assemblies/genomes for any species or lineage, or their annotations/proteomes
  • ncbi_sequences: search and download nucleotide/protein sequences or their metadata
  • ncbi_pubmed: search and format NCBI PubMed entries
  • ncbi_taxonomy: search NCBI taxonomy for species or lineages
  • ncbi_taxonomy_tree: obtain a tree from NCBI taxonomy for a set of input species
  • ncbi_search: generic search tool for any NCBI database
  • parse_genbank: parse a GenBank flat file; requires installation of GBParsy

Run any tool with option -h to display its usage.
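
Since every tool documents itself through -h, one way to see which tools are installed and to read their usage from Python is sketched below. It assumes the tools are installed as executables on PATH under the names listed above; find_tools and show_usage are hypothetical helper names for illustration, not part of ncbi_db.

```python
import shutil
import subprocess

# Tool names as listed in the "Tools" section.
TOOLS = [
    "ncbi_assembly",
    "ncbi_sequences",
    "ncbi_pubmed",
    "ncbi_taxonomy",
    "ncbi_taxonomy_tree",
    "ncbi_search",
    "parse_genbank",
]


def find_tools(names):
    """Map each tool name to its location on PATH, or None if it is not found."""
    return {name: shutil.which(name) for name in names}


def show_usage(executable):
    """Run `<executable> -h` and return whatever it prints (stdout and stderr)."""
    completed = subprocess.run([executable, "-h"], capture_output=True, text=True)
    return completed.stdout + completed.stderr


if __name__ == "__main__":
    # Report which tools are reachable; this does not contact NCBI.
    for name, path in find_tools(TOOLS).items():
        print(f"{name}: {path or 'not found on PATH'}")
```

Calling show_usage("ncbi_sequences") would then return the usage text of that tool, provided it is installed.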

Most tools require an internet connection, since they query NCBI online.

Using ncbi_db as a module

To use these tools from another Python module, import them from ncbi_db and call their "main" function, passing the same arguments you would give on the command line in the form of a dictionary. Set the 'silent' option to avoid printing results to the screen. For example:

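from ncbi_db import ncbi_sequences
# Command-line options become dictionary keys; 'silent' suppresses printing to screen
arguments = {'m': 'P', 'f': 1, 'I': 'AAB88790', 'silent': 1}
results = ncbi_sequences.main(arguments)
# Print the returned dictionary (output shown below)
print(results)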
{'AAB88790':
 ['AAB88790.1 gi|2411487|gb|AAB88790.1| selenophosphate synthetase [Drosophila melanogaster]',
 'MSYAADVLNSAHLELHGGGDAELRRPFDPTAHDLDASFRLTRFADLKGRGCKVPQDVLSKLVSALQQDYSAQDQEPQFLNVAIPRIGIGLDCSVIPLRHGGLCLVQTTDFFYPIVDDPYMMGKIACANVLSDLYAMGVTDCDNMLMLLAVSTKMTEKERDVVIPLIMRGFKDSALEAGTTVTGGQSVVNPWCTIGGVASTICQPNEYIVPDNAVVGDVLVLTKPLGTQVAVNAHQWIDQPERWNRIKLVVSEKNVRKAYHRAMNSMARLNRVAARLMHKYNAHGATDITGFGLLGHAQTLAAHQKKDVSFVIHNLPVIAKMAAVAKACGNMFQLLQGHSAETSGGLLICLPREQAAAYCKDIEKQEGYQAWIIGIVEKGNKTARIIDKPRVIEVPAKD']}
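
The returned dictionary can be processed like any other Python object. As a minimal sketch, the function below turns results of the shape shown above into FASTA text. It assumes every value is a two-element list of [header, sequence], as in the example output; other option combinations may return a different structure. results_to_fasta is a hypothetical helper, and the sample data uses a made-up ID with a shortened sequence so that the snippet runs without contacting NCBI.

```python
def results_to_fasta(results, width=60):
    """Format a {id: [header, sequence]} dictionary as FASTA text.

    Assumes the structure shown in the example output above.
    """
    records = []
    for header, sequence in results.values():
        # Split the sequence into fixed-width lines.
        lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
        records.append("\n".join([">" + header] + lines))
    return "\n".join(records) + "\n"


# Stand-in data with the same shape as the real output (hypothetical entry).
example = {
    "EXAMPLE1": [
        "EXAMPLE1.1 hypothetical protein [Example species]",
        "MSYAADVLNSAHLELHGGGD",
    ]
}

print(results_to_fasta(example, width=10), end="")
```

With real data, the same call would be results_to_fasta(results), where results comes from ncbi_sequences.main.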

Developers

Marco Mariotti https://github.com/marco-mariotti

Didac Santesmasses https://github.com/didacs

Download files

Source Distribution

ncbi_db-0.1.1.tar.gz (132.2 kB)

Uploaded Source

Built Distribution

ncbi_db-0.1.1-py3-none-any.whl (140.0 kB)

Uploaded Python 3

File details

Details for the file ncbi_db-0.1.1.tar.gz.

File metadata

  • Download URL: ncbi_db-0.1.1.tar.gz
  • Size: 132.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.4.2 requests/2.22.0 setuptools/45.2.0 requests-toolbelt/0.8.0 tqdm/4.30.0 CPython/3.8.10

File hashes

Hashes for ncbi_db-0.1.1.tar.gz

Algorithm    Hash digest
SHA256       bbf92c354b8452d829e139a1b60a1979848c7976f707bb136f651b68a49ba029
MD5          6837f7505deb50625a5571f73be6be8f
BLAKE2b-256  b12c41caf46c4e0dba623d8ec1057de0c98e5fd8b38dd3192edaf5c214944f0b
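
To check a downloaded archive against the SHA256 digest listed above, a minimal sketch using Python's standard hashlib module follows. The helper names and the local file path in the usage note are assumptions; the expected digest is copied from the listing above.

```python
import hashlib

# SHA256 digest listed above for ncbi_db-0.1.1.tar.gz.
EXPECTED_SHA256 = "bbf92c354b8452d829e139a1b60a1979848c7976f707bb136f651b68a49ba029"


def sha256_of_file(path, chunk_size=65536):
    """Return the hexadecimal SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hexadecimal digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

After downloading the source distribution to the current directory, matches_expected("ncbi_db-0.1.1.tar.gz", EXPECTED_SHA256) should return True if the file is intact.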

File details

Details for the file ncbi_db-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: ncbi_db-0.1.1-py3-none-any.whl
  • Size: 140.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.4.2 requests/2.22.0 setuptools/45.2.0 requests-toolbelt/0.8.0 tqdm/4.30.0 CPython/3.8.10

File hashes

Hashes for ncbi_db-0.1.1-py3-none-any.whl

Algorithm    Hash digest
SHA256       e26e3a4f1423eec5d710377c88a84134c5414243198b115a31e252a1db9056d2
MD5          43f861f40110ac166b0f15e4b3dbb309
BLAKE2b-256  a9c4236c1489ccfe2cfd2820588faca5bf8e7e1911a2e4d600acab67b36f99d4
