Download genome files from NCBI by accession.

NCBI accession download script

A partner script to the popular ncbi-genome-download script, ncbi-acc-download lets you download sequences from GenBank/RefSeq by accession through the NCBI Entrez API.

Installation

pip install ncbi-acc-download

Alternatively, clone this repository from GitHub, then run the following inside a Python virtual environment:

pip install .

If this fails on older versions of Python, try updating your pip tool first:

pip install --upgrade pip

and then rerun the ncbi-acc-download install.

ncbi-acc-download is developed and tested only on Python releases still under active support by the Python project. At the moment, this means versions 3.6, 3.7, 3.8, and 3.9. No testing is done on Python versions older than 3.6.

ncbi-acc-download 0.2.6 was the last version to support Python 2.7.

If your system is stuck on an older version of Python, consider using a tool like Homebrew or Linuxbrew to obtain a more up-to-date version.
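The version requirement above can be checked before installing. The following is a minimal sketch and not part of ncbi-acc-download: the helper name is_supported and the MINIMUM constant are hypothetical, and the check only covers the lower bound of 3.6 stated above.

```python
import sys

# Lowest Python version mentioned as tested in this README.
MINIMUM = (3, 6)


def is_supported(version_info=None):
    """Return True if the (major, minor) version is at least MINIMUM.

    Hypothetical helper: defaults to the running interpreter.
    """
    info = sys.version_info if version_info is None else version_info
    return tuple(info[:2]) >= MINIMUM


if not is_supported():
    print("This Python is older than 3.6; consider a newer interpreter.")
```

Passing an explicit tuple, such as is_supported((2, 7)), lets you test a version other than the one currently running.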

Usage

To download a nucleotide record AB_12345 in GenBank format, run

ncbi-acc-download AB_12345

To download a nucleotide record AB_12345 in FASTA format, run

ncbi-acc-download --format fasta AB_12345

To download a protein record WP_12345 in FASTA format, run

ncbi-acc-download --molecule protein WP_12345
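If you want to call the tool from a Python script, one option is to assemble the command line and hand it to subprocess. This is a rough sketch under the assumption that ncbi-acc-download is installed and on your PATH; build_command and download are hypothetical helper names, and only the --format and --molecule flags shown above are used.

```python
import shlex
import subprocess


def build_command(accessions, fmt=None, molecule=None):
    """Assemble an ncbi-acc-download argument list (hypothetical helper)."""
    if not accessions:
        raise ValueError("at least one accession is required")
    cmd = ["ncbi-acc-download"]
    if fmt is not None:
        cmd += ["--format", fmt]
    if molecule is not None:
        cmd += ["--molecule", molecule]
    cmd += list(accessions)
    return cmd


def download(accessions, fmt=None, molecule=None):
    """Run the command; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_command(accessions, fmt, molecule), check=True)


# Print the command lines from the examples above without running them.
for cmd in (
    build_command(["AB_12345"], fmt="fasta"),
    build_command(["WP_12345"], molecule="protein"),
):
    print(" ".join(shlex.quote(part) for part in cmd))
```

Keeping the argument list separate from the call that executes it makes it easy to log or inspect a command before any network request is made.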

To just generate a list of download URLs to run the actual download elsewhere, run

ncbi-acc-download --url AB_12345

If you want to concatenate multiple sequences into a single file, run

ncbi-acc-download --out two_genomes.gbk AB_12345 AB_23456

To chain ncbi-acc-download with other command line tools, pass /dev/stdout as the --out filename; the downloaded data is then printed to standard output instead of being written to a file:

ncbi-acc-download --out /dev/stdout --format fasta AB_12345 AB_23456 | gzip > two_genomes.fa.gz
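The same pipe can be reproduced from Python without a shell. The sketch below is one possible approach, not something shipped with the tool: run_to_gzip is a hypothetical helper that runs any command and gzip-compresses its standard output into a file.

```python
import gzip
import shutil
import subprocess


def run_to_gzip(cmd, dest):
    """Run cmd and gzip-compress its standard output into dest.

    Hypothetical helper; raises CalledProcessError on a non-zero exit.
    """
    with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc, \
            gzip.open(dest, "wb") as out:
        # Stream in chunks so large records are never held fully in memory.
        shutil.copyfileobj(proc.stdout, out)
    # Leaving the with-block waits for the process, so returncode is set.
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, cmd)


# Intended use (needs network access and ncbi-acc-download on PATH):
# run_to_gzip(
#     ["ncbi-acc-download", "--out", "/dev/stdout", "--format", "fasta",
#      "AB_12345", "AB_23456"],
#     "two_genomes.fa.gz",
# )
```

Note that a partial output file may remain on disk if the command fails, so callers may want to delete it when the exception is raised.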

If you want to download all records covered by a WGS master record instead of the master record itself, run

ncbi-acc-download --recursive NZ_EXMP01000000

To download only part of a record, supply a genomic range with --range:

ncbi-acc-download NC_007194 --range 1001:9000

Because cutting a record with a range like this can leave partial features at both ends of the record, you can combine the range download with the "correct" extended validator to remove the partial features:

ncbi-acc-download NC_007194 --range 1001:9000 --extended-validation correct
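When ranges are generated by a script, it can help to validate them before building the command. The following sketch uses a hypothetical helper, range_args; the check that positions start at 1 is an assumption about the coordinate convention and is not taken from the tool's documentation.

```python
def range_args(start, end, correct_partial=False):
    """Return the --range arguments for a start:end pair.

    Hypothetical helper. Assumes positions are integers >= 1 and that
    start must not exceed end (an assumption, not confirmed by the tool).
    """
    if not (isinstance(start, int) and isinstance(end, int)):
        raise TypeError("start and end must be integers")
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    args = ["--range", f"{start}:{end}"]
    if correct_partial:
        # Ask the tool to remove partial features left at the cut points.
        args += ["--extended-validation", "correct"]
    return args


# Argument list matching the command shown above.
cmd = ["ncbi-acc-download", "NC_007194"] + range_args(1001, 9000, correct_partial=True)
print(cmd)
```

Rejecting a reversed or empty range locally avoids sending a request that cannot return a useful record.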

You can get more detailed information on the download progress by using the --verbose or -v flag.

To get an overview of all options, run

ncbi-acc-download --help

License

All code is available under the Apache License version 2, see the LICENSE file for details.
