Easily updatable local NCBI taxonomy dumps file copy

NtDownload

NtDownload is a tool for downloading the NCBI Taxonomy database dump files and keeping a local copy of them up to date.

After each download it creates a timestamp file, so that the next time the tool is run, the download is repeated only if a newer version is available.

Installation

The software is distributed as a Python 3 package and can be installed using pip install ntdownload.

Command line interface

ntdownload

To download the dump files, use the ntdownload script, passing the output directory as a command-line argument. The directory is created if it does not exist. The archive is downloaded using FTP; HTTPS is used instead if FTP does not work or if the option --force-https is given.

After download, the dump files archive is unpacked and then deleted, unless the option --no-unpack is used.

If the option --exitcode is used, the script exits with code 100 when no newer version of the dump files was found and therefore nothing was downloaded. In all other cases the exit code is 0 on success and 1 on error.
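The exit code convention above lends itself to scripting. The following is a minimal sketch of how a wrapper might call the script and interpret the result. The helper names (describe_exit_code, build_command, update_dumps) are hypothetical and not part of ntdownload; only the script name, the positional output directory, and the three options come from the description above, and the order of the arguments is an assumption.

```python
import subprocess

# Exit code documented for --exitcode when no newer dump is available.
NO_NEWER_VERSION = 100


def describe_exit_code(code):
    """Map an exit code of `ntdownload --exitcode` to a short status string."""
    if code == 0:
        return "downloaded"
    if code == NO_NEWER_VERSION:
        return "up-to-date"
    return "error"


def build_command(output_dir, force_https=False, no_unpack=False):
    """Build the argument list; the argument order is an assumption."""
    cmd = ["ntdownload", output_dir, "--exitcode"]
    if force_https:
        cmd.append("--force-https")
    if no_unpack:
        cmd.append("--no-unpack")
    return cmd


def update_dumps(output_dir, **options):
    """Run ntdownload (must be installed and on PATH) and report the outcome."""
    result = subprocess.run(build_command(output_dir, **options))
    return describe_exit_code(result.returncode)
```

A scheduled job could call update_dumps("ntdumps") and rebuild downstream indexes only when the returned status is "downloaded".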

ntnames

The ntnames script can be run after ntdownload to create a list of taxonomy IDs and scientific names. This list can be used as an attribute source file for fastsubtrees.

API

Downloader

To download the dump files, use the Downloader class:

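from ntdownload import Downloader

# output_directory_name: path of the directory in which the dump files are stored
d = Downloader(output_directory_name)
# run() returns True if a dump file was downloaded, False otherwise
has_downloaded = d.run()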

The output directory is created if it does not exist. The dump files archive is unpacked and then deleted, unless the option unpack=False is used.

The download protocol is FTP; HTTPS is used instead if FTP does not work or if the option force_https=True is used.

The return value of run() is True if a dump file was downloaded, False if no newer version was available.
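As an illustration of acting on the return value of run(), here is a small sketch of a wrapper function. The function name refresh_dumps and its downloader_factory parameter are hypothetical; the parameter exists only so that the logic can be exercised without network access. Only Downloader(output_directory_name) and run() are taken from the description above.

```python
def refresh_dumps(output_dir, downloader_factory=None):
    """Run a download and return "updated" or "up-to-date".

    downloader_factory is a hypothetical hook for testing; by default
    the Downloader class of ntdownload is used.
    """
    if downloader_factory is None:
        # Imported lazily so that this module loads without ntdownload installed.
        from ntdownload import Downloader
        downloader_factory = Downloader
    downloader = downloader_factory(output_dir)
    has_downloaded = downloader.run()
    return "updated" if has_downloaded else "up-to-date"
```

Calling refresh_dumps("ntdumps") with ntdownload installed would perform a real download into the ntdumps directory.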

Scientific names iterator

The function yield_scientific_names_from_dump(ntdumps_dir) yields tuples of the form (tax_id, scientific_name), read from the names.dmp file in the directory ntdumps_dir.
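To make the extraction of scientific names concrete, the following sketch shows one way such an iterator could be written. It is not the package's implementation. It assumes the standard NCBI names.dmp layout, in which the fields tax_id, name, unique name and name class are separated by a tab, a vertical bar and a tab, and each line ends with a tab and a vertical bar. The function name iter_scientific_names is hypothetical, and returning the taxonomy ID as an integer is an assumption.

```python
import os


def iter_scientific_names(ntdumps_dir):
    """Yield (tax_id, scientific_name) tuples from names.dmp in ntdumps_dir."""
    path = os.path.join(ntdumps_dir, "names.dmp")
    with open(path, encoding="utf-8") as dump:
        for line in dump:
            line = line.rstrip("\n")
            # Drop the trailing "<TAB>|" record terminator, if present.
            if line.endswith("\t|"):
                line = line[:-2]
            fields = line.split("\t|\t")
            # Field 3 is the name class; keep only scientific names.
            if len(fields) >= 4 and fields[3] == "scientific name":
                yield int(fields[0]), fields[1]
```

Because it is a generator, the function reads the file one line at a time and does not hold the whole list of names in memory.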

Test suite

The test suite is run using pytest.
