Download genome files from the NCBI FTP server.
# NCBI Genome Downloading Scripts
Some scripts to download bacterial and fungal genomes from NCBI after they restructured their FTP server a while ago.
Idea shamelessly stolen from [Mick Watson’s Kraken downloader scripts](http://www.opiniomics.org/building-a-kraken-database-with-new-ftp-structure-and-no-gi-numbers/) that can also be found in [Mick’s GitHub repo](https://github.com/mw55309/Kraken_db_install_scripts). However, Mick’s scripts are ~~written in Perl~~ specific to actually building a Kraken database (as advertised).
So this is a set of scripts that focuses on the actual genome downloading.
## Installation
` pip install ncbi-genome-download `
Alternatively, clone this repository from GitHub, then run (in a Python virtual environment): ` pip install . `
## Usage
To download all bacterial RefSeq genomes in GenBank format from NCBI, run the following: ` ncbi-genome-download bacteria `
If you’re on a reasonably fast connection, you might want to try running multiple downloads in parallel: ` ncbi-genome-download bacteria --parallel 4 `
To download all fungal genomes from the GenBank section of NCBI in GenBank format, run: ` ncbi-genome-download --section genbank fungi `
To download all viral RefSeq genomes in FASTA format, run: ` ncbi-genome-download --format fasta viral `
To download only completed bacterial RefSeq genomes in GenBank format, run: ` ncbi-genome-download --assembly-level complete bacteria `
To download bacterial RefSeq genomes of the genus _Streptomyces_, run: ` ncbi-genome-download --genus Streptomyces bacteria `

Note: This is only a simple string match on the organism name provided by NCBI.
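To see what a simple string match implies in practice, here is a minimal Python sketch. It assumes a plain case-sensitive substring test; the exact rule the tool applies is not spelled out here and may differ. The function name `matches_genus` and the sample organism names are made up for illustration.

```python
def matches_genus(organism_name, genus):
    """Return True if `genus` occurs anywhere in `organism_name`.

    Hypothetical helper: assumes a case-sensitive substring test, which
    may not be exactly what ncbi-genome-download does internally.
    """
    return genus in organism_name


# Illustrative organism names, not taken from an actual NCBI listing.
names = [
    "Streptomyces coelicolor A3(2)",
    "Escherichia coli str. K-12",
    "Streptomyces sp. example strain",
]

# Keep only the entries whose name contains the requested genus string.
selected = [name for name in names if matches_genus(name, "Streptomyces")]
print(selected)
```

A match like this involves no taxonomy lookup, so it selects any entry whose name contains the string and misses entries that are named differently.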
To get an overview of all options, run ` ncbi-genome-download --help `
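If you want to drive the downloader from a Python script, one option is to assemble the command line and hand it to `subprocess`. The sketch below is not part of this package: `build_command` and `run_download` are hypothetical helper names, and it only uses the flags shown above. It also assumes those flags can be combined freely, so check `--help` for your installed version.

```python
import subprocess


def build_command(group, section=None, file_format=None,
                  assembly_level=None, genus=None, parallel=None):
    """Assemble an ncbi-genome-download command line as an argument list.

    Hypothetical helper; only flags shown in the usage section are used.
    """
    cmd = ["ncbi-genome-download"]
    if section is not None:
        cmd += ["--section", section]
    if file_format is not None:
        cmd += ["--format", file_format]
    if assembly_level is not None:
        cmd += ["--assembly-level", assembly_level]
    if genus is not None:
        cmd += ["--genus", genus]
    if parallel is not None:
        cmd += ["--parallel", str(parallel)]
    # The taxonomic group (e.g. bacteria, fungi, viral) is the positional argument.
    cmd.append(group)
    return cmd


def run_download(group, **options):
    """Run the download; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_command(group, **options), check=True)


# Print the command instead of running it, so nothing is downloaded here.
print(" ".join(build_command("bacteria", genus="Streptomyces", parallel=4)))
```

Passing the arguments as a list rather than a single string avoids shell quoting problems. Calling `run_download("viral", file_format="fasta")` would then start the actual download, provided `ncbi-genome-download` is on your `PATH`.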
## License
All code is available under the Apache License version 2, see the [LICENSE](LICENSE) file for details.