sra-downloader

A script for batch-downloading and automatic compression of data from NCBI Sequence Read Archive. Built on SRA-Toolkit.

Features:

  • Downloading SRA runs using either accession IDs or NCBI-generated run table files
  • Organizing sequences by the project they come from
  • Detecting which runs have already been downloaded

Requirements

  • python >= 3.6
  • sra-toolkit >= 2.9.6
  • pigz

Installation

0. Docker

  1. Run docker run wwydmanski/sra-downloader -h

1. Conda (recommended, also downloads sra-toolkit)

  1. Run conda install -c bioconda -c bioinf-mcb sra-downloader

Note: if you don't specify the bioconda channel, you will get a dependency error.

2. From PyPI

  1. Run pip install sra-downloader

3. From sources

  1. Clone or download the repository into a local folder
  2. Run pip install .

Usage

usage: sra-downloader [-h] [--fname FILENAME] [--save-dir SAVE_DIRECTORY] [--uncompressed [UNCOMPRESSED]] [--cores [CORES]] [sra_id [sra_id ...]]

Download SRA data and organize them by projects

positional arguments:
  sra_id                SRA IDs to download

optional arguments:
  -h, --help            show this help message and exit
  --fname FILENAME      CSV file with list of SRAs to download. Header must include `Run` and `BioProject`.
  --save-dir SAVE_DIRECTORY
                        a directory that the files will be saved to. (default: ./downloaded)
  --uncompressed [UNCOMPRESSED]
                        if present, the files will not be compressed. (default: False)
  --cores [CORES]       Cores used for compression. (default is the number of online processors, or 8 if unknown)
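
The `--fname` option expects a CSV file whose header includes `Run` and `BioProject`. The following is a minimal sketch of how such a table could be validated and grouped by project. It is not the tool's actual implementation: the function name `group_runs_by_project` and the extra `Organism` column are hypothetical and used only for illustration.

```python
import csv
import io
from collections import defaultdict


def group_runs_by_project(csv_file):
    """Map each BioProject to the list of Run accessions found in csv_file.

    Hypothetical helper for illustration; sra-downloader may parse the
    table differently.
    """
    reader = csv.DictReader(csv_file)
    # The usage text says the header must include `Run` and `BioProject`.
    missing = {"Run", "BioProject"} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing required column(s): {sorted(missing)}")

    projects = defaultdict(list)
    for row in reader:
        projects[row["BioProject"]].append(row["Run"])
    return dict(projects)


# Illustrative table; the accessions come from the sample output in this
# README, and the `Organism` column is an assumed extra metadata column.
sample = io.StringIO(
    "Run,BioProject,Organism\n"
    "ERR1551967,PRJEB14961,example\n"
    "ERR2177760,PRJEB20463,example\n"
)
runs_by_project = group_runs_by_project(sample)
print(runs_by_project)
```

Checking the header before reading any rows means a wrongly exported table fails early instead of partway through a batch.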

Examples

sra-downloader ERR2177760 --uncompressed
docker run -v $(pwd)/downloads:/downloaded wwydmanski/sra-downloader ERR1551967
sra-downloader --fname SraRunTable.txt --save-dir ./SRAs --cores 4

Sample output

└─── save_folder
    ├── PRJEB14961
    │   ├── ERR1551967.sra_1.fastq.gz  # - compressed raw read files from SRA
    │   ├── ERR1551967.sra_2.fastq.gz 
    │   └── SraRunTable.txt            # - original SraRunTable.txt with useful metadata about samples  
    └── PRJEB20463
        ├── ERR2177760.sra_1.fastq.gz
        ├── ERR2177760.sra_2.fastq.gz 
        ├── absent.txt                 # - entries that were inaccessible for various reasons
        └── SraRunTable.txt
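
Given the layout above, one way to check which runs are already present is to scan the save directory for FASTQ files and take the accession from the start of each file name. This is a sketch under that assumption, not the detection logic the tool itself uses; `downloaded_runs` is a hypothetical helper name.

```python
import tempfile
from pathlib import Path


def downloaded_runs(save_dir):
    """Return the set of run accessions that have FASTQ files under save_dir.

    Assumes the <save_dir>/<BioProject>/<Run>.sra_N.fastq[.gz] layout shown
    in the sample output; hypothetical helper, not part of sra-downloader.
    """
    found = set()
    for path in Path(save_dir).glob("*/*.fastq*"):
        # "ERR1551967.sra_1.fastq.gz" -> "ERR1551967"
        found.add(path.name.split(".", 1)[0])
    return found


# Recreate a small part of the sample tree in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "PRJEB14961"
    project.mkdir()
    (project / "ERR1551967.sra_1.fastq.gz").touch()
    (project / "ERR1551967.sra_2.fastq.gz").touch()
    (project / "SraRunTable.txt").touch()

    done = downloaded_runs(tmp)
    wanted = ["ERR1551967", "ERR2177760"]
    pending = [run for run in wanted if run not in done]

print(pending)
```

The glob pattern also matches uncompressed `.fastq` files, so the same check applies to downloads made with `--uncompressed`.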
