AsperaSRAgetter

AsperaSRAgetter provides an easy way to download sequencing data (fastq.gz format) from the European Nucleotide Archive (ENA) using Aspera.

Installation

AsperaSRAgetter is distributed on PyPI and can be installed with pip. It depends on Aspera-CLI to retrieve sequencing data from ENA; installing Aspera-CLI with Conda is recommended.

# You may create a new environment for AsperaSRAgetter, but this is optional
conda create -n AsperaSRAgetter python=3.10
conda activate AsperaSRAgetter

# Install AsperaSRAgetter using pip
pip install AsperaSRAgetter

# Install Aspera-CLI using conda
conda install -c hcc aspera-cli

Workflow

AsperaSRAgetter first queries the ENA filereport API for the report on the fastq.gz files that belong to an accession. Second, it resolves the MD5 hash value and FTP URL of each fastq.gz file from the report. Finally, it passes each FTP URL to the Aspera transfer command ascp to download the file.

The file reports are stored as a .tsv table that serves as a record of the download process.
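The three steps above can be sketched roughly as follows. This is an illustration, not AsperaSRAgetter's actual code: the function names, the sample report (with a made-up run `RUN0001` and placeholder hashes), and the choice of report fields are assumptions. The `ascp` options and the `era-fasp@fasp.sra.ebi.ac.uk` source are the ones ENA suggests in its own documentation, and the tool may use different ones.

```python
import csv
import io
from urllib.parse import urlencode

ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"

# Hypothetical report for illustration only: the run and both hashes are made up.
SAMPLE_REPORT = "\n".join([
    "\t".join(["run_accession", "fastq_ftp", "fastq_md5"]),
    "\t".join([
        "RUN0001",
        "ftp.sra.ebi.ac.uk/vol1/fastq/RUN/RUN0001/RUN0001_1.fastq.gz;"
        "ftp.sra.ebi.ac.uk/vol1/fastq/RUN/RUN0001/RUN0001_2.fastq.gz",
        "a" * 32 + ";" + "b" * 32,
    ]),
])


def filereport_url(accession):
    """Step 1: build the filereport query URL for one accession."""
    params = {
        "accession": accession,
        "result": "read_run",
        "fields": "run_accession,fastq_ftp,fastq_md5",
        "format": "tsv",
    }
    return ENA_FILEREPORT + "?" + urlencode(params)


def parse_report(tsv_text):
    """Step 2: return (ftp_url, md5) pairs; ENA joins multiple files with ';'."""
    pairs = []
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        urls = [u for u in row["fastq_ftp"].split(";") if u]
        md5s = [m for m in row["fastq_md5"].split(";") if m]
        pairs.extend(zip(urls, md5s))
    return pairs


def ascp_command(ftp_url, ssh_key, outdir):
    """Step 3: turn an FTP URL into an ascp argument list (assumed options)."""
    without_scheme = ftp_url.removeprefix("ftp://")
    _host, _, path = without_scheme.partition("/")
    source = "era-fasp@fasp.sra.ebi.ac.uk:" + path
    return ["ascp", "-QT", "-l", "300m", "-P", "33001", "-i", ssh_key, source, outdir]


if __name__ == "__main__":
    print(filereport_url("RUN0001"))
    for url, md5 in parse_report(SAMPLE_REPORT):
        print(md5, " ".join(ascp_command(url, "asperaweb_id_dsa.openssh", "outdir")))
```

Building the command as a list rather than a single string keeps paths with spaces intact when the list is later handed to `subprocess.run`.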

The MD5 hash values of all files are saved in a .md5 file, which users can use to verify the integrity of the downloaded files.
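A minimal sketch of such an integrity check is shown below. It assumes the .md5 file uses the common `md5sum` layout (one `<hash>  <filename>` pair per line); the layout AsperaSRAgetter actually writes is not documented here, so adjust the parsing if it differs. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large fastq.gz files are not loaded into memory."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(md5_file, data_dir):
    """Map each listed file name to True if its MD5 matches the recorded hash.

    Assumes md5sum-style lines: '<hash>  <filename>'.
    """
    results = {}
    for line in Path(md5_file).read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.strip().lstrip("*")  # md5sum marks binary mode with '*'
        target = Path(data_dir) / Path(name).name
        results[name] = target.is_file() and md5_of(target) == expected.lower()
    return results
```

A file that is missing from `data_dir` is reported as `False` rather than raising, so one failed download does not hide the status of the others.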

Usage

The command name of AsperaSRAgetter is sragetter. It accepts either a single SRA accession or a TXT file containing multiple accessions (see the usage example below). Note that users must provide the path to the Aspera-CLI public key authentication file (normally ENVIRONMENT_PATH/etc/asperaweb_id_dsa.openssh).

usage: sragetter [-h] [-v] [-acc ACCESSION | -f FILE] -ssh SSH_KEY -o OUTDIR

options:
  -h, --help            show this help message and exit
  -v, --version         Show SRAdownloader version number and exit
  -acc ACCESSION, --accession ACCESSION
                        SRA data accession
  -f FILE, --file FILE  TXT file with multiple SRA accessions
  -ssh SSH_KEY, --ssh-key SSH_KEY
                        Public key authentication file provided by Aspera command line client download package as the 'asperaweb_id_dsa.openssh' file
  -o OUTDIR, --outdir OUTDIR
                        Path to store the downloaded SRA data

Usage
-----------------
Download with one accession:
    $ sragetter --accession sra_accession --ssh-key sshkey_path.openssh --outdir outdir_path

Download with TXT file containing multiple accessions:
    $ sragetter --file sra_accessions.txt --ssh-key sshkey_path.openssh --outdir outdir_path
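When sragetter is driven from a script, the key path and the command line can be assembled programmatically. The sketch below is a hypothetical wrapper, not part of AsperaSRAgetter: it assumes Aspera-CLI was installed with Conda into the active environment, so that `CONDA_PREFIX` (or, failing that, `sys.prefix`) plays the role of ENVIRONMENT_PATH. It uses only the options listed in the help text above.

```python
import os
import sys
from pathlib import Path


def default_ssh_key(env_prefix=None):
    """Guess the key location as ENVIRONMENT_PATH/etc/asperaweb_id_dsa.openssh."""
    prefix = env_prefix or os.environ.get("CONDA_PREFIX") or sys.prefix
    return Path(prefix) / "etc" / "asperaweb_id_dsa.openssh"


def sragetter_command(outdir, ssh_key, accession=None, accession_file=None):
    """Build the argument list; -acc and -f are mutually exclusive."""
    if (accession is None) == (accession_file is None):
        raise ValueError("give exactly one of accession or accession_file")
    cmd = ["sragetter"]
    if accession is not None:
        cmd += ["--accession", accession]
    else:
        cmd += ["--file", str(accession_file)]
    cmd += ["--ssh-key", str(ssh_key), "--outdir", str(outdir)]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; checking `default_ssh_key().is_file()` beforehand gives a clearer error than a failed transfer if the key is stored elsewhere.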

Contact

If you have any questions about using AsperaSRAgetter, feel free to open an issue or contact me at jirunjia@gmail.com.
