
Fast metadata and fastq retrieval with ffq and aria2c


ngsfetch


A utility to retrieve fastq files with ffq and aria2. ffq looks up the metadata for an accession, and aria2 downloads the corresponding files in parallel, which makes large datasets quick to fetch. ngsfetch can retrieve fastq files from public repositories, including:

  • GEO: Gene Expression Omnibus,
  • SRA: Sequence Read Archive,
  • EMBL-EBI: European Molecular Biology Laboratory’s European Bioinformatics Institute.

Key features:

  • Fast: Uses aria2 to download files in parallel, which can significantly speed up the download process.
  • Integrity: Verifies the integrity of downloaded files using md5sum to ensure that the files are not corrupted during the download process.
  • Retry Mechanism: Automatically attempts to re-download files if the initial download fails, ensuring successful retrieval of data.
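
The integrity check and retry behaviour listed above can be illustrated with a small sketch. It is not ngsfetch's actual implementation: the names `md5_of_file` and `fetch_with_retry` and the `download` callable are hypothetical, and the checksum is computed with Python's `hashlib` instead of the `md5sum` command.

```python
import hashlib
from pathlib import Path
from typing import Callable


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_with_retry(
    download: Callable[[Path], None],  # hypothetical: writes the file to the given path
    dest: Path,
    expected_md5: str,
    attempts: int = 3,  # assumed value, for illustration only
) -> bool:
    """Call download() until the file's MD5 matches or attempts run out."""
    for _ in range(attempts):
        try:
            download(dest)
        except OSError:
            continue  # failed download: try again
        if dest.exists() and md5_of_file(dest) == expected_md5:
            return True
        dest.unlink(missing_ok=True)  # discard the corrupted file before retrying
    return False
```

Reading in chunks keeps memory use flat for multi-gigabyte fastq files, and deleting a mismatched file ensures a later run never mistakes it for a finished download.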

Quick start

# Fetch fastq files of GSE52856
ngsfetch -i GSE52856 -o /path/to/output/GSE52856 -p 16

# Fetch fastq files of SRP175008
ngsfetch -i SRP175008 -o /path/to/output/SRP175008 -p 16

# Fetch fastq files of ERP126666
ngsfetch -i ERP126666 -o /path/to/output/ERP126666 -p 16
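
To fetch several accessions in one go, the commands above can be generated from a list. The following is a minimal sketch that uses only the `-i`, `-o` and `-p` options shown above; the helper names `build_command` and `fetch_all` are hypothetical, and `dry_run=True` only prints the commands instead of running them.

```python
import subprocess
from pathlib import Path
from typing import Iterable, List


def build_command(accession: str, out_root: Path, processes: int = 16) -> List[str]:
    """Build the ngsfetch argument list for one accession."""
    return [
        "ngsfetch",
        "-i", accession,
        "-o", str(out_root / accession),  # one sub-directory per accession
        "-p", str(processes),
    ]


def fetch_all(
    accessions: Iterable[str],
    out_root: Path,
    processes: int = 16,
    dry_run: bool = True,
) -> List[List[str]]:
    """Print (dry run) or execute one ngsfetch command per accession."""
    commands = [build_command(acc, out_root, processes) for acc in accessions]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True raises CalledProcessError if ngsfetch exits non-zero
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    fetch_all(["GSE52856", "SRP175008", "ERP126666"], Path("/path/to/output"))
```

Running the accessions one after another keeps each download's parallelism (`-p`) from competing with the others for bandwidth.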

How to install

pip

pip install ngsfetch

or

git clone https://github.com/NaotoKubota/ngsfetch.git
cd ngsfetch
pip install .

conda

conda create -n ngsfetch python=3.9
conda activate ngsfetch
conda install -c bioconda ngsfetch

Docker

docker pull naotokubota/ngsfetch

Dependencies

Operating system

  • Linux (i.e. where the md5sum command is available)

Python packages

  • python (>=3.9)
  • ffq (>=0.3.1)
  • aria2 (>=0.0.1b0)
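
Before a long download, it can help to confirm that these requirements are met. The sketch below is a hypothetical preflight check and not part of ngsfetch; it assumes that `md5sum` and `aria2c` are invoked as commands found on `PATH`, and that ffq is installed under the distribution name `ffq`.

```python
import shutil
import sys
from importlib import metadata
from typing import Dict, Optional, Union


def preflight() -> Dict[str, Union[bool, Optional[str]]]:
    """Report whether the listed dependencies appear to be available."""
    report: Dict[str, Union[bool, Optional[str]]] = {
        "python>=3.9": sys.version_info >= (3, 9),
        # Assumption: both tools are looked up on PATH at run time
        "md5sum": shutil.which("md5sum") is not None,
        "aria2c": shutil.which("aria2c") is not None,
    }
    try:
        report["ffq"] = metadata.version("ffq")  # installed version string
    except metadata.PackageNotFoundError:
        report["ffq"] = None  # ffq is not installed in this environment
    return report


if __name__ == "__main__":
    for name, status in preflight().items():
        print(f"{name}: {status}")
```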

Usage

usage: ngsfetch [-h] [-i ID] [-o OUTPUT] [-p PROCESSES] [--attempts ATTEMPTS] [-v]

ngsfetch v0.1.0 - fast retrieval of metadata and fastq files with ffq and aria2c

optional arguments:
  -h, --help            show this help message and exit
  -i ID, --id ID        ID of the data to fetch
  -o OUTPUT, --output OUTPUT
                        Output directory
  -p PROCESSES, --processes PROCESSES
                        Number of processes to use (up to 16)
  --attempts ATTEMPTS   Number of attempts to fetch metadata and fastq files
  -v, --verbose         Increase verbosity
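
For readers wrapping ngsfetch in their own scripts, the interface above can be mirrored with `argparse`. This is a sketch reconstructed from the help text only, not the tool's source code. No defaults are set because the help text does not state them, and rejecting values above 16 is an assumption about how the "up to 16" limit might be enforced.

```python
import argparse

MAX_PROCESSES = 16  # upper bound quoted in the help text


def bounded_processes(value: str) -> int:
    """Parse the process count and reject values outside 1..MAX_PROCESSES."""
    n = int(value)
    if not 1 <= n <= MAX_PROCESSES:
        raise argparse.ArgumentTypeError(
            f"processes must be between 1 and {MAX_PROCESSES}"
        )
    return n


def make_parser() -> argparse.ArgumentParser:
    """Build a parser with the options listed in the usage message."""
    parser = argparse.ArgumentParser(
        prog="ngsfetch",
        description="fast retrieval of metadata and fastq files with ffq and aria2c",
    )
    parser.add_argument("-i", "--id", help="ID of the data to fetch")
    parser.add_argument("-o", "--output", help="Output directory")
    parser.add_argument(
        "-p", "--processes", type=bounded_processes,
        help="Number of processes to use (up to 16)",
    )
    parser.add_argument(
        "--attempts", type=int,
        help="Number of attempts to fetch metadata and fastq files",
    )
    parser.add_argument(
        "-v", "--verbose", action="store_true", help="Increase verbosity"
    )
    return parser
```

Validating the process count at parse time surfaces an out-of-range value immediately, before any metadata lookup or download has started.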

Contributing

Thank you for wanting to improve ngsfetch! If you find a bug or have a question, feel free to open an issue or a pull request.
