srautils

srautils is a program for downloading and dumping raw sequence data from the NCBI Sequence Read Archive (SRA). It provides a fast and easy way to fetch SRA data and convert SRA files into FASTQ/FASTA sequence data for scientific research.

1. Requirement

2. Install

The latest release can be installed from PyPI:

pip3 install srautils -U

The development version (recommended) can be installed with:

pip3 install git+https://github.com/yodeng/srautils.git

3. Usage

srautils provides two sub-commands: srautils fetch and srautils dump.

3.1 srautils fetch

The fetch command downloads an SRA file given only an SRA accession ID. It is a fast, interruptible download accelerator. All original SRA files are obtained directly from the AWS cloud with unsigned access. The tool splits the whole download into many pieces and records the progress of each chunk in a *.ht binary file, which can significantly speed up the download. If the download is interrupted, it resumes automatically by loading the progress file. The command help is as follows:

$ srautils fetch -h 
usage: srautils fetch [-h] -i <str> [-o <str>] [-n <int>] [-s <str>]

optional arguments:
  -h, --help            show this help message and exit
  -i <str>, --id <str>  input sra-id, SRR/ERR/DRR allowed, required
  -o <str>, --outdir <str>
                        output sra directory, current dir by default
  -n <int>, --num <int>
                        the max number of concurrency, default: auto
  -s <str>, --max-speed <str>
                        specify maximum speed per second, case-insensitive unit support (K[b], M[b]...), no-limited by default

options          descriptions
-h/--help        show the help message and exit
-i/--id          a valid SRA accession ID (SRR/ERR/DRR), required
-o/--outdir      output directory, current directory by default
-n/--num         maximum concurrency, auto-detected from the SRA file size by default
-s/--max-speed   maximum download speed per second; units are case-insensitive (K[b], M[b], ...); unlimited by default
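
The chunked, resumable download described above can be illustrated with a minimal sketch. It is not the srautils implementation: the function names are hypothetical, the fixed chunk size is an assumption, and the set of finished chunk indexes stands in for the *.ht progress file, whose format is not documented here. It only shows how a file of known size might be split into inclusive byte ranges, how the unfinished ranges could be selected after an interruption, and which HTTP Range header each range maps to.

```python
from typing import Dict, List, Set, Tuple

ByteRange = Tuple[int, int]  # inclusive (start, end) byte offsets


def split_ranges(total_size: int, chunk_size: int) -> List[ByteRange]:
    """Split a file of total_size bytes into inclusive ranges of at most chunk_size bytes."""
    if total_size < 0 or chunk_size <= 0:
        raise ValueError("total_size must be >= 0 and chunk_size must be > 0")
    return [
        (start, min(start + chunk_size, total_size) - 1)
        for start in range(0, total_size, chunk_size)
    ]


def pending_ranges(ranges: List[ByteRange], done: Set[int]) -> List[ByteRange]:
    """Return the ranges whose index is not yet recorded as finished.

    `done` is a stand-in for a progress file (assumption: one entry per chunk).
    """
    return [r for i, r in enumerate(ranges) if i not in done]


def range_header(byte_range: ByteRange) -> Dict[str, str]:
    """Build the HTTP Range header that requests exactly this byte range."""
    start, end = byte_range
    return {"Range": f"bytes={start}-{end}"}


if __name__ == "__main__":
    ranges = split_ranges(total_size=1000, chunk_size=300)
    # Pretend chunks 0 and 2 finished before an interruption.
    for r in pending_ranges(ranges, done={0, 2}):
        print(r, range_header(r))
```

Because each range is independent, the pending ranges can be requested concurrently, and recording a chunk index once its bytes are written is enough to resume later.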

3.2 srautils dump

The dump command is a parallel fastq-dump wrapper that dumps an SRA file and outputs the raw FASTQ/FASTA sequence data.

NCBI fastq-dump is very slow, even on machines with ample resources (network, IO, CPU). This tool speeds up the process by dividing the work into multiple jobs and running all chunked jobs in parallel, either on localhost or on an SGE cluster (the default). After the chunk jobs finish, all results are concatenated together. The command usage is shown below:

$ srautils dump -h 
usage: srautils dump [-h] -i <file> [-o <dir>] [-p <int>] [-q [<str> ...]] [-l <file>] [--no-gzip] [--fasta] [--local]

optional arguments:
  -h, --help            show this help message and exit
  -i <file>, --input <file>
                        input sra file, required
  -o <dir>, --outdir <dir>
                        output directory, current dir by default
  -p <int>, --processes <int>
                        number of dumps processors, 10 by default
  -q [<str> ...], --queue [<str> ...]
                        sge queue, multi-queue can be sepreated by whitespace, all.q by default
  -l <file>, --log <file>
                        append srautils log info to file, stdout by default
  --no-gzip             do not compress output
  --fasta               fasta only
  --local               run sra-dumps in localhost instead of sge

options          descriptions
-h/--help        show the help message and exit
-i/--input       input SRA file, required
-o/--outdir      output directory, current directory by default
-p/--processes   number of chunks to divide the work into, 10 by default
-q/--queue       SGE queue(s) in which to run the chunked jobs, all.q by default
-l/--log         file to append the log to, stdout by default
--no-gzip        do not gzip the output; output is gzipped by default
--fasta          output FASTA instead of FASTQ
--local          run all chunked jobs on localhost instead of the SGE cluster
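
The idea of dividing a dump into chunked jobs can be sketched as follows. This is an illustration under stated assumptions, not the srautils implementation: the function names, the per-chunk output directories, and the assumption that chunks are contiguous spot ranges are all hypothetical. The only external facts relied on are the fastq-dump options -N (--minSpotId), -X (--maxSpotId) and -O (--outdir).

```python
from typing import List, Tuple


def split_spots(total_spots: int, n_chunks: int) -> List[Tuple[int, int]]:
    """Split spot ids 1..total_spots into at most n_chunks contiguous inclusive ranges."""
    if total_spots < 0 or n_chunks <= 0:
        raise ValueError("total_spots must be >= 0 and n_chunks must be > 0")
    base, extra = divmod(total_spots, n_chunks)
    ranges = []
    start = 1
    for i in range(n_chunks):
        # The first `extra` chunks take one additional spot each.
        size = base + (1 if i < extra else 0)
        if size == 0:
            break  # fewer spots than chunks: stop at the first empty chunk
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def chunk_command(sra_file: str, first: int, last: int, outdir: str) -> List[str]:
    """Build a fastq-dump command line for one spot range (hypothetical helper)."""
    return ["fastq-dump", "-N", str(first), "-X", str(last), "-O", outdir, sra_file]


if __name__ == "__main__":
    # Example values only; the file name and spot count are made up.
    for i, (first, last) in enumerate(split_spots(total_spots=10, n_chunks=3)):
        print(" ".join(chunk_command("example.sra", first, last, f"out/chunk_{i}")))
```

Writing each chunk to its own directory keeps the jobs from overwriting one another; concatenating the per-chunk files in chunk order then restores the original read order.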

4. License

srautils is distributed under the MIT licence.

5. Reference

