srautils
srautils is a program for downloading and dumping raw fastq data from the NCBI SRA archive. It provides a fast and easy way to fetch SRA data and convert SRA files into fastq/fasta sequence data for scientific research.
1. Requirement
- Linux64
- python >=3.8
- sratoolkit
2. Install
The latest release can be installed from PyPI:
pip3 install srautils -U
The development version (recommended) can be installed with:
pip3 install git+https://github.com/yodeng/srautils.git
3. Usage
srautils provides two sub-commands: srautils fetch and srautils dump.
3.1 srautils fetch
The fetch command downloads an SRA file given only an SRA accession id. It is a fast, interruptible download accelerator. All original SRA files are obtained directly from AWS Cloud with unsigned access. The tool splits the whole download into many pieces and records the progress of each chunk in a *.ht binary file, which can significantly speed up the download. If the download is interrupted, it resumes automatically by loading the progress file. The command help is as follows:
$ srautils fetch -h
usage: srautils fetch [-h] -i <str> [-o <str>] [-n <int>] [-s <str>]
optional arguments:
-h, --help show this help message and exit
-i <str>, --id <str> input sra-id, SRR/ERR/DRR allowed, required
-o <str>, --outdir <str>
output sra directory, current dir by default
-n <int>, --num <int>
the max number of concurrency, default: auto
-s <str>, --max-speed <str>
specify maximum speed per second, case-insensitive unit support (K[b], M[b]...), no-limited by default
| options | descriptions |
|---|---|
| -h/--help | show this help message and exit |
| -i/--id | a valid SRA accession id (SRR/ERR/DRR), required |
| -o/--outdir | output directory, current directory by default |
| -n/--num | maximum concurrency, auto-detected from the SRA file size by default |
| -s/--max-speed | maximum download speed per second, with case-insensitive units (K[b], M[b]...), unlimited by default |
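The chunked, resumable download described in this section can be illustrated with a minimal Python sketch. It is not the srautils implementation: the function names (`split_ranges`, `range_header`, `remaining`) are hypothetical, the layout of the *.ht progress file is not documented here, and progress is modelled as a plain dict instead. The sketch only assumes that the server honours standard HTTP `Range` requests.

```python
def split_ranges(total_size, num_chunks):
    """Split [0, total_size) into contiguous, inclusive byte ranges."""
    if total_size <= 0 or num_chunks <= 0:
        raise ValueError("total_size and num_chunks must be positive")
    num_chunks = min(num_chunks, total_size)
    base, extra = divmod(total_size, num_chunks)
    ranges, start = [], 0
    for i in range(num_chunks):
        # The first `extra` chunks take one additional byte each.
        end = start + base + (1 if i < extra else 0) - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def range_header(start, end):
    """Build the HTTP header that requests one chunk."""
    return {"Range": f"bytes={start}-{end}"}


def remaining(ranges, done):
    """Return the byte ranges still to fetch after an interruption.

    `done` maps a chunk index to the number of bytes already written
    for that chunk (a stand-in for the on-disk progress record).
    """
    todo = []
    for i, (start, end) in enumerate(ranges):
        resume_at = start + done.get(i, 0)
        if resume_at <= end:
            todo.append((resume_at, end))
    return todo
```

Each range can then be requested by a separate worker, and on restart only the ranges returned by `remaining` need to be requested again.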
3.2 srautils dump
The dump command is a parallel fastq-dump wrapper that dumps an SRA file and outputs the raw fastq/fasta sequence data.
NCBI fastq-dump is very slow, even with ample machine resources (network, IO, CPU). This tool speeds up the process by dividing the work into multiple jobs and running all chunked jobs in parallel, either on localhost or in an SGE cluster environment (the default). After the chunk jobs finish, all results are concatenated together. The command usage is shown below:
$ srautils dump -h
usage: srautils dump [-h] -i <file> [-o <dir>] [-p <int>] [-q [<str> ...]] [-l <file>] [--no-gzip] [--fasta] [--local]
optional arguments:
-h, --help show this help message and exit
-i <file>, --input <file>
input sra file, required
-o <dir>, --outdir <dir>
output directory, current dir by default
-p <int>, --processes <int>
number of dumps processors, 10 by default
-q [<str> ...], --queue [<str> ...]
sge queue, multi-queue can be sepreated by whitespace, all.q by default
-l <file>, --log <file>
append srautils log info to file, stdout by default
--no-gzip do not compress output
--fasta fasta only
--local run sra-dumps in localhost instead of sge
| options | descriptions |
|---|---|
| -h/--help | show this help message and exit |
| -i/--input | input sra file, required |
| -o/--outdir | output directory, current directory by default |
| -p/--processes | number of chunks to divide the work into, 10 by default |
| -q/--queue | SGE queue(s) for the chunked jobs, separated by whitespace, all.q by default |
| -l/--log | log file, stdout by default |
| --no-gzip | do not gzip the output; output is gzipped by default |
| --fasta | output fasta instead of fastq |
| --local | run all chunked jobs on localhost instead of an SGE cluster |
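The idea of dividing one dump into chunked jobs can be sketched in Python as follows. This is an illustration under stated assumptions, not the code srautils runs: the helper names (`spot_ranges`, `chunk_commands`) and the per-chunk output directory layout are hypothetical, and the total spot count is assumed to be known in advance. The only fastq-dump options used are `-N`/`-X` (first and last spot id) and `-O` (output directory).

```python
def spot_ranges(total_spots, num_chunks):
    """Split spot ids 1..total_spots into contiguous, inclusive ranges."""
    if total_spots < 1 or num_chunks < 1:
        raise ValueError("total_spots and num_chunks must be >= 1")
    num_chunks = min(num_chunks, total_spots)
    base, extra = divmod(total_spots, num_chunks)
    ranges, first = [], 1
    for i in range(num_chunks):
        # The first `extra` chunks take one additional spot each.
        last = first + base + (1 if i < extra else 0) - 1
        ranges.append((first, last))
        first = last + 1
    return ranges


def chunk_commands(sra_file, total_spots, num_chunks=10, outdir="."):
    """Build one fastq-dump argument list per chunk.

    Each chunk writes to its own directory (hypothetical layout) so
    that parallel jobs do not overwrite each other's output files.
    """
    cmds = []
    for i, (first, last) in enumerate(spot_ranges(total_spots, num_chunks)):
        chunk_dir = f"{outdir}/chunk_{i:03d}"
        cmds.append(["fastq-dump", "-N", str(first), "-X", str(last),
                     "-O", chunk_dir, sra_file])
    return cmds
```

The resulting argument lists could be handed to local worker processes or written into cluster job scripts; concatenating the per-chunk outputs in chunk order then restores the original read order.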
4. License
srautils is distributed under the MIT licence.