download_giab
Utility Python package to download Genome-in-a-Bottle (GIAB) data from their index files.
This requires Python 3.6 or later.
To install, run the following:
pip install download_giab
If you're installing on a cluster, this might be more like:
pip install --user download_giab
To use, run something like the following:
download_giab https://raw.githubusercontent.com/genome-in-a-bottle/giab_data_indexes/master/AshkenazimTrio/sequence.index.AJtrio_Illumina300X_wgs_07292015.HG002
This will download everything in the linked index to the directory the utility is run from. It can also download from local index files.
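As a rough illustration of what "everything in the linked index" means, the sketch below pulls every URL-looking field out of a tab-separated index. It is not the package's own parser: the function name `extract_urls`, the sample column names, and the example URLs are all assumptions made for illustration.

```python
import csv
import io


def extract_urls(index_text):
    """Return every URL-looking field from a tab-separated index, in order."""
    urls = []
    reader = csv.reader(io.StringIO(index_text), delimiter="\t")
    for row in reader:
        for field in row:
            field = field.strip()
            # Header cells and checksums are skipped because they lack a URL scheme.
            if field.startswith(("ftp://", "http://", "https://")):
                urls.append(field)
    return urls


# Hypothetical one-record index; the column names and URLs are made up.
sample_index = (
    "FASTQ\tFASTQ_MD5\tPAIRED_FASTQ\tPAIRED_FASTQ_MD5\n"
    "ftp://example.org/a_R1.fastq.gz\tabc\tftp://example.org/a_R2.fastq.gz\tdef\n"
)
print(extract_urls(sample_index))
```

Matching on the URL scheme rather than on column names keeps the sketch independent of the exact header layout, which may differ between index files.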
If you want to download lots of data and not have the program hang up upon session disconnect, you can use nohup and &:
nohup download_giab https://raw.githubusercontent.com/genome-in-a-bottle/giab_data_indexes/master/AshkenazimTrio/sequence.index.AJtrio_Illumina300X_wgs_07292015.HG002 &
If you are downloading paired-end reads and want to concatenate all FASTQ files into two files, you can use the --cat-paired flag. This will generate two files per sample: [sample]_1.fastq.gz and [sample]_2.fastq.gz. If a sample ID is not present, the literal text paired will be used.
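Gzipped FASTQ files can be merged without recompressing, because the gzip format allows several compressed members to sit back to back in one file. The sketch below shows that general technique. Whether --cat-paired works this way internally is an assumption, and the helper name `concatenate_gzip` and the file names are hypothetical.

```python
import gzip
import os
import shutil
import tempfile


def concatenate_gzip(parts, destination):
    """Append each gzip file to destination byte-for-byte, in the given order."""
    with open(destination, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)


# Demonstration with two tiny, made-up FASTQ fragments.
workdir = tempfile.mkdtemp()
part_a = os.path.join(workdir, "a_1.fastq.gz")
part_b = os.path.join(workdir, "b_1.fastq.gz")
with gzip.open(part_a, "wb") as fh:
    fh.write(b"@r1\nACGT\n+\nIIII\n")
with gzip.open(part_b, "wb") as fh:
    fh.write(b"@r2\nTTGA\n+\nIIII\n")

combined = os.path.join(workdir, "sample_1.fastq.gz")
concatenate_gzip([part_a, part_b], combined)

# Reading the result back decompresses both members as one stream.
with gzip.open(combined, "rb") as fh:
    combined_bytes = fh.read()
print(combined_bytes.decode())
```

The same order of parts has to be used for the _1 and _2 outputs so that mates stay aligned line for line.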
Note that the concatenated files will not work with some tools (e.g. bwa mem) if the FASTQ files in a pair-set are of different lengths.
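One way to catch this before running an aligner is to compare record counts in the two files. The sketch below assumes that "different lengths" refers to the number of reads and that each record spans exactly four lines. The function names and file names are hypothetical and not part of download_giab.

```python
import gzip
import os
import tempfile


def count_fastq_records(path):
    """Count records in a gzipped FASTQ, assuming four lines per record."""
    with gzip.open(path, "rb") as fh:
        line_count = sum(1 for _ in fh)
    return line_count // 4


def pair_is_balanced(path_1, path_2):
    """Return True when both files hold the same number of records."""
    return count_fastq_records(path_1) == count_fastq_records(path_2)


# Made-up example: two reads in the first file, one in the second.
workdir = tempfile.mkdtemp()
reads_1 = os.path.join(workdir, "paired_1.fastq.gz")
reads_2 = os.path.join(workdir, "paired_2.fastq.gz")
with gzip.open(reads_1, "wb") as fh:
    fh.write(b"@r1/1\nACGT\n+\nIIII\n@r2/1\nGGCA\n+\nIIII\n")
with gzip.open(reads_2, "wb") as fh:
    fh.write(b"@r1/2\nTTGA\n+\nIIII\n")

print(pair_is_balanced(reads_1, reads_2))
```

For large files this reads every line once, so it is best run a single time after the download finishes.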
If instead you want to store the read pairs along with a suggested common name, use the --store-paired-names flag. This will write to a file called paired_names.txt.
To filter which files are downloaded, the --filter flag can be provided with a case-insensitive string or regular expression (in Python syntax).
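To preview what a pattern would select, a case-insensitive Python regular expression can be tried against a list of names first. This is a minimal sketch under assumptions: the helper `filter_names` is hypothetical, the file names are made up, and whether the tool matches against the full URL or only part of it, and with search or full-match semantics, is not stated here.

```python
import re


def filter_names(names, pattern):
    """Keep names where the pattern matches anywhere, ignoring case."""
    matcher = re.compile(pattern, re.IGNORECASE)
    return [name for name in names if matcher.search(name)]


# Made-up file names for illustration.
names = [
    "HG002_L001_R1.fastq.gz",
    "HG002_L002_R1.fastq.gz",
    "HG003_L001_R1.fastq.gz",
]
print(filter_names(names, r"hg002.*l00[12]"))
print(filter_names(names, "l001"))
```

A plain string such as l001 is itself a valid regular expression, so both forms go through the same code path in this sketch.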