download_giab

Utility Python package to download Genome-in-a-Bottle (GIAB) data from their index files.

This requires Python 3.6 or later.

To install, run the following:

pip install download_giab

If you're installing on a shared cluster where you cannot write to the system-wide Python environment, install into your user directory instead:

pip install --user download_giab

To use, run something like the following:

download_giab https://raw.githubusercontent.com/genome-in-a-bottle/giab_data_indexes/master/AshkenazimTrio/sequence.index.AJtrio_Illumina300X_wgs_07292015.HG002

This downloads every file listed in the linked index into the directory the utility is run from. The utility can also read local index files.
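To make the idea of "everything in the linked index" concrete, here is a minimal sketch of reading an index and listing the URLs it points to. It is not the package's own code. It assumes the index is a tab-separated table with a header row containing `FASTQ` and `PAIRED_FASTQ` columns; other GIAB indexes may use different column names, and `list_index_urls` is a hypothetical helper.

```python
import csv
import io


def list_index_urls(index_text, columns=("FASTQ", "PAIRED_FASTQ")):
    """Return every non-empty URL found in the given columns of an index.

    Assumption: the index is tab-separated with a header row, and the
    column names passed in `columns` exist in that header.
    """
    reader = csv.DictReader(io.StringIO(index_text), delimiter="\t")
    urls = []
    for row in reader:
        for column in columns:
            value = (row.get(column) or "").strip()
            if value:
                urls.append(value)
    return urls
```

Calling `list_index_urls(open("my.index").read())` on a local index (the file name is illustrative) would return the URLs in row order, first mate before second mate.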

If you want to download a lot of data and keep the program running after your session disconnects, you can use nohup and &:

nohup download_giab https://raw.githubusercontent.com/genome-in-a-bottle/giab_data_indexes/master/AshkenazimTrio/sequence.index.AJtrio_Illumina300X_wgs_07292015.HG002 &
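If the download is started from a Python script rather than a shell, a rough analogue of nohup and & is to start the command in its own session with output redirected to a log file. This is a sketch under assumptions: `launch_detached` is a hypothetical helper, the log path is arbitrary, and `start_new_session` only has this effect on POSIX systems.

```python
import subprocess


def launch_detached(command, log_path):
    """Start `command` in a new session so a terminal hangup does not reach it.

    Output (stdout and stderr) is appended to `log_path`, much like nohup.out.
    """
    log = open(log_path, "ab")
    try:
        process = subprocess.Popen(
            command,
            stdin=subprocess.DEVNULL,
            stdout=log,
            stderr=subprocess.STDOUT,
            start_new_session=True,  # POSIX: run the child in its own session
        )
    finally:
        log.close()  # the child holds its own copy of the file descriptor
    return process


# Example call (not executed here; the index URL is a placeholder):
# launch_detached(["download_giab", "<index URL>"], "download_giab.log")
```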

If you are downloading paired-end reads and want to concatenate all FASTQ files into two files, you can use the --cat-paired flag. This will generate two files per sample: [sample]_1.fastq.gz and [sample]_2.fastq.gz. If a sample ID is not present, the literal text paired will be used.

Note that the concatenated files will not work with some tools (e.g. bwa mem) if the FASTQ files in a pair set are of different lengths.
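The following sketch illustrates the concatenation idea and one way to check a pair before passing it to an aligner. It is not how download_giab is implemented. It relies on the fact that gzip files appended byte-for-byte form a valid multi-member gzip file, and it assumes "length" means the number of four-line FASTQ records; all function names are hypothetical.

```python
import gzip
import shutil


def concatenate_gzip(parts, destination):
    """Append gzip files byte-for-byte into one multi-member gzip file."""
    with open(destination, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)


def count_reads(fastq_gz):
    """Count FASTQ records, assuming the usual four lines per record."""
    with gzip.open(fastq_gz, "rt") as handle:
        return sum(1 for _ in handle) // 4


def pair_is_consistent(first, second):
    """Return True when both mates contain the same number of reads."""
    return count_reads(first) == count_reads(second)
```

Running `pair_is_consistent("HG002_1.fastq.gz", "HG002_2.fastq.gz")` (file names are illustrative) before alignment would flag a mismatched pair early.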

If instead you want to store the read pairs + a suggested common name, use the --store-paired-names flag. This will write to a file called paired_names.txt.

To filter which files are downloaded, pass the --filter flag a case-insensitive string or regular expression (in Python syntax).
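As an illustration of what a case-insensitive filter in Python regular expression syntax looks like, the sketch below applies a pattern to a list of file names. Whether the tool matches against the full URL or only the file name, and whether it searches anywhere in the string or anchors at the start, is an assumption here; `filter_names` is a hypothetical helper, not part of the package.

```python
import re


def filter_names(names, pattern):
    """Keep the names in which `pattern` is found, ignoring case."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [name for name in names if regex.search(name)]
```

A plain string such as `hg002` behaves as a substring match, while a pattern such as `_R1\.fastq\.gz$` restricts the selection to first-mate files.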
