Data file generation for CGAP's Higlass browsers

higlass-data

Package that creates data files for CGAP's Higlass browsers

Installation

Simply run pip install cgap-higlass-data to install the package. You need at least Python 3.8.

To develop this package, clone this repo, make sure poetry is installed on your system and run make install.

Commands

After installation the following commands can be run from the command line:

Convert BED file to BW (bigWig) file

Assume you have a BED file of the form

# HEADER LINE 1
# HEADER LINE 2
chr1	0	1024	.	423
chr1	1024	2048	.	32
chr1	2048	3072	.	734

This BED file can be converted to a BW file with the following command

# -i input BED file path
# -o output BW file path
# -a assembly (currently only 'hg38' is supported)
# -l number of header lines in the BED file
convert-bed-to-bw -i ./PATH/input.bed \
                  -o ./PATH/output.bw \
                  -a hg38 \
                  -l 2

Note that bedGraphToBigWig must be installed on your system for this to work. It can be installed via conda (conda install -c bioconda ucsc-bedgraphtobigwig). You can also download the binary here: http://hgdownload.soe.ucsc.edu/admin/exe/
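As a rough illustration of what has to happen before bedGraphToBigWig can run, the sketch below drops the header lines and the name column from the example BED file, leaving the four-column bedGraph layout (chrom, start, end, value) that the UCSC tool reads. It also checks whether the binary is on the PATH. The function name bed_to_bedgraph_lines is hypothetical and this is not the package's own code, only a minimal sketch of the idea.

```python
import shutil


def bed_to_bedgraph_lines(bed_lines, header_lines):
    """Hypothetical helper: skip the header lines and drop the name column,
    keeping chrom, start, end and value (the bedGraph layout)."""
    out = []
    for line in bed_lines[header_lines:]:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        chrom, start, end, _name, value = line.split("\t")[:5]
        out.append(f"{chrom}\t{int(start)}\t{int(end)}\t{value}")
    return out


# The example BED file from above, with two header lines.
example = [
    "# HEADER LINE 1",
    "# HEADER LINE 2",
    "chr1\t0\t1024\t.\t423",
    "chr1\t1024\t2048\t.\t32",
    "chr1\t2048\t3072\t.\t734",
]

bedgraph = bed_to_bedgraph_lines(example, header_lines=2)
print("\n".join(bedgraph))

# The conversion itself needs the UCSC binary, so check for it up front.
tool_available = shutil.which("bedGraphToBigWig") is not None
print("bedGraphToBigWig on PATH:", tool_available)
```

The header_lines argument plays the same role as the -l option of convert-bed-to-bw.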

Create variant-level VCF for CGAP's cohort browser

This command creates a multiresolution VCF file that is compatible with CGAP's cohort browser. Typically, the input VCF is VEP-annotated and has at least the info field level_most_severe_consequence (which is one of HIGH, LOW, MODERATE, MODIFIER) and an importance value that is used to rank the variants. The info field that holds this importance value can be set with the -c option.

# -i input VCF path
# -o output VCF path
# -c info field in the input VCF that ranks the variants
# -m maximal tile values per consequence. Controls how many variants are displayed at once at a given zoom level
# -q quiet True / False. Toggles verbose output
# -w chromosome-wise True / False. Significantly less memory intensive, but slightly slower.
# -t index output. True / False. If true, the output vcf will be indexed.
create-cohort-vcf -i ./PATH/input.vcf \
                  -o ./PATH/output.vcf \
                  -c p_value_negative_log_10 \
                  -q True
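To make the role of the ranking field and the per-consequence limit more concrete, here is a minimal sketch of one way such a cap could work: variants are grouped by tile and consequence, and only the highest-ranked ones in each group are kept. This is an assumption about the general idea, not the package's actual algorithm. The function name, the dictionary keys and the fixed tile size are all hypothetical.

```python
from collections import defaultdict
import heapq


def cap_variants_per_tile(variants, tile_size, max_per_consequence):
    """Hypothetical sketch: keep at most max_per_consequence variants per
    (tile, consequence) group, preferring the highest importance values."""
    groups = defaultdict(list)
    for v in variants:
        tile = v["pos"] // tile_size  # index of the tile the variant falls in
        groups[(tile, v["consequence"])].append(v)

    kept = []
    for key in sorted(groups):
        kept.extend(
            heapq.nlargest(
                max_per_consequence, groups[key], key=lambda v: v["importance"]
            )
        )
    return kept


# Made-up variants; "importance" stands in for the field chosen with -c.
variants = [
    {"pos": 100, "consequence": "HIGH", "importance": 5.0},
    {"pos": 200, "consequence": "HIGH", "importance": 9.0},
    {"pos": 300, "consequence": "HIGH", "importance": 1.0},
    {"pos": 400, "consequence": "LOW", "importance": 2.0},
    {"pos": 1500, "consequence": "HIGH", "importance": 0.5},
]

kept = cap_variants_per_tile(variants, tile_size=1024, max_per_consequence=2)
print([v["pos"] for v in kept])
```

A multiresolution file would presumably repeat a selection like this at several tile sizes, one per zoom level, so that coarse zoom levels show only the top-ranked variants.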

Create coverage BED file from VCF

Counts the number of variants in each 1024 bp window and creates a BED file with the results.

# -i input VCF path
# -o output BED file path
# -a assembly
# -q quiet True / False. Toggles verbose output
create-coverage-bed -i ./PATH/input.vcf \
                    -o ./PATH/output.bed \
                    -a hg38 \
                    -q True
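The windowed count can be sketched in a few lines of Python. The example below assigns each variant to a 1024 bp window based on its position and emits one BED line per non-empty window. The output column layout (name column set to "." and the count in the fifth column) mirrors the BED example earlier in this document and is an assumption, as is the function name. A real implementation would also need the chromosome sizes of the assembly to clip the last window, which this sketch leaves out.

```python
from collections import Counter

WINDOW = 1024  # window size in base pairs


def count_variants_per_window(vcf_lines, window=WINDOW):
    """Hypothetical sketch: count variants per fixed-size window and
    return BED-style lines (chrom, start, end, name, count)."""
    counts = Counter()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip VCF header lines and blank lines
        chrom, pos = line.split("\t")[:2]
        # VCF positions are 1-based, BED coordinates are 0-based.
        start = ((int(pos) - 1) // window) * window
        counts[(chrom, start)] += 1
    return [
        f"{chrom}\t{start}\t{start + window}\t.\t{n}"
        for (chrom, start), n in sorted(counts.items())
    ]


# Made-up VCF records; only the CHROM and POS columns matter here.
vcf_lines = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT",
    "chr1\t1\t.\tA\tG",
    "chr1\t1024\t.\tC\tT",
    "chr1\t1025\t.\tG\tA",
    "chr1\t3000\t.\tT\tC",
]

coverage = count_variants_per_window(vcf_lines)
print("\n".join(coverage))
```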

