Data file generation for CGAP's Higlass browsers
higlass-data
Package that creates data files for CGAP's Higlass browsers
Installation
Run pip install cgap-higlass-data to install the package. Python 3.8 or later is required.
To develop this package, clone this repo, make sure poetry is installed on your system, and run make install.
Commands
After installation the following commands can be run from the command line:
Convert BED file to BW (bigWig) file
Assume you have a BED file of the form:
# HEADER LINE 1
# HEADER LINE 2
chr1 0 1024 . 423
chr1 1024 2048 . 32
chr1 2048 3072 . 734
This BED file can be converted to a BW file with the following command:
# -i input BED file path
# -o output BW file path
# -a assembly (currently only 'hg38' is supported)
# -l number of header lines in the BED file
convert-bed-to-bw -i ./PATH/input.bed \
-o ./PATH/output.bw \
-a hg38 \
-l 2
Note that bedGraphToBigWig must be installed on your system for this to work. It can be installed via conda (conda install -c bioconda ucsc-bedgraphtobigwig). You can also download the binary here: http://hgdownload.soe.ucsc.edu/admin/exe/
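Since bedGraphToBigWig reads four-column bedGraph input (chrom, start, end, value), the BED file above presumably has to be reduced to that shape before conversion. The following is a minimal sketch of such a step, not the package's actual implementation: the function name bed_to_bedgraph and the value_column parameter are hypothetical, and it assumes the value sits in the fifth column as in the example file.

```python
import shutil


def bed_to_bedgraph(lines, header_lines=0, value_column=4):
    """Drop leading header lines and keep chrom, start, end, value.

    Hypothetical helper: assumes whitespace-separated fields with the
    value in the fifth column, as in the example BED file above.
    """
    rows = []
    for line in lines[header_lines:]:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        value = float(fields[value_column])
        rows.append(f"{chrom}\t{start}\t{end}\t{value:g}")
    return rows


example = [
    "# HEADER LINE 1",
    "# HEADER LINE 2",
    "chr1 0 1024 . 423",
    "chr1 1024 2048 . 32",
    "chr1 2048 3072 . 734",
]
bedgraph_rows = bed_to_bedgraph(example, header_lines=2)

# True only if the UCSC binary can be found on the PATH.
tool_available = shutil.which("bedGraphToBigWig") is not None
```

The header_lines argument plays the same role as the -l option: it tells the reader how many leading lines to skip before the data rows start.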
Create variant-level VCF for CGAP's cohort browser
This command creates a multiresolution VCF file that is compatible with CGAP's cohort browser. Typically, the input VCF is VEP-annotated and has at least the info field level_most_severe_consequence (one of HIGH, LOW, MODERATE, MODIFIER) and an importance value that ranks the variants. The info field used for ranking can be set with the -c option.
# -i input VCF path
# -o output VCF path
# -c info field in the input VCF that ranks the variants
# -m maximal tile values per consequence. Controls how many variants are displayed at once at a given zoom level
# -q quiet True / False. Toggles verbose output
# -w chromosome-wise True / False. Significantly less memory intensive, but slightly slower.
# -t index output. True / False. If true, the output vcf will be indexed.
create-cohort-vcf -i ./PATH/input.vcf \
-o ./PATH/output.vcf \
-c p_value_negative_log_10 \
-q True
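To illustrate what a per-consequence cap such as -m might mean, here is a minimal sketch that keeps only the highest-ranked variants of each consequence level within one tile. It is an assumption about the idea, not the package's algorithm: the function name, the dictionary keys (pos, consequence, importance), and the rule that larger importance values rank higher are all hypothetical.

```python
from collections import defaultdict


def cap_variants_per_consequence(variants, max_per_consequence):
    """Keep at most max_per_consequence variants per consequence level.

    Hypothetical helper: variants are dicts with the keys "pos",
    "consequence" and "importance"; higher importance ranks first.
    """
    grouped = defaultdict(list)
    for variant in variants:
        grouped[variant["consequence"]].append(variant)

    kept = []
    for group in grouped.values():
        group.sort(key=lambda v: v["importance"], reverse=True)
        kept.extend(group[:max_per_consequence])

    # Return the survivors in genomic order.
    kept.sort(key=lambda v: v["pos"])
    return kept


tile_variants = [
    {"pos": 100, "consequence": "HIGH", "importance": 7.2},
    {"pos": 250, "consequence": "HIGH", "importance": 3.1},
    {"pos": 300, "consequence": "LOW", "importance": 1.5},
    {"pos": 410, "consequence": "HIGH", "importance": 9.8},
]
kept = cap_variants_per_consequence(tile_variants, max_per_consequence=2)
```

In this example the HIGH variant at position 250 is dropped because two HIGH variants with larger importance values already fill the cap, while the single LOW variant is kept.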
Create coverage BED file from VCF
Counts the number of variants in a 1024bp window and creates a BED file with the results.
# -i input VCF path
# -o output BED file path
# -a assembly
# -q quiet True / False. Toggles verbose output
create-coverage-bed -i ./PATH/input.vcf \
-o ./PATH/output.bed \
-a hg38 \
-q True
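The windowed count can be sketched as follows. This is an illustration under stated assumptions rather than the package's code: the function name coverage_bed_rows is hypothetical, VCF positions are treated as 1-based and converted to 0-based BED windows, the output mirrors the five-column BED layout shown earlier, and windows without variants are simply omitted.

```python
from collections import Counter

WINDOW_SIZE = 1024


def coverage_bed_rows(variants, window_size=WINDOW_SIZE):
    """Count variants per fixed-size window and format them as BED rows.

    Hypothetical helper: variants is an iterable of (chrom, pos) pairs
    with 1-based VCF positions.
    """
    counts = Counter()
    for chrom, pos in variants:
        window = (pos - 1) // window_size  # 1-based pos to 0-based window index
        counts[(chrom, window)] += 1

    rows = []
    # Chromosomes are sorted lexicographically here, which is enough for a sketch.
    for (chrom, window), n in sorted(counts.items()):
        start = window * window_size
        rows.append(f"{chrom}\t{start}\t{start + window_size}\t.\t{n}")
    return rows


coverage_rows = coverage_bed_rows(
    [("chr1", 5), ("chr1", 1024), ("chr1", 1025), ("chr1", 3000)]
)
```

Positions 5 and 1024 both fall into the first window (0 to 1024 in BED coordinates), so that window gets a count of 2.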
Download files
Built Distribution
Hashes for cgap_higlass_data-0.4.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 8abf1f38b5a1e36ef451b8ab43b6879542fce3263d50b7deadf9f37402411632
MD5 | c98f128597063b2977684987aa93b99f
BLAKE2b-256 | 5393207fddfa0bb740641f79e82b67307b91dfe0c33fbeb8743aac69bce53311