Data file generation for CGAP's Higlass browsers
higlass-data
Package that creates data files for CGAP's Higlass browsers
Installation
Simply run pip install cgap-higlass-data to install the package. You need at least Python 3.8.
To develop this package, clone this repo, make sure poetry is installed on your system, and run make install.
Commands
After installation the following commands can be run from the command line:
Convert BED file to BW (bigWig) file
Assume you have a BED file of the form
# HEADER LINE 1
# HEADER LINE 2
chr1 0 1024 . 423
chr1 1024 2048 . 32
chr1 2048 3072 . 734
This BED file can be converted to a BW file with the following command:
# -i input BED file path
# -o output BW file path
# -a assembly (currently only 'hg38' is supported)
# -l number of header lines in the BED file
convert-bed-to-bw -i ./PATH/input.bed \
-o ./PATH/output.bw \
-a hg38 \
-l 2
Note that bedGraphToBigWig must be installed on your system for this to work. It can be installed via conda (conda install -c bioconda ucsc-bedgraphtobigwig). You can also download the binary here: http://hgdownload.soe.ucsc.edu/admin/exe/
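To illustrate what the conversion has to deal with, here is a minimal Python sketch that skips the header lines of a BED file like the one above and extracts the four bedGraph columns (chromosome, start, end, value), plus a check that the bedGraphToBigWig binary is on the PATH. This is not the package's actual implementation: the function names are hypothetical, and treating the last column as the value is an assumption based on the example file.

```python
import shutil


def bed_to_bedgraph_rows(lines, header_lines=0):
    """Skip `header_lines` leading lines and return (chrom, start, end, value) tuples.

    Hypothetical helper for illustration; assumes the value is the last column.
    """
    rows = []
    for line in lines[header_lines:]:
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        value = float(fields[-1])  # assumption: the score sits in the last column
        rows.append((chrom, start, end, value))
    return rows


def bedgraphtobigwig_available():
    """Return True if the bedGraphToBigWig binary can be found on the PATH."""
    return shutil.which("bedGraphToBigWig") is not None


example = [
    "# HEADER LINE 1",
    "# HEADER LINE 2",
    "chr1 0 1024 . 423",
    "chr1 1024 2048 . 32",
    "chr1 2048 3072 . 734",
]
rows = bed_to_bedgraph_rows(example, header_lines=2)
```

The header_lines argument plays the same role as the -l option: passing the wrong count would either feed a comment line into the parser or silently drop a data row.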
Create variant-level VCF for CGAP's cohort browser
This command creates a multiresolution VCF file that is compatible with CGAP's cohort browser. Typically, the input VCF is VEP-annotated and has at least the info field level_most_severe_consequence (one of HIGH, LOW, MODERATE, MODIFIER) and an importance value that ranks the variants. The info field used for ranking is configurable (see the -c option).
# -i input VCF path
# -o output VCF path
# -c info field in the input VCF that ranks the variants
# -m maximal tile values per consequence. Controls how many variants are displayed at once at a certain zoom level
# -q quiet True / False. Toggles verbose output
# -w chromosome-wise True / False. Significantly less memory intensive, but slightly slower.
# -t index output. True / False. If true, the output VCF will be indexed.
create-cohort-vcf -i ./PATH/input.vcf \
-o ./PATH/output.vcf \
-c p_value_negative_log_10 \
-q True
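The idea of capping the number of variants per consequence level can be sketched in Python as follows. This is an illustration under stated assumptions, not the package's algorithm: the dictionary layout, the function name select_top_variants, and the fixed tile_size are all hypothetical. Within each tile, the sketch keeps at most max_per_consequence variants per consequence level, preferring those with the highest importance value.

```python
import heapq
from collections import defaultdict


def select_top_variants(variants, tile_size, max_per_consequence):
    """Keep the highest-ranked variants per (tile, consequence) bucket.

    Hypothetical sketch: `variants` is a list of dicts with the keys
    "pos", "consequence" and "importance" (assumed layout).
    """
    buckets = defaultdict(list)
    for variant in variants:
        key = (variant["pos"] // tile_size, variant["consequence"])
        buckets[key].append(variant)

    kept = []
    for key in sorted(buckets):
        # nlargest returns at most `max_per_consequence` items, best first
        kept.extend(
            heapq.nlargest(
                max_per_consequence, buckets[key], key=lambda v: v["importance"]
            )
        )
    kept.sort(key=lambda v: v["pos"])  # restore positional order
    return kept


variants = [
    {"pos": 100, "consequence": "HIGH", "importance": 5.0},
    {"pos": 200, "consequence": "HIGH", "importance": 9.0},
    {"pos": 300, "consequence": "HIGH", "importance": 1.0},
    {"pos": 400, "consequence": "LOW", "importance": 2.0},
    {"pos": 1500, "consequence": "HIGH", "importance": 0.5},
]
selected = select_top_variants(variants, tile_size=1024, max_per_consequence=2)
```

In this example the HIGH variant at position 300 is dropped because two higher-ranked HIGH variants share its tile, while the low-ranked variant at position 1500 survives because it is alone in its tile.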
Create coverage BED file from VCF
Counts the number of variants in each 1024 bp window and writes the results to a BED file.
# -i input VCF path
# -o output BED path
# -a assembly
# -q quiet True / False. Toggles verbose output
create-coverage-bed -i ./PATH/input.vcf \
-o ./PATH/output.bed \
-a hg38 \
-q True
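The windowed counting described above can be sketched in a few lines of Python. This is a minimal illustration, not the package's code: the function name and the (chromosome, position) input layout are hypothetical, and the sketch assumes that 1-based VCF positions are shifted to the 0-based coordinates used by BED and that empty windows are omitted.

```python
from collections import Counter

WINDOW_SIZE = 1024  # window width in base pairs, as stated above


def count_variants_per_window(variants, window=WINDOW_SIZE):
    """Return BED-style rows (chrom, start, end, count) for non-empty windows.

    Hypothetical sketch: `variants` is an iterable of (chrom, pos) pairs
    with 1-based positions, as in a VCF file.
    """
    counts = Counter()
    for chrom, pos in variants:
        # assumption: convert the 1-based VCF position to 0-based before binning
        counts[(chrom, (pos - 1) // window)] += 1

    return [
        (chrom, index * window, (index + 1) * window, counts[(chrom, index)])
        for chrom, index in sorted(counts)
    ]


variants = [("chr1", 1), ("chr1", 1024), ("chr1", 1025), ("chr2", 5000)]
bed_rows = count_variants_per_window(variants)
```

Positions 1 and 1024 fall into the first window (0 to 1024), position 1025 opens the second one, and the chr2 variant lands in the window starting at 4096.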