
Creates self-contained html pages for visual variant review with IGV (igv.js).


igv-reports

A Python application to generate self-contained HTML reports that consist of a table of genomic sites or regions and associated IGV views for each site. The generated HTML page contains all data necessary for IGV as uuencoded blobs. It can be opened within a web browser with the "file" protocol; no internet connection is required.

Installation

Prerequisites

igv-reports requires Python 3.6 or greater and pip.

As with all Python projects, use of a virtual environment is recommended. Instructions for creating a virtual environment using conda follow.

1. Install Anaconda from https://docs.anaconda.com/anaconda/

2. Create a virtual environment

conda create -n igvreports python=3.7.1
conda activate igvreports

Installing igv-reports

pip install igv-reports

igv-reports requires the package pysam, which should be installed automatically. However, on OSX this sometimes fails due to missing dependent libraries. This can be fixed by following the procedure below, from the pysam docs:
"The recommended way to install pysam is through conda/bioconda. This will install pysam from the bioconda channel and automatically makes sure that dependencies are installed. Also, compilation flags will be set automatically, which will potentially save a lot of trouble on OS X."

conda config --add channels r
conda config --add channels bioconda
conda install pysam

Creating a report

A report consists of a table of sites or regions and an associated IGV view for each site. Reports are created with the command line script create_report. Command line arguments are described below. Although --tracks is optional, a typical report will include at least an alignment track (BAM or CRAM file) from which the variants were called.

Arguments:

  • Required

    • sites VCF (tabix indexed vcf.gz file), BED, MAF, or generic tab delimited file of genomic variant sites.
    • fasta Reference fasta file; must be indexed.
  • Required for generic tab delimited sites file

    • --begin INT. Column of start chromosomal position for sites file. Used for generic tab delimited input.
    • --end INT. Column of end chromosomal position for sites. Used for generic tab delimited input.
    • --sequence INT. Column of sequence (chromosome) name.
  • Optional for generic tab delimited sites file

    • --zero-based Specify that the position in the sites file is 0-based (e.g. UCSC files) rather than 1-based. Default is false.
  • Optional

    • --tracks LIST. Space-delimited list of track files; see below for supported formats. If both --tracks and --track-config are specified, the tracks from --tracks will appear first by default.
    • --track-config FILE. File containing array of json configuration objects for igv.js tracks. See the igv.js wiki for more details. This option allows customization of track parameters. When using this option, the track url and indexURL properties should be set to the paths to the respective files.
    • --ideogram FILE. Ideogram file in UCSC cytoIdeo format.
    • --template FILE. HTML template file.
    • --output FILE. Output file name; default="igvjs_viewer.html".
    • --info-columns LIST. Space delimited list of field names to include in the variant table. If sites is a VCF file these are the info field names. If sites is a tab delimited format these are column names.
    • --info-columns-prefixes LIST. Space delimited list of prefixes of VCF info field names to include in variant table.
    • --sample-columns LIST. Space delimited list of VCF sample/format field names to include in variant table.
    • --flanking INT. Genomic region to include either side of variant; default=1000.
    • --standalone Embed all JavaScript referenced via <script> tags in the page.
    • --sort Applies to alignment tracks only. If specified, alignments are initially sorted by the specified option. Supported values include BASE, STRAND, INSERT_SIZE, MATE_CHR, and NONE. Default value is BASE for single nucleotide variants, NONE (no sorting) otherwise. See the igv.js documentation for more information.
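
The arguments listed above can also be assembled programmatically when reports are generated from a pipeline. The following is a minimal sketch, not part of igv-reports: the helper name build_create_report_command is hypothetical, and it only covers a few of the documented options.

```python
import shlex


def build_create_report_command(sites, fasta, tracks=(), info_columns=(),
                                flanking=1000, output="igvjs_viewer.html",
                                standalone=False):
    """Assemble a create_report argument list (hypothetical helper)."""
    # The two required positional arguments come first.
    cmd = ["create_report", sites, fasta]
    if tracks:
        # --tracks takes a space-delimited list of files.
        cmd += ["--tracks", *tracks]
    if info_columns:
        cmd += ["--info-columns", *info_columns]
    cmd += ["--flanking", str(flanking), "--output", output]
    if standalone:
        cmd.append("--standalone")
    return cmd


cmd = build_create_report_command(
    "examples/variants/variants.vcf.gz",
    "hg38.fa",  # placeholder path to an indexed reference fasta
    tracks=["examples/variants/recalibrated.bam"],
    info_columns=["GENE", "TISSUE"],
    output="example1.html",
)

# Print a shell-safe rendering of the command for inspection.
print(" ".join(shlex.quote(part) for part in cmd))
```

With igv-reports installed, the list could be passed to subprocess.run(cmd, check=True) to run the report.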

Tab delimited sites file

Variant sites can be defined from a VCF, UCSC BED, MAF, or a generic tab delimited file.

Note: VCF files must be tabix indexed, and must end with a ".gz" extension. The ".bgz" extension is not supported.
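
To illustrate how --zero-based and --flanking might interact for a generic tab delimited site, here is a small sketch. It is an assumption about the coordinate arithmetic, not the igv-reports implementation: it treats 0-based input as half-open (as in UCSC BED), converts it to 1-based inclusive coordinates, and pads both sides by the flanking distance. The function name display_region is hypothetical.

```python
def display_region(start, end, flanking=1000, zero_based=False):
    """Return a padded 1-based inclusive (start, end) window (illustrative only)."""
    if zero_based:
        # 0-based half-open -> 1-based inclusive: shift start, keep end.
        start += 1
    # Pad on both sides; never extend before the first base.
    return max(1, start - flanking), end + flanking


# A single-base site at 1-based position 5000.
print(display_region(5000, 5000))
# The same site written in 0-based half-open form.
print(display_region(4999, 5000, zero_based=True))
```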

Track file formats:

Currently supported track file formats are BAM, CRAM, VCF, BED, GFF3, and GTF. FASTA, BAM, CRAM, and VCF files must be indexed. Tabix is supported for other file types, and it is recommended that all large files be indexed.
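
Because missing index files are a common cause of failures, it can help to check for them before creating a report. The sketch below is not part of igv-reports. The suffix table reflects common samtools/tabix naming conventions and is an assumption; igv-reports may look for indexes differently.

```python
from pathlib import Path

# Conventional index names by file extension (assumed, not taken from igv-reports).
INDEX_SUFFIXES = {
    ".bam": [".bam.bai", ".bai"],
    ".cram": [".cram.crai", ".crai"],
    ".vcf.gz": [".vcf.gz.tbi"],
    ".fa": [".fa.fai"],
    ".fasta": [".fasta.fai"],
}


def candidate_index_paths(data_path):
    """List the index file names conventionally paired with data_path."""
    name = str(data_path)
    for ext, suffixes in INDEX_SUFFIXES.items():
        if name.endswith(ext):
            stem = name[: -len(ext)]
            return [stem + suffix for suffix in suffixes]
    # Unknown extension: no index convention recorded here.
    return []


def has_index(data_path):
    """True if any conventional index file exists next to data_path."""
    return any(Path(p).exists() for p in candidate_index_paths(data_path))


print(candidate_index_paths("examples/variants/recalibrated.bam"))
```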

Examples

Data for the examples are available for download.

Creating a variant report from a VCF file:

create_report examples/variants/variants.vcf.gz http://s3.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg38/hg38.fa --ideogram examples/variants/cytoBandIdeo.txt --flanking 1000 --info-columns GENE TISSUE TUMOR COSMIC_ID SOMATIC --tracks examples/variants/variants.vcf.gz examples/variants/recalibrated.bam examples/variants/refGene.sort.bed.gz --output example1.html

Creating a variant report from a "track-config" json file

create_report examples/variants/variants.vcf.gz http://s3.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg38/hg38.fa --ideogram examples/variants/cytoBandIdeo.txt --flanking 1000 --info-columns GENE TISSUE TUMOR COSMIC_ID SOMATIC --track-config examples/variants/trackConfigs.json --output example_config.html
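
A track-config file is a JSON array of igv.js track configuration objects. The sketch below shows one way such a file could be produced. It is illustrative only: the contents of examples/variants/trackConfigs.json are not reproduced here, the index file names are assumed, and the properties used (name, type, format, url, indexURL) should be checked against the igv.js wiki.

```python
import json

# Hypothetical track configurations; url and indexURL point at local files.
track_configs = [
    {
        "name": "Variants",
        "type": "variant",
        "format": "vcf",
        "url": "examples/variants/variants.vcf.gz",
        "indexURL": "examples/variants/variants.vcf.gz.tbi",  # assumed index name
    },
    {
        "name": "Alignments",
        "type": "alignment",
        "format": "bam",
        "url": "examples/variants/recalibrated.bam",
        "indexURL": "examples/variants/recalibrated.bam.bai",  # assumed index name
    },
]

# Serialize as a JSON array, the shape --track-config expects.
config_text = json.dumps(track_configs, indent=2)
print(config_text)
```

The resulting text could be saved to a file and passed to create_report with --track-config.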

Creating a variant report from a TCGA MAF file

create_report examples/variants/tcga_test.maf http://s3.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg19/hg19.fasta --ideogram examples/variants/cytoBandIdeo.txt --flanking 1000 --info-columns Chromosome Start_position End_position Variant_Classification Variant_Type Reference_Allele Tumor_Seq_Allele1 Tumor_Seq_Allele2 dbSNP_RS --tracks examples/variants/refGene.sort.bed.gz --output example_maf.html

Creating a variant report from a generic tab-delimited file

create_report examples/variants/test.maflite.tsv http://s3.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg19/hg19.fasta --ideogram examples/variants/cytoBandIdeo.txt --flanking 1000 --sequence 1 --begin 2 --end 3 --info-columns chr start end ref_allele alt_allele --tracks examples/variants/refGene.sort.bed.gz --output example_tab.html
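
The relationship between a generic tab-delimited file and the --sequence, --begin, and --end options in the command above can be sketched as follows. The rows are invented placeholder values, not the contents of test.maflite.tsv, and the sketch assumes the file has a header row and that column numbers are counted from 1, as the example command suggests. The helper names are hypothetical.

```python
import csv
import io

fieldnames = ["chr", "start", "end", "ref_allele", "alt_allele"]

# Placeholder sites for illustration only.
rows = [
    {"chr": "chr1", "start": 1000, "end": 1000, "ref_allele": "A", "alt_allele": "G"},
    {"chr": "chr2", "start": 2500, "end": 2502, "ref_allele": "CTG", "alt_allele": "C"},
]


def to_tsv(rows, fieldnames):
    """Render rows as tab-delimited text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def column_flags(fieldnames, sequence, begin, end):
    """Translate column names into 1-based column-number flags (assumed convention)."""
    return [
        "--sequence", str(fieldnames.index(sequence) + 1),
        "--begin", str(fieldnames.index(begin) + 1),
        "--end", str(fieldnames.index(end) + 1),
    ]


tsv_text = to_tsv(rows, fieldnames)
flags = column_flags(fieldnames, "chr", "start", "end")
print(tsv_text)
print(" ".join(flags))
```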

Creating a junction report from a splice-junction bed file

create_report examples/junctions/Introns.38.bed http://s3.dualstack.us-east-1.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg38/hg38.fa --type junction --ideogram examples/junctions/cytoBandIdeo.txt --output junctions.html --track-config examples/junctions/tracks.json --info-columns TCGA GTEx variant_name --title "Sample A"

Converting genomic files to data URIs for use in igv.js

The script create_datauri converts the contents of a file to a data URI for use in igv.js. The data URI is printed to stdout. Note: it is not necessary to run this script explicitly to create a report; it is documented here for use with stand-alone igv.js.

Convert a gzipped vcf file to a datauri.

create_datauri examples/variants/variants.vcf.gz

Convert a slice of a remote cram file to a cram datauri.

create_datauri \
--region 8:127,738,322-127,738,508 \
http://s3.amazonaws.com/1000genomes/data/HG00096/alignment/HG00096.alt_bwamem_GRCh38DH.20150718.GBR.low_coverage.cram 
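
For readers unfamiliar with data URIs, the general idea can be sketched in a few lines of Python. This is not the create_datauri implementation, and the exact MIME type and encoding that script emits may differ; the sketch assumes a base64-encoded payload and uses hypothetical helper names and a placeholder payload.

```python
import base64
import gzip


def to_data_uri(data, mime="application/gzip"):
    """Wrap raw bytes in a base64 data URI (assumed format)."""
    encoded = base64.b64encode(data).decode("ascii")
    return "data:{};base64,{}".format(mime, encoded)


def from_data_uri(uri):
    """Recover the raw bytes from a base64 data URI."""
    _header, payload = uri.split(",", 1)
    return base64.b64decode(payload)


# Placeholder payload standing in for the bytes of a gzipped VCF file.
payload = gzip.compress(b"##fileformat=VCFv4.2\n")
uri = to_data_uri(payload)
print(uri[:40] + "...")
```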

Release Notes
