
Converts the CNVkit structural variants VCF file into JSON format

Convert CNVkit VCF 4.2 to JSON

Overview

cnv_vcf2json is a command-line tool that converts CNVkit structural variant VCF files into JSON files conforming to the Progenetix database variant schema (pgxVariant.yaml). The tool extracts variant data from the VCF file, including chromosomal coordinates, variant type, and copy-number information. If the copy number (CN) is missing, which is common for deletion calls from HMM-based segmentation, the tool infers it from the genotype (GT): heterozygous deletions (GT=0/1 or 1/0) are assigned CN=1, and homozygous deletions (GT=1/1) are assigned CN=0. For duplications without a CN value, a default of 3 is assumed and interpreted as low-level gain; CN values ≥4 are treated as high-level gain.
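
The inference rules above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the tool's actual source: the function names `infer_copy_number` and `classify_duplication` are hypothetical, and the `DEL`/`DUP` labels are assumed to match the SVTYPE values in the input VCF.

```python
from typing import Optional


def infer_copy_number(svtype: str, cn: Optional[int], gt: Optional[str]) -> Optional[int]:
    """Return the reported CN, or infer one from the rules described above.

    Hypothetical helper for illustration; not taken from the package source.
    """
    if cn is not None:
        return cn  # a CN reported in the VCF is used as-is
    if svtype == "DEL":
        if gt in ("0/1", "1/0"):
            return 1  # heterozygous deletion
        if gt == "1/1":
            return 0  # homozygous deletion
        return None  # other genotypes are not covered by the description
    if svtype == "DUP":
        return 3  # default for duplications without CN
    return None


def classify_duplication(cn: int) -> str:
    """Label a duplication: CN >= 4 is high-level gain, otherwise low-level gain.

    How CN values below 3 are handled is not described, so this is an assumption.
    """
    return "high-level gain" if cn >= 4 else "low-level gain"


print(infer_copy_number("DEL", None, "0/1"))
print(infer_copy_number("DUP", None, None))
print(classify_duplication(4))
```

Returning `None` for unrecognized genotypes keeps the sketch from guessing a copy number in cases the description does not address.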

The converter also accepts optional metadata (assembly, analysis, individual, sequence, reference sequence, and fusion identifiers), which is incorporated into the JSON output according to the Progenetix schema.
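
As a rough sketch of how optional metadata might be merged into a variant record, the snippet below adds only the identifiers that were actually supplied. The help text states that `assemblyId` is excluded when `--assembly` is omitted; treating the other identifiers the same way is an assumption. The helper `attach_metadata` and the `variantType` key are hypothetical, and the real output follows pgxVariant.yaml.

```python
import json
from typing import Dict, Optional


def attach_metadata(variant: Dict[str, object], **optional_ids: Optional[str]) -> Dict[str, object]:
    """Copy the variant record and add only the identifiers that are set.

    Hypothetical helper for illustration; not taken from the package source.
    """
    result = dict(variant)
    for key, value in optional_ids.items():
        if value is not None:
            result[key] = value  # omitted identifiers never appear in the output
    return result


variant = {"variantType": "DEL"}  # hypothetical minimal record
record = attach_metadata(
    variant,
    assemblyId="GRCh38",
    analysisId=None,       # not supplied, so it is left out
    individualId="ind-001",
)
print(json.dumps(record, indent=2))
```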

Requirements

Installation and Update

Using Pip3

  1. Install the package:

    pip3 install cnv_vcf2json
    
  2. Update the package, if needed:

    pip3 install cnv_vcf2json --upgrade
    
  3. Test your installation:

    cnv-vcf2json --help
    

Using Conda

  1. Add the conda-forge channel (if not already added):

    conda config --add channels conda-forge
    
  2. Install the package:

    conda install cnv_vcf2json
    
  3. Update the package, if needed:

    conda update cnv_vcf2json
    
  4. Test your installation:

    cnv-vcf2json --help
    

Usage cnv-vcf2json

usage: cnv_vcf2json.py [-h] -o OUTPUT [--assembly ASSEMBLY] [--analysis ANALYSIS] [--individual INDIVIDUAL] [--sequence SEQUENCE] [--reference REFERENCE]
                       [--fusion FUSION]
                       input

Convert CNVkit VCF to Beacon JSON format following the Progenetix pgxVariant schema

positional arguments:
  input                 Input VCF file name

optional arguments:
  -h, --help            show this help message and exit
  -o OUTPUT, --output OUTPUT
                        Output JSON file name
  --assembly ASSEMBLY   Assembly identifier (e.g. GRCh38); if omitted, assemblyId will be excluded
  --analysis ANALYSIS   Analysis identifier (analysisId)
  --individual INDIVIDUAL
                        Individual identifier (individualId)
  --sequence SEQUENCE   Variant sequence
  --reference REFERENCE
                        Reference sequence
  --fusion FUSION       Fusion identifier (fusionId)

Basic Conversion

cnv-vcf2json input.vcf -o output.json --assembly GRCh38
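
When the conversion is driven from a script, the command line can be assembled programmatically. The sketch below uses only flags listed in the usage text, where the input file is a positional argument. The helper `build_command` is hypothetical, and actually running the command requires the package to be installed.

```python
from typing import List, Optional


def build_command(
    input_vcf: str,
    output_json: str,
    assembly: Optional[str] = None,
    analysis: Optional[str] = None,
    individual: Optional[str] = None,
) -> List[str]:
    """Assemble a cnv-vcf2json argument list (hypothetical helper)."""
    cmd = ["cnv-vcf2json", input_vcf, "-o", output_json]
    optional_flags = (
        ("--assembly", assembly),
        ("--analysis", analysis),
        ("--individual", individual),
    )
    for flag, value in optional_flags:
        if value is not None:
            cmd += [flag, value]  # only pass flags that were supplied
    return cmd


cmd = build_command("input.vcf", "output.json", assembly="GRCh38")
print(" ".join(cmd))  # prints: cnv-vcf2json input.vcf -o output.json --assembly GRCh38

# To execute it (requires cnv_vcf2json to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```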
