Converts the CNVkit structural variants VCF file into JSON format
Convert CNVkit VCF 4.2 to JSON
Overview
cnv_vcf2json is a command-line tool that converts CNVkit structural variant VCF files into JSON files conforming to the Progenetix database variant schema (pgxVariant.yaml). The tool extracts variant data from the VCF file, including chromosomal coordinates, variant type, and copy-number information. If the copy number (CN) is missing, which is common for deletion calls from HMM-based segmentation, the tool infers it from the genotype (GT): heterozygous deletions (GT=0/1 or 1/0) are assigned CN=1, and homozygous deletions (GT=1/1) are assigned CN=0. For duplications without a CN value, a default of 3 is assumed and interpreted as a low-level gain; CN values ≥4 are treated as high-level gain.
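The inference rules described above can be sketched in a few lines of Python. This is an illustration only, not the package's actual code: the function names `infer_copy_number` and `classify_gain` are hypothetical, and the handling of cases the text does not cover (returning `None`) is an assumption.

```python
from typing import Optional


def infer_copy_number(svtype: str, cn: Optional[int], gt: str) -> Optional[int]:
    """Return the copy number for a record, inferring it when CN is missing.

    Hypothetical helper that mirrors the rules in the overview.
    """
    if cn is not None:
        return cn  # an explicit CN always wins
    if svtype == "DEL":
        if gt == "1/1":
            return 0  # homozygous deletion
        if gt in ("0/1", "1/0"):
            return 1  # heterozygous deletion
        return None  # assumption: other genotypes are left undetermined
    if svtype == "DUP":
        return 3  # default for duplications without CN
    return None  # assumption: other variant types are not inferred


def classify_gain(cn: int) -> str:
    """Label a duplication as low-level or high-level gain (CN >= 4 is high)."""
    return "high-level gain" if cn >= 4 else "low-level gain"


het_del = infer_copy_number("DEL", None, "0/1")
hom_del = infer_copy_number("DEL", None, "1/1")
default_dup = infer_copy_number("DUP", None, "0/1")
```

Here `het_del` is 1, `hom_del` is 0, and `default_dup` is 3, which `classify_gain` labels as a low-level gain.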
The converter also accepts optional metadata (assembly, analysis, individual, sequence, reference sequence, and fusion identifiers), which it incorporates into the JSON output according to the Progenetix schema.
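As a rough sketch of how optional metadata might be merged into a variant record, the Python snippet below adds only the identifiers that were actually supplied. The function `attach_metadata` and the starting record are hypothetical. The key names `assemblyId`, `analysisId`, `individualId`, and `fusionId` are taken from the tool's help text; the keys used for the variant and reference sequences are not stated there, so they are left out. Omitting every unset field is an assumption (the help text confirms it only for `assemblyId`).

```python
from typing import Any, Dict, Optional


def attach_metadata(
    variant: Dict[str, Any],
    *,
    assembly: Optional[str] = None,
    analysis: Optional[str] = None,
    individual: Optional[str] = None,
    fusion: Optional[str] = None,
) -> Dict[str, Any]:
    """Return a copy of `variant` with the supplied identifiers added."""
    optional_fields = {
        "assemblyId": assembly,
        "analysisId": analysis,
        "individualId": individual,
        "fusionId": fusion,
    }
    enriched = dict(variant)  # do not mutate the caller's record
    for key, value in optional_fields.items():
        if value is not None:  # assumption: unset identifiers are omitted
            enriched[key] = value
    return enriched


# Hypothetical minimal record; real output carries more fields.
record = attach_metadata(
    {"variantType": "DEL"}, assembly="GRCh38", individual="ind-001"
)
```

With these inputs, `record` contains `variantType`, `assemblyId`, and `individualId`, and no `analysisId` or `fusionId` keys.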
Requirements
- Python 3.6 or newer
Installation and Update
Using Pip3
- Install the package:
  pip3 install cnv_vcf2json
- Update the package, if needed:
  pip3 install cnv_vcf2json --upgrade
- Test your installation:
  cnv-vcf2json --help
Using Conda
- Add the conda-forge channel (if not already added):
  conda config --add channels conda-forge
- Install the package:
  conda install cnv_vcf2json
- Update the package, if needed:
  conda update cnv_vcf2json
- Test your installation:
  cnv-vcf2json --help
Usage cnv-vcf2json
usage: cnv_vcf2json.py [-h] -o OUTPUT [--assembly ASSEMBLY] [--analysis ANALYSIS]
                       [--individual INDIVIDUAL] [--sequence SEQUENCE]
                       [--reference REFERENCE] [--fusion FUSION]
                       input

Convert CNVkit VCF to Beacon JSON format following the Progenetix pgxVariant schema

positional arguments:
  input                 Input VCF file name

optional arguments:
  -h, --help            show this help message and exit
  -o OUTPUT, --output OUTPUT
                        Output JSON file name
  --assembly ASSEMBLY   Assembly identifier (e.g. GRCh38); if omitted, assemblyId will be excluded
  --analysis ANALYSIS   Analysis identifier (analysisId)
  --individual INDIVIDUAL
                        Individual identifier (individualId)
  --sequence SEQUENCE   Variant sequence
  --reference REFERENCE
                        Reference sequence
  --fusion FUSION       Fusion identifier (fusionId)
Define the collection name for deletion
Basic Conversion
cnv-vcf2json input.vcf -o output.json --assembly GRCh38
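The usage message lists the input file as a positional argument and `-o/--output` as the only required option. The following `argparse` sketch reproduces that interface to show how such a command line is parsed. It is reconstructed from the help text alone and is not the package's source; the function name `build_parser` is hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented command-line interface."""
    parser = argparse.ArgumentParser(
        prog="cnv-vcf2json",
        description="Convert CNVkit VCF to Beacon JSON format "
        "following the Progenetix pgxVariant schema",
    )
    parser.add_argument("input", help="Input VCF file name")
    parser.add_argument("-o", "--output", required=True, help="Output JSON file name")
    parser.add_argument("--assembly", help="Assembly identifier (e.g. GRCh38)")
    parser.add_argument("--analysis", help="Analysis identifier (analysisId)")
    parser.add_argument("--individual", help="Individual identifier (individualId)")
    parser.add_argument("--sequence", help="Variant sequence")
    parser.add_argument("--reference", help="Reference sequence")
    parser.add_argument("--fusion", help="Fusion identifier (fusionId)")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["input.vcf", "-o", "output.json", "--assembly", "GRCh38"]
)
```

After parsing, `args.input` holds the VCF path, `args.output` the JSON path, and any identifier that was not supplied is `None`.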
File details
Details for the file cnv_vcf2json-2.0.0.tar.gz.
File metadata
- Download URL: cnv_vcf2json-2.0.0.tar.gz
- Upload date:
- Size: 4.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.4.2 requests/2.31.0 setuptools/45.2.0 requests-toolbelt/0.8.0 tqdm/4.30.0 CPython/3.8.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7a1fc7bde137e7d2dc766aeda21b446373c32129af1d5259304dae24b5f857e5 |
| MD5 | 63d5c158b81c3dbe9c6201ee1eeff8aa |
| BLAKE2b-256 | a9883e31ce7c85cbdc27879d2d3b4e7f99e29dfa6f80f859ae2360a82ccd54db |
File details
Details for the file cnv_vcf2json-2.0.0-py3-none-any.whl.
File metadata
- Download URL: cnv_vcf2json-2.0.0-py3-none-any.whl
- Upload date:
- Size: 5.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.4.2 requests/2.31.0 setuptools/45.2.0 requests-toolbelt/0.8.0 tqdm/4.30.0 CPython/3.8.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2399c3aeee57e0adfd1cd5f2ea8a94540becd2b1026486f8df6315cec8ad4c14 |
| MD5 | a2b78af071eae415fe89c5dd44cd69e9 |
| BLAKE2b-256 | 804669b5172e0508b82be6af958ed3e30afb1f14c51d37e7e1d3a68c2083bfbb |