OpenCRAVAT CSV processor for generating sample CSVs for aiva-database import

AIVA Sample CSV Processor

A Python package for processing OpenCRAVAT CSV output files and generating sample CSVs for database import. This tool helps streamline the workflow from variant calling to database import by converting OpenCRAVAT annotations into structured CSV files ready for database loading.

Features

Process OpenCRAVAT CSV files into structured database-ready formats
Generate VRS IDs for variants using the aiva-vrs library
Create separate CSV files for variants, transcript consequences, and sample variants
Support multi-sample VCF inputs
Handle various zygosity, quality, and depth metrics
Optionally compress output files with gzip
Customize sample metadata

Installation

Quick Install

pip install aiva-sample-processor

Development Install

git clone https://github.com/MHSPL/aiva-sample-processor.git
cd aiva-sample-processor
pip install -e .

Dependencies

This package requires:

Python 3.8+
open-cravat 2.2.0+
aiva-vrs 0.1.0+
pandas
tqdm
psycopg2-binary

All dependencies will be installed automatically when installing with pip.
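
To confirm that an environment has the packages listed above, the installed versions can be inspected with the standard library. This is a minimal sketch and not part of the package: the helper name `installed_versions` is hypothetical, and the distribution names are copied from the list above.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as listed in the Dependencies section.
REQUIRED = ["open-cravat", "aiva-vrs", "pandas", "tqdm", "psycopg2-binary"]


def installed_versions(names=REQUIRED):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

A `None` entry means the distribution is missing from the current environment; the sketch reports versions but does not compare them against the minimums above.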

Usage

1. Run OpenCRAVAT

First, run OpenCRAVAT on your VCF file to generate the CSV that aiva-sample-processor takes as input:

# Install OpenCRAVAT modules (first time only)
oc module install-base
oc module install csvreporter

# Run OpenCRAVAT
oc run input.vcf -l hg38 -t csv

2. Process the OpenCRAVAT Output

Use the aiva-sample-processor command to process the OpenCRAVAT output:

aiva-sample-processor --input input.vcf.variant.csv --output-dir output_csvs
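
When the processor is called from a Python script rather than a shell, the same command can be assembled and run with `subprocess`. The sketch below is illustrative only: `build_command` and `run_processor` are hypothetical helpers, the `<vcf name>.variant.csv` naming is assumed from the example command above, and the flags are the ones listed under Command Line Options.

```python
import subprocess
from pathlib import Path


def build_command(vcf_path, output_dir, assembly="GRCh38", compress=True):
    """Build the aiva-sample-processor argument list for an annotated VCF."""
    # Assumption: OpenCRAVAT's CSV report sits next to the VCF as
    # "<vcf name>.variant.csv", matching the example command above.
    variant_csv = Path(f"{vcf_path}.variant.csv")
    cmd = [
        "aiva-sample-processor",
        "--input", str(variant_csv),
        "--output-dir", str(output_dir),
        "--assembly", assembly,
    ]
    if not compress:
        cmd.append("--no-compress")
    return cmd


def run_processor(vcf_path, output_dir, **options):
    """Run the CLI and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(build_command(vcf_path, output_dir, **options), check=True)
```

Keeping the argument list separate from the call makes it possible to log or inspect the command before anything is executed.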

Command Line Options

usage: aiva-sample-processor [-h] --input INPUT --output-dir OUTPUT_DIR [--assembly {GRCh37,GRCh38}]
                           [--no-compress] [--owner-id OWNER_ID] [--group-id GROUP_ID]
                           [--is-public {true,false}] [--sample-type SAMPLE_TYPE]
                           [--status STATUS] [--review-status REVIEW_STATUS]
                           [--view-status VIEW_STATUS] [--archive-status ARCHIVE_STATUS]
                           [--clinical-notes CLINICAL_NOTES] [--phenotype-terms PHENOTYPE_TERMS]

Generate CSVs for importing sample variants into the database.

required arguments:
  --input INPUT          Path to OpenCRAVAT CSV file
  --output-dir OUTPUT_DIR
                        Directory to write output CSVs to

optional arguments:
  --assembly {GRCh37,GRCh38}
                        Genome assembly (default: GRCh38)
  --no-compress         Do not compress output files with gzip
  --owner-id OWNER_ID   User ID of the owner (default: user-1)
  --group-id GROUP_ID   Group ID for the samples
  --is-public {true,false}
                        Whether samples are public (default: false)
  --sample-type SAMPLE_TYPE
                        Type of sample (default: blood)
  --status STATUS       Sample processing status (default: processed)
  --review-status REVIEW_STATUS
                        Review status of the sample (default: not_reviewed)
  --view-status VIEW_STATUS
                        View status of the sample (default: none)
  --archive-status ARCHIVE_STATUS
                        Archive status of the sample (default: active)
  --clinical-notes CLINICAL_NOTES
                        Clinical notes for the sample
  --phenotype-terms PHENOTYPE_TERMS
                        JSON array of phenotype terms (default: [])
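
Because --phenotype-terms takes a JSON array and --is-public takes the literal strings true or false, it can help to generate these values rather than type them by hand. The following is a sketch under assumptions: `metadata_args` is a hypothetical helper, and the element format of the array (plain strings such as ontology identifiers) is a guess, since the help text only says "JSON array".

```python
import json


def metadata_args(phenotype_terms=(), is_public=False, clinical_notes=None):
    """Return CLI arguments for a few of the sample metadata options."""
    args = [
        # --phenotype-terms expects a JSON array, so serialize the sequence.
        "--phenotype-terms", json.dumps(list(phenotype_terms)),
        # --is-public accepts the literal strings "true" or "false".
        "--is-public", "true" if is_public else "false",
    ]
    if clinical_notes is not None:
        args += ["--clinical-notes", clinical_notes]
    return args
```

The returned list can be appended to an argument list passed to `subprocess.run`, which avoids shell quoting problems with the brackets and quotes in the JSON string.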

Output Files

The tool generates the following CSV files:

variants.csv(.gz): Contains variant information with VRS IDs
transcript_consequences.csv(.gz): Contains transcript consequences for each variant
sample_variants.csv(.gz): Contains sample-specific variant information
samples.csv: Contains sample metadata
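
Since the first three files may or may not be gzip-compressed depending on --no-compress, a reader that handles both cases is convenient for spot-checking the output. This is a minimal sketch using only the standard library; `read_rows` is a hypothetical helper, and it assumes each file is UTF-8 encoded with a header row. No column names are assumed.

```python
import csv
import gzip
from pathlib import Path


def read_rows(path):
    """Yield each row of an output CSV as a dict, gzip-compressed or not."""
    path = Path(path)
    # Compressed outputs carry a .gz suffix; open either kind in text mode.
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", newline="", encoding="utf-8") as handle:
        yield from csv.DictReader(handle)
```

For example, `next(read_rows("output_csvs/variants.csv.gz"))` returns the first data row keyed by whatever header the file actually contains.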

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

OpenCRAVAT for providing the annotation framework
GA4GH VRS for the variant representation standard

Project details

This version

0.1.0

Jul 23, 2025

Download files

Source Distribution

aiva_sample_processor-0.1.0.tar.gz (15.1 kB view details)

Uploaded Jul 23, 2025 Source

Built Distribution

aiva_sample_processor-0.1.0-py3-none-any.whl (15.9 kB view details)

Uploaded Jul 23, 2025 Python 3

File details

Details for the file aiva_sample_processor-0.1.0.tar.gz.

File metadata

Download URL: aiva_sample_processor-0.1.0.tar.gz
Upload date: Jul 23, 2025
Size: 15.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.2

File hashes

Hashes for aiva_sample_processor-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`e8d69f3d71eac7187072dcd90a92d5e229be4938a770d5dfa4ce8479b7384bb6`
MD5	`f4e38b156008961a988ba42b3cba7e8c`
BLAKE2b-256	`4d3475250c6c94061a83c9920dea7f4f8fcd07a6f26bc8e5d25376a1c6263b25`

File details

Details for the file aiva_sample_processor-0.1.0-py3-none-any.whl.

File metadata

Download URL: aiva_sample_processor-0.1.0-py3-none-any.whl
Upload date: Jul 23, 2025
Size: 15.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.2

File hashes

Hashes for aiva_sample_processor-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`0e4320357f88b75066f1dda8727fb887ce28c274ce1ffc9ab90da42dea584616`
MD5	`56374db5be9705b054c70d45fb673804`
BLAKE2b-256	`dc5d48300c9aa81ba40593b0ba581cd5cbd3e7719fe88869d3bf150cd99dcfe2`
