AGVD Variant Query Command Line Tool

AGVD Variant Query Tool

The AGVD Variant Query Tool is a command-line utility for querying variant information against the African Genome Variation Database (AGVD). It supports input from VCF, CSV, TSV, or Excel files and provides threshold-based filtering and clustering of variants using AGVD's GraphQL API.


🚀 Features

  • Supports VCF, CSV, TSV, and Excel input formats
  • Accepts both rsID and CHR_POS_REF_ALT variant formats
  • Submits queries in batches for improved performance
  • Optional local caching for repeated queries
  • Dry-run mode for validation without querying
  • Exports enriched results and JSON summary
  • Multithreaded for faster processing
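The batching mentioned above can be illustrated with a minimal sketch. It is not the tool's actual implementation: the helper name `iter_batches` is hypothetical, and only the default size of 1000 comes from the `--BATCH` option documented below.

```python
from typing import Iterator, List, Sequence


def iter_batches(variant_ids: Sequence[str], batch_size: int = 1000) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `batch_size` identifiers.

    Hypothetical helper for illustration; the real tool may batch differently.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(variant_ids), batch_size):
        yield list(variant_ids[start:start + batch_size])


# 2501 made-up identifiers, split with the default batch size of 1000.
ids = [f"rs{n}" for n in range(1, 2502)]
batches = list(iter_batches(ids))
print([len(b) for b in batches])  # [1000, 1000, 501]
```

Each batch would then be sent as one API request, so a 2501-variant input needs three requests instead of 2501.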

📦 Requirements

  • Python 3.7+
  • Dependencies (installed via pip install -r requirements.txt):
    • pandas
    • tqdm
    • pysam
    • requests
    • openpyxl

🔧 Usage

python agvd \
  --KEY YOUR_AGVD_API_KEY \
  --INFILE path/to/input.vcf \
  --OUTPUT path/to/output.csv \
  --THRESHOLD 0.01

Optional Arguments:

Argument     Description
--BATCH      Batch size for API queries (default: 1000)
--COLUMN     Column name with variant IDs (CSV/TSV/Excel only)
--CHR        Chromosome column name
--POS        Position column name
--REF        Reference allele column name
--ALT        Alternate allele column name
--dry-run    Validates the file without submitting queries
--verbose    Enables debug-level logging
--cache      Enables local query caching
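When the tool is driven from another Python script, the documented flags can be assembled into an argument list. The sketch below is one way to do that, not part of the package: `build_agvd_command` is a hypothetical helper, and it assumes the tool is launched as `python agvd`, as in the usage example above.

```python
import shlex
from typing import List, Optional


def build_agvd_command(
    key: str,
    infile: str,
    output: str,
    threshold: float,
    batch: Optional[int] = None,
    dry_run: bool = False,
    verbose: bool = False,
    cache: bool = False,
) -> List[str]:
    """Assemble an argument list using only the flags documented above."""
    cmd = [
        "python", "agvd",
        "--KEY", key,
        "--INFILE", infile,
        "--OUTPUT", output,
        "--THRESHOLD", str(threshold),
    ]
    if batch is not None:
        cmd += ["--BATCH", str(batch)]
    # Boolean switches are appended only when requested.
    if dry_run:
        cmd.append("--dry-run")
    if verbose:
        cmd.append("--verbose")
    if cache:
        cmd.append("--cache")
    return cmd


cmd = build_agvd_command(
    "YOUR_AGVD_API_KEY", "path/to/input.vcf", "path/to/output.csv", 0.01,
    batch=500, dry_run=True,
)
# Print a shell-safe rendering of the command for inspection.
print(" ".join(shlex.quote(part) for part in cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; passing a list rather than a single string avoids shell quoting problems with paths that contain spaces.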

📂 Input Format Examples

VCF

Standard .vcf file with #CHROM, POS, REF, and ALT fields.

CSV/TSV/Excel

Either:

  • Single column with rsID or CHR_POS_REF_ALT format
  • Separate columns for --CHR, --POS, --REF, --ALT
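The two identifier styles can be told apart with a pair of regular expressions. This is a minimal sketch under stated assumptions, not the tool's own parser: the function names are hypothetical, and the accepted chromosome names (1-22, X, Y, M/MT, optional "chr" prefix) and allele alphabet (A, C, G, T, N) are assumptions rather than documented AGVD rules.

```python
import re
from typing import Optional

# rsID: "rs" followed by digits, e.g. rs12345.
RSID_RE = re.compile(r"^rs\d+$", re.IGNORECASE)

# CHR_POS_REF_ALT, e.g. 1_12345_A_G (assumed chromosome and allele alphabets).
CPRA_RE = re.compile(
    r"^(?:chr)?(\d{1,2}|X|Y|MT?)_(\d+)_([ACGTN]+)_([ACGTN]+)$",
    re.IGNORECASE,
)


def classify_variant_id(value: str) -> Optional[str]:
    """Return "rsid", "chr_pos_ref_alt", or None for an unrecognized value."""
    text = value.strip()
    if RSID_RE.match(text):
        return "rsid"
    if CPRA_RE.match(text):
        return "chr_pos_ref_alt"
    return None


def build_variant_id(chrom: str, pos: int, ref: str, alt: str) -> str:
    """Join separate CHR, POS, REF, ALT column values into one identifier."""
    return f"{chrom}_{int(pos)}_{str(ref).upper()}_{str(alt).upper()}"


for example in ["rs12345", "1_12345_A_G", "chrX_100_AT_A", "not-a-variant"]:
    print(example, "->", classify_variant_id(example))
```

`build_variant_id` covers the second input layout, where the four fields arrive in separate columns named by `--CHR`, `--POS`, `--REF`, and `--ALT`.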

🧪 Output

  • A file containing the original input plus the following columns:
    • AGVDCUTOFF: status based on MAF threshold
    • African_MAF: MAF value
    • <Cluster>_MAF: MAF per population cluster
  • A _summary.json with success/failure statistics
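The idea of a threshold-based status can be sketched as a simple comparison of a MAF value against the `--THRESHOLD` setting. The labels below are hypothetical placeholders; the actual values the tool writes to the AGVDCUTOFF column, and whether the boundary is inclusive, are not documented here.

```python
from typing import Optional


def cutoff_status(maf: Optional[float], threshold: float) -> str:
    """Compare a minor allele frequency with a threshold.

    The returned labels are illustrative only, not the tool's real output.
    """
    if maf is None:
        # No frequency was returned for this variant.
        return "NOT_FOUND"
    if maf < threshold:
        return "BELOW_THRESHOLD"
    return "AT_OR_ABOVE_THRESHOLD"


for maf in (0.002, 0.01, 0.25, None):
    print(maf, "->", cutoff_status(maf, threshold=0.01))
```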

🛠 Development

To test locally:

python agvd \
  -k test_key \
  -i examples/test.csv \
  -o out.csv \
  -t 0.05 \
  --verbose

To profile performance:

python -m cProfile agvd ...

🧾 License

MIT License © 2025 AGVD Team


📬 Contact

For support or questions, please contact: agvd@afrigen-d.org
