AGVD Variant Query Command Line Tool

AGVD Variant Query Tool

The AGVD Variant Query Tool is a command-line utility for querying variant information against the African Genome Variation Database (AGVD). It supports input from VCF, CSV, TSV, or Excel files and provides threshold-based filtering and clustering of variants using AGVD's GraphQL API.


🚀 Features

  • Supports VCF, CSV, TSV, and Excel input formats
  • Accepts both rsID and CHR_POS_REF_ALT variant formats
  • Submits queries in batches for improved performance
  • Optional local caching for repeated queries
  • Dry-run mode for validation without querying
  • Exports enriched results and JSON summary
  • Multithreaded for faster processing

📦 Requirements

  • Python 3.7+
  • Dependencies (installed via pip install -r requirements.txt):
    • pandas
    • tqdm
    • pysam
    • requests
    • openpyxl

🔧 Usage

python agvd \
  --KEY YOUR_AGVD_API_KEY \
  --INFILE path/to/input.vcf \
  --OUTPUT path/to/output.csv \
  --THRESHOLD 0.01

Optional Arguments:

Argument     Description
--BATCH      Batch size for API queries (default: 1000)
--COLUMN     Column name with variant IDs (CSV/TSV/Excel only)
--CHR        Chromosome column name
--POS        Position column name
--REF        Reference allele column name
--ALT        Alternate allele column name
--dry-run    Validates the file without submitting queries
--verbose    Enables debug-level logging
--cache      Enables local query caching
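
To illustrate what a batch size means in practice, here is a minimal sketch of splitting a list of variant IDs into groups of at most --BATCH entries. The function name iter_batches and the made-up rsIDs are assumptions for illustration only; this is not the tool's actual code, and the only detail taken from the table above is the default of 1000.

```python
from typing import Iterator, List, Sequence


def iter_batches(variant_ids: Sequence[str], batch_size: int = 1000) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size variant IDs.

    Hypothetical helper: the default of 1000 mirrors the documented
    --BATCH default, everything else is an illustrative assumption.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(variant_ids), batch_size):
        yield list(variant_ids[start:start + batch_size])


# 2501 made-up IDs, split with the default batch size
ids = [f"rs{n}" for n in range(1, 2502)]
sizes = [len(batch) for batch in iter_batches(ids)]
print(sizes)  # [1000, 1000, 501]
```

A smaller batch size means more requests with smaller payloads; a larger one means fewer, larger requests.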

📂 Input Format Examples

VCF

Standard .vcf file with #CHROM, POS, REF, and ALT fields.
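
As a rough illustration of how these four VCF fields map onto the CHR_POS_REF_ALT format, the sketch below turns one VCF data line into variant keys using only the standard library. The function name vcf_record_to_keys is hypothetical, and emitting one key per ALT allele for multi-allelic records is an assumption; the tool itself may handle such records differently.

```python
from typing import List


def vcf_record_to_keys(line: str) -> List[str]:
    """Return one CHR_POS_REF_ALT key per ALT allele in a VCF data line.

    Assumption: multi-allelic records (comma-separated ALT) are expanded
    into one key per allele. This is a sketch, not the tool's own parser.
    """
    if line.startswith("#"):
        return []  # header and meta lines carry no variant
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 5:
        raise ValueError("expected at least 5 tab-separated VCF columns")
    # The fixed VCF columns start with CHROM, POS, ID, REF, ALT
    chrom, pos, _id, ref, alt = fields[:5]
    return [f"{chrom}_{pos}_{ref}_{a}" for a in alt.split(",") if a != "."]


print(vcf_record_to_keys("1\t12345\trs1\tA\tG,T\t.\tPASS\t."))
# ['1_12345_A_G', '1_12345_A_T']
```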

CSV/TSV/Excel

Either:

  • Single column with rsID or CHR_POS_REF_ALT format
  • Separate columns for --CHR, --POS, --REF, --ALT
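
The two layouts above can be sketched as follows: one helper guesses whether a single-column value is an rsID or a CHR_POS_REF_ALT string, and another joins four separate columns into that string. Both function names, the regular expressions, and the column names in the example row are assumptions made for illustration; the tool's real validation rules may be stricter or looser.

```python
import re
from typing import Mapping

# Illustrative patterns only; not taken from the tool's source.
RSID_RE = re.compile(r"^rs\d+$", re.IGNORECASE)
KEY_RE = re.compile(r"^(?:chr)?[0-9A-Za-z]+_\d+_[ACGTN]+_[ACGTN]+$", re.IGNORECASE)


def classify_variant_id(value: str) -> str:
    """Return 'rsid', 'chr_pos_ref_alt', or 'unknown' for one cell value."""
    value = value.strip()
    if RSID_RE.match(value):
        return "rsid"
    if KEY_RE.match(value):
        return "chr_pos_ref_alt"
    return "unknown"


def build_key(row: Mapping[str, object], chr_col: str, pos_col: str,
              ref_col: str, alt_col: str) -> str:
    """Join four separate columns into a CHR_POS_REF_ALT string."""
    return f"{row[chr_col]}_{row[pos_col]}_{row[ref_col]}_{row[alt_col]}"


print(classify_variant_id("rs12345"))      # rsid
print(classify_variant_id("1_12345_A_G"))  # chr_pos_ref_alt

# Hypothetical column names, as they might be passed via --CHR, --POS, --REF, --ALT
row = {"chrom": "1", "position": 12345, "ref": "A", "alt": "G"}
print(build_key(row, "chrom", "position", "ref", "alt"))  # 1_12345_A_G
```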

🧪 Output

  • A file containing the original input plus the following columns:
    • AGVDCUTOFF: status based on MAF threshold
    • African_MAF: MAF value
    • <Cluster>_MAF: MAF per population cluster
  • A _summary.json file with success/failure statistics
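
To make the idea of a threshold-based status concrete, here is a minimal sketch of comparing a MAF value with the --THRESHOLD setting. The function name cutoff_status and all three labels are hypothetical; the actual values written to AGVDCUTOFF, and how a value exactly equal to the threshold is treated, are not documented here and may differ.

```python
from typing import Optional


def cutoff_status(maf: Optional[float], threshold: float) -> str:
    """Label a variant by comparing its MAF with the threshold.

    The three labels are hypothetical placeholders; AGVDCUTOFF may use
    other values, and the handling of maf == threshold is an assumption.
    """
    if maf is None:
        return "NOT_FOUND"  # no frequency available for this variant
    if maf < threshold:
        return "BELOW_THRESHOLD"
    return "AT_OR_ABOVE_THRESHOLD"


for maf in (0.004, 0.01, 0.23, None):
    print(maf, cutoff_status(maf, threshold=0.01))
```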

🛠 Development

To test locally:

python agvd \
  -k test_key \
  -i examples/test.csv \
  -o out.csv \
  -t 0.05 \
  --verbose

To profile performance:

python -m cProfile agvd ...

🧾 License

MIT License © 2025 AGVD Team


📬 Contact

For support or questions, please contact: agvd@afrigen-d.org
