A command-line tool for sex inference from genomic data using zygosity distributions

Zigo: Sex Checking by Zigosity Distributions

A command-line tool for sex inference from genomic data. It uses a distilled polynomial equation to predict genetic sex from SNP zygosity distributions, and can optionally compare its predictions with the sex recorded in a PED file.

Installation

We are working on publishing this package to PyPI. In the meantime, you can install the latest version directly from the repository using pip:

pip install git+https://github.com/AI-sandbox/zigo.git

Key Features

  • Genetic Sex Prediction: Predicts genetic sex from SNP zygosity distributions using a distilled polynomial equation
  • PED File Integration: Optional comparison with provided sex information from PED files (requires "Individual ID" and "Gender" columns)
  • Comprehensive Logging: Detailed logs of the analysis process

Usage

zigo -i INPUT_FILE -o OUTPUT_DIR [--ped PED_FILE]

Arguments

  • -i, --input: Path to the genomic input file (.vcf, .vcf.gz, .bed, .pgen)
  • -o, --output: Directory for saving results
  • --ped: (Optional) Path to the PED file containing 'Individual ID' and 'Gender' columns

Input Format Requirements

  • Genomic Data: Supports .bed and .pgen files, and includes a high-performance C-based reader for .vcf and .vcf.gz files.
  • PED File: Tab-separated file with at least two columns:
    • 'Individual ID': Sample identifiers matching those in the genomic data
    • 'Gender': Sex information coded as 1 (male) or 2 (female)
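As an illustration of the PED layout described above, here is a minimal Python sketch that writes a tab-separated table with the two required columns. The sample IDs, the `write_ped` helper, and the choice to reject codes other than 1 or 2 are assumptions made for this example; they are not part of Zigo.

```python
import csv
import io

# Hypothetical sample IDs and sex codes, for illustration only.
samples = [("SAMPLE_A", 1), ("SAMPLE_B", 2)]


def write_ped(rows, handle):
    """Write a tab-separated table with 'Individual ID' and 'Gender' columns."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["Individual ID", "Gender"])
    for sample_id, sex_code in rows:
        # The README documents only 1 (male) and 2 (female);
        # rejecting anything else is an assumption of this sketch.
        if sex_code not in (1, 2):
            raise ValueError(f"unexpected sex code for {sample_id}: {sex_code}")
        writer.writerow([sample_id, sex_code])


# Write to an in-memory buffer; pass an open file handle to write to disk.
buffer = io.StringIO()
write_ped(samples, buffer)
ped_text = buffer.getvalue()
print(ped_text)
```

The sample identifiers must match those in the genomic input file for the comparison to work.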

Output Files

  • results.sexcheck: Main results file with sex predictions, containing the following columns:

    • FID: Family ID (same as IID in this implementation)
    • IID: Individual ID
    • SID: Sample ID (same as IID in this implementation)
    • PEDSEX: Sex as provided in the PED file (1=male, 2=female, NA=not available)
    • SNPSEX: Predicted sex from SNP data (1=male, 2=female)
    • STATUS: Comparison result ("OK" if PEDSEX matches SNPSEX, "PROBLEM" if they differ)
    • F: Probability score for the predicted sex label

    Columns related to the Y chromosome are omitted.

  • results.nosex: List of samples with no sex information

  • sexcheck.log: Detailed log of the analysis process
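The columns listed above can be post-processed with a few lines of Python. The sketch below assumes that results.sexcheck is whitespace-delimited with a header row in the column order given above; the README does not state the delimiter, so check a real output file first. The helper names and the example rows are made up for illustration.

```python
def read_sexcheck(text):
    """Parse whitespace-delimited .sexcheck text into a list of dicts.

    Assumes the first non-empty line is a header naming the columns.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def mismatches(records):
    """Return the IIDs whose STATUS column is 'PROBLEM'."""
    return [record["IID"] for record in records if record["STATUS"] == "PROBLEM"]


# Made-up rows for illustration; real values come from a Zigo run.
example = """\
FID IID SID PEDSEX SNPSEX STATUS F
S1 S1 S1 1 1 OK 0.98
S2 S2 S2 2 1 PROBLEM 0.91
"""

records = read_sexcheck(example)
flagged = mismatches(records)
print(flagged)
```

To use it on actual output, read the contents of results.sexcheck from the output directory and pass the text to `read_sexcheck`.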

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Cite

If you use Zigo in your research, please cite the source code.

📝 Note: Preprint will be published soon! Once published, we will update this section with the official citation.

In the meantime, please cite the repository as follows:

BibTeX:

@misc{zigo_2026,
  author = {Oscar Molina-Sedano and Daniel Mas Montserrat and Alexander Ioannidis},
  title = {Zigo: Sex Checking by Zigosity Distributions},
  year = {2026},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/AI-sandbox/sex-check}}
}
