package to run a genotype quality control pipeline

IDEAL-GENOM

IDEAL-GENOM is a comprehensive Python package for automated, reproducible analysis of human genotype data. It currently implements three pipelines: genomic quality control (QC) for case/control studies, processing of VCF files after imputation, and genome-wide association studies (GWAS). It builds on years of research at CGE Tübingen, wraps PLINK 1.9/2.0, GCTA and bcftools, and provides rich reporting and visualizations.

Key Features

  • Sample QC: Automated sample-level filtering.
  • Ancestry QC: Detection of ancestry outliers, tailored for homogeneous populations and based on 1000 Genomes (1KG) data.
  • Variant QC: Automated variant-level (SNPs) filtering.
  • VCF Processing: Filtering and harmonization of post-imputation VCF files, and conversion to PLINK 1.9 binaries.
  • GWAS: Association testing with a generalized linear model (GLM) or a generalized linear mixed model (GLMM), plus identification of top hits.
  • Visualization: QC steps are complemented with high-quality plots for reporting. Population structure is visualized with dimensionality reduction algorithms such as Uniform Manifold Approximation and Projection (UMAP) and t-SNE. Plotting functions are also provided for reporting GWAS summary statistics.
  • Flexible configuration: Modular pipeline steps whose configuration is based on YAML files.
  • CLI, Jupyter, and Docker: Run as a command-line tool, in notebooks, or containerized.
  • Reproducible: All steps, parameters, and outputs are logged.
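
To illustrate the kind of variant-level filtering listed above, the sketch below builds a PLINK 1.9 command line from Python. It is not IDEAL-GENOM's own API: the function name `build_variant_qc_command`, the file prefixes, and the threshold values are assumptions made for this example. Only the PLINK flags (`--bfile`, `--geno`, `--maf`, `--hwe`, `--make-bed`, `--out`) are standard PLINK 1.9 options.

```python
import shlex
from typing import List


def build_variant_qc_command(
    input_prefix: str,
    output_prefix: str,
    max_missing: float = 0.02,  # illustrative threshold, not a package default
    min_maf: float = 0.01,      # illustrative threshold, not a package default
    hwe_p: float = 1e-6,        # illustrative threshold, not a package default
) -> List[str]:
    """Return a PLINK 1.9 argument list for basic variant-level filtering."""
    return [
        "plink",
        "--bfile", input_prefix,         # input .bed/.bim/.fam prefix
        "--geno", str(max_missing),      # drop variants with high missingness
        "--maf", str(min_maf),           # drop rare variants
        "--hwe", str(hwe_p),             # drop variants failing the HWE test
        "--make-bed",                    # write filtered PLINK binaries
        "--out", output_prefix,
    ]


if __name__ == "__main__":
    # "study" and "study_variant_qc" are hypothetical file prefixes.
    cmd = build_variant_qc_command("study", "study_variant_qc")
    print(shlex.join(cmd))
```

On a machine where PLINK is installed, the returned list could be passed to `subprocess.run(cmd, check=True)`.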

Installation

You can install IDEAL-GENOM using pip:

pip install ideal-genom

Or clone the repository and install locally:

git clone https://github.com/LuisGiraldo86/IDEAL-GENOM.git
cd IDEAL-GENOM
pip install .
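
After either installation route, a quick way to confirm that the package is visible to the current Python environment is to query its distribution metadata. The sketch below uses only the standard library. The helper name `installed_version` is hypothetical, and the distribution name `ideal-genom` is taken from the `pip install` command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str = "ideal-genom") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(f"ideal-genom: {found or 'not installed'}")
```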

For Docker usage:

docker build -t ideal-genom .
docker run -it ideal-genom

Installed Genomic Tools in Docker

The IDEAL-GENOM Docker image comes pre-installed with the following genomic analysis tools:

  • PLINK 1.9: Version 20231211
  • PLINK 2.0: Version 20240105 (AVX2 build)
  • GCTA: Version 1.95.0 (Linux x86_64)
  • BCFtools: Version 1.23

These tools are available in the container's PATH and can be used directly in your pipeline steps or custom scripts. Example usage inside the container:

plink --help
plink2 --help
gcta64 --help
bcftools --help

You can run these commands interactively by starting a shell in the container:

docker run -it ideal-genom /bin/bash

Bundling pinned tool versions in the image keeps genomic analysis workflows reproducible and ready to use.
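
When running outside the container, or when checking a custom image, it can help to confirm that the four executables are on the PATH before starting a pipeline. The following is a minimal sketch using the standard library. The helper name `locate_tools` is hypothetical, and the executable names are the ones used in the commands above.

```python
import shutil
from typing import Dict, Iterable, Optional

# Executable names as invoked in the container examples above.
TOOLS = ("plink", "plink2", "gcta64", "bcftools")


def locate_tools(names: Iterable[str] = TOOLS) -> Dict[str, Optional[str]]:
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in locate_tools().items():
        print(f"{name}: {path or 'NOT FOUND'}")
```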
