package to run a genotype quality control pipeline
IDEAL-GENOM
IDEAL-GENOM is a comprehensive Python package for automated, reproducible analysis of human genotype data. It currently implements three pipelines: genomic quality control (QC) for case/control studies, processing of VCF files after imputation, and genome-wide association studies (GWAS). It packages years of research at CGE Tübingen, leverages PLINK 1.9/2.0, GCTA and bcftools, and provides rich reporting and visualizations.
Key Features
- Sample QC: Automated sample-level filtering.
- Ancestry QC: Detection of ancestry outliers, tailored for homogeneous populations and based on 1000 Genomes (1KG) data.
- Variant QC: Automated variant-level (SNPs) filtering.
- VCF Processing: Filtering and harmonization of post-imputation VCF files, followed by conversion to PLINK 1.9 binaries.
- GWAS: Association testing with Generalized Linear Models (GLM) and Generalized Linear Mixed Models (GLMM), plus identification of top hits.
- Visualization: QC steps are complemented with high-quality plots for reporting. Population structure is visualized with dimensionality reduction algorithms such as Uniform Manifold Approximation and Projection (UMAP) and t-SNE. Visualization functions for reporting GWAS summary statistics are also included.
- Flexible configuration: Modular pipeline steps whose configuration is based on YAML files.
- CLI, Jupyter, and Docker: Run as a command-line tool, in notebooks, or in a container.
- Reproducible: All steps, parameters, and outputs are logged.
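The feature list does not spell out which filters the variant QC step applies. As a rough illustration of what variant-level filtering commonly involves, the following is a minimal sketch that assumes two criteria, call rate and minor allele frequency (MAF). The function names, the genotype coding and both thresholds are hypothetical and are not taken from IDEAL-GENOM, which delegates the real work to PLINK.

```python
from typing import Optional, Sequence

# Assumed coding: number of alternate alleles (0, 1 or 2); None marks a missing call.
Genotype = Optional[int]


def call_rate(genotypes: Sequence[Genotype]) -> float:
    """Fraction of samples with a non-missing genotype call."""
    if not genotypes:
        return 0.0
    called = sum(g is not None for g in genotypes)
    return called / len(genotypes)


def minor_allele_frequency(genotypes: Sequence[Genotype]) -> float:
    """Minor allele frequency computed from the non-missing calls only."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def passes_variant_qc(
    genotypes: Sequence[Genotype],
    min_call_rate: float = 0.98,  # hypothetical threshold
    min_maf: float = 0.01,        # hypothetical threshold
) -> bool:
    """Keep a variant only if it clears both illustrative thresholds."""
    return (
        call_rate(genotypes) >= min_call_rate
        and minor_allele_frequency(genotypes) >= min_maf
    )


# Toy data: three variants genotyped in ten samples.
toy_variants = {
    "rs_common": [0, 1, 2, 1, 0, 1, 1, 0, 2, 1],
    "rs_monomorphic": [0] * 10,
    "rs_low_call_rate": [0, 1, None, None, 2, 1, None, 0, 1, None],
}

kept = [name for name, calls in toy_variants.items() if passes_variant_qc(calls)]
print(kept)
```

In this toy data only `rs_common` survives: the monomorphic variant fails the MAF cutoff and the third variant has too many missing calls.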
Installation
You can install IDEAL-GENOM using pip:
```bash
pip install ideal-genom
```
Or clone the repository and install locally:
```bash
git clone https://github.com/LuisGiraldo86/IDEAL-GENOM.git
cd IDEAL-GENOM
pip install .
```
For Docker usage:
```bash
docker build -t ideal-genom .
docker run -it ideal-genom
```
Installed Genomic Tools in Docker
The IDEAL-GENOM Docker image comes pre-installed with the following genomic analysis tools:
- PLINK 1.9: Version 20231211
- PLINK 2.0: Version 20240105 (AVX2 build)
- GCTA: Version 1.95.0 (Linux x86_64)
- BCFtools: Version 1.23
These tools are available in the container's PATH and can be used directly in your pipeline steps or custom scripts. Example usage inside the container:
```bash
plink --help
plink2 --help
gcta64 --help
bcftools --help
```
You can run these commands interactively by starting a shell in the container:
```bash
docker run -it ideal-genom /bin/bash
```
Shipping these tools at fixed versions in the image keeps genomic analysis workflows reproducible and ready to use.
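Before launching a pipeline from a custom script, it can help to confirm that the four executables are actually reachable. The following is a minimal sketch using only the Python standard library. The executable names come from the example commands above, while the helper names `locate_tools` and `missing_tools` are hypothetical and not part of IDEAL-GENOM.

```python
import shutil

# Executable names as used in the example commands in this section.
TOOLS = ("plink", "plink2", "gcta64", "bcftools")


def locate_tools(names=TOOLS):
    """Map each executable name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(names=TOOLS):
    """Return the executable names that could not be found on PATH."""
    return [name for name, path in locate_tools(names).items() if path is None]


if __name__ == "__main__":
    for name, path in locate_tools().items():
        print(f"{name}: {path or 'NOT FOUND'}")
```

Inside the container every entry should resolve to a path. On a host without the tools installed, `missing_tools()` lists what still needs to be set up.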
File details
Details for the file ideal_genom-1.1.0.tar.gz.
File metadata
- Download URL: ideal_genom-1.1.0.tar.gz
- Upload date:
- Size: 145.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 985eaa0029a0762dd948b6d59de88427b9cdfe53f50a6845d65f1f51eceddf9f |
| MD5 | a329ceffd4b44ce3a38cf1eb97c0ab26 |
| BLAKE2b-256 | 5247eb097963318b2f82a554917141b8dcac3b58b331af0c4d497cb80536a653 |
File details
Details for the file ideal_genom-1.1.0-py3-none-any.whl.
File metadata
- Download URL: ideal_genom-1.1.0-py3-none-any.whl
- Upload date:
- Size: 166.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 33400b0b7a9ec712cbf3cf5fe83540dcdd4fe9c2c209d5c55c02dd045c135a8d |
| MD5 | cc0041b89459688836fcd442b255e82a |
| BLAKE2b-256 | 28607e3f0f30b4d799834e6456bc620cb881eefeb4369de3c251b95d51b3beb1 |
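The published digests can be used to check a manually downloaded file before installing it. The following is a minimal sketch using Python's standard `hashlib`. The file names and SHA256 values are copied from the file details above, while the helper names `sha256_of` and `verify` are hypothetical.

```python
import hashlib
from pathlib import Path

# SHA256 digests as listed in the file details for release 1.1.0.
EXPECTED_SHA256 = {
    "ideal_genom-1.1.0.tar.gz":
        "985eaa0029a0762dd948b6d59de88427b9cdfe53f50a6845d65f1f51eceddf9f",
    "ideal_genom-1.1.0-py3-none-any.whl":
        "33400b0b7a9ec712cbf3cf5fe83540dcdd4fe9c2c209d5c55c02dd045c135a8d",
}


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path):
    """Return True if the file's SHA256 matches the digest listed for its name."""
    name = Path(path).name
    if name not in EXPECTED_SHA256:
        raise KeyError(f"no published digest known for {name!r}")
    return sha256_of(path) == EXPECTED_SHA256[name]
```

For example, `verify("ideal_genom-1.1.0.tar.gz")` returns `True` only when the local file is byte-for-byte identical to the one whose digest is listed above.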