GWASStudio: A Tool for Genomic Data Management
Overview
GWASStudio is a command-line tool for efficient storage, retrieval, and querying of genomic summary statistics. It provides a high-performance infrastructure for handling and analyzing large-scale GWAS and QTL datasets, and supports exploration across datasets.
Core Purpose
GWASStudio provides a unified interface across the CDH infrastructure, handling the ingestion, storage, querying and export of genomic data using high-performance technologies.
Key Functionalities
GWASStudio provides three key functionalities:
1. Data Ingestion
- Data Ingestion: Imports summary statistics and their associated metadata.
- Support for Multiple Storage Options: Works with both local filesystems and cloud storage (S3).
2. Data Querying
- Flexible Search: Enables searching metadata using template files.
3. Data Export
- Selective Export: Extracts subsets of the data and their associated metadata by genomic region or SNP, or exports the entire dataset.
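The querying and export ideas above can be illustrated with a minimal sketch in plain Python. This is not GWASStudio's implementation or API: the `Variant` schema, the dictionary-based template, and the function names `matches_template`, `select_region`, and `select_snps` are all hypothetical, and the data are toy values invented for illustration. The sketch only shows what "search metadata by template" and "export by genomic region or SNP" mean in practice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """One row of summary statistics (hypothetical, simplified schema)."""
    snp_id: str
    chrom: str
    pos: int
    beta: float
    pvalue: float


def matches_template(metadata: dict, template: dict) -> bool:
    """Return True if every key/value pair in the template is present in the metadata."""
    return all(metadata.get(key) == value for key, value in template.items())


def select_region(variants, chrom, start, end):
    """Keep variants on `chrom` whose position lies in [start, end], inclusive."""
    return [v for v in variants if v.chrom == chrom and start <= v.pos <= end]


def select_snps(variants, snp_ids):
    """Keep variants whose identifier is in `snp_ids`, preserving input order."""
    wanted = set(snp_ids)
    return [v for v in variants if v.snp_id in wanted]


# Toy metadata records and variants (invented for illustration only).
datasets = [
    {"project": "demo", "trait": "height", "build": "GRCh38"},
    {"project": "demo", "trait": "bmi", "build": "GRCh37"},
]
variants = [
    Variant("rs1", "1", 1_000, 0.12, 1e-8),
    Variant("rs2", "1", 5_000, -0.03, 0.4),
    Variant("rs3", "2", 1_500, 0.07, 2e-5),
]

# Metadata search: a template is a set of field values that a record must match.
hits = [d for d in datasets if matches_template(d, {"trait": "height"})]

# Selective export: by genomic region, or by an explicit list of SNP identifiers.
region = select_region(variants, "1", 500, 2_000)
by_id = select_snps(variants, ["rs2", "rs3"])
```

In the real tool these operations run against the storage and metadata back ends rather than in-memory lists; refer to the documentation for the actual commands and template format.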
Technical Architecture
GWASStudio leverages several advanced technologies:
- TileDB Embedded: A high-performance array storage engine that enables efficient storage and retrieval of genomic data.
- MongoDB: A flexible, scalable NoSQL database used for storing and querying metadata associated with genomic datasets.
- Dask: Provides distributed computing capabilities for processing large datasets.
- Python Ecosystem: Built on Python with libraries such as Click/Cloup for the command-line interface, Pandas for data manipulation, and various genomics-specific tools.
Installation
For detailed installation instructions, please refer to the documentation at https://ht-diva.github.io/gwasstudio/
Usage
For detailed usage instructions, please refer to the documentation, and see the cli_test script for practical examples.
Citation
Example files are derived from:
The variant call format provides efficient and robust storage of GWAS summary statistics. Matthew Lyon, Shea J Andrews, Ben Elsworth, Tom R Gaunt, Gibran Hemani, Edoardo Marcora. bioRxiv 2020.05.29.115824; doi: https://doi.org/10.1101/2020.05.29.115824
File details
Details for the file gwasstudio-2.12.2.tar.gz.
File metadata
- Download URL: gwasstudio-2.12.2.tar.gz
- Upload date:
- Size: 38.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 87895f748646082b2261eee3eb89fe0929e2ed96a2064456bc64dbc67be09d32 |
| MD5 | b9fa903b1270b14e6903cbb98ff6799d |
| BLAKE2b-256 | 09b3351cda8f58b7a893e146e8ce0edaf714ded9a70ac633f5215ee0ae7e4ef9 |
Provenance
The following attestation bundles were made for gwasstudio-2.12.2.tar.gz:
Publisher: release.yml on ht-diva/gwasstudio
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: gwasstudio-2.12.2.tar.gz
  - Subject digest: 87895f748646082b2261eee3eb89fe0929e2ed96a2064456bc64dbc67be09d32
- Sigstore transparency entry: 730745546
- Sigstore integration time:
- Permalink: ht-diva/gwasstudio@892fb66a2876ac5300b7f476d92325397b785fb6
- Branch / Tag: refs/tags/v2.12.2
- Owner: https://github.com/ht-diva
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@892fb66a2876ac5300b7f476d92325397b785fb6
- Trigger Event: push
File details
Details for the file gwasstudio-2.12.2-py3-none-any.whl.
File metadata
- Download URL: gwasstudio-2.12.2-py3-none-any.whl
- Upload date:
- Size: 48.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b2855a274deef8ae19acbbdffd68e7e9dd2c86c0a58be5311030c510f0f03fa8 |
| MD5 | 44f9fa4ea6f812351a40e4a5ae31a501 |
| BLAKE2b-256 | fa0d87531be572de153949b2724364cde1d09ed94d1df100e15acf9ef23824e6 |
Provenance
The following attestation bundles were made for gwasstudio-2.12.2-py3-none-any.whl:
Publisher: release.yml on ht-diva/gwasstudio
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: gwasstudio-2.12.2-py3-none-any.whl
  - Subject digest: b2855a274deef8ae19acbbdffd68e7e9dd2c86c0a58be5311030c510f0f03fa8
- Sigstore transparency entry: 730745553
- Sigstore integration time:
- Permalink: ht-diva/gwasstudio@892fb66a2876ac5300b7f476d92325397b785fb6
- Branch / Tag: refs/tags/v2.12.2
- Owner: https://github.com/ht-diva
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@892fb66a2876ac5300b7f476d92325397b785fb6
- Trigger Event: push