Data exploration app for large phenotypic datasets analyzed using mGWAS, originally developed for Scoary2

Project description

mGWAS data exploration app

Note: This tool was built for Scoary2!

Description

This is a simple, static HTML/JS data exploration app for browsing the results of mGWAS software, particularly for large phenotypic datasets.

Output

The app produces two types of HTML files that can be opened in any browser:

  • overview.html: A simple overview of all traits in the dataset.
  • trait.html: A more detailed view of a single trait.

The usage of this app is described on the Scoary2 wiki.

Installation

  1. Using pip: pip install mgwas-data-exploration-app
  2. Using docker: docker pull troder/scoary-2

How to prepare your data

Expected folder structure

The app expects the following folder structure:

.
└── workdir
    ├── summary.tsv
    ├── traits.tsv
    ├── tree.nwk
    ├── isolate_info.tsv (optional)
    └── traits
        ├── trait1
        │   ├── coverage-matrix.tsv
        │   ├── meta.json
        │   ├── result.tsv
        │   └── values.tsv
        ├── trait2
        │   └── ...
        └── ...
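
Before running the app, it can help to check that a folder matches the layout above. The following is a minimal sketch, not part of the package: `find_missing_files` is a hypothetical helper, and it assumes that every trait subfolder needs all four files shown for `trait1` (the layout above only marks `isolate_info.tsv` as optional).

```python
from pathlib import Path

# File names taken from the folder layout above; isolate_info.tsv is optional.
REQUIRED_TOP_LEVEL = ("summary.tsv", "traits.tsv", "tree.nwk")
# Assumption: each trait folder needs all four files listed for trait1.
REQUIRED_PER_TRAIT = ("coverage-matrix.tsv", "meta.json", "result.tsv", "values.tsv")


def find_missing_files(workdir):
    """Return the expected paths (relative to workdir) that do not exist.

    Hypothetical helper, not part of mgwas-data-exploration-app.
    """
    workdir = Path(workdir)
    missing = [
        workdir / name
        for name in REQUIRED_TOP_LEVEL
        if not (workdir / name).is_file()
    ]

    traits_dir = workdir / "traits"
    if not traits_dir.is_dir():
        # Without the traits folder there is nothing more to inspect.
        missing.append(traits_dir)
    else:
        for trait_dir in sorted(p for p in traits_dir.iterdir() if p.is_dir()):
            missing.extend(
                trait_dir / name
                for name in REQUIRED_PER_TRAIT
                if not (trait_dir / name).is_file()
            )

    return sorted(str(p.relative_to(workdir)) for p in missing)
```

Calling `find_missing_files("workdir")` returns an empty list when everything in the layout is present.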

Input arguments

  • summary_df: A table with the results of the mGWAS analysis. Rows: traits; columns: genes. (Separator: tab)
  • traits_df: A table with the metadata of the traits. Rows: traits; columns: metadata. (Separator: tab)
  • workdir: Folder where the mGWAS output must be located. It is expected to contain a folder 'traits' with one subfolder per trait.
  • is_numeric: Whether the data is numeric or binary.
  • app_config: A JSON file that overwrites the default app config. See the default config.json. (Optional)
  • distance_metric: The distance metric for the clustering of binary data. See the scipy documentation. (Binary data only; default: jaccard)
  • linkage_method: The linkage method for the clustering. One of [single, complete, average, weighted, ward, centroid, median]. (Default: ward)
  • optimal_ordering: Whether to use optimal ordering. See scipy.cluster.hierarchy.linkage (Default: True)
  • corr_scale: Whether to scale numeric data before clustering. (Numeric data only; default: True)
  • corr_method: The correlation method for numeric data. One of [pearson, kendall, spearman]. (Numeric data only; default: pearson)
  • dendrogram_x_scale: The x-axis scale for the dendrogram. One of [linear, squareroot, log, symlog, logit]. (Default: linear)
  • scores_x_scale: The x-axis scale for the scores plot. One of [linear, manhattan]. (Default: linear)
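
To give a feel for the default `distance_metric`, here is a small pure-Python sketch of the Jaccard dissimilarity between two binary vectors. It is only an illustration of the metric: according to the option description the app relies on scipy, and `jaccard_distance` below is a hypothetical stand-in. Returning 0.0 when both vectors are all zeros is an assumption made here to avoid dividing by zero.

```python
def jaccard_distance(u, v):
    """Jaccard dissimilarity between two equal-length binary vectors.

    Hypothetical illustration, not the app's implementation.
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    # Positions where at least one of the two vectors is set.
    either = sum(1 for a, b in zip(u, v) if a or b)
    # Positions where exactly one of the two vectors is set.
    differ = sum(1 for a, b in zip(u, v) if bool(a) != bool(b))
    # Assumption: two all-zero vectors are treated as identical.
    return differ / either if either else 0.0


profile_a = [1, 1, 0, 0]
profile_b = [1, 0, 1, 0]
# Three positions are set in at least one vector, two of them disagree.
distance = jaccard_distance(profile_a, profile_b)  # 2 / 3
```

Identical vectors get a distance of 0 and vectors with no shared positive positions get a distance of 1, so the metric ignores positions where both vectors are 0.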

Usage

Get help with mgwas-data-exploration-app --help or by reading the docstring of main.py.

Python

from mgwas_data_exploration_app.main import mgwas_app

mgwas_app(
    summary_df="summary.tsv",  # or a pandas.DataFrame
    traits_df="traits.tsv",  # or a pandas.DataFrame
    workdir="out",
    is_numeric=False,
    app_config="app_config.json",  # or dict
    distance_metric="jaccard",
    linkage_method="ward",
    optimal_ordering=True,
    corr_scale=True,
    corr_method="pearson",
    dendrogram_x_scale="linear",
    scores_x_scale="linear",
)

Command line

mgwas-data-exploration-app \
    --summary summary.tsv \
    --traits traits.tsv \
    --workdir out \
    --is-numeric False \
    --app-config None \
    --distance-metric jaccard \
    --linkage-method ward \
    --optimal-ordering True \
    --corr-scale True \
    --corr-method pearson \
    --dendrogram-x-scale linear \
    --scores-x-scale linear
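
When the command line is driven from a script, the call above can be assembled programmatically. This is a rough sketch under stated assumptions: `build_command` is a hypothetical helper, and it assumes each keyword maps to the flag of the same name with underscores replaced by dashes, as in the command above. Note that the flags for the two tables are `--summary` and `--traits`, not `--summary-df` and `--traits-df`.

```python
import shlex


def build_command(**options):
    """Build an argv list for mgwas-data-exploration-app from keyword options.

    Hypothetical helper: underscores become dashes and values are stringified.
    """
    argv = ["mgwas-data-exploration-app"]
    for name, value in options.items():
        argv += ["--" + name.replace("_", "-"), str(value)]
    return argv


cmd = build_command(
    summary="summary.tsv",
    traits="traits.tsv",
    workdir="out",
    is_numeric=False,
    linkage_method="ward",
)
# Show the command as a single shell-quoted string.
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` to execute it.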

Credits

This project is built using the following libraries:
