🧘 Shanti

create SHarable, interactive, stANdalone html dashboard from Tabular proteomIcs data

Shanti is a Python library for creating interactive, standalone HTML dashboards from proteomics data (specifically, tabular data in Excel format). The package simplifies the creation of volcano plots and histograms. It uses the Bokeh library in the background to generate an HTML file that contains interactive plots and tables. The HTML file can be opened in a browser (Firefox, Chrome, Safari, Edge) and shared with colleagues, who can explore the proteomics data without any server or software installation. The tool is relevant for Mass Spectrometry Core Facilities that create proteomics reports for clients. It was conceptualized, designed, built, documented and published by Nara Marella at the Molecular Discovery Platform of CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna.

Table of Contents

Installation
Key Components
Input Files Required
Usage
Final Output
FAQ
For Developers
Cite

📦 Installation

You can install the package with pip:

pip install shanti

🚀 Key Components

1. `load_data()`

Loads proteomics data from an Excel file, processes it, and prepares it for visualization. The volcano plot includes threshold curves for significance, calculated with the threshold function from the CurveCurator package. Default parameters are already set in the example snippet below; only one parameter, `fc_lim`, needs to be adjusted frequently.

with basic parameters

source = load_data(
    file_path = "Shanti_Test_Proteins.xlsx",
    fc_lim = 0.25,
    l2fc_col = "KO_WT_l2FC",
    pAdj_col = "KO_WT_pAdj"
)

`file_path` is the path to the file containing protein-level data. See Shanti_Test_Proteins.xlsx for the expected format. The column UniProtID is mandatory and its name is hardcoded; other column names are flexible. ⚠️ Avoid special characters and blank spaces in the column names of the input file, because the output HTML file does not parse such column names correctly.
`fc_lim` is the threshold for the significance curve. Although a default value is defined, this parameter should be adjusted manually for each new run because every input has its own data distribution. After trial and error, 0.25 was selected as the best value for column KO_WT_l2FC in the demo dataset (Shanti_Test_Proteins.xlsx).
`l2fc_col` is the name of the column containing log2 fold change values. In the demo dataset (Shanti_Test_Proteins.xlsx), column KO_WT_l2FC was used.
`pAdj_col` is the name of the column containing adjusted p-values. In the demo dataset (Shanti_Test_Proteins.xlsx), column KO_WT_pAdj was used.
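Because `fc_lim` is tuned by trial and error, it can help to see how many proteins lie beyond each candidate cut-off before rerunning `load_data()`. The sketch below is a plain-Python illustration and not part of Shanti: `count_beyond` is a hypothetical helper, the input values are made up, and it only compares absolute log2 fold changes against each candidate, ignoring the p-value dimension of the threshold curve.

```python
def count_beyond(l2fc_values, candidates):
    """Return {candidate: number of values with |log2 fold change| > candidate}."""
    return {
        limit: sum(1 for value in l2fc_values if abs(value) > limit)
        for limit in candidates
    }


# Hypothetical log2 fold changes; in practice, read them from the l2fc_col column.
example_l2fc = [-1.8, -0.4, -0.1, 0.05, 0.2, 0.3, 0.9, 2.1]
summary = count_beyond(example_l2fc, candidates=[0.1, 0.25, 0.5, 1.0])
for limit, count in summary.items():
    print(f"fc_lim={limit}: {count} proteins beyond the cut-off")
```

A sharp drop in the count between two neighbouring candidates is a reasonable place to start the manual adjustment.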

with advanced parameters

To fine-tune the threshold curve, additional parameters such as `alpha`, `dfn`, `dfd`, `loc`, `scale` and `two_sided` can be adjusted:

source = load_data(
    file_path = "Shanti_Test_Proteins.xlsx",
    sheet_name=0,
    alpha = 0.05,
    dfn = 10,
    dfd = 10,
    loc = 0,
    scale = 1,
    two_sided=False,
    fc_lim = 0.25,
    l2fc_col = "KO_WT_l2FC",
    pAdj_col = "KO_WT_pAdj"
)

2. `make_histogram()`

Creates a histogram for one sample group; call it once for the treated group and once for the control group. The number of bins is set to 20 and can be changed only in the source code.

hist, data_filtered, bin_edges_log, bottoms, bar_height = make_histogram(
    source=source,
    hist_col="AN_KO_Mean",
    title="KO dTAG",
    x_axis_label="protein count"
)

`source` is the output of the `load_data()` function.
`hist_col` is the name of the column containing abundances (or normalized abundances). The numerator of the fold change ratio is usually plotted as the first histogram (name the result hist1 instead of hist, for example). In the example dataset Shanti_Test_Proteins.xlsx, column AN_KO_Mean is used for the first histogram; KO stands for KnockOut, the treatment group. The denominator of the fold change ratio is usually plotted as the second histogram (hist2, for example). In the example dataset, column AN_WT_Mean is used for the second histogram; WT stands for WildType, the control group.
`title` is the string displayed above the histogram in the output HTML file. The default is no title.
`x_axis_label` is empty by default, but supplying a string is recommended.
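The output name `bin_edges_log` suggests that abundances are binned on a logarithmic scale. The following is a minimal sketch of that idea in plain Python, assuming log10 spacing and 20 equal-width bins; it is not Shanti's actual implementation, and `log_histogram` is a hypothetical name.

```python
import math


def log_histogram(values, n_bins=20):
    """Bin positive values into n_bins equal-width bins on a log10 scale."""
    logs = [math.log10(v) for v in values if v > 0]  # non-positive values are skipped
    if not logs:
        raise ValueError("need at least one positive value")
    lo, hi = min(logs), max(logs)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width when all values are equal
    edges = [lo + i * width for i in range(n_bins + 1)]
    counts = [0] * n_bins
    for x in logs:
        # The maximum value falls on the last edge, so clamp it into the last bin.
        index = min(int((x - lo) / width), n_bins - 1)
        counts[index] += 1
    return edges, counts


edges, counts = log_histogram([1, 10, 100, 1000], n_bins=3)
print(edges, counts)
```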

3. `create_interactive_dashboard()`

Generates the final output HTML file.

create_interactive_dashboard(
    source,
    l2fc_col="KO_WT_l2FC",
    pAdj_col="KO_WT_pAdj",
    volcano_title="KO dTAG vs DMSO Comparison",
    hist1_col="AN_KO_Mean",
    hist2_col="AN_WT_Mean",
    table_columns=["UniProtID", "Gene", "Description", "Peptides", "PeptidesU", "PSMs"],
    peptides_file="Shanti_Test_PeptideGroups.xlsx",
    peptide_columns=["UniProtID", "Sequence", "ProteinGroups", "Proteins", "PSMs", "Position", "MissedCleavages", "QuanInfo"],
    output_path="dashboard.html",
    plot2=hist1,
    plot3=hist2,
    hist1_data_filtered=hist1_data_filtered,
    hist2_data_filtered=hist2_data_filtered,
    hist1_bin_edges_log=hist1_bin_edges_log,
    hist2_bin_edges_log=hist2_bin_edges_log,
    hist1_bottoms=hist1_bottoms,
    hist2_bottoms=hist2_bottoms,
    hist1_bar_height=hist1_bar_height,
    hist2_bar_height=hist2_bar_height,
)

`source` is the output of the `load_data()` function.
`l2fc_col` and `pAdj_col` are explained under `load_data()`.
`volcano_title` is the string displayed above the volcano plot in the HTML file. The default is empty.
`table_columns` is the list of protein columns to display. The number of displayed columns is fixed at 6 because of the HTML page dimensions. In the test example, Shanti_Test_Proteins.xlsx, the columns UniProtID, Gene, Description, Peptides, PeptidesU and PSMs were selected.
`peptides_file` is the path to the file containing peptide-level data. The column name UniProtID is mandatory and hardcoded; other column names are flexible. See Shanti_Test_PeptideGroups.xlsx for the format.
`peptide_columns` is the list of peptide columns to display in the HTML file. The columns UniProtID, Sequence, ProteinGroups, Proteins, PSMs, Position, MissedCleavages and QuanInfo from Shanti_Test_PeptideGroups.xlsx were used to generate the demo HTML file. The list is limited to 8 columns because of the HTML page dimensions. Column widths can be adjusted in the source code but are not exposed as function arguments.
`output_path` is the filename of the HTML file. It defaults to dashboard.html.
`hist1_col` and `hist2_col` are explained under `make_histogram()`.
`plot2`, `plot3`, `hist1_data_filtered`, `hist2_data_filtered`, `hist1_bin_edges_log`, `hist2_bin_edges_log`, `hist1_bottoms`, `hist2_bottoms`, `hist1_bar_height` and `hist2_bar_height` are the outputs of the two `make_histogram()` calls.
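Passing ten histogram-related arguments by hand is easy to get wrong. One way to reduce the typing, sketched here under the assumption that `make_histogram()` returns its five values in the order shown earlier, is to map the two result tuples onto the keyword names. `histogram_kwargs` is a hypothetical helper, not part of Shanti, and the placeholder strings stand in for real histogram outputs.

```python
HIST_FIELDS = ("data_filtered", "bin_edges_log", "bottoms", "bar_height")


def histogram_kwargs(hist1_out, hist2_out):
    """Map two make_histogram() 5-tuples onto the dashboard keyword names."""
    kwargs = {}
    for prefix, plot_key, output in (
        ("hist1", "plot2", hist1_out),
        ("hist2", "plot3", hist2_out),
    ):
        plot, *rest = output
        if len(rest) != len(HIST_FIELDS):
            raise ValueError(f"expected 5 values for {prefix}, got {len(rest) + 1}")
        kwargs[plot_key] = plot
        for field, value in zip(HIST_FIELDS, rest):
            kwargs[f"{prefix}_{field}"] = value
    return kwargs


# Placeholder tuples stand in for real make_histogram() results.
demo = histogram_kwargs(("p1", "d1", "e1", "b1", "h1"), ("p2", "d2", "e2", "b2", "h2"))
print(sorted(demo))
```

With real outputs, the resulting dictionary could be unpacked into the call with `**histogram_kwargs(out1, out2)` alongside the other arguments.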

`DataProcessor()`

Internal class that handles:

Statistical calculations specifically for protein level data
Classification of volcano data points based on significance thresholds
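To illustrate what classifying volcano points against a threshold curve means, here is a generic sketch. It deliberately uses a simple hyperbolic cut-off rather than the CurveCurator threshold function that Shanti relies on, and `classify_point` and its `curvature` parameter are hypothetical names introduced only for this example.

```python
import math


def classify_point(l2fc, p_adj, fc_lim=0.25, alpha=0.05, curvature=0.1):
    """Label a protein 'up', 'down' or 'ns' (not significant).

    A point is significant when |l2fc| > fc_lim and -log10(p_adj) lies on or
    above the curve -log10(alpha) + curvature / (|l2fc| - fc_lim).
    """
    if abs(l2fc) <= fc_lim:
        return "ns"
    y = math.inf if p_adj <= 0 else -math.log10(p_adj)
    y_min = -math.log10(alpha) + curvature / (abs(l2fc) - fc_lim)
    if y < y_min:
        return "ns"
    return "up" if l2fc > 0 else "down"


for l2fc, p_adj in [(2.0, 1e-6), (-2.0, 1e-6), (0.3, 0.01), (0.1, 1e-6)]:
    print(l2fc, p_adj, classify_point(l2fc, p_adj))
```

The third point passes both the fold change and p-value cut-offs on their own but still falls under the curve, which is the behaviour that distinguishes a curved threshold from two straight lines.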

📂 Input Files Required

Protein data Excel file (e.g. Shanti_Test_Proteins.xlsx)
Peptide data Excel file (e.g. Shanti_Test_PeptideGroups.xlsx)
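Both files must contain a UniProtID column and should avoid spaces and special characters in column names. A quick pre-flight check on the header row could look like the sketch below. The rule "letters, digits and underscores only" is an assumption made for this example, since the README does not define which characters are safe, and `check_column_names` is a hypothetical helper.

```python
import re

REQUIRED_COLUMN = "UniProtID"
SAFE_NAME = re.compile(r"^[A-Za-z0-9_]+$")  # assumed definition of a "safe" name


def check_column_names(columns):
    """Return a list of human-readable problems found in the column names."""
    problems = []
    if REQUIRED_COLUMN not in columns:
        problems.append(f"missing mandatory column {REQUIRED_COLUMN!r}")
    for name in columns:
        if not SAFE_NAME.match(str(name)):
            problems.append(f"column {name!r} contains spaces or special characters")
    return problems


# Hypothetical header rows; in practice, take them from the Excel file.
print(check_column_names(["UniProtID", "Gene", "KO_WT_l2FC", "KO_WT_pAdj"]))
print(check_column_names(["Gene Name", "log2(FC)"]))
```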

🧪 Usage

⚠️ The `create_interactive_dashboard()` function fails in Jupyter notebooks because of an incompatibility with Bokeh. Instead, combine the `load_data()`, `make_histogram()` and `create_interactive_dashboard()` snippets in a Python script (for example, run.py) and execute it from the terminal.

# save as run.py

from shanti import load_data, make_histogram, create_interactive_dashboard

source = load_data(
    file_path = "Shanti_Test_Proteins.xlsx",
    fc_lim = 0.25,
    l2fc_col = "KO_WT_l2FC",
    pAdj_col = "KO_WT_pAdj"
)

hist1, hist1_data_filtered, hist1_bin_edges_log, hist1_bottoms, hist1_bar_height = make_histogram(
    source=source,
    hist_col="AN_KO_Mean",
    title="KO dTAG",
    x_axis_label="protein count"
)

hist2, hist2_data_filtered, hist2_bin_edges_log, hist2_bottoms, hist2_bar_height = make_histogram(
    source,
    hist_col="AN_WT_Mean",
    title="DMSO",
    x_axis_label="protein count"
)

create_interactive_dashboard(
    source,
    l2fc_col="KO_WT_l2FC",
    pAdj_col="KO_WT_pAdj",
    volcano_title="KO dTAG vs DMSO Comparison",
    hist1_col="AN_KO_Mean",
    hist2_col="AN_WT_Mean",
    table_columns=["UniProtID", "Gene", "Description", "Peptides", "PeptidesU", "PSMs"],
    peptides_file="Shanti_Test_PeptideGroups.xlsx",
    peptide_columns=["UniProtID", "Sequence", "ProteinGroups", "Proteins", "PSMs", "Position", "MissedCleavages", "QuanInfo"],
    output_path="dashboard.html",
    plot2=hist1,
    plot3=hist2,
    hist1_data_filtered=hist1_data_filtered,
    hist2_data_filtered=hist2_data_filtered,
    hist1_bin_edges_log=hist1_bin_edges_log,
    hist2_bin_edges_log=hist2_bin_edges_log,
    hist1_bottoms=hist1_bottoms,
    hist2_bottoms=hist2_bottoms,
    hist1_bar_height=hist1_bar_height,
    hist2_bar_height=hist2_bar_height,
)

python run.py
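If working from a notebook is unavoidable, one option is to launch the script in a separate Python process instead of calling the functions in the notebook kernel. This is a sketch using only the standard library; whether it sidesteps the notebook incompatibility has not been verified against Shanti, and `run_script` is a hypothetical helper.

```python
import subprocess
import sys


def run_script(path):
    """Run a Python script in a separate interpreter and return its exit code."""
    completed = subprocess.run(
        [sys.executable, path], capture_output=True, text=True
    )
    if completed.returncode != 0:
        # Show the error output so a failed run is not silent.
        print(completed.stderr)
    return completed.returncode
```

Calling `run_script("run.py")` returns 0 when the script finishes without errors.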

📊 Final Output

The result of run.py is a fully interactive HTML dashboard that can be opened in any modern browser. A demo HTML output file created with the test datasets is available here.

Volcano Plot showing log fold change vs p-value
Histograms comparing protein abundance distribution overlaid with selected proteins
Interactive tables of proteins and peptides
Ability to click/select proteins and see related peptides instantly

A detailed guide to understanding the output HTML file and exploring the data interactively is available here: nara3m.github.io/shanti

Demo output HTML file created with Test Datasets

🧑‍💻 For Developers

To extend or modify this tool:

Check the shanti source code
Edit the histogram, volcano, or dashboard layout logic

🙋 FAQ

Q: What kind of Excel format is expected?

A: See Shanti_Test_Proteins.xlsx and Shanti_Test_PeptideGroups.xlsx. Both the protein and the peptide file must contain a column named UniProtID; the name is hardcoded. At minimum, the protein file also needs a fold change column, a p-value (or adjusted p-value) column, and two normalized abundance columns (for the histograms). See the demo HTML file for the columns used in the protein and peptide tables. It is recommended to have at least 6 protein columns and 8 peptide columns to display in the tables. Log2 fold change and p-values can also be displayed in the protein table. ⚠️ The UniProtID column in the protein table should contain only one ID per row. ⚠️ The UniProtID column in the peptide table can contain multiple semicolon (;) separated UniProtIDs.
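The difference between the two UniProtID columns matters when peptides are matched to a selected protein. As an illustration only (this is not Shanti's code, and `index_peptides` and the example rows are hypothetical), semicolon-separated IDs can be split so that each peptide is reachable from every protein it maps to:

```python
from collections import defaultdict


def index_peptides(peptide_rows):
    """Build {UniProtID: [peptide rows]} from rows with ';'-separated IDs."""
    index = defaultdict(list)
    for row in peptide_rows:
        for uniprot_id in str(row["UniProtID"]).split(";"):
            uniprot_id = uniprot_id.strip()
            if uniprot_id:
                index[uniprot_id].append(row)
    return index


# Made-up peptide rows for demonstration.
peptides = [
    {"UniProtID": "P12345;Q67890", "Sequence": "AAAK"},
    {"UniProtID": "Q67890", "Sequence": "CCCR"},
]
lookup = index_peptides(peptides)
print([row["Sequence"] for row in lookup["Q67890"]])
```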

Q: Does it support .csv files?

A: Not yet, but it's easy to adapt by editing the load_data function.

📬 Questions?

Feel free to open an issue or reach out with feedback!

Cite

Marella, N. (2025). Shanti: create SHarable, interactive, stANdalone html dashboard from Tabular proteomIcs data (v0.1.1). Zenodo. doi.org/10.5281/zenodo.15307776

Release history

0.1.3 (May 13, 2025), this version
0.1.1 (May 12, 2025)
0.1.0 (Apr 29, 2025)

Download files

Source Distribution

shanti-0.1.3.tar.gz (32.1 kB), uploaded May 13, 2025, Source

Built Distribution

shanti-0.1.3-py3-none-any.whl (39.9 kB), uploaded May 13, 2025, Python 3

File details

Details for the file shanti-0.1.3.tar.gz.

File metadata

Download URL: shanti-0.1.3.tar.gz
Upload date: May 13, 2025
Size: 32.1 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for shanti-0.1.3.tar.gz
Algorithm	Hash digest
SHA256	`b20f85e1baee65c744113dbdbc0a801d9f1090698e8e488f40c4a5dd09547b2e`
MD5	`a5e02283d35f5d6cd39aa63011be47f7`
BLAKE2b-256	`62a2a55d10b6c9a64af8d6d9af6005a4fee001ff79fa05fcf8cd6341fedd40f4`

Provenance

The following attestation bundles were made for shanti-0.1.3.tar.gz:

Publisher: publish.yml on n3m4u/shanti

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: shanti-0.1.3.tar.gz
- Subject digest: b20f85e1baee65c744113dbdbc0a801d9f1090698e8e488f40c4a5dd09547b2e
- Sigstore transparency entry: 212215880
- Sigstore integration time: May 13, 2025
Source repository:
- Permalink: n3m4u/shanti@b8296c8ad1ab7631339bb42fbf179798091cab7f
- Branch / Tag: refs/tags/0.1.3
- Owner: https://github.com/n3m4u
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@b8296c8ad1ab7631339bb42fbf179798091cab7f
- Trigger Event: release

File details

Details for the file shanti-0.1.3-py3-none-any.whl.

File metadata

Download URL: shanti-0.1.3-py3-none-any.whl
Upload date: May 13, 2025
Size: 39.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for shanti-0.1.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`f6a756c83c099cb4f69728afaf32678680bdf05a289e0817d9f3fd4a0bf4860d`
MD5	`0dec5d8499314994268dce42bffa33c0`
BLAKE2b-256	`e8643bd3c60074162421e5bac7b0817182ad7c60cf56cc84f274b92cb33482cb`

Provenance

The following attestation bundles were made for shanti-0.1.3-py3-none-any.whl:

Publisher: publish.yml on n3m4u/shanti

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: shanti-0.1.3-py3-none-any.whl
- Subject digest: f6a756c83c099cb4f69728afaf32678680bdf05a289e0817d9f3fd4a0bf4860d
- Sigstore transparency entry: 212215881
- Sigstore integration time: May 13, 2025
Source repository:
- Permalink: n3m4u/shanti@b8296c8ad1ab7631339bb42fbf179798091cab7f
- Branch / Tag: refs/tags/0.1.3
- Owner: https://github.com/n3m4u
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@b8296c8ad1ab7631339bb42fbf179798091cab7f
- Trigger Event: release
