MLMarker

MLMarker is a Python package for tissue-specific proteomics prediction using machine learning, with integrated SHAP-based explainability features.

Key Features

  • Dual Model Support: Binary and quantitative tissue prediction models
  • SHAP-Based Predictions: Uses SHAP values for more interpretable predictions
  • Feature Penalty System: Adjustable penalty for absent features using penalty_factor
  • Visualization Tools: Force plots, radar charts, and custom visualizations
  • Protein Analysis: Integrated tools for NSAF calculation and protein information retrieval
  • Data Validation: Automatic handling of missing features

Installation

pip install mlmarker

Quick Start

import pandas as pd
from mlmarker import MLMarker

# Load your data
df = pd.read_csv("your_sample.csv")

# Initialize model (binary=False for quantitative model)
model = MLMarker(binary=False, penalty_factor=1)

# Load and validate your sample
model.load_sample(df)

# Get predictions
predictions = model.predict_top_tissues_shap(n_preds=5)

Core Features

1. Model Initialization

# Binary model
binary_model = MLMarker(binary=True)

# Quantitative model with penalty for absent features
quant_model = MLMarker(binary=False, penalty_factor=1)

2. SHAP-Based Predictions

# Get predictions with SHAP explanations
predictions = model.predict_top_tissues_shap(n_preds=5)

# Visualize SHAP force plot
model.shap_force_plot(n_preds=3)

# Generate radar chart of predictions
model.radar_chart()

3. SHAP Value Analysis

# Get raw SHAP values
shap_values = model.explainability.calculate_shap()

# Get processed SHAP values with optional penalty
shap_df = model.explainability.get_shap_values(n_preds=5)
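
To make the idea of ranking tissues by SHAP values concrete, here is a minimal, standalone sketch. It does not call MLMarker and is not its implementation: the function top_tissues, the nested-dictionary layout, and the toy numbers are all assumptions made for illustration. It sums the per-feature contributions for each tissue and returns the n_preds highest totals.

```python
def top_tissues(shap_by_tissue, n_preds=5):
    """Rank tissues by their summed per-feature SHAP contributions.

    Hypothetical helper for illustration only; MLMarker may aggregate
    SHAP values differently.
    """
    totals = {
        tissue: sum(contributions.values())
        for tissue, contributions in shap_by_tissue.items()
    }
    # Highest total contribution first
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n_preds]


# Toy values, not real SHAP output
toy_shap = {
    "Liver": {"PROT_A": 0.5, "PROT_B": 0.25},
    "Kidney": {"PROT_A": 0.125, "PROT_B": -0.25},
    "Heart": {"PROT_A": 0.25, "PROT_B": 0.0},
}

ranking = top_tissues(toy_shap, n_preds=2)
print(ranking)
```

With these toy values, Liver ranks first and Heart second, because their summed contributions are the two largest.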

4. Feature Handling

# Get model features
features = model.get_model_features()

# Load sample with feature validation
added_features = model.load_sample(df, output_added_features=True)
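
The following standalone sketch shows one way feature validation of this kind could work. It is an assumption, not MLMarker's code: the helper align_to_model_features is hypothetical, and filling missing features with 0.0 and dropping features the model does not know are illustrative choices.

```python
def align_to_model_features(sample, model_features, fill_value=0.0):
    """Return (aligned_sample, added_features).

    Hypothetical helper: features the model expects but the sample lacks
    are added with fill_value; features unknown to the model are dropped.
    """
    added = [feature for feature in model_features if feature not in sample]
    aligned = {
        feature: sample.get(feature, fill_value) for feature in model_features
    }
    return aligned, added


# Toy sample: PROT_X is not a model feature, PROT_B and PROT_C are missing
sample = {"PROT_A": 0.5, "PROT_X": 0.125}
model_features = ["PROT_A", "PROT_B", "PROT_C"]

aligned, added = align_to_model_features(sample, model_features)
print(aligned)
print(added)
```

Returning the list of added features alongside the aligned sample mirrors the role that output_added_features=True appears to play above: it lets you see how much of the model's feature space your sample actually covered.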

5. NSAF Calculations

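# Calculate NSAF scores for proteins
# (protein_df and lengths_df are tables you supply with your protein data
# and the corresponding protein lengths)
nsaf_df = model.explainability.calculate_NSAF(protein_df, lengths_df)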
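
For reference, NSAF (normalized spectral abundance factor) divides each protein's spectral count by its length and then normalizes so that the values sum to 1 across the sample. The sketch below implements that standard definition with plain dictionaries. The function name nsaf and the toy inputs are assumptions; the input format expected by calculate_NSAF may differ.

```python
def nsaf(spectral_counts, lengths):
    """Return NSAF per protein: (count / length), normalized to sum to 1."""
    saf = {}
    for protein, count in spectral_counts.items():
        length = lengths[protein]
        if length <= 0:
            raise ValueError(f"non-positive length for {protein}")
        # Spectral abundance factor: count corrected for protein length
        saf[protein] = count / length
    total = sum(saf.values())
    if total == 0:
        raise ValueError("all spectral counts are zero")
    return {protein: value / total for protein, value in saf.items()}


# Toy spectral counts and lengths (in amino acids), not real data
counts = {"PROT_A": 10, "PROT_B": 20, "PROT_C": 5}
lengths = {"PROT_A": 100, "PROT_B": 400, "PROT_C": 100}

scores = nsaf(counts, lengths)
print(scores)
```

Note that PROT_B has the highest raw count but not the highest NSAF, because its count is spread over a longer sequence.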

Advanced Usage

Penalty Factor

The penalty_factor parameter controls how absent features influence predictions:

  • 0: No penalty (default)
  • 1: Full penalty for absent features
  • Values between 0 and 1: Partial penalty

# Model with full penalty for absent features
model = MLMarker(penalty_factor=1)
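
One possible reading of this parameter is sketched below. This is an assumption about the mechanism, not a description of MLMarker's internals: the function penalized_tissue_score is hypothetical, a feature is treated as absent when its value in the sample is 0 or missing, and the SHAP contribution of each absent feature is multiplied by penalty_factor before summing.

```python
def penalized_tissue_score(shap_by_feature, sample, penalty_factor=0.0):
    """Sum SHAP contributions, scaling those of absent features.

    Hypothetical illustration: with penalty_factor=0 absent features are
    ignored, with penalty_factor=1 their contributions count in full.
    """
    if not 0.0 <= penalty_factor <= 1.0:
        raise ValueError("penalty_factor must be between 0 and 1")
    score = 0.0
    for feature, contribution in shap_by_feature.items():
        absent = sample.get(feature, 0.0) == 0.0
        weight = penalty_factor if absent else 1.0
        score += weight * contribution
    return score


# Toy values: PROT_C is absent and has a negative contribution
toy_shap = {"PROT_A": 0.5, "PROT_B": 0.25, "PROT_C": -0.25}
toy_sample = {"PROT_A": 0.125, "PROT_B": 0.0625}

no_penalty = penalized_tissue_score(toy_shap, toy_sample, penalty_factor=0.0)
half_penalty = penalized_tissue_score(toy_shap, toy_sample, penalty_factor=0.5)
full_penalty = penalized_tissue_score(toy_shap, toy_sample, penalty_factor=1.0)
print(no_penalty, half_penalty, full_penalty)
```

Under this reading, raising penalty_factor lowers the score of a tissue whose expected proteins are missing from the sample, which is the behavior the list above describes.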

Custom SHAP Visualization

# Visualize specific tissue
model.shap_force_plot(tissue_name="Liver")

# Visualize top N predictions
model.shap_force_plot(n_preds=3)

Additional Utilities

from mlmarker.utils import (
    get_protein_info,
    get_hpa_info,
    get_go_enrichment,
    visualise_custom_plot
)

# Get protein information
protein_info = get_protein_info("P12345")

# Get Human Protein Atlas information
hpa_info = get_hpa_info("P12345")

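# Perform GO enrichment analysis on a list of protein identifiers
protein_list = ["P12345"]  # placeholder; replace with your own identifiers
enrichment = get_go_enrichment(protein_list)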

Requirements

  • Python ≥ 3.8
  • numpy==1.23.5
  • pandas
  • scikit-learn
  • shap==0.42.0
  • plotly
  • bioservices
  • gprofiler-official
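
Because numpy and shap are pinned to exact versions, it can be useful to check what is installed before debugging unexpected behavior. The sketch below uses only the standard library (importlib.metadata, available from Python 3.8). The helper check_pins and the PINNED dictionary are hypothetical and simply restate the two pins listed above.

```python
from importlib import metadata

# Exact pins taken from the requirements list above
PINNED = {"numpy": "1.23.5", "shap": "0.42.0"}


def check_pins(pins=PINNED):
    """Return {package: (installed_or_None, expected)} for every mismatch."""
    mismatches = {}
    for package, expected in pins.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[package] = (installed, expected)
    return mismatches


for package, (installed, expected) in check_pins().items():
    print(f"{package}: installed {installed}, expected {expected}")
```

An empty result means both pinned packages are installed at the expected versions.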

Source Distribution

mlmarker-0.1.6.tar.gz (12.8 MB)

  • Tags: Source
  • Uploaded using Trusted Publishing: No
  • Uploaded via: poetry/2.1.4 CPython/3.9.23 Linux/6.12.38+deb13-amd64
  • SHA256: d1b7f0df5836ce9953c1e850e2f82bcefd67173938a63eeb7b8c16cd9c0397b7
  • MD5: cba16abb98fcf8f4441e0a2887dd280c
  • BLAKE2b-256: b7feb581fe5bcb4532ac8a8d30748a239d2fb2181d0356044acc0380686c7628

Built Distribution

mlmarker-0.1.6-py3-none-any.whl (13.0 MB)

  • Tags: Python 3
  • Uploaded using Trusted Publishing: No
  • Uploaded via: poetry/2.1.4 CPython/3.9.23 Linux/6.12.38+deb13-amd64
  • SHA256: 8f009ab08744c844dc6e6d70e7d8770bfa6878818bafe3cc728ad3d531e4f5e7
  • MD5: 7805aef7dac220268047e8b787ad58d4
  • BLAKE2b-256: 65b56a58a16e03e7f9fc8fc57a69a67354900c70b6ad3d237e622cdf3bc13bed