MLMarker
MLMarker is a Python package for tissue-specific proteomics prediction using machine learning, with integrated SHAP-based explainability features.
Key Features
- Dual Model Support: Binary and quantitative tissue prediction models
- SHAP-Based Predictions: Uses SHAP values for more interpretable predictions
- Feature Penalty System: Adjustable penalty for absent features using penalty_factor
- Visualization Tools: Force plots, radar charts, and custom visualizations
- Protein Analysis: Integrated tools for NSAF calculation and protein information retrieval
- Data Validation: Automatic handling of missing features
Installation
pip install mlmarker
Quick Start
import pandas as pd
from mlmarker import MLMarker
# Load your data
df = pd.read_csv("your_sample.csv")
# Initialize model (binary=False for quantitative model)
model = MLMarker(binary=False, penalty_factor=1)
# Load and validate your sample
model.load_sample(df)
# Get predictions
predictions = model.predict_top_tissues_shap(n_preds=5)
Core Features
1. Model Initialization
# Binary model
binary_model = MLMarker(binary=True)
# Quantitative model with penalty for absent features
quant_model = MLMarker(binary=False, penalty_factor=1)
2. SHAP-Based Predictions
# Get predictions with SHAP explanations
predictions = model.predict_top_tissues_shap(n_preds=5)
# Visualize SHAP force plot
model.shap_force_plot(n_preds=3)
# Generate radar chart of predictions
model.radar_chart()
3. SHAP Value Analysis
# Get raw SHAP values
shap_values = model.explainability.calculate_shap()
# Get processed SHAP values with optional penalty
shap_df = model.explainability.get_shap_values(n_preds=5)
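To make the idea of a SHAP-based ranking concrete, here is a minimal sketch in plain Python. It assumes that a tissue's score is the sum of its per-protein SHAP contributions; that assumption, the helper name `rank_tissues_by_shap`, the input layout, and the toy numbers are all hypothetical, and this is not necessarily how `predict_top_tissues_shap` works internally.

```python
from typing import Dict, List, Tuple


def rank_tissues_by_shap(
    shap_by_tissue: Dict[str, Dict[str, float]], n_preds: int = 5
) -> List[Tuple[str, float]]:
    """Return the n_preds tissues with the largest summed SHAP contribution."""
    totals = {
        tissue: sum(contributions.values())
        for tissue, contributions in shap_by_tissue.items()
    }
    # Sort by total contribution, highest first, and keep the top n_preds.
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n_preds]


# Toy SHAP values per tissue and protein, invented for illustration only.
toy_shap = {
    "Liver": {"P1": 0.5, "P2": 0.25},
    "Brain": {"P1": -0.25, "P2": 0.5},
    "Heart": {"P1": 0.125, "P2": 0.0},
}
top_two = rank_tissues_by_shap(toy_shap, n_preds=2)
print(top_two)
```

With these toy values, Liver ranks first because both proteins push its score up, while the negative contribution of P1 lowers the score for Brain.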
4. Feature Handling
# Get model features
features = model.get_model_features()
# Load sample with feature validation
added_features = model.load_sample(df, output_added_features=True)
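Conceptually, feature validation means aligning a sample to the feature list the model was trained on. The sketch below shows one way this could look in plain Python, assuming that missing features are filled with zero and that proteins unknown to the model are dropped. Both behaviours are assumptions, and `align_to_model_features` is a hypothetical helper rather than part of MLMarker.

```python
from typing import Dict, List, Tuple


def align_to_model_features(
    sample: Dict[str, float], model_features: List[str], fill_value: float = 0.0
) -> Tuple[Dict[str, float], List[str]]:
    """Align a sample to the model's feature list.

    Returns the aligned sample and the names of the features that were added.
    """
    added = [name for name in model_features if name not in sample]
    # Follow the model's feature order; proteins the model does not know are dropped.
    aligned = {name: sample.get(name, fill_value) for name in model_features}
    return aligned, added


model_features = ["P1", "P2", "P3"]  # hypothetical feature names
sample = {"P2": 0.4, "P9": 1.2}      # P9 is not a model feature
aligned, added = align_to_model_features(sample, model_features)
print(aligned)
print(added)
```

Returning the list of added features alongside the aligned sample mirrors the purpose of `output_added_features=True`: it lets you see how much of the model's input was absent from your data.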
5. NSAF Calculations
# Calculate NSAF scores for proteins
nsaf_df = model.explainability.calculate_NSAF(protein_df, lengths_df)
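For reference, NSAF (normalized spectral abundance factor) is conventionally defined as a protein's spectral count divided by its length, normalized by the sum of that ratio over all proteins in the sample. The sketch below implements this standard definition in plain Python. The function name `nsaf` and the dictionary inputs are hypothetical, and the tables expected by `calculate_NSAF` may be laid out differently.

```python
from typing import Dict


def nsaf(spectral_counts: Dict[str, float], lengths: Dict[str, int]) -> Dict[str, float]:
    """Compute NSAF per protein: (SpC / L) divided by the sum of SpC / L over all proteins."""
    # Spectral abundance factor: spectral count normalized by protein length.
    saf = {
        protein: count / lengths[protein]
        for protein, count in spectral_counts.items()
    }
    total = sum(saf.values())
    if total == 0:
        raise ValueError("All spectral counts are zero; NSAF is undefined.")
    return {protein: value / total for protein, value in saf.items()}


# Toy spectral counts and protein lengths, invented for illustration only.
counts = {"P1": 10, "P2": 30}
lengths = {"P1": 100, "P2": 600}
scores = nsaf(counts, lengths)
print(scores)
```

Although P2 has three times as many spectra as P1 in this toy example, it is six times longer, so its NSAF is lower. The scores of a sample always sum to 1.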
Advanced Usage
Penalty Factor
The penalty_factor parameter controls how absent features influence predictions:
- 0: No penalty (default)
- 1: Full penalty for absent features
- Values between 0 and 1: Partial penalty
# Model with full penalty for absent features
model = MLMarker(penalty_factor=1)
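One way to picture the three regimes is as a weight on the contributions of absent features. The toy sketch below assumes that an absent feature has an abundance of zero and that its SHAP contribution is multiplied by `penalty_factor`. Both points are assumptions made for illustration, and `apply_penalty` is a hypothetical helper, not MLMarker's implementation.

```python
from typing import List


def apply_penalty(
    shap_values: List[float], abundances: List[float], penalty_factor: float
) -> List[float]:
    """Scale the SHAP contribution of absent features (abundance == 0) by penalty_factor."""
    if not 0.0 <= penalty_factor <= 1.0:
        raise ValueError("penalty_factor must be between 0 and 1")
    return [
        value * penalty_factor if abundance == 0 else value
        for value, abundance in zip(shap_values, abundances)
    ]


# Toy data: one present protein and two absent ones with negative contributions.
shap_values = [0.5, -0.25, -0.5]
abundances = [2.0, 0.0, 0.0]

totals = {
    factor: sum(apply_penalty(shap_values, abundances, factor))
    for factor in (0.0, 0.5, 1.0)
}
print(totals)
```

In this toy setup the summed contribution drops from 0.5 with no penalty to -0.25 with the full penalty, which shows how absent proteins can pull a tissue's score down as the factor increases.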
Custom SHAP Visualization
# Visualize specific tissue
model.shap_force_plot(tissue_name="Liver")
# Visualize top N predictions
model.shap_force_plot(n_preds=3)
Additional Utilities
from mlmarker.utils import (
    get_protein_info,
    get_hpa_info,
    get_go_enrichment,
    visualise_custom_plot,
)
# Get protein information
protein_info = get_protein_info("P12345")
# Get Human Protein Atlas information
hpa_info = get_hpa_info("P12345")
# Perform GO enrichment analysis
enrichment = get_go_enrichment(protein_list)
Requirements
- Python ≥ 3.8
- numpy==1.23.5
- pandas
- scikit-learn
- shap==0.42.0
- plotly
- bioservices
- gprofiler-official
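Because numpy and shap are pinned to exact versions, it can help to check an environment before running the package. The following is a small sketch that uses only the standard library; `check_environment` and the `PINNED` mapping are hypothetical names, and the pins are copied from the list above.

```python
import sys
from importlib import metadata
from typing import Dict, List

# Exact pins taken from the requirements list above.
PINNED = {"numpy": "1.23.5", "shap": "0.42.0"}


def check_environment(pinned: Dict[str, str] = PINNED) -> List[str]:
    """Return a list of human-readable problems; an empty list means no mismatch was found."""
    problems = []
    if sys.version_info < (3, 8):
        problems.append("Python 3.8 or newer is required")
    for package, wanted in pinned.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            problems.append(f"{package} is not installed (expected {wanted})")
            continue
        if installed != wanted:
            problems.append(f"{package} {installed} found, expected {wanted}")
    return problems


for problem in check_environment():
    print(problem)
```

Installing into a fresh virtual environment is usually the simplest way to satisfy the exact pins without disturbing other projects.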
File details
Details for the file mlmarker-0.1.6.tar.gz.
File metadata
- Download URL: mlmarker-0.1.6.tar.gz
- Upload date:
- Size: 12.8 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.4 CPython/3.9.23 Linux/6.12.38+deb13-amd64
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d1b7f0df5836ce9953c1e850e2f82bcefd67173938a63eeb7b8c16cd9c0397b7 |
| MD5 | cba16abb98fcf8f4441e0a2887dd280c |
| BLAKE2b-256 | b7feb581fe5bcb4532ac8a8d30748a239d2fb2181d0356044acc0380686c7628 |
File details
Details for the file mlmarker-0.1.6-py3-none-any.whl.
File metadata
- Download URL: mlmarker-0.1.6-py3-none-any.whl
- Upload date:
- Size: 13.0 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.4 CPython/3.9.23 Linux/6.12.38+deb13-amd64
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8f009ab08744c844dc6e6d70e7d8770bfa6878818bafe3cc728ad3d531e4f5e7 |
| MD5 | 7805aef7dac220268047e8b787ad58d4 |
| BLAKE2b-256 | 65b56a58a16e03e7f9fc8fc57a69a67354900c70b6ad3d237e622cdf3bc13bed |