A collection of SBM utility functions

sbmutils

A collection of SBM functions.

Installation

pip install sbmutils

Features

Preprocessing

Normalization

  • quantilenorm: Performs 2D quantile normalization over columns
    • Supports both mean and median averaging methods
    • Handles missing values (NaN)
    • Input validation and error handling
  • stacked_quantilenorm: Performs quantile normalization on stacked data with batch information
  • referenced_quantilenorm: Normalizes data using reference quantiles
  • standardize: Standardizes data by centering and scaling
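
For readers unfamiliar with the technique, column-wise quantile normalization can be sketched in plain NumPy as follows. This is an illustration only, not the code inside sbmutils: the name `quantile_normalize_sketch` is made up for this example, and the sketch assumes complete data (no NaN values) and breaks ties by sort order.

```python
import numpy as np


def quantile_normalize_sketch(x, average="mean"):
    """Hypothetical helper: force every column onto the same distribution."""
    x = np.asarray(x, dtype=float)
    # Row indices that would sort each column independently
    order = np.argsort(x, axis=0)
    sorted_x = np.take_along_axis(x, order, axis=0)
    # Reference distribution: average of the k-th smallest values across columns
    if average == "mean":
        reference = sorted_x.mean(axis=1)
    elif average == "median":
        reference = np.median(sorted_x, axis=1)
    else:
        raise ValueError("average must be 'mean' or 'median'")
    # Write the reference values back at each column's original positions
    out = np.empty_like(x)
    np.put_along_axis(out, order, reference[:, None], axis=0)
    return out


normalized = quantile_normalize_sketch([[1, 4], [2, 5], [3, 6]])
# Both columns become [2.5, 3.5, 4.5]
```

After normalization every column holds the same set of values, and only their order within each column differs.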

Biological Normalization

  • combat/combat_seqt: Batch effect correction using ComBat
  • counts_to_fpkm: Converts count data to FPKM
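
The FPKM conversion follows the standard definition: FPKM = count × 10^9 / (gene length in bp × total mapped reads in the sample). Below is a minimal sketch of that formula, assuming a genes × samples count matrix and taking each column sum as the library size. The name `fpkm_sketch` and its parameters are illustrative and may not match the signature of `counts_to_fpkm`.

```python
import numpy as np


def fpkm_sketch(counts, gene_lengths_bp):
    """Hypothetical helper: counts is genes x samples, lengths are per gene."""
    counts = np.asarray(counts, dtype=float)
    lengths = np.asarray(gene_lengths_bp, dtype=float)
    # Total mapped reads per sample (assumed here to be the column sum)
    library_sizes = counts.sum(axis=0)
    # Scale by 1e9 = 1e3 (per kilobase) * 1e6 (per million reads)
    return counts * 1e9 / (lengths[:, None] * library_sizes[None, :])


counts = np.array([[10, 20], [30, 20]])
fpkm = fpkm_sketch(counts, gene_lengths_bp=[1000, 2000])
# fpkm[0, 0] is 10 * 1e9 / (1000 * 40) = 250000.0
```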

Filtering

  • parse_gtf: Parses GTF files for gene information
  • get_gene_id_to_entrez_mapper: Maps gene IDs to Entrez IDs
  • entrez_filtering: Filters data based on Entrez IDs
  • protein_coding_filtering: Filters for protein-coding genes
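
The idea behind protein-coding filtering can be shown with a toy annotation table in pandas. Everything in this sketch is an assumption made for illustration: the gene IDs are made up, the column names `gene_id` and `gene_type` follow the GENCODE GTF convention (Ensembl files use `gene_biotype` instead), and `filter_protein_coding_sketch` is not the package's `protein_coding_filtering`.

```python
import pandas as pd

# Toy annotation table standing in for a parsed GTF file (made-up IDs)
annotation = pd.DataFrame(
    {
        "gene_id": ["ENSG01", "ENSG02", "ENSG03"],
        "gene_type": ["protein_coding", "lncRNA", "protein_coding"],
    }
)

# Toy expression matrix: genes as rows, samples as columns
expression = pd.DataFrame(
    {"s1": [5.0, 1.0, 3.0], "s2": [4.0, 0.0, 2.0]},
    index=["ENSG01", "ENSG02", "ENSG03"],
)


def filter_protein_coding_sketch(expression, annotation):
    """Hypothetical helper: keep rows whose gene is annotated protein_coding."""
    is_coding = annotation["gene_type"] == "protein_coding"
    coding_ids = annotation.loc[is_coding, "gene_id"]
    return expression.loc[expression.index.isin(coding_ids)]


filtered = filter_protein_coding_sketch(expression, annotation)
```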

Decomposition

  • NMF: Non-negative Matrix Factorization implementation
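
As background, NMF approximates a non-negative matrix X by a product W H of two non-negative factors. The sketch below uses the generic Lee and Seung multiplicative updates for the Frobenius objective. It is not the algorithm implemented by the package's `NMF` class, and the function name, parameters, and defaults are assumptions chosen for this example.

```python
import numpy as np


def nmf_multiplicative_sketch(x, num_components, num_iter=200, seed=0, eps=1e-10):
    """Hypothetical helper: factor non-negative x (n x m) into w (n x k) and h (k x m)."""
    rng = np.random.default_rng(seed)
    n, m = x.shape
    w = rng.uniform(size=(n, num_components))
    h = rng.uniform(size=(num_components, m))
    for _ in range(num_iter):
        # Multiplicative updates keep both factors non-negative
        h *= (w.T @ x) / (w.T @ w @ h + eps)
        w *= (x @ h.T) / (w @ h @ h.T + eps)
    return w, h


x_demo = np.random.default_rng(1).uniform(size=(30, 20))
w_demo, h_demo = nmf_multiplicative_sketch(x_demo, num_components=3)
reconstruction_error = np.linalg.norm(x_demo - w_demo @ h_demo)
```

The input must be non-negative for these updates to be valid; the small `eps` only guards against division by zero.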

Usage

import numpy as np
import pandas as pd
from sbmutils.preprocess import quantilenorm, standardize, combat
from sbmutils.decomp import NMF

# Quantile normalization example
data = pd.DataFrame([[1, 4], [2, 5], [3, 6]])
normalized_data = quantilenorm(data, average="mean")

# Batch effect correction
counts = pd.DataFrame(...)  # Your count data
batch = [1, 1, 2, 2, ...]   # Batch information
corrected_data = combat(counts, batch)

# NMF decomposition
nmf = NMF(num_components=3)
nmf.fit(data)

Requirements

  • Python >= 3.6
  • NumPy >= 1.19.0
  • SciPy >= 1.7.0
  • inmoose >= 0.1.0
  • pyranges >= 0.0.100
  • gtfparse >= 1.2.1
  • pybiomart >= 0.1.0

How to check MATLAB compatibility

  1. Install the MATLAB engine for Python
cd "matlab_root/extern/engines/python"
python setup.py install
  2. Start the MATLAB engine
import matlab
import matlab.engine
eng = matlab.engine.start_matlab()
  3. Test against the MATLAB function
import numpy as np

# Convert the NumPy input to a MATLAB array
x_matlab = matlab.double(x_python.tolist())

# `function` stands for the function under test, which must exist on both sides
result_python = function(x_python)
result_matlab = eng.function(x_matlab)

np.testing.assert_array_almost_equal(result_python, result_matlab)
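
The comparison step can be tried without a MATLAB installation by substituting a plain NumPy array for the MATLAB result. The sketch below only demonstrates how `np.testing.assert_array_almost_equal` behaves: with its default `decimal=6` it tolerates tiny numerical differences and treats NaN values at matching positions as equal. The array names are placeholders invented for this example.

```python
import numpy as np

# Stand-in for a Python result that contains a missing value
result_python = np.array([[1.0, np.nan], [3.0, 4.0]])
# Stand-in for the MATLAB result: same values up to rounding noise
result_matlab_like = result_python + 1e-8  # NaN + 1e-8 is still NaN

# Passes: differences are far below the default tolerance
np.testing.assert_array_almost_equal(result_python, result_matlab_like)

# A difference of 1e-3 exceeds the tolerance and raises AssertionError
mismatch_detected = False
try:
    np.testing.assert_array_almost_equal(result_python, result_python + 1e-3)
except AssertionError:
    mismatch_detected = True
```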

License

This project is licensed under the MIT License.

import numpy as np
import matlab
import matlab.engine

from sbmutils.decomp import NMF

"""
test code for NMF
"""
eng = matlab.engine.start_matlab()

x = np.random.randn(1000, 200)
x.ravel()[np.random.choice(x.size, 128, replace=False)] = np.nan
w = np.random.uniform(size=[1000, 4])
h = np.random.uniform(size=[4, 200])  # 4 x 200 so that w @ h matches the shape of x

nmf = NMF(num_components=4, num_iter=10, nmf_iter=10)
nmf.fit(x, init_w=w, init_h=h)

coph_cor_py = nmf.correlation_coefficient
ave_C_py = nmf.consensus_matrix

x_matlab = matlab.double(x.tolist())
w_matlab = matlab.double(w.tolist())
h_matlab = matlab.double(h.tolist())
out_m = eng.aoNMF_subtyping_NaN(x_matlab, 4, 10., 10., w_matlab, h_matlab, nargout=4)

coph_cor_m = np.array(out_m[0])
ave_C_m = np.array(out_m[1])

np.testing.assert_array_almost_equal(coph_cor_py, coph_cor_m)
np.testing.assert_array_almost_equal(ave_C_py, ave_C_m)
