A collection of SBM utility functions
sbmutils
A collection of SBM functions.
Installation
pip install sbmutils
Features
Preprocessing
Normalization
- quantilenorm: Performs 2D quantile normalization over columns
  - Supports both mean and median averaging methods
  - Handles missing values (NaN)
  - Input validation and error handling
- stacked_quantilenorm: Performs quantile normalization on stacked data with batch information
- referenced_quantilenorm: Normalizes data using reference quantiles
- standardize: Standardizes data by centering and scaling
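To make the idea behind column-wise quantile normalization concrete, here is a minimal NumPy sketch. It is not the sbmutils implementation: the function name `quantile_normalize_columns` is hypothetical, and the sketch assumes a 2D input without missing values and breaks ties by sort order rather than averaging tied ranks.

```python
import numpy as np


def quantile_normalize_columns(x, average="mean"):
    """Illustrative column-wise quantile normalization (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    if x.ndim != 2:
        raise ValueError("expected a 2D array")
    if average not in ("mean", "median"):
        raise ValueError("average must be 'mean' or 'median'")
    if np.isnan(x).any():
        # Assumption: this sketch does not reproduce the library's NaN handling.
        raise ValueError("this sketch does not handle NaN")

    # Rank the values inside each column.
    order = np.argsort(x, axis=0, kind="stable")
    sorted_cols = np.take_along_axis(x, order, axis=0)

    # Reference distribution: average of the k-th smallest value across columns.
    reducer = np.mean if average == "mean" else np.median
    reference = reducer(sorted_cols, axis=1)

    # Write the reference value for each rank back to its original position.
    out = np.empty_like(x)
    np.put_along_axis(out, order, reference[:, None], axis=0)
    return out
```

After normalization every column holds the same set of values, only arranged according to that column's original ranking.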
Biological Normalization
- combat / combat_seq: Batch effect correction using ComBat
- counts_to_fpkm: Converts count data to FPKM
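For reference, FPKM is conventionally defined as fragments per kilobase of transcript per million mapped fragments. The sketch below applies that standard formula with NumPy. The function name `counts_to_fpkm_sketch`, the genes-by-samples layout, and the use of the column sum as the library size are assumptions for illustration, not the sbmutils signature.

```python
import numpy as np


def counts_to_fpkm_sketch(counts, gene_lengths_bp):
    """Illustrative FPKM conversion (hypothetical helper).

    counts: array of shape (genes, samples)
    gene_lengths_bp: array of shape (genes,), gene lengths in base pairs
    """
    counts = np.asarray(counts, dtype=float)
    lengths = np.asarray(gene_lengths_bp, dtype=float)
    if counts.ndim != 2 or lengths.shape != (counts.shape[0],):
        raise ValueError("counts must be (genes, samples) with one length per gene")

    # Assumption: library size is the total count per sample.
    library_sizes = counts.sum(axis=0)

    # FPKM = count * 1e9 / (gene length in bp * library size)
    return counts * 1e9 / (lengths[:, None] * library_sizes[None, :])
```

A sample whose counts sum to zero produces a division by zero here, so real code would need to decide how to treat empty samples.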
Filtering
- parse_gtf: Parses GTF files for gene information
- get_gene_id_to_entrez_mapper: Maps gene IDs to Entrez IDs
- entrez_filtering: Filters data based on Entrez IDs
- protein_coding_filtering: Filters for protein-coding genes
Decomposition
- NMF: Non-negative Matrix Factorization implementation
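As background, non-negative matrix factorization approximates a non-negative matrix X by a product W H of two non-negative factors. The sketch below uses the classic multiplicative update rules for the Frobenius-norm objective. It is only an illustration of the technique: the function name, its parameters, and the absence of NaN handling are assumptions, and the sbmutils `NMF` class may use a different algorithm.

```python
import numpy as np


def nmf_multiplicative(x, num_components, num_iter=200, seed=0, eps=1e-10):
    """Illustrative NMF via multiplicative updates (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    if x.ndim != 2:
        raise ValueError("expected a 2D array")
    if (x < 0).any():
        raise ValueError("input must be non-negative")

    rng = np.random.default_rng(seed)
    n_rows, n_cols = x.shape
    w = rng.uniform(size=(n_rows, num_components))
    h = rng.uniform(size=(num_components, n_cols))

    for _ in range(num_iter):
        # Update H with W fixed, then W with H fixed; eps avoids division by zero.
        h *= (w.T @ x) / (w.T @ w @ h + eps)
        w *= (x @ h.T) / (w @ h @ h.T + eps)
    return w, h
```

Because the updates only multiply by non-negative ratios, W and H stay non-negative as long as the initial factors are.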
Usage
import numpy as np
import pandas as pd
from sbmutils.preprocess import quantilenorm, standardize, combat
from sbmutils.decomp import NMF
# Quantile normalization example
data = pd.DataFrame([[1, 4], [2, 5], [3, 6]])
normalized_data = quantilenorm(data, average="mean")
# Batch effect correction
counts = pd.DataFrame(...) # Your count data
batch = [1, 1, 2, 2, ...] # Batch information
corrected_data = combat(counts, batch)
# NMF decomposition
nmf = NMF(num_components=3)
nmf.fit(data)
Requirements
- Python >= 3.6
- NumPy >= 1.19.0
- SciPy >= 1.7.0
- inmoose >= 0.1.0
- pyranges >= 0.0.100
- gtfparse >= 1.2.1
- pybiomart >= 0.1.0
How to check MATLAB compatibility
- Install the MATLAB Engine API for Python
cd "matlabroot/extern/engines/python"
python setup.py install
- Start MATLAB engine
import matlab
import matlab.engine
eng = matlab.engine.start_matlab()
- Test with MATLAB function
x_matlab = matlab.double(x_python.tolist())
result_python = function(x_python)
result_matlab = eng.function(x_matlab)
np.testing.assert_array_almost_equal(result_python, result_matlab)
License
This project is licensed under the MIT License.
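"""
Test code for NMF: compares the Python results with the MATLAB implementation.
"""
import matlab
import matlab.engine
import numpy as np
from sbmutils.decomp import NMF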
eng = matlab.engine.start_matlab()
x = np.random.randn(1000, 200)
x.ravel()[np.random.choice(x.size, 128, replace=False)] = np.nan
w = np.random.uniform(size=[1000, 4])
h = np.random.uniform(size=[4, 200])  # shape (components, columns of x)
nmf = NMF(num_components=4, num_iter=10, nmf_iter=10)
nmf.fit(x, init_w=w, init_h=h)
coph_cor_py = nmf.correlation_coefficient
ave_C_py = nmf.consensus_matrix
x_matlab = matlab.double(x.tolist())
w_matlab = matlab.double(w.tolist())
h_matlab = matlab.double(h.tolist())
out_m = eng.aoNMF_subtyping_NaN(x_matlab, 4, 10., 10., w_matlab, h_matlab, nargout=4)
coph_cor_m = np.array(out_m[0])
ave_C_m = np.array(out_m[1])
np.testing.assert_array_almost_equal(coph_cor_py, coph_cor_m)
np.testing.assert_array_almost_equal(ave_C_py, ave_C_m)
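The test above compares a consensus matrix and a cophenetic correlation coefficient. For readers unfamiliar with those quantities, the sketch below shows one common way to compute them from repeated clustering runs, using SciPy's hierarchical clustering. The function names, the use of average linkage, and the input format (one row of cluster labels per run) are assumptions; neither sbmutils nor the MATLAB code is confirmed to compute them exactly this way.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, linkage
from scipy.spatial.distance import squareform


def consensus_matrix(label_runs):
    """Fraction of runs in which each pair of samples shares a cluster."""
    labels = np.asarray(label_runs)  # shape (runs, samples)
    if labels.ndim != 2:
        raise ValueError("expected one row of labels per run")
    # Connectivity per run: 1 where two samples have the same label.
    connectivity = (labels[:, :, None] == labels[:, None, :]).astype(float)
    return connectivity.mean(axis=0)


def cophenetic_correlation(consensus):
    """Cophenetic correlation of the dissimilarity 1 - consensus."""
    distance = 1.0 - np.asarray(consensus, dtype=float)
    np.fill_diagonal(distance, 0.0)
    condensed = squareform(distance, checks=False)
    # Assumption: average linkage is used to build the dendrogram.
    tree = linkage(condensed, method="average")
    corr, _ = cophenet(tree, condensed)
    return corr
```

A coefficient close to 1 indicates that the cluster assignments are stable across runs.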