Optimal transport-based data integration.
We are currently updating transmorph to v0.2.0; the PyPI version has not been updated yet.
transmorph is a Python framework dedicated to data integration, with a focus on single-cell applications. Data integration is the problem of embedding two or more datasets together, across different batches or feature spaces, so that similar samples end up close to one another. transmorph aims to provide a comprehensive framework to design, apply, report and benchmark data integration models, using a system of interactive building blocks supported by statistical and plotting tools. It includes pre-built models as well as benchmarking databanks, so that integration tasks are easy to set up. The package is compatible with scanpy and anndata, and works in Jupyter notebooks.
transmorph is also computationally efficient and can scale to large datasets with competitive integration quality.
Documentation
https://transmorph.readthedocs.io/en/latest/
Installation
transmorph can be installed either from source or from the Python package index, PyPI. The PyPI version is generally more stable but may lack the latest features; the development version is available on GitHub. Using a Python virtual environment (for instance pipenv) is highly recommended, as it makes dependencies and versions easier to handle.
Notable dependencies (automatically installed via pip)
- anndata
- igraph
- leidenalg
- numba
- numpy
- osqp (quadratic program solver)
- POT (optimal transport in python)
- pymde
- pynndescent
- scipy
- scikit-learn
- stabilized-ica
- umap-learn
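To see which of these dependencies are already present in the active environment, a minimal sketch using only the standard library follows. It assumes the names in the list above match the distribution names published on PyPI (for instance `POT` and `umap-learn`); the helper `installed_versions` is hypothetical and not part of transmorph.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names copied from the dependency list above (assumed to be
# the names used on PyPI).
NOTABLE_DEPENDENCIES = [
    "anndata", "igraph", "leidenalg", "numba", "numpy", "osqp", "POT",
    "pymde", "pynndescent", "scipy", "scikit-learn", "stabilized-ica",
    "umap-learn",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(NOTABLE_DEPENDENCIES).items():
        print(f"{name:15s} {ver or 'not installed'}")
```

Entries reported as not installed should be pulled in automatically the next time transmorph is installed with pip.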
Install from source (latest version)
git clone https://github.com/Risitop/transmorph
pip install ./transmorph
Install from PyPI (recommended, latest stable version)
pip install transmorph
Quick start with a pre-built model
All transmorph models take a list of AnnData objects as input for data integration. Let us start by loading some benchmarking data, gathered from [Chen 2020] (3.4 GB download).
from transmorph.datasets import load_chen_10x
chen_10x = load_chen_10x()
One can then either create a custom integration model, or load a pre-built transmorph model. We will choose the EmbedMNN model with default parameters for this example, which embeds all datasets into a common abstract 2D space.
from transmorph.models import EmbedMNN
model = EmbedMNN()
model.fit(chen_10x)
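The name EmbedMNN suggests matching by mutual nearest neighbors (MNN). As a rough illustration of that idea only, and not of transmorph's actual implementation, the sketch below pairs points from two batches when each one is the other's nearest neighbor. The function names and the toy coordinates are hypothetical.

```python
from math import dist


def nearest_index(point, candidates):
    """Index of the candidate closest to `point` (Euclidean distance)."""
    return min(range(len(candidates)), key=lambda j: dist(point, candidates[j]))


def mutual_nearest_neighbors(batch_a, batch_b):
    """Return pairs (i, j) where a_i and b_j are each other's nearest neighbor."""
    a_to_b = [nearest_index(p, batch_b) for p in batch_a]
    b_to_a = [nearest_index(q, batch_a) for q in batch_b]
    # Keep a pair only if the relation holds in both directions.
    return [(i, j) for i, j in enumerate(a_to_b) if b_to_a[j] == i]


# Toy data (made up): two batches of 2D points, the second slightly shifted.
batch_a = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
batch_b = [(0.5, 0.2), (5.3, 4.9), (20.0, 20.0)]
pairs = mutual_nearest_neighbors(batch_a, batch_b)
print(pairs)  # [(0, 0), (1, 1)]
```

The third point of each batch is left unmatched because its nearest neighbor in the other batch prefers a different partner. Requiring agreement in both directions is what makes such matches usable as anchors between batches.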
After fitting, the integrated embedding coordinates are stored in each AnnData object, under AnnData.obsm['transmorph'].
chen_10x['P01'].obsm['transmorph']
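For downstream plotting it can be convenient to gather all embeddings together with their batch of origin. The sketch below assumes the dataset collection is dict-like (as the `chen_10x['P01']` indexing suggests) and that each entry exposes `.obsm['transmorph']` as a sequence of coordinate rows. The helper `collect_embeddings` is hypothetical, and made-up stand-in objects are used so the snippet runs without downloading the data.

```python
from types import SimpleNamespace


def collect_embeddings(datasets, key="transmorph"):
    """Flatten per-dataset embeddings into (batch_name, coordinates) rows."""
    rows = []
    for name, adata in datasets.items():
        for coords in adata.obsm[key]:
            rows.append((name, tuple(float(c) for c in coords)))
    return rows


# Stand-ins mimicking only the `.obsm` attribute of AnnData objects;
# the batch name "P02" and all coordinates are invented for illustration.
toy = {
    "P01": SimpleNamespace(obsm={"transmorph": [[0.1, 0.2], [0.3, 0.4]]}),
    "P02": SimpleNamespace(obsm={"transmorph": [[1.0, 1.5]]}),
}
rows = collect_embeddings(toy)
print(rows[0])  # ('P01', (0.1, 0.2))
```

With real data, the same call would be `collect_embeddings(chen_10x)` after `model.fit(chen_10x)`, under the assumptions stated above.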
Download files
File details
Details for the file transmorph-0.2.0.tar.gz.
File metadata
- Download URL: transmorph-0.2.0.tar.gz
- Upload date:
- Size: 687.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.9.13
File hashes
Algorithm | Hash digest
---|---
SHA256 | 2164f2ee53d3b46f82c7549bb6f38338ec704517e96b787cac62aa2ba0ec1092
MD5 | 7fb42182308e2a5624f9990c9b79afca
BLAKE2b-256 | f22c289926e3d59558ee3ac42e9724215c7b5fc434422d2108c674165144ddb4
File details
Details for the file transmorph-0.2.0-py3-none-any.whl.
File metadata
- Download URL: transmorph-0.2.0-py3-none-any.whl
- Upload date:
- Size: 172.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.9.13
File hashes
Algorithm | Hash digest
---|---
SHA256 | 00d82809c97002fe595e39408cfa94a0b61f0dcf13b811b3326cbe509776d754
MD5 | 00cbd2433811951b1108f95903b775f9
BLAKE2b-256 | 5abc23db1f0cad9ec77716785fe38ad966991a4ae53d3396e844b7a5ddf118a6