A data integration algorithm.
Project description
harmonypy
Harmony is an algorithm for integrating multiple high-dimensional datasets.
harmonypy is a port of the harmony R package by Ilya Korsunsky.
Example
This animation shows the Harmony alignment of three single-cell RNA-seq datasets from different donors.
Installation
This package has been tested with Python 3.7.
Use pip to install:
pip install harmonypy
Usage
Here is a brief example using the data that comes with the R package:
# Load data
import numpy as np
import pandas as pd
meta_data = pd.read_csv("data/meta.tsv.gz", sep = "\t")
vars_use = ['dataset']
# meta_data
#
# cell_id dataset nGene percent_mito cell_type
# 0 half_TGAAATTGGTCTAG half 3664 0.017722 jurkat
# 1 half_GCGATATGCTGATG half 3858 0.029228 t293
# 2 half_ATTTCTCTCACTAG half 4049 0.015966 jurkat
# 3 half_CGTAACGACGAGAG half 3443 0.020379 jurkat
# 4 half_ACGCCTTGTTTACC half 2813 0.024774 t293
# .. ... ... ... ... ...
# 295 t293_TTACGTACGACACT t293 4152 0.033997 t293
# 296 t293_TAGAATTGTTGGTG t293 3097 0.021769 t293
# 297 t293_CGGATAACACCACA t293 3157 0.020411 t293
# 298 t293_GGTACTGAGTCGAT t293 2685 0.027846 t293
# 299 t293_ACGCTGCTTCTTAC t293 3513 0.021240 t293
data_mat = pd.read_csv("data/pcs.tsv.gz", sep = "\t")
data_mat = np.array(data_mat)
# data_mat[:5,:5]
#
# array([[ 0.0071695 , -0.00552724, -0.0036281 , -0.00798025, 0.00028931],
# [-0.011333 , 0.00022233, -0.00073589, -0.00192452, 0.0032624 ],
# [ 0.0091214 , -0.00940727, -0.00106816, -0.0042749 , -0.00029096],
# [ 0.00866286, -0.00514987, -0.0008989 , -0.00821785, -0.00126997],
# [-0.00953977, 0.00222714, -0.00374373, -0.00028554, 0.00063737]])
# meta_data.shape # 300 cells, 5 variables
# (300, 5)
#
# data_mat.shape # 300 cells, 20 PCs
# (300, 20)
# Run Harmony
import harmonypy as hm
ho = hm.run_harmony(data_mat, meta_data, vars_use)
# Write the adjusted PCs to a new file.
res = pd.DataFrame(ho.Z_corr)
res.columns = ['X{}'.format(i + 1) for i in range(res.shape[1])]
res.to_csv("data/adj.tsv.gz", sep = "\t", index = False)
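Before running Harmony, it can help to confirm that the two inputs describe the same cells. The sketch below is not part of harmonypy: check_harmony_inputs is a hypothetical helper, and it assumes, as in the example above, that each row of data_mat and each row of meta_data corresponds to one cell. It takes plain shapes and column names so that it runs without any third-party package.

```python
def check_harmony_inputs(data_shape, meta_shape, meta_columns, vars_use):
    """Hypothetical helper (not part of harmonypy).

    Returns a list of human-readable problems; an empty list means the
    inputs look consistent under the cells-as-rows assumption.
    """
    problems = []

    # Both tables should have one row per cell.
    if data_shape[0] != meta_shape[0]:
        problems.append(
            "data_mat has {} rows but meta_data has {} rows".format(
                data_shape[0], meta_shape[0]
            )
        )

    # Every variable named in vars_use should be a column of meta_data.
    missing = [v for v in vars_use if v not in meta_columns]
    if missing:
        problems.append("vars_use not found in meta_data: {}".format(missing))

    return problems


# Shapes and column names taken from the example above.
problems = check_harmony_inputs(
    data_shape=(300, 20),
    meta_shape=(300, 5),
    meta_columns=["cell_id", "dataset", "nGene", "percent_mito", "cell_type"],
    vars_use=["dataset"],
)
print(problems)
```

In a real session the same check could be called as check_harmony_inputs(data_mat.shape, meta_data.shape, list(meta_data.columns), vars_use) just before hm.run_harmony.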
Download files
Source Distribution
harmonypy-0.0.10.tar.gz (20.3 kB)
Built Distribution
harmonypy-0.0.10-py3-none-any.whl (20.9 kB)
File details
Details for the file harmonypy-0.0.10.tar.gz.
File metadata
- Download URL: harmonypy-0.0.10.tar.gz
- Size: 20.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: python-httpx/0.27.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 27bd39a6f9ada1708ffa577e46c9b7363d1e2fd62740e477ce11fd61819a54df
MD5 | 5f971ed174891e6296769080bef9fc8a
BLAKE2b-256 | 1a699af6183745618057797b940a76320c52a38ad2a69e688e6345e2a0219655
File details
Details for the file harmonypy-0.0.10-py3-none-any.whl.
File metadata
- Download URL: harmonypy-0.0.10-py3-none-any.whl
- Size: 20.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: python-httpx/0.27.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | dab528052f909204e521c9c2bd980221c64003538b0c0fe25be2e43c1199282b
MD5 | 2b259a6ff2da6511de867c0064ac61a9
BLAKE2b-256 | cccd9479dd66e503af191edc016a302d2125c4f02ea777ebea1e48f6b944b073