Skip to main content

Accurate integration of single-cell DNA and RNA data for a deeper understanding of tumor heterogeneity

Project description

MaCroDNA

By Mohammadamin Edrisi, Xiru Huang, Huw A. Ogilvie and Luay Nakhleh

Department of Computer Science, Rice University

Table of Contents

1. Introduction

2. Citation

3. Installation

4. Usage

5. Contacts

Introduction

MaCroDNA (Matching Cross-Domain Nucleic Acids) is a correlation-based method to perform mapping between scRNA-seq gene expression and scDNA-seq copy number values. This repository contains the source code for MaCroDNA described in the paper "MaCroDNA: Accurate integration of single-cell DNA and RNA data for a deeper understanding of tumor heterogeneity".

Citation

Installation

Package Requirements

Python >= 3.7

numpy

pandas

scipy

math

gurobipy

Here, only gurobipy need to be installed manually before installing MaCroDNA. Other packages can be installed automatically while installing MaCroDNA

gurobipy Installation

First, Gurobi need to be installed. The academic license is free. The installation instructions for Gurobi are

OS Instruction
Linux Gurobi Installation Guide: Linux
Windows Gurobi Installation Guide: Windows
macOS Gurobi Installation Guide: macOS

After the installation of Gurobi, gurobipy can be installed using pip

python -m pip install gurobipy

Other installation methods can be found How do I install Gurobi for Python

MaCroDNA Installation

pip install -i https://test.pypi.org/simple/ MaCroDNA==0.0.5

Installation Test

After installing MaCroDNA, you can test the installation using the following code in python

$ python
> from MaCroDNA import MaCroDNA
> test_macrodna = MaCroDNA()
> test_macrodna.tiny_test()

And the output should be like

******Test DNA data is:
      cell1  cell2  cell3  cell4
gene                            
g1        2      2      1      2
g2        2      2      1      2
g3        3      2      2      2
g4        1      2      2      2
g5        6      2      2      2
g6        2      2      3      6
******Test RNA data is:
      cell1  cell2  cell3  cell4
gene                            
g1        0      2      0      1
g2        0      2      0      1
g3       10      2      2      1
g4        0      2      2      1
g5       20      2      0      1
g6        0      2      5     20
g7        0      0      0      0
******Clone id for each DNA cell is:
   clone   cell
0      0  cell1
1      1  cell2
2      2  cell3
3      3  cell4
**********
Start Mapping RNA cells to DNA clones
**********
After selecting the same genes in RNA and DNA
number of cells in dna data 4
number of cells in rna data 4
number of genes in dna data 6
number of genes in rna data 6
[[ 0.95553309  0.          0.15877684 -0.29277002]
 [ 0.          0.          0.          0.        ]
 [-0.34698896  0.          0.87447463  0.86824314]
 [-0.18650096  0.          0.7592566   1.        ]]
1 0
MaCroDNA will be run for 1 steps
the smallest set has 4 number of cells
Set parameter Username
Academic license - for non-commercial use only - expires 2023-04-18
Gurobi Optimizer version 9.5.1 build v9.5.1rc2 (linux64)
Thread count: 10 physical cores, 20 logical processors, using up to 20 threads
Optimize a model with 9 rows, 16 columns and 48 nonzeros
Model fingerprint: 0xbe36aead
Variable types: 0 continuous, 16 integer (16 binary)
Coefficient statistics:
  Matrix range     [1e+00, 1e+00]
  Objective range  [2e-01, 1e+00]
  Bounds range     [1e+00, 1e+00]
  RHS range        [1e+00, 4e+00]
Found heuristic solution: objective 2.8300077
Presolve removed 9 rows and 16 columns
Presolve time: 0.00s
Presolve: All rows and columns removed

Explored 0 nodes (0 simplex iterations) in 0.00 seconds (0.00 work units)
Thread count was 1 (of 20 available processors)

Solution count 1: 2.83001 

Optimal solution found (tolerance 1.00e-04)
Best objective 2.830007718086e+00, best bound 2.830007718086e+00, gap 0.0000%
IsMIP: 1
Solved with MIPFocus: 0
The model has been optimized
Obj: 2.83001
the number of associations in the correspondence matrix 4.0
**********
Finish Mapping
Test Success
**********

For more complicated test, you can use test_macrodna.py under test/ directory

Usage

cell-to-cell mapping

For mapping RNA cells to DNA cells, the input should be two dataframes dna_df and rna_df. In these two dataframes, the row ids should be the genes and the column ids should be the cells. The genes on RNA data and those in DNA data can be different, because MaCroDNA can select the same genes from the data by itself.

To get the cell-to-cell mapping

$ python
> from MaCroDNA import MaCroDNA
> macrodna = MaCroDNA(rna_df, dna_df)
> cell2cell = macrodna.cell2cell_assignment()

The output cell2cell is also a dataframe. The index ids are the RNA cell ids. And it only has on column "predict_cell", which is the DNA cell assigned to the corresponding RNA cell.

cell-to-clone mapping

For mapping RNA cells to DNA clones, the input needs three dataframes dna_df, rna_df and dna_label. dna_df and rna_df are same as cell-to-cell mapping. dna_label has two columns "clone" and "cell", where "cell" is the DNA cells and "clone" is the corresponding clone id for that DNA cell.

To get the cell-to-clone mapping

$ python
> from MaCroDNA import MaCroDNA
> macrodna = MaCroDNA(rna_df, dna_df, dna_label)
> cell2clone = macrodna.cell2clone_assignment()

The output cell2clone is also a dataframe. The index ids are the RNA cell ids. It has two columns. One is "predict_cell", which is the corresponding DNA cell for that RNA cell. The other column is "predict_clone", which is the predict clone id for that RNA cell.

Contacts

If you have any questions, please contact us via edrisi@rice.edu or xiru.huang@rice.edu

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

MaCroDNA-0.0.1.tar.gz (6.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

MaCroDNA-0.0.1-py3-none-any.whl (7.2 kB view details)

Uploaded Python 3

File details

Details for the file MaCroDNA-0.0.1.tar.gz.

File metadata

  • Download URL: MaCroDNA-0.0.1.tar.gz
  • Upload date:
  • Size: 6.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.0 CPython/3.7.7

File hashes

Hashes for MaCroDNA-0.0.1.tar.gz
Algorithm Hash digest
SHA256 64f3d0923453ab4d2ec444ec541c30bb776d75a0333bd91edde7d75e5423d0bd
MD5 a714e76ad4cce77c9eb077caec200a35
BLAKE2b-256 2f86ff3e69941e74dd8c2983b281d668e212ef27101f00637c3694bd76efdf66

See more details on using hashes here.

File details

Details for the file MaCroDNA-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: MaCroDNA-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 7.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.0 CPython/3.7.7

File hashes

Hashes for MaCroDNA-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 c9b14ada31eb62935e6456ea72a549d2e8450f16a16182430217a152b301916f
MD5 d703c61b9b99ecb7c5a108dbcb42201e
BLAKE2b-256 d24d0cdb1927584ae8e4ace97050273c396d1b8849a83ed329f5c92a86f9d459

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page