SCYN: Single cell CNV profiling method using dynamic programming

Prerequisites

  • python3
  • numpy>=1.16.1
  • pandas>=0.23.4,<0.24
  • tasklogger>=0.4.0
  • scipy>=1.3.0
  • pysam>=0.15.3
  • SCOPE

Install requirements

pip install -r requirements.txt

To install the R package SCOPE, refer to the SCOPE README. SCYN uses SCOPE to build the cell-by-bin read depth matrix and to normalize it. SCYN itself focuses on finding the optimal CNV segmentation using dynamic programming.
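To give a feel for what segmentation by dynamic programming means, here is a minimal, generic sketch. It is not SCYN's algorithm or objective function: it assumes a single cell, a fixed number of segments, and a plain squared-error cost, and the function name `segment_dp` is hypothetical.

```python
from typing import List, Tuple


def segment_dp(values: List[float], n_segments: int) -> Tuple[float, List[int]]:
    """Split `values` into `n_segments` contiguous pieces that minimize the
    total within-segment squared error.

    Returns (cost, breakpoints), where breakpoints are the start indices of
    every segment except the first.
    """
    n = len(values)
    if not 1 <= n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(values)")

    # Prefix sums of values and squared values give O(1) segment costs.
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for idx, v in enumerate(values):
        s1[idx + 1] = s1[idx] + v
        s2[idx + 1] = s2[idx] + v * v

    def sse(i: int, j: int) -> float:
        # Squared error of values[i:j] around its own mean.
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[k][j]: minimal cost of splitting values[:j] into k segments.
    best = [[inf] * (n + 1) for _ in range(n_segments + 1)]
    back = [[0] * (n + 1) for _ in range(n_segments + 1)]
    best[0][0] = 0.0
    for k in range(1, n_segments + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                cost = best[k - 1][i] + sse(i, j)
                if cost < best[k][j]:
                    best[k][j] = cost
                    back[k][j] = i

    # Walk the back-pointers from the end to recover the breakpoints.
    breakpoints: List[int] = []
    j = n
    for k in range(n_segments, 1, -1):
        j = back[k][j]
        breakpoints.append(j)
    breakpoints.reverse()
    return best[n_segments][n], breakpoints


# Made-up normalized depth values for eight bins of one cell.
cost, breakpoints = segment_dp([2, 2, 2, 4, 4, 4, 1, 1], n_segments=3)
print(cost, breakpoints)
```

The table `best` is filled one segment count at a time, so each entry reuses the optimal solutions for shorter prefixes instead of enumerating every possible set of breakpoints.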

Installation

Installation with pip

To install with pip, run the following from a terminal:

pip install scyn

Installation from GitHub

To clone the repository and install manually, run the following from a terminal:

git clone https://github.com/xikanfeng2/SCYN.git
cd SCYN
python setup.py install

Usage

Quick start

The following code runs SCYN.

In command line:

usage: python run-scyn.py [-h] [options] -i input_bams_dir

SCYN: Single cell CNV profiling method using dynamic programming efficiently
and effectively

required arguments:
  -i, --indir   <str> the input bams directory (default: None)

optional arguments:
  -o, --outdir  <str> the output directory (default: ./)
  --seq           <str> the reads type: single-end or paired-end. (default:
                    single-end)
  --bin_len       <int> the bin length, default is 500K. (default: 500)
  --ref           <str> the reference genome version: hg19 or hg38.
                    (default: hg19)
  --reg           <str> the regular expression to match all BAM files in
                    your input directory. For example, ".bam" will match all
                    BAM files ended with '.bam'. (default: *.bam)
  --mapq          <int> the mapping quality cutoff when calculating the
                    reads coverage. (default: 40)
  --verbose       <int> If > 0, print log messages. (default: 1)
  -h, --help
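If you drive the command line tool from a script, the options above can be assembled programmatically. The following is a small sketch: the helper name `build_scyn_command` and the example directories are hypothetical, and only the flags listed in the help text are used.

```python
import shlex
from typing import List


def build_scyn_command(indir: str, outdir: str = "./", seq: str = "single-end",
                       bin_len: int = 500, ref: str = "hg19",
                       reg: str = "*.bam", mapq: int = 40,
                       verbose: int = 1) -> List[str]:
    """Build the argument list for run-scyn.py from the documented options."""
    if seq not in ("single-end", "paired-end"):
        raise ValueError("seq must be 'single-end' or 'paired-end'")
    if ref not in ("hg19", "hg38"):
        raise ValueError("ref must be 'hg19' or 'hg38'")
    return [
        "python", "run-scyn.py",
        "-i", indir,
        "-o", outdir,
        "--seq", seq,
        "--bin_len", str(bin_len),
        "--ref", ref,
        "--reg", reg,
        "--mapq", str(mapq),
        "--verbose", str(verbose),
    ]


# Hypothetical input and output directories.
cmd = build_scyn_command("bams/", outdir="results/", ref="hg38")

# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(part) for part in cmd))

# To actually run it (requires run-scyn.py in the working directory):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with patterns such as `*.bam`, which the shell would otherwise expand before the script sees them.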

In Python:

import scyn

# create SCYN object
scyn_operator = scyn.SCYN()

# call cnv
# bam_dir is the input bam directory and output_dir is the output directory
scyn_operator.call(bam_dir, output_dir)

# store cnv matrix to a csv file
scyn_operator.cnv.to_csv('your file name')

For a merged 10X BAM (a single BAM file containing all cells), SCYN provides a function that splits it into per-cell BAM files based on the cell barcodes.

import scyn
scyn.demultiplex_10X_bam(info_file, bam_file, out_dir)

This function demultiplexes a merged 10X BAM file by cell barcode. Parameters:

  • info_file : the sample summary info file. For an example, see breast_tissue_A_2k_per_cell_summary_metrics.csv on the 10X website.
  • bam_file : the path to the merged BAM file.
  • out_dir : the output directory. The split BAM files are saved in this directory as cell-barcode.bam, where cell-barcode is the barcode of each cell.
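Conceptually, demultiplexing means grouping reads by their cell barcode and keeping only barcodes listed in the summary file. The sketch below shows that grouping step on in-memory records. It is not the code behind `scyn.demultiplex_10X_bam`: the function name `group_reads_by_barcode`, the use of the `CB` tag, and the sample reads and barcodes are all assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# A read is modelled as (read name, {tag: value}); a real BAM record carries
# much more information.
Read = Tuple[str, Dict[str, str]]


def group_reads_by_barcode(reads: Iterable[Read], valid_barcodes: Set[str],
                           tag: str = "CB") -> Dict[str, List[str]]:
    """Group read names by cell barcode, dropping reads that have no barcode
    tag or whose barcode is not in `valid_barcodes`."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name, tags in reads:
        barcode = tags.get(tag)
        if barcode in valid_barcodes:
            groups[barcode].append(name)
    return dict(groups)


# Made-up reads and barcodes.
reads = [
    ("read1", {"CB": "AAAC-1"}),
    ("read2", {"CB": "TTTG-1"}),
    ("read3", {}),                 # no barcode tag: skipped
    ("read4", {"CB": "AAAC-1"}),
    ("read5", {"CB": "GGGG-1"}),   # barcode not in the summary: skipped
]
per_cell = group_reads_by_barcode(reads, valid_barcodes={"AAAC-1", "TTTG-1"})

for barcode, names in sorted(per_cell.items()):
    # In a real pipeline each group would be written to <out_dir>/<barcode>.bam.
    print(barcode, names)
```

With pysam, the same idea would iterate over a `pysam.AlignmentFile` and read the barcode with `read.get_tag("CB")` after checking `read.has_tag("CB")`.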

SCYN attributes

scyn_operator = scyn.SCYN()
  • scyn_operator.cnv is the copy number variants matrix.
  • scyn_operator.segments is the segments for each chromosome.
  • scyn_operator.meta_info is the meta information of the cells, including gini and ploidy.

SCYN Output Format

The output of SCYN consists of two CNV files, one segments file, and one meta file.

  • cnv.csv: with cell as row and bin as column. This file can be used as the input of Oviz-SingleCell CNV analysis.

  • cnv_T.csv: with bin as row and cell as column; it is the transpose of cnv.csv. This file can be parsed by popular R tools such as ExpressionSet for downstream analysis.

  • segments.csv is the cnv segments information for each chromosome.

  • meta.csv: with cell as row, and meta information as column. The default meta information is:

    • c_gini: stores the Gini coefficient of each cell.
    • c_ploidy: stores the mean ploidy of each cell. It is calculated from cnv.csv (not the one SCOPE provides).

    Users can manually add extra cell meta information such as 'cell_type', 'cluster', or 'group' for downstream analysis. The prefix c denotes a continuous numeric value. A column without the prefix c holds categorical meta information such as 'group' or 'cluster'.
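The two default meta columns and the naming convention can be illustrated with a short sketch. It assumes the standard definition of the Gini coefficient and takes the mean ploidy to be the average copy number over all bins of a cell. Which values SCYN feeds into c_gini is not stated here, so the per-bin read counts below are hypothetical, as are the cell names and helper names.

```python
from typing import Dict, List, Sequence


def mean_ploidy(copy_numbers: Sequence[float]) -> float:
    """Average copy number across all bins of one cell."""
    return sum(copy_numbers) / len(copy_numbers)


def gini(values: Sequence[float]) -> float:
    """Gini coefficient of non-negative values: 0 means perfectly even."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with i = 1..n
    # over the values sorted in ascending order.
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


# Made-up copy numbers (rows of cnv.csv) and read counts for two cells.
cnv_rows = {"cell_1": [2, 2, 3, 1], "cell_2": [2, 2, 2, 2]}
read_counts = {"cell_1": [10, 12, 30, 4], "cell_2": [11, 11, 11, 11]}

meta: List[Dict[str, object]] = []
for cell, copies in cnv_rows.items():
    meta.append({
        "cell": cell,
        "c_ploidy": mean_ploidy(copies),     # continuous, hence the c_ prefix
        "c_gini": gini(read_counts[cell]),   # continuous, hence the c_ prefix
        "group": "example",                  # categorical, no prefix
    })

# Columns can be split by the prefix convention described above.
continuous = [key for key in meta[0] if key.startswith("c_")]
print(continuous)
```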

Parameters

SCYN(seq='single-end', bin_len=500, ref='hg19', reg='*.bam', mapq=40, verbose=1)

Parameters

  • seq : string, optional, default: single-end. The read type: single-end or paired-end.

  • bin_len : int, optional, default: 500. The bin length in kb; the default of 500 corresponds to 500K bins.

  • ref : string, optional, default: hg19. The reference genome version: hg19 or hg38.

  • reg : string, optional, default: *.bam. The regular expression used to match all BAM files in your input directory. For example, ".bam" will match all BAM files ending with '.bam'.

  • mapq : int, optional, default: 40. The mapping quality cutoff used when calculating the read coverage.

  • verbose : int or boolean, optional, default: 1. If True or > 0, print status messages.
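The default value of reg, `*.bam`, reads like a shell-style wildcard rather than a regular expression (as a Python regular expression, a leading `*` is not valid). How SCYN interprets the pattern internally is not documented here, so the sketch below only shows which file names a wildcard match would select, under that assumption. The helper name `select_bams` and the file names are hypothetical.

```python
import fnmatch
from typing import Iterable, List


def select_bams(file_names: Iterable[str], pattern: str = "*.bam") -> List[str]:
    """Return the file names matching a shell-style wildcard pattern.

    fnmatchcase is used so the match is case-sensitive on every platform.
    """
    return sorted(name for name in file_names
                  if fnmatch.fnmatchcase(name, pattern))


# Made-up directory listing.
listing = ["cell_A.bam", "cell_A.bam.bai", "cell_B.bam", "notes.txt"]

print(select_bams(listing))               # index files (.bai) are not matched
print(select_bams(listing, "cell_A*"))    # a narrower pattern
```

Checking the pattern against a directory listing before a long run is a cheap way to confirm that index files and unrelated files are excluded.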

Cite us

Help

If you have any questions or need assistance using SCYN, please contact us at xikanfeng2@gmail.com.
