A package for estimating alternative polyadenylation events from scRNA-seq data.
SCAPE-APA: a package for estimating alternative polyadenylation events from scRNA-seq data
Installation
Method 1
conda config --append channels bioconda
conda config --append channels conda-forge
conda config --append channels anaconda
conda create -n scape_env python=3.11
conda activate scape_env
conda install anaconda::numpy
conda install anaconda::scipy
conda install anaconda::pandas
conda install anaconda::matplotlib
conda install anaconda::click
conda install anaconda::tomli-w
conda install anaconda::requests
conda install conda-forge::psutil
conda install conda-forge::tomli-w
conda install bioconda::bedtools
conda install bioconda::pybedtools
conda install bioconda::pysam
conda install bioconda::gffutils
pip install taichi
pip install scape-apa
Method 2
# Mac
conda env create -f mac_environment.yml
conda activate scape_env
Method 3
# Linux
conda env create -f linux_environment.yml
conda activate scape_env
pip install -r linux_requirements.txt
Method 4
# install locally
git clone https://github.com/chengl7-lab/scape.git
cd scape
pip install .
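Whichever method you use, it can help to confirm that the environment is complete before running anything. The following is a minimal sketch, not part of scape itself: it checks that the Python dependencies listed in Method 1 are importable and that the `bedtools` and `scape` executables are on `PATH`. The function name `find_missing` is hypothetical and only used for illustration.

```python
import importlib.util
import shutil

# Import names of the Python dependencies installed in Method 1
# (the tomli-w package is imported as tomli_w).
PYTHON_MODULES = [
    "numpy", "scipy", "pandas", "matplotlib", "click", "tomli_w",
    "requests", "psutil", "pybedtools", "pysam", "gffutils", "taichi",
]
# Command-line tools expected on PATH once installation has finished.
EXECUTABLES = ["bedtools", "scape"]


def find_missing(modules=PYTHON_MODULES, executables=EXECUTABLES):
    """Return (missing_modules, missing_executables) for the active environment."""
    missing_modules = [m for m in modules if importlib.util.find_spec(m) is None]
    missing_executables = [e for e in executables if shutil.which(e) is None]
    return missing_modules, missing_executables


if __name__ == "__main__":
    mods, exes = find_missing()
    print("missing modules:", mods or "none")
    print("missing executables:", exes or "none")
```

Run it inside the activated `scape_env` environment; anything reported as missing can be installed with the matching command from Method 1.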
Commands
| Command | Description |
|---|---|
| scape gen_utr_annotation | Generate UTR annotation. |
| scape prepare_input | Prepare data per UTR. |
| scape infer_pa | Infer parameters. |
| scape merge_pa | Merge PA within junction per gene or UTR. |
| scape cal_exp_pa_len | Calculate the expected length of PA. |
| scape ex_pa_cnt_mat | Extract read count matrix. |
To get help for scape or for any of its commands:
scape --help
scape gen_utr_annotation --help
Usage
gen_utr_annotation
| Input Argument | Type | Required | Default | Description |
|---|---|---|---|---|
| --gff_file | TEXT | Yes | NA | A gff3 or gff3.gz file containing the gene body annotation. |
| --output_dir | TEXT | Yes | NA | Directory in which to save the dataframe of selected UTRs. |
| --res_file_name | TEXT | Yes | NA | File name for the UTR annotation dataframe. The .csv suffix is added automatically. |
| --gff_merge_strategy | TEXT | No | merge | Method for processing overlapping regions. It follows merge_strategy in the gffutils package. |
OUTPUT: A CSV file with the annotated 3' UTRs, stored at {output_dir}/{res_file_name}.csv.
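As an illustration of how the arguments above fit together, here is a small sketch that assembles a `scape gen_utr_annotation` call and derives where the resulting CSV should appear. The helper name `gen_utr_annotation_cmd` and the paths are hypothetical placeholders; only the flags come from the table.

```python
import shlex
from pathlib import Path


def gen_utr_annotation_cmd(gff_file, output_dir, res_file_name, gff_merge_strategy="merge"):
    """Return (argv, expected_csv) for one `scape gen_utr_annotation` call."""
    argv = [
        "scape", "gen_utr_annotation",
        "--gff_file", str(gff_file),
        "--output_dir", str(output_dir),
        "--res_file_name", res_file_name,
        "--gff_merge_strategy", gff_merge_strategy,
    ]
    # The .csv suffix is appended by scape, so res_file_name carries none.
    expected_csv = Path(output_dir) / f"{res_file_name}.csv"
    return argv, expected_csv


# Hypothetical placeholder paths; substitute your own.
argv, expected_csv = gen_utr_annotation_cmd("genes.gff3.gz", "scape_out", "utr_annotation")
print(shlex.join(argv))
print("expected output:", expected_csv)
# To execute the command: subprocess.run(argv, check=True)
```

The expected CSV path is what you would then pass as `--utr_file` to `prepare_input`.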
prepare_input
| Input Argument | Type | Required | Default | Description |
|---|---|---|---|---|
| --utr_file | TEXT | Yes | NA | UTR annotation file (the dataframe produced by gen_utr_annotation). |
| --cb_file | TEXT | Yes | NA | A tsv.gz file listing all validated barcodes (from CellRanger). It has a single column of cell barcodes, which must match the values of the CB tag in bam_file. |
| --bam_file | TEXT | Yes | NA | BAM file that is searched for reads over the annotated UTRs. |
| --output_dir | TEXT | Yes | NA | Output directory in which to save pickle files of the selected reads over the annotated UTRs. |
| --chunksize | INTEGER | No | 1000 | Number of UTR regions included in each small pickle file, which contains the preprocessed input for APA analysis. |
OUTPUT: Pickle files that include tuples (gene info, dataframe of parameter).
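Because `--cb_file` must agree with the CB tags in the BAM file, a quick sanity check of the barcode file before running `prepare_input` can save a long, fruitless run. The sketch below is not part of scape; the helper names are hypothetical, and the suffix check rests on the assumption that your BAM comes from Cell Ranger, which usually writes CB values with a GEM-well suffix such as `-1`.

```python
import gzip
import os
import tempfile


def load_barcodes(path):
    """Read the first tab-separated column of a gzipped barcode file."""
    barcodes = []
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if line:
                barcodes.append(line.split("\t")[0])
    return barcodes


def check_barcodes(barcodes):
    """Return a list of human-readable problems; empty means none were found."""
    problems = []
    if not barcodes:
        problems.append("file contains no barcodes")
    if len(set(barcodes)) != len(barcodes):
        problems.append("duplicate barcodes present")
    # Assumption: Cell Ranger style CB tags end in a suffix such as "-1".
    # A mix of suffixed and unsuffixed barcodes hints at a mismatch with the BAM.
    with_suffix = sum("-" in bc for bc in barcodes)
    if 0 < with_suffix < len(barcodes):
        problems.append("only some barcodes carry a '-<n>' suffix")
    return problems


# Demonstration on a small temporary file with made-up barcodes.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "barcodes.tsv.gz")
    with gzip.open(demo, "wt") as handle:
        handle.write("AAACCCAAGAAACACT-1\nAAACCCAAGAAACCAT-1\n")
    demo_barcodes = load_barcodes(demo)
    demo_problems = check_barcodes(demo_barcodes)
    print(demo_barcodes, demo_problems)
```

This only inspects the barcode file; comparing against the actual CB tags would additionally require reading the BAM, for example with pysam.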
infer_pa
| Input Argument | Type | Required | Default | Description |
|---|---|---|---|---|
| --input_pickle_file | TEXT | Yes | NA | Input pickle file (result of prepare_input). |
| --output_dir | TEXT | Yes | NA | Directory in which to save output pickle files with PAS information over the annotated UTRs. |
| --toml_para_file | TEXT | No | None | A TOML file (example) that specifies user-defined parameters. |
| --pre_para_pkl_file | TEXT | No | None | A pickle file with pre-specified pA sites and UTR lengths, taken from the result of a previous scape analysis. |
OUTPUT: A pickle file containing the inferred parameters for each UTR region.
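Since `prepare_input` writes several pickle files and `infer_pa` takes a single `--input_pickle_file`, one plausible workflow is to call `infer_pa` once per chunk. The sketch below builds those calls. It is an illustration under assumptions: the helper name `infer_pa_cmds` is hypothetical, and both the directory holding the chunks and the `*.pkl` glob pattern are guesses that you should check against your own `prepare_input` output.

```python
import shlex
import tempfile
from pathlib import Path


def infer_pa_cmds(pickle_dir, output_dir, pattern="*.pkl", toml_para_file=None):
    """Build one `scape infer_pa` argument list per input pickle file.

    `pickle_dir` and `pattern` are assumptions about where prepare_input
    wrote its chunks; inspect your output directory and adjust them.
    """
    cmds = []
    for pkl in sorted(Path(pickle_dir).glob(pattern)):
        argv = [
            "scape", "infer_pa",
            "--input_pickle_file", str(pkl),
            "--output_dir", str(output_dir),
        ]
        if toml_para_file is not None:
            argv += ["--toml_para_file", str(toml_para_file)]
        cmds.append(argv)
    return cmds


# Demonstration with empty stand-in files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("chunk_b.pkl", "chunk_a.pkl", "notes.txt"):
        (Path(tmp) / name).touch()
    demo_cmds = infer_pa_cmds(tmp, "scape_out")
    for argv in demo_cmds:
        print(shlex.join(argv))
# To execute each call: subprocess.run(argv, check=True)
```

Because each call is independent, the resulting argument lists could also be handed to a job scheduler instead of being run in sequence.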
merge_pa
| Input Argument | Type | Required | Default | Description |
|---|---|---|---|---|
| --output_dir | TEXT | Yes | NA | Directory which was used in previous steps to save output by prepare_input and infer_pa. |
| --utr_merge | BOOLEAN | No | True | If True, PA sites from the same gene are merged. If False, PA sites from the same UTR are merged. |
OUTPUT: A single pickle file containing all UTRs of all genes is stored in output_dir/. Its name is res.gene.pkl if utr_merge=True, otherwise, its name is res.utr.pkl.
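The naming rule for the merged result can be captured in a few lines, which is handy when a later step needs to locate the file. This is a sketch only: `merge_pa_plan` is a hypothetical helper, and passing the boolean as a separate `True`/`False` token is an assumption based on the BOOLEAN type in the table, so confirm it with `scape merge_pa --help`.

```python
from pathlib import Path


def merge_pa_plan(output_dir, utr_merge=True):
    """Return (argv, expected_pickle) for one `scape merge_pa` call."""
    argv = [
        "scape", "merge_pa",
        "--output_dir", str(output_dir),
        # Assumption: the flag takes an explicit True/False value.
        "--utr_merge", str(utr_merge),
    ]
    # res.gene.pkl when merging per gene, res.utr.pkl when merging per UTR.
    name = "res.gene.pkl" if utr_merge else "res.utr.pkl"
    return argv, Path(output_dir) / name


gene_argv, gene_pkl = merge_pa_plan("scape_out", utr_merge=True)
utr_argv, utr_pkl = merge_pa_plan("scape_out", utr_merge=False)
print(gene_pkl)
print(utr_pkl)
```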
cal_exp_pa_len
| Input Argument | Type | Required | Default | Description |
|---|---|---|---|---|
| --output_dir | TEXT | Yes | NA | Directory which was used in previous steps to save output by prepare_input and infer_pa. |
| --cell_cluster_file | TEXT | No | - | A CSV file containing two columns, in this order: the cell barcode (CB) and its respective group (cell_cluster_file). Its name will be included in the file name of the final result. |
| --res_pkl_file | TEXT | No | - | Name of res pickle file that contains PASs for calculating expected PA length. Its name will be included in the file name of final result. |
OUTPUT: exp_pa_len.csv. It is a dataframe with 2 columns.
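A cell cluster file with the two-column layout described above can be produced with the standard library. The sketch below is illustrative: `write_cell_cluster_file` is a hypothetical helper, the barcodes and labels are made up, and because this page does not say whether scape expects a header row, the header is left as an explicit option rather than a default.

```python
import csv
import os
import tempfile


def write_cell_cluster_file(path, assignments, header=None):
    """Write (barcode, cluster) pairs as a two-column CSV, barcode first.

    `header` is optional because it is not stated here whether scape
    expects a header row; pass e.g. ("CB", "cluster") if yours needs one.
    """
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        if header is not None:
            writer.writerow(header)
        for barcode, cluster in assignments:
            writer.writerow([barcode, cluster])


# Demonstration with made-up barcodes and cluster labels.
demo_assignments = [
    ("AAACCCAAGAAACACT-1", "T_cell"),
    ("AAACCCAAGAAACCAT-1", "B_cell"),
]
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "cell_clusters.csv")
    write_cell_cluster_file(demo_path, demo_assignments)
    with open(demo_path, newline="") as handle:
        demo_rows = list(csv.reader(handle))
    print(demo_rows)
```

The barcodes written here should use the same format as those in the `--cb_file` given to `prepare_input`.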
ex_pa_cnt_mat
| Input Argument | Type | Required | Default | Description |
|---|---|---|---|---|
| --output_dir | TEXT | Yes | NA | Directory which was used in previous steps to save output by prepare_input and infer_pa. |
| --res_pkl_file | TEXT | No | - | Name of res pickle file that contains PASs for calculating expected PA length. Its name will be included in the file name of final result. |
OUTPUT: A tsv.gz file named {res_pkl_file}.cnt.tsv.gz, stored in output_dir/.
Demo & Tutorials
The data used can be downloaded from examples.
Citation
Guangzhao Cheng, Tien Le, Ran Zhou, and Lu Cheng. SCAPE-APA: a package for estimating alternative polyadenylation events from scRNA-seq data. bioRxiv 2024.03.12.584547; doi: https://doi.org/10.1101/2024.03.12.584547