XICRA: Small RNAseq pipeline for paired-end reads

Description

XICRA is a Python pipeline, organized into separate modules, designed to take paired-end FASTQ reads, trim adapters and low-quality bases, and merge overlapping reads (R1 and R2). Using the joined reads, it describes all major RNA biotypes present in the samples, including miRNAs and isomiRs, tRNA fragments (tRFs) and piwi-associated RNAs (piRNAs).
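To make the idea of merging overlapping R1 and R2 reads concrete, here is a minimal toy sketch. It is not the merging tool that XICRA runs: it assumes an exact-match overlap, ignores base qualities, and the names `merge_pair` and `min_overlap` are hypothetical.

```python
from typing import Optional

# Toy illustration only: NOT the read-merging software used by XICRA.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(r1: str, r2: str, min_overlap: int = 10) -> Optional[str]:
    """Join R1 with the reverse complement of R2 when they overlap exactly.

    Returns the merged sequence, or None when no overlap of at least
    `min_overlap` bases is found (hypothetical threshold).
    """
    r2_rc = reverse_complement(r2)
    # Try the longest candidate overlap first, then shorter ones.
    for size in range(min(len(r1), len(r2_rc)), min_overlap - 1, -1):
        if r1[-size:] == r2_rc[:size]:
            return r1 + r2_rc[size:]
    return None


# Arbitrary 22-nt fragment sequenced from both ends with 18-nt reads.
fragment = "TGAGGTAGTAGGTTGTATAGTT"
r1 = fragment[:18]
r2 = reverse_complement(fragment[4:])
print(merge_pair(r1, r2))
```

Because small RNA fragments are usually shorter than the read length, R1 and R2 tend to overlap over most of the fragment, which is why joining them is feasible for this kind of data.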

So far, XICRA produces a miRNA analysis at the isomiR level using joined reads, the software selected by the user and a standardization procedure. Results are generated for each sample analyzed and summarized for all samples in a single expression matrix. This information can be processed at the miRNA or isomiR level (single sequence), or summarized for each isomiR variant type, and can be easily accessed using the accompanying R package XICRA.stats. Although the pipeline is designed to take paired-end reads, it also accepts single-end reads.
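The difference between the isomiR, miRNA and variant-type levels can be illustrated with a small sketch. The records below are invented toy data, and the labels and tuple layout are assumptions made for illustration; they do not reproduce XICRA's actual output format.

```python
from collections import defaultdict

# Hypothetical toy records: (miRNA, variant type, sequence id, read count).
# Names and layout are illustrative only, not XICRA's real output.
records = [
    ("miR-A", "canonical", "seq1", 120),
    ("miR-A", "3p-trimming", "seq2", 30),
    ("miR-A", "3p-trimming", "seq3", 10),
    ("miR-B", "canonical", "seq4", 50),
]


def summarize(rows, key_index):
    """Sum read counts grouped by the tuple field at `key_index`."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key_index]] += row[3]
    return dict(totals)


by_mirna = summarize(records, 0)    # collapse all isomiRs of each miRNA
by_variant = summarize(records, 1)  # collapse by isomiR variant type
by_isomir = summarize(records, 2)   # one entry per single sequence

print(by_mirna)
print(by_variant)
```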

Installation

XICRA requires Python v3.7 and Java (tested with OpenJDK 14, 2020-03-17).
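Before installing, it may help to confirm that both prerequisites are visible from your shell. The following is a minimal sketch, assuming that v3.7 is meant as a minimum Python version (the text above does not say so explicitly) and that Java is exposed as a `java` executable on the PATH; `check_requirements` is a hypothetical helper, not part of XICRA.

```python
import shutil
import sys

# Assumption: 3.7 is treated as the minimum supported Python version.
MIN_PYTHON = (3, 7)


def check_requirements(min_python=MIN_PYTHON):
    """Report whether Python is recent enough and where `java` is found."""
    return {
        "python_ok": sys.version_info[:2] >= min_python,
        "java_path": shutil.which("java"),  # None when java is not on PATH
    }


status = check_requirements()
print("Python version OK:", status["python_ok"])
print("java executable:", status["java_path"] or "not found on PATH")
```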

The XICRA Python pipeline is available through both pip and conda.

XICRA depends on several third-party software packages, listed below.

Dependencies

The XICRA Python module installs itself along with several Python module dependencies (pandas, multiqc, pybedtools, biopython, etc.).

Additionally, XICRA depends on the third-party software listed in the following table.

Conda environment

We encourage you to install XICRA and all its dependencies using the conda environment we created, following these instructions.

To create a new conda environment, install the third-party software, and install XICRA and any missing dependencies, do as follows:

  1. Get the requirements file from the XICRA git repository:
wget https://raw.githubusercontent.com/HCGB-IGTP/XICRA/master/XICRA_pip/devel/conda/environment.yml
  2. Create an environment named XICRA and install the required packages using conda:
conda env create -f environment.yml
  3. Activate the environment and install XICRA:
## activate
conda activate XICRA

## install latest python code
pip install XICRA
  4. Install missing software: unfortunately, a few executables are available neither as conda nor as pip packages. These are miraligner, sRNAbench and MINTmap. We provide a bash script that retrieves them and adds them to your conda environment.
## install missing software
wget https://raw.githubusercontent.com/HCGB-IGTP/XICRA/master/XICRA_pip/XICRA/config/software/installer.sh
sh installer.sh

To check that everything is set up correctly, run the config module:

XICRA config

Documentation

See the full documentation, user guide and manual here

Example

Here we include a brief example of how to use XICRA.

First, create a Python environment and install XICRA and its dependencies, as described in the installation instructions above. Then, test XICRA using the example of 100 simulated miRNAs provided within the repository.

## run XICRA example
ln -s ~/BMC_bioinformatics_paper/simulation/example/reads/

## prepare reads
XICRA prep --input reads/ --output_folder test_XICRA

## join reads
XICRA join --input test_XICRA --noTrim

## create miRNA analysis
XICRA miRNA --input test_XICRA --software miraligner sRNAbench

## explore results
ls test_XICRA/report/
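
The same three steps can also be chained from a small Python driver, which stops at the first failing step. This is a minimal sketch that only reuses the commands and options shown above; `build_commands` and `run_pipeline` are hypothetical helper names, and the script defaults to a dry run that prints the commands without executing them.

```python
import shlex
import subprocess


def build_commands(reads_dir="reads/", out_dir="test_XICRA",
                   software=("miraligner", "sRNAbench")):
    """Return the prep, join and miRNA commands from the example above."""
    return [
        ["XICRA", "prep", "--input", reads_dir, "--output_folder", out_dir],
        ["XICRA", "join", "--input", out_dir, "--noTrim"],
        ["XICRA", "miRNA", "--input", out_dir, "--software", *software],
    ]


def run_pipeline(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(shlex.quote(part) for part in cmd))
        if not dry_run:
            # check=True raises CalledProcessError if a step fails.
            subprocess.run(cmd, check=True)


# Dry run by default: requires no XICRA installation to try out.
run_pipeline(build_commands())
```

Pass `dry_run=False` inside the activated XICRA environment to actually run the steps.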

Documentation

For full documentation and details, visit the Read the Docs site here.

See a brief example of how to install and run XICRA here

License

MIT License

Copyright (c) 2020-2022 HCGB-IGTP

See additional details here

Developed and maintained by Jose F. Sanchez-Herrero and Lauro Sumoy at HCGB-IGTP

http://www.germanstrias.org/technology-services/genomica-bioinformatica/

Citation

Sanchez Herrero, J.F., Pluvinet, R., Luna de Haro, A. et al. Paired-end small RNA sequencing reveals a possible overestimation in the isomiR sequence repertoire previously reported from conventional single read data analysis. BMC Bioinformatics 22, 215 (2021). https://doi.org/10.1186/s12859-021-04128-1

Authors

Antonio Luna de Haro (v0.1)
Jose F Sanchez-Herrero (v1.0)

Download files

Source Distribution

XICRA-1.4.5.tar.gz (80.0 kB view details)

Uploaded Source

Built Distribution

XICRA-1.4.5-py3-none-any.whl (105.3 kB view details)

Uploaded Python 3

File details

Details for the file XICRA-1.4.5.tar.gz.

File metadata

  • Download URL: XICRA-1.4.5.tar.gz
  • Upload date:
  • Size: 80.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.4.0 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.61.0 CPython/3.8.2

File hashes

Hashes for XICRA-1.4.5.tar.gz
Algorithm Hash digest
SHA256 2bd911c2a3671d351229d6c92cfbb64387e7bbdd75e6049021acb27cf9616875
MD5 7a19d59dee80d32161fa60b9b38b92ca
BLAKE2b-256 9042890d7a8e2dbb1c7f4d00228004fe891644ee405e8ca37e07772a309482c8

File details

Details for the file XICRA-1.4.5-py3-none-any.whl.

File metadata

  • Download URL: XICRA-1.4.5-py3-none-any.whl
  • Upload date:
  • Size: 105.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.4.0 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.61.0 CPython/3.8.2

File hashes

Hashes for XICRA-1.4.5-py3-none-any.whl
Algorithm Hash digest
SHA256 8ee9648f414c4053090201a48c3fa0c11e90ce80ff03c15649fdcee7f05479a6
MD5 26adcad43f6d195b73377edeb79b631a
BLAKE2b-256 06dfeb0776ae5bb7f725c264dfa271e789f378800aa5c9daf201d619f455fb30
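
A downloaded file can be checked against the SHA256 digests listed above. The following is a minimal sketch using only the Python standard library; `sha256_of` and `matches` are hypothetical helper names, and the file is assumed to sit in the current directory.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Return True when the file's SHA256 digest equals the expected one."""
    return sha256_of(path) == expected_hex.strip().lower()


# Example call (assumes the source distribution was downloaded here):
# matches("XICRA-1.4.5.tar.gz",
#         "2bd911c2a3671d351229d6c92cfbb64387e7bbdd75e6049021acb27cf9616875")
```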
