Turn noise to read
Project description
Turn ‘noise’ to signal: accurately rectify millions of erroneous short reads through graph learning on edit distances
noise2read originates from a computable rule derived from the PCR error mechanism: a rare read is erroneous if it has a neighboring read of high abundance. Using this rule, it restores erroneous reads to their original state without introducing any non-existent sequences into the short-read set (<300 bp). It applies to DNA and RNA sequencing (DNA/RNA-seq), small RNA, unique molecular identifier (UMI), and amplicon sequencing data.
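The rule above can be illustrated with a small sketch. This is not noise2read's implementation, which uses graph learning on edit distances. It only shows the idea of flagging a rare read that sits one edit away from an abundant read. The function names and the thresholds `max_rare` and `min_abundant` are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List


def within_one_edit(a: str, b: str) -> bool:
    """True if a and b differ by exactly one substitution, insertion, or deletion."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) string
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # one substitution at position i
    return a[i:] == b[i + 1:]  # one extra base in b at position i


def flag_suspect_reads(
    reads: Iterable[str], max_rare: int = 1, min_abundant: int = 5
) -> Dict[str, List[str]]:
    """Map each rare read to the abundant reads one edit away from it.

    max_rare and min_abundant are illustrative thresholds, not noise2read settings.
    """
    counts = Counter(reads)
    suspects: Dict[str, List[str]] = {}
    for seq, n in counts.items():
        if n > max_rare:
            continue  # not rare, so the rule does not apply
        neighbors = [
            s for s, m in counts.items()
            if m >= min_abundant and within_one_edit(seq, s)
        ]
        if neighbors:
            suspects[seq] = neighbors
    return suspects


reads = ["ACGTACGT"] * 10 + ["ACGTACGA", "TTTTCCCC"]
print(flag_suspect_reads(reads))  # {'ACGTACGA': ['ACGTACGT']}
```

The read `TTTTCCCC` is rare but has no abundant neighbor, so it is left alone. The pairwise comparison is quadratic in the number of distinct reads and is only suitable for toy inputs.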
Click noise2read to jump to its documentation
Quick-run example
A quick-run example for testing noise2read. It sets only 1 trial for Optuna and 10 estimators for XGBoost; these are not the parameters used in our paper.
noise2read installation
Please refer to QuickStart or Installation.
Clone the code and datasets from GitHub
git clone https://github.com/Jappy0/noise2read
cd noise2read/Examples/simulated_miRNAs
Quick-run test of noise2read on D14
With correction of high ambiguous errors, using the GPU for training (takes about 4 minutes with 26 cores and a GPU)
noise2read -m correction -c ../../config/Quick_test.ini -a True -g gpu_hist
Examples of correcting simulated miRNA data with mimic UMIs using noise2read
Take datasets D14 and D16 as examples.
noise2read installation
Please refer to QuickStart or Installation.
Clone the code and datasets from GitHub
git clone https://github.com/Jappy0/noise2read
cd noise2read/Examples/simulated_miRNAs
Reproduce the evaluation results for D14 and D16 from raw, true and corrected datasets
noise2read -m evaluation -i ./simulated_miRNAs/raw/D14_umi_miRNA_mix.fa -t ./simulated_miRNAs/true/D14_umi_miRNA_mix.fa -r ./simulated_miRNAs/correct/D14_umi_miRNA_mix.fasta -d ./result
noise2read -m evaluation -i ./simulated_miRNAs/raw/D16_umi_miRNA_subs.fa -t ./simulated_miRNAs/true/D16_umi_miRNA_subs.fa -r ./simulated_miRNAs/correct/D16_umi_miRNA_subs.fasta -d ./result
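For a rough, independent sanity check of the same three files, the sketch below counts how many reads in a file have no matching copy in the true file, before and after correction. It is not the metric set reported by `noise2read -m evaluation`. The function names are hypothetical, and the comparison treats each file as a multiset of sequences, which is an assumption made here to avoid relying on record order.

```python
from collections import Counter
from typing import Iterable, List


def read_fasta_sequences(lines: Iterable[str]) -> List[str]:
    """Return the sequences from FASTA-formatted lines, joining wrapped lines."""
    seqs: List[str] = []
    current: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current:
                seqs.append("".join(current))
                current = []
        else:
            current.append(line.upper())
    if current:
        seqs.append("".join(current))
    return seqs


def surplus_reads(observed: Iterable[str], truth: Iterable[str]) -> int:
    """Count reads in observed that have no matching copy in truth."""
    return sum((Counter(observed) - Counter(truth)).values())


# Toy data standing in for the true, raw, and corrected files.
truth = ["AAAA", "AAAA", "AAAA", "CCCC"]
raw = ["AAAA", "AAAA", "AAAT", "CCCC"]
corrected = ["AAAA", "AAAA", "AAAA", "CCCC"]
print(surplus_reads(raw, truth))        # 1
print(surplus_reads(corrected, truth))  # 0
```

To apply it to real data, open each FASTA file and pass the file object to `read_fasta_sequences`, for example `read_fasta_sequences(open(path))` with the paths used in the commands above.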
Correcting D14
With correction of high ambiguous errors, using the GPU for training
noise2read -m correction -c ../../config/D14.ini -a True -g gpu_hist
Without correction of high ambiguous errors, using the CPU (default) for training
noise2read -m correction -c ../../config/D14.ini -a False
Correcting D16
With correction of high ambiguous errors, using the GPU for training
noise2read -m correction -c ../../config/D16.ini -a True -g gpu_hist
Without correction of high ambiguous errors, using the CPU (default) for training
noise2read -m correction -c ../../config/D16.ini -a False
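The four correction runs above differ only in the config file and in the `-a` and `-g` options, so they can be generated from one helper. The sketch below builds the same command lines using only the flags shown in this section. The helper names are hypothetical, and the script prints the commands rather than running them.

```python
import shlex
from typing import List


def correction_command(config: str, ambiguous: bool, gpu: bool) -> List[str]:
    """Build a noise2read correction command from the flags used in this section."""
    cmd = [
        "noise2read", "-m", "correction",
        "-c", config,
        "-a", "True" if ambiguous else "False",
    ]
    if gpu:
        cmd += ["-g", "gpu_hist"]  # omitted for the CPU (default) runs
    return cmd


def all_runs() -> List[List[str]]:
    """Return the GPU and CPU variants for D14 and D16, in that order."""
    runs = []
    for dataset in ("D14", "D16"):
        config = f"../../config/{dataset}.ini"
        runs.append(correction_command(config, ambiguous=True, gpu=True))
        runs.append(correction_command(config, ambiguous=False, gpu=False))
    return runs


if __name__ == "__main__":
    for cmd in all_runs():
        print(shlex.join(cmd))
```

To execute the runs instead of printing them, each list can be passed to `subprocess.run(cmd, check=True)` from the `noise2read/Examples/simulated_miRNAs` directory, assuming noise2read is installed and on the PATH.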
Examples of correcting outcome sequences of ABEs and CBEs using noise2read
Clone the code
git clone https://github.com/Jappy0/noise2read
cd noise2read/CaseStudies
mkdir ABEs_CBEs
cd ABEs_CBEs
Download datasets D32_D33.
Use noise2read to correct the datasets. Each experiment takes about 13 minutes using 26 cores and a GPU for training.
noise2read -m correction -i ./D32_D33/raw/D32_ABE_outcome_seqs.fasta -a False -d ./ABE/
noise2read -m correction -i ./D32_D33/raw/D33_CBE_outcome_seqs.fasta -a False -d ./CBE/
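The project description states that correction does not introduce non-existent sequences into the read set. Under that reading, every distinct sequence in a corrected file should already occur in the raw file. The sketch below checks this. The function names are hypothetical, and the name of the corrected file written under `./ABE/` or `./CBE/` is not given in this section, so it must be supplied by the user.

```python
from typing import Iterable, List, Set


def fasta_sequences(lines: Iterable[str]) -> Set[str]:
    """Collect the distinct sequences from FASTA-formatted lines."""
    seqs: Set[str] = set()
    parts: List[str] = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if parts:
                seqs.add("".join(parts))
            parts = []
        elif line:
            parts.append(line.upper())
    if parts:
        seqs.add("".join(parts))
    return seqs


def novel_sequences(raw_lines: Iterable[str], corrected_lines: Iterable[str]) -> Set[str]:
    """Return sequences present after correction that never occur in the raw input."""
    return fasta_sequences(corrected_lines) - fasta_sequences(raw_lines)


# Toy example: the rare read ACGA is rewritten to the existing read ACGT.
raw = [">r1", "ACGT", ">r2", "ACGA", ">r3", "ACGT"]
corrected = [">r1", "ACGT", ">r2", "ACGT", ">r3", "ACGT"]
print(novel_sequences(raw, corrected))  # set()
```

For the datasets above, pass open file objects for `./D32_D33/raw/D32_ABE_outcome_seqs.fasta` and for the corrected FASTA produced in the output directory. An empty result is consistent with the stated claim.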
Hashes for noise2read-0.0.99-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 61899d52e586f41d54a35b8f1f607b2b7a57e7d912a5f59a80abc990e396bf47
MD5 | 6034997fd8afdb30965eef5cdfb87f53
BLAKE2b-256 | 2dbeaca2fe12eb57b807fdbd7d064fa1f67a42711e958e51c142d07b7c12ffaf