
UMI error correct


# umierrorcorrect

Pipeline for analyzing barcoded amplicon sequencing data with unique molecular identifiers (UMIs).

Reference

UMIErrorCorrect has been published in Clinical Chemistry.

[Link to the Umierrorcorrect paper](https://doi.org/10.1093/clinchem/hvac136)

Österlund T., Filges S., Johansson G., Ståhlberg A. UMIErrorCorrect and UMIAnalyzer: Software for Consensus Read Generation, Error Correction, and Visualization Using Unique Molecular Identifiers. Clinical Chemistry, 2022, hvac136.

Installation

To run Umierrorcorrect via Docker, see the [Docker documentation](doc/docker.md).

To install the umierrorcorrect pipeline with pip, open a terminal and type the following:

`pip install umierrorcorrect`

After installation, check that the pipeline runs by printing its help text:

`run_umierrorcorrect.py -h`

Dependencies

umierrorcorrect runs on Python 3 and requires the following programs and libraries (if you run it through Docker, all dependencies are already handled):

Python-libraries (should be installed automatically):

pysam (v 0.8.4 or greater)

External programs:

bwa (the `bwa mem` command is used)

Either gzip or pigz (parallel gzip)

Install the external programs and add them to your PATH.

Because the umierrorcorrect pipeline uses bwa to map reads, a bwa-indexed reference genome is required. Index the reference genome with the command `bwa index -a bwtsw reference.fa`.
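Before the first run, it can help to confirm that the external programs are actually reachable. The following is a minimal sketch, not part of umierrorcorrect: the helper names `find_missing_tools` and `bwa_index_command` are hypothetical, and the only facts taken from this README are the tool names and the `bwa index -a bwtsw` command.

```python
import shutil


def find_missing_tools(required=("bwa",), any_of=(("gzip", "pigz"),)):
    """Return the names of external tools that are not found on PATH.

    `required` lists tools that must all be present; each tuple in
    `any_of` is a group where one installed member is enough.
    """
    missing = [tool for tool in required if shutil.which(tool) is None]
    for group in any_of:
        if not any(shutil.which(tool) for tool in group):
            missing.append(" or ".join(group))
    return missing


def bwa_index_command(reference="reference.fa"):
    """Build the indexing command from this README as an argument list."""
    return ["bwa", "index", "-a", "bwtsw", reference]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All external programs found.")
        # Pass this list to subprocess.run(..., check=True) to build the index.
        print(" ".join(bwa_index_command()))
```

The script only reports what it finds; it does not install anything or start indexing on its own.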

Usage

Example syntax for running the whole pipeline:

`run_umierrorcorrect.py -r1 read1.fastq.gz -r2 read2.fastq.gz -ul umi_length -sl spacer_length -r reference_fasta_file.fasta -o output_directory`
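When the pipeline is launched from a script, for example once per sample, the command above can be assembled as an argument list. This is a sketch under assumptions: `build_pipeline_command` is a hypothetical helper, and the UMI length (12) and spacer length (16) are placeholder values that must be replaced with those of your own assay. Only the flags shown in the example syntax are used.

```python
import shlex


def build_pipeline_command(read1, read2, umi_length, spacer_length,
                           reference, output_dir):
    """Assemble the run_umierrorcorrect.py call using the flags shown above."""
    return [
        "run_umierrorcorrect.py",
        "-r1", read1,
        "-r2", read2,
        "-ul", str(umi_length),
        "-sl", str(spacer_length),
        "-r", reference,
        "-o", output_dir,
    ]


# Placeholder values; substitute the UMI and spacer lengths of your library.
cmd = build_pipeline_command(
    "read1.fastq.gz", "read2.fastq.gz",
    umi_length=12, spacer_length=16,
    reference="reference_fasta_file.fasta",
    output_dir="output_directory",
)

# Print the command for inspection; to run it, use
# subprocess.run(cmd, check=True) from the standard library.
print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.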

The run_umierrorcorrect.py pipeline performs the following steps:

  • Preprocessing of the fastq files (removes the UMI and spacer from each read and puts the UMI in the read header)

  • Mapping of preprocessed fastq reads to the reference genome

  • Perform UMI clustering, then error correction of each UMI cluster

  • Create consensus reads (one representative read per UMI cluster written to a BAM file)

  • Create a consensus output file (collapsed counts per position)

  • Perform variant calling.
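To make the preprocessing and consensus steps listed above concrete, here is a toy illustration. It is not the umierrorcorrect implementation: it groups reads by exact UMI match and takes a per-position majority vote over unaligned strings, whereas the real pipeline clusters UMIs and works on reads mapped with bwa. The function names, the `:` separator used in the header, and the example reads are all assumptions made for the sketch.

```python
from collections import Counter, defaultdict


def split_umi(header, sequence, umi_length, spacer_length):
    """Cut the UMI and spacer off the read start and append the UMI to the header.

    The ':' separator is an assumption for this sketch, not the tool's format.
    """
    umi = sequence[:umi_length]
    insert = sequence[umi_length + spacer_length:]
    return f"{header}:{umi}", insert


def consensus(sequences):
    """Return the most common base at each position of equal-length sequences."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*sequences)
    )


# Made-up reads: a 4-base UMI, a 2-base spacer, then the amplicon sequence.
reads = [
    ("@r1", "AAAACC" + "GATTACA"),
    ("@r2", "AAAACC" + "GATTACA"),
    ("@r3", "AAAACC" + "GATTGCA"),  # one simulated sequencing error
    ("@r4", "TTTTCC" + "CATTACA"),
]

# Group the trimmed reads by the UMI stored in the rewritten header.
groups = defaultdict(list)
for header, seq in reads:
    new_header, insert = split_umi(header, seq, umi_length=4, spacer_length=2)
    umi = new_header.rsplit(":", 1)[1]
    groups[umi].append(insert)

# One consensus read per UMI group.
consensus_reads = {umi: consensus(seqs) for umi, seqs in groups.items()}
print(consensus_reads)  # {'AAAA': 'GATTACA', 'TTTT': 'CATTACA'}
```

The single mismatching base in `@r3` is outvoted by the two other reads carrying the same UMI, which is the basic idea behind UMI-based error correction.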

It is also possible to run the pipeline step by step.

To see the options for each step, type the following:

`preprocess.py -h`

`run_mapping.py -h`

`umi_error_correct.py -h`

`get_consensus_statistics.py -h`

`call_variants.py -h`

`filter_bam.py -h`

`filter_cons.py -h`

Tutorial

[Link to the Umierrorcorrect tutorial](https://github.com/stahlberggroup/umierrorcorrect/wiki/Tutorial)

Example of UMI definition options

[UMI definition options](https://github.com/stahlberggroup/umierrorcorrect/wiki/UMI-definition-options)
