Locally reassemble haplotypes from BAM files generated from whole-genome sequencing (WGS) data.

Local Reassembly

Local Reassembly is a tool for locally reassembling reads from a mapped BAM file. It has two modes of operation: "assembly" and "haplotype". In "assembly" mode, it performs de novo assembly of the mapped reads with SPAdes or MEGAHIT (selected with the -a option; MEGAHIT is the default). In "haplotype" mode, it uses WhatsHap to perform haplotype assembly of the mapped reads. The output is a FASTA file containing the assembled contigs.
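
Since the output is a FASTA file of contigs, a quick way to inspect a run is to list the contig names and lengths. The following is a minimal sketch in plain Python; the function name contig_lengths is hypothetical and not part of this package, and the name of the output file written by the tool is not stated here, so the reader is expected to supply the path.

```python
from typing import Dict, Iterable


def contig_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Map each FASTA record name to the length of its sequence.

    Hypothetical helper for illustration; not part of local_reassembly.
    """
    lengths: Dict[str, int] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            # Sequence lines may be wrapped, so lengths accumulate per record.
            lengths[name] += len(line)
    return lengths


if __name__ == "__main__":
    # In-memory example; the records below are made up for demonstration.
    demo = [">contig_1 demo", "ACGTAC", "GT", ">contig_2", "TTTT"]
    for contig, length in contig_lengths(demo).items():
        print(contig, length)
```

To use it on a real result, open the FASTA file produced in the output directory and pass the file handle to contig_lengths.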

Installation

Local Reassembly is a Python package that can be installed using pip. To install the package, run the following command:

pip install local_reassembly

We also recommend installing with Docker; see the Dockerfile for more information.

Usage

  • Assembly
usage: reloc reassm [-h] [-o OUTPUT_DIR] [-d] [-m {assembly,haplotype}] [-p] [-a {spades,megahit}] input_genome_file input_bam_file region

Local reassembly

positional arguments:
  input_genome_file     input genome file in FASTA format
  input_bam_file        input BAM file
  region                genomic region in the format chr:start-end

optional arguments:
  -h, --help            show this help message and exit
  -o OUTPUT_DIR, --output_dir OUTPUT_DIR
                        output directory for the reassembly files
  -d, --debug           debug mode, default False
  -m {assembly,haplotype}, --mode {assembly,haplotype}
                        mode of operation: "assembly" for local assembly, "haplotype" for haplotype reconstruction
  -p, --polish          whether to polish the assembly with Pilon, default False
  -a {spades,megahit}, --assembler {spades,megahit}
                        assembler to use, default is megahit
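
As an illustration of the options above, the sketch below checks a region string against the chr:start-end format and assembles the argument list for reloc reassm. The helpers parse_region and build_reassm_command are hypothetical wrappers, not part of this package, and the file names in the example are placeholders. The region check assumes plain integer coordinates and a chromosome name without colons.

```python
import re
from typing import List, Optional, Tuple

# Assumed region syntax: name without ':' or whitespace, then start-end as integers.
REGION_RE = re.compile(r"^([^:\s]+):(\d+)-(\d+)$")


def parse_region(region: str) -> Tuple[str, int, int]:
    """Split 'chr:start-end' into (chrom, start, end), raising on bad input."""
    match = REGION_RE.match(region)
    if match is None:
        raise ValueError(f"region must look like chr:start-end, got {region!r}")
    chrom, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError(f"region start {start} is greater than end {end}")
    return chrom, start, end


def build_reassm_command(
    genome: str,
    bam: str,
    region: str,
    output_dir: Optional[str] = None,
    mode: Optional[str] = None,
    assembler: Optional[str] = None,
    polish: bool = False,
) -> List[str]:
    """Build the argv list for 'reloc reassm' from the documented options."""
    parse_region(region)  # fail early on a malformed region
    if mode is not None and mode not in ("assembly", "haplotype"):
        raise ValueError("mode must be 'assembly' or 'haplotype'")
    if assembler is not None and assembler not in ("spades", "megahit"):
        raise ValueError("assembler must be 'spades' or 'megahit'")
    cmd = ["reloc", "reassm"]
    if output_dir is not None:
        cmd += ["-o", output_dir]
    if mode is not None:
        cmd += ["-m", mode]
    if assembler is not None:
        cmd += ["-a", assembler]
    if polish:
        cmd.append("-p")
    # Positional arguments come last, in the order shown in the usage line.
    cmd += [genome, bam, region]
    return cmd


if __name__ == "__main__":
    # Placeholder file names; replace with real paths.
    print(" ".join(build_reassm_command(
        "genome.fa", "sample.bam", "chr1:1000-2000",
        output_dir="out", mode="assembly", assembler="spades", polish=True)))
```

The returned list can be passed to subprocess.run(cmd, check=True) once the package is installed; options left as None are omitted so the tool's own defaults apply.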
  • Annotation
usage: reloc reanno [-h] [-o OUTPUT_PREFIX] [-t TMP_WORK_DIR] [-d] local_assem_fasta ref_pt_fasta ref_cDNA_fasta

Local reannotation

positional arguments:
  local_assem_fasta     local assembly FASTA file
  ref_pt_fasta          reference point FASTA file
  ref_cDNA_fasta        reference cDNA FASTA file

optional arguments:
  -h, --help            show this help message and exit
  -o OUTPUT_PREFIX, --output_prefix OUTPUT_PREFIX
                        output prefix for the reannotation files
  -t TMP_WORK_DIR, --tmp_work_dir TMP_WORK_DIR
                        temporary working directory, default is current directory
  -d, --debug           debug mode, default False
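
The reannotation step takes the local assembly FASTA plus two reference FASTA files. A minimal sketch of assembling that call from Python follows; build_reanno_command is a hypothetical helper and all file names are placeholders. The name of the FASTA written by reloc reassm is not documented here, so it is passed in explicitly.

```python
from typing import List, Optional


def build_reanno_command(
    local_assem_fasta: str,
    ref_pt_fasta: str,
    ref_cdna_fasta: str,
    output_prefix: Optional[str] = None,
    tmp_work_dir: Optional[str] = None,
    debug: bool = False,
) -> List[str]:
    """Build the argv list for 'reloc reanno' from the documented options."""
    cmd = ["reloc", "reanno"]
    if output_prefix is not None:
        cmd += ["-o", output_prefix]
    if tmp_work_dir is not None:
        # When omitted, the tool uses the current directory for temporary files.
        cmd += ["-t", tmp_work_dir]
    if debug:
        cmd.append("-d")
    # Positional order follows the usage line above.
    cmd += [local_assem_fasta, ref_pt_fasta, ref_cdna_fasta]
    return cmd


if __name__ == "__main__":
    # Placeholder paths for illustration only.
    print(" ".join(build_reanno_command(
        "local_assembly.fa", "ref_pt.fa", "ref_cDNA.fa",
        output_prefix="out/reanno", tmp_work_dir="tmp")))
```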
  • Gene pipeline

Build gene database

usage: reloc genedb [-h] [-g GENE_FLANK] [-i INTRON_FLANK] input_genome_file gene_gff_file db_path

Gene database generation

positional arguments:
  input_genome_file     input genome file in FASTA format
  gene_gff_file         gene annotation GFF file
  db_path               output database path

optional arguments:
  -h, --help            show this help message and exit
  -g GENE_FLANK, --gene_flank GENE_FLANK
                        gene flanking region size, default 2000
  -i INTRON_FLANK, --intron_flank INTRON_FLANK
                        intron flanking region size, default 500

Run local reassembly and reannotation for a gene

usage: reloc genepipe [-h] [-w WORK_DIR] [-d] [-m ASSEMBLY_MODE] [-a {spades,megahit}] [-p] gene_id genome_file gene_db_path bam_file

Gene pipeline

positional arguments:
  gene_id               gene ID to process
  genome_file           input genome file in FASTA format
  gene_db_path          path to the gene database, should be generated by genedb command
  bam_file              input BAM file

optional arguments:
  -h, --help            show this help message and exit
  -w WORK_DIR, --work_dir WORK_DIR
                        working directory, default is current directory
  -d, --debug           debug mode, default False
  -m ASSEMBLY_MODE, --assembly_mode ASSEMBLY_MODE
                        assembly mode, default is assembly
  -a {spades,megahit}, --assembler {spades,megahit}
                        assembler to use, default is megahit
  -p, --polish          whether to polish the assembly with Pilon, default False
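
The two gene pipeline commands are meant to be used in sequence: genedb builds the database once, and genepipe is then run per gene against it. The sketch below lays out that order as a list of argument lists. plan_gene_runs is a hypothetical helper, the gene IDs and file names are placeholders, and giving each gene its own subdirectory under a work root is an assumption made for tidiness; whether the tool creates missing directories is not stated here.

```python
from pathlib import PurePosixPath
from typing import Iterable, List, Optional


def plan_gene_runs(
    genome: str,
    gff: str,
    db_path: str,
    bam: str,
    gene_ids: Iterable[str],
    gene_flank: int = 2000,
    intron_flank: int = 500,
    work_root: Optional[str] = None,
) -> List[List[str]]:
    """Return the genedb command followed by one genepipe command per gene."""
    commands = [[
        "reloc", "genedb",
        "-g", str(gene_flank),    # documented default: 2000
        "-i", str(intron_flank),  # documented default: 500
        genome, gff, db_path,
    ]]
    for gene_id in gene_ids:
        cmd = ["reloc", "genepipe"]
        if work_root is not None:
            # Assumption: one working subdirectory per gene keeps outputs apart.
            cmd += ["-w", str(PurePosixPath(work_root) / gene_id)]
        # Positional order: gene_id genome_file gene_db_path bam_file.
        cmd += [gene_id, genome, db_path, bam]
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    # Placeholder inputs for illustration only.
    for command in plan_gene_runs("genome.fa", "genes.gff", "genes.db",
                                  "sample.bam", ["geneA", "geneB"],
                                  work_root="work"):
        print(" ".join(command))
```

Each inner list can be handed to subprocess.run(cmd, check=True) in order, so that a failure while building the database stops the run before any gene is processed.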
