Skip to main content

NucFlag misassembly identifier.

Project description

NucFlag

CI PyPI - Version install with bioconda

Generates nucleotide frequency plots and genome misassembly BED files. Fork of NucFreq.

Labeled Misassemblies

Quickstart

pip install nucflag==1.0.0a2

[!NOTE] NucFlag v1.0.0-alpha.2 is not installable via bioconda at the moment.

Align long-reads to assembly.

# Align long reads to assembly generated from those reads.
minimap2 -x lr:hqae -I 8G asm.fa.gz reads.fq.gz | samtools view -bh -o asm.bam
samtools index asm.bam

Detect misassemblies.

# Call putative misassemblies from read alignments on a whole genome.
nucflag call -i asm.bam -f asm.fa.gz -o misassemblies.bed
# Or on a set of regions...
nucflag call -i asm.bam -f asm.fa.gz -b regions.bed -o misassemblies.bed
# Also runs on ONT read alignments.
nucflag call -i asm_ont_r9.bam -f asm.fa.gz -x ont_r9 -o misassemblies.bed
nucflag call -i asm_ont_r10.bam -f asm.fa.gz -x ont_r10 -o misassemblies.bed
# Provide a configfile for finer control.
nucflag call -i asm.bam -f asm.fa.gz -o misassemblies.bed -c config.toml

Visualize misassemblies in a number of ways.

# Generate NucFreq plots.
nucflag call -i asm.bam -f asm.fa.gz -d plots
# And add any number of tracks...
nucflag call -i asm.bam -f asm.fa.gz -d plots --tracks repeatmasker.bed segdups.bed
# Or generate bigWigs of specific signals and then merge them with `bigtools`.
# For use in IGV or other genome browsers.
nucflag call -i asm.bam -f asm.fa.gz -d plots \
    --output_pileup_dir bigwigs \
    --add_pileup_data cov mismatch mapq
bigwigmerge -l <(find bigwigs -name "*_first.bw") merged_first.bw

Generate status BED or breakdown showing distribution of assembly issues.

nucflag status -i misassemblies.bed > status.bed
# Or plot by "length"
nucflag breakdown -i misassemblies.bed -o breakdown -t percent

Estimate QV from BED file.

nucflag qv -i misassemblies.bed > qv.bed

Generate ideogram.

nucflag ideogram -i misassemblies.bed -o ideogram
# Add cytobands with a BED file with chrom, start, end, name, and btype.
nucflag ideogram -i misassemblies.bed -c cytobands.bed -o ideogram

Get consensus misassembly calls by intersection.

nucflag consensus -i nucflag_ont.bed nucflag_hifi.bed hmm_flagger_hifi.bed hmm_flagger_ont.bed > consensus.bed

Generate sample config.

nucflag config -x ont_r10 > config_r10.toml

Input

  • BAM file of PacBio HiFi, ONT R9, or ONT R10 reads aligned to an assembly.
  • (Recommended) Assembly.
  • (Optional) BED file of regions.

Output

  • BED file of misassemblies.
  • (Optional) Plots with coverage and mismatch pileup with misassemblies flagged.
  • (Optional) bigWigs of pileup signals.
  • (Optional) BED file of assembly status.

Documentation

Read the docs at the NucFlag wiki for more information.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

nucflag-1.0.0a2.tar.gz (25.3 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

nucflag-1.0.0a2-py3-none-any.whl (31.7 kB view details)

Uploaded Python 3

File details

Details for the file nucflag-1.0.0a2.tar.gz.

File metadata

  • Download URL: nucflag-1.0.0a2.tar.gz
  • Upload date:
  • Size: 25.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for nucflag-1.0.0a2.tar.gz
Algorithm Hash digest
SHA256 5f21d9361cfb1c974ebfb74723b2a0a094aef338aff8b6aa0226eb92e1a2ed8f
MD5 035d0ad9be2ca8a52dcb648568e15694
BLAKE2b-256 0ac976f2f847fa1494634fdba36d216421e74fbb3ea7dc7ea01f9fedae1b05b1

See more details on using hashes here.

File details

Details for the file nucflag-1.0.0a2-py3-none-any.whl.

File metadata

  • Download URL: nucflag-1.0.0a2-py3-none-any.whl
  • Upload date:
  • Size: 31.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for nucflag-1.0.0a2-py3-none-any.whl
Algorithm Hash digest
SHA256 1ce39ada1959430e2cbcf9c8162511a8d71e39fe0445d9bb9d2b82a7d777c056
MD5 69157bf33ecc02950566f09ed0acb8c1
BLAKE2b-256 4ff98ce404e397ee80cfa8b370fce3fadef266694c3ec9d4150610f96e3da3fa

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page