NucFlag misassembly identifier.

NucFlag

Generates nucleotide frequency plots and genome misassembly BED files. Fork of NucFreq.

[Figure: labeled misassemblies]

Quickstart

# Requires Python>=3.12
pip install nucflag==1.0.0a2

Note: NucFlag v1.0.0-alpha.2 is not currently installable via bioconda.

Align long reads to the assembly.

# Align long reads to the assembly generated from those reads.
# -a emits SAM (minimap2 writes PAF by default); sort so the BAM can be indexed.
minimap2 -a -x lr:hqae -I 8G asm.fa.gz reads.fq.gz | samtools sort -o asm.bam -
samtools index asm.bam
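
Since `samtools index` only works on a coordinate-sorted BAM, it can help to check the header before calling misassemblies. The following is a minimal sketch; the helper name `is_coordinate_sorted` is hypothetical and not part of NucFlag or samtools. It only inspects the `@HD` header line, so it trusts whatever the header declares.

```shell
# Hypothetical helper (not part of NucFlag or samtools).
# Reads a SAM/BAM header on stdin and succeeds only if the @HD line
# declares coordinate sorting (SO:coordinate).
is_coordinate_sorted() {
    grep -q '^@HD.*SO:coordinate'
}

# Example usage (requires samtools and an existing asm.bam):
#   samtools view -H asm.bam | is_coordinate_sorted && echo "asm.bam is coordinate-sorted"
```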

Detect misassemblies.

# Call putative misassemblies from read alignments on a whole genome.
nucflag call -i asm.bam -f asm.fa.gz -o misassemblies.bed
# Or on a set of regions...
nucflag call -i asm.bam -f asm.fa.gz -b regions.bed -o misassemblies.bed
# Also runs on ONT read alignments.
nucflag call -i asm_ont_r9.bam -f asm.fa.gz -x ont_r9 -o misassemblies.bed
nucflag call -i asm_ont_r10.bam -f asm.fa.gz -x ont_r10 -o misassemblies.bed
# Provide a configfile for finer control.
nucflag call -i asm.bam -f asm.fa.gz -o misassemblies.bed -c config.toml
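
To get a quick overview of the resulting BED file, the calls can be tallied per label. This is a rough sketch that assumes the file is tab-separated and that column 4 holds the call label; that column layout is an assumption about the output, and `summarize_bed` is a hypothetical helper, not a NucFlag command.

```shell
# Hypothetical helper (not part of NucFlag).
# Assumes a tab-separated BED whose 4th column is the call label.
# Prints: label <TAB> number of intervals <TAB> total bases, sorted by label.
summarize_bed() {
    awk -F'\t' '
        # Skip blank lines and comment lines.
        /^$/ || /^#/ { next }
        # BED intervals are zero-based and half-open, so length is end - start.
        { n[$4]++; bp[$4] += $3 - $2 }
        END { for (k in n) printf "%s\t%d\t%d\n", k, n[k], bp[k] }
    ' "$1" | LC_ALL=C sort
}

# Example usage:
#   summarize_bed misassemblies.bed
```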

Visualize misassemblies in a number of ways.

# Generate NucFreq plots.
nucflag call -i asm.bam -f asm.fa.gz -d plots
# And add any number of tracks...
nucflag call -i asm.bam -f asm.fa.gz -d plots --tracks repeatmasker.bed segdups.bed
# Or generate bigWigs of specific signals and then merge them with `bigtools`.
# For use in IGV or other genome browsers.
nucflag call -i asm.bam -f asm.fa.gz -d plots \
    --output_pileup_dir bigwigs \
    --add_pileup_data cov mismatch mapq
bigwigmerge -l <(find bigwigs -name "*_first.bw") merged_first.bw

Generate status BED or breakdown showing distribution of assembly issues.

nucflag status -i misassemblies.bed > status.bed
# Or plot a breakdown of assembly issues (here as percentages).
nucflag breakdown -i misassemblies.bed -o breakdown -t percent

Estimate QV from BED file.

nucflag qv -i misassemblies.bed > qv.bed
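
For orientation, a QV is conventionally the Phred-scaled error rate, QV = -10 * log10(erroneous bases / total bases). The sketch below illustrates only that generic relationship; whether `nucflag qv` counts errors in exactly this way is an assumption, and `phred_qv` is a hypothetical helper.

```shell
# Hypothetical helper illustrating the generic Phred relationship.
# It is not NucFlag's implementation.
# Usage: phred_qv <erroneous_bases> <total_bases>
phred_qv() {
    awk -v e="$1" -v t="$2" 'BEGIN {
        # Reject inputs that do not describe a valid error rate.
        if (t <= 0 || e < 0 || e > t) { print "NA"; exit 1 }
        # No errors means the QV is unbounded.
        if (e == 0) { print "inf"; exit 0 }
        # awk only provides the natural log, so divide by log(10).
        printf "%.2f\n", -10 * log(e / t) / log(10)
    }'
}

# Example usage: one erroneous base per 1,000 bases
#   phred_qv 1 1000
```

Under this definition, each tenfold drop in the error rate adds 10 to the QV.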

Generate ideogram.

nucflag ideogram -i misassemblies.bed -o ideogram
# Add cytobands from a BED file with the columns chrom, start, end, name, and btype.
nucflag ideogram -i misassemblies.bed -c cytobands.bed -o ideogram
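
Because the cytoband file needs five columns, a quick sanity check before plotting can catch malformed lines. This sketch assumes the file is tab-separated with integer coordinates; `check_bed5` is a hypothetical helper, and it does not validate the contents of the name or btype columns.

```shell
# Hypothetical helper (not part of NucFlag).
# Checks that every data line has at least 5 tab-separated columns
# and integer coordinates with start < end. Exits non-zero on any problem.
check_bed5() {
    awk -F'\t' '
        # Skip blank lines and comment lines.
        /^$/ || /^#/ { next }
        NF < 5 {
            printf "line %d: expected 5 columns, found %d\n", NR, NF
            bad = 1
            next
        }
        ($2 !~ /^[0-9]+$/) || ($3 !~ /^[0-9]+$/) || ($2 + 0 >= $3 + 0) {
            printf "line %d: start and end must be integers with start < end\n", NR
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}

# Example usage:
#   check_bed5 cytobands.bed && nucflag ideogram -i misassemblies.bed -c cytobands.bed -o ideogram
```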

Get consensus misassembly calls by intersection.

nucflag consensus -i nucflag_ont.bed nucflag_hifi.bed hmm_flagger_hifi.bed hmm_flagger_ont.bed > consensus.bed
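
To illustrate what intersecting call sets means, the sketch below reports the overlapping portion of intervals from two BED files. It is a simplified two-file illustration and not how `nucflag consensus` is implemented; `intersect_bed` is a hypothetical helper. It compares every pair of intervals, so it is only practical for small files, and it assumes the first file is not empty.

```shell
# Hypothetical helper illustrating pairwise interval intersection.
# It is not NucFlag's consensus algorithm.
# Usage: intersect_bed a.bed b.bed
intersect_bed() {
    awk -F'\t' -v OFS='\t' '
        # First file: remember every interval.
        NR == FNR { chrom[NR] = $1; lo[NR] = $2 + 0; hi[NR] = $3 + 0; n = NR; next }
        # Second file: compare against every stored interval.
        {
            for (i = 1; i <= n; i++) {
                if (chrom[i] != $1) continue
                s = (lo[i] > $2 + 0) ? lo[i] : $2 + 0
                e = (hi[i] < $3 + 0) ? hi[i] : $3 + 0
                # Half-open intervals overlap only if the shared span is non-empty.
                if (s < e) print $1, s, e
            }
        }
    ' "$1" "$2" | LC_ALL=C sort -k1,1 -k2,2n
}

# Example usage:
#   intersect_bed nucflag_ont.bed nucflag_hifi.bed > shared.bed
```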

Generate sample config.

nucflag config -x ont_r10 > config_r10.toml

Input

  • BAM file of PacBio HiFi, ONT R9, or ONT R10 reads aligned to an assembly.
  • (Recommended) Assembly.
  • (Optional) BED file of regions.

Output

  • BED file of misassemblies.
  • (Optional) Plots with coverage and mismatch pileup with misassemblies flagged.
  • (Optional) bigWigs of pileup signals.
  • (Optional) BED file of assembly status.

Documentation

Read the docs at the NucFlag wiki for more information.
