NucFlag misassembly identifier.

NucFlag

Generates nucleotide frequency plots and genome misassembly BED files. Fork of NucFreq.

Quickstart

# Requires Python>=3.11
pip install nucflag==1.0.0a4

Note: NucFlag v1.0.0-alpha.4 is not installable via bioconda at the moment.

Align long reads to the assembly.

# Align long reads to the assembly generated from those reads.
# `-a` emits SAM (minimap2 defaults to PAF); sort by coordinate so the BAM can be indexed.
minimap2 -ax lr:hqae -I 8G asm.fa.gz reads.fq.gz | samtools sort -o asm.bam -
samtools index asm.bam

Detect misassemblies.

# Call putative misassemblies from read alignments on a whole genome.
nucflag call -i asm.bam -f asm.fa.gz -o misassemblies.bed
# Or on a set of regions...
nucflag call -i asm.bam -f asm.fa.gz -b regions.bed -o misassemblies.bed
# Also runs on ONT read alignments.
nucflag call -i asm_ont_r9.bam -f asm.fa.gz -x ont_r9 -o misassemblies.bed
nucflag call -i asm_ont_r10.bam -f asm.fa.gz -x ont_r10 -o misassemblies.bed
# Provide a configfile for finer control.
nucflag call -i asm.bam -f asm.fa.gz -o misassemblies.bed -c config.toml
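
The `-b regions.bed` option above takes a BED file of regions. One way to produce a whole-contig regions file is from a FASTA index. The following is a minimal sketch, assuming NucFlag accepts plain three-column BED (chrom, start, end); `fai_to_bed` is a hypothetical helper name and not part of NucFlag.

```shell
# Hypothetical helper (not part of NucFlag): convert a FASTA index (.fai)
# read from stdin into whole-contig BED3 records: chrom, 0, length.
# The .fai format stores the sequence name in column 1 and its length in column 2.
fai_to_bed() {
    awk 'BEGIN { FS = OFS = "\t" } NF >= 2 { print $1, 0, $2 }'
}
```

For example, after `samtools faidx asm.fa.gz` (which needs a bgzip-compressed FASTA), `fai_to_bed < asm.fa.gz.fai > regions.bed` writes one region per contig; the file can then be trimmed by hand to the contigs of interest.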

Visualize misassemblies in a number of ways.

# Generate NucFreq plots.
nucflag call -i asm.bam -f asm.fa.gz -d plots
# And add any number of tracks...
nucflag call -i asm.bam -f asm.fa.gz -d plots --tracks repeatmasker.bed segdups.bed
# Or generate bigWigs of specific signals, then merge them with `bigtools`
# for use in IGV or other genome browsers.
nucflag call -i asm.bam -f asm.fa.gz -d plots \
    --output_pileup_dir bigwigs \
    --add_pileup_data cov mismatch mapq
bigwigmerge -l <(find bigwigs -name "*_first.bw") merged_first.bw

Generate a status BED, or a breakdown plot showing the distribution of assembly issues.

nucflag status -i misassemblies.bed > status.bed
# Or plot the breakdown, here by "percent" rather than "length".
nucflag breakdown -i misassemblies.bed -o breakdown -t percent
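
For a quick look at the same distribution without plotting, the calls can be tallied directly from the BED file. This is a rough sketch, not what `nucflag status` or `nucflag breakdown` computes: it assumes a tab-separated BED with the misassembly label in column 4, and `summarize_bed` is a hypothetical helper name.

```shell
# Hypothetical helper (not part of NucFlag): per-label interval count and
# total flagged bases. ASSUMPTION: tab-separated BED with the label in column 4.
summarize_bed() {
    awk 'BEGIN { FS = OFS = "\t" }
         /^#/ || NF < 4 { next }              # skip comment lines and short records
         { n[$4]++; bp[$4] += $3 - $2 }       # BED intervals are half-open: length = end - start
         END { for (k in n) print k, n[k], bp[k] }' | LC_ALL=C sort
}
```

Running `summarize_bed < misassemblies.bed` prints one line per label: the label, the number of intervals, and their summed length in bases.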

Estimate QV from BED file.

nucflag qv -i misassemblies.bed > qv.bed
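
As background on what a QV estimate expresses, a Phred-scaled quality value is QV = -10 * log10(error bases / total bases). The sketch below applies that formula under a loud assumption: every base inside a flagged interval is counted as an error. `nucflag qv` may weight or select calls differently, so treat this as an illustration of the scale only; `bed_to_qv` is a hypothetical helper name.

```shell
# Hypothetical helper (not part of NucFlag): Phred-scaled QV from a BED file.
# ASSUMPTION: every base in a flagged interval counts as one error base.
# usage: bed_to_qv TOTAL_ASSEMBLY_BASES < misassemblies.bed
bed_to_qv() {
    awk -v total="$1" 'BEGIN { FS = "\t" }
        /^#/ || NF < 3 { next }               # skip comment lines and short records
        { err += $3 - $2 }                    # overlapping intervals are counted twice
        END {
            if (total + 0 <= 0) { print "NA"; exit }
            if (err == 0) { print "inf"; exit }
            printf "%.2f\n", -10 * log(err / total) / log(10)
        }'
}
```

With 1,000 flagged bases in a 1,000,000-base assembly, the error rate is 1e-3 and `bed_to_qv 1000000` prints `30.00`.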

Generate ideogram.

nucflag ideogram -i misassemblies.bed -o ideogram
# Add cytobands from a BED file with the columns chrom, start, end, name, and btype.
nucflag ideogram -i misassemblies.bed -c cytobands.bed -o ideogram

Get consensus misassembly calls by intersection.

nucflag consensus -i nucflag_ont.bed nucflag_hifi.bed hmm_flagger_hifi.bed hmm_flagger_ont.bed > consensus.bed
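
To make "by intersection" concrete, the sketch below reports the regions shared by two BED files. It is a plain pairwise interval intersection written for illustration; `nucflag consensus` may apply additional rules (labels, overlap thresholds) that are not documented here, and `bed_intersect` is a hypothetical helper name.

```shell
# Hypothetical helper (not part of NucFlag): print the regions covered by
# both BED files. Quadratic in the number of intervals, so only a sketch.
# usage: bed_intersect a.bed b.bed
bed_intersect() {
    awk 'BEGIN { FS = OFS = "\t" }
         /^#/ || NF < 3 { next }              # skip comment lines and short records
         NR == FNR {                          # first file: remember every interval
             n++; chrom[n] = $1; st[n] = $2 + 0; en[n] = $3 + 0; next
         }
         {                                    # second file: compare against stored intervals
             bs = $2 + 0; be = $3 + 0
             for (i = 1; i <= n; i++) {
                 if (chrom[i] != $1) continue
                 s = (st[i] > bs) ? st[i] : bs    # overlap start = larger start
                 e = (en[i] < be) ? en[i] : be    # overlap end = smaller end
                 if (s < e) print $1, s, e
             }
         }' "$1" "$2"
}
```

More than two call sets can be handled by chaining through an intermediate file, for example `bed_intersect nucflag_ont.bed nucflag_hifi.bed > tmp.bed` followed by `bed_intersect tmp.bed hmm_flagger_hifi.bed`.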

Generate sample config.

nucflag config -x ont_r10 > config_r10.toml

Input

  • BAM file of PacBio HiFi, ONT R9, or ONT R10 reads aligned to an assembly.
  • (Recommended) Assembly FASTA.
  • (Optional) BED file of regions.

Output

  • BED file of misassemblies.
  • (Optional) Plots with coverage and mismatch pileup with misassemblies flagged.
  • (Optional) bigWigs of pileup signals.
  • (Optional) BED file of assembly status.

Documentation

Read the docs at the NucFlag wiki for more information.
