NucFlag misassembly identifier.

Project description

NucFlag

Generates nucleotide frequency plots and genome misassembly BED files. Fork of NucFreq.

Figure: labeled misassemblies.

Quickstart

pip install nucflag
# Or conda install nucflag

Note: NucFlag v1.0.0-alpha.1 is not installable via bioconda at the moment.

Align long reads to the assembly.

# Align long reads to the assembly generated from those reads.
# -a makes minimap2 emit SAM; the BAM must be coordinate-sorted before indexing.
minimap2 -ax lr:hqae -I 8G asm.fa.gz reads.fq.gz | samtools sort -o asm.bam -
samtools index asm.bam

Detect misassemblies.

# Call putative misassemblies from read alignments on a whole genome.
nucflag call -i asm.bam -f asm.fa.gz -o misassemblies.bed
# Or on a set of regions...
nucflag call -i asm.bam -f asm.fa.gz -b regions.bed -o misassemblies.bed
# Also runs on ONT read alignments.
nucflag call -i asm_ont_r9.bam -f asm.fa.gz -x ont_r9 -o misassemblies.bed
nucflag call -i asm_ont_r10.bam -f asm.fa.gz -x ont_r10 -o misassemblies.bed
# Provide a configfile for finer control.
nucflag call -i asm.bam -f asm.fa.gz -o misassemblies.bed -c config.toml
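
The `-b regions.bed` option above takes a BED file of regions. One way to build such a file is from a FASTA index; the sketch below assumes an index such as `asm.fa.gz.fai` already exists (for example from `samtools faidx` on a bgzip-compressed assembly). `fai_to_bed` is a hypothetical helper name, not part of NucFlag.

```shell
# Hypothetical helper (not part of NucFlag): turn a FASTA index (.fai)
# into a BED file with one whole-contig interval per sequence.
# .fai columns: name, length, offset, linebases, linewidth.
fai_to_bed() {
    # BED is 0-based and half-open, so each contig spans [0, length).
    awk 'BEGIN { FS = OFS = "\t" } { print $1, 0, $2 }' "$@"
}

# Usage (file names are assumptions):
#   fai_to_bed asm.fa.gz.fai | awk '$1 == "chr1"' > regions.bed
```

Restricting the run to a few contigs this way keeps a first test run short before calling the whole genome.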

Visualize misassemblies as plots or as bigWig tracks.

# Generate NucFreq plots.
nucflag call -i asm.bam -f asm.fa.gz -d plots
# And add any number of tracks...
nucflag call -i asm.bam -f asm.fa.gz -d plots --tracks repeatmasker.bed segdups.bed
# Or write bigWigs of specific pileup signals and merge them with `bigtools`
# for use in IGV or other genome browsers.
nucflag call -i asm.bam -f asm.fa.gz -d plots \
    --output_pileup_dir bigwigs \
    --add_pileup_data cov mismatch mapq
bigwigmerge -l <(find bigwigs -name "*_first.bw") merged_first.bw

Generate a status BED file or a breakdown plot showing the distribution of assembly issues.

nucflag status -i misassemblies.bed > status.bed
# Or plot a breakdown by "percent" (or, presumably, by "length").
nucflag breakdown -i misassemblies.bed -o breakdown -t percent
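
For a quick look at the same distribution without plotting, a BED file can be tallied with `awk`. This is only a sketch: it assumes tab-separated columns with the call label in column 4, which should be checked against the actual output, and `bed_label_summary` is a hypothetical helper, not a NucFlag command.

```shell
# Hypothetical helper (not part of NucFlag): count calls and total bases
# per label in a BED file. Assumes the label sits in column 4.
bed_label_summary() {
    awk 'BEGIN { FS = OFS = "\t" }
         /^(#|track|browser)/ { next }      # skip header lines
         { n[$4]++; bp[$4] += $3 - $2 }     # BED intervals are half-open
         END { for (k in n) print k, n[k], bp[k] }' "$@" | sort
}

# Usage (file name is an assumption):
#   bed_label_summary misassemblies.bed
```

Each output line holds a label, the number of intervals, and the bases they cover.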

Estimate QV from the BED file.

nucflag qv -i misassemblies.bed > qv.bed
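
As background, a Phred-scaled QV is conventionally `-10 * log10(error bases / total bases)`. The sketch below applies that formula to the bases covered by a BED file. It is not necessarily how `nucflag qv` computes its estimate, and it counts every interval it is given as erroneous, so any non-error lines should be filtered out first. `bed_phred_qv` is a hypothetical helper.

```shell
# Hypothetical helper (not part of NucFlag): Phred-scaled QV from the
# bases covered by BED intervals, using QV = -10 * log10(err / total).
# Usage: bed_phred_qv TOTAL_BASES [BED...]
bed_phred_qv() {
    total="$1"; shift
    awk -v total="$total" 'BEGIN { FS = "\t" }
        /^(#|track|browser)/ { next }       # skip header lines
        { err += $3 - $2 }                  # BED intervals are half-open
        END {
            if (err == 0) { print "inf"; exit }
            printf "%.2f\n", -10 * log(err / total) / log(10)
        }' "$@"
}

# Usage (file names are assumptions; total length taken from a .fai index):
#   total="$(awk '{ s += $2 } END { print s }' asm.fa.gz.fai)"
#   bed_phred_qv "$total" misassemblies.bed
```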

Generate ideogram.

nucflag ideogram -i misassemblies.bed -o ideogram
# Add cytobands from a BED file with the columns chrom, start, end, name, and btype.
nucflag ideogram -i misassemblies.bed -c cytobands.bed -o ideogram
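
Before passing a cytoband file, it can help to confirm it has the five columns listed above. The check below is a sketch; `check_cytobands` is a hypothetical helper, and the `btype` values NucFlag accepts are not listed here, so the `gneg` in the usage example is only a placeholder borrowed from UCSC-style stain names.

```shell
# Hypothetical helper (not part of NucFlag): check that every line has
# five tab-separated columns and a valid, non-empty interval.
check_cytobands() {
    awk 'BEGIN { FS = "\t" }
         NF != 5 { printf "line %d: expected 5 columns, found %d\n", NR, NF; bad = 1; next }
         $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ || $2 + 0 >= $3 + 0 {
             printf "line %d: invalid interval %s-%s\n", NR, $2, $3; bad = 1
         }
         END { exit bad + 0 }' "$@"
}

# Usage (values are placeholders, not NucFlag defaults):
#   printf 'chr1\t0\t2300000\tp36.33\tgneg\n' > cytobands.bed
#   check_cytobands cytobands.bed && echo "cytobands.bed looks well-formed"
```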

Get consensus misassembly calls by intersection.

nucflag consensus -i nucflag_ont.bed nucflag_hifi.bed hmm_flagger_hifi.bed hmm_flagger_ont.bed > consensus.bed
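
To illustrate what intersecting call sets means at the interval level, the sketch below reports the overlapping portions of two BED files. It is a naive pairwise scan for illustration only; how `nucflag consensus` combines more than two inputs or treats labels is not described here, and `bed_intersect` is a hypothetical helper.

```shell
# Hypothetical helper (not part of NucFlag): print the overlapping parts
# of intervals from two BED files. Naive O(n*m) scan, fine for small files.
# Usage: bed_intersect A.bed B.bed
bed_intersect() {
    awk 'BEGIN { FS = OFS = "\t" }
         # First file: remember every interval.
         NR == FNR { chrom[NR] = $1; start[NR] = $2 + 0; stop[NR] = $3 + 0; n = NR; next }
         # Second file: compare each interval against the stored ones.
         {
             for (i = 1; i <= n; i++) {
                 if (chrom[i] != $1) continue
                 s = (start[i] > $2 + 0) ? start[i] : $2 + 0
                 e = (stop[i] < $3 + 0) ? stop[i] : $3 + 0
                 if (s < e) print $1, s, e   # keep only non-empty overlaps
             }
         }' "$1" "$2"
}

# Usage (file names taken from the command above):
#   bed_intersect nucflag_ont.bed nucflag_hifi.bed
```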

Generate a sample config file.

nucflag config -x ont_r10 > config_r10.toml

Input

  • BAM file of PacBio HiFi, ONT R9, or ONT R10 reads aligned to an assembly.
  • (Recommended) Assembly.
  • (Optional) BED file of regions.

Output

  • BED file of misassemblies.
  • (Optional) Plots with coverage and mismatch pileup with misassemblies flagged.
  • (Optional) bigWigs of pileup signals.
  • (Optional) BED file of assembly status.

Documentation

Read the docs at the NucFlag wiki for more information.

Download files

Source Distribution

nucflag-1.0.0a1.tar.gz (25.2 kB)

Uploaded Source

Built Distribution

nucflag-1.0.0a1-py3-none-any.whl (31.3 kB)

Uploaded Python 3

File details

Details for the file nucflag-1.0.0a1.tar.gz.

File metadata

  • Download URL: nucflag-1.0.0a1.tar.gz
  • Upload date:
  • Size: 25.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for nucflag-1.0.0a1.tar.gz
Algorithm Hash digest
SHA256 b80612fc06f9eb895c5f606f483e4f9514d4da8bc5480c253a4b0801162ee337
MD5 c1f7f7d806fb89257b0f01972dd73147
BLAKE2b-256 f657e5da35523ee704d84c27364cfc65eadcb82647e7a61819789dc4e55ebf46
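
A downloaded file can be checked against the SHA256 digest listed above. The sketch below assumes GNU coreutils `sha256sum` is available (on macOS, `shasum -a 256` is the usual substitute); `verify_sha256` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if FILE matches the expected SHA256.
# Usage: verify_sha256 FILE EXPECTED_HEX_DIGEST
verify_sha256() {
    # sha256sum -c expects "<digest>  <path>" (two spaces) per line.
    printf '%s  %s\n' "$2" "$1" | sha256sum -c --status -
}

# Usage (assumes the sdist was downloaded to the current directory):
#   verify_sha256 nucflag-1.0.0a1.tar.gz \
#       b80612fc06f9eb895c5f606f483e4f9514d4da8bc5480c253a4b0801162ee337 \
#       && echo "checksum OK"
```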

File details

Details for the file nucflag-1.0.0a1-py3-none-any.whl.

File metadata

  • Download URL: nucflag-1.0.0a1-py3-none-any.whl
  • Upload date:
  • Size: 31.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for nucflag-1.0.0a1-py3-none-any.whl
Algorithm Hash digest
SHA256 4d1fce3e1f9d8531ef445bb7586e91c9c25b6d671d20375669c225b71b8c9b50
MD5 705f1683df7997a9a7364601718bf001
BLAKE2b-256 4d71f035a4eebe1cc290e3a8743b2314a46e3a8baec1023d98af2543ad5547ae
