NucFlag misassembly identifier.

Project description

NucFlag

Generates nucleotide frequency plots and genome misassembly BED files. Fork of NucFreq.

Quickstart

# Requires Python>=3.11
pip install nucflag==1.0.0a5

Note: NucFlag v1.0.0-alpha.5 is not currently installable via bioconda.

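Align long reads to the assembly.

# Align long reads to the assembly generated from those reads.
# -a makes minimap2 emit SAM instead of its default PAF output.
# Sort by coordinate so that the BAM can be indexed.
minimap2 -ax lr:hqae -I 8G asm.fa.gz reads.fq.gz | samtools sort -o asm.bam -
samtools index asm.bam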

Detect misassemblies.

# Call putative misassemblies from read alignments on a whole genome.
nucflag call -i asm.bam -f asm.fa.gz -o misassemblies.bed
# Or on a set of regions...
nucflag call -i asm.bam -f asm.fa.gz -b regions.bed -o misassemblies.bed
# Also runs on ONT read alignments.
nucflag call -i asm_ont_r9.bam -f asm.fa.gz -x ont_r9 -o misassemblies.bed
nucflag call -i asm_ont_r10.bam -f asm.fa.gz -x ont_r10 -o misassemblies.bed
# Provide a configfile for finer control.
nucflag call -i asm.bam -f asm.fa.gz -o misassemblies.bed -c config.toml
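
Once misassemblies.bed exists, a quick tally per label can help sanity-check the calls. The helper below is a minimal sketch, not part of NucFlag: count_labels is a hypothetical name, and it assumes a tab-separated BED file whose fourth (name) column holds the call label. The exact columns NucFlag writes are described in the wiki and may differ.

```python
from collections import Counter
from typing import Iterable


def count_labels(lines: Iterable[str]) -> Counter:
    """Tally BED records by column 4 (assumed here to hold the call label)."""
    counts: Counter = Counter()
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and common BED header or comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # No name column, so there is nothing to tally.
        counts[fields[3]] += 1
    return counts
```

To use it, pass an open file handle, for example count_labels(open("misassemblies.bed")), and print the result of most_common().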

Visualize misassemblies in a number of ways.

# Generate NucFreq plots.
nucflag call -i asm.bam -f asm.fa.gz -d plots
# And add any number of tracks...
nucflag call -i asm.bam -f asm.fa.gz -d plots --tracks repeatmasker.bed segdups.bed
# Or generate bigWigs of specific signals and then merge them with `bigtools`.
# For use in IGV or other genome browsers.
nucflag call -i asm.bam -f asm.fa.gz -d plots \
    --output_pileup_dir bigwigs \
    --add_pileup_data cov mismatch mapq
bigwigmerge -l <(find bigwigs -name "*_first.bw") merged_first.bw

Generate a status BED file or a breakdown plot showing the distribution of assembly issues.

nucflag status -i misassemblies.bed > status.bed
# Get status of specific regions from a given BED file.
nucflag status -i misassemblies.bed -b censat.bed -g name > status_censat.bed
# Or plot a breakdown by "percent".
nucflag breakdown -i misassemblies.bed -o breakdown -t percent

Estimate QV from the misassemblies BED file.

nucflag qv -i misassemblies.bed > qv.bed
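
For orientation, a Phred-scaled quality value relates an error fraction to a score as QV = -10 * log10(error_bases / total_bases). The sketch below applies only that standard definition. How NucFlag decides which bases count as erroneous is not stated here, so the inputs are assumptions and phred_qv is a hypothetical helper, not NucFlag's implementation.

```python
import math


def phred_qv(error_bases: int, total_bases: int) -> float:
    """Return the Phred-scaled QV for a given number of erroneous bases."""
    if total_bases <= 0:
        raise ValueError("total_bases must be positive")
    if not 0 <= error_bases <= total_bases:
        raise ValueError("error_bases must be between 0 and total_bases")
    if error_bases == 0:
        # No observed errors: the score is unbounded.
        return math.inf
    return -10.0 * math.log10(error_bases / total_bases)
```

Under this definition, one erroneous base per thousand corresponds to a QV of 30, and each tenfold drop in the error fraction adds 10.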

Generate an ideogram.

nucflag ideogram -i misassemblies.bed -o ideogram
# Add cytobands from a BED file with the columns chrom, start, end, name, and btype.
nucflag ideogram -i misassemblies.bed -c cytobands.bed -o ideogram

Get consensus misassembly calls by intersection.

nucflag consensus -i nucflag_ont.bed nucflag_hifi.bed hmm_flagger_hifi.bed hmm_flagger_ont.bed > consensus.bed
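
To illustrate what intersecting call sets means, here is a minimal sketch that keeps only the regions covered by every input. It is an assumption about the general idea, not NucFlag's algorithm: the actual command may apply overlap thresholds or handle labels, and the names merge, intersect_pair, and consensus are hypothetical. Intervals are (chrom, start, end) tuples with half-open BED coordinates.

```python
from functools import reduce

Interval = tuple[str, int, int]


def merge(intervals: list[Interval]) -> list[Interval]:
    """Sort one call set and merge overlapping or touching intervals."""
    merged: list[Interval] = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


def intersect_pair(a: list[Interval], b: list[Interval]) -> list[Interval]:
    """Intersect two merged, sorted call sets with a two-pointer sweep."""
    out: list[Interval] = []
    i = j = 0
    while i < len(a) and j < len(b):
        chrom_a, start_a, end_a = a[i]
        chrom_b, start_b, end_b = b[j]
        if chrom_a != chrom_b:
            # Advance whichever side is on the earlier chromosome.
            if chrom_a < chrom_b:
                i += 1
            else:
                j += 1
            continue
        start, end = max(start_a, start_b), min(end_a, end_b)
        if start < end:
            out.append((chrom_a, start, end))
        # Advance the interval that ends first.
        if end_a <= end_b:
            i += 1
        else:
            j += 1
    return out


def consensus(call_sets: list[list[Interval]]) -> list[Interval]:
    """Return the regions present in every call set."""
    if not call_sets:
        return []
    return reduce(intersect_pair, (merge(calls) for calls in call_sets))
```

Merging each set first keeps the sweep correct when a single caller reports overlapping intervals.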

Generate a sample config.

nucflag config -x ont_r10 > config_r10.toml

Input

  • BAM file of PacBio HiFi, ONT R9, or ONT R10 reads aligned to an assembly.
  • (Recommended) Assembly FASTA file.
  • (Optional) BED file of regions to evaluate.

Output

  • BED file of misassemblies.
  • (Optional) Plots of coverage and mismatch pileups with misassemblies flagged.
  • (Optional) bigWigs of pileup signals.
  • (Optional) BED file of assembly status.

Documentation

Read the docs at the NucFlag wiki for more information.

Download files

Source Distribution

nucflag-1.0.0a5.tar.gz (27.0 kB)

Uploaded Source

Built Distribution

nucflag-1.0.0a5-py3-none-any.whl (33.8 kB)

Uploaded Python 3

File details

Details for the file nucflag-1.0.0a5.tar.gz.

File metadata

  • Download URL: nucflag-1.0.0a5.tar.gz
  • Size: 27.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for nucflag-1.0.0a5.tar.gz
Algorithm     Hash digest
SHA256        73f21d7740aec96d611429e5c00b9c26c61080bcd45aa976b8e87accde9e1f85
MD5           a002a4252896f87002612279c436809e
BLAKE2b-256   67415c47499cf261667742aeabdf99c3b40ec2571d8d7d153b4f726364f88d55

File details

Details for the file nucflag-1.0.0a5-py3-none-any.whl.

File metadata

  • Download URL: nucflag-1.0.0a5-py3-none-any.whl
  • Size: 33.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for nucflag-1.0.0a5-py3-none-any.whl
Algorithm     Hash digest
SHA256        4d1393ef6c4c1b45d61804e132f844f4e7c41e3dbd28cc3380ed53965253607c
MD5           bdc33c9ba6589c58cad5d1838abf19f3
BLAKE2b-256   ef0174c1d3a527599d8b2670cc04e37d35b18907730b27c5eeebf31b93f66e39
