Identifying methylation motifs in nanopore data

Nanomotif

Nanomotif is a Python package designed to explore methylation in prokaryotic genomes using Nanopore sequencing. Nanomotif is a fast, scalable, and sensitive tool for identification and utilization of methylation motifs in monocultures and metagenomic samples.

Nanomotif offers

  • de novo methylated motif identification
  • metagenomic bin contamination detection
  • bin association of unbinned contigs (e.g. plasmids)
  • association of MTases and RM-systems to motifs.

Documentation

Please see the documentation for detailed installation and usage instructions, descriptions of required files, and analysis examples.

Skip the Documentation: Your Quickstart Guide to Nanomotif

Installation

Nanomotif can be installed with Conda. Create a new environment and install Nanomotif as follows:

conda create -n nanomotif python=3.12
conda activate nanomotif
conda install -c bioconda nanomotif

Check installation

Once installed, the installation can be checked by running:

nanomotif check_installation

This runs Nanomotif on a small test dataset to confirm that everything works.

For further details, check out the installation guidelines.

Usage

Required files

To identify methylated motifs, the following files are required:

  • Assembly (fasta file)
  • modkit methylation pileup
  • tab-separated file describing contig-bin relationship.

For further details, check out the required files documentation.
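
If your binner produced one FASTA file per bin, the contig-bin table can be derived from those files. The following is a minimal sketch of one way to do that, not part of Nanomotif itself. It assumes a header-less, two-column layout (contig ID, then bin ID) and a `.fa` file extension; check both against the required files documentation. The function name `write_contig_bin` and all paths are illustrative.

```python
from pathlib import Path


def write_contig_bin(bins_dir, out_tsv, extension=".fa"):
    """Write one 'contig<TAB>bin' line per FASTA record found in bins_dir.

    Assumptions (not taken from the Nanomotif docs): one FASTA file per bin,
    the bin ID is the file name without its extension, and the output file
    has no header line.
    """
    rows = []
    for fasta in sorted(Path(bins_dir).glob(f"*{extension}")):
        bin_id = fasta.name[: -len(extension)]
        with open(fasta) as handle:
            for line in handle:
                if not line.startswith(">"):
                    continue  # sequence line, not a record header
                fields = line[1:].split()
                if fields:
                    # The contig ID is the header text up to the first whitespace.
                    rows.append((fields[0], bin_id))
    with open(out_tsv, "w") as out:
        for contig_id, bin_id in rows:
            out.write(f"{contig_id}\t{bin_id}\n")
    return rows
```

Calling `write_contig_bin("bins", "contig_bin.tsv")` would then produce a file that can be passed as the third positional argument to motif_discovery, provided the assumed layout matches what Nanomotif expects.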

Motif discovery

Whether you are interested in methylated motifs in monoculture or metagenomic samples, we recommend simply running motif_discovery:

nanomotif motif_discovery ASSEMBLY.fasta PILEUP.bed CONTIG_BIN.tsv -t THREADS --out OUT

This will create three files: motifs.tsv, motif-scored.tsv, and bin-motifs.tsv. Highly methylated motifs are found in bin-motifs.tsv.

See usage and output for detailed usage and output information.
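
When motif discovery is one step in a larger pipeline, it can help to wrap the call and verify that the expected files were produced. The sketch below uses only the arguments shown in the command above. It assumes that OUT is a directory and that the three files are written directly into it; the helper names are illustrative and not part of Nanomotif.

```python
import subprocess
from pathlib import Path

# File names as listed in this README.
EXPECTED_OUTPUTS = ("motifs.tsv", "motif-scored.tsv", "bin-motifs.tsv")


def motif_discovery_command(assembly, pileup, contig_bin, out_dir, threads=1):
    """Build the argument list for the motif_discovery call shown above."""
    return [
        "nanomotif", "motif_discovery",
        str(assembly), str(pileup), str(contig_bin),
        "-t", str(threads),
        "--out", str(out_dir),
    ]


def missing_outputs(out_dir):
    """Return the expected output files that are absent from out_dir.

    Assumption: outputs are written directly into out_dir.
    """
    return [name for name in EXPECTED_OUTPUTS
            if not (Path(out_dir) / name).is_file()]


def run_motif_discovery(assembly, pileup, contig_bin, out_dir, threads=1):
    """Run motif_discovery and fail loudly if an expected file is missing."""
    cmd = motif_discovery_command(assembly, pileup, contig_bin, out_dir, threads)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    missing = missing_outputs(out_dir)
    if missing:
        raise FileNotFoundError(f"missing expected outputs: {missing}")
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with paths that contain spaces.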

Bin contamination

After motif identification it is possible to identify contamination in bins using the bin-motifs.tsv, contig-bin.tsv and motif-scored.tsv files.

nanomotif detect_contamination --motifs_scored MOTIFS_SCORED.tsv --bin_motifs BIN_MOTIFS.tsv --contig_bins CONTIG_BINS.tsv -t THREADS --out OUT

This will generate a bin_contamination.tsv file listing the contigs that are flagged as contamination.

If the --write_bins and --assembly_file flags are specified, new decontaminated bins will be written to a bins folder.

See usage and output for detailed usage and output information.
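
To inspect which contigs would remain in each bin without writing new bin files, the flagged contigs can be subtracted from the contig-bin table. This is a minimal sketch under two assumptions that should be checked against the real files: bin_contamination.tsv has a header row with a column holding contig IDs (called `contig` here as a placeholder), and the contig-bin file is header-less with the contig ID in the first column and the bin ID in the second.

```python
import csv


def drop_flagged_contigs(contig_bin_tsv, contamination_tsv, contig_column="contig"):
    """Return {contig: bin} without contigs listed in the contamination table.

    contig_column is an assumed header name; adjust it to the real file.
    """
    with open(contamination_tsv, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        flagged = {row[contig_column] for row in reader}

    kept = {}
    with open(contig_bin_tsv, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            # Skip blank or malformed lines and any flagged contig.
            if len(row) >= 2 and row[0] not in flagged:
                kept[row[0]] = row[1]
    return kept
```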

Include unbinned contigs

The include_contigs command assigns unbinned contigs in the assembly file to bins by comparing the methylation pattern of each contig to the bin consensus patterns. A contig is assigned only if it has a unique perfect match to one bin consensus pattern. Additionally, include_contigs treats all contigs listed in the bin_contamination.tsv file as unbinned.

nanomotif include_contigs --motifs_scored MOTIFS_SCORED.tsv --bin_motifs BIN_MOTIFS.tsv --contig_bins CONTIG_BINS.tsv --run_detect_contamination -t THREADS --out OUT

If decontamination should not be performed, run include_contigs with neither the --run_detect_contamination flag nor the --contamination_file flag.
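
The "unique perfect match" rule used by include_contigs can be illustrated with a toy model. The sketch below is not Nanomotif's implementation: it simplifies each methylation pattern to a mapping from motif to a methylated/unmethylated flag, whereas the real comparison works on scored methylation data. The function name and the example motifs are illustrative.

```python
def assign_contig(contig_pattern, bin_patterns):
    """Return the single bin whose consensus equals contig_pattern, else None.

    Patterns are dicts mapping motif -> True/False (methylated or not).
    A contig is assigned only when exactly one bin matches perfectly.
    """
    matches = [bin_id for bin_id, consensus in bin_patterns.items()
               if consensus == contig_pattern]
    return matches[0] if len(matches) == 1 else None


bins = {
    "bin1": {"GATC": True, "CCWGG": False},
    "bin2": {"GATC": False, "CCWGG": True},
    "bin3": {"GATC": False, "CCWGG": True},
}
print(assign_contig({"GATC": True, "CCWGG": False}, bins))   # bin1
print(assign_contig({"GATC": False, "CCWGG": True}, bins))   # None: bin2 and bin3 both match
print(assign_contig({"GATC": True, "CCWGG": True}, bins))    # None: no bin matches
```

Requiring uniqueness means a contig with an ambiguous pattern stays unbinned rather than being placed in an arbitrary bin.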

MTase-linker

This module tries to link methylation motifs to their corresponding MTase and, when present, their entire RM system.

The MTase-linker module has additional dependencies that are not installed automatically with Nanomotif. Before using the module, install them with the MTase-linker install command, which requires conda to be available on your system.

nanomotif MTase-linker install

This will create a folder named ML_dependencies in your current working directory, containing the required dependencies for the MTase-linker module. You can use the --dependency_dir flag to change the installation location of the ML_dependencies folder.

The installation uses conda to create a few environments and takes some time, because it runs the workflow on a small dataset to check the installation.

Once the additional dependencies are installed, you can run the workflow using MTase-linker run:

nanomotif MTase-linker run -t 10 --assembly ASSEMBLY.fasta --contig_bin contig_bin.tsv --bin_motifs nanomotif/bin_motifs.tsv -d ML_dependencies -o mtase_linker

Running nanomotif MTase-linker run generates two primary output files: mtase_assignment_table.tsv and nanomotif_assignment_table.tsv.

The first file lists all predicted MTase genes in the genome, along with their predicted methylation characteristics and whether the module could unambiguously assign any detected motif to the MTase (linked = True/False).

The second file contains the data from the bin-motifs.tsv output of Nanomotif plus two additional columns, linked and candidate_genes. The linked column is a boolean (TRUE/FALSE) indicating whether the motif could be unambiguously linked to an MTase in the bin/genome. If it is TRUE, candidate_genes gives the gene_id of that MTase. If it is FALSE, candidate_genes lists feasible candidate facilitators of the modification, based on motif type and modification type predictions.
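
To pull out only the motifs that were unambiguously linked to an MTase, the second table can be filtered on its linked column. This is a minimal sketch that assumes the file is tab-separated with a header row containing a column named linked, as described above. The comparison is case-insensitive because the boolean is written as both True and TRUE in this README. The function name is illustrative.

```python
import csv


def linked_motifs(assignment_tsv):
    """Return the rows of the assignment table whose 'linked' value is true.

    Each row is a dict keyed by the column names in the header line.
    """
    with open(assignment_tsv, newline="") as handle:
        rows = list(csv.DictReader(handle, delimiter="\t"))
    return [row for row in rows
            if row["linked"].strip().lower() == "true"]
```

With the command shown earlier, the table would presumably be found under the mtase_linker output folder; that location is an assumption, so adjust the path to wherever your run wrote nanomotif_assignment_table.tsv.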

Citation

Please cite our preprint if you use Nanomotif for your research:

Nanomotif: Identification and Exploitation of DNA Methylation Motifs in Metagenomes using Oxford Nanopore Sequencing.
Søren Heidelbach, Sebastian Mølvang Dall, Jeppe Støtt Bøjer, Jacob Nissen, Lucas Nicolaas Ludovic van der Maas, Mantas Sereika, Rasmus Kirkegaard, Sabrina Just Kousgaard, Ole Thorlacius-Ussing, Sheila I Jensen, Katja Hose, Thomas Dyhre Nielsen, Mads Albertsen.
Preprint at bioRxiv https://doi.org/10.1101/2024.04.29.591623 (2024)

License

Nanomotif is released under the MIT License. Feel free to use, modify, and distribute the package in accordance with the terms of the license.

Acknowledgments

Nanomotif builds upon various open-source libraries and tools that are instrumental in its functionality. We would like to express our gratitude to the developers and contributors of these projects for their valuable work.
