Single Molecule Footprinting Analysis in Python.

smftools

A Python tool for processing raw sequencing data derived from single molecule footprinting experiments into AnnData objects. It also provides functionality for preprocessing, analysis, and visualization.

Philosophy

While most genomic data structures handle low-coverage data (<100X) along large references, smftools prioritizes high-coverage data (scalable to at least 1 million X coverage) of a few genomic loci at a time. This enables efficient data storage, rapid data operations, hierarchical metadata handling, seamless integration with various machine-learning packages, and ease of visualization. Furthermore, functionality is modularized, enabling analysis sessions to be saved, reloaded, and easily shared with collaborators. Analyses are centered around the anndata object, and are heavily inspired by the work conducted within the single-cell genomics community.

Dependencies

The following CLI tools need to be installed and configured before using the informatics (smftools.inform) module of smftools:

  1. Dorado -> For standard/modified basecalling and alignment. Can be obtained by downloading and configuring the Nanopore MinKNOW software.
  2. Samtools -> For working with SAM/BAM files.
  3. Minimap2 -> The aligner used by Dorado.
  4. Modkit -> For extracting summary statistics and read-level methylation calls from modified BAM files.
  5. Bedtools -> For generating BedGraphs from BAM alignment files.
  6. BedGraphToBigWig -> For converting BedGraphs to BigWig files for IGV sessions.
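Before running the informatics module, it can be handy to confirm that these tools are discoverable on your `PATH`. The snippet below is a minimal sketch using only the Python standard library; it is not part of smftools, and the executable names in `REQUIRED_TOOLS` are assumptions based on how these tools are commonly installed, so adjust them to match your system.

```python
import shutil

# Assumed executable names for the tools listed above; these may differ
# depending on how each tool was installed.
REQUIRED_TOOLS = [
    "dorado",
    "samtools",
    "minimap2",
    "modkit",
    "bedtools",
    "bedGraphToBigWig",
]


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required CLI tools were found.")
```

This only checks that an executable with the given name exists; it does not verify versions or configuration.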

Modules

  • Informatics: Processes raw Nanopore/Illumina data from SMF experiments into an AnnData object.
  • Preprocessing: Appends QC metrics to the AnnData object and performs filtering.
  • Tools: Appends various analyses to the AnnData object.
  • Plotting: Visualization of analyses stored within the AnnData object.

Announcements

09/09/24 - The pre-alpha phase package (smftools-0.1.1)

The informatics module has been bumped to alpha-phase status. This module can handle POD5s and unaligned BAMs from Nanopore conversion and direct SMF experiments, as well as FASTQs from Illumina conversion SMF experiments. The primary output from this module is an AnnData object containing all relevant SMF data, which is compatible with all downstream smftools modules. The other modules are still in the pre-alpha phase. The Preprocessing, Tools, and Plotting modules should be promoted to alpha-phase within the next month or so.

08/30/24 - The pre-alpha phase package (smftools-0.1.0) is installable through PyPI!

Currently, this package (smftools-0.1.0) is going through rapid improvement (dependency handling across Linux and macOS, testing, documentation, debugging) and is still too early in development for standard use. The underlying functionality was originally developed as a collection of scripts for single molecule footprinting (SMF) experiments in our lab, but is being packaged and developed so that any lab interested in performing these kinds of experiments and analyses can adopt SMF. The alpha-phase package is expected to be available within a couple of months, so stay tuned!
