Newmap: tools for genome and methylome mappability

Newmap

Introduction

Newmap is a software package that efficiently identifies uniquely mappable regions of any genome. For every position, it outputs the read lengths that are unique to that genome. From the range of unique read lengths produced, Newmap can generate the single-read mappability and the multi-read mappability for a specific read length.
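As a rough illustration of how per-position unique lengths can yield both tracks, the sketch below follows the definitions commonly given for Umap: a position has single-read mappability if at least one unique k-mer overlaps it, and its multi-read mappability is the fraction of overlapping k-mers that are unique. It relies on the observation that extending a unique k-mer keeps it unique, so a k-mer is unique whenever the minimum unique length at its start position is at most k. The function names, the use of 0 to mean "no unique length found", and the handling of sequence ends are assumptions made for illustration, not Newmap's implementation.

```python
def unique_kmer_starts(min_unique, k):
    """True where the k-mer starting at that position fits and is unique."""
    n = len(min_unique)
    # Assumption: 0 marks a position with no unique length in the searched range.
    return [i + k <= n and 0 < min_unique[i] <= k for i in range(n)]


def single_read_mappability(min_unique, k):
    """Mark every position covered by at least one unique k-mer."""
    covered = [False] * len(min_unique)
    for start, is_unique in enumerate(unique_kmer_starts(min_unique, k)):
        if is_unique:
            for pos in range(start, start + k):
                covered[pos] = True
    return covered


def multi_read_mappability(min_unique, k):
    """Fraction of the k-mers overlapping each position that are unique."""
    n = len(min_unique)
    unique = unique_kmer_starts(min_unique, k)
    scores = []
    for pos in range(n):
        # Start positions of all k-mers that fit in the sequence and cover pos.
        first = max(0, pos - k + 1)
        last = min(pos, n - k)
        overlapping = range(first, last + 1)
        if len(overlapping) == 0:
            scores.append(0.0)
        else:
            scores.append(sum(unique[s] for s in overlapping) / len(overlapping))
    return scores


# Hypothetical minimum unique lengths for a 6 bp sequence.
example = [3, 2, 0, 4, 2, 1]
single = single_read_mappability(example, 3)
multi = multi_read_mappability(example, 3)
```

Here only the 3-mers starting at positions 0 and 1 are unique, so positions 0 to 3 are single-read mappable, while the multi-read score drops from 1.0 at the start to 0.0 at the last two positions.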

Newmap can search for unique k-mer/read lengths at specific values or across entire continuous ranges. For ranges it uses a binary search, which allows it to find the minimum possible unique k-mer/read length.
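The binary search idea can be sketched on a toy sequence. The example below is not Newmap's implementation: it counts occurrences with plain string searching instead of an FM-index, considers the forward strand only, and returns 0 when no length in the range is unique. All of these choices, and the function names, are assumptions for illustration. The search works because uniqueness is monotonic in length: once a k-mer at a position is unique, every longer k-mer starting there is unique too.

```python
def count_occurrences(genome, kmer):
    """Count (possibly overlapping) occurrences of kmer in genome."""
    count = 0
    start = genome.find(kmer)
    while start != -1:
        count += 1
        start = genome.find(kmer, start + 1)
    return count


def min_unique_length(genome, pos, lo, hi):
    """Smallest k in [lo, hi] such that genome[pos:pos + k] occurs exactly once.

    Returns 0 (an assumed sentinel) if no such k exists in the range.
    """
    hi = min(hi, len(genome) - pos)  # the k-mer must fit inside the sequence
    if hi < lo:
        return 0
    if count_occurrences(genome, genome[pos:pos + hi]) != 1:
        return 0  # even the longest allowed k-mer is repeated
    # Invariant: the k-mer of length hi is unique.
    while lo < hi:
        mid = (lo + hi) // 2
        if count_occurrences(genome, genome[pos:pos + mid]) == 1:
            hi = mid       # unique at mid, try shorter lengths
        else:
            lo = mid + 1   # repeated at mid, must go longer
    return lo


toy_genome = "ACGTACGA"
lengths = [min_unique_length(toy_genome, p, 1, 8) for p in range(len(toy_genome))]
```

Each position needs only about log2(hi - lo) uniqueness checks instead of one check per candidate length.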

Newmap requires a CPU that supports the AVX2 instruction set.
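On Linux, one way to check for AVX2 before installing is to look for the avx2 flag in /proc/cpuinfo. The helper below is a minimal sketch under that assumption; the function names are hypothetical, and it simply reports False on systems where the file is missing.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def has_avx2(cpuinfo_path="/proc/cpuinfo"):
    """True if the first CPU listed advertises the avx2 flag."""
    try:
        with open(cpuinfo_path) as handle:
            return "avx2" in cpu_flags(handle.read())
    except OSError:
        # File not present (for example on non-Linux systems): cannot confirm.
        return False


if __name__ == "__main__":
    print("AVX2 available:", has_avx2())
```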

OpenMP is required for parallel processing.

Documentation

The latest documentation for Newmap is available on Read the Docs.

All commands have a --help option to provide additional usage information.

Quick start

Installation

Python Package Index (PyPI)

pip install newmap

Bioconda

conda install newmap

Usage

1. Create an index for a genome

newmap index genome.fa

By default this will create a genome.awfmi file in the current directory.

2. Find the minimum unique k-mer lengths for the genome using the index

Searching the entire genome, using 20 threads, printing status information, and searching lengths ranging from 20 to 200 bp:

newmap search --verbose --num-threads=20 --search-range=20:200 --output-directory=unique_lengths genome.fa

This will create *.unique.uint8 files (one for each sequence ID) in the unique_lengths directory.
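The file extension suggests that each output file is a flat array of unsigned 8-bit integers, one value per sequence position. Under that assumption, which is not confirmed here, the files can be inspected with the Python standard library alone. In the sketch below, the function names and the treatment of 0 as "no unique length found" are hypothetical.

```python
from collections import Counter
from pathlib import Path


def load_unique_lengths(path):
    """Read a file as raw bytes; indexing a bytes object yields ints 0-255."""
    return Path(path).read_bytes()


def summarize(lengths, no_unique_value=0):
    """Summarize the unique lengths stored for one sequence.

    no_unique_value is an assumed sentinel for positions where no unique
    length was found in the searched range.
    """
    counts = Counter(lengths)
    without = counts.pop(no_unique_value, 0)
    return {
        "positions": len(lengths),
        "without_unique_length": without,
        "shortest": min(counts) if counts else None,
        "longest": max(counts) if counts else None,
    }
```

For example, summarize(load_unique_lengths("unique_lengths/chr1.unique.uint8")) would report how many positions of a hypothetical chr1 file have no unique length, along with the shortest and longest lengths found.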

3. Convert the unique lengths to mappability tracks

To output single-read and multi-read mappability for a 24 bp read length:

newmap track --single-read=24.bed --multi-read=24.wig 24 unique_lengths/*.unique.uint8

Each output is a single file covering all sequences found in the unique_lengths directory. The BED file holds the single-read mappability, and the WIG file holds the multi-read mappability.
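To show how a per-position yes/no track maps onto BED, the sketch below collapses runs of mappable positions into 0-based, half-open intervals, which is the coordinate convention BED uses. The function names and the three-column output are assumptions for illustration; the exact columns Newmap writes are not specified here.

```python
def mappable_intervals(covered):
    """Collapse runs of True values into (start, end) half-open intervals."""
    intervals = []
    start = None
    for pos, flag in enumerate(covered):
        if flag and start is None:
            start = pos                      # a new run begins
        elif not flag and start is not None:
            intervals.append((start, pos))   # the run ended just before pos
            start = None
    if start is not None:
        intervals.append((start, len(covered)))  # run reaches the sequence end
    return intervals


def to_bed_lines(chrom, covered):
    """Format the intervals as tab-separated BED lines (chrom, start, end)."""
    return [f"{chrom}\t{start}\t{end}" for start, end in mappable_intervals(covered)]


# Hypothetical single-read track for a 7 bp sequence.
track = [True, True, False, True, False, False, True]
bed_lines = to_bed_lines("chrTest", track)
```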

Credits

Newmap is a reimplementation of the output of Umap. Umap was developed by Mehran Karimzadeh. The repository for that implementation is found at https://www.github.com/hoffmangroup/umap. Umap in turn was originally developed by Anshul Kundaje and was written in MATLAB. The original repository is available at https://sites.google.com/site/anshulkundaje/projects/mappability.

This project uses the excellent AvxWindowFMIndex library, which is described in a published article: https://doi.org/10.1186/s13015-021-00204-6

This version

0.2

newmap-0.2-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (169.5 kB)

Uploaded for CPython 3.9+, manylinux (glibc 2.17+), x86-64

Hashes for newmap-0.2-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 99c8f116619baa0513c6cb5e01bf7ae13e0733a2f5f171dd63b5d0afaf44aa29
MD5 64d40bc64f781f32d4b5acde4c2c56c7
BLAKE2b-256 15b1814ea1d74079287fd3e25647b6a88b83ae2225dff522f249c6214e744c83

Provenance

The following attestation bundles were made for newmap-0.2-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: python-publish.yml on hoffmangroup/newmap
