A package to create and manage genome-related TileDB arrays

Project description

momics

momics is both a file format for efficient storage of genomic data, built on TileDB arrays, and a Python package for querying and manipulating these files. The file format is designed to hold genomic coverage tracks and sequences. The package provides a command-line interface (CLI) and a Python library for bioinformatics workflows that involve genomic data.

Install

You can install momics using pip:

pip install momics

Alternatively, clone this repository and install the package locally:

git clone https://github.com/js2264/momics.git
cd momics
pip install .

Features

  • Efficient genomic data storage: Store large genomic coverage tracks and genome reference sequences compactly.
  • Multi-Range querying: Query multiple genomic regions simultaneously with high performance.
  • Rich Python library: Directly access and manipulate genomic data using Python objects and methods.
  • Full-fledged command-line interface (CLI): Perform common tasks such as adding new tracks, querying data, and extracting information directly from the shell.

Usage

CLI Commands

  • Add a track:

To ingest a .bw (bigWig) genomic coverage file into a momics repository, use the add command:

momics add tracks -f bw1=path/to/file.bw path/to/momics_repo

  • Query genomic coverage:

You can query tracks using either UCSC-style coordinates or a BED file:

momics query tracks --coordinates "chr1:1-1000" path/to/momics_repo
momics query tracks --file path/to/file.bed path/to/momics_repo
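
The UCSC-style string passed to --coordinates follows the chrom:start-end pattern. As a minimal sketch of how such a string can be split into its parts before building queries, the helper below uses only the standard library. parse_ucsc and Region are hypothetical names, not part of momics, and the sketch assumes that commas may appear as thousands separators, as in UCSC browser position strings. Whether momics itself accepts commas is not stated here.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int
    end: int


# chrom:start-end, with optional thousands separators in the numbers
_UCSC_RE = re.compile(r"^(?P<chrom>[^:\s]+):(?P<start>[\d,]+)-(?P<end>[\d,]+)$")


def parse_ucsc(coordinates: str) -> Region:
    """Split a UCSC-style string such as 'chr1:1-1000' into its parts."""
    match = _UCSC_RE.match(coordinates.strip())
    if match is None:
        raise ValueError(f"not a UCSC-style coordinate: {coordinates!r}")
    # int() raises ValueError if a field contains only commas
    start = int(match["start"].replace(",", ""))
    end = int(match["end"].replace(",", ""))
    if start > end:
        raise ValueError(f"start is greater than end in {coordinates!r}")
    return Region(match["chrom"], start, end)


print(parse_ucsc("chr1:1-1000"))
```

Validating the string up front gives a clear error message before any repository is opened.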

Python API

In Python, you can load and query a momics repository like this:

from momics import Momics

# Load a Momics repository
repo = Momics("path/to/momics_repo")

# Query tracks with coordinates
df = repo.query_tracks("chr1:1-1000")
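
To run the same kind of query for every interval of a BED file from Python, the intervals first have to be turned into coordinate strings. The sketch below does this with the standard library only. bed_to_ucsc is a hypothetical helper, not a momics function. It assumes the usual conventions: BED starts are 0-based and ends are exclusive, while UCSC-style strings are 1-based and inclusive. Whether momics interprets coordinate strings as 1-based is an assumption to check against its documentation.

```python
from typing import Iterable, Iterator


def bed_to_ucsc(lines: Iterable[str]) -> Iterator[str]:
    """Yield 'chrom:start-end' strings for the intervals in BED-formatted lines."""
    for line in lines:
        line = line.strip()
        # Skip blank lines and the header lines that BED files may carry
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        # BED is 0-based, half-open; shift the start to get 1-based, inclusive
        yield f"{chrom}:{int(start) + 1}-{int(end)}"


bed_lines = ["# example intervals", "chr1\t0\t1000", "chr2\t499\t750\tpeak_1"]
print(list(bed_to_ucsc(bed_lines)))
```

Each resulting string has the same shape as the "chr1:1-1000" argument given to repo.query_tracks above.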

Data Format

momics uses a custom data format that combines genomic sequences and coverage tracks in a compressed and indexed form. The format allows for rapid access to any region of the genome and supports simultaneous querying of multiple genomic regions.
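
To make the idea of compressed, indexed storage concrete, here is a toy illustration that uses only the Python standard library. It is not the momics format and does not use the TileDB API: the TiledTrack class, its tile size, and the choice of zlib are all assumptions made for brevity. The coverage vector is cut into fixed-size tiles that are compressed independently, so a region query only decompresses the tiles it overlaps.

```python
import zlib
from array import array


class TiledTrack:
    """Toy coverage track stored as independently compressed fixed-size tiles."""

    def __init__(self, values, tile_size=1024):
        self.tile_size = tile_size
        self.length = len(values)
        # The index is implicit: tile i covers [i * tile_size, (i + 1) * tile_size)
        self.tiles = [
            zlib.compress(array("f", values[i:i + tile_size]).tobytes())
            for i in range(0, self.length, tile_size)
        ]

    def query(self, start, end):
        """Return values in the 0-based, half-open interval [start, end)."""
        start, end = max(0, start), min(self.length, end)
        if start >= end:
            return []
        first, last = start // self.tile_size, (end - 1) // self.tile_size
        out = array("f")
        for i in range(first, last + 1):  # only overlapping tiles are decompressed
            out.frombytes(zlib.decompress(self.tiles[i]))
        offset = start - first * self.tile_size
        return out[offset:offset + (end - start)].tolist()


track = TiledTrack([float(i % 7) for i in range(10_000)], tile_size=1024)
print(track.query(2046, 2050))  # spans the boundary between tiles 1 and 2
```

Querying several regions at once amounts to repeating the tile lookup per region; a real array engine can additionally batch and parallelize those reads.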

Contributing

Contributions are welcome! Please submit pull requests or issues on the GitHub repository.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Download files

Source Distribution

momics-0.3.0.tar.gz (27.1 kB)

Uploaded Source

Built Distribution

momics-0.3.0-py3-none-any.whl (24.3 kB)

Uploaded Python 3

File details

Details for the file momics-0.3.0.tar.gz.

File metadata

  • Download URL: momics-0.3.0.tar.gz
  • Size: 27.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.6

File hashes

Hashes for momics-0.3.0.tar.gz
  • SHA256: 2e85560f6e99a960bd56e56bd52e5980248505ef3634146cffbaeddd4111206a
  • MD5: 5211ee70e29a7f981e13a2d0493f78bb
  • BLAKE2b-256: a5e68d07c5bebcf198ca3756074ad7b536033e624a40e35a858197c1797b1981

File details

Details for the file momics-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: momics-0.3.0-py3-none-any.whl
  • Size: 24.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.6

File hashes

Hashes for momics-0.3.0-py3-none-any.whl
  • SHA256: 773cf925002db97875251c8ea9f0405b6ba303fcae3bce74c5c8e559ba7758fc
  • MD5: 2955c4c3a9d7f9182c8926379b47d066
  • BLAKE2b-256: 6b619002e2bcf190207310845eacee724705d033aad753e4bc1d0acf42cf7483
