A package to create and manage genome-related TileDB arrays

momics

momics is both a file format for efficient storage of multi-omics data and a Python package for querying and manipulating these files. The file format is designed to handle genomic coverage tracks and sequences. The package provides a command-line interface (CLI) and a Python library for bioinformatics workflows involving genomic data.

Install

You can install momics using pip:

pip install momics

Alternatively, clone this repository and install the package locally:

git clone https://github.com/js2264/momics.git
cd momics
pip install .

Features

  • Efficient genomic data storage: Store large genomic coverage tracks and genome reference sequences compactly.
  • Multi-Range querying: Query multiple genomic regions simultaneously with high performance.
  • Rich Python library: Directly access and manipulate genomic data using Python objects and methods.
  • Full-fledged command-line interface (CLI): Perform common tasks such as adding new tracks, querying data, and extracting information directly from the shell.

Usage

CLI Commands

  • Add a track:

To ingest a bigWig (.bw) genomic coverage file into a momics repository, use the ingest command:

momics ingest tracks -f bw1=path/to/file.bw path/to/momics_repo

  • Query genomic coverage:

You can query tracks using either UCSC-style coordinates or a BED file:

momics query tracks --coordinates "chr1:1-1000" path/to/momics_repo
momics query tracks --file path/to/file.bed path/to/momics_repo
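
The two input styles usually follow different coordinate conventions: UCSC-style position strings are 1-based and inclusive, while BED intervals are 0-based and half-open. The sketch below shows how the same region looks in each form. It is a standalone illustration: Region, parse_ucsc and parse_bed_line are hypothetical helpers and not part of momics, and whether momics applies exactly this conversion internally is an assumption.

```python
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def parse_ucsc(coord: str) -> Region:
    """Parse 'chr1:1-1000' (1-based, inclusive) into a 0-based half-open Region."""
    chrom, _, span = coord.rpartition(":")
    # UCSC position strings may contain thousands separators, e.g. 'chr1:1,000-2,000'
    start_s, _, end_s = span.replace(",", "").partition("-")
    if not chrom or not start_s or not end_s:
        raise ValueError(f"not a UCSC-style coordinate: {coord!r}")
    start, end = int(start_s), int(end_s)
    if start < 1 or end < start:
        raise ValueError(f"invalid range in {coord!r}")
    return Region(chrom, start - 1, end)


def parse_bed_line(line: str) -> Region:
    """Parse the first three columns of a BED line (already 0-based, half-open)."""
    chrom, start, end = line.split()[:3]
    return Region(chrom, int(start), int(end))
```

Under these conventions, "chr1:1-1000" and the BED line "chr1 0 1000" describe the same 1000 positions.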

Python API

In Python, you can load and query a momics repository like this:

from momics.momics import Momics

# Load a Momics repository
repo = Momics("path/to/momics_repo")

# Query tracks with coordinates
df = repo.query_tracks("chr1:1-1000")
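
To collect results for several regions, one option is a small wrapper around the call shown above. This is a minimal sketch: query_many is a hypothetical helper, and it relies only on query_tracks accepting a single coordinate string. The library may offer a more efficient batched query, which this loop does not use. FakeRepo is a stand-in so the example runs without a real repository.

```python
from typing import Any, Dict, Iterable


def query_many(repo: Any, coordinates: Iterable[str]) -> Dict[str, Any]:
    """Call repo.query_tracks once per coordinate string and collect the results."""
    results: Dict[str, Any] = {}
    for coord in coordinates:
        results[coord] = repo.query_tracks(coord)
    return results


class FakeRepo:
    """Hypothetical stand-in for a Momics object (not the real class)."""

    def query_tracks(self, coord: str) -> str:
        return f"result for {coord}"


out = query_many(FakeRepo(), ["chr1:1-1000", "chr2:500-1500"])
```

With a real repository, pass the Momics object in place of FakeRepo().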

Data Format

momics uses a custom data format that combines genomic sequences and coverage tracks in a compressed and indexed form. The format allows for rapid access to any region of the genome and supports simultaneous querying of multiple genomic regions.
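
As a rough intuition for why a compressed, indexed layout allows fast region access, the toy example below splits a coverage vector into fixed-size chunks, compresses each chunk separately, and decompresses only the chunks that overlap a requested range. It is not the momics on-disk format (the package summary describes it as TileDB arrays); the chunk size, function names and use of zlib are all assumptions made for illustration.

```python
import zlib
from array import array

CHUNK = 1000  # positions per chunk (arbitrary choice for this toy example)


def build_store(values):
    """Split a coverage vector into fixed-size chunks and compress each one."""
    chunks = []
    for offset in range(0, len(values), CHUNK):
        raw = array("f", values[offset:offset + CHUNK]).tobytes()
        chunks.append(zlib.compress(raw))
    return chunks


def read_range(chunks, start, end):
    """Return values[start:end], decompressing only the overlapping chunks.

    The caller must pass a range that lies within the stored vector.
    """
    out = array("f")
    if start >= end:
        return []
    first, last = start // CHUNK, (end - 1) // CHUNK
    for idx in range(first, last + 1):
        block = array("f")
        block.frombytes(zlib.decompress(chunks[idx]))
        lo = max(start - idx * CHUNK, 0)         # offset of the range within this chunk
        hi = min(end - idx * CHUNK, len(block))  # clip to the end of this chunk
        out.extend(block[lo:hi])
    return list(out)
```

Reading positions 990 to 1010 touches two chunks out of the whole vector, which is the property that keeps region queries cheap on large genomes.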

Contributing

Contributions are welcome! Please submit pull requests or issues on the GitHub repository.

This project uses black to format code and ruff for linting. We also support pre-commit to ensure these have been run. To configure your local environment, please install these development dependencies and set up the commit hooks.

License

This project is licensed under the MIT License. See the LICENSE file for details.

