A package to create and manage genome-related TileDB arrays
momics
momics is both a file format for efficient storage of multi-omics data and a Python package designed to query and manipulate these files. The file format is specifically designed to handle genomic coverage tracks and sequences. The package provides a command-line interface (CLI) and a Python library for bioinformatics workflows involving genomic data.
Install
You can install momics using pip:
pip install momics
Alternatively, clone this repository and install the package locally:
git clone https://github.com/js2264/momics.git
cd momics
pip install .
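To confirm that either install route worked, one option is to ask Python's packaging metadata for the installed version. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of momics.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("momics")
    if found is None:
        print("momics is not installed in this environment")
    else:
        print(f"momics version: {found}")
```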
Features
- Efficient genomic data storage: Store large genomic coverage tracks and genome reference sequences compactly.
- Multi-range querying: Query multiple genomic regions simultaneously with high performance.
- Rich Python library: Directly access and manipulate genomic data using Python objects and methods.
- Full-fledged command-line interface (CLI): Perform common tasks such as adding new tracks, querying data, and extracting information directly from the shell.
Usage
CLI Commands
- Add a track:
To ingest a .bw genomic coverage file into a momics repository, use the ingest command:
momics ingest tracks -f bw1=path/to/file.bw path/to/momics_repo
- Query genomic coverage:
You can query tracks using either UCSC-style coordinates or a BED file:
momics query tracks --coordinates "chr1:1-1000" path/to/momics_repo
momics query tracks --file path/to/file.bed path/to/momics_repo
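When the regions of interest start out as UCSC-style strings, they need to be converted before they can be passed as a BED file. The sketch below shows one way to do that in Python. It assumes the usual conventions (UCSC-style strings are 1-based and inclusive, BED intervals are 0-based and half-open); whether momics applies exactly these conventions to `--coordinates` and `--file` is an assumption worth checking against its documentation. The helper names `ucsc_to_bed_line` and `write_bed` are hypothetical and not part of momics.

```python
import re
from pathlib import Path
from typing import Iterable

# UCSC-style region: chromosome name, then 1-based inclusive start and end.
_UCSC = re.compile(r"^(?P<chrom>[^:\s]+):(?P<start>[\d,]+)-(?P<end>[\d,]+)$")


def ucsc_to_bed_line(region: str) -> str:
    """Convert 'chr1:1-1000' into a tab-separated BED line 'chr1<TAB>0<TAB>1000'."""
    match = _UCSC.match(region.strip())
    if match is None:
        raise ValueError(f"not a UCSC-style region: {region!r}")
    # Thousands separators such as 'chr1:1,001-2,000' are tolerated.
    start = int(match["start"].replace(",", ""))
    end = int(match["end"].replace(",", ""))
    if start < 1 or end < start:
        raise ValueError(f"invalid coordinates in region: {region!r}")
    # BED is 0-based half-open: shift the start down by one, keep the end.
    return f"{match['chrom']}\t{start - 1}\t{end}"


def write_bed(regions: Iterable[str], path: str) -> None:
    """Write one BED line per UCSC-style region to the given path."""
    lines = [ucsc_to_bed_line(region) for region in regions]
    Path(path).write_text("\n".join(lines) + "\n")
```

The resulting file can then be passed to the `--file` option shown above.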
Python API
In Python, you can load and query a momics repository like this:
from momics.momics import Momics
# Load a Momics repository
repo = Momics("path/to/momics_repo")
# Query tracks with coordinates
df = repo.query_tracks("chr1:1-1000")
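To query several regions from Python, a simple approach is to loop over the single-region call shown above. This sketch relies only on `query_tracks` accepting one UCSC-style string, as in the example; the helper `query_regions` is hypothetical, and the package may well provide a dedicated multi-range call that is faster than this loop.

```python
from typing import Any, Dict, Iterable


def query_regions(repo: Any, regions: Iterable[str]) -> Dict[str, Any]:
    """Call repo.query_tracks once per region and collect the results by region.

    `repo` is any object exposing query_tracks(region), for example the
    Momics instance created above. The type of each result is whatever
    query_tracks returns; no assumption is made about it here.
    """
    return {region: repo.query_tracks(region) for region in regions}


# Example use, assuming `repo` was created as shown above:
# results = query_regions(repo, ["chr1:1-1000", "chr2:500-1500"])
# first = results["chr1:1-1000"]
```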
Data Format
momics uses a custom data format that combines genomic sequences and coverage tracks in a compressed and indexed form. The format allows for rapid access to any region of the genome and supports simultaneous querying of multiple genomic regions.
Contributing
Contributions are welcome! Please submit pull requests or issues on the GitHub repository.
This project uses black to format code and ruff for linting. We also support pre-commit to ensure these have been run. To configure your local environment, please install these development dependencies and set up the commit hooks.
License
This project is licensed under the MIT License. See the LICENSE file for details.