A SQLite and DuckDB wrapper suitable for bioinformatic analysis of multi-omic data.

omilayers

omilayers is a Python data management library for multi-omic data analysis (hence the omi prefix), which involves handling diverse datasets usually referred to as omic layers. omilayers is based on DuckDB and provides a high-level interface for frequent, repetitive tasks that involve fast storage, processing and retrieval of data, without the need to constantly write SQL queries.

The rationale behind omilayers is the following:

  • User stores layers of omic data (tables in SQL lingo).
  • User creates new layers by processing and restructuring existing layers.
  • User can group layers using tags.
  • User can store a brief description for each layer.

Why omilayers?

Using the Python API provided by DuckDB, the user would need to write the following code to retrieve a column named foo from a layer called omicdata:

import duckdb

with duckdb.connect("dbname.duckdb") as con:
    result = con.sql("SELECT foo FROM omicdata").fetchdf()

Although the above SQL query is straightforward, writing it can become quite tedious if it needs to be repeated multiple times. Since data analysis involves highly repetitive procedures, a user would need to create functions to abstract away the process of writing SQL queries. The aim of omilayers is to provide this level of abstraction to facilitate bioinformatic data analysis. The omilayers API resembles the pandas API, and the user needs to write the following code to perform the above task:

from omilayers import Omilayers

omi = Omilayers("dbname.duckdb")
result = omi.layers['omicdata']['foo']
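To make the idea of this abstraction concrete, here is a minimal sketch of how bracket-style access could be layered over SQL. It is not the omilayers implementation: it uses Python's built-in sqlite3 module, returns plain lists rather than DataFrames, and the LayerStore and LayerView classes are hypothetical names invented for this example.

```python
import sqlite3


def quote_identifier(name):
    """Quote a table or column name; identifiers cannot be bound as parameters."""
    return '"' + name.replace('"', '""') + '"'


class LayerView:
    """Hypothetical helper: column access for one table via view['column']."""

    def __init__(self, con, name):
        self.con = con
        self.name = name

    def __getitem__(self, column):
        # Build the SELECT statement the user would otherwise write by hand.
        query = f"SELECT {quote_identifier(column)} FROM {quote_identifier(self.name)}"
        return [row[0] for row in self.con.execute(query)]


class LayerStore:
    """Hypothetical helper: table access via store['table']."""

    def __init__(self, path):
        self.con = sqlite3.connect(path)

    def __getitem__(self, name):
        return LayerView(self.con, name)


# Usage with a made-up in-memory table.
store = LayerStore(":memory:")
store.con.execute("CREATE TABLE omicdata (foo REAL, bar TEXT)")
store.con.executemany(
    "INSERT INTO omicdata VALUES (?, ?)", [(1.5, "a"), (2.5, "b")]
)
result = store["omicdata"]["foo"]
```

The two-level `__getitem__` chain is what lets `store["omicdata"]["foo"]` stand in for a full SELECT statement.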

Installation

pip install omilayers

Testing with synthetic omic data

The directory synthetic_data includes a Jupyter notebook for testing omilayers using synthetic multi-omic data. It also includes the Python script create_synthetic_vcf/synthesize_vcf.py that was used to create the synthetic VCF hosted on Zenodo.

The synthetic VCF can be recreated as follows:

for i in {1..22} {X,Y,M};do python synthesize_vcf.py $i;done

To join the generated VCFs into a single VCF:

for i in {1..22} {X,Y,M};do cat chr${i}.vcf >> simulated.vcf;done
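If the per-chromosome files each begin with their own header lines (an assumption; check the output of synthesize_vcf.py), plain concatenation would repeat those headers in the middle of the joined file. The sketch below is a possible Python alternative that writes header lines only from the first file. The merge_vcfs function is hypothetical and not part of omilayers; the chr<label>.vcf naming follows the shell loop above. Unlike `>>`, it overwrites the output file instead of appending to it.

```python
from pathlib import Path

# Chromosome labels in the same order as the shell loops: 1..22, then X, Y, M.
CHROMOSOMES = [str(n) for n in range(1, 23)] + ["X", "Y", "M"]


def merge_vcfs(directory, output_name="simulated.vcf"):
    """Concatenate chr<label>.vcf files, writing header lines only once."""
    directory = Path(directory)
    output_path = directory / output_name
    header_written = False
    with output_path.open("w") as out:
        for label in CHROMOSOMES:
            part = directory / f"chr{label}.vcf"
            if not part.exists():
                continue  # skip chromosomes that were not generated
            with part.open() as handle:
                for line in handle:
                    # Header lines start with '#'; keep them from the first file only.
                    if line.startswith("#") and header_written:
                        continue
                    out.write(line)
            header_written = True
    return output_path
```

Calling `merge_vcfs(".")` in the directory that holds the generated files would produce simulated.vcf there.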

Documentation

You can read the full documentation here: https://omilayers.readthedocs.io

Download files

Source Distribution

omilayers-0.2.0.tar.gz (15.8 kB view details)

Uploaded Source

Built Distribution

omilayers-0.2.0-py3-none-any.whl (22.7 kB view details)

Uploaded Python 3

File details

Details for the file omilayers-0.2.0.tar.gz.

File metadata

  • Download URL: omilayers-0.2.0.tar.gz
  • Size: 15.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.5

File hashes

Hashes for omilayers-0.2.0.tar.gz
Algorithm Hash digest
SHA256 b039c91b1025c378ff26a5b06e80b80f6b083204e63d3606cb0ba502c7dfbde9
MD5 6d8962265e2593eb3e6b0993c878cd93
BLAKE2b-256 758c417fa4505e198f13fd2bdc0f030bd04a9cd8cf9ae4b76502a2d314e82597

File details

Details for the file omilayers-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: omilayers-0.2.0-py3-none-any.whl
  • Size: 22.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.5

File hashes

Hashes for omilayers-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 b4170aa6ad020e7c8643cea67460217e83fd46f0530ad87e33a0c633926e6205
MD5 e1a12e50e9f1447c60c4ba6f00f65778
BLAKE2b-256 10e12ba2651b07b841e9265019e864492c93e968b6acb1429e853eedfbd84298
