
SMOC Multi-Omics Digital Object System API


modos-api

Access and manage Multi-Omics Digital Objects (MODOs).

Context

Goals

Provide a digital object and system to process, store and serve multi-omics data with their metadata such that:

  • Traceability and reproducibility are ensured by rich metadata
  • The different omics layers are processed and distributed together
  • Common operations such as liftover can be automated easily and ensure that omics layers are kept in sync
  • Data can be accessed, sliced and streamed over the network without downloading the dataset.

Architecture

The client library by itself can be used to work with local MODOs, or connect to a server to access objects over S3.

The server configuration and setup instructions can be found in deploy. The server consists of a REST API, an S3 server and an htsget server to stream CRAM/BCF over the network. The aim is to provide transparent remote access to MODOs without storing the data locally.
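
To give a rough idea of what a region request to the htsget server looks like, the sketch below builds a request URL following the GA4GH htsget convention (`reads` or `variants` endpoint, `referenceName`, `start` and `end` query parameters, with 0-based half-open coordinates). The base URL and object ID are hypothetical placeholders, and the routes exposed by an actual MODOS deployment may differ; the client library normally handles this step.

```python
from urllib.parse import quote, urlencode


def htsget_url(base, kind, object_id, reference_name, start, end, fmt=None):
    """Build an htsget-style request URL for a genomic region.

    `kind` is "reads" (e.g. CRAM) or "variants" (e.g. BCF).
    `start` and `end` are 0-based, half-open coordinates.
    """
    if kind not in ("reads", "variants"):
        raise ValueError(f"unsupported htsget endpoint: {kind!r}")
    params = {"referenceName": reference_name, "start": start, "end": end}
    if fmt is not None:
        params["format"] = fmt
    return f"{base.rstrip('/')}/{kind}/{quote(object_id)}?{urlencode(params)}"


# Hypothetical server address and object ID, for illustration only.
print(htsget_url("http://localhost/htsget", "reads", "ex-modo/demo1.cram",
                 "chr1", 102, 1321, fmt="CRAM"))
```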

Format

The digital object is composed of a folder with:

  • Genomic data files (CRAM, BCF, ...)
  • A zarr archive for metadata and array-based data

The metadata links to the different files and provides context using the modos-schema.
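
As an illustration of how metadata entries link files, samples and references together, here is a minimal sketch that uses a plain dictionary shaped like the metadata shown in the Usage section. The dictionary contents and the helper names `resolve_file` and `linked_entries` are assumptions made for this example; they are not part of the modos API.

```python
from pathlib import Path

# Hypothetical metadata, keyed by entry ID, mimicking the shape of a MODO's metadata.
metadata = {
    "data/calls1": {
        "@type": "DataEntity",
        "data_format": "BCF",
        "data_path": "calls1.bcf",
        "has_reference": ["reference/reference1"],
        "has_sample": ["sample/sample1"],
        "name": "Calls 1",
    },
    "sample/sample1": {"name": "Sample 1"},
    "reference/reference1": {"data_path": "reference1.fa"},
}


def resolve_file(root, metadata, entry_id):
    """Return the on-disk path of the file described by a metadata entry."""
    return Path(root) / metadata[entry_id]["data_path"]


def linked_entries(metadata, entry_id):
    """Follow every `has_*` link of an entry and return the linked entries."""
    entry = metadata[entry_id]
    return {
        key: [metadata[target] for target in targets]
        for key, targets in entry.items()
        if key.startswith("has_")
    }


print(resolve_file("data/ex", metadata, "data/calls1"))
print(linked_entries(metadata, "data/calls1"))
```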

Installation

The library can be installed with pip:

pip install modos

The development version can be installed directly from github:

pip install git+https://github.com/sdsc-ordes/modos-api.git@main
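
To confirm which version ended up in the current environment, a small check with the standard library can help. This is a sketch that only relies on `importlib.metadata`; the helper name `installed_version` is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


print(installed_version("modos"))
```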

Usage

The CLI is convenient for quickly managing MODOs (creation, editing, deletion) and for quick inspection:

$ # remote example
$ modos --endpoint http://localhost show --zarr s3://ex-bucket/ex-modo
/
 ├── assay
 │   └── assay1
 ├── data
 │   ├── calls1
 │   └── demo1
 ├── reference
 │   └── reference1
 └── sample
     └── sample1

$ # local example
$ modos show --files data/ex
data/ex/reference1.fa.fai
data/ex/demo1.cram
data/ex/reference1.fa
data/ex/calls1.bcf
data/ex/demo1.cram.crai
data/ex/calls1.bcf.csi

The user-facing API is in modos.api. It provides full programmatic access to the object's [meta]data:

>>> from modos.api import MODO

>>> ex = MODO('./example-digital-object')
>>> ex.list_samples()
['sample/sample1']
>>> ex.metadata["data/calls1"]
{'@type': 'DataEntity',
 'data_format': 'BCF',
 'data_path': 'calls1.bcf',
 'description': 'variant calls for tests',
 'has_reference': ['reference/reference1'],
 'has_sample': ['sample/sample1'],
 'name': 'Calls 1'}
>>> rec = next(ex.stream_genomics("calls1.bcf", "chr1:103-1321"))
>>> rec.alleles
('A', 'C')
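
The region passed to `stream_genomics` above is a `contig:start-end` string. If you need to build or validate such strings before streaming, a small parser along these lines may be useful. This is a sketch only: `parse_region` is a hypothetical helper, not part of modos.api, and it makes no assumption about whether the library treats the coordinates as 0-based or 1-based.

```python
def parse_region(region):
    """Split a 'contig:start-end' string into (contig, start, end).

    A bare contig name such as 'chr1' yields (contig, None, None).
    """
    contig, sep, span = region.rpartition(":")
    if not sep:
        # No colon at all: the whole string is the contig name.
        return region, None, None
    start_text, dash, end_text = span.partition("-")
    if not contig or not dash:
        raise ValueError(f"malformed region: {region!r}")
    start, end = int(start_text), int(end_text)
    if start > end:
        raise ValueError(f"start is greater than end in region: {region!r}")
    return contig, start, end


print(parse_region("chr1:103-1321"))
```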

For advanced use cases, the object's metadata can be queried with SPARQL!

>>> # Build a table with all files from male samples
>>> query = """
...   SELECT ?assay ?sample ?file
...   WHERE {
...     [] schema:name ?assay ;
...       modos:has_data [
...         modos:data_path ?file ;
...         modos:has_sample [
...           schema:name ?sample ;
...           modos:sex ?sex
...         ]
...       ] .
...     FILTER(?sex = "Male")
...   }
... """
>>> print(ex.query(query).serialize(format="csv").decode())
assay,sample,file
Assay 1,Sample 1,file://ex/calls1.bcf
Assay 1,Sample 1,file://ex/demo1.cram
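
Since the query result is serialized as CSV text, it can be loaded into Python structures with the standard library. The sketch below parses a string with the same columns as the output above; `rows_from_csv` is a hypothetical helper, and the sample text is copied from the example rather than produced by a live query.

```python
import csv
import io


def rows_from_csv(text):
    """Parse CSV text (with a header row) into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text, newline="")))


# Sample text mirroring the serialized query result shown above.
csv_text = (
    "assay,sample,file\n"
    "Assay 1,Sample 1,file://ex/calls1.bcf\n"
    "Assay 1,Sample 1,file://ex/demo1.cram\n"
)

rows = rows_from_csv(csv_text)
files = [row["file"] for row in rows]
print(files)
```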

Contributing

First, read the Contribution Guidelines.

For technical documentation on setup and development, see the Development Guide.

Acknowledgements and Funding

The development of the Multi-Omics Digital Object System (MODOS) is being funded by the Personalized Health Data Analysis Hub, a joint initiative of the Personalized Health and Related Technologies (PHRT) and the Swiss Data Science Center (SDSC), for a period of three years from 2023 to 2025. The SDSC leads the development of MODOS, bringing expertise in complex data structures associated with multi-omics and imaging data to advance privacy-centric clinical-grade integration. The PHRT contributes its domain expertise of the Swiss Multi-Omics Center (SMOC) in the generation, analysis, and interpretation of multi-omics data for personalized health and precision medicine applications. We gratefully acknowledge the Health 2030 Genome Center for their substantial contributions to the development of MODOS by providing test data sets, deployment infrastructure, and expertise.

Copyright

Copyright © 2023-2024 Swiss Data Science Center (SDSC), www.datascience.ch. All rights reserved. The SDSC is jointly established and legally represented by the École Polytechnique Fédérale de Lausanne (EPFL) and the Eidgenössische Technische Hochschule Zürich (ETH Zürich). This copyright encompasses all materials, software, documentation, and other content created and developed by the SDSC in the context of the Personalized Health Data Analysis Hub.

